Connecting Oracle OnPrem from Cloud using Integration Runtime , No Expressroute


Objective

how to leverage Microsoft Self Hosted Runtime integration in Azure Data Factory to connect On-Premise network and view data.

This is a proof of concept that said without being a hard wired Azure ExpressRoute setup or Site to Site VPN connectivity we can still connect to On-Premise Data to Cloud Infrastructure by the help of Azure Data Factory

Scope

The scope of this POC is limited to show the connectivity of the data source (Oracle) which is behind corporate firewall and view data part only. Detail scope to move data from Oracle to Azure Data Lake Storage Gen 2( ADLS Gen2) will be covered in separate blog item.

Prerequisite

  • You need to have standalone local machine VPN connectivity established to connect to the Source Database (in this case Oracle) on-Prem.
  • Windows Server 2012/2016/2019 , Windows 8.1 , Windows 10

What is Microsoft Self Hosted Integration Runtime

The Microsoft Integration Runtime is a customer managed data integration and scanning compute infrastructure used by Azure Data Factory, Azure Synapse Analytics and Azure Purview to provide data integration and scanning capabilities across different network environments. 

A self-hosted integration runtime can run copy activities between a cloud data store and a data store in a private network. It also can dispatch transform activities against compute resources in an on-premises network or an Azure virtual network. The installation of a self-hosted integration runtime needs an on-premises machine or a virtual machine inside a private network

Self Hosted Integration Runtime Architecture

The high-level overview of data flow
Self Hosted Integration Runtime Architecture as Define in Azure Portal

Steps: 

  • Create Data Pipeline in Azure Data Factory
  • Setup Integration Runtime
  • Install Microsoft Self hosted integration runtime in local laptop and Configure the connection to cloud
  • Setup Linked Services
  • View Oracle Data

Step 1 : Create Data Pipeline in Azure Data Factory

Search for Data Factory service in Azure

Create a Data Factory Pipeline

I have created “TestOracleDBConnectADF” as and Click Author and Monitor

Create New Pipeline

Click New Integration Runtimes

Choose SelfHosted option and Copy the Key . Also here we can see direct download link for the integration Runtime installer

Step 2 : Setup Self Hosted Integration runtime in local laptop

Download and Install from here : https://www.microsoft.com/en-us/download/details.aspx?id=39717

Copy the Key to below section after running the installer and hit Register

Step 3 : Verify the connection established between Local machine and Cloud

The Selfhosted runtime configuration should be look like this . No additional configuration required to set here

In Azure Portal it should look like below

Step 4 : Connect to VPN and See the Sample table in Exadata schema

Step 5 : Configure Linked Services to connect with Oracle Exadata DB info

In Azure Data factory create New Linked service called “LS_Oracle_onPrem” .

Add the Exadata OnPrem Database connection information there and Test the connection:

Step 6 : Create New Dataset

I have created Data folder called EMPDEPT and inside that new dataset called SRC_EMP 

Pick and choose the Linked Service and Test Connection . Once successful Click the Table dropdown to get the list of Schema and Table to access and you can Preview any dataset you want .

This will directly fetch the data from OnPrem Oracle to Cloud . 

That’s all for the Validating and testing connection and data access across two platform.