25 September 2023
Published by One51

Microsoft Fabric Data Factory, On-prem data sources 

Azure Data Factory (ADF) is one of the key components of the Microsoft data platform, either as a standalone application or as part of Azure Synapse Analytics. Alongside ADF, Microsoft has been developing Power BI Dataflows as another tool for loading and transforming data, with hundreds of out-of-the-box connectors.

It is unsurprising to see Data Factory in Microsoft Fabric as one of the core workloads, giving developers a modern experience for extracting, transforming and loading data at scale. The key difference is that Fabric is equipped with the next generation of Data Factory, which combines the best features of both ADF and Dataflows.

Fabric Data Factory has two major components: 

  1. Data Pipelines provide workflows at scale and can work with petabyte-scale data. They support activities such as copy, loops, and lookups.
  2. Dataflows (Gen2) is a low-code interface with connectors to hundreds of sources and more than 300 different transformations.

Fabric Data Factory is a fully managed cloud service and, like other cloud services, cannot directly reach on-prem databases or any database secured behind a firewall or virtual network. ADF addressed this with the Integration Runtime, or IR for short (managed, self-hosted and Azure), and Power BI Dataflows with the Power BI data gateway. Both are designed to establish secure connections between on-prem/private data sources and ADF/Dataflows. Fast forward to Fabric, and there is, or will be, a similar concept for both Data Pipelines and Dataflows.

Having said that, Microsoft Fabric is still in Preview at the time of writing: Data Pipelines do not yet support the IR and therefore cannot be used for on-prem data sources. That support is still under development and is expected to be released soon. The good news is that Dataflows support the data gateway and can be used to extract data from on-prem and private data sources. There is no need to install a separate Power BI data gateway, as your existing ones can serve the Fabric dataflows. However, you might encounter issues with the dataflow refresh process. To fix them, make sure outbound traffic from the gateway server is allowed over TCP port 1433 to the *.datawarehouse.pbidedicated.windows.net endpoint.

This issue is documented by Microsoft, and you can find more details in the Microsoft Learn article "On-premises data gateway considerations for data destinations in Dataflow Gen2" (Microsoft Fabric | Microsoft Learn).
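
As a quick way to check the firewall requirement described above, the following minimal Python sketch tries to open a TCP connection on port 1433 from the gateway server. The function name can_reach and the timeout value are our own illustrative choices, not part of any Microsoft tooling. The actual host name behind the *.datawarehouse.pbidedicated.windows.net wildcard depends on your environment and is not shown here. Treat this as a rough connectivity probe only; it does not validate credentials or the gateway configuration.

```python
import socket


def can_reach(host: str, port: int = 1433, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration: it only checks that the
    outbound TCP handshake succeeds, nothing more.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


# Example usage (replace the placeholder with the real endpoint host name
# for your environment before running it on the gateway server):
#
#   print(can_reach("<endpoint>.datawarehouse.pbidedicated.windows.net"))
```

If the function returns False on the gateway server, the outbound rule for TCP port 1433 is a likely place to start looking, although DNS resolution or a proxy could also be the cause.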

About One51

Drawing on a wealth of expertise and a deep understanding of the Energy, Supply Chain, FMCG and other industry sectors, One51 is dedicated to helping businesses navigate the complexities of their operations by harnessing the power of data-driven insights.
With a customer-centric approach, we collaborate closely with our clients to uncover hidden patterns, mitigate risks, and realise new avenues for innovation, all while bolstering their bottom line.
Our tailored solutions, aligned with best-of-breed cloud technologies and delivered using comprehensive analytics frameworks, enable companies to optimise their processes, identify growth opportunities, and make informed strategic decisions.
As a leader in helping companies harness the power of data, the One51 team will work with you to transform complex data into actionable intelligence, helping your business gain a competitive edge in a rapidly evolving landscape.
One51 offers a comprehensive range of services encompassing the entire data and analytics lifecycle, from strategy to successful implementation and ongoing support. Our expertise covers various areas, including Data Assessments, Data Strategy, Data Governance, Data Architecture, Data Management, Data Visualisation, Advanced Analytics and Managed Services.
For more information, please visit one51.consulting