Azure Data Factory | element61 (2024)

Azure Data Factory is a Microsoft cloud service on the Azure platform for integrating data from many different sources. It is well suited to building hybrid extract-transform-load (ETL), extract-load-transform (ELT) and data integration pipelines.


What does Azure Data Factory do?

It allows you to:

  • Copy data from many supported sources, both on-premises and in the cloud
  • Transform the data (see the sections below)
  • Publish the copied and transformed data, sending it to a destination data storage or analytics engine
  • Monitor the data flows using a rich graphical interface
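
The copy, transform and publish steps above can be pictured as a pipeline in which each activity names the activities it depends on. The sketch below is illustrative only: the pipeline and activity names are hypothetical, the `dependsOn` layout is modelled on the shape of Data Factory pipeline JSON, and the inputs, outputs and type-specific settings a real definition needs are left out. The `execution_order` helper is our own illustration and not part of any Azure SDK.

```python
# Hypothetical pipeline: names are invented for illustration, and only the
# fields needed to show ordering between activities are included.
pipeline = {
    "name": "CopyTransformPublish",
    "properties": {
        "activities": [
            {"name": "CopyFromSource", "type": "Copy", "dependsOn": []},
            {
                "name": "TransformData",
                "type": "DatabricksNotebook",
                "dependsOn": [
                    {"activity": "CopyFromSource", "dependencyConditions": ["Succeeded"]}
                ],
            },
            {
                "name": "PublishToWarehouse",
                "type": "Copy",
                "dependsOn": [
                    {"activity": "TransformData", "dependencyConditions": ["Succeeded"]}
                ],
            },
        ]
    },
}


def execution_order(pipeline_def):
    """Return activity names in an order that respects every dependsOn entry."""
    activities = pipeline_def["properties"]["activities"]
    # Map each activity to the set of activities it is still waiting for.
    remaining = {a["name"]: {d["activity"] for d in a["dependsOn"]} for a in activities}
    order = []
    while remaining:
        done = set(order)
        ready = sorted(name for name, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("cyclic dependency between activities")
        order.extend(ready)
        for name in ready:
            del remaining[name]
    return order


print(execution_order(pipeline))
```

Data Factory resolves this ordering itself; the helper only makes visible how the dependency declarations determine which step runs when.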

What doesn’t Azure Data Factory do?

Data Factory isn’t SSIS (SQL Server Integration Services) in the cloud. It has fewer database-specific features and focuses on supporting broader data transformation and movement (including big datasets and data lake operations).

Data Factory can, however, run your SSIS packages in the cloud (once built in SSIS). This lets you combine Data Factory’s scalability with SSIS’s advanced ETL features.

Why do I need Azure Data Factory?

Data Factory is an enabler for any cloud project. In almost any cloud project, you will need to move data across networks (on-premises and cloud) and across services (e.g. from and to different Azure storage services).

Data Factory is particularly important for organizations that are taking their first steps in the cloud and therefore need to connect on-premises data with the cloud. For this, Azure Data Factory has an Integration Runtime engine, a gateway service that can be installed on-premises and that provides performant and secure transfer of data to and from the cloud.

How does it differ from other ETL Tools?

Data Factory is one option for a cloud ETL (or ELT) tool. Several features distinguish it from other tools:

  • It can run SSIS packages
  • It auto-scales with the workload (it is a fully managed PaaS product)
  • It can run pipelines as often as once per minute
  • It bridges on-premises and Azure cloud environments through a gateway
  • It can handle big data volumes
  • It can work together with other compute services (Azure Batch, HDInsight) to run truly big data computations during ETL
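
To make the once-per-minute limit concrete, the sketch below builds a schedule-trigger definition with a one-minute recurrence. It is an assumption-laden illustration: the field layout is modelled on the shape of Data Factory schedule-trigger JSON and should be checked against the current documentation, and the trigger name, pipeline name and start time are hypothetical.

```python
import json


def minute_schedule_trigger(trigger_name, pipeline_name, start_time):
    """Build a schedule-trigger definition that fires once per minute.

    The layout is an assumption modelled on Data Factory trigger JSON;
    all names passed in are placeholders chosen by the caller.
    """
    return {
        "name": trigger_name,
        "properties": {
            "type": "ScheduleTrigger",
            "typeProperties": {
                "recurrence": {
                    "frequency": "Minute",
                    "interval": 1,  # the shortest interval the article mentions
                    "startTime": start_time,
                    "timeZone": "UTC",
                }
            },
            "pipelines": [
                {
                    "pipelineReference": {
                        "type": "PipelineReference",
                        "referenceName": pipeline_name,
                    }
                }
            ],
        },
    }


# Hypothetical names, for illustration only.
trigger = minute_schedule_trigger("EveryMinute", "CopyTransformPublish", "2024-01-01T00:00:00Z")
print(json.dumps(trigger, indent=2))
```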

In our experience, the best alternative to Azure Data Factory is Apache Airflow, which has both advantages and disadvantages. Contact us for more details.

How do I work with Azure Data Factory?

Azure Data Factory is a user interface tool which offers a graphical overview for creating and managing activities and pipelines. It doesn’t require coding skills, although complex transformations do require Azure Data Factory experience.


Important features:

  • Azure Data Factory has default connectors for nearly all on-premises data sources, including MySQL, SQL Server and Oracle databases


  • Azure Data Factory supports branching, where the output of one activity can trigger the start of another activity.
    - e.g. first copy the data from on-premises to Blob Storage, then merge all blobs
  • Azure Data Factory supports tumbling window triggers and event triggers. The first is particularly relevant for creating partitioned data, for example in a Data Lake set-up (storing your data automatically in daily partitioned blobs: e.g. YYYY/MM/DD/Blob.csv).
    An event trigger is applicable when an event, such as a new blob arriving on Blob Storage, should automatically trigger a transformation.
  • Azure Data Factory works with parameters, which can be passed dynamically between datasets, pipelines and triggers. For example, the destination filename could include the name of the pipeline or the date of the data slice.
  • Azure Data Factory can run a pipeline as often as once per minute. It thus doesn’t support real-time processing, but it does enable close to real-time processing.
  • Azure Data Factory provides monitoring and alerting. The execution of the different pipelines can be monitored through the UI, and you can set up alerts (linked to Azure Monitor) if anything fails.


  • Azure Data Factory can work well with Azure Databricks to schedule ML algorithms. Read more about this in this insight.
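
The tumbling window and parameter features listed above combine naturally: each window is a fixed-size, non-overlapping time slice, and the start of the slice can be passed as a parameter to build the destination path. The sketch below imitates that idea in plain Python and does not call Data Factory; the function names are our own, and the YYYY/MM/DD/Blob.csv layout is taken from the example in the list.

```python
from datetime import datetime, timedelta


def tumbling_windows(start, end, size):
    """Yield contiguous, non-overlapping (window_start, window_end) pairs."""
    current = start
    while current + size <= end:
        yield current, current + size
        current += size


def partition_path(window_start, filename="Blob.csv"):
    """Build a daily partition path of the form YYYY/MM/DD/<filename>."""
    return f"{window_start:%Y/%m/%d}/{filename}"


# Three daily windows covering 30 January to 2 February 2024 (end exclusive).
windows = list(tumbling_windows(datetime(2024, 1, 30), datetime(2024, 2, 2), timedelta(days=1)))
paths = [partition_path(window_start) for window_start, _ in windows]
print(paths)
```

Because every window covers exactly one slice of time, rerunning a single window rewrites only its own partition and leaves the others untouched.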

How does Azure Data Factory work with other Azure resources?


One of the main advantages of Azure Data Factory is that it integrates well with other Azure compute and storage resources. This is the exact purpose of linked services: to define the connection to external resources. There are two kinds of linked services you can define:

  • A Data Store Service to: Azure SQL Database, Azure SQL Data Warehouse, on-premises databases, a Data Lake, a file system, a NoSQL DB, etc.
  • A Compute Service to transform and enrich data: e.g. Azure HDInsight, Azure Machine Learning, a Stored Procedure in a SQL database, a Data Lake Analytics U-SQL activity, Azure Databricks and/or Azure Batch (using a Custom Activity)
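
As a rough illustration of the two kinds of linked services, the sketch below writes two definitions as Python dictionaries and classifies them by type. The layout and the type names are assumptions modelled on Data Factory linked-service JSON and should be verified against the connector documentation; the service names and the placeholder values are hypothetical, and the `linked_service_kind` helper is our own.

```python
# Type names assumed from Data Factory linked-service JSON; verify before use.
DATA_STORE_TYPES = {"AzureBlobStorage", "AzureSqlDatabase", "SqlServer"}
COMPUTE_TYPES = {"AzureDatabricks", "HDInsight", "AzureBatch"}

# Hypothetical data store linked service (placeholder connection string).
blob_store = {
    "name": "ExampleBlobStorage",
    "properties": {
        "type": "AzureBlobStorage",
        "typeProperties": {"connectionString": "<connection-string-placeholder>"},
    },
}

# Hypothetical compute linked service (placeholder workspace URL).
databricks = {
    "name": "ExampleDatabricks",
    "properties": {
        "type": "AzureDatabricks",
        "typeProperties": {"domain": "<workspace-url-placeholder>"},
    },
}


def linked_service_kind(linked_service):
    """Classify a linked-service definition as 'data store' or 'compute'."""
    service_type = linked_service["properties"]["type"]
    if service_type in DATA_STORE_TYPES:
        return "data store"
    if service_type in COMPUTE_TYPES:
        return "compute"
    raise ValueError(f"unknown linked service type: {service_type}")


for service in (blob_store, databricks):
    print(service["name"], "->", linked_service_kind(service))
```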

The pricing of Data Factory is based on usage: the number of “activities” (data processing steps) per month, plus integration runtime usage, which is charged per hour depending on the machine type and the number of nodes used.
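
The two pricing components described above can be combined into a simple estimate. The sketch below is arithmetic only: the function is our own, the billing unit of 1,000 activity runs is an assumption, and the rates in the example are made-up placeholders rather than actual Azure prices.

```python
def estimate_monthly_cost(activity_runs, runtime_hours, nodes,
                          price_per_1000_runs, price_per_node_hour):
    """Estimate a monthly bill from the two usage components.

    All prices are supplied by the caller; none are real Azure rates.
    """
    # Component 1: activity runs, assumed here to be billed per 1,000 runs.
    orchestration = activity_runs / 1000 * price_per_1000_runs
    # Component 2: integration runtime, billed per hour for each node.
    runtime = runtime_hours * nodes * price_per_node_hour
    return orchestration + runtime


# Placeholder figures: 10,000 runs, 100 runtime hours on 2 nodes.
print(estimate_monthly_cost(10_000, 100, 2, price_per_1000_runs=1.0, price_per_node_hour=0.5))
```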

Should I use Azure Data Factory or SSIS?

Use the right tool for the right purpose. The overview below shows that the two are complementary. They are also built that way: Azure Data Factory can deploy, manage and run SSIS packages in managed Azure-SSIS Integration Runtimes.

Based on your current platform/solution:

                              Hybrid On-Prem & Azure Solution   Azure Solution   On-Prem Only Solution
Azure Data Factory (ADF V2)   Yes                               Yes              No
Integration Services (SSIS)   Yes                               Yes              Yes

Based on type of data:

                              Small data   Close to real-time data (every minute)   Big Data
Azure Data Factory (ADF V2)   Yes          Yes                                      Yes
Integration Services (SSIS)   Yes          No                                       No

Conclusion

Data Factory lets you easily integrate cloud data with on-premises data. It’s unique in its ease of use despite its ability to transform and enrich complex data. It delivers data integration that is scalable, available and low-cost. Today, this service is a crucial building block in any data platform and machine learning project.

element61 offers hands-on expertise with Azure Data Factory and has implemented Data Factory setups for various clients and use cases. We can help organizations with our in-depth understanding of ADF concepts and our experience in building end-to-end implementations.

Continue Reading

Continue reading or contact us to get started with Azure Data Factory:

  • Learn about the difference between Azure Data Factory and Airflow
  • Read how to use Azure Data Factory to run and schedule your Azure Databricks scripts
  • Learn how to sync your on-premise data to the Cloud using Azure Data Factory

Contact us for more information on Azure Data Factory!

Article information

Author: Horacio Brakus JD
