Load data into Azure Data Lake Storage Gen2 - Azure Data Factory (2024)

APPLIES TO: Azure Data Factory, Azure Synapse Analytics

Tip

Try out Data Factory in Microsoft Fabric, an all-in-one analytics solution for enterprises. Microsoft Fabric covers everything from data movement to data science, real-time analytics, business intelligence, and reporting. Learn how to start a new trial for free!

Azure Data Lake Storage Gen2 is a set of capabilities dedicated to big data analytics, built into Azure Blob storage. It allows you to interface with your data using both file system and object storage paradigms.
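
As a rough illustration of the two paradigms, the sketch below builds the addresses under which one object in a Data Lake Storage Gen2 account is commonly reached: the Blob endpoint (object storage view), the DFS endpoint (file system view), and the ABFS URI form used by Hadoop-compatible engines. The account name examplelake and the file path are hypothetical placeholders, and the helper adls_gen2_addresses is introduced here only for illustration.

```python
def adls_gen2_addresses(account: str, file_system: str, path: str) -> dict[str, str]:
    """Return three common ways to address one object in an ADLS Gen2 account."""
    path = path.lstrip("/")
    return {
        # Object storage view: the Blob endpoint treats the path as a blob name.
        "blob": f"https://{account}.blob.core.windows.net/{file_system}/{path}",
        # File system view: the DFS endpoint exposes directories and files.
        "dfs": f"https://{account}.dfs.core.windows.net/{file_system}/{path}",
        # ABFS URI form used by Hadoop-compatible analytics engines.
        "abfss": f"abfss://{file_system}@{account}.dfs.core.windows.net/{path}",
    }


# Hypothetical account, file system, and path, used only to show the shapes.
addresses = adls_gen2_addresses("examplelake", "copyfroms3", "raw/2024/data.csv")
for view, address in addresses.items():
    print(f"{view}: {address}")
```

The container in the Blob view and the file system in the DFS view are the same resource under two names, which is why the same segment appears in all three addresses.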

Azure Data Factory (ADF) is a fully managed cloud-based data integration service. You can use the service to populate the lake with data from a rich set of on-premises and cloud-based data stores and save time when building your analytics solutions. For a detailed list of supported connectors, see the table of Supported data stores.

Azure Data Factory offers a scale-out, managed data movement solution. Because of this scale-out architecture, ADF can ingest data at high throughput. For details, see Copy activity performance.

This article shows you how to use the Data Factory Copy Data tool to load data from Amazon Web Services S3 service into Azure Data Lake Storage Gen2. You can follow similar steps to copy data from other types of data stores.

Prerequisites

  • Azure subscription: If you don't have an Azure subscription, create a free account before you begin.
  • Azure Storage account with Data Lake Storage Gen2 enabled: If you don't have a Storage account, create an account.
  • AWS account with an S3 bucket that contains data: This article shows how to copy data from Amazon S3. You can use other data stores by following similar steps.

Create a data factory

  1. If you have not created your data factory yet, follow the steps in Quickstart: Create a data factory by using the Azure portal and Azure Data Factory Studio to create one. After creating it, browse to the data factory in the Azure portal.

  2. Select Open on the Open Azure Data Factory Studio tile to launch the Data Integration application in a separate tab.

Load data into Azure Data Lake Storage Gen2

  1. In the home page of Azure Data Factory, select the Ingest tile to launch the Copy Data tool.

  2. In the Properties page, choose Built-in copy task under Task type, and choose Run once now under Task cadence or task schedule, then select Next.

  3. In the Source data store page, complete the following steps:

    1. Select + New connection. Select Amazon S3 from the connector gallery, and select Continue.

    2. In the New connection (Amazon S3) page, do the following steps:

      1. Specify the Access Key ID value.
      2. Specify the Secret Access Key value.
      3. Select Test connection to validate the settings, then select Create.

    3. In the Source data store page, ensure that the newly created Amazon S3 connection is selected in the Connection block.

    4. In the File or folder section, browse to the folder and file that you want to copy over. Select the folder/file, and then select OK.

    5. Specify the copy behavior by checking the Recursively and Binary copy options. Select Next.

  4. In the Destination data store page, complete the following steps:

    1. Select + New connection, and then select Azure Data Lake Storage Gen2, and select Continue.

    2. In the New connection (Azure Data Lake Storage Gen2) page, select your Data Lake Storage Gen2-capable account from the "Storage account name" drop-down list, and then select Create to create the connection.

    3. In the Destination data store page, select the newly created connection in the Connection block. Then, under Folder path, enter copyfroms3 as the output folder name, and select Next. During the copy, ADF creates the corresponding ADLS Gen2 file system and subfolders if they don't already exist.

  5. In the Settings page, specify CopyFromAmazonS3ToADLS for the Task name field, and select Next to use the default settings.

  6. In the Summary page, review the settings, and select Next.

  7. On the Deployment page, select Monitor to monitor the pipeline (task).

  8. When the pipeline run completes successfully, you see a pipeline run that is triggered by a manual trigger. You can use links under the Pipeline name column to view activity details and to rerun the pipeline.

  9. To see activity runs associated with the pipeline run, select the CopyFromAmazonS3ToADLS link under the Pipeline name column. For details about the copy operation, select the Details link (eyeglasses icon) under the Activity name column. You can monitor details such as the volume of data copied from the source to the sink, the data throughput, the execution steps with their durations, and the configuration used.

  10. To refresh the view, select Refresh. Select All pipeline runs at the top to go back to the "Pipeline runs" view.

  11. Verify that the data is copied into your Data Lake Storage Gen2 account.
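
The choices made in the Copy Data tool above correspond roughly to a set of JSON definitions: two linked services, two Binary datasets, and a pipeline with one Copy activity. The sketch below assembles such definitions as Python dictionaries to show how the pieces fit together. It is an approximation under stated assumptions, not the exact output of the tool: the connection and dataset names, the activity name CopyS3ToAdls, the angle-bracket placeholders, and the use of account key authentication for the sink are all hypothetical, and the property names should be checked against the connector documentation before use.

```python
import json

# Hypothetical resource names; the Copy Data tool generates its own.
S3_LS, ADLS_LS = "AmazonS3Connection", "AdlsGen2Connection"
SRC_DS, SINK_DS = "SourceS3Binary", "SinkAdlsBinary"


def linked_service(name, ls_type, type_properties):
    """Wrap connection settings in the linked service envelope."""
    return {"name": name, "properties": {"type": ls_type, "typeProperties": type_properties}}


def binary_dataset(name, ls_name, location):
    """Describe files to copy as-is (the Binary copy option) at a given location."""
    return {
        "name": name,
        "properties": {
            "type": "Binary",
            "linkedServiceName": {"referenceName": ls_name, "type": "LinkedServiceReference"},
            "typeProperties": {"location": location},
        },
    }


# Source connection: the Access Key ID and Secret Access Key from the wizard.
s3_ls = linked_service(S3_LS, "AmazonS3", {
    "accessKeyId": "<access-key-id>",
    "secretAccessKey": {"type": "SecureString", "value": "<secret-access-key>"},
})
# Sink connection: account key authentication is an assumption of this sketch.
adls_ls = linked_service(ADLS_LS, "AzureBlobFS", {
    "url": "https://<account-name>.dfs.core.windows.net",
    "accountKey": {"type": "SecureString", "value": "<account-key>"},
})

source_ds = binary_dataset(SRC_DS, S3_LS, {
    "type": "AmazonS3Location", "bucketName": "<bucket>", "folderPath": "<folder>",
})
# copyfroms3 is the output folder name entered in the Destination data store page.
sink_ds = binary_dataset(SINK_DS, ADLS_LS, {
    "type": "AzureBlobFSLocation", "fileSystem": "copyfroms3",
})

pipeline = {
    "name": "CopyFromAmazonS3ToADLS",
    "properties": {
        "activities": [{
            "name": "CopyS3ToAdls",
            "type": "Copy",
            "inputs": [{"referenceName": SRC_DS, "type": "DatasetReference"}],
            "outputs": [{"referenceName": SINK_DS, "type": "DatasetReference"}],
            "typeProperties": {
                # recursive mirrors the Recursively option selected for the source.
                "source": {"type": "BinarySource",
                           "storeSettings": {"type": "AmazonS3ReadSettings", "recursive": True}},
                "sink": {"type": "BinarySink",
                         "storeSettings": {"type": "AzureBlobFSWriteSettings"}},
            },
        }]
    },
}

print(json.dumps(pipeline, indent=2))
```

Keeping the definitions as plain dictionaries makes them easy to inspect or serialize, and the secrets stay as placeholders so that nothing sensitive ends up in source control.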

Related content

  • Copy activity overview
  • Azure Data Lake Storage Gen2 connector