AZURE DATA FACTORY (ADF) FOR BEGINNERS | Arindam Mondal | Skillshare



Arindam Mondal, Azure Cloud Technology Specialist

Watch this class and thousands more

Get unlimited access to every class
Taught by industry leaders & working professionals
Topics include illustration, design, photography, and more


Lessons in This Class

13 Lessons (48m)
    • 1. Introduction
    • 2. ADF Portal Overview
    • 3. ADF Components
    • 4. Create Your First Pipeline
    • 5. Append Variable Activity
    • 6. Filter Activity
    • 7. Lookup Activity
    • 8. ForEach Activity
    • 9. Until Activity
    • 10. Switch Activity
    • 11. Templates
    • 12. Sending Alerts to Microsoft Teams
    • 13. Conclusion






About This Class

Welcome to the course Azure Data Factory for Beginners.

In this course, you will learn the following topics:

  • ADF portal overview
  • ADF Components
  • Creating your first pipeline
  • Important ADF Components
  • Realtime Projects

Prerequisites:

No prior knowledge is required, but knowledge of SQL would be an added advantage.

Roles :

  • Beginners
  • SQL Developers
  • BI Developers

Software Requirements:

An Azure subscription is required. If you don't already have one, you can sign up for a free Azure subscription.

Meet Your Teacher


Arindam Mondal

Azure Cloud Technology Specialist





1. Introduction: Hello and welcome to the course Azure Data Factory for Beginners. Microsoft is one of the leading cloud service providers in the world. Cloud services offer many benefits over on-premises technologies, for example on-demand access, cost-saving solutions, high security, scalability options, and many more. If you are still working on on-premises technologies, it is the right time to learn cloud technologies. My name is Arindam Mondal. I have over 10 years of experience in IT. My focus is on databases and cloud technologies. Let's talk about the prerequisites to enroll. There is no prerequisite for this course; however, knowledge of SQL would be an added advantage. This course is mainly designed for absolute beginners who want to learn Azure Data Factory. If you already have knowledge of some areas of ADF, you can still follow the other topics in this course. Also, if you are a SQL developer or a BI developer, this course is absolutely right for you. Let's talk about the course structure. First, we'll cover the ADF portal overview and some of the other topics to get started with ADF. In the next section, we'll talk about ADF components in detail and create our first pipeline. Then we'll talk about the most used activities in ADF, and in the last section we will discuss a complete project in ADF. So let's start. I'll see you inside the course. 2. ADF Portal Overview: Here I have opened the Azure Data Factory portal. You can see in the left pane we have four icons: Home, Author, Monitor, and Manage. Let's click on Author. Here we can create Azure Data Factory resources such as pipelines, datasets, data flows, and Power Query. Let's click on the Monitor icon. Here we can see the pipeline run information. Let's click on Manage. Here we can see the connection information, linked services and integration runtimes, along with some features that are currently available in preview. Then there is the source control option.
You can configure a Git repository here. We can also import and export ARM templates. In the Triggers option, we can see the list of triggers, and we can also create a new trigger from here. In the next section, we can create global parameters. 3. ADF Components: Let's talk about the ADF components. Pipelines: a pipeline is a logical grouping of activities that together perform a set of steps, such as extracting data, transforming it, and loading it into databases or files. For example, a pipeline can contain a group of activities to ingest data from Amazon S3 and then run a Spark query on an Azure Databricks cluster to partition the data. A data factory may contain one or more pipelines. The next component is activities. An activity represents a processing step in a pipeline. For example, we might have a Delete activity to delete folders or files. The next component is datasets. A dataset represents data structures within a data store, pointing to the data we want to use in the pipeline as inputs or outputs. The next component is the linked service. We might think of a linked service as a connection string: it defines the connection information that Data Factory uses to connect to external data sources. For example, a linked service can be used to connect to an Azure SQL Database from the pipeline. The next component is the integration runtime. The integration runtime is the bridge between activities and linked services; it provides the compute environment where the activity runs. 4. Create Your First Pipeline: Here I am in ADF. We need to click on the Author icon to create a new pipeline. We have to click the plus icon here to add a new resource, and here we have multiple options available: Pipeline, Pipeline from template, Dataset, Data flow, Power Query, and the Copy Data tool.
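Under the hood, these components are defined as JSON resources. Below is a minimal, hypothetical Python sketch of how a pipeline, a dataset, and a linked service relate; the names (CopyFromS3, SparkOnDatabricks, AzureSqlLinkedService) are illustrative, not from the course.

```python
# Hypothetical JSON shapes for the ADF components described above.
import json

pipeline = {
    "name": "IngestAndTransform",
    "properties": {
        "activities": [  # activities: the processing steps in the pipeline
            {"name": "CopyFromS3", "type": "Copy"},
            {"name": "SparkOnDatabricks", "type": "DatabricksSparkPython"},
        ]
    },
}

dataset = {  # dataset: points at the data an activity reads or writes
    "name": "SourceFiles",
    "properties": {"type": "DelimitedText", "linkedServiceName": "AzureSqlLinkedService"},
}

linked_service = {  # linked service: the connection-string-like definition
    "name": "AzureSqlLinkedService",
    "properties": {"type": "AzureSqlDatabase", "connectionString": "<secret>"},
}

print(json.dumps(pipeline["properties"]["activities"][0]))
```

The dataset references the linked service by name, mirroring how a dataset resolves its connection information at run time.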
For now, we'll select Pipeline. The properties pane contains the pipeline name, and optionally we can provide a description. Now we can add activities. Let's drag and drop a Copy Data activity. So this pipeline consists of only one activity, which is the Copy Data activity. We can add multiple activities to a pipeline; I'm adding a Delete activity. Now we can connect these two activities using the green icon: we just need to drag and drop it onto the next activity. So here, the Copy Data activity will execute first; once the Copy Data activity succeeds, the Delete activity will execute. If we right-click on this arrow, we have multiple options. The first is Success, which means if the previous activity is successful, the next activity runs. We can change it to Failure, the red arrow, which means if the previous activity fails, the next activity executes. The third option is Completion: once the previous activity completes, the next activity executes; the Completion option does not check whether the previous activity succeeded or failed. And the last option is Skipped: if the previous activity is skipped, the next activity executes. 5. Append Variable Activity: Hello and welcome to another video. In this video we'll talk about the Append Variable activity. If we have an existing array variable in the pipeline, we can add a value to it using the Append Variable activity. Note that the Append Variable activity works only with array-type variables. Let's go to ADF. In this demo, we'll check the Append Variable activity. Let's drag and drop an Append Variable activity. First we need to create a variable at the pipeline level. Let's create a new variable. This variable must be an array, because Append Variable only works with arrays. I have given the variable a name and set its type to Array. Let's put in some values.
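The four dependency conditions described above can be sketched as a small Python function; this is an illustration of the semantics, not ADF code.

```python
# Sketch of ADF's dependency conditions: Succeeded, Failed, Completed, Skipped.
def should_run(condition, previous_status):
    """True if the next activity runs, given the previous activity's status."""
    if condition == "Succeeded":
        return previous_status == "Succeeded"
    if condition == "Failed":
        return previous_status == "Failed"
    if condition == "Completed":
        # Completion does not check whether the previous activity succeeded
        return previous_status in ("Succeeded", "Failed")
    if condition == "Skipped":
        return previous_status == "Skipped"
    raise ValueError(f"unknown condition: {condition}")

print(should_run("Completed", "Failed"))  # True
```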
I have two values in the array. As the variable is an array, we need to put the values within square brackets. Now let's go to the Append Variable activity. In the General tab you can change the name, and optionally we can specify a description. Next is the Variables tab. In the drop-down you can see the notice that Append Variable only supports adding to array-type variables. Let's click on the drop-down; we have only one array variable in this pipeline. Now we can specify the value that will be appended to the array. Let's give the value thirty. So the Append Variable activity should add this value to the existing values in the array. As we specified before, the array contains ten and twenty, so this Append Variable activity will add thirty. Now I'll add a Stored Procedure activity to insert this value into the database so that we can check that Append Variable works correctly. I created a table to hold the appended value; it has two columns, ID and Value. Let's see the procedure code: it takes one input parameter, Value, and inserts it into the table using that parameter. Let's go back to ADF. Now let's configure the Stored Procedure activity: select the linked service and select the procedure. Now I need to add a parameter. The name is Value, and for the value we add dynamic content. Let's select the variable; we can access the appended element using its index. I'll use an index value of 2, which is the third value in the array. Let's click on Finish. Now let's debug the pipeline. The pipeline succeeded. Now we'll check in SSMS: as expected, the table shows the appended value, which is thirty. I hope you now understand how Append Variable works in Azure Data Factory. Thank you for watching. See you in the next video. 6. Filter Activity: In this video, we'll talk about the Filter activity in Azure Data Factory.
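The append-then-read flow above can be sketched in Python. The variable name myArray is illustrative; the values 10, 20, and 30 and the index-2 access mirror the demo.

```python
# Sketch of the Append Variable demo: an array variable [10, 20], an
# append of 30, and index-based access (index 2 is the third value).
variables = {"myArray": [10, 20]}

def append_variable(variables, name, value):
    if not isinstance(variables[name], list):
        raise TypeError("Append Variable only supports array-type variables")
    variables[name].append(value)

append_variable(variables, "myArray", 30)
print(variables["myArray"][2])  # 30 -- the value the stored procedure inserts
```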
We can apply a filter expression to an input array using the Filter activity. Let's see an example. Here I am in ADF. So first, let's create a parameter. Let's give the parameter a name. The Filter activity works only with an array input. Let's give it some values, say 100 and 200. As the parameter is an array, we need to provide the values within square brackets. Now let's add a Filter activity. You can change the name, and optionally we can add a description. Let's go to the Settings. Here we have two options, Items and Condition. In Items we need to provide the input parameter, the parameter we created; I'm adding the parameter via the dynamic content. Let's click on Finish. In the Condition box, let's click on Add dynamic content and expand the functions and logical functions. Now we'll use item(), which refers to the current item in the Filter activity, with the greater() function, specifying 150 so that it returns the values greater than 150. Clicking on Finish. So the input array parameter has two values, 100 and 200, and the Filter activity condition is greater(item(), 150), so it should keep only the elements which are greater than 150. In the subsequent activity, this array should have only one value, because only 200 is greater than 150; 100 should be removed by the Filter activity. Now let's debug the pipeline, clicking on OK. Now let's check: we can see here the input item count is 2, as the input parameter had two items. In the Filter activity output, the item count is 1 and the value is 200; the element 100 was removed by the Filter activity. 7. Lookup Activity: The next activity is the Lookup activity. The Lookup activity can read from any dataset supported by Azure Data Factory, and that output can be used in a subsequent activity.
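The filter demo maps directly onto a Python list comprehension; this is a sketch of what the condition greater(item(), 150) does to the input [100, 200], not ADF code.

```python
# Sketch of the Filter activity: input [100, 200], condition
# @greater(item(), 150) keeps only elements greater than 150.
items = [100, 200]
filtered = [item for item in items if item > 150]
print(len(items), len(filtered), filtered)  # 2 1 [200]
```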
The Lookup activity can access the contents of a file or a table. It can also return the result set from a SQL query or a stored procedure, and the output can be used in the subsequent activity. Note that the Lookup activity can return a maximum of 5,000 rows; if the result set has more than 5,000 rows, it returns the first 5,000. The output dataset size should be within 4 MB; the activity will fail if the size exceeds 4 MB. The maximum allowed duration is 24 hours. In this example, we'll use a Lookup activity and a Stored Procedure activity. Let's go to the settings of the Lookup activity. Under the dataset there is an option to return only the first row from the dataset. For the query option, we can use either a query or a stored procedure. Here I am using a SQL query: I'm getting the price value from the Items table. Let's go to SSMS; here is the Items table. I'm using this query in the Lookup activity, and the subsequent activity uses the output value from the Lookup activity. Let's add a Stored Procedure activity and go to its settings. Here I'm selecting the linked service and the stored procedure. This stored procedure updates the item price; it takes one input parameter, Price. In the stored procedure parameters, for the parameter value I'm using the output from the Lookup activity. Let's run the pipeline. The pipeline is completed now. Going to SSMS, as expected, the price is updated for the item. 8. ForEach Activity: Now we'll discuss the ForEach activity. It is similar to a for-each loop in programming: it can iterate over a collection and execute activities within the loop. Let's go to ADF. In this demo, we'll see an example of the ForEach activity. Let's drag in a ForEach activity. Here we can change the name, and optionally we can put a description. In the Settings tab we have three options: Sequential, Batch count, and Items.
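The Lookup limits described above (the 5,000-row cap and the optional first-row-only mode) can be sketched in Python; the row shape here is hypothetical.

```python
# Sketch of the Lookup activity limits: at most 5,000 rows are returned,
# and "first row only" returns a single row.
def lookup(rows, first_row_only=False):
    if first_row_only:
        return rows[0] if rows else None
    return rows[:5000]  # larger result sets are truncated to the first 5,000

rows = [{"item": i, "price": i * 2} for i in range(7000)]
print(len(lookup(rows)))                  # 5000
print(lookup(rows, first_row_only=True))  # {'item': 0, 'price': 0}
```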
The Sequential option determines whether the activities will run in parallel or sequentially. The Batch count option determines how many activities run in parallel; the default value is 20 and the maximum allowed value is 50. In the Items box, we have to provide, via dynamic content, the collection that the ForEach activity will iterate over. Let's create a parameter; we're creating an array-type parameter. I provided three numbers: 10, 20, and 30. Now in the Settings tab, under Items, let's click on Add dynamic content. I'm selecting the parameter and clicking on OK. Now let's add activities to the ForEach loop. Here I created a table to hold the values, with a Value column. Let's see the procedure code. This is a simple procedure to test the ForEach activity: it has one input parameter, Value, which it inserts into the table. Let's go back to ADF. I selected the linked service, and now I'm selecting the procedure. The parameter name is Value, and in the parameter value we add the current item. Let's click on OK. Now we're debugging the pipeline, so we have to provide the parameter values 10, 20, and 30. Clicking on OK, the pipeline completed. Now let's check the values in the table. As you can see, the values 10, 20, and 30 are now available in the table. 9. Until Activity: Let's talk about the Until activity. It executes its inner activities in a loop until the expression provided evaluates to true. We can also specify a timeout value in the settings. Let's see an example. The General tab contains the activity name, and optionally we can provide a description. In the Settings tab, we have to provide the expression, which must evaluate to true or false. Next we have to provide the timeout; the default value is seven days, and we can change it if required. In the Activities configuration, we can add the inner activities. So the inner activities will execute until the expression evaluates to true or the timeout value is reached. Let's create a parameter.
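The Sequential/Batch count behavior can be sketched with a thread pool. This is a hedged illustration of the semantics, not how ADF schedules activities internally.

```python
# Sketch of ForEach settings: Sequential runs items one by one, otherwise
# up to Batch count items run in parallel (default 20, capped at 50).
from concurrent.futures import ThreadPoolExecutor

def for_each(items, activity, sequential=False, batch_count=20):
    batch_count = min(batch_count, 50)  # ADF caps the batch count at 50
    if sequential:
        return [activity(item) for item in items]
    with ThreadPoolExecutor(max_workers=batch_count) as pool:
        return list(pool.map(activity, items))

# The lambda stands in for the stored procedure that inserts each value.
results = for_each([10, 20, 30], lambda value: value)
print(results)  # [10, 20, 30]
```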
Here I created a parameter named Color of type String, and the value is red. Now let's go to the Until activity's Expression box and add dynamic content. Here we will use the logical function equals() and compare the pipeline parameter value with a hard-coded value. The parameter value is red, and I'm comparing it with red, so it must return true in this case. Let's click OK. I have not changed the default timeout value. Let's go to the inner activities. Here I added a sample Lookup activity to test the functionality of the Until activity. Let's debug the pipeline. We can see the pipeline succeeded, and the inner activity also succeeded, because the Until activity expression returned true and the inner activity executed. Now let's change the expression: I'm changing the compared value to green. Now I am comparing the parameter value red with green, so this time the expression evaluates to false. Let's click on OK. I'll also shorten the default timeout. Now let's go back to the pipeline. You can see the Until activity keeps running, and the inner Lookup activity keeps executing, until the expression evaluates to true or the timeout is reached. You can see the status: once the timeout was reached, the Until activity stopped. I hope you understand now how the Until activity works in ADF. 10. Switch Activity: The next activity is the Switch activity, similar to a switch statement in programming. The Switch activity takes an input expression and, based on that, one of multiple case statements executes. Let's see an example. Let's create a new pipeline and rename it to switch demo. Let's drag and drop the Switch activity. We can change the name, and optionally we can put a description. Let's go to the Activities tab. Here we provide the switch expression, and below we can add case statements. There is already a default case available, and we can add new cases as per the requirement. Now, let's add two more cases.
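The Until semantics — run the inner activities first, then re-check the expression until it is true or the timeout expires — can be sketched as a do-while loop in Python; the seven-day default mirrors ADF, and the lambdas stand in for the expression and the Lookup activity.

```python
# Do-while sketch of the Until activity: inner activities run, then the
# expression is checked; the loop ends on true or when the timeout is hit.
import time

def until(expression, inner_activity, timeout_seconds=7 * 24 * 3600):
    deadline = time.monotonic() + timeout_seconds
    runs = 0
    while True:
        inner_activity()  # e.g. the sample Lookup activity
        runs += 1
        if expression() or time.monotonic() >= deadline:
            return runs

color = "red"
# equals(pipeline.parameters.Color, 'red') is true on the first check
print(until(lambda: color == "red", lambda: None))  # 1
```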
I'm naming one case red and another case green. Now we need to add inner activities; let's start with the default case. Now let's go to SSMS. Here I'm creating a new table to hold the Switch activity value, and a stored procedure to insert the value. Let's go back to ADF. Let's add a Stored Procedure activity. I'm renaming it to stored procedure default, then selecting the linked service and selecting the procedure. Now I'm adding a parameter, and in the value I am using default. Let's add a similar activity to the red case, renaming the stored procedure activity to stored procedure red; in the stored procedure parameter value I'm using red. Let's go to the green case activities. I'm renaming the stored procedure activity to stored procedure green, and in the parameter value I'm using green. Now the Switch activity configuration is done. In the pipeline parameter, the value is red, so if I run the pipeline, the activities in the red block should execute. Let's debug the pipeline. OK, you can see here the activity in the red block, stored procedure red, executed. Let's also check the value from SSMS: as expected, the value red was inserted into the table. Now let's change the parameter value to green, so that the green case statement executes. Let's debug again, and we can see the inner activity in the green block executed. Let's go to SSMS and check the value in the table: now the value green was inserted into the table. Now let's change the parameter value to something else, so that the default case statement executes. Let's set the value to yellow. Debugging again, you can see the default case statement executed. Checking the table from SSMS, the value default was also inserted into the table. So I hope you understand now how the Switch activity works. 11. Templates: In this video, we'll talk about Azure Data Factory templates. Templates are prebuilt pipelines that can be used in development. By using templates, we can reduce development time.
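The red/green/default routing above behaves like a dictionary dispatch in Python; the sp_* names are stand-ins for the three stored procedure activities, not real procedure names from the course.

```python
# Dictionary-dispatch sketch of the Switch activity: red and green cases
# plus a default case for any other expression value.
def switch(expression_value, cases, default):
    return cases.get(expression_value, default)()

cases = {
    "red": lambda: "sp_red",
    "green": lambda: "sp_green",
}
print(switch("red", cases, lambda: "sp_default"))     # sp_red
print(switch("yellow", cases, lambda: "sp_default"))  # sp_default
```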
We can create our own templates, or we can use existing templates from the template gallery. Now we'll create our own template. Let's create a new pipeline and rename it. In this pipeline, I'll use a simple Stored Procedure activity. Let's set the linked service and the stored procedure. Now we have the option here to Save as template. We can provide the template name here; optionally, we can provide a description, and we need to save it to the Git location, under the templates folder. We can provide other details like tags, categories, services, and a documentation link. Let's click OK. Now we can see the template has been saved. Let's go to the Templates section, where we have the option to go to the template gallery. We can filter here by My templates; this is the template I created. Now if I want to create a pipeline from this template, I need to click here. I can also edit the template from here, and I can delete the template. Now if I want to use this template in a pipeline, I select the linked service and then we have the option Use this template. Now the pipeline is created, and I can add other activities as per my requirement. Similarly, we can also use the templates created by Microsoft. We have lots of useful templates available here. If you are new to Azure Data Factory, these templates are very helpful; you can use existing templates as per your requirement. 12. Sending Alerts to Microsoft Teams: In this video, we'll discuss how to send notifications to a Microsoft Teams channel. Once we develop a pipeline, we run it manually or use a trigger to run it at a scheduled interval. To quickly check whether a pipeline failed or succeeded, we can use notification alerts in Microsoft Teams, which is very useful. Many enterprise organizations use Microsoft Teams for messaging and video communication. If you are using Microsoft Teams in your organization, you can easily set up a notification and alert system from the Azure Data Factory pipeline.
Let's talk about the prerequisites. First, we need to set up an incoming webhook; this is the way an external service can provide information to a Teams channel. I'm in the Microsoft Teams app. Search for "incoming webhook"; you can see the Incoming Webhook app. Let's click on it. It says: send data from a service to your Office 365 group in real time. Now we have to click on the Add to a team button. You need to provide a team or channel name; I entered my team name here. Next, click Set up a connector. Now we need to specify a name; I provided a name for the ADF notifications. Optionally, we can upload an image. Now let's click the Create button. Now we've got the webhook URL; we need to copy this URL to use it in the pipeline. Click on Done to complete the setup. Now I am in ADF, and we have to create a new pipeline from a template. Templates are prebuilt pipelines available in Azure Data Factory; we can use a template to quickly create a new pipeline. Let's click on Add new resource, then Pipeline from template, and in the template gallery filter with the keyword "teams". Here we can see the template to send a notification to a channel in Microsoft Teams. We can see the description here, and the available tags, Microsoft Teams and Notifications. This is the preview pane. Click on Use this template. This is the pipeline created from the template. This pipeline has two activities: one Set Variable activity and one Web activity. Now, this pipeline has several predefined parameters. The first one is the Data Factory subscription; we need to provide the Azure subscription ID. The next one is the Data Factory resource group; here we need to provide the resource group name. Next is the pipeline run ID, which is optional. The next one is the Teams webhook URL, which we copied previously when we set up the Teams channel. The next is the activity name, which is optional, and the activity message, where we can provide an optional message.
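Conceptually, the template's Web activity just POSTs JSON to the webhook URL. Below is a hypothetical Python sketch: the pipeline name, run ID, and message format are illustrative, and the real ADF template sends a richer message card than the simple {"text": ...} form shown here.

```python
# Hypothetical sketch of posting a pipeline alert to a Teams incoming webhook.
import json
import urllib.request

def build_alert(pipeline_name, run_id, status="Failed"):
    # status defaults to Failed, mirroring the template's default parameter
    return {"text": f"Pipeline {pipeline_name} (run {run_id}) finished with status: {status}"}

def send_alert(webhook_url, payload):
    # Requires a real webhook URL; incoming webhooks expect a JSON body.
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

payload = build_alert("pl_switch_demo", "run-001", status="Succeeded")
print(payload["text"])
```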
Then the activity duration, the duration of the activity, and the activity status, whether it is Failed or Succeeded; the default value is Failed. Now I have provided all the required values for the parameters. Now let's run the pipeline. The pipeline executed, and now we can see the pipeline alert message in Teams. You can see all the parameter names and values here. To see further details, you can click on the pipeline link in the message, and it will open Azure Data Factory, where you can see the details of the pipeline run. 13. Conclusion: In this class we discussed the topics to get started with Azure Data Factory. I covered the basic components in ADF and then covered the most used activities in ADF. I hope now you are ready to start your cloud journey. Thank you for watching, and see you in the next class.