Introduction to Data Analytics on Microsoft Azure Cloud | AMG Inc | Skillshare

Introduction to Data Analytics on Microsoft Azure Cloud

AMG Inc, Technologist

Watch this class and thousands more

Get unlimited access to every class
Taught by industry leaders & working professionals
Topics include illustration, design, photography, and more

Lessons in This Class

66 Lessons (3h 50m)
    • 1. Introduction to the Course

      3:34
    • 2. Lab 001 Creating MS Azure Account

      2:47
    • 3. Lab 002 Creating Azure Storage Account

      10:29
    • 4. 05 1 Create and Setup Azure SQL Database

      11:37
    • 5. 5 2 Azure SQL DB Firewall Settings

      2:55
    • 6. 5 3 Install SQL Server Management Studio 01

      1:53
    • 7. 5 3b Installing the SQL Server Management Studio pt

      0:50
    • 8. 5 4 Connecting SQL Server Management Studio with Azure SQL DB

      2:43
    • 9. Theory 06 4 Workflow for the ETL in Azure Data Factory

      3:06
    • 10. Lab 06 1 Create Data Factory in Azure

      3:24
    • 11. Lab 06 2 Create Pipeline in Azure Data Factory

      1:50
    • 12. Lab 06 3 Create Linked Service for Blob Storage

      4:15
    • 13. Lab 06 4 Create Linked Service for SQL Database

      3:08
    • 14. Lab 06 5 Create COPY Activity and run the pipeline

      4:42
    • 15. Lab 06 6 Dealing with Multiple Datasets

      2:21
    • 16. Lab 6 7 Creating Tables for Multiple Datasets in SQL DB

      2:46
    • 17. Lab 6 8 Creating Linked Service for Multiple dataset in Blob

      6:30
    • 18. Lab 6 9 Creating linked service for multiple data load to SQL DB

      3:12
    • 19. Lab 6 10 COPY Activity for multiple dataset load to SQL DB

      4:28
    • 20. Lab 6 11 Upload movies dataset for ETL transformation

      1:48
    • 21. Lab 6 12 Create ADF Dataflow and Load Movies dataset

      4:26
    • 22. Lab 6 13 Apply Filter in ADF Data Flow

      3:14
    • 23. Lab 6 14 Average Aggregation Function in ADF

      2:11
    • 24. Lab 6 15 Sink Transformed Dataset to Blob Storage

      5:41
    • 25. 07 1 Create Azure Synapse Workspace

      3:11
    • 26. 07 2 Overview of Azure Synapse Workspace

      4:42
    • 27. 07 03 Upload Dataset in Azure Linked Storage and Perform SQL Query

      3:24
    • 28. 07 4 Create SQL Pool in Azure Synapse

      3:07
    • 29. 07 5 COPY Dataset from Data Lake to Workspace Database 01

      5:29
    • 30. Theory 8 1 What is SQL

      2:11
    • 31. Lab 8 1 Select All Statement

      3:18
    • 32. Lab 8 2 Select Single or Multiple Column

      1:54
    • 33. Lab 8 3 DISTINCT Statement

      2:20
    • 34. Lab 8 4 COUNT and COUNT(DISTINCT)

      2:46
    • 35. Lab 8 5 WHERE Statement

      3:56
    • 36. Lab 8 6 AND Statement

      2:34
    • 37. Lab 8 7 OR Statement

      2:02
    • 38. Lab 8 8 NOT Statement

      1:30
    • 39. Lab 8 9 UPDATE Statement

      2:35
    • 40. Lab 8 10 DELETE Statement

      2:10
    • 41. Lab 8 11 INSERT INTO

      2:10
    • 42. Lab 8 12 Why Join Datasets (Overview)

      2:48
    • 43. Lab 8 13 SQL Joins

      3:28
    • 44. Lab 8 14 SUM Function

      4:54
    • 45. Lab 8 15 AVG Function

      2:46
    • 46. Lab 8 16 Extreme Value Calculation and Sorting Data

      5:16
    • 47. Lab 8 17 Grouping and Filtering in Groups

      4:30
    • 48. Lab 9 1 Create Event Hub Namespace

      4:47
    • 49. Lab 9 2 Configure TelecoGenerator App

      2:56
    • 50. Lab 9 3 Create Azure Stream Analytics Job

      2:05
    • 51. Lab 9 4 Create Input Jobs

      3:13
    • 52. Lab 9 5 Create Output Job

      3:16
    • 53. Lab 9 6 Realtime Data Upload to Blob Storage

      5:29
    • 54. 10 1 What You Will Learn

      0:47
    • 55. 10 2 Import Dataset from Blob Storage

      3:22
    • 56. 10 3 Import Dataset from SQL Database in Azure

      2:17
    • 57. 10 4 Import Dataset from Excel File

      2:03
    • 58. 10 5 Cards in Power BI

      4:10
    • 59. 10 6 Bar Charts and Column Charts

      5:50
    • 60. 10 7 Line and Area Charts

      4:41
    • 61. 10 8 Line and Column Chart

      2:05
    • 62. 10 9 Waterfall and Funnel Charts

      4:22
    • 63. 10 10 Pie and Donut Chart

      2:40
    • 64. 10 11 Tables in Power BI

      2:37
    • 65. 10 12 Matrix Tables

      2:48
    • 66. 10 13 Decomposition Tree

      4:06

16 Students

-- Projects

About This Class

Microsoft Azure is one of the most popular public clouds in the industry. Nearly all of the Fortune 100 companies are moving to the cloud, and being able to work with it is one of the most important skills for every developer, architect, or IT admin.

This course is designed for students who are at the initial or beginner level in learning data analytics, cloud computing, and data visualization using Microsoft Azure cloud services.

This course covers what cloud computing is, followed by some essential concepts of data analytics. It also includes practical hands-on lab exercises that cover importing datasets and performing analytics on them.

The ETL tool used is Azure Data Factory, and analytics is performed using a visual tool known as Power BI. The lab portion covers the essentials of Azure SQL Database, Azure Synapse Analytics, Azure Stream Analytics, Azure Data Factory, and Power BI. Importing datasets, loading them, running powerful SQL queries, and then analyzing the same data using queries and visual graphical tools are all covered in detail.

Experience with the Azure cloud services taught in this course will give you an edge in the job market and position you for a successful career.

Meet Your Teacher

AMG Inc

Technologist

Teacher

Our company's goal is to produce the best online courses in data science and cloud computing technologies. Below is our team of experienced instructors.

Instructor1

Adnan Shaikh has been in the information technology industry for the past 18 years and has worked for Fortune 500 companies in North America. He has held roles from Data Analyst, Systems Analyst, Data Modeler, and Data Manager to Data Architect with various technology companies.

He has worked with Oracle, SQL Server, DB2, MySQL, and many other relational and non-relational databases over the course of his career. He has written multiple data access applications using complex SQL queries and stored procedures for critical production systems.

He has a master's degree from Northwestern University of Chica... See full profile

Class Ratings

Expectations Met?
  • Exceeded!: 0%
  • Yes: 0%
  • Somewhat: 0%
  • Not really: 0%

Transcripts

1. Introduction to the Course: Welcome to the beginner level course on Introduction to Data Analytics on Microsoft Azure. Our goal is to make database querying, big data analytics, and report generation easy to learn for people from every walk of life. Our focus throughout the course will be to teach you in-demand data warehousing skills, along with SQL, through hands-on lab exercises. We will use Microsoft Azure, which is a cloud service offering from Microsoft that is easy to use and fully functional. We will have a capstone project at the end of the course that will let you apply all the concepts in one place and in one go. Quizzes are designed to test your learning of key concepts. Meet our team, which played a role in researching, designing, and creating this course. Together we have close to 25 years of experience in practical implementation of technology and in teaching technical courses at the university level. Our team members have specializations in information technology, software engineering, data science, and programming. We will have a total of 11 sections that are designed to help students learn progressively. We will start with a basic introduction to cloud computing and data warehousing and gradually move to intermediate concepts on the Microsoft Azure cloud, setting up the portal and getting it running. We will look at creating SQL databases and how you can set up your database on the cloud as well as on the local machine you are working on. Then we have Azure Data Factory, which covers the ETL processes. Then we have Azure Synapse Analytics, which covers large-scale data warehousing, followed by the SQL foundation for data analytics. We will also learn about real-time streaming data, followed by Power BI, which involves visually analyzing our datasets. By the end of these lessons, you will be able to start data warehousing and analytics using the Microsoft Azure portal services and be able to generate analytics reports using Power BI. Key concepts are explained visually through articles to enable our students to grasp the technical concepts quickly and effectively. The lab portion covers the key concepts of the Microsoft Azure portal, creating your account and storage, starting from the very basic setup to in-depth analytics on datasets, and setting up the SQL database and SQL Server Management Studio for running queries on the cloud as well as on local machines. Azure Data Factory, which is one of the important fundamentals in setting up a data warehouse to perform the ETL processes, is also discussed in great depth, including setting up all your connections in ADF for useful analysis. Then we move to SQL queries on datasets in the SQL database, which covers all the SQL essentials, from basic to advanced queries required to perform analytics. Finally, we have the visualization tool from Microsoft, Power BI, where we will visually analyze the dataset and create our dashboards. Here are some of the companies adopting Azure cloud services to fuel their day-to-day operations and software development needs. We look forward to having you join our course, and we promise that this course will help you build your foundational skills in data analytics using Microsoft Azure cloud services, skills that can help you make your resume stand out and demand a competitive salary in the marketplace.
2. Lab 001 Creating MS Azure Account: Hello and welcome to this lab session. In this lesson, we will look at how you can create your free account on Microsoft Azure cloud services. Search for Microsoft Azure in your browser and you will find the on-screen link over here. Just click on this link. Over here you can see that you have a sign-in option and a free account option. Click on the free account option. On the next page, click on the Start Free button. So here is the sign-in page. I don't have an account, so I will create one by clicking on the Create one button. Type in your email ID over here. Next, create a password for your account and click on Next. To verify your email address, it will send you a passcode; just open your mail and type the passcode here. This is the passcode I received. Click on Next. It is creating my account. Let's solve a puzzle to prove that I'm not a robot. All right, so we are done. Let's wait for it to redirect on to the next step. Now it asks for your name and phone number so it can verify your phone number. Click on Next and choose the option to receive the code on your cell phone. I'm selecting the Text me option. This is the verification code. Click on Verify Code. Next it asks for your credit card details. You need to fill in your details, and don't worry, you won't be charged, because initially Azure provides you $200 of credit in your account for the first 30 days after the account is created. Only once those credits are consumed will you be charged from your credit or debit card. I will type in my details and continue on to the next step. The credit card has been accepted, and this is the agreement; check-mark it and click on Sign up. Your information is being confirmed by Microsoft. Now it is setting up your account. Your account is now set up. Just fill in this sign-up experience form and submit it to continue. Now it will redirect you to the main page for your account. So finally, our account is set up. We will look at the features, the services, user management, and all of that on Azure in our next lab sessions and upcoming sections. See you then.

3. Lab 002 Creating Azure Storage Account: Hello and welcome to this lab session. In this lesson, we will look at how to create Azure Blob storage in your Azure account. We will create a storage account, in it we will create a container, and in there we will upload our files. Before we begin, let me tell you about Azure Blob Storage. Azure Blob Storage is Microsoft's object storage solution for the cloud. Blob Storage is optimized for storing massive amounts of unstructured data. Users or client applications can access objects in Blob Storage via HTTP or HTTPS from anywhere in the world. Objects in Blob Storage are accessible via the Azure Storage REST API, Azure PowerShell, Azure CLI, or an Azure Storage client library. Let's begin. Here you can see that we have the Storage accounts option. Click on this option, and here we will first go to create a storage account, so click on this. Basically, what happens is that when you create a storage account, you then have the containers and blobs inside it. This is basically for sharing your data among the different resources inside Azure itself. The first thing we have here, under the Basics option, is to select the subscription. As we have already discussed, currently we have a free subscription on our free account.
So we will select this Azure subscription 1 as of now. Then we have the Resource group option. As of now, we haven't created any. Basically, a resource group is a group that holds multiple resources inside it. Let's create a new group over here; I will name it Azure resource group and click OK. Next, we have to give the storage a name. This should be unique, because inside Azure, multiple accounts cannot have the same storage name; the storage is mainly managed through the account details itself. I will name it my Blob Storage 155. Then you need to select the region where the storage will be managed. You have all these options for the region selection, and you can select any one of them depending on the location nearest to you. I would prefer going with the US East option. Then there is an option for the performance type; we have Standard and Premium options, and I will stick with Standard. The redundancy option will be geo-redundant storage for us. There are other options as well: the locally redundant option, which is the lowest-cost option and provides basic protection against server rack and drive failures; then zone-redundant storage; and then geo-zone-redundant storage. Let's select the option as discussed and move to the next, Advanced options. These are some advanced options; we will keep them as default and move on. Next, over here we have the networking options. Here we have the connectivity method as public endpoint, which allows all networks to access this storage. Then there is an option for a public endpoint for selected networks, and then the private endpoint. If you use the private endpoint option, you are not providing any public access for anyone to access this storage; whatever files you have kept there will not be shared with anyone. And if you go with the public endpoint with selected networks, you then have to provide the virtual networks that can access it, and we haven't defined any as of now. So we will go with the public endpoint option, allowing all network access. In this case, all of your Azure services have access to this storage where required. Moving on, you have the network routing options; I will select Microsoft network routing. Next, I will keep all of these as default and won't be making any changes. Click on Next. Then there is the tags option, if you want to assign a tag and value to your storage so that you can differentiate among multiple storage accounts when you have them. All right, moving on to the next option to review and create our storage. It will validate and enable the Create option for us, and here it is. Let's create it. It will take a couple of minutes to create this storage for us and deploy it as a resource to our account so that it can be used. It is submitting the deployment as of now, and the deployment is in progress. Once it is available, it will show the status. All right, the storage is now deployed and it says the deployment is complete. It is also here in the notification area as well. Let me close this. So the deployment is complete. The blob storage name is this, this is the start time when it was created, the subscription name used to bill the charges for the storage is here, and the resource group in which the service is contained is the Azure resource group. These are all the basic settings for this storage.
Let's move to this resource now. Here we are in the my Blob Storage 155 dashboard. It shows you the location for the storage where it is contained, and this is the one we selected at the time of creating the storage account. Then there is the secondary location where the redundancy backups will be made, or in case the actual storage location is not available, this will be used instead. The subscription we already discussed; you can change it as well. If you have multiple subscriptions, you can switch to another one to change the billing method and other management settings. Then you have the subscription ID, the state, and all that information, and below are the properties for this blob storage account. So let's move on to creating the container to upload some files. In the left side panel you will find the Containers option; select it. You can see that we don't have any containers in the storage account as of now. A container organizes a set of blobs, similar to a directory in a file system. A storage account can include an unlimited number of containers, and a container can store an unlimited number of blobs. Click on this button here to create a new container. Then we will give a name for this container; let's name it my first container. All right, the tick mark shows that this name is available for us. Then there is the option for the public access level, whatever access level you want to provide to this container, and there are these three options. First, we have the private access level, where no one can access anonymously. Then there is the blob access level, where anonymous read access for only the blobs is allowed. And finally, there is the option for container-level access, where anonymous access for reading the containers and blobs is granted. We will go with the container-level public access option, and we will create this container. So our container has been created successfully. Let's move inside the container dashboard so that we can upload some files that can be used by some Azure services. Here is our dashboard for this container. You can see the access control, the diagnose and solve problems option, and other settings as well. Let me show you the access control option over here: click on View my access to have a look at the access this container has. It's still loading. Here it shows that you have the Owner role; this level of access grants you full access to manage all the resources. Let me go back to the overview, and now I will upload my files using this Upload button. Click on this Upload button over here. It pops up a window and asks you to browse and locate the file. Let's browse for the file on my local PC. I have this dataset folder, and from there I will upload one of these files, the sales record 100, which contains the 100 sales records dataset. Select the file, click on the Open button, and then click on the Upload button here to upload this file. So the upload begins and the file is uploaded. This is how you create a storage account, create containers, and upload files inside it, giving it network access with the help of the public endpoints so that it can be used by any Azure service. So here we have the file uploaded, and we will be using it in our future lab sessions to copy the dataset to the SQL database and then perform some queries over it.

4. 05 1 Create and Setup Azure SQL Database: Hello and welcome to this lab session.
In this lesson, you will learn how to create and set up a SQL database in Azure cloud services. All right, so in this view of resources, you can see the Azure SQL option, and there is also this SQL database option as well. If you don't find it over here, you can search for it in the search bar: type in SQL database. You see we get all the search results related to the service we searched for. I will go with the SQL database option. Now, this is your dashboard for the SQL databases in Azure. We can see that we don't have any SQL database as of now, because this is a newly created account. Let's create a new database. You can either create it from here or from this button bar here; click on Create SQL database. So here we will create the database. We will be using our free subscription, Azure subscription 1, which, as we discussed initially, is available to us as a free tier. Then we need to select the resource group. If you don't have a resource group, you can create one, and right now we have one created, so we will go with this Azure resource group. Moving on, we need to give a database name. This database name should be unique, and to mention here, the name should be globally unique. Let's give this a name; I will name it sales record DB. It shows a tick mark, meaning that this name is available for us. Next, we need to select a server. In the drop-down, you will see that we don't have any server as of now, so we will create one, using which we can connect to the database. Now give the server a name; let's name it MySQL DB. It gives an error that the name is already in use, so remember, this name should also be globally unique. Let me name it My SQL DB dash sales, and this name is available for us. Moving on, you need to give a login username for the server; let's name it Azure User. Then we need to set a password. You can choose any password of your choice; just ensure all these points are considered while setting it up. Now retype your password to confirm it, and then go with the location selected. All right, so we have configured our new server. Now click on OK. The server is created and we have it in our list over here. Then there is the option for the SQL elastic pool. It is basically used for scaling multiple databases that have varying and unpredictable usage demands. We don't need it as of now, so I will select No. Next, we have the compute plus storage option. Depending upon your requirement, if you go to the Configure database option, you can configure the compute and storage as well. Here are all the premium options; you can see them here. We will go back and stick with the Standard tier. So we get ten DTUs, where a DTU is a Database Transaction Unit, and the maximum data size is 250 GB. You can change it as well with the help of a slider. There is also a Basic option which you can use; with it you have five DTUs and two GB of maximum data size. All right, I will be using the Standard option for my database, and it shows the charge that will be billed is around $15. If you go with the Basic option, you will be charged around $5, but obviously you then have limited functionality as well. So you can see that it all depends on your requirements: if you want to process a large amount of data, you can go with the Standard or the Premium tiers.
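As a side note, the pricing tier chosen in the portal can also be inspected, and later changed, from T-SQL. This is only a hedged sketch; [SalesRecordDB] stands in for whatever database name you chose:

    -- Check the current edition and service objective of this database.
    SELECT
        DATABASEPROPERTYEX(DB_NAME(), 'Edition')          AS Edition,
        DATABASEPROPERTYEX(DB_NAME(), 'ServiceObjective') AS ServiceObjective;

    -- Scale the database to the Standard S0 objective
    -- (may need to be run while connected to the master database,
    -- and requires sufficient permissions on the logical server).
    ALTER DATABASE [SalesRecordDB]
        MODIFY (EDITION = 'Standard', SERVICE_OBJECTIVE = 'S0');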
If you are just working with a small amount of data, with a limited number of rows or data points, you can go with either Basic or Standard. Let's select Standard and click Apply. The next thing you have is the backup storage redundancy. I will keep the default backup storage redundancy; we have already discussed redundancy during the storage account creation in Azure. Next, let's move on to the networking settings. This is a very important part, and you can see it shows the connectivity method. If you select No access, then whatever service within Azure tries to access this SQL database, it won't be possible, because you have selected no access. For a private endpoint, you need to provide the private endpoints for the different services that can connect to this database, that is, whichever Azure services we want to give access to the database we are creating. And then we have the public endpoint option: you can give access to all Azure services and to the current client IP address as well. Here I will select both these options. Then we move to the next option, for security. We don't need Azure Defender as of now, as we don't have any confidential data to protect. Let's move to additional settings. Here we have the option to select the data source: either you can select None, which is a blank database, or you can select a backup to restore the database from. When I select this, it gives me the option to select any of the backups I have in my account; as this is a new account, we don't have any backup, so it's not giving any option here. And if you select the Sample option here, it will load the sample database so that you can use it in SQL queries. So it all depends on what data source you want: a blank database, a sample, or a backup. Right now we will go with the None option, because we don't need a sample dataset and we want to create our own table and bring in some data. Next, you have the collation option. In database systems, collation specifies how data is stored and compared in a database. Collation provides the sorting rules and the case and accent sensitivity properties for the data in the database. The default collation for character data in SQL databases is SQL_Latin1_General_CP1_CI_AS. This collation is also used across the SQL Azure infrastructure to sort and compare metadata that defines database objects. The server-level and database-level collations are not configurable in SQL Azure. Here we have the maintenance window time, which is the system default; during this timeframe, any maintenance from the Azure team will be applied. Let's move on to the next option, for tags. Here you can give any tag to this database; it is helpful when you have multiple databases, to differentiate between them. And finally, moving on, we will review and create our database. Just review all the settings and click on the Create button here. So the database is being initialized and submitted for deployment. Let's wait for this database resource to be deployed for use in our account. The deployment is in progress as of now, and it will take a couple of minutes to be deployed and ready for us to use. All right, the deployment is now completed and the database is ready to use. It gives us a notification as well that it is now added to the resource group and was successfully deployed.
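As an aside, once the database is deployed you can confirm the collation mentioned above from the query editor; a minimal sketch (the property names are standard T-SQL, only the context is assumed):

    -- Collation of the current database and of the logical server.
    SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS DatabaseCollation;
    SELECT SERVERPROPERTY('Collation')                AS ServerCollation;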
Here is all the detailed information: the start time was this, this is the subscription used for this database resource, and this is the resource group name, which is the one we created at the time of the blob storage. Let's now move to this resource. This is the dashboard for this database, with the server name My SQL DB dash sales and the database named sales record DB. Once we start using the database and performing queries, you can see all the processing statistics over here. The server is online as of now, and the location is East US. This is the subscription used and here is the subscription ID. Then we have the server name. We haven't used any elastic pool. The connection string is available here and the pricing tier is Standard. So this is how you configure the SQL database. If you look inside, these are the options that you get for your database. Let me quickly show you the Query editor option here, where you perform queries on your database. It asks you for the server login credentials; you can also authenticate using your Azure account, but I would prefer to go with the server authentication method. Just type in the password we created when setting up the server for the database and log in to the server. Here is our query editor. You can see that under the Tables option there are no tables, because, just recall, at the time of creation we selected an empty database, so each of the folders is empty as of now. Here in this editor you perform the queries, and we will create one table here. I have the code with me beforehand; let me paste it here. This is the table I want to create. Let's run it. You can see that the query succeeded and affected zero rows. If you refresh this, you will find that the table we just created appears over here with the name person. You will find the dbo prefix with the table name; the dbo, or database owner, is the user account that has implied permissions to perform all activities in the database. So this is how you create a database and then go to the query editor to perform some queries.

5. 5 2 Azure SQL DB Firewall Settings: Hello and welcome to this lab session. In this lesson, we will learn some essential settings to set up your SQL database firewall so that the database can be used from any third-party SQL workbench or from SQL Server Management Studio, which is a Microsoft product used on a local PC. The basic idea is to access the SQL database we created in the cloud from our local PC. So let's get started. On the Azure portal dashboard, go to the SQL database service. These are our two databases that we created previously. You can go to any one of them, and I'm interested in the sales record DB database this time; click on this database. Over here you will see the option to set a server firewall; click on this option. This is our firewall settings main page. You have public access, and this allows public network-level access to this particular database, and then you have other settings as well. The option we are looking at here is for the settings that allow our local machine, with any client, which in our case is SQL Server Management Studio, to connect with this database in the cloud. For that you will need to have access. First of all, you need to know the IP of the machine that wants to access this database residing in the cloud. The option here allows you to add multiple IPs that can access this database.
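As a hedged aside, the same kind of server-level firewall rule can also be created from T-SQL with the sp_set_firewall_rule system procedure, run in the master database of the logical server; the rule name and IP address below are placeholders for your own values:

    -- Add (or update) a server-level firewall rule for a single client IP.
    -- Run while connected to the master database of the logical server.
    EXECUTE sp_set_firewall_rule
        @name             = N'AllowMyWorkstation',
        @start_ip_address = '203.0.113.25',
        @end_ip_address   = '203.0.113.25';

    -- List the rules currently defined on the server.
    SELECT name, start_ip_address, end_ip_address
    FROM sys.firewall_rules;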
If you go and click on the option to add a client, you see that it automatically detects my IP. Let's say you don't get your IP here; in that case, you can look up your IP using a tool such as the whatismyip.com website, as in my case you can see over here, and the same IP also appears in the client list in Azure as well. If you are looking to make your database public to everyone, so that anyone with the username and password can access it, you can add a start IP of 0.0.0.0 and the same as the end IP as well; in that case anyone can access this database from any PC. Now click on the Save button to save these settings. Now that we have added the IP of our machine, we can access this database from our local machine. For that, we need SQL Server Management Studio, and we will look at how to install and configure it in our next lab session.

6. 5 3 Install SQL Server Management Studio 01: Hello and welcome to this lab session. In this lesson, you will learn how to install SQL Server Management Studio on your local machine to perform SQL queries on your databases available in the cloud. This will be a two-step process: first we will install SQL Server and then the Server Management Studio. For this, you need to search for SQL Server Management Studio on Google and click on the first link available here. You will be redirected to this page on Microsoft. Now click on the blue button here, which says download SQL Server. You have these two options here: either go with the Developer option or with the Express option. We will proceed with the Developer option. I have already downloaded it, so we'll continue the installation process. We have three options here for the installation type: the Basic, Custom, and Download Media options. If you are installing it for the first time, I recommend you go with the Basic option. So I will click on the Basic option, accepting the terms to move ahead. Now you need to give the location where you want SQL Server to be installed. Click on Install, and the installation is in progress. All right, our installation of SQL Server is completed successfully. The next step is to install SSMS; for that, click on this link here. You will be redirected to a new webpage. Scroll down to this option here and click on the download option for SQL Server Management Studio 18.9.1. So the Management Studio is downloading as of now. I will continue with the installation of the studio in the next lab session.

7. 5 3b Installing the SQL Server Management Studio pt: Hello and welcome to this lab session. This is where we left off. Our file has been downloaded; now let's run it and install the Server Management Studio. Here you give the location where you want to install SSMS, and if you want, you can change the location as well. I'm okay with this default location, so let's continue to install. The installation has been completed successfully, and now we will have to restart our PC. A restart is required, so I will restart my PC and then we will continue in the next lab session using SSMS.

8. 5 4 Connecting SQL Server Management Studio with Azure SQL DB: Hello and welcome to this lab session. In this lesson, we will look at how to access the cloud database from our local machine using the SQL Server Management Studio we installed. Let's get started on the Azure portal and go to the SQL databases service. Here we have these two databases.
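As a small aside, the same databases listed in the portal here can also be listed with a query once you are connected to the server; a minimal sketch:

    -- Run while connected to the master database of the logical server
    -- to list the databases it hosts.
    SELECT name
    FROM sys.databases;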
We will select the sales record DB, as we have configured the firewall for this one previously. Here on the database dashboard, you will find the server name; I will copy the server name. Now search for SQL Server Management Studio on your PC. This is the studio. Here you have the server type option, and there are multiple options available; we will select the Database Engine. Next, paste the server name that we copied into the server name field. The authentication mode will be SQL Server authentication. Now type in the username and password that we assigned to the server while creating it; our username was Azure User, and type in the password. Now click on Connect. If the firewall settings and the username and password you entered are correct, it will connect successfully to the database. Here you go, our database is connected, and here you can see we have our folder details fetched from the database in the cloud. In the Databases folder, you can see that we have both our databases available. You can expand them and see the folders inside the sales record DB. If you look at the tables, here is the table we created previously. You can also perform queries here in the studio on the databases in the cloud. So let's perform a query here. Right-click on the sales record DB and select the New Query option. I'm interested in looking at the top 1000 rows, so if I run this query, you will find that we get all of the results from the database that we have in the cloud. So this is how you connect and configure the SQL database in the Azure portal and on your local machine. We will be dealing with this dataset throughout the course, and we will look at each step in detail: how to bring data into the SQL database, perform queries, visualization, and all of that. So stay connected.
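The query text itself is not shown in the transcript; as a hedged sketch, a top-1000 query of the kind described here would look like the following, with dbo.SalesRecord assumed as the table name:

    -- Fetch the first 1000 rows of the table in the cloud database.
    -- The table name is an assumption; substitute the table you created.
    SELECT TOP (1000) *
    FROM dbo.SalesRecord;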
9. Theory 06 4 Workflow for the ETL in Azure Data Factory: Hello and welcome to this lab session. In this lesson, we will look at how to perform the ETL (extract, transform and load) process in Microsoft Azure with the help of the Data Factory service. We're going to be looking at the overall architecture, understanding the ETL process, and practically implementing it on the Azure platform. In this diagram, you can see that we have the blob storage over here, which contains a CSV file for the sales record, and on the other hand we have the SQL database table for the sales record. Both of them have the same schema. We have already seen how you can create a storage account and upload some files there; in our case, we uploaded this sales records CSV file. Then we saw how you can create a database in Azure SQL and create a table there, so we have a sales record table in the SQL database. With the help of the Data Factory in Azure, following this procedure, we are interested in bringing the dataset contained in the sales records CSV file in the blob storage into the SQL database table, so that we can then perform some queries and analytics. For this, we need to follow a series of steps, and in Azure this is done with the help of an integration service known as Data Factory. Once you are inside the Data Factory, the first step is to create the pipeline. A pipeline is a logical grouping of activities that together perform a task; the activities in a pipeline define actions to perform on your data. The next thing we do is create a linked service for both the source and the sink. A linked service is used to connect a data store to ADF. It is very similar to a connection string in that it defines the connection information needed for ADF to connect to external resources such as blob storage or a SQL database. First, we create a linked service for the blob storage, using which we can access the CSV file inside the blob storage. The CSV file will produce the source dataset, which is a named view of data that simply points to or references the data you want to use in your activities as input. On the other side as well, we will create a linked service that will connect to the SQL database table; with the help of that linked service, we will be able to get the table information that is created inside the database. Finally, we will have the activity, which is the copy activity in our case, which will be responsible for copying the dataset from the CSV inside the blob storage into the table in the SQL database. We will be looking at all of these steps and how to perform them on the Azure platform, and I will discuss each step in this architecture while performing it practically.

10. Lab 06 1 Create Data Factory in Azure: Hello and welcome to this lab session. In this lesson, we will look at how to create a Data Factory resource in Azure. We are on the home screen of our Azure portal; in the search bar, search for Data Factory and select this on-screen option. Here I will create a Data Factory. You can create it from this button or from the Create option over here; let's click on the button to continue. As usual, we will use Azure subscription 1, which is our free tier, and then select the same resource group in which we have the other resources we previously created. Then we have the instance details. The first one is the region where the Data Factory will be managed; we will keep East US. Next, give it a name, and this name should again be globally unique. Let's name it sales record ADF, ADF being short for Azure Data Factory. We see a check mark here, which means this name is available for us. Next is the version, and here we have already selected V2, which is the latest one. Next, let's move on to the Git configuration. Here I will go with the Configure Git later option. Click on Next now for the network settings. Here we have the option to manage the virtual network; as of now, we are not using any of them. Let's just select the right option for the self-hosted integration runtime inbound connectivity: we will use the public endpoint so that any service can have access to the Data Factory and the Data Factory can also access the other Azure services. Click on Next for some advanced settings; here we don't need any Data Factory encryption. Let's move to the tags. Here you can give your Data Factory a tag so that it is easy for you to manage; this is optional and up to you, and I will skip it. Let's move to the final step to review and create the Data Factory resource. Here are all of our settings, and now let's create the resource. The deployment is being initialized and it is submitting the deployment; now the deployment is in progress. Finally, the resource is deployed with all these settings, as you can see in these details. Now let's move on to this resource to start creating the pipeline for our ETL process. This is the resource dashboard. Here it shows the pipeline runs; currently there aren't any, as we have not created a pipeline yet. Then there are the activity runs, which give information on the number of activities you run in the pipeline.
Then there are the trigger runs, the integration runtime core CPU, and the integration runtime memory. So this is how we create a Data Factory resource in Azure. In our next lab session, we will look at how to create pipelines in this resource.

11. Lab 06 2 Create Pipeline in Azure Data Factory: Hello and welcome to this lab session. In this lesson, we will learn how to create pipelines in Azure Data Factory. This is the dashboard which we get once we create the Data Factory resource in our Azure portal. Here we have two options to start with: you can follow the documentation for Azure Data Factory or start creating the pipelines in the Author and Monitor section here. Let's go into it. So the Data Factory is loading. Here are the options that Data Factory provides. You can start by directly creating a pipeline or create a data flow. You can also create a pipeline from a list of available templates in Azure, directly start by copying data, configure SSIS integration, which is SQL Server Integration Services, or set up the code repository. You can directly start a copy activity from this option here, but I recommend starting from scratch in the Author option over here, as you will be using this option most of the time to edit and manage your pipelines. The first step in our ETL architecture is to create the pipeline. Click here on the Pipelines option, click on the three dots here, and go to the New pipeline option. As you can see in the architecture, this is our first step: creating the pipeline. Coming back to the Azure portal, we will assign the name of our pipeline; let's name it blob to SQL DB. This means that using this pipeline, we will load data from blob storage to the SQL database. This is how we create an empty pipeline. In our next lab session, we will start adding the linked services and related activities to run the pipeline.

12. Lab 06 3 Create Linked Service for Blob Storage: Hello and welcome to this lab session. In this lab session we will look at how to create the linked service for the storage account to bring the source data into our pipeline. We have created the pipeline; the next step is to create a linked service for the blob storage that will be responsible for connecting to the data source in the storage account container. Over here is the Manage option, where you will find the option for the linked services. Click on the Create linked service option here. In the data store option, we will select Azure Blob Storage; search for it in the search box here, select it, and click on Continue. In this option here, give the linked service a name; I will go with the default name, which is AzureBlobStorage1. Let it connect via the AutoResolve integration runtime, and let the authentication method be the account key itself. Moving on to the Azure subscription option, select the available one; I will select this. Then, under the subscription, it will show the available storage account names; let's select ours. This is the one we created previously, so I just selected it. Then we click here on Test connection, which will verify the connection from the linked service to the storage account. Here the connection is successful; this means that the linked service is able to connect to this Blob Storage 155, which is in this subscription, using the account key authentication method. We will just create it. Okay, so the linked service is now created. Now go back to the Author option.
With the help of this linked service, we will bring in the dataset metadata from the file stored in the blob storage. Select the Datasets tab and click on New dataset. Select the data store from where you want to bring in the dataset; we know it is blob storage, so search for it in the bar here, select it, and click on Continue. Now we need to select the format. The format is delimited text, which is the CSV file; let's select this option and click on Continue. Select the linked service with the help of which you will connect and bring data from the blob storage; this is the linked service we just created, AzureBlobStorage1. Now, using the browse option, locate the sales record file. First, select the container we created earlier with the name my first container, and inside it we will select the sales record 100 CSV. You can see we have the complete file path now available. In our dataset file, the first row is the header, so check-mark this option. I want my schema to be imported from the connection/store itself. Now click on OK to continue. You can see that the file, with the name DelimitedText1, is now here. It also shows the schema of the file; these are all our columns in this dataset. Then here is the option to preview the dataset; using this, you can actually preview your dataset, and here is my dataset and this is how it looks. So we have no issues bringing this dataset in from the blob storage. Next, let's rename this source dataset to its actual name, sales record, and here the changes are reflected in the dataset file name for the source. This is how we create the linked service for the source data. In our next lab session we will look at how to create a linked service to sink our data into the SQL table.

13. Lab 06 4 Create Linked Service for SQL Database: Hello and welcome to this lab session. In this lesson, we will learn how to create a linked service to sink our data into the SQL database so that we can load the source data there. We need a connection to the database so that the actual dataset can be fed into the tables of our database. I will go to the Manage tab to create the other linked service, for the SQL database. Click on the New option and search here for Azure SQL Database. Here it is; select it. Over here we are giving this linked service the name AzureSqlDatabase1. We will use the AutoResolve integration runtime. The subscription we will use is Azure subscription 1, in which we created the sales record database. Next, select the server, My SQL DB sales; the database name was sales record DB. Then we select the authentication type, which will be SQL authentication, so you need to give the SQL server username and login password to let this linked service connect to the database. Now click on Test connection so that we know we have no issues with the connection. The connection is successful; this means that the linked service for the SQL DB was able to connect with My SQL DB sales with the help of the server credentials we provided. Let's create this linked service. Again, I will move back to the Author option. This time, I will be creating a new dataset from the table inside the database we have created. This was the previous one, which is the source; now we will go and create a sink dataset where we will load all the values from this source dataset. Over here on this option, select New dataset. Now we will select the SQL database and click on Continue.
This time we will select the AzureSqlDatabase1 linked service. Now we will select the table name that is inside our database: it is the dbo.SalesRecord table. If you want, you can edit it as well. Then we will import the schema from the connection/store itself. Now click on OK. Here we have the schema for the table, which we previously created in the Azure SQL database, with all these datatypes. So now you can see that we have the source dataset as well as the dataset table. Let me rename the dataset table; I will name it sales record DB. All right, now we have created a linked service for the sink as well. In our next lab session, we will learn how to create the copy activity so that we can copy the data from blob storage to the SQL database.

14. Lab 06 5 Create COPY Activity and run the pipeline: Hello and welcome to this lab session. In this lesson, we will learn to create the copy activity and run our complete workflow. The next thing we need to do is create the activity for this. We are done with the part for our blob storage and the other part for the SQL database; now we need to create a copy activity so that we can run this complete architecture. Let's go back to the portal and see how we can create this activity. For this, we will go to the Author option, select the pipeline we are working on, and go to the Move and Transform option. Here is the common Copy option; hold it and drag it to the workspace. You see it shows the options for this activity to set up. Let's give it a name; I will name it sales records CSV to database. Then, in the next tab, you need to provide the source and the sink datasets. We have seen in the architecture as well that this is our source that we will get the file from, and then we will sink this data to the table inside the database. So here I will go to the Source option and select the source dataset, which is the sales record with the CSV icon. Then let's move to the Sink tab, and here we will select our sink table in the database; this is where we will load the data. The most important part is the mapping, where you need to set the mapping between the source and the sink dataset tables. For this we will click on the Import schemas option, and here it is, with some errors. You can see that the column names in the source with spaces in them had some issues while mapping to the sink, or you can say the target table. Now let's correct them. The first error we have is the item type, so let's search for the item type in the drop-down for the source table; now the error is gone. I will do the same for all of them. We have now successfully mapped each of these, so you can see that the mapping is now perfectly fine, and each of these strings will be converted to the respective datatypes assigned while creating the table in the SQL database. By default, whenever you import a dataset as a source from CSV format in Data Factory, it makes all columns a string datatype, and then, using the transformation of datatypes, you can convert them to the required format, as we are doing here. So now let's debug this architecture for the Data Factory pipeline we just created, so that if any issues are found before we publish, they can be sorted out. Here is the Debug option; click on it. If everything is working fine, it will execute without any errors, and then we will be good to publish all in this pipeline. All right, the debug was successful and it did not give us any error.
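As an aside, once the pipeline has been published and run (as done in the next steps), a quick way to confirm the load from the query editor is a row-count check; this is only a sketch, and dbo.SalesRecord is an assumed table name:

    -- Confirm that the copy activity loaded rows into the sink table.
    SELECT COUNT(*) AS LoadedRows
    FROM dbo.SalesRecord;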
So now let's publish it. Here it shows all the pending changes which will be saved and published for execution; now click on Publish. It is publishing and deploying changes to the factory. The publishing was completed successfully. Now let's move to the database to check whether we have this data in the table. Select the SQL databases service, select the database we created, then go to the Query editor and log in with the credentials. Here is the table; let's query it to check the dataset inside it. Let's select the top 1000 rows and run this query. The query succeeded and it is fetching the rows to display, and here you go, we get the dataset on querying the database. This is the same dataset file which was in the blob storage and is now available to us in our database. So this is how you perform the ETL process in Azure using the Azure Data Factory.

15. Lab 06 6 Dealing with Multiple Datasets: Hello and welcome to this lab session. In this lesson, we will look at how you can upload multiple dataset files to the blob storage. Previously, we saw how to upload a single file and perform the ETL process: we uploaded a CSV file and, with the help of a linked service, brought it into the Data Factory; then, with the help of another linked service, we fetched the metadata of the table that was in our database; and then, using the copy activity, we copied all our data from the blob storage to the database table in the Azure SQL database. This time we will be doing the same thing, but for multiple dataset files. We will copy the two files that we have in our blob storage, that is, the dispatch records and the item records, and then we will move them to the SQL database. Let's move on to the practical for this workflow. The first thing I will do is move to the blob storage. Here I will create a new container; click on the new option. Let me name it multiple ds dash sales record, with ds short for datasets, so this container will contain the multiple datasets for the sales record that we have. Setting the access level to public, we will continue to create this container. Our container is now created successfully, and I will move inside it; in this we will upload our files. Click on the Upload option here, browse for the files, select these two files, and then upload them. The files are now uploaded. The next thing to do is move to the SQL database, and over there we will create new tables, one for the dispatch item details and one for the item sales details. Once those are created, we will follow the concept of pipelines, and after that, with the help of the copy activity, we will bring these files into the database.

16. Lab 6 7 Creating Tables for Multiple Datasets in SQL DB: Hello and welcome to this lab session. In this lesson, we will create database tables for the two datasets we uploaded to our blob storage, so that we can move the contents of these two datasets to our database using the Data Factory pipelines. Let's move on to the dashboard and select the SQL database option. Let's create a new database; I will name it multiple sales record DB. Moving on with the default options as previously. Next are the networking settings; keep these as default. Then for security, the default settings are also all that is required. So let's move on to the review option and create this database. All right, the deployment is now completed; let's move to this resource. On the left, we have the Query editor option; select it and type in the password that you set at the time of database creation.
As we created an empty database, you can see that we don't have any tables yet. I have the scripts ready to create the database tables, so let me bring them in. This first script creates the table for the dispatch items, with five columns. Remember that we have two different datasets in this case, and this is one of them. Run the query; it succeeds. Refresh the Tables tab, and under the Tables folder you will see that our table has been created. Next, I create the table for the other dataset, the item sales table, which has ten fields. You can find the SQL scripts to create these two tables in the resources section. Run the query, and there we go, it executes. I refresh the tables folder, and we now have two tables created. So we have successfully created the tables for the two datasets we uploaded to Blob Storage. The next thing we need to do is copy the data from storage into these tables, and in our next lab session we will walk through the steps required for that.

17. Lab 6 8 Creating Linked Service for Multiple dataset in Blob: Hello everyone and welcome to this lab session. In this lesson, we will learn how to create linked services in Azure Data Factory. Let me explain the process with the help of our flow diagram. What we want is for these two datasets in Blob Storage to move into the Azure SQL Database so that we can start performing analytics on them. There is no direct method for this: you first need to create linked services, one for the Blob Storage and one for the table metadata, and then with the help of the copy data activity we copy the data from Blob Storage to the database. Let's get started. In the Azure dashboard, look for the Data Factories option. I will create a new Data Factory, select the same resource group, and name it multiple-sales-adf (ADF is short for Azure Data Factory). The name is available, so we can use it. Next is the Git configuration, where we select the Configure Git later option. Then the networking option; I'm fine with the public endpoint. Leaving everything else as default, let's review and create this Data Factory and wait for it to be deployed and ready to use. All right, the deployment completed successfully. Click on the resource, and here you have the option named Author and Monitor; click on it. The Data Factory loads, and this is the main screen we have already seen previously. Go to the Author option. These are all the options in the Author tab, and I'm interested in creating a new pipeline. Let's name it "multiple dataset copy", since it will be used to copy our multiple dataset files. Now we need to create a linked service. For that, go to the Manage option and click on Create new linked service. Following our flow diagram, we first create a linked service for the Blob Storage, which is our source in this case. Search for the Blob Storage option in the search bar, and here it is; select it. I keep the default name for this service, select the AutoResolveIntegrationRuntime, and choose Account key as the authentication method. Next, select Azure subscription 1 as the subscription for this Data Factory and the storage account myblobstorage155; it automatically loads the key for us.
Leaving everything else as default, it is always recommended to test your connection before you proceed. Our connection is successful, so we are good to create the linked service. We have now created our linked service for Blob Storage. The next thing is a linked service for the SQL database, so that we have the table metadata in our Data Factory. Before creating that, we need to create the datasets inside Azure Data Factory. Go back to the Author option, move to the Datasets section, and click on New dataset. We need a dataset from Blob Storage, so select the Blob Storage option; the file format, as we know, is CSV, so select the Delimited text option. Rename the dataset to "item details CSV". Now select the linked service; here is the one we created for Blob Storage. Next, select the path of our dataset file: browse for the item sales detail CSV file we uploaded previously and select it. Also remember that our first row is the header, so tick that option and click OK. Click on Preview data, and this is the actual dataset inside the item detail CSV file. In the same way, let's quickly create the dataset for the other file as well. Go to Datasets and select New dataset again, the same Blob Storage option with our newly created linked service, the same delimited text (CSV) type, and browse in our blob for the dispatch item detail CSV file. I will name this dataset "dispatch CSV"; again, the first row is the header. All right, here we have our dataset loaded. So we have created these two datasets with the help of the linked service, and we are done with this part for now. The next part is the SQL database linked service and its metadata extraction, which we will cover in the next lab session.

18. Lab 6 9 Creating linked service for multiple data load to SQL DB: Hello and welcome to this lab session. In this lesson, we will look at the other linked service we need, the one for the SQL database. This linked service will fetch the table details from the database we created inside Azure SQL Database. Let's get started. Go to the Manage option and click on New to create a new linked service. Search for SQL Database and select it. I keep the default name for this linked service and select the subscription. The server name in my case is mysqldb-sales. Then select the database name, which is multiple-sales-record-db in my case. For the authentication type I select SQL authentication; type in your username and password, leave all the other settings as default, and test the connection. My connection is okay, so let's create the linked service. Our linked service for the SQL database is now created. Going back to the Author option, I open the Datasets section. Just take note that I have named the previous datasets "dispatch CSV" and "item details CSV"; these two are the actual datasets coming from Blob Storage. Now I will create a new dataset, and this one will come from our SQL database tables. Select the Azure SQL Database option and select the linked service; you will notice that it automatically fetches the table names. I will go with the dispatch items table first and name the dataset "Dispatch SQL table".
With this name it is easier for us to differentiate between the actual dataset from Blob Storage and the table dataset we are creating from the SQL database. Click OK, and the Dispatch SQL table dataset is created; here is the schema for the table, so it successfully fetched the table details for us. Now I quickly create one more dataset, this time for the item sales table, repeating the same process. All right, you can see that the schema is imported successfully for the other table as well. With reference to our flow diagram, we have completed these two parts. Next we need to create the copy activities and copy the data from storage into the tables we created; we will look at that part in the next lab session.

19. Lab 6 10 COPY Activity for multiple dataset load to SQL DB: Hello and welcome to this lab session. In this lesson, we create the copy activities that load the multiple datasets into our database tables. In Azure Data Factory, under Pipelines, select the pipeline we created. There are multiple activity options you can use inside Azure Data Factory; I will open the first group, select Copy Data, and drag it onto the workspace. We need two copy activities this time. I name the first one "copy dispatch records" and the second one "copy item sales records". In the first copy activity we provide the source and the sink datasets. The source dataset is the dispatch CSV file, and the sink dataset is the Dispatch SQL table. With the source and the sink selected, we now map them: open the Mapping tab and click on Import schemas. The mapping is perfectly fine in this case, as all the source fields are mapped onto the correct fields in the sink, so the copy activity for the dispatch records is configured. Next, I select the other copy activity and do the same thing: the source dataset is the item details CSV file, and in the Sink tab the dataset is the item sales SQL table. Then we open the mapping and import the schema. Here you can see there is an error on the item type; all other fields are mapped properly, but the item type from the source is not mapped onto the table column, so I select the drop-down for the source and pick the item type. Great, the fields now match between the source and the sink. Now that the copy activities are in our pipeline, the first thing to do is run a debug before we publish, so that any bugs or errors are identified prior to deployment. Click on Debug; the debug is running, so let's wait for it to finish. All right, the debug executed with success, so we can publish the pipeline. Click on the Publish option; it shows the changes you have made; click on Publish. Once the pipeline is published, all our data will be inside the database tables in Azure SQL Database. The pipeline is published, so now I go back to my SQL database and look at our data. Here we are, inside our database, and here are our two tables. Let's query them to see if we have the data. I will use the default query to select the top 100 rows in each table. Run the query, and here you go.
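The verification above uses the Query Editor's auto-generated query; written out against the two tables, it is simply something like this (table names as in the earlier sketch, so substitute your own if they differ):

-- Quick checks that the copy activities landed data in both tables.
SELECT TOP (100) * FROM dbo.DispatchItems;
SELECT TOP (100) * FROM dbo.ItemSales;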
We have all the data points in our database tables now. Let's also check the item sales table, and here we have the data for this one as well. So this is how you move data from Blob Storage to Azure SQL Database. Data analytics and SQL queries can now be applied to the dataset we have here, and we will run queries on it in Section 8. In our next lab session, we will look at a new example and learn how to apply transformations and filters in Azure Data Factory.

20. Lab 6 11 Upload movies dataset for ETL transformation: Hello and welcome to this lab session. In this lesson, we will work on the moviesDB dataset. This is the movies dataset in front of you; our goal is to transform it and apply filters to it. We have the movie IDs, the titles, the genres (you may call it the category of the movie), then the year of the movie, the respective ratings, and the Rotten Tomatoes values. What we will do is transform the dataset so that we get the comedy-genre movies for a range of years with their respective ratings. So we will upload it to Blob Storage and then bring it into Azure Data Factory to apply transformations and filters. Let's get started. Search for the storage account and the Blob Storage we created. I will create a new container and name it movie-dataset; the access level will be container-level access. Create it. Now I upload the moviesDB CSV file, leaving everything at default, and upload the file. All right, the file is uploaded. With this we will end this session; in our next lab session we will create a Data Factory data flow and transform this dataset as per our requirement.

21. Lab 6 12 Create ADF Dataflow and Load Movies dataset: Hello and welcome to this lab session. In this lesson, we create an Azure Data Factory data flow to transform the dataset. Select the Data Factory option and create a new Data Factory: select Azure subscription 1 and the same resource group, and I will name it transform-movies, leaving the rest at default. Next is the Git configuration, where we select Configure Git later; the network settings stay public. Leaving everything else as default, review it and create the Data Factory. The deployment completes; I go back to the resource and open Azure Data Factory Studio. Here, select the Author option. This time we will use a different approach to bring the data from Blob Storage into Azure Data Factory. In the Data flows tab, select New data flow, then click on the Add Source option. I name the source "movies csv file". In the dataset drop-down you will find there is no option available, so I click on New, select the Azure Blob Storage option, and click Continue. The format type is CSV (delimited text). Now we need a linked service, and as we don't have one in this factory, we have to create it: select Azure subscription 1 and the storage account blobstorage155, then test the connection. The connection is successful. Before creating it, I name it "Azure Blob Storage linked service", then create it. One more change: I name the dataset "movies dataset from blob". Now that we have this linked service, select it and browse for the moviesDB CSV file; here it is. The first row is the header, so tick that checkbox and click OK. We now have our dataset selected.
Next, I want to preview my data. To preview data you need to turn on the debug mode: flip the slider button at the top, and for the debug time to live choose one of the options; I will select four hours and click OK. It takes a couple of minutes for the debug session to start and make things ready, and once debug is on we can see a preview of the dataset. All right, the debug is turned on; you can also see a green mark on the Data preview tab. Let me refresh it, and here you go, we have the dataset available. So this is how you bring a dataset from Blob Storage into Data Factory with the help of a data flow. In our next lab session, we will look at transforming the dataset.

22. Lab 6 13 Apply Filter in ADF Data Flow: Hello and welcome to this lab session. In this lesson, we look at how to apply filters in a data flow in Azure Data Factory. Click on the plus sign after the source; this is where we will apply the filter, and you can see there are multiple transformation options here. What we are trying to do is separate the comedy-genre movies from the list and then, for the years 1910 to 2000, calculate the average rating of the comedy movies for each year. So I search for the Filter option and rename this filter to "filter movies"; the incoming stream for the filter is the movies CSV source. For the filter condition, select Open expression builder, and let's write the expression we want. I will use the toInteger function, because by default the values read from a CSV file are interpreted as strings even though they look like numbers, and toInteger takes care of the datatype conversion. I pass the year column as the first argument of toInteger; we don't need the format and options arguments, so we can skip them. Now let's apply the condition we want: the year must be greater than or equal to 1910 and less than or equal to 2000. Next we use the like function, which does string matching against a pattern; the string here is the genres column, and the genre we want to look for is Comedy. So this expression returns the rows whose year is in the range 1910 to 2000 and whose genre is Comedy. Click on Save and finish, then open Data preview and refresh it. We now have all the movies in the comedy genre with the year in the range 1910 to 2000. One part is done. Next we need to apply the average function to the movie ratings of each year: for example, for the year 1923 we see three movies in a row and then one more, and likewise there are multiple movies in each year. We will combine them per year and calculate the average rating; we will perform this average function in our next lab session.

23. Lab 6 14 Average Aggregation Function in ADF: Hello and welcome to this session. In this lesson, we look at how to perform an average in a data flow. Click on the plus button after the filter and select the Aggregate transformation. We have two settings here, Group by and Aggregates. First, I rename this step to "aggregate comedy movies", then under Group by I select the year column. Then for the Aggregates I select the Rating column; now we need to give it an expression, so open the Expression Builder.
I want the column name for this output to be "rating comedy movies". We need to calculate the average, so I write avg; we want the average of the ratings, but the rating is interpreted as a string, so we use the toInteger function again and pass the Rating column into it. That is our expression; let's save it. Now go to the Data preview, and here we go: we have the average rating of comedy movies for each year, for the years 1910 to 2000. This is how you compute an average in data flows in Azure Data Factory. The next important question is what you want to do with this transformed data. There can be many use cases; for this example, we will export the transformed data to Azure Blob Storage and save it there, so that whoever needs it can access it from there.

24. Lab 6 15 Sink Transformed Dataset to Blob Storage: Hello and welcome to this lab session. In this lesson, we look at how to save the transformed data to Blob Storage with the help of the data flow in Azure Data Factory; saving transformed data this way is also known as sinking the data. Let's get started. Before continuing, this output name is somewhat confusing, so I go to the aggregate settings and change the column name to "average comedy ratings". Go to the Data preview and refresh it, and we can see the column name is updated. Now let's move on to the sink. Click on the plus sign, search for the Sink option, and select it. I name this output stream "comedy rating sink". Then we have the sink type: it can be a dataset, inline data, or a cache. As we want a proper dataset format, I select the dataset option. Now it asks where you want to save it, so we create a new dataset stored in Blob Storage: click on New, select Azure Blob Storage, and click Continue. I want the output in CSV format, so I select the delimited text option and name it "comedy rating dataset"; the linked service I use is the Azure Blob Storage linked service, and I want the first row to be a header row. Next, give the path where you want to save the dataset; I will save it in the movie-dataset container. Import schema will be None, as we will go with the automatically generated schema for the sink file. Click OK, and now let's test the connection, so that we can be sure the data flow will work fine when we publish it in this Data Factory. Great, the connection test is successful. Let's preview the data; this is how our dataset will look once it lands in Blob Storage. One thing to note here: if I publish or run the debug right now, the output will be saved as several CSV files rather than one single CSV file. To change that, you need one important configuration in the Optimize tab: under partitioning, instead of Use current partitioning, select the Single partition option. Then go to Settings, and for the file name option select Output to single file; this is the name of the file that will be saved in Blob Storage. The complete output will now be saved as a single file named "average comedy ratings". All right, now we go to the Pipelines option and create a new pipeline. From the activities, drag in the Data flow activity, and in its settings select the data flow we just created; here it is, named Dataflow1.
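As an aside for readers who think in SQL, the filter and aggregate steps we just built in this data flow correspond roughly to the query below. The table name is hypothetical and the column names (year, genres, Rating) are assumptions based on how the movies dataset was described; the LIKE pattern mirrors the like() expression used in the filter.

-- SQL sketch of the same transformation: keep comedy movies from 1910-2000,
-- then average the rating per year.
SELECT
    [year],
    AVG(CAST(Rating AS INT)) AS AverageComedyRating
FROM dbo.MoviesDB            -- hypothetical table name for the movies dataset
WHERE [year] BETWEEN 1910 AND 2000
  AND genres LIKE '%Comedy%'
GROUP BY [year];

The data flow does the same thing visually, with toInteger() playing the role of the CAST.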
Back in the activity's General tab, I rename it to "comedy ratings". We are good to go with the other settings at their defaults, so let's run it in debug mode. The status is In progress for now. Okay, our debug run was successful. Click on the option here to view the run details: here are all the blocks of the data flow, and the last one is the sink. If you click on it, it shows a summary of the sink step, how long it took, that 83 rows were processed, and all the other details; in the same way you get diagnostics for every stage of the data flow. Moving back to the Author tab, since the debug run was successful, let's publish and deploy the pipeline. Once it is published, we go back to Blob Storage and check for the dataset file. All right, the pipeline is deployed; let's move to Blob Storage. Here we go, this is the average comedy ratings CSV file that we produced using the data flow in Azure Data Factory. Let's view it: these are all the values we have, the year and the average comedy rating for each year. So this is how you transform a dataset and apply aggregation to it with the help of data flows in Azure Data Factory. That is it for this section; I hope it has made your concepts on Azure Data Factory clear.

25. 07 1 Create Azure Synapse Workspace: Hello everyone and welcome to this section on Azure Synapse Analytics. In this section, we will look at the practical workings of Azure Synapse Analytics, a limitless analytics service that brings together enterprise data warehousing and big data analytics. Until now we have used the storage account, the SQL database, and the ETL process as separate services; the Synapse service lets you do all of that work on a single platform, and the good thing is that you can scale the storage and the database as per your requirements. Let's get started by creating our first Synapse Analytics workspace. In the search bar, search for Azure Synapse Analytics and click on the option. Currently we do not have any workspace, so click on the Create workspace button. First things first: select the Azure subscription, and we will use the same resource group we have been using until now; no managed resource group is required. Now you need to give a unique workspace name; I will name mine my-azure-synapse-sc1. Great, the name is available. For the region I will use the default one, or you can choose the one you want from the list. The important part is the Data Lake Storage selection: if you have already created one, you can select it from the list with the From subscription option enabled; I will go with the Create new account option and name it azuredatalake155. Click OK. Next we need a file system name, and I will name it root; click OK. Then click on Review and create and wait for the workspace settings to be validated. It gives you an estimated cost for the workspace: serverless SQL will cost us about $5 per TB of data processed. Below that are the details of our workspace: the name, the file system name, and the security settings showing the server login username and so on. Let's create the workspace and wait; it is being created and deployed, and it will take around six to eight minutes to be completely ready for use.
Once it is ready, we will perform the SQL warehousing and the ETL process just as we did previously, so that you have a comparative view of the difference when using the Synapse service as a single platform. We will continue from here in the next lab session.

26. 07 2 Overview of Azure Synapse Workspace: Hello everyone and welcome to this lab session. In this session we take an overview of the Azure Synapse Analytics dashboard. After you have created the workspace, as we did in the last session, you move on to the resource dashboard, and this is how it looks. The important options here are the analytics pools: Synapse Analytics has two kinds of pools, SQL and Apache Spark. The SQL pools are based on the Azure SQL engine, while the Spark pools are based on Apache Spark, the same engine behind the Azure Databricks service. Throughout this section we will be dealing with the SQL pools. Let's move on to Synapse Studio by clicking on the Open link. This is the main screen of your workspace, and here is the name of the workspace we created. You then have these options to quick-start in Synapse. First is the Ingest option; this interface lets you use the Copy Data tool, but as we are just taking an overview we will not proceed with it right now. Moving back, we have Explore and analyze as the second option, where you can explore your data and analyze it with quick queries. Then you have the Visualize option, where you can visualize your data in the form of tables and graphs. There is also a very useful area in the Discover more section, with in-depth guidance on how to use the platform, plus a feature showcase and templates contributed by the community. Let's move to the Knowledge Center. Here you have ready-made samples, for example to explore sample data with Spark, query data with SQL, or create a table with SQL. If you go to the Browse gallery option, this section contains publicly available datasets. Then there are notebook samples, mostly used with Spark, and you can see PySpark, Scala, and Spark SQL notebooks. Then we have SQL scripts that can be used with the pools in your Synapse workspace, such as samples to query CSV files, query JSON files, query Parquet files, and other sample scripts as well. Next we have the Pipelines option. These are the same pipelines we used in Azure Data Factory, but in Synapse they are integrated, so you don't have to create the storage and then move to a separate pipeline service; all of those steps live on a single platform. There are some interesting pipeline templates, such as copy from a database to Azure Data Explorer or bulk copy from a database, and many other ready-made pipelines. Let's move back to the home screen. Here is the Data tab, where you can see all your ingested datasets and the databases available in your pools. Next, in the Develop tab you create your development environment, and in the Integrate tab you will find all the pipelines you have created. You can also monitor all your available resources within the Synapse workspace, with activity tracking and status overviews. The Manage section is the important one; this is where we will create our SQL pools. By default there is a built-in serverless SQL pool already available; there is no Apache Spark pool by default, only the SQL pool.
Then there is support for external connections as well, where you can create linked services just like the ones you used in Data Factory. Azure Synapse supports Data Lake Storage Gen2 natively, but if you want data from Blob Storage you will have to use a linked service. There are also integration options for setting up triggers, for when you want to ingest data or run some triggered activity. All right, that is it for this session; this was just an overview of the workspace, and we will start using it in our next lab session.

27. 07 03 Upload Dataset in Azure Linked Storage and Perform SQL Query: Hello and welcome to this lab session. In this lesson, we learn how to upload a dataset to the linked storage in Azure Synapse Analytics and then run a quick SQL query on it. Click on the Data tab. Here you find two sections, Workspace and Linked. The Workspace section contains the databases you create in the Synapse pools; let me move to the Linked section. Here you find the Azure Data Lake Storage Gen2, and this is the same storage we created while creating the workspace. Let me expand it, and here is the root file system name we assigned to it. When you select it, it shows that there is nothing inside it yet, and you have the option to create folders, much as we created containers in the Blob Storage account. Let's create a new folder; I will name it dataset and create it. Move inside the folder; this is where I will upload the sales record dataset we have. Click on the Upload button and browse for the file. Previously we uploaded a smaller file; this time I go with the larger dataset that contains 50,000 data points. Click Upload, and after a moment the file is uploaded. Right-click on the file and open its properties. Here we have the dataset name and the URL. When you use the linked storage from inside your Synapse workspace, you need to use either this URL or the ABFSS path (short for Azure Blob File System) to access the contents of the folder; those are the two important things here, the URL and the ABFSS path. So that is how we upload the file. Now that the file is uploaded, let's run some SQL queries on it. Select the file, click New SQL script at the top, and it shows two options: Select TOP 100 rows and Bulk load. I choose the Select TOP 100 rows query. Staying on the same page, it opens up the query for you as well. This is the linkage I was talking about, which we just saw in the file properties: instead of giving a database name, the script points to the linked location of the file, because we are using the linked storage to access the file and run the query. Let's run it; it returns the result, and these are the data points in our dataset. So this is how you upload a dataset and run queries on it. In our next lab session, we will look at how to create SQL pools and use them for ETL and querying.

28. 07 4 Create SQL Pool in Azure Synapse: Hello everyone and welcome to this lab session. In this lesson, we look at how to create SQL pools in the Azure Synapse workspace. Move to the Manage tab; here is the option for the SQL pools.
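For reference, the auto-generated Select TOP 100 rows script from the previous step runs on the built-in serverless SQL pool and typically reads the file through OPENROWSET. The sketch below shows the general shape; the storage account, container, and file names are placeholders, and the script Synapse generates for you may differ in its options.

-- Sketch of a serverless SQL query over a file in the linked Data Lake storage.
SELECT TOP 100 *
FROM OPENROWSET(
        BULK 'https://azuredatalake155.dfs.core.windows.net/root/dataset/SalesRecords50K.csv',
        FORMAT = 'CSV',
        PARSER_VERSION = '2.0',
        HEADER_ROW = TRUE        -- treat the first row as column names
     ) AS [result];

The same file can also be referenced by its abfss:// path instead of the https URL.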
We currently have a built-in serverless SQL pool available to us, but let's see how to create a new dedicated one. Click on the New button to create a new pool. Here we give the basic details, and I will name it SQLPool1. For the performance level you can go up to 3000 Data Warehouse Units; the lowest value you can select is 100 Data Warehouse Units, and I will stick with that. You can also see the estimated cost per hour: the more you increase the performance level, the higher the price. Let's move to the additional settings. In the Data source option you have the flexibility to start with a blank SQL pool, restore from backup data at a location of your choice, or use a restore point; we will go with the blank (None) option. Then there is the Tags option, which we don't need, so let's move on to Review and create and click Create. The deployment is in progress; I will pause here and resume once it is completed. All right, our SQL pool is now ready to be used. Let's open SQLPool1 and look at some of its properties. You have multiple options for the connection strings, including the commonly known JDBC and ODBC connection methods; we leave those as they are. Then, in the configuration, you see the settings of our SQL pool, and the workspace details show which workspace is using this pool, under which subscription ID, and so on. So this SQL pool is now ready. Let's move to the Data tab. Here is the dataset file we previously uploaded to the linked storage; now I move to the Workspace section and refresh it. It shows that one workspace database is available; let's check it. Here is the SQLPool1 we just created, with all the default folders, which are empty at the moment because we selected a blank database at creation time. So this is how you create a SQL pool in Azure Synapse, and it also appears in the workspace Data tab. Next we will load a file into this pool and then run SQL queries.

29. 07 5 COPY Dataset from Data Lake to Workspace Database 01: Hello and welcome to this lab session. In this lesson, we look at how to copy data from the storage into the SQL pool we created, so that we can run SQL queries on it. Let's get started. Here you have the option for a new SQL script; I will go with the option for a new empty SQL script. I already have a script to create a table, so I will paste it here; it is the same script we used to create the sales record table inside the SQL database, and this time I will name the table SalesRecord50K. In addition to the table creation, we also have a WITH clause where further requirements are listed. Here we are asking for round-robin distribution: a round-robin distributed table spreads its rows evenly across all distributions, and the assignment of rows to distributions is random; unlike hash-distributed tables, rows with equal values are not guaranteed to be assigned to the same distribution. Then there is the clustered columnstore index, which is the physical storage for the entire table. All right, let's run this query. Here you go, the query executed successfully. Now let me refresh SQLPool1 and move into the Tables node: the table is created, and under Columns you will find the schema of the newly created table.
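A sketch of what such a dedicated-pool table definition looks like is below. The column list mirrors the earlier sales-record sketch and is an assumption based on the dataset's fields; the WITH clause carries the two options just described.

-- Dedicated SQL pool table with the distribution and storage options described above.
CREATE TABLE dbo.SalesRecord50K
(
    Region        VARCHAR(50),
    Country       VARCHAR(50),
    ItemType      VARCHAR(50),
    SalesChannel  VARCHAR(20),
    OrderPriority CHAR(1),
    OrderDate     DATE,
    OrderID       BIGINT,
    ShipDate      DATE,
    UnitsSold     INT,
    UnitPrice     DECIMAL(10, 2),
    UnitCost      DECIMAL(10, 2),
    TotalRevenue  DECIMAL(15, 2),
    TotalCost     DECIMAL(15, 2),
    TotalProfit   DECIMAL(15, 2)
)
WITH
(
    DISTRIBUTION = ROUND_ROBIN,   -- rows are spread evenly and randomly across distributions
    CLUSTERED COLUMNSTORE INDEX   -- the physical storage for the whole table
);

ROUND_ROBIN is a sensible default when there is no obvious distribution key; a hash distribution on a frequently joined column can perform better for large joins.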
Now that the table is created, let's run a basic query to select the data points in it; since the table is empty, it returns nothing, just the column names. And if you look at the count of values inside the table, you see it returns 0. Okay, so let's now bring in the data. There are two ways to do this. One we did previously in the Azure Data Factory part: create pipelines and use the copy activity, which can be done here in Azure Synapse in exactly the same way. The other way is the COPY statement in SQL; let's look at that second method. For this we write COPY INTO followed by the database table name. We want to copy into this table from a location that is inside the linked storage. Go to the linked storage, move into the root folder, find the URL of the uploaded file, copy that link, and back in the script paste it inside single quotes after FROM. Next we give a few more parameters: write WITH and open round brackets. We need to tell the script about the file format and the delimiter used, so the field terminator is a comma, and because we know the first row is the header, the actual data starts from row 2; let me show you in the preview, where the first row is the header and the data begins on row 2. Back in the script, we also specify the row terminator, which is 0x0A, the hex value for a newline. So the statement ends up as COPY INTO the table, FROM the file URL, WITH (FIELDTERMINATOR = ',', FIRSTROW = 2, ROWTERMINATOR = '0x0A'). This is the same thing we did in the Azure Data Factory pipelines with the GUI tools, only specified here in script form. Select it and run the query; it executed pretty fast, in four seconds. Now let's check whether we have the data in the table using a SELECT ALL query. Run it, and here you go: we get all the data points from our database table. So this is how you perform the copy using the SQL script method and query the database table in an Azure Synapse SQL pool. All other operations in Azure Synapse are the same as the ones you perform in Azure Data Factory; here you get more speed, and you can scale your database as per your needs. We will end this section here, and in the next section we start with the important SQL queries you can use to perform data analytics on any dataset in the real world.

30. Theory 8 1 What is SQL: Hi, and welcome to the section on SQL foundations for data analytics. In this lecture, we will look at the basic SQL commands and functions that are essential for performing analytics with SQL. SQL is a programming language used to work with data stored in a relational database management system. Its syntax is similar to the English language, which makes it relatively easy to read and interpret, and it is used with relational databases to access and manipulate the data. The American National Standards Institute made SQL a standard language in 1986, and the International Organization for Standardization declared it a standard in 1987. There are different versions of SQL, but all of them support a common set of features, which includes the SELECT, UPDATE, DELETE, INSERT, and WHERE statements.
SQL is used to run queries against a relational database: to retrieve data, insert or update an existing record, create or delete records within a database, create new tables, and generate tabulated views of the data. It can also manage permissions on tables and views and run different procedures. A relational database management system is a program that allows you to create, update, and administer a relational database, and RDBMSs use the SQL language to access the data. In our next lab session, we will see some of the basic SQL commands that can be used to manipulate or retrieve data.

31. Lab 8 1 Select All Statement: Hi, and welcome to this lab session. Here we will execute SQL queries on the sales record dataset we have already uploaded to the database and to Azure Synapse; we have previously seen how to bring your data into either the SQL database or the Azure Synapse SQL pool. Let's begin with some queries. Go to the Data tab; you will see the two sections here, Workspace and Linked. Select the Workspace section. Under it you will find the database for the SQLPool1 we created, and under that the Tables folder with our three database tables: the dispatch items, the item sales, and the sales record 50K. We will use these tables as samples to run our SQL queries. Let's start with the SELECT statement, which fetches all of the table's data into your output. Go to the Develop tab in the sidebar, click the Add option, choose the new SQL script option, and write the statement. We write SELECT followed by the asterisk; the asterisk symbol means you want to fetch all of the data in the table. Then we type FROM and give the name of the table, which in our case is dbo. followed by the table name. Right now, even if I give the table name, it won't fetch any results, because this SQL script is not yet connected to a pool; for that, go to the Connect to drop-down and select the SQLPool1 we are working on. With the pool selected, let's give the table name, which is sales record 50K in our case, and run the query. The script runs, and here you see that it gives you all of the data in the output, showing that it returned all the rows in the table. If you are keen on looking at every row, you can select the Fetch all rows option, which will give you all the rows of your dataset. Just viewing the whole table this way is not recommended, though, as it takes a lot of time and also eats into your free allowance. It does not affect your answers in any way: if you are wondering about aggregations or other numeric functions, the answer will always include all the data points in the calculation. That is all for this lab; in our next session we will see how to select a single column or specific values from the dataset.

32. Lab 8 2 Select Single or Multiple Column: Hello and welcome to this lab session. Previously we saw how to select all the values in a dataset; in this lecture we will see how to select only the specific columns you need out of the complete dataset.
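Before moving on, the full-table query from the previous lab, written out for reference (the table name follows the earlier sketches, so adjust it to match your own script):

-- Run against the dedicated pool selected in the "Connect to" drop-down.
SELECT * FROM dbo.SalesRecord50K;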
So for that, let's write the statement. We start with SELECT, and for the sales record dataset I am interested in the country; you can see the editor automatically suggests that this column is present in the sales record 50K table. The item type column is the other one I need. Next, we tell the statement where to select these columns from, which is the sales record 50K table; the editor suggests it automatically, so I click on it to select it. These are the two columns I'm interested in; let's run the query. Here are the results: we have the country and the item type, with the data points available in our dataset. One thing to notice is that on the right side we have two options, and currently the option for the first 5000 rows is selected. This means that out of 50,000 rows, only the first 5000 are displayed to us. If we select the All rows option, it brings all the data into the results section, but that is only recommended if you really need to see everything, as it takes longer and can cost you more. So this is how you write a SQL script to select one or several specific columns in the Azure Synapse SQL pool; the same applies in the SQL database as well.

33. Lab 8 3 DISTINCT Statement: Hello and welcome to this lab session. In this lab session, we look at how to see the distinct values available in a column of your dataset tables. If you look at the Region column, we have Europe, Sub-Saharan Africa, and Asia, then Central America, and so on; likewise, the sales channel column has Online and Offline spread throughout the table. So how do you get the distinct names of all the values present in a column, whether region, item type, country, or any other? Here I write SELECT DISTINCT, then the region column, and then FROM followed by our table. When we run this query, you can see we get all the different regions that exist in our dataset; count them and there are seven in total. Next, let's look at the sales channel as well, by putting the sales channel column in the DISTINCT statement. Run the query, and you can see we have only two different channel types, Online and Offline, in our complete dataset. This is how you perform the DISTINCT operation in SQL to see the distinct values in each column of your dataset. In our next lab session, we will see how to count the values in a dataset table, and also how to count the distinct values: here it was easy to see that there are seven distinct region names, but what if the values were numerous and not easily countable?

34. Lab 8 4 COUNT and COUNT(DISTINCT): Hello and welcome to this lab session. In this lab session, we look at the COUNT and COUNT(DISTINCT) statements in SQL, which count the values in a specified column. We have previously seen that looking at the distinct values for the regions gives seven different values, and we also know that the total dataset count is 50,000. Let's verify that with the help of the COUNT statement. We write SELECT COUNT, and let's say I want to count the total number of region values in my dataset, so I put the region column inside COUNT and select it from the sales record 50K table, which is my dataset table name. Let's run it.
As expected, it returns 50,000, since we have 50,000 rows in the dataset. Now let's say we want to see how many distinct region names exist in our dataset. For that, we write the keyword DISTINCT before the selected column name; this counts the distinct regions that exist in the dataset. Run it, and you can see it returns seven, because we have seven different region names across our complete dataset of 50,000 data points. There is one more method to count distinct values: we write SELECT and then the APPROX_COUNT_DISTINCT function, passing region as the argument, and select it from the sales record 50K table. Let me select this line and run it; you can see it gives the same answer, so you can perform the distinct count using either of the two methods. To make the idea a little clearer, let's check the count of distinct countries: I type in country instead of region, select the line, and run it. We get 185 as the answer, so there are 185 distinct country names in the complete dataset of 50,000. This is how you use the COUNT and COUNT(DISTINCT) statements in SQL to get the total count and the distinct count for any column in the dataset. In our next lab session, we will see the use of the WHERE statement in SQL.

35. Lab 8 5 WHERE Statement: Hello and welcome to this lab session. In this lab session, we see how to use the WHERE statement in SQL and where you would use it. Until now we have used some simple statements like COUNT, COUNT(DISTINCT), and SELECT. The WHERE statement is a condition check, which means that after the WHERE clause you write down the condition under which you want your query to be executed. If I only write the SELECT statement, it selects all 50,000 values from the sales record table; but when I add a WHERE clause with a condition, the query is only executed for the rows where the condition is met. Let's say we want to select everything in this dataset, but only where the country name is Russia; or maybe you are interested in a specific condition on the sales channel. How do we do that? Here I add the WHERE statement with the condition that the country name should be Russia, so I type WHERE country equals Russia. What happens now is that the command executes and only returns the output rows that contain Russia as the country name. Let's run it. We get the output, and if you analyze it carefully, we only get rows where the country is Russia; all the other countries' rows are not displayed, as they don't meet our condition. This is the power of the WHERE statement: you write down the conditional expression against which your main query is executed. Let's look at another example. This time I'm interested in the units sold; I want to see the output rows where the units sold are greater than 1000, so I write WHERE units sold is greater than 1000. Let's run it.
So it will quickly let me know the instances meeting the required condition in the dataset table. To achieve this, I will write down select count. So this statement will give me the total number of counts which meets this condition. Let's run it. Here is the result. And it says that out of 50 thousand rows, there are 44,925 rows that meet our specified condition. Let me quickly change this condition to 0. It should give us 50000. Great, Here it is. So this means that our dataset contains all the rows with a non-zero values. In the same way you can check for your required condition as well. In our next step session, we will look at how to perform the logical operators like AND, OR and NOT and SQL queries. 36. Lab 8 6 AND Statement: Hi and welcome to this lab session. In this lab session we will look at logical operator and how and where it is used in SQL statements. And operator is basically performed when you want to or more conditions to be true at the same time. And operator is used with various statements most of the times. Let me dive down the SQL query for you using the where statement with a condition that I want to see the online sales. And so this n here will satisfy the first condition as well as the second condition that I will state after this. The output result will only show the query results that meet both conditions. The next condition I want is the Europe region. For that, I will write down region equals Europe. After I run it. This will give me the result that it has a sales channel as online as well as the region as Europe. So you can see over here, we have all the outputs that contain the region as Europe, and the sales channel as online. In the same way. We can also select some specific columns with the same condition. Let's select region, country, and the order ID from our columns. And if we run this, it shows all the output with the condition that we provided in our SQL script. Now, let's say for this specific condition, we want to look at the total count. Like how many order IDs are there with this condition? I will remove these two from here and add the count statement in the script. This will give me the total count for the orders that were there with this condition. So we have 6,449 orders that were placed online from the Europe region. This is where you perform the n statements. In our next lab session, we will learn the or statement for the same criteria and compare this result. 37. Lab 8 7 OR Statement: Hello and thus come to this lab session. Unless lab session we will see the odd operator in SQL queries. And an operator is performed where you want Buddha conditions to meet. And an R operator is performed where you want either of the two conditions to be met. Here's an example, QT for the OR operator. This a statement basically says that select all from the seals to court database table for the condition where either the sales channel is online or the region is Europe. Which means that for any row, having any of the two conditions are true will be shown to us in the output. Let's run this duty. All right, so you can see over here that for this case you have the sales channel as online selected for us. And the reason is all where it is Europe. So what you see here, an output is for all of the Europe, the sales channel can be online as well as offline. And for all the online sales channel data points, the region can be Europe or any other region. 
Let's look at another query that uses the same OR condition but selects only the country, the sales channel, and the order ID. Run the query, and here is the result. You can see both Online and Offline in the result, but the country against each sales channel entry can be from the Europe region or any other. To get a better insight, let's look at the count of order IDs that satisfy this condition: there are 31,426 rows that meet it out of the 50,000 data points. So this is how you use the OR statement in SQL.

38. Lab 8 8 NOT Statement: Hello everyone and welcome to this lab session. In this lab session, we see how to perform the NOT operation. Let's say I'm interested in selecting the region, the country, and the sales channel, and let me also add the item type. When I run this, it gives me all the query results. Now I add a condition: WHERE NOT sales channel equals Offline. This means I'm asking for all the results other than the Offline sales channel, and since Online is the only other value in the sales channel column, the result will give me all the Online rows. Let's run it, and here we go: all the results contain the sales channel Online only. Let's try another statement where in the condition I write that the region equals Europe. With NOT, this means I will get all the regions in the output except Europe. Run it, and here you go: previously our first result was Europe, and now we get every region except Europe. This is how you perform the NOT operation; it can also be combined with the other logical statements.

39. Lab 8 9 UPDATE Statement: Hello and welcome to this lab session. In this lab session, we see how to use the UPDATE statement in SQL. It is one of the important statements and is used to update or modify your records. We already have data in our table, so let's see how the UPDATE statement works. Using the SQL knowledge we have gained so far, let's run this query first: it returns the row with order ID 897751939. I am fetching this specific row to show you first what data it contains; then we will make an update to the row and see the changes. You can see that this order ID has the region Sub-Saharan Africa, the country Namibia, and the other parameter values. Let's update the entry. For that we write UPDATE followed by the dataset table in which we want to update the values, then on the next line we write SET, and here I will set the country name to XYZ.
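Assembled from the dictation above, and using the order ID shown in the lab, the statement looks like this:

UPDATE dbo.SalesRecord50K
SET Country = 'XYZ'
WHERE OrderID = 897751939;   -- the WHERE clause limits the update to this single order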
If you want to update more fields, you type them also, like after the country I type region equals XYZ. So depending on your requirement, you can make changes accordingly. In our next lab session, we will see how to use the delete statement in SQL. 40. Lab 8 10 DELETE Statement: Hello and welcome to this lab session. In this lab session we will see how to delete a record from your dataset table using the SQL query. Previously we have looked at how to update a record, and here I will be deleting the same record. First, let's have a look at the count of the total dataset we have. You can see we have 50 thousand data points as of now. Next, let's use the delete statement. For that. We write delete from the name of the dataset table from where you want to delete any entry. And in the next line I will get the condition where I wanted to select my order ID. And then I will give the same order ID used previously to update it. Okay, So before I run it, let me show you the order ID that it exists in our record. Run it. And here is the record that we are going to delete. Now let's run this command to delete it. So select these lines and run it. The Curie executed successfully, and now this entry is deleted. Let's verify it. We can see that one record was fetched as we have deleted, disrespect, and disregard value now does not exist in our table. Let's verify it also by the count statement. Select count all from the record table. This must know return 49,999 reason being that one value is deleted out of 50 thousand. Let's run this query. So you see here we get 49,999 leader points. This is how you can delete a record value from the table in SQL. In our next lab session, we will see how to add a new record rod to your deal. 41. Lab 8 11 INSERT INTO: Hello and welcome to this lab session. In this lab session we will look at how to use the insert into statement in SQL. Previously, we have updated our record and then we looked at how to delete it. As a result, you can see that we have 4,999 values as of now. So I have seen that rho value with me and we'll be adding that draw in our dataset table. Again. Here it is. In this we are trying to insert into the table name, which are sales record 50 K. If you look at it carefully. This was over order ID which we deleted previously. And now I'm going to insert these values into this sales record dataset table. These are the values for each column. Also, if you see over here, we have written all these values to be inserted in the same order as we saw in the output result. So when the QT will do is map all the values in this order to the column cells and add these as a new entry to our dataset table. So our acuity has been executed successfully. Let's verify the execution. I will use the cout statement to see for the total count or fraud. So you see that we have a doorbell count back to 50000. Now let's go back to the Curie and search for that particular order ID that we have now inserted. This was the order ID in our case. Let's run this query. We can see that the output has been fetched and we have the details for this newly inserted or the IDB deals. So now we have the complete table again with us and we are good to go to perform some analytic urease on it. In our next lab session, we will look at joining functions in SQL to join two different data tables. 42. Lab 8 12 Why Join Datasets (Overview): This lecture will help you understand the importance of joint function and how we can actually use this function. 
Here are two different datasets. In any superstore there may be two or more departments. Let's say a supermarket X has two departments. One might be a dispatch department, who deal with the orders and their delivery to the addresses of a particular region. And the second one is the sales department, who look after the finances, the accounting, and the relevant data. Now both of these datasets contain 50,000 rows of values against multiple parameters. For the first one, which is the dispatch dataset, we have the upper-level details like the order ID, the units that were sold, and the order date, along with the region and country for that order. You can actually see that the details are upper level and have nothing to do with the cost per unit sold. We can see the accounting details in the second dataset, which might be handled by the sales department. Here we have details such as the sales channel, that is, whether the order was placed online or offline, and the priority for each order, all against the same order ID, which is also contained in the first dataset. It also contains the unit price, the unit cost, the total revenue, the total cost, and the total profit. This is the type of scenario where an organization may have two departments, among which one needs only the dispatch-related data and the other deals with the accounting involved with it. Let's say the boss of such a company wants to perform some analysis, like, for a particular item, what number of units were sold, what was the unit price, the total cost, and the total profit in different regions. Currently, all of this data is in two different files, and in order to analyze the scenario mentioned earlier, all of it must be in a single dataset file. This will help make the process smoother, and this is where we feel the importance of the join function. To join them, you first need to import the data. And after this is done, using the join function you can generate a single dataset combining these two. Here, the common value in both datasets is the order ID, using which the query will combine the two datasets. After this, we will perform some statistical analysis on it to look for details like what was the order date against each item, what was the total profit for each item, or some other details like which items people in which region are interested in. 43. Lab 8 13 SQL Joins: Hello and welcome to this lab session. In this lab session we will look at how to use the JOIN statement in SQL to join two dataset tables. Before we move on, you can see over here that inside the tables we have the main sales record 50K table that contains all the columns related to the complete sales figures. Then the two different tables we will work on here are the dispatch item table, where the columns are only related to the delivery details, which are also part of the main table, and the other table that we have is the item sales table. Here we have 10 columns, which represent the finance figures for the shipments, and these columns are also part of the main sales record dataset table. Now we will work with this dispatch item table and the item sales table; we will combine them to look like the complete table that we have here in the form of the sales record 50K. Let's see how this can be done. First, I will select all the rows that are inside the dispatch item table. So you can see we have these five columns in this table. One thing you need to notice is that we have the order ID in this dataset.
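The join we are about to build has roughly this shape, assuming the two tables are named DispatchItem and ItemSales and share an [Order ID] column (these identifiers are assumptions; match them to the actual table names in your database):

-- Merge the finance columns and the dispatch columns wherever the order IDs match
SELECT *
FROM ItemSales AS s
JOIN DispatchItem AS d
    ON s.[Order ID] = d.[Order ID];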
Next, I will run the same query, but this time for the item sales table. Here we have the same order ID column in common with the dispatch item table. So we will be combining these two tables with respect to this order ID. Whenever the order ID matches for both the tables, the data points for each of those rows will be merged and combined into a single dataset row, resulting in a complete dataset with the help of the JOIN statement. Here I have the SQL script to join the two tables. Here I am saying that we will select all from the item sales table and then join it with the dispatch item table, with a condition satisfying that the order ID of both the tables matches. So what it will do is compare the order ID of both the tables, and then it will merge the data from this table with this table and provide you the output. Let's run the query. So you can see over here that we have got all the columns which were in two separate tables into a single table. Here, if you notice, the query will only return you the first 5,000 rows. To get all rows, you need to click on the all rows option and then run the query. Now once the dataset is combined, you can click on the Export link here, and then you can export this joined dataset in the format of your choice. You have three formats: the CSV, the text, and the XML file. I will download it in the CSV format. Here it is downloaded. So this is how you can perform the join function using an SQL query. In our next lab session, we will look into the SQL aggregation functions. 44. Lab 8 14 SUM Function: Hello and welcome to this lab session. In this lab session, we will be looking at some of the aggregation functions used in SQL. So the aggregation functions include the count function, summation, average, standard deviation, variance, and all such functions. We have seen the count function in the beginning of this section. Now let's see one of its detailed examples. Here is the SQL query I have. It will create an output table for us with the value for our count of distinct item types. When we run this query, we get the count as 12. And you can see that we don't have a column name here. Let's give this column a name. This is done using the AS keyword, which gives an alias to the output column name. Let's give it the name item type distinct count and run the query. So now you can see that we get this column name here that we assigned. Now I have a query here with me. This will count the total distinct item types and label it as total item types. Then it will count the distinct regions we have, labeling it as total regions, and then count the total distinct sales channels in our dataset, labeling it as total sales channels. Let's run this query. So you can see over here that we get the output: in our complete dataset, we have 12 different item types. Then we have a total of seven different regions and two different sales channel mediums, which are online and offline. This is one use of the count function and where you can use it. Next, we have the aggregation function which is the sum. Here is a query: we are calculating the sum of all the values in the total profit column and labeling it as total profit value. Let's run it. So this is the total profit value, which is the result of the sum over the total profit values in the dataset table, and it is more than 19 billion. Now let's perform the sum with a conditional statement. So here we will calculate the sum of the total profit with the condition that the sales channel is online only. Let's run this query.
So you see we get the answer, around 9.8 billion, which is the total profit made online. Let's also check it for offline, and this is around 9.75 billion. This is how you perform the SUM function in SQL. As we have two sales channels, let's create a comparison table to see the figures for the total profit of online sales and offline sales. Alright, so we have the statement here which says: select the sum for the case when the sales channel is offline, then it will calculate the total profit sum, else it will return 0, and end the statement. The sum for this case will be labeled as offline profit. In the next line, we are calculating the sum for the case where the sales channel is online, else it will return 0, and end. This sum will be labeled as online profit. So our output will have two different columns for the online and offline profit total sums. Let's run this query for a quick comparison. So you can see here that we have the two values: the offline profit is around 9.74 billion and the online profit around 9.78 billion, which shows that online sales were a bit more profitable than the offline sales for the dataset we have. So this is how you perform the SUM aggregation function in SQL. In our next lab session, we will see how to perform the average function in SQL. 45. Lab 8 15 AVG Function: Hello and welcome to this lab session. In this lab session you will see how to perform the average function using SQL. An average basically tells about the central or typical value in the dataset we have. To perform the average, I will write a query over here. To calculate the average, we use the keyword AVG, and we will calculate the average of the total profit, labeling it as average profit, and we will calculate it from our sales record 50K dataset. Let's run this query. So we get the output here, which is this value over here. Now let's calculate the average with some specific condition. Here we have an SQL query for a conditional case where we will calculate the average of the total profit where the item type is only household. So I will name it as average profit for household items. This will give an average for the household items on execution. Here is the result. This is the total profit average for the household items sold. Let's perform some more exercises on this, and this time we will use this query. Here, what we are trying to do is calculate the average for two different cases, where the first case is the sales channel being offline. When the condition is met, it will calculate the average total profit value, else it will return 0, and end the statement, naming it as offline average profit. In the next statement, we will calculate the average total profit for the case where the sales channel is online. So we're not calculating these averages for the entire dataset, but only for the household items. Let's see which sales channel gives the maximum average profit for the household item type sold in our dataset. Let me run the query. So here we have the offline and the online average profit values. We can see that the maximum average profit is in the offline case. This is how we perform the average in SQL. In our next lab session, we will see how to perform the extreme value calculation and sorting in SQL. 46. Lab 8 16 Extreme Value Calculation and Sorting Data: Hello and welcome to this lab session. In this lab session we will look at how to perform the extreme value calculation for the dataset, and also we will see how we sort our data in an SQL query.
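Before moving to extreme values, here is a compact sketch of the conditional SUM and the conditional AVG queries from the last two labs, with the same assumed SalesRecord50K table and column names. Note that in the AVG version below the non-matching rows are left NULL rather than set to 0, so they do not pull the channel averages down:

-- Total profit split by sales channel in a single result row
SELECT
    SUM(CASE WHEN [Sales Channel] = 'Offline' THEN [Total Profit] ELSE 0 END) AS OfflineProfit,
    SUM(CASE WHEN [Sales Channel] = 'Online'  THEN [Total Profit] ELSE 0 END) AS OnlineProfit
FROM SalesRecord50K;

-- Average profit per sales channel, restricted to household items only
SELECT
    AVG(CASE WHEN [Sales Channel] = 'Offline' THEN [Total Profit] END) AS OfflineAvgProfit,
    AVG(CASE WHEN [Sales Channel] = 'Online'  THEN [Total Profit] END) AS OnlineAvgProfit
FROM SalesRecord50K
WHERE [Item Type] = 'Household';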
Extreme value calculation is basically the calculation of the maximum and minimum values in the numeric columns of our dataset. Suppose that I'm looking for the minimum order date value, which will be the starting value for the order date in our dataset. For that, let's write the query. So we write SELECT, then the function MIN, which is for minimum, then the order date, as we want to see the minimum value of the order date, and then FROM the sales record table. Let's run this query. So the minimum value is January 1st, 2010. This tells us that the dataset we have has values starting from the 1st of January of the year 2010. Let's also see the maximum value with the same query; for that, write MAX and run the query. So the maximum date is September 9th, 2016. In the same way, you can perform the minimum and maximum functions on different columns of this dataset. Let me perform another query here. In this query, we select the item type, the minimum value for units sold, and the maximum value for units sold from the sales record 50K dataset table, and we are grouping it by the item type. So basically this execution will show you an output having the list of maximum and minimum values for each item individually. Let's run the query. It can be seen that we have the columns with the item type and the minimum and maximum sold values. The minimum units sold for baby food was 1, and the maximum was 9,995, and then there are values for each individual item type. You can also rename the column names with the help of the alias keyword AS. Let me do it quickly so that the data becomes clearer for us to understand. So now we have this data with the labels, making it clear to understand. Okay, so now let's see the sorting in SQL. Here we have the query, where we select the same item type and the maximum and minimum for the total profit, grouping it by the item type. And in the next statement we are using a sorting clause, that is ORDER BY, which will order by the maximum profit for each item type in descending order. So it will list the higher values first, followed by the lower values. Let's run this query. Here we have the results, where we have the maximum values in the first column after item type, and then the minimum total profit values. So you can see over here that the maximum profit for cosmetics was this, and the minimum was this. Analyzing the data, we see that cosmetics have made a higher profit as compared to the other item types, and the fruits item has made less sales profit. Midway through the data we have baby food and cereals doing an average profit. Let's look into another query. We will see the region-wise units sold. So over here we select the region and the minimum and maximum values for the units sold, and then we group it by the region. And if we run this query, here is the output, and it can be seen as the region-wise units sold output. In Europe we have the minimum sold value as 1 and the maximum as 9,998, and so on for the different regions. Look at the Middle East and North Africa region; it has a value of 10,000 for the maximum units sold. So this is how you perform the extreme value calculations on the dataset and sort them in ascending or descending order. For the case of ascending, we write ASC instead of DESC. In our next lab session, we will look at the grouping and filtering in groups of the data in SQL. 47. Lab 8 17 Grouping and Filtering in Groups: Hello and welcome to this lab session. In this lab session, we will look at grouping and filtering in groups.
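Here is a sketch of the extreme-value and sorting query just described, together with the HAVING filter that this next lab will add on top of a grouping, again with the assumed SalesRecord50K table and column names:

-- Per-item extremes of total profit, highest maximum first (use ASC for ascending order)
SELECT [Item Type],
       MAX([Total Profit]) AS MaxProfit,
       MIN([Total Profit]) AS MinProfit
FROM SalesRecord50K
GROUP BY [Item Type]
ORDER BY MaxProfit DESC;

-- Row count per region, keeping only groups with more than 199 rows
SELECT Region, COUNT(*) AS TotalCount
FROM SalesRecord50K
GROUP BY Region
HAVING COUNT(*) > 199;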
We have already performed the GROUP BY function, but we will look at it in more detail. So let's see an SQL query over here, which selects the sales channel and the count of all the values, saving it with the sales channel total label, from the table sales record, and we will be grouping it by the sales channel. So basically this statement will perform a query for you and group all the counts with respect to the sales channel. Let's run this query. All right, so it gives us the output, grouping the total count by the sales channel. It shows us the total count of all the online and offline sales channels. So the online sales were 25,034 and the total sales offline were 24,966. So this actually grouped the total count by the sales channel. Let's see yet another query. In this query, we are going to count all as total count and select the region, and then group our output with respect to the region. So this will actually give us the total count for each region in our dataset. Let's run this query. So here we get the result, where we can see the total count for each of the regions mentioned here. This is how we do grouping. Let's perform another query. Now in this query, we are actually performing a filtering within the group itself. Up to now we haven't seen this HAVING statement. To perform the filtering inside the GROUP BY statement, we use the HAVING clause. So this query will actually count for you the totals per region as seen previously. In addition, this time it will only show the results where the total count is greater than 199. Once we run this query, you will not see the North America region in the output, because its value is equal to 199, which does not meet our condition. Let's run this. So you see over here that previously there were seven rows returned, and this time we have six rows returned because of the HAVING condition we added. Let's see another example for this. Here, we are interested in the item type and the sum of the total cost for each item type. As we have grouped here, we will get the sum of the total cost for each item type. Run the query. So now over here you can see we have twelve different rows here for the item type. These are the item types, and their total cost as a sum is in the other column. Now, let's say I want to see the item types which have a total cost sum greater than this value over here for beverages. So I will copy this value and write it here in a HAVING statement, where the sum of the total cost is greater than that value. Run this query. And now you can see that we don't have the beverages in our output values because of the filtering statement we have given here after the GROUP BY clause. So this is how you perform the grouping and filtering within groups in SQL. In our next section, we will begin with the visual analytics with Power BI, and we will try to visually perform the same queries we did here in SQL. 48. Lab 9 1 Create Event Hub Namespace: Hello everyone, and welcome to this lab section where we will focus on the real-time analytics service of Microsoft Azure, that is, Azure Stream Analytics. Azure Stream Analytics is a real-time analytics and complex event processing engine that is designed to analyze and process high volumes of fast streaming data from multiple sources simultaneously. Before defining the Stream Analytics job, you should prepare the input data. So the first thing we will do is create a namespace in Event Hubs.
For that, I will go to the Search tab here and look for Event Hubs, then click on this option here. Azure Event Hubs is a big data streaming platform and event ingestion service that can receive and process millions of events per second, and an Event Hubs namespace provides a unique scoping container in which you create one or more event hubs. Let's create our first namespace. The first thing is to select the subscription. Then we will select the resource group. This is our default resource group, so let's select it. Next, we need to give a namespace name. The name must be unique throughout Azure, and I will name it azure-ns-155. So this is our namespace name. And next we will select the pricing tier. Azure recommends us to use this one, the Standard tier, which has 20 consumer groups and 1,000 brokered connections. You can go for other options as per your requirement. Then we need to select the throughput units. The throughput capacity of Event Hubs is controlled by throughput units. Throughput units are pre-purchased units of capacity. A single throughput unit lets you ingress up to 1 MB per second or 1,000 events per second. So we will select its value as one for now, review this, and create the namespace. The validation has been successful. Let's create it. All right, so the deployment is completed as of now. Move back to the resource. So the namespace is created successfully. Next, we will create an event hub inside our namespace. Click on this plus sign with the Event Hub over here. Now name the event hub as per your choice; I will name it MyEventHub. So this name is available for us to use. Next, we have the options for the partition count and the message retention. I will select a value of two for the partition count, keeping the others as default. Let's create this event hub. All right, the next thing we need to do is go to the Entities tab and select the Event Hubs option here. This is the event hub that we just created. Once you come onto this option, go to the option which is Shared access policies, and here we will add a new policy and give it a name. I will name it MyPolicy. And then we will select the Manage option here, giving both Send and Listen access to the event hub using this policy. Now create this policy. The policy is successfully created. Now when you click on this option here, here are the keys: the primary key, the secondary key, and the connection strings. Let's copy the connection string for the primary key. Now paste it in any editor. The thing I want to show here is that in this string you have the endpoint link to this event hub namespace, the shared access key name, then the shared access key itself, and finally, the EntityPath. So we will be using it with the TelcoGenerator app. It will be used to connect a third-party app sending real-time data so that it can be collected by the event hub. We will see that in the upcoming labs. 49. Lab 9 2 Configure TelcoGenerator App: Hello and welcome to this lab session. In this lab session, you will learn how to configure the TelcoGenerator app provided by Microsoft Azure. This app generates the phone call data, which contains the fraudulent calls that will be filtered by the Stream Analytics job. If you go inside the folder, these are the files present inside the folder when you unzip it. This file, telcodatagen.exe.config, is the important one. All the configuration will be done inside this. Let me open it. Here is the EventHubName configuration key, and the Microsoft.ServiceBus.ConnectionString.
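For reference, an Event Hubs connection string that includes an entity path generally has the following shape; the placeholders stand in for your own namespace, policy, key, and event hub name, which will differ from this sketch:

Endpoint=sb://<namespace-name>.servicebus.windows.net/;SharedAccessKeyName=<policy-name>;SharedAccessKey=<access-key>;EntityPath=<event-hub-name>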
If you remember, previously we copied this connection string. We have here the values for the endpoint, the shared access key name, the shared access key, and the EntityPath. The first thing you need to do is copy the value of the EntityPath, go back to the configuration file, and paste it into the EventHubName key over here. Then the next thing is, excluding the EntityPath value, copy the rest of the connection string, and then, moving back to the configuration file, paste it over here in the Microsoft.ServiceBus.ConnectionString value. Now save this file, and we are good to go after this. We will go back to the folder here. Now we need to start the command prompt, and using the cd command, we will change the current working directory to the path of the folder. Press Enter, and now we are in the TelcoGenerator folder. Now we will run the TelcoGenerator app so that we get the data in our event hub. This is the command that you need to write in the command prompt. So this command says to start the TelcoGenerator app with these three arguments. The first one here is 1000, which is the number of samples that this app will generate per hour. The next value, 0.2, is the fraud probability, which means that about 20 percent of the call records will look fraudulent. Then the third value here is the number of hours, which means that the app will run for two hours. Press Enter to run it. Here you can see the values are initialized and the data generation is started. The next thing we will do is bring this data into Azure Stream Analytics and verify that we are getting the data there. I will explain what this data is and what more we can do with it. 50. Lab 9 3 Create Azure Stream Analytics Job: Hello and welcome to this lab session. In this lab session, you will learn how to create an Azure Stream Analytics job. This is done after you have created the Event Hubs namespace and the event hub. Now search here for Stream Analytics and go to this option. Currently we don't have any jobs as of now, so we'll go to the create Stream Analytics job option and create a new job workspace for our project. First things first, give it a job name; I will name it telco-data-fraud-detect, where detect is short for detection. So this name is available for us to use. The Azure subscription will be the same as the default one, then the resource group that we have been using till now. Then you have the option for the location; we will go with the default one. Then there is the option for the number of streaming units you want. Streaming units represent the computing resources that are allocated to execute a Stream Analytics job. The higher the number of SUs, the more CPU and memory resources are allocated to your job. We will keep it to a minimum for now. Let's create it. So the deployment is initializing and is in progress as of now. All right, so the deployment is completed and it is now ready for use. Let's move to this resource. And here is the dashboard for the Stream Analytics job; whatever activities are performed can be observed and monitored here. So this is how you create a job in Stream Analytics. In the next lab session, we will look at creating the input and output jobs in this Azure service. 51. Lab 9 4 Create Input Jobs: Hello and welcome to this lab session. In this lab session, we will learn how to create input jobs in Azure Stream Analytics. If you remember, previously we created this telco-data-fraud-detect job in Stream Analytics.
To perform any activity, you need to configure the input and the output job so that you have a plan as to where the data will come from in real time and what you want to do with that data. Here in the sidebar you can see that we have the Job topology tab and the options for inputs, functions, query, and outputs. We will go to the Inputs tab here. On the top you will find the option to add a stream input. When you click on this, you will find multiple options for bringing in the data. So you see that we can bring in the data from the Event Hub, which is our case, or from the IoT Hub, where you can interface some real-time IoT sensors, and you can also bring data from Azure storage. For our case, we will use the Event Hub option. Let's name it; I will name it call-stream. Then there are two options: either you can provide the Event Hub settings manually, or you can select the Event Hub from your subscription. I will go for the second option. Let's select the default subscription we have, then the Event Hub namespace. Next we have the Event Hub name; we only have the single event hub that we just created. Next is the option for the Event Hub consumer group. We will select Create new. Now this option here for the authentication mode is important. We have two options here, managed identity and connection string. We will work with the connection string. If you remember, we previously saw the connection string, and we used it in the configuration file for the TelcoGenerator app as well. Now we have the policy. The policy we will be using is the MyPolicy which we created. Leaving the rest of the settings as default, save it. So the input we created is being tested for connection, and now the input stream is created and you can see it here. Here you have the options to test the connection, delete the input job, and sample the input data; that is, you can look at what data you are receiving in this job. As of now, we won't be getting any data here as our TelcoGenerator app is not running. So this is how you create the input job. In our next lab session, we will look at creating the output job and then bringing in this real-time data. 52. Lab 9 5 Create Output Job: Hello and welcome to this lab session. In this lab session, you will learn how to configure an output job. Right now, I am in the Outputs tab. If you go to the Add option here, you will find many options where you can save your output for the input stream data. You can send it to an event hub, the SQL database, Blob storage or Table storage, or you can visualize it in the Power BI tool, and there are many other options that you can use based on your requirements. For our case, we will save our data into the Blob storage; that is, the data that is coming in real time from the event hub will be saved in the Blob storage. Before we configure the output job, we first need to create a container in the blob storage where we want our data to be loaded. So let me move to the Blob Storage service. Here is the blob storage account we have been using so far. I will go in here, and inside the containers, we will create a new one. Click on the new container option, and let's name it call-stream-data. And let me change the private access level to container level. No advanced settings are required, so click on Create. Now our container is created successfully. If I show you this, it is empty as of now. Now moving back to the output job, select the Add option, and there select the Blob storage option. Let's name the output job call-data-blob.
We are using subscription one and the storage account for the Blob storage. Then we have the Container option, and here you can see it automatically fetches the container name from our storage. Then we need to set the authentication mode, and we will select the connection string authentication mode. Then this is the option to select in what format you want your data to be stored; I will select the comma-separated values option. Then the encoding will be UTF-8. If you want to limit the data to some minimum rows or you want data for a specific time frame, you can do this here. So these are all of your settings; let's save this job now. It tests the connection and saves it. Our output job is successfully created. Now that we have created the input and output jobs, you might be wondering whether, when we run the app, we will be able to see the data. The answer is no, because we need to create a query or a function that will be responsible for performing the activity you want. We will be looking at it in our next lab session. 53. Lab 9 6 Realtime Data Upload to Blob Storage: Hello everyone and welcome to this lab session. In this lab session, you will learn how to bring the real-time data from any source to the Blob storage that we created previously. So as of now we have created the input as well as the output jobs, and we're going to be bringing the data to the container in the output job we created previously. Let's move to the TelcoGenerator app. These are the same arguments we were using: we need 1,000 samples per hour with a fraud probability of 0.2, and we want to run the app for two hours. Press Enter to run it. So it has initialized the arguments and the data here is being generated. You can see it generates the call data, UK, China, and all that. Moving back to our Azure Stream Analytics job, here you will find the option for the query, and over here you select the Input preview option. It will refresh and try to bring in the data that is being received with the help of the input job, that is, the call-stream we created. All this data you see here is just for preview and is not being saved anywhere as of now. So the main question here is, how do we bring in this data and save it to the Blob storage? Let's see how to do this. The test results are empty as of now. To save data to Blob storage, we need to write a query. If we want all the data, whatever is coming into this input job, we write an SQL script: SELECT star, meaning everything, FROM the input job, which is call-stream. If I quickly run only this query, you can see that we have all the data coming in live, and this is the app from where we are getting this data. Now I will write the query to save the data to the container. Right now in the container, you can see we don't have any data, even when I refresh it. So we will select all, and we will push all the data inside this. We want it to be pushed into the call-data-blob output, and we want it to come from our call-stream input. So this is our query. Let's test it. It works, and now save this query. Now we will go back to the Overview tab. Don't worry about the red warning; it is because of some previous jobs that I ran. Over here, click on the Start button. It shows the details that will run on execution of the job. So the service is now starting. Now, since it's running, if I go to the container and refresh it, you can see that it starts to get the data here. You can see the timestamp here, and this is the current time, and the data is being updated in real time.
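For reference, the query we just saved is roughly the following, using the input and output aliases configured in the earlier labs (if you named your jobs differently, use your own aliases):

SELECT *
INTO [call-data-blob]
FROM [call-stream]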
If I refresh it, it will update the time again and again, because it is a continuous stream of data coming in all the time while this app here is running. Also, in the Stream Analytics job, you can see that the status here is Running. When the data stops coming in, or you stop the job here, the data in the storage will stop updating. Let me stop this job here, and then, moving back to the container, let's refresh it. So now you can see that the data has stopped coming in; each time I refresh it, there is no update. Let's have a look at the data that we have. Click on Download here. So here is the data. This is the same data that we generated with the help of the TelcoGenerator app. So this is how you can perform the complete activity on real-time streaming data: creating the input and output jobs, and then writing a query to perform whatever you want to do with the data. For this learning example, we stored it in the Blob storage. You can do much more with the dataset, like some analytics with the help of SQL or in the form of visuals using the Power BI tool. I hope this has cleared your concept of the overall working of the Azure Stream Analytics service. 54. 10 1 What You Will Learn: Hello everyone and welcome to this lab section. Here you will learn how to perform analytics on any dataset visually using the Microsoft Power BI tool. You will learn, module by module, how to create a dashboard like this yourself by the end of this section. We will be moving step by step through the creation of graphs and how to visualize them to perform analytics on your dataset. So we have the maps here on the dashboard, the bar charts in both vertical and horizontal fashion, then you have the stacked bar chart, the clustered bar chart, and a line chart, and how you create them and generate them in a report format will be explained in much detail. I will also provide the complete Power BI file in the resource section so you can use it for your learning reference. 55. 10 2 Import Dataset from Blob Storage: Hello and welcome to this lab session. In this lab session, we will learn how to import a dataset into Power BI from Azure Blob Storage. For that, we will go to the storage accounts. Over here we have the two storage accounts that we created; we will select the blob storage. And inside this blob storage, we have the containers option. We previously created these two folders. Let's move into this folder. You will see that we have these two dataset files. We will move these two files into Power BI from this storage. First, I will select the movies dataset and go to the Properties option. I will copy this URL here, and move back to Power BI. Select the Get Data option here, then select the More option. These are all the sources from where you can bring in the data. Here we have the File option, where you can get the data from local files of all these four main types. Then we have the database options, where you can bring in the data from all these database sources. Right now we will go to the Azure option, and in here you have the option for Azure Blob Storage. Click on the Connect button. Now it asks for the Azure Blob Storage details. You can either type in the account name or the URL of your blob storage. I will paste the copied link here. You can see it gives an exclamation mark, as it is the dataset link, so I will remove the dataset name from this link, and now the option to proceed is enabled. Click on OK. All right, so here you can see that we have these two datasets over here.
Now, why did the data appear directly over here without asking for any authentication? This is because while creating this blob storage, we made it publicly accessible. This means that anyone with the link can access the blob storage and its content. This is the option here: if you go to the blob storage and then to the Networking option here, the option to allow access from is set to all networks. This means that the storage is accessible from all networks over the internet. If you select the selected networks option, then you have the options to add the networks that you want to allow. The next thing we need to do is to load these datasets into Power BI. Click on the Load button. It will create the model and load the datasets from the Azure Blob Storage. And here we go, these are our two datasets loaded. This is how you bring in the data from Blob storage. In our next lab session, we'll look at how to import data from SQL databases in Azure. 56. 10 3 Import Dataset from SQL Database in Azure: Hello and welcome to this lab session. In this lab session, we will learn how to import a dataset from a SQL database to Power BI. So we will integrate the database that we created previously inside the Azure portal, and we will import the dataset from the database to Power BI. Click on this option here. It asks you for the SQL Server database details, so you need to enter the server name and the database, which is optional. You will get this information from the Azure portal. Go to the SQL databases option. Here is our created database. Inside the Overview tab on the dashboard, you will find the option for the server name; copy it to the clipboard. Moving back to Power BI, we will paste the server name here and click on OK. It directly shows the files that are present inside the database. When you use it for the first time, it asks you for the server username and password; as I have already saved those credentials beforehand, it directly shows the database content. This is the sales record file, which is also present inside the database here in Azure. Once you click on the sales record dataset, it shows a preview of the content inside the dataset. You can either load or transform the dataset; I will go with the Load option as of now. So the dataset is being loaded as of now. It fetches the data from the Azure portal database, and you can see that the number of rows it displays matches the dataset in the database. All right, so our sales data is now loaded, and here we have all the fields of the sales record dataset. So this is how you import a dataset from a SQL database in Power BI. In our next lab session, we will learn how to import data from Excel files on a local PC, and then we will start with our visual analysis on the datasets. 57. 10 4 Import Dataset from Excel File: Hello and welcome to this lab session. In this lab session, we will learn how to import a dataset from Excel into Power BI. For this, we select the option to import from Excel. Click on this option and then browse for the file location. We will select the All files option, as our dataset file is in CSV format. Select the sales record file and continue. Before we load the dataset, it shows a preview of the file you want to import into Power BI. Let's load this dataset. All right, so the dataset is now loaded. Here are all the fields of this dataset. The sigma sign over here shows the fields that are numeric and on which the aggregation functions can be applied. And these are all the visualizations that can be performed on the dataset.
We will cover most of them in this section. Let's explore the card visual to see visuals of different columns in the dataset. Whatever column you drop here will be displayed on the card. Let's say we want to look at the total cost. If I select this, it shows you the total cost in the card, that is, 46.66 billion. Likewise, we can also display some other fields as well. Let's look at the total profit. So this figure here, 19.53 billion, is the total profit. You can also visually perform the aggregation functions and all such queries that we performed previously in the SQL section. We will look at each of them in our upcoming lab sessions. 58. 10 5 Cards in Power BI: Hello everyone and welcome to this lab session. In this lab session we will learn about card visualization in Power BI. At times, a single number is the most important thing you want to track in your Power BI dashboard or report, such as total sales, market share year over year, or total opportunities. This type of visualization is called a card. This is where we left off previously. If we click on this card, you can see this field option here. This drop-down menu will show you the options to perform the available aggregation functions on the selected fields. Right now we have the total profit. You can also rename the field by selecting this option. Let me rename it to total profit for sales. So the name also gets changed here in the card as well. Then we can perform all these aggregation functions. For example, instead of sum here, you can select the average function, so the value here in the card also gets changed to the average. Let's see the minimum value that we have for the total profit; it is 7.23. And if you look at the maximum value, it turns out to be 1.74 million. Then we have the count distinct, but this won't apply here because we use it when we have multiple text values or dates in our dataset. Moving on, we have the count option. This gives us 50,000 because of the fact that we have 50,000 data points in our dataset. And then in the list we have the standard deviation, variance, and the median. You can also select the percentage value as well. So these are all the functions that we also performed in SQL queries. Let's explore some more options, and this time I'll select the item type. You can see here that once you click on this, it shows you the first name in the list for the item type. To change it, you need to go to the drop-down menu here and select whatever you want to see. For example, if I select the last item, it shows me the last item in the complete dataset for the item type. And if I select count, it shows 50,000 because we have 50,000 data points. On selecting the count distinct option, it shows 12. This is the same answer we already saw in the SQL query section as well; we have 12 different item types in our dataset. Let's have a look at some other columns. This time I will select the region field, and it shows the first region in the list. Similarly, on selecting the last option, it shows the last region name, and the count distinct shows that we have seven different regions in our dataset. We performed the same queries for the country column as well in SQL, so let me do the same in cards for the country. This is the country here. From the drop-down, I will select count distinct. And here you go, we have 185 countries in our complete dataset. Cards can be used to display summarized results in your dashboard.
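For comparison, the card values above correspond to queries like the following from the SQL section (again assuming a table named SalesRecord50K; the identifiers may differ in your own database):

SELECT COUNT(DISTINCT [Item Type]) AS ItemTypes,   -- 12 in this dataset
       COUNT(DISTINCT Region)      AS Regions,     -- 7
       COUNT(DISTINCT Country)     AS Countries,   -- 185
       SUM([Total Profit])         AS TotalProfit
FROM SalesRecord50K;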
So this is how you generate cards in Power BI along with the different aggregation functions. In our next lab session, we will be looking at creating the bar charts and column charts in Power BI. 59. 10 6 Bar Charts and Column Charts: Hello and welcome to this lab session. In this lab session, we will learn about bar charts and column charts in Power BI. We will go through the stacked bar chart, the stacked column chart, the clustered bar chart, and the clustered column chart. We will also look at the percentage stacked and clustered charts. Let me add a new sheet over here. You can find the Power BI file that I'm working on in the resource section. I will name it charts. I will select the stacked bar chart from the visualization block. A stacked bar chart, also known as a stacked bar graph, is a graph that is used to break down and compare parts of a whole. Each bar represents a whole, and segments in the bar represent different parts or categories of that whole. Here, I'm interested in visualizing the region, and then I want to see the item type. You will see that we haven't got any graph yet. Now, we need to give it the value parameter, that is, what values you want to visualize. I'm interested in units sold. I will drag it and bring it into the values field here. Once I drag it, you can see that we get the graph. Let me zoom in a bit. This shows the item type units sold in each of the regions. All the different item types that we have are represented by different colors. With respect to the region, it shows the units sold status for each item type. For example, for our first example, we have the Sub-Saharan Africa region, and under it we see that the cereal item type units sold were this much; this is represented by the color orange. When you hover the mouse over different parts of the graph, it shows the calculated value and its detail, and then for the other regions as well; for example, for the Australia and Oceania region, this was the units sold status. And if we quickly do a comparison, in Sub-Saharan Africa the units sold were comparatively more. This graph gives a horizontal view of the parameters you want to visualize in a stacked bar fashion. In the same way, we have the vertical stacked column chart. I will do the same visualization with this graph type. I will select the region, the item type, and the units sold. Let me stretch it to have a good view. And you will observe that it is the same as what we saw previously, just that the graph is now in a vertical fashion. There is not much difference; you just have the view format, either in a vertical or a horizontal pattern. Let's now have a look at the clustered bar chart. I will perform the same visualization for each graph type, so I will select the region, the item type, and the units sold. So you can see over here that in this graph format, we have the same analysis, but instead of everything in a single bar, we have different bars for each item type grouped together for the different regions. However, if you notice, the color representations of the item types are the same. The moment you hover the mouse over each different bar, you can see the details for each one of them. If I hover on the baby food, you will see the same value for this in each of the graphs. The type of graph only changes the way you want your visual report to look. Similar to the horizontal clustered bar chart, we have the vertical clustered column chart. I will move the same columns into the visual fields. So you see we have the same thing again, but the only change is the vertical layout.
Now, let's look at the percentage stacked bar chart. In a 100 percent stacked bar chart, the cumulative height of the columns or bars equals 100 percent, and the segments display the relative size of the categories. Unlike the conventional stacked bar chart, the 100 percent stacked bars do not have different heights and do not present any absolute values. Let's select the region, the item type, the units sold, and the unit cost. You can see that it shows bars of the same height, but the proportion of each colored stack depends on the relative values of each item. I hover the cursor on snacks in the first region, and it shows the percentage value: of the total units sold in this region, the snacks item type value is 8.42 percent of the total value. Likewise, you can have an overview of each of the values. In the same way, we have the vertical version of this one. It gives the same result but in a vertical layout. With this, we will end our lecture on the bar and column charts. In our next lab session, we will learn how to generate the line and area charts. 60. 10 7 Line and Area Charts: Hello and welcome to this lab session. In this lab session, we will learn about line and area charts in Power BI. A line chart is a series of data points that are represented by dots and connected by straight lines. It may have one or many lines, and it has an x and a y axis. The area chart is based on the line chart; the area between the axis and the line is filled with colors to indicate volume. Area charts emphasize the magnitude of change over time and can be used to draw attention to the total value across a trend. Let's start with the line chart. You get the line chart by simply clicking on the line chart button here. For this chart, I'm interested in looking at the items that were sold with respect to the ship date. Select the ship date; you will see the ship date here in the axis field. Then select the units sold. So you see it shows the year-wise line chart for the units sold. If you look at the units sold in 2010, these were more than 30 million, and if we look at 2016, there were more than 33 million. Likewise, for 2017 the units sold were more than 20 million. Comparatively, we can say that in 2016 more units were sold. Let's suppose you want to compare two different parameters in the same line chart. For that, you need to select the other parameter you want to add in. You can select the unit price as well, or the unit cost, the total revenue, or any other parameter. For the current parameter, we selected the units sold. You can also set any aggregation function too; let's select the average function, so the values change to the average. Now this gives us the average value of the overall units sold per year. And if you look at the minimum values, the line graph now changes to the minimum number of units sold each year. The option depends on you, and I will stick with the sum function as of now. Then in the axis field, where we have the ship date selected, you can actually have different views too. Let's say we want to have a quarter view, so I will delete the year parameter, and now you can see that our axis changes to quarters. I will select the ship date again as a whole. Let's move on to the area chart. I will select the ship date and the units sold. Now you will notice the same graph again, but this time the area from the x-axis to the line curve is colored. This graph is very useful to compare different values.
For example, when I select the unit price, it shows the new line with another highlighted color this time. In the same way as we did previously for the line chart, you can select different aggregation functions. Let's select the minimum value, and here you can see that the graph shows the minimum values for each year. Let's have a look at the stacked area chart. A stacked area chart is an extension of the basic area chart. It displays the evolution of the values of several groups on the same graphic. Let's add the ship date, and after that, I will add the units sold. This is similar to the area chart, but when you add the unit cost, you will notice that the new graph lies on top of the previous one in the form of a stack. When you start adding other parameters, all the new ones will lie on top of the previous ones. Let's have a look at another example, and this time I will add the ship date, the units sold, and the sales channel. Now it shows the online and offline sales statistics for each year as we hover the cursor over the graph. For 2016, these were the online and offline sales statistics. In the same way, you can perform more statistics on the different parameters available in the dataset. All right, so this is it for the line and area charts in Power BI. In our next lab session, we will look at line and column charts in Power BI. 61. 10 8 Line and Column Chart: Hello and welcome to this lab session. In this lab session, we will look at the line and stacked column chart and the line and clustered column chart. The line and stacked column chart is a combo chart that combines the line chart and the column chart together in one visual. By combining these two visuals together, you can make a very quick comparison between two sets of measures. The main benefit of this type of chart is that you can have one or two y-axes. So let's begin with the line and stacked column chart. For this, I will first select the item type, then I will select the sales channel, and then the units sold. Now you can see over here that this is something similar to the stacked column chart we saw previously, where we have the online and offline sales values for each item type. The benefit of this one is that we can add a line chart in this as well. Here, whatever you add in this field will look like a line chart. You can bring in any of the available parameters you want from the available columns. I'm interested in the total revenue. I will drag this here, and there you go, we have the line graph added in this chart. It shows the total revenue for each item type in each bar. In the same way, you can add visuals for the line and clustered column chart. I will add the same parameters for this one as well. And here we have the chart for this one, having clustered bars this time. So this is how we create visuals for the line and stacked column chart and the line and clustered column chart in Power BI. In our next lab session, we will look at waterfall and funnel charts. 62. 10 9 Waterfall and Funnel Charts: Hello and welcome to this lab session. In this lab session we will look at the waterfall and the funnel charts in Power BI. So from here you get the option for the waterfall chart. A waterfall chart is a form of data visualization that helps in understanding the cumulative effect of sequentially introduced positive or negative values. For example, we will look at how much every region contributed toward the total cost. I will select the region here and then select the total cost.
Here is the graph, and it shows that the Sub-Saharan Africa region leads toward the total cost as compared to the other regions in the list we have. Let's dive into more detail. This time, I'm interested in looking at the breakdown of each region's contribution according to the sales channel. So for this, I will add the sales channel to the graph. On adding this, it shows that, for example, for the Europe region, the online sales total cost was this amount, and it also shows a quick comparison with the next one, that is Asia. Then we have the offline contribution as well. Once you remove the breakdown, you will see the actual waterfall graph. In the same way, you can also see the total units sold and other parameters as per your requirement. We can also perform different aggregation functions as well. So this is how the waterfall graph works. Let's have a look at the funnel charts. A funnel chart is a specialized chart type that demonstrates the flow of users through a business or sales process. The chart takes its name from its shape, which starts from a broad head and ends in a narrow neck. I will select the total cost. Once you select it, it shows the total cost for the entire dataset, which is 46.66 billion. Let's say we want to have a look at the breakdown of the total cost according to the sales channels we have. So if you select the sales channel here, you can see the total cost that was online and the total cost that was offline. The total cost offline is 99.8 percent of the online total cost. Let's now select the units sold. So it shows the total units sold online and offline. Both look the same, but if you hover over them, then you can see the exact values, and the offline total units sold were 99.3 percent of the total units sold online. Let's see the region-wise units sold. Now you can see the total units sold for each region, where the regions are arranged in descending order; the Sub-Saharan Africa region leads here as compared to the other regions. In the same way, let's select the total profit this time. So here are the values for the profit by region. These are also arranged in descending order, where we can see that the Sub-Saharan Africa region made more profit, and the North America region is the last in the list. Let's look at the funnel chart for the total profit made by each item. It shows that the cosmetics item type made more profit as compared to the other items. We also did this analysis in SQL queries. Now, if we select the country, you can see that the funnel chart extends so much that it's hard to visualize. But if you hover over the graph, you can see the different country names and their total profits. This does not add much value to the graph, so it's better to select regions for this funnel chart. You can select any of the available columns to visualize in the funnel chart. So this is how you create waterfall and funnel charts in Power BI. In our next lab session, we'll look at pie and donut charts. 63. 10 10 Pie and Donut Chart: Hello and welcome to this session. In this lab session, we will look at pie and donut charts in Power BI. Let's start with a pie chart first. A pie chart is a circular statistical graphic which is divided into slices to illustrate numerical proportion. For this pie chart, I will select the region and the total units sold for each one of them. Here is the pie chart, and it shows all the details for each region and the total units sold in each one of them. Again, the Sub-Saharan Africa region is the one which has the most number of units sold.
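The rankings shown by these funnel and pie visuals match what the grouping queries from the SQL section return. As a minimal sketch, assuming the same SalesRecord50K table and column names as before:

-- Total profit per item type, largest first (cosmetics leads in this dataset)
SELECT [Item Type], SUM([Total Profit]) AS TotalProfit
FROM SalesRecord50K
GROUP BY [Item Type]
ORDER BY TotalProfit DESC;

-- Total units sold per region, largest first
SELECT Region, SUM([Units Sold]) AS TotalUnitsSold
FROM SalesRecord50K
GROUP BY Region
ORDER BY TotalUnitsSold DESC;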
In the same way, you can also add other parameters, like the total profit. Here is the graph for this one. And in addition to this, you can also add two parameters together, like the total profit and the total cost. Let's have a look at some more parameters. This time, I will select the sales channel and the total profit. It can be seen that 9.78 billion profit was made online, and then you have the value for offline as well. You can perform this pie chart visual analysis on any of the parameters from the dataset as per your requirement. So this is how the pie chart works. The same analysis can be performed in the donut chart. A donut chart is essentially a pie chart with an area of the center cut out. Donut charts are more space-efficient than pie charts because the blank space inside the donut chart can be used to display information inside it. Similar to the pie chart, I will select the region and the total revenue. This gives us a nice and clean look to the visuals for the total revenue generated by each region. Let's then look at the total profit by each item type in our list. And this is what the graph says: the household items made a profit of 3.4 billion, and cosmetics leads the total profit with a value of 3.64 billion. These are all the same results that we previously got from the SQL queries as well. You can perform the same SQL queries and get the same answers visually as well. This all depends on what type of analysis you are looking for. So this is how we use the donut chart and pie chart in Power BI. In our next lab session, we will look at tables. 64. 10 11 Tables in Power BI: Hello and welcome to this lab session. In this lab session, you will learn about tables in Power BI. This is the option to select a table visual; select this option. For this, I'm interested in looking at the item type count for each of the regions in our dataset. Here I select the region, and you can see that we get all the regions that we have; these are the seven regions in the dataset displayed over here. Next, I will go and select the item type. Here you will notice that we get the item type for all the regions, and that too for all the entries we have in our dataset, that is, for all the 50,000 data points. We are interested in looking at the total count for each item type in each region. For that, we will go to the item type value field, and from the drop-down menu, select the count. And now you can see that we have the count of item types for each of the regions. All of these accumulate to 50,000. So for the Asia region we have a total item type count equal to 7,348. Likewise, we also have different counts for each of the regions. In the same way, let's look at the online and offline sales as well. This time, it shows a breakdown view for each region having sales offline and online. This means that each region has both types of sales, that is, online and offline. We will explore this particular case in the next video, which will be on the matrix table. Let's now select the count for the same case. This gives a combined view, and we cannot tell how many of the total count were offline and how many of them were online. This can be seen in the matrix table and graphs as well. You can add more fields in the table as well, like adding the total profit, the total cost, the revenue, and all those things that you want to be part of your table. You can also add headings to the table in this sheet by inserting a text box, as you do in other Microsoft tools.
All right, so this is it for this lab session. I hope you have got an understanding of generating tables in Power BI. In our next lab session, we will learn about the matrix table in Power BI.

65. 10 12 Matrix Tables: Hello and welcome to this lab session. In this lesson, we will learn how to use the matrix table in Power BI. A matrix table is a three-dimensional table, with rows, columns, and values, that gives a quick insight into the values you select for the columns and the rows. This is the option here to select the matrix table. I will select the item types, so over here we get all the values we have in the item types. Next, I will select the sales channel as well, so we have the sales channel here in the top row. Let's say I'm interested in looking at the individual values of total profit for each of the item types listed here, for offline and online sales. For that, I will go here and select the total profit column. Now you can see that it shows the total profit values for the online and offline sales channels for all the item types we have. This is how useful the matrix table is for analysis.

In the same way, you can also perform the analysis for some other parameters. Maybe you are not interested in the sales channel and you want to see the region-wise total profit. Then drag the region to the columns field over here, and now you can see that we have the profit values for each of the item types in each region. In this way, you can quickly get an insight into which item made more profit and in which region. Now let me quickly add another matrix table for some more analysis. This time, I'm interested in looking at the total sales count for online and offline sales. For this, I will select the item type and then the sales channel. Now, in the values field, I will add the sales channel, and inside the drop-down I will select the count. This basically gives you the count of how many sales for each item were done online or offline. You will see that the total accumulates to 50,000, which is the total number of records in our dataset. Let's explore some more information from the dataset. I will select the region this time, then the sales channel, and then the total profit. This gives you the total profit for each of the regions, online and offline. So this is how we use the matrix table in Power BI to perform the analytics. In our next lab session, we will look at the decomposition tree in Power BI.
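To relate the matrix back to the SQL labs, the breakdown of total profit by item type and sales channel is essentially a pivot, which can be sketched with conditional aggregation. This again assumes the hypothetical dbo.SalesRecords table, a [Total Profit] column, and [Sales Channel] values of 'Online' and 'Offline'.

    -- Hypothetical names and channel values; a pivot-style summary of profit.
    SELECT
        [Item Type],
        SUM(CASE WHEN [Sales Channel] = 'Online' THEN [Total Profit] ELSE 0 END) AS OnlineProfit,
        SUM(CASE WHEN [Sales Channel] = 'Offline' THEN [Total Profit] ELSE 0 END) AS OfflineProfit
    FROM dbo.SalesRecords
    GROUP BY [Item Type];

Each output row corresponds to one row of the matrix visual, with one column per sales channel.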
66. 10 13 Decomposition Tree: Hello everyone and welcome to this lab session. In this lesson, we will learn how decomposition tree charts are generated in Power BI. This is the option here for the decomposition tree, and here we have an empty template for it. A decomposition tree gives a tree-architecture view of the parameters that you select for it. I'm interested in the total profit parameter. It shows a bar-like visual and nothing else, only the value for the total profit. Next, I will add the region. You can see this plus sign over here; when you click on it, it gives the option to explore the regions, and when you do, it gives a breakdown of the total profit for each region. Now let's say we also want to have a look at the sales channel breakdown, so let me add it. Here are the additional plus signs for each region, and when we click on one, we find the sales channel option. This gives us a further breakdown view of the offline sales profit and the online sales profit. Clicking the cross button here, you can close it. You can also go the other way around. Let's say this time I only want to see the total profit sales channel wise. For that, I will click on the plus sign here and then select the sales channel option. Now it shows you directly the online and offline total profit values. This is the power of a decomposition tree.

Now here comes the bonus part: filters. You can apply filters either on the current page that you are working on or on all the pages in your Power BI file. First, let's have a look at filters on this page. I will apply the filter on the item type, so drag it to the filters field. Now, over here, instead of selecting all, I will first deselect all and only select the baby food. You will observe that the values change with respect to baby food, because now it will only show us the baby food related calculations. In the same way, if I deselect this and select the meat, it shows that this was the offline value and this value here was the online sales. A similar approach can be followed for any of the item types you need to analyze. The same thing can also be explored for region-wise analysis. For example, if I select the region here and then select the cereal item type, the computation changes and it shows the view for each region in descending order. Filters can likewise be applied to other fields; for example, selecting the order priority will allow you to choose the value to filter on from the priority column, and whatever option you select, the values change accordingly. Filters can also be applied on all the pages; in that case, whatever fields you filter will apply on all the pages instead of only the current one you are working on.

With this, I hope you have got a clear understanding of all the visualization options that we have explored throughout this section, and I hope you have learned a lot during this course on Microsoft Azure. I will leave the rest for you to explore yourself. If you have any issues, do let me know in the comments section for the course, and our instructors will do their best to help you. Take care and all the best.