Becoming a Cloud Expert - Microsoft Azure IaaS - Level 2 | Idan Gabrieli | Skillshare


Becoming a Cloud Expert - Microsoft Azure IaaS - Level 2

Idan Gabrieli, Pre-sales Manager | Cloud and AI Expert


Lessons in This Class

38 Lessons (3h 32m)
    • 1. Course Promo Level 2 (2:14)
    • 2. Section 1 - Welcome (3:44)
    • 3. Course Objectives and Structure (4:04)
    • 4. Why (6:03)
    • 5. What (5:13)
    • 6. How (3:27)
    • 7. Section 2 - Overview (1:15)
    • 8. Data Sources (5:56)
    • 9. Metrics and Logs (5:05)
    • 10. Data Type #1 - Azure Subscription-Level Activity Log (4:24)
    • 11. Data Type #2 - Azure Resources Diagnostics Logs (1:36)
    • 12. Data Type #3 - Azure Resources Metrics (3:43)
    • 13. Data Type #4 - Guest OS Metrics and Logs (2:02)
    • 14. Data Type #5 - Applications Metrics and Logs (2:18)
    • 15. Section 03 - Overview (4:43)
    • 16. Step 1 - Back End MySQL DB Server (14:03)
    • 17. Step 2 - Front End Apache Web Server (11:15)
    • 18. Step 3 - Log Analytics Workspace (2:51)
    • 19. Step 4 - Diagnostics Settings (4:23)
    • 20. Step 5 - Azure Monitor for VMs (2:59)
    • 21. Step 6 - Application Insights (7:25)
    • 22. Section 4 - Overview (4:24)
    • 23. The Concept of Alerts and Actions (4:56)
    • 24. How to Create an Alert Rule (9:16)
    • 25. Configure Our Rules and Actions (12:02)
    • 26. Simulating a Load on the Web Server (4:46)
    • 27. Managing Alerts (5:28)
    • 28. Section 5 - Overview (3:01)
    • 29. Azure Global Status (3:46)
    • 30. Personalized Service Health (9:28)
    • 31. Section 6 - Overview (2:26)
    • 32. Review the Status, Health and Activity Logs (3:17)
    • 33. Metrics Explorer (5:12)
    • 34. Log Analytics (15:13)
    • 35. Azure Monitor for VMs (9:38)
    • 36. Azure Advisor (1:51)
    • 37. Application Insights (9:45)
    • 38. A Quick Recap (8:31)

39 Students · -- Projects

About This Class


Microsoft Azure

Microsoft Azure is one of the leading cloud providers (together with Amazon AWS and Google Cloud), with a global infrastructure for delivering public cloud services around the world. Cloud computing is one of the biggest and fastest-moving technology revolutions in the IT industry, and global demand for people skilled in cloud computing is growing rapidly across multiple industries.

Becoming a Cloud Expert

If you are looking to become a cloud expert, this training program is designed to help you build your knowledge and experience in cloud computing using the Azure cloud platform. The training program is divided into levels.

Level 2 is all about monitoring...

In Level 1, we learned how to build an end-to-end cloud solution on the Microsoft Azure platform. Now it is time to learn how to monitor that IaaS solution effectively. In Level 2, we will learn how to monitor and analyze the performance and health of our Azure resources and applications, as well as the Azure platform and services. One thing is for sure: it is going to be interesting!

Join us and start paving your way to becoming a Cloud Expert!

Meet Your Teacher


Idan Gabrieli

Pre-sales Manager | Cloud and AI Expert


Class Ratings

Expectations Met?
  • Exceeded!
    0%
  • Yes
    0%
  • Somewhat
    0%
  • Not really
    0%
Reviews Archive

In October 2018, we updated our review system to improve the way we collect feedback. Below are the reviews written before that update.


Transcripts

1. Course Promo Level 2: Hi, and welcome to my training program, Becoming a Cloud Expert. I guess you already know that cloud computing is a big thing in the IT industry, and the market demand for cloud experts is growing rapidly: experts that can navigate between the available options, design complex solutions, deploy production systems, and then monitor and optimize them. The Becoming a Cloud Expert training program is designed to help you achieve this goal by dividing the needed knowledge into levels. In Level 1 we learned how to create resources in Azure. This is Level 2, and our main focus will be to make you an expert in monitoring and analyzing the performance and health of your cloud solution. We are going to build a demo system in Microsoft Azure, configure everything that is needed to collect telemetry data, meaning metrics and logs, and then learn how to monitor and analyze the data using a variety of tools. My name is Idan, and I have been teaching online courses for several years now. You are more than welcome to check out my other cloud computing courses: Getting Started with Cloud Computing can be a good starting point, as can the first level of this training program, Becoming a Cloud Expert - Microsoft Azure IaaS - Level 1. If you are ready, then I'm ready, so let's get started.

2. Section 1 - Welcome: Hi and welcome, and thanks for joining. My name is Idan and I will be your teacher. As you probably know, this course is part of a training program called Becoming a Cloud Expert. The training program is divided into levels, and it focuses on building knowledge and expertise around cloud computing while using the Microsoft Azure platform. As a quick reminder, in Level 1 we covered many topics related to Microsoft Azure as a cloud service: how to create virtual networks, set up security rules, allocate and manage virtual machines, allocate storage capacity, limit access to specific resources, and much more. I recommend reviewing Level 1 before learning the next levels; Level 1 is the foundation for this training program. This level, Level 2, is all about effectively monitoring the performance and health of our cloud resources. We learned how to create resources in Level 1, and now it is important to understand how to monitor those resources using the comprehensive capabilities that are part of the Azure platform. As always, I have a few simple recommendations before we dive in. First, don't forget that I'm here for questions. If something is not clear or you want more explanation on a topic, please drop me a question using the course dashboard; I think it's a great option that is sometimes overlooked by many students. Second, the topics covered in this course are very practical, and like many practical things, the best way to really get them is by trying them yourself, so at the end of each section I will provide some guidelines to create something similar on your own. Third, this course was recorded in high definition, so please make sure your video player is set to 1080p resolution; otherwise the video will not be sharp and clear. If you are not getting the 1080p option, then most probably the Internet bandwidth you are using is not strong enough; maybe try to run the course from a different location. Finally, if you find a mistake somewhere,
it would be really great if you could update me using the course dashboard. Just post a quick comment about the problem with the section number and lecture number. Regarding the time frame, try to complete the course in a reasonable time, like one to two weeks or even less; otherwise you will lose the excitement and momentum of learning something new. So make a commitment to learn the complete course end to end, and lastly, practice and play with the Azure platform; this is the only way to really understand the material at a much deeper level. That's all my recommendations for now; let's review the course objectives.

3. Course Objectives and Structure: So what are we going to learn, and how are we going to learn it? In Section 1 we will start with a high-level overview of the why, what and how. Why: why do we need to monitor a cloud-based solution? Is it a business requirement? And in case we do need to monitor something, what exactly should we monitor? There are multiple layers that can generate data, and each one of those layers can be monitored. Lastly, we will talk about the how: how we are going to monitor our cloud-based solution. There is a dedicated tool called Azure Monitor, and we will learn what kind of features and options are available using this tool. For monitoring something, we need data: telemetry data about the behavior of the monitored system. So our first step will be to review the different types of data that can be collected, and also the list of data sources from which such data can be collected: data types like performance metrics, activity logs, diagnostics logs, guest OS logs and application logs. This can be a slightly confusing topic, so we will have a complete section to cover those data types without losing the big picture. This training is supposed to be very practical, and to achieve this goal I need a demo system, an interesting system to demonstrate the topics. So in Section 3 I would like to show you how I'm creating such a system from scratch; this is what I call the playground. In addition, it is important to understand the steps to set up the needed configuration to be able to monitor and collect data: what features we need to enable and what resources should be created. At the end of this section we will have an end-to-end cloud system that is fully configured to collect the needed telemetry data. Automation is a critical part of any monitoring system: when we have a large number of resources, it becomes a difficult and less effective task to monitor everything manually. So in Section 4 we will learn the concepts of alert rules and action groups, enabling us to add an automation layer to our monitoring solution. This will also be the foundation for an advanced feature called automatic scaling, something I would like to cover in a future level. In Sections 5 and 6 we will review the different tools to visualize and analyze the collected data, including checking the global status of Azure and of the specific cloud services we are using, and then investigating issues related to our cloud resources. We can visualize data using reports and dashboards, perform manual queries on logs, combine multiple data types in interactive reports, and more. We will do all of that while using the playground demo system we created in Section 3. This is the structure and also the main learning objectives of this level, Level 2.
It is going to be very interesting, with many things to learn and try. I'm excited to start, and before moving forward I would like to wish you an interesting journey. Go grab some coffee and let's start.

4. Why: The first question I would like to talk about is: why should we monitor a cloud-based solution? If we, as customers, allocate resources from a public cloud provider like Microsoft Azure or any other provider, shouldn't that be their job? They should monitor my cloud resources, right? Well, that really depends on the service model we are using. It is basically a question of responsibility: if we are responsible for a specific layer, then we should monitor that layer. As a quick reminder, there are several service models in the cloud market: Infrastructure as a Service, Platform as a Service, Function as a Service and Software as a Service. While using Software as a Service, we are not responsible for anything; the whole cloud computing stack is under the responsibility of the service provider. I think that's clear. For example, as a user I don't care how Google is monitoring the Gmail SaaS solution; I just want to be able to access my account any day, any hour. Next we have Platform as a Service, meaning the whole IT infrastructure is provided as a platform by the public cloud provider. We can deploy applications without handling the needed servers: we are basically responsible for the upper application layer, and the public provider is going to monitor all the underlying IT infrastructure. Function as a Service is similar to Platform as a Service in the context of monitoring: we use functions provided by the cloud provider, and those functions are monitored by the cloud provider. And the last one, Infrastructure as a Service, is the topic of our training program, Becoming a Cloud Expert - Microsoft Azure IaaS. In this model we have full control of the allocated resources, and we also have full responsibility to monitor the health and performance of those resources. If you manage to master Infrastructure as a Service, then everything else will be like walking in the park slowly while eating ice cream: easy and fun. If we are using Infrastructure as a Service to create a solution inside a cloud provider, then we need to pay attention to the health and performance of that solution, and the good news is that this is the main topic of this course. The second question I would like to raise is: what is the business value of monitoring a cloud-based solution? If this is our responsibility under the Infrastructure as a Service model, then why should we do it? If you have a little bit of background in IT, and I guess you have, this question will be easy for you. When we create a resource in a cloud environment, it is allocated with specific parameters, a specific capacity. As an example, when we allocate a virtual machine we must select a specific virtual machine size: the amount of virtual CPU, virtual memory, maximum traffic bandwidth, etc. At the point of creating that virtual machine, we estimated that the workload created by the application would not exceed the virtual machine's capacity. But in a real scenario this is a very difficult task; many applications generate dynamic workload patterns, especially web-based applications connected to the Internet.
The easy solution would be to allocate a very large virtual machine that can handle such peaks and increasing demand. However, from a business perspective it is not cost-effective and not practical: more capacity translates to more cost. Now back to our topic: to better optimize the needed capacity and keep the cost at the right level, we must monitor our resources. Each resource can generate performance metrics that can be used to analyze what is going on, and we can use this information to adjust capacity as needed. In addition, there is another important issue on the business side. Many applications running in the cloud are offered with certain SLAs, service level agreements. How can we ensure that our applications are running as expected, according to the business requirements? And if something is not working, how fast can we find the root cause and troubleshoot the issue? On the Internet today, downtime translates to losing money. This is the job of a monitoring solution; it is a critical element in any production system.

5. What: We talked about the why part, why we should monitor a cloud-based solution, and I guess this is clear. I would like to move on and talk about the what: what layers can be monitored inside a cloud-based solution? If a cloud-based solution is like a black box, we would like to open it and look inside. This box is built from several layers, and the key word here is layers. Monitoring is all about getting data from multiple layers to create a complete, end-to-end picture of what is going on; a problem in one layer can impact upper layers. Now, what kind of layers am I talking about? Let's review them one by one, starting from the upper layer and moving down. Probably the most interesting layer is the application. If I'm running a website, a web application, then I would like to know things like the experience from a user perspective: for example, what is the average page loading time? What are the top five locations worldwide accessing my site over the last month? Monitoring the application layer is useful for two main groups of users: the admin users responsible for ensuring that the production system is running smoothly, and web developers, who can use such information to test and troubleshoot the application during the development cycle. This becomes even more critical when moving to a DevOps methodology. Applications run inside an operating system like Linux or Windows, and the health of the operating system is critical to the application; that makes sense. In that case, data will be collected about the operating system on which our application is running; it is also called the virtual machine guest operating system. The application and the operating system we just mentioned are basically pieces of code, and we need to install them on a virtual machine. A virtual machine is an Azure resource. Microsoft Azure resources are basically any resources allocated in our solution: any building block that we decided to use in our solution can be monitored somehow, to some level: a virtual network, a network interface, a network security group, a virtual disk and more. All of them are resources, and we need some level of visibility into those resources. If a virtual disk is running out of space, that's a problem. If a network security group is dropping legitimate traffic, that's a problem.
Basically, this is telemetry data about the internal operation of Azure resources. Now, in some cases, the issues we see in our resources can be related to problems in Azure services, something that can impact several customers, not just our specific system. So we should be able to monitor the Azure platform in the context of our subscription: the health of the Azure services in our subscription that our applications and resources rely on. If we are using computing resources in the US region, it is important to know whether the computing services in that location are experiencing some issue, or maybe are under some planned maintenance. The context of our subscription is relevant; I don't care about issues that are not affecting my system. In addition, at the subscription level, we would like to know about any configuration change made to our Azure resources. If someone in our team changed the size of a virtual machine, we would like to know the details of that action: who performed it, when it was done, and finally its status. To summarize, when we analyze or troubleshoot an issue with our cloud solution, we may need data from multiple layers. Each layer is a data source for a different kind of information, and together they provide us with end-to-end monitoring visibility.

6. How: The next question, the last question, is how: how should we monitor a cloud-based system in Azure? Well, there is a dedicated service called Azure Monitor. It is the main touch point when it comes to monitoring something in Azure. It is used to monitor any layer that we mentioned in the last lecture, and we can monitor our cloud-based solution as well as on-premises environments outside the Azure cloud. Let's review the Azure Monitor architecture at a high level. On the left side we have the data sources coming from multiple layers, like the application, the operating system, Azure resources, etc. Those data sources generate telemetry data, which is divided into two main types: metrics and logs. We will get into more details about the data types and data sources in the next section, Section 2. Those metrics and logs are stored in Azure Monitor in dedicated data stores; this is what you see in the middle. The next part is what we can do with the collected data, and the good news is that we can do a lot. Let's review the options one by one, again at a high level. First of all, visualize and analyze the data using dashboards, reports and log queries with analytics tools; there are multiple tools we can use, and we will review them at a deeper level in Sections 5 and 6. Next is automation. This is a very important thing we can do with the collected data: by setting automation rules, performance patterns can be identified automatically and, in some cases, even trigger a variety of automatic actions. For example, an automated action can scale out resources, like adding another virtual machine to a cluster of virtual machines working in a load-balancing configuration. And the last one is streaming data to external systems; this option is used in case we would like to integrate Azure Monitor with other systems while sharing the collected data. That's the Azure Monitor architecture; we will use this tool a lot in the coming lectures.
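As a quick, hedged taste of that scale-out idea: autoscale in Azure applies to resources such as virtual machine scale sets rather than the standalone VMs we build later in this course, and the scale-set, rule and group names below are hypothetical. The flag syntax is standard Azure CLI, but worth verifying against your CLI version.

  # Hypothetical scale set; autoscale itself is covered in a future level.
  az monitor autoscale create \
    --resource-group my-web-app \
    --resource my-web-vmss \
    --resource-type Microsoft.Compute/virtualMachineScaleSets \
    --name web-autoscale \
    --min-count 2 --max-count 5 --count 2

  # Add one instance whenever average CPU stays above 70% for 5 minutes.
  az monitor autoscale rule create \
    --resource-group my-web-app \
    --autoscale-name web-autoscale \
    --condition "Percentage CPU > 70 avg 5m" \
    --scale out 1

A matching scale-in rule (for example, scale in by 1 when CPU drops below 30%) is usually added as well, so capacity also shrinks when demand drops.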
As a quick summary of this section: we started with the why, why we need to monitor a system running in a cloud environment, and of course it is clear that we should do it when we are using the Infrastructure as a Service model. Then we opened the black box of a cloud system and saw the layers that can be monitored inside. And the last step was to understand that we are going to do all the monitoring activities using the Azure Monitor service we just talked about in this lecture. In the next section we will review, in greater detail, the data types and data sources for collecting telemetry data.

7. Section 2 - Overview: Hi and welcome back. We are about to start Section 2. In the previous section we reviewed at a high level the why, what and how: why we should monitor a cloud-based solution, what kind of layers we can and should monitor, and also how we can do it using the Azure Monitor tool. It was a high-level introduction. In this section I would like to drill down and zoom in on a specific topic, which is the telemetry data. I would like to talk about the variety of data types as well as the relevant data sources. Each data type is used for a particular reason, and it can be generated from multiple data sources that are part of the cloud solution. I think it is important to fully understand this topic before running into the details of how we actually configure telemetry data collection and how we visualize and analyze such data.

8. Data Sources: Monitoring is all about collecting telemetry data from the different resources used by our cloud-based solution. When I say telemetry data, I mean that typically many monitoring solutions are built around the collection of metrics and logs. Metrics are used to monitor the performance of resources, and logs are used to record all kinds of events that happened on those resources. The collection of metrics and logs should be done for multiple data sources related to multiple layers. In Section 1 we already talked about those layers, and I would like to quickly review them once again; this time I will present them in a different order, starting from the lowest level and moving up. The first, underlying layer is the Azure platform. The Azure platform is the cloud environment where we create all kinds of resources: computing, networking and storage. Those resources are usually grouped into a variety of cloud services. Whether we like it or not, in some cases the issue we see in a specific resource can be related to a problem in Azure services; those services can experience downtime, performance degradation or security issues in a specific location. In that case, we should be able to monitor the Azure platform in the context of our subscription: what is the health of the Azure services in our subscription that our resources are using right now? If we are using, for example, computing resources in the US East region, it is important to know whether the computing services in that location are experiencing an issue. The context of our subscription is relevant. In addition, at the subscription level, we would like to know about any configuration change made to our Azure resources. If someone in our team created a new virtual network, we would like the ability to analyze those actions: who performed the action, when it was done, and the status of those actions.
It is important to also mention that the telemetry data about this layer is collected automatically by Azure, so in that case we don't need to set up anything. Azure resources are basically any resources allocated in our solution; any building block that we decided to use can be monitored somehow, to some level: a virtual machine, a virtual network, a specific network interface, a network security group and more. This is telemetry data about the internal operation of Azure resources, what is going on inside, and most of the performance metrics we will analyze are related to this layer. The next layer is the operating system. Putting aside the concept of containers, which I would like to present in a future level, an application usually runs inside a specific operating system like Linux or Windows. In that case, data can be collected about the operating system on which the application is running; it is also called the virtual machine guest operating system. And the upper layer is applications, and there are endless examples: commercial off-the-shelf applications, open-source applications and fully customized applications. Usually a complex system is built from multiple applications that perform different functions. For example, an application for a website can use a cluster of front-end web servers running Apache, multiple database servers using Oracle or MySQL, a Hadoop HDFS cluster, maybe also a backup server, etc. In that case, it will be useful to monitor the front-end web servers and measure things like the page loading time, or monitor the database servers and collect metrics about the number of queries running over the last hour. It would also be useful to see and analyze the network connections between applications or modules running on different virtual machines that are combined into one system. This is the application layer.

9. Metrics and Logs: Before we run into the different data types that can be collected in Azure, let's make sure we understand the concepts of metrics and logs. So, what is a metric? A metric is basically a numerical value that describes some aspect of a system at a particular point in time. For example, a metric can be the memory utilization percentage on a virtual machine measured one hour ago. Performance monitoring as a concept is the collection of metrics at predefined intervals from multiple resources and layers. For example, if we take a virtual machine as a system, it is based on resources like CPU and memory capacity, and a metric can be the free disk space on a virtual disk, the memory utilization on a virtual machine, or the number of access requests to a web application. Metrics are usually stored in a structured data format, like a table in Excel: we have rows and columns. Each row represents a point in time at which the system sampled some value. The columns represent several fields with information about the metric, like the timestamp, meaning the date and time at which the metric value was measured or collected, the name of the metric of course, the namespace (a group of several metrics), all kinds of dimension keys, etc. All those fields are used to identify the resource on which the metric was sampled. Metrics are usually aggregated into higher time intervals, for example from a one-minute collection granularity into a five-minute frequency, then from five minutes to 15 minutes, 15 minutes to thirty minutes, and then one hour, one day and so on. You get the idea: it is time aggregation.
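To make the timestamp, interval and aggregation ideas concrete, here is a minimal, hedged Azure CLI sketch that asks for a platform metric at a five-minute grain with the Average formula. The resource ID is a placeholder, and the flag names reflect the Azure CLI as I recall it, so double-check them against your version.

  # Placeholder ID; substitute the full resource ID of one of your VMs.
  VM_ID="/subscriptions/<sub-id>/resourceGroups/my-web-app/providers/Microsoft.Compute/virtualMachines/backend-db-server"

  # "Percentage CPU" for the last hour, aggregated into 5-minute buckets.
  az monitor metrics list \
    --resource "$VM_ID" \
    --metric "Percentage CPU" \
    --interval PT5M \
    --aggregation Average \
    --output table

Changing --interval to PT1H and --aggregation to Maximum, for example, is exactly the time aggregation described above: fewer, coarser data points computed with a different formula.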
This process allows a gradual reduction of data resolution over time, which translates to less storage capacity. The aggregation is based on all kinds of formulas, like average, maximum, minimum and also sum. We will see later that when you select a metric in Azure, you can also select what kind of formula you would like to apply to that metric. That was a quick introduction to metrics; now let's talk about logs. Metrics are great for identifying a problem or an evolving trend and notifying almost in real time about a problem. However, in some cases they are not enough to understand the root cause of a problem. This is where logs come into the picture, to complement the collection of metrics. A log is a record created automatically to describe a specific event that happened at some point in time. It can be an event about a user that just logged into the system, a user that changed some configuration, a process that failed to run, and many more examples. Logs are useful for troubleshooting, debugging and auditing. Some logs are created and collected at predefined intervals, like every five minutes, and some are created sporadically, when something happens. A very simple example: when a user tries to log in to a system, a log record is created describing whether that user was able to log in or not. There are many types of events that can happen in a system, and each event is translated into a log with a different structure and a different set of properties. Logs can contain numeric values, like metrics, as well as text data with detailed descriptions. In the next lectures we will review the types of logs collected in Azure.

10. Data Type #1 - Azure Subscription-Level Activity Log: The first type of data I would like to talk about is activity logs. As the name implies, activity logs help us track different types of activities performed in Azure in the context of our subscription; these are subscription-level events. They can be activities performed by end users, by us, or activities performed automatically by the Azure system. Activity logs are divided into multiple categories, and the first category is Administrative. Using this type of log, we can monitor and track any change made to our cloud resources. This category contains the record of all create, update, delete and action operations performed through Azure Resource Manager; they are also called control-plane operations. In the simplest words: what kind of changes were made, who made those changes, when they took place, and finally the status of those actions. For example, operations like creating a new virtual network, deleting a virtual machine, stopping a virtual machine and much more. Every change submitted to the Azure platform is logged to the Azure activity log, giving us the ability to trace any action related to our resources.
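As a hedged aside, administrative events like these can also be pulled from the command line. The resource group name below matches the demo system built later in this course, and the field names come from the activity log event schema as I recall it, so verify them against your own output.

  # List the last week of activity-log events for the demo resource group.
  az monitor activity-log list \
    --resource-group my-web-app \
    --offset 7d \
    --query "[].{operation:operationName.localizedValue, caller:caller, when:eventTimestamp, status:status.value}" \
    --output table

This is the same "who did what, when, and with what status" information we will later browse in the portal.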
The next category under activity logs is resource health. This category contains the records of any resource health events that have occurred on our Azure resources. They are generated automatically by Azure, and they are used to tell us if our resources are running as expected, and even inform us about the current and past health of those resources. Service health events provide us with information about the services we are using in Azure; for example, compute resources in West Europe are experiencing downtime. Assuming we are using resources in West Europe, there is a dedicated option in Azure Monitor to check out service health issues. Those issues are handled by the Azure team, and this is the place to understand what is going on. The next type of activity log is Alert. We have a dedicated section about it later in this course, but basically we can set up alert rules that automatically identify performance patterns and all kinds of events in our system; every time an alert is created, it is recorded as an event inside the activity log. The last two categories are Recommendation and Security. In Azure there is a smart service called Azure Advisor; it scans our resources automatically and generates a variety of recommendations based on our specific configuration and our usage patterns. It is actually a very nice option that we will review later. And the last one is Security: this category contains records of any alert generated by Azure Security Center. As you can see, there are multiple categories of activity logs. It is a great and useful data type, and it is collected automatically by the Azure platform.

11. Data Type #2 - Azure Resources Diagnostics Logs: We learned that activity logs provide information about operations performed on Azure resources from the outside; these are called control-plane operations, performed on resources in our subscription using Resource Manager, for example creating a virtual machine or deleting a network interface. The activity log is a subscription-level log. The topic of this lecture is diagnostics logs. Diagnostics logs provide insight into operations performed within the resource itself; they are called resource-level diagnostics logs, and they capture resource-specific data from the Azure platform itself. For example, after we have created a network security group, we can monitor how it is actually used. A network security group includes rules that allow or deny traffic at a virtual network subnet or a network interface, and logs will be created about the network security group rules that were applied while allowing or denying traffic.

12. Data Type #3 - Azure Resources Metrics: We already defined what a metric is. Now let's talk about why metrics are needed and what we can do with the metrics collected in Azure. Typically, metrics are best used for three main use cases: first monitoring, then profiling, and the last one is alerting. Metrics are, at the end, just numbers measured over predefined intervals, and this is why they can be quickly summarized into reports and dashboards displaying historical patterns. In Azure, as in other cloud environments, almost all types of resources are limited to some capacity. We can allocate a virtual machine with the computing power of five vCPUs or 20 vCPUs; we can allocate three virtual machines working as a cluster, or maybe six virtual machines working together in that cluster. More resources ultimately translate into more cost. At a basic level, we would like to use our allocated resources effectively and at the same time make sure that our applications are still working as expected. It is a delicate balance between how much capacity to allocate and how much money it will cost, and we will probably do some initial planning, decide what the needed capacity will be, and then update the system based on real performance data.
The ability to monitor what is going on and then take operational and business decisions is the job of a performance management solution; in our case, in Azure, it will be the Azure Monitor tool. Using the metrics collected from our resources, we will be able to identify over-utilized resources, meaning bottlenecks, and under-utilized resources, meaning unneeded resources, and trigger alerts on real-time performance issues or on evolving performance patterns that may impact the system in the near future. Metrics can also be used to create preconfigured scaling policies that automatically scale up, scale down, or scale out or in: if the load on an application reaches a certain point based on a metric, let's add more capacity, and also the other way around, remove capacity when demand is dropping. This aligns very nicely with the concept of elasticity in a cloud environment. For almost every resource type there are multiple metrics that are collected automatically from the moment we create and enable that resource; we will find metrics on virtual machines, network interfaces, storage accounts, security groups and much more.

13. Data Type #4 - Guest OS Metrics and Logs: As part of any Infrastructure as a Service solution, we are going to allocate a group of virtual machines, and Azure will start to collect metrics automatically as soon as we create those virtual machines. In addition, each virtual machine will use some operating system like Windows or Linux; this is the guest operating system. The guest operating system can provide additional, extended telemetry data based on metrics and logs. This is data type number four. It is called guest-level monitoring, and we can enable it per virtual machine. For example, we will be able to collect metrics like processor, memory, network, file system and disk, and all kinds of system log records from the operating system: events like kernel messages, user-level messages, security and authentication messages, etc. When we enable that option, the Azure Diagnostics extension agent is installed on that particular virtual machine. This agent basically monitors what is going on inside the OS and sends that telemetry data back to Azure Monitor. The collected data from the agent is stored in a storage account, so as part of the process we will need to create such a storage account. By the way, when we create a new virtual machine we have the option to enable this OS guest-level monitoring during the creation process, or we can enable it later under a specific setting; we will see those options in the next section.

14. Data Type #5 - Applications Metrics and Logs: Our last data type is application metrics and logs. It is important to understand what is going on in all the building blocks of our solution: the virtual machines, storage, disks, virtual networks, etc. But what about the actual application? An application can generate metrics and logs related to the performance and operation of that application; in the area of performance monitoring this is called APM, application performance monitoring. As a simple example, let's say our application is a website and we would like to monitor the page loading time. This kind of metric can only be collected from the application layer. Secondly, when we troubleshoot an issue, how can we identify the root cause?
Is it a problem with a specific web server that is maybe overloaded, a slow API response from an external system, or maybe a performance issue on the back-end database server? It could be a complex scenario. Still, the starting point should be to measure the direct impact at the application layer. Azure Monitor can also be used to monitor the performance of the application level, and specifically for web applications, as we will see later in this training, there is a dedicated module called Application Insights. Developers can use it during the development phase to test the actual code they are developing, and so can the engineers responsible for making sure the system is operating as expected in a real production environment. As part of the next section, we will install a web server with a website to be used as the application we will monitor.

15. Section 03 - Overview: Hi and welcome back. So far we have learned the different types of telemetry data that can be collected from a variety of data sources. Now, to be able to demonstrate what kind of settings are needed to collect telemetry data, and also how we can analyze it, we need some interesting system: we need a playground to test and play with the available options. That's the topic of this section. It is divided into several steps to simplify the process of creating that system and applying the needed settings. We are going to use a few Linux commands for installation and configuration, and even if you are not really familiar with Linux, don't worry, it is not rocket science, and I will do it step by step. The outcome will be an end-to-end solution we can start to monitor. As I said, we need to monitor something to be able to present the nice features in Azure, especially if I would like to show you how to monitor a solution up to the application layer. So what are we going to build? We start with the back-end database server, using Linux as the operating system. Then we will install a few applications inside. Maybe you have heard about it, but web developers often use a simple testing environment where all the web server components sit on the same server or the same virtual machine. We will use something called a LAMP stack: L is Linux, A is the Apache web server, M is the MySQL database, and P is PHP, a server scripting language. In general, don't worry, the whole installation process is quite simple. As part of the process we will also create a virtual network, and the VM will get a network interface with a private static IP and also a public IP. We will open a few protocols for that particular virtual machine, like SSH, so we will be able to configure it remotely, and also HTTP. The final step will be to set up the MySQL database, and for that we will do some additional installation: we will install something called phpMyAdmin to be able to manage that database, and at the end we will create a database instance with one table inside. Our next virtual machine will be a front-end web server, which will handle the HTTP requests from end users via the Internet. The front-end web server receives an HTTP request, displays a site and accesses the back-end server to get information from the database. It will also be based on the Linux operating system.
We will install the same LAMP stack, use a private static IP and a dynamic public IP, and open protocols like SSH, HTTP and FTP. We will also install an FTP server inside; we will use it to upload our website to the web server, and we will set up permissions to access the back-end database server. The following step will be to set up all kinds of configuration and create a few dedicated resources for collecting and storing the telemetry data from those virtual machines, like metrics and logs from multiple layers. One last thing before we start: you are more than welcome to follow the steps one by one and create the system yourself. Maybe it would be wise to watch all the steps at least one time, meaning the whole section, and only then try to build it yourself; or you can just watch me do it and, after completing the whole course, decide if you would like to move to the practical side. This is our plan; I'm sure we can do it. Let's start.

16. Step 1 - Back End MySQL DB Server: Our first step will be to create and set up a virtual machine for the back-end database server. As you know, we need a resource group, so we create a resource group, type in some name, and then select a relevant region; let's keep it Central US, and create the new resource group. Next will be to create a virtual machine, and for that we need to provide a few details: first the resource group that we just created, and then the virtual machine name; let's call it the backend DB server. Select the region, Central US, no availability options, the image will be Ubuntu Server, and for the size I select the smallest size, which is this one. The next step is the administrator account; we will use a password. Don't forget the user name and password that you type here, they are critical for the setup phase, so write them down somewhere. Moving to the networking aspect, I would like to allow a few inbound ports, and it is easy to do it from here: allow HTTP and also SSH. Great, let's move on to the next tab, Disks, where I select the cheapest option, which is standard HDD. Going to Networking, it is going to create a new virtual network with a subnet, a specific private IP and also a public IP; we also configure the inbound ports. Management, nothing special, guest configuration, and then review the whole setting, and once it is validated I can create. This process will take a few minutes until the new virtual machine is created and up and running; I will skip ahead to save time, to the final result where the virtual machine is ready. Now I can go to the virtual machines list and see the backend DB server; it is running right now. I can click on it and check the settings. On the Overview I can see the public IP address that can be used to access this server. If I go to the Networking blade, I can see the settings for the inbound ports, HTTP and SSH, and also the allocated disk for the operating system. This is more than enough to validate that the server is up and running and we can use it. Next will be to set up the Cloud Shell option in Azure. If this is the first time, there will be some first-time setup: click this option and then create a storage account to store the information. After finishing this phase, I get a prompt.
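Everything up to this point was done in the Azure portal. For reference only, here is a rough Azure CLI equivalent; the resource group name, VM name, size and image alias are illustrative and not necessarily the exact values shown on screen.

  # Create the resource group and the back-end VM (illustrative values).
  az group create --name my-web-app --location centralus

  az vm create \
    --resource-group my-web-app \
    --name backend-db-server \
    --image UbuntuLTS \
    --size Standard_B1s \
    --admin-username idan \
    --admin-password '<your-password>'

  # SSH (port 22) is normally allowed by the auto-created NSG; open HTTP as well.
  az vm open-port --resource-group my-web-app --name backend-db-server --port 80 --priority 900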
To start typing the needed commands, the next step is to open an SSH session, so I copy the public IP address and type ssh with my user name and the public IP address. Click yes, and then type the password you used when creating the virtual machine. That's it, I'm now on the back-end DB server. The first command will be sudo apt update, which updates the index of available off-the-shelf packages that I can install quite easily. Next is installing the LAMP applications: sudo apt install lamp-server^ (with the caret sign at the end). Press Enter; it will take a few minutes to install all the components, so I will skip to the final stage where everything is installed. Now we do a simple test: copy the public IP address, paste it in the browser, press Enter, and I get the default Apache page, just to make sure that the server is up and running and everything is okay. Next we secure the MySQL database using a dedicated script: type sudo mysql_secure_installation, and answer a few questions. I don't want to validate password strength, so I press no; then I need a password for the root account, I'm just using rootpass, something simple to remember, so write this password down as well; re-enter the new root password; yes to remove anonymous users; no to disallowing remote root login; no to removing the test database; and reload the privilege tables. That's all. Next we add a user account to the MySQL database so we will be able to use it. We log in to the MySQL database using this command, type the root password, and we are in the MySQL shell. Inside the prompt we use the GRANT command with the necessary options to apply the needed permissions. It is a long line, so I just copied it from elsewhere: grant all privileges on all database instances for that particular user, called idan, on this server, localhost, identified by this password, idanpass. Very simple; run it, that's all, and exit from the MySQL prompt. Now a slightly tricky configuration: we need to edit the main MySQL configuration file, and I'm using an editor called nano. The file sits at a somewhat complex location, under /etc/mysql/mysql.conf.d/, and the name of the configuration file is mysqld.cnf. Inside the editor I need to find a particular line which binds the database to a particular IP, the local server IP; I need to disable it, so I just add the pound symbol and it will be ignored. Let's save it: save modified buffer, yes, same file name. Last thing, I need to restart the MySQL service so the change is applied: sudo service mysql restart. There is a nice network utility we can use to check that MySQL is listening on the relevant port and address; it is called netstat. Using these parameters, I find the MySQL line, and I can see it is actually listening on all IP addresses on that particular port, 3306. Great, MySQL is working as expected. The last component we need to install is called phpMyAdmin: sudo apt install phpmyadmin. It is a very useful way to set up the MySQL database using a website.
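Before continuing, here is a condensed, hedged sketch of the shell commands used on the back end so far. The user name idan and the passwords are the lecture's illustrative values, the configuration path is the standard Ubuntu location, and the GRANT ... IDENTIFIED BY syntax is MySQL 5.7-era; adapt all of them to your own setup.

  # Back-end DB server: install the LAMP stack and prepare MySQL.
  sudo apt update
  sudo apt install lamp-server^        # Apache, MySQL and PHP in one metapackage
  sudo mysql_secure_installation       # set the root password, answer the prompts

  # Create an application user (illustrative credentials from the lecture).
  sudo mysql -u root -p -e \
    "GRANT ALL PRIVILEGES ON *.* TO 'idan'@'localhost' IDENTIFIED BY 'idanpass';"

  # Let MySQL listen on all interfaces by commenting out bind-address, then restart.
  sudo sed -i 's/^bind-address/#bind-address/' /etc/mysql/mysql.conf.d/mysqld.cnf
  sudo service mysql restart
  sudo netstat -tlnp | grep 3306       # confirm MySQL is listening on port 3306

  # Web UI for managing the database.
  sudo apt install phpmyadmin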
The phpMyAdmin installation asks whether to continue: yes. It then prompts for some configuration: I'm using the Apache web server, and I need to type a password for phpMyAdmin. I would like to configure the database for phpMyAdmin right now, yes; type a password here for phpMyAdmin, confirm it, and we are almost done. That's it, we have finished the whole installation. Now let's copy the public IP address of the back-end DB server, paste it in the browser and add /phpmyadmin to access this application. Great, I'm getting a login page: idan and idanpass, the user name and password I created when I set up the database, and I get the phpMyAdmin user interface. Inside, I create a database instance called mydatabase, create that database, and also a table so I will be able to store something in it. That's the name of the table; I need three columns. Click create table, and then set up the column names. The first one is called timestamp, with the relevant properties for capturing the current timestamp, and it is going to be the primary index for that table; then two more columns, temperature and humidity. I can preview the SQL command that is going to run on the database using Preview SQL; this is the command that will create the relevant table with the relevant columns. Let's close it and run that command, and we have the new columns inside. So we created a database, and then a particular table inside that database with specific columns. Now let's generate a few records in that table. I add one row at a time: type some values for temperature and humidity, one and one, use this option and Go; a new row was inserted into the database. Let's add another one, change it to a different value, Go, and the last one, the same process. Great, let's go to the database and into the table, and we can see the three rows that we entered. That's all: we have a back-end database server running MySQL, with a database and a table with records inside, and now we can actually use it.
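The database and table above were created through the phpMyAdmin UI. For reference, an equivalent sketch from the MySQL shell; the table name sensordata and the column names and types are my assumptions based on what the lecture describes, since the exact values are not legible in the transcript.

  # Create the demo database and a table for timestamped temperature/humidity readings.
  mysql -u idan -p -e "
    CREATE DATABASE IF NOT EXISTS mydatabase;
    USE mydatabase;
    CREATE TABLE sensordata (
      reading_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP PRIMARY KEY,
      temperature  FLOAT,
      humidity     FLOAT
    );
    INSERT INTO sensordata (temperature, humidity) VALUES (1, 1);
    SELECT * FROM sensordata;"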
17. Step 2 - Front End Apache Web Server: In Step 2 we create an additional virtual machine for the front-end server. It is almost the same process, so let's do it quickly. I choose the same resource group, type the virtual machine name, which will be the front-end web server, select the region Central US, then the image, Ubuntu Server, and change the virtual machine size to the smallest size, this one; click Select and Next. Next is setting up the admin user and password; again I'm using a user name and password, so write them down and don't forget them; it is easier to just set the same user and password for the two virtual machines. Open specific ports, HTTP and SSH; we also need to open FTP, and we will do that later. Disks: select the standard HDD. Networking: we already created the virtual network, so we use it, with a private and public IP. Moving on, review and create the new virtual machine. Again, let's go to the list of virtual machines, and I can see the new front-end web server. Click on it, copy the public IP address and open Cloud Shell so I can SSH to that new server: type ssh with the user name and the public IP address of this front-end server, yes, and then the password. The next step is the installation: sudo apt update to refresh the whole apt package index so we can perform the needed component installation. Again we install the LAMP stack using the same command, sudo apt install lamp-server^. It takes a few seconds and we have those components installed on our web server. Let's do a quick test: copy the public IP address, paste it in the browser, and we get the Apache default page. Great, we can move on. I would like to upload the website to the front-end server, so for that I install an FTP server on that Linux machine: vsftpd is the name of the component. Right after the installation I need to adjust some configuration to be able to upload files and actually write to the FTP server, so I edit this file, sudo nano, and this is the location, /etc/vsftpd.conf, the configuration file. I find a specific line that is currently commented out, write_enable; let's remove the pound sign and save the modification. We also need to restart the FTP service so the change is applied: sudo service vsftpd restart. That's it. Let's check that the FTP server is working using the netstat command, and I can see it is listening on port 21, which is the FTP service. The last thing I need to do in the context of FTP is to add FTP as a rule in the list of security rules: any source and any source port, but the destination port will be 21, protocol TCP; let's call it FTP and click Add, and it is added to the network security group, so we will be able to access this front end remotely using the FTP service. It created the new security rule and I can see it over there, FTP. Now I would like to upload a web page to the front-end web server, a page that will access the database on the back-end server, and to do that I need to adjust something. You see these options: the database name, the user name and password, idan and idanpass. The only thing I need to adjust is the server name, which is assigned automatically by the Azure system, so I need to check it: I go to the back-end system, Networking, and copy the private IP address. That's the private IP address, not the public one, and I write it over here. You need to do this too: edit this file and make sure the parameters are correct, meaning the server name, user name, password and database name, and as soon as you finish, save it. Now I upload this web page to my front-end web server. I copy the public IP and open a simple FTP client; I can use the simple ftp command in Windows, logging in to the Linux server using the same user name and password used when creating the virtual machine. Now I put that particular web page on the server using the mput command, for anything that ends with .php; not this one, this one, mydatalogger.php, yes, and that's it. It is a small file and it is uploaded to the front-end server.
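For reference, a condensed, hedged sketch of the FTP-related steps on the front end. The video adds the port-21 rule in the portal; the CLI equivalent below assumes the auto-created network security group is named frontend-webserverNSG in the my-web-app resource group, which may differ in your deployment.

  # Front-end web server: LAMP stack plus an FTP server for uploads.
  sudo apt update
  sudo apt install lamp-server^ vsftpd

  # Uncomment write_enable=YES so authenticated users can upload files.
  sudo sed -i 's/^#write_enable=YES/write_enable=YES/' /etc/vsftpd.conf
  sudo service vsftpd restart
  sudo netstat -tlnp | grep :21        # confirm vsftpd is listening on port 21

  # Allow FTP through the network security group (a portal step in the video).
  az network nsg rule create \
    --resource-group my-web-app \
    --nsg-name frontend-webserverNSG \
    --name FTP \
    --priority 1010 \
    --destination-port-ranges 21 \
    --protocol Tcp \
    --access Allow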
mydatalogger.php. Great. Now I need to change the access permissions for that PHP file, because it should be reachable from the internet, so I use the chmod command, provide the relevant access permissions and the name of the file, and press Enter; it changes the access rights on that file. Then I copy the file into the web server's document root: sudo cp the PHP files from my home folder to /var/www/html, the location the web server serves from. That's it — the file is now located in the right directory and I can access it remotely. The last thing I need to do is allow the front-end server to access the back-end database, and for that we need to add a grant inside the MySQL database. So again I SSH to the back-end server using its public IP address and log in to the MySQL prompt using this command; you will need to type the root password to log into MySQL. We are going to use the same statement structure as before — you remember the GRANT privileges command — so copy it and paste it here. One thing we need to be sure about is that we take the front-end private IP address: I remove the placeholder text and write the private IP address of the front end in its place. So it is going to grant all privileges on all databases to the application user, identified by its password, for connections coming from the front-end private IP address. That's all. Then we restart the MySQL service, and we are finished — it was a long process. Now we will be able to access the web page and run a small test. I take the public IP address of the front-end web server, copy it, paste it in my browser, add the page name mydatalogger.php, and press Enter. And there it is: this is the web page, displaying the records that I placed in the MySQL database. We now have a web page running on the front-end web server, accessing the back-end database and displaying dynamic information.

18. Step 3 - Log Analytics Workspace

At this point we have two Linux servers up and running, and Azure has already started to collect metrics about the resources we created so far. Azure Monitor has two data stores to handle the collected data: one for metrics and another one for logs. The best location to store and analyze logs is an entity called a Log Analytics workspace, so our next step will be to create such a workspace, and later we will use it to store and analyze logs. It is actually a quite simple process: I go to All services and search for Log Analytics — you can see Log Analytics workspaces — and click on it. There is no instance yet, so I create a new one and fill in the basic properties: the workspace name, the existing resource group (my web app resource group), and the location, East US. That's all; click OK and it will create a new Log Analytics workspace. Let's click on the new Log Analytics workspace instance: I am getting the overview blade, but the interesting part right now is the workspace data sources. Nothing is connected to this new workspace yet — nothing is sending data.
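The same workspace could also be created from the Azure CLI. A minimal sketch, where the workspace and resource group names are placeholders rather than the exact names used in the recording:

```bash
# Create a Log Analytics workspace in the existing resource group (names are placeholders).
az monitor log-analytics workspace create \
  --resource-group mywebapp-rg \
  --workspace-name mywebapp-logs \
  --location eastus

# Show the workspace resource ID, which is needed later when wiring up diagnostic settings.
az monitor log-analytics workspace show \
  --resource-group mywebapp-rg \
  --workspace-name mywebapp-logs \
  --query id --output tsv
```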
Let's change that. I would like all Azure activity logs to be connected to this workspace, and the way to do that is to click here, on Azure Activity log, and click Connect. It's a quick process, and a moment later I can see it is connected and sending data to the workspace. Let's go to Virtual machines and do the same process: Connect — finished — and let's do the same for the second server, the front-end web server. That's all: at this point I can see both virtual machines connected to our Log Analytics workspace.

19. Step 4 - Diagnostics Settings

We just created a Log Analytics workspace, so let's start to use it. Many Azure resources are able to write diagnostics logs and metrics directly to Log Analytics. We can enable the collection of a resource's diagnostics logs after the resource has been created, either by going to the specific resource blade or by navigating to Azure Monitor. From Azure Monitor we change the settings under Diagnostic settings. We can see each resource that we created and whether its diagnostics status is enabled or disabled. If I click on some line here, I have the option to enable diagnostics and start collecting: I need to type a name (it doesn't really matter), check the box to send the logs to Log Analytics, select the workspace instance we created, and choose which kinds of logs — let's select all of them — and save. Now this resource is connected to Log Analytics, and back in the list it will show as Enabled. Let's go back: the front-end web server's network security group is connected, and I have only the last two left, so let's connect all of them. This is something you should do for all the resources you are creating, so the whole list will be enabled. Now, there is one special resource type that we can't set up using the diagnostic settings in Azure Monitor, and I'm talking about virtual machine resources. In our case we have two virtual machines, and they are not in this list; there is a different way to add them and start collecting diagnostics logs from the virtual machines. At this point Azure is not collecting any information from inside the virtual machine itself. To be able to do it, the first step will be to install the Diagnostics extension agent, which is used to collect guest-level metrics, logs, and other diagnostics data. Under the individual VM's diagnostics settings I can enable guest-level monitoring: we click on it and it installs an agent inside the virtual machine itself. After the installation we are able to see the list of metrics and logs being collected on behalf of the agent. I can go to the Metrics tab and see the sample rate, which metrics and which syslog facilities are being collected, and the settings of that particular agent. If I go to the Extensions list here under Settings, I can see the new Linux diagnostics agent that I just installed. And because we have another virtual machine, the front-end web server, we follow the same process: go to Diagnostic settings, enable guest-level monitoring.
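As an aside, the resource-level diagnostic settings described above can also be scripted. A hedged Azure CLI sketch, where the resource ID, workspace ID, and log categories are placeholders that depend on the resource type (a network security group is used here as an example):

```bash
# Placeholders: the NSG resource ID and the workspace ID from the previous step.
NSG_ID=$(az network nsg show --resource-group mywebapp-rg --name frontend-web-nsg --query id -o tsv)
WS_ID=$(az monitor log-analytics workspace show --resource-group mywebapp-rg --workspace-name mywebapp-logs --query id -o tsv)

# Send the NSG diagnostic log categories to the Log Analytics workspace.
az monitor diagnostic-settings create \
  --name send-to-log-analytics \
  --resource "$NSG_ID" \
  --workspace "$WS_ID" \
  --logs '[{"category":"NetworkSecurityGroupEvent","enabled":true},{"category":"NetworkSecurityGroupRuleCounter","enabled":true}]'
```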
It will perform the installation of that agent, and after that I will get the agent's settings and related information. That's all, and I'm ready to move on. As a simple test, we go to the Metrics Explorer — which we will cover in much more detail later — just to show you that under the metrics namespace we now have two options: the host KPIs and the guest OS KPIs. The guest OS KPIs are the kind of metrics collected by that particular extension agent, along with all kinds of diagnostics logs, which we will use at a later stage.

20. Step 5 - Azure Monitor for VMs

At this point we have a Log Analytics workspace, and we have already enabled the collection of diagnostics logs from several Azure resources, including the installation of a dedicated agent inside each virtual machine. Now we can move on to Azure Monitor for VMs, which is a complete monitoring solution for virtual machines. If I open a virtual machine — let's take the back-end server — and go to Extensions, I can see only one agent, the Linux diagnostics agent. Then I go down to a special option called Insights, under Monitoring. The first step is to enable this capability, and I need to choose a Log Analytics workspace, which is fine because we already created one, so we just click Enable. It's a process that takes a little bit of time, so let's jump to the end result where it is already enabled. Note that even once it is enabled, it can take up to 20 minutes until data starts to arrive in the Insights tool. Anyway, under Extensions I can now see an additional agent called the Dependency agent for Linux — a very important and powerful agent, and in a minute we'll see what we can do with it. Let's do the same process for the front-end web server: go to Insights and enable the capability there as well. I actually jump to the Map tab and it asks me to enable this option; again it's a process that takes time, and soon it is finished and we can move on. Now let's go again to the list of extensions, just to see that this agent, the Dependency agent, is now installed. That's great. If I go to the Insights tool and the Map tab here, we won't actually see anything yet, because it was just enabled; let's go back to the back-end server, which I enabled before, and there, under the Map tab, we can see something. This is very useful: it presents the list of processes on each server and how each server connects to other components and other servers — something we will use later, in the next section, when we start to use this data.

21. Step 6 - Application Insights

Our last step is about monitoring the application layer, since we are using a web application. This is done using Azure Application Insights, which is part of Azure Monitor; it can be used to monitor the availability, performance, and usage of a web application. The first step is to create an Application Insights resource — yes, there is such a resource type in Azure — and the place we do it is from the Azure Monitor tool, where we have a special category called Applications. If we click on it for the first time it will be empty, and from here we create a new instance by clicking Add. Now, over here, I need to type a name.
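As an aside, the Application Insights resource we are about to create in the portal can also be created from the CLI. A sketch, assuming the application-insights CLI extension is installed; the resource names are placeholders:

```bash
# One-time: add the CLI extension that provides the app-insights commands.
az extension add --name application-insights

# Create the Application Insights component (names are placeholders).
az monitor app-insights component create \
  --app mywebapp-insights \
  --resource-group mywebapp-rg \
  --location eastus \
  --application-type web

# Read back the instrumentation key, which each monitored page needs.
az monitor app-insights component show \
  --app mywebapp-insights \
  --resource-group mywebapp-rg \
  --query instrumentationKey --output tsv
```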
Let's call it "my web application insights". Under the application type I select General; the subscription is Pay-As-You-Go; let's use the resource group we created, the my-web-application resource group, with the location East US, and click Create. After a while I can click Refresh and I will be able to see the new instance — an Application Insights resource. I can click on it and there are many options here, some of which will be covered in sections five and six. So the Application Insights resource is ready to be used. The next step will be to upload a website to the front-end web server, so there will be something interesting to monitor using Application Insights. There is a website zip file that you can download; just extract it onto your drive and open index.html. It is, like, a hotel reservation site — a template with all kinds of pages: rooms, restaurants, about — the kind of web pages where we can generate some traffic by clicking around, and we will monitor the behavior from that perspective. So this is our website, and we need to upload it to the front-end server. Now, to be able to connect the website — the web pages — to the Application Insights tool, we need a script that will send data, and this is what you see here, from the Azure documentation: a script that we need to insert into every page we would like to monitor, just before the closing head tag. Let's do that together. If you download the provided files in this lecture, you will see that the web pages already include that specific script, and the only thing we need to do is edit a particular field called the instrumentation key. When we create an Application Insights resource we get a unique key; it is like the destination address to which a web page that includes the JavaScript code will send our application data. So I'm going back to the Application Insights resource we created, and inside the Overview blade, on the right side, there is a specific field called Instrumentation Key. We need to copy it and then paste it into every web page we would like to monitor. So I open the index.html file and paste my key here — you should do the same — and I need to do the same for all the other web pages I would like to monitor, replacing the instrumentation key in each one. Please perform this simple edit in every web page; when you finish, zip everything into a new file, mywebsite.zip, and we will upload this file. Now open an FTP session to the front-end web server using the public IP address, again with the username and password of that front-end web server, then use the hash and bin commands, and then use put to upload that zip file. Great, finished — the file is now uploaded to the front-end web server. Let's list the files: I can see the mywebsite.zip file. In some cases you need to install the unzip package — this is the command — and then we copy this zip file to the relevant location inside the web server, /var/www/html, and extract it there. I go to the relevant directory where these files will be located. Here we go, this is mywebsite.zip; all I need to do is unzip the file and it will expand all those files into a specific folder — here it is.
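Put together, the deployment commands on the front-end server look roughly like this. A sketch: the archive name follows the walkthrough, and the unzip package may already be installed on your image.

```bash
# Install unzip if it is not already present.
sudo apt install -y unzip

# Copy the uploaded archive from the home directory into the web root and extract it there.
sudo cp ~/mywebsite.zip /var/www/html/
cd /var/www/html
sudo unzip mywebsite.zip

# The site should then be reachable at http://<front-end-public-ip>/mywebsite/
```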
The folder is called mywebsite, and the last step is to open the website we just uploaded from that particular directory, mywebsite. Let's click on it, and I get the hotel site. Now every HTML page I open will send telemetry data, using the JavaScript code, to our Application Insights resource, using the particular unique key we created. That's all: we created our playground solution and configured the needed settings to monitor the different layers. From this point on I'm going to use this system throughout the whole training; if you have any question about the process so far, please post it in the course dashboard.

22. Section 4 - Overview

Hi and welcome back. We're in section number four. Effective monitoring is all about automation. At this point we have already created the demo system to monitor: a combination of two servers, a front-end web server and a back-end server, running inside a virtual network. We also added a hotel website to the front-end server, and it can be accessed from the internet as a regular website. This is basically our playground. In addition, we saw almost all the needed steps to configure, in Azure, the collection of metrics and logs for multiple layers and multiple data sources. We haven't yet learned how and where to analyze this data; we will get to that important step in the next section. In this section, section number four, I would like to talk about the concept of automation, which is really important. Automation is a critical element in any effective monitoring system, and I will explain with a simple example. Imagine that we need to monitor 100 virtual machines that are used by 10 different applications. Even if we check this complex system every hour using metrics and logs, it will take us a lot of time and effort to keep the system running smoothly. The amount of telemetry data will be huge, and it would be difficult to monitor the system effectively. If the CPU utilization on a web server reaches 90% over the last five minutes, we want to know about it now, in near real time. Maybe we would even like the system to do something about it automatically — to proactively perform some action to handle the problem, like scaling out the front-end web server layer and adding another virtual machine to handle the growing amount of traffic, using the elasticity of a cloud environment; this is basically horizontal scaling. So instead of waiting for evolving patterns to show up in a report, with a delay, we can avoid and handle them in real time. That is the essence of automation in the context of monitoring: identify a problem, notify about it, and perform some action to handle it. This is why in every performance management system, including Azure, we will find something called threshold management: basically, we can set up a group of threshold rules that will monitor the collected data constantly and notify us in case of a threshold breach. Those rules can be used to check any layer and any type of resource automatically — that's the power of it — and in this section we will learn how to set up and manage those rules in Azure. Now I will share with you a little secret about automation: today, and probably even more in the near future, automation increasingly refers to the ability to harness machine learning and artificial intelligence algorithms to enhance the automated process of monitoring, and in the context of monitoring that makes a lot of sense.
In some cases it is difficult to set up predefined rules because we don't know what to look for in a complex data set — things are changing all the time. This is the sweet spot where artificial intelligence solutions can help us; if you would like to open a startup around it, it's a good time. Okay, back to Azure: let's start by defining the concepts of alerts and actions in Microsoft Azure.

23. The Concept of Alerts and Actions

In Microsoft Azure, the concept of threshold management is called alert rules, and the option to perform something is called action groups. We, as end users, are responsible for creating an entity called an alert rule. At the basic level, the definition of an alert rule includes the target, the signal, and the criteria for alerting; let's use the following diagram to present it. The first thing we need to select is the target resource. It is used to define the scope and the signals available for alerting, and it can be almost any resource in Azure — for example, the target resource can be a storage account, a virtual network, or a virtual machine. The scope can be one single resource, or a group of resources of a certain type, like a group of virtual machines under a specific resource group. Now, what exactly are we monitoring on the target resource or group of resources? This is the signal. Signals are telemetry data produced by the target resource, and as we learned, they can be metric values or all kinds of logs. So we have the target resource — a specific virtual machine or a group of virtual machines — and a specific signal to monitor, like CPU utilization as a metric collected from that resource. The last thing we need to define is the test criteria: the combination of signal and logic applied to a target resource. For example: CPU utilization is greater than 80% on a web server virtual machine. The web server virtual machine is the target resource, the signal is a specific KPI (CPU utilization), and the test criteria is "greater than 80%". As part of the alert rule definition we will also configure the alert name, a description, and a severity, which ranges between zero and four. Now, let's say we created a few alert rules that are working in our system. When a specific test criteria becomes true, the system automatically triggers an alert event — a new alert is created, and we will learn how to monitor these. The opposite also holds: when the same test criteria goes back to false, the system updates that particular alert automatically. The next thing I would like to mention before going to the practical side is the concept of action groups. In addition to creating an internal alert notification in case of a threshold breach, we can actually do something about it. This is the last configuration option, which is called an action group: a specific action taken automatically when an alert has been triggered. We can use action groups to do many things, and this is the nice thing about them. First of all, on the notification level, we can send a voice call, SMS, or email to relevant team members. Secondly, we can trigger different types of automated actions — for example, we can run a function in some application to do something, or we can call a webhook.
This is like calling an API on a different system to do something. We can also run an automation runbook, used for example as an automated script to create a new virtual machine automatically, and it can also be used to integrate with an external IT service management system, for example to open a trouble ticket automatically.

24. How to Create an Alert Rule

In this lecture I would like to move to the practical side and present the steps to create an alert rule. Under the Azure Monitor service we have a specific category called Alerts. Here we have a few options: first of all, creating a new alert rule; managing existing alerts; and the last one is managing action groups. Starting with creating a new alert rule: select this option and we get a wizard to fill in the required configuration. Select the target resource you wish to monitor; clicking Select brings up another screen to select the resource. I can filter the resources by a specific subscription and then by resource type. In my case I only have one subscription, which is called Pay-As-You-Go. In addition, I filter by virtual machines and get the list of VMs. Now I have two options: I can narrow the condition to a specific virtual machine by selecting it, or I can apply the condition to all virtual machines as a group, under the resource group or even at the subscription level. We can see all this information here under the selection preview, including the available signals, even before moving to the next step. By the way, I can also create and manage alert rules from a specific resource blade. Let's open a virtual machine as an example: under the Monitoring category we have Alerts — the same look and feel. If I click New alert rule, the target resource is automatically selected, which is something I can change. I think it is easier to manage alert rules from the main Azure Monitor screen, which is not filtered to a specific resource, but this option is also available under the resource blade. Okay, let's go back to the Alerts screen, and I will select as the target resource all virtual machines inside that resource group. The next step is to define the condition — it can be a single condition or a group of conditions. Clicking Add condition brings up a screen to select a signal and then a condition on that signal. The list of available signals is automatically filtered based on the type of the target resource we selected in step one. We can filter signals by signal type (metrics or logs) and apply an additional filter on the monitoring service that produces the signal. After selecting a specific signal from the list, we basically need to define the alert logic. In many cases it is useful to take a look at historical patterns and identify what is normal and what is not; Azure helps us by presenting a graph with historical values over a selected time frame. After looking at the historical data and deciding the right values for identifying a threshold breach, we can configure that logic right here. Under the condition there are a few operators, like greater than, greater than or equal to, less than, less than or equal to, and not equal to. Next we select the time aggregation formula — average, maximum, minimum, or total — and of course the threshold value. This really depends on the metric itself.
In some cases it makes sense to apply an average calculation, while in other cases summing the values is more relevant, so it is a case-by-case decision. Below we get a quick, useful preview of our selections so far, and the last thing we can configure is the period and frequency. The period is basically the time span over which to check the condition. In many cases metrics are collected every 60 seconds — this is the raw data. Now, if I would like to check the average CPU utilization over the last 15 minutes, then I select 15 minutes here and the condition will be evaluated over the last 15 values, assuming each value is collected every 60 seconds; it can be an average value, maximum, minimum, or total as the time-aggregation statistic. And we also have the frequency: how often the system should check that condition — every minute, every five minutes, and so on. After performing all the selections we will see the new condition here. Please note that these conditions cost money, and we can see the estimated monthly cost here. We can add additional conditions, up to the platform limits, which change from time to time. To summarize: from the moment the alert rule is created, the system will check the condition every one minute (that is what we selected), look at the metric values of the last 15 minutes, and check whether the average of those values exceeds some percentage. Our next step in the process of creating a new alert rule is to define an action group. If we had already created a few action groups, they would be visible here to be selected; it is empty right now because I didn't create any yet, so let's create a new action group. We need to fill in, first of all, the action group name, which must be unique within the resource group — let's call it VM Load. For the short name of the action group we use the same name, and we select the subscription and resource group. The next step is to create the actions; it can be a single action or a group of actions. Under the action name, which should also be unique, I put "Notify Operation Group". For the action type, the drop-down menu gives a few options; let's select Email and then, in the new screen under the email option, I type my email address. That's it — we can create the new action group and use it in the alert rule. The last thing we need to do is provide some basic alert details: specify the name (let's call it "test rule"), a short description, select the specific resource group and the severity — pick something — and just click Create alert rule. And then we get the new alert rule: you can see the name of the rule, the condition, the status (which can be enabled or disabled), the target resource, the target resource type (virtual machine), and the signal type, which is metric. This is the process end to end. What I would like to do now is create several action groups and several alert rules, then generate some load on a server so that alert events are created based on our configured rules. So I will remove this test rule and start from scratch.

25. Configure Our Rules and Actions

Now we will configure real action groups and alert rules in our demo system. Going to the Alerts blade and starting with Manage action groups: I already created an action group called VM Load, and now let's add a few more. The first one will be "VM admin action".
Let's copy the same name to the short name, and the subscription and resource group will be my web app resource group. Then I would like to add an action to notify the security group that someone performed some admin action on a virtual machine: select Email and type a mail address. If you type an address here, then after submitting the action group you will get a notification email saying you were added to that action group. Okay, I need to reduce the short name to fewer than 12 characters, so let's remove the "VM". Great — we have a new action group called "VM admin action". The next action group will be related to the web application, on a specific metric called page load time: every time a user tries to load a page, this metric measures how long the page took to load. I would like two actions. The first is to notify the operations team that there is a problem at the application layer — same as before, choose Email and type an address and a name. The next one is more interesting: I would like a scale-up action. I will call it "Scale up web layer", and under the action type I choose the option called Automation Runbook. Here you can be really creative with automation: you can use a built-in runbook or create a runbook yourself. I will choose the built-in "Scale Up VM" runbook, applied to a subscription. If you are using this for the first time, you need to create something called an Automation account, which I am doing right now: type the name of the automation account, put it under the same resource group and location, East US, and click OK. This takes a while, so I am jumping ahead, and now I have a new account I can use. So this group has two actions: one to notify and one to scale up the web layer. Here it is, the "Web app page load time" action group — I can see the resource scope and the two actions, email and automation runbook. The next action group is the integration with an external trouble ticket system; let's call it ITSM — IT service management — and it will open a trouble ticket using something called a webhook, which is basically a URL that you call on some other system. How do we actually do that? Under the action name, let's call it "New trouble ticket", and under the action type there is a special option called Webhook. What I need to do is paste a URL here, and I am going to take it from a testing site called webhook.site. Let's look at it for a second: you see the address, webhook.site — it is used for testing exactly these cases. It provides a unique URL that I can simply copy and paste into my action group, and we will see the results over there: if something is triggered, I will see it in real time on that site, which we will use in the next lecture. Paste it in, and we have another action, of type webhook, in the ITSM group. Okay, we now have a few action groups, so we can move on to alert rules. I am going back to this screen — actually, I will go to Manage alert rules, see that it is empty right now, and click New alert rule; the process starts with selecting the target resource.
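Before walking through the wizard, a note: action groups like the ones above can also be created from the Azure CLI. A minimal sketch with one email receiver and one webhook receiver; the names, address, and URL are placeholders, and the runbook/ITSM wiring shown in the portal is not reproduced here.

```bash
# An action group with an email action and a webhook action (all values are placeholders).
az monitor action-group create \
  --resource-group mywebapp-rg \
  --name itsm-trouble-ticket \
  --short-name itsm \
  --action email notify-ops ops-team@example.com \
  --action webhook new-trouble-ticket https://webhook.site/<your-unique-id>
```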
Let's select Virtual machines as the resource type and pick the front-end web server; you can see the selection preview and the available metrics and logs. Now go to the condition: here I would like to use the Percentage CPU metric with a very simple alert logic — check whether the CPU utilization is greater than, on average, the absolute value of 25%. Let's evaluate values over a very short time frame, the last one minute, and the frequency is of course one minute. That will be okay. We also need to select the action groups: let's choose VM Load, and I would also like to open a trouble ticket on that issue. Specify some alert details — the alert name, "High CPU utilization", on a specific virtual machine, the front-end web server — set the severity, and create the new alert rule. Great: a new condition and a new alert rule, enabled right now. The next alert rule I would like to create is on the application layer, on a specific metric, so the resource type will be Application Insights; select the instance we created, "my web app application insights", and the condition will be a specific metric for the browser. Let's search for "browser" and select Browser page load time. Under the alert logic, again something very simple: greater than, on average, 50 milliseconds. Below, let's change the period to five minutes and the frequency to every one minute. Done. The action: if you remember, we created something specific called "Web app page load time", which will send a notification and also run an automation runbook. Type some alert details: "Web app - slow page load time". By the way, the description is automatically attached when the event is triggered, so anyone analyzing the event will see this information. Let's keep the severity at three and create the new alert rule; now we have it. The last one will monitor logs instead of metrics — specific activity logs about the virtual machines. Select a virtual machine, and the condition will be from the Activity log; let's search for "restart": if anyone restarts the virtual machine, I would like to get a notification about it. Below, under the logic, there are different kinds of events; we can filter by the status of those events — failed, started, or succeeded — and we keep "event initiated by" empty. That will be the condition; check the condition preview. Done. Selecting the action group: we created something specific called "VM admin action" to get a notification via email. Again the alert details: "Virtual machine restarted", with the description "A restart action was initiated, please check as soon as possible", and now we can create it. That's it: we have the three alert rules, and now we can actually exercise them by generating load on the virtual machine.

26. Simulating a Load on the Web Server

In the last lecture we learned how to configure an alert rule, and we actually created a few alert rules on our front-end web server. Before we can move on and talk about the concept of managing alerts, we need some events — some alerts — to be created based on the configured alert rules. As a reminder, we created an alert rule that checks the front-end web server's CPU every minute.
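For reference, the CPU rule just described could also be created with the Azure CLI. A sketch under the assumption that the VM and action group names below match your environment (they are placeholders here):

```bash
# Placeholders: the front-end VM ID and the VM Load action group ID.
VM_ID=$(az vm show --resource-group mywebapp-rg --name frontend-web --query id -o tsv)
AG_ID=$(az monitor action-group show --resource-group mywebapp-rg --name "VM Load" --query id -o tsv)

# Alert when average Percentage CPU over the last minute exceeds 25%, evaluated every minute.
az monitor metrics alert create \
  --name "High CPU utilization" \
  --resource-group mywebapp-rg \
  --scopes "$VM_ID" \
  --condition "avg Percentage CPU > 25" \
  --window-size 1m \
  --evaluation-frequency 1m \
  --action "$AG_ID" \
  --severity 2 \
  --description "Average CPU on the front-end web server crossed 25%"
```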
If the average CPU crosses the value of 25%, the rule sends a notification to an email address and also triggers an external web link — a webhook. So in this lecture we will simulate a simple load on the front-end web server. I will do it using a stress application in Linux. I already opened an SSH session to the front-end web server. Let's do a quick update of the package index with sudo apt update, and then install the stress application with sudo apt install stress. That's all, and now we can use it. Before running the stress command on that web server, I would like to open another, parallel SSH session to the same front-end web server; the idea is to run a utility that monitors the CPU almost in real time. It is called htop, and it gives me a nice view of the CPU, the memory, and the list of running processes on the front-end web server. Keep it running, and in parallel we will run the stress command. So we have this view showing CPU and memory, we have the tab with the webhook.site page waiting for the webhook request, and we have the session where I paste the stress command — it takes a few parameters, and I set it to run for 200 seconds. Let's run it. Now we can see the CPU reaching 100%, and after a minute or two we get an email to my account: we can see that an alert was triggered, "High CPU utilization", on the Percentage CPU metric, with the time aggregation and the actual value of that metric. Let's also check the webhook that was supposed to be triggered: here, on the webhook.site page, I can see that there was a request with all kinds of properties — the details are not really important right now, we just want to see that we received such a request. Going back to the Cloud Shell, the CPU has returned to normal values. Let's also trigger another alert rule, the one related to admin activity — a notification in case of a specific activity like a restart. So let's restart the front-end web server from here; that is an activity-log event, and after a while it lands in my email, because we created an action group that sends such notifications to my address: an Azure Monitor alert saying a restart action was initiated, please check. If you remember, we created exactly such a rule. Now, when the specific test criteria in an alert rule is no longer met, Azure will deactivate the alert, meaning it was resolved, and we will see that information as another event arriving in my email. So what next? We managed to trigger some alerts using alert rules; now we would like to be able to monitor those alerts, and this is the subject of the next lecture.

27. Managing Alerts

Hi, and welcome to the last lecture in this section. We managed to create three alert rules and set up some interesting action groups, and we also created the conditions under which some of those alert rules were triggered. The last thing I would like to cover is alert management: how to monitor and manage the alerts. Let's open again the Alerts blade inside Azure Monitor. We get an alerts summary page: over the last 24 hours, the selected time frame, the system created this number of alerts based on this number of alert rules. Each line here represents a severity level with the total number of alerts at that severity; there are five severity levels.
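Aside — the load-generation commands from earlier in this lecture, gathered as a sketch. The exact stress parameters used in the recording are not shown, so the number of CPU workers below is an assumption; only the 200-second duration follows the walkthrough.

```bash
# On the front-end web server: install the stress tool and generate CPU load.
sudo apt update
sudo apt install -y stress

# Run for 200 seconds; the worker count is an assumption.
stress --cpu 2 --timeout 200

# In a second SSH session, watch CPU and memory in near real time
# (install with: sudo apt install -y htop, if it is missing).
htop
```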
Beyond severity, each alert has one of three states: New, Acknowledged, and Closed. New means the issue has been detected by the system; Acknowledged means a user has reviewed the alert and started working on it; and Closed means the issue has been resolved. Just to be clear, we as users are supposed to change each alert's state to reflect the current situation of that issue. Each cell here is a link, so for example I can click on the total number of alerts at a given severity and get a list of the alerts created within the selected time window. Clicking on a specific line — a specific alert — brings up a screen with the details of the selected alert, where I can change the alert state. One important thing to keep in mind is the difference between the monitor condition and the alert state. The alert state is something set by the users — by us — after the alert was created, meaning we can change it to Acknowledged or Closed. The monitor condition is a value set by the system, updated by the underlying monitoring service that detected the issue, and it has two values: Fired and Resolved. A simple example: let's say we monitor memory utilization on a virtual machine, and the condition is that an alert should be created when memory utilization crosses 60%. Over the last five minutes the memory utilization reached 70% and an alert was created. Twenty minutes later the memory utilization went back to 40%. In that scenario, if no one has opened the alert, it is still in the alert state New, but the monitor condition that caused the alert to fire has already cleared and will automatically be set to Resolved. Another nice option for summarizing alerts is smart groups. Smart groups are created automatically using machine learning algorithms that look for patterns: Azure tries to group alerts in a smart way to reduce what is called alert noise — too many alerts that we need to handle — and to actually help resolve events faster. We can switch between views by clicking the banner at the top of the page. Smart groups have the same properties as individual alerts, such as a user-defined state and a history, so we can click on a smart group and change its state, and we can see the number of alerts that are part of the same group. The main idea is that in many cases an alert in one place can actually be the root cause of many other alerts, and based on historical patterns the system can group alerts that share a common potential root cause. I played with it a little bit and was quite impressed by the results, so this can be a quite useful feature.

28. Section 5 - Overview

Hi and welcome. Thanks for watching so far, and don't forget that you are more than welcome to ask questions and suggest things, and also to share your knowledge and experience — if you have something interesting to share, go ahead and let us know about it using the course dashboard. In the last section, section number four, we covered a key element in a monitoring solution, which is automation. Basically, we learned how to configure alert rules and action groups, used to identify patterns in the collected telemetry data, generate notifications, and in some cases also trigger actions. This is great, but what about us,
the end users? In the following sections, section number five and onward, I plan to present the tools we have in Azure to visualize, analyze, and troubleshoot issues related to our cloud environment, which we implemented inside Azure. The first layer I would like to start with is the Azure platform itself. If you remember from level one, the first course in this training program, Azure is a global infrastructure managed by Microsoft — a collection of cloud services available in a growing number of locations worldwide. Whether we like it or not, those services may have interruptions for many reasons. Think about your home internet provider: do you think your home is connected to the internet 24 hours a day, seven days a week, 365 days a year, without any downtime? Well, probably the answer is no. Those telecommunication companies perform equipment maintenance and software and hardware upgrades in different locations in the network, and in some cases there are also unplanned issues, like when some contractor cuts an important fiber while digging in the ground. Problems will happen sooner or later — it's a question of time — and this is true also for a global cloud provider like Microsoft Azure: problems in Azure data centers, or problems in external services, can indirectly impact the services provided by Azure. In this section I would like us to be able to understand whether a problem we see in our system is actually related to the Azure platform, that is, the services provided by Microsoft Azure, and there are several tools to help us identify this borderline.

29. Azure Global Status

Microsoft Azure provides us with a dedicated web page to monitor Azure's global status, called Azure Status. Anyone can access that page, even without an account; let's open it. This page summarizes the health of Azure services. On the left side we have a list of products and services under high-level categories like Compute, Networking, Storage, Web, Mobile, and so on. Each column represents a specific region, like East US, Central US, West US, and they are all grouped under geographical locations like Europe and so on. Every cell in this table can be in one of the following status options: good, warning, error, or information, and we can adjust the overall refresh rate — the page pulls updated information from the Azure platform, and this is the most recent snapshot of the global status. Now, let's say one of our virtual networks is down. We know where we implemented that virtual network — it is a resource — so as a first step we can open this page and check the status of the services related to networking, focusing on the relevant region. In addition, we have the option to investigate the Azure Status history from here, and then filter for a specific product — for example, Virtual Networks — keeping all regions and looking at the last 90 days. Inside, I can see a list of issues with a detailed description for each one. For example, during November 2018, a subset of customers using Virtual Networks in UK South and UK West may have experienced difficulties connecting to resources hosted in these regions. If I check the root cause analysis, it was a 16-minute hardware failure on a router that caused a large amount of traffic to fail over from a primary path to a backup path, causing increased congestion on that backup path.
Under the Next Steps section we will also find some details about what was done, or is planned to be done. This is actually a great example of sharing information with the public, which is important for increasing the trust between customers and cloud service providers. Now, this history list, and the Azure Status table we just saw, is a summary of all services available in Azure across all regions. We as customers use only a subset of those services — a small subset, and a small number of regions — so it seems that a more focused view would help us check what is going on with the services we are actually using. And this is the subject of our next lecture: Service Health.

30. Personalized Service Health

In many cases we, as customers, use a very small fraction of Azure services. To put it in the simplest words, we don't really care about all of the Azure cloud infrastructure, only about our own infrastructure, so it is useful to get a narrower, more focused view of the services we are using. Inside Azure Monitor we have an option called Service Health. Service Health provides us with a personalized dashboard that tracks the health of our services in the regions where we are using them, and the tool also provides guidance and support when issues in Azure services affect us. The first option, Service Issues, helps us understand whether there are active service issues right now. For example, say I created a virtual network resource in a specific region: if there is a service issue with the Virtual Network service impacting that region, it will be visible here under Service Health, and the region would no longer be marked with a green point. Right now, in our playground, we can see green points indicating that everything is okay in the two regions we use. I can also check past service events in the health history. As a nice example, let's open the Health history option and filter for the last three months. In my case, when I recorded this course, I was able to find an event that was actually all over the news; it is called an RCA - Network Infrastructure event. When we review a specific event, it is useful to check when the issue started and which services and regions were impacted. For example, this event started at this date and time (UTC); the impacted service was network infrastructure, which is actually a critical layer; and two regions were impacted, East US and East US 2. In addition, inside the Summary tab we can check what the Microsoft team was doing about it, or maybe is still doing about it. Under the RCA tab we can understand that the root cause was an external DNS service provider. They don't mention the name here, but a quick search on the internet shows it was CenturyLink, and according to the description, CenturyLink experienced a global outage after rolling out a software update which exposed a data corruption issue on a number of routers. This is a great example that problems are just a question of time, and we need to be as prepared as we can. The next useful option is to automate the process to some level: we can get a notification the next time an Azure service issue affects us, or maybe perform some actions, based on a specific service health alert. This is done by creating a specific type of alert rule called service health alerts, and it is similar to the process of creating alert rules and action groups on Azure resources.
Let's create a new service health alert rule. Under the alert target I select my subscription. Next, under Services, let's clear all services and then select Virtual Machines and Virtual Network, and also Network Infrastructure. For the regions I select Central US and East US, where my resources are allocated, and for the service health criteria I would like to know about unexpected service issues as well as planned maintenance. For the action group, let's select an existing one, like the ITSM group, and under the alert details, for the alert name, let's call it "Service issue on Azure platform", with the description "A service issue on the Azure platform was detected". That's it — this is an example of automatically monitoring service health issues. Now let's say we go to Service notifications and understand from this category that there is a service issue going on right now in a specific region. How can we quickly verify the status of our resources? How can we know whether our resources are being affected right now? That is the next option right over here, which is called Resource Health. We can select a specific resource type, like storage accounts, and all the storage account resources I created will be listed here; for each resource, a health status indicator helps us understand whether that particular resource is currently available. If I click on a specific resource, I can also see historical information about its status. Let's talk about the resource health status options, which are important in the troubleshooting process. The first one is Available, which means the service hasn't detected any events that affect the health of the resource. A status of Unavailable means the service has detected an ongoing platform or non-platform event that affects the health of the resource: platform events are triggered by components of the Azure infrastructure and include both scheduled activities, for example planned maintenance, and unexpected events; non-platform events, on the other hand, are caused by user actions — for example, if someone stops a virtual machine, the status of that resource changes to Unavailable. Another health status is Unknown, which indicates that Resource Health hasn't received information about this resource for the past 10 minutes; if we are experiencing problems with our resource and see this Unknown status, it may suggest that an event in the platform is actually affecting our resource. The last one is the Degraded health status, which indicates that our resource has detected a loss in performance, although it is still available for use — this is a performance issue. The last option I would like to cover is called Planned Maintenance: like in any system, the Azure team must perform all kinds of maintenance activities, like upgrading hardware and software components, and here we can see the scheduled upcoming maintenance activities, which can affect some services in Azure and, as a result, also our resources. That's all about Service Health. We learned how to draw a clear line and understand whether an issue we see in our resources is actually related to a global service problem in Azure, and now we are ready to move on and shift our focus from the Azure platform to the resources we created in our system.

31. Section 6 - Overview

Hi and welcome back.
Thanks for watching so far; we are in section number six, monitoring our Azure solution. A small request from my side: please spend two or three minutes and write down, inside the course review section, your experience so far — every review is important for me. I put great effort into providing interesting training and I hope it is working for you as expected. In the previous section we focused on the Azure platform as the underlying infrastructure, using the Azure Status page and the Service Health tool inside Azure Monitor. We now know how to draw the line when a service issue impacts our resources, which is useful during the troubleshooting process of finding the root cause. In this section I would like to move on and talk about monitoring our specific infrastructure-as-a-service solution, and the main resources in such a solution are virtual machines — the servers we are using in our cloud system. We will learn how to check the status and health of a virtual machine and trace activities, meaning actions performed on resources, whether by end users or by the Azure platform; use the Metrics Explorer tool to overlay a variety of performance metrics on charts; perform more complex analysis using the Log Analytics tool while querying logs; utilize Azure Monitor for VMs as a consolidated end-to-end solution for monitoring virtual machines; optimize our system using the recommendations coming from Azure Advisor; and lastly perform some interesting analysis at the application level using the Application Insights tool. As a reminder, in section number three we already covered how to perform all the needed configuration while creating our playground, so at this point we can focus on what we can do with the data rather than on how to collect it.

32. Review the Status, Health and Activity Logs

Let's open a virtual machine in our playground — for example, I will open the front-end web server — and review together a few options that will help us monitor the status of that particular virtual machine. Under the Overview category I check the status of the VM. For example, this VM is in a Stopped status, which means it is also deallocated, and there is no public IP address attached to it. If I jump below to check the Resource Health option, we will see that it is Unavailable right now, which makes sense because it was deallocated. It also means that no metrics are being collected from that virtual machine. Assuming I would like to know when this virtual machine was deallocated and by whom, my next stop is the Activity Log. As a reminder, activity logs help us track different types of activities performed in Azure in the context of our subscription, and there are two main types: activities performed by end users — by us — and activities performed automatically by the Azure platform. In our case, let's use this tool to find the answer to my question. First of all, I need to make sure the filtering options here are configured as needed: this particular VM is selected, and I add another filter on the type of operation, selecting "Deallocate Virtual Machine". And here we have it: this virtual machine was deallocated at this date by this user. Yes, you can blame me — I am the one who deallocated that virtual machine.
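The same question can be answered from the command line as well. A sketch using the Azure CLI activity-log command; the resource group name and time window are placeholders, and the filter assumes the standard deallocate operation string used by the compute provider.

```bash
# List recent activity-log entries for the resource group and show who did what, and when.
az monitor activity-log list \
  --resource-group mywebapp-rg \
  --offset 7d \
  --query "[?contains(operationName.value, 'deallocate')].{operation:operationName.localizedValue, caller:caller, when:eventTimestamp}" \
  --output table
```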
Now let's go back to the Overview option and start this virtual machine. It will take a while, but after a few minutes the virtual machine is up and running — we can see that under the status — and a public IP address is now allocated to it, so we can access it from the internet. Going to the Resource Health option, it is now Available, without any known platform problems that could affect this specific virtual machine. As a quick summary, we talked about three main things in this lecture: the status of the virtual machine, its resource health, and the activity logs.

33. Metrics Explorer

Our next topic is metrics, using the Metrics Explorer tool embedded in Azure Monitor. Let's open Azure Monitor and select Metrics to get the Metrics Explorer tool; we can open it from a particular resource, like a virtual machine, or from Azure Monitor. Let's talk about the options we have here. The first step is to select a specific resource we would like to investigate. I can select the resource group and keep all resource types — I have a small number of resources in my demo system, so that is not a problem — and then select the specific resource from the list; let's take the front-end web server. Now the tool displays only metrics related to the selected resource, in our case the virtual machine. Metrics are grouped into something called namespaces. In this case, metrics can come from the underlying virtual machine host infrastructure: Azure Monitor collects host-level metrics like CPU utilization, disk, and network usage for all virtual machines without any additional configuration, and we can chart those by selecting the Virtual Machine Host namespace. Another option is to analyze metrics collected from within the virtual machine, meaning from the guest operating system, using agents. For example, under the VM Host namespace I am not expecting to find metrics about the virtual machine's memory, but if I switch to the Guest namespace here, I can find such a metric. Let's select the Memory Used metric, and I will also add another metric, called Memory Percentage, on the same graph. I can adjust the graph name and chart type — let's do that for a second — and play with more options in the chart settings, for example manually adjusting the minimum and maximum values of the Y axis. The next, very important filter is the time range. Let's select the last four hours. Metrics are collected every 60 seconds, so any option we select completely changes the chart. Say I would like to understand the memory load, on average, for the past week: I just select seven days as the time range. Now, this virtual machine is mostly not running in my demo system, so it makes sense that over the last seven days the memory utilization was zero — no data was collected. In addition, I can save this chart as a template and pin it to a specific dashboard, and also export the data to Excel. We can also set up alert rules on those metrics, which is something we already covered in great detail. Let's open another type of resource, for example the storage account: under the Metrics option we will see several namespaces — Account, Blob, File, Queue, Table. If you remember from level one, those are the different storage options under a storage account, and for each one I will see different metrics.
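As a brief aside, the metric values behind these charts can also be pulled programmatically. A sketch with the Azure CLI, where the resource ID is a placeholder and "Percentage CPU" is one of the standard host-level metric names:

```bash
# Placeholder: the front-end VM resource ID.
VM_ID=$(az vm show --resource-group mywebapp-rg --name frontend-web --query id -o tsv)

# Average CPU at one-minute granularity over the default recent window.
az monitor metrics list \
  --resource "$VM_ID" \
  --metric "Percentage CPU" \
  --interval PT1M \
  --aggregation Average \
  --output table
```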
Let's open another type of resource, for example this storage account. Under the Metrics option we will see several namespaces: account, blob, file, queue and table. If you remember from level one, those are the different storage options under a storage account, and under each one of them we will see different metrics. For example, under account I will select the Egress metric, which measures the data egress, the outgoing traffic from the storage account. And the last example will be a network interface: let's select a network interface, and there let's overlay the Bytes Received as well as the Bytes Sent, all together. That's all about Metrics Explorer; it is a great tool for performing ad hoc analysis on performance metrics. 34. Log Analytics: The following lecture is a little bit more complex, assuming you don't have any experience with analyzing logs, so go ahead and grab some nice coffee or tea, whatever works for you, and let's start. Analyzing performance metrics on charts is very useful for many use cases, and we saw that option in the previous lecture. Using metrics we can understand that there was a performance issue on a specific resource at a particular time, which is great for pointing out the impacted resources. However, in some cases those metrics are not enough to understand the real root cause that created the performance degradation in the first place. Let me give you an example: let's say that our front-end web server is under a denial-of-service attack from a particular location. This event can be translated into higher memory utilization inside the guest operating system, because the server is working hard while trying to handle the amount of requests. If I analyze just the memory utilization as a metric, as an isolated data source, it will be difficult to correlate between those issues and point out the right root cause. Such detailed security events are collected as logs under my network security groups; those are the diagnostics logs. So why not add another type of data, meaning logs, to cover more complicated use cases? We can do it by using the Log Analytics tool, which is embedded inside Azure Monitor. Logs can be collected into an Azure Log Analytics workspace. If you remember, we already created that workspace in section three while creating our demo system, so let's see that workspace: going to All Resources and looking for the resource type Log Analytics Workspace, this is my workspace; let's open it. Now, what kind of data sources are stored in this workspace? We can see it right over here under Workspace Data Sources. Data sources can be virtual machines, and we can see that my virtual machines are connected to the workspace, meaning all kinds of diagnostics logs from those virtual machines are stored here. If I go to the Azure Activity Log option, we will see that the subscription-level activity logs are also stored here. We can query the logs that are collected in Azure Log Analytics by using a dedicated query language called Kusto, or KQL, the Kusto Query Language. This language was designed by Microsoft as a simplified layer for querying big data storage. We are going to learn the basic language structure; don't worry, it's not really complicated. Okay, let's move to the practical side. We use this option called Logs; by the way, I can open it from the Log Analytics workspace, as we are doing right now, or from Azure Monitor, as we have done so far with other tools. After opening the Logs tool, we get the option of writing a query using Kusto in this editor. At the basic level, a Kusto query is a read-only request to process data and return results.
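Just to preview where we are heading with the denial-of-service example from a minute ago: once the network security group diagnostics logs are flowing into the workspace, a query over them could look roughly like the sketch below. This is only an illustration; the exact category values and columns depend on which NSG log categories you enabled in the diagnostic settings.

    AzureDiagnostics
    // diagnostics records from the last hour
    | where TimeGenerated > ago(1h)
    // keep only network security group logs
    | where ResourceType == "NETWORKSECURITYGROUPS"
    // how many records per log category and per NSG
    | summarize count() by Category, Resource

Don't worry about the syntax yet; we are about to build it up step by step.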
The request is stated in plain text, using a data-flow model designed to make the syntax easy to read and write. If you are familiar a little bit with SQL, you will be able to see that it uses schema entities that are organized very similarly to SQL: we have databases, tables and columns. Let's do something. Azure Log Analytics organizes data in tables, each composed of multiple columns storing different types of information. The syntax of a query will usually start by declaring the data source, meaning a specific table name in the database. These are called table-based queries, and they are used to define a clear scope for the question, which translates into better performance. Take a look at the left side over here: this is the list of available schemas under my workspace, and I can search for a specific schema name in the search area. The idea is to find the relevant table here and then take a look at the data in that table. Let's open Log Management, and inside we will see many tables that are used to store different types of logs. For example, I can preview a sample of the AzureActivity table by clicking here, and we will see the result over here; this is the place where we will see all the activity logs. Let's review another one, AzureDiagnostics, which includes logs about what's going on inside our resources — very useful for troubleshooting. Moving to the editor, I will type Perf, which is the table used to store performance metrics, assuming this is the table I would like to investigate. This simple syntax is now declaring the scope of the query, the table in which I would like to search for data. Right after this schema name we can use a set of data-transformation operators that are connected together through the use of the pipe delimiter; the pipe character helps us separate commands, so the output of the first command is the input of the following command, similar to the Linux syntax. As a starting point, I will select only one command, called take, with 10 records as an input, and click Run. The query returns 10 results from the Perf table without any specific order, but it is a useful way to take a look at the table and understand its structure, the columns in the table. Looking at a few lines here, we have the name of the virtual machine under the column Computer, the relevant component under ObjectName, like Processor, Network or Memory, and then the counter name, counter value and more. Keep in mind that the default time range when running a query is the last 24 hours; to get only records from, for example, the last four hours, I can select this option and run the query again. If I would like to take the latest 100 records and organize them in descending order, I can use the top command, which will sort the entire table on the server side and then return the top records. So it goes like this: Perf, and then top 100 by TimeGenerated descending. What about filtering data based on a condition? We can use the where operator, followed by one or more conditions. In this example I will add: where ObjectName is equal to Processor. Now I am getting records only with this specific object name. Assuming I would like to filter out values that are below a certain number, let's add, after the ObjectName equals Processor condition, "and CounterValue greater than or equal to 2". The queries we have built so far are written out below.
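For reference, here are the three queries from this part of the lecture written out in full, as a sketch against the Perf table collected by the agents in our demo system:

    // preview: 10 arbitrary records from the Perf table
    Perf
    | take 10

    // the latest 100 records, sorted on the server side
    Perf
    | top 100 by TimeGenerated desc

    // only processor records with a counter value of 2 or more
    Perf
    | where ObjectName == "Processor" and CounterValue >= 2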
Another remark about the where operator: we can also use it to apply a specific time filter, overriding the global time range. So I will write: where TimeGenerated is greater than ago two days, or any other value we would like to put inside. What about reducing the number of columns and controlling the structure of the result? To select specific columns, we can use the project command. It goes something like this: I keep the "where TimeGenerated is greater than ago two days" line and add "project TimeGenerated" — this is the first column I would like to get — and then Computer, ObjectName, CounterName and CounterValue. Another useful option will be to aggregate and group records. In that case we can use summarize to identify groups of records according to one or more columns and then apply some aggregation to them. The most common use of summarize is count, which returns the number of results in each group. So it goes something like this: again Perf, where TimeGenerated is greater than ago two days, and then I will add "summarize count by ObjectName". This query retrieves all Perf records from the last two days, groups them by ObjectName and counts the records in each group; here I am getting the number of records per object name for the last 48 hours. If I would like to break these groups into subgroups based, for example, on the counter name, then I add another dimension; each unique combination of these values defines a separate group. So just after the ObjectName I will add CounterName and get a new result, each line with the number of records for this second dimension. I can also perform some mathematical or statistical calculation for each group. Let me fix the second line: summarize the maximum CounterValue by Computer and CounterName. This query will calculate the maximum counter value for each computer and for each counter name; here is the list of counters for the front-end server, with the maximum value over the last two days. The last option I would like to show you is the search query. A search query is an alternative to the table-based queries we have used so far. Search queries are less structured, which makes them a better choice when searching for a specific value across columns and, of course, even across multiple tables; just keep in mind that such queries can take more time to process the data. The first option is to search the whole database without any scope: for example, we can search for the word "deallocate"; it will check all the tables, and eventually it will reach the AzureActivity table with all kinds of events related to deallocating virtual machines. Another option is to search in a specific table; this is called table scoping. So we write search in AzureActivity, with the same word, "deallocate". I can also perform column scoping, meaning searching in AzureActivity but in a specific column; in my case it will be OperationName, and I would like to search for the word "delete". Okay, that's all. I think this is more than enough to get you started and to feel more comfortable using this powerful query language to analyze different types of logs. The full set of queries from this part of the lecture is written out below.
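And here, for reference, are the remaining queries from this lecture written out; again this is a sketch, and the column-scoped search follows the Kusto search operator's column:term form:

    // time filter inside the query, overriding the global time range,
    // keeping only the columns we care about
    Perf
    | where TimeGenerated > ago(2d)
    | project TimeGenerated, Computer, ObjectName, CounterName, CounterValue

    // number of records per object name over the last two days
    Perf
    | where TimeGenerated > ago(2d)
    | summarize count() by ObjectName

    // break the groups into subgroups and compute a statistic per group
    Perf
    | where TimeGenerated > ago(2d)
    | summarize max(CounterValue) by Computer, CounterName

    // unscoped search across all tables
    search "deallocate"

    // table scoping
    search in (AzureActivity) "deallocate"

    // column scoping
    search in (AzureActivity) OperationName:"delete"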
35. Azure Monitor for VMs: In almost any Infrastructure-as-a-Service cloud solution we will allocate virtual machines for a variety of functions. For example, in our playground we have two virtual machines: one is the front-end web server connected to the Internet, and the other one is the back-end database server. Each server is running multiple processes that may be connected to other processes running on a different server. Azure Monitor for VMs is a tool that helps us better manage a group of virtual machines and also get deeper insight into the internal processes in each virtual machine. It is basically an out-of-the-box, consolidated monitoring view dedicated to virtual machines. To be able to use it, we need to enable it at the virtual machine level, which we already did in section number three, so we are ready to start using it. We can open this tool from each virtual machine or from Azure Monitor; let's open it from a specific virtual machine. Okay, we have the following three tabs: Health, Performance and Map. The Health tab displays a quick health summary of this virtual machine, including the resource health, which we can also access from the Resource Health option — the status reported by the Azure platform. The next one is the guest virtual machine health, which is based on diagnostic data collected by the agent inside the virtual machine, and also a nice breakdown of the health of the main virtual machine components: CPU, disk, memory and network. If you think about it, it is a useful health summary of that particular virtual machine; if something is not okay, we can investigate. I can click on Disk to get more information about why the health status of that component is in a different state. Moving next to the Performance tab, we find predefined reports on several key metrics related to this virtual machine. Just select the needed time range — it is now the last 24 hours — and we have all kinds of views, nicely presented: for example, logical disk performance, how much space is allocated and used, and what the read and write IOPS are. Based on this view we can identify storage bottlenecks, CPU utilization and available memory. During the process of creating a new virtual machine we selected the virtual machine size, including the computing power, and here we can see if the computing resources are over-utilized or under-utilized, plus more detailed graphs about the allocated disks, like the logical disk IOPS, transfer rate in megabytes, and so on. We could reach those metrics from the Metrics Explorer, but it is easier to see all of them in one place. The next one, called the Map tab, is really interesting and useful. It can be used to display information about the internal processes that are running inside the virtual machine; in addition, it shows us how each process inside this virtual machine is connected, at the network level, to other computers. Here we have the front-end web server with nine processes running inside. I can click here and get the list of processes: for example, on this server we have the Apache2 web server, the MySQL server and also the FTP server. In addition, we can see lines connecting a specific process to some external servers via a specific port number. For example, this internal process, called mdsd, is connected via port number 443, which is a secure communication channel; this process is actually the Linux diagnostics extension agent, and it is sending telemetry data from the virtual machine into the Azure Monitor database. If I click on that line, I will be able to see some metrics like response time, requests per minute, traffic and so on.
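A side note: the Map view is built from data that the VM insights agents write into our Log Analytics workspace, so the same connection information can also be queried. Here is a minimal sketch, assuming the Map feature stores its data in the VMConnection table as it does in my workspace (the "frontend" computer-name filter is hypothetical; adjust it to your machine names):

    VMConnection
    // connection records from the last hour for the front-end server
    | where TimeGenerated > ago(1h)
    | where Computer startswith "frontend"   // hypothetical name filter
    // aggregate traffic per process and destination
    | summarize sum(LinksEstablished), sum(BytesSent), sum(BytesReceived)
        by ProcessName, DestinationIp, DestinationPort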
Now, let's do a small test. If you remember, during section number three, while creating the demo system, we uploaded a small PHP test file to the front-end web server, and then we created a database with a simple table inside and uploaded some information into that table. Now, when we access this file on the front-end web server, it queries the database server, and this is the test I would like to use now. So I will access this PHP file using the front-end public IP followed by the specific file name, and let's run it several times in different tabs. Okay, it will take a minute or two; let's go back to the Map view. Now we can see another connection between the front-end server and the back-end server. I can open this and we will see the back-end server: it is TCP port number 3306 being used for connecting to the MySQL server. We just created a traffic stream between the front-end server and the back-end server while accessing the PHP file. Let's do another simple test: I will open an FTP connection to the front-end server from my computer — ftp and the front-end server public IP — upload some heavy file and open the Map again. Now we will see another link, using port 21, which is the FTP connection to the front-end server. As you can see, this is really useful for mapping and investigating the logical connections between several virtual machines that are interconnected to create a more complex solution. Okay, this is the Map view. Another option to consider is to open this tool from Azure Monitor instead of from a specific virtual machine; in that case, the information we get is a summary of a group of virtual machines, and we can select which resource group we would like to investigate. I think this can be a better starting point before looking at a particular virtual machine. In addition, under the Performance tab we can benchmark metrics between different virtual machines. Let's open it: for example, in this CPU utilization graph we can see the virtual machines that I created in the demo system. Another option is to select the Top N List tab; here I can see the average utilization of each virtual machine in a specific time frame, and I can also select Add Metric — let's change it to logical disk space used. Okay, this is a very useful option for quickly comparing virtual machines.
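One more side note before moving on: the counters behind these VM insights performance charts are stored in the InsightsMetrics table of our workspace, so the same comparison can be done with a query. A rough sketch; the namespace and counter names are the ones I see in my workspace and may differ slightly between agent versions:

    InsightsMetrics
    // VM insights performance counters from the last hour
    | where TimeGenerated > ago(1h)
    // CPU utilization percentage per machine
    | where Namespace == "Processor" and Name == "UtilizationPercentage"
    | summarize avg(Val) by Computer, bin(TimeGenerated, 5m)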
36. Azure Advisor: The next nice tool I would like to talk about is Azure Advisor. As the name implies, Advisor is a recommendation tool that can help us optimize our cloud deployments. Basically, it automatically scans our deployment all the time and utilizes all kinds of best-practice guidelines and knowledge to provide us with recommendations in four main areas: availability, performance, security and cost. The nice thing about it is that this is an out-of-the-box service, meaning we don't need to configure anything, and it is not associated with additional cost. Let's open Azure Monitor, scroll down to the supporting troubleshooting category and open the Advisor recommendations. We can see that it is divided into these four main topics. Right now, under the security topic, I have several recommendations to review. Clicking on this security topic opens the Security Center, which is a summary dashboard; over here, the recommendation is to use disk encryption on my virtual machines. If I click on this item, it opens more information about this specific recommendation. In a real production system with many virtual machines, this tool can really help us mitigate all kinds of problems and misconfigurations, and reduce cost, by making sure things are optimized. 37. Application Insights: Until this point we covered the main tools for monitoring the Azure infrastructure as well as our allocated resources. We used the Metrics Explorer to visualize and analyze performance metrics, the Log Analytics tool to analyze logs, Azure Monitor for VMs, which provides deeper insight into our virtual machines, and the last one was Azure Advisor, which helps us optimize our system. Now I would like to talk about the upper layer in our monitoring playground, which is the application. In Azure we have a dedicated tool called Application Insights. The main target of this tool is to help web developers during the development cycle of a web application, as well as DevOps engineers who are responsible for web applications running in a production system. A simple example: if a web developer just released a new feature or fixed some issue, it is possible to measure the performance impact from the user's perspective. We can also use it to find out which pages are more popular on our website, at what times of day or on which days of the month, and the location of the end users accessing those pages. It is a very powerful analytics tool. Now, just to set expectations: this tool includes an extensive list of features, and I am going to touch only some of them that are relevant to the system I created for our playground. Let's start. Going to Azure Monitor, under the Insights category I will select Applications. If you remember, when we created our demo system, our playground, in section number three, the last step was to create the Application Insights resource, and this is the resource we can see here; let's open it. In addition, we connected our web application to this resource by putting in a small piece of JavaScript code with our specific instrumentation key — we did that in section three. Assuming we have done everything as needed, any user access to the website will be recorded and can be analyzed using this tool. Before using this analysis tool, I would like to generate some traffic on our website located in the playground system. For that, I need the front-end web server public IP; I just copy and paste it into the URL and add the port where our site is located. Now I will jump between a few pages, open the same URL in another web browser and move between the pages; all those actions will create some traffic, some logs, in Application Insights. Under the main Overview page there are some default charts. Some of them are on the server side, which we are not monitoring right now, and this is why they are empty. This one, Unique Users, is based on client-side logs, and we can see the total number of users connected over the last hour. Going to Metrics, we will be able to investigate different application metrics; in our case, the relevant metrics are related to the client side, like browser page load time, and I will adjust the time range. In some cases it is difficult to identify which metrics are related to the server side or the client side, meaning the browser, so we have dedicated categories: Server and Browser.
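As a quick aside, all of this client-side telemetry can also be queried with the same Kusto language from the Logs option of Application Insights. A minimal sketch, assuming the classic Application Insights schema, where page-view telemetry lands in a table called pageViews with the load time in milliseconds in the duration column (in a workspace-based resource the table is named AppPageViews instead):

    pageViews
    // client page views recorded over the last hour
    | where timestamp > ago(1h)
    // how many views and what average load time per page
    | summarize views = count(), avgLoadMs = avg(duration) by name
    | order by avgLoadMs desc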
Let's open the Browser category, and inside Browser Performance we get a nice summary over the selected global time range, which we can adjust; in addition, I have this adjustment bar to tune my investigation to a specific time range. The graph displays the page view count, session count and user count. Below, I can see the specific operation names, which are the multiple web pages that the users accessed; for each one of them the average duration and the count are displayed. In our case it seems the main web page loading time is a little bit longer than the other pages. On this side we have another graph called Distribution of Durations, where I can see how the web page loading times are distributed; it is a percentile graph. Below, under Insights, the system identified that this amount, a percentage of all the requests in this time frame, has some common properties — and this makes sense, as I created the traffic load from the same computer. Inside this group we will see the client operating system, client location, and so on, and I can also drill into the raw data samples. As I told you, this is an extensive, rich analysis tool. We can toggle between Server and Browser here on the right side, but in our case the Server option will be empty, as we are not collecting data from the server. In addition, our website may request services from other locations; this is the Dependencies tab over here. In our case, when a user opens the contact page — let's see it for a second — it presents a Google map, which is translated into a dynamic API request to the Google Maps service to get a map of a specific location. If I click on the raw sample, I can see the actual command: this is an HTTP GET request to the Google Maps service with specific parameters. There is also a dedicated option to analyze browser failures; it has the same look and feel as the Browser Performance view. We can also set up a synthetic test that will check whether our website is available. Let's create one: add a test, a URL ping test on my website, where the URL to test will be the public IP. It will run every five minutes from the following locations, the success criterion is to get an HTTP response code 200, and the alert will fire if three out of five locations fail within five minutes. Let's click Create. Now, under Details, click Refresh and we will be able to see the new test name below. I already created a similar test before, so we are able to see its results. This is basically a nice option to automate the website monitoring and get automatic notifications about near-real-time site availability as well as historic behavior. The last option I would like to show you is Workbooks. Workbooks are interactive reports that can combine metrics, analytics queries and parameters into very nice and rich reports. We can create a workbook template from scratch according to our needs, or use some of the out-of-the-box workbooks from the list below; you can even contribute a workbook template to the Azure community if you like. Let's open one of the available workbooks: Usage, and then Active Users. Okay, that's all about Application Insights. We managed to cover some basic ground about this very comprehensive tool; there are many more options available in it, and probably more features will be added in the future.
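One more query sketch before we wrap up: the results of these availability tests are also exposed as a table in the Logs option (availabilityResults in the classic schema), so the historic behavior can be summarized with a query like the one below. It is only a sketch; I group by the success column rather than computing a percentage to stay independent of the column's exact type:

    availabilityResults
    // availability test results from the last 24 hours
    | where timestamp > ago(24h)
    // how many runs per test, per location and per outcome
    | summarize runs = count() by name, location, success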
38. A Quick Recap: Hi and welcome back. Thanks for watching so far; we have completed this training level, level two. We covered multiple monitoring tools, and in some cases, I assume, it could be a little bit overwhelming with all the options we have. I hope it was interesting and that I managed to break it down into more manageable building blocks. It will be awesome if you managed to create the demo system and play a little bit with the Microsoft Azure monitoring tools. If you didn't and just watched the lectures, that's also great, and I still recommend that you try: go back to section three and see the needed steps to create such a system from scratch. To really understand something, it is better to try it by yourself, struggle a little bit with problems and get more familiar with the variety of options in Azure Monitor. At this point I would like to give a quick summary of the things we learned during this level and also to talk about the next steps moving forward. We started with a high-level overview of why, what and how, as it is always good to understand the big picture before diving into details. Why do we need to monitor a cloud-based solution? It is a really important business requirement for almost any application running in the cloud: problems will happen sooner or later, and it is very important to monitor the performance and health of our resources. Then, what exactly should we monitor? The few layers that are generating telemetry data. And lastly we talked about the how: how we are going to monitor our cloud-based solution. We saw the extensive list of telemetry data, meaning metrics and logs, that can be collected from different layers: the Azure infrastructure, our allocated cloud resources, the guest operating system and also the upper application layer. This telemetry data is collected and stored in Azure Monitor so we are able to analyze it. The collection of telemetry data is fully automated when we create our resources, but only for what can be reported from the outside; if we would like visibility into what's going on inside some resources, additional configuration is needed. We created a demo system so we would have something to monitor, used it as our playground with a variety of monitoring tools, and saw all the needed steps to set up and configure the collection of telemetry data across the different layers. To be able to monitor a system effectively, it is important to automate things as much as possible. We learned the concepts of alert rules and action groups: those alert rules are actually scheduled jobs running automatically on top of the collected data, used to identify performance degradation as well as all kinds of outstanding events and then to create real-time notification alerts. In addition, we combined action group templates to trigger all kinds of more advanced actions, like opening a trouble ticket, running a predefined runbook, triggering some API, and so on. Problems in a cloud solution can be related to a local or maybe global service issue in Azure, so our first step was to be able to draw a clear line and understand whether the problem we see is actually related to the Azure infrastructure; we used the Azure status page as well as the more focused Service Health tool inside Azure Monitor. In case one of our resources is experiencing something, we can monitor the availability of our resource by checking its resource status and health. In case we would like to trace what kinds of actions were performed on our resources, we can use the activity logs.
Every action performed by Azure or by end users is recorded. We used the Metrics Explorer tool inside Azure Monitor to analyze the metrics being collected across the different layers of our demo system; we can adjust the needed time frame and select the resource instance, the metric namespace, a specific metric or maybe a group of metrics. It is a useful tool to investigate and focus on performance issues. In some cases metrics may not be enough to identify a root cause, and it is important to combine logs as part of our analysis. We learned how to use Log Analytics and query log data using the Kusto Query Language; it is a very powerful and flexible tool. Logs are organized in tables and columns, and we can identify the relevant table and perform all kinds of analysis using the Kusto Query Language. Azure Monitor for VMs is a really nice option for consolidating information about our virtual machines in one place. It has three view tabs: Health, Performance and Map. The Health view breaks the virtual machine down into building blocks like the CPU, disk, memory and so on; the Performance view has out-of-the-box charts about the main metrics of a virtual machine; and the last one, Map, really helps us look inside the virtual machine — what kinds of processes are running inside and how those processes are communicating with other servers related to that particular virtual machine. We saw the Azure Advisor tool, which can provide us with all kinds of recommendations about our system so we can optimize it: optimize the availability, performance, security and cost of our specific cloud system. The last tool was Application Insights. We used it to analyze different metrics related to the user experience, data collected directly from the browsers: what the page loading time is, the number of users accessing the web server, and so on. This is basically visibility into the application layer. Using all the tools we covered so far, we can get a really good understanding of what's going on inside our resources as well as in the Azure services. The Becoming a Cloud Expert training program is a collection of a few courses divided into levels: level 1, 2, 3 and maybe more. We covered the second level in this program, and I will release new levels going forward. One important request from my side as a teacher: please spend two to three minutes and share your feedback inside the course dashboard; it is really important. Thanks again for watching and investing time in learning this topic of cloud computing. I would like to wish you good luck, and I hope to see you again.