Deep learning for object detection using Tensorflow 2 with Faster RCNN | Nour Islam Mokhtari | Skillshare

Deep learning for object detection using Tensorflow 2 with Faster RCNN

Nour Islam Mokhtari, Deep Learning Engineer

Lessons in This Class

34 Lessons (4h 59m)
    • 1. Promo (1:26)
    • 2. What is object detection for computer vision? (2:43)
    • 3. Object detection can be for multiple objects in the image (1:46)
    • 4. Why deep learning for object detection? (2:02)
    • 5. High level overview of Faster RCNN (11:49)
    • 6. How to install tensorflow with GPU support (part 1) (9:09)
    • 7. How to install tensorflow with GPU support (part 2) (11:14)
    • 8. How to install tensorflow 2 object detection API (7:07)
    • 9. Data preparation for object detection (4:12)
    • 10. The dataset that we will use to build an object detection model (9:38)
    • 11. Downloading and setting up our annotation tool: labelImg (11:24)
    • 12. Annotating the dataset (12:25)
    • 13. Transforming our xml files into one csv file (12:39)
    • 14. Creating a labelmap for our dataset (4:04)
    • 15. The tool that we will use to generate tfrecords (1:59)
    • 16. Generating tfrecords (7:09)
    • 17. Overview of the steps needed to build an object detector (2:02)
    • 18. Transfer learning (4:49)
    • 19. Downloading the pretrained model and getting its corresponding config file (8:38)
    • 20. Preparing your config file (10:33)
    • 21. Running the training and testing for experimentation (18:38)
    • 22. Running Tensorboard to analyse the development of the loss and precision (9:35)
    • 23. Settings for training and evaluating a Faster RCNN model on your local machine (4:11)
    • 24. What is cloud computing and what is AI-Platform? (5:48)
    • 25. Creating a Google Cloud account (5:28)
    • 26. Downloading Google Cloud SDK (5:50)
    • 27. Creating a google bucket and uploading data to it (8:32)
    • 28. Preparing our config file for training on google cloud (7:51)
    • 29. Running the training using Faster RCNN model (22:57)
    • 30. Running the evaluation during the training (8:16)
    • 31. Analyzing the results after the training of Faster RCNN model is finished (16:51)
    • 32. Possible things to do to improve our model performance (2:52)
    • 33. Downloading the trained model and exporting the frozen model from checkpoints (24:51)
    • 34. Running the frozen model on new examples locally (20:03)

46 Students · -- Projects

About This Class

This course is designed to make you proficient in training and evaluating deep learning-based object detection models. Specifically, you will learn about Faster RCNN.

For the Faster RCNN model, you will first learn how it is designed from a high-level perspective. This will help you build intuition about how it works.

After this, you will learn how to leverage the power of Tensorflow 2 to train and evaluate this model on your local machine.

Finally, you will learn how to leverage the power of cloud computing to improve your training process. For this last part, you will learn how to use Google Cloud AI Platform to train and evaluate your models on powerful GPUs offered by Google.

I designed this course to help you become proficient in training and evaluating object detection models. It does this in several ways, including:

  1. Helping you build the intuition needed to answer most questions about object detection with deep learning, a very common topic in interviews for computer vision and deep learning positions.

  2. Teaching you how to create your own models using your own custom dataset, which will enable you to build powerful AI solutions.

  3. Teaching you how to leverage Google Cloud AI Platform to push your model's performance by getting access to powerful GPUs.

Meet Your Teacher

Nour Islam Mokhtari

Deep Learning Engineer

Hello!

My name is Nour-Islam Mokhtari and I am a machine learning engineer with a focus on computer vision applications. I have 3 years of experience developing and maintaining deep learning pipelines. I have worked on several artificial intelligence projects, mostly focused on applying deep learning research to real-world industry problems. My goal on Skillshare is to help my students learn and acquire real-world, industry-focused experience. I aim to build courses that make your learning experience smooth and focused on the practical aspects of things!



Transcripts

1. Promo: Hello and welcome to this course. My name is Nour Islam Mokhtari and I will be your instructor for this course. I am a computer vision and deep learning engineer with three years of experience working on several object detection projects. Detecting objects in images is one of the most requested skills in the computer vision industry. Many companies use it to automate defect detection, others use it in autonomous driving, and many others use it in different scenarios. In this course, I will show you how to solve a real-world problem using object detection models such as Faster RCNN and SSD with TensorFlow 2. I'll show you how to build an object detection model that can detect whether people are wearing masks or not. I'll show you where to get the dataset from and how to prepare it for training on your local machine and also on Google Cloud Platform. This course is for anyone who is interested in learning how to use deep learning for object detection with TensorFlow 2. You will need a basic level in Python, so if you know what functions and classes are, then you should be good to go. You will also need a basic understanding of deep learning concepts such as convolutional neural networks. So if this sounds interesting to you, then please join me in this course, you will learn a lot. I hope to see you in class. Bye. 2. What is object detection for computer vision?: Hello. In this video, we will be talking about object detection in computer vision, and we will be defining what object detection is. Let's first start with a well-known task in the computer vision field, which is image classification. In image classification we have an image such as this one here, and we have a cat in this image. The goal is to tell whether there is a cat in the image or not. So when you develop an algorithm for image classification, your input would be the image and your output would be some sort of confidence score such as this one here. The confidence score is usually between 0 and 1, and it represents the probability of having a certain object, in our case a cat, in the image. What we usually do is use some sort of threshold, for example 0.5, and if our algorithm gives us a score that is equal to or bigger than 0.5, we say that we are confident that a cat exists in this image. If the score is lower than 0.5, then we say we are less confident and most likely there is no cat in the image. So this is image classification. In object detection, what we do is actually classification plus localization. So what we would get is something like this. In an object detection task for computer vision, what we want to do is first tell whether there is a cat in the image or not, and if there is, we want to be able to tell where that cat is inside the image. At the end, we would get some sort of output like this, where we have a confidence score for whether there is a cat in the image, and then we would have some coordinates. For example, we get the x and y coordinates of this point here, and then the width and the height of our bounding box. So this is called a bounding box. As you can see, in object detection what we have is basically image classification plus localization. And that's it for the definition of object detection. See you in the next video.
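To make the thresholding idea from the lesson above concrete, here is a minimal sketch in Python; the confidence values and the function name are purely illustrative, and 0.5 is the example threshold mentioned in the lesson.

    def contains_cat(confidence, threshold=0.5):
        # Return True when the classifier's score clears the threshold,
        # i.e. we are confident enough that a cat is present in the image.
        return confidence >= threshold

    print(contains_cat(0.87))  # True  -> we say a cat exists in the image
    print(contains_cat(0.23))  # False -> most likely there is no cat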
3. Object detection can be for multiple objects in the image: Object detection can also be used for detecting multiple objects in a single image. For example, if we look at this image here, we can see that we can detect dogs, bicycles and trucks, and we could even detect more things if we wanted to, such as trees or motorcycles. When we do this, the output would look something like this, where we have an output that contains confidence scores and then the coordinates of the bounding boxes. So here we have three detections, based on the detections inside the image. The first detection is for the dog, the second one for the bicycle and the third one for the truck. Each time we have a confidence score and then the coordinates of the bounding box. Usually this is the x and y of the top left corner of the bounding box, and then the width and height. Sometimes x and y could be the coordinates of the center of the bounding box; this is a small detail and you don't need to worry about it for now. So as you can see, we can generalize object detection to detect multiple objects in one image, and this is very useful in many industries, such as autonomous driving, where the car needs to detect signs, people and other cars. So this object detection task is widely used in the industry. 4. Why deep learning for object detection?: So deep learning tackles these issues by doing several things. The first distinctive thing is that we use neural networks for object detection; that is what we mean by deep learning for object detection, we are using deep neural networks. In this case, the neural network learns very distinguishing features, which improves accuracy. The second thing is that the computation is shared when we are doing the detection part, which improves speed. And finally, the neural network learns enough about the feature distributions to be able to generalize well on new, unseen images. So when you look at the state of the art in research and at how neural networks perform compared to the old traditional methods where we didn't use deep learning, you can see that there was a huge jump in accuracy and also in generalization. All of these points are very positive for deep learning compared to traditional image processing techniques, and this is why, if you look at the research now in the field of object detection, basically no one is using traditional image processing techniques anymore; everyone is using deep learning. It just changes from one neural network to the other: they change the architecture and the way they frame the problem, but at the end of the day they are mostly all doing object detection with neural networks. 5. High level overview of Faster RCNN: In this video, I will be showing you the structure, or the architecture, of the neural network called Faster RCNN. We will only do a high-level overview of the network and see how its mechanism actually works. In the article for Faster RCNN, they have this picture here, or this image, that shows the different steps the Faster RCNN neural network takes in order to make predictions at the end for classes and also for bounding boxes surrounding objects in those images. So if we look at the image here, what we can see is that we start with an input image, such as the one shown here, and what we do is pass it to convolutional layers here. Usually these are layers from a pre-trained neural network that was used mostly for classification.
So, for example, we would get, we would use convolutional layers from VGG network or inception network. And what we get at the end or as an outward feature maps. So these feature maps will then go through a neural network, a small neural network, cold region proposal network, or RPN for short. And the goal of this RPN neural network is to propose regions that this network thinks that objects will exist in it. So it might think that this surface here or this area here, will contain some objects. Although it will not tell us which object, it is. Just going to tell us that, for example, these three regions here, they have objects that are different than background. So they could be like dog, bicycle, cat, whatever, but they are just different than background. So this is the goal of the RPN neural network. And after that, what we have is a, another neural network. So all of these networks work together. So when we finish this part here we have finished the first stage. So as you remember, the faster our CNN, a neural network is a two stages based object detector. So when we finish this phase here we have finished the first stage. And what happens next is that we pass the output of the RPN Neural Network and the feature maps to another neural network. That is that has the main task of region or ROI pooling, ROI pooling region of interest. So what happens is that in this neural network, we get at the end classes and bounding boxes surrounding objects in the image. So just to recapitulate here, we have an image, we pass it to a set of convolutional layers. Usually they come from a pre-trained neural network such as C or inception, or mobile net or others. We get some feature maps here. The feature maps to the RPN neural network. The RPN neural network gives us some proposals where it thinks that some objects might exist in these areas. And then we pass those proposals and the feature maps to another neural network. This neural network is the one responsible for giving US classes out bounding boxes at the end. And just to go a little deeper, we will look at these steps one by one. So here in the faster our CNN neural network, we start by an image. So this image goes through convolutional layers. Usually we call it a backbone. So as I said, this could be a VGG 16 inception, v1, v2, v3, mobile NADH resonance. All of these used as backbone. But just to emphasize here, we actually use only the convolutional layers. And sometimes even the convolutional layers. We don't use all of them. We use some of them. We don't use fully connected layers in this part here. So for the VC Genome Network, we're only gonna take first part of it where there are only convolutional layers. So at the output of this backbone we get feature maps. And then what we do is that we assigned some anchor boxes to the image. So this is something that could be confusing in the beginning on why we use anchor boxes. But I hope that by the end of this video you'll understand why we use anchor boxes in faster our CNN. Anchor boxes are a set of boxes of coordinates that we predefined in the beginning. We assign them to some cells in the original image, and then we try to use them as input for the region proposal network. In order to actually use regression and make those bounding boxes converge to our ground truth bounding boxes. So I'll show you later an image that shows how ACR boxes usually are in the original image, but only at the end after I finish showing you how this diagram works. 
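To give a rough idea of what these anchor boxes are, here is a minimal sketch; it is not the TensorFlow Object Detection API's actual implementation, and the stride, sizes and aspect ratios are illustrative values. It simply places one box per (size, aspect ratio) pair at every stride-spaced position of the image, which is the layout described in this lesson.

    def generate_anchors(img_w, img_h, stride=16,
                         sizes=(128, 256, 512),
                         aspect_ratios=(0.5, 1.0, 2.0)):
        # One anchor per (size, aspect ratio) pair, centred on every
        # stride-spaced position: each anchor is (x_center, y_center, w, h).
        anchors = []
        for cy in range(stride // 2, img_h, stride):
            for cx in range(stride // 2, img_w, stride):
                for size in sizes:
                    for ratio in aspect_ratios:
                        w = size * (ratio ** 0.5)  # keep the area close to size**2
                        h = size / (ratio ** 0.5)  # while changing the box shape
                        anchors.append((cx, cy, w, h))
        return anchors

    # For a 600x600 image: 37 x 37 positions x 9 anchors per position = 12321 boxes.
    print(len(generate_anchors(600, 600)))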
So we start with the image, we pass it to the backbone, we get the feature maps. These anchor boxes are actually assigned to the image. So we assign them, we choose these anchor boxes. Then we pass those anchor boxes and the feature maps to our RPN neural network. And The output of the RPN neural network. We pass it to the ROI pooling network, and we pass the feature maps to the ROI pooling network. And then we have here the RC, CNN or region convolutional neural network that actually takes the output of the ROI pooling and gives us a class corresponding to each object. And also there is a regressor here, which gives us the bounding box at the output. So here we are saying refined bounding box. We are saying this because the goal of the anchor boxes is to actually start from a set of anchor boxes like this one here. So as I told you, imagine this is an image 600 by 600. So in this position here, almost three, twenty, three, twenty. So in this position here, we assign some anchor boxes here. So as you can see, we have different sizes and different aspect ratios. And this we will do for each pixel, or we do it for, for example, for one pixel, then we use a stride of two and we move, we do. After two pixels, we assign these anchor boxes, so on and so forth. So at the end you can imagine that for each cell here, we will have a set of anchor boxes. And the goal at the end is to make these anchor boxes converge to our ground truth bounding boxes. This is what we call quality regressor. And this is what we say refined bounding boxes. We are refining our our bounding boxes in order to make them fit or all converge to our ground truth bounding boxes. So just to recap, interpolate again, we have an image, we have anchor boxes, and here I have used two different colors. The blue means that we are actually passing through some, some layers. And the color orange here or brown, S is representing things. Inputs are the things that we said ourselves, or things that are outputs. So inputs, outputs or things that we set ourselves. So here the image is an input, feature, maps is the output. And here the anchor boxes are things that we set ourselves. And you can think of them as just a set or a list, or some coordinates that represent all of these different bounding boxes here. So this is on a high level how the faster our CNN works. And I hope that now it's more clear on how, how mechanism works. And later when I show you the one stage neural networks for object detection, you will see how they are different from this. So here it is little bit complicated structure here or architecture. And maybe the faster our CNN is one of the most complicated architectures for object detection. But with the new research and new approaches, they are actually, researchers are trying to make this much simpler. And the one stage object detectors, for example, have this purpose of making things straightforward because one of the things that one of the disadvantages, or faster, or CNN is that it's slow. It's usually a very accurate, but it's also slow because we have these two different neural networks that are working together. So this one proposes regions and then the other one gives us the classes and refines the bounding boxes. So this makes it a slow neural network. And on the other hand is very accurate. And I'm saying this also from experience. I have just said this. And I a tested other neural networks that, that are one-stage based neural networks. 
And I have realized that faster our CNN sometimes works much, much better than that. In fact, sometimes it works and the other neural networks want work. That's how good the faster our CNN. But the disadvantage is that it is slow. We will look into this in more detail a little bit later. 6. How to install tensorflow with GPU support (part 1): So the main tool that we will be using to train our deep learning models for object detection as the TensorFlow library and a library from Google for deep learning. And now they have the TensorFlow version two, which will be installed by default when you install TensorFlow. So for us, we will be training the models on our local machine at first. And for that, you, if you have a GPU enabled machines, if you have actually an NVIDIA GPU, then maybe you can use that GPU during the training, which accelerates the training integrate manner compared to if you do the training on your CPU. So in order to install TensorFlow for GPU had to TensorFlow.org slash install sludge GPU. And you will get to this webpage here where there are all the necessary information for you to install TensorFlow GPU. And we will not specific would not be specifying the version 1.15 because we will be installing the newest versions and not the old versions. So the first thing that you need to do is actually to verify whether your GPU can be used for training. So for this, you can go to use this link here. And let's try to see here and here. On this page you can check the machine that you have, whether it's enabled or not in terms of training using the GPU. So for me, for example, let's say, okay, don't have Tesla, but if you have a Tesla Machine, which she's very good machine for training dependent models. Then you can check here. Let's see quadro products here. If you have a quadro products, you can check here, this is the list. And you can go check all of these. If's your GPU or if you're NVIDIA GPU exists here, then you can use it for training deep learning models. Again, the same thing here. You can check this. For me. I actually have a g-force RT x 1050, the 10-50, ie. And at some point, in fact, it didn't exist on this list, but it's still had the GPU enabled for deep learning. I checked that online. For now. I don't know if it was added to the list or not. Let me just check quickly here. So let me just do a quick search. So 1050. So there's only this one mentioned here is the same thing as the first time I checked this list. So here there's only G-force GTX 1050, but in fact, mine as GTX 1050 TI, and it still has GPU enabled. So. If you have an NVIDIA GPU and you can't find it in any of these lists, then you can do a quick Google search to see if it, if you can use it for training deep learning models. Because sometimes it won't be in this list, but you will find that some people have used it and they can confirm to you that you can use it for deep learning. So this is the first thing you should do if you don't have an NVIDIA GPU, then in this case, you can't really train deep learning models on your GPU. You can still train them on your CPU is just that. It will be slower, in fact, much slower than when you use a GPU. So here, let me go back here. Once you check whether you have an NVIDIA GPU that you can use for training or not, you can come to due to the software requirements here. And as you can see, you have to or you need to have some things, install, some software installed on your machine in order to be able to use TensorFlow for GPU. 
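As a forward reference to the check done later in this part of the course: once the drivers and TensorFlow 2 are installed inside your environment, a quick way to verify that TensorFlow can actually see the GPU is the snippet below. The exact device list printed will differ from machine to machine.

    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    if gpus:
        print("TensorFlow can see these GPUs:", gpus)
    else:
        print("No GPU visible - training will fall back to the CPU.")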
So here the first thing that you should install as the NVIDIA GPU drivers. And as its shown here it says cuda 10.1 requires for 18 0.8x or higher drivers. So here if you go to this link here, and I have actually opened it already in here, and you can enter your configuration for your specific GPU. For me, it's 10-50, TI, G-force ten series. And the, and here I actually have a Linux 64 bit. So for me, in fact, I didn't install the drivers using this, this download here. In fact, what I have done is that I went to activities and if you type driver's, you can go to additional drivers. And here it's going to open a window. And in additional drivers, it should list the drivers for NVDA GPU that are available. So for me at first, it was not installed. I had in fact this choice here that was already chosen. For me. I chose the 450, which is proprietary and it's tested. So I chose this and I, once I click Apply Changes IDE installed it on my machine. So I went this way for your bond to, if you have you been to, I recommend that you do the same thing as I did. Don't do it, uh, by downloading the drivers and install them from Manually, let's say. But here you can just go to the additional drivers, shoes, the rights or the last driver that is from an videos and it's proprietary and tested. If you have, if you see on the lists some choices where they have open source drivers and they have proprietary drivers than I recommend that you choose the proprietary ones. For me, I don't have open-source drivers, but at some point I remember I had those choices as well. So if you have both of them, then choose proprietary drivers, which is recommended. So when you apply, the changes is going to be installed and you will have the drivers on your machine. Now, if you have a Windows machine, in that case, you can enter your configuration here. And for the operating system, you can go to your Windows version. Let's say you have Windows 1064 bits. And let's say you have the same configurations as these ones. Here. Can click Search. And here you get to this page where you can basically download the driver. And once you download the driver, you can install it like you install regular software on Windows. So for me I won't be going through this, but I want to show you at least how to download the driver for your Windows machine. And once you get to this page here, you can just click on Download. And it's gonna give you access to that download link. So you're going to download an executable and then you run it and just go next next and install. So for me, I will not install it. If you face problems with this, please let me know. But for me, I will just stay with my current configuration for Linux, for your B12. And I'll be closing this. So going back here, in this step here you can see that you should also install cuda toolkit and go cook Thai here, which comes basically it ships with could advocate. So you don't need to install it separately. And you also need to have q Dn and SDK 7.6. This last step here is optional, so we'll not consider it for now. And there is actually a much easier way to install all of these in one go. So that's what we will be doing in the next video. We will not be installing these things separately. We will actually create a virtual environment and install all of them at the same time with a few commands. So let's do that in the next video. 7. 
How to install tensorflow with GPU support (part 2): So as I said, for installing the rest of the software that's needed to run the object detection API and to run other scripts as well. We will create a virtual environment using Anaconda, and we will install these dependencies inside of that virtual environment. So for anaconda, actually, we will install the Conda Python package using pip. So Conda is, as it is mentioned here, is crossbow platform language agnostic binary, binary package manager. And it is the package manager used by Anaconda. So you can actually install Anaconda, which comes with a GUI that you can use and you can install things using that graphical user interface. But for our purpose, for the deep learning part that we will do, Conda will suffice. So we're just going to install the package here using pip. So for this, let's copy this. And for me it's already installed. So if I run this, it's going to tell me that requirement already satisfied for you is going to ask you if it can install some packages you answered with yes, and it's gonna install all of those packages. So let me clear this. So now I have Conda install. And just to mention something, if you are on Windows, you can go to the website. So dot-dot conduct for example, you can, Maybe I'll link this to this lecture. So installing on Windows, there's also a guide here on how to install on Windows. For Windows, you will need to download a certain installer. So you can either go with many conda. So many conda is basically a small terminal where you can run all your commands, the same commands that we will be running throughout this course. So you can either download this or you can download Anaconda installer, which has a graphical user interface. And you can install things actually by choosing them from a menu, so on and so forth. So depending on your choice, for the purposes of this course, I will be only using the terminal to run all conduct commands. So many Candace should be enough. So for you, you can just come here and choose, for example, for Python three and I do recommend you choose the ones for Python three. Avoid using the ones for Python two. And here, based on your machine, if you have 64 bit or 32 bits, you choose the right installer and then you can download it. So let me just stop this here and let me just close this. Come back here. Okay. So that's the way you can do to install it if you have a Windows machine. So for you been to a we will. We installed it using the pip, install a candle command, and now we have it and we can interact with Conda using the virtual or using the terminal. So now we need to create a new virtual environments and then we will install the rest of the software, the ones here, we will install them at the same time. So for this, let me just go back to the guide that I showed you, the one that my friend created, and I do recommend you come here as well. You look at his tutorial, it's an additional tutorial. We will not be using the same example that he used here. Use the famous example from a raccoon data set. We will actually use something very different. But for the installation process, it's very clear here. So we're going to use the same commands as he, he had or something similar. So let's first creates our environment by using this command here. So let me just copy it. Come here, paste it, and I want to change this. I want to use the 3.8 Python version. And here I want to change this to t f2. And I'm going to state here that is a GPU. 
That's how we'll be installing TF to GPU version. This will make it easier for me and for you as well. When you look at your virtual environments, you can immediately know which environment contains the installation for TensorFlow enabled GPU. And if you, for example, the side at some point that you want to use, TensorFlow for CPU, then you can create another environment and you can give it a meaningful name. For example, t f2 underscore CPU, which you know that you're going to install TensorFlow for CPU in that environment. So let me create this environment here. And then we click or choose yes here to install all of these. It could take some time to install everything. No, actually it was very quick. So now that we have this, we will activate our environments using this command shown here. So let's do Conda, activate t f2 underscore GPU. So now we have activated our virtual environment and you can always verify a few activated it or not buy looking whether the name of your environment is used at the start of the command line here. So now let me clear this, and now we will install the rest of the dependencies. And for that, what we will do is to install or to use this command here. So if you come to the guide, again, can see that you can either install TensorFlow GPU with a specific version here. But if you use this command, this means that You have already installed these pieces of software here by your own or on your own. But if you haven't installed them, which I do recommend if you just want something quick and you want something even isolated from your system. Because if you install these things by each time going to the website and downloading some installers and running them was going to happen is that they get to be installed on your system, so on specifically, they won't be isolated. And this is something that I try to avoid. So for me, I will choose the second type here, where I'll be installing, creating a virtual environment, or installing inside my virtual environment, which comes with a cuda Toolkit installed, and it has everything necessary to run TensorFlow for GPU. So now, in order to install all the necessary software that comes with TensorFlow for training on the GPU, we're gonna use this command here. So Conda install, in fact, I want specify the version here. I think it already comes with version 2.2 on Anaconda. So stay this and okay, the space state here, so calm, I saw that It's going to verify everything. So now it's asks us to allow it to install all of these packages here. And you can see that we have TensorFlow and TensorFlow GPU here, version 2.2. And let's click yes or choose Yes and click enter, and let's wait for it to install those packages. It might take some time here. Some packages are big size, for example, tensor flow. So just be patient and once this installation is finished, hobby back. So after a few minutes and once all the packages had been installed, now we can see that it has finished here and we can test our installation here. So let me just clear the terminal and then I'm going to run a Python console here and import TensorFlow. Lets just see what we get here. And what I would like to verify is one where the TensorFlow is installed correctly and it is. And the second thing I want to verify is whether we are actually using or we will be using TensorFlow for GPU. So in order to do that, there is a command from TensorFlow. So let me just Google it's here. So TensorFlow lists, devices. And let me just get here. And here. She's kinda happy this. 
So control C, control V. And then I want to list the devices. See. So here, now we have all the available devices for training. And if everything is installed correctly so that TensorFlow can use GPU. You can verify this ND devices available here. So if you see GPU such as this one here, so as you can see XLA GPU, GPU 0. This means that the version of TensorFlow that is installed in your virtual environment will actually use the GPU on your machine. If you only see devices listed as CPU, This means that you haven't configured correctly your GPU, maybe you didn't install the drivers correctly, or that you just don't have a GPU that you can use for training. Maybe you have a different GPU then NVIDIA GPUs. So for us, we see that everything is installed correctly on our virtual environment. And we, since we can see that we have GPU listed as a device here, this means that we can use it for training and we will be using it for training. 8. How to install tensorflow 2 object detection API: And now we will install object detection API for TensorFlow to. So this is a tool created by the TensorFlow team, so from Google. And it helps us to train our deep learning models easily, specifically SSD and faster our CNN, which we are meant to which they are mentioned in this course. So for that, we will just follow the guide shown here. So one of the main things to realize that there are two different types of installations. You can either build a darker image by using these commands here, or you can creative path python package installation basically. So what you would have is almost like a module for object detection installed on your system or inside your virtual environments. So for us, we will take the second approach here. And for that, we will start first of all by cloning the repository. So let me go here. Okay, this is, we don't need this anymore. Control. Okay. Let me clear this. And in fact, may be our our created in my I'm going to put it here. So for this, I'm going to call this, or I'm going to open it in a new terminal and I'm going to clone it here. So let's run this. Okay, I actually don't have Git installed here, so usually I use it, I use it from within a virtual environments. But here I don't have it, so I'm just going to install it. And let me get my password here. Should be installed quickly, hopefully. And once this is installed, we're gonna clone it. And it's going to be in the same folder where we have, sorry, the labeling tool that we used before. So I'm going to create it, or I'm going to add it to this folder here. So let's just wait for this to finish downloading. It's almost there. Okay, so let me clear this. Base it again. Sorry. I want to clone this. So let's clone it here. And we should have it inside this folder here. So it's gonna take a bit of time here because it is more or less a big API. So once the download is finished, I'll be back to show you the rest of the steps. So now that we have the repository cloned and it's downloaded to this folder called models. But I'll be doing is moving this folder to a new folder that I created called TensorFlow to Detection API. So I'm gonna put it inside. And now I have the necessary files inside this folder here, which is inside the TensorFlow to object detection API folder. So now what we're gonna do is continue with the rest of the steps here. And we will be doing this from our virtual environment that we created using conda. So here, if you remember, t f2 GPU is our Conda environment. And let me just seeding into my folder. 
Sir CD TensorFlow models. And in fact here I need to be inside the Research folder. So I'm going to CD to researchers. Well, let me clear this and let's finish the installation process by running these commands here. So protocol is now it's only these lasts two. So let me say this and paste it here. So basically here we're going to copy the setup.py file from T F2 to the current directory which has research. So let's run this command. And finally, we will run this last command here. But in fact, I faced problems while trying to use this command here because of this tag that was added. So this, this tag, I actually did not use it when I was using the tensor flow object detection API for TensorFlow one, but for TensorFlow today added it, but in fact, it caused some problems during the installation. So I decided to not use it. And I know that it doesn't really affect the part that we will be doing, which is training the models for object detection. So if you run just this command, Python, pip install dot or pip install setup.py install the object detection API. So let's run this. So this could take some time for the installation process and I'm going to be back once it's finished. So now that the installation has finished, let's run a simple test that shown here by running this command. So let me copy this and paste it. And let's see if the test is okay or not. So if the test is not okay, we're gonna see that gives us an error at the end. But if everything is okay, and we should get no errors here. And once this test is passed successfully, then we can actually get to the point where we can train our deep learning models using TensorFlow to object detection API. So let's wait a little bit. So now it has finished the test here and everything seems to be OK. And now our object detection API is ready to be used so that we can train deep learning models for detecting objects on images. 9. Data preparation for object detection: In order for us to train deep learning models for object detection, we first need to prepare our data set as the first step in our training pipeline. So for SSD and faster our CNN deep learning models, we will be using what is called TensorFlow to object detection API. So this is a tool provided by Google from the TensorFlow team, which allows us to train multiple deep learning models. And it allows us to configure things very easily. And we don't have to go at the low level of TensorFlow to understand how we train a deep learning model for object detection. So there was a first version called TensorFlow object detection API, but there is a new version for TensorFlow to. So if you don't know, in fact, the TensorFlow to is very different than TensorFlow. One, where they have used Cara's API to develop all the models or most of the models. And with the new API, they redefine these new models and now they have a version that's specifically made for TensorFlow to. So let's take a look at how our data preparation will be like. So first we will have a data set comprised of images. And I will show you where to get this data set. And what are we going to do is that we are going to annotate these datasets. And when we annotate it, we actually get different or multiple XML files that represent annotations. So each image in our data set will have one XML file corresponding to it that that has the annotations corresponding to that image. If you don't know what annotation, as, you will learn this very soon. But for now, just think of them as information that we add. 
Besides our images when we are training our deep learning model so that it can learn how to detect the position or the localization of objects in an image. So once we do the annotation using a tool that I will show you later, we are going to get a set of XML files that correspond to the images that we had in the beginning. After that, what we're gonna do is transform those XML files into one CSV file. So this one CSB file will, will comprise of all the annotations that exist in all the XML files. So at this step here, we will have the images as we had from the beginning. But now, instead of having multiple XML files, we will have one csv file that contains all the necessary information for the annotation part. And once we reach this step here, we will use a tool to transform this data into this CF records format, which is a format that is known in TensorFlow that helps accelerate the learning. Because dy format that this file will have is a binary format, which is very, very efficient in terms of reading and writing. And that's going to help the training speed to be much, much better. So for each step, we will introduce a certain tool that will help us to do that. And in the next videos, I will be showing you how to get those tools, how to install them. And we will go through these steps one by one. 10. The dataset that we will use to build an object detection model: In this new lecture of the course, we will look at how to download the data set that we'll be using to train an object detection model. So for this, I have chosen a data set that is very fitting to our current time, the pandemic time where we have many people wearing masks. So the data set would contain images that have people either wearing masks like these two and this one here. And other people who are not wearing masks, such as this one here. And the goal of our object detection model will be to be able to identify people's faces or heads in on images and tell us whether they are wearing masks or not. So for this, I am downloading a data set that exists on the robo Flow website. I will be sharing the link with you, linked to this lecture. And you can just go to the link and download the datasets. I believe you have to subscribe to the website and to enter your email address. So apart from this, you shouldn't have any problems and downloaded the data set should be for free. So in this data set they have 149 images. And each image you have people who are wearing masks and other people who aren't wearing masks. So for me, I am downloading the row data set here. This one is a data set of padded images. So let's take a look. Maybe we can see some images here. So there are loading, as you can see, this this datasets, the padded data set, actually has some padding, so some black we'll look here in the size of the image. And we will not be using this for our object detection model. We will be actually using the row data set here. So when you click the row data set here, you will get to this or you will go to this part of the webpage. And here we will download a specific version of the datasets. So as you can see here, there's the cocoa JSON creates an image JSON. There all these types of datasets. And for us, we will be downloading the TensorFlow object detection CSV datasets. And we will be downloading this because we will be using TensorFlow object detection API. So we will need a data set that is appropriate for that tool. 
And in fact, you can even download this sense of UTF Record, which is the final form of the datasets that the models will, will read during the training. But for our purposes, for learning purposes, we will be downloading this data set here. And I will show you why in a bit. So if you click this here, is going to tell you to choose, okay, the format we have already chosen this download zip to computer or show download code. We will just click on Download ZIP to computer. Click continue. And here if you're not logged in, it's going to ask you to login so you can sign in and just after this, you will be able to download the data set. So for me, I will not be doing this. I already have datasets downloaded, so let me just check here. So here, if you don't know, if you download the data sets, it will be zipped and when you unzip it, you will get a folder such as this one. And when you go inside this folder, you will have all of these folders and these two text files here. So if we open the training folder, for example, what you're gonna notice is that they have a bunch of images that's open one for example. And here you can see that as an image of people wearing masks such as all these ones. And some people are now wearing masks such as these three ladies here. And apart from the images, you should have a file here, a csv file that contains the annotations. So if we open this file to take a look at it, let's open it in. It's in the library office. If you have Excel, you can open it on Excel. And when you open the file, you will see that what you have is a table that contains D annotations of the datasets. So what you would have here, for example, is this image here. This file name corresponds to the image name and the datasets. So in this folder, and then what you have is the width and height of the image. Then you will have the class. So for this specific image, you can see that there are five different annotations. What this means is that we have five different people in this specific image. And all of them are wearing masks. And here what we have are the coordinates of those people's faces or people's heads. That's, that can, that are wearing masks. And in fact, what we have here is the pre annotated data sets. So usually when you are trying to build a deep learning model, you in fact don't have all these annotations. This is something that you need to do yourself. And I will be showing you how to do this. For now. We will keep this annotations file in here and we will be using it a bit later. But I will be showing you the steps to do in order to get this annotations that CSV file when you don't have it. Because usually when you want to build an object detection model, you just start with a bunch of images such as this one here. And what you would need to do is to annotate them by yourself or you hire someone to do that for you. And what the annotation process is is basically to go through the images and choose or annotated by selecting the head or the object that you want to annotate. So in our case it would be the head. So we would be putting a rectangle here and assigning a class to debt rectangles. So for this specific image, this lady here is wearing a mask. So we will be adding the class mask to our annotation. All of this will be much clearer when we do this by ourselves in a future video. But this is what I wanted to show you. So here for this specific data set, we already have the annotations, which are going to help us a lot. 
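As a small aside, one convenient way to sanity-check the downloaded annotations is to load the CSV file with pandas. The file name and column names below follow the TensorFlow Object Detection CSV layout described above; adjust the path to wherever your own download lives.

    import pandas as pd

    annotations = pd.read_csv("train/_annotations.csv")   # adjust to your download
    print(annotations.columns.tolist())
    # expected columns: filename, width, height, class, xmin, ymin, xmax, ymax
    print(annotations["class"].value_counts())             # boxes per class (mask / no mask)
    print(annotations.groupby("filename").size().head())   # boxes per image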
And for me what I would like to show you is how to annotate the data set, show you some examples and later we will actually use these annotations. We will not annotate all the images because it will take a long time and it's really not a learning experience. So for me, I want to show you the tools to annotate the data set, but then we will come back to our data set and our annotations, and we will use them as they are in here. And one other note about the datasets. So as you can see, a split 23 folders. There is a training folder, there's a validation folder, and there is a testing folder. So usually in deep learning projects or machine learning projects in general, what you have is a certain split of the data set where you have a folder that contains data that will be used during training. And what you would have also as either one other folder or two folders. So here we have two. So we can use this validation folder to validates our our model during the training as well. And once the training is done, we can test our model on this test data set in here. So apart from this, this is exactly what we will be using for our training. And see you in the next video where I'll be showing you more about the annotation process. 11. Downloading and setting up our annotation tool : labelImg: As I have mentioned before, even though we have our data set annotated, we actually will do some annotations ourselves just to see how the whole anti-ship process works. So for that, we'll be using this tool called label image. And basically allows us to do our to shown in this image here where we can go through the parts or the objects that we want to detect. We can assign a bounding box around the object. And what we do after that is assigned a class to that object. So as you can see here, we're annotating these three persons here, and we're choosing the class person for each one of them. And to download or to have access to this tool, what you would need is to follow these steps here mentioned. So there, there are steps for you bumped to Linux for Python two and Python three is also, there are also steps for MacOS and finally for Windows. So for my case, I am using you been, so for that, I'll be using this specific guide here, but I will be changing some things. I will tell you later why I changed some of these things, specifically this second line here. So the first thing that you need to do is to clone the the code. So we're just gonna download a zip of the code. And here I'm just gonna put it in. I have already done this before, but we'll do it again together. So here I already have, as you can see, a folder called label image. But for us, let's download it to this folder here. And let's go here and unzip the folder. So Extract Here. Let me delete this. And for this, I'm just gonna change this to copy, so label image copy just so that I can distinguish it from my original folder here. And here what I will be doing as let me open a new terminal window. And here I'm gonna see D2. This. So just to avoid copying the whole or CD into each folder one-by-one here. I just drag this disk file here and then change the folder. So let me clear this. So what I have now is the same thing as I have in this folder here. And let's go back to the guide and see what we should do next. So for us we are using Ubuntu Linux, Python three, which is the recommended way. And here the first thing you need to do is installing these Biku T5 devtools. So let me copy this. And here let's face it. 
In fact, you don't need to CD to this folder in order to run this command, but I'm just doing this so that I know that I am in the right folder later when I'm trying to, Well, when I will reach the this fourth or fifth steps here. And for this, let me just run this, trim my password. And as you can see, I already have this installed. So it's already the newest version. Okay. I have already installed on my system, so no need to install it again. And when we come to the second step here, when I ran this command here, I actually faced a problem. And when I went to the issues here, I actually found someone who faced the same problem as me, so I have the same problem here. So installation issue when trying to run the command that I just showed you, and I have the exact same errors here. So for that, I think the problem is coming from the versions that are mentioned in the requirements that TXT file here. So as you can see, there are specific versions of pi q, T5, and Alex and L. So for me what I have done to solve this problem, which is what I have mentioned here in this issue. This is actually my account. So in order to fix it, I just installed by q, T5 and XML using Pip and without specifying the specific version. And to do this, actually, I don't want to install these things on my system. So what I usually do is create a virtual environment and install everything necessary to run this Python script inside that virtual environment. So for this, I am using virtual nth. So it's a tool that allows you to create virtual environments very easily. As you can see when I type virtual and it shows me all the different commands that I can use for you. Maybe it won't be installed. So if it's not installed, it will give you a command on how to install it. And you can just run that command. So it should be something like pseudo APT install virtual and, or something similar, but it's going to actually give you that command if you don't have virtual installed on your system. So let me clear this first. And to create a new virtual environment while we need to do is to run. So virtual EMF. And we need to give a name to our virtual environment. And you should know that when you create a virtual environment using visual EMF, that environment or will be in some sort of a folder that contains all the necessary things to, to, to, to allow you to run that those scripts in the virtual environment and that folder would be actually created inside this folder here. So let me just first do an ls here. As you can see, we don't have any virtual environment. I haven't created anything. Then we run the the command he used virtual. And I'm gonna call it VM. And when I run this, now, I should have a new folder called VM, as you can see as shown here. So inside this VM, I have all the necessary things for my virtual environment to be run. And in order for me to activate my environment, I need to run source. And basically run a script that exists in Vienna bin and it's called activates. So when I do this, as you can see now, I have this name of my virtual environment at the start of my command here, command line here. So let me just clear this and now let's start installing the necessary things to run. Our virtual or our script here. So let's go back and just see, okay, we have, we need to install things that are in the requirements. And I have, as I said before, I face some problems. And in order to solve them, I actually installed by q, T5 and Pip and sorry, and L XML using pip. 
So let me go back here just so that I can enter to the requirements.txt and just take these names of the packages. So let me control see this. And I'm gonna run Pip, Pip install by q T5 and also Alex CML. And let's run this and see what we get. So here is collecting everything and it should install all the necessary packages. And I'll be back when it's already installed in fact. So it installed by q t five and also installed the XML. Let me just verify here. So yeah, the requirements is already satisfied. So now that we have these two Python packages installed, let's go back to the steps here and look at the next commands. So here we need to run this command. And in order to run this command, in fact, you now need to be inside the folder that contains those files. So here we are, inside our folder that contains all the necessary things. Here. And here we're going to run make Q T5. So the command was run successfully and now we can actually run Python label image that by. So here just to mention a quick thing here, as you can see, the rant Python three, I am only using Python because I only have Python three installed on my system. So when I run, I'm going to run Python. The mineral needs it alone. You can see that I have 3.8 installed. If I had two versions of Python installed on my system. If I had Python two and Python three, then in that case, I would need to use three here in order to tell my system that I want to use Python three for, for doing this command. But for me I only have Python three. So just using Python label image, that should be enough. And when we run this command, we get access to our labeling tool here. And as you can see, as exactly as we've seen in that image on the GitHub repository. And here you can do so many things. Like, for example, you can open one image here, or you can open a directory here. And we'll look more into this in the next video. 12. Annotating the dataset: So again, in this interface here you can see all of these buttons here that we will be using. So this one for opening an image, opening directory, going, going back and forth between images. And this is for creating a rectangle, duplicating it, delete it, so on and so forth. So let's just start by opening a directory. And for this I'm gonna go to the datasets that I downloaded. So it's this one mask wearing V4 row tensor flow. And for me I have changed the name here before. These, these underscores. Actually, instead of them, we had some dots and I remove them. This is to me much more clear when trying to read the folder. And here if we go and we choose this folder, if you click open, you get a list of all of your images here. And with this, you can actually go and annotate the datasets by choosing, for example, create a rectangle. And when you have a rectangle here, you can enter the name of your class. So for us, maybe we'll write mask. And as you can see, there are already some predefined classes in here. And these are coming from the folder that let me just check here. Or the file cold, maybe classes, maybe data, yeah, predefined classes. So this folder actually contains these predefined classes. You can remove them from here and keep it empty in order to avoid having them whenever you trying to undertake a new object here. So this is just the, an example, but for us who will not be annotating all the datasets, as I have mentioned before. So what I'm going to do here is that I will go to my data. Leave it was here. So I'm gonna go to the data set again. 
We're going to go to this folder, the training folder. I will take a few images and put them in a separate folder; I'm just going to copy them, not move them. After I do this... it seems to be stuck because of this one, so let me just stop it for now, stop it from here. Okay, I hope that didn't cause a problem. So now let's go back. What I'll be doing is taking maybe just five images; let me check that they don't contain many people, since this is just to show you how you can annotate the data. We're not going to take this image: as you can see, there are a lot of people in it. What you would usually do is go through each person and annotate them, marking whether that person is wearing a mask or not. So we're not going to use that image for this demonstration. We will use this one, this one as well, this and this. Okay, let's just take these ones here. I'm going to copy them, go back here, create a new folder, call it for example dummy, go inside, and paste the images there. Then I'll go back to my terminal, clear it, and run the app again. I will open a directory; for this I go back to the dataset, mask, and choose the dummy folder. Here I only have four images, and we will go through them and annotate them quickly. In order to annotate the images... okay, I think I haven't actually removed those predefined classes yet, so let me stop this and remove them before we move on to the annotation part. Let me go back to the labelImg folder and, inside data, the predefined classes file; I remove these entries, save it, go back, run the command again, open the directory, choose that dummy directory, and okay, we have the four images. For annotating, I'm going to click on create RectBox here, and I will just click and then drag my mouse to enclose the whole head. Since this person is wearing a mask, I'm going to call this mask. I won't be annotating the baby, because we don't see the baby's whole face; we're just going to annotate this person, and the other people we can't see clearly, so we won't annotate them either. Here you can just click on save as, it shows that it's going to save to the same folder, and you click save. Let's see what we actually get. As you can see, we get an XML file that has the same name as the image; the only difference is the extension. If I open this... I'm not sure I have a good way to open it, so I'm just going to open it in Chrome. What you see is that the XML file contains the file name, the path to the file, we have... okay, no name for the database, that's fine. Then we have the width, height, and depth, which represent the dimensions of the image: the depth is the number of channels, RGB, red, green, blue. And then we have the bounding box coordinates and the name of the class that corresponds to that object. As you remember, we chose mask, and here we have the pixel coordinates that represent the bounding box. If we go back to our tool, the xmin, ymin values correspond to this pixel here, and xmax, ymax correspond to this pixel here. Let's go back. So this is how you get an annotation for a specific image. Let me go to the next image: you click here, next image.
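For reference, one of these annotation files looks roughly like the following. This is an abbreviated sketch in the Pascal VOC style that labelImg writes; the file name and pixel values are made up:

```xml
<annotation>
    <folder>dummy</folder>
    <filename>example_image.jpg</filename>
    <path>/path/to/dummy/example_image.jpg</path>
    <size>
        <width>416</width>
        <height>416</height>
        <depth>3</depth>   <!-- number of channels: RGB -->
    </size>
    <object>
        <name>mask</name>   <!-- the class typed into the tool -->
        <bndbox>
            <xmin>120</xmin>  <!-- top-left corner of the box -->
            <ymin>45</ymin>
            <xmax>210</xmax>  <!-- bottom-right corner of the box -->
            <ymax>150</ymax>
        </bndbox>
    </object>
</annotation>
```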
Let's annotate this one. There are actually some keyboard shortcuts you can use instead of clicking this button every time: for example, for creating a RectBox you can just press W. If I press W, I get the same thing, and now I can annotate. So this person is wearing a mask, and this person is wearing a mask. Since this person is not wearing a mask, I'm going to write no-mask, and the same for this person: press W again, this one is no-mask. Okay, so now I have the four annotations. Let's say I made a mistake on this face: I can just click it and remove it, it gets removed from my annotations, and I can annotate it again. So here, let me just click here, and it's no-mask. Now that I have finished the annotation for this image, I can also save it using Ctrl+S, so I don't have to go to that button each time; pressing Ctrl+S gives the same result. I save it, and if I go to my folder I see again that this image has a corresponding annotation file, an XML file. If I open it, it's the same structure: we have the image, the path to that image, the size of that image, and here, as you can see, we have four different annotations: two of them with mask and two with no-mask, and each one has the coordinates of the points representing its bounding box. Again, xmin, ymin are the coordinates of this pixel, and xmax, ymax are the coordinates of this pixel. So this is how you annotate the dataset, and you can go through the rest of the images and do the same thing. Let's do that. As you can see, the two classes are now saved, so you don't have to type them every time; you only do it once. Let me annotate the rest of them: Ctrl+S to save, save, go to the next image, W, annotate this one. This person is wearing a mask, the same for this person, and this one, and this one here again. And we have these two people here whose faces are clearer than the rest; these two are behind and we can't see them very well, so we're not going to annotate them. Ctrl+S again, save, go back, and we now have the whole dataset, let's say this dummy dataset, annotated, with an XML file corresponding to each image. Now that we have this, we need to transform our dataset into a form that is appropriate for the tool we will use to train an object detection model. In fact, we can't use these XML files directly as input to our pipeline; we need to transform them into what is called TFRecords. This is a specific format that TensorFlow accepts when training an object detection model with the Object Detection API, and we will look at how to do this in the next videos. 13. Transforming our xml files into one csv file: So previously we annotated our images and got those XML files, and now we will transform those XML files into one CSV file that combines all the annotations from all of them. For this, we will be using a script from an open source repository; I will link it to the lecture. What I have done is take this script and paste it.
What I will be using is this specific function here, generate_csv_file. What I have done is make some minor changes. As you can see, there is the main function, xml_to_csv, and in the original they run it on the training set and the validation set: they call it with the different paths for training and validation and run the two calls one after the other. What I did instead is create one function that takes the path to the images and the path to the output CSV file, so that once the function has executed, we have a new CSV file that combines all the annotations from the XML files. I will probably only link this xml_to_csv Python script, but just so you know, I took it from this repository, so if you want to go to the source and check it, you can follow this link. Let's go back to the script. Basically, we go through all the XML files in our path, and for each one we construct a tree. A tree is just a representation of the elements in an XML file. If we go here and open one XML file, this one for example, you can see that XML files are represented as a hierarchy, very similar to HTML files if you have worked with those. You have this annotation element, which is the outermost element; inside the annotation you have folder, filename, path, then source, and inside the source we have database; then size, and inside size you have width, height, depth. So you can imagine a tree representing this data in a way that is easily accessible from a program, in our case from Python. That's all we're doing: we create the tree, get the root of the tree, then go through the root and look for specific elements: object, filename, size. We create a tuple containing all of these elements and append it to a list. Then comes the part where we write the CSV file: first we create a row with the titles of each column, so filename, width, height, class, xmin and so on. Then we take that list and build a DataFrame from it, passing the column names and our list. In generate_csv_file, we first call that previous function, which takes the path to the images and gives us a DataFrame, and then we write that DataFrame to CSV, giving it the path where the CSV file should be written. So at the end, if we just call generate_csv_file, it takes the path to the images and the path to the CSV file, and it writes our final CSV file to the path I have defined. Now, before I run this script, let me verify a few things: let me go back to the part where I have the dataset; I'm not sure if I changed the path, so let me just take one image here.
Now I would like to take this path, up to here, and copy it, just to make sure that I am using the right path, and paste it in. Okay, so it's basically the same thing. What I have defined here is dummy.csv; you can call it whatever you like, maybe annotations.csv is more representative of what the file is. Now that we have this, let me clear this; we need to run our script, so let's save it. In order to run the script, it has some dependencies, as you can see: glob, os, pandas. I'm not sure they are all installed on my system. You can either install the dependencies system-wide, or create a virtual environment for them, or you can simply reuse the virtual environment that we created before for the annotation tool, labelImg. I don't need the annotation tool anymore, so I'm going to close it. Let me clear this and cd to my xml_to_csv folder; I think I called it xml_to_csv. Let me clear this and do an ls just to verify: we have the virtual environment and we have the xml_to_csv file. So I'm just going to run python xml_to_csv.py. The goal is to go from these four XML files to a new CSV file, which should be written inside this folder. Let me run it. If we get errors about missing dependencies, we can just install them the same way, with pip install pandas, which installs quickly. That's usually how I do things: if I run a script and don't know whether I have all the dependencies, I just run it a few times, and each time it tells me what is missing. In this case the only missing piece was the pandas package. So let's go back and verify. As you can see, we have our new annotations file written here; it's a single file. If we open it with LibreOffice, you can see that our final CSV file has the same format as the annotation file that came with the dataset when we downloaded it. Here we have one, two, three, four different images, because these rows all correspond to the same image: this group corresponds to one image, this one to another, and these are all for the same image; the only thing changing between rows of the same image is the annotation. This image has six annotations, so six bounding boxes; this one has only one; this one has four, and this one has one. So as you can see, we have combined all the XML files into one CSV file that contains all the information needed for the annotation part. Let's go back and verify something: the dummy dataset is here; let's open the annotations file of the big training set. As you can see, it has the same format: filename, width, height, class, xmin, ymin, xmax, ymax. So this is how you get to this specific stage of preparing your dataset. As you remember, when we downloaded the dataset, these CSV files were already included, but now you know how they got to this CSV file.
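To make the conversion logic concrete, here is a minimal sketch of it in code. This is not the exact open source script used in the video, just a condensed version of the same idea, and the folder and file names in the example call are placeholders:

```python
import glob
import os
import xml.etree.ElementTree as ET

import pandas as pd


def xml_to_dataframe(images_path):
    """Collect one row per bounding box from every XML file in images_path."""
    rows = []
    for xml_file in glob.glob(os.path.join(images_path, "*.xml")):
        root = ET.parse(xml_file).getroot()          # the <annotation> element
        filename = root.find("filename").text
        width = int(root.find("size/width").text)
        height = int(root.find("size/height").text)
        for obj in root.findall("object"):            # one <object> per bounding box
            box = obj.find("bndbox")
            rows.append((filename, width, height, obj.find("name").text,
                         int(box.find("xmin").text), int(box.find("ymin").text),
                         int(box.find("xmax").text), int(box.find("ymax").text)))
    columns = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]
    return pd.DataFrame(rows, columns=columns)


def generate_csv_file(images_path, output_csv_path):
    """Write all annotations found in images_path into one CSV file."""
    xml_to_dataframe(images_path).to_csv(output_csv_path, index=False)


if __name__ == "__main__":
    generate_csv_file("dummy", "dummy/annotations.csv")  # example paths
```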
So we started with the dummy dataset just to go quickly through how you get from images to images plus a CSV file. Now that we have seen this, we can go back a step: from now on we will be using the original dataset, these three folders, and their CSV files instead of that dummy CSV file. The whole point of the dummy folder was just to show you how they got to that specific step of having images and a CSV file containing the annotations. 14. Creating a labelmap for our dataset: Another important thing that we need when preparing our dataset, or when creating the final TFRecords, is what is called a label map. A label map is a file that looks like this, where we define different items; each item represents one class, one type of object in our images. As you remember, in our annotations we labeled the faces as either mask or no-mask, so we have two classes in our dataset. This means we need to create a label map file that assigns an ID to each class name: one ID for mask and one ID for no-mask. Let me just copy this from this open source repository; you can see the URL here, I basically typed label map, went to the first repository I found, and took this part. I'll paste it into a new file: ID 1 I'm going to give to mask, and ID 2 to no-mask. Before we save this, let's go to our dataset and verify the annotations. In our annotations, as you can see, we have the class mask and the class no-mask, and in fact I made a mistake here: when they labeled their images in the original dataset we downloaded, they actually wrote no-mask with a dash, not an underscore. So I have to change this as well; it should not be an underscore, it should be a dash. Now the label map corresponds to the dataset we have, and it is the same for all of our folders, test, training, and validation: they all use the same class names, mask and no-mask. Now that we have this, let me save it with Ctrl+S inside this folder, and I'm going to call it label_map.pbtxt. I'm using this extension because that's the convention used by the TensorFlow Object Detection API, so I'll keep it; this file extension will be used for our label map. Let me save it, and as you can see, the file is right here. What I will do next is use a new tool, a new Python script, that takes the label map, the CSV files, and the images, and combines them into one TFRecord file that we will use later for training. 15. The tool that we will use to generate tfrecords: The tool we will use to transform our dataset into the TFRecord format is a script called generate_tfrecords. I took this script from this repository; in fact, the repository belongs to a friend of mine. We studied together at some point, and we also worked together at some point.
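For reference, the label_map.pbtxt we just created contains only these two items, in the plain-text format the Object Detection API expects:

```
item {
  id: 1
  name: "mask"
}
item {
  id: 2
  name: "no-mask"
}
```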
He created this very useful script that we can use to generate the TFRecords. Again, TFRecord is a format that the TensorFlow 2 Object Detection API accepts, and it makes it easy to train our models. What I have done is take this script and paste it into a new script on my local machine, and now it's very easy to use this generate_tfrecords file to transform our dataset. In order to run the script, as you can see, it has many dependencies, including TensorFlow, PIL, and others. For it to work we need some tools installed in a virtual environment, and since these are the same tools we will be using to train the object detection models, we will use a dedicated environment to run this specific tool and, later, the training and evaluation. 16. Generating tfrecords: In fact, instead of creating a new virtual environment, we will just use the environment that we created before, when we installed TensorFlow with GPU support and the Object Detection API. For that, I just opened my script here, opened the terminal window inside VS Code, and I'll activate the virtual environment and run the script from there, just so that I can see all the arguments that are mentioned. So I'm going to run conda activate; our environment is the TF2 GPU environment, and as you can see it is now activated. Let's clear this. What we should do now is run our script, since I am already inside the folder that contains it. The script takes four different arguments. The first argument is the path to the images, and for that I'll just drag and drop the folder path here; let's start with the train folder. Then we need the path to the annotations, so the second argument is the path to the CSV file; let me go here, look for the annotations, and now I have the path to my CSV. Just to reiterate: we are no longer using the dummy folder where we created annotations ourselves. That was for demonstration purposes, to show you how you annotate your dataset and get to the point of having an annotations CSV file. Now we are using the real data that we downloaded, which is already annotated for us; that's why we're using the path to this folder and the path to this CSV file. The third argument is the path to the label map; as you remember, we created our label map here, so I'm just going to drag it and drop it here. And finally, we need the path where the TFRecords will be saved; I want to save them in the same folder, just so that I can keep things in order. Now we should have all the arguments necessary to run the script, so let me run this and see what we get. Okay, we actually have a problem here; let me check.
It should successfully have created the CF record file, and now we can see it here. So this file here actually contains all the images and the annotations. And it has put them in a binary format that is easily to read by the model during the training. So what we're gonna do next as to do the same thing for test folder and validation folder so that we get three different TF records that we can use for training or testing or validation. So I'll be doing this quickly now. So let's clear this. Let's just create or just go through the arguments again. So for this, I'm gonna go to val validation. So this is food safety or to save the TF records, the label map should stay the same. The annotations. Let's check this. So validation that too on notation should be that. And finally, we have the path to images. This also should be vel validation. So let me just check the name of the folder gas so they don't make a mistake. And infection be valid. Sorry, above this, it should be valid. And okay, the name of the csv file is the same. So valid. And here again, valid. The label maps should stay the same and the rest here it should be valid. So let me run this. And it should have created it. So let's check, okay, we have it here. And finally, we will do the same thing for our test folder here. So everything that's valid to change to test here. So the test folder for the TF records the label map again should stay the same annotations. This one should be tests. And finally, the path to the image here should go or should point to test. And let's run this again. Let's check our folder and now we have the TF record for the testing folder as well. So now we have all the data put in the right formats that we need for training and also for testing and validation. And the only thing that is left now is to configure our training, choosing our model, and then running the training. So we will look at this in a future video. 17. Overview of the steps needed to build an object detector: And now we will look at the steps needed to build an object detector using TensorFlow to object detection API. So in order for us to train and evaluate a object detection model for whatever tests that we want to do, we have to follow these steps. So first we need to choose a neural network architecture. So this could be SSD fester or CNN or others. Then we need to download a pre-trained neural network corresponding to our choice. In step one, we choose SSD, then we will go and download a pre-trained neural network. So pre-trained SSD network from a specific place that I will show you. After that, we will download or we will look for the configuration pile corresponding to our pre-trained neural network in step two. So if we choose SST, we download the pre-trained model and also a configuration file. And after that, we will configure our training using the configuration file. So this configuration file will, will have multiple parameters that we need in order to create our training job. You can call it. For example, we need to specify the path to training data. We need to specify the number of classes that we have in our data set and many other parameters as well. So in the configuration pile we will be doing this. And finally, we will run the training and evaluation using the scripts that are part of the tensor flow to object detection API. And for this also, I'll be showing you which scripts to run and also which parameters to pass to the scripts so that we can run the training and evaluation. 18. 
18. Transfer learning: And now let's talk about transfer learning, because we will be using it for our object detection task. What do we mean by transfer learning? Usually what we have is a neural network such as this one; this is the SSD architecture with its layers. The process is to give it images at the input, and we want to predict the classes of the objects in each image: what kind of objects they are, and also where they are localized inside the image. If you have a large dataset, and by large I mean on the order of hundreds of thousands, ideally millions of images, you can start from the known network architecture and train it from scratch, from zero, let's call it that: you only define the layers, they are initialized randomly, and during training they learn to extract the information needed to make good predictions at the end. But in most industrial contexts we don't have datasets that large; we usually have very small datasets, such as ours, which is less than 200 images. Even if you have a dataset on the order of tens of thousands of images, it is still relatively small compared to the very large public datasets that researchers use to train these kinds of neural networks. So in our case, we will start from a pre-trained neural network, which is why we will be downloading one, as I mentioned before in the steps for training an object detection model. When we do this, we have two different options in the TensorFlow 2 Object Detection API. The first option is to use the pre-trained weights only for the classification part: the remaining layers are initialized randomly, while this part is initialized with the weights learned when the network was trained on some classification task. The second option is to start from a pre-trained network for all the layers: every layer is initialized with weights learned when the network was trained on a large dataset. These are the two options, and for our case we will choose the one where all the layers are initialized from the pre-trained model, because we have a small dataset. In fact, if you had a dataset on the order of tens of thousands of images, it might be better to use only the pre-trained classification layers, keep the rest randomly initialized, and then train everything on your data. But since our dataset is very small, we will keep all the pre-trained layers and start from them as our initial state. Remember this, because when we fill in our configuration file we will have the choice of starting from the classification layers as pre-trained or from all the layers as pre-trained; the latter are usually called the detection layers. We will see this option in the configuration file when we prepare it for training; just remember these two options and why we will be choosing option two instead of option one.
19. Downloading the pretrained model and getting its corresponding config file: In this video, I'll show you how to structure your folders and prepare your data for the training. There are several steps, and the first is to create a set of folders where everything will be stored: training artifacts and other things as well. What I usually like to do is create a folder like this one, which I called object_detection, and inside it add several experiments. An experiment represents a certain setup used for training: for example, one experiment could use a Faster R-CNN network, another could use an SSD-based network. So I like to structure my folders like this, and I start by creating an experiment. Here we have experiment 1, and inside that folder there are two folders. The first is called data and the second is called training_process. The training_process folder is an empty folder where all the training artifacts will be saved. By training artifacts I mean two things: first, the trained models, meaning that during training, at certain points, a checkpoint will be saved to this folder; and second, what are called events, files that record what happens during training, storing the values of the losses, the accuracy, and so on. We will use those event files to see what's happening during training, with a tool called TensorBoard that comes later. The data folder will contain all the data and files needed to run the training. The way I like to structure it is as follows. One folder I call train, and inside it I store the training data, the TFRecords used for training. Let's do this right now: if we go to train and look for our TFRecord, we have this one, so let me copy it and paste it here; this is for the training. One thing I want to mention: since we don't have a lot of data, I believe it's better to use as much of it as we can for training and keep the rest for testing. So what I have decided to do is take the validation data and use it for training as well, and use the test data for evaluation. This is a decision I made based on the number of images we have; I don't think it's a good idea to split them into three parts when we don't have much data anyway. So let me copy this validation record and paste it here as well, and just rename them: I'll call this one, for example, train2, and this one train1, and both files will be used for training. The second folder is the test folder, which will contain the data for testing. For this, I go back to the test folder; again, sometimes I say evaluation and sometimes testing, but they mean the same thing. I copy this and paste it here, and all the images and annotations stored in this .record file will be used for testing.
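At this point the experiment folder looks roughly like the sketch below; the pre-trained model folder, the label map, and the config file described next will be added to the data folder as well, and the record file names are simply the ones I chose above:

```
object_detection/
  experiment_1/
    training_process/        # empty for now; checkpoints and event files will be written here
    data/
      train/
        train1.record        # the original training TFRecord
        train2.record        # the validation TFRecord, reused for training
      test/
        test.record          # used for evaluation/testing
      pretrained_model/      # added next: the downloaded checkpoint
      label_map.pbtxt        # added next
      <model name>.config    # added next: the config file matching the checkpoint
```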
And the last folder I have here is the one that will contain my pre-trained model: the model I'll be downloading, which will serve as the starting point for our training. To get this pre-trained model, I'll attach a link to this video; it points to the model zoo, the TensorFlow 2 Detection Model Zoo. This table contains checkpoints, pre-trained models, for many different architectures. For example, for SSD or Faster R-CNN there are many different backbones, and for each backbone you have a different checkpoint. For this specific video I'll choose Faster R-CNN with ResNet50 V1 as my architecture, as my starting point. I just click here, it gives me the link to download, and I save it; I have already downloaded it, so it is in my Downloads folder. I'm going to open this in a new window, copy it, paste it here, and extract it here. Once it's finished extracting, I will delete the zip file. The last thing we add to our data folder is the label map, which we created before. Just to see it again: it simply contains this mapping of an ID to the name of each class we have, so 1 points to mask and 2 points to no-mask. Let me close this. Now we need one last piece for the training, which I mentioned before: the config file. For the config file, you can go to the cloned, or downloaded, repository for object detection, the folder called models: inside models/research/object_detection, and then inside configs/tf2, you will find a list of all the config files that are available, and these config files correspond to the model zoo I showed you. So if you come back to data/pretrained_model, you should look for a config file that has the same name as the checkpoint folder. I'm just going to copy the name, Ctrl+C, come here, and look for it with Ctrl+F, Ctrl+V; this is the config file I need for the training. I'm going to copy it and put it here, and now I have everything necessary to run my training. The next step is to change the parameters inside this config file, and we will do that in the next video. 20. Preparing your config file: Now let's make the necessary changes to our config file to allow the training to run and to define all the parameters that training and evaluation will need. For that, I'm going to open my config file in a text editor. As you can see, the config file is divided into multiple sections. There is a section about the model configuration, where we see parameters such as the number of classes and the input size, the size of the images at training time, along with some other model parameters. We also have a section about the training config.
In the train config we have the batch size, the number of steps, and the path to the pre-trained model, the one we downloaded: we need to point to those checkpoints here. We also have a train input reader; this is where we define the paths to our TFRecords, our dataset basically, and also the path to the label map. There is an eval config section, where we can define some configuration for the evaluation, the testing part; for example, we can change the batch size or the metrics set. And finally we have the eval input reader, the testing input reader, where we again define the path to the label map, because it's needed here too, and the path to the record files, the part of the dataset that will be used for testing. As you can see, there are lots of parameters, but there are mainly a few that you will have to change each time you start from this config file, the one we copied from the folder inside the Object Detection API. There are parameters you will most likely change every time, and other parameters you don't really need to touch: you can run your training and evaluation without changing them, although they can improve the results in some cases. Those are second-degree parameters, not the main ones, and we will talk about them a little later; for now let's go through the main parameters that you need to change. The first is the number of classes. The pre-trained network we downloaded was trained on the COCO 2017 dataset, which has 90 classes. In our case we don't have 90 classes, we only have two: the faces with a mask and the faces with no mask. So this will be changed to 2. The next thing to change is the batch size. During training it's better to have a large batch size, but since we are running the training on my local machine, and my GPU only has four gigabytes of memory, it can't handle many images at once, so I'm going to change it to 2. Then the number of steps: each step processes one batch of data, so if the batch has two images, each step passes two images through the network at the same time. For testing purposes, just so we can see the training run, I'm going to change this to only 200, since I just want to show you a few things. When you are happy with the configuration you have defined here, you can increase the number of steps and run the training to get your final, best object detection model. Another thing I need to change is the fine_tune_checkpoint. This path points to where our pre-trained model is on the local machine, so I'm going to go to my pretrained model path: it should be data, pretrained_model, then the extracted folder, then checkpoint, and it's this one, ckpt-0. For that, I'm just going to open this in a terminal so that I can get the full path.
I'm going to copy it and paste it here, and it should end with checkpoint/ckpt-0; this is the starting checkpoint we need to define. So here I have defined the path to that checkpoint. Next, as you can see, there is fine_tune_checkpoint_type, where you have the option of choosing either classification or detection. This is based on the slide I showed you before: when we start from a pre-trained network, we either keep the pre-trained parameters for the classification part only, or we keep the parameters for the whole network, all the layers, and that is what we want in our case. So we're going to change this to detection: we want to start from a network where all the parameters are initialized from the pre-trained model defined above. Let's go down a little further. Here we need to define the path to our label map; just to verify, the label map is inside data, so let me copy the path from there and paste it, and to avoid mistakes I'll copy the whole thing, Ctrl+C, Ctrl+V. In fact, the path to the label map is needed for both the training part and the testing part, so I'll copy it and paste it in both places. The other thing we need to change are the paths to our datasets for training. For the training part, let me go back and check the path: basically until we get to data, then train. I'm going to paste it here so that it points to data, then train; let me verify, data, train, and now we are at the folder that contains our training records. But since we have two different files in there, what we should do is write the path like this: when we write star dot record, *.record, it means that this path should include all the files that have the .record extension; that's why I'm adding it. Finally, for the evaluation part, the testing part, I'm going to copy the same thing, except that I change the path: paste it and change this to test. Let me go back and verify: I have data, I have test. In fact I could just write test.record directly, but using *.record doesn't hurt, since we only have one file with that extension there, and it will be chosen as the file for the evaluation part. With these parameters defined, I don't believe we need to change anything else, at least for now. These are the main parameters you will usually change whenever you get a new config file; as for the other parameters, most of them you will likely not touch, but some of them you might want to change from time to time when you're trying to get the best accuracy possible out of your model, and maybe we'll look at those a little later. With the config file finished, let's try to run a training and an evaluation and see whether everything works correctly. 21. Running the training and testing for experimentation: In order to run the training and the evaluation, we will follow the guide shown on the GitHub repository for the Object Detection API. What I have done is simply copy the commands from there and put them in a file that I can always come back to without going to the repository.
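For reference, here is a condensed view of the config changes from the previous lesson. Only the edited fields are shown, the untouched sections are omitted, and the paths are abbreviated placeholders for the ones on your machine:

```
model {
  faster_rcnn {
    num_classes: 2                     # mask and no-mask instead of the 90 COCO classes
    # ... the rest of the model section stays as downloaded
  }
}
train_config {
  batch_size: 2                        # as large as your GPU memory allows
  num_steps: 200                       # small value, just for this first experiment
  fine_tune_checkpoint: ".../experiment_1/data/pretrained_model/<checkpoint folder>/checkpoint/ckpt-0"
  fine_tune_checkpoint_type: "detection"   # start from all pre-trained layers, not just the backbone
}
train_input_reader {
  label_map_path: ".../experiment_1/data/label_map.pbtxt"
  tf_record_input_reader {
    input_path: ".../experiment_1/data/train/*.record"   # picks up both training records
  }
}
eval_input_reader {
  label_map_path: ".../experiment_1/data/label_map.pbtxt"
  tf_record_input_reader {
    input_path: ".../experiment_1/data/test/*.record"
  }
}
```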
To run these commands, we need to define a few parameters to pass to the script that does the training. As you can see, the main script that does the training is called model_main_tf2.py, and it takes three parameters. There is the pipeline_config_path, the path to our config file; there is another parameter called model_dir, the path where everything that comes out of the training will be stored, so the checkpoints and also the events that record the development of the loss and the accuracy of the model; and the last parameter here just logs everything that happens during the training. The first thing we need to do is define these parameters: we set the path to the config file and the path to the model directory, and then we run the command, which takes them as input, so pipeline_config_path will be defined here and model_dir as well. For that, I'm going to open two different terminals, and I have activated my conda environment, the one with TensorFlow 2 GPU and the Object Detection API installed. I have two terminals, and I will define the paths in both: one terminal I'll use for training and the other for testing. Let me define the pipeline_config_path here: it should be the config file we just prepared, so I'm going to put that here and run the same command in my second terminal as well. Then we define the model_dir: the model_dir will be the path where everything is stored, so this should be our training_process folder; everything that comes out of the training will be stored in there. I'll set this as my model_dir and run the same command in my second terminal. Now let's look at the command we need to run: we run the script model_main_tf2.py and give it the parameters we just defined as input. Let me copy this. One thing you should make sure of is that you are inside the models/research directory when you run this command; if you're not, you have to cd to that specific path so that the scripts can run and everything goes smoothly. So let me paste this here; this will be the training, and I'm going to run it. You should know that it takes some time to start, and also that it only shows the logs after each 100 steps; I'll show you that a little later. These are just some logs coming from the API and from TensorFlow; you don't need to look at them for now, they only give some information about the preparation of the training, so don't worry about them, and if you see warnings, that's completely fine, there's no problem with that. It has almost started; just to make sure, okay. And for the evaluation part, what we will do is run the second command here.
The second command runs the same script, model_main_tf2.py, as you can see. We give it the pipeline_config_path just like before, and the model_dir just like before. The extra parameter we need to define, the one that tells the script we want to run testing and not training, is checkpoint_dir. It represents the directory where the checkpoints are being saved: if you pass it, the script understands that you want to do testing instead of training, and what it does is wait for checkpoints to appear in the model_dir folder and then read them from this checkpoint directory in order to run the evaluation. Just to reiterate, for the script model_main_tf2.py, leaving aside the last parameter, which is just for logging: if you give it only the pipeline_config_path and the model_dir, it knows this is for training purposes and it will run the training; if you also give it this third path, which points to where the checkpoints are being saved, in our case the same folder we defined as model_dir, so the same model_dir can be used as checkpoint_dir, then the script understands that we want to do evaluation, or testing, and it won't run the training. It will simply wait for checkpoints to be saved and then use them to evaluate on the data we defined in the eval_input_reader, so it's going to use that data to do the testing part. Let me just go back to the terminals. Here, in fact, if you see this, the training has started, and it's going to take some time to finish the first 100 steps. Once the first 100 steps are done, you will see a log that tells you they are done, and it will move on to the second part of the 200 steps, the second 100 steps. If you see the log up to this point, it means that your training has started without any problems; if it stops, there probably is some error, and you can read the logs to understand what went wrong. But for now, if you get to this phase, your training is working properly. So now let's go to the evaluation part and define the final parameter we need for the evaluation, or testing, part. As you can see, I keep saying evaluation or testing: they mean the same thing, so keep that in mind; different researchers or engineers might call it one or the other, but both mean that we are going to use that data for inference only, we're going to look at the accuracy only, and we're not going to train the model on it. So let me copy this. What I will do is choose the path where my checkpoints will be saved. Again, if we go to training_process, as you can see, these are the checkpoints that were just generated by the training; this is the folder where everything is saved during training, and if we go into the train folder here, this is a folder generated by the Detection API, and it saves the events.
These events are basically the history of the training: the development of the loss values and also the average precision during the training process. So here I'm going to choose the same folder, training_process, because I want to read the checkpoints from that same folder; that's what it means. We have two different parameters because you can imagine a case where you have moved your checkpoints somewhere else; in that case you can just point the checkpoint_dir to wherever you put them. And you might be asking why we need the model_dir at all for evaluation: the model_dir is what lets the script save the evaluation events while it's evaluating. The same way the training events are saved here, when we run the evaluation we're going to get a new folder called eval, and the events for the evaluation part will be saved inside that eval folder; that's why we need to define the model_dir here as well. So let me set this, and now let's run our script: copy it, and again make sure you are inside the models/research folder, paste it, and run it. It's going to take some time to set everything up, and at first it will only do the evaluation using the checkpoint that's already saved, this checkpoint here; then it will pause and wait for a new checkpoint to be saved. When a new checkpoint is saved, the script takes it and uses the same data to run the evaluation again. So let's wait a little. Of course, training and evaluation can take a while, so be prepared for that, especially if your machine is not very powerful, like the one I'm using. It's still setting things up; I'll show you what happens for the evaluation, and if it takes too long I may pause the video and come back when things have moved along. It's still going; there should be some logs after this telling us that it has done the evaluation part, so let's wait a little bit. After a few seconds, maybe around a minute, I got these logs where you can see the average precision, the detection boxes, and so on. These are the logs coming from the evaluation part, so if you see this, the evaluation is running correctly. Once it finishes this part, it will pause and wait for a new checkpoint to be saved, and that new checkpoint will be saved by the training once it reaches a certain step. So what's going to happen is that we'll get a new checkpoint here, called ckpt-2, and once it's created, the evaluation script will take that new checkpoint and run the evaluation again. And as you can see, a new folder was created, called eval, and inside it we have events that record the development of the evaluation; this was created because we set the model_dir to the same path as in the training part.
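To recap, the two invocations look like the sketch below, both run from inside models/research. The paths are placeholders for the ones defined above; the flags themselves (--pipeline_config_path, --model_dir, --checkpoint_dir, --alsologtostderr) are the ones documented in the Object Detection API's TF2 training guide:

```bash
PIPELINE_CONFIG_PATH=.../experiment_1/data/<model name>.config
MODEL_DIR=.../experiment_1/training_process

# Terminal 1: training
python object_detection/model_main_tf2.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --alsologtostderr

# Terminal 2: evaluation. Adding --checkpoint_dir switches the script to evaluation mode:
# it waits for new checkpoints in that folder and evaluates each one as it appears.
python object_detection/model_main_tf2.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --checkpoint_dir=${MODEL_DIR} \
    --alsologtostderr
```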
So let's just wait a little bit and see. Just now, in fact, I got this new log, and this is what I was mentioning before: as you can see, the evaluation part is basically paused here, and it says it is waiting for new checkpoints at this path. Once we get a new checkpoint from the training, the evaluation will start again using it. I just wanted to mention this; the training is still running, and once it reaches the point where new checkpoints are created, I'll come back and we'll see what happens. So now, after 200 steps, the training has stopped, as we can see here. Each step took about five seconds, which is why the logs came slowly. Just from these two values here we can see that the loss is continuing to go down, so that's good. The other thing I want to show you: as you remember, we were at the stage of waiting for a new checkpoint at this path. Once the second checkpoint was saved, after the first 100 steps finished, that event triggered the evaluation script to start again. As you can see from the logs: waiting for new checkpoint, then it says found new checkpoint at this path, and once that checkpoint was saved the evaluation started again, with the same detection metrics as before for the testing part, as shown here. And again, it's now waiting for a new checkpoint, so if we ran the training again for longer, this evaluation script would keep waiting for new checkpoints and, each time one is generated, use it for the testing, or evaluation. 22. Running Tensorboard to analyse the development of the loss and precision : So now the training has finished, well, it finished the 200 steps that we defined; it's not finished in the sense that the model is ready for production or inference. This was just for experimentation: we chose 200 steps and they are done. Now what we can do is analyze the development of the loss function, and also see how the precision, the mean average precision, developed for the evaluation part. For that, I will use a tool called TensorBoard. TensorBoard comes with TensorFlow, and it helps us analyze the whole history of the training and even the graph that represents the neural network. To use it, we run tensorboard and give it a parameter called logdir, which is the path to where the events are saved. For example, for the training part, we can run tensorboard --logdir followed by the path to the training folder. Let's run this and see what we get. It serves the results in the browser, and we can access the page by going here: I'm holding Ctrl and clicking this with my mouse. Here we can see the development of the loss function; this is one of the loss functions, because for Faster R-CNN there are several, corresponding to the different stages. As you know, there are two stages in Faster R-CNN.
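As a quick aside, the TensorBoard commands for the two event folders in this setup look like this; the paths are abbreviated, and the URL TensorBoard prints (usually http://localhost:6006) is what you open in the browser:

```bash
tensorboard --logdir=.../experiment_1/training_process/train   # training losses
tensorboard --logdir=.../experiment_1/training_process/eval    # evaluation metrics and side-by-side images
```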
For example, we have the classification loss and the localization loss for the final stage, and the same for the region proposal network, the network that comes before the last stage, right after the backbone. So here we can see the development of the loss, and what we usually look for is that the loss goes down overall; that should be the aim of our training. If the loss is going down, that's good; if it's not, that's bad. It should also go down smoothly. When we look at these plots, we see that the loss is going down overall, but with a lot of fluctuation, as you can see. There is also this jump here, but that actually comes from the fact that I stopped the training myself and then ran it again: the loss went back to roughly its initial value and then started converging again. That's why, on this left part, there are actually two different runs superimposed one over the other, and again, that was just for experimentation. On this side it's clearer, and we can see that the loss is going down overall. But even if we ignore the fact that I stopped and restarted the training, the fluctuations are too large. One reason is that we are using a very small batch size. With such a small batch, the gradient moves slowly and takes noisy steps: the optimizer is not looking at a big enough chunk of the dataset each time it takes a step, only two images, so when it tries to move downhill, which is what gradient descent does on the loss surface, it doesn't make good decisions at each step. If we increase the batch size, the curve will probably be much smoother. Of course, I can't choose a large batch size because of the limitations of my machine, and this is something you should consider when developing your object detection model: if you don't have a good machine, the training can look like this, and the final model will not be that good. You can see basically the same behavior in all the different loss functions here; overall, for example, the localization loss is going down, which is good, and again, this is where I stopped the training, so it went back to the initial value, same thing here and here. But overall the loss is going down, which is a good sign; apart from that, though, we can't really say this is a good training, and the main reason is that the batch size is very small and should be increased. If I were doing this in my job and saw this kind of plot for my loss function,
the first thing I would try is increasing the batch size, to see whether the curves become smoother or not. So that was the training part. Now let's look at the evaluation part: let me switch to the eval folder, because now I want to access these events here. We don't get much information for the average precision, because we didn't run the training for very long. But if we go to the images tab, we can see a side-by-side comparison: on the left is what the neural network that we are training has predicted, and on the right is the ground truth. As you can see, the network is still far from good: here it should have detected this as a mask, but it actually says no mask, and the box isn't even close to the shape of the face. The same thing can be seen in the other images. Here we have the ground truth, and here is where the network predicted the bounding boxes, and again the results are quite bad. But that is to be expected: we did not run the training for long, we don't have a large batch size, and this was a first trial, so I'm not surprised by these results. The goal is to end up with a good model that is able to detect the people who are wearing masks and the people who are not, and so far the model is not capable of doing that. We'll look at how to improve this in the next videos. 23. Settings for training and evaluating a Faster RCNN model on your local machine: So in this video, we will talk about the different things you can do to improve the training on your own machine. As I have mentioned before, I don't have a very good machine for training large neural networks such as this one, Faster RCNN with ResNet-50. Let me just check my NVIDIA settings here, for example. This is my GPU, a GeForce GTX card, and it has four gigabytes of memory. Even with this, I cannot go beyond a batch size of two. In fact, I could maybe choose three, but usually it's better to choose a power of two, so 2, 4, 8, 16 and so on, because of the way GPUs are built: it is much more efficient for parallelization if the batch size is a power of two. That doesn't stop you, though; you can try three, you can try five, it's just that powers of two are usually recommended. Since I am stuck with two and cannot go beyond that, and we have seen how the training went, at least for those 200 steps, I have decided not to do the full training on my machine, because I know from experience that the final model will not perform very well. But if you have a good machine, the first thing you should do, based on what we have seen in the previous videos about how the training went, is to increase the batch size. Maybe you could start with eight; if you have a really good machine, you can even try 16 or 32 or higher. So for me, I will leave it here and I will not be doing the training on my local machine. What I will be showing you a bit later is how to leverage the power of Google Cloud Platform in order to run the training on some of its more powerful machines.
And that will allow us to increase the batch size and get better results. So this is my recommendation: if you have a really good machine, the first thing you should try is changing the batch size to a higher value here, and you should also increase the number of steps. Let me just look for the number of steps here: you could increase it to 2,000, for example, at least to see how the training goes over those 2,000 steps (the corresponding edit to the config file is sketched at the end of this part). For me, I will stop here, as I have mentioned, but you can use the exact same setup on your machine and train a deep learning model for object detection locally. You just need to know that if you don't have a good machine, the final model will not be that good. Even with the settings we have defined here, we can train a deep learning model and use it at the end; it's just that the results will not be good enough, especially for production. If you are working in industry and trying to develop a good deep learning model for your task, having a good enough machine to train your model on is really necessary.
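To make the change concrete, this is roughly what the relevant part of the pipeline config looks like after the edit. It is only a sketch based on the standard TF2 Object Detection API train_config fields; the exact values are up to you and your hardware:

    train_config {
      batch_size: 8    # was 2 on my machine; raise it if your GPU memory allows
      num_steps: 2000  # was 200 for the quick local experiment
      # the rest of train_config (optimizer, data augmentation, etc.) stays unchanged
    }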
24. What is cloud computing and what is AI-Platform?: Hello and welcome to this new section of the course, where we will explore how to leverage the power of cloud computing, specifically Google Cloud Platform, in order to train our deep learning model, and we will be using the service called AI Platform for this task. If you go to Wikipedia, the definition you will find is that cloud computing is the on-demand availability of computer system resources, especially data storage and computing power, without direct active management by the user. What this means is that you get access to machines that have resources you can use for storage and for computing power. I always think of cloud computing as remote Linux machines that have almost the same things you have on your laptop, except that the configuration can be much, much better: if you don't have good GPUs on your own machine, for example, then on these remote machines in the cloud you will have better resources, access to more storage if you need it, and so on. Another important aspect is that these remote machines already contain software for specific tasks, which you can use directly; you don't have to install it on your local machine or worry about whether your machine can handle the demands and resources that the software needs. Cloud computing is made for exactly this: giving you access to powerful machines with high availability of storage and computing power. For this section, we will be using Google AI Platform. AI Platform is one of the services available on Google Cloud Platform, aimed at artificial intelligence tasks; almost anything AI-related that you would need in your projects, you will find in the AI Platform service. Moreover, it offers different approaches to training machine learning models. There is the direct approach where, for example, they have built-in support for TensorFlow: you upload your code, AI Platform reads it, and it runs the training just as you would on your local machine. And there is also the approach we are using in this course, which is through containers, where we build a container that has inside it all the code and the dependencies that the code needs in order to run the training, and then we use that container on AI Platform to run our training. Another great thing about the service is that it supports most, if not all, machine learning frameworks: TensorFlow, PyTorch, MXNet, scikit-learn, all of these libraries you can use with the AI Platform service. And the best way to do this, in my opinion, is to use containers, because with containers you isolate your code from the system and you package the dependencies your code needs into one Docker image. That makes it very easy to use new libraries and frameworks, and even to experiment with different versions of the same framework. For example, as you may have heard, TensorFlow 1.x is different from TensorFlow 2.x, and you might want to run the same training with the versions that come with 1.x and the versions that come with 2.x and compare the differences, or test a development version of some framework. You can do all of this using containers, and since AI Platform supports containers, you can use whatever frameworks and libraries you like on Google Cloud Platform. 25. Creating a Google Cloud account: In this video, we'll look at how to create a new Google Cloud Platform account. The first thing we need is a Gmail account; I already have one that I created specifically for this course, just to show you how to start from zero. Once I had my Gmail account, I went to Google, searched for Google Cloud Platform, and chose cloud.google.com. Once on that page, I chose this option here. Let me just change the language to English, and then you can click on Get started for free. Here you just confirm your country and check that you have read the terms and conditions, and what you get is $300 of free credit. This means you can test a lot of services on Google Cloud Platform essentially for free, because this is $300 offered by Google as cloud credits, which helps you try many things without really spending your own money. So let's click Continue here. You need to fill in some basic information, and here you actually need to enter your credit card, but don't worry: you will not be charged. They just need this information to verify that you are a real person and not a bot, and that you have a card you could pay with later if you choose to continue after the $300 is used up. So here I'm just going to enter my details, and once that's done, I'm going to click on Start my free trial, and after that I'll show you where we end up once this phase is finished. At some point you will get to a step where you need to validate some information.
And for that, I had to take a picture of my credit card while hiding most of the digits, leaving just the first four, so that they could verify it. As mentioned here, it might take some time to verify the card and the rest of the information. I will stop here and continue using the account that I had created long before, but for you, once you finish this, you can start your free trial and you will get access to the same things I'll be accessing with my other account. So here I am now in Google Cloud Platform, but using my old account, the one I always use to create projects on Google Cloud and to use all of their services. As you can see, I'm already on a dashboard and it has already selected a project for me, one that I use for testing purposes and other things. When you arrive here, you will probably find one project already created for you, and you can either choose that one or create a new project. To create one, click on New Project, give your project a name, and create it. Let's call it, say, test project course, and click Create; it should only take a few seconds. If you come back here, you will see the new project, and if you click it, you get the dashboard that corresponds to the chosen project. For me, I'm going to continue using my old project, which I created specifically for object detection, for testing, and for everything linked to this course. Once you get to this page, you have everything necessary to start using Google Cloud Platform, and I'll see you in the next video. 26. Downloading Google Cloud SDK: Another important thing we need in order to run the training on Google Cloud Platform is what's called the Google Cloud SDK. This is a software development kit that you can download, and it gives you access to Google Cloud from your terminal. For example, if I type gcloud here, I see a list of options, things I can do on Google Cloud Platform from my terminal. If you try this command now, you will probably get an error saying that the command is not installed on your machine. For that, search for Google Cloud SDK; let me take you to the page. Just search for Cloud SDK, and when you get there, click on Get started. On that page you have the different options to install the latest SDK depending on your operating system. I am on Linux, so I downloaded it using these commands here; you can also just download an archive, extract it, and run the install script, following the commands shown. It's very simple. If you are on Windows, you can follow those steps instead, and there is a GUI installer for everything necessary. For me, I have already installed it. I am on Debian, or Ubuntu, sorry, so for Ubuntu, follow these steps exactly.
I just copied and pasted those commands into my terminal and ran them, and at the end I had the Google Cloud SDK installed on my machine. Once you have it installed, let me just clear the terminal, you can run, for example, gcloud init. This shows some information about my email and the project that is currently set, so you can verify that everything is configured correctly. Here it says pick a configuration to use: either re-initialize this configuration with new settings, or create a new configuration. We're not going to do either of those, because we already have everything set up, so I'll just press Ctrl+C. Once you have the correct Gmail account linked to your Google Cloud SDK, you can check many things. For example, if I run gcloud projects list, it lists all the projects created on my Google Cloud account; if I come back to the console, you can see these four, and they are the same here. And if you want to switch to one project instead of another, you can use gcloud config set project followed by the project ID; let me just copy this one and paste it. It has now set that project, and if I run gcloud init again, you can see that this project is set to be used. Of course, I don't actually want to use that project; I want to continue with my other project, so let me list my projects again and set the project called object detection using TF2. Now my project is updated; let me check with gcloud init again, and I can see it was set correctly. Ctrl+C, clear. So once you can use these commands, it means your Google Cloud SDK is correctly installed on your machine and you can continue with the rest of the course; the exact commands I used are summarized just below.
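Here is a short recap of those commands, as I typed them; the project ID at the end is of course specific to my account, so replace it with one of the IDs that gcloud projects list prints for you:

    # check the current configuration (account, default project)
    gcloud init

    # list every project available on this account
    gcloud projects list

    # make one of those projects the default for subsequent gcloud commands
    gcloud config set project YOUR_PROJECT_ID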
27. Creating a google bucket and uploading data to it: Something that will be different now that we want to train our deep learning models on Google Cloud Platform is that our data needs to be on the cloud as well. So far the data was on our local machine, but now it needs to be on some storage service in Google Cloud Platform, and for this we're going to use Google Cloud Storage. If I search for bucket here, I can choose the first option, create a bucket. Creating a bucket means creating something almost like a drive where you can store your data, just like a local drive, except that it lives in the cloud. When creating a new bucket, you need to give it a name. You can treat each bucket as one experiment, or you can create a single bucket and put many experiments inside it; that depends on how you prefer to organize things. I'm going to create one bucket and keep several experiments inside it, so let's name it something like tf2-object-detection-api-bucket. I always like to end my bucket names with the word bucket so that I can distinguish them from the names of everything else I create on Google Cloud; of course, you are free to choose whatever name you like. I click Continue, and next I choose a single region, which gives the lowest latency within that region; for the location you can choose whatever suits you, and since I am in Europe I'll choose europe-west1, then click Continue. For the storage class, you have four options: Standard, Nearline, Coldline, and Archive. I'm going to choose Standard because it's the best one for short-term storage and frequently accessed data, and the data we put here will be accessed frequently by the training process to read the images and pass them to the neural network. Continue. For access control I'm choosing Uniform, which ensures the same access to all objects in the bucket using only bucket-level permissions; with the other option you have to specify access for individual objects, and I don't want to do that for testing purposes, so I usually go with Uniform. Continue. For encryption, you can choose a customer-managed key or a Google-managed key; I'll take the Google-managed key because I don't want any extra configuration, and then I click Create. Now our bucket has been created, and you can think of it just like a local drive, like a C or D drive, inside which you can create whatever you like. For example, we can create a folder and call it experiment one, so that we have the same layout as on the local disk: if I check my local data, under experiment one we had the data folder, the training process folder, and the configuration file. Let's do the same thing and upload them here. In fact, let's do it right now: click on Upload folder, go to experiment one, check the data folder, yes, here we have the pre-trained model as well, and upload that data folder. It will take a little time since there is some data in it, and while we wait, let's create a new folder called training process and click Create. As you can see, I'm trying to keep the same structure inside my Google bucket as I had on my local disk. This makes it easy to keep things consistent, so that whenever you test on your local machine and then move to the cloud, you know exactly where everything is and you don't get confused by two different setups. Now almost all of our data has been uploaded; I think everything went through successfully, so let's close this. By the way, if you prefer doing these steps from the terminal instead of the web console, a small sketch with gsutil follows.
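The same bucket creation and upload can be done from the command line with gsutil, which comes with the Cloud SDK. This is only a sketch: the bucket name, region, and folder names are the ones I chose for my experiment, so adapt them to yours:

    # create the bucket: -l sets the region, -c the storage class,
    # -b on enables uniform bucket-level access
    gsutil mb -l europe-west1 -c standard -b on gs://tf2-object-detection-api-bucket

    # upload the local data folder into the experiment folder of the bucket
    # (-m parallelizes the upload, -r copies the folder recursively)
    gsutil -m cp -r data gs://tf2-object-detection-api-bucket/experiment_1/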
If we go to the training process folder now, it's empty; this folder will contain everything generated by the training process, so the model, the events, everything we had on our local machine will be here as well. In the data folder we have the label map, the test and train folders with our record files, the TFRecord files, and the pre-trained model; for this experiment that is Faster RCNN with ResNet-50. So it's exactly the same content that we had on our local machine. Let me go back here. The one thing that is missing is the configuration file, the file that contains everything necessary for the training. If we open it, as you remember, we need to define paths to our files, paths to our TFRecords, and so on. On our local machine we defined these paths according to the folder structure and the drives we have locally, but for the cloud we need to change this, because when we run the training on Google Cloud Platform, the training will need to access these folders, so everything needs to point to this bucket and to the folders inside it. So I'll be showing you how to change your configuration file so that it's configured for training on the cloud, and we'll do that in the next video. 28. Preparing our config file for training on google cloud: In order to adapt our configuration file for training on Google Cloud Platform, let's first recall the setup. We have a bucket with this name, and inside that bucket a folder called experiment one, and inside experiment one we have all of the data plus the folder where the training artifacts will be saved. If we go back to the configuration file we used for training locally, what we had was something like this: the path up to this point here. Let me maybe look for the Faster RCNN one; I don't have it here, but for SSD for example, you can see we have this folder, then the experiment name, then the data folder, and the same goes for the rest of the paths we defined: the label map, the training records, the testing records. They all start from the name of the experiment onward, and before that comes the path to wherever those folders live on the local machine. If we think of it the same way, just with the data now living on Google Cloud Storage, the only thing that changes is that the paths will now start with gs, which stands for Google Storage, then a colon and two slashes, then the name of the bucket and the name of the folder we are using, which is experiment one; the rest stays the same as before. For example, the pre-trained checkpoints are found inside data, then the pre-trained model folder: if we go to data, pre-trained model, we find this folder, and inside it, under checkpoint, we have the ckpt files. So the path is exactly the same starting from data; the only part that changes is the prefix with the bucket name and the folder that contains everything. For that, let's go back, take the name of our bucket, copy it, and change this part to that.
And what's left after that is the name of the folder we created, which is experiment one. So if we change this to experiment one, we now have exactly the same path as before on our local machine, except that it now points to Google Storage, to the bucket that we created. If we take this prefix here, we can copy and paste it basically everywhere. For the label map path, again, it's gs, colon, two slashes, the name of the bucket, experiment one, and then the label map inside data. For the training records, we change this the same way: gs://, name of the bucket, experiment one. And the same goes for the evaluation label map and the evaluation records; for evaluation we have basically the same path except that we point at the test folder. With these changes, all of our paths now point to the right places on Google Storage. And of course, now that we will do the training on Google Cloud Platform, specifically AI Platform, we also have the ability to use a bigger batch size. So instead of two, I am now using eight, which is a larger batch size and will make the training much better, and I increased the number of steps to 2,000. Most of the other settings stay the same; I'm not going to change them. I have shown you before what you can do with the scales and aspect ratios and what they represent, so you could change those, but for now I'll keep them as they are, since we're just experimenting. The only new thing I have added is this parameter here called second_stage_balance_fraction, which by default is 0.25 and which roughly controls the fraction of positive examples sampled while training. I increased it to 0.5, which means I want the sampler to take negative examples, so background, and positive examples in a more balanced manner, you could say. That's the only extra thing I changed; you can ignore it if you don't want to use it. I actually tested with both 0.25 and 0.5 and I did notice some improvement in the results with 0.5, so you can add this parameter as well if you like. Or you can leave it out, run the training, analyze your results, and if they're not good enough, come back and add it with the value 0.5. I'm going to save the file here, and now I think everything is set up for training on the cloud. What I'll do next is go back to the bucket, into experiment one, and upload a file this time instead of a folder, because I just want to upload my config file. I renamed it by adding an underscore and the word cloud at the end, just to be able to distinguish it from the other config file used for training locally. I'll open it and upload it, and now it's inside my bucket and everything is set up for training; I'll show you how to run the training in the next video. A sketch of what the edited config ends up looking like follows.
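To tie all of that together, this is roughly the shape of the edited config. It is a sketch only: the field names are the standard ones from the TF2 Object Detection API faster_rcnn configs, but the bucket name, folder names, and file names are the ones from my experiment, so treat them as placeholders:

    model {
      faster_rcnn {
        # ... rest of the model definition unchanged ...
        second_stage_balance_fraction: 0.5   # default is 0.25; optional tweak
      }
    }
    train_config {
      batch_size: 8
      num_steps: 2000
      fine_tune_checkpoint: "gs://tf2-object-detection-api-bucket/experiment_1/data/pretrained_model/checkpoint/ckpt-0"
      # ... optimizer, data augmentation, etc. unchanged ...
    }
    train_input_reader {
      label_map_path: "gs://tf2-object-detection-api-bucket/experiment_1/data/label_map.pbtxt"
      tf_record_input_reader {
        input_path: "gs://tf2-object-detection-api-bucket/experiment_1/data/train/train.record"
      }
    }
    eval_input_reader {
      label_map_path: "gs://tf2-object-detection-api-bucket/experiment_1/data/label_map.pbtxt"
      tf_record_input_reader {
        input_path: "gs://tf2-object-detection-api-bucket/experiment_1/data/test/test.record"
      }
    }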
29. Running the training using Faster RCNN model: Before, when we were running the training on our local machine, we used the commands from the documentation, and as you can see they are labeled local: to run the training locally, we just used the command where we defined the pipeline config path and the model dir, and the same kind of command for evaluation. When you want to train on Google Cloud Platform, you actually have two options. The first option is to use a Google Cloud virtual machine: you rent a VM on Google Cloud Platform, you get access to it, you can customize it, and you pay by the hour. That machine is yours for as long as you rent it, and you pay for all of those hours even if you don't use them. You can set things up that way as described in the documentation, but what I prefer is the second option: you can also train your object detection model on Google Cloud Platform using the service called AI Platform, which makes things much, much easier. To run the training, what you basically need to do is run these commands here. The first step is to copy the setup.py file from the object detection package folder into the research folder. I have already done this; as you can see I am inside research, and if I run ls you can see I have the setup.py file here. If you don't have it, just run this command: from the research folder, it goes into the object_detection/packages/tf2 folder, takes the setup.py file there, and copies it into the current folder, which is research. You can do this manually or run the command, as you like; as I said, I have already done it, so I have setup.py inside my research folder. Let me clear this. The main command, in fact the only command you really need, is this one here. I tried it before and realized that some things in the documentation are not up to date, so some of it did not work for me and I ended up tweaking a few things to make it work; I ended up using this command here, and I will attach this file to the lecture to make it easier for you. Not many things actually change. One is the Python version: the documentation mentions 3.6, but if you run it like that you will get an error saying that Python version 3.6 is not supported; Python 3.7 is the one that is supported, at least as of now, October 2020. Apart from that, I added this line here, scale-tier set to CUSTOM; they don't use it in the documentation, but it is required, otherwise you get an error telling you to set the scale-tier argument to CUSTOM. Another thing I needed to change was the master accelerator: the documentation uses a count of 8 with type nvidia-tesla-v100, whereas I used a count of 2 and kept the same type.
If you use a count of 8, you will basically run out of quota, and you would need to make a quota request, which can take some time because the Google Cloud team has to review your request before answering you. With a count of 2, you already have the quota and you can use machines with that count right away. Of course, if you want more powerful machines, or eight accelerators at once, it will cost more, and you will also have to go through that quota request. For me, when I changed it to 2, it worked perfectly, so I'm going to keep it that way. Apart from this, nothing else really changed; the commands stay as they are, and the only things left to set are the model dir and the pipeline config path, which now refer to paths on Google Storage rather than on your local machine. As mentioned in the documentation, the gs:// model dir specifies the directory on Google Cloud Storage where the training checkpoints and events will be written, which for us is the training process folder. So let's define MODEL_DIR first: going back to the bucket, it needs to be the bucket name, then slash experiment one, then training process; the gs:// prefix gets added inside the command itself, so this should be the correct path. Then we define PIPELINE_CONFIG_PATH, which needs to point to our configuration file, so again the bucket name, experiment one, and then the name of the cloud config file, which we can just copy from the bucket. One more remark: I am currently inside the environment I created with Anaconda, but I don't think that's necessary here, because we will not be using the TensorFlow installed in that local environment; everything will happen in the cloud. So let me just deactivate my environment and clear the terminal. With MODEL_DIR and PIPELINE_CONFIG_PATH defined, I think we are set up to run the training; the full command, with the tweaks I mentioned, is sketched just below.
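For reference, this is roughly the command I ended up with. It is adapted from the job submission command in the TF2 Object Detection API documentation, with the Python version and accelerator count changed as discussed; the bucket name, config file name, region, and machine type are placeholders taken from my setup and from the documentation example, so replace them with yours:

    # one-time step, run from the models/research folder
    cp object_detection/packages/tf2/setup.py .

    # paths inside the bucket (no gs:// prefix here; it is added in the command below)
    MODEL_DIR="tf2-object-detection-api-bucket/experiment_1/training_process"
    PIPELINE_CONFIG_PATH="tf2-object-detection-api-bucket/experiment_1/faster_rcnn_resnet50_cloud.config"

    gcloud ai-platform jobs submit training object_detection_`date +%m_%d_%Y_%H_%M_%S` \
        --runtime-version 2.1 \
        --python-version 3.7 \
        --job-dir=gs://${MODEL_DIR} \
        --package-path ./object_detection \
        --module-name object_detection.model_main_tf2 \
        --region us-central1 \
        --scale-tier CUSTOM \
        --master-machine-type n1-highcpu-16 \
        --master-accelerator count=2,type=nvidia-tesla-v100 \
        -- \
        --model_dir=gs://${MODEL_DIR} \
        --pipeline_config_path=gs://${PIPELINE_CONFIG_PATH}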
Before running it, I wanted to check whether the variables we defined in the terminal still exist after deactivating the environment, and yes: variables you define inside a terminal session stay available in that terminal whether or not you are inside the virtual environment. So let me clear this, paste the command again, run it, and see what we get. One thing to mention: the first time you run this command, the one we took from the TensorFlow Object Detection API documentation, it will ask you to allow it to enable the service ml.googleapis.com on your project. You just choose yes and press Enter, and it will enable it; that might take a few minutes, and once it's enabled everything should work properly. You might then face this problem here, where it reports that an internal error was encountered. To deal with that, I would suggest simply waiting a little longer and running the same command again, because even after you enable the API, it can take a little while for the change to propagate across your whole project, so the first run may still fail. If you wait a bit, like I did here, and run the same command again, you will see that it works perfectly. It's just a small thing to keep in mind the first time you run this command; if you would rather enable that API yourself from the terminal instead of answering the prompt, there is a one-line sketch at the end of this part. If you face problems different from these, please let me know. So the training job has in fact started successfully, as mentioned here; it says it is queued, so it is already one of the training jobs on Google AI Platform. To verify, let's go back to the console and look for AI Platform on the dashboard, and under Jobs you can see the job that just started. Let me click on it; it can take some time, my browser is a little slow, but the training job has been created successfully. The other jobs in the list are ones I tested myself before, and this job here, with the spinner, is the one that is still running. If I open it in a new tab, we can see the details of the newly created job: it was created on October 27, 2020, just three minutes ago, and it is exactly the job we launched here. If we go to View logs and open that in a new tab, we should see details about the job: if the job starts and the training starts correctly, you will see all the details here, and if a problem appears, it will show up here as well. It can take some time to start a virtual machine on Google Cloud Platform and set everything up, so I'm going to wait a little bit and we'll see what we get once everything is set up.
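As mentioned above, if you prefer to enable that API explicitly before submitting the job, rather than waiting for the prompt, something like this should do it (using the service name shown in the prompt; gcloud services enable is a standard Cloud SDK command):

    # enable the AI Platform (ML Engine) API on the currently selected project
    gcloud services enable ml.googleapis.com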
After some time, we see lots of logs here, where things are being configured inside one of the machines on Google Cloud Platform. If I scroll down, I can see a line that says step 100, with the per-step time and the loss value, which is an indication that my training has started correctly and 100 steps have already finished; you will see one of these log lines after every 100 steps. And here, after 200 steps, we see another log line, and we see that the loss has in fact decreased, which is a good sign that the training is progressing correctly. Now the only thing to do is wait for the training to finish. One thing we can do to verify things in the meantime is to go to our Google bucket; let me just search for Storage here, go to the bucket we created, then experiment one, then training process. As you can see, we now have some checkpoints, so the model is being saved into this folder, and we also have a folder called train, which contains event files, the files that record the development of the loss function during training along with some other training parameters. To look at them, we can run TensorBoard. Locally, when we ran TensorBoard, we pointed it to a local directory; here, instead of our own terminal, we can use the Cloud Shell, which is essentially a terminal running on Google Cloud Platform. Once it opens, we can see that we are inside our project, object detection using TF2, and we can use the same command as before: tensorboard, dash dash logdir, and then the path to that folder, so gs, colon, two slashes, the name of the bucket, then experiment one, then training process. One extra thing we add is the port, so that when we open it, it is served on a specific port that we can use to view the TensorBoard window, so we add dash dash port 8080 and run the command (the exact command is sketched just after this part). The first thing it asks is to authorize Cloud Shell to make GCP API calls, so I click Authorize. Now my TensorBoard should be served at the address shown, but if you click that link directly you won't be able to access the site; the correct way is to use this button here called Web preview, then Preview on port 8080. Click that, and your TensorBoard window opens here, with all the graphs inside. Immediately we can see that this training run is much better than before: there are still some oscillations, but it's much smoother compared to what we had before, and the loss function is going down, which is a good thing. The main thing to look at is the total loss: if the total loss is going down, that's a good indication; if it's going up, that's a bad sign and you need to check your training.
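This is roughly what I typed into the Cloud Shell; the bucket and folder names are the ones from my experiment, so adjust them to yours:

    # TensorBoard can read the event files directly from the bucket
    tensorboard --logdir=gs://tf2-object-detection-api-bucket/experiment_1/training_process --port 8080

    # then use Web preview -> "Preview on port 8080" in the Cloud Shell toolbar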
Here we can see that the training is going smoothly and has reached step 350, and as you remember our training will run for 2,000 steps; let me check again, yes, 2,000 steps. After 2,000 steps the training will stop and we will have the final model saved inside our bucket, inside training process. So now you can see that basically the same thing we were doing on the local machine can be done on Google Cloud Platform: you have access to the same workflow, except that now you have better machines and better configurations than what you have locally, at least if your local machine is like mine. If you have a good local machine with good GPUs, then of course you can run the training locally and you don't need Google Cloud Platform at all. One final thing we need to do is run the evaluation. You can run the evaluation at the same time as the training, and it will behave just like it did on our local machine: the evaluation job, or the evaluation script, will look for new checkpoints, and each time new checkpoints appear it will run an evaluation and save the evaluation metrics, like the mean average precision, into a folder, in the form of event files, just like the train folder here. We'll run the evaluation in the next video. 30. Running the evaluation during the training: Now we will run the evaluation on Google AI Platform, and for that we will use this command here. It is almost the same as the previous command except for this added part: the checkpoint directory, which we will set to the same path as the model dir. What it will do is create a folder called eval inside our bucket, in the same place where we already have the train folder, and everything for the evaluation will be saved inside it. Again, some of the commands in the documentation are outdated, so after experimenting I will be using these updated commands here. And again, you first need to copy the setup.py file from the tf2 packages folder into the research directory, but if you already did that in the first part, when you ran the training, you don't need to do it again; we have already done it. So what we do is simply run this command: I copy it from here, go back to the terminal, and paste it in. This is the command that launches a new job, the object detection evaluation job; a sketch of it, with the same updates as before, follows.
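As before, this is only a sketch adapted from the documentation's evaluation command, with my placeholder names; the machine-related flags mirror the training sketch above, so keep whatever values you used there or the ones from the documentation. The important difference is the extra --checkpoint_dir argument, which tells the script to keep reading checkpoints from the model directory and evaluate them as they appear:

    MODEL_DIR="tf2-object-detection-api-bucket/experiment_1/training_process"
    PIPELINE_CONFIG_PATH="tf2-object-detection-api-bucket/experiment_1/faster_rcnn_resnet50_cloud.config"

    gcloud ai-platform jobs submit training object_detection_eval_`date +%m_%d_%Y_%H_%M_%S` \
        --runtime-version 2.1 \
        --python-version 3.7 \
        --job-dir=gs://${MODEL_DIR} \
        --package-path ./object_detection \
        --module-name object_detection.model_main_tf2 \
        --region us-central1 \
        --scale-tier CUSTOM \
        --master-machine-type n1-highcpu-16 \
        --master-accelerator count=2,type=nvidia-tesla-v100 \
        -- \
        --model_dir=gs://${MODEL_DIR} \
        --pipeline_config_path=gs://${PIPELINE_CONFIG_PATH} \
        --checkpoint_dir=gs://${MODEL_DIR}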
And now we see that the job has been queued. Let's go back to the dashboard and check: this first one is the training job, it doesn't have eval in its name, and I hope you noticed the naming convention: when I run the training I name the job object detection plus the current date, and when I run the evaluation job I call it object detection eval plus the date. The arguments are essentially the same: the runtime version, which is the TensorFlow version, is 2.1; the Python version is 3.7; the job dir is where the artifacts will be saved; the package path and the module name are there so that AI Platform knows we are using the object detection package; then there is the region, and the scale tier, which, as I mentioned, you need to add, otherwise you get an error; there may be a default value, but if you want to use these machines, the NVIDIA Tesla V100s, you need to set the scale tier to CUSTOM along with the master machine type and the master accelerator, which I'm keeping the same as in the command from the documentation. The model dir is the same, and the pipeline config is the same as well. So now let's go back and check: of course, we have this new job, and I'm going to open it in a new window, maybe place it after the training one, and you can also access its logs. There you will see the same thing as before: AI Platform setting everything up so it can run the evaluation part. Let's wait a little; what this job will do is create a new folder called eval in the bucket and save all the evaluation events inside it. Let me check again whether it has started or not: it's still setting things up on the remote machine, so I'll be back once the evaluation has actually started. After a few minutes, I can see these logs here with DetectionBoxes precision and all of these values, and when I see this it means the evaluation is running, the same thing we had on our local machine. If we go to our bucket again, we see the new folder that has been created, called eval, and inside it we have the event files. To look at the evaluation we can again visualize things in TensorBoard, so let's open the Cloud Shell; it is still running the TensorBoard for the training part, so let me stop that, point it at eval instead, and refresh. Because we only have one checkpoint so far, we only have one data point here, and the mean average precision is still 0 because the training has not progressed much yet; once the training moves many steps ahead, we will see new values saved here. So now you basically have two different jobs, two different setups, where one is doing the training and the other is doing the evaluation: whenever the training reaches a certain point, new checkpoints are saved, the evaluation job detects the new files and runs the evaluation again, and we get new data points with the mean average precision for each new checkpoint. With this, you now have all the necessary tools to run the training and the evaluation on Google AI Platform, and I think that wraps everything up. Of course, once the training is finished, I will show you the end result, but for now this will be it; the training can take some time. These are some previous jobs that I ran myself: as you can see, this one took around two and a half hours, and some of them I stopped myself.
For the evaluation, it really depends: if you run it at the same time as the training, it can take a while, whereas if you run it after the training is finished, I believe it can save some of your cloud credit, because the evaluation itself goes quite fast once all the checkpoints have already been saved. I'm choosing to run them at the same time; I know that at most it will take around two and a half or three hours, and I still have enough credit to run both the training and the evaluation simultaneously. Once everything has finished, I'll show you the end results. So with this, I'll see you in the next video. 31. Analyzing the results after the training of Faster RCNN model is finished: Now that the training has finished, and the evaluation as well, let's take a look at what we have and analyze the results. First, if you go back to the jobs list, you can see that both the training job and the evaluation job have finished. The training took around two and a half hours; let me just refresh the page to make sure everything is up to date. The evaluation was in fact the last job: the training job took around two and a half hours, and the evaluation job took about one and a half hours, because it spent most of its time waiting for the training; whenever a checkpoint was saved, the evaluation job read that checkpoint and ran a new evaluation. To look at the results and analyze them, let's go back and open a Cloud Shell; let's wait for it, okay, now we have it, let me clear this. I want to start with the training part, so again, with TensorBoard you use logdir and point it at the folder that contains your event files; it then goes through those events and shows us what was happening during training. Let me run this and use Web preview, Preview on port 8080, and close this. Here we can see how the training developed: overall, the loss functions went down. For every loss we see some fluctuations, and I would say that's normal, since we are still using a fairly small batch size. If, for example, we used a better machine on Google Cloud Platform, and by better machine I mean the values you give to the master machine type and the master accelerator, we could do better; I'm using this configuration because I don't need to make a quota request for it and these machines are not very expensive, but if you want more machines, or more powerful ones, in order to increase the batch size, the price will of course be higher. If we go back and look at the loss function, the total loss essentially encapsulates all the other losses, so looking at it gives you a much better intuition about how your training went. We can see that the training went well overall, because the loss kept going down up to step 2k. But one thing you can notice immediately is that for the last steps, maybe the last 500 or even 800 steps, the loss did not go down very much.
If I were to run the training again, I probably wouldn't use 2,000 steps, because I can see that the loss basically stayed the same and didn't move much over those last steps, so I would probably have stopped the training earlier, around this step here. That is something I usually do: on the first training run you see how the training goes, and if the loss stays flat for a long time, then on the next run it's better to stop the training earlier. Overall, though, we can say this training run was good. Now I would like to look at the evaluation part, so let me stop this and run the same command, except now pointing to the eval folder. Just to reiterate, these folders live in our storage bucket: inside the bucket, under experiment one, then training process, when we point at the train folder we are looking at the events saved during training, so the loss values, and in the eval folder we have the events with the different metrics saved during the evaluation job. So let me go back and run this again, and this page should now show the evaluation part. When I refresh, I can see that the mean average precision has changed since the first evaluation. The first checkpoint, if you remember, is saved immediately when the training starts, and the other checkpoint was saved at step 1k. Although we chose 2k steps, we don't actually see a mean average precision value at the 2k step; I believe this is because the mean average precision did not improve there, so the value at the 1k step is the best mean average precision that we got. We can also see that the mean average precision is broken down for large, medium, and small objects. In our images, when the people we are detecting, with masks or without, are close to the camera, they look bigger, and when they are far from the camera they look small. What these graphs tell us is that on large objects, so when people are close to the camera, we have a more or less decent mean average precision, a value of around 0.6, or about 0.3 if you look at the smoothed curve; for medium objects it's around 0.3, about 0.2 smoothed; and the small objects have the lowest mean average precision, which tells us that our model is not performing well on small objects. If you run the training again, this is something you might want to focus on: you could collect and annotate new images with objects that are far from the camera, so that they appear small, which could improve the results, and you can also tweak the parameters in the config file for that purpose. This other plot is just the recall, but let's focus mainly on the precision here. In fact, if you are working in a team and trying to build an AI solution, it's better to focus on a single metric; this is something I read in a book by Andrew Ng.
He says it's better to focus on one metric when you are trying to optimize a deep learning model for an AI project. I usually focus on the mean average precision, because it tells me how my model is performing overall. Apart from this, we can also look at some images. These are images coming from the evaluation dataset: if you remember, in our experiment, under data and then test, we have a .record file containing images and annotations for data that the model did not see during training, or at least did not use for training; the model did see these images, because we ran the evaluation at the same time as the training, at the 1k step and at the first step, but it never used them to update its weights. That is why we use them to check how the model generalizes to images that were not used for training. On the left we have the predictions of our model, and on the right the ground truth. For the first image, we see that the model is performing very well: it detected all the faces with masks, it predicted the correct class, the bounding boxes are good, and the model is very confident that these people are wearing masks. The ground truth shows the same thing, so we didn't miss anything, which is good. Let's look at other images, maybe this one: again, on the right we have the ground truth and on the left the predictions, and we see that the model was able to detect these people wearing masks correctly. Let's look for something a little more difficult for the model, maybe this one. Here again, predictions on the left, ground truth on the right. One thing you can notice is the probability attached to each box: on the ground truth side everything is 100%, because a human annotated these images and we are fully confident in them, but on the left, for some predictions the probability is quite low. For example, this one says the person is wearing a mask, but with a low probability, I'm not sure whether it's 97 or 37 percent, but either way the model is not certain. Here we have a prediction of someone not wearing a mask, and the probability is a little low; the same thing here, this person is predicted as not wearing a mask with low confidence. We can also see that there is a person here that the model missed entirely: this person is not wearing a mask, the model should have predicted that, but it didn't. Again, this shows how our model underperforms when it comes to small objects in the image: when the objects are big, when the people are close to the camera and their faces are very clear, the model finds it easy to predict whether they are wearing a mask or not; when they are a little far away, it becomes more difficult for the model to make the correct prediction. But overall, I would say the model is doing well. I almost never expect a model to be perfect, to reach 100% accuracy and find every object.
But I do expect it to at least predict the correct classes for the objects that are clear to me as a human. These ones are very clear, so I would expect the model to do well on them; for the ones that are a little far away, I'm usually okay if it doesn't perform that well. Of course, there's always room for improvement and you can always do more to improve your model. And maybe this image is also a good example of where our model underperforms: these are mostly small objects, since most or all of the people here are far from the camera. The model is able to predict a lot of these people and whether they're wearing masks, but it also misses others, like these ones here and here. We can see this in the ground truth: there are so many people here, but in the prediction part we're not able to predict them all very well. But again, overall I am happy with these results, and I think this would be it for the training and evaluation part. What I would like to do now is take our trained model, the one that's inside our training_process folder, so this checkpoint here, ckpt-2, and use it in some Python application to make predictions on images. As you can see, we have several checkpoints here. Checkpoint 1 is the one that was saved as soon as we ran the training: whenever you run the training with the object detection API, it saves a checkpoint at the beginning, and that one is usually very bad because we're still at the start of the training. We also have one that was saved at the 1,000 step, ckpt-2, and another one that was saved at step 2,000, ckpt-3. I am going to choose the one called ckpt-2, because if we go back to our graphs here, we see that the model saved at the 1,000 step is doing well, and no better evaluation point was added afterwards, probably because the later checkpoints did not do better than this model. So we're going to take this ckpt-2, download it, and use it in a Python application to make predictions on our images. 32. Possible things to do to improve our model performance: I have already mentioned one possible approach to tackle the problem of the model not performing very well on small objects in the image, which was to maybe annotate new images where the objects are small and include them in your training. That's one possibility. Another thing you can try is going to the scales of the anchor generator in the config file. What you can do is add another scale that's even smaller than these, maybe 0.1. What this means is that we're going to generate some anchor boxes that are very small, and they will learn to converge to those small objects in our ground-truth images, which basically teaches the neural network to recognize small objects better. And maybe I would also remove this biggest scale here, because we probably didn't have that many training images where the object was that big. Just doing this, I believe, could improve the results. And since I have explained before what these parameters mean, I hope you understand the intuition: what I'm trying to do is generate anchor boxes that are very small, so that when I train my model, it's as if I am guiding it to detect small things as well.
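By the way, if you prefer to make this kind of change programmatically instead of editing the config file by hand, the object detection API ships a config_util module for reading and rewriting pipeline configs. This is only a rough sketch, under the assumption that your pipeline uses the standard Faster R-CNN grid anchor generator fields; the path and the exact scale values are placeholders for whatever you decide to experiment with.

```python
from object_detection.utils import config_util

# Placeholder path to the pipeline config you trained with.
pipeline_config_path = "experiment_1/faster_rcnn_pipeline.config"

configs = config_util.get_configs_from_pipeline_file(pipeline_config_path)
anchor_gen = configs["model"].faster_rcnn.first_stage_anchor_generator.grid_anchor_generator

# Add a smaller scale for tiny faces and drop the largest one,
# as discussed above (values are illustrative, not prescriptive).
anchor_gen.scales[:] = [0.1, 0.25, 0.5, 1.0]

# Write the modified config back out so it can be uploaded to the bucket.
pipeline_proto = config_util.create_pipeline_proto_from_configs(configs)
config_util.save_pipeline_config(pipeline_proto, "experiment_1/modified_config")
```

Editing the text file directly, as I do in the video, works just as well; a script like this is simply less error-prone if you plan to run many experiments.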
So this is another thing that you can do. And of course you can still play with the other parameters; maybe you can add more data augmentation so that you generate more images during the training. That's also a possibility. So there is a lot you can do, and just playing with these parameters can improve things. But the purpose of this course is to show you how to approach these things when you're building a deep learning model for object detection; we can't go through all the options and try them, because I don't think even the cloud credit we get from Google Cloud would be sufficient for all those experiments. But I am giving you some guidelines here and I encourage you to try them, and if you face a problem, don't hesitate to let me know. 33. Downloading the trained model and exporting the frozen model from checkpoints: Now that our training has finished, what we need to do is use our model for inference, that is, for making predictions on new images. As you remember, what we have now are these checkpoints that were saved during the training. Although they contain the weights and the architecture of our neural network, the convention is not to use them for inference at production time. When you put your model in production, whether on the cloud or on your laptop as part of a desktop application, or in a mobile application on Android or iOS, you will not be using the checkpoints; you will be using what's called the frozen model, and this frozen model is generated from these checkpoints. So that's what I'll be showing you in this video: the aim is to take the checkpoints and turn them into a frozen model that we can use to make predictions. The first thing I want to do is check which of these checkpoints is better, so that I only keep that one and use it to create my frozen model; as you can see, we have three different checkpoints here. So the first thing I'm going to do is activate a Cloud Shell here, and I will run TensorBoard again just to look at the evaluation events; based on that, we'll choose which checkpoint to keep. If you ran the TensorBoard commands before, you can use your arrow keys — like I am doing now with the up arrow key — to go back to the commands you used before and run them again without retyping them. As I mentioned, I want to look at the events in the evaluation folder, so I'm going to choose eval here and run the command. Once I see that the command is running — and of course we have to authorize this — and TensorBoard is running at this address, we go to Web Preview, preview on port 8080, and here we get our plots again. As I mentioned before, if we look at the mean average precision, we see that we have basically two points where the model was evaluated: the first at step 0, which represents checkpoint 1, and the other saved at step 1,000, which represents checkpoint 2. And of course, as you can see, we actually have three different checkpoints.
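As a side note, if you ever want to read these evaluation numbers without opening TensorBoard at all, you can parse the eval event files directly. Here is a minimal sketch, assuming you have copied the eval folder locally; the scalar tag name below is the usual COCO metric tag the API writes, but you can list the exact tags available in your events with ea.Tags().

```python
from tensorboard.backend.event_processing import event_accumulator

# Placeholder: a local copy of the eval events downloaded from the bucket.
eval_dir = "experiment_1/training_process/eval"

ea = event_accumulator.EventAccumulator(eval_dir)
ea.Reload()

# Print the mAP at every evaluated step, so you can see which checkpoint did best.
for event in ea.Scalars("DetectionBoxes_Precision/mAP"):
    print(f"step {event.step}: mAP = {event.value:.3f}")
```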
As for the third checkpoint: although the model was saved there, no evaluation was done for it. The model probably did not improve much, which is why the evaluation job did not add that point to the graphs. What we can see in this plot is that the model saved at step 1,000 is better than the model saved at step 0, because at step 0 we basically still had the same pretrained model that we took from the model zoo. Of course, that model knew nothing about masks, or what people wearing masks and people not wearing masks look like, so its mean average precision was 0; then it started to learn, and the mean average precision started to increase. So, to be clear: ckpt-1 is the model that was saved at step 0, ckpt-2 is the model that was saved at step 1,000, and ckpt-2 is the one we're going to take. So here I'm going to download these files. In fact, I think I already have them downloaded, so let me check... yes, I already have them here, so I'm just going to cancel this and use the ones I already saved. What you need to download are the two files for ckpt-2 — one for the index, one for the data — and you also need to download this file here called checkpoint. With that said, let's take the three files from here. I'm going to copy them and go back to the folders where we ran the experiments locally, and here I'm going to create a new folder called inference. Inference, in deep learning terminology, means that you just pass images through the model to make predictions; you're not running the training, so this is the inference phase. Inside it I'm going to create a folder called data, and inside that a folder called checkpoints, and here I will paste all of the files that were downloaded. As you can see, when the files were downloaded, Google Cloud Platform used a naming convention where it prefixes the file names with the names of the folders, so experiment_1, then training_process, and then the names of our files. You could keep them like this, but there is a problem you're going to face at a later stage, when we transform our checkpoints into a frozen model: the script we're going to use reads this first file here, called checkpoint. Let me open it and show you what it contains. This file holds a history of the checkpoints that were saved. It says model_checkpoint_path, pointing to the last model, ckpt-3, and every time there was a new checkpoint it was added here with its corresponding name, so ckpt-1, ckpt-2, ckpt-3; and if we look at the folder, we see the files named exactly like that. So when we run our script, it's going to have a problem: this checkpoint file describes the history of the training and the files that were saved, but when the script looks at our folder, it will see files with different names, so it won't recognize that, for example, this ckpt-3 or this ckpt-2 entry corresponds to these two downloaded files.
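A quick way to see exactly what the export script will read from that file is to parse it the same way TensorFlow does. A small sketch, assuming the downloaded files end up in an inference/data/checkpoints folder like the one we just created; it simply prints the fields you saw in the text file.

```python
import tensorflow as tf

# Placeholder: the folder holding the downloaded checkpoint, ckpt-2.index and ckpt-2.data-* files.
ckpt_dir = "inference/data/checkpoints"

state = tf.train.get_checkpoint_state(ckpt_dir)
print(state.model_checkpoint_path)             # the checkpoint the exporter will try to load
print(list(state.all_model_checkpoint_paths))  # the full history listed in the file
```

If the printed names don't match the actual files sitting in the folder, the exporter will fail, which is exactly the mismatch we fix next.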
Although "ckpt-2" does appear somewhere in the downloaded file names, our script is not going to match them because of all this extra part here that was added by Google Cloud Platform just to help us distinguish between our files — and that poses a problem for us. The easiest way to fix it is simply to remove everything that was added by Google Cloud Platform. So I remove this, I go here and remove this too, and finally we remove this one. Now we have the same structure and the same names that we have on Google Cloud Platform, and when we open our checkpoint file, everything matches: when the checkpoint file lists these names and our script reads our folder, it will immediately recognize that this ckpt-2 entry corresponds to this ckpt-2 file. Okay, let me close this, we don't need it anymore. Now that we have this, we're going to run the script that comes with the TensorFlow object detection API. Let me open a new terminal and activate my conda environment for TensorFlow 2 with GPU; as you can see, it's activated now. I'll clear this and cd to the folder that contains the TensorFlow object detection API, so that should be in here: TensorFlow, then models, then research. Now we are inside the research folder, which is inside our TensorFlow 2 object detection API, and the script we need to run is inside the folder object_detection. Let's take a look in there: the one we're going to use is this one. This Python script takes the checkpoints we have as input and gives us a frozen model as output, and that frozen model is what we'll use to make predictions — it's the file you would deploy in production, however you like; you won't be using the checkpoints in production. Let me clear this again. As for how to run the script: I've already written this down to help you run it easily, and I'll be sharing it with this lecture. To run the script, the first thing you give it is a mandatory argument that tells it which type of input we have — an image tensor — so I just set this to image_tensor; that's the default and it always stays like this. The second argument is pipeline_config_path, the path to our config file. Then we have trained_checkpoint_dir, which is the path to where we saved our checkpoints — this is why I wrote it here with the dollar sign and the model dir variable, and noted that it's the path to where your checkpoints were saved. And finally there is the output directory, the directory that will contain the frozen model. So let me take a look at how we're structuring our data here. First, we have the checkpoints saved here, and what I usually like to do is put everything I use for inference inside this inference folder, so that I know exactly what I used during the phase where I froze my model. For that, let me go back and take the config file that we used on the cloud — Ctrl+C — and add it here as well. The full export invocation will end up looking roughly like the sketch below.
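Putting those arguments together, the export call looks roughly like this. The folder paths are placeholders for your own layout, and I'm wrapping the command in a small Python snippet here only for readability; typing the equivalent command directly in the terminal, as I do in the video, works exactly the same.

```python
import subprocess

# Placeholder paths -- adjust them to your own folder layout.
exporter_script = "TensorFlow/models/research/object_detection/exporter_main_v2.py"
pipeline_config = "inference/data/faster_rcnn_pipeline.config"
checkpoint_dir = "inference/data/checkpoints"
output_dir = "inference/data/frozen_model"

# Run the exporter that ships with the TF2 object detection API.
subprocess.run(
    [
        "python", exporter_script,
        "--input_type", "image_tensor",
        "--pipeline_config_path", pipeline_config,
        "--trained_checkpoint_dir", checkpoint_dir,
        "--output_directory", output_dir,
    ],
    check=True,
)
```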
Is there anything else I need? No, I think that's it, actually. But we will be saving our frozen model into a directory, so I'm going to create a new directory and call it frozen_model. This naming convention, "frozen", is well known in the deep learning community: it means the model will no longer be eligible for training; it is made for prediction only. When we run exporter_main_v2.py, it optimizes our model for inference and removes any parts that are not used during inference. Now that we have this folder, let me go back to my notes and define these variables, so that it's easier to run the final command. The pipeline config path will be our configuration file, so I set that; the model dir will be the path to where our checkpoints are, so I add the checkpoints folder; and finally the output dir will be the path where I'll be saving my frozen model, so that's the frozen_model folder. Now we have everything necessary to run our script for freezing the model, so let me copy the command, paste it here, verify that everything is in place, and run it to see what we get. If everything goes smoothly, we should get some new files added into this folder here, so let's wait a little bit — this could take around a minute. In fact, I realize I forgot to change something, and I wanted to show you this error so that if you face it yourself, you'll know how to fix it. It says it did not find this checkpoint, ckpt-3. The reason is that when exporter_main_v2 runs, it reads this checkpoint file, and the model_checkpoint_path line points to the last checkpoint that was generated during the training. But since we are going to use ckpt-2, we need to change this line to ckpt-2. We don't really need to remove the ckpt-3 entries from the file; we just need to tell our script to look at ckpt-2 and not ckpt-3. Now that we have this, let me clear the screen and run the same command again — I'll just use the arrow keys to get the command back — and let's see what we get now. Here I wanted to show you this second error as well. As you can see, it says that some tensor is incompatible with another tensor's shape. When you read something like this, it can be very confusing and you may not know what the problem is or where it's coming from: you downloaded the checkpoints, put them in your directory, took the config file you used on your local machine, and you're just running this script. The problem, for me at least, is that the config file I am using here is the one I took from the local training. When I was running the training locally, some of the values were a little different from the values that were used on AI Platform. As you remember, I had one configuration file that I used locally.
Then I created another one for training on the cloud, made some changes, and uploaded it; and I believe I made further changes to that config file even after I uploaded it to the cloud. So the version of the config file that was actually used for training our model on AI Platform is different from the one saved on my local machine, and that is what's causing the problem. To solve it, you need to download the exact configuration file that you used on AI Platform when you were doing the training. I downloaded it, and it's saved in my Downloads folder here, so I'm going to take it, go back to inference, remove the old one, and paste the configuration file I just downloaded. And of course, as you can see again, when Google Cloud Platform gives us files to download, it changes their naming, so I'm going to rename it back to the original name. You could keep it as it was — that doesn't cause any problem — but I like to keep the same structure everywhere, so that it's easier for me to understand and remember what I have done in the past. Now I am sure that this is the exact configuration file that I used during the training on AI Platform. Let me go back to my terminal, clear it, and run the command again. Now everything should work fine; there shouldn't be any problem, because we now have here the exact same setup that we had on AI Platform, and it shouldn't take that long. And it seems that everything is okay: no error, no problem popping up, and our folder here is now filled with some new files — this is the frozen model. As you can see, the freezing process, if we can call it that, is finished, and we have this new folder containing some files. This one is just another configuration file, basically the same configuration file that we had, copied here; as you can see, it has the same values as our configuration file. What else do we have? We have a folder called checkpoint, where the checkpoints that we used were copied — the ones from our data folder. And finally, the main thing that was generated is the saved model here. As you can see, we now have these folders and this file, saved_model.pb. The .pb extension stands for protobuf: it's a binary file, so it's small, and it encapsulates the architecture of our model; in the variables folder we have the weights of our model. This is the new convention for frozen models in TensorFlow 2. In TensorFlow 1, what you would get at the end was just one large .pb file containing both the architecture and the weights, but the TensorFlow team chose a different approach for TensorFlow 2: we have a file with the .pb extension, a folder that contains the variables, and also a folder called assets.
But honestly, I still don't understand the purpose of this assets folder; it's always empty, but there's probably some specific case where it might contain something. Still, we keep it because it was generated. All of this together represents the frozen model, and it is what we're going to use to make predictions on our images locally. 34. Running the frozen model on new examples locally: Now that we have our frozen model generated, let's use it to make predictions on new images. For that, we're going to use some helpful scripts from a repository by a friend of mine: he made some very useful scripts for inference time, one called detector.py and another called detect_objects.py. I copied them to my local machine, and I made some small changes to them — I'll be sharing my version of the scripts with you, but the changes are very minor. Here is what these scripts basically do. The first one, detector.py, contains a class called DetectorTF, and this class helps us do several things. For example, it loads the label map — as you remember, the label map maps IDs to classes, so ID 1 would map to the class mask and ID 2 to the class no mask. It also uses some utilities from the object detection API. So here we're just loading some necessary files, like the label map, and clearing a session. When you see this command here, what it actually does is: if there is already a model loaded in memory, it clears it, so that if you run the inference once and then run it again, it first clears the memory of your machine and then continues to load the model, as shown in this next command. We do this because if you already have a model in memory and you try to load another one, you can sometimes get a memory error where your script freezes or something else prevents the inference from running. By clearing the session, we make sure the memory is empty — or at least that no model is loaded in it — and then we load the model again. Then there is a function that runs the detections. What we do there is create an input tensor: we take our image, which has three dimensions — width, height, and depth, where depth is the number of channels, usually three — and we add a new dimension, so the image becomes something like (1, height, width, 3). We add this new dimension because that is the shape used during training, and it lets us, for example, load mini-batches of images and pass them to the model: you can imagine having two, three, or four images all stored along that first dimension, each with the same width, height, and depth.
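To make the batch-dimension idea concrete, here is roughly what the loading and prediction steps boil down to if you strip the helper class away. This is a simplified sketch rather than the exact code from detector.py: the paths are placeholders, and the output dictionary keys are the standard ones returned by models exported with the TF2 object detection API.

```python
import numpy as np
import tensorflow as tf

# Placeholder path to the SavedModel exported in the previous lesson.
detect_fn = tf.saved_model.load("inference/data/frozen_model/saved_model")

# Read one image and add the batch dimension: (height, width, 3) -> (1, height, width, 3).
image = tf.io.decode_image(tf.io.read_file("test_images/example.jpg"), channels=3)
input_tensor = tf.expand_dims(tf.cast(image, tf.uint8), axis=0)

detections = detect_fn(input_tensor)

# Every output keeps the batch dimension first, so we take index 0.
boxes = detections["detection_boxes"][0].numpy()    # normalized [ymin, xmin, ymax, xmax]
scores = detections["detection_scores"][0].numpy()  # confidence values between 0 and 1
classes = detections["detection_classes"][0].numpy().astype(np.int32)  # e.g. 1 = mask, 2 = no mask
print(boxes.shape, scores[:5], classes[:5])
```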
That's essentially all the preprocessing; then we get the detections by passing that input tensor to the model. Once we have the detections, we decompose the information they contain, because the detection output holds many things: the coordinates of the bounding boxes, the classes — in our case mask or no mask — and the scores, or confidence levels. If you remember what I showed you in TensorBoard, some detections had around 99% confidence and some had maybe 77 or 74%; that's the score. It's usually shown as a percentage, but in fact the number is a value between 0 and 1, so something like 0.1 or 0.7 or 0.9, and so on. So here we're basically putting everything together and returning the detection boxes, and the extract_boxes function is just doing what I mentioned: putting everything together, appending the different boxes one after the other and returning them. Then there is a function that displays the bounding boxes. It takes the image, the coordinates of the bounding boxes, their confidence scores and their classes, and, as you can see, uses OpenCV to draw rectangles on our images using the coordinates we got from the detection part. And that's it — there's nothing complicated here. Everything is plain and simple: we make the predictions, take them, organize them into different bounding boxes, and then draw them. That's what the detector class helps us do. I just want to mention one thing: I think I made some changes when I was doing some testing, so let me verify whether there's anything I changed that you don't know about. Yeah, I think there's nothing here — maybe I just changed the thickness of the rectangles, because the line width was a bit thin, so I increased it a little. Apart from that, there's nothing specific, nothing complicated. Then we have the detect_objects Python file. This file uses the class defined in detector.py, and it makes detections either on a video or on a folder that contains images. For us, we'll be running the detection on images in a folder. So let's go back to our folder where we have everything: the checkpoints, the frozen model. Let's create a new folder called, for example, test_images, and inside it we'll add some images; once they're added, we'll use them to make predictions. Here we have these four images that we're going to use to test our frozen model: an image of my brother wearing his mask, an image of me wearing my mask, and two images from the dataset that we used — one where two people are wearing masks, and one with a bunch of people wearing masks, plus this person here who is not wearing any mask. We're going to run our script on these images and see what we get. One thing I want to mention: since I am recording and also going to run the inference, the memory could be overloaded, so I'm not sure whether it's going to freeze or not.
If my screen freezes, we'll just wait a little bit. There's also something else I'm going to do, which is saving the predictions — this script allows us to do that, as you can see here: this last parameter lets us save the output images, so at the end we should have output images that are the same as ours except that the bounding boxes will be drawn on them. So let's look at the parameters before we run the script. We're going to give it the model path, so the frozen model, and the path to the label map. We don't need to give it this one, because we want to show all the classes: this is for when you have many classes and you don't want to show all of them in the predictions. Say you have people with masks, people with no masks, and, let's say, a chair, and you don't want to see the predictions for the chair in your image — in that case you just pass the IDs of the classes you want to show, for example 1 for people with masks and 2 for people with no masks; if 3 corresponds to the chairs, you don't add 3 here. For us, we want to show everything, so we're not going to pass this. This threshold here is the confidence score: for example, you might choose to show only the objects with a confidence higher than 0.8, or 80%. In this case, it shows all the objects with confidence above 0.4; you could even decrease it to 0 or 0.01, where you would show even more detections. I think 0.4 is okay — in fact, I'm going to increase it a little bit, to 0.5. Apart from that, if you pass the video path argument, it means you want to make predictions on a video. We're not going to use that here, but I do encourage you to try it as well. Finally, we're going to pass the save output argument, and what this does is save the predictions, so our images will contain those rectangles. If I add a path to the output directory here, that's where the output will be saved; but I'm not going to add it here, I'll just set it where I pass the arguments. I wanted to save them in a folder called outputs inside the data folder, but let me change that: I'll create a new folder and call it output, and I want to use this folder as my output directory. So let me clear this, take this folder path, and copy and paste it in here. This means that any generated images — any images with the predictions drawn on them — will be saved in this output directory. Now let's try to run our script. To be able to see all the arguments, I'm going to run the script from here. First, again, we have to activate our environment, the TensorFlow 2 GPU one. I'll clear this and then run python detect_objects.py. I give it the path to the model: the model path will be the saved_model folder inside the frozen model. I also need to pass the path to the label map, and I don't have the label map in my folder here, so let me go back and add it here as well.
As I have mentioned before, I like to keep everything I used for a specific task in one place — even if that means copying things, it's just easier to understand and remember them that way. So I'm going to copy the label map from here, go back, and put it here. Okay, now I have the label map and I'm going to pass it here. What else do I need? I need to pass the directory where my images are, so the images dir — that will be the test_images folder. Apart from that, we only need to pass the save output argument; we don't pass any value to it, we just include it, and that means the output images will be saved into our output directory. So let's run this and see what we get. Again, this might take a little bit of time and it might freeze a little because we would be overloading the GPU memory, but let's hope it goes smoothly. So now we have some logs from TensorFlow — okay, it successfully opened the dynamic libraries for cuDNN — and now, as you can see, we have the first image. It's not displayed that well here because of the way OpenCV shows images, but I'll press space to go to the next image, and here we see another image with the bounding boxes and the labels: the class plus the confidence score. The text is a little bigger here because I made some changes, but I'll talk about that a bit later. This is another image with the different faces, and as you can see, this one says no mask, this one mask, this one mask. Then this is the image of me wearing my mask. And that's good — it didn't freeze. Now let's go to our output directory; the images should have been saved there. In fact, I had a problem here where I forgot to add a forward slash at the end of my path, which is why the images were not being saved; once I added the forward slash, the images were saved when the script ran. So now that the images are saved, let's look at them. I'm going to open the first image, of me wearing a mask. As you can see, our model was able to detect my face, and it says there is a mask — I am wearing a mask — which is correct, and it's a very confident prediction, basically 100%. That's good, because we never showed these images to our model during training and it was still able to predict them well. If we go back here, we see that it was able to detect all the faces. The text is a little big here, and that's because of the changes I made so that we'd be able to read it: my image is larger than the dataset images, so if I hadn't increased the size of the text, we wouldn't have been able to see it. It's also able to detect these ones, which is really good. Here it says no mask, and that's true: this kid has his mask pulled down a bit, he's not wearing it correctly, so it's predicted as no mask. This one is predicted as mask; this one we can't see because the text is cut off at the top; and this one also says mask, so it's a good prediction. Overall, I can say that our model is performing very well.
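If you ever want to reproduce these saved overlays yourself without the helper scripts, the drawing step really is just plain OpenCV. Here is a small sketch that assumes the boxes, scores and classes arrays come from the inference snippet shown earlier (normalized coordinates) and uses the same 0.5 confidence cut-off we passed to the script; the label IDs are the ones from our label map.

```python
import cv2

LABELS = {1: "mask", 2: "no mask"}  # must match the label map used for training
THRESHOLD = 0.5                     # same confidence cut-off we passed to the script

# boxes, scores, classes: outputs of the earlier inference sketch for this same image.
image_bgr = cv2.imread("test_images/example.jpg")
height, width = image_bgr.shape[:2]

for box, score, cls in zip(boxes, scores, classes):
    if score < THRESHOLD:
        continue
    # Boxes are normalized [ymin, xmin, ymax, xmax]; convert them to pixel coordinates.
    ymin, xmin, ymax, xmax = box
    top_left = (int(xmin * width), int(ymin * height))
    bottom_right = (int(xmax * width), int(ymax * height))
    cv2.rectangle(image_bgr, top_left, bottom_right, (0, 255, 0), 2)
    label = f"{LABELS.get(int(cls), str(int(cls)))} {score:.2f}"
    cv2.putText(image_bgr, label, (top_left[0], max(top_left[1] - 5, 15)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)

cv2.imwrite("output/example_with_boxes.jpg", image_bgr)
```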
So with this, you can now see how we went from the training to getting our checkpoints, to freezing them to obtain a frozen model, and then to using that model to make predictions on a new set of images. You have now seen basically the whole pipeline, covering both the training and the inference part.