#### Transcripts

1. 1 Understanding PyTorch download dataset: Hello and welcome to this video about implementing PyTorch. PyTorch is a great framework for creating neural networks. In order to utilize its power and customize PyTorch for your specific machine learning problem, you need to really understand what PyTorch is doing under the hood. To do this, we will first train a basic neural network on the MNIST dataset, and then implement, step by step, the functionality that PyTorch offers us. You'll see at the end how easy it actually is and how powerful the framework is. Okay, so let's get into that. What are we going to do first? We want to import a few modules, and then we want to download the dataset. In case you have not stored the MNIST dataset on your computer or laptop so far, we of course need to download it first. You can see I'm using a Jupyter notebook, but feel free to use any other editor you like; the notebook is just one of them. Just make sure that you have installed Python 3. That's important because I'm using the `pathlib` library, and `pathlib`, as far as I know at least, is only available from Python 3.4 onward. So make sure you have Python 3 if you want to follow along this way. Okay, so the first thing we're going to do is import from `pathlib`: from `pathlib` I want to import `Path`, because I want to use it. Then I will also import `requests`, just to download the data. `requests` is a great library for downloading data from the Internet. Of course, feel free to use any other library, such as `urllib`.
Of course, there are other options as well. But okay, now we've got the modules we need, and we can specify a path. In this case, `data_path` is equal to `Path`. Actually, I misspelled it here, so `Path`, like that. So `data_path` is `Path`, and we simply specify `data` in this case. Okay, that's fine. Then we give the path itself: `data` and then slash `mnist`. Okay, it's specified here: `data/mnist`. Now we can make the directory. We simply say `data_path.mkdir`, which creates the directory. In this case we say `parents` is equal to `True`, like that, and we also specify that `exist_ok` is equal to `True`. Those two parameters do the following. If this path which we have here does not exist and we specify `parents=False`, which is the default, then, if `data` does not exist, we will get an error. So with `parents=True` we make sure that the parent directories we need exist, and if they don't exist, we create them. `exist_ok=True` simply means that if we have already downloaded the dataset, so this data path already exists, we do not get an error. If it exists, that's okay, and we simply continue with our code. This matters, for instance, if you want to reuse your code or just rerun it: we want to make sure that we do not run into errors. That's why we specify those two parameters. Okay, so the next thing is to specify the URL. Now we've got the path where we want to download and store the data.
And now we need to specify the URL, so we say `url` is equal to, and now we need the URL of the MNIST dataset. We can use several sources, but one of them, where we can find it, is http://deeplearning.net/data/mnist/, with another slash at the end. Then, of course, there's also the file name itself. So we specify here `file_name` is equal to, and the file name is `mnist.pkl`; `pkl` simply means pickle, so the MNIST file here is pickled. Then we add `.gz`, because it's also zipped. So `mnist.pkl.gz` is the file which you find under this URL and which we want to download. Okay, now we've got the URL and the file name. Next we can check whether we already have the file or not. We simply say `if not`, then the path to the file, so in this case `data_path` and the file name itself, and then `.exists()`. It's a function, so we use the two parentheses here. If it does not exist, we have not downloaded the file yet and we want to download it. So we say `download`, as I call it, is equal to `requests.get`, and now we need the URL, so we specify `url` and add `file_name`, and then we say we want to have the `.content`. All right, so we download it using `requests`, and the content itself is stored in `download`. Then we create the file. We open the file path, so we simply say `data_path`, slash.
And now we need the file name itself, `file_name`, like that. Of course, we put this into parentheses here, so `(data_path / file_name)`, and then we simply say `.open`. We want to open it in write-binary mode, so `"wb"`, and then we say `.write`, and we want to write the content. Okay, like that. Here we go; we can run this and we're done. We do not encounter any errors. If you download it for the first time, it will probably take a little longer to execute the cell, but I had downloaded it before, which is why it ran really fast for me. So, to recap what we are doing: we import `Path` from `pathlib` and also the `requests` module. We specify the file path and ensure that even if this path already exists, we do not get any errors. We specify the exact URL from which we want to download the file, and also the name of the file we want to download. Then we simply check whether we have already downloaded it, so does this path exist or not? If not, we call `requests.get` on the URL and file name and take the content, so this file from this web page is downloaded and stored in this variable. Then we open exactly the file path which we specified up here, and we say we want to write something into this file in binary mode, namely what we downloaded. Oh, sorry, not `content`; something's wrong here. It should be `download`, like that. Run it again and, okay, we also get no errors. So that's it for downloading the data and for this video. Hopefully see you in the next video. Until then, bye guys.
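
The download step narrated above can be sketched roughly as follows. This is a minimal sketch, not the exact notebook cell from the video: the helper name `download_if_missing` and the upper-case constant names are assumptions made for illustration, and the URL and file name are the ones spoken in the video (whether that host still serves the file is not checked here).

```python
from pathlib import Path

import requests

# Names below are illustrative; the video uses its own variable names.
DATA_PATH = Path("data")
PATH = DATA_PATH / "mnist"
URL = "http://deeplearning.net/data/mnist/"  # as spoken in the video
FILENAME = "mnist.pkl.gz"


def download_if_missing(url, target):
    """Download url to target unless target already exists (hypothetical helper)."""
    # parents=True creates missing parent directories such as data/;
    # exist_ok=True avoids an error when the cell is rerun.
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():
        response = requests.get(url)
        response.raise_for_status()  # fail loudly instead of saving an error page
        target.write_bytes(response.content)
    return target
```

Calling `download_if_missing(URL + FILENAME, PATH / FILENAME)` would then fetch the file once and skip the request on later runs, which matches the rerun behaviour described in the video.
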
2. 2 Understanding PyTorch loading the dataset: Hello and welcome back to the next video. We downloaded the dataset in the last video. The dataset is in NumPy array format and has been stored using pickle, a Python-specific format for serializing data. So we need to use the `pickle` module as well as the `gzip` module to unzip our data and read it. In order to do this, we create a cell here and import two new modules: we import `pickle`, and I will also import the `gzip` module. In case you encounter any errors, you need to install the missing packages first. All right, so I have those, and now I can simply open my file. I say `with gzip.open`, and the reason I'm using the `with` statement is simply that I do not need to call `close` afterwards; the file will be closed automatically. So `gzip.open`, and now I need my `data_path` and also my `file_name`, because I want to open exactly this file path where I downloaded the data; you can see it here. Then I say `.as_posix()`. It returns a POSIX-style path and is specific to the `pathlib` library, which is why we need to call it. So `as_posix`, and it's a function. Here we also specify that we want to read binary. Remember, we wrote in binary to this file path, so we want to read it in binary mode, which we do with `"rb"`. Let me close this here first, and then we say `as f`. Actually, there's a parenthesis missing, like that. Okay, `as f`; `f` is simply an abbreviation for file. So we open it, and now we need to store our data. The dataset is stored as a training and a testing dataset, and each contains the data itself, so the image itself.
And then, of course, also the label, so which number is on the image. So we say `x_train`, comma, `y_train` in parentheses; that's the first one. Then a comma, and then `x_valid` and `y_valid`. Then we also use a comma here and an underscore, and all of that is equal to `pickle.load`. We want to load the pickle file, so we load `f`, and we specify the `encoding`. We need to specify the encoding because the default does not work for this file; in our case we have to say it's `latin-1`. And here we go, we can run this. Oops, an error, because I missed an underscore here. Okay, run this again, and we can see that we have now opened our data with the `pickle` module. We loaded the data, decoded it, and stored it in those four variables. So now we've got our data, and of course we can take a closer look at it. If I simply type `x_train.shape` and run this, you can see we've got 50,000 images, each with 784 numbers in a row, which are simply the pixels. We could also take a look at `y_train`, so `y_train.shape`. Run this, and we again get 50,000 here for this array. All right, so that's the data. In the next video, we want to visualize the data, so stay tuned for that. See you there.
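
As a rough sketch of the loading step just described, the snippet below wraps the `gzip.open` and `pickle.load` calls in a function. The function name `load_mnist` is an assumption for illustration; the tuple layout and the `latin-1` encoding follow the narration. Note that `gzip` and `pickle` ship with Python's standard library, while unpickling the real MNIST file additionally needs NumPy installed.

```python
import gzip
import pickle
from pathlib import Path


def load_mnist(path):
    """Read the gzipped pickle and return the four arrays (hypothetical helper)."""
    # The with-statement closes the file automatically.
    with gzip.open(Path(path).as_posix(), "rb") as f:
        # The file holds three (inputs, labels) pairs; the third one is
        # discarded with "_", as in the video.
        (x_train, y_train), (x_valid, y_valid), _ = pickle.load(f, encoding="latin-1")
    return x_train, y_train, x_valid, y_valid
```

With the downloaded file in place, `x_train, y_train, x_valid, y_valid = load_mnist("data/mnist/mnist.pkl.gz")` would give the four arrays, and per the video `x_train.shape` then reports 50,000 rows of 784 pixels.
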
3. 3 Understanding PyTorch visualizing MNIST: Okay, each of those 50,000 images in the data is an image with a width and height of 28 by 28, and it's stored as a flattened row of length 784. You can see it here: 28 times 28 is simply 784. So in order to visualize this, we need to reshape the data into a two-dimensional array first. To do this, we want to import `matplotlib.pyplot`, so we say `import matplotlib.pyplot as plt`; `plt` is simply an abbreviation. Then we also need a magic command, at least in a Jupyter notebook, because in order to visualize something within the Jupyter notebook we need this special symbol. So we call `%matplotlib inline`, like that. This is a magic command within Jupyter notebooks, used simply in order to display something within the notebook. Okay, so we've got those two things, and then I will also import NumPy, because remember, currently these are still NumPy arrays. We will convert those into tensors later, but for now, for plotting, we can use the NumPy arrays. So we say `import numpy as np`. Okay, so we've got all the modules we need, and now I can simply call `plt.imshow`, so show me the image, and then we specify `x_train`. We only want to visualize one image, so the first one, and we say `.reshape`, because we need to reshape the data from the 784-element flattened image into a two-dimensional array, 28 by 28 in this case. We can also specify a color map here, so we say `cmap` is equal to `"gray"`. Okay, because it is gray.
Grayscale images have only one color channel. It's not RGB; it's only gray. Okay, and then we can simply test this `plt.imshow` call. We run this, and we can see that we have the image, and clearly it looks like a five. Of course, we can take a look at another image, for instance the tenth. We run this, and we can see it's probably a three. All right, so that's it for visualizing the data. In the next video, we're going to start converting our NumPy arrays into tensors, which we can then use within PyTorch. Remember, tensors are multi-dimensional arrays, and they are the specific data format which we need in order to then create the neural network. Okay, see you there in the next video. Until then, bye guys.
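
The reshape-and-plot step from this video can be sketched as below. The function name `plot_digit` is an assumption for illustration; in a Jupyter notebook the cell would additionally start with the `%matplotlib inline` magic, which is notebook syntax and therefore only mentioned in a comment here.

```python
import matplotlib.pyplot as plt
import numpy as np

# In a Jupyter notebook, run "%matplotlib inline" first so plots appear in the cell output.


def plot_digit(flat_pixels):
    """Reshape a flattened 784-pixel row to 28x28 and draw it (hypothetical helper)."""
    image = np.asarray(flat_pixels).reshape(28, 28)
    # One color channel only, so a gray color map is used.
    plt.imshow(image, cmap="gray")
    return image
```

With the data loaded, `plot_digit(x_train[0])` corresponds to the first image shown in the video; passing another row of `x_train` shows a different digit.
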
4. 4 Understanding PyTorch from numpy array to tensor: Okay, welcome back. Before we can actually use PyTorch, we first need to convert our data, because currently what we have are NumPy arrays. What we need instead are tensors, because PyTorch works with tensors. So we need to convert the data type. Let's take a quick look at that. First, let's check that we really have NumPy arrays. I can simply say `print`, then the `type` function from Python, and put `x_train` in it as an example. I run this with Control-Enter, and you can see I get a NumPy n-dimensional array; that's what `ndarray` means. What we need, as I said before, are tensors. So we convert `x_train`, `y_train`, `x_valid`, and `y_valid` into tensors. There are various ways to do this, but one of the easiest is to use the `map` function. `map` is a built-in Python function which we can use to map a specific function, which we get from the `torch` module, onto our data. First we need to import PyTorch, so we say `import torch`, like that. Then, for the data, we say `x_train`, `y_train`, `x_valid`, and `y_valid` is equal to `map`, and now we need a function. We get this function from the `torch` module, which is why we imported it: we say `torch.tensor`. `torch.tensor` is the function that converts the data into a tensor. Now we need to give it the data onto which we want to map the function. We put it in parentheses: `x_train`, `y_train`, `x_valid`, and `y_valid`, like that. Okay, so now we have converted our NumPy arrays into tensors, which we can work with.
Okay, that's the data we have. What we also want to do is save the shape of our training data, because we will use it later on. So I say `amount`, as well as `size`, is equal to `x_train.shape`, like that. Let's print something first: I call `print` and the `type` function on `x_train`, like that. If we run this, we can see that I now get a `torch.Tensor`. So my data type has been converted: by using `map` with the `torch.tensor` function on the data, we now get Torch tensors for `x_train` and `y_train`, and also for `x_valid` and `y_valid`. You can print those as well, of course. Now let's also take a look at the amount and the size. Let's print `amount`, which is simply our data size; in this case, you can see, it's 50,000. That's totally fine. And let's also print `size`; we run this and get 784. Okay, I just wanted to store these two parts of the shape in variables. I run this again. Okay, so that's all I need. We could also take a look at a few other attributes, for instance the `y_train` min as well as the max. We could also say `print` and then `x_train` at index zero, so the first element. We run this, and we can see what the data looks like. That's the tensor we have. We could also take a somewhat closer look at the dataset itself: I could say `print` and then `y_train.min()`, like that, and you could also print `y_train.max()`.
Okay, just to take a look at our labels. If we run this, we can see the minimum is a tensor of zero and the maximum is a tensor of nine. And that's exactly right, because we have ten digits in total, from 0 to 9. Okay, that's it for converting our data into tensor objects. Now we can dive deeper into PyTorch and create the neural net in PyTorch. Thanks for watching, and I hopefully see you in the next video. Until then, bye guys.
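
The conversion described in this video can be tried out without the real dataset. The sketch below uses small stand-in NumPy arrays (an assumption made so the snippet runs on its own; the real arrays have 50,000 training rows) and then applies `map` with `torch.tensor` exactly as narrated.

```python
import numpy as np
import torch

# Stand-in data so the snippet is self-contained; with the real dataset these
# four names would come from the pickle file instead.
x_train = np.random.rand(100, 784).astype(np.float32)
y_train = np.arange(100) % 10
x_valid = np.random.rand(20, 784).astype(np.float32)
y_valid = np.arange(20) % 10

print(type(x_train))  # a NumPy ndarray before the conversion

# map applies torch.tensor to each of the four arrays in turn.
x_train, y_train, x_valid, y_valid = map(
    torch.tensor, (x_train, y_train, x_valid, y_valid)
)

amount, size = x_train.shape  # number of images and pixels per image
print(type(x_train))          # now a torch.Tensor
print(amount, size)
print(y_train.min(), y_train.max())
```

On the real data, `amount` and `size` would be the 50,000 and 784 mentioned in the video.
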
5. 5 Understanding PyTorch weights and biases: Hello and welcome back. Now, in order to create the neural network, we need some weights and biases, right? We also need to create an activation function as well as the model itself. So let's get into that. First we import a library: we say `import math`; we'll use this in just a second. Then we instantiate the weights. We say `weights` is equal to, and we use a function from `torch`: `torch.randn`, which generates random numbers. Then we say 784, comma, 10. The reason we do it this way is, remember, that our data has 50,000 images, each with a length, because they are flattened, of 784, which is 28 times 28. So for the weights we need 784 as the first dimension and 10 as the second, because we get 10 output values, which are our classes. In addition to that, we divide this by `math.sqrt`, the square root, of 784. Why are we doing this? This is a process called Xavier initialization, and research has found that it helps to make the training of the neural network more stable. That's why we do it. You could also try it without it, but this is what research tells us. All right, so we have instantiated the weights. Next we call another function in PyTorch, which is `requires_grad_`. I'm using my weights again and say `weights.requires_grad_()`: `requires`, underscore, `grad`, and an underscore again, like that. Okay, so what does this do?
Well, PyTorch allows us to record all operations done on the tensors, and this is a great thing because it allows us to calculate the gradients during the backpropagation process automatically. It simply means that we do not have to do anything ourselves later on. Simply by calling this `requires_grad_`, PyTorch records all the operations and does everything for us. But you'll see this later on. Now, one more thing: why do we use a trailing underscore here? This underscore has a meaning. It simply means that the operation happens in place. Here, for instance, we say `weights` equals and store this part in the `weights` variable. But here we simply say `weights.requires_grad_()`. With the underscore, instead of assigning the result back with `weights = ...`, we simply write `weights.requires_grad_()` with the underscore and the two parentheses, so it's in place, on the weights. We'll do something similar for the bias, of course. We say `bias` is equal to `torch.zeros`, so we instantiate it with zero values, and we need 10 of them because of the shape. Here we simply pass `requires_grad` is equal to `True`. Okay, we can instantiate it this way. And again, the reason why we do it this way for the bias, while for the weights we first divide by the square root and then call `requires_grad_` like that, is simply the Xavier initialization at the beginning: it should not be part of what `requires_grad` records. That's why we have two steps here, just in case you're wondering. Otherwise we could do it exactly the same way: `torch.randn` with 784, comma, 10, and then simply say `requires_grad` is `True` for the weights.
Okay, but because we do this kind of initialization to make our neural network more stable during the training process, we do it in two steps. All right. What we do next is create our activation function, and we do this in plain Python at the beginning. Later on, you'll see that we can leverage the whole PyTorch library and make it much easier. But to really understand PyTorch from scratch, we start doing it from scratch. So we say `def` for a function, and we can call it whatever we like. I'll call it `log_softmax`, because it's the logarithm of the softmax function. We put in some input value, which is this `x`, and then we want to return something. We say `return` and then `x` minus `x.exp()`, the exponential function, then `.sum(-1)`, then `.log()` for the logarithm, and then we unsqueeze it, so `.unsqueeze(-1)`. So that's our log softmax, the activation function which we want to apply after multiplying by the weights and adding the bias. All right, so this is the activation function. Then, of course, we also need to create the model, and the simple model will be a linear model. Again we create a function here, we call it `model`, and we put in some input values, `xb` here for instance. Then we want to return something: we call our activation function, `log_softmax`, like that, and we put in the input values, which is this one. Then we do a dot product, and in order to do this we can use the `@` symbol. The `@` symbol simply means a dot product operation. We do the dot product with `weights`, like that, and then we simply add the bias. Okay, `bias`.
So it's a simple linear transformation. But for now, it's in plain Python, without using any additional PyTorch functionality. That's it for instantiating the weights and the bias, and also for creating the activation function as well as the model. The next thing is to define the batch size. We do not want to feed in all 50,000 training images at one time; we want to do it in batches, so in samples of those 50,000. First let me run this. Okay, and now let's create the batch size. Let's say `batch_size` is equal to 64, for instance. Of course, feel free to use any other batch size if you like, but let's say 64 here. Then one mini-batch would be `xb`, so `xb` as a batch is equal to `x_train`: from the training dataset we want 64 images, so we slice starting from zero up to the batch size. So give me the first 64 images, and this is my first batch. Now we can make our predictions: `prediction` is equal to `model`, so we're using our model, and we simply feed in our batch, `xb`. So we are using this function here, multiplying by the weights, adding the bias, applying the log softmax activation, and we get our prediction, like that. Now we can take a look at it. We say `print` and then `prediction`, and we want the first prediction, so index zero, like that. And let's also print the prediction's shape, so `prediction.shape`, like that. If we scroll up a little bit and run this, you can see what we get. This is our output: we get a tensor, this one here, which consists of 10 values. This is the prediction for the first image, so which number is on this image.
And of course, because we have 10 digits from 0 to 9, we get 10 entries here: one, two, three, four, five, six, seven, eight, nine, and ten. What you also get is this `grad_fn` here, and that's exactly what we need; it's why we said `requires_grad` is `True`. During the training process, as we said before, PyTorch saves all the gradients which we need later on. As for the size, it's of course 64 by 10, because here we only printed the first one, but our batch consists of 64 of those rows, each with 10 columns. So there are 64 of them, and each has a length of 10, because we have 10 digits here. In case you're wondering, our prediction would normally be the highest of those values. In this case, for instance, the highest value is probably minus 1.76. So this first prediction would be a zero: our model currently says that there's a zero on this first image, because this value here is the highest of all of them. Okay, so that's it for instantiating the weights, creating the model from scratch, and making a first prediction. Thanks for watching, and see you in the next video. Until then, bye guys.
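
Putting the pieces of this video together gives roughly the following sketch. The random stand-in for `x_train` and the fixed seed are assumptions made so the snippet runs without the dataset; the weight initialization, `log_softmax`, `model`, and the batch slicing follow the narration.

```python
import math

import torch

torch.manual_seed(0)              # only so repeated runs print the same numbers
x_train = torch.rand(256, 784)    # stand-in for the real training images

# Scaled random weights (the initialization the video calls Xavier), created
# first and only then marked as requiring gradients, so that the scaling
# itself is not recorded.
weights = torch.randn(784, 10) / math.sqrt(784)
weights.requires_grad_()          # trailing underscore: in-place operation
bias = torch.zeros(10, requires_grad=True)


def log_softmax(x):
    # log(softmax(x)) = x - log(sum(exp(x))), computed per row
    return x - x.exp().sum(-1).log().unsqueeze(-1)


def model(xb):
    # Linear transformation followed by the activation function
    return log_softmax(xb @ weights + bias)


batch_size = 64
xb = x_train[0:batch_size]        # first mini-batch
prediction = model(xb)
print(prediction[0])              # 10 log-probabilities plus a grad_fn
print(prediction.shape)           # 64 rows, 10 columns
```

Because the outputs are log-probabilities, exponentiating a row and summing it gives 1, which is a quick sanity check on the hand-written `log_softmax`.
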
6. 6 Understanding PyTorch loss and accuracy: Hello and welcome back. Our next task is to create the loss function, because each neural network does not only need to make a prediction; it also needs a loss function in order to calculate how far away we are from the true value, and then to do the backpropagation process in order to optimize and tweak our neural network so that it makes better predictions. As a loss function, we will use the negative log-likelihood. Together with the log softmax from before, this loss is what many of you probably know as the cross-entropy function, which we'll use later on as part of the PyTorch package. But if you do it in plain Python, you create a negative log-likelihood function in order to compute the loss. So we say `def`, and now the loss function, `nll`, which simply means negative log-likelihood. This function takes an `input`, which will be the prediction of our neural network, so the output from our model, and also a `target`, which will be the true values for the images, so the `y_train` values. Now we return something, and the negative log-likelihood function looks as follows: minus `input`, and then in square brackets `range` of `target.shape` at index zero, comma, `target`, like that. The parenthesis here must be there. Then we call the `mean` function on it. So this is simply the loss function. And just in case you're wondering, you do not need to memorize this. This is just to understand the plain Python code. Later on we use the functionality in PyTorch, and you do not have to remember any of those functions here.
It's really just to understand things and to start from scratch, and then go step by step toward really using and understanding PyTorch. Okay, so this is the negative log-likelihood function, and now that we have the loss function, we can instantiate it. We say `loss_func` is equal to `nll`, so this is our loss function, and we run this. Okay, so we have got it implemented, and now we can calculate our loss. We say `yb` is equal to `y_train`, and we start from zero and end at the batch size. These are simply the first 64 labels, so the labels for the first batch. Up here we made the predictions for the first 64 images, and here are the true labels for the first 64 images. Okay, now we've got this, and then we can feed both of them into our loss function. We say `print` and call `loss_func`, like that, and we put in our predictions, so `prediction`, and also the true values, which is `yb`, like that. We run this, and you can see we get a tensor here, which is our loss; it is currently 2.258. We also get the gradient function, which in this case is the negative backward, because we took the negative log-likelihood. This is the function which will be used later on to do the backpropagation process itself. All right, one more thing we can do here is take a look at the accuracy, which is one of the metrics we can measure later on in order to estimate how good our model is. In this case we also use a function: we say `def` and implement a function called `accuracy`, like that, and this function needs an output.
So it takes the output and the true values, `yb`, like that. Then we say `prediction` is equal to `torch.argmax`; it's a function, `torch.argmax`, and we call it on `output`. We also need to specify the dimension, and the dimension will be one, like that. Why do we specify it this way? Remember what I said up here: when we make the prediction, we get a tensor with 10 values for each of the images, and I said that the highest of those values, in this case this minus 1.76, is the prediction of our neural network. That is why we call `torch.argmax`: the argmax function gives us exactly this value, or rather, it gives us its position. That's the important thing, because we do not want minus 1.76; we want zero. So 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9; these are the digits. When we call `torch.argmax`, we want to get the number zero if this first entry is the highest value, and that's the prediction of our neural network. We call it with dimension equals one because we want it across this dimension here, so across the columns. All right, so we've got the prediction, and now we simply need to check whether it is exactly the same as the true value. So we say `return` and then `prediction` equals equals `yb`, the true value. Is the prediction equal to the true value? If that's the case, we get a `True`; if not, we get a `False`. We then need to convert those trues and falses into float values, so into ones and zeros, and we do this by calling `.float()`.
And then we call the `.mean()` function, because we want an average over our batch size. Okay, we can run this, and now we want to take a look at the accuracy we currently have. We print something and call the accuracy function: we put in the prediction, and we also put in the true values, which is `yb`. Then we can run this, and we get a tensor of 0.125, so basically 12.5%. This is pretty bad for a model, but of course it is totally normal, because we have not trained it. All we did was instantiate the weights and biases, create an activation function and create the model itself. Then we calculated the output of the model for the first batch of 64 images without any training, and we used the loss function to calculate the loss, which is pretty high. This is also the reason why the accuracy is so low: the outputs of the model currently do not really match the true values. So that's where we start, and now we can dive into PyTorch and actually optimize this. Hopefully you're excited. Can't wait to see you in the next section. Until then, bye guys.
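The loss and accuracy calculation from this section can be sketched roughly as follows. This is a minimal sketch, not the exact notebook code: random tensors stand in for the MNIST data, and the names `log_softmax`, `nll`, `x_train`, `y_train` and `batch_size` are assumptions about cells that are not part of this section.

```python
import math
import torch

torch.manual_seed(0)
batch_size = 64

# Stand-in data (assumption): random "images" and labels instead of MNIST.
x_train = torch.rand(256, 784)
y_train = torch.randint(0, 10, (256,))

# Weights and bias as plain tensors that track gradients.
weights = torch.randn(784, 10) / math.sqrt(784)
weights.requires_grad_()
bias = torch.zeros(10, requires_grad=True)

def log_softmax(x):
    # Assumed activation function: log of the softmax over the 10 classes.
    return x - x.exp().sum(-1).log().unsqueeze(-1)

def model(xb):
    return log_softmax(xb @ weights + bias)

def nll(pred, target):
    # Negative log likelihood: pick the log probability of the true class.
    return -pred[torch.arange(target.shape[0]), target].mean()

loss_func = nll

def accuracy(out, yb):
    # Position of the highest of the 10 values per image = predicted digit.
    prediction = torch.argmax(out, dim=1)
    return (prediction == yb).float().mean()

xb = x_train[0:batch_size]
yb = y_train[0:batch_size]
prediction = model(xb)
loss = loss_func(prediction, yb)
acc = accuracy(prediction, yb)
print(loss, acc)
```

With an untrained model, the accuracy should sit somewhere near chance level, which matches the low value described above.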
7. 7 Understanding pyTorch training our neural network: Hello and welcome back to training the neural network. In order to train it, we need a training loop. This training loop will first select a mini batch of our data of the size of the batch size. Then we use the model to make the prediction, and we calculate the loss from the predictions of the model and the true values. Then we call `loss.backward()`, which is a specific function in PyTorch that computes the gradients for the parameters of the model, in this case the weights and the biases, so that we can update them. So let's get into that. First we specify the hyperparameters, but let me first print `amount` again, because we will use it, and it is 50,000. Okay, just to remember that. Let me get rid of this. We use a learning rate equal to 0.5. This is simply the value we're going to start with, but of course you can tweak this and play around with it. The other hyperparameter is the epochs, and we say `epochs` is equal to three. We currently have 50,000 training images, and three epochs means we go through all those 50,000 images three times during the training process. Now we need a loop, so we say for epoch in range epochs, and of course I need parentheses around epochs. This is the first loop. The second loop goes through each of those epochs in batches, which is why we specified the batch size of 64. Each epoch consists of 50,000 images, and we loop through them in batches of 64. So we say for i in range, and then `amount`, that's the 50,000 here, minus one.
And then we do an integer division. The double slash simply means integer division. We need it because we are not able to go through 3.5 batches, for instance; we need either three or four loops, so we can't work with float values. This is why we use the double slash here to get an integer division: we divide by the batch size and add one. Okay, that's the training loop. Inside it we specify a start and an end. We say `start` is equal to `i` times the batch size. The end will be equal to the start plus the batch size. So, for instance, we start from zero and end at zero plus 64, so at index 64, which is excluded, because the slice actually runs from 0 to 63. All right, so we have start and end, and now we put the batch into a variable. We say `xb` is equal to `x_train` from start to end. And for the labels, `yb` is equal to `y_train`, also from start to end. Now we need to make a prediction, so we say `prediction` is equal to `model`, and we put in our data, which is the batch itself. Then we calculate the loss: `loss` is equal to the loss function, called with our prediction and the true labels. Now we have calculated the loss, and what we want to do next is calculate the gradients, right?
Because remember, when we called the loss function, we had this grad function. So under the hood, PyTorch was already storing all the values needed for calculating the gradients. The function exists, and PyTorch is able to do the backpropagation now. But in order to actually do the backpropagation, we need to call the backward function. So we say `loss.backward()`. If we call this function, which is a built-in function in PyTorch, then PyTorch calculates the gradients for all the values, starting from the loss. It does this automatically, so there is no need for us to do any manual calculations. After that we use a wrapper: we say with `torch.no_grad()`. Inside it we update our weights and biases. We say `weights` minus-equals `weights.grad` times the learning rate. This is why we specified the learning rate as a hyperparameter. So we update the weights by subtracting the gradient of the weights times the learning rate. This is the gradient descent step, which uses the gradients from backpropagation. We do the same for the bias: `bias` minus-equals `bias.grad`, also multiplied by the learning rate. And then finally we set the gradients back to zero. We say `weights.grad.zero_()`, and we do the same for the bias with `bias.grad.zero_()`.
The trailing underscore stands for the in-place operation. And I made a mistake here: it's `grad.zero_()`, with the underscore, because we need an in-place operation. So again, why are we doing this? We do the backpropagation, and after that we want to update the weights and the bias. But for this updating of the weights and biases we do not want to store gradients. We do not want to track any gradients in this step; we simply want to update the weights and the bias. And then we want to set the gradients back to zero. The reason is that we do not want to accumulate the gradients during the training process. We want to calculate the gradients for one batch, update the weights, and then start from scratch for the next batch. This is why we call `grad.zero_()` and also wrap the update in `torch.no_grad()`. Remember, it will get easier in just a few steps, but this is where we start. All right, so that's it for the training loop. Now we can run this. We wait, and we're done, and we can take a look at the loss function and the accuracy. We say print and call the loss function with the output of our model for `xb` and the true values `yb`. After the training process the loss should be smaller now. So this will be the loss.
And then we also want to print the accuracy, so we call the accuracy function with the model output for `xb` and also the true labels `yb`. Then we can run the cell and take a look. We can see that the loss tensor is now 0.71 for me, and we also get a tensor of one for the accuracy. That one is simply rounded; it is not exactly 100%, it's just rounded up. But you can already see that, starting with a loss of 2.258 and so on, we have now arrived at a much smaller loss after the training loop. So the training worked for us. We also started with an accuracy tensor of 0.125, and now the accuracy is close to one after the training process. So you can see that the training process works, but it's still a little bit cumbersome. You'll see that as we dive deeper into PyTorch it will get even easier than now. As always, thanks a lot for watching and hopefully see you in the next video. Until then, bye guys.
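The manual training loop described in this section might look roughly like the sketch below. It is only an illustration under stated assumptions: a small synthetic data set (640 samples, 20 features) stands in for the 50,000 MNIST images, and the model and loss definitions are assumed to match the earlier cells. The hyperparameters (learning rate 0.5, three epochs, batch size 64) are the ones from the video.

```python
import math
import torch

torch.manual_seed(0)

# Stand-in data (assumption): the label is the index of the largest of the
# first 10 features, so a linear model is able to learn it.
amount, n_features, n_classes = 640, 20, 10
x_train = torch.randn(amount, n_features)
y_train = x_train[:, :n_classes].argmax(dim=1)

weights = torch.randn(n_features, n_classes) / math.sqrt(n_features)
weights.requires_grad_()
bias = torch.zeros(n_classes, requires_grad=True)

def model(xb):
    logits = xb @ weights + bias
    return logits - logits.exp().sum(-1).log().unsqueeze(-1)  # log softmax

def loss_func(pred, target):
    # Negative log likelihood of the true classes.
    return -pred[torch.arange(target.shape[0]), target].mean()

batch_size = 64
learning_rate = 0.5
epochs = 3

loss_before = loss_func(model(x_train[:batch_size]), y_train[:batch_size]).item()

for epoch in range(epochs):
    # Integer division so that a partial last batch still gets its own step.
    for i in range((amount - 1) // batch_size + 1):
        start = i * batch_size
        end = start + batch_size
        xb = x_train[start:end]
        yb = y_train[start:end]
        prediction = model(xb)
        loss = loss_func(prediction, yb)
        loss.backward()              # compute gradients for weights and bias
        with torch.no_grad():        # the update itself must not be tracked
            weights -= weights.grad * learning_rate
            bias -= bias.grad * learning_rate
            weights.grad.zero_()     # in place: do not accumulate gradients
            bias.grad.zero_()

loss_after = loss_func(model(x_train[:batch_size]), y_train[:batch_size]).item()
print(loss_before, loss_after)
```

Printing the loss on the first batch before and after the loop mirrors the check done in the video.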
8. 8 Understanding pyTorch Make our code easier: Hi, and welcome back to this video. Now we're going to dive into PyTorch even more. Let's take a look at what we've done so far. We have a trained model. We instantiated the weights and biases at the beginning, and we then wrote our own functions: the activation function as well as the model itself, which returns our predictions. We also wrote the loss function and the accuracy function, and then we did the looping and the training process. So this is what we did. But as you can see, and probably also have in mind, it's a lot of code, and it's not that clear or easy to understand. So how can we make this shorter, easier, more flexible and simply better? This is where all the different options that PyTorch gives us come in. We're going to make this code much easier, step by step. So let's get into that. The first thing we can do is replace the loss function. Remember, we wrote the loss function here, which is the negative log likelihood, and we used this formula, but this is already quite complex. We can make it much easier, because there is a built-in function that gives us exactly the negative log likelihood loss, simply by calling the cross entropy. In order to do that, we need to import something from the torch module. We say import `torch.nn.functional` as `F`.
So from torch's `nn` module we import `functional`, and `F` is simply the abbreviation that is commonly used for it. This means we can now reach, through this `F`, a lot of the functions that are available in PyTorch. It could be functions for CNN layers as well as other kinds of layers, but in this case what we want is the negative log likelihood, so a loss function. Now that we have imported `F`, we can define our new loss function: `loss_func` will simply be `F.cross_entropy`. And that's all we need to do. Instead of writing the loss out like up here, we only import `F` and store `F.cross_entropy` as our loss function. Now we can define our model again. So def `model` with the input `xb`, and we return the dot product of `xb` with the weights, using the at symbol, and we add the bias. We no longer apply the activation function in the model, because `F.cross_entropy` already applies log softmax internally. Okay, we can run this and we're done. Now we can print the loss as well as the accuracy again, just to make sure that it works as expected. We call the loss function, and in this case we're referring to the new loss function, the cross entropy, instead of the old one up here. We put in the model output for `xb`, which is the prediction, and we also put in `yb`, which is the true labels.
Besides the loss, we also want to print the accuracy. We again put in the prediction from the model and the true labels `yb`. For the accuracy function we are still using the original one, the one we created here. We can run this, and you can see we get this tensor as output for the loss, as well as this tensor for the accuracy. In case you're wondering: of course, the model is already trained, because the training we did before is still valid. This is why we now get a loss that is low, the same as this one after training, and not as high as the one at the beginning. The idea is simply that we can replace this function here, which is, at least from my point of view, not that intuitive, with a much easier way: we import `functional` as `F`, call `F.cross_entropy` and store it as the complete loss function, instead of writing out the negative log likelihood like that. So this is the first thing we can do to make our code much easier: replacing the original loss function with a function already defined in PyTorch. That's the first refactoring, and we'll do the next one in the next video. Until then, bye guys.
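To illustrate why this replacement works, here is a small sketch comparing a hand-written log softmax plus negative log likelihood against `F.cross_entropy` on the same raw scores. The hand-written functions and the random inputs are assumptions made for the comparison; only `F.cross_entropy` itself is taken from the video.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(64, 10)           # raw model outputs, no activation
yb = torch.randint(0, 10, (64,))       # stand-in labels (assumption)

def log_softmax(x):
    return x - x.exp().sum(-1).log().unsqueeze(-1)

def nll(pred, target):
    return -pred[torch.arange(target.shape[0]), target].mean()

# Old way: activation first, then the negative log likelihood.
manual_loss = nll(log_softmax(logits), yb)

# New way: one built-in call that combines both steps.
loss_func = F.cross_entropy
builtin_loss = loss_func(logits, yb)

print(manual_loss, builtin_loss)
```

Because the built-in function includes the log softmax, the model only needs to return `xb @ weights + bias`.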
9. 9 Understanding pyTorch creating a network class: Hello and welcome back. The next refactoring to make our code easier is using classes in PyTorch. We will use `nn.Module` and `nn.Parameter` for a clearer and more concise training loop. So let's get into that. Let me first insert a new line here. Now we will import something. Up here, we imported `functional` from `nn` just to replace our loss function. Now we import `nn` itself from torch, and we want to create a subclass of `nn.Module`. How this works is the following. We start writing the class, so object-oriented programming in PyTorch. We say class and call it `Mnist`, for instance, because we have the MNIST data set in here. In the parentheses we put `nn.Module`, so the class inherits from `nn.Module`. This is the parent class. `nn.Module` is simply the parent class of all the normal neural networks in PyTorch, if you want to put all of them in one box, since they all come from this module. So whenever we create a network class in PyTorch, we inherit from this module. Now we need to create a constructor. The constructor is def dunder init, which is the standard name in Python, and we pass the `self` argument. Then we also want to initialize the parent class by calling the super function, so `super().__init__()`. That's how you can do it in Python 3.
In Python 2 it works a little bit differently, but in Python 3 we can simply call the super function like that, in order to also initialize `nn.Module`, which is the parent class. So we create a constructor for our class, and in it we also call the constructor of the parent class. All right, and now we can save our weights and biases within this class. Instead of doing it as we did at the beginning, where we said weights is `torch.randn` and so on, and the same for the bias, we now instantiate those weights and biases within the class. We say `self.weights` is equal to, and now, as I said before, we can use `nn.Parameter`. We say `nn.Parameter`, and inside it `torch.randn` with 784 and 10, and we divide this by the square root, `math.sqrt`, of 784. We also instantiate the bias: `self.bias` is equal to `nn.Parameter` again, and inside it `torch.zeros`, and we need 10 of them. So this instantiation is the same as we've seen before for our weights and biases. But we do not need to set `requires_grad` here. We still use it, but it's built into `nn.Parameter`, so we do not need to set it additionally. We simply get rid of it and save some lines of code by storing the parameters this way. All right. And then each of those classes within PyTorch needs a forward function, and the forward function is what actually makes the prediction.
So we say def, and it's called `forward` in PyTorch. Again we put in `self`, because it's part of the class; the first argument will always be `self` when it's a method of the class. Then we also put in, not the bias, but the input values `xb`, as we've seen before. And we return the dot product again: `xb` at `self.weights`, and we add `self.bias`. So here we have our new class, which will be our neural network. Instead of creating the model function, instantiating the weights and biases at the beginning as we've seen up here, and having this log softmax function and all of this, we now create a class. This allows us to use `nn.Parameter` to instantiate the weights and biases, and to use the forward function to make a prediction with our neural network. All right, we can run this. So far, so good. Now we can instantiate the model: we say `model` is equal to `Mnist`, and of course we need the parentheses here to instantiate it. We run it, and it works fine. So we have a new model, and now we can print the loss. We say print and the loss function, which has already been converted into the easier form, `F.cross_entropy`. All we need to do is put in the model. This time we do not refer to the original model function here; we refer to this class, which is our new neural network. So we call the model and put in `xb`, which it needs in order to do the forward pass.
This way, we put `xb` into the model, and PyTorch does the rest automatically, so there's no need for us to call the forward function ourselves. Whenever we instantiate the class and then call the instance with some values, PyTorch automatically calls forward in order to make a prediction. So `model(xb)` gives us the prediction, and we again compare it to the true values `yb`. Now we can run this, and we get this tensor here, with the loss as well as the grad function. You can see that the loss is really big again. The reason is this: in the step before, the training process had taken place, so the loss was small. But now we have created a completely new model, with random values for the weights, and the biases instantiated with zeros. Then we did the forward pass, so we made the prediction with completely new random values. This is why the loss is pretty high again: the network is not trained yet. But you can see we get a loss, and we again get a grad function, which is the negative log likelihood one. This is why the loss we get here with `F.cross_entropy`, our new loss function, is the same kind of loss as the one we had at the beginning when we wrote it by hand. So now we have the model again. We instantiated the model, and we call forward under the hood simply by putting the inputs into the model itself, which automatically uses the forward method to calculate the prediction.
Then we compare the prediction and the true values using the cross entropy function in order to calculate the loss, and PyTorch also stores the grad function so that we can then do the backpropagation. Now, the final thing will be training the neural network. In order to do this, we insert a new line here, and we put this into a function. Let's say def `training`. In this training function we again go through the epochs, so for epoch in range epochs. This works exactly the same as before. Then again we say for i in range, `amount` minus one, integer division by the batch size, and we add one. So we have the loop. Again, `start` is equal to `i` times the batch size, and `end` is equal to `start` plus the batch size. Then we take the batch itself from the two data sets: `xb` is equal to `x_train` from start to end, and for the true labels, `yb` is equal to `y_train` from start to end. Now we make a prediction again: `prediction` is equal to `model`, and we put in the batch we want to make the prediction on. Then we calculate the loss: `loss` is equal to the loss function, and we put in the prediction and the true values. The final part is that we call `loss.backward()` to calculate the gradients, with parentheses because it's a function. And then we again say with `torch.no_grad()`, also with parentheses.
What we do next is different from what we've done here, where we updated the weights with their gradient times the learning rate and the bias with the bias gradient times the learning rate. We can do this together in one line through a loop, because we stored the weights and the biases within the class as `nn.Parameter`. This is why we can loop through them. So inside with `torch.no_grad()` we say for p in `model.parameters()`. Remember, the model is the instance we created here, and the parameters of the model are the weights and the biases, because we stored them in `nn.Parameter`. This is why we can loop through `model.parameters()` and get the weights as well as the biases. And we update them: we say `p` minus-equals `p.grad`, multiplied by the learning rate. The final thing is that we again need to zero the gradients after updating. So we get out of the for loop and say `model.zero_grad()`, with parentheses. This is the training. Remember that the training itself is quite similar to the process up here, but we have already changed a few things. For the loss function, we now use `F.cross_entropy`. For the model, we do not keep the old model function; we have a new model, which is this class. And for updating the parameters, so the weights and biases, we now use this loop over `model.parameters()`, which gives us all the parameters of the model. We update them by multiplying the gradients with the learning rate and subtracting that from the original values.
And then, of course, because we do not want to accumulate them, we zero the gradients again at the end. So that's what we do. Now we can run this, and then we can do the training itself: we call the `training` function and run it. We're done. Finally, of course, we want to take a look at the output, so at the loss again, for instance. We say print and the loss function, which is the cross entropy function, and we put in the model output for the batch and also the true values `yb`. We can run this, and you can see we again get a tensor of 0.71 and so on, with NllLossBackward as the grad function. Again, we started here with this high loss because we created a completely new model, then we did the backpropagation process, and we arrived at this small loss, which is the value we always get here, just like with the original model. All right, so that's it for this part. In the next video we will make our code even simpler than that. So hopefully see you there. As always, thanks for your interest, and hopefully see you in the next video.
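Put together, the class and the training function from this section could look something like the sketch below. It is a hedged illustration rather than the exact notebook code: the synthetic data and the reduced sizes (`n_features = 20` instead of 784, 640 samples instead of 50,000) are assumptions so that the sketch runs quickly, while `Mnist`, `training`, `loss_func` and the hyperparameters follow the video.

```python
import math
import torch
import torch.nn.functional as F
from torch import nn

torch.manual_seed(0)

# Stand-in data and sizes (assumption); MNIST would use 784 features.
amount, n_features, n_classes = 640, 20, 10
x_train = torch.randn(amount, n_features)
y_train = x_train[:, :n_classes].argmax(dim=1)

batch_size, learning_rate, epochs = 64, 0.5, 3
loss_func = F.cross_entropy

class Mnist(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.Parameter registers the tensors and enables gradient tracking.
        self.weights = nn.Parameter(
            torch.randn(n_features, n_classes) / math.sqrt(n_features)
        )
        self.bias = nn.Parameter(torch.zeros(n_classes))

    def forward(self, xb):
        return xb @ self.weights + self.bias

model = Mnist()

def training():
    # Uses the global `model`, as in the video.
    for epoch in range(epochs):
        for i in range((amount - 1) // batch_size + 1):
            start = i * batch_size
            end = start + batch_size
            xb = x_train[start:end]
            yb = y_train[start:end]
            prediction = model(xb)          # calls forward under the hood
            loss = loss_func(prediction, yb)
            loss.backward()
            with torch.no_grad():
                for p in model.parameters():
                    p -= p.grad * learning_rate
                model.zero_grad()           # reset gradients for next batch

xb, yb = x_train[:batch_size], y_train[:batch_size]
loss_before = loss_func(model(xb), yb).item()
training()
loss_after = loss_func(model(xb), yb).item()
print(loss_before, loss_after)
```

Looping over `model.parameters()` means the update code does not change when more parameters are added to the class later.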
10. 10 Understanding pyTorch Implementing layers: Hello and welcome back. Now let's make our code even easier, because it's still a lot to write, and it's not that clear, to me at least. We started here with the class itself, the `Mnist` class, which inherits from `nn.Module`. That's fine. But then we had to instantiate `self.weights` as well as `self.bias` manually, using `nn.Parameter`. What if we would like to make this even easier, because this is already too much code? We can do that using the linear layer. So instead of writing it this way, we can write it differently. Let's do this again: we say class and call it `Mnist2`, so the second MNIST class. Again, it inherits from `nn.Module`. This is the parent you always use; whenever you create a network class in PyTorch, you inherit from `nn.Module`. So this is the class, and now we again need to create a constructor: def dunder init with `self`. Then we also want to call the superclass, so the parent class, which is `nn.Module`. We call `super().__init__()` to initialize it as well. By doing this, all the functionality and options from `nn.Module` become available in our class, in `Mnist2`. Then we simply use the linear layer. We say `self.linear` is equal to `nn.Linear`, and we pass 784, comma, 10. And that's it.
This replaces doing it the old way, with `self.weights` as an `nn.Parameter` of `torch.randn` and so on, so instantiating the weights and storing them in `nn.Parameter` just to get them back later as model parameters, and doing the same for the bias. We get rid of all this and only use a linear layer. By calling `nn.Linear` with 784 and 10, PyTorch already knows: okay, I need a weight matrix of 784 times 10, and I need a bias vector of 10, and it stores them in the linear layer. Simply calling this is all we need to do here. Then, as always, we need a forward function. So def `forward`, which will be called automatically during the forward pass, when we actually make predictions. We put in `xb`, so the inputs, and in this case we return `self.linear`, called with `xb`. So here we initialized the superclass, we instantiated the linear layer and saved it in `self.linear`, and during the forward pass, so when making the prediction, we simply call `self.linear` and give it the inputs, in this case the batch that we put into the model. All right, we can run this, and we're done. Now we can instantiate the model: we say `model` is equal to `Mnist2`, with parentheses. And now we can again print the loss, just to make sure that it still works. We say print and call the loss function, and we put in the model output for `xb` and also the true values `yb`. So we compare the predictions with the true values and calculate the loss. If we run this, we get a result here.
We get a loss of 2.33. So again you can see the loss is really big, because we have a completely new model, which is this Mnist2. We do not work with Mnist anymore, so not with the model that uses nn.Parameter and so on, but with this model, which is even easier to work with, because we only have self.linear here instead of the parameters. Okay, that's that. And now we are still able to call the function we have here, the training function with the backpropagation. So instead of writing it again, we can simply call the training function on our model. And again, remember, before we got the predictions from the old model. But now if we call the training function, model is not referring to Mnist anymore; it refers to Mnist2, because we instantiated it here. So now we can call this. We simply say training, and just to make this clear: this is training with the new model, which is Mnist2. So we call the training process again, run it, and we're done. Let's take a look at that. We print the loss again, so the loss function, and put in the model with the batch xb and also yb, so the true labels. We run this, and you can see we now get an output value, the loss, which is 0.7. So again it's a small loss, similar to the one we had before, instead of this big loss here. So you can see that we replaced self.weights and self.bias, which we created here with nn.Parameter.
With this new linear layer, we are still able to calculate the loss, call the training function with its loop, reduce the loss, and get better predictions from our model. Okay, so that's it for refactoring, or just making this class easier. And in the next video we will make it even a little bit easier. So stay tuned for that. See you there.
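The refactored class from this video can be sketched as follows. This is a minimal sketch, not the exact notebook code: the random batch xb, yb is a stand-in for real MNIST data (an assumption, so the snippet runs without a download), and loss_function is assumed to be F.cross_entropy, the loss named in the next video.

```python
import torch
from torch import nn
import torch.nn.functional as F


class Mnist2(nn.Module):
    def __init__(self):
        super().__init__()
        # One layer replaces the hand-written weights and bias.
        self.linear = nn.Linear(784, 10)

    def forward(self, xb):
        # Called automatically when we write model(xb).
        return self.linear(xb)


torch.manual_seed(0)
# Stand-in batch (assumption): 64 flattened 28x28 "images" and random labels.
xb = torch.rand(64, 784)
yb = torch.randint(0, 10, (64,))

loss_function = F.cross_entropy  # assumed loss function
model = Mnist2()
loss = loss_function(model(xb), yb)
print(loss)

# The layer registered its weight and bias as model parameters for us.
param_shapes = [tuple(p.shape) for p in model.parameters()]
print(param_shapes)  # [(10, 784), (10,)]
```

Note that nn.Linear stores its weight as (out_features, in_features), so the shape printed is 10 by 784, the transpose of the 784 by 10 matrix we wrote by hand before. An untrained model gives a loss close to ln(10), about 2.3, which fits the value seen in the video.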
11. 11 Understanding pyTorch the optimizer: Hello and welcome back to this video. In the last couple of videos we already made some replacements. We got a new loss function, which is F.cross_entropy, and we also replaced the original model, at first using nn.Parameter and then making this even easier by simply calling nn.Linear, which is everything we need in order to create a linear layer and then do the forward pass within this class. So the next thing is that we will use an optimizer to make the training process a little bit easier. Because currently, as you can see here, we use with torch.no_grad() and then say for p in model.parameters() and update them ourselves. But we can replace this and make it even easier using optim. How does this work? Well, let's implement it. At first we need to get our model, so we write a function: def get_model(). This function will first instantiate our model. So we say model = Mnist2(), our class which we created here. And then we want to return something. We want to return the model itself as well as something else: the optimizer from optim. optim is a package in PyTorch, and to use it we first need to import it. So let me go up here and say from torch import optim, like that. So optim is the optimizer package in PyTorch. We want to return the model as well as the optimizer for this model. We will use stochastic gradient descent, so SGD. And you simply need to put two things into the optimizer. The optimizer just needs to know: what do I have to optimize, and what is the learning rate?
Okay, so we need two parameters here: the learning rate, and what the optimizer needs to optimize, which is the model parameters. The model parameters are still the weights and the biases in our model, and this is what we want to optimize. So in order to do this, we simply say optim.SGD, we put in model.parameters(), and we need to give it the learning rate: lr is equal to learning_rate. And learning_rate here simply refers to the value 0.5, which we set at the beginning. All right. So this function is not doing much: it just instantiates the model and returns the model as well as the stochastic gradient descent optimizer for its parameters, and the optimizer should optimize the parameters with this learning rate. That's what we do here. So now we can instantiate that. We say model, optimizer = get_model(), like that, just to get the optimizer as well as the model. And now we can print the loss function again: print, loss_function, with model(xb), so the predictions for the batch, and also yb. We run this, and you can see that the loss is again pretty high. Because remember, up here we trained a model, but here we instantiated the model completely new, from scratch. So we have a completely new model, which is returned together with its optimizer by calling this get_model function. And then we print the loss, which is the original loss before doing any training.
Okay, that's what we start with. And now we do the training process. So again we go through the epochs: for epoch in range(epochs), so three times over the 50,000 pictures in total. And now the second loop, over the batches: for i in range, and here we take the number of samples minus one, do an integer division by the batch size, and add one. All right, so we got this. And again we say start is equal to i times the batch size, and end is equal to start plus the batch size. Okay, we got start and end. Now we take the batch itself: xb is equal to x_train from start to end, and for the true values we also need the batch, so yb is y_train, also from start to end. Okay, that's the batch. And now we can make the prediction: prediction is equal to model(xb). Then we calculate the loss: loss is equal to loss_function, and we put in the prediction as well as the true values, yb. And now we need to call loss.backward() again in order to calculate the gradients in our neural network. So now we have calculated the gradients, and we want to use them to update the parameters. In order to update them, we can now leverage optim, which we have here. So instead of what we had up here, with torch.no_grad(), for p in model.parameters(), p minus-equals p.grad times the learning rate, and then model.zero_grad(), we make this easier using optim. We simply say loss.backward(), so calculate the gradients, and then do the update step of the backpropagation process, which is simply done by calling the optimizer.
So, optimizer.step(), okay, like that. And then we need to set the gradients back to zero, because we don't want to accumulate them. So we simply say optimizer.zero_grad(), zero underscore grad. So that's it, actually; this is the whole process. We calculate the gradients, then we update the parameters with optimizer.step(), and then we zero the gradients out, because we don't want to accumulate them. If you go up here, just as a comparison to what we had before, we also zeroed the gradients there. But instead of this wrapper with for p in model.parameters() and the manual update, we simply call optimizer.step(). That's everything we need to do. It basically does the same as what we have up here, and it makes the code for the training process easier again. And finally, what we can do is go out of the loop and simply print the loss. Let's make a bit of space here. So print, and we print loss_function, and we put in the prediction of the model, model(xb), and then the true labels, yb. We run this and we get the output here. And you can see the training process worked. So optimizer.step() worked exactly the same way as the manual update of the model parameters up here, because we started with a loss of 2.27858 and so on, and in the end we arrived at a loss of 0.74. So basically we reduced the loss during the training process. But this training process is already easier to understand, because we only need to call optimizer.step() to do the update in the backpropagation process. Okay, so that's it for this video. As always, thanks for watching, and I hope to see you in the next video. Until then, bye guys.
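The get_model function and the optimizer-based loop from this video can be sketched like this. It is a sketch under stated assumptions: the data is random stand-in data rather than MNIST, and the learning rate of 0.05 is chosen for that synthetic data, not taken from the video.

```python
import torch
from torch import nn, optim
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in data (assumption): 256 random "images" with random labels.
n, batch_size, epochs = 256, 64, 3
learning_rate = 0.05  # assumption, picked for this synthetic data
x_train = torch.randn(n, 784)
y_train = torch.randint(0, 10, (n,))
loss_function = F.cross_entropy


class Mnist2(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(784, 10)

    def forward(self, xb):
        return self.linear(xb)


def get_model():
    # Return a fresh model together with an SGD optimizer for its parameters.
    model = Mnist2()
    return model, optim.SGD(model.parameters(), lr=learning_rate)


model, optimizer = get_model()
with torch.no_grad():
    initial_loss = loss_function(model(x_train), y_train).item()

for epoch in range(epochs):
    for i in range((n - 1) // batch_size + 1):
        start = i * batch_size
        end = start + batch_size
        xb, yb = x_train[start:end], y_train[start:end]
        prediction = model(xb)
        loss = loss_function(prediction, yb)
        loss.backward()        # calculate the gradients
        optimizer.step()       # replaces the manual "p -= p.grad * lr" loop
        optimizer.zero_grad()  # do not accumulate gradients across batches

with torch.no_grad():
    final_loss = loss_function(model(x_train), y_train).item()
print(initial_loss, final_loss)
```

On this stand-in data the model can only memorise the random labels, so the numbers will differ from the ones in the video; the point is only that optimizer.step() plus optimizer.zero_grad() do the job of the hand-written update.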
12. 12 Understanding pyTorch tensordataset and dataloader: Now there are two more things we can use to make our code a little bit easier to understand: the TensorDataset, as well as the DataLoader, which will then be the iterator for our training process. But let's start with the TensorDataset. In this part, what we did was take xb and yb by using x_train from start to end as well as y_train from start to end. So this part right here. We can make this easier by using a TensorDataset. So going up here, we simply say from torch.utils.data, utils because these are utility functions, import TensorDataset. It's a dataset of, basically, the inputs as well as the outputs, so the labels. Okay, so we've got TensorDataset and we can instantiate it. We say train_ds is equal to TensorDataset, and we put in x_train and also y_train. Okay, like that. So we've got the training dataset as a TensorDataset for x_train and y_train. And now we don't need to write everything again; we simply copy it. So we take model, optimizer = get_model(), copy this and put it in here. So again we get the model and the optimizer from the get_model function. So far, so good. And what we also want in here is to print the loss. Okay, so we put this in again and run it. You can see we start with this loss. So again we create a completely new model and optimizer, and the model so far is not trained. But now we will train it using this loop again, implemented with the TensorDataset. So what we can do is simply reuse the loop.
We copy this code. Of course, feel free to type it again if you like, but I don't want to bore you, so I will copy it. And instead of doing it this way, xb and yb from x_train from start to end and y_train from start to end, we change it just a little bit. So instead of start and end like that, we get rid of these four lines here. And instead we say xb, yb is equal to, and I refer to train_ds, so the TensorDataset with x_train and y_train, and then we simply slice it: we go from i times bs to i times bs plus bs. So this is quite similar to what we had here with start and end. But in this case we do it with the training dataset, so with this TensorDataset functionality which we have up here, and the result is stored in xb and yb. We can run this, and you can see I get an error, because train_ds is not defined. Of course, I need to run this cell first. Sorry for that, I didn't run it. I run it again, and you can see we've got the output now. So this works just fine. But now, instead of having those four lines of code, we only have one line of code here, by using the TensorDataset up here. Okay, so this was the first part. And now the second part is the DataLoader, to make it even easier. Because here we were still iterating using this range, for i in range and so on, the second loop. But what if we want to get rid of this loop? We can do this by using the DataLoader. The DataLoader in PyTorch is responsible for managing batches, and we can create a DataLoader from any dataset.
Okay, that's why we first need this TensorDataset. If we have the TensorDataset, we can then leverage the DataLoader functionality. The DataLoader makes it easier to iterate over batches. So rather than having to index the training dataset with i, this one which I highlight here, we can simply leverage the DataLoader, and the DataLoader itself gives us the mini-batches. It does this automatically, which is a great thing. Let me insert a new line here. At first I need to import this. So let's say from torch.utils.data import DataLoader. It comes from the same torch.utils.data as the TensorDataset here. Okay, so we've got the DataLoader. And how can we use it? We say train_ds is equal to TensorDataset for x_train and also y_train, like that. This is exactly the same as what we had before. So now we have created the TensorDataset, and we will use it within the DataLoader, just to give us batches. The DataLoader itself is just an iterator that gives us batches throughout the training loop. So we simply say train_dl, dl for DataLoader, is equal to DataLoader, and we put in train_ds, so this TensorDataset here. And then we need to define the batch size, because the DataLoader needs to know: what is the training data, and what size of batch do you want me to give you as an iterator? So we need to define the batch: we say batch_size is equal to bs, which is our batch size of 64. We can run this. Okay, we've got the DataLoader as well. And now we can use this DataLoader to replace some parts of the code in our training process.
For instance, this whole part of the code. In order to do this, we go in here, and first we want to instantiate the model itself. So again we create a completely new model and optimizer: Ctrl+C, Ctrl+V to paste it. So again we use the get_model function which we created up here. We simply instantiate the model with Mnist2, and we return the model as well as the optimizer, which should optimize the parameters, so the weights and biases, with the learning rate. All right, so we've got this. We run it and see that the loss is pretty high, because the model itself is not trained yet. And now we want to do the training, but in the easiest way possible. So we say for epoch in range(epochs), so again we go three times over the whole training dataset, and now we simply say for xb, yb, so the inputs as well as the labels, in train_dl. So we refer to the DataLoader here, this one, and that's all, because it's an iterator and it gives us batches of size 64 in xb and yb. And now we can use them to make a prediction. So prediction is equal to model(xb). Then we calculate the loss: loss is equal to loss_function, and we put in our prediction and the true labels, just to see what the loss looks like. Then we call loss.backward() in order to calculate the gradients for the weights and the biases. Then we use the optimizer, optimizer.step(), to update the weights and the biases with the gradients times the learning rate, and then we simply zero the gradients again.
So optimizer.zero_grad(), because we do not want to accumulate the gradients. We want to use them to optimize the weights and biases, but then we want to get rid of them; for the next batch we calculate them from scratch. All right, so we've got this, and all we have to do now is print the loss. So let's say print and call loss_function, and we put in the model, so after the training process, with xb as well as yb, just to take a look at what it looks like. We run this, and we see the output here for the trained model: a tensor of 0.071 for the loss. And we also see here the negative log likelihood loss backward entry, which was recorded in order to calculate the gradients. But you can see that this is now the whole training loop. If we scroll up to where we started: here we had already implemented the optimizer, but when we did it the first time, it looked like that. This is the original one: for epoch in range(epochs), then another loop where we calculated start and end and sliced the batches, then the prediction, then this wrapper here for updating the weights and biases and so on, and setting the gradients back to zero. So this was the original one, and at the end we arrived at this code here, which is much easier. Because all we do is go through the epochs; then we have a training DataLoader, which is an iterator and gives us the batches of the inputs as well as the labels; we make the prediction; we call the loss function; we calculate the gradients; we update the weights and biases, so the model parameters, with the gradients using the step function; and then we set the gradients back to zero to avoid accumulating them. And that's it.
That's the whole training process, these few lines here. And after this we can simply print the loss function for the trained model as well as the labels, and we can see that we already get a pretty good value. Okay, so that's it for this training process. As always, thanks a lot for watching and for your interest, and I hope to see you in the next video. Until then, bye guys.
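To make the two helpers from this video concrete, here is a small sketch showing that slicing a TensorDataset and iterating a DataLoader both give the same batches as the manual start and end indexing. The data is random stand-in data (an assumption), with 200 samples chosen so that the last batch is visibly smaller.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

torch.manual_seed(0)
# Stand-in data (assumption): 200 samples so the last batch is incomplete.
x_train = torch.randn(200, 784)
y_train = torch.randint(0, 10, (200,))
bs = 64

# Step 1: TensorDataset lets us slice inputs and labels together.
train_ds = TensorDataset(x_train, y_train)
i = 1
xb, yb = train_ds[i * bs : i * bs + bs]
manual_ok = (torch.equal(xb, x_train[i * bs : i * bs + bs])
             and torch.equal(yb, y_train[i * bs : i * bs + bs]))

# Step 2: DataLoader does the batching for us.
train_dl = DataLoader(train_ds, batch_size=bs)
batch_sizes = [len(batch_x) for batch_x, batch_y in train_dl]
print(batch_sizes)  # [64, 64, 64, 8]

# Without shuffling, the second batch from the loader is the same slice.
second_xb, second_yb = list(train_dl)[1]
same_batch = torch.equal(second_xb, xb) and torch.equal(second_yb, yb)
print(manual_ok, same_batch)  # True True
```

With the loader in place, the inner loop of the training code shrinks to for xb, yb in train_dl, as described above.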
13. 13 Understanding pyTorch training validation: Now, the final part is that we will use our validation dataset in addition to the training dataset, just to make sure that we do not overfit our model. But in case you're wondering why you get different results here for the loss function, just remember that the network is instantiated with random values. So whenever we run this again, you'll see that I get a slightly different loss here, like 0.05, and if I run it once more I get yet another loss. So in case you get a different value, that's the reason why. All right. So the next thing is that, as I said before, we want to test whether we are overfitting. That's why we created the validation dataset at the beginning. At the beginning we had the training dataset as well as a validation dataset, and this is what we want to use now. So what we do is define train_ds again: train_ds is again equal to a TensorDataset, and we put in x_train as well as y_train, which is our training data. Then we create the training DataLoader: train_dl is equal to DataLoader, and we need to give it the TensorDataset, which in this case is the training dataset, and we also need to define the batch size: batch_size is equal to bs. And then we also shuffle this. It's always a good idea to shuffle the training data, just to make sure there is no ordering or correlation in the data, which there might be if it's not shuffled. So this gives us the samples in random order, and we set the shuffle argument to True. The next thing we do is use the validation dataset. So valid_ds is equal to, and it works the same way, a TensorDataset.
But in this case we refer to x_valid as well as y_valid, like that. And then we also put this into the DataLoader to get an iterator. So valid_dl is equal to DataLoader, and we put in valid_ds, the validation dataset, and we need to define the batch size here as well: batch_size is equal to bs. Okay, so we've got this as well, and we can run it. So far, so good, no errors, that's fine. And we call get_model again: model, optimizer = get_model(). So with our function we instantiate a completely new model. That's fine. Now we go through our training process and say for epoch in range(epochs), so we loop over the epochs. And now we call a specific function, which is model.train(). There are actually two functions here: a train function and an eval function. The reason why we do this is that when you do a training process, you normally want to call train at the beginning, because for PyTorch, under the hood, this means that certain modules behave differently during training compared to validation or testing. So there's a difference between the training process and the evaluation process, and this is why you want to call model.train() at the beginning of a training process. And then we simply say for xb, yb, so the inputs and the labels, in train_dl. So iterating through them, we now make the predictions: prediction is equal to model(xb), so the model applied to the inputs. These are the predictions.
Then I calculate the loss: loss is equal to loss_function, and we put in the prediction as well as the true values. And then again we do the backpropagation, so we calculate the gradients with loss.backward(). Then we step the optimizer, optimizer.step(), so we update the weights with the gradients from the backpropagation. And then, still in the training process, we zero the gradients again: optimizer.zero_grad(). Okay, like that. This is the training process itself. And then we also want to do the validation. So we go out of this inner loop and say model.eval(). So now we're doing the evaluation; that's why we call eval here. So we train the model and then we evaluate it. We say model.eval(), and then with torch.no_grad() for the validation, and we say the validation loss, valid_loss, is equal to sum, so we simply aggregate the losses: loss_function of the outputs of the trained model, model(xb), and the true labels, yb, for xb, yb in valid_dl. So what we basically do is switch off the gradient tracking here, and this is important, because we do not want to update anything here. In the training process we calculate the gradients, do the backpropagation, and update and optimize the model. Here we only want to have the loss itself. And I have an error here: I need to put in a colon. Okay. So the validation loss is the aggregated loss of the predictions against the true labels over the validation dataset. And then finally, of course, we want to print the output.
So let's call print here. We want to print the epoch itself, and we also want to print the validation loss, valid_loss, and we divide it by the length of the validation DataLoader, so by the number of validation batches, because we want the average loss over the validation data. Okay, that's it. Now we can run this and wait. And here we get the output. We've got three outputs because we have three epochs in total. Here is the first tensor, then the output for the second, and finally the output for the third training epoch. And you can see that even though the loss goes up a little at first, it then goes down, and it should go down further if you use more epochs. So that's it for evaluating the model, so taking a look, after training, at how our model does on the validation dataset. So that's it for this part. We have learned how we can refactor the code in PyTorch and make it easier, step by step. We started from code that was, from my point of view at least, not so easy to understand, and made it easier step by step. And at the end we arrived at, again from my point of view, pretty easy, understandable, clear and concise code, which allows us to train the model as well as to use the trained model on the validation dataset to get a validation loss. Okay, so that's it for this video. Again, thanks so much for watching, thanks for your interest in the course, and as always I hope to see you in the next video. Until then, bye guys.
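The training and validation loop described in this video could look roughly like the sketch below. The assumptions are: random stand-in tensors instead of the MNIST split, a plain nn.Linear standing in for the Mnist2 class, a learning rate of 0.05, and a list called valid_history (a name introduced here for illustration) that collects the printed values.

```python
import torch
from torch import nn, optim
import torch.nn.functional as F
from torch.utils.data import TensorDataset, DataLoader

torch.manual_seed(0)
bs, epochs, learning_rate = 64, 3, 0.05  # learning rate is an assumption

# Stand-in data (assumption) in place of the real training/validation split.
x_train, y_train = torch.randn(256, 784), torch.randint(0, 10, (256,))
x_valid, y_valid = torch.randn(128, 784), torch.randint(0, 10, (128,))
loss_function = F.cross_entropy

train_ds = TensorDataset(x_train, y_train)
train_dl = DataLoader(train_ds, batch_size=bs, shuffle=True)  # shuffle training data
valid_ds = TensorDataset(x_valid, y_valid)
valid_dl = DataLoader(valid_ds, batch_size=bs)  # no shuffling needed here

model = nn.Linear(784, 10)  # stands in for Mnist2
optimizer = optim.SGD(model.parameters(), lr=learning_rate)

valid_history = []  # hypothetical helper list, not part of the video code
for epoch in range(epochs):
    model.train()  # training mode
    for xb, yb in train_dl:
        prediction = model(xb)
        loss = loss_function(prediction, yb)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

    model.eval()  # evaluation mode
    with torch.no_grad():  # no gradients are needed for validation
        valid_loss = sum(loss_function(model(xb), yb) for xb, yb in valid_dl)
    valid_history.append((valid_loss / len(valid_dl)).item())
    print(epoch, valid_history[-1])
```

Two remarks on this sketch. Dividing by len(valid_dl) averages the per-batch mean losses, which is exact only when all validation batches have the same size. And because the stand-in labels are random, the validation loss here is not expected to improve; with real data it should behave as described in the video.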
14. 14 Bonus Understanding pyTorch ConvNets in pyTorch: Hello and welcome back to this video, which is a little bonus. What if we would like to have a convolutional neural network now? Remember, what we did so far was only to create a simple linear neural network. So we have the model with self.linear. If I scroll up here, this is the one: we have this class here currently. But what if you would like to implement a convolutional neural network? Basically everything stays the same besides this class. We simply need to change the model, but we can use the training and validation data exactly the same way as we have here, and also the training loop is exactly the same. So all we need to do is create a new model. How can we do this? By simply writing a new class. Let's say class, and let's call it Convolution. It inherits from nn.Module, as I said before, like all model classes do. And what we need to do is basically the same as before: we create a constructor, def __init__(self), like that, and then we call super, so the parent class, which is nn.Module, just to have access to all its functionality. So super().__init__(). Okay, so this is instantiated as well. And now we can simply create a few convolution layers. We say self.conv1, so the first convolution, is equal to nn dot, and instead of Linear, which we had before, we say Conv2d. It's a 2D convolution layer. The input channels will be one; then we have 16 kernels, so 16 output channels; and the kernel size, kernel_size, is equal to, let's say, three.
Okay, then we have a stride: stride is equal to two, and the padding is equal to one. Like that. And we can do this a few times. So self.conv2 is equal to exactly the same, so I can copy it and paste it. I just need to change one thing: the input here must be 16. Because we start with one channel and create 16 kernels, so the input for the next layer is of course 16 channels. And we keep the 16 outputs here. Then we have a third convolution layer: self.conv3 is equal to, and again I can copy this because it looks exactly the same, with the same kernel size and stride. The input here is 16 as well, and at the end we arrive at 10, so we have 10 kernels here. All right, so we've got this. Then we have another function, the forward function, so the forward pass, basically making the prediction. So def forward, and we put in self and of course the values, xb. And then we say xb is equal to xb.view, and we put in minus one, one, 28, 28. Why this shape? It's easy: a convolution layer wants to be fed images. So for the convolutions we need exactly the shape we printed out at the beginning, the 28 by 28 image. But you can see that our data has a shape of 784. In order to get 28 by 28, we simply call xb.view. view is basically the same as reshape in NumPy. So we say we want to reshape it to minus one, one, 28, 28. 28 and 28 are the image height and width, and the one is the color channel: because we have a grayscale image, so black and white and not color, we have one here. Otherwise we would have three. And the minus one simply lets PyTorch work out the batch dimension.
Okay, so we've got xb.view, and then we say xb is equal to F dot, so functional, and in this case we use the ReLU activation function, F.relu, and we put in self.conv1, and into that we put xb. This is the first one. So up here we instantiated the layers, and here we use them. We first reshape the input the way we need it, then we put the image into the convolution layer, apply the activation function, and store the result in xb again. And we do the same a few times. We say xb is equal to F.relu and again self.conv2 in this case, and again we put in xb. So xb always refers to the xb from the row before. Then we do it again: xb is equal to F.relu with self.conv3, and we put in xb. And as you probably know, with convolutions we can also do some kind of pooling, just to shrink the data down, to aggregate it. We can do this by simply saying xb is equal to F dot average pooling. Average pooling is also a function which is available in the functional module of nn, which is why we can call it with F: F.avg_pool2d. We put in xb and the size we want to pool over, and that's four. And finally, of course, we need to return something. So return, and then we say xb.view again, because we need to reshape it again. So whenever you see view, always think of reshape. We want to reshape to minus one and xb.size at index one. Okay, like that. So we take dimension one from the size of xb. And then finally we can specify a learning rate up here: learning rate is equal to 0.1. So we change the learning rate, which is a hyperparameter. And then we can run this. Okay, and we have run it.
And then all we need to do is instantiate this. So we can simply say, in this case, this is the model, and there is the get_model function here, which we had at the beginning. Where is it? The get_model. I scroll up here. Where is our get_model function? Here it is, okay: it creates the MNIST model and returns the model and the optimizer. So in this case, we do it a little differently. We use the get_model function again. I copy this and we go down here, and in this case we put it in here, and instead of get_model, let's call it get_conv_model. Okay, get_conv_model. And in this case we do not refer to the MNIST model, we refer to the convolution model. Okay? The convolution model, and we return this. Okay, so get_conv_model. All right, there you go, and we can run this. OK? And now we get the model. We can simply say in this case model and optimizer is equal to, and we simply say get_conv_model. Okay, so we got the model and the optimizer. It works exactly the same as before, and now we can again use it this way. We need the training dataset and the training data loader, the validation dataset and the validation data loader. But we actually have those already here, so we can run this again. And then what we need is this training loop, right? So I copy this and again go down here and enter a new line. And of course, now we have a new model, which is the convolution model, right? The optimizer we have as well, and we use the model itself as well as the optimizer within this new training. Okay? And we can run it and wait during the training process and then take a look at the outputs from the convolution model. All right. Okay, it might take a little longer, but as you can see, actually, all we did was simply change the model. But the rest of the code stays the same: the instantiation of the training dataset and the data loader, as well as the loop over the epochs and the backpropagation.
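The idea of swapping only the model while reusing the rest of the training code can be sketched as follows. This is an illustration under several assumptions, not the code from the video: the optimizer is assumed to be plain SGD, the loss is assumed to be cross-entropy, the class name `MnistCNN` and the kernel size of 3 are made up, and random tensors stand in for the real training data so that the block runs on its own. Only the learning rate of 0.1 and the name get_conv_model come from the transcript.

```python
import torch
from torch import nn, optim
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
lr = 0.1  # learning rate from the transcript


class MnistCNN(nn.Module):  # class name and kernel_size=3 are assumptions
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1)
        self.conv2 = nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1)
        self.conv3 = nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1)

    def forward(self, xb):
        xb = xb.view(-1, 1, 28, 28)
        xb = F.relu(self.conv1(xb))
        xb = F.relu(self.conv2(xb))
        xb = F.relu(self.conv3(xb))
        xb = F.avg_pool2d(xb, 4)
        return xb.view(-1, xb.size(1))


def get_conv_model():
    """Return the convolution model and an optimizer for it (SGD is assumed)."""
    model = MnistCNN()
    return model, optim.SGD(model.parameters(), lr=lr)


# Random stand-in data; the real course code uses the downloaded dataset.
x_train = torch.randn(256, 784)
y_train = torch.randint(0, 10, (256,))
train_dl = DataLoader(TensorDataset(x_train, y_train), batch_size=64, shuffle=True)

model, opt = get_conv_model()
loss_func = F.cross_entropy  # assumed loss function

epoch_losses = []
for epoch in range(2):
    model.train()
    running = 0.0
    for xb, yb in train_dl:
        loss = loss_func(model(xb), yb)
        loss.backward()   # compute the gradients
        opt.step()        # update the parameters
        opt.zero_grad()   # reset the gradients for the next batch
        running += loss.item() * len(xb)
    epoch_losses.append(running / len(x_train))
    print(epoch, epoch_losses[-1])
```

The loop itself never mentions the model's layers, which is why the same training code works for any model that maps a batch of inputs to class scores.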
So calculating the gradients, backpropagating the gradients, optimizer.step and so on, all of this works exactly the same. Okay, all we did was actually change the model structure itself. And this, of course, is true for all other kinds of models as well. If you want to create an RNN or any kind of LSTM, it works exactly the same. You define the model itself, and then you can basically do the training process by simply calling the functions which we learned during this course. Okay, just one second, we just wait and wait and, okay, so here we got the outputs for the different epochs. And you can see also that the loss goes down over the different epochs. Okay, so that's it, actually, for creating a CNN and also training the CNN, as well as taking a look at the validation and the training data for a convolutional neural network in PyTorch. So, as always, thanks a lot for watching, and hopefully see you in the next video. Until then, bye, guys.