Transcripts
1. 1 Introduction to the course: Enter the Matrix. Well, it looks like the Matrix, but it's not the Matrix. It is the machine learning world, in this case. Specifically, it's the world of neural networks, which are part of machine learning. Now, what is this course all about? Let me first introduce myself. You can see my picture up there. My name is Daniel, I'm 29 years old and I'm a business intelligence consultant. I'm working with Power BI, I am certified in Tableau, and I'm also working with neural networks in Python. This course is all about getting familiar with one of the hottest topics currently out there. And this topic will drastically, and I really mean it, drastically change the future. You already know we have self-driving cars and other kinds of innovation. Computers will basically rule the world, and neural networks will help machines to automate things and to learn by themselves. This will drastically help us and improve our lives as well. So I think now is the point to really get into this and understand what it is, what it means and how you can implement it yourself. I'm assuming you are a total beginner: you have never worked with TensorFlow or with neural networks. Within this course, the main goal is to get you started. Starting from scratch, you will learn how you can build your own neural network in TensorFlow. You will master all the basics you need in order to create this model. You will understand the TensorFlow code, so what the code does and what we're doing here. And at the end, as I said before, you will create your own model using a dataset which I provide, which you can find in the download section and download for free, of course. And we will do all of this together in less than three hours. I promise. So, excited? Hopefully you are. I definitely am. And I can't wait to see you in the first lecture, so let's get into it.
2. 2 Understand the relevant steps to build your first neural network from scratch: Hey, students, welcome to the, let's say, agenda of the course: the steps to build your first neural network from scratch. Even if you have no experience with neural networks, as long as you know a little bit of Python code, which is helpful, you're in the right spot to learn how to implement a neural network from scratch. Let's take a look at the steps we need to build our first neural network. The first one is to load the dataset. So first I'm showing you the dataset: what does the data look like? We'll talk a little about it, and then you'll see what we're going to try to do with our neural network. The next thing is that we will set the features and the labels together. What is meant by that? The features are the input for our neural network. This is what we give our network, and the network needs to process those features to then make a prediction, and the prediction will be the label. So we have a supervised learning problem here, which means that we are given features and we are given labels. These are observations we already have, so we can train the network on these features and labels to make predictions, so new labels, and then we can compare the labels we have with the labels our network is predicting. That's the features and labels part. But again, I will talk about this when we implement it in code. Then the next thing is to preprocess the dataset.
So let's face it, no matter what you're doing, whether you're working with Excel or any other kind of data, normally you have data which is not in the right format or the right shape. This is of particular interest for neural networks, because we are not able to feed any kind of text data into our neural network. So the idea is that if we have text data, we need to preprocess it into numerical values. And if we have numerical values, sometimes we want to preprocess those as well, to put them in the right format and give our neural network what it needs to make good predictions. That's what we do in the third step. The next thing is that we will create a training dataset and a testing dataset. This only means that we split the dataset we prepared in step three into a training dataset and a testing dataset. The reason we do this is that the training dataset will be used to train the model itself; that's what I talked about in step two. The testing dataset will later be the dataset we use to get an understanding of how good our model is. So after training is done, we give the testing data to our model, and then we want to see: is the model now capable of predicting the right labels? That's the idea behind step number four. Step number five is that we define the model structure. When we're dealing with TensorFlow and neural networks, there is one special thing we need to keep in mind: first we define the model structure. So think of it like a piece of paper and a pen.
Okay, what we're doing is we draw the structure of a neural network on this piece of paper, on this sheet. After we're done with that, when we have said, okay, this is what the structure of our neural network should look like, then we start to run the code within the structure. So it's first creating the structure and afterwards running the code. That's how TensorFlow works. Here I have a picture of this neural network. This is a basic picture of how you can think of it. We have one input layer. The input layer itself is the training data which we feed into our model. This is where we start; this is the first layer down here. Then we have connections to what's called a hidden layer. In the hidden layer, these balls are neurons. Think of those balls as neurons and think of the arrows as connections to those neurons. This way, each neuron of the previous layer is connected to the neurons of the next layer. That's the idea. Then we have one or several hidden layers; that depends on how complex the model should be. Within these hidden layers we have, in this case, three neurons, but we could also have several more neurons. This is something we have to define at the beginning: we need to define what the structure of the model looks like. That's what I meant when I talked about defining the model structure. The structure means that we need to define how many neurons we want to have within one hidden layer and how many hidden layers we want. This is something we define in the structure. Then we have an output layer, and the output layer only gives us the final prediction. So the output layer should give us a label; that's what it's predicting.
And then we will compare those labels which the network predicts with the true labels we have. Considering how right or wrong the model is gives us an idea, as I said, of how good the model is. And just by the way, because this is a term which often comes up: deep learning. It sounds so fancy, but the main idea behind it is that as soon as we have more than one hidden layer, let's say two or three, it doesn't matter, we can talk about deep learning. So think of another hidden layer here, and we could call it deep learning. Deep learning only refers to the number of hidden layers we have. Normally you can say that the more hidden layers you have, so the deeper the neural network is, the better the predictions it will be able to make. But you also have to keep in mind that the more hidden layers you give the neural network, the longer it takes to train and the more computational resources and power it needs. So this is a trade-off. If you want to make a prediction really quickly, for instance, then of course you would have to sacrifice a little bit of accuracy, which means a little bit of the performance of the network, because you would not use too many hidden layers. Okay, so that's it for that. The next thing is training the model. Training the model simply refers to those connections you can see here. As I said before, we use the features at the beginning and we feed them, it's called feeding, into the input layer. Then we have those connections, which are simply multiplications. So this is only math. All we do here is give it the input features, and that's why I said we need numerical values here.
We convert the features, which means preprocessing them, into numerical values. We multiply those values, so these arrows are only multiplications. That's pure math. We multiply those features, then we arrive here at the hidden layer, and then we multiply again. At the end we arrive at an output, and this is just a numerical value. This is what we then compare to the true label. That's the idea behind it. Training the model simply refers again to those connections we have here, which are, as I said, mathematical functions. It's a simple multiplication, actually a matrix multiplication. So all we're going to do here is adjust the weights, and adjusting the weights simply means we change those multiplications. Even though the input layer is fixed by the preprocessed features, the multiplication matrices we have here, so the values we multiply the input features with, can be changed. These are called weights, and these are the values we will adjust to make better predictions. We will do this for each layer. Here, as I said, we only have one hidden layer. But if we have several hidden layers, then of course we would do this for each of the hidden layers within our model. That is what training the model means. Then, as I said before, in step number four we created a training dataset as well as a testing dataset. After we have trained the model and we say, okay, right now we have a model which is quite good and gives us good predictions, we will test it. Of course we need to test the model afterwards on, let's say, unseen data. We split the data into a training and a testing dataset because we want the test dataset to have had no connection to our model so far. So the test dataset will be used.
At least the features of the test dataset will be used as the input layer. Then we use our trained model, trained on the training dataset, to make predictions, and we compare those predictions with the labels in the test dataset. If they're good as well, then we know we have a really well trained model. If, for instance, the predictions on the test dataset are really weak, then we know our model is not that good, because it's not able to predict on new data which it has not seen before, which is the test data. And the final thing is that if the test data is predicted really well too, so the neural network is performing well on the test data, then we can make predictions in the future: we can give the model completely new, unseen data and it will make a prediction. So this is it. Hopefully you understood what I'm telling you. If not, feel free to ask me, of course, but I will refer to all of these things when we go into the coding, and that's what we're going to do next. Okay, so thanks for watching, and now let's go into the fun part. See you in the next video.
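The steps from this lecture can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the course's TensorFlow code: the tiny data set, the color-to-number encoding, the 0.5 threshold and the hand-picked weights are all made up for the example, and the weights are fixed by hand rather than learned, which is exactly the part that training would take over.

```python
# Plain-Python sketch of the lecture's steps; no TensorFlow involved.
# All data, the encoding and the weights are made-up illustration values.

# Step 1: a tiny data set of (text feature, numeric feature, label).
rows = [
    ("red", 1.0, 1),
    ("blue", 2.0, 0),
    ("red", 3.0, 1),
    ("blue", 4.0, 0),
    ("red", 5.0, 1),
]

# Step 3: text cannot be fed into a network, so map it to a number first.
color_to_number = {"red": 1.0, "blue": 0.0}

# Step 2: separate the features (input) from the labels (expected output).
features = [[color_to_number[color], size] for color, size, _ in rows]
labels = [label for _, _, label in rows]

# Step 4: keep the last rows aside as unseen test data.
split = 3
train_x, test_x = features[:split], features[split:]
train_y, test_y = labels[:split], labels[split:]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector: the 'arrows' in the picture."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


# Step 5: the structure. One hidden layer with three neurons, one output neuron.
hidden_weights = [[1.0, 0.0], [0.5, 0.1], [0.0, 0.2]]  # 3 neurons x 2 inputs
output_weights = [[1.0, 0.0, 0.0]]                      # 1 output x 3 neurons


def predict(x):
    """Feed one feature vector through both layers and threshold the result."""
    hidden = matvec(hidden_weights, x)
    output = matvec(output_weights, hidden)[0]
    return 1 if output >= 0.5 else 0


# Testing: compare predictions on the held-out features with the true labels.
test_predictions = [predict(x) for x in test_x]
accuracy = sum(p == y for p, y in zip(test_predictions, test_y)) / len(test_y)
print(test_predictions, accuracy)
```

With these hand-picked weights the two held-out rows happen to be predicted correctly; a real network would start from other weights and adjust them during training until the predictions on the training data are good.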
3. 3 What are tensors: Hi, students. Welcome back. Before we start diving into the code, let's briefly talk about why it's actually called TensorFlow. So what are tensors? You might ask yourself this, I guess. If you think about math in school, what you are probably familiar with are vectors. So what is a vector? Let's take a look. I think you know what a scalar is: a scalar is simply a number. For instance, a one would be a scalar. A vector is a bunch of numbers in one dimension. So it's a one-dimensional bunch of numbers; that's a vector. The next thing you might also remember from school, I guess, is the matrix. A matrix is nothing more than a two-dimensional vector, if you want, because you have rows and you have columns. If you think about database tables, for instance, that's a matrix. That's what it looks like. Now, what is a tensor? This would be, for instance, a tensor, and this would be a three-dimensional tensor, because in this case I plotted it here in three dimensions. But a tensor could easily have 4, 5, 6 or n dimensions. It's not limited to three. So tensors are kinds of objects with higher dimensions; at least, this is what is often propagated. However, if you think back to the vector, you could also call the vector a one-dimensional tensor, because that's what it is. A vector is nothing more than a one-dimensional tensor, and for the matrix it's then pretty simple: it's a two-dimensional tensor. And this specific tensor I plotted on the right is, as I already told you, a three-dimensional tensor. So actually, everything is a tensor.
It begins with a vector, which has just one dimension. People know the word vector, so it's more familiar than tensor, of course, but it would still be a one-dimensional tensor. The same is true for the matrix. People know what a matrix is, and not only from the movie, but it can also be called a two-dimensional tensor; that would also be right. And a three-dimensional tensor is, as I said, what you see on the right side. This goes on up to n, so an n-dimensional tensor with n dimensions also exists. But of course I would not be able to plot this in a two-dimensional PowerPoint presentation, right? Okay, so those are tensors. The reason why it's called TensorFlow is that we are doing multiplications with tensors. We can feed multidimensional data into the input layer, so multidimensional tensors if you want, and we multiply those tensors with weight matrices, or weight tensors if you want, to arrive at the final output. This might be a little bit abstract, I get that, especially if you're not that deep into math. But I think it's important to know that this is why it's called TensorFlow: we have tensors which flow through our network. And now you know what a tensor is. Okay, so that's it for that, and I'll see you in the next video. Until then, bye, guys.
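The idea that a scalar, a vector and a matrix are all tensors of different dimension can be made concrete with nested lists. The sketch below is plain Python; the helper name rank is an assumption for this illustration and is not a TensorFlow function, and it assumes regular, non-empty nested lists.

```python
def rank(value):
    """Count the dimensions of a nested list (0 for a plain number).

    Hypothetical helper for illustration; assumes regular, non-empty lists.
    """
    depth = 0
    while isinstance(value, list):
        depth += 1
        value = value[0]
    return depth


scalar = 1                                       # a single number
vector = [1, 2, 3]                               # one dimension
matrix = [[1, 2, 3], [4, 5, 6]]                  # rows and columns
tensor3 = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]   # three dimensions

for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("tensor3", tensor3)]:
    print(name, "has", rank(value), "dimension(s)")
```

Each extra level of nesting adds one dimension, which is why the same pattern continues to 4, 5 or n dimensions even though it can no longer be drawn on a slide.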
4. 4 Intro to Tensorflow datatypes: Hey, guys, welcome back. Remember when I talked to you about the structure of what we are doing here? Point number five was that we define the model structure. Remember what I said: in TensorFlow we first define the structure of the model. This is what you can see on the right, so what the model should look like. This whole structure on the right will be defined first. After we have finished the process of designing it, we will execute it: we feed data into this model, execute it and train the model. That's how it works. All right, so let's get out of here and get into the code. Within the code, let's first import the library we need, of course, which is plain TensorFlow. All I need to do in order to import it is call import, in lowercase letters of course: import tensorflow as tf. The tf is simply an abbreviation. You could use anything, but in case you don't want to write tensorflow the whole time, simply write tf and you can use it this way. Okay, so we import tensorflow as tf, and now I will define something here: I will define nodes. What is a node? If I go back to my PowerPoint here, all these neurons are also called nodes. This is simply another term for the neuron itself. So I create a node, let's say node_first. node_first is equal to, and now I have to define a kind of data type within TensorFlow. In TensorFlow we have different kinds of data types: we have constants, we have variables and we have placeholders. We will talk about placeholders and variables later on, but let's first talk about constants.
Okay, so all I need to say is tf, my abbreviation for TensorFlow, and then .constant, so I can define a constant here. I need to put in two things, or actually only one thing, but if I want to be more specific I will also put in the data type. So let's say, for instance, 10.0. This is a decimal number, as you can see. Now I could also specify the data type if I want to. A decimal number is called a float, and you could say 64 or 32 bit; in this case I say 32 bit, so it's tf.float32. That's my data type. So node_first in this case is a constant with the value 10.0 and a data type of tf.float32. All I want to do now is create a second node, node_second, and again I say it's a tf.constant and define it. Let's say, for instance, it's 20.0. I could again specify the data type if I want; I could also leave it out. But in this case I like to do it, so I say float32. I think that's much cleaner code. This is all I do: I define two constants from TensorFlow and I save those constants in node_first and node_second, nothing more. And now I define a session. You might think this is really plain and simple, because it's only the introduction to TensorFlow, but in our case this is now our model. Think of it really abstractly: it's not the model itself, but this is the model structure. So far we have only defined the structure; we did not execute the code. I could run this, but I won't get any output, and the reason is that I have only designed the structure of the model.
So if I want to print something now, if I want to create an output here, I need to create a session. That's what it's called in TensorFlow. I write with tf.Session(), so I define a session, and I need to add the parentheses because it's a call, and I say as sess. This sess is again only an abbreviation, like tf for TensorFlow. So I have defined the session, and then I simply add a colon. Now I can do my executing, so I can now execute the code. What do I want to do? I could save the result in an output variable, so I say output is equal to, and now I use my session, abbreviated as sess, and I run it. It's called running, so I say sess.run in this case, and all I need to do is give it the variables, or whatever I want to run. Here we have only defined constants, so all I do is put those constants in a list: I want to run node_first and I want to run node_second. So I run only those two and save the result in output, and now I can print it: print(output). That's it. If I now execute my code again, you can see that I get the output: in this case a list of 10 and 20. That's everything it does. What I want you to understand here is not only what a constant is, of course, but also that we define the structure, and without executing it within a session, the structure means nothing. The structure doesn't give us any output. Only when we create a session and run the session with the structure we defined do we get the output. That's it for the introduction to constants, and for remembering what we actually have here. We have what TensorFlow is: designing a structure and then, in a second step, executing the code within a session. Okay, so thanks for watching the introduction, and I'll see you in the next video. Until then, bye, guys.
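The define-first, run-later pattern from this lecture can be imitated without installing anything. The sketch below is a plain-Python stand-in, not TensorFlow: the classes Constant and Session are hypothetical and only mimic the roles that tf.constant and tf.Session play in the lecture.

```python
class Constant:
    """A node holding a fixed value; hypothetical stand-in for tf.constant."""

    def __init__(self, value):
        self.value = value


class Session:
    """Executes nodes on request; hypothetical stand-in for tf.Session."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.closed = True  # leaving the with block closes the session
        return False

    def run(self, nodes):
        # Only here are the values of the requested nodes actually produced.
        return [node.value for node in nodes]


# Designing the structure: nothing is computed or printed at this point.
node_first = Constant(10.0)
node_second = Constant(20.0)

# Executing the structure inside a session.
with Session() as sess:
    output = sess.run([node_first, node_second])

print(output)
```

Defining the two nodes produces no output on its own; the list of values only appears once the session runs them, which mirrors the behaviour described in the lecture.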
5. 5 Tensorflow datatypes and operations: Hi students, nice to have you back. In this video we want to talk a little bit more about what we have done in the last video, and of course we want to extend it. Let's briefly recap what we did in the last video. We designed the structure, in this case not of a neural network so far, but we defined a structure within TensorFlow: a graph. This is our TensorFlow graph, in this case only those two constants here. Then we ran the session, where we executed what we had designed up here. This, of course, is a little bit boring, because all we did was define two constants and then print them. But we could of course also run any kind of operation in here, and that's what we want to do now. So I enter a new cell here in my Jupyter notebook, and I make it easy for myself: I take my code and simply copy it, because what we do now is a mathematical operation. I could say division, for instance. So I say division is equal to node_first, referring to my constant, and I simply divide it by node_second. So I divide those two and save the result in division. Then, of course, I want to run my code again. But in this case, within my session, I don't want to pass these two as a list. What I want to run is simply my division. So I run division and save it again in the output variable. Let me run this and then explain a few things. You can see I get 0.5, which is simply 10 divided by 20. The important thing here is the following: we defined the division here as node_first divided by node_second, so we are referring to those two constants which we defined up here. Okay.
However, when we run the session, we only run division. We do not have to run node_first and node_second first as a kind of initialization. We don't need to do that, because all we need to do is run the division itself, and the division then uses node_first and node_second. It refers to those two constants and evaluates them, and then it prints, in this case, 0.5. Okay, so that's it for the division. The next thing I want to show you is a new data type which is really, really important when we create the neural network itself: the placeholder. So far we have talked about constants, right? But the problem with a constant is that it's a fixed number. In this case the constant will always be 10. If I call node_first, it will always be 10. But what we actually want is to put input into our graph. We want to have our features, which we talked about at the beginning, and we want to feed in the features. So we want to put external data into our model, and the model should use this data. How can we do this? Probably not with constants. What we need instead are placeholders. I also write the import again every time. Of course it's only necessary once, but I always write import tensorflow as tf so that you can always run those code snippets on their own; you don't have to start from the first one again if you don't want to. Okay, so we import tensorflow as tf, and now I want to define the placeholder. I say place_a is equal to tf.placeholder, and in this case I only need to define the data type, so I say it's again tf.float32. That defines my placeholder, and I will do this a second time: I say place_b is equal to tf.placeholder.
So this is another placeholder where I want to feed in some data, and I say tf.float32. Okay, so I have defined two placeholders, which are two variables that are currently empty. That's why they're called placeholders. What we do here is make a kind of promise. We basically say: currently these are placeholders, but when we execute our code, we will put some data into them. That's what we're doing here. So I have defined my placeholders, and now I can define, in this case, an addition. Again we use a mathematical operation: I say addition is equal to place_a, and now I simply add place_b. So we add those two placeholders and save the result in a variable called addition. And you already know that we then have to run the session, so I say with tf.Session. Just by the way, in case you don't know why I use the with keyword here: I don't have to use it. I could also simply write sess = tf.Session() and run this. But if I do that, I have to close the session afterwards, so I would have to write sess.close() at the end. If I simply want to save this one line of code, I can use a with statement. It simply means: run this, and as soon as you're done with all of the code within the with statement, close the session. That's the beautiful thing about with, and that's why I like to work with it. Okay, so with tf.Session, and let's not forget the two parentheses, and then again I say as sess. Now I want to run this and save the result in a variable, so let's say output again.
Output is equal to, and I simply run the addition: output will be equal to sess.run, and what we want to run is the addition. But now you also have to give it a so-called dictionary. Why? Look at what we have: we have two placeholders and an addition, and within the session we execute the addition. The problem is that currently placeholder a and placeholder b are empty, because they are only placeholders. There are no numerical values in them yet, so we need to give them some additional values. We call sess.run and say we want to run addition, but we give it additional values as well. So simply add a comma, and then all you need to do is give it a dictionary. I write two curly braces, and inside I put my placeholder a. I first wrote just a, but you have to name it like the placeholder, so, sorry, place_a. For place_a I now want to feed in some values. Whether I want to feed in more than one number or only one, always use square brackets here; this is the notation within TensorFlow. In this case, for instance, I want to feed in three numbers, so let's say 1, 2 and 3 for my placeholder a. Then I do the same for my placeholder b, so I write place_b. Let me add a space here; that's cleaner. For place_b, what do I want to feed in? For instance 4, 5 and 6. So also three numbers. This, of course, has to be the same count, right, because we will execute this.
And so these two have to match up in how many numbers they contain: three numbers here, three numbers there. It could be more, but make sure that both have the same count. Then we need to close the dictionary, and then we close the sess.run call. What we want to do next is print the output, so print(output). Then I execute this and I get an error, of course, because I misspelled tensorflow. Okay, I run it again, and you can see I get 5, 7 and 9. Basically, it takes the 1 and the 4, puts them into the placeholders, adds them in the addition and prints the output. It does this for all three values together and returns this one result. Because when we're dealing with TensorFlow, again, it's not executing on, let's say, a single number. We're dealing with tensors, right? We now know what tensors are. So we're dealing with multidimensional arrays; tensors are also arrays. That's why these are executed together. That's why it's adding 1 to 4, 2 to 5 and 3 to 6, and then it returns 5, 7, 9. That's what you have to keep in mind: it always executes this as a kind of batch, so together. All right, so that's it for the placeholder, which again is really important when we create the neural network itself. Okay, so thanks for watching, and I hope to see you in the next video. Until then, bye, guys.
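The promise a placeholder makes, and the way a feed dictionary fulfils it at run time, can again be imitated in plain Python. This is a hedged stand-in rather than TensorFlow itself: Placeholder, Add and Session are hypothetical classes, and the strict length check is a simplification of the lecture's rule that both inputs must contain the same number of values.

```python
class Placeholder:
    """An empty slot; its values arrive at run time through the feed dictionary."""


class Add:
    """Records that two nodes should be added; computes nothing by itself."""

    def __init__(self, left, right):
        self.left = left
        self.right = right


class Session:
    """Hypothetical stand-in for tf.Session that can run an Add node."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        return False

    def run(self, node, feed_dict):
        left = feed_dict[node.left]
        right = feed_dict[node.right]
        if len(left) != len(right):
            # Simplification: insist on equally long inputs, as in the lecture.
            raise ValueError("both placeholders need the same number of values")
        # The whole batch is added element by element in one run.
        return [a + b for a, b in zip(left, right)]


# Structure: two empty placeholders and an addition that refers to them.
place_a = Placeholder()
place_b = Placeholder()
addition = Add(place_a, place_b)

# Execution: the dictionary maps each placeholder to the data it receives.
with Session() as sess:
    output = sess.run(addition, {place_a: [1.0, 2.0, 3.0],
                                 place_b: [4.0, 5.0, 6.0]})

print(output)
```

The placeholder objects themselves serve as the dictionary keys, which matches the lecture's point that the key has to be the placeholder you defined, not an arbitrary name.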
6. 6 Tensorflow datatypes variables: Hi, students. Welcome back. We now know what a constant is and what a placeholder is. We have talked about placeholders, and we know that if we want to put some data into our model, so external data, we have to use placeholders, which are nothing more than a kind of promise that we will feed in some data later on. Now, one more important data type within TensorFlow is the variable. You might think about it the following way. When we train our model, what we do is adjust the arrows you've seen within the PowerPoint. Let me go back there so you understand. Those arrows here are simply mathematical operations. We multiply what we have in our input layer, which works because we converted it into numerical values, with our weight matrices here, and then a new output follows. In order to train the model, we have to adjust the weights, okay, those weight matrices here. And in order to adjust them, we have to be able to create some kind of data type within TensorFlow which can be adjusted. To do this, we create variables. So the third data type, besides constants and the placeholder, is the variable. Again, I will create some. So import tensorflow, and please make sure that you're coding along with me, because that's how you learn the most, believe me. Okay, so import tensorflow as tf, and now I create my variables. Within a neural network we always have weights and we have biases, and I'll talk about them in the next video. For the weight, I simply refer to it as w. And I say my weight is tf.Variable.
So I'm defining a variable now, and all I want to do is put some values in here. For instance, I could say my weight in this case is 0.5. Okay, that's my weight. And then again, as before, I will define the data type, and in this case again it's a float. So I say tf.float32. Okay, so I have defined my weight here as a variable with a starting value of 0.5. Okay. And I'll do this also for my bias. So I have a bias here, b, and it is equal to, and again I'm referring to tf.Variable. And now I create a second value here, and I say this value should be, for instance, minus 0.1. Okay. And then I use a comma, and again I define the data type as tf.float32. Okay, now I've got this as well. And now I also define a placeholder. We already know what a placeholder is, and I'm using x as a placeholder, so I say x is equal to tf.placeholder. In this case it's only the data type, so I say the placeholder will be tf.float32, like we've seen before. Okay, now I have defined two variables and a placeholder. Okay? And now what I want to do is create a model. So let's say my model is equal to W times x plus b. Okay, that's the formula. So all I do is use my weight matrix here, I have my bias here, which I add, and I have my placeholder. So all I do is feed in some data here, then multiply this data by the weight matrix and add the bias. Okay. And this will look familiar later on when you write the TensorFlow code for the neural network. Okay, so this is the model. And now, whenever we're dealing with variables, we need to initialize them. So this is one more line of code which you need to implement when you're dealing with variables. Okay.
So I simply call it initializer. For instance, I say my initializer is tf.global, it's called global_variables_initializer. Okay, initializer, like that. Okay, and two parentheses. All right. So this simply means that I want to initialize those variables here, and I need to define that explicitly. Okay? And then I will do what you already know: I will start creating a session, because so far nothing will happen. I have the variables in the initializer, but they're not initialized yet because the session has not been run. Okay, so far, again, we have only designed the structure. Now we want to run the session, so I say sess is equal to tf.Session, so I'm starting a session. And of course I want to use the with keyword, so I say with, and do it this way: with tf.Session() as sess. So we're in my session, and now I can define what I want to do. Okay. I want to execute my model, which I created here. You could save this in an output variable if you want. You could say output is equal to, and I could write this out, but in this case I use the model. Okay, so that's my variable to save it in. So I have started the session, and now I want to run it, and I need to run two things here. First, I need to run my initializer, so I write sess.run and I put in my initializer. Okay, this one. This simply means that now I'm initializing my variables. This will be the first part. And the second part is that I then execute my model. So I could say print and do this in one line, or I could define it here and say, for instance, this is my output again. My output is equal to, and I say sess.run. So we initialized the variables and now we will run our model. I will run the model. And of course, as I've done before, I have a placeholder.
So I need to give it some data, okay, within a dictionary in Python. So I could say my dictionary, and I put in x, colon, and then I could give it a list. Okay, a list of values like 5, 6, 7 and 8. Okay, these are my values here. And now I can start printing this, so I say print output. And let's see, hopefully I did not spell anything wrong. Let's execute the code and then we'll see. Okay, I run this and you can see I get the output. Okay? Everything looks fine. And all we did here is execute this here. Okay? We initialized the variables here, so 0.5 and minus 0.1, and then we simply used those variables and executed the code. Okay? And finally, of course, we print it, and then the session is terminated because we're using the with keyword here. Okay, so this is it for the variables themselves, and we will rely on those when we create the network. So that's it for this video, and I'll see you in the next video. Until then, best, guys.
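The computation built in this video can be sketched in plain Python as follows. This is only a stand-in for what the graph evaluates, not TensorFlow code; the function name model is an assumption, while the starting values 0.5 and minus 0.1 and the inputs 5, 6, 7, 8 come from the lecture.

```python
# Plain-Python sketch of W * x + b (not TensorFlow). In the lecture W and b
# are variables and x is a placeholder that gets its values from a dictionary.
W = 0.5    # starting value of the weight
b = -0.1   # starting value of the bias


def model(x_values, weight, bias):
    """Apply weight * x + bias to every fed-in value, as one batch."""
    return [weight * x + bias for x in x_values]


output = model([5, 6, 7, 8], W, b)
print([round(v, 2) for v in output])  # [2.4, 2.9, 3.4, 3.9]
```

Because W and b are ordinary names here, changing them and calling model again gives new outputs, which is the property the lecture relies on when it says variables can be adjusted during training.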
7. 7 How does the network learn what does loss mean: Hi and welcome back. Now let's go to the next step and take this a little bit further. So in the last video, we talked about variables and we used some kind of operation here. We multiply the weights by our inputs and we add the bias. And to understand this a little more clearly, let me again refer to this slide here. So basically, the x value here, this one, is nothing more than our input layer. Okay, so this will be the x value in our case: the input layer. The weights here will be those matrices we are multiplying x with. Okay, so basically these arrows simply mean that we take the x and multiply it by the W. Okay, we multiply by the W and then we arrive at these outputs here. But we take it one step further, and this is also common in neural networks: we add a bias. It's called a bias, and we simply add this term. So basically each of these arrows simply refers to taking the x value here, multiplying it by the W and adding this bias, and then we're here. Okay, so this will be the next step. Afterwards there's still an activation function, but for now that's everything you need to know. Okay? And we go through the network this way, depending on how many hidden layers we have, and then at the end we arrive at an output. Okay, so we're doing this matrix multiplication several times. In this case we only do it once, but of course if we have a deeper network, we would do it several times, depending on how many hidden layers we have. Okay, so let's get back in. All right. Now, what we want to think about is: what if we have another placeholder, which is an output, and that's what we call the label? Okay, let's say again we have an output here, which is y. Okay, we have an output y.
And we're multiplying those, we're doing these steps, and then our model gives us an output. And what we want to do is compare the output which our model creates, in this case what we saved here in the model, so W times x plus b. The output will be the model, so these values which we have here, okay, those ones. And we want to compare those outputs with the true outputs which we have, okay, with the true values. Because the difference between what our model says and what the real outputs are gives us an idea of how well our model is performing, okay. And this is then something we have to optimize, right? At the end, what we do is adjust those weights and biases which we created here, so we'll adjust those values. And that's why we used variables to create them, because our model is able to adjust those values, 0.5 and minus 0.1. Okay, those two values will be adjusted. So the model will change those values in order to make better predictions, so that we arrive at the real y values, the real output. Okay? And this is done during the training process of the model. Okay, then let's do it. Now let's take a look at how our model is performing so far. Okay, so in this case, if we're referring to this simple model, we could again say import. I'll do it a little bit faster now, because it's the same code so far. So import tensorflow as tf. Okay, and then I say again my weight matrix is tf.Variable. Again, I encourage you to code along with me and to code it several times, because the more often you do it, the better you get. Okay, so I'm referring to 0.5, and I say it's tf.float32, so I define the data type. I do the same with my bias, so my bias is equal to tf.Variable. And again, I put it within the brackets. Never forget the brackets, otherwise you will receive errors later on. And I say minus 0.1.
I put it in here. Again I define the data type, tf.float32, and I've got my second variable. Okay. And now I will define my x value, so x is equal to W, sorry, my model. My model is equal to W times x, and I add b. Okay, so that's my model again, and I'll speed up a little bit. Now I need to create another placeholder, which we haven't done so far. Okay? Now I want to have a placeholder for my true values. So what I will do is say y is equal to tf.placeholder. And I actually forgot something, so let me copy that, because for the x, of course, we also need the placeholder, right? So I simply paste it here, and I've got my W, my b and my x, okay, and I've got my model here. And this will output, as we have seen, whatever is in the model here, if we simply run this, save it in the output and print the output. This is what the model returns to us. And we want to compare those values, as I said before, with the true values, and the true values will be those y values here. So I define this as a placeholder, like that. And I say the y is again a tf.float32, because it looks the same as the x, right? But the y will be the output and the x will be the input. We will use the x to calculate the output of the model, and then we'll compare this output to the true output, y. So what we need now is to define what's called a loss function. In TensorFlow we're dealing with loss functions. And the loss function is nothing more than the difference, in whatever way you calculate that difference, between the y and the output of the model. And the bigger this difference, the worse the model.
Okay, pretty simple. Because what we want at the end is to have the difference between the outputs of the model and the true outputs, the true labels, the y values, close to zero. Because then the model would be 100% accurate and it would be a perfect model, right? In reality you never have 100%, but this is what we're aspiring to. So okay, I defined the y. And now what I want to do is create my loss function. So let's simply put it in here: I will create a loss function, okay? And if you write this and use the hash symbol, it's simply a comment. So: create a loss function. And the loss function itself is simply the squared delta. Okay, the squared difference between y and the values the model is outputting. So I could simply say squared_difference is equal to tf.square. So please square the difference between the model, so the output of the model, minus the true values, y. So we simply subtract the true values from what the model is outputting, and then we square it. Okay, so we square all these values and then we take a look at how big this number is. Okay. And now the loss, this will be the next step here. So far, the squared difference is exactly this difference between those two, squared, okay. And the loss is equal to, and this is called in TensorFlow tf.reduce_sum. Okay, we want to reduce, so to take the sum of these, and that is the loss, so we simply put in the squared difference. Okay, squared_difference. So that's the difference, and we reduce_sum it. This is the loss. Okay? And now I can go on and use my initializer. So I say initializer, like that. My initializer is tf.global_variables_initializer, with two parentheses.
Don't forget them. And then what we want to do is run the session, so I say with tf.Session() as sess, so we'll use an abbreviation again. And now, within my session, I want to run things. So the first thing, of course, is that I want to run the initializer: sess.run, and I put in the initializer. And then, of course, I also want to output something. So, for instance, in this case what I want to output is the loss. Okay? I want to know how big the difference is. Okay. So I could simply run this and say print in this case, and I will do it here directly. I don't save it in a variable, so I can also use this code and put it directly in the print function. Okay, so I say print and I run the session, so sess.run. And I want to run my loss function, so I want to run this loss here. And what I need to give it, of course, is my dictionary, because the loss refers to squared_difference; squared_difference itself refers to y, which is a placeholder, and to the model; and the model in turn refers to W and b, which are initialized with the global variables initializer, but also to x, and x is also a placeholder. All right, so what I need to do here is feed in the dictionary. So again, I say in this case my dictionary will contain x, and within x I will feed in a list of values, let's say 1, 2 and 3, for instance. And I will also do this for the y. So I say y, and for the y I will also give it three values here, so let's say, for instance, 5, 6 and 7. Okay, like that. And then I can simply execute this code, and again there is an error. Let's see: initializer, tf.global, it's about the global variables initializer. And of course, this is a spelling mistake, as always.
It's global_variables_initializer. Okay, where's the spelling mistake? global, variables, and initializer. Sorry for that. And I run that, and you can see now I get 78.53. So be careful with spelling mistakes; I'm sorry for that. Okay, so I run this, and what I get here is exactly this. Okay, so I have my x values, I multiplied them by the W, and I added the bias, as I said in the model. And the model gives me some outputs, as we've seen up here. But then I also have the true outputs, the y values. And what I do is simply take the squared difference between them and calculate the loss. And this is the loss. Okay, so currently this loss is really, really big. The model is outputting some values for y based on x, but the true values are 5, 6, 7. So there's a huge difference, and the difference in total is this one value here. And by the way, in case you haven't fully understood it, or I haven't explained it correctly or deeply enough: reduce_sum simply means that we would actually have three different outputs here, right? Like here, for instance, we used four x values, so we had four outputs, and here we have three outputs for the x, and for the y we have three outputs, and with reduce_sum we reduce those three values to one value. Okay, so this is the total loss for all of our three outputs. That's why we have to use reduce_sum; that's what it's called. So this is the output, okay. And the whole idea behind the neural network itself is now adjusting W and b, okay, because those are variables. Again, I'm referring to variables, which means they can be adjusted when we train the model. So the model learns what the variable W and what the variable b have to be.
So how can it adjust those variables so that the output of this multiplication is closer to the true values, y, and so that this difference, the squared difference here, reduced to one value, is close to zero? Okay, because if we achieve this, if we adjust W and b and get a value here in the loss function close to zero, this simply means that we have a really, really well-trained model, and this is where we want to go. Okay? This is the target: to achieve this state, so that this value, which is currently pretty high, becomes really, really low. Okay. And that's the process which is called gradient descent. I won't go into the details of how gradient descent works, because that's way too much math, but the whole structure and the process itself is hopefully clear now, so what we're trying to achieve. If not, of course, you can ask a question. But hopefully it's clear. And yeah, that's it for this video, and I'll see you in the next video. Until then, best, guys.
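The loss of 78.53 from this video can be reproduced by hand. The following plain-Python sketch is an illustration of the arithmetic, not TensorFlow code; the variable names are assumptions, while the values for W, b, x and y are the ones used in the lecture.

```python
# Plain-Python sketch of the loss calculation (not TensorFlow).
W, b = 0.5, -0.1
x_values = [1, 2, 3]   # inputs fed in for x
y_values = [5, 6, 7]   # true values fed in for y

# What the model outputs for each x: roughly 0.4, 0.9 and 1.4.
predictions = [W * x + b for x in x_values]

# One squared difference per value: roughly 21.16, 26.01 and 31.36.
squared_difference = [(p - y) ** 2 for p, y in zip(predictions, y_values)]

# Summing collapses the three values into one number, like reduce_sum does.
loss = sum(squared_difference)
print(round(loss, 2))  # 78.53
```

Keeping squared_difference as a list of three numbers before summing makes the role of the sum visible: it is only the last step that turns three per-value errors into a single loss.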
8. 8 How optimization works finishing with the basics: Hey, guys, welcome back. Now let's implement the optimization in code, so in TensorFlow, and I'm going to show you how we can do it. Okay? So at first I will, of course, create a new line here. And as you've seen before, we have a really, really high loss here. Okay, because our model is not trained. Okay, it's not trained. We simply feed in some data here for x, and it simply gives us some data for y, which is the prediction based on W with 0.5 and on the bias of minus 0.1. But as you've seen here, the loss is way too high. So we need to optimize that to get better predictions for y, okay, and a smaller loss. So what we need to do is add a few lines to that. That's why I simply copy this and paste it here; otherwise you would see me putting in the whole code again, and I don't want to bore you. Okay? All right. So we have defined the loss here. But what we will also define is an optimizer, and this is done by gradient descent. So what I will do is write optimizer, so I will create an optimizer. And this optimizer is equal to tf.train.GradientDescentOptimizer. So that's, let's say, the algorithm we use, gradient descent, or the math we use. GradientDescentOptimizer, like that. And you can also specify a learning rate if you want. The learning rate is how fast the optimizer should work, okay, so how fast it should adjust the weights. I don't want to go into details here; I can't do it in such a short time. But you could specify a default value like 0.01. This is a value which is often used, and you can play around with it. You can adjust this learning rate, of course, if you want. But if you want to stick to a default learning rate, 0.01 is really, really common.
Okay, so we define the optimizer, and we say the optimizer is tf.train.GradientDescentOptimizer, and we give it a learning rate. So the optimizer currently doesn't do anything, right? Again, at first we're still only designing the structure. Only in the session itself here are you running the structure. But we haven't done anything with the optimizer yet. So what we need to do now is create a new line, which is simply the training. Okay, so the training process, or simply train here, is equal to, and now I use my optimizer. So optimizer dot, and what do I want to do with this optimizer? I want to minimize the loss. So I simply say minimize, and I want to minimize the loss. Okay, so we're creating the optimizer here, the gradient descent optimizer, and we save it in the optimizer variable. And then we want to minimize something with this optimizer, which is the loss function, and the loss function again is the squared difference between the output of the model and the true values, y. Okay, so we want to optimize this. And then what I want to do is sess.run the initializer, that's okay, and I can print here, in this case, the loss for the x and the y, as before. But then I need to create a loop, because in this case I need to go through steps to train my model. So I could say for i in range, and I can define how often I want to train. In this case, let's say only 100 times; that's enough for now. And I say, okay, I want to go 100 steps, and in each step I want to train, or adjust, my weights and my biases. So these two variables, and again, variables are changeable. We want to adjust those variables so that we get better predictions here in the model, concerning the difference to the true values. Okay, so for i in range 100, and I need to run, so sess.run. Always remember that.
If you want to execute something from your creation here, or from your model, whatever you want to call it, that you create up here, so from the structure, then you have to put it in the run. Okay, you have to run a session, so sess.run. And I want to run the training process, so my optimizer minimizing the loss. So run the training, and of course I need to give it the values here. In this case I refer to those two, so I can paste this in here as a dictionary. And let's see, this is the first. That's okay, so close the parenthesis. I want to run this, and I can also print this each time if I want to. So I say print, and I say sess.run, and I want to print W as well as b. Okay. And if I execute this now, so let me run this, then that's what you get. Okay, so basically we have the 78.53 here again, which we printed before. This is simply the printing of the loss. Okay, this was the actual loss when we did not adjust anything, when we kept W as 0.5 and kept b as minus 0.1. We simply multiplied them by the values 1, 2, 3, then we calculated the difference between this output and the y values, and then we simply put it into one value with reduce_sum. So this is the original loss. And then what we do is adjust the weights. Okay? So in this first adjustment, you can see it starts with 1.13. Okay, so this means that now, instead of 0.5, it uses 1.13 for the W, and instead of minus 0.1, it uses 0.205 and so on for the b. So this is the starting value. Okay? And then it's calculating the loss again, so it's comparing the output of the model to the true values.
It's comparing the model to the true values with the squared difference, and then it tries again to adjust W and b further to reduce the loss further. Okay, and at the end, what you can see: these are all the values here, so 100 iterations, so 100 times, and at the end it produced this one here. Okay, so it has a value of 1.69 and so on for the W, and for the bias it has a value of 2.42 and so on. So it has optimized the original W of 0.5 and the original bias of minus 0.1 to the values you see down here. And we can check this if I print my loss function again, so if I now run this line again. Remember, we've seen here that we started with 78.53 as a loss. And if I now, after my for loop, after my adjustments of the weights and biases, run the loss function again, if I now execute this code, then I get, now remember, we're starting with this loss, then we adjust the weights 100 times in this case, because we use the for loop for 100 times. We adjusted the weights, and down here our new loss is only 1.0695. Okay, this is the new loss. So we started with 78 as a loss, and our optimizer trained the model, so it adjusted the weights and biases, W and b, 100 times, and at the end we only had a loss of 1.0695. And you can check this if you want. Simply use those two values here. Okay? All you need to do is take those two values, multiply the weight by 1, 2 and 3 each time and add the bias, then subtract the output from 5, 6 and 7, and then square it, which we've done here with the squared difference, and sum it up. Then you arrive at exactly this loss. Okay, so this is how it works. Okay? This is the whole structure of how TensorFlow works. So we have our weights and our biases. We have placeholders, which are the data, which will be the features and labels we feed in. Then we multiply the x values by the weights and add the biases.
We have an output, which is in this case the model. We compare the model to the true outputs. We calculate the loss, and then we use an optimizer, most of the time a gradient descent optimizer. This is basically the state of the art, which is commonly used, and we use this optimizer to reduce the difference here between those values by adjusting the variables. Okay. And you have seen that we started with this high loss, and at the end, after only 100 iterations with our simple logic here, we arrived at a loss of only 1.06 or 1.07, close to that. Okay, so that's how optimization works. And that's the last video of the introduction to the basic structure, because in the next video we will start implementing this in our real example in TensorFlow. Okay, so thanks for watching. As always, hopefully you enjoyed it, hopefully you learned something, and get ready to start with the next lecture. See you there.
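For readers who want to see what the 100 training steps do to W and b, here is a plain-Python sketch of gradient descent on the same loss. It is not the TensorFlow optimizer: the gradients are written out by hand here, which is an assumption about the mechanics made for illustration, and the helper names loss_fn and gradients are hypothetical. The data, starting values, 100 steps and learning rate of 0.01 are taken from the lecture.

```python
# Plain-Python sketch of the training loop (not TensorFlow). The gradients
# are derived by hand here; the lecture's optimizer works them out itself.
x_values = [1.0, 2.0, 3.0]
y_values = [5.0, 6.0, 7.0]
W, b = 0.5, -0.1
learning_rate = 0.01


def loss_fn(weight, bias):
    """Sum of squared differences between model output and true values."""
    return sum((weight * x + bias - y) ** 2 for x, y in zip(x_values, y_values))


def gradients(weight, bias):
    """Partial derivatives of the summed squared difference for W and b."""
    errors = [weight * x + bias - y for x, y in zip(x_values, y_values)]
    grad_w = sum(2 * e * x for e, x in zip(errors, x_values))
    grad_b = sum(2 * e for e in errors)
    return grad_w, grad_b


initial_loss = loss_fn(W, b)
history = []  # (W, b) after every step, like printing them inside the loop
for step in range(100):
    grad_w, grad_b = gradients(W, b)
    W -= learning_rate * grad_w  # move against the gradient
    b -= learning_rate * grad_b
    history.append((W, b))

final_loss = loss_fn(W, b)
print(round(initial_loss, 2))       # loss before any adjustment
print(history[0])                   # W and b after the first adjustment
print(W, b, round(final_loss, 4))   # W, b and loss after 100 steps
```

The first entry in history is about 1.13 for W and 0.21 for b, in line with the first adjustment shown in the lecture, and the loss after the loop is far below the starting loss. A larger learning rate takes bigger steps per iteration and can overshoot, which is why the lecture suggests experimenting with it.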
9. 9 What is activation: Hey, welcome back. Let's briefly talk about this slide, which we already know. Remember, we have the neurons, which are simply those balls, and we have the connections, which are simply the multiplication by the weights and adding the bias to get to the next layer. In this case it's the hidden layer, as you can see. Now, what I want to talk about are the activation functions. Because so far, all we did was use the input, the x, multiply it by a weight matrix W and add the bias. But there is one more thing we need to keep in mind, which is these graphs here, okay, which is the activation. You can see to the right that you have the inputs, which are the x values. You have the weights, and you have the biases, which are not visible here, but this is the third ingredient we need. And then we put all of this together, so we multiply it, and then it goes to this neuron, and within the neuron itself there's an activation. Okay. And this activation is simply, again, a kind of mathematical function which is applied to the output we get from the inputs, the weights and the bias. And there are different kinds of activation functions. I talk about this in my other, in-depth machine learning course, the neural network course on Udemy. But all I want to show you here is the tanh function, for instance, or the sigmoid function. These would be functions which would be applied to the output which we derived. Okay. And in the next slide you can see the softmax function to the right, and this would also be such a function. But I used this picture because you can see again what we basically do. Okay. We use the different x values, which are the inputs. We multiply them by a weight matrix; depending on how many neurons we have, this is the structure of the weight matrix. We add the bias; to the right, you can see it. And then we put all of this together in one function.
In this case it's the softmax function, and then we calculate the y, which is the output of our model, in this case the output layer at the top. And this output layer we then compare with the true outputs, y. Okay, so that's all about the activation function, which is again a really important topic and is also implemented in the code. Okay, so yeah, that's it for this video. It was brief, but hopefully it helped you, and see you in the next video. Until then, best, guys.
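The three activation functions named in this video can be written down in a few lines. This is a plain-Python sketch using the standard definitions, not the TensorFlow implementations; the input numbers are made up for illustration.

```python
import math


def sigmoid(z):
    """Squashes any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(values):
    """Turns a list of numbers into positive values that sum to 1."""
    largest = max(values)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# One neuron: weight * input + bias, then the activation on top of it.
z = 0.5 * 2.0 + (-0.1)   # illustrative numbers
print(math.tanh(z))      # tanh keeps the result between -1 and 1
print(sigmoid(z))        # sigmoid keeps it between 0 and 1

# An output layer with three neurons: softmax over their three raw outputs.
print(softmax([1.0, 2.0, 3.0]))
```

tanh and sigmoid act on one neuron's value at a time, while softmax looks at all outputs of a layer together, which is why it is typically used on the output layer when the model has to choose between classes.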
10. 10 What is one hot encoding: Hey, guys, welcome back. Now, this is only a short video, but a really important one, because we will use one-hot encoding within the model we build. So the question is, what is one-hot encoding? And here I have a little graphic which, I think, visualizes in a really good way what one-hot encoding is. You can see that we have male, female and not specified, for instance, as three inputs, or outputs if you want. Okay. So if we want to one-hot encode them, this means that we have to transform them into a numerical representation. Instead of male, female and not specified, which is text, we need numbers, and we can do that simply by using one-hot encoding. And you can see this at the bottom of the image: male is represented as 1 0 0. Okay. Female, on the other hand, is represented as 0 1 0, and not specified is represented as 0 0 1. Okay, so basically, instead of using male, we use 1 0 0 for each male which occurs in our data set, instead of female we use 0 1 0, and instead of not specified we use 0 0 1. Okay, that's the whole idea behind one-hot encoding. And this is a technique, or a structure, which is really, really helpful when we want to make predictions. Okay, so that's one-hot encoding. Hopefully you got that. Thanks for watching, and I can't wait to see you in the next video. Until then, best, guys.
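The mapping from the graphic can be sketched in a few lines of plain Python. The helper name one_hot and the sample column are assumptions made for illustration; the three categories and their codes are the ones from the video.

```python
def one_hot(value, categories):
    """Return a list with a 1 at the position of value and 0 everywhere else."""
    if value not in categories:
        raise ValueError("unknown category: " + repr(value))
    return [1 if value == category else 0 for category in categories]


categories = ["male", "female", "not specified"]

# A hypothetical column of text values and its numerical representation.
column = ["male", "female", "not specified", "male"]
encoded = [one_hot(value, categories) for value in column]

for value, code in zip(column, encoded):
    print(value, code)
# male [1, 0, 0]
# female [0, 1, 0]
# not specified [0, 0, 1]
# male [1, 0, 0]
```

The order of the categories list decides which position stands for which value, so the same list has to be used for every row that gets encoded.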
11. 11 Creating the neural network understanding the dataset: Hey, guys, welcome back. Now, before we dive into writing the code, I want to give you a brief overview of what our data looks like, right? So we want to know what kind of data set we are dealing with. And you can see here, this is the home page where you can download the data set for free. Okay. And in this case it's a sonar data set, which compares mines with rocks. So basically, we want to predict whether, at the end, we have a mine or a rock. And let's read through this. Okay, so we have two files, but we will download the combined file, okay, which contains the mines as well as the rocks. So in this case here, it states: the file sonar.mines contains 111 patterns obtained by bouncing sonar signals off a metal cylinder at various angles and under various conditions. The file sonar.rocks contains 97 patterns obtained from rocks under similar conditions. The transmitted sonar signal is a frequency-modulated chirp, rising in frequency. The data set contains signals obtained from a variety of different aspect angles, spanning 90 degrees for the cylinder and 180 degrees for the rock. Each pattern is a set of 60 numbers in the range 0 to 1. Each number represents the energy within a particular frequency band, integrated over a certain period of time. The integration aperture for higher frequencies occurs later in time, since these frequencies are transmitted later during the chirp. The label associated with each record contains the letter R if the object is a rock and M if it is a mine, a metal cylinder. Okay. The numbers in the labels are in increasing order of aspect angle, but they do not encode the angle directly. Okay, well, this sounds quite interesting. Now let's take a look at the data set. So I downloaded it and saved it as a CSV file.
And you should do the same; again, you can get it for free from this website. So let me close this and let's take a look. This is what the data looks like. As the description stated, in this CSV file we have 60 values between zero and one on each line, and each line is an observation. These first values here, from column 1 to column 60, are the features. These are the numerical values which, if you refer back to the model we created (let me go here and take a look at this), are the x, so these first 60 columns will be the inputs. We will create a weight matrix and multiply this weight matrix by those 60 values for each of the observations, so from line 1 to line 208. For all of these lines we will create a weight matrix, multiply it by all 60 columns, and add a bias as well. So remember the bias. And at the end, if I scroll to the right, we have an output here. You can see an R here, and there is also an M, so whether it's a rock or a mine. Currently they are sorted, and this is something we don't want, so we will shuffle our data. This simply means we will change the order of the lines so that it's totally random: the first could be an R, the second could be an M, the third could be an R again. But this is the label, so only the last column here, which says whether it's an R, a rock, or an M, a mine. The label in our training is the y. So the y will be the true value, the label R or M in this case. Currently it's text, so it's a string value, R or M.
So we have to transform this into a numerical value, like one and zero, for instance. One could be R and zero could be M, or it could be the other way around, depending on how we want to encode it, but that's what we need to do. The other values in this simple example are, of course, convenient for us: they are all numbers between zero and one, so there is nothing to encode. We already have numerical values which we can feed into our model. And then we will do the same as in the last video, with some differences, of course. It will look a little different, as we will see when we implement it in code, but we will essentially do the same as we did here. So we will create a weight matrix and a bias matrix, multiply, and create a model. The model gives us an output, and we compare the output to the true value, so whether it's zero or one at the end, because we want to know if it's a rock or a mine. Then we again add all of these losses together to arrive at one loss, create an optimizer, and train our neural network to reduce this loss as much as we can. And then we want to print an accuracy. Accuracy simply means how often the model was right, or how well the model is performing. This can be any value between 0 and 100%. So if the accuracy were 1%, this would simply mean that our model is correct 1% of the time. That's the structure, this is the data set, and here you can download the data set as a CSV file. That's it for the introduction. Now let's get into the coding. I'll see you in the next video. Until then, best, guys.
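The accuracy described above is just the share of predictions that match the true labels. A minimal sketch, using made-up labels and predictions (not results from the lecture's model):

```python
import numpy as np

# Made-up true labels and predictions, only to illustrate the calculation.
y_true = np.array([1, 0, 0, 1, 1, 0, 1, 0])
y_pred = np.array([1, 0, 1, 1, 0, 0, 1, 0])

# A prediction counts as correct when it equals the true label.
correct = y_pred == y_true

# Accuracy is the fraction of correct predictions, a value between 0 and 1.
accuracy = correct.mean()
print(f"accuracy: {accuracy:.0%}")
```

Multiplying the fraction by 100 gives the percentage the lecture talks about.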
12. 12 Start building the network 1: Welcome back, students. Now we're finally there, and we start implementing a neural network from scratch with TensorFlow in Python. We do this together, so please follow along and code along with me. The first thing we are going to do is import the dependencies. I want to plot something, so I will import matplotlib: import matplotlib.pyplot as plt. Whenever we use as, it simply refers to an abbreviation. That's the first thing I want to import. The next thing is TensorFlow, so import tensorflow as tf. Then I want to import NumPy, which is the array library: import numpy as np. I will also import pandas to read my data set, so import pandas as pd as an abbreviation. Then I want to import something from scikit-learn, which is the machine learning library in Python. It's awesome and gives you a lot of options, so you will use it all the time; it's really, really helpful. From sklearn.preprocessing I want to import the label encoder, so import LabelEncoder, because, as we have seen in our data set, we have rocks as R and mines as M, so we need to encode them into ones and zeros. We can't deal with text in TensorFlow; we first need to convert it to numerical values. Then I also want to import shuffle: from sklearn.utils (be careful to spell sklearn correctly) import shuffle, because I want to shuffle my data set. Remember?
Currently our data set is structured in such a way that I have my rocks first and then my mines. But of course I want to shuffle it, so that the rows in my data set are in completely random order: the first could be a mine, the second could be a rock, and so on. That's why we import shuffle. And then from sklearn.model_selection I want to import train_test_split, so train underscore test underscore split. It simply helps us split the data into a training data set, which we will use for training the model, and a test data set, which we will use to evaluate our model and take a look at how well it is performing, so what the accuracy of the model is. All right. I also want to plot something within the Jupyter notebook, and that's why I use this, let's say, magic command, which is the percent sign followed by matplotlib inline. If you are using some other software instead of Jupyter Notebook, for instance PyCharm, then you would not use this line; you would probably get an error. But if you are using Jupyter Notebook like I do and want to plot something within the notebook, then you use this line here. All right. Now let's create our first function, which simply reads the data set. I say def for function, and I call it, for instance, read_dataset, with two parentheses. What I do is save the data in a data frame, so df equals, and I'm using pandas, pd.read_csv. So I'm reading the CSV file which we downloaded, from where it's stored.
The path is a raw string I pasted in here; currently the file is on my desktop. You, of course, need to put in the file path where you stored the file. All right, so I read this and save it in a data frame. Now I split it, because I know that the first 60 columns (remember the last video) are the features, and the final column is the label. That's the y value: is it R or M, or, later on, one or zero? Is it a rock or a mine? So X is equal to, and from my data frame I want to extract, using df.columns, only the columns from zero up to 60, where 60 is excluded. It starts at zero and ends at 59, and since we start with zero, that is exactly 60 columns: 0, 1, 2 and so on up to 59, with 60 excluded for the X values. I also want to convert this into an array, and there is an easy way to do it: simply .values. So this means: from my complete data set, give me only columns zero to 60, excluding the 60th, and convert them into an array. That's what we do here, and we save it in X. Next we do something similar for the y, but we don't convert it into a numerical array yet. We can't, because it's text: the y is R or M currently. So y is equal to, and I again extract something from my data frame using df.columns, and here it is only the last column. In this case I'm referring to index 60, which simply means only the last column. All right, so that's it for X and for y. What I also want to do is use the preprocessing label encoder which I imported here.
So from scikit-learn I will use the label encoder in order to convert this R or M, rock or mine, into a one or a zero; this is what the label encoder is for. So I create it: I say encoder is equal to, and I instantiate it first, so LabelEncoder with two parentheses. Now I have my encoder, and before I can use it I need to fit it. Fitting simply means that I expose my encoder to my data so that it sees the R and the M in this case, and the next step is then transforming the R and M into one or zero. So how does it work? I have my y, where I extracted only the last column, and I have my encoder instantiated. Now I train the encoder; training means fitting. I say encoder.fit and put in the y values, so y. So I'm using my encoder, which I instantiated here as LabelEncoder, and I fit it with the y values, the last column. Then, in order to really get ones and zeros instead of R and M, rocks and mines, I need to transform. So I say y (you could also name it y new if you want, but if you want to follow along with the code, stick with y as I do) is equal to encoder, which is now fitted, so exposed to the data, dot transform, with the y values. With this y I'm referring to the y up here; I transform it with my label encoder and save it in a variable y again. So basically the old y will be overwritten by a new variable which contains the encoded y values. Instead of R and M, it's one and zero now.
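The fit and transform steps can be tried on a handful of labels. This is a small sketch with a made-up label column standing in for the last column of the sonar file; it uses scikit-learn's LabelEncoder, which assigns integer codes to the classes in sorted order.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Small stand-in for the last column of the data set (made-up values).
y = np.array(["R", "R", "M", "R", "M"])

encoder = LabelEncoder()
encoder.fit(y)              # fitting: the encoder learns the distinct classes
y = encoder.transform(y)    # transforming: each class is replaced by its integer code

print(encoder.classes_)     # the classes the encoder has seen, in sorted order
print(y)                    # the encoded labels
```

Because the classes are sorted, M receives the code 0 and R the code 1 here. Which class gets which number does not matter for training, as long as you remember the mapping when you interpret predictions.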
Now I also want to one-hot encode it, and I do this simply by writing y is equal to one_hot_encode. We still have to create this function; I'm using it here already, but I will create it in a moment. So one_hot_encode, and I put in the y. This will be the last transformation step for the y. So basically, we had y as text; we extracted the Rs and Ms, the last column of our data set; we used the label encoder to transform them into ones and zeros, so into numerical values, with transform here; and now we one-hot encode those values so that our network can compare its output with the true y values. That's why we need to prepare them a little. Finally, we simply return something: return, and we return the X and also the y. The X has only been converted into an array, nothing else, because these are already numerical values in our data set. We don't need to do any transformations, which is great in this case and very good for an introductory or beginner's tutorial. So we have transformed the y values, and now we return them. One more thing: in order to use one_hot_encode, we first need to write this function. Currently we don't execute anything, so I could write the one_hot_encode function here, or on top if I want; it doesn't matter. So I say def one_hot_encode. This is my function, and we need to pass something to it, because we use it here and give it the y values. So let's pass something like labels.
You could name this whatever you like. It doesn't have to be labels, but it makes sense, because basically what we're doing is feeding the y, which is the label, into one_hot_encode. And I say labels, plural, because it's actually a column of several values. So now we have a one_hot_encode function here. First we say the number of labels is equal to len of labels, using the Python built-in function len, which gives us the length of the labels. So that's the number of labels. Next we get the unique values of those, which are only two, because we only have rocks and mines, or one and zero. So the number of unique labels is equal to, and here I again use the len function, now around a NumPy function: np.unique, and I put in the labels. That gives me the unique values, of which there are only two, by the way. And now I say one_hot_encode, which here is simply a variable name I'm using, and I create a zero matrix with np.zeros, which is simply a NumPy array of zeros; it doesn't have to be a matrix, it could be any kind of array. For the shape I put in the number of labels and, beside that, the number of unique labels which I computed up here. So I get the length of the labels and the number of unique labels and create a zero matrix with one row per label and one column per unique label. This is the basis for my one-hot encoding. And now, of course, I need to fill it with zeros and ones.
So what I do is take one_hot_encode again. I can do it this way because I want to put something into my zero array. Currently this whole array only contains zeros, and now I need to put a one at the position where it is a rock, or a mine. So within my array, which currently contains only zeros, I index with np.arange of the number of labels, close the parenthesis, then a comma and labels, so at the position given by the labels, and set this equal to one. That's it. All I need to do now is return something, so return, and what do we want to return? The one_hot_encode array, exactly this value which I created up here and then one-hot encoded here. So that's the one-hot encoding function. That's it so far; this is the first part. I will cut the video here and we will continue in the next one, otherwise the videos become too long, which was an issue for me in the past because students don't like it. If you like longer videos, of course, I could do that as well. But for now, let's cut it here. Take a look at the code again and try to understand everything we did. I'll see you in the next video. Until then, best, guys.
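Putting the spoken steps of this video together, the two functions might look roughly like the sketch below. It is a reconstruction from the narration, not the instructor's exact notebook: the local variable names, the n_features parameter and the use of header=None (which assumes the CSV has no header row) are assumptions, and fit_transform is used as a shorthand for the separate fit and transform calls.

```python
import io

import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder


def one_hot_encode(labels):
    """Turn integer class labels (0, 1, ...) into one-hot rows."""
    n_labels = len(labels)
    n_unique_labels = len(np.unique(labels))
    encoded = np.zeros((n_labels, n_unique_labels))
    # Row i gets a 1 in the column given by labels[i].
    encoded[np.arange(n_labels), labels] = 1
    return encoded


def read_dataset(path, n_features=60):
    """Read the CSV and return (features, one-hot labels).

    header=None is an assumption: it treats the first line as data,
    which fits a file without a header row.
    """
    df = pd.read_csv(path, header=None)
    X = df[df.columns[0:n_features]].values   # feature columns 0 .. n_features - 1
    y = df[df.columns[n_features]]            # label column: "R" or "M"
    y = LabelEncoder().fit_transform(y)       # "M" -> 0, "R" -> 1
    return X, one_hot_encode(y)


# Tiny in-memory file with 3 features per row, only to show the call.
demo_csv = io.StringIO("0.1,0.2,0.3,R\n0.4,0.5,0.6,M\n0.7,0.8,0.9,R\n")
X, Y = read_dataset(demo_csv, n_features=3)
print(X.shape, Y.shape)
print(Y)
```

For the real file you would pass the path to your downloaded CSV and keep n_features at 60.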
13. 13 Start building the network 2 shuffle train test and datashapes: Hey, students, welcome back. So far we have written a few functions: reading the data set, as well as the one-hot encoding, which we use after the label encoding to create one-hot encoded labels for our y values. We need those in order to make the comparison and then really train our neural network. The next thing, after these functions, is to actually read the data. So I say X comma y, and now I can apply my first function, which is read underscore dataset, with two parentheses. I save the two values which are returned here, my X and my y values. So after the return, I save them in X and y. The next thing is that I want to shuffle my data set. Remember, currently the data is ordered in a way we don't want: we don't want to see all the rocks first and then all the mines; we want to shuffle them completely. That's why I imported the shuffle function here. So I say the new X and y values are the result of the shuffle function applied to the X and y we read here. That's why we imported it from the sklearn.utils package. So shuffle, and I say I want to shuffle X and I want to shuffle y, the ones I read here. And let's set a random state. A random state is often used if you want to trace mistakes or errors that might occur in the code, which happens often, of course, because you might easily forget a comma, or change a shape in a way that TensorFlow doesn't understand, or run into other kinds of issues.
So a random state is always a good way to work, because then the, let's say, random numbers will always be generated in the same sequence, and it's easier to find mistakes and errors in your code. That's all the random state does. OK, so I have shuffled data. Now I want to split my data into a training set and a testing set, so I use the train_test_split function, which we imported from scikit-learn as well. We want to split X and y, so I say train_x, test_x, then train underscore y as well as test underscore y, is equal to train underscore test underscore split, so train_test_split. I say I want to split my X and my y, which I shuffled, so I'm now referring to those two after the shuffling. And now I can define the test size: test_size is equal to 0.2, which simply means that from 100% of my data, so from my 208 rows of data in this case, I want to have 20% as my test set, and the other 80% will be the training set. I also define a random state again, so random underscore state, and here we can pick any number, for instance this one. OK, I've got this as well. Now let's take a look at the shapes of our values, so train_x, train_y and test_x. You simply print them: print train_x.shape, print train_y.shape, and print test_x.shape. OK, let's run this and see the output. We don't get any errors, which is good.
So obviously we did not make any mistakes in the code, and you can see what it looks like. The train_x shape is 165 rows and 60 columns, which is correct, because, remember, columns zero to 59 are the features. And we only have 165 rows instead of 208 simply because this is 80%: we said the training set should be 80% and the test set 20%, and 0.8 times 208 is about 165. The same is true for the train_y shape: we get 165 and we get 2. So why do we get two columns? Simply because we used the one-hot encoding. Instead of having only a one or a zero for a rock or a mine, we have, for instance, one comma zero for a rock and zero comma one for a mine. That's why we have two columns here. And the 42, 60 is the other part, the test X: 42 rows and 60 columns, which is 20% of the data set. So far, so good. The next thing is defining the hyperparameters and then creating the model structure. This is something we cover in the next part. See you there.
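The shuffle and split steps can be reproduced without the CSV file. The sketch below uses random stand-in data with 208 rows, 60 feature columns and two one-hot label columns; the random_state values are arbitrary picks, not the ones typed in the video. The exact training row count depends on how many rows were actually read from the file, so it may differ by one from the number shown in the lecture.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.utils import shuffle

# Synthetic stand-in data with the shapes described in the lecture.
rng = np.random.default_rng(0)
X = rng.random((208, 60))
classes = rng.integers(0, 2, size=208)
Y = np.eye(2)[classes]          # one-hot rows: [1, 0] or [0, 1]

# shuffle keeps the rows of X and Y aligned; random_state makes it repeatable.
X, Y = shuffle(X, Y, random_state=1)

# Hold out 20% of the rows for testing; the rest is used for training.
train_x, test_x, train_y, test_y = train_test_split(
    X, Y, test_size=0.2, random_state=415
)

print(train_x.shape, train_y.shape)
print(test_x.shape, test_y.shape)
```

Running the cell twice gives the same split, which is the practical benefit of fixing random_state while debugging.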
14. 14 Start building the network 3 hyperparameters: Welcome back. As I said in the last video, we are now beginning to define the model structure. So we are at the point where we want to create the structure; use pen and paper if you want, and define exactly what the model should look like. There is something called hyperparameters. Hyperparameters are things which the model does not learn during training. So they are not like the weights and the biases, which are adjusted by our optimizer; hyperparameters are fixed. We have to set them at the beginning, so it's our duty to set them and give the model its structure. You can play around with this, of course, because this is something the model is not learning; it's something we need to find out, so we need to try different settings. The first hyperparameter will be the learning rate. Remember, we used this before. So learning_rate is equal to, and in this case, because we have more data, I'm picking a higher learning rate. Remember, the last time we used 0.01, and this is a learning rate which is often used. But as I said, these are hyperparameters, which simply means you need to play around with them. Let me comment this, by the way; you can comment everything with the hash symbol, so I could say define our hyperparameters, for instance. And the learning rate should now be 0.3. So it's a much bigger value than the one we used before, because we want to train faster. But be aware that you should not use too high a learning rate.
Normally the way it works is that you start with a higher learning rate and then you adjust it, so the learning rate gets smaller and smaller. If you keep the learning rate too big, you might overshoot, which simply means that you never reach the optimum: instead of stopping at the optimum, the model goes further, and then it gets worse again. That's the reason you set the learning rate with care. The next thing we set is the training epochs, so training underscore epochs. Epochs simply means how often we want to feed our data into the model, train, and update our weights. Here, of course, you can play around with different values as well. I could say training epochs is one thousand, but then you train the model 1000 times. That simply means the model has more time to optimize the weights than with only 100, but with 100, of course, the model trains faster. So again, you can play around with this. I might do so when we have finished the model structure, but for now I set it to 1000. So these are the training epochs. The next thing is a cost history, which we will use again later on. The cost history is equal to an empty NumPy array, so np.empty. The shape is currently equal to one, and the data type, so dtype, is equal to float. Don't worry too much about this; it's the cost history, and currently it's only an empty NumPy array. Then the number of dimensions, n_dim, is equal to X.shape, and from the shape the entry at index one.
Let's print that as well: I say print, with the text number of dimensions, and then the n_dim which we defined up here. So this is the number of dimensions. And the number of classes is equal to two. Why? Because we have two outputs: we have a rock and we have a mine, so we have two classes. All right, these are the hyperparameters. Now we define the model structure itself. We say the number of hidden neurons in the first layer will be equal to 60. This refers to the dots we had in our model diagram: we want to have 60 neurons in our first hidden layer. And since we want to create a deep neural network here, we have several hidden layers, and we want to create four of them. So I say the number of hidden neurons in hidden layer two is equal to 60, the number of neurons in hidden layer three will also be 60, and the number of neurons in the last hidden layer, hidden layer four, will again be 60. You can name these A, B, C, or 1, 2, 3, 4, or whatever you like. Later on, and I encourage you to do that, you can play around with this and increase the number of neurons, depending on how much time you have to train the model. This is a rather easy data set, so you could also go much higher with the neurons, and training should still not take too long, because the data set is really small: we only have about 200 rows and 60 columns. So this is what we define, and these are also hyperparameters.
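Collected in one place, the hyperparameters named so far could be written as in the sketch below. The values are the ones mentioned in the lecture; the variable names for the class count and the hidden layer sizes are assumptions, and X is a zero-filled stand-in for the feature array returned by the reading function.

```python
import numpy as np

# Stand-in feature matrix; in the lecture X comes from reading the CSV file.
X = np.zeros((208, 60))

# Hyperparameters: chosen by us, not learned by the optimizer.
learning_rate = 0.3
training_epochs = 1000
cost_history = np.empty(shape=[1], dtype=float)  # array reserved for later loss values

n_dim = X.shape[1]   # number of input features, i.e. columns of X
n_class = 2          # two outputs: rock or mine
print("n_dim", n_dim)

# Number of neurons in each of the four hidden layers.
n_hidden_1 = 60
n_hidden_2 = 60
n_hidden_3 = 60
n_hidden_4 = 60
```

Keeping these values together at the top of the notebook makes it easy to experiment with different settings later.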
So 60 neurons in the first, second, third and fourth layer are also hyperparameters. This is something we can play around with and change, but the model does not change it automatically for us; that's what we need to keep in mind. The next thing is something we are already familiar with: we have to create some placeholders and variables. So x is equal to tf.placeholder, and the data type is tf.float32. I also specify the dimensions, which I put into brackets: None and n_dim. What does this give us? Remember, here we defined the number of dimensions as X.shape at index one. Why is that? For our X values, so for our features, we have 60 columns, so this shape at index one refers to 60. Basically, this tells the placeholder: we promise that we will feed you data, but the number of rows is currently open. We don't say we want to feed in exactly 208 rows of data right away. We say None, and None gives us the option to use any number of rows later on. What we do specify is the number of dimensions, which simply means: we will feed data into you, and all the data needs to have is 60 columns. How many rows is up to us, and this is something we can adjust. That's why we use None. The next thing is a variable: W is tf.Variable, and here I say tf.zeros. So I instantiate our weight matrix with zeros instead of something else. We also need to give it dimensions: what does our weight matrix need to look like in order to be multiplied with this 60-column x value?
The shape starts with the number of dimensions. This is pure matrix multiplication: if the second dimension of x is the number of dimensions, 60 in this case, and we multiply by this weight matrix W, then the first dimension of W must also be 60. The columns of the first matrix must match the rows of the second; otherwise we cannot do a matrix multiplication. This is simply basic math, and you can look it up on the Internet if you want. The second dimension of W is the number of classes. So that's what my weight matrix should look like. I've got my x, I've got my W, and now I want my biases. So b, for my biases, is again a variable, tf.Variable, and here as well I use tf.zeros and give it only the number of classes. So I've got my biases and my weights. The reason why I give the bias only the number of classes, by the way, is that we multiply x by W, and the output of this has the shape None by number of classes. All I want to do is add one bias to each of those number-of-classes dimensions. OK, so I've got my bias as well. One more thing I need is a placeholder for my y. I call it y underscore and say it is equal to tf.placeholder, which is again open; I just make a promise that I will give it some data.
And I say tf.float32, so this is the type of data I will give it, and concerning the shape, I again use None, so the number of rows is currently open, and the second entry is the number of classes, because I want an output with two columns: either 0 1 or 1 0, which then refers to a rock or a mine. OK, now I have defined these variables and placeholders, and the next thing is to define, within a function, the multilayer perceptron, so what it looks like. This is what we do in the next video. Until then, best, guys.
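The shape rules described in this video can be checked with plain NumPy before touching TensorFlow. The sketch below is a stand-in, not the lecture's TensorFlow code: the arrays W and b play the role of the two variables, and the freely chosen row count plays the role of None in the placeholder shape. The function name linear is made up for this illustration.

```python
import numpy as np

n_dim, n_class = 60, 2

# NumPy stand-ins for the TensorFlow variables described in the lecture.
W = np.zeros((n_dim, n_class))   # weight matrix: 60 rows, 2 columns
b = np.zeros(n_class)            # one bias per class


def linear(x):
    """Compute x @ W + b for a batch with any number of rows."""
    return x @ W + b


# The number of rows is free; only the 60 columns are fixed.
for n_rows in (1, 42, 165):
    batch = np.ones((n_rows, n_dim))
    out = linear(batch)
    print(batch.shape, "->", out.shape)
```

Whatever the number of rows, the result has one row per input row and one column per class, which is why the bias only needs as many entries as there are classes.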
15. 15 Start building the network 4 defining the structure: Welcome back to the next part. As we said before, we are now creating the function for our neural network. So I say def, the function keyword, and then the name of my network, and I will name it multilayer_perceptron. Why? Because multilayer perceptron is what a basic neural network is called. The basic structure of a neural network is often referred to as a multilayer perceptron, so now you've heard the term and know at least what it is. Of course, you could name this function whatever you like. Why multilayer? Because we have several layers, right? We have a deep neural network here. And perceptron: if you are into neuroscience and biology and you know something about the human brain, then you may also know the word perceptron. I'm no expert here, but of course you can google it. All right, what do we feed into this? Our neural network needs the following: it needs the x, it needs the weights, and it also needs the biases. That's what we want to feed into the network. You can name these parameters whatever you like, but later on we will feed in our x value, our W value and our b, so they refer to those up here. And don't be confused: what you see here is still the output we printed for the shapes. Okay, so now let's actually start coding the structure of the multilayer perceptron. We start with the first layer. Let's say layer_1 is equal to tf.add, so we will add, and inside it tf.matmul, so we first perform a matrix multiplication. We put in the x, and we also put in the weights. I'm referring to weights, and we will define them.
Just give me a few lines of code and we'll define them. I'm referring to the weights h1; I call this h1 to tell it apart from the other weights. Then I use the biases, and I'm referring to biases b1. So that's it, and then I close the parentheses. Let's see, I made a mistake here: I want to close the matmul here and get rid of this one here. So that's my layer_1. Let me also put in directly the activation function we talked about. layer_1 is equal to tf.nn.sigmoid, so I'm using a sigmoid activation function here, and I put in my layer_1. That's it. So what do I do here? I have my layer_1, and within this variable I do the same as before. If I scroll up: all this is a matrix multiplication of x and the weights W for the first layer. After this matrix multiplication we simply add the bias with tf.add. Instead of tf.add you could also use tf.matmul of x and the weights and then simply write plus biases. So instead of tf.add, get rid of the comma here and simply write plus biases; then of course you could also get rid of this. Both are possible. We save this in layer_1. After that, we take layer_1, put it into an activation function, in this case a sigmoid, and store it again in layer_1. This is the output of the first layer. If I show my presentation again: we have the input and we multiply it by the weight matrix, so this is the input and this is the weight matrix, and we add the biases with tf.add. We put that into an activation function, in this case the sigmoid, not the softmax, and we arrive at the output of the first hidden layer. So after that we have the output of the first hidden layer.
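What one hidden layer computes, sigmoid(x · W + b), can be checked by hand with a small plain-Python sketch. The helper names matmul, add_bias and sigmoid and the tiny numbers are assumptions made up for this illustration; in the lecture the same steps are done with tf.matmul, tf.add (or a plus sign) and tf.nn.sigmoid.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def add_bias(m, bias):
    """Add one bias value to each column of every row."""
    return [[v + bias[j] for j, v in enumerate(row)] for row in m]


def sigmoid(m):
    """Apply the sigmoid function element by element."""
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in m]


# Made-up example: one sample with two features, three hidden neurons.
x = [[1.0, 2.0]]
w = [[0.5, -0.5, 0.0],
     [0.25, 0.25, 0.0]]
b = [0.0, 0.0, 0.0]

layer_1 = add_bias(matmul(x, w), b)  # x times W, plus b: [[1.0, 0.0, 0.0]]
layer_1 = sigmoid(layer_1)           # squashed into the range (0, 1)
print(layer_1)
```

The second and third hidden neurons receive 0 before the activation, so the sigmoid returns exactly 0.5 for them, while the first one receives 1.0 and ends up above 0.5.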
That's what we do here, and we will do something quite similar for the other layers. So I use layer_2. My second layer is equal to tf.add, and this is really the same code. I'm again first doing the matrix multiplication with matmul, but in this case not with the x value anymore, because I'm multiplying the output which I get from layer one. So I'm using layer_1 and I multiply it by the weights, in this case of course the weights of hidden layer two, so h2 here. Then I first close this parenthesis, and again I refer to the biases, of course the biases of hidden layer two, b2. Then I do exactly the same: layer_2 is equal to tf.nn.sigmoid, and I'm referring to layer_2. That's it for layer two. What are we doing next? Well, we have a layer three, so layer_3 is equal to tf.add and tf.matmul. It's not an endless loop, but it always works the same way. We again use a matrix multiplication, and now I'm referring to layer_2 here, with the weights of my new layer, so the weights of layer three, h3. Besides that, I add the biases of layer three, b3, and then I close this. I forgot a parenthesis here, so be aware of that, otherwise the code will throw an error. All right, so I've got my layer three. And then again, layer_3 is equal to the new variable.
layer_3 will be equal to tf.nn.sigmoid, so I'm referring to the sigmoid and using layer_3. So basically I take the output of layer two, do the matrix multiplication with the weights of layer three, add the biases of layer three and save it in layer_3. Then I put this value, layer_3, into an activation function, the sigmoid function, and save it again in the variable named layer_3. All right, so that's my layer three, and then I've got layer four. layer_4 is equal to tf.add, and inside tf.matmul, so again a matrix multiplication for the fourth layer. I'm now referring to layer_3, and I'm putting in the weights of my fourth layer, h4. After the closing parenthesis I add the biases of my fourth layer, b4. Now I want to put this into an activation function, but this is my last hidden layer. So in this case I say layer_4 is equal to tf.nn, and now I will use exactly the function you've seen here. Actually, no, I will use a new one. That one is a softmax, sorry for that. I will use a ReLU. ReLU is an activation function which is the most common one nowadays, and in this case it is applied on the last step within our hidden layers. And we put in, of course, layer_4. So I have this stored in a variable also called layer_4. The last thing, after all four layers and our four multiplications, is the output layer. If I take a look at this, of course we have an output layer here, right? So far we have covered the four hidden layers. Here you see only one, but of course we would have the second one, the third one and the fourth one, and then the output layer. The output layer has to be defined here as well. So I say out_layer, and my output layer is equal to tf.matmul.
So matmul, and I could use tf.add again, but just to show you what I've told you before, I can also use a plus sign. So I say tf.matmul, and I want to use layer_4, which is my output after putting it into an activation function. I want to use the weights, and I'm referring to the weights of my output layer. Then I simply use a plus sign and the biases, and I'm referring to the biases of my output layer here. That's it. It's the same as using tf.add: write tf.add, put this in parentheses and get rid of the plus sign, and it will be the same. So let me get rid of this. It's up to you how you want to write it, with tf.add or with a simple plus sign. So: matmul of layer_4 and the weights out, and we add the biases. Finally, we return the output layer. Remember, we have a function here and we want to return something, so we return out_layer, like that. That's it for this. Let's make some space here. Now, as you've seen here, I'm referring to weights h1 and biases b1, but I haven't defined these so far. I have a multilayer_perceptron with weights and biases, and I want to give those weights and biases to the function, but I have not defined h1 and b1. This is what comes next. It's something I could also have done before, but I don't have to, because here we only write the function itself; we don't execute the code. So, my weights. My weights are nothing more than a simple dictionary. Within this dictionary we have keys and values, and the key, of course, is what we call it up here. So the key must have the same name.
Here, in this case, h1 is the key. So I define a dictionary with h1, and this key has the following value, which is a variable. Why do we refer to a variable, tf.Variable? Simply because, remember, variables are adjustable, and we want to adjust our weights and our biases. That's why we create variables here. So tf.Variable, and I want to give it a tf.truncated_normal; it's called truncated underscore normal. This simply means: give me random values from a normal distribution, but from a truncated normal distribution. This is how I want to initialize my variables. It should only give me random values. I don't want to define a starting value for each of my weights; I simply want to use some random distribution to generate random starting values. In this case I'm drawing those random numbers from a truncated normal distribution. That's the idea behind it. Then I also need to give it a shape, and the shape is, in this case, first the number of dimensions. Remember, for the weights we have this matrix here, and the number of dimensions is the first input. So here, too, the number of dimensions is the first dimension, the rows if you want, and the number of hidden neurons, in this case of layer one, gives the columns, because, as you've seen, we have 60 neurons. I'm referring to the 60 here. So my h1 is a tf.Variable drawn from a truncated normal distribution with the dimensions n_dim and n_hidden_1. That's it. Then we use a comma, and instead of typing this the whole time, I want to save a little bit of your valuable time, so let me simply paste this four times.
Of course, we need to change something here, because we have a hidden h2, we have h3 and we have h4. We're referring to those up here. By the way, we also have an output, right? So we need to put this in here as well, referring to out here. These are all variables, and we draw all of them randomly from a truncated normal distribution. Of course, we need to adjust the shapes as well, because number of dimensions by number of hidden neurons is only correct for the first layer. For the second layer, the rows must match the columns of the first layer's output. So I'm referring to n_hidden_1 here, and then I simply type in the number of hidden neurons in my layer two, n_hidden_2. Just be careful with the spelling. So number of hidden neurons in layer one, number of hidden neurons in layer two. Next, for my hidden layer three, I need a weight matrix with the shape n_hidden_2 by n_hidden_3. In my fourth, I have the number of hidden neurons in my third layer and the number of hidden neurons in my fourth layer. And the last one, of course, has the number of hidden neurons in my fourth layer and, in this case, the number of classes, n_class. That's what the weight matrices should look like, and that's what they have to look like; otherwise you would receive an error later on that the shapes do not fit. And let's get rid of this comma here; you should be aware of that. So now we've got the weight matrices, but of course we have to do the same for the biases here, because we also use biases. So I simply create another dictionary, biases, in this case another NumPy dictionary. Not NumPy, a Python dictionary. Sorry.
And here I simply say b1. So b1 will be the first key, which I'm referring to up here, right? Here is a b1, so I need to define it here. My biases entry b1 will again be a tf.Variable, and I'm referring to tf.truncated_normal, truncated underscore normal. So again I want to draw random numbers from a truncated normal distribution. And the dimensions: remember, the dimensions for the biases simply must match the columns of our hidden layers, so the output here. So I can simply say n_hidden_1 here. Then I put in a comma, and let's be lazy so that we don't have to write this code over and over: I simply go in here and press Ctrl+V, Ctrl+V, and Ctrl+V again. All right, we've got those here, and of course we need to adjust them, right? This is the second layer, this is the third layer, so it must match the third layer, this one must match the fourth layer, and this one, finally, must match the number of classes. Each one must match the columns of the corresponding weight matrix. And of course this is not always b1: this is b2, this is b3, this is b4, and this is out. So exactly the same names you used up here. All right, that's it for defining this. Now we are going to start with initializing the variables as well as creating our neural network. That's it for this video, because otherwise it gets too long, and we will continue in the next lecture. Until then, bye, guys.
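To see how the function and the two dictionaries fit together, here is a minimal plain-Python sketch of the whole forward pass. It is not the TensorFlow code from the video: the helpers truncated_normal and dense are assumptions written for this illustration, the layer sizes are made small on purpose, and the truncation rule (redraw anything further than two standard deviations from the mean) mirrors what tf.truncated_normal does.

```python
import math
import random


def truncated_normal(rows, cols, rng):
    """Standard-normal draws, redrawn when further than 2 std devs from 0."""
    def draw():
        v = rng.gauss(0.0, 1.0)
        while abs(v) > 2.0:
            v = rng.gauss(0.0, 1.0)
        return v
    return [[draw() for _ in range(cols)] for _ in range(rows)]


def dense(inputs, w, b, activation=None):
    """Compute inputs @ w + b row by row, then apply an optional activation."""
    out = []
    for row in inputs:
        z = [sum(row[k] * w[k][j] for k in range(len(w))) + b[j]
             for j in range(len(b))]
        out.append([activation(v) for v in z] if activation else z)
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def relu(v):
    return max(0.0, v)


def multilayer_perceptron(x, weights, biases):
    layer_1 = dense(x, weights["h1"], biases["b1"], sigmoid)
    layer_2 = dense(layer_1, weights["h2"], biases["b2"], sigmoid)
    layer_3 = dense(layer_2, weights["h3"], biases["b3"], sigmoid)
    layer_4 = dense(layer_3, weights["h4"], biases["b4"], relu)
    return dense(layer_4, weights["out"], biases["out"])  # no activation


rng = random.Random(0)
# Small made-up sizes; the lecture uses 60 input dimensions.
n_dim, n_hidden_1, n_hidden_2, n_hidden_3, n_hidden_4, n_class = 4, 5, 5, 5, 5, 2

# Rows of each matrix match the columns of the previous layer's output.
weights = {
    "h1": truncated_normal(n_dim, n_hidden_1, rng),
    "h2": truncated_normal(n_hidden_1, n_hidden_2, rng),
    "h3": truncated_normal(n_hidden_2, n_hidden_3, rng),
    "h4": truncated_normal(n_hidden_3, n_hidden_4, rng),
    "out": truncated_normal(n_hidden_4, n_class, rng),
}
# Each bias vector matches the columns of the corresponding weight matrix.
biases = {
    "b1": truncated_normal(1, n_hidden_1, rng)[0],
    "b2": truncated_normal(1, n_hidden_2, rng)[0],
    "b3": truncated_normal(1, n_hidden_3, rng)[0],
    "b4": truncated_normal(1, n_hidden_4, rng)[0],
    "out": truncated_normal(1, n_class, rng)[0],
}

x = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8], [0.9, 0.0, 0.1, 0.2]]
y = multilayer_perceptron(x, weights, biases)
print(len(y), len(y[0]))  # 3 2
```

Three input rows go in and three rows with two columns come out, one column per class, which is the None by n_class shape discussed earlier.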
16. 16 start building the network 5 6: Hello and welcome back, students. Now let's dive into the nitty-gritty, which is initializing the variables. In this case I simply refer to init, for initializing, and I say it is equal to tf.global_variables_initializer. Be aware of the s in variables; I sometimes forget it. And then the parentheses here. Now we are initializing the variables. So far we don't do anything, because we didn't run a session, right? But here we want to initialize our variables. Next I say y is equal to; y refers to the output of our model here. Now I use the function which we created up here, the multilayer perceptron. So I say y is equal to multilayer_perceptron, and then I need to give it the x and the weights as well as the biases. So I say x, comma, weights, and biases. By the way, let's put in spaces here; that's cleaner. So I'm now using the multilayer_perceptron, the function we created here, and I'm feeding in my x value, my weights, which is this dictionary, and the biases, which is this dictionary. Then it can take the truncated normal variables and put them in here, into the multiplication, put that into the sigmoid, and go on to the next layer and the next layer until the output layer. Then this function returns an output, and that output layer, which is returned, is stored in y. This y is the prediction of our model, which we then want to compare to the true output, in this case y_. So y_ is the true output, which is currently a placeholder, and y is the output of our model. We will compare those, check how big the loss is, and then train the model to reduce the loss.
All right, that's it for that. Now let's define a cost function. Let's say cost_function is equal to tf.reduce_mean. You've seen something like that before; that was reduce_sum, I think. In this case we use reduce_mean, but it works in a similar way. I put in here a kind of loss function, which in this case is tf.nn, nn for neural network, dot softmax_cross_entropy_with_logits, so there is a softmax in there. It's quite a huge term, I know, and it goes much deeper into the math behind it. But basically we put two values in here: the logits as well as the labels. The logits are the output which we get from our model, which is y, so I say logits is equal to y, and labels is equal to y_. Basically, all we do here is compare the labels, y_, to the logits, y, which is the output of our model, so what the model gives us as a prediction for the true values, the labels. Then we reduce this to the mean: over all the rows in our data set, give us the mean of this difference, measured with the softmax cross entropy with logits, and save it in cost_function. This gives us one single value for the cost. That's what we do with reduce_mean. It works like reduce_sum, but in this case we use the mean instead of the sum. And we need softmax_cross_entropy_with_logits because this is the function that derives the cost between the true y_ and the prediction of our model, the logits, which is y. All right, so that's the cost.
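What the cost function above computes can be written out in plain Python. This is a sketch of the math only, under the assumption that labels are one-hot rows; the function names are made up for this illustration and are not TensorFlow's. For each row, the logits go through a softmax, the cross entropy against the label is taken, and the per-row losses are then averaged into one number.

```python
import math


def softmax_cross_entropy(logits, labels):
    """Cross entropy between one-hot labels and softmax(logits) for one row."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_total = math.log(sum(math.exp(v - m) for v in logits))
    log_probs = [(v - m) - log_total for v in logits]
    return -sum(t * lp for t, lp in zip(labels, log_probs))


def mean_cost(batch_logits, batch_labels):
    """Average the per-row losses into a single value, like a reduce-mean."""
    losses = [softmax_cross_entropy(lg, lb)
              for lg, lb in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)


# Made-up model outputs (logits) and one-hot labels for two samples.
logits = [[2.0, 0.0], [0.0, 0.0]]
labels = [[1.0, 0.0], [0.0, 1.0]]

cost = mean_cost(logits, labels)
print(cost)
```

The first sample is predicted fairly confidently and correctly, so its loss is small; the second sample gets equal logits for both classes, so its loss is log 2. The cost is the mean of the two.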
And then we do the training. I could say training, or training_step, because it's a step-by-step operation, and this is equal to tf.train, since we want to train, and we again use the GradientDescentOptimizer, which we have seen before. We can give it a learning rate, and that's what we do: learning_rate. Last time we typed in a value directly; here, with this learning_rate, I'm referring to something we specified up here in the hyperparameter section. So if you want to adjust it, simply change the value there, for example to 0.3, and your model will adjust. So I'm training with a gradient descent optimizer with this learning rate, and now I want to minimize the cost, right? So dot minimize, using the cost function here, cost_function. That's what I want to minimize when I'm doing the training. Now I want to start the session, so I say sess is equal to tf dot, and instead of Session I'm using the InteractiveSession, because, at least for me, I sometimes get some weird output or some errors if I use standard sessions. I think that has something to do with the Jupyter notebook, because so far I haven't experienced it with PyCharm. You might try both tf.Session and tf.InteractiveSession, but if you're using a Jupyter notebook like I do, I would recommend that you also instantiate the InteractiveSession. Now we start by running the initialization. So I say sess.run, and I'm referring to init. Here I'm only initializing my global variables. That's what I do here.
That's the first thing. The next thing is that I create an MSE history, MSE being the mean squared error; that's why I call it mse_history. So mse_history is equal to an empty list so far. And I say my accuracy_history is also equal to an empty list. So those two things go in here. Now I need to go through the training steps. We did this last time with an iteration, and that's what we do here as well. So I say for epoch; you can name this whatever you like, a, b or c for instance, but x would not be a good option because we already use x. So for epoch in range, and what is the range? The range is training_epochs. Here I'm referring again to a hyperparameter: training_epochs is nothing more than what we defined in our hyperparameter section, which is this one here, currently 1000. So I'm going through this loop 1000 times in this case, and each time I want to run a session, so sess.run. I say I want to run my optimizer, so training_step, and I need to give it a dictionary, the feed_dict. So far we did not spell out this name, but it is cleaner code to use it, because that is what it's named: the feed dictionary. It says what kind of data we feed into this optimizer. The optimizer, again, wants to minimize the cost. The cost function itself refers to the output of our model, and our model itself is using all these variables, right? And it is using not only those variables here in this code but also, of course, the x, right?
And the x is a placeholder, right? So what do we need? We need to give it a dictionary that tells it what to put in for the x value as well as the y value. So I say x, and for my x I want to feed in my train_x, which we created with the train/test split. Remember, if we go up here, there is train_x, test_x, train_y and test_y; we used the split to create those four output variables. In this case I want to train the model, so I'm referring to my training x: x is train_x, and my y_ in this case is train_y. Close this. All right. Next, cost is equal to sess.run, and now the cost function, so I want to run the cost function here as well. So I say cost is equal to sess.run of cost_function, and here I also give it a feed dictionary, feed_dict. What kind of data do I need to put into my placeholders? Here I'm referring to x, and for x it's train_x, and y_ is equal to train_y. Let's put in a space here as well. All right, now we've got the cost as well. Now the cost_history: remember, this is an empty list which we created up here. Maybe you want to take a look at it later on and print it, so that's why I'm using a list here, and we will append to this list. So I simply say cost_history is equal to np.append, so I'm using my NumPy again. I want to append to my cost_history, and what do I want to append to this empty list? The cost. So I'm referring to cost. Of course, you could also write, for instance, simply cost_history.append if you want, but then you would have to define cost_history as a list first. Okay, so I've got my cost history.
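The pattern of the training loop (take one gradient descent step, evaluate the cost, append it to a history list) can be tried out on a toy problem in plain Python. This is not the network from the lecture: the one-parameter cost function, its gradient and the values 0.3 and 20 are assumptions chosen so that the example is easy to follow. In the TensorFlow code the optimizer derives the gradients itself.

```python
# Hyperparameters, kept in one place as in the lecture's hyperparameter section.
learning_rate = 0.3
training_epochs = 20


def cost_function(w):
    """Toy cost with its minimum at w = 3."""
    return (w - 3.0) ** 2


def gradient(w):
    """Derivative of the toy cost with respect to w."""
    return 2.0 * (w - 3.0)


w = 0.0            # starting value of the single adjustable variable
cost_history = []  # one entry per epoch, for printing or plotting later

for epoch in range(training_epochs):
    w = w - learning_rate * gradient(w)  # one gradient descent step
    cost = cost_function(w)              # evaluate the cost after the step
    cost_history.append(cost)            # plain list append, no NumPy needed

print("final w:", w)
print("first and last cost:", cost_history[0], cost_history[-1])
```

With each epoch w moves closer to 3 and the recorded cost shrinks, which is the same behavior we hope to see in the cost history of the real model.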
Now I can take a look at the correct predictions. In order to calculate the accuracy, I simply state correct_prediction is equal to tf.equal, and inside that tf.argmax of y, comma 1, and I want to compare this with tf.argmax of y_, which is the true output, sorry, the true label, comma 1. Basically, what I'm doing here is the following. tf.equal answers with True or False whether the arg maximum of the output of our model is the same as the arg maximum of the label. Remember, we one-hot encoded the labels. So what this tells us is: is the model predicting the right class? Why the arg maximum? Because with the one-hot encoding our data set has zeros and ones for the true values, but our model will give us, for instance, values like 0.8 and 0.2. That's why we use argmax, which simply gives us the position of the maximum value of the prediction. Then we compare whether this position is the same in the output of the model and in the one-hot encoded label. If that's the case, tf.equal returns True; otherwise it returns False. That's the idea behind it; that's how it works. Now the accuracy itself. accuracy is equal to tf.reduce_sum, sorry, reduce_mean. And here I say tf.cast, and I pass in correct_prediction, referring to what I did before, and cast this into tf.float32. So, as I said, correct_prediction compares the arg maximums using tf.equal, and it returns Trues and Falses.
But these Trues and Falses don't help us, because what we need are ones and zeros. We need numeric values, and that's why we cast them. With tf.cast we cast the Trues and Falses which we get in correct_prediction into float32, so into numeric values, and then we reduce to the mean. Taken over all our data, the complete data set, this gives us one value for the accuracy. That's what we want: for instance, 90% means that our model is correct in 90% of the predictions. We want a percentage figure, one value. That's why we use reduce_mean, and we cast because we have Trues and Falses here, to get ones and zeros and then reduce to the mean and get the accuracy score. That's what we do here. The next thing is that we make a prediction. So let's say, in this case, pred_y is equal to, and I'm running a session, sess.run. I put in the y here and my feed dictionary, so feed_dict is equal to, and I say I want to feed in the x here, and the x will be test_x. So this will be the prediction. So far we have trained the model on the training data; we put in these values for the placeholders. Now we want to make a prediction on unseen data. So we run y again, and y is the multilayer_perceptron we have up here, with the weights and the biases, but in this case, for the x value, we want to use our test set.
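The accuracy calculation described above (arg maximum, comparison, cast to numbers, mean) can be sketched in plain Python. The function names and the four sample rows are made up for this illustration; the steps correspond to tf.argmax, tf.equal, tf.cast and tf.reduce_mean in the lecture.

```python
def argmax(row):
    """Position of the largest value in a row."""
    return max(range(len(row)), key=lambda i: row[i])


def accuracy(predictions, labels):
    # True where the predicted class matches the one-hot label.
    correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
    # Cast the Trues and Falses to ones and zeros.
    as_numbers = [1.0 if c else 0.0 for c in correct]
    # The mean of the ones and zeros is the share of correct predictions.
    return sum(as_numbers) / len(as_numbers)


# Made-up model outputs and one-hot labels for four samples.
predictions = [[0.8, 0.2], [0.3, 0.7], [0.6, 0.4], [0.1, 0.9]]
labels = [[1, 0], [0, 1], [0, 1], [0, 1]]

acc = accuracy(predictions, labels)
print(acc)  # 0.75
```

Three of the four rows have their maximum in the same position as the label, so the mean of the ones and zeros is 0.75, which reads as 75% accuracy.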
So now we know what our model is outputting as a prediction on new, unseen data, because the test data has not been used to train the model. So this is the predicted output, and now we can compare it and take a look at the error. So I say mse, the mean squared error, is equal to tf.reduce_mean again, and inside I say tf.square, something we already know. So please square the difference between the predicted y, which is the output our model is predicting, and test_y. These are the true labels, so, back to our data set, whether it's a rock or a mine. These are the true values, and the predicted y is the output of our model after it has been trained up here on the training set. Now we compare them and square the difference, and we reduce to the mean, because we want one value for this loss, this difference we have here, and store it in mse, the mean squared error. The next thing is mse with an underscore: mse_ is equal to, and now I want to run the session, so sess.run, and I refer to mse. So this evaluates the squaring here, the difference, calculating the loss if you want, within a session, and stores it in mse_ (mse dash). Now I append this error to mse_history. That's why I created this empty list up here. So I say mse_history dot append, and what do I want to append to the list? I want to append mse_.
So: please calculate the loss on my test data, from the model output to the true output, the labels, and then append it to this list, not in NumPy but in Python. Sorry. After this, let's do the same here for the accuracy itself. In this case accuracy is equal to sess.run; I'm running the session for the accuracy. In here I put in accuracy, so please run the accuracy, and the feed dictionary, feed_dict, is equal to my dictionary again: I say x, and I put in my train data, so train_x, and for my y_ in this case it's train_y. So that's my accuracy. Of course, I also want to append this to the accuracy history; that's why I created the accuracy_history list up here, the same as I did for the squared error. For each iteration, show me the squared error, the loss. I want to see this loss reduce, reduce, reduce with each epoch, and I also want to see the accuracy, which of course should rise with each epoch. We want better accuracy, and thus better performance of the model. So I say accuracy_history, my list, and to this list I want to append the accuracy. Finally, I want to print something. So I say print, and in this case I print epoch, so which step of the iteration I am currently in. I type epoch, which gives me this value. Then I use a dash here. The second thing is the cost. What is the cost, the current cost?
So a space in here as well, and then the current cost is cost. And this is only for printing. Then the MSE, so what is the mean squared error currently. So a dash here, MSE, and the mean squared error currently is nothing more than mse_. I'm referring to all these variables we have defined up here, and I want to print them for each of my epochs. Okay, that's what we do. One more thing: the train accuracy. So train accuracy, and here I want to put in accuracy. Okay, so that's it for this, and now we can plot it and then also take a look at the final score. This would be the final part of the code, but this is something we cover in the next video, because this one is already way too long. So thanks for your attention, thanks for coding along with me, and hopefully see you in the next video. Until then, bye guys.
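The per-epoch bookkeeping described in this lecture (store the metrics in the history lists, then print one line per epoch) can be sketched in plain Python as follows. The helper log_epoch and the numbers in placeholder_metrics are hypothetical; in the course code the values would come from sess.run calls inside the training loop.

```python
# Sketch of the per-epoch bookkeeping: store the metrics, then print one line.
# The metric values are placeholders standing in for the results of sess.run(...).

mse_history = []
accuracy_history = []

def log_epoch(epoch, cost, mse_, accuracy):
    """Append the metrics to the history lists and return the line to print."""
    mse_history.append(mse_)
    accuracy_history.append(accuracy)
    return f"epoch: {epoch} - cost: {cost} - MSE: {mse_} - Train Accuracy: {accuracy}"

# Placeholder numbers for three epochs: (cost, mse_, accuracy).
placeholder_metrics = [(0.9, 0.30, 0.50), (0.7, 0.25, 0.60), (0.6, 0.20, 0.70)]

for epoch, (cost, mse_, accuracy) in enumerate(placeholder_metrics):
    print(log_epoch(epoch, cost, mse_, accuracy))
```

Keeping the histories as plain lists is what makes the later plotting step simple: each list has exactly one entry per epoch, in order.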
17. 17 Start building the network 7 finishing: Hey guys, welcome back. Remember, at the beginning of our neural network we imported matplotlib, because we want to plot something. And then we used %matplotlib inline. And this is exactly what we will use now, because what we want to visualize is the accuracy, so the development of the accuracy of our model. And we also want to plot the error. That's why we created these lists here, so that we have all the historical errors and all the historical accuracies. And that's what we want to plot. So I say plt dot, and, did I miss something? Because it's red here. I don't think so. Okay, here we go. So plt.plot. What do I want to plot? I want to see the mse_history. What I also want to do here is put in an r, like that. Then I want to show it, so plt.show, like that. We run this, and then I will do the same for the accuracy history. So plt.plot, and I say accuracy underscore history, like that. And I will also show this, so plt.show, with parentheses. And here we go. So now we can create a final prediction here. Let's say correct_prediction, after plotting this, and I say tf.equal, and inside I say tf.argmax again. You already know this. So we are comparing the maximum of y, comma 1, and I do the same here with tf.argmax for y_, comma 1. Okay, y_, that's the true value.
And we compare this to the maximum value of the output of our neural network, and tf.equal returns trues and falses to us. So we need to cast this again. So I say accuracy in this case is equal to, and I say tf.reduce_mean, so reducing to the mean, and then tf.cast. Here I put in the correct prediction, so correct_prediction, referring to this one, and I want to cast this into a float, tf.float32. So simply again: take these trues and falses, please convert them into numerical values, and then reduce this to one value. The accuracy should in the end be a percentage figure. That's what we want, and that's what we get with reduce_mean here. So out of all the values, give me the mean, and before you give me the mean, of course, you need to convert them: instead of trues and falses I want to have numerical values here. All right, so we get the accuracy. Now let's print this. So print, and I say Test Accuracy, like that, and then a space here. Now we run the session, so sess.run, and I want to run accuracy, like that. And then we give it, of course, a feed dictionary. For the placeholders I need to give some data, to fulfill my promise, if you want. So I use my x, and here I give it test_x. And for y_, of course, I give it the test y, so test underscore y. Because now we are referring to the test data. We have finished training, and now we want to see what the accuracy is on unseen data, on the test data. That's why I'm referring to the test values here. Okay, so close this parenthesis, close this parenthesis, and I think I missed one parenthesis here.
I think there should also be one. Okay. All right, so that can be run. And let's also print the last mean squared error. Be aware, it's sometimes really difficult not to forget one of the parentheses. All right, so now let's print the final mean squared error. So pred_y in this case is equal to sess.run. So I'm running a session, and I say y, and I put in my feed dictionary. So feed_dict in this case is equal to, and the prediction, of course, will be done based on the x values from the test data, so x, referring to test underscore x. I close this, and now I say mean squared error, so mse is equal to tf.reduce_mean. So please give me only one value for this mean squared error, so for the error, or the loss, if you want, of our model. And here I'm referring again to tf.square. Please give me the squared difference of the prediction, so pred_y, which is the prediction of our model, minus test underscore y. Okay, the difference squared, reduced to one value. This is the mean squared error. And then I simply say print. I print MSE, so the mean squared error, and here I put in a percent sign. In this case I want a decimal number with four digits. And here I can simply put something inside, so percent, and then sess.run and mse, like that. And that's it. All right, so that's it for our model. It's quite a lot of code, I get it. But we went through all the different lines, if you want, so all the different steps and all the functions.
Hopefully you were able to follow along and understood what I did. I tried to explain it as well as I can. I'm not a university professor, of course, but hopefully you were able to follow along and it has helped you so far. So that's it for this video, and I'll see you in the next one.
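To make the final evaluation from this lecture more concrete, here is a plain-Python sketch of the accuracy calculation (argmax per row, compare, cast to numbers, take the mean) and of printing a number with four decimal digits. The helper names and all values are illustrative assumptions; the course itself does this with tf.argmax, tf.equal, tf.cast and tf.reduce_mean inside a session.

```python
# Plain-Python sketch of:
#   correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
#   accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# The lists below are illustrative values, not model output from the course.

def argmax(row):
    """Index of the largest value in a row (the first one wins on ties)."""
    return max(range(len(row)), key=lambda i: row[i])

def accuracy(y, y_):
    """Fraction of rows where the predicted class equals the true class."""
    correct_prediction = [argmax(p) == argmax(t) for p, t in zip(y, y_)]  # trues and falses
    as_numbers = [float(c) for c in correct_prediction]                  # cast to 1.0 / 0.0
    return sum(as_numbers) / len(as_numbers)                             # reduce to one value

y = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]    # model outputs
y_ = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]   # true one-hot labels

test_accuracy = accuracy(y, y_)
print("Test Accuracy: ", test_accuracy)   # 3 of 4 rows match, so 0.75
print("MSE: %.4f" % 0.123456)             # %.4f keeps four digits after the decimal point
```

The cast matters because a mean over trues and falses is only defined once they are numbers; after the cast, the mean is simply the share of correct rows.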
18. 18 Start building the network 8 corrections and running: Hi guys, welcome back. Now let's get into it. Let's run the model right now. There's one problem: I made a few mistakes, so if you run this model right now, you will probably get errors. If I run this, you can see that we get an error, because I actually made four mistakes here in the code. I'm really sorry for that. But this is something which happens all the time when you are working with this kind of code. You can see that this is a lot of lines, and you easily make mistakes. So I just want to tell you: get used to it. This is what happens, for me at least. Maybe you have better spelling than I do. But this is something which happens quite often, so don't worry about it. Simply search for the mistakes. Read through the error message, search the code, and then you can see here, for instance, tf.zeros: this is not subscriptable. The reason is the following, and this was my first mistake. If you go up here to where I have my zeros, let's see, where do I find them, where I initialized here with tf.Variable the zeros itself: the zeros, of course, need to be put into additional parentheses. And this is the same for this one here with tf.zeros, so I also need to put them in here. The next mistake I made, if I scroll down, is the following. Unfortunately, I made two parentheses here. I don't know why, but of course I need to get rid of them. The relu function is a different activation function, but the code is the same. tf.nn.relu, tf.nn.sigmoid or tf.nn.softmax, it's always the same. So this was my second mistake. My third mistake was that here with the epochs, what
I want to feed in here should not go into y, I want to feed this into y_. So also here for my cost, I put in y_. That's something I missed here. And the last thing I missed was that I made a spelling mistake again. So let's go down here and take a look where to find it, with the MSE. Here I wrote, for the prediction, pred_y minus text_y, and of course this is test_y. So this should be it. These were the four mistakes I made. They are now corrected, hopefully in your code as well, or you did not make the mistakes at all, which would be even better. But as I said before, it is something which happens all the time. So don't worry. Just take a look at the error message, go to the code snippet, take a look at it, and correct the mistake. Okay, so let's run this. Before I do that: I chose to use only 100 instead of 1000 training epochs here, because I don't want you to wait all the time while my model is training. Of course it depends on whether you have a strong CPU, or GPU, sorry. If you have a strong Nvidia GPU, then of course you can train much faster, because for that you need a strong graphics card or a better processor. But let's go with 100 epochs, train the model, and run this. Okay, so I run this, and now we can see the epochs, and it starts running. It's really fast, actually, you can see it, and we print each epoch. So for each of our 100 training steps we print the cost first, we print the mean squared error, and we print the training accuracy. What we want is for the cost to be reduced and the training accuracy to get higher and higher. And you can see it here: the cost gets smaller and smaller.
Here it gets a little higher, but the overall trend, that's the main idea, is that the cost gets smaller and smaller and the accuracy should get higher and higher. For now we only have 100 epochs, so 99 was the last one, because we started with epoch zero. So 100 epochs, or training steps, where we did the same thing we've done before in a much simpler example: we simply reduce the cost and, in this case, increase the accuracy, which simply means improving the model. And in this case, at the end, we arrive at an accuracy of 51%, which is really bad, I know. But here I'd say you need to train longer. This was just with 100 epochs. You can train much longer. Use 1000, use even more, and you get better accuracy. But you can see here how the loss gets reduced. We plotted this with the matplotlib library. You can see it falls very quickly, and then it looks like it's constant, but it's not. The cost gets smaller and smaller. And you can also see the accuracy here, and the accuracy gets better and better. In this case it goes up to 51; the highest here was 56, and here it was 47, which was lower. So again, I would encourage you to use more steps, more training iterations, so that you get a better output. And then you have the test accuracy. Here we reduce it to the mean, so this is the overall test accuracy. We trained on the training data set, and then we used the model on our test data set, and the test data set gave us back an accuracy of 0.54 in this case. So the model was correct in predicting the output in 54.76% of the cases. And this is the mean squared error currently. Again: change things, play around with the hyperparameters. For instance, use more neurons, use more hidden layers. And of course, and this is a really important thing, use more training iterations than simply 100, which I did.
OK, so yeah, that's it. You're done. Congratulations. That's all I have to say now, because you have done it. You made it through TensorFlow. You learned about TensorFlow starting from scratch. You are now able to implement your first neural network, so a multilayer perceptron, creating a neural network in Python and in TensorFlow. You now understand all the steps which are necessary, and you are able to create your own model and use different kinds of data instead of the rocks and mines which we had here. You could also use this on a different data set if you want, on your own data set maybe, and you can play around with it and make predictions. So that's it. Thanks a lot for watching, and I'll see you in the next video. Until then, bye guys.
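Coming back to the first correction in this lecture, the "not subscriptable" error: the transcript does not show the exact line, so the following is only a sketch under the assumption that square brackets were used directly on the function instead of calling it with parentheses. The zeros function below is a hypothetical stand-in written in plain Python, not tf.zeros itself, but Python raises the same kind of TypeError for any function that is indexed rather than called.

```python
# Stand-in for tf.zeros, used only to reproduce the kind of error message.
def zeros(shape):
    """Return a nested list of zeros with the given [rows, cols] shape."""
    rows, cols = shape
    return [[0.0] * cols for _ in range(rows)]

try:
    zeros[[2, 3]]            # wrong: square brackets try to index the function object
except TypeError as err:
    error_message = str(err)
    print(error_message)     # the message says the object is not subscriptable

fixed = zeros([2, 3])        # right: call with parentheses, pass the shape as a list
print(fixed)
```

Reading the error message this way usually points straight at the line where the extra parentheses are missing, which is the habit the lecture recommends.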
19. 19 Log out final words and important clues: Outlook. Now, the final words. Let me first tell you: congratulations. You did it. You made it through this course. So you dived into one of the hottest topics currently out there, and you built up a solid basic understanding. You started from scratch as a total beginner. You had no idea about TensorFlow, okay, that's what I'm supposing now, and you learned step by step the basic TensorFlow syntax. And you now know how you can implement your own neural network in TensorFlow code in Python. If you want to learn more, then feel free to check out my other machine learning courses, my deep learning courses and my neural networks courses. By now you know that deep learning and neural networks are closely related, of course. And I have various courses where we dive much deeper into not only TensorFlow but also use other kinds of libraries on top of TensorFlow to make the whole code easier. Yes, there's a way. But I still think it's necessary to know the basics, and now you know them. Within these other courses we also dive much deeper into other examples, and we try to figure out different kinds of things. We also try to make predictions on various kinds of, well, reviews or other kinds of things. Okay, so really cool stuff. Feel free to check it out if you're interested. And if you want a special student discount, feel free to write me an email and I'll get back to you. Okay, so the last thing is: never stop learning. And of course, thank you. And just by the way, this is something I want to show you, because I could not simply leave it at the 50% which we had. So I trained it with those 1000 steps, and with the 1000 steps we are already at an accuracy of 78% for the test data set and 84% for our training data set. And I'm sure if you again use more than 1000 iterations, or you try to improve the hyperparameters,
then you get much better, even much better percentage figures than only the accuracy of 84%. But still, this is at least much better than when we trained with 100 epochs, where we only arrived at around 50%. Okay, so yeah, that's it for my last video for this course. Thank you a lot for your interest. It means a lot to me. And if you want to rate the course, of course, this also helps me and gives me some kind of feedback, which I highly appreciate. So then, hopefully see you in another course, and I wish you all the best. Until then, bye guys.