Transcripts
1. Welcome to this Course: Hi, welcome to this
course on TensorFlow. TensorFlow is a library that helps engineers build and train machine-learning models. In this course, you will
learn about tensors and how to work with
tensors using TensorFlow. We will start by looking
at what tensors are. We will then learn how to
build tensors using data. We will then see how to perform basic and intermediate math
operations using TensorFlow. TensorFlow can also work with GPUs and TPUs, which are types of computer chips built to accelerate tensor computations. These chips make TensorFlow run faster, which is helpful when you have a lot of data to work with. At the end of the
course, you will have a good
understanding of what TensorFlow is and how we use it to build deep
learning models. We will also be
building a computer vision project where
we will create a simple TensorFlow model to recognize handwritten digits. This is a beginner-level course, but I'm assuming some basic knowledge of Python and machine learning. You don't have to be an
expert in machine learning. But if you understand
how data is used to train models
for prediction, you'll be able to
understand this course. If not, please find links to a couple of intro videos
in the course description. If you're stuck at any point in the course, send me an email at admonition shiva.com, and I'll get back to you as soon as I can. So let's get started.
2. Tensors and Tensorflow: In this lesson, we will
look at what a tensor is, followed by the popular deep
learning library TensorFlow. Let's first look at
what a tensor is. A simple explanation
would be that a tensor is a
multi-dimensional array. For example, we have scalars, which are just single numbers. Then we have a vector, also called an array. Then we have a matrix, which is a two-dimensional array. Finally, we have a tensor, which is an n-dimensional array, meaning it can have any number of dimensions. In TensorFlow, everything can be considered a tensor,
including a scalar. A scalar will be a tensor
of dimension zero, a vector of dimension one, and a matrix of dimension two. This is useful because we're not limited when working with complex data sets in TensorFlow: it can handle any type of data and feed it to machine learning models.
TensorFlow is an open-source software library for building deep neural networks. It was built by the Google Brain team, and it is now the most popular deep-learning library on the market today. You can use TensorFlow
to build AI models including image and
speech recognition, natural language processing,
and predictive modelling. TensorFlow uses a dataflow graph to represent computations. To put it simply, TensorFlow has made it easy to build complex machine learning models and takes care of a lot of work behind the scenes, which makes it useful when building and training any type of deep learning model. TensorFlow also manages the computation, including parallelization and optimization, on the user's behalf. TensorFlow has a high
level API called Keras. Keras was initially a standalone project, which is now available within the TensorFlow library. Keras makes it easy to define and train models, while TensorFlow provides more control over the computation. TensorFlow supports a
wide range of hardware, including CPUs, GPUs and TPUs. TPUs are Tensor Processing Units, built specifically to accelerate tensor operations in TensorFlow. You can also run TensorFlow on mobile devices and IoT devices using TensorFlow Lite. TensorFlow also has a large community of developers, and it is updated with new features and improvements almost on a monthly basis. I hope this video helped you to understand tensors and TensorFlow in detail. Next, we'll start working with TensorFlow on a Google Colab notebook.
3. Working with Tensorflow: Let's start writing some code. I'll be using a Google Colab
notebook and you can find the link for that completed notebook in the
course description. First, let's connect this notebook with a CPU. Let's wait for a minute. Now it's initializing, and great, it is connected. If you don't know, Google Colab notebooks help us to run Python and TensorFlow code on the web. It's much easier to work with rather than setting up a local development environment. Now let's start by importing TensorFlow and printing out the version. You can press Cmd+Enter or Ctrl+Enter to run the code block. Great, we're using version 2.9.2. If you have a different version, don't worry about it; there won't be much difference.
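In code, that first block might look like this minimal sketch (the exact version string you see will depend on your runtime):

```python
import tensorflow as tf

# prints the TensorFlow version; any recent 2.x release should work for this course
print(tf.__version__)
```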
Let's start by creating a scalar using tf.constant. tf.constant is a function that we will be using a lot in this course. But in real-world scenarios, we won't be using it that much, because TensorFlow will handle a lot of the tensor creation for you. For now, let's create a scalar. I'll give it the value seven. And let's print it. You can see that we have created a scalar with the value seven. It doesn't have a shape because
it's just a single value. And the data type is integer 32. Now let's create a vector. The input will be an array with two values. And now let's print it. Great, we can see that we have created a vector of shape two, and the data type is integer 32. Now let's try creating a matrix. The input will be a two-dimensional array. Let's print it. Great, we have created a matrix which has shape two by two. It's a two-dimensional array, and the data type is integer 32. Now let's create a tensor. I'll call it tensor, and the input will be a three-dimensional array. And let's print it. You can see that we have created an actual tensor, which is also a three-dimensional array, and it has a shape of three by one by three. This is how we create an actual tensor using tf.constant. We can see that the
datatype is integer 32. What if we wanted to use a different datatype, let's say float 32? We can pass the data type as an argument when creating a tensor. I'll just copy the same tensor we created, give it a new name with the same values, but I'll specify the data type as float 32. You can see that now the data type is float 32. This is how you can change the data type when working with TensorFlow. In case you run into issues when you're working with large models and tensor sizes, you can handle it by changing the data type. In real-world scenarios, we'll be dealing with tensors of higher dimensions and even bigger shapes. In the following lessons, I'll also show you how to convert a real-world data set like a group of images into a tensor.
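Put together, the tf.constant calls from this lesson might look like the following sketch. The exact values weren't clear in the recording, so treat them as stand-ins:

```python
import tensorflow as tf

scalar = tf.constant(7)                            # dimension 0, no shape
vector = tf.constant([10, 10])                     # shape (2,), stand-in values
matrix = tf.constant([[1, 2], [3, 4]])             # shape (2, 2), stand-in values
tensor = tf.constant([[[1, 2, 3]], [[4, 5, 6]], [[7, 8, 9]]])  # shape (3, 1, 3)

# same matrix values, but with an explicit float32 data type
float_tensor = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)

print(scalar, vector, matrix, tensor, float_tensor, sep="\n")
```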
We have seen tf.constant, which is used to create constant tensors. This is what you'll be using throughout the course. But if you want to create a variable tensor, you can use tf.Variable. The difference between constants and variables is that we can change the values in a variable tensor, but in a constant tensor you can't change the values. Let's create a variable tensor. I'll use the same tensor values, call it a different name, and use tf.Variable. Now I'll print it. You can see that we have created a variable tensor with the same shape, and the data type is integer 32.
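Here is a small sketch of the constant-versus-variable difference. The assign call is my illustration of how a variable's values can be changed; it isn't shown in the recording:

```python
import tensorflow as tf

variable_tensor = tf.Variable([[1, 2], [3, 4]])
constant_tensor = tf.constant([[1, 2], [3, 4]])

variable_tensor[0, 0].assign(9)    # fine: variables are mutable
# constant_tensor[0, 0].assign(9)  # would raise an error: constants can't be changed
print(variable_tensor)
```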
One of the most important attributes of a tensor is its dimension. Let's look at the dimension of each of these tensors, starting with the vector. Let's print the dimension using the ndim property. There we go. We can see that its dimension is one, which is expected. Now let's look at the scalar. Its dimension will be zero because it's just a single value. And for the matrix, the dimension is two. There we go. And for the tensor, the dimension will be three. Think of dimensions as the number of columns. For example, if you're using a dataset to calculate housing prices, you will be inputting the square footage, the location, and maybe a few other inputs. Each of these inputs will be called a dimension. They can also be called features.
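As a quick sketch of the ndim checks described above (the tensor values are the same stand-ins as before):

```python
import tensorflow as tf

print(tf.constant(7).ndim)                  # scalar -> 0
print(tf.constant([10, 10]).ndim)           # vector -> 1
print(tf.constant([[1, 2], [3, 4]]).ndim)   # matrix -> 2
print(tf.constant([[[1, 2, 3]]]).ndim)      # 3-D tensor -> 3
```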
I hope this lesson helped you to understand how to create tensors and find basic attributes like shapes and dimensions. In the next lesson, we will see how to generate tensors, and we'll also see how to load tensors from NumPy arrays.
4. Generating and Loading Tensors: Now let's look at how to generate tensors. In most cases, you won't be creating tensors from scratch. You will need to load a dataset, convert other datasets like NumPy arrays to tensors, or generate tensors. Let's see how we can
generate some tensors. We will first create a
tensor with random values. There are two common
ways you can do this. You can generate a normal distribution of data, or you can generate a uniform distribution of data. The normal distribution is a bell-shaped curve that represents the distribution of data. This means that most of the data will be close to the average, and fewer data points are away from the average. It basically means that the probability of getting a value near the average is higher. The uniform distribution, however, is a straight line that represents the distribution of data. So all the values in a uniform distribution will have an equal probability of occurring within the given range. There is one more thing
you need to know before we start generating
random values. There is a concept called a seed. A seed is just a value, and if we use a seed value, we can regenerate the same set of data multiple times. You know that we are going to generate random values, right? So if we use a seed, the same random values will be generated again and again. This is really useful when you're working with a machine learning model and you want to test that model against the same set of data. So let me create the seed. I'll call it seed, and it will be tf.random.Generator.from_seed. That will set the value, let's say 42. Now, I'm going to create a set of random values based on the normal distribution. The normal tensor will be seed.normal, and I'll pass it a shape, let's say three by two. It has been created, let's print it. We have a tensor with the shape of three by two. And if you plot all these values, you can see that they form a bell curve. So all these belong to the normal distribution. Now let's create another random tensor with the uniform distribution. The uniform tensor will be seed.uniform. I'll give it the same shape, three by two. And let me print it. Now this is a uniform tensor with the shape of three by two. So this is how we can generate a set of random tensors using normal and uniform distributions.
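A minimal sketch of the seeded generator approach described above:

```python
import tensorflow as tf

seed = tf.random.Generator.from_seed(42)  # same seed -> same "random" values each run

normal_tensor = seed.normal(shape=(3, 2))    # values drawn from a bell curve
uniform_tensor = seed.uniform(shape=(3, 2))  # values equally likely within [0, 1)

print(normal_tensor)
print(uniform_tensor)
```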
Now let's see how to create tensors with zeros and ones. You might be wondering why we need these in TensorFlow. Tensors filled with zeros and ones are often used as a starting point for creating other tensors. For example, they can also be used as placeholders for inputs in a computational graph. So let's first create a tensor with zeros. For that, we'll be using the tf.zeros function. Let's call it zeros, and we'll give it a shape. You can see that we have created a three by two tensor filled with zeros. Now let's create the same tensor with ones: tf.ones, with shape equal to three by two. Let's print it. Here we go. We have a tensor of shape three by two filled with ones. There are more types of prefilled tensors that you can create using TensorFlow, but these are the common ones that you will be using.
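In code, those two blocks might look like this:

```python
import tensorflow as tf

zeros = tf.zeros(shape=(3, 2))  # 3x2 tensor filled with zeros
ones = tf.ones(shape=(3, 2))    # 3x2 tensor filled with ones
print(zeros)
print(ones)
```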
Now let's look at how to convert a NumPy array into a tensor. If you don't know what NumPy is, it is a Python library for numerical computing. It helps us handle large data sets and perform a variety of computations on them. And before we had TensorFlow, NumPy was all we had when we were working with machine learning models. So let's first import NumPy and create a NumPy array. I'll create a NumPy array, and I'll be using the arange function, which tells NumPy to generate a list of values, let's say from 1 to 25. And the data type will be NumPy int. Let's print it. There we go. We have an array with 24 values, ranging from 1 to 24. Now we're going to convert this into a tensor. TensorFlow has native support for NumPy, so you can easily convert a NumPy array into a tensor object. I'll call tf.constant, and my input will be the NumPy array I just created. And I can set a custom shape, let's say two by four by three. And let's print it. There we go. We have converted a one-dimensional NumPy array into a two by four by three tensor.
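A sketch of that conversion (the variable name numpy_array is mine; the recording's name wasn't clear):

```python
import numpy as np
import tensorflow as tf

numpy_array = np.arange(1, 25, dtype=np.int32)      # 24 values: 1 through 24
tensor = tf.constant(numpy_array, shape=(2, 4, 3))  # reshape while converting
print(tensor)
```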
You'll be doing this often when you're working with real-world machine learning problems, because oftentimes there will be other machine learning algorithms or other frameworks that people will be using, and you will get the final dataset as a NumPy array. So it's very important to understand how to convert NumPy arrays into tensor objects. I hope this lesson helped you to understand how to generate tensors and how to convert NumPy arrays into tensors. In the next lesson, we will see how to perform some basic calculations, aggregations and matrix multiplication using TensorFlow.
5. Basic Operations using Tensorflow: Let's look at some basic
operations using tensors. We will start by looking at how to get some information from our existing tensors. Let me create a new 4D tensor, with a shape of two comma three comma four comma five, and let's print it. So we have created a 4D tensor with the shape of two comma three comma four comma five. Now let's get some information about this tensor. First, print the size of the tensor. For that we use tf.size. Let me also print out the shape; for that, call the tensor's shape property. Then let's print the dimension using the ndim property. Let's see what comes up. There we go. The size of our tensor is 120 because there are 120 values. We have the shape, and we have the dimension. This will be useful when you're working with complicated tensors and you want a quick way to get some information about the shape, size and dimensions.
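A sketch of those checks. I'm using tf.zeros as a stand-in, since the exact creation call wasn't clear in the recording:

```python
import tensorflow as tf

tensor_4d = tf.zeros(shape=(2, 3, 4, 5))  # a 4D tensor; zeros are stand-in values

print(tf.size(tensor_4d))  # 120 elements in total
print(tensor_4d.shape)     # (2, 3, 4, 5)
print(tensor_4d.ndim)      # 4
```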
Now let's perform some basic operations on tensors. Let me define a simple tensor; we'll call it basic tensor, created with tf.constant as a 2D tensor with values like ten, eleven and twelve. Great. Let's try addition. I just want to add a number to each value in the tensor. You can simply say basic tensor plus ten. This will add ten to all the numbers in the tensor, and we print out the result. Let's try subtraction. I'll also include multiplication and division: multiplied by ten and divided by ten. Let's print all of it. There we go. The first is the addition, the second is subtraction, the third is all the values multiplied by ten, and the fourth is all the values divided by ten. The division is a bit important because we'll be using it in our project. There is a concept called normalization, where we convert values in a range into zero to one. I'll explain that in detail when we come to that lesson, but please keep this in mind.
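Here is what those element-wise operations might look like; the tensor values are stand-ins:

```python
import tensorflow as tf

basic_tensor = tf.constant([[10, 11], [12, 13]])  # stand-in values

print(basic_tensor + 10)  # addition
print(basic_tensor - 10)  # subtraction
print(basic_tensor * 10)  # multiplication
print(basic_tensor / 10)  # division
```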
Now let's look at matrix multiplication. This is also something that we'll be doing often when we are working on machine learning projects. Let me quickly create two tensors. We'll call the first one tensor 01, created with tf.constant, with the values two comma two and four comma four. Then I'll create a new tensor called tensor 02. Before we go into matrix multiplication, please keep in mind that the inner dimensions of the tensors that you're trying to multiply should match. For example, let's assume you have two tensors, each of shape three by five. This multiplication won't work. But if you have two tensors with shapes three by five and five by three, that will work, because the inner dimensions match. The final result of the multiplication will have the shape of the outer dimensions. So if you have two tensors with five by three and three by five as the shapes, the final result will be five by five. Let's try multiplying these two tensors. We'll be using the tf.matmul function. Let me print tf.matmul of tensor 01 and tensor 02. There we go. We have the product of these two matrices.
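A sketch of that multiplication; tensor 02's values weren't clear in the recording, so they're stand-ins:

```python
import tensorflow as tf

tensor_01 = tf.constant([[2, 2], [4, 4]])
tensor_02 = tf.constant([[2, 3], [4, 5]])  # stand-in values

# the inner dimensions (2 and 2) match, so the product is defined
print(tf.matmul(tensor_01, tensor_02))
```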
Now let's look at two more matrix operations: reshape and transpose. We will often use reshape to change a matrix's structure when training neural networks. For example, an image of 28 by 28 pixels will be converted into a one-dimensional array of 784 values. You will see this in our upcoming project, and I'll explain it in detail. But for now, understand that reshaping is a very important concept in TensorFlow. I'll use the same tensor that we created in the previous code block, tensor 01, and let me reshape it into four by one. For that I'll be using tf.reshape. My first argument will be the actual tensor, which is tensor 01, and the second argument will be the shape, let's say four by one. So if you see this, I'm trying to convert a two by two tensor into a four by one tensor. There we go. So the values two comma two and four comma four are converted into a four by one tensor. This is how we reshape a tensor. Now let's see how to transpose a tensor. We'll be using the tf.transpose function, and you can give it the same tensor, tensor 01. Great. There we go. You can see that the values have been transposed: the values two comma two and four comma four are now two comma four and two comma four. If you don't know what transposing a matrix is, it is just converting the rows into columns and the columns into rows.
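In code, using the same tensor 01 from the multiplication sketch:

```python
import tensorflow as tf

tensor_01 = tf.constant([[2, 2], [4, 4]])

print(tf.reshape(tensor_01, shape=(4, 1)))  # 2x2 -> 4x1: [[2], [2], [4], [4]]
print(tf.transpose(tensor_01))              # rows become columns: [[2, 4], [2, 4]]
```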
using dense us. In often cases you would
want to find the sum, mean, median, and standard
deviation of a pencil. Let me create a symbol
dancer, DFP, constant. We'll just do a
simple array, 1-9, pre band six, settling
a date, work. I'll set the data diapers for. Now. Let me print out some values. First, I want to
find the minimum value within this tensor. For that, I'll be using
the reduced Min function f dot with her, but abuse and then give the denser lead
us input on three. Let's print it. You can see that the
minimum value is one. Let's do a few more
aggregations like this. Reduce. Let's copy this. Then. Now to find the sun. Let's try them English. So the minimum value is one, the maximum value is nine
and the sun was 45 grid. Now let's open a new code block and dry somewhat aggregations. I wondered then the
standard deviation of this tensor for that, I'll be using tf dot map, reduce, STD, and the denser
lastly in bold there. And I want to print
the variance, which is also in P of Dartmouth. Let's try this. There we go. We have the standard
deviation and the variance. Now let's try a few
more simple aggregation is we're inclined to find this quiet is quiet volt and the log of all the
ladies of the tensor. First to the squash ruled, which is T of dark SQRT, denser than disquiet will
just tf dot squared denser. And I wanted to print the
log, which is in math. So df.net dot log of the
densa, let's print it. You can see that DFS quite
robust phone disquiet or for each value of the tensor. And D are both quiet squares, all the numbers in the denser. And finally, Math.min log finds the log value of all the
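A sketch of all those aggregations together. I'm assuming the values one through nine, since they match the minimum, maximum and sum mentioned in the lesson:

```python
import tensorflow as tf

tensor = tf.constant([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=tf.float32)

print(tf.reduce_min(tensor))            # 1.0
print(tf.reduce_max(tensor))            # 9.0
print(tf.reduce_sum(tensor))            # 45.0
print(tf.math.reduce_std(tensor))       # standard deviation
print(tf.math.reduce_variance(tensor))  # variance
print(tf.sqrt(tensor))                  # square root of each value
print(tf.square(tensor))                # square of each value
print(tf.math.log(tensor))              # natural log of each value
```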
These are the basic operations that you need to know for now. In the next video,
we will look at an important concept
called one-hot encoding.
6. One-hot Encoding: In this lesson, we will look
at an important concept that you will come across in deep
learning, one-hot encoding. One-hot encoding is a
process used to represent a set of values as category-based
based binary data. This encoding creates
a new binary column for each unique category. Each row in the dataset
is then assigned either a one or a zero. So, for example, consider a dataset with a variable color having three unique categories: red, green, and blue. If we use one-hot encoding, this variable can be represented as three new binary columns. Each row in the dataset will have a one in the column corresponding to its color. For example, if you look at row one, which is red, the first value is one and the other two values are zero. For green, the first and the last values are zero and the middle is one. And for blue, the last value is one. Deep learning algorithms,
particularly neural networks, work with numerical data. One-hot encoding provides a convenient way
for us to convert these types of
categorical variables into simple numerical
representations. So these can then
be used as inputs or even outputs for
deep learning models. Using one-hot encoding, we can handle many types of categorical variables. This is usually difficult for other encoding schemes to handle. And finally, one-hot
encoding provides a clear mapping between
the categories and their corresponding
numerical representations. This makes it easier for us to interpret and
analyze the results of deep learning models. For example, if we try to build an algorithm to classify between cats and dogs, a one-hot encoded output will be much easier for us to convert into the final result.
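Here is a small sketch of the color example, assuming red=0, green=1, blue=2 as the category indices:

```python
import tensorflow as tf

colors = tf.constant([0, 1, 2])
print(tf.one_hot(colors, depth=3))
# [[1. 0. 0.]   <- red
#  [0. 1. 0.]   <- green
#  [0. 0. 1.]]  <- blue
```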
I hope this lesson helped you to understand how one-hot encoding works. In the next lesson,
we will see how TensorFlow can work
with GPUs and TPUs. We won't be working with GPUs and TPUs in this project, but it is great for you to
understand how you can make use of GPUs or TPUs if
you have them available.
7. Working with GPUs and TPUs: So let's see how to work
with GPUs and TPUs using TensorFlow. We have already seen what GPUs and TPUs are, but let's look at them in more detail and see how we can use them with TensorFlow. GPUs, or graphics processing units, and TPUs, tensor processing units, are special hardware designed for speeding up machine learning processes. They have many cores and can process data much faster than traditional CPUs. Let's first start by looking at what devices we have available
in this Colab notebook. For that, we will use tf.config.list_physical_devices. You can see that we only have a CPU allocated for us. Now, let's use our runtime settings and allocate a GPU: go to Change runtime type and convert it to a GPU. I won't be showing you how to connect to a TPU, because a TPU won't always be available; let's just look at the GPU and how we can work with that. Save it. Very important: once you change the runtime, please make sure that your Colab notebook is connected, and rerun TensorFlow. You just have to rerun the import block; you don't have to worry about the rest. Let's rerun it. Great. Now let's see what devices we have access to. There we go. We can see that we have a GPU allocated for us. As far as TensorFlow
is concerned, you don't have to switch
between CPUs and GPUs because TensorFlow
automatically takes care of it for you. If there are TPUs, the code
will be slightly different, but you won't be using
TPUs unless you're working with extremely
large deep learning models. If you want to specify a device for your code, you can use the tf.device function and give it the device name, for example GPU:0. Then you can write the rest of your code in this code block, and all the code inside it will run using the GPU.
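A sketch of the device checks and the tf.device block described above; only pin code to a GPU if one is actually allocated:

```python
import tensorflow as tf

print(tf.config.list_physical_devices())       # every device this runtime has
print(tf.config.list_physical_devices('GPU'))  # just the GPUs; empty on a CPU runtime

# runs the block on the first GPU (assumes the runtime actually has one)
with tf.device('/GPU:0'):
    result = tf.matmul(tf.ones((2, 2)), tf.ones((2, 2)))
```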
I hope this lesson helped you to understand what GPUs and TPUs are and how you can actually make use of them in your code. In the next lesson, we'll start with our actual project, which is a handwriting recognition neural network using TensorFlow and Keras.
8. Preparing the Model: Now that we have learned the TensorFlow basics, let's start building our project. We will use the MNIST dataset of handwritten digits to train our model. MNIST is an image dataset with 60,000 training images and 10,000 test images. So we can use the 60,000 training images for training our model, and we can use the 10,000 test images to see how well our model performs. The digits range from 0 to 9. Each image is of size 28 by 28 and in grayscale. So if you look at this block, the x-axis is 0 to 28 and the y-axis is 0 to 28. Each pixel will have a value ranging from 0 to 255; for example, the empty blocks will be zero and the darkest blocks will have the value of 255. So this is how each image is represented in this dataset. We will have a total of 784 pixel values per image. If this is clear to you, let's
start by adding the code. We will first import the MNIST dataset. I'll create a variable called mnist, and we can get the dataset from Keras: tf.keras.datasets. Now I'm going to create four variables: x train, y train, x test and y test. The x train and y train will be used for training, while the x test and y test will be used for testing. And I'll call the load_data function, which will load the downloaded data into these variables. So let's test it and print one value from x train. There we go. We have a 28 by 28 tensor, and y train has the value of five. So x has the images and y has the actual values of the digits. There is an important step we need to do, which is called normalization. Each pixel in these images ranges from 0 to 255, and we're going to convert that range to zero to one. This is to make the dataset simple and easy for the model to understand. In order to normalize these, I'm just going to divide all the image values by 255. So x train and x test will be divided by 255. Let's double-check this. There we go. Now our image values range from 0 to 1 instead of 0 to 255. Our dataset is ready.
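A sketch of the loading and normalization steps, following the standard Keras MNIST pattern:

```python
import tensorflow as tf

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# normalize: pixel values go from the 0-255 range down to the 0-1 range
x_train, x_test = x_train / 255.0, x_test / 255.0

print(x_train[0].shape)  # (28, 28)
print(y_train[0])        # the digit this image represents, e.g. 5
```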
Now let's start building our model. We will build a sequential model using Keras. As we saw before, Keras used to be an individual library for building deep learning models, but it has been integrated with TensorFlow. Before Keras, we used to code all the layers ourselves. It was quite complicated and made TensorFlow hard to work with. But with Keras, it's much easier for us to stack a bunch of layers together and build a network. I'll call it model, and I'll call the Sequential model. My first layer will be a flattening layer: tf.keras.layers.Flatten, with an input shape of 28 by 28. What this layer will do is take our 28 by 28 tensor and convert it into a single-dimensional array. So our input layer won't be a 28 by 28 layer; it will be a 784-value one-dimensional array. That will be passed as an input to our neural network. I'll show an image at the end of this model, so it will be much more clear to you. The next layer will
be a dense layer. This will have 128 neurons with an activation function called ReLU. Activation functions help us to capture patterns and relationships, and this ReLU activation is a commonly used activation function. It will take the output of the first layer and only return it if it's a positive value; if the output is negative, it returns zero. This helps us to avoid negative values and speeds up training the model. The next layer will be a dropout layer, and we'll set a dropout rate of 0.2. The dropout layer prevents overfitting. Overfitting happens when our model starts to lean towards a specific output. For example, if we train our model with 10,000 dog pictures and a thousand cat pictures, the model will automatically have a bias towards dogs and will classify more cat pictures as dogs. So this dropout layer helps us to reduce that. The 0.2 argument specifies that the dropout rate is 20%, meaning that 20% of the neurons in the previous layer will be dropped during training. Finally, we will construct another dense layer, which will be our output. This will have ten neurons, and let's just print out the model. I made a mistake of missing a comma; let me fix that. So our model is ready. This is what we have constructed. If you look at this model, the first layer has 784 neurons, which will be the input layer. This layer flattens our 28 by 28 tensor image into a 784-value one-dimensional array. That will be passed as an input. Then we have the activation layer, then we have the dropout layer, and finally, we have the output layer.
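Put together, the model might look like this sketch:

```python
import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),  # 28x28 image -> 784 values
    tf.keras.layers.Dense(128, activation='relu'),  # hidden layer with ReLU
    tf.keras.layers.Dropout(0.2),                   # drop 20% of neurons in training
    tf.keras.layers.Dense(10)                       # one output score per digit
])
model.summary()  # prints the layer structure
```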
We will need one more layer later in the project, which will be a softmax layer. All of these outputs will be predictions, but they won't be probabilities; they'll be scores. And we need a way to convert the scores into actual probabilities so that we know what the model is trying to predict. So we have just constructed a model. It is not trained yet, but we will just pass it some data and see how well it is working. Create a variable called predictions, and I'll call the model and pass a value from x train; let's pass the first value. The reason I'm using the slice operator is because the slice operator returns an array, and the input has to be an array of tensors. So even if it's a single value, it has to be wrapped in an array; that's why I'm using the slice operator. It returns an array of values. And let me print predictions. You'll see that it prints out a bunch of scores here, with positive and negative values. Let's see how softmax works. Softmax is basically a layer that converts these scores into actual probabilities. So create a softmax layer and pass it the predictions. Let's see what it comes up with. Now we have probabilities. This means that the model is trying to tell us the probability that the input belongs to any of these ten classes. The first one means zero, the second one means one, and the last one means nine. So we have probabilities for each digit in the dataset. Let's make this even simpler and see what the model is trying to predict. For that, we'll just convert these predictions into a simple array. I'll use NumPy to make it simpler, and I'll get the first element, because as you can see, this is an array within another array; I'm just grabbing the first value, and I'm going to convert it into a Python list. And I'm going to print the index of the maximum value in that list. So now we can see what the model is trying to predict. There we go, it's trying to say it's seven. Let's see what the actual value was; print the corresponding y train value. The actual value is five, but our model says it's seven. That's because our model isn't trained yet. Once we have trained our model, you will see how much this prediction improves.
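A sketch of that untrained-model check, continuing from the earlier sketches (simple_array is a name I'm using for illustration):

```python
import tensorflow as tf

predictions = model(x_train[:1])  # the slice keeps the batch dimension
probabilities = tf.keras.layers.Softmax()(predictions)

simple_array = probabilities.numpy()[0].tolist()
print(simple_array.index(max(simple_array)))  # the digit it currently guesses
```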
So with the code block we have written, we have constructed a fully connected deep neural network. In the next chapter, we'll be looking at optimizers and loss functions, and we'll see why they're important for us to train our model.
9. Optimizer and Loss Function: In this lesson, we will
discuss the use of loss functions and how they help in training deep
neural networks. We will also talk about
optimizers and see how we can use them to
minimize the loss function. We finish the lesson by looking at the popular Adam optimizer, which we will be using
for our project. A loss function is also known as a cost function. It is a mathematical function that measures the difference between the predicted output of a model and the actual target value. So when we train a neural network with inputs and outputs, the neural network will
generate its own output. It then compares
that output with the actual output that we have
given in the training set. So this is how the
neural network learns. The goal of the
training process is to minimize the value of
these loss functions. The closer the predicted values
are to the actual values, the lower the value of the
loss function will be. In deep learning, loss functions help us to improve the
accuracy of the model. The model parameters
are adjusted in such a way that the loss
function is minimized, giving us better predictions. The choice of loss
function will depend on the type of problem you find yourself working on. There are a lot of loss functions, like the cross-entropy loss function. Or, if you are working on a regression problem, you'll be using something like a mean squared error loss function. Regression means stock market price prediction, housing price prediction and other similar problems.
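To make that concrete, here is a tiny sketch of the mean squared error idea with made-up numbers (this is my illustration, not code from the course):

```python
import tensorflow as tf

# mean squared error: the average of (actual - predicted)^2
actual = tf.constant([3.0, 5.0])
predicted = tf.constant([2.5, 5.5])
mse = tf.reduce_mean(tf.square(actual - predicted))
print(mse)  # 0.25 -> closer predictions would push this toward zero
```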
You can find more resources on loss functions in the course description; I'll add a link for you. So now let's talk about optimizers. Optimizers are algorithms
the loss function. So they work by updating
the model parameters in such a way that it keeps bringing down the value
of the loss function. And as we saw before, the lower the loss function gets, the better the
model gets trained. There are many different optimizer choices available, each with its own strengths and weaknesses. One popular optimizer is called the Adam optimizer. The Adam optimizer is very effective in a wide range of deep learning tasks, and it's a great choice for us to use for this project. The Adam optimizer builds on gradient descent, which is an older algorithm. If you have studied machine learning, you would have heard about gradient descent. So the Adam optimizer is an improvement on gradient descent, and it's also computationally very efficient. You don't need to understand the logic behind these loss functions and optimizers for now; TensorFlow will handle all of that for you. For now, just understand that loss functions help us measure the error in our predictions, and optimizers help minimize the loss function. So I hope you understand how loss functions and optimizers work. In the next lesson, we'll construct a loss function along with an optimizer, and we'll start compiling and training our model.
10. Compiling and Training the Model: So now let's create the loss function. I'm going to use the sparse categorical cross entropy function, which is commonly used for classification models. It's a built-in loss function in Keras. And I'm going to set from_logits to true. I'm mentioning the logits because the actual predictions we get are logits; that's why I'm saying we're going to calculate losses from logits. Now let's compile our model. I'm going to specify the optimizer, which will be Adam, and my loss function is the loss function I just created. I'm also going to print out some accuracy metrics; for now, I just want to see the accuracy. This will tell us how good our model is getting during training. We have an error because I've made a mistake here, so let me fix it and make sure the loss name is spelled properly; it's one word: sparse_categorical_crossentropy. There we go. And now let's compile the model. There we go. Our model is compiled.
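The loss function and compile step might look like this, continuing from the model we built earlier:

```python
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])
```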
Now let's start training. We'll call model.fit, which fits the data into the neural network. And I'm going to pass it the x train and y train values. I'll also specify epochs equal to five. An epoch means an iteration; epochs tell the model how many times it should train with this dataset. So if we specify the epochs as five, the deep learning model will go through this dataset five times. This is for the model to understand any patterns that it has missed during the first or second iterations. So you can improve the training by increasing the epochs. But after a certain point, the accuracy will become constant. That means the model has learned as much as it can from the given data.
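The training call itself is a one-liner, continuing from the compiled model above:

```python
# 5 epochs = 5 full passes over the 60,000 training images
model.fit(x_train, y_train, epochs=5)
```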
So let's start with the epochs at five, and let me train the model. You can see that the training has started and the first iteration is going on. You can see our accuracy is now 91%. It goes higher in the second iteration, and it keeps getting better as the loss function gets reduced. Now we are at 97% accuracy. Let's increase the epochs to ten so that you can understand what it exactly does. When I increase the epochs, this model is already trained, so if you continue training it from this point, you'll see that the starting accuracy is already around 97%. It goes a little up and down, and you can see that the accuracy is now about 98%. It doesn't get better than that, so this is a good point for us to stop with our epoch count. There is only a certain number of iterations that are useful with a machine learning model; I'll put it back to five. So the training is now complete, and we are at 98% accuracy. In real-world scenarios, even if you have an accuracy of more than 80%, you have a decent deep-learning model. Our model is trained and ready to predict. In the next lesson, we'll play with some sample digit values and see how well our model performs.
11. Predicting Handwritten Digits: So we have built and trained our model. Now, let's see how good our model is. First, we will use the inbuilt evaluate function, pass it the testing data and see how well the model performs. Let's try that: model.evaluate, with x test and y test. I'll also set the verbosity so that we know what's happening in the background. So we have an accuracy of around 97%, and the loss is low. Great. This means that we have a good model. We need to add one more step before we start looking at the prediction values. As we saw before, this model prints out the scores for the predictions; it doesn't give us the probabilities. So we will add a softmax layer, and then we will look at the probabilities. Now I'm going to create another sequential model and club our existing model and a softmax layer together. Let's do that. We'll call this the probability model. This model is the same as our existing model, but we're just going to add a softmax layer. So let's call tf.keras.Sequential, and we're going to add our model and a softmax layer. Great. Now our model is ready.
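In code, the evaluation and the probability model might look like this, continuing from the trained model (verbose=2 is my assumption based on the "verbosity" remark in the recording):

```python
model.evaluate(x_test, y_test, verbose=2)  # accuracy on the 10,000 unseen test images

# wrap the trained model with a softmax layer so outputs become probabilities
probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])
```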
So we will take some values from the testing set, which is x test and y test, and see if our model gets the values correct. This is something we did earlier, but our model was predicting everything wrong; now let's see how it's doing after the training. I'm going to add another code block where first I'm going to print the original. I'll call it original, and it'll be a y test value; let's start at zero. Let's look at what y test has. It's seven. Great. Now we will pass the x test value at the same index to our model and ask it to predict it. If we get the same value, it means the model is working as expected. So let's say output will be the probability model, and I'm going to pass it the x test value at index zero. This is the first value, the zeroth element; that's why I'm slicing from zero to one. Now, I'm going to use the same logic which I did before to convert this into a NumPy array and get the index of the maximum probability. Let's say output is output.numpy(); grab the first element and convert it to a list. And the predicted value will be the index of the maximum value in the output array. So we're first printing out the original value, then we are passing the x value that belongs to this output to our model, and we're asking the model what it thinks the given input is. If our original and predicted values are equal, that means our model is doing well.
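A sketch of that prediction check, continuing from probability_model above; the index and variable names follow the lesson's description:

```python
index = 0  # try 1200 or 50 as well
print('original:', y_test[index])

output = probability_model(x_test[index:index + 1])  # slice keeps the batch dimension
output = output.numpy()[0].tolist()
print('predicted:', output.index(max(output)))       # index of the highest probability
```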
Let's try this. Great, our model got it right. Let's try another one; let's go with 1200 and do the same thing. There we go. The original is five and the predicted value is five. Let me try one more; let's go for 50. We get the same value. There we go, it's four and four. You can play with different inputs, and you will see that 97 out of 100 times our model will predict the values correctly. So great job. We have built a working deep learning model using computer vision that predicts handwritten digits. You can only imagine how other classification problems, like cat and dog predictions and other general image classification problems, work. I'm sure you have a few questions after going through this course, so please don't hesitate to get in touch with me. You can reach out to me if you're stuck at any point in the course, and I'll be happy to help you out. So let's do a quick summary of what we've seen so far.
12. Conclusion: Again, great job at
finishing the course. You have learned a lot. We started by looking
at tensors and how TensorFlow helps us to build
and work with tensors. We then saw how to perform operations on tensors and generate tensors using NumPy arrays. We also looked at loss functions and optimizers and how they help to improve our neural network. We then learned how to load the MNIST dataset and build a model using Keras. Finally, we compiled and trained our model to predict handwritten digits. I hope this course helps
you to understand how to work with TensorFlow. If you have any
questions, please don't hesitate to get in touch with me. You can reach out to me at help admonition shiva.com. Thank you for
taking this course. I would love to hear
your feedback to make the next
course even better. So thank you again for
taking this course. See you soon with a new topic.