Transcripts
1. Intro Video: Hello, everyone, and welcome to this course on Big O notation. This is a great computer
science topic that will help you become more
efficient at your job. And it'll also help you understand computer
science a bit better. This is great for all
types of professions, including programmers, as well as anyone in
the business realm. It'll help you increase
your productivity, as you'll understand what you're working with
just a little bit more, which will give you
more insight and action into the steps you might need to take to make
things more efficient. And that's what we're going to be covering
in this course. We're gonna be covering
the notation behind how we analyze different
algorithms and scalability. We want to understand if we
are building a program or a piece of code or even just
an organizational unit. Can it scale? If we are creating tasks and procedures that are
exponential in nature, then they'll never
be able to scale. And that's what we want to be able to analyze: does it work the same if we have ten people or 100,000 people? Think about if you want to check every single file in
a filing cabinet. There's one way
where you can pull out every single file and
individually look at it, or you can create a procedure
where it's organized and sorted so that you know
where to go to find things. And that's what you can do in the programming space as well, but also with the
things around us. Another example would be if we needed to add someone
to an organization, but to do that,
we had to contact every other member of
the organization first. That's an unscalable solution. It's an unscalable procedure. And we want to be able to analyze that, rework it, and make it a more scalable procedure. So we're going to be both
focusing on code in this, but understand that
these principles apply in daily life as well. They apply in things
all around us. So let's jump into this
computer science topic and take a look at some really
interesting mathematics.
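To make the filing-cabinet idea from this intro concrete, here is a minimal Python sketch (the file names and counts are made up for illustration): checking every file one by one takes work proportional to the cabinet's size, while a pre-built index lets us jump straight to the right spot.

```python
def find_by_scanning(files, target):
    """Pull out every file and look at it until we find the one we want."""
    checks = 0
    for name in files:
        checks += 1
        if name == target:
            break
    return checks

def find_by_index(index, target):
    """An organized cabinet: one lookup in a pre-built index, no scanning."""
    _position = index[target]  # a single hash-table lookup
    return 1

# A hypothetical cabinet with 10,000 files.
files = [f"file-{i}" for i in range(10_000)]
index = {name: pos for pos, name in enumerate(files)}

print(find_by_scanning(files, "file-9999"))  # 10000 checks in the worst case
print(find_by_index(index, "file-9999"))     # always 1 lookup
```

The scanning version gets slower as the cabinet grows; the indexed version does not, which is exactly the scalability question this course is about.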
2. Big O Introduction: Let's talk about our first major subject in analyzing algorithms, and that is going to be this idea of n notation. What we're actually going to be evolving this into is something called Big O notation, which is going to look something like this, where you have an O and then the n notation goes inside of it. But before we can do that, we need to figure out what exactly n notation is. So n notation is this weird sort of relationship where the output is a function of n. Now, stay with me. This is confusing at first, so we're going to have to unpack it a little bit. It's this sort of way where the output is a function of the input, and they just happen to be named the same thing. So let's go ahead and create a sort of visual
model of this. Whenever we create an algorithm or a program, what we are doing is creating this sort of black box. On the left side we have input, things coming into the box. Maybe it's your Facebook friends, maybe it's your Google Drive, all the data you're sending to it. It's the Internet: when you're uploading something, that's the input. It gets sent to some sort of logic, some sort of program. So this is, you know, the logic or the controller right here. And then it has some sort of output. Something happens because it has gone through this logic right here. Something has moved, something's been saved. That's really what the logic is doing. So, for example, let's say
we're uploading pictures. The input is the pictures. We send it to the controller. It takes the pictures, maybe it down scales them to
make them easier to store. Then it takes those pictures and it saves them into a database. So it puts them and
it writes them into physical memory
onto a hard drive. And so that's the output. So the input, the pictures, the output is every
single one of those save operations
onto a hard drive. And so what we're trying to figure out is: given the input n, what is going to be our output in relationship to it? n right here is just a variable, because we don't know how many pieces of data are coming in. If, for example, we had a program that did Facebook friends, maybe I have 182
Facebook friends. Well, maybe you have 1,373, and your distant cousin has
10,157 Facebook friends, you know, and so on and so on. And let's say that
there's no limit. I think there is a limit. But let's say that there's no
limit on Facebook friends. So that means you
could have zero all the way up to infinity. So we don't know what number
is going to go in here. So instead of trying to just make up a number, we just say n, which is going to be the placeholder value. So we're going to put an input of n here. It can be any of these numbers from zero to infinity. So any number. And then it's going to go
through our logic. And then at the end, we're going to have a certain number of operations. So between this point right here, we're going to have logic and then operations, and then it's going to output. And we want to know what is happening during this point. How does the input relate to the output? So this right here is going to be the output, or our function. This is going to be the n notation. And what we typically write is we actually write it as an n as well, but we write it as a function of n, meaning that n is not the only thing we put over here. There's actually a bunch of different things that we could put over here. So what we could have is just n, and that just means that
whenever we put in, let's say, 1,000 pieces of data, it just takes 1,000 operations. An example of something like this is if you find a bunch of files on your hard drive and you want to move them to another location. Well, there's nothing else that really needs to happen. We take one, we move it over. We take one, we move it over. That's n. For every piece of data that you include, there's just one more operation. Now, there's different
things, of course. So instead of n, there could be n squared. There's something called log of n. There's n cubed. There's the square root of n. And so on and so on. All we're doing is taking each of these and making it a function of the input. The input will always stay the same: it's always going to be n. n is our input. It's the amount of
data that's coming in. On the right is just the relationship, and we use n because it shows us what we're making a function of. I mean, we could use x, and then x squared, and then x log of x, and stuff like that, and so on and so on. But the problem is that you might get confused as to what exactly x is, and then we would have to put something up here that says, well, x equals n, or x equals our input. And then we're just thinking, why don't we just stick with n for this whole thing? So that's really the biggest confusing part of n notation: n looks like it can be both the input and the output. But what's important is that n is the input, and the output is just a function of n. It's a representation of how many operations it takes to get from the input over to the output. Now let's look at a little
bit of a concrete example. Let's go back to that Facebook
friends one because that's something that we can all sort of really quickly visualize. Let's say that I have an
algorithm right here. I have one algorithm that takes an amount of Facebook friends and inputs them into a database at the rate, or relationship, of n. And let's say I have another algorithm right here which does the exact same thing, except it takes in the same n but outputs n squared. It's a little bit less efficient. So now we can sort
of look at these two and just compare
them a little bit. Which one of these
algorithms is quicker? So if we went ahead and we
just did some quick math, let's say that we're going to test this with 100 Facebook friends. On the first one, 100 goes into our little black box, and it comes out with 100 operations to do whatever it does. On the bottom one here, 100 goes into our little black box. Remember, n is always going to stay the same, but our function is n squared, which means we take 100 and square it. That's 100 times 100, which means our total is going to be a one with four zeros, or 10,000. So you can see that the first one only took 100 operations, while the second one took 10,000 operations. We're going to get into
this a little bit later. However, the growth patterns of these are what is important. With n, we have something called linear growth, and with n squared, we have quadratic growth, which means the number of operations goes up more and more over time. But like I said, we'll get into that later. What I just kind
of wanted to show you is how we sort
of look at this. So we have an input value
and we have an output value. We have another input value
and another output value, and there's two
different algorithms, algorithm A, and algorithm B. And we can quickly say, Well, A is definitely faster than B. A is you could say
it's greater than B, or you could say that it uses less time than B or however
you want to compare them. But this one is
fast and this one is slow because of how
many operations they take. So then, what's the point of n? What are we trying to get at here? n notation allows us to see how an algorithm scales with input size. If we design an algorithm,
we want to know, is it efficient with
more people and input? We want our programs to succeed. So we want to make
solutions which scale. And this is something
that you'll hear a lot. We want to make
scalable solutions. And a scalable solution is
a solution which doesn't only work for the
exact amount of people that's currently
using it right now. So, for example, like I said, we're going to keep going
on this Facebook example. Imagine if Facebook
used this one. Imagine if it used the n squared algorithm because it didn't think it was going to scale. Well, the thing is,
we're programmers. We're trying to make our product the best product that it can be. And that means we want
the product to succeed. And when a product succeeds,
more people use it, more pieces of data are input, and the algorithm will have
to use these new pieces. If we don't make a program
which is scalable, we'll run into the whole
exponential problem, where yes, this will work for 100, and maybe it'll work for 1,000, and hey, maybe it'll work for 10,000. But realize that's a very tiny part of the population. Let's say that maybe 1 million people want to use our program. Well, now we have a solution which doesn't work for 1 million people. If we did 1 million times 1 million, we would have a number that is extremely large, a number which isn't even comprehensible by our brains. We'll do some of this math a little bit later on. A number of operations this large ends up taking days or even years to execute. So a computer, the computer
that can, you know, log online and do all
this quick math will take days or years to
accomplish a simple task, and that's because the
program isn't scalable. So that's the basis of n notation. It's a standardized form of notation that allows us to compare
two algorithms in a completely
sterile environment, meaning that we don't
have to implement them. We can just look at them. It allows us to compare them and figure out which one
is more scalable. Which one will be
quicker whenever we add more people or more data
or more operations to it. And that's the point of n notation. In the next lecture, we're going to keep diving into this and unpacking the box that is n notation, so that we can get a solid base to be able to compare different algorithms.
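The two hypothetical algorithms compared above can be sketched in a few lines of Python (the "database write" here is just a counter, for illustration): algorithm A does one operation per friend, while algorithm B touches every friend for every friend.

```python
def algorithm_a_ops(friends):
    """Linear: one database write per friend, so n operations."""
    ops = 0
    for _friend in friends:
        ops += 1  # one write per friend
    return ops

def algorithm_b_ops(friends):
    """Quadratic: for each friend we touch every friend, so n * n operations."""
    ops = 0
    for _friend in friends:
        for _other in friends:
            ops += 1
    return ops

friends = list(range(100))
print(algorithm_a_ops(friends))  # 100
print(algorithm_b_ops(friends))  # 10000
```

Feeding in 100 friends gives exactly the counts from the lecture: 100 operations versus 10,000 operations.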
3. N notation: Let's take a little bit of a deeper dive into n notation and sort of look at some of the goals behind why we as computer scientists use n notation and what we're trying to get out of it. Because n notation can be used a whole bunch
of different ways. There's a bunch of different
sort of combinations or pieces of information
that we can get out of it by using
this notation. But what I want to
go over is the core, what most computer
scientists use it for. And what are some of the
goals? Why do we use it? So when we use n notation, what we're doing is we're
looking for large patterns, and we're looking
for these patterns in sort of theoretical space, meaning that we're not
actually going to have a program implemented
or anything like that. We're just looking for
those large patterns. We're looking for, for example, if one graph goes like this and the other
one goes like this. We don't care if at some point we have that this
graph on the bottom, maybe it goes like
this, and then this top one goes like
that or something. We don't care at this point that yes, there's this point right here where n squared is actually just a bit more efficient. That's nitty-gritty for when you actually have an algorithm in front of you and you're trying to optimize it. That's what maybe a server optimizer might do, or database management, stuff like that. We just care about the theory overall. We're looking for which equation scales worse over time. So as we go towards infinity, as we go towards the unknown number of pieces of data that could come into our algorithm, what's going to happen
to our equation? Are we going to get that
equation that can't be finished in this century because of how many calculations
it's going to take, or are we going to get something like this, where it doesn't matter how many pieces of data we give it, it's just going to keep chugging along? And so that's what we want to see. We want to see that major efficiency change between different algorithms. That's the point of n notation and why we use it. So because of this, there are a couple of rules that we typically use with n notation. And they're just sort
of shorthands that we use to not confuse ourselves. And again, to get to
that big picture. So we don't spend all of our
time in the nitty gritty. We can actually just look at
two equations and be like, yeah, this one scales
a little bit better. The first one is we don't
care about multiples. And what that means
is, for example, if we have n squared and then three n squared, these are, for all intents and purposes, equal. They're basically equal to each other. We don't care. I mean, three, you know, maybe you could justify to yourself, yeah, three is fine, but this could be, you know, 3 million, 3 billion, 30 billion, 30 trillion. It can be any number that we
want, and it doesn't matter. We take that multiple
and we erase it off. We don't care about
the multiples. And this just comes
down to, again, the big overarching thing, it doesn't matter how
big this number at some point between
zero and infinity. At some point, n
squared is going to make this number
not even effective. Imagine if at some point
was equal to, I don't know. 300 Google, and Google
is 100 zeros on the end. So you have 100 zeros on
the very end of this. Do you think that the 3 million that's being multiplied
really matters at that point? I mean, it's going to be very, very tiny difference
in the end result. We're talking like You know, it's going to be a difference of a number that looks like this or between a number
that has this onto it. You know, a little
extra onto it. It does not matter at all. So
that's why in theoretical, whenever we're going
from zero to infinity, we don't care about multiples. So for comparing two algorithms, it could be 3 million squared, and we're comparing
it to like no the cubed, doesn't matter. We're still just comparing
n squared to n cubed. So you always remove the
multiples in the beginning. Now, how do we get
these multiples? Well, let's say that we have
an equation that does this. We have the black box, so we have the inputs
coming over here. We have an n squared operation. Next we bring all of those inputs into another black box, another n squared operation. Okay. And let's say at the very end here, we have another one, but this one is only an n log n operation. So with an algorithm like this, we actually create a little bit of an equation. We do something like this: we go n squared plus n squared plus n log n. And then we can combine those first two terms into a multiple, so this is 2 n squared plus n log n. And so this is how we come up with a multiple, and what we've been saying here is that we don't really care about the multiple. I can do this as many times as we want. We're just looking at that big changing factor. So this then becomes n squared plus n log n. So then, this comes
to our second rule: we take the largest term in the equation. So whenever we have an equation like this, again, the n log n term is not going to scale nearly as fast as the n squared term. When we get into those giant numbers with 50 zeros, or 1 billion zeros, at the end, I mean, we're going to infinity, so at some point there can be 1 billion zeros. There could be 1 billion zeros on the end of this, not the number billion, I'm talking about 1 billion actual zeros. It would be a number so long that, you know, you couldn't fit it around the entire Earth. There could be a number like that. I mean, we're going to infinity here. So at some point, the big term is just going to become so wildly larger than the small one that we don't care. The small term is just a little bit of a road bump on the way to whatever the big number is becoming. It's just going to affect it by maybe 0.000001% at some point. At some point, it's affected by almost nothing. It's going to be such a small number. It's almost zero.
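We can check this claim numerically. A minimal sketch, using the 2n² + n log n equation from above: the share of the total work contributed by the n log n term shrinks toward zero as n grows.

```python
import math

def low_order_share(n):
    """Fraction of 2n^2 + n*log2(n) contributed by the n*log2(n) term."""
    low = n * math.log2(n)
    return low / (2 * n * n + low)

for n in (10, 1_000, 1_000_000):
    # The printed fraction keeps shrinking as n grows.
    print(n, low_order_share(n))
```

At n = 10 the small term is still around 14% of the total, but by n = 1,000,000 it contributes less than a thousandth of a percent.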
And because of that, we remove it off the end as well, and we just leave an equation like this to say, hey, this whole algorithm over here, this entire algorithm right here, is actually just n squared. And so those are a couple
of the scaling rules. Now, if you want
to go technical, if we're doing server
optimization or any of that other stuff where
we actually have an equation in front of us, yeah, this is probably something we want to look at, because if we're looking at an actual server and we've got an equation that can run at n squared plus n log n instead of 2 n squared plus n log n, we want the first one because it's going to be faster in actual practice. But again, this is not practice. This is theory. We're trying to figure out just the differences of large-scale applications. And so in that sense, we apply both of these rules and break it down to just the largest runtime term. And we actually have little modifiers of big O, little o, and Theta. We're going to talk
about those in a little while as well that will sort of make this a little
bit easier to understand. So let's take a look
at some charts here. Let's take a look at, you
know, kind of what I've been detailing here with the
whole thing of the numbers, they'll outscale their partners, and so we only really care
about the big numbers. So, right here,
we have sort of a representation of a graph here. We have all of the things
we've been talking about. This is one, which
is constant time. It means that we could put in 1,000, we could
put in 1 billion, we could put in any
number of we want, and it'll only ever take some constant number
of operations. So it'll only take maybe three. Maybe it's 62
different operations. No matter what number we put in, it takes 62 operations. So that's what constant time is. And then we've got that log n time, and you can see it has that sort of graph that goes like this. Then we have just the square root of n, which is similar to log n, but it goes up faster over time. We have linear, or n; linear is just a straight line at an angle. n log n sort of looks a little bit like this, but then it kind of goes into a linear line, maybe with slight tendencies upwards. Then we've got ourselves n squared, which is quadratic. Then we've got 2 to the n, which is exponential and almost just goes straight up in the air. And then we've got n factorial, which might as well look like this. It goes almost straight up in the air and it creates such large numbers that if any of your programs ever runs in n factorial, it's wrong. It has to be done
in a better way. Let's take a look, though, at examples of what all of these would be if we added numbers to them, what the run times might be. So in this situation, little n is the number of inputs, and capital N right here is going to be the number of operations. Over here are the little n's. This is what we're inputting: the 100,000 or 10,000, and so on down to one. And the output is how long it would take to run. And you'll notice on the first row, what's actually kind of interesting is that with just one piece of data, no matter which growth rate you have, it's always going to run in basically the same time. Of course, there might
be little differences. For example, maybe constant time is 62 operations; then at one input, all of these others would actually run a little bit quicker. But again, that's in the nitty gritty and we don't
really care about that. We want to look at these
trends as they go over time. So if we go to factorial, the E plus at the end here means that's how many zeros are on the end of it. So this is 9.3, or nine three, with about 157 zeros if you move the decimal. Anyway, there's a lot of zeros on the end of that one. And this one is, you know, a four with about 456,000 zeros on the end. And you can notice how big this change is over time. And then we have 2 to the n here. This is 2 to the 10th, or 2 to the 100th, and so on and so on. But anyway, you can notice that even these two are basically straight up in the air. There's such a big difference: it's the difference between 456,000 zeros and 30,000 zeros. And then we move on
more and more and more. And what we have is we
have down to n squared, which we've been talking
about, we've been showing how inefficient it is. And this actually looks very tiny compared to
two to the n and n factorial when it gets bigger and bigger. Then we have n log n. We're actually getting to readable numbers here, just 1,660,000. Then just n, which, of course, is going to be equivalent to how many we put in, because n will always just go to n. That's what it's representing right there. And then we start
getting into these ones down here where you notice that these are
actually kind of cool. They're efficient
algorithms here. We have we put in 100,000 and
we only do 316 operations. There's a smaller
and smaller change as we go on over time. And this, you can kind of
notice how we could get some sort of algorithms to get
actually pretty efficient. I mean, look at this. So we have a difference of 90 right here, and in that difference, we go up by about three operations. Now, we have the difference of 900 right here. And
how much do we go up? Well, just by about
three operations. Then we have the difference
of 9,000 new pieces of data. How much do we go up? Well,
about three operations. And then we go up by
90,000 is the next step. And how much do we have?
Well, just three operations. It's the exact
opposite. The more we add to it, the
less it goes up. It increases less
and less over time. And there are actually algorithms that
we're going to talk about throughout this
course that run like this. They use something called divide and conquer, where we actually split up the information like this, and that way we can just access it really, really fast instead of having to go through every single piece to get to what we want. But anyway, that's just sort of another rundown of n notation, how it scales, and then some of the terms that we use with
n notation and scaling. In the next lecture we're going to go over a real-world example, get a little bit out of the theory side, and look at the differences between these if we actually apply, you know, 0.001 seconds to every operation. We can see just how massive these numbers get and the differences between them.
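The chart discussed above can be reproduced with a short Python sketch that tabulates the operation count N for each growth rate at a few input sizes n (log base 2, matching the lecture; the factorial and 2-to-the-n rows are left out because their values overflow any readable table):

```python
import math

# Growth rates from the chart, slowest-growing first (log is base 2).
growth_rates = {
    "constant": lambda n: 1,
    "log n":    lambda n: math.log2(n),
    "sqrt n":   lambda n: math.sqrt(n),
    "n":        lambda n: n,
    "n log n":  lambda n: n * math.log2(n),
    "n^2":      lambda n: n * n,
}

inputs = (1, 10, 100, 1_000, 100_000)
for name, ops in growth_rates.items():
    row = ", ".join(f"{round(ops(n)):>10}" for n in inputs)
    print(f"{name:>8}: {row}")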
4. N notation example: Let's just put some numbers to what we've been talking about, to put this in a little bit of perspective that everyone can understand. So let's go over this problem right here and take a look at the differences between what n log n and n squared might look like. In the last lecture, we looked at this sort of graph right here. And so what we're looking at is this right here, n log n and n squared. And in this sort of little graph right here, I mean, they look really
close to each other, right? They look like there wouldn't be that much of a
difference between them. And this is what I want
to show is that there are major differences even
between these up here, and the efficiency change
is very important. So then, let's take a
look at this example. We'll just go over it, and it should hopefully show you the differences. So we say
that every cycle of a program takes
approximately 0.001 seconds, which is a millisecond. And so what we're asking is: how much faster will n log n be than n squared if 1,000 pieces of data are input, and then,
further down here, if 25,000 pieces
of data are input. And so with this, what
we're going to do is we're just going
to plug in this. We're going to calculate how
many operations it takes, multiply by a millisecond each, and then we're going to
get how long a program that would run like
this would take. And so, our first example,
we have that n log n. We multiply 1,000 by log of 1,000, and we're using log base two, remember. So these are all log base two. And what we get is about 9,960 operations times a millisecond each, and that all comes out to 9.96 seconds. Then if we go with the n squared, we're going to do 1,000 squared. That's 1,000 times 1,000, which gets us the number 1 million. And with 1 million, multiplied by a millisecond, we get 1,000 seconds, or about 16 minutes. So the difference between
using these two algorithms is one will take 9.96 seconds. So, you know, 10 seconds. It's not that long at all,
and the other one will take 16 minutes to accomplish. Imagine if you're online
and you submit a form, and one of them uses the n log n while another
form uses n squared, a lot of people could wait 10 seconds for a form to submit. But if we had to
wait 16 minutes, that's not going to happen. No one's going to
do that. So let's go for a little bit
higher of an example. Let's say we have
25,000 pieces of data. So we do the 25,000 pieces: it's 25,000 times log base two of 25,000. We get about 365,000 operations times a millisecond each. This then comes out to 6 minutes and 5 seconds. Note here, and this is important to note, that even with 25,000 pieces, 24,000 more pieces of data, this still takes 10 minutes less than the n squared did on
the previous one. Anyway, now we take a look
down here at the n squared. So we do 25,000 times 25,000. That gets us 625 million, multiplied by a millisecond each, and we get something extraordinary here: seven days and 5 hours to accomplish this. So with just 25,000 pieces of data, we've now moved from 6 minutes all the way up to seven days. And, for example, if we did 250,000 pieces of data, this right here would be something in the years, and scale it up further and it would be something that would not be accomplished in our lifetime, or our grandchildren's lifetime; the sun might burn out before the program actually
finishes execution. And so I hope that this
little example here, just of this little math and adding in the milliseconds here, you can see the big difference, even though they look
kind of the same here and maybe even a
little bit down here. When we scale up to
more and more numbers, we start getting gigantic
inconsistencies. And those giant inconsistencies are what computer science theory, what the computer science curriculum, is all about. We're just trying to find what sort of data structures or techniques we should use in each given situation so that we can create the most efficient possible algorithms, so that we end up accomplishing things quickly instead of spending extra time and resources, or not even being able to accomplish things because it would take too long. And so this is just a little bit of an example of that. Okay.
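The arithmetic in this lecture can be checked with a few lines of Python; assuming, as above, that one operation costs 0.001 seconds:

```python
import math

SECONDS_PER_OP = 0.001  # one millisecond per cycle, as assumed in the lecture

def runtime_seconds(ops):
    """Total wall-clock time for a given number of operations."""
    return ops * SECONDS_PER_OP

for n in (1_000, 25_000):
    t_nlogn = runtime_seconds(n * math.log2(n))  # n log n algorithm
    t_nsq = runtime_seconds(n * n)               # n squared algorithm
    print(f"n={n}: n log n takes {t_nlogn:.1f} s, n^2 takes {t_nsq:,.0f} s "
          f"({t_nsq / 86_400:.2f} days)")
```

Running this reproduces the lecture's numbers: roughly 10 seconds versus 1,000 seconds at n = 1,000, and about 6 minutes versus a bit over 7 days at n = 25,000.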
5. Big O: So now that we have a good understanding of n, how exactly it works, and how it ties into our analysis of algorithms, we can start adding on to that. We know that n notation is a way of classifying how fast an algorithm runs: given a certain number n, how long is it going to take with respect to that number? Is it going to take just n time, so that if n was 1,000, it's going to equal 1,000 operations? Or is it going to take something like n squared time, where when n is 1,000, it's going to equal 1 million? And this is a really important
aren't that easy. They don't just run at an
exact time all the time. A lot of times
they have a bound. So, for example, maybe
when it's loading, it's going to run an n squared, but it executes an in, and then it exports
in in log in. And so we have to be
able to look at this, and we have to look at
the program and give a classification on the
program as a whole. So, for example, this program, we would always look
at the worst case is always going to worst
case, take in squared. However, at certain steps, it's going to run faster. And because of that, we actually have this sort of
system rate here, which is going to classify the
bounds on our in notation. So let's just kind of go down this and take
a look at this. These up here are
called Omicron. You have Theta in the
middle and then Omega, and this is lower
case Omega down here. And so you can see that
they're Greek alphabets, but we just a lot
of times in math, use Greek because the
alphabet gets confusing. So we use Greek to sort of
symbolically represent things. And in this instance,
every one of these Greek symbols symbolically
represents a bound. So what you have
here is let's draw, for example, a bound
right in the middle here. And up here is faster. So I'll say up here is
going to be faster, and below here is
going to be slower. So, our first notation is little o, and you can see that we just don't like to say omicron, since "big omicron notation" just takes a lot, so we say little o and big O instead. And little o right here just means faster, which means the program will always be faster than this bound. So, for example, if our bound was n squared right here, if we had little o of n squared, we would see that it means it's never going to touch n squared. It's actually always going to be faster than n squared. So it's always going to be right above this line, but faster than n squared. And so what we can do is
we can actually sort of use this to figure
out the bound. So the next one we
have is we have big O. And big O means that it is going to be
faster or equal to. That means it's going to touch this line or be faster. So it's going to be, at worst, and that's something that's very key here, this means at worst, it's going to be n squared. The next one we have is Theta, which means it's not going to go to either side. It's not going to go low, and it's not going to go high. It's going to be right on this line. So no matter what, it's going to be n squared; it's going to be right along this line somewhere. And then the next one down we have is slower or equal to. This one is Omega, and it means slower or equal to. So it touches this line, but otherwise it's always slower. So, you know, down here we might have n to the third, n to the fourth. It's always slower than, or equal to, n squared. So at best. This
one means at best. At best, it's going to run at this bound. So, for example, if we write omega of n squared like so, then we understand that at the very best this program will run at n squared, which means it can run worse than that. And then beneath that
we have little omega, strictly slower than, which means it'll always run worse than n squared. It'll never touch n squared; it'll always run worse than it. And so why is this important? Why do we want to focus on
this one out of all of these? That's because big O dictates the worst case scenario. And we're concerned with the worst case scenario because, given probability and statistics, our program might hit the worst case scenario at some point, and we want to know
without a doubt, what is that worst
case scenario? So that is why we use big O: at the worst, what is it going to be? And you can see, I'm going to bring up the chart
again right here. When we have this idea, when we have the ability to say, what is our program going to be, we can actually start
sort of filling this in. So if we have, for example, big O of n, we understand that at worst it's going to be n. So it could be n, but it could also be anything faster than n. So we can prepare all of our time diagrams and sketches
and anything like that, going off the assumption
that it's never going to go on this side of the line. It's always going
to go on this side. And that's really powerful
because now we can start comparing programs' worst cases, and we can start seeing how programs scale. For example, if we had an n log n program versus an n program, we could see that when the n log n one scales, it's going to be worse than the other. So let's break this
down a little bit, and we can see why some of these are less useful. Say we have a program that runs at little o of n squared, which means it's faster than n squared. That means it could be anywhere below n squared; basically, it just has to be faster than it. So it's not going to be n squared, but it could be anything faster than n squared. And this one is kind
of just like big O, but the thing is, with little o of n, we can't actually touch the bound. So it just makes things a little bit confusing. It gives us a bound, but it doesn't make it easy for us to see. We can't immediately look at it and go, oh, this is the run time. We have to look at it and be like, oh, it's going to run faster than this, but we don't really know where. And it doesn't really give
us a lot of wiggle room. And then we have
our big O notation. And that's what we
just went over, is we can look at this
and we'll be like, oh, at worst, it'll run at log n. Theta would be great if we could always use it. Theta means either exactly equal to, or sometimes it's used as the average, but most of the time it means equal to. So this would be great, but the problem is programs don't always run exactly along a line. Sometimes a program runs at n at one point, but at another point it runs at n log n, and it goes anywhere in between. So theta is just too specific
for us to rely on. And then these last two are pretty much meaningless because they don't give us a lot of information. Think about it. This one right here is omega, so omega of n squared. That means it's going to be slower than or equal to n squared, which is like saying our program is going to be either running at n squared or
it's going to be slower. How are we supposed to
plan with that in mind? Because slower could
go into infinity. It could be as slow as possible, and we would never be
able to prepare for that because we'd never
have a bound on it. We'd be like, Okay,
so it's going to just run how fast it could
take n squared, it could take in factorial, it could take the infinity. It doesn't help us analyze the program because there is
this infinite side to it. And exactly the same idea with the little Omega here as well, is it goes off into
infinity on the bad side, which does not help us. Let me erase some of this here and make a little bit more room. So big O helps us because of the bad side. Let's say that over here is off to infinity. Big O gives us a bound: the program is going to be, at worst, right here, but it could be better, and we don't care. If it comes back and it runs at constant time, that's fine. That's perfectly fine; that means our program is running great. Sure. But we can plan for that. It just means our
program is going to execute faster
than we expected. So what we can do then is just plan for the
worst case scenario. However, if you go on the
other side of this line, you cannot plan for infinity. That is why big O is so important: it's the most useful of all these notations. It gives us a worst case scenario. And so let's take
an example of how we might apply this to a
particular problem. So, usually, what
you always do is you use the worst case scenario, meaning, at what
point in the program, is it going to be the
most inefficient? So here, for example, let's say that we had
big O of n squared. We had big O of log n, big O of n, and then big O of constant time. And so these lower ones down here are great. But the thing is, no matter what, we're always going to be bottlenecked by this guy right here. And that's because our slowest part right up here is n squared, which means this program is going to run at n squared plus log n plus n to the first plus, let me keep it sort of constant here, we'll go plus one. And you might be thinking,
why am I doing this? And that's because
you can actually add each part of
the steps together. So because you do this part, then, you do this part, then you do this part,
then you do this part. So you can actually
add them all together. And I remember
what we were doing beforehand, when we were talking about how we look at a program as it goes to infinity. So if we look at this sum, which one of these is going to outpace the others as we go to infinity? Well, for example, let's set n at 1 million. How significant is 1 million going to be, versus maybe somewhere around 3 million, versus 1 trillion, a one followed by 12 zeros? That last term is going to be
substantially larger. The other terms are going to be a smaller and smaller portion of the total as it goes to infinity, up to the point where their significance goes to nearly zero. You'll start having, you know, a googol, a one followed by 100 zeros, plus maybe 8 million over here or something like that. And that means this part over here is completely insignificant at this point. So what we do is eliminate the lower-order terms, and the runtime of this program is n squared. Let's just cement that with
another example here. Let's say we had big O of n squared, big O of n squared, big O of n squared again, and then let's say we also have big O of n to the third. So now what we have is n squared plus n squared plus n squared plus n to the third. And we can actually combine these down to three n squared plus n to the third. Now, remember what we talked about in the last lecture on the notation: these constants don't matter. The constant right here does not matter, because as we go to infinity it becomes less and less significant until, once we get to a sufficiently high number, its contribution is basically negligible. So we can cross that one out. And so now we have n squared plus n to the third. And then, of course, n to the third is going to take off way higher than the other one. So our program ends up just being n to the third. And now since these are
all big O notations, we know that in the worst case scenario our program will run at n to the third, which means we now have this bound of n to the third, and infinite run time goes off this way. And now we can plan that at the worst case we're going to have n to the third, and we can plan everything all the way back to constant time, because now all we have is just this bound right here. We understand that, you know, if we put in 1 million pieces of data, it might not work. Maybe we can look to bring this bound down, but it gives us somewhere to start. It gives us something to compare against others. So that is big O notation. It's very important, and over time this will become second nature, just being able to look at it. You won't really ever see
these other ones too often. You'll see theta sometimes: if it says a certain part of a program is exactly equal to some run time, then you'll see that one. But other than that, you'll probably never see the two omegas, and you'll probably never see little omicron either. So for these notations right here, just sort of memorize them, understand what they mean, and you should be good to go. You'll start to understand a lot more of computer science documents.
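To cement the idea from this section, here is a small Python sketch (my own illustration, not part of the course materials) that adds up a hypothetical program's terms, n squared plus log n plus n plus 1, and shows what share of the total each term contributes as n grows:

```python
import math

def term_shares(n):
    """Share of the total n^2 + log2(n) + n + 1 contributed by each term.

    Illustrates why big O keeps only the dominant term: as n grows,
    the n^2 term accounts for nearly all of the total run time.
    """
    terms = {"n^2": n ** 2, "log n": math.log2(n), "n": n, "1": 1}
    total = sum(terms.values())
    return {name: value / total for name, value in terms.items()}

# For a small n the lower-order terms still matter a little...
print(term_shares(10))
# ...but by n = 1,000,000 the n^2 term is essentially the whole sum.
print(term_shares(1_000_000)["n^2"])
```

As n climbs toward infinity, the n squared share approaches 1, which is exactly why we cross out the lower-order terms and keep only the dominant one.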
6. Real World: Now, we've learned
a lot of theory. Let's go ahead and
take a step back and go over some real
world code analysis. Now, I'm not going to be using complex code, or any code you need to already know. It's all pseudocode, and I'm going to explain it every step of the way, as if you've never touched code before, because that's how I want to teach this entire course: we're looking at the theory behind everything, not the code. However, applying it to something real world can help you figure out some of the concepts, and that's why we're doing it. So let's just look at
our first piece of code. I'm going to go ahead and
walk through this here. I wrote it in pseudocode that reads right off the tongue, something you can look at and understand what's going on. So, it says, for each
data in data list. So we'll create a data
list here in a second; print data to screen. Okay, easy enough. Let's say we input into this a list that maybe is three, two, ten, and then we'll go Buffalo. So this is three integers and a string. That's how it would be
technically classified, but we're not even
going to look at that. These are four pieces of data. And so what we're saying
is that for each data, so for every single
data in our list, we're going to print
that data to the screen. So in our situation
here, what is n? Well, in this situation right here, n equals the count: one, two, three, four, because we have four pieces of data. So in this situation, n equals four. So now let's see what the
run time of this would be. So we have for each
piece of data, and then we printed
to the screen. So, okay, let's go
down the list here. Let's go with our
first piece of data. So our first piece
of data is three. We grab three from our
document right here. So this is index zero in an array, or if you want to think about it, this is one, two, three, four, but computers usually go with zero, one, two, three. It's just the way they work. So we are going to grab our first piece of
data right here, and then we're going to
print it to the screen. So we grab our three, we
printed to the screen. So that's one run time. So our run time
right now is one. And then now we're
going to grab the two because we're
going down the list. So for each piece of data,
we've touched this one. So now we move onto this one. So now we grab two and we
printed to the screen. So now our screen
looks like this. And so that's an additional
run time right there. That's one runtime.
And then now we print our next piece
of data, which is ten. So now we have three, two, ten. That's additional run
time right there. And then now we print buffalo, which would be three, two, ten buffalo. Like so. And that's an
additional run time. Because all we're doing is we're grabbing the piece of data
and we're printing it. There's no special thought
process going in here. We're just grabbing, printing, grabbing, printing,
grabbing, printing. And now, if we add
all of this up, we'll see that it
comes out to one plus one plus one plus one,
which equals four. And so our runtime in
this situation is four, and you'll see that our runtime equals the amount of data right here. And if we can just think about this theoretically
for a second, if we had 1 million
pieces of data in here, there would be at no point
in the program where we'd ever have to touch more than
1 million pieces of data. So no matter what we do here, runtime is going to
be whatever our n is, which means what we have here is a runtime of N. In this
particular situation, we could actually use Theta
as n because it actually equals to n. There's no sort
of wiggle room in here, but we will go ahead
and just say that it is at worst going to be in because that's the
notation we like to use. So this is an example of an in, which this is
called a four loop. So four loops are typically an example of an
in in a situation. Let's go to a slightly
more complex problem here. So let me break this
one down as well. What we have here says: for each n in the data list. So that means instead of calling it data now, each piece of data in the list is going to be called n. Then, check if the data is in the list: for each w in the data list, if n equals w, print true. So let's go ahead and sort of
little bit, as well. Let's go with our same example from before, which was three, two, ten Buffalo. Like so. And in this situation, our
in is still equal to four. And so now what we're going
to do is we're going to look at the first problem here. So let's take a look right. Now, let's say, we're going to grab each one of
these pieces of data, so we're going to grab the
three, and that's now in. And then now for each piece
of data in this data list, and you can see these names
are exactly the same. So that means this
is the data list. We're checking for both of
these. So we have our three. We've got our three grabbed. And now we're trying to check If this three over here is equivalent
to anything up here. And the only way we can
do that is brute force. So we're going to
take this three. We're going to check
it with the first one. We're going to check it
with the second one. We're going to check
it with the third one. We're going to check it
with the fourth one. So that's going to be one,
two, three, four operations. So let's break this, like, really far down here.
We have our three. We've grabbed our three, and now we're checking
it, does it equal. Let me make that a little
bit smaller because it's going to be a little
bit bigger of a graph. So we're saying does three equal The first
piece of our data, which is going to
be the three again because we didn't
tell it to start at a number that isn't or any
fancy things like that. We're just checking this
list a second time. So we've got our
three, and we're checking does it
equal the first one. So does it equal
three? Yes, it does. So this will print
out a true message. And this took one operation. Now, does our three
equal two? It does not, so we don't print anything. That's one operation. Does our three equal ten? It does not. So that's one operation. Does our three equal
Buffalo? It does not. So that's one operation. And now we're done with the three; we've taken care of it. So let's create a
second list down here. This one is going to be the one we're checking with. They're exactly the
same, but I just want to kind of make this a little
bit more visual for you all. So we're done with the three. So let's move on
to the twos now. So does two equal three? It does not. That's
one operation. Does two equal two? It does. That's one operation. Does two equal ten? It does not. That's one operation.
Does two equal Buffalo? It does not. That's
one operation. And then let's go down
the list one more time. Does ten equal three? It does not. That's one operation. Does ten equal two? It does not. That's one operation. Does ten equal ten? It does. That's one operation. Does ten equal Buffalo? It does not. That's
one operation. Now, Buffalo. I'm just going to abbreviate it as B here, so I don't have to keep writing out Buffalo. Does Buffalo equal
three? It does not. So that's one operation.
Does Buffalo equal two? It does not. So
that's one operation. Does Buffalo equal
ten? It does not. So that's one operation. Does Buffalo equal Buffalo? It does. So that's
one operation. And now, if we add all
of these up right here, what we're going to see is one, two, three, four, five, six, seven, eight, nine, ten, 11, 12, 13, 14, 15, 16. So in this situation, our n was four, but our runtime was 16. And what does that come out to be? Well, that comes out to be n squared: if we take four and square it, it equals 16. And we can theoretically
think about this as well. If we expanded the list out to five elements, we wouldn't just add five more operations; we'd multiply, because each element now has to be checked one additional time as well. So for every single element, we'll have five checks, an entire other five, which squares it. So in this situation, our runtime is n squared. And the reason for this is even though we have
two for loops, there's what is known as nesting. We've nested one for loop inside of another for loop. So this part out here, let's draw this in red, this outer part is n, while this part inside is n as well. So what we do is take the n on the outside, multiply it by the n on the inside, and we get n squared. And so that will be our final
runtime for this situation. So, like I said, this course isn't heavily focused on writing code or algorithms, but rather on the theory. Still, I thought this might help show you what we've been talking about this whole time, how code can actually be analyzed so we can understand these different pieces. And you can see that, no matter what, we've just learned something very important: single for loops are always going to be n, but nested for loops multiply according to however many levels of nesting there are. So if this one right here nested a third loop, this would be an n cubed formula, and you start getting up into really large numbers of checks for all of this stuff. So I hope that this practical example helped you understand this concept a little bit better.
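The two walkthroughs above can be sketched as runnable Python (my own translation of the pseudocode, with a counter standing in for the print statements) so you can count the operations directly:

```python
def count_linear(data):
    """A single for loop: touch each element once -> big O of n."""
    ops = 0
    for item in data:
        ops += 1  # stands in for "print data to screen"
    return ops

def count_nested(data):
    """Nested for loops: compare every element against every element -> big O of n squared."""
    ops = 0
    for n in data:
        for w in data:
            ops += 1  # stands in for "if n equals w, print true"
    return ops

sample = [3, 2, 10, "Buffalo"]
print(count_linear(sample))   # 4 operations: n
print(count_nested(sample))   # 16 operations: n squared
```

Adding a third nested loop would multiply by n again, giving the n cubed growth mentioned above.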
7. Data Structure Example: So now let's discuss a bit
of a more concrete example. And we're going to be discussing
the difference between an array and a list and how that's going to tie into
what we've been learning? And what we've been learning
about is scalability, storing information
and how that reacts with run time as we get to
more and more information. So with these two examples here, we're going to be
looking at an array, and we're going to be
looking at a list, and we're going to be looking
at their access and add to the back times of these arrays. So with this, what we are
going to be looking at is, how does that scale over time with these two different
ways of storing data? How can we use that
analysis to figure out what solution we would
like to use in the future? Let's take a quick look. We're going to do a very light overview of what these data structures are. You don't need to know a
lot of computer science to follow along
with this lesson, but it's going to be
very important to add a concrete element to
what we've been learning. So the first data structure that we're going to be going
over here is the array. With an array, what we're essentially doing is grabbing a contiguous piece of our memory. Now, the processor is basically going to tell us where we're grabbing this from. It could be the RAM, it could be the cache, it could be the hard drive; it doesn't matter. In this case, it's memory, and that's all we need to know about. With memory, we're basically separating it all into tiny little blocks of data. When we ask for an array, the system goes and finds a little piece of memory for us that's contiguous and can fit what we're requesting. In this case, we're requesting four pieces of data, so it's an array of size four. Now, when we create it, it's contiguous, but on the other side, we don't own the blocks around it, meaning that if we need to increase the size of this array, we have to create a brand new array and copy things over to it. And that's going to
be the disadvantage. So let's go through
our two operations here: our access and our add. Whenever we're accessing data in an array, what's great is that it's going to be constant time, because when we ask for something, the array is contiguous; therefore we can ask for very specific pieces of data within it. So if we need to grab something from, say, the third place in the array, which is indexed by two (that just means that everything in computer science is zero-indexed, so the first position is zero, not one), we can use this little formula right here, which works in basically every single programming language. It just means grab that element right there. It's going to return it to us in constant time, just basically spit out what's right there. We could have tons and tons of information; this could be 2,000 elements long, 3,000 long, 20,000 or 30,000, and so on and so forth. And if we just put an index into here, we'll get the element back in constant time every time. And so that's what
allows our array to have an access of worst case
scenario, constant time. It's always going
to be constant. Now, here's the problem. Remember when I said that when we add to the end of this, if we've reached our array limit, if we've filled up the array, we have to request a brand new array? That means the memory module is basically going to kill our old array and find a larger block that we can put information into. Well, to make sure we don't lose our old data, we have to copy every single piece over, one at a time. So if we have 300,000, 3 million, 3 billion entries, they all have to be copied over, and each piece of data in here needs to be touched. And that's very, very important, because it means that as the size of our array, the number of inputs we have, increases, the number of operations we have to do also increases. And it increases in a linear fashion, meaning we have a worst case scenario of big O of n when adding to an array. So now we've covered the array pretty well. Let's take a look down here at
our other data structure: the list, specifically a linked list. The list is interesting because, instead of giving us one contiguous piece of data, memory essentially gives us one element at a time. So we can request one, and it'll be here. We request another; the first one gets a pointer, and then we go to that piece. We want another one? Well, it's going to go over here, and so on and so forth. And you can see that there is no really set place where these end up. I'm sure that, you know, the advanced memory modules and everything like that have an efficient way, but the elements are not going to be contiguous. They're going to be all over the place, in the most efficient
way that the system sees fit. That means if we're trying to access our information and we want to grab this piece of information right here, well, look at this mess right here. We have to follow the pointers. We have to start here and go here, then here, then here, then here, and, oh, there's our piece of data. And what that means is that when we access our data, if we're trying to find something down the list, something farther away, and, you know, it could be something like a number, a letter, a sentence, a picture, anything; let's just put a P here. To get to P, we have to run through the entire linked list to find it. So what's our access time? Well, that doesn't sound constant. That sounds like as the number of elements grows, our access time also grows. It looks like access is going to be, worst case scenario,
linear time, big O of n. Now, the list has an advantage over the array with adding, as long as we keep what's known as a back pointer, which is essentially just a piece of data that we keep updated so that it's a direct arrow to the back of the list. So if we add a new piece, we essentially just take that arrow and move it to the new piece. If we keep that up to date, which is just a single constant operation, then adding data to this list is constant time. It's always going to be the exact same operation. We have the back pointer here, so if we need to add to the back, we follow the back pointer to the very end of the linked list, and then we just add a new element and update the pointer to the new element. It's a constant operation. Even though there are a couple of steps in there, it's always going to be the same. If there's 1 billion pieces or one piece, the exact same things need to happen: we follow the back pointer, we add a new node, and we put our data into it. So that means our add time is actually going to come out as constant. So now we can see that with
these two data structures, we have two different
possibilities of a solution here. If we need fast access time, so think of the
list in your phone, your contact list. We need fast access time with that. You don't want to look up someone in the Zs of your contact list and have to wait for it to search through every other piece of data. That does not sound very efficient at all. That would be silly. And we don't add
to it very often. You know, we add a
contact here and there, but it's not something we add
continuously over and over. So the array might
be the best solution for that particular problem. Now, with a list, maybe
we have an operation that needs to store lots and lots and lots of elements over time, but it hardly ever
needs to access them. So we need constant, you know, add time, but we don't really
need a lot of access time. Well, then we would
be looking at a list for that solution. So the things that we're
learning in this can be used to figure out better
solutions to our problems, and in this particular case, our storage problems in computer programming. But this can apply to other areas of both the computer world and real life as well. The question is whether the solution you have is scalable, and with this analysis, we can see which ones are and which ones aren't.
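As a rough sketch of the two structures compared above (my own minimal Python version; the `Node` and `LinkedList` names and methods are illustrative, and note that Python's built-in list actually behaves like the dynamic array), here is constant-time add via a back pointer, next to linear-time access:

```python
class Node:
    """One element of the linked list, plus a pointer to the next one."""
    def __init__(self, value):
        self.value = value
        self.next = None

class LinkedList:
    """Minimal singly linked list with a back (tail) pointer."""
    def __init__(self):
        self.head = None
        self.back = None

    def add(self, value):
        # Constant time: follow the back pointer, attach, update the pointer.
        node = Node(value)
        if self.back is None:
            self.head = self.back = node
        else:
            self.back.next = node
            self.back = node

    def get(self, index):
        # Linear time: walk the pointers from the head until we reach index.
        current = self.head
        for _ in range(index):
            current = current.next
        return current.value

contacts = LinkedList()
for value in [3, 2, 10, "Buffalo"]:
    contacts.add(value)      # same constant work no matter how long the list is
print(contacts.get(3))       # prints Buffalo, reached by walking three pointers
```

An array flips the trade-off: indexing is constant time, but growing it past its capacity means copying every element into a new, larger block.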
8. Project Video: Now we have gone through notation, big O, and all of the scaling that happens in between. The project here
is going to be to take everything
you've learned and analyze something in your life and tell us what
is its procedure, and then what is its
timing signature. So, this can be computer
science related. You could do an algorithm, a searching algorithm or
think about, you know, how a Facebook friend request might work or
something like that, but it can also just be
observational around you. It can be how I said at the
beginning of the course, like a filing cabinet
or how someone might become a member of your organization, or how you might search and find something in either
a large set of data or a grouping of people
or anything of that nature. I want you to be creative
and just analyze the world around you with the knowledge that you've
learned, and share it. It's a pretty fun exercise, and it's pretty neat to see that what we're doing is basically applied mathematics, whether we're programming or working with the theory of math itself. By applying it, we're learning how the world works and operates, and then we're engaging with it. And so that is what this
project is all about: to exercise critical thinking, analyze something, and reinforce what we've learned. I'm really excited to see what everyone comes up with, and I will be in the project section responding to them. So see you there, and thanks, everyone, for joining in this course.