Transcripts
1. Ready? Set. Go.: [MUSIC] Coding interviews
are really intimidating. There are difficult questions, tricky puzzles, and even
unsolvable problems. There are plans of attack
for each scenario, but we'll focus on
the most obvious one: when the question
is straightforward, where there are no
curveballs, no tricks. I want to make sure
you ace the interview. Hi. I'm Alvin, a research scientist
in industry. This past spring, I
received offers from Meta, Apple, Amazon, Tesla and more. Within the past few years, I've also received engineering
offers from Google, NVIDIA, Facebook, and Lyft. The experience I've gained from these interviews was invaluable. I really sat down to
understand what are all the pieces that made
my interviews successful? What are the fundamentals
and the tricks? This course is one step
in two important directions, a step forward in
interview preparation and a step forward in
coding mastery. To accomplish both, we'll
learn about data structures. Data structures are a way to organize data and
in this course, you'll see examples,
how they work, and how to use them. By the end of the course,
you'll have two skills, one for each objective. You'll be able to pick
data structures to use in coding interviews.
This is a big deal. Instead of inventing
efficient algorithms, you'll simply recall
ready, efficient algorithms. You'll also be able to implement
these data structures, combining and mixing
implementations as needed. This
is a big one as well. Knowing implementations
effectively gives you starter code
in the interview, a free leg up. I hope you're excited. I know
I am. Let's get started.
2. Why You Need Data Structures: Let me explain why you need data structures,
in particular, why this course on
data structures is so critical to your
growth as a coder, and how you can leverage
this course effectively. We'll hit three major
talking points: First, we'll discuss the benefits of mastering this topic, second, we'll discuss specific
learning objectives in the course to quicken
your mastery, and third, we'll
introduce an exercise to help you gain mastery
after the course is over. Here's our first talking point, why you should care, why
I'm teaching this course, the benefits you'll reap. The biggest benefit to taking this course is being more
prepared for interviews. There are two types of
technical interviews: First, there are coding questions, these are specific
problems like, how do you reverse a string? How do you remove all
duplicate numbers? Second, there are systems
design questions. These are broader
questions like, how would you build
a clone of Airbnb? Basically, how to design infrastructure and software
for a whole product. Both interviews can
be challenging, but data structures help
you tremendously with both of these technical
interviews, and here's how. Here are three specific benefits to knowing data structures. Benefit Number 1 is for your first technical
interview, coding interviews. A significant portion,
probably over a third of coding questions are
related to data structures. If you know your
data structures, you can often apply instead of design algorithms
in your interviews. This is huge. There's no need to come up
with ideas from scratch. If you know data structure A
meets exactly these needs, you simply say that, "I know HashMaps are
ideal for this use case; here's why and how
I would use it." That's a killer
interview response. Benefit Number 2 is for your
systems design interview, being able to optimize systems for your
specific application, both in your interviews
and in practice. I'm not talking
about a 10 percent or 25 percent faster system, I'm talking about a system
that operates orders of magnitude more efficiently by picking the right database, server, or serverless
infrastructure, and more. You would say, "We need
directed acyclic graphs. Graph databases are optimized specifically for this kind of
relation, and here's why." Now that is a well-informed and convincing design decision. Benefit Number 3 is beyond
the interview process. With an understanding
of data structures, you'll be able to tackle more advanced topics
beyond your interviews. In computer science generally, these topics give you new
abilities like making a game. One of the most popular
game creation tools, Unity, uses what's called a data-oriented
programming paradigm. As the name suggests, understanding your data
structures is pretty critical. To summarize, here are three benefits to
understanding data structures. Being able to apply instead of design from scratch
in coding interviews, being able to optimize for the application in systems
design interviews, and being able to tackle more advanced topics to open new doors beyond
just interviews. In this course, we'll
focus on benefit Number 1. In every other lesson, you'll practice applying
data structures to coding interview questions. To realize these benefits, you'll need to focus on two learning objectives
in this course. Objective Number 1 is to
know how each data structure works and practice implementing
each data structure. This is key because
down the line, you may have to
implement parts of these data structures or combine data structures for
different purposes. Just like in my other classes, I highly recommend you
code alongside me, place this video on one side of your screen and your favorite code editor on the other side. Copy my code exactly, and no need to rush, pause the video
whenever you need to. What's more important is that
you build muscle memory. Objective Number 2 is to know how and when to use
each data structure. Specifically, remember the pros and cons of each data structure. In the next lesson, we'll share specific rubric items you'll judge each
data structure by. Pros and cons
aren't wishy-washy, they're very precise
and well-defined. To summarize, your two main learning objectives
are to know how data structures
work by practicing implementations and
coding alongside me, second, know how and when to use data structures by remembering the pros and cons we go over. Our last talking
point here is how to solidify your data structure understanding
beyond this course. Sure, it's to practice, but it's not actually
what you expect. We have to turn
the tables around, pretend you're the interviewer, say you're writing the data
structure questions now. Your task is to
practice designing data structure questions and do this daily for three days. The goal is to get you thinking
about data structures a few more times, each
after a good night's rest. Do this during your downtime,
during your commute, while you're
brushing your teeth, or while you're waiting in line. Add this to your do this
when I'm bored list. The more you do this, the more questions
you come up with, the better you'll
master data structures. To recap, we've discussed the benefits of knowing
data structures: better interview readiness, and a stepping stone to
advanced topics; we've discussed your
learning objectives: know how to implement
data structures and how to use data structures; we've finally discussed a way to reinforce your knowledge: for three days after the course, design a data structure
question each day. I also encourage you to
post your question in the discussion section, my bad. I really, really mean this. It is incredibly rewarding for me to see the
questions you post, either a question you made or a question about the course, so please please upload.
I look forward to it. That's it for the overview. For a copy of these slides
and more resources, make sure to check out
the course website. We've finished
overview, and next, we'll talk about how to
compare data structures.
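As a taste of Benefit Number 1 before we move on, here's how the "remove all duplicate numbers" question mentioned earlier might be answered by applying a data structure rather than inventing an algorithm. This is a sketch of my own, not code from the course; the function name is mine:

```python
# Remove all duplicate numbers from a list, keeping first occurrences.
# A set gives O(1) membership checks, so the whole pass is O(n) --
# no algorithm invented from scratch, just a data structure applied.
def remove_duplicates(numbers):
    seen = set()      # values we've already kept
    result = []
    for number in numbers:
        if number not in seen:
            seen.add(number)
            result.append(number)
    return result

print(remove_duplicates([3, 1, 3, 2, 1]))  # [3, 1, 2]
```

In an interview, saying "a set gives me constant-time membership checks, so this is linear time" is exactly the kind of killer response described above.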
3. What is a "Good" Data Structure?: What is a good data structure? This is key for us to understand because one of our
learning objectives is to know how and when to
use each data structure. But even before we
ask that question, we need to first ask what
even is a data structure? We can start with the
simple definition. A data structure is simply
a collection of data. That sounds straightforward, so let's expand this
definition by adding some expectations for
a data structure. In particular, we
should be able to modify and access
the contained data. A data structure
is a collection of data that can be
modified and accessed. To modify the data structure, we should be able to insert
data and remove data. To access the data, we should be able to
search and fetch data. These are the four actions that every data structure
must support. Each of these actions
can be very slow or very fast depending
on the data structure. This brings us to
our next section, which is how do we quantify
very fast or very slow? How efficient or inefficient
is each of these actions? In other words, how do
we assess efficiency? To assess efficiency
of an algorithm, we will count the number of operations an algorithm
takes to execute. For example, let's say
we have this list of fruits and say we're
searching for avocado. To search the list, we need to traverse
the list one by one until we find the item
we're looking for. In this case, we took three "operations" to
search for avocado. However, if avocado is at the end or if it's not
in the list at all, your search algorithm
would have to traverse all the
items in the list. For a list with n items, searching the list for a value
takes up to n operations. Now, the definition of
operation is vague. Maybe a memory access
is an operation, maybe checking string equality
is another operation. Let's say the number
of operations needed for a single item is c. Then searching
takes cn operations. However, this constant c is uninteresting to us when
comparing algorithms. We simply say the
algorithm takes O of n operations where the O
notation hides any constants. We call this O of n statistic, the runtime or
computational complexity. Here's another way to see
what O of n complexity means. What we know is that as
the list gets longer, the number of operations, however you count
it, grows linearly. This means that our algorithm's computational complexity can be bounded above and below
by linear functions. Any algorithm whose
complexity falls within these bounds is O
of n. As a result, searching over your list has O of n computational complexity. We now know that searching a list, which is also called an array, takes O of n time. To be specific, we
search for a value when we don't know where a
certain value is in the list. However, let's say we want to fetch exactly the ith item. We know we want the ith item and don't need to
search for a value. That fetch takes constant
time or O of one, because we simply
access the ith item of the list without looking
at other values. More specifically, as
the list length grows, the effort needed to fetch
an item doesn't change. This is great. Fetching
an item is cheap. To maximize efficiency, this means that we should
use an array when we have to fetch more data more often
than we search for data. This is one example of
a practical takeaway, and we'll see many of
these in the course. Each of these decisions aims
to maximize efficiency. We should amend our definition
of a data structure. A data structure is
a collection of data that can be efficiently
modified and accessed. Based on each application, we'll need to pick
data structures to ensure this efficiency. We'll analyze the complexity of a few data structures
in this course. As we perform these analyses, we'll come across several
common orders of growth. Orders of growth are the O of something that
we saw earlier. We saw two examples of
orders of growth already. Constant-time growth O of one
and linear growth O of n. These are six common orders of growth shown
here on this slide. But for the data structures
and algorithms we're covering we'll only see
four orders of growth, constant O of one, logarithmic O of log n, linear O of n, and quadratic O of n squared. We'll see what O of log n looks like later on but for now, keep in mind that you have
a new tool to analyze algorithmic complexity
and that these four are the most common
orders of growth you'll find. For a copy of these slides
and more resources, make sure to check out
the course website. Now, you have a new tool under your belt, orders
of growth analysis,
computational complexity for any algorithm. More importantly, it
allows you to pick data structures for
different applications based on how slow or how fast a data structure
is for insertion, deletion, fetching,
and searching. In the next section,
we'll practice analyzing orders of growth.
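Before that practice, the search-versus-fetch distinction from this lesson can be sketched in code. This is my own illustration; the exact fruit list on the slide may differ:

```python
# Searching a list scans items one by one: O(n) in the worst case.
fruits = ["apple", "banana", "avocado", "cherry"]

def search(items, target):
    operations = 0
    for item in items:           # up to n comparisons
        operations += 1
        if item == target:
            return operations
    return operations

print(search(fruits, "avocado"))  # 3 "operations" to find it

# Fetching the ith item jumps straight to it: O(1) time,
# no matter how long the list grows.
print(fruits[2])                  # 'avocado'
```

Notice the search cost grows with the list, while the fetch cost does not; that is exactly why an array suits applications that fetch more often than they search.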
4. Orders of Growth Practice: Welcome to some data
structures practice. In this lesson, we're going
to assess data structures. In other words, we'll actually
assess algorithms and data structures using a system
called orders of growth. This is what we talked about
in the previous lesson. The Big O Notation will
tell us how quickly or how slowly an algorithm takes to
run based on its input size. We'll put aside interview
tactics for now, at least interview
specific tactics. We'll certainly have
interview related tips, but we need to focus on
the fundamentals first. Start off by navigating
to this URL,
alvinwan.com/datastructures101/oog. Once you see this page, fork the project to get
your own editable copy. To do this, click on the project name in the
top-left to get a drop-down. Then click on the ellipsis in the top-right to get
another drop-down. Finally, click on
the "Fork" button. Then I suggest placing
your Skillshare and Repl.it windows
side-by-side as shown here. You should then
see a window just like mine on the
right hand side. Before we actually dive
into the problems, let's review the orders of growth that we talked
about previously. There are four relevant
orders of growth. As a reminder, what I'm writing
here is a function of n, where n is the
size of the input. These orders of growth will tell us how much slower
the function will run if we increase the
input size n. As a result, we want runtimes to grow more slowly as the input
grows in size. Moving up this list
of runtimes is good. The slower the order of growth, the more efficient and preferred the algorithm
or data structure is. One tip I have for you is to know these runtimes by heart, these are the four core runtimes that you need to know for
every single interview. The first one O(1)
or constant time means that no matter
what the input size is, our function always takes
the same amount of time. Second, O(log n), we'll
actually talk about later. This is a little bit
runtime to explain. Third one O(n), or linear runtime means that
if our input size doubles, our runtime also doubles. If our input size triples then
our runtime also triples. In other words, our function
runtime actually grows linearly with respect to n. O(n) squared here
means quadratic. If we double the input size, then our function runtime quadruples or algorithm
runtime quadruples. If our input size triples then our runtime will actually
grow by nine times, and so on and so forth. Now let's dive right
into the exercises. On the right hand side here, I've got our first exercise. What is the runtime
complexity for the following? Here we have just a for
loop, for i in range(n). Remember that range(n)
gives you an iterable of n
different items. What is the runtime
complexity of this function? Pause the video here and try it out and here's the answer. This function again
iterates once over every single item in this
iterable of n items. In other words, it iterates
once for every n items, it will iterate n times. The runtime for this function
or this algorithm is O(n). [LAUGHTER] I forgot to show you. Here is the list
of four different runtime options that you have. I'll keep these here as we
go through these exercises. Now let's move on to
the second exercise. What is the runtime complexity
of this function, again, as a function of n. Here
we have two for loops. Each one of them
iterates up to n. Another way to see
this would be to ask how many times is the
print function called? Pause the video here and
try and here's the answer. Here we iterate over i n times. Then for every single i, we also iterate over j n times. That makes for a total
of n times n runs. This algorithm is O(n squared). Let's move on to
the next exercise. What is the runtime
complexity of this function as
a function of n? Here we start off with n.
Then in this while loop, for every single step
we'll halve that value. What is the runtime
complexity of this function? You can use a process
of elimination to figure out what the runtime is. Pause the video here and give it a shot and here's the answer. The answer for this function, which we can deduce
by process of elimination is O(log n). Here's how that process of
elimination might work. Here we see that the function
is definitely not constant. It definitely takes a little
longer anytime n increases. It's definitely not O(1). However, it's not
as slow as O(n), because O(n) would iterate over all
of the items one by one, up to n. However,
we don't do that. We halve the value
every single time. We're just skipping
around a little bit. It's not as slow as O(n), and it's not as fast as O(1). The only runtime in
between is O(log n). We'll do this more
precisely later on to explain how O(log n) comes about. But for now, let's talk
about a visual explanation. What does logarithmic
growth look like and what does it mean to
have logarithmic growth? Logarithmic growth is
really, really slow. To give you an idea of how slow, consider the log function. Remember, n is the input size, the value of log
n is the runtime. Say our input has size two, then the runtime is
one unit. Let's say our input
doubles to size four, the runtime increases by one. If our input doubles
again to size eight, the runtime increases by one. Finally, if our input
doubles again to size 16, the runtime still only
increases by one. Notice our input
size needs to keep doubling to increase
the runtime by one. That means our algorithm
is pretty efficient. Visually here's what
that looks like. We plot input size
along the x-axis, we plot runtime
along the y-axis, then plot all the
points from earlier. Notice our function
grows quite slowly. In fact, here's a bigger
plot with more values of x. You can see that our graph
becomes flat very quickly. In other words, logarithmic
functions grow very slowly. Keep this in mind. Let's go back to our function. We can now see that i here this, or i or n, let's say n needs to double in order to increase the number of while loop iterates by one. If n is two, the while loop iterates once. If n is four, this iterates two times. If n is eight, this iterates three times, and so on and so forth. This value n needs to
double every single time to increase the
runtime by just one. This is logarithmic growth. The other way to
see this is that our logarithmic function halves the input size in each step. As you can see here, we halve i every single step, so we're halving our workload every single step. This is the tip.
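The halving loop we just walked through might be reconstructed roughly like this. This is my own sketch of the on-screen code, with a step counter added so you can watch the logarithmic growth:

```python
# Count how many times we can halve n before reaching 1.
# Each step halves the remaining work, so the count grows like log2(n).
def halving_steps(n):
    steps = 0
    i = n
    while i > 1:
        i = i // 2    # halve the value every single step
        steps += 1
    return steps

for n in (2, 4, 8, 16):
    # the step count increases by just one each time n doubles
    print(n, halving_steps(n))
```

Running this prints 1, 2, 3, 4 steps for inputs 2, 4, 8, 16: the input must keep doubling to add a single iteration.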
This is the tip. Logarithmic functions
halves input size in every single step. Here's a summary of
the three examples we just went over and
their runtimes. That's it for the
"basic examples." I don't mean basic, as in these are easy. These examples are
pretty confusing, especially when you see
it for the first time. I say basic to mean that
all other examples will be some modification of
this basic structure. To figure out runtimes
in the future, there are two steps. First, start by memorizing what these "basic
examples" look like. Second, once you
know these examples, you can then memorize the different possible
modifications. That's our tip. Now let's move on to
those modifications, those tricky examples. These are again modifications
of the above examples to either look confusing
or to change the runtime. Here are the four
runtimes before. Let's now move on to the
examples, the tricky examples. How many times is print called
in the following function? Here we have a nested for loop. Then we have this if condition: if i is equal to j. Notice this if statement. How does it affect
complexity if at all? Pause the video here and
give it a shot again, how many times is print called? Here's the answer. Print
is only called n times. This is because our inner
for loop is only run once. If i is equal to zero, then this if condition only passes if j is
also equal to zero. If i is equal to one, again, this if condition only passes
if j is also equal to one. That means that for
every single outer loop, the inner loop only runs once, so that's n times
one iteration, for a total of O(n). Let's now move on to our
next tricky example. How many times is print
called as a function of n? Here we have a nested for loop, but notice that the inner for
loop only counts up to i. Again, how does that
affect runtime if at all? Pause the video here
and give it a shot. Here is now the answer. As it turns out, even though we're only iterating up to i, it seems like we might have
a faster runtime complexity. It turns out we don't. This is still O(n squared). Here's the reason why. I'll have to explain by
showing you a picture. Here, let's say that we have a normal for loop on
the left-hand side. Then for every
single value of i, we go through j n times. We'll do this for
i is equal to one, we iterate through
all values of j, i is equal to two all values of j and so on and so
forth. Keep doing this. What this means is there are a total of n squared different
times that we print. Let's say we print i j. Now let's say that we
have this new for loop where we only iterate
up to i. Then what that means is
here in this first row, when i is equal to one, we only run j once. When i is equal to two
in the second row, we only run j twice. When i is equal to three, we only run j three times, and so on and so forth. Here, instead of n squared total times
that we call print, we have 1/2 times n squared
times that we call print. However, 1/2 of n squared is still on the
order of n squared, which is why this
function is still O of n squared in
algorithmic complexity. That concludes this
tricky example. If you have any questions or
if you're still confused, feel free to drop a question
in the discussion section. These are the four different
topics that we covered: orders of growth,
some fundamental examples, logarithmic growth, and some tricky examples,
along with how they modified those basic examples
into more tricky examples. For a copy of these slides and the finished code
for this lesson, make sure to check out
the course website. That concludes this lesson for
assessing data structures.
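For reference, the two tricky examples from this lesson might be reconstructed in Python like this. This is my own sketch; the on-screen code may differ in names, and I count calls instead of printing so the totals are easy to check:

```python
# Tricky example 1: nested loop, but the if condition
# only passes once per outer pass, so the body runs n times: O(n).
def tricky_one(n):
    calls = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                calls += 1
    return calls

# Tricky example 2: the inner loop only counts up to i.
# That's 0 + 1 + ... + (n-1) = n*(n-1)/2 runs, still O(n squared).
def tricky_two(n):
    calls = 0
    for i in range(n):
        for j in range(i):
            calls += 1
    return calls

print(tricky_one(10))   # 10
print(tricky_two(10))   # 45
```

Note that 45 is about half of 10 squared, and the constant one-half disappears inside the big O, which is exactly the argument made in the lesson.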
5. Sequences: Stack vs. Queue: One common format for
data is a sequence, and in particular, there are two common types of sequences. There are stacks, like a stack of clothes or a stack of rocks. There are also queues like a line of people or
a tunnel of cars. Once we've analyzed
both stacks and queues, we'll then discuss
their pros and cons. First up is a stack. Just like with a stack of rocks, you add items to the
top of the stack. Then to remove a
rock or an item, you also remove it from
the top of the stack. We summarize this as last in, first out; the last item you add is the first you remove. The insertion
operation according to this logic is simple then. We take the value
we want to add, we push that value onto
the stack and that's it. It takes constant or O
of one time to insert a new value to the stack since no matter how
long the stack is, the effort to simply insert
one item is the same. Note that by definition, we can only insert values
at the top of the stack. The deletion operation
is also simple. We take the last
value shown here in gray and pop the value. This takes constant or
O of one time as well. Since the length of the
list doesn't change the number of operations deletion needs. Taken together, we
now know that stacks feature constant time
insertion and deletion, so modification is cheap. But what about access? How expensive is searching
the stack for a value? Unfortunately, pretty slow. Let's say we're searching
for the value 8. We pop an item, check its value. Is 7 equal to 8? It is not, so we
pop another value, then check its value. Is 2 equal to 8? It is not so we pop once
more, then check its value. Is 8 equal to 8? It absolutely is, so our search is complete. However, if there are
n items in the stack, we need to perform
up to n checks. This means search takes
linear or O of n time. Fetching a specific
item is similar. Say we want to fetch the
second item with the value 8. We pop an item, that's one item, we pop another item, that's two items, then we pop another item,
that's three items. We've finally fetched
the value we want. Like before, if there are
n items on the stack, we may need up to n pops. This makes fetch O of n time. To summarize, we can insert to and delete from a stack
in constant time. Modification is
efficient. However, search and fetch
take linear time. Access is inefficient. But what if instead
of a stack of rocks, we had a line of people? The first person
to join the line should be the first to leave it. This is called a queue. This time we insert a value at the end of the queue and delete a value from the
front of the queue. We call this a first in, first out sequence or a queue. As we mentioned before,
insertion is simple. We consider a value to add, then enqueue that
item into the queue. No matter the
length of the list, the simplicity of
insertion is not affected. This is an O of one operation. Deletion is similar. We consider a value to
remove pictured in gray, then dequeue that
value from the queue. This is also an O
of one operation. To summarize, insertion
and deletion for a queue are thus
both O of one time. Once again, search is
quite slow, however. Say we're searching the queue
for the value 8, again, we dequeue, then check
is 7 equal to 8? It is not so we dequeue
again, then check. Is 2 equal to 8? It is not. We dequeue and check. Is 8 equal to 8? It is. Our search is done. However, if there are
n items in the queue, we could need up to n checks. This makes search and
queues an O of n algorithm. Fetch is again similar. Say we're looking to fetch the
3rd item with the value 8. We dequeue one item, that's 1, we dequeue
another item, that's 2, and we dequeue another item to finally retrieve the 3rd item's
value, which is 8. For a queue with n items, we need up to n dequeues, so fetching is O of
n. To summarize, a queue is efficient to modify, but inefficient to access
just like with stacks. Let's now analyze the pros and
cons of stacks and queues. On the far left, we have the four operations we
care about: insertion, deletion, searching,
and fetching. In the 2nd column, we have
the runtimes for a stack, and in the 3rd column, we have the runtimes
for a queue. Unfortunately, the
runtimes themselves aren't super interesting
so we have to know which to use based
on the application and how each data
structure was defined. For example, a
stack is ideal for tracking changes like
undoing an operation, then redoing that operation. This is a stack, as the last change added
is the first you undo. A queue is ideal for
sequential processing like processing cars in a line at a toll bridge or download
requests on a server. Keep this slide in mind as this summarizes the entire lesson. For a copy of these slides
and more resources, make sure to check out
the course website. Let's now see these data
structures in action. In the next lesson, we'll practice both using and implementing these
data structures, tackling both of our learning
objectives for this course.
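Before that practice, here's a minimal sketch of the four constant-time modification operations in Python. This is my own example, not code from the course, using the standard library's built-in sequences:

```python
from collections import deque

# A Python list works as a stack: push and pop at the end are O(1).
stack = []
stack.append(7)          # push
stack.append(2)
top = stack.pop()        # pop: last in, first out -> 2

# collections.deque works as a queue: enqueue at one end, dequeue at
# the other, both O(1). (Popping from the front of a plain list would
# be O(n), since every remaining element shifts over.)
queue = deque()
queue.append(7)          # enqueue
queue.append(2)
front = queue.popleft()  # dequeue: first in, first out -> 7

print(top, front)        # 2 7
```

The stack gives back the last value added, while the queue gives back the first, matching the last-in-first-out and first-in-first-out behavior from this lesson.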
6. Sequences Practice: Welcome to another
practice lesson. In this lesson, we'll
cover sequences, in other words,
stacks and queues. Just like before, we'll put aside nitty-gritty
interview tactics, we need to master these
fundamentals first. Note that these questions
are pretty difficult. This is not a cool application; these are fundamentals
that you'd need for interviews, and it's technically
challenging material. So just like before, navigate to this URL,
alvinwan.com/datastructures101/sequences. Once you see this page, fork the project to get
your own editable copy. To do this, click
on the project name in the top-left to
get a drop-down, click on the ellipsis to
get another drop-down, and finally click
on the Fork button. Then I suggest placing
your Skillshare and Replit edit windows side-by-side
as shown here. We'll talk about palms in
three different categories. We'll first talk about
implementation problems. These are problems
that implement core functionality
for a data structure. Second is usage, we'll implement
some functionality and pick the right
data structure to use, that's the key part. We're going to learn how to pick the data structures based
on their pros and cons. For the third one, we're going to combine
data structures into possibly a new
data structure or a new application that
requires us to juggle multiple data structures for
each of their strengths. Regardless of what
category of problems, here's a pro tip that you
should always keep in mind. In fact, we're going to
follow this pro tip for this entire lesson and all the practice
lessons that follow. We're going to conduct all of our puzzle-solving
questions in this way. You should conduct all of your puzzle-solving interviews
in this way as well. First, we'll design the
algorithm without coding. Verbally or visually design the algorithm with
your interviewer. This is so that your
interviewer can guide you and your thinking
towards the right answer. Second, report the runtime and memory complexity.
This is important. First to demonstrate
understanding, but also to see if
your interviewer wants a more efficient solution. Third, and finally,
then you code. We'll structure your
practice in this way. Design the algorithm, then I'll go over the solution, report the complexity, and then I'll go
over the solution, and finally code the algorithm, and then again we'll
go over the solution. The idea is that
even if you mess up any part of this
three-step process, you'll still be able to
practice the other steps. Here's our very first question. We're going to implement
queues but using stacks. Before you start though, let me talk about one tip. Remember that a stack
is last-in, first-out. What that means is it's like a stack of rocks or
a stack of boxes. The last item you put
on that stack is going to be the first one you
take off, that's stack. Now queue is just like a line, a queue of cars let's say. That means first-in, first-out. So the first car that
lined up is the first car that will leave this
queue. Keep this in mind. That's what I wrote here: a queue is first-in, first-out, a stack is last-in, first-out. Now, granted, I never
remember this first-in, first-out, last-in, first-out. What I remember
instead is pretty much that I'm working with the
stack, like a stack of rocks, stack of clothes,
stack of boxes, and then I think of a
queue as just a line. If you keep that in
mind, that'll start to tell you what direction
you're moving in. Now again, the
exercise here is to implement a queue using stacks. If you scroll down
in your Repl.it, you'll see this right here. This is what you
need to implement. So pause the video
here and first, think about the algorithm. How are you going to do this conceptually? So give it a shot. Here's the answer. So we're
going to do the following. Say we have this
list here in black. We have 9, 8, 2, 7. Nine is our first item. Our last item is seven. To emulate a queue, we need to remove
the first item, which is nine, with that
red arrow right there. Our goal, shown here in red
is to remove the value nine and to give back or keep the
remaining stack, 8, 2, 7. So knowing that that's our goal, the only thing we can
do with a stack, though, is remove the last item. We can only pop seven. Well, that's okay. We
can keep doing that. We'll pop seven, we'll pop two, we'll pop eight and
finally, we'll pop nine. There we go. We can
now return nine. Now we can dequeue from a stack. However, we've lost all the items that were
popped off earlier, all the 8, 2, and 7. So we need another stack
to hold all of them. So here what we're going
to do is pop off seven, but push it onto
the other stack. Then we're going to pop off two, eight, and then finally
we'll return nine. That's great. We're
almost there. However, we need 8, 2, 7 in our stack. Right now we have
the reverse 7, 2, 8. So what we're going
to do is pop from the second stack and push it back onto the original stack. So we're going to
push that back, and there we go. We have both goals. We're able to return
nine as though it was a queue and we're able to
retain a remaining stack 8, 2, 7 as we wanted. Now, this seems like
a clever solution. I would probably say to myself, no way I can come up
with that solution. If you're thinking
the same thing, I'm completely with you and I'm here to tell you
that's not a problem. There's actually no need to
come up with that solution. Instead, understand
it thoroughly. In your interviews
or on the job, simply recall the tactics
available to you. So really focus
your mental energy on understanding the solution. If something doesn't make sense, just ask in the
discussion section. Knowing that this
is the algorithm, let's move on to
the second step. What is the algorithmic
complexity? What is the runtime complexity and what is the
memory complexity? Pause the video here
and try to figure it out. Here's the answer. The algorithmic or runtime
complexity here is O(n). We have to basically
perform 2n operations, which is O(n) time. We need to pop off every single item into
the second stack, and then we need to pop off every single item
from the second stack back into the first. So that is O(n) for
runtime complexity, for memory complexity, it is also O(n) simply because we had to
keep this other stack, and this other stack required
up to n different items. So we have O(n) runtime complexity,
O(n) memory complexity. Now, let's try to
implement this in code. Again, pause the video here
and try it yourself first. Let's now implement
this solution. We'll start off by
writing a constructor. This constructor will
take in an iterable. So here we can see that this constructor
takes an iterable, which is a list
of just one item. From the iterable,
we'll build a stack. [NOISE] Now in order to enqueue, [NOISE] we'll simply push
to the stack. [NOISE] However, I want to dequeue
or remove from the stack, that's where the
special logic comes in. That's the entire
algorithm that we've talked about here on
the left-hand side. We're going to first
start off by emptying the original stack
into a new stack. Here we'll have the new stack. Then while we have
items in the old stack, we'll push them
onto the new stack and remove them
from the old one. Now that we've got
all these items out, we're now going to finally
pop off the last item. This will give us the value that we need to actually return. Now let's empty the new stack back into the original stack. Remember that was the last step here where we had
everything in reverse. We need to push them all back
into the original stack. Here, while the stack has items, we'll push it back onto
the original stack. Finally, we're going to
return the value. Here we go. We have implemented our
three-step algorithm from before. Let's now check whether or not this function passes
our doctests. To do that, hit the Run, the green button at the
very top of your file. Here we can see that we had failures in all the other
functions, but not this one. Remember, our function
here is queue via stack, which is missing from this list. We're good to go. We've passed our doctests for this function. Let's move on now to
the next question. In our next question, we're
going to do the reverse. We're going to implement
a stack using queues. Again here we implemented
a queue using stacks. Now in our next problem, we're going to do the reverse. Go ahead and take
a second to pause the video and consider what algorithm you
would build to do this. Here's a solution. This is very similar to
the last problem. Just like before, we're going to instantiate
a second queue. We're then going to
take every item in our first queue and push
it onto the second queue. Once we get to the last item, we'll then return that last
item just like a stack would. That's pretty much it. Now
that we have the algorithm, let's go ahead and
think about the memory and runtime complexity
of the algorithm. Pause the video again. Now here's a solution
conceptually. Here we want to implement
a stack; however, we only have a queue. Recall that a queue can
only be dequeued from the beginning whereas in
order to emulate a stack, we need to be able
to pop from the end. We have this dilemma,
but we'll resolve it in the same or similar
way as we did before. We'll create a second queue. Then we'll dequeue nine and enqueue it onto
the second queue. Then we'll dequeue eight
and add it to the second queue, then dequeue two, and add it to the second queue. Finally, we'll dequeue
seven and return it. Here we've managed to
actually "pop seven". Here's something
new: unlike before, our second queue is already
in the correct order. Remember that from
before, our queue was 9, 8, 2, 7 and now it's 9, 8, 2. This is the exact and
correct ordering. We don't need to move it back to the original queue
like we did before. Knowing that, we now know exactly how we're going
to implement this algorithm. However, before we do that, let's report the runtime and memory complexity
of the function. Pause the video here and
try to figure it out. Now here's the solution. The runtime and
memory complexity is O(n), just like before. For runtime complexity,
we have to dequeue every single item in the list and enqueue it into
the second queue, so it's O(n). Now
for memory complexity, we do have to maintain
the second queue. The second queue incurs
basically O(n) memory, holding up to n items.
There you have it. Both runtime and memory
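As a point of comparison, the queue-via-stacks class from the previous exercise can be written out as a short Python sketch. The class and method names here are my assumptions, not necessarily the starter file's; the dequeue shows the same kind of O(n) passes we've been counting.

```python
class QueueViaStacks:
    """A queue built on a Python list used as a stack (a sketch)."""

    def __init__(self, iterable=()):
        self.stack = list(iterable)

    def enqueue(self, value):
        self.stack.append(value)  # push onto the stack

    def dequeue(self):
        other = []                   # the second stack
        while len(self.stack) > 1:   # first O(n) pass: empty all but one item
            other.append(self.stack.pop())
        value = self.stack.pop()     # the first-in item, e.g. 9
        while other:                 # second O(n) pass: restore the order
            self.stack.append(other.pop())
        return value
```

For example, QueueViaStacks([9, 8, 2, 7]).dequeue() returns 9 and leaves 8, 2, 7 behind, matching the picture from earlier.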
complexity are both o of n. Now, let's finally move on
to the third step. Let's try coding this algorithm. Pause the video here
and try to code it. Now, let's go ahead and
talk about the solution. Let's implement the
stack using queues. Just like before we're going
to create a constructor. This constructor is going
to take an iterable. That iterable will be our
queue's initial value. Then let's implement
push for this stack. This will just be a
queue enqueue call. All the magic will now
happen in the pop method. In this pop method, we'll first empty n minus one items in the original
queue into the new queue. To do that, let's
create a queue. This is our new queue. Then while our old queue
has more than one item, we're going to keep
removing from it. Here we'll actually
dequeue from the old queue and add
it into the new queue. Now that we've dequeued
almost all of them, let's now dequeue the last
item from the old queue. Whoops, so this is
queue.dequeue(). This last item would
be seven, for example. Then we would return that value. Just like we said before, our new queue has all the elements that we
need in the right order. We'll simply replace our
old queue with our new one. Now we can return a value. Let's go ahead and
try running all of our tests and see what
happens. There we go. We can see that our stack via queue is no
longer in the list of failures. Tada, we have
passed these tests. Let's now move on to
the next problem. The next problem, we're going
to use a data structure. I'm not going to tell
you which one to use, and it will be up to
you to decide which one. We're going to use
either stacks or queues, in particular to print all combinations of
the letters a and b. We're going to print
out all combinations of length k. Here's an example, print_baba of 2 will
give us aa, ab, ba, bb. Now if we call print_baba of 3, then this will give us all three-letter "words" that
have the letters a and b. Again, it's all
possible combinations. Notice that there
are no repeats. Notice that every single
value here has length three. Knowing that, let's
again go through that three-step process
we've talked about earlier. In the first step, let's
design the algorithm. Pause the video here and try
to design the algorithm. Again, you'll want to use
either a stack or a queue. Let's now talk about the
solution conceptually. Here in this
particular solution, we're going to use a queue. I'm going to denote
the queue using just a basic list here first to visualize
what's going on. To start off, let's say I
have the letter a in here. What we're going to do is
first dequeue the letter a. Now we have a and
we're going to enqueue the only two possible ways
to extend the letter a. To go from a one letter sequence
of all possible a's and b's to all possible two letter
sequences of a's and b's, we're going to add aa and ab. This will account for
all possible ways to create an a and b sequence
from this starting point. Now we can make this a
little bit more extensive. We're going to start off
with the empty string. Then we're going to
dequeue that empty string. Then we'll add the only two possible ways
of extending this, which are adding an a at the end or adding
a b at the end. Then we will dequeue a, and then we will add a plus a and a plus b, so aa, ab. Then we'll
do this again. We'll dequeue the first item
right here, which will be b. Then we'll extend
with ba and bb. We'll keep doing this. Now here we will have
aa; we dequeue it, and then we'll add
aa plus a and aa plus b, so aaa, aab. We'll keep doing this. By stopping at the
right point in time, we'll ensure that
we've actually covered all possible k-length
sequences of a's and b's. Knowing that this
is the algorithm, let's now talk about the
memory and runtime complexity. Pause the video
here and see if you can come up with both
of those complexities. Let's talk about the computational
and memory complexity. The computational and
memory complexity is actually complicated in
this particular scenario, but we'll talk
through it anyways. Given what you've seen so far, you might not be able to
come up with this one, but that's completely okay. Again, your goal is
just to understand the solution that
I'm presenting now. Let's talk about how
many steps it takes, or first, let's talk about how many results there are here. All possible three-length
sequences of a's and b's. Here we're going to have two options for
every single letter, a or b: 2 times 2 times 2. That means there are
two to the power of three different options here. If we want all possible
k length sequences, we're going to output two to
the k different responses. Now, in order to produce two to the k
different responses, we actually had to generate all the two length and one
length solutions as well. In reality, to generate two to the
three different options, we had to go through
two to the two, and then two to the one as well. This is true in general. If we wanted to generate
all four length sequences, we would have generated two
to the one, two to the two, two to the three, and two to the four different combinations
and so on and so forth. In reality, what that
means is that this is actually a summation of terms two to the j, for j going
up to k. Here we have this nasty
summation of a bunch of terms. However, fortunately for us, order of magnitude analysis only cares about the largest
magnitude term. Here we can simply
summarize this as O(2^k), since we've been using k. Knowing that this
algorithm is pretty slow, this is the computational
complexity. Fortunately for us, the
memory complexity is, well actually,
unfortunately for us, the memory complexity
is also the same: O(2^k). This is
a pretty slow algorithm. It's also pretty
expensive algorithm, but we're going to leave it at
that; it's complex. But at least given what
we've said so far, this is one way to do it, and we'll do it this way for now. Go ahead and pause the video
here and try to code it up. Now here's a solution. To get started, we're
going to create a queue. Just like we talked
about before, we're going to add
in the empty string. This empty string is all possible sets of
zero length sequences. From this, we will
generate all possible sets of one length sequences,
and so on, so forth. While there is
something in the queue, we will keep generating
again and again and again. Let's go ahead and pop off or
dequeue that first string. If our word here is
exactly length k, then we're going to simply print that word, and we're good to go. However, if the word is
not yet of length k, then we will extend it. We will extend it
with the letter a and we will extend it with the letter b. There we go. This is it for our function. This will give us all
possible k-length sequences because this ensures that we'll extend with
all possibilities, a and b, and this ensures that we'll catch all k-length
sequences as they come about. If there's anything longer
than a k-length sequence, then, well, it
doesn't go anywhere. We won't continue
to work with it. In fact, you can actually
prove that no word will be longer than k, but that's not super important. The point is, this function
should give us what we want. Let's go ahead and try it now. Hit the green "Run" button
at the top and you should see that print_baba will disappear from this list
of failures. There we go. Print_baba is now gone
from the list of failures. We've passed this test as well. Let's move on to
the next question. In this next question,
let's now use stacks or queues to print stairs. In particular, print the
number of ways that we can climb k stairs if you can take one or two
steps at a time. If you have one stair, you can only climb it one way. If you have two stairs, you can climb it two ways, either you take one
step at a time, or you take both steps at once. If there are three stairs, then you can either climb 1,1,1, or you can climb two
steps and then one step, or you can climb one
step and then two steps, so it's 1,1,1; 1,2, or 2,1. There are three ways
to climb three steps. This function is supposed to compute the number
of ways it takes to climb k stairs if you can
take one or two steps. Now, before you consider the
algorithm, here's a hint. Well, you can pause the video here if you don't want the hint, but the hint is the following: in the previous problem, for every single word that was not yet of length k, we had two different quote-unquote actions. You can add a or you can add b. Here, you have
something similar. If you have not yet
climbed all k stairs, you have two different actions. You can climb one step or
you can climb two steps. Knowing that and knowing that we previously
used the queue to represent all possible
ways to build up a word, you can also use that
queue now to build up all possible ways
to climb stairs. Pause the video here and see if you can figure
out the algorithm. Now here's a solution. The solution is
actually going to mirror the previous
solution pretty closely. Let's say we have a queue now; every single item
in the queue will represent another way of reaching
a certain number of steps. Here's one way to put it.
Let's say we have zero. That means this is one way of making
it to zero stairs. Then we're going
to dequeue that 0. Then we're going to add
to the queue two items. You can either take one step
or you can take two steps. We'll take 0 plus
1 and 0 plus 2. Then you'll dequeue
the first item. Then we'll, again enqueue. We'll say, there are
two possibilities, you can either take one step, you can take 1 plus
1, which is 2, or you can take two
steps, which is three. Then we'll dequeue again. We'll dequeue this two. Then we'll say, here is two. You can take one or two
steps from that point on. You can take 2 plus 1
which is 3, or 2 plus 2, which is 4, and then so on and so forth.
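If you would rather see this expansion as code right away, here's one possible Python sketch; we'll walk through a version of it shortly, so treat the details here as my assumptions rather than the official solution. The function name comes from the exercise.

```python
from collections import deque

def print_stairs(k):
    """Count the ways to climb k stairs taking 1 or 2 steps at a time
    (a sketch of the queue expansion described above)."""
    queue = deque([0])  # each entry: stairs climbed so far on one path
    ways = 0
    while queue:
        stairs = queue.popleft()
        if stairs == k:
            ways += 1                 # this path lands exactly on step k
        elif stairs < k:
            queue.append(stairs + 1)  # take one step
            queue.append(stairs + 2)  # take two steps
    return ways
```

For instance, print_stairs(3) returns 3, matching the 1,1,1; 1,2; 2,1 count from the example.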
You can keep doing this. Basically, what will happen is if you count the number of times that k occurs
in this queue, that will tell you how
many ways there are to get to that number of stairs. That's it for the conceptual
solution for the algorithm. We're actually going to skip the memory and
run-time complexity for this particular problem just because it is fairly
difficult to compute. There is a way to compute it. We're going to skip it for now. I will include a
written version of the solution in the
final solutions file. In the meantime,
let's move on to actually coding this function. Again, pause the video here, and try to code the solution. Let's now code up the solution. We'll start off by
creating a queue right here with just the number 0 in it. This queue now holds a single item, 0. We also will initialize
the number of ways that we can climb
up to k stairs is zero. While we have something
in the queue, so while there are
ways up the stairs, we will dequeue one
way up the stairs. This will dequeue one
way up the stairs. Then if we've made
it up to k steps, we will increment the number of possible ways to get
up to k steps by one. If we have not yet
made it up to k steps, so stairs is less than k, then we will consider
ways of getting up there. We'll consider either
plus 1 stairs, or we'll consider taking two steps. Then finally, we'll return
the total number of ways that it took to
get up to k steps. Now that you've got this,
let's go ahead and hit "Run". There we go. We can now see that print_stairs is no longer
in our list of failures. We've passed all the test cases. Before we move on to
the next question, I did want to bring up this tip. The tip is solve
the core problem, then consider the edge cases. We've been skipping some of the tips here for edge cases and you can see them if you
go back to the slides, but the basic idea is, again, focus on the core problem before you address edge cases. This is not super
relevant right now, but as we work through problems, we'll see more and more
of these edge cases, and sometimes they
can be a distraction. Again, focus on solving
the core problem first. Here we're now going to use stacks and queues to
check parentheses. Actually, this is the first problem where
we'll see a bunch of random edge cases.
Here's the problem. Go and scroll down a little
bit in your file and you'll see our goal is to check if the provided string contains matching parentheses. [LAUGHTER] I've actually
already given you a hint here. You should use a stack, but let's go in, and scroll down here,
and see what that means. Matching parentheses means that for every single
start parenthesis, you have a corresponding
closing parenthesis. However, it's not
just if it exists, it has to be in the right order. For example, see, here you have a
start parenthesis. Actually, this is just
missing an end parenthesis, but here is a case where you
have invalid parentheses. Here you have end parenthesis
and then start parenthesis. These are improperly closed. What this means is before every single
closed parenthesis, you need a corresponding
starting parenthesis. That's what it means to match. This is actually a
very common problem that you'll see both as exercise and possibly even
in the interview itself. This is good to know, good to understand what the
premise of the problem is. Again, we want to check if the provided string contains
matching parentheses. Again, the hint is
to use a stack. Go ahead and pause
the video here and see if you can come up
with the algorithm. Let's now talk
about the solution. Our solution, like we mentioned earlier in
the tip or in the hint, is to actually use a stack. Let's say we've got
this stack right here. This stack is going to represent how many start
parentheses or how deep we are in parentheses. Let's say that we
iterate through the string and we saw
this start parentheses, we're going to add a
start parentheses here. Then now we see a
close parentheses. Anytime we see a
close parentheses, we're going to pop
off from the stack. We popped off from the stack, we removed one level higher, we removed that group. Now that we see another
start parentheses, we're going to add back in, we see another start parentheses we're going to add back in, and then here we see
a closed parentheses, so let's go ahead
and remove that, and then we see another
start parentheses, so let's add that back in. Finally, we're going
to close this again. Now we can see at the very end of the
function we still have one unclosed parenthesis,
which is this one. This group is unclosed. What that means is this is not a matching set
of parentheses, so we can return false. In essence, this stack actually
represents how deep we are in this nested
set of parentheses. Anytime you pop off, it means we're back up a level. We have gotten rid of one
nested set of parentheses. If we haven't gotten
all the way back up and out of the parentheses, then it's not matching. At the same time, if you
see something like this, which starts from a
close parentheses, then at the very beginning of the string the
stack is empty, so there's nothing to pop. This will immediately
tell us that again, this is an invalid
set of parentheses. That is conceptually, how
we are going to do this. Now, let's go on
and talk about what the runtime complexity is
and the memory complexity. Pause the video here and
try to figure it out. Here's the answer.
For both runtime and memory complexity,
both are linear. For runtime, as we iterate
through here, we only go over the string one time. Every single item
is only seen once. Second, to comprehend the
linear memory complexity, imagine if you had a string
that looked like this, then our stack would simply be all of these
start parentheses. We have up to N different
items in our stack, that means our memory
complexity is O of N. Now let's finally move
on to the third step, let's start coding
this algorithm. Pause the video here
and try to code it. Let's go in and code
up this algorithm now. Here we're going to create
a stack to begin with, and for every single
letter in the string, we're going to
check if the letter starts a new set of parentheses, we're going to push
onto the stack. Otherwise, we can assume that the letter is a
closing parentheses. This is something that
I just wrote earlier. Basically, in an interview, you should ask if the string could contain other characters, like numbers, or
letters, and whatever. Now, just for simplicity though, we're going to assume that there are only parentheses
in the string. Here, if it's not an
opening parentheses, it's a closing parentheses. For every single
closing parentheses, we're going to pop
off from the stack. Then finally, if the
stack has nothing left, then we'll return true. If the stack has something left, it means we haven't closed
all our parentheses, so we should return false. Now, knowing this, here is actually a
tip that was in the slides from earlier but
that I haven't mentioned yet. Here, consider an empty sequence. What happens when your
stack is actually empty? Is there any place in your code where your code would break? In this particular
case, yes, stack.pop. Here, we don't check if the stack is already
empty, so let's do that. If the stack is already empty, well, what do we do here? If the stack is already
empty and we're trying to remove an
opening parentheses, this means we saw a closing parentheses before we saw the corresponding
starting parentheses. Knowing that, this means, again, the string is invalid, so we should return false. That's it for the
edge cases, really. The invalid-inputs case
doesn't apply here, so that concludes this problem. Let's go ahead and see if
it passes all of our tests. Running the code, you'll notice that is_valid_parentheses is not in this list of failures, so we have passed
this function's tests. Let's move on to
the next problem. Here's that tip again:
consider extraneous characters. As mentioned, in an interview make sure
to ask if the string could contain other characters for
this particular problem. Now you should try
this on your own. Use stacks or queues
to check if there are matching parentheses
and matching brackets. Here we have parentheses
that open and close, and we also have brackets
that open and close. This is a valid grouping because every single pair of parentheses
is perfectly matched. But there's an
additional challenge. Notice here that if you only
consider square brackets, they are correctly matched. If you only consider
the parentheses, they are also matched. However, you'll notice
that this set of square brackets and parentheses are intersecting each other, that is an invalid grouping. Knowing this, let's try to design an algorithm
for this problem. Pause the video here and try to see how you would use
your data structures, stacks, or queues to
solve this problem. Again, you can use your previous
problem as inspiration. Pause the video here
and try it out. Here's a solution.
Just like before, we're going to use a
stack, except this time, the stack will contain either starting parentheses
or starting brackets. It will always contain basically the
starting punctuation. Knowing that,
anytime we want to, let's say we see a
closing parentheses, then we'll try popping
off from the stack. In this case, we'll pop
off this square bracket. We're currently seeing
a closing parentheses. Closing parentheses and this opening
bracket don't match, so that means this is
an invalid grouping. That would occur in
this particular case. You would see on the stack
something like this. You would see a square bracket, you would see a parentheses, and then when you pop off, you'll see this
opening parenthesis versus this closing bracket. Since those two don't
match, it is now invalid. Otherwise the algorithm looks pretty much the same as before. Anytime you see an
opening punctuation, add it to the stack, anytime you see a
closing punctuation, attempt to pop off and
make sure you're popping off the right
opening punctuation. That's it for the algorithm. Again, let's now consider what is the runtime
and memory complexity. Pause the video here and see
if you can figure it out. Now, here's the answer. For both runtime and
memory complexity, we have O of N. Just like before, for computational
complexity, we simply iterate once over
every single character, so that's O of N. For
memory complexity, we again, could
have a string with just a bunch of opening
punctuation like this. If you have a bunch of opening
punctuation like this, then your stack is pretty much as long as your input string. Here our memory
complexity is also O of N. Now that we've passed
these first two steps, let's now try the third step. Let's try coding this algorithm. Pause the video here and
see if you can code it up. Let's now try coding
up this function. We're going to start off
by creating a stack, and then we're going
to iterate over the string just like
we did before. If our current character is
a starting punctuation mark, then we'll push it
onto the stack. Then if we see a
closing parentheses, double-check that
the current stack has a starting parentheses
and return false if the two don't match. Otherwise, if it's
a closing bracket, make sure your stack is
currently inside of a bracket, otherwise, return false. Finally, make sure that your stack has closed
all possible groups. If you've closed all groups, then your stack would
have nothing left in it. All of the corresponding groups would have
been popped off. Now that we've got our function, let's go ahead and run it. Click on the "Run
button" at the very top, the green button that will
execute all the tests. Here we can see that is_valid_grouping is missing
from our list of failures, so we have successfully
passed all these tests. That concludes the
exercises in this lesson. One more closing tip, stack is like a stack of rocks, stack of clothes,
stack of boxes, the last item you add on is
the first item you take off. A queue is just like a
line, a queue of cars, a queue of birds, I guess, though I don't think
birds queue up. Queue of people, queue of boxes, any line, the first item you add is the
first item that comes off. Again, keep this in mind. That concludes this final
tip and this final note. For a copy of these slides and the finished code
for this lesson, make sure to check out
the course website. Finally, this concludes
our sequences practice. I've mentioned some
interview tips, but don't spend too much mental energy on
memorizing those. Let the practice
we've been doing get you in the habit of
following those three steps. That's the most important part, getting in the habit of
doing those three things, so that when you're
extremely nervous, when you're freaking
out, when you're not quite thinking
straight in an interview, you will default to those
three steps. That's it.
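Before we leave this lesson, here's one possible Python sketch of the last exercise, the grouping check. The function name matches the transcript; everything else is my assumption, including that the string contains only the four punctuation characters.

```python
def is_valid_grouping(string):
    """Check that every '(' and '[' is closed in the right order (a sketch)."""
    stack = []
    for letter in string:
        if letter in '([':
            stack.append(letter)       # opening punctuation: push it
        elif letter == ')':
            if not stack or stack.pop() != '(':
                return False           # wrong or missing opener
        elif letter == ']':
            if not stack or stack.pop() != '[':
                return False
    return not stack                   # every group must have been closed
```

As in the examples, is_valid_grouping('([])') is valid, while the intersecting '([)]' is not.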
7. Lists: Array vs. Linked List: Our first category is
for lists of items. We'll analyze two
data structures, in particular, arrays
and linked lists. We'll then cover their pros and cons based on our analysis. Let's start by finishing
our array analysis. An array is simply
a list of items. In memory, the array items are placed right
next to each other. Earlier we found that
searching an array takes O of n time since we have to traverse the entire list one item at a time
to find the value. We also found that fetching
takes O of 1 constant time. Let's now finish this
analysis and see how long it takes to insert an
item into the array. Say you have this
array of three items. We're now visualizing a piece of your memory in your computer. We would like to insert a
new item, a value of four. We could in theory just allocate a space for four like this. However, your computer
may have already allocated that space
to other data. As a result, the insertion
operation will allocate a new chunk of memory for
your new length for array, then the algorithm will copy
each element one by one. Finally, it will update
the new item's value. For an array with n items, insertion makes n copies. Insertion is thus an
O of n operation. To summarize, insertion
takes linear time. Let's see how long
deletion takes. Say you have an
array of six items, you would like to delete the
3rd item, the number 2 here. To do that, the
deletion algorithm will shift all the
numbers after it forward by one spot. It will copy 3 forward, copy 4 forward, copy 7 forward, then deallocate the last spot. As a result, for an
array of length n, deletion takes n copies. This makes deletion O of n. This completes our table of run-time complexities
for the array. You'll notice immediately that the array is very efficient for fetching but inefficient
at everything else. Our array is inefficient for
modification in particular only because we have to keep all the data
contiguous in memory. In other words,
all the items must be right next to each
other like this. But we can change this. Let's place each item in a list wherever we want in memory. In order to connect all the
items together into a list, we need to add links. Each of these links are
also called pointers. Each pointer points to a location for the
next item in the list. We also need to add a marker that indicates
the list has ended. This is what we
call a linked list because of all the
links we inserted. This linked list comes with
some really nice properties. Let's now analyze
this linked list. Let's say we want to insert a new value 3 at the
end of the list, then we simply
allocate two new spots and make sure 2 points to 3. No matter how many items
there are in the linked list, we have a constant
amount of work to do to insert a new item, allocate two spaces and
change one pointer. As a result, insertion for linked lists is O
of 1 constant time. The story is similar, even if we want to insert a value in the
middle of the list, allocate two new spaces for
the new value and a pointer, then point the
previous value 8 to 3, then point the new value 3 to 2. There we have it.
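In code, that middle insertion can be sketched with a minimal node class. The course hasn't shown linked-list code at this point, so every name below is my assumption:

```python
class Node:
    """A minimal singly linked list node (a sketch)."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node

def insert_after(node, value):
    """Insert value right after node in O(1): allocate the new node
    already pointing at node's old successor, then repoint node at it."""
    node.next = Node(value, node.next)
    return node.next

# Build 9 -> 8 -> 2, then insert 3 after 8 to get 9 -> 8 -> 3 -> 2.
head = Node(9, Node(8, Node(2)))
insert_after(head.next, 3)
```

Notice that no traversal happens here; we already hold a reference to the node we're inserting after, so the cost is constant regardless of list length.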
Insertion is complete. No matter the length
of the linked list, we just need to allocate two spaces and
change two pointers. This is still
constant time O of 1. To summarize so far, a linked list has
constant time insertion. Let's see now how removal
or deletion works. Say we want to delete
3, the last item. This is simple. We just reverse what we
did for insertion. Delete the 2nd to last pointer. Now the item 3 is no longer
a part of the linked list. To delete the last item, you just need a
one-pointer change. This is constant time or O of 1. Deleting an item in the middle
of the list is simple too. Say you want to delete
3, the 3rd item, change the pointer for
eight to skip over 3. Now, in effect, 3 is no longer a part of the linked
list, and that's it. To delete an item, no matter where it is, we simply change one pointer. This is constant time
again or O of 1. To summarize so far, a linked list has constant
time insertion and deletion. Let's now see how long accessing
the linked list takes. Let's say we're searching
for the value 4. We need to traverse the entire linked list one
by one until we find 4. First, we'll check the 1st item. Is this 4? 9 is not 4 so
access the next item. Is this 4? 8 is not 4 so
access the next item. Is this 4? 3 is not 4, next, 2 is not 4. So for a linked
list of length n, we need up to n checks. Searching a linked list
takes O of n time. To summarize, a linked list has constant time insertion
and deletion, but linear search time. Unfortunately,
fetching items from a linked list is
pretty slow too. Let's say we want to check or fetch the 3rd item
in the linked list. There is no way to know where the 3rd item is in memory so we need to start
from the beginning. We'll access the 1st item, then follow its pointer
to the next item. Access the next item, then follow its pointer
to the next item. Finally, access this item. Since this is the 3rd
item, return its value. For a linked list of length n, we need to traverse
up to n nodes. Fetching the ith item from a linked list takes O of n time. To summarize, we've now completed the linked
list analysis. Modifying the list, inserting, or deleting is efficient. Accessing the list, searching, or fetching is inefficient. Now that we've analyzed both
arrays and linked lists, let's see when to use which. This is the most important
part of this lesson. Even if you forget
everything I said before, keep this next slide tucked in your back pocket. This
is your takeaway. Here's a summary of the
run-time complexities. On the left, we have the
four operations we analyzed. In the 2nd column, we have array run-time
complexities from before. On the right, we have linked
list run-time complexities. For an array,
modification is slow, but fetching is fast. This means arrays are ideal for many random accesses for
a fixed set of data. For a linked list,
access is slow, but modification is fast. This means linked lists
are advantageous for their dynamic size ideal for holding constantly
changing data. That's it. This is the summary
slide for this lesson. For a copy of these slides
and more resources, make sure to check out
the course website. Let's now see these data
structures in action. In the next lesson,
we'll practice both using and implementing
these data structures, tackling both of our learning
objectives for this course.
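Before we practice, here's how the linked list from this lesson might look in code. This is a minimal sketch; the class name, attribute names, and helper function below are my own assumptions for illustration, not the course's official starter code.

```python
# A minimal linked list sketch. Class and attribute names here are
# assumptions for illustration, not the course's exact starter code.
class Link:
    def __init__(self, value, next=None):
        self.value = value  # the data this link holds
        self.next = next    # pointer to the next link, or None at the end

# Build the list 9 -> 8 -> 2 from the lesson's examples.
head = Link(9, Link(8, Link(2)))

# O(1) insertion: splice 3 in after 8 by allocating one link
# and changing two pointers.
head.next.next = Link(3, head.next.next)  # now 9 -> 8 -> 3 -> 2

# O(n) search: traverse link by link until the value is found.
def contains(link, target):
    while link is not None:
        if link.value == target:
            return True
        link = link.next
    return False
```

Notice that the insertion touched only two pointers, no matter how long the list is, while the search may visit every link. That is exactly the trade-off on the summary slide.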
8. Lists Practice: Welcome to another
practice lesson. Here we're going
to practice lists. In particular, we're going to
cover mostly linked lists. Just like before, navigate to this URL
alvinwan.com/datastructures101/lists. Once you see this page, fork the project to get
your own editable copy. To do this, click
on the project name in the top left to
get a drop-down, click on the ellipsis to
get another drop-down, and finally, click on
the "Fork" button. Then I suggest placing your
Skillshare and Repl.it windows side-by-side
as shown here. Here are the different subtopics that I
want to go through. We're going to start
off with the prompt. This is the code that your interviewer will
provide you with. Then we'll talk about
three categories of prompts: implementation, usage, and then combining
different data structures. To start off, on the
right-hand side, you should see the code that your interviewer would
likely provide you with. Here, you would have a stack, you'd have a queue. Well, I guess the interviewer probably would not
provide you with a stack and queue, but you could certainly use
those if you needed to. This is the code
that you'd be given, basically a linked
list implementation. You'll have something that
looks like the following. You would have the value
for the link and you would also have a connection
to the next link. This right here is
a utility that I've provided so that
you can actually visualize your linked list. This may or may not be provided. But if it wasn't, you could also use
this implementation. Let's now start off with
our very first exercise. Now, for all of our exercises,
including this one, we're going to follow
the same three steps that we talked about before. Again, this is very important to keep in mind and to practice. First, we'll design
the algorithm without code verbally or visually. Second, we'll report the
runtime and memory complexity. Third, then we'll start coding. Here is linked list appending. What we're going to do is append an item to the linked list. So here we'll have, let's say a linked list
that has 1, 2, and 3. We want to append
the value four. Then after appending it, we'll then have a linked
list that's 1, 2, 3, 4. Now go ahead and pause the video here to determine
what algorithm you're using. Conceptually, what the
steps are going to be. Here's the answer. Conceptually, what we're going to do is again, we only have a pointer to the beginning of
the linked list. We're going to follow that
pointer just like what's shown here on the left-hand side with this left arrow in black. We're going to start from that pointer then we're
going to navigate to the next link and keep going until you've reached the very end of our linked list. Then at the very end
of our linked list, we'll simply create a new link
and attach it to the list. That's conceptually
what we're going to do. We're going to
traverse to the end of the linked list and then we're
going to add a new link. Knowing that now, again, we'll have to consider
the second step in our three-step process. What is the memory and runtime complexity of
this append algorithm? Pause the video here and see
if you can figure it out. Now here's the solution. This algorithm is linear in computational complexity
simply because you have to traverse the
entire linked list, which could be up to length N. That's O of N
computational complexity. For memory complexity,
we have O of 1 memory complexity
simply because we didn't use any additional
memory other than storage for the current item and the existing linked list. We didn't actually create
any new data structure. We just have this link
which is O of 1 cost. Knowing that, let's now
actually code the algorithm. Again, pause the video here and see if you can code it up. Now here's a solution. Here we're going to traverse
to the very end of the list. Here we'll say while
link.next is not none. This means as long as we have another link in
our current list, we're going to keep traversing
until we get to the end. Then finally, at the
very end of the list, so this link will now be
after this while loop, the last element, so it'll be 3. Now we've got 3, we're going to write link.next
is equal to our new link, which contains a value 4. Knowing this, let's now go
ahead and run our tests. We can see now that append
is missing from this list of failures so we have
passed our appended test. Let's now move on to
the next question. Now before we actually move
on to the next question, actually for the next
question as well, memorize how to
traverse a linked list. It will almost
always be the same. You're going to have
some while loop, you're going to check to
make sure that you're not accessing some none type object, then you'll just
access the next link. This is how you'll pretty much always traverse
a linked list. Keep this in mind because the very next problem is going
to use something similar. Go ahead and scroll down in your file and you'll see now
here is the insert question. Our goal is to insert after
the provided index I. In particular, let's say you
had this linked list 1, 2, 3 and we want to insert
a link right here after index 0 and we want
to insert the value 4. Here we'll insert after
index 0, the value 4. Here we have 1, 4, 2, 3. If you're looking on
the left-hand side, we've actually visualized
what this will look like to help you
out a little bit. [LAUGHTER] Go ahead and
pause the video here, and see if you can
determine what the algorithm would look like. Now, here's the answer. What we'll do is first
traverse the list until we get to the
right link to modify. In this case, let's say
we're inserting 8 after 9, just like we're
inserting 4 after 1. Here we're going
to navigate to 9. Once we've reached the value 9, then we'll create a new link. We'll relink 9 to 8
and then we'll relink 8 to 2. There
are three steps. Navigate to the correct element, relink to the current link, and then use the current
link to connect to the next link. That's
the algorithm. Now, knowing that this
is the algorithm, go ahead and pause
the video here and figure out what the runtime
and memory complexity is. The memory and runtime
complexity here. Computational complexity
is going to be linear. We may have to traverse
all the way to the very end of the list
to insert something. Memory complexity
is constant because the only thing we're creating
is this link itself, which is O of 1
constant memory cost. Knowing that, let's now
move on to the code. Pause the video here
and try to code. Now here is a solution. Just like before, we're
going to start off by actually navigating
using a while loop. Actually instead of
using a while loop, we're going to use a
for-loop because we know exactly how many steps
you're going to take. For underscore in range i, link is equal to link.next. I've used underscore here to denote that we're not going
to use that variable. All we care about is that we access .next, .next, .next,
basically i times until you get to
the right element. Once you're there, we're going
to reassign a few things. We're going to first reassign the current value's next
attribute to the new value. But then, we also
need to connect this new link to the
rest of the linked list. We're going to use this to connect to the
original link.next. Again, in this example
on the left-hand side, 9 originally pointed to 2. We're now going to point 9 to 8. That's right here. Link.next
is equal to our new object. Then we're going
to point 8 to 2, and that's right here
in the second part. We have the original link.next and we substitute that in
for the new link.next. That's it. This is
our Insert function. Go ahead and rerun your tests by hitting the green "Run"
button at the top. Now we can see that there
are one fewer failures. We have insert not in
our list of failures, we have successfully
passed the tests. Let's move on to
the next problem. Here, one tip that we
had for this problem and for future ones is to
always check for none values. So one common error
that you'll get with linked list
problems is you'll get,
for example, in this link.next, you'll get: link is
a NoneType object, and next is not an attribute
even before you debug, even before you run the code as you're talking
to the interviewer, make sure you're running
through your code and checking for any place
where there might be none values and making
sure that you're not calling .next or .value
on a NoneType value. In this particular
function right here in lines 83 and 84, it's quite possible
that i was invalid. Let's say i here was 100. Then we would reach
at some point a NoneType value and we'd
try to call next on it. To make this function
more robust, you'd have to check for that. You'd have to account for that. In this particular
simple implementation, we didn't do that, but just something to
keep in mind always, always check for none values in your interviews and any
program that you're writing. Let's now move on to
the next problem. Our goal now is to implement
a stack using a linked list. Here we have a linked list, we have one, two. Our goal is to be able to treat this as though it was a stack. If we call dot pop on it, it would give us the
last item which is two. If we call push, here we push 2, push 3, then if we pop, we'll get back the last item
we pushed, which is three. Pop again, you'll
get the second to last item you push,
which is two. Our goal is to implement a stack but using a linked
list in the backend. Go ahead and pause the
video here and see if you can figure out
should the algorithm, in this case, the algorithm is, how are you going to push and
how are you going to pop? Pause the video here.
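For reference, here is one way such a stack could be sketched in Python. The Link class below is an assumption standing in for the course's starter code, and the method bodies mirror the append-and-delete approach described in the answer that follows.

```python
class Link:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

class Stack:
    """A stack backed by a linked list (a sketch, not official course code)."""

    def __init__(self):
        self.head = None  # first link of the backing list

    def push(self, value):
        # Append at the tail, just like the earlier append exercise: O(n).
        if self.head is None:
            self.head = Link(value)
            return
        link = self.head
        while link.next is not None:
            link = link.next
        link.next = Link(value)

    def pop(self):
        # Remove the tail: stop at the second-to-last link, grab the last
        # value, then delink it. Also O(n). Assumes the stack is non-empty.
        if self.head.next is None:
            value = self.head.value
            self.head = None
            return value
        link = self.head
        while link.next.next is not None:
            link = link.next
        value = link.next.value
        link.next = None  # the second-to-last link is the new tail
        return value
```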
Here is the answer. Conceptually speaking, this is actually pretty similar
to the beginning. When we say push here, we're going to pretty much write the append function that we
talked about previously. When we say pop here,
new function, delete. We haven't actually written
a delete function yet, but it will pretty
much be just like any other function that
deletes from a linked list. So now that we know
the algorithm, go ahead and pause the video here and see if
you can figure out the computational and
memory complexity. The solution for this part, runtime and memory complexity will actually runtime complexity
is going to be linear. We're going to traverse
all the way to the very end of the linked
list and then add to it. In pop's case, we're going
to traverse all the way to the very end of the linked
list to remove from it. Now, knowing that, the memory complexity
is constant. The memory complexity, all
you're doing is creating a new link and so that
is O of one memory cost. You're not creating
a whole new list. You're not creating
another data structure that stores n items. You're just creating one link. Memory cost is O of one. Now, pause the video here
and try to code it up. Let's now write up the code. So here we're going to
implement push pretty much just like re-implemented
append earlier. We're going to first define
a local variable link that is our current linked list beginning and then while
link.next is not none. So while we have something in that linked list, let's traverse it. [NOISE] Then finally
at the very end, we're going to add a new link. That's pretty much it for the
push method. Let's now move on
to the next one. To remove something,
we're going to again set this local variable link to our start with
the linked list. Then while we're one
away from the end, so we don't actually
want the very last link, we want the second-to-last link because that's going
to be our new tail. We're going to again traverse. Then what we're going to do
at this point is first grab the value of the very last item before we actually remove it. Here, we wanted to get the
value of the very last link. Then next, we're
going to delink it. So now our second-to-last
link is pointing to nothing. This is now our
brand-new last link, and then finally
return the value. This is how we actually delete from the very end
of the linked list. Let's go ahead and
click the green button
at the very top to run our file and check our tests. There we go. We can now see that our stack via Linked
list failure is actually gone. It is no longer in
our list of failures, so we have passed these tests. Let's move on to
the next problem. Before I move on to that
problem, we have one more tip. If you want to easily make one of your own questions, pretty much just mix and
match any of these. We've talked about three
different data structures,
stacks, and queues. In order to create
a new question, just pick one to implement,
then pick another to use underneath. Any one of these
combinations will give you a brand new problem
that you can try. As long as you can pass the same tests that
you've written for a stack except using your brand new data structure,
then you're good to go. This goes for
anything else. If you have tests for a queue, then just make sure your
brand new data structure can pass those tests, and the same for linked
list tests, and so on, so forth. Just mix and match any
of these combinations to get a brand new problem. Now, let's implement cycle
removal for a linked list. In particular, find and remove a cycle in a linked
list if it's present. For now, for simplicity, you can assume all the
link values are distinct. This is very important because the first question
you're going to consider when you see find and remove cycle is how are you
going to detect a cycle? One way to detect a
cycle would be if I've seen this value before already, then I know I've hit a cycle.
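That detection idea can be sketched as follows. This is a sketch under the stated assumption that all link values are distinct; the Link class stands in for the course's starter code, and the function only detects a cycle, leaving the removal for the exercise itself.

```python
class Link:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def has_cycle(link):
    # Assumes all link values are distinct, as stated above.
    seen = set()  # values we've encountered so far
    while link is not None:
        if link.value in seen:
            return True  # saw this value before: we've looped back
        seen.add(link.value)
        link = link.next
    return False  # reached the end without repeating a value
```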
To clarify, let's say you're
moving from left to right along this linked list. As you progress, if you
suddenly see three twice, then you know that
you've hit a cycle. But you can only say that if you know that all the
link values are distinct. This is why link values being distinct is a very important
question for this problem. I wrote that here: are
all values unique? That's going to be one of your first questions for
your interviewer. Your second question,
and this is a little bit more of an
advanced question. You want to ask, is there
a max cycle length? This has to do with
the implementation, because if you want to keep track of all the values
that you've seen so far, and then you want
to say, "All right, have I seen this value already?" Then you could potentially
never see a cycle. This entire linked list
could be a million links long, and if there is
not a cycle anywhere, then you've basically kept
this massive store of a million items waiting to see if any of those million
items show up again. So a good question to ask would be: is there
a max cycle length? Because that way you
wouldn't have to keep all million items if you know the maximum cycle
length is only five. These are good
questions to keep in mind for this problem. As you progress and
practice these problems, you'll begin to understand
what questions to ask. So for now, if you're thinking these are crazy
questions out of the blue, no way to ever think
about these, that's okay. Just as you see more
and more of them, you'll get more
familiar with what kinds of questions to ask. For now, let's see if you
can think of an algorithm. I gave away parts of it, but pause the video
here and see if you can come up with a way
to remove cycles. And here's the answer. Basically, what you're
going to do is keep a collection of all the
values you've already seen. If you've seen the
value already, then you know that
this is a cycle and you can break the cycle by simply reassigning
link.next accordingly. Knowing that, that is
now the algorithm. What is the runtime and memory complexity?
Pause the video here. For the memory and
runtime complexity, for computational
complexity in particular, you're going to have O
of N. We're going to traverse over this
entire list one by one. We need to do that in order to figure out if there's
a cycle or not. The second question is: what is the memory complexity? For memory complexity,
we're going to again, keep a collection of all the different values
that we've seen so far. That's an O of N
memory complexity. O of N memory complexity and O of N computational
complexity. Knowing that, let's
now try to code it up. Pause the video here. Now, let's go ahead and code this up. Here we're going to start off by creating a set of values. So if you haven't
seen this already, a set is a data structure
that allows you to check for membership
in constant time. Meaning that you can do something
like this: 1 in seen. And this, no matter
how long your set is, will always execute
in constant time. That's a set. We're going to use
the set to track all the values that
you've seen so far. While there's something
else in our linked list, we're going to check
if that next value is in our seen values set. If it is, then we're going to break the connection with
the next link. Again, if the next
link is in our list, I've seen values,
we're going to break that link and we're
going to end here. We're assuming there is
only one cycle in our list. Again, that would be a great question to ask
your interviewer. Will there be more
than one cycle? In this particular
question we're going to pretend
there's only one cycle. We call seen.add, and then we'll have
link.value right here. What we're doing here is
we're saying, all right, we've considered that
there is no cycle here. Because there's no
cycle, let's add the current links value to
our list of seeing values. Finally, this is the standard
a linked list traversal, we're just going to access the
very next item and it will keep going. That's it. Let's go ahead and
run this function and make sure that
our tests pass. There we go, we can
see that remove cycle is no longer in our
list of failed functions. Let's move on to
the next question. For the next question,
and also for this one, one tip is to always draw your linked lists and
check for broken links. What I mean by draw is basically the illustrations that I had before right here. If need be, draw these out to understand if you're linking everything
in the right ways. I moved pretty quickly through
some of these questions because I've already
written the solution and I know the solutions. But basically, if you're
given a brand new problem, the odds of making an
incorrect link are very high. That's completely okay as
long as you draw it out. If you draw it out,
you can work out the correct links to connect. That's one of my tips here. Draw your linked list and
check for broken links. Now, for this
particular problem, we're only going to
discuss it conceptually. I've actually written a
fully working solution in code for you but because
it's fairly complex, we're only going to talk
about this algorithmically. For this question,
our goal is to use or create a new file object by using the data structures
that we've already seen. Our file object is going to
be merged with other files. [LAUGHTER] I just showed you a part of the
description of the solution. But your goal is to create a file object that can efficiently
merge with other files. The idea is there are
many of these files. What that means is
you don't want to copy all of the data
into one massive file. If you do all that copying, it's going to take
a very long time. How can we combine a bunch of
files without copying them? If you don't want the hint, then go ahead and
pause the video here. But one hint that I have for you is we don't want to copy, so instead of copying, let's store a pointer
to the next file. We'll have file 1
and then we'll say, from file 1, we know what
the next file is file 2. All you're going to do is store a connection or a pointer. That should sound awfully
a lot like one of the data structures we just
covered and I've been using. Now pause the video
here and see if you can figure out
the algorithm. Now here's the solution. Basically we're going
to use a linked list. Every single file is
going to be a link. Each file will then
point to the next file and say that's the next
file value to traverse. That's how you're going to store this massive merged file. But that's not all,
there's a little bit more because we need to support
this merge function. Knowing how we're going
to store them is great. We don't have to make
copies, but how do you actually merge a new file? Well, the emergent new file, it just like you would add
to any other linked list. You pretty much traverse to
the end of the linked list and add the new file.
That's pretty much it. For this problem
or this exercise, the only conceptual difficulty is realizing that you have
to use a linked list. That was really the
pointer problem. Now that we've covered
the pointer problem, we're going to
actually move on just because I do have a written solution for
this and there are line-by-line annotations
for what each one is doing and how it works but the details are
not super important. Let's move on to
the next question. Again, another tip here. Linked lists are used for
dynamically sized collections. In this particular case, this set of files is a dynamically sized
collection. We don't know how big it is, we might add to it
a bunch of times, and we might even remove
from it a bunch of times. Linked lists are great for
those collections because you don't have to copy
data anytime you want to add or remove an object. Anytime you want
to add or remove a link, it's constant time. Now however, arrays are
perfect for random access. We talked about
this a little bit before but basically what this means is if you want to access
the ith item in an array, you know exactly where
it is in memory. You go straight to
that part in memory. However, for a linked
list if you want the ith item as you saw before, you need to traverse
the entire list, so not quite as efficient.
That's the tip. Linked lists are for
dynamically sized collections, arrays are for random access. For our last question here, your goal is to combine
data structures to find a palindrome
in a linked list. A palindrome is a word that is read the same backwards as
it is forwards, like racecar. If you flip racecar,
you still get racecar. Here, your word is going to be represented
as a linked list. Here we have a link
that has r in it, and that points to the next
link which has a in it, points to the next
link which has c, and then points to
e, points to c, points to a, so on and so forth. Our linked list represents
the word racecar. We want to check if this
linked list is a palindrome. Consider what other data
structures you may use to also see at this linked
list is a palindrome. Pause the video here and see
if you can figure it out. The solution, conceptually, for this pretty challenging problem is
stack and a queue. As you progress and traverse
along this linked list, you add to both the
stack and the queue. Then once you've finished traversing the
entire linked list, simply pop slash dequeue, just remove from each of the stack and queue and then check if each one of
those equals the other. That's great because the
stack removes from the end, the queue removes
from the beginning so you're basically moving along the string in reverse
and forward at the same time by just removing from the
stack and the queue. That's perfect for us
because in a palindrome, one way to check if it's
a palindrome is to check if the first character
is you're the last one. At the second one is
you go to the second last one and so on and so forth. By removing from the stack
and the queue, you get that. Let's pause right here. Let's see, that's the algorithm. What is the run-time
and memory complexity? Here's the answer.
For both run-time and memory complexity, we have linear complexity. Memory complexity
is linear because we're storing a stack and a queue that each could have up to length n, and we also have linear computational
complexity because we're iterating over each
of these letters. In fact, we're iterating twice, but it doesn't matter, it's O of n. Now that we know the
algorithm and we know the complexity, let's
try coding it up. Pause the video here,
and try to code it up. Let's now cover the solution. We're going to start
off by initializing both the stack and a queue. Now that we've got both
of these data structures, let's traverse the linked list, just like we said, and for every single
link we're going to add it to both of
these data structures. Here I'm going to
add to the stack. I'm going to push the
stack and then I will enqueue onto the queue as well. Then I will traverse
to the next link. Just like I said before,
this traversal code is pretty much always the same. Now having finished that, let's now run through both
of these data structures, the stack and the
queue, and make sure
single item we remove, they're exactly the same. While there's something
in the stack, pop from the stack and
dequeue from the queue. As long as they match,
we're good to go. If they don't match at any
point, we return false. If nothing returns false, all of the match and
we can return true. That's it. Let's now go
ahead and evaluate this. Hit the green Run
button at the very top and check that
all your tests pass. Here we can see that
our function, is palindrome link, is nowhere to be found in the
list of failures. We have successfully
passed that test. These three remaining
functions are bonus functions. I've also included fully
fleshed out solutions for them and explanations
as well in the code, so you can try those on
your own time as well. If you have any
questions, feel free to post them in the
discussion section. That's it for this lesson. For the slides and the fully
fleshed out code and solutions, make sure to visit
the course website. That's it for lists, both arrays and linked lists.
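As a recap of the lesson's final exercise, the stack-plus-queue palindrome check can be sketched like this. The Link class and the from_word helper are illustrative assumptions rather than the course's exact code, and Python's collections.deque stands in for the queue.

```python
from collections import deque

class Link:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def is_palindrome(link):
    stack, queue = [], deque()
    # Traverse once, adding every value to both structures.
    while link is not None:
        stack.append(link.value)
        queue.append(link.value)
        link = link.next
    # The stack removes from the end, the queue from the beginning,
    # so we compare the word backwards and forwards at the same time.
    while stack:
        if stack.pop() != queue.popleft():
            return False
    return True

def from_word(word):
    # Illustrative helper (an assumption): build a linked list from a string.
    head = None
    for char in reversed(word):
        head = Link(char, head)
    return head
```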
9. Mappings: Hashmaps: In this lesson, we'll introduce a data structure for mappings. A mapping is what
it sounds like, a mapping from one
item to another. Here we map cat to
the number five. In this case, let's say
we're mapping animal to age. Animal is what we call
the key in the mapping, and age is what we call
the value in the mapping. We can also have additional key value pairs in the mapping. Notice that this mapping
has no sense of order, each key value pair is completely separate
from one another. The data structure that stores this mapping is
called a HashMap. We'll explain how
a HashMap works, discuss its pros and cons, then briefly discuss
a bonus concept. Here's why this data structure
is called a Hash map. At the core, a HashMap uses a function called
a hash function. This hash function takes in some value with
arbitrary length, like cat, then transforms that input to produce a
fixed size output like two. This hash function is important, and so is that output. Let's see how both
of these are used. Say we want to
insert some keys and values into a hash table. On the right, we
have a hash table where zero is the first
slot's memory address, one is the second
slot's memory address, and so on and so forth. We want to map the key
cat to the value five, to do this we'll
hash the value cat. This gives us the
memory address 2, so in position two,
place the value five. We'll repeat this for more data, to map dog to three, hash dog. This gives us the
memory address 3, so in position three, place the value three. We repeat this one more time, to map pig to four, hash pig. This gives us the
memory address zero, so in position zero,
place the value four. This process takes
just constant or O of one time for each key
value pair we insert. To delete from the HashMap, we go through similar steps. To delete cat from the HashMap, first hash cat to obtain two, then at position
two, remove the value. This operation is also
constant or O of one time. To summarize, insertion
and deletion for a HashMap are both
O of one time. Very fast. Let's say we now
want to search the HashMap: which value corresponds to cat? Almost the same process: hash cat to obtain its
memory address, which is two. Then at position two, grab the value five
and return it. As a result, we say searching a HashMap takes constant, again O of one, time. To summarize,
searching a HashMap takes constant O of one time. For a HashMap, search
and fetch are the same, so we skip analyzing
fetch separately. Let's now understand the pros
and cons of these HashMaps. HashMaps, in summary, have some
very impressive runtimes: constant time modification and access. Due to
how HashMaps work, they're ideal for any data
with unique identifiers. By definition, HashMaps are
not designed for ordered data. As we said before, each of these key value pairs
are completely separate. There is no sense of ordering, there is no relationship between separate
key value pairs. We've discussed the first
two sections already, what a HashMap is and
what it's ideal for. We'll discuss the bonus section after wrapping up
the core content. For a copy of these slides
and more resources, make sure to check out
the course website. This concludes our core
discussion of HashMaps. If you're happy with
this level of detail, feel free to move on to the upcoming lesson
with HashMap practice. Otherwise, let's now
discuss the bonus section. In particular, we
discussed a caveat for HashMaps called
hash collisions. We previously said
that HashMaps are constant time for
modification and access. However, this isn't always true. Runtime actually depends on how a HashMap was implemented
and here's why. Sometimes a phenomenon
we mentioned before, called a hash collision, occurs. For example, say we want to
insert two key value pairs, cat maps to five and
dog maps to three. As before, we hash
cat to get two. At position two, we then
place the value five, this is just like before. Now, to insert our
second key value pair, we hash dog to get two, at position two, we now have a problem. We can't override
the existing data, so we have two options. The first option is
called open addressing. In open addressing, we simply find another
slot in the HashMap, if this next slot is empty, place your new value there. One way to find an
empty slot is to simply traverse the slots
until you find an empty one, and there place your value. There are also other strategies
for slot finding, which we won't discuss here. Another completely
different technique for handling hash collisions
is called chaining. In this technique, we treat each slot as a data
structure of its own. For example, we could treat
each slot as a linked list, then to insert
into position two, we simply insert three
into the linked list. We can also treat each
slot as a HashMap, or any other data structure we'd like, each with its own trade-offs.
In summary, we defined
a hash collision and also saw two ways of
handling a hash collision: open addressing, where we find
other empty slots to use, or chaining, where we treat each slot as its
own data structure. That concludes our bonus section here on hash collisions, make sure to check
out our next lesson for practice with HashMaps.
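Before moving into practice, the hashing, chaining, and constant-time operations described in this lesson can be sketched in Python. This is an illustrative toy, not production code (Python's built-in dict is the real, optimized version of this idea, and the name ChainedHashMap is made up for this sketch):

```python
class ChainedHashMap:
    def __init__(self, num_slots=8):
        # each slot holds a list of (key, value) pairs that hashed there
        self.slots = [[] for _ in range(num_slots)]

    def _slot(self, key):
        # hash the key, then map the hash onto one of our slots
        return self.slots[hash(key) % len(self.slots)]

    def insert(self, key, value):
        slot = self._slot(key)
        for i, (k, _) in enumerate(slot):
            if k == key:              # key already present: overwrite it
                slot[i] = (key, value)
                return
        slot.append((key, value))     # collision or not, chain onto the list

    def search(self, key):
        for k, v in self._slot(key):
            if k == key:
                return v
        raise KeyError(key)

    def delete(self, key):
        slot = self._slot(key)
        for i, (k, _) in enumerate(slot):
            if k == key:
                del slot[i]
                return
        raise KeyError(key)

table = ChainedHashMap()
table.insert('cat', 5)
table.insert('dog', 3)
table.insert('pig', 4)
print(table.search('cat'))  # 5
```

With few collisions, each of insert, search, and delete touches only one short chain, which is the constant-time behavior described in the lesson; with many collisions, the chains grow and the operations degrade.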
10. Mappings Practice: Welcome to the mappings
data structure practice. Here, again, we will
practice more exercises and go over some interview tips related to the mapping
data structure. In particular, these
data structures are also called Hashmaps. So we will use those
names a lot, Hashmaps. Just like before, navigate to this URL:
alvinwan.com/datastructures101/mappings. That will bring you to repl.it. I know we've talked about these instructions
several times now. I'm going to go over it one
more time just in case, once you see this page, fork the project to get
your own editable copy. Click on the project name in the top-left to get a drop-down. Click on the ellipsis to
get another drop-down, and then click on
the "Fork" button. Then I suggest placing
your Skillshare and the repl.it windows
side-by-side as shown here. You should then see
something that looks like what you have on
the right-hand side. So let's now talk about the kinds of exercises
that you'll see. You'll have the implementation, usage and combination exercises. If you scroll down, you'll
see our very first exercise here is to implement a stack. We're going to do
this in three steps, just like we've done in
all the other lessons. We're going to first design the algorithm without any code. We're going to report the
runtime and memory complexity. Then we're going to actually
code up the algorithm. Let's apply these three steps to the very very
first algorithm, or our first exercise here. Implement a stack that limits all values in the stack to
only three occurrences. If a value is being pushed that has more
than three occurrences, just ignore that value. So in this particular case, we have an example here. We initialize a stack
first and then we push the value one, ten times. However, only the
first three value ones are actually added to the stack. Everything else is ignored. You can then push other
values other than one and it will be added
to the stack normally. Go ahead and try this out. First, design the algorithm
that will actually implement this
stack with a limit. Pause the video here and try
to figure out an algorithm. Here's the solution
algorithmically. What we're going
to do is actually use the baseline stack
implementation first. So you can see here that
my capped stack class actually inherits from
the base stack class. Knowing that, we're
going to impose restrictions on when you can actually push to a
stack, and that's it. To do that, we're going
to keep a HashMap, mapping all the values to
their counts in the stack. Anytime you push, you'll
increment that value. If that value exceeds the limit, we'll simply ignore the value, and then pop will actually make sure to
decrement that count. In push, you check the count, add it if the count is okay, and then increment the count. In pop, you decrement the count. So that's the algorithm. Now, let's discuss the runtime
and memory complexity. Pause the video here and see
if you can figure it out. For the runtime complexity, it's whatever the complexity of the original algorithm was. In this particular case, push is linear, and pop, the way we implemented
it, is also linear. So both push and pop
are linear operations. Now for the actual
memory complexity, so what does that look like? Well, we have to create
a new value here. It's not exactly O
of n. We do have one value added to the
HashMap for every new value, but that's not as many as the number of total
values that we added. We only have the number of
unique values in our HashMap. We need to invent a new letter. Instead of O of n, we'll say
it's O of k, for example, and we'll define k
as the number of unique values that have
been added to the stack. If you actually had
this in an interview, you would say
something like that. We actually need to
define a new variable, k, where k is the number
of unique values, and the memory complexity for
our algorithm is O of k. That's it for the complexity. Let's now move on to the code. So pause the video
here and try to code. In the constructor
we're going to initialize a new counter. From collections,
import defaultdict. This is a different
kind of dictionary. What this dictionary
does is if you access a key that doesn't
have a current value, it'll initialize that
value using this function. In this particular
case, for this counter, if let's say we
access counter 0, 0 is not currently a valid
key for this dictionary, so it'll initialize it
with this int function. The default for the
int function is zero. By default, self counter
of 0 will give me 0. In fact, self
counter of anything, will also give me 0. That's how this default
dictionary works, and this will just make some of the code look a
little bit simpler. Now that I've got this, let's now implement the push function. Like we said before,
the first thing we're going to do is check if the current count
is under the limit. If it's under the limit,
then we'll proceed. If it's under the limit, then we'll actually
increment the count, and then we'll actually
push it onto the stack. In pop all we'll do is
simply decrement the count. Here we'll first get the value: we'll pop from the stack, and then once we pop that value, we'll decrement the
count for that value, and then we'll return the value. Let's now see if our
capped stack tests pass. At the very top, hit the green
button, and there we go, capped stack is now no longer
in the list of failures, so we've passed this test. Let's move on to
the next question. One tip is that Hashmaps are ideal for data
with unique IDs. In this particular case, we know that every single value, or the premise of the
problem is that we're gatekeeping whether or not to push onto the stack
by the value. So the value itself
is the unique ID. Because we have this unique ID, Hashmaps were perfect
for this problem. However, Hashmaps are
a poor fit for ordered data. If there's a relationship
between your data, let's say I have a sequence
of numbers, one up to 10, and I need to know
that ordering. Or another way to
put it would be, let's say I have 10 values and I insert them into
a data structure. If I need to remember the
order that I inserted it in, Hashmaps are not the
data structure to use, because Hashmaps forget
all the ordering. All they do is maintain a
mapping from keys to values. They don't remember
anything about what order you
insert the keys in. So that's important. Hashmaps are ideal for
data with unique IDs, but they're not so great for data where ordering
is important. For our next question, scroll down to see
the next question. Return a histogram of all the unique letters and
their numbers of occurrences. I guess this should
have gone first. This is a subset of
the previous problem. But anyways, pause the video here and see if you can figure
out the algorithm. The solution
conceptually is we're going to initialize a
HashMap as a counter, and anytime we're going to
iterate over the string, and for every single new value, we're going to
access our counter. If we've already
seen this value, we're going to add
one to the count. If we haven't seen
this value before, we will initialize the count to one. So that's the algorithm. What is the runtime
and memory complexity? Pause the video here.
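For reference, the counting algorithm just described might look like the following in Python. This is a sketch with a plain dictionary, mirroring the two cases above (letter seen before versus not seen); the function name is illustrative, not the official starter code:

```python
def histogram(string):
    counter = {}
    for letter in string:
        if letter in counter:
            counter[letter] += 1   # seen this value before: add one
        else:
            counter[letter] = 1    # first time seeing it: initialize to one
    return counter

print(histogram('banana'))  # {'b': 1, 'a': 3, 'n': 2}
```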
Here's the answer. The runtime
complexity is linear. We have to iterate over every single letter
in the string. There's no way around that. The second one,
memory complexity. Again, we need to invent
a brand new variable. That variable is the
number of unique values, let's say k. This
algorithm is o of k, where k is the number of unique values that
are in the string. That's it for the
complexity analysis, let's now try to code it up. Pause the video here,
and try the code. Now let's cover the
solution in code. Here we're going to
create a counter. This counter is going
to be a dictionary. In fact, what I'm going to do is use the default
dictionary from before. That way the dictionary
initializes to zero. Here we will have from
collections, import, defaultdict. Then here for every single
letter in the string, we're going to increment
the corresponding count. Here we're going to increment the corresponding count and
finally return the counter. Now, I also want to
show the implementation without this defaultdict. Let's assume that you
didn't use the defaultdict. Let's assume you just
create a normal dictionary. Then here we would check if
the letter is in the counter. If we've already seen
this letter before, then we'll simply
increment by one. If we haven't seen this before, then we'll initialize it to one. Counter letter is equal to one. Let's now go ahead and
run this to check if unique is now a passing
test. There we go. Unique now has passed
all of its tests. We've also completed
this exercise. Let's now move on to
the next exercise. One tip is make sure your key
exists before accessing it. We did that right here. In line 103, we checked if the letter was
in the counter. In other words, if we
had seen this letter before and there was already
a valid key in our mapping. So knowing that,
let's move on to the question that comes after. Using our data structure to find the most frequent value in
a list of whole numbers. I've specified whole numbers to make this a
little bit simpler. But there are several questions you'd want to ask for
your interviewer. The first question would be, if they hadn't specified
the whole numbers, they most likely would not. They would most
likely say numbers. Then you'd want to ask, can there be negative values? Can there be decimal values? What are the range? What is the set of possible
values that I might see? The second question
that you would want to ask is what if there's a tie? That the question itself is not well-defined
if there's a tie. Now, in the event of a tie
in this problem description, report any of the numbers tied for the greatest frequency. That's usually what will happen. But these are good
questions to ask and it gives you some brownie
points in your interview. In this particular question, we have a list of numbers, and I want to figure out what
the most frequent one is, pause the video here and
see if you can figure out the algorithm.
Here's the solution. We're going to do something
very similar to before. We're going to use a HashMap to actually store the counts for every single value in our list. Once you've got that counter, well then use the
counter to figure out what the greatest one was. That'd be one way to do it. Another way to do this that doesn't require
two passes of the data, would be to actually, as you're adding to the counter, keep a variable that represents
the most frequent value. If you see a value that
all of a sudden has more instances then
replace that variable. You'll see this in
action in just a second. But conceptually, the
main idea is to keep a counter of all
the unique values and then use that counter. Now that we know the algorithm, pause the video here and see if you can figure
out the runtime and memory complexity.
Here's the answer. Just like before,
we have the exact same runtime and memory complexity as the previous functions. The runtime complexity is
going to be linear. You have to traverse over every single item in the list. There's no way around that. The second one is O of k, where k is the number
of unique values. That's I guess
becoming a motif now. But at least for
these exercises, O of k is the correct one
time, so keep that in mind. It's linear in the
number of unique values. Now that we've covered
the complexity, we've covered the algorithm, pause the video here
and try the code. Here's the solution. Let's write this out now. We're going to start
off with counter and counter is going to
be an empty dictionary. The greatest value is going
to be none, let's say. For value in numbers, we're going to check if the
value is in the counter. If we've seen this value before, actually I'm going to change
this up a little bit. If the value is not
in the counter, then we'll initialize
that to zero. Regardless of what its value
is, we'll increment by one. If the counter is greater
than the greatest value, so if the current count is greater than the greatest count, then we'll reassign, and then we'll return
the greatest value. Before running this function, you should keep in mind
something which is, well, one of the tactics
that I will bring up in a future course is you should always
virtually execute your code. What do I mean by that
is just run the code in your head as though you
were the interpreter. If you catch any bugs
before running the code, that's usually another
bonus point for yourself. In this particular case, we would actually see an error here because of
counter greatest. Greatest here is none and none does not yet
exist in the counter. This goes back to
our previous tip, always ensure that the key exists before you
try to access it. In this particular case, I'm going to add the
none key right here, and I'm going to set
it to negative one. That way, every single
other value with a count will be
greater than this one. In the very first iteration, replace the greatest value
with the current value. That's it for this function. Let's now try executing the
tests and make sure that most frequent no longer
shows up as a failed test. Click the green "Run"
button at the very top. There you go. You can now see that most frequent is no longer
a failing test. Here we want to
consider empty input. There is one case right
here where you might have something strange
going on if you passed an empty list
into most frequent, then you would simply
get back none, which is a reasonable default. But you should just make sure
that your function doesn't break if you get empty input. In this case, our function does not break. It returns none. So as long as we declare
that the function will return none
upon empty input, then that's perfectly valid. For our next question. This one is pretty
challenging conceptually. We're going to use
a data structure or any data structure to
sort a bunch of travel steps. A bunch of steps are formatted
as source and destination. You can think of these as
source and destination cities. Once we have that, we want to
compute the complete path. We want to find the
real starting position and the real or ultimate
ending position. You can assume for
now that there is only one path and that
the one path is complete. That means that
there are no gaps. You'll never end up at some random
position in the middle with a bunch of unrelated steps
and there are no cycles. You'll never go from
one to two back to one. If you go from one to two, you will always
proceed to three, four, so on, so forth. Your goal is to print
the sequence of steps in order from first to last. Here we have a bunch of steps. We went from three to two, one to three, and two to five. Your goal is to find out what the correct sequence
of steps is. Here, the correct
sequence is one to three, then three to two,
then two to five. Pause the video here, see what data structures you want to use and how you would use them. Here's the solution.
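One possible sketch, in Python, using the two mappings described in the solution that follows (try it yourself before reading on; the names get_path, forward, and backward are illustrative, not the official starter code):

```python
def get_path(steps):
    # two mappings: forward goes source -> destination,
    # backward goes destination -> source
    forward, backward = {}, {}
    for source, destination in steps:
        forward[source] = destination
        backward[destination] = source

    # walk backward from any source until we reach the ultimate source
    source = steps[0][0]
    while source in backward:
        source = backward[source]

    # then walk forward, collecting the steps in order
    path = []
    while source in forward:
        path.append((source, forward[source]))
        source = forward[source]
    return path

for src, dst in get_path([(3, 2), (1, 3), (2, 5)]):
    print(f'{src} -> {dst}')  # prints 1 -> 3, then 3 -> 2, then 2 -> 5
```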
Conceptually, what we're going to do is first find out what the ultimate
starting position is. In order to do that,
we're going to create a set of Hashmaps. These Hashmaps will map from the destination
to the source. Then what we'll
do is we'll start from any arbitrary destination. From that destination
we'll check, is the source in the HashMap? Is the current source
another step's destination? If so, then we go to that step's destination and source, and you keep doing this. Let's say for example, we start from two and then we say, is three in our mapping? It is, three is in our mapping. Let's look up the
corresponding source. One, is one in our mapping? No, it's not in our mapping. So that means one is the ultimate starting position. That's what we're going to do. We're going to create a
mapping from destination to source. Then we're
going to keep traversing it until we get
to the ultimate source. Once you've gotten there,
that's the first step. We've gotten the
ultimate source. We now need to actually print
out this series of steps. We need to go backwards,
we need to start from the source and keep traversing. We'll start from one,
we'll go to three, then we'll look up three. We see three, we then go to two. We look up two, two is here.
Then we keep doing that. You keep traversing in
the forward direction. You need a set of Hashmaps that give you the reverse direction. Then you need to
set up HashMap that gives you the forward direction. We're going to do that. I'm
going to create two Hashmaps. One HashMap always links
from destination to source. One always links from
source to destination, so that's our algorithm. Now what is the runtime and memory complexity?
Pause the video here. Here's the answer.
Our memory complexity is linear in the
number of steps, because for every single
step we add it to both of these Hashmaps or
both these dictionaries. Now for our runtime
complexity, same thing. It's linear in the number
of steps because we have to access every single one of
these steps at least once, and fortunately
for us only once. Knowing that, let's now actually write up the
function, the code. Pause the video here
and try the code. Let's now cover the solution. Here we're going to initialize two Hashmaps or
two dictionaries. One is going to be forward, one is going to be backwards. For every single
pair of source and destination in our steps, we're going to populate
both of these. The forward mapping will go
from source to destination. The backward mapping will go
from destination to source. Now that we've populated
both these mappings, let's traverse from
any arbitrary source all the way back to
the ultimate source. While the source is in backward, and this I guess is a
little bit unclear, but here I'll just
make it clear. Here we'll have source is
equal to the first step, and then it will be
the first step's source. While source is in backwards
we'll keep traversing. Once this while loop is done, source is no longer in
the backward mapping, that means source is
the ultimate source, the first place that
we ever went to. Then, as long as source is in the forward dictionary or in the forward mapping,
then we'll print it. Here we're going to use
f strings in Python. You can print this
however you want and then we'll
traverse this way. As long as our source is
in the forward mapping, we will access that item in the forward mapping
for its destination. That's pretty much it. Let's now run this function
and make sure that we pass this function, get path from tests. There we go, we only
have one set of failing tests that are
in another function, so we've successfully
passed this exercise. Let's now talk about
this challenge question. This question is also
fairly difficult, so we're only going to talk
about it algorithmically, we're not going to
actually code it up. Again, I do have written
solutions for you, however, we just won't
talk about the code here. I've annotated every single line in the solutions so
that you can go through and understand it and the most important part here
though, is the algorithm. For this problem, we're going to make
a dictionary with a maximum size that operates as a least recently used cache. In this dictionary,
anytime you insert, update or access a key, that counts as using that key, and if the dictionary has more keys than the maximum size, you'll drop the least
recently used key. Let's see an example of this. Let's say I have this
new dictionary and I insert the Value
1 with the key a, so we'll just look
at the keys here, the values don't
matter too much. We've inserted a value for key a and then we insert
a value for key b. At this point, b is the most recently used key and a is the least
recently used key. However, here I access a. Now a becomes the most
recently used key; here we've swapped it. So a is the most recently used and b is
least recently used. Now we assign a new value to the key c and you'll
notice that at this point, b was the least recently
used value right here, so b is dropped and now we
only have a and c left. In this particular dictionary, the maximum size was two. Now, that's how we actually saw one of the keys get dropped. Keep in mind,
anytime you access, update or insert a value for a key that counts as accessing
or using that value, and anytime you insert if
you exceed the maximum size, you lose the least
recently used value. Knowing all of this,
let's now think about how to design
this algorithmically. Pause the video here and see if you can come up
with an algorithm. Here is the solution. To actually implement
this cache, what we're going to do is keep actually a doubly linked list. Before we talk about
a doubly linked list, note that a regular linked list keeps a
link to its next node, so it would look
something like this. You would have one that links
to two, links to three. However, a doubly linked list actually includes
both directions, three knows that its
previous value is two, two knows that its
previous value is one. The reason why you want
this doubly linked list is because we want
the linked list to keep track of the ordering between all of our data and
this ordering will tell us what the most recently used and the least recently
used values are. Fortunately with a linked list, it's very easy to
insert one value and to remove one
value at a time. Those are both constant
time operations. Note that this isn't
true for an array. In an array, if you
wanted to move, let's say three. Let's
say we had this. Let's say we had 1, 2, 3, 4, 5 and let's say that five is now the most
recently used value. Your new list would have
to be 5, 1, 2, 3, 4. With a doubly-linked list, this is a constant
time operation, all you do is reassign the connections like we
talked about before. However, for an array, you would have to actually
copy one into two's position, two into three's, three into four's, four into five's, and then copy
five to the beginning. You'd have to do n copies, n operations in order
to reshuffle this list. However, with a
doubly-linked list, you don't have to
do any of that. It's a constant time operation. This is why we want a
doubly-linked list to combine with our mapping or a HashMap to implement this
least recently used cache. Knowing that we want
a doubly-linked list, how are we going to
actually use it? Let's talk about the two
functions that we actually need. We have two different functions. I'm not going to actually fully flesh out these functions, but I'm going to add some
pseudo code here just so you know what the general structure of the code would look like. Here, we would have
the ability to get an item and we would also have another function here that would give us the
ability to set an item. The basic idea here is that underneath both
of these functions, we would update the key, we would say something like, make most recent key, and
then same thing here. We'll also make most recent key, and that's pretty much it
for the high level logic. Most of the nitty-gritty
would have occurred here. In the make most recent
function right here, you would actually do
all the reassignment needed to both
remove an item from the linked list and
then to reinsert it at the very beginning
and that's it. This is the conceptual
solution to this problem. You would use a
doubly-linked list and you'll also use a map. This seems fairly complex, I'm not going to code out, at least not here in the lesson. I've already written
out the solutions and the solutions
will be available to you with fully
annotated code. However, the most important part of this question was coming up with using the
doubly-linked list. There will be quite
a few questions like this in your interviews
where you have to come up with some novel
combination of different data structures in order to accomplish your goals. In this particular case, we wanted a least
recently used cache, and we want it to
be very efficient, using this doubly-linked list allows us to keep this
function efficient. That brings us to our second
part of the question. What is the runtime and memory complexity
of our algorithm? In particular, what is the runtime your
memory complexity of inserting and then actually
accessing a value? Pause the video here,
and here's the solution. Getting the value
is pretty simple. Getting the value itself, just accessing some value in a HashMap, is O of one,
it's constant time. But the real question is, how long does it take to
update the most recent value? Because that involves
removing from and inserting back into a
doubly-linked list. Well, fortunately, insertion and deletion from a linked
list is constant time. Given that, we know
that this algorithm, this data structure is constant time to
access. That's great. We've maintained
the efficiency of a HashMap and now what
about assignment? What about updating this HashMap? Well, fortunately, that is
also still constant time. It's constant time to update
a HashMap itself and again, to make that key and value
pair the most recent. That operation itself
is again constant time, just like we talked
about earlier. Because that constant
time operation, making something
the most recent is just inserting and removing
from a doubly-linked list. All this to say, we came up with a nice combination of data structures to
keep constant time, insertion and access
for a data structure. That's the most efficient
you can possibly get. I guess the takeaway
here is that knowing the right data
structures can really mean an order of
magnitude difference, an order of growth
difference between an implementation that is
efficient and inefficient. In this case, using a doubly-linked
list is what got us our efficiency, and this is a
useful skill for interviews, maybe less so for on the job, but if you had a massive
production system, if you had a system
critical application that was using inefficient
data structures, then you would have
the opportunity to create some massive
business impact. That's it for this exercise. Again, for access to all these course slides and
to all the solutions, make sure to access
the course website. Finally, that
concludes the HashMap slash mappings practice lesson.
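As a footnote to the challenge question: the annotated solution builds the doubly linked list by hand, but the same behavior can be sketched compactly with collections.OrderedDict, which internally pairs a mapping with a doubly linked list, exactly the combination described above. The class and method names here are illustrative:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, max_size):
        self.max_size = max_size
        self.data = OrderedDict()

    def get(self, key):
        self.data.move_to_end(key)         # accessing a key counts as using it
        return self.data[key]

    def set(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)         # inserting or updating counts as using it
        if len(self.data) > self.max_size:
            self.data.popitem(last=False)  # evict the least recently used key

cache = LRUCache(max_size=2)
cache.set('a', 1)
cache.set('b', 2)
cache.get('a')           # 'a' is now the most recently used key
cache.set('c', 3)        # over capacity, so 'b' is evicted
print(list(cache.data))  # ['a', 'c']
```

Both get and set remain constant time, matching the runtime analysis in the lesson.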
11. Trees: Breadth vs. Depth: In this lesson, we'll discuss a way to store data
hierarchically. A data structure called a tree. We'll explain how a tree works, how to traverse a tree, how to organize a tree, what O(log n) time means. Finally, we'll
discuss the pros and cons of this new data structure. Let's start with what a tree is. A tree consists of a root node. That root node has
two child nodes. Each of those nodes then has
two more child nodes each. This structure is
what we call a tree. In fact, because each
node has two children, we call it a binary tree. Let's say you want to search for a value or we want
to modify this tree, we need to know how
to traverse a tree. Here are two different
ways to achieve that. The first method is called
Breadth First Traversal, or BFS for short. In BFS, we traverse the nodes left to right level by level. First we access three. We finished the first level, so let's move on
to the next level. Then we access 7, then 2, we've finished
the second level. So let's move on
to the last level. We access zero, one, nine and eight. We've finished the last level, so we're done with BFS. Our second way of
traversing this tree is called Depth-First Traversal, or DFS for short. Here we access from
the root node, three, we also access seven. We continue to traverse deeper, so we access zero. Now that we've hit a dead end, we go back to seven, and explore as
deeply as possible. So we access one. Again, we've hit a dead end. So we go back to seven,
go back to three, and again, explore as
deeply as possible. We access two then nine.
We've hit a dead end. So go back to two and
finally access eight. And those are the two
ways to traverse a tree, breadth-first or
depth-first traversal. We'll now explain one way
to organize data in a tree. This is called the
binary search tree. Notice we've
rearranged the values. In particular, we've
arranged in them so that the left child is always
less than the parent. For example, three here is less than five and two
is less than three. We've also arranged
these values so that the right child is always
greater than the parent. For example, eight is
greater than five. So the right child is always greater and the left
child is always less. This ordered structure
allows us to manage data very efficiently. Let's say we want to
insert data into the tree. Here, we want to insert six. To do that, start from
the root node five. From here we'll iteratively
look for a home for six. Since six is greater than five, we know six must
belong to the right. Now we have the value eight. Since six is less than eight, we know six must
belong to the left. There is no node here, so this is now six's home. This concludes the
insertion algorithm. Let's say the depth
of the tree is d, then this algorithm takes
up to d operations. As a result, insertion
takes O of d time to run. To summarize, insertion
takes O(d) time. Let's now see how
deletion works. Say we're deleting
six shown in red. Well, that's pretty simple. Simply delete that node and
the pointer from its parent. However, let's say we deleted from the
middle of the tree. In particular, we
want to delete five. Let's try what we did last
time, just drop five. However, there's now
a hole in the tree. We can use either
child to fill in the gap, three or eight. Here, we'll use the
right child eight. So shift eight up. However, now there's a gap
where eight used to be. To be consistent, we'll use
eight's right child, nine. Again, shift nine up. Now there's a hole where nine used to be, but that's okay. There are no more children. So we can simply delete
the extra pointer. Just like with the
previous algorithm, this algorithm takes
up to d operations. So deletion takes O of d time. To summarize, insertion
and deletion for a binary search tree
both take O of d time. Finally, let's see how search works. Our goal is to search
for the value six. We start from the root node
five, like with insertion, we compare our desired value six with the current node five. Six is greater than
five, so we go right. Now we access eight. We compare, six is less
than eight, so we go left. Now we access six, six is the value
we're looking for. So we're done searching.
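The search we just traced can be sketched in Python. This is a minimal sketch, assuming a simple node class with `value`, `left`, and `right` attributes; the names are mine, not the course's starter code:

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value    # the node's stored value
        self.left = left      # left child holds values less than this node
        self.right = right    # right child holds values greater than this node

def search(node, target):
    """Walk down the tree, going left or right by comparison."""
    while node is not None:
        if target == node.value:
            return node       # found it
        # go left for smaller targets, right for larger ones
        node = node.left if target < node.value else node.right
    return None               # fell off the tree: target is absent

# The tree from the lesson: 5 at the root, 3 and 8 as children, 6 under 8
root = Node(5, Node(3, Node(2)), Node(8, Node(6), Node(9)))
print(search(root, 6).value)  # → 6
print(search(root, 7))        # → None
```

Each loop iteration moves one level down, which is where the "up to d operations" count comes from.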
This algorithm can take up to d operations. So search is O of
d. To summarize, a binary search
tree features O of d modification and access. Note that in a tree there is no such thing as directly
fetching the eighth node, because there's no global
ordering of nodes, and you don't know where any
of the nodes are in memory. As a result, there is no
fetch run-time analysis. This concludes our
analysis of trees. However, we have one more thing to do. We'd like to translate
these run-times. All of our runtimes are
expressed as a function of the tree depth or d. Visualized, here's a tree and here's
the tree's depth. In this case we have
a depth of four. We would like to express
these runtimes in terms of n, the number of nodes, in this case there are 15 nodes. This is because
runtimes for all of our previous data structures were expressed in terms of n, the number of items. As a result, we should express our new runtimes in
n as well to make sure these runtimes are
comparable. Here's the same tree. Let's try to understand
the relationship between d and n. Consider
the first layer. This is Depth 1 with one node. Consider the first two layers. This is Depth 2
with three nodes. The first three layers, this is Depth 3
with seven nodes. Consider all layers. This is Depth 4 with 15 nodes. It may be hard to
notice a pattern. So instead of counting
the number of nodes, I'll count the number
of nodes plus one. Now, if you look at the numbers in red,
do you see a pattern? As you may have noticed,
the numbers in red, the node counts plus one, roughly double with each layer. As a result, we can
state that n is approximately two
to the power of d. Now we have a mathematical
relationship between d and n. Since we want to plug in d into our runtime
complexities, let's solve for d.
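Written out, the algebra for solving the relationship n ≈ 2^d for d looks like this:

```latex
n \approx 2^{d}
\;\Rightarrow\; \log n \approx \log\left(2^{d}\right) = d \log 2
\;\Rightarrow\; d \approx \frac{\log n}{\log 2} \approx \log n
\quad \text{(ignoring the constant } \log 2\text{)}
```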
To solve for d, we'll first rewrite
the equation, then apply logarithms
to both sides. Remember that with logs, we can move the exponent d outside of the log
as a coefficient. Then we simply ignore
the constant log two. So now we get that log of n is approximately d. We
can now plug in. Previously we had these
runtimes expressed in terms of d. Plugging in d equals log n, we now get O of log n for all modification and
access runtimes. Using this analysis,
let's now discuss the pros and cons of
binary search trees. Binary search trees, in sum, have impressive runtimes, just O(log n). In practice, O of log n is very close to O of one, constant time. By design, binary
search trees are ideal for random sequences. The more uniformly
spread out, the better. However, binary
search trees are the worst for sequential
data, and here's why. Let's say I have a sequence
of numbers from 0-10. First we create the root node 0, then we insert one, then two, then three, then four, then five. The list goes on,
and as you can see, we basically have
a straight line instead of a filled
binary search tree. This means that now the
depth is no longer log n, instead the depth is exactly equal to the
number of nodes. Meaning all of our algorithms now take O of n or linear time. So in summary,
binary search trees are really inefficient
for sequential data. In summary, we've
discussed what a tree is, breadth-first versus
depth-first traversal, what a binary search tree is, what the O of log n time means, and some pros and cons
of binary search trees. For a copy of these
slides and resources, make sure to check out
the course website. Let's now see these data
structures in action. In the next lesson,
we'll practice using and implementing
these data structures, tackling both of our learning
objectives for this course.
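Before the practice lesson, here's what the insertion algorithm from this lesson might look like in Python. The `Node` class and names here are illustrative assumptions, not the course's official starter code:

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None    # child with a smaller value
        self.right = None   # child with a larger value

def insert(root, value):
    """Iteratively find a home for `value`, taking up to d steps."""
    node = root
    while True:
        if value < node.value:
            if node.left is None:        # empty spot: this is the new home
                node.left = Node(value)
                return
            node = node.left             # keep descending left
        else:
            if node.right is None:
                node.right = Node(value)
                return
            node = node.right            # keep descending right

# Rebuild the lesson's example, ending by inserting 6 into the tree rooted at 5
root = Node(5)
for v in (3, 8, 2, 9, 6):
    insert(root, v)
print(root.right.left.value)  # 6 ended its descent as 8's left child → 6
```

Note what happens if you feed this sorted input: every insertion goes right, producing the straight-line tree from the pros-and-cons discussion.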
12. Trees Practice: Welcome to the final practice
lesson for this course. In this lesson, we're going
to practice with trees, in particular with breadth
versus depth-first search. Let's go ahead and
get started here. Just like before, navigate to
alvinwan.com/datastructures101/trees. At that link, you'll see
something like this. To fork the project and get your own copy, click on the project name, click on the ellipsis, and click on Fork. I suggest placing your
Skillshare and Replit windows side-by-side so that you can
code along as I'm coding. You should then see a
screen just like this one, go ahead and scroll
down and you'll get to the first exercise right here called bfs or
breadth-first search. So one tip before we get
started is to memorize how to perform breadth-first
and depth-first search, abbreviated here as bfs and dfs, because it will
always be the same. In this particular
case, we're not using a technique
called recursion. So without recursion, the way to do bfs and dfs always
looks like this. Knowing that, let's now try to implement
breadth-first search. Here's what a
breadth-first search traversal of a tree looks like. Here we've got a
tree, 1, 2, 3, 4. Let me go and illustrate
what this looks like. Here you would have one, and then you would
have a child right here that would be two, and then you would have another child right
there, that'd be three. Then you'll also have
a tree like this. This is what that
tree looks like. You would have basically
one at the root, its left child is
going to be two, and then two's left child
is going to be three. Then this is the
right child of one. This is what the
tree looks like. Knowing what that tree looks like, here's what the breadth-first search traversal looks like. Breadth-first traversal
will give us one first, and then it'll give us two, and then four, and then three. This is also called level
order traversal, so here we're basically
moving down levels at a time, 1, 2, 4, 3. If we had a five down here, for example, then we would have 1, 2, 4, 3, 5, and so on and so forth. That's what that looks like. Knowing that this is
now breadth-first traversal, here's a hint. Pause the video now and try this problem on your own if
you don't want the hint. The hint is that you'll need to use one of the data structures that we
talked about before. In particular, you'll
have to use a stack or a queue in order to do this. Go ahead and pause
the video here and try to design the algorithm for a
breadth-first traversal. Now here's the solution. The solution is going to
involve using a queue. What we're going to
do is in this queue, we're going to start
by adding the root. Once we've added the root, then what we're going
to do is take the root and enqueue all the
children for that root. Once we've done that, then we're going to dequeue. Dequeuing is going to give
us the very first item here. Then we're going to enqueue
all of its children. That's going to be
three and whatever else is a child of two. Actually, one way
to do this would be to include a visualization. Here we have one, we're
going to dequeue one, and we're going to add all of
its children two and four. Then we're going to dequeue two and then we're going to
enqueue all of its children, which is just three. Then let's say maybe
there's another one, let's say there's five. Here we would have
three and five. Then we would finally dequeue four and enqueue all of its children, which is nothing, and so on and so forth. The idea is you'll notice that the order of dequeuing
was 1, 2, 4. Then after that, it's going
to be three and five, which is exactly what bfs is. The queue gives us exactly a breadth-first
tree traversal. Knowing that, let's now see what the runtime and memory complexity are.
Pause the video here. Here's the solution.
The runtime and memory complexity
are both going to be O of n. You may be
thinking, what is n here? Well, n is the number of nodes. That's important. Before, we had a case where we defined a variable on our own: we defined a variable as the number of unique values that were being inputted. In this case, by
default for a tree, that variable n is going to be the number
of nodes in the tree. You could also redefine it, you could redefine the
n as the height of the tree or you can define it as something else but usually, n is number of
nodes in the tree. Knowing that, when you report your runtime memory
complexity for a tree, you should report in terms
of the number of nodes, which for us is
going to be o of n, because you have to traverse
over every single node, that's what a breadth-first
traversal is and second, you have to add all of
these nodes to your queue. At some point, actually, sorry I lied a little bit, your queue will never exceed the maximum
width of the tree. We'll see what width
means later on, but we'll just say
for now that O of n is the memory complexity. We can play with that nuance later; I'll explain why there's a better answer. For now, go ahead and pause the video and try
to code this up. Let's write out the solution. Here we're going to
define a queue whose first value is going to be the root. While there's something in the queue, we're going to dequeue. Then, for each branch in the current node's branches, we're going to enqueue that branch. Then finally, we'll
print the current value. This will give us a
breadth-first traversal. Let's now check to see if
this passes our tests. Click the green Run button at the very top, and you can see now that bfs is no longer in our
list of failed functions. You've successfully
passed this test. Let's move on to
the next question. Actually, before we move
on to the next question, one tip is that you should
always use new, hold on. One of the tips is
that you should always use a queue for
breadth-first traversal. You saw that in example here. You saw how to use it.
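The queue-based pattern just described might be sketched like this. I'm using Python's `collections.deque` as the queue and assuming each node exposes `value` and `branches`, matching the description above; treat this as a sketch, not the course's exact starter code:

```python
from collections import deque

class Node:
    """Minimal stand-in for the course's tree node (names assumed)."""
    def __init__(self, value, branches=None):
        self.value = value
        self.branches = branches or []

def bfs(root):
    """Level-order traversal: dequeue a node, enqueue its branches."""
    queue = deque([root])             # the queue starts with just the root
    while queue:
        current = queue.popleft()     # dequeue the oldest entry
        for branch in current.branches:
            queue.append(branch)      # enqueue each child
        print(current.value)          # visit the node

# The tree from the example: 1 has children 2 and 4; 2 has child 3
tree = Node(1, [Node(2, [Node(3)]), Node(4)])
bfs(tree)  # prints 1, 2, 4, 3, one level at a time
```

Swapping `popleft()` for `pop()` turns the queue into a stack, which is exactly the depth-first variant coming up next.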
Keep this in mind. This will pretty much
be always how you implement bfs without recursion. We'll talk about recursion
in a future course. Knowing that, let's now
implement depth-first search. Depth-first search will
look like the following. Let's say we had this tree and this tree is pretty much
the same structure. It's a little different
structure actually. Let me draw this out for you. Here we have the root 1. Its left child is going to be 2 and 2's left child
is going to be 3. Then there is no
other children here. 1's right child is going
to be 4, right here. Then 4's left child
is going to be 5. One thing to note: the reason I keep saying left and right
child is because usually tree problems are built around your binary
tree where you only have one left
and one right child. The way that we've
implemented a tree here, we can have unlimited branches. You're not limited
to a left and right. Technically speaking, 5
isn't a left child of 4, 5 is the only branch of 4. I've just been saying left
and right out of habit, but it doesn't really matter. Point is that here is what
the tree roughly looks like. We have 1 at the root, it's children are 2 and 4 and
then 2's only child is 3, 4's only child is 5. Knowing that, a
depth-first traversal or first go down one
of these paths. In the way that we've
implemented it, we'll first go down this path over here to 4, and then we'll go all the way down
to 5 before coming all the way back up and
exploring other paths. That's what depth-first
traversal will look like. It'll explore all the
way down to a leaf. Again, a leaf is a node in the tree that has no
children, no branches. It'll explore all the
way down first and then it'll come
back up and start to explore other avenues. That's what depth-first
traversal looks like. You see down here it'll go 1, 4, 5, and then it'll go back up
and explore 2 and then 3. That's a depth-first traversal. Let's try to come up
with an algorithm. Pause the video here.
Here's the solution. For depth-first traversal, well, we previously used a queue
for breadth-first traversal. For depth-first traversal,
we're going to use a stack. The stack is going to operate
in a very similar way. We're going to keep adding
items to the stack, and we're going to keep
popping from the stack. Here's what it's
going to look like. We start with 1: we're going to pop 1 and then we're going to
add its children 2 and 4. Now that we've got 2 and 4, we're going to pop the
last item which is 4. Then we're going to add
its children, which is 5. Then we're going to pop 5. You can see where this is going. So far we've gone
through 1, 4, 5. Once we're done with 5,
we're going to pop 2. This will give us our depth first traversal using a stack. Knowing that that's
the algorithm, let's go and talk
about the memory and runtime complexity. What is the runtime and memory complexity? Pause the video here,
and here's the answer. The memory complexity
here is going to be linear in the
number of nodes. The computational complexity is also going to be linear, because we have to iterate through
all of these nodes. Same caveat that I mentioned before for memory complexity, it's actually not linear
in the number of nodes, but we'll talk about
that more later. Now let's go ahead and
try coding this function. Pause the video here
and give it a shot. Let's go ahead and
code up this function. Here in our
depth-first traversal, we're going to start off
by initializing a stack. This stack is going to have
our root node inside of it. Then we're going to iterate. As long as we have
something in the stack, we'll then pop from the stack. Then for every single
branch of our current node, for branch in current.branches, we will then push that branch onto our stack, and then we'll print
our current value. That's pretty much it for
our depth-first traversal. Let's go ahead and
double-check this works. Hit the green "Run"
button at the very top. There you go, dfs is no longer in our list
of failures so we have successfully
passed the dfs tests. The tip here is you should use a stack for
depth-first traversal. Let's now move on to
the next question. For every single
node in the tree, add a new attribute called parent that points
to its parent. Here we're going to use bfs
or dfs for parent assignment. This attribute parent will point to the node's
parent in the tree. Let's go ahead and
draw that tree again. Let's say we have
this tree right here. We have here 1, and its only child is 2
and 2 has two children, which are 3 and 4. It's going to look
something like this and there are no other
branches, just these guys. Now that you have this, we're going to actually
populate parents. We're going to
implement this so that every single node here
points back to its parent. So 2 will point to 1, 3 will point to 2,
4 will point to 2. Knowing that, let's think about this and design an algorithm for this. Pause the video here. Let's talk about the solution. For this question, it
actually doesn't matter whether you use bfs or dfs. In either case, you
can make this work. Basically the algorithm
is going to be, we're going to perform some traversal and
if you look above, we actually iterated over all the children or all the branches for
their current node. While you're doing
that iteration, you'll update this branch to point back to the current
node, which is the parent. It doesn't really matter
which traversal you use. That's the algorithm
and in fact, that's pretty much
the code solution. Go ahead and consider now, what is the memory and runtime complexity?
Pause the video here. Just like before, because we're effectively using DFS and BFS, the runtime and
memory complexity are the same as before. Computational complexity is O of N and memory complexity,
we'll say for now, is basically O of
N. Knowing that, let's now go ahead
and try the code. Pause the video here and
give it a shot first. Now let's go over the solution. I'm going to cheat a little bit. I'm actually just going to copy and paste from the
previous function, which already performed DFS. Let's go ahead and
paste that down below. I'm going to remove
this print statement and then I'm going to assign branch.parent is
equal to current. That's it. Let's go ahead and check that
this function is correct. We're going to hit the
green "Run" button at the very top. There we go. We can see now that populate parents is no longer in
the list of failing tests. We've successfully
passed all these tests. Like I said before, BFS, DFS, as long as you memorize
that implementation, you're going to see
that very often. At least without recursion, this is what it's
going to look like, so make sure you know
that implementation. Tip from earlier, always consider
empty input as well. In this particular case, if our tree was empty, then our function would fail because here we're going to pop, we're going to get current,
current is going to be None, and then we're going to attempt to access .branches on it. Again, for all of your interviews and for
all of your questions, make sure to consider what
would you do with empty input. In this particular case, we should have said
if tree is none, then we'll simply do
nothing, just return. That's important
to keep in mind. Let's move on to
the next question. Here we're going to use
BFS or DFS to compute the maximum value in an
extremely wide but shallow tree. Let's go ahead and
consider the following. Consider what this means for
your choice of traversal. What is the maximum
space complexity of breadth-first search or of depth-first
search. Go ahead. This is actually related to the nuance that we
talked about earlier. Pause the video here and consider first before you
can consider the algorithm. Actually as you're
considering the algorithm. [NOISE] Here's the
next question. We're gonna get the
maximum value from an extremely wide
but shallow tree. The important part of this
question is to consider what an extremely wide
but shallow tree does to your algorithm. Why is this important?
What does this mean for your choice
of traversal? We talked about a subtle
nuance earlier about the space or memory complexity
of your mode of traversal. This question is really
honing in on that, to try and understand what that nuance means
for your algorithm. Pause the video here and see
if you can figure this out. What is the difference between extremely wide but
shallow tree versus a deep tree but very narrow one for BFS or
DFS? Pause the video here. Here's the answer. For an extremely wide
but shallow tree, you're going to want to use DFS. The reason is because BFS
or breadth-first traversal, its space complexity, or the maximum number of
items in the queue or stack or whatever
collection you're using is going to be
the width of the tree. Whereas for depth-first
search or DFS, the maximum space that we'll use is equivalent to
the depth of the tree. If you've got an
extremely shallow tree, then you'll want DFS to minimize
your space consumption. Because if you have
a very shallow tree, you'll have a very short
collection to consider. Knowing that, let's go ahead and design the algorithm
to get the maximum value. Pause the video here and try
to design the algorithm. Here's the solution.
For the algorithm, we're simply going to take the maximum value anytime
you traverse a node. Anytime you traverse a node, compare that node's value to
a globally greatest value. If your current
value is greater, then simply replace it. This is very similar to the maximum frequency example that we talked about earlier. In this case, we are
considering the maximum value. Now that we've considered
the algorithm, we've considered the
mode of traversal, pause the video here
and see if you can come up with the memory and
runtime complexity. Here's the answer. For
runtime complexity, again, it's linear in
the number of nodes, it's O of N. For memory
complexity, well, we talked about this before, which is exactly why we
chose DFS for this problem. Your space complexity or memory complexity is going
to be the depth of the tree. This is not exactly O of N.
It's also, at least for now, not clear exactly what the
depth of the tree would be. We're just going to assign
a new variable to it. We're going to say D is
the depth of the tree, so our algorithm is O of d. Now let's try coding it
up. Pause the video here. Now since we have a
DFS implementation, I'm again going to be lazy. I'm just going to
copy and paste it. Let's scroll back up here, and then we're going to copy
this DFS implementation, and then we're going
to scroll back down and we're going to paste it. Now that we've got this
DFS implementation, we're going to actually
remove the print statement. Like we said before, for every single node
that we consider, we're going to update
the maximum value. Let's say the maximum value in the very beginning is
just the root's value. Then when we explore
every single node, we're going to take
the maximum between the current value and
the greatest value. Finally, we're going to
return the maximum value. Let's try running this
function and make sure it passes all the tests. Go ahead and hit the
green "Run" button at the very top. There we go. We can see now that get maximum is no longer in
the list of failing tests. Let's now move on to
the next question. We're going to combine
data structures to compute the tree width. Let's now see what
the tree width is. The width of the tree is the
maximum number of nodes at any given depth or
any given level. Let's doodle this tree again. Here we've got one and
one has one child, which is just two, and
two has two children, which are three and four. Here we have three, and then here we have four. That's what this
tree looks like. The width of this
tree then is two because this level has
two elements in it. Now if we had another
one right here, let's say we had five, and then we had
another one, six, let's say, then the
width of this tree would be three because you have at most three
on a single level. Knowing that, let's
go ahead and update. Let's discuss the algorithm. Go on and pause the video
here and see how you would compute the width of this tree. Here's a solution. Fortunately for us,
we have a mode of traversal that makes it a lot simpler for us
to compute the width, which is going to be level
order traversal or BFS. There are two ways
that you can do this. One is that you can use
another data structure to actually keep track of how many nodes there
are at a certain depth. The second one is to modify BFS so that you traverse one level
at a time and you keep track of basically
when you start the next level. Those
are your two options. Those are two
different algorithms. We're going to implement
the first one, where we're going to use another data structure, a mapping. Now that we've
talked about this, let's talk about the runtime
and memory complexity. Pause the video here and see
if you can figure it out. Here's the answer. For runtime complexity,
both algorithms, regardless of which
one you picked, are going to be O of N, where n is number of nodes. Both computational
complexities are linear in the number of nodes. Now if you decide to pursue
multiple data structures, where you keep a mapping
from depth to counts, that's going to be an
additional space complexity of O of D in the depth. You're going to be linear
in the depth because you're storing one value for
every single depth. However, if you use
the algorithm where you modify BFS in place, then your algorithm
would be O of width. Both algorithms are
at least O of width, because that's the baseline: that's just how the original BFS works. However, the algorithm that additionally uses a
HashMap to store that mapping from depth
to count is going to be O of width plus O of depth. We're going to implement
the less efficient one, which may seem backwards, but I've already implemented the efficient one
in the solution so you'll be able to check
that if you want to see the more efficient
implementation. But for now, I'm going to implement the
less efficient one. Go ahead and pause
the video here and see if you can come up
with the code yourself. Here's our solution.
Let's go ahead and write. We're going to use our
favorite data structure here. We're going to use a
default dictionary. We're going to start by
initializing a queue. Every single item in this
queue is going to be a depth of the current node in addition to the node itself, because we need to keep track of depth in order to know which count in our counter of
widths to actually update. Here's our counter, widths, and it's going to initialize
all widths to zero. While we have something
in the queue, we're going to dequeue
and then we're going to update the widths dictionary by incrementing
that value by one. Finally, for each branch, we're going to enqueue the depth incremented by
one and the node itself. Anytime you go from
parent to child, we'll increment
the depth by one. Finally, we'll return the
maximum width that we see. This will give us
the maximum width out of all the widths
that we counted up. Let's now run this
and make sure that passes all of our
tests. There we go. Lack of output means that
all of our doc tests passed. You've finally finished all the exercises
for this lesson. For all these slides, full solutions and more
practice exercises, make sure to visit this website. That's it for our
practice for trees, breadth-first versus depth-first traversal. You've seen a bunch
of problems here. You've seen a bunch
of problems in previous practice
lessons as well. I hope that these
were very useful. Let's now move on to
our very last lesson where we'll wrap this all up.
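For reference before the conclusion, the width solution we just walked through, a BFS that carries (depth, node) pairs plus a counter, might be sketched like this in Python. The `Node` class here is my own stand-in for the course's tree node:

```python
from collections import deque, defaultdict

class Node:
    """Minimal stand-in for the course's tree node (names assumed)."""
    def __init__(self, value, branches=None):
        self.value = value
        self.branches = branches or []

def get_width(root):
    """Count nodes per depth with BFS, then return the largest count."""
    widths = defaultdict(int)      # depth -> number of nodes seen at that depth
    queue = deque([(0, root)])     # each queue entry is a (depth, node) pair
    while queue:
        depth, current = queue.popleft()
        widths[depth] += 1         # one more node at this depth
        for branch in current.branches:
            queue.append((depth + 1, branch))  # children sit one level down
    return max(widths.values())

# The tree from the exercise: 1 -> 2 -> (3, 4); the widest level has 2 nodes
tree = Node(1, [Node(2, [Node(3), Node(4)])])
print(get_width(tree))  # → 2
```

The extra `widths` dictionary is exactly the O of depth memory overhead discussed above, on top of the O of width the queue already uses.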
13. Conclusion: Congratulations on
finishing the course. We've covered a ton of
topics, orders of growth, stacks and queues, linked lists, hash maps, trees. That was a pretty
difficult course with a ton of content and some
really brutal questions. You've finished 20
exercises in depth, 20 interview tips, and practiced the three-step
method for solving puzzles again and
again and again. That was a lot of content and the summary slides are
great tools to memorize. However, the most
important takeaway is actually the
three-step process. As far as interviews go, that is the key. Here's the three-step process one one time: design
algorithms without code, reports runtime and
memory complexities, and finally, code the algorithm. Now to really solidify your understanding for three
days after the course, design a data structure
question each day. I also encourage you to post your question in the
discussion section. You just need some sleep in-between thinking
about data structures. It doesn't have to be
your life's best work. Any question will do and post
it in discussion section. I really do look forward to
seeing them and that's it. Fantastic work. This is the
start of your interview preparation and you're
off on a strong footing. Check out my Skillshare for
more courses on data science, machine learning and other important programming concepts. For more interview preparation, follow me on Skillshare for
more updates as we'll cover more commonly used
coding puzzles and interview-specific tricks. Congratulations once
more on making it to the end of the course
and until next time.