Kick Butt in Coding Interviews: Data Structures in Python | Alvin Wan | Skillshare

Kick Butt in Coding Interviews: Data Structures in Python

Alvin Wan, Research Scientist

Watch this class and thousands more

Get unlimited access to every class
Taught by industry leaders & working professionals
Topics include illustration, design, photography, and more


Lessons in This Class

    • 1. Ready? Set. Go. (1:41)
    • 2. Why You Need Data Structures (6:15)
    • 3. What is a "Good" Data Structure? (5:31)
    • 4. Orders of Growth Practice (12:00)
    • 5. Sequences: Stack vs. Queue (6:32)
    • 6. Sequences Practice (39:09)
    • 7. Lists: Array vs. Linked List (7:22)
    • 8. Lists Practice (27:38)
    • 9. Mappings: Hashmaps (6:22)
    • 10. Mappings Practice (29:51)
    • 11. Trees: Breadth vs. Depth (9:14)
    • 12. Trees Practice (25:39)
    • 13. Conclusion (1:32)



583 Students

2 Projects

About This Class

No idea how to get ready for coding interviews? The answer lies in understanding the practical benefits of an age-old computer science 101 topic: data structures.

This class covers a must-know topic for every programmer: data structures. We’ll cover several concepts and takeaways:

  • How to analyze and understand algorithm “efficiency” using orders of growth analysis
  • Commonly-used data structures: stacks, queues, linked lists, trees, hash maps
  • Analysis of data structure algorithms: insertion, deletion, search, and access
  • Practical implications of data structures' strengths and weaknesses
  • 28 in-depth interview-style practice questions
  • 20 interview tips

The class is highly interactive, as we’ll be coding together. By the end of this class, you’ll gain data structures know-how, learning the fundamentals and understanding practical implications. Most importantly, you’ll have interview-style practice under your belt.

Interested in creative coding? Check out my VR101 (AFrame) class.

Interested in data science or machine learning? Check out my Coding 101 (Python), OOP 101 (Python),  SQL 101 (Database Design), Data 101 (Analytics), or Computer Vision 101 (Applied ML) classes.

Meet Your Teacher


Alvin Wan

Research Scientist

Top Teacher

Hi, I'm Alvin. I was formerly a computer science lecturer at UC Berkeley, where I served on various course staffs for 5 years. I'm now a research scientist at a large tech company, working on cutting edge AI. I've got courses to get you started -- not just to teach the basics, but also to get you excited to learn more. For more, see my Guide to Coding or YouTube.

Welcoming Guest Teacher Derek! I was formerly an instructor for the largest computer science course at UC Berkeley, where I taught for several years and won the Distinguished GSI (graduate student instructor) award. I am now a software engineer working on experimentation platforms at a large tech company. 4.45 / 5.00 average rating (943 reviews) at UC Berkeley. For more, see my Skillshare or Webs...

Level: Intermediate



Transcripts

1. Ready? Set. Go.: [MUSIC] Coding interviews are really intimidating. There are difficult questions, tricky puzzles, and even unsolvable problems. There are plans of attack for each scenario, but we'll focus on the most obvious one: when the question is straightforward, where there are no curve balls, no tricks. I want to make sure you ace the interview. Hi. I'm Alvin, a research scientist in industry. This past spring, I received offers from Meta, Apple, Amazon, Tesla and more. Within the past few years, I've also received engineering offers from Google, NVIDIA, Facebook, and Lyft. The experience I've gained from these interviews was invaluable. I really sat down to understand: what are all the pieces that made my interviews successful? What are the fundamentals and the tricks? This course is one step in two important directions: a step forward in interview preparation and a step forward in coding mastery. To accomplish both, we'll learn about data structures. Data structures are a way to organize data, and in this course, you'll see examples, how they work, and how to use them. By the end of the course, you'll have two skills, one for each objective. You'll be able to pick data structures to use in coding interviews. This is a big deal. Instead of inventing efficient algorithms, you'll simply recall ready-made, efficient algorithms. You'll also be able to implement these data structures, combining and mixing implementations as needed. This is a big one as well. Knowing implementations effectively gives you starter code in the interview, a free leg up. I hope you're excited. I know I am. Let's get started. 2. Why You Need Data Structures: Let me explain why you need data structures, in particular, why this course on data structures is so critical to your growth as a coder, and how you can leverage this course effectively.
We'll hit three major talking points: First, we'll discuss the benefits of mastering this topic, second, we'll discuss specific learning objectives in the course to quicken your mastery, and third, we'll introduce an exercise to help you gain mastery after the course is over. Here's our first talking point, why you should care, why I'm teaching this course, the benefits you'll reap. The biggest benefit to taking this course is being more prepared for interviews. There are two types of technical interviews: First, there are coding questions, these are specific problems like, how do you reverse a string? How do you remove all duplicate numbers? Second, there are systems design questions. These are broader questions like, how would you build a clone of Airbnb? Basically, how to design infrastructure and software for a whole product. Both interviews can be challenging, but data structures help you tremendously with both of these technical interviews, and here's how. Here are two specific benefits to knowing data structures. Benefit Number 1 is for your first technical interview, coding interviews. A significant portion, probably over a third of coding questions are related to data structures. If you know your data structures, you can often apply instead of design algorithms in your interviews. This is huge. There's no need to come up with ideas from scratch. If you know data structure A meets exactly these needs, you simply say that, "I know HashMaps are ideal for this use case; here's why and how I would use it." That's a killer interview response. Benefit Number 2 is for your systems design interview, being able to optimize systems for your specific application, both in your interviews and in practice. I'm not talking about a 10 percent or 25 percent faster system, I'm talking about a system that operates orders of magnitudes more efficiently by picking the right database, server, or serverless infrastructure, and more. 
You would say, "We need directed acyclic graphs. Graph databases are optimized specifically for this kind of relation, and here's why." Now that is a well-informed and convincing design decision. Benefit Number 3 is beyond the interview process. With an understanding of data structures, you'll be able to tackle more advanced topics beyond your interviews. In computer science generally, these topics give you new abilities like making a game. One of the most popular game creation tools, Unity, uses what's called a data-oriented programming paradigm. As the name suggests, understanding your data structures is pretty critical. To summarize, here are three benefits to understanding data structures. Being able to apply instead of design from scratch in coding interviews, being able to optimize for the application in systems design interviews, and being able to tackle more advanced topics to open new doors beyond just interviews. In this course, we'll focus on benefit Number 1. In every other lesson, you'll practice applying data structures to coding interview questions. To realize these benefits, you'll need to focus on two learning objectives in this course. Objective Number 1 is to know how each data structure works and practice implementing each data structure. This is key because down the line, you may have to implement parts of these data structures or combine data structures for different purposes. Just like in my other classes, I highly recommend you code alongside me, place this video on one side of your screen and your favorite code editor on the other side. Copy my code exactly, and no need to rush, pause the video whenever you need to. What's more important is that you build muscle memory. Objective Number 2 is to know how and when to use each data structure. Specifically, remember the pros and cons of each data structure. In the next lesson, we'll share specific rubric items you'll judge each data structure by. 
Pros and cons aren't wishy-washy; they're very precise and well-defined. To summarize, your two main learning objectives are, first, to know how data structures work by practicing implementations and coding alongside me, and second, to know how and when to use data structures by remembering the pros and cons we go over. Our last talking point here is how to solidify your data structure understanding beyond this course. Sure, the answer is to practice, but it's not actually what you expect. We have to turn the tables around: pretend you're the interviewer, say you're writing the data structure questions now. Your task is to practice designing data structure questions and do this daily for three days. The goal is to get you thinking about data structures just a few times after a good night's rest. Do this during your downtime, during your commute, while you're brushing your teeth, or while you're waiting in line. Add this to your "do this when I'm bored" list. The more you do this, the more questions you come up with, the better you'll master data structures. To recap, we've discussed the benefits of knowing data structures: better interview readiness and a stepping stone to advanced topics. We've discussed your learning objectives: know how to implement data structures and how to use data structures. We've finally discussed a way to reinforce your knowledge: for three days after the course, design a data structure question each day. I also encourage you to post your question in the discussion section. I really, really mean this. It is incredibly rewarding for me to see the questions you post, either a question you made or a question about the course, so please, please upload. I look forward to it. That's it for the overview. For a copy of these slides and more resources, make sure to check out the course website. We've finished the overview, and next, we'll talk about how to compare data structures. 3. What is a "Good" Data Structure?: What is a good data structure?
This is key for us to understand because one of our learning objectives is to know how and when to use each data structure. But even before we ask that question, we need to first ask: what even is a data structure? We can start with the simple definition. A data structure is simply a collection of data. That sounds straightforward, so let's now expand on that a bit. Let's expand this definition by adding some expectations for a data structure. In particular, we should be able to modify and access the contained data. A data structure is a collection of data that can be modified and accessed. To modify the data structure, we should be able to insert data and remove data. To access the data, we should be able to search and fetch data. These are the four actions that every data structure must support. Each of these actions can be very slow or very fast depending on the data structure. This brings us to our next section, which is how do we quantify very fast or very slow? How efficient or inefficient is each of these actions? In other words, how do we assess efficiency? To assess the efficiency of an algorithm, we will count the number of operations an algorithm takes to execute. For example, let's say we have this list of fruits and say we're searching for avocado. To search the list, we need to traverse the list one by one until we find the item we're looking for. In this case, we took three "operations" to search for avocado. However, if avocado is at the end or if it's not in the list at all, your search algorithm would have to traverse all the items in the list. For a list with n items, searching the list for a value takes up to n operations. Now, the definition of operation is vague. Maybe a memory access is an operation, maybe checking string equality is another operation. Let's say the number of operations needed for a single item is c. Then searching takes cn operations. However, this constant c is uninteresting to us when comparing algorithms.
We simply say the algorithm takes O of n operations, where the O notation hides any constants. We call this O of n statistic the runtime or computational complexity. Here's another way to see what O of n complexity means. What we know is that as the list gets longer, the number of operations, however you count it, grows linearly. This means that our algorithm's computational complexity can be bounded above and below by linear functions. Any algorithm whose complexity falls within these bounds is O of n. As a result, searching over your list has O of n computational complexity. We now know that a list, which is also called an array, takes O of n time to search. To be specific, we search for a value when we don't know where a certain value is in the list. However, let's say we want to fetch exactly the ith item. We know we want the ith item and don't need to search for a value. That fetch takes constant time or O of one, because we simply access the ith item of the list without looking at other values. More specifically, as the list length grows, the effort needed to fetch an item doesn't change. This is great. Fetching an item is cheap. To maximize efficiency, this means that we should use an array when we have to fetch data more often than we search for data. This is one example of a practical takeaway, and we'll see many of these in the course. Each of these decisions aims to maximize efficiency. We should amend our definition of a data structure. A data structure is a collection of data that can be efficiently modified and accessed. Based on each application, we'll need to pick data structures to ensure this efficiency. We'll analyze the complexity of a few data structures in this course. As we perform these analyses, we'll come across several common orders of growth. Orders of growth are the O of something that we saw earlier. We saw two examples of orders of growth already. Constant-time growth O of one and linear growth O of n.
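The search-versus-fetch contrast just described can be sketched in a few lines of Python; the fruit list and the function names here are illustrative, not the course's exact code:

```python
fruits = ["banana", "cherry", "avocado", "mango", "kiwi"]

def search(items, target):
    """Search for a value: O(n), since we may have to check every item."""
    for index, item in enumerate(items):
        if item == target:
            return index
    return -1  # not found, after up to n checks

def fetch(items, index):
    """Fetch the ith item: O(1), no other values are examined."""
    return items[index]

print(search(fruits, "avocado"))  # 2 -- found after three checks
print(fetch(fruits, 2))           # avocado -- a single direct access
```

However long `fruits` grows, `fetch` does the same constant amount of work, while `search` does proportionally more.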
These are six common orders of growth shown here on this slide. But for the data structures and algorithms we're covering, we'll only see four orders of growth: constant O of one, logarithmic O of log n, linear O of n, and quadratic O of n squared. We'll see what O of log n looks like later on, but for now, keep in mind that you have a new tool to analyze algorithmic complexity and that these four are the most common orders of growth you'll find. For a copy of these slides and more resources, make sure to check out the course website. Now, you have a new tool under your belt, orders of growth analysis, which allows you to assess computational complexity for any algorithm. More importantly, it allows you to pick data structures for different applications based on how slow or how fast a data structure is for insertion, deletion, fetching, and searching. In the next section, we'll practice analyzing orders of growth. 4. Orders of Growth Practice: Welcome to some data structures practice. In this lesson, we're going to assess data structures. In other words, we'll actually assess algorithms and data structures using a system called orders of growth. This is what we talked about in the previous lesson. The Big O notation will tell us how quickly or how slowly an algorithm runs based on its input size. We'll put aside interview tactics for now, at least interview-specific tactics. We'll certainly have interview-related tips, but we need to focus on the fundamentals first. Start off by navigating to this URL, alvinwan.com/datastructures101/oog. Once you see this page, fork the project to get your own editable copy. To do this, click on the project name in the top-left to get a drop-down. Then click on the ellipsis in the top-right to get another drop-down. Finally, click on the "Fork" button. Then I suggest placing your Skillshare and Repl.it windows side-by-side as shown here. You should then see a window just like mine on the right-hand side.
Before we actually dive into the problems, let's review the orders of growth that we talked about previously. There are four relevant orders of growth. As a reminder, what I'm writing here is a function of n, where n is the size of the input. These orders of growth will tell us how much slower the function will run if we increase the input size n. As a result, we want runtimes to grow more slowly as the input grows in size. Moving up this list of runtimes is good. The slower the order of growth, the more efficient and preferred the algorithm or data structure is. One tip I have for you is to know these runtimes by heart; these are the four core runtimes that you need to know for every single interview. The first one, O(1) or constant time, means that no matter what the input size is, our function always takes the same amount of time. Second, O(log n), we'll actually talk about later. This is a little bit more of a complicated runtime to explain. The third one, O(n) or linear runtime, means that if our input size doubles, our runtime also doubles. If our input size triples, then our runtime also triples. In other words, our function runtime actually grows linearly with respect to n. O(n²) here means quadratic. If we double the input size, then our function runtime quadruples. If our input size triples, then our runtime will actually grow by nine times, and so on and so forth. Now let's dive right into the exercises. On the right-hand side here, I've got our first exercise. What is the runtime complexity for the following? Here we have just a for loop, for i in range n. Remember that range of n gives you an iterable of n different items. What is the runtime complexity of this function? Pause the video here and try it out, and here's the answer. This function iterates once over every single item in this iterable of n items. In other words, it iterates once for each of the n items, so it will iterate n times.
The runtime for this function or this algorithm is O(n). [LAUGHTER] I forgot to show you. Here is the list of four different runtime options that you have. I'll keep these here as we go through these exercises. Now let's move on to the second exercise. What is the runtime complexity of this function, again, as a function of n? Here we have two for loops. Each one of them iterates up to n. Another way to see this would be to ask how many times is the print function called? Pause the video here and try, and here's the answer. Here we iterate over i n times. Then for every single i, we also iterate over j n times. That makes for a total of n times n runs. This algorithm is O(n²). Let's move on to the next exercise. What is the runtime complexity of this function as a function of n? Here we start off with n. Then in this while loop, for every single step we'll halve that value. What is the runtime complexity of this function? You can use a process of elimination to figure out what the runtime is. Pause the video here and give it a shot, and here's the answer. The answer for this function, which we can deduce by process of elimination, is O(log n). Here's how that process of elimination might work. Here we see that the function is definitely not constant. It definitely takes a little longer anytime n increases. It's definitely not O(1). However, it's not as slow as O(n), because O(n) would iterate over all of the items one by one, up to n. We don't do that. We halve the value every single time. We're just skipping around a little bit. It's not as slow as O(n), and it's not as fast as O(1). The only runtime in between is O(log n). We'll do this more precisely later on to explain how O(log n) comes about. But for now, let's talk about a visual explanation. What does logarithmic growth look like and what does it mean to have logarithmic growth? Logarithmic growth is really, really slow.
To give you an idea of how slow, consider the log function. Remember, n is the input size, and the value of log n is the runtime. Say our input has size two; then the runtime is one unit. Let's say our input doubles to size four; the runtime increases by one. If our input doubles again to size eight, the runtime increases by one. Finally, if our input doubles again to size 16, the runtime still only increases by one. Notice our input size needs to keep doubling to increase the runtime by one. That means our algorithm is pretty efficient. Visually, here's what that looks like. We plot input size along the x-axis, we plot runtime along the y-axis, then plot all the points from earlier. Notice our function grows quite slowly. In fact, here's a bigger plot with more values of x. You can see that our graph becomes flat very quickly. In other words, logarithmic functions grow very slowly. Keep this in mind. Let's go back to our function. We can now see that n needs to double in order to increase the number of while loop iterations by one. If n is two, the while loop iterates once. If n is four, this iterates two times. If n is eight, this iterates three times, and so on and so forth. This value n needs to double every single time to increase the runtime by just one. This is logarithmic growth. The other way to see this is that our logarithmic function halves the input size in each step. As you can see here, we halve i every single step, so we're halving our workload every single step. This is the tip. Logarithmic functions halve the input size in every single step. Here's a summary of the three examples we just went over and their runtimes. That's it for the "basic examples." I don't mean basic as in these are easy. These examples are pretty confusing, especially when you see them for the first time. I say basic to mean that all other examples will be some modification of this basic structure.
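As a sketch, the three basic examples might look like the following in Python; these are illustrative versions (not the exact exercise code from the repl) that count their own iterations, so each order of growth is visible:

```python
def linear(n):
    """One pass over n items: O(n)."""
    count = 0
    for i in range(n):
        count += 1
    return count

def quadratic(n):
    """Nested loops, each up to n: O(n^2)."""
    count = 0
    for i in range(n):
        for j in range(n):
            count += 1
    return count

def logarithmic(n):
    """Halve n at every step: O(log n)."""
    count = 0
    while n > 1:
        n //= 2
        count += 1
    return count

print(linear(8), quadratic(8), logarithmic(8))  # 8 64 3
```

Doubling the input to 16 gives 16, 256, and 4: the linear count doubles, the quadratic count quadruples, and the logarithmic count grows by just one.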
To figure out runtimes in the future, there are two steps. First, start by memorizing what these "basic examples" look like. Second, once you know these examples, you can then memorize the different possible modifications. That's our tip. Now let's move on to those modifications, those tricky examples. These are again modifications of the above examples that either look confusing or change the runtime. Here are the four runtimes from before. Let's now move on to the examples, the tricky examples. How many times is print called in the following function? Here we have a nested for loop. Then we have this if condition, if i is equal to j. Notice this if statement: how does it affect complexity, if at all? Pause the video here and give it a shot. Again, how many times is print called? Here's the answer. Print is only called n times. This is because the body of our inner for loop is only run once per outer iteration. If i is equal to zero, then this if condition only passes if j is also equal to zero. If i is equal to one, again, this if condition only passes if j is also equal to one. That means that for every single outer loop, the inner body only runs once, so that's n times one iterations, a total of O(n). Let's now move on to our next tricky example. How many times is print called as a function of n? Here we have a nested for loop, but notice that the inner for loop only counts up to i. Again, how does that affect runtime, if at all? Pause the video here and give it a shot. Here is now the answer. As it turns out, even though we're only iterating up to i, and it seems like we might have a faster runtime complexity, it turns out we don't. This is still O(n²). Here's the reason why. I'll have to explain by showing you a picture. Here, let's say that we have a normal for loop on the left-hand side. Then for every single value of i, we go through j n times. We'll do this for i equal to one, we iterate through all values of j; for i equal to two, all values of j; and so on and so forth.
Keep doing this. What this means is there are a total of n squared different times that we print. Let's say we print i and j. Now let's say that we have this new for loop where we only iterate up to i. Then what that means is here in this first row, when i is equal to one, we only run j once. When i is equal to two in the second row, we only run j twice. When i is equal to three, we only run j three times, and so on and so forth. Here, instead of n squared total times that we call print, we have 1/2 times n squared times that we call print. However, 1/2 of n squared is still on the order of n squared, which is why this function is still O of n squared in algorithmic complexity. That concludes this tricky example. If you have any questions or if you're still confused, feel free to drop a question in the discussion section. These are the four different topics that we covered: orders of growth, some fundamental examples, logarithmic growth, and some tricky examples, along with how they modified those basic examples. For a copy of these slides and the finished code for this lesson, make sure to check out the course website. That concludes this lesson on assessing data structures. 5. Sequences: Stack vs. Queue: One common format for data is a sequence, and in particular, there are two common types of sequences. There are stacks, like a stack of clothes or a stack of rocks. There are also queues, like a line of people or a tunnel of cars. Once we've analyzed both stacks and queues, we'll then discuss their pros and cons. First up is a stack. Just like with a stack of rocks, you add items to the top of the stack. Then to remove a rock or an item, you also remove it from the top of the stack. We summarize this as last in, first out; the last item you add is the first you remove. The insertion operation according to this logic is simple then. We take the value we want to add, we push that value onto the stack, and that's it.
It takes constant or O of one time to insert a new value onto the stack since no matter how long the stack is, the effort to simply insert one item is the same. Note that by definition, we can only insert values at the top of the stack. The deletion operation is also simple. We take the last value shown here in gray and pop the value. This takes constant or O of one time as well, since the length of the list doesn't change the number of operations deletion needs. Taken together, we now know that stacks feature constant time insertion and deletion, so modification is cheap. But what about access? How expensive is searching the stack for a value? Unfortunately, pretty slow. Let's say we're searching for the value 8. We pop an item, check its value. Is 7 equal to 8? It is not, so we pop another value, then check its value. Is 2 equal to 8? It is not, so we pop once more, then check its value. Is 8 equal to 8? It absolutely is, so our search is complete. However, if there are n items in the stack, we need to perform up to n checks. This means search takes linear or O of n time. Fetching a specific item is similar. Say we want to fetch the third item with the value 8. We pop an item, that's one item, we pop another item, that's two items, then we pop another item, that's three items. We've finally fetched the value we want. Like before, if there are n items on the stack, we may need up to n pops. This makes fetch O of n time. To summarize, we can insert to and delete from a stack in constant time. Modification is efficient. However, search and fetch take linear time. Access is inefficient. But what if instead of a stack of rocks, we had a line of people? The first person to join the line should be the first to leave it. This is called a queue. This time we insert a value at the end of the queue and delete a value from the front of the queue. We call this a first in, first out sequence, or a queue. As we mentioned before, insertion is simple.
We consider a value to add, then enqueue that item into the queue. No matter the length of the list, the simplicity of insertion is not affected. This is an O of one operation. Deletion is similar. We consider a value to remove, pictured in gray, then dequeue that value from the queue. This is also an O of one operation. To summarize, insertion and deletion for a queue are thus both O of one time. Once again, search is quite slow, however. Say we're searching the queue for the value 8. Again, we dequeue, then check: is 7 equal to 8? It is not, so we dequeue again, then check. Is 2 equal to 8? It is not. We dequeue and check. Is 8 equal to 8? It is. Our search is done. However, if there are n items in the queue, we could need up to n checks. This makes search in queues an O of n algorithm. Fetch is again similar. Say we're looking to fetch the 3rd item with the value 8. We dequeue one item, that's 1, we dequeue another item, that's 2, and we dequeue another item to finally retrieve the 3rd item's value, which is 8. For a queue with n items, we may need up to n dequeues, so fetching is O of n. To summarize, a queue is efficient to modify, but inefficient to access, just like with stacks. Let's now analyze the pros and cons of stacks and queues. On the far left, we have the four operations we care about: insertion, deletion, searching, and fetching. In the 2nd column, we have the runtimes for a stack, and in the 3rd column, we have the runtimes for a queue. Unfortunately, the runtimes themselves aren't super interesting, so we have to know which to use based on the application and how each data structure was defined. For example, a stack is ideal for tracking changes, like undoing an operation, then redoing that operation. This is a stack, as the last change added is the first you undo. A queue is ideal for sequential processing, like processing cars in a line at a toll bridge or download requests on a server. Keep this slide in mind as this summarizes the entire lesson.
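As a sketch in Python: the built-in list already behaves as a stack, and `collections.deque` gives O(1) operations at both ends for a queue (a plain list's `pop(0)` would be O(n), which is why deque is preferred). The values here are just illustrative:

```python
from collections import deque

# Stack: last in, first out.
stack = []
stack.append(7)     # push, O(1)
stack.append(2)
stack.append(8)
print(stack.pop())  # 8 -- the last item pushed is the first popped

# Queue: first in, first out.
queue = deque()
queue.append(7)         # enqueue at the back, O(1)
queue.append(2)
queue.append(8)
print(queue.popleft())  # 7 -- the first item enqueued is the first dequeued

# Search is still O(n) for both: a scan over every item.
print(8 in queue)  # True
```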
For a copy of these slides and more resources, make sure to check out the course website. Let's now see these data structures in action. In the next lesson, we'll practice both using and implementing these data structures, tackling both of our learning objectives for this course. 6. Sequences Practice: Welcome to another practice lesson. In this lesson, we'll cover sequences, in other words, stacks and queues. Just like before, we'll put aside nitty-gritty interview tactics; we need to master these fundamentals first. Note that these questions are pretty difficult. This is not a cool application. These are fundamentals that you'd need for interviews, and it's technically challenging material. So just like before, navigate to this URL, alvinwan.com/datastructures101/sequences. Once you see this page, fork the project to get your own editable copy. To do this, click on the project name in the top-left to get a drop-down, click on the ellipsis to get another drop-down, and finally click on the Fork button. Then I suggest placing your Skillshare and Repl.it windows side-by-side as shown here. We'll talk about problems in three different categories. We'll first talk about implementation problems. These are problems that implement core functionality for a data structure. Second is usage: we'll implement some functionality and pick the right data structure to use; that's the key part. We're going to learn how to pick the data structures based on their pros and cons. For the third one, we're going to combine data structures into possibly a new data structure or a new application that requires us to juggle multiple data structures for each of their strengths. Regardless of the category of problem, here's a pro tip that you should always keep in mind. In fact, we're going to follow this pro tip for this entire lesson and all the practice lessons that follow. We're going to conduct all of our puzzle-solving questions in this way.
You should conduct all of your puzzle-solving interviews in this way as well. First, we'll design the algorithm without coding. Verbally or visually design the algorithm with your interviewer. This is so that your interviewer can guide you and your thinking towards the right answer. Second, report the runtime and memory complexity. This is important, first to demonstrate understanding, but also to see if your interviewer wants a more efficient solution. Third and finally, then you code. We'll structure your practice in this way. Design the algorithm, then I'll go over the solution; report the complexity, and then I'll go over the solution; and finally code the algorithm, and then again we'll go over the solution. The idea is that even if you mess up any part of this three-step process, you'll still be able to practice the other steps. Here's our very first question. We're going to implement queues, but using stacks. Before you start though, let me talk about one tip. Remember that a stack is last-in, first-out. What that means is it's like a stack of rocks or a stack of boxes. The last item you put on that stack is going to be the first one you take off. That's a stack. Now a queue is just like a line, a queue of cars, let's say. That means first-in, first-out. So the first car that lined up is the first car that will leave this queue. Keep this in mind. That's what I wrote here: a queue is first-in, first-out; a stack is last-in, first-out. Now, granted, I never remember this first-in, first-out, last-in, first-out. What I remember instead is pretty much that I'm working with a stack, like a stack of rocks, a stack of clothes, a stack of boxes, and then I think of a queue as just a line. If you keep that in mind, that'll start to tell you what direction you're moving in. Now again, the exercise here is to implement a queue using stacks. If you scroll down in your Replit, you'll see this right here. This is what you need to implement.
So pause the video here and first think about the algorithm. How are you going to do this conceptually? So give it a shot. Here's the answer. So we're going to do the following. Say we have this list here in black. We have 9, 8, 2, 7. Nine is our first item. Our last item is seven. To emulate a queue, we need to remove the first item, which is nine, with that red arrow right there. Our goal, shown here in red, is to remove the value nine and to give back or keep the remaining stack, 8, 2, 7. Knowing that that's our goal, the only thing we can do with a stack, though, is remove the last item. We can only pop seven. Well, that's okay. We can keep doing that. We'll pop seven, we'll pop two, we'll pop eight, and finally, we'll pop nine. There we go. We can now return nine. Now we can dequeue from a stack. However, we've lost all the items that were popped off earlier: the 8, 2, and 7. So we need another stack to hold all of them. So here what we're going to do is pop off seven, but push it onto the other stack. Then we're going to pop off two, then eight, and then finally we'll return nine. That's great. We're almost there. However, we need 8, 2, 7 in our stack. Right now we have the reverse, 7, 2, 8. So what we're going to do is pop from the second stack and push it back onto the original stack. So we're going to push that back, and there we go. We have both goals. We're able to return nine as though it was a queue, and we're able to retain the remaining stack 8, 2, 7 as we wanted. Now, this seems like a clever solution. I would probably say to myself, no way I can come up with that solution. If you're thinking the same thing, I'm completely with you, and I'm here to tell you that's not a problem. There's actually no need to come up with that solution. Instead, understand it thoroughly. In your interviews or on the job, simply recall the tactics available to you. So really focus your mental energy on understanding the solution.
If something doesn't make sense, just ask in the discussion section. Knowing that this is the algorithm, let's move on to the second step. What is the algorithmic complexity? What is the runtime complexity, and what is the memory complexity? Pause the video here and try to figure it out. Here's the answer. The algorithmic or runtime complexity here is O(n). We have to perform roughly 2n operations, which is O(n) time. We need to pop off every single item into the second stack, and then we need to pop off every single item from the second stack back into the first. So that is O(n) for runtime complexity. For memory complexity, it is also O(n), simply because we had to keep this other stack, and this other stack required up to n different items. So we have O(n) runtime complexity and O(n) memory complexity. Now, let's try to implement this in code. Again, pause the video here and try it yourself first. Let's now implement this solution. We'll start off by writing a constructor. This constructor will take in an iterable. So here we can see that this constructor takes an iterable, which is a list of just one item. From the iterable, we'll build a stack. Now, in order to enqueue, we'll simply push to the stack. However, when we want to dequeue, or remove from the stack, that's where the special logic comes in. That's the entire algorithm that we've talked about here on the left-hand side. We're going to first start off by emptying the original stack into a new stack. Here we'll have the new stack. Then, while we have items in the old stack, we'll push them onto the new stack and remove them from the old one. Now that we've got all these items out, we're now going to finally pop off the last item. This will give us the value that we need to actually return. Now let's empty the new stack back into the original stack. Remember, that was the last step here, where we had everything in reverse. We need to push them all back into the original stack.
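Putting the narrated steps together, here's a minimal sketch of the whole class. The class and method names here are my own assumptions; your Replit starter code may name things differently:

```python
class QueueViaStack:
    """A queue built on top of a plain Python list used as a stack."""

    def __init__(self, iterable=()):
        self.stack = list(iterable)

    def enqueue(self, value):
        # Pushing onto the stack is O(1).
        self.stack.append(value)

    def dequeue(self):
        # Empty the original stack into a second stack, reversing
        # the order so the oldest item ends up on top.
        new_stack = []
        while self.stack:
            new_stack.append(self.stack.pop())
        # The oldest item is now on top of the new stack.
        value = new_stack.pop()
        # Empty the new stack back into the original stack to
        # restore the original ordering. O(n) time overall.
        while new_stack:
            self.stack.append(new_stack.pop())
        return value
```

For example, `QueueViaStack([9, 8, 2, 7]).dequeue()` returns 9 and leaves 8, 2, 7 behind, matching the walkthrough.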
Now here, while the stack has items, we'll push them back onto the original stack. Finally, we're going to return the value. Here we go. We have implemented our three-step algorithm from before. Let's now check whether or not this function passes our doctests. To do that, hit Run, the green button at the very top of your file. Here we can see that we had failures in all the other functions, but not this one. Remember, our function here is queue via stack, which is missing from this list. We're good to go. We've passed our doctests for this function. Let's move on now to the next question. In our next question, we're going to do the reverse. We're going to implement a stack using queues. Again, here we implemented a queue using stacks. Now, in our next problem, we're going to do the reverse. Go ahead and take a second to pause the video and consider what algorithm you would build to do this. Here's a solution. This is very similar to the last problem. Just like before, we're going to instantiate a second queue. We're then going to take every item in our first queue and push it onto the second queue. Once we get to the last item, we'll then return that last item, just like a stack would. That's pretty much it. Now that we have the algorithm, let's go ahead and think about the memory and runtime complexity of the algorithm. Pause the video again. Now here's the solution conceptually. Here we want to implement a stack; however, we only have a queue. Recall that a queue can only be dequeued from the beginning, whereas in order to emulate a stack, we need to be able to pop from the end. We have this dilemma, but we'll resolve it in the same or similar way as we did before. We'll create a second queue. Then we'll dequeue nine and enqueue it onto the second queue. Then we'll dequeue eight and add it to the second queue, then dequeue two and add it to the second queue. Finally, we'll dequeue seven and return it. Here we've managed to actually "pop" seven.
Here's something new: unlike before, our second queue is already in the correct order. Remember that from before, our queue was 9, 8, 2, 7, and now it's 9, 8, 2. This is the exact, correct ordering. We don't need to move it back to the original queue like we did before. Knowing that, we now know exactly how we're going to implement this algorithm. However, before we do that, let's report the runtime and memory complexity of the function. Pause the video here and try to figure it out. Now here's the solution. The runtime and memory complexity is O of n, just like before. For runtime complexity, we have to dequeue every single item in the list and enqueue it into the second queue, so it's O of n. Now, for memory complexity, we do have to maintain the second queue. The second queue incurs O of n memory, holding up to n items. There you have it. Runtime and memory complexity are both O of n. Now, let's finally move on to the third step. Let's try coding this algorithm. Pause the video here and try to code it. Now, let's go ahead and talk about the solution. Let's implement the stack using queues. Just like before, we're going to create a constructor. This constructor is going to take an iterable. That iterable will be our queue's initial value. Then let's implement push for this stack. This will just be the queue's enqueue call. All the magic will now happen in the pop method. In this pop method, we'll first empty the first n minus one items from the original queue into the new queue. To do that, let's create a queue. This is our new queue. Then, while our old queue has more than one item, we're going to keep removing from it. Here we'll actually dequeue from the old queue and add to the new queue. Now that we've dequeued almost all of them, let's now dequeue the last item from the old queue. This last item would be seven, for example. Then we would return that value.
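A minimal sketch of the stack-via-queues class, using `collections.deque` as the underlying queue. That choice, and the names here, are my own; the exercise may provide its own Queue class:

```python
from collections import deque

class StackViaQueue:
    """A stack built on top of queues that only support
    enqueue (append) and dequeue (popleft)."""

    def __init__(self, iterable=()):
        self.queue = deque(iterable)

    def push(self, value):
        # Pushing is just the queue's enqueue: O(1).
        self.queue.append(value)

    def pop(self):
        # Move the first n - 1 items into a second queue,
        # preserving their order.
        new_queue = deque()
        while len(self.queue) > 1:
            new_queue.append(self.queue.popleft())
        # The one remaining item is the most recently pushed.
        value = self.queue.popleft()
        # The second queue is already in the correct order,
        # so it simply replaces the old one. O(n) time overall.
        self.queue = new_queue
        return value
```

For example, `StackViaQueue([9, 8, 2, 7]).pop()` returns 7 and leaves 9, 8, 2 behind.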
Just like we said before, our new queue has all the elements that we need in the right order. We'll simply replace our old queue with our new one. Now we can return the value. Let's go ahead and try running all of our tests and see what happens. There we go. We can see that our stack via queue is no longer in the list of failures. Tada, we have passed these tests. Let's now move on to the next problem. In the next problem, we're going to use a data structure, either a stack or a queue. I'm not going to tell you which one to use; it will be up to you to decide. In particular, we're going to print all combinations of the letters a and b. We're going to print out all combinations of length k. Here's an example: print_baba of 2 will give us aa, ab, ba, bb. Now, if we call print_baba of 3, then this will give us all three-letter "words" that have the letters a and b. Again, it's all possible combinations. Notice that there are no repeats. Notice that every single value here has length three. Knowing that, let's again go through that three-step process we talked about earlier. In the first step, let's design the algorithm. Pause the video here and try to design the algorithm. Again, you'll want to use either a stack or a queue. Let's now talk about the solution conceptually. In this particular solution, we're going to use a queue. I'm going to denote the queue using just a basic list here first, to visualize what's going on. To start off, let's say I have the letter a in here. What we're going to do is first dequeue the letter a. Now we have a, and we're going to enqueue the only two possible ways to extend the letter a, to go from all possible one-letter sequences of a's and b's to all possible two-letter sequences of a's and b's. We're going to add aa and ab. This will account for all possible ways to create an a and b sequence from this starting point. Now we can make this a little bit more general. We're going to start off with the empty string.
Then we're going to dequeue that empty string. Then we'll add the only two possible ways of extending it, which are adding an a at the end or adding a b at the end. Then we will dequeue a, and then we will add a plus a and a plus b, so aa, ab. Then we'll do this again. We'll dequeue the first item right here, which will be b. Then we'll extend with ba and bb. We'll keep doing this. Now here we will have aa; we dequeue it, and then we'll add aa plus a and aa plus b, so aaa, aab. We'll keep doing this. By stopping at the right point in time, we'll ensure that we've actually covered all possible k-length sequences of a's and b's. Knowing that this is the algorithm, let's now talk about the memory and runtime complexity. Pause the video here and see if you can come up with both of those complexities. Let's talk about the computational and memory complexity. The computational and memory complexity is actually complicated in this particular scenario, but we'll talk through it anyway. Given what you've known so far, you might not be able to come up with this one, but that's completely okay. Again, your goal is just to understand the solution that I'm presenting now. Let's first talk about how many results there are here: all possible three-length sequences of a's and b's. Here we're going to have two options for every single letter, a or b, so 2 times 2 times 2. That means there are 2 to the power of 3 different options here. If we want all possible k-length sequences, we're going to output 2 to the k different responses. Now, in order to produce 2 to the k different responses, we actually had to generate all the two-length and one-length solutions as well. In reality, to generate the 2 to the 3 different options, we had to go through 2 to the 2, and then 2 to the 1 as well. This is true in general.
If we wanted to generate all four-length sequences, we would have generated 2 to the 1, 2 to the 2, 2 to the 3, and 2 to the 4 different combinations, and so on and so forth. In reality, what that means is that this is actually a summation over, let's say, j values, where at most j is equal to k. Here we have this nasty summation of a bunch of terms. However, fortunately for us, order of magnitude analysis only cares about the largest magnitude term. Here we can simply summarize this as O of 2 to the k, since we've been using k. Knowing that, this algorithm is pretty slow. This is the computational complexity. Unfortunately for us, the memory complexity is also the same: O of 2 to the k. This is a pretty slow algorithm. It's also a pretty expensive algorithm, but given what we've said so far, this is one way to do it, and we'll do it this way for now. Go ahead and pause the video here and try to code it up. Now here's a solution. To get started, we're going to create a queue. Just like we talked about before, we're going to add in the empty string. This empty string is the set of all zero-length sequences. From this, we will generate all possible one-length sequences, and so on and so forth. While there is something in the queue, we will keep generating, again and again and again. Let's go ahead and pop off, or dequeue, that first string. If our word here is exactly length k, then we're going to simply print that word, and we're good to go. However, if the word is not yet of length k, then we will extend it. We will extend it with the letter a, and we will extend it with the letter b. There we go. This is it for our function.
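The finished function might look like the following sketch. I also collect the words into a list so the result is easy to check; the course version just prints:

```python
from collections import deque

def print_baba(k):
    """Print all length-k strings made of the letters a and b."""
    results = []  # collected for checking; the course version just prints
    queue = deque([''])  # the empty string: the only zero-length sequence
    while queue:
        word = queue.popleft()
        if len(word) == k:
            # Exactly length k: emit it and don't extend further.
            print(word)
            results.append(word)
        else:
            # Extend with both possible next letters.
            queue.append(word + 'a')
            queue.append(word + 'b')
    return results
```

Because the queue processes shorter words before longer ones, `print_baba(2)` prints aa, ab, ba, bb in that order.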
This will give us all possible k-length sequences, because this ensures that we'll extend with all possibilities, a and b, and this ensures that we'll catch all k-length sequences as they come about. If there's anything longer than a k-length sequence, then, well, it doesn't go anywhere. We won't continue to work with it. In fact, you can actually prove that there will be no sequence longer than k, but that's not super important. The point is, this function should give us what we want. Let's go ahead and try it now. Hit the green "Run" button at the top, and you should see that print_baba will disappear from this list of failures. There we go. Print_baba is now gone from the list of failures. We've passed this test as well. Let's move on to the next question. In this next question, let's now use stacks or queues to print stairs. In particular, print the number of ways that we can climb k stairs if you can take one or two steps at a time. If you have one stair, you can only climb it one way. If you have two stairs, you can climb them two ways: either you take one step at a time, or you take both steps at once. If there are three stairs, then you can either climb 1, 1, 1, or you can climb two steps and then one step, or you can climb one step and then two steps, so it's 1,1,1; 2,1; or 1,2. There are three ways to climb three steps. This function is supposed to compute the number of ways to climb k stairs if you can take one or two steps. Now, before you consider the algorithm, here's a hint. Well, you can pause the video here if you don't want the hint, but the hint is the following: before, our quote-unquote actions were that for every single word not yet of length k, we had two different actions. You could add a or you could add b. Here, you have something similar. If you have not yet climbed all k stairs, you have two different actions. You can climb one step or you can climb two steps.
Knowing that, and knowing that we previously used the queue to represent all possible ways to build up a word, you can also use that queue now to build up all possible ways to climb stairs. Pause the video here and see if you can figure out the algorithm. Now here's a solution. The solution is actually going to mirror the previous solution pretty closely. Let's say we have a queue now. Every single item in the queue will be another way of reaching a certain number of steps. Here's one way to put it. Let's say we have zero. That means this is one way of making it to zero stairs. Then we're going to dequeue that 0. Then we're going to add two items to the queue: you can either take one step or you can take two steps. We'll add 0 plus 1 and 0 plus 2. Then we'll dequeue the first item. Then we'll again enqueue. We'll say there are two possibilities: you can either take one step, 1 plus 1, which is 2, or you can take two steps, 1 plus 2, which is 3. Then we'll dequeue again. We'll dequeue this two. Then we'll say, here is two. You can take one or two steps from that point on. You can take 2 plus 1, which is 3, or 2 plus 2, which is 4, and so on and so forth. You can keep doing this. Basically, what will happen is, if you count the number of times that k occurs in this queue, that will tell you how many ways there are to get to that number of stairs. That's it for the conceptual solution for the algorithm. We're actually going to skip the memory and runtime complexity for this particular problem, just because it is fairly difficult to compute. There is a way to compute it; we're going to skip it for now. I will include a written version of the solution in the final solutions file. In the meantime, let's move on to actually coding this function. Again, pause the video here and try to code the solution. Let's now code up the solution.
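Here is one possible sketch of that breadth-first counting approach. The function name print_stairs matches the exercise; the queue type, `collections.deque`, is my own choice, and note that, like print_baba, this explores exponentially many partial paths:

```python
from collections import deque

def print_stairs(k):
    """Count the ways to climb k stairs taking 1 or 2 steps at a time."""
    queue = deque([0])  # one way to be standing at stair zero
    ways = 0
    while queue:
        stairs = queue.popleft()  # one partial way up the stairs
        if stairs == k:
            # This path reached exactly k stairs: count it.
            ways += 1
        elif stairs < k:
            # From here we can take one step or two steps.
            queue.append(stairs + 1)
            queue.append(stairs + 2)
        # Paths that overshoot k are simply dropped.
    return ways
```

For example, `print_stairs(3)` counts the three paths 1,1,1; 2,1; and 1,2.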
We'll start off by creating a queue right here with just the item zero. We also will initialize the number of ways that we can climb up to k stairs to zero. While we have something in the queue, so while there are partial ways up the stairs, we will dequeue one way up the stairs. Then, if we've made it up to exactly k steps, we will increment the number of possible ways to get up to k steps by one. If we have not yet made it up to k steps, so stairs is less than k, then we will consider ways of getting up there. We'll consider either taking one step, stairs plus 1, or taking two steps, stairs plus 2. Then finally, we'll return the total number of ways to get up to k steps. Now that you've got this, let's go ahead and hit "Run". There we go. We can now see that print_stairs is no longer in our list of failures. We've passed all the test cases. Before we move on to the next question, I did want to bring up this tip. The tip is: solve the core problem, then consider the edge cases. We've been skipping some of the tips here for edge cases, and you can see them if you go back to the slides, but the basic idea is, again, focus on the core problem before you address edge cases. This is not super relevant right now, but as we work through problems, we'll see more and more of these edge cases, and sometimes they can be a distraction. Again, focus on solving the core problem first. We're now going to use stacks and queues to check parentheses. Actually, this is the first problem where we'll see a bunch of random edge cases. Here's the problem. Go and scroll down a little bit in your file, and you'll see our goal is to check if the provided string contains matching parentheses. I've actually already given you a hint here. You should use a stack, but let's go in and scroll down here and see what that means.
Matching parentheses means that for every single opening parenthesis, you have a corresponding closing parenthesis. However, it's not enough for it to just exist; it has to be in the right order. For example, see, here you have an opening parenthesis. This one is just missing a closing parenthesis, but here is a case where you have invalid parentheses. Here you have a closing parenthesis and then an opening parenthesis. These are improperly closed. What this means is that before every single closing parenthesis, you need a corresponding opening parenthesis. That's what it means to match. This is actually a very common problem that you'll see both as an exercise and possibly even in the interview itself. It's good to know and good to understand what the premise of the problem is. Again, we want to check if the provided string contains matching parentheses. Again, the hint is to use a stack. Go ahead and pause the video here and see if you can come up with the algorithm. Let's now talk about the solution. Our solution, like we mentioned earlier in the hint, is to actually use a stack. Let's say we've got this stack right here. This stack is going to represent how many opening parentheses we've seen, or how deep we are in parentheses. Let's say that we iterate through the string and we see this opening parenthesis; we're going to add an opening parenthesis here. Then now we see a closing parenthesis. Anytime we see a closing parenthesis, we're going to pop off from the stack. We popped off from the stack; we moved back up a level; we removed that group. Now that we see another opening parenthesis, we're going to add it back in; we see another opening parenthesis, we're going to add it back in; and then here we see a closing parenthesis, so let's go ahead and remove one; and then we see another opening parenthesis, so let's add that back in. Finally, we're going to close this again. Now we can see that at the very end of the string, we still have one unclosed parenthesis, which is this one. This group is unclosed.
What that means is this is not a matching set of parentheses; we can return false. In essence, this stack actually represents how deep we are in this nested set of parentheses. Anytime you pop off, it means we're back up a level. We have gotten rid of one nested set of parentheses. If we haven't gotten all the way back up and out of the parentheses, then it's not matching. At the same time, if you see something like this, which starts with a closing parenthesis, then at the very beginning of the string the stack is empty, so there's nothing to pop. This will immediately tell us that, again, this is an invalid set of parentheses. That is, conceptually, how we are going to do this. Now, let's go on and talk about what the runtime complexity is and the memory complexity. Pause the video here and try to figure it out. Here's the answer. Both runtime and memory complexity are linear. For runtime, we only iterate over the string one time. Every single item is only seen once. Second, to understand the linear memory complexity, imagine if you had a string that looked like this; then our stack would simply be all of these opening parentheses. We have up to n different items in our stack, which means our memory complexity is O of n. Now let's finally move on to the third step; let's start coding this algorithm. Pause the video here and try to code it. Let's go and code up this algorithm now. Here we're going to create a stack to begin with, and for every single letter in the string, we're going to check: if the letter starts a new set of parentheses, we're going to push onto the stack. Otherwise, we can assume that the letter is a closing parenthesis. This is something that I just wrote earlier. Basically, in an interview, you should ask if the string could contain other characters, like numbers or letters. For simplicity though, we're going to assume that there are only parentheses in the string.
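One possible sketch of the whole function, including a guard for a closing parenthesis that has no matching opener. The function name matches the exercise; the assumption, as stated above, is that the string contains only parentheses:

```python
def is_valid_parentheses(string):
    """Check that every ( has a matching ) in the right order."""
    stack = []
    for letter in string:
        if letter == '(':
            # A new opening parenthesis: one level deeper.
            stack.append(letter)
        else:
            # We assume anything else is a closing parenthesis.
            if not stack:
                # A closer with no matching opener: invalid.
                return False
            stack.pop()
    # Valid only if every opener was closed.
    return not stack
```

For example, `is_valid_parentheses('(()')` returns False because one opener is never closed.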
Here, if it's not an opening parenthesis, it's a closing parenthesis. For every single closing parenthesis, we're going to pop off from the stack. Then finally, if the stack has nothing left, we'll return true. If the stack has something left, it means we haven't closed all our parentheses, so we should return false. Now, knowing this, here is actually a tip from the slides earlier that I didn't mention: consider an empty sequence. What happens when your stack is actually empty? Is there any place in your code where your code would break? In this particular case, yes: stack.pop. Here, we don't check if the stack is already empty, so let's do that. If the stack is already empty, well, what do we do here? If the stack is already empty and we're trying to pop an opening parenthesis, this means we saw a closing parenthesis before we saw the corresponding opening parenthesis. This means, again, the string is invalid, so we should return false. That's it for the edge cases, really. Invalid inputs don't apply here, so that concludes this problem. Let's go ahead and see if it passes all of our tests. Running the code, you'll notice that is_valid_parentheses is not in this list of failures, so we have passed this function's tests. Let's move on to the next problem. Here's the tip: actually consider extraneous characters. That was the tip up here: in an interview, make sure to ask if the string could contain other characters for this particular problem. Now you should try this on your own. Use stacks or queues to check if there are matching parentheses and matching brackets. Here we have parentheses that open and close, and we also have brackets that open and close. This is a valid grouping, because every single pair of parentheses is perfectly matched. But there's an additional challenge. Notice here that if you only consider the square brackets, they are correctly matched.
If you only consider the parentheses, they are also matched. However, you'll notice that this set of square brackets and parentheses are intersecting each other; that is an invalid grouping. Knowing this, let's try to design an algorithm for this problem. Pause the video here and try to see how you would use your data structures, stacks or queues, to solve this problem. Again, you can use the previous problem as inspiration. Pause the video here and try it out. Here's a solution. Just like before, we're going to use a stack, except this time, the stack will contain either opening parentheses or opening brackets. It will always contain the opening punctuation. Knowing that, anytime we see, let's say, a closing parenthesis, we'll try popping off from the stack. In this case, we'll pop off this square bracket. We're currently seeing a closing parenthesis. The closing parenthesis and this opening bracket don't match, so that means this is an invalid grouping. That would occur in this particular case. You would see on the stack something like this: you would see a square bracket, you would see a parenthesis, and then when you pop off, you'll see this opening parenthesis versus this closing bracket. Since those two don't match, it is invalid. Otherwise, the algorithm looks pretty much the same as before. Anytime you see an opening punctuation mark, add it to the stack; anytime you see a closing punctuation mark, attempt to pop off and make sure you're popping off the right opening punctuation. That's it for the algorithm. Again, let's now consider the runtime and memory complexity. Pause the video here and see if you can figure it out. Now, here's the answer. For both runtime and memory complexity, we have O of n. Just like before, for computational complexity, we simply iterate once over every single character, so that's O of n. For memory complexity, we again could have a string with just a bunch of opening punctuation, like this.
If you have a bunch of opening punctuation like this, then your stack is pretty much as long as your input string. Here, our memory complexity is also O of n. Now that we've passed these first two steps, let's try the third step. Let's try coding this algorithm. Pause the video here and see if you can code it up. Let's now try coding up this function. We're going to start off by creating a stack, and then we're going to iterate over the string, just like we did before. If our current character is an opening punctuation mark, then we'll push it onto the stack. Then, if we see a closing parenthesis, double-check that the stack currently has an opening parenthesis on top, and return false if the two don't match. Otherwise, if it's a closing bracket, make sure your stack is currently inside of a bracket; otherwise, return false. Finally, make sure that your stack has closed all possible groups. If you've closed all groups, then your stack would have nothing left in it. All of the corresponding groups would have been popped off. Now that we've got our function, let's go ahead and run it. Click on the Run button at the very top, the green button; that will execute all the tests. Here we can see that is_valid_grouping is missing from our list of failures, so we have successfully passed all these tests. That concludes the exercises in this lesson. One more closing tip: a stack is like a stack of rocks, a stack of clothes, a stack of boxes; the last item you add on is the first item you take off. A queue is just like a line: a queue of cars, a queue of birds, I guess, though I don't think birds queue up. A queue of people, a queue of boxes, any line: the first item you add is the first item that comes off. Again, keep this in mind. That concludes this final tip and this final note. For a copy of these slides and the finished code for this lesson, make sure to check out the course website. Finally, this concludes our sequences practice.
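The bracket-matching routine walked through earlier in this lesson can be sketched as follows. Where the walkthrough checks the two closers with separate branches, this sketch uses a small closer-to-opener mapping, which is my own simplification to the same effect:

```python
def is_valid_grouping(string):
    """Check that parentheses and square brackets are all
    correctly matched and properly nested."""
    closer_to_opener = {')': '(', ']': '['}  # my own pairing table
    stack = []
    for char in string:
        if char in '([':
            # Opening punctuation: push it to record the nesting.
            stack.append(char)
        elif char in ')]':
            # Closing punctuation: the most recent opener must match.
            if not stack or stack.pop() != closer_to_opener[char]:
                return False
    # Valid only if every group was closed.
    return not stack
```

For example, `is_valid_grouping('([)]')` returns False: the intersecting bracket and parenthesis don't nest.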
I've mentioned some interview tips, but don't spend too much mental energy on memorizing those. Let the practice we've been doing get you in the habit of following those three steps. That's the most important part: getting in the habit of doing those three things, so that when you're extremely nervous, when you're freaking out, when you're not quite thinking straight in an interview, you will default to those three steps. That's it. 7. Lists: Array vs. Linked List: Our first category is for lists of items. We'll analyze two data structures in particular: arrays and linked lists. We'll then cover their pros and cons based on our analysis. Let's start by finishing our array analysis. An array is simply a list of items. In memory, the array items are placed right next to each other. Earlier we found that searching an array takes O of n time, since we have to traverse the entire list one item at a time to find the value. We also found that fetching takes O of 1, constant time. Let's now finish this analysis and see how long it takes to insert an item into the array. Say you have this array of three items. We're now visualizing a piece of the memory in your computer. We would like to insert a new item, a value of four. We could in theory just allocate a space for four like this. However, your computer may have already allocated that space to other data. As a result, the insertion operation will allocate a new chunk of memory for your new length-four array, then the algorithm will copy each element one by one. Finally, it will fill in the new item's value. For an array with n items, insertion makes n copies. Insertion is thus an O of n operation. To summarize, insertion takes linear time. Let's see how long deletion takes. Say you have an array of six items, and you would like to delete the 3rd item, the number 2 here. To do that, the deletion algorithm will shift all the numbers after it up.
It will copy 3 forward, copy 4 forward, copy 7 forward, then deallocate the last spot. As a result, for an array of length n, deletion takes n copies. This makes deletion O of n. This completes our table of run-time complexities for the array. You'll notice immediately that the array is very efficient for fetching but inefficient at everything else. Our array is inefficient for modification in particular only because we have to keep all the data contiguous in memory. In other words, all the items must be right next to each other like this. But we can change this. Let's place each item in a list wherever we want in memory. In order to connect all the items together into a list, we need to add links. Each of these links are also called pointers. Each pointer points to a location for the next item in the list. We also need to add a marker that indicates the list has ended. This is what we call a linked list because of all the links we inserted. This linked list comes with some really nice properties. Let's now analyze this linked list. Let's say we want to insert a new value 3 at the end of the list, then we simply allocate two new spots and make sure 2 points to 3. No matter how many items there are in the linked list, we have a constant amount of work to do to insert a new item, allocate two spaces and change one pointer. As a result, insertion for linked lists is O of 1 constant time. The story is similar, even if we want to insert a value in the middle of the list, allocate two new spaces for the new value and a pointer, then point the previous value 8 to 3, then point the new value 3 to 2. There we have it. Insertion is complete. No matter the length of the linked list, we just need to allocate two spaces and change two pointers. This is still constant time O of 1. To summarize so far, a linked list has constant time insertion. Let's see now how removal or deletion works. Say we want to delete 3, the last item. This is simple. 
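The constant-time insertion described above can be sketched with a minimal `Link` class (a stand-in here; the course provides its own implementation):

```python
class Link:
    def __init__(self, value, next=None):
        self.value = value  # the value stored in this link
        self.next = next    # pointer to the next link, or None at the end

def insert_after(link, value):
    """Insert a new value right after the given link: O(1) time.

    Allocate one new link and change one pointer, no matter the list's length.
    """
    link.next = Link(value, link.next)
```

For example, inserting 3 after 8 in the list 9 → 8 → 2 produces 9 → 8 → 3 → 2, with no copying of the other items.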
We just reverse what we did for insertion. Delete the 2nd to last pointer. Now the item 3 is no longer a part of the linked list. To delete the last item, you just need a one-pointer change. This is constant time or O of 1. Deleting an item in the middle of the list is simple too. Say you want to delete 3, the 3rd item, change the pointer for eight to skip over 3. Now, in effect, 3 is no longer a part of the linked list, and that's it. To delete an item, no matter where it is, we simply change one pointer. This is constant time again or O of 1. To summarize so far, a linked list has constant time insertion and deletion. Let's now see how long accessing the linked list takes. Let's say we're searching for the value 4. We need to traverse the entire linked list one by one until we find 4. First, we'll check the 1st item. Is this 4? 9 is not 4 so access the next item. Is this 4? 8 is not 4 so access the next item. Is this 4? 3 is not 4, next, 2 is not 4. So for a linked list of length n, we need up to n checks. Searching a linked list takes O of n time. To summarize, a linked list has constant time insertion and deletion, but linear search time. Unfortunately, fetching items from a linked list is pretty slow too. Let's say we want to check or fetch the 3rd item in the linked list. There is no way to know where the 3rd item is in memory so we need to start from the beginning. We'll access the 1st item, then follow its pointer to the next item. Access the next item, then follow its pointer to the next item. Finally, access this item. Since this is the 3rd item, return its value. For a linked list of length n, we need to traverse up to n nodes. Fetching the ith item from a linked list takes O of n time. To summarize, we've now completed the linked list analysis. Modifying the list, inserting, or deleting is efficient. Accessing the list, searching, or fetching is inefficient. Now that we've analyzed both arrays and linked lists, let's see when to use which. 
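The linear-time search and fetch just described might look like this, again assuming a simple `Link` class with `value` and `next` attributes:

```python
class Link:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def search(link, target):
    """Check each link one by one: O(n) time for a list of length n."""
    while link is not None:
        if link.value == target:
            return True
        link = link.next
    return False

def fetch(link, i):
    """Follow i pointers from the head to reach item i (0-indexed): O(n) time."""
    for _ in range(i):
        link = link.next  # assumes i is a valid index
    return link.value
```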
This is the most important part of this lesson. Even if you forget everything I said before, keep this next slide tucked in your back pocket. This is your takeaway. Here's a summary of the run-time complexities. On the left, we have the four operations we analyzed. In the 2nd column, we have array run-time complexities from before. On the right, we have linked list run-time complexities. For an array, modification is slow, but fetching is fast. This means arrays are ideal for many random accesses into a fixed set of data. For a linked list, access is slow, but modification is fast. This means linked lists are advantageous for their dynamic size, ideal for holding constantly changing data. That's it. This is the summary slide for this lesson. For a copy of these slides and more resources, make sure to check out the course website. Let's now see these data structures in action. In the next lesson, we'll practice both using and implementing these data structures, tackling both of our learning objectives for this course. 8. Lists Practice: Welcome to another practice lesson. Here we're going to practice lists. In particular, we're going to cover mostly linked lists. Just like before, navigate to this URL alvinwan.com/datastructures101/lists. Once you see this page, fork the project to get your own editable copy. To do this, click on the project name in the top left to get a drop-down, click on the ellipsis to get another drop-down, and finally, click on the "Fork" button. Then I suggest placing your Skillshare and repl.it windows side-by-side as shown here. Here are the different subtopics that I want to go through. We're going to start off with the prompt. This is the code that your interviewer will provide you with. Then we'll talk about three categories of prompts: implementation, usage, and then combining different data structures. To start off, on the right-hand side, you should see the code that your interviewer would likely provide you with.
Here, you would have a stack, and you'd have a queue. Well, I guess your interviewer probably would not provide you with a stack and a queue, but you could certainly use those if you needed to. This is the code that you'd be given, basically a linked list implementation. You'll have something that looks like the following. You would have the value for the link and you would also have a connection to the next link. This right here is a utility that I've provided so that you can actually visualize your linked list. This may or may not be provided. But if it wasn't, you could also use this implementation. Let's now start off with our very first exercise. Now, for all of our exercises, including this one, we're going to follow the same three steps that we talked about before. Again, this is very important to keep in mind and to practice. First, we'll design the algorithm without code, verbally or visually. Second, we'll report the runtime and memory complexity. Third, we'll start coding. Here is linked list appending. What we're going to do is append an item to the linked list. So here we'll have, let's say, a linked list that has 1, 2, and 3. We want to append the value 4. Then after appending it, we'll have a linked list that's 1, 2, 3, 4. Now go ahead and pause the video here to determine what the algorithm is going to be, conceptually, what the steps are going to be. Here's the answer. Conceptually, what we're going to do is, again, we only have a pointer to the beginning of the linked list. We're going to follow that pointer just like what's shown here on the left-hand side with this left arrow in black. We're going to start from that pointer, then we're going to navigate to the next link and keep going until we've reached the very end of our linked list. Then at the very end of our linked list, we'll simply create a new link and attach it to the list. That's conceptually what we're going to do.
We're going to traverse to the end of the linked list and then we're going to add a new link. Knowing that now, again, we'll have to consider the second step in our three-step process. What is the memory and runtime complexity of this append algorithm? Pause the video here and see if you can figure it out. Now here's the solution. This algorithm is linear in computational complexity simply because you have to traverse the entire linked list, which could be up to length N. That's O of N computational complexity. For memory complexity, we have O of 1 memory complexity simply because we didn't use any additional memory other than storage for the current item and the existing linked list. We didn't actually create any new data structure. We just have this one new link, which is an O of 1 cost. Knowing that, let's now actually code the algorithm. Again, pause the video here and see if you can code it up. Now here's a solution. Here we're going to traverse to the very end of the list. Here we'll say while link.next is not None. This means as long as we have another link in our current list, we're going to keep traversing until we get to the end. Then finally, after this while loop, this link will be the last element, so it'll be 3. Now that we've got 3, we're going to write link.next is equal to our new link, which contains the value 4. Knowing this, let's now go ahead and run our tests. We can see now that append is missing from this list of failures, so we have passed our append test. Let's now move on to the next question. Now, before we actually move on to the next question, and actually for the next question as well: memorize how to traverse a linked list. It will almost always be the same. You're going to have some while loop, you're going to check to make sure that you're not accessing some NoneType object, then you'll just access the next link. This is how you'll pretty much always traverse a linked list.
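The append solution narrated above, sketched as a standalone function (the `Link` class here is a minimal stand-in for the one the prompt provides):

```python
class Link:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def append(link, value):
    """Append a value to the end of the linked list: O(n) time, O(1) memory."""
    while link.next is not None:  # traverse until we reach the last link
        link = link.next
    link.next = Link(value)       # attach the new link at the end
```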
Keep this in mind because the very next problem is going to use something similar. Go ahead and scroll down in your file and you'll see now here is the insert question. Our goal is to insert after the provided index i. In particular, let's say you had this linked list 1, 2, 3 and we want to insert a link right here after index 0, and we want to insert the value 4. Here we'll insert after index 0 the value 4, so we'll have 1, 4, 2, 3. If you're looking on the left-hand side, we've actually visualized what this will look like to help you out a little bit. Go ahead and pause the video here, and see if you can determine what the algorithm would look like. Now, here's the answer. What we'll do is first traverse the list until we get to the right link to modify. In this case, let's say we're inserting 8 after 9, just like we're inserting 4 after 1. Here we're going to navigate to 9. Once we've reached the value 9, then we'll create a new link. We'll relink 9 to 8 and then we'll relink 8 to 2. There are three steps. Navigate to the correct element, relink the current link to the new link, and then use the new link to connect to the next link. That's the algorithm. Now, knowing that this is the algorithm, go ahead and pause the video here and figure out what the runtime and memory complexity is. The computational complexity is going to be linear. We may have to traverse all the way to the very end of the list to insert something. Memory complexity is constant because the only thing we're creating is this link itself, which is an O of 1 constant memory cost. Knowing that, let's now move on to the code. Pause the video here and try to code it. Now here is a solution. Just like before, we're going to start off by actually navigating, using a while loop. Actually, instead of using a while loop, we're going to use a for-loop because we know exactly how many steps we're going to take. For underscore in range i, link is equal to link.next.
I've used underscore here to denote that we're not going to use that variable. All we care about is that we access .next i times until we get to the right element. Once we're there, we're going to reassign a few things. We're going to first reassign the current link's next attribute to the new link. But then, we also need to connect this new link to the rest of the linked list. We're going to use the original link.next to do that. Again, in this example on the left-hand side, 9 originally pointed to 2. We're now going to point 9 to 8. That's right here: link.next is equal to our new object. Then we're going to point 8 to 2, and that's right here in the second part. We take the original link.next and we substitute it in as the new link's next. That's it. This is our insert function. Go ahead and rerun your tests by hitting the green "Run" button at the top. Now we can see that there is one fewer failure. Insert is not in our list of failures; we have successfully passed the tests. Let's move on to the next problem. Here, one tip that we had for this problem and for future ones is to always check for None values. One common error that you'll get with linked list problems is, for example on this link.next, you'll get "link is a NoneType object; next is not an attribute of a NoneType object." That's super common. When you're debugging, and even before you debug, even before you run the code, as you're talking to the interviewer, make sure you're running through your code and checking for any place where there might be None values, and making sure that you're not calling .next or .value on a NoneType value. In this particular function right here in lines 83 and 84, it's quite possible that i was invalid. Let's say i here was 100. Then we would at some point reach a NoneType value and we'd try to call next on it. To make this function more robust, you'd have to check for that. You'd have to account for that.
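Putting that together with the None-value tip, here is a sketch of the insert function. The index guard is an addition for robustness, not part of the narrated solution, and the `Link` class is again a minimal stand-in:

```python
class Link:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def insert(link, i, value):
    """Insert a value after index i: O(n) time, O(1) memory."""
    for _ in range(i):           # follow .next exactly i times
        if link is None:         # guard: i may be past the end of the list
            raise IndexError('index out of range')
        link = link.next
    # relink: point the current link to the new link,
    # and point the new link to the rest of the list
    link.next = Link(value, link.next)
```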
In this particular simple implementation, we didn't do that, but it's just something to keep in mind: always, always check for None values in your interviews and in any program that you're writing. Let's now move on to the next problem. Our goal now is to implement a stack using a linked list. Here we have a linked list; we have one, two. Our goal is to be able to treat this as though it was a stack. If we call pop on it, it would give us the last item, which is two. If we call push, here we push 2, push 3, then if we pop, we'll get back the last item we pushed, which is three. Pop again, and you'll get the second-to-last item you pushed, which is two. Our goal is to implement a stack but using a linked list in the backend. Go ahead and pause the video here and see if you can figure out the algorithm. In this case, the algorithm is: how are you going to push and how are you going to pop? Pause the video here. Here is the answer. Conceptually speaking, this is actually pretty similar to the beginning. When we say push here, we're going to pretty much write the append function that we talked about previously. When we say pop here, we're going to write a new function, delete. We haven't actually written a delete function yet, but it will pretty much be just like any other function that deletes from a linked list. So now that we know the algorithm, go ahead and pause the video here and see if you can figure out the computational and memory complexity. The solution for this part: the runtime complexity is going to be linear. We're going to traverse all the way to the very end of the linked list and then add to it. In pop's case, we're going to traverse all the way to the very end of the linked list to remove from it. Knowing that, the memory complexity is constant. All you're doing is creating a new link, and so that is an O of one memory cost. You're not creating a whole new list.
You're not creating another data structure that stores n items. You're just creating one link. The memory cost is O of one. Now, pause the video here and try to code it up. Let's now write up the code. So here we're going to implement push pretty much just like we implemented append earlier. We're going to first define a local variable link that is the beginning of our current linked list, and then while link.next is not None, so while there's something else in the linked list, let's traverse it. Then finally, at the very end, we're going to add a new link. That's pretty much it for the push method. Let's now move on to the next one. To remove something, we're going to again set this local variable link to the start of the linked list. Then while we're one away from the end, because we don't actually want the very last link, we want the second-to-last link, since that's going to be our new tail, we're going to again traverse. Then what we're going to do at this point is first grab the value of the very last item before we actually remove it. Here, we want to get the value of the very last link. Then next, we're going to delink it. So now our second-to-last link is pointing to nothing. This is now our brand-new last link. Then finally, return the value. This is how we actually delete from the very end of the linked list. Let's go ahead and click the green button at the very top to run our file and check our tests. There we go. We can now see that our stack-via-linked-list test has passed. It is no longer in our list of failures, so we have passed these tests. Let's move on to the next problem. Before I move on to that problem, we have one more tip. If you want to easily make one of your own questions, pretty much just mix and match any of these. We've talked about three different data structures: linked lists, stacks, and queues. In order to create a new question, just pick one to implement using another.
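A sketch of the stack-backed-by-a-linked-list just described. The empty-stack and single-link guards are assumptions added for completeness; push and pop both operate on the tail, so each is O(n) time here, with O(1) memory:

```python
class Link:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

class Stack:
    """A stack backed by a linked list."""
    def __init__(self):
        self.head = None

    def push(self, value):
        if self.head is None:              # empty list: new link becomes the head
            self.head = Link(value)
            return
        link = self.head
        while link.next is not None:       # traverse to the very end
            link = link.next
        link.next = Link(value)            # append the new link

    def pop(self):
        if self.head.next is None:         # only one link: remove the head itself
            value = self.head.value
            self.head = None
            return value
        link = self.head
        while link.next.next is not None:  # stop at the second-to-last link
            link = link.next
        value = link.next.value            # grab the last value before removing it
        link.next = None                   # delink: second-to-last is the new tail
        return value
```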
Any one of these combinations will give you a brand new problem that you can try. As long as you can pass the same tests that you've written for a stack, except using your brand new data structure, then you're good to go. This goes for anything else. If you have tests for a queue, then just make sure your brand new data structure can pass those tests, and then the tests for a linked list, and so on and so forth. Just mix and match any of these combinations to get a brand new problem. Now, let's implement cycle removal for a linked list. In particular, find and remove a cycle in a linked list if it's present. For now, for simplicity, you can assume all the link values are distinct. This is very important, because the first question you're going to consider when you see "find and remove a cycle" is: how are you going to detect a cycle? One way to detect a cycle would be: if I've seen this value before already, then I know I've hit a cycle. To clarify, let's say you're moving from left to right along this linked list. As you progress, if you suddenly see three twice, then you know that you've hit a cycle. But you can only say that if you know that all the link values are distinct. This is why link values being distinct is a very important question for this problem. I wrote that here: are all values unique? That's going to be one of your first questions for your interviewer. Your second question, and this is a little bit more of an advanced question: you want to ask, is there a max cycle length? This has to do with the implementation, because if you want to keep track of all the values that you've seen so far, and then you want to say, "All right, have I seen this value already?", then you could potentially never see a cycle. This entire linked list could be a million long, and if there is not a cycle anywhere, then you've basically kept this massive store of a million items, waiting to see if any of those million items show up again.
This is why a good question to ask would be: is there a max cycle length? That way you wouldn't have to keep all million items if you know the maximum cycle length is only five. These are good questions to keep in mind for this problem. As you progress and practice these problems, you'll begin to understand what questions to ask. So for now, if you're thinking these are crazy questions out of the blue, with no way to ever think of them, that's okay. As you see more and more of them, you'll get more familiar with what kinds of questions to ask. For now, let's see if you can think of an algorithm. I gave away parts of it, but pause the video here and see if you can come up with a way to remove cycles. And here's the answer. Basically, what you're going to do is keep a collection of all the values you've already seen. If you've seen the value already, then you know that this is a cycle and you can break the cycle by simply reassigning link.next accordingly. Knowing that, that is now the algorithm. What is the runtime and memory complexity? Pause the video here. For computational complexity in particular, you're going to have O of N. We're going to traverse over this entire list one by one. We need to do that in order to figure out if there's a cycle or not. The second question is: what is the memory complexity? For memory complexity, we're going to, again, keep a collection of all the different values that we've seen so far. That's an O of N memory complexity. O of N memory complexity and O of N computational complexity. Knowing that, let's now try to code it up. Pause the video here. Now, let's go ahead and code this up. Here we're going to start off by creating a set of values. If you haven't seen this already, a set is a data structure that allows you to check for membership in constant time.
Meaning that you can do something like `1 in seen`, and this, no matter how long your set is, will always execute in constant time. That's a set. We're going to use the set to track all the values that we've seen so far. While there's something else in our linked list, we're going to check if that next value is in our seen-values set. If it is, then we're going to break the connection with the next link. Again, if the next link's value is in our set of seen values, we're going to break that link and we're going to end here. We're assuming there is only one cycle in our list. Again, that would be a great question to ask your interviewer: will there be more than one cycle? In this particular question we're going to pretend there's only one cycle. Then we have seen.add with link.value right here. What we're doing here is we're saying, all right, we've confirmed that there is no cycle here. Because there's no cycle, let's add the current link's value to our set of seen values. Finally, this is the standard linked list traversal; we're just going to access the very next item and keep going. That's it. Let's go ahead and run this function and make sure that our tests pass. There we go, we can see that remove_cycle is no longer in our list of failed functions. Let's move on to the next question. For the next question, and also for this one, one tip is to always draw your linked lists and check for broken links. What I mean by draw is basically the illustrations that I had before, right here. If need be, make sure to draw these out to understand if you're linking everything in the right ways. I moved pretty quickly through some of these questions because I've already written the solutions and I know them. But basically, if you're given a brand new problem, the odds of making an incorrect link are very high. That's completely okay, as long as you draw it out. If you draw it out, you can work out the correct links to connect. That's one of my tips here.
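The cycle-removal walkthrough can be sketched like this. It assumes, as stated, that all values are distinct and that there is at most one cycle:

```python
class Link:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def remove_cycle(link):
    """Find and break a cycle, assuming distinct values and at most one cycle.

    O(n) time (one pass) and O(n) memory (the set of seen values).
    """
    seen = set()
    while link is not None and link.next is not None:
        if link.next.value in seen:  # the next value was seen before: a cycle
            link.next = None         # break the connection
            return
        seen.add(link.value)         # no cycle yet: record this value
        link = link.next             # standard linked list traversal
```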
Draw your linked list and check for broken links. Now, for this particular problem, we're only going to discuss it conceptually. I've actually written a fully working solution in code for you, but because it's fairly complex, we're only going to talk about this algorithmically. For this question, our goal is to create a new file object by using the data structures that we've already seen. Our file object is going to be merged with other files. I just showed you a part of the description of the solution, but your goal is to create a file object that can efficiently merge with other files. The idea is that there are many of these files. What that means is you don't want to copy all of the data into one massive file. If you do all that copying, it's going to take a very long time. How can we combine a bunch of files without copying them? If you don't want the hint, then go ahead and pause the video here. But one hint that I have for you is: we don't want to copy, so instead of copying, let's store a pointer to the next file. We'll have file 1, and then we'll say, from file 1, we know the next file is file 2. All you're going to do is store a connection, or a pointer. That should sound an awful lot like one of the data structures we just covered and have been using. Now pause the video here and see if you can figure out the algorithm. Now here's the solution. Basically, we're going to use a linked list. Every single file is going to be a link. Each file will then point to the next file and say, that's the next file to traverse. That's how you're going to store this massive merged file. But that's not all; there's a little bit more, because we need to support this merge function. Knowing how we're going to store them is great, we don't have to make copies, but how do you actually merge a new file? Well, to merge a new file, it's just like adding to any other linked list. You pretty much traverse to the end of the linked list and add the new file.
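A hypothetical sketch of this idea. The real course solution is more complex; the `File` class, its `data` attribute, and the `merge` and `read` methods here are all illustrative names, not the actual solution code:

```python
class File:
    """Hypothetical file node: holds its own data plus a pointer to the next file."""
    def __init__(self, data):
        self.data = data
        self.next = None

    def merge(self, other):
        # no copying: traverse to the end of the chain and link the new file
        file = self
        while file.next is not None:
            file = file.next
        file.next = other

    def read(self):
        # read the merged contents in order by following the pointers
        file, chunks = self, []
        while file is not None:
            chunks.append(file.data)
            file = file.next
        return "".join(chunks)
```

Merging never copies the underlying data; it only changes one pointer, just like linked list insertion.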
That's pretty much it. For this problem or this exercise, the only conceptual difficulty is realizing that you have to use a linked list. That was really the point of the problem. Now that we've covered that, we're going to move on, just because I do have a written solution for this with line-by-line annotations for what each line is doing and how it works, but the details are not super important. Let's move on to the next question. Again, another tip here. Linked lists are used for dynamically sized collections. In this particular case, this set of files is a dynamically sized collection. We don't know how big it is, we might add to it a bunch of times, and we might even remove from it a bunch of times. Linked lists are great for those collections because you don't have to copy data anytime you want to add or remove an object. Anytime you want to add or remove a link, it's constant time. However, arrays are perfect for random access. We talked about this a little bit before, but basically, what this means is if you want to access the ith item in an array, you know exactly where it is in memory. You go straight to that part of memory. However, for a linked list, if you want the ith item, as you saw before, you need to traverse the entire list, so it's not quite as efficient. That's the tip. Linked lists are for dynamically sized collections; arrays are for random access. For our last question here, your goal is to combine data structures to find a palindrome in a linked list. A palindrome is a word that is read the same backwards as it is forwards, like racecar. If you flip racecar, you still get racecar. Here, your word is going to be represented as a linked list. Here we have a link that has r in it, and that points to the next link which has a in it, which points to the next link which has c, and then points to e, points to c, points to a, and so on and so forth. Our linked list represents the word racecar.
We want to check if this linked list is a palindrome. Consider what other data structures you may use to see if this linked list is a palindrome. Pause the video here and see if you can figure it out. The conceptual solution for this is pretty challenging: you want both a stack and a queue. As you progress and traverse along this linked list, you add to both the stack and the queue. Then once you've finished traversing the entire linked list, simply pop and dequeue, removing from each of the stack and queue, and then check if each pair of removed items equals each other. That's great, because the stack removes from the end and the queue removes from the beginning, so you're basically moving along the string in reverse and forward at the same time by just removing from the stack and the queue. That's perfect for us, because one way to check if a word is a palindrome is to check if the first character is equal to the last one, the second one is equal to the second-to-last one, and so on and so forth. By removing from the stack and the queue, you get exactly that. Let's pause right here. That's the algorithm; what is the run-time and memory complexity? Here's the answer. For both run-time and memory complexity, we have linear complexity. Memory complexity is linear because we're storing a stack and a queue that could each have up to length n, and we also have linear computational complexity because we're iterating over each of these letters. In fact, we're iterating twice, but it doesn't matter, it's O of n. Now that we know the algorithm and we know the complexity, let's try coding it up. Pause the video here, and try to code it up. Let's now cover the solution. We're going to start off by initializing both a stack and a queue. Now that we've got both of these data structures, let's traverse the linked list, just like we said, and for every single link, we're going to add it to both of these data structures. Here I'm going to add to the stack.
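The stack-and-queue approach, sketched end-to-end. A Python list plays the stack and `collections.deque` plays the queue; both are stand-ins for the course's own Stack and Queue classes:

```python
from collections import deque

class Link:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def is_palindrome(link):
    """Check a linked-list word: O(n) time, O(n) memory for the stack and queue."""
    stack, queue = [], deque()
    while link is not None:          # traverse once, filling both structures
        stack.append(link.value)     # push onto the stack
        queue.append(link.value)     # enqueue onto the queue
        link = link.next
    while stack:                     # compare reverse order against forward order
        if stack.pop() != queue.popleft():
            return False
    return True
```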
I'm going to push onto the stack, and then I will enqueue onto the queue as well. Then I will traverse to the next link. Just like I said before, this traversal code is pretty much always the same. Now, having finished that, let's run through both of these data structures, the stack and the queue, and make sure that every pair of items we remove is exactly the same. While there's something in the stack, pop from the stack and dequeue from the queue. As long as they match, we're good to go. If they don't match at any point, we return false. If nothing returns false, they all match and we can return true. That's it. Let's now go ahead and evaluate this. Hit the green "Run" button at the very top and check that all your tests pass. Here we can see that our palindrome function is nowhere to be found in the list of failures. We have successfully passed that test. The three remaining functions are bonus functions. I've also included fully fleshed out solutions for them, with explanations as well, in the code, so you can try those on your own time. If you have any questions, feel free to post them in the discussion section. That's it for this lesson. For the slides and the fully fleshed out code and solutions, make sure to visit the course website. That's it for lists, both arrays and linked lists. 9. Mappings: Hashmaps: In this lesson, we'll introduce a data structure for mappings. A mapping is what it sounds like, a mapping from one item to another. Here we map cat to the number five. In this case, let's say we're mapping animal to age. Animal is what we call the key in the mapping, and age is what we call the value in the mapping. We can also have additional key-value pairs in the mapping. Notice that this mapping has no sense of order; each key-value pair is completely separate from the others. The data structure that stores this mapping is called a HashMap.
We'll explain how a HashMap works, discuss its pros and cons, then briefly discuss a bonus concept. Here's why this data structure is called a HashMap. At its core, a HashMap uses a function called a hash function. This hash function takes in some value with arbitrary length, like cat, then transforms that input to produce a fixed-size output, like two. This hash function is important, and so is that output, two. Let's see how both of these are used. Say we want to insert some keys and values into a hash table. On the right, we have a hash table where zero is the first slot's memory address, one is the second slot's memory address, and so on and so forth. We want to map the key cat to the value five. To do this, we'll hash the key cat. This gives us the memory address 2, so in position two, place the value five. We'll repeat this for more data. To map dog to three, hash dog. This gives us the memory address 3, so in position three, place the value three. We repeat this one more time. To map pig to four, hash pig. This gives us the memory address zero, so in position zero, place the value four. This process takes just constant, or O of one, time for each key-value pair we insert. To delete from the HashMap, we go through similar steps. To delete cat from the HashMap, first hash cat to obtain two, then at position two, remove the value. This operation is also constant, or O of one, time. To summarize, insertion and deletion for a HashMap are both O of one time. Very fast. Let's say we now want to search the HashMap: which value corresponds to cat? It's almost the same process. Hash cat to obtain its memory address, which is two. Then at position two, grab the value five and return it. As a result, we say searching a HashMap takes constant, again O of one, time. To summarize, searching a HashMap takes constant O of one time. For a HashMap, search and fetch are the same, so we skip analyzing fetch separately.
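A toy version of this process in code. The hash function here is a made-up stand-in (a real HashMap uses a much better one), and collisions are ignored for now, just as in the walkthrough:

```python
TABLE_SIZE = 8
table = [None] * TABLE_SIZE  # the hash table: one slot per memory address

def toy_hash(key):
    """Map an arbitrary-length string to a fixed-size slot index."""
    return sum(ord(c) for c in key) % TABLE_SIZE

def insert(key, value):
    table[toy_hash(key)] = value  # O(1): hash the key, write to that slot

def search(key):
    return table[toy_hash(key)]   # O(1): hash the key, read that slot

def delete(key):
    table[toy_hash(key)] = None   # O(1): hash the key, clear that slot
```

Every operation is hash-then-touch-one-slot, which is why all three are constant time.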
Let's now understand the pros and cons of these HashMaps. HashMaps, in summary, have some very impressive runtimes: constant time modification and access. Due to how HashMaps work, they're ideal for any data with unique identifiers. By definition, HashMaps are not designed for ordered data. As we said before, each of these key value pairs is completely separate. There is no sense of ordering; there is no relationship between separate key value pairs. We've discussed the first two sections already, what a HashMap is and what it's ideal for. We'll discuss the bonus section after wrapping up the core content. For a copy of these slides and more resources, make sure to check out the course website. This concludes our core discussion of HashMaps. If you're happy with this level of detail, feel free to move on to the upcoming lesson with HashMap practice. Otherwise, let's now discuss the bonus section. In particular, we'll discuss a caveat for HashMaps called hash collisions. We previously said that HashMaps are constant time for modification and access. However, this isn't always true. Runtime actually depends on how a HashMap was implemented, and here's why. Sometimes a phenomenon we mentioned before, called a hash collision, occurs. For example, say we want to insert two key value pairs: cat maps to five and dog maps to three. As before, we hash cat to get two. At position two, we then place the value five, just like before. Now, to insert our second key value pair, we hash dog and also get two. At position two, we now have a problem. We can't override the existing data, so we have two options. The first option is called open addressing. In open addressing, we simply find another slot in the HashMap; if this next slot is empty, place your new value there. One way to find an empty slot is to simply traverse the slots until you find an empty one, and there place your value. There are also smarter strategies for slot finding, which we won't discuss here.
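The open addressing idea, traversing slots until you find an empty one (often called linear probing), could be sketched like this. The fixed table size is an illustrative assumption, and resizing and deletion (which needs tombstones in real implementations) are omitted.

```python
class ProbingHashMap:
    """Open addressing via linear probing, as a sketch.
    Assumes fewer items than slots, so a probe always terminates."""

    def __init__(self, size=8):
        self.keys = [None] * size
        self.values = [None] * size

    def insert(self, key, value):
        i = hash(key) % len(self.keys)
        # On a collision, walk to the next slot until we find an
        # empty one (or a slot already holding this same key).
        while self.keys[i] is not None and self.keys[i] != key:
            i = (i + 1) % len(self.keys)
        self.keys[i] = key
        self.values[i] = value

    def search(self, key):
        i = hash(key) % len(self.keys)
        # Probe the same way: the key, if present, is somewhere
        # along this run of occupied slots.
        while self.keys[i] is not None:
            if self.keys[i] == key:
                return self.values[i]
            i = (i + 1) % len(self.keys)
        return None
```

Note that probing is why HashMaps are not always constant time: in the worst case, a lookup scans many occupied slots.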
Another completely different technique for handling hash collisions is called chaining. In this technique, we treat each slot as a data structure of its own. For example, we could treat each slot as a linked list; then to insert into position two, we simply insert three into the linked list. We could also treat each slot as a HashMap; any data structure will do, each with its own trade-offs once more. In summary, we defined a hash collision and also saw two ways of handling a hash collision: open addressing, where we find another empty slot to use, or chaining, where we treat each slot as its own data structure. That concludes our bonus section here on hash collisions. Make sure to check out our next lesson for practice with HashMaps. 10. Mappings Practice: Welcome to the mappings data structure practice. Here, again, we will practice more exercises and go over some interview tips related to the mapping data structure. In particular, these data structures are also called Hashmaps, so we will use that name a lot: Hashmaps. Just like before, navigate to this URL, alvinwan.com/datastructures101/mappings. That will bring you to repl.it. I know we've talked about these instructions several times now; I'm going to go over them one more time just in case. Once you see this page, fork the project to get your own editable copy. Click on the project name in the top-left to get a drop-down. Click on the ellipsis to get another drop-down, and then click on the "Fork" button. Then I suggest placing your Skillshare and repl.it windows side-by-side as shown here. You should then see something that looks like what you have on the right-hand side. Let's now talk about the kinds of exercises that you'll see. You'll have implementation, usage and combination exercises. If you scroll down, you'll see our very first exercise here is to implement a stack. We're going to do this in three steps, just like we've done in all the other lessons.
We're going to first design the algorithm without any code. Then we're going to report the runtime and memory complexity. Then we're going to actually code up the algorithm. Let's apply these three steps to our very first exercise here: implement a stack that limits all values in the stack to only three occurrences. If a value is being pushed that already has three occurrences, just ignore that value. In this particular case, we have an example here. We initialize a stack first and then we push the value one 10 times. However, only the first three ones are actually added to the stack. Everything else is ignored. You can then push values other than one and they will be added to the stack normally. Go ahead and try this out. First, design the algorithm that will actually implement this stack with a limit. Pause the video here and try to figure out an algorithm. Here's the solution algorithmically. What we're going to do is actually use the baseline stack implementation first. You can see here that my capped stack class actually inherits from the base stack class. Knowing that, we're going to impose restrictions on when you can actually push to the stack, and that's it. To do that, we're going to keep a HashMap mapping all the values to their counts in the stack. Anytime you push, you'll increment that count. If that count exceeds the limit, we'll simply ignore the value, and pop will make sure to decrement that count. In push, you check the count, add the value if the count is okay, and then increment the count. In pop, you decrement the count. So that's the algorithm. Now, let's discuss the runtime and memory complexity. Pause the video here and see if you can figure it out. For the runtime complexity, it's whatever the complexity of the original algorithm was. In this particular case, push is linear, and pop, the way we implemented it, is also linear. So both push and pop are linear operations.
Now for the actual memory complexity, what does that look like? Well, we have to create a new counter here. It's not exactly O of n. We do have one entry added to the HashMap for every new value, but that's not as many as the total number of values that we added; we only have the number of unique values in our HashMap. We need to invent a new letter. Instead of O of n, we'll say it's O of k, for example, and we'll define k as the number of unique values that have been added to the stack. If you actually had this in an interview, you would say something like that: we need to define a new variable, k, where k is the number of unique values, and the memory complexity for our algorithm is O of k. That's it for the complexity. Let's now move on to the code. Pause the video here and try to code it up. In the constructor we're going to initialize a new counter. From collections, import defaultdict. This is a different kind of dictionary. What this dictionary does is, if you access a key that doesn't have a current value, it'll initialize that value using this function. In this particular case, for this counter, if, let's say, we access counter 0, and 0 is not currently a valid key for this dictionary, it'll initialize it with this int function. The default for the int function is zero. By default, self counter of 0 will give me 0. In fact, self counter of anything will also give me 0. That's how this default dictionary works, and it will just make some of the code look a little bit simpler. Now that I've got this, let's implement the push function. Like we said before, the first thing we're going to do is check if the current count is under the limit. If it's under the limit, then we'll proceed: we'll increment the count, and then we'll actually push the value onto the stack. In pop, all we'll do is simply decrement the count. First, we'll need the value.
Here we'll pop from the stack, and once we pop that value we'll decrement the count for that value, and then we'll return the value. Let's now see if our capped stack tests pass. At the very top, hit the green button, and there we go: capped stack is now no longer in the list of failures, so we've passed this test. Let's move on to the next question. One tip is that Hashmaps are ideal for data with unique IDs. In this particular case, the premise of the problem is that we're gatekeeping whether or not to push onto the stack by the value, so the value itself is the unique ID. Because we have this unique ID, Hashmaps were perfect for this problem. However, Hashmaps are the worst for ordered data. If there's a relationship between your data, let's say I have a sequence of numbers, one up to 10, and I need to know that ordering. Or another way to put it: let's say I have 10 values and I insert them into a data structure. If I need to remember the order that I inserted them in, Hashmaps are not the data structure to use, because Hashmaps forget all the ordering. All they do is maintain a mapping from keys to values; they don't remember anything about what order you inserted the keys in. So that's important: Hashmaps are ideal for data with unique IDs, but they're not so great for data where ordering is important. For our next question, scroll down to see the prompt: return a histogram of all the unique letters and their numbers of occurrences. I guess this should have gone first, since it's a subset of the previous problem. But anyways, pause the video here and see if you can figure out the algorithm. The solution, conceptually, is to initialize a HashMap as a counter. We're going to iterate over the string, and for every single value, we're going to access our counter. If we've already seen this value, we're going to add one to the count.
If we haven't seen this value before, we will initialize the count to one. So that's the algorithm. What is the runtime and memory complexity? Pause the video here. Here's the answer. The runtime complexity is linear: we have to iterate over every single letter in the string, and there's no way around that. The second one is memory complexity. Again, we need to invent a brand new variable. That variable is the number of unique values, let's say k. This algorithm is O of k, where k is the number of unique values that are in the string. That's it for the complexities. Let's now try to code it up. Pause the video here and try the code. Now let's cover the solution in code. Here we're going to create a counter. This counter is going to be a dictionary. In fact, what I'm going to do is use the defaultdict from before; that way the dictionary initializes values to zero. Here we will have from collections, import defaultdict. Then, for every single letter in the string, we're going to increment the corresponding count, and finally return the counter. Now, I also want to show the implementation without this defaultdict. Let's assume that you didn't use the defaultdict; you just create a normal dictionary. Then here we would check if the letter is in the counter. If we've already seen this letter before, then we'll simply increment by one. If we haven't seen it before, then we'll initialize it to one: counter letter is equal to one. Let's now go ahead and run this to check if unique is now a passing test. There we go: unique has now passed all of its tests, and we've completed this exercise. Let's move on to the next exercise. One tip is to make sure your key exists before accessing it. We did that right here: in line 103, we checked if the letter was in the counter. In other words, we checked if we had seen this letter before and there was already a valid key in our mapping.
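The two histogram variants just described, with and without defaultdict, might look like this as a sketch (the function names are my own, not necessarily those in the repl.it project):

```python
from collections import defaultdict

def histogram(string):
    """Count letter occurrences with a defaultdict: a missing key
    is initialized automatically via int(), which returns 0."""
    counter = defaultdict(int)
    for letter in string:
        counter[letter] += 1   # O(1) per letter, O(n) overall
    return dict(counter)

def histogram_plain(string):
    """Same idea with a plain dict: make sure the key exists
    before accessing it."""
    counter = {}
    for letter in string:
        if letter in counter:
            counter[letter] += 1   # seen before: increment
        else:
            counter[letter] = 1    # first time: initialize to one
    return counter
```

Both run in O of n time and use O of k memory, where k is the number of unique letters.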
Knowing that, let's move on to the question that comes after: use a data structure to find the most frequent value in a list of whole numbers. I've specified whole numbers to make this a little bit simpler, but there are several questions you'd want to ask your interviewer. The first question: in a real interview, they most likely would not specify whole numbers; they would most likely just say numbers. Then you'd want to ask: can there be negative values? Can there be decimal values? What is the range? What is the set of possible values that I might see? The second question that you would want to ask is: what if there's a tie? The question itself is not well-defined if there's a tie. In the event of a tie, this problem description says to report any of the numbers tied for the greatest frequency. That's usually what will happen, but these are good questions to ask, and they give you some brownie points in your interview. In this particular question, we have a list of numbers, and I want to figure out what the most frequent one is. Pause the video here and see if you can figure out the algorithm. Here's the solution. We're going to do something very similar to before: use a HashMap to store the counts for every single value in our list. Once you've got that counter, you then use the counter to figure out what the greatest one was. That'd be one way to do it. Another way to do this, one that doesn't require two passes over the data, would be to keep a variable that represents the most frequent value as you're adding to the counter. If you see a value that all of a sudden has more occurrences, then replace that variable. You'll see this in action in just a second. But conceptually, the main idea is to keep a counter of all the unique values and then use that counter. Now that we know the algorithm, pause the video here and see if you can figure out the runtime and memory complexity. Here's the answer.
Just like before, we have the exact same runtime and memory complexity as the previous functions. The runtime complexity is going to be linear: you have to traverse over every single item in the list, and there's no way around that. The memory complexity is O of k, where k is the number of unique values. That's becoming a motif now, I guess, but at least for these exercises, O of k is the correct answer: linear in the number of unique values. Now that we've covered the complexity and the algorithm, pause the video here and try the code. Here's the solution. Let's write this out now. We're going to start off with counter, and counter is going to be an empty dictionary. The greatest value is going to be none, let's say. For value in numbers, we're going to check if the value is in the counter. Actually, I'm going to change this up a little bit: if the value is not in the counter, then we'll initialize it to zero. Regardless of what its count was, we'll increment by one. If the current count is greater than the greatest count, then we'll reassign, and then we'll return the greatest value. Before running this function, you should keep in mind one of the tactics that I will bring up in a future course: always virtually execute your code. What I mean by that is to run the code in your head as though you were the interpreter. If you catch any bugs before running the code, that's usually another bonus point for yourself. In this particular case, we would actually see an error here because of counter of greatest: greatest here is none, and none does not yet exist in the counter. This goes back to our previous tip: always ensure that the key exists before you try to access it. In this particular case, I'm going to add the none key right here, and I'm going to set it to negative one.
That way, every single other value with a count will be greater than this one, so in the very first iteration, we'll replace the greatest value with the current value. That's it for this function. Let's now try executing the tests and make sure that most frequent no longer shows up as a failed test. Click the green "Run" button at the very top. There you go: you can now see that most frequent is no longer a failing test. Here we want to consider empty input. There is one case right here where you might have something strange going on: if you passed an empty list into most frequent, you would simply get back none, which is a reasonable default. You should just make sure that your function doesn't break on empty input. In this case, our function does not break; it returns none. So as long as we declare that the function will return none upon empty input, then that's perfectly valid. Now for our next question. This one is pretty challenging conceptually. We're going to use a data structure, or any data structures, to sort a bunch of travel steps. The steps are formatted as source and destination; you can think of these as source and destination cities. Once we have that, we want to compute the complete path: we want to find the real starting position and the real, or ultimate, ending position. You can assume for now that there is only one path and that the path is complete. That means there are no gaps: you'll never end up at some random position in the middle with a bunch of unrelated steps. There are also no cycles: you'll never go from one to two back to one. If you go from one to two, you will always proceed to three, four, and so on and so forth. Your goal is to print the sequence of steps in order from first to last. Here we have a bunch of steps: we went from three to two, one to three, and two to five. Your goal is to find out what the correct sequence of steps is. Here the correct sequence is one to three, then three to two, then two to five.
Pause the video here, and see what data structures you want to use and how you would use them. Here's the solution. Conceptually, what we're going to do is first find out what the ultimate starting position is. In order to do that, we're going to create a HashMap. This HashMap will map from the destination to the source. Then we'll start from any arbitrary destination. From that destination we'll check: is the corresponding source in the HashMap? In other words, is the current source another step's destination? If so, then we go to that step's source, and we keep doing this. Let's say, for example, we start from two. We look up its source, three, and ask: is three in our mapping? It is; three is in our mapping. Let's look up the corresponding source: one. Is one in our mapping? No, it's not. So that means one is the ultimate starting position. That's what we're going to do: create a mapping from destination to source, then keep traversing it until we get to the ultimate source. Once we've gotten there, that's the first step done; we've found the ultimate source. We now need to actually print out this series of steps. Now we need to go forwards: start from the ultimate source and keep traversing. We'll start from one and go to three; then we'll look up three and go to two; then we look up two, and we keep doing that. To keep traversing in the forward direction, you need a HashMap that gives you the forward direction, in addition to the HashMap that gives you the reverse direction. So we're going to create two Hashmaps: one HashMap always links from destination to source, and one always links from source to destination. That's our algorithm. Now what is the runtime and memory complexity? Pause the video here. Here's the answer. Our memory complexity is linear in the number of steps, because every single step is added to both of these Hashmaps, or both of these dictionaries.
Now for our runtime complexity: same thing. It's linear in the number of steps, because we have to access every single one of these steps at least once, and fortunately for us, only once. Knowing that, let's now actually write up the code. Pause the video here and try the code. Let's now cover the solution. Here we're going to initialize two Hashmaps, or two dictionaries. One is going to be forward, one is going to be backward. For every single pair of source and destination in our steps, we're going to populate both of these. The forward mapping will go from source to destination; the backward mapping will go from destination to source. Now that we've populated both of these mappings, let's traverse from any arbitrary source all the way back to the ultimate source. While the source is in backward, and this, I guess, is a little bit unclear, so here I'll just make it clear: source is equal to the first step's source. While source is in backward, we'll keep traversing. Once this while loop is done, source is no longer in the backward mapping; that means source is the ultimate source, the first place that we ever started from. Then, as long as source is in the forward dictionary, or the forward mapping, we'll print it. Here we're going to use f-strings in Python; you can print this however you want. Then we'll traverse forward: as long as our source is in the forward mapping, we will access that item in the forward mapping to get its destination. That's pretty much it. Let's now run this and make sure that this function, get path, passes the tests. There we go: the only failing tests remaining are in another function, so we've successfully passed this exercise. Let's now talk about this challenge question. This question is also fairly difficult, so we're only going to talk about it algorithmically; we're not going to actually code it up.
Again, I do have written solutions for you; however, we just won't talk about the code here. I've annotated every single line in the solutions so that you can go through and understand it. The most important part here, though, is the algorithm. For this problem, we're going to make a dictionary with a maximum size that operates as a least recently used cache. In this dictionary, anytime you insert, update or access a key, that counts as using that key, and if the dictionary has more keys than the maximum size, you'll drop the least recently used key. Let's see an example of this. Let's say I have this new dictionary and I insert the value 1 with the key a; we'll just look at the keys here, since the values don't matter too much. We've inserted a value for key a, and then we insert a value for key b. At this point, b is the most recently used key and a is the least recently used key. However, here I access a. Now a becomes the most recently used key; we've swapped the order, so a is the most recently used and b is the least recently used. Now we assign a new value to the key c, and you'll notice that at this point, b was the least recently used key, so b is dropped and now we only have a and c left. In this particular dictionary, the maximum size was two, and that's how we saw one of the keys get dropped. Keep in mind, anytime you access, update or insert a value for a key, that counts as using that key, and anytime an insertion exceeds the maximum size, you lose the least recently used key. Knowing all of this, let's now think about how to design this algorithmically. Pause the video here and see if you can come up with an algorithm. Here is the solution. To implement this cache, what we're going to do is keep a doubly linked list. As we talked about before, a linked list keeps a link to its next link, so it would look something like this.
You would have one that links to two, which links to three. However, a doubly linked list actually includes both directions: three knows that its previous value is two, and two knows that its previous value is one. The reason we want this doubly linked list is that we want the linked list to keep track of the ordering between all of our data, and this ordering will tell us what the most recently used and the least recently used values are. Fortunately, with a linked list, it's very easy to insert one value and to remove one value at a time. Those are both constant time operations. Note that this isn't true for an array. In an array, if you wanted to move a value: let's say we had 1, 2, 3, 4, 5, and let's say that five is now the most recently used value. Your new list would have to be 5, 1, 2, 3, 4. With a doubly-linked list, this is a constant time operation; all you do is reassign the connections, like we talked about before. However, for an array, you would have to actually copy 1 into 2's position, 2 into 3's, 3 into 4's, 4 into 5's, and then copy 5 to the beginning. You'd have to do n copies, n operations, in order to reshuffle this list. With a doubly-linked list, you don't have to do any of that; it's a constant time operation. This is why we want a doubly-linked list to combine with our mapping, or HashMap, to implement this least recently used cache. Knowing that we want a doubly-linked list, how are we going to actually use it? Let's talk about the two functions that we actually need. We have two different functions. I'm not going to fully flesh out these functions, but I'm going to add some pseudocode here just so you know what the general structure of the code would look like. Here, we would have the ability to get an item, and we would also have another function here that would give us the ability to set an item.
The basic idea here is that underneath both of these functions, we would update the key: we would say something like, make most recent key, and the same thing here; we'd also make most recent key. That's pretty much it for the high level logic. Most of the nitty-gritty would occur in the make most recent function right here, where you would actually do all the reassignment needed to both remove an item from the linked list and then reinsert it at the very beginning. That's it. This is the conceptual solution to this problem: you would use a doubly-linked list and you would also use a map. This seems fairly complex, so I'm not going to code it out, at least not here in the lesson. I've already written out the solutions, and they will be available to you with fully annotated code. However, the most important part of this question was coming up with the idea of using the doubly-linked list. There will be quite a few questions like this in your interviews, where you have to come up with some novel combination of different data structures in order to accomplish your goals. In this particular case, we wanted a least recently used cache, and we wanted it to be very efficient; using this doubly-linked list allows us to keep this data structure efficient. That brings us to the second part of the question: what is the runtime and memory complexity of our algorithm? In particular, what is the runtime and memory complexity of inserting and then actually accessing a value? Pause the video here. Here's the solution. Getting the value is pretty simple. Getting the value itself, just accessing some value in a HashMap, is O of one; it's constant time. But the real question is, how long does it take to update the most recent value? Because that involves removing from and inserting back into a doubly-linked list. Fortunately, insertion and deletion from a linked list are constant time. Given that, we know that this data structure is constant time to access.
That's great; we've maintained the efficiency of a HashMap. Now what about assignment, or updating this HashMap? Fortunately, that is also still constant time. It's constant time to update the HashMap itself and, again, to make that key and value pair the most recent. That operation is again constant time, just like we talked about earlier, because making something the most recent is just inserting into and removing from a doubly-linked list. All this to say, we came up with a nice combination of data structures to keep constant time insertion and access for a data structure. That's the most efficient you can possibly get. I guess the takeaway here is that knowing the right data structures can really mean an order of magnitude difference, an order of growth difference, between an implementation that is efficient and one that is inefficient. In this case, using a doubly-linked list is what got us that efficiency, and this is a useful skill for interviews, maybe less so on the job. But if you had a massive production system, a system critical application that was using inefficient data structures, then you would have the opportunity to create some massive business impact. That's it for this exercise. Again, for access to all the course slides and all the solutions, make sure to access the course website. Finally, that concludes the HashMap slash mappings practice lesson. 11. Trees: Breadth vs. Depth: In this lesson, we'll discuss a way to store data hierarchically, with a data structure called a tree. We'll explain how a tree works, how to traverse a tree, how to organize a tree, and what O(logn) time means. Finally, we'll discuss the pros and cons of this new data structure. Let's start with what a tree is. A tree consists of a root node. That root node has two child nodes. Each of those nodes then has two more child nodes each. This structure is what we call a tree.
In fact, because each node has two children, we call it a binary tree. If we want to search for a value or modify this tree, we need to know how to traverse a tree. Here are two different ways to achieve that. The first method is called Breadth First Traversal, or BFS for short. In BFS, we traverse the nodes left to right, level by level. First we access three. We've finished the first level, so let's move on to the next level. Then we access 7, then 2; we've finished the second level, so let's move on to the last level. We access zero, one, nine and eight. We've finished the last level, so we're done with BFS. Our second way of traversing this tree is called Depth-First Traversal, or DFS for short. Here we access the root node, three, then we access seven. We continue to traverse deeper, so we access zero. Now that we've hit a dead end, we go back to seven, and explore as deeply as possible, so we access one. Again, we've hit a dead end, so we go back to seven, go back to three, and again explore as deeply as possible. We access two, then nine. We've hit a dead end, so go back to two and finally access eight. Those are the two ways to traverse a tree: breadth-first or depth-first traversal. We'll now explain one way to organize data in a tree. This is called the binary search tree. Notice we've rearranged the values. In particular, we've arranged them so that the left child is always less than the parent. For example, three here is less than five, and two is less than three. We've also arranged these values so that the right child is always greater than the parent. For example, eight is greater than five. So the right child is always greater, and the left child is always less. This ordered structure allows us to manage data very efficiently. Let's say we want to insert data into the tree. Here, we want to insert six. To do that, start from the root node, five. From here we'll iteratively look for a home for six.
Since six is greater than five, we know six must belong to the right. Now we have the value eight. Since six is less than eight, we know six must belong to the left. There is no node here, so this is now six's home. This concludes the insertion algorithm. Let's say the depth of the tree is d; then this algorithm takes up to d operations. As a result, insertion takes O of d time to run. Let's now see how deletion works. Say we're deleting six, shown in red. Well, that's pretty simple: delete that node and the pointer from its parent. However, let's say we deleted from the middle of the tree. In particular, we want to delete five. Let's try what we did last time: just drop five. However, there's now a hole in the tree. We can use either child to fill in the gap, three or eight. Here, we'll use the right child, eight, so shift eight up. However, now there's a gap where eight used to be. To be consistent, we'll use eight's right child, nine. Again, shift nine up. Now there's a hole where nine used to be, but that's okay; there are no more children, so we can simply delete the extra pointer. Just like the previous algorithm, this algorithm takes up to d operations, so deletion takes O of d time. To summarize, insertion and deletion for a binary search tree both take O of d time. Finally, let's see how search works. Our goal is to search for the value six. We start from the root node, five. Like with insertion, we compare our desired value, six, with the current node, five. Six is greater than five, so we go right. Now we access eight. We compare: six is less than eight, so we go left. Now we access six; six is the value we're looking for, so we're done searching. This algorithm can take up to d operations, so search is O of d. To summarize, a binary search tree features O of d modification and access.
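The insertion and search walks just described can be sketched with a minimal node class. This is an illustrative sketch, not the course's repl.it code, and it omits deletion, which was described above but is fiddlier to write.

```python
class Node:
    """One node of a binary search tree."""
    def __init__(self, value):
        self.value = value
        self.left = None    # subtree of smaller values
        self.right = None   # subtree of greater values

def insert(root, value):
    """Walk down from the root until we find an empty home for
    the value: up to d comparisons, so O(d) time."""
    if root is None:
        return Node(value)           # found the value's home
    if value < root.value:
        root.left = insert(root.left, value)
    elif value > root.value:
        root.right = insert(root.right, value)
    return root                      # duplicates are ignored

def search(root, value):
    """Same walk: go left if smaller, right if greater. O(d)."""
    while root is not None:
        if value == root.value:
            return True
        root = root.left if value < root.value else root.right
    return False
```

For example, inserting 5, 3, 8, then 6 reproduces the walk from the lesson: 6 goes right of 5, then left of 8.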
Note that in a tree there is no such thing as directly fetching the i-th node, because there's no global ordering of nodes, and you don't know where any of the nodes are in memory. As a result, there is no fetch runtime analysis. This concludes our analysis of trees. However, we have one more step: we'd like to translate these runtimes. All of our runtimes are expressed as a function of the tree depth, or d. Visualized, here's a tree and here's the tree's depth. In this case we have a depth of four. We would like to express these runtimes in terms of n, the number of nodes; in this case there are 15 nodes. This is because the runtimes for all of our previous data structures were expressed in terms of n, the number of items. As a result, we should express our new runtimes in n as well to make sure these runtimes are comparable. Here's the same tree. Let's try to understand the relationship between d and n. Consider the first layer. This is depth 1 with one node. Consider the first two layers. This is depth 2 with three nodes. The first three layers: this is depth 3 with seven nodes. Consider all layers. This is depth 4 with 15 nodes. It may be hard to notice a pattern, so instead of counting the number of nodes, I'll count the number of nodes plus one. Now, if you look at the numbers in red, do you see a pattern? As you may have noticed, the number in red roughly doubles from one level to the next. As a result, we can state that n is approximately two to the power of d. Now we have a mathematical relationship between d and n. Since we want to plug d into our runtime complexities, let's solve for d. To solve for d, we'll first rewrite the equation, then apply logarithms to both sides. Remember that with logs, we can move the exponent d outside of the log as a coefficient. Then we simply ignore the constant log two. So now we get that log of n is approximately d. We can now plug in. Previously we had these runtimes expressed in terms of d. Plugging in d is equal to log n.
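The algebra in this step can be written out explicitly, using base-2 logarithms:

```latex
n \approx 2^{d}
\quad\Longrightarrow\quad
\log_2 n \approx \log_2\left(2^{d}\right) = d \cdot \log_2 2 = d
\quad\Longrightarrow\quad
d \approx \log_2 n, \qquad \text{so } O(d) = O(\log n).
```

Dropping the base of the logarithm is safe in big-O notation, since changing the base only multiplies by a constant.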
We now get O of log n for all modification and access runtimes. Using this analysis, let's now discuss the pros and cons of binary search trees. Binary search trees, in sum, have impressive runtimes: just O(log n). In practice, O of log n is very close to O of one, constant time. By design, binary search trees are ideal for random sequences. The more uniformly spread out, the better. However, binary search trees are the worst for sequential data, and here's why. Let's say I have a sequence of numbers from 0-10. First we create the root node, zero, then we insert one, then two, then three, then four, then five. The list goes on, and as you can see, we basically have a straight line instead of a filled-out binary search tree. This means that the depth is no longer log n; instead, the depth is exactly equal to the number of nodes, meaning all of our algorithms now take O of n, or linear, time. So in summary, binary search trees are really inefficient for sequential data. In summary, we've discussed what a tree is, breadth-first versus depth-first traversal, what a binary search tree is, what O of log n time means, and some pros and cons of binary search trees. For a copy of these slides and resources, make sure to check out the course website. Let's now see these data structures in action. In the next lesson, we'll practice using and implementing these data structures, tackling both of our learning objectives for this course. 12. Trees Practice: Welcome to the final practice lesson for this course. In this lesson, we're going to practice with trees, in particular with breadth versus depth-first search. Let's go ahead and get started. Just like before, navigate to alvinwan.com/datastructures101/trees. At that link, you'll see something like this. To fork the project and get your own copy, click on the project name, click on the ellipsis, and click on Fork. I suggest placing your Skillshare and Replit windows side by side so that you can code along as I'm coding.
You should then see a screen just like this one. Go ahead and scroll down and you'll get to the first exercise right here, called bfs, or breadth-first search. One tip before we get started: memorize how to perform breadth-first and depth-first search, abbreviated here as bfs and dfs, because the implementation will always be the same. In this particular case, we're not using a technique called recursion, and without recursion, the way to do bfs and dfs always looks like this. Knowing that, let's now try to implement breadth-first search. Here's what a breadth-first traversal of a tree looks like. Here we've got a tree: 1, 2, 3, 4. Let me go ahead and illustrate what this looks like. Here you would have one, and then you would have a child right here that would be two, and then two would have another child right there, and that would be three. Four would be the other child of one. This is what that tree looks like: one is the root, its left child is two, two's left child is three, and four is the right child of one. Knowing what that tree looks like, here's what the breadth-first traversal looks like. Breadth-first traversal will give us one first, and then it'll give us two, and then four, and then three. This is also called level-order traversal, because we're basically moving down one level at a time: 1, 2, 4, 3. If we had a five down here, for example, then we would have 1, 2, 4, 3, 5, and so on and so forth. That's what that looks like. Knowing that this is breadth-first traversal, here's a hint. Pause the video now and try this problem on your own if you don't want the hint. The hint is that you'll need to use one of the data structures that we talked about before. In particular, you'll have to use a stack or a queue in order to do this. Go ahead and pause the video here and try to design the algorithm for a breadth-first traversal.
Now here's the solution. The solution is going to involve using a queue. We're going to start by adding the root to this queue. Once we've added the root, we're going to dequeue it and enqueue all the children of that root. Then we're going to dequeue again. Dequeuing is going to give us the very first item in the queue, and then we're going to enqueue all of its children: that's going to be three and whatever else is a child of two. Actually, one way to explain this would be to include a visualization. Here we have one. We're going to dequeue one, and we're going to add all of its children, two and four. Then we're going to dequeue two, and then we're going to enqueue all of its children, which is just three. Then let's say maybe there's another one; let's say there's five. Here we would have three and five. Then we would finally dequeue four and enqueue all of its children, which is nothing, and so on and so forth. The idea is, you'll notice that the order of dequeuing was 1, 2, 4. Then after that, it's going to be three and five, which is exactly what bfs is. The queue gives us exactly a breadth-first tree traversal. Knowing that, let's now see what the runtime and memory complexity are. Pause the video here. Here's the solution. The runtime and memory complexity are both going to be O of n. You may be thinking, what is n here? Well, n is the number of nodes. That's important. Before, we had a case where we defined a variable on our own: we defined a variable as the number of unique values being inputted. In this case, by default for a tree, that variable n is going to be the number of nodes in the tree. You could also redefine it: you could redefine n as the height of the tree, or define it as something else. But usually, n is the number of nodes in the tree.
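The queue-based algorithm just described can be sketched in Python. This is a minimal sketch, not the course's actual starter code: the `Tree` class below, with `value` and `branches` attributes, is an illustrative stand-in, and `collections.deque` gives us O(1) enqueue and dequeue.

```python
from collections import deque

class Tree:
    """Illustrative stand-in: a value plus any number of branches."""
    def __init__(self, value, branches=()):
        self.value = value
        self.branches = list(branches)

def bfs(tree):
    """Visit nodes level by level using a queue; return values in visit order."""
    order = []
    queue = deque([tree])          # start by enqueuing the root
    while queue:
        current = queue.popleft()  # dequeue the front node
        for branch in current.branches:
            queue.append(branch)   # enqueue each child
        order.append(current.value)
    return order

# The tree from the example: 1 has children 2 and 4; 2's child is 3.
tree = Tree(1, [Tree(2, [Tree(3)]), Tree(4)])
print(bfs(tree))  # [1, 2, 4, 3]
```

The lesson's solution prints each value instead of collecting it; returning a list here just makes the level order easy to check.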
Knowing that, when you report your runtime and memory complexity for a tree, you should report them in terms of the number of nodes, which for us is going to be O of n: first, because you have to traverse over every single node, which is what a breadth-first traversal is, and second, because you have to add all of these nodes to your queue. Actually, sorry, I lied a little bit: your queue will never exceed the maximum width of the tree. We'll see what width means later on, but for now we'll just say that O of n is the memory complexity. We can play with that nuance later; I'll explain why there's a better answer. For now, go ahead and pause the video and try to code this up. Let's write out the solution. Here we're going to define a queue, and the queue's first value is going to be the root. While there's something in the queue, we're going to dequeue. Then for each branch in the current node's branches, we're going to enqueue that branch. Then finally, we'll print the current value. This will give us a breadth-first traversal. Let's now check to see if this passes our tests. Click the green Run button at the very top, and you can see now that bfs is no longer in our list of failed functions. You've successfully passed this test. Let's move on to the next question. Actually, before we move on to the next question, one tip: you should always use a queue for breadth-first traversal. You saw an example of that here, and you saw how to use it. Keep this in mind; this will pretty much always be how you implement bfs without recursion. We'll talk about recursion in a future course. Knowing that, let's now implement depth-first search. Depth-first search will look like the following. Let's say we had this tree. It's pretty much the same structure, though a little different. Let me draw this out for you.
Here we have the root, 1. Its left child is going to be 2, and 2's left child is going to be 3. There are no other children here. 1's right child is going to be 4, right here. Then 4's left child is going to be 5, and here we're going to stop. One thing to note: I keep saying left and right child because usually tree problems are built around binary trees, where you only have one left and one right child. The way that we've implemented a tree here, we can have unlimited branches; you're not limited to a left and a right. Technically speaking, 5 isn't the left child of 4; 5 is the only branch of 4. I've just been saying left and right out of habit, but it doesn't really matter. The point is that this is roughly what the tree looks like. We have 1 at the root, its children are 2 and 4, 2's only child is 3, and 4's only child is 5. Knowing that, a depth-first traversal will first go down one of these paths. In the way that we've implemented it, we'll first go down this path over here to 4, and then we'll go all the way down to 5 before coming all the way back up and exploring other paths. That's what depth-first traversal will look like. It'll explore all the way down to a leaf. Again, a leaf is a node in the tree that has no children, no branches. It'll explore all the way down first, and then it'll come back up and start to explore other avenues. That's what depth-first traversal looks like. You see down here it'll go 1, 4, 5, and then it'll go back up and explore 2 and then 3. That's a depth-first traversal. Let's try to come up with an algorithm. Pause the video here. Here's the solution. We previously used a queue for breadth-first traversal. For depth-first traversal, we're going to use a stack. The stack is going to operate in a very similar way: we're going to keep adding items to the stack, and we're going to keep popping from the stack. Here's what it's going to look like.
We start with 1. We're going to pop 1, and then we're going to add its children, 2 and 4. Now that we've got 2 and 4, we're going to pop the last item, which is 4. Then we're going to add its children, which is just 5. Then we're going to pop 5. You can see where this is going. So far we've gone through 1, 4, 5. Once we're done with 5, we're going to pop 2. This will give us our depth-first traversal using a stack. Knowing that that's the algorithm, let's talk about the memory and runtime complexity. What are the runtime and memory complexity? Pause the video here, and here's the answer. The memory complexity here is going to be linear in the number of nodes. The computational complexity is also going to be linear, because we have to iterate through all of these nodes. The same caveat I mentioned before applies for memory complexity: it's actually not linear in the number of nodes, but we'll talk about that more later. Now let's go ahead and try coding this function. Pause the video here and give it a shot. Let's go ahead and code up this function. Here in our depth-first traversal, we're going to start off by initializing a stack. This stack is going to have our root node inside of it. Then we're going to iterate: as long as we have something in the stack, we'll pop from the stack. Then for every single branch of our current node (for branch in current.branches), we will push that branch onto our stack, and then we'll print our current value. That's pretty much it for our depth-first traversal. Let's go ahead and double-check this works. Hit the green "Run" button at the very top. There you go: dfs is no longer in our list of failures, so we have successfully passed the dfs tests. The tip here is to use a stack for depth-first traversal. Let's now move on to the next question.
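Before moving on, here's the stack-based depth-first traversal sketched out, again with an illustrative `Tree` class rather than the course's actual starter code:

```python
class Tree:
    """Illustrative stand-in: a value plus any number of branches."""
    def __init__(self, value, branches=()):
        self.value = value
        self.branches = list(branches)

def dfs(tree):
    """Visit nodes depth-first using a stack; return values in visit order."""
    order = []
    stack = [tree]             # start with the root on the stack
    while stack:
        current = stack.pop()  # pop the most recently pushed node
        for branch in current.branches:
            stack.append(branch)
        order.append(current.value)
    return order

# The tree from the example: 1 -> (2 -> 3, 4 -> 5).
tree = Tree(1, [Tree(2, [Tree(3)]), Tree(4, [Tree(5)])])
print(dfs(tree))  # [1, 4, 5, 2, 3]
```

One design note: because children are pushed left to right and a stack pops in reverse order, the rightmost child (4) is explored first, which matches the 1, 4, 5, 2, 3 order in the lesson.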
For every single node in the tree, add a new attribute called parent that points to its parent. Here we're going to use bfs or dfs for parent assignment. This attribute, parent, will point to the node's parent in the tree. Let's go ahead and draw that tree again. Let's say we have this tree right here. We have here 1, and its only child is 2, and 2 has two children, which are 3 and 4. It's going to look something like this, and there are no other branches, just these. Now that you have this, we're going to actually populate parents. We're going to implement this so that every single node here points back to its parent: 2 will point to 1, 3 will point to 2, and 4 will point to 2. Knowing that, let's think about this and design an algorithm. Pause the video here. Let's talk about the solution. For this question, it actually doesn't matter whether you use bfs or dfs; in either case, you can make this work. Basically, the algorithm is: we perform some traversal, and if you look above, we actually iterated over all the children, or all the branches, of the current node. While you're doing that iteration, you'll update each branch to point back to the current node, which is the parent. It doesn't really matter which traversal you use. That's the algorithm, and in fact, that's pretty much the code solution. Go ahead and consider now: what is the memory and runtime complexity? Pause the video here. Just like before, because we're effectively using DFS or BFS, the runtime and memory complexity are the same as before. Computational complexity is O of n, and memory complexity, we'll say for now, is basically O of n. Knowing that, let's now go ahead and try the code. Pause the video here and give it a shot first. Now let's go over the solution. I'm going to cheat a little bit. I'm actually just going to copy and paste from the previous function, which already performed DFS. Let's go ahead and paste that down below.
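The resulting parent-assignment function looks roughly like this. As before, this is a hedged sketch with an illustrative `Tree` class, and it includes a guard for empty input:

```python
class Tree:
    """Illustrative stand-in: a value plus any number of branches."""
    def __init__(self, value, branches=()):
        self.value = value
        self.branches = list(branches)

def populate_parents(tree):
    """Give every node a .parent attribute via an iterative DFS."""
    if tree is None:    # guard against empty input
        return
    tree.parent = None  # the root has no parent
    stack = [tree]
    while stack:
        current = stack.pop()
        for branch in current.branches:
            branch.parent = current  # point each child back at its parent
            stack.append(branch)

# The tree from the example: 1 -> 2 -> (3, 4).
tree = Tree(1, [Tree(2, [Tree(3), Tree(4)])])
populate_parents(tree)
print(tree.branches[0].parent.value)  # 2's parent is 1, so this prints 1
```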
I'm going to remove the print statement, and then I'm going to assign branch.parent equal to current. That's it. Let's go ahead and check that this function is correct. We're going to hit the green "Run" button at the very top. There we go. We can see now that populate parents is no longer in the list of failing tests. We've successfully passed all these tests. Like I said before, for BFS and DFS, as long as you memorize that implementation, you're going to see it very often. At least without recursion, this is what it's going to look like, so make sure you know that implementation. A tip from earlier: always consider empty input as well. In this particular case, if our tree was empty, our function would fail, because here we're going to pop, we're going to get current, current is going to be None, and then we're going to attempt to access its branches. Again, for all of your interviews and for all of your questions, make sure to consider what you would do with empty input. In this particular case, we should have said: if the tree is None, then we'll simply do nothing, just return. That's important to keep in mind. Let's move on to the next question. Here we're going to use BFS or DFS to compute the maximum value in an extremely wide but shallow tree. Let's go ahead and consider the following. Consider what this means for your choice of traversal: what is the maximum space complexity of breadth-first search, or of depth-first search? This is actually related to the nuance that we talked about earlier. Pause the video here and consider this before you design the algorithm. Here's the question again. We're going to get the maximum value from an extremely wide but shallow tree. The important part of this question is to consider what an extremely wide but shallow tree does to your algorithm. Why is this important? What does this mean for your choice of traversal?
We talked about a subtle nuance earlier about the space, or memory, complexity of your mode of traversal. This question is really honing in on that, to try to understand what that nuance means for your algorithm. Pause the video here and see if you can figure this out. What is the difference between an extremely wide but shallow tree and a very deep but narrow one, for BFS or DFS? Pause the video here. Here's the answer. For an extremely wide but shallow tree, you're going to want to use DFS. The reason is that for BFS, or breadth-first traversal, the space complexity, meaning the maximum number of items in the queue or stack or whatever collection you're using, is going to be the width of the tree. Whereas for depth-first search, or DFS, the maximum space that we'll use is equivalent to the depth of the tree. If you've got an extremely wide but shallow tree, then you'll want DFS to minimize your space consumption, because with a very shallow tree, you'll have a very short collection to maintain. Knowing that, let's go ahead and design the algorithm to get the maximum value. Pause the video here and try to design the algorithm. Here's the solution. For the algorithm, we're simply going to track the maximum value as we traverse. Any time you traverse a node, compare that node's value to a globally greatest value. If your current value is greater, then simply replace it. This is very similar to the maximum frequency example that we talked about earlier; in this case, we are considering the maximum value. Now that we've considered the algorithm and the mode of traversal, pause the video here and see if you can come up with the memory and runtime complexity. Here's the answer. For runtime complexity, again, it's linear in the number of nodes: it's O of n. For memory complexity, well, we talked about this before, which is exactly why we chose DFS for this problem. Your space complexity, or memory complexity, is going to be the depth of the tree.
This is not exactly O of n. It's also, at least for now, not clear exactly what the depth of the tree would be, so we're just going to assign a new variable to it. We're going to say d is the depth of the tree, so our algorithm is O of d. Now let's try coding it up. Pause the video here. Since we have a DFS implementation, I'm again going to be lazy: I'm just going to copy and paste it. Let's scroll back up here, copy this DFS implementation, then scroll back down and paste it. Now that we've got this DFS implementation, we're going to remove the print statement. Like we said before, for every single node that we consider, we're going to update the maximum value. Let's say the maximum value in the very beginning is just the root's value. Then, as we explore every single node, we're going to take the maximum between the current value and the greatest value. Finally, we're going to return the maximum value. Let's try running this function and make sure it passes all the tests. Go ahead and hit the green "Run" button at the very top. There we go. We can see now that get maximum is no longer in the list of failing tests. Let's now move on to the next question. We're going to combine data structures to compute the tree width. Let's now see what the tree width is. The width of the tree is the maximum number of nodes at any given depth, or any given level. Let's doodle this tree again. Here we've got one, and one has one child, which is just two, and two has two children, which are three and four. Here we have three, and then here we have four. That's what this tree looks like. The width of this tree, then, is two, because this level has two elements in it. Now if we had another node right here, let's say five, and then another one, six, then the width of this tree would be three, because you have at most three nodes on a single level. Knowing that, let's go ahead and continue.
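Before tackling the width algorithm, here's a sketch of the get-maximum solution from the previous exercise: a DFS with a running greatest value. Again, the `Tree` class is an illustrative stand-in, not the course's starter code:

```python
class Tree:
    """Illustrative stand-in: a value plus any number of branches."""
    def __init__(self, value, branches=()):
        self.value = value
        self.branches = list(branches)

def get_maximum(tree):
    """Return the largest value in the tree using an iterative DFS."""
    if tree is None:       # empty input: nothing to compare
        return None
    greatest = tree.value  # start with the root's value
    stack = [tree]
    while stack:
        current = stack.pop()
        greatest = max(greatest, current.value)
        for branch in current.branches:
            stack.append(branch)
    return greatest

tree = Tree(5, [Tree(3, [Tree(9)]), Tree(8)])
print(get_maximum(tree))  # 9
```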
Let's discuss the algorithm. Go ahead and pause the video here and see how you would compute the width of this tree. Here's a solution. Fortunately for us, we have a mode of traversal that makes it a lot simpler to compute the width: level-order traversal, or BFS. There are two ways that you can do this. One is to use another data structure to actually keep track of how many nodes there are at a certain depth. The second is to modify BFS so that you traverse one level at a time and keep track of when you start the next level. Those are your two options, two different algorithms. We're going to implement the first one, where we use another data structure, a mapping. Now that we've talked about this, let's talk about the runtime and memory complexity. Pause the video here and see if you can figure it out. Here's the answer. For runtime complexity, both algorithms, regardless of which one you picked, are going to be O of n, where n is the number of nodes. Both computational complexities are linear in the number of nodes. Now, if you decide to pursue multiple data structures, where you keep a mapping from depth to counts, that's going to be an additional space complexity of O of d, linear in the depth, because you're storing one value for every single depth. However, if you use the algorithm where you modify BFS in place, then your algorithm would be O of width. Both algorithms are at least O of width, because that's just how the original BFS works; that's the baseline. However, the algorithm that additionally uses a hashmap to store that mapping from depth to count is going to be O of width plus O of depth. We're going to implement the less efficient one, which may seem backwards, but I've already implemented the efficient one in the solution, so you'll be able to check that if you want to see the more efficient implementation.
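The mapping-based version just described can be sketched as a BFS whose queue entries carry (depth, node) pairs, with a defaultdict counting nodes per depth. As before, the `Tree` class is an illustrative stand-in:

```python
from collections import defaultdict, deque

class Tree:
    """Illustrative stand-in: a value plus any number of branches."""
    def __init__(self, value, branches=()):
        self.value = value
        self.branches = list(branches)

def tree_width(tree):
    """Return the maximum number of nodes at any single depth."""
    if tree is None:
        return 0
    widths = defaultdict(int)   # depth -> number of nodes seen at that depth
    queue = deque([(0, tree)])  # each entry is a (depth, node) pair
    while queue:
        depth, current = queue.popleft()
        widths[depth] += 1
        for branch in current.branches:
            queue.append((depth + 1, branch))  # children live one level deeper
    return max(widths.values())

# The tree from the example: 1 -> 2 -> (3, 4); the widest level has two nodes.
tree = Tree(1, [Tree(2, [Tree(3), Tree(4)])])
print(tree_width(tree))  # 2
```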
But for now, I'm going to implement the less efficient one. Go ahead and pause the video here and see if you can come up with the code yourself. Here's our solution. Let's go ahead and write it. We're going to use our favorite data structure here: a default dictionary. We're going to start by initializing a queue. Every single item in this queue is going to be the depth of the current node in addition to the node itself, because we need to keep track of depth in order to know which count in our counter of widths to update. Here's our counter, widths, and it initializes all widths to zero. While we have something in the queue, we're going to dequeue, and then we're going to update the widths dictionary by incrementing that depth's count by one. Finally, for each branch, we're going to enqueue the depth incremented by one along with the node itself: any time we go from parent to child, we increment the depth by one. Finally, we'll return the maximum width that we see. This gives us the maximum over all the widths that we counted up. Let's now run this and make sure it passes all of our tests. There we go. The lack of output means that all of our doctests passed. You've finally finished all the exercises for this lesson. For all these slides, full solutions, and more practice exercises, make sure to visit the course website. That's it for our practice with trees and breadth-first versus depth-first traversal. You've seen a bunch of problems here, and a bunch of problems in previous practice lessons as well. I hope that these were very useful. Let's now move on to our very last lesson, where we'll wrap this all up. 13. Conclusion: Congratulations on finishing the course. We've covered a ton of topics: orders of growth, stacks and queues, linked lists, hash maps, trees. That was a pretty difficult course with a ton of content and some really brutal questions.
You've finished 20 exercises in depth, 20 interview tips, and practiced the three-step method for solving puzzles again and again and again. That was a lot of content, and the summary slides are great tools to memorize. However, the most important takeaway is actually the three-step process. As far as interviews go, that is the key. Here's the three-step process one more time: design the algorithm without code, report runtime and memory complexities, and finally, code the algorithm. Now, to really solidify your understanding, for three days after the course, design a data structure question each day. I also encourage you to post your questions in the discussion section. You just need some sleep in between thinking about data structures. It doesn't have to be your life's best work; any question will do, so post it in the discussion section. I really do look forward to seeing them, and that's it. Fantastic work. This is the start of your interview preparation, and you're off on a strong footing. Check out my Skillshare profile for more courses on data science, machine learning, and other important programming concepts. For more interview preparation, follow me on Skillshare for updates as we cover more commonly used coding puzzles and interview-specific tricks. Congratulations once more on making it to the end of the course, and until next time.