Database Sharding (From Scratch with PostgreSQL) | Hussein Nasser | Skillshare

Playback Speed


  • 0.5x
  • 1x (Normal)
  • 1.25x
  • 1.5x
  • 2x

Database Sharding (From Scratch with PostgreSQL)

teacher avatar Hussein Nasser, Author, Software Engineer

Watch this class and thousands more

Get unlimited access to every class
Taught by industry leaders & working professionals
Topics include illustration, design, photography, and more

Watch this class and thousands more

Get unlimited access to every class
Taught by industry leaders & working professionals
Topics include illustration, design, photography, and more

Lessons in This Class

11 Lessons (56m)
    • 1. Introduction to Database Sharding

      1:06
    • 2. What is Sharding

      3:31
    • 3. Consistent Hashing

      2:41
    • 4. Horizontal Partitioning vs Sharding

      1:08
    • 5. Class Project -Sharding with Postgres

      1:05
    • 6. Class Project - Spin up Postgres Shards on Docker

      7:23
    • 7. Class Project - Writing to a Shard

      21:58
    • 8. Class Project - Reading from a Shard

      11:19
    • 9. Advantages of Database Sharding

      1:59
    • 10. Disadvantages of Database Sharding

      3:33
    • 11. Class Summary

      0:45
  • --
  • Beginner level
  • Intermediate level
  • Advanced level
  • All levels
  • Beg/Int level
  • Int/Adv level

Community Generated

The level is determined by a majority opinion of students who have reviewed this class. The teacher's recommendation is shown until at least 5 student responses are collected.

33

Students

--

Projects

About This Class

Database sharding is the process of segmenting the data into partitions that are spread on multiple database instances to speed up queries and scale the system. In this class we will learn what is database sharding, we will implement sharding from scratch with postgres which will be your project class and finally discuss we will discuss the pros and cons of sharding.

Meet Your Teacher

Teacher Profile Image

Hussein Nasser

Author, Software Engineer

Teacher

My name is Hussein and I’m a software engineer. Ever since my uncle gave me my first programming book in 1998 (Learn programming with Visual Basic 2) I discovered that software is my passion. I started my blog, and YouTube channel as an outlet to talk about software.

Using software to solve interesting problems is one of the fascinating things I really enjoy. Feel free to contact me on my social media channels to tell your software story, ask questions or share interesting problems. I would love to hear it!

I also specialize in the field of geographic information systems (or GIS). I helped many organizations in different countries implement the GIS technology and wrote custom apps to fit their use cases and streamline their workflows since 2005. I wrote fiv... See full profile

Class Ratings

Expectations Met?
  • Exceeded!
    0%
  • Yes
    0%
  • Somewhat
    0%
  • Not really
    0%
Reviews Archive

In October 2018, we updated our review system to improve the way we collect feedback. Below are the reviews written before that update.

Why Join Skillshare?

Take award-winning Skillshare Original Classes

Each class has short lessons, hands-on projects

Your membership supports Skillshare teachers

Learn From Anywhere

Take classes on the go with the Skillshare app. Stream or download to watch on the plane, the subway, or wherever you learn best.

Transcripts

1. Introduction to Database Sharding: Database charting is a process of segmenting the data into partitions that are spread on multiple database instances. This is essentially to speed up query and scale the system. I want to talk about the following. I want to discuss what is shorting. And then in order to discuss charting, we kinda, kinda, we have to gloss over the idea of consistent hashing, talk a little bit about it, and then we'll get to talk about the difference between horizontal partitioning and shot him because those two kinda go hand in hand. I want I wanted just a slight difference and I want to talk about that. Like that's why I have put it like a single slide to talk about that, right? And then also we're going to talk about my favorite part. We're going to do an example. We're going to code on a NodeJS and spin up multiple Postgres shards and then write and read from those charts, right? And essentially we're going to build a URL shortener kind of thing. And finally, we're gonna talk about the pros and cons of database short and goes, Guess what? Nothing is perfect. And finally, we're going to summarize the whole thing. 2. What is Sharding: So essentially, before we don't want to show our name, just the idea of traditionally how we do things is we have one big database with a huge tables, right? And we would have some indexes on those tables query and then you get a results is always a centralized one database. Your pain point is one server and querying becomes slower and slower the more data you have. Because guess what? You're going to have indexes. Those indexes are going to grow large as your tables, a girl or large and, and that can slow down your queries, right? And, and you need more memory and you need more CPU. So it's just data scale, right? Just make sense. So if I went, I make a select your, uh, from this table where URL I call this. Let's say this is like a URL shortener website, right? Was it like one huge table? This is the URLs and this is the code. So essentially what you do is like, Hey, give me the URL for this code so I can visit that page, right? That's what the URL shortener Dow's, right? And then you query that is a huge table and say there are 1 million rows. So that's their score is going to get slower and slower and slower the more rows you have. Because even with indexes that stills gonna get slow, right? Because it's just the index gonna grow and grow, grow, grow. So you want, how do we speed things up? So people invented the idea of shorting. Let's plug that table and put in a multiple databases. It sounds very simple and that's, that's what it is. It's very simple. It's now really rocket science. You partition your table. Row-based partition based on certain field, certain column will call the partition key or Schottky. And we throw those rows on multiple tables uniquely, right? So let's say 200 k goes here, and others to a 100 K goes here, another 208 goes here. But the question is, how do you shard, how do you distribute? Like, I mean, if you have customer table, that's easy, right? You can't undo like on the ZIP code. Like these are the zip code for the West goes. This is a ZIP codes for the East goes. Or if you have a bunch of users, maybe a users from one to 500 million goes to this database and 400, 500 and one to 1 million goes to this database. That's another kind of sharding, right? And that's kinda easier. But what are fewer? Key is a string like this, right? Which database server as this five FTO j is on, right? So it's like how do you know them? So that's where we're going to talk about consistent hashing. But there's just a function and you can always find out it's not really heartbreak and build your own function to determine how to consistently hash this input to a server and this gaze, Hey, I got to serve are the three that we're going to talk about little bit that right side. And then once you find out which server you want to connect to, you connect to only that server and you send your query to only that server. Sounds simple, right? You do not want to send your query to all databases to get essentially a bunch of messages, right? You want to connect to the right server that you're sure that your database or your M was actually there. And how do you find out that meet consistent hashing? 3. Consistent Hashing: So let's assume I have three chars and that's an example we're going to built by the way, guys, one of the example part of it. And, uh, and these 33 databases just say I have an input and put one, right? So the idea of consistent hashing, you take in and put or a string or any user provided piece of data that you want to query on. And you want to know which database to query on, essentially that hash again and give you back that instance somehow, right? So let's say hashing of input 1 will give you a database 5432. Good hair. So every time you submit the input one string, it will always goes to this database, right? And if you submit input to somehow this goes to a 504 33, right? And then if he's up in Employee 3, now might go to server 5, 4, 3, 4, right, which is listening on port 543 for and that's kinda consistent hashing. So as long as you give me always the same story, I'm going to make sure to always consistently hash that to the same server, to the same node, essentially. Okay? And you might get another like as I again, submit input 2 again, you can always get the same output, right? All right, So how might, you might just like how, how it is actually this work? So one implementation would be just, hey, make this input into some software number, right? I don't know how you do this. Sum all these character ascii codes converted into binary, and then shove it back to a decimal and then do a modulo, right, which is the remainder or three because I have three nodes. And then the input is like, I don't know if it happened to be 10, right? Let's say you've got 10, 10 modular three is divided by 3, that's 3 and then 9. And it's like a remainder of one, right? One is, you add back 5432 and that equals 25 for 33, and that's goes to this server. So that's how one implementation of I'm not saying that as of right now I'm going to use one implementation and NodeJS that do, does consistent hashing essentially, right? They get an input, give me the port number back so I can connect to my database. And yeah, and that is called a hash rank as well, right? So this is like a, almost like a ring. And then use Cassandra uses this all the time with the shards that they have, that consistent hashing ring essentially. So anytime you have an input, I know which database to hit. 4. Horizontal Partitioning vs Sharding: Alright, so let's talk about how horizontal partition reverses shortening. So horizontal partition splits this big table into the same database, but it's in another table or maybe to another schema and the same database, right? So you kind of share the same database, but now the client has to know the other schema that has table or the other table, table name, that could be customers underscore West or customers underscore east, right? Something like that. So that's, that's the horizontal partitioning so he can spread your data into the same database. Sharding and the other ads plates the same table across multiple servers, so he keep the same name. But the database server or the database instance change, right? That's the difference, right? So as we said, like IT, partitional actually changes the database name iser, we said it'll split it another schema. Well, sharding everything is the same but the server chain. So that's essentially the differences between these two guys, right? Horizontal Bush Junior sharding, sharding multiple databases, right? And horizontal partitioning same database, multiple pebbles. 5. Class Project -Sharding with Postgres: How do we build something, guys? Here's what we're gonna do. We're gonna spin up with Docker biosensor where we're going to build, as we get to build, spin up three postulates and instances running on these guys. For 5 4, 3, 2, 5 4, 3, 3, and 5, 4, 3, 4, right? And each of them we're going to create, when we spin up that Docker instance, we're going to initialize the database with a table. And that table is our URL table essentially that has like this short URL shortener, right? Have some fields and all that stuff so we don't have to manually go and spin up that. And we're going to set as spin up a BG, BG admin to show you the actual shards and all that. So shot one, shot to shot three. And then we're going to write some code and NodeJS to actually, well that would express sorrow rest API to write to the sharded application was like, Hey, shorten this URL for me and we're going to figure it out which database to hit and then write to that database. And then we're going to read from the database. How about we just jump into a guys? 6. Class Project - Spin up Postgres Shards on Docker: All right guys. So the first thing we need to do is to spin up the Docker instances for the Postgres. But in order to do that, each Docker instances that we're going to spin up half to create that table, that magic table. So the first thing I'm gonna do is create a SQL script that creates that table, the initial table that we will have our stuff. So let's do that. So let's get into a directory here called, I don't know, sharding video, whatever. And then there's just swap into this folder and then we're going to create as, let's just create an innate dot SQL file. That's just a file that we'll have our create statement and a little bit C-Cl here. Let's write some SQL guys. So what I want to do is like I want to create table, URL table. That's what I'm going to call it. And I still remember C-Cl. We're gonna do the first fields called ID, right? And that's cereal. That means just a incremental. We don't really care about this field much, right? It's not null. And it's also a primary key, right? And then the second field is URL, that's the long URL, and that's text, right? And the third field is a URL underscore ID. And that guy is just a character 0, 5. I wanted to be five character exactly. All right, so that's just essentially just the structure of the schema of the table. And then because we want to spin up multiple instance, I want this to get executed every time it's spin up a new Docker container, right? So that's the first step. Looks good, the script looks good. There's no errors hopefully, right? Let's just write this thing. Okay, that's the first step. The second step is we need to write a Docker file because our just random postgres image won't do. We need to create our own image that essentially executes that script. So how do we do that? So let's go ahead and create a Docker file than them here. And then from the Postgres instance, right, the image, I want to just literally add 11 line of code. I want to copy that in its SQL that just the file we wrote into a special folder called Docker, entry point in dB dot d. And what that does is essentially the moment you copy that and you spin up the container, that Docker container, the postgres image will say, Oh, there is a SQL file here, I'm going to execute it. That's what it does, right? So anything you can write an SQL file and it doesn't have to be called anything by the way, anything that you can call it anything you want RAM. And that is going to execute him. That's cool, That's what we want, right? So now we're ready. So what we're gonna do here is let's build the image, right. And because we have the Docker file, let's build the image from which we are going to spin up the containers. So the next step is we want to build the image from the Docker file because that's a special image. I don't want to run my Postgres from just the vanilla postgres image. I want to run from mine, this special image. And let's build that image because nobody knows about it, right? So why we do it to build as soon as I could do, go Docker build dash T and getting called BG shard fluoride. And that's how you name tag it essentially tag the image or dash d prime. I don't know what it's called. Dash, dash name just like with confound name, but people like to be confusing, so I don't know. And then once you do that, you do a dot at the end. That's very important because I want to build the current directory. And then when you do that is you, and that's done. So my image is called E G sharp, right? Hub always spin up some contain or guys, let's spin up some containers, right? And then finally, once we've spent out the container, I'm going to spin up also my PG Admin container to administrator those things, right? And then, so let's spin up a Docker. Run, dash, dash name. Those always a good idea to give it a name. I'm going to call it PG shard one. Okay. The first chart. This, you can see it guys. I first shard, the port is 54325432, and then the image is PG short. Remember, and that's it. And how about we detach it because we don't really want, like, I'll go occupy the terminal. So do that and we have one sharp baby. So let's spin up another short on port 54332. Nice. And for sweet. All right guys, so just like that, we have three instances, database instances. One is running on 54321 of his own father for 33 and 34. And by default, if I connect, I'm going to show you now all these shards, right? I'm going to use PG Admin Docker container. I'm not going to show you how to spin up a PG Admin because I already did it. Another video go check it out. I'm going to reference in here. But I already have a Docker image called BG Admin. So I'm going to just use that one, right? So this is usually listening to port 55. 55. Let's just check it out. Bg admin. And that's the one that's the one I have it running for a long time now. It's called DPG and that's called BG Admin and is listening on port 55. 55. So I'm just gonna use that to manage my, my shorts. How about that? Let's just do that. Let's show you that these shores has been executed him and spin up and then have the schema of the table. And those tables have to be empty, orange or obviously, right. All right guys, so I have BG admin here. I logged in, and then let's go let's start adding my servers, my database server. So I'm going to add a server. I'm going to call it short on. And they databases, remember it's Hussein Mac, that's my machine, and it's running on port 5, 4, 3, 2. And the default users, when you don't specify anything and the properties, the password and username becomes poll scores, and the database also becomes Postgres. Everything in the same them. All right, so now we have Allscripts, Schema, public variables, you ADL table, Look at that. That's the table we created. And that's oldest stuff we have. All right, let's add the other two, right. Shard to its Hussein Mac. And that's three Postgres. Postgres. And let's just add the third one. Short three. Same MAC. And the sport for right? Now I'm Postgres. Postgres. And we have three shots, guys. And the same exactly, same exact schema. That's what we want. So while we're shortening setup is ready, how about we write some code, guys? 7. Class Project - Writing to a Shard: All right guys, so I have Visual Studio Code here. The minimum requirements, Yana have Node.js installed and Visual Studio Code and you can start working on this stuff right there. All right, so I'm gonna go ahead and open and brand new folder. Let's go ahead and create a JavaScript playground. I'm going to call it shorting, right? And open. And what I'm gonna do is create a file called index.js. The best file ever. Then. Let's initialize and BM because it's a new project, right? So NVM in a dash y i means Shut up. I know what I'm doing. Just create a package.json is called sharding, sweet. All right, So now remember guys, we did it a tutorial on how to work with Postgres from no jazz using a very beautiful library called PG, right? And we're going to use the exact same library to connect to the databases was initially right. And what we're building here is like opening a web application that have a post and a good thrived post will create a URL shortener. Ri will shorten the URL given a URL, and get well, given that URL code, the short URL will give me the long URL. This is very simple stuff, right? So let's go ahead and obviously we're going to need expressed. So again, do require Express, spin up that express beautiful stuff app.get. We're going to do something with this thing. Requests response. That's the get command, Abdullah, pause, that's the post command. That's what we will actually write. Write our beautiful, beautiful URL shortener, okay? And here's what we want. We want the databases guys. So the database is called cones client, right? And let's use this destructure assignment. I know sometimes it's confusing, right? This is essentially called the library for the Postgres database called PG, right? You can either do it this way or just do the destructing assignments, whatever rocks your bullet, right? That's that's exactly the same. All right. And we have the client. What do we do with a client? We need to spin up multiple code. We have to spin up three clients. So here's what we're gonna do. So again, a creative act clients write JSON object and the JSON object, the key will be the port, because I wanted to play this trick here, where the he is the port and that JSON value is the configuration here. So, so how would we do this as a general serial 54, 32, That's the first client. And that value is new client and then you pass in that beautiful configuration. So what do we do there? We would the host, right? Jose Mac. Haggling guys, I have done through all that stuff. You want, detailed tutorial, all the Postgres self, I'm going to reference that video, but this will be just skimming through these stuff rhyme. So port, obviously port 5432. And what else user, as well as Toward is also Postgres. And database also Postgres. I'm just lazy guys. Just use the same exact thing, right? And then that's one client, we need others to clients. The client running on port 33 and the client running room. Or how about that guys? I think we're good here. So that is a bunch of clients, right? Still, we didn't connect them. How do we connect? Right? So we need a function called Connect here. So esters, because I'm going to use an async await here. So I'm going to do a function called conduct fun engine function if I can spill connect. And what we're gonna do essentially is just called this function immediately. And then connect those guys, right? So we can do like fancy loop or just too old school way where just like literally 5432. Dots connect, connect those TCP connection, establish those beautiful TCP connections. Alright? A try-catch would be nice here, right? But I'm trying to go through the video real quick. So that will actually give us the kinds connections, beautiful stuff. What's next guys? We need? The next thing is, Let's start with the writing aspects of it, right? So writing and new row, right? So I'm going to write to the database again. So how did we write, right, so posting, let's assume I'm going to get the URL off that query parameter. And we got to me, I made a video about the query parameters versus resources. If you're interested in that, go watch that video I'm going to present here. But you're going to request dot query parameter dot URL. So just give that URL. And once you get that URL, here's the thing. We need to consistently hash that thing. Hash this to get up port and get that port. So how do we do this? This is the consistent hash that we talked about this. So let's do that. So there's a library called caused consists tent hash. And I think it's called literally consistent hash. I think it's called death, right? So it's called consistent dash hash. So that's what it's called, right? And then when you do that, what do you essentially want to do is create a hash ring that the one we just talked about, right? So it's like a, how should I have how many nodes? In this case, we have three nodes, is really easy stuff. So I'm going to create a hash rank HR equal new, consistent Tasha, right? So we're going to create a hash ring, right? And then we start adding our notes, right? But we're not gonna get IntelliSense. I think that's what I thought. So let's go ahead and just install all that stuff. It goes, we have done a lot of stuff and we have never installed any of that stuff, right. Alright, so let's go ahead and do npm install express and we get an install EEG. We're going to install consistent dash hash. Let's install all that stuff. We have installed all of that stuff, right? We're going to also need a corrupt DO, but well, we'll use is when we really need it. Okay, So HR dot, yeah, you go there is until it since edge are good at. And what are we adding here? We're adding essentially these puppies, right? We have three nodes to be consistently hash, right? And, and so easy to use guy. So once we add all that stuff, here's the thing, right? For now. I'm going do a post, right? And if you give me the URL, right, I'm going to hash that URL and then get a result out of it, right? And then based on that result, take the first five correct, and that becomes my essential in my code, right? So let's go ahead and actually do install mine. It might as well just do a crypto as well. Alright. Crypto, require crypto. All right, we're going to use crypto to do the hashing guys are we talked about, so what we wanna do is initially is Take that leg your URL. So the URL could be, I don't know, WW dot wikipedia.com slash sharding, like dumping like that, right? And I want to take that string and make it into a beautiful hash base-64. So how do we do that, right? So a crypto dog, I think called create hash, right? And then you have to specify the function is called SHA1, SHA2, 256 because that's what the hash on we're interested in. And you want to update that string of data you have. So generate the hash and then, and then you want to convert it to base 64, which is called the digest, and then called base-64. That's what do you do it That gives you a beautiful hash, guys. It's just a string of hash and that hash is a little bit long. So let's go ahead and actually run this thing guys. So you can see, so let's go ahead and just destroy response does send back. I don't know. Let's just send back the same information. So the hash, we God. And so far we don't have any other information rank. That's that. So let's just go ahead and run and make sure everything is running, right. So App, we have the app-get. We forgot to listen. Let's go to app dot lesson 1881, right? I want to listen to port 8080 one. It's just a number. So let's just do it a number and then do console.log. Listening to 1881, right? So let's just go ahead and debug, make sure we don't have any errors. Then you're gonna get an error because we didn't install a crypto read. We didn't install corrupted, did we write that's going to install crypto desk in case npm install could even till this run it again. And we're listening to port 81. So we want to do a post request to this poppy, right? Let's see what will happen now. Remember guys, well, we're going to do with it. Give it, give me a URL. Romeo RL going to hash that thing and get a big hash strength. I'm going to take only the five. Does. Again guys, this is not the perfect implementation of URL shortener that are way better guys than me. Doing an actual design and fill ought to our one hour video just how to build a proper hash function and to do your URL shortener. This is just to give you an idea of sharding, right? All right, so let's go to the browser. All right guys, so we go to our favorite console, right? I'm going to use the beautiful Fetch API. Just do old-school way. It's 80, 81. And I am actually euros HTTPS, wikipedia.com slash sharding whatever ion. And then dot. Yeah, we're doing a method. The method is post and I think that's all we need. Right? Dot then I'm expecting JSON back. Give me Jason baby and console.log. We made a request. Immediately. We get that breakpoint, we get the sharpening, we get the URL and then hashed the stuff. And we get this beautiful string and look at that. That's trying. I'm only interested in the first five characters, right? We're going to take that five characters and shoving the database, right, as a URL code or you all an idea, whatever we called it. And then that hash will determine way at r we get a hit which servers going ahead and we're gonna talk about that. No. All right. So we're gonna do that or are we going to obviously get what we're getting? Getting an error for some reason axis to Fichte from origin. We're not under that are on the average in a DAW. Alright, Sure, Let's just do a localhost, 8080, one. Happier now, okay, are in the same regression. You happy now. You happy now I think you're happy. I think you're happy. Thank you. Happy. Good, good, good. All right. So just like this course thing we talked about course, my daughter, we talked about a lot of stuff in this channel, guys. How are you still here? So much stuff, right? So there's a lot of thick, right? And just exploring all this stuff becomes US likely become a better software engineering as you hit the thing that's dollar Indra standing, right? Course, beautiful core stuff. But I could have had something on the server to it that allows other people to access this, but I don't know. I just don't feel like it. All right. So we've got the hash back. This is no, I won't go though. I went that you ought to code. What is it your code guys? That your code is, I should initially URL id equal hash dot substring 0 to 5. That's what I did. I'm just in the first 55 Care Act does not sure computer science stuff. Is this going to give me five characters or six? This struggle man. I always struggle with these bits. You'd think FY 15 years experience and computer scientists still get this wrong, man goes every implementation good that sometimes like the strings get it. Oh, start from one, so it dies home 0. Everyone implements their own stuff, man, right? So let's just do hash again. What do we do? We get it right? Do we get it right? Yes. Looks like five characters to me, right? 012345, right. So you'd think that there's going to give you six Scott There it goes. 0 to five, is actually never mind. All right, so now we get back the URL guys, look at that, give you that. And if I do it again, Let's just remove that breaklines just getting annoying. Now, at this point, getting annoying. Annoying, remove it, right? And every time we get that, we're gonna get the same key, right? But let's say, change that to something else and we get a completely different key, right? That's not what we want. We have the key. Now I can write it to the debits. We're ready Almost, We're almost ready. Gizmo, Almost, almost ready. Let's just throw into your actual oil you How about that? Okay, just throwing a oral short and then I'm going to throw in the server because I don't know which server I'm going to connect to. Because that's the tricky part, right? Alright, so how do I know which server to connect to that easy? E, back to the consistent hashing, right? Hr, the hash shrink. Remember we added three notes to the hash shrink, right? All we do to do is ask the node, hey, hash writing, I have this input which is called your LID. Give me one of the three nodes and it will do its magic. I do a module, whatever the implementation is, going to give you a server. That's the beauty of this thing. Let's do it. So let's see what do we get. If I do this? Let's clean this up and then do look at that. We got foreshortening one, we got silver, 5, 4, 3, 2, and will this key, right? That's funny. It's actually says key, Oh my God, that's completely a coincidence. Let's say sharding to we still heading in the same server, sharding 3. We start we have another server, 54 34. So completely almost random. Memorandum. I know I want to hit please give me the server 33. We didn't hit that server yet. Dot BH behind. Okay. Five 432, dammit. I won't 54, 33. Can you give me if I 433 a random 2 that gives me for over 33. Okay? That's what I'm talking about, right. Just as you start running, this will almost balance that stuff in a consistent manner. And every time you do the same exact point, you're going to hit the same exact server. Okay. We have all the pieces, guys, all the pieces of this year. So how about we actually had to write this stuff to the database. All right, Let's write it. How do you write stuff to the database? Very simple. We get that. We need to get the client right, that clients are in this bond, right? Clients such as what? What's the sub here? I want this observer, right? Because this is the key, right? Because now I have this server which is a string at the port that will give me back the object and the object when I get that thing I wanted to query right? Now, right? Why is it not showing me intelligence? I think it's unaware that is actually an object. Well, aberration or that should work. So select star, bad idea. Don't ever do a select star in your database. Let's see, you know what you're doing, right? So he's called URL underscore table. I like to make things big. That's what she said. Select star from URL in this table. Yeah, I don't know what SQL is always at cast to be kept locked. All right, It's just something that we learned back when I do when I used to do Oracle stuff. Back into those and 5, 4 actually, yeah, first job. All right, so where we do where what what, why are we selecting? That's not swear inserting, insert into your L tabled, right? I'm inserting what ID is going to be automatically generated, that URL and the URL id. And then we do values and then the fancy stuff, 1 and 2. And the 1 is equal URL. And the other one is you are all already. Okay. Sounds good. Right? There's an answer to me. All right. And that will insert, but you have to await it, right? And since we're awaiting that function that we're any better be a sink, right? And that's okay. And then we have to send that push always still recording. I'm good. All right, so now we sending that back and then we are sure that hey, we got it. Okay, I'm not I can add a try catch and be like a little bit more fancy. But yeah, Let's just make sure that we actually made it right. Otherwise we're gonna get an error. So let's do it, guys. Let's do it as do it. Just do it, refresh, right? And post. Wow, that's bolded that work. How about that? What servers that 54, 33. So how do we go which is 54? 54, 33. Damn it. I should have just put the right number. Yeah, I don't know what it's for for 33. I think that's a good chart to his fourth 433. Okay. Public and we go tables. Do we have a row, guys? That we have a row? Do we overall? So really what the heck was that printing PDF? Why? Jesus, OK, chip. Well, look at that beautiful stuff, guys. Wow. Well, how about are we doing another one, okay, Another one, digit to the right. And it's same server. Okay. Sure. It's even easier to hear me. Alright. Like that, another one. And that's the key, right? Let's do on Google.com doesn't have to be Wikipedia. Right? Addie, refresh. Although we had another server, Let's do that. Let's go to that server by showed three guys, one guy. I'm going to reference the code below so you don't have to pause and do all that stuff, right? Goofy stuff. You just, you just go to the GitHub page, look at that guys. So we're consistently hashing that f out of this stuff, man, consistent babes, that's what we're talking about. Consistent slay. About that. That's just throwing some URLs that URLs they call. How do I do URLs? Right? Let's just do URLS. I, less than 100, I plus plus. And then it is like URLs to push HTTPS, google.com slash. In India it goes away. Now let's go there. I social q equal cert test. I have to do the techs. Can you read guys? Hope you can read this. Okay. I'm going to zoom in a little bit. All right. Sorry about that. All right, so we added a look at how many oils we have. A lot of URLs, guys. So many oils. Oh my God, Let's go and do it right this time, guys. For each you you wouldn't even know me. I like it when you know. You been if God is a Longman. Alright. Alright, I think that's right. Did I get it right? Really debilitates. But showed us do it all Mama seat like that. Look at all that stuff. Okay. So we had a bunch of servers. We had five 433, if I for all that jazz. She arch, oh my God, look at all beautiful stuff, man. All right. We know this stuff is working guys or distributedly consistent. All right. How about that? 8. Class Project - Reading from a Shard: All right. We did the post about that. Where do they get? They get it will be easier, right? Let's do it. Get, you get, let's get that thing, right. And what do we do to do a get essentially what we are getting from the server as Alright, so I'm going to do a fancier this time, right? So I'm going to use the like that ID. I think it's called like that. Is it as a colon ID? I think. So the URL id will be like that, right? Will be part of the URL. So like for you do like extra GPA or ash localhost 8080 one. And then you do like nails. If this, this URL now will, this will be your LID. Get it. I get it. And I don't plan on getting guys. I haven't split everything with you guys. Sheesh. All right. So it goes to the auto ID will be equal request, but I think it's called params dot. You ought to lie D, right then what do you do it this way? I think I might be wrong, so we'll check it out. Yeah, so URL id 5, so we get that URL ID, which is this thing, right? Right. So what do we do? We need to clear it. Firstly, to do the query, we need to find which way at which database to hit. And to do that, we don't need to hash because we have the hash, right? Rvs, we have the meat. Alright, so we do that. Good, good, good. The U or L, the cell. And once you get the server, now we know what server, we know which clients. So we essentially do this, right? Isn't easy peasy Japanese a, right, so we don't need this. We just need to query and select star again. Never do this, right? Never ever, ever do this. Url tables I what's called 0 table, table URL. So select star from your L table where your L is equal to one. Thank God. What's his name? This guy who built this library, I think his name is Brian. Brilliant guy, man who follow men's wear. All right. Brilliant guy. Okay, so why we do URL id? I wanted to give you that. So you don't have to do that nasty single codes that you don't have to do that because you'll be, you'll be SQL injected and y, the way we talked about single injection because I talk about everything I want to learn in this job because I want my goal is to become a better software engineer every day. So anything that I don't know, essentially, I make a video about it. I learned and I make a video about it. So sequel injection, boat ticket up. Okay. We have that stuff. What abolish? We actually check this stuff out. Okay, So give me results and I think I awaited that thing. The results have a row. So if result dot rows, I think it's called lose count. Is it a grade and then 0. Then we just send back and resolving S. And then if there is no results, What's the status code for something that was not found? Guide 40 for sending status. Since since status 40 for those 40, 40 because it's not found right. If it's found that I'm going to return this thing. Right? And what do you do oral ID. Whales? Well, all right, so you're LID. Let's just do the same exact thing so we don't miss upright. Stay consistent, consistent hashing babes. I say URL id is the same line or UK, I want this guy and he tells me which server I went to to pick that thing up. All right. Looks like it's all right. Let's do it. Do it. Only valid in async function? Yes, sir. Because I forgot to add async here. Thank Gil. All right. Who do this get? Well, because it's easy, just do this. And now I need something that actually exist, right? This is not found, this funny. It actually tells you not found right? Because it didn't now is going to return for me and 40 for air because this d is Dasa doesn't exist, does so? That's, that doesn't exist. 40, 40 not found. Okay, so how about we find something that is actually there, okay, so if I go to any server, I'd just pick up any so what's this shorter three? Yeah, the source server, if you do this, right, we know work in or they might have some bugs guys. Go on, check what's going on. It hit it. Did I get this right? Yes. I get hc M7. And if I do that, I'm going to get citizens server 5, 4, 3, 2. And if I query, right, and I get rows count 0, 0 count as nor rows count its row counts. Even though I didn't get any rows back. All right, so let's see, what would it what are we doing wrong here, guys? All right guys. I had to pause the video where he found a big, huge bug on one of the implementation of the old library, that consistent hash library for some reason. I've practiced this video, I didn't hit it. What I headed viewing this video while recording this video. And apparently that hash ring thing, that consistent hash was not giving me the same value. So it was stupid and useless. It's just bugging implementation. So I found another library called hashing, which is a correct implementation. Finally, it's exactly the same, right? So you're going to add the same thing at our ad this, but now when I do a get right with a key to same thing, exactly, right. It gives me the same value every time because that's what I want. I don't want it to give me a random values back defeats the purpose of a consistent hashing. I don't know what that implementation was about. Right. So I checked on it wasn't updated for four years. I was a bad idea from us. I apologize. So the rest of the video, we'll use this hash Rang and instead of the consistent dash hash, right, apologize for that, but it's the same implementation writes is same idea, nothing changed. The code will, the latest score will use the HashMap. So now if I actually, so I go a width and a truncated older rose on my, all my shot and repopulated with a brand new set of IDs, right? So now I go back to gets right here. And if I do, like let's say I went back here and I say, okay, shard, white, I'm a shard three and I want E4 are jacks. So this is what shots three, right. Shard three. And I did. Okay. Go ahead and show me that for that ID. I get back three, right. So let's go ahead to actually shard one of the rows and short one. Okay? Unless check that out. Do we give the values here? So yeah, there's a bunch of values here. Let's pick one of them. And then MCA 875, I do that, right. It's just to refresh that thing. Right. And if I do me that, I'm going to head start one, 54, 32, That's exactly what we want, right? Powerful stuff, right? So once we did that I nor server to go, alright, so let's go ahead and complete that query and add all that jazz, right, so let's go ahead. So we get the result back from the query, right? If the result or row count, I think it's called greater than 0, then I want to send back, obviously this results, right, guys? And I want actually just send the rows back, right? Just go ahead and send it. We can just do. Okay, because we're going to get one rollback, right? This implementation, by the way, guys could have duplicates, right? I didn't account for duplicate, but it's easy to fix, right? I'm not going to fix it here. Obviously, this is a sharpening video, not a whole. The URL shortener design course, Ryan, so a URL shortener, giving you the idea, I'm going to get oil. How do I get the URLs? I'm going to do this rose sub 0. And let's just do row sub 0. I don't know the actual values there, so and that server, right? And then if it doesn't, if you couldn't find a we said we're going to do response dot send status for O four. Okay, nowadays, coolish, let's do it. And let's just remove that breakpoint so it doesn't get a and knowing that we've got its id 4D and it's formed server this and all that jazz, right? So how about we take another one, another one to jihad it? That's the same server. That's what we pick up. Another solo on, not thus Silva. And that's when I truncated those tables. Let's go short too. How about that? Let's pick something from short to do we prove that actually there's showDish warping, right? Log of that shot to 54, 33, and we picked our results with it. So I spread out my databases and now my rose on my table then now I'm always getting that stuff back. Powerful stuff, right? So we know how to write, how to read. And that's essentially the basic self sharpening, right? And what if I do something that doesn't exist, I get no found, right? And if I do that is not found, right? So very basic implementation of a URL shortener with sharding giving to you guys there, I'd hope you enjoy it. It's coded, it will be in the description below it, so it's not that bad, right? It's like foreshortening. That's pretty cool if you think about it right. Now, guys, adding shards is not something easy. You can't just add any sharp because you have to really be careful while you're doing. You have to be careful with the algorithm, right? So the one I did is like it's fixed to the number of sharp, right? So the moment I add a new shot is going to redistribute and everything is going to go to F, right? So you really need to know what you're doing with this sharding or racial earning is as different topic. It's one of the disadvantages actually of offshore settings. All right, so that's it. It's just jump back and delete slides and talk about pros and goes off sharding. 9. Advantages of Database Sharding: All right guys, so we talked about charting little bit. We did an implementational shorting with consistent hashing and all that jazz, right? Pretty cool stuff, right? How about we talked about the pros and cons of sharding? Is this perfect? Well, let's check, let's check it out and see. So the pros of shorting is you're going to get scalability. And that's both in data, memory and even CPU and any resource that is on your server. Instead of having your server, essentially, the resources you're getting hogged with one machine, you can spread the load on multiple machines. Das horizontally partitioned horizontally scale things, right? Because now essentially your data's shows spread out and now the client is aware of which is where and I just need to know and head that essentially and just build up and build that up, right? So that's essentially what needs to happen. So you get scalability for free. That's great, right? You also get security, right? And that's, that's interesting one Let's say you like, if you have like certain customers data, you want this short of customers to be axes only by certain people, right? So you can put your VIP customers in one short and other to Mergesort another shards, right? And then you can have little bit Access Control or what short can be accessed by which clients. So with that, and instead of having everything in one table, you kinda spread it out and have kind of access control over was shorts like Hey, so already you cannot access this short, right? So you can essentially have this level of control of security with prose was sharpening, has pretty cool stuff, right? And it's optimal because they were going to get a smaller and smaller and smaller index size. And that's cool, right? Because the smaller index you get, the better and quicker and more efficient queries become, right, cool stuff. 10. Disadvantages of Database Sharding: About cones, is there anything battle this? Well, you so you saw that go guys, see how complex our clients get. It's so complicated, right? The clients becomes very complicated because they are aware of this sharding thing and they need to know it's like, oh, this input is in this short and this input is in the Schar. Well, you can use consistent hashing to hide that by still, the awareness of the shorts become really complicated as you start building a huge application, this becomes really nasty, right? Becomes really nasty to have a complex complexity into the client. The tests, which I think I'm going to make a video about tech from YouTube that allows you to do sharding on the spot. And it's hitting from you. It's a back-end showering and the client just makes it a coin, does the shorting for you, which is pretty cool tech, right? So that's, that's the potential. And talk about that. Transactions across short is a problem. How do I do like to insource that happened to have 22 tables, right? And I want them to do it atomically. That's almost impossible because you're, you're essentially doing 20 to database transactions to separate database at atomic transactions. It's almost impossible is very hard to do, right? So you can't do transactions very effectively with charting, right? I'm right. I mean, if you do it on the same sure, that's yes, you can do that. Rollbacks or very hard lecture will say, Oh, I want to insert in this table and then I want to insert in that tail. But ooh, something that happened, Let's roll back everything. Well, what does that mean? You're going to roll back this transaction, rollback the other transaction, despite it being one logical transaction, complicating stuff, right? Wall backs up. Very hard. Schema changes. Let's say I want to add another field to my URL table. I don't know day towards something. I have to add it in all the Schar drives like I make a change and all the shards and you can just roll it, roll this change slowly if you want to, but it's a little bit dangerous. If you miss adding one schema and one short, but you add charts, joins, I've no idea how to do joins across databases that even possible, right? It just becomes very hard to do joins, right? That's why you might use horizontal partitioning instead of sharding. So your stuff as one benefit of HR, horizontal partitioning, not human resources, right? So and so he essentially have the two tables in the same database. And then you can join them, right, instead of having the table across multiple databases. And how does other thing it has to be something you known that query. So if you're, if you're querying for something, the key, the partition key, the shorting key in our case was the URL id. What we know, right? And that's based on that, we know which database to hit or what if I want to query on some other field that I have no knowledge? And so that is expensive because now you have no knowledge, you don't have the key. So you forced to hit all the databases to know which key, right, where, where the data resides essentially in. An example would be like, Hey, if you have like customer West and customer eastern to shards and you searching by name, Aflac cried. Name has nothing to do with short. So you have to query both shots. 11. Class Summary: Summary, what did we discuss, guy? We discussed shorting, we discussed charting is the meaning of sharding, spread your tables into multiple database instances where discussed consistent hashing. Just a little bit. Consistently hashed data into multiple entries, right? We discussed horizontal partitioning versus, versus sharding difference between the little bit that we talked about, the advantage, disadvantage with those where you went through code guys was long code and the code is available there. So we essentially wrote an URL shortener kind of thing that shortened URL in and expand those URLs, right? And then finally, we talked, we talked about pros and cons. Hope you enjoyed this video. We're going to see you on the next one, guys, you guys stay awesome.