T-SQL In Two Hours | Tim Smith | Skillshare
Play Speed
  • 0.5x
  • 1x (Normal)
  • 1.25x
  • 1.5x
  • 2x
10 Lessons (2h 18m) View My Notes
    • 1. Introduction and Part 1

      10:00
    • 2. Part 2

      9:46
    • 3. Part 3

      11:00
    • 4. Part 4

      11:39
    • 5. Part 5

      8:45
    • 6. Part 6

      10:38
    • 7. Part 7

      9:04
    • 8. Part 8

      10:26
    • 9. Bonus Challenge - Live Coding Practice 1

      33:55
    • 10. Bonus Challenge - Live Coding Practice 2

      23:11

About This Class

In this short course, we'll learn:

  • The basics to T-SQL coding.
  • How to ask questions with data and answer those same questions with T-SQL.
  • Two highly-profitable example questions that we can answer with T-SQL.

This course is ideal for:

  • Web and application developers who work with a database layer.
  • Data analysts and developers wanting to review the basics quickly.
  • DBAs who lack T-SQL expertise.
  • Entrepreneurs who want to analyze their own data quickly.

Transcripts

1. Introduction and Part 1: All right, We're gonna be beginning Introduction. Today's database development Basic T sequel querying. So this is the part of the course where we're gonna be going over some of the basics of TCP querian. I would suggest going through this part of the series, then coming back to this part of the Siri's when you start to work on the course project, OK, there is one of my favorite pieces of advices Review the basics of your profession every year. So every year, I go over basic t sequel querying, which is what we're gonna be going through on the reason why I do that is because I will forget some basics as well. Okay? And you want to always be touching back up on the basics that you learn. So I'm gonna keep these videos to about 10 minutes each, and after 10 minutes, I'm gonna stop. And then there's gonna be a part two in a part three. So I'm gonna be going pretty fast. My suggestion would be with you to me. You can slow down the place speed if you want to get more out of it or if you think I'm going to slow you can speed up the place, speed. So the first important thing to note about when we Cleary Teoh using TC full, we have to query in the context of the database. And when I log onto a server like this server right here, it's gonna default me to master. Okay, so you'll notice. I say use application because I want to be connecting to the application database first. The most important thing. If I don't do that, it's gonna catch me by default to master. And then I'm no longer in the context that I need to be. So now that I'm in the context of application, I can query the tables involved in application, and we're gonna go over in a second. How I can Cleary tables in another database we can do across query across database query, but for now, we're not gonna cover that. So let's look at I'm going to select the top one star from this table Gold historical data or GLBT Historical data. Now, what is top one star? Me. Okay. If I select star, that means I'm selecting everything from a table. Right now. There's 2300 and 55 rows you see at the very bottom right corner. Some of you may be able to see my mouth. Some of you may not, depending on the depending on the software that you're using to view this and you can see the output on will cover the output in a second with the columns here. But the key is that if I select storm getting everything from the table, right, so this is the table name we can see. The table here will cover DB O in a second. But this is the table here. I'm getting everything. That's what star means. If I say select top one star, that means just get the top one. Just get one record basically. And we see that what sequel server does is it orders it by that I d feel there is no index on here. We'll cover that later. So what are the columns? Will the columns if I drop down this in the table? We have columns in this table and in the columns we have rows of data, right? I can actually look at how this table was built. We're not building tables right now, but we do build tables at this course. And if I script this table, I can see this is how this table was designed, right? There is an I D column, and it's an integer. There's a price data column and it's a date, right? Price date column and it's a date. There is a price column. It is a decimal, etcetera, right. This is this table, So schemers are what tables air on a lot of people depends on who you ask, really, But a lot of people use scheme is for organization, right? So let's use an example. Here I am going to create schema, blah, blah, and it's gonna create me a new schema. Now, if I refresh tables will, no no tables using that schema. But let's say that I want to use that scheme. I could do Select Star into blah blah dot gold historical data, and I have now created a table on that scheme of So let's go back to this. Let's look at that new table that we just created, and we'll cover this later in the course. But that's one of the quickest ways to create a table from an existing table is to do select star next line into the new table name from the existing table. But it does assume that we have a neg zip sting table. Because if we don't, then we don't. And you'll see that now that I have, I know how I'm sorry. A table under the blah, blah schema on the table. Name is gold. Historical Data. Okay. And it has the same amount of data has the same rose. Same. Everything okay? Because basically what we did is we copied it. The regs squiggly line is because I would have to reconnect a sequel server. It thinks this doesn't exist. It does, but it doesn't realize that because it doesn't refresh that often. And this is true for even sequel server 2016. Now this right here is 2008 are too. In this course we used 2008 are, too, and we use sequel server 2000 and 16. We might even use sequel server 2014. I use several additions for the benefit of students, and I want to say that the Sequels over 2016 I use I think it's either it's this one above developer, which for the like of me. I'm drawing a blank or its Enterprise edition. So it's either Enterprise edition or whatever. The one is right below Enterprise. It's above developer. This one is expressed. Okay. All right. So we know how to select data from a database context. Right? Were in the context of application. We know how to select all of the data we know how to select. A little bit of the data will get into the where clause later. Not in this video. Just because I'm looking at the time right now when it will probably be a bit So what if I wanted to select all of the data from another database? I'm sorry. From a database in A from A table in another database. Okay, now, first, before we do that, let's look at how weaken. Let's suppose I was connected to master instead of application. How can I select the trying to think about a word this How can I select a full table from the database starting with the database? Okay, so remember before I'm doing gold historical data, okay? I'm not specifying the schema, so I could if I were to specify the schema notice how that works. If I do blah blah, it would work right. I can select the data. What if I were to do blah blah mo, which is Altria Group? For the record, there's no object name that right? I get an error. But what happens if I were to do db o dot meow We see that there is a db omo, right? We don't see that. There's a blah blah. Let me refresh this. That's why you see where we have a blah blah here, but we don't have a db blah, blah mo right. We only have a blah, blah gold historical data. OK, so just to know that's that's something that we're doing well. One of the context for selecting the database dot schema dot table is this dot dot context , right? You'll notice I'm selecting from the application database, the schema and then the table right? I can also put them in brackets, and that's to me the best practice to dio database schema and the table name. I can also do here because we know we have a block law and we'll see this works as well. Okay, to go back to the view So the context here and I'm writing this out there is no table Name this I saw something was wrong immediately. Okay. The context therefore is select star from database dot schema dot table name or a table. Ok, while outside the scope of this course. Because we're not gonna be doing any link server If you were doing a link server, this is the way you would do this. Server dot database dot schema dot table students pay attention Posits video if you need Teoh. I have worked with senior developers who do not know this. This is pretty embarrassing cause it's very simple. As you see database dot schema dot table if if you want to do it on the database level on the server level, server dot database dot schema dot table always works the same way this server will be whatever the server name is, the database will be whatever the database name is, the scheme will be whatever the schema name is, on the table will be whatever the table name is in the case of link servers. So you do need to make sure that you have permissions you haven't set up correctly and Sometimes what you'll find is the way that if they don't work, here's a quick hack for a problem that you may face. What you have to do is you have toe right click on the Link server, load it up and then hit. Okay, You didn't change anything, but that refreshes it. I don't know why, but that's actually a fix for a lot of links server problems, So jot that down is very important. I've worked with senior developers who do not know that, and it's pretty embarrassing. Okay, so we're on the context of application. We got about one more minute. So let's look at how If I'm on one database, how could I connect to another database without changing this use to another database? So, while I'm connected to application, I can select from a different database I can select from bit lending analysis. Db o dot db o dot table agency ratings. So posits video. I'm gonna ask you a question right now and posits video and see if you can answer this before hearing the answer. What does this mean? Of course, I'm on application. What does this mean? The bit lending analysis. What is db o mean. And what is TB? Underscore Agency ratings mean posit video and answer the question. I'm gonna give five seconds before I answer. Okay, so bid Lindy analysis database name BBO was a schema name and TB underscore. Agency ratings is a table name, right? Even though I'm in the context of application, You see, it hasn't changed, right? I'm still in the context of application because I've specified the database here with the dot, dot dot I am connecting to that database. So that wraps up this very first part. In the next part, we're going to go into a little bit more details. 2. Part 2: Let's begin Part two based, basically sequel querying. And I'm gonna dive right into this so that we can use our time most efficiently. If you think I'm going to fast please slow down. The speed you to me does allow you to do that. If you think I'm going to slow, please speed up the speed. All right, So in the last video, what we looked at is we looked at some of the basics of select a right, and we have columns and we have rows of data. So I'm gonna select the top 100 from this table. I'm gonna show you really fast. So this this I d is a column price data brought price date is a column. Prices call him average. I'm not sure what, Boba, Boba is on gun. Of course, this one is a row to his row, etcetera, and you can see that. You know there's data in these tables like 5 4004 etcetera. OK, but what happens if I want to look at just one row of data and I want to look at the row of data? That is 2000 and 11. 0104 Well, I'm gonna select Star and I added a line of code here called the wear Cloths. The where clause, if you're taking notes, is a filter clause. This is where we apply our filter. Now, there's a lot of different ways that I could teach this part of the course. But what I'm going to focus on instead is some of some basics about the where clause that is very important, that people stumble over. Even senior people stumble over first where clauses are meaningless without an index. Okay, um, we'll show an example this button right up here in management studio shows include actual execution plan. Okay, we're going to see how sequel server actually analyzes this query behind the scenes. So sequel server is going to try to optimize all the queries we run, and we'll see that what it does is it does a table scan. You wanna know why? Because there's no index at all in what table skin, for those of you don't no hits the entire table, right, because there's no filtering. So really, this is kind of a meaningless query in the sense that there is no point for me to specify aware clause if I don't really have an index on it, unless I just want to waste Resource is, and that's always a possibility. But the better thing to do would be to index the table. So in this version of Sequel Server, which is 2008 are two and a good interview question, by the way, would be I would I ask interview questions a lot that I try to trip people up, so I'll say so in sequel Server 2000 and eight are, too. I have a clustered column store index that what I would expect someone to say You can't have a clustered column store index onions equals over 2008 are too Sequel Server 2000 and eight or two. You can only have two indexes. A clustered index and a non clustered in next. That's it. OK, so we're going to create a non clustered index so the sin taxes create non clustered index on. I'm going to say I X I'm gonna name the column Price date on the table name, which is G. L D historical data and then the column name Price date and I want it indexed. Well, we'll just, uh what is due price date? I don't have to do. Is Cindy Okay? So I'm going to create this index now, remember, in the execution plan that when I filtered by price date, it did a table skin. Now, let's compare what happens now when I do that, Okay. Whoa. It's totally different, right? It's no longer just a table scan, okay? And we will see that it does a nested loop here. It doesn't index seek, and it doesn't. All righty, you look up heap. Okay, Now, I'm not gonna go too deep into creep plans here. We will cover some of that later in this course, but the key is that it's no longer a table scan. Now, really, for most people, myself included, that doesn't really mean much, right? Because it's like, Yeah, but I mean, is that a lot better than it was? So let's go ahead and let's drop the index. Okay, so now it's gone. Let's do the query again and we'll see that it does a table scan. And since we're querying like to light, let's go ahead and let's compare the cost. So I'm going to set the statistics. I 01 I've worked a senior developers, by the way, who do not know how to do this? So this is this is really good stuff that you will want to pay attention, Teoh. And if you're taking notes, I would take notes and learn how to do this. Because this is when I query tune. People think I am incredibly good at Cleary tuning, but this is all I'm doing. I'm looking at the plan, and I'm looking at the statistics. That's it, and I'll determine it from there. But I'm always surprised how many people don't know how to do that. Because there's a lot of So we have our table scam. What are the statistics? I see that it did. One scan, 1274. Logical reads. What I'm gonna do is I'm a copy. This gun control. See this? Okay, this is how you can look like a genius to. All right, so that's my 1st 1 Now I'm gonna go back, and I'm gonna build my index on that table, and we're going to select the same query. Look at the execution plan. Yeah, Index. See God, er, iata and We're going to copy this and we're gonna compare. Like to like, I like how this scrolls thank you. Sequel server about trying to get it up there for you students. Here course everything else. Zero, I think everything zero k. So we see the two sides. So with the index, we have a scan count of one. Makes sense. Logical reads of four and 0000 Everything else. So compared to two, which one is cheaper? Well, logical reads of four is much less than logical reads of 1274. Remember that the table only has 23 55 records 2355 records. I think about that for a second. Imagine if that table was a 1,000,000,000 records. What the logical reads would be. It would be a lot more than 12. 74. Look at what this is. This is just four Reed's wife. Because what sequel servers doing? I don't want to get to advance. So I'm I'm trying to think of how to convey this to you as a student. That makes sense kind of introductory to sequel server here, But think of it as when you're in a car and you're deciding to turn left, right, Right. And you're looking at cars that are coming. What you're doing in your brain is your kind of calculating your guessing at the speed of those cars. Right? You're gonna You're only gonna turn right if you think you have enough time to do that based on what you estimate their speed to be right. So that's kind of an unrelated example. The sequel server. But what the optimizer What's the sequel server Optimizer is doing is it is making assumptions of how to get that record as quickly as possible based on the assumptions that it has. And what we do is we structure tables in a way that allows it to make faster assumptions. So by building an index, it's able to make a quicker assumption that if there's no index, so that's what I mean by a where clause becomes useless. If I don't have some type of be indexing structure that allows me to maximize that filter, why would I throw a filter on a query if there's absolutely no index at all? And I and I wonder why? Well, why is my query. So right now there are exceptions to that rule. It may be that I just want to run a query one time, and that's the only time I ever want to run that query. But it doesn't really make any sense that I would on a regular query, that I would have no index and be very strict with my filters and complain about my queries going slow. So a couple of things to really review here because this actually is a little more advanced than just junior level and I have worked with junior developers have no idea how to determine if Aquarius faster, slow set statistics. I Owen Okay, that will tell you the statistics of your queries. Okay, include the actual execution plan and then read the execution plan. There is a great book by Grant Fritchey, and I believe it is free through Red Gate on execution plans and reading execution plans, I highly suggested. First of all, it's free. Second of all, it will make you much more familiar with what's going on. We're going to do some of this in the course, but reading that book, I read that book every single year because it is so useful. Okay, if you're going to use a where clause and it's very important and most crucial need to, then you want to also look at where are your indexes and how can you maximize the speed. So I know it seems like it doesn't all make sense. But as we go through the course and we optimize and retuned things, you'll realize how we're using these tools over and over again. So if you have a where clause and you have no indexes, not the smartest thing in the world, at least if it's a regularly returning query that you want to, you want to be running very quickly, so that wraps up this part to lessen. I don't think I have enough minutes to cover the next topic, but we've looked at kind of the basics of the where clause, and as we go into our project, the where clause is going to make a lot more sense to many of you are gonna be like, Oh, I get it now. This is where we're applying this, but for now, just recognized that the where clauses one of the filters that we have in sequel server, and it's very important that if we're needing something to return quickly and we have a where clause, but we have no indexes than that where causes not maximized. 3. Part 3: we're not gonna begin Part three of introduction to database development. Basic T sequel querying. And we're gonna be going lickety split on this video as well. If I'm going too fast, slow it down with you to me. If I'm going to slow speed it up with you to me, we're gonna be demonstrating two things here today. We're gonna be looking at the group by in the order by now. Before we do that, I want to introduce students to the concept of the update statistics. Any time you add data or use a track data any time you change any type of structure in the table, whether it's data less data, more data, I would highly suggest running updates. Statistics on the table. This will update statistics on the table itself as well as any of the objects related. In fact, we can see all of the statistics related to this table. You see these five statistics? Those will all be updated. If I were to specify those objects individually, I could only update them. For instance, I can only update statistics for an index if I want, But updating statistics is not gonna disrupt you. It all updating statistics does help in some cases, and it is a great way to tune. You can read more about that, but I do want to demonstrate that now because it's very useful toe learn early on, any time you add a bunch of data, it's same thing with my ET al course, which I teach students in detail was. It's a very good thing to update statistics. You will get a lot more speed in many cases. Sometimes that's what's actually happening on as a senior. Db A. I've run into a lot of issues that are just a simple matter of updating statistics. You would not believe what they are, too. It always surprises me. So let's go over. Grew by an order by Grew by is a very difficult concept for most people to get. So I'm gonna start with water by and you might want to review this video multiple times. Okay, So first let's go to this query here, and we're gonna be using one of sequel servers built in functions. Will several of them actually, so you'll see that the price state we selected is 2011 No. 104 Well, what happens if I want to get all of the dates in 2000 and 11? Let's go ahead and do that now. This is a very poor quality query, but I'm intentionally doing that to demonstrate something. I can say the year of the price date and I can do equals two, 2000 and 11. Actually, I believe it's this. Yes, that's what I'll demonstrate. Order buying a second now was going to give you a quick can't hear. The reason why this is awful terrible is what sequel server has to do is it has to apply this function to this entire column very poorly written. The better way to write the query would be where the price date between 2011 whips No. One, No. One and 2011 12 ships 12 31 right, and we get the same. The reason is because noticed there's no function applied to this, so sequel server doesn't have to first apply function. I'm now passing in. I'm saying, Hey, just take this column and find me all the records between here, it can make far fewer assumptions. This is what we call a solvable query. OK, very important. Much better way to write your queries. So we're gonna do it right now. I wanted to show you first how to do it wrong. It's always good to know how to not do something so that you know what to do and why. And so when someone says, why can't I do this? Wait, now you can You can sound like on expert and say, because it's not a sergeant query clam. So let's look at the order by order by price date, what do you think's gonna happen when I warded this by Price State? What do you think's gonna happen to the price State? Look at how the price state is right now. In 2011. 0103 or four or five or six or seven. What's gonna happen? OK, now the default is ascending, it does it ascending. And by the way, the way you specify ascending is this. But it is the default. What happens if I do descending, okay. And by the way, I'm an inverted thinker. I'm just warning you. So even I sometimes get swapped on ascending and descending. So, um, I'll be the 1st 1 due to call myself out on that. And it's descending means later comes first. Ascending means earlier comes first, but I You would not believe how many times I have gotten that backwards, and it's because I'm an inverted thinker. So I was think backwards and have to rethink backwards. I know it's crazy. So what order by does? It says, How do I want the records to come back in? What order? Right now. Right now, I'm doing it on date. So I eat. Either can start with the early dates ascending or the later dates, which is descending. What happens if I want to do by the price instead? Let's just look at it at the price I want to do from price descending for the year. Well, now it doesn't matter what the dates are, right? Look, the dates aren't even ordered, but I'm ordering about the price. The highest price comes back first. Okay. What about ascending? What was the lowest price of gold or the G L The E T F. That year was 1 27 again. The dates are not ordered. It's the price. That order. Right? Okay, so that's something to note on the order by says that gets set out of the way Group by is a little bit more difficult. You have five minutes to discuss that. So let's go after that one. So grew by It is difficult because a lot of times people have to do this many times before they get it. So we are ordering by the month of the price state, so we know it's gonna come back ascending, right? We're ordering this one by the price day by the day of the price date. I'm sorry, and we're doing that ascending. What about the group? By what is group by doing here. So let's look at this. All right. I am grouping by the month of the price date Inquiry one, an inquiry to I am grouping by the day of the price state. So when we grew by the month of the price State, what we are doing in this query is we're saying, tell me all of the prices that we have the count of all of the prices by each month. Okay, so there are 12 months in the year. You'll see we get 12 records back, okay? And we can see this Because let suppose I were to leave out this call and I would say Tell me the count of everything, and it's gonna be like, Well, okay, fine. I can I can order that, But you'll notice we don't know which month that IHS, Right? So let's go back. Likewise, I can remove this column and watch what will happen. It was just tell me the day of the month. It was just giving 12 months, right? So usually when you do, group buys. In fact, almost every time. What you're trying to do is you're trying to determine aggregates, right? This is the way we aggregate data. So I'm going to scroll this window up and we're gonna look at some of the other aggregates . It's too high, actually. Some of the other aggregates that we can do. So we know the count of the values each month. What is the average of the values each month? What is the max of the values each month? And what is the minimum of the values eat month? I have to group by the month because that's the aggregates that I'm interested. I'm interested in the county average an accident men each month, right? So I'm on a group by that month. Okay? I can't group by day. Watch what happens if I do this If I grew by day, but I'm looking for the month. It's gonna be like it's invalid in the slickness because is not contained in either an aggregate function or the group by Klaus. So if I'm looking for these values, these aggregate values by month, I need a group by month. Okay, so let's look at these and we'll see. I love the no column name. By the way, you can resolve that by saying, as count in months as average of month average of month as Max of Month, what was forgotten H and then I can do as men of month case. Let's look at this and we'll see now two columns have a name so we can see. For instance, in date month, which would be date Month would be a January. I'm drawing a blank. There's a 202 total values in the table that are in the month of January, the average in the month golds Pricer GL. These price, I should say, is 100 to the max that it ever was in the month of January is 169 and the minimum that it ever waas in January is 41. What's very interesting to me is always about doing analysis like this is you always find these months where it's like what and actually we can do I want to say STD Evey gathered is we could look at the standard deviation of fast standard deviation of of month just to see really fast. Hold on. Yeah, that's really not that meaningful. For a minute I was like, I wonder how meaningful these values would be relative to the standard deviation. Not very meaningful, but you can see it. It is pretty interesting that it looks like gold does have certain times of the year. It peaks higher Gld I should say it. It's a little bit higher than other times of the year. What I'm doing here is I'm grouping by month, right? I can take these same aggregates here and I can apply into this career. I'm grouping by the day instead. Now the problem is, I haven't by month. Let me change in today. Yeah, I'm gonna be pushing 11 minutes on this video and I apologize for that bomb and go really, really fast now once I get thistle typed out day and and then day and the purpose of this is to show you the group I But most people with group buy it takes a lot of work to get by stepping through exercise after exercise because it doesn't feel intuitive efforts, though some of you will immediately get this. So you see, now what I'm doing is I'm looking at the count in the table the total number of values that are in the day of the month of one and then the day of the month of two and in the day of the month of three. And I'm looking at the average in the max value in the mini value. And then, of course, the standard deviation. So what group by is grew by allows me to look at aggregates. It allows me to look at things like, What are the total number of values that I have in the table that are in the month of January? What is the total number? What is the average of that value in January or what is the average of that value on the first business day of every month, etcetera so grew by is very useful in aggregates. Anytime you have someone requesting aggregate data, you were more than likely going to be using a group by 4. Part 4: All right, let's look at introduction to database today. The basic querying part four. We're gonna be looking at left joins and inner joins, and we're gonna use some simple examples in the course project. We're gonna be doing more advanced or some more advanced analysis, I should say so when we cover the course project, you're going to see mawr of the joints and left joins, inner joins and left joints in the context of them actually being used for business logic. This is just a simple demo example that we're going to be looking at. So let's scroll up here and let's look at our two tables. One of our tables is called table. Join one in one of our tables is called table joint, too. We're looking specifically at the column one i d. And we're looking at the column one i d and the other table. So both tables have a column called one I D. And what I'm gonna do is I'm gonna do a basic join where I am going to join on table, join one, and don't join on table joined to. What I'm gonna do is I'm gonna join on the one I d and I'm gonna join on the other one. I d right. So let's look at this. So in inner join is looking for a match. Okay, so let's go back to the tables. That's what the joint looks like. You'll see each table. House of one I d. Of 12341234 But if we look at the tables will realize that table joined to Has a night one i d. Of five. Now we noticed When we do to join, we don't see five. Why is that? Well, it's looking Table joined. One right table joined one doesn't have five on Lee has up to four, so that one is excluded because it's not on their. Which brings us to the concept of left joints. And these are simple examples intentionally. But we'll see in the course project why these this is important and why we would want to do this. And the business logic, especially for analysis, left joins and Inter joins are very useful. So I'm going to be now selecting from table join, too, and I'm gonna be joining to table join one. So a left join is saying I want you to look at all of the records from this table that exist in the other table or they don't exist. In other words, where there is a match for where there is not a match. If there is no match, then just return a no for the column. So we get everything in table two and we get everything in table one that matches, and we get Knowles for something that doesn't match. Okay, if we were to take this query and we were to flip it around, we would not get the five. And the reason what I'm saying is select everything from here that is in this table that matches, and that doesn't. But the position is starting in this table, right? This table, every one of its records also exist in the other table. Where is this query is? I'm saying, look at the everything that's in this table and get me the records that match in this table . Table one are some joined one, but that don't match as well and then return a note for the ones that don't match. Right. So the reason why these two return differently is the starting point in a query tube. The starting point is table joined one which doesn't have a record of five. And so it only returns everything that matches because it doesn't have something that doesn't match. Now we can change that one second. Let's do this just to not mess up things. OK, now let's look at our tables. I inserted a record into that table and we'll notice now this has a record that this doesn't have right has five. So now let's look at our left joint. Let's first of all, look at the first query because table joint to say, Get me everything in here. That doesn't match five. But where's the six act? But we're not starting from the table. Join one, right? We're starting from cable. Joined to it is five and the other table doesn't, but it doesn't know about six. What happens if we start from table joined one Lit knows about six because it had a six in it, but table drawing to doesn't have six in it. Okay, so that's why these the behavior of these air different, because what is the table that we're starting with? We're starting with table joined one right or we're starting with table joint to that's going to affect our left joint. Very important note. But if I go back to the inner join, watch what happens when I do the inner join Remember I had in six. Doesn't matter, right? Because six is not in both tables. Neither is five. So we have records in both tables that do not exist in the other table, right? We have five only have six. Those don't exist. Another table, so the left joint will pull that out, but the inner join will not. All right, now let's look at one example. This is ah, very important example. You're gonna run into this if you're a sequel, T sequel, database developer. Pretty much your entire career. And that is the issue of duplicates. Because inner joins are affected by duplicates, as we'll see. So we noticed that in table joint One. We have an idea of too. We have won I d 12 idea of to, I should say, but in table joint to we have 12342 ID's of two. Now, let's say hypothetically, these are duplicates. How is this going to affect our inner join. What do you think our inner joins going to do when we have in one table only one idea of two, But in another table we have four ideas of to watch. What happens. We get four back now we know we'll wait. Wait a minute, Tim, but there aren't four to ideas and table one, it doesn't matter, though. It's determining whether there's a match right. There is four matches on to idea of two in Table two that there isn't Table one, so it's going to return all four of those now think about how that's gonna affect the duplicates if we only need one. What that means is at this table joint to has three duplicates, if you would of the first. All right, So let's look at a simple example of the table with duplicates, and it only has one I d film. I kept this simple. Let's suppose that the rule the business rule of this table is that it can only have one distinct I d. What I mean by that is we can have one. We can have to have three. We could have four, but we cannot have two of one of those i d. S. But we notice that we have two twos. That's a problem. Right? Okay, so this is one of the quickest ways to remove or to select out. We're just gonna look at selecting out duplicates. Remove. We can get into later. But we're going to do with select out dupes as we're going to select the row number. Actually. Will do. Didn't drink over. Partitioned by I d order by i d. And we'll call this dupe I d. Because we can't have two columns. Name the same thing from this table right here. Then we're going to do selective star from select out dupes, and we'll cover what we're doing in a second, actually, what am I doing? Uh, roh number not Didn't shrink. There you go. Headed backwards. All right, So and I realized the logic of this for beginners is kind of like, Whoa, we'll get in that in a second. What the dupe idea is doing is it's telling us where are duplicates are. All right. So if I wanted to select where non duplicate Rose are, I would just say, where do pai d equals one? Okay, now, I have selected out the duplicates right now the duplicate anymore. Okay, but let's go back to let's go back to this and row number over partition by I d ord Obaidi . I will admit this is this is usually very difficult for some people to get ordered by i d. We all understand its ordering the rose in a certain way. Okay, so we all understand that we get that partitioning is we're going to break it into categories. OK, so I say partitioned by i d. It's like a division problem. OK, so I am going to say I want you to partition the row number, which is what this is going to be. It's going to be the row number over and we can see this. Let me let me demonstrate. This is well, what happens if I just said row number over order by i d. Let's do descending just for fun. Unless you're descending i d. And then let's copy the same things will probably make more sense to students ascending on the flight coding, by the way, is a lot of fun, but it's it's very helpful, and it helps. You'll see how to do these things. So let's look at this now. Like star. Okay, Good. So you can see what I've done with the ascending ideas. 12345 Right. So, I mean, it's all it's doing is the same thing with this 12345 is doing right. It's just ordering it and then descending is just going backwards. Right? So we get that this is this is pretty straight forward. These to make sense were just ordering it. But when I say partition by I d, what I'm saying is, what I want you to do is I want you to divide it by repetitious i ds if there are any right , So we're gonna code that comment that out. By the way, that's how you comment out something we're back to here. So there's two of these, right? So when I'm partitioning by the i d, it's going to say, Wait, there are two of these, so I'm gonna order this, right. So this is a distinct row number because there's only 11 This is a district number because there's only 13 and there's only 14 But there are two I ds so it's going to say Okay, well, I'm gonna order this by road number. So I'm gonna say one and I'm gonna say to so on and so forth. And let's just prove that right? And we can do that. And the great thing about T. C equals it's very similar to math. Insert into table duplicates values. Let's insert another two and see what happens. And let's insert a 2nd 3 and then let's insert five. And actually that's entered another two. And then let's insert another three and see what happens. So we're gonna insert more values into this table and let's see what the effect is when we do this role number. And I realize I'm getting near 10 minutes, but this will make a lot more sense past 10 minutes. I'm sorry, but you'll notice 1234 because we have four twos now, right? So it is ordering the wrong number, but it is dividing it by that I d. Same thing with three didn't realize we have three threes, right? We have four twos and three threes, but we only have one form. We only have 15 and we only have 11 right? So if I want to select out my duplicates, I can just say where the Dubai D equals one and watch what happens. It filters them all out, and I get the distinct ideas back. OK? Very important to note, because you will have that challenge a lot in business where you have to remove duplicates . This is one way in which you can select out duplicates so that covers the basics of the inner. Join the left join, and then how to select out duplicates, because you will run into times where your inner join returns. Two Values. Three. Valerie's four values. What that means is one of the underlying tables in that joint has duplicates triplets, quadruplets, whatever you wanna call him in one of those tables, and that's how you can select those out. 5. Part 5: we're gonna quickly cover the basic T sequel lesson of Cross apply. Now, cross apply is one of the most useful functions in T is equal. And I used to actually ask in interviews about this one because it's very useful. So we're going to do years. We're gonna look at a table with student loan data or we're gonna look at a table, I should say, with net worth data of Echo Boomers, which are people born from 1980 to 1995. And in this case, I'm filtering out all of the American Echo boomers. I live in the United States. However, not all echo boomers are Americans. Some of them are not from the United States. And so this is exclusively looking at foreign echo boomers on this. On this, I should say year range. And then this right here is looking at the average net worth and the average student loan for the echo boomers. In fact, what I'm gonna do actually is I'm gonna subtract the average student loan because we're just gonna be looking at net worths when you subtract that down here as well. Okay, so we're just looking at those. So This is the average net worth for all echo boomers in this year of the study. So in this year of the study, this is all of the foreign echo boomers. As you can see, the non American echo boomers there's about 76 I believe. I wanted to say there was several 1000 echo boomers a year that I spoke to. Well, what happens if I want to look at every single one of these foreign echo boomers? And I want to compare what their net worth is relative to the net worth of the entire group as a total? And so we're gonna be doing here is we're gonna be doing cross play. There are actually many ways to solve this problem, but this is this is one way to use cross apply. So what I'm gonna do here is I'm gonna move my window up a little bit and then I'm gonna move down here, and then we're going to I got to get myself space And what I'm going to dio take this off to look at this query because I realize there may be some questions on this query, so we'll go over this in second, but I'm gonna be demonstrating really fast the cross apply, and I'm gonna call this column as foreign echo Boomer difference. So what is what's going on here? So we look at these two queries about first of all, what I'm really doing is I'm looking at two queries. I'm looking at all of the foreign echo boomers, and I'm looking at the average of the total net worth of all echo boomers. Right? So that's all I'm doing now what I'm doing with cross supplies. I'm saying I want you to apply this results set highlighted the wrong thing. I want you to apply this results that to every single column of the table. OK, so in other words or every single row, I should say so I want you to out a call them to the table, and then I want you to apply that Call him to every single row. Okay, So when we look at this, if I were to select this these data, this average net worth is coming from this cross supply Notice how this average net worth. It added a column from the cross, apply okay. And then every single road that we have of net worth. It adds a row that is that result set. Okay, so it's actually adding this value to every cone. I'm trying to every row now. This value if you think about it. There is just one of this value, but this value is added to every rule. Why is that useful? Well, what am I doing here? Well, I'm looking at the difference between this person's net worth and the average net worth, right? So I want to know how this person's net worth differs. I mean, each of these echo boomers, these foreign echo boomers or non American echo boomers, I should say how much each of these non American echo boomers their net worth differs from the average net worth of all of the Echo boomers. Right? So this allows me to drill into this analysis and see overall how many, how well they're doing. And I happen to know from all of the data that I collected over believe was three or four years foreign echo boomers in general are doing a lot better than the United States echo boomers. So people who are non Americans do much better in our country. than the people that are from our country on a lot of that comes to various reasons that are kind of outside of the scope of this. But that is something that I can use to drill into the analysis now. There's also something useful here that I'm using, which is this case win statement. And I'm gonna explain this really briefly remember that I'm looking at the difference between the net worth and the average net worth, but the average net worth is negative now. What is a negative? Minus eight? What is a subtracting? A negative result in subtracting a negative results in a positive right. Let's do a little quick demo of this just to review some basic math. And this is why math is kind of helpful to understand. If I sit there and I say one minus five, I'm gonna get equals Negative four. Right? But what happens if I say one minus negative? 51 math? We know that's gonna actually equal six. Why? Because a negative minus a negative equals plus right because we're getting the difference . That's the important thing. That's the reason why one minus of minus five equals a plus. because we're getting the difference. So what we have to do? We want to look, we want to compare the difference of the net worth. So if the difference of the net worth in our example here is a negative number, then we're going to take the absolute value of that number, right? What is the absolute value of negative five? Well, it's five, right? So we say one and then we say, minus when we say we're gonna I'm gonna put this in parentheses, absolute value of negative five, which is five, which equals negative four. Okay, so we get back our answer that we should. Well, the same is true here. What I'm doing is I'm saying the case when the average net worth is less than zero to it's a negative. Then get me the absolute value of this nuts worth. Otherwise. Just get me the value of the network. The else here means just go ahead and get me the normal value if it's greater than or equal to zero. Because if it's a greater than or equal to zero, who cares, right? I mean, I don't need to subject a negative. In other words, I don't need to take the absolute value because it's a positive or zero, because, let's say what is it? Five minus zero was 50 doesn't affect it, but five minus negative one is gonna be six because we're getting the difference. And that's why so this function here, this a B s function in parentheses, which, for instance, let me demonstrate this really fast like a B s minus one minus five is going to get me the absolute value of five. I'm sorry. Negative five, right. Same thing. If I get the absolute value of five, which will be just five, it will be five as well. That's what this a BS does. The case win is wrapping logic around it. OK, the case when is just saying Hey, this scenario get me the absolute value when the network is less than zero, when it's a negative number, Otherwise, just give me the network. Now we know ahead of time that the network is negative. But if the network net worth were to ever to be positive, that's the key here. And the reason why we want to write it this way is because net worth changes over the years as I do more and more analysis, right? Same thing with your business model. Although your business model or your idea or your program or your Web development application, whatever it is, you might be in a situation where two negative number. But you never know if later on it's gonna be a positive number. Right? So you want to be careful about manually hard coating? If I were to do this, if I were to manually hard code, that is, then it could affect things. Now, in this case, this is one rare example where I can actually just take the absolute value of the net worth overall. But you want to be very careful about doing that. So this demonstrates kind of a case, one of how we can use the absolute value of its negative number or we can use the other. And this also demonstrates how to use cross Apply now, in the case of no matter what, I always want a positive number. There is the value of using the absolute value as a case important. So let me just go ahead and do this year, we're gonna cut this out, and then wrap this up here, and we'll just copy down as and what we can do here is the absolute value of the network. Because no matter what, it's not gonna change. And that's how we would write this. And we see. So regardless, whether it's positive zero or anything, we can just do it that way. The case one is very useful if we want, if we want some logic built into it. But the case one is not useful if we don't need logic. So that wraps up looking at cross ply, and that looks up as a simple example of using a built in T sequel function. That's what pink means. This is adult in TC for function. And then, of course, the case win statement, which is how to kind of wrap our business logic around whatever our applications trying to do. 6. Part 6: all right in this video in Basic T sequel, important functions and views were gonna be getting into some of the built in functions that sequel server provides. And we're gonna also be looking at views now one of the things that you do not cover in T sequel in two hours, as they do not cover user defined functions. And the reason is because user defined functions are very inefficient and t sequel and I consider to be one of the poor practices of building functions and T Sequel. There are a few functions that you can build, but for the most part, if you understand the built in functions, most of the good functions that you could use anti sequel are already provided to you by Microsoft Sequel Server. What you would rather do is use dot net, So let's go ahead and let's look at this. And first of all, we're gonna look at some 44 basic, I think some of the most important functions, especially if you're looking at data sets. And since the course project is built around data sets in which all the important data, I want you to look at the data set in which you just summarize it and 500 words or more. Because it's really gay practice. These functions are very useful. So the first thing we're gonna look at is the max amend, the average and the standard deviation. These are incredibly useful with all kinds of different analysis. I'm looking at the maximum minimum price of anything. It's that or let's say, oil or whatnot. Thes thes. A really helpful writing. I'm looking at the student loan analysis of Echo boomers who are the millennial generation . Echo boomers is another named for him because there are large generation and this is from some data that I collected, and you'll notice that the largest student loan in the Mulford millennials is $300,000. The minimum loan, of course, zero. The average loan is 6500 and then the standard deviation is $19 in 857. Now some of you may be like Wait, Tim, I thought the media said that the average millennial has a lot more student loans in 6500. I mean, like they're they're they're buried in debt. This is actually not true. What they mean by the average student only mean the average college student or the average borrower is what they actually mean when you really dig into the data, not just the data that I collected, but when I dug in, even to the government's data. What I found is a lot of the figures air quoting are the average borrower. That's not the same thing as the average millennial, because not all millennials gold go to school Onley about, I think when I was doing it, I want to say in the range of 40 to 50% of millennials went to school. So there's quite a few of them that did not and remember that not all millennials who go to school have student loans on Lee. About 1/3 had student loans. 1/3 of millennials have their parents paper of their school or they paid and then another third use scholarships, and then only 1/3 had a student loans. So that's another thing that we can do is we can combine that on our query. Let's say we look at the average student loan when the students have at least some, so you'll notice here in my filter cause I'm using greater than zero. This is really important. Look at some data analysis notice. All of a sudden the average loan is 20,597 now. When I was doing my study, the average borrower or the average student loan borrower, according to Media, was $22,000. So this is very close to that figure. And the government figures that I pulled then was about it arranged. One of them was about 19,000. Another one was about 21,000. So I was right in between it And what the's are looking at, though, is they're looking at the average borrower, the average student loan borrower, not the average millennial. It's very important note because again, if you aren't familiar with the data set, when you actually start digging into the numbers, you realize what they mean. But a lot of times with journalist, they'll accidentally quote something wrong and they don't realize where they went wrong. That's why it's very important to know these things. So what are some other things that we can look at? Well, it would be kind of meaningless, but another thing, another function we can look at screaming off is we can look at the we can cast a student loan, which is a numeric field, and we can cast it as a text field. So let's suppose I want to cast it as a text field and I'm gonna say from this right here, So I'm gonna make student loan a var car field. Well, why would I do that? Well, I could say now that it's of our car field as opposed to a numeric field, I could say echo, Let me do it this way, Echo Brumer. Oh, and then plus So I'm contaminating a string here and you'll notice that I've put made the numeric value a string basically here. Okay. And C sharp. This is kind of like a replace. I'm sorry not replace a convert. I cannot think of what it is off the top of my head. Improvising is not a strength. Always when I'm doing math. Where is if I did this without the cast? Watch what would happen? So what would if I What if I just did this? I'll get in there and it's like, Whoa, you can't convert a character string to money, right? That's money. It's on even numeric. So I can't do that. But this is what this cast allows me to do. It allows me to convert a string to a numeric value. Right. What about if I wanted Teoh? Ah, hold on a second. So let's say yeah, let's use this. I wanted to replace, so I'm gonna use another function which called the place Control Control Z. Not sure why I did that. And I want to replace all instances of zero point 00 with nothing. Okay, so it's gonna go through the string, and every time there's a 0.0 you can see 0.0 It's gonna place with nothing. In fact, let's do better. Let's do it this way, which means not applicable right. And you'll see it goes in it replaces. Now there's a problem. And some of you will realize this if we look at this well, but the problem with 0.0 is that some people, let's say $120,000. Well, there is a 0.0 So what I might want to do is I might want to replace it where there's a space right, and so that also leads course than the problem is open on the new space To in a replaced can be a tricky function to play around with. But for the most part we should get the correct numbers because there should be no space on the people who owe amounts. But as you can see, this is a built in function. Another thing that I can do, which I did in a former video. We can look at the absolute value of the student loan. This is another built in function, very useful. Mathematically, I believe all the student loans that I have in this table are reported in positive numbers . So not gonna have any effect. That is one thing that we can do if we have negative numbers in a field, a lot of times we will run into the situation where we do not want to have negative number . So we'll structure our data alone is technically a negative number. But notice how the way I'm reporting it is as a positive number, and a lot of that is gonna be for filtering. So that brings us to the next point as we look at here, which is views, right? What we want to do is we want to look at how to build a view. Now, one of the big notes here, We'll go ahead and copy this. We're gonna get the use echo boomers go to use the database context here, one of the things that I highly suggest. And we're gonna use a simple view here. Quick analysis, as I highly suggest, building a query that you can apply to any data set. Okay, I use this all the time. This is probably a gold bar as faras math lessons for some of you in this course because many of you, if you apply this, you will be able to use this all the time in your life. Let me give an example. Some sales person comes to you and they say we want you to buy natural gas. And I've had this before. Right there. Your electric company in their life. We want you to buy natural gas where you go. Look at the price of natural gas. And what is it? It's $11. Will you go? You import the history and you look at all the natural gas history. You find out that $11 that only 10% of the time natural gas is $11 or more. Is that sales person trying to benefit you? Are they trying to rob you trying to rob you write like that's why being able to quickly analyze any data set will provide you with tons value. I have used queries like this that have made me thousands of dollars, and I've used them also to save thousands of dollars so you can tell really quickly when someone ripping you off. That's one way to look at it. First you think you can also uses to identify really good investment opportunities were just opportunities in general, especially in your area. So now that we've created the views, I've created this view and you'll notice the structures create view. There's a view name as and then this query. What it does is it saves my query. Okay, so now I can select from this view if I do it right and you'll notice it's safe. So if I find other helpful analysis, maybe I find Median is helpful. I add median. Maybe I find the standard deviation from every point. One of the views that I use, one of the data sets that I used or data programs that I use to analyze data sets is actually stored. Procedure on what it is is dynamic enough where I can pass in any table and any comb that I one analyze, which is generally the price. Let's say it's a Lego set on. I want to see if a Lego set is pretty cheap. I can pass in the table, name the price, and it'll go through when it will do an analysis on the entire history of Let's say, Let's say it's the Lego set and I could go get the entire history about Lego set and the price everything. And it does something similar to this. But it has a lot more detailed information, and again I can look and see what that whole history is. Great analysis. So this is something that we can do. We can build a view, got saves a query. So now we can call the view as opposed to the query, which helps us. And then, of course, sequel server does provide us with some built in functionality. There are a lot more functions. These are some of the most helpful, and these are probably some of the ones that I use the most, and that's why I introduced them. But generally speaking, if you're looking for a function that you need to do something, most of the Time T sequel already has that function in very exceptional cases where you need to build one. If you do need to build one, though, I highly suggest that in most situations it's much more effective to do that in a object oriented programming language like Dot Net, or what not than it is to do it in T sequel since T sequel is a set based language and not an object based language. 7. Part 7: All right, let's look. A basic T sequel was stored Procedures dynamics equal. You will often hear the dynamic sequel is bad. For the most part, it is something that you would should be very careful about using, if at all. I will definitely expressed some concerns about it. But for the most part, if you're doing analysis, there are some convenient things that you can dio if you're doing, that's a quick analysis for yourself or you're trying to get some men. And Max is of tables or whatnot are just just information really quickly. So let's go ahead and let's start off by just building the stored procedure in our first tour procedure. What we're going to dio because we're just gonna select top one start from this table, OK, and we're gonna call the store procedure. Excuse me, I need to get rid of that. You're not using that one called the stored procedure and the way that I call it is just accept and then the procedure name. And then when we call it, it'll just select the top one star from that table. OK, excellent. And you'll notice, too, as well if I want to change the stored procedure. I just type an altar instead of create, create, create it. Alter alters it. Okay, well, I can also do insert statements, delete statements, update statements, etcetera. So let's say that I want to select top one star into table new, and I'm gonna do that. First, I'm gonna create the table first. And if I do a select top one star into a table again, it'll fail because it was already an object. So now what I'm gonna do is I'm gonna switch it around and I'm gonna do control X. And then I'm going to do on one second insert into this table and I'm just gonna test it out. It works. Changed my procedure, it works, and then call the procedure. And then after that, let's select from the table the new table and we will see that there should be three rows. So you'll notice that I can also insert into a table I can delete. I cannot bait etcetera OK, in my procedure so I can select from it. I can insert, I cannot beat, I can delete. So let's go ahead. Let's demonstrated. Delete. Let's say I want to delete all records from the stable. So delete Oops, not select delete from table new. And let's go ahead. Do that and let's select from that table and we'll see. I will have to call it first altar, call the procedure and deletes all the rows. All three roads were affected, and then we'll see. There's nothing in there. So stored procedures air. Great, because they're basically saved code, right? If I have a process, let's say I'm importing data and I want to save the process that automatically. Let's say scrubs. The data cleans the data for me. Well, then I can save it in the store. But freeze stored procedure like this, right? So I don't have to keep creating code for it all the time. I can say that in a storm procedure. Okay, well, right now we're looking at a store procedure that just call something right. There's no parameter that were passing in. What happens if we want to passing parameter into a storm procedure? So, in other words, let's say in a hypothetical example that I want to select star from any table, so I want my parameter b a table and I want my query in the procedure to query whatever table I pass in. This is where it starts to become debatable whether this is a good idea or not. But for the most part, we're gonna go ahead and we're gonna show an example of this. So in this example, I am creating a procedure table name is my parameter and I am declaring a dynamic sequel, Strand, which means this is subject to change. What I mean by that is that depending on what table I pass in it, maybe select star from table ones like start from table to table three, etcetera. Right? So whatever table I pass in well will change the potential query That's run. Now you'll notice that I'm doing a select star from the table I've wrapped in quote name, which is a built in function which uses system objects. I could spend a lot of time explaining that, but the bottom line is sequel injection is definitely a concern. And so, with dynamic sequel, so I always wrap my system objects in quote name. It's just something that I do good practice. It's the same thing as I parameter tries my aware clauses. Okay, good practice. You can look that up. Now. Notice here that I'm printing the sequel. I am not going to execute this by actually selecting from it. First, I'm gonna print it, and I already created that. So I want to just see what this is going to do and you'll see what it does. Is it? Select Star from the Echo is an echo student loan analysis. So if I call this and I say select, actually, let's do this. Hold on one sec. Top one star. Okay. Like together one. Let's do this. Okay, let's call it. It's copy this place that Okay, I do that. We see that it selects the top one. Start from there. Now, all I'm doing is printing at this present time. So what? How do I call this? How do I make it where I want? When I execute this procedure, it's going to do whatever I select here, right from whatever I pass in, which is dynamic. It's always going to select from the table that I pass in. And that is I'm gonna call the exact function. This is SP under School Execute sequel, which is going to execute a dynamic sequel String, and you will see this sequel. String is a dynamic sequel String. OK, so that comes from here, which is declared any time I need to declare variable. I first declared the variable the data type. And then, of course, I set that variable equal to whatever I wanted to be. In this case, it's a dynamic sequel, Strand. Okay, let me all through the stored procedure here. So I am now changed it. So this time it's gonna actually select from the stables. Not gonna print print it out. It's gonna actually select from it. Okay, so let's look at the logic here in an altar procedure. I'm sorry, a crate procedure when there's a parameter after the create procedure, whatever the procedures name is, we need to declare the variable in the data type. If there's another data type that I would do comma and let's say I said, Ah, column value. And I said it was Navarre Car 100. That would be this in tax. Right? So it's going to be the variable name, the data type comma, the next very well named the data type comma the next and so on. And so forth. There's only one common needed that I'm gonna do as and then begin inside of my beginning end. I'm gonna put the data here. I'm sorry. The query text now in this case is dynamic sequel. So I need to declare my dynamic sequel Stream, which in this case is a sequel. Variable. That's what the variable name is. Variable names always start with the at symbol that I'm gonna set that dynamic sequel to In this case, this is a string variable. I'm very string variable and I'm setting that equal to select top one star from And then this is where the dynamic part comes in. Whatever the table name it is that I pass into this store procedure and then I'm actually gonna call it here. Okay, now there are other things that I can do. Let's go ahead and demonstrate this in this case, we're going to just print out sequel. It's gonna execute it, but it's all it's gonna do is print out the sequel. I'm start point out the text and we'll see print. Using dynamics equals very, very bad table name. See? Okay. So we can do I mean, we can do all kinds of things in dynamics equal, I can insert into a table dynamically. I can select from it. I can delete from it. I can update it. I can print things out. But dynamic sequel is not saved code. So it's compiled at the query runtime, which is just a fancy way of saying that it's not always gonna run in the optimal way, so be careful about that. It doesn't mean it always will be run poorly, but it's not going to necessarily running the optimal way. And then the other thing is, you must parameter rise your queries. You cannot allow people to enter whatever they want. Okay, you have to be very strict if you're going to use dynamic sequel. Where do I like Dynamics equal? Not in any application. I want to use hard code. Yes, it takes a little more work, but it's safe. But when it comes to dynamic sequel, I like to use for my own analysis. So I have some queries that I run, and a lot of my career is that a runner to analyze data sets very quickly, and since these data sets the price and the date information is always the same, or that's how I structure it. I can use the same queries over and over again until I use dynamics equal. In that case, I have a safe state saved store procedure, that is, and that saved stored procedure. I can use that over and over and over again by just passing in the new table, name the price column and then the date column. And there you go. That being said, I'm not putting that on the front end for anyone to use. That's only for my own use. So that's one of the reasons why I can use Dynamic sequel That case. If it's an application, there's gonna have to be a very strong argument in 99.9999% of the time. The answer is, I'm not gonna be using dynamics equal. So we looked over this overview of how to use store procedure. The fact that we can select from it, we can update, we can insert, we could do all kinds of things, are stored procedures that save codes that we don't have to rewrite it every time. When we looked at Dynamic sequel, which is allows us to pass in a changing object and do things on objects that possibly change, along with some of the concerns about dynamic sequel Maybe good may not be good. 8. Part 8: All right, let's look apart. Eight of Basic T Sequel and we're gonna be looking at Tim tables, variable tables and Seti's common table expressions, and we're also gonna be looking at some queries, and I've got to put that up there, right? So let's get started right off the bat with 10 tables, variable tables and global tim tables. So Tim tables are start out with the hash symbol. If you use Twitter, you know that this is the hash symbol. Onda Global Tim Table, which will I'll show in a second demo in a second starts, has to ashes. A variable table has a different syntax. Notice that the create table syntax of the Tim table is create table table name and then the columns in it. Same thing with the global Tim table. However, a variable table is not. It's like a variable. We declare it. And then we put the column names. We declared the name of it, which is temp. So we declare Tim Tim table, and then we put in the call names and the parentheses. Now you'll notice with these timetables that I'm inserting into the top five from that table. Echo boom I'm sorry, Echo Student loan analysis, and I'm inserting it. All of these and I'm selecting from all of those tables. For the record, with Tim tables we can delete from them, we can update. We can insert, weaken, select weaken, do everything that we can normally do with most tables. The differences these tables are stored in Tempe be so they are temporary tables. The purpose of them is on Lee for temporary purposes. So if you look, we should get five records in each of these tables, all of them with the gender here of F now very key to note the difference. One of the big differences between a variable table and a Tim table is that if I were to select star from this variable table, it will say, Hey, you must declare the table variable temp, right? In other words, the temp, the variable table has to exist in the transactional process. So in order for me to do this again, I have to run the whole transaction again. By contrast, I do not have to do that with this table. Thes Tim tables still exist. They are created and they exist. Until I dropped them or until the scope of this transaction goes out. What I mean by that? If I would you let me copy this if I were to select Star from this timetable, everything is great. But if I go to a different transaction window, this is a new transaction. And I were to select Star from that Tim table wife. What will happen? We'll say it doesn't exist. It's like there is no Val object name that whereas if I go to the global variable, take I'm sorry. The global temp table It existed all transactions. Which means you should be very careful about using this because every transaction that's going on will have access to what's in there. Right? So always be careful about global. Remember, this timetable does exist. If I go here, it exists. If I go to this one, it exists as well, right? Both of these groups, um, both of these pin both of these exist in both of these. Both of these exist in this scope, but in this scope in this transaction Onley this one exists. That's very key to note between the differences with one that is Onley in this transaction and the global, which is which exist in all transactions. As you should Be very careful about that. This transaction is number 52. This transaction number is 55. I'm going over the hovering there. Okay, so this transaction has access to everything. But this this variable table is already out of scope because it has to be. It only exist during runtime. So if I were Teoh to select from this again, or I were to do this again, you'll notice little throw errors because these tables already exist. So let me code these two out and now are variable table will exist again. Course this time our Tim tables have more records because we inserted even more records into them. Right? But the variable table Onley exist in that one scope. So it only has that five. OK, so let's go ahead and let's do a demo of a variable table just by itself to make this very clear, because this does tend to trip up many developers and tim tables. You can create indexes. You could do a lot of the same things you do with, um with regular tables. The key is, though, is their temporary So if I run uh, starting from master, if I run this transaction, it'll insert the data. If I run it again, it'll insert the data. But if I only highlight this, I'm only running. What I'm highlighting it doesn't exist. Same thing if I only highlight thes two, it doesn't exist. Now there is a note about Tim tables as well. We can also create a tape Tim table by doing a select star into. But remember, we can only do that once. Once it's created, we cannot create it again this way. So I could technically say select top five into temp table. I'm have a new timetable now, and it will create that table. And now if I select star from this, I will get from this timetable and we have that. Okay, I cannot do that with variable tables, though. If I were to do temp table bullets to do Tim, it will not allow me to do that, Okay? And that's because a variable table must be declared okay so we can select star into a table. But if I do this again, this table now exist. It'll throw an error because there's already an object named that right, So I have to go back to doing an insert statement. All right, so let's move into sub queries and CT's, and I'm gonna keep this pretty basic. You can really look into detail on Seti's or they're very useful. I like them because they're clean coding. They're very easy to organize, and they're very easy to troubleshoot most of your time. And T Sequel is gonna be sittin troubleshooting queries. That's just the reality of the way it works. How a CT Works is the declaring sin taxes with CTE as or another alternative sin taxes with C T E. As you'll specify the columns. And then, in this case, it would be gender. And then, inside of this bracket, you'll have your your idea, your items. Okay, I specify my brackets within the CD. That's my preference. You could do whatever you want. Um, it does not matter. It's not gonna It's not gonna affect the analysis. So I'm gonna execute this, and then I'm going to select from the C. T. So whatever I put within here limits what's returned later on, right? So one of the questions that people do ask is Well, what's the purpose of the CTO, or why would I do a C T over attempt table and depends on who you ask, But one of the things that I like about to see tea, as opposed to attempt able, is I don't have to do the insert into step, which you think about it is writing data and reading data, right? If I do insert into table Select Star from other table, I had to read and write with a CT. I don't have to do that right. I am taking a query. I am putting it within brackets and then I'm selecting what I want so I can limit my query here or my queries. It could be joining a union. And then I'm selecting from that here so I don't have to go to this step of creating a temp table. So when I know that I need to create a Tim table, I always look at the use cases, use case analysis and then ask myself if a C T is a better idea. And generally a CT will be now. Sometimes a CT, though, is slower than a sub query. So what is a sub query and I look at these. In other words, I'll compare these when I'm and I waiting that some query is almost identical. It's very similar, a little bit different, but it's very similar. You'll notice I have my typical select star from okay. But instead of specifying a table within these parentheses here, I'm specifying a query, Right? So I'm doing my select top five gender from Echo Echo student loan analysis that keep wanting to say echo boomers. Um, from this this query here I'm selecting from that So that becomes my table. And then I'm labeling this as sub. And so that's why do sub dot star the sub query. So when I run this, of course, I get the same results set. But the idea is that my sub query where I'm selecting from is that some query, right? This could be very, very useful enjoins. This could be very useful in selecting from a query that has joins. This could be very useful analysis. It depends. The thing that I do not like about some queries as much is depending on how they're structured. If you're going to do where not exists, which a lot of people do with sub query, and then they will join on the table, like on a column. You cannot look out the query in between the sub query. Okay? And what do I mean by that? If I'm gonna join from a value from a table that exists to a table that is in my sub query when I'm troubleshooting, I cannot debug that in the sub query where as I can with the common table expression. Okay, So that is one reason why I slightly prefer common table expressions even there, even if they're a little bit slower. And the main reason is because the common table expression allows for simplified debugging , where submarines do make that a little more complicated. In future videos, we're going to look at doing some joins with some queries and CTS. But for now, just know that when you're doing a joint on a regular table, the submarine is gonna limit you a little bit. And what you're able to do, especially on like on a were not exist, etcetera. So that wraps up our segment here of Tim tables, which cover both local Tim tables and global temp tables, variable table CT's and, of course, some queries. 9. Bonus Challenge - Live Coding Practice 1: In this video, we are going to do a coding challenge. And we are going to be doing a coding challenge with a technical pattern. For those of you who have heard of technical analysis, one of my really good friends is in to technical analysis and now he and I recently and we're talking about a pattern that I thought would be interesting because it was a very challenging pattern for SQL. But it's a great example of one in which we can do where if you're a beginner, this is a really good challenge to go along with me on and code as I code and then you can make tweaks as you go on. And then if you're advanced, you can see how I coded to end. If you're advanced, see a way in which you want to do it even better. So first of all, let's actually look at a graph right here. And this graph shows the pattern that we're going to be creating. And I'm going to kind of draw it out and much super good at drawing here. But I'm gonna go ahead and draw out what we have. So the pattern that we're looking at, let's see if I go here is what is referred to as the cup part of the company handled, let's put it that way. And so in this case, the cup in theory is going to be this right here. This is how it should come out up here. The handle would be here. And keep in mind that this is all my friends theory. There's no way to know because if you look at this being a company handle, if you're familiar with technical analysis, a person could say, hey, this was a company handle right here, but it wasn't. And so my point is, is that it gets kind of, it can be kind of hindsight bias, but essentially accompany handle of what we're going to be doing on coding is we're going to do something like this. We're gonna be looking for that cup. Now if we wanted to go further, we would be doing a handle. The handle could be something like this as well. Here let me do this instead. The handle could be something like this. And also keep in mind that even though there is a time range here and we'll look at this as we start to code. The time range can actually be very different. So let's go ahead and let's open up some code. And let's look at this. And so I'm going to switch over to the dummy database here. And I have saved the file GLD. Now GLD is an ETF, that's, let's say represents gold, but it's not the exact price. So the pricing is going to be a little bit different than the image that we just saw. So the first thing that I want to do is I want to actually import that GLD data. And so what I did is I just went to Yahoo Finance and I got dead GLD ETF. So I my coding window in front of me. So the first thing that I wanna do here is I want to create a table, and I'm going to call it table gold. And then I'm going to do f, date because the first field is at date and time value. And then there is f open. And I am not going to use this fields data, but I'm gonna go ahead and save. I'm gonna go ahead and varchar it. So if it's a varchar, but I'm not going to use the field. I'm gonna do f hi. And that's going to be a varchar, f low. That's going to be a varchar. F close. That's also going to be a varchar. And then I'm gonna do adjust clothes that I'm going to keep. And so that's going to be 228. And then I'm gonna do f volume. And that's going to be a varchar. And actually I'm gonna do one difference here, GLB date. So basically the fields that I'm going to be keeping are going to be this close right here and this date right here. And I'm gonna keep it very straightforward and simple again so that beginners can understand it. All of the logic is going to be written in sql. There are other languages you could write this in, but this is SQL again, this is mainly for beginners to take, to get. And then of course, defaultable law, I should say. And for advanced people to look at from a perspective of how could I make this even better? Okay, so there's sea and it's I believe ETL and on this computer in GLD dot csv. Okay, that's where I keep it out. Ok, so with, and then we're going to do the standard book insert here. And it's going to be field terminator. And this is coming from, just like these fields right here. This is coming from that first line in the file. So when you download the GLD file as a CSV with all of the data. By the way, you want to make sure it's all of the data. That's what we're referring to. So the field terminator, as we know, is a comma. And then the row terminator is a new line, so that's 0 x 0 a. And then we're gonna save that the first row, which is the truth. Here, is going to be to do is we're going to skip, we're going to be skipping the header row. So the first thing we wanna do is we want to create our table. And the next thing that we wanna do is we want to bulk insert our data. And we did that. And then the next thing that we want to do is we want to delete from TV G golden. This way we can all be on the same timeframe. So all of you who are following along where the GLD date is greater than. And we're gonna do 2020. We're gonna do August. We're going to do 31. Oops. So we want to remove this from the table. So we get rid of that. Okay? And then what we're going to do is we're going to select the GLD date. And we're gonna slice the adjusted close. We're going to select those into assessment. And we're gonna do from TB G gold. Okay. Alright, so we have done that. This is Select star into this creates another table. So we now have this table right here. And so what we can do just to verify that, and you should see this as well. When we go to GLD date, you should see this in your foul of assessment and we can see the minimum date. Okay. Let's look at that. And then the other thing let's look at is let's look at the max. I just close. We want to look at the Mac's price. And we see it's a 193. Like I said, GLD as an ETF is a little bit different than the others. Now, what I'm gonna do here, so I can just execute it easily is I am going to code a comment this out. And that's what I did really fast as comment this out. And that way I can just run the code like crazy. Okay? So going back to this image here, okay, so what we're going to be making is this COP, okay? So let's think about this from a logical perspective, okay? As far as what we're doing here, we're essentially looking for a time range here, a period of time. And between this time range as a starting point that's here. And there's an Indian Point that which is within a range of here, right? So technically this, it looks like I'm drawing a smiley face. I can see how students think that. But technically this, this period, we have this high period. If you would have this high price, there's a time period and then there's another high price. And keep in mind, this can happen over a long period of time or a short period of time, right? We see in this one it was like 2011, but also 2012. There's kind of this high period here, right? And then we see later on, there's this high period here, right? And then there's this cup part here. And again, yes, it does look like a smiley face now that I'm drawing this all out the way that I'm drawing this. Okay? So and I'm going to go back. Ok. So essentially one of the things and the reason why I'm highlighting this, what we don't want to do in this case, and I don't know how specific the TA rules or you'd have to ask him he at TA expert. But what we don't want to look for is something that's super thin, right? So if there's a period of time where it reaches a high and then there's a channel and then there's another period of time. This is more like a two-time fork, right? In other words, we're going to assume that if whatever the period is that starts here, right here, let's say that the period between here is longer, at least three times longer than the period here. So what I mean by that is, Let's say, we say it's a weak that it takes to get to this high. And it's a week that it takes to get to this high. We're going to assume this right here is 21 days, right? So we're going to assume that the period of time that it takes to reach here that say if it's a seven day, the average price of the Seven Days and the average price of this seven days. This right here is going to be 21 days between these, okay? That is not necessarily hard and fast rules of TA By the way, you'd have to ask an expert, but I'm not a 100% sure if this would be defined as a cup. Or in terms of part of a company handle or whether it's like there has to be a length of time. Ok, so the reason why I highlight that is because we're looking at gold here and we know that gold of course is, is pretty well, does it pretty stretched out as far as how long this goes? It goes from 2011 to 2020. Problem because you have this year we'd hit it hit a high as of the time of this video. So what that means is let's look at, we're going to definitely want to declare some variables. And one of those variables is going to be max. And this is going to represent the Mac's price. And we used the decimal 228. So that's first we want to know what the max is, because we're going to use that as this area up here. So that's what we're doing when we talk about the Mac's, we're looking at either this or it's going to be this right here, right? So this may actually in this case, the max is going to be this because we know by looking at this chart, this is where we're going to be defining. Ok, so that's what we're actually looking for at, at first. The other one is, well, is that even though we describe it as a cup, more than likely it's going to be within a range. So I'm going to have another one called the percent. And this percent is going to be decimal and it's going to be 32. Okay? And I'm gonna go ahead and set some of these variables while I'm doing this. So let's do select MAX. And that is equal to the max by just close from assessment. And you made a wolf. I'll tell you while I'm doing this in a second. So that's going to be, let's say the upper part of the cup. And we're going to be looking now for a range. And I'm going to define the range set percent equal to 10%. Yeah. So what that means, again is I want, wherever this hi is, I'm looking for another high that is within 15% of this. We could define it stricter, and in fact, we'll play around with it being stricter. But let's start at 15%. I guess we could do 20%, but let's start at that, start at 15%, and then we'll play around with numbers as we develop it. In this case, I'm looking at this graph here and I fit hit 2 thousand, and this was 1810% of what is it? 2 thousand is 200. So that we may be able to stretch as far as 10%. Okay. Alright. Then. Oh, okay. Yeah. Do they sit that yes. At max on firm. That's fine. Okay. Then what we wanna do is we want to do what is it? This would be a time period. So the time period and small and for sure because we don't I mean, we don't need a big int for time period is the period of time that we're going to get the average. So in this case, it's the period of time that it reaches this point or this point, right? So for instance, again, using the graph on gold. And so this would be a time period right here. And this would be a time period because it really is, you could say a double top there and then this would be a time period, right? And so in this case, we're going to say that the time period is going to be set, time period equal to 90. Now let's, let's do 60 mm. It is a little hard. I mean, it could be, we could even do is sturdy, but this actually looks like it took a little bit more. This might have taken a little bit more because these are years and I have to remember that. But let's use 60. Let's, we can always again, do longer though, the longer time it takes to reach the, the more of this would be, the shorter time it takes to reach like a shortcut and handled. By the way, you'll see these sometimes that happen over a period of two weeks, then we would use a time period of like one, though we wouldn't need like probably the values in during the day. Okay? So we are going to have to do a loop for this example. So we're gonna go ahead and we're going to have to do loop. So we want to do a loop start. And that's one thing. Dilute start is going to be it has to be a date because we're iterating over periods of time. So we're going to call this datetime and the loop start is going to be today's date. Now we know that we got rid of values for today, but we'll keep iterating over the values until we go backwards because that's what essentially we are going to be doing and thinking out loud while I'm doing this. But essentially we are going to be going backwards. So that's the loop start is just going to be the start of a loop and then it's gonna go backwards. Okay? So that means if we have a loop start, then we have a loop end. And that's going to be the original, the very earliest value of the table. So that's our min value, that's what that is. Okay, so loop end. Again thinking while I'm doing this, so the loop end is going to be the minimum. And this is going to be the GLD date from assessment. And I'm just going to go up here, make sure you watch GLD date. In fact, what I wanna do is scroll this down. That makes, this makes it easy for everyone to see. Okay, so we have a loop start that's going to be the beginning of the loop and then the loop end. Okay? And then we're going to have, okay. So we have this time period, which is going to be 60, but we need to get the price for that time period. So what we need is the current average. And that's the current average for the time period. And in this case, the current average is going to be decimal 228. Okay? And then of course, we're going to have to do so. We want to skip. Yeah, I don't I don't think we need another one. I think we can do just that. I'm thinking out loud here, but I think that, so here's the problem. If we immediately start the loop, it's going backwards. So let's say it gets the average price. So let's follow along what our algorithm would do. How are we going to do it? We're gonna get the Mac's price in this area over 60 days. It'll grab that max price, then it's going to start going backwards. But we don't want because because think about where I'm already at. I'm already in 2009. You've already gone a year because this image is so stretched out. I don't actually want to look for the next Max Price for at least a year. So what that means, one way we could do that is we could have like account that would skip a certain amount of periods. But another way, now that I think about it is I could select just set the loop, start equal to date ad, that's going to add days, day, day, that's gonna add days. Heck, someone would do a subtraction. This is how I subtract days, by the way, 365 and get date. So actually I'm gonna just because of order of what I'm gonna do is I'm gonna do this and there's that. And so what I can do instead now is I can just do this. So loop start is a is over a year. So we're gonna get the max value. That is, we're going to get that max value for that first 60 days and then we're going to skip a time period. Okay. I'm just thinking about Is this all that I need? Because if I go to this image, basically we are going to be starting here. We're going to be going backwards and we're going to be getting these averages over period of time. So just as a case in point, we would get the average from here. This time we get the average and this period of time, this average, this average, this average, this average, this average. And then what we're going to be doing is we're going to be looking at whether that average is within a percent range of that max price that we got. So let's go ahead and let's start working out our logic with a while loop. And this is, this is going to be pretty, this is where it gets a little bit intense, okay, at first. And so I'll probably do some of the loop initially just to show what's going on. So what we're doing is we're going backwards. So what we're looking at is we're looking when the loop where we're currently at, we are looking for this is a good question. Basically, we're looking for while they're still dates to iterate over. In other words, while there still dates to go backwards from, we're going to continue going. So essentially what we're saying is while Date Ad day, day minus 60. And this is loop start. So while Date Ad is greater than Loop End by this is the challenge of live coding. Begin, end. Okay? So, okay. So if we were to execute this, this is fun, but we want to do is we want to select while Date Ad. Let's go ahead and select this value. And what we want to do a, there has to be a set operation. We're going to set the loop start, we need to. That's why, because it's using that loop start. We need to set that loop start equal to date ad, and then our day, day minus 60. And then it's going to be loop start to reference itself. And it just hit me that we actually have the time period here. So what I'm gonna do here, I can replace this and I can replace this. And I can replace this. So and yes, if you put a negative time period, it will do the negative time period. So who bond, there's this and then cheque date. So we're gonna check that date. So what should happen here? Let me pull this up, is what should happen is it's going to iterate over this. And while there are still, while this loop start essentially is greater than loop end, it's gonna keep going over and over and over. But eventually it's not. That's what this set operation is gonna do. Eventually loops start is going to be earlier than Luke end and it's going to break out of the loop. So let's I'm doing select. Oh, yeah, I can do is like I was just told me I could do print as well. But let's go ahead and let's do select. So if we run this, it should go backwards slowly but surely. And so we see 20191086422018108, and it goes on and there's quite a few values, but that is exactly what we want it to do. So that for those of you who are asking, okay, so what are we gonna do with this? What we are going to be getting the average of the price in this time period. Now it is interesting, I just hit me as I'm looking at this hold on 1 second. That this actually starts here because of the 365. Okay? Okay, but it's, we're already declaring max to be here, so we should be okay. I'm thinking that we're declaring max to be in the loop, but it's not the case. We're declaring max here. So we should be okay because I was gonna say it's skipping on the max 2020 price, but that's already declared outside of that. So that makes sense because we're looking at the current price and that's what we would be doing an assessment on. Okay? And this is always the challenge of live coding, but it's very useful for students to see because again, it's, it's good practice. Ok, so we want to keep we definitely want to keep our set here. That's for sure. We don't actually need to keep this select and we have checked our date. Now what we wanna do is we want to start actually coding the logic of what we're going to do. So what we wanna do next is we want to select, remember one of the keys is we want to get that current average, right? We want to get the current average over East iteration of a loop. So we want to say the current average is going to be equal to the average and it was adjusted close, if I recall correctly at right there from assessment. From assessment. And we want to set that where the GLD date. And we have to always remember that. Forget this part when it comes to any logic here, but I always have to start from the beginning to the end. So where it is between between and its date ad. And it's going to be day, date minus time period. And it's going to be loop start and loop start. And I've seen myself in logic before, do loop start and then over here put the end and then something else. And it's never going to find anything because it's backwards. So I make that mistake with date, sometimes with the between operator. So the earliest to the latest, that's how it has to be. It can't be latest earliest, that's backwards. It won't come up with anything. Okay? And okay, so we have the current average. And so actually if I did this and I selected the average, In fact, I think let's go ahead and do that. Let's see what happens. Current average. And let's do occur a Vg. Okay? Let's do a check. And there's the current average. It goes and gets the current average. And by the way, again, if you're more advanced, this is where you could figure out optimizations. This is a great video for beginners, and this is a great video for advanced learners to figure out how to make better. Ok? So we'd actually does go get our current average. That is exactly what we wanted to. It's going to loop and it's going to go get the current average of all those time periods. Okay, but remember we want it to return, going back to our image, we want it to return this range here. And this is where we're gonna get that percent, right. So essentially the percent we're looking at is when it's within 15% Of the Mac's price. So what we're going to say here, now we're going to write an If statement. I'm going to scroll down. I don't think I need a table. Yep. Okay. So if we say if we say the current, I put this in parentheses ahead of time, if the current average is between, I'll put this in parentheses and, and, and this in parentheses. If it's between the Mac's price max times. And I'm gonna put this, it's going to be one minus the percent. Percent there it is. So one minus the percent, right? So why is it one minus the percent? Because it's going to be 85% of the price, right? And the max times. And this is going to be different, one plus the percent, right? So it's between that value, it's either going to be between basically 85% of the Mac's price and a 115%. That's what we're doing here. We're getting that percent range. It can be a little bit less, a little bit greater. And like I said, we can tighten it down. So if the current average is between there, then we're gonna do begin and end. And we're going to, Let's see, we're iterating over. Let's think about that we could, we could do is we, let's select from the table. And let's do the min, min. And let's do the GLD date. And let's do the max GLD date. And then let's do the AVG price. And then we're going to do from assessment. That's right, because that exist. And we're just going to borrow this right here. Whoops, I'm sorry. That's this right here. Because this is the time period that we're currently in. So this is going to tell us when the current average, as it's iterating over everything, is between this range. And I'm thinking price. Now, that is what it is, but it's actually adjusted close. So, okay, so if we do this, let's see what happens. Okay, let's scroll up and let's look at our values. So we have 2012201220112011. So let's really go back to the image and what we would see. And yeah, let's go ahead and use this color here. Is this is where our high is technically at. So it's because we're using 15%, it's going to find something in here, which is 20112012. That would be correct. Let's look at it. That value is 168166165171. Okay, let's go back here. That's, that. That is correct. Now remember we use 15% and I know, like I said, we can experiment around with others. Let's use 12%. Let's be a little stricter here. And we get 2011 and we have 171. That was a stronger filter. Okay, so we know that R max price was 193. It technically 19.3 just and this is gold, this isn't GLD, but 193 would be probably my guess is it's going to be above this area anyway. So, you know, if if I was doing just a full cup, I probably would start out like 185, but yeah, this is actually correct. So let's, again, let's go back and let's look at our logic here. And just to show, but what we've done. And again, there are many other ways to solve this problem. So don't get me wrong. There's, this isn't the only way to solve this problem. This is just a good practice, I think for beginners to see. But what we've done is we have passed in a max value or the value basically that we want to compare two factors would actually be better named compare two. We're gonna go with Max for now. In this case, we did get it as the max out of that table. We had a percent that isn't a percent range within that max as we use here, right? It's the percent above it, in the percent below it, and that's what we're doing. The percent below. The percent above, right. Okay. Then we have the time period. In this case, it's 16 days. We're looking for an average over that time period. Keep in mind, if we were looking at a company handled that was over three months, this would be much, much smaller. This would probably be only five days or maybe even three days, right? So it really depends on the length of time we're looking for. We then have our loop start. And, and this is another point where our loops start would be not far out at all. Like for instance, if you are looking for a company handle over or a cup, I'm sorry, over 30 days. I would have the loop start date at day date minus probably at most three or 55 would be pushing it. Definitely three over 30 days. Yeah, don't wanna skip too many days, right? And then we had that loop end and that's just a minimum date of whatever dataset that we have. And so then we go through and get the current average of each of those groups of data. And then we check that current average if it's between that range, and then we return it if it is. And then of course we iterate over our loop start time period. Now the great thing is this isn't SQL, which of course definitely is one of the things in this course. But keep in mind, there are other languages that probably already had this automatically added. So this makes a great coding exercise, which is the purpose of this. But if you have another language that you prefer, that you would do this in, you wouldn't have to do it in SQL. But if you're a big dinner in general and you're beginning SQL and you've never worked with SQL at all. This is a great challenge for you to follow along and then optimize it as you see fit and and make it yours, maybe make it over a shorter period of time. If you're an expert, look at this and think about how you would optimize it with any of the tools, not just SQL, but with any tool as well. But this makes a great practice and I think it's a great example of something that, you know, some people, by the way, for x and they think company handles are legit. Some people think they're not real. I think TAs kinda funny, but my friends, some of them, I should say they really like it a lot, but I think it's a fun exercise. And for the record, I would highly suggest any of you who want to get better at, let's say coating sql, play around with all kinds of technical patterns like the Head and Shoulders, reverse pattern, etc, right? Like these are really fun coding exercises and it will definitely get you better at writing SQL. 10. Bonus Challenge - Live Coding Practice 2: In this video, we are going to be looking at security when it comes to SQL injection and sanitizing input, this applies to all database context and all SQL databases. When really we want to be thinking about this anytime we allow input from users. So what we're gonna do here, this is going to be a full on coding video. We are going to be looking at a couple of examples. And the first one is a contrived example and the second one is going to be live coding. So it's going to be some practice live coding. I will attach the script, the second live coding because I think it'll be very useful for you all to play around with and see. So this applies to your database development. There's applies to extract, transform, and load. This applies to anything in which security is involved. And I've seen dynamic SQL used effectively, but if it's not written well, it could potentially compromises from hackers and so it could allow for SQL injection. So let's start off by looking at unsanitized and unclean SQL. And we're going to use a contrived example. And let's say that what we're doing here, and I'll explain this in why it's bad for several reasons is we have a variable called anything goes. We'll see that this is a varchar of 1000. That's way too long in this case. And what anything goes is supposed to be just to select star from our table, right? That's what it's supposed to be. In fact, actually, let's go ahead and I didn't realize this when I made this, but let's do anything goes, let's say it was a legitimate value. So what this should be is just our table, right? I'm going to code this out. So what we're supposed to get when we run this is we're supposed to basically select star from our table, and this is the interred output as we see, I'm selecting that interred output, that's what's passed in, and this is what comes out, which is ID in the table is blank. There's nothing in it. Okay, well, let's suppose a hacker were to use that because we're not sanitizing this input as we see, we're allowing anything. And we're also seeing there's a varchar of 100, right? That's not good, right? So let's suppose that they were to run this. And what we'll see is very, very concerning, right? We see this, that it actually runs this select star from our table, right? But then it breaks right here and it runs this select statement. So the hacker is able to get information from our SIS tables where it's like Admin. This right here is an example in the backend of what SQL injection looks like. In other words, they were able to inject more SQL and more information to extract details. Though, the most common example is what happens if they run a drop table statement? And there's again some truth to that. But a lot of times hackers wanna do recognizance, and so they're looking for more information. Okay, so let's look at this same example and let's look at sanitizing it, and let's look at why this is better sanitization. So actually I'm going to start off by using the bad injury. So we're going back up here. I'm using the same entries, but I mean he's a bad engineering, then we'll use the good entry. Okay? So we have, in this case what we have instead of anything goes as a variable. We have a table variable. You'll notice one of the things is notice how the table variable is 25. Now we're limiting this. So again, if this, if either of these row stored procedure. This would allow anything. This right here would only allow 25, right? Or if this is SQL directly in the code, notice how this is more restrictive and this allows anything. Then we see the hacker again passes in his malicious value. Let's see what happens when we run this and what the heck to value turns out to be that it's, it's table, our table select name that would, that would, that would fail even up here. And in fact, we could actually, let's go ahead and do that. Actually, let's, I shouldn't live code like this. Go ahead and do that. And just as an experiment, but we'll go, anything goes OK. So we'll do the same and we'll see fails, right, because that's not a legitimate query, right? Like that fails already. Okay, I'm going to undo that. Okay? And let's go down to here. So we'll see when we run this, this already fails, that it's not long enough. But even if it was legitimate and we'll look at an example of it being legitimate in a second. If we were to run this full thing, it's going to fail. And the reason is because we're not dynamically selecting from the table that this person's entering. What we're doing is we're declaring a select table down here. And we're going, and we're getting, we're setting that select table equal to the table schema. That's why, by the way, this is longer dot the table name. We're the table name equals table. What this is, is this a sanitization, right? We're not taking what, whoever, the users, the hacker in this case, we're actually first of all, sanitizing it. And when we pass in a legitimate variable, we'll see how this all works. But you'll notice that's null. That's because it's looking for a table with his full name and there is none, right? That doesn't actually exist. Will actually, I'd take them out. It's looking for a table that's named this, I'm sorry. It would that right there. Right. But there's no table name that and so it doesn't find any table, so it doesn't return anything. So what happens with this output when we pass in a legitimate table? Whoops. Okay, let's do this. We see that we get, are we get our Intrade output, correct, right? So what happens is, and I don't have the select query dynamically built here first, I'll do that actually, right after this. I can do that live coding, but we see that it passes into legitimate value. It shouldn't be a hacked entry at this point. So that's a legitimate value and it returns DIYbio table, our table. So it returns this right here, this select table, and we see what the quote name does is it wraps it in brackets, right? So it wraps the name in brackets. It then has this period here and quote name. By the way, this is a good example of good database development right here in good SQL development, right? There's legitimizing the input. But you notice by legitimizing the input, we get that from this information schema dot tables. We're not getting it from the user, right? So now what happens if we take this same query up here? Ok. And I still wouldn't write it this way, but if we wanted to write it this way, and we do select star from select table. Okay? And then we, and I'm going to actually go ahead and hash this out here because we already know what that is. So let's go ahead and run this. And we'll see the ID is blank because again, there's no values in that table, but it does return that select. Now again, this is not clean on, I'm not a big fan of this, but it is still a 1000 times better than that affair. And the reason is because what this is essentially running is this right here. And that's right here, Right? And keep in mind you say, well what happens if they put a semicolon? Well, they can't, because this select table variable is coming from this. There's no way they can add input to that, right? So that's why we don't have to worry about that situation at all. Okay, cool. So now let's actually do a live coding exercise. And this is going to take a bit of work. So I'm gonna move my mike a little bit because I'm going to actually be live clothing. And what we're going to be doing is we're going to look at an example because I can see some students say, well, why would we ever use dynamic SQL? And whether we're doing database development or ETL development or any type of development, there's legitimate use cases for dynamic SQL, but we should be aware of some of the drawbacks of that like we just saw. We want to sanitize it. So this is an example of a legitimate use case of dynamic SQL and why it's handy. So I have a table here and what I'm gonna do here is I'm going to create a CTE. I'm gonna call it drops here. And I'm gonna do as and I'm going to do select, select star from table groups. And then I'm gonna do select star from dips, okay? And so this should return output. Yes, we see these three values are duplicates. We see these two values are duplicates and we see these values are duplicates. So doing math on the flat looks like for values out of this table are duplicates that we would like to be removed. Okay? Alright, so we could write the duplicate logic and get rid of the duplicate logic each time, right? And that's something that many people do and it depends on, we might want a subset of columns, but I can tell you that there's a lot of environments that I've walked into where we wanna remove duplicates by all of the column names. So what we want is the entire list of columns. And we want to be able to remove duplicates by the entire list of columns. If we have a situation like this, if you think about it, all of the columns are duplicated. It is true that sometimes we might just have these two that are duplicated that we want to remove. But there are many situations in which we want to remove the, I'm sorry, the rows in the table that have duplicate columns across the table. So how can we do that dynamically? Alright, so let's go ahead and let's write some live coding. What we're gonna do is we're going to dynamically get the entire column list. So what I'm gonna do first of all here is I'm going to declare, and I am going to declare a big variable here because it is going to be big. And that is Navarre car 1000. And I'm going to declare another variable called column, and it's going to be the same size, 1000. And it's because this would store all of the columns. And then SQL is going to be no varchar max. And we'll, I'll demonstrate SQL in a second and why making it Max. But for now, this right here is how this, the reason why this is so long is because I would be storing all of the column names, id I name, I date. But there could be more in this, this, this column list. Okay, then I'm going to declare a table like the other one, which is going to be a variable. And we're gonna do this as varchar 100 and we're going to set it equal to table groups. We remember that was our table. Okay, now the next thing I'm gonna do is I'm gonna select star from Information Schema dot tables. And I'll explain why this, I'm sorry, not tables, I did that wrong. Columns. This is because it gets me all of the column names from a table. And I'm going to say Where table name equals Paypal. Okay, so let's go ahead and do that. And we'll see the output, will see the database, the table schema and will see look at this, the column names, right? That's cool. Alright, so now what I'm gonna do from here is I am going to use our dynamic SQL and I did make one mistake here. I need to set this equal to a blank. Okay? And this is why live coding can be so challenging is because sometimes I forget what I'm doing. And I'm like, okay, wait a minute, how did I do that before? Okay, so what I'm gonna do is I'm going to set this column lists plus equals. That means add what you have here and then also add the quote name. We already saw this used in the other. What this does is wrap it in brackets. Remember, and we're going to say the column name. So going back to this example, when we use this quote name, it wrapped it in brackets like this. Okay, so we're going to actually look at the output here. But you'll notice we're selecting column n and we're adding column list, I'm sorry, and we're adding each of the column names plus a comma. Ok, So let's look at the output of what this would look like. All columnist. Okay, let's look at this. We see we have ID, I name, I date, then we have a comma at the end. Now will see the logic in a second, but bottom line is we want that Indian comma removed. So what we'll do is we will select the length of this string. Whoops. So what happens when you use too many languages since you that's valid in another language, okay? And we'll see it's 21. However, there is a function in T-SQL called substring. And substring, we will enter the expression which is column list, will start at the first position. And we'll do the length is going to be the closing bracket here to length of column list minus one K. Yes, that's correct. We do this and we see what it returns is this ID I name, there's no trailing comma. This is exactly what we want here. And so what I'm gonna do is I'm going to divide away the length of the calmness was 21 characters. We see what this column list is. It's this full thing. What I'm gonna do is I'm going to code this out. I am going to attach the code. But when I attached the code, it's going to look like this as opposed to this other. I'm going to actually code this out as well. And what I'm gonna do here is I'm gonna set column. I'll call this columns. I didn't do that right? Equal to and it is substring this. So this, this variable is equal to this. Okay? Now you're, some of you are probably correctly wondering, Okay, so, you know, why are we doing this dynamically like this? And that is a good question. As we see right now, what we've been able to do is we've been able to get. And the column list, and we've been able to remove the trailing columns, save that. Now what we're gonna do is we're going to select out, we're not going to remove, we're gonna select out the duplicate values. This is where we're going to use this SQL and the varchar max and were the reason why we're setting it SQL varchar max is because we have a lot of SQL that we're gonna be putting in here. And so we're gonna be using that common table expression with Hadoop's As, and we're going to follow the same logic. The first thing that I'm gonna do is the same thing that I did before. I'm going to do select star from TV. Do, keep in mind, we could dynamically add dopes here. We could do a plus and then the table name. The reason why I'm actually I'll come back and depending on how much time we'll come back and do that actually, and if not, we could do that. We can make it one further dynamic. I want to demonstrate that the more dynamic named nature of these columns. But if we have time actually, I'm gonna go ahead and do that. So I'm gonna do select star from Duke's. This is from the common table expression. The common table expression is getting its values from here. But I'm selecting from this common table expression. And I'm just gonna leave it at that for now. And I know what I'm going to add. Well, actually I'll go ahead and comment this, but I'm going to say where du by dy, which is going to be the name of the column and the future is one. Basically we're, we don't get any of the duplicate values. However, I'm commenting that out this will not run and this will run when I run this. Okay, and then I'm gonna do execute SP, Execute SQL, and I'm gonna run the SQL. I could spend an entire video and why we should use this instead. But for now, I would suggest if you really want to know, you can definitely searched for y. But this is a procedure that definitely makes Running dynamic SQL. If it's well-written, it, it does add a layer of security. It does some sanitization, but I'm going to leave it at that for now because that's a little more advanced topic and we're gonna do this, okay? And so we see it actually selects everything. Remember there were nine values. We remember there are four values that are duplicates. Okay? So we haven't done anything to remove the duplicates yet, but we see our dynamic SQL runs, right? And keep in mind there is no input in this dynamic SQL. So if somebody were to say it's unsafe, that's incorrect because there's actually no input, but we're about to use the input of the column list. Now here's y. Using this sanitization is so important, we're not going to use the direct value that they passed in, right? If this was a procedure, if this was coming from the outside, we're not using this value. We're actually going to use this sanitized values, right? Okay, so Let's look at this here. Okay? So we're gonna do a comma here, and we're gonna do row number. And I am going from memory here, so row number over and it's partition by space plus, plus that. And I'll, I'll come back to that in a second order by plus plus that. And then I'm going to call this dupe ID. There's where our column is at, okay? Now we're partitioning by a set of columns, which will, we're organizing it by a set of columns. And then we'll ordering it by those columns. And where are we going to get those from? It's going to be the columns. And remember these are already in the form of brackets, right? So we don't have to add any brackets to our dynamic SQL. So what we're doing here, in fact, I'm going to actually select out the duplicate ID first. We're gonna select star from the table. So we'll do that so that people can, students can see the output. Ok. So what I'm doing here, because to be fair, this video does assume that people from beginners to advanced people are looking at this. So advanced people will understand beginners won't as much. What I'm doing here is I'm passing in this columns. Now keep in mind this columns is not something that they've inputted, right? This is, this is why dynamic SQL can be very useful, is we've gotten this because we've sanitized it. So our sanitization step actually got this. Okay, so now let's run this. And we see by the way, the du by dy, right? And this is correct. This is what happens when you partition by, we're partitioning by these columns and it's saying, well wait a minute, there's three partitions because these are the exact same. If we look at these values, these are the exact same, right? So if you look at this, this partitioned by 123 partition, there's duplicates here, there's 12. We'll see there's no duplicates here. There's no duplicates here, but there's duplicates here, right? There's these two. And so that's the, that partitioning does and we're just ordering it by that set of columns. This is less important in terms of the removal, but we still have to we have to include it. It it has to be there has to be an order when we're specifying that row number. Ok, well what happens if we select the du by dy where du by dy equals one, what's gonna select out are duplicates, right? See how that works. So let's go back to here, just, just to compare. And again, the script is going to be attached. See how we see the du by dy 1231212. It's going to select where it's one, right? And so that's going to eliminate all our duties, which means four values get eliminated and we're left with five values. Okay? So this is a good example of where and this comes up in a lot of environments. We will sometimes have duplicates across that in every column of the table. And we wanna select it out. And you'll notice I'm using dynamic SQL. I don't actually have to reuse the what is at the table each time. And I don't have the eyes to say, I'm sorry, I don't have to rewrite this. What is it to CTE each time this will work for any table with duplicates, right? That's the beauty of dynamic SQL. Now it is true there may be so many columns is this has to be a lot longer. That's, that's possible. Ok. The other thing as well, and I do see this. The one thing as I'm looking at this is we're not making the table name dynamic. So what I want to do here, and as I said, if I had time, I would come back to this. Let's just see. I'm gonna do this really fast. Okay? And off the top of my head, I haven't tried this, but this again is, is more live coding. So just be aware this is the advantage and disadvantage of live coding. If I were to create another variable here called table select varchar. And I think the table is 100. Okay? And I don't know if I can do this, but if I were to do table select equals, quote, name. And I'm gonna do table schema. And then I'm gonna do this, and then I'm gonna do quote name. Their reason is we can always do it in another select statements so we could have the naming, a table name, okay? We can always have another batch of select from here for information schema columns, and we can have another one. However, we might be able to save some space by just doing it here. I don't off the top of my head know if we can do this. So that's why I'm running this as a test. But let's see, incorrect syntax near. Oh, I'm sorry. Select, select, table, select. That's right now set. Okay, that works. Well, but let me just check is columns, okay? Select columns. Let me, I've got to check that one. Yes, that's okay. Okay. So we can do that both. Okay, cool. So we can actually do both of those. Ok, cool. So I'll leave this. Now that I think about it, coded, commented out in the script that I'll attach in the script or whatnot. Or for those of you who are just replicating it, I should say. You can just replicate it. For those of you who are going through this exercise, there are, there's definitely platforms in which I can add the script and then there's platforms I can't add the script. Ok. So now all I have to do to make this further dynamic is I will do from and then plus, and then plus and then that OK. And so what I'm gonna do is table select. Ok. And so now let's go ahead and let me see if I've missed anything. I feel like I have gotten everything. Okay, so now let's run this and there we have it. So now if this was a stored procedure, I don't ever have to rewrite this. Again. The part that I may have to rewrite is the link of these parameters. And the reason is because it is possible that the table has a lot of columns. I have seen table, tables with hundreds and hundreds of columns. And so that is definitely possible. But you can see with this code, what we're able to do and this is why I'm not against dynamic SQL. If we're going to use dynamic SQL though, as you can see, I'm a big advocate of sanitizing the input. So for those of you who are on platforms that the, That allows you, I should say too, that allows me to attach the code. This code will be attached if you're not, and you're in more of the beginner course where it doesn't allow, this is actually very easy to replicate, because what you're actually replicating is just this, this right here. And then you're replicating this right here. And then you're replicating this. So going along and doing that, even in SQL express, all of this works this, there's no advanced T-SQL, different languages, different SQL languages like MySQL or PostgreSQL might be a little bit different, but this is very easy to replicate. And now you have a dynamic SQL function that you can build within a stored procedure that allows you to get rid of duplicates in any table that where the duplicate is based on every single column. So you can get rid of duplicate rows where all of the columns are exactly the same.