Transcripts
1. Analytics Tips Intro: Alright, If you're watching this video, you're in for a treat. This next section of the course will introduce you to some of excels. Most powerful analytics tools like data tables, forecast sheets, gold seek and scenario manager. We'll work on solving complex optimization is with solver, exploring data tables with cube functions and even building Monte Carlo simulations using a full scale roulette simulator. I hope you're excited. Now let's get started.
2. Quick Analysis Tools: all right for this pro tip. I want to talk about how we can instantly explore our data using something called quick analysis tools. This is a one star analytics tip very simple to use, but a really interesting tool to play with. So most recent versions of Excel believed 2013 and onward, include an option called Quick Analysis Tools, and all you need to do is select a range of cells, kind of like this one. You'll see a pop up appear in the bottom, right? Looks like a lightning bolt with a bar chart underneath it. And when you click that option, or press control que, you'll access a quick analysis tool menu. And within this menu you'll find all sorts of analytics tools, including conditional formats, chart types, calculated rows and columns. You can convert this range to a table can't spark lines. All sorts of very, very interesting and powerful analytics tools all consolidated into this single menu here for quick access. And what that allows us to do is take a rock basic range of cells like you see at the top of the slide and convert it into something like this. You know either color scale or adding a calculated column with the average values or inserting a clustered column chart with the click of a button. You know, anyone who knows me and knows my courses knows that. Generally speaking, I try to avoid these kind of shortcut tools like this because I guess I'm old school. I like to go through the traditional menus, but I'm actually quite impressed by how smooth and how convenient this quick analysis tool menu is, and as a result, I've started to use it quite a bit more myself. Now, one note here, you won't always see all of these options because those options and those tools may require the data to be in a certain format. So if you're selected, data isn't compatible with a certain type of tool, like a spark line or a pivot. For instance, you may not see that option in your quick analysis toolbar. So most common use cases here, obviously just quickly exploring a variety of popular data analysis tools without having to manually navigate through different ribbon options. And also, I found it's a great way to quickly add calculated rows or columns for things like sums counts averages totals without having to actually type in a single formula. So let's jump into excel. Got a good range of data that we can use to practice some of these quick analysis tools. All right, so if you'd like to follow along, go ahead and open up your protest workbook had your table of contents, and we're gonna scroll all the way to the right to our analytics tips. And in this case, we want the quick analysis tools Demo. Go ahead and links right out to that sheet. And what you'll see is a simple range of cells here. We're looking at movie I am DB scores averaged out by year 2011 through 2015 and by genre here in column A. Now this is a great sample bit of data to use to practice these quick analysis tools because it's two dimensional. We've got a nice range or table of values here that we can use for value based conditional formatting, things like color scales and high concepts. We've got time, Siri's or trending here with five years of data so that we can show things like running totals or line or column charts. We've got opportunities to add calculated rows and Row 13 or calculated column or spark lines here in column G. So it's a good, versatile data set to really see a wide range of options for these tools. Let's go ahead and select a to all the way down to F 12. Note that I'm including the headers as well. And as you can see, if you hover over that range that you've selected, you'll get that quick analysis Pop up right here in the lower right. Go ahead and click or press control Que. And here we are in the quick analysis menu and you've got formatting options. Chart options totals, which are calculated rows and columns, can have tables or pivots spark lines. So all of really the most popular and powerful analytics tools packed here into a single convenient menu. Now, perhaps my favorite feature is the fact that you don't even have to commit to any option in order to see the output. So you get this kind of quick preview in your selection as you simply hover over these different options. And if you'd like to make custom is ations, which is something that I often do like to do As you click through these tools, you'll actually access the formatting dialog box where you can customize you know exactly what you want here and then press OK and just like any other change, press control Z undo, and you can jump right back into those tools so the formatting options really speak for themselves. Very simple for charts. Basically, this just inserts and new chart as a workbook object got different common options here at the front of our list. In the total section, we can add calculated rose sure shown us this kind of blue shaded row here in the icon, or if we continue to the right, we can add calculated columns. And this use is your most popular aggregation or summer ization modes, some an average count percent of total and a running total again, very easy to add without typing a single formula. There we can convert, arrange to a table, insert a blank pivot table straight from here, or we can add spark lines and excel nose right off the bat That the most appropriate place to drop spark lines in this case would be right there in column G So again, I've been very, very impressed by these tools and by how smooth and convenient they really are. So I encourage you to either work with data shown here or drop in a sample of your own and just explore how these quick analysis tools can help you analyse and understand your data.
3. The Scenario Manager: All right, let's take a few minutes and talk about excels, scenario, manager, tool and, specifically, how this tool can be used to quickly compare different model outputs. So the Scenario manager, which you'll find in your data tab under the what if analysis options essentially allows you to save and quickly access specific combinations of preset cell values. So let me show you an example here. If we drill into that scenario manager option, we'll see a list of any pre defined scenarios that we've created. In this case. I have four. And what happens when you add or create a new scenario from scratch? You're basically telling Excel. I wanted to find a scenario based on this specific set of cells. In this case, I have three of them have named one cell percent down. I've named a second cell interest rate and 1/3 cell term length. Now quick tip. Here. You don't have to give yourselves names. This could just say H seven h eight h nine. I find that naming those cells helps make this process a bit more intuitive and user friendly. So essentially we're telling Excel that we're defining a scenario based on particular values for each of these three cells. This case, I've named the scenario 20% 30 years, and for that scenario, I want sell one to change to a value of point to 20%. I won't sell to the interest rate to change to a value of 0.475 or 4.75% and I want to change Cell three the term length to 30 now. Once you've defined and named that scenario, you can quickly view it and automatically change those cells to the values that you've configured. Now. This is very often used for modelling exercises, which allow you to fix different combinations of inputs like these three cells here, the down payment, the interest rate and the term length in order to calculate the effect on a certain formula driven output in this case, the cash to close on the property and the estimated monthly expenses. And if we were to continue this process and create three additional scenarios, then with a click of a button, we can quickly compare four different outputs based on those four combinations of our inputs. So as you can see a great tool for quickly evaluating the impact of different combinations of input values. Now common use cases. This is often used for building forecasts that are based on several variables that may change like seasonality, interest rates, etcetera, or modeling several potential outcomes in cases where uncertainty or risk is a factor. For instance, creating a scenario for low risk, medium risk or high risk to predict your stock portfolio returns. So with that, let's jump into Excel and create some of these scenarios for ourselves. All right, so go ahead and open up your pro tip workbook. If you're following along, had to scenario manager in the Purple Analytics Tips section and let's go ahead and link straight out here. We've got a property cost calculator now, for those of you have been following along with the entire course. This may look familiar from our formula auditing tip, but basically we're evaluating two key numbers here. Cash required to close on the property and the estimated monthly expenses, both of which are determined by a number of factors here in Collins, G and H purchase price of the property tax rate, different loan conditions here, which we're gonna be focusing on in this demo as well as other factors like utilities, insurance costs, fees and so on. Now you'll notice that I've named each of these cells using the name box right here to the left of the formula bar. So percent down, interest rate and term length and again that will make it a bit more readable, but more intuitive once we start creating some scenarios now, the idea here is to essentially preset a number of combinations for these three values in H seven through H nine, in order to quickly and easily determine the impact to our key outputs here. And there are a few ways to do that. We could turn these into data validation, drop down cells, you know containing a limited list of options and have users manually select those combinations. But in this case, we really have three or four combinations that are most likely and most realistic and to kind of preset those combinations. The scenario manager will be a great tool to use in this particular case, so let's go ahead and select the three cells that will be part of our scenario h seven h eight h nine. We're gonna go into our data town drill into the what if analysis tools in our forecast section and click Scenario manager. So from here, we're gonna define four different scenarios for the purpose of demonstration. The first will be 20% down and a 5% interest rate for 30 years. So let's go ahead and add this one. Now you can give it whatever name you want. In this case, I'm not gonna be very clever. I'm just gonna actually type in what we're using for inputs. So 20% 5%. 30. And since we've selected H 73 each nine Excel already knows that these air the three cells that will be changing to define this scenario. And then you can add comments here customize the protection this case were preventing users for making changes and press. OK, so here we go. It's populated with the current values in those cells, and we can just modify them to fit our criteria. So percent down his point to whatever is easier for you can type 0.2 or you can actually type 20% here. I'm kind of like that approach a bit better. It's been more readable. Um, interest rate in this case is 5% in the term. Length is 30. So press okay there, and it's created our scenario. And now watch what happens when I click this show button boom. It's changed that interest rate to 5%. So now those three cells take on the exact values to find. By that first scenario, we just created. So the same process. Let's cruise through and add three more. Second one's gonna be 20% down at 4% over 15 years. 20% 4% 15 Press OK, 20%. 4% 15 years. All right, let's continue on this. Next one's going to be a lower down payments, only 10% down at 5% over 30 years. And I know I'm moving quickly, so feel free to slow things down to follow along. But here we've got 10 got 5%. That was correct. Over 30 years. That looks good. And finally, we'll add our fourth here, which is 10% had 4% interest rate over 15 and press it Okay, 10% 4% 15. There we go. So now we've got our four scenarios defined here, and to quickly determine the impact of each of those scenarios on our output cells. All we need to do is select the scenario and click Show. So now check that out. It changed to 20%. 4% 15 click. The third item show 10% 5% 30 and then the last option. 10% 4% 15. And every time we change these items, you can see that our cash to close and or our monthly expense calculations change accordingly. So there you go quick demo of using the scenario manager to define preset combinations of cell input values.
4. Optimization with Goal Seek: All right, This next pro tip is a fun one. We're gonna talk about how to solve for individual outputs or simple optimization models using excels Gold seek tool now gold seek, which you'll find in your data tab under the what if analysis options basically allows you to find the result that you want By allowing excel toe automatically change the value of a single given input cell. Gold seek requires very simple inputs and outputs. You need to determine a single hard coded input cell and a single formula based output cells. We can't add other conditions here. We can't test combinations of multiple inputs for those more complex optimization. Problems will be using Excels Solver tool in a more advanced demo in the course. So consider a case like this. We have a value based input, which is the quantity of units sold right there and sell seat to. We have a set of calculation factors like the unit price or retail price, the cost to produce each unit and a certain flat fixed cost. And with those factors along with the quantity value, we can determine the profit, which is our formula based output in this case, our profit will equal the quantity times the difference between the price and unit cost minus the flat fixed cost. So this particular scenario, this simple model checks all the boxes for the goal seek tool. We've got a single hard coded input quantity and a single formula based output. The prophet. So let's see what this actually looks like. When we drill into that goal seek tool, you'll see a dialog box that looks like this, and what we're doing here is we're setting our output cell C eight, which is our profit to a specific goal or target value in this case, $1000 by changing that value based input, see to quantity. So in plain English were telling Excel, I want you to change cell C two however you need to in order to land at $1000 of profit. How much quantity or how many units doing to sell to make 1000 bucks? And when you configure gold, seek this way and press OK, it automatically produces these values. You fixed your profit $2000 kind of working backwards to the quantity that equates to 833 units sold now common use cases here, just like we're showing here, determining the ideal input required to produce a specific goal or target outcome. You know, for instance, we could do a profit of $1000 here. We could do a profit of zero in order to figure out how much quantity we need to sell to break even. You know any combination like that, or really just solving any simple optimization problems that are based on a single input variable. So let's bring this example into Excel and practice configuring the gold seek tool ourselves. All right, So if you'd like to follow along with me, open up your pro tip workbook and from our table of contents, we're gonna drill into our goal, seek demo in our analytics tips section and go ahead and link out. And here, you'll see, are familiar. Very simple model based on this one input quantity and this one output the prophet. And as you can see, if I select cell C two, this is simply a hard coded value. We've got some pre defined calculation factors here from the unit price, the unit cost and the fixed cost in our prophet cell. C eight. This is our formula driven output, which takes the quantity, multiplies it by the unit price minus the unit cost and then subtracts out the fixed cost at the end to determine the actual profit. So, as you can see, you can adjust different values here. Say Okay, if I sold 325 units, my profit would be to 38. If I sold 23,000 units, my profit would be 34,250 and so on and so forth. So the way a lot of users who aren't familiar with gold seek will use Excel and specifically use a model like this is, you know, let's say we want $10,000 of profit. You know, you might say, OK, let's try selling 8000. Well, that's too many. Let's try 7500 are still too high. Maybe 6000 to low 65. You're kind of trying to narrow in on that number using the guests and check approach, and this is exactly what gold seek is designed toe automate for you and will. You also can kind of back into the formula to create that optimal quantity value by pre sending your profit gold seek is designed to do that for you go through that process and not make you kind of reverse the calculations to figure out that number and see to. So let's jump into our data tab. We're gonna look at our what if analysis tools and click on gold seek or second option here and all we have are three conditions here. We're gonna set a specific cell, which is our profit set that prophet to a specific value. And this could be anything. Let's actually start with $0. I want to find the break even point. And we want to do that by changing a given cell, which is the quantity here and see to, and all we need to do is press OK and check it out. It just automated the process of solving this for us. So target value was zero. It optimized found current value of zero. Okay, this tells us that Okay, we need to sell 167 units before we break a profit. And to kind of confirm this if we choose 1 66 well, we lost a dollar. Choose 1 68 we made $2 and you can very quickly test different scenarios here. Like in this case, we want to set the prophet to 100,000 for instance. Same exact process. Same exact inputs. Press OK, And there you go. Found a solution and told us that Okay. For $100,000 profit, you've gotta sell 66,833 units. So there's a crash course in gold seek. It's a pretty amazing tool when you need to optimize simple single input and single output models in excel.
5. Basic Forecasting: all right, time to talk about one of my favorite analytics techniques forecasting. And specifically, I want to show you how we can create forecast sheets using historical data and excel. Now. Recent versions of Excel believes starting with 2016 and Office 3 65 include a built in forecast cheat tool that allows you to calculate forecasts based on a given set of historical values. So, for instance, we have basic data like this monthly average temperatures, which in this case actually continue to extend through about six years worth of data. What we can do is select both of those columns, including the month or the date indicator, as well as the values Drill into the data tab and click on the forecast cheat tool right next to the what if analysis options. And when we do that, we'll see a basic preview of the default forecast that Excel has produced, as well as options to customize things like the forecast length, the confidence interval, any seasonality and also customized how to handle things like missing or duplicate values. And once you've configured those settings as you'd like, Excel will produce a sheet that looks something like this. This is the actual forecast sheet. It's a new tab in your workbook, and it contains the seasonality chart as well as a new table of data containing your existing historical values, as well as a set of calculated forecasted values. I'm gonna show you a couple examples of exactly what this looks like in Excel now one quick note. Sometimes you can use tools like trend lines for very simple forecast exercises. But keep in mind that those won't account for things like confidence or seasonality. So for more complex or custom cases and definitely recommend using this forecast cheat option instead. So common use cases here for one, predicting any future values such as interest rates or stock returns. And two if you want to calculate an expected range of future outcomes based on a given level of confidence, this forecast sheet and the formulas that it produces are very, very helpful for doing that. So with that enough talk, let's go into excel in practice, building our own forecast. All right, so for those of you following along with me had your pro tip workbook, we're looking for the basic forecasting demo in our purple Analytics tips section. Go ahead and link straight out, and here you'll see two different sets of data got stock market data here in columns, a through E, looking at Apple stock prices, that closing price and volume by day. We also have some monthly average temperatures from Barcelona, Spain, which span about six years worth of data. So two very, very different types of data sets that we're going to use to practice forecasting. So let's start with our stock market data here. And one thing I want to show you is if we were to select column B and control, click the values that we want to forecast, which in this case will be the closing stock price. Now, if we go to our data sheet and choose forecast sheet next to what if analysis, you'll see this error message it says, Hey, we can't create a forecast because your timeline isn't evenly spaced. In other words, you've got inconsistent gaps between some of these points and the reason that's the cases, because the market is closed on holidays and weekends. So we have these gaps, you know, like march 2nd and third, where we don't provide any data here. There's no closing price or volume, so what I've done is at a new column here. Column C That's just a series of sequential values or an index that basically just labels the Days Day one day, two day three, regardless of whether it's a weekend or weekday or holiday, whatever it might be. And now, if we select column C and D and go back into our forecast sheet now, we're able to produce an actual forecast, so sell will take its best guess at your settings. You know your forecast. Start your forecast and confidence interval seasonality, etcetera. But if you want to customize exactly how this is configured, go ahead and drill into the options in the lower left and here we've got a bunch of additional options to choose from, and one option is to change how far out you want a forecast. So maybe instead of 1574 days, maybe we only need to forecast to 1400 days, which kind of shrinks the forecast length of it. Another option is to actually change one. The forecast begins. So by default this forecast it's gonna begin. On the last day that we have actual values and the reason why excelled defaults to taking that approach is that allows it to kind of use all of these data points. All of these historical actuals toe help generate the prediction. But what we can do is change this. You no to something like 1000 days and force excel to start the forecast earlier. And basically it gives us a period of overlap where we can actually compare the forecasted values to the actuals. Now, the downside which we're seeing the results of here is that by doing that by accelerating the forecast start, we're feeding less historical information to excel and therefore potentially impacting the accuracy of the forecast itself. So maybe we kind of split the difference here and say, Start at 1100 days. That looks pretty good. We still have a good chunk of overlap here, but that seems to be a slightly more realistic forecast. Now, the other items that weaken customized here the confidence interval and that's basically creating these two bounds the upper and lower bounds and telling us that we can be 95% confident that the actual observed values in the future will fall within these bounds, so watch what happens when we increase the confidence to 99. You'll notice that the bounds get further part, creating a wider window that could potentially contain our actual future values. And if we shrink this down to something like 50% confident now, we have a much narrower bound and a much more difficult target to hit. Therefore, were less confident that the actual values will always fall within this range. So, generally speaking, you're gonna wanna stick with something like 90 or 95 those air kind of standard. And from here we can also deal with seasonality. By default, Excel will detect seasonality automatically. In this case, if there were certain number of days or a certain period where we see a pattern repeat over and over and over, we could set that pattern manually, and I'll show you exactly how we could do that for the weather data. And then when we create the sheet, we can also include some forecast stats like Just check that box. And last but not least, we've got our timeline range and column C Day index values Living column D, and then we can choose if we had missing points we could interpret late early of those at zero. This case, we don't have any missing points. And if we happen to have duplicates, could choose how to summarise or aggregate. Those duplicates in most cases averages an appropriate choice there. So with that, I'm confident. I'm happy with our choices. That way we can figure this. Go ahead and click. Create, and what Excel does is insert a brand new sheet here, gives us a little pop up toe. Welcome us to our new sheet, and it says, Great. We've created a table with a copy of our data Columns A and B along with the forecasted values and our confidence. Bounds in column C, D and e So got it. Sounds good. Let's go ahead and drag this sheet to the right of our forecasting sheet and name it something like stock forecast. Check it out. It's included the chart for us, just like we previewed. It's included these forecast statistics here in this little mini table in columns G and H. I won't get into those right now and then got our actual copies of our historical data points here. In comes A and B and if we scroll down or even more efficient, selected empty selling used. The control are down approach. Here we go, the last day of our observed values or in this case, the first day that we determined that we want our forecast to start now. These new columns air being populated here as well, and they're actually being populated with forecast formulas that are feeding in inputs based on how we configured our settings in the dialog box and same thing with our lower and upper confidence. Bounds were using confident confidence interval formulas here. So if we wanted to tweak any of these settings, we could either go back and create a brand new forecast from scratch using the dialog box. Or we could go ahead and update individual inputs or arguments of these functions and have the table update dynamically. So really helpful tool there. Let's scroll the top. And one thing I want to show you here is that in this case we can edit this chart just like any other. And let's say that we wanted to follow a simpler approach for forecasting these stock values and simply go to her chart tools. Add a chart element and throw a linear trendline in place that created this blue dotted trendline which, as you can see, follows almost the exact forecast that Excel is created. So in this particular case, using a linear forecast as a very quick, simple first pass might be a valid approach, and in fact, it's gonna produce almost exactly the same values as our forecast here. But that's not always gonna work. And one case where it often fall short is a case where you have seasonality. So let's jump back to our basic forecasting sheet, Select Columns G and H This time, which is our monthly temperature data. Go back into data and let's see what happens here and we choose forecast sheet. So take a look at this very, very different pattern here, and there's clearly seasonality that's happening over the course of those six years of actual values. So the difference between this example and the stock example is that a we can't just slap a basic trendline option here to follow this forecast path because there is a seasonality trend happening because, literally, we're looking at season's worth of data. Eso Excel is saying all right we've detected seasonality automatically and if you said it manually, were basically saying every 12 points, which are months, we see the same pattern repeat, which makes sense. So in this case, I won't create a new sheets. It'll look just the same, has the other one. But there you have it, too. Good examples of how to use excels forecast sheet to provide predicted or forecasted values using two very different types of data.
6. Outlier Detection: all right for this pro tip, We're getting a little bit more advanced. This is a four star analytics tip. We're gonna talk about how to find or detect outliers, using a combination of stats, functions and conditional formatting. Now, what we're gonna do here is use functions like median and quartile toe actually calculate statistical outliers, and then we're gonna apply conditional formatting rules to draw attention to them or highlight them within our columns. So in this case, we're looking at athlete data with names and column A. We've got heights and column B and weights and columns. See, And what will be doing is creating the outlier calculations. Using those stats functions, we'll calculate the median value. The first quartile, third quartile, the inter quartile range or like you are and something called defense multiplier. And that fence is essentially what determines how far out of the norm of value must fall in order to be labeled as an outlier. And what will end up with is a range of values to find by the inner or lower fence and the upper or outer fence, meaning that any values that fall outside of that range well, technically be labeled as an outlier. So from there, all we need to do is head to our home tab, conditional formatting rules. And we're going to use a custom highlight rule to highlight cells that are not between those fence values and sells F eight F 11. And when we apply that formatting rule to in this case column B immediately receive those outlier values highlighted in yellow Now. One thing to note. If you're dealing with much simpler calculations or formatting rules like the top end values or a specific top percentage, just use basic conditional formats. You don't have to go through this entire process, but what we're doing is using actual statistical functions to define those outliers in a little bit more of a scientific way. And as I'll show you from here, we can kind of adjust our fence and see how that impacts, which sells air being highlighted and labeled as outliers. So common use cases here first and foremost identifying any statistical anomalies in the data set based on this custom criteria that you define and second, finding and removing values that may have been incorrectly entered. So if we saw height here of 800 inches. You know, that will very quickly be flagged as an outlier and a mentor what settings we choose. And that would be a pretty good indication of user input error and not ah, not just a very, very tall human being. So with that, let's jump into a pro tip workbook. We're gonna take a look at this exact demo data set and run through this process Step by step. All right, so if you're following along, you should know the drill by now, had your table of contents. And look for the outlier detection demo in our Purple Analytics tips section. Go ahead and link straight out to that. She and what you'll find here are player names or athletes and column A. And just like we described heights and weights and columns B and C. Our goal is to populate these statistical calculations here in column F to determine exactly how we're treating and defining outliers. So we're gonna start with median and sell F to. This won't actually be used in our outlier calculation, but it's a great summary stat to include I'm just gonna use the median function 0.2 column B press OK. That tells me that the median height in this sample of 74 inches now that brings us to the core tile functions. Start by typing a quartile formula, and what you'll see are three different options. Here you've got the E X, c or exclusive version I NC inclusive version and the original quartile version, which is a compatibility function compatible with earlier versions of Excel. And what this tells you is that originally, there was just the quartile function and then starting and I believe 2010 Microsoft started introducing new variations of that formula. Now, in this particular case, we'll see the same answer no matter which version we use. But I want to show you the i N C. The inclusive version right now, some tab that in all three versions use the same arguments, starting with the array cominby and then the court tile, which you can label with a 0123 or four. And in this case, we want the first quartile or the 25th percentile in a press 81 there and close it off and press enter. Now what this tells us is that players who are 72 inches tall are at least as tall as 25% of the other rose or players in the sample. Now a little bit of a nuance here. But the difference between the inclusive and exclusive versions is that the exclusive version would basically determine the value that's greater than 25% of values in the array . The inclusive version determines the value that's greater than or equal to 25% of values in the array. So that's the difference there. That's the nuance. Exclusive acts like greater than inclusive acts like greater than or equal to. And one other note here. If you did want to use this function to return the minimum or maximum value with a label of zero or four, you would have to use the inclusive version to do that. So it's going and drop. Are one back in their press enter, we can simply copy that formula. Paste it here for 1/3 quartile simply changed the 123 and we get a 75. Now I Q R's air Inter quartile range. That's a simple as the third quarter I'll in a four minus the first quartile in F three. It is a three inch range, and next up, our fence multiplier. Now what the fence does is essentially determine how far outside of our inter quartile range a value must fall in order to be labeled as a statistical outlier. So the higher the fence or the higher the fence multiplier, the further from our expected range. That value needs to fall to be considered an outlier, and this is a hard coded value. In many cases, 1.5 is kind of a good standard starting point. Let's go ahead and start there and then last but not least, we need to determine our inner fence or lower fence as well as our outer fence or upper fence. So the lower fence is defined as the first quartile, minus open parenthesis. Inter quartile range times our fence so F three minus five times at six that returns a 67.5 . That means that the lowest value a Selcan take before it's treated as an outlier is 67.5 inches. Very similar logic here for the outer fence or upper fence. Take the third quartile and we're going to add the I Q. R. times the fence multiplier Close it off. Press enter. Same thought process here. The highest value a Selcan take before it's treated as an outlier is 79.5 inches. Now, from here, we've done the hard part. The last step is to actually flag or draw attention to the actual outliers that we've defined here in column B. So it's grabby, too. Old control shift arrow down to grab the entire column. Gonna go into our home menu conditional formatting. And we're gonna use, um, highlight cell rules here. And what we really want is cells that have a value that's not within the range of our upper and lower fences. So we want a not between option here, which we don't have is one of our standards. But if we go into more rules here, we have that option where we can say format cell values that are not between. And then we can simply point to the cells that we care about, which are all the way up here at the top. So we want to highlight cells that don't fall in this range. They're not between eight or s 11 and we comply. Format this one's up to you going to use kind of yellow Phil here and press OK and check it out. We press OK now. Any value that's higher than the upper fence greater than 79.5 are now highlighted. And if we jump to the bottom control error down, we've got to values here that were below the lower fence of 67.5. So we've essentially flagged those outliers based on our statistical conditions or a configuration here. And I watch what happens if we change this fence multiplier from a 1.5 to a one. Now we're including more values that were labelling as outliers here because we've created essentially a narrower range of normal values. We change this fence multiplier to to or expand that fence. Now we're making it less likely that a value is labeled as an outlier. So we only see a few data points here in column B that get flagged in this case, three of the tallest baseball players in the game. So there you have a great little crash course on how to use some of these relatively simple stats functions combined with conditional formatting toe actually detect and flag outliers in your data
7. Automated Data Tables: all right for this pro tip, we're gonna talk about a four star, relatively advanced analytics tip and we're going to discuss how to evaluate variable inputs using excels data tables. Now, the data tables that we're talking about here are different from your standard cell ranges , formatted as tables. In fact, I wish Excel gave them a different name because they actually operate. Quite differently we're talking about are the data tables that you'll find in your data tab from the what if analysis tools And what these data tables do is they allow you to calculate on array of results with the click of a button based on an entire range or variety of potential input values. So to show you what I mean by that, we're gonna look at our property cost calculator tool, which we've seen before throughout the course. And the idea here is to understand exactly what our mortgage costs are. Monthly mortgage costs would look like based on different interest rates ranging all the way down from 3% up to 10% in half percentage increments. So the manual kind of tedious old school approach would be to plug in those different interest rates into Cell H eight and see what the mortgage costs spit out and sell h 14. But instead of doing that, we're gonna get a little bit more sophisticated, and we're gonna select this table this blank range in columns, Jane K. We're gonna use excels data table tool to produce an array of results based on that range of interest rates. So we're gonna head to our data town gonna drill into the data table option from the what if analysis tools. And then here we only have two potential inputs a row input cell and a column input cell. And in this particular demo, all we have is a column containing that range values and the cell in which we want to test that range of values lives in h eight. So by treating age eight as the column input, Excel is able to spit out this entire array of results right there in column K based on the range of interest rates in column J. So rather than manually testing 15 or so different interest rate values and recording the results or typing a formula 15 times, we've configured the data table tool one time and produced all of the results with the click of a button. Now, data tables can be used to evaluate results like we see here based on changes to either a single input variable like we've shown with interest rate or with multiple variables, in which case we also plug in a row input cell. And I'm gonna show you an example of both as soon as we jump into excel. But real quick common use cases calculating a matrix of results based on combinations of input values like this. Monthly payments based on things like interest rates and down payment amounts. Or taking that even a step further. You could use this tool to identify the optimal outcome given multiple combinations of variable inputs. So let's go ahead. Open up our pro tip workbook, roll up our sleeves and practice building some of these data tables. All right, so from your table of contents, go ahead and scroll over to the analytics tips. We're gonna look at the data table demo in this case, go ahead and link straight out to that purple tab. And here you'll see our property cost calculator, which, as we've seen before, takes a bunch of inputs like the purchase price, that tax rate and some loan terms and spits out outputs like the loan amount that mortgage costs, property tax, etcetera. Now what we care about in this case is the monthly mortgage cost here and sell H 14. And this mortgage cost is a function of a few different factors for one, the down payment and for to the interest rate, among others. So if we change your down payment from 20% to 30% you'll notice that our monthly mortgage costs are reduced. By contrast, if we change our interest rate from 4% to 5% our mortgage costs increase. So these two factors both impact the mortgage cost. So what we're gonna do here is create two different data tables to test or evaluate different combinations of those potential input values. So in the first case here, all we're testing are variations in the interest rate. So Step one is to create a column containing each of the variations in the interest rate that we want to test in order to see the impact on our output formula or a mortgage costs. And I've just got to show you this tip here, you can either type in these values manually or what you can do and delete those You can type your first value here, hover over the lower right corner and listen to this. Hold the right click button, dragged down and back up, then release to access this kind of hidden menu here. And what we're gonna do is drop into Siri's at the bottom. We're gonna fill a Siris in a column. We're gonna step or increment by 0.5%. We're going to stop at 10%. That's our max value press. OK, Booms filled that perfect, incremental serious for us with one click of the button. Now, the goal here is to populate these blank cells here from K four through K 18. And in order to do that with data tables, we have to set things up in a very specific way. Step one, which we just achieved, is laying out the range of input values. Step two is actually linking to the formula that we're looking to evaluate. So we're just taking the formula from H 14 and we're referencing it right here in South K three. What that's gonna allow us to do is select this entire range of cells, which includes both the formula itself and the variety of input values. And from here we can go into our data tab. What if analysis dated table and again, just like our demo. We don't have any input variables in a row, but we do have variable inputs in a column, and the input cell itself is not this range that I just created. It's the cell that's actually referenced in the formula that contains the interest rate, which in this case is sell H eight. And that's all we need to configure this data table press OK, and there we go. It's populated all of the monthly cost outputs based on this different range of interest rates. And as you can see, it's generated a single array or table array rather than 15 individual formulas here. So moving on to our next demo, we're gonna follow that same approach, except we're adding one additional set of criteria here for different down payments so it will do here is create an actual matrix of values that account for not only changes to interest rate but also changes to down payment either 5 10 or 20% so very similar process here and select the entire range of cells, including the formula and the variable row and column input cells. We're gonna go into data. What if analysis dated table? And now we do have a row input cell. And that's the cell in the formula containing our row input variations, which are the down payments down payments live and sell H seven. There we go. And again, our Colin is the cell containing the interest rate. Just our column index and press. OK, and there you have it. We've produced a single two dimensional array this time that produces output values based on multiple criteria. So there you go. These data tables in excel could be a little bit tricky, a little awkward to work with at first. But once you really get a handle on them, they could be a great tool for evaluating an entire array of results based on variable inputs
8. Power Query Tools: all right. I want to take a few minutes and walk you through a quick and brief introduction to one of the most powerful tools in Excel called Power Query. Now, this is a four star advanced analytics tip. We're gonna talk about how to connect, shape, transform and load data from external sources into excel using these power query tools. So in classic Microsoft fashion, you may see these tools named all sorts of different things. Depending on the version of Excel that you're using, you might see power query you might see. Get data you may see get in transform and all means the same thing. I'm using office 3 65 pro plus. And this is what I see in my data tab. I've got a group of tools called Get and Transformed Data and they get data command off to the left. And what we're gonna do in this demo is connect to a flat CSP file note that you can also connect to database sources or web ap. I isn't so on and so forth. This case, we're gonna keep it simple. I've made this file available, is part of the course. Basically, we're going to do a quick analysis of some Kickstarter campaign data. So once we point to the file will see this query preview and then click Edit tow. Launch our query editor, and this will open as a separate workbook outside of our Excel worksheet environment, and you'll see a preview of your data here and all sorts of tools that we have to shape and filter and customize and transform this data before we load it into excel. So this is like our command center for configuring this query or connection. Now, keep in mind, there's no way I could even scratch the surface of all these tools within the scope of this particular demo. But I do have an entire course that goes in depth into these tools the Power Query, Power pivot and Dax course. Now, for the purposes of this introduction, keep in mind that there are a few important sections of this query editing window. You've got your query editing toolbar at the top with options to transform or add new columns and a window in the lower right called Applied Steps. Now these applied steps are absolute game changers because essentially, what's going on here? is that every time you make an adjustment or transformation, any operation that you apply to your table gets recorded as a new applied steps exactly like steps get recorded in a V B a macro. Now, once you've transformed and shaped your data and you're ready to load it into excel there to places where you can load it either to a worksheet where it will live in rows and columns and cells, or to the data model, where you can compress this data and store much, much more information hundreds of millions of rose. And it's also where you can actually build relations models to create table relationships and connect or blend information from multiple sources. So to quickly summarize common use cases number one connecting to flat files or database sources and then transforming or filtering that data before loading it into excel. For further analysis, number two Creating a fully automated E T L process, which stands for extract, transform and load that could be refreshed with one click as new data becomes available again. Details. Air all covered in my power query, power pivot and Dax course. So with that, let's jump into excel. You're gonna walk you through one quick demo for how we can use power query to connect and transform on external CSP file. All right, so if you'd like to follow along, head your table of contents and look for this power Corey demo in our purple analytics tips , and when you link out, all you'll see here is a blank sheet. For now. Now, to introduce this topic of power query, I've made a project file available to you called Kickstarter projects dot CSP. Feel free to download that to follow along or just sit back and watch this show. I'm gonna head to my data tab here a couple of ways I can grab a CSP file either from this from text slash CSP button or in the get data drop down, which shows me some of the other data sources as well. This case I'm gonna go into from file from text slash CSP. I'm gonna double click the Kickstarter Project file that I've loaded to my desktop. So the first thing we'll see is a quick preview based on the 1st 200 rows of that file got project I DS names, categories, main categories, then all sorts of information about when the project was launched, when the deadline is how many backers, how much money was pledged and so on and so forth. Now we're gonna want to filter down this data, trim it down to just what we need. So we're gonna click the edit button toe, actually launch the query editor and you'll see that it launches in a new window outside of the Excel spreadsheet environment. And this is where we're going to do all of that data shaping and filtering and transforming before it gets loaded into excel and you'll notice it's already applied a few steps to identify the source from my just stop here promoted the headers, and it's detected some of the data types here and applied those data types automatically. So for the sake of demonstration, let's add a few more filters here. Um, perhaps we don't want all of the deadline data here got projects going back to 2009 so let's apply a date, filter and say Onley projects out of here. Adobe reader that are after, Why don't we say 2014 so we can go to December 2014 Select 31st Press OK, we've added a new applied step to filter those rose, and we can sort that column ascending just like we would with any Excel table to see that. Now we're dealing with projects that only have deadlines going back to January 1st. 2015. Now the same story here we can filter down by country. For now, it's just look at the U. S. Projects. There we go. And now the state whether campaign was successful or if it failed or was canceled supply filter there as well. And on Lee, look at successful projects. Press. OK, now we could continue to go on and on. We could change data types here. We could remove extra columns, got some redundant ones in this case. But for the sake of demonstration, let's go ahead and move on to loading the data. I'm gonna choose clothes and load to from the upper left, and this will show me my loading options. Now, if I didn't want to load this data to a worksheet and actually store those rose those observations in rows and columns but I could do is Onley create the connection and drop it into the data model. No, you may not have access to the data model, depending on the version of Excel that you're using. And in this case, I want to reserve the data model conversation for a different pro tip. So for now, we're gonna use the table option and actually load this filter data that we created right here into our existing worksheet, the power query worksheet and sell a one and we press. Okay, we'll see our query pain pop up and it says, All right, we're loading some data from that Kickstarter project CSP. It loaded 41,097 rows and it dropped that data as a table right here in my power. Corrie 10. Let's go ahead and close that pain from here. We can work with this table just like any other we can insert Pivot table, for instance. Let's drop this in a new sheet and tab over, and we can call this one Kickstarter pivot, for instance. And here we've got our fields from that Kickstarter table and we can work just like a normal pivot table here. Maybe want to break down data by main category here on Rose. And in this case, let's look at the amount pledged in U. S. Dollars. And just like any other pivot, we can change the number formats. Currency could have done this in the query editor as well. And let's grab these state projects state as filters. And remember, we sorted it are filtered it down toe only show successful projects. That's the only option we have here. Now that's pretty cool. We can now see how much money was pledged by category. See a lot of money and design games technology, which makes sense. But this isn't really what makes power query so powerful we could have just loaded up that CS being copied and pasted into a tab. What makes Power query so powerful is that if we now want to change some of the settings or features of our connection, maybe we don't want to filter down toe Onley successful projects. Maybe we want to look at failed projects Now. Instead, we don't have to reinvent the wheel. We don't have to manually filter down table that was dropped into her power. Corey Tam. Instead, we can go into data queries and connections, and we can right click to edit that Kickstarter project query and check it out. This last step, this filtered Rose step is where we had filtered down to country equals U S and the Project state equal to successful. All we need to do is click this little gear. I come to edit that step, and the second line here is thes state filter. Click this ellipse, delete that filter and press OK and it will refresh to bring those rows back in from the original CSP file. Now, when we close and load, you'll see in our query pain that Kickstarter Project Curry is refreshing itself. And what we should see is that more data comes in here instead of just 41,000. This should show a higher total because we're now including projects with different projects states. And that's exactly what we see Now we see 127,000 875 rows loaded to this table and just like you'd expect, we can close out our pain, head to our pivot, jump into pivot table tools and refresh, and now check it out. When we look at our state filter. Now, we don't only have successful options. We have failed, canceled live and suspended as well. And there you have it. So very, very quick, very surface level demo of some of these basic power query tools. Again, if you want that deep foundational knowledge, check out the power query, Power pivot and Dax course and go really enjoy it. But hopefully, for now that at least inspired you play around with some of these more advanced tools and see how they might be able to revolutionize your workflow in excel.
9. Data Modeling 101: all right for this pro tip. I'd like to introduce you to a very powerful concept and excel. We're gonna talk about how to build an analyze Relational data models in Excel now excels. Data model is used for a number of different purposes. For one, you can compress extremely large amounts of data hundreds of millions of rose, if not more. Second, you can create table relationships, which is a much, much more efficient, an elegant way to blend data across tables or sources without manually stitching them together with self formulas. And third, you can add brand new calculated columns and measures using a formula language called DAX or data analysis expressions. So there are a few ways to get data into the data model. The most common ways to create a new connection to a data source using a tool like power query, also known as get data or getting transform and loads straight from that connection into the model. A simpler way to add dated to the model, especially for the sake of demonstration, is to take a table from a worksheet, which is generally a much smaller table and click the add to data model button within the power pivot tab. Now this is a good point to pause and let you know that some versions of Excel will not have access to these data modelling tools. So if you don't see your power pivot tab, the first place to check is file options. Adan's and look for your calm Adan's specifically, If you still don't see power, pivot there. I'd recommend Googling Where is power pivot? And you'll be taken to the office support website that looks like this. And here you'll actually see the exact office products that do have access to these data modelling and power pivot tools. Now, assuming you do have access to the data model, once you've added your tables to the model, you'll be able to manage or edit your data model window and see a few different views like you see here on the left, we have the diagram view, in which case we see our tables appear as individual objects, and that's where we can create the actual table relationships between them, based on things called primary and foreign keys. On the right, we have our data view, which is a more traditional kind of tabular layout. But the beauty of the data model is that once these relationships are defined, once you've created what we call a relational data model, which, for the purpose of this lecture I'm defining as a group of related tables, then we can insert a pivot table to explore and analyse the data across all of these related sources in one single view. And that's why we call this a power pivot. It looks and feels exactly like a regular pivot table, but it's sitting on top of a data model. So common use cases here, combining information from multiple sources without actually mashing it together by merging or using functions like Look Up or Index and two. This is the foundation for building robust business intelligence solutions that can integrate and blend data sources from all sorts of places, like sales data. HR finance, marketing, etcetera. So with that, let's jump into excel. I'm gonna run through an extremely quick, very brief demo to showcase some of what these data model and power pivot tools conduce. All right, so if you're feeling brave and you'd like to follow along, head to your table of contents in your protest workbook looking for the Purple Analytics tips here and let's jump into the data modelling Demo four star demo. We'll go ahead and link out to that sheet, and what we're looking at here is a table called Transactions. We've got the transaction date column A got the product that was sold and Colin be some information about the customer based on a customer i d. Here in column C and then the actual values here, the quantity sold in column D Now on its own, this table isn't very helpful to us. We could plug this into a pivot table and we could roll up the quantity or the sales. You know, for product I ds or customers or days. But what good does it really tell us that we sold product number 76 six times, right? We have no additional context about these I ds. So what we'd really like Here are some look up tables that could potentially map this product I d or this customer i d two additional dimensions or additional pieces of information that we can use to really learn more about these sales. And in fact, that's exactly what we have. If we expand thes group columns. You'll see we have three more tables. Got a green table called products. We have an orange table called customers and a yellow table called Calendar. Now, as you might expect, each of these tables includes an I D column or, in the calendar case, a date column, which can map this information back to our transactions. Data in columns Athor deep. In other words, if we know the product I D, which we do, then by creating a connection or relationship to the product table, then we also know the product brand and the product name, as well as the retail price and cost. Same goes with the customer i D. That key can be used to pull in information about those customers the names, cities, countries, marital status, gender what kind of membership they have. All of that information can be tied into this transaction table because we have the key columns that can connect them. So you might be tempted to go ahead and start writing functions like the look up or index match to tie these together into a single master table. Because in old school, Excel were used to needing a single table or single source for pivot tables. That was always the one big limitation of pivots. So maybe you type, you know, if you look up here, we're looking at the product I d within our products table and grabbing the second column with an exact match. And boom, there we go. We have our product brand now in column E. So we tied that product brand in to our transaction table. Now listen. We could continue this process to pull in product name, retail price, as well as all of the customer information and all of our calendar information as well. But there are two problems with that number one. What if these tables included hundreds of columns instead of just a handful like we have here all of a sudden? This approach isn't quite a scalable, and to what we've created is a really inefficient table with a ton of duplicate values that simply aren't necessary. So let's go ahead and delete that column here, and I'm gonna show you the more elegant the more sophisticated approach to accomplish the same thing using the data model. So I've already added these tables to the data model, but If you'd like to practice yourself, you would select an individual table, drill into your power, pivot time and click this ad to data model button right here Now, because I've already added them, we can go to our manage button which will open up the data model window here. We're in our data view. By default, we can see these four tables in tabular format as separate tabs and diagram view, which actually shows each of our tables as an object on the canvas. Now you'll notice these lines connecting the tables represent relationships. So in this case, product I D. Which is highlighted, is what connects products to transactions. We have a customer, i d, which connects customers to transactions. And finally, we have our transaction date which connects to the date field in our calendar table. Now, to show you how to create these relationships from scratch, going ahead to the design tab here, manage relationships, and all we need to do is select these three relationships, press delete, okay. And when we close that dialogue box now, we'll see those tables kind of disconnected again. And all we need to do to simple is grabbing the field we want and connecting it to a primary key within the look up table. So transaction date to date product i D to product i d. Customer i d to customer 18. And it's a simple is that in a matter of under 10 seconds, we've replicated all of the effort of creating those look up or index functions to manually stitch that data together. And we've accomplished the exact same result. And now here's the best part. I head back into the home tab here, insert a pivot table in a new worksheet press. OK, let's double click to name this time something like data model pivot. Here's the beauty of it. Now we're dealing with a power pivot, which is technically the same old pivot table layout just sitting on top of a data model. And we have access to all of our data model tables, calendar, customers, products and transactions as well as other tables in the workbook, which in this case won't be helpful because there's no relationship to find between them. So if we drill into our transactions table, you'll see a few different measures here that I've defined transactions, total quantity and revenue These were designed with a language called Dax, which is out of the scope of this course, but it's something that I do cover in my power query parapet and Dax course. So let's grab total quantity. Pull that into our values. Here and now we can access any one of our other tables and break those quantities down in all different ways, like Byproduct Brand on Rose, for instance, here we can see that American products sold 1200 atomic products sold 695. We can grab a field from our customer table as well, like the customer city. Drag that in and secondary role labels. Pope product brand out. So as you can see, I'm manipulating this pivot exactly like I would manipulate any other pivot table in Excel . The only difference is that now I'm able to access fields across four different tables that are connected only with the relationships that we defined within the data model. Incredibly powerful stuff
10. CUBE Functions: all right. Time to dive into a five star, truly expert level analytics tip. We're gonna talk about how to explore data models outside of pivots using something called cube functions. Now, two quick caveats before we dive in. Number one. If you haven't viewed the data modelling pro tip or you haven't learned about data models in the past and highly encourage you to do so Number two, you're gonna need a version of Excel. It's compatible with the data model and with power pivot. If you're not sure, go on Google search for Where is power pivot and you'll jump to the office support website that looks like this. And here you'll be able to see the exact versions of Excel that include access to these data model and power pivot tools. Now let's get back to it. So cube functions. The easiest way to explore data in a data model is through a pivot table or a power pivot. But what cube functions do is allow you to pull a retrieve specific filtered values out of your data model and pull them directly into worksheet cells. So, for instance, if you want to create a view that looks like this. You can use cube functions to do this, and in my experience, there are four types of cube functions that I use almost exclusively. The first is called a cube set, and essentially a cube set is a collection of items or cube members from your model. It's essentially equivalent to an entire column from a table in your model. In this case, we've defined a cube set that contains information about product brands. Now, within cube sets, you have something called Cube members and these air shaded in blue here in the Visual Cube member is a single item from within a cube set. So, in other words, it's one item out of a table column here. We're looking at different values within the membership column in our customer table, golden, silver, bronze and normal. And note that we also use Q members to define quantitative measures or calculated measures like transactions or quantity. Next up, we have a special type of member called a cube ranked member. These are shaded in green here in the visual and these air, just like you members, their individual items within a cube set. The only difference is that they're based on an order rank. And that's what allows us to do things like show the top five products by quantity, for instance, like you see here. And last but not least, we have our fourth primary type of cube, which is the cube value, and those are all of the cells shaded in yellow. There, the actual numerical values that were aggregating based on a set of member expressions. And those member expressions help us filter down those values to retrieve the proper numbers. Now these condemn finitely be tricky and a little unfamiliar to work with. At first I'm gonna walk you through exactly how to build a view just like this. But keep in mind that there are many other types of cube functions as well, and I've included a link for more info here. What this short link will take you to is the office Support Documentation Page four Cube functions, which looks like this, and here you can scroll through and learn a little bit more about the fundamentals of Cuba functions and some of these other options that I'm not covering. So common use cases here for one building spreadsheet based reports or dashboards that sit on top of data in your data model without having to rely on pivot tables and to cube functions could be a great way to document all of the sets and members within your data model. So with that, let's jump into our pro tip workbook and practice building a report view like the one you see here. All right, so if you're following along with the course, you know, the drill had to your table of contents scroll to your purple analytics tips here, we're gonna drill into this five Star Cube function demo. Go ahead and link out to that sheet. Now, what you're looking at here is a basic blank template that I've created of a sample report . There's no formulas or anything here. It's just format. It sells. And our goal is to convert this and wired up into a fully functional report that pulls in data from the underlying data model. Now, to get started, we need to do a quick review of the data model that we're working with. So I'm gonna jump into the power pivot tab, click on manage the data model, and let's jump into diagram view here So this is the model that we're working with in this workbook. We've got transaction data with three calculated measures transactions, total quantity and revenue. Those in the measures are metrics that we're gonna pull into our report. And we also have a number of fields that we can pull from. These three look up tables got calendar fields like the day, the month, quarter and year customer information like countries and cities, gender and member card status and products which include things like the brand, the name, the retail price and the cost. So let's go ahead and actually insert a pivot table from our data model, and I'm gonna drop it into an existing workbook. Gonna put it right here, um, in our cube functions worksheet. And let's drop it right here and sell H two and press OK. And what we'll do is use this pivot table to help spot check in queue and make sure that the numbers that were populating with cube functions are accurate. And then once we've done so, we can go ahead and delete the pivot itself and be left with just the custom report that we've built so as you can see here we've got a few different breakouts of the data. We're looking at three different metrics. Transactions, quantity and revenue. First, we're breaking it down by month for the previous three months. April, May and June. Then we're doing a breakdown by member type golden, silver and bronze. And this third table here is showing the sales by product for the top five products based on either transactions, quantity or revenue. So let's start at the top with sales by month. And I'm gonna drill into this pivot, and I'm gonna pull in some of these fields transactions, quantity and revenue. And in this case, we want to break it down by month from my calendar table Grab month name pulling into rose . And in this case, all we really care about are April, May and June press. OK, let's sort it a dizzy. So this is a little preview of the view that we're trying to achieve right here by pulling those same values into the worksheet cells themselves. So if you recall the cube value function was the type of cube that can pull in or aggregate those numerical values. Problem is, if we start by just typing cube value. We're open the parenthesis. Our connection is going to start with an open quote and it's gonna be this workbook data model. You can press tab toe lock that in closed The quote Now the second argument and the third and the fourth and the fifth for a cube value function basically ask you to point to a member expression. And that member expression helps tell Excel how to filter down the values to return. So in this case, we're filtering down based on two different conditions or criteria. The month name and be five and comma to the measure name in Cell C four. So that's really all I need for this cube value function. I close the parenthesis press enter, I get in N A. And the reason I get this error is because Q values don't know how to interpret cells that just contain text. So right now this cell means nothing. It's just a string of text that says transactions. Same thing with my month names. April, May June golden silver, bronze. So what we're going to need to do is convert these text strings into actual Cube member functions. So let's start with our metrics here, we can type equals Cube member open Quote this workbook data model Start like this every time comin over to the member expression and this is the field or the item from our data model that we want to capture in this cell. So open another quote. This will allow you to access the tables and fields in your model. This case, we want a field from our measures. And if we enter a period, it takes us to the next level in the hierarchy, which is the actual list of measure options here. And we want transactions. So I'm gonna tab it in, close the quotation, close the parenthesis, press enter. So now it still says transactions. But now this is an actual que member, and we're gonna go through the same process here for April equals Cube member from this workbook data model and the member expression, remember, opened the quote again. This time it's coming from the calendar table period to the list of columns we want the month name Colin. And now, because we want a specific month within the month name Colin, we're gonna add one more period and type the name of the month surrounded in brackets. So we want specifically the month of April and then close the quote. Close the parenthesis. Feel free to pause on the screen for a moment to make sure that you've got the right function and I'm gonna press, enter and check it out. Two things happen. Number one. My cue member cell now takes a value of April, which is not just text. It's an actual que member, and our cube value is now returning a proper number, which is 6588 which we can map to our pivot as a Q. And that does match. So we are on the right track. And now that we've defined some of these, it's very easy to apply them to the rest of ourselves so we can grab this transactions column, copy it, paste it two more times, and then simply adjust that last argument from transactions to total quantity and from transactions to revenue press. Enter same thing with April, May and June. I'm gonna pace this two more times and just change the April in the final argument. Two main and finally to June. And just like regular Excel functions. We can treat our reference types carefully here because our months always live in column B . We can fix that column reference. And because our headers are measures always live in Row four. We can fix that row reference and press enter and check this out. Dragged down, drag over and boom! We've got all of our Q values populated. We can check against the pivot. All looks good. And now we're starting to Cruz here. We can actually grab one of these cube members from month. We can paste it into the member type template here and now, instead of the calendar table, we're gonna open this up and we want it feels from our customer table. And the column that we want is the member card column. And in this case, for the first value, the members specifically that we want is golden, so you can type it in closed quote, close the parenthesis, press enter, and then copy that two more times to populate silver and bronze. So silver right there and bronze right here and now check it out. Copy this cube value, paste it down on lee. The references that we want to move are moving, the other ones are fixed. They're still pointing to the correct headers up here as well as the proper members for member type here in column B. So let's go ahead and give. It was poor month. Name out, full member card in, and we can just kind of re sort this a bit gold at the top and silver than bronze. Use the values we should be seeing. And boom! Those are the values that we do see right here from our Q values. So we're two for two. The last thing that we need to do here is populate this sales by product matrix, which is gonna pull in the top five products. So here's the catch. We don't know offhand what the top five products are, so we can't just start typing regular Cube members from scratch like we did with month or member type. Now we could sort the different product brands in the pivot to figure out what the top five will be. But if we want this view to change dynamically and always sort based on current data, we've got a better approach to do that and what we're gonna do here is introduced a cube set first here in column B 16. So let's go ahead and type Cube set function this time as always, gonna start with the connection to this workbook data model. And now the set expression is going to be the list or the collection of product brands, which you're gonna come from our product table. So open the quote products dot product brand. And now here's the key. The last little option after the dot I don't want to select all, and I don't want to select any specific item. So I'm gonna type members, which is gonna force this cube set to store all of the possible members within that product brand Colin closed the quote, and now we need to update or populate the additional arguments here. So for the caption, what we want the cell to display, let's do brands. And now the sort order. This is how we determine how any cube ranked members get sorted within this set. So in this case, we're gonna sort descending high volumes on top. That's number two. And what do we want to sort by? Well, we can choose any one of these measures. Let's go ahead and choose transactions. And because that comes from our measures table and open the quote measures dot transaction's closed The quote. Close the parenthesis, pause the screen for a moment. Make sure you got that down. So long function and we'll enter that in. So always see is the word brands here. But this cell contains an entire set of members, and that's gonna allow us to now use something called a cube ranked member to pull individual items out of that set. So check it out equals Cube ranked member from this workbook data model. Now the set expression is a reference to a cube set, which we just created here in Selby 16 Press. Afford a lock it in because that cube set will always live there. And now for rank, you'll notice that I've created a little index from 1 to 5. There in column A. That's gonna be our rank index. So let's just point over to that rank number, and we can go ahead and close off the function press enter, and the first item we see is all. If we click and drag down, we see a few different product brands appearing Karmanos, Telltale Ebony and High Top. Now the issues that all is not an individual brand. It's actually the total across all brands and that will always show up at the top, have a cube set. So the easiest way to just for that is to simply start with the rank two, then three and four than five, then six. And because we've set a reference is, well, copy that value. Paste it down, drag it down, drag it over, and it's looking good. But let's do one final Que A and R pivot a full member card out drilling to products product brand in, and we're gonna sort those product brands descending by transactions and press. OK, check it out. Karmanos Telltale Ebony Hi Top Tri State. Same exact five that we populated using Cube rank members and Cube sets, and it looks like the transactions, quantity and revenue values match up perfectly. So the pivot has done its job and recommend keeping one just for quick and easy data exploration. But in this case, we can go ahead and select those columns, delete the pivot, and there we go. We're left with are perfect custom format and report that we built entirely using cube formulas
11. Monte Carlo Simulation: all right, I want to share one of my favorite Advanced Analytics techniques in Excel. This is a five star, expert level tip. What we're gonna do is build our own Monte Carlo simulation model entirely within Excel. No V B A. No macros, Just straight up sell formulas and data tables. Now Monte Carlo simulation. It's typically used to predict the probability of some future outcome or outcomes based on repeated random sampling. In other words, we randomly simulate the same outcome over and over and over and over again and then analyze the resulting array of results. So what better way to demonstrate Monte Carlo simulation than to build our own roulette simulator in Excel? So the way this works is that you can place a certain type of bet red or black, even or odd, specific numbers, and each bet is associated with a given payout. So after placing your bet, determining a wager were using ran between two essentially randomize or simulate a spin of the roulette wheel. In this particular case, we've bet $10 on red. The result is the red Number 23. So we won 10 bucks. But the goal isn't just to sit here and randomly spin over and over and over again. What we want to create is something like this where we're actually recording the results of a number of simulations in this case, 10 spins and then evaluating those results and summarizing them with basic stats functions . So here we're calculating the total gain or loss, the number of spins, the number of wins versus losses, the win percentage average, return per spin and basic max and men. And the tool that we're using toe actually drive these simulations is excels data table. So we're gonna head to our data tab, drop into our what if analysis tools and use the data table to create and record as many simulations as we determine. Now. Obviously, we're using this to kind of create something fun and interesting. But there are a number of very serious, very realistic use cases where this could come into play for one randomly simulating a business model thousands of times in order to understand the probability of something like a profit versus a loss or to build on that creating predictive models that actually can account for a given degree of uncertainty for one or more model inputs like what interest rates will do in the future or how supply costs will change. So this is an incredibly powerful data science and analytics tool that we can play with and learn about right here in the familiar excel environment. So let's go ahead and jump into her pro tip workbook and practice building some of these simulations. All right, So if you'd like to follow along, head to your table of contents, scroll on over to the analytics tips here in purple and we're gonna jump into our Monte Carlo simulation demo. Go ahead and link straight out to that tab, and here will find our roulette simulator. Now, for those of you who aren't familiar with the game of Relent, this is what the table looks like. You've got 36 numbers, half black, half read, as well as to green numbers zero and double zero. So the dealer will spin a wheel that contains all 38 slots associated with each of those numbers. As faras bets are concerned, you can place bets in a number of different ways, which you can see here in this drop down cell. You can bet on individual numbers, which are the most uncommon results and therefore payout, the highest in this case, $35 to 1 on your bet. You can also bet on the 1st 3rd of numbers. 1st 12 2nd 12 or 3rd 12 the 1st 2nd or third column of numbers. The first half second, half even numbers. Odd numbers, red or black. So as a simple example, let's say we bet on black. We can input a certain bet amount, let's say $10. This payout tells us that it bet on black pays out 1 to 1. So if we win, we get $10. We lose, we lose the $10 that we wagered. So from here we compress the F nine calculate button or head to her formula tab and quickly calculate now button to simulate different spins of the wheel. So, as you can see, we've bet on black. So any red result means that we lose our bet down $10. Any black result means that we win $10 because of the oneto. One payout. Now, what we really care about is the information down here. Our simulation data and our simulation results so right here. This is where we're going to use a data table toe, actually spin the wheel 10 times virtually and record the results of each spin here in column C and I've already populated some functions in Colin F. To summarize. Those results have created a named range called Spins, and we're using a basic some to calculate total gain loss. We're taking a max of column B for the number of spins using count if to calculate the winds and so on and so forth. So let's go ahead and start with a small number of simulations, just 10 spins. So what we're gonna do here is select this entire range from B 15 through C 25 and remember that when we used data tables, one of the cells in the selection needs to contain the formula that we're simulating, which in this case, is the spin result housed and sell I nine. So I've simply referenced that result right here in C 15 so that we can select the entire reference as we insert our data table. So head to your data tab here. We're gonna go to the what if analysis options put data table at the bottom, and here we've got our two criteria a row. Input in our column input. Now we don't have any variable values that were testing in rows, so that's obviously gonna be blank. But here's where things get a little bit strange. We do have different values here in our column, so we need some sort of a column input. But these values don't actually feed into any input that impacts our result. In other words, we're not actually plugging the number one or two or three into some formula input. We just want to randomize that formula a certain number of times and to make Excel realize that all we have to do is reference a blank cell. So Selby, 14 is completely blank. We can use that for a column, input and press OK, and there you go. Once we've done that, Excel is now iterating through this data table, you could see it's created a single array containing all of the results, and it's recorded a Siris of results, wins and losses in this case based on 10 different spins of the wheel. So in this first example, we're betting a $10 wager on black. We spun it 10 times. We ended up winning $40 because we hit black seven times. We hit red or green, meaning we lost three times. So our win probability was 70% and our average return per spin was $4. So let's go ahead and try another simulation by just simply calculating now. Okay, this time we lost 20 when percentage was only 40% again. Now we lost 60. We only won twice out of 10 spins. Then again lost 60. Lost, 40 broke. Even that time lost 20. So as you can see, there's quite a bit of variability in terms of that win probability. Now we know the actual stats. Here we know the rial probabilities. The odds of winning a red or black placement in roulette are equal to 18 winning results out of 38 possible results, which equals 47.4%. And the reason this win probability that we see here and sell F 19 is bouncing around so wildly and not even really targeting in on that true probability is that we're dealing with a very, very small sample size. Anything can happen in 10 spins. You could win 10 out of 10 you could lose 10 out of 10. So what I'd like to do here is repeat the process not for 10 spins, but for 10,000 spins and see what happens now. Keep in mind you can't just delete an individual cell from your data table, which is an array. You have to select the whole thing, starting with C 16 gonna control shift arrow down and then delete. And I'm also gonna delete the spin index numbers here and little pro tip here. I'm going to select the number one, hover over the right corner of the cell, hold the right click on my mouse as I dragged down and back up. And when I release, I'll get this secret menu here that has a Siri's option at the bottom. And what I want to do is fill a Siris in this column. I want to step in integers of one and I want to stop at 10,000. Press OK, and there we go. If I control our down, you'll see it's created a Siri's all the way down to 10,000 spins now same process as before. When you select this range control shift Arrow down to grab the entire range of cells Data . What if analysis Data table, No row Input column. Input equals black Selby 14 and press. OK, and now it's gonna think for a little while because it's iterating 10,000 times and recording the results and let's see what happens. There we go. So we're making the same bet as before, betting $10 on Black. The only difference is instead of making that bet 10 times, we just made that bet 10,000 times. It was a very long night at the casino. And the result that we see here is a total loss of $4980. You can see respondent 10,000 times. We won 4751 times and take a look at that win probability. Remember what we said. The actual true probability for this bet is 47.4%. Here we see a result of 47.5, and if we go ahead and spend it one more time by hitting, calculate now, remember, it's gonna take a few seconds, usually 10 seconds or so to actually run through the simulation and there you have it. So 47.6 this time we lost $4900. And let's just do one more calculation. Give it a few seconds to run. There we go. Our third result, we only lost 3400 that time. We had a higher than normal wind probability, but still very, very close to the true percentage or true odds of winning. So what I'm actually gonna do here is select that entire range here and delete that data table just cause I don't wanna slow things down here and let's go ahead and just delete these index values down to 10,000 and we're kind of back where we started. So from here, feel free to play around. Try different types of bets, try different bet amounts, simulate different numbers of spins. It's totally up to you from here. But that's your basic crash course. In creating a Monte Carlo simulation model in Excel
12. Advanced Optimization with Solver: all right. I'm so excited to share this next pro tip with you. This is a five star, truly expert level analytics tip. We're gonna talk about optimizing complex models using solver. Now it's solvers designed to do is work with real world complex optimization problems that require multiple inputs or decision variables subject to a given set of constraints. Now consider a case like this. You were looking at a transportation matrix that tells us the cost to ship goods between four different factories Boston, New York, Chicago and Oakland, and four different distribution centers Miami, Dallas, Seattle and Baltimore. And we're gonna be walking through this in depth when we jump into excel. But the bottom line? The objective here is to figure out how many units of product to ship from each factory to each distribution center in order to minimize our total shipping costs. Now, it's not quite that simple, because in the real world we have constraints and limitations as well. For one, we need to fulfill all of the orders demanded in this case, 6000 total orders, and two, we can't exceed the inventory available at each specific factory. So for a problem like this. We really can't use simpler tools like gold Seek, which required a single input or decision variable and a single concrete outcome. In this case, we need to use solver because solver allows us to either minimise, maximise or target an objective value. It allows us to change multiple input cells or decision variables, and it allows us to determine specific constraints as well. So we're gonna do here's head to our data tab and click on the Solver button to launch the Solver parameters box. Now, if you don't see solver in your data town, keep in mind that most versions of Excel half solver as a built in plug in but by default it's typically not enabled. So you may need to head to file options Adan's and then drill into your Excel. Add ins to plug batting. Now, looking at this dialog box here, there are three parameters that will need to determine Number one is our objective. In this case, our objective lives in cell E 23. It's our shipping cost, and our goal is not to set that shipping cost to any specific value. It's to minimize number two. We determine exactly which variable cells or input variables to change in order to meet our objectives. So here are decision variables live in cells D 16 30 19 And last but not least, we need to optimize subject to a specific set of constraints. So in this case, we can't ship fractions of units, so we're units. Need to be whole numbers are integers. We can't exceed the inventory available at each factory, and we need each distribution center to fulfill its entire demand. Now one very quick note about solving methods, which you'll see right beneath that constraint window. There are three different options you can choose here. Not gonna go into the details. It's well outside the scope of this course, but rule of thumb you're going to use simplex LP for any simple linear optimization is like problem we're dealing with here and g r g or evolutionary for more complex, non linear optimization. The big difference between those two is that G R. G is much faster, but may not provide the global optimal solution. Evolutionary is more likely to provide that global solution, but takes a lot longer to calculate. So now that we've set our objective our decision variables and our constraints can go ahead and click salt and check it out. Excel will produce or populate all of the values in those decision variables to minimize our bottom lines. Shipping cost. So here we see that we're shipping 225 units from Boston to Miami, worshipping 975 from New York City to Dallas, 500 from Oakland to Seattle. And at the end of the day, we're meeting the demand worshiping all 6000 units were not exceeding our inventory at each factory, and our total shipping cost comes in at just under $97,000. So, as you can see, incredibly powerful tool common use cases just like we see here determining optimal results subject to real world constraints, things like limited inventory like price floors like integer values, etcetera so that let's jump into a pro tip workbook and solve this optimization problem. Step by step. All right, so if you're following along, head your table contents squirrel all the way over to your analytics tips, and we're gonna link out to our solver demo here and here, you'll see that transportation matrix that we talked about in the slide. But before I dive in, pop into your data tab and look for that solver button. It's usually all the way on the right. If you don't see it, had to file options Adan's And then from this drop down, you're gonna want to manage your Excel, add ins and press Go, and you should see solver Adan in this window here. Just pop a check mark in that box. Press OK, and you should see that pop up in your date attempt. Now returning to our problem here. What we have our transportation costs, the actual cost to ship a single unit from our factories to our distribution centers. This is based on actual mileage, right? So shipping from Boston to Baltimore is quite cheap, only about $4 per unit. But shipping from Boston to Seattle is much, much more expensive, over $30 per unit. Now we're also dealing with two different constraints. Here we have supply constraints, which is the limit to the number of units that each factory can supply. So Boston currently only has 1000 units in stock. New York City has 2000 Chicago's 2500 Oakland only has 500 so we can't ship more units than a given factory has in inventory. So that's constraint number one or supply constraint. We also have demand constraints. You're listed right here in Rhode 10 which says that Okay, Miami Onley needs to fulfill 1250 orders so it doesn't make sense to ship more units to Miami. And in fact, we can't ship fewer units to Miami because we need to fulfill these orders. We need to get these products into the hands of the customers who ordered them. So we've got to get 975 to Dallas. Got to get 3200 and 52 customers in the Seattle area and so on and so forth. So those are our demand constraints. And bottom line is that there 6000 units that we're looking to ship here as part of this model. Now, if we jump down beneath the transportation matrix, we have our objective or our outcomes. So the goal here again is to populate these values right here in D 16 through G 19. Those are the input or decision variables that will be changed in order to evaluate or calculated the total shipping cost. So watch what happens if we just plug some numbers in. Let's say we ship 100 units from Boston to Miami, 50 from New York to Dallas, 350 from Chicago to Seattle and 25 from Oakland to Baltimore. As you can see, we're calculating the number of units shipped per factory. We're gonna make sure that those numbers don't exceed these constraints in I five through eight. We're also calculating the number of units received by each distribution center. I want to make sure that these numbers don't exceed or come in under the ones here in D 10 through G 10. These have to be exactly equal at the end of the day because we've got to fulfill those orders and then last but not least, our total shipping costs cell. This is the bottom line objective that we care about is a simple some product function that basically takes all the units sold or shipped from each factory distribution center, multiplies them by their respective costs and then sums up the total to spit out that total shipping cost. In this case for the 525 units that we just plugged in is a demo that would cost a little over $10,000 to ship. So we know our objective. We know our decision variables, and we know our constraints. It's time to solve this thing. Let's go ahead to data click on Solver and here you'll see your solver parameters. So first things first. What's our objective? Well, objective is shipping costs lives right here and sell E 23. And we don't want to set that shipping cost to any particular value, although we could if we wanted to. And we certainly don't want to maximize the cost of shipping. That would just be crazy. We want to minimize our shipping costs and we're gonna do so by changing a certain set of variable cells or decision variables which live right here in Green de 16 30 19. There we go. And the only thing we need to do now is set our constraints and they're gonna be a few different constraints. Go ahead and add the first. Our first constraint, which is a simple one but it's important to add in here is that we can't ship fractions of products, right. We have to ship whole units or quantities. So we know that our decision variables here de 16 30 19 have to be I anti, which indicates that it must be an integer so press. OK, there we go. That's constraint number one. We'll add two more. First is gonna be our supply constraint, which means that our units shipped out, which live here in I 16 19 units by factory have to be less than or equal to the number of units that each factory has available. And note that I could do this in one step by selecting kind of the entire array of four cells. The alternative would be to turn this into four different constraints. You know where I 16 is less than or equal to I five i 17 this less than or equal to I six. But we can do it in one step by selecting all four cells at once. So again, this says that we can't ship more units than a factory has Chris OK? And then our third constraint is our demand constraint. So we've got to fulfill the entire demand for each distribution center. So the units received by those distribution centers lives right here in D 21 3rd G 21. And in this case, we can't be over or under. We have to be exactly equal to the number of orders demanded or units demanded for each of those locations which live right here and sells D 10 through G 10 Press OK, and we are good to go. Solving method is going to be the simplex LP option. Since this is a linear optimization case and check it out. All we need to do now is click solved. And almost instantly we get this results box that says solver found a solution. All constraints and optimum ality conditions are satisfied and note that it plugged in the values right here in our decision variables and it spit out that minimum total shipping costs, which in this case is $96,687 and 75 cents. So let's go ahead and keep that solver solution. Note that we can save these values as a scenario in our scenario, manager or we can just press OK and there you have it. We have optimized this very complex logistical challenge by determining exactly how many units to ship from four different factories to our four different distribution centers. And there you have it quick demo of how to use Excel Solver tool to solve these types of more complex real world optimization problems.
13. Analysis ToolPak (Preview): all right, This next pro tip is a little bit of a teaser, kind of a sneak preview into some different ways that you can explore your data using something called the analysis Tool Pack. So, just like solver, the analysis Tool Pack is a built in Excel plug in that's designed to support a range of more advanced and complex data science and statistical tools and methods, things like analysis of variance and over co variance regression. T tests hissed a grams and so on. Now, obviously, I don't have time to dive into each of those specific methods. In turn, they each deserve a much deeper, much more comprehensive review. But what I will show you is the scope of options available within this tool pack, as well as a few that we can use right off the bat that are simpler but no less powerful. So in this demo, we're gonna be looking at a sample of about 5000 Olympic athletes along with their ages, heights and weights. And what we can do is drill into our data tab and look for the data analysis button. If you don't see this likely means you just need to activate that. Add in. So go ahead to file options. Adan's and manage Your Excel Adan's toe. Activate this tool on your ribbon. And once you click through data analysis, you'll see a scrolling list of all of your different options here, some of which are simple. Some are quite advanced. Now, for the sake of demonstration, we're gonna be talking about two options here. First, we're gonna look at a correlation matrix between our three fields, age, height and weight. And then from there we're gonna generate some descriptive statistics to calculate things like the mean mode. Variants ranged count skew nous men and Max values and so on without having to type a single formula. So that will be a great way to really describe and understand our data with the click of a button. So common use cases here, just like I showed you, generating those descriptive stats without having to use formulas and to exploring or analyzing data using these more advanced data science or statistical methods like analyzing variants, building predictive models, etcetera. So let's jump into excel and take this analysis tool pack for a test drive. All right, so go ahead and open up your pro tip workbook scroll in the table of contents to your Purple analytics tips and we're gonna jump into the analysis tool packed preview demo here . And when we link out, we'll see our list of Olympic athletes again. We've got a sample of about 5000 athletes here. We're looking at ages, heights and centimeters and weights in kilograms. So this would be a nice data set to practice some of these analysis tool pack options. So first things first, head to your data tab off on the right. This is where you should see your activated plug ins. If you don't see data analysis, go ahead to file options. Adan's You're gonna manage your Excel Adan's press Go. You should see analysis Tool Pack in that list. Go ahead and check the box and press OK. That should pop up right here and let's go ahead and see what we have available to us. So as we scroll through, we'll see a number of different options here. Analysis of variance, correlation, co variance, exponential smoothing, moving average random numbers, T tests, T tests and a few others. So obviously kind of a range of level of difficulty and complexity here. Some simple tools, some that are quite advanced. In this case. We're going to stick to a few basic options that you can utilize right off the bat, starting with descriptive statistics. So let's go ahead and press OK. Here, we'll see a pretty intuitive dialog box says. All right, what's your input range of values? Well, I'd like to explore or analyze all three of these columns, so let's select B Through D. These are organized in columns, so that's good. We do have headers or labels in our first row. And where do we want to drop this output? Well, let's put the upper left corner right here and sell F one. And let's also include the case largest and smallest, one value, basically a fancy way of saying the max and the men. And I think that should just about do it. We don't need to worry about confidence level for mean. At this 0.95% default is just fine. Let's go ahead and press OK and look at this. It's gonna drop in all of these summary stats right here in columns, F through K, literally in a matter of seconds. I think about how long that would have taken to produce all of these values with self formulas. These means thes medians, standard deviation variants would have taken quite some time and were able to just produced all of those values, plus quite a few more in a matter of seconds. So looking at her, Max and men looks like we have an athlete who was 65 years old, is the oldest. Our youngest athlete was only 12 in terms of height. Tallest athlete, 216 centimeters. And for those of you who aren't on the metric system, that's about seven feet one inch. And we can confirm by simply filtering our source data here. Let's sort descending by height. And here we see and dress Glina da Kiss, who is a Greek national basketball player, which shouldn't come as too much of a surprise given how tall he is and then looking at weights looks like our largest. Actually, it was 160 kilograms. Our smallest was only 35 kilograms and same story here. Let's sort those weights ascending, and you can see that it's like Oh Soo Ihara, who's a Japanese gymnast. And again, for those non metric folks, 35 kilograms is a hair over £77. So those air are descriptive stats really, really helpful, but also quite simple to produce. Let's go ahead and do one more example here, back into our data analysis options. It's time I want to look at correlations or relationships between these three columns. So same input ranch here. Be 30. Okay, the air columns again. We've got labels headers in the first row, and this time let's drop our correlation output beneath their stats right here in F 20. And it's a simple Is that Okay? And there we go. Let's go ahead and format these so it's a bit more readable as percentages. And check this out. We've got this correlation matrix that's produced here. You'll see 100% kind of on the diagonal because each field is 100% correlated with itself. But the ones that we really care about here are these two values as well as this one here. Just gonna make those bold. And basically, what this tells us is that there's not much of a correlation or relationship between ages and heights. only about 11% correlation, which might seem a little bit surprising at first until you realize that we're not looking at the entire population. We're not looking at heights all the way down two newborns or infants. These are all well, for the most part, adult athletes. So once you've reached adulthood, you're not really growing anymore. So it's not surprising that we see a relatively low correlation there, and that same story holds for age and wait for the same reasons. Now it's interesting is that we do see a very clear, very strong correlation 78% between height and weight, which also makes sense because taller, larger athletes tend to weigh more. So there you go. That's a very quick way to use tools like correlation and descriptive stats to explore and learn more about your data set without having to use any fancy tools or formulas. And again, I won't be diving into the other kind of more complex tools within that analysis tool pack . But I'd encourage you to explore those options and try playing with those tools for yourself. So there we have it. Quick primer, Quick preview of the analysis tool pack