Transcripts
1. Trailer: Hi my name is Al Chen. Welcome to my Skillshare series, Intro to Excel. Now, I used to think using Excel was meant for data nerds and people that didn't have social skills and after I started working, I realized Excel is something I would use every single day. In class three, we're going to analyze real data from a Google online advertising campaign and we're going to create a hypothesis around what we think the analysis should look like, and finally present the data to our fake CEO of the company. Excel is a useful tool no matter what line of business you're in, whether you're working for a advertising company, a non profit, you'll come across using a spreadsheet at one point or another. So please take this series and be better and faster in Excel.
2. Setting Up the Scenario: Welcome to Unit One, Setting up the Scenario. In this unit, we will look at some raw data from a real life Google AdWords campaign and talk about the goal of our analysis. At this point you should be familiar with how to do basic operations in Excel and how to do things in a very fast and productive way with keyboard shortcuts. This class is all about figuring out how to analyze real data. Before we actually dig into the actual data analysis, formula building, et cetera, I want you to just sit back and think more about what is the purpose of data analysis and data visualization. This goes beyond Excel. This goes beyond just crunching through numbers. This is more about what is a story you're trying to tell to your audience, whether it's your boss to your customer, maybe to someone you're working with or another colleague, you have to know what is the goal of your analysis before you start going off and crunching numbers in your desk. So, the first lesson is about setting up the scenario and hypothesis for the analysis. I know that sounds like a lot of kind of mambo jumbo and almost like a science experiment but you should treat it as such it should be kind of like an experiment where you are controlling where the analysis goes so that you're not just taking data in and try to throw things at the wall to see what sticks. You have a very clear path for how you're going to analyze the data. So, in terms of setting up the scenario, we are going to look at some real data that I've anonymized a little bit but it's basically online advertising AdWords data. So, if any of you have ever used AdWords, it's something like if I go to Google right now and I search in let's say pants, these are all Google AdWords ads. These are just ads that companies like H&M, Express, Armani Exchange have paid for to be on google.com. So, Google AdWords is the interface that lets you advertise on Google, and we're going to look at some raw data from a campaign that ran last 2012 and Q4 in 2012. So, let's quickly look at the data on this tab called Acme Raw Data in the workbook. So, you'll see it's a ton of data, There's about almost 350 rows of data and you'll see that we have things like week, campaign, budget, clicks, impressions, CTR which stands for click-through rate, CPC, cost, et cetera. Now, this scenario involves analyzing this data to figure out whether or not online advertising works. So, some of you may or may not be familiar with online advertising in Google AdWords. So, let's quickly walk through each column here to understand what the raw data actually is. So, week is pretty self-explanatory. This is the week that the Google AdWords campaigns were running. Campaign is simply the name for the products that were advertised on Google over this period of time. As you can see Acme the company, this fake company has various products like small walnut bowl, recycled paper basket, Italian ceramic pot, these are all kind of made up products that I found on Etsy. So, they're kind of made up but the data here on the right is actually real data from an AdWords campaign that I've pulled from. Now, budget is essentially the budget per day that this advertiser Acme has allocated for these various products. Clicks are the number of clicks that people search on Google, clicked on an ad, and went to the Acme website to look at the product or maybe buy the product. Impressions is simply the number of times the ad was showed up on Google. Then CTR, again is click-through rate which is essentially clicks divided by impressions. So, out of all the times that the ad it showed up on Google, how many times that people click on the ad that rate click-through rate. Cost-per-click is the amount of money that the advertiser paid Google in order to have the ads show up and be clicked on by someone searching on Google and that cost-per-click is simply the cost here column H divided by the number of clicks. So, you'll see here that this Acme small walnut bowl had a 0.31 cost-per-click because it's 0.31 divide by one, but our example here is Acme and recycled paper basket during November 26. The cost was $4.56 and you divide that by the number of clicks which is 28 and you'll get 0.16 cost per click. Everything else is kind of- we're not going to be using most of this data but average position is this is kind of like the position on Google that the ad would show up. Converted clicks, this is actually kind of an important field because this is the number of clicks that resulted in a purchase of the product. So, for instance, out of four Acme and recycled paper basket during the week of November 19th 2012, the ad had showed up 75 times on Google and lead to 31 clicks. Of those 31 clicks, four people actually bought the recycled paper basket after clicking on the ad on Google. So, you know that this was actually a very effective campaign for the company Acme. So, that's scenario working with again and this is online advertising data from Google AdWords and we, our job going back to the lesson plan here, is to figure out if online advertising works. So, again, this is advertising campaign data from October to December 2012. We talked about specific key fields in the data like clicks conversions, cost, and the CEO of Acme has asked you, the data analyst, the Excel whiz, to figure out if online advertising works for her. Is it effective? Is it cost effective? Are they spending too much? These are all kind of questions that she has kind of asked you, but the overall goal is to figure out if online advertising, specifically Google AdWords, works for Acme. So, the next step in understanding why we're doing this analysis and why this is important is, we need to tell a story to our CEO about why online advertising works. We can't just throw numbers at her, we can't just say the cost per click was this, the number of clicks was that, we have to tell a story around why online advertising is effective for helping you sell products from your store. If you just crunch numbers for the sake of data crunching, you're not really adding them much value, computer can crunch the numbers but actually having a story arc or a story line around the data, that's what kind of drives business decisions and that's what gets people to change their way of thinking whether it's in business, in government, in whatever field you work in. So, having that in the back of your mind when you're analyzing data is important so that you're not just kind of going away and crunch through numbers without a goal in mind. The final step is let's create a hypothesis for this analysis. Let's think of like using your hunch, using your best judgment, think about what do you think will be a result of this analysis? So, two things I thought about was, I think a majority of the clicks to the campaign, to the products, probably come from a handful of products. I think that maybe a lot of people might buy a porcelain product and a lot people would buy a recycled waste basket but all the other products on the website might not be in season, might not be selling well, they might be priced too high who knows, but my hypothesis is that I think only a handful of products will be five,10 products, 5 to 10 products in Acme drive all the clicks and all in advertising spend. So, in that sense online advertising would be useful because it's helping Acme sell a few of the products really well. Additionally, campaign took place in October to December 2012, I think most of the online AdWords, Google AdWords activity will take place around Christmas because people are buying gifts for their friends and family, they want to take advantage of deals and so they're probably searching on Google a lot before and maybe during Christmas to find good deals on things they're trying to buy. So, my hypothesis is that there's going to be a lot of activity around that time period because of the seasonality aspect. So, this lesson again we haven't looked at any data yet, we're just setting up the scenario. Important points to think about is, what is the goal of your analysis? What is your hypothesis for the analysis and thinking about what is the overall story you're going to tell which is centered around your hypothesis? In this case, figuring out which products drove the most clicks, when most the clicks happened, et cetera, et cetera. Having these things in the back your mind as you're doing the analysis helps you laser focus on what you're actually crunching in the data. That wraps up lesson one.
3. Cleaning Up the Raw Data: Welcome to unit two, Cleaning Up Raw Data. In this unit, we will look at the raw data again and do some basic formatting and formula exercises to clean up the data so it's ready for us to analyze. Now, we're going to be using some of the Excel skills you learned in class one in terms of formulas and functions to clean up a raw data set that isn't exactly perfect yet for analyzing. A lot of times you'll get data from a database or from someone else in your company, and it still has extra characters or is not filtered correctly. You just have to quickly massage the data a little bit to make sure it's ready for you to analyze, because if you're trying to analyze data that's not correctly formatted or contains incorrect values, then that's not going to be useful at all. So, we're going to do some quick tidying up of the data before we actually analyze it. This is a very common practice because sometimes when you get data from a database that comes in a CSV format, it's not always going to be perfectly formatted. You'll have to do a lot of formatting before the data is even ready to do any kind of crunching or analyzing or doing pivot tables, and stuff like that. So, the first step is, if you look at the raw data set, you notice that in the Campaign field, every single product has the word ACME and &. ACME & Small Walnut Bowl, ACME & Recycled Paper Basket, ACME & Hand Thrown Stoneware. So, we want to make sure that the campaign, the final data, doesn't have the AMCE &, and then, the product. We just want to have the product. So, for instance, in B2, I only want the words Small Walnut Bowl. I don't want all of ACME and Small Walnut Bowl together, if that makes sense. So, if you go back to the lesson plan, we want to get rid of the word ACME text in the Campaign column. So, how do we do this in Excel? We're going to use a formula that you'll probably come to use a lot when you're doing this kind of stuff, and it's called the MID formula. It's similar to LEFT, which we talked about in class one, but it's a little more advanced. I'll show you how to do it right now. So, our goal, again, is to get just the product name and not the word ACME, the company name. So, I'm going to be in column M and cell M2, and I'm going to start typing in the MID formula here, equals mid. Then, I'm going to hit left parenthesis, and you'll notice that it takes in three arguments, three kind of input variables. We have the input text, which is the actual text cell that we want to pull data from, start_num, which is the point in the cell that we want to start taking values from, and number of characters is how many values or characters we want to take from that cell once we've indicated a starting point. Now, what I've said sounds complicated, so let's walk through how I would do this in Excel. So, the first input, again, is the actual cell that we want the data from. So, text, I'm going to select column B2. You notice that, I think was a B2. I'm going to put in comma. So, start_num, so this is where we want to start taking in the characters. So, what that means is, at what point in this cell do we want to strip out the characters. So, you notice the letter A in ACME is starting point one, the letter C is starting point two, the letter M is starting point three. So, you notice a pattern. Actually, go back to the data here. The pattern in all this data is that ACME & prefixes all the products. So, when does the product name actually start? So if we double-click on the cell, if I count the number of characters it takes to get to the product name, we'll see what the start number is. So, again, we know A is one. So, let's move over to the product name. So, one, two, three, four, five, six, seven. So, that means that all the product names start in position seven of every single one of these cells. We know that everything before position seven, we don't want that data because it's the ACME & text. So, let's go back to our MID formula, equals MID left parenthesis. I'm going to take this cell again, B2, comma, and we found out that seven is a starting point for where the text starts. So, I'm going to [inaudible] number seven, and then, write comma again. Number of characters, this is how many characters you want to pull after the position seven. So, Small Walnut Bowl looks like it's about 20 or so characters. We want to be conservative and just try to pull as many characters as possible. So, I'm going to put 50 here, just in case. Let's say, I put the number 10 here. I put number 10, I put right parenthesis. I'm only going to be pulling the first 10 characters of that cell, and I might not pull all the characters that I want. So, if I press Enter here, you see that the value is just Small Wal, and it omitted the Walnut Bowl text. So, that's why you want to be conservative and just say, 50, or even 100. Let's do 100, just to make sure we pull all the stuff afterwards. So, you see what Excel did. It took all the text starting at position seven of B2 and took everything afterwards, and in this case, it is just the words Small Walnut Bowl. So, we have our formula now. So, let's copy this all the way down to all the cells below in column M because you want to make sure we get the product name for all these cells. Not just for the first row, but for every single product. So, I'm just going to drag and drop, and we have about 300 or something right there. I am dragging and dropping, let go, and there you go. So, they go back to the list. You'll notice that I have all the product names set up. Actually, I think we need to make this a number 8, just to be safe. So, seven actually counts the space here. So, let's see, one, two, three, four, five, six. Seven is the space, so the number here should actually be eight. So, let's do eight. I'm going to do eight. I'm going to fill this down. We learned about fill formula down in class two. So, now, we have just the product name, and we want to replace column B with column M. So, I'm going to do a Paste Special Values, which is a keyboard shortcut we learned in class two, and it's also a function we learned in class one. But I want to replace everything in column B so I don't have the ACME name anymore. So, I'm going to select this data, all of it, all the way down to row 347, do a Command C, and now, I have everything selected, as you can see with the ants marching around the selection. I'm going to move over now to B2. I'm going to Paste Special, which is, again, Command Control V, and then, Command V for values, Enter, and there you go. Now, our column B has only the product name and no more ACME company name. Now, I can just delete all of column M because I did that just primarily for using the MID formula. So, there you go. I've correctly cleaned up the data a little bit here. Now, let's go back to the lesson plan and see what else we have to do in terms of formatting. We want to add dollar signs to the budget and cost columns, and also add a percentage character to the Clickthrough Rate column. So, this is, again, just to make your data look a little better. I want to go back to ACME. I'm going to select. Looks like it's already correctly formatted. But let's make budget have only one decimal point. And do format Cells and go to Number. I'm going to go to Currency, I'm going to do only one decimal place here, Enter. Clickthrough Rate, I don't want to have any kind of decimal place. Command one for Format Cells again. I'm going to do zero decimal places. Again, this is just to format your data so it looks right because sometimes, you'll get [inaudible] cost-per-click. There's no formatting around it. Sometimes, there's one decimal place and there is two. Let's make this all have a dollar format and also have two decimal places. So, let's select all this data, Format Cells, Currency, two decimal places. There you go. Now, it just looks a little easier to play with, the data, once you have it all nicely formatted. So, this lesson was all about just cleaning up your data set once you have the data. Again, when you get raw data, it's not always going to be perfect. Sometimes you have to do text-to-columns, sometimes you have to do the MID formula I talked about, and other times, it's just a matter of cleaning up the formatting in terms of number formats and adding decimal places, et cetera. So, that wraps up lesson two, and in lesson three, we'll move on to actually doing some analysis.
4. Filtering, Sorting, and Pivot Tables: Welcome to Unit three: Filtering, sorting and using pivot tables. In this unit, we will do most of our data crunching to find the trends that we need to present back to the CEO of ACME incorporated. This is the lesson where we're going to be doing most of our data crunching, filtering, sorting. Again, in lesson one, we talked about figuring out the storyline, the hypothesis that we want to prove or disprove, and our main hypothesis is that, or my hypothesis, you might have a different one, is that most of the Google AdWords spend comes from five or so products. Additionally, I think that most of the spent will come around at the December Christmas time-frame. So, those are just things that you keep in the back of your mind as you're doing filtering, soldering, and Pivot Tables with the data. So, we have our raw data set in ACME raw data. So how do we actually start doing some filtering and sorting to find the results that we need? So, as again, we're going to be doing most of our crunching in this stage, let's go to our step two here. Let's filter on just the cost-column for anything greater than 20 dollars, and sort by the largest to smallest. So, what does that mean from a Google AdWords perspective? That means we want to find all the campaigns by week that cost the company more than 20 dollars that week, and then sort by largest to smallest. After you do that, your datasets should look like this; but we're going to do this in the raw data set. I'll show you the operations that I did to get to this final output. So, if I go back to ACME raw data, I'm going to first go back to the first field, and again if you look at the data, just skim through it, you'll notice that the costs per week by product is around, sometimes it's a dollar, two dollars, 15 dollars, et cetera. Sometimes it gets as high as 23, 24, but just from eyeballing, it looks like five dollars, maybe ten dollars might be a good average, and we can obviously write the average formula to figure it out, but let's just try to find the campaigns and products that cost the company the most in terms of online advertising. So, I'm going to set the limit at 20 dollars. So, let's go to data, and go to filter, and here I've got my filters set up. Now, I'm going to click on the drop down arrow, and the cost, because this is where we want to do the filtering. I'm going to filter by anything greater than 20 bucks. So, click on greater than, enter the number 20. Right. So, there you go now, and your list should automatically filter to anything over 20 dollars. You can see here, some of these campaigns are 36 dollars, 53 dollars for watermelon pot-holders, and so on, so forth. So, after we've filtered on things that are going at 20 dollars, I want you to also put things in descending order. So, I'm going to put the very highest costing products at the top. So, hit this ending in the costs, in the filter menu, and there we go. Now, we can see that our very highest, we can sort by, we've basically found their products, campaigns by weekly cost the most in terms of advertising spend. So the three hook message board in October, the beginning of October cost the most at 59 dollars and 55 dollars followed by watermelon pot-holders, and the three-hook message board again. So, we see some patterns here that the three hook message board is pretty hot products, and also cost us the most in terms of online advertising. If I go back to the lesson plan, 3.3, this is just a copy and paste of the first ten lines from that output, and so we can see here again a copy and paste from that dataset. So take a look at it, see if you see any interesting trends, you know the first things I noticed obviously, were that the three three message board is the most expensive product I know. I also noticed that the click-through rate is around 0.70 percent, the cost per click is maybe around a dollar. You just have this eyeballing trends, I'm not doing any format building, I'm just looking at the data, because I've sorted it, I filtered it, and now I'm just eyeballing for general trends that I think are interesting to point out. So, let's do another filtering exercise. Let's sort the data by click-through rate from highest to lowest. So, your output will look like this, but let's quickly go back to the dataset and do it in Excel. So, I'm going to get rid of my filter by quickly clicking on the filter. I'm going to click the filter again. Again here, we want to filter, oh, sorry sort by the click-through rate. So, I'm going to click on the drop down arrow, hit descending, and there you go. Now, I've sorted the list by the click-through rate which is CTR, column F. You notice that some campaigns have a click-through rate as high as 50 percent. Let's just go back to our lesson plan to see the dataset more summarized. So, again, this is the same data that we have that I just copied and pasted from the raw data. The click-through rate is again really high for some campaigns, and if you compare this with the first data set that we got, which is filtering and sorting by cost, this click-through rate is nothing compared to these click-through rates. The recycled paper waste basket, for instance, has almost 30 percent click-through rate versus the three hook message board, which is a less than one percent click-through rate. Now, these are all questions you can ask yourself. Why is that happening? Why do these products have such a high click-through rate versus other products? The one caveat, about online advertising on Google AdWords, and this is more specific to how AdWords works, is that you'll notice that the impressions for these products for the high click-through rate are only like in the 70s and 60s and 90s, whereas these products are getting tons more, they're getting thousands more impressions, and so you can imagine that the click-through rate on these impressions are much lower, and so, even though the click-through rate is very high on these products, they might not be purchasing that many products in general. So, all these questions you can think about as you're analyzing this data set. So, now we can see again which product campaigns had the highest click-through rate, and we know that the small walnut bowl won the prize even though I had two impressions and one to click, one divided by two is still 50 percent, so the click-through rate is very high for that product. It seems like the recycled paper basket is also very highly clickable product on the quote, so another important tidbit of information to know. So the last step in this exercise- We're going to do two more things, is we're going to create a Pivot Table now to summarize data more succinctly for our purposes. So, let's create a Pivot Table to show the top products with the most clicks. So, if you look at the raw data set, you'll notice that everything is broken out by week and by product. But we want to kind of summarize everything at the product level so we can see which products got the most clicks. Because right now we can only see products by week, we want to see total products. So, if we go to the dataset and we click on, again, we're still on the data menu. I'm going to click on Pivot Table and the drop-down arrow, I'm going to Create an Automatic Pivot Table. Now, you can do either option. The automatic pivot table just kind of makes it a little faster to do things in terms of creating a pivot table. So, let's actually do the manual pivot tables so we can see how to build a pivot table from scratch. So, I'm going to click on a Pivot Table here, and now the Create Pivot Table menu comes up. Excel smartly knows I want to look at this raw data. So, it already even knows that, Okay, I want ACME raw data, cell A12L 347 which is basically our entire data set in this worksheet, and then choose where to place new pivot table. Sometimes you want to create a new worksheet for a pivot table. I normally do that or maybe sometimes you want to put it on an existing worksheet in your workbook. So, we're just going keep it as new worksheet for now. Hit OK. There you go. We have our pivot table now. Again, a pivot table is just a way to quickly summarize information in Excel. I'm going to rename this worksheet just to be safe, call it pivot table. What is our goal again, let's go back to 3.3, the step. Create a pivot table to find the products with the most number of clicks and our output should have eventually looked like this. You have campaigns or the product in the left hand side and you have the clicks, total number of clicks on the right. So, let's go back to our pivot table worksheet. So, you'll notice that there is a row area and a column area as well as a main middle section which is the values. So, values is where we're going to put our clicks. So, let's click on the clicks in our pivot table menu here. Clicks and there you go, we have now the total number of clicks for our entire data set which is not useful for us because we want to see the products. So, now I'm going to select the product and the field name here. Where is it? It's at the very top actually campaign. Click on that, and there we go. Now, we have all our products and all the clicks associated with that product and we don't have anything with the weeks or other kind of fields, we just have some of clicks right here broken down by product. I want to put this in descending order. So, I'm going to go back to the Data tab, click Sort, the drop down arrow and I'm going to select Descending, and there you go. Now, we have all our products in descending order in terms of number of clicks. You'll see if I copy and paste this data and put it here, we'll get the same data set. So again, let's eyeball this dataset and see if we find any interesting trends. Well, the doily cereal bowl obviously is the most clicked on ad. Perhaps people around Christmas time in 2012 really wanted to buy doily cereal bowls followed by wall stencils, pan mason jars et cetera. We actually don't see our three hook message board until down here at 387, it's actually behind metal storage cabinet, watermelon pot holders. I remember in step four we found the products with the highest click-through rate. Recycled paper basket is even though it has a high click through rate, is all the way, where is it? In the middle here at 257 clicks. So, you can see that when you start filtering and doing pivot tables of the data, you'll find different trends that emerged as you dig deeper into the data. But the important thing that we see here is that the doily cereal bowl, the wall stencils, the pan mason jars, they're driving the most number of clicks for the campaign. The final step here. Let's again play with the pivot table again. I want you to reorganize the pivot table to show cost per week. Again, if you look at the raw data, we have costs by week and by product. But I want to aggregate all this data so I only see the cost by week, I don't want to see it broken down by the individual products. So, if I go back to our pivot table worksheet that we made, I'm going to get rid of clicks. I'm going to uncheck that. I'm going to get rid of campaign, uncheck that and now I'm going to select the costs, right? So, I'm going to click say cost because I want to see cost by week and then I'm going to select the actual week name which is here the very top. So, you notice here that it actually moved my values from the, sorry the sum of costs here is 2974, but in the values also has the count of the weeks. There's no really purpose in counting the number of weeks, I want to see the weeks along the row here. So, I'm going to move this over here and you'll see that now I have it broken out the cost by week. If we move that week over to column, we now see it broken out by column. So, now we can get a high-level view of how much my AdWords campaign costs over from a weekly week bases. If I go back to the lesson plan 3.3, you'll see that we have the same exact data. Now, you can see that the costs on a weekly basis definitely varies from week to week. Some weeks is as high as $300 and other weeks it dips down to $150 and so on so forth. So, we actually made the original hypothesis that Christmas time would drive the most cost, but in reality it looks like it's actually around kind of November time frame where the costs are to drive up a little more and Christmas time actually is decreased. Perhaps people already do all their Christmas shopping in November, and so they're not buying as many [inaudible] products online for instance, right before Christmas time. So, this is where we can see some high-level trends in terms of cost by week. So, again using the pivot table is really helpful because you can aggregate data beyond just looking at the actual raw data itself which is what we did here in the ACME raw dataset. So, that wraps up lesson three from analyzing real data and we'll move on to actually creating charts and graphs in lesson four.
5. Creating Charts: Welcome to unit four, Creating Charts. In this unit, we will take the data that we have filtered and sorted, and we'll create some basic charts to show trends in the data. In lesson three, we were able to crunch through some data using filtering and sorting techniques as well as using pivot tables. This lesson, we're going to look at the data and actually create data visualizations that can help us better to take trends. Because when you're presenting your data to someone, whether it's your boss or your upper management, or even a client, you don't just want to show a bunch of numbers and texts because it's not very easy to see the trends or take away the message you're trying to get across to your audience. So, with charts, we can quickly allow our readers or viewers to see what the story is, we wanted them to take away with, and the message that we're trying to get across to them, and the easiest way possible. So, let's look at creating first, a bar chart which is one of the most basic charts in Excel. We're going to look at the dataset that we got from lesson three. So, this is a dataset that we've sorted when we looked at the cost by week, and where is this coming from? So, if you go back to section three, you go to the very top. Were just copying and pasting this data here, and we're putting it in right here in lesson four. So, how do I create a bar chart showing cost by week by product? So, this is going to be the final output of the graph, but I'm going to show you how to do this from scratch. So, the first thing to do is, select the data that you want. So, we select this whole or entire range, even though includes clicks impressions and click-through rate. We're going to select everything just in case. I'm going to go do charts in the ribbon, and then bar chart here, and again the most basic bar chart. Oh, sorry, Column chart. The most basic bar chart is the clustered column chart. I'm going to select that. You'll see that there's a whole ton of stuff here that I don't need. Right. We have clicks, we have impressions, and click-through rate. I only want to look at the cost by week for individual products. Right? So, how do I get this graph to look more like that, because right now I have all these different things that aren't even showing up in the graph, because their whole different of metrics. So, the first thing we're going to do is select the graph, and then hit select. So, again the steps here are clicking on the chart, and then you'll have this data menu, and you click select. You'll see all these references that the chart is referring to. So, the chart data range as we're just looking at this section right here. Here are different series names. So, you want to get rid of all the series names that don't apply to us. So, we don't want the CPC, we don't want cause, and we don't want the budget, clicks impressions. We want the costs. Right. Sorry. But that we want the cost get rid of average position converting clicks, et cetera, and there we go. Now you notice that as you're removing series, your chart will show you what you want. Hit Okay. There we go now, we have the costs by week and by product. So, you can see here that the three Hook message board during the week of October 8th had the highest cost for us, and it slowly trickles down as we look across all different various products by week. Great. So, now let's create another graph. We're going to create a stacked bar chart which is similar to the previous chart, but it puts all the data into one giant bar. So, again this dataset we have from our previous lesson, lesson three, we have the products by clicks. Right? So, if you go back to lesson three, we did this in step six, and here we have again the products by clicks. Right? So, let's go back to lesson four. I'm going to select all this data right here, and then select column in the chart ribbon menu column. Then, I'm going to do a stack column. Right. Hit Enter click. Great. But you'll notice that doesn't exactly look like the stacked chart that we have, it looks like a regular bar chart that shows products by clicks. It doesn't do exactly what I want. So, what I'm going to do here is click on the chart. Again, I'm going to hit Select here under the data menu. I think and this is underneath, just the charts are ribbon menu after you selected your chart Select. What you need to do here is just switch the common row, because right now we have the rows and columns switched up, where we don't have the sec partial we need. Someone to hit Select switch row and column. Now we have busy reoriented dataset so that the values, the name values are all within the one stacked bar chart. I'm going to hit Okay. There you go now starting to look a little more like the graph that showed up on the worksheet. We first opened it. So, now you can see here that the product is broken down by the individual bar charts. So, we can see that for instance, it's hard to see here, but the doily cereal bowl. If we expand this a little more, the doily cereal bowl has the most number of clicks as you can see from the dataset. The wall stencils is next. The painted mason jars follows that. Again, this is just a visual way to look at the dataset so that it's not all just in a table like this. Right? I'm just going to delete this for now so we can move on to the next section. So, in the final step here, we want to look at a time series graph, which is one of the most common graphs you will create, when you look at data over time. Which is a line graph showing anything related to time. If you are ever showing data that by time that is not a line graph. Definitely, think more about how you're presented data because it's probably not the best way. Time series should always,19 percent of times be presented using a line graph and this is exactly what we're going to do here. So, this data again is from section 3.3. We have this in our last step, which is recreated a pivot table. We showing costs by week, and we copy and paste the data right here. So, again, the simple process is selecting the data that we want. So, I'm going to select all this data, go to line graph, this time under the charts, click on the 2D line chart. It pretty much already does what I needed to do. It creates a line chart, and this allows us to see trends over time, better than when we can look at just the table. Right? So, I'm just going to delete this for now, and we're going to look at the graph we already created. So, here we see that the spending definitely increases around November timeframe, and actually took a dip around the end of October. So, most of our AdWords spend comes in around the very beginning of October and then the middle is timeframe in November. Then, takes a series drop as we get into the Christmas holiday season. So, this trend becomes very apparent when you do a time series graph in Excel. So, that wraps up this lesson. We walked through basic ways to create bar charts, and line graphs in Excel. It's a matter of selecting the data that you want, and just clicking on the charts ribbon tab, and selecting the kind of chart you want. Feel free to experiment with a different type of graphs. There's area graphs, pie charts, scatter plots et cetra. But the most common types of charts you'll be creating in Excel will be these kind of 2D bar charts, stacked bar chart sometimes, and definitely the line graphs which show time series information. So, play around with that data, try to reorient the axes, sometimes as we did here with step two, with this stacked bar chart and as if you can visualize data in different way than what we showed here in this lesson.
6. Detecting Trends: Welcome to Unit five, Detecting Trends. In this unit, you will put on your detective hat and try to find some trends in the data, through the charts and the data that we've sorted in Excel. So, the first step here, is let's look at the first to see what trends we can see in terms of cost. So, if you go back to the first chart, you'll see the cost by week again, it's driven mostly by the Three Hook Message Board per week. So, it costs 60 bucks, almost $60 on October 8th, the week of October 8th. October 1st is also a big week for the Three Hook Message Board, the Watermelon Pot Holders came in third and then the Three Hook Message Board happened once again on October 15th. Excuse me. So, the trend here is that we know the Three Hook Message Board definitely drove a lot of the costs for the Google AdWords campaign. But how can we dig deeper into this data? Does the actual campaign, once the user or person searching for this product on Google, after they click on the ad, do they actually buy the products? Do they actually go to the store and go to assist store and click on the product and check out and all kind of stuff? Maybe all the costs we're spending on advertising the Three Hook Message Board is all in vain and no one's actually buying this Three Hook Message Board. So, these are ways you can detect the trends in the data to tell the story. So, let's actually look at a dataset here that comes from the ACMI raw data set. Now, I'm going to spare you from actually having to analyze and crunch the data this one time. Here's the data set that we have here. Essentially, what it shows you is data by campaign and for each campaign, it shows the cost and number of converted clicks. Converted clicks again stands for the number of clicks that actually resulted in a purchase on the store. So, for instance, the Wall Stencils cost a total of $212 over this time period, Q4, 2012. It drove 49 purchases which is actually a ton of clicks that drove to a purchase. So, we see that as things go down past the wall stencils, the recycled paper basket also drove a lot of purchases. But the wall stencils was definitely was the biggest driver in terms of purchases. If we look at the Three Hook Message Board which is what we saw here in the graph 3.4, even though it cost the most in terms of spend on the AdWords campaign, it only actually drove one purchase which sucks. Basically, what you're telling me if I'm the CEO of ACMI and you show me this data. What you're telling me is that all the campaigns, even though we spent a lot of money, $428 to be exact as you can see here, it only drove one purchase. Now, why is that? Why did that campaign perform so well in Google but actually didn't drive purchases? Why did this wall stencils, we paid half as much in terms of advertising spend but 49 different purchases were bought through that campaign. So, that is an interesting trend that you can maybe bring out in the final analysis which is, we know that the campaign for Three Hook Message Board drove a lot of costs but there's something about maybe the product picture or maybe about the checkout process or maybe the description wasn't clear enough for the potential buyer to actually want to buy the product, if the goal of your company is to sell products. For a store like ACMI, which is selling a bunch of products, it makes sense to that they would want to make sure their advertising is actually giving them a return for their money. So, that's an interesting trend that we can definitely call out to the CEO. Now again, in step two here, we dig deeper into which products are being bought the most. The wall stencils again are the most bought product because twice as less as Three Hook Message Board. Another question you can ask yourself and this is also when you put on your detective hat is, maybe the Three Hook Message Board is just too expensive? Maybe the Three Hook Message Board cost $10,000 and even though one person bought it and it cost $430 to fund the campaign, if one person bought it at $10,000 that's actually worth our investment because we're making a huge profit there. Maybe the wall stencils only costs 50 cents per wall stencil, so at 50 clicks, at 50 purchases, that's only $25 worth of revenue. So again, these are questions that you can ask as you dig deeper into the business. This is more to the business side of things and asking the CEO of ACMI is what are the actual unit costs of these products? Maybe the reasons why the people are buying the wall stencils so much is because they're cheap and they're on sale. So, these are trends that will lead you to ask more questions. Again, part of analyzing data is not so much just having the answers but also putting your detective hat and asking the right questions to your potential stakeholders and your business colleagues. So, that's the message that we want to walk away with from detecting trends about that Three Hook Message Board from the first graph. Now, if you look at the number of clicks, we look at that graph here in lesson four, we noticed, I'm going to delete this graph just to clear things up. We see that it looks like the number of clicks is pretty disperse, so these are the number of clicks again broken down by products. We know that the Doily Cereal Bowl drove in the most number of clicks. Now, the wall stencils, also we know that people bought that product but also drove a huge number of clicks. If you look at the graph, it seems like it's pretty well distributed. Like a few clicks here, a few clicks there. But let's use a quick formula to see how many clicks were driven by the the top five most products. So, I'm going to write a quick formula here, "=SUM(" I'm going to sum up all this data here. So, we know the total number of clicks is 6,290. Now, in this cell over here, I want to see how many clicks were driven by just the top five. I have a hunch that the top five products actually drove the most number of clicks, that was actually one of our first hypotheses. So, =SUM(, I'm going to sum up the first five cells; one, two, three, four, five. There you go and then we divide by the total number of clicks that we just saw here, which is 6,290, divide by 6,290, Enter, and there we go. I'm going to quickly format this to show percentage. So, this means that 46 percent of all the clicks were driven by just these top five products; Doily Cereal Bowl, the wall stencils, painted mason jars, watermelon pot holders and the metal storage cabinet. So, that's an interesting trend that I would want to call out is that, even though the clicks look like they're pretty well distributed, in reality, almost half of all the clicks to our campaigns or all our product campaigns we're driven by just five products. Perhaps people are only looking for cereal bowls in November or wall stencils in October. Again, these are trends that you can detect by just thinking about how can you summarize this data in a better way? What does the storyline you want to tell across to your audience? Et cetera. So, that's an interesting trend that we just saw with the clicks. Step four, let's look at the line chart that we created in lesson four. That was a time series chart showing cost by week. So, we're going to go down here and we'll see that. Here is the cost by week. We noticed that a lot of the spend happens around the beginning to the middle of November. That trend is pretty apparent as you see this dip here and the increase here. Why does that happen? We pointed these questions in the last lesson, is why are people clicking more in November versus these low months, and in October? Are certain products performing better? Is the click-through rate higher during those weeks leading to more purchases? Or perhaps are we bidding more on these keywords in Google AdWords so that we're spending more of those weeks too. So, the takeaway here is that I see a trend that shows that a lot of activity is happening around the Thanksgiving time period. So, maybe people are buying more products around Thanksgiving in preparation for their Christmas present to their significant others. So, again, detecting trends, this class was all about looking at the data at a high level and seeing what nuggets of cool trends or interesting facts we can pull from the data. We don't want to just show a huge list of data inside here and just put that on a slide and show it to our audience, we want to see how we can actually add value to the data by finding the storyline, finding the trends that you want to tell across your audience and adding value that way. Again, being an analyst that looks at data, you have to think about what the data is actually saying and always be asking questions, that's the most important thing is always ask questions that lead you to dig deeper into the actual business decisions and the actions that you should take to improve your profit, improve the number of purchases, et cetera. So, that wraps up lesson five, which is just all about detecting trends in data.
7. Revisiting the Original Hypothesis: Welcome to Unit 6, revisiting the original hypothesis. In this unit, we will look back to our hypothesis from Unit One to help us figure out what kind of story we want to tell with the data. So this is going to be a quick lesson around how the data either supports or rejects our original hypothesis from Lesson One. So the first hypothesis that we thought about that I proposed was whether or not the majority of clicks from our AdWords campaign came from just a few handful products from our store. So, we saw here, I created a simple pie chart that shows that again the 46% of our products, of our clicks, excuse me, come from the top five- Sorry, top five most clicked on ads were from just five products alone. The rest, around, let's see how many, one, two, three, four, five, one, two three, four... 19 other products in our store accounted for 54% of the rest of the clicks. So, this is a pretty interesting trend, is that why do these five products drive so many clicks? So, our original hypothesis is supported in that there are only a few products that drive most of the spend and the clicks in our campaign. Now, if I were analyzing this data set more, I would try to dig into why these five products are causing so many clicks? Maybe it's the way that we wrote the advertising texts in the Google AdWords campaign, maybe the products are very hot right now because they're in style, and they showed up on TV, on some TV show or something like that, but these are all questions that I could ask to dig deeper into the data on why these products were driving so many clicks for this company. So, that original hypothesis is supported, and thinking about the hypothesis really helps you develop a story that you want to tell in the final presentation that you give to the CEO of Acme. The second hypothesis was pretty generic idea that most people do shopping right before Christmas time to buy presents for their family, their friends, and so on, and so forth, for their coworkers, but in reality the least number of products were bought during right before Christmas, and most of the products were bought in November, as you can see here, based on this time series chart. Again, if I were to dig deeper into why this is happening I would want to know are we actually increasing more spend during November? Maybe we're spending too much and we need to spend more in December and we want to increase number of people buying around the Christmas time holidays. So, that hypothesis is actually rejected and one thing to bring out in the final presentation is answering the question, why is this happening? Why do we spend so much in November? Are people searching more for products in November? These are all questions that you can ask yourself and also to the CEO of Acme as you're presenting the information, but the important thing here is that our original hypothesis was rejected, and you, as an analyst, can dig deeper into why this is happening by looking at the data set more. Finally, even though we proved one hypothesis and rejected another hypothesis, this lesson is similar to lesson five in that we want to keep on asking questions on what products drove the most during thanksgiving? Why did those products drive the most cost? Maybe we need to increase the click-through rate on our ads around Christmas time because we think that we can get more people to buy our products around Christmas. So if I were to analyze this data even more, I would want to figure out what is the click-through rate in November versus December? Perhaps people are just not clicking on as many ads for our ads during the Christmas time. So, the goal will be, how can we get more conversions and items bought for our Christmas-time shoppers? Another question is how can we get more products bought for more expensive products? Maybe we did see that our wall stencils was our most bought products, maybe all those wall stencils were being bought in the November time frame, but in reality the three hook message board, which we saw cost us a lot in terms of advertising but only drove one purchase in the October to December 2012 time frame. So, maybe we want to figure out a strategy for figuring out how to get more products bought from our more expensive product items. So, these are more hypotheses, more questions that you can ask yourself and also to the CEO of Acme when you're presenting your final findings on this data set. So, that wraps up Lesson 6, it's pretty short lesson, and we just wanted to figure out if our original hypothesis was approved or rejected, and we did see that our first one was proven and our second one was rejected, but it did lead to some more interesting insights around the data.
8. Formatting: Welcome to Unit seven, formatting. In this unit, we will talk only about how to make your charts visually appealing, and so that your audience walks away with one message from your data analysis. When you present data, various websites like the New York Times, Business Insider, all these news sites. They're doing a really good job of showing the data in a readable fashion that points to bigger aspects of data visualization that we can use in our own charts and own graphs, so that when you present your final findings to your audience, whether it's a CEO, whether it's to your colleagues, they walk away with a message that you're trying to get across and nothing else. The worst thing you want to do is present a slide and PowerPoint with a bunch of data, with a bunch of graphs and there is no clear message and no clear storyline that the audience can walk away with. So, here are some quick tips around formatting your bar charts and your line charts to make sure that your charts are clean, that the message is clear, and that your audience will walk away with one salient point that they can use to drive a new business decision, or some other action that you want them to take. So, the first step is, let's look at our bar chart or sorry, lets look at our pie chart that we created in line in lesson six. It's a pretty simple pie chart. We can quickly see that our top five products generate 46 percent of our clicks and the remaining 19 products are 54 percent of the clicks. In data visualization best practices, there's a theory that you should actually never use pie charts. Pie charts can actually distort data, especially when you have more than one series. Because if you imagined, if you had a pie chart with more than two slices, it's card is figure out like does this slice make one fourth of the pie? Does this slice make up two-thirds of the pie? You can obviously add values like 46 percent and 54 percent here to make it more apparent. But as a rule of thumb, try to avoid using pie charts, stick with bar charts. So, that's what we did with this chart right here in step two of this lesson. We basically converted the pie chart to a simple two series bar chart. What that allows us to do is quickly see a trend in the data which is Top Five Products again, generate 46 percent of the clicks. The remaining 19 products generate 54 percent of the clicks. Other aspects of creating a chart to be aware of s is to clearly always mark off the title of the bar chart, as well as the axes. So in this case, I have labeled this as Clicks to Top Five products Versus the rest. Top five products versus the rest. Also the horizontal axis I have labeled as Top Five Products, I remain 19 products. So, these are just coming straight from this dataset right here. So, let's build this graph from scratch to show you how I would do this by writing in the right axis titles, the coloring, so on and so forth. So again, these are again, data visualization tips to make sure that your graphs and charts are clearly marked and labeled so that your audience can walk away with one clear message from your your data. Let's move this graph over here. I'm going to select this data right here. Again, this is just our top five, the clicks going to our top five products and the clicks going to our remaining 19 products in our Acme Product Store. I'm going to select this data, I'm going to charts. I'm going to select column, and then I'm going to select clustered column. Then here we see that the chart is not exactly how we want it to be. So, I'm going to select data and I'm going to switch the row and column, because we want each value within the series to have its own bar. Series two, you'll see that is this red thing. We actually don't want the red series which is the clicks. We just want to show the percentage. So, I'm going to remove that, oop, sorry. This is that series two was actually the correct series. I'm going to get rid of the series one. Remove. Here we go. Now, we have our just the percentages and not the number of clicks. I'm going to hit okay. You notice that it looks a little funky compared to our final output. The first thing you want to know is that, when you're creating a bar chart, the most important thing is to make sure your y-axis is zeroed out. What that means is, Excel automatically tries to find that middle range. In this case, it tries to use 42 percent to 56 percent as the range. We want to make sure that the bottom axis is always zero. So, how do you do that? I'm going to click on the vertical axis. I'm going to go to chart layout. I'm going to go two axes and then go to horizontal. Sorry, vertical axis and go to axis options. I want to make sure that the number, oh sorry, the scale, the minimum is zero. Right now it's at 0.42 as you can see, I'm going to set it to zero. Now you can see that the chart is starting to look a little more like our final output. So, other quick formatting things I'm going to do. I'm going to delete the legend actually because we only have one series, so we don't really need to have a legend to tells us this is one series and this is another series. I'm just going to hit delete. I'm also gonna delete the gridlines. Grid lines actually are a little distracting for your audience, so we don't need them in our final graph. Again, when presenting graphs always think less is more. If you can get across your point with just showing as minimal information as possible. That's your goal because anytime you introduce other aspects in the graph, they might confuse your final audience. Then we want to also make sure that the y-axis has the right number format, so we're looking at percentages here. I'm just gonna click on the axis again, go to axes vertical axis axis options. There's a number menu here. Currency oh, I'm sorry percentage. Oh sorry, I'm going to have to uncheck link the source first and then percentage. I want zero decimal places, enter, and there you go. Now, we have our percentages. Other things you can do from here, I want to make sure the font is big enough to read maybe, so I could obviously increase the font size. I could go to home and go to make sure this is a 12 font, and make sure the y-axis or the x-axis is a 12 on two. Then other important things again is to make sure you have a proper axis title. So, here we want to say percentage of total Q4 2012 clicks. Again, I'm going to go to chart axis title, vertical axis title, rotated title and it automatically will insert a title thing for you here. Use it to double-click and then start typing in your title percentage. It's percentage of total Q4 12 clicks and there you go. So, I'm going to spare, I'm going to not walk through the rest of adding the title name, but you get the idea is just to make sure you clearly demarcate or clearly write out what the axis titles are. Adding labels is also key here is because sometimes this might be hard to read but this is read to 48 percent or 49 percent. If you just add the labels, here you can have the 46 percent or 54 percent, so, it's very clear what the value is, so that people know exactly what the number is. So, here I'm going to say data labels value and then there you go. You can again format this to how you want it to be, 12, so on and so forth. So, in terms of the line chart, similar concepts there. I mean, we already have the line chart pretty much made. Important things to we get rid of is don't have the grid lines, maybe also add the values above each point, so that the audience knows it went down to 200 then it went up to 285. It dipped all the way down to $23 in the last week of December. We have costs here on the y-axis, it's the label. So, if you look at the difference between this chart and this chart, there's not a whole lot that we need to change but just important things to know are just to get rid of the grid lines and to add in proper titles on the x-axis and the y-axis. We did that already in the first graph but just to hammer in the point. Sometimes it makes sense to include the title, the data label, sometimes it doesn't. But it really is up to you and how you want to get across your message. So, that wraps up lesson seven in terms of quick data visualization formatting tips. Data visualization is a huge topic in itself. Again, in The New York Times, they have a whole data visualization department that just releases info-graphics that shows really cool trends in our society around, economics, et cetera. If you want to read more about data visualization, that's a whole separate topic and could be a whole separate sculpture class. Edward Tufte, he is the leading author on data visualization. He wrote a book called Data Visualization many years ago which is still considered the seminal piece on data visualization. So, in terms of Excel though, there are certain things that you want to be aware of when you are creating your charts and graphs. That's lesson seven.
9. Presenting Findings: Welcome to unit eight, presenting your findings. In this unit, we will take all the data that we've collected and analyzed and summarize it all on one slide to present to the CEO of Acme Incorporated. The topic here is a little bigger than Excel, it's more around how do you present data in a very effective and visual way, and there's many different ways you can do this. This is just one way that I've taken our data and put it into a slide. You can have many different ways you can do this depending on the story you want to tell to your audience. For our CEO of Acme, original hypothesis was that most of the clicks to our campaign or Google AdWords campaign are driven around a few products, and that most of our activity on Google AdWords or costs rather is around Christmas. By thinking about this hypothesis, we can start to answer the question, does online advertising work for Acme Incorporated? So, I've listed the steps here that we need to answer the original question, does online advertising work, and we've seen that it does. It does drive actual purchases. It does drive actual clicks to your website as we can see here on the slide. We need to tell a story around the data instead of just regurgitating numbers from the analysis. So, how would you present everything on one slide? How would you summarize everything into one graph, two graphs and have your audience walk away, have the CEO of Acme walk away with a few things about whether or not online advertising works? Again, this is how I would structure the slide if I were to present it in a PowerPoint format. You might have different ideas and more creative methods of doing so, but this is how I would structure something in terms of the analysis. So, I always want to have a good title, which is Does Online Advertising Work. We'll look at Q4 2012 Google AdWords campaign, and this is the analysis that goes along with it, of course, and a subheading which is basically the one bullet point takeaway that if I had to summarize everything around my analysis to the CEO of Acme. I can obviously say something like, "Yes, online advertising does work." Takeaway is that majority of the clicks are driven by five products in your product line, and they all happened around the Thanksgiving timeframe. That's the one takeaway summary point that I would want to get across to the CEO of Acme. Everything below that bullet point is this summary graphs that can help support that statement as well as help the CEO see more into the data and ask you questions about your analysis. So, you'll see here that we have our two graphs that we created around clicks and cost. This graph where you've seen in the previous lesson seven, which is just our top five products, drive most of the clicks in our campaign. The CEO of Acme may say, "Oh, well, interesting. What are those five products? Why did they drive so many clicks?" That's where you can put on your analysis had and say, "Oh, well, actually, it was driven by our three hook message board and so on and so forth." Allows the CEO and you to have a more valuable conversation around whether or not online advertising works. Our second hypothesis was around whether or not the spend for the advertising campaign happens around December, Christmas time, and actually I drew a little red highlight here so I can help my CEO laser-focus on what I want her to see in this graph. In this case, I want her to focus on just that really uptick in cost around the mid-November timeframe, which is right before Thanksgiving. Perhaps I could ask the CEO, does this make sense to you? Do we want more of your products bought during November? Do we want to increase spend in December, et cetera? So, this again, highlight that red circle allows your CEO to focus just on the one part of the graph that makes the most sense to show. Then beyond that, I would just add a few bullet points below the graphs. If there's any kind of interesting trends or analyses that you want to include in the slide, sometimes it might make sense to just summarize what the graph means if it's not super clear or it might just be to dig a little deeper into what the graph means. Here, I just say online advertising works well for a few products. My recommendation might be, maybe we should focus our budget on those products. Maybe those products are doing really well and sell a lot on our store, and we should maybe decrease the spend on the products that aren't doing well in AdWords and aren't driving purchases. Another bullet point I might add is our Christmas advertising campaign performance is very low. There's not much being spent around Christmas, and maybe we can get more people to buy products if we advertise more on Google. So, my recommendation might be, we see a lot of spent happening in November. If we increase our spending around Christmas time, we might be able to get more purchases of our products. A final bullet point would be around the AdWords campaign itself which is we should again optimiz our campaign because we noticed that some products are being bought on Google and we're still advertising for them. For instance, the three hook message board, we saw in our previous lesson that while it was costing us a lot to advertise per week, only one person bought the product over that three-month timeframe. So, perhaps it's time to decrease our budget around advertising the three hook message board in the AdWords campaign. So, to wrap up this lesson, we want to again succinctly put all our data into one slide but not overwhelm the audience with too much data because then it's like, what's the takeaway message here? What's the actionable item that I can walk away from? So, this slide gives you a general overview of whether or not online advertising works for the CEO, and a few takeaway points that you can add in. That's really your value add to the analysis, which is surely you have all this data. You have the raw data set, you're able to filter and sort the data, but how do you actually act on the data? That's where this analysis and the findings could come into play. You're not just presenting the data, you're not just saying here's the update from last week, here's the update from Q4, but here are some actual action items you can do to increase the number of purchases, to drive the number of downloads, to get more people to look at your website, so on and so forth. So, that wraps up this lesson on presenting your findings to your audience. Thank you for taking my intro to Excel class. You've finished all three classes in the series. Now, you have the skills you need to be a real data analysts and in the real world to analyze data to help your business, your nonprofit, your school. Go forth and do great things.