Google Data Studio: How to use Data Blending and JOINs? | Lachezar Arabadzhiev | Skillshare

Playback Speed


  • 0.5x
  • 1x (Normal)
  • 1.25x
  • 1.5x
  • 2x

Google Data Studio: How to use Data Blending and JOINs?

teacher avatar Lachezar Arabadzhiev, Founder and CEO @SkildLabs

Watch this class and thousands more

Get unlimited access to every class
Taught by industry leaders & working professionals
Topics include illustration, design, photography, and more

Watch this class and thousands more

Get unlimited access to every class
Taught by industry leaders & working professionals
Topics include illustration, design, photography, and more

Lessons in This Class

5 Lessons (25m)
    • 1. Introduction

      0:22
    • 2. Getting your .csv data sources ready for blending

      2:39
    • 3. Blending data sources with a single JOIN key

      9:42
    • 4. Creating a calculated field in a blended data source

      3:50
    • 5. Blending data sources using multiple dimensions

      8:46
  • --
  • Beginner level
  • Intermediate level
  • Advanced level
  • All levels
  • Beg/Int level
  • Int/Adv level

Community Generated

The level is determined by a majority opinion of students who have reviewed this class. The teacher's recommendation is shown until at least 5 student responses are collected.

11

Students

--

Projects

About This Class

By default, all charts in Google Data Studio are connected to a single data source. However, as part of Data Studio’s data blending capabilities, you can create charts based on multiple data sources. For example, you can blend two different Google Analytics 4 (GA4) properties to measure the performance of your app and website in a single visualization. 

ef4be8a0.png

As such, in this class, you are going to learn about data blending in Google Data Studio (the 2021 edition) and see how the functionality has evolved over the years. We are going to explore join keys, data relationships and combining tables with multiple dimensions. Let's dive in! 

Meet Your Teacher

Teacher Profile Image

Lachezar Arabadzhiev

Founder and CEO @SkildLabs

Teacher

Hey there! I'm Lachezar Arabadzhiev and for the past five years, I have helped companies harness the power of data in a variety of ways to drive business growth and innovation. I began my career as a digital marketer at Microsoft but soon transitioned to the audience and analytics world, where I had the opportunity to work with major brands including Air Canada, RBC, Walmart, Kimberly-Clark, Nintendo, Mazda, and HSBC. In addition, I am the owner of the Data Studio Canada educational website. 

Throughout all that time, I always loved teaching and developing learning programs for my colleagues and friends. That passion materialized in early 2021 in the form of a brand new company that I was luckily enough to start.

I am currently the Founder and CEO of SkildLabs, where ... See full profile

Class Ratings

Expectations Met?
  • Exceeded!
    0%
  • Yes
    0%
  • Somewhat
    0%
  • Not really
    0%
Reviews Archive

In October 2018, we updated our review system to improve the way we collect feedback. Below are the reviews written before that update.

Why Join Skillshare?

Take award-winning Skillshare Original Classes

Each class has short lessons, hands-on projects

Your membership supports Skillshare teachers

Learn From Anywhere

Take classes on the go with the Skillshare app. Stream or download to watch on the plane, the subway, or wherever you learn best.

Transcripts

1. Introduction: Hey, there latches are here with another video on data visualization, would Google Data Studio. In this class, we're going to learn how to use data blending, which is one of the most complex and useful features in Google Data Studio. We're going to talk about junkies, cardinality, and how to blend data sources with multiple dimensions. Let's get started. 2. Getting your .csv data sources ready for blending: We're going to start off by configuring the datasets that we're later going to blend. The first thing that you have to do is as instructed, go to the URL of this shared directory. Once you click on it, it will lead you to this page and this is Google Drive. And all these files are Google Sheets. And we're going to be using them for the extra sizes. Would you have to do? And this is really important because it will affect the configuration and ingestion of the data. As you can see, they are in a folder. So if you go back here and click and shared with me, you'll see this Alice and data blending folder. Once you click within the folder, you'll see the files. The thing that you have to do, the important piece that you have to do is make sure you select these four files. Then go to the three dots here on top, and click on Make copy. And once you click on Make copy, Google will start uploading these files from this shared directory to the actual drive that you have. So as you can see here, copied for files into MyDrive. So by doing that, you now have local copies of these files. Now the next step would be to go to Google Data Studio just to see how this will be visualized. I've opened Google Data Studio here. I'm just going to click on it. And we're going to click Blank Report. And then we're going to be of course, prompted to select a data source. In this case, I'm going to click on Google Sheets. And immediately when I clicked, I can see of all the items that I have. Copy of dataset 1, 2, 3, and 4. I have them right here. So when, when we're selecting them later on and using them as data sources and blending source. We just click on it and click on Add. You don't have to have them right now. We'll do it later on when we are looking at the case scenarios. And lastly, it is very important to use the same accounts. So check your account here and email address that you're using. And also check within Google Drive that you're using also the same account for Data Studio. Sometimes there are situations where Google will log you in into one account if you have different Gmail accounts and you would not be able to see the files. So ensure that you check that when you toggle between Google Data Studio and Google Drive. In the next video, we're actually going to start blending the data. 3. Blending data sources with a single JOIN key: Now let's dive into data blending. We're going to open a brand new report and simply click Add Data. And we're going to repeat what we did in the last video, google Sheets here. And there is a bunch of datasets that are available. We're going to start first with dataset one main. And I click on that. There's only one sheet there. And we're going to leave everything as is. I'm going to click and we're going to get the second prompt on a little bit. And we're going to click on Add to Report. What we're seeing here is one of the dimensions, website name and then the default dimension, record count that Google appends tool and new tables. We're going to do a bunch of changes here. So we're going to change this to we're going to add date. And we're going to add session. I'm going to drag them right here. And we're going to remove the record count because we don't need it. And we're going to expand the table. So what we're seeing here is the date is January 25th, so the month of January, all of that. And we have website name inside Data Studio dot ca, that's one of our resources. And then we have sessions for every single date. Now imagine if we had another website that was also in our Google Analytics, or we own that digital property and we want it to report on the sessions together for those two websites. That's where data blending could be super useful. So instead of creating a brand new data sheet and doing all the manipulations there and then re-upload ing it. We can actually do a lot of the work in Data Studio. So we can see here on data-set number one, we've already added in underneath, there is something called Blend data. So we're gonna click on that. And a new, fairly new menus going to show up. So I'm going to expand it here. What you're seeing here on the left is your dataset. So we have dimensions. We have date, website, website name, sessions, and there's nothing really different. Now let's add another data source. So we're going to click out here and notice something. We have the available data sources here, and then we have added data sources. So this is the one that we add it. Now, you would have to go back before we proceed, will have to go back and add the second dataset in order for it to show here, because even though we haven't in our Google Drive, it's not visible to Data Studio yet. So I'm just going to close it quickly and discard changes. I'm going to go to Add Data. Same thing with Google Sheets and click on main dataset 2. We're going to repeat the same procedure, fairly easy ad, and then add to report. And we are done. So dataset one is still here, blend data. We're going to expand this again at another data source. And there you have it. We have a dataset 2. So we're going to click on it. And here's something interesting that changes. So we have this join keys. We talked about the different types of joins that there are. And with Google Data Studio, the main join used is left outer join. And just as a refresher, what that means is the product of this joint will include every single data point in this left side here. So everything that this dataset plus the matching values between dataset 12. So that is what's going to be our product. In order for this to work, you have to make sure that you're joining the right spot. So what is the unique identifier that we have between those two tables? Well, if I shrink this a little bit here, well, I have the date and then I have the website name. But here is Data Studio dot CAD voter dataset has another website, so it's not a really unique. And then the sessions is the metric that we're really trying to add and get. So it's not really going to be a unique identifier. So that leaves us with date. We really want to merge this based on the date. So let's try to do that. Well, Google Data Studio normally recognize us a couple of these joints and it doesn't automatically, but in this case, it has pulled website as well and we don't really need that. So we're going to close this off and delete it. And we'll show missing. So you will have to close it off from here as well to completely disappear. Now here's the fun part. We have the sessions here from dataset one, and then we have sessions here from dataset 2. So I'm going to drag this to metrics. And what's going to happen here is I'm going to have sessions and the name is going to be absolutely the same. So what we wanna do is rename it. So sessions I'll put number 2, just as a reminder that this is the session number 2. And then I'm going to put sessions number one here. So session number 1. And on the right side here, you can see what the product is going to be. So this shows you what dimensions are going to be available when you click Save. And so it tells you there's going to be date, there's going to be the sessions of website one and then the sessions of website too. So let's click Save just to see what the product is going to be. We're going to click Save. And what's going to happen here is we're going to get another data source called blended data. And of course you can rename that there is no problem, but where you can access all your data blends is in the resource section here, and you're going to click Manage blended data. And here is the blend that we just created. You can duplicate it, you can edit it, and it's very similar to the data sources. So I'm going to close this off and let's see what our product does. So against every single date, we have the sessions for website one and website too. And because we've noted them here, we know which one is website one, that's the Data Studio. A session number 2 is actually Data Studio.com. So let's see for correct because we have the numbers but we're not sure if this is actually in blended in the correct way. So I'm going to expand this table as much as I can so we can see most of the record. So we have four and then a little bit more, maybe I'll scroll down. And then we're going to see one here, right? So just for the sake of the exercise, we're going to add another chart here on the right side. And we're not going to have the blend. We're actually going to have data-set number one. Just so we note it down, I'll click on the text box here and I'm going to write dataset one. And then we're going to copy this table. So we're going to click on data here, Control Copy or command copy. If you're on Mac, paste it here. And we're going to change that dataset as well. And we're going to change it to dataset and number two, main. And so let's see, let's pick a date, for example, that we want to double-check against all these. Well, let's say that's January seventh. So let's see what happens in January 7th. But before that, we can style this a little bit better so it's easy to understand. So we have dataset one, and I'm going to scroll this down and write dataset 2. And also we're going to add this control for data range. So we can filter just the specific date. And we've added it here. Let's click View and see if this is all correct. So what we're going to go and change this to January 7th, and we're gonna go back here. Similar thing, we're going to be January 7 and we're going to click apply. And let's see what happens. There might be an error, but let's see where that error is going to come from. So we're going to apply in. There you go. So first dataset number 2, we have 87 thousand, and in this case we have 87,907 here. So we are correct sessions number two, that's perfect. But dataset one didn't really filter. And we'll see why that's the case in a minute. But we do have the correct data, 29505 and then 29505. So I'm going to click Edit. And just as a reminder, when you're dealing with CSV files and Google sheets or any other external formats that are not native connectors. It's really important to make sure that the date range dimension is applied because data student needs to know which one is the date. So we can apply it when you're filtering. So if I go here and add the date, we're going to be able to filter it just fine. So let's try it again. Go back to January 7. And, and then click on seventh here. And now it should all be fine. So it's easy for us to check this blend. 4. Creating a calculated field in a blended data source: Let's take a little bit deeper into the blend that we just created. So we're going to look into the dimensions and metrics that we have in one of the important pieces is, how do you actually take sessions 12 and have them in one column? How do you add them up? Because you really want to have a total that you can use for another type of chart and not necessarily to have it in two separate columns. So Data Studio has really made it easy for us to do that. We simply click on the Blend and we could come in here and we'll just have to move these sessions to metric just here on the bottom. And then similar to what we would normally do, if this wasn't just any type of connector. We're going to click on Add metric, Create field. And we're going to say number one plus number two. And then we're going to say Session 1 plus sessions too. And so it's as simple as this, we're going to click Apply. It asks us for the aggregation, which is great. We're going to have it at the Somme at this point. So that is perfect. And as you can see here, we have sessions 1, 2, and then 1 plus 2 here on the side. If we're doing any type of a chart, we can even collapse 12. Or even potentially we can have it as an optional metric. So if we have 12, we can add this one here, and then we can add it to here and simply have 12 as your main table. Then if you click on View and you want to break it down to see which website did better. You're just going to click optional and simply add these guys back again. As you can see, it's really flexible and powerful what you could do with the blend. Another aspect that's really helpful when you're doing blends is the type of charts. So if I go back to edit here and perhaps click on this one and change the chart type. You will see how this can be really helpful, what we just did with the optional metrics. So we're just gonna do, let's say, a line chart. And it's a little bit distorted obviously because it's fairly long, so we're just going to normalize it like that. We have the dates here. We have 1 plus 2, which is great. But in this case we're going to just click on optional matrix once again at session one and session two here, and click view. And so what happens is we have the total number and let's say you were analyzing our website. We're seeing where the days were. A lot of people are visiting our website then where the slumps are. But if we want to see who contributed to that more, one or two, we can once again click on the optional metrics. Have sessions, add sessions here again, and we have it all. We can easily spot which website contributed in which day. I can see that on January 17, website number 2 was really, really performing and there's a huge spike, whereas website number one, it was really pretty much linear and there was no, No real spikes or bumps. So by using data blending, you can combine all of those different tables and create more powerful and more telling visualizations. That being said, what we explored in this video is a fairly simple relationship between those tables and the number of joint keys that we have. But in the next video, we're gonna take this to the next level by introducing multiple join keys and blending tables that have a lot of different dimensions. 5. Blending data sources using multiple dimensions: In this last video of the module, we're going to tap into the other two data sources that we had. So we're gonna go back to Add Data and click on Google Sheets. And this time we're going to click dataset three left. And we're going to add that. We're going to wait for the prompt, and then we're going to add the other one as well. So add data, Google Sheets dataset for, right. And we're going to click Add here. And then one more time, Add to Report. Now we have both of these loaded. What I want you to focus here on is the all of the data source. So we're going to go into resources to explore it a little bit further. We're going to click on data sources. And of course we're going to see all the ones that we used in this module, but we want to focus on Dataset 3 and 4. So I'm quickly going to click on 3 and show you what dimensions we have here. So we have campaign impressions, creative version, clicks, total conversions, and tactic. So these type of metrics and dimensions resemble a paid media campaign. Now you might have noticed that we have six dimensions here, and a lot of these are metrics. We're going to fix that later on. But simply because this is a external data source and data studio was not able to recognize them correctly. We can still see the type is number or text here. So we're safe in terms of solar save in terms of the actual type. And also aggregation is applied a sum here to the clicks and impressions. So we're good. We're going to go back into a dataset. And for a second, look at dataset number four. So dataset number four-halves tactic, creative version, campaign and then spend. So we're seeing that spend is what we're looking for so we can match how much you spend on each campaign and we can append that to the impressions clicks. So we have a fuller picture of our paid media campaign. So that is what we're going to be looking for when putting this blend together. We're going to click down here and we're going to proceed to creating our blend. So far this looks good. We have data-set number three here. So all we have to do is click blend data. And we're going to try a couple of things. Now, I've put it here as a notation as left because this is on the left side. This is where we had a lot of the data. So we would want to have this left outer join here. Then we're going to click on add another data source. And then we're going to have the right one, dataset number four. So the modifications that we will have to make here is we have the join key. We're going to replace the record count. We impressions. We're going to add click here on the bottom, we're going to add total conversions. And then on the right side, we're going to add spend because that's what we're looking for. The join key that we've selected here is campaign, and it's a single join key. So let's see what happens if we proceed right now with this blend and click just safe. Now let's expand this table and see what is happening. Well, it seems like everything is fine here and we have campaign 12. Let's double-check with the method that we learned last time. We're going to click on Add Chart table and we're going to place it here. And all we have to do here is just change it to Dataset number three, just the left one. And it seems like everything is fine. We have 42 million here for campaign 142, 36 for campaign to. And on top we have 36. So it seems like the blend was working. Now the little trick here is what happens when we start adding dementia. So I will add a tactic dimension. And then I want to creative dimension. I want to see per every single paid media tactic. And for the creative versions that I had, how many impressions did a half for each one. So we have the active metric here, which is great. Well, let's click on our blend. Obviously we don't have the dimension here because we haven't added it. But let's try this. So edit data source will expand this a little bit. Click on add dimension, and let's add what we were missing. We're going to add tactic, but we're missing. And we're going to add creative version. And so let's see what happens if we do this. We're going to save it. And now we're going to add the missing dimensions. So because these were inputted in the wrong place, we're just going to remove them from dimensions and put them in metrics because they are indeed metrics. And that can sometimes happen as I noted, Data Studio does not classified them automatically when it's coming from an external file. So It's important to pay close attention. Now let's add tactic and let's add creative version. And let's see what happens. Well, well we have the campaigns here and it seems like impressions were fine, cliques are fine and total convergence would be fine. Let me just add total conversions here. And let's add clicks. And let's just organize them a little bit better. So from what, at a first glance, it seems like those metrics are fine, but the spend metric here is showing the same value for each campaign. And if I view this, We can see that for a campaign to, it's this value. And for campaign one is this value. And it's clearly not correct because if we told us up, it will be way, way more than what we would have in our dataset number two. In this case, we've made an error here because we're assuming that this is a one-to-one relationship. And there's not going to be a breakdown between a campaign tactic and create a Google. The issue is confused. So based on the join keys, it doesn't know how to split the spent correctly for each campaign. So it applies it again and again and again for each line. How do we correct this? Well, we'll have to show Data Studio the specific granularity that's available. And we can do that by going to the data source here to the same blend. And instead of having just campaign here, we can add tactic and then we can add creative version. So by doing that, we're informing Google Data Studio that the tables should be joined by campaign tactic concrete a version. And only then if there is a match with another dimension that will be displayed correctly. We no longer need the dimensions here because that's just from the left joins side, and we've already added them here. So I'm going to delete these guys. Then click on Save. So let's see what happens with the table. Will go back to close. And obviously it's invalid because those dimensions do not exist. We just deleted them even though they were the same names. Sometimes you wouldn't be able to see that. So we're going to delete these guys and we're technically going to add the new dimensions. We're going to add that. We're going to add tactics and create a version. So in that case you can see this has changed in much better state. Well, how would we prove that? Well, we can create another table so we can delete this one. Click on Add Chart and create water table here on the bottom. That only includes dataset number four. We're going to click here and we're going to add tactic. And we're going to add creative version. And just spend instead of record count the record Can, we will delete it. And let's just reformat a little bit. And let's click on View. Let's try to find a campaign tactic and aversion to compare. So let's assume campaign number 2, tactic three and blue. Well, we're as bad campaign number 2, tactic three, actually on the bottom here, campaign number 2, tactic three, blue. What do we have here? 58,938 and the first record, 58,938. So we've correctly blended the data because we're using multiple join key to reflect the granularity of our table. So we're feeding more information to Google Data Studio. There are a lot of different cases that are we even more complex when it comes to joining and blending data sources. So be mindful and always check what's the relationship between the columns that you have in the two or three different data sources that you're trying to blend.