Data Visualization for Beginners: From Charts to Storytelling

Andrew Pach ⭐, PowerPoint, Animation & Video Expert

Lessons in This Class

- 1. Introduction (0:59)
- 2. 01.01 - What is Data Visualization? (4:01)
- 3. 01.02 - When to use charts (2:38)
- 4. 01.03 - Data-Ink Ratio (2:56)
- 5. 01.04 - Data-Ink Examples (4:33)
- 6. 01.05 - Encoding and Decoding (3:57)
- 7. 01.06 - Perceptual Tasks (5:01)
- 8. 01.07 - Perceptual Tasks #2 (6:26)
- 9. 01.08 - Remember this (2:36)
- 10. 02.01 - Proportion (2:32)
- 11. 02.02 - Color selection (2:50)
- 12. 02.03 - Accessibility (2:17)
- 13. 02.04 - Annotations (3:27)
- 14. 02.05 - Labels (3:18)
- 15. 02.06 - Estimates (3:02)
- 16. 02.07 - Decluttering (3:14)
- 17. 02.08 - Remember this (2:36)
- 18. 03.01 - Framing (4:03)
- 19. 03.02 - Narrative (4:45)
- 20. 03.03 - Dashboards (2:52)
- 21. 04.01 - Distortion (3:48)
- 22. 04.02 - Lie Factor (4:26)
- 23. 04.03 - Correlation (3:05)
- 24. 04.04 - Data Bias (3:14)
- 25. 04.05 - Normalization (2:42)
- 26. 04.06 - Aspect Ratio (3:48)
- 27. 04.07 - Y Axis in Line Charts (4:01)
- 28. 04.08 - Error Bars (4:33)
- 29. 04.09 - Remember This (2:15)
- 30. 05.01 - Plot Types (3:58)
- 31. 05.02 - Line Chart (2:40)
- 32. 05.03 - Area Chart (4:25)
- 33. 05.04 - Bar Chart (3:14)
- 34. 05.05 - Scatterplot (3:43)
- 35. 05.06 - Bubble Chart (2:32)
- 36. 05.07 - Histogram (4:12)
- 37. 05.08 - Density Plot (3:16)
- 38. 05.09 - Pie Chart (3:08)
- 39. 05.10 - Waterfall Chart (2:17)
- 40. 05.11 - Combo Chart (3:43)
- 41. 05.12 - Maps (3:09)
- 42. 05.13 - Remember This (1:14)
- 43. Exercise 1 – Analyzing Data (3:59)
- 44. Exercise 1 – Building Chart (4:47)
- 45. Exercise 1 – Building Title (4:55)
- 46. Exercise 2 – Analyzing Data (3:48)
- 47. Exercise 2 – Action Title (5:08)

About This Class

Practical Data Visualization: Create Clear Charts and Tell Better Data Stories

Learn the fundamentals of data visualization, chart design, and data storytelling in a practical, concise course focused on real-world application.

This class will show you how to create charts that communicate clearly, choose the right visualization for your data, and apply modern design principles to improve presentations, dashboards, and reports—whether you work in Excel, PowerPoint, Google Sheets, Tableau, or BI tools.

You’ll learn:

- How to choose the right chart type for different data
- Core principles of effective data visualization design
- How to improve readability with color, labels, axes, and annotations
- How to declutter charts and avoid common visualization mistakes
- How to use data storytelling techniques to guide attention and highlight insights
- How to turn raw data into clear, persuasive visuals through practical examples

This class is for analysts, marketers, business professionals, students, dashboard creators, and anyone who wants to communicate data more effectively.

No advanced tools or design background required—these principles work across platforms.

If you want to improve your charts, graphs, dashboards, and visual communication skills, this class will give you a strong practical foundation you can apply immediately.

Meet Your Teacher

Andrew Pach ⭐

PowerPoint, Animation & Video Expert

Teacher

Hi! My name is Andrew Pach, and if you want to learn PowerPoint you are definitely in the right spot! To my friends I'm known as 'Nigel'! I am an After Effects / PowerPoint / video / graphic design junkie eager to teach people how to use their as-yet-undiscovered raw design talent! I run a YouTube channel called "andrew pach", which I do with absolute joy and passion. Here on Skillshare, I would like to share interesting, project-based classes that will make your design workflow a greater experience. If you look below, you can select any of my PowerPoint classes to learn from them!

Related Skills

Project Management Presentation & Public Speaking Data Visualization Marketing & Business Business Skills Leadership & Management Strategy & Planning

Level: Beginner

Hands-on Class Project

Share a screenshot of a Chart you prepared for this class.

You can use any software you prefer to create it.

Transcripts

1. Introduction: Hello. My name is Andrew, and I welcome you to the Master Data Visualization course. In this course, you'll learn how to choose the right chart and when it makes sense, how to build charts, how to label them, how to annotate, how to estimate, and how to use color. And I'll also show you what to avoid, so your data is always honest. You'll learn how to, for example, declutter a chart from a mess like this into something more readable, a well-executed design that is easy to explain and use for data storytelling. After this course, you'll never look at charts the same way. You will pay attention to the little things like the axes, how forecasts are showcased, what colors were used, what design choices were made here, and it is a beautiful process. You don't need to be a designer. You don't need any advanced tools. You just need to be ready to learn. If you're ready, it will be a lot of fun. I'm waiting inside. Let us start.
2. 01.01 - What is Data Visualization?: Hello and welcome to the beginning of the course. We have to start somewhere, so let's explain what data visualization is. If I had just one tiny sentence, I would tell you: data visualization is turning data into meaning. It's not just about charts. The chart itself is only the tool we use to move an idea from a spreadsheet, from a table, from data that we have into someone else's head in a way that makes them understand it. And this is crucial. You want people to understand your statement. I like to teach with examples, I like to be practical with my courses, so let's go over it with this extreme example. This is a table. This is a table with raw data. This is Skystream Analytics, a company I've obviously made up. It's the revenue for 2060 expressed in US dollars. Now, you see raw data and you basically come up with no conclusion.
Unless you know the company and unless you know what I'm about to say, you basically aren't able to come up with a more sophisticated conclusion than just saying, hey, month A was better or worse than month B. If I ask you how the summer went, then you could start to look at it. But what do I mean by summer? Do I mean just June and July? Do I mean the warm months of the year? I would need to be a tiny bit more precise here. Do I mean astronomical summer? Previously, we had a table. Now let's do the exact opposite. Now we have just a statement: Skystream growth accelerated significantly in the last three months of the year. This is the conclusion I wanted you to come up with. But what do I mean by accelerated significantly? Do I mean 1%, 10%, 30%? This can vary. Okay. Now the exact same data that we had in the table, and that I gave you in this statement, is put on a line chart, because I thought it would be the most accurate and the most suitable type of chart to use here. I'm using an action title: revenue surged 42% in Q4 following the October pivot. If you read this sentence, you already know what I want to say with this chart. I want to show you, hey, we had a slight slump in the summer months, but later on, at the end, look how our revenue exploded. Take a look at the end of the chart. If I were the designer, I would probably draw a chart similar to this. I would draw your attention to the last three months of the year. I would make some kind of annotation like this, and I would give you a brief statement: Skystream growth accelerated significantly in the last three months of the year. This is what data visualization is all about. The key takeaways from this very first lecture are these. Communication first: the chart is only the vehicle for the message. Evidence and insight: data proves the claim, like the data we had inside the table, and visuals explain it.
This time, we used a line chart to explain the data that we had in the table. If you need a definition, there are many ways to define data visualization, but we could say data visualization is a functional tool used to reduce the time it takes to understand information, and I think that is beautifully put. There is one bonus piece of information I want to give you. Data visualization is not just charts. I will expand on this in the next lecture. For now, the goal of this course is completely set. After this course, you will be able to turn raw numbers into a visual story, into a narrative. This is the goal we are both working towards and striving for. So let us continue.
3. 01.02 - When to use charts: Are charts always better? A common mistake in data visualization is assuming that every piece of data always needs a chart. In reality, the best visualization is simply the one that communicates the point fastest and most understandably. Sometimes, obviously, this will be a graph, but sometimes just a simple sentence or even a table. Let's go over that, as always, with an actual example that you can comprehend. Yesterday, our checkout success rate was 98%. A sentence for that seems completely fine. Let's compare the same data in a table. This type of data doesn't make much sense to present in a table. Now a chart. Look at this comparison: the same information put in a sentence, a table, and a chart. The table and chart look almost ridiculous. On the chart, you need to read the axis, read the title, how big the bar is, the value. It's overkill. When the data is simple, the sentence is the winner because it has the lowest cognitive load for you. Your brain processes it instantly without having to navigate through an entire graph. But is a sentence data visualization? Of course it is. A sentence, presented correctly, sure, you can consider this data visualization. Now let's go over the second example.
The checkout rate for desktop was 99%, while mobile was 94%, tablets were 91%, smart watches were 75%, and overall, new users averaged 88%. As you see, text has a breaking point. As soon as we have five or so different categories to compare, a sentence simply fails. It was exhausting just reading this for me, let alone you understanding and comprehending it. This is why in this situation, I would maybe use a table. If you need to see the precise values like 99%, 94%, 91%, a table would be great. It's organized and it allows you a quick overview. The same on a chart would be just as good. Maybe to make it easier to read, I would put the data points inside the bars. To recap this entire lecture, depending on the data you have, you might want to use a sentence, a table, a chart, or something else, a drawing. As long as it conveys the information in the most simple, quick, and understandable way, it will be proper data visualization. Let's now move forward.
4. 01.03 - Data-Ink Ratio: In this lecture, we are becoming more and more practical. We will talk about the data-ink ratio after Edward Tufte. It is a foundational concept introduced by Edward Tufte in his 1983 book, The Visual Display of Quantitative Information. The equation here is that the data-ink ratio equals data ink divided by the total ink used in the graphic. Nowadays, we don't use ink as much anymore, so let's translate that into pixels, because most data visualization nowadays is created on a screen. As always, I will explain this with an example. Data ink is the actual ink, the actual pixels used to represent the data. The blue background and the thick guidelines are also part of the chart. The data-ink ratio measures how much of your chart is actually doing work versus how much is just decoration. Like the blue background here: you would still understand this chart without this background, so this is just decoration. Your goal is to get that ratio as high as possible.
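As a rough illustration of the equation just described, here is a minimal Python sketch. The function name and the pixel counts are hypothetical and only serve the example; in practice, deciding which pixels count as data ink is a judgment call, not something measured this precisely.

```python
def data_ink_ratio(data_pixels: int, total_pixels: int) -> float:
    """Share of a chart's drawn pixels that actually represent data."""
    if total_pixels <= 0:
        raise ValueError("total_pixels must be positive")
    if not 0 <= data_pixels <= total_pixels:
        raise ValueError("data_pixels must be between 0 and total_pixels")
    return data_pixels / total_pixels


# Hypothetical pixel counts for the same bar chart, before and after
# removing the colored background, thick guidelines, and shadows.
cluttered = data_ink_ratio(data_pixels=12_000, total_pixels=60_000)
decluttered = data_ink_ratio(data_pixels=12_000, total_pixels=15_000)

print(f"cluttered:   {cluttered:.2f}")
print(f"decluttered: {decluttered:.2f}")
```

The data pixels stay the same in both calls; only the decoration shrinks, which is why the ratio rises.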
If I take a look at this chart, this green part represents data, those labels represent data, and, of course, the titles of the actual bars represent data, and maybe the title. If a graph has too much noise and too many distracting elements, it is considered to have a low data-ink ratio. Okay, let's go over this example. In the example below, the background, the grid lines, the shadows, and other unnecessary aesthetics distract from the data being represented. In contrast, take a look at the second, more simplified chart. Removing distractions makes the visualization far easier to understand, and the person viewing it is able to focus more on the data itself. I think this is pretty clear. However, we need to make sure that the diagram is not simplified so much that the ability to understand the data is reduced. For example, here I'll remove the value next to YouTube, the red bar. It becomes difficult to understand. Yes, you may see that the bar is twice as long as Messenger, the blue one, or maybe more than twice. So to summarize this, try to reduce clutter without compromising the chart's message. I think this is pretty clear, and Tufte actually laid out five laws of data ink: above all else, show the data; maximize the data-ink ratio; erase non-data ink, of course within reason, not like I did before, when I erased a little bit too much; erase redundant data ink; and revise and edit. In the next lecture, we will go over an example where we do all those tasks and overall take a look at a chart and try to adjust it.
5. 01.04 - Data-Ink Examples: In this lecture, I want to go over a practical example regarding the data-ink ratio. We have a chart here. There are many changes that we need to apply to it. Let's take a look at it. Above all else, we should show the data. But if we use a 3D chart, the data is skewed, and the bars appear a little longer than they really are. So first off, I'll get rid of the 3D effect. Now the grid lines. The grid lines obscure my vision.
They actually make those bars difficult to read, so I'll remove them, and the secondary guides as well, because we actually don't need them as much. If you want them, you can make them thinner and lighter, but definitely not as thick. Now, because we have a shadow on the text on the left side, I want to remove this shadow because it makes the entire chart difficult to read. Now I want to remove the shadow around the bars themselves. I think the colors are perfectly clear. Now, some colors are not perfectly visible. We will address that soon. Currently, we have plenty of colors, but for example, the green one and the yellow one, in my opinion, are a bit hard to see on this background. I will make the background white. I will not tell you that you can never use gradient backgrounds, but in the majority of cases it isn't a great idea, especially if it obstructs the view of the actual data. Now, take a look at the bottom axis. The bottom axis goes from zero points to 100 points. Do we need 50, 60, 70, 80 when there is no data? Let's reduce the axis to stop at 50. Now it is better adjusted. The bars are actually more visible. They became longer, but they start at zero, so that is no problem. Okay, what else can we do here? Actually, it's a bit difficult now to compare the data. For example, can you tell me how much orange is? Is it four? Is it three? Is it three and a half? And how much is the dark green one? Is it 18, 19? So let's put the data directly over the bars. Now I can see it precisely. That's beautiful: 4.2 for orange, 18.5 for the green one. Okay? Let's move forward. Take a look at point number four, erase redundant data ink. What is redundant here? On the left side, we have all the names of the categories, and on the right side, we have a legend with colors and with the categories written again. Do we need this? I think it's perfectly clear that yellow is the email newsletter; you don't have to put it twice. So I'll just remove the legend.
Now everything is cleaner, and we are at point five now: revise and edit. I like this design, but I dislike the colors that we have. Let's make everything uniform, and let's focus on the message we want to give. What is the message here? It depends on what you want to showcase. In my case, I wanted to showcase the two highest bars. Those are retargeting ads and referral program. So let's change the color of just those two bars. Now I don't have to look at yellow, at orange, at blue, and I don't need as much brainpower to see what is here. I would personally prefer it if the highest bars were at the top. This is now revising and editing the chart. Because those are the biggest, the highest, and the most prominent ones that I want to show you, I'll put them first, so I'll order everything descending from them. Okay? Right now, I can put the data inside. I think it looks just a little bit cleaner. This is a personal design choice, but I think this will look better. Okay, I'm editing the chart, and now I can decide whether I want to stay with the bars or not, with respect to the data-ink ratio. The bar is a little bit big. Why do I need that much ink if only the last bit of information, only the data point, is important? I can change the entire bar into just a line. Here, I'm using a lollipop chart to display the data. This is how we went from an, I don't want to say ugly, but improper, over-designed 3D chart to a clean, simple, and understandable chart like this one. And this is exactly what the data-ink ratio is meant to represent and what you, going forward, will take into consideration when creating charts and data visualization overall.
6. 01.05 - Encoding and Decoding: In this lecture, we will talk about encoding and decoding, and I'm very excited to show this to you because it's a very important part that is often missing when talking about data visualization. And this is the reason why you will soon be a better designer than anyone else you know.
Hopefully. Data visualization is a two-way translation process. As a creator, you take a number, for example a revenue figure from a table, and you encode it into a visual property, like the length of a column in a chart. This is called encoding. Then your audience looks at the bar in this chart and must decode it back into a number or a conclusion, and this is the most difficult part. This part is called decoding. So you need to make sure that the chart you designed will be understandable for people and that they will draw exactly the conclusion that you want them to. Nothing should get lost in translation. This is your entire job here. Encoding is simply the act of turning a number into a shape. These are just examples here in a table, depending on what you want to use. You have the same value here, $100, but on different charts, those $100 will be represented differently. On a line chart, it will be a line. On a bar chart, it will be a column or a bar. In a bubble chart, it will be a circle. On a pie chart, it will be half a circle, or a part of the circle, or an angle. On a map, it might be a color, a darker color, a brighter color. It all depends on what chart type you pick. How do you select? What will be simpler? What will be more difficult? What will be appropriate? I will teach you everything about that in the upcoming lectures of this course, so don't worry. Now, decoding is where your audience's brain tries to turn your shapes back into numbers, and just look at it. You've probably not thought about charts like that. The problem here is that the human eye is not a perfect sensor. We are naturally better at understanding some shapes than others. For example, we have an easier time understanding the length of a bar than the area of a circle, and that is a fact. You need to memorize this very clearly. Let's take a look at the circles. Circle A is our baseline. Circle B is exactly 30% smaller. But looking at them, can you really say that?
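One reason the circles are hard to judge can be shown with a little arithmetic. The sketch below assumes that "30% smaller" refers to the circle's area (the lecture does not say so explicitly); under that assumption, the visible width of circle B shrinks by far less than 30%. The function name is hypothetical.

```python
import math


def radius_ratio_for_area_ratio(area_ratio: float) -> float:
    """Ratio of the radii of two circles whose areas are in the given ratio."""
    # Area grows with the square of the radius, so the radius ratio
    # is the square root of the area ratio.
    return math.sqrt(area_ratio)


area_ratio = 0.70  # assumption: circle B has 30% less area than circle A
radius_ratio = radius_ratio_for_area_ratio(area_ratio)

print(f"area of B relative to A:   {area_ratio:.0%}")
print(f"radius of B relative to A: {radius_ratio:.1%}")  # about 83.7%
```

A bar that encodes the same value purely as length would be the full 30% shorter, which is part of why length is easier to decode than area.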
Can you judge it just by looking at them? It's very difficult. Now, let's compare those bars here. Those are the very same numbers, but encoded as length, only length, not an entire area. Bar B is, again, exactly 30% shorter than bar A. It's not that easy to see, but if you put them on a common scale next to each other, it becomes much simpler. This is why graphs, charts, and especially the bar chart are so understandable. If you want your audience to get the right answer, you have to choose the right shape. Here we would only have to decode length and we would be done. To recap this entire lecture, you, as the designer, take the data and want to extract a message out of it. This will be your encoding. You choose a way to visualize it, for example with a chart, and voilà, we have data visualization, but this is only half of the story. The tricky part is that your audience needs to have an easy time decoding it. So it's not enough to create a chart. You also need to always be mindful of the people who will see, read, and hopefully understand it. In the next lecture, we will actually go over what is simpler and what is more difficult for people to comprehend.
7. 01.06 - Perceptual Tasks: In this lecture, we will be talking about perceptual tasks. I'll present to you the accuracy hierarchy. You will learn which shapes are easier and which are more difficult to understand. If you want to be a professional, you have to stop picking charts because they look nice. You have to pick them based on this accuracy hierarchy. This exact list here is based on a famous study by Cleveland and McGill from 1984. It shows what the human brain decodes most accurately, and it is still perfectly relevant today. I'm teaching data visualization, and I'm showing you a big table with plenty of columns and rows, and it's difficult to comprehend and understand. Let me help you decode it by changing the first column into simple numbers. That's much easier.
Now, the perceptual tasks. Instead of making you read them, I'll change them into icons. I'll, of course, give you some labels, because you are probably seeing them for the first time. Don't worry. They are simple, and I'll show you every one of those tasks with actual examples. Let's take some data. We will use this data for almost all the examples that we see here: 100, 95, and 50. Those are our three data points. Let us start with position on a common scale, the simplest and easiest to understand perceptual task. And this is why you see those bar charts and column charts so often. Because they are positioned on a common scale, they all start at zero here. It's very simple to decode the length of the bars. Not only that, we have a beautiful axis on the bottom, so it's very simple. Okay, this is example number one. Example number two: the same data, but I encoded the data into rectangles with little dots inside them. The higher the dot, the bigger the data point. Now, the problem here is that we have the same data, 100, 95, and 50. You kind of see the difference, but it's very difficult to decode. It would be a bit simpler if I put them on a common scale. Almost there. We are almost at a common scale, but they are still spread apart and it's difficult to judge. If I position them together, it's much simpler and easier to understand and see. And this is why you always need to try to reach for the highest possible perceptual task. You want to use task number one whenever possible, instead of task number two, three, four, five, or six, for that reason. Here we have position on a common scale, and it is perfectly understandable what is what. Now, the same data presented on a bar chart would also be very understandable. Okay, let me move forward. What if we have a stacked bar chart? The first part starts, again, aligned at the same zero point. But the light purple bars start here. This one starts here and this one starts here. Can you tell me which one is the longest?
You probably can't, because they are all the same. All represent a value of 74, but they are unaligned. They don't have a common scale. So this is where you have to be very careful when using stacked bar charts. Okay, let's move forward so this lecture doesn't get too long. The same goes for length. Basically, when you use bar charts or different types of charts, you often decode length, but the same principle applies. If you have lengths dispersed on a screen, they are very difficult to judge. It would be much easier if you put them on a common scale. And now, no matter the design, it will be much easier to judge what is longer and what is shorter. So the common scale is really a lifesaver here. Now something difficult: angle. Angle is very difficult for our brain. This is why you have to avoid the pie chart as much as possible. Can you judge what this angle is? What is the second angle? You can see it's less than 90, but is it 89? Is it 88? And now the third. Can you tell me how much it is? Is it 120, 130, 140? It's 125, and this is the exact reason why pie charts are so difficult. Now we have the same data that we had before, 100, 95, and 50, encoded inside of a pie chart. So it equals roughly 41%, 39%, and 20%. While you can see those two big slices, if you didn't have the data here, could you really tell exactly what is what? Now the same encoded on a bar chart. Look how simple it is, and it also respects the data-ink ratio, because a pie chart itself is a lot of pixels, a lot of information that your brain tries to decode. A bar chart is simply a superior choice, and in that case, you simply should use it. Okay, let us continue with the next perceptual tasks in the next lecture.
8. 01.07 - Perceptual Tasks #2: In this lecture, we continue with our perceptual tasks, so we learn what is simpler and what is more difficult to understand. Here, we have circles. A and B look identical. The 5% difference has completely disappeared visually.
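The pie-chart shares quoted in the Perceptual Tasks lecture (roughly 41, 39, and 20 for the values 100, 95, and 50) can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative, and the slice angles are an addition that the lecture does not state.

```python
values = [100, 95, 50]  # the three data points used throughout the lecture
total = sum(values)

# Each slice's share of the whole, as a percentage and as an angle in degrees.
shares = [value / total * 100 for value in values]
angles = [value / total * 360 for value in values]

for value, share, angle in zip(values, shares, angles):
    print(f"value {value:>3}: {share:5.1f}% of the pie, {angle:6.1f} degrees")
```

The two largest slices differ by only about two percentage points, or roughly seven degrees, which is hard to see as an angle but easy to see as bar length on a common scale.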
Maybe you can see that B is a tiny bit smaller. Area is very difficult for our brain to decode. We have lost all precision here and we cannot decode the values without having them written down. If I put one over the other, yes, we can now see it, but can you tell? Is it 5%, 6%, 7%, or 3% smaller? And if I put this here, do you know how much smaller it is? It is actually 50% smaller, because I'm representing the same data. But instead of using charts, I'm using area circles. Even worse, we have now reached the territory of chart junk. And yes, chart junk is an actual term. When we move from 2D shapes to volume and curvature, we are asking the human brain to do 3D calculus just to understand, for example, a sales figure or any type of data. Take a look at the cubes. Because they are tilted in space, your eye can no longer find the flat line to measure. Is cube B 95% of cube A or is it 80% of cube A? What do you think? You cannot really say, because the depth itself confuses your perception. In professional data visualization, 3D is almost always a mistake because it prioritizes decoration over documentation. Take a look. Now, it is so difficult to judge. See, even worse. Can you really tell that this is half the size of it? You can approximately eyeball it, but you cannot tell me that you see it precisely. Now, the very same data presented on a flat surface. It's still not perfect because it's area. If I had the possibility, I would try to move up in the accuracy hierarchy. Okay, let's move forward. This is not to tell you that the bar chart is the best and that you need to use it always, for every single case and scenario. But if the data allows it and the perceptual tasks do not restrict it, then you should. The same goes for this pie chart: part B now looks much bigger than part A because it is closer to us. So be very careful when using 3D. Now something different: curvature.
This gauge is a favorite in executive dashboards, and I have to say, gauges look cool, but look at the needle itself. The scale the needle is moving on is bent into a curve. So your brain has to work twice as hard to figure out the exact value. Of course, you probably have a car, so we are used to it. It's very easy for us to decode. But because the needle is just slightly above 40, can you tell me what the exact value here is? You will probably assume it is 41 or maybe 40.5. It is hard to tell on an arc. This is precisely why you want to avoid donut charts and pie charts altogether. Because here, if you put it on a simple bar, you can precisely see it's 40.5. Okay. Let's go into the more difficult category: color and shading. I'll put these in one group as number six. Our eyes cannot reliably and consistently quantify color in the same way. Not to mention color vision impairments such as deuteranopia or protanopia. On Windows, you have the shortcut Ctrl + Windows key + C, with which you can turn color filters for different vision impairments on and off, and this is very important, especially as a data visualization specialist. If you work with color, you need to take into account that people with vision impairments might have difficulties reading this. But going back to the topic, I will put two arrows here. Do you see these colors as being the same or not? You will tell me they are not the same. Those colors are the same. Maps, colors, heat maps, and so on are, of course, usable in data visualization, but be very mindful. It might sometimes get difficult, especially if there is another, very similar color next to it. Here we have a table. The darker the color, the greater the value. And here, in contrast, I think it's a beautiful usage of color to enhance the message of a simple table. The table, no matter its topic, is very easy to understand because of color. And here, the color did enhance the message, and I love this.
Here we have population change over time by region, and this is also beautiful, because we see that in Hawaii, for example, the population has not changed since 1920, and for South Atlantic, the population has increased, and color depicts it beautifully. Here is another perceptual task: shading. Shading is the most difficult. Look at the boxes in the middle. Can you tell me which box is the lightest and which the darkest? This is a very popular example when talking about data visualization, and voilà, they are all the same. This is exactly why you need to be very cautious about color, about shading, and about using colorful backgrounds, because colorful backgrounds can distort what you see. If you need to use gray shading, for example, do it like that. A chart like that screams: hey, look at bar number one, look at 4.3. But a chart like this one doesn't give you that information. You use colors just to use colors, not to guide the viewer or to encode information. Pretty, yes, but it gives me no tangible information. And without you as the presenter, I wouldn't know what you want to say with this chart. To recap, as you move down the hierarchy, you are trading accuracy for aesthetics or overview. This is a cheat sheet for the accuracy hierarchy. We will use this, and what we've learned here, across all the charts you draw from now on. Please memorize this or give it a good look to start engraving it in your brain, and it will make such a difference in your data visualization story.
9. 01.08 - Remember this: Let's make a quick recap of what to remember if you want to create charts that communicate properly. Take a look at the orientation. Orientation will naturally guide the viewer. A portrait orientation will tell the viewer: hey, read this chart top to bottom or bottom to top. A landscape orientation will tell me to read the chart left to right.
And a square orientation gives freedom, but depending on the data, you will give hints to the viewer about how to read it. Size your chart properly. As a starting point, the ratio of 1 to 1.6 is a perfect size, but it all depends on your design. On a 16 by 9 screen, as you can see, two charts fit perfectly. If you want to fit three charts, you will most likely have to go for squares. And if you want to have more charts, then be careful. It might get a little crowded. When using colors in presentations, use them intentionally. Don't make them random. Also remember that colors like green are naturally interpreted by viewers as positive, and colors like red, especially if the two are drawn together, symbolize negative values. You are free to use your own color scheme as long as it isn't a rainbow, it makes sense, and it doesn't put additional cognitive load on the viewer. Here, I would most likely talk about one data point, so I would give it a separate color. I would maybe enhance my message with an annotation. You don't always have to use annotations, but in that situation, this would be okay. When it comes to the axes on the bottom and on the left side: 1K, 2K, 3K, but of what? Potatoes, percentage, revenue in millions? You always need to label properly. If you've made or read any business presentation, you clearly know that there is always proper labeling of what is being displayed. If you have any kind of estimates, communicate them, for example, like that: put an E on the bottom, use a different color, make it transparent, give it a dashed line. Now the user would clearly see that quarters two, three, and four are only estimates. There is a general rule to use transparency and some dashes because this would also be perfectly understandable when printed out and when presented in grayscale. Thank you very much. Let us continue now.

10. 02.01 - Proportion: Here, I would like to talk about proportion and scale. Let's go over different examples.
The default way to display charts is landscape because it's natural for our eyes to flow from left to right. This is how you most often see charts. You may also see them in a vertical format. It looks acceptable here, but it's very cramped. I would most likely go for the same data, but in a bar chart like here. This is vertical, but we, of course, can also have charts that are displayed in a square format. Let us look at this in some examples. Orientation suggests the natural reading flow. Here we have reading from top to bottom or from bottom to top. Here we have from left to right, and here we could read both ways, so you need to decide. When creating and orienting those charts, always be mindful of how people will read them. How big do I make my chart? How long, how tall? You could start out with a golden ratio of 1 to 1.6. Of course, this is not always the case, but it is simply a ratio that works well for any design you create. Let's see that now in some examples. Here's a slide from an investor relations presentation, and the most important thing I wanted to highlight here is that this slide uses two different charts, one on the left and one on the right. On the top side, we have some key takeaways, and below we have the supporting data for them, which are those charts. If you take a look at those charts and I put the golden ratio on top of them, you can see we are approximately right here. This is not rocket science, but the designer deliberately wanted to fit two charts here, and he also tried to make them visually pleasing. However, don't try to force it. Here's another slide from the same presentation. But here we have two charts and one informational box. If we put the golden ratio on top of this chart, you can clearly see it doesn't fit, of course, because this is a square, this is a square, and this is a square. That's completely fine because the designer wanted to show three different insights here.
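As a rough illustration of the 1 to 1.6 starting point, the sizing arithmetic can be sketched in a few lines of Python. The function names and the slide measured in "units" are assumptions made for this example, not something shown in the lesson.

```python
GOLDEN_RATIO = 1.6  # the rounded 1 : 1.6 ratio mentioned in the lesson


def chart_size(width, ratio=GOLDEN_RATIO):
    """Return (width, height) for a landscape chart at the given ratio."""
    return width, width / ratio


def charts_in_a_row(slide_width, count, ratio=GOLDEN_RATIO):
    """Size `count` equal charts sharing the slide width in one row.

    Hypothetical helper: it ignores margins and gaps between charts.
    """
    return chart_size(slide_width / count, ratio)


# A 16 x 9 slide measured in arbitrary units (an assumption for this sketch).
for count in (2, 3):
    width, height = charts_in_a_row(16, count)
    print(f"{count} charts: {width:.2f} wide, {height:.2f} tall")
```

With two charts, each one keeps the landscape ratio and still uses a good share of the slide height. With three, each chart becomes rather small at that ratio, which is one way to see why squares are often the practical choice there.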
To recap this entire lecture, the most important thing: you may use the golden ratio as a starting point, but don't obsess over it. The orientation of your chart will influence the direction in which your data is read and compared.

11. 02.02 - Color selection: Let's talk about color in data visualization. Color is a powerful tool, but it is also the easiest way to confuse your audience, and from a designer's perspective, it is the easiest thing to change within a chart. To choose the right palette, you first have to ask, what is the relationship between my data points? We have three most common categories when using color. We have categorical, meaning different colors. We have sequential. Those palettes are used when data goes from low to high. And third, diverging. Those are used when your data has a meaningful middle or zero point: use two different colors that meet in a neutral center. Let's go over some examples. Categorical: there's not much to talk about other than that different categories can have different colors from your color palette. Each distinct category has a different color. There is no sequence and no meaning behind it. You need to be deliberate with that, and the viewer needs to understand that there is no relationship between the colors here, only between the data. As an example, I have a slide from Visa's investor relations presentation, and here they simply use their brand colors, which are blue for their main logo and yellow, the supporting color. Here they display two different quarters, and they use two different colors, simply categorical, because these are the colors that they use in their presentations. We move on to sequential palettes. This is all about intensity. Because these tiers have a natural order, from a new sign-up to an elite member, the color follows the same path. Darker colors feel heavier to the human eye, so by making elite members the darkest, the audience knows that this group has the most weight and value.
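The difference between a sequential and a diverging palette can be sketched with plain linear interpolation between RGB colors. This is only a minimal sketch: the helper names, the concrete colors, and the tier count are assumptions for illustration, and real palettes are usually designed in a perceptual color space rather than raw RGB.

```python
def lerp_color(start, end, t):
    """Linearly interpolate between two RGB tuples; t runs from 0.0 to 1.0."""
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))


def to_hex(rgb):
    """Format an (r, g, b) tuple as a hex color string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def sequential_palette(light, dark, steps):
    """Ordered shades from light (low values) to dark (high values)."""
    if steps == 1:
        return [to_hex(dark)]
    return [to_hex(lerp_color(light, dark, i / (steps - 1))) for i in range(steps)]


def diverging_color(value, limit, negative, neutral, positive):
    """Color for a value in [-limit, limit]; fades to neutral near zero."""
    t = max(-1.0, min(1.0, value / limit))  # clamp to the palette range
    target = positive if t >= 0 else negative
    return to_hex(lerp_color(neutral, target, abs(t)))


# Hypothetical colors: four membership tiers, light to dark purple.
tiers = sequential_palette((230, 220, 245), (60, 20, 110), 4)
print(tiers)

# Hypothetical profit/loss scale: red for losses, green for gains.
RED, NEUTRAL, GREEN = (200, 30, 30), (240, 240, 240), (30, 150, 60)
for change in (-10, -2, 0, 5):
    print(change, diverging_color(change, 10, RED, NEUTRAL, GREEN))
```

The sequential function has one direction only (low to high), while the diverging function needs a neutral midpoint and two end colors, which mirrors the distinction made in the lesson.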
Okay, here is a different example, and it goes from this bright green into a darker blue. From 0 to 8 inches, it goes from green to this dark blue. Okay, now diverging palettes. Those are used when you have a critical middle or zero point. Here, for zero or the neutral value, we have a very light color. For everything positive, we have green, and everything negative, or a loss, is red. Those are the two extremes here. The closer the city is to zero, the more the color fades to a neutral color. Deep red, of course, is the most scary here. The most classic example of them all will be, of course, temperatures. We have a clear middle point at 50 degrees Fahrenheit. Everything below is marked in blue and darker blue. Everything above is marked in deeper and deeper red. Just note that this color scale is a continuous gradient. That is all about color selection.

12. 02.03 - Accessibility: Here, I would like to talk about color misuse, accessibility, and not confusing people with the colors that we use in our data visualization. This is a simple example. Just because you have 1 million colors in your software doesn't mean you should use them. Of course, if you have a beautiful color palette that's in sync with your brand and with your presentation in general, go for it. Like I did here for the entire presentation: I'm using a purple color and a purple color scheme when I'm teaching you this lecture. Okay, but don't just use colors for fun, like the different colors here, because the viewer's eye will try to decode them. Okay, we have different quarters, we have different revenue. Why is something red? Why is something blue? Why is something orange? Does this have a meaning? I'm wasting time decoding the colors. If I would like to talk about one quarter, why don't I give it a separate color? If I want to talk about all the data, why don't I make it a consistent color across the board? However, be very mindful to also be color friendly with your designs.
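One simple way to check whether a set of colors stays distinguishable in grayscale is to compare their approximate brightness. The sketch below uses the common Rec. 601 luma weights; the helper names, the example colors, and the threshold of 25 brightness levels are assumptions chosen for illustration, not values from the lesson.

```python
def hex_to_rgb(color):
    """Convert '#rrggbb' to an (r, g, b) tuple of integers."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def gray_level(color):
    """Approximate grayscale brightness (0-255) using Rec. 601 luma weights."""
    r, g, b = hex_to_rgb(color)
    return 0.299 * r + 0.587 * g + 0.114 * b


def too_similar_in_gray(colors, min_gap=25):
    """Return neighboring pairs (sorted by brightness) closer than min_gap.

    min_gap is an arbitrary threshold picked for this sketch.
    """
    levels = sorted((gray_level(c), c) for c in colors)
    return [(a[1], b[1]) for a, b in zip(levels, levels[1:]) if b[0] - a[0] < min_gap]


# Hypothetical purple palette with two shades that are very close together.
palette = ["#3c146e", "#6a3fa0", "#7044a6", "#cdb7ea"]
print(too_similar_in_gray(palette))
```

This does not simulate deuteranopia, protanopia, or tritanopia. It only approximates the grayscale case, so a proper color filter or simulator is still worth using on the finished chart.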
Let's say, here we have purple colors that are very similar to each other. I can open my color filters. I can select deuteranopia, protanopia, or tritanopia, and also grayscale. Why not test whether the colors work well in grayscale? Okay. To make it simpler, I would maybe spread the colors a little apart. Another example. Here we have a bubble chart with marketing reach against investments, and the bigger the bubble, the greater the investment must be to achieve those results. Now, why is Tokyo green and why is London pink? Is there any reasoning behind it? Or did I just use colors to use colors? Yes, I just used colors for fun, but I gave your brain another task: to decode colors. And if you remember perceptual tasks, color is on the very low side of that scale. The easier your chart is to read and the higher we can go on the perceptual task scale, the better. I would choose a consistent color. I would maybe gray out the background lines to make everything stand out a little bit more. This is what I want you to remember when thinking about color choices.

13. 02.04 - Annotations: Let us talk about additional annotations on our chart designs. Let us look at a chart that shows the impact of social media campaigns. Okay? We have the chart. This chart shows what happened. However, an annotation would explain why it happened. So what happened in week number three and week number six? Are those spikes just random data points? Your audience will either guess what happened or ignore it. Unless you are speaking to them, you can help yourself with annotations. Here we have a viral TikTok mention, and here we had an influencer partnership. Memorize this: annotations can be your additional voice on a slide when you aren't speaking. Here is a way to communicate forecasts: we have a dashed line that clearly shows that this is just a trajectory based on the previous estimates.
If you are using annotations, you can, for example, go for a simple text speech bubble and plot it above the chart itself. This gives beautiful insight. Here is a very, very exaggerated example of what I've just said. Let's say that I want to tell you just one story from this slide. This is an overuse of annotations, but it is still proper usage of annotations. Of course, I would prefer there to be fewer, but the intention of this chart was to give a wide overview of everything important that happened. From the beginning, let's say, for example, here, in 2025, we had tariffs imposed by Donald Trump, and this caused the stock market to go rapidly down. If I didn't use annotations for myself right now, such as arrows, you wouldn't really see what's happening here because there is so much information. Those tariffs were paused for 90 days, and this created a sharp rebound in the stock market. With those yellow arrows here, I'm helping myself. Okay, here's a better example, and I think this is beautiful chart design. No matter the data here, we have beautiful annotations showing only the most important points. Then we have beautiful usage of arrows to point to those annotations, meaning those are the important points on this entire chart. At the end, we have an insight that we communicate through this entire chart, and this insight also has an arrow pointing to an annotated data point at the end of the chart. I could show you different examples. Here, it's something simpler. We simply have year-over-year growth denoted by this dotted line and the information in the middle. We also have the same information between different quarters of a year. I think this is also beautiful usage of annotations. Here is another presentation, from Verizon. Here we have growth of 2.1% over a year, and we don't have to calculate this manually because someone gave us those beautiful annotations. This is proper usage of annotation.
It enhances the message. Not only charts can have annotations. Here is another presentation where, on the left, we have a table, and on the right, we have some insights, and there's an arrow between them. Arrows are among the best ways of giving annotations. This arrow instantly screams to me, Hey, this bottom left side of the table, this is the insight from it. And this is exactly how you should treat it and how you should work with annotations within your visualizations.

14. 02.05 - Labels: Here, I would like to talk about labels and axes, something extremely important. Now, remember always to label without clutter. We often over-label our charts because we are afraid the audience will miss something. But that is what you are here for. Take a look at the bottom, how many percentages there are. Let's reduce them. Okay? Now we have fewer percentages. We could also put them directly on the chart itself to clean it up a little. Okay, now the percentages are directly on the chart. But as you can see, on the right side, we have a legend called requested feature. On the left side, we also have requested feature, and then we actually have the feature names. I think everyone would understand that we are talking about features. I would remove the legend itself, and probably in this case, if it's not that important and it is obvious that those are features, I would also remove the requested feature label on the left side. This chart would be much easier to understand. Now, let us go over another topic: when to show the exact values. Do you have to label every single data point? Usually, the answer is no, depending on what you want to show. If your goal is to show a trend, let the line do the talking. For example, you could label only the most important points, like the start and end, as we did here. By labeling only January and June, you would say to your audience, Hey, look how far we've come.
If you label every month in between, you're giving them a bit more mental math to do. I also think this chart is too wide, so let's narrow it down a little. Of course, putting the data points here would be acceptable, especially if they are important. Be mindful of your labels. Here, I wrote users, and I specifically said in thousands, because on the left side, we have an axis. On the bottom, you can clearly see those are months. I don't need to write months or date below it. I think this is pretty understandable, but on the left side, you wouldn't really know unless I specifically say it. Now let's go back to the example we had before. Previously, I was talking about the arrows and the beautiful annotations. But did you see the left axis, the 100, 200, 300, 400, 500K? The designer didn't repeat K, meaning thousands, for every single data point. He only put the K once at the top to communicate that those are values in thousands. I think this is a beautiful design choice. As for the dates, the timeline wasn't as important here. He wanted to show ten years, but the most important message was on the chart itself, so he didn't bother to put many years here, only three different years. I really love this because this was the key message that everyone should be drawn to. Here is a different example of beautiful labeling and of actually focusing on a part of the chart. Here, with a blue line on the main chart, a part of the chart is selected, a 5.7% increase. Then on the right side, there is a separate chart that showcases this change in more detail. I really like the design choice here, and the blue color beautifully communicates what is what.

15. 02.06 - Estimates: It's very important to be capable of showing forecasts and estimates on your charts. I'll show you a couple of ways to do this, and we'll also go over examples. How do we show growth that hasn't happened yet? You should never use a solid line for a forecast.
It implies the same level of certainty as your historical data. Here in this example, I switched to a dotted line. Of course, on the bottom, if you look at the axis, on the left side, we have actual data, and at the very right side, 2055, we have estimates. This is also clear communication of which part is the estimate. You could also use a dark color, and you could even add an annotation for the estimated 10% growth. This is one of the examples where you could use both annotations and lines. Another way of showing forecasts would be giving a transparent pattern to one of the columns in your bar chart or any given chart. There are different ways to do this. For example, you could make the column transparent, but if you only use transparency, someone might read it as just one data point next to another. So I highly recommend that you reinforce this with dashed lines over it. There is a general consensus in data visualization that forecasts should use dashed lines. Here is a practical example. Here are some estimates from different companies of where the stock market might go in the future. What I like here is that we have several companies with their colors and branding used, and we have the most positive case, the bull case, and the most negative, the bear case, signaled with a red color at the top and at the bottom. Perhaps the designer didn't want to use a green color for the highest rating because this would create too many colors. We have a blue line, we have gray lines, we have a red line, so adding another color here might be a bit too much. I think the design choices are perfect here. If you notice, the lines are dashed and there's a dot at the very end. Then we have the companies displayed. Here's another way estimates were shown, with a dashed outline that is empty inside because this quarter or this part of the year hasn't happened yet. I think this is another beautiful way to showcase it.
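The convention described here (solid for actual data, dashed and lighter for estimates, plus an E on the axis label) can be sketched as a small data-preparation step. The function names and the numbers are hypothetical. The style keys `linestyle` and `alpha` are chosen to resemble common plotting keyword arguments, but nothing is drawn in this sketch.

```python
def split_actual_and_forecast(labels, values, first_forecast_index):
    """Split a series into a solid 'actual' part and a dashed 'forecast' part.

    The forecast part starts at the last actual point so that the two
    line segments connect when they are drawn.
    """
    if not 0 < first_forecast_index < len(values):
        raise ValueError("need at least one actual and one forecast point")
    actual = {
        "x": labels[:first_forecast_index],
        "y": values[:first_forecast_index],
        "style": {"linestyle": "-", "alpha": 1.0},
    }
    forecast = {
        "x": labels[first_forecast_index - 1:],
        "y": values[first_forecast_index - 1:],
        "style": {"linestyle": "--", "alpha": 0.6},  # dashed and lighter
    }
    return actual, forecast


def mark_estimates(labels, first_forecast_index):
    """Append 'E' to the axis labels of estimated periods."""
    return [
        label + "E" if i >= first_forecast_index else label
        for i, label in enumerate(labels)
    ]


# Hypothetical quarterly revenue where only Q1 is actual data.
quarters = ["Q1", "Q2", "Q3", "Q4"]
revenue = [10, 12, 13, 15]
actual, forecast = split_actual_and_forecast(quarters, revenue, 1)
print(mark_estimates(quarters, 1))
print(actual["style"], forecast["style"])
```

Combining a dashed line with reduced opacity means the estimate still reads as an estimate in grayscale or in print, where a color change alone could be lost.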
Here we have the entire bar and some estimates on it. However, always be careful when looking at that. Here we have different data. This is biofuel energy production, and there is a map attached to it. We have dashed lines across half of the globe. So is this data or isn't it? Take a look at the legend underneath. If you click on some of the countries, you can see on the legend at the bottom that there is simply no data. This is signaled here in the legend, but someone might initially think, Hey, this is a forecast, but why is there no color? There is simply no data for these countries in this dataset, and this is clearly shown in the labeling below.

16. 02.07 - Decluttering: Let's talk about decluttering your charts. Decluttering is simply the process of identifying and removing non-informative visual elements to reduce cognitive load and highlight the actual data. Edward Tufte, whom we have already talked about, described this as chartjunk, meaning things that don't represent data. All the software that we use has so many fancy things we can add to charts. Why do we use them? Sometimes they are simply not useful. On a chart like that, I would definitely remove the background. I would definitely decrease the intensity of those gridlines. Do you see the shadows on the bottom? I would remove the shadows. Why do we need them? I would remove the 3D rotation altogether to clearly show the data to you. Now, I would remove every non-essential thing to show you only the data that I want. How could you approach this on, for example, a line chart? We have here a complete spaghetti chart. First, I would increase the size and tone down the colors of everything that I don't want to show. Let's say that you want to show only two lines. This would be my starting point. Now I would make those lines thinner. Perfect. Now I can clearly see the purple line and the blue line. Now, look at the bottom. I would remove the legend from there and put it on the right side so I have more space.
But if I look at the bottom, how horrible are those years to look at? Okay, let's start working on them. What I could do to declutter this entire chart would be maybe positioning them like that, but you still have to tilt your head, and there's so much redundant information. The year is always displayed four times. What if we didn't show the year and only showed the quarters? Then I would have free space to beautifully add the years below. Do you notice the difference? Now we have a clear chart divided into three different years. Did you see this in the first slide? I bet you didn't because it was so difficult to see anything here. These are simple techniques. Now I would tone down the color. Maybe I would intensify the color of the part that I want to talk about. Maybe I would add some data. Now I have room to work with. Those are techniques that you can use within your charts to declutter them. Another technique is using whitespace. Let's take a look at this chart. This is a scatter plot, and here we barely see the dots. Let's first remove the gridlines. Let's now shorten the names. We have European something, Latin American office. Let's go with EU, NA, LatAm, and APAC. Now the dots: do we need that many colors? Let's reduce the colors to gray, and make the data point that I want to show you blue. If I want to talk about Europe and some kind of data, I would just highlight this one. Now I would remove the big black border, and we are left with a clean, understandable, and beautiful chart that was decluttered.

17. 02.08 - Remember this: Let's make a quick recap. What should you remember if you want to create charts that communicate properly? Take a look at the orientation. Orientation will naturally guide the viewer. That kind of orientation will tell the viewer, Hey, watch this chart top to bottom or bottom to top. A landscape orientation will tell me to watch the chart left to right.
And a square orientation gives freedom, but depending on the data, you will give hints to the viewer about how to read it. Size your chart properly. As a starting point, the ratio of 1 to 1.6 is a perfect size, but it all depends on your design. On a 16 by 9 screen, as you can see, two charts fit perfectly. If you want to fit three charts, you will most likely have to go for squares. And if you want to have more charts, then be careful. It might get a little crowded. When using colors in presentations, use them intentionally. Don't make them random. Also remember that colors like green are naturally interpreted by viewers as positive, and colors like red, especially if the two are drawn together, symbolize negative values. You are free to use your own color scheme as long as it isn't a rainbow, it makes sense, and it doesn't put additional cognitive load on the viewer. Here, I would most likely talk about one data point, so I would give it a separate color. I would maybe enhance my message with an annotation. You don't always have to use annotations, but in that situation, this would be okay. When it comes to the axes on the bottom and on the left side: 1K, 2K, 3K, but of what? Potatoes, percentage, revenue in millions? You always need to label properly. If you've made or read any business presentation, you clearly know that there is always proper labeling of what is being displayed. If you have any kind of estimates, communicate them, for example, like that: put an E on the bottom, use a different color, make it transparent, give it a dashed line. Now the user would clearly see that quarters two, three, and four are only estimates. There is a general rule to use transparency and some dashes because this would also be perfectly understandable when printed out and when presented in grayscale. Thank you very much. Let us continue now.

18. 03.01 - Framing: Here, let us talk about data storytelling, how to frame your insight, and why it is so important.
Let me give you a very brief definition. Data storytelling is the process of translating data into a narrative that guides an audience to a specific conclusion or action. What do you want from your data storytelling? You want it to have a narrative. You want it to have a conclusion. You want it to have a hero's story that someone follows and, through it, understands your chart, not the other way around. Here is a simple example. It's a histogram representing how long it usually takes to deliver a package. Let's say your boss wants to know how we are doing. Most packages, as you can see, are delivered within one to three days, but there are some packages that take five, six, or seven days. There are even packages that take nine or ten days. There are a couple of data points in those buckets. If you told your boss, we should only focus on the things that we do well, we shouldn't be looking at the outliers, well, you would basically be lying with your data. Let's be honest here. If I just drew the conclusion, our average delivery time is three days, well, that wouldn't be very fair. It would be better if I said, one in ten orders takes double our promised time. Let's work on that. Let's see what happened with those packages. With the first framing, we would just say, everything is okay, we are fast. But the second message would tell you, boss, we are inconsistent. Let's work on that. So the way you frame your titles and showcase your chart influences how they are perceived. Again, another way of framing this: most orders arrive on time. What is the conclusion here? Well, it is true, but it doesn't really tell the whole honest story. If I said, our outliers are driving 80% of negative reviews, we would have something actionable to work on. You need to be very nimble and aware of what you want to frame, what you want to tell, and what you want to showcase.
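The two framings of the delivery histogram come from the same data, and the contrast is easy to reproduce. In this sketch the delivery times and the promised time of three days are invented for illustration (the lesson does not state them); they are chosen so that the average is three days while one order in ten takes at least double the promise.

```python
from statistics import mean


def delivery_summary(days, promised_days):
    """Summarize delivery times two ways: the average, and the late tail."""
    late = [d for d in days if d >= 2 * promised_days]
    return {
        "average_days": mean(days),
        "share_double_promised": len(late) / len(days),
    }


# Hypothetical delivery times in days for ten orders.
delivery_days = [1, 1, 2, 2, 2, 3, 3, 3, 3, 10]
summary = delivery_summary(delivery_days, promised_days=3)

print(f"Average delivery time: {summary['average_days']} days")
print(f"Orders taking at least double the promise: {summary['share_double_promised']:.0%}")
```

The first number supports the title "we are fast"; the second supports "we are inconsistent". Both are true, so the choice of title decides which story the audience takes away.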
Here's a different chart, a waterfall chart telling us where the cash went, spending against earnings. We had 500,000 in January and 520,000 in December. We invested heavily in new hires and R&D, and thanks to that, we could achieve 250,000 in new sales. But the story would be lost if we just showed January and December. Now, how should we frame some insights? For example, you shouldn't use a title like cash flow variance by quarter. No insight, no story. No conclusion comes from such a title. If I said, our $200,000 investment in R&D and new hires unlocked those sales, I would be giving you an actionable title that you can follow and use to understand the chart. So here are the key takeaways when framing your insight for data storytelling. Headline against data. Never name a chart by its data, name it by its conclusion. In every single business presentation you see, you'll have action titles, titles that explain the chart, not titles that just state what is on the chart. The so-what test. If the viewer can't find the point in 5 seconds, the frame is too weak. Please consider that when you create your charts. Context is king. Never show a number alone. Compare it to a goal, to a peer, or to the past. This makes it much easier to comprehend. And identify the hero or, on the contrary, when you want to show negative data, identify the villain. Every chart has a main character: a bar, a line, a bin, or an annotation that you inform with. Highlight that specific conclusion. Thank you for listening. Let's move forward.

19. 03.02 - Narrative: Here we will be talking about structuring a narrative with visuals. Let us go over this. We should lead with the key takeaway right away. This is proper communication: giving the information and the insight right away so everyone can understand the chart. Let's look at this chart. This is warehouse efficiency. Look at this title, the title alone. It is way too passive. It tells me what the data is, but not why it matters.
It's a folder name, not a story. Would you draw a conclusion from just having this operating cost per unit by location? It gives you some information. But what is the story here? What do I want to show with this chart? Do I want to show just the data? Then I would be just as well off showing you a table. Let's go forward. Regional warehouse performance summary. Still not good. What do I mean by performance? Do I mean speed, safety, cost, size? What do I mean here? There's no actual insight in the title. Now, let's go forward. What if I wrote, Cost of shipping a box of oranges: San Francisco costs $82 per unit, while Dallas costs $38? What's going on here? Why is sending a box in Phoenix that much cheaper? This is the cost of processing. That means the labor, electricity, and rent required to take one box of oranges off a truck, scan it, and put it on a shelf. If you want to show a solution, you first have to show a problem, and I'm showing you the problem right here. The villains here are the expensive cities that are way, way more expensive than, for example, Phoenix, Austin, or Dallas. When you look at this chart, your brain shouldn't be looking at 22 cities. It should be looking at the big gap in price in the middle. And the most important part here is the title that gives the insight, because if I hid the chart from you, just by reading the title, you would kind of understand where I'm going with the data that I'll be showing below. I will be comparing costs and I will be thinking together with you, how can we reduce one to get it to the lower level? It would be even better if we, for example, added annotations here. Why is the gap so big? I, as the presenter, need to know what I want to say and make simple graphics. Big exclamation mark, big red arrows. Now you can clearly see that I want to compare those values. Let's move it one level higher. I'd be guiding the viewer's eye because, okay, the title is a little bit more interesting.
13 regional hubs have officially exceeded the $50 max cost target. We are a company. I have a max target of $50. I will mark that right here on the chart, and now you can clearly see that plenty of cities go over our price target. Now we could start discussing it. I'm calling this the cost target, and now even someone who doesn't know what this is about clearly sees, Hey, something is to the right of this line and something is to the left of this line. Okay. We should maybe be using color to enhance the message, or, for example, framing this depending on how we want to explain it, but always remember you have full control over the simple annotations that you give while presenting live. Of course, we should be doing a bit less framing if this were printed out, but you get the idea. Okay, let's try to, for example, divide this. You could use two different colors: this color would show where we have to fix the cost, and the gray ones are the okay ones. We could add a line. We could add a green zone. The green zone is our target, where we want to be. The red zone would be our anti-target, where we want to change. Of course, this would be going a little too far. I'm just showing you different possibilities for what you can do on a chart to enhance the message you are trying to convey. The title is extremely important for giving the insight right away. Remember, if your article or publication has 20 different charts, no one will look at them precisely. You want people to understand your point by just reading the title and having a quick glimpse at the chart. You can also use additional charts to build logic, like we have in this example here. Here I have a longer chart, and at the end, there is a specific part which the designer wanted to highlight, and he made a second chart showcasing the minor differences and the details from that little part. I think this is one of the most beautiful data visualization designs I've ever seen.
This is why I love to share this example, and I have it stored on my PC. Thank you very much for listening. Let us continue.

20. 03.03 - Dashboards: Here, let us talk about dashboards against single charts, or multiple charts against single charts. Here, we have one big chart. If you need to prove a specific point and you brute-force the data on the viewer, this would be perfect. One big chart showing it. 24 weeks of data. It proves there is no luck involved. There is a trend, a pattern. I think we should even use a line chart here. It would be more appropriate because we are using a lot of ink to show all the bars, while the data that matters is only the percentage. Okay, let's excuse this. You get the idea. Now a different type of constellation of charts. Here we have some kind of dashboard. Let's say this is a daily performance dashboard. We have calorie division, we have progress, our steps, our calories, and we have the expenses that we make. Three different categories, seemingly connected to each other, but depending on how we place them and how big we make them, we switch the focus of the user, and we are giving a bird's-eye overview of the data, not focusing on one particular thing. The advantage here is that multiple charts next to each other try to showcase some kind of correlation between them. For example, you can see how takeout expenses or restaurant expenses increase your carb intake and make you take fewer steps. But this is a thought process someone needs to go over very carefully. The problem with multiple charts next to each other, or with dashboards, is that they dilute the focus. There is a lot of noise here. You can use this kind of setup for general information and general awareness, but you want to use single charts for specific insights and specific actions. Here, you might possibly be hiding a bad bank balance behind a great step count. So you are spending too much, but it's okay because you are taking a lot of steps.
This may be an exaggeration, but that's the idea when our brain sees multiple charts on one screen or on one slide. Of course, there are situations where this makes sense. If you had several countries and several charts displaying some kind of data point in those countries, it would be easier to compare. But here we don't have a common scale. Here we have the pie chart. We have a radial chart that is only decoration. It would be much simpler to see this in a bar chart. But okay, let's say a circle looks nice, so someone used it, and on the right, we have another type of chart. There is a lot of cognitive load for us. So be careful. When you are storytelling with data, you want to know what type of insight you would like to form by showcasing several charts next to each other. 21. 04.01 - Distortion: Misleading data and visual distortion. A very important topic here will be graphical distortion. Take a look at this chart. If you set your Y axis to start at 90 instead of zero, the difference between, for example, this small bar and a bigger bar like here appears as if it were three times as big or even four times as big. But for a bar chart, the baseline has to be zero. Cropping the axis is the most common way to lie to the audience. If we start this axis at zero, look what happens to the data. Now it is a real representation of the data, and now it's a real comparison. We've seen this countless times. This is a very popular example, but I'm sure that you've seen on TV, in politics, someone trying to skew the data in his favor. Here, we clearly have only a 4% difference, but look how much bigger this bar is; it appears as if the value were much, much bigger. Here is a different example from during the pandemic. This example is even worse because the axes stay the same, but look at the dates below: here, we have a difference of two weeks, and here we have a difference of just a couple of days. If you show this on a real chart, this is the part that is shown on the left side.
Yes, it's pretty, but it skews the data toward the narrative it wanted to show. Moreover, the cases of coronavirus per week in England were, of course, dependent on the number of tests that were made. So this is very tricky data to showcase and to draw plausible conclusions from. Another source of distortion is 3D charts. They introduce perspective bias. When we view objects in 3D, the objects that are further away appear smaller, like you can see in this picture. But our brain tries to calculate the difference between what it sees in the front and what it sees in the back, and this creates not only a lot of problems, but a lot of distortion. The prime example of that is the pie chart. Look at the pie chart when it is tilted in 3D: this part of the pie looks much, much bigger than that part of the pie, while, in reality, the sizes are the same, technically, because both are 20. I think this requires no further information. Here, baseline distortion. Here we have a line chart, and it is the perfect way to showcase this data, depending on what we want to show. But what if you use an area chart? Now you are hiding certain data. Now, can you tell me precisely the purple data? The red covers it, the blue covers it. So the purple data is now imprecise. If we really have to use an area chart, which isn't an advantage here, we should definitely use transparency or maybe move the purple to the front in that particular situation. But if we move purple to the front, then red would be obscured and blue would be obscured. So it's very tricky to showcase an area chart properly. We could display them as a stacked chart, but then again, a stacked chart doesn't have a common baseline; the series don't all start at zero because the red and the blue areas are stacked on top of each other. So this becomes very tricky. If I compare them next to each other, please remember what you want to say. If the individual trends matter, use a line chart.
If only the total sum at the end matters, like here, then a stacked chart would be okay, and it could be an area chart. No problem. 22. 04.02 - Lie Factor: Here, I would like to talk about the lie factor by Edward Tufte. The lie factor is there to find out if a chart is lying. You measure the physical change on the paper, the screen, or the pixels in our case. If the data point that we have grew, for example, by 20%, but the bar or icon grew by 60% while it should be representing 20% growth, the lie factor will be equal to three. You have tripled the truth, so to say. Let me show this on different examples. Here is a core example. We have fuel capacity in liters, we have four different phases, and let's say that we have a fuel capacity of 100, 120, 150, and then 200. It grew two times. Let me remove the middle data. We have point one, 100, and point two, 200. We used an icon that also showcases volume. In reality, this icon is four times as big as the first one, while the data only grew two times. This is clearly an inappropriate usage of icons because icons grow not only in height but also in width. So you are showing volume where you shouldn't. It looks pretty, yes, but it's completely inappropriate. So if you calculate the lie factor, the effect shown in the graphic is four times as big. The effect shown in the data was two times as big. So the lie factor would be the size of the effect in the graphic divided by the size of the effect in the data. 4/2 is two, and this is way too much. When we are talking about the lie factor, specifically, we should be within an acceptable range. The scale of the graphic should always correspond to changes in the data being represented. The graph that I've shown in that example breaks that rule by using area to show one-dimensional data. Okay, a lie factor equal to one means the representation is true to the real quantitative data.
When the lie factor is smaller than one, the data is underrepresented, and when the lie factor is bigger than one, the data is overrepresented compared to the real data. Apologies. This would be a normal, fair chart. Now, this chart, because we made the axis much larger, has a lower lie factor, much too low, because it doesn't represent each point fairly. And here we have an overrepresentation because we have truncated the axis itself, and now the bar appears two times as big, while the data only shows five points of difference. Okay, let us, for example, calculate this one. First, you need to calculate the relative change. So we need to make that simple calculation, and the relative change between the data points, only the data points, is 0.05. Now let's take the pixels that are used to represent the data. The bars. One bar is a little higher, one bar is a little smaller, so we have that amount of pixels. If we calculate the relative change between them, we get a nearly identical value, because there might be one or two pixels on my monitor that weren't able to show this perfectly precisely. If I calculate the lie factor, it's 0.98, so there's absolutely no problem. I'm not lying with the data when showcasing it like that. Here is more of a fun piece of information I found, and I think it's valuable to showcase. Here we have a chart where the amplitude of the temperatures is very high. We have very high highs and very low lows. But if we were just showing the average temperature, well, it is the perfect spot to live. We always have around 60 degrees. Well, this obviously isn't the truth. And here is another fun example: how to make a lot in a year. Acquire at least one wife and 13 children, calculate the per capita income, multiply it by the number of children, and voilà, you are getting rich just by the number of people. This is an exaggeration and just a fun example, but it shows how it is possible to lie with different data.
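As a rough sketch of the lie factor calculation described in this lecture, here is a small Python example. It follows Tufte's definition, the size of the effect shown in the graphic divided by the size of the effect in the data, where each effect is a relative change. The function names and the pixel heights are assumptions made for illustration; they do not come from the slides.

```python
def relative_change(first: float, second: float) -> float:
    """Size of an effect: the change relative to the starting value."""
    return abs(second - first) / abs(first)


def lie_factor(graphic_first: float, graphic_second: float,
               data_first: float, data_second: float) -> float:
    """Effect shown in the graphic divided by the effect in the data."""
    return (relative_change(graphic_first, graphic_second)
            / relative_change(data_first, data_second))


# Data grew by 20%, but the bar grew by 60% (bar heights are hypothetical).
print(round(lie_factor(100, 160, 100, 120), 2))    # 3.0

# Data differs by 0.05 (100 -> 105); assumed bar heights of 1000 px and 1049 px.
print(round(lie_factor(1000, 1049, 100, 105), 2))  # 0.98

# Fuel icon: data 100 -> 200 liters, icon area 1 -> 4 (relative units).
print(round(lie_factor(1, 4, 100, 200), 2))        # 3.0
```

Measured as relative change, the fuel icon comes out at 3 rather than the 2 obtained by dividing the raw multiples (4/2). Either way the value is well above one, so the graphic overstates the data.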
So, especially, keep the lie factor in mind, and remember about misleading people with data in general. 23. 04.03 - Correlation: Let me show you an extremely important topic, meaning correlation and causation in data visualization and the problems you might face when mistaking one for the other. To explain a little, correlation measures the degree to which two variables move together. This indicates statistical dependence without implying a cause-and-effect relationship, so only a correlation. Causation, on the other side, describes a cause-and-effect link where changes in one variable directly produce changes in another through an identifiable mechanism. Establishing causation requires additional evidence beyond just observed correlation. Let me give you an example to explain all of this and why it is so difficult. A fire broke out. You need a certain number of firemen to extinguish the fire. This is obvious. The bigger the fire becomes, most likely, the more firemen are required to put it out. There is a correlation between the size of the fire and the number of firemen. But if you translated exactly the same thing to a cause, it's not exactly like that, because following that logic, you would say that firemen cause fire, or even worse: the more firemen, the bigger the fire. Well, of course, this isn't the case, so this causation would be simply wrong. Statistically speaking, yes, there is a positive correlation between those two variables, but they aren't the cause of one another. So more firemen do not create more fire. This is a spurious correlation. It occurs when two variables appear to be directly related, but a hidden third variable actually influences both. Let's go to another example. This is a very popular example, meaning ice cream sales against shark attacks. The concept here: a spurious correlation happens when those two variables move together almost perfectly, but they really do not have a direct relationship.
Looking at the chart, the lines for ice cream and shark attacks are nearly identical. So you could say eating ice cream causes shark attacks. Or sharks love the taste of people who eat ice cream. But of course, this data is driven by a hidden third factor: summer and high temperature. When it's hot, people buy ice cream. When it's hot, people go swimming in the ocean, and there is a chance of a shark attack, a slim one, but there is one. So those two variables are correlated, but one does not cause the other. There is no causation between them. Another problem here is how I show the data. If I changed one axis or the other, I could very easily showcase that the graphs do not overlap each other as beautifully as we have seen previously. This is everything I wanted to explain about correlation and causation. Be careful of spurious correlations that you will find between two variables when there is a hidden third variable behind them. 24. 04.04 - Data Bias: Let us talk about data bias and data selection problems. Here is a simple example. I will conduct research regarding remote work preference in New York. I need to pick a sample set of people that I will conduct my research on. So I'm selecting a sample from people in New York who are actively working. Then I will get some results and I'll extrapolate them onto the entire population. How good my sample selection is determines how good my research will actually turn out. Now, let's do the same study, but I will only pick people who are already working remotely and maybe already like it. If I selected those people for my study, obviously, this would be selection bias, because those results would probably be much different from those of a random sample of people. This example will not be informative at all. Okay. Let's conduct another study. I want to ask, how do people get to work? And I will ask people, do they prefer to drive a car?
Do they prefer to walk, or do they prefer to ride a bicycle? On the surface, a perfectly equal chance. What if I asked the question at a gas station, where everyone is with their car at 8:00 a.m., when they are driving to work? Obviously, most people would select the car here. So this would be a clear case of selection bias on my part. Selection bias occurs when the data used for a chart does not represent the real-world population it claims to describe. Obviously, you cannot ask all the people in the world when you conduct a study. You have to extrapolate from a sample to the entire population, so your selection is extremely important. There is one other thing in selection: attrition. Let's say I have a 30-day fitness app challenge. I'm starting the challenge, and 1,000 people start my challenge to lose 10 kilograms in 30 days. A program. Okay, on day number one, everyone is motivated, everyone starts. On day number 15, probably fewer people are engaged, and the people who remain are more motivated because they start getting results. We have some people who are happy and will definitely hold out to the end. On day 30, we have barely anyone left. The ones that survived until the end are relatively happy. So this would be my final group. If I displayed the data only for the people that are left here: "our users lose an average of 12 kilograms." Well, for the 20 users who were left at the end out of the 1,000 people who started, yes, the average might be this good. And this is attrition bias. Attrition bias happens when people or data points drop out of a dataset over time, and the ones who remain are systematically different from the ones who left. So be very careful when selecting your data points, and check whether any attrition or selection bias occurs. 25. 04.05 - Normalization: Here I would like to talk about incorrect normalization. This is a simple topic, yet it is so important if you never want to lie with your data.
On the left side, we have healthcare spending in billions for two different countries. What we don't see here is that country A is five times bigger than country B. So yes, the spending is a little bit higher, but in reality, if we want to compare those two countries, they have to have a common denominator. If showing the population itself isn't an option, then let's make it spending per person. This would be a common denominator between those two, and here the data looks completely different. We have 5,000 for country A and 10,000 for country B. I think this is pretty obvious and understandable: if you have a big square and a small square, 10% of the big square isn't really comparable to 10% of the small square. If you compare them directly, it's like comparing apples to oranges, because they are completely different and on a different scale. Here is another example, of a wrong denominator. I think it's rare that we do something like that, but here we have sales per store for company A and company B. But let's take a look at the data behind it. This is the data behind it, and company A divided the total sales by total employees, while what they should be doing is dividing total sales by total stores, which is two, and it would amount to 5 million, not 1 million. Company B, however, is counted properly on this chart. We have total sales divided by the number of stores. Alternatively, sales per store could be changed to sales per employee. No matter which change is made, both need to use the same denominator, not two different values for two different types of data. If I would like to summarize this entire lecture, incorrect normalization happens when data is scaled or adjusted using the wrong reference: using totals instead of per-unit values, dividing by the wrong population or, for example, mixing different time periods or units together. You cannot precisely compare a year to a month.
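To make the common-denominator idea concrete, here is a minimal Python sketch. The totals, populations, and employee count are hypothetical numbers chosen only so that they reproduce the figures mentioned in the lecture (5,000 and 10,000 per person, 5 million per store versus 1 million); the helper name `per_unit` is also an assumption.

```python
# Hypothetical totals chosen to match the lecture: country A has five times
# the population of country B, and per-person spending is 5,000 vs 10,000.
countries = {
    "Country A": {"spending": 250e9, "population": 50e6},
    "Country B": {"spending": 100e9, "population": 10e6},
}


def per_unit(total: float, units: float) -> float:
    """Normalize a total by a common denominator."""
    return total / units


spending_per_person = {
    name: per_unit(c["spending"], c["population"]) for name, c in countries.items()
}
print(spending_per_person)  # {'Country A': 5000.0, 'Country B': 10000.0}

# Company A: total sales of 10 million (inferred from 5 million per store
# across two stores) and an assumed head count of 10 employees.
total_sales, stores, employees = 10_000_000, 2, 10
print(per_unit(total_sales, stores))     # 5000000.0, sales per store
print(per_unit(total_sales, employees))  # 1000000.0, sales per employee
```

The raw totals make country A look like the bigger spender, while the per-person values reverse the picture. The last two lines show how the same total gives a very different number depending on which denominator the label promises.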
To recap this entire lecture, the key takeaways here are: never compare raw totals for different-sized groups. Always check the sample size, n equals how many, and always label your axis clearly. Is it in percentages, in dollars, in units per person, per sale, per store? I think this is perfectly understandable, and I'm sure you will never make such mistakes. 26. 04.06 - Aspect Ratio: Let us talk about another important topic, meaning the aspect ratio in line charts. The line chart is very often used and prominent, and there is something to understand about it. This is very simple to see in an example where I take the same chart: if the chart is very wide and flat, it suggests that there is a very slow trend happening, while if I take the same chart and simply narrow it down, I squash it together, all of a sudden the trend appears quicker. So a narrow and high design will exaggerate the trend. What do we do here? How do we find the middle ground, and how do we decide what chart slope is appropriate? So what is a good, correct, or honest aspect ratio for line charts? There is an often-cited rule called banking to 45 degrees. It says that the average slope of the lines on a chart should be 45 degrees. This comes from the research paper by Cleveland, McGill, and McGill, and it's only a starting point. But let me show you this in an example. I have a chart, and I measure the average slope in degrees. Here, for example, I have 73, 79, around that. If I make this chart wider, it's 58 and 63 now. If I make it even wider, okay, we are getting close to 45 degrees, and this would be approximately how this chart should be displayed. But in my opinion, currently, it's a bit too wide. It covers almost the entire screen, but the slope is very easy to see and very easy to read. We don't always have a simple chart like that.
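The measuring step in that example can be sketched in Python. This is only one simple way to estimate the average slope, a plain mean of the absolute on-screen angles of the line segments, and not necessarily the exact procedure from the paper or the tool used in the lecture. The function name, the sample values, and the pixel sizes are all assumptions.

```python
import math


def mean_segment_angle(xs, ys, width, height):
    """Average absolute on-screen angle, in degrees, of a line's segments.

    xs must be strictly increasing; width and height describe the plot
    area (only their ratio matters for the result).
    """
    x_span = max(xs) - min(xs)
    y_span = max(ys) - min(ys)
    angles = []
    for i in range(len(xs) - 1):
        # Convert data units to screen units before measuring the angle.
        dx = (xs[i + 1] - xs[i]) / x_span * width
        dy = (ys[i + 1] - ys[i]) / y_span * height
        angles.append(abs(math.degrees(math.atan2(dy, dx))))
    return sum(angles) / len(angles)


# Hypothetical monthly values, not the data from the lecture.
xs = list(range(8))
ys = [10, 14, 9, 15, 11, 17, 12, 18]

for width in (200, 400, 800):
    angle = mean_segment_angle(xs, ys, width, height=300)
    print(f"{width} x 300 px: average slope of about {angle:.0f} degrees")
```

Widening the plot while keeping the height fixed lowers the average angle, which mirrors what happens when the chart in the example is stretched toward 45 degrees.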
What you also have to keep in mind is that even if you make your chart wider or narrower, you also have to consider the Y axis. Look at the left chart and at the right chart. The left chart has a taller Y axis; it's higher. The way you read the data here would be more extreme on the left side and less extreme, less volatile on the right side. So it's very important to know what you want to show and show it appropriately, not trying to exaggerate some trends. Now, here's a very interesting example that shows this magnified by 1,000. A study from 2006 illustrated that different aspect ratios can reveal different signals in time series. I'll show you now the CO2 concentration at a measuring station in Hawaii. Both of those charts display the exact same data, but we want to find something different in this trend. This is why the right one is narrower. If I take a look here, what is the conclusion that I see? What I see is that the downward movement is quicker than the upward movement. The upward movement happens slowly. Here we have 45 degrees, so it's displayed nicely. Here we have around 60 degrees, and I can now draw a conclusion by looking at this chart that I wasn't seeing on the left chart, because the left chart has so many lines and they are so high that I can't really differentiate which slope has more degrees, while on the right side, it becomes a little bit more apparent. The same chart can show you different conclusions, different trends, and different situations. Remember that, especially when working with line charts. This shows that the aspect ratio of a chart can influence what you can see in the data. Let banking to 45 degrees be only a starting point, but, of course, deviate from the rule when needed. As you saw in the previous example, different slopes can tell different stories, especially when using line charts. 27. 04.07 - Y Axis in Line Charts: Here, let us talk about
when it is okay to cut the Y axis of a chart, especially the line chart. So, cutting the Y axis. Bar charts are about size, about absolute magnitudes. Here, the bars start at zero, so the length correctly represents how big each value is. That's okay. Now, watch what happens when I cut the Y axis. Let me cut it. And as we've already discussed, this is unacceptable in bar charts. Cutting the axis distorts length, so it breaks the meaning of the chart altogether. The bars suddenly look very different, even though the data did not change. This is because in bar charts, we decode length. So we need to start at zero. This is the rule of thumb. However, on the right side, I would like to show you the line chart. The line chart is about change, about a trend, so the zero isn't as important. It's even sometimes necessary to adjust the axis itself to actually see the data, because here previously, we didn't see anything. We saw a very narrow line, but if we change the range of the axis, we can finally find some information. Here we decode the position on a common scale, not the length. So the rule of thumb will be: adjust the Y axis in line charts to show the trend clearly. That trend needs to be shown clearly. So if you need to, just adjust it. Here is an exaggerated example from the book How to Lie with Statistics by Huff and Geis. And it's from the year 1954. I've seen this entire book and it's phenomenal, especially the drawings. I love the comparisons and the drawings. You can see on the left, we have the entire chart, and on the right side, we have only the little part with the head of the chart. What happens if you change the Y axis? Yes, you make the face longer, but sometimes you need to do this to show the trend. Don't try to exaggerate. Just try to be informative.
There is a rule of thumb from Andrew Gelman: if zero is in the neighborhood, invite it in. Here, we start the axis on the left side from five, then ten, 15, 20. So why don't we include the zero? Here, it's very close to our actual dataset and our actual trend. So there's no problem. Here is a different example. Here is the number of medals for different countries, let's say from some kind of Olympic Games, and what would happen if I changed the Y axis so that zero sits at France's value? So France would have, in this showcase, zero medals, the USA would have 150 medals more, and Spain would have negative 150 medals. Well, Spain has here about 100 fewer medals than France. I wouldn't like to see this chart, because not only do I have to count everything by hand, but what does it actually show me? It doesn't show much. I much prefer just seeing the raw numbers, the raw totals, especially when using a bar chart. Of course, if someone really needs and wants it, you could show that, but I wouldn't really endorse such a chart. To recap our information: how do we choose whether we can cut or not cut the Y axis in a line chart? In general, in a time series, use a baseline that shows the data, not the zero point. So it's complicated. You can include the zero point, but in line charts and trends over time, you don't have to include zero, while in bar charts, our perception decodes length, so we need to use the zero point. The choice is context dependent. Thank you so much for listening, and let us continue. 28. 04.08 - Error Bars: Let me talk about error bars in charts and why they are very difficult to read, understand, and use. You need to be very mindful when you use error bars. So what are they? They are basically a graphical way to show you how much your data might vary. Let's say that we have some data about public satisfaction with local public transport in Denmark.
And we survey people, and we find out that 65% of people are satisfied with local public transport. If we conducted this study again, it could be a little bit less. It could be a little bit more, but we are showcasing at the bottom that the error bars show a 95% confidence interval with a margin of error of plus or minus 3%. And this is crucially important. I've displayed at the bottom how we calculated those error bars. They can be represented graphically in slightly different ways. We can use caps on the left and right side. We can draw them without caps or, for example, show a line like that. This would be the range in which the values can fall. This would be the lower bound, and this would be the upper bound. Let's see the anatomy of an error bar. Usually, they look like that. We have the data point. Of course, we have the cap or we don't, depending on how you want to design them, and we have the lower bound and the upper bound. This is the entire error bar. Error bars can be used in different charts, for example, in scatter plots, in dot plots, in bar charts, in line charts, but each one poses its own problems. For example, in a bar chart, if it's on white here, it's easily visible. But what if I made the bar purple or black? You wouldn't see the bottom side of it. It would look like dynamite, without showing the bottom part of the error bar. Another problem is that the difference in mathematical methodology already poses a problem in itself: we can show error bars that represent the standard deviation, the standard error, or the confidence intervals that we used previously, or we can have a completely custom range with minimum and maximum values that we applied to our dataset. Previously, on the data that I displayed, we showed at the bottom that we are using 95% confidence intervals. What does that mean? It means that if I ran this poll 100 times, I would expect the results to be accurate to within plus or minus 3% in 95 out of 100 cases.
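As a rough illustration of where a margin of error like this can come from, here is a minimal Python sketch using the standard normal approximation for a proportion. The lecture does not state the sample size, so the 1,000 respondents below are purely an assumption, picked because it yields roughly plus or minus 3%; the function name is also hypothetical.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Margin of error for a proportion (normal approximation).

    z = 1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)


p = 0.65   # 65% satisfied, from the lecture
n = 1000   # assumed sample size; the lecture does not state it

moe = margin_of_error(p, n)
lower, upper = p - moe, p + moe
print(f"margin of error: {moe:.1%}")                # margin of error: 3.0%
print(f"95% interval: {lower:.1%} to {upper:.1%}")  # 95% interval: 62.0% to 68.0%
```

The lower and upper values are what the two ends of the error bar would mark on the chart.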
So I would be getting results ranging from 62 to 68% almost every time I conduct this research. Let me give you another example. You know that we have to start bar charts from zero, but for the sake of knowledge, because we can barely see the differences here, let me make the axis start at 36, only this one time, okay? But you'll see in a second why I shouldn't be using this chart altogether. We are showcasing the average body temperature. What I would like to show you is only this yellow dot. This yellow dot represents the data that I want to show, the average body temperature of each group. Let's delete the data, and let's now put the bars above each other. Let me put them above each other to show you this more clearly. If I calculate some kind of mean value from those values, those are the values that I'm looking at. Let's delete the part that we don't need, because if you look at the yellow dots, the average would be in the middle of them. This is the average that I want to show. So this is the actual data point, and the yellow dots are the individual data points or maybe my error bars. So we have the average in the middle, and all the yellow points are the data points that we have available. We could represent this with a dot plot. We could represent this with a box plot, where the middle line is the median. And here we could also display it with a violin plot. All of those plots would display the same data. You can decide for yourself which is graphically the most pleasing and the most appropriate for this situation. Error bars are named error, but they don't always display error. They can also display the range of values where your data lives. I hope this is understandable. This is a difficult part of data visualization. This is why you rarely see it, because it's not easy to use error bars properly. 29. 04.09 - Remember This: What should you remember about this section, where we talk about visual misrepresentation of data?
Be very careful about your axis. Try to avoid truncation. Don't try to hide data with the axes themselves. The same data might look vastly different. Just starting the axis at zero isn't always enough. Scale it appropriately. Try to keep the lie factor close to one. You can underrepresent the real data or you can overrepresent the real data. Be very careful about that. Consider correlation and the causation that it seems to imply. Just because something is correlated doesn't necessarily mean that one causes the other, like we had in the firemen example. We cannot really say that firemen cause fire and more firemen cause bigger fires, but there is a reverse relationship where bigger fires cause more firemen to show up. It's really important that you select a fair dataset. Don't try to skew the data. Don't use selection bias and attrition bias to make your data look favorable when you conduct research. The aspect ratio you choose has a big impact on how a chart is perceived. Know what you want to say, know your charts, and know your data to represent them correctly. Also, don't make your axis skew the storyline. I know that we want to tell stories with our data. This is why there is something like data storytelling, but don't try to overuse or misuse it. This is an example which I wouldn't endorse, as I said, showcasing data like that, which is skewed. It seems like France has no medals, where in reality, it is the reference point. It seems like the USA did better than everyone, while in reality, yes, they did a little better, but this is the data that we actually have plotted here. Okay, thank you so much. I think this is understandable. And as we practice data visualization, we will get better at it. And this will surely be automatically engraved in our brains, and we will remember those important things. 30. 05.01 - Plot Types: Welcome to this section, where we talk about types of charts.
Here, I would like to give you a brief overview so you know that we have different categories for different types of charts, just so you start to organize your knowledge. Later on, we will go into specific charts. But here, I would like to present the categories. We have data-over-time charts that track trends, like the line chart. Then we have comparison charts for seeing how values stack up against each other. Then we have relationship charts to find connections. Then we have distribution charts. You've probably seen charts like the density plot or the histogram. They show you the spread of data. And finally, part-to-whole: the famous or infamous pie chart, and so on. Let me now go briefly over every category, so you start to organize the knowledge in your head. First, data over time. This category is used to visualize the evolution of a trend across a chronological period. This would be most prominently the line graph, the area chart, or the candlestick chart, like we have in stocks. The most important information here: does it move up, does it move down, or are we standing still? Comparison: this category is designed to show the relative difference in size or magnitude between distinct categories. And here, of course, we have the bar chart, the lollipop chart, and the clustered column chart as well. You could track how many pizza slices you ate versus your wife and versus your son. Most importantly, it shows you who is winning and by how much. Then relationship charts. They are used to identify the correlation or dependency between two or more different variables. Here we can use the scatter plot, where we could track, for example, months and ice cream sales, and obviously, the warmer it is, the more ice cream is sold. Or the bubble chart: let's say we have the price of coffee and the amount of profit that we make, and the size of the bubble shows us how many customers come into our shop. So a third variable is added here. Or a heat map showing you hot spots.
For example, we would have students that have different subjects at different times of day; early on they are fresh, and the later it gets, the colder the spot becomes, because they are less focused. Those charts are meant to show you how one thing affects the other. Then distribution shows you the frequency and spread of values within a single dataset. This would be a histogram of, for example, let's say, shoe size. Shoe size nine is much more common than shoe size 13. We group the values into categories and showcase them on a histogram. Then we have a density plot, which is very similar to a histogram but smooths out the data, and, for example, a box-and-whisker plot that shows you the median and the outliers. Revealing the shape and outliers of your data is the main information here. And part-to-whole charts show the composition of a total divided into its individual parts. What do I show here? Of course, the pie chart. Let's say it's your phone storage divided into photos, videos, and so on. Then a treemap would also be a part-to-whole relationship. It can represent hierarchy through nested rectangles. And a stacked bar chart is also a type of part-to-whole chart, because each bar shows you a total divided into separate little groups. Of course, we have more types of charts. We have maps and so on, but I don't want to show you everything in the world. I want to show you the main categories, and from those categories, we can move forward and group our information accordingly. 31. 05.02 - Line Chart: Let me talk about the first category, data over time, and what we have here. Of course, the line chart. This is our go-to tool for showing the big picture over time. A line graph is most frequently used to show trends and analyze how the data has changed over time. Let's take a look at the anatomy. Line graphs are created by plotting values as points and then connecting them with a line.
Typically, the X axis is a timescale like days, months, years, or periods, and the Y axis is simply a quantitative value or percentage. Let me go over the key points that make up a line chart. Trends: it tracks values over time to show evolution. Slopes: they are a metaphor. An upward slope means growth and a downward slope means decline. Put simply, we plot points on a grid and connect them to show the path of the data. Let me go over a couple of examples. In the first example, we use a single line to track gym memberships. It clearly shows the steady climb from January through December. A single line is never a problem. Let me go now to the second example. For a more complicated view, we can compare four different social media platforms on one chart. Here, I can see which one is growing the fastest and where they cross each other, but we need to be very careful with the number of lines. I think four is approximately the upper limit for a line chart. If you want to talk specifically about one line, you could, for example, highlight it like that. Let me now show you another example where there are more lines. Take a look at that. This is what you can call a spaghetti chart. We tried to plot eight different product categories on this one grid. Because the lines for winter gear like coats are dropping while summer gear like shoes are rising, everything crashes in the middle, everything crosses everything else, and this is horrible to look at and to talk about. I cannot even talk about it while presenting this to you. What I would do is highlight the lines that I want to talk about and decrease the visibility of the rest. You could also use a different chart for that because a line chart won't be perfect here, but this is a lecture about the line chart. I just want to show you the possibilities, and this is everything you need to know about it. A simple but very useful chart.

32.
05.03 - Area Chart: In this lecture, I'll explain the area chart to you. An area chart can show you the big picture. Let's go over the anatomy. It's, of course, very similar to a line chart. First, we have data points that we have connected with a line. Just like in a line graph, we have the X axis, most often with time intervals, and on the Y axis, we have a given value or percentage. The difference here is that the area beneath the line is shaded. This creates opportunities and problems. I'll explain in a second. Key points I would like to go over. Volume: it represents the magnitude of change over time. Area: the shaded area emphasizes the total, and because it's graphically very pleasing that the bottom area is shaded in a solid color, it's sometimes a bit easier to comprehend. And continuity: it is best used for continuous data over a period. Let me now go over different examples. On the simple side, we can use a single area chart to track, for example, monthly active users. And because everything beneath the line is shaded, we kind of feel the size and the momentum, but this could be represented just as well with a line chart. So why should you, and why shouldn't you, use an area chart? The problem here is that the area chart uses a lot of unnecessary ink that isn't actually data. The only thing we need to read this chart correctly is the line. That is where all the data sits. So be very careful when using area charts because most often they are just graphic design tricks with no merit behind them. Even more problems arise when I start to use stacked area charts. In this example, we are tracking website traffic by device. It looks good, and it's a proper usage of an area chart. It will tell two stories at once.
The overall height shows us that the total traffic is increasing, but the individual colors show a massive shift where mobile traffic starts to cannibalize desktop traffic over the course of the week. What's the problem? We are no longer on a common scale. Look at mobile. Mobile starts at zero. That's perfect. But if you look at desktop, on Tuesday it starts here, on Thursday it starts here, and on Saturday it starts here and ends above it. How do you compare those different days? You cannot properly do this. So only if the individual values are not important can you go for a stacked area chart like that; otherwise, I wouldn't recommend it. Here, I just wanted to see the total views, so it's completely fine. Now, let's go over a different problem: occlusion, or overlap of actual data. It would be horrible design if I used a shaded area that covers another area. Here I have two products, product A and product B, but you cannot see product A. You have no idea where it is. If I do the design correctly, now you can see it. But still, I caused some problems. Now, the last slide. On the left side, you have a regular area chart, and on the right side, you have a stacked area chart, and the designer, or you, the data visualization specialist, needs to inform the audience which is which. On the left side, product A is clearly bigger than product B, and if I communicate this correctly, it's okay. But on the second chart, product A together with product B make up such and such revenue, for example, and it's very difficult to judge. Looking at the second chart, I feel like product A is four times as big as product B. And maybe it is, but I have no way of judging because they are not on a common scale for each separate month. So as you can see, area charts are very often used. They look beautiful.
I have to say, but they use a lot of ink, and they may mislead you with the information they are trying to showcase. Always be mindful of that.

33. 05.04 - Bar Chart: In this lecture, we are going to talk about the bar chart. It is the ideal chart to compare different categories to each other. A bar chart uses either horizontal bars or vertical bars (the vertical version is also called a column chart) to show discrete numerical comparisons across categories. If we go over its anatomy, a bar chart is drawn by placing a specific category on one axis, for example, categories, years, or products, and a value scale on the other, like number of sales, a value, or degrees. I think this is pretty understandable. Key points I would like to go over. It's categorical: each bar represents a separate category, giving discrete numerical comparisons across different groups. Ranking: the length or height instantly answers "how many?". This is why it's so easy to judge and compare. The zero rule: it relies on a shared baseline to compare magnitudes, and this is a key point, a very important point. When you use bar charts, you should start at the zero value. Here we have a very simple example. I'm sure you've already seen plenty of bar charts. In this example, we have website traffic sources in January 2050, and we can clearly see that the first bar is the biggest, meaning that this is our powerhouse. Our website gets visited directly. Okay, but could we display a line chart for the same data? No, because direct, search, social, and referral are different categories. They are not connected to each other. They aren't a trend over time. They are separate entities, so the data between them is not connected at all. Okay, let's now go over a slightly more advanced example. Here we have plenty of categories. We have ten different categories, and we are using a horizontal chart because we need more space to write out our category names. Let's see how it would look on a column chart.
We can barely see the names of the categories. So going back, the proper way would be to use a bar chart like that, a horizontal one. Then I could clearly make my statements or, for example, add an annotation here, or even add data labels and a green annotation showing that those are the top performers compared to the others. Beautiful. However, just as I said, be very mindful about the axis on the bottom. Here we have the axis running from 96 to 106, and a five-point difference makes one bar look almost twice the size of the other. But this axis is truncated and creates a huge lie factor. If we are honest with our data and we start at zero, now we can clearly see the bars are very close to each other in terms of size. The five-point difference isn't as large as the previous chart suggested. So be very careful when using bar charts, and always start them at zero.

34. 05.05 - Scatterplot: Relationship. In this lecture, I would like to talk about the scatterplot. A scatterplot is perfect for showing the relationship between two different variables, meaning it showcases the correlation between your data points to see if one influences the other. For example, if you spend on advertising, then the app downloads should be going up, right? We can plot this on a chart and see if there is a correlation between more spending and more app downloads. Secondly, you can find outliers that way. Those are points that are plotted far away from the overall trend. Okay, let's go over the anatomy before we go into examples. We plot our data on a grid where variable A sits on the X axis. Let's say this is spending on a good desk. And the second variable will be on the second axis, and let's say this is productivity. In theory, the more expensive the desk we bought, the more productive we should be. I know it is a stretch, but this is the data that we got. Those are individual points we plotted on the scatterplot, and we could add a trend line that approximately fits in the middle of the entire dataset.
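To make the trend line and the idea of correlation strength concrete, here is a minimal sketch in plain Python. It fits an ordinary least-squares line and computes the Pearson correlation coefficient; the desk prices and productivity scores are made-up illustration values, and least squares is just one common way of drawing a line "through the middle" of the points, not necessarily what the slide used.

```python
from math import sqrt

def fit_trend_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def pearson_r(xs, ys):
    """Return the Pearson correlation: +1 strong positive, 0 none, -1 strong negative."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)

# Hypothetical data: desk price and a productivity score per person.
desk_price = [100, 200, 300, 400, 500]
productivity = [52, 58, 61, 70, 74]

slope, intercept = fit_trend_line(desk_price, productivity)
r = pearson_r(desk_price, productivity)
print(f"trend line: productivity = {slope:.3f} * price + {intercept:.1f}")
print(f"correlation strength r = {r:.2f}")
```

A value of r close to 1 corresponds to points hugging the trend line (a strong positive correlation); values near 0 correspond to a cloud with no visible direction.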
However, we should also keep an eye on outliers. How did it happen that some people got a cheap desk but were still very productive? Always watch out for these. Okay, I don't want to go too long over the theory, but if we are talking about scatterplots, the correlation can be positive, negative, or neutral. Also, the correlation can be linear, when one variable rises and the second variable rises as well; it can be exponential, when raising one variable causes a lot more of the other, for example, app downloads; or it can be U-shaped, depending on what's happening. You could also talk about the strength. I know that's a lot of theory, but we have to go over it. The correlation can be strong when the points are close together, it can be weak when the points are further apart, or there can be no correlation. Then we know that there is no relationship between the two variables. To recap everything, you could have a positive exponential correlation, a strong positive linear correlation, or a weak positive linear correlation. It all depends on what you want to show. There are also problems with scatterplots, and they depend on the data that you are working with. Let's see problem number one: correlation without causation. This is a very popular example: if I plot sunscreen units sold against total reported shark attacks or shark incidents, it seems like when you sell more sunscreen, you get more shark attacks. But in reality, of course, most shark attacks probably occur during the summer, in the same period when sunscreen is sold. This is why the two look correlated when we plot them on this chart, but we are, of course, data visualization specialists, and we wouldn't come up with something like that. The second problem is overplotting. You can see a couple of dots here. But what if I told you that in reality, there are 100 results here, 100 dots?
This is because right at the bottom, at 2 hours of app usage, we have plenty of data points on top of each other. To overcome this, we can jitter the data. We can scatter it around a little, but we still want to be precise here. So this is a problem when using scatterplots. This is everything I wanted you to know about the scatterplot.

35. 05.06 - Bubble Chart: In this lecture, we will talk about the bubble chart. It is a relationship type of chart. A scatterplot shows us a simple connection between two different variables. A bubble chart, however, shows us the weight of the numbers because we can use the size of the bubble to add a third variable. So the most important reason why we use bubble charts is that the size of the bubble tells us a third story. It also gives you an important overview at a glance. You are drawn to the biggest, most important bubble. Okay, let me explain this with an example. Before we plot any data, let's look at the anatomy as always. We will use this chart like that. We will have the time to fix an IT issue on the bottom and the difficulty of the fix on the Y axis. Let's draw the chart, and you can immediately see where the biggest bubbles are. The size of the circle represents a third variable: how many users are affected. This is the weight of the problem. Because we now know how many users are affected, we shouldn't just choose what is quickest and easiest to fix. Even though the big bubble, the system outage, is pretty difficult to fix (it is at a six on the scale), we should probably go over that first before going to the database error because the system outage is affecting 8,000 people. And this is exactly why a bubble plot might become useful. Let's go to a different example. Here, the bubble size shows you the estimated cost of a given event. On the bottom, we have the number of guests, and on the left side, on the Y axis, we have prep days.
So looking at that, even though a wedding has approximately the same number of guests as a seminar and takes approximately the same time to prepare, it is much more expensive. And the bubble chart is capable of showing you that through the size of the bubble. But sadly, the bubble chart, of course, has its problems. If a bubble is big enough to cover up some data, what do we have to do? We have to use transparency or draw only an outline because we don't want to skew anything and we don't want to mislead anyone by hiding certain data. I think this is obvious. This is everything you need to know about the bubble chart. An interesting chart, but very difficult to use properly.

36. 05.07 - Histogram: Welcome. In this lecture, I would like to talk about the histogram, a beautiful type of chart, but with a very specific usage. It looks like a bar chart, but a bar chart compares different categories like apples, oranges, and others, while a histogram shows us the shape of where your data lives for a single category. So you have one category, divided into multiple bins. I'll explain this in a second and show you examples. The primary reason to use a histogram is, of course, that it groups our data into bins. For example, you could have age groups one to five, six to ten, and 11 to 15, and you could group them into three separate bins to give you a rough idea of the data. Of course, more bins would be necessary here, but this is the idea behind it. Secondly, it identifies the peak, and this is very important information: from the drawn chart, you can identify the most common value among those groups. When it comes to the anatomy, it's very simple to understand. On the bottom, we need some kind of variable, like wait time. For example, if you search something on Google, you can see how many customers are currently there and how it compares to an average time. And this is exactly a histogram. Here, I'll plot frequency, the number of customers.
On the bottom, we have wait time. Let's see. Looking at this chart, you would see that the two highest bars show the most common wait times for our customers. Let's say these are ten and 15 minutes. While people wait 30 or 40 minutes less often, it seems that it still occurs. Take note that all the bars touch each other. From a design perspective, we could make very tiny gaps between them, but the X axis is continuous. You don't want gaps because there is probably some data there. So you need to plot it correctly and according to the data you have. This would be the peak. Now let's go over an example of a histogram. Here we have a boutique loyalty program, and we have 76 data points, or 76 people who come into our boutique. We used a histogram to group their ages into bins of three years each: seven to ten, ten to 13, 13 to 16, and so on. You can notice that the bars climb steadily from age one. Obviously, a baby cannot be a client, but we reach a massive peak between ages nine and 17, and then a drop-off from approximately age 20. This is clear information that our business doesn't just have young customers in general; we have a very specific hotspot of pre-teens and young teenagers. If we order inventory for our boutique, we would now focus 80% of our budget on styles that appeal to those age groups. However, a histogram, of course, has its problems. It can simply hide the truth if your bins are the wrong size. Let's say, for example, we take the same boutique, but we have grouped the customers into ages one to 20 and 20 to 40. Obviously, this is now a big wall that gives me basically no information. I can roughly say that we have more mature people buying than teens and young people. Well, this isn't a conclusion at all. I could also make too many bins.
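The effect of bin size can be tried out on a small dataset. The sketch below is plain Python with made-up customer ages (not the boutique data from the slides); the `histogram` helper is a hypothetical name introduced only for this illustration. It counts the same ages with very wide, medium, and very narrow bins.

```python
from collections import Counter

def histogram(values, start, width):
    """Count values per bin [lo, hi), where bins begin at `start` and are `width` wide."""
    counts = Counter((v - start) // width for v in values)
    last_bin = max(counts)
    return [
        ((start + k * width, start + (k + 1) * width), counts.get(k, 0))
        for k in range(last_bin + 1)
    ]

# Hypothetical customer ages, for illustration only.
ages = [8, 9, 10, 11, 11, 12, 12, 12, 13, 13, 13, 14, 14, 15, 16, 19, 21, 24, 30, 35]

wide = histogram(ages, start=1, width=20)    # two huge bins: a "wall" with no shape
medium = histogram(ages, start=1, width=3)   # three-year bins: the peak becomes visible
narrow = histogram(ages, start=1, width=1)   # one bin per year: mostly bars of 0, 1, or 2

for name, bins in (("wide", wide), ("medium", medium)):
    print(name)
    for (lo, hi), count in bins:
        print(f"  {lo:>2}-{hi:<2} {'#' * count}")

empty = sum(1 for _, count in narrow if count == 0)
print(f"narrow: {len(narrow)} bins, {empty} of them empty")
```

With these numbers, the wide bins only say "most customers are under 21", the three-year bins show a clear peak between ages 10 and 15, and the one-year bins break that peak up into many tiny bars.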
A chart like this also doesn't give me a clear picture because at ages 19 to 20 we have only one person, at ages 20 to 21 we have only one person, and at ages 22 to 23 we have zero people, so it looks as if we shouldn't cater to those age groups and should cater to a different one. Yes, we can see a clear spike in the middle, but you shouldn't use too many bins because the chart essentially becomes a very large column graph, which is what we wanted to avoid by grouping the data into bins so that the histogram shows us where our data lives.

37. 05.08 - Density Plot: In this beautiful lecture, I would like to talk about the density plot. The density plot is a very specific type of chart. It uses a smooth line to show the shape of where your data lives. We use it for two primary reasons. First, it smooths out the noise. Let me tell you what I mean by noise. Let's say that you are tracking the ages of your customers. You have ten customers aged 26, then zero customers aged 27, then eight customers aged 28, then no customers aged 29, because, let's say, this is data from today only. It would be very difficult to plot ten here and zero there. It would be difficult if you just used a column graph. The density plot smooths out those values, and it's easier to see the data. The second reason why I want to use this chart is that, as with other charts of this type, it can identify the peak. Let me show you the anatomy. There's nothing difficult. It looks like an area chart, but there is an important distinction we'll talk about in a second. On the bottom, we have a variable, for example, age, and on the other axis we have density, the likelihood of occurrence. And this is important. We don't have volume. We don't have total values. We have likelihood of occurrence. Okay, this would be the peak, and this would be the tail, where the data is tapering off or where some outliers lie.
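The smoothing behind a density plot can be sketched with a Gaussian kernel density estimate, one common way such curves are computed (the slides do not say which method was used). The ages below are illustration values and the bandwidth of 1.5 years is an arbitrary assumption; the point is that the curve fills the gap between neighboring ages and that the area under it is about 1, which is why the Y axis reads as likelihood and not as a count.

```python
from math import exp, pi, sqrt

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density at x from the samples."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * sqrt(2 * pi))

    def density(x):
        # Each sample contributes a small bell curve centered on its own value.
        return norm * sum(exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density

# Illustration data: ten customers aged 26, none aged 27, eight aged 28.
ages = [26] * 10 + [28] * 8
density = gaussian_kde(ages, bandwidth=1.5)  # bandwidth is an assumed smoothing width

# Age 27 has no customers, yet the smoothed curve is not zero there.
print(f"density at 26: {density(26):.3f}")
print(f"density at 27: {density(27):.3f}")
print(f"density at 32: {density(32):.3f}")

# Approximate the area under the curve between ages 20 and 35.
step = 0.1
area = sum(density(20 + i * step) * step for i in range(151))
print(f"area under the curve: {area:.3f}")
```

A larger bandwidth gives a smoother, flatter curve, and a smaller one follows the raw counts more closely, much like choosing wider or narrower histogram bins.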
If we go to a practical example, we have our boutique loyalty program member age. On the bottom, we have the member age, then we have the frequency: how many of those people subscribed to our boutique promotion, maybe. And we can clearly see that people aged 25 and 26 are our core customers. So this is where we should focus our marketing efforts. This would be the peak, and those would be the outliers, or the less common people in our store. Maybe our inventory is targeted at this younger age group. However, the problem with the density plot is that it looks almost identical to the area chart on the right side, right? But you read the two completely differently and you use them for completely different cases. On the left, the density plot shows us where our customers are. The peak at 25 tells us that 25 is our most common age. We simply have no inventory for people from a higher age group, and we are not surprised that they aren't coming to our shop. However, if you take the area chart without seeing the density plot and without knowing anything about our boutique, you could say, "Oh no, within the higher age groups, we have no customers. What is happening?" Or, if this were the amount of profit made from those age groups, "What are we doing wrong? Why aren't we making any profit from those age groups?" It's not like that. You shouldn't be afraid that sales are crashing with higher age groups. The sales simply aren't there because there are no customers of that age in our boutique. This is an overall view of the density plot. To recap: here we look at the shape, and here we look at the raw number, the volume.

38. 05.09 - Pie Chart: What is it about the pie chart? It's such a beautiful chart, but it comes with its own set of problems. We have to judge the entire area, so it's always difficult to read a pie chart. But let me show you its advantages and disadvantages. A pie chart is optimal if you have percentages. For example, showing three different numbers like here, I see the percentage.
I can roughly judge which is the biggest and which is the smaller one, and we have a total of 100. That's perfect. It becomes a bit more difficult when we have total numbers. If we didn't have the 50 here, you would only see 30, 8, and 12, and you wouldn't be exactly sure what the total value is. You would have to add it up. And you always need to make sure that if you show a pie chart, you show the whole pie chart because you want to see differences between categories. All right. The pie chart is okay for showcasing fewer than six categories because if you go over that, to 6, 7, or 8, the pie chart becomes increasingly difficult to read. Just look at the pie chart on the right side. It becomes difficult to judge, for example, the blue on the bottom left side and the purple on the bottom right side. They are kind of similar, as is the blue above it, but you can't really tell the percentage unless you see it written out. A pie chart is okay for showcasing, for example, market share for different segments or age groups, or device usage, to compare categories like that, but the optimal measurement will still remain the percentage. The percentage is ideal because one whole equals 100%, and a circle displays that nicely, but "nicely" is the core word here. Nicely doesn't make it optimal. So to recap, the pie chart is great for showing that one segment is big or small in comparison to the other segments. It's okay for showing how one segment relates to the whole. Like here, we have app downloads, and most of our applications are downloaded on a phone. It's also okay for showing simple percentages, like a big slice compared to a small slice. The pie chart is very good for that. However, in most situations in data visualization, a bar chart will simply be more understandable and better. Let me show you some examples where the pie chart fails. If the slices are similar, you can barely tell the difference. If we have too many values, you also cannot judge clearly, or you have that many similar slices.
It gives you no real information, and you have to judge the area of each slice. If you use a separate pie chart for each category, you might as well not do it because it's very difficult to read the areas. Why not use a column chart here? Why not use a horizontal bar chart here, showcasing all the data? It's much more understandable. It ranks higher among the perceptual tasks, and it's easier to compare values on a common scale. Here as well, we should use different bars for different categories and we would be all set. This is everything you need to know about the pie chart.

39. 05.10 - Waterfall Chart: Here, I would like to explain the waterfall chart to you. A waterfall chart is all about the journey between two points, the starting point and the ending point. Instead of just showing that you started at one number and ended at another, it breaks down and explains the change step by step. It is mostly used for financial data, so it highlights gains and losses. Let's go over the anatomy. To read this chart, look for the starting and ending pillars. They are called totals. You have the starting total and the ending total. You can also have a running total in between, depending on the data we use. Then we have bars that show increases in those values and bars that show decreases. I think this is very understandable. Also, we used green for increases and red for decreases to make it a bit simpler. And we also have connectors because this is one continuous journey. Okay, let's show it with an example. Here, we have a chart analyzing our net income. We have our revenue at the beginning and net income at the end. Here we have cost of goods sold and expenses that drew down our gross revenue. Then we had some interest on our money, but we also had to pay taxes. The end result is shown in the last total, but because we have the information in the middle, we learned how the journey went and what actually happened during the year.
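The anatomy above comes down to a running total: every step bar floats between the total before the change and the total after it. Here is a minimal sketch of that bookkeeping in plain Python. The `waterfall_bars` helper and all figures are hypothetical illustration values, not the numbers from the slide.

```python
def waterfall_bars(start_total, changes):
    """Return (label, bottom, top, kind) for every bar of a waterfall chart."""
    bars = [("Start", 0, start_total, "total")]
    running = start_total
    for label, delta in changes:
        new_total = running + delta
        kind = "increase" if delta >= 0 else "decrease"
        # A step bar spans from the previous running total to the new one.
        bars.append((label, min(running, new_total), max(running, new_total), kind))
        running = new_total
    bars.append(("End", 0, running, "total"))
    return bars

# Hypothetical income statement, in thousands.
changes = [
    ("Cost of goods sold", -400),
    ("Expenses", -250),
    ("Interest", 30),
    ("Taxes", -80),
]
bars = waterfall_bars(1000, changes)

for label, bottom, top, kind in bars:
    print(f"{label:<20} from {bottom:>5} to {top:>5}  ({kind})")
```

Increase bars would be drawn in green and decrease bars in red, and the connectors simply join the top or bottom of one bar to the start of the next.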
Just because I said that the waterfall chart is mostly used for financials, it doesn't mean that it always has to be. You can, of course, plot any data that you prefer. Here we have total Spark Fitness memberships. We see we had 32 memberships on Monday and 45 memberships on Sunday. What happened in between? Well, we have some new sign-ups because of an evening class. Some people canceled. That happens. And we gained a few customers because we had a discount. If I just saw Monday and Sunday, I wouldn't really know what happened in between. Thanks to this waterfall chart, I know where the inflows and outflows came from. This is what you need to know about the waterfall chart.

40. 05.11 - Combo Chart: In this lecture, I would like to talk about a specific chart, namely the combo chart. A combo chart combines several charts into one. So the key point here is combining two different chart types. Most often, we see a line chart and a bar chart together, where the bar chart shows absolute values while the line shows a related rate or trend. The purpose, of course, is not decoration. It is to show related metrics together. You want context, not just a beautiful chart. Let's go over an example. Here's the same story split across two charts. We have quarterly sales volume and quarterly sales growth rate. On the left, we have total values in millions of euros, and on the right side, we have the percentage growth or decline. This works, but you need to look at the left, and then you need to look at the right. Why not combine them? Well, okay, but as you can see, this is still not perfect. Not only can you not see anything, you are also not sure what represents what. Let's put the axis for the percentage growth on the right side. It is better, but something is still off. Okay, let's scratch that. Let's do it again. What we should do here is plot the data on the same chart. Now, let's change it to a line chart.
Okay, now we see the two charts on top of each other in different colors, but in order to understand this at first glance, I would need to increase the difference in color. I would make, for example, the line and its right axis blue, and the bars and the left axis purple. Now I can see two distinct things within this chart. But as always, you have to be very careful when using two axes. Let us focus now on the data. Here we have negative values. But did you notice that those are negative values even though they sit above zero on the left axis? Did you look at the right axis, where we have negative two and negative four? Well, that's one of the problems when using two axes. They are separate things, and you need to consider them separately. Additionally, you can lie with the right axis. Let's change the axis maximum from 10% to 20%. Look what happens to the line. It gets lowered a lot. If I increase this to 30%, I can push it even lower. Now let's change the bottom. Instead of negative five, let's make it negative 20. Now I've pushed the line higher. The line got squished in between. Of course, I want this chart to look good. I'm not lying with data, but I can skew the perspective on the data a little to make the change feel better or worse. If I had a lot of decline and I adjusted the axis, the decline wouldn't look as scary, but if I go back, the decline appears a little bigger, and now the dip in quarter three looks a bit more scary. Okay, to summarize everything I've said: the ideal situation would be if the data is clearly correlated. Our data was, so that's no problem. Another example would be ad spend and new customers acquired. The ideal situation would also be if both series can use the same axis. This would be perfect, as there would be absolutely no distortion between the data we are looking at. But this is an ideal world. For example, total revenue against what a company can keep could use the same axis.
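The axis trick described above is easy to quantify. The sketch below, in plain Python with made-up quarterly growth values, computes how much of the chart's height the line's swing occupies for two different secondary-axis ranges. The helper names are hypothetical and exist only for this illustration.

```python
def relative_height(value, axis_min, axis_max):
    """Position of a value on the axis, from 0.0 (bottom) to 1.0 (top)."""
    return (value - axis_min) / (axis_max - axis_min)

def visual_swing(values, axis_min, axis_max):
    """Share of the chart height between the highest and the lowest point of the line."""
    heights = [relative_height(v, axis_min, axis_max) for v in values]
    return max(heights) - min(heights)

# Hypothetical quarterly growth rates in percent, with a dip in quarter three.
growth = [4.0, 6.0, -2.0, 3.0]

tight = visual_swing(growth, axis_min=-5, axis_max=10)
loose = visual_swing(growth, axis_min=-20, axis_max=30)

print(f"axis -5% to 10%:  the swing fills {tight:.0%} of the chart height")
print(f"axis -20% to 30%: the swing fills {loose:.0%} of the chart height")
```

The data is identical in both cases; only the axis range changes, and with it how scary the dip looks. This is the reason to choose and label a secondary axis deliberately.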
But we don't always have the luxury of a shared axis. So when you combine different charts, please be mindful of how you use the axes and the chart in general.

41. 05.12 - Maps: In this lecture, I would like to talk about maps, a specific type of chart. Of course, they are used in data visualization because they can highlight geographic clusters and reveal regional outliers, depending on what we want to show. In general, we can also show different charts on top of a map to enhance the message about its variables in different regions. I think you have seen data like that. And what's interesting about maps is that you can plot different kinds of charts inside them. Here, for example, a geographical bubble map combines the features of a bubble chart with geographic locations. In such a map, a bubble is placed in a specific location, and the size of the bubble represents a particular value or variable. Let's go over a different map. This is a heat map on top of a map, and intuitively, you see that the orange color marks the more intensive areas, and the purple and lighter colors mark the less intensive areas. Here we use a system of color coding to represent value. It is very useful to choose intuitive color palettes that effectively convey the magnitude of the values. Here, I'm not sure what orange and purple represent, and without knowing what that is, it could pose a problem, but this is not up for debate at this point. Now, the most prominent kind of map chart that you can use is a region map, where regions with more intensive data are darker and regions with less intensive data are lighter. That way you can show data by country, by state, or by neighborhood. It gives you a clear snapshot of the distribution across an area. What are some problems that I personally know from my data visualization journey? You need to be very mindful of borders when you use a map.
For example, here, I have Austria, and this was the most precise map I was able to find. For the majority of people, this won't be a problem. But if you are from this country, for example, then you are looking closely at the borders, and you might point out that something here is not right, that something here should be drawn differently if you want to be politically correct. With smaller maps, it's not such a problem. But let's say I have downloaded a world map to use in my design. Now, I want to show some data from Germany. I extract Germany and show some data. Then someone who knows the map very well would say, "But hey, the map isn't politically correct. It isn't very precise." I would say, "Apologies, it was just for illustrative purposes." But if you have a specific country, you can find vector maps of different countries, so you should try to be as precise as possible here to avoid any misinformation, especially if you want to dive deep into regions. This is everything about maps. Maps are beautiful ways to show data, but be mindful of where you get your maps from.

42. 05.13 - Remember This: What should you remember after this entire section? We learned different categories for charts: data over time, comparison, relationship, distribution, and part-to-whole. Of course, this is not everything. We have maps, and some charts are in multiple categories at the same time. Let's just take a look at comparison. It's not possible for me to explain all the charts to you here. Here is a graph showing what types of charts fall into the comparison category. Of course, you get the general idea. You want to compare one category to a different category, and those are the charts you could possibly use. Each chart that we've learned about, and each that we didn't learn about, has its strong sides and its weak sides.
As a data visualization specialist, you need to know what to choose when, and what would be inappropriate for the dataset that you are currently working with. Sometimes it is okay to combine charts into a combo chart, but be very mindful of the axes that you use, and don't try to lie with your data. Always be honest and try to minimize the lie factor as much as possible.

43. Exercise 1 – Analyzing Data: Welcome to the practical exercises section of this course, where we will apply everything we've learned so far to a real chart design to see if we actually practice what we preach. This is the case study: Switzerland tourism data 2050-2060. The brief: Welcome to Neo Travel Horizons. We are the leading strategic consultancy for the 2050 travel market. We have finalized our decade forecast for Switzerland. Our team has combined the historical data that was available with our proprietary hyperloop impact estimates. Our mission, your mission, is to design a chart. Our board of directors needs to see the relationship between the total volume of visitors and the speed of growth. They won't look at a table. They need a high-integrity chart that clearly separates what we know from what we predict. Just reading the brief: the total volume of visitors, total volume, is most likely a bar chart or a column chart. The speed of growth means changes in percentage, changes over time. This could imply a line chart. My plot suggestions are below the brief: column and line. You could take your own spin on this, but this would be the most appropriate. Here is the table with the data. I have two tables, one with commas and one with dots, because the US version requires dots if you use Excel or PowerPoint. PowerPoint will be my tool of choice, but you can of course use any tool you want. I'll share the file with all the data here. I'm in Europe, so I'll select the commas, and we have a year.
We have total visitors, we have percentage of change, and we have a data status that is either actual or estimate. So we need to figure out a way to show the estimated data as well. I've selected our logo, our font, and a beautiful color scheme. You can use the color scheme for your designs as well, or you can go with your own. That's not a problem. Here is the finished end result of the chart that I want you to create. This is my spin on the brief and the table we just had, and I will go over it with you. I will use PowerPoint, on the slide where it says "Work here", and here are the tasks you need to perform. Analyze the table and decide on an initial chart type. I think we've already done that, because when I look at the table, I'll of course use the years. Then I have total visitors. This is a total volume, so I'll use some kind of bar chart or column chart, and for the percentage of change, we will use a line chart. So we need to combine the two or make two separate charts to showcase the data. We need to remember that some of the data is estimated. Okay, let us start working. Select appropriate data. Just so I see what I've already done, I'll select everything for now, because I want all the data that is available here. I'll select Insert Chart, and I need to decide on the chart initially. I'll select this column chart, and I'll plug in my data. As you can see, the text is a little bit, but that's no problem. I'll decrease the amount of data to select just the total visitors for now, and this is my result. I'll close this. I have now inserted a column chart, so I've finished task number three, and I'll start to redesign and adjust it in a second. Let's adjust this in the next lecture. This is our initial design, and from here, we will try to improve it step by step.

44. Exercise 1 – Building Chart: In this lecture, we'll adjust the design so it looks more like this and is more understandable for the viewer.
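For anyone following along in code rather than PowerPoint, the table analysis from this exercise can be sketched in a few lines of Python. This is only a minimal sketch: the rows below are invented for illustration and are not the course table, and the `percent_change` helper and the `(E)` marker are assumptions, not something the course files provide. It shows the two things the brief asks for: growth derived from total volume, and estimates kept apart from actuals.

```python
# Hypothetical rows: (year, total visitors in millions, data status).
# These numbers are invented for illustration; they are not the course table.
rows = [
    (2054, 10.0, "actual"),
    (2055, 10.5, "actual"),
    (2056, 11.0, "actual"),
    (2057, 11.66, "estimate"),
    (2058, 12.36, "estimate"),
]


def percent_change(values):
    """Year-over-year change in percent; the first year has no predecessor."""
    changes = [None]
    for prev, curr in zip(values, values[1:]):
        changes.append((curr - prev) / prev * 100)
    return changes


totals = [total for _, total, _ in rows]
changes = percent_change(totals)

# Years that should be visually marked as estimates (for example with an "E").
estimated_years = [year for year, _, status in rows if status == "estimate"]

for (year, total, status), change in zip(rows, changes):
    change_text = "n/a" if change is None else f"{change:+.1f}%"
    marker = " (E)" if status == "estimate" else ""
    print(f"{year}{marker}: {total:.2f}M visitors, change {change_text}")
```

The totals would feed the columns, the percent changes would feed the line on the secondary axis, and `estimated_years` tells you which bars need the lighter, dashed treatment.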
Okay, let us follow the brief and adjust the design to our liking: the bar width, the data labels, the grid lines, the font size. Let's go over it, depending of course on the tool you use. In PowerPoint, I'll right-click, choose Format Data Series, and decrease the gap width. Of course, not so much that the bars touch each other, because that looks like a histogram or an area graph, but I want them to be considerably thicker. Okay, I think this is beautiful. Now the data labels. I do like having the data labels here, and this is data that is actually important, so I want to enable the data labels. Because I already have the data labels here, I don't need to be redundant. Remember, remove unnecessary ink from the slide. I removed the left axis because of that. Now we have grid lines and font size. I don't think we need the grid lines, because grid lines are useful when we rely on the axis on the left side rather than on data labels. Right now, we actually want to see the data clearly, so I'll delete the grid lines. For now, I will delete the title, because we will add a title here. I'll increase the font size: I'll press Control B to make it bold, and I'll increase it. I do like having the same color. You can decide on that yourself. I'll change the color to the first color of this presentation. Now the color is consistent with the bars. On the bottom, we have the years. The years are beautifully displayed. I think we can have the years here, and I need to decide whether I want them bold as well and a little bigger. I think I do. I think this is okay. Let's take a look at what we had previously. Okay, a very similar design. Beautiful. I think I've adjusted this design. Now I want to signal estimates appropriately. In my case, PowerPoint doesn't have any built-in way to mark estimates. I can place a letter E here. Let's go to Insert, Text Box. I'll insert the letter E, and I'll take the letter and use one of the colors that I have.
I have a red color here, so I'll put the estimate markers next to the last four years. I'll reduce the size so everything looks consistent, and I'll press Control D. I'll take the year, and I'll position this accordingly. Okay. Control D again, position it accordingly. Maybe a little to the left, and Control D, a little bit to the right. I think we can group them together. Now I want to take the last four bars. I need to select the bars first. Then I need to click another time. I need to go to the fill options, and I will start with a solid fill and partial transparency. The solid fill should be the same color. Let's go for 60% to be consistent: solid fill, 60%, solid fill, 60%, and solid fill, 60%. I would like to put a dotted line here as well to make it very clear, so I'll select a gradient line. I'll deselect the gradient. Let's make it maybe this dark blue that we have. For the width, I'll increase it so I see it clearly, and I'll change the dash type to dashes. Okay, maybe shorter dashes. Beautiful. Now I would like the bottom to be transparent, just so the bottom edges aren't visible here. I'll increase the transparency. Now I can move this color stop a little further to lower the gradient. Okay, I think this is a beautiful showcase of estimates, but we don't need the five-point width. I think a width of 1.75 or 2 is beautiful. Now this is a beautiful design. I'll select the gradient line, and I will repeat the steps on the other bars. It is a bit tedious, but it's the most professional way I know to do this. This way, we clearly show that this part of the chart is only estimates. You can of course make your own version of that, and I feel like maybe we could increase the transparency to make it more apparent. But other than that, I feel everything is very well designed. This part is complete. Let's go to the next lecture, where we will try to plot the change over time on top of this chart.

45.
Exercise 1 – Building Title: In this lecture, let us add the line chart on top of the columns, or use a combo chart, and work on the title. Let us go over task number six. I'll click on the chart. I'll go to Chart Design, Edit Data, and we need to include this part as well, or just create a separate chart with this data, depending on the software you use. In PowerPoint, I can just extend the selection, and now I have both on the same chart. I'll go to Change Chart Type. I'll use a combo chart: clustered column, and here I want a line chart, or a line chart with markers will be even prettier. I want this to be on the secondary axis. I'll press OK. I'll enable the axis again, so I see both. Now I can adjust the line chart to my liking. Here on the right side, I'll right-click on this axis and format it. In the sizing, I need to make this a little lower, and this a little lower as well, and set the minimum bound to maybe 0.1, so the line chart is actually visible here. Now, I just need a couple of design changes to make everything look appropriate, depending on the software you use. Of course, you can do this yourself. Somewhere I lost the data labels, so I'll just bring them back. I'll make this bigger and blue. For this line, I want the data labels to have the same color as the line, so I'll select the yellow, or you can use another color depending on what you use. In my case, I want the data labels to be above the line. If something is not visible, I'll just take it and put it higher. If I have a zero here, I'll just delete it. I cannot properly see those labels on the right side. For that, you can either add a shadow behind the text (Text Options, Shadow, open the shadow settings and add a shadow) or take the bars. Let me go to the design options. Click on the bar and reduce the transparency to maybe 60%, or 55%, so the contrast between the text and the bar is a little lower.
I think this would still look very good, and now it's easier to read. Because PowerPoint has problems when I remove the axis, I'll just take a white shape and put it on top. That's no problem. It's just a gimmicky design trick I need to do in this particular software. I'll press Control D, and I want to hide this as well. For the estimates, we need to move the estimate markers back again, and we are approximately where we want to be. Of course, I need to reposition them, but that's just something I have to do depending on the program I'm working in. For the line itself, I would prefer it to be much thicker, and the markers to be a lot thicker too, so it stands out a little more. Okay, now I think the design is beautiful, very similar to what we had. What do I want to say with my title? I want to say that what you see here is the total number of visitors in Switzerland and the change over time, and 2050 to 2060 are the years. We could add an annotation, maybe an asterisk and "estimates by Neo Travel Horizons". Maybe this is my company name. I'll put it in red because we have the logo here as well. I could put the information somewhere else. If I prefer, I could put this information on the bottom, next to the estimates, and I think this will look a little cleaner and not clutter the actual title. This is a very simple title. If you would like to use an action title, you would have to draw a conclusion depending on what you want to say. If you want to focus on the estimates, you would say something like "Estimates predict an average of 6% growth over the next four years", or "Growth has been steadily increasing or holding the same level in recent years and for the foreseeable future". The title needs to reflect the narrative of the chart. For the legend, I will not go over it. I think this is perfectly understandable, and it depends on the tool you use.
In PowerPoint, I like to design custom legends, because if I take the original legend that we have here, I cannot adjust a lot of things. So I prefer to design the legend myself, depending on what I need. This is the finished first exercise and what I want you to be capable of achieving after this course.

46. Exercise 2 – Analyzing Data: In this exercise, we are Cargo. We have a case study: the warehouse optimization. The brief: Welcome to Cargo. We manage supply chains for the agricultural sector. Currently, our main grain silo in Kentucky holds 206 tons of inventory. To meet the demand, we need to reach a final capacity of 209. We have a dataset, and our mission is to build a chart, the bridge that explains how we reach this target. The regional manager needs to see the gains from the autumn harvest, the supplier buyback, and the winter distribution losses. We need to see the movement of physical goods, not just the final number. My plot suggestion here is the waterfall, because the waterfall will beautifully show the change over time that is happening within this table. Take a look at this table. On the left side, we have what happens in the inventory, then the change, the running total in case we don't need the individual change numbers, and the category type, which says what is happening. The baseline is 206, 209 is what we need to reach, and we see where the increases and decreases are. This is ideal for creating a waterfall chart. I've designed some elements: a color scheme, our fonts, our logo, and the axis break icons, because in this lecture, you will learn about the axis break icons that we need in order to showcase this waterfall chart professionally.
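The table described above (a baseline, individual changes, and a running total) maps directly onto how a waterfall chart is built. The following is a minimal Python sketch under stated assumptions: only the opening 206 and closing 209 come from the brief, the three change values are hypothetical numbers chosen to add up to the net gain of 3, and `waterfall_steps` is an illustrative helper name, not part of any charting library.

```python
# Only the opening value (206) and the target (209) come from the brief.
# The individual changes below are hypothetical and chosen to sum to +3.
opening = 206
changes = [
    ("Autumn harvest", +6),
    ("Supplier buyback", +2),
    ("Winter distribution", -5),
]


def waterfall_steps(opening, changes):
    """Return (label, bar_bottom, bar_top, kind) for every bar in the chart."""
    # Total bars are anchored at zero; change bars float.
    steps = [("Opening inventory", 0, opening, "total")]
    running = opening
    for label, delta in changes:
        new_running = running + delta
        # A floating bar spans the old and the new running total.
        bottom, top = sorted((running, new_running))
        kind = "increase" if delta >= 0 else "decrease"
        steps.append((label, bottom, top, kind))
        running = new_running
    steps.append(("Closing inventory", 0, running, "total"))
    return steps


steps = waterfall_steps(opening, changes)
for label, bottom, top, kind in steps:
    print(f"{label:<20} {kind:<8} from {bottom} to {top}")
```

The first and last entries correspond to the bars that get "set as total" in PowerPoint, while the entries in between are the floating change bars.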
Here is the end result of what we want to achieve, and you can see the axis breaks. To magnify the changes and show both the big bars, the 206 bars and the 5 bars, next to each other, we need to somehow shorten the axis, but do it professionally: not just cut it to our liking, but use axis breaks, which can be symbolized like this or, for example, like that. Different presentations show it differently. Most likely, it will look something like that. From the data, in my case, I will only need this column and the changes that occur. I have finished task number one: I have decided on an initial chart type. I have also selected the appropriate data and will now insert a chart. Within PowerPoint, I'll select Insert Chart, and this time, I want to search for the waterfall chart. The waterfall chart will allow me to show this precisely. Okay, there is a lot going on. Let me first plug the data into the software. Let me make this a little bigger, and that a little bigger, so we see it, and I'll remove all data that is unnecessary. Okay. But we now have empty space here. In PowerPoint, if you have this empty space, you need to select the data once again, because currently everything is selected, even the data that I no longer have. So I need to select the data again to display only the things that I have. I'll press OK, and you can see we have fixed the table. Okay, I've completed task number three. Before we use an axis break and make any changes, I need to take this last bar, right-click on it, and select Set as Total, because on a waterfall chart, we have the totals and the changes over time. The first one should be a total as well. I think it is. Oh, it isn't. I'll set this as total. Now I'm sure that this is a total. This is a change, change, change, and this is another total. I'll additionally enable data labels, and I'll go from there.
In the next lecture, let me adjust everything so it looks normal. This is exactly why we need to change the axis and make an axis break, so it looks professional.

47. Exercise 2 – Action Title: In this lecture, we will continue the design of our waterfall chart, and to make everything more understandable, let's reduce the number of words that are used here. I'll click on the chart. I'll go to Chart Design, Edit Data, and from the data, what do I want to change? Do I need "2048 opening inventory"? Let's make it "Inv." for inventory. Okay? "2049 Inv." Then we have inflow, outflow, buyback. I think we will know what's going on here. Okay, this is now much cleaner. For the chart itself, we have the title separately above it, so I'll get rid of the chart title. Do we need this legend? It depends. You can keep it, but I think this is perfectly understandable without it. Now for the axis: I need to adjust the axis to start at maybe 190. I'll right-click on the axis, choose Format Axis in my case, and set the minimum to 190. Now everything is more understandable, but we need to do the break. PowerPoint as is doesn't have this feature. So what I can do is select a shape. I can hide this with any kind of shape that has the color of the background. In my case, the color of the background is the first color, and I'll select no outline. What I want to do here is insert some kind of zero, so everyone sees that the axis starts at zero. Okay? Beautiful. This zero can be a little bigger. The zero can be here. Now I need icons for the axis break. I've prepared icons, so I'll select the icons here, Control C, and I'll bring them here with Control V. What you want to do is put the axis break on everything that is being broken. In my case, that means those two bars, the total bars, and also, of course, the axis itself, because it needs to be communicated that the axis is being broken in that very place, and the axis resumes at 195.
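One way to see why the break has to be communicated so clearly is to apply the lie factor from earlier in the course to this chart. The sketch below uses Tufte's definition (size of the effect shown in the graphic divided by size of the effect in the data) on the two total bars, 206 and 209, for an axis starting at 0 and at 190. The function names are illustrative assumptions, and this is a rough check rather than part of the exercise.

```python
def relative_change(first, second):
    """Relative change from the first value to the second."""
    return (second - first) / first


def lie_factor(first, second, axis_min):
    """Lie factor for two bars drawn on an axis that starts at axis_min.

    The drawn bar heights are measured from axis_min, while the data
    effect is measured on the real values.
    """
    shown = relative_change(first - axis_min, second - axis_min)
    actual = relative_change(first, second)
    return shown / actual


opening, closing = 206, 209
for axis_min in (0, 190):
    factor = lie_factor(opening, closing, axis_min)
    print(f"axis starts at {axis_min}: lie factor {factor:.1f}")
```

With the axis starting at 190, the closing bar is drawn far taller relative to the opening bar than the roughly 1.5% real gain would justify, which is why the break icons on the bars and on the axis matter.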
I'll maybe make the labeling a little more prominent, even a bit bigger. We now have a 14-point font, and I'll do the same for the zero: 14-point font, blue, and I'll position it appropriately. In my opinion, the axis is a little bit invisible, so I would like to increase the size of the line, or actually make it visible. I'll choose a solid line and go for the solid blue color that we have. I'll increase its width. Now the line is very apparent, I can put the break here, and everything is beautifully displayed. I think the grid lines aren't necessary, so I'll remove them. What else do we have to do? We have completed task number four. We have adjusted the design a little bit. Of course, it's debatable whether this is enough. I could make those a little bigger as well. In my opinion, this will be beautifully displayed that way. And we need to build a title. What we want to say is that we have a net three-ton gain over the previous year. So: the 2049 closing inventory shows a net three-ton gain over the previous fiscal year. We should also say that we reached the goal; this indicates that we achieved the result we wanted. It depends on what you want to say with your chart, and always remember when you write titles: write them so they can be understood without seeing the chart. This title itself: "We reached the goal of 209 tons closing inventory in 2049, which is a net three-ton gain over the previous fiscal year." I know this is extensive and long, but right now, if someone didn't see the chart at all, he would understand what has happened. The other way around, if someone sees this chart, he will then understand this action title, because here I have the entire story and narrative that this chart drives. I think we have now completed this exercise. Make your own adjustments, make your own design choices. Try to make a waterfall chart.
If you want, you can, of course, for example, use green for increase, use red for decrease. This would be just as good depending on the designs we want to do. And here I have a clear, understandable chart that reaches the goal of what we wanted to say with it. Thank you so much for working with me through this exercise.
