The Ultimate SQL & Tableau Course: From Zero to Hero | Baraa Khatib Salkini | Skillshare


The Ultimate SQL & Tableau Course: From Zero to Hero

Baraa Khatib Salkini, Lead Big Data, Cloud Architecture, Data

Watch this class and thousands more

Get unlimited access to every class
Taught by industry leaders & working professionals
Topics include illustration, design, photography, and more


Lessons in This Class

    • 1.

      SQL | Course Introduction

      1:57

    • 2.

      SQL | Course Curriculum Overview

      1:52

    • 3.

      SQL | Introduction

      6:30

    • 4.

      SQL | Why Learn SQL?

      4:07

    • 5.

      SQL | The Database Concepts

      3:59

    • 6.

      SQL | Table Concepts

      2:45

    • 7.

      SQL | Main SQL Commands

      4:23

    • 8.

      SQL | The Elements of SQL Statements

      4:28

    • 9.

      SQL | Download & Install MySQL

      5:51

    • 10.

      SQL | Tour in the Interface of MySQL Workbench

      5:30

    • 11.

      SQL | Install the Course Database

      4:37

    • 12.

      SQL | Guide to SQL Coding Style

      4:41

    • 13.

      SQL | SELECT Statement

      7:29

    • 14.

      SQL | DISTINCT

      3:19

    • 15.

      SQL | ORDER BY

      9:11

    • 16.

      SQL | WHERE

      6:21

    • 17.

      SQL | Comparison Operators: =, >, <, >=, <=, !=

      7:05

    • 18.

      SQL | Logical Operators: AND, OR, NOT

      11:31

    • 19.

      SQL | BETWEEN

      6:12

    • 20.

      SQL | IN

      4:42

    • 21.

      SQL | LIKE

      12:28

    • 22.

      SQL | JOINS Concept

      4:42

    • 23.

      SQL | AS Statement - Aliases

      3:45

    • 24.

      SQL | INNER JOIN

      8:21

    • 25.

      SQL | LEFT JOIN

      3:09

    • 26.

      SQL | RIGHT JOIN

      2:30

    • 27.

      SQL | FULL JOIN

      4:06

    • 28.

      SQL | UNION

      10:11

    • 29.

      SQL | Aggregate Functions

      12:12

    • 30.

      SQL | String Functions

      12:52

    • 31.

      SQL | GROUP BY

      8:28

    • 32.

      SQL | HAVING

      5:47

    • 33.

      SQL | SubQuery: EXISTS vs IN

      9:51

    • 34.

      SQL | INSERT

      14:52

    • 35.

      SQL | UPDATE

      5:53

    • 36.

      SQL | DELETE & TRUNCATE

      4:37

    • 37.

      SQL | CREATE Table

      10:03

    • 38.

      SQL | ALTER Table

      1:49

    • 39.

      SQL | DROP Table

      0:54

    • 40.

      Tableau | Course Introduction

      3:21

    • 41.

      Tableau | Course Curriculum Overview

      5:11

    • 42.

      Tableau | Section: Tableau Basics

      0:32

    • 43.

      Tableau | Big Data Buzzwords

      9:01

    • 44.

      Tableau | What is Business Intelligence (BI)

      3:03

    • 45.

      Tableau | The Power of Data Visualization

      3:27

    • 46.

      Tableau | Tableau vs Excel

      9:33

    • 47.

      Tableau | Best 3 BI Tools

      1:09

    • 48.

      Tableau | What is Tableau?

      2:51

    • 49.

      Tableau | Why Tableau is Powerful?

      5:30

    • 50.

      Tableau | Section: Tableau Products

      0:29

    • 51.

      Tableau | Development Process

      3:41

    • 52.

      Tableau | Tableau Desktop

      2:08

    • 53.

      Tableau | Tableau Public Desktop

      1:22

    • 54.

      Tableau | Tableau Prep

      2:22

    • 55.

      Tableau | Tableau Desktop vs Prep

      3:35

    • 56.

      Tableau | Sharing Process

      2:49

    • 57.

      Tableau | Hosting Tableau: On-Prem vs IaaS vs Saas

      6:34

    • 58.

      Tableau | Tableau Server & Cloud

      2:59

    • 59.

      Tableau | Tableau Public

      3:05

    • 60.

      Tableau | Tableau Reader & Mobile

      2:43

    • 61.

      Tableau | Tableau Server vs Cloud vs Public vs Reader vs Mobile

      4:09

    • 62.

      Tableau | Section: Tableau Architecture

      0:38

    • 63.

      Tableau | Live vs Extract

      2:33

    • 64.

      Tableau | Tableau File Types

      4:59

    • 65.

      Tableau | Tableau Architecture: Desktop Components

      8:09

    • 66.

      Tableau | Publish Process

      1:54

    • 67.

      Tableau | Authentication Process

      1:54

    • 68.

      Tableau | Access View Process

      4:58

    • 69.

      Tableau | Tableau Server Architecture

      11:43

    • 70.

      Tableau | Tableau Public Architecture

      3:45

    • 71.

      Tableau | Section: Prepare Your Pc

      0:36

    • 72.

      Tableau | Download & Install Tableau

      1:40

    • 73.

      Tableau | Create Tableau Public Account

      1:40

    • 74.

      Tableau | Get Training Datasets

      6:28

    • 75.

      Tableau | Publish First Viz

      2:37

    • 76.

      Tableau | Tour of the Interface

      14:31

    • 77.

      Tableau | Section: Data Modeling

      0:47

    • 78.

      Tableau | Concept of Data Modeling

      6:44

    • 79.

      Tableau | Tableau Data Modeling

      5:47

    • 80.

      Tableau | Joins

      9:23

    • 81.

      Tableau | Union

      7:38

    • 82.

      Tableau | Relationships

      17:56

    • 83.

      Tableau | Data Blending

      7:30

    • 84.

      Tableau | Join vs Union

      0:57

    • 85.

      Tableau | Join vs Data Blending

      4:07

    • 86.

      Tableau | Join vs Relationship

      5:51

    • 87.

      Tableau | Join vs Relationship vs Union vs Blending

      3:44

    • 88.

      Tableau | Build 2x Data Sources

      12:31

    • 89.

      Tableau | Section: Tableau Metadata

      0:48

    • 90.

      Tableau | Introduction to Metadata

      2:21

    • 91.

      Tableau | Data Types

      18:17

    • 92.

      Tableau | Data Type Roles

      5:12

    • 93.

      Tableau | Dimensions vs Measures

      19:08

    • 94.

      Tableau | Discrete vs Continuous

      15:57

    • 95.

      Tableau | Data Types vs Dimension & Measure vs Discrete & Continuous

      1:52

    • 96.

      Tableau | Section: Tableau Renaming

      0:30

    • 97.

      Tableau | Naming Conventions

      11:36

    • 98.

      Tableau | Renaming

      11:12

    • 99.

      Tableau | Aliases

      9:20

    • 100.

      Tableau | Section: Organizing Your Data

      0:38

    • 101.

      Tableau | Hierarchy

      19:26

    • 102.

      Tableau | Groups

      14:04

    • 103.

      Tableau | Cluster Groups

      10:36

    • 104.

      Tableau | Sets

      25:46

    • 105.

      Tableau | Bins & Histograms

      11:22

    • 106.

      Tableau | Section: Filtering & Sorting Data

      0:39

    • 107.

      Tableau | Types of Filters

      19:26

    • 108.

      Tableau | How to Create Filters

      24:59

    • 109.

      Tableau | Customize Filters

      30:45

    • 110.

      Tableau | 10x Filter Tips & Tricks

      17:14

    • 111.

      Tableau | Sorting Data

      17:21

    • 112.

      Tableau | Section: Parameters

      2:33

    • 113.

      Tableau | Dynamic Calculations using Parameters

      6:22

    • 114.

      Tableau | Dynamic Reference Lines using Parameters

      1:52

    • 115.

      Tableau | Dynamic Filters using Parameters

      3:57

    • 116.

      Tableau | Swap Measures/Dimensions using Parameters

      10:15

    • 117.

      Tableau | Dynamic Titles using Parameters

      3:02

    • 118.

      Tableau | Dynamic Bins Using Parameters

      3:28

    • 119.

      Tableau | Section: Actions

      2:57

    • 120.

      Tableau | Action: Go To URL

      6:18

    • 121.

      Tableau | Action: Go To Sheet

      1:50

    • 122.

      Tableau | Action Filter & Quick Actions

      6:52

    • 123.

      Tableau | Action Highlight

      4:44

    • 124.

      Tableau | Action Sets

      6:46

    • 125.

      Tableau | Action Parameters

      5:47

    • 126.

      Tableau | Action Triggers

      1:51

    • 127.

      Tableau | Section: Tableau Calculations

      0:37

    • 128.

      Tableau | Introduction to Calculations

      11:00

    • 129.

      Tableau | Calculation Components

      8:32

    • 130.

      Tableau | Nested Calculations

      5:35

    • 131.

      Tableau | 4 Types of Calculations

      22:15

    • 132.

      Tableau | Number Functions: CEILING, FLOOR, ROUND

      10:15

    • 133.

      Tableau | Change Cases: LOWER & UPPER

      10:47

    • 134.

      Tableau | Remove Spaces: LTRIM, RTRIM, TRIM

      11:50

    • 135.

      Tableau | Extract Substring: LEFT, RIGHT, MID

      12:02

    • 136.

      Tableau | Search: STARTSWITH, ENDSWITH, CONTAINS, FIND, FINDNTH

      26:11

    • 137.

      Tableau | CONCAT & SPLIT

      15:19

    • 138.

      Tableau | REPLACE

      7:06

    • 139.

      Tableau | Extract Dateparts: DATENAME, DATEPART, DATETRUNC, DAY

      30:11

    • 140.

      Tableau | Add & Subtract Dates: DATEDIFF, DATEADD

      12:26

    • 141.

      Tableau | TODAY & NOW

      6:46

    • 142.

      Tableau | NULL Functions: ZN, IFNULL, ISNULL

      12:57

    • 143.

      Tableau | Logical Functions: IF, ELSE, ELSEIF, IIF, CASEWHEN

      29:10

    • 144.

      Tableau | Logical Operators: AND, OR, NOT

      16:22

    • 145.

      Tableau | Aggregate Functions: SUM, AVG, COUNT, COUNTD, MAX, MIN

      19:06

    • 146.

      Tableau | ATTR Attribute Function

      15:09

    • 147.

      Tableau | Introduction to LOD Expressions

      8:46

    • 148.

      Tableau | FIXED LOD Expression

      9:26

    • 149.

      Tableau | EXCLUDE LOD Expression

      5:31

    • 150.

      Tableau | INCLUDE LOD Expression

      12:57

    • 151.

      Tableau | Table Calculations: FIRST, LAST, INDEX, RANK

      21:46

    • 152.

      Tableau | Table Calculations: RUNNING TOTAL

      6:05

    • 153.

      Tableau | Table Calculations: DIFFERENCES

      7:25

    • 154.

      Tableau | Section: Tableau Charts

      1:00

    • 155.

      Tableau | Multiple Measures in One View

      20:43

    • 156.

      Tableau | Bar Charts

      10:07

    • 157.

      Tableau | Bar-in-Bar Chart

      2:12

    • 158.

      Tableau | Barcode Chart

      0:59

    • 159.

      Tableau | Line Charts

      9:54

    • 160.

      Tableau | Highlighted Line Charts

      5:52

    • 161.

      Tableau | Bump Chart

      4:16

    • 162.

      Tableau | Sparkline Chart

      2:15

    • 163.

      Tableau | Barbell Chart

      4:56

    • 164.

      Tableau | Rounded Bar Chart

      1:48

    • 165.

      Tableau | Slope Chart

      3:42

    • 166.

      Tableau | Bar & Line Charts

      2:42

    • 167.

      Tableau | Bullet Chart

      1:57

    • 168.

      Tableau | Lollipop Chart

      4:43

    • 169.

      Tableau | Area Charts

      5:10

    • 170.

      Tableau | Scatter Plots

      3:22

    • 171.

      Tableau | Dot Plot

      1:25

    • 172.

      Tableau | Circle Timeline

      2:08

    • 173.

      Tableau | Pie & Donut Charts

      7:05

    • 174.

      Tableau | Heat & Treemap Charts

      3:41

    • 175.

      Tableau | Bubble Charts

      3:49

    • 176.

      Tableau | Maps

      8:41

    • 177.

      Tableau | Histograms

      3:08

    • 178.

      Tableau | Calendar Chart

      2:29

    • 179.

      Tableau | Waterfall Chart

      2:22

    • 180.

      Tableau | Pareto Charts

      7:49

    • 181.

      Tableau | Butterfly (Tornado) Charts

      6:07

    • 182.

      Tableau | Quadrant Chart

      7:13

    • 183.

      Tableau | Box Plot

      3:07

    • 184.

      Tableau | KPI

      3:35

    • 185.

      Tableau | KPI & Bars

      4:51

    • 186.

      Tableau | BANS

      2:55

    • 187.

      Tableau | Funnel Chart

      2:29

    • 188.

      Tableau | Progressbar

      1:57

    • 189.

      Tableau | Choose The Right Chart

      12:14

    • 190.

      Tableau | Section: Tableau Dashboard

      16:37

    • 191.

      Tableau | Tableau Dashboard Project

      10:02

    • 192.

      Tableau | Section: Tableau Project

      0:53

    • 193.

      Tableau | Tableau Project Steps

      3:03

    • 194.

      Tableau | #1 Step - Requirements Analysis

      9:43

    • 195.

      Tableau | #2 Step - Building Data Source

      7:27

    • 196.

      Tableau | #3 Step - Building Charts

      51:33

    • 197.

      Tableau | #4 Step - Building Sales Dashboard

      49:13

    • 198.

      Tableau | #5 Step - Building Customer Dashboard

      21:57

    • 199.

      HR Project | Introduction

      2:57

    • 200.

      HR Project | Build Data Source

      6:44

    • 201.

      HR Project | Build Charts - Part1

      25:57

    • 202.

      HR Project | Build Charts - Part2

      25:13

    • 203.

      HR Project | Sketch Mockup of Summary Dashboard

      10:40

    • 204.

      HR Project | Build the Summary Dashboard

      19:45

    • 205.

      HR Project | Fine Tuning The Summary Dashboard

      75:19

    • 206.

      HR Project | Build the Table

      13:50

    • 207.

      HR Project | Sketch Mockup of Detailed Dashboard

      3:22

    • 208.

      HR Project | Build The Detailed Dashboard

      28:15

    • 209.

      HR Project | Bonus - Build Background Layers using FIGMA

      9:21

    • 210.

      Congratulations & THANK YOU Video

      0:47

    • 211.

      Advanced SQL | Download SQL Server & SSMS

      5:25

    • 212.

      Advanced SQL | Create Databases

      5:19

    • 213.

      Advanced SQL | Tour in the Interface of SSMS

      4:24

    • 214.

      Advanced SQL | What are Window Functions

      12:41

    • 215.

      Advanced SQL | Syntax of Window Functions

      5:02

    • 216.

      Advanced SQL | Window Functions: PARTITION BY

      10:11

    • 217.

      Advanced SQL | Window Functions: Order BY

      4:16

    • 218.

      3 5 window frame

      14:26

    • 219.

      3 6 window Rules

      7:38

    • 220.

      3 7 window summary

      2:32

    • 221.

      4 1 win aggr what is

      2:31

    • 222.

      4 2 win aggr count

      16:42

    • 223.

      4 3 win aggr sum

      7:27

    • 224.

      4 4 win aggr avg

      9:02

    • 225.

      4 5 win aggr min max

      9:37

    • 226.

      4 6 win aggr rolling running

      9:36

    • 227.

      4 7 win aggr moving avg

      9:07

    • 228.

      4 8 win aggr summary

      4:40

    • 229.

      5 1 win rank what is

      5:23

    • 230.

      5 2 win rank row number

      4:24

    • 231.

      5 3 win rank rank func

      4:04

    • 232.

      5 4 win rank dense rank func

      4:25

    • 233.

      5 5 win rank compare ranking

      1:02

    • 234.

      5 6 win rank top bottom analysis

      7:15

    • 235.

      5 7 win rank unique ids

      2:46

    • 236.

      5 8 win rank identify duplicates

      5:34

    • 237.

      5 9 ntile

      6:14

    • 238.

      5 10 ntile use case data segmentation

      3:58

    • 239.

      5 11 ntile use case data load

      3:26

    • 240.

      5 12 win rank cume dist

      4:49

    • 241.

      5 13 win rank percent rank

      7:48

    • 242.

      5 14 win rank summary

      2:22

    • 243.

      6 1 win value what is

      4:13

    • 244.

      6 2 win value min max

      9:48

    • 245.

      6 3 win value MoM

      6:49

    • 246.

      6 4 win value customer retention

      8:45

    • 247.

      6 5 win value first last

      12:10

    • 248.

      6 6 win value summary

      2:26

    • 249.

      8 1 intro case

      0:37

    • 250.

      8 2 syntax case

      2:30

    • 251.

      8 3 howitworks

      6:48

    • 252.

      8 4 usecase 1

      5:18

    • 253.

      8 5 Rules

      1:29

    • 254.

      8 6 usecase2

      5:58

    • 255.

      8 7 quickform

      2:52

    • 256.

      8 8 usecase3

      3:52

    • 257.

      8 9 usecase4

      4:40

    • 258.

      8 10 summary

      1:41


1,708

Students

15

Projects

About This Class

Gain SQL skills and learn one of employers' most requested skills of 2023! Learning SQL is one of the fastest ways to improve your career prospects.

This is the most comprehensive, yet straightforward, SQL course on Skillshare!

In this course, I have distilled my experience from over a decade of real-world Big Data projects into one Skillshare course.

I designed this course to take you from Zero to Hero in SQL, so if you are a beginner, don't worry: I will explain everything from scratch, step by step. You are not too old or too young, and SQL is super easy to learn.

Each SQL Topic in this course will be explained in 3 steps:

  • The concept (theory).

  • Learn the SQL syntax using simple tasks.

  • Learn how SQL processes the query behind the scenes.

SQL is one of the most in-demand skills for business analysts, data scientists, and anyone who finds themselves working with data! Upgrade your skill set quickly and add SQL to your resume by joining today!

You will have exercises during the videos where I present a task and we solve it together. At the end of each section, you will also have tons of exercises and solutions.

I will provide you with numerous materials:

  • SQL Course Curriculum (Roadmap), which we will track our progress against in each lecture.

  • SQL Cheat Sheet, so you don't have to memorize all the SQL syntax and can have quick access to it later during development.

  • SQL Database and Data for Training, so you can practice with me.

  • SQL Presentations and Concepts, all collected in one place for you to use as a reference.

Topics covered in this course:

  • SQL Basics: Intro, Why SQL, Database Concepts, Table concepts, SQL Commands, SQL Elements

  • Download & Install MySQL

  • Install Database

  • SQL Coding Style

  • SELECT statement

  • DISTINCT

  • ORDER BY

  • Filtering Data: Where

  • Comparison Operators: =, >, <, >=, <=, !=

  • Logical Operators: AND, OR, NOT

  • BETWEEN, IN, LIKE

  • JOINS: Inner, Left, Right, Full

  • UNION & UNION ALL

  • Aggregate Functions: MAX, MIN, AVG, COUNT, SUM

  • String Functions: CONCAT, LOWER, UPPER, TRIM, LENGTH, SUBSTRING

  • Advanced SQL Topics

  • GROUP BY

  • HAVING

  • Subquery: EXISTS & IN

  • Modifying Data: INSERT, UPDATE, DELETE

  • Defining Data: CREATE, ALTER, DROP

For your reference:

https://www.datawithbaraa.com/sql-introduction/

-----Update----

In the MySQL installer, only the "Developer" setup option has been removed. Select the "Full" option instead and everything should work fine.

----------------------------------------------------------- TABLEAU COURSE ------------------------------------------------------

Welcome to Tableau Ultimate Course: Zero to Hero!

Gain Tableau skills and learn one of employers' most requested skills of 2024! Learning Tableau is one of the fastest ways to improve your career prospects.

Tableau is a powerful data visualization and business intelligence (BI) software tool used for analyzing and presenting data in a visually engaging and interactive way. It allows you to connect to various data sources, transform raw data into meaningful insights, and create interactive dashboards, reports, and charts that help you make data-driven decisions.

This is the most comprehensive, yet straightforward, Tableau course on Skillshare!

In this course, I have distilled my experience from over a decade of real-world Data Visualization projects into a 21-hour, high-quality Udemy course.

I designed this course to take you from Zero to Hero in Tableau, so if you are a beginner, don't worry: I will explain everything from scratch, step by step. You are not too old or too young, and Tableau is super easy to learn.

What Makes This Course Stand Out?

  1. This is the only course on Udemy that breaks down the complex concepts of Tableau into animated visuals. In this course you will be presented with over 250 animated visuals.

  2. What is special about this course is that it is taught by me. I'm not just another online instructor: I work at big companies in Germany, like Mercedes-Benz, where I lead BI & Big Data projects. That means you are getting real-life skills out of this course.

  3. You will master over 63 Tableau Charts, equipping you to visualize any data and meet various requirements. You'll gain the expertise to choose the right chart for specific requirements and understand when to utilize each type of chart effectively.

  4. We'll dive deep into 60 Tableau functions that will help you manipulate your data for visualization. You will first understand the concept and how Tableau works, and then we will learn the functions using very simple examples.

I will provide you with materials:

  • 3x Different Training Data Sets

  • 3x Cheat Sheets: Concepts, Calculations and Charts, so you have quick access to everything you need about Tableau.

  • Access to all Tableau files that are created during the course.

  • All Course Sketchnotes are available to be used as reference later.

15 sections are covered in this course:

  • Tableau Basics

  • Tableau Products Suite

  • Tableau Architecture

  • Prepare Your Training Environment

  • Data Modeling | Combining Data

  • Tableau Metadata

  • Renaming & Aliases

  • Organizing data

  • Filtering & Sorting Data

  • Parameters | Dynamic Views

  • Tableau Actions

  • Tableau Calculations

  • Tableau Charts

  • Tableau Dashboard

  • Tableau Project

-------------------------------------------

Meet Your Teacher


Baraa Khatib Salkini

Lead Big Data, Cloud Architecture, Data

Teacher
Level: Beginner



Transcripts

1. SQL | Course Introduction: Hi, and welcome to this very unique SQL course. I'm Baraa Salkini, an IT solution architect with over a decade of experience in IT projects. I will put everything that I know about SQL into a 4-hour tutorial. In this course, you will learn everything that you need about one of the most in-demand skills, SQL, from basic to advanced topics. So by the end of the course, you will be able to write SQL queries very easily. We will work with one of the most popular versions of SQL, MySQL, but the syntax and the skills that you're going to learn from this course can be used in any other database or application that uses SQL. I designed this course to take you from zero to hero. So if you are a beginner, don't worry about it. I'm going to explain everything from scratch, step by step. So now you might ask me what makes this course special compared to other courses. In this course, you will not only learn how to write SQL queries, but you will also learn the SQL concepts behind them, and especially how SQL processes the queries behind the scenes. This can help you to understand why we write SQL queries, and it's going to make you more creative with your query statements. In this course, you will get tons of best practices and tips and tricks that I have collected over the last years. We will have many SQL tasks, and we are going to solve them together step by step. I will also be providing you with a lot of free materials. All the content of this course is also available on my website, datawithbaraa.com. You can use it later as a reference. I will provide you as well with an SQL cheat sheet, where you can find all the tasks and the SQL syntax so you don't have to memorize all of it. I've also prepared a database for this course, which we are going to use in all our tasks and examples during the tutorials, as well as all the SQL presentations and concepts made in this course. So now let's jump in and get started.
2. SQL | Course Curriculum Overview: All right everyone, so now I would like to show you the roadmap of the entire SQL course for beginners. The SQL course is divided into nine chapters. First, we're going to start with the basics, where you can learn the basic concepts of SQL, like the concept of databases, the concept of SQL tables, the basic SQL commands, and the main elements of SQL statements. In the next chapter, we're going to start preparing your environment so you can practice with me. I will walk you through the steps of downloading and installing MySQL. Then we will take a quick tour of the interface, and at the end we're going to install the database of our course. Then, finally, you will begin to use SQL syntax to query the database and the tables that you just created in the previous section, using SELECT statements. After that, you will learn how to filter your data using the WHERE clause and learn some SQL operators. In the next chapter we're going to step up the level: we're going to learn how to combine our SQL tables using joins and UNION. After that, we're going to learn many important SQL functions, like aggregate and string functions. Then in the next chapter we're going to raise the level again by learning advanced topics in SQL like GROUP BY, HAVING, and subqueries. Then we're going to learn how to modify the data inside our tables using INSERT, UPDATE, and DELETE. And in the last chapter of this course, we will learn how to define our data using SQL commands like CREATE, ALTER, and DROP TABLE. So those are all the topics that we're going to cover in this course. All right everyone, with this, I would say, let's jump in and start our SQL course. We're going to start with the first chapter. Here we're going to talk about the SQL basics and concepts, and we're going to start now with an introduction to SQL.
3. SQL | Introduction: All right, so we will start with the SQL basics, the terms that you'll be hearing during the tutorials. For example, what is data? Data are facts or statistics that are stored somewhere or moving around the network. Generally, they are like raw materials. For example, if you order something online, a lot of data will be generated: the customer ID, the order number, the order date, the shipping date, and so on. Another term that we have is information. We can process the data that we have, structure it, or translate it into a new form called information, which has more logical meaning and which we can use in analysis. For example, if we aggregate the order dates by year, we can see how the company is growing over the years. That means we converted the raw data into meaningful information. All right, so what is a database, or DB for short? By definition, a database is a collection of structured and related data that is stored and organized in such a way that it can be easily accessed and managed. In short, it is one way to store your data. You deal with databases every day and everywhere, for example when you order something online. Even if you store a photo in your smartphone gallery, this gallery is a database. There are many different types of databases. The most famous one, and the one that we're going to learn, is the relational SQL database. Another one is the NoSQL database. We also have distributed databases, cloud databases, data warehouses, and so on. So now I'm going to explain SQL and NoSQL databases, because they are the most famous ones. SQL, or relational, databases store the data inside tables. Tables are like containers with a fixed structure, and usually they are related to each other using relationships. That's why we have the name relational databases. So if your data are very structured and easy to understand, it would be good to use SQL databases to store your data.
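The two ideas above, raw order data stored in related tables and that data being aggregated into information, can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module instead of the MySQL setup used in the course, and the table names, column names, and sample rows are all made up.

```python
import sqlite3

# In-memory SQLite database; the course itself uses MySQL, this is only a stand-in.
conn = sqlite3.connect(":memory:")

# Hypothetical tables: each order points to a customer through customer_id.
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        order_date  TEXT NOT NULL   -- ISO date, e.g. '2022-03-14'
    );
""")

# Raw data: made-up customers and orders.
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Maria"), (2, "John")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1001, 1, "2021-11-02"),
                  (1002, 2, "2022-01-15"),
                  (1003, 1, "2022-06-30"),
                  (1004, 2, "2022-09-09")])

# The relationship between the tables lets us look up who placed each order.
orders_with_names = conn.execute("""
    SELECT o.order_id, c.name
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

# Raw data -> information: count the orders per year.
orders_per_year = conn.execute("""
    SELECT substr(order_date, 1, 4) AS order_year,
           COUNT(*)                 AS order_count
    FROM orders
    GROUP BY order_year
    ORDER BY order_year
""").fetchall()

print(orders_with_names)
print(orders_per_year)  # [('2021', 1), ('2022', 3)]
```

The individual order rows say little on their own; the per-year counts are the "information" that shows growth from one year to the next.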
On the other hand, we have NoSQL databases, or "not only SQL" databases. Here you have different options for how you are going to store your data. For example, you have the key-value method, where you define keys and the values stored under them. You have the graph store, and you have the column store, which is great for big data. Some tools, like Tableau for data visualization, use this method to store the data because it gives great performance for analysis. And you have the document store as well. So if you are in projects where the requirements are changing a lot, or the data are hard to understand and don't have a clear structure, it would be good to use a NoSQL database and one of those methods to store your data. But in many companies, a lot of projects store their data inside SQL databases because they are easy to understand and very widely used. In our tutorials we will be focusing on this type of database, SQL relational databases. Now, in order to manage all those databases, we use software called a database management system, or DBMS. It is like an application with an interface where you can log in and start doing something inside your database. You can do things like creating new tables, changing your data, querying your data, and so on. Currently we have almost 380 different DBMSs. According to this year's Stack Overflow survey (I'm going to leave the link in the description), you can see here a ranking of the most used databases among developers. You can see that MySQL is number one, then PostgreSQL, and so on. We have another ranking website, called the DB-Engines Ranking. If we go there, we find a ranked list of the most used or most popular DBMSs in the world. They use different criteria in order to calculate that, but you can see here that MySQL is in the top three of the list.
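The kind of DBMS session described above (log in, create a table, change some data, query it) might look roughly like this. It is a minimal sketch, assuming Python's built-in sqlite3 module as a lightweight stand-in for a full DBMS such as MySQL; the products table and its rows are hypothetical.

```python
import sqlite3

# Connecting plays the role of "logging in" to the DBMS.
# SQLite needs no server or password, unlike MySQL; it is an illustrative stand-in.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# 1. Create a new table (hypothetical schema).
cur.execute(
    "CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT, price REAL)"
)

# 2. Change the data: insert two rows, then update one of them.
cur.executemany(
    "INSERT INTO products (product_id, name, price) VALUES (?, ?, ?)",
    [(1, "Keyboard", 30.0), (2, "Mouse", 15.0)],
)
cur.execute("UPDATE products SET price = ? WHERE product_id = ?", (12.5, 2))
conn.commit()

# 3. Query the data: the statement is executed and the result rows come back.
cur.execute("SELECT name, price FROM products WHERE price < ? ORDER BY name", (20,))
cheap_products = cur.fetchall()

print(cheap_products)  # [('Mouse', 12.5)]
conn.close()
```

With a server-based DBMS the connection step would also take a host, user name, and password, but the create, change, and query steps keep the same shape.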
In our tutorials, we will be using and learning MySQL, which is one of the most famous and commonly used databases these days. Now finally, what is SQL? It stands for Structured Query Language. By definition, SQL is the query language that we use in order to retrieve, manage, manipulate, and store data in databases. In short, SQL is the language that you need to master in order to talk to databases. Now, on the Internet there is a never-ending battle over how to pronounce it. Some developers call it "sequel" and others, like me, "S-Q-L". It really depends on the country that you come from or the project that you are working on. In my projects everyone calls it S-Q-L, so it's really up to you which one you use. All right, you might ask me now, "Baraa, how does SQL work?" Let's check this. On the right side we have our relational database, where you store your data inside tables. And here we have our DBMS managing our database. The first thing that you're going to do is log in to the DBMS in order to interact with it. Or, if you are building an application, you need to connect it to the DBMS. After that, you start writing some SQL statements, some instructions, and then hit the Execute button. The DBMS will then start processing the statement, do some magic to it, and send it to the database. Once the database gets such a query, it starts performing some operations, searching for the data that you asked for. Once it's ready, the database will answer the DBMS with the result that you wanted. All right guys, so that was a quick introduction to SQL. Next, we're going to talk about why SQL is important and why you should learn it.
4. SQL | Why Learn SQL?: Now I just want to quickly motivate why you should still learn SQL. Here are some facts. SQL is 47 years old; that is 14 years older than me. You can do the math. So SQL is the granddaddy of the programming world. There are over 700 computer languages that you could learn.
You might also hear about the NoSQL movement, where everybody says that NoSQL is going to kill SQL databases. So you might ask now: why do we still use SQL? Why should I learn SQL? Why didn't SQL die like many other languages did, like BASIC or Pascal? Well, the quick answer is that SQL still works. It does the job, and you cannot ask for more than that. Here are four reasons why you should still learn SQL. Reason number one: SQL is one of the most used technologies in the entire tech industry. Let's check the Stack Overflow survey for this year; I will leave the link in the description. In this chart, we can see the most used technologies. And you can see here, SQL is ranked as the fourth most commonly used technology among all developers. That means SQL is still in trend. Reason number two: SQL is in high demand. Most companies, in all industries, use some kind of SQL database to store their data. That means they will always need someone with SQL skills in order to create, manage, analyze, and understand their data. So now let's do a quick check on a job platform like Indeed. Search for the keyword SQL, click Find Jobs, and let's see the results. You can see here that over 170,000 jobs are looking for an SQL developer or someone with SQL skills. That means SQL skills are really in high demand. And that's because data analysis is becoming a very important part of many jobs. The third reason is that SQL is almost everywhere. If you are in projects where you work with data, e.g. data mining, data engineering, data science, or data visualization, you will end up using a lot of big data tools and programming languages. And most of them tend to offer you places to write some kind of SQL statement. E.g. if you are using Tableau, which is a very famous data visualization tool, there are places where you need to write some SQL statements in order to prepare the data. Or if you are in projects where you are doing data streaming using Kafka, for example,
there you will find a lot of functions or modules where you have to write some SQL statements. They do that to make stuff easier. So that means that with time you will see that in almost every tool you can use SQL statements and SQL skills. So now for the last reason: unlike other languages, SQL is simple and easy. It is easy to learn, easy to write, and easy to read, because the SQL syntax is based on very common, easy English words, e.g. SELECT, FROM, CREATE TABLE, WHERE, and so on. And SQL manages to hide all the complicated processes from you. So that's why a lot of people tend to learn SQL: it's really easy. Alright, so now let's sum up. SQL has the best combination: it is very high in demand and it is also easy to learn, which makes learning SQL always a smart move and one of the most impactful career improvements any IT developer can unlock. Alright, so those were my top reasons why you should learn SQL. Next, we're going to talk about the database concepts.

5. SQL | The Database Concepts: Alright, so now let's understand how SQL databases are organized. It's very important to understand that, because once you start writing SQL statements or SQL queries, you need to know the terms that are commonly used in databases, how to browse your database, and how to find your data. If you learn it at the start, it's going to make the learning process of writing SQL statements much faster. Okay, so now, just to make it easier to understand, think of the following analogy. A database is like your city library. We have a very beautiful library in Stuttgart. It's really amazing. I spend a lot of time there; I just like it. So yeah, databases are like libraries. Libraries are usually divided into categories like science fiction, romance, history, sport, and so on. So categories help you to quickly find the materials that you are searching for.
So categories group similar books underneath the same category. We have the same concept in databases as well, and we call it schemas or schemata; pick the one that you like. And of course in libraries we have books as well. We have similar stuff in databases and we call it tables, which contain the actual data. So as you saw in the example, databases are organized in a hierarchy. Let's see how MySQL organizes the data, because not all databases follow the same concepts on how to organize data. So at the top in MySQL we have the database server. It's like a machine containing the software and hardware needed to run our DBMS and databases. Usually a database server is a high-end computer with a lot of CPUs and RAM. But in our tutorials, we will install a database server on our local computer or laptop, and we call it a local server. Inside the server, you can create multiple databases. In MySQL, databases and schemas are synonyms. So a schema, by definition, is a logical container that contains similar tables. With that you get a lot of benefits. E.g. imagine you have a big database with a lot of tables: grouping similar tables underneath schemas makes it easier for you to manage the users, e.g., or to manage the tables, and it reduces complexity. And as well, if you have two tables with the same name, you can store them in different schemas. So it's a really nice way to organize the database. Inside the schema, we then have different tables. Tables are the most important objects in the whole database, because they are the place where you store your data. Without tables, we have no database. And inside the tables you will have at least one column, or several columns. I will go into detail explaining those tables as a next step.
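The point about schemas acting as containers, so that two tables with the same name can coexist, can be tried out with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in so that it runs without a MySQL server; SQLite's ATTACH DATABASE plays the role that CREATE SCHEMA (a synonym for CREATE DATABASE in MySQL) plays in the lesson. The schema names `sales` and `hr` and the table name `contacts` are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL. In MySQL the two schemas would be
# created with CREATE SCHEMA sales; and CREATE SCHEMA hr; instead of ATTACH.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS sales")  # hypothetical schema name
conn.execute("ATTACH DATABASE ':memory:' AS hr")     # hypothetical schema name

# Two tables with the same name can coexist because each lives in its own schema.
conn.execute("CREATE TABLE sales.contacts (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE hr.contacts (id INTEGER PRIMARY KEY, name TEXT)")

conn.execute("INSERT INTO sales.contacts (name) VALUES ('Maria')")
conn.execute("INSERT INTO hr.contacts (name) VALUES ('John')")

# Qualifying the table name with its schema selects the right container.
sales_names = [row[0] for row in conn.execute("SELECT name FROM sales.contacts")]
hr_names = [row[0] for row in conn.execute("SELECT name FROM hr.contacts")]
print(sales_names, hr_names)
```

The schema-qualified form `schema.table` is the same in MySQL, so the SELECT statements would carry over unchanged.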
Okay, so now I just want to show you quickly how other databases, like Microsoft SQL Server or PostgreSQL, organize the data compared to MySQL. As you can see here, the key difference is that they separate databases from schemas. So a database here is the main container, a discrete unit on its own, where you can have logs, jobs, schemas, and data, and which you can back up. A schema over here is like a folder inside the database; it's a logical layer containing different tables. In my opinion, MySQL is a little bit misleading or confusing for developers. E.g. if you go and create a schema, the MySQL DBMS will be creating a database. I found that a little bit confusing at the start. Alright, so that was it about the database concepts. Next we're going to start talking about the SQL table concepts.

6. SQL | Table Concepts: Alright, so now let's talk about SQL tables, because they are really important in databases, and understanding them is going to help you write better SQL statements. The problem is that we have around 380 different databases and they use different terms in their documentation. Another aspect is that we use different terms in different areas of work. E.g. if you are a database developer, you will use terms like tables, columns, and rows. But if you are at university, you will hear about relations and tuples. And as a data modeler you will see entities and attributes. That's why I would like to give you a short overview of those terms, to make it simpler. Alright, so now we have here a very simple example of an SQL table. In our tutorial database, we have one table called customers. This table contains all the data about our customers. Other names that we have for tables are object, entity, and relation. Okay, next we have columns. Columns are vertical groups of cells that describe one type of information. In our example, we have four columns: customer ID, first name, last name, and country.
Each column has two pieces of information: the column name, e.g. here we have first name, and the values inside it, like Maria, John, and so on. Alright, so next we have rows. Rows are horizontal groups of cells that describe one individual item, and the cells in a row are related to each other. So e.g. here, customer ID 2 belongs to John, and John lives in the US. In this table we have five rows. Other names for rows are records and tuples. Now, the piece of data at the intersection of a column and a row we call a cell. Other names for it are data item and column value; it is one single value. Examples are the number four, or Germany, or George, and so on. The last component we have is the primary key. The primary key is a column, or set of columns, that uniquely identifies each row in the table, and it can be used as a link to other tables. In our example, the customer ID is our primary key. You can see it has a unique value for each customer. Another name for it is key field. Alright, so that was the concepts and the main components of SQL tables. And next we're going to start talking about the different types of SQL commands.

7. SQL | Main SQL Commands: Alright, so now let's talk about SQL commands. In SQL we have around 12 main commands and 900 different keywords. Of course, I will not be explaining all of them. Instead, in our tutorials I will be focusing on the most used SQL commands and statements, the ones that I have used in my projects over the last ten years. To make our life easier, SQL commands are divided into different groups depending on their purpose. Alright, let's start with the first group: Data Definition Language, DDL. As the name suggests, here you will find all the commands that allow you to define your database, like creating tables, dropping columns, changing tables: anything that's going to change the structure of your database.
Underneath this group, you can find commands like CREATE, which helps you create anything new in the database: a new table, a new view, stored procedures, and so on. We also have the DROP command, which allows you to delete an object from your database. And the last one is ALTER. It helps you edit the structure of your database, like altering a table to change a column or add a new column. Okay, so now to the second group: Data Query Language, DQL. It contains only one command, and that's enough. It's called the SELECT command. SELECT helps you retrieve your data from your database. SELECT is the most important command that we have in SQL, and the one that you need to master in order to be good at SQL. In my tutorials, I will be explaining everything about SQL SELECT statements, because if you start working with SQL, you will end up writing tons of SELECT statements. Don't worry about it. Alright, so let's go now to the next group: Data Manipulation Language, DML. DML contains all the SQL commands that you use to manipulate the data inside your database. So we have commands like INSERT, to insert new data into your tables, DELETE, to delete some data from your tables, and UPDATE, to update the content of existing data inside tables. So as you see, it is really easy; the name tells everything. Alright, so now we have two groups of commands that are really more for SQL database administrators. The first one is Data Control Language, DCL. DCL contains the SQL commands that allow you to give a specific user access to your database, or to tables, schemas, and so on. So here we have two commands: GRANT, which you use to give someone access to your objects in the database, and REVOKE, to remove such access from a specific user. Okay, so now to the last group that we have: Transaction Control Language, TCL.
In TCL, you will find the SQL commands that help you manage database transactions in order to maintain the integrity of your data. So here we have commands like COMMIT, to save the changes in your database, and ROLLBACK, to restore the database to the last commit or to the last savepoint if you have some errors. And you have SAVEPOINT: you can define savepoints in a transaction, which you can later use to roll back to. Alright, so now about those names, DDL, DQL, DCL, TCL, and so on: you don't have to memorize them. Maybe the only important one is DDL, which you sometimes hear in projects. So if someone says "I will be creating some DDL scripts", that means he or she is going to create SQL statements that change the structure of the database, like creating a new table or dropping something. Alright, so in our SQL tutorials, we will be focusing on the first three groups of SQL commands. We will start with the most famous one, the SQL SELECT statement. After that, we're going to deal with the DDL scripts. And finally, I'm going to explain INSERT, DELETE, and UPDATE. Alright, so that was the main types of SQL commands. Next you will learn the basic elements of SQL statements.

8. SQL | The Elements of SQL Statements: Alright, so now let's start with the basics. I want you to understand, at the start, the basic elements inside each SQL statement. We have over here a very simple SELECT statement. Don't worry about the content; I will be explaining that later. So the whole text that's going to be sent to the database, we call it an SQL statement, or sometimes a query if it is a SELECT statement. So it doesn't matter whether you are retrieving data from the database, creating a new table, or updating content: we always call it an SQL statement. Okay, so now let's talk about the components inside our SQL statement.
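The statement shown on screen is not captured in this transcript. As a rough sketch, a statement with the elements this lesson names (a comment, three clauses, the customers table, the four column names, and the equals, smaller-than, and AND operators) might look like the one below. It assumes Python's built-in sqlite3 module as a stand-in for MySQL; the snake_case column names, the filter values, and all row values are hypothetical placeholders.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL, and the rows are placeholder data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    " customer_id INTEGER PRIMARY KEY,"
    " first_name TEXT, last_name TEXT, country TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO customers (first_name, last_name, country, score)"
    " VALUES (?, ?, ?, ?)",
    [
        ("Maria", "Example", "Germany", 350),
        ("John", "Example", "USA", 900),
        ("Martin", "Example", "Germany", 500),
    ],
)

# The whole text below is one SQL statement (a query, since it is a SELECT).
statement = """
-- Comment: ignored by the database, handy for describing the code
SELECT first_name, last_name, country, score  -- SELECT clause: four column identifiers
FROM customers                                -- FROM clause: one table identifier
WHERE country = 'Germany'                     -- WHERE clause: filter using the = operator
  AND score < 500                             -- AND is a keyword that acts as an operator
  -- AND last_name = 'Other'                  -- a filter deactivated by commenting it out
"""
rows = conn.execute(statement).fetchall()
print(rows)
```

Only Maria's row matches both active conditions; the commented-out filter is never evaluated, which is the "deactivate part of the code" use of comments. In MySQL the `-- ` comment style needs a space after the two dashes, as written here.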
Let's start with the first line over here, the green one. We call it an SQL comment. In an SQL comment you can write anything you want, and once you hit Execute on the whole SQL statement, the database is just going to ignore it. That means nothing is going to happen. There are some benefits of SQL comments. We can use them to describe our code, so that later it's going to be easier to read. And because the database ignores them, we can use them to deactivate part of our code: e.g. if I don't want to use a certain filter over here, I can turn it into a comment and the database will not execute it. Okay, so now, SQL statements are usually divided into different parts. We call them clauses. Each part is responsible for a specific action. In our example over here, we have three clauses, the SELECT, FROM, and WHERE clauses, and each of them has its own unique function. E.g. in SELECT you list the names of the columns that you want, in FROM you name the tables, and in WHERE you define the filters. So as you can see, SQL is really nicely split up by function, which makes it really easy to read and easy to write, and makes the whole SQL language a very easy one. Okay, so next, as you might have already noticed, we have those blue words; we call them keywords. In our example, we have four keywords: SELECT, FROM, WHERE, and AND. Those keywords are predefined and reserved in SQL; that means you cannot use them as a table name or column name. In MySQL, we have over 900 keywords. We will not go through all of them; in the tutorials I'm just going to focus on the most used keywords. Under the link in the description, you will see a list of all the keywords that we have in MySQL. Alright, okay, so now let's take the next element: identifiers. Identifiers are any name that you give to any object in your database, e.g. a table name or a column name; even the database name itself is an identifier. In our example here we have four column names:
FirstName, LastName, country, and score. And we have here as well a table name, customers. All of those are identifiers. Alright, so now to the last element that we have; we call them operators. In SQL there are many different operators, and they have different shapes and forms. E.g. they can be simple symbols, like what we have here, equals and smaller than, or they can be keywords, e.g. AND, which we call an operator as well. So as I said, in SQL there are different kinds of operators: there are arithmetic operators like plus and minus, there are comparison operators, as in our example, equals and smaller than, and so on. Alright, so that was the basic elements inside SQL statements. So to sum up: over here we have the whole text, which we call an SQL statement. The green parts we call comments. In SQL we have different clauses, different parts. The blue words are the keywords. We have the names that we give to objects in the database; we call them identifiers. And at the end we have the operators in our statement. Alright everyone, so with that, we have finished the first chapter of the SQL course. We now have a lot of knowledge about SQL basics and concepts. In the next chapter, we will start preparing your environment so we can start practicing SQL. And we will start by downloading and installing MySQL.

9. SQL | Download & Install MySQL: Now, if you don't already have MySQL installed, you can follow me. I'm going to show you step by step how to download and install MySQL on Windows. This is important so that you can practice and run the tutorials on your computer. Let's start by downloading MySQL. Okay, let's go to our browser. We go to the official website of MySQL, mysql.com. There you will find Downloads. Click on that, then scroll down until you find MySQL Community Downloads. Click on it. You'll see a bunch of installers. The one that we need is MySQL Installer for Windows. Let's go there. Here you have two options, a smaller one and a bigger one.
The small one downloads the packages while you install MySQL, or you can download the whole package at the start. I recommend you go with the bigger one, so that we have everything downloaded at the start. Click on Download. This page asks you to log in or create a new account. That's not necessary for the tutorial, so you can skip it. So I'm going to go with "No thanks, just start my download". That's going to start downloading the installer. But because I have already done that, I don't want to spend the time on it now, so I'm going to go to Downloads and start the installation. Okay, let's start the installer now. I'm going to click on it and press Yes. And now we are at the first step of the installation. Before we proceed, I should tell you there will be lots of steps, 30 I think, where we're just going to press Start, Next, Finish, Yes, and so on. We will not change a lot of the configuration. Maybe we're going to set a password, but that's it. So it's really easy. Let's start with the first step. It asks us to choose a setup type, e.g. developer, server, client, and so on. We will stay with Developer Default. So click Next. After that it's going to check the path. We're going to stay with the defaults. Next. Yes, I'm sure. Here it's going to check the requirements. There will be a lot of steps like this, checking the requirements, so we stay with the defaults. And now it's going to show you all the packages that are going to be installed. We will not change anything; let everything be installed. So now I'm going to click Execute and it's going to start installing all of those components, as you can see, one by one. Alright, so now we have all the products installed. We click on Next. Then we have some product configuration. Just click Next. And now you can see the networking settings. The most important thing here is to know that we have the following port number for our local database, but we will not change anything.
We're going to leave it like this. Then click Next. We're going to stay with the recommended settings for authentication. Click Next. And now, finally, we have to set up the password for our root user, or as we call it, the admin user for the database. This is very important to memorize or write down somewhere. So now I'm going to give our admin user the following password. Click Next. We will stay with the recommended stuff; we're not going to change anything. And now we can click Execute to apply our configuration. Okay, after all the configuration steps are completed, we can click on Finish. After that, there will be more configuration. Next. Don't change anything; we're going to stay with those settings. We click on Finish. After that, some more configuration, and Finish. Okay, now we're going to test our connection to the database server. You see here the username is root, and we type in the password that we previously gave the admin user. So I'm going to enter the password here and click Check. If you get "Connection succeeded" like here, that means we are successfully connected to our SQL database and everything is fine. So let's click Next. Apply configuration: okay, Execute. So everything is green. Click Finish. We have more configuration. Guess what? Next. Alright, installation completed. So let's click Finish one more time. After the installation is completed, it's going to start MySQL Workbench for you, and another application as well, the MySQL Shell. Let's check here. We don't need this one, so you can close it. We will stay with MySQL Workbench. This is exactly what we need for the tutorials. So you can see over here "Local instance MySQL80": this is your local database on your machine. So we're going to log in and check whether everything is fine. You see here the admin user root, and we type the password we gave during the installation. This is mine. Click OK. And now I'm inside my database.
If you are at exactly this step, that means you have downloaded, installed, and logged into your database successfully. So congrats. Alright, so with that, we have downloaded and installed MySQL successfully on our system. Next, I'm going to take you on a very quick tour of the interface of MySQL Workbench.

10. SQL | Tour in the Interface of MySQL Workbench: I would like to give you now a really quick overview of the interface of MySQL Workbench. Because I remember that when I first started using such database applications, it was a little bit confusing and overwhelming, having all those panels, options, and toolbars. But actually it was not that hard. I'm not going to explain every single detail; instead, I will give you a general overview of the interface. If you need more details about the tool, visit the MySQL manual. I will leave the link in the description. So now let's start explaining the main sections of MySQL Workbench. Alright, let's start on the left side. We have here a very important section called the Navigator. In the Navigator you can see two tabs, Schemas and Administration. By default, you land in Schemas. The Schemas tab allows you to navigate or browse through your database objects. E.g. I can see here that I have three databases by default; we got them from the installation. So if I want to look inside the database called world, I double-click on it and I see tables, views, stored procedures, and functions. I can go further if I want to see what is inside the tables. We see that we have three tables: city, country, and countrylanguage. So, okay, I have three tables in the database. Let's see now which columns those tables contain. I can click on city and expand it, and I see, okay, I have the following columns: ID, name, and so on. So with the schema navigator, you can navigate through your database to understand its contents. Let's go now to the second tab, Administration.
Here you will find a lot of info and a lot of tools to manage your SQL server. E.g. you can check the server status: double-click on it and you'll see on the right side here that the server status is running. Or you can manage the connections, manage users, and so on. It is interesting to understand all this stuff if you're going to be a database administrator, but we are now learning SQL, and that is a different topic. Now, let's go back to Schemas, where we can browse our databases. Alright, let's close this one over here; I don't need it. Next we have the toolbars. We have two toolbars. The first one is called the main toolbar. It holds the most frequently used functions, e.g. to create a new SQL statement, a new schema or database, a new table, a new view, a new stored procedure, and so on. So the main toolbar gives you quick access to creating new stuff. The second toolbar is over here; it is the query toolbar. It contains all the actions that are related to the query that you are writing in the Query Editor. The most important one is Execute: once you write your SQL statement over here, you click on Execute and it will be run on the database. You have some other options, e.g. to save the SQL statement or to open one that's already saved, and so on. Alright, next we have a very important section. It's called the Query Editor. Here we will write our SQL statements, queries, and so on. It is the main place where we will work. E.g. I'm going to write the following statement: SELECT * FROM city. Don't worry about the syntax; I will be explaining everything about SELECT statements in the next tutorials. So now let's hit Run, or Execute. After we run the query, you will see that we have here a new section. It's called the Result Grid. Here you will find the data that is returned from the database after we executed the query or SELECT statement, and the data is presented in table form.
Underneath that, you will find another section. It's called the Output. Let me just make it a little bit bigger. In this section you will find a lot of information; it's like a log. You can see the execution time, meaning how long it took the server to execute your query. You can see as well whether it was successful or whether you have some problems with the syntax or some other errors. You can see that over here, along with the error message, and so on. Okay, now if you go to the right side over here, we find another section. It's called SQL Additions. It is a tool from MySQL that gives you descriptions of the SQL statements: their syntax, usage, recommendations, and so on. I usually hide it to save some space in the application by clicking over here. It's really up to you; it's personal preference. Alright, those were the main sections of MySQL Workbench that we really need in the SQL tutorials. I hope it helps. Don't worry about it: you need some more time using such applications in order to understand them and to navigate through them, and then it will be less overwhelming. Alright, so with that, we have learned how to navigate through the MySQL interface. And next we are going to install the database for practicing.

11. SQL | Install the Course Database: Alright, so far we have installed the MySQL application locally on our computer. As a next step, we're going to create a tutorial database for this SQL series. I've prepared a special database just for practice and tutorial purposes. In this tutorial database, we will have three tables with a little data. All our next tutorials will be based on this tutorial database. What we're going to do is this: I'm going to show you some tasks, and we're going to try to solve those tasks using SQL code on top of our tutorial database. Next, I'm going to show you step by step how to create our tutorial database. Okay, so now the first step is that we go to the video description.
There you will find the link to my website, and there you will find our SQL tutorial database. It will look something like this. This is one big piece of SQL code, around 53 rows. You don't have to understand all this stuff at the start. After you finish the series, you will understand what we have done over here: how to create a new database and tables, how to insert new data, and so on. What we're going to do now is just copy the script. In order to do that, you can go over here and click Copy, or just select everything and copy it. Once we have copied our tutorial database script, we're going to go to our MySQL database and run it. Alright, so step number two: go back to MySQL Workbench. There we're going to execute our code. So we open a new tab, an SQL editor, and here we paste our code. It is around 53 rows of code. And we hit Run. Once we have run it, we have to validate whether everything went perfectly. If you check the left side over here, you will find, okay, we have three databases. So where is the tutorial database we just installed? In order to see it, you have to hit Refresh. Once you hit Refresh, you will see, okay, we now have our tutorial database, db_sql_tutorial. In order to browse our new database, we do the following: just double-click on it and then go to Tables. There you will find our three tables: customers, employees, and orders. Okay, so now let's check whether we have all the data in our tutorial database. In order to do that, we can open a new tab. Just follow me with these steps; I will explain all the commands later in the tutorials. I'm just going to retrieve all the information from each table to check whether we have all the data. So: SELECT * FROM customers. This retrieves the data from the table customers.
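The "run the script, then check each table" routine can be sketched as follows. This is not the course script, which lives on the author's website and is not reproduced in the transcript; it is a tiny stand-in with the customers and orders tables the transcript describes (the employees table is left out because its columns are never mentioned). It assumes Python's built-in sqlite3 module instead of MySQL, and every row value below is a hypothetical placeholder.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the data is placeholder data,
# sized to match the five customers and four orders mentioned in the course.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    first_name  TEXT,
    last_name   TEXT,
    country     TEXT,
    score       INTEGER
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER,
    order_date  TEXT,
    quantity    INTEGER
);
INSERT INTO customers VALUES
    (1, 'Maria',  'Example', 'Germany', 350),
    (2, 'John',   'Example', 'USA',     900),
    (3, 'George', 'Example', 'UK',      750),
    (4, 'Martin', 'Example', 'Germany', 500),
    (5, 'Peter',  'Example', 'USA',     0);
INSERT INTO orders VALUES
    (1001, 1, '2021-01-11', 10),
    (1002, 2, '2021-04-05', 20),
    (1003, 3, '2021-06-18', 5),
    (1004, 4, '2021-08-31', 30);
""")

# Validation step 1: which tables exist? (In MySQL: SHOW TABLES;)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

# Validation step 2: retrieve everything from each table and count the rows.
customers = conn.execute("SELECT * FROM customers").fetchall()
orders = conn.execute("SELECT * FROM orders").fetchall()
print(tables, len(customers), len(orders))
```

In MySQL Workbench the same check is the pair of statements `SELECT * FROM customers` and `SELECT * FROM orders`, run one after the other in the Query Editor.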
And as you can see, we have here a table called customers with five customers: Maria, John, George, Martin, and Peter. In this table we store the general information about each customer, like the first name, last name, country, and score. Okay, so now let's check one more table. Let's check the orders. I'm going to replace customers with orders and click Execute. With that we see that we have a table orders that stores all the orders placed for our customers. We can see over here that we have the customer ID, the order ID, the date when the order was placed, and the quantity. If we want to see information about the orders, we check the table orders. If we want to see information about the customers, we check the table customers, and so on. So if you have done all these three steps and checked the data, that means you now have our tutorial database installed on your local machine, and we can proceed with our tutorial. Alright, so with that we have a database with data. And before we start writing our SQL code, we have to learn how to style it.

12. SQL | Guide to SQL Coding Style: Okay, so now, before we get hands-on and you start learning how to code in SQL, I really have to mention this. When you start learning any new programming language, it's really not enough to learn how to code in it. You also need to learn many other things, e.g. how to solve a task with few lines without making stuff complicated, or how to write code that delivers good performance. And finally, and most important, how to write code that looks good, that is easy to read for you and for others. If you are working on projects, you will notice that developers always have different opinions about how to style code. But all of them will agree that code should be readable and should follow some style guide. So you might ask me now: Baraa, do I really need to style my code?
Is it not enough that my code is working correctly? Well, no, and there are two reasons for that. If you are working on team projects, sometimes your code will be reviewed by others. And if your code is hard to read, you will give them a hard time reading it, and they may even end up rewriting your code in order to read it. The other reason is that if you find out there are some errors or problems in your code, you will have a hard time searching for the error to find out in which line you have the problem. Especially if you are a beginner in SQL or in any programming language, at the start you will not pay attention to style guides; you will just make sure that you learn the code and the statements. So my advice here: don't develop any bad habits at the start, because later it's going to be really hard to break them. Alright guys and girls, I want to share with you now my three golden rules that I always follow when I write SQL code. Let's check this example over here. It's a very simple statement, a query, a SELECT statement, where at the start, to be honest, I had a really hard time understanding what is going on. So let's try to beautify it following the three rules. Rule number one: always add new lines for keywords and as well for each column. So let's start doing that. We have here the SELECT statement, so let's add a new line for each column. I'm going to do that. So all of those are columns, each on a new line. And as well FROM: we have it here on a new line, so that's okay. JOIN: we can add a new line for it, and for ON as well. So we're just adding new lines for each keyword, and here for the AND as well. So as you can see, it already looks better. I added new lines for each keyword and for each column. Rule number two: let's make all those keywords uppercase. So let's do that. SELECT is lowercase; let's make it uppercase. The same goes for FROM and JOIN. Let's make everything uppercase. Why do we do that?
It's because it's easier to see what is a keyword and what is other stuff, like identifiers, operators, and so on. So it's much easier to read. Rule number three is that we add some whitespace around. So let's check that. In the WHERE clause, we can split this condition with whitespace. It's just easier to read if you add whitespace. As well here on the condition of the JOIN, we can add whitespace. As you can see, we can read it better than when everything is stuck together. And for the columns, I always add a tab. So that's it. Now I have applied my three rules, and you can see it's really much easier to read. We can see here our keywords SELECT, FROM, JOIN, WHERE, and so on. I can read through it more easily compared to the first one. Alright, so now let's look at both scripts side by side. Can you see the differences? Which one is more readable? It's straightforward. The script with a style has a proper format that helps you and others to read it easily and as well to find errors and problems if you have any. Alright guys, so with that, we now have the MySQL server, database, and data up and running on our PC. So everything is ready to start practicing SQL. In the next chapter, we will begin to use SQL syntax to query the database and tables using the very famous SELECT statement.

13. SQL | SELECT Statement: Alright, so now we're going to focus on the SELECT command. This is going to be our focus. We're going to learn how to query our data, and this is going to take almost 80 percent of our tutorials, because SQL is all about how to query our data. Other than querying our data, we're going to talk about data manipulation and data definition at the end. So now let's start with the SELECT command. Alright, so before we start writing our first SELECT statement, I want to mention the following: in a SELECT statement, there are a lot of clauses. This is not really bad.
This gives SQL a dynamic and easy way to use it. Each of those clauses has its own definition and its own function, which makes it really easy to use. We have SELECT to select our columns, FROM to select the tables that we need, JOIN to connect two tables together, WHERE to filter our data, and GROUP BY to aggregate the data. HAVING is another way to filter our data, ORDER BY is to sort our results, and LIMIT is just to limit our results. Don't worry about those clauses. I'm going to explain all of them step by step with examples and tasks, and at the end you will understand all of them. One more very important aspect to understand in SQL statements is that the order of those clauses is very important. E.g. I cannot start with FROM and then write down the SELECT. This order is very strict, and if you switch between them, you will immediately get an error in SQL. So that means: pay attention to the order of those clauses. Don't mix them up. You need to follow those rules in order to get your query executed in SQL without any errors. Alright, so now the first thing that we need to learn is how to fetch our data from the database, how to retrieve all those records or rows from our tables. To do so, we use the most fundamental SQL statement. We call it the SELECT statement, or sometimes the SELECT query. Now, in order to understand all those SQL clauses like SELECT, WHERE, JOIN, and FROM, I will be giving you one task at a time. Then we're going to try to figure out together how to solve it using our tutorial database. In our tutorial database, we have two tables, customers and orders. In the customers table, we have five customers. And in orders we have four orders. Alright, so let's start with the first task: retrieve all data and columns from customers. So that means our focus here is on the customers table, and all data means all rows.
So we need everything, all rows and all columns. Now, before we start writing our first query, we need to make sure that we are selecting the right database. When you install MySQL Workbench, you get some default databases, and after that we installed our database for the tutorials. To make sure that we are selecting the one that we need, you can either double-click on it, or you can write this statement: USE, then the database name, DB SQL tutorial. And then run. With that, we make sure that we are on the right database so we don't get any errors. Alright, so now let's write our query for the task. We need all the data from the customers. The first thing that we specify in the SQL statement for the query is the SELECT keyword. After that, since we said all the columns, we're going to use star. Star means all the columns inside this table. After that, we need to tell the database which table we need. Since we need the customers, we're going to say FROM customers. So we have now the query that's going to select all columns from the table. Here we don't have any filters or anything, so this is the most basic form of SQL. Let's hit Run. And as you can see here, now we have the results. We have all five customers from the table customers. And don't forget, in SQL the order is very important. It always starts with SELECT, then comes the FROM clause. If you do it the other way around, you will get an error. So make sure that you are getting the right order while you are writing SQL statements. Let's do another task where we say: okay, I want to see all the data from orders. So let's do that. All data, all columns, that means SELECT star FROM, and now our table is orders. So I'm going to select the table orders here and then execute. And as you can see now, the database retrieved four orders.
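Going back to the first task, here is a minimal runnable sketch of the SELECT star query. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the MySQL setup used in the course, so it is an illustration and not the course's own script. The column names (customer_id, first_name, country, score) are assumptions, and the last name column is left out because its values are not given here.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the course's MySQL database.
# Column names are assumed for illustration; last_name is omitted on purpose.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, first_name TEXT, country TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "Maria", "Germany", 350),
        (2, "John", "USA", 900),
        (3, "George", "UK", 750),
        (4, "Martin", "Germany", 500),
        (5, "Peter", "USA", None),  # None is stored as NULL
    ],
)

# SELECT comes first, then FROM; the star means "all columns".
query = """
SELECT *
FROM customers
"""
cursor = conn.execute(query)
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()

print(columns)
for row in rows:
    print(row)
```

With no WHERE clause there is no filter, so every row of the table comes back with every column.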
And that's right, because this is all that we have in our database. Alright, so now you might be saying: I'm not really interested in all the columns from my table. I want to specify a few columns to retrieve. So let's say we have the following task: retrieve only the first name and the country of all customers. The difference from the previous one is that we don't need all the columns, we just need two columns. So let's see how we can solve that. I'm going to remove this one and start with SELECT. Now I cannot use star because I don't want to have all the columns. We are interested in the first name, so we write down the first name column, then a comma. The second one is country. And now we need to tell the database from which table, so FROM customers. And let's run. As you can see here, now we have only two columns, first name and country, and we don't see the other columns like customer ID or score. So with that, we selected only two columns without using star and we solved the task. Okay, so now just to understand how the database reacts to our query, I'm going to show you step by step what is going on in the database once you run this statement. The database starts from the table. We said FROM customers, that means the database is going to focus on the customers table. Then it's going to check which columns we need. We said first name and country. And since there are no filters in our SQL statement, it's going to select all the data. So it's going to select everything from the first name column, and as well from country. And that's how the database executed our query. Alright, so with that, we have learned how to use the SELECT statement. Next, we're going to talk about how to retrieve unique values using DISTINCT.

14. SQL | DISTINCT: Alright, so the SELECT statement by default will not remove any duplicates from the results.
So sometimes you might be in a situation where you have some duplicates inside your tables and you want to remove them from the results. So removing duplicates from the results, not from the table. In order to remove those duplicates, we use in the SELECT statement a keyword called DISTINCT. To understand that, let's have the following task: list all countries of all customers without duplicates. Alright, so now let's try to figure out how we're going to solve this task. As you can see, we need the customers. That means we're going to focus on the table customers. And we need all the countries. That means we need only one column, called country. So let's do a basic query. We always start with SELECT. The column that we need is called country, so we write down country. Then FROM, and our table is customers. Now let's just check whether there are any duplicates and see the results. So execute. Now we can see the results: Germany, USA, UK, Germany, USA. As you can see, there are duplicates. We have Germany twice and, the same, we have USA twice. Now the task says without any duplicates. To solve that, we can type DISTINCT right after the SELECT. So we're going to use DISTINCT over here, and this keyword always comes after SELECT. Only by doing that, it's like a magic word, it's going to remove all the duplicates. So let's check that. So execute. As you can see, now the list contains only unique entries. We have Germany only once, USA as well, and UK as well. So here we have a unique list of all countries of all customers, and we solved the task. Alright, so now in order to understand DISTINCT, I'm going to show you how the database executes our query. We said in our query that we need the data from customers, so the database is going to focus on the table customers. And we said as well that we need only one column, called country, so the database selects it for the results.
We said, okay, we need all the data, but distinct, without any duplicates. The database starts: okay, Germany, it's not in the result, so it's going to put it there. USA, we don't have it in the result, so it's going to put it there. UK, the same: we don't have it in the list, so it adds it. But now it comes to Germany again and says, okay, we have it already, so it will not include it in the list. The same goes for USA. We have USA already here, so it will not be included in the list. And with that, we have our unique list of all countries. Alright, so that's all about DISTINCT. Next we are going to learn how to sort our data using ORDER BY.

15. SQL | ORDER BY: Alright, guys and girls. So now once you start using SELECT statements to retrieve your data from your database, the results that you are getting are not sorted in any particular order. That means that the DBMS, or database, is sending the data back to you in an unspecified order. Now if you want to apply some rules or you want to sort the results, you can use the clause ORDER BY. In order to understand ORDER BY, we're going to check the following task: retrieve all the customers where the results are sorted by score, and the smallest should be first. So now let's try to figure out how we're going to write the SQL statement to solve this task. Since we need the customers, that means we are focusing on the table customers. Let's write our SELECT statement first. So SELECT, there is no specification about the columns, so I'm going to use star, FROM customers. Let's run that and see. As you can see, we have all the customers, but the result is not sorted by the score. The task says it should be sorted by the score, the smallest first, then the highest. In order to do that, we're going to use the keyword ORDER BY. So let's have a new line: ORDER BY. After that, we need to specify the column that we're going to use to sort our data.
The task says it should be sorted by score. That means our column is score, the column named score. Now we have two options for how we can sort our data: ascending and descending. In the task it says it should be sorted by score, the smallest first. That means we need to use ascending. In SQL, we have the keyword ASC, which means ascending. So now we have the ORDER BY clause and we should be fine. Let's run the query. Now if we check the results, you might already notice that the result is sorted differently from before. That means we now have a different sorting, by the score. The first one is NULL, because in MySQL NULL is considered to be the smallest in sorting. After that, we have 350, the smallest score of all those customers. Then come the higher ones, and so on. So now we have our first rule for how to sort our data, and we have a solution for our task. One more thing to notice is that in SQL, the default sorting in ORDER BY is ascending. That means if I go here, remove the ASC keyword, and run the query again, I will get exactly the same results, because if you don't specify anything after the column name, the default is going to be ascending. Okay, so now let's consider one more quick task, and it says almost the same: retrieve all customers, and the results should be sorted by score, but this time the highest should be first. So that means we need to use descending, the highest first, then the smallest. We have the same query, so we don't have to change anything there. But after the column name, if I leave it empty, it's going to be ascending, and this time we need descending. So we're going to use the keyword DESC, which means descending. Let's run this query and check the result. We can already see that the list is sorted the other way around. Now the first record has the highest score. John has 900, and it is the highest, then come the smaller ones, and so on.
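The two sorting tasks can be tried out with the following minimal sketch. It runs the queries through Python's built-in sqlite3 module on an in-memory table, as a stand-in for the course's MySQL database. The column names are assumptions and the last name column is omitted. SQLite, like MySQL, puts NULL first in ascending order, so the results line up with what is described above.

```python
import sqlite3

# In-memory stand-in for the course's customers table (assumed column names).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, first_name TEXT, country TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "Maria", "Germany", 350),
        (2, "John", "USA", 900),
        (3, "George", "UK", 750),
        (4, "Martin", "Germany", 500),
        (5, "Peter", "USA", None),  # None is stored as NULL
    ],
)

# Task 1: smallest score first (ASC is also the default).
ascending = conn.execute("""
SELECT *
FROM customers
ORDER BY score ASC
""").fetchall()

# Task 2: highest score first.
descending = conn.execute("""
SELECT *
FROM customers
ORDER BY score DESC
""").fetchall()

asc_names = [row[1] for row in ascending]
desc_names = [row[1] for row in descending]
print(asc_names)   # Peter (NULL) first, John (900) last
print(desc_names)  # John (900) first, Peter (NULL) last
```

Other database systems may place NULL differently when sorting, so the position of the NULL row is worth checking on whatever system you use.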
So now we are sorting the result in descending order. Alright, so using ORDER BY sometimes gets a little bit more complicated if you are using not just one column but several columns to sort your results. Especially if you have a lot of duplicates inside your data, using one column will not help you. You will end up using multiple columns in the ORDER BY. In order to understand that, we're going to have the following task: retrieve all the customers where the result is sorted by country in alphabetical order, and then by score with the highest first. So let's try to figure out how to write the SQL for that step by step. I'm going to remove everything over here and write down ORDER BY, and the first column is country. Alphabetical order means ascending, so we can leave it as the default or we can write ASC. It doesn't matter, we're going to have the same result. So now let's check the result for that. As you can see, the result is already sorted by country in ascending order, so that's fine. We have Germany first, then UK, then USA. It's already sorted, but that is not enough, because the task says that after that, you need to sort by the score, the highest first. Take for example those two customers, Maria and Martin. Both of them come from Germany, but Maria comes first, even though she has the lower score. So that means after we sort by the country, we need to sort again by the scores. In order to do that, we put a comma here and then write down score. The option here is going to be descending, the highest first, so DESC. That means we can use two columns in the ORDER BY, and for each column we can use a different sort direction. So now let's run this. And as you can see here again, that's okay.
We have it sorted by country, but now Martin comes first because he has a higher score than Maria. And this is exactly how we sort the data using multiple columns. One more note about ORDER BY: instead of the column name, we can use the position of the column. You can see over here that country has position four. So this is the first column, second, third, fourth, and fifth. That means country has position four. So instead of writing country, I can write 4. The score is the last one, the fifth. So this is an easy way to sort the data with ORDER BY, and if I run this query, I will get exactly the same results. But I really don't recommend that. If you change the structure of your data, let's say country moves to position two and score to position three, then after you change the structure, you have to go and edit your query. That means I need to change those numbers again. That's really bad, because you might forget about it. If you write the name, it doesn't matter what changes happen to the schema or the table: your query will deliver the same results. With the numbers, you need to adjust them. So I really don't recommend using those numbers. It's better to write the full name of the column. Alright, so now in order to understand ORDER BY, I'm going to show you step by step what the database is doing to execute our statement. First, it's going to choose the table. Our table is customers. We are using the star, which means it selects all the columns and puts them in the results. And since we are not using any WHERE or filters, it's going to select all the data. But it notices that there is an ORDER BY, so it sorts the results by each column. The first column is the country, so it's going to sort by the country first. The first customer comes here, then Germany as well, Martin.
Then after that comes the UK, sorted over here. And then after that comes John from USA, and it keeps sorting the results. So we have here the country sorted, and this is the first step. The next step goes to the second column in the ORDER BY, the score. So it's going to sort the results again. It checks those two customers and sees, okay, Martin has the higher score, so it's going to switch them. Let me just do it like this. And Martin is going to be first on the list. Second we have UK, so that's okay. Then we have those two: 900 and NULL. NULL is the smallest, so that's okay. And this is how the database sorts using ORDER BY. Alright, so that's it for this chapter. We have learned how to query our data using the SELECT statement and how to sort the results using the ORDER BY clause. In the next chapter, we're going to learn how to filter our data using the WHERE clause, where we're going to learn many important operators.

16. SQL | WHERE: Alright guys and girls. So now we have learned how to retrieve all our data from the database using the very basic keywords SELECT and FROM. Next, we need to learn how to filter our data using the WHERE clause, because in real-world scenarios, you are not interested in all the records in the tables. Usually you will be interested only in the rows that fulfill a certain condition. E.g. we don't need all the customers in the results. We need only the customers that come from a certain country or have a specific score. In order to understand that, let's check a very simple task. The task says: list only German customers. So that means we are not interested in all customers. We need to see in the results only the customers that come from Germany. Okay, so now let's try to figure out how we're going to solve this task using an SQL query. In the task we are focusing on the customers. That means we will be querying the customers table.
And since there is no specification about the columns, we can go and retrieve all the columns. Let's try to write the SQL statement for that. SELECT as usual. Then, no specification about the columns, so we select everything using star. FROM, and our table is customers. Let's run this and see. As usual, we have all the data, all the customers from Germany, from USA, UK, and so on. But the task says only the German customers. That means we have to do some filtering. In order to do that, we're going to use the WHERE clause, and usually we put it immediately after FROM. Alright, so now we write down the keyword WHERE. After the WHERE we need to specify our condition. The condition should be based on the country. That means country should be equal to Germany. So we write down the column name, country, then the equal operator. And now we need to enter the value exactly like it's written inside the database: Germany, like this. So let's start the execution and see the result. As you can see, we don't have all the customers. We have only two customers that fulfill this condition, Maria and Martin. The other customers, John, George, and Peter, don't fulfill the condition and are excluded from the results. So as you can see, SQL is pretty easy to write and read. Take this one: select all columns from customers where the customer's country is equal to Germany. It's really easy to read using English words in a logical order. Okay, let's have another quick task now. It says: select customers whose score is greater than 500. It's based on the same table, so we will not change a lot of stuff here. The only part that changes is the condition. So now we're going to remove this here. Our condition is based on the score. So we have the column score, and the operator is no longer equal, it should be greater than. So we need another operator, and the value is 500.
So we write down 500 here. Let's execute that. Now we can see the customers whose score is greater than 500. As you can see, it's pretty easy to use the WHERE clause. Alright, so now in order to understand the WHERE clause, I'm going to show you step by step what the database is doing once we execute our query. The database checks which table, so it's going to focus on customers. Then it checks which columns we need. As we wrote down the star, the database is going to select all the columns for the results. Then the database sees, okay, there is a filter. That means not all the data should be in the results, so it's going to check each row. For the first record it checks the score over here. The score is 350, which is not greater than 500, so it will not include it in the result. The next one is greater than 500, so it's going to take it. The next customer is the same, it fulfills the condition. Oops, I need to write it down over here. Alright, now the fourth customer: 500. The condition is not greater than or equal, it's only greater than 500, so it will not be considered. And the last one is NULL, which means it's empty. It will not fulfill the condition. That means we have only two customers, and that's how WHERE works inside the database. Alright guys, so in SQL there are many different types of operators that you can use inside the WHERE clause to filter your data. They are split into two groups. On the left side we have the comparison operators, and on the right side we have the logical operators. The comparison operators you can use to compare two values, e.g. we have equal, not equal, greater than, less than, greater than or equal to, and less than or equal to. The logical operators you can use once you want to combine two different conditions, and as a result you get true or false. E.g.
we have the AND operator, which returns true if both of the conditions are true. We have OR, which returns true if one of the conditions is true. Then we have NOT, IN, BETWEEN, LIKE, and so on. In the previous examples in the WHERE clause, I showed you two comparison operators, equal and greater than. Next, I'm going to go through all of them to show you how you can use them inside a query, with some examples. So don't worry about it. Alright, so that's it for the WHERE clause. Next we're going to talk about the comparison operators.

17. SQL | Comparison Operators: =, >, <, >=, <=, !=: Alright, now we're going to focus on the comparison operators and learn how to build up our conditions inside the WHERE clause. The comparison operators, as I said, are used to compare two values, and they are the most basic way to filter data using SQL. Okay, so now in order to understand them, let's have the following tasks. First, find all customers whose score is less than 500. That means we're going to focus on the customers table, and there is no specification about the columns, so we're going to use SELECT star FROM customers. Let's run this. As you can see, we have all the customers, but we need to filter the data: score less than 500. So we're going to use the WHERE clause. The column is score, then the less than operator, and then we type 500. Let's run it and check the results. We have only one customer whose score is less than 500. Now, in order to understand why we had only one customer in the results, I'm going to show you what the database did once we executed our query. We said SELECT star FROM customers, so the database is going to focus on customers. We said star, which means we need all the columns in our results. And since we have a WHERE clause, it's going to filter the data. So it goes through all the records and tries to find whether each one fulfills the condition or not.
So I'm going to use like and dislike to mark whether it is true or false. The first customer here has a score less than 500. That means it's going to be shown in the result because it fulfills the condition. Then we have the next one. The score is 900. It is not less than 500, so that means false. The next one is the same: 750 is not less than 500. The next one is interesting. It is exactly 500, but since the condition says less than 500, it does not fulfill the condition. Then the NULL does not fulfill it either, so it counts as false. So that's why we had only one customer in the results. Okay, so now let's have another task, and it says: find all customers whose score is less than or equal to 500. So almost the same, but here we include as well the customers whose score is equal to 500. Let's check that. We keep the same query, so we will not change anything over here, only the operator. We need less than, so that stays like this, but we need as well equal to. There's another operator called less than or equal to, and it looks like this. So we have them both like this. Let's run the query and see the result. As you can see, now we have customer number four, Martin. He has a score of 500, so now he is shown in the result. So we have the first one, Maria, with less than 500, and we have Martin, with exactly 500. So this is less than or equal to. As you can see, it's pretty simple. Let's go to another operator with the following task: find all customers whose score is higher than or equal to 500. So it's almost the same, but we need to use another operator, greater than or equal to. It looks like this, greater than and then equal. Let's check the result. As you can see here, now all those scores are 500 or higher. We have John with 900, we have George with 750, and Martin stays here because his score is equal to 500. So as you can see, it's really easy. Alright, so now we have one last task.
It says: find all non-German customers. So let's try to solve that. We're going to stay with the table customers, so SELECT star FROM customers. And we need to filter the data not by score but by country. So we type country here. Since it says non-German customers, the country should not be equal to Germany. The not equal operator looks like this. And then we need the value Germany. With this query you are saying: okay, give me all the customers whose country is not equal to Germany. So let's run that. As you can see here, we don't have the country Germany in the results. And you can get the same result using this other operator as well. It also means not equal. So if I run that, we get the same results. You can use either one of them. There is no difference between them. Okay, so now let's see how the database solves that. We said SELECT star FROM customers. That means the database is going to focus on customers, and star means all the columns as usual, so we put them over here. Under WHERE it says country not equal to Germany, so the database is going to focus on this column for the condition. Let's see the first customer: the country is equal to Germany, so that means false. We will not see it in the results. The next one: the country is not equal to Germany, so that is true. We will see it in the results. The next one is the same. The country is not equal to Germany, so we will see it as well in the results. For the fourth customer, the country is equal to Germany, so that means false. We will not see it in the results. And the last one: the country is not equal to Germany, so it is true, and we will see it in the results. That's why we saw three customers in the results. Alright, so now we've covered all the comparison operators. They are pretty easy. They always compare two values.
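The comparison tasks from this lesson can be replayed with the minimal sketch below. It uses Python's built-in sqlite3 module on an in-memory table as a stand-in for the course's MySQL database. The column names and the helper names_where are hypothetical and only there for illustration. Both spellings of not equal, != and <>, are shown. An ORDER BY customer_id is added only so the output order is predictable.

```python
import sqlite3

# In-memory stand-in for the course's customers table (assumed column names).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, first_name TEXT, country TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "Maria", "Germany", 350),
        (2, "John", "USA", 900),
        (3, "George", "UK", 750),
        (4, "Martin", "Germany", 500),
        (5, "Peter", "USA", None),  # None is stored as NULL
    ],
)


def names_where(condition):
    """Hypothetical helper: first names of the rows that match the condition.

    The condition is pasted into the query text, which is fine for the fixed
    literals below but should never be done with user input.
    """
    query = f"""
    SELECT first_name
    FROM customers
    WHERE {condition}
    ORDER BY customer_id
    """
    return [row[0] for row in conn.execute(query)]


conditions = [
    "score < 500",
    "score <= 500",
    "score >= 500",
    "country != 'Germany'",
    "country <> 'Germany'",
]
results = {condition: names_where(condition) for condition in conditions}

for condition, names in results.items():
    print(f"{condition:22} -> {names}")
```

Peter never shows up in the score comparisons because his score is NULL, and a comparison with NULL never comes out as true.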
And I would suggest that you go and play with them until you understand how they work. Next, we're going to start working on the logical operators. They are a little bit more difficult, but don't worry about it. I'm going to explain them in detail with examples and everything. They are very important in SQL because you will end up using them a lot. Alright, so that was it for the first group of operators. Next, we're going to talk about the other group, the logical operators AND, OR, and NOT.

18. SQL | Logical Operators: AND, OR, NOT: Alright guys, so now we're going to talk about the second group of operators that you can use inside the WHERE clause, and they are called the logical operators. We will focus on those three bad boys: AND, OR, NOT. In the previous examples, you learned how to filter your data using only one condition. But in real-life scenarios, things get more complicated, and you have to combine the results of two or more conditions. In order to do that, you can use the operators AND and OR. Okay, so now let's start with the first operator. The AND operator says the following: it returns true only if both conditions are true, otherwise it's going to be false. Let's say we have condition A and condition B, and we want to combine them using AND. In the first situation, condition A is true and condition B is true. If we apply AND, we get true as well because it fulfills the requirement: both conditions are true, so we get true. Let's have the second scenario: condition A is true as well, but condition B is false. Here not both of them are true, so we get the result false. Now the other way around: condition A is false and condition B is true. Not both of them are true, so the result is going to be false. And the last scenario, where both of them are false: as a result, you get false. So that means the AND operator is really strict.
Both of the conditions should be true in order to get true. Otherwise, it's always going to be false. Okay, let's jump to the next one. We have the OR operator. It returns true if at least one of the conditions is true. So that means the OR operator is happy if just one of those conditions is true, and it gives you true. Otherwise, it gives you false. Let's take again the same example. We have condition A and condition B, but now we apply OR. In the first scenario we have true at A and true at B. It fulfills the requirement, both of them are true, so with OR we get true. In the next one we have true at A and false at B. It says at least one should be true, so with OR you get true as well, because A is true. The next scenario is the opposite, where you have false at A and true at B. It fulfills the requirement: at least one of them is true, so it gives you true. Only in the last scenario, where both are false, will you get false. So as you can see, the OR operator is less strict than AND. It's happy if there is a true somewhere and gives you true, so you get more results. Okay, let's move to the last one, the NOT operator. It reverses the result of any Boolean expression. That means it always gives you the opposite. E.g. if you say go left, it's going to go right. If you say go right, it's going to go left. So here you always get the opposite result. It works with only one condition, so it's not combining two conditions like AND and OR. Here we have condition A. If you have true and you use NOT, you get false. So it does the opposite. And the same: if you have false and you use the NOT operator on it, you get true. So it's always reversing the result. If you have true, you're going to get false.
If you have false, you're going to get true. Okay? So enough with the theory, let's have some tasks in order to practice that in SQL. We have the following task: find all customers who come from Germany and whose score is less than 400. So we have two conditions here. Let's try to solve that. As usual, we're going to use SELECT. There is no specification about the columns, so star, FROM our table customers, and now the WHERE clause. We have two conditions. The country is Germany, so we can write country equals the value Germany. Now we have another condition: it says the score should be less than 400, so score, the less-than operator, 400. So now I have two conditions and I need to combine them. The task says "and", which means both of the conditions should be fulfilled. So I need to write the AND operator between those two conditions. Let's run this and see. With these conditions we have only one customer that fulfills both of them: Maria comes from Germany and her score is less than 400. Okay, guys and girls, so now let's see what the database does once we execute the AND operator. We have, as usual, SELECT star FROM customers. The database focuses on the customers table, and the star means we need all the columns, so we're going to see all the columns in the results. Now the database goes through each row and tries to find out whether it fulfills the requirements, in order to put it in the results. So let's start with the first one. The first customer is Maria, and she comes from Germany, so this is the first true, for the first condition. For the second condition, her score is 350, which is less than 400, so we have another true. Since we are using AND and both of them are true, we get true as the result. That means the database is going to put her in the results. The next one is John. His country is USA, so this is the first false, on the first condition.
The second condition is false as well, because his score is higher than 400. False and false: the AND operator is going to give false. For the next one, we have the same situation. The country is not Germany and the score is not less than 400, so both of them are false, and the AND operator gives false. The fourth one is Martin. His country is Germany, so the first condition is true, but his score is not less than 400, so here we have false. With AND that doesn't work: the result is false, because not both of them are true. And for the last one, neither condition is fulfilled: the country is not Germany and we don't have a score, so the result is false as well. So only one customer fulfills both of the conditions with true, and once you use AND you get only one record. Okay, so now let's jump to the next one, the OR operator. The task says: find all customers who come from Germany or whose score is less than 400. So we have almost the same setup, but here we have the logical operator OR. We have the same conditions, country equals Germany and score less than 400, but now we connect them with the OR operator. So let's check the results. I'm going to execute that. And as you might have noticed already, we now have two customers as a result for this setup. So let's check what happened. At the start, as usual, we tell the database SELECT star FROM customers. It focuses on the customers table and returns all the columns because of the star. And now we have the same conditions: country equals Germany, score less than 400. The only difference is that we are using the logical operator OR, so the results can be different. The database goes through each row and checks whether it fulfills the requirements or not. With OR, it is enough to have only one true to get true as a result. So as you can see here, for the first customer both of them are true.
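Written out, the AND query and the OR query differ only in the operator between the two conditions. The sketch below rebuilds the five sample rows from the walk-through in a temporary table so it does not touch the course database; the column names `customer_id`, `first_name`, `country` and `score`, and the countries given to George and Peter, are assumptions, since the transcript never spells them out.

```sql
-- Temporary table: exists only for this session and hides, but does not
-- change, any permanent table named customers.
DROP TEMPORARY TABLE IF EXISTS customers;
CREATE TEMPORARY TABLE customers (
    customer_id INT PRIMARY KEY,
    first_name  VARCHAR(50),
    country     VARCHAR(50),   -- values for George and Peter are assumed
    score       INT
);

INSERT INTO customers (customer_id, first_name, country, score) VALUES
    (1, 'Maria',  'Germany', 350),
    (2, 'John',   'USA',     900),
    (3, 'George', 'UK',      750),
    (4, 'Martin', 'Germany', 500),
    (5, 'Peter',  'USA',     NULL);

-- Both conditions must hold: returns only Maria.
SELECT *
FROM customers
WHERE country = 'Germany' AND score < 400;

-- One true condition is enough: returns Maria and Martin.
SELECT *
FROM customers
WHERE country = 'Germany' OR score < 400;
```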
That means we get true as a result, and we will see Maria in the results. After that, the next two customers don't have a true in any condition, so they are going to be false in the results. But customer four, Martin, has one true. That is enough, so Martin is going to be in the results. The last customer is the same as those two: we don't have any true, so the OR operator gives false. That's why we got two customers as a result. Alright, so now to the last one, the NOT operator, and we have the following task: find all customers whose score is not less than 400. That means we have only one condition, and we have the NOT. So let's try to solve that. Here we have only one condition, and it is about the score. The task didn't say anything about the country, so I can remove that part. We have score less than 400, but the task says it should not be less than 400. So all we have to do is add the NOT operator in front. It's very simple. Let's run this. As you can see over here, these are all customers whose score is not less than 400. Okay, so now let's see what the database does once we execute the NOT operator. As usual, we get all the columns because of the star. Then we have the condition score less than 400, but with the NOT operator. Without the NOT, only one customer would fulfill this requirement, so we have only one true and the rest false. The NOT operator reverses everything. That means if you have true, it shows it as false, and if you have false, it shows it as true. It just does the opposite. So here we have true, and the result is going to be false. The next three are all false, so we get true. But you need to be careful with something: here the score is NULL. So the database doesn't know whether it's less or greater or anything like that.
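The reversal, including the NULL case, can be checked with literal values. This is a small sketch assuming MySQL, where a comparison against NULL yields NULL rather than 1 or 0; the aliases are invented for illustration.

```sql
-- NOT flips true and false, but a comparison with NULL stays unknown.
SELECT
    NOT (350  < 400) AS score_350,   -- 0: 350 < 400 is true, NOT makes it false
    NOT (900  < 400) AS score_900,   -- 1: 900 < 400 is false, NOT makes it true
    NOT (NULL < 400) AS score_null;  -- NULL: unknown stays unknown
```

In the task query the same expression sits in the filter, in the form `WHERE NOT score < 400`, and a row whose condition evaluates to NULL is left out of the results.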
So it treats it as unknown and will not show it in the results, because it is empty, or NULL. That's why we have those three trues in the results, which means we get only three customers. Alright, so that was it for the three operators AND, OR, NOT. Next we're going to learn about the logical operator BETWEEN. 19. SQL | BETWEEN: Alright guys and girls. So now we're going to talk about one more logical operator that you can use inside the WHERE clause in order to filter your data, and that is BETWEEN. BETWEEN is a logical operator that allows you to select only the rows that fall within a specific range. In order to work with BETWEEN in SQL, you need to define two boundaries, two values that specify the range. So here we need to define the min value and the max value. They can be anything, like text, numbers or dates. In SQL, any value between those two boundaries is considered true, and the values or rows outside those boundaries are considered false. And one more very important piece of information: those boundaries, the min value and the max value, are included in the condition. In projects I see a lot of people who forget about that, or who ask again: are those boundaries in the condition or not? So it really confuses a lot of people. Don't forget, those values are included in the condition. So now, in order to understand that, we're going to take a task and try to solve it with SQL. Alright, so now we have the following task: find all customers whose score falls within the range of 100 to 500. Let's try to solve that with SQL. As usual, SELECT star, since there is no specification about the columns. Our table is customers. Now we need to filter the data, so we're going to use WHERE, and the column that we need here is score, because the task says the score should be between 100 and 500. So we're going to write down score.
And now the syntax for BETWEEN: you need to write the keyword BETWEEN, and then we need to specify the min value. The min value, the first boundary, is 100. Then we use the AND, and then the max value. And that's it. So for BETWEEN, you write down the column name, BETWEEN, the min value, AND, the max value. Let's now try to execute the query and see the results. As you can see, those two customers have scores that are between 100 and 500. Okay, so now let's see what the database does once we execute the query with the BETWEEN operator. As usual, SELECT star FROM customers means we need all the columns in the results, and we have WHERE, which means the database should filter the results, and the condition is between 100 and 500. So let's go through all the customers. The first one has the score 350. It is within the range 100 to 500, so we have the first true and we will see it in the results. The next one is 900, which is outside the max boundary. That makes it false. The same goes for George: we have 750, which is also above 500, so it's outside the boundaries, not between those two values. We have false. And now it gets interesting: we have 500. It is not strictly inside the range, it is exactly on the boundary. With BETWEEN, it is considered true, so we have true. And for the last one we have NULL, so it is unknown and it will not be returned. That's why we saw two customers in the results, Maria and Martin, because they fall within the range 100 to 500. Martin is exactly on the max boundary, and that's why he is considered true. Okay guys, so there is another way to solve such tasks without using BETWEEN. Instead, we can use two conditions and connect them with the AND operator. I'm going to show you that. SELECT star FROM customers, like usual. And now we're going to write the WHERE conditions.
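The BETWEEN query from this task, as a sketch on the scores mentioned in the walk-through. It uses a temporary table so the course database stays untouched; the column names `first_name` and `score` are assumptions.

```sql
-- Session-only copy of the sample scores (column names are assumed).
DROP TEMPORARY TABLE IF EXISTS customers;
CREATE TEMPORARY TABLE customers (
    first_name VARCHAR(50),
    score      INT
);

INSERT INTO customers (first_name, score) VALUES
    ('Maria', 350), ('John', 900), ('George', 750),
    ('Martin', 500), ('Peter', NULL);

-- Both boundaries are included, so Martin (exactly 500) qualifies.
-- Returns Maria and Martin; the NULL score is never matched.
SELECT *
FROM customers
WHERE score BETWEEN 100 AND 500;
```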
First, the score should be greater than or equal to 100, so we use the greater-than-or-equal operator with 100. Then we write the second part, the second boundary. The score should be smaller than or equal to 500, so we use the less-than-or-equal operator with 500. With that, we have rebuilt the BETWEEN condition. I'm going to remove this part over here and execute it, and we get exactly the same results, because we just defined it in another way. Some developers, like me, tend not to use BETWEEN and use such conditions instead, because for me it's easier to read what the query is doing. When I use BETWEEN, I need to remember that, e.g., the boundaries are included, and if you forget that, you need to look it up. So it's really easier to just read exactly what the query is doing. That's why I tend to avoid BETWEEN and use the two conditions with AND. And there is one more advantage: you can control it better. E.g. I could use only less than, without the equals, for the boundary with the max value. So you can define the range more flexibly than with BETWEEN. Alright, so that was it for the BETWEEN operator. Next we're going to learn about the IN operator. 20. SQL | IN: Alright guys and girls. So now we're going to talk about one more logical operator that you can use inside the WHERE clause in order to filter your data, and that is the IN operator. It allows you to define a list of values that you would like to see in the results, or to be included in the results. So how does it work? As I said, you define something like a checklist, a list of values where you are telling SQL that only those values are allowed in the results. Here you can define multiple values. It's not like BETWEEN, where you define the boundaries; here it is a list of values. The database then asks for each value: is this value inside the list?
If the answer is yes, then it's going to be true. If the answer is no, it's simply false. Alright, so now, as usual, in order to understand that, we're going to take one task and try to solve it in SQL. The task says: find customers whose customer ID is equal to one of the values 1, 2 or 5. Let's try to solve that. As usual, there is no specification about the columns, so we're going to SELECT star FROM customers. Now we need to filter the data, so we use the WHERE clause, and here we start. The task says the customer ID, so that is the column we're going to use in order to filter the data. And now we have a set of values: 1, 2 and 5. In order to use that, we use the IN operator, and then we start defining the list, the checklist. Open bracket, the first value is 1, then comma, 2, comma, 5, then close bracket. So we defined the list of values that we want to see in the results. With that, we run the query and see what happens. As you can see, the query has run and we have the customers that exactly match our list: the customer IDs 1, 2 and 5. Okay, so now let's see what the database does once we execute the IN operator. As usual, SELECT star FROM customers means I want to see all the columns in the results, and the database selects them. Since we have a WHERE clause, it starts checking the condition. The condition says the customer ID should be in this list, so the database checks each customer. Here we have customer ID 1, and it is in the list. That's why we get a true for this condition, and we see it in the results. The next one is 2. Here we have true as well, and we get it in the results. The third customer has customer ID 3, and it is not in the list. That's why we get false here. The same goes for 4: 4 is not in the list, so the database ignores it.
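The IN query dictated above can be sketched as follows. The temporary table keeps the example self-contained; the column names `customer_id` and `first_name` are assumptions based on how the columns are spoken in the lesson.

```sql
-- Session-only sample table (column names are assumed).
DROP TEMPORARY TABLE IF EXISTS customers;
CREATE TEMPORARY TABLE customers (
    customer_id INT PRIMARY KEY,
    first_name  VARCHAR(50)
);

INSERT INTO customers (customer_id, first_name) VALUES
    (1, 'Maria'), (2, 'John'), (3, 'George'), (4, 'Martin'), (5, 'Peter');

-- The list in brackets is the checklist: a row is kept only if its
-- customer_id appears in it. Returns the rows with IDs 1, 2 and 5.
SELECT *
FROM customers
WHERE customer_id IN (1, 2, 5);
```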
And the last one has customer ID 5, and it is in the list, so we get a true for that. This is how the database processes our query. Alright, so you might tell me now: wait a minute, Baraa, I just learned about the OR operator and how to combine different conditions using OR, and I could solve this task using that instead of using IN with a checklist. So let's see how we could do that. I agree, it works as well. SELECT star FROM customers WHERE customer ID equals 1, that's the first one, then we write OR customer ID equals 2, and go on: OR customer ID equals 5. If I run this query, we get exactly the same results. So I agree with that, but as you can see, the IN version is more compact and much easier to read: you make a list and that's it. With OR, you have to define all those values as multiple conditions and connect them. Imagine you have ten values; you would have ten rows of code here. So I really like it with the IN operator. It is more compact and easier to read. Alright, so that's all about the IN operator. Next, we're going to learn a very important operator: LIKE. 21. SQL | LIKE: Alright guys and girls. So now we have the final logical operator that you can use inside the WHERE clause in order to filter your data, and that is the LIKE operator. It is a little bit more complicated than the others, but don't worry about it. I'm going to explain it step by step with examples, and once you understand it, it's easy and fun to use. In the other examples with the WHERE clause, we always defined the whole value, the complete value, in the WHERE clause. But sometimes you might be in situations where you don't know the whole value. You are searching for some values and you have a pattern in your head, e.g. you are searching for customers whose name starts with M. So here you don't know the whole value. You are searching for something and you have a pattern.
You can use the LIKE operator with a pattern in order to find those customers. Or there may be so many values in the database that it's almost impossible to define all of them in the WHERE clause. Instead, you define a pattern and tell SQL: I am searching for something like this. So LIKE works like this: it returns true if the value matches the pattern; otherwise it returns false. That means we need to build a pattern in SQL, and in SQL we have two tools in order to do that. We have the percent sign, which matches anything, and we have the underscore, which matches exactly one character. So now let's take an example in order to understand that. The first example is: find names that begin with M. That means you know that the names begin with M and you don't care about the other characters. So now we need to build such a pattern. We write down the M and then the percent sign. With that you are telling SQL: it begins with M, and the rest doesn't matter. It could be empty, it could be one character or multiple characters; that doesn't matter. But for you it's very important that they start with M. Now we have another one. It says: find names that end with n. That means the name can start with anything, so we start with the percent sign, and it should end with n. Here you need to be careful with upper and lower case: depending on how the database compares text, a small n and a capital N can be treated as different characters. So this pattern tells SQL: it starts with anything, but I need it to end with n. Now we have the example where you say, okay, it doesn't have to be the first or the last character; the name should contain an r somewhere. So: find names containing an r. You are not defining whether it is at the start or at the end. For that, you can use the following pattern: it can start with anything, then r, and end with anything.
You don't know where exactly the r is; the name should just contain an r somewhere. Now, in the next one you can be more specific and say: okay, find me the names that contain an r, but exactly at the third position. This is a little bit more complicated, and for that you use the underscore. With the underscores you say: okay, the first position can be anything, the second position can be anything, but the third should be exactly r. And afterwards it can be anything, like nothing at all or more characters. So with that, you are mixing those two tools, underscore and percent. So now we're going to go into more detail with examples in order to understand how it works. Okay, so now we're going to deep dive into each of those examples, and I'll explain what is going on in the database once you define those patterns. The first example is: find names that begin with M. Our pattern is M and percent; that means whatever comes after, we don't care, but it should start with M. In our database we have those five values, those five names, so let's go one by one. Maria starts with M. That means it matches our pattern, so SQL returns true for it. The next one is John. The J here does not match our pattern, so SQL puts false on it. Then George, the same thing: it starts with G and does not match our pattern. It would have to start with M to get a true, so we have false. Martin starts with M. That means it matches our pattern and we get a true for it. And the last one, Peter: we have P, which does not match our pattern, so we get false. So if you define this pattern in SQL, you get those trues and falses from the database. Okay, so in the next example we have: find names that end with n, small n. Our pattern is anything, so the percent sign, and then small n. Let's go through the names.
The first one is Maria, and the database checks the last character. The last one is a, which does not match our n, so it is rejected and you get false. Then we have John. John has n as the last character, so it matches our pattern, and the database puts true on it. Next we have George. The last character is not n, so it does not match the pattern: false. Martin ends with n, so we have true here; the last character matches our pattern. And Peter has r at the end, which does not match the pattern. So if you run such a pattern on your database, you get only John and Martin as a result. So let's go to the next one. The next one says: find names containing r, and we didn't specify any position, just that there should be an r somewhere. So the pattern is percent, r, percent. That means somewhere there is an r. With Maria, there is an r somewhere: over here we have the r, so it returns true. With John, there is no r anywhere; there is no r character here. That means the database returns false. In George we have an r here, so it returns true. Martin is the same, and Peter as well. So as you can see, if you start with the percent and end with the percent, the database finds your character anywhere and returns true. As you see here, Peter ends with r, and in Martin the r is somewhere in the middle. So here you don't care about the position of your character. Okay, so now we come to the final one. It says: find names containing r at the third position. Here we are very specific. We are saying that exactly the third character should be r. In order to do that, the percent alone is not enough; we use the underscore. It says the first character can be anything, the second character can be anything as well, but the third character should be exactly r. And after that it can be anything; it can be empty or a bunch of characters.
We don't care about that. So let's go through our values and see how the database reacts. Maria starts with M, that's okay; the a is okay; the third should be r, and here we have a match. What comes afterwards doesn't matter. So this matches our pattern, and Maria gets a true from the database. The next one, John: the first two characters are okay, but the third one does not match the pattern. It is h. That's why we get a false for it. For the third one, you can see that the third position is o, which does not match our pattern either. Martin matches: the first character is M, which can be anything, the second one, a, as well, and the third is r. So this matches our pattern. The rest can be anything. That's why Martin matches our pattern exactly. The last one, Peter, doesn't match our pattern, because at the third position we have t. So if you run such a pattern on your database and you are that specific, you get only Maria and Martin as a result. So now, as a next step, we're going to learn how to write SQL statements using the LIKE operator, in order to understand the syntax and to solve those four tasks. We're going to start with the first one: find all customers whose first name starts with M. As usual, we SELECT star, since there is no specification about the columns. Our table is customers. And now we have to filter the data with our pattern. So, WHERE clause: the column that we're going to use with our pattern is the first name. Then we write down the LIKE keyword, and after that we specify the pattern. It starts with a single quote, then capital M, percent, and then we close it with a single quote. With that, we specified the pattern for the LIKE operator, so let's run it. As you can see in the results, we got those two customers that have a capital M at the start of the first name.
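The first LIKE task, written out as a sketch. The temporary table and the column name `first_name` are assumptions made so the example runs on its own. Whether LIKE tells upper and lower case apart depends on the collation of the column, which is why the result comment only refers to the five sample names.

```sql
-- Session-only sample table (column names are assumed).
DROP TEMPORARY TABLE IF EXISTS customers;
CREATE TEMPORARY TABLE customers (
    customer_id INT PRIMARY KEY,
    first_name  VARCHAR(50)
);

INSERT INTO customers (customer_id, first_name) VALUES
    (1, 'Maria'), (2, 'John'), (3, 'George'), (4, 'Martin'), (5, 'Peter');

-- 'M%': an M first, then any number of characters (including none).
-- With the sample names this returns Maria and Martin.
SELECT *
FROM customers
WHERE first_name LIKE 'M%';
```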
So this is how we do it using the LIKE operator. The next one says: find all customers whose first name ends with a small n. We have the same setup here, but we need to redefine the pattern. A single quote ("high comma", that was German), then anything, so the percent, then small n, then close it. Let's run that. And as you can see, we got those two customers, John and Martin, because their first names end with n. Alright, so now to the third task. It says: find customers whose first name contains a small r somewhere. Let's do that. We have the same setup here, but we need to change the pattern: single quote, then percent, small r, percent, then single quote. With that, as I said, you are not specifying any position; there should just be an r somewhere. So let's run that and check our query. You can see here that Maria has an r somewhere, George has an r somewhere, and so do Martin and Peter. So we got those four customers. But we didn't get John, because he doesn't have an r in his first name. Okay, so now to the last one. The task says: find all customers whose first name contains r at the third position. Again the same setup; we only need to change the pattern. Single quote, then the first character can be anything, so underscore. Again underscore, since the second character can be anything. Then we define the r, and then we say anything after that, so percent. Then single quote, and that's it. Once we have written the pattern down after the LIKE, let's run it. And as you can see, we get only Maria and Martin, as we discussed, which have r as the third character. So with that you have those four examples with the LIKE operator. It's really fun once you start practicing with it. So I would say, try now to make up some patterns in your head, write them down, and see how SQL reacts to them.
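As a starting point for that practice, the three remaining patterns can be evaluated for every sample name in one statement. This is a sketch assuming MySQL, with the five names supplied inline through a derived table instead of the course table; the alias names are made up.

```sql
-- Each LIKE expression yields 1 (match) or 0 (no match) per name.
SELECT
    first_name,
    first_name LIKE '%n'   AS ends_with_n,          -- John, Martin
    first_name LIKE '%r%'  AS contains_r,           -- Maria, George, Martin, Peter
    first_name LIKE '__r%' AS r_at_third_position   -- Maria, Martin
FROM (
    SELECT 'Maria' AS first_name
    UNION ALL SELECT 'John'
    UNION ALL SELECT 'George'
    UNION ALL SELECT 'Martin'
    UNION ALL SELECT 'Peter'
) AS names;
```

Any of these expressions can be moved into a WHERE clause to turn the check into a filter, which is the form used in the tasks.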
Only with practice are you going to get good results and really understand it. Alright, so that's all for this chapter. We have learned how to filter our data using the WHERE clause and many important operators. In the next chapter, we're going to step up a level: we're going to learn how to combine our SQL tables using joins and unions. 22. SQL | JOINS Concept: Alright, guys and girls, so far we have learned how to query only one table. In all our examples we focused on the table customers: we did SELECT, WHERE, we filtered the data and so on. That was only one table. In real-life scenarios, you will be working with a real database that contains a lot of different tables. And once you start writing SQL statements, you will end up querying not only one table, but multiple tables in order to get something meaningful out of the data. That means you need to learn how to combine different tables, how to join those tables together in one SQL statement. This is very important when learning SQL, because once you master this, you will be good at SQL. Now, in our tutorial database, we will be working with two tables, customers and orders. In orders, you can see which customer placed which order. So now, in order to join those two tables, you have to specify two things. First, you need to determine the join key. A join key is a column that exists in both of the tables, e.g. the customer ID: we can see it here in customers and in orders as well. That means the customer ID is a good candidate for joining those tables, and it's going to be our join key. The second thing that you need to specify is the type of the join. In SQL, we have four different types of joins: the inner join, left join, right join, and full join. It might look complicated at the start, but don't worry about it. I'm going to explain all of those types step by step with examples.
I'm going to show you as well how SQL works with those types. Alright, so now let's start with the first type of join, the inner join. The inner join is the most commonly used type of join among developers. I also tend to use a lot of inner joins in my SQL statements, so it is very widespread. There is a very important aspect that you need to understand once you work with SQL joins: in SQL, there is always a left table and a right table, and which is which depends on how you write the script. We will see that in the examples. In our SQL joins, the left table is customers and the right table is orders. For the inner join, that doesn't matter, because once you use an inner join, only the matching rows are presented in the results. So if you use an inner join, you exclude all the rows that don't match, and you see as a result only the matching rows between those two tables. Now to the second type of join, the left join. As the name says, it is a left join. That means we depend more on the left table than on the right table. Once you specify the left join in your SQL script, you are telling the database: I want everything, all the rows, from the left table, and from the right table only the matching rows. So once you say left join, you get all the records from the left side and only the matching rows from the right side. Let's go to the next one, the right join. It is exactly the opposite. When you say right join in your SQL script, you depend completely on the right table. That means once you write that script, SQL presents all the records from the right table in the results, and from the left table only the matching records, only the matching rows. So it's really the exact opposite of the left join.
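The difference between the left and the right join can be sketched on two tiny tables. Everything about the data here is hypothetical: the transcript does not show the contents of the orders table, so the order rows, the column names and the unmatched customer ID 6 are invented purely to make both join types visible.

```sql
-- Session-only tables with hypothetical data.
DROP TEMPORARY TABLE IF EXISTS customers;
DROP TEMPORARY TABLE IF EXISTS orders;
CREATE TEMPORARY TABLE customers (customer_id INT, first_name VARCHAR(50));
CREATE TEMPORARY TABLE orders (order_id INT, customer_id INT, quantity INT);

INSERT INTO customers VALUES
    (1, 'Maria'), (2, 'John'), (3, 'George'), (4, 'Martin'), (5, 'Peter');
INSERT INTO orders VALUES
    (1001, 1, 500), (1002, 2, 1000), (1003, 3, 200),
    (1004, 6, 750);  -- hypothetical order with no matching customer

-- LEFT JOIN: every customer is kept; Martin and Peter have no order,
-- so their order columns come back as NULL.
SELECT c.first_name, o.order_id, o.quantity
FROM customers AS c
LEFT JOIN orders AS o
    ON c.customer_id = o.customer_id;

-- RIGHT JOIN: every order is kept; order 1004 has no customer,
-- so first_name comes back as NULL.
SELECT c.first_name, o.order_id, o.quantity
FROM customers AS c
RIGHT JOIN orders AS o
    ON c.customer_id = o.customer_id;
```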
Then we have the last type of join, the full join. Once you say in your script that you want a full join, that means you want everything from both of the tables. From the left table you get all the rows, and from the right table you get all the rows as well. So a full join, as the name says, is everything. Alright, so with that we have an overview of the joins. Now, before we start talking about the first type, the inner join, we will quickly learn about SQL aliases. It's like a hidden tutorial, not on the roadmap, but we have to learn it before we start writing SQL joins. 23. SQL | AS Statement - Aliases: Okay, so now, before we start with some examples in order to understand and learn how to join tables using SQL, we have to learn a very important thing in SQL, and that is SQL aliases. You need them once you start querying multiple tables in one SQL statement. Let's take this: if I only want to select the customer ID from customers, this should not be a problem. If I execute this, I get all the customer IDs. But once I specify multiple tables in one query, I need to tell the database which customer ID from which table, because as you see in our example, we have the customer ID in two tables, in customers and in orders. If you leave it like this, you will get an error where the database tells you: I don't really understand which column you mean. Do you mean the column from customers or from orders? That's why we need to specify one more thing next to the column name, and that is the table name. So we write customers, dot, customer ID. With that, you are telling the database: I want the customer ID from customers. If I execute this, I get the same result. There is no problem here, but you need to specify that once you are working with multiple tables.
But the annoying thing here is that if you always have to write the full table name, it gets really tedious. That's why we're going to work with aliases. We give the tables something like a nickname, and in SQL we call that an alias. Okay, so in order to do that in SQL, we go right beside the table name and write the keyword AS, and then give the alias name, the nickname. I'm going to use c instead of customers. Now the database understands: okay, in this script c is used instead of customers. So I can go everywhere, and instead of using customers, I can say c. If I run this, I get exactly the same result. There is no error. But now, as you can see, it is much easier to handle my script. I just write c dot customer ID instead of customers dot customer ID. It's really an easier way to handle things, and I always tend to do that. So I really recommend using aliases in order to keep your scripts small. You can do the same for the columns as well. E.g. we have the customer ID here, and I can rename it. To do that, it's the same thing: I go right beside it and write AS. Instead of customer ID, I'm going to write CID. Let's run this. As you see, SQL understood that, and it prints the column in the result as CID, as if to say: hey, I understand, I'm renaming this column in the result to CID. There is a very important aspect to understand here: it renames that only in my script and in the results. The database will not go to the tables and rename the tables or the columns; there is a different statement for that. So this AS command is only temporary, for my script and the results, and nothing changes in the data model or in the database. The table is still called customers and the column is still called customer ID.
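Both kinds of alias from this lesson, as a minimal sketch. The temporary table and the column name `customer_id` are assumptions made so the example is self-contained.

```sql
-- Session-only sample table (column names are assumed).
DROP TEMPORARY TABLE IF EXISTS customers;
CREATE TEMPORARY TABLE customers (customer_id INT, first_name VARCHAR(50));
INSERT INTO customers VALUES (1, 'Maria'), (2, 'John');

-- Table alias c and column alias cid: the result column is labelled cid.
SELECT c.customer_id AS cid
FROM customers AS c;

-- The aliases lived only in the statement above; the table and the
-- column still carry their original names.
SELECT customer_id
FROM customers;
```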
This is only a tool to help you when you are writing SQL statements, and to help you rename things quickly in the result. Alright, so now we have everything we need to start with the first type of join, the inner join. 24. SQL | INNER JOIN: Okay, so now let's start with a task in order to understand how to write SQL statements that join two tables. The first task says: find all customer ID, first name, order ID, and order quantity, excluding those customers who didn't place any orders. In this example, as you see, it is not only the customers table. We need some columns from the customers table and some columns from the orders table, and we have to join them to do that. Let's do that step by step using SQL. First we start with the SELECT, which is where we specify the columns from the task. We will not use SELECT star. We need the customer ID, the first name, the order ID, and the quantity. Now we need to specify the tables. We're going to start from customers. With the inner join, it doesn't matter whether you start from the left or from the right. So I'm going to start from customers. Now, to specify the second table, we use the JOIN statement. We say INNER JOIN, and with that I'm saying: we're now going to join customers with another table. So we INNER JOIN orders. With that you are connecting two tables, customers and orders. As I said, you need to specify two things: the join type and the join key. We have already specified the inner join here because we don't need the customers who didn't place any orders. The second thing you need to specify is the join key: how are you going to connect those tables? You need to specify that for SQL. 
So we go to a new line and say ON, then the columns we are joining on. To specify the columns, I'm first going to assign some aliases. Instead of customers, I'm going to call it c, and instead of orders, I'm going to call it o. Now, to join those tables, we need to find out what our join key is: which column exists in both tables? We can see that the customer ID is found in customers and in orders, and it is the perfect column to join those tables. So we connect both of them with ON. I say: take the customer ID from customers, and it should equal the customer ID in orders, so o dot customer ID. With that, I specified the rule, or the key, for how the tables are going to be joined. I said the customer ID from the left table should be exactly the customer ID from the right table, from customers and orders. So I specified the rule, and I specified the join type over here as well, and with that we connected two tables. Alright, before I run this query, we still have one problem: for the customer ID in the SELECT, I didn't specify which table it comes from. If I run it like this, we will get an error. You could try it. So now we need to specify which customer ID I want: is it from customers or from orders? To do that, we use c dot, the table name or the alias, to say: I want the customer ID from customers. For the rest, you don't need to do that, because a name like first name is unique; it exists only in customers. But I really recommend that once you are joining tables, you qualify every column. It is a very nice way to document your work, to say: the first name is from customers. 
Because over time you could forget that, or if you don't know the data model, it will be hard to tell whether this first name is in customers or in orders. So it's a really nice way to document that. If you just put the table name or the alias in front, you can see very quickly that those two columns come from orders and those two columns come from customers. And one more thing to make it look nicer: I'm just going to indent with tab. So now we are ready, I think. Let's run the query. As you can see in the results, we got the columns from both tables: the customer ID and the first name from customers, and the order ID and the quantity from orders. Okay, so now let's understand what the database was doing once we executed the inner join. First, it determines which tables we need. In the script we have FROM customers, so it reads the table customers, and then we have the join with the table orders. That means the database is going to focus on both tables. Then it determines which table is left and which is right. Since customers comes first, in the FROM, it considers the customers table as the left table. And since orders comes next, in the join, it considers it as the right table. This is very important for joins, but since we are using the inner join, it doesn't really matter whether we put customers or orders first; the database follows the script. Okay, as a next step, the database checks which columns we need. In our SQL statement, we said we need only the customer ID and first name from customers, and from orders we need the order ID and quantity. Alright, as a next step, the database checks which rows should be presented in the results. 
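The inner join query assembled step by step in this lesson might look like the sketch below. The snake_case column names (`customer_id`, `first_name`, `order_id`, `quantity`) are assumptions about the course schema; the transcript only gives the spoken names.

```sql
-- Inner join: only customers that have at least one matching order are returned.
-- Column names are assumed; adjust them to the actual schema.
SELECT
    c.customer_id,   -- qualified, because customer_id exists in both tables
    c.first_name,
    o.order_id,
    o.quantity
FROM customers AS c              -- left table
INNER JOIN orders AS o           -- right table
    ON c.customer_id = o.customer_id;   -- join key
```

Qualifying every column with its alias is optional for names that exist in only one table, but it documents where each column comes from.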
And here is the most important thing: we are using the inner join, which means the database should present only the records that match. To do the matching, it needs the key column for the join. We specified that: you need to check the customer ID between those two tables. So let's go through that. The first customer, ID one, is in customers, and we also have it as a record in orders. That means there is a match between the two tables and this customer will be presented. So here we get customer ID one, first name Maria, and her order was 1001, and we have this quantity. So here we have the whole record of Maria from both tables. We go to the next one. We have John. John is present as customer ID two in the table orders as well. So there is a match and he will be presented in the results too. His order is 1002, and he has this quantity. Then it proceeds to the third customer. The third customer exists in both tables, customers and orders, and will be listed in the results as well, with his order ID and this quantity, 500. But now we come to customer ID four. Customer ID four exists only in customers, and we don't find it in orders. That's why there is no match, and the database ignores this customer and proceeds. Over here it checks customer ID five. It also exists only in customers and not in orders, so there is no match. One more thing: we have customer ID number six over here. We have it only in orders, but we don't have it in customers, so there is no match. With the inner join, a row is presented in the result only if the key exists in both tables. Alright, so that's all for the inner join. Next, we're going to talk about the left join. 25.
SQL | LEFT JOIN: Okay, so now let's go to the next task, and we have the following: find all customer ID, first name, order ID, and quantity, but include those customers who didn't place any orders. For us, that means we need to see all the customers in the result, not only the customers that placed an order. To do that, we use the left join. We have exactly the same query. Nothing has changed: the same columns, the same tables. But instead of saying INNER JOIN, we say LEFT JOIN. For SQL, that means: list all the customers. Let's see what happens if we do that. Let me make it a little bit bigger. As you can see here, with the left join we have all the information from customers and only the matching rows from orders. Alright guys, again, let's understand what the database was doing once we executed the left join. The database focuses on customers and orders. It understands that customers is the left table because it comes first, with the FROM, and orders is the right table because it comes second in the query, in the LEFT JOIN. After that, it picks the columns. Again, we have the customer ID, first name, order ID, and quantity. Now it starts doing the matching and checks which join type we have. We have the left join. Since it is a left join, the database says: I need everything from the left table, without doing any matching. So it lists all the IDs and all the names without checking anything. But from the right side we need only the matching records, so it really checks each one of them. Here, customer ID one has a match, so it takes it and puts it in the result. 
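As a sketch, the left join version of the query differs from the inner join only in the join keyword. Column names are again assumed to be snake_case versions of the names spoken in the lesson.

```sql
-- Left join: every row from customers (left table) is kept.
-- Where no order matches, order_id and quantity come back as NULL.
-- Column names are assumed; adjust them to the actual schema.
SELECT
    c.customer_id,
    c.first_name,
    o.order_id,
    o.quantity
FROM customers AS c
LEFT JOIN orders AS o
    ON c.customer_id = o.customer_id;
```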
For customer ID two, we have a match as well, so it puts it in the results. For customer ID three there is a match too. But Martin doesn't have any orders, so the database shows nulls instead. Null means empty: there is no value found, or it is unknown. And for Peter as well, there is no order with customer ID five. That means there is nothing on the right side, so we get empty values there too. So this is how it looks. Once you execute the left join, you get everything from the left and only the matching rows from the right. If anything is missing, the database puts nulls. Alright, so that's all for the left join. Next we're going to talk about the right join. It is very similar to the left join. 26. SQL | RIGHT JOIN: Okay, so now let's jump to the next one. We're going to talk about the right join. We have the following task, and it's almost the same: find all customer ID, first name, order ID, and quantity, but this time include all orders, regardless of whether there is a matching customer. For us, that means we need all the orders from the right table, orders. To do that, we have the same setup over here in SQL. We just need to change the type of join, so we write RIGHT here. Once you do that, you are controlling how the database matches and presents the results. We keep the same setup and don't change anything else. Let's run this. With that, you can see the database listed all the orders from the orders table, and from the left side only the matching customers. Okay, as usual, let's see what the database did once we executed the right join. We have the same setup: customers is the left table, orders is the right table, and we have the same columns as well: customer ID, first name, order ID, and quantity. But the difference now is that we say it is a right join. 
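A sketch of the right join variant, under the same assumed column names. Only the join keyword changes; `customers` is still the left table and `orders` the right one.

```sql
-- Right join: every row from orders (right table) is kept.
-- Where no customer matches, the customers columns come back as NULL.
-- o.customer_id is added here (not in the lesson's query) so the order's own
-- key stays visible even when c.customer_id is NULL.
SELECT
    c.customer_id,
    c.first_name,
    o.order_id,
    o.quantity,
    o.customer_id AS order_customer_id
FROM customers AS c
RIGHT JOIN orders AS o
    ON c.customer_id = o.customer_id;
```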
So in SQL, it presents all the rows from the right table without checking whether there is a match on the left. The database selects everything from here, all the orders and all the quantities, without checking anything on the left side. From the left side, it presents only what matches. So it checks: do we have customer ID one? Yes, we have it, so it presents the values over here on the left side. Do we have customer two? We have it as well. Customer three? We have George over here. But we don't have a customer number six, so that's going to be null again, empty. We don't have a customer with the ID six in the customers table. So with that, we presented all the orders from the right side and only the matching information from customers. Alright everyone, so that's all for the right join. Next we're going to talk about the last type of join, the full outer join. 27. SQL | FULL JOIN: Alright, let's move to the last one. We have the full join, and we have the following task: list customer ID, first name, order ID, and quantity, but this time include everything, all orders and all customers. Okay. About the full join, I have two things to say. First, the full join is only supported in some databases, like Microsoft SQL Server or Oracle. In MySQL you cannot use the full join. Instead, I'm going to show you a workaround for doing a full join in MySQL, so don't worry about it, but we need to twist some things to create the full join. If you are using Microsoft SQL Server, you can just say FULL JOIN. The second thing is that the full join sometimes has bad performance if you have big tables, so try to avoid using it. In my projects, I always tend to use inner join, left join, and right join; full outer joins I really try to avoid, because the full join can perform really badly. 
If you have small tables, it should not be a problem. But once the tables get big, the full join is going to be really slow, because you are saying: give me everything from the left, give me everything from the right. That sometimes performs badly, so try to avoid it. So now the question: how are we going to do a full join if MySQL doesn't have a FULL keyword? As I said, we're going to use a workaround. Follow this: a full join is actually a combination of a left join and a right join. So what I'm going to do is duplicate this script. We have the same query twice, but in one we say LEFT JOIN and in the other we say RIGHT JOIN. In the next tutorial, we're going to talk about how to combine two statements into one. To do that, we use the keyword UNION. Once I put UNION between them, I'm combining two statements into one. Here I'm saying: give me all the results from the left and combine them with the results from the right. If you execute it, you get exactly the same result as the full join. With that you can see that I have all the customers here and all the orders as well, so we have a full join. Alright guys, now let's see what the database does once we execute the full join, or the script that I showed you, left union right. We have the same setup, customers and orders, and we have those four columns. Since it's a full join, that means all the records from the left and all the records from the right. It starts from the left: we get all the customers and all the first names. Then it starts matching on the right side. Maria has this order and this quantity, customer ID two has this order and this quantity, and for three we have this ID and this quantity. But for Martin and Peter, we don't have any orders, so we see nulls over here and over here. 
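The "left union right" workaround described here can be sketched as follows for MySQL, which has no FULL JOIN keyword. Column names are assumed, as in the earlier examples.

```sql
-- Full join emulation in MySQL: left join UNION right join.
-- UNION (without ALL) removes the matched rows that both halves return.
-- Column names are assumed; adjust them to the actual schema.
SELECT c.customer_id, c.first_name, o.order_id, o.quantity
FROM customers AS c
LEFT JOIN orders AS o
    ON c.customer_id = o.customer_id
UNION
SELECT c.customer_id, c.first_name, o.order_id, o.quantity
FROM customers AS c
RIGHT JOIN orders AS o
    ON c.customer_id = o.customer_id;
```

On a database that supports it, such as SQL Server, a single `FULL JOIN` between the two tables would express the same intent. One caveat of the workaround: because UNION removes duplicates, two genuinely identical result rows would also collapse into one.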
But there is still something missing: we don't have all the orders here. That's why the database also presents this order ID and this quantity. When it tries to match on the left side, it sees there is no customer on the left side, and it puts nulls over here. So with that, you get all the customers and all the orders that match them, and the other way around. With that you have all orders and all customers using the full join. Alright guys, so with that, we have learned all the different types of joins. Next we're going to talk about a similar concept: UNION and UNION ALL. 28. SQL | UNION: Alright, so now we're going to learn how to combine tables using UNION. UNION is a very important and very powerful tool in SQL for combining tables. Previously we learned how to combine tables using joins. What we do in joins is this: we have two tables, customers and orders, and we join the columns together. So in the result we get one big table, one table with all the columns from the left and from the right. With UNION, we are also combining two tables, but instead of combining the columns, we combine the rows. So here we get a very long table, including all the rows from the left and from the right, but with the same columns. We will not get all columns from left and right; instead, we get all the rows from the left and all the rows from the right. Okay, to understand UNION, we're going to take the following example. In our tutorial database, we have two tables: the table customers and the table employees. We have the following task: make a list of all persons from customers and from employees, with the first name, last name, and country. That means it doesn't matter whether the person is a customer or an employee. 
We're going to make one list with everything. To solve this task, we use the UNION operator between the two tables, customers and employees. If we look at this closely, you will find those three pieces of information in both tables. We have the first name in customers and the same in employees, the last name in customers and the last name in employees, and the country in employees and the same in customers. It is very important that we have matching columns in both of them. When we do the union between them, the database takes the column names only from the left table. So we will have first name, last name, and country, and we will not have the same columns again from the right one. It's not a join, it is a union, so the left one decides what the column names are. This is very important. The database goes and selects everything from the left table and puts it in the results. It does the same for the right one, employees: it selects all the records and puts them over here. With that, we have a full list of all persons from customers and from employees in one result. It is very important that both queries in the SQL statement have exactly the same number of columns, in the same order. If in employees we put the last name first and then the first name, we will get them switched in the results as well. So be careful with the order of the columns, and the number of columns should match between left and right. One more very important thing: there are two types of union. Type number one is UNION ALL, where we get the result exactly like this. That means if there are any duplicates between table one and table two, those duplicates stay in the results; there is no check on the uniqueness of the results. There could be a person on the left and the same person on the right. 
Nothing is going to happen; we get the whole result. But maybe you wish to remove those duplicates. If you check the results over here, you can see John: he is a customer and at the same time an employee. This could happen. To remove such duplicates, we can use the other type of union, which is plain UNION without the ALL. I'm going to show you that once we write the SQL statements. This is very important to understand: if you want to keep the duplicates, exactly like the data inside the tables, use UNION ALL. If you want to remove the duplicates, use UNION. So now let's see how to do that in SQL. It is really easy. All we do is write two queries, one for customers and one for employees, and then put UNION between them, and we get the results. Let's build the first one: SELECT first name, last name, and the country FROM customers. This is the first query. Let's execute that and see: now I have a list from customers. Then we write the same again for employees. So SELECT; in employees we have first name and last name as well, and emp country, FROM employees. Let's run the query and see. Now we have the list from employees. As you can see, we now have two queries, one for customers and one for employees. To do the union and keep all the duplicates, we write the keyword UNION ALL between them. Now we run the whole thing and check. With that we got all the first names, last names, and countries from both tables, customers and employees. As you can see, this list contains duplicates, because, for example, John is in customers as well as in employees. If we wish to remove such duplicates between customers and employees from our results, we just remove the ALL from here and use UNION. 
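A sketch of the two union variants from this lesson. The column names `first_name`, `last_name`, `country`, and `emp_country` are assumptions based on the spoken names in the transcript.

```sql
-- UNION ALL: stacks the rows of both queries and keeps duplicates,
-- so a person who is both customer and employee appears twice.
SELECT first_name, last_name, country
FROM customers
UNION ALL
SELECT first_name, last_name, emp_country
FROM employees;

-- UNION: same stacking, but identical rows are returned only once.
SELECT first_name, last_name, country
FROM customers
UNION
SELECT first_name, last_name, emp_country
FROM employees;
```

Note that UNION compares whole rows, so a duplicate is removed only when all three values are identical in both tables.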
So let's run that again. Now we get a unique list, so John appears only once here. This is how we do a union. One more thing is how to control the column names. As you can see, first name, last name, and country come from the query above. This first query controls the naming of our result. If you wish to have different column names, don't change them in the query below, because nothing will happen; the database just ignores it. It is here, in the first query, that we control the names. So if I add, for example, person first name here, person last name here, and person country here, and rerun the query, as you can see, we have those names over here. And if you change anything in the query below, nothing happens. So let's put first name. Let's run the query. You see, nothing happens. Now let's test a few things. I'm going to create a problem: here we have the last name first and then the first name, the opposite of the first query. Let's run this. As you can see, the database does not notice that we have a mistake here, where above we have first name, last name, and here we have last name, then first name. The database doesn't care about that. It only cares that both have the same data type. Since we have varchar here and varchar here, it can present the results. The database doesn't care whether you are doing it right or not; the column names don't mean anything to it. That's why you need to be careful about the order of the columns when you do a union between two tables. Now let's try another data type, for example customer ID. Customer ID is an integer, and the first name over here is varchar. 
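A sketch of how the first query controls the result column names in a union. The alias names follow the ones spoken in the lesson; the underlying column names are assumed.

```sql
-- The aliases in the FIRST query name the result columns.
-- Aliases added to the second query would be ignored.
-- Columns are matched by position, not by name, so the order must agree.
SELECT
    first_name AS person_first_name,
    last_name  AS person_last_name,
    country    AS person_country
FROM customers
UNION
SELECT
    first_name,
    last_name,
    emp_country
FROM employees;
```

Swapping `first_name` and `last_name` in the second query would still run, but last names from employees would land in the first-name column, which is the silent mistake the lesson warns about.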
If I run the query, we will get an error (I think it's hidden over here), because there is a mismatch between the data types: the database cannot combine strings and then integers after that. That's why the data type is very important for SQL. Let me repair everything and run. Now it works, because the data types are the same. Let's try some other errors. I'm just breaking things. Above we have three columns, first name, last name, and country, and here we have the same. Now suppose I have a different number of columns between the two queries; let's add salary. So now we have four columns in one query and three in the other. If I run this query, we again get an error, because the database says you have a different number of columns between those queries and it cannot do the union. That's why the data type is very important, the number of columns is very important, and the order of the columns should match as well. Alright everyone, so with that we have covered the SQL joins, and now you know how to combine SQL tables. In the next chapter we will learn many important functions, starting with the aggregate functions. 29. SQL | Aggregate Functions: Alright, so far we have learned how to retrieve data from our database and tables. But in real-life scenarios, we do a lot of calculations and aggregations on top of the data in order to get something meaningful, some useful information, out of it. In SQL projects, we tend to use a lot of aggregations to understand the data, because the data model sometimes has big tables, and just reading the raw data will not give us any useful information. We have to do some aggregations on top of it to understand the data. So understanding the SQL aggregate functions is very important and essential in learning SQL. 
They are how we get information out of the data. In SQL, we have the following aggregate functions. They are really easy: if you just read the function name, you will understand what SQL does once you execute it. COUNT returns the number of rows in a table. SUM adds up the values. We have the average, AVG, and we have MAX and MIN to return the maximum value and the minimum value. I will go through all of them and explain them step by step with examples, as usual. But it is very important to understand how each function deals with nulls, those empty fields where we don't have a value, because each function deals with nulls differently. Alright, let's start with the first function, COUNT. It is the easiest one we have among the aggregate functions. In many situations, when you start working on, let's say, a new project, you have a lot of tables. The first thing I tend to do is check: how many customers do we have? How many orders? How many employees? It depends on the table. I usually always check how many records we have in each table. Is it a big table? Is it a small table? So suppose we have the following task: find the total number of customers in the database. Okay, let's solve that using SQL. First, I want to get all the data from the table customers. We usually do that using SELECT star FROM customers. That is easy. Now we can see that we have five customers in the table. But the task says: find the total number of customers. That means I want to see only the number five as a result, the total number of customers. To do that, we use the function COUNT. After the SELECT, I type the keyword COUNT, open bracket, close bracket. Inside the COUNT you can specify either a star or the name of a column. 
Let's use the star and execute that. As you can see, we got five as the number of rows, the number of customers in the table. So here we have counted how many customers we have. But as you can see, I don't really like the name of the column; it's just the function name. So let's rename it in the results as total customers. Let's re-execute that. Now it looks better: the total number of customers is five. As I said, we can use either a star or a column name here. Using the star is the easiest way to do a count on a table. If you use a column name instead, it gets a little trickier because of the nulls. Let's see what happens if I type customer ID here and run the query: we get the same result, five. But if I put not the customer ID but the score, you will see we now get four. We have four scores, not five. So what happened here? Let me explain what the database is doing when you say COUNT star or COUNT of a column. If you say COUNT star, you are not specifying any column. The database goes to the table and just counts how many rows it has. So the database counts one, two, three, four, five. We have five rows in the table, and in the results you get five. But if you say COUNT score, if you put the score inside the COUNT, the database counts how many values we have in score. It ignores the nulls. And here is the problem, or let's say the tricky part: when the database counts how many scores we have, it counts only four. So to count how many customers we have, either you say COUNT star or you count how many customer IDs we have, and you get the same result: five. 
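The three counts compared in this lesson can be put side by side in one query. This is a sketch with assumed column names (`customer_id`, `score`); with the course data described in the transcript, the first two should give five and the last one four.

```sql
-- COUNT(*) counts rows; COUNT(column) counts only non-NULL values.
-- Column names are assumed; adjust them to the actual schema.
SELECT
    COUNT(*)           AS total_customers,  -- every row in the table
    COUNT(customer_id) AS total_ids,        -- same result when the column has no NULLs
    COUNT(score)       AS total_scores      -- rows with a NULL score are skipped
FROM customers;
```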
But if you count a column that contains nulls, you get a lower number in the result: with the score we have only four, and with the ID we have five. Okay, so now let's move to the next one, SUM. Unlike COUNT, SUM works only on columns that contain numbers. For example, you can do a sum on the customer ID, because it contains numbers, or on the score, the quantity, or the order IDs, but you cannot sum the first name or the last name. With COUNT, you can work on any type of column: you can count first names, count countries, and so on. With SUM, you deal only with numbers. One more thing: if you have nulls, SUM treats them as zero, so they add nothing to the total. Let's take the following task: find the total quantity of all orders. That means we focus on the table orders and add up the quantities of all orders. It's really easy. Let's do that. First of all, I always like to start with SELECT star FROM orders. Let's run this. Now I have the table orders here, and we focus on the quantity, which we have to add up. To do that, we use the keyword SUM, open bracket, type quantity, close bracket, and run this. With that, you get the total quantity. We added up all the rows into one cell. As usual, we have this ugly name over here, so we rename it to sum quantity. Run it again. Now we have a better name in the results. The sum of the quantity is 2650. Okay, so now let's move to the next one, the average. The average, AVG, is one more aggregate function in SQL, and you can use it to find the average of a column. It is almost the same as SUM: it works on columns that contain numbers. 
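A sketch of the SUM task from this lesson, assuming the quantity column in `orders` is named `quantity`. The transcript reports a total of 2650 for the course data.

```sql
-- Adds up the quantity of every order into a single value.
-- NULL quantities, if any, contribute nothing to the total.
SELECT SUM(quantity) AS sum_quantity
FROM orders;
```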
AVG will not work if you use it on the first name or last name, which are characters; it works only on numbers. The only difference is how AVG deals with nulls. For example, over here we have a null in the score. AVG will not treat it as a zero, as SUM does, but will ignore it completely, because treating it as zero would be a real problem for the average. So in AVG, the nulls are completely ignored. Let's take the following task: find the average score of all customers. Let's try to solve that. We focus on the table customers. As usual, I'm just going to select everything to check the result over here. We need the column score, and we need the average of those values. To do that, we write the keyword AVG, open bracket, then the column name, and close bracket. Let's run this. With that you get the average score of all customers. The nulls are ignored. I'd like to rename it to avg score. Run it again. It looks better. Now we have the average score, 625. Alright, so now we move to my favorite aggregate functions, MIN and MAX. I use them a lot when I'm doing data profiling in order to understand my data. For example, if I am profiling or checking the table orders for the first time, I will be interested in the latest order date. To get that, we can use the MAX function on the order date and get the latest value. Or, for example, I want to check what the highest customer score is, so I can go to the score and apply MAX. MAX and MIN are like COUNT: you can use them on any type of column, so they work on numbers, on characters, and on dates. And the nulls are ignored here too. So if you ask for the minimum value of the score, you will not get the null; you will get 350, which is Maria's. 
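A sketch of the AVG task, with a second column added purely for contrast. It assumes the score column is named `score`; the `avg_counting_nulls` column is an illustration, not part of the lesson.

```sql
-- AVG divides by the number of non-NULL scores only.
-- The second column divides by all rows instead, which is what you would get
-- if the NULL score were treated as zero (shown only for comparison).
SELECT
    AVG(score)            AS avg_score,
    SUM(score) / COUNT(*) AS avg_counting_nulls
FROM customers;
```

When the column contains a null, as described in the lesson, the two values differ, which is why it matters that AVG skips nulls.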
Let's have some examples and tasks in order to understand how to work with MIN and MAX. Alright, so we have the following task: find the highest score, the maximum score, in our customers table. We have the same table over here, so I'm going to remove the average and select the data. I want to get the highest score, and to do that we use the function MAX, open brackets, score, close brackets, and run this. If you do that, you get 900, and that is true. I'm just going to rename the column and run that again. We have the max score as 900. Now let's find the lowest score. The lowest score over here should be Maria's, 350. To do that, we use the function MIN on the score as well. We change the name just to look better and run that again. With the min score, we get 350 and not the null, and this is very important. Alright, so now let's keep playing with the data. Let's take the orders. I'm going to get the earliest date in the order dates, and the latest. Let's try to do that. I'm just going to remove that and select the table orders. Now we want to get the earliest date and the maximum date, or the latest date, from the column order date. To do that, you use the function MIN, open brackets, order date, then close brackets, and just rename it for the results to min order date. Let's run this. With that, we got the minimum date in the order date, so this was the first order date in the table. And let's now get the latest one. To do that, I'm just going to change the function to MAX and change the name of it for the result. And see, this date is the latest date that we have for an order. Alright guys, so with that, we have learned all the aggregate functions in SQL. They are really important for data analytics and data science. Next, we're going to cover the string functions.
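As a small illustration of MIN and MAX on numbers and on dates, here is a sketch under stated assumptions: the table demo_orders, its column names, and the order dates are invented for the example, and only the scores 350, 900 and the null come from the lesson.

```sql
-- Scores: an inline list stands in for the customers table (assumption).
SELECT
    MIN(s.score) AS min_score,  -- 350: the NULL is ignored
    MAX(s.score) AS max_score   -- 900
FROM (
    SELECT 350 AS score
    UNION ALL SELECT 900
    UNION ALL SELECT NULL
) AS s;

-- Dates: hypothetical orders table with made-up dates.
DROP TEMPORARY TABLE IF EXISTS demo_orders;
CREATE TEMPORARY TABLE demo_orders (
    order_id   INT,
    order_date DATE
);

INSERT INTO demo_orders (order_id, order_date) VALUES
    (1001, '2021-01-11'),
    (1002, '2021-04-05'),
    (1003, '2021-06-18'),
    (1004, NULL);

SELECT
    MIN(order_date) AS min_order_date,  -- 2021-01-11: the earliest order
    MAX(order_date) AS max_order_date   -- 2021-06-18: the latest order
FROM demo_orders;
```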
There we're going to learn how to manipulate text data.

30. SQL | String Functions: Alright, so next we're going to learn how to clean up our data using the SQL string functions. In many cases, if you are working with a big database, you will have a lot of columns that include values like text or characters; we call them strings. The data quality inside such columns might sometimes be bad, so you will end up needing some functions in order to manipulate the structure of those values. In SQL we have the following string functions. We have CONCAT, in order to connect two strings into one value. We have LOWER and UPPER, in order to convert the data to lowercase or to uppercase. We have TRIM: if you have some whitespace at the start or the end of the value, you can remove it. We have LENGTH, in order to calculate the length of the value, and then we have SUBSTRING, in order to return a part of the string. Alright, so now we're going to have some tasks in order to understand how to work with those string functions. The first one says: list all customer names, where the customer name is a combination of first name and last name in one column. Let's try to do that. We need the list of all customer names, so we select the first name and the last name as well from customers. If I execute this query, I get the following. We now have a list of all customer names, but we didn't really solve the task, because the task says we want the customer name with the first name and last name in one column. As you can see here, we have them separated in the database. So in order to connect those two strings into one, we're going to use the function CONCAT. Let's see how we do that. We need the keyword CONCAT and open brackets, and here we list the first column, first name, then a comma, then last name. I'm going to move those in here, and let's see the result.
As you can see, now we have the first name and last name together in one column. If we want to separate them from each other as well, we could use one more string: I'm going to put a minus between them. So I'm now connecting three strings: first name, a minus that comes from me, then the last name. Let's check how it looks. As you can see, we get Maria-Cramer. With that, we have a list of all customer names with the first name and last name in it. I just want to rename it as well, to customer name. Let me make it smaller. Alright, so let's verify that. As you can see, now we have a column called customer name, and we have exactly the information that we need. So if you want to connect two or more strings, you can use the function CONCAT. Another task might be: I want all the first names to be in uppercase or lowercase. Let's see how we can do that. We're going to remove this, and we're going to convert the first name to uppercase. If I just query the first name, you can see it is not uppercase: it starts with a capital M and the rest is small. In order to convert everything to uppercase, we use the function UPPER, open brackets, close them, and I'm going to rename it to upper first name. Let's run this. As you can see, all the names are now in uppercase. You can do the same with lowercase as well. I'm going to use the function LOWER on the first name, as lower first name. Let's run this, and as you can see, I converted the string from uppercase to lowercase. One more thing to notice here: any change that I'm making in the query will not update the contents of the table. That means the first name is going to stay like before, Maria with a capital M and the rest small. We are just changing or transforming the data in the result set that I get as output.
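The CONCAT, UPPER and LOWER steps above can be sketched as one query. This is a minimal sketch: the inline row and the column names first_name and last_name are assumptions standing in for the customers table.

```sql
SELECT
    CONCAT(c.first_name, '-', c.last_name) AS customer_name,     -- Maria-Cramer
    UPPER(c.first_name)                    AS upper_first_name,  -- MARIA
    LOWER(c.first_name)                    AS lower_first_name   -- maria
FROM (
    -- One inline row instead of the real table (assumption for the example).
    SELECT 'Maria' AS first_name, 'Cramer' AS last_name
) AS c;
```

One thing worth knowing in MySQL: CONCAT returns NULL if any of its arguments is NULL, so a missing last name would make the whole customer name NULL.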
So nothing's going to change in the table unless we do some updates. We're going to learn that later. For now we're just transforming the data for our results. Okay, so now let's talk about TRIM. This is interesting. Sometimes in the database you might find something like this: the name Maria, and before it an empty space. So someone, before entering the name Maria, entered a whitespace. It happens. Or someone enters a whitespace at the end. Usually this is bad data, and we have to remove it. To deal with that in our query, we can use the function TRIM. The space on the left we call the leading space, and the one on the right we call the trailing space. In order to remove the leading spaces from the name, we can use the function LTRIM, which means left trim. If you execute that, this whitespace will be removed from the results of the query. And if you have whitespace on the right side as well, you can use another function called RTRIM, which means right trim. If we execute that, it removes any whitespace at the end of the string. If you have the situation where you have both, either you apply LTRIM and RTRIM, or you use the function TRIM. TRIM removes both sides, the left and the right, and you will not have any whitespace around the string in the results. Okay, so now let's have some examples to learn about TRIM. If you check our tutorial database, you might already have found out that there is some whitespace around. If you check the table customers, specifically the last name, you will find some leading, or left, whitespace here. Let's query that and check: SELECT last name FROM customers. Now if you look at the results, you might find that there is a left whitespace here. But I have a tip for you in order to find all those whitespaces that are hidden. For example,
Cramer has whitespace as well, but you cannot see it if you look at the results. So I would say: just copy the value and put it in the editor. If I put it in the editor, you can see there is a right whitespace. Let's go through all the values. Let's see, Steel is clean, so there's no whitespace around it. Pips, let me remove those: Pips has a left whitespace and a right whitespace, so we have to repair that. Now Müller: Müller is safe, we don't have whitespace around it. Rankin as well, I think the same. Yeah, we don't have whitespace there. So let's try to repair that. We're just going to use the function TRIM: the keyword TRIM and brackets, as usual. I'm going to call it clean last name. Let's run the query and check the results. Let's check whether there is any whitespace around Cramer. As you can see, it's clean. Let's take another example: Pips is clean as well, so we don't have any left or right whitespace. You can use the function TRIM in order to remove it. Okay, so now let's move to the next function. We have LENGTH. If you want to calculate how many characters we have in one string, you can use the LENGTH function. If, for some reason, you want to calculate how many characters we have in the last name, you can do it like this. I'm just going to extend our query to calculate that. To do that, we use the keyword LENGTH, and inside it we put the last name, which calculates how many characters we have there. I'm just going to rename it to len last name. Let's run the query. You can see the database has already calculated how many characters we have in the last names. You might have noticed that it is not really right, because Cramer here is only six characters, but the database is showing seven. That's because we have whitespace. So this is a really nice way to find out whether there is whitespace or not.
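Here is a small sketch of the trimming functions and LENGTH discussed so far. The literal ' Cramer ' (one leading and one trailing space) is an assumption chosen to make the effect visible, and the square brackets added with CONCAT are only there to show where the remaining whitespace is.

```sql
SELECT
    CONCAT('[', LTRIM(c.last_name), ']') AS left_trimmed,   -- [Cramer ]
    CONCAT('[', RTRIM(c.last_name), ']') AS right_trimmed,  -- [ Cramer]
    CONCAT('[', TRIM(c.last_name),  ']') AS clean_last_name,-- [Cramer]
    LENGTH(c.last_name)                  AS len_last_name   -- 8: six letters plus two spaces
FROM (
    -- Inline sample value with hidden whitespace (assumption for the example).
    SELECT ' Cramer ' AS last_name
) AS c;
```

A detail that may matter with names like Müller: in MySQL, LENGTH counts bytes, while CHAR_LENGTH counts characters, so the two can differ for non-ASCII text in a multi-byte character set.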
In order to clean that, you can merge those two functions into one. I can put the TRIM inside the LENGTH: first I'm cleaning the data, and after that I calculate the length. To do that, I'm going to make a new column. First I trim the last name, and after that I apply another function on top, LENGTH. So I embedded one function in the other, and let's call it clean len last name. It's getting to be a long name, but anyway, let's see the results. As you can see, now we have the clean length of the last name. We have exactly six here and five here, and as you can see, here there were two whitespaces. And those names don't have any whitespace, because we have exactly the same number of characters. Okay, so now let's move to the last string function that we have. It is the fun one: SUBSTRING. Let's say we have the following name in the database: Maria. Each character has a position. For example, M is one, a is two, r is three, and so on. If I want to extract from this name in the query, because I just want a part of it, I can use the function SUBSTRING. SUBSTRING has the following syntax: inside it I need to define the column name or the string, then the start position, and then the length. Let's have the following example. Say I want the substring of Maria, starting from two, with a length of three. We have two pointers here. The first pointer is where to start. We start at position two, so it counts one, two, and this is our starting position. From this point we count three steps, because we said three as the length, or the steps: one, two, three. With that, we have a starting point and an ending point for the substring. So if you execute this query over here, you will get ari as a result. Okay, so now let's have a live example. We can apply the same rule to the last name. I'm going to remove the old part over here, and I'm going to use the same function, so SUBSTRING.
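The nested LENGTH(TRIM(...)) call and the SUBSTRING example with Maria can be checked with plain literals. This is only a sketch: the value 'Cramer ' with one trailing space is an assumption that reproduces the seven-versus-six count from the lesson.

```sql
SELECT
    LENGTH('Cramer ')        AS len_last_name,        -- 7: the trailing space is counted
    LENGTH(TRIM('Cramer '))  AS clean_len_last_name,  -- 6: TRIM runs first, then LENGTH
    SUBSTRING('Maria', 2, 3) AS sub_name;             -- ari: start at position 2, take 3 characters
```

Positions in SUBSTRING start at 1, not 0, which is why position 2 is the first a in Maria.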
Now we need to define the column name, which is the last name. The starting position is two, and the length, or how many steps, is three. Let's call it sub last name, run this, and see the results. If we look at the result now, we can see that we don't have the whole last name but only part of it, because we defined the substring on it. So instead of Cramer, we have only ram: it started at position two and we cut three characters. From Steel, we started at t and we have tee. Alright everyone, so that's all for this chapter. We have learned many important functions. In the next chapter we will raise the level again by learning advanced topics in SQL, and we will start with the GROUP BY clause.

31. SQL | GROUP BY: Alright guys, so far we have learned how to aggregate our data using SQL aggregate functions. For example, if you want to get the total number of customers, you use COUNT(*) on the table customers and you get five. But sometimes this is not enough. Sometimes you need to group up the rows by a column value. For example, we don't want the total number of customers of the whole table. Instead, we want the total number of customers by the country values: I want to see how many customers we have from Germany, how many from UK, from USA, and so on. So here we are grouping up those customers by the country values. In SQL, in order to do that, we use the clause GROUP BY. Alright, so now we have a new clause in our query. As you know, SQL is very sensitive about the order of those clauses, so we have to follow the rules here. We cannot go and say, okay, let's start with WHERE, then SELECT, then FROM. No, we have to follow the rules. We start with SELECT, FROM, the joins, WHERE, and the GROUP BY always comes after the WHERE. We cannot place it before the WHERE.
So if you have any filter, you should first apply the filters on the tables, and then comes the GROUP BY. As well, GROUP BY is optional; it is not a must-have clause like SELECT and FROM. If you need GROUP BY, you include it, but after the WHERE. This is very important. Okay, so now in order to understand GROUP BY, we can take one task and try to solve it using SQL. Let's go. The task says: find the total number of customers for each country. That means we need to group up the customers by the column country. We're going to build this step by step. We start with SELECT * FROM customers, just to check what we have in the customers, as usual. Now we need to count how many customers we have, and as we learned, we use the function COUNT with a star inside the brackets. I'm just going to rename it to total customers. Let's run this. Now we have the total number of customers, five. But we want it to be divided by country, grouped up by the country. To do that, we use the new clause, the GROUP BY keywords, and after that we name the column that we want to group by. In our example it is the column country. But this is not enough: we want to include it in the SELECT statement as well, so let me select the country too. With that we are saying: I want to count the total number of customers together with the country, and then group it by the country. Let's run this. As you can see, now we have not only the total number of customers, we have the country as well, and the customers are grouped up by the values of the country. In Germany we have two customers, in USA we have two customers as well, and in UK we have one customer. With that, we got the total number of customers by a specific column. Alright guys, so now let's go step by step through what the database did once we executed the GROUP BY.
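The task above can be sketched as follows. The table demo_customers and its rows are assumptions that mirror the country counts from the lesson (two customers in Germany, two in USA, one in UK); the real course table may differ.

```sql
-- Hypothetical demo table; names and rows are assumptions, not the course schema.
DROP TEMPORARY TABLE IF EXISTS demo_customers;
CREATE TEMPORARY TABLE demo_customers (
    customer_id INT,
    country     VARCHAR(50),
    score       INT
);

INSERT INTO demo_customers (customer_id, country, score) VALUES
    (1, 'Germany', 350),
    (2, 'USA',     900),
    (3, 'UK',      750),
    (4, 'Germany', 500),
    (5, 'USA',     NULL);

-- One result row per distinct country, with the number of rows in that group.
SELECT
    country,
    COUNT(*) AS total_customers
FROM demo_customers
GROUP BY country;
-- Germany 2, USA 2, UK 1 (row order is not guaranteed without ORDER BY)
```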
First, SQL is going to ask which table we need. We have FROM customers, so it's going to focus on the table customers. Then it asks which columns we need. We need the column country, and then as well the new column total customers. After that, it sees that there is a GROUP BY and a COUNT. With the GROUP BY, what SQL does is go through the column values in the country and list only the unique values, the distinct values, that it finds inside the country. It goes one by one: Germany gets listed over here, then USA, then UK. It will not list Germany again, because we already have it in the list, and the same for USA. Then it goes and aggregates all the rows for the value Germany. It sees that Germany appears twice, so it writes a two over here. Then it goes to the next value: how many USA customers do we have? It counts one, two, and puts a two over here as well. Then, for the last value, it counts how many customers we have for UK, and we have exactly one. That's how SQL works and why we get these results. Okay, so now we could extend our task and say: I want the same results, but sorted by the total number of customers, with the lowest first and then the highest. To do that, we use the ORDER BY, and here it's very important that the ORDER BY comes after the GROUP BY. We are ordering by the COUNT(*), so by the total number of customers. Here you could use ASC or leave it out, because it is the default. Let's execute this. You can see the result is now sorted by the total customers, with the lowest first and then the highest. Okay, so now let's have another example for the GROUP BY. The task says: find the highest score for each country.
So this time we don't need the COUNT function; we need the MAX function. As you have already noticed, with a GROUP BY we nearly always use those aggregate functions, but it is not a must. Let's try that from scratch. So SELECT * FROM customers, and let's make it bigger. We now want the highest score, so we use the function MAX, open brackets, our column is score, and we rename it to max score. This is not enough, because if I execute this query, I get the highest score across all countries. This time we need to group it by the country. To do that, I'm going to list the country in the SELECT, make it look nicer, and then use the clause GROUP BY country. With that, I'm finding the highest score for each country. Let's run this. You can see the highest score in Germany is 500, the highest score in USA is 900, and for UK it is 750. Okay, so let's check what the database has done. We selected the table customers. We said we need the column country and a new column called max score. In the SQL we have the GROUP BY on country, so the database goes and selects all those values and keeps only the unique values: Germany, USA, and UK. Then it starts finding the max for each of those countries. First, for Germany, we have two rows, four and one, and it finds the maximum of those two values, 350 and 500. It selects 500 for the result because it is the highest. Then it selects the two records for USA, one over here and one here. The max of those two values, 900 and null, is 900, so it puts that in the results. For UK we have only one record, so the max value is that same value, 750. And that's how the database builds up this result from our query. Alright, so that's all for the GROUP BY clause.
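Both GROUP BY variations from this lesson, the maximum score per country and the count sorted from lowest to highest, can be sketched like this. The table demo_customers and its rows are again assumptions that mirror the numbers quoted in the lesson.

```sql
-- Hypothetical demo table; names and rows are assumptions, not the course schema.
DROP TEMPORARY TABLE IF EXISTS demo_customers;
CREATE TEMPORARY TABLE demo_customers (
    customer_id INT,
    country     VARCHAR(50),
    score       INT
);

INSERT INTO demo_customers (customer_id, country, score) VALUES
    (1, 'Germany', 350),
    (2, 'USA',     900),
    (3, 'UK',      750),
    (4, 'Germany', 500),
    (5, 'USA',     NULL);

-- Highest score per country: Germany 500, USA 900 (the NULL is ignored), UK 750.
SELECT
    country,
    MAX(score) AS max_score
FROM demo_customers
GROUP BY country;

-- Customers per country, lowest count first; ORDER BY comes after GROUP BY.
SELECT
    country,
    COUNT(*) AS total_customers
FROM demo_customers
GROUP BY country
ORDER BY total_customers ASC;  -- ASC is the default and could be left out
-- UK (1) comes first; Germany and USA (2 each) follow in no guaranteed order.
```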
Next we're going to talk about a related topic: the HAVING clause.

32. SQL | HAVING: Alright, so far we have learned how to group up our data using the SQL GROUP BY clause. But sometimes you might be in a situation where you are working with a really big table, where one column has many different values. In our example we have only three values, just to keep it simple, but in real-world scenarios you will have a lot of values in one column, and you will be forced to use some filters on the results. In order to filter the results that we get from the GROUP BY, SQL has one more new clause, and it's called HAVING. Alright, since this is a new clause, we need to understand where to place it, because, as you know, SQL is sensitive about the order of those clauses. The HAVING clause comes directly after the GROUP BY: once you define the GROUP BY, you define the HAVING clause after it. It is optional as well: when you want to filter on the aggregate functions, you can use the HAVING clause. With that, we have all the clauses of the SELECT statement, or the query. It starts with SELECT, FROM, the joins, WHERE, GROUP BY, HAVING, and lastly we have ORDER BY and LIMIT. Okay, so now in order to understand HAVING, we're going to take one task and try to solve it using SQL. The task says: find the total number of customers for each country, but include only those countries that have more than one customer. That means we have a condition here to filter our data. Let's try to solve that using SQL. As usual, we start with querying our data. We're going to focus on the table customers over here. Now we need the total number of customers by country. That means I need to do a GROUP BY and use the aggregate function COUNT, like before. I'm going to use the keyword COUNT, star, and rename it so it looks good in the results.
So COUNT, and we call it total customers. Since we're going to group by country, we have to include the country in the SELECT, and after that we just group by that country. Let's run this. In the results we now see all the countries with their total number of customers. But our task is not solved yet, because we still have a country whose total number of customers is not greater than one. We need to filter this data, and to do that with the GROUP BY, we use the clause HAVING. Think about it as being exactly like the WHERE clause: we write down one condition. Our condition says the total number of customers should be greater than one. The total number is the count, so the count should be greater than one. We have defined our condition, exactly like in a WHERE clause, so let's run this. As you can see, we no longer have UK with its one customer. We now have all the customers aggregated by country, and only the countries that have more than one customer in the results. With that, we filtered our data and we have exactly what we want. Alright, so now you might be wondering, and you want to ask: hey Baraa, why do we have such a clause called HAVING in SQL? We could just go and use the WHERE clause, because there we can filter our data. We could define exactly the same condition and filter our data. Why does SQL have one more clause that does exactly the same as WHERE? The answer is this: WHERE can be used only on the columns that exist in the database. For example, if I want to filter on the country, the score, or the last name, so on any column that I have in the database, I can filter it with a WHERE. But it is different once I want to filter the data based on a column that doesn't exist in the database, for example COUNT(*), or the MAX or MIN.
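The HAVING task above can be sketched as follows. As before, the table demo_customers and its rows are assumptions that reproduce the counts from the lesson.

```sql
-- Hypothetical demo table; names and rows are assumptions, not the course schema.
DROP TEMPORARY TABLE IF EXISTS demo_customers;
CREATE TEMPORARY TABLE demo_customers (
    customer_id INT,
    country     VARCHAR(50)
);

INSERT INTO demo_customers (customer_id, country) VALUES
    (1, 'Germany'),
    (2, 'USA'),
    (3, 'UK'),
    (4, 'Germany'),
    (5, 'USA');

-- HAVING filters the groups after they are aggregated.
SELECT
    country,
    COUNT(*) AS total_customers
FROM demo_customers
GROUP BY country
HAVING COUNT(*) > 1;
-- Germany 2 and USA 2 remain; UK, with one customer, is filtered out.
```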
So for any aggregate function that we use in the query, when we want to build a filter on top of that function, we cannot use WHERE; we should use HAVING. HAVING goes together with the GROUP BY: once we are doing an aggregation, we can define a filter on top of it here. The WHERE clause works only on the columns that already exist in the database. That means if I have these results and I want to filter the data so that I don't see the country USA in the results, I should use the WHERE clause. Let's do that. The WHERE comes after the FROM: WHERE our column country is not equal to USA. Let's run this. With that, you see we have filtered the data, and we don't have USA in the results. So if I want to filter on the country, I need to use the WHERE clause. If I want to filter on the aggregate function, on the result of the GROUP BY, I have to use HAVING. Alright guys, so with that, we have covered the HAVING clause. Next we're going to talk about the concept of subqueries in SQL, where we're going to cover EXISTS and IN and learn the differences between them.

33. SQL | SubQuery: EXISTS vs IN: Alright, so now we're going to learn how to do subqueries using SQL. This is extremely powerful. Once you learn how to do subqueries, you will be able to do a lot of complex and important tasks using SQL. So what is a subquery? It means you have different queries that are nested in each other, so you have one query embedded in another query. In the normal situations in the previous tutorials, we had only one query, one statement querying our data, for example the customers. But with subqueries you will have different queries that depend on each other. For example, we have query number one here, which asks for the data from the table customers and then presents its results. Then we have another query, query number two, which depends on those results and makes, let's say, another SELECT statement.
With that, we're going to call query number one a subquery. It will be the basis for the next query that we have. You can really nest queries this way, not only two but maybe three, four, and so on. Alright, so now we're going to learn how to do subqueries using SQL, and for that we have two options: we use either the operator IN or EXISTS. We're going to focus first on the operator IN in order to solve the following task. The task says: find all orders that were placed by customers with a score higher than 500, using the customer ID. Let's try to solve that. It means we're going to focus on both tables, orders and customers, and since the end result should present all the orders, I'm going to start with that query first. So we say SELECT * FROM orders. As you can see, we now have all the orders, but the task says the result should contain only the customers that have a score higher than 500. That means I need to find out which customer IDs over here have a score higher than 500. To do that, we need to check another table, so SELECT * FROM customers. Now we need to add the filter: WHERE score is higher than 500. Let's run this. You can run it separately if you highlight it and then execute. With that, we know that customer IDs 2 and 3 are the customers with a score higher than 500. So I could go back to my original query and add this filter. I'm going to say WHERE customer ID IN 2 and 3. With this filter I'm saying: those are the customers with a score higher than 500. Let's run only the other part and check the results. Now I have the orders for those customers, and with that, I solved the task. And now comes the but: this is a really bad way to do it, because it has two problems. First of all, I went to another table and found out those IDs manually.
We can do that with a small table, but imagine you have a big table with a lot of IDs. You would need to enter them all by hand in the next query. With this small example it is okay, but with big tables this is almost impossible to do. The second problem comes once you have changing data, for example when we are getting more customers and more orders. That means each time I get new data in my tables, I have to go and check the query over here and adjust it. This is not dynamic, so it is really bad. Instead of that, we're going to do a small trick with subqueries that solves everything and makes our life easier. Instead of having those static numbers in the filter over here, I'm going to remove them, and instead I'm going to say this query will be my subquery, and this one over here will be my main query. The results that I get over here, let's check that again, will be feeding the other query. For that, what I really need is just 2 and 3. I just need the customer ID; I don't need all those columns. So instead of the star, I'm going to say customer ID. Let's run this again. As you can see, we now have 2 and 3. So it doesn't matter how many new customers I get: I will always have a complete and correct list for the next query. What I'm going to do is just cut it and paste it over here, and put it on a new line so it looks much better. With that, I embedded one query in the next one. This is the subquery. It always has those open and close brackets. With them I'm indicating to SQL that here we have a subquery, and here we have the main query. Let's run this and check the results. As you can see, I got exactly the orders from the customers whose score is higher than 500. And now we could have new orders and new customers, and I don't have to deal with that.
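The static filter and the subquery version can be compared side by side. This is a sketch under stated assumptions: the tables demo_customers and demo_orders, their column names, and the order rows are hypothetical; only the fact that customers 2 and 3 have a score above 500 comes from the lesson.

```sql
-- Hypothetical demo tables; names and rows are assumptions, not the course schema.
DROP TEMPORARY TABLE IF EXISTS demo_customers;
CREATE TEMPORARY TABLE demo_customers (
    customer_id INT,
    score       INT
);
INSERT INTO demo_customers (customer_id, score) VALUES
    (1, 350), (2, 900), (3, 750), (4, 500), (5, NULL);

DROP TEMPORARY TABLE IF EXISTS demo_orders;
CREATE TEMPORARY TABLE demo_orders (
    order_id    INT,
    customer_id INT
);
INSERT INTO demo_orders (order_id, customer_id) VALUES
    (1001, 1), (1002, 2), (1003, 3), (1004, 4);

-- Static version: the IDs were looked up by hand and go stale when data changes.
SELECT *
FROM demo_orders
WHERE customer_id IN (2, 3);

-- Dynamic version: the subquery in brackets produces the ID list at run time.
SELECT *
FROM demo_orders
WHERE customer_id IN (
    SELECT customer_id
    FROM demo_customers
    WHERE score > 500
);
-- Both return orders 1002 and 1003 for this sample data.
```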
My query will always solve my problem, and I don't have to add all those IDs to the IN. Instead, we have it very dynamic and very powerful. So this is a much better solution than having static IDs inside the IN statement. If you go further with that and do more nested queries and so on, you will be able to solve a lot of complex and important tasks using SQL. Alright, so now we're going to try to solve the same task using EXISTS. EXISTS is a little bit different from IN. With both of them we get the same result, but with EXISTS you may get better performance if you have big tables. So if you have big tables and you are suffering from the performance of the IN operator, you could start using EXISTS and check whether you get better performance. We tend to use EXISTS rather than IN if we are facing performance problems. But it is a little bit more complicated than IN, because there is no clear separation between query one and query two, or between the subquery and the main query. Let's see how we do that using EXISTS. I'm going to open a new tab. We have the same setup, so SELECT * FROM orders. But now we're going to have some aliases, because it is something like a join. I'm going to use the name o as the alias for the orders. Now we type the filter, WHERE, and after that we can directly type the EXISTS statement: WHERE EXISTS. Then we have the subquery. So now we write the subquery, starting with SELECT. Here we could write anything as columns, because EXISTS does not depend on the columns selected over here. You could write customer ID, or star, or anything you want. In SQL we tend to write just 1, because we don't care about that: the result of the subquery itself is not important. It is like a join. So SELECT 1 FROM customers, and I will give it an alias as well.
Now we need to add the filter, and here it is exactly as if we were doing a join: the customer ID from c, the customers, equals the customer ID from o, the orders. As I said, it's like a join. After that, we have another filter on the customers: we need the score to be higher than 500. With that, we have our subquery over here. It looks a little bit complicated compared to the IN, because here we have some kind of inner join. I cannot run this part of the query on its own; I would get an error, because it has that connection between the IDs. So in order to get the result, I need to run the whole thing. Let's run this and see. You can see I got exactly the same results. EXISTS and IN will give you the same results. I tend to use IN when it's, let's say, small tables and so on, but once I have bad performance, I switch to EXISTS. It's up to you which one you use; both of them do subqueries and make the filter dynamic in SQL. Alright guys, that's all for this chapter. We have learned some advanced topics in SQL, and next we will start learning how to modify the data inside our SQL tables. We will start with the INSERT statement.

34. SQL | INSERT: Alright, so far we have learned how to query, how to retrieve our data from the database, without changing anything: without changing the content of the tables or changing the columns. We have used the command SELECT in order to retrieve our data, and that command does not change the data inside our database. Next we're going to learn how to manipulate the data inside our database in order to change the contents. For that, we have a new set of commands inside a new SQL category called DML, Data Manipulation Language. Inside it we have three main commands. We have INSERT: we can use it if we want to insert new data into our tables. We have DELETE.
If we have some existing rows and we want to delete them from the database, we could use the DELETE command. And the last one, we have UPDATE: if we want to update or change the content of existing rows in our tables, we could use the UPDATE command. Alright, so now we're going to start with the first command, the INSERT command. We're going to learn how to insert new rows into our database. We're going to focus on the table customers. As you know, in our tutorial database we have five customers. And now we're going to practice by adding one more new customer to our database to learn how to work with the INSERT command. Before we insert anything new into our database, we really have to understand the structure of the table, the structure of the columns. Because if we don't know the structure and the definitions, we will get errors while we are inserting the data. Just knowing that we have five columns inside the table customers is not enough. We really need to understand the definition of the table before we start inserting any new data into it. To do that, I usually use the following keyword: DESCRIBE, then the table name, customers. So what I'm saying now to SQL is: give me the definition of the table customers so I can have a look at what we have for each column. At first look, it might seem a little bit complicated. Don't worry about it. I'm going to explain everything step by step. So we are saying: OK, database, explain or describe for me the table customers. As you know, each table contains multiple columns. We can see in the results that we have five columns here: customer ID, first name, last name, country, and score. Those are the column names. And for each column we have properties that describe it. We have here the data types. For example,
if we look now at our table customers, in the customer ID we have only numbers, and they are unique. We have 1, 2, 3, 4, 5, and those are numbers. So the data type for the customer ID is something like numbers, and in databases we call them integers, or INT. In the first name we don't have numbers, we have characters. We have Maria, John, and they are text, and in the database we call that VARCHAR. There are different types for such characters; for example, we have CHAR and so on. But as a best practice we use VARCHAR, because it optimizes the space in our database. As well, we can see here the size of the VARCHAR; we have 50. That means the maximum allowed size for the first name is 50 characters. So if you have more than 50 characters in the first name, the database will cut it and insert only 50 characters (or, in MySQL's default strict SQL mode, reject the insert with an error). So here we are putting some rules on each column. The first name should be at most 50 characters, and the same goes for the last name and the country. So if you have a really long name that is more than 50 characters, it will not fit in this column. So besides the data type, you can also apply rules about the size of each column. And we have as well the score. As you can see, in the score we don't have any characters, only numbers, so it is an integer. With that you can see that each column has a different data type, and you have more understanding of the description of the columns. After that, there is a field called Null, where you can see only NO and YES. It says whether nulls are allowed in each column or not. For example, in the customer ID we are not allowing any nulls. If you insert a null, the database will say no, it's not allowed. So in the definition, no null is allowed. And the same goes for the first name and last name.
Once we insert data into customers, we always have to have customer ID, first name, and last name. But for the score and the country, it says YES, so nulls are allowed. For example, as you can see, in the score we have one null here. And for the country, if you don't specify anything in the INSERT statement, there will be no problem, and the database is going to show us a null. So here we can see from the definition where we can add nulls and where it is not allowed. We have here as well a key for each table. In SQL databases we have primary keys, the keys that identify each customer or each row. For example, in our table customers, we have the customer ID as the primary key. And once we say primary key, it comes with two things. First, it is not allowed to be null, and second, it must be unique. That means it is not allowed to have two customers with the same ID. So Maria and John should always have different customer IDs. For example, the customer ID 1 should not exist anywhere else; it is unique. So this is the most important thing to understand about primary keys: they are unique. If I now insert one more new customer and say she has the customer ID 5, then, since we already have the customer ID 5 in the database, the database is going to give an error. So it's very important to understand the structure: which column is our primary key? Then we have some other information. For example, we have here Extra, and it says auto_increment. Auto-increment means that if I add a new customer, the database is going to increment the customer ID automatically. For example, if I add one more new customer, I don't have to specify that the customer ID should be number 6; the database is going to do it automatically. So this extra information tells us that the ID will be generated by the database and we don't have to specify it.
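As a rough sketch of the step described above, this is how the table definition can be inspected in MySQL, followed by a table definition that would be consistent with the output the instructor reads out. The snake_case column names and the table name customers_sketch are assumptions made for illustration; the course script may name things differently.

```sql
-- Show the column definitions (Field, Type, Null, Key, Default, Extra)
-- of the customers table in MySQL.
DESCRIBE customers;

-- A definition that would match the description in the lesson.
-- Assumption: column names are guesses, and the table is called
-- customers_sketch here so it does not collide with the real table.
CREATE TABLE customers_sketch (
    customer_id INT         NOT NULL AUTO_INCREMENT, -- generated by the database
    first_name  VARCHAR(50) NOT NULL,                -- at most 50 characters
    last_name   VARCHAR(50) NOT NULL,
    country     VARCHAR(50),                         -- nulls allowed
    score       INT,                                 -- nulls allowed
    PRIMARY KEY (customer_id)                        -- unique and not null
);
```

Running DESCRIBE customers_sketch afterwards should show NO in the Null column for the first three fields, PRI in the Key column for customer_id, and auto_increment under Extra.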
So now we have more insight into the table customers. We know the definition of each column, and we can start inserting new records, or new rows, into the table customers. I'm going to open a new tab and we're going to start using INSERT. I'm going to type here the INSERT INTO keyword. Then we have to specify the table name where we want to insert our data: the table customers. Then we have to specify the values for each column: VALUES and then brackets. And now we're going to go one by one. The customer ID, and I want to check that again: the customer ID is an integer, it is the primary key, and it is auto-increment. That means the database is going to generate the new ID; I don't have to do it myself. So I could say DEFAULT. DEFAULT means that the database is going to take care of that and insert the customer ID 6. Instead of that, you could type the number 6, but I really don't recommend it, because in a big database someone else may be doing inserts, or you may forget what the last customer ID in the database is. So just make your life easier and type DEFAULT. Now we have to enter the first name. I'm going to use, for example, the first name Anna. Here we have a problem: in SQL you cannot just type the first name like this. It is a string, and strings always have to go inside single quotes or double quotes. For example, I'm going to use quotes in order to deal with the strings. If you don't do that, you will get an error. I usually use single ones to insert strings, so that should be OK. The last name is the same thing: it is VARCHAR and we have to put a name in it. I'm going to use Nixon as the last name. So now we have three columns: customer ID, first name, last name. Next we have country and score. Let's check the country. It says it is VARCHAR, so if we specify something here it has to be a string. And we could leave it empty.
So I don't really have to enter anything here if I don't want to. And the same goes for the score: it is an integer, but we could leave it empty as well. What I'm going to do is add the country. It is VARCHAR, so it's a string, and I need to put it in single quotes. I'm going to use the country UK. OK, so now to the last column, the score. Let's check that in the description. The score is an integer, which means only numbers should be inside it. It is nullable, so I could leave it empty, and it is not a primary key and so on. That means I could leave it as a null. And that makes sense, because Anna is a new customer and she doesn't have any score yet in our database or systems. That's why I could just write NULL here, or I could put a zero if I want. I will just leave it as a null. Let's execute the query and see whether we have everything right. We will not get any result set. We just get the information that everything is green and we have inserted the data. In order to check this customer inside our database, we're going to open a new tab, SELECT star FROM customers, and see whether Anna is in the database. And yes, we have one more customer called Anna Nixon from the country UK. The score is null because she is new, and we have the newly generated customer ID from the database. OK, so now let's keep practicing and add one more customer, customer number 7 in our database. Let's go and do that. I'm going to remove everything and start from scratch: INSERT INTO our table customers. And now we're going to add the values. As usual, our first value, the customer ID, is going to be DEFAULT. For the first name I'm going to use Max, and for the last name I'm going to use Lang. But now the country and score, I could leave them empty. So I'm going to use NULL for the country and for the score as well.
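The first insert walked through above might look roughly like this in MySQL. It is a sketch that assumes the column order customer_id, first_name, last_name, country, score, which is how the lesson lists the columns; the actual table may differ.

```sql
-- Without a column list, one value is needed for every column, in table order.
-- DEFAULT lets the AUTO_INCREMENT column generate the next customer ID.
-- NULL is accepted for score because that column is nullable.
INSERT INTO customers
VALUES (DEFAULT, 'Anna', 'Nixon', 'UK', NULL);

-- Check that the new row is there.
SELECT *
FROM customers;
```

Writing DEFAULT instead of a literal ID avoids collisions with rows that other sessions may have inserted in the meantime.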
So now, as you might have already noticed, all I've really done here is give a first name and a last name. For everything else I'm using nulls and defaults. So we could skip those and make our life easier by just adding the first name and last name. But if I just remove the nulls and the DEFAULT and run the query, I will get an error, because the database does not understand what Max is. Is Max the country? Is Max the first name or the last name? And Lang as well: is it the last name? So we need to specify for the database which column each value belongs to. In order to do that, I'm going to open new brackets after the table name and type the column names: first the first name, and second the last name. With that, we are telling the database: OK, the first value belongs to the column first name, and the second value belongs to the column last name. And if I run this, we will not get an error, because we have done the mapping and everything else is done automatically. That means the database knows the customer ID is automatically generated, so it's going to generate a new ID. And since the database didn't find any information about the country and the score, it's going to set them to the default, a null. So let's check the result. If I run the same query again, SELECT star FROM customers, we can see that it has inserted our new customer Max Lang. It understood that the country is null and the score is null, and it generated the ID 7. So as you can see, it's more compact, and I don't have to add all those nulls. Imagine you have a big table with 50 columns and a lot of nulls: the query is going to look really bad. So here I'm just inserting what I need, and the database is going to do the rest for me, if it is allowed. For example, if the country were not allowed to be null, I would have to insert something for the country.
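A minimal sketch of the column-list form described above. The column names first_name and last_name are assumptions, and the surname is garbled in the transcript, so 'Lang' is only a best guess.

```sql
-- Only the listed columns receive values. customer_id is generated by
-- AUTO_INCREMENT, and the nullable columns country and score become NULL.
INSERT INTO customers (first_name, last_name)
VALUES ('Max', 'Lang');
```

The column list also protects the statement against later changes in column order, since each value is mapped to a column by name rather than by position.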
But since we are allowing nulls in the country and the score, we could just ignore them and leave it like this. Alright, so with that, we have learned how to insert data into our SQL tables. Next, we're going to talk about the UPDATE statement. 35. SQL | UPDATE: Alright, so now we're going to talk about one more command to manipulate the data inside our database, and that is the UPDATE command. You can use UPDATE in order to change the values of an already existing row in your tables. OK, so let's have the following task. We just added a new customer with the INSERT statement, and that is Max, customer number 7. And as you already noticed, this is the only customer that doesn't have a country specified in the database. The task now is to add the country Germany to this record. So we have to update the content of this customer by changing the null to Germany. We're going to start with the keyword UPDATE. Now we have to specify the name of the table that should be changed, so we have the table name customers. After that, on a new line, we have the keyword SET. With that, we can specify new values for the columns that should be changed. We want to change the column country, and instead of the null we give Germany as the new value for the country. Now we need to be really careful here. Don't execute this yet. If I execute this command, what happens? The database is going to update the country for all customers to the new value Germany. Because if you read this, we are telling the database to update the table customers and set the country to Germany without specifying any customer. That means if we run it, all the countries in the table will be Germany, so don't do that. Our task here is only to change it for the new customer.
So as you can see here, our customer Max has an empty value in the country, and we only need to change that. In order to do that, we're going to filter; we're going to put a condition on the update. And for that, we're going to use the primary key, customer ID number 7. I don't recommend using any other columns, like the first name or the last name. Because if you have a big table, the first name Max may be present for other customers. Maybe you have different customers with the same first name, and if you run the query on the first name, all customers with the first name Max will get the country Germany. So to make sure we update the right record, the right row, we're going to use the primary key, the customer ID. So let's go back here. We're going to write the WHERE command exactly like in the SELECT, and we're going to say that the customer ID equals 7. With that, we are telling the database exactly: we have a new value for the country, and it is only for the customer ID number 7. So let's run this, and then go over here and run the SELECT again to check the value. Before, we had it empty, or null. And after the update, we now have Germany in the country. Alright, let's have another task where we're going to manipulate and update the content of our tables. The task says our new customer Anna was active. She bought something on our website and now she has a score of 100. So instead of having a score of null, because she was a new customer, we now have 100 for Anna. Not only that: by mistake we entered the country UK instead of USA. Anna comes from the USA, so we have to update the country as well. So let's do that using the UPDATE command. Alright, so before we start updating the values in the columns, let's make sure that we have the right customer, so we are not updating a different customer or updating the whole table.
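The advice to check the target row before updating can be sketched as follows for the first task (setting the country for customer 7). The column names customer_id and country are assumptions based on how the lesson names them.

```sql
-- Preview the rows that the WHERE clause matches before changing anything.
SELECT *
FROM customers
WHERE customer_id = 7;

-- Change only that row. Without the WHERE clause, every row in the
-- table would get the country 'Germany'.
UPDATE customers
SET country = 'Germany'
WHERE customer_id = 7;
```

Reusing exactly the same WHERE clause in the SELECT and in the UPDATE is a simple way to confirm that the update touches the intended row and nothing else.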
So let's make sure that we are selecting everything right in the WHERE command. Anna has the customer ID number 6 instead of 7, so we're going to write number 6 here. Now we are focusing on the right row. And now the country should be USA, so we are giving a new value for Anna in the country field. And we want to specify one more column to be changed. In order to do that, we add a comma. I like to put it on a new line, and the score should be equal to 100. So with that, you are specifying multiple columns in one UPDATE, and you separate them with a comma. So if I want to change one more column, I could do it all in one command. I don't have to have a different command for each column; I could put everything in one. Now, what we are saying is: update the table customers where the customer ID is number 6; the country should be equal to USA, and the score should be 100. So let's run this and then go back to our SELECT star FROM customers to check whether everything was OK. I'm going to refresh that. And you can see that the country is now USA and the score is now 100. So it's really easy to manipulate the data using the UPDATE command. Alright everyone, that's all for the UPDATE statement. And next we are going to learn the DELETE and TRUNCATE statements. 36. SQL | DELETE & TRUNCATE: OK, so now we're going to move to the last command that we have under the data manipulation section, and that is the DELETE command. In order to delete rows from our tables, we could use DELETE. Let's have the following task. The task says: wait a minute, all the new users since yesterday or since today were wrongly inserted in our systems and we have to delete them. So the customer Anna and the customer Max should be deleted from our database, from our tables. Doing that is pretty simple. We're going to use the command DELETE.
Alright, so in order to solve that, we're going to write a command that is very easy and, as well, very dangerous. We start by writing the keywords DELETE FROM, and then comes the table name. So we need to delete from customers. As you can see, it's only three words. It's very easy, but be careful: if I execute this, it's going to delete everything inside the table customers. I'm not specifying anything. I'm saying DELETE FROM customers, and if I run it, the database is going to delete all our customers. So be careful with that. Always specify exactly what you want to delete. So, like with the UPDATE, we're going to use the WHERE command with the primary key, the customer ID. We want to delete the customer IDs, let me check again, number 6 and 7. In order to do that, I'm going to use the IN operator: IN 6, 7. So any customer ID that is in 6 and 7 is going to be deleted. This is my filter condition, and if I run this, both of the customers are going to be deleted. So let's check that. If I run the SELECT over here, you can see that both customers are deleted. With that, we have deleted some records from our customers. But be really careful what you specify in the DELETE, so you don't delete all your records. Maybe, during the development of your tables, you are inserting test data and you want to delete all of it. So if you want to make a table empty, you could say DELETE FROM and the table name, and you're going to make the table empty and then insert the data again. But if you are deleting only a few records, be careful what you write in the WHERE condition so you don't lose all your data. One more thing to talk about here regarding deleting rows: sometimes you might be in a situation where you have a very big table, and the mission is to delete everything, to delete all the rows from this big table.
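Before moving on to emptying whole tables, the targeted delete from this lesson can be sketched like this. The column name customer_id is an assumption based on how the lesson refers to the customer ID.

```sql
-- Check which rows the filter matches first.
SELECT *
FROM customers
WHERE customer_id IN (6, 7);

-- Delete only those two rows. Without the WHERE clause,
-- DELETE FROM customers would remove every row in the table.
DELETE FROM customers
WHERE customer_id IN (6, 7);
```

Filtering on the primary key keeps the delete precise, for the same reason the lesson gives for updates: names can repeat across customers, but IDs cannot.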
If you are using the DELETE FROM command, it might take a long time, because what SQL does is delete one bunch of data, then go to the next one. It does it in an iterative manner, and that may take a really long time. So if you are sure that you want to make the table empty, that you want to delete everything from the table and just keep the columns with nothing inside, then instead of using DELETE, it is best practice to use another SQL command to delete the rows, and that is the TRUNCATE keyword, followed by customers. As you can see, it's only two words to destroy everything. It's a very short command. With TRUNCATE customers you are telling SQL: delete everything, I don't want to see any records inside my table. And the database is going to do it really fast. So if I run this query over here, after removing the DELETE FROM, we are deleting everything in the table customers. If I do SELECT star FROM customers, the table is going to be empty. If you have done that and you want to have the test data again, just go to the tutorial database script and rerun the whole script. Then you will have exactly the same situation as before you deleted the data from customers. Alright everyone, that's all for this chapter. We have learned how to modify the data inside SQL tables. Now we're going to jump to the last chapter, where we're going to learn how to define our data using SQL. And first, we will learn how to create a SQL table. 37. SQL | CREATE Table: Alright guys and girls, so far we have learned how to query our data using the SELECT command and, as well, how to manipulate our data, the values inside our tables, using INSERT, DELETE, and UPDATE. Next, we're going to focus on a new group, the data definition language, DDL. It is about how to change the structure of our database, how to change the tables themselves. So we have here three commands.
CREATE, to create something new, like a new table or other new objects. We have DROP, to drop or delete a table. And ALTER is to change the structure of a table. OK, so now we're going to start with the first command, the CREATE command. If you want to create something new in the database, new objects, for example a new table, a new view, or stored procedures (in databases there are different types of objects, not only tables), you could use the command CREATE. In our tutorial, we will focus on creating a new table. In order to create a new table, you have to define the structure of each column inside it, and to do that, we have to specify three pieces of information for each column. Each column should have a name. This could be anything, depending on your requirements. So it must have a name, and after that it must have a data type, exactly one data type. You cannot specify multiple data types for a column. In MySQL there is a big list of all available data types. I'm going to leave the link in the description so you can check it. The most common ones are INT, VARCHAR, DATE, CHAR, and so on. A data type should be assigned to each column and, as well, you can specify the size of the column, the maximum allowed size. It's like a rule that you can apply. If you leave it empty, like only INT, the data type is going to get a default size from SQL. If you define, like in our last example, the VARCHAR for the last name as VARCHAR 50, that means the maximum allowed size for the last name is going to be 50. Anything that exceeds 50 characters is going to be cut down (or, in MySQL's default strict SQL mode, rejected with an error); only 50 characters are allowed in the last name. So here you can specify the data type and, as well, the size of the data type. After that, you have a bunch of constraints that you can define on your database in order to have some data quality. For example,
you have the constraint primary key. You say this column is the primary key, and immediately it's going to be unique and not allow any nulls. And you can define multiple constraints for each column, not only one. You could say this is a primary key and not null and unique and so on. So as constraints in the database we have PRIMARY KEY; NOT NULL, so you are not allowing null values; UNIQUE, which means the values inside the column must not be duplicated; and then we have DEFAULT. DEFAULT means that if we insert data and don't specify a value for this column, the database is going to use the default value that we have defined for that column. As I said, you could use all of those constraints for each column if you want. It really depends on the requirements, and on the data quality requirements as well. The data type should be only one, and each column has only one name. Alright, so now let's learn how to create a new table using SQL. We have the following task: create a new table called persons, and inside it we're going to have four columns: ID, name, birth date, and phone. As you know, in our tutorial database we have only three tables. If you check the left side, we have customers, employees, and orders. And now we're going to add one more table called persons. So let's do that. Alright, so now let's start creating our table. We're going to start with the command CREATE TABLE. After that, we need to specify the table name. But before that we have to enter the database name or, in other databases, the schema name. As you might have already noticed, in MySQL we have different databases. We have our tutorial database and some default ones. We're going to put this table in our tutorial database, and that is db_sql_tutorial. Then a dot, and here we put the table name, persons.
We're going to open the brackets, and inside them we're going to define the column structure. Let's start with the first column, the ID. This is our primary key, the most important column of the whole table, something like the customer ID in the table customers. The name of it is going to be ID. After that, I'm going to have a space, and then we have to define the data type. Since it's going to be a sequence of numbers, 1, 2, 3, 4, and so on, we're going to use the data type integer, INT. I will not define the size; I'm going to use the default that we have from MySQL. Now we're going to define the constraints that we want for this column. Since it is our primary key, we're going to use the constraint PRIMARY KEY. We don't have to specify NOT NULL here, because by default, if you say this is a primary key, you get two things with it: it's going to be unique and, as well, not null. So the primary key is two constraints in one. After that, I don't want to generate those IDs myself, manually, when doing the inserts. I want the database to take care of that. To do that, we can define it as auto-increment. With that, if you use DEFAULT or don't specify anything in the INSERT statement, the ID is going to be generated automatically by the database. So now I have the column name, I have the data type, and I have two constraints. Now we're going to jump to the next column, the person name. I'm going to add a comma and a new line for that. Here we have the person name as the column name, then a space. After that, we need to define the data type. Since it's going to include characters and so on, I'm going to use VARCHAR and define 50 as the size. If a value has more than 50 characters, it is going to be cut when it is inserted into the database. So this is my rule. As well, I want each person to have a name, so we don't want to have nulls. So now we can define the constraints.
So this should be NOT NULL. That's it. I don't want a unique constraint and so on, so we allow two persons with the same name, but they will have different IDs. That's enough for this column. We're going to jump to the next one and add the birth date. So the name of that is going to be the birth date, then a space. The data type is going to be DATE. I don't really want to specify any constraints, because this column can be optional, so we will not add anything. That should be enough: we have the column name and the data type, then a comma. And as the last one, we're going to have phone as the column name. The phone can be characters as well, so VARCHAR, and I am going to allow only 15 characters inside the phone. For some data quality, the phone should not be null, so I'm going to add the constraint NOT NULL. One more thing that I could add as a constraint on this table is that each person should have a unique phone number. We should not have two persons with the same phone number. In order to enforce such quality on your table, we can add the UNIQUE constraint. With that, we are telling this column that we should have only unique phones and duplicates are not allowed. So now we have all four columns. We have specified the data types and the constraints, and that's it. We can run the query. We don't have any errors, but if we check on the left side, we don't yet see persons. That's because we have to refresh the view over here. So click on Refresh and you will see we have one more table called persons. OK, so now let's check some things. For example, I can say SELECT star FROM persons, just to check the table structure. Here I can see that I have a table called persons, I have my four columns, and everything is empty. You could as well run the DESCRIBE command for persons.
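Put together, the statement built up in this lesson might look like the sketch below. The schema name db_sql_tutorial and the table name persons come from the lesson; the exact column names (id, person_name, birth_date, phone) are assumptions, since the transcript only gives them as spoken words.

```sql
-- Create the persons table inside the tutorial database.
CREATE TABLE db_sql_tutorial.persons (
    id          INT PRIMARY KEY AUTO_INCREMENT, -- unique, not null, generated by the database
    person_name VARCHAR(50) NOT NULL,           -- every person must have a name
    birth_date  DATE,                           -- optional, so no constraint
    phone       VARCHAR(15) NOT NULL UNIQUE     -- required, and no duplicates allowed
);

-- Verify the structure: the table exists and is empty,
-- and DESCRIBE lists the types and constraints per column.
SELECT * FROM db_sql_tutorial.persons;
DESCRIBE db_sql_tutorial.persons;
```

Each column line follows the same pattern the lesson describes: one name, exactly one data type with an optional size, and then any number of constraints.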
And you can see we have the fields, the data types, what can be null and what cannot, the primary key, what is unique, and the auto-increment. So you can check that everything is fine and as we wanted. Alright everyone, that's all about how to create a SQL table. Next we are going to talk quickly about ALTER TABLE. 38. SQL | ALTER Table: OK, so now let's move to the next command, ALTER TABLE. You could use it in order to change the definition of a table. Let's say we need to add one more column to our new table persons, and that is the email. Doing that is pretty simple. We can remove the previous query. We use the keyword ALTER TABLE and the table name, persons. After that, we add the keyword ADD. Now we are adding a new column, and it's like in CREATE TABLE. We need the column name, and that is email. After that, we need to define the data type. It's going to be VARCHAR 15 as well, as a rule. And here too we can add some constraints for data quality if we want. We say, OK, this is NOT NULL. So with that, I'm changing the already existing table called persons by adding a new column. Let's run this, and let's check our table again. Refresh, select from the table persons, and see the results. As you can see, at the end we have a new column; SQL is going to add new columns at the end by default. If I check this as well with DESCRIBE persons, just to make sure that everything is fine, we can see that we have one more column called email, VARCHAR 15, and it cannot be null. Alright, so that's all about how to alter a table. Now we are going to learn how to drop a table. It's really easy. 39. SQL | DROP Table: Alright, so now let's jump to the last command that we have in order to change the structure of our database.
And that is the DROP command. If you want to delete a table, if you say, OK, this table is completely wrong, I don't want it in my database, you can drop the table, and that's pretty easy. You could do it like this. Let's say we want to drop the new table that we have, called persons. We use the keyword DROP TABLE and just write the table name after it, and that's it. Once you execute that, the table persons will no longer exist in your database. So I'm going to delete it. And as you can see on the left side, there is no table persons anymore. So it's really simple. Alright guys, that's all for the last chapter. And not only that, that's all for this course. 40. Tableau | Course Introduction: Welcome to this very unique course to master Tableau. My name is Baraa Khatib Salkini and I'm currently leading big data projects at Mercedes-Benz, with over a decade of experience in big data, data visualization, and business intelligence projects. I'm very excited to be your instructor for this course. In this 21-hour course, I'm going to share everything that I know about one of the most in-demand skills in data science and data visualization: Tableau. By the end of the course, you're going to be able to create amazing data visualizations in Tableau like I do in real projects. I designed this course to take you from zero to hero. If you are a beginner, don't worry about it. I'm going to explain everything from scratch, step by step. That means this course assumes that you don't have any skills in data visualization. As well, all the skills that you learn in this Tableau course, like data modeling and so on, can be used in other tools like Power BI and Qlik. Now, of course, you might ask yourself: what makes this Tableau course different and unique from all other online courses?
This is the only course that breaks down the complex concepts of Tableau into animated visuals, because visuals are very powerful to make complex concepts easy to understand and to follow. In this Tableau course, we're going to present over 250 animated sketch notes of Tableau concepts. Understanding the concepts and how Tableau works can make you a professional and an expert in data visualization and in Tableau. And in this course, I'm going to provide you with tons of free materials. For example, I've prepared three different data sources for this course that we can use in all our tasks and examples through the course. As well, I'm going to provide you with three Tableau cheat sheets: one cheat sheet for all Tableau concepts, another one for all Tableau calculations, and one more cheat sheet for all the visuals to help you choose the right charts. Having those three cheat sheets, you don't have to memorize everything. You have a quick reference and access to Tableau concepts. As well, you have access to all Tableau files and dashboards that are created during the course. And all the sketch notes of each section are available to you to download, so you can use them later as a reference. Now let's have a sneak peek at the Tableau course. We will start with the basics: what is business intelligence, what is data visualization, what is Tableau? And then you're going to learn the Tableau product suite. And after that, we're going to do a deep dive into different Tableau concepts like the Tableau architecture, dimensions and measures, discrete and continuous data. After that, we're going to deep dive into Tableau calculations and functions. You're going to learn more than 60 different functions in Tableau to manipulate data. And after that, we're going to go and cover more than 63 different types of charts in Tableau. And then at the end, we're going to go and implement a Tableau project, similar to the ones that I do in real-life projects.
So now the question is, who is this course for? If you are someone that has never built any data visualizations using tools like Tableau or Power BI, I will be with you in this course in each step, starting from the fundamentals, and we're going to end up with the advanced topics. And this course is as well for you if you are already a Tableau developer. So I would suggest that you take a look at the course curriculum and start at the level that suits you. I have covered a lot of advanced topics and you're going to get a lot of best practices in this course. And this course is suitable for you if you have experience in other tools like Power BI, and you would like to pick up a new skill in Tableau. So let's jump in and get started.

41. Tableau | Course Curriculum Overview: We're going to have a quick overview of the Tableau course. I have split this course into 15 different sections. For example, we're going to learn what is business intelligence, what is data visualization, what is Tableau and the history of Tableau, and why Tableau is a very powerful tool for data visualization. After that, we're going to go and deep dive into the Tableau product suite. Tableau is not only one product. We have eight different products. So I'm going to go and introduce you to those products. And we're going to go and compare them side by side for you to understand the differences between them. And I'm going to help you to choose the right products for your project. Moving on, we're going to go and deep dive into the Tableau architecture. Here we're going to learn many different concepts, like what are live connections and what are the different types of Tableau files. And then we're going to deep dive into the Tableau architecture in order for you to understand the main components of the architecture and how Tableau works internally. After all that theory, we're going to start preparing your environment in order for you to practice with me in this course.
So we'll go and download and install Tableau, for free of course, on your PC. We're going to go and create a free Tableau Public account. We're going to download the training datasets and we're going to publish our first visualization. And at the end, I'm going to take you on a tour in order to make you familiar with the Tableau interface. And after we have prepared your environment, we're going to start with the first topic, how to create a data source in Tableau. And here you can gain skills in data modeling. So we're going to go through the basics of data modeling and as well how to do modeling in Tableau. And then we're going to go and learn four different methods to combine tables in Tableau, using joins, unions, relationships and data blending. And of course, we're going to go and compare them side by side in order for you to understand the differences between them and when to use which method. And at the end of this section, we're going to go and create two data sources. Moving on, we're going to start talking about the Tableau metadata. Here you're going to learn very important concepts in Tableau: the data types, dimensions and measures, discrete and continuous values. Once you understand those concepts, you can understand how to create visualizations in Tableau. After this section, we have a small section about renaming. Here we're going to talk about the naming conventions that each developer should know. Then we can learn the different techniques to rename columns and tables in Tableau. And at the end, we can learn how to give aliases to the values. Moving on to the next section, you can learn how to organize your data in Tableau. And here we have different methods, like grouping the dimensions using hierarchies and grouping the values using groups and clusters. And then after that, we're going to learn sets in Tableau. And at the end, we can learn how to create bins in Tableau in order to create histograms.
In the next section, we're going to learn how to filter our data in Tableau. And here you can learn the different types and concepts of filters in Tableau, how to create them and how to customize them. And I'm going to give you ten tips and tricks about filters in Tableau. And we will learn as well in this section how to sort our data. After that, we can learn a very important concept in Tableau, which is the Tableau parameters. Tableau parameters are great in order to add dynamics to your visualizations. You can learn the concepts of parameters and then you can learn different use cases for them: how to make dynamic calculations, dynamic reference lines and filters, how to swap measures and dimensions, and as well dynamic bins. Moving on to the next section, we're going to learn as well something about dynamics. So we're going to learn the Tableau actions in order to make your dashboards interactive. As usual, first you can understand the concepts of Tableau actions. And then we're going to go through all Tableau action types. For example, how to go to a URL, how to go to sheets, how to filter data using actions, how to make highlights using actions, and how to change the values of sets and parameters. After this section, we're going to have the Tableau calculations. This section is very huge. You're going to learn how to transform and manipulate your data using four different Tableau calculation types. So we have the row-level calculations, aggregate calculations, table calculations, and the LOD expressions. In this section, you can learn more than 60 different Tableau functions in order to manipulate your data. Moving on to the next section, we have another big one. We have the Tableau charts. Here we're going to go and build together more than 63 different charts in Tableau. So we will start with the basic charts, like the bar charts, and we're going to end up building very advanced charts in Tableau.
And at the end, I'm going to help you to choose the right charts for your requirements. Moving on to the next one, we're going to learn the Tableau dashboards. We're going to go step by step through how to create clean dashboards in Tableau using containers. And now in the last section, we have a Tableau project. In this section we're going to go together and implement the project exactly like I do it in my real-life projects. So first we're going to learn the different phases of each Tableau project. Then we're going to start with the requirements. So you're going to learn how I analyze the requirements for Tableau. And then we start with the implementation of the project. So we're going to go and build the data sources, the charts and two different dashboards. So with that, you're going to get familiar with how to implement projects in companies using Tableau. So once you go through all those sections, you're going to have solid knowledge about Tableau.

42. Tableau | Section: Tableau Basics: Tableau basics. Before you start learning how to use any tool, it's very important to understand the principles and the theory behind it, which can help you in your career to be a professional developer and as well an expert. That's why we're going to cover now the following topics: the buzzwords of big data, what is business intelligence, what is data visualization and why it's very powerful. And at the end, we're going to talk about what is Tableau and why Tableau is a leader in data visualization. So let's start with the first topic. We're going to go and learn the main buzzwords of big data. So now let's go.

43. Tableau | Big Data Buzzwords: If you are new to the world of data, you must have started hearing a lot of buzzwords, from big data to IoT, data science, data engineering, and phrases like "data is the new oil". In this tutorial, I will be covering some important buzzwords about data and what they really mean.
Let's dive in. We are living now in the data-driven age and data is generated everywhere. We people generate massive amounts of data as we speak. With each click on the Internet, each search, each email, or even if you are ordering something online, we generate data. We spend hours every day on social media: liking, commenting, searching. Our smartphone is uploading data all the time about where you are and how fast you are moving. And everything we do online is now stored and tracked as data. Not only are our smartphones and computers connected to the Internet and generating data, but we also have something called the smart home. We can connect any device in our home to the Internet. Just put the word smart before it. We have smart mowers, smart lighting, smart fitness, voice devices, security systems. All those devices can be connected to the Internet and start generating massive amounts of data. And this is what we call the Internet of Things, IoT. IoT is the concept of connecting any device, anything, to the Internet in order to generate and exchange data. Not only do we have IoT in our homes, but everywhere. We are living through the digital transformation in industry and manufacturing. You might have heard of the concept Industry 4.0, the fourth Industrial Revolution, introduced in Germany. It's all about smart factories, connecting machines and devices to the Internet in order to exchange data. And now we can find IoT in the cities. We are trying to implement smart cities, where we're going to connect everything in order to reduce waste, save money and improve quality. We have IoT as well in our cars. Our cars are loaded with sensors and devices that are connected to exchange data for many reasons, like driver assistance, object recognition, self-driving systems. The list is just so long.
In 2022, we have around 14 billion physical devices, things from small household cooking devices to sophisticated industrial machines, that are connected to the Internet, generating and exchanging data. The amount of data generated every day from IoT, social media, websites and machines is truly mind-blowing. There are currently over 44 zettabytes of data in the entire digital universe, that is 44 followed by 21 zeros in bytes. That means we are no longer dealing with normal traditional data, we are dealing now with big data. What does big data mean? There are three indicators that help us to understand whether our data is big, and they are defined by the three Vs. The first V is volume. Well, big data is big. With the growth of the Internet, mobile devices, social media and IoT, the amount of data generated from those sources has grown dramatically. The second V is velocity. In normal data processing, we used to process slow data, or we call it batch data, once a day or so, and then we store it on disk. But in the big data world, the sources are generating streams of data at very high speed. That means we have to process and analyze the data in real time, and then we store it in memory instead of on disk. And the third V is variety. In traditional systems, most data types could be captured as rows in structured tables, like in a database or in Excel. But in the big data world, data often comes in a semi-structured format, for example server logs in XML, or websites. Or the data comes in an unstructured format, like videos, audio, images, free text. In big data, we have to deal not only with structured data, but also with semi-structured and unstructured data. So the term big data means how we can efficiently store, process, and analyze our data when it has huge volume, high speed, and different types, in order to reveal significant value for the business. But we still have a problem with that. All that generated data is raw data.
Raw data are just unprocessed rows and rows of numbers that are really hard to understand, hard to read, badly structured, and have almost no value to the business. Almost 70% of the world's data is unused. Raw data, if left without processing and refining, is just worthless, a waste of money, a waste of space, and it generates digital waste stored in very expensive data centers. And that's why we have the very famous phrase of the famous British mathematician, Clive Humby: data is the new oil. Well, it means that we have to extract the raw data like we are extracting oil. We have to refine it, process it, transform it into something useful that has value to the business. What this really means is that most companies are sitting on a very big field of new oil, raw data. And most of them have understood that data is their most valuable asset. They have to extract it. They have to analyze it in order to reveal insights that could help them to make faster and better decisions. And that's why most companies are hiring an army of data workers. As we know, the demand for data scientists is increasing rapidly and the supply is low. Now what can we do with all that chaos, all that generated unprocessed raw data? Well, we can do the following stuff. First, we can design or build a data architecture. Data architecture is the process of creating a blueprint of how we organize, process, and store our data in different layers for different purposes. Architecture makes it easier to manage, protect, and access our data. Another thing that we can do with raw data is data engineering. Data engineering is the very complex process of designing and building data pipelines and data storage. In data engineering, we usually build ETL processes to extract the raw data from multiple sources, then transform it and then load it into the target storage, in order to make it highly available and usable for the data scientist or any other end user.
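To make the ETL idea a bit more concrete, here is a minimal sketch in MySQL. The tables raw_sales and clean_sales and all of their columns are hypothetical names made up for illustration; they are not part of the course database, and a real pipeline would usually pull from several source systems rather than one table.

```sql
-- Hypothetical source table: a raw export where everything arrives as text.
CREATE TABLE raw_sales (
    sale_id   INT,
    city      VARCHAR(50),
    amount    VARCHAR(20),
    sale_date VARCHAR(20)
);

-- Hypothetical target table: typed columns and constraints for data quality.
CREATE TABLE clean_sales (
    sale_id   INT PRIMARY KEY,
    city      VARCHAR(50) NOT NULL,
    amount    DECIMAL(10, 2) NOT NULL,
    sale_date DATE NOT NULL
);

-- Extract rows from the raw table, transform them, and load them into the target.
INSERT INTO clean_sales (sale_id, city, amount, sale_date)
SELECT
    sale_id,
    UPPER(TRIM(city)),                    -- transform: normalize city names
    CAST(amount AS DECIMAL(10, 2)),       -- transform: text to a numeric type
    STR_TO_DATE(sale_date, '%Y-%m-%d')    -- transform: text to a DATE
FROM raw_sales
WHERE sale_id IS NOT NULL                 -- skip incomplete rows
  AND city IS NOT NULL
  AND amount IS NOT NULL
  AND sale_date IS NOT NULL;
```

The sketch assumes sale_id is unique in the raw data and that dates arrive in year-month-day form; handling duplicates and other formats is left out to keep it short.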
Another thing that we can do is data modeling. Data modeling is the process of connecting the dots. So what we're going to do is put all the data into entities and objects. Then we describe the relationships between those entities in order to help us, and help the programs, to understand how the data are related to each other. Another thing that we can do with the raw data is data mining. Data mining is the process of analyzing massive amounts of raw data in order to discover knowledge, to discover business intelligence like patterns and trends, to solve problems and to mitigate risks. Another use of the raw data is in machine learning. In machine learning, we are providing the computer with two things: first, the raw and historical data, together with the mathematical models and algorithms. Once the computer has those two things, it's going to start training and practicing in order to perform tasks like predictions. It's like a human. The more the machine practices and trains, the better and more accurate the results are going to be. Next, we can do data science. Data science is the scientific study of data. And it combines three major powers: the power of programming languages, together with mathematics and statistics, and the knowledge of a specific domain, in order to uncover valuable knowledge and insights from our raw data. One more thing that we can do with the raw data, and my favorite one, is data visualization. Data visualization is the process of converting numbers and raw data, which are normally hard to understand and to read, into visuals and charts like bars, pies and tree plots, in order to make them easier to understand and easier to read, which really helps in decision making.
There are many other things and processes that we can apply to the raw data, but these are the major fields of work that we can use in order to convert the useless raw data into knowledge that has significant impact and value for the business. All right guys, so that was an introduction to big data terms. And next we will quickly learn what is business intelligence, using a very simple example.

44. Tableau | What is Business Intelligence (BI): All right, let me tell you this story. We have shops in three different cities in Germany: a shop in Stuttgart, in Berlin and in Hamburg. And our three shops are generating every business day a lot of raw data on sales, inventory levels, products, staff costs, and so on. And now we have a group of people that are the decision makers, like managers, HR, finance. And they have many questions and decisions to make. So they might have questions, for example, about what happened, and other questions about what will happen. Now if the managers try to find the answers in the raw data, they might find nothing and no answers, because the raw data are usually very complex and badly structured and they are really hard to understand. And that's why they're going to go and hire some data analysts, for example, in order to help them find the answers in the raw data. The data analyst is going to go and start analyzing the raw data by doing some magic, for example cleaning up the data, connecting objects together, and aggregating the data at different levels. And at the end, the result will be communicated as, for example, a spreadsheet to the decision makers. On the other hand, the managers can hire data scientists in order to help them find answers about what's going to happen or uncover unknown facts and insights.
The data scientist is going to go as well and start analyzing the raw data, but this time using different methods, like for example data mining, machine learning or training a model, in order to find new insights, new knowledge, and answers to the questions. At the end, the output is going to be communicated as well to the managers as numbers and spreadsheets. Now, both the data scientist and the data analyst did an amazing job working on the raw data and analyzing all that stuff. But the problem here is that the output might be hard to understand and read, because those managers are usually people that don't work directly with the data every day. This could lead to a big gap between those managers and the results. Now in order to bridge this gap and make everything easier, we can use the power of data visualization. The results presented by the data scientist and the data analyst should be converted from boring numbers and spreadsheets into visuals, graphs and charts. The visual representation of the data will just do the magic by making everything clear and easy. And it's going to bring the wow effect very easily once you are presenting your results. So it's going to help the managers to immediately find their answers and they're going to start making decisions using the data. This process we call business intelligence, or as a shortcut, BI. All right, so now I hope you have a better understanding of what business intelligence is, and next we will understand why visualization is so powerful and what is data visualization.

45. Tableau | The Power of Data Visualization: Now the question is why visualization is so powerful. With simple visual communication, you can make a huge difference. Since the start of humanity thousands of years ago, early humans used visuals in order to tell a story. And until now, in the modern age, humans still use visuals in order to tell any story. Because we humans are visual creatures, we think in pictures and in visuals.
If we see a tree, our brain can store it as a visual, as an image. Statistics say that 90% of the information transmitted to our brain is visual. But if we read the word tree, our brain first has to transform it into a visual before storing it, which is way slower. In fact, the human brain processes visuals 60,000 times faster than text. One more fact about our brain is that we remember most of what we see and interact with. It's proven that humans remember only 10% of the things we hear and 20% of what we read. And it's also proven that we remember about 80% of what we see and interact with. That's why we have the famous phrases "a picture is worth 1,000 words" and "seeing is believing". Having all those facts, no wonder that in digital channels visual content is taking over: posts, tweets, articles, news, presentations, dashboards. You can find visuals everywhere. Now the question is, what is data visualization, or as we sometimes call it, dataviz? Data visualization is the process of converting boring numbers and raw data into interesting graphical elements like bars, pies, tree plots and so on. So data visualization brings the data to life and makes you the master of storytelling of the insights hidden within your numbers. So it's like the art of converting highly complex, massive datasets into something very simple, something very easy to understand and to interact with. Imagine yourself to be one of the managers, and you have two data analysts. One of them is presenting the result in a spreadsheet filled with numbers, and the other data analyst is presenting the result with visuals filled with graphic representations of the data, and both are presenting the same facts. Which report would you prefer? I would go with the right one, because the left one is just dry, boring numbers, and it is unlikely that you will be able to spot any trends and patterns.
The main benefit of data visualization is telling a story, which arms you with the tools to make the right decision at the right time. There are many other benefits, like seeing the big picture, tracking trends, making smarter and faster decisions, discovering unknown facts, patterns and trends, and getting as well more engagement from the end users, who ask more and better questions. All right, so with that, we have learned what data visualization is and why it is very powerful and important. Next we will compare Excel to tools like Tableau and see why you need to use Tableau instead of Excel.

46. Tableau | Tableau vs Excel: Over and over again I'm asked the same question: why should I bother learning and using Tableau or Power BI for data visualization if we have Excel? In this video, I'm going to explain my six reasons why we should use a modern BI tool like Tableau or Power BI and not Excel for data visualization. And we start right now. There are around 1 billion users globally using Microsoft Excel. I have worked in many companies and I can tell you people are just addicted to Excel. They love it. They use it for everything: as a planning tool, for data entry, data analysis, and data visualization. The main problem here is that the more a company grows, the more data it generates. And because everyone is familiar with Excel, they're going to keep using it in big data use cases. And they're going to face a really hard time managing those spreadsheets and dealing with the limitations of Excel. In these situations, it's really time to switch to a modern BI tool or data visualization tool like Tableau or Power BI. Now let me show you how BI is done with Excel. We usually have different source systems and a data analyst who's going to go and start manually exporting the data from those systems and importing it into Excel. And then some calculations are going to be done, and at the end a report will be generated. The Excel files will then be accessed by different business users.
On the other hand, we can do BI with a modern tool like Tableau. So what we're going to do is connect Tableau directly to those source systems. And the data analysts can start developing reports or dashboards in Tableau. And at the end, the business users will access Tableau in order to see those dashboards. So far you can say, okay, both look really similar. So now let's dive in, in order to show you the real benefits of having a modern BI tool like Tableau or Power BI, and the limitations that we have in spreadsheets like Excel. The first benefit is automation. If we are using Excel and we have made some nice reports, at some point it's time to update the data. And how do we do that in Excel? We update the data manually. So some employees have to sit down every day and go through the process of extracting data from those source systems, importing it into Excel, doing the calculations, and at the end preparing the reports, over and over again, which is very time consuming. But if you are working with a modern BI tool like Tableau, we can automate this boring task by creating a schedule to refresh the data. For example, we can create a schedule in Tableau so that every day at 7:00 in the morning Tableau automatically connects to the data sources, pulls the data, and prepares the reports. There are two benefits of doing that. First, we eliminate human errors, which are a very common thing in Excel, and sometimes those mistakes can lead to wrong decisions and to financial loss. And the second benefit, of course, is that we no longer need employees that are dedicated only to the boring task of exporting and importing data manually into Excel. Another benefit here is capacity. Say we are working with Excel and one of our source systems starts generating massive amounts of data. Here we have a problem in Excel, because it can handle only around 1 million records.
So our Excel file is going to break, and we're going to start getting error messages like "the dataset is too large". What we usually do in Excel is go and start splitting the main file into multiple small files in order to manage the huge volume of data, which is really hard to manage. On the other hand, if you are working with Tableau, we don't have to worry about all that stuff. We have no problem in Tableau, because Tableau is made for big data use cases and can very easily handle massive amounts of data. We might just change the connection type from extract to live in order to handle it. Another benefit is security. If you are working with Excel, it's really not hard to hack into Excel, even if you are using password-protected spreadsheets. They can still easily be hacked nowadays. And the users are really used to sharing their Excel files in emails, copying them to USB, or storing them locally on their computers, which is not secure at all. All that stuff could cost companies a lot if sensitive and confidential data is accessed by competitors. But if you are working with a modern BI tool like Tableau, it's going to provide us with superior security features like advanced access control, data security and network security. And plus, if you are working with Tableau, we don't have to export the data. We can just share the dashboards and reports between employees, and only if we grant them access rights can they see the data. Another benefit is row-level security. Many companies have a lot of confidential sources, and they have started to understand how important it is to apply the need-to-know principle. The need-to-know principle says a user shall only have access to the information that their job function requires. That means we cannot go and share all data with all users. We have to have some data restrictions. For example, a sales employee should not see all data like a manager, and finance employees should not see all personal information like HR, and so on.
That means if you are working with Excel, we have here again to split the main files into specific reports for specific roles. But on the other hand, most of the modern BI tools offer a feature called row-level security, RLS. Row-level security refers to restricting the rows of data that a certain user can see based on policies that we define. Using this technique, we're going to enforce the need-to-know principle and make our life easier by just having one dashboard accessed by different types of users. And then, based on their role, they're going to see the data and the information that their job requires. Another benefit is reducing chaos. Let me tell you how we usually work with Excel. A data analyst will start exporting data from one source system and is going to make a report called version one. And then for other requirements, they're going to make a version two report. And eventually we're going to have a final report. And we have another data analyst working on a different source system, and the same thing is going to keep happening a few times, back and forth. And eventually we're going to end up having six different versions of the reports. If we scale this up, you will notice that you are slowly poisoning your business, and the end users are going to have to access different versions of the reports. Now if we ask how old the data in our reports is, we will get different answers. One version is going to be ten days old, another one a different number of days. That means we don't have a single point of truth for our data. That's why having modern tools can help us to eliminate such chaos and can help us to build a single point of truth for our data. One last benefit that I would like to talk about is visuals. Although Excel offers visualizations, it is sometimes very limited when we are producing complex visuals. As well, creating visualizations in Excel is very time consuming and includes a lot of manual steps.
And as well, those visuals are going to be static and not interactive. But on the other hand, if we are using Tableau, everything is going to be automated and super fast. We can create new reports and views very quickly by just dragging and dropping. And it offers way more interactive and cooler visuals than Excel. All right, the main reasons why I prefer working with modern BI tools like Tableau and Power BI and not Excel for data analysis and data visualization are automation, security, big data use cases, and interactive visuals. It's not about Excel versus Tableau. It's all about using the right tool for the right use case and not misusing a tool. Excel is a great tool that is used by around a billion people because it's a cheap, very easy to use, professional spreadsheet for data entry and complex calculations. But when it comes to data analysis and data visualization, we have way better tools than Excel, like Power BI and Tableau. And you can still use them together. For example, you can do your complex calculations in Excel, and the final result can be imported into Tableau in order to do better visualizations and to get more insight into the results. The thing is, the world is changing very fast and companies are generating massive amounts of data. So instead of using traditional spreadsheets like Excel, we have to use more powerful tools in business intelligence to help us quickly find insights, trends and patterns in order to make faster and better decisions. All right guys. So with that, you no longer have to rely on Excel for data visualization and can start using BI tools. Next, I will show you quickly the top three BI tools for data visualization and what is my favorite BI tool.

47. Tableau | Best 3 BI Tools: Now the question is, what are the best tools for data visualization? A leading research company called Gartner publishes every year the Gartner Magic Quadrant to show who the leading products in a specific domain are.
And if you check the Magic Quadrant for analytics and business intelligence platforms for the last ten years, you can almost always see the same leaders. We have Tableau, Power BI, and QlikView since 2012. I'm working with a lot of data visualization tools, and I can say that all three of those tools are really great tools. They have their advantages and disadvantages. But by just checking the data visualization aspect, I can say that Tableau is the winner here, because data visualization is a core concept in Tableau, and it is really the best tool for data scientists and for big data. All right, so with that, you have learned what the top three BI tools are. And you know by now that Tableau is my favorite data visualization tool. Our next step is to introduce you to Tableau. We will cover what Tableau is, its history, and its mission. 48. Tableau | What is Tableau?: The first question is, what is Tableau? A quick answer could be: Tableau helps to convert this to this without any technical or programming skills. Tableau converts complex and boring raw numbers into beautiful visuals and charts, which are really easy to understand. The key features of Tableau are interactivity, being easy to build and to use, and fast performance. We can call Tableau by many names, like a data visualization tool, a business intelligence or BI tool, or sometimes we call it a reporting tool. Well, Tableau is all of them, but I choose to call Tableau a data visualization tool because data visualization is the core concept of Tableau. Now let's have a quick history of Tableau. In 2003, Tableau was founded by three guys, Pat, Christian, and Chris, as a result of a computer science project at Stanford University. They focused on visualization techniques to analyze data inside databases. And then in 2019, Tableau was acquired by Salesforce in a deal worth over 15 billion dollars. And for the last ten years, Tableau was named a leader in the Gartner Magic Quadrant for business intelligence. 
Tableau has a clear mission: to help people see and understand their data. They really focus on keeping Tableau intuitive and easy to use. That's why Tableau does not require any technical or programming skills in order to build amazing dashboards and insights. That means the target audience of Tableau is not only technical users, like IT, data analysts, and data scientists, but also all other non-technical users, like a business user, an end user, a teacher, and so on. This aspect is a game changer. It changes the old mindset of having only IT and technical people working with data and building visualizations. Now we have modern data visualization tools like Tableau, which open the door for everybody to start working with data. That's why tools like Tableau help organizations to be data driven. And now Tableau is widely used. You can find Tableau in almost all organizations, industries, and sectors, in all departments, because most of those organizations want to empower their employees with tools like Tableau in order to make better, faster, and smarter decisions using data. All right, so with that, I hope you now have a better understanding of what Tableau is and its mission. And next I will show you my top four reasons why I think Tableau is a leader in data visualization. 49. Tableau | Why Tableau is Powerfull?: Tableau is not the only leader in the business intelligence and data visualization market. There are many other tools available, like Power BI, QlikView, and so on. But now if you ask me what makes Tableau so special and why Tableau is so widely used, I would give you four reasons. The first reason is performance. The sources now are generating massive amounts of data, and Tableau is designed and optimized to handle huge volumes of data without impacting the performance in the dashboards. 
And that's because Tableau is using a high-performance in-memory data engine to help analyze large datasets, where the data can be stored in columns instead of rows, which can boost the performance in dashboards. Tableau has no limitations whatsoever on the number of data points in the visualization. For example, on this view we have over 1 million data points without any problem. This allows us to analyze large datasets in order to find trends and patterns with great performance, while many other tools still enforce data point limitations, which is not really helpful for data analysis. The second reason is quick and interactive visualizations. Compared to the other tools, with Tableau we can create rich and beautiful visualizations in just a few seconds. I'm going to show you now a quick example of how to cluster my data and how to calculate a forecast. In order to do such a complex job in Tableau, we will just use drag and drop. So let's see how simple it is. All right, so we're going to go to the orders. Take the sales and put it in the columns, put the profit in the rows, and take the order IDs to the details. And I want to see all my members over here. And now we go to the Analytics pane, and then double click on the clusters. With that, I have four very nice clusters of my data. The next step, I will create a forecast of my data. I'm going to take the order date and put it on the columns. And then we're going to take the sales. I would like to change the visual to bars. I have now here around five years. What we're going to do, we're going to go to Analytics and just click on the forecast, and that's it. I have a forecast of two years of my sales. Now I'm just going to go and put them together in one dashboard. So I'm going to create a new dashboard, drag and drop the clusters, drag and drop the forecast. I'm going to link them together with the filter. That's it. 
Now we have both of them, and if I click around, I will have an interactive dashboard for the forecast and for the clusters. The third reason: Tableau is user friendly. As you can see, we have done a very complex analysis with just drag and drop, without writing any code. And this is exactly what Tableau wants. It's very intuitive and user friendly, and this is the major strength of Tableau. It just opens the door for all non-technical users to have a chance to work and play with data to solve their daily problems without the need of IT. But on the other hand, Tableau is integrated with programming languages like Python and R, which opens another door for advanced data visualizations that might be used by data scientists. The last reason is community. If you are working with Tableau, well, you are not alone. You have a huge Tableau community. In the community, we have around 2 million students and teachers. And in Tableau Public we have around 5 million data visualizations that are published. And there are around 200,000 questions and ideas that are shared in the Tableau forums. Having such a huge community is a big plus for any tool. It's very important because while you are working with data, you might face some problems or have questions. It's very important that you have a place where you can go and ask your questions and get advice from other developers all over the world. Not only that, you can get inspired as well by the visualizations shared by other developers. You can find the important links about the Tableau community in the video description below. All right, so my four reasons why Tableau is one of the best tools for data visualizations are: Tableau can handle massive amounts of data and is very suitable for big data use cases. It offers beautiful, quick, interactive visualizations. Tableau is intuitive and user friendly, and no coding or technical skills are required. And the last reason, the Tableau community is very huge. 
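Going back to the first reason, performance: the idea of storing data in columns instead of rows can be illustrated with a small Python sketch. This is a generic illustration of a column-oriented layout and says nothing about the internals of Tableau's data engine; the table, the field names, and the `to_columns` helper are made up for the example.

```python
# Row-oriented layout: one record per order, all fields stored together.
rows = [
    {"order_id": 1, "region": "EU", "sales": 120.0},
    {"order_id": 2, "region": "US", "sales": 80.0},
    {"order_id": 3, "region": "EU", "sales": 45.5},
]


def to_columns(records):
    """Pivot a list of row dictionaries into one list of values per field."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: every whole record is visited just to read one field.
total_from_rows = sum(record["sales"] for record in rows)

# Column layout: the aggregate reads one contiguous list and nothing else.
total_from_columns = sum(columns["sales"])

print(total_from_rows, total_from_columns)
```

Dashboards mostly aggregate a few fields over many records, which is why a layout that lets the engine read only the needed fields tends to help with large datasets.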
One more thing that I would like to add: data visualization is really one skill that you have to master as a data scientist or data analyst. And Tableau is an amazing tool for data visualizations. That's why I highly recommend learning or getting familiar with Tableau. It's going to be a huge advantage for your career. All right guys. So with that, you know my reasons why I think Tableau is a leader in data visualization. And with that, we have finished the first chapter of Tableau, where we have covered a lot of important terms of data and Tableau. And in the next chapter, we will have an overview of the Tableau product suite, where I will introduce you to eight different Tableau products. 50. Tableau | Section: Tableau Products: Tableau products. In Tableau, we have eight different products, and it's really important to understand them and understand the differences between them. So that's why I'm going to go and give you a quick overview of all eight Tableau products. And then we're going to go and compare them side by side in order to understand the differences between them. And at the end, you can learn the decision making process that I usually follow to choose the right product for your requirements. So now let's start with the first topic, where we can have an overview of the development process and products. So now let's go. 51. Tableau | Development Process: All right guys. In this chapter I will introduce you to the Tableau product suite to understand the differences between the eight Tableau products. And we will start with the Tableau development products. All right, if you think Tableau is only one software, then you are wrong. If you visit the home page of Tableau, Tableau.com, you will find many different Tableau products like Tableau Desktop, Public, Server, Cloud, Prep, and Reader. I can say at the start, it might be confusing having all those Tableau products, but don't worry about it. I'm going to explain them one by one. 
So you can choose the right combination of Tableau products for you or for your organization. It's really important to understand the differences between them, the functionalities and the limitations of each Tableau product. So let's dive in. The Tableau product suite contains eight different products. We have Tableau Desktop, Tableau Public Desktop, Prep, Server, Cloud, Public Cloud, Reader, and Tableau Mobile. All right, the first thing to understand is that we can split those products into two main categories, Developer Tools and Sharing Tools. Tableau Developer Tools, as the name implies, are tools that are going to help you to build data visualizations by creating and designing dashboards, charts, and reports, or to do data preparation or data engineering by preparing the data for data analysis. Under this category, we can find three Tableau products: Tableau Desktop, Public Desktop, and Tableau Prep. And now in the other category, we have the sharing tools. Those tools can help you to share and collaborate on the work that you have done and created using the developer tools. Under this category, we can find five Tableau products: Tableau Server, Tableau Cloud, Public Cloud, Reader, and Tableau Mobile. All right, so now first let's focus on the Tableau products under the category Developer Tools. Now we can go and split the developer tools as well into two groups based on their purposes. We have Data Visualizations and Data Engineering. Underneath Data Visualizations, we find two Tableau products, Tableau Desktop and Tableau Public Desktop. And underneath Data Engineering, we have only one Tableau product, and that's Tableau Prep. All right, so now after we understood the main categories and the main purposes of the Tableau products, we will go now and talk about the development process in Tableau. All right, so basically we have three very simple steps in the development process in Tableau. The first step, we connect our data to Tableau. 
Then in the next step, we start building our data visualizations to do data analysis by creating reports, charts, and dashboards. And in the third step, we share our work by publishing it. The two products to do these three steps are Tableau Desktop and Tableau Public Desktop. In many cases, the quality of our data is bad and not ready for analysis. That's why we add one more pre-processing step to prepare our data before we start building our visuals. And for this step we can use the product Tableau Prep. All right, so now let's do a deep dive into the Tableau developer products one by one in order to understand the key features and as well the limitations of each one of them. All right, so with that, we have an overview of the development process and the products. And next we will have a quick overview of Tableau Desktop. 52. Tableau | Tableau Desktop: Tableau Desktop is a software you download and install on your PC. With Tableau Desktop, you can connect to many different source types. There are over 90 data connectors. You can connect to Tableau Server, or to files like Excel, text, and JSON, or to on-prem servers like MySQL and Oracle, or to the cloud like Amazon, Google, and Microsoft Azure. Once you connect Tableau to your data, you can start building your data visualizations. In Tableau Desktop, you will find many tools and functions to help you create charts and reports with just drag and drop. And then you can combine those different reports into interactive dashboards. And after you are done building your views and dashboards, you have three options to share your work by publishing it either to Tableau Server, Tableau Cloud, or Tableau Public Cloud. Or you can even store your workbooks locally on your PC. All right, so Tableau Desktop is the backbone product of Tableau. As a Tableau developer, you're going to spend 90% of your time using this tool. 
Tableau Desktop is a developer tool to build data visualizations, where you connect to your data, build dashboards, and then publish them. Unlike Power BI Desktop, Tableau Desktop is not a free tool. In order to work with Tableau Desktop, you have to buy a license. I think they offer some kind of trial phase, or if you are a student you get like one free year. Don't take my word for it. It's better to check the current offering from Tableau on their home page. With Tableau Desktop, you can connect to over 90 different data sources. You can publish your work as well everywhere: to Tableau Server, Tableau Cloud, and Tableau Public. Since Tableau Desktop requires a license, you don't have any limitations whatsoever on how many rows of data you can store and process. Tableau Desktop is meant for data analysts, data scientists, and BI developers who work professionally in companies on data analytics projects. All right, so that was a quick overview of Tableau Desktop. Next we will check Tableau Public Desktop. 53. Tableau | Tableau Public Desktop: Tableau Public is the free version of Tableau Desktop. It is very similar to it. It's a developer tool to build and publish data visualizations. And since it's free and requires no license, it comes with a few limitations. In Tableau Public, we have around ten data connectors, and you can connect only to local files on your PC. Another limitation is that you can store and process only 15 million rows of your data, and you can publish only to Tableau Public Cloud. That means you cannot publish your work to Tableau Server or to the private Tableau Cloud. And the last limitation is that you cannot store your workbooks on your local PC. 
But here I have to be fair: the most important part is that all functions and tools to build visuals and dashboards are completely available in Tableau Public, just like in Tableau Desktop, which really makes Tableau Public a great alternative and a great tool for beginners to practice and learn Tableau before they go and buy licenses. And to be honest, that's why I decided to go with Tableau Public in all my tutorials, so that anyone can follow and practice with me without having to buy any licenses. All right, so with that, we have a quick overview of Tableau Public Desktop, and next we will check the data engineering tool, Tableau Prep. 54. Tableau | Tableau Prep: Tableau Prep Builder is a software you download and install on your PC, and you can use it to prepare your data before you start analyzing it. Same as Tableau Desktop, you can connect to many different source types. There are over 90 data connectors, like Tableau Server, files, on-prem, cloud, and so on. Once you connect Tableau to your data, you can start building data flows, where you have access to tools and functions to help you transform your data. For example, combining data, cleaning, filtering, aggregating, and all other sorts of data engineering tasks to prepare your data for data visualizations. And at the end of your data flow, you can store the new prepared data in three different places: either as a file on your local PC, or published as a data source in Tableau Server or Cloud. And the last option, you can write the output directly into databases. And after we are done building the data flows, you can publish them to Tableau Server or Tableau Online for automation. And in Tableau Prep you have the option to store your data flows locally on your PC. All right, so Tableau Prep is a data engineering tool to prepare our data and get it ready for analysis. Sometimes the data that we are connecting to Tableau Desktop has bad quality and we cannot use it immediately in our dashboard. 
That's why we spend hours and hours cleaning up, organizing, combining, and preparing our data. And that could be really time consuming. So for this situation, we could use Tableau Prep to help us with this process. Tableau Prep is a developer tool for data engineering where we connect to our data, build data flows, and then publish them. And it's not a free tool, it requires a license. In Tableau Prep, we have over 90 different data connectors. The output of the data flows could be stored locally on your PC, or as a Tableau data source, or directly in the databases. And we can publish the data flow either to Tableau Server or to Tableau Cloud. Tableau Prep is not like Tableau Desktop. We don't have any free version of Tableau Prep, so there is no Tableau Public Prep. All right, so that was a quick overview of Tableau Prep. And next we will compare all three Tableau development products side by side. And I will walk you through my decision making process to choose the right product for you. 55. Tableau | Tableau Desktop vs Prep: All right, so now let's go and have a summary of the three products, where we're going to compare them side by side. The main purpose of Tableau Desktop and Public is to generate data visualizations. But the main task of Tableau Prep is data engineering. Now if we are talking about the costs, both Desktop and Prep require licenses, but Tableau Public is free to use. Now about the security aspect of the data. Tableau Desktop and Prep are secure, since you can publish your work to private servers. In Tableau Public, you have to publish your work to a public platform. Everyone can see your data, so you cannot secure your data in Tableau Public. And the next point, data limits. Since Public is free, it comes with a limitation of 15 million rows. But in Desktop and Prep, you will get no limitations. The next point is connectors. In both Desktop and Prep, you have over 90 different data connectors like files, APIs, servers, cloud, and so on. 
Whereas in Tableau Public you can connect only to files. And if we talk about the live connections aspect, the only tool that offers live connections to your data sources is Tableau Desktop. You cannot make live connections in Tableau Public and in Tableau Prep. You always have to work with extracted data. The next point is about storing your files locally. Both Tableau Desktop and Prep allow you to do that by storing your work locally on your PC. But in Tableau Public you cannot do that. Instead, you always have to publish your work to Tableau Public Cloud. The last aspect is about the target audience. Tableau Desktop is made for data scientists and data analysts, Tableau Public is made for anybody who wants to work with data visualizations, and Tableau Prep is made for data engineers. All right, so now with this, we have a good overview of the three Tableau products for development. And now comes the question of when to use which product. Now let me guide you through my decision making process using the following flowchart. First, we ask the question, for which purpose? If we need a product for data engineering, then it's easy. We have only one Tableau product, and that is Tableau Prep. Now if we need a product for data visualizations, then we can ask more questions. The next question: do we need to connect to servers, APIs, databases, or the cloud? If the answer is yes, then we have to use Tableau Desktop. And if the answer is no, then we ask the next question. Can our data be public? If the answer is no, our data is confidential, then we have to use Tableau Desktop. But if the answer is yes, our data can be public, then we jump to the next question. Do our data sources contain more than 15 million rows? If yes, then we have to choose Tableau Desktop. But if the answer is no, our data sources have less than 15 million rows, then we jump to the last question. Do we need to have live connections to our data sources? If the answer is yes, then we again have to choose Tableau Desktop. 
But if the answer is no, then finally we can go and use Tableau Public. All right, so if you follow those questions and this chart, you can easily decide when to use which Tableau product. All right, so with that, we have covered all the Tableau products for development. And next we will start talking about the Tableau products for sharing. So let's first understand the sharing process. 56. Tableau | Sharing Process: All right, so in the previous tutorial, we split the Tableau products into two main categories, Developer Tools and Sharing Tools. Now we're going to focus on the second category, the Sharing Tools, where we have Tableau Server, Cloud, Public Cloud, Reader, and Tableau Mobile. And as the name implies, those products can help us to share our reports and dashboards with others. In the last tutorial, we talked about the four steps of the Tableau development process. Now we're going to do a deep dive into step number four, where we're going to talk about the different options that we have to share our reports and dashboards with others. If you want to share your visuals with your colleagues in your organization, then we have a few options here. First, you can install the Tableau Server product on servers using the infrastructure of your organization. And then you can start publishing and sharing your dashboards there. Then your colleagues can either use their web browser, or they can use the Tableau Mobile app on their smartphones or tablets, to view and interact with your dashboards directly from the server. The second option we have, we can install the Tableau Server product at cloud service providers like Amazon AWS, Microsoft Azure, or Google Cloud. And then you can publish your dashboards there. And the same thing here, users can use web browsers or Tableau Mobile to access your work. The third option we have, you can use the Tableau Private Cloud Service. Here, you don't have to install any Tableau Server or anything. 
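The decision questions for choosing a development product, as walked through in the comparison lesson, can be written down as a small Python function. This is only a sketch of the flowchart as described in the lesson. The function name, the parameter names, and the purpose strings are my own, and the 15 million row limit is the figure quoted in the lesson, so check Tableau's current offering before relying on it.

```python
TABLEAU_PUBLIC_ROW_LIMIT = 15_000_000  # figure quoted in the lesson


def choose_development_product(
    purpose,
    needs_server_or_cloud_source,
    data_can_be_public,
    row_count,
    needs_live_connection,
):
    """Walk the flowchart questions in order and return a product name."""
    # Question 1: for which purpose?
    if purpose == "data_engineering":
        return "Tableau Prep"
    # Question 2: do we connect to servers, APIs, databases, or the cloud?
    if needs_server_or_cloud_source:
        return "Tableau Desktop"
    # Question 3: can the data be public?
    if not data_can_be_public:
        return "Tableau Desktop"
    # Question 4: more rows than Tableau Public can hold?
    if row_count > TABLEAU_PUBLIC_ROW_LIMIT:
        return "Tableau Desktop"
    # Question 5: do we need live connections?
    if needs_live_connection:
        return "Tableau Desktop"
    return "Tableau Public"


# A learner practicing with a small public CSV file.
print(choose_development_product("data_visualization", False, True, 10_000, False))
```

Every "yes, we need more" answer exits early to Tableau Desktop, so Tableau Public is only returned when all five questions allow it, which mirrors the order of the chart.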
You will get everything prepared by the Tableau team. You can immediately start publishing your dashboards there, and your users can consume them from Tableau Cloud. Now let's say you want to share your dashboards with everyone in the world and make them public. Then you can use Tableau Public Cloud. You don't have to install anything. You can immediately publish your dashboards there. And users all around the world can use their web browser to access your dashboards and data. But they cannot use the mobile app to access Tableau Public. And now to the last option, which I really don't like to use. If you want to share your reports with individual users, you can send them a Tableau file in the format TWBX, a Tableau packaged workbook, which contains your data plus your reports and dashboards. And then the users can view this file using the Tableau Reader software installed on their PC. All right, so with that, we have an overview of the sharing process and the different options on how to share your data. And next I will introduce you to three methods of hosting Tableau. 57. Tableau | Hosting Tableau: On-Prem vs IaaS vs Saas: All right everyone. So now in order to understand the real differences between Tableau Server and Tableau Cloud, we have to understand the back end details and some basic concepts about hosting servers. Let's go. Let's say we are a startup company and we want to host our own Tableau application and build the entire infrastructure. For that, there is a long list of tasks that should be done. Of course, the first thing that we need to do is to go and buy some hardware and configure it, like the servers that will run the applications. Each server needs storage as well. So we additionally have to provide storage infrastructure like some hard disk drives and SSDs. Servers need to be connected to the Internet as well. Therefore, we have to provide all the networking infrastructure too. Once we have all that stuff, we have all the hardware needed. 
The next thing that we need to do is to go and start installing and configuring some software. For example, we install an operating system like Windows or Linux, and a lot of other middleware. Once the operating system is in place, we have to install and configure the Tableau Server application. Once we have all the software and hardware ready and running, it's finally time to set up our Tableau projects. And we have to manage the following tasks. We have to start adding users to the Tableau Server and map them to the correct licenses. We have as well to create schedules and tasks to refresh our data inside Tableau Server, and then we have to start monitoring the Tableau jobs. All right, so now we come to the big question that we have to answer. Who will manage what? The first option: if you decide to manage all these layers, that means we are talking about the on-premises model. So it's clear ownership. You manage everything from top to bottom: the hardware, the software, and the project itself. But now, if you say, you know what, this is too much to manage, we don't have the money to buy all that stuff and hardware at the start, and we don't have the time to take care of it and maintain it, then you will start thinking about outsourcing the hardware, where you're going to buy a service from cloud providers like Microsoft Azure, Amazon AWS, or Google Cloud. Now they manage the hardware, and you manage both the software and the projects. And this is what we call infrastructure as a service, IaaS, the first letter of each word. But now if you say, you know what, our IT team is very small, we don't even have the time to keep that software updated. Each time Tableau makes a new release, we have to install a new version of Tableau Server, which is really wasting our time, and we are not able to focus on our core business projects. We don't have the resources to manage our own software. 
Then you start thinking about outsourcing the software layer. To do that, you can buy a service from Tableau. It's called Tableau Cloud, where the Tableau team is going to manage everything for you, both hardware and software. And this is what we call software as a service, SaaS. Okay guys, so now let's summarize and compare the three hosting options. The first point is about the hosting setup. In on-premises, you need Tableau Server installed on your organization's servers. In IaaS, you need Tableau Server as well, installed at a cloud service provider, for example Microsoft Azure. And in SaaS, you just buy the Tableau Cloud product. And now for the question, who manages what? In on-premises, you manage everything: the hardware, the software, and your projects. And there is no outsourcing. In IaaS, you manage both the software and your projects, and the cloud service provider manages only the hardware. In SaaS, you manage only your business projects, and Tableau manages both hardware and software. So now let's check the advantages and disadvantages of each service model. For on-premises, the good thing is that you have full control of everything, the hardware and the software, and your data remains behind your firewall. This is very important if you have critical or sensitive information that should not be stored outside of the company's firewall. But the drawbacks here: you need dedicated hardware and software administrators to deal with the maintenance, patching, and many other tasks. It is very costly. At the start of the project, you have to pay a lot for the hardware and the software, and it's not flexible. It's really hard to scale your hardware up or down as needed. Having to deal with all that stuff, you generally have less time for your business projects. All right. So now let's move to IaaS. The first advantage is that it gives you flexibility. You can scale the hardware up and down as the business needs, and there is no upfront cost for buying hardware. 
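The "who manages what" comparison of the three hosting options can be captured in a small lookup table. The Python sketch below only restates the summary from this lesson; the dictionary keys, the party labels, and the `layers_managed_by` helper are names chosen for illustration.

```python
# Who manages each layer under each hosting model, as summarized in the lesson.
RESPONSIBILITY = {
    "on_premises": {"hardware": "you", "software": "you", "projects": "you"},
    "iaas": {"hardware": "cloud provider", "software": "you", "projects": "you"},
    "saas": {"hardware": "Tableau", "software": "Tableau", "projects": "you"},
}


def layers_managed_by(model, party):
    """List the layers a given party is responsible for under a hosting model."""
    return [layer for layer, owner in RESPONSIBILITY[model].items() if owner == party]


for model in RESPONSIBILITY:
    print(model, "-> you manage:", layers_managed_by(model, "you"))
```

Reading the table from top to bottom shows the trade: each step toward SaaS removes one layer from your own list, which frees up time but also gives up control over that layer.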
But the downside of IaaS is that you still need administrators to manage your software and to do the installations and patching. And if you don't pay attention to the cost, you might end up paying big bills. Now let's move to SaaS. The main advantage of SaaS is that it allows your IT team to focus only on the core business projects and allows you to implement projects in a very short time. And the other good thing is that your software will always be up to date. The Tableau team is going to deal with that. But the downside of SaaS is loss of control. You will be at the mercy of the Tableau team. If anything bad happens, like security problems, all your organization's data might be compromised. And the other disadvantage is that you might have bad performance or networking issues connecting Tableau to your source systems. My advice here is that you should avoid reinventing the wheel. Always take advantage of services that do the things that are not part of your core business. Every hour you spend patching an OS, installing updates for your software, or replacing hardware is an hour not spent enhancing and refining your dashboards in Tableau. All right, so with that, we have learned the differences between those three methods of hosting Tableau. Next we will have an overview of Tableau Server and Tableau Cloud. 58. Tableau | Tableau Server & Cloud: All right everyone. So now we're going to do a deep dive into the Tableau sharing products one by one in order to understand the key features and as well the limitations of each one of them. And we start with Tableau Server and Tableau Cloud. As Tableau developers in organizations, we need to share our reports and dashboards with other colleagues in our organization. So we need to put those dashboards in a trusted environment or platform in our organization. And we usually have four requirements. The first requirement, it should be safe and secure. We want to control who is accessing our data and dashboards. Second, it should be easy to scale. 
Third, it should be robust, so that it can handle a huge number of users and a huge amount of data. And the last requirement, it should be powerful and deliver high performance. No one wants slow dashboards and reports. And now in order to build this trusted environment with these requirements, we have two Tableau products, Tableau Server and Tableau Cloud. And we have three hosting options: on-premises, IaaS, and SaaS. Tableau Server and Cloud are very similar. At the user interface level, you will not notice any differences. But if you are checking the back end level, there are big differences between them. So now first let's talk about the user interface level of Tableau Server and Tableau Cloud. Once you publish your dashboards to Tableau Server or Cloud, you can share them by providing links to the users across all departments in your organization. And then the users can access your dashboards using their web browser without installing any software on their end. And if you give them access, they can start exploring your data in Tableau Server or Cloud. You can manage your users by adding and removing them, and give them specific roles like admin, creator, viewer, or explorer. You can manage your users as well by adding them to groups. Another important task you can do in Tableau Server or Cloud is automating your tasks. For example, you can create a refresh schedule to refresh your data sources on a regular basis, like once a day. In Tableau Server and Cloud, you can monitor the tasks and schedules to check the status, whether the job failed or succeeded. And you can find many other statistics about the run time, the average, error messages, and so on. Not only can the users view the dashboards in Tableau Server or Cloud, but they can also create new ones. If you give the users enough rights, they can even start creating their own insights and views directly in their web browser without having to install Tableau Desktop. 
It's something we call self-service BI. All right, so that was a quick overview of Tableau Server and Tableau Cloud. Next we will talk about the free option, Tableau Public. 59. Tableau | Tableau Public: All right everybody. So now we have a clear picture of Tableau Server and Tableau Cloud. Let's talk about the other Tableau sharing products. Tableau Public is a free cloud service managed by the Tableau team. Everyone in the world can share visualizations on this platform. If you publish your dashboards to Tableau Public, everyone can access them, interact with them, and even download them. Tableau Public is like social media: you can edit your profile and add your personal information. In Tableau Public you have a huge gallery of vizzes built by people all around the world. It currently hosts over 5 million visualizations. If you are browsing and you find an interesting dashboard, like this amazing dashboard from Ajias, you can add it to your favorites, and then you can check what other vizzes Ajias has created and published to Tableau Public. And like on any other social media, if you like her content, you can follow her to see her new updates. If you are inspired by one of her dashboards, you can download the whole workbook to see how she built these amazing dashboards, with all the details. With that, you are expanding your knowledge of Tableau development. So using Tableau Public, you can get inspired by others and get connected to other Tableau developers from all around the world. And one more cool thing about Tableau Public: if you are searching for a new job and you want to flex your data visualization skills, you can publish a lot of work to Tableau Public and link it in your CV so that companies can see how skilled you are in Tableau. All these nice features make Tableau Public a very attractive platform for sharing visualizations.
But if we talk about the security aspects, it is very limited. The only things you can control: you can disallow downloading of your visualizations, or you can hide them completely from others. But you don't have any user access control like we have in Tableau Server or Cloud. To summarize, Tableau Public is a free cloud service from Tableau. It hosts a lot of reports and dashboards built by people all around the world. It's a great platform to get inspired by the Tableau community, build connections to other Tableau developers, and share your skills. But since it's free, it comes with a few limitations. First, the total size available for each account is only 10 gigabytes. Second, your dashboards and reports are not connected to the source systems. That means you cannot automatically refresh your data in Tableau Public; you always have to do it manually. So you open the report, refresh the data, and publish it again to Tableau Public. And the third limitation of Tableau Public is that, as the name implies, everyone in the world can see and share your data. That means you cannot use it in organizations, since you cannot protect your data. All right, so that's all for now about Tableau Public. Next we will cover Tableau Reader and Tableau Mobile. 60. Tableau | Tableau Reader & Mobile: Tableau Reader is software you download and install on your PC. You can use it only to view reports and dashboards; you cannot use Tableau Reader to create any data visualizations or even edit them. As you can see, we don't have any tools or functions to create charts. You can't even connect to any data sources or refresh your data. Tableau Reader is a very old tool from Tableau. It was created in the early days of Tableau in order to share content built using Tableau Desktop. This was before Tableau Server and Tableau Cloud were even available. At that time, Tableau Reader was the only option you had to share dashboards and reports with other users.
So how does it work? You build data visualizations using Tableau Desktop and then you send a file to someone else. Then they use Tableau Reader to view and interact with the dashboard that you built. To summarize, Tableau Reader is a free tool. It is just for viewing and interacting with reports and dashboards built using Tableau Desktop. You cannot create or edit anything in Tableau Reader. You cannot refresh the data inside your dashboard using Tableau Reader; each time you want fresh data, you have to ask for a new copy. And there are no security features, no password protection or login option. This is a big problem. If the file lands in the wrong hands, your organization's data could be exposed. Well, I don't recommend using this tool in organizations at all. The risk is just too big. If you want to take the risk and share your visuals with one, two, or three people, then use it, but try to avoid it. Tableau Mobile is a free mobile app that you can download on your smartphone or tablet. You can use it to view and interact with Tableau reports and dashboards published to Tableau Server and Tableau Cloud. You can use it only to view the reports; you cannot use it to create new reports or to edit existing ones. While Tableau Mobile is free to download, it requires a license to use, and it can only access Tableau Server and Tableau Cloud. So you cannot use it to access Tableau Public. Tableau Mobile can also automatically cache your reports and dashboards in memory. That means you can access them even if you are offline. All right, so with that, we have an overview of all five Tableau sharing products. Next we will compare all five products side by side, and I will walk you through my decision-making process for choosing the right products for you. 61. Tableau | Tableau Server vs Cloud vs Public vs Reader vs Mobile: All right everybody. So now let's summarize and compare all the Tableau sharing products side by side.
The first point is about hosting. Tableau Server can be hosted in your organization or at a cloud service provider like Azure or Amazon. Both Tableau Cloud and Tableau Public are hosted by the Tableau team. Tableau Reader is just software installed on your PC; you can't host it at all. Now, if we talk about cost: for Tableau Server, you have to pay for licenses, hardware, and maintenance, but for Tableau Cloud you only have to pay for the licenses. Tableau Public and Tableau Reader are free to use. If you check the data security aspects, both Tableau Server and Tableau Cloud are highly secure. Tableau Public and Tableau Reader are not. The next point is about storage limitations. In Tableau Server, it really depends on the server's disk space. In Tableau Cloud and Tableau Reader there are no limitations. But in Tableau Public, the total size available for each account is only 10 gigabytes. The next point is about the connectors. Tableau Server and Cloud can be connected to different types of sources like cloud services, APIs, files, databases, and so on. But Tableau Public and Tableau Reader cannot be connected directly to any of your source systems. Let's jump to the next point, automation. In Tableau Server and Cloud, you can schedule tasks to refresh the data inside your dashboards automatically from the source systems. But the data inside Tableau Public and Tableau Reader cannot be refreshed automatically. You have to do it manually: you have to republish the workbook or resend the file. The next point is about Tableau Mobile: you can connect your smartphones or tablets only to Tableau Server or Tableau Cloud. Now to the last point. We use Tableau Server and Cloud to share dashboards inside organizations, Tableau Public to share dashboards with the whole world, and Tableau Reader to share dashboards directly with individuals. All right, with this, we have an overview of all the Tableau sharing products. Now the question is: when to use which product?
Let me guide you through my decision-making process, following this chart. All right. First we ask all the questions about the limitations of Tableau Public. The first question: can the data be public? If the answer is yes, then we ask the next question: should the data be frequently refreshed in the reports and dashboards? If the answer is no, then you can go and use Tableau Public. But if the data should not be public, or it should be refreshed automatically, then we have to think about private hosting. Now the question is: do you want to manage the hardware? If yes, then you can use Tableau Server on-premises at your organization. If you don't want to do that and you want to outsource it, then you ask the next question: do you want to manage the software on your own? If the answer is yes, then you can again use Tableau Server, but this time it's going to be hosted at a cloud service provider like Microsoft Azure in an IaaS model. But if the answer is no, you don't want to manage the software yourself and you want to outsource it, then you can go and use Tableau Cloud as a SaaS service. As you can see, Tableau Reader is not in my decision-making process, since I don't recommend it at all. Now, if you combine this flow chart with the one that we built previously for the developer tools, you get the whole decision-making process that I usually use when I start a new Tableau project. So if somebody asks you when to use which Tableau product, you can go through it and find the right combination for you or for your company. You can find all those materials on my website. All right everyone. So with that, we have covered all eight Tableau products and we understood the differences between them. In the next chapter, we will learn the Tableau architecture to understand how Tableau works internally and what its main components are. 62. Tableau | Section: Tableau Architecture: Tableau architecture.
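Looking back at the sharing-product flow chart from the previous lesson, the questions can be written down as a small function. This is only a sketch of the questions as they were asked in the lesson; the function name, the parameter names, and the returned labels are made up for illustration and are not part of any Tableau tool.

```python
def choose_sharing_product(data_can_be_public: bool,
                           needs_automatic_refresh: bool,
                           manage_hardware: bool,
                           manage_software: bool) -> str:
    """Walk the flow chart from the lesson, top to bottom (hypothetical helper)."""
    # Tableau Public only fits if the data may be public AND manual refresh is fine.
    if data_can_be_public and not needs_automatic_refresh:
        return "Tableau Public"
    # Otherwise we need private hosting: who manages the hardware?
    if manage_hardware:
        return "Tableau Server (on-premises)"
    # Hardware is outsourced: who manages the software?
    if manage_software:
        return "Tableau Server (IaaS, hosted at a cloud provider)"
    # Everything is outsourced.
    return "Tableau Cloud (SaaS)"


print(choose_sharing_product(False, True, False, False))
```

Tableau Reader never appears as a result, which matches the lesson: it is left out of the decision process on purpose.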
Now we're going to understand how Tableau works internally, its components, and its limitations. We're going to cover many important Tableau concepts, like what live and extract connections are and what the different file types in Tableau are. Then we can start drawing the Tableau Desktop architecture. After that, we're going to jump to Tableau Server in order to understand different scenarios like the publish process, the authentication process, and the access-view process. Then we're going to complete the big picture by drawing the server architecture and its components. And at the end, we're going to cover the architecture of Tableau Public as well. So let's start with the first concept, live and extract data connections. Let's go. 63. Tableau | Live vs Extract: In this section, you will learn the Tableau architecture to understand how Tableau works internally and what its main components are. You will learn some important concepts, and we will start with the data source connection types, live and extract. Now we come to the most important decision that we make inside the data source: do you want to store an extra copy of your data inside Tableau? Here we have two designs for the data source. Either you say, no, we don't need a copy inside Tableau; the data should stay where it is, in the source systems. Then what happens? Each time a visualization needs data, it sends queries directly to the external database, and the database sends the results back to your visualization. The data always comes fresh from the sources directly to your dashboards. This type of connection we call a live connection. Or you say, yes, let's have a copy of our data inside Tableau. A snapshot or subset of the data is going to be copied from the external database to Tableau. This copy we call an extract.
Now, each time our visualization needs data, it's going to send queries, this time to the extract instead of the external database. And then the extract is going to return the results back to your visualization. Since the extract is inside Tableau and very close to the visualizations, we get great response times and very fast performance. This type of connection we call an extract connection. All right, now the question is: which connection type should I use in my data sources? The typical answer to this question is, well, it depends, because here we have a trade-off between performance and data freshness. For example, if performance is way more important to you than data freshness, then you have to go with the extract. Since the data is going to be stored inside Tableau, in memory, using the column-store technique, you will get great performance. But if you say, you know what, data freshness is more important to me than performance, then you have to go with live connections in your data sources, because you will always get fresh data directly from the sources in your dashboards. All right, so that was a quick overview of the two data connection types in Tableau, live and extract. Next we will learn the different types of files that you can generate in Tableau. 64. Tableau | Tableau File Types: All right, so if you want to send Tableau files directly to the users, we have to ask the question: which type of file are we going to send? Because in Tableau we can generate not just one file type, we can generate five different types of files. So now we're going to have a quick overview of those file types to understand them and to know when to use them. All right. As we learned, the Tableau workbook contains three things: the extract, the data source, and the visualizations. There is a file type for each combination, depending on your requirements. For example:
If you want to share only your data without anything else, no data source, no visualizations, then you can send an extract in the Hyper format (.hyper). But now suppose you say, you know what, I've done a lot of work in the data source. I built a data model, I renamed stuff, I did aggregations, I created a lot of new columns. I would like to share that with my team, with my colleagues, but I'm not allowed to share my data with them. In this situation, you share only the data source with your colleagues, without data. We call it a Tableau Data Source (.tds). Or you might be in another situation where you say, you know what, my colleagues don't have access to the source systems, so they cannot use the live connection, and you don't mind sharing your data as well. Now you can send them a package of an extract and the data source. The file type here is called a Tableau Packaged Data Source (.tdsx). This type of file contains both your data and your data source. We might be in another situation where our colleagues or users are interested in the visualizations as well. We can send them a file with the visualizations and the data source. Here again, we have the same situation: you decide whether you send the data with it or not. If you don't want to send the data inside it, you can send a file called a Tableau Workbook (.twb). And the last scenario, I think you already guessed: if you want to send everything, the whole package, the extract, the data source, and your visualizations, then you can send your colleagues a Tableau Packaged Workbook (.twbx). All right, so as you can see, Tableau has different types of files for different purposes. Depending on the situation or the scenario that you have, you can share your work with your colleagues. All right, so generally speaking we have two different types of workbooks.
A workbook with data, using an extract connection, and a workbook without data, using a live connection. On the one hand, with the workbook with data, you can send three different types of files: you can send only the data using the .hyper format, or send the data source together with the data using the .tdsx format, or send the whole package with the .twbx format. On the other hand, with the workbook without data, you can send only two files: the data source without data (.tds) or the workbook (.twb). Now you might ask: okay, which Tableau products should I use in order to open these Tableau files? Well, we have three Tableau products: Tableau Desktop, Tableau Public, and Tableau Reader. With Tableau Desktop, you can open everything, all these different Tableau formats and files. But with Tableau Reader and Tableau Public, you can open only the Tableau Packaged Workbook (.twbx), since Tableau Reader and Tableau Public cannot connect directly to the data sources and cannot use live connections. All right, one more thing to understand about the Tableau workbook is that Tableau stores two different types of information for it. The first one is the metadata, which is stored in XML files. Metadata is data about your data: it describes your data. It contains all the information about what you have done in the workbook. Anything you click, drag and drop, or do while working with Tableau Desktop will be reflected in some way in the metadata. There you can find information like column names, data types, the data model, and so on. The second type is the data itself, the actual data. If you load data into Tableau, Tableau stores it in the format of a Hyper file, where the data is stored using a column-store method in the memory of Tableau. It is a special format for fast data retrieval. All right everyone. So with that, we have learned the purpose of the different types of files in Tableau and when to use them.
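The five file types above can be summarized as a small lookup table. The sketch below just encodes what was said in this lesson (which of the three parts each file carries, and which product opens it); the class, the function names, and the table layout are illustrative assumptions, not a Tableau API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableauFileType:
    extension: str
    has_extract: bool          # the data itself
    has_data_source: bool      # connection info, data model, renames, calculations
    has_visualizations: bool   # worksheets, dashboards, stories


FILE_TYPES = [
    TableauFileType(".hyper", True, False, False),  # extract only
    TableauFileType(".tds", False, True, False),    # data source without data
    TableauFileType(".tdsx", True, True, False),    # packaged data source
    TableauFileType(".twb", False, True, True),     # workbook without data
    TableauFileType(".twbx", True, True, True),     # packaged workbook
]


def pick_file_type(share_data: bool, share_data_source: bool,
                   share_visualizations: bool) -> str:
    """Return the extension whose contents match exactly what you want to share."""
    for ft in FILE_TYPES:
        if (ft.has_extract, ft.has_data_source, ft.has_visualizations) == (
                share_data, share_data_source, share_visualizations):
            return ft.extension
    raise ValueError("no Tableau file type carries exactly this combination")


def can_open(product: str, extension: str) -> bool:
    """Per the lesson: Desktop opens everything, Reader and Public only .twbx."""
    if product == "Tableau Desktop":
        return True
    return product in ("Tableau Reader", "Tableau Public") and extension == ".twbx"


print(pick_file_type(True, True, False))
print(can_open("Tableau Reader", ".twb"))
```

Combinations that no file type covers, such as visualizations without a data source, raise an error instead of guessing.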
And next we will do a deep dive into the Tableau architecture to understand the desktop components. 65. Tableau | Tableau Architecture: Desktop Components: All right, if you understand the Tableau architecture and how the components are connected to each other, everything is going to make sense as you work with Tableau, and it's going to make you a better Tableau developer as well. I will be sketching the concepts in order to make them easier to understand. So let's go. The Tableau architecture contains four different layers: the source layer, the desktop layer, the server layer, and the consumer layer. We will unbox each layer one by one to understand its components, and we're going to work through this architecture from left to right. So we start with the source layer and we end with the consumer layer. All right, so now we have the source layer. The source layer is outside of Tableau and it contains the sources of our data. Our data could be in databases like MySQL or Oracle, or in files like Excel and JSON, or in the cloud like Amazon AWS or Microsoft Azure, or even behind APIs. Our data could be anywhere. All right, so now back to the big picture. Let's jump to the next layer and unpack the desktop layer. The first component in Tableau Desktop is the data source. Before you start building your visualizations, you must set up the data source. The first thing we do inside the data source is connect Tableau to our data. Tableau offers around 90 different data connectors, so we can connect Tableau to almost anything. Once you build the connection between Tableau and your source of data, the access information is stored inside the data source: for example, the path of the file, the location of the servers, usernames, passwords or access tokens, and so on. All this information is stored inside the data source.
All right, so the two types of data connections in data sources are extract and live connections. Now that we are connected to the data and have decided on the type of connection, the next thing we have to do in the data source is start building our data model. We do that by combining tables using relationships, joins, and unions. And you can do many other things, like setting the right data types, doing aggregations, renaming tables and columns, creating new calculations and filters, and so on. All right. To summarize, the data source component in Tableau contains the following information. We have the data connectors to connect Tableau to our data. We have the access information, where the locations of our sources are stored. We decide whether we load an extra copy of our data into Tableau, which we call an extract connection, or leave it as a live connection. And last, we have the data model inside the data source, where we can combine tables, do aggregations, or do other customizations. All right, so once we are done with the setup of the data source, we have the connection, whether extract or live, we have our data model, and everything is ready. Now we can start building our visualizations. Tableau organizes the visualizations in three levels. The first one is the worksheet. We use the data available in our data sources to build a single view, only one visual. It could be a bar chart, a pie chart, or a table view. And as you can see, each worksheet is connected directly to a data source. But in Tableau, you can also build a worksheet from two different data sources by using a very powerful combining method called data blending. This is a very distinctive feature of Tableau, where the data in one visual can come from different sources.
Once we have these different worksheets, we can go to the next level, where we combine them into one dashboard to show the different visuals in a single view. But keep in mind, if you want to make any changes to the visuals, you have to go back to the worksheets and make the adjustments there. Now we come to the last level: the stories. As you know, the main goal of data visualization is to tell a story. So you can build a sequence of worksheets or dashboards that work together to tell the users a story based on your data. All right, now you might ask me which visualization level is the right one for you. Well, if you have only one visual, then go with the worksheet. If you want to build some KPIs to monitor a process, then build a dashboard. And if you want to present your data and tell a story from it, then go and build a story. All right, now we have both the data sources and the visualizations in Tableau Desktop, and these two components are contained in something called a Tableau workbook. Now the question is: after you're done building your data sources and visualizations, what can you do with the workbook? Well, you can share it with your colleagues in your team or department. And there are two ways to do that. Either you send a Tableau file directly to the users, or you publish the workbook to Tableau Server or Cloud, and from there your users and your team can access your workbook. All right, back to the big picture, the Tableau architecture. Let's talk about the layer on the right side, the consumer layer. There are different ways to consume Tableau visualizations, depending on the users' clients and on the tasks the users do. We start with a very small group of users who might use Tableau Reader to view and interact with Tableau visualizations; they usually don't want to edit or create something new. For this group of users, we're going to send a Tableau file.
As we learned, they're going to need a Tableau Packaged Workbook (.twbx). We might have another group of users, usually your team colleagues, who want to build analyses on top of your work. They're going to use Tableau Desktop to do that. To them we can send any kind of Tableau file, depending on their requirements and their tasks. And then we have a big group of users or consumers who access Tableau Server or Cloud to view and interact with Tableau visuals. They can use their web browsers, like Google Chrome and Firefox, to access the content of Tableau Server, and from there they can view, interact with, and even edit the visualizations if they have enough permissions. Or they can use the Tableau Mobile app on their smartphones or tablets to view and interact with your workbooks, but they cannot use it to edit a Tableau visualization. For this group of users, you will not send any files. First, you have to publish your work to the server. And here we have two options: either you publish only the data source, or you publish the whole workbook to Tableau Server or Cloud. After that, you share the link to your workbooks with the users. Now to the last group of users worth mentioning: the static users. You can always export your data and visuals from Tableau Desktop and send them directly to the users as a PDF or Excel file. Of course it's static and they cannot interact with it. All right, so far in the Tableau architecture, we talked about the source layer, we did a deep dive into Tableau Desktop and its components, and we understood the different types of consumers and clients. In the next step, we will start talking about the Tableau Server architecture. But first, in order to make it easier to understand, we will go through three different scenarios. And we will start with the publish process. 66.
Tableau | Publish Process: All right, previously we started sketching the Tableau architecture, where we learned about the source layer, the desktop layer, and the consumer layer. Now we're going to unpack the server layer in order to better understand the Tableau Server components. I'm going to walk you through three scenarios from the user's point of view: what exactly happens in Tableau Server when we publish a workbook, when we log into the server, and when we access a workbook. Let's go. Let's say that you want to publish a Tableau workbook with an extract. What's going to happen? Tableau Desktop sends a request to the server to upload the packaged workbook (.twbx). The first component in Tableau Server that receives the request is the gateway. The gateway knows how to forward the request to the right server component. In this situation, the right component to process the publishing is the application server, so the gateway forwards the request to it. As we learned, the Tableau workbook holds two different types of information: the metadata, stored in XML files, and the data itself, stored in Hyper files. In Tableau Server, those two types of files are stored in two different places. The application server sends the XML file to be stored in the server component called the repository, and the Hyper file is stored in another component called the file store. So what have we learned so far? The gateway is responsible for forwarding the request to the right component. The application server is the one that handles the publish process. The repository stores the XML files, the metadata of the workbook. And the actual data, the Hyper file, is stored inside the file store. All right, so that's all for this scenario. Next we will talk about the authentication workflow in Tableau Server. 67. Tableau | Authentication Process: All right, so now our workbook and our data are published to Tableau server.
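To make the publish scenario from the previous lesson concrete, here is a toy model of the flow: the gateway forwards the request, the application server splits the workbook, the metadata goes to the repository, and the extract goes to the file store. Everything in it (the class, the method names, the request dictionary) is invented for illustration; it is not how Tableau Server is implemented and it does not use any Tableau API.

```python
class ToyTableauServer:
    """A deliberately tiny stand-in for the components named in the lesson."""

    def __init__(self):
        self.repository = {}   # workbook name -> XML metadata
        self.file_store = {}   # workbook name -> Hyper extract bytes

    def gateway(self, request: dict) -> str:
        # The gateway only decides which component handles the request.
        handlers = {"publish": self.application_server_publish}
        return handlers[request["type"]](request)

    def application_server_publish(self, request: dict) -> str:
        name = request["workbook_name"]
        # Metadata (XML) is stored in the repository.
        self.repository[name] = request["metadata_xml"]
        # The extract (Hyper) is stored in the file store, if the workbook has one.
        if request.get("extract_hyper") is not None:
            self.file_store[name] = request["extract_hyper"]
        return f"published {name}"


server = ToyTableauServer()
result = server.gateway({
    "type": "publish",
    "workbook_name": "sales_dashboard",
    "metadata_xml": "<workbook name='sales_dashboard'/>",
    "extract_hyper": b"\x00\x01",  # placeholder bytes, not a real Hyper file
})
print(result)
```

A workbook that uses a live connection would simply arrive without an extract, so only the repository entry is written.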
It's time now for our users to log into Tableau Server and start interacting with our dashboards. So let's see how this is going to work. Let's say your manager is Michael Scott, and Michael wants to check your sales dashboards in Tableau Server. To do that: "I need a username, and I have a great one." Once Michael enters this information, a request is sent to the server as an HTTP request. The first thing it hits is the gateway. The gateway knows that the application server is the right component to handle the authentication process, so the gateway forwards the request to it. Then the application server asks the repository to check whether the credentials, username and password, are correct and whether Michael has permission to access our server. The repository checks, and if everything matches and Michael is allowed to access our server, it responds to the application server and says, yeah, we know the guy, he is in our records. Then the application server starts building the server UI and sends it back to the gateway, and the gateway sends it back to Michael's browser. Now he is inside our Tableau Server. So what have we just learned from this process? Again, the gateway is responsible for forwarding the request to the right component. The application server is the one that handles the authentication process. The repository stores the user credentials and whether the users have access and permissions to our server. And the application server is the one that renders the web interface of the server. All right, so that's all for this process. Next we will talk about what happens in Tableau when we access a workbook to view the data. 68. Tableau | Access View Process: All right, so now Michael is inside our Tableau Server and he's going to start browsing and searching for your sales dashboard. Once he finds it, he's going to click on it and try to access your dashboard.
So now let's see what's going to happen in Tableau Server. As usual, the HTTP request for accessing the view is generated and sent to the server. We know by now that the gateway receives the request and forwards it to the right component, the application server. Then the application server starts rendering the chrome around the viz: all those icons and images that are not inside the dashboard itself. And then the application server says, okay, now we are talking about visualizations. This is completely out of my league. We have to forward this request to the master, to the brain. That is the VizQL Server. It is the one that deals with visualizations. From here, the VizQL Server takes over. It says, okay, first things first, let's check whether this guy, Michael, is allowed to see the sales dashboard. The VizQL Server asks the repository. In the repository, there is a list of users and reports, so it searches there for a match. If there is one, it sends back: yeah, Michael is the boss and he's allowed to see the sales dashboard. And now the VizQL Server says, all right, now we need data. First we need the metadata of the dashboard. As you know, after we publish the workbook, the metadata is stored inside the repository. So the VizQL Server requests one more thing from the repository: the XML file of the dashboard. The repository then sends the XML back to the VizQL Server, and the server starts building the dashboard. All right, so now the VizQL Server says, okay, we have the dashboard, but the problem is that it is empty. We need the data to fill it. And it's better to ask our data specialist, the data server. The data server is the one that knows everything about the data. It says, all right, for this dashboard, we already have part of the data inside Tableau Server. But the other part is sadly outside of Tableau.
To get the data that is inside Tableau Server, from the extract, the data server sends a query request to the data engine. The data engine knows how to query and extract the needed data from the file store. It gets the data from the file store and sends it back to the data server. And now we come to the part where the data lives outside of Tableau Server. Here, the data server acts as a proxy. It uses the data connectors to connect to the external databases. Once the connection is established, it sends a query that matches the language that the database speaks, and the database returns the needed data as a raw table. Now, once we have all the needed data inside the data server, it combines it and does another security check. The data server checks: is Michael allowed to see all the data, or should we filter it? The data server filters the data depending on the data security setup that you have made. Then it sends the raw data back to the VizQL Server. Once the VizQL Server has the raw data for the dashboard, it does the magic of turning all those numbers and raw data into images and visuals, and it puts them inside the workbook. So now, finally, the VizQL Server has everything it needs. The sales dashboard is complete and ready. The VizQL Server sends it back to the gateway, and the gateway sends it back to Michael's web browser. Michael can start interacting with the dashboard now. Well, hmm. Does Michael have any idea what to do with the sales dashboard? "I declare bankruptcy." All right. I know there was a lot of stuff going on in this scenario, but we have covered most of the Tableau Server components. So let's have a summary and understand what we have learned so far. As usual, the gateway is responsible for forwarding the request to the right component.
The application server is not responsible for the visualization process; the VizQL Server is the one that is responsible for building the visualizations. The repository stores information about permissions and security: which users are allowed to access which dashboard. And the data server is going to manage both the extract and the live data sources. And the data engine is responsible for retrieving the data from the extract inside Tableau. And the data connector is going to help the data server connect to the external sources. And the VizQL Server does the magic of transforming the raw data into visuals. All right, so far with those three scenarios, we have covered the most important components of Tableau Server. Now we're going to put all the pieces together into the Tableau architecture and start explaining them one by one. Let's go. 69. Tableau | Tableau Server Architecture: In this video, you will learn about the Tableau Server architecture. And then we're going to do a deep dive into each server component of the architecture to understand how it works and what it does. And we start right now. The server layer consists mainly of three things: two interfaces, left and right, and in the middle a bunch of server components. The left interface is the data connectors. They're going to connect the external source systems to the Tableau Server components. On the right side, we have the gateway. It's going to receive requests from different clients and connect them to the Tableau Server components. All right, so now let's go into more detail about the gateway component. On one hand, we have requests coming from different clients, like a login request from a web browser, or a publish request from Tableau Desktop. And on the other hand, we have different Tableau Server components like the application server, the VizQL Server and so on. And the gateway is going to be in the middle, as the one that knows how to forward the requests from different clients to the right server components.
And the other task of the gateway is load balancing. Let's say that you are working in a multi-node environment where you have two nodes. When the gateway receives the first request, both nodes are free, so it's going to forward it to node number one. But now, if the gateway gets a second request, it's going to say, oh, node one is busy. Let's process this request in node number two since it's free, and so on. All right, so the gateway in Tableau Server is like a distributor that knows everything. You know someone like that. Let's just say I know a guy who knows a guy who knows another guy. So the gateway has two tasks. First, it routes the client requests to the right component. And second, it does load balancing if you are running Tableau Server in a distributed environment. All right, so now we're going to start talking about those Tableau components in the middle. In Tableau Server there are different types of components. We have servers, we have engines and storages. And we're going to start with the servers. As you learned, in Tableau Server there are different processes: the login process, publishing, accessing a workbook, and so on. And in Tableau Server, they designed different servers for different processes. Let's start now with the application server. The application server is responsible for different processes. As we learned, a user login request is going to be forwarded to the application server. Then the application server is going to check with the repository or an Active Directory, depending on your configuration, to find out if the user is allowed to access the server or not. The other process the application server handles is the publish process, where the application server is going to get the publish request and split the workbook into two files: the XML file to be stored in the repository and the Hyper file to be stored in the file store. One more task for the application server is to render the server interface.
All those little things that you find in Tableau Server like icons, images, projects, menus: it is the application server that renders them. So the application server is responsible for different processes like the authentication and authorization process, the publish process, and rendering the server interface. But one process that the application server will never do is the visualization process. All right, now we're going to jump to the next server. We have the VizQL Server. This one's going to be interesting. All right, so previously we talked about the power of visuals and how the human brain transforms text into visuals and images. The VizQL Server is like our brain. It can add the magic by converting numbers and text into visuals and images. VizQL stands for Visual Query Language for databases. The founders of Tableau, Chris and Pat, invented this language. Let's say that you drag and drop something in Tableau. The VizQL Server is going to convert this action into an SQL query and then send it to the data server to get the data. Then the data server is going to send the results back to the VizQL Server as raw data. Now the VizQL Server is going to do the magic by converting that raw data into visuals and images presented to your clients. All right, so the VizQL Server is the brain. It is a very important Tableau component and is mainly responsible for the visualization process. It does two things. It generates queries from user actions, and it converts and transforms the raw data into visuals and images. All right everyone, so now we're going to talk about the third one. We have the data server. The data server is the one that knows everything about the data. It knows where to find the data, how to connect to it, and how to speak to it. The first task of the data server is to manage both extract and live data sources. If the data is inside Tableau, it can send query requests to the data engine. But if the data is outside Tableau, it can use the data connectors to send query requests to the external sources.
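The choice the data server makes between the data engine and the data connectors can be sketched in a few lines of Python. This is only an illustration of the idea described above, not Tableau's real API: the `DataSource` class and the function names `query_data_engine`, `query_via_connector`, and `data_server_fetch` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    """Hypothetical description of a published data source."""
    name: str
    kind: str  # "extract" = stored inside Tableau Server, "live" = external database


def query_data_engine(source: DataSource, query: str) -> str:
    # Stand-in for the data engine reading an extract from the file store.
    return f"data engine -> extract '{source.name}': {query}"


def query_via_connector(source: DataSource, query: str) -> str:
    # Stand-in for a data connector talking to an external database.
    return f"data connector -> external source '{source.name}': {query}"


def data_server_fetch(source: DataSource, query: str) -> str:
    """Route the query the way the data server is described to do it."""
    if source.kind == "extract":
        return query_data_engine(source, query)
    if source.kind == "live":
        return query_via_connector(source, query)
    raise ValueError(f"unknown data source kind: {source.kind!r}")


print(data_server_fetch(DataSource("sales_extract", "extract"), "SELECT * FROM orders"))
print(data_server_fetch(DataSource("crm_db", "live"), "SELECT * FROM customers"))
```

The point of the sketch is that the caller, here standing in for the VizQL Server, never needs to know where the data lives; it only asks the data server.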
And the data server knows how to speak to the sources. It acts like a proxy to the data sources and can speak many different database languages, so that it sends query requests in a language that the database understands. Another task of the data server is to handle data security. It checks if a user is allowed to see the data and does filtering if needed. And the data server manages driver deployment as well. So the data server is the central data management component in Tableau Server and the one that knows how to get data from the sources. All right, so now let's jump to the next component. We have the data engine. If we decide to store our data inside Tableau as an extract, then the data engine is going to be the one dealing with it. Different components can send requests to the data engine. For example, the data engine can receive a request from the application server to publish a new extract. Then the data engine is going to execute a create operation to create a new extract and store data inside it. The data engine can also receive a query request from the data server asking for data. What is going to happen here? The data engine is going to find the correct extract. It's going to connect to the hard drive and then pull the needed extract from it. And at the end, the data is going to be sent back to the data server. And finally, the data engine can receive a request from the backgrounder to update the content of an extract. The data engine is going to execute an update operation by opening the extract and updating its content with the new data. So the data engine in Tableau is like any other database engine. It does different operations: it queries the data, it performs insert and update operations, it creates new extracts, but only for the data inside Tableau Server, inside the extracts. Okay, the next component is the repository. As you might have already noticed, the repository was involved in every Tableau process. So let's talk about it.
The repository stores many different types of data. For example, it stores the workbooks that we published to the server, but only the metadata part, not the data itself. The XML files from the workbooks are stored inside the repository. In the repository we find the usage data as well. It's data that's going to help you understand the performance and the traffic of your project. For example, you can find the total number of active users inside Tableau Server and the total view count by day, and you can find out the most used data sources in your project. Another type of data that you can find inside the repository is the security information. For example, which users are allowed to access your content or which users are allowed to access our Tableau Server. All right, so as you can see, in the repository there are different types of data, and it contains huge amounts of data in Tableau Server as well. But it's very important to understand that the data inside our dashboards and reports is not stored inside the repository. We have many other Tableau Server components that are worth mentioning. For example, the cache server: it stores almost everything, like images, icons, results of queries, dashboards and so on. So if you open a dashboard that has already been accessed before, the data is going to be pulled from the cache server. Another component is the backgrounder. In Tableau Server, you can create a schedule to refresh the data inside your extract. And the task of the backgrounder is to check this schedule every 10 seconds and then trigger the process of refreshing the extract when the time comes. And the last component that I would like to mention here is search and browse. The users of Tableau Server can search for content. This component is responsible for searching inside the repository and returning the results to the users. All right everyone, finally we have the last puzzle piece, the server components.
If we put it in the architecture, we get the whole big picture of the Tableau architecture. Now let's do a very quick summary. The source layer is the one that is outside Tableau and contains our data, and it could be anywhere, like databases or files. In the desktop layer, the developers can start connecting Tableau Desktop to the data sources, either by copying the data inside Tableau using an extract connection or with live connections to the sources. Then they are going to start building visualizations using worksheets, dashboards and stories. And both the data source and the visualizations together we call a workbook, and we can either send it as a file or share it to the server. The server layer is going to host our workbooks, and there we can find many components, like the data connectors to connect our sources to Tableau Server, and the gateway to connect the client requests to Tableau Server. And we have the application server responsible for the login and publishing processes, the VizQL Server responsible for the visualization process, and the data server responsible for the data management. We have another component, the data engine, that's going to handle the extracts. In Tableau Server, we have three places where the data is going to be. We have the repository that contains many different kinds of data, like the XML of the workbooks and the security objects, but not the data itself, because our data is going to be stored inside the file store as an extract. And we have the cache server that contains many different types of data to increase Tableau's performance. And the last one is the consumer layer. Here we find the different groups of users and clients, like the Tableau Reader users, who need only the TWBX files directly from the Tableau developers, and another group of users who are going to use Tableau to develop new views. And we have the static readers who are going to receive files like PDF and Excel.
And then we have a big group of users who are going to access Tableau Server using either the web or Tableau Mobile to interact with the published workbook. All right everyone, one more thing that I would like to show you is this amazing dashboard from the Tableau team. It's going to show you the different components inside Tableau Server and how they interact to do a task. For example, if we go to the workflow or the process, we can select, for example, access a view. And then we're going to select whether it's a published extract or live. Over here we have a slider. If you drag it to the end, you're going to see how the components interact with each other to do the task. And on the right side you will see a description for each step. And this is a really great way to learn how Tableau Server works. I learned a lot from this for this tutorial, so make sure to check it out if you want to see more details about other processes in Tableau Server. I'm going to leave the link in the tutorial materials. All right guys, so that's all for the Tableau Server architecture and its components. Next we will learn the Tableau Public architecture and what the limitations of Tableau Public are. 70. Tableau | Tableau Public Architecture: Let's start with the source of our data. In Tableau Public, you can only connect to files like CSV, JSON, Microsoft Access, and Google Sheets. The next component is Tableau Public Desktop. It is the free version of Tableau Desktop. It's software that you can download and install on your PC. So here we start by connecting Tableau Public to our files by creating a data source. In the data source, we have only one type of connection. It is the extract. The data has to be copied from our files and loaded inside Tableau Public Desktop. There is no live connection option. And then after that, we're going to start building our visualizations, or as we call them, vizzes.
Now once we are done building the views and the dashboards using Tableau Public Desktop, we have only one option to share them. That is to share the whole workbook, your data and the vizzes, to Tableau Public. Tableau Public is a free platform hosted by the Tableau team to share visualizations with the whole world. Once our vizzes are published to Tableau Public, they can now be consumed by users all around the world. And here we have a few options. The users can use their web browsers to view and interact with your visualizations, or users can download the whole workbook, your data and the vizzes, in different formats like the Tableau file (TWBX), Excel, PDF, images and so on. The last option for consuming your vizzes is that they can be embedded into your websites and blogs. Okay, now since Tableau Public is free, it comes with a few limitations. At the source level, we can connect Tableau Public only to files. The data connectors are very limited, and we cannot connect, for example, to servers. And in the next level, at the Public Desktop level, there are limitations too. In the data source, we have only one type of connection, and that is the extract, so we cannot have live connections to the sources. And the workbook itself can contain a maximum of 15 million rows, and we cannot save the workbook locally on our computer. The only option to share it is to publish it to Tableau Public. But there is a workaround for that. I'm going to show that in the next tutorial. All right, so now let's move to the sharing level, to Tableau Public. Here we have a few limitations as well. For example, the total available size for each account is only ten gigabytes. And there is no way to refresh your data automatically. Each time you need new data, you have to manually republish the workbook with the new data. And the third one: it's going to be public, so there is no way to make it private and share it with only a few people. You always have to publish it to the whole world.
Now let's move to the final level. We have the consumers. The only limitation here is that you cannot use Tableau Mobile to access and interact with the visualizations. All right everyone, I decided to use Tableau Public in this Tableau course since it's free, and all of you can follow me with the examples without having to pay for extra licenses. And the limitations that we have in Tableau Public are not really relevant for the learning process. The main features of Tableau, the data visualizations that we have in Tableau Desktop, are all available in Tableau Public as well without any limitations, so don't worry about it. All right everyone. So with that, we have learned the Tableau architecture and its components, and we learned how Tableau works internally. And with that, we have covered the theory parts of Tableau. In the next section, we will start preparing your environment so you can practice Tableau with me during the course. So let's jump in. 71. Tableau | Section: Prepare Your Pc: We're going to prepare your Tableau training environment. In order to learn Tableau, you should not only watch the videos, you have to practice with me. And that's why now we're going to prepare your environment so that you can work along with me. And of course, don't worry about it. Everything is free. So we'll start by downloading and installing Tableau, then we're going to create a Tableau Public account. And after that, in order to make sure that everything is working, we're going to create our first visualization. And then we're going to publish it to your Tableau Public account. And at the end, since maybe it's your first time starting Tableau, I'm going to take you on a quick tour of the Tableau interface. So now let's start with the first step, downloading and installing Tableau. So now let's go. 72. Tableau | Download & Install Tableau: All right, let's start with the first step.
We're going to download Tableau Public Desktop. In order to do that, we're going to go to the website public.tableau.com. I'm going to leave the link in the description. From there, we're going to find the menu Create, and then we can click on that. Then we have Download Tableau Desktop Public Edition. Let's click on that. And then we're going to go to the middle and click on Tableau Public. Now before the download starts, we have to fill out this registration form. This is not for creating a public account; it's just something to fill in before the download starts. We're going to give the first name, last name, email, and country. And then we're going to click Download the App. And then the download is going to start. It is just 500 megabytes, so it should not take a long time. Now the download is done. Let's click on the executable file to start the installation process. Okay, at the start of the installation, we are at the welcome page. As usual, we have to read and accept the terms, so you have to do that. And here we have a second box. You can click on it if you don't want to send the product usage data to the Tableau team. It's like cookies. I don't mind. I'm just going to leave it. So we now click Install. Once you do that, the installation is going to start. It should not take a long time. Okay, so now the installation is done and Tableau is going to be launched automatically. All right, so with that, we have done the first step, where we have successfully downloaded and installed Tableau Public on your PC. And next we're going to create a Tableau Public account, where you can share and publish your work. 73. Tableau | Create Tableau Public Account: Okay, so let's go back to the website public.tableau.com, and on the right side at the top, we're going to click on Sign In. And then we have to click on Join Now for Free. And now we have to fill out this registration form in order to create a new Tableau Public account.
So we have to enter the name, the email, the password, and the country. And then we have to read and agree to the terms. And let's click here: I am not a robot. And at the end, you're going to click on Create My Account. And now we got the message to verify our account. That means we have to check our email in order to activate our account. So let's do that. Okay, so now after checking, I got an email from Tableau. So I'm going to click on it. And then I'm going to click on Verify Now in order to activate our account. So I'm going to click on that, and then it's going to send me to my account. And with that we have a brand new active Tableau Public account. Well, it's like any other social media account. You can add your personal information. For example, we can add our photo or avatar. So let me check what I can do over here. I have this photo from the Stuttgart Television Tower. It's amazing there. And then I'm going to click Save. We can add many other things. Let's click on Edit Profile. As you can see over here, you can link your social media accounts or add your websites and so on. So let's click Save now. All right, so with that, you now have a Tableau Public account, but it's still empty; we don't have anything inside it. Next we will get the training datasets, and I'm going to explain to you the data model behind them. 74. Tableau | Get Training Datasets: If you want to learn any new tool like Tableau, Power BI, or any programming language, you always need a good dataset for training and practicing. I started searching for good training datasets, and after a lot of research, I downloaded many, many datasets. But I was not happy with them. I didn't like them because they don't cover all the scenarios that we need for training. Let me tell you why this is an issue. In real projects, your data is typically going to be stored in data warehouses or data lakes inside many, many different tables.
The first step in any visualization tool like Tableau or Power BI is to connect those tables and combine them in one big data model. Training with only one table is not going to help you and prepare you for real projects. That's why I decided to make my own datasets, to cover all the training scenarios and to have multiple tables in order to learn how to combine them in one data model. And of course, you can use my datasets to learn anything else, like SQL, Python, Power BI, and so on. So let's see what I've prepared for you. All right. The first thing we're going to do is go to the link in the description. And then you're going to land on my website, where I've collected all the course downloads and materials on one page. So for example, you're going to download the training datasets here. We also have some important links here: the free cheat sheets and many sketch notes that I have prepared for this course. And then you're also going to find, for each section, the important links and sketches, as well as the Tableau files. This link is going to be available for you after the course as well. So you can always come back here and download the stuff that you need, and of course for free. But now what we're going to do is download the training datasets that we need for our course. Here as you can see, we have two zip files, one for the non-EU and one for the EU. So if you are currently in Europe, you're going to download the EU datasets. But for all other countries, you're going to download the first one, the non-EU training datasets. And now you might ask, what is the difference between them? Well, it's about the decimal numbers. Since in our datasets we have different decimal numbers, like the sales, and in different countries we have different representations of decimal numbers. So all the European countries use, for example, the comma to separate the decimal from the whole number.
But in many other countries, like the USA or in Asia, we have the dot in order to separate the decimal from the whole number. And if you are using the wrong format, what's going to happen? Tableau will not understand that this field is a decimal number, and it's going to convert it to a string. Now, depending on your location, go and download the datasets. For me, I'm in Germany, so I'm going to go with the second one. And as I said, it depends on your location. Let's click on that. Next, I'm going to grab the zip file and put it somewhere safe. I don't want to leave it in the Downloads folder, so I'm just going to create a safe path for it and then start extracting the data. Okay, now let's unzip the file. So I'm going to extract all of them. Okay, so now let's go inside it and check the data. Here we have three different datasets. The first dataset, the Tableau project sales dashboards, we're going to use in the last section once we start building our projects. Then we have two other datasets, the big dataset and the small dataset. We're going to use these two datasets in the whole course. The small data source and the big data source are very similar. So let's open both of them and see what we have inside them. As you can see, we have almost the same tables: customers, orders, products and so on. So they are almost identical. And now you might ask me, why do we have two datasets? Well, we have many different types of calculations and functions. For example, some calculations are going to change the data at the row level, and it's better to have a small dataset in order to understand their results easily. On the other hand, we have calculations like aggregations at the table level, and there it's better to have a lot of data in order to understand how they work. That's why I decided to have two datasets, in order to cover all those scenarios.
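To see why the decimal separator matters, here is a minimal Python sketch of the behavior described above, where a number written in the wrong format ends up as text. It only mimics the effect; it is not how Tableau actually parses files, and the function name `read_number` is made up for this illustration.

```python
def read_number(text: str, decimal_sep: str = "."):
    """Return a float if `text` parses under the expected separator, else keep it as a string."""
    # With a comma as the expected separator, rewrite it to the dot that float() understands.
    normalized = text.replace(",", ".") if decimal_sep == "," else text
    try:
        return float(normalized)
    except ValueError:
        # The value does not look like a number in this format, so it stays text.
        return text


# A sales value from the EU file, read with the matching (comma) setting:
print(repr(read_number("12,5", decimal_sep=",")))
# The same value read with the dot setting is not recognized as a number:
print(repr(read_number("12,5", decimal_sep=".")))
```

This is why the course offers two versions of the same datasets: the values are identical, only the separator differs.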
Another thing about the datasets is that the file type is CSV. We have only one JSON file over here. So you can use either Tableau Public or Tableau Desktop in order to follow me in the course. All right, so now I'm going to walk you through the data model of our datasets. Here we have three typical tables. Our datasets contain information about the superstore use case. It is simply sales transactions of customers ordering products from a company. It's classic and very easy to understand. The first table in our data model is the customers table. It contains all customer information, such as the names of the customers, their locations, and their scores. In the small dataset, we have five customers, and in the big one we have around 800 customers. The second table in our data model is the orders. It contains all the orders placed by the customers. So we have information like the order date, sales, quantity, and profit. In the small dataset, we have ten orders. And in the big dataset we have around five years of data. And that's really helpful once we start building clusters. The third table in our data model is the products. It contains all the products that we find inside our superstore. So we have information like the product name, category, and subcategory. In the small dataset, we have only five products in the categories monitors and accessories. But in the big dataset, we have more than 2,000 products with categories and subcategories. All right, so now we have those three tables, but we have relationships between them as well. For example, there is a relationship between the orders and customers. They can be connected using the customer ID. And if you check the orders and products, you can find another relationship between them, because you can find the product IDs in both tables. And with that we can make a relationship between the orders and products. All right. Okay, so I have left all of this information on my website.
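Since the first half of this course covers SQL joins, the data model described above can be tried out with a small sketch using Python's built-in `sqlite3` module. The table layout follows the description (orders linked to customers by customer ID and to products by product ID), but the exact column names and the sample rows are assumptions made up for this example; the real CSV files may use different names.

```python
import sqlite3

# In-memory database with a tiny, made-up version of the three tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, first_name TEXT, country TEXT);
CREATE TABLE products  (product_id  INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
CREATE TABLE orders    (order_id    INTEGER PRIMARY KEY, customer_id INTEGER,
                        product_id  INTEGER, sales REAL);

INSERT INTO customers VALUES (1, 'Maria', 'Germany'), (2, 'John', 'USA');
INSERT INTO products  VALUES (101, 'Monitor', 'Monitors'), (102, 'Mouse', 'Accessories');
INSERT INTO orders    VALUES (1001, 1, 101, 250.0), (1002, 2, 102, 20.0), (1003, 1, 102, 20.0);
""")

# Combine the three tables through the two relationships:
# orders.customer_id -> customers.customer_id and orders.product_id -> products.product_id.
rows = conn.execute("""
    SELECT o.order_id, c.first_name, p.product_name, o.sales
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    JOIN products  AS p ON p.product_id  = o.product_id
    ORDER BY o.order_id
""").fetchall()

for row in rows:
    print(row)
```

In Tableau, the same two relationships are defined visually on the data source page instead of being written as joins.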
On my website you can also find all the links to the datasets that I found during my research. So you can go there and check them if you want. All right, so now with that, we have everything. We have the tools, we have the data, we have the accounts. Next we will build our first visualization in Tableau and publish it to our new Tableau Public account. 75. Tableau | Publish First Viz: Okay everyone, so let's start Tableau Public Desktop, if you don't have it open already. And then on the start page, we're going to go to the left menu to connect Tableau to our data. So click on Text File, and now we're going to find our file, the Customers CSV that we just downloaded. And now we can see the customers data inside Tableau. Let's move to the worksheets. I'm going to click on the orange tab over here, Sheet 1, to create a new worksheet. And now we're going to build our visualization in Tableau. We only have to drag and drop from the left side. Let's drag and drop the country to the columns. Let's get another one. Let's move the count to the rows. All right, so that was it. We have our first viz. And here you can see in this visual how many customers we have in each country. With that, we are done building the workbook, and now it's time to share it. Sadly, in Tableau Public we cannot save it locally on our PC, but I'm going to show you a workaround later. Now the only option that we have is to publish it to our new Tableau Public account. Okay, now in order to do that, let's go to the main menu over here. Then click on File. And then we're going to click on Save to Tableau Public. The first time, you have to sign in with the Tableau Public account that we just created. All right, now let's click on Sign In. And now we have to give it a name, and I'll call it My First Viz. And once you click Save, Tableau Public Desktop is going to start publishing our workbook to Tableau Public.
Once it's done with the publishing, a web page is going to open automatically, directly showing your viz in your public account. Here's our viz. Let's go back now to our home page. And as you can see over here, we have our first viz published to Tableau Public. Let's go inside it again. Now everyone in the world can see your viz, interact with it, and even download it. Let's see how we can download it. There is a download icon over here, so click on that. And now you can select the file format that you want. Let's select the last one, Tableau Workbook, so click on that and then click Download. And now we get the Tableau file, the TWBX, which has our data and our visualizations inside it. So if you open it, you can see our work again. And this is the workaround that we can use in order to save our work locally on our PC in Tableau Public. All right, so with that, you have published your first viz to your new Tableau Public account. And next I'm going to take you on a quick tour of the Tableau interface and its three main pages, and we're going to learn how to navigate through Tableau. 76. Tableau | Tour of the Interface: Now I remember in 2014, the first time I opened Tableau, I was overwhelmed by all the icons and parts that we have in the Tableau interface, and navigating through the Tableau pages was very confusing for me at the start. And that's why I'm going to take you on a short tour of the Tableau interface. So let's go. Okay, so now let's start Tableau. Now the first thing that I want to show you is that the whole thing, the whole file, we call a workbook. And the workbook is like any other book. It contains different sheets. And the Tableau workbook contains three main pages. We have the start page. It is the main page where you can connect your data to Tableau. And then we have the data source page. It is the place where you can connect and combine your tables together and make changes to the metadata, like renaming columns and so on.
And the third page, where you're going to spend most of the time, is the workspace page. It is the place where you're going to build your data visualizations. All right, so now we're going to learn how to navigate through those pages and how to switch between them. Okay, once you start Tableau, you will be on the welcome page, the start page. Now if we want to go to the data source page, we have to connect to something. Let's go again to the left side over here, Connect to Text File, and then select our file Customers and open it. Once we do that, we're going to land automatically on the data source page. Now if we want to go back to the start page, we're going to go to this Tableau icon over here on the left side. If we click on that, we're going to go back to the start page. If we want to go back to the data source page, we're going to click on the same icon. Click on that again, and we are back on the data source page. With this icon, we can always go back to the start page of Tableau. All right, now let's see how we can go to the workspace page. In order to do that, we're going to go to the bottom. Over here you will find different tabs. The first one is always the data source tab. This is exactly where we are now, at the data source. But now if we select the sheet, Tableau is going to take us to the workspace page. If you want to go back to the data source page, there are two ways to do that. First, we can stay at the bottom over here and select the data source tab. By clicking on that, we go back to the data source. And the second option is in the Data pane. If you go to the left side over here, you can see our data source Customers. And if you double click on it, we're going to go back to the data source page. Okay guys, that was it; this is how you can navigate through the Tableau pages. Let's now have a quick overview of each page. Okay, let's start with the first page, the start page. We can see here three panes: Connect, Open and Discover.
In Connect we can find all the different types of data connectors. In Tableau Public we have around ten. That's enough for the training. But in Tableau Desktop we have over 90 data connectors. Now in the middle, we have Open. Once you start Tableau for the first time, this section is going to be empty. But as you start creating new workbooks, Tableau is going to start showing you the most recently opened workbooks. And this is a really nice way to have quick access to our workbooks. Here, we have only one, the first viz that we published before. And on the right side you will find Discover. You will find different stuff from the Tableau team like blogs, news, training tutorials, and so on. And now at the bottom, you can see information about the Tableau software. For example, now it shows that we can upgrade to Tableau Desktop. Later, once Tableau releases a new version, you will find information here to update your Tableau. But since we just installed the most recent version of Tableau, it doesn't show it. Okay, so that was it for the start page. Let's jump now to the next one. We have the data source page. By now, you should know how to go there by clicking on the Tableau icon. Okay, what do we have here on the data source page? On the left side, you can find all the information about our data. In Connections, you can find the connection information, and in Files you can find all the tables that are inside our data. And then in the middle we have the data source name. And then over here we have the area where we're going to build our data model. It contains two layers, the logical layer and the physical layer. I'm going to explain that in the next tutorials. Don't worry about that. Beneath that, we have the data grid. It's going to show us a sample of our data, and by default it's going to show the first 1,000 rows of data. And on the left side we have another grid. This is the metadata grid. It shows us more details about the tables' fields. All right, so that's all for now.
We're going to move now to the next page, the workspace page. And we can do that by selecting the sheet tab. Okay, on the workspace page, we're going to spend most of our time building our visualizations. That's why we have a lot of icons and stuff around. So let me quickly guide you through this interface. Okay, so we're going to start at the top. We have the toolbar. It contains a lot of icons, and those icons are the most frequently used functions in Tableau. As you are building your visualizations, you have quick access to those functions. As you might already notice, there are some functions that are not selectable. Well, you have to understand here that in Tableau, if something is grayed out, that doesn't mean that this feature is not available in Tableau Public. It means it is not relevant for the visual. For example, if I go over here, this is going to sort the visual, and since I don't have anything yet, it's not relevant to sort it. Let's check the other icons. We have the Tableau icon, which is going to take us to the start page. You know that already. We have undo and redo for the last action in the visual. And as you can see, as I'm hovering over an icon, Tableau is going to give me a short description of the function. Here we can create a new data source, over here we can create a new worksheet, and so on. So just hover over all the icons and you will see their functions. All right, now let's move to the left side. We have here two panes, the Data pane and the Analytics pane. By default, Tableau is going to show us the Data pane. But if you want to go to the Analytics pane, just simply click on it. You can switch between them by just selecting them. Let's see what we have here in the Data pane. The first thing is the data source that contains our data, and below that we can find the tables inside this data source. We currently have only one table, the customers. And we can see over here the fields or columns inside our tables. And here we have a search field as well.
Sometimes our data source gets really big and we're going to have a lot of fields, so this is a really nice way to search for a specific field. Okay, so now let's go to the Analytics pane. You can find over here predefined functions that you can add to your visual, like adding an average line or doing clustering, or you can even create your own reference line. Really nice stuff. Okay, so now I'm going to switch back to the Data pane. All right, so now let's move to the middle. You can find over here different shelves and cards. We're going to use them in order to build our visualizations. And everything works here with drag and drop. So let's start with the first one, the Rows and Columns shelves. The visuals in Tableau have two dimensions, the rows and the columns, like any other table. If you put fields on the Columns shelf, it's going to create the columns of the table. And if you put fields on the Rows shelf, it's going to create the rows of the table. Easy stuff. So now let's have an example. Okay, so let's go to the left side and drag and drop the countries on the columns. With that, we define the columns of the visual over here. Now we need something on the rows. Let's take the count and drag and drop it on the rows. And with that, we have defined the visual's columns and rows. If you want to swap them, you can go to the toolbar over here and click on this icon. You can switch between them very easily, even if you have a lot of columns. I'm going to switch back. And now we can add more columns or more rows. For example, let's take the city and drag and drop it on the columns over here. You can have multiple fields. Now if you want to remove one of those columns, you can do that by dragging and dropping it on the empty space. Okay, let's move to the Pages shelf. You can use it to split the current visual into a series of pages, if you want to analyze something step by step and take it slowly. Let's have an example.
Okay, let's take again the customer count and drag and drop it on the pages. You can see on the right side we have a new window to control the pages. Now we are at the first page, where we have the countries with only one customer. If we click over here on the right side, you will get the countries with two customers, and so on. Now for the next example, I'm going to remove it. So I'm just going to drag and drop it in the empty space. All right, so let's move to the next shelf. We have the Filters. You can use it in order to filter our visual. For example, let's take the countries and drag and drop them in the filters. Now you can decide here which countries are going to stay and which countries are going to leave the visual. For example, let's remove France and click Apply. You can see our visual no longer contains the country France. Now I'm going to remove it again from the shelf by dragging and dropping it in the empty space. Then we have the Marks card. You can use it in order to design the visual. For example, we can add new colors. If we drag and drop the countries on top of Color, we will get a color for each country. Or we can change the size of the bars, making them small or big, or we can add labels, and so on. Okay, now let's move to the middle. Here we have our view. It contains the visualizations, or as we call them, vizzes. First we have the title, and you can change it by double-clicking on it. Let's give it a name, for example, customers by country, and then click OK. Below that, we have our visualization, and it contains different things. For example, we have the headers, here the countries, and we have the axis as well. The intersections between those fields are the marks. Those marks could be bars, like in this example, or lines, circles, or any other shape. Now if we check the bottom of the Tableau interface, you can find the status bar. It contains a lot of details about our visual. For example, it says we have three marks.
Of course, we have three bars. We have one row and three columns. The total number of customers is five. Now let's add more stuff to the visual to see how those stats change. Let's take the scores and drag and drop them in the rows. You can see here we now have six marks: we have six bars, two rows, and three columns. Those stats are really important once your visualizations get complicated. Now we have a very simple one; we can count and see we have six bars. But if we have a lot of dots and a lot of points, it's really hard to count them. So it's really nice to check the status bar to see details about our visual. All right, now let's move to the right side and go to the Show Me icon. Select that. Now you will get the different visualizations that Tableau offers. By just clicking on them, you're going to switch the whole visualization in our view. We can switch it to tables, to pie charts, to treemaps, and so on. Now just go and explore those different visualizations. You might have already noticed that some of them are grayed out, so we cannot use them here. Again, they are available, but we don't meet the requirements to use them. For example, if you go to the line chart here, Tableau tells you what the requirements are, what Tableau needs in order to build this visualization. It needs one date. It doesn't need any dimensions, and it needs at least one measure. Currently, Tableau cannot create it because we don't have any date field in our view. All right everyone, those were the main components of the worksheets. Now, before we go to the dashboard, I'm going to do a few things. You can follow me. Okay? I'm going to undo those visualizations and go back to the bar chart. And then I'm going to create a new sheet. So I'm going to click over here to create a new worksheet. And then I'm going to take the countries. And this time I'm going to take the scores over here. And then I'm going to use the pie chart over here. I'm going to put some labels on it.
Okay, that's enough. Let's go now to the dashboards. We can do that by creating a new dashboard with the icon over here. Now we are in the dashboard interface. I'm not going to explain everything over here. It's just important to understand that in the dashboard we can start combining different sheets in one place. We can drag and drop sheet number one, where we have the customers by country. Then we can take sheet number two and just place it somewhere over here. Then I have two visuals in one place, sheet number one and sheet number two. This is the main job of the dashboard. All right everyone. Now I'm going to show you the last type of sheet we have, the story. In order to create a new one, we're going to go to the bottom over here and click on this icon. And with that, we have created a new story. Stories in Tableau are like a sequence of visuals, and we usually use them for presentations, if you want to tell a story from your data. All right, what do we have? Over here on the left side, we have the visuals that we created. We can see the worksheets and the dashboard as well. And then over here we can add new story points. In the middle, we have in this section something like a navigator to go through our story. And then here we're going to present the story or the views. In the first one, we can drag and drop the dashboard. Let's do that now. We can add a next step by adding a blank one over here. Then we're going to take sheet number one, and then we can add a new blank one and then sheet number two. So now we have a story. It starts with the big picture, with the dashboard. And as we go through the story step by step, we go into more detail in each visual. It's a really nice way to present or to tell a story using our visuals. All right, so now we have the Tableau software installed. We have the two training datasets, the public account to share your work, and everything is ready to start learning Tableau.
So with that, we have finished this section, where we have prepared your environment to practice Tableau. And in the next section, we will do a deep dive into the Tableau data source to learn how to build a data model in Tableau by combining tables.

77. Tableau | Section: Data Modeling: Data modeling in Tableau. Each successful dashboard or chart in Tableau is based on a solid data model, and having data modeling skills is essential for each Tableau or business intelligence project. So that's why we're going to start by learning the fundamentals of data modeling, including the star schema and the snowflake schema. Then I'm going to introduce you to Tableau data modeling, where you're going to learn the physical and the logical layers. Then we're going to learn the different methods for combining tables in data modeling: joins, unions, relationships, and data blending. And of course, in order to understand the differences between them, we're going to compare them side by side, and I'm going to guide you on when to use which method. And at the end, you're going to go and build two data sources based on our training datasets. So let's start with the first topic, where we're going to understand the fundamentals of data modeling. Now let's go.

78. Tableau | Concept of Data Modeling: In real projects, your data is typically going to be stored in data warehouses or data lakes, inside many, many different tables. The first step in any visualization tool like Tableau or Power BI is to connect those tables and combine them in one big data model. Let's start with the question, what is data modeling? Data modeling is the process of organizing and representing data in a clear and understandable way. Each data model has entities. Entities are things like customers and products, or events like orders. Inside those entities we have information, and we call these pieces attributes, like the first name and the last name inside the entity customers.
And we describe in the data model how those entities are connected or related to each other, and we call these connections relationships. This data model, this visual representation of the data, makes it easier for us and for programs to understand the data, which is really important for making decisions and improving the performance of the business. All right, so we have three different types of data models at different levels of abstraction. First we have the conceptual data model. This type is a high-level representation of the data model without going into detail on how the data model is implemented. It's like a map that shows the important entities and the relationships. We usually use this type to explain the data model to business analysts and stakeholders so they understand the big picture of the data. The second type is the logical data model. In this data model, we go into more detail on how the data is structured and organized. We define in this model the attributes of each entity, and it includes constraints and more details about the relationships between the entities as well. This data model is usually used by database designers and developers as a blueprint for the implementation. And the third type is the physical data model. This type represents the actual implementation of the data model. It includes all the technical details about how to store the data, like the data types of the attributes, the primary and foreign keys, indexes, and so on. This data model is used by developers to create and manage the databases. All right, so let's summarize. The conceptual data model shows the big picture of the data. The logical data model provides a blueprint for the implementation. And the physical data model shows how the data is implemented in the databases. Tableau adopted both the logical and physical data models in its data sources, but we don't have a conceptual data model in Tableau. Don't worry about it. I will show you more details later.
All right, so now for analytics, and especially for data warehousing and business intelligence, we need special data models that are optimized for queries and for analytics. They should be flexible and easy to understand. And for that we have two special data models. The first one is the star schema. A star schema has a central fact table surrounded by dimension tables. The fact table contains events and the dimensions hold descriptive information. The relationships between the fact and the dimension tables form a star shape, and that's why we call it a star schema data model. The second one we call the snowflake schema. It is very similar to the star schema, but the dimensions here are broken down into sub-dimensions. Normalized tables or dimensions means that those tables are broken down into small pieces to avoid having big tables or big dimensions, which lead to a lot of data duplication and slow performance. The shape of this data model looks like a snowflake. The star schema is a simple and easy-to-understand data model, and we usually use it if our dataset is small or medium. On the other hand, the snowflake schema is more complex, but it eliminates the duplicates and reduces the storage space. We usually use it if we have a large dataset. All right, so the datasets that I've prepared for this Tableau course use the star schema data model, just to keep it simple and easy to follow. All right, our data model has a name, and we call it star schema. If you're going to work on real projects, you're going to hear about the star schema a lot. A star schema has mainly two types of tables, facts and dimensions. For example, we have the table customers. It describes each customer by their first name, last name, country, and so on. So customers is a dimension table. And we have another dimension table in our data model, the products. The products table likewise describes each product by its name and category. It is a dimension as well.
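The star schema idea can be made concrete with a small sketch in Python. This is only an illustration under assumptions, not the course dataset: the rows, the exact field names, and the `sales_by` helper are made up. It shows two small dimension tables (customers, products) and one central table of events that points to them by key, so that a number stored with the event can be summed up by any descriptive attribute.

```python
# Dimension tables: descriptive information, keyed by ID (rows are made up).
customers = {
    1: {"first_name": "Maria", "country": "Germany"},
    2: {"first_name": "John", "country": "France"},
}
products = {
    10: {"name": "Tires", "category": "Accessories"},
    11: {"name": "Gloves", "category": "Clothing"},
}

# Central table of events: keys to the dimensions, a date, and numbers.
orders = [
    {"order_id": 1, "customer_id": 1, "product_id": 10, "order_date": "2023-01-05", "sales": 50},
    {"order_id": 2, "customer_id": 1, "product_id": 11, "order_date": "2023-01-09", "sales": 20},
    {"order_id": 3, "customer_id": 2, "product_id": 10, "order_date": "2023-02-14", "sales": 30},
]


def sales_by(dimension, key_field, attribute):
    """Sum the sales of all orders, grouped by one attribute of a dimension."""
    totals = {}
    for order in orders:
        # Follow the key from the event row to the matching dimension row.
        label = dimension[order[key_field]][attribute]
        totals[label] = totals.get(label, 0) + order["sales"]
    return totals


print(sales_by(customers, "customer_id", "country"))
print(sales_by(products, "product_id", "category"))
```

The same `orders` rows answer both questions; only the dimension that is looked up changes. That is the main appeal of the star shape for analytics.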
All right, so now let's talk about the second type of table in the star schema. We have the facts. For example, let's have a look at the big table in the middle, where we can see three things. First, you can see a lot of keys to the other dimensions. We have the order ID, customer ID, and product ID. Second, we can see dates. We have the order date and the shipping date. And the third thing, we can see a lot of numbers. We have sales, quantities, and profits, and we call them measures. If you see those three things, that means we have an event or fact table. Facts connect dimensions together, and they have dates and measures as well. Okay, so to summarize, how do we decide if a table is a dimension or a fact? If you have a table that contains information about a person or an object, like employees, customers, or products, then this table is a dimension. And usually they are small tables. On the other hand, if you have a table that contains events, for example sales orders, logs, or ATM transactions, any table that has events or transactions and has time in it, it is a fact, and usually facts are really huge tables. Okay, so in our data model, in the datasets, we have two dimensions, the customers and the products, and in the middle we have our fact, the orders. All right, so now if you hear someone in your project talking about star schemas and so on, you know exactly what they mean. These are very important concepts in the analytics and BI world, whether you are using Tableau or Power BI. All right, so with that, you have learned some important concepts in data modeling. Next we will learn the Tableau data model and its two layers, the physical and logical layers.

79. Tableau | Tableau Data Modeling: Okay, once we connect our data to Tableau, we have to create a data model in our data source. If your data contains only one table, then your data model is very simple. You have a single table in your data model. But in real-life projects, things get more complicated, and you have multiple tables.
And Tableau here offers four different methods for combining and connecting your tables. We have relationships, joins, unions, and data blending. Now, before we start doing a deep dive into those methods, let's first understand data modeling in Tableau. In the Tableau data model, we have two layers. We have the physical layer, and on top of it we have the logical layer. In the physical layer, we might have a couple of physical tables, and we can combine them in Tableau using two methods, either joining the tables or using a union between them. Now let's move to the logical layer. It is the top-level layer, and it provides us with an abstraction that hides all the details in the physical layer. This is especially nice if we have a lot of tables in the physical layer. Once we are building our visualizations, we don't want to see all those tables in the physical layer, so the logical layer hides all those details. The result of merging the tables using join and union in the physical layer is going to be presented in the logical layer as a single flat table, and we call it a logical table. In this example, that means we're going to have two logical tables. The first one is going to represent three tables after doing the join. And the second one is going to represent two tables using the union. But in data modeling we still have to connect those two logical tables. In Tableau, we have only one method to do that, and we call it relationships. It's very important to understand that in the logical layer, we cannot merge tables into one table. After connecting the two logical tables using a relationship, the tables are going to stay as they are and nothing is going to be merged. We just describe the relationship between the two logical tables. Now back to those two layers. We can find both the physical layer and the logical layer inside the Tableau data source. And as you know, on top of the data source, we have our visualizations.
And you can see in this example only the tables from the logical layer. You can start building your visualizations using the data available from the logical layer. But sometimes, as you are working on projects, you build another data source with another data model. Here in this example, it's important to understand that not all logical tables come from physical tables. They could come directly from your source system. Now in order to build one visualization from both of the data models and data sources, we have to somehow connect those two data models or data sources. And we can do that at the visualization level, where Tableau offers us the last and very unique method of connecting and combining tables, something called data blending. By looking at this, you can see that Tableau offers us four different methods for combining and connecting tables in different layers and at different levels. In the physical layer, we have the joins and unions. In the logical layer, we have the relationships. And at the visualization level, we have data blending. All right, so now let's see how we can navigate through the physical and logical layers in Tableau. We are currently on the data source page, and by default we're going to be at the logical layer in the data model. So that means anything that we drag and drop into our data model is going to be considered a logical table. The customers is a logical table. Let's take another one. Let's take the orders and drag and drop it over here. So this is our second logical table. And as you can see, Tableau created a relationship between them, because at the logical layer we can only do relationships. So now we are at the logical layer. How can we go to the physical layer? In order to do that, we're going to go inside a logical table. Let's go to the customers and double-click on it. Once we do that, we're going to go to the second layer. We are inside the physical layer now.
Tableau is going to tell you over here that the customers is made of one table, because we have only one physical table. Now, anything that we drag and drop into the data model is going to be considered a physical table. For example, we can take the customer details and drag and drop it over here. And by default, Tableau is not going to create a relationship between them. It's going to create a join between those two physical tables. And of course we can do a union between them. In the physical layer, we can do joins and unions. As you can read over here, it says the logical table customers is made of two physical tables. If you hover over this icon, you will see exactly that: we have two physical tables that define the logical table customers. Now if you want to go back up to the logical layer, we can do that by just closing the physical layer. Let's click on that. Now you can see that the customers has a new icon. It says that in the physical layer there is a join, and we get more information if we hover over the table. It says the logical table customers is made of two physical tables, the customers and the customer details. That means the data in this logical table comes from the physical layer. But if we go to the orders over here, you will see no physical tables. The data comes directly from the original table. And with that, we have learned how to navigate through the physical and logical layers. All right, so with that, we have learned data modeling in Tableau and what the physical and logical layers are. Next, we will start learning how to combine tables in Tableau, and we will start with joins.

80. Tableau | Joins: All right, so let's start talking about joining tables. We usually have two tables, table A and table B. If we want to combine them in one big table, then we can use a join between them. The first thing to understand is that once we use a join between two tables, we have two sides.
Table A is going to be the left table and table B is going to be the right table. Now what's going to happen after we join the tables? All the fields from the left table will be at the output, and then all the fields from the right table will be added next to them. Joins combine the fields or the columns of two tables. Now, in order to do joins, we need two things. First, we need the key field. It is a field that you can find in both tables. And after that, we have to define the type of join. We have to choose between four different types of joins: the inner join, the left join, the right join, and the full join. If you know SQL, then you know those types. It's exactly the same logic. But let's have a quick example to understand the four types of joins. All right, now we have this example with two simple tables. We have the customers' names and the customers' ages. We want to combine them in one table, because it makes no sense to have two tables about the customers. We want to make one customer table. In the first table we have the IDs and the names, and in the second table we have the IDs and the ages. It's really easy. The key for this join is the customer ID. Now let's see the different outputs using those different types of joins. Let's start with the first type of join, the inner join. Inner join says the output is going to show only the matching rows from the left and from the right. That means any non-matching rows will not be presented at the output. Let's see how this works. The first thing that's going to happen is that we're going to combine the fields. We're going to start with the left side, then the right side. Now we're going to start matching the rows, starting from the left side. Do we have the customer ID one on the right side as well? We have a match; in both tables we have the customer ID one. So we're going to see it at the output, and then we proceed on the left side.
Do we have customer ID number two on the right side as well? You see, we don't have it. We have only customer number three. That means customer two is not matching on the right side and customer three is not matching on the left side. That was it. If you use an inner join in this example, you will get only the customer ID number one, since we find it in both tables. Let's go to the next one, the left join. Left join says we're going to have everything from the left table without checking anything, but from the right table we're going to have only the matching rows. If we do a left join between those two tables, we're going to have the following output. First we're going to have the fields from the left table and the fields from the right table next to each other. And then we're going to have all the customers from the left table without checking anything. Everything is going to be presented over here, those two customers. And then from the right side, we're going to have only the matching rows. That means, do we have the customer ID number one in the right table? Yes, we have it. Then we're going to have it at the output. But the customer ID number two, we don't have in the right table, which means it's going to be empty. Empty means nulls. Here we're going to have nulls in both of the right table's fields, the ID and the age. And that's it, this is the output of the left join. All right, so now we're going to move to the next one, the right join. You might already understand how it works. We're going to have all the rows from the right table and only the matching rows from the left table. Let's see what the output is going to be if we do a right join between those two tables. As usual, we're going to have all the fields from the left and from the right, and we're going to have all the rows from the right table without checking anything. We're going to have those two customers, and then we start matching from the left side. Do we have the customer number one?
Yes, we have it. We're going to add it over here. Do we have the customer number three? As you can see, on the left we have only customer two. That means we don't have the information and we're going to have nulls. Those are going to be empty. That's it. It is exactly the opposite of the left join. Now to the final type of join, the full join. Full join means everything from the left and everything from the right without missing anything. Let's see what's going to happen if we have a full join between those two tables. As usual, we start with the fields from the left and from the right, then we take everything from the left side. We take those two customers over here. From the right side, we're going to have the matching rows for those two customers. For the ID number one, we have this one, but for the two, we don't have any matching rows, so we're going to have nulls over here. But as you see, we don't have everything from the right side yet. The customer ID number three is missing. That's why, using the full join, we're going to add that information over here, and then we're going to match it from the left side as well. Do we have any customer number three on the left side? We don't have it, which means we're going to have nulls as well. Now by checking the output, you can see we have everything: all the data from the left, all the data from the right, and where there is no match, we have nulls. As you can see, you need to be really careful with the type of join you are using, because using the wrong one could cause you to lose data. If you want to be safe and you don't want to lose any data, then you have to use the full join. But sadly, full joins are very slow, and you're going to end up with very big tables, especially if both tables have a lot of non-matching rows. And now I want you to understand how joins work in Tableau, what's going to happen in the background once we join tables.
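The four join types from the customer example above can be reproduced in a few lines of Python. This is a minimal sketch, not how Tableau or a database implements joins: the `join` helper is hypothetical, and the names and ages are made up. Only the IDs follow the example (customers 1 and 2 on the left, customers 1 and 3 on the right), and `None` stands in for null.

```python
# Left table: customer names. Right table: customer ages. Both keyed by customer ID.
names = {1: "Maria", 2: "John"}   # names are made up
ages = {1: 30, 3: 45}             # ages are made up


def join(left, right, how):
    """Return rows of (left_id, name, right_id, age) for the given join type."""
    if how == "inner":
        keys = [k for k in left if k in right]                 # only matching rows
    elif how == "left":
        keys = list(left)                                      # everything from the left
    elif how == "right":
        keys = list(right)                                     # everything from the right
    elif how == "full":
        keys = list(left) + [k for k in right if k not in left]  # everything from both
    else:
        raise ValueError(f"unknown join type: {how}")

    rows = []
    for k in keys:
        # A side without a match contributes nulls for all of its fields.
        left_part = (k, left[k]) if k in left else (None, None)
        right_part = (k, right[k]) if k in right else (None, None)
        rows.append(left_part + right_part)
    return rows


for how in ("inner", "left", "right", "full"):
    print(how, join(names, ages, how))
```

Running it shows the inner join keeping one row, the left and right joins keeping two rows each, and the full join keeping three, with nulls wherever one side has no match.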
We have the data source, we have the visualizations, and inside the data source we have the physical layer and the logical layer. In the physical layer, we're going to join both of the tables A and B. Once we do that, Tableau is going to create one new combined table, AB, in the logical layer. This table we call a logical table, and it contains data from both tables. Then in the visualization layer, let's say we want to select the fields F2 and F4. Tableau is going to query the data source, and the data source is going to get the data from the new combined logical table AB and then send the data back to the visualization. You can see the interaction between the visualizations and the data source is going to be at the logical layer. The physical layer is going to be completely out of the picture. That's simply how joins work in Tableau. All right, now how can we do joins in Tableau? Let's say that we want to join the table customers with the orders. First we're going to go to the left side over here and drag and drop the customers. The join is going to be done at the physical layer, so we have to go there. Let's go inside the customers. Now we are at the physical layer. We're going to take the orders and just drag and drop it over here in the empty space. With that, Tableau by default is going to create an inner join between the customers and the orders. If we want to customize the join, we're going to go over here to the icon and click on it. We have two things to do here. First, we're going to define the type of join. As we learned, we have the inner, left, right, and full outer join. You can just click between them and see which data is going to be missing and which data is going to be presented, as in the example that I showed you. I'm going to stay with the inner join. The next thing that we're going to define is the key for the join. Tableau understood that there's a customer ID on the left and a customer ID on the right, and that this is the perfect match, which is correct.
But let's say it was wrong and you want to choose the correct key for the join. What you do is go to the left side over here and click on the arrow. You will get all the fields from the left table, and you select the correct one. In this example, Customer ID is correct, so I'm going to stay with it. Then you go to the right side. You have the same icon over here as well; you will get all the fields from the right table, and you select the one that suits you. One more thing: your join key doesn't have to be only one field, it can be multiple fields. You can add more fields over here: you go to the next row and select the next field for the join. But in this example, we have only one key. I'm going to close this. We have set up the join, and we're going to stay with the inner join. We can go back to the logical data model. As you can see, the table over here has a join icon. It tells us that this logical table is the result of joining two tables. That's it. This is how you can do joins in Tableau. All right, that's all for joins. Next, we will learn the second method: how to combine tables using union.
81. Tableau | Union: All right, so now let's talk about union. Let's say that we have two tables, and both of them have exactly the same columns. Sometimes it makes sense to combine them into one big table, and we can do that using union. Once we do a union, what happens? The columns and the rows of the left table are presented at the output. From the right table, only the rows are appended to the output, beneath the first table's rows. Union combines the rows of two tables. For the union to work correctly, we have two requirements. First, both tables should have exactly the same number of fields, and second, the fields should have exactly the same data types. So as you can see, we don't need a key between those two tables. It's not like the join. All right, so now let's have a quick and very simple example of union.
We have here two very simple tables, the orders of 2022 and the orders of 2023, and as you can see, both tables have exactly the same structure. We have two columns, ID and Date, in both tables, and it makes sense to merge them into one table, which we call Orders. So if we do a union between them, what happens at the output? It starts from the left table and takes the fields first, ID and Date. Then it takes all the rows from the left side and puts them in the result. Now, from the right table, we will not take the fields again, because we already have them from the left table. It takes only the rows and appends them at the end of the table. So it takes the two orders, 3 and 4, and just puts them beneath the table over here. And that's it. It's very simple and easy. It just needs exactly the same number of columns, or fields, and exactly the same data types. Now let's understand how union works in Tableau and what happens in the background once we do a union. We have our layers here again, and union is very similar to join. In the physical layer, we have our tables A and B. Once we do a union between them, Tableau creates a new combined logical table that combines the rows of both tables. Then, at the visualization level, let's say that we take the field F1. Tableau sends a query to the data source, and the data source asks the logical table for the data. Once Tableau gets the data from the data source, it is presented in the visualization. As you see here again, the interaction is between the visualizations and the logical layer. All right, now let's see how we can do a union in Tableau. We're going to work with the two tables Orders and Orders Archive. Both of them have exactly the same number of fields and exactly the same data types as well. In order to do that, we're going to take Orders and drag and drop it on the logical layer.
But you know, we can do a union only in the physical layer, so we have to go inside Orders. Double-click on it, and now we are at the physical layer. Let's take the second table, Orders Archive. Instead of dropping it in the white space, where Tableau would create a join, which we don't want, we want to create a union, so just drag and drop it beneath the table. As you can see, Tableau says "Drag table to union"; just place it beneath. Tableau does a union between those two tables. And as you can see, there are two gray lines, which indicate that there is a union. If you want to check that, you can look at the result over here in the data: we get a new field called Table Name, and you see some records come from Orders and other records come from Orders Archive, which indicates that we have one combined table of both Orders and Orders Archive. Let's go back to the logical layer, so I'm going to press the X here. As you can see, we have a new icon over here, which indicates that we have a union. And as you can see, Tableau's tooltip explains everything: we have a logical table called Orders, and it is the result of a union of the tables Orders and Orders Archive. This is one way of doing a union between two tables in Tableau. There is another way to do that, so let me show you how. First, I'm just going to remove it: drag and drop it somewhere over here. As you can see, on the left side we have something called New Union. Double-click on it, and you can see we have two options here, manual and automatic. With manual, we get exactly the result we just got. What we can do is just drag and drop the tables over here, Orders and Orders Archive, and then click OK. With that, we get exactly the same result without going to the physical layer and dragging and dropping the tables exactly underneath each other. This is a nice way to do a union between two tables.
You can check that by just going to the physical layer: double-click on it. As you can see, we got exactly the same result here. We can check the Table Name: we have Orders and Orders Archive. All right, so now let's check the second option, where we can do the union automatically. I will go back to the logical layer and just remove the union over here. Let's start a new one from scratch, and now we're going to go to automatic. What do we have over here? Imagine that we have around 100 tables of orders. This is very common if you are not working with databases but with files, and files have limitations. So what we do is split the files by day, by month, by year, and so on, and we end up having a lot of files. It is very painful to drag and drop all those files in Tableau to do the union. Instead, we define a rule for Tableau, and Tableau searches for all files that follow the rule and unions them. What does that mean? For example, we have two tables here, Orders and Orders Archive. What is the naming convention? Both of them start with 'orders'. I could have a third table called Orders_2022, then Orders_2023, and so on. So there is a rule I'm following in my naming convention, and I can specify that in Tableau. Let's see how we can do that over here. The first option is Include or Exclude; I'm going to leave it as Include. Now I'm going to specify the rule: the name starts exactly with 'orders', and whatever comes after this word doesn't matter. It could be _2022, _2023, or nothing, and so on. So what we specify is 'orders' followed by a star, where the star means anything after 'orders'. Then we have some options to tell Tableau where exactly to search, either in the subfolders or in the parent folder. I'm going to leave it as it is, and then click OK.
Now we have a union. Let's see what Tableau says. It says we have a logical table called Union, and it says we have many unioned tables, because we used the automatic way of doing it. Now let's check whether Tableau did that correctly. If you go to the right side here, in the overview, you find we have a new field called Path. It is the path of the files. Let's see that. I'm going to go to Sheet 1 here and just drag and drop Path to see the files. So, as you can see, Tableau did it correctly: we have Orders Archive and Orders. It's a really nice way, if you have a lot of CSVs and Excel files, to do it automatically instead of dragging and dropping all those tables. In my projects, I usually don't use this, because all the data is prepared in the data warehouse or in the data lake. So with that, we have learned all the different options for doing a union in Tableau. All right, so that's all for union. Next, we will learn a very important method, relationships in Tableau, or as we call them, noodles.
82. Tableau | Relationships: All right, so now let's talk about relationships. In 2020, Tableau introduced a new method for combining and connecting tables, and they called it relationships. They even made it the default method for connecting tables, since it is very fast and flexible. What are relationships, and how do they work in Tableau? They are completely different from joins and unions. If we have two logical tables, A and B, in the logical layer, we can connect them at this layer using a relationship. Think of the relationship as a contract between two tables. When Tableau uses the data from those tables, it first has to check the contract in order to understand how to generate the queries. Now, it's very important to understand that once we connect the tables using relationships, the tables stay separate from each other and Tableau will not create a new logical table, so everything stays as it is without any changes.
Here we just describe the relationship between the two tables. Now, at the visualization level, if we take the field F1 from table A and F4 from table B, what happens? First, Tableau checks the contract in order to understand how to generate the queries. Then it sends a query to the first table, and then it sends another query to table B in order to get the data for F4. Then the data is combined at the visualization level and not at the logical level. All right, so now let's see how we can create relationships in Tableau. It's really easy. We're going to stay on the data source page, and at the logical layer as well; we will not go to the physical layer. All we need is two tables. So let's take Orders and drag and drop it over here in the data model. Then let's take Customers. Now, as you can see, as I'm moving it there is something like a noodle, the relationship. Let's drop it here. Tableau automatically creates a relationship between Orders and Customers. Now, how do we configure and set up the relationship? Let's go to the noodle over here and just click on it. There will be no new window or anything for the setup. We're going to go to the metadata over here. If you don't see the information like this, then you can go over here and you will see the relationships and the logical tables, so make sure you are selecting the relationship. There are three things that we're going to set up in the relationship. First is the key. It's like the join key: it is a common field between the two tables. Now, as you can see over here, from the left table we have Customer ID, and from the right table we have Customer ID. Tableau automatically understood that this field could be used as a key, which is correct, but if you want to change it, you can go over here, and you will get a list of all fields in the left table.
You can also go over here, where you will get all the fields from the right table, and you can add more fields to the key. Currently it is correct, so I'm going to leave it as it is. Next, we're going to go to the Performance Options. We're going to expand the performance options over here, and we have two things here: the cardinality and the integrity. If you leave them as they are, at the default, nothing will go wrong and you will not lose any data. So you don't have to change anything here unless you want to optimize the performance. What do we have over here? We have the cardinality, Many or One, on the left side, and on the right side you can define the same. For the integrity, we have Some Records Match and All Records Match. In order to understand these settings, let's have an example. All right, so now let's have an example for the cardinality. In the relationship, we have two tables, Orders and Customers. There is a relationship between them, and the key for the relationship is the customer ID. For the cardinality, there are two options: we use either many or one. In order to decide which one is correct, we have to do data profiling. Data profiling means we do a deep dive into the data to understand the values inside our tables. Once we do data profiling, it's very easy to select whether it's many or one. Now, what do those values, many and one, mean? There is a simple rule for that. We use many if there are duplicates in the key, and we use one if the key is unique and does not have any duplicates in it. Now let's check the example in order to determine whether it is many or one. Let's go to Orders over here, and the customer ID. You see that in those values there are duplicates: we have customer ID one once here and once here as well, and customer ID two appears twice. So those values are not unique and contain duplicates, and that's why we call it many.
Let's go to Customers over here. You can see we have customers 1, 2, and 3, and that's it. So those values are unique and there are no duplicates among them. We don't have customer ID one again in the table, so that means we can specify one here. Now let's go through all the scenarios in order to understand what happens in Tableau once you configure this. All right, so let's run the first scenario, where Tableau defines it by default as a many-to-many relationship: we have many on the left side, and on the right side we have many as well. Let's say that at the visualization level we took the customer IDs from Orders and the sum of all sales, and then the name of the customer. All right, now let's see how Tableau works. Tableau first checks the relationship. It says: okay, it's many to many, so it's better to check the whole tables on the left and on the right. We start on the left side. We have customer one. Tableau takes it over here and sums all the sales. Since it's many, Tableau understands that it has to check the whole table, so it scans the whole table row by row. It says: okay, we have the sales of 50. The next one is not customer one, so it skips it and goes to the next. Then we have customer ID number one again, and it does the sum, 50 + 30. That means we have the value 80, the sum of the two sales. Now we go to the right side to find the name of the customer. Tableau checks: okay, it is many, so it scans the whole table for customer ID one. The first record is a match: we have customer ID one, so it takes Maria over here. But now Tableau will not stop. It scans the whole table, since in the relationship it's many. But that doesn't make sense, because the customer ID here is unique.
Tableau checks whether there is a customer ID one over here, then goes to the next, and it doesn't find anything, so the output stays like this. Now Tableau proceeds with the next customer. We have customer ID number two, so we have it at the output, and then we need the sum of all sales. Tableau scans the whole Orders table in order to do the sum: we have the 20 over here, and then we have the 10 here. The sum of that is 30, so Tableau has 30 at the output. That's it for the left table. We go to the right table, and Tableau scans the records one by one. The first one is not customer ID number two. Then we have a match here, so John goes to the output. Tableau scans the whole table, so it goes on to customer three, and so on. As you can see, the output is correct using the default of many to many. But we have a problem with that: on the right table, Tableau is doing a full scan, so we are losing performance on the right side. It's better to optimize it by telling Tableau: if you find a customer, then that's it, you don't have to scan the whole table, because we have at most one record for each customer. There are no duplicates, and it is unique. Now we have to pass this information to Tableau somehow, and we can do that in the cardinality. On the left side it stays as many, but on the right side we say it is one. With that, Tableau understands: okay, it is unique, we don't have to scan the whole table, and we gain a lot of performance. All right, so now let's see how Tableau works once we have it as many to one. On the left side, nothing changes, because we have many. So Tableau scans the whole table for customer one, and the result is the same. Now on the right side, things change.
Tableau says: okay, customer ID number one, there is a match. It takes Maria as the output. But now Tableau stops: it will not keep searching for customer ID one and scan the whole table. With that, Tableau is not doing any unnecessary work and we gain some performance. Now we go to customer number two over here. It's the same story. Tableau scans: do we have customer number two over here? No, so we jump to the next one. Yes, we have a match. We take John, and Tableau stops here as well and does not scan the next record. As you can see, we have exactly the same output whether you are using many to many or many to one. With many to one, we gain performance, because Tableau stops the scan on the right side. All right, so now let's jump to the next scenario, where we do something wrong: we say that the customer ID on the left side is unique, and we put the value of one. On the right side, it doesn't matter; let's have many, for example. Now we are telling Tableau that on the left side the customer ID is unique, so it doesn't have to scan the whole table. We have the same example over here, so let's see what happens. On the left side, Tableau starts with the first customer, customer ID one. The sum of sales is now 50, because Tableau doesn't have to scan the whole table: it stops at the first record, and the output is 50. Now on the right side, since we are saying many here, it doesn't matter, and the result is correct. We have Maria, but Tableau scans the whole table, so the performance is bad. Now we jump to the next customer. We have customer number two, and Tableau has it at the output here. Again, the same problem. Tableau says: okay, we have the sale of 20, and the customer ID is unique, so we will not find it again in the same table.
I don't have to scan the whole table. Tableau takes the value 20 and puts it at the output without checking the other values. Here on the right side, it doesn't matter: we have John, which is correct, but Tableau scans the whole table. As you can see, if you make a mistake here in the cardinality, you might have problems at the output, with missing data and wrong information. All right, now let's run the last scenario, where we have one on the left side and one on the right side as well. We get exactly the same output, because we have it wrong on the left side. The only good thing here is that on the right side Tableau stops the scan: once it finds a match, it does not scan the whole table. So at the output we get exactly the same information, and here we have one to one. All right, so now let's quickly summarize. On the left side, we have two criteria, correctness and performance. Correctness is always far more important than performance. Let's start with the first scenario, the many-to-many relationship. As you can see, the output was correct, but the performance was bad, since Tableau does an unnecessary full table scan on the right side. That's why I give it an OK for correctness and a not OK for performance. In the next scenario, we have the many-to-one relationship. The output was correct, so we give it an OK. And the performance was OK, since Tableau stops scanning once it finds a match. We gain a lot of performance, so we give it an OK there too. Let's jump to the third one, the one-to-many relationship. As you can see, the output was not OK. It was not correct, since we are missing data, so we give it a not OK. And the performance was bad, because on the right side we are doing unnecessary scans. That means it was the worst scenario here.
Then the last one, the one-to-one relationship. The output was not correct, so not OK, but the performance was OK, since on the right side we are not doing any unnecessary scans. But to be honest, correctness is far more important than performance. That's why Tableau always recommends staying with the many-to-many relationship if you are not sure, because you will always get correct answers at the output. But if your data is big, you will get some bad performance. If you want good performance, you have to invest time in analyzing your data, doing data profiling to understand whether it is many or one, and then change the setting. But you have to be sure about your data, otherwise you will get wrong information in your visualizations, and that's really bad. So for this example, the safe way is to stay with the many-to-many relationship, but the professional way is to use the many-to-one relationship to get good performance. But this is not always the case. Just imagine we switch the tables, so Customers is on the left and Orders is on the right. Then one to many is the correct relationship. So be careful with the sides here. All right everyone, so now let's understand the integrity options in Tableau. Each relationship has two sides, the left table and the right table. When we change the integrity settings, we limit which joins can happen in the visualization. Here we have two options, Some Records Match and All Records Match, and with that we have four scenarios. First, we can choose Some Records Match for both the left and the right table. If we do that, then all types of joins are possible in the visualization: inner, left, right, and full join. But now suppose we choose All Records Match on the left and Some Records Match on the right. What happens? We are limiting the types of joins to only two, inner and right join. On to the next one.
It can be the opposite, so we have Some Records Match on the left and All Records Match on the right. What happens? Again we limit the types of joins to only two, inner and left join. In the last scenario, we choose All Records Match on both sides, left and right. Here we limit Tableau to only one type of join, the inner join. As you can see, it's very similar to joins: we are just defining how Tableau should work. When we use Some Records Match, we allow more types of joins, and when we use All Records Match, we limit the types of joins Tableau can use. Here it's very important to understand that we have a trade-off. If you use All Records Match and go down this path, you will likely experience better performance, but you increase the risk of losing data. If you choose Some Records Match and go up, you ensure completeness and flexibility, but you sacrifice some resources and performance. The Tableau team decided to go with the first scenario as the default, where we have Some Records Match on the left and on the right. I can understand that, because completeness and flexibility are more important than performance. Let's have a look at our data here. We have customers that didn't order anything: customer number three didn't order anything, and we don't have a match for it. We can say some records match, like 1 and 2, which have matches on the left side, but some other records do not match: we don't have an order from customer ID number three. That means that in our database, we could have customers in the Customers table who didn't order anything, so the correct option over here is Some Records Match. Now let's analyze Orders. As you can see, we have customer ID number one, and we find it in Customers; two as well, and so on. So we can see that all the records, all the customer IDs in Orders, have a match in Customers.
Well, that means we can select All Records Match. We don't have, for example, a customer ID four over here that has no match on the right side. That means that in our database, all orders should come from our customers, and we should not have any order without a known customer. After the analysis, we can say that on the left side, in Orders, we always have matching records, so we select All Records Match. But on the right side, we might have customers that didn't order anything, so there we say Some Records Match. If we do it like this, we prevent Tableau from doing extra work handling the nulls. It's like in SQL: if you use a full outer join, you can get huge amounts of data, and sometimes, if you use an inner join or a left join instead, you get better performance. So if you know exactly what is going on in your data, select the correct integrity. Otherwise, just leave it at the default, Some Records Match on the left and on the right. You will be safe, and you will get correct answers. All right, so a recap: Tableau relationships are really easy. We just have to drag those two tables, and Tableau creates the relationship between them. Just get the key of the relationship correct and everything will be fine, and leave the other settings at their defaults. But if you want to be more professional and get better performance in Tableau, you have to do data profiling and then select the correct options, if you are 100% sure. So in this example, Orders over here has many for the customer IDs, while on the right side we have one for Customers. For the integrity, on Orders we have All Records Match, because all orders have a customer ID in the Customers table. But we might have some customers that didn't order anything, so I'm going to leave that side as Some Records Match. And that's it. That is relationships in Tableau. All right, so that's all about the very important concept of relationships and how they work.
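The four cardinality scenarios from this lesson can be replayed with a small Python sketch. It follows the lesson's simplified mental model only ("many" scans the whole table, "one" stops at the first match) and is not a description of Tableau's actual query engine. The sales values and the names Maria and John come from the walkthrough; the row order, the third customer's name, and the helper names `lookup` and `run` are assumptions made for illustration.

```python
# Toy model of the lesson's cardinality walkthrough (not Tableau's real engine).
ORDERS = [      # (customer_id, sales); row order is an assumption
    (1, 50), (2, 20), (1, 30), (2, 10),
]
CUSTOMERS = [   # (customer_id, name); the third name is a placeholder
    (1, "Maria"), (2, "John"), (3, "Customer 3"),
]


def lookup(rows, key, cardinality):
    """Return (matching values, number of rows scanned).

    'many' scans every row; 'one' stops at the first match.
    """
    matches, scanned = [], 0
    for row_key, value in rows:
        scanned += 1
        if row_key == key:
            matches.append(value)
            if cardinality == "one":
                break
    return matches, scanned


def run(left_cardinality, right_cardinality):
    """Return {customer_id: (sum of sales, name, customer rows scanned)}."""
    result = {}
    for customer_id in (1, 2):
        sales, _ = lookup(ORDERS, customer_id, left_cardinality)
        names, scanned = lookup(CUSTOMERS, customer_id, right_cardinality)
        result[customer_id] = (sum(sales), names[0], scanned)
    return result


for left in ("many", "one"):
    for right in ("many", "one"):
        print(f"{left} to {right}:", run(left, right))
```

In this sketch, the two runs with "many" on the Orders side return the correct sums (80 and 30), while the two runs with "one" on that side return only the first sale (50 and 20). Setting "one" on the Customers side reduces the number of customer rows scanned without changing the names, which matches the correctness and performance summary above.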
Next, we will learn a very unique method, data blending in Tableau.
83. Tableau | Data Blending: All right, so now let's talk about data blending in Tableau. But first, some coffee. Let's go. All right, so let's take this example, where we have table A in the data source. At the visualization level, we want to use the data from the field F1. As you know by now, Tableau sends a query to the data source in order to get the data of F1 from the table and show it in the visualization. Since this data source was the first one to be queried and used, Tableau calls it the primary data source. In Tableau, anything primary gets the color blue. That's why you will see a blue icon indicating that this data source is the primary one. Now, sometimes you are in a situation where you want to get data from another data source. For example, we have another data source with the table B, and we want to add the data of F4 to the visualization. What happens? Tableau sends another query to the second data source in order to get the data of F4, and then the data is forwarded to the visualization. Tableau calls this data source the secondary data source, and it marks it with an orange icon. Now, in order for this to work, where we get data from two different data sources, we have to connect them somehow. Exactly here we use a very unique feature in Tableau, where we can connect data sources together using data blending. Data blending can only be done at the visualization level, on the worksheet page, not on the data source page. Now you might ask how Tableau joins those tables at the visualization level. Well, Tableau uses a left join. Sadly, we cannot change that; it is fixed. It's like a left join: Tableau gets all the data from the primary data source and only the matching records from the secondary data source.
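The left-join behavior of data blending described above can be sketched in Python. This is a minimal illustration, not Tableau's implementation: the two lists stand in for separately queried data sources, all rows and field names are hypothetical, the `blend` helper is made up for the sketch, and it assumes the secondary source has one row per key.

```python
# Hypothetical rows standing in for two separate data sources.
primary_products = [            # primary data source (blue in Tableau)
    {"product_id": 1, "name": "Pen"},
    {"product_id": 2, "name": "Notebook"},
    {"product_id": 3, "name": "Eraser"},
]
secondary_prices = [            # secondary data source (orange in Tableau)
    {"product_id": 1, "price": 2.5},
    {"product_id": 2, "price": 4.0},
    {"product_id": 4, "price": 9.0},
]


def blend(primary, secondary, key, field):
    """Left-join style blend: keep every primary row and attach the
    secondary field where the key matches, otherwise None."""
    by_key = {row[key]: row[field] for row in secondary}
    return [{**row, field: by_key.get(row[key])} for row in primary]


blended = blend(primary_products, secondary_prices, "product_id", "price")
for row in blended:
    print(row)
```

Every product from the primary source appears in the result, product 3 gets `None` because the secondary source has no price for it, and product 4 is dropped because it exists only in the secondary source.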
Now to summarize, data blending is the method of combining data from two different data sources at the visualization level using a left join. This is a very unique feature in Tableau. You don't find it in other BI tools like Microsoft Power BI: there you cannot, for example, combine data from two different published datasets. All right, now let's see how we can do data blending in Tableau. For this we need two data sources. The first one is going to be from the CSV files that we have in the small datasets, so we go to the text files. Let's take Products over here. This is our first data source. Now let's create the second data source. In order to do that, you can go to this icon over here and then click on New Data Source. Let's go there. It's going to be from the JSON file that I prepared for you, so let's go to JSON, and we have Product Prices. Let's open that. Since it's JSON, we have to select the schema. Let's go to the data over here, click Yes, and then click OK. Now we have two data sources. In order to switch between them, we go to this icon over here again, and you can see we now have two data sources; by just selecting a data source, you switch to it. Now, in order to do the data blending and connect those two data sources, we cannot do it on the data source page. We have to go to the visualization level, to the worksheet page. Let's do that. I'm going to go to Sheet 1 over here. As you can see, in the Data pane on the left side we have two data sources, and by just clicking on them you can switch between them to see the tables inside. Now we have to decide which data source is the primary and which one is the secondary. For this example, I will say that Products is the primary one. How do we do that? By just using its data in the visualization first. So I'm just going to take the Product ID and drag and drop it on Rows, and Tableau immediately understands.
Okay, this is the primary data source, and Tableau marks it with a blue icon over here, indicating that this is our primary data source. We still don't have a secondary data source, so you see there is no orange icon over here, because in our view we have data from only one data source. Now, in order to get data from the second data source, we switch to Product Prices, and you can see Tableau immediately turns this data source into a secondary data source. You can see over here we have the orange icon indicating that this is the secondary data source, and any field that we use from it is marked with orange. So you can see over here that the price has an orange icon. It's very simple. Now let's say that the Product ID is not the right key to join those two data sources and you want to change that. In order to do that, we go to Data over here in the menu, and then to Edit Blend Relationships. Let's click on that. We get a new window over here, and here we have two options, Automatic and Custom. If you leave it as Automatic, Tableau figures out which key to use to join those data sources; in this example it is the Product ID. If you want to change that, you can go to Custom over here. It's like a join: you have to specify, from the left and from the right, which fields are the key for the join. If you want to change that, just double-click on it. Then you have the primary data source on the left side and the secondary data source on the right side, and you select the fields that are the key for the join. I'm going to leave it as it is. Let's add another key. I will go over here and choose, for example, the category from the left side and the data index from the right side, which is really wrong. Let's click OK, and then OK again. You will see on the left side that we now have another chain on the data index. You can see it's like a broken chain, which means it is not yet used in the join.
If you want to activate it, just click on it and you will see we have an active chain. Now, as you can see, the result is wrong because it doesn't make sense to use this key. But I just want to show you how you can deactivate and activate the keys of the join between two data sources by just clicking on them. Now let's correct this. I want to have only the product ID as the key for the join, so I'm going to deactivate the data index over here. And that's it. This is how you can define the key for the data blending. One thing that is very important to understand is that everything we've done in the data blending is only relevant for this worksheet. If I go to another worksheet, let's go over here and create a new one, you can see it's completely reset. We have the two data sources again, but they are not marked as primary and secondary. That means in each worksheet we can make a new decision. In Sheet 1, the products were the primary. I can change my mind here and say, okay, the product prices is now the primary data source. If I take anything from it over here, you can see product prices is the primary. And if I go to the products and take, let's say, the product name, products becomes the secondary. So I just switched between them depending on the requirements. If we go back to Sheet 1, we see that products is the primary, but if we go to Sheet 2, product prices is now the primary. This is really nice because it gives us real flexibility: we can decide in each worksheet which one is the primary and which one is the secondary, depending on our requirements. Data blending is a very unique and great way to connect and combine data. All right, so with that, you now have an overview of all four methods of combining tables.
And next we will compare them side by side, and we will start with the differences between joins and unions. 84. Tableau | Join vs Union: All right, so now what is the main difference between joins and unions? Both of them are very similar: they combine two tables into one big table. The difference is how the data is combined. In joins, the fields of both tables are combined. We take all the fields from the left side and, beside them, all the fields from the right side. As a result, we get one big wide table. On the other hand, in unions two tables are also combined, but instead of combining the fields, we combine the rows of both tables. So we get all the rows from the first table and, beneath them, all the rows from the second table. Both tables must have exactly the same columns. So joins combine the fields and unions combine the rows. All right, so that was the main difference between join and union. Next we will learn the differences between joins and data blending. 85. Tableau | Join vs Data Blending: All right, so now the question is, what is the main difference between joins and data blending? Data blending is like a left join. But the main difference is when the aggregation is performed. In joins, the data is combined first and then the aggregation can happen. In data blending it is the opposite: the aggregation happens first and then the data is combined. So now let's have a simple example in order to understand what this means. Okay, so again we have our tables, customers and orders. First we're going to do the left join, and afterward we're going to do the data blending between them, in order to understand the differences in the output. All right, so we start with the left join. You know the left join: all the data from the left side and only the matching rows from the right side.
We start as usual by combining the fields from the left with the fields from the right, and we go record by record. We take customer number one and search for the matches. We have two rows in the orders. That means Maria is going to appear twice in the output because there are two orders. Then we go to the next one, customer ID number two. We have only one order for that, so we have it once in the output. And George doesn't have any orders, so we're going to have null here, here, and here. So as you can see, with the left join we first combine the data, the raw data, without doing any aggregations. Afterward, in the visualizations, we can find, for example, the sum of sales or the average and so on. Now let's check how data blending works. All right, let's say we have all the fields from the primary data source and, beside them, all the fields from the secondary data source. This is like a left join: we take all the data from the primary data source, so we get all three customers over here. But the main difference is that there will be no duplicates. As you can see, in the join we have Maria twice, but in data blending you will not get any duplicates. Now here comes the difference. Before we get the data from the orders, from the secondary data source, an aggregation happens. For example, for customer ID number one we have two rows. The two rows will not be presented in the output as they are; they are aggregated first. Now it's very important to understand that the fields in Tableau are split between dimensions and measures. In the next tutorials I'm going to explain that in detail. For now: the measures can be aggregated, and the dimensions will not be aggregated. For example, the customer ID is not a measure, it is a dimension. Tableau cannot aggregate it, but since we have the same value twice, Tableau can write one here.
Then the next one, we have the sales. It is a measure, so Tableau can aggregate it first and then combine it. The sum is going to be 80. Next we have the date here. It is a dimension, so it cannot be aggregated, and since we have two different values, Tableau is going to write a star in the output. Tableau provides only one value in the output and here we have two, so Tableau will not decide which one of them it should be; it just adds a star. I know this is really not nice, but this is how data blending works. As you can see, Tableau always tries to aggregate the data before combining it. Now let's move to the next customer. We have John; in the orders we have only one record. That means nothing is going to be aggregated, and the output is going to be exactly the same. Then for the customer George, there is no information over here, so we get nulls as well. This is the output of data blending. This is exactly what I mean: the main difference between joins and blending is when we do the aggregations. In the left join, as you can see, first we combine the raw data together, and afterwards we can do aggregations in the visualizations. But in data blending, first the data is aggregated, especially from the secondary data source, and afterwards the data is combined in Tableau. All right, with that, we have learned the main differences between joins and data blending. Next, we will learn the main differences between joins and relationships. 86. Tableau | Join vs Relationship: All right, so now what are the main differences between joins and relationships? If you are using joins, things can get really static, and we might lose a lot of data as well. But if you are using relationships in our data model, then we get more flexibility and we do not lose any data. Now, in order to understand this, let's check this example.
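Stepping back to lesson 85 for a moment, the order of operations described there (a join combines raw rows and aggregates later, while blending aggregates the secondary source first and then combines) can be sketched in plain Python. This is only an illustration of the idea, not how Tableau is implemented. The function names `left_join` and `blend`, the individual sales values, and the dates are assumptions made up for the sketch; only the customer names and the total of 80 come from the lesson.

```python
# Hypothetical rows: only the names and the total of 80 come from the lesson.
customers = [
    {"customer_id": 1, "name": "Maria"},
    {"customer_id": 2, "name": "John"},
    {"customer_id": 3, "name": "George"},
]
orders = [
    {"customer_id": 1, "sales": 30, "order_date": "2023-01-05"},
    {"customer_id": 1, "sales": 50, "order_date": "2023-02-10"},
    {"customer_id": 2, "sales": 20, "order_date": "2023-01-07"},
]


def left_join(left, right, key):
    """Combine raw rows first: one output row per match, None-filled if no match."""
    right_fields = [f for f in right[0] if f != key]
    result = []
    for l_row in left:
        matches = [r for r in right if r[key] == l_row[key]]
        if not matches:
            matches = [{f: None for f in right_fields}]
        for r_row in matches:
            result.append({**l_row, **{f: r_row[f] for f in right_fields}})
    return result


def blend(primary, secondary, key, measures):
    """Aggregate the secondary rows per key first, then attach them to primary rows."""
    dimensions = [f for f in secondary[0] if f != key and f not in measures]
    result = []
    for p_row in primary:
        matches = [s for s in secondary if s[key] == p_row[key]]
        out = dict(p_row)
        for m in measures:
            # Measures are summed; no matching rows gives None (a null).
            out[m] = sum(s[m] for s in matches) if matches else None
        for d in dimensions:
            values = {s[d] for s in matches}
            if not values:
                out[d] = None
            elif len(values) == 1:
                out[d] = values.pop()
            else:
                out[d] = "*"  # several different values: shown as a star
        result.append(out)
    return result


joined = left_join(customers, orders, "customer_id")
blended = blend(customers, orders, "customer_id", measures=["sales"])

print(len(joined))   # 4 rows: Maria appears twice
print(len(blended))  # 3 rows: one per customer
print(blended[0])    # Maria with sales 80 and a star for the date
```

The join keeps every order row, so any aggregation still has to happen afterward. The blend returns exactly one row per primary record, which is why the two different dates for Maria collapse into a star.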
We have prepared two data sources, one with joins and the other with relationships. The first one starts with the orders. If I go to the physical layer, you can see we have a left join between orders and customers. Let's check the second one. Here we have relationships with the same tables: we have orders and customers, and between them there is a relationship. Now, if you check our data, we find that there are five customers in the customers table, but only four customers that placed orders. If you check the customer ID in the orders over here, you will not find ID number five. That means this customer didn't order anything. This is no problem for the relationships, but if you go to the joins over here and check the data, you will see that we don't have customer ID number five at all. You can check: we have 1, 2, 3, 4 and so on. Customer ID number five has completely disappeared. That's because we have a left join between the orders and the customers, so only the matching rows from the right side are presented in the final table. That means we lost this customer. Now let's go to the visualizations over here. Let's say we want to count how many customers we have in our database. Let's drag and drop the customer ID and turn it into a measure with count distinct. Our data says we have four customers. If we go to the relationships, let's open another sheet and switch to the relationships data source, take the customer ID again, switch it to a measure and count distinct, you will see we didn't lose the data. We have five customers in our database, so the relationship gives us the more correct answer. Now you might say, okay, we can fix this if we change the type of join. That's right. If I go to the data source, go to the joins, go to the orders, and just switch this to a right join, then we get all the data from customers and only the matching rows from the orders.
Let's close this and go back to Sheet 1. Let me close this; we see that we have five customers. So with that we have the correct answer with the join as well. Here we come to the next point: things are really not flexible. If I'm building visualizations where sometimes I'm asking how many customers we have and sometimes how many orders we have, I cannot go to the data source each time and change the type of join, because once I decide it's a left join, it stays a left join for all the worksheets, unless I'm doing a full outer join between the two tables. And if you're working with big tables, then you get a very big merged table, which can slow everything down. This is exactly what I mean: if you are using joins, you will lose data if you use a left join or a right join, and things are really static as well. With the relationships, if we go to Sheet 2 here, things are more flexible because we didn't merge anything. The data stays separated, and we just describe the relationships between the tables. If in one worksheet I'm doing analysis about the customers, it will not affect the next visualization where I'm doing analysis about the orders, because we didn't lose any data. And I don't have to worry whether we have a left join or a right join, whether we should change it, and so on. So it's more flexible and we always get correct answers. That's why joins are static and you might lose data, while relationships are more flexible and you will not lose any data. All right, there is another issue with joins compared to relationships. Sometimes in joins we might get wrong answers if we are doing calculations on the measures. Let's take this example on the customers table. We have a score for each customer, and we have those five customers. The average of the score is 625. Now let's check in Tableau the results from joins and relationships.
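The two join problems in this lesson, a customer disappearing and an average shifting because rows repeat, can be reproduced in a few lines of plain Python. The scores and the number of orders per customer below are hypothetical values chosen so that the true average comes out at 625; they are not the course data, so the distorted average here will not match the figure seen in Tableau.

```python
# Hypothetical data: customer_id -> score, chosen so the true average is 625.
customers = {1: 350, 2: 900, 3: 750, 4: 500, 5: 625}
# customer_id of each order; customer 5 never ordered, customer 1 ordered three times.
orders = [1, 1, 1, 2, 3, 4]


def mean(values):
    return sum(values) / len(values)


# Left join (orders on the left): one row per order, so customer 5 is missing.
left_rows = [(cid, customers[cid]) for cid in orders]

# Right join: all customers are kept, customers without orders appear once.
right_rows = left_rows + [
    (cid, score) for cid, score in customers.items() if cid not in orders
]

distinct_left = len({cid for cid, _ in left_rows})
distinct_right = len({cid for cid, _ in right_rows})

# The merged table repeats customer 1's score three times, which skews the average.
avg_joined = mean([score for _, score in right_rows])
# Keeping the tables separate (as relationships do) averages each customer once.
avg_separate = mean(list(customers.values()))

print(distinct_left, distinct_right)  # 4 5
print(avg_separate)                   # 625.0
print(round(avg_joined, 2))           # 546.43
```

Switching to a right join brings customer 5 back, but the repeated rows for customer 1 still pull the average away from 625. That is the part a change of join type cannot fix.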
All right, now we are at the relationships. Let's take the score and drop it over here on the text, then let's find the average. So we go over here to Measure and select Average. With relationships we got the correct answer: we have 625. Now let's check the joins. We are at the data source with joins. I'm going to take the score, drag and drop it on the text, and switch to average here as well. We got the wrong result, 585. What happened here? Well, the answer is that sometimes, if we merge two tables together, we get duplicates. Let's check the data. If you go to the data source with the joins again and look at the score, you will see duplicates. Because some customers have more than one order, merging the customers and orders results in a lot of duplicates, and if you take the average, you get the wrong answer, as we saw in the results. If you switch to the relationships and go to the customers, we see the score over here on the right side; there are no duplicates, and we get the correct answer. That guarantees that with relationships we get correct answers when we are doing calculations, and that's way better than having duplicates in our data. With joins we might never get correct answers. That's why Tableau introduced relationships in version 2020.2, to fix all those problems with joins, and made them the default method for connecting tables. All right, so that's all for now. Next we will compare all four methods side by side in order to understand the big picture. 87. Tableau | Join vs Relationship vs Union vs Blending: All right, so now we're going to compare the four methods of combining data in Tableau, unions, joins, relationships, and data blending, side by side. So let's go. The first point is on which page and in which layer we can use the method.
Now, both unions and joins we can create at the data source page, in the physical layer. The relationship as well we can use at the data source page, but in the logical layer. And finally, data blending is used at the visualization level, in the worksheet page. The next point: can we use the method to connect tables from different data sources? Well, for unions, joins, and relationships, we cannot do that. It has to be done within the same data source. Only data blending can be used to connect tables from different data sources. The next point: after using the method, are the tables going to be merged? Unions and joins merge the tables and create completely new tables. But if we are using relationships and data blending, they will not create anything. The next point is about flexibility. If you use unions and joins, the decisions that you make at the data source affect all the worksheets and visualizations. But if you use relationships and data blending, you have way more flexibility. For example, in data blending you can decide on each worksheet page. Now, if we are talking about join types: in joins we have inner, left, right, and full. In relationships we can have exactly the same behavior as joins. But in data blending it is fixed; we have only the left join. The next point: if you ask me to rank these methods, I would say, and Tableau as well is going to say, always use relationships. After that comes data blending. It is a really great way to combine tables from different data sources, with the flexibility that we have. And then the third one, I'm going to say, is joins. I would not rank union, because it's completely different from the methods of joins, relationships, and data blending. Always try to go with relationships. Now let's see the big picture of how those four methods work. And let's start with joins.
They connect two tables at the physical layer and create a completely new logical table in the logical layer, where the fields of both tables are combined. Then, at the visualization layer, Tableau sends a query to the data source, and the data source gets the data from the logical table. The same goes for the union. You create it at the physical layer from two tables, and it creates a completely new table as well, where the rows of both tables are combined. At the visualization, Tableau sends a query to the data source, and the data source gets the data from the logical layer. Now to the third method, relationships. We have two tables at the logical layer, and Tableau does not combine or create anything. We are just describing the relationship between A and B. At the visualization level, Tableau asks the data source, and the data source gets the data from the separate tables. And finally, data blending. We have two data sources. The first one is called the primary data source, the second one is the secondary data source. First Tableau sends a query to the primary data source and then another query to the secondary data source. Here it's important that the aggregation happens before the data is combined, and we are combining the data at the visualization level using data blending. So as you can see, joins and unions happen in the physical layer, in the logical layer we can do relationships, and at the visualization level we can do data blending. All right, okay, so with that, you have learned everything that you need about combining tables in Tableau. Next we're going to practice: we're going to create two data sources using the new skills that you have just learned. 88. Tableau | Build 2x Data Sources: All right.
Okay, so now we're going to create two data sources together, because we have two datasets, the big one and the small one. While doing that, I want to show you how I usually decide when to use which method. Let's go. Okay guys, now let's close everything and start from scratch in order to get the data source created correctly. Let's start Tableau Public. We're going to create the small data source on top of our small dataset. Let's go to the connectors on the left side and click on Text File. It doesn't matter which file you use. Let's take the orders and open it. I will delete it anyway in order to explain how I start. Previously, I showed you the data model of our datasets. We have a star schema with facts and dimensions. I always start with the fact table. It doesn't matter whether you are using a star schema or a snowflake; always start with the fact table. Our fact table is orders. Let's just drag and drop it here on the logical layer. Then I continue with the dimensions, so we have customers and products. Let's start with the customers. Just drag and drop it somewhere over here, and Tableau creates a relationship between the orders and customers. Since we are talking about two different entities, orders and customers, I always use relationships between them. Let's check whether everything is correct in the relationship. So we go over here to the metadata. We see the customer ID from the left and the customer ID from the right, which is correct. Now let's go to the performance options. I will change only the cardinality. If the quality of our data is bad and we haven't done any data profiling, then the best option is to leave it at the default: many to many, some records match on the left and on the right. But in our datasets we already checked that.
So we have a clean star schema. The fact side, on the left over here, always stays as many, and all the dimensions on the right side, like customers, are going to be one, because we usually have, for example, unique customers or unique products. So I will change the right side to one because it is the dimension side, and the fact side stays as many. I will not touch the referential integrity settings, so we leave them as they are. And that's it. We now have the customers and the orders connected to each other. Now, before we continue building our data model, we have to check something very important: are we working with the correct dataset in the correct format? If you go to the orders over here, we have a few fields like sales, quantity, discount, and profit. All this information should be numeric. You can check that through the data type icons. If the icon is this green hash sign over here and you click on it, Tableau is going to say it is Number (decimal). If you see it as Number (decimal) or Number (whole), then everything is fine. But if you see the field as a string, for example if I go over here and switch it to a string, there is something wrong. If your data shows the Abc icon, then you are working with the wrong dataset. It's not correct; you should see it as a number. Now the question is, why is it wrong? Why didn't Tableau recognize it as a number? Well, there are different representations of the decimal separator in decimal numbers. In some countries, like in Europe, we have a comma, but in many other countries, like in the USA and in Asia, we have a dot between the whole number and the decimal part. So, for example, I'm now in Germany and my data is separated with a dot. What can happen is that Tableau will not understand that this is a decimal number, and it's going to show it as a string.
And that's why in the download link I have prepared two datasets depending on your location: the Europe training dataset and the non-Europe training dataset. In the Europe training dataset, all decimal numbers are separated with a comma, and for all other countries they are separated with a dot. So now the question is how to fix it. Well, go and download the correct training dataset. For example, I now have the non-Europe dataset, and as you can see, the discount, sales, and profit are all wrong; everything is Abc, a string. Now some of you may think, okay, it's a really easy fix: I can go to the data type over here and switch it from string to Number (decimal). Once I do that, what happens? Everything becomes null. It does not work because Tableau doesn't know how to convert those numbers correctly. Let's move it back to a string in order to see the data again. There is a fix for that. If you go to the orders over here, right-click, and go to the Text File Properties, we have different properties of the file. There is the separator, here a semicolon, which Tableau detected correctly. But what's more important is the format of the decimal numbers, the locale. Here we have to choose a locale that matches the current format. The current format in this example uses a dot. So what we're going to do is go over here and search for, for example, United States. And as you can see, Tableau now understands the format, and everything changes to a number. So the solution: either you use the correct dataset, or you configure the properties of each file. I would say try United States or Germany until you get the data type number. Make sure that in the orders all this information has the data type number. All right, so now let's keep building our data model in the data source. Let's go to the next dimension.
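The decimal separator problem described above is not specific to Tableau. A minimal sketch in Python shows why the same text parses as a number under one convention and fails under the other. The helper `parse_decimal` is a hypothetical name for this illustration; it deliberately ignores thousands separators, and it does not claim to reproduce Tableau's actual locale handling.

```python
def parse_decimal(text, decimal_sep):
    """Return a float, or None when the text does not fit the expected format.

    decimal_sep is "." (for example United States) or "," (for example Germany).
    Simplification: thousands separators are not supported.
    """
    other_sep = "," if decimal_sep == "." else "."
    text = text.strip()
    if other_sep in text:
        return None  # the value uses the other convention: treat it as unreadable
    try:
        return float(text.replace(decimal_sep, "."))
    except ValueError:
        return None


# The same file content read under two different expectations:
print(parse_decimal("12.5", "."))  # 12.5  (dot file, dot locale)
print(parse_decimal("12.5", ","))  # None  (dot file, comma locale)
print(parse_decimal("12,5", ","))  # 12.5  (comma file, comma locale)
```

The `None` in the second call plays the role of the nulls that appear when a string column is forced to a number under a locale that does not match the file.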
We have the products. All we have to do is drag and drop it and then release it. Tableau creates another relationship between them. Let's check that again. So click on it, go to the metadata, and scroll up. Tableau automatically found the key for the relationship; it is the product ID, which is correct. And now the same thing: we go to the Performance Options. On the left side, the fact side, it stays as many, and on the right side, where we have the dimension, it is going to be one. You can check that easily. If you click on the products and check the data here, you can see the product ID is a unique field; there are no duplicates inside it, so we can use one. If you are not sure, just leave it as a many-to-many relationship. Let's go to the relationship again. We have it as many to one, and I'm going to leave it here as some records match. No problem. Now let's go to the other tables. We have the customer details here, and we have two options: either we use relationships or joins. You could go over here, drag and drop it, and put it near the customers as a relationship. But to be honest, in data modeling, if I have two objects about the same entity, here customers and here further information about the customers, I tend to merge those two tables into one. This is different from the orders and customers, which are completely different entities. Usually, in data warehouses, I prepare this step in the database, but we can stay in Tableau and merge those two tables into one. We can do that using joins. So I'm going to remove the customer details, and then we go to the physical layer inside the customers. Then we take the customer details and drop it over here.
Tableau, by default, is going to leave it as an inner join. But to be honest, the customers table is for me the main table about the customers, and customer details is like a secondary table. In order not to lose anything from the left side, I'm going to change the join type to a left join. Let's do that. I click on the icon and then select Left Join. Then we can check the results. The main thing is that we don't get duplicates and don't lose any customers. As you can see in the output, we have our five customers, there are no duplicates, and we didn't lose anything. Let's go back to the logical layer; I'm just going to close this. As you can see, we have fewer tables, and we have one entity called customers. I usually do that if we have a lot of tables about the same topic. Now let's go to the next table. We have the orders archive, and here we have the same situation: two tables describing the same entity, the orders. Of course, we could connect it to the orders as a relationship. But again, I like to minimize the number of tables that I'm dealing with, so I'm going to merge those two tables together. Here we have two options again, unions or joins. If the tables have exactly the same number of columns and the same data types, we can use a union. To find that out, we have to do data profiling. Either you open the CSV files and compare them, or we can go over here, where there is a small icon like a table. If you click on it, Tableau shows you a sample of the data, so you can do data profiling and understand the content of this table. Let's just make it bigger. We have the order date, shipping date, customer ID, product ID, as well as the unit price, and so on. And you can compare it to the orders over here. Let's make it bigger too. We find exactly the same number of fields, the same content, and the same data types. That means we can do a union between them.
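The profiling step above (same columns, same data types, then union) can be expressed as a small check in plain Python. This is a sketch of the idea only; the function names `schema` and `union` and the sample rows are hypothetical and are not taken from the course files.

```python
def schema(rows):
    """Map each column name to the Python type of its value in the first row."""
    return {name: type(value) for name, value in rows[0].items()}


def union(first, second):
    """Stack the rows of two tables, but only if their schemas match."""
    if schema(first) != schema(second):
        raise ValueError("tables differ in columns or data types")
    return first + second


# Hypothetical sample rows for illustration.
orders = [
    {"order_id": 10, "customer_id": 1, "unit_price": 9.5},
    {"order_id": 11, "customer_id": 2, "unit_price": 4.0},
]
orders_archive = [
    {"order_id": 1, "customer_id": 3, "unit_price": 7.25},
]

all_orders = union(orders, orders_archive)
print(len(all_orders))  # 3 rows: archive rows sit beneath the current orders
```

A table with a different column set, or with a number stored as text, would raise the `ValueError` instead of being stacked. That is the situation the manual comparison in the data preview is meant to catch.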
In order to do that, I'm going to close this and go to the physical layer inside the orders. I drag the archive and drop it just beneath the orders over here. Now you can see we have a union. Let's check that on the right side in the table names: we have orders and we have the orders archive. With that, we combined both tables into one logical table. Let's close this. As you can see, there is an icon showing that there is a union inside. With that we have only three tables instead of five. It is just easier in the visualizations to deal with three tables instead of five, and the data model is much easier to understand and explain. With that, we have connected all the files together, but we still have one file left, the JSON file with the product prices. Sadly, we cannot connect it with the others in the same data source because it is a different file type. But we can still connect it to them if we create a second data source and use data blending. Now that we have our fact table and the dimensions, we're going to give the data source a name. I'm going to call it small data source. Now you can pause the video and create the big data source yourself. When you are done, I'm going to create the big data source. I go over here to New Data Source and click on Text File. I will just go back to the big dataset. Here we have only three files. We start with the orders, the fact table, and then we take the dimensions. Let's take the customers. I already checked all those IDs; they are unique. So I can go to the relationship over here and change it to one on the right side, and on the fact side it stays as many. I'm going to do the same for the products: drag and drop. All the product IDs are unique. We go to the performance options, make sure we have selected the relationship, and select one.
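The statement that the IDs were already checked and are unique is the data profiling that justifies setting the dimension side to one. A minimal sketch of such a check in Python follows. The helper names `is_unique` and `suggest_cardinality` and the sample rows are hypothetical, and the returned labels only mirror the many/one wording used in the lesson.

```python
def is_unique(rows, key):
    """True when no value of `key` occurs more than once in the table."""
    values = [row[key] for row in rows]
    return len(values) == len(set(values))


def suggest_cardinality(dimension_rows, key):
    """Fact side stays 'many'; the dimension side is 'one' only if its key is unique."""
    return ("many", "one") if is_unique(dimension_rows, key) else ("many", "many")


# Hypothetical sample rows for illustration.
products = [
    {"product_id": 101, "product_name": "Tire"},
    {"product_id": 102, "product_name": "Bottle"},
    {"product_id": 103, "product_name": "Cap"},
]
products_with_duplicate = products + [{"product_id": 101, "product_name": "Tire"}]

print(suggest_cardinality(products, "product_id"))                 # ('many', 'one')
print(suggest_cardinality(products_with_duplicate, "product_id"))  # ('many', 'many')
```

When the check fails or has not been run, falling back to many to many matches the advice given earlier in this lesson to leave the default in place.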
I'm just going to call it big data source. Now, in order not to lose those data sources in Tableau Public, we have to publish them to our public account. I will do that. We go to the sheets over here. Let's just take something like the customers and drag and drop it on the rows. Then I go over here and publish it with Save to Tableau Public. I have to sign in, then I'm going to call it data sources and save. Now it starts publishing to our profile. If you want to download the file, you can go over here and download the Tableau workbook. All right, with that we have created two data sources on top of our datasets, and we can use them in the whole tutorial. All right, with that, you have learned everything about Tableau data modeling in data sources and how to combine tables using the four methods. In the next section, we will start talking about the metadata in Tableau. There we will learn many important Tableau concepts for data visualization. 89. Tableau | Section: Tableau Metadata: The metadata of Tableau. Understanding the Tableau metadata concepts like data types, measures, dimensions, discrete, and continuous is very important in order to build correct data visualizations in Tableau, and it also helps you understand how Tableau works with your data. First, I'm going to introduce you to the metadata in Tableau so you learn what happens to your data once you connect it to Tableau. Next, we're going to dive into all the data types in Tableau, like integer, string, date, and so on. After that, we're going to learn about the data type roles, like the geographic role and the image role. And after that, we're going to cover very important concepts in Tableau: dimensions, measures, discrete, and continuous. And of course, in order to understand the differences between them, we're going to compare them side by side.
So now let's start with the first topic, where we get an overview of the basic concepts of metadata in Tableau. So now let's go. 90. Tableau | Introduction to Metadata: All right, so now we're going to have a quick introduction to the Tableau metadata in the data sources in order to understand what happens to our data once we connect it to Tableau. After connecting our data to Tableau and building the data model in the data source, the next step is to check the metadata of the tables and the fields. Once you connect your data to Tableau, Tableau starts analyzing the content of your data to make assumptions about the type and role of each field in the data source. Tableau assigns each field a type like integer, string, date, and so on. Data types give us information about the kind of data stored inside our datasets. This piece of information is very helpful for Tableau in order to understand how to deal with your data: which rules, operations, and calculations can be performed. One more thing that Tableau does is assign each field a role. These roles help Tableau build the visualizations. In the first set of roles we have dimensions and measures. Dimension fields define the level of detail of the view, and fields with the role measure are used for aggregations in the view. We have another set of roles, discrete and continuous. These roles help Tableau when plotting the visuals. Discrete fields break the view into separate values, and fields with the continuous role plot an unbroken chain of connected values in the view. I call all this information about your fields the metadata in the Tableau data source. One more thing that I want to tell you is that the assumptions Tableau makes about your fields are correct around 90% of the time. That means there is a possibility that those assumptions from Tableau are wrong.
That's why it's very important, after you build the data model, to double-check the metadata and make sure that all the information is assigned correctly. Otherwise you're going to have bad quality and bad results in the visualizations. All right, so next we're going to do a deep dive into these important concepts in order to understand them and the differences between them. So that was a quick introduction to the metadata in Tableau. Next we will dive into the basic data types in Tableau like integer, string, date, and so on. 91. Tableau | Data Types: All right, so we can find data types not only in Tableau, but in all programming languages. But they don't all support exactly the same data types. That's why, if you are learning a new programming language or an application like Tableau, it's very important to understand which data types it supports. Now the question is, what is a data type? The data type gives us information about the kind of information stored inside our data. This piece of information is very important for programming languages and applications like Tableau in order to understand how to deal with your data: which rules, operations, and calculations can be performed on top of your data. Now, if you look closely at our data, you can see that each field in our data source is assigned a small icon or symbol. Those icons indicate the data type of each field. One more thing: once we connect our data to Tableau, Tableau analyzes our data in order to automatically assign the correct data type to our fields. Most of the time Tableau does it correctly, but sometimes things go wrong, or you want to change the data type of a specific field. This is really easy. You can do it either on the worksheet page or on the data source page, and you will get exactly the same effect. Let's go to the data source page. Let's go to the orders and click on the icon over here; you can see it's Number (whole).
We can change it to a string. To do that, we just click on String and that's it: we just changed the data type of the order ID. But let's say we want to change it back to what Tableau assigned at the start. We go to the icon over here again, and then we go to Default. It's back to the original data type that Tableau assigned at the start. One more thing to notice is that data types are really sensitive in joins and relationships. For example, if we go to this relationship over here between the orders and the customers, the key is the customer ID. Those keys should have exactly the same data type. Let's say we go to the orders and change the customer ID from number to string. We go to String over here and it changes immediately. You can see in the data model that the relationship between the orders and customers is now broken. The tooltip says there is a type mismatch between the customer ID as a string and the customer ID as a number. As you can see, Tableau is very sensitive to the data type of the key. Whether you are using relationships, joins, or data blending doesn't matter: the keys should have exactly the same data type. Now, in order to correct it, as you can see, we no longer have the data preview in the data grid, so how can we change the data type now? We go to the metadata grid and do the same thing. We go to the customer ID, click on the data type icon, and change it back to Default or to Number. I'm just going to click on Default, and Tableau is happy now and the tables are related again. The third way to change the data types is on the worksheet page. Same thing over here: you can go to the icons and change the data type. As you can see, it's really easy. In Tableau we have a bunch of different data types that we're going to cover in this tutorial, and I group them into three categories.
First we have the six basic data types: Number (whole), Number (decimal), String, Date, Date & Time, and Boolean. In the second group, we have roles: geographic roles and image roles. And in the last group, we have advanced data types like group, cluster group, bins, and set. This group contains special data types introduced by Tableau for data visualizations, and they are specially made to organize our data. In this tutorial, we're going to focus on the first two groups, the basic types and the roles. For the advanced data types, I'm going to dedicate another full tutorial just to them. All right, now let's start with the first group, the basic data types, where we're going to do deep dives into each type in order to understand them. Let's go. All right, so now we're going to talk about the data type number. If our data contains only numbers and nothing else, just the digits 0-9, then we can call it a number data type. It's very important to understand that numbers cannot contain any other characters. For example, let's say that we have the following phone number in our data. We cannot call this type of data a number because it contains characters: we have the minus and we have the plus, and the number data type can only have the digits 0-9. Now, if we remove those characters from the phone number, it's going to look like this, and only now can we give it the data type number in Tableau. The data type number has this icon; it's like a hash. For numbers, we have two data types in Tableau: Number (whole) and Number (decimal). So what is the difference between them? You know, in math, a positive or negative number can be split by a decimal point. The first part we call the whole number, and the second part we call the decimal. If your number does not include a decimal point or any fraction, then we can call it a whole number, like 3, -100, 0, and so on.
But if your number contains a decimal point and a fraction, then we call it a decimal number, like 2.4 or 13.99. And here you need to be careful which one you are using, especially if you are making calculations in Tableau. For example, if you want to divide two numbers like 1/2 and the output field has the data type whole number, then the result is going to be zero. But if it has the data type Number (decimal), then the result is going to be the correct 0.5, and this is exactly the difference between those two data types. All right, so now let's check our fields in Tableau to find out which ones have the data type number. I would say, let's check the orders over here. You can see we have the order ID, customer ID, and product ID. By just checking them, you can find that all of them are numbers: they don't have characters and they don't have fractions. That means they should have the data type Number (whole). As you can see, all of them are Number (whole). Let's check the other fields on the right side. We have here sales, discount, and profit. As you can see, they have fractions, so those numbers should be Number (decimal). Let's check that. You can see Tableau automatically figured out that those numbers are Number (decimal), but the quantity is whole because we don't have any fractions here. That's it, everything is fine. All right, now we're going to talk about the data type string. The string data type is one of the most widely used data types in all programming languages. A string is a sequence of characters, and it can include anything: letters, numbers, spaces, and any other type of character. You can think of a string as plain text, and any field in our data source could be a string. String is like a default data type, and it has no rules like the other data types. That means you can convert any field in your data source to a string data type without any problem.
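As a rough illustration outside Tableau, here is a minimal Python sketch of the two points above: a 1/2 result stored as a whole number loses its fraction, and any value can be turned into a string while the reverse can fail. The helper names `to_whole_number` and `digits_only` and the sample values are hypothetical, and Python's conversion rules are only an analogy for how Tableau treats these types.

```python
def to_whole_number(value: float) -> int:
    """Mimic storing a result in a whole-number field: the fraction is dropped."""
    return int(value)


def digits_only(text: str) -> bool:
    """True if the text contains only digits, so it could be typed as a number."""
    return text.isdigit()


decimal_result = 1 / 2                  # kept as a decimal number
whole_result = to_whole_number(1 / 2)   # the fraction is lost

# Any value can be represented as a string (plain text).
samples = [42, 13.99, True, "+49-123-456"]
as_strings = [str(value) for value in samples]

# A phone number with plus and minus signs is not a number;
# after removing those characters it is.
raw_phone = "+49-123-456"               # made-up phone number
cleaned_phone = raw_phone.replace("+", "").replace("-", "")

print(decimal_result, whole_result)
print(as_strings)
print(digits_only(raw_phone), digits_only(cleaned_phone))
```

The takeaway is the same as in the lesson: check the type of the output field before trusting a calculation, and remember that converting to a string always works while converting to a number depends on the content.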
Tableau also uses the string data type when it can't find any other suitable data type for your fields. Let's check in our datasets where we can find fields with the data type string. Let's check the products first. Over here, you can see we have two strings, the product name and the category. In the product name, we have characters, spaces, and numbers, so it has the data type string. Let's check the customers. Over here, we have the first name and last name, and both of them are strings. But now you might notice something and ask: we have city and country, and both of them contain characters, so why don't they have the Abc icon? Are they strings? Well, the answer is yes, because if you just click on the icon, you can see that Tableau assigned it to a string. The difference here is that they have an extra role, the geographic role, and you can see Tableau assigned it to Country. Tableau gives it another icon just to indicate that this field has a geographic role, but the basic, main data type is a string, and the same goes for the city. Okay, now we're going to talk about one of the most confusing data types: the date. If your field stores calendar information, then this field is going to have the data type date. Dates have very different formats in different countries. For example, in Germany, we have the following date format; you see we use dots instead of slashes. The international format follows another rule, where the parts of the date are separated by a minus sign. And in the world there are many, many different formats. Dates follow specific formats, and we describe them with codes. For example, for the international format we have this code: it starts with the year, and the year has four digits, which is why we have Y four times. Then we have a minus and two digits for the month, MM, and then a minus and two digits for the day, DD.
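To make the idea of format codes concrete, here is a small Python sketch. It is only an analogy: Python's `strftime` and `strptime` use their own codes (`%Y`, `%m`, `%d`) rather than the letter codes described above, and the sample date is made up.

```python
from datetime import date, datetime

sample = date(2023, 3, 15)  # hypothetical sample date

# International style: four-digit year, month, day, separated by minus signs.
iso_style = sample.strftime("%Y-%m-%d")

# German style: day, month, year, separated by dots.
german_style = sample.strftime("%d.%m.%Y")

# Parsing works the other way around: the code tells the parser
# which position holds the day, the month, and the year.
parsed = datetime.strptime("15.03.2023", "%d.%m.%Y").date()

print(iso_style)         # 2023-03-15
print(german_style)      # 15.03.2023
print(parsed == sample)  # True
```

The same calendar date is stored once and only its presentation changes, which is why a format code can be swapped without touching the underlying data.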
So there is a code for each part of the date: the day, month, year, week, and so on. In this table, which I'm going to link in the description, you can find all those codes and their descriptions. With that, you can customize the date format as it suits you. Don't worry about it, though: Tableau understands almost all the date formats that we have in our data. We could have not only calendar information, but also information about the time. For that, Tableau has another data type, which we call date and time. In programming languages or databases, you might have heard of it as a timestamp, but in Tableau we call it date and time. It might look like this: we have the date, then a space, and afterwards we have information about the hour, the minute, and the seconds. Like the date, it can have different formats as well. You could have the milliseconds, the time zone, and many other things. So here we have again a table of all the codes for the time information, and you can find it at the same link. All right, so now let's check our data to find out which fields have the data type date. Usually, in a star schema data model, all the dates are placed in the fact table, and our fact table is the orders. Let's check that. You can see we have two fields with the date icon: the shipping date and the order date. It's not date and time, because we don't have information about the time in the data. So both fields are dates; we can check here and here as well. The other tables, products and customers, don't have any dates or times, because they are dimensions. They are not events, and they usually don't have any information about the date. All right, so now let's go back to our orders, to our two fields. As you can see, the format here is that the parts are separated by slashes. Let's say that you don't want this format, you want something else. So how can we change the date format in Tableau?
In order to do that, we have to go to the worksheet page. So let's go to the worksheet page over here. Now you have to decide something: do you want to change the date format for the whole workbook, for all visualizations? That means you are changing the default format of the date. Or do you want to change the format only for this view, only for one visualization? Let me show you how you can do both. Let's put something in our view. I'm going to take the order ID and drag and drop it over here. Let's work with the order date. I'm going to drag and drop it on the view. Tableau is going to show it as a year, but I want the exact date in order to see the format. As you can see, our date has the following format. Now I want to change the default date format for the whole workbook. In order to do that, we go to the left side, to the order date, and right-click. Then we go to Default Properties, and here you can find Date Format. If you click on that, Automatic is what Tableau figured out at the start, and then we have some predefined formats from Tableau. What is interesting is that at the end we have Custom. Our new format for the date is going to be separated with dots, and the year is going to have only two digits. The format code is going to look like this: DD for the day, then a dot, MM for the month, then a dot, and for the year only two digits, so YY. Let's hit OK. As you can see, Tableau changed the date format. Now let's duplicate this worksheet over here by right-clicking on it and then Duplicate. As you can see, in the next worksheet we have exactly the same format that we defined. This means that the format we defined is now the default for the whole workbook. But now let's say that I want to change it only locally, in one visualization, and I don't want to change the default format for the date. Let's duplicate that once again.
Now, instead of going to the left side, we're going to stay in the view, go to our field, right-click on it, and then go to this one here, Format. Once you do this, on the left side the Data pane switches to the Format pane. Over here on the left side you can see Dates. If you click on that, we get exactly the same options. Those are the predefined formats from Tableau: we have Automatic at the top, and at the bottom we have Custom. Now let's choose one of the predefined ones. I'm going to take the week and the year. Let's click on that. As you can see, Tableau changed the date format in this view. Now it's interesting to check the other sheets to see whether the date format changed. Let's go back to the previous sheets, and you see they stayed at the default format of the date. With this, you have learned how to customize the format of the date for a specific view or for the whole workbook. But now I want to change the date format back to how it was before. In order to do that, I'm going to go over here and close this Format pane. Then I go to the Order Date again, right-click, Default Properties, Date Format, and then just click on Automatic and hit OK. As you can see, we have the same old format again. That's it, this is how we can work with the data type date. All right, now we're going to talk about the last data type in the basic category, the Boolean data type. The Boolean data type represents a field that has only two values, true or false. It's like the language of computers, where we have only 1 and 0. This data type is often used as the output of a condition or logic. For example, if I ask you, do you like this video so far, the answer is going to be yes or no. If you like this video, please give it a like. The answer to this question can have the Boolean data type: either yes or no, true or false, and no other values. And don't forget to subscribe. The Boolean data type has many use cases, for example, controlling the workflow of something.
If the output is true, then do something. If false, then do something else. All right, so now let's check whether we can find any Boolean data type in our orders. We can check over here: we don't have any Boolean data type, and in the customers as well, nothing. And in the products, we don't have any field with the Boolean data type either. Usually, the Boolean data type is added once we use conditions in Tableau and once we create new calculated fields. To create a calculated field, we go to the worksheet page, to sheet number one. Make sure to select the small data source. Then we go to this small icon over here and select Create Calculated Field. Let's click on that. We get a new window to write our expression or condition. I'm going to give it the name Logic 400. Now, what are we going to check, what is our condition? If the sales is smaller than 400, then it should be true; otherwise it should be false. The logic is very simple. So here we write that the sales is smaller than 400, and that's it: if the sales is smaller than 400, it's going to be true, otherwise it's going to be false. Let's click OK. Once you do that, you can find on the left side a new field called Logic 400. It has the data type Boolean, and the output has only two values, true and false. Let's validate that. I'm just going to drag and drop it on the view over here. As you can see, we have only false and true. Let's see whether the logic is working. We take the order ID and put it before it. Now we need the sales, so we take the sales and drag and drop it here on the Abc. Here you can see, for example, that the first order is smaller than 400, so the logic is true, correct. The next one is above 400, so it's false, and so on. We can see that if a field has only two values, true and false, then the data type is going to be Boolean, and we usually use it as the output of a condition.
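The condition behind the calculated field above can be sketched in Python as follows. The order IDs and sales amounts are made-up sample values, and `logic_400` simply mirrors the field name used in the lesson; this illustrates the condition itself, not Tableau's calculation syntax.

```python
# Hypothetical sample orders; the real dataset has different values.
orders = [
    {"order_id": 1, "sales": 250.0},
    {"order_id": 2, "sales": 720.5},
    {"order_id": 3, "sales": 399.99},
    {"order_id": 4, "sales": 400.0},
]


def logic_400(sales: float) -> bool:
    """True when the sales amount is strictly smaller than 400."""
    return sales < 400


# One Boolean value per order, just like the new field shown in the view.
flags = {order["order_id"]: logic_400(order["sales"]) for order in orders}

for order_id, flag in flags.items():
    print(order_id, flag)
```

Note that an order with sales of exactly 400 comes out as false, because the condition is "smaller than", not "smaller than or equal to".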
The Boolean data type has a lot of use cases. For example, say we want to filter our data so that anything above 400 doesn't show up in our visualizations. What we can do is use the logic in the filter. Just drag and drop it on the filters and select only true, so I'm going to unmark false and then hit OK. As you can see, the result shows only the orders with sales less than 400. With that, we just filtered our data very easily. All right, so with that, we have covered the six basic data types in Tableau. Now let's do a quick recap. Number (whole) is for fields that store only numbers without characters, and those numbers have no fractions or decimal points. Number (decimal) is also for fields that have only numbers without characters, but those numbers can have fractions or decimal points. String is a sequence of any characters: numbers, letters, special characters, or spaces. Then we have date, which is for fields that store information about calendar dates. Next we have date and time, which is for fields that store information about the calendar as well as the time, and it also has specific formats. And as the last type we have the Boolean, which can store only two values, false or true, and we usually use it for conditions. All right, so far we have learned the basic data types in Tableau. Next we will learn the two data type roles, geographic and image roles. 92. Tableau | Data Type Roles: Okay guys, so the first role that we're going to talk about is the geographic role. If you have a field in your data that contains location information or geographical areas, then you can assign it a geographic role in Tableau based on the type of the location, such as city, country, postal code, and so on. Assigning this extra role can help Tableau to plot your data correctly.
This matters if you are using map visualizations in Tableau. There are over 12 geographic roles, but I think the most important ones are city and zip code. Now let's check our data, but first, some coffee. Let's go. All right, back to our data source. Let's go to the customers table. There we have some information about the location of the customers. Here we have three fields: Country, City, and Postal Code. In order to check the geographic role, just click on the data type icon over here. Again, it's very important to understand that each field must have a basic data type. For example, the postal code is a Number (whole). Then we assign an extra role to it. Having the geographic role will not remove the number data type. Now let's check the geographic role over here. You can see that Tableau didn't assign it to anything; it says None. This is a zip code or postcode, so we're going to correct that. We just click over here to assign a geographic role, and you can see the icon changed. With that, we have the data type number and we assigned a geographic role to it. Let's check the others. This one should be a city; let's click over here. The basic data type is a string because we have characters. Let's check the geographic role: Tableau did it correctly, we have it as a city. Let's go to the country over here. We have it as a string, and the geographic role is country. With that, we have all location information assigned correctly to a geographic role, and we can start building map visualizations in Tableau. Let me show you an example. Let's go to sheet number one over here. We can go to the customers and take the location information. Let's take the country and the city, and let's have one metric. I'm going to take the sales and drag and drop it over here on the Abc. As you can see, it's only a table. We want to switch it to a map.
In order to do that, go to Show Me over here and click on the map. You can see Tableau plotted our data correctly. Let me just close Show Me; each country is assigned its metric. This works because we assigned our data to a geographic role. All right, so now let's talk about the other one, the image role. This is brand new; Tableau just introduced it in 2022. In principle, if your field stores URLs pointing to images, then you can assign this field the image role with the URL option to show the images in the visualizations. Tableau has some requirements here. First, Tableau supports only those three image extensions, and the URL should begin with HTTP or HTTPS. Another requirement is that the maximum number of images in each field is 500, and then we have the image size, which should be less than 128 kilobytes. Those things might change over time, since it's a completely new feature in Tableau. I think the most common use case for this is to show product images in your visualizations. All right, so now let's see an example in Tableau of the image role in our datasets. I have prepared some URLs inside the products table, but only in the small dataset. So let's check that. If you go to the products over here, we have a field called product images, and here we have URLs pointing to images on my website. Now let's check the data type. Over here, it is the data type string. This is the basic one, because a URL is a sequence of characters. Now we can add an image role on top of this basic data type. It's really easy: we just go over here to Image Role and click on URL. So let's do that. With that, we have a new icon that indicates that this field has the image role. Let's check the data. We go to sheet number one, then to the products; make sure we are selecting the small data source. Then we go to the product images field and just drag and drop it over here.
As you can see, now we have some images of the products, but two of them are broken. I think it's still a bug in the desktop version of Tableau Public, because if we publish now to Tableau Public on the web, we're going to have all the icons correctly. So now we can go and grab another field. Let's take the sales and drag and drop it over here. With that, we have nice images next to the metric. Let's go and publish that to Tableau Public. I'm going to call it View Image. Let's save. As you can see, in Tableau Public we now have all the icons and nothing is broken. I think if you are building dashboards about products, it's really nice to show the image of the product instead of the name. It's just more catchy to have images inside the visualizations. All right, so that's all for the data types. Next we will learn very important concepts, the dimension and measure roles in Tableau. 93. Tableau | Dimensions vs Measures: Dimensions and measures in Tableau. Once we connect our data to Tableau, Tableau analyzes our data in order to assign each of our fields to either a dimension or a measure. This kind of metadata helps Tableau plot our visualizations. All right, so now the question is, what are dimensions and measures? Well, Tableau didn't invent the concept of dimensions and measures. It is an old concept from BI. Now we're going to have a quick origin story. If you have learned the concepts of data warehousing and business intelligence, you might already know that the core concept is multidimensional OLAP, online analytical processing. The concept says that if you want to answer business questions or do data analysis, you first have to build a data model that has the shape of a cube with multiple dimensions. It's something like this cube. Each cube has two kinds of information. First we have the dimensions of the cube, and second we have the cells. Those cells can store information like data and numbers, and we call them measures.
Each cube has two kinds of information, the dimensions and the cells, the measures. Now let's have an example. We have the sales cube, and it has three dimensions. The first dimension is the location. Inside the location, we have three members: USA, France, and Germany. Those three values are the members of the location dimension. We have another dimension called time, and it has three members: January, February, and March. And as the third dimension, we have the categories. Now, inside the cells of the cube, we have the measure Sales. Our cube is ready with the dimensions and the measure, and we can start answering business questions. For example, find the total sales in the USA. What happens? We select the location dimension and filter it to have only the member USA. This operation on the cube is called slicing the cube. Then we can aggregate the measure, and we get total sales of 120. If you have a cube, you can do multiple operations like slicing, dicing, roll-up, drill-down, and pivot. So if you have such a cube, you can do data analysis and find fast answers to business questions. Now to summarize: dimensions contain qualitative values. They usually describe something, like the product name, the product category, or the customer location. We use dimensions to categorize, filter, and show the level of detail. On the other hand, we have the measures. They contain numeric, quantitative values that can be measured, like the name says. And measures, unlike dimensions, can be aggregated. All right, so this might still be confusing, and you might say: if I look at my data, how do I decide whether a field is a dimension or a measure? So here is my decision-making process. First I check the data type of the field, whether it is a number. If the answer is no, then this field is a dimension. But if the answer is yes, then we can ask the next question.
Does it make sense to aggregate the values of the field, like doing a sum calculation on the values or finding the average value? If the answer is yes, then it is a measure. But if the answer is no, then it is a dimension. So what this means is that all non-numeric fields are dimensions. For numeric fields, it really depends on whether it makes sense to aggregate the values. If yes, then it is a measure. If no, then it's a dimension. Okay, so now let's practice, in order to understand the concept of dimensions and measures and how they work. We will check our datasets and assign each field to either dimension or measure. We're going to do the customers table together, and then you can pause the video in order to do the products and the orders. At the end, we're going to check the result together. So let's go. We start with the first field, the customer ID. The customer ID is a number, so we cannot say it is automatically a dimension, and we jump to the next question: does it make sense to aggregate it? Well, we have to understand that the customer ID is a unique identifier for the customers. For example, Maria has the customer ID number one, and Martin has four. If we sum all those values, we get the value 15. Or if we take the average, we get the value three. Those values don't make any sense, because we use the customer ID only to identify the customers. I don't think we will ever be in a situation where we have to find the average of the unique identifiers, since it makes no sense. So this field is a dimension, and with that, we can assign the customer ID to a dimension. Now let's go to the next one. It is much easier, because here we have the first name, and it is not numeric, so it is automatically a dimension. The same goes for the last name. It is a string as well, not a number. All right, so now let's move to the next one. We have the postcode or the zip code.
It is a number, so we can ask the question: does it make sense to do an aggregation here? Well, I don't think there will be a situation where we have to find the sum of the postcodes or their average. So here again, it's a number, but it is a dimension, so let's assign that. The next ones are easy: we have the city and the country. Both of those values are strings, so they are automatically dimensions. Let's assign them as well. Let's move to the last field, the score. It's again a number, so we can ask the question: does it make sense to do aggregations here? Well, the answer is yes. It really makes sense to find the average of the score. That's why we're going to map it to a measure. In the customers table, we have six dimensions and only one measure. Now you can pause the video in order to practice with the orders table and the products. All right, now let's check the results. As you can see, in the orders table we have a lot of measures because it is a fact table, and the fact table in a star schema is the central place for the measures. This is very normal. Let's check the fields. We have the order ID, customer ID, and product ID. They are like the customer ID: those are identifiers, and it doesn't make sense to aggregate them. That's why we have them as dimensions. Then the order date and shipping date: that information is not numeric, which means they are dimensions. And then we have the sales, quantity, discount, profit, and unit price. All those fields are numbers, and here it makes sense to do aggregations like the sum or the average. We're going to use the orders, the fact table, whenever we need a measure. Let's go to the next one, the products. This one is easy. The product ID is again an identifier, so it doesn't make sense to aggregate it, and we can have it as a dimension. That leaves the product name and the category.
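The decision process used for the customers table can be sketched in Python like this. The `Field` class and the `classify` function are hypothetical names made up for illustration, and the "does aggregation make sense" flag is a human judgment entered by hand, exactly as in the walkthrough; nothing here reflects how Tableau itself decides.

```python
from dataclasses import dataclass


@dataclass
class Field:
    name: str
    is_numeric: bool
    aggregation_makes_sense: bool = False  # only relevant for numeric fields


def classify(field: Field) -> str:
    """Question 1: is it a number? Question 2: does aggregating it make sense?"""
    if not field.is_numeric:
        return "dimension"
    return "measure" if field.aggregation_makes_sense else "dimension"


# The customers table, with the judgments made in the lesson.
customers = [
    Field("Customer ID", is_numeric=True),   # identifier, sum or average is meaningless
    Field("First Name", is_numeric=False),
    Field("Last Name", is_numeric=False),
    Field("Postal Code", is_numeric=True),   # a number, but not worth aggregating
    Field("City", is_numeric=False),
    Field("Country", is_numeric=False),
    Field("Score", is_numeric=True, aggregation_makes_sense=True),
]

roles = {field.name: classify(field) for field in customers}
dimension_count = sum(role == "dimension" for role in roles.values())
measure_count = sum(role == "measure" for role in roles.values())

print(roles)
print(dimension_count, measure_count)
```

Run against the customers table, this gives six dimensions and one measure, matching the result of the exercise.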
Both the product name and the category are strings. They are non-numeric, and that's why they are dimensions. I hope with this you have understood how I usually do it. By just looking at the data, we can decide whether a field is a dimension or a measure. All right, so now back to Tableau, and the first question is: where do I find in Tableau whether my fields are measures or dimensions? Well, there are no icons for dimensions and measures, and we also cannot check that on the data source page. In order to check the dimensions and measures, we have to go to the worksheet page. Let's go to sheet number one, and then to the Data pane on the left side over here. Let's open any table, for example, the orders. If you look closely at the orders table, you will find a fine gray horizontal line which splits the fields of the orders into two groups. The fields above the line are the dimensions, and the fields below the line are the measures. For example, we have the customer ID, the order date, order ID, product ID, and so on. Those fields are dimensions in Tableau. The fields below the line, the discount, the quantity, sales, and so on, are measures. You can find this splitter, this horizontal line, in each table. If you go to the customers over here, you will see the same line that splits dimensions from measures, and the same if you go to the products: scroll down and we have the same line again. One more thing that you might have already noticed, let me just close those tables, is that outside the tables there is a horizontal line as well. Sometimes in Tableau we create fields that don't belong to any table, and Tableau puts them just outside of the tables. They are like global fields, and for those we need a splitter as well to split the fields into dimensions and measures. Okay, so now let's go back to the orders. Now you might say: you know what, we don't need this horizontal line to identify whether the field is a dimension or a measure.
If the field has the color blue, then it's a dimension, and if the field has the color green, then it is a measure. Well, this is exactly where most Tableau developers get confused. Things get mixed up between dimensions and measures and discrete and continuous. To be honest, I was thinking the same at the start, until I found out that the color of the field indicates whether the field is discrete or continuous. We're going to talk about this concept in the next tutorial, so don't worry about that. The color does not indicate whether the field is a dimension or a measure; the position of the field does, whether it's above the line or below the line. Let me quickly show you something. Let's take any field over here, the product ID, and just drag it a little bit. Now Tableau is going to mark the horizontal line in orange to show you that anything above is a dimension and anything below is a measure. So Tableau shows that as well. All right, so now to the next question: how do I change a field from dimension to measure and vice versa? Here you have two options. Either you do it globally for the whole workbook, for all the views, or you do the change locally in one individual view. So let's see how we can do that. Let's start with the first one, where we do the change globally for the whole workbook, for all views. Let's take, for example, the order ID over here. Just right click on it, and then we go over here to Convert to Measure. Let's click on that. And as you can see, the field order ID just jumped from above the line to below the line as a measure. Now if you want to change it back to a dimension, just right click on it and then Convert to Dimension. That's it, it's really easy. Now let's see how we can do the change locally in one view without affecting the whole workbook. Let's take again the order ID, drag and drop it over here, and here we're going to right click on it in the view.
And then we're going to go to the measures to convert it to a measure. Currently it is a dimension. Let's go to Measure, and we have to select one of those calculations. Let's take, for example, the sum. Now as you can see, the order ID is a measure only for this view, but the order ID on the left side, for the whole workbook, stays a dimension. That's it, this is how easily you can convert between measures and dimensions. Let's have an example in Tableau in order to understand the main purpose of measures and dimensions. Let's go to the orders on the left side over here, in the small data source, and let's take one measure, the sales. We're just going to drag and drop it on the Text over here. As you can see, Tableau immediately starts doing aggregations on the measure. Now if you check the data, we have only one number. This is the total sales that we have in our dataset. We are now at the top level of detail, where everything is aggregated into only one number. And now we have to add more information in order to understand this number. In order to do that, we're going to use dimensions. For example, let's go to the products over here and take the category. So I'm just going to drag and drop the category over here. And as you can see, the dimension is now splitting our measure into two rows. So that means we are now one level of detail lower than the top aggregation. And now let's take another dimension, the product name. So let's just drag and drop it over here near the category. And as you can see, this dimension gives us a different level of detail about the sales than the first dimension, the category. What happened? We just moved one more level down in the details. Now let's take a third dimension, the order ID from the orders. Just drag and drop it near the product name.
Now as you can see, this dimension brings us to the lowest level of detail, where the aggregation of the measure is exactly the same as the original value. As you can see, the dimensions define the level of detail in our views, and each dimension can take us to a different level of detail. If you want to go back to the top level of detail, you always have to remove all dimensions and keep only the measure. See, as we are removing those dimensions, we are going back to the top level of detail. Another nice way to show that is the tree map visualization. Let me just go back over here to have one dimension. Let's go to Show Me and then click on the tree map. Now you can see our data is split into only two parts. Now as we add dimensions, let's take again the product name over here and drag and drop it on the Label, you can see the view split into more details. And if we go to the lowest level, if we take the order ID again over here to the Label, we can see the view is split even further. Now I'm going to tell you a small secret. If you follow it, you can generate hundreds of reports, even if you have small datasets. If you combine any measure with any dimension, you will be creating a new view or a new report with a title following this pattern: measure by dimension. For example, sales by product, profit by category, quantity by country. So if you follow this pattern, you can generate an endless number of reports and views in Tableau. All right, so now if you count the dimensions and measures in our small dataset, we have around 16 dimensions and ten measures. So that means if you follow this rule, you can generate around 160 views and reports. So even though we have a small dataset, we can generate a huge number of views and reports. As you can see in the visualizations, if we combine both of them, we're going to have sales by order date, sales by shipping date, sales by country, and so on.
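The measure-by-dimension pattern can also be sketched outside Tableau. The snippet below is only an illustration in plain Python, not how Tableau works internally: the sample rows, the field names, and the helpers `measure_by` and `report_titles` are all made up for this example. It shows how adding dimensions moves the same measure to a lower level of detail, and how 10 measures combined with 16 dimensions give 160 possible report titles.

```python
from collections import defaultdict
from itertools import product

# Hypothetical sample rows (not the course dataset).
ROWS = [
    {"category": "Accessories", "product_name": "Gloves", "order_id": 1, "sales": 20},
    {"category": "Accessories", "product_name": "Caps", "order_id": 2, "sales": 10},
    {"category": "Clothing", "product_name": "Jeans", "order_id": 3, "sales": 50},
    {"category": "Clothing", "product_name": "Jeans", "order_id": 4, "sales": 30},
]


def measure_by(rows, measure, dimensions=()):
    """Sum `measure` for each distinct combination of `dimensions`."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


def report_titles(measures, dimensions):
    """Build one 'measure by dimension' title per pair."""
    return [f"{m} by {d}" for m, d in product(measures, dimensions)]


# No dimension: everything collapses into one total (top level of detail).
total = measure_by(ROWS, "sales")
# One dimension: the total is split into one row per category.
by_category = measure_by(ROWS, "sales", ("category",))
# Adding order_id reaches the lowest level: one row per original record.
by_order = measure_by(ROWS, "sales", ("category", "product_name", "order_id"))

# Placeholder names standing in for the 10 measures and 16 dimensions.
titles = report_titles(
    [f"measure_{i}" for i in range(10)],
    [f"dimension_{i}" for i in range(16)],
)
```

With these made-up rows, `total` holds a single number, `by_category` holds two, and `by_order` holds one entry per record, which mirrors the three levels of detail shown in the view.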
All right, so now let me just show you how we usually build reports in Tableau using dimensions and measures. We're going to work now with only one measure, the sales, and we're going to make a dashboard about it. So let's stay at the small data source and take the sales from the orders. Let's just drag and drop it on the rows. And now the dimension is going to be the product name. Let's take the product name from the products and drag and drop it over here. So that's it. Now we have to call it Sales by Product. Let's just rename the sheet over here: right click, Rename, Sales by Product. All right, so now we're going to create another one using the same measure and a different dimension. What we're going to do is just duplicate it. Right click on it and Duplicate. We're going to have now the sales by category, so I'm just going to rename it again. Let's call it Sales by Category. Now we're going to remove the product name from here. Just drag and drop it somewhere on the white space. And then we go again to the products and drop the category on the columns. Now we're going to use a different visualization. I'm going to go to Show Me over here, and let's use the pie chart. Click on that. All right, now we have a pie chart, but I would like to show the values. We go to the Label over here, click on it, and click on Show mark labels in order to show some values. That's it, this is our second one. All right, so now we're going to create the third one with another dimension. We're going to take the order date, but we're going to show only the months. We're going to go over here and duplicate it again. Just rename it, I'm going to call it Sales by Month. We will go now and remove the category. Just drop it here. And then let's take the order date and drag and drop it on the columns. We're going to switch the visualization to bars, so I'm going to click over here on the bars, as you can see here.
Tableau is going to show the years of the order date. We want to have it as months, so we have to switch that. Just right click on the dimension and then over here select the month. Let's do that. Let me just close the Show Me over here, and then let's add some labels. All right, so that's it for this view. Let's make the last one, sales by country. Let's duplicate this again and call it Sales by Country. Then we're going to remove the dimension order date and take the dimension country. Just drag and drop it on the rows. Now since we have the country, we can change it to a map. Let's do that. We go to Show Me over here and then select the map. Click on that. All right, so now we have a map showing the sales by country. Now that we have those four reports, or sheets, we can build a dashboard. In order to create a new dashboard, we're going to go to this icon over here and click on it. Before we start, I'm just going to give it a name. Let's call it Sales Dashboard. Okay, now we're going to drag and drop all the sheets. We're going to start with the country. Let's just drop it here in the middle. Then we're going to take the category and put it just beneath it, and then the product beside it. Let's resize it a little bit to the left. And then we're going to take the last one, the months, and put it over here. As you can see, with just four dimensions and one measure, we were able to make a dashboard about the sales, just by following this small rule: sales by country, sales by category, sales by product, and sales by month. Always measure by dimension. Now it's really easy to practice: just go and pick another measure with different dimensions and build different dashboards. All right, so now let's have a quick summary where we compare dimensions and measures side by side in order to understand the differences between them. Let's start with the definition.
Dimensions are fields that contain descriptive values, and measures are fields that contain quantitative, numeric values. For example, we have dimensions like product category, country, and customer ID. And on the other hand, we have measures like sales, profit, and quantity. The next point is about aggregation. Dimensions cannot be aggregated, as each member of the dimension is unique. Measures, however, can be aggregated using functions like sum, average, min, max, and so on. For example, you can calculate the total sales for a specific product category. Moving on to the data types. All the different data types can be used as dimensions: string, date, Boolean, and even numbers, as we have learned with the customer ID. But only fields with the data type number can be used as a measure. The next point is about the role in analysis. Dimensions are typically used for grouping, filtering, and organizing your data. Measures, on the other hand, are used for calculations and numeric analysis. The final point is about the granularity. Dimensions define the level of detail of the data, while the granularity of measures determines the quantity being measured. These are the main differences between dimensions and measures. All right, so that's all about dimensions and measures. Next we will learn another important concept for data visualizations, the discrete and continuous roles in Tableau. 94. Tableau | Discrete vs Continuous: All right guys, so now we're going to talk about discrete and continuous. Here again, once we connect our data to Tableau, Tableau analyzes our data and makes assumptions in order to map each field to either discrete or continuous. Discrete and continuous are metadata that impact what types of visualizations you can create, as well as how they will look. Now in order to understand the concept behind them, we're going to compare discrete and continuous.
First, we're going to start with the definition. This concept comes from math. Discrete values are always separated, disconnected, distinct values. Continuous values are exactly the opposite: connected values, a series or unbroken chain of data without any interruptions. Let's have an example. Think of discrete as counting from 0 to 10: 0, 1, 2, 3, and so on. So that means from 0 to 10 we have exactly 11 distinct values. But with continuous values we have real numbers, which means from 0 to 10 we have an infinite number of values. For example, we have 1.2, 1.3, 1.4, and so on. So with discrete we have distinct values, and with continuous we have a range of infinitely many values between a start and an end. I once read about discrete and continuous, and the following analogy stuck in my head. Think about discrete values as Lego pieces. You can take them apart and work with each piece differently and independently. You can move them around and analyze them in different orders. And now think of continuous as a roll of yarn. When you unroll the yarn, you will not get different pieces. You will just see more of the yarn, so you just get a longer piece of the same string. All right, so discrete values are separated, distinct values, and continuous values are an unbroken chain of data without any interruptions. Now let's move to the next point, the colors in Tableau. The discrete fields are the blue pills and the continuous fields are the green pills. So let's see in Tableau what this means. All right, so now as usual, the first question is: how do I know whether my fields are discrete or continuous? Well, it's like with the dimensions and measures. We cannot check that on the data source page; we have to switch to the worksheet page. Let's do that. We're going to go over here. And now it's really easy. As you hover your mouse over those fields, you will see we have only two colors, the blue and the green.
And you can see those colors on the data type icons as well: we have green icons and blue icons. The fields with the blue color, like the customer ID, first name, order date, and so on, are discrete fields, and the fields with the green color, like discount, sales, unit price, score, and so on, are continuous fields. This is exactly where the confusion comes from: a lot of Tableau developers think that blue indicates dimensions and green indicates measures. Well, that's wrong. Those colors indicate whether a field is discrete or continuous. Now you know that. Next, let's see how to switch a field between discrete and continuous. Let's start with the first option, where we change the role of the field globally for the whole workbook. In order to do that, we're going to go to the Data pane on the left side. As you can see here, for example, the sales in the orders is a green pill. That means it's a continuous field, and it is a measure as well. Let's say that we now want to switch it to a discrete field. In order to do that, right click on the field, and here we have Convert to Discrete. It's really easy, so let's click on that. Now if you check the sales again, we have it as a blue pill. That means it is now a discrete field. If you check the others, all of them are continuous measures, and only the sales is a discrete measure. This change is done globally. If you go to another sheet, the sales is still a discrete field. Now if you want to switch from discrete to continuous, all you have to do is right click on it, and here we have the same option again: we're going to convert it to continuous. Once we click that, it goes back to the green pill. That's it, it's really easy. Next we're going to learn how to switch between discrete and continuous locally for only one view. All right, let's build the view. We can drag and drop the sales on the columns. Let's take a dimension, for example the category, and drag and drop it on the rows.
Now we want to switch the sales from continuous to discrete only for this view. What we're going to do is go to the sales over here and right click on it. As you can see, the current role is continuous, as Tableau marked it for us here, or you can see it from the green pill. All you have to do is select Discrete. Let's go and do that. Now the field sales is discrete for this view; as you can see, it's a blue pill. But if you go to the Data pane on the left side, the sales stays continuous with the color green. That's how you can make the change locally for only one view. So for example, if you go to another worksheet and take the sales, it is going to be a continuous measure. That's it. This is how you can switch between discrete and continuous fields locally for only one view. All right, now let's move to the next point, the filters in Tableau. A discrete field creates a filter with distinct values, but a continuous field creates a filter with a range of values. Let's have an example in order to understand what I mean with those filters. We're going to work with the big data source now, because we need more data in order to understand this. Let's switch to the big data source. Just click on it. And then let's take the sales and drag and drop it over here. And then we're going to take the subcategory from the products and drag and drop it on the rows. So now we have the sales by subcategory. Now if we want to filter those values, we can put the subcategory in the filters. And don't forget that the subcategory is a discrete field. Let's just drag and drop it on the filters and see what happens. In the new window, as you can see over here, Tableau listed all the distinct values inside the subcategory. With those discrete values, we can make decisions individually. We can include some values and remove others. Let's just do that. I'm just doing this randomly, and then I click OK.
That's it, this is how the filter in Tableau reacts if we have a discrete field inside it. We have a list of all the distinct values. We can also show this filter on the right side if we just right click on the subcategory over here and then select Show Filter. Now we have it on the right side and we can include or exclude values. Now let's see what happens if we put a continuous field on the filters. Let's take the sales again, since it's a continuous field, but instead of taking it from the Data pane on the left side, you can take it from the shelf by holding the Ctrl key and then dragging and dropping it on the filters. Since it's a continuous field and a measure, Tableau first asks us whether we want to filter on all values or on the values after the calculation. Let's go with the sum over here, since we have it as a sum in the view. So I'm just going to click on the sum and go next. This is exactly what happens if you have a continuous field as a filter: you get a range. It has a start and an end. You don't get the distinct values of all the sales. You get a range of values, and you have to define the start and the end. Here we have different options for the range, but we're going to stay with the first one. Let's hit OK. Now I want to show the filter on the right side. Let's go over here, right click, and select Show Filter. Now on the right side, you can see exactly the difference between discrete and continuous fields in filters. Let me just extend it over here. You see the sales is continuous and we have a range, so we can filter by changing the start and the end of the range. But with the discrete filter, we have all the members of the field and we can decide on each value individually. We can just select and deselect those values. All right, now let's move to the next point. We're going to talk about the changes in the view. Discrete fields create the headers of the visualization, whereas continuous fields create the axes of the visualization.
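Going back to the two filter types for a moment: the difference can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not Tableau's implementation. The rows, the sub-category names, the sales values, and the helpers `distinct_members`, `discrete_filter`, and `range_filter` are all hypothetical, and each row stands for one sub-category with its sales already summed.

```python
# Hypothetical rows; subcategory names and sales values are made up.
ROWS = [
    {"subcategory": "Binders", "sales": 120.0},
    {"subcategory": "Storage", "sales": 340.5},
    {"subcategory": "Phones", "sales": 910.0},
    {"subcategory": "Chairs", "sales": 655.25},
]


def distinct_members(rows, field):
    """List every distinct value of a discrete field, as a filter would offer."""
    return sorted({row[field] for row in rows})


def discrete_filter(rows, field, selected):
    """Keep rows whose value is one of the individually selected members."""
    chosen = set(selected)
    return [row for row in rows if row[field] in chosen]


def range_filter(rows, field, start, end):
    """Keep rows whose value lies inside the inclusive range [start, end]."""
    return [row for row in rows if start <= row[field] <= end]


# Discrete: the filter offers a list of members to tick or untick.
members = distinct_members(ROWS, "subcategory")
picked = discrete_filter(ROWS, "subcategory", ["Binders", "Phones"])

# Continuous: the filter only knows a start and an end.
in_range = range_filter(ROWS, "sales", 300, 700)
```

The discrete filter decides on each member individually, while the range filter never looks at individual members, only at whether a value falls between the start and the end.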
Okay, now let's see what this means in our view. As you can see, the subcategory is a discrete field and the sales is a continuous field. In the view over here, we have three things. We have the marks, those bars. On the left side, we have the subcategories, and we call that information headers. And as the third piece of information, we have the axis of the view. What is the difference between headers and axes? Discrete fields like the subcategory always create a header in the view. In the header over here, you have a list of all the distinct values inside our dataset, exactly as they are. But a continuous field, like the sales, creates an axis in the visualization. It's like the values inside the filter: it's a range that has a start and an end. Unlike with the headers, you cannot see all the possible values individually in the axis; you have a range with a start and an end, and intervals in between. So discrete fields create headers and continuous fields create axes. All right, so the next point is about sorting data. With discrete fields we have many options to sort the data, but with continuous fields in Tableau, it is very limited. So let's see an example. We're going to stay with the same example, and we can start with the discrete field subcategory. In order to sort the data by the discrete field, just right click on the subcategory over here on the shelf, or you can go to the header, it's exactly the same. So right click on the subcategory, and then we can select Sort over here. And now we have an extra window to set up the sort. As you can see here, we have many different options like alphabetic, field, manual, and so on. So let's go with manual over here, and here again, since the subcategory is a discrete field, we get a list of all the distinct values. Then we can change the order.
For example, by just clicking on Appliances, we can bring it down, and we can take Storage and bring it up, Binders down, and so on. So we can do it manually without any rule. As you can see, as I'm changing the values, the order in the visualization is changing as well. If you want to sort the data, use the discrete fields to do that, since we have many options. Now let's check the continuous field. I'm going to close this. Now if you go to the continuous field, the sales, and right click on it, we don't have an option here to sort the data like with the discrete fields. Instead we have only one option. If you hover over the sales, we have this very small icon, and we can use it to sort the data ascending or descending. Just click on that. And as you can see, now the data is sorted by descending values. If you click on it again, you will get the data in ascending order. Sorting the data using a continuous field is very limited, so instead we can use the discrete fields to sort the data, since we have many options. Okay, now let's move to the next one. And this is really important in order to understand what the purpose of having continuous and discrete in Tableau really is. The main use case of discrete values is to do a deep dive analysis of a specific scenario. On the other hand, we use continuous values to see the big picture and do trend analysis. Let's have an example. Now we're going to create a new view using the big data source, since we have more data there. We're going to go to the table orders. Let's take the order date and just drag and drop it on the columns. And then we're going to take one measure, let's say the quantity, and drag and drop it on the rows. Now as you can see, the order date is a discrete field and we have five years of data. But now what we're going to do is go to the order date and right click on it, as we want to see more details.
Just go to the exact date over here. Now as you can see, Tableau converted it automatically from discrete to continuous, and we have it as a green pill. That's because we have a lot of order dates, and Tableau tries to bring them all into one picture. You can see now that the order date created an axis, a range of dates. With continuous fields, you have all the data in one big picture, and that helps you to find any trend in your data. Now let's go and convert the order date to a discrete field. In order to do that, we're going to go to the order date, right click on it, and click on Discrete. As you can see, we just broke the chain and broke the visualization into individual dates. Because of that, we now have a header with all the distinct values inside our data. We have all the days and all the months of the five years in one visual. With the order date as discrete, we cannot really do any trend analysis over here, because it's a really huge visualization. After we converted the order date from continuous to discrete, we lost the big picture, and now it's really hard to do any trend analysis. But instead of doing trend analysis, we can now do a deep dive, a detailed analysis of each individual date, in order to analyze a specific problem or scenario, or to answer the question of why we have a trend in the first place. You can check the value of each date individually. We usually use bar visualizations for discrete and line visualizations for continuous. Let's change that. I will go over here to the marks and switch from Automatic to Bar. We have it now as bars. And I'm going to just duplicate the sheet, bring the order date back to continuous, and then change the visualization to Automatic. Now I have just moved both of the views into one dashboard in order to see the differences between continuous and discrete.
As you can see, if you want to do trend analysis, see the big picture, or make a report for the management without showing a lot of details, then go and use the continuous field. Now if you look at the visualization with the discrete field, you can use that if the task or the requirement is to do a deep dive analysis of the data and evaluate each data point individually. The main purpose of discrete is to do detailed analysis, whereas the purpose of continuous is to do trend analysis. All right, now let's have a summary where we compare discrete and continuous side by side in order to understand the differences between them. Let's start with the definitions. Discrete values are disconnected, separated values, and continuous values are a connected, unbroken chain of values. For example, in discrete, from 0 to 10 we have exactly 11 values. In continuous, from 1 to 2 we have an infinite number of values. The next one is about the colors. Discrete fields are the blue pills and continuous fields are the green pills. Moving on to the filters. Discrete fields generate filters with a distinct list of all the values available in the dataset. On the other hand, continuous fields generate a range filter that has start and end values. The next point is about the views. Discrete fields generate the headers of the view, showing all possible values, and continuous fields generate the axes of the view. Again, it's like a range of values. Then we have sorting. You can use discrete fields to sort your data using different options, but if you sort your data using continuous fields, you have very limited options: only ascending or descending. Finally, we're going to talk about the purposes. The main purpose of discrete is to analyze a specific scenario, like when you are doing a deep dive analysis of a specific issue.
But the main purpose of continuous is to understand the big picture of the data in order to do, for example, trend analysis. These are the main differences between discrete and continuous fields. All right, that's all for discrete and continuous. Next we'll wrap things up with a summary to get a better understanding of the big picture and the differences between all of these concepts. 95. Tableau | Data Types vs Dimension & Measure vs Discrete & Continuous: All right guys. So now what I'm going to show you is how those different metadata concepts, data types, dimensions and measures, discrete and continuous, are related to each other. All right, so we have a field in our data, and in Tableau we can assign it to different data types. It could be a string, a Boolean with true and false, a date, or a date and time, or a number, whether it's whole or decimal. Next, Tableau assigns it to another piece of metadata, either dimension or measure. Any data type that is not a number is going to be a dimension. String, Boolean, and date are all automatically dimensions, and you cannot convert them to measures. If the data type is number, we could have it as a measure, if it makes sense to do aggregation, or as a dimension. Next, Tableau assigns this field to the third metadata concept, discrete or continuous. If we have a dimension field with the data type string, it can only be discrete. We cannot convert it to continuous. In our dataset, we have the category, the first name, and the country. All those fields are string, dimension, and discrete, and you cannot change them to anything else. The same goes for the data type Boolean: it can only be a dimension and only discrete. But if we have a dimension field with the data type date or date time, as you saw in our examples, it can be continuous or discrete. We can have both. Now to the last one.
If we have a field with the data type number, it doesn't matter whether it's a dimension or a measure: we can have this field as continuous as well as discrete. All right, with this you have the big picture of all those confusing metadata concepts in Tableau. All right everyone, we now have a better understanding of the data types and roles in Tableau and these important concepts. In the next section, we will learn about renaming and aliases in Tableau. 96. Tableau | Section: Tableau Renaming: How to rename things in Tableau. As we are preparing our data sources, what we usually do is rename things, like renaming tables and columns, and even give aliases to our data. First I'm going to introduce you to the different naming conventions that each developer should know. After that you're going to learn the different techniques to rename fields and tables in Tableau. And at the end, you're going to learn the different methods to add aliases to your data in Tableau. So let's start by learning the different naming conventions and the differences between them. So now let's go. 97. Tableau | Naming Conventions: Sometimes in real-life projects, the source of your data might contain technical or unfriendly names. And when you are creating visualizations for the users or your colleagues, you have to make sure that you are using friendly names that are easy to understand and to read. That's why, after you connect your data to Tableau data sources, Tableau starts cleaning up and renaming the fields and the tables into a more friendly format. The format follows a specific naming convention that was decided by the Tableau team, which is really great. So let's first understand what a naming convention is. Naming conventions are sets of rules and guidelines that can be used to give names to things like tables, fields, functions, and variables in a consistent and understandable way.
Let's say, for example, we have the two words hello world. In order to create a naming convention, we have to decide on two things. First, the word itself, how we write it. Here we have three ways: we can use lower case, or we can decide to go with upper case, or we could capitalize the words. And the second thing to decide is the separator between the words, between hello and world. Here we have a white space, but we have different options. You could use a dot, an underscore, a white space, or even nothing. Now, for example, let's say we're going to go with lower case and the separator underscore. Then we're going to have the following name: hello_world. With that, we have a naming convention that we're going to follow through all the projects, and it's really easy to follow. At the same time, it's very important to decide on the naming convention for your data model, especially at the start of your project. If you don't do that, I promise you the look and feel of your visualizations and dashboards is going to be really bad, and the whole project is going to look unprofessional and inconsistent. And one more thing: each project team decides on a different naming convention, so there is no real right and wrong here. All right everyone. So now I'm going to walk you through the most common naming conventions used in programming languages. The first naming convention is the snake case. We use lower case in all the words and separate them using underscores, so the name at the end is going to look like a snake. All right, our example is going to be the customer name. And we're going to work with this table to fill in all the different naming conventions: an example of the output, the rules for the letter case and the separator, and in which applications and programming languages we can find this rule. We're going to start with the snake case. The letter case is going to be lower case, and the separator is going to be the underscore.
If we follow those rules with the example, we're going to have customer_name, all in lowercase. We can find this format in Python, PHP, and Ruby. The snake format is really easy and popular, and you can find it almost everywhere. And now we're going to talk about the next naming convention. We have the camel case. Here we have another naming convention that looks like an animal. In the camel case, only the first word is going to be lowercase, and all the following words are going to be capitalized. Between the words there is nothing: no separators, no dots, underscores, dashes, or anything. So at the end, we're going to have the shape of a camel. All right, so that means the second naming convention is the camel case. The rule for the letter case is the following: the first word is lowercase and the rest of the words are capitalized. For the second rule, the separator, there is no separation. There is nothing between the words, so here we're going to write no separation. Now if we apply those two rules to our example, the customer name, we're going to have the following output. The first word, customer, is all lowercase. There is no separation, which means we start immediately with the second word, but the second word is capitalized, like this: customerName. The camel case is widely used in programming languages like Java, JavaScript, and TypeScript. The third naming convention is the Pascal case. It's very similar to the camel case. The rule says all the words are going to be capitalized, so here we have capitalized. And for the separation, there is no separation; like the camel case, there is nothing. If you follow those two rules on the customer name, we're going to have the following output: the first word is a capitalized Customer, no separation, then a capitalized Name, so CustomerName.
The Pascal case is used in programming languages like Java and C#. I like this naming convention; I have used it in many projects. All right, the next naming convention is going to be the kebab case. I think by now the one who named those naming conventions should be an arbitude. As you can see, all the words are lowercase, like on a skewer, and separated with dashes, so the name is going to look like a delicious hot kebab. So the fourth one is the kebab case. The rule says the letter case is going to be lowercase, like the snake case, and the separator here is going to be the dash. If we follow those two rules on the customer name in our example, we have the following output. It's really easy: customer in lowercase, then a dash, then name, so customer-name. If you are a web developer or designer, I think you know this naming convention, because it is widely used in HTML and CSS. I think it's like the snake case; it's really easy to follow. Now we have another naming convention. This one is very important, and we call it the title case. Sadly, it has nothing to do with animals or foods. The rule says the words are going to be capitalized, and we separate the words with a white space. So here we're going to have a space. Now if you follow those two rules in our example, we're going to have a capitalized Customer, then a space, then a capitalized Name, like this: Customer Name. Why is it important? Because this is the naming convention that the Tableau team decided to go with. You can see this naming convention in Tableau: Tableau currently applies it to all your data. So once you connect your data to Tableau, Tableau is going to clean up and rename everything following this rule. Well, if you look at it, it's really friendly and easy to read.
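The five conventions walked through above can be sketched in a few lines of Python. This is only an illustration of the two rules (letter case and separator) applied to the course example "customer name"; the function names are made up for this sketch and have nothing to do with how Tableau itself renames fields.

```python
def split_words(name: str) -> list[str]:
    """Split a plain name such as 'customer name' into lowercase words."""
    return name.lower().split()


def snake_case(name: str) -> str:
    # Rule: lowercase words, underscore as separator.
    return "_".join(split_words(name))


def camel_case(name: str) -> str:
    # Rule: first word lowercase, following words capitalized, no separator.
    words = split_words(name)
    if not words:
        return ""
    return words[0] + "".join(w.capitalize() for w in words[1:])


def pascal_case(name: str) -> str:
    # Rule: every word capitalized, no separator.
    return "".join(w.capitalize() for w in split_words(name))


def kebab_case(name: str) -> str:
    # Rule: lowercase words, dash as separator.
    return "-".join(split_words(name))


def title_case(name: str) -> str:
    # Rule: every word capitalized, white space as separator.
    return " ".join(w.capitalize() for w in split_words(name))


if __name__ == "__main__":
    for convert in (snake_case, camel_case, pascal_case, kebab_case, title_case):
        print(f"{convert.__name__:12} {convert('customer name')}")
```

Each function changes only one of the two decisions at a time, which is why a team-specific convention (for example capitalized words joined by an underscore) is just another combination of the same two rules.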
But sometimes in projects we are forced, or we have requirements, to follow a specific naming convention that doesn't match the title case. Then the situation is really bad: you have to go and rename everything again. Of course, you don't have to follow one of those naming conventions. You can make your own rules and guidelines. For example, let's say this is my naming convention: the letter case is capitalized, and I would like to separate the words with the underscore. I'm just mixing stuff around. If I apply those rules to the customer name, we're going to have something like this: a capitalized Customer, underscore, a capitalized Name, so Customer_Name. And with that we have defined our naming convention. All right, so now let's check the naming conventions in our datasets and in Tableau as well. If you go through the datasets that I've prepared for this course, the small one and the big one, you can see that I'm always following the same naming convention. The words are capitalized and separated with an underscore. So for example, in the orders we have Product_ID. Or if you go to the customers, you can see First_Name, and so on. So I'm always following the same naming convention. All right, so now let's check how Tableau named our fields and tables from the datasets. You can check this information either from the worksheet or on the data source page, but on the data source page you can find more information. So now we are at the data source page. Let's go to the metadata grid. And here it's really interesting: we're going to find two field names. We have here the field name and the remote field name. What are the differences between them? Well, the information in the remote field names comes from the original datasets. And as you saw, the original dataset follows the naming convention of having an underscore between two words, with all the words capitalized.
We have, for example, Order_ID, Customer_ID, and so on. All the information we find under the remote field names comes from the original dataset, from the original source system. But the field name on the left side over here comes from Tableau, after renaming and cleaning up our fields. If you take a closer look at those names, you can see they follow the title case, where we have capitalized words separated by a white space. You can see over here we have Product ID, where the original name was Product_ID. Here, Tableau renamed our fields. It's really cool: in the metadata grid we have a mapping between the old values, the remote field names, and the new ones after Tableau renamed them. So we always have a data lineage between Tableau and our datasets. As I said, there is no right and wrong here, but it's very important to define those rules at the start of the project, before you start building any visualizations. I remember one project where we started immediately with building the dashboards and visualizations without deciding first on the naming conventions. We built around 30 dashboards in Tableau, and after a while we found out that the developers were using different naming conventions, which is really normal: if you don't define the guidelines and the rules at the start of the project, then everyone is going to make their own style. We ended up having a lot of dashboards with different rules, and the users were not happy about it at all. Then we decided on the naming conventions, and of course, we were too late for that. We spent a lot of time renaming the datasets, checking the reports, and so on. If you don't decide on the naming convention at the start of the project, especially if you have a big project, then you can have a really painful and costly process of renaming everything from scratch.
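One way to catch the kind of drift described in that project story is to check field names against the agreed convention before a dashboard is reviewed. The following is a minimal sketch under stated assumptions: the regular expressions are one possible reading of each convention, and `find_violations` and the sample field names are hypothetical, not part of Tableau.

```python
import re

# One possible pattern per convention (an assumption for this sketch).
CONVENTIONS = {
    "snake_case": re.compile(r"[a-z0-9]+(_[a-z0-9]+)*"),
    "camelCase": re.compile(r"[a-z][a-z0-9]*([A-Z][a-z0-9]*)*"),
    "PascalCase": re.compile(r"([A-Z][a-z0-9]*)+"),
    "kebab-case": re.compile(r"[a-z0-9]+(-[a-z0-9]+)*"),
    "Title Case": re.compile(r"[A-Z][A-Za-z0-9]*( [A-Z][A-Za-z0-9]*)*"),
}


def find_violations(names: list[str], convention: str) -> list[str]:
    """Return the names that do not fully match the chosen convention."""
    pattern = CONVENTIONS[convention]
    return [name for name in names if not pattern.fullmatch(name)]


if __name__ == "__main__":
    # Hypothetical field names collected from several workbooks.
    fields = ["Order ID", "Customer ID", "Order_Date", "unitPrice"]
    print(find_violations(fields, "Title Case"))
```

With the sample list above, the two names that mix in an underscore or start in lowercase are reported, while the title-case names pass.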
Make sure at the start to take enough time to talk to your users and the project team to decide on the naming conventions. And it is very important, in the review process of any new dashboard in Tableau, to check that the naming conventions are followed in each workbook, so that the whole project is consistent. All right, okay, so that was an overview of the different naming conventions. Next we will learn how to rename fields and tables in Tableau.

98. Tableau | Renaming: All right, so now let's say that you decided, together with your users and the project team, on a specific naming convention which is different from the one that Tableau uses. Now the question is how to rename in Tableau. In Tableau, we can make the following changes to a table. We can rename the table itself, or we can rename the fields inside the table. And last, we can even change the values inside these fields, also known as aliases. We're going to talk about that in the next tutorial. In this tutorial, we're going to focus on renaming the fields and renaming the tables. First, let's learn how to rename the fields in Tableau. Let's have the following task. The task says: rename our fields in Tableau following the Pascal case naming convention. So that means all the words are capitalized and there is no separation between words. All right, so now the first question is: on which page can we rename our fields? We can rename our fields either on the worksheet page or on the data source page. We're going to get the same effect. But I usually go to the data source page, since there we can find more metadata information about the fields and tables. Now the second question is: can we rename our fields globally for the whole workbook, for all worksheets? And can we also do it locally for only one view? Well, you can do both. But renaming locally for only one view is a little bit tricky.
So now let's learn how to rename our fields globally, for the whole workbook, for all views, on the worksheet page. Okay, so now let's go to the worksheet page over here. Then we're going to go to the Data pane on the left side. We will rename the Shipping Date. And here we have three methods. The first one is the drop-down. What you're going to do is right-click on it and then simply go to Rename. So we're going to click on that and rename it to Pascal case. I'm just going to remove the space between the words, then hit Enter. And that's it. It's really easy. We just renamed the Shipping Date. The second method is to use a shortcut. For example, let's go to the Order Date over here and hit F2. With that we can edit the name. So I'm again just going to remove the space between Order and Date and hit Enter. As you might have already noticed, the position of the Order Date just changed in the Data pane. That's because the fields in the Data pane are sorted in alphabetical order. That was the second method, using the F2 shortcut. And the third method to rename the fields on the worksheet page is to click and hold. For example, let's go to the Unit Price over here, left-click and hold, then release. As you can see, we can now edit the name. This is the third one. I'm just going to remove the space between the words and hit Enter. That's it. Those are the three methods of renaming the fields in the worksheet: the drop-down, the F2 shortcut, and click and hold. One more thing about renaming: unlike aliases, which we'll get to later, we can rename any type of field. Whether it's a dimension, a measure, continuous, or discrete, we can rename it, so there is no restriction on renaming in Tableau. All right, so now let's go to the next one. We're going to rename the fields on the data source page. Let's go to the data source page over here.
And here we have two places where we can rename stuff: either at the metadata grid or at the data grid. And here we have only two methods to rename stuff. The first one is the drop-down, like on the worksheet page. Let's go to a name, for example the Order Date, right-click on it, and then Rename. So we're going to remove the space between the words. And that's it. The second method to rename fields on the data source page is by double-clicking. For example, let's go over here on the metadata grid to the Customer ID and just double-click on it. Now we can edit it, and again we're going to remove the space. This is how we can rename on the data source page. We have only two methods there, the drop-down and the double-click. Sadly, we don't have any shortcuts here. All right, so now we have the following scenario, where we have renamed the fields several times and we forgot the original names of the fields. In this case we can reset everything back to the original names. And we can do that either on the data source page or on the worksheet page. Let's see how we can do it on the data source page. Just go to the field, for example the Customer ID, and right-click on it. Then here we have the option Reset Name. Let's click on that. As you can see, now we are back to the original name of the field. I found it really strange, because I would also like to have the option of resetting to the Tableau naming convention. Now let's see how we can do that on the worksheet page. I'm going to switch back and then go to the Data pane. Let's pick the Order Date. And now we're going to edit the field again. So right-click on it and then Rename. Then you can see over here a very small icon to reset to the original name. By clicking on it, we reset the field to the original field name. All right, so now let's say that you have a lot of fields and you want to reset all of them. Instead of resetting them one by one, we can do a multi-selection and then reset.
And we can do that on the data source page. So let's switch there. It doesn't matter whether you work with the metadata grid or the data grid. So now what we're going to do is go to the Order ID, click on it, and then hold Control. Select the next one, and then we're going to select the Unit Price as well. Then right-click and Reset Names. Once you do that, you're going to reset all of them, which is really nice. So we have the Unit Price reset, the Shipping Date, and the Order Date. All right, so now we have the following scenario, where you are in a project and you have already built a view. But afterward you decided to do some renaming. What can happen to our view if we rename? For example, here in the view we have Order_ID, and we want to rename it back to the Tableau name. So we're going to go to the Order_ID, hit F2, and then instead of the underscore, I'm just going to leave a white space. As you can see in the view, Tableau changed the name automatically to the new name. Well, you might say, okay, so what? This is expected: if I change the name in the data source, it's going to change in the visualizations as well. Well, this is only in Tableau. If you are using other tools like Power BI and you rename things in the dataset, the whole visualization is going to break. So if you have the task of renaming, it's going to happen fast in Tableau, but in Power BI projects it's going to be really painful. All right, so far we have learned how to rename the fields globally for the whole workbook. Now the question is how to rename locally for only one view. And here it depends on the field role, discrete or continuous. So let's start now with the continuous. As we learned before, continuous fields generate the axes of the view. So here in this example, as you can see, the Quantity and Sales are green pills. That means they are continuous and they generated the axes of the view.
Now to rename the Quantity over here and the Sales, it's really easy. What we're going to do is go over here to the axis, right-click on it, and then go to Edit Axis. Let's go there. Then here we have a new window. And if you go over here, you can see the axis title. The current title is Quantity. Let's go to the field over here and change it from Quantity to Quantities. Then let's close this. As you can see, the field is now called Quantities on the axis. And if we check the Data pane over here, the field stays as Quantity. We made this change only locally, in this view. This is really easy for the continuous fields. But the tricky part is when we have a discrete field. For example, the Order ID over here is discrete; we have the blue pill. This one is going to be tricky. Now, we're going to change the name from Order ID to Orders. What we're going to do is go to the blue pill over here on the rows and double-click on it. Type two forward slashes, write the word Orders, then add a line break. And that's it. Go outside and just click here in the white space. And as you can see, now we have renamed it to Orders, as you see here in the view. But we didn't change the global name; it stays as Order ID here in the Data pane. This is how we rename the discrete fields locally, in one view. It's not really clear and it's tricky, but let me show you how I usually do it. Let's take another field, the Category over here. We're going to change it from Category to Categories. What I usually do is go over here, double-click on it, and just copy the name. Then I go to a text editor and paste the name. Then before it we're going to have a new line with two forward slashes and the new name, Categories. And that's it. Then I'm going to copy it from here and go back to Tableau. Then again, on the Category pill over here, double-click on it. Then I remove this part and just paste the new stuff. Then Enter. So that's it, this is how I usually do it for the discrete fields.
I go to the text editor and prepare it there, since it's clearer to me what I'm writing. All right, so now we have learned all the different methods of renaming fields in Tableau: on the data source page, on the worksheet page, globally and locally. All right, so now we're going to move to the next point, where we rename the tables in Tableau. Here again, we can make the changes either on the data source page or on the worksheet page, using the same methods as for renaming fields. As for the next point, locally versus globally: you can change the table names only globally. So anything you do affects all the views, which is not as critical as with the field names. Now let's see how we can do it on the worksheet page. We're going to stay with the small data source over here, and let's minimize everything so we see the table names. You might have already noticed that the names contain .csv. That's because our datasets come from CSV files, which is not really useful information to see in the data source. So we can go and clean up the name and rename it to only, for example, Customers. We can go to the name over here, right-click on it, and then click Rename. So I'm going to rename it to only Customers. For the next one, we're going to use the second method, the F2 shortcut. Let's hit F2 and remove the CSV part, so we have only Orders. And we're going to use the third method for the Products: just click and hold, then remove the CSV part. Those are the three methods for renaming tables on the worksheet page. Now let's make the changes for the big data source on the data source page. Let's switch there. We're going to go to the data source page. Here you have two places to change the table names: either at the data model or at the metadata grid. We cannot go to the data grid to rename tables. First, let's switch to the big data source. I'm going to go over here to the big data source. Let's change the Orders at the data model.
Here we have only one method: right-click on it and Rename. So we're going to remove the CSV part, and then we go to the Customers over here. Then let's go to the metadata grid. As you can see, you just click over here and you can remove the CSV part. So that's it. And now for the last one, we have to rename the Products. We can go over here and select the Products, and then we can rename it on the data source page. So that's it, this is how you rename the tables. On the data source page, we have the data model and the metadata grid. With that, you have learned all the possible methods for renaming tables in Tableau. All right guys. So with that, we have learned how to rename things in Tableau. Next we will learn how to add aliases in Tableau.

99. Tableau | Aliases: Let's first understand why and when we need aliases in Tableau. Sometimes in Tableau projects we face the following situations. The first one is when we have poor data quality in our datasets, like typos or inconsistent values, and we have to somehow clean up our data before we start building our visualizations. For example, we have the following scenario: in the table Customers, we have bad data quality inside the country field. Here we have a typo. Sometimes it's Germany, sometimes it's Deutschland; sometimes they call it USA, and then America. The data quality is really bad in this table. So here we have to do something about it and clean up the data. And here we have two options. Either we go back to the original datasets and change the values there, or, as the second option, we make the changes directly in Tableau using aliases. How are we going to clean this up? We're going to remove the E from here, the typo. Then instead of Deutschland, we're going to have Germany. And instead of America, we're going to have USA. And we might have another situation where the data quality is good but the names are too long.
And if you're building views, you will understand that everything is tight and you don't have enough space to show the whole values of the dimensions. That's why we end up, most of the time, changing the values of the dimensions to shorter names, to abbreviations. For example, instead of having the value Germany, we're going to have DE, and instead of USA, US. So here we have FR, DE, and US. Again, we have the same situation. Either we go back to the original dataset and change the values, or we stay in Tableau and do it directly there using aliases. In real projects, you cannot go back each time to the source system or to the original datasets and change the values there. Either you don't have the time for that or you cannot do that. That's why we always end up changing those values directly in Tableau. So aliases in Tableau are alternate names for the members of a discrete dimension field, so that their labels appear differently in the view. As you might notice, I say discrete dimension field, and that's because Tableau does not allow you to create aliases for measures or for continuous dimensions. So in Tableau you can create aliases only for fields with the role discrete dimension. And now, as usual, we have the question: on which page can we create aliases? Well, we can create aliases in Tableau only on the worksheet page. We cannot create them on the data source page. And the second question: can we create aliases globally, for the whole workbook and all the views, and also locally for only one view? The answer is that we can create aliases only globally. That's going to affect the whole workbook, all visualizations. We cannot create aliases locally for only one view. Okay, we're going to go to the worksheet page; we cannot do it on the data source page. We're going to stay with the small data source. Let's take the Country, and drag and drop it over here on the rows. And then let's take any measure; let's take the Score, and drag and drop it on the columns.
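The idea behind aliases, a display label layered on top of source values that stay unchanged, can be sketched in plain Python. This is only an illustration of the concept and not Tableau's implementation; the function name `apply_aliases` and the two dictionaries are assumptions made for this sketch, using the country values mentioned in the lesson.

```python
# Source values as they might arrive from the Customers table.
raw_countries = ["Germany", "Deutschland", "USA", "America", "France"]

# First use case: cleaning up inconsistent values.
CLEANUP_ALIASES = {"Deutschland": "Germany", "America": "USA"}

# Second use case: shortening long labels to abbreviations.
SHORT_ALIASES = {"Germany": "DE", "USA": "US", "France": "FR"}


def apply_aliases(values: list[str], aliases: dict[str, str]) -> list[str]:
    """Return display labels; members without an alias keep their value."""
    return [aliases.get(value, value) for value in values]


if __name__ == "__main__":
    cleaned = apply_aliases(raw_countries, CLEANUP_ALIASES)
    shortened = apply_aliases(cleaned, SHORT_ALIASES)
    print(cleaned)
    print(shortened)
    # The original list is untouched, just like the source dataset.
    print(raw_countries)
```

Because the lookup falls back to the original value, a member without an entry simply keeps its label, which mirrors the way an alias-free member shows its source value in the view.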
The task here: instead of having those values, France, Germany, USA, we want to have short names. Here we have two methods to create aliases in Tableau. The first one is to go to the Data pane on the left side. So let's go to the field Country over here. Right-click on it, and then here we have the option Aliases. So let's go there. Here we're going to get a new window to edit the aliases. Let's check what we can see: over here in the middle, we have three columns. We have the members, has alias, and the value of the alias. In the first one, we're going to see all the members of the dimension Country. Those values come directly from the datasets, so those are the original values from the source. Then in the next one we have has alias. It is like an indicator to show us whether the values in the view are going to come from the original values or from the alias. Now it's all empty, because we didn't add any aliases. And in the third column, we have the aliases. Here we can go and edit the alias of each member individually. And as you can see now, the aliases are exactly identical to the original values. That's why we don't have any aliases. Now let's go and change that. Instead of France, we're going to have FR. And then instead of Germany, we're going to have DE. As you can see, as I'm entering alias values that differ from the original values, Tableau marks them with a star. Now let's go to the last one, and we're going to have it as US. Now just check what's going to happen once I click OK. You see, here we have the old values, and if I click OK, it switches to the aliases. This is how you can add aliases in the Data pane. But now let's say that you change your mind later and you don't want to use the aliases, and instead you want to go back to the original values. How can we do that? Maybe you already saw it. So let's go back to the Country over here in the Data pane and right-click.
We go again to the aliases, and while editing the aliases, there is an option here called Clear Aliases. What you can do is go over here and just click on it, and everything is going to reset to the original values. And as you can see, those indicators vanished. That means there are no aliases. Now if you go and hit OK, the values are going to go back to the original values from the datasets. Here is what I usually do once I need aliases in Tableau. I don't go directly to the field and change the values. Instead, I always tend to create a new duplicate of the field and only change the values of the new field that I have created. So let me show you what I mean. We go to the Country, right-click, and then we go to the option over here, Duplicate. Let's do that. As you can see, now we have another field called Country, marked as a copy. And of course, from the name I can understand that this is the copy and the other one is the original. But in Tableau, if you look very closely at the data type icon, you can see that in the duplicate we have something like an equal sign. This sign indicates that this field is not an original one, but is created from another original field. If you see that, it means this is a customized field that we have created. What I usually do is go and rename it; we're going to call it Country Short. Now I create the aliases on this new field. Let's go and do that: right-click, Aliases, and then instead of France, FR, then DE, and US. So with that I have the two options: the long one, the original one, and also the short version of the country. And I can decide in the visualizations whether I'm going to use the short version or the long version. All right, that's all for the first method, where we created aliases from the left side, from the Data pane. Now we're going to go to the second method, where you can create aliases directly from the view. Let's see how we can do that. Just move over the value France over here and right-click on it.
And then here we have the option Edit Alias. Let's select that. Now here I have a very simple window. I just have to edit the alias for France only, so I'm giving the alias for only one value. Let's do that: FR, and then hit OK. And as you can see in the view, we just changed the value France to FR quickly from the visualization, and we can do the same for Germany. So right-click on the value, then Edit Alias. Again, the same window; we enter DE and OK, and the value changes directly in the view as well. This is a really quick method to edit the aliases directly in the view. Now if we go and check the dimension Country in the Data pane, let's check the aliases. As you can see, the members France and Germany have the aliases FR and DE, and we've done that directly from the view. Now the question: which method should you use? I would say if you want to change multiple values, go to the Data pane and make the changes there. It's just easier to work with the window and add all those values. But if you want to change a single value of the dimension, then you can do it quickly by going to the view and editing the alias. And that's all for the aliases. This is a really great way to clean up and change the values directly in Tableau without having to go back to the original datasets and make the changes there. All right, so now we have the following Tableau task for you. The task says: abbreviate the values inside the field Category in the table Products from the big dataset, showing only the first character of each value. You can pause the video right now to do the task, then resume it once you are done. All right, now let's do that quickly. As I showed you before, first we start with duplicating the field. So I'm going to go and do that. Then I'm going to rename it to Category Short. Then I'm going to show both of the fields, Category and Category Short. So far both of the dimensions have exactly the same values. We didn't change anything.
Now we're going to go to the Category Short and right-click on it. And then we're going to go to the aliases. The task says the first character, the first letter of each value. So the first one is going to be its first letter. The second one could be O or OS, so I'm going to leave it as O. And the third one is again going to be its first letter. Then click OK. And that's it; now we have a new dimension that has only the first character of each value. And we have done that using the aliases. This is really easy. All right guys. So with that, we have completed this section, which is a really important step in preparing our datasets before we start building our visualizations. In the next section, we will learn how to organize and structure our data in Tableau.

100. Tableau | Section: Organizing Your Data: How to organize your data in Tableau. In Tableau, we have different techniques and methods for grouping and organizing your data, which is very important for your users to understand your data. First, you will learn how to organize the dimensions in hierarchies, and after that, you will learn how to group the members of dimensions using groups. Moving on, you will learn how to cluster your data into different groups using the cluster group. After that, you will learn how to split your data into two subsets using sets. Then we have another method, called bins, to group the values of the measures in order to build histograms. Let's start with the first method of organizing our data, using hierarchies. Now let's go.

101. Tableau | Hierarchy: All right guys, the best way to understand the hierarchy is to have an example. If you take a look at our data, for example the customers, you can find that some dimensions are related to each other, since they hold similar information. For example, in the dimension Country, we have values like Germany, USA, and France. And we have another dimension, City, where you can find the cities inside those countries. For Germany, we have Berlin and Stuttgart.
And then we have a third dimension, Postal Code, where you can find the codes inside those cities. As you can see, these three dimensions describe common information. They give us information about the user location, and we can relate those dimensions to each other using a hierarchy. In hierarchies, we have different levels. We start with the top node, and we call it the root node. This node represents the highest level of aggregation in our hierarchy. Now we're going to go to the next level of the hierarchy, where we have the Country. At this level we're going to see more details about our data, where we have, for example, the two values USA and Germany. The links between the nodes we call branches. And now we're going to go to the next level in our hierarchy. We have level two, the City. Here we will see more details about our data. So in the USA we have Portland and Seattle, and in Germany we have Stuttgart and Berlin. And again, we have the link between the parent node and the child node using the branches. Now we're going to go to the last level in the hierarchy, where we have the Postal Code. Here we're going to split the structure further, with more details. So we have the following postal codes for each city. Now, since the Postal Code is the last level in our hierarchy and those values don't have any children, we call those nodes the leaf nodes. The leaf nodes, or the leaves, represent the most detailed level of our data in this hierarchy. So with that, we have the complete structure of our hierarchy. As you can see, it looks like a tree structure. The top node we call the root node; it represents the highest level of aggregation. Then we have the intermediate levels, and they are connected using branches. And the last level we call the leaf nodes, which represent the lowest level of detail. So once more: we have the root node, which represents the highest level of aggregation.
Then we have intermediate levels connected with the branches. And then we have the leaves, the leaf nodes. They represent the lowest level of details in our data. As we learned before, we can do many OLAP operations on the cube. So if we have a hierarchy in our data, we can do two very important operations, the drill down and the drill up. Drill down and drill up are OLAP operations that help us navigate through the hierarchy in order to gain a deeper or higher-level understanding of the data. So let's understand first how the drill down works. Let's say that we are working with the measure sales. We start at the top node, at the highest level. At the highest level, we're going to have the total sales in the whole dataset. For example, it's going to be 140. So now we are at the highest level, at the root node. And if you use drill down, you're going to jump to the next lower level in the hierarchy. That means at this level we're going to see more details about the sales. So for USA we have 90, and for Germany we have 50. And now if you want to see more details about your data, we can apply drill down again in order to jump to the next lower level in the structure. So what's going to happen? We're going to go to level two, and here the sales of USA are going to split between Portland and Seattle, where we have 40 and 50. And for Germany, we're going to have 20 for Stuttgart and 30 for Berlin. So that means we are seeing more details about our sales. And now if you want to go to the lowest level, to the leaves, we're going to drill down from the city to the postal code. So it's going to look like this. Portland is going to split between those two postal codes. Seattle is going to stay the same because we have only one child. The same for Stuttgart, it's going to stay 20. And for Berlin we have two postal codes, so it's going to split again. So as you can see, we are using drill down to navigate through the hierarchy, and it takes us from a higher level to a lower level of details.
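The drill-down walk above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Tableau works internally. The city totals come from the example; the postal codes themselves and the way Portland's 40 splits into 15 and 25 are made-up assumptions, since the example does not give them.

```python
from collections import defaultdict

# Leaf-level sales rows: (country, city, postal_code, sales).
# City totals follow the example; the postal codes and the 15 + 25 split
# for Portland are hypothetical values chosen for illustration.
ROWS = [
    ("USA", "Portland", "97201", 15),
    ("USA", "Portland", "97205", 25),
    ("USA", "Seattle", "98101", 50),
    ("Germany", "Stuttgart", "70173", 20),
    ("Germany", "Berlin", "10115", 10),
    ("Germany", "Berlin", "10117", 20),
]

LEVELS = ["country", "city", "postal_code"]


def sales_at_depth(rows, depth):
    """Sum sales grouped by the first `depth` hierarchy levels.

    depth 0 is the root node (grand total); depth 3 is the leaf level.
    """
    totals = defaultdict(int)
    for *path, sales in rows:
        totals[tuple(path[:depth])] += sales
    return dict(totals)


# Drill down one level at a time, from the root to the leaves.
for depth in range(len(LEVELS) + 1):
    print(depth, sales_at_depth(ROWS, depth))
```

Reading the loop from the last line back to the first gives the drill up: the same leaf values are summed into fewer, more aggregated groups.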
It's like we are expanding the tree to see more details to understand our data. All right, so now we're going to talk about the second OLAP operation, the drill up. It's exactly the opposite of drill down. Drill up is going to take us from bottom to top, from a lower to a higher level. How does it work? Let's say we're going to start at the leaves, and we have the sales of those leaves. Now we can use drill up to move from the postal code to the city. For example, we're going to have total sales of 30 in Berlin, because it's the sum of 10 plus 20. Stuttgart is going to stay the same, 20, and Seattle 50. And Portland as well is going to sum up the values from its leaves, so we're going to have the value of 40. As you can see, as we are moving higher, the values get more aggregated. Let's say that we want to jump to the country, so we can use drill up again to move from the city to the country. For Germany, we have total sales of 50. For USA, we have total sales of 90. Now you can use drill up again to go to the root node, where you have the highest level of aggregation. So we have the value of 140, the total sales inside our dataset. As you can see, if we have a hierarchy structure, we can use drill up and drill down to navigate through it. Hierarchies organize and structure the members of the dimensions into a logical tree structure by grouping similar dimensions together. Hierarchies are really important and make your views dynamic: you can have the big picture and understand the data at the highest level, and you can drill down to specific details to gain deeper knowledge of the data. All right, so now we are back in Tableau. Let's understand how we can create hierarchies in Tableau. We can create hierarchies only on the worksheet page. We cannot create them on the data source page. On the worksheet page, we create hierarchies in the Data pane.
If you take a look at the customers table, you can find that we already have a hierarchy. Here we have a small icon that indicates we have a hierarchy, the hierarchy name is Country City, and on the left side over here we have a small arrow. If we click on it, the hierarchy expands and we can see the dimensions inside this hierarchy. Speaking about dimensions, hierarchies can be used only for dimensions. You cannot create a hierarchy from measures. And this hierarchy that we have over here was created automatically by Tableau, since Tableau analyzed the content of the country and the city and automatically understood that there is a hierarchy between them. But since we want to learn how to create a hierarchy, we're going to remove it and create a new one from scratch. Now in order to remove a hierarchy, you go to the hierarchy name over here and right-click on it. And then here we have the option Remove Hierarchy. Here you have to understand that the dimensions inside the hierarchy will not be deleted, only the hierarchy itself will be deleted. So you will not lose any fields; only the logical tree, the logical hierarchy, will be removed. All right, so now let's see how we can create a hierarchy in Tableau. We're going to create the location hierarchy. We're going to go to the left side, to the Data pane, and we're going to select one of the dimensions. It doesn't matter which one you select, but I prefer to start with the highest level of the hierarchy. Here in our example, it's going to be the country. Select the country and right-click on it. And then here we have something called Hierarchy, and we're going to select Create Hierarchy. Let's go there. We have to give it a name, so we're going to call it Location Hierarchy. Then, as you can see, on the left side we now have the icon of the hierarchy. Inside it, we have only one dimension, the country. Now in our hierarchy, we have as well the city and the postal code.
So how can we add them to this hierarchy? As we learned, the hierarchy has different levels, and the order of those levels is really important. We have country, city, and postal code. Now, in order to add the city, we're just going to drag and drop the city beneath the country over here and release it. With that, we now have the city inside our hierarchy. Let's grab the postal code as well. We have to drag and drop it beneath the city. Let's release. With that, we have created the location hierarchy with the three dimensions, country, city, and postal code. Here again, if you want to hide the details of this hierarchy, we can collapse it over here. Or if you want to see the details, we can expand the hierarchy. All right, so this is one way to create a hierarchy in Tableau, by using the drop-down menu. The second way to create a hierarchy is to quickly drag and drop dimensions together. For example, if we go to the products table, we have a hierarchy there as well, between the category, product name, and subcategory. Our hierarchy starts with the category, then the subcategory, and the last one, the leaves, is going to be the product name. Now let's see how we can create the hierarchy using a quick drag and drop. We're going to take one of those dimensions, let's say we're going to start with the category, and drag and drop it onto the subcategory. So I'm now hovering over and selecting the subcategory. Let's release. Once we do that, Tableau understands that we want to connect those dimensions, so Tableau is going to create a new hierarchy. We're going to call it the Product Hierarchy. And let's hit OK. And now let's see. On the left side we have a new hierarchy called Product Hierarchy with the icon, and inside it we have two dimensions, category and subcategory. We are missing the third dimension. Let's take the product name and drop it in the hierarchy. Now we have a problem with that.
The order of the dimensions inside our hierarchy is wrong, because the dimension category should be level one and the subcategory should be level two. How can we fix that? Just select the category and drag and drop it on top of the subcategory. Let's release. That's it, this is how you change the order of the dimensions inside a hierarchy. And with that, we have the product hierarchy. All right, now let's say that we don't want to remove the whole hierarchy, we just want to remove one member, one dimension, from the hierarchy. Let's say we want to remove the product name. Select it and just drag and drop it somewhere here in the empty space. And with that, the product name is no longer a member of the hierarchy. So this is how we can remove dimensions from a hierarchy. But I want to put them back in our hierarchy because we need it later. So I will put the subcategory beneath the category, and we take the product name and put it beneath the subcategory, and that's it. So these are the two methods of creating hierarchies in Tableau, either by the drop-down menu or by quickly dragging and dropping the dimensions together. It's really easy. All right, so now we have this hierarchy, the structure. How are we going to use it inside our view? It's really easy. We're going to select the whole hierarchy, then drag and drop it to the view. Here the hierarchy is going to start from level one, the country, and we're going to see the values of the country. Now let's add one of the measures. We're going to take the sales and drag and drop it on the columns. So now if you look closely at the country, at the blue pill over here, you can see that we have a new sign, the plus sign. This sign indicates that we can drill down in this dimension. So now let's go and click on the plus sign. As you can see, now we are drilling down in our hierarchy to a lower level. Now we are seeing more details about the sales.
And we are now at the next level, the level of the city. As you can see, we have the dimension city on our rows. We didn't drag and drop it from the Data pane and put it on the rows; it expanded from the hierarchy. Again, here the city has the plus sign that indicates we can drill down inside the city. Let's drill down again. As you can see, now we are at the postal code and we can see more details about the sales. Now if you check the postal code, there is no plus sign like on the city and the country, because we are at the leaves, at the lowest level of details in our data. With that, we have navigated through our hierarchy from the top node to the leaves. As you can see, it's really easy and very dynamic. Now let's say that we are at the leaves and we want to drill up, back to the highest level of aggregation, to the top node. It's really easy. If you check the city and the country again, we don't have the plus sign anymore, we have the minus sign. The minus sign indicates that we can drill up in the hierarchy. So let's see what happens if you click on the minus sign. As you can see, we drill up now from the leaves, from the postal code, back to the city, and the values of the sales are now more aggregated. And now the same thing, if you want to drill up from the city back to the country, we're going to click on the minus sign. So let's do that. And with that we have moved to level one, to the highest aggregation in our hierarchy. All right, so far what we have done is drill up and drill down in our hierarchy using the Rows shelf, and you know that the rows and the columns are what we use as developers to build our view. Now the question is how our users and the audience can drill up and drill down through the hierarchy, because the hierarchy should be quick for the users to use as well, to drill down to the details. Now let's see how we can do that. If we go to the view over here and hover on the country, we can see again a plus sign.
Let's go and click on that. And as you can see, we drill down in our hierarchy from the country to the city. Now let's go into more detail and drill down to the postal code. We can hover on the city, and as you can see, we have the plus sign again. Click on that. And with that, we drill down to the postal code. This is exactly how the users can drill down in the view. Now if we want to drill up, back to the higher level, we can do the same. We can see the minus sign over here. Click on it and you go back to the city. And then we go to the country as well. We have the minus, we click on that, and with that we drill up back to the country. As you can see, with those icons we can navigate through our hierarchy. Now you, or your users, might say: you know what, this is a really small icon and my users don't like it. Is there any other way to drill up and drill down in the view? Well, yes. If you go to any of those values over here and right-click on it, you can see in this drop-down menu that we have Drill Down. If you click on that, we drill down to the city. The same again: select any value, it doesn't matter which one, let's go over here, and then drill down again. And with that we are at the postal code. If you want to drill up, you can do the same. Go to any value, right-click on it, and here we have Drill Up, so click it. And to drill up back to the country, go to any value in the country, right-click on it, and drill up. So those are the two ways to drill down and drill up in the view. All right guys, so far we have created our own hierarchies by putting those dimensions together in different levels. But in Tableau we have as well indirect, embedded hierarchies in the data type date. In Tableau, any field with the data type date has the following hierarchy. It starts at the highest level with the year, then we have the quarter and the month, and then at the lowest level, the leaves, we have the days.
Those four levels are the default levels inside each field with the data type date in our dataset. Now we have another data type that holds an embedded, indirect hierarchy as well: the fields with date and time. Here we have information about the time, and we have seven levels. It starts exactly like the date, so the highest level is going to be the year, then the quarter, the month, and then the day. But now we can drill down to more details since we have the time information. The next level is going to be the hours. Then we have minutes and seconds. Seconds are the lowest level of details. They are our leaves. Here we have seven levels in the hierarchy. So date, and date and time, both have a hierarchy embedded inside them. Now let's uncover those hierarchies in Tableau. All right, so now we're going to go to the table orders. Here we have two dates. It doesn't matter which one, both of them are going to have exactly the same hierarchy. Let's take the order date and drag and drop it here on the rows. Now, as you can see, we have the plus sign. It indicates there is a hierarchy, and it starts at the highest level with the years. Now let's take a measure to see some data. We're going to take the order count and put it in the columns. And I want to show the labels as well. Let's show some labels. All right, now let's go and discover the hierarchy inside the date. As you can see, on the left side we don't see any information about the hierarchy, so that means it's really embedded inside this data type. So let's go to the years and click on the plus sign to drill down. As you can see, the next level we have is the quarter. So now we see the total number of orders by quarter. Drilling down again takes us to the month, so we can see more details about the total counts, and then we can drill down to the day. And now we are at the lowest level, at the day. We cannot drill down further to, for example, hours, minutes and seconds, because the order date has the data type date.
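The embedded date hierarchy can be pictured with a small Python sketch. This is only an analogy for the four date levels and the seven date-and-time levels described above, not Tableau's implementation; the function names and the sample order dates are made up for illustration.

```python
from datetime import date, datetime

DATE_LEVELS = ["year", "quarter", "month", "day"]
DATETIME_LEVELS = DATE_LEVELS + ["hour", "minute", "second"]


def date_parts(value):
    """Return the hierarchy-level values of a date or datetime, top level first."""
    parts = {
        "year": value.year,
        "quarter": (value.month - 1) // 3 + 1,
        "month": value.month,
        "day": value.day,
    }
    if isinstance(value, datetime):  # a datetime adds the three time levels
        parts.update(hour=value.hour, minute=value.minute, second=value.second)
    return parts


def count_by_level(dates, level):
    """Count records per path from the year down to `level` (one drill-down step each)."""
    depth = DATE_LEVELS.index(level) + 1
    counts = {}
    for d in dates:
        parts = date_parts(d)
        key = tuple(parts[name] for name in DATE_LEVELS[:depth])
        counts[key] = counts.get(key, 0) + 1
    return counts


# Hypothetical order dates, counted at each level from year down to day.
orders = [date(2023, 1, 5), date(2023, 2, 9), date(2023, 7, 1)]
for level in DATE_LEVELS:
    print(level, count_by_level(orders, level))
```

A plain date stops at the day, which matches why the order date cannot be drilled down to hours, minutes, and seconds.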
As you can see, the dimension order date has four levels: year, quarter, month and day. It's really nice to have it like this in Tableau because it's really standard. I worked with other BI tools where we had to build it on our own, and building all those hierarchies is really time consuming, especially if you have a big dataset. Here in Tableau, our life is easier. Tableau decided to have a hierarchy inside each date. All right guys, one more thing about hierarchies. They really organize and structure your views and make them more dynamic for the users. For example, if you have requirements to show sales by country, sales by city, and sales by postal code, and you don't use hierarchies, you will end up making three views like here on the left side. It takes a lot of space, and it's not really dynamic. Better than that, we can create a hierarchy between those dimensions and put everything in one view. Then you give the end users the option to drill down and drill up, depending on what they need. If they want the sales by country, we have it already at the top node. If they want the sales by city, all they have to do is drill down to the next level, and we have it already, sales by city. If someone needs to go into more detail, to the postal code, they can drill down as well to the sales by postal code. As you can see, it makes your view more dynamic, and it's going to be more attractive for the end users if you compare it to the left side. Now we have something more dynamic and more interactive for the end users. And as well, you are creating fewer views in your dashboards. So this is really great. If you want to drill up back to the country, we can just click the minus sign. Hierarchies make your views more dynamic, and they structure and organize your data in the views. All right, now let's summarize. Hierarchies organize and structure the members of the dimensions into a logical tree structure. Hierarchies are a special feature only for dimensions.
You cannot create hierarchies between measures. We can drill down and drill up to navigate through our hierarchy to gain a deeper or higher-level understanding of your data. Overall, hierarchies are really important to organize and structure your data in views. And they provide the users with a powerful tool to quickly and easily navigate and explore your data, uncover insights, and make better decisions. All right, so that's all for hierarchies in Tableau. Next we will learn how to group the members of dimensions into higher-level categories using groups.

102. Tableau | Groups: All right guys, so far we have learned how to group up the dimensions together in hierarchies, but now we will learn how to group up the values, the members of a dimension, into groups in Tableau. We have three methods in order to do that: groups, cluster groups, and sets. We will start with the first one, how to group up the members of the dimensions using groups. But as usual, let's first understand the concept behind it, and then we're going to learn how to build it in Tableau. So let's go. All right, so now if you take a look at our data, sometimes you're going to find dimensions that could be used to categorize or to group up the data inside the table. For example, if you take a look at our products data, you can find that the category can be used to group up the data. You can see two products are assigned to the category Monitor and three products are assigned to the accessories. So this field could be used to group up the data. Now if you check the customers data, you can find some dimensions that could be used to group up the data as well. For example, the country, the city, and the postal code can be used to group up the customers. All those dimensions could be used to group up our data. Those groups, those dimensions, come directly from the dataset, and so far we didn't create anything.
Sometimes we might be in a situation where we want to group up the data differently than the original groups in the dataset. Here we have two options. Either we go back to the original dataset and make the changes there, creating the group in the source, or we create the group directly in Tableau without going back to the original dataset. For example, we want to create a new group in the products, and it's going to be the product class. Let's say, for example, the first three products are class A and the last two are class B. We can create this extra group directly in Tableau. The same thing goes for the customers. We want to add a new group with the continent information. We can add this group. For Germany, it's going to be Europe. For USA, it's going to be North America. And for the rest, France and Italy, it's going to be Europe as well. All you are doing now is adding new groups to our data. Groups in Tableau combine similar, related values into higher-level categories, which creates a new dimension for your data analysis. Now let's see how we can create groups in Tableau. There are two methods to do that: either by creating the groups in the Data pane or directly in the view. We're going to start with the first one, where we're going to create the continent group in the Data pane. In order to do that, we're going to go to the table customers, and based on the values from the country, we're going to create the new group. Here it's important to understand that we can create groups only on top of dimensions. We cannot create groups on measures. There is another feature that we can use to group up the measures, and we call it bins. But groups we can create only on top of dimensions, and the new field is going to be a dimension as well. Let's see how we can do that. Select the country and right-click on it. Then let's go to Create, and here we have the option Group. Let's select that.
So now we're going to get a new window in order to create the group. We're going to start by renaming the field; we're going to call it Continent. Then in the middle over here, Tableau is going to list the distinct values inside the country, all possible values from the dataset. What we're going to do is group up France, Germany, and Italy into Europe, and USA into North America. How are we going to do that? We're going to multi-select those values by holding Control: France, Germany and Italy. They are one group. In order to group them together, we're going to select Group over here. Once we select it, Tableau is going to put all those values underneath a new group. We're going to give it the name Europe. Let's click OK. And with that, we have now created a new group for those three values. As you can see, we can expand and collapse those values to see the details. But we still have one more value inside the country that is not mapped to a group yet. What we're going to do is select it, then click on Group, and we're going to call it North America. With that, inside the continent we now have two values, Europe and North America, and they are related to those members from the country dimension. Now let's say that you want to move one of those members from one group to another group. How can we do that? It's really easy, just by drag and drop. Let's take, for example, Germany and drag and drop it here in North America. You will see this member now belongs to the group North America, which is wrong. So I'm going to put it back. That's it, this is how you switch between groups. Here we have in Tableau another option, which is to remove the member from all groups. In order to do that, let's select Germany and click over here, Ungroup. Once we do that, you will see that the Germany value is not assigned to any of those groups. If I collapse those groups, you will see that Germany is a standalone value.
We usually use the group Other for all values that we couldn't assign to any of our groups. Here Tableau gives us a quick way to create this group. All we have to do is click the value Germany and then click over here, Include Other. Let's click that. As you can see, now the value Germany is inside the group Other, and with that we have three groups in the continent: Europe, North America, and Other. Now if you want to rename a group, you can click on the group and then click over here, Rename. So we could name it Other Continent or something. Or right-click on the group and then Rename. That's really easy. So now what we want to do is move Germany back to Europe. As you can see, the group Other disappeared because it doesn't have any members. So that's it for now. We have created our groups. Let's click OK. Now as you can see, on the left side we have a new field called Continent. It is a discrete dimension, and it has a special icon where the data type is, which indicates that this field is a group in Tableau. If you are creating a group based on another field with a geographic role, Tableau is going to show both icons, group and geographic role. Usually the group has the following icon, but in this situation it's going to show both of the icons, the geographic role and the group. All right, so now let's build the view based on this new dimension. We're going to take the continent and drag and drop it on the rows. As you can see, it has two values. We're going to take the sales as well and put it on the columns. Now to see more details in the view, we're going to take another dimension, or rather we're going to take the whole location hierarchy. Let's drag and drop it here on the rows. Now as you can see, the continent is grouping our data: Europe for those three values, North America for USA. As we learned in the hierarchies, we can drill down to the next values. And you know what?
This new dimension, the continent, holds similar information to the country and city, and it belongs in the hierarchy. So it makes sense to add it to the structure of our location hierarchy. What we're going to do is drag the continent and drop it on top of the country. With that, the continent is going to be level one and the country is going to be level two. We can use this new group as the highest level of aggregation in our structure, and we can drill up back to the continent. As you can see, we can create new groups directly in Tableau without going back to the original dataset and making modifications there. All right, so that was the first method to create groups in Tableau, from the Data pane. The second method is to create groups directly in the view. Let's see how we can do that. We're going to create a new worksheet and we're going to take two measures. We're going to take the profit, let's put it here on the rows, and we're going to take the sales as well. And now we want to show all the customers as data points. In order to do that, we're going to go to the customer ID, drag and drop it, and put it here on the Marks, on Detail. Now we have each customer in our dataset as a data point. Our task is to group up the customers by performance. If you decide to go to the Data pane in order to create those groups, right-click on the customer ID and go to the groups, you will see a long list of all customers. Creating groups based on those values can be really painful because the customer ID has high cardinality compared to the country. Instead of doing that, we will do it directly in the view. In order to do that, we will go and select, for example, those customers, those data points. And we will get a new window. As you can see, Tableau tells us there are eight items selected, and we have the icon of the group. If we click on that, Tableau is going to create new stuff.
If you look at the Data pane over here on the left side, you can see that Tableau already created a group with the selected items. And it did the coloring as well, so you can see the group here on the colors too. On the right side, we have the legend, so you can see the selected items are blue and the others are gray. Now what we have to do is rename stuff. First of all, I'm going to rename this field. I'm going to call it Customer Group. As you can see, the group name is like a list of all members. It says, okay, 9113035 and more. That's because it's hard for Tableau to understand why we selected those customers and what the group name should be. In order to rename the group, we're going to go to the left side, to the Data pane, right-click on the field, and then go to Edit. Select that. Now as you can see over here, we have the group that we just selected with the eight members. So let's go to the group name, right-click on it, Rename, and we're going to call it High Performers, since those customers have the highest performance compared to all other customers. As you can see, Tableau put all the other customers under the group Other. Let's click OK now. And now we have a better name on the right side, and it makes sense to have a gray color for Other. All right, so now we're going to create another group, of customers with a low performance. In order to do that we're going to do the same. We're going to go into the view and select those customers with a bad performance. Once we do that, we're going to get this new window saying, okay, nine items, and we could select the group icon. But if you move your mouse away, you will see the window disappears. In this case, we're going to go to one of those data points and right-click on it. And then here we have the option Group. Select that. Now what happens?
Tableau will not create a new field in the Data pane; it's going to include the selection as a new group inside the already existing field. You can see here on the right side we have a new group with the color orange. And with that, we have added a new group to the Customer Group. In order to rename it, we're going to go to the Data pane and edit the group. Let's go there now. Instead of having the list of the members as the name, we're going to click on it, Rename, and we're going to call it Low Performers. Click OK. And now with that we have nice naming for the groups. We can change the colors of the groups as well. For example, for the low performers we can have red, and for the high performers we can have green. In order to do that, we're going to go to the Marks over here, to Color. Click on that. Then we're going to select Edit Colors. As we said, green for the high performers, so let's select this value and assign it to green. And we want the low performers to be red, and the color of Other is going to be gray, since it's not our focus. Let's click OK. And as you can see, now the data points have new colors. Another use case for groups is that we can use them as a filter as well. So we give the users the possibility to interact with our views and to focus on a specific group. In order to do that, we're going to go to our Data pane, to the group, right-click on it, and Show Filter. Now we have the group as a filter, and the users can click between the groups to change their focus on which cluster they want to analyze. For example, if they are not interested in all those gray points and they want to compare the high performers with the low performers to understand the difference in behavior between them, they can just remove Other like this. All right, so this is how you can create groups in Tableau using the two methods, either from the Data pane, especially if you have a dimension with low cardinality like the country.
But if you have a dimension with high cardinality, like the customer ID or order ID, then you can create groups directly from the view, which is a really fast way to assign the values to specific groups. As you can see, this feature in Tableau, the groups, is a really awesome way to group up data directly in Tableau without going back to the original dataset and creating the group there. All right, so now I have the following task for you. Go to the small dataset and create a new group called Classes based on the dimension product name. The first three products belong to class A and the last two products belong to class B. You can pause the video right now to do the task, then resume it once you are done. All right, so now let's quickly create this group. We're going to check first the cardinality of the product name. I'm just going to drag and drop it here in the rows. And as you can see, we have only five values. That means it has low cardinality, and we can do it directly in the Data pane. Right-click on the product name, and then we're going to go to Create, Group. And now we're going to call it Classes. The first three members are class A and the last two members are class B. With that, let's click OK. Now we can go and check the values. Let's drag and drop it over here before the product name. And as you can see, the three products are class A and the two products here are class B. This is really easy. All right, so now let's summarize. Groups in Tableau combine related, similar values into higher-level categories. Groups can be created based only on dimensions. We cannot create groups for measures, and the group itself is going to be a discrete dimension. Groups in Tableau are very useful to simplify your view and make it easier to understand your data by grouping the data points into clear and relevant categories. All right guys, so that's all for the groups in Tableau.
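Conceptually, a group is just a lookup from each member of a dimension to a higher-level category. The Python sketch below illustrates that idea with the continent example; it is not Tableau's implementation. The helper `make_group` and the sales figures are made up for illustration, and the optional `other` argument plays the role of the Include Other option.

```python
def make_group(mapping, other=None):
    """Build a function that maps a dimension member to its group.

    `mapping` is {group_name: [members]}. Members that are not listed fall
    into `other` if it is given; otherwise they stay standalone values.
    """
    lookup = {m: g for g, members in mapping.items() for m in members}

    def group_of(member):
        if member in lookup:
            return lookup[member]
        return other if other is not None else member

    return group_of


continent = make_group(
    {"Europe": ["France", "Germany", "Italy"], "North America": ["USA"]}
)

# Hypothetical (country, sales) rows, aggregated by the new dimension.
sales = [("France", 10), ("Germany", 50), ("Italy", 5), ("USA", 90)]
by_continent = {}
for country, amount in sales:
    key = continent(country)
    by_continent[key] = by_continent.get(key, 0) + amount

print(by_continent)
```

The result of the mapping is itself a discrete value per row, which is why the grouped field behaves like any other dimension in a view.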
Next we will learn a very similar feature called cluster groups. We can use it to cluster our data into different groups. 103. Tableau | Cluster Groups: All right everyone, so now we're going to learn another method of how to group up the members, the values of dimensions, into groups. And this time we're going to use the cluster groups in Tableau. But as usual, first let's understand the concept behind it, then we can learn how to build it in Tableau. So let's go. All right, so a cluster group is another way of grouping your data. It is used for data clustering, which is a statistical technique to group up similar data points together. In data clustering, we have different algorithms to calculate the clusters. For example, we have the k-means algorithm, another algorithm called hierarchical clustering, and another one called density-based clustering. Tableau decided to go with the k-means algorithm since it's really simple and easy to implement. The k-means algorithm is widely used in data clustering. Now let me show you how the k-means algorithm works. Let's say that in our dataset we have the following data points. First, we have to define how many clusters we want to build. In this example, we're going to go with three clusters. After that, the algorithm is going to pick three points, and we call them centroids. Then it assigns each data point to the nearest centroid. This data point, for example, is going to belong to the green cluster. Then it goes to the next data point, calculates the distance between it and the three centroids, and assigns it to the nearest centroid. For this one, it's going to be the red cluster. The algorithm is going to do that for all data points and assign them to the nearest centroid. At the end, we're going to have three clusters: the green, red, and blue. As you can see, k-means is really simple and easy to implement. All right, so now in order to understand the clusters, let's have the following task.
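The k-means walkthrough above can be sketched in a few lines of Python. This is a minimal textbook version and not Tableau's own implementation. The sample points and the choice of starting centroids are made up for illustration. The sketch also includes the step the walkthrough leaves out: after every assignment pass, each centroid moves to the mean of its points, and the two steps repeat until nothing changes.

```python
import math


def assign(points, centroids):
    """Assignment step: for every point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        for p in points
    ]


def update(points, labels, centroids):
    """Update step: move each centroid to the mean of the points assigned to it."""
    moved = []
    for c, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == c]
        if members:
            moved.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
        else:
            moved.append(old)  # an empty cluster keeps its previous centroid
    return moved


def kmeans(points, centroids, max_iter=100):
    """Repeat assignment and update until the centroids stop moving."""
    centroids = list(centroids)
    labels = assign(points, centroids)
    for _ in range(max_iter):
        new_centroids = update(points, labels, centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
        labels = assign(points, centroids)
    return labels, centroids


# Made-up (sales, profit) style points forming three visible groups.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8.5, 9.5), (1, 9), (2, 10), (1.5, 9.5)]
initial_centroids = [points[0], points[3], points[6]]  # "pick three points"

labels, centroids = kmeans(points, initial_centroids)
print(labels)
```

Each label is the index of the cluster a point ended up in, which corresponds to the green, red, and blue colors in the walkthrough.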
The task says to identify high-value customers by clustering them based on their sales, in order to find out which customers generate the most revenue and which do not. All right, now in order to create the cluster group, we have to be on the worksheet page. And this time we create the clusters from the Analytics pane; we cannot do it in the Data pane. Now let's see how we can create the clusters. We will stay with the big data source, since we need a lot of data points here. We need two measures. We need the profit, so let's drag and drop it on the rows. And we're going to take the sales as well, to the columns. With that, we have two axes, the sales and the profit. But what we are missing now in the middle is the customer data. Each customer is going to be one point. For that, we're going to take the customer ID and drag and drop it over here on Detail on the Marks card. All right, so now we have the data points, and each point represents one customer. Now in order to create the cluster, we're going to switch to the Analytics pane. So let's go over there, and if you go to Model, you will find Cluster. It's really easy. We just drag and drop it here onto the Clusters drop area, and we get a very simple window. It says the variables for the clusters are the sales and the profit. And then we have the number of clusters. By default, it's going to be Automatic. That means Tableau is going to figure out from the data how many clusters it makes sense to create from those data points. As you can see, Tableau already created the clusters, and it created three of them. But if you say, you know what, we want four clusters or five clusters, you can go over here and define how many clusters you need. If we enter five, let me just move the window over here to see what is going on, we now have five clusters.
If you want to have two clusters, we will have only two colors, and so on. So I'm going to stay with the three clusters. It makes sense. That's it. In this window there is no OK button or anything, so we're just going to close it, because Tableau creates the clusters immediately. All right, so now we have the clusters. The question is, where do I find the cluster group? Well, if you go to the Data pane on the left side, you will not find any cluster group over there, because we have this information now only on Color. This field here is our cluster. Now, we might want to have this information, this cluster group, in the Data pane in order to use it in different views. So what we're going to do is just drag and drop it somewhere in the Data pane. Now over here we can see we have a new field, and the icon indicates that this field is a cluster group. So now we're going to give it a name, Customer clusters. All right, so now we can reuse this cluster in different views if we need. All right, so now the next point is how we can edit our clusters. Right now we have three clusters. What if we want to change it to four? How can we do it? We will go to the Marks card over here, right-click on Clusters, and here we have the Edit option. So let's select that. We get the same window again. So in order to change the number of clusters, we will not do it in the Data pane, we're going to do it on the Marks card. This is how you edit the clusters. Now if you go over here again and right-click on Clusters, you can find another option called Describe Clusters. Here we're going to find more information about our clusters. Let's select that. As you can see, we have a lot of information about our clusters. First we have the inputs for the clustering algorithm. The variables are the measures that we use in our view, the sum of profit and the sum of sales. The next info is the level of detail. Usually here we have the dimensions we are using.
Now the lowest level of detail is the customer ID, since each data point represents a customer. Then we have more information about our clusters. The number of clusters we defined is three. The number of data points, the number of customers: we have 800 customers. And then we have the table over here. For each cluster we have information like the number of items, the number of data points inside each cluster. In cluster one, we have around 617 customers. In cluster two we have 171, and cluster three is the smallest, with 12 customers. We also have the centroids of each cluster, the central points of the clusters. So if you need more statistics about your clusters, you can find them inside Describe Clusters. It's really fun to work with clusters, and I have found that different people use different designs to present them. For example, one design that I see almost everywhere is this: go to the shapes over here and choose the filled circle. Now if you have a lot of data points, what is interesting is to see the overlap between those points, but it's really hard to see in this view. So what I'm going to do is focus on those data points. Let's select them, and then we're going to say Keep Only. Let's click on that. Now we have zoomed in on those points. In order to show the overlap in a better visual, we're going to go to Color and reduce the opacity. Let's reduce it to something like 70%; I think it should be fine. And now our visualization looks really professional, and you can see the overlap between data points. All right, so there is another design, which is to assign a shape to each cluster. Before we do that, I want to have the big picture again. I will remove the filter, so let's just drag the filter from here to somewhere else. And with that we are back to the original view.
So what we're going to do is take the cluster and put it on Shape. Let's drag and drop the cluster on the Marks card over here, on Shape. As you can see, for each cluster we have a shape: the plus, the square, and the circle. And if you want to assign different shapes, click on Shape. Now we can go over here and change the shape of a cluster. Let's say instead of the plus for cluster three, we're going to have an X. Let's click OK. And now instead of the plus, we have an X. This is how I usually design the clusters in Tableau. All right, so now after we create the clusters, it's really important to interpret the outcome of the clusters with the business. On the one hand, we have the red cluster, which covers the customers with high profits. On the other hand, we have the blue cluster, which covers the customers with low profits. Clustering your customers based on sales and profit can help you gain insights about your customers, which can help the business target its marketing strategy very effectively. All right, now we have the following task for you. The task is to identify the top-selling products by clustering the products based on the quantity and the profit. Create five clusters using the big data source. You can pause the video right now to do the task, then resume it once you are done. All right, so now let's create the clusters for the products. Here we need two measures: the profit and the quantity. Let's take the profit first. We can drag and drop it here on the rows. Then we're going to take the quantity to the columns. Now we need the dimension to define the level of detail, the data points. Here we can use either the product ID or the product name. I will go for the product name, so drag and drop it on Detail. All right, so now we have everything. We have the measures and the dimension, and we're going to go and create the cluster.
We go to the Analytics pane. Then we take Cluster and drag and drop it over here. Tableau created only two clusters here, but the task says five clusters, so we're going to go over here and define five. All right, so that's it. Now we have five clusters for the products. Let's close this. Clustering the products based on the quantity and the profit can help you gain insights about the product portfolio. And the business can use it for many things, for example to optimize inventory management and to make strategic decisions about product development and marketing. This is really amazing. All right, let's summarize. The cluster group in Tableau uses a statistical technique to group up similar data points together in clusters. The clustering algorithm used in Tableau is k-means, which is easy to implement and easy to understand as well. Clustering is one of the main features in Tableau and very powerful, since Tableau is the only BI tool that can plot an endless amount of data points. Other BI tools like Power BI always put limitations on the number of data points that you can see in the visualization, which can make it really impossible to create clusters in Power BI. Data clustering in visualization is a very powerful tool for data analysis and pattern recognition. It helps business organizations to be data driven, which means making better decisions using the data. All right, so that was it for the cluster groups. Next we will learn how to split the values of a dimension into two subsets using Tableau sets. 104. Tableau | Sets: Now we're going to learn another method of how to group up the members, the values of dimensions, into groups. This time we're going to use sets in Tableau. It is very similar to clusters. As usual, we're going to start first with the concepts, then we're going to learn how to build it in Tableau. So let's go. All right, so now let's say that we have the following data points in our visualization.
We can use sets to group up those data points. Sets divide your data, based on specific criteria or a selection, into two groups of data. The first group we call the in group. In this group, you're going to find all the data points that are included in the subset of data. These data points are the members of the set. The other group is the out group. This group contains all the data points that are not included in the subset. That means the data points in this group are not members of the set. So sets in Tableau divide our data into two groups, the in and out groups. When do we need sets, and why are they important? Well, we can use the subset of data to do a focused analysis on a specific scenario, and also to compare the subset with the remaining data. For example, we can make a subset of the top ten customers in our dataset based on sales, and compare that subset with the remaining customers in order to understand their behavior and what puts them in the top ten. So it's a really amazing feature in Tableau to understand your data and to do focused analyses on specific scenarios. In Tableau we have different ways to create sets. The first is to create a fixed set, and that's by using a manual selection. The other way is to create a dynamic set based on specific criteria. Here we have two ways to create the dynamic set: either using a condition, or using a ranking, top or bottom. The last method of creating sets in Tableau is by combining two sets, which creates a new combined set. Since we are combining data together, it's like joins, where we have four options: inner, left, right, and full join. Here too the output is a new combined set. So those are the different methods to create sets in Tableau. Let's quickly go through some simple examples in order to understand those methods. All right, so now back to our five customers, and now we're going to create different sets using different methods.
We're going to start with the first set. It's going to be a fixed set using a manual selection. Here we're going to go and manually select which customers are inside the subset and which customers are outside. We're assigning two values, in and out. For example, we're going to say John is inside the set, and Peter as well. But the rest are going to be out: Martin, George, and Maria are outside of the set. As you can see, we just manually selected which customers are in the set. Let's move to the second set, where we're going to create a dynamic set using a condition: the score is bigger than 400. Here we will not select anything manually. We will just define the rule for Tableau, and Tableau is going to do it automatically for us. Tableau goes through all the customers and starts assigning the values in and out. The first customer, Maria, does not fulfill the condition, so she is going to be out of the set. Next we have the second customer, John. He has a high score of 900, which fulfills the condition, so he is a member of the set. The same goes for George with 750, and Martin as well. But Peter doesn't have any score, so he does not fulfill the condition. Peter is out. So using this condition, three customers are in and two are out. Now what makes dynamic sets very important and efficient? Let's say that in the next days the scores of the customers change. What is going to happen after you refresh the data in Tableau? Tableau is going to recalculate the condition and assign new values if something changed. So the set is dynamic and everything is done automatically. Now let's move to the third one. Again we have a dynamic set, and now we're going to use the top two customers, which means the top two scores are going to be inside the subset and the rest are going to be out. If you have a look at the data, you can see John and George have the highest scores among the customers.
Those two customers are going to be in. The rest are going to be out. Again, everything here is dynamic and automatic. We just specify the rule and Tableau is going to do the rest. Okay, so those are the three methods to create a set. Next, we're going to go more advanced, where we create a set by combining two sets. We're going to take the following example, where we create a new combined set by combining set one and set three. Here it's really important to understand that the calculation of this new combined set is based on the output from set one and set three. Tableau will not check the customers table; it's going to check only the output from the sets. Here we have to configure the combined set, and we have four options. It's something similar to joins, but not exactly like joins. So let's go through those options one by one. The first option says all members in both sets. That means the customer is going to be a member of the combined set if the customer is a member of at least one of those two sets. So let's check our customers. Maria is not a member of set one or set three, so she is not going to be a member of the combined set either. The next customer, John, is a member of both sets. That is more than enough, so he's going to be a member of the combined set as well. George is a member of one of the sets, so he's going to be in as well. Martin here again is like Maria. He's not a member of set one or set three, so he's going to be out as well. Then the last customer, Peter, is a member of one of those two sets. That's enough to be a member of the combined set. As you can see, with this option it's enough for the customer to be a member of one of the two sets to be in the combined set. All right, now let's move to the next option. It says shared members in both sets.
That means to be a member of the combined set, the customer should be a member of both sets. It's not like the first option, where it's enough for the customer to be in one of the sets. The customer has to be in both sets. Let's check our customers. Again, Maria is not a member of either set, so Maria is going to be out. Next we have the customer John. He is a member of both sets. That means he fulfills the requirement to be a member of the combined set as well. Now, as you can see, for the other three customers, none of them fulfills this requirement, so none of those customers is going to be inside our set. Well, this option is very restrictive. All right, so now let's move to the next one. It says set one except shared members. What this means is we can have all the members from set one, but they should not be members of set three. So let's check the customers. Maria is not a member of either of them, so she is going to be out. Now we come to John. John is a member of set one, but he is also a member of set three. Well, this time John will not be a member of this group, because we are saying except shared members. So that means John is going to be out this time. The next one, George, is not a member of set one, so he is automatically out. The same goes for Martin. He's not a member of set one. But now if you check Peter, he is the only one that fulfills the requirement. Peter is a member of set one and not a member of set three, and this is exactly the requirement of this option. So only Peter is going to be a member of this group. All right, so now let's move to the last one. It's exactly the opposite. It says set three except shared members. So the requirement for a customer to be a member of this combined group is to be a member of set three, but not a member of set one.
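The four combine options correspond to ordinary set operations, which can be checked with Python's built-in `set` type. This is a sketch of the logic only, not of how Tableau evaluates combined sets. The variable names are made up; the members of set one (John and Peter, the manual selection) and set three (John and George, the top two by score) come from the example.

```python
# Members taken from the example: set one is the manual selection,
# set three is the top two customers by score.
set_one = {"John", "Peter"}
set_three = {"John", "George"}

# Option 1: all members in both sets (union).
all_members = set_one | set_three

# Option 2: shared members in both sets (intersection).
shared_members = set_one & set_three

# Option 3: set one except shared members (difference).
set_one_except_shared = set_one - set_three

# Option 4: set three except shared members (difference the other way).
set_three_except_shared = set_three - set_one

print(sorted(all_members))              # ['George', 'John', 'Peter']
print(sorted(shared_members))           # ['John']
print(sorted(set_one_except_shared))    # ['Peter']
print(sorted(set_three_except_shared))  # ['George']
```

Maria and Martin appear in none of the results because they are members of neither input set, and the combined set only ever looks at the outputs of the two sets.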
All right, so now let's check our customers. I really feel bad for Maria. She is not a member of any of those sets. If your name is Maria, I'm really sorry for that. It's not intended, but now it's really too late. I already recorded, so sorry for that. Next time, I promise you I'm going to make better examples. But for now, Maria is out of this group as well. The same goes for John. John is a member of set three, but John is also a member of set one, so he does not fulfill the requirement. John is going to be out. Now if you look at the customers, George is the only one in set three and not in set one, so only George is going to be in this group, and the other two are out. All right, so with that, we have covered all the scenarios, all the methods that we have for Tableau sets. All right guys, so now let's see how we can create sets in Tableau. We can create them on the worksheet page; we cannot do it on the data source page. And we can do it either in the Data pane or in the view. So now we're going to create different sets using different methods. But first let's create the view. We need the customer ID. By the way, instead of drag and drop, you can double-click on the field, and it's going to be on the rows. We need the first name as well, so double-click on the first name. And we would like to have the scores as well, so drag and drop the scores onto the Abc placeholders. So now we're going to create the fixed set using manual selection. In order to do that, we're going to go to the customer ID over here in the Data pane. Right-click on it and then go to Create. Over here we have Set. As you can see, the set has the icon of joins, but it is not a join. It just has the same symbol. Let's click on that. And now we have a new window. Let's see what we have over here. First we have the name of the set; let's call it Set one and fixed. Then we have three tabs over here: General, Condition, and Top.
As you can see, those are the different methods of creating sets in Tableau. The General tab is actually the manual selection, the Condition tab is, as you know, the dynamic set, and the Top tab is a dynamic set as well. Now we're going to go with the first one. We're going to start with General, the manual selection. In the middle, we have a list of all customers in our dataset. We have to go and manually select which customers are in and which customers are out. In our example, we selected customer two and customer five to be the members of the set. Anything that you are not selecting is going to be in the out group. So that's it: customers 1, 3, and 4 are out. Let's go now and click OK. Now let's see what happened in the Data pane. We have a new field. It's a discrete dimension, and since it's a set, it has the following icon. As I said, it's like the icon of joins. Now let's see the values inside this field. Let's drag and drop it over here. As you can see, we have only two values, In and Out. It's like the Boolean data type, where we have true and false. In sets as well, we have only two values. We selected customer two to be in the set, and customer five as well. The rest are going to be out. This is how we can create sets in Tableau using manual selection, and the set is going to be fixed. All right, so now we're going to go and create a dynamic set using a condition. Our example was the customers with a score higher than 400. Let's go again to the left side. Right-click on the customer ID, go to Create, and then to Set. Let's call it Set two and condition. Since we are making a condition now, we're going to go to the Condition tab over here. Now we're going to specify for Tableau the rule to decide which members are in and which members are out. The rule says score higher than 400. Let's define that. First, we have to select By field. Our field is the score, which is correct.
Then the operator over here should not be equals; it should be greater than. And we have to specify the value 400 over here. That's it: if the score is higher than 400, the customer is going to be in. Otherwise, the customer is out. Now let's go and click OK. As you can see, we have another dimension in the Data pane, called set two. Double-click on it, and let's check the values. The score over here is 350, which is out; 900 is in; 750 is in; 500 is in; and null is out. As you can see, it's really easy to define a dynamic set. We just have to provide a rule and Tableau does the rest. If tomorrow we have different data, the set members are going to change. Now we're going to create another dynamic set using the rank. In our example, the top two customers are going to be in and the rest are going to be out. Again, we're going to go to the Data pane, right-click on the customer ID, and create the set. Let's give it a name: Set three and rank. Now we're going to go to the third tab over here, Top. Let's go there. For this example, we're going to use the score to rank the customers, so the highest two scores are in. In order to do that, it's really easy. We can define it here with By field. For the ranking we have Top or Bottom, as you can see, so we're going to stay with Top. Next, we have to define how many we are selecting: the top 2 customers, top 10, 5, or 20. Here we have to go with 2, and by score. We are using the score, so everything is correct. And that's it, this is how we define the rule, and Tableau is going to do the rest. It's really logical if you just read it: top 2 by score. All right, that's all. Let's go and select OK. Again, as you can see, we have the set over here in the Data pane. Double-click on it. Now let's check the data. As you can see, John and George have the highest scores, which is why they are in, and the rest are out. As you can see, sets are really easy in Tableau.
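The two dynamic rules can be written down as a short Python sketch using the five customers and their scores from the example. It only illustrates the in/out logic. The helper names are made up, and the sketch does not model how Tableau breaks ties in a top-N ranking. Because the sets are computed from the data each time, changing a score and running it again changes the members, which is the "dynamic" part.

```python
# Scores from the example; Peter has no score (null).
scores = {"Maria": 350, "John": 900, "George": 750, "Martin": 500, "Peter": None}


def condition_set(values, threshold):
    """Members whose value is greater than the threshold; missing values are out."""
    return {name for name, value in values.items() if value is not None and value > threshold}


def top_n_set(values, n):
    """Members with the n highest values; missing values are never ranked."""
    ranked = sorted(
        (name for name, value in values.items() if value is not None),
        key=lambda name: values[name],
        reverse=True,
    )
    return set(ranked[:n])


set_two = condition_set(scores, 400)  # score higher than 400
set_three = top_n_set(scores, 2)      # top 2 by score

# Print the In/Out value of each customer for both sets.
for name in scores:
    in_two = "In" if name in set_two else "Out"
    in_three = "In" if name in set_three else "Out"
    print(f"{name}: set two = {in_two}, set three = {in_three}")
```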
All right, so now we're going to make it a little bit more complicated, where we create combined sets. We're going to combine set one with set three. In order to do that, we're going to go again to the Data pane, but this time we're going to start from the set. Let's go to set number one and right-click on it. Here we have an option called Create Combined Set. Let's click on that. As you can see, we have a new window for the combined set. First, let's give it a name. It's going to be Set four and combined. Then we have to define the two sets. Here is set one, since we started from it. On the right side, if you click on it, you will get a list of all sets available in the Data pane. We have set two and set three. We're going to go with set three. All right, with that we have defined which sets are going to be combined, but now we have to define for Tableau how the data is going to be combined. Here we have four options. The first one is all members in both sets. The second one is only the shared members in both sets. The next one focuses on set one, and the last one focuses on set three. For this example, we're going to go with the shared members in both sets. Let's go and select that. As you can see, the icon here between the sets changed as well. All right, so now everything is ready. Let's click OK. Here again in the Data pane we have a new field, a new dimension. I'm going to double-click on it. Now let's see the results. We are combining set one over here with set three. If you search for the shared members, it's going to be only customer two, since he is in for set one and also in for set three. As you can see, we have only one member in the combined set, and that is the customer John, because he is the only customer shared between the two sets. It's really not that hard.
You just have to pay a little bit of attention to which combining option you are using. All right guys, so far we have learned how to create sets from the Data pane using different methods. Next we're going to learn how to create sets directly from the view. All right, so now we're going to create a new view, and it's going to be something similar to the cluster group. We're going to have the two measures, profit and sales. So let's go and select them: double-click on the profit and double-click on the sales. We now have the two axes. What we are missing now is the customers. In order to add the data points, we're going to go to the customer ID and double-click on it. So now we have our view, and we're going to create the set directly from the view. It's very similar to the groups. We're going to select which customers are going to be the members of our set. In this example, we're going to select the customers with high performance. All you have to do is select them like this. Let's go for those customers. And again, here we have this new window. Last time we created a group, but this time we're going to create a set from those customers. So click on that, and then we have to select Create Set. Let's go and select it. Now we have a new window, and as you can see, we cannot define conditions or any dynamic set here. It's going to show us a list of all customers that we have selected in the view. The only thing that we can do over here is check whether we selected all the customers correctly. If we made any mistakes, we can go and remove a customer. Now let's give it a name. I'm going to call it Set Customers high performers. That's all for now. We're going to hit OK, so let's select that. As you can see, nothing changed yet in our view. But we have a new field in the Data pane, the set. So we just created a new set directly from the view.
Now quickly I want to show you something. If you are selecting a group of points like this and, let's say, the window here disappears, what you can do is go to any of those data points and right-click on it. Here the last option is Create Set. This is another way to create a set directly from the view. All right, so now we have the set. And you might ask me, okay, what can you do with it? Well, we can do many things with the set. First, we can highlight it in our view. In order to do that, we're going to take the set from the Data pane and put it on Color to quickly see which members are in and which members are out. As you can see, Tableau always uses gray for the members that are out of the set. Of course you can change that by going to the Marks card. If you go over here and then to Edit Colors, you can define the color of In and the color of Out. But for me the colors are okay for now, so let's click OK. With that, you are highlighting a subset of your data for the end users. All right, so the other use of sets inside our view is to focus on a specific subset. Currently we are showing all the customers, in and out. How do we filter the data to only the customers that are members of the set, only the in group? In order to do that, we're going to go to our set and right-click on it. Here you can find two options. As you can see, by default we have Show In/Out of Set. That means we are showing everything. But we have another option called Show Members in Set. That means we're going to filter the data and show only the members inside our set, the in group. Let's select that and see what happens. As you can see, Tableau removed all the customers that are outside of the set, and we can see in the view only the members of the set. This is a really quick way to filter your data and focus on a specific scenario. But now you might say, you know what?
Let's give this option to the users. Let's let the audience, the users, decide which subset they're going to focus on. This is going to make your view more interactive and dynamic. For that, we can offer the set as a filter. So let's see how we can do that. First we have to show all the data points in our view, so we're going to switch that back. Let's go to our set, right-click on it, and select Show In/Out of Set, so show everything. Let's select that. Next we can offer the set as a filter. Go to our set again, right-click on it, and here we have the option Show Filter. Let's select that. Now as you can see, on the right side we have the options In, Out, and All. So now we have different scenarios. If the users want to see the whole big picture, all customers, they're going to leave the filter as it is. But if we have a different scenario where they want to focus on the subset of the customers with high performance, all they have to do is deselect Out in the filter. So let's go and do that. And now as you can see, we are focusing on the in group, only the members in the set. And for some other reason, other users may want to focus on the group that is outside of the set, maybe to understand the behavior and so on. So they're going to deselect In and select Out. Now we are focusing on the group that is outside of the set. And again, if you want to see the whole big picture, you're going to select both of them. I really prefer to give this option to the users, to decide which subset they're going to select and focus on, because with that you are covering many scenarios in only one view. All right guys, so now with sets in Tableau, we can go a step further. We're going to give the full dynamic to the users, and they're going to have the option of defining which customers are going to be in the set.
So far, by creating the views, we defined everything ourselves: which customer is in and which customer is out. But now, instead of predefining it, we're going to give the users the full dynamic of defining the whole set. So let's see how we can do that. To make the set dynamic and interactive, we're going to add an action to our worksheet. I will dedicate full tutorials later to actions and interactivity in Tableau, but for now let's just learn how to add an action for sets. All right, to do that we go to the main menu in Tableau, to Worksheet. Select that, and then Actions. Let's go there. I will not go into detail explaining all the options we have in the actions, because there is far more here than sets. So for now just follow me: we go to Add Action over here, and then we have the option Change Set Values. That means the actions of the users are going to change the values in our set. Let's select that. Now we have to give the action a name, so we call it Action Change Sets. Next we can select the worksheets this action applies to. If you go over here, you can see the list of all sheets in our whole workbook. I want to apply this action only to this worksheet, so everything is fine. And here we define the behavior of the user. The question is when the action is going to be triggered: by hovering with the mouse, by selecting the data points, or from a menu. I will stay with the default, so the user clicks on the data points. All right, now we define the target set: which set is going to change once we do the action? Let's see what we have here. As you can see, we have two data sources. In the tutorial we created three sets in the small data source. 
And in the big data source, we have created only one set. Once the action is triggered, the values of this set should change, so let's select it. And now we are coming to the interesting part. But first, a sip of coffee. Okay, so here we have two types of actions with the mouse. First, let's check the left side: what happens when we select a data point. The first option is Assign values to set. That means it creates a completely new set from what you selected. The second option is Add values to set, so Tableau keeps the old values and everything you select is added to the set. With the last option, Remove values from set, anything you select is deleted from the set. It really depends on how you want the users to interact with the view. Either you want them to create a completely new set, so you go with option one. Or you want to predefine a set and have them extend it by adding new members, so you go with option two. Or you want the users to start removing members from the pre-existing set, which is option three. I would say let's go with option two, where the user adds members to a predefined set. All right, so that is the left side: what happens once the user starts selecting. On the right side, we define what happens once the user moves away from the selection. Here the first option is to keep the set values. The second is to add all values to the set, which means that once the user moves away from the selection, all the members, all the customers, are going to be in the In group, inside the set. And the third one is exactly the opposite: all the data points are going to be outside of the set. I think both of those are extreme, so we can leave it as it is, Keep set values. So now let's keep those options and see what happens in the view once we start selecting. 
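The three selection behaviors described above map naturally onto ordinary set operations. The following is only a minimal Python sketch of that idea, not how Tableau implements set actions; the customer names, the `apply_set_action` helper, and the mode strings are hypothetical and chosen purely for illustration.

```python
# Hypothetical customer names; the set starts with two predefined members.
all_customers = {"Anna", "Ben", "Carla", "David", "Emma"}
predefined_set = {"Anna", "Ben"}


def apply_set_action(current, selected, mode):
    """Return the new set members after the user selects some marks.

    The mode mirrors the three options on the left side of the dialog:
    "assign" replaces the set, "add" extends it, "remove" shrinks it.
    """
    if mode == "assign":
        return set(selected)
    if mode == "add":
        return current | set(selected)
    if mode == "remove":
        return current - set(selected)
    raise ValueError(f"unknown mode: {mode}")


def in_out(members, universe):
    """Split all customers into the In group and the Out group."""
    return members & universe, universe - members


selection = {"Ben", "Carla"}  # marks the user clicked on
assigned = apply_set_action(predefined_set, selection, "assign")
added = apply_set_action(predefined_set, selection, "add")
removed = apply_set_action(predefined_set, selection, "remove")

in_group, out_group = in_out(added, all_customers)
print("In:", sorted(in_group), "Out:", sorted(out_group))
```

With the "add" mode chosen in the lesson, the predefined members stay in the set and every newly clicked customer joins the In group, which is what the view shows once the action is active.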
So let's click OK. As you can see, here we have our new action. Let's click OK again. Now let's go into the view and start selecting things. But before that, I want to change the shape of those data points to make them clearer, so let's go to Shape and use the filled circle. All right, right now I'm not selecting anything. If I just move my mouse over here, you will see that nothing changes, because the action is triggered by selecting, by clicking on a data point. Let's click on that one and move away. Now we can see this member is blue. That means it is in the set, and any data point I click on will be added to our set. Or we can go over here, for example, and select all of those at one time. Anything I select in the view, as you see, is included in our set. With that, we are going fully dynamic and giving the user the option to define which customer is in and which customer is out. All right, with that we have covered everything about sets: how to create them as fixed or dynamic, from the data pane or from the view, how to add actions to them, and how to use them as filters. This feature in Tableau is really great. All right, now let's summarize. Sets in Tableau divide your data, based on specific criteria or a selection, into two groups. We have the In subset, which contains all the members inside the set, and the Out subset, which contains all members that are not included in the set. Sets are a very important feature in Tableau, since they allow your users to focus on subsets of your data and compare them with the remaining data. And sets are a great way to add dynamics and interactivity to your views by giving the users the option to define which subset they focus on. All right, okay, so that's all for sets in Tableau. Next we will learn how to group the values of measures using bins and how to build histograms in Tableau. 105. 
Tableau | Bins & Histograms: All right guys, so far we have learned different methods to group the values of dimensions. Now we will learn how to group the values of measures, and for that we can use bins in Tableau. As usual, let's first understand the concept behind bins, and then we can learn how to build them in Tableau. Let's go. All right guys, earlier, when we learned about dimensions and measures, we learned the secret formula for building new views: measure by dimension, like sales by category. But what if we have to build a view from two measures? Then it is measure by measure, like profit by sales, quantity by profit, and so on. One way to do that is to convert one of those measures to bins, so we get profit by sales bins and quantity by profit bins. So what are bins? Bins divide the data into groups of equally sized containers, resulting in a systematic distribution of the data. And we can use those bins to create charts called histograms. A histogram classifies your data into different bins and then counts how many data points fall inside each of those bins. In histograms, we usually use a bar chart to visualize the data. All right, let's take an easy example to understand bins and histograms. We have the following data: ten customers with their scores, where the scores are like points that the customers collect. We want to count how many customers fall within each range of scores, for example how many customers we have in the range 0-30, 30-60, and so on. First we have to create bins, and to create bins we need a few pieces of information. What is the highest value in the scores? That is the first customer, with 63. And what is the lowest value in the scores? That is zero. The next value we have to define is the size of the bin. For example, here we take a size of 30. 
And now we have all the information we need to create the bins. Don't forget they are equally sized. What does that mean? The first bin is 0-30: it starts at the lowest value, zero, and the size is 30, which is why we get the range 0-30. The next one is 30-60; again, as you can see, the size is 30. And the last bin is 60-90. There we stop, because the last bin contains the highest value. With that, we have created equally sized bins from the measure score. Now that we have our bins, we count how many customers, how many data points, fall inside each bin. Our first bin is 0-30, so let's see how many customers we have inside this range. The first customer is out, so we don't count it. The second one is inside the range, so we have one customer, then two, then three. This customer is out of the range, and the same over here. Then we have customer number four, this customer is out, then customer number five, and that's it. So we have five customers between 0 and 30. Now let's move to the next bin: how many customers have a score of 30-60? Let's start counting and scan our table. I think all those values are out. We have this customer inside the range, then the 45, and the 55 as well. So we have four customers with a score of 30-60; this is our second bin. Let's move to the last bin, the range 60-90, and count how many customers fall inside it. We have ten customers and have already counted nine, so there is only one left, and that is customer number one. All other values are not in this range, so we have one customer and that's it. With that, we have created a histogram for the scores. 
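The counting walk-through above can be reproduced in a few lines of Python. This is a sketch under assumptions: the transcript only names some of the scores (63, 45, 55 and the lowest value 0), so the list below is hypothetical and merely chosen to give the same counts of five, four, and one. It also assumes that a value on a boundary, such as 30, belongs to the bin that starts there.

```python
from collections import Counter

# Hypothetical scores for the ten customers (only 63, 45, 55 and 0 are
# mentioned in the lesson; the rest are made up to match the counts).
scores = [63, 10, 0, 25, 45, 55, 5, 20, 35, 50]


def bin_start(value, size):
    """Lower edge of the equally sized bin that contains the value."""
    return (value // size) * size


def histogram(values, size):
    """Count values per bin, from the lowest to the highest value's bin."""
    counts = Counter(bin_start(v, size) for v in values)
    first = bin_start(min(values), size)
    last = bin_start(max(values), size)
    return {start: counts.get(start, 0)
            for start in range(first, last + size, size)}


hist_30 = histogram(scores, 30)
for start, count in hist_30.items():
    print(f"{start}-{start + 30}: {count} customers")
```

The inputs are exactly the ones named in the lesson: the lowest value, the highest value, and the bin size.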
We just had to create the bins and count how many data points are inside each of them. We call those blue bars bins, and each bin has a size. Now let's say we want to define another value for the bin size and take the value ten. What happens? We get more bins: the first one is 0-10, the next is 10-20, then 20-30, and so on. It makes sense: if you define a smaller size for the bins, you get more chunks of the data. Instead of three bins, we now have seven. And as you know, after creating the bins we can count how many customers fall inside each of them. If you start counting, you get the following histogram. As you can see, what defines the bins is the lowest and highest values inside our data and the size of the bins. Using bins, we created different groups from a measure. Now you might ask me: why do we need histograms, why are they important? Well, if you compare the table on the left side with the histogram on the right side, in the histogram you can quickly identify trends and patterns in the distribution of the customers. You can see right away that most of our customers have a score of 0-30. This type of chart helps you quickly understand whether everything is okay or whether you have to improve in certain areas, define new strategies, and make better decisions using the data. All right, now let's see how we can create bins and a histogram in Tableau. We can do that only on the worksheet page; we cannot do it on the data source page. There are two ways to do it: either we create bins in the data pane, or we create bins in the visualization. Let's start with the first one. We're going to create a histogram for the customer scores, and we stay with the big data source. On the left side, we go to the data pane, where we need the score. Right-click on it, and then go to Create. 
And here we have the option Bins. Let's click that. Now we have a new window to create the bins. First we have the field name, which we leave as it is. The second option is the size of bins. By default, Tableau follows a specific mathematical equation to find a suitable bin size, but if you don't want this value, you can change it. For example, let's go with the value 20. Below that, we find information about the range of values: the minimum and maximum values inside the field score and the difference between them. For now, that's all: we have a bin size of 20. Let's hit OK. Now if you check the data pane on the left side, you can find a new field called score bin. It is a dimension, because it has a finite number of values. The score itself of course stays a measure. Let's check the values inside our new field by dropping it on Rows. As you can see, we have the bins, and the size of each bin is 20. Okay, so far we have the bins from the score. The next step in making a histogram is to get the count of the customers. Let's use this measure, the customer count: drag and drop it onto the view. Then I have to swap them so it looks like a histogram. With that we have our histogram, but we are not there yet. To make it look like a real histogram, the bins have to be continuous. If you check the score bin on the left side, you can see it is discrete; it has a blue color. So we convert it to continuous: right-click on it and choose Convert to Continuous. It is still discrete in the view, so we have to convert it to continuous there as well. With that, we have created a histogram in Tableau. As a final touch, I'm going to add the values for each bin. 
So we go to Label and Show mark labels. I'm going to change the coloring in our histogram as well, so I take the score bin and put it on Color. Let's do that. We are still not there: I would like the bin with the highest number of customers to be darker. To do that, we go to Color, Edit Colors, and then over here we reverse it. Click OK. Now I'm happy. This is how I usually present histograms in projects. Once we have the histogram, we have to discuss it in order to understand the data. Usually we look for peaks, valleys, or any outliers that stand out. Histograms come in different shapes with different interpretations. The shape of our histogram is called skewed to the right. Skewed to the right means that the histogram has its highest peak on the left side, and the frequency of the data decreases as you go to the right, so on the right side you have the lowest frequency of data points. That is naturally good in this example: it means we have a lot of new customers that have not collected any points yet. Histograms are really powerful for seeing the distribution of your customers at a glance, so you can quickly understand whether there are issues in your business or spot new trends. For this example, we decided that the bin size is 20. Let's say you want to change the distribution and therefore the size. To do that, go to our field, right-click on it, and select Edit. Here we can change it to ten. Let's click OK. As you can see, we now have more bins and more detail about our data. Now you might say: I want it to be more dynamic, and I want to give the users the option of defining how many bins we have. 
And for this we can use another feature called parameters, which comes in the next tutorial. All right, so far we have learned how to create bins from the data pane. There is another way to create bins and a histogram in Tableau, which is way easier than what I showed you: we can do it directly from the visualization. Let me show you what I mean. Let's create a new worksheet, and let's say I want to create a histogram from the sales. To do that, we take the sales and put it on Rows. Then we go over here to Show Me, where we have a predefined visualization from Tableau called histogram. The requirement for this visualization is only one measure. Once we click on it, you will see that Tableau did everything. If you check the data pane on the left side, we already have a bin dimension called sales bin with the role continuous. And of course Tableau suggests the size of the bins; you can change that, of course. As you can see, it's really easy: we just put one measure in the view and click on the histogram, and the rest is done by Tableau. This is exactly the power of Tableau in visualization. All right, now let's have a summary. Bins divide your data into equally sized containers, which results in a systematic distribution of the data. Bins are the method for creating groups from measures. That means we can create bins only from measures; we cannot create them from dimensions, because dimensions already act as groups. Bins themselves are dimensions, and it's better to convert them to continuous dimensions to use them in histograms. One limitation in Tableau is that you cannot create bins from calculated fields. And the main purpose of bins and histograms is to quickly identify patterns and trends in the distribution of your data. 
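The relationship between bin size and number of bins, seen earlier when the size changed from 30 to 10, follows directly from the lowest value, the highest value, and the size. Here is a small Python sketch of that arithmetic; the helper name `bin_starts` is made up for illustration, and the range 0 to 63 comes from the score example in this lesson.

```python
def bin_starts(lowest, highest, size):
    """Lower edges of the equally sized bins needed to cover the range."""
    first = (lowest // size) * size
    last = (highest // size) * size
    return list(range(first, last + size, size))


# Scores in the lesson's example run from 0 to 63.
for size in (30, 20, 10):
    starts = bin_starts(0, 63, size)
    print(f"size {size}: {len(starts)} bins, starting at {starts}")
```

A smaller size produces more, narrower bins over the same range, which is why the histogram shows more detail after the size is reduced.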
All right, okay, so that's all for bins and histograms, and with that we have learned everything about how to organize and customize our data in Tableau. We are done with this chapter. Next, we will learn how to filter your data in Tableau using different techniques at different layers. 106. Tableau | Section: Filtering & Sorting Data: Filters in Tableau. We have many different types of filters for different purposes, like optimizing performance or letting your users explore your data. That's why it's very important to understand them and the differences between them. So first we start by understanding the concept behind the different types of filters in Tableau. Then we learn the different methods for creating all those filters in Tableau. Moving on, we learn the many options for customizing filters in Tableau. And at the end, I'm going to share with you many tips, tricks, and best practices for using filters in Tableau that I usually follow in my projects. So let's start with the first topic, where we understand the concept behind the different types of filters in Tableau. Now let's go. 107. Tableau | Types of Filters: All right guys, the best way to understand hierarchies is to look at an example. If you take a look at our data, for example the customers, you can find that some dimensions are related to each other, since they hold similar information. For example, in the dimension country we have values like Germany, USA, and France. We have another dimension, city, where you can find the cities inside those countries; for Germany, we have Berlin and Stuttgart. And then we have a third dimension, postal code, where you can find the codes inside those cities. As you can see, these three dimensions describe common information: they tell us about the user's location, and we can relate those dimensions to each other using a hierarchy. In hierarchies, we have different levels. 
We start with the top node, which we call the root node. This node represents the highest level of aggregation in our hierarchy. Now we go to the next level of the hierarchy, where we have the country. At this level we see more detail about our data, for example the two values USA and Germany. The links between the nodes we call branches. Now we go to the next level in our hierarchy, level two, the city. Here we see more detail about our data: in the USA we have Portland and Seattle, and in Germany we have Stuttgart and Berlin. Again, the parent node and the child node are linked by branches. Now we go to the last level in the hierarchy, the postal code. Here we split the structure further, with more detail, so we have the postal codes for each city. Since the postal code is the last level in our hierarchy and those values don't have any children, we call those nodes the leaf nodes. The leaf nodes, or leaves, represent the most detailed level of our data in this hierarchy. With that, we have the complete structure of our hierarchy. As you can see, it looks like a tree. To recap: we have the root node at the top, which represents the highest level of aggregation. Then we have the intermediate levels, connected by branches. And then we have the leaves, the leaf nodes, which represent the lowest, most detailed level of our data. As we learned before, we can do many OLAP operations on the cube. So if we have a hierarchy in our data, we can do two very important operations: the drill down and the drill up. 
Drill down and drill up are OLAP operations that help us navigate through the hierarchy in order to gain a deeper or a higher-level understanding of the data. Let's first understand how drill down works. Let's say we are working with the measure sales. We start at the top node, at the highest level. At the highest level we have the total sales in the whole dataset; for example, 140. So now we are at the highest level, at the root node. If you use drill down, you jump to the next lower level in the hierarchy. That means at this level we see more detail about the sales: for the USA we have 90, and for Germany we have 50. If you want to see more detail about your data, we can apply drill down again to jump to the next lower level in the structure. What happens? We go to level two, and here the USA sales split between Portland and Seattle, with 40 and 50, and for Germany we have 20 for Stuttgart and 30 for Berlin. So we are seeing more detail about our sales. And if you want to go to the lowest level, to the leaves, we drill down from the city to the postal code. It looks like this: Portland splits between its two postal codes. Seattle stays the same, because it has only one child. The same goes for Stuttgart, which stays 20. And Berlin has two postal codes, so it splits again. As you can see, we use drill down to navigate through the hierarchy, moving from a higher level to a lower, more detailed level. It's like expanding the tree to see more detail and understand our data. All right, now let's talk about the second OLAP operation, the drill up. It's exactly the opposite of drill down: drill up takes us from bottom to top, from the more detailed levels to the more aggregated ones. 
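One way to picture drill down and drill up is as the same leaf-level rows summed at different depths of the hierarchy. The Python sketch below assumes that reading; it is not Tableau's implementation. The totals for the countries and cities and the 10 + 20 split for Berlin come from the lesson, while the postal code labels and the 15 + 25 split for Portland are hypothetical.

```python
from collections import defaultdict

LEVELS = ("country", "city", "postal_code")

# Leaf-level rows. Postal code labels are placeholders, and the split of
# Portland's 40 into 15 + 25 is an assumption made for illustration.
rows = [
    {"country": "USA", "city": "Portland", "postal_code": "P-1", "sales": 15},
    {"country": "USA", "city": "Portland", "postal_code": "P-2", "sales": 25},
    {"country": "USA", "city": "Seattle", "postal_code": "S-1", "sales": 50},
    {"country": "Germany", "city": "Stuttgart", "postal_code": "ST-1", "sales": 20},
    {"country": "Germany", "city": "Berlin", "postal_code": "B-1", "sales": 10},
    {"country": "Germany", "city": "Berlin", "postal_code": "B-2", "sales": 20},
]


def aggregate(data, depth):
    """Sum sales grouped by the first `depth` levels (0 is the root node)."""
    totals = defaultdict(int)
    for row in data:
        key = tuple(row[level] for level in LEVELS[:depth])
        totals[key] += row["sales"]
    return dict(totals)


# Drill down: increase the depth. Drill up: decrease it again.
for depth in range(len(LEVELS) + 1):
    print(depth, aggregate(rows, depth))
```

Increasing the depth by one corresponds to clicking the plus sign, and decreasing it corresponds to the minus sign; the grand total stays the same at every level.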
Let's say we start at the leaves, where we have the sales of those leaves. Now we can use a drill up to move from the postal code to the city. For example, we get total sales of 30 in Berlin, because that is the sum of 10 plus 20. Stuttgart stays the same at 20, Seattle at 50, and Portland also sums up the values from its leaves, so we get the value 40. As you can see, as we move higher, the values get more aggregated. Let's say we want to jump to the country, so we use a drill up again to move from the cities to the countries. For Germany we get total sales of 50; for the USA we get total sales of 90. Now you can use drill up again to go to the root node, where you have the highest level of aggregation: the value 140, the total sales in our dataset. As you can see, if we have a hierarchy structure, we can use drill up and drill down to navigate through it. Hierarchies organize and structure the members of the dimensions into a logical tree structure by grouping related dimensions together. Hierarchies are really important and add dynamics to your views: you can have the big picture and understand the data at the highest level, and you can drill down to specific details to gain deeper knowledge of the data. All right, so now we are back in Tableau. Let's understand how we can create hierarchies in Tableau. We can create hierarchies only on the worksheet page; we cannot create them on the data source page. On the worksheet page, we create hierarchies in the data pane. If you take a look at the customers table, you can find that we already have a hierarchy. Here we have a small icon that indicates a hierarchy, the hierarchy name, Country City, and on the left side over here a small arrow. 
If we click on it, the hierarchy expands and we can see the dimensions inside it. Speaking of dimensions: hierarchies can be used only for dimensions. You cannot create a hierarchy from measures. The hierarchy we have over here was created automatically by Tableau, since Tableau analyzed the content of the country and the city and automatically understood that there is a hierarchy between them. But since we want to learn how to create a hierarchy, we're going to remove it and create a new one from scratch. To remove a hierarchy, go to the hierarchy name over here and right-click on it. Then we have the option Remove Hierarchy. Here you have to understand that the dimensions inside the hierarchy will not be deleted; only the hierarchy itself will be deleted. So you will not lose any fields; only the logical tree, the logical hierarchy, is removed. All right, now let's see how we can create a hierarchy in Tableau. We're going to create the location hierarchy. We go to the data pane on the left side and select one of the dimensions. It doesn't matter which one you select, but I prefer to start with the highest level of the hierarchy, which in our example is the country. Select the country and right-click on it. Then here we have something called Hierarchy, and we select Create Hierarchy. Let's go there. We have to give it a name, so we call it location hierarchy, then hit OK. As you can see, on the left side we now have the hierarchy icon, and inside it we have only one dimension, the country. Our hierarchy also has the city and the postal code, so how can we add them? As we learned, the hierarchy has different levels, and the order of those levels is really important. We have country, city, and postal code. 
Now, to add the city, we just drag and drop the city beneath the country over here and release it. With that, we have the city inside our hierarchy. Let's grab the postal code as well: we drag and drop it beneath the city and release. With that, we have created the location hierarchy with the three dimensions country, city, and postal code. Again, if you want to hide the details of this hierarchy, you can collapse it over here, and if you want to see the details, you can expand it. All right, so this is one way to create a hierarchy in Tableau, using the drop-down menu. The second way is to quickly drag and drop dimensions onto each other. For example, if we go to the product table, we have a hierarchy here as well, between the category, subcategory, and product name. Our hierarchy starts with the category, then the subcategory, and the last level, the leaves, is the product name. Now let's see how we can create the hierarchy with a quick drag and drop. We take one of those dimensions, let's say the category, and drag and drop it onto the subcategory. I'm now hovering over and selecting the subcategory. Let's release. Once we do that, Tableau understands that we want to connect those dimensions, so Tableau creates a new hierarchy. We call it the product hierarchy and hit OK. Now let's see: on the left side we have a new hierarchy called product hierarchy, with the icon, and inside it two dimensions, category and subcategory. We are missing the third dimension, so let's take the product name and drop it into the hierarchy. Now we have a problem: the order of the dimensions inside our hierarchy is wrong, because the dimension category should be level one and the subcategory should be level two. How can we fix that? Just select the category and drag and drop it above the subcategory. 
Let's release. That's it; this is how you change the order of the dimensions. With that, we have the product hierarchy. All right, now let's say we don't want to remove the whole hierarchy; we just want to remove one member, one dimension, from it. Let's say we want to remove the product name. Select it and just drag and drop it somewhere here in the empty space. With that, the product name is no longer a member of the hierarchy. This is how we remove dimensions from a hierarchy. But I want to put them back in our hierarchy because we need it later. So I put the subcategory beneath the category, then take the product name and put it beneath the subcategory, and that's it. So these are the two methods of creating hierarchies in Tableau: either with the drop-down menu or by quickly dragging and dropping the dimensions onto each other. It's really easy. All right, now we have this hierarchy, the structure. How do we use it inside our view? It's really easy. We select the whole hierarchy and drag and drop it onto the view. The hierarchy starts at level one, the countries, so we see the values of the country. Now let's add one of the measures: we take the sales and drag and drop it onto Columns. If you look closely at the country, at the blue pill over here, you can see that we have a new sign, the plus sign. This sign indicates that we can drill down in this dimension. So let's click on the plus sign. As you can see, we are drilling down in our hierarchy to a lower level, and we are seeing more detail about the sales. We are now at the next level, the level of the city. As you can see, we have the dimension city on Rows. We didn't drag and drop it from the data pane and put it on Rows; it was expanded from the hierarchy. 
Again, the city has the plus sign, which indicates that we can drill down inside the city. Let's drill down again. As you can see, we are now at the postal code and can see more detail about the sales. If you check the postal code, there is no plus sign like on the city and the country, because we are at the leaves, the lowest level of detail in our data. With that, we have navigated through our hierarchy from the top node to the leaves. As you can see, it's really easy and very dynamic. Now let's say we are at the leaves and want to drill up, back to the highest level of aggregation, the top node. It's really easy: if you check the city and the country again, they no longer have the plus sign; they have the minus sign. The minus sign indicates that we can drill up in the hierarchy. Let's see what happens if you click on the minus sign. As you can see, we drilled up from the leaves, from the postal code, back to the city, and the sales values are now more aggregated. The same applies if you want to drill up from the city back to the country: we click on the minus sign. Let's do that. With that, we have moved to level one, the highest aggregation in our hierarchy. All right, so far we have drilled up and down in our hierarchy using the Rows shelf. And you know that the Rows and Columns shelves are what we use as developers to build our view. Now the question is how our users, the audience, can drill up and drill down through the hierarchy, because the users should also be able to use the hierarchy to quickly drill down to the details. Let's see how we can do that. If we go to the view over here and hover over the country, we can see a plus sign again. Let's click on that. As you can see, we drill down in our hierarchy from the country to the city. Now let's go into more detail and drill down to the postal code. 
We can hover over the city and, as you can see, we again have the plus sign. Click on that, and with that we drill down to the postal code. This is exactly how the users can drill down in the view. Now if we want to drill up, back to the higher level, we can do the same. We can see the minus sign over here. Click on it and you go back to the city. On the country we have the minus sign as well; we click on that, and with that we drill up back to the country. As you can see, with those icons we can navigate through our hierarchy. Now you or your users might say: you know what, this is a really small icon and my users don't like it. Is there any other way to drill up and drill down in the view? Well, yes. If you go to any of those values over here and right-click on it, you can see in this drop-down menu that we have Drill Down. If you click on that, we drill down to the city. Then the same again: select any value, it doesn't matter which one, and drill down again. With that we are at the postal code. If you want to drill up, you can do the same: right-click on any value, and here we have Drill Up, so let's click it. And to drill up back to the country, go to any value in the country, right-click on it, and drill up. So those are the two ways to drill down and drill up in the view. All right guys, so far we have created our own hierarchies by putting dimensions together in different levels. But in Tableau we also have implicit, embedded hierarchies in the data type date. Any field with the data type date has the following hierarchy: it starts at the highest level with the year, then we have the quarter, the month, and then at the lowest level, the leaves, we have the days. Those four levels are the default levels inside each field with the data type date in our dataset. We have another data type that also holds an embedded, implicit hierarchy: the fields with the data type date and time.
Here we also have information about the time, and we have seven levels. It starts exactly like the date: the highest level is the year, then the quarter, the month, and then the day. But now we can drill down to more details since we have the time information. The next level is the hours, then we have minutes and seconds. Seconds are the lowest level of detail; they are our leaves. So here we have seven levels in the hierarchy. Both date and date and time have a hierarchy embedded inside them. Now let's uncover those hierarchies in Tableau. All right, so now we're going to go to the table Orders. Here we have two dates. It doesn't matter which one we take; both of them have exactly the same hierarchy. Let's take the order date and drag and drop it here on the rows. As you can see, we now have the plus sign. It indicates there is a hierarchy, and it starts at the highest level with the years. Now let's take a measure to see some data. We're going to take the order count and put it on the columns. I want to show the labels as well, so let's show some labels. All right, now let's discover the hierarchy inside the date. As you can see on the left side, we don't see any information about the hierarchy, so that means it's really embedded inside this data type. Let's go to the years and click on the plus sign to drill down. As you can see, the next information we have is the quarter, so now we see the total number of orders by quarter. We can drill down again to see more details about the total counts, and then we can drill down to the day. Now we are at the lowest level, the day. We cannot drill down further to, for example, hours, minutes, and seconds, because the order date has the data type date. As you can see, the dimension Order Date has four levels: year, quarter, month, and day. It's really nice to have it like this in Tableau because it's really standardized.
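To make the two embedded hierarchies a bit more concrete, here is a minimal Python sketch of the levels described above. It only illustrates the idea and says nothing about how Tableau implements it internally; the names `DATE_LEVELS`, `DATETIME_LEVELS`, and `drill_path` are made up for this example.

```python
from datetime import date, datetime

# Levels as described in the lesson: four for a date, seven for a date and time.
DATE_LEVELS = ("year", "quarter", "month", "day")
DATETIME_LEVELS = DATE_LEVELS + ("hour", "minute", "second")


def drill_path(value):
    """Return (level, value) pairs from the top node down to the leaves."""
    # datetime is a subclass of date, so check for datetime first.
    has_time = isinstance(value, datetime)
    levels = DATETIME_LEVELS if has_time else DATE_LEVELS
    parts = {
        "year": value.year,
        "quarter": (value.month - 1) // 3 + 1,  # months 1-3 -> 1, 4-6 -> 2, ...
        "month": value.month,
        "day": value.day,
    }
    if has_time:
        parts.update(hour=value.hour, minute=value.minute, second=value.second)
    return [(level, parts[level]) for level in levels]


# A plain date stops at the day; a date and time continues down to the second.
print(drill_path(date(2021, 8, 15)))
print(len(drill_path(datetime(2021, 8, 15, 9, 30, 5))))
```

A field like the order date corresponds to the first call: the path ends at the day, which is why there is no plus sign below that level.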
I have worked with other BI tools where we have to build it on our own, and building all those hierarchies is really time-consuming, especially if you have a big dataset. Here in Tableau our life is easier: Tableau decided to have a hierarchy inside each date. All right guys, one more thing about hierarchies. They really organize and structure your views and make them more dynamic for the users. For example, if you have requirements to show sales by country, sales by city, and sales by postal code, and you don't use hierarchies, you will end up making three views, like here on the left side. It takes a lot of space and it's also less dynamic. Better than that, we can create a hierarchy between those dimensions and put everything in one view. Then you give the end users the option to drill down and drill up, depending on what they need. If they want the sales by country, we already have it at the top node. If they want the sales by city, all they have to do is drill down to the next level, and there we have the sales by city. If someone needs to go into more detail, to the postal code, they can drill down as well to the sales by postal code. As you can see, it makes your view more dynamic and more attractive for the end users compared to the left side. It's more dynamic and more interactive for the end users, and you are also creating fewer views in your dashboards. So this is really great. If you want to drill up back to the country, we can just click the minus sign. Hierarchies make your views more dynamic, and they structure and organize your data in the views. All right, now let's summarize. Hierarchies organize and structure the members of the dimensions into a logical tree structure. Hierarchies are a special feature only for dimensions; you cannot create hierarchies between measures. We can drill down and drill up to navigate through our hierarchy to gain a deeper or higher-level understanding of our data.
Overall, hierarchies are really important to organize and structure your data in views. They give the users a powerful tool to quickly and easily navigate and explore your data, uncover insights, and make better decisions. All right, so that's all for hierarchies in Tableau. Next we will learn how to group the members of dimensions into higher categories using groups. 108. Tableau | How to Create Filters: All right, so now we have the following task, where we have to hide sensitive information. For example, let's say that the USA data in our dataset is sensitive and we have to hide all the customers that come from the USA. We're going to build a view from the customers: we take the country from the location, and then let's say we take the profit from the orders. As you can see in the worksheet, we can see all the countries, including the USA. Now we're going to hide this sensitive information. In order to do that, we go to the Data Source page. Here in the top right corner we can see Filters, and we can add a new filter. So let's click on it. We get a new window called Edit Data Source Filters. It's really easy here: we click on Add, and then we get a list of all the fields that are available in our data source. Since we have to hide the customers from the USA, we need the field Country. So let's select that over here, then click Next. Here we get another window to set up the filter for the country. As you can see, we have all the countries listed here. Now we can either select the countries that should be included in our dataset, or we can go over here, click Exclude, and exclude the USA. That means we are filtering out all the customers whose country equals USA. Let's click OK. Now we can see a quick summary over here.
The filter is based on the country, and the details say we are keeping the values France, Germany, and Italy. So that's it. Let's click OK. Now let's check the data in our worksheets. We switch back to our view and, as you can see, we cannot find any information about the USA. This affects all the worksheets that are connected to this data source as well. For example, if you go over here and create a new worksheet, and we take the country and drag and drop it over here, you can see that here as well we don't have the USA; we have the values France, Germany, and Italy. And with that we have protected this sensitive information. All right, so now we go to another use case of the data source filter: reducing the size of data inside Tableau. This is very critical. If you have bad performance in Tableau, you have to start thinking about how to reduce the size of the data inside your visualizations. As the first step, we have to decide which fields we're going to use to filter our data. A very common choice is to reduce the number of years inside our data source. Let's build a view. I'm just going to create a new worksheet. Let's take the order date to the rows and the profit to the columns, then make it a bar chart and show the results. As you can see, we have five years of data. This field is a really good candidate for reducing the data, and you have to discuss it with your users. So we have to ask: do we really need five years of data inside the visualizations? Is it enough to have only the last two or three years? Let's say that after discussions with the users, you agree that the relevant data for the visualizations starts from 2020. Anything before that is not relevant anymore, so we would like to have everything starting from 2020.
In order to do that, we're going to build a data source filter. Let's go back to our Data Source page and go again over here, to Edit. Then we choose the field that we're going to build the data source filter on: go to Add, then we need the order date. We have it over here, so let's select it and click OK. Here, since it is a date, Tableau first asks us in which format we want to build our filter. Since we are talking about the years, I'm just going to go with the format Years and click Next. With that, we get a list of all the years inside our data source. Either you say: okay, I would like to include everything starting from 2020 and not select the old years. Or you say: you know what, I'm just going to exclude the old years, anything before 2020; so you go with Exclude, and with that we are removing the old years. I prefer this second option. Let's say that we get 2023 data inside our data source: with Exclude you don't have to go and select the new year each time. With that, we are saying all the data is relevant starting from 2020. Let's hit OK. You can see inside our data source filters that we got a new filter based on the years of the order date, and you can see some details: it says it keeps 2020, 2021, and 2022. With that, we are now filtering the data source based on the order date and the country. Let's click OK. As you can see here, we now have two filters in the data source. Let's go back to our view, Sheet 7. We can see that we have only the data starting from 2020. All the old data is no longer present in our visualizations. This is a really great way to reduce the stress and the size of data that Tableau has to handle: we are reducing the scope of the data and we are also going to get better performance in Tableau.
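The difference between the include list and the exclude list only shows up once new data arrives, so here is a small Python sketch of that reasoning. It is just an illustration of the two rules, not Tableau code. The lesson only says there are five years of data and that 2020 to 2022 are kept, so the old years 2018 and 2019 are an assumption, and the function names are made up.

```python
def keep_by_include(years, included):
    """Include rule: keep only the years that were explicitly selected."""
    return [year for year in years if year in included]


def keep_by_exclude(years, excluded):
    """Exclude rule: keep everything except the years that were crossed out."""
    return [year for year in years if year not in excluded]


# Assumed five years of data; 2018 and 2019 are the "old" years.
current_years = [2018, 2019, 2020, 2021, 2022]
include_rule = {2020, 2021, 2022}
exclude_rule = {2018, 2019}

# Today both rules keep the same years.
print(keep_by_include(current_years, include_rule))
print(keep_by_exclude(current_years, exclude_rule))

# When 2023 arrives, only the exclude rule lets it through without
# anyone having to edit the filter.
later_years = current_years + [2023]
print(keep_by_include(later_years, include_rule))
print(keep_by_exclude(later_years, exclude_rule))
```

This is the reason for preferring Exclude in this use case: the rule "everything except the old years" keeps working as new years are loaded.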
So this is how we use data source filters to reduce the size of our data and also to hide sensitive information. But don't forget that all the worksheets that are connected to this data source are affected by those filters. All right, so now we're going to learn how to build a context filter in Tableau. Let's say that we have the following view: we take the category from the products and the subcategory as well, and for the measure let's take the profit. So let's put it over here, and let's change the colors as well, so we put it over here too. Now in this view we have all the categories: Furniture, Office Supplies, and Technology. But the users want to focus only on Office Supplies in this view. For this specific view, all the other categories are irrelevant information; they only want to focus on the office supplies by profit. That means we want to filter the data by category. In order to do that, we go to the category over here, hold Control, and put it on the filters. We get the same window for filtering again, and here you can see the three values: Furniture, Office Supplies, and Technology. For this view, we want only Office Supplies. So we remove the others, leave Office Supplies, and hit OK. As you can see, we have removed everything else and we have only the one category, Office Supplies. The job is done, right? We have the office supplies by profit, and we filtered the data. The answer is yes, the task is done, but we are not using the full power of Tableau. Since the focus here is only on Office Supplies, we are working with just this subset of the data. We could reduce the whole dataset to only this category. With that, you can gain a lot of performance in Tableau, because you are focusing only on a subset and all other data is removed from this visualization.
In such a scenario, we can use the power of context filters. Now the question is how to turn our filter into a context filter. As you can see, in the filters we have our category. It is a blue pill, and this filter type is called a dimension filter. Now we want to promote it to a context filter. As we learned before, the filters have a specific order: we have context, then dimension. All we have to do is right-click on it, and here we have the option Add to Context. Once you do it, you will see that our filter now has a gray pill. The gray pill indicates that this filter is a context filter. You might notice that nothing changed over here; we have exactly the same view. But we optimized things in the background: Tableau created a temporary dataset that has only the category Office Supplies, so it's a really small table compared to the whole data source. All right, so now I want to show you how Tableau processes the different types of filters. As we learned, the order of the filters is really important. The context filter is processed first, then the dimension filter, so the context filter dominates the behavior of the dimension filter. Now we're going to add a dimension filter to our visualization. We use the subcategory for that: right-click on it and click Show Filter over here. As you can see on the right side, we have all the values that are included in Office Supplies. But in our original data source we have many more subcategories than we are seeing now in this view. This is exactly the effect of the context filter on this dimension filter: we are seeing only the values inside this context. All right, so now we're going to change the definition of the context filter and see the effect on the dimension filter. Let's go to our context filter again, right-click on it, and choose Edit Filter. Let's bring it here, side by side with our dimension filter.
In the dimension filter we have only those values, and over here in the context filter we have only Office Supplies. If we now include Technology as well, let's apply and see whether the values on the right side change. As you can see, in the dimension filter for the subcategories on the right side we have more values than before, because we included the Technology data in our context, in our temporary table. We can change the values around. Let's have only Furniture, apply, and check the right side: you can see we have only four subcategories. With this, you can see that the context filter really dominates all the other filters below it. By understanding the order of the filters, you can understand how Tableau works with those different types of filters. So I'm going to set the context filter back to Office Supplies and hit OK. One more thing about the context filter: as we learned before, it is flexible. That means we can reduce the size of data for only one worksheet. If you go to any other worksheet, you will not find any context filter there. You can decide for each worksheet whether you want to reduce the size of data or not. This is unlike the data source filter, which affects the whole workbook, that is, any worksheet that is connected to this data source. With the context filter, we have much more flexibility. Now you might ask: can we use the context filter to hide sensitive information? Well, the answer is no. Let me show you why with a quick example. Let's take the customers again: we have the country and the city, and let's take the profit as well. As you can see over here, we don't have the USA data because we have the data source filter. Now let's say that the data of Germany is sensitive and we want to protect it using a context filter. Let's do that. We take the country, hold Control, and put it on the filters, and we say we want to exclude Germany.
So I'm going to click on Exclude over here and then hit OK. As you can see, we now don't have any information about Germany in the view. Then we promote the country to a context filter: right-click on it and choose Add to Context. Now you might say: okay, everything is fine; we don't have any information about Germany, so we are secure. Well, not really. There is still a way to see the German data in the view. Let me show you how. Go to the city over here and show it as a filter. On the right side, you will find all the cities from France and Italy; there are no cities from Germany or the USA. But here we have an option on the filter. If you go to this small arrow over here, we can choose to see all the values from the database. We will explain all those options later, don't worry about it, but let's click on it here. As you can see, the filter is now showing data about Germany: we have Berlin, we have Stuttgart. So the data is not really protected. We are hiding the sensitive data from the view, but we can still see all the values in the filter. That's why you should never use a context filter to protect your sensitive or confidential data. Even if the data shows up only in the filters, that still exposes it, and the data is not protected. So if you want to protect your data and hide sensitive information, use only data source filters. All right, so now we're going to move to the next filter in our chain, the dimension filter. We have already created some dimension filters in our view, but now let's go into the details and see all the options that we have. Let's go to the Filters shelf. You can see that we have the subcategory. It is a discrete dimension; that's why it has the color blue. Now, in order to see all the options, right-click on it and choose Edit Filter.
You already know this window. Let's just bring it over here to see the effect directly on the view. First, we have different tabs here, four of them: General, Wildcard, Condition, and Top. The first one is about manual selection of the values. The rest are dynamic filters, where you define a rule. Here in General, as usual, since the field is discrete, we see the list of all possible values, and you can manually select or deselect values from this list. As you can see, on the right side we have Exclude. The default in Tableau is include, so anything that I select from this list is included in the view, and anything that I don't select is excluded from the view. To have the opposite effect, we can click on Exclude. Now all the selected values are crossed out. That means they are excluded from the view, and everything that is not selected is included in the view. So it really depends: if you want to exclude only two values from a long list, then it makes sense to use Exclude. Now if you hit Apply, you can see in the view that the remaining values are Appliances, Art, and Binders; Tableau excluded all the crossed-out values. You get the same effect if you deselect Exclude and select only Appliances, Art, and Binders. To remove our selections, we can clear everything here with None, and then reapply our selection of Appliances, Art, and Binders. As you can see, we have the same effect. So this is how you work with the manual selection on the first tab, General. But now let's move to the next one.
Before that, I want to include everything over here so we don't affect the next one. So let's apply, and then we go to the Wildcard tab. The wildcard helps if you have a dimension with high cardinality, meaning you have a long list of possible values in the dimension. Selecting everything manually is going to be really painful. Instead, we can define a rule, if there is a rule to define. Here we have an input field where we can type something, for example A, and we have four options. The first one is Contains: it means that somewhere in the word there is the character A. The second option, Starts with, means that the word starts with the character A. The next one is exactly the opposite: the word ends with A. Then we have Exactly matches, which means the word must consist only of the value A. Let's start with the first one: if the word contains A somewhere, then it stays in the visualization. As you can see, all those words contain A somewhere. Appliances has it at the start and in the middle. Art has it at the start as well. Here we have it in the middle, and so on. Let's try out the second one: if the word starts with A, it stays in the view. So let's just apply. As you can see, we have only two words that start with A. All right, now let's go to the next option, Ends with. But instead of A, let's say any word that ends with S can stay in the view. Let's apply that. As you can see, all those words end with the character S. Now you might ask: is it case sensitive? Well, it's not. If you type a capital S, as you can see, Tableau still selects those values. Now let's go to the last one, the exact match. If you apply it over here, you will not see any data.
But if you type exactly Labels and hit Apply, you get only one subcategory: Labels. We don't use this one much, though; usually we use Contains, Starts with, or Ends with. This is how the wildcard works. Let's clear everything in order to get the data back: we set it to Contains and hit Apply. Let's move to the next tab, Condition. In the previous tutorials on parameters, we already worked with Condition and Top. Here we define a rule, and Tableau checks all the values and filters out those that do not meet the condition. For example, if you check our view, we have some negative values in the profit, and we don't want to see them. So we define a rule saying that we want to see only the profits that are higher than zero, only the positive profits. In order to do that, we select By field over here. Tableau immediately shows the measure that is used in the view; we are using the sum of Profit, which is correct. Then we say over here that the sum of the profit should be higher than zero. With that, we have defined a rule, so let's hit Apply. As you can see, we have just removed the subcategory that does not fulfill this condition. That's it; this is really easy. We're going to move to the next one, but first let's reset everything: we select None and then hit Apply. In the Top tab, we can define whether we want to see the top ten products, the top five, or the bottom five. Again, we define the rule for Tableau, and Tableau filters the data based on our rule. Let's select By field over here. Here we have two options, as I said, Top and Bottom: either the top subcategories or the bottom subcategories. Then we can define whether it is a top ten, a top five, or a top N driven by a parameter.
We learned that before. Here we stay with the sum, since we are using the profit, and that's it; let's hit Apply. Now we can see in the view that Tableau filtered our view based on our rule, so we have the top five subcategories. All right, so that's it. These are the different options for filtering dimensions. I'm going to deselect everything over here, then go back to the manual selection and hit OK. Instead of predefining the rules for the users, we can offer the whole dimension as a quick filter to the end users. As you know, in order to do that, we go to the dimension, right-click on it, and choose Show Filter. The users can then go to the quick filter on the right side and select the values that suit their needs. All right, so now let's move to the next one, the measure filter. As we learned, in the order of the chain it is below the dimension filter. So let's see how we can create a measure filter. We go to the sum of profit, hold Control, and drag and drop it to the filters. We get a new window to configure our filter. Since it is a continuous measure, Tableau asks us: do you want to filter the original data, all values, or do you want to aggregate first and then filter? Since it's a measure, we have aggregations like sum, average, median, and so on. If you want to filter only on the original data, you select All values. But since we have the sum of profit in the view, I would like to go with the Sum aggregation. Let's select that and click Next. Now we get a new window to configure our measure filter, and here we have four options: Range of values, At least, At most, and Special. Since our measure is continuous, Tableau can present it as a range that has a start and an end.
It's not like the dimensions, where we get a list of all values from the data source. We get only aggregated data, and we can configure only a start and an end. In the first option, Range of values, we can configure the starting point of the range and the end point as well; you control both of them. In the next one, At least, we control only one of them, the start: here we specify the minimum value that is allowed in the visualization. The next one, At most, is exactly the opposite: we define the end point of the range, the highest value that is allowed in the visualization. Again: with Range of values we specify the start and the end, with At least only the starting point, and with At most only the end point of our range. The last one, Special, is about the null values. Here we have three options: Null values, if you want to see only the nulls in this filter; Non-null values, if you don't want to see any nulls in the data; or All values, which allows both of them. The default is All values, and I'm going to stick with that. I would like to configure both the start and the end of our continuous measure. As you can see, it's really easy. Let's hit OK. With that, you can see we got a new filter on the Filters shelf, and it has, of course, the green color. All right, so first we go to our measure filter and show it as a quick filter: right-click on it and choose Show Filter. Now we can see the range on the right side. Let's just make it a little bit bigger to see the range. As you can see, we have a start and an end, but the slider does not span the whole bar. Tableau wants to show you that we are not showing all the values; we are showing only the range of the subset. So what happens if we move the end to the right and the start to the left? Nothing happens in the view.
We have exactly the same data, but here we can see that there are different colors in our range. The light part indicates that if you change the values here, nothing happens in the view. As you can see, if I just move it over here, the view is not filtered. Now, if I start moving the start inside the dark part, you can see that it has an effect on the view. The dark color in the slider marks the relevant values and the light part marks the irrelevant values. All right guys, so now we're going to talk about the last type of filter in Tableau, the table calculation filter. It is at the bottom of the chain, so every other type of filter has an effect on this type. Let's learn how to build a table calculation filter. As the name suggests, it is a calculation, and we're going to have a whole section on how to create calculations in Tableau. For now, don't worry about the details of creating calculations; just follow me through the steps. We go to our measure in the Marks card, right-click on it, and here we have the option Quick Table Calculation. We get a list of all the different calculations that we can do on the table, and we go with Percent of Total. So let's select that. Now we can see a small icon on the measure; it indicates that this is a table calculation. Hold Control, then drag and drop it on the filters and release. Here, since it's a continuous field, we have to define it as a range, so let's just click OK. Now we can see two entries for the same field in the filters. The first one, without the triangle icon, is the measure filter. The second one, with the triangle icon, is the table calculation filter. What can we do with that? We can offer it to the users: right-click on it and choose Show Filter. We can see it now as a quick filter on the right side, and the users can use the filter.
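Since the order of the chain is the key idea here, the following Python sketch walks some made-up rows through the filter types in the order described in this lesson: data source, context, dimension, measure, table calculation. It is a simplified model for intuition only, not Tableau's actual engine. The sample rows, the Technology subcategory "Phones", the function `run_filter_chain`, and all of its parameters are assumptions made for this example.

```python
from collections import defaultdict

# Made-up sample rows for illustration only.
ROWS = [
    {"country": "France", "category": "Office Supplies", "subcategory": "Art", "profit": 40},
    {"country": "Germany", "category": "Office Supplies", "subcategory": "Binders", "profit": 25},
    {"country": "Italy", "category": "Office Supplies", "subcategory": "Labels", "profit": -10},
    {"country": "USA", "category": "Office Supplies", "subcategory": "Art", "profit": 99},
    {"country": "France", "category": "Technology", "subcategory": "Phones", "profit": 70},
]


def run_filter_chain(rows, hidden_countries, context_categories,
                     picked_subcategories, min_profit_sum, min_share=0.0):
    # 1. Data source filter: removed rows never reach any worksheet.
    data = [r for r in rows if r["country"] not in hidden_countries]
    # 2. Context filter: builds the smaller temporary table for this view.
    context = [r for r in data if r["category"] in context_categories]
    # The dimension quick filter only offers values that exist in the context.
    offered = sorted({r["subcategory"] for r in context})
    # 3. Dimension filter: works on the context, not on the full data.
    picked = [r for r in context if r["subcategory"] in picked_subcategories]
    # 4. Measure filter: applied to the aggregate SUM(profit) per subcategory.
    totals = defaultdict(int)
    for r in picked:
        totals[r["subcategory"]] += r["profit"]
    kept = {name: total for name, total in totals.items() if total >= min_profit_sum}
    # 5. Table calculation filter: percent of total over what is left.
    grand_total = sum(kept.values())
    shares = {name: total / grand_total for name, total in kept.items()} if grand_total else {}
    shares = {name: share for name, share in shares.items() if share >= min_share}
    return offered, kept, shares


offered, kept, shares = run_filter_chain(
    ROWS,
    hidden_countries={"USA"},
    context_categories={"Office Supplies"},
    picked_subcategories={"Art", "Binders", "Labels"},
    min_profit_sum=0,
)
print(offered)  # "Phones" is missing because Technology is outside the context
print(kept)     # Labels is dropped by the measure filter (negative profit sum)
print(shares)
```

Changing `context_categories` to include Technology makes "Phones" appear in `offered`, which mirrors how editing the context filter changed the values in the dimension quick filter.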
That's all about the table calculation filter. All right, so with that we have learned the different types of filters in Tableau and how the order of the filters in the chain lets them affect each other. Now let's have a quick summary. We start with the extract filter at the top. We can use it only on extract connections and we cannot find it in the Tableau Public version, so don't worry about it. It is very similar to the data source filter. Next we have the data source filter. In order to create it, we go to the Data Source page. In our example, we created two data source filters: the first one to hide the sensitive information of the country USA, and the second one to reduce the overall size of our dataset. And don't forget that the data source filter affects the whole workbook, that is, all worksheets that are connected to this data source. The next ones we create on the worksheet page, so let's go over there. Here you can see very nicely how the different types of filters are sorted on the Filters shelf. First we have the context filter, the gray pill. A context filter creates a subset of data, a temporary table, only for this view. It is local to this view. But don't forget: do not use a context filter to hide or protect sensitive information, since it is possible to show the values in the filters. The next three filters we usually offer to the end users so they can slice and dice the visualizations. The users can use them to specify a subset of data for a focused analysis. First we have the dimension filter, like the subcategory. After that we have the measure filter. And last in the chain we have the table calculation filter. Since those different types of filters have a logical order, it is nice to have the same order in the quick filters on the right side as well. So it makes sense to have the dimension filter at the top.
Then we're going to take the measure filter as the next one, and the last one is going to be the table calculation filter. All right, so that's all. It could be confusing at the start, but now that you understand how Tableau works and the logical order of the filters, everything is going to make sense in the visualizations. All right, so with that we have learned how to create different types of filters in Tableau. And next we will learn how to apply filters to multiple worksheets in Tableau. 109. Tableau | Customize Filters: All right, so now we're going to talk about how to apply the same filters in different worksheets. Because if you are building different views, you end up having exactly the same filters in each view, and it's going to be time consuming if you go into each worksheet and add exactly the same filters. Instead of that, we can share the same filters so they are applied in different worksheets. In Tableau we have four different options in order to do that. We can find those options in the filters, and it doesn't matter which one you pick. Let's go with the context filter, for example, and right-click on it. Here we have the option Apply to Worksheets, and here you can see the four options. As a default, Tableau is going to leave it as Only This Worksheet. This means locally, only for this view. Here we can see the other options: All Using Related Data Sources, All Using This Data Source, and Selected Worksheets. Before we try those options, let's first understand them. All right, so now we're going to have a very simple example in order to understand how to apply filters. We have two data sources, DS one and DS two, and we have different worksheets that are connected to those data sources. We have sheet one connected only to data source one, sheet two connected to both DS one and DS two using data blending, and sheet three connected only to DS two. 
Now let's say that we are at sheet one and there we created a filter. So now let's learn how to apply this filter in different worksheets using those options. All right, the first option we have is Only This Worksheet. This means the filter is going to be available only locally, in sheet one. We will not find it in sheet two or in sheet three. This option is also the default in Tableau. Each time you create a new filter in Tableau, it's going to use this option, Only This Worksheet, so the filter is going to be available only in the worksheet where we created it. The next option we have in Tableau is All Using This Data Source. For example, sheet one is using DS one. That means the filter is applied in all worksheets that are connected to data source one. In this example, we have sheet one, because it's connected to DS one, and sheet two as well, which is also connected to data source one. But sheet three is not connected to data source one, it's only connected to DS two. That means this filter will not be found in sheet three. So now we are sharing the filter with all worksheets that are using the same data source. Let's move to the next one. We have All Using Related Data Sources. If you use this option, you're going to find your filter in almost all worksheets in your workbook. We're going to find this filter in sheet one, we're going to find it in sheet two, and in sheet three as well. That means if you use this option, we are automatically spreading our filter to almost all worksheets. Let's go to the last one, and it's an interesting one, Selected Worksheets. This means we can go and manually select which worksheets include the filter. For example, I could say I want to see my filter in sheet one and in sheet three as well, without any rule. As you can see, here we have more control over where our filter is applied. 
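The four options above can be sketched in a few lines of Python. This is only a toy model of the DS one / DS two example, not how Tableau implements it: the `SHEET_SOURCES` dictionary, the option strings, and the `sheets_for_filter` function are all hypothetical names, and the sketch assumes that DS one and DS two count as related data sources, as in the example.

```python
# Hypothetical model of the example: which data sources each sheet uses.
SHEET_SOURCES = {
    "Sheet 1": {"DS1"},
    "Sheet 2": {"DS1", "DS2"},  # connected to both through data blending
    "Sheet 3": {"DS2"},
}


def sheets_for_filter(option, origin_sheet, origin_source, selected=()):
    """Return the set of sheets that would show a filter created on origin_sheet."""
    if option == "only_this_worksheet":
        return {origin_sheet}
    if option == "all_using_this_data_source":
        return {s for s, sources in SHEET_SOURCES.items() if origin_source in sources}
    if option == "all_using_related_data_sources":
        # Assumption: every data source in this toy workbook is related.
        return set(SHEET_SOURCES)
    if option == "selected_worksheets":
        # The sheet where the filter was created is always included.
        return {origin_sheet} | set(selected)
    raise ValueError(f"unknown option: {option}")


if __name__ == "__main__":
    for opt in ("only_this_worksheet", "all_using_this_data_source",
                "all_using_related_data_sources"):
        print(opt, sorted(sheets_for_filter(opt, "Sheet 1", "DS1")))
    print("selected_worksheets",
          sorted(sheets_for_filter("selected_worksheets", "Sheet 1", "DS1", ["Sheet 3"])))
```

The last call mirrors the manual choice from the example: the filter lands in sheet one and sheet three without any rule based on data sources.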
With the last two, All Using This Data Source and All Using Related Data Sources, there is a rule, and Tableau goes and automatically spreads our filters across the worksheets. In my projects, I tend to use Selected Worksheets more often than the other ones, because I would like to have control over which worksheets my filters appear in. That's all about the concept of those four options. Now let's go back to Tableau and try those options. Back to our filters. We're going to go to the category, we're going to stay with the context filter, right-click on it, and go to Apply to Worksheets. You can see the selected option here is Only This Worksheet. This one is the default. It means this context filter is going to be found only in this report. If we go to the other reports, we will not find it. In order to change that, we're going to go again to the context filter and right-click on it. Let's try All Using This Data Source now. Let's click on it. Now, if you take a look at our filter, we can find a small icon that indicates this filter is used in different worksheets that are using the same data source. In this view, we are using the big data source. As you can see, we have it as the primary data source. For any worksheet, any view that is using this data source, this filter is applied to it. Let's go to the different views over here. We're going to switch to this one. You can see we have the context filter here, and in the first one as well, since both of them are using the big data source, and the filter is applied to them automatically. But now let's create a new view where we are using a different data source. Let's switch to the small data source. Let's take anything. Let's take the first name. As you can see, the filters stay empty because the big data source is not used in this view. But now let's go and use the big data source and see what Tableau is going to do. 
Let's remove the first name, switch back to the big data source, and let's take the last name. As I'm dropping this field in the view, you can see Tableau automatically brings in the context filter, because it must be used in all worksheets that are using the big data source. This is really useful if we have different worksheets using the same filter, for example the same context filter. Instead of creating the same filter over and over again, we can create it in one worksheet and then spread it to all sheets that are using the same data source. Okay, that's all for this option. Let's go back to our context filter and try something else. Let's switch to Apply to Worksheets, All Using Related Data Sources. Let's try this one. Click on that, and now you can see that we got a new icon from Tableau. It indicates that this filter is going to be applied to all worksheets with a related data source. Now let's go and check what happens to the other sheets using this option. We're now going to find this filter almost everywhere. In the first sheet, you can see we are using the same data, so it's going to be like this: we have the context filter applied to the view. In the second sheet, we again see the same context filter because we are using the same data source. Let's go now and create a new sheet where we're going to use the small data source, so we are using a different data source. Click on that and let's take, for example, the first name to the view. Now, as we can see in the filters, we have our context filter, even though we are using a different data source and not the big data source. Tableau brings this filter here because we are using this option. But as you can see, it's red. What is going on over here with the filter? If you mouse over it, it says: data sources that contain logical tables cannot be used as a secondary data source for data blending. 
Since this filter comes from another data source, the big data source, Tableau has to do data blending between them in order to connect it. It will not work if you have a logical data model in the secondary data source, as we do in our big data source. If you switch to this page over here, we have a data model. We have a logical model where we connected the customers with the orders and so on. Tableau doesn't allow a secondary data source to have a data model, so it will not work. But if you have only one table, or if you have multiple joins at the physical layer, this is going to work. If you go back again, it's going to stay red as long as the secondary data source has a logical data model. But if you have one table, everything is going to be fine and you will not get this error. All right, with this option, as you can see, whether you are using the same data source or a different data source, our filter is going to appear. Now let's go and check the last option. Let's go back to our view over here. Go to the context filter, right-click on it, Apply to Worksheets. And now we're going to go to Selected Worksheets. Let's click on that. All right, now we have a very simple table where we have a list of all worksheets and, as well, descriptions, the data sources, and some details. Now we can go and manually select which worksheets include our filter. As you can see, everything is selected because we used the option of related data sources. I don't want that. I'm going to deselect everything and start from scratch. I would like my filter to be in the first one and the second one. This one is grayed out because we are currently in this worksheet, so it's selected anyway. The other ones I'm going to leave deselected. That's all. Let's go and select OK. Now, if you check the filter again, we can find a new icon that indicates this filter is now used in different worksheets that we manually selected. Let's visit the first report. 
We can find our context filter. The second one is the same, and the third one has it anyway, because this is where we created this context filter. But now if you go to the other worksheets, you will not find this context filter. As I said earlier, I use this option a lot in my projects to control in which worksheets I want to see my filters. Generally speaking, those options are a really great way to share your filters across different worksheets and solve the problem of having to create the same filters over and over again. All right guys, so now we're going to talk about how to customize our quick filters. But first, let's understand quick filters. Any filter that you present in the view, in the visualizations, for the end user to interact with the view is considered to be a quick filter. For example, all those filters on the right side of the view are quick filters. We have the subcategory and the sum of the profits. Those are quick filters. The users can go and start selecting the values inside those quick filters to interact with the visualizations. Now, in order to customize those quick filters, we're going to go over here to this small arrow and click on it. Here we get a long list of options on how to customize our quick filter, and they are grouped as well. The first group is about how to customize the quick filter. The next set of options is about the filter modes. Then we have here the options about which values are presented in the quick filter: Only Relevant Values, All Values in Context, and All Values in Database. Now we're going to focus on this group of options, but first we have to understand the concepts behind them. All right, as we learned before, we have a data source and a worksheet. Inside the worksheet, we have a context filter and a visualization. The data is going to be sent from the data source to the context filter. 
Then the visualization is going to query the context data, and the result is going to be sent back to the visualization. Now, inside the view, we can create a filter. The question is, which data is going to be presented inside this filter? Here we have several options. The first one is that we get the values from the database, All Values in Database. With that, the values are going to be queried directly from the data source, and we are skipping anything inside the worksheet. We are skipping the data in the context filter and in the visualization as well. It doesn't matter what we are doing in the worksheet, the values come directly from the data source. All right, this is the first option. When we say database, it means the data source information. The next option we have is All Values in Context. This time, the values in the filter are going to come directly from the context filter. As we learned before, the context filter generates a temporary view, or temporary data, inside the worksheet. Here the values come directly from the context filter, and anything that is done inside the view will not be considered in the values of the filter. With that, we are skipping the visualization level. We are getting the data directly from the context filter. All right, so that's all for this option. The next one is Only Relevant Values. The values for the filter now come directly from the view, from the visualization. That means any interaction that we do in the view, any filtering, directly affects the values that are presented in our filter. As you can see, those options are really helpful, and Tableau gives us control over which data is presented in our quick filters. Because, as you can see, in Tableau we have different layers and different stages, and the subsets and the size of the data can differ from one to another. 
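To make the three stages concrete, here is a minimal Python sketch. It is not Tableau's implementation, only a model of the idea that the values in a quick filter can be taken from the data source, from the context subset, or from the filtered view. The sample rows and the names `ROWS` and `quick_filter_values` are made up for illustration.

```python
# Hypothetical sample data standing in for the data source.
ROWS = [
    {"category": "Furniture", "country": "France", "city": "Paris"},
    {"category": "Technology", "country": "Germany", "city": "Berlin"},
    {"category": "Technology", "country": "Italy", "city": "Rome"},
    {"category": "Office Supplies", "country": "Italy", "city": "Milan"},
]


def distinct_cities(rows):
    """Distinct city values, sorted for a stable display order."""
    return sorted({row["city"] for row in rows})


def quick_filter_values(option, rows, context_filter, view_filters):
    """Return the city values a quick filter would list under each option."""
    # Stage 1: the context filter creates a temporary subset of the data source.
    context_rows = [r for r in rows if context_filter(r)]
    # Stage 2: the other filters in the view narrow the context subset further.
    view_rows = [r for r in context_rows if all(f(r) for f in view_filters)]

    if option == "all_values_in_database":
        return distinct_cities(rows)          # skips context and view
    if option == "all_values_in_context":
        return distinct_cities(context_rows)  # skips only the view
    if option == "only_relevant_values":
        return distinct_cities(view_rows)     # reacts to every filter
    raise ValueError(f"unknown option: {option}")


if __name__ == "__main__":
    in_technology = lambda r: r["category"] == "Technology"
    in_italy = lambda r: r["country"] == "Italy"
    for opt in ("all_values_in_database", "all_values_in_context", "only_relevant_values"):
        print(opt, quick_filter_values(opt, ROWS, in_technology, [in_italy]))
```

With a context filter on Technology and a country filter on Italy, the three options list four cities, two cities, and one city, which is the shrinking subset described above.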
Normally the size of the data in the data source is way bigger than in the context filter. With that, you are defining and controlling which data is going to be presented in your filter. All right, now back to our view. In order to practice those options, we're going to bring new quick filters to our view. Let's take the country, right-click on it, Show Filter, and we're going to get the city as well. Let's go over there. We can change the order over here, so we're going to bring first the country, then the city, and then the subcategory. I'm going to remove those measures from the filters, so let's just remove them. And with that, we have those filters. Now we're going to check which options we have inside the quick filter for city. Go to the arrow. As you can see, the current value is All Values in Hierarchy, and that's because the city is part of the location hierarchy. But now we're going to change it to Only Relevant Values. Let's go and do that. Now, if you take a look at the values inside the cities, we can find almost all the values from the data source, so nothing has changed yet. But as we start interacting with our view, the values in the city start reacting to our selections. For example, let's go to the country over here and start removing some countries. We're going to deselect France, Germany, and USA. As you can see, the values inside the city are reacting to our selections. It's like those two quick filters are connected to each other. This is exactly what the option Only Relevant Values does to our quick filter, and this is exactly the purpose of this option. With Only Relevant Values, whatever we do in the view, the values inside this quick filter are refreshed and updated with the current selection. Now of course, if we go and deselect Italy, what's going to happen? The city filter is going to be completely empty, like our view. It is reacting to our interaction. 
Now we're going to change it to another option. Let's go over here to the arrow, and now we're going to change it to exactly the opposite, All Values in Database. Let's click that. Now what's going to happen? Tableau is going to go to the data source, bring all the information about the city, and put it in the filter, regardless of what we have selected in the view or whether we have a context filter and so on. So now we have a list of all the city values that are available in our data source. It will not be refreshed or updated if we are clicking around or interacting with our view, for example if I'm adding any other cities or changing any other filters. For example, I'm removing all the subcategories. You can see it's static, nothing changes in the city, because Tableau goes to the data source, gets all the data from there, and that's it. This is really nice in order to optimize the performance of Tableau and reduce the resources that are used by those quick filters. Now let's go and check something else. We're going to select All Values in Context. Let's click on that. That means the values inside the cities respond only to the context filter. Since our context filter is based on the category, we have to bring it to the view in order to change the values. Let's go to the category, right-click on it, and Show Filter. Now we have our context filter on the right side. All the other filters are dimension filters. Now the values of the city interact only with the category, not with the country or the subcategory. Let's try that. For example, if I go to the country and remove all the values, you can see the values in the view disappear because we are not selecting any data, but the values in the city are still there. Let's go and select everything. It's the same for the subcategory. If I remove everything from the subcategory, you see the city is not reacting. It's still static because it comes from the context filter. 
Now let's bring everything back. But now if I go to the category, our context filter, let's remove Office Supplies. Once I remove it, you can see the city is now reacting to our view. So we don't have any values, because we are not selecting anything from the category. Here you can see there is a connection only to the context filter, but not to the other filters. This is exactly what happens if you make the city depend on the context filter. All right, with that, we have learned the three main options to control which values are presented in our quick filters. But when we started with the city, we saw that there is another option called All Values in Hierarchy. It was the default one, so let's go and select that. Once we do it, we are connecting dimensions that are in the same hierarchy. If you check our data pane, we have a hierarchy that we created previously. It is the location hierarchy, and inside it we have four dimensions: the continent, country, city, and postal code. Now, if we use those four dimensions as quick filters, they're going to be connected to each other. Let's check the example. Now we have the city and the country in the same hierarchy, and they are connected to each other. The category, our context filter, is empty, but the city is still showing values. That means the city is now disconnected from the context filter and from any other filter that is not in the same hierarchy. If I go and select any values in the category, you see nothing is changing in the city, even if I remove everything. But the city reacts once we start deselecting or selecting values from the same hierarchy. If I remove France, Germany, and USA, you can see we now have only the cities from Italy. They are connected to each other. But here we have something special about hierarchies, since, as we learned, the dimensions have levels. The country is a higher level than the city. 
The lower-level dimensions will not affect the higher-level dimensions. Only a higher-level dimension can affect a lower one. What do I mean by that? Let's go to the country and select all the values. As you can see, now we have all the values here in the cities. But if I start selecting any values from here, you can see the country is not reacting to it, because it's a higher-level dimension. Even if I go and deselect everything, I still have the four countries. That means, since the city is a lower level than the country, it will not affect the country. But if we now bring in a higher level than the country, which is the continent, let's see what's going to happen. We're going to go to the continent, right-click on it, and Show Filter. I'm just going to bring it over here. Now, as I start deselecting values in the continent, as you can see, the values in the country are affected by my selection, because in the hierarchy the continent is a higher level than the country. With that, as you can see, this is what happens if we have All Values in Hierarchy. You have to pay attention to the levels of the dimensions, and those dimensions are going to be connected to each other. With that, we have covered all the options that we can use to control the values inside our quick filters. Okay, so now we're going to talk about a different group of options we can use to customize our quick filters. We have the filter modes: single value list, single value dropdown, slider, custom list, and so on. In order to learn that, we're going to use the following example. We're going to clean up our filters. I'm going to remove the country, the city, and the continent. We're going to keep the subcategory and category, and we're going to bring the product name as a filter as well. Right-click on it and let's go to Show Filter. Now we have the quick filters on the right side. We have the product name. 
I'm just going to bring it over here so it looks like our hierarchy. It starts with the category, then the subcategory, and then the product name. Let's show all the values over here. And for the product name, I'm going to change the mode to a dropdown or a list. All right, so now let's start with the first quick filter, the category, and try those modes. We're going to go to the arrow, and as you can see, the default is Multiple Values (list). As you can see, we have the list here twice, once as single value and once as multiple values. The same goes for the dropdown. We have the dropdown as single value and the dropdown as multiple values. Let's try those out. We're going to go to the single value list. As you can see, the visual of the filter changed to radio buttons. Now, as I'm selecting values inside the category, as you can see, we can select only one value. As the name says, it's a single value list. So that means we are making some kind of restriction: only one value is allowed. But if you want to have multiple values as a list, we're going to change it back to the multiple values list. Here, of course, you can choose different values and different categories without any restrictions. That's it for the list modes, single value or multiple values. Okay, now let's go and try another mode. This time we're going to take the single value dropdown. Let's switch to this one. As you can see, with the dropdown you will not find all the values immediately in the view. You have to click on the dropdown menu over here, and then you can select the values. It's single value again, so here we can select only one value. We cannot select multiple values. I can select one category at a time, and as you can see, it is working. Let's switch now to the multiple values dropdown. We're going to have the same thing here again. We have a dropdown menu, but inside the menu we can select multiple values. That's it for the dropdown. 
All right, so now let's move to another filter mode. We have the single value slider. Let's select that. With that you get a slider that we can move left and right to select different values, but it is not really interesting for a dimension with string values. We can use it for numeric values or dates. Because it is not really nice to have a slider for string values, it's better to use the dropdown or a list for them. So that is it for the sliders. I rarely use them in my projects. Now let's move on to another one. We have the custom list, but I will not use it on the category. Let's go to the product name and use a custom list. Click on that. As you can see, the product name now doesn't have any values. We cannot see anything. We have only a search box. So now we can search for a value. For example, let's search for Apple and then hit Enter. You can now see a list of all products that contain the name Apple. So it's like searching inside this field. You can go over here and start selecting the values that you want to be in the filter. As I'm clicking over here on those boxes, I'm going to see a list of all the values that I'm selecting. With that, we have created our list using the search box, but here we are not seeing any data because of the categories. So I'm just going to switch the category back from the slider to the multiple values list and select everything. Now we can see that we are selecting only the subcategory phones, because we selected Apple over here. With this type of list, the users can go and build their own list. So we can go and add more values, like Samsung, over here. Let's search. I'm going to add those products to the list as well. With that, we are appending more products to the list. If you want to clear everything, we can go over here and clear the list. This is a really nice way to search for a specific value, especially if you have a lot of values inside the product name. 
Now let's go and try the last option that we have in the filter modes, the wildcard match. Let's go and select that. Now we can see that we again have a search box where we can enter a value. But now we are searching for a specific pattern in our data. In order to show you how this works, we're going to bring the product name into our view as well. Now we're going to search for a specific pattern. For example, I want to search for all products that start with the character A. In order to do that, we go over here and type the A. After the A, it doesn't matter which characters come next. That's why we're going to use the star character. Let's go with that and then hit Enter. We can see in the product name that Tableau filtered the data depending on our search pattern. We can see over here all the products that start with the character A. Let's have another example. Let's say we want to start with PP, and then it doesn't matter which characters follow, so we're going to have the star. Let's hit Enter. We have here only four products that follow this pattern. Or we can search for the last characters. Let's say that it should end with a certain character. Instead of having the star at the end, we're going to have the star at the start. We have the star, then the character, and then let's hit Enter. All those products end with that character. If I just move it over here, some of them are really long names. You can see here, for example, bookcases. All those products end with that character. This is how this mode works. We can use the wildcard match in order to search for a specific pattern in our data. Again, this is really helpful if we have a dimension with a lot of values. We can use the search to find the specific data that we need. With that, we have covered all the different modes that we have in this category to customize our quick filters. All right, now let's move to another set of options to customize our quick filters. 
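The star patterns from the wildcard examples can be tried out in plain Python. This sketch uses the standard library function `fnmatch.fnmatchcase`, which is Python's own shell-style matching and not Tableau's engine, so details such as case sensitivity may differ. The product names and the `wildcard_filter` helper are made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical product names, not taken from the course dataset.
PRODUCTS = ["Apple iPhone", "Acme Stapler", "Samsung Galaxy", "Bookcases"]


def wildcard_filter(values, pattern):
    """Keep the values that match a star pattern such as 'A*' or '*s'.

    fnmatchcase is case-sensitive and also treats '?' and '[...]' as special
    characters, which goes beyond the star used in the lesson.
    """
    return [value for value in values if fnmatchcase(value, pattern)]


if __name__ == "__main__":
    print(wildcard_filter(PRODUCTS, "A*"))  # values starting with A
    print(wildcard_filter(PRODUCTS, "*s"))  # values ending with s
```

Putting the star at the end matches on the first characters, and putting it at the start matches on the last characters, the same two cases shown in the search box.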
In each quick filter we have a lot of information. For example, we have this extra button called All, or we have a title, or we can search for a specific value, or we can reset things and so on. We can customize all of that in Tableau. Let's go over here again, and then we can go to Customize. Now we can see all those options. Show All Value: this is exactly the first value that we can select. If it is deactivated, we have only the values from the dimension in the filter. But sometimes it's really nice to have. For example, here in the subcategory, if you want to deselect a lot of values, you can just go and deselect All. With that, you are removing all the selections, and then you select the specific values. With that, we can select the values really fast. Let's move to the next one. We have this small search icon. As you go over here, you can search, for example, for Art and hit Enter. Then you're going to get the value inside this dimension. If for some reason you want to hide it and not offer it to the users, you can go over here to Customize and then deactivate it. Once you deactivate it, you can see the small icon disappears, but I think it doesn't harm to have it in each quick filter. Let's activate it again. As you can see, with those options we are customizing our quick filter. Let's check another option. Let's go to Customize. Here it's really interesting to have the Show Apply Button option. Let's select that. Once you do it, you're going to get two new buttons, Cancel and Apply. I'm now selecting values in my filter, and as you can see, nothing is changing in the view. That means it will not send any query to the data source or the context filter to get the data. Nothing changes as long as I'm not clicking here on Apply. Once I click on Apply, the filter is going to send a query to Tableau, and Tableau is going to answer with data. 
This is really nice if you are going to select a lot of values. Each time you select a value, Tableau is going to do the calculations, so maybe it makes sense to say: first let me select everything, and then do the calculations. If you don't activate this option, like in the category, each time we select something from the filter, Tableau has to react to our interaction. With that, we are generating a lot of calculations in Tableau as we are clicking around. But over here, as we are selecting the values, nothing changes until we decide to say, okay, I'm done, now go and do the calculations. This is, again, a really nice way to reduce unnecessary calculations in Tableau. All right, so what else can we customize in our quick filters? The title. We can decide whether we want to show a title or not, or we can edit the title itself. If you go over here, you can say, okay, instead of subcategory, I'm going to have a minus between the words and make everything lowercase, for some reason. Let's click OK. As you can see, the title has changed, but the name in the dataset didn't change. So if you go to the subcategory, the name stays as it is. We just renamed the filter title. All right, so with that we have covered almost everything on how to customize our quick filters in Tableau. All right, so with that we have learned how to apply filters to multiple worksheets in Tableau. And next I'm going to share with you my top tips and tricks that I usually use in my projects when I start using filters in Tableau. 110. Tableau | 10x Filter Tips & Tricks: Now I'm going to show you the best practices for Tableau filters that I usually follow in my projects. Let's go. The first tip that I have for you is to utilize those filters: the extract filter, the data source filter, and the context filter. I have seen a lot of projects where developers forget about them or ignore them because they are not really important in the visualizations, but they are very important for optimizing the performance in Tableau. 
My advice here is to always have a discussion with the end users about promoting one of the filters that you have in the visualizations to be, first, an extract filter. If it cannot be an extract filter, then a data source filter, and the last option to optimize the performance is to make it a context filter. Because sometimes in the visualizations you really don't need all the data. You don't need, for example, ten years of data in the visualizations. Try to discuss it with the users and say, maybe let's bring only two years of data to the visualizations. Then you can utilize an extract filter or a data source filter on your workbook, which can have a great impact on the overall performance in Tableau. Don't forget or ignore those three filters. The second filter tip that I have for you is also about optimizing the performance in Tableau, which is to avoid using Only Relevant Values in your quick filters. For example, if we go to the subcategory over here, we can see that it is currently set to Only Relevant Values. If you use this option for all your quick filters, what can happen? The performance in Tableau is going to be really bad and everything is going to be really slow. So we can go and switch it to something else, like All Values in Database or All Values in Context. We can go and switch that. With that, you're going to reduce the stress on the memory and the resources in Tableau, but let's understand why. All right, so now let's understand what happens in Tableau if your filters are using All Values in Database or All Values in Context. It's the same for both. Once the viewers or the users start the report, the view is going to send only one query to the data source, and the data source is going to answer back with the results. That means we're going to have only one initial query as the user starts the view. But on the other hand, if you are using Only Relevant Values, what can happen? 
The view is going to keep sending query after query to the data source to get an update and refresh the view. That means the view sends multiple queries for each user interaction, which can really impact the performance in Tableau. Each time a user clicks something or interacts with the view, the view sends queries to the data source to get an update about the interaction, which consumes a lot of memory and resources in Tableau and slows everything down. Be careful with your quick filters: if you have everything on Only Relevant Values, things might be slow. If the users are suffering from bad performance in Tableau, maybe think about switching all those filters to All Values in Context or All Values in Database. I have another filter tip about optimizing the performance in Tableau, which is to avoid using dimensions with high cardinality as quick filters, because those dimensions might impact the performance in Tableau. But first, let's understand what cardinality is. Cardinality is the number of distinct values in a field. For example, in our database we have the customer ID. We have around 800 customer IDs, and we have a lot of product names. Those two fields are considered to be high-cardinality dimensions. On the other hand, we have other dimensions, for example the category, where we have only three values, or the countries, where we have only four in our database. The subcategory as well: we have only 17 subcategories. Those dimensions are considered to be low-cardinality dimensions, and if you are using them, the performance is going to be okay. But if you start using dimensions with high cardinality, the performance might be bad. The best practice here is to avoid using high-cardinality dimensions as quick filters.
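To make the idea of cardinality concrete, here is a small Python sketch that counts the distinct values per field. It only illustrates the definition given above; the sample rows and the cut-off of 100 are made-up assumptions for this example and are not Tableau settings.

```python
from collections import defaultdict


def cardinality(rows):
    """Return the number of distinct values for each field in a list of dict rows."""
    distinct = defaultdict(set)
    for row in rows:
        for field, value in row.items():
            distinct[field].add(value)
    return {field: len(values) for field, values in distinct.items()}


# Hypothetical sample data: 3 categories and 800 customer IDs.
CATEGORIES = ["Furniture", "Office Supplies", "Technology"]
rows = [
    {"category": CATEGORIES[i % 3], "customer_id": i % 800}
    for i in range(2400)
]

HIGH_CARDINALITY = 100  # arbitrary cut-off chosen only for this sketch

counts = cardinality(rows)
for field, n in counts.items():
    label = "high" if n > HIGH_CARDINALITY else "low"
    print(f"{field}: {n} distinct values ({label} cardinality)")
```

A field like the category would come out as low cardinality here, while the customer ID would be flagged as high, which matches the distinction made in the lesson.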
All right, back to our quick filters. In our view, as you can see, the category and the subcategory are dimensions with low cardinality. It's fine to leave them in the view. But the product name has a lot of values. It is a dimension with high cardinality. It's really worth discussing with the users whether they really need such a filter in the view. If you find out that no one needs it, just remove it from the view in order to have good performance in Tableau. Now let's move to the next filter tip. Let's say that the users really want to see the product name or the customer ID, any dimension with high cardinality, in the view. Here the tip is to change the filter mode. Instead of having a drop-down list or a list, we can use a wildcard match for dimensions with high cardinality. Why is having a list of all the products or all the customers in the view bad for the performance in Tableau? Because each time, Tableau has to go to the data source or the database and prepare a distinct list of all the customers or all the products to be presented in the view. Instead of having a list, we can go and change it to Wildcard Match. As you can see, Tableau is not preparing anything, so we don't have any values presented in the view. Only once the users start interacting with the quick filter does Tableau go to the database and bring the relevant values. With that, we avoid using a lot of resources and unnecessary calculations in Tableau. If you have a dimension with high cardinality, either avoid using it, or if you want to use it, just use Wildcard Match. All right, so let's move to the next best practice in Tableau, which is as well about optimizing the performance: start using the Apply button in your quick filters. If you don't use it, let me show you what can happen. Each time I select something, it is like a query sent to the data source.
This is one query, second query, third query, fourth query, and so on. Each time I click on my filters, a lot of queries are generated to the data source, which costs a lot of performance. Instead of having such a filter, we can customize it and add the button, as we learned before. We can go over here, then Customize, and then Show Apply Button. Now as I click on those values in the filter, no query is generated to the data source. We are not using any resources in Tableau. Once I'm done selecting what I need, I'm going to hit Apply. What happens? One query is sent to the data source to bring the result to the view. With that, we are reducing the number of queries that our visualizations generate in Tableau, which is really great for the performance. My recommendation here: if you have a filter like the subcategory, or a dimension with high cardinality where you are using a list, use the Apply button, because the users will not select only one value. They usually select multiple values and then apply at the end. But for a filter like the category, where we have only three values, it isn't worth using the Apply button. It's only three values, so the user is going to generate three queries at most. It's fine not to use the Apply button with dimensions of really low cardinality. With high or medium cardinality, like the subcategory, go and use the Apply button. All right, the next filter tip that we have is as well about the performance in Tableau, which is to avoid using Exclude and always use Include if possible. For example, if you go to the subcategory, we have here the option of using include or exclude. If you're using exclude, the queries that are generated in Tableau are more complex than with include. More complex means more resources, and that might slow down the report or the view in Tableau. Avoid using exclude when possible. So I'm going to switch it back to include, which has better performance.
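Going back to the Apply button tip, the effect on the number of queries can be sketched with a tiny Python model. This is a deliberately simplified illustration, assuming one query per click without the button and a single query on Apply with it; the function name and the selections are hypothetical, and real query behavior in Tableau depends on more factors.

```python
def count_queries(clicks, apply_button):
    """Count queries sent to the data source for a series of filter clicks.

    Simplified model: without an Apply button every click triggers one
    query; with an Apply button the clicks are collected and one query
    is sent when the user confirms.
    """
    if not clicks:
        return 0
    return 1 if apply_button else len(clicks)


# Hypothetical selections in a subcategory quick filter.
clicks = ["Chairs", "Tables", "Phones", "Binders"]

without_apply = count_queries(clicks, apply_button=False)
with_apply = count_queries(clicks, apply_button=True)
print(f"without Apply button: {without_apply} queries")
print(f"with Apply button: {with_apply} query")
```

For a filter with only three values, the difference between the two modes is at most a couple of queries, which is why the button mainly pays off for medium and high cardinality filters.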
All right, so let's move to the next one. I promise you this is the last one about the performance, which is to minimize the number of quick filters in your view. Those quick filters not only take space in the view, they also generate a lot of queries, and a lot of stress brings the whole performance in Tableau down. Try to avoid using a lot of quick filters, and each time the users need new filters, discuss with them whether it's really necessary to put them in the view. I have seen a lot of projects where the users always want a lot of filters. Try to discuss it with them and don't always bring a new quick filter to Tableau, because you're going to end up with really bad performance in the view, and no one is going to be happy with bad response times in the visualizations. Try to minimize the number of quick filters in Tableau so that everyone is happy. Now let's bring more filters to our view. For example, I pick the order date and show it as a filter. Let's take the location information, the country and maybe the city as well. Now we have to start sorting this information. In my projects, I usually start with the date or the time aspect of the visualization as the first filter. Here we have only the order date. We're going to drag and drop it to the top, because the users start by thinking about which date, which year they want to see in the visualizations. They always focus first on the time and date aspect. After that, we have two kinds of information, or two hierarchies, in the quick filters. We have here the location information, the city and the country. Then below, we have the information about the product as well. These are two hierarchies, and we should not mix them together. Separate them. Start with one topic, for example, the location. First we're going to talk about the city and the country.
Then we're going to talk about the product information. Here, follow as well the logical order in the hierarchy. Our hierarchy starts, for example, with the country as a higher level than the city. Always start with the higher level, then move down to the lower level. For example, here we should bring the country on top, and then the city should be below it. If we take, for example, the postal code and add it as a filter as well, the postal code should be below the city. As you can see, in the quick filters we are rebuilding the logical order of the levels in the hierarchy. The same goes for the product. We have first the category, then the subcategory, then the product name. Here, everything is fine. As the users start filtering the data, they go from top to bottom, and there is a logical order of the fields which really makes sense. All right, let's move to the next filter tip that we have, which is to remove the All value in dimensions with very low cardinality. What do I mean by that? For example, let's check the country. The country has only four values. It really makes no sense to offer All, because it's only three or four values, and the users can select those values without selecting all or deselecting all. This dimension has really low cardinality, so we can go and remove this option. Let's go to Customize and remove it. With that, we have more space to show to the users, and this option usually takes a lot of space. All right, so let's move to the next one, the city, and let's check the values. As you can see, we have a lot of values, and here it makes sense to leave it as it is. We're going to leave the All value. The postal code as well has relatively high cardinality, so we're going to leave it. For the category, we have only three values. It really makes no sense to use the All value, so I'm going to go and remove it from here as well. And with that, we now have more space. We didn't waste space for that.
The subcategory here, let's make it a little bit bigger. You can see, yeah, a lot of values, and it makes sense to be able to select or deselect all subcategories. So I'm going to leave it. That means we just changed it for the category and the country, which are dimensions with very low cardinality. All right, so now we're going to move to the final filter tip that I usually use in my projects, which is as well about the design and the look and feel in Tableau. Here, we're going to use suitable filter modes in the quick filters. Let's see what I mean by that. First, we're going to start with the order date, the date that we have in our view. I usually tend to use a continuous field here instead of a list of distinct values. What I mean by that: I usually go over here to the year of the order date, right-click on it, and convert it to continuous. With that, we can have a range between two values, which also takes less space in Tableau. Let's go and switch now. As you might have noticed, the order date quick filter disappeared because we changed the role from discrete to continuous. Let's go and show it again. As you can see, now we have a very minimal quick filter that doesn't take a lot of space. It's really nice as a start to have a range between two values for the date. Let's move to the next one. We have the country. The country is a dimension with very low cardinality, and here I always tend to use a multiple values list, so everything is correct. Let's check that it is a multiple values list. I'm going to leave it as it is. Next, we have the city. Here we have a lot of values, and we can only see about three values from the whole filter. It doesn't make sense to have it as a multiple values list. Instead, I'm going to say this is a dimension with medium cardinality, and we always tend to use a drop-down for that. I always skip the single-value option.
It's like a restriction that has no meaning. We're going to go with the multiple values drop-down. With that, as you can see, it takes minimal space. We can see only one value. If the users want to select cities, they go and select the values that they need and then close it. It's really minimal and doesn't take a lot of space. Next, we have the postal code as well. Here we have the same situation, a dimension with medium cardinality. We have a lot of values, so we will not leave it as a list. We can have it as a drop-down menu. As you can see, its size compared to the city is really big in the visualization. We're going to go over here as well and change it to a multiple values drop-down. The next one is the category. It's exactly like the country: only three values, very low cardinality. We're going to leave it as it is. For the subcategory, you already know that it has medium cardinality. We're going to go over here and make it a drop-down. Now we're going to move to the last one, which we already talked about. The product name is huge and has a lot of values. The best practice here is to use a wildcard match for this field. Let's take another example, the first name. I'm going to show the filter over here and bring it down to be the last one. The first name is a huge filter as well. It has a lot of values; it's a dimension with high cardinality. We're going to go and switch the mode to wildcard match, exactly like the product name. As you can see, we now have a lot of filters, which is not really good for the performance. But we saved a lot of space by changing the filter modes. With that, we have really nice quick filters on the right side that don't take a lot of space. So with that, I have covered all the tips and tricks, or best practices, that I usually use in Tableau projects when I'm using filters. All right.
So with that, you know the best practices that I usually follow once I start creating filters in Tableau. Next, we will learn the different ways to sort our data in Tableau. 111. Tableau | Sorting Data: All right, now we're going to learn how to sort your data inside Tableau. A lot of people think that sorting data in Tableau is not working correctly, which is not really right. So we're going to remove this confusion now and understand how sorting in Tableau works. Let's go. Okay, now let's understand what sorting is. It's very simple. Sorting is arranging your data in a specific order, and here we have two options. First, we can sort the data in ascending order. Here we arrange the data in increasing order. That means we start with the lowest value, and as we move down, we get to the highest value. For example, let's take the order ID. We can sort it in ascending order, and then the values look like this: 1, 2, 3, 4, 5, 6. The values are increasing as we go down. Or if we have, for example, the first name, we have characters, and they are going to be sorted from A to Z. For example, we have here Andy, then Dwight, and we end up with Pam. The second option is to sort your data in descending order. Here we arrange the data in decreasing order. That means we always start with the largest value, and as we move down, we get to the lowest value. For example, again the order ID: we start with the highest value. In this example, it's going to be 6, then 5, then 4. As I move down, I get the lowest value. The same goes for the first name. It's going to be the opposite of alphabetical order. We start with Pam, Michael, James, until we end up with Andy. As you can see, it's very simple. We have only two options, sorting the data in either ascending or descending order. Now let's go to Tableau and understand how we can do that. All right, now let's create another view from scratch.
We're going to stay with the big data source, so let's take, as usual, the subcategory to the rows. As a measure, we're going to take the sales. Let's put it in the columns. Let's show the numbers: I'm going to take it to the labels and to the colors as well. Then we can also have the country in the columns. Let's go to the customers. Inside the location hierarchy, we have our country, so let's put it over here. Okay, this is our view for now. There are two ways to sort in Tableau: either directly in the visualizations, which we call quick sorts, or as we are building the view as developers. We're going to start with the first one and learn how to sort using quick sorts from the visualizations. This is what the users are usually going to see and do. All right, now for quick sorts in Tableau, there are three places where you can sort your data directly in the visualizations. The first one is sorting the data from the header. If you hover over the header name, you can see that we have a small icon to sort the data. We can use it to sort the header information. The second place is the axis over here. You can see there is a small icon to sort the data as well. The third and last one is the field labels. If you go to any value here inside the header, you can see we have a small icon to sort the data. Those are the three places where you can sort the data. In Tableau, sorting works with three clicks. The first click sorts the data ascending, the second one sorts the data descending, and the third click brings the data back to how it is sorted in the data source. All right, by default, the data is sorted as in the data source. If your data source is sorted ascending, we get the same order in the view. By default, we are not enforcing any sorting in our view; we are taking it from the data source.
As you can see, it is already sorted in ascending fashion because that is how it comes from the data source. Now if you go to the header, for example, let's click on this icon and see what happens. As you can see, nothing happened in the view because it's exactly like the data source. We have it in ascending fashion. That was the first click. We have now sorted the data ascending. You can see over here that we have a small icon indicating that this dimension is now sorted ascending in the view. Let's go over here and click again. Let's see what happens if I click on it. Now the data is sorted in descending order, and here we have a different icon. We start with the tables, and it ends with the accessories. Now we have it descending. Now, to reset everything back to the default, to the data source order, we're going to click a third time. If I click again over here, the icon is gone from the dimension and the data is sorted exactly like the data source. This is how sorting in Tableau works. You have three clicks: the first one ascending, the second one descending, and the last one brings it back to the default, the data source order. All right, now we're going to go to the second place where we can sort our data in the view, and that is the axis. If you go to the axis over here, we can find the small icon. Here it is exactly the opposite. The first click sorts the data in descending order. The second click sorts the data in ascending order. And the third one brings it back to the default, like now. Let's try that. We're going to click the first time. As you can see, now the data in the rows is sorted in descending order. We start with the highest sales, and as we move down, we get to the lowest sales. All right. Now let's click the second time. Let's go. We are now sorting the data in ascending order.
So we start with the lowest sales and we end up with the highest sales. The third click brings it back to the default without any order. Let's click on that, and we are back to the start, where the data is not sorted at all. As you can see, with the header and the axis we are sorting only the rows. We are not sorting the columns. France, Germany, Italy, and USA stay in the same position. Now, in order to sort the columns, we're going to go to the third place, the field labels. We can go to any of those values; it doesn't matter which one we click. For example, on the chairs you can see this small icon. Again, it's the same as the axis. The first click sorts the columns in descending order, the second one ascending, and the third one back to the default, like now. Let's go and click on this icon. Now the data is sorted in descending order. That means the first column has the highest sales, the next one has lower sales, and as we move to the right, we get to the lowest value. We are sorting the columns in descending order. As you can see, on the columns we also have this icon indicating that the columns are now sorted in the view. If we go and click it again, we sort ascending, where we start with the lowest value in the first column, and as we move to the right, we end with the highest value in the last one. Here as well we can see the icon showing that the data is sorted ascending. With the last click, as you know, we go back to the default, where the data is not sorted at all. All right, that's all about quick sorts in Tableau. It's really simple once you understand the places to sort the data and how you can click around to sort the data in different ways. A lot of people get confused about it, but it's really simple.
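The three-click behavior described above can be written down as a small Python sketch. It models only what the lesson shows: the header icon cycles ascending, descending, data source order, while the axis and field-label icons start with descending. The names and the sample sales figures are made up for illustration.

```python
# Order of sort states per click, as described in the lesson.
HEADER_CYCLE = ["ascending", "descending", "data source"]
AXIS_CYCLE = ["descending", "ascending", "data source"]


def state_after(clicks, order):
    """Return the sort state after a number of clicks on a sort icon."""
    if clicks == 0:
        return "data source"  # default: no sort enforced in the view
    return order[(clicks - 1) % len(order)]


def apply_sort(rows, state, key):
    """Return rows arranged according to the given sort state."""
    if state == "ascending":
        return sorted(rows, key=key)
    if state == "descending":
        return sorted(rows, key=key, reverse=True)
    return list(rows)  # keep the data source order


def by_sales(row):
    return row[1]


# Hypothetical (subcategory, sales) rows in data source order.
sales = [("Accessories", 100), ("Chairs", 300), ("Tables", 500)]

for clicks in range(1, 4):
    state = state_after(clicks, AXIS_CYCLE)
    print(clicks, state, apply_sort(sales, state, by_sales))
```

The third click always returns to the data source order, which is why a view can look unchanged after the first header click when the source is already sorted ascending.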
Let's say that we have the following scenario where you say, you know what, I don't want to offer the users the possibility to sort the data. I'm going to sort everything in the view, and the users are just going to see the report as it is. All right, so now in order to disable the sorting option for the users, we're going to go to the main menu, and then to Worksheet. Here we have Show Sort Controls. By default, Tableau enables it, which really makes sense. Now let's go and disable it and see what happens. If you go to the visualizations, you will see that we no longer have the icons to sort the data. If I go to the sales over here, or to the subcategory, or anywhere else, you see we don't have any options to sort the data. This possibility disappears completely for the users. With that, we have completely removed the option for the users to sort the data inside the visualizations. To be honest, I've never been in a situation where I had to remove this option for the users. It really makes everything static, and this is exactly the opposite of what we want. We always want to make our dashboards and reports dynamic and interactive for the users. I think it's really bad to make only static reports without any dynamics inside them. Unless maybe the users explicitly ask for this and say, okay, I don't want to sort the data, make it as static as you can. Then you can go and disable this option. For now, I'm going to go to Worksheet, then Show Sort Controls, and enable it again. As we go to the sales again, you can see we got those small sort icons back. All right. That's all about how to sort the data directly from the views, from the user's point of view. All right, so now we're going to move to the second group, where we're going to learn how to sort the data as you are building the view.
In order to do that, there are two ways: either from the toolbar or from the dimension itself. If you move to the toolbar, we have two options here, Sort Ascending and Sort Descending. Now in order to sort those dimensions, you can click on the country, for example, so now we are sorting the columns, and then click over here on Ascending. As you can see, now we are sorting the columns ascending. If you want to sort the subcategory, the rows, we can click on it and then click on ascending or descending. As you might have noticed, we are always sorting the data by the measure, by the sales. If you hover over the button, it says: sort subcategory descending by sales. We don't have any option here to sort the data by the header. It's only sorted by measures. All right, that's it about how to sort the data from the toolbar. The second method is to sort the data directly in the dimension. Let's go, for example, to the subcategory and right-click on it. As you can see, we have two options about sorting here: Clear Sort and Sort. Clear Sort resets everything to the default. Let's go and do that to start from scratch, so I'm just going to clear everything for the subcategory, then right-click on it again and go to Sort. With that, we get a new window. It says we are now sorting the dimension subcategory. I will just move it to the left side in order to see how Tableau reacts to my selection. Okay, what do we have over here? There are two sections. The first one is about how to sort the data, the sort method. The second one is about the sort order, ascending and descending. Let's see which options we have. We have five options: data source order, alphabetic, field, manual, and nested. Let's start with the first one, the data source order. Here we have it as ascending. We are sorting the values inside our header, the subcategory, ascending, in alphabetical order.
We can reverse it by going to the descending order. As you can see, the values switch. Now if we want to reset everything, we can go over here and click Clear to go back to the default settings. That's it for the data source order. Let's move to the next one, alphabetic. We're going to get exactly the same effect, because our data source is in alphabetical order as well. Let's go over here. As you can see, nothing changes because we have it on descending. Let's switch the alphabetical order to ascending, and the headers switch. Exactly the same effect. All right, now let's move to the third one. We're going to go to Field. We can go and sort the data by any field from the whole data source. The field doesn't even have to be in the view, though often that makes little sense. By default, Tableau selects the sales because it's the only measure that we have in the view. That makes sense, and the data is sorted ascending. But if you want, you can sort the data by the number of customers inside each category or subcategory. We can go over here and select the customer ID, and the function can be Count, the total number of customers inside each category. Now those categories are sorted ascending based on the total number of customers. We have this ability to sort the data by any field from the data source. But of course it doesn't make much sense to sort the data like this, because it's going to confuse the users, and they will not understand why those categories are sorted like this without a description in their report. That's all for this method, sort by field. Let's move to the next one. We have sort by manual, and here you have the freedom to define the order of the dimension. For example, we can take Machines over here. As I move it down, you can see the order in the view changes as well. I can go and sort the dimension as I want. It's really simple here.
We don't have any rules, and we don't have ascending or descending. We have complete freedom to sort the values inside any dimension. That's it for this option, so let's move to the next and last one, the nested sort. Now, in order to understand how the nested sort works in Tableau, we have to work with multiple dimensions. The best way is to use a hierarchy. Let's go and create another view. I'm just going to close this one here. Let's take the continent to the rows and the profit to the columns. As usual, we're going to show the labels of our data. Now if you go to the continent over here and right-click on it, let's go to Sort. Let's say we're going to sort the data by data source order, descending. As you can see, we are now sorting only the continent. If we drill down to the country, you can see that only the continent is sorted, but the country is not sorted. And if you go to the city, you can see that the city is not sorted either. Only the first dimension is sorted. But now, instead of that, we can use the nested sort in order to sort all dimensions inside the hierarchy automatically. Let's go and remove that stuff. I'm just going to drill back to the continent, or as we call it, drill up, and right-click. Let's go to Sort. Then we're going to go to Nested. Now we're going to say, okay, ascending, and we're going to use the measure with its aggregation, the sum of profit, in order to sort the data. Now let's go and close it. With that, we have the nested sort. As you can see, the continent is sorted. But now, if I drill down to the country, let's see: the country is sorted as well. If you look closely at the data, you can see that the USA is the only country inside its continent, so we cannot see any sorting there. But you can see that the countries in Europe are sorted ascending. It starts with the lowest value, Italy, then France, then Germany.
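The idea behind the nested sort can be sketched in a few lines of Python: each parent level is ordered by its total, and the children are ordered only against their siblings inside the same parent. This is a simplified model of the behavior shown in the lesson, not Tableau's implementation, and the profit figures are made up.

```python
# Hypothetical (continent, country, profit) rows in arbitrary order.
rows = [
    ("Europe", "Germany", 900),
    ("Europe", "Italy", 200),
    ("North America", "USA", 1200),
    ("Europe", "France", 500),
]


def nested_sort(rows):
    """Sort continents by total profit, then countries within each continent.

    Both levels are sorted ascending; a country is only compared with
    other countries of the same continent.
    """
    totals = {}
    for continent, _country, profit in rows:
        totals[continent] = totals.get(continent, 0) + profit
    # Key: parent total first (continent name breaks ties), then child value.
    return sorted(rows, key=lambda r: (totals[r[0]], r[0], r[2]))


result = nested_sort(rows)
for continent, country, profit in result:
    print(f"{continent:15} {country:10} {profit}")
```

With these made-up numbers, the European countries come out as Italy, France, Germany, independent of where the USA falls in its own continent.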
You can see that the countries inside this continent are sorted as well, based on the nested sort. As you can see, the countries of each continent are sorted separately from the countries of the other continents. This is how the nested sort works. Let's go and put the profit on the colors as well. Now let's go down in the hierarchy and drill down to the city. We're going to have more data, and it's going to be clearer. As you can see, now the city is sorted as well, and we are sorting the cities within one country. For example, over here in the USA, the cities run from the lowest value up to the highest value, which is in Portland. We are sorting the cities based on the country. So this is one section. The next section is Italy. The next one is Germany. Each country is sorted separately from the other countries. With that, we have learned that this method works if we have multiple dimensions, and it works perfectly if we have a hierarchy in our view. Everything makes sense, and the sort is very logical for the users. As I'm drilling down, for example, to the postal code, or rolling back up in my view, everything is sorted in a very logical way. All right guys. So with that we have covered everything: how to sort the data inside our views from the user's perspective, and how to sort the data as we are building the views. I think it's really simple and not that complicated. All right, so that's all about how to sort our data in Tableau, and we have completed this section. In the next section, we're going to learn about Tableau parameters to add dynamics to our visualizations. 112. Tableau | Section: Parameters: All right everyone. So now we're going to talk about parameters. Parameters are a game changer in Tableau. This is my opinion: parameters are the best feature that Tableau has introduced.
Because parameters in Tableau can make your visualizations very dynamic, interactive, and flexible in a unique way that you cannot find in any other tool. All right, so now what are parameters? Parameters are like variables in programming languages. They allow the user to replace a constant value in calculations, filters, reference lines, and so on. Okay, so what does this really mean? If you are building a view for your users, you are already making a lot of decisions and defining a lot of values that stay static, and the users are only allowed to read your views. For example, you might create the following calculation in Tableau where you define a threshold for your KPI. You are saying: if the total sales is less than 400, then the KPI is going to show red. Otherwise it's going to show green. Here, the value of the threshold, 400, is static and cannot be changed by the users, the viewers. It can only be changed by the developer. But now you might be in a situation where you have two requirements from two different users who define different thresholds. So here you end up making two calculations for two customers and creating two views as well. Instead of doing that, we can use the power of parameters. Here we can replace the value 400 with a parameter, and then we can offer the parameter as an input field for the users in the view. Now the users can use the parameter to define the value they need. Using a parameter changes the behavior of your view depending on the value of the parameter. This makes your views dynamic and ready for any requirement. There are endless ways to use parameters in Tableau, and in this tutorial, I'm going to show you six different use cases. The first use case is about how to use parameters in calculations. The second use case is about reference lines, the third one how to use them in filters.
And we have another very special use case: how to switch between dimensions and switch between measures in a very dynamic way in one view. Another use case is about titles and text. And the last use case is how to use parameters in bins. All right guys, so that was a quick intro to parameters. Next we will learn how to create dynamic calculations using parameters. 113. Tableau | Dynamic Calculations using Parameters: All right guys, so now let's start with the first use case, how to use parameters in calculations. Let's create some kind of KPI to track the profit by subcategory. Okay, so now we're going to stay with the big data source and we're going to go to the products to get the subcategory. And then we need the measure profit. So we're going to go to the orders and we're going to get the profit over here. Okay, so now we're going to show the labels on the view as well. And now we can have a threshold for the KPI, where we're going to say: if the profit is less than ten K, then it's going to be red. Anything higher than ten K is going to be green. Now in order to create the logic and the colors in the view, we have to create a calculation. Don't worry about how to create calculations in Tableau, because we're going to have a dedicated section for that. In order to create the calculation, we're going to go to the data pane, right-click on the empty space, and then choose Create Calculated Field. Let's go there. And now we're going to call it KPI Colors. Then we're going to write the expression for our logic here. It starts with IF, then we need SUM, and then we have the profit. We said if it is less than ten K, it should be red. So we're going to write the value Red; otherwise it's going to be Green. Let's end it. With that, we have our logic for the colors in our view, and as you can see over here in our calculation, we have a constant. It is the ten K. Let's go and create that. So we're going to click OK. 
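To make the idea concrete, here is a minimal Python sketch of the KPI color logic described above. It is not Tableau syntax: in a calculated field the same rule would read roughly `IF SUM([Profit]) < 10000 THEN "Red" ELSE "Green" END`. The function name `kpi_color`, the `threshold` argument, and the sample profit figures are assumptions made for illustration only; the `threshold` argument plays the role a parameter would play in place of the constant.

```python
def kpi_color(total_profit: float, threshold: float = 10_000) -> str:
    """Return "Red" when the aggregated profit is below the threshold, else "Green"."""
    return "Red" if total_profit < threshold else "Green"


# Hypothetical profit totals per subcategory (not the course data).
profit_by_subcategory = {
    "Chairs": 26_000,
    "Tables": -17_000,
    "Phones": 44_000,
    "Labels": 5_500,
}

# With the constant threshold, every viewer gets the same coloring.
colors = {name: kpi_color(total) for name, total in profit_by_subcategory.items()}

# Passing a different threshold changes the result without touching the logic.
colors_at_30k = {
    name: kpi_color(total, threshold=30_000)
    for name, total in profit_by_subcategory.items()
}
```

The only difference between the static and the dynamic version is where the threshold comes from: hard-coded in the expression, or supplied from outside.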
And here on the left side you can see our dimension. We're going to take it and put it on the colors. Now let's go inside and assign the colors to the values: Green is going to be green, and Red is going to be red. Let's click OK. Now we can go and give this report to the users, and they can view it and interact with it. But as you can see, the calculation of the KPI is really static and they cannot customize it. In order to give the users the option of defining what is red and what is green, we have to use parameters. Now, in order to create parameters in Tableau, there are two ways to do that. Either you go to the data pane and create your parameter there, or you create it in the place where you need it. For example, if you are creating a filter, you can create a parameter inside the creation of the filter. Let's first see how we can create parameters in the data pane. In the data pane, there are two ways to create parameters. Either you go to the empty space and right-click on it, and then you can see Create Parameter here, or the other option is that you go to the head of the data pane, where you have a small arrow. If you click on that, you're going to see exactly the same drop-down. And here we have the option of creating a parameter. Let's select that. And now we have the window for creating parameters. First things first, we have to give it a name. We're going to call it Choose Threshold. Next we have to define the data type of the parameter. If we go over here, you can see a list of all data types, and you already know all of them. Tableau decided to go with Float and Integer instead of Number (whole) and Number (decimal), but they are exactly the same. For now, we're going to go with Integer. We don't want to have decimal numbers in the KPI. And then once you do that, we can define the display format. For each data type, there are different formats to represent the values. 
So as you can see, we have automatic, number standard, percentage, currency, and custom. I'm going to stay with automatic. In the next one, you have to define the default value that's going to show up in the input. So here I would say it's going to be 10,000, and of course the users can change that. After that, you have different options to limit what the users can select. The default option here is All. That means you are allowing the users to enter any value, but of course, we limited the data type to integers, so the users cannot go and enter arbitrary characters in the input field. Or you define a list of allowed values for the user. So here you can go and allow, for example, five different values, maybe to make sure that nothing goes wrong in the view. Here you are making the parameter more restrictive. The list is something like discrete: you are allowing a list of distinct values. And the next one, the range, is something like bins: you are defining the start and the end of the range, and then you are defining the step between those two values. For now I'm going to leave it open-ended so the users can select whatever they want. All right, so now let's go and hit OK to create the parameter, and now if you check the data pane on the left side (let me just minimize those tables), you can see that the parameter is always created at the end of the data pane. There is a separator between your data and the parameters, and that's because the parameters are independent from your data source. There is no dependency between the parameters and your dataset. A parameter is completely independent and belongs only to the workbook. Okay, so now we have the parameter; how are we going to show it to the users? It's really easy. Go to the parameter, right-click on it, and then we have the option of showing the parameter in the view. Let's select that. 
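The settings in the Create Parameter window can be pictured as a small data structure. The following Python sketch is only a mental model, not a Tableau API: the class `IntegerParameter`, its fields, and the sample lists are all hypothetical. It covers the three options for allowed values described above (all, list, range).

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class IntegerParameter:
    """Hypothetical stand-in for an integer parameter; not a Tableau API."""

    name: str
    current_value: int
    allowed_list: Optional[Sequence[int]] = None  # "List": distinct values
    allowed_range: Optional[range] = None         # "Range": start, end, step

    def accepts(self, value: object) -> bool:
        # The integer data type already rules out text such as "abc".
        if isinstance(value, bool) or not isinstance(value, int):
            return False
        if self.allowed_list is not None:
            return value in self.allowed_list
        if self.allowed_range is not None:
            return value in self.allowed_range
        return True  # "All": any integer is allowed


# Open-ended, as chosen in the lesson.
open_ended = IntegerParameter("Choose Threshold", current_value=10_000)

# More restrictive variants (the concrete values are made up).
five_choices = IntegerParameter(
    "Choose Threshold", 10_000, allowed_list=[5_000, 10_000, 20_000, 30_000, 50_000]
)
stepped = IntegerParameter(
    "Choose Threshold", 10_000, allowed_range=range(0, 50_001, 5_000)
)
```

Note that nothing in this structure refers to a data source, which matches the point that parameters live in the workbook and not in the data.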
And now you can see the parameter input on the right side of the view. Here we can see the value of ten K as the default. Now let's go and change the value. We're going to set it to something like 500. You can see nothing changes in our view. So it doesn't matter what you enter here; the view is not changing. That means we now have to connect the parameter to the view somehow. In order to do that, we're going to go inside the calculation and replace the constant value with the parameter. Let's see how we can do that. We're going to go to our calculation, KPI Colors, right-click on it, and then go to Edit. Now we have to go over here and replace this value. I'm going to remove it, and now we're going to type the name of the parameter. As you can see, Tableau suggests it here, so click on it. That means any value that the user gives for this parameter is going to be used directly in this calculation. Let's try that out. Let's click OK. As you can see, something already changed in the view, but let's go and play with the values. Instead of five K, we're going to have something like 20 K. And with that, I just changed the threshold for this KPI. So now anything below 20 K is going to be red, and anything higher is going to be green. Let's try another value, like 50 K. Now as you can see the threshold is really high, and only two values are green. As you can see, it's very dynamic, and you give the users the power of defining and customizing the KPI as they want. With that, you're going to cover a lot of requirements in only one view. I just love this feature in Tableau. All right, so that's all for the dynamic calculations. Next we will learn how to use parameters to create dynamic reference lines. 114. Tableau | Dynamic Reference Lines using Parameters: All right, so now let's see another use case of the parameters. 
We can use parameters in the reference line, so we can show a reference line in our view to indicate what the threshold is. It just makes it clearer where the cut between red and green is. And here we can use our already existing parameter, Choose Threshold, in the reference line. Let me show you quickly how we can do that. So now let's go to the analytics pane. Here we have the option of creating a reference line. Let's go and double-click on it. And now we have a new window to configure the reference line. There are a lot of options, but for now we can focus on the parameters. What is really important here is the value of the reference line. Now let's check the options we have over here. As you can see, Tableau suggests the metric first. The second one is to create a new parameter. The third one is to choose the already existing parameter. As you can see, we can create new parameters exactly in the place where we need them. But for now, it really makes sense to use the same parameter in the reference line. Let's go and select that. Now as you can see on the right side, we already have a reference line in our view, and it has the label Choose Threshold. Instead of showing the label, we can show the value of the parameter. In order to do that, we're going to go to the label option and change it to Value. Let's select that. And that's it for now. Let's go and click OK. As you can see, we are now showing the threshold as a reference line. And if we go and change the value from 50 K to, let's say, ten K, you can see that the user controls everything in the view with their input in the parameter. They are changing the calculation as well as the reference line. It's really cool and professional to have this dynamic behavior in your reports, so this is how you can use the value of the parameter inside the reference line. All right, so that's all for the dynamic reference lines. Next we will learn how to use parameters in filters. 
115. Tableau | Dynamic Filters using Parameters: All right, so now we're going to go to the next use case, where we're going to use parameters in filters. And we can learn as well how to create parameters exactly in the place where we need them. So now we're going to go and create a report where we show the top ten products in our dataset. In order to do that, we're going to stay with the big data source. Let's go to the products, take the product name, and double-click on it. Now we have a list of our products, and what we need is a measure. We're going to go to the orders and take the sales, then drag and drop it over here as usual. Let's show the labels, and I'm going to sort it descending. Now we want to show only the top ten products. In order to do that, we're going to put the product name in the filters, so we can drag it from here while holding Control and then drop it on the filters. Now in the filter over here, we want to show the top ten products. In order to do that, we're going to go to the Top tab. And now we're going to go and define the rule. Everything is fine. So here you can see Top Ten by Sales. Now as you can see, we are defining a rule. In this rule, as in the calculations, we have a constant. The constant in this rule is the ten. Now you might be in the same situation, where you have one user asking for the top ten products and another user asking for the top 20 products. Instead of going and creating two different filters and two different views, we can stay with the same view and use parameters. Then you let the end users define their own list. So now we have to change the value of ten to a parameter. Let's click over here. And here we always have the three options: either you enter the value, or you create a parameter, or you use an already existing parameter. Now we want to create a new parameter for this view, and as you can see, this is the second method for creating parameters. 
We will not go to the data pane; we're going to create it exactly where we need it. Let's go and click Create a New Parameter. So now we have the same window again, where we're going to create a parameter. We're going to call it Choose Top Products. Now you might notice that you cannot change the data type, because you are creating a parameter here inside the filter for the sales, and the sales is a measure and a number. But as before, you can customize the display format, the current value, and which values you allow, whether everything or a range. So now let's try the range. The minimum is going to be one, the maximum is going to be 50, and we're going to have a step size of five. All right, so that's all. Let's click OK. So now let's check the rule again. We have Top, then our parameter, by Sales. That means we don't have a constant value and we are using the parameter. Let's go and hit OK. So now as you can see, the report is showing the top ten products because the default value of the parameter is ten. And if you check the left side, we have a new parameter called Choose Top Products. Great. Now the next step is to show the parameter to the users: right-click and select Show Parameter. All right, so now let's check our parameter. Now it's showing 11. I thought I gave it ten. So let's edit it again. Right-click on it and then go to Edit. All right, it's because we played with those values. As you can see, it's like bins: it starts from 1, then 6, 11, and so on, because the step size is five. So what we're going to do is change the minimum to zero, and then as you can see, we have ten here again. Let's click OK. All right, so now I promise you we have the top ten, because if you check the value here on the parameter, it's ten. All right, so now this is something different. Instead of having an input field here, we have a range slider. The user can move the slider. 
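The surprise with the value 11 follows directly from the range arithmetic. As a rough Python sketch (the helper names `slider_stops` and `top_n` and the sample sales figures are hypothetical, not part of Tableau), the slider positions and the resulting top-N filter can be modeled like this:

```python
def slider_stops(minimum: int, maximum: int, step: int) -> list[int]:
    """Values a range slider can land on: minimum, minimum + step, ... up to maximum."""
    return list(range(minimum, maximum + 1, step))


def top_n(sales_by_product: dict[str, float], n: int) -> list[str]:
    """Names of the n products with the highest sales, highest first."""
    ranked = sorted(sales_by_product, key=sales_by_product.get, reverse=True)
    return ranked[:n]


# Made-up sales figures for illustration.
sales = {"Product A": 900, "Product B": 450, "Product C": 1200, "Product D": 300}

print(slider_stops(1, 50, 5)[:4])  # [1, 6, 11, 16]  -> 10 is not reachable
print(slider_stops(0, 50, 5)[:4])  # [0, 5, 10, 15]  -> 10 is reachable
print(top_n(sales, 2))             # ['Product C', 'Product A']
```

Starting the range at 1 with a step of 5 never lands on 10, which is why changing the minimum to 0 restores the expected top ten.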
You can see our filter reacted, and it's now showing the top 20. The users could also use those arrows to change the value step by step. And as you can see, as I'm moving to different values, the filter is changing as well. That's it; this is how you can use parameters in filters. As you can see, your view is very dynamic and you let the users customize what they want. All right guys, so that's all for the dynamic filters. Next we will learn a very interesting use case of parameters: how we can dynamically swap between dimensions and between measures. 116. Tableau | Swap Measures/Dimensions using Parameters: All right guys, so now we're going to move to the most important use case of parameters. You can see this use case in almost every Tableau project. The use case is to use parameters to switch between dimensions and to switch between measures. Let's first learn how to use parameters to switch between dimensions in one view. Let's say that you are building a dashboard about the sales, and you're going to have views like sales by country and sales by category. That means you are creating two views with the same metric but different dimensions. Now instead of having two views, we're going to have only one view for the users, and they're going to decide which dimension to use in the view. In order to do that, we have to use the power of parameters. All right, so now let's go and create our view. We have the sales, so let's put the sales on the columns. And then we need the countries. We're going to take them from the customers, so we have the country on the rows. Great. And as usual we're going to show the labels. Now we want to make the dimension country a variable, a parameter. That means we need to switch somehow between dimensions, between country and category, in the same view. So instead of having the dimension country, we want to have a dynamic dimension with different values. 
Now the first thing that we have to do is create a parameter where the user chooses which dimension should be presented in the view. So here we're going to go and create a parameter from the data pane. Click over here, then Create Parameter. The main purpose of this parameter is to choose which dimension is presented in the view. First, let's give it a name; we're going to call it Choose Dimension. And now the question is: what are the values inside this parameter? They are going to be the dimension names, so values like Country and Category. They are strings, so the data type over here is going to be String. Let's go and select that. And as you can see, Tableau disabled the format. We cannot choose a format for a string; it's like free text. Next we have to define the current value, and here we're going to have the dimension country as the default. So let's go and enter the value Country. All right, so now since the data type is a string, we cannot build a range from it. So here we have only two options: either we have it as free text, as an input field, or as a list. In this scenario, it really makes sense to have a predefined list for the users, since the users will not see your data source and they have no idea which dimensions we have. If we go with free text, it's going to be really confusing and no one is going to get the right dimension. In this scenario, we really must provide a predefined list for the users, and then they're going to select the value that suits them. Here in this example, we're going to offer only two dimensions: the country and the category. Let's go and add those values. So we're going to have Country, and the next value is going to be Category. And of course, you can add more dimensions, like the city, the product name, and so on. For now we're going to stick with the example. And that's it, so let's click OK. Great. 
So now if you check the data pane, we have a new parameter called Choose Dimension. Here you can quickly see which data type we have for each parameter. Now the next step is to show the parameter to the end users: right-click on it, and let's go to Show Parameter. All right, now let's check our parameter. On the right side we have a list. It makes sense: we created a list parameter, so in the end we have a list for the users. And inside it we have only two values, Country and Category. Now if you go and switch between those two values, nothing is going to change in the view, because this parameter is not yet connected to our view. All right, so now we're going to go and create our dynamic dimension and use it in the view instead of the country. That means we have to create a new field. In order to do that, right-click over here and select Create Calculated Field. Let's go there. Now let's call it Dynamic Dimension. We're going to use CASE WHEN here. Don't worry about it; I'm going to explain everything in the section on calculations. The syntax starts with CASE, and then we have to specify the field name. In this situation, we're going to enter our parameter here. As you can see, as you are writing, Tableau is suggesting things for us, so we select our field Choose Dimension. Next we're going to go and specify an action for each scenario, for each value. Let's start a new line and write WHEN. The first value is going to be Country. You need to be really careful here to write it exactly as we wrote it in the parameter. It was capitalized in the parameter and it should be capitalized here as well, otherwise it will not work. Now, what should happen if the value is Country? We write THEN, and then we have to specify the action. If the users choose Country, what should happen? The dimension country should be used. Let's go and write Country over here. And as you can see, as I'm writing, Tableau is suggesting the dimension country, which is what we need. 
You can see it from the icon over here, so let's select that. All right, so now let's move to the next scenario, where the user goes and selects the value Category. It's exactly the same thing. We write here WHEN the value is Category, THEN what should happen? The dimension category should be used. Let's write Category here. And as you can see, the dimension category is suggested over here. Let's select it. That's it; these are the scenarios that could happen for the parameter, and we have to end the CASE WHEN with END, like this. As you can see, in this calculation we are just mapping between the values of the parameter and the dimensions. Let's go and click OK. Now as you can see, we have a new dimension on the left side called Dynamic Dimension. It is a calculated field, and now we're going to go and remove our static dimension, the country, and instead add our new dynamic dimension. All right, so now let's go and check whether it works. As you can see, the value is now Category, and in the view we see the categories, which is really good. All right, so now let's change the value of the parameter to Country. As you can see, the dimension in the view did change, so now we have country instead of category. As you can see, parameters are really powerful and your view becomes fully dynamic, where the users can define the level of detail in the view by changing the dimension. So imagine you are making a dashboard with sales and you have ten dimensions. Here you are going with only one view instead of having ten reports. All right, so that's it for this use case. This is how we switch between dimensions using parameters. All right, so now you have the following Tableau task. The task is to create a dynamic measure, using parameters to switch between three measures, sales, profit, and quantity, in the same view. You can pause the video right now to do the task, then resume once you are done. 
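Looking back at the dynamic dimension, the calculated field is only a mapping from the parameter value to a column. A minimal Python sketch of that mapping follows; the function names and the sample rows are assumptions for illustration, and the `if` chain merely mirrors the two WHEN branches described above.

```python
from collections import defaultdict

# Made-up rows standing in for the joined sales data.
rows = [
    {"Country": "USA", "Category": "Furniture", "Sales": 100},
    {"Country": "USA", "Category": "Technology", "Sales": 250},
    {"Country": "Germany", "Category": "Furniture", "Sales": 80},
    {"Country": "Italy", "Category": "Technology", "Sales": 40},
]


def dynamic_dimension(row: dict, choose_dimension: str):
    """Return the value of the dimension the parameter points at."""
    if choose_dimension == "Country":
        return row["Country"]
    if choose_dimension == "Category":
        return row["Category"]
    return None  # no branch matched, comparable to a NULL result


def sales_by(rows: list, choose_dimension: str) -> dict:
    """Total sales grouped by whichever dimension is currently selected."""
    totals = defaultdict(int)
    for row in rows:
        totals[dynamic_dimension(row, choose_dimension)] += row["Sales"]
    return dict(totals)


print(sales_by(rows, "Country"))   # {'USA': 350, 'Germany': 80, 'Italy': 40}
print(sales_by(rows, "Category"))  # {'Furniture': 180, 'Technology': 290}
```

In this sketch the comparison is case-sensitive, so a value such as "country" would match no branch and everything would collapse into a single `None` group, which is the kind of mismatch the warning about exact spelling is meant to prevent.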
All right, so now let me show you how you can do that. We have exactly the same steps as for the dimensions: first we create the parameter, and second we create the logic in the calculated field. Let's start with the first one. To create the parameter, we're going to go to the data pane, click over here, and select Create Parameter. We're going to call it Choose Measure. And here we have to think about the values of the parameter. They are going to be the names of the measures, which means the data type is going to be String. And here we have to define the default value. We have three values, Sales, Profit, and Quantity, and we're going to have Sales as the default value. Here again, regarding the values: the users don't know your data source, and they don't know the exact names of your measures. So you have to go and create a predefined list for them. Let's go over here. We have three values, so the first one is Sales, the second one is Profit, and the third one is going to be Quantity. That's it. Let's go and hit OK. As you can see, on the left side we have our new parameter. The next step is to show the parameter to the end users. In order to do that, right-click on it and select Show Parameter. Let's check our parameter. Over here you can see it starts with Sales, since it's our default. You can switch between those values, but as you can see, nothing is changing in the view; the view is still showing the sales. The next step is to go and create the calculated field. In order to do that, we're going to go to the data pane, right-click over here, and then select Create Calculated Field. We're going to call it Dynamic Measure. Here again, we can use the same syntax: CASE, then the name of the parameter, so we type Choose and select Choose Measure. Now we're going to go and define the scenarios. When the value is Sales, then the action is going to be selecting the measure sales, so write Sales and select the measure. All right, new line. 
And now we're going to go and map the next value. That's going to be Profit, then the measure profit. Write Profit, and let's go and select the measure. All right, so we mapped that. Now we're going to map the last value. So we have Quantity: if the user selects this value in the parameter, the quantity measure is going to be selected as well. Let's go with that. That's it; these are our three scenarios, and we end the expression at the end. Now as you can see, our calculation is valid. Let's go and hit OK. If you check the data pane, we have a new calculated field called Dynamic Measure. So now what we can do is go and remove our static measure and replace it with the dynamic measure. All right, now let's go and change the values in the parameter. Let's start with Sales. As you can see, now we have the values of sales. If you switch it to Profit, you can see the axis and the values in the view are changing to the new measure. But now let's go to the last one, to Quantity, and as you can see, we don't have any data. Well, if you get something like this, then we have an issue either in the calculation or in the parameter. Let's find out where the error is. Let's go to the calculation again, right-click on it, and then go to Edit. Here we have to compare the values. As you can see, we have Quantity here and we have the measure Quantity. Everything looks correct, but as you can see, the value over here in the parameter is misspelled. So here I have a typo, and that means that for Tableau, we didn't define any scenario for this value. In order to correct that, we're going to go to the parameter on the left side, right-click on it, then go to Edit, and then we're going to go to our list and change this value: double-click on it and write it correctly, Quantity. So that's it. Let's hit OK. And now as you can see, we have data for the quantity. So it's really important to have exactly the same values from the parameter inside the calculation. 
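The empty view is what one would expect from a CASE expression with no matching branch and no ELSE, which yields NULL. A small Python sketch of the same failure mode, plus a consistency check that would catch it, might look like this. The names `MEASURE_BRANCHES`, `dynamic_measure`, and `unmatched_values` are hypothetical, and the misspelling "Quantitiy" is invented, since the transcript does not show the actual typo.

```python
# Parameter value -> measure column, mirroring the three WHEN branches.
MEASURE_BRANCHES = {"Sales": "Sales", "Profit": "Profit", "Quantity": "Quantity"}


def dynamic_measure(row: dict, choose_measure: str):
    """Return the selected measure, or None when no branch matches."""
    column = MEASURE_BRANCHES.get(choose_measure)
    return None if column is None else row[column]


def unmatched_values(parameter_values: list, branches: dict) -> list:
    """Parameter list entries that no branch of the calculation handles."""
    return [value for value in parameter_values if value not in branches]


row = {"Sales": 120, "Profit": 30, "Quantity": 4}  # made-up order line

# A parameter list containing an invented typo.
parameter_list = ["Sales", "Profit", "Quantitiy"]

print(dynamic_measure(row, "Quantitiy"))                   # None
print(unmatched_values(parameter_list, MEASURE_BRANCHES))  # ['Quantitiy']
```

Comparing the parameter's list against the branches of the calculation, as the last line does, is the same check that was done by eye in the lesson.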
So as you can see, it's really sensitive. With that we have a dynamic dimension and a dynamic measure, and we can switch between them as the user wants. All right, so this is how you can use parameters to swap between measures in a view. It is just great. All right guys, so that's all on how to swap between dimensions and between measures using parameters. Next we will learn how to use parameters in titles and text. 117. Tableau | Dynamic Titles using Parameters: All right, so now we can move quickly to the next use case, where we create dynamic titles using parameters. If you look at our previous example, we have an issue. You see we have the title Sales by Country, but the view is showing profit by category, because we chose Category and Profit over here. So now the title is wrong and misleading. How can we solve this problem? We can use parameters to turn this static title into a dynamic title. Let's see how we can do that. Let's go to the title. And now we have a new window to customize the title. The rule, by default, is the sheet name. That means the name that you give to the worksheet is going to be the title of your view. In this example, I called this worksheet Sales by Country, and we have it as the title as well. But now we have to change this rule to be measure by dimension. Let me show you how to do that. Let's just remove this rule, and the first word in our naming convention is going to be the measure. In order to insert the parameter, we're going to go over here to Insert. Then you will have a list of different Tableau functions, and we have a section here for all parameters. Here we need the parameter for the measures, so let's click on that. And now the next word in our naming convention is going to be "by": a space, then "by", then a space. As you can see, "by" doesn't have any background color because it is static, and the parameter has a gray color to indicate that this is a dynamic value. 
And then the last word of our title is going to be the dimension parameter. Let's go and insert that in the same way: click Insert, and our parameter is going to be over here, the parameter Choose Dimension. Let's click on that. The first word is going to show the value of the measure parameter, then we have "by", and then we have the value from the dimension parameter. Let's go and click OK. Now as you can see, the title of our view did really change. So now we have it correct: Profit by Category. Now as usual, we're going to go and play with the values of the parameters. Let's select the dimension Country, and you see now we have Profit by Country. The same goes for the measure: we can go and select Quantity, and we have Quantity by Country. As you can see, it's really amazing. You can add parameters to everything and you're going to have really awesome views in Tableau. Let's quickly look at another example. We can do the same in the view with the parameter in the filter, and here we can make a dynamic title as well. Let's double-click on the title. Let's remove this part; we're going to start it with Top. Then the value is going to come from the parameter, so it's going to be Top 30, Top 40, and so on. So we're going to go and insert the parameter that we are using in the filter, which is Choose Top Products. And then we can add the word Products. So that's it. Let's click OK. And now as you can see, we have the title Top 30 Products, because the value in the parameter is 30. And as you are changing the values in the parameter, you can see the title is changing accordingly as well. I just love parameters in Tableau. All right. Okay. So with that we have learned how to use parameters in text and titles. Next is going to be the last use case of parameters. We will learn how to create dynamic bins in histograms. 118. Tableau | Dynamic Bins Using Parameters: All right, so now we're going to move to the last use case. We can use parameters in bins. 
In the last tutorial, we created bins and a histogram of the scores of the customers, and we decided that the size of the bins is ten. Let's go and rebuild this view quickly. It's really easy. Let's take the score bins and put them on the columns, and then we can take the count of the customers and put it on the rows. With that we have a histogram, and the size of each of those bins is ten. Again, we have a constant value inside our view. Let's go and make it dynamic. So we're going to go to our score bins, right-click on them, and then select Edit. Here you can see the size of the bins is ten; this is what we defined. But now instead of that we're going to create a parameter. Click on it, and again we have the option of creating a new parameter here. Select that, and now we're going to call it Choose Size of Bins. So again, Tableau did decide on the data type; it is based on the scores. And here the default value is ten. I'm fine with that. Now we have to go and choose which values are allowed: either all values, a list, or a range. Here I recommend using a range, because if you look at a range parameter, it really looks like small bins as well. It makes sense to define the range for the users. Here we have the minimum of five, the maximum of 25, and the step size can be five. I'm fine with that, so I'm going to leave it as it is. Let's go and click OK. And now you can see that instead of having the size of bins as ten, we have a parameter. Let's go and hit OK. As you can see, nothing changed in our histogram, because previously we had the size of ten and the default value of the parameter is ten as well. Let's go and test everything. First we have to show the parameter, so right-click on it and select Show Parameter. Now on the right side we have ten. And if we move between those values, you can see that our histogram is changing accordingly as well. With that, the customers can go and customize the histogram as they want. 
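The effect of the bin-size parameter can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming each bin is labeled by its lower bound; the function name `histogram` and the sample scores are made up and are not the course data.

```python
from collections import Counter


def histogram(scores: list, bin_size: int) -> dict:
    """Count scores per bin, keyed by each bin's lower bound."""
    counts = Counter((score // bin_size) * bin_size for score in scores)
    return dict(sorted(counts.items()))


# Hypothetical customer scores.
scores = [12, 18, 25, 37, 41, 44, 58, 63, 77, 90]

print(histogram(scores, 10))
# {10: 2, 20: 1, 30: 1, 40: 2, 50: 1, 60: 1, 70: 1, 90: 1}

print(histogram(scores, 25))
# {0: 2, 25: 4, 50: 2, 75: 2}
```

The data stays the same in both calls; only the bin size passed in changes the shape of the histogram, which is what the slider does for the user.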
As always, don't forget to make a dynamic title, because it's really cool. Let's go and do that: double-click on the title as usual. We're going to remove this from here and we're going to call it Histogram. So this is the static part, Histogram Score. And now we're going to add the size of the bins. So we're going to insert the parameter for the size of bins, and then we're going to close it. That's it. With that, we have a dynamic name. Now you can see the selected value from the parameter is showing in the title. If the user changes the size of the bins, as you can see, the title is changing accordingly as well. This really makes working with Tableau a lot of fun. All right, so now let's summarize. I think parameters are the best feature that we have in Tableau. Parameters are like variables that allow the users to replace a constant value in calculations, filters, reference lines, and so on. Another unique thing about parameters is that they are independent from your dataset, from your data source. And the main purpose of parameters is to make your visualizations more interactive, more flexible, and more dynamic, and to give different users the possibility to customize the visualizations for different needs and requirements without you having to create multiple versions of the same visualization. I just love parameters. All right, okay. So with that, we have learned everything about parameters and how to make our views dynamic. In the next section, we will learn more techniques for interactivity in Tableau, and we're going to focus on Tableau actions. 119. Tableau | Section: Actions: Tableau actions. They are a really great feature in Tableau that can add more interactivity and dynamics to your dashboards, which is going to make your dashboards very modern and interactive. They can also enable the users to do data exploration using your dashboards. So as usual, first we have to understand the concept behind Tableau actions. Then we're going to go and practice in Tableau. 
So let's go. All right guys, now we can start with the first question: what is an action? Well, an action is a change of status. That means, because of a specific event or trigger, the status of an object can change from A to B. The objects in Tableau are going to be the visualizations. The starting point is what we call in Tableau the source sheet. And the action is going to be triggered by user interactivity. How do users usually interact with our views? Using the mouse: either by hovering the mouse over the data, or by selecting or clicking on the data. And the last option is using the menu. So far we have defined for Tableau the starting point, the source sheet. The second thing we define for Tableau is what can trigger the action. And the last thing that you have to define for Tableau is what happens once the action is triggered. Here we have six different options, or actions. The first one is Go to URL. That means Tableau can jump from Tableau to an external website. So the target here is going to be a website, not Tableau and not any visualization. The second option is to jump, or go, to another worksheet or another dashboard. So here we are moving from one worksheet to another. Moving on to the third one, we have the filter action. What this means is that the actions you are doing on the source sheet are going to affect the filtering in the target sheet. Anything that you click on in the source sheet is going to impact the filter in the target sheet. And then we have another action called highlight. Here again, we have a target sheet. And this time, anything you do on the source sheet is going to be highlighted in the target sheet without filtering the data. That means for Go to Sheet, Filter, and Highlight, you always have to specify the source sheet and the target sheet. And then we have two other actions that impact the values of something. Here we have Change Set Values.
So anything that you do on the source sheet is going to affect the members, or the values, of the target set. This is going to make the set very dynamic and interactive. As the last one, we have Change Parameter Values. Again, here, any interaction that you do in the source sheet is going to impact the values of the parameters that we have. Those are all the options that you can define as a consequence of the action. So as you can see, it's really easy. We have to define the source sheet, we have to define the trigger, and then we can define what happens once the action is triggered. All right, so that was a quick introduction to Tableau actions. Next we're going to start with the first type of action, Go to URL. 120. Tableau | Action: Go To URL: All right guys, in Tableau we can create actions either on the worksheet page or on the dashboard page. In order to do that, we're going to go to the main menu. Over here we can find the option Worksheet. So let's go there. And then we have here the option Actions in order to create new actions. Or we can go to Dashboard, and we have the same option, Actions, there as well. But since we are now on the worksheet page, it is grayed out. So now we're going to learn how to create actions on the worksheet page, and we can start with Go to URL. So let's go back to Worksheet in the main menu. Then let's go and click on Actions. With that, we're going to get the first window. What we're going to see at the start is an empty table, because we haven't created any actions yet. But once you start creating actions, you will get a list of all the actions that you have inside the workbook or inside the sheet. Now in order to create a new action, we're going to go over here to Add Action. Then we're going to go to Go to URL. So let's select that. And here we're going to get a new window in order to set up our action. In our example, we want to jump from Tableau to an external web page, to Wikipedia.
First we have to give it a name. The name of the action is going to be Go to more details. Then, as we learned, we have to specify three things for Tableau. First, we have to define the source sheet, the starting point of our action. Then we specify what can trigger our action. And at the end, we have to specify the target. Let's start with the first one. We have to specify which worksheet is going to include this action. Here we first have to select the data source. It's going to be the big data source. And we're going to select the current worksheet, Sales Insights. That's all for the source sheet. Then we have to specify for Tableau what can trigger our action. Here we have three options: hover (mouse-over), select, or menu. Let's leave it as Menu first. Then we have to define for Tableau the URL target. In our example, we specify here the Wikipedia page. Here we have two options: we can either open a new tab or open a new window. That's all. It's really easy. All you have to do is specify the starting point, what can trigger the action, and what happens once it is triggered. Let's go and hit OK. And with that, you can see we now have one action in this table. Let's go and hit OK again. And let's test it. So far nothing has changed in our visualization. As you can see, we have the subcategories by sales. But now, once the user clicks on a mark, for example, let's go to the chairs over here, we will see a new link. It says Go to more details. And this is exactly the action that we have defined. For the interaction, the users have to go to the marks, click on a mark, and then go to the menu. Once you click on the link over here, Tableau is going to jump to a Wikipedia page. This is how it works. Now let's go and try different triggers. So I'm just going to close this. Let's go back to Worksheet, then go to Actions.
Let's go to our action over here and edit it. Now, instead of using Menu, I would like to have Select. Let's see the effect of that. Let's click OK, and then OK again. Now the trigger for the action is going to be selecting, clicking on the marks. Let's go to the storage: once I click on the mark, we're going to jump to Wikipedia. So as you can see, it's a little bit more sensitive here. Once you click on a mark, you jump to the URL. We don't have a menu with a link; we jump immediately to the link. Let's go and try Hover. It's going to be more extreme. So let's go to Actions again, to our action, and then let's switch to Hover. Here you have to be careful where you are mouse hovering, because you're going to open a lot of web pages. Let's go and hit OK. Now, very carefully, once I mouse over the paper, Tableau is going to go and jump to Wikipedia. I didn't click anything, I just moused over. So as you can see, now the action is very sensitive to the user's interactions: just by mouse hovering over the marks, Tableau executes the action. With Menu, the users have the chance to think about whether they want to execute the action and go to the URL or not. With Select, it's more aggressive: the users select a mark and jump immediately to something else. With Hover, it's very aggressive: just by mouse hovering over the marks, the action is triggered. Now let's conclude this, and be very careful where you are mouse hovering, because once you hit any mark, Tableau is going to go and open a new web page. So let's go back to Worksheet and then go to Actions. Let's remove it, because it really doesn't make sense to have a mouse hover go to a URL. The best way to do that is to go with Menu.
All right, so now, since we are working with URLs, we can add a lot of stuff like values, filters, and parameters to the URL in order to make it more dynamic. For example, depending on which subcategory the users select, I would like them to find more description about this subcategory. How can we do that? First we're going to go to the URL over here, and we can add wiki. Then we have to add the value of the subcategory. In order to do that, let's go to Insert over here. Then we get a list of all the fields that we have inside our data source. We are searching for the subcategory, and we can find it over here. Let's go and select the subcategory. As you can see, it is now a dynamic value inside our URL. Now I would like to make the name of the link more dynamic as well. Let's go and call it Read more about. Then we have to add the subcategory to make it more dynamic. We have an Insert here as well, and we're going to go and search for the subcategory; we have it over here. With that, we have a dynamic name for the link and a dynamic link as well. Let's go and hit OK, and try that again. Okay, let's go, for example, to the tables over here. Click on the mark, and you can see here we have the following link. It says Read more about Tables. So it reads the value from the subcategory that we are currently selecting. Let's click on that. And here we jump immediately to the Wikipedia page that describes tables. Let's go and try something else. Let's go to the storage over here. As you can see, the name of the link is very dynamic. We have Read more about Storage, and once you click over here, you get more information about storage. So this is really amazing for adding more context and more information to our visualizations and making them more interactive. That's all for now for the Go to URL action. All right, so that's all for the first type of action, Go to URL.
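The dynamic link and the dynamic link name from this lesson can be sketched outside Tableau as well. This is a rough Python illustration, not how Tableau builds the link internally. The base URL is an assumption (the lesson only says Wikipedia and "wiki"), and `build_link` is a made-up helper name.

```python
from urllib.parse import quote

# Assumed base URL; the lesson only mentions Wikipedia and adding "wiki".
BASE_URL = "https://en.wikipedia.org/wiki/"

def build_link(sub_category):
    """Return (menu text, target URL) for the sub-category of the selected mark."""
    link_text = f"Read more about {sub_category}"
    # Percent-encode the field value so spaces and symbols stay valid in a URL.
    url = BASE_URL + quote(sub_category)
    return link_text, url

for value in ["Tables", "Storage"]:
    print(build_link(value))
```

The selected field value is inserted twice, once into the visible link name and once into the URL itself, which matches the two Insert steps in the action dialog.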
And next we're going to learn how to use actions in order to jump from one sheet to another. 121. Tableau | Action: Go To Sheet: All right guys, next we're going to learn how to use actions in order to jump from one worksheet to another. In this example, the source, or the starting point, is Sales Insights, and the target is going to be Profit Insights. So now we'd like to make an action in order to jump from the sales to the profits. In order to do that, we're going to go to Worksheet in the main menu. Then we're going to go to Actions, and we're going to go and create a new action. This time we're going to pick Go to Sheet. So let's go and select that. And here we get our new window in order to set up the action. It is very similar to the URL setup. First we have to give it a name; we're going to call it Go to Profit Insights. And then here we have the three things: the source, what's going to trigger the action, and the target. The source is going to be Sales Insights. And the trigger this time is going to be Menu as well. Let's go and select that. And then we have to specify the target sheet. It's going to be Profit Insights. Let's go and select that. We have our setup. Let's go and hit OK. That's all. Then, as you can see, we got a new action in our table. Let's go and hit OK as well. Now let's go and test it. Let's go to one of those marks. Let's go to the machines. And then we get our menu. We now have two links: Go to Profit Insights and Read more about Machines. The second one is going to take us away from Tableau to an external web page. The first one moves us to another worksheet inside Tableau. So let's click on Go to Profit Insights. Now, as you can see, Tableau executed the action once we clicked on that, and we jumped to another worksheet. Now we are at Profit Insights. All right, so that's it. As you can see, it's really easy.
We just have to specify the source sheet, the target sheet, and what can trigger the action. All right, so that's all for the type Go to Sheet, and next we're going to learn about filter actions and also how to use quick actions. 122. Tableau | Action Filter & Quick Actions: All right guys, moving on to another type of action, we have the filter action. What happens here is that anything you select in the source sheet is going to be relevant in the target sheet. That means in the target sheet we will see only the data, only the information, that you have selected in the source sheet. So let's see how this works. We're going to stay with the same example, where we have one worksheet about the sales, which is going to be our source, and another worksheet about the profits, which is going to be our target. Let's start with the source. Let's go to Worksheet in the menu, let's go to Actions, and let's go and add a new action. The first one is the filter, so let's go to Filter. Here we get, again, a new window in order to set up our filter action. It's going to be very similar to the previous ones, but here we have a few more options. First we have to give it a name; we're going to call it Filter Profit Insights. As usual, we have to define the source sheet. It's going to be Sales Insights; I don't want to have all sheets. And then the trigger, let's say, is going to be Select this time. Then we have to define the target sheet. It's going to be our Profit Insights over here. In the filter actions, we have more options about the interactivity. We have to define for Tableau what happens once the users deselect the data, once they clear the selection. Here we have three options: keep filtered values, show all values, and exclude all values. The best way to understand this interactivity is to have an example. So for now we're going to stay with the default, keep filtered values.
Let's go and hit OK. With that, we got our new action over here. Let's hit OK again and try the action. The best way to understand how this filter action works is to bring both worksheets into a dashboard. So let's go and create a new dashboard. Let's go and get the source, and get the target as well, below it. I will just remove this legend over here. So now let's go and start interacting with the reports. Once we select something in the source, it's going to affect the data in the target. For example, let's go and select those subcategories. As you can see, my interaction with the source has an effect on the target. Now we can see only the subcategories that I have selected in the source sheet. With that, the user gets the feeling that everything is connected, everything is interacting together and is alive. Anything I select in this worksheet has an effect on the next one. For this type of action, we mostly go with Select instead of Menu. It really makes sense to select something in the dashboard and to have an immediate reaction in the next sheet. So as you can see, it's really easy, right? So now I want you to understand another type of interactivity: what happens once I deselect what I have selected, or once I clear my selection. We have selected keep filtered values. So once I, for example, click on the empty space over here to deselect, nothing is going to change. With that, we have kept the filtered values, and this is exactly what we specified in our action. But now you might say, you know what, once I deselect stuff in the source, I would like the filter to be cleared in the target as well. In order to do that, we're going to go back to our actions and edit our filter action. Now, if the users clear their selection or deselect, we want to show all the values in the target sheet. So let's switch it like this.
Click OK, and OK again. And let's try this. For example, I'm going to go and select only the storage. And as you can see, we get only the storage. And once I clear my selection, once I deselect everything in the source, you can see we get all the values again in the target sheet. In this scenario, it makes more sense to use this option: if I'm not selecting anything in the source, nothing should be filtered in the target. Now let's go and check the last option. Let's go to Worksheet, Actions, and to the filter action. Let's go to exclude all values. Let's select that, and let's try what happens now. At the start, nothing happens; we see all the data in both sheets. Now let's go and select, for example, those subcategories. As usual, we get the filtered data in the target sheet. But now, once I deselect, everything is going to disappear from the target sheet. That means the target sheet only shows data if I select something in the source sheet. Nothing here is relevant as long as I'm not selecting anything in the source sheet. Once I start selecting something in the source sheet, the data is going to be shown; otherwise, if I deselect, it doesn't show anything. One more thing that I would like to show about filter actions: if you go to the target sheet over here, you can see that we don't have any data, and Tableau indicates that there is an action filtering the data inside this worksheet. You can see that in the name of the filter we have the word Action, which Tableau uses to indicate that this filter depends on the actions of the users. Any value that is selected by the users is going to impact this filter. For example, if you go inside it and edit the filter, you can see nothing is selected. And that's because in our interactions we didn't select anything in the dashboard.
Once I select, for example, those values, you can go back to the target sheet and see those values selected in the worksheet as well. And if you go inside the filter, you can see those values are selected inside the filter too. Any filter whose name starts with Action comes from an action filter, and the values inside it are defined by the interactions that you have done. All right, so with that we have covered everything for filter actions in Tableau. All right guys, now I'd like to show you quick actions in Tableau using dashboards. For example, let's say that we have the sales and the profits and they are disconnected; there are no actions between them. But now I can go and create a filter action between them very quickly. If you go, for example, to the sales over here, you can find a small icon for the filter. It says Use as Filter. If you click on that, you can see now it's filled. And now if I click on anything inside the sales, as you can see, the profits get filtered. Now if you go in the main menu to Dashboard, then to Actions, you can see that Tableau automatically created a new action. It usually has generated in its name; we have here Filter 1 (generated). This one was created automatically, or quickly, as we clicked on this small icon over here on the dashboard. And of course, you can go over here and change the options: if you don't want to have Select, you can move it to Menu, to Hover, and so on. And of course, you can do the same thing for Profit Insights. So let's go and close everything. Let's go to Profit Insights, and we can say, okay, the profit is going to filter the sales as well. So let's go and click on that. And now let's select everything. Anything that I select in the profit is going to filter the sales as well. This is a really nice and quick way to create actions in Tableau. But this is only for the filter action type. All right, so that's all for the action filters.
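The three clear-selection behaviors of the filter action (keep filtered values, show all values, exclude all values) are easier to compare side by side. Here is a simplified Python model of what the lesson demonstrates. It is a sketch of the observed behavior only, not Tableau's implementation, and the class name `FilterAction` and the mode strings are invented for this example.

```python
class FilterAction:
    """Toy model of a filter action between a source and a target sheet."""

    MODES = {"keep", "show_all", "exclude_all"}

    def __init__(self, target_values, on_clear="keep"):
        if on_clear not in self.MODES:
            raise ValueError(f"unknown mode: {on_clear}")
        self.target_values = list(target_values)
        self.on_clear = on_clear
        self.last_selection = None  # None means the user has not selected anything yet

    def visible(self, selection):
        """Return the target values shown after a selection (empty = cleared)."""
        if selection:
            self.last_selection = set(selection)
            return [v for v in self.target_values if v in self.last_selection]
        if self.last_selection is None:
            # Before any interaction the lesson shows all data in both sheets.
            return list(self.target_values)
        if self.on_clear == "keep":
            return [v for v in self.target_values if v in self.last_selection]
        if self.on_clear == "exclude_all":
            return []
        return list(self.target_values)  # "show_all"

values = ["Chairs", "Storage", "Tables"]
for mode in ("keep", "show_all", "exclude_all"):
    action = FilterAction(values, on_clear=mode)
    action.visible(["Storage"])       # select one sub-category in the source
    print(mode, action.visible([]))   # then clear the selection
```

Only the branch taken after clearing differs between the three modes; while something is selected, all of them filter the target in the same way.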
Next, you're going to learn another type of action: the highlight. 123. Tableau | Action Highlight: All right guys, now we're going to talk about another type of action: the highlight. The highlight is very similar to the filter: the user interacts with the source sheet, and in the target sheet we focus on the subset of data that was selected in the source. But the main difference here is that the irrelevant data will not be filtered out. All the data stays in the target sheet, but only what we are selecting is going to be highlighted there. The best way to understand the highlight action is to have a dashboard with two worksheets. So now let's go and create a highlight action. As usual, we're going to go to the main menu over here, but this time we're going to go to Dashboard. Then let's go to Actions and add a new action. We're going to go over here to Add Action, and this time we're going to pick Highlight. As usual, we have to define the source, the trigger, and the target sheet. Let's go and give it a name. It's going to be Highlight Profit Insights. Then the source is going to be our sales; I'm just going to remove the profit from here. The best way to trigger a highlight is on hover, so I'm going to run this action on Hover. And then the target is going to be our Profit Insights, so I'm just going to remove Sales Insights. Then we have some options to define which fields are going to be included in the interaction. The default is all fields, then there is dates and times, and the last option is selected fields, where you can specify which fields are going to be included in the action. I'm going to stay with the default, all fields. So with that we have everything. Let's go and hit OK. And with that we got our action. Let's hit OK again. Now let's go and test the action. Let's go to the source sheet.
The trigger is going to be mouse hover. Now, as I'm mouse hovering over this information, you can see that Tableau is reacting in the target sheet and focusing on the data that I'm hovering over. If I stay on the storage with my mouse, you can see that Tableau is focusing on the storage in the target sheet, and you have a highlight in a yellow color. As you can see, it's really nice, right? It adds more interactivity and more dynamics to your views: as the users interact with one worksheet, the other worksheet gets highlighted. It's really nice. Now you might say, you know what, I would like to have the same effect the other way around: as I'm mouse hovering over the data in Profit Insights, I would like to have highlights in the source, in Sales Insights, so that both of those worksheets can highlight each other. In order to do that, it's really simple. Let's go to the main menu again, Dashboard, Actions. Let's go to the highlight action. And then let's include everything in the source sheets and everything in the target sheets as well. With that, all those worksheets can highlight each other. Let's go and hit OK, and then OK again, and let's check. Now, as you can see, as I'm mouse hovering over Profit Insights, the highlight is in the sales, and vice versa: as I'm moving over the sales, you can see the highlight is on the profits. Now the mouse hover highlights both worksheets. All right guys. Now, generally speaking about highlights in Tableau, there are different options for adding highlights or controlling the highlighting. For example, if you go to the quick menu over here, you can see that we have an option to edit the highlights. If you go over here, you can see that we can disable the highlights, we can enable them, and we can define which fields are going to be included in the highlights.
For example, if I go over here and say, okay, disable workbook highlights, what happens is that the highlight action gets disabled. In order to enable it, we go again to the quick menu over here and enable the workbook highlights. As you can see, now I can highlight that stuff again. In Tableau, we can also add highlighters to worksheets or dashboards. Go to Analysis in the main menu, and there we have Highlighters. If you go over here, we have the subcategory, since it is the only dimension that we have in the dashboard, in those worksheets. Let's go and click on that. Now if you look at the right side, we got something like a filter. But it's not really a filter, it is a highlighter. If you click on this box over here, you get a list of all the distinct values inside the subcategory. Now what you can do is just mouse over those values, and as you can see, the dashboard gets highlighted. This is another way to trigger highlights inside your dashboards or worksheets, by adding the highlighter on the right side. For example, if I just go and click on one, it's going to stay highlighted the whole time, since we have selected this value over here. And of course, if you want to get everything back to normal, you can go over here, click on the X, and remove the value. With that, we got everything back without highlights. All right guys, so that's all about highlights in Tableau. All right, so that's all about the highlight actions. And next we're going to learn how to use actions in order to change the members of sets. 124. Tableau | Action Sets: Okay guys, moving on to another type of action, we have the sets. As we learned previously, a set splits your data into two groups, the in group and the out group. Now, the one who is creating the dashboard or the worksheets gets to define which members are going to be in and which members are going to be out.
But in order to make your visuals interactive, we can give this option to the users, so they can define which members are going to be in and which members are going to be out. In order to do that, we're going to go and create set actions. So first let's create a view and a set. In order to do that, we're going to stay with the big data source. Let's take the sales to the columns and the profit to the rows. Here in the middle, we're going to go and get the customer ID. With that we got the data points, but we still don't have any sets. First let's go and make those points a little bit bigger in order to see the members. And then I'm just going to go and change the shape to filled circles as well. That's it. Let's go now and create a set. In order to do that, I'm just going to go and select those top-right customers. And then we go over here and say Create Set. All right, I'm just going to leave it as it is. And with that, we got a new dimension for the set in the Data pane. So now we're going to go and add it to our view as the color. So let's go and move it to Color over here. As you can see, the blue is going to be the In and the Out is going to be gray. I'm just going to change those colors. So let's go to the colors: the In is going to be, let's say, green, and the Out is going to be red. Let's go and hit Apply and OK. And now, as you can see, the one who's creating this view is deciding which members are in and which members are out. But now let's go and give this option to the users. In order to do that, we're going to go and create a set action. As usual, we're going to go in the main menu to Worksheet. Let's go to Actions, and let's add a new action. This time we're going to use Change Set Values. Let's go inside. And here we have the usual stuff: the source, what can trigger the action, and the target. Let's just give it a name, Change Customer ID Set, and then we're going to go and define the source sheet.
It's going to be the Action Sets sheet that we have here, and then we have to define the trigger. I'm just going to leave it as Select. The target is going to be the target set. In order to do that, we have to click over here, and then we get all the sets that we have inside our data source. In this example, we have only one set in the big data source. We have it over here, Customer ID Set. Let's go and click on that. And now here we have more options for the sets. The left ones define what happens to the set once the users start interacting, or selecting data points. On the right side, we have options for what happens once the users clear the selection, once the user deselects stuff in the visualization. In order to understand those options, we have to play around with those values. On the right side, I'm just going to say keep set values: if I deselect everything in the view, nothing happens. Now, in the left group, we have assign values to set, add values to set, and remove values from set. We can start with the first one: once the action is triggered, we assign values to the set. What does this mean? If you choose this one, Tableau is going to empty the in group, and anything that you select is going to be the members of the in group. Let's see what this means. Let's go and hit OK, and then OK again. Here we have to select in order to trigger the action. As you can see, we have those members inside the in group. Now let's say that I would like to select those four members over here. Once I select those members, what happens? Only those members are going to be in the group; you can see those other points are now out. That means Tableau is removing everything and starting from scratch, and anything that you select is going to be the only members of the in group. That's it for this option: the selection defines the members of the group. Let's go and change it to the second option. Let's go to our action, Change Customer ID Set.
Now let's move to this one. It says add values to set. What happens this time? Tableau will not forget which members were previously inside the group; now we are just adding new members to the set. Let's see how this works. Let's go and hit OK, and OK again. Currently we have those four members in the group. And let's say that I would like to add two new members, those two members over here, so let's go and select them. With that, you can see we still have those four members in; we have just added two new members. That's it. It's really simple, right? Let's go and try the last one. Let's go to Actions and to Change Customer ID Set as well. This time we can say remove values from set. Now what happens? It works just like adding new members to the set, but this time anything that you select is going to be removed from the set. Let's go and try that out. Let's go and hit OK, and OK again. Let's say that I would like to remove this member from the in group and move it to the out group. In order to do that, let's just go and click on it. As you can see, now it's red and it is no longer in the group. That's it. So this is what happens once we trigger the action. But now let's learn what happens once we deselect. Let's go to Actions over here and go back to our set action. On the right side, we have three options: keep set values, add all values to set, and remove all values from set. So far we have always worked with keep set values. That means if you clear the selection, nothing is going to happen; the members that you have defined with your selection are going to stay in the group. But the other two are going to destroy your definition. Let's take add all values to set: if you deselect, it's going to add all values to the group. So with this option, if you deselect, everything is going to be in.
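The set action options map neatly onto plain set operations. The sketch below uses Python sets to mirror the three options for a selection (assign, add, remove) and the three options for clearing the selection (keep, add all, remove all). The function names, the mode strings, and the customer IDs are made up for this illustration, and it models the behavior shown in the lesson rather than Tableau's internals.

```python
def on_select(members, selected, mode):
    """New in-group after the user selects marks."""
    members, selected = set(members), set(selected)
    if mode == "assign":   # start from scratch: only the selection is in
        return selected
    if mode == "add":      # keep the old members and add the selection
        return members | selected
    if mode == "remove":   # move the selected members to the out group
        return members - selected
    raise ValueError(f"unknown mode: {mode}")

def on_clear(members, all_values, mode):
    """New in-group after the user clears the selection."""
    if mode == "keep":        # nothing happens
        return set(members)
    if mode == "add_all":     # everything goes in
        return set(all_values)
    if mode == "remove_all":  # everything goes out
        return set()
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical customer IDs, only for illustration.
all_customers = {"C1", "C2", "C3", "C4", "C5", "C6"}
in_group = {"C1", "C2", "C3", "C4"}

print(sorted(on_select(in_group, {"C5", "C6"}, "assign")))
print(sorted(on_select(in_group, {"C5", "C6"}, "add")))
print(sorted(on_select(in_group, {"C1"}, "remove")))
print(sorted(on_clear(in_group, all_customers, "add_all")))
```

Seen this way, the left group of options decides how a selection is combined with the current members, and the right group decides what replaces the members once nothing is selected.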
The opposite is remove all values from set: if you deselect, everything is going to be out. So let's go and select this one, add all values to set, and try it out. Currently, we have those five members in the group and the rest is out. Now I'm interacting with our report, and I select this point to be removed from the out group. Once I deselect or clear my selection, what happens? All the members are going to be in the group. The other option is exactly the opposite: if I deselect, everything is going to be red and going to be out. All right. Okay, so that's all for the set actions. As you can see, it's a really nice feature where you can give the users the freedom to choose which members are in and which members are out, so they can do focused analysis, instead of us, the ones creating the dashboards, deciding. So it really adds more dynamics and more interactivity to your views. All right, so that's all about the action sets, and next we're going to learn the last type: how to use actions in order to change the values of parameters. 125. Tableau | Action Parameters: All right guys, now we're going to move to the last type of action: the parameters. Again, here we can use actions in order to change the values of the parameters. So now let's have an example in order to understand how this works. Let's build sales by month. So let's go and get the sales over here, and let's go and get the order date to the columns. I'm just going to change it to months over here, and let's go and add the labels. Now, what I would like to build in this view is this: as I'm selecting data in the view, I would like to get the total sales of my selection. Whether I choose one point or a group of points, I would like to get the total sales of my selection. In order to do that, we're going to go and create another worksheet where we show the total sales of our selection. Let's go and create another worksheet.
So the first thing that we have to do is to create a new parameter. Let's go to the Data pane, to the empty space over here, right-click on it, and then Create Parameter. Let's give it a name: it's going to be Total Sales. Inside this parameter, we are going to have the total sales of our selection. The data type is going to be Float. For the display format, let's move it to Currency (Standard), and the current value can be, let's say, zero instead of one. That's all. Let's go and hit OK. Right-click on it and choose Show Parameter. Currently it's zero, and there is nothing in our view. Now I would like to have one sentence here that says "Total sales", followed by the value of the parameter. In order to do that, we have to create a new calculated field. Let's go over here to this arrow and create a new calculated field. Then we're just going to go to our parameter in the Data pane and drag and drop it into our calculation. Why are we doing this? Because we cannot use a parameter directly in our aggregations or in our view; we always have to create a new calculated field, and inside it we have the value from the parameter. That's all. Let's go and hit OK. Now on the left side we have a new calculated field, our new measure. Let's go and put it on Text over here. As a default, we get it as a sum. As the users select different points, we're going to have the sum of all their selections, so this aggregation is correct. But now here in the view we have only zero, and I would like to have a sentence: "Total sales", then the value. In order to do that, let's go to Text over here, then to the three dots. Now we have a new window where we can customize the text. We're going to say "Total sales", and then we have the value of our new calculated field. But let's just make everything bigger. "Total sales", let's move it to 20, and the parameter, or the calculated field, is going to be 20 as well. And I would like to make it bold.
That's all. Click OK. As you can see, now we have "Total sales" and the value zero, which comes from the parameter. Now let's go and change this value to, for example, 100. As you can see, we get total sales of 100. Now I would like to change the format of the total sales as well. Let's go to our calculated field, right-click on it, then go to Format. Here on the left side we have Numbers. If you click on these options, we can go to Currency (Standard). Then let's move to United States; it's going to be somewhere over here, English (United States). And with that, we get the dollar sign. All right guys, now the next step is that I would like to bring everything into one dashboard, so both of the worksheets. Let's go and create a new dashboard. Let's get the total sales, and then we're going to get the sales by month. Let me just make it a little bit bigger, and let's remove the title from the total sales. Now as you can see, the total sales value comes from the parameter. So far, everything is disconnected between those two worksheets: anything that I'm selecting here will not be reflected in the parameter. Now here comes the magic. I would like to change the value of the parameter depending on my selections, my interactions with this view. In order to do that, as usual, we're going to go to the main menu over here, to Dashboard. Then let's go to Actions, add a new action, and choose this option: Change Parameter. Let's go inside it. Here we have the usual stuff: the source, the trigger, and the target. Let's give it a name: Change Total Sales. Let's define the source: it's going to be the sales by month. Let's just remove Sheet 7 from here; Sheet 7 is the total sales. Then the action is going to run on Select, so I would like to select in order to trigger the action. And then here we have to find our parameter.
We have only one, the Total Sales, so let's select that. On the right side, we define what's going to happen once we clear our selection. I would like to say, okay, let's set it to zero if the users are not selecting anything. All right, so now the last thing we have to define for Tableau is which field is going to control the value of the parameter. In the sales by month, we have different pieces of information, as you can see over here: we have the month, and we have the sum of sales. Of course, the sum of sales is going to control the value of the parameter, so let's go and select this value over here. The aggregation is going to be the sum, since we are finding the total sales. So that's all for now. Let's go and hit OK, then again OK. Now as you can see, we have the value 100, which comes from the parameter. But if I select, for example, the data point over here, you can see that the total sales comes from my selection, the 64,000. Now, if I go and select all those values from the view, Tableau is going to sum up all the sales from my selection and put the result in the parameter value. So with that, we have a connection between the parameters and our actions in the view, which gives a lot of dynamics and interactivity to your dashboards. All right guys, so that's all for the parameter actions. It's a really nice feature in Tableau. All right, so that's all for the action types. Next, I'm going to share with you my tips about the action triggers.

126. Tableau | Action Triggers: All right guys, now I would like to give you quick tips about when to use which type of action trigger. For example, if you want to jump from your worksheet to another worksheet, or go to an external website, it's better to give the users this option using Menu. First, show the menu. Let the users see the link, and then if the users want to go there, they're going to select the link and click on it.
That's always better than surprising them with Select: if the users select something and suddenly they go somewhere else, it's really not nice. So go with Menu if you use Go to URL or Go to Sheet. If you are using a filter action, the best way is to use Select. It's more interactive: once a user starts selecting in one worksheet, the other worksheet is going to be filtered. I usually go with Select if I'm using filter actions, and Tableau uses it as the default as well if you create a quick filter action. For the last one, the highlights, I really recommend you go with Hover. As the users are mouse-hovering inside one worksheet, the other worksheet is interacting as well. It's really nice and more modern. Really be careful about when and how to trigger which actions. Don't surprise your users by jumping somewhere else. If you are using Go to URL or Go to Sheet, be careful: talk with your users about how they would like to see it, and then maybe make a decision about the interactivity and the actions together with them. All right, so that's all from me about actions in Tableau. That's all for the tips about the action triggers, and with that, we have completed the section on Tableau actions. In the next section, we're going to cover a very important topic in Tableau: the Tableau calculations. There we'll learn how to manipulate the data in Tableau, and we're going to learn many Tableau functions.

127. Tableau | Section: Tableau Calculations: Tableau calculations. We will now cover over 60 different functions in Tableau for manipulating your data. You will not only understand how to use all those Tableau functions, you will also understand the concept behind them, using very simple sketches and examples so that you understand how those Tableau functions work.
Because some of those calculations are really complicated, we will start by covering the basics of Tableau calculations. Then we can dive into the most-used functions in the four categories: row-level calculations, aggregate calculations, LOD expressions, and table calculations. Let's start by having an introduction to the basics of Tableau calculations. So now let's go.

128. Tableau | Introduction to Calculations: Everyone, now we're going to talk about calculated fields in Tableau. We're going to start with the first question: why do we need calculated fields in the first place? As we learned before, as we are building our visualizations, we always go to the Data pane, to the data source, and we grab the fields that we see there into the view. Now let's imagine that you are in a scenario where you need extra information, information that is not available in the data source. Or you would like to manipulate and transform that information into new information, new fields. Or let's say that we are building very complex logic in our views. For all those scenarios, we can create new calculated fields in Tableau to be placed in our data source. Calculated fields in Tableau are user-defined fields that are created using formulas or expressions. So they are additional fields that you can create based on the original fields in the data source. All right everyone, now we're going to move to the next question: how do we create new calculated fields in Tableau? There are five methods for creating calculated fields. Four of them are global. That means once you create the calculated field, it's going to appear in the data source, in the Data pane, to be used in any other worksheet or in any workbook that is connected to the data source. And we have one local method for creating a calculated field for only one view, and we call it a quick calculation. Now let's go and explore those five methods.
The first way to create a new calculated field: we can go to the Data pane on the left side and right-click on the white space over here. The first option is Create Calculated Field. Once we go there, we get a new window where we can write our expression. That's it, this is the first way. Let's move to the next one. I'm just going to close this. If you go over here, we have a small arrow near the search. If you click on it, we get exactly the same list. As you can see, the first option is Create Calculated Field. The third way to do that is to go to any of the fields inside our data source. Let's say that we go to the address, right-click on it, and then here we have the option Create, and the first entry is called Calculated Field. Once you go there, we get exactly the same window, but this time we get the field name already prepared in the expression, because here we went specifically to the address and created a new calculated field from there. Let's close this, and I'm going to show you the fourth method for creating a calculated field. We're going to go to Analysis in the menu over here and click on that. Here we have the option Create Calculated Field. Once we click on that, we get the same window again. Those are, quickly, the four methods for creating a new calculated field. You will always get the same result; only if you go to a field and create the calculated field from there will you find the field name inside the expression. Now let's go and call it My First Calculation. I'm just going to put anything here inside the expression; let's just type one. Let's go and hit OK. Now we can see in the Data pane that Tableau did create a new field for us. It is a field like any other field that we have in the Data pane in our data source. It has a data type as well. It is a continuous measure, because I entered one there, so it's a number.
You can treat it exactly like any other field. But to understand which fields are calculated and which fields are original, you can look at the icon over here: it has the equal sign. That means if you see the equal sign next to the data type icon of any field, this field is a calculated field. It is not an original field that comes from the data source. Someone went and created this calculated field, and it is based on the original data. With that, you can quickly identify which fields are original data that comes from the source systems and which fields are calculated fields created by the users. With that, we have created our first calculated field, and it is a global field. That means if you go to any other worksheet, let's go, for example, to a new one, we can find our calculated field again. Now let's move on to the next method, where we're going to create a local calculated field relevant for only one view. In order to do that, we have to have something in the view. Let's take, for example, the customer's first name and put it on the rows. Now, in order to make a quick calculated field locally, we're going to go inside the field, inside the dimension, and we can do that by double-clicking. Once you do that, you can see we are now allowed to write something inside this field, and what we are writing now is the calculated field. Let's say that, okay, we now have capitalized letters in the first name, and I would like to manipulate it and transform it to upper case. I would like to see everything in upper case. In order to do that, we have the function in Tableau called UPPER. Now I'm writing the function name, and it's going to transform the first name. With that, I have created a calculated field inside the first name. Once you click somewhere outside or press Enter, we can see in the results that this function did change the case of the first name. With that, we have done a quick transformation, a quick calculation, inside the view.
If you grab the first name again from the Data pane, you can see that nothing has changed. We didn't change anything in the data source; we just changed it quickly for this view. This is how you can quickly create a new calculated field in the view without affecting the data source, and it's going to be available locally, only in this view. Now let's say that this transformation is interesting and I would like to reuse it somewhere else, in other views. In order to make it available in our data source, we can grab this field from the visualization and just put it on the data source. Let's release. With this, you can see Tableau added the new field inside Customers, and we know this is a calculated field by checking the data type: you can see we have the equal sign. Tableau offers us here to rename it; I would like to leave it as it is. If you go inside it in order to edit the calculation, right-click on it and edit the calculation, we again get the window where we can configure the calculation. All right, okay, so with that I have shown you all the methods for creating new calculated fields in Tableau. All right, as the next step we're going to learn the basic options that we have inside the calculation window. Let's go to our calculated field, My First Calculation. First let's show the value in the view. Let's drag it to Text over here, and as you can see, we have the value one. Let's go and edit the calculated field in order to get the window: right-click on it, and let's go to Edit. So what do we have over here? First we have the name of the calculated field, and we called it, in this example, My First Calculation. Of course, you can go to the Data pane or the data source and rename it directly from there, or you can do it inside the calculation window. Okay, as the next piece of information we have the name of the data source where we are creating the calculated field.
In this example, we created the calculated field inside the small data source. This is really important if you have multiple data sources and you are creating a lot of calculated fields; it's really nice to know where I am creating this calculated field right now, so it's nice information to have. Now moving on to the most important section in this window: this white area where you can write your expression to define the calculated field. Currently we have one, but we can go and use different things. We can use field names, parameters, functions, and so on. For example, last time we used the UPPER function on the first name. With that, I have defined what should be done inside this calculated field. This is my expression. Now don't worry about the syntax that I'm writing inside the expressions, because in the next tutorials we're going to learn everything about the syntax and about the different functions in Tableau. Don't worry about it now. The next piece of information that we have is the message "The calculation is valid". Here, Tableau gives us quick information on whether the expression that I just wrote is valid or invalid. Currently, I wrote the calculation in the correct way; that's why Tableau says everything is fine. But now let's make something wrong. Now we get a red message from Tableau saying the calculation contains errors. And here we have a small arrow. If you go over here, you'll see the message. It says Tableau is expecting a closing parenthesis here. So Tableau shows us a quick message to let us know what's wrong in our calculation. If I go and add the parenthesis, you can see that the calculation is valid. So we get quick info from Tableau. Moving on to the next piece of information that we have: this one says "1 Dependency" with a small arrow. Let's click on that and see what we have here. It says changes to this calculation might change the following sheets: Sheet 1. Here, Tableau gives us a warning.
Anything that you change in the expression inside this calculation might have an effect on Sheet 1. And that's because we are using this calculated field in the view in Sheet 1. This is very important information, especially if you have different worksheets and you are using the same calculated field in several of them. This happens a lot, especially if you are focusing on the content of one view and you go and change the calculated field there. It's like a reminder, a warning from Tableau that tells you: all right, if you make this change, you can affect the following worksheets. My recommendation for you is to always go and check the dependencies, to make sure that the changes that you are currently making to the calculated field are still relevant for the other sheets. All right, so moving on, we have two simple buttons, Apply and OK. I don't have to talk about those, I think. Then we have a small arrow here, and this is very important, so let's go and click on that. What do we have here? This extension is the documentation, a catalog of all the functions that we have in Tableau. For example, let's go and search for the function UPPER that we used in this example. Search for "upper", and now we can see the documentation of this function on the right side. Here we have three pieces of information from Tableau. The first one is the syntax of the function. The syntax says it starts with the UPPER keyword, it accepts only one field, and the data type should be a string. The next piece of information is a short description of the function: it says it's going to convert a text string to all upper-case letters. The third piece of information is an example of use. It says, okay, if you apply UPPER to the value "product", all in lower case, the output, the result, is going to be "PRODUCT" in upper case. So here we have nice, short, quick descriptions of all the functions that we have in Tableau.
This is very useful, especially while you are writing calculations, because it doesn't make sense to memorize everything, right? I also tend to always check whether I'm using the correct syntax, or even whether I'm using the correct function. I always check the examples and say, okay, this is the one that I need. One more thing that you can see in this window is this drop-down menu. Here we have the different groups of functions in Tableau; for example, we have the group of string functions. If you go inside it, you get a list of all the functions that manipulate string fields. At the end, as you can see, we have the UPPER function that we used in our calculation. All right, okay, so with that we have covered all the options that you can see inside the calculated field window. All right, so that was an introduction to calculated fields in Tableau, and next we're going to learn the basic components of Tableau calculations.

129. Tableau | Calculation Components: Guys, so moving on, we're going to talk about the basic components of calculations in Tableau. That means what kind of information we can add inside the expressions, inside the calculations. The first thing that we can add inside a calculation is a comment. Comments are really useful for you and for others, to have some context or a small description of why you are doing the calculation. For example, in order to add a comment to this code, we can go to the start and type two forward slashes. Then we can write anything. Anything after the two forward slashes will not be executed in the calculation. For example, we can write here "calculation to change first name to upper case". Anything I'm writing over here will not be executed and will not be checked by Tableau either. I really recommend always adding comments, so that if you visit this calculation later, you understand why you wrote this expression.
All right, moving on to the second kind of information that we can add inside calculations: the fields from the data source. Those are the ones in orange. We have one over here, the first name. But let's just remove everything and start from scratch. If you want to add a new field inside this calculated field, you can start writing the field name. As I'm writing now, Tableau makes a list of suggestions. Here, Tableau suggests three things. The first one is a function. As you can see, there is a small icon, like an F; this indicates that this is a function. The second one says First Name, and beside it there is a data type icon. This data type icon indicates that this is a field name. The third one is First Name as well, with the icon, so that means it is a field. But here Tableau writes that this one is from the big data source, because those two fields have exactly the same name. Here, Tableau shows us that this field comes from a different data source. The first one comes from the same data source; that's why Tableau doesn't have to say, okay, it is from the small data source, because it is from the current one. But since the second one comes from a different data source, Tableau indicates that this is a different field from a different data source. Now, since we want the first name from the current data source, we can go and select this one over here. With that, we have inserted a field inside our calculation, and as you can see, it got the orange color. There is another way to add fields inside our calculations, and that is by drag and drop. Let's say that I would like to get the last name as well. I can go to the last name over here and drag and drop it inside the calculation, and as you can see, with that we got our second field, and again it has the orange color. And of course the fields that we add to calculations can be any fields. For example, let's go and add the sales.
The sales is a measure, so we go to the orders, to the sales, and we can just drag and drop it into the calculation. As you can see, Tableau accepts measures inside calculations as well, and they have the same color, the orange color. All right, moving on to the next, very important component: the Tableau functions. Tableau functions are built-in operators that can be used to manipulate, to transform, to change the content of a field. For example, what can we do with the sales? We can go and calculate the total sales inside our data. In order to do that, we can use the function SUM. Before the field sales, we start with SUM, then we have the opening parenthesis, and then we close it. As we can see, this component, the functions in Tableau, always has the light blue color. Now what happens? Tableau is going to sum up all the values inside the sales and present the result. Let's go and hit OK. We get an error here because we have changed the calculation, so let's go and remove the field and drag it to Text again. With that, we got the total sum of sales inside our data. Now let's go back to our calculated field and see the next component: the logical expressions. We can use logical expressions in order to check whether a condition is true or false, and they have the color black. For example, let's say that we want to create a calculation where we check the sum of sales: if it is higher than 1,000, then we want to see the value "High". Let me show you how we can do that. We're going to use the IF statement; it starts with the keyword IF. As you can see, it is black because it is a logical expression. If the sum of sales is higher than 1,000, so here we use the greater-than operator and 1,000, then what's going to happen? We're going to have the value "High". Then we end the logical expression with END. We can check over here that the calculation is valid.
Here we have our logical expression with IF, THEN, and END. Don't worry about the syntax; we're going to learn everything in the next tutorials, step by step, with very simple examples. All right, so now we're going to move to the last component that we can add to our calculations: the parameters. Parameters are dynamic fields that we can add to visualizations in order to make everything dynamic in the views or in the calculations. Again, there will be a dedicated tutorial for that later. But now let's see how we can add a parameter inside the calculation. First, we have to quickly create a parameter. In order to do that, I'm just going to close our calculation over here. Then we can go to the arrow in the Data pane, where we have Create Parameter, and click on that. Here we get the window for configuring the parameter. We're going to call it Choose a Number. That's it. Let's close it and say OK. Now on the left side we've got a new parameter. Right-click on it and choose Show Parameter. With that, we get an input field on the right side where we can enter a value. For example, we have it now as one; we can enter 1,000. Nothing happens in the view now because we are not using the parameter anywhere. But we're going to go and add this parameter inside the calculation. Let's go back to our calculation, My First Calculation, right-click on it, and then go to Edit. Now what we're going to do: instead of having 1,000, we're going to get the value from the parameter. We make a dynamic calculated field, so the user is going to control this value. Let's go and remove the 1,000. We're going to start writing the name of the parameter like any other field, so it's going to be "choose", and we get it over here, so click on that. With that, we have added our parameter inside the calculation. And as you can see, parameters in Tableau have the color purple. That's it for the last component.
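As a rough analogue of the calculation just assembled (which, in Tableau syntax, would read something like `IF SUM([Sales]) > [Choose a Number] THEN "High" END`), here is a minimal Python sketch. It only mirrors the logic: the function name, the product names, and the sales figures are invented for illustration and do not come from the course data.

```python
def sales_label(sales_values, choose_a_number):
    """Mirror of: IF SUM([Sales]) > [Choose a Number] THEN "High" END.

    There is no ELSE branch, so anything that fails the test gets no
    value at all (null in Tableau, None here).
    """
    if sum(sales_values) > choose_a_number:
        return "High"
    return None


# Hypothetical sales rows per product.
sales_by_product = {
    "Product A": [800, 700],   # sums to 1500
    "Product B": [300, 250],   # sums to 550
    "Product C": [100],        # sums to 100
}

# The parameter value is chosen by the user; here we try two settings.
labels_at_1000 = {p: sales_label(v, 1000) for p, v in sales_by_product.items()}
labels_at_500 = {p: sales_label(v, 500) for p, v in sales_by_product.items()}
```

Lowering the threshold from 1,000 to 500 moves Product B from no value to "High" without touching the formula, which is the reason for putting a parameter in place of a fixed number.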
And with that we have covered all the different components that can be used inside calculations. Now let's go and try the output. I'm going to go and hit OK. Then I'm going to remove this one; it's red. Let's get the products to the rows. Next we're going to get our new calculated field. This time it's going to be a dimension, because the output of the calculated field is going to be a string value. Let's check the results. As you can see over here, we have two products with the value "High"; the rest are null. Now let's go and get the sales in order to understand why those values are high. And that's because of our calculation: for anything above 1,000 we get the value "High", and anything below it is going to be null. And with the parameter, the users are controlling the calculation. If I go over here and say, okay, instead of 1,000 let's have 500, then we have included the other products as well. So all the products now have the value "High" in the calculated field. With that, we have generated new information for our visualizations. All right guys, now let's quickly summarize the components of calculations in this example. First, we can see the comment. This comment is going to help us document the purpose of the calculation, and it will not be executed; it is shown in gray. The next component is the field. Any field inside our data source, whether it's a dimension or a measure, can be added to our calculation, like this one, the sales, and fields have the orange color. The next component is the functions. They are the built-in operators for manipulating our data, and they have the blue color. The next component is the operators. In this example, we have two operators: the plus, the arithmetic operator, and the comparison operator, the greater-than. They have the black color. The next component, also in black, is the literal expressions.
Those are static values that we can insert inside our calculations. It could be a number, like the ten here, or it could be a string, like the "High" here. And here, don't forget to add the double or single quotation marks so that Tableau understands this is a value, not a field, a parameter, a function, or anything else. We can add date values as well. All right, moving on to the next component: the logical expressions. We have IF, THEN, and END, and they help us evaluate conditions inside Tableau and then decide whether they are true or false. And the last component that we have inside calculations is the parameters. They are the dynamic fields that we can use inside calculations. All right, so that's all about the components of calculations. With that, we have learned the main, basic components of Tableau calculations, and next we're going to learn how to nest one calculation inside another.

130. Tableau | Nested Calculations: So I'm going to talk about nested calculations in Tableau. In Tableau, you can nest calculations by using the result of one calculation as an input for another calculation. That's because sometimes you might be in a situation where you have complicated calculations with different steps. For each step, we can have one calculation. As you implement those steps, you're going to end up with multiple calculations, and they're going to be nested inside each other. Now let me show you an example. All right, so now we're going to create a new calculated field to manipulate the values of the field country so that they have a specific format. In this example, let's take the first name of the customers and the countries as well. Now we're going to create a new field for the country with a different format. Let's go and create a new calculated field.
Then we're going to start with the first calculation, where we make all the letters of the field country upper case, so we're going to use the UPPER function. We're going to manipulate the field country, so we start writing "country", and here it is, our field. That's it for the first calculation. Let's go and hit OK. With that, Tableau is going to create a new calculated field, a new dimension, inside our data source. So let's go and check the values. As you can see, all the letters in all the countries are upper case. All right, so now we're going to move to the next step in the transformation, where we want to show only the first three characters of each value inside this new calculated field. In order to do that, we're going to go back to our calculated field and edit it. This time we're going to use the function LEFT. You can go and search in the catalog to see the syntax of the LEFT function. As you can see, it accepts two arguments: the first one is the string that we want to manipulate, and then we have the number of characters that we want to show. Let me show you now, step by step, how we can do that. Let's first go to a new line. So we're going to have LEFT, and it needs two arguments: the field that we want to manipulate and the number of characters. The field that we want to manipulate is going to be the result of the UPPER function. It's going to be this one over here, so I'm just going to cut it and insert it over here. With that, we have the first argument. The second argument is the number of characters that we want to show. It's going to be three characters; that's why we specify three. This is how we can nest functions in Tableau. The first function to be executed is the one inside: the UPPER function is going to be executed first. Then the result of this function is going to be used as the input for the function outside, the LEFT function.
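The inner-first evaluation order described above can be mimicked in a few lines of Python. This is only an analogue of the nested Tableau expression (something like `LEFT(UPPER([Country]), 3)`); the helper names `upper` and `left` are stand-ins written for this sketch, not Tableau's own implementation, and the sample country is arbitrary.

```python
def upper(text):
    """Stand-in for Tableau's UPPER: convert a string to upper case."""
    return text.upper()


def left(text, number_of_characters):
    """Stand-in for Tableau's LEFT: keep the leftmost characters."""
    return text[:number_of_characters]


def country_code(country):
    # Python, like the lesson describes for Tableau, evaluates the inner
    # call first: upper(country) runs, then its result is passed to left.
    return left(upper(country), 3)


step_one = upper("Germany")      # inner function on its own
step_two = left(step_one, 3)     # outer function applied to that result
nested = country_code("Germany") # both steps in one expression
```

Writing the two steps separately and then as one nested call gives the same value, which is a quick way to convince yourself what the nested form is doing.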
That means first we're going to go and make all the values inside the country upper case. Then we're going to go and execute the LEFT function, where we show only the first three characters. Now let's go and hit Apply to check the results. With that, you can see we now have only three characters in the values of the country. Again, the function inside is going to be executed first, then the function outside. With that, you can further expand this calculated field with more functions. For example, let's say as a third step we want to go and calculate the length of the string. In order to do that, we can use the LEN function. We're going to add it at the start, and then the input of the function is the output of those two functions. As you can see, it's very easy to nest functions in Tableau. Let's go and hit Apply and check the results. As you can see, everywhere we have the length of three. Again, the order of execution starts with the one deep inside, the UPPER function, then the LEFT function. Then the last one to be computed is the LEN function. That's it. This is one method for creating nested calculations in Tableau, but there is another method for doing that. That's by creating a second calculated field that uses the first calculated field. Let me show you what I mean. We can go and close this one over here. And let's create a new calculated field. We're going to call it Second Calculated Field. What we're going to do inside it is use the output of the first calculated field. In this example, it is our new country field. This is our first calculated field. And then we're going to multiply it by two. Here again, for the order of computation, Tableau first has to calculate the first calculated field, that is UPPER, LEFT and LEN, and then at the end it's going to come over here and multiply the result by two. Let's go and hit OK. And with that we've got a new calculated field. Let's drag and drop it on the view.
As you can see, it has the value of six. Now, when do I use the first method, and when do I use the second method? All right, so I'm going to show you how I usually decide on this. Let's go to our first calculation. As you can see, we have those intermediate steps. If they are not important steps, meaning you don't want to use them in any other visualizations, then it doesn't make any sense to create a field in your data source for each intermediate step, because then the data source can explode and you're going to have a lot of fields that are not necessary. In this situation, I'm going to have all those intermediate steps in one calculation. Another scenario is where you have a very complex calculation, where the code is going to be very long and it's really hard to maintain everything in one calculation. There I try to split it into steps, and each step is going to have one field in the data source. The last scenario is where those intermediate steps are really important for something else, for different visualizations, or maybe for other calculations as well. In order not to repeat myself and do the same calculations over and over, I go and create a dedicated calculated field for each intermediate step, but only if they are important. All right guys, that's all for nested calculations. That was an introduction to calculations in Tableau. They are really important for making great visualizations. In the next video, we're going to learn more and more about calculations in Tableau. All right, so with that, we have learned how to do nested calculations in Tableau. And next I'm going to give you an introduction to the four types of Tableau calculations. We have the row-level, aggregate, table, and LOD calculations. 131. Tableau | 4 Types of Calculations: In Tableau, we have many different functions that we can use inside calculations, and we can categorize them into four different types of calculations.
In this tutorial we're going to talk about them. But first we can have a very simple example to understand how they work and how they interact with each other. So let's go. All right, now let's say that we have the following product table inside our data source, where we have information like the product prices, quantities, and so on. Those data are the original data that we can find inside the data source. Now let's say that we need a new field inside our data source to show the revenue. In order to do that, we can simply create a new calculated field that multiplies the prices by the quantities. With that, Tableau is going to go and create a new field inside our data source to store the result of the calculation. Tableau is going to go row by row, multiplying the price by the quantity. So for example, for the first row it's going to multiply 20 by two, and Tableau is going to go and store the result in the new field. Then Tableau jumps to the next row and does the exact same thing. So as you can see, Tableau is processing each row individually and independently from the others. When the calculation is happening on one row, we don't care about the information that is present in the other rows. Tableau focuses on only one row at a time. This type of calculations we call row-level calculations. And the level of detail we have here is the lowest one, the level of detail of the data source. It's very important to understand that this type of calculations is the only type that will not go and aggregate the rows of the data source. It is also the only type that can store its results in the data source. That means Tableau will not go and calculate the result of these calculations each time you use them in the visualizations. Instead, it precalculates the result and stores it in the data source. The calculation will not be done on the fly. All right, now let's move to the visualizations.
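To make the row-by-row idea concrete, here is a minimal Python sketch. It is not how Tableau works internally; it only mimics the behavior described above. Only the first row (price 20, quantity 2) comes from the lesson, the other rows are made-up sample data.

```python
# Hypothetical product table; only the first row (20 x 2) is taken from the lesson.
rows = [
    {"product": "P1", "price": 20, "quantity": 2},
    {"product": "P1", "price": 30, "quantity": 2},
    {"product": "P2", "price": 10, "quantity": 2},
]

# Row-level calculation: each row is processed on its own, and the result
# is stored next to the original columns, like a new field in the data source.
for row in rows:
    row["revenue"] = row["price"] * row["quantity"]

print([row["revenue"] for row in rows])
```

Note that the loop body never looks at any other row, which is exactly what makes this a row-level calculation.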
And let's say that I would like to show the total revenue of each product. For that, we can use the function SUM to summarize the values of the revenue, and we can go and add the dimension Product to the view. Tableau here is going to show only three rows in the view, one row for each product value. That means we're going to have P1, P2 and P3. Now this time Tableau will start summarizing and aggregating the rows in the data source, and that's going to happen at the level of the dimension. For example, Tableau is going to start with the first product, P1, and summarize the first two rows from the data source. We have 40 plus 60, so Tableau is going to write the output, 100, directly in the visualization. Then it's going to move to the next row. We have P2. Here we have only one row in the data source, and the sum of that is going to be 20. For the product P3, we have three rows in the data source. The sum is 40 plus 25 plus 15, so Tableau is going to have the answer 80 in the visualization. This time, as you can see, Tableau is not processing the rows of the data source one by one and individually. Instead, Tableau is going to go and summarize, or group up, the rows of the data source at the visualization level. This type of calculations we call aggregate calculations, and it's going to be calculated on the fly. That means the result of those calculations will not be stored inside the data source. And now it's very important to understand the level of detail of this new table that we have in the view. It has a lower level of detail than the data source, and the one who controls the level of detail is the dimension that we have in the view. The dimension that we use in the view is going to control the level of detail for the aggregate calculations. And that's why we have another type of calculations.
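The grouping described above can be sketched in a few lines of Python. This is only an analogy for SUM at the level of the Product dimension, using the revenue values from the example; the function name `sum_by` is made up for this sketch.

```python
from collections import defaultdict

# Revenue rows from the example: (product, revenue).
revenue_rows = [
    ("P1", 40), ("P1", 60),
    ("P2", 20),
    ("P3", 40), ("P3", 25), ("P3", 15),
]


def sum_by(rows):
    """Group the rows by their key (the dimension in the view) and sum the revenue."""
    totals = defaultdict(int)
    for key, revenue in rows:
        totals[key] += revenue
    return dict(totals)


# Computed on demand from the rows; nothing is written back to revenue_rows.
total_by_product = sum_by(revenue_rows)
print(total_by_product)
```

Six source rows collapse into three result rows, one per product, which is the lower level of detail mentioned above.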
Because of that, let's say that we have another scenario where you say, you know what, I would like to control the level of detail. I want my calculation to show the total revenue of each category. Here we can use different functions, like the FIXED function, so we can have FIXED, then the category, and then the sum of the revenue. With that we are telling Tableau: okay, find the total revenue, but this time it's going to be fixed. It's going to be connected to the dimension Category. So let me show you what happens. Tableau is going to go and check: okay, what is the category of P1? It is category A. Now the next question: what is the total revenue of category A? Here, Tableau sums 40 plus 60 plus 20, and the result is going to be 120. Here, Tableau will not show the total revenue of the product P1; instead, we are showing the total revenue of category A. The same thing happens for the next product. We have P2. It belongs to the same category, to A. The total revenue of category A is again 120. And then the last product, P3. It belongs to a different category this time, and the total revenue of that is going to be 40 plus 25 plus 15. The output is 80 as the total revenue for that category. Now, who is controlling the aggregation? It's no longer the dimension that we have in the view; instead, it's the dimension that we specify in the calculation. This type of calculations we call LOD expressions, level of detail expressions. Here it's the same thing as with the aggregations: it's going to happen on the fly, and nothing is going to be stored inside the data source. All right, now moving on to the last calculation type that we have in Tableau. Let's say that after I got the result in the view, I would like to calculate the rank of the products based on the data that is displayed in the view. In order to do that, we can use the function RANK on the sum of the revenue.
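Going back to the FIXED example walked through above, the following Python sketch mimics the idea under some assumptions: the label "B" for the second category is invented here (the lesson only says it is a different category), and the code is an analogy, not Tableau's actual evaluation.

```python
# Rows from the example: (product, category, revenue).
# The category label "B" is an assumption made for this sketch.
rows = [
    ("P1", "A", 40), ("P1", "A", 60),
    ("P2", "A", 20),
    ("P3", "B", 40), ("P3", "B", 25), ("P3", "B", 15),
]

# Step 1: total revenue per category, independent of what the view shows.
category_total = {}
for _, category, revenue in rows:
    category_total[category] = category_total.get(category, 0) + revenue

# Step 2: the view is still at product level, so every product row
# receives the total of the category it belongs to.
product_category = {product: category for product, category, _ in rows}
fixed_by_product = {
    product: category_total[category]
    for product, category in product_category.items()
}
print(fixed_by_product)
```

P1 and P2 both show 120 because the aggregation is controlled by the category named in the calculation, not by the product dimension in the view.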
What happens this time is that Tableau will not go and query the data source. Instead of that, Tableau goes and queries the visualization itself. It's like we are aggregating the aggregation. Based on the values that are displayed in the view, we can find that the product P1 has rank one, P2 has rank three, and P3 has rank two. This type of calculations we call table calculations. And unlike all other types, it is based on the context and on the data that is displayed in the view, and it will not go directly and query the data source. It is computed on the fly as well. That means the result will not be stored inside the data source. If you're talking about the level of detail, it depends on the visualization as well. It depends on the dimension Product. All right guys, so with that we now have the big picture of the four different types of calculations inside Tableau, and we can see how Tableau computes the calculations and presents the data at the end in the results. All right, so now we're going to start with the first type of calculations, the row-level calculations. Here we have a lot of functions under this category if you compare it to the other types. So here we have the number functions, string, date, and logical functions. There are a lot of functions, but we're going to cover them all in the next tutorials. So now let's go to Tableau and try a few of those calculations. Okay, so now back in Tableau, we're going to go to the small data source, and then we're going to go to the orders. As you can see, we have here the quantity and the unit price as well. Now we're going to go and calculate the revenue, where we multiply the quantity by the unit price. To do that, we're going to create a new calculated field in the data source, and this is going to be of the row-level calculation type. Let's go and create a new calculated field. We're going to go to the data pane and right-click in the empty space.
Create Calculated Field, and let's give it the name Revenue. And then the formula for this is going to be quantity multiplied by the unit price. Now you might ask me, where do I find in Tableau all the functions that are related to the row-level calculation type? Well, there's no specific place for that, but there is some orientation for it. So if you go to the documentation over here and check those groups, you will not find the types of the calculations directly, but you will find some groups that are similar to those types. For example, as you can see over here, we have Table Calculation. If you go inside it, you can find all the functions that we can use in this type. And then we have another group called Aggregate. Here you will find not only the aggregate calculations but the LOD expressions as well. The last type, the row-level calculations, is actually the rest. All the others, like number, string, date, type conversion, all of that stuff, are row-level calculations. All right, so now back to our calculation. Let's go over here and hit OK. And with that you can see that Tableau immediately created a new field in our data pane. Now as I told you, if you are using row-level calculations, Tableau can do the precalculation and store the results immediately in the data source. Let's go and check that. Either you can go to the data source page or we can go to this small icon over here that says View Data. Let's go inside and check the results. Here we have to switch to the orders. Now let's scroll to the right. You can see we have the original fields, the quantity and the unit price. But we have our new calculated field as well, which is like any other field that we have in the data source. We have the revenue over here.
And as you can see, Tableau immediately stored all the results of this calculated field in the data source, even though we haven't created anything yet in the visualizations. That means Tableau prepares it for you in the data source, and we can check the result. For example, here we have the quantity one and the unit price 215, so we get the same value. And here the value is multiplied by two. So as you can see, we are now multiplying the quantity by the unit price. And now we can see very clearly that row-level calculations are calculated and performed at the row level, individually and independently of each other. So the information that we have in the other rows will not affect the calculation of the first row. All right guys, so that's it. This is how row-level calculations work in Tableau. Okay, so now we're going to move to the next type of calculations, the aggregate calculations. Here we have only a few functions if you compare it to the row-level calculations. We have max, min, average, count, count distinct and attribute. Again, all of those are going to be covered in detail in the next tutorials, but now we're going to go to Tableau and try a few of them. All right everyone, so now we're going to go and build a view where we have the total revenue by product. In order to do that, we're going to go and get the product name from the small data source and put it in the view. Now it's really important to understand the concepts. The product name is the dimension that defines the level of detail in the visualization. That means in this view we have five rows, and this is completely controlled by the product name. Now I want you to understand how to pick which type of calculation we're going to use. To answer this, we always start with the first question: do we have to aggregate the data? Since the task says total revenue, that means there is an aggregation, a summarization.
Well, that means we cannot use row-level calculations, so we have to use one of the other types that aggregate. We are left with three types. Now the next question is going to be: do we have all the data in the view? Well, as you can see in our table, we have only the dimension information. We don't have anything about the revenue. That means no, we don't have all the data inside the view. So we will not use the table calculation type, because table calculations always depend on the view. If you don't have the data in the view, you cannot use table calculations. With that we are left with two options: either we use the aggregate calculations or the LOD. Well, the last question you can ask is: does the level of detail that we have in the view fulfill my requirement? Well, in this example, yes, because we want to have the total revenue by product. So we are talking about the products, and the dimension that we have inside the view exactly fulfills the level of detail. That means we can stay with the level of detail that we have inside the view and we don't need to use any LOD expressions. If you follow those three simple questions, you can easily identify which type of calculation you need to solve your task. In this example, it is the aggregate calculation. Let's see how we can do that. Since aggregate calculations are the default method in Tableau for aggregating any data or any measure, it's going to be really easy to create. So all we need is the revenue, so just drag and drop it here on top of those numbers. And with that Tableau is going to immediately create an aggregate calculation. We can see it over here: the sum of the revenue. That's because it is the default method for aggregating data. Tableau goes through each product inside the data and starts aggregating all the revenues that are related to this product. Now the next step, what I usually do, is go and validate some examples.
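The three questions used above to pick a calculation type can be written down as a small decision function. This is just a Python sketch of the instructor's rule of thumb; the function and parameter names are invented for the illustration.

```python
def choose_calculation_type(needs_aggregation: bool,
                            data_already_in_view: bool,
                            view_lod_matches: bool) -> str:
    """Apply the three questions in order and return the suggested type."""
    # Question 1: do we have to aggregate the data?
    if not needs_aggregation:
        return "row-level calculation"
    # Question 2: is all the needed data already in the view?
    if data_already_in_view:
        return "table calculation"
    # Question 3: does the level of detail in the view fit the requirement?
    if view_lod_matches:
        return "aggregate calculation"
    return "LOD expression"


# Total revenue by product, with only the product name in the view:
print(choose_calculation_type(True, False, True))
```

For the task above the answers are yes, no, yes, so the function lands on the aggregate calculation, the same result the three questions gave.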
I go and pick some of those products and start summing up the values to check whether the value that I'm seeing in the visualization is correct. Let's go and create a new sheet. Here we want to go to the lowest level. In order to do that, we're going to take the order ID to the view. Let's take the product name now. We can take the categories as well. Then let's take the revenue and put it on the Abc column over here. Let's make it a little bit bigger in order to see the names, and then we can go and sort the product names. So now we can pick any of those products in order to validate the answers. Let's take the LG Full HD monitor. As you can see, the total sum should be more than 3,000. Let's go back to our aggregation and check the LG Full HD. You can see it is above 3,000. That means everything is fine. And with that, we got the total revenue by product. Of course, we have done this the quick way, where we drag and drop the field to the view. But if you want to do it as a calculated field in order to reuse it later in different sheets, we can go and create a new calculated field. Let's call it Total Revenue. And then we're going to have the same syntax, the sum of the revenue. This time we're going to use a nested calculation, since we already have the revenue in another calculated field. Let's go and click on that. And as the calculation is valid, let's hit OK. With that we got a new measure in our data pane. So if you go and replace it, you will get the exact same results. As you can see in the result, nothing changed. The only advantage of this is that you can reuse it in different sheets and in different workbooks as well. All right guys, that's all for the aggregate calculations in Tableau. All right guys, the third type of calculations in Tableau is the LOD calculations, or level of detail expressions, and here we have only three Tableau functions. We have FIXED, INCLUDE and EXCLUDE. Now let's go to Tableau and create one of those functions.
All right, now we have the following task, where we want to show the total revenue but using the same view. So we're going to stay with the same information. We're going to have the product name and the total revenue by product. But I want to see, side by side, the total revenue by category. Let's go again through the three questions. The first question is: are we doing aggregations? Well, yes, so that means we cannot use row-level calculations. Then the next question is: are the data that we have in the view enough? Well, they are not. What we have here is not the total revenue by category, it's by product. Well, that means we cannot use table calculations. Now we come to the last question: does the level of detail in the view support me in solving the task? Well, the answer is no. That's because the level of detail inside the view is now defined by the product name, and it has a higher level of detail than the category, while we want to have the total revenue by category. The level of detail that we have in the view will not support me. That's why I cannot use aggregate calculations here, and I have to go and use LOD expressions. As you can see, these are very simple questions, and they move you exactly to the right type of calculation in Tableau. And now you might say: wait, wait, wait, I can go and add the category information to the view, and then I have the level of detail of the category. Well, this will not work, and that's because the product name has a higher level of detail. Let me show you what happens if you bring in the category. So let's go and drag the category to the right side of our product name. Here you can see nothing is going to change. We still have the five rows, and that's because of the product name. Even if you move it to the left side over here, we don't have two rows here. We have five rows. If you check the details over here, we have five marks. So that's why even if you add the category, nothing is going to change.
We are still at the product level of detail. Now let's go and create a new calculated field to use the LOD expressions or calculations. Let's go to the left side and create a new calculated field. We can call it Total Revenue by Category. And don't worry about the syntax, we're going to learn it in a separate tutorial. So it's going to have the following syntax: FIXED, then we have to specify the dimension that's going to control the level of detail of the result. It's going to be the category. And then, since what we are doing is aggregating the revenue, we have to add here the sum of the revenue. And then we have to close it. It says the calculation is valid and everything is fine. Let's go and hit OK. As usual, we're going to get a new calculated field in our data pane over here. Let's get the result, and let's drag it over here to see the data. We can see for each row the total revenue by category. For the first one, it's going to be the total revenue of the accessories. The second one is the same because it belongs to the same category. The third one is the same, but the fourth one, as you can see, belongs to a different category, and that's why we get a different number. That's it. This is why we need LOD calculations in Tableau. Now we're going to move to the last type of calculations that we have, the table calculations. Here we have only a few functions as well. So we have the running functions, window, rank, first, last, index, lookup, and so on. Again, we are going to have a dedicated tutorial for all of that, but now let's go and try one of them. All right everyone, so now we're going to move to the last task for this view: we want to show the running total of the revenue by product. Here we're going to ask the three questions again. Are we aggregating? Well, yes. Because we are building the running total of the revenue, we cannot use row-level calculations. The next question is: are the data that we have in the visualization enough to solve this task?
Well, yes, that's because we have the total revenue by product in the view. Based on that information, we can build up the running total of the revenue by product. So we actually have everything in the view in order to solve the task, and that's why we're going to go and use the table calculation type. And we will not bother with the third question, whether it's an aggregate calculation or an LOD, because it is a table calculation. So let's go and create a new calculated field. We're going to call it Running Total Revenue. The syntax for that is very simple as well. We start with RUNNING, then we have to select the aggregation type, which is going to be the sum, so RUNNING_SUM. And then we have to go and specify which data are going to be calculated inside the table calculation. Here we have only two pieces of information, so either we use the total revenue or the total revenue by category, the LOD. But we are talking about the total revenue by product, and that's why we include it over here. That's going to be the sum of the revenue, and that's it, the calculation is valid. So let's go and hit OK. And we're going to take our measure and put it on the view as well to check the results. With that we can see very nicely the running total of the revenue. It's very simple. It starts with the first value from the total revenue. Then the next value is based on the previous value plus the total revenue. Those two values are added to each other in order to get this value. Then the next one works the same: the previous value plus the current total revenue. As you can see, we have nothing here. That's why we are getting the same value. As you can see, as we move down, we are adding more total revenues to the total number. Now, it's very important to understand that table calculations are very sensitive to the data that is displayed in the view. With any change to this structure, we're going to get different numbers at the output.
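Here is a small Python sketch of a running total and of how it reacts to the order of the rows in the view. The per-product totals are sample numbers picked for the illustration, not the values from the workbook, and `running_total` is a helper written only for this sketch.

```python
from itertools import accumulate

# Hypothetical totals per product, as they would already be displayed in the view.
total_revenue = {"P1": 100, "P2": 20, "P3": 80}


def running_total(products):
    """Return (product, running total) pairs for the given display order."""
    values = [total_revenue[product] for product in products]
    return list(zip(products, accumulate(values)))


# Same data, two different sort orders in the view.
ascending = running_total(sorted(total_revenue))
descending = running_total(sorted(total_revenue, reverse=True))
print(ascending)
print(descending)
```

The totals per product are identical in both cases, yet the running values differ row by row, because each step builds on whatever row is displayed before it.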
This is not the case for the aggregate or the LOD calculations. Let me show you what I mean. For example, let's go and just change the sort order of the data in the product name. Let's go over here and make it descending. You can see that for the aggregate calculation and the LOD, the values are the same. Only the sort order changed. But the values of the table calculation changed completely, because we now have a different sort order and Tableau recalculates the running total based on the view. That means any interaction in the visualization can affect the table calculation functions. They are completely based on the view. That's it for now. This is about table calculations in Tableau. All right guys, now we're going to talk about the order of computation of those different calculation types that we have in Tableau. Let's say that we have the following calculation, and it's very similar to the nested calculations. Here we have different types. We have the RANK as a table calculation, we have the SUM as an aggregate calculation, and we have the quantity multiplied by the price as a row-level calculation. The first thing to be executed is always the row-level calculation, so the first one is going to be quantity multiplied by the price. Then the second type to be executed in Tableau is the aggregate calculation. It's going to be the SUM function. And the last type of calculation to be executed in Tableau is going to be the RANK function, the table calculation. Again: row-level calculations first, then the aggregate calculations, and always last, the table calculations. Okay, now let's go and quickly recap how to choose the right calculation type. Here we have three questions. We start with the first one: do you have to aggregate the data? If no, then go and use the row-level calculations. We are at the row level. If yes, then we jump to the next question: is all the needed data already available in the visualization?
If yes, then we can use table calculations. If no, then we have the third question here: does the level of detail in the visualization match the question or the requirement? If yes, then we can use the aggregate calculations. If no, we can go and use the LOD expressions or calculations. If you follow my decision tree, you can simply find an answer for that. All right, with that you now have an overview of the different types of calculations that we have in Tableau. Next, we're going to do a deep dive into each of those types, and we will start with the row-level calculations. Here we're going to cover a lot of functions in Tableau that are very important for doing data manipulations and transformations and for generating new information that you need for your visualizations. 132. Tableau | Number Functions: CEILING, FLOOR, ROUND: So now we're going to start with the first type of calculations, the row-level calculations. And in this tutorial we're going to cover the number functions in Tableau. The main purpose of the number functions in Tableau is to manipulate and transform numerical values, so we can use them on any field with the data type number. And the most important use case for the number functions is to simplify the numbers. Here we have three functions. We have CEILING, FLOOR and ROUND in order to round the numbers to a simpler form. As usual, first let's understand the concept behind them, then we can practice in Tableau. Let's go. All right, so now let's say that we have the following scenario. We have built a view from the subcategories and the sum of sales. Now if you take a look at those numbers, you can see that they are large numbers with a lot of fractions, a lot of details. We have three decimals over here. Those details make it really hard to read those numbers in the view. Instead of that, we can round those numbers to make them easier to read and hide those small details that are unnecessary here.
If you take the rounded sales, you can see that the numbers are now shorter. We rounded all those fractions, all those decimal numbers. With that you can see, if you compare the right to the left, it's easier to read, right? Now let's learn how this works. Each decimal number, like for example 1.4, always has two integer neighbors. Think about it like a room: it has a ceiling and a floor. In this example, 1.4 has the ceiling of two and the floor of one. Here, we might be in a situation where I don't want to deal with those details, with those fractions. I would like to have a whole number, two or one. Here we have exactly two options. Either we move it to the ceiling, to the higher number, or we move it to the floor, to the lower number. If you decide to use the CEILING function, the number is going to be two. What we are doing here is rounding up the number to the higher value, to the ceiling. Or we move it to the floor. That means we are rounding down the number: the FLOOR function is going to round down 1.4 to one. Now you might say, you know what, I don't want to decide whether it's going to go to the ceiling or to the floor. I would like to have it automatic. It should go to the nearest integer, and here we can use the ROUND function. Let's have the following example. Let's say we are at 1.3. If you use ROUND, we go to the nearest neighbor. The nearest neighbor is one, so ROUND is going to move the value to one. But now let's take another value, 1.7. Here, the nearest neighbor is not the floor. It is the ceiling. It's nearer to two. If you use the ROUND function, it's going to convert it to two. Now let's say that our value is exactly in the middle, at 1.5. What happens to the value if I use ROUND, given that it has exactly the same distance to the ceiling and to the floor? What happens is that it's going to be rounded up to the ceiling.
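The room picture above (ceiling, floor, nearest neighbor) can be tried out with a short Python sketch. It only mirrors the behavior described in the lesson; `round_half_up` is a helper written for this example, because Python's built-in `round` sends exact halves to the nearest even number instead of rounding them up.

```python
import math
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value: float) -> int:
    """Round to the nearest integer; exact halves go up, as described in the lesson."""
    return int(Decimal(str(value)).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# 1.4 sits between its two integer neighbors, 1 (floor) and 2 (ceiling).
ceiling_value = math.ceil(1.4)   # move up to the ceiling
floor_value = math.floor(1.4)    # move down to the floor

# Nearest neighbor for the three values discussed above.
rounded = [round_half_up(x) for x in (1.3, 1.7, 1.5)]
print(ceiling_value, floor_value, rounded)
```

The helper goes through `Decimal(str(value))` so that the midpoint case is decided on the decimal digits as written, not on the binary floating-point representation.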
We have to have only one value, so the ROUND of 1.5 is going to be two. As you can see, this is how those three functions work. Always think about it like a room: you have a ceiling and a floor. All right, now let's compare the three functions side by side. We're going to start with the ceiling. CEILING is going to round up the numbers. The syntax in Tableau is going to look like this: CEILING, and it accepts only one argument, the original number. For example, the ceiling of 1.2 is going to be two, the ceiling of 1.8 is going to be two, and the ceiling of 1.5 is two as well. We are always going to the higher number. Let's move to the next one. It's going to be exactly the opposite: FLOOR is going to round down the numbers to the lower value. The syntax here is FLOOR, and it accepts only one number as well. The examples are: the floor of 1.2 is one, of 1.8 is one, and of 1.5 is one as well. We are always going to the lower number. Now let's go to the last one. We have ROUND, which rounds the numbers to the nearest integer. The syntax for that is going to be a little bit different. We have ROUND, then the original number, then we have the number of decimals, which is optional, of course. Here we can decide whether we're going to see, for example, one decimal or two decimals. And if you leave it empty, it's going to round to a whole number. Now let's go to the examples for the same numbers. If you round 1.2, it's going to go to the floor; the nearest is one. If we round 1.8, the nearest is the ceiling, so it's going to go to two. If we round 1.5, exactly the middle, it's going to be rounded up to the ceiling, so we have a two. That's it. This is how the three functions work. Now let's go back to Tableau and start. All right guys, back in Tableau. Let's now create a view where we're going to show the orders with the sales. We're going to stay with the small data source. Let's take the order ID, put it on the rows, and let's drag the sales to the view. As you can see, the sales don't have any fractions.
And that's not because the numbers are rounded; it's just that the format is different. In order to show the real values, we have to change the format. To do that, we're going to go to the measure sales over here, right click on it and go to Format. Then we're going to go to the left side, where we have Numbers. Let's click on this menu and change the number format. Once you do that, you can see that we have the raw data as we have it in the data source. Now we want to round those numbers to make them simpler to read in the view. In order to do that, we have the three functions, and we can start with the ceiling. Let's close this over here and create a new calculated field. Right click over here in the white space, Create Calculated Field. We're going to call it Sales Ceiling. The syntax is really easy: it starts with the keyword CEILING, and then inside it we have to have our field, the number. Our field is the sales, and as you can see, the calculation is valid. Let's hit OK. As you can see, we now have the new calculated field in the data source. Let's bring it to the view. Let's go and drag it over here. As you can see, now we have our new field. Let me just make it a little bit bigger, and all those values are rounded. Let's take the first value. We have 215.88. As we are rounding up, we're going to go to the next higher value, which is 216. Everything is fine. Let's check this one over here. We have 56.11. As we are rounding up, we're going to go to the next integer, which is 57. Everything is fine and the ceiling function is now working. All right. Next we're going to do exactly the opposite. We're going to round down the numbers to the floor. We're going to create a new calculated field and call it Sales Floor. The syntax is as well really easy. The keyword is FLOOR, and our value is going to be the sales. The calculation is valid. Let's click OK. Our new field is already in our data source. Let's bring it to the view.
The first value was 215.88. As we are rounding down to the integer below it, it's going to be 215. This value over here is 56.11. As we are going to the floor, it's going to be 56, so everything is fine. And as you can see, it's exactly the opposite of the ceiling. All right, so next we're going to round the numbers automatically to the nearest neighbor using the round. We're going to create the third calculated field and call it Sales Round. The function is really easy. It starts with ROUND and it accepts two arguments. The first one is a must; it's going to be our number, the sales. The second one is optional, in case we want to decide on the number of decimals. Here we don't want to use it, so we're going to leave it as default. We don't need any decimals or fractions, so we're going to leave it like this, just the sales, and that's it. As you can see, the calculation is valid, and now we have our third calculated field as well in the Data pane. Let's just bring it to the view and check the values. Now, the first value, 215.88, is near the ceiling; that's why the round is going to take it to 216. The next one was 56.11. It's really near the floor; that's why Tableau, or the round function, is going to take it to 56. As you can see, everything is fine and the numbers are moving to the nearest neighbor. All right, now let's say that we want to see the sales in our view, but with only one decimal, not two decimals like here in our example. In order to do that, we can round those numbers to only one decimal using the round function. Let's create a new calculated field. Let's call it Sales Round 1. We're going to use as well the same keyword, ROUND. The number is going to be the sales. Then we're going to define how many decimals we want. In this example, we want only one decimal, so we're going to type here one. That's it. As you can see, the calculation is valid. Let's click OK.
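The optional decimals argument of the round can be illustrated with the two sales values quoted in the lesson. This is a Python sketch of the idea, not Tableau code; round_to is a hypothetical helper, values are passed as strings so that no binary floating-point noise enters the example, and ties are assumed to round up as described earlier.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_to(value: str, decimals: int = 0) -> Decimal:
    """Round a decimal string to the given number of decimals, ties going up.

    Hypothetical helper: decimals=0 gives a whole number, decimals=1 keeps
    one digit after the decimal point, and so on.
    """
    step = Decimal(1).scaleb(-decimals)  # 0 -> 1, 1 -> 0.1, 2 -> 0.01
    return Decimal(value).quantize(step, rounding=ROUND_HALF_UP)


for sales in ("215.88", "56.11"):
    # whole number first, then one decimal
    print(sales, round_to(sales), round_to(sales, 1))
```

Leaving the second argument at its default mirrors the Sales Round field, and passing 1 mirrors the field that keeps a single decimal.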
And here we have our new field. Let's bring it to the view. Now you might say, you know what, nothing changed. We still have everything rounded to a whole number; there are no decimals. Well, that's about the format. Let's go and change that. We're going to go over here, right click on it and then go to Format. Here we're going to switch it to the standard. Once we do that, as you can see, now we have only one decimal. We don't have two decimals like the sales, the original field in our data source. But now you might say, okay, maybe the round as well has decimals. So let's check the format. We're going to go to the round over here and click Format. Now if we switch to the standard, as you can see, nothing is changing. That means we really don't have any decimals; we have only a whole number. All right, so now you might ask me, when do I use ceiling and when do I use floor? Well, there is no rule for that. It really depends on the use case and on the requirement. For example, if I'm building a dashboard for budgeting, to plan a budget, I would always go with the ceiling to make sure that I'm not forgetting anything and I'm not short in the budget at the end. In this use case, I always tend to use ceiling and never floor or round. It really depends on the requirement and the use case. As you can see, those three functions really make the visualizations simpler and easier to read. All right everyone, so far we have learned how to simplify the numbers in Tableau using the three number functions: ceiling, floor, and round. And that's it for the first group, the number functions. Next we're going to learn the string functions in Tableau. 133. Tableau | Change Cases: LOWER & UPPER: Now we're going to focus on the second group of functions in Tableau. Under the category of row-level calculations, we have the string functions.
The main purpose of the string functions in Tableau is to manipulate and transform text values, any field in our dataset with the data type string. There are many use cases and reasons to use string functions in Tableau. For example, we can use them to clean up our data and bring our text to a standard case: we can change the case to either lower or upper. The next use case is as well about cleaning up our data in Tableau, by removing any unwanted spaces. Here we have three functions: LTRIM, RTRIM, and TRIM. Moving on to the next group or use case, we have here three functions to extract a specific substring from a text: LEFT, RIGHT, and MID. The next use case is to search for specific patterns. Here we have five functions: STARTSWITH, ENDSWITH, CONTAINS, FIND, and FINDNTH. Then we have another use case for the string functions, to combine and split data inside Tableau. Here we have the concat operator and as well the SPLIT function. The last use case is to replace a specific substring with another substring. Here we have the function REPLACE. As you can see, we have a lot of string functions and tools to manipulate, transform, and clean up the text values in Tableau. Now we're going to start with the first use case of the string functions: how to clean up our data and bring our text to a standard case using the two functions, LOWER and UPPER. But as usual, first we have to understand the concept before we start practicing in Tableau. Let's go. All right, now let's check the following data quality issue in our view. If you check the dimension products over here, we have three values for the same word: we have keyboard three times in the view, which is really wrong. That's because the data quality of the source system where we get the data from is simply low. This happens if you have a lot of people working in big projects and you have a lot of products. They may enter different names for the same product.
Here we have a case issue in the product name. What I usually do in my projects is contact the source systems and tell them about the data quality issues that they have. But sometimes it may take a long time until they fix them. In the visualization, we can go and fix and clean up those things ourselves. In Tableau, we have a lot of tools and functions to manipulate and clean up the dimensions. For example, we can use the upper or the lower function in order to bring a standard to the values. If you use the lower, we get the following result: in this example we have only three products in the visualization, and all those three values are aggregated for the quantity in only one row, which is correct. Now if you compare the first view with the second view, you can see that we have improved the data quality in the visualizations. Now let's understand how those two functions work. Let's have the following example about the customer's name. The names could be written like this: the first character of the first name and the last name is capitalized, or everything is in upper case, or the opposite, where everything is in lower case. You can see we can write the customer's name in different cases. Now in Tableau, we have to bring those names to a standard, and we have two ways to do that: either we bring everything to lower case or to upper case. Now, if you decide to go with the upper case for the customer's name, what happens? The first customer is converted completely to upper case. The second customer is already in upper case, so nothing happens; it stays the same. The third one is in lower case, so it is converted to upper case. But if you want to go with the lower case for the customers, this is what happens. The first customer is converted to lower case. The second one is converted as well, from upper to lower. For the third one, nothing happens because it's already in lower case.
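The effect of forcing one case before aggregating can be sketched in a few lines of Python. This is an illustration of the idea rather than Tableau syntax; the rows and quantities are made-up sample data, and total_by_product is a hypothetical helper.

```python
from collections import Counter

# Made-up sample rows (product name, quantity); not taken from the course file.
rows = [
    ("Keyboard", 2),
    ("keyboard", 1),
    ("KEYBOARD", 3),
    ("Monitor", 4),
    ("Mouse", 5),
]


def total_by_product(rows, normalize=None):
    """Sum quantities per product name, optionally normalizing the name first."""
    totals = Counter()
    for name, quantity in rows:
        key = normalize(name) if normalize else name
        totals[key] += quantity
    return dict(totals)


raw = total_by_product(rows)                # case-sensitive: five separate products
lowered = total_by_product(rows, str.lower)  # one row per real product
uppered = total_by_product(rows, str.upper)
print(len(raw), len(lowered), lowered["keyboard"], uppered["KEYBOARD"])
```

Either normalization produces three products; the only difference is how the labels look, which is the design question the lesson returns to later.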
As you can see, with these functions we are forcing the names to be either upper or lower, so we bring a standard to the visualizations. Now we're going to compare those two functions. We start with the upper. It's going to convert the characters to upper case. The syntax in Tableau is the following: it starts with the keyword UPPER and it accepts only one field, the string. The output is as well a string. For example, if we take UPPER of Maria, where the first character is capitalized, the output is the string MARIA in upper case. Now let's go to the lower. It's exactly the opposite, so it's going to convert the characters to lower case. The syntax is similar: here we have LOWER, then one field, the string. The output is as well a string. The example here is LOWER of Maria: Maria comes out in lower case. Those two functions are simple and easy to use, but still they are very important. I tend to use them a lot in my projects to clean up the data. Now let's go back to Tableau and start. All right, for those two functions I have prepared an extra file with low data quality in the product names. In order to connect this file, we have to create a new data source. Let's go to the data source page over here and create a new data source. Then we're going to go to the text file. You can find it inside the small folder. We have here a CSV file called products low quality. Let's connect it. It's only one table, and if you check the data grid over here, you can see we have problems in the product. You can see we have here keyboard in upper case, keyboard in lower case, or with the first character capitalized. So now let's go back to our sheet and start checking the data from there. Let's go to the Data pane and make sure we are selecting the new data source. We have here product one. Here we have the case issue, so let's bring it into the view and check the values.
As you can see, we find five products, but in reality we have only three, right? Here we have the keyboard three times, then monitor and mouse. We should have only three: keyboard, monitor, and mouse. We have a data quality issue in the product names. Tableau is case sensitive, so it presents the data exactly as it is in the source system. Let's take the quantity and put it on the columns. As you can see, those three values are not aggregated together, since Tableau thinks those are three different products. Let's show the values here in the labels. Let's take it to the color as well. So now we're going to clean up the data using the lower function. In order to do that, we have to create a new calculated field. Let's go to the Data pane over here, right click on the empty space, Create Calculated Field. We're going to call it Products Lower. It starts with the keyword LOWER and it accepts only one value, the string. So we're going to have product one, and that's it. As you can see, the calculation is valid and the output is going to be a string, the product. Let's hit OK. Now if we check the Data pane, we have here our new dimension, the calculated field. Let's bring it to the view, on the rows, to start comparing the values. The first one, as you can see, is capitalized. The output is going to be keyboard in lower case. The next one is already lower case; nothing is going to change. The third one is completely upper case in the original data, but the output is lower case. As you can see, we have all the names here in lower case. Now if you remove product one over here, you can see we end up having only three values, only three products, which is correct. With that, we have cleaned up the data using the lower case. Now let's clean up the data again, this time using the upper function. We can do the same. We're going to create a new calculated field. Let's call it Products Upper.
We're going to use the function UPPER over here. It accepts only one field, our product, product one. And that's it, the calculation is valid. Let's click OK. Now if you check the Data pane, we have a new calculated field, a new dimension. Let's bring it to the view and start comparing the values. I can bring as well the original field. The first one is capitalized; as you can see, the output is in upper case. The second one is completely lower case, and the output is as well completely upper case. For the third one, nothing is going to change. As you can see, all the values are now in upper case. Now I'm going to remove the others to see the final result. As you can see, we have only three products in the visualization, which is correct. With that, we have fixed the data quality using the upper function. All right, so now you might ask me, should I use lower case or upper case in my views? Well, if you're asking an IT guy like me, I'm going to answer like this: it depends. It depends on the fields that you are using in the views. Let's have the following example. Here we have two views: the left one with the product names in lower case, and the second one in upper case. If you take a look at those two views, which do you think is easier to read? If you have a normal text or a long text, like the product's name, the customer's name, and so on, it's always better to use lower case. Lower case is easier to read compared to upper case. Upper case takes as well more space, it's more aggressive, and it's really hard to read. So for this scenario I would recommend you use lower case. In modern design they tend to use lower case since it provides a more sleek and minimalist look in websites and in the look and feel of the visualizations. So lower case is easier to read and more modern. Upper case, by comparison, is hard to read, and it's like someone is shouting. Let's take now another example.
We have here an aggregation for the country abbreviation. Here we have it in lower case and as well in upper case. This time, if you compare them, you can see that maybe it's better to use upper case. That's because the abbreviations are very short, with a maximum of maybe three characters. They are really hard to see in the visualizations; they are really small. If we have them in big characters, they are easier to read. With abbreviations, I always tend to use upper case. If abbreviations are written in upper case, they bring a standard and they avoid misinterpretation of the data. If you look at the right side over here, you can understand immediately: okay, here we are talking about countries. But if you are on the left side, you might get confused. For example, are we talking about the USA or the word us? The same goes for Italy. Is it the word it that we use in sentences as a pronoun, or is it the abbreviation of Italy? If you write it in lower case, you might introduce some misunderstanding and misinterpretation. For the abbreviations, I always tend to use upper case. It's clearer and easier to read for short names. That's why the answer that comes from the IT guy is: it depends. It depends on the use case, the requirements, and so on. Sometimes we go with the lower, sometimes we go with the upper. 90% of the time I go with lower case for the names and so on, and only for the abbreviations I go with the upper. With that, you have at least some orientation for your visualizations. All right, so that's all about how to clean up the data by bringing our text to a standard case using the two functions, LOWER and UPPER. Next we're going to talk about the three functions LTRIM, RTRIM, and TRIM. 134.
Tableau | Remove Spaces: LTRIM, RTRIM, TRIM: All right, so now we're going to talk about another set of string functions in Tableau, to clean up our data by removing unwanted spaces using the three functions LTRIM, RTRIM, and TRIM. And of course, as usual, we have to understand first the concept behind them and then we're going to practice in Tableau. So let's go. All right, so now we have the following scenario, where we have, again, bad data quality in our view. If you check the products, we can see that we have the keyboard four times. So what is going on? We have here no case issue: all of them are capitalized in the first character, so there is no lower case or upper case problem. Everything is fine. Why didn't Tableau aggregate all those values in one row, in one product? Because we should have here only three products. So what is going on here? What happened? Well, we have dirty spaces in the product name. In the keyboard, there are unwanted spaces. It's really hard to see in the visualization; everything looks fine, right? But there are spaces inside the keyboard values and we have to remove them. Now, in order to clean up the data and remove those dirty spaces, we can use one of the three functions LTRIM, RTRIM, or TRIM. If you apply those functions to the product name, we're going to get a result like this: only three products, and everything will be fine. Let's understand how those functions work. Let's have the following simple examples. Let's say that we have the word monitor, but on the left side we have a white space. In order to remove it, we can use the Tableau function LTRIM. LTRIM is going to remove any unwanted spaces from the left side of the word. Now we might have the opposite situation, where we have the monitor, but on the right side there is a white space. In order to remove those spaces, we can use the Tableau function RTRIM. RTRIM is going to remove any spaces from the right side of the word.
Moving on to the third scenario, we have the same word monitor, but this time there are white spaces on the left and on the right. In order to remove those spaces, either we can use both of the functions LTRIM and RTRIM, or we can use the third function, TRIM. If you use the TRIM function in Tableau for this scenario, it's going to remove all the white spaces from the left side and as well all the white spaces from the right side. All right, so now we're going to quickly compare those three functions. LTRIM removes any leading spaces, RTRIM removes any trailing spaces, and TRIM removes both of them, the leading and the trailing spaces. The syntaxes in Tableau are really simple. For example, we have here the LTRIM keyword. It accepts only one string field, and the output is going to be a string value. For example, let's say we want to left trim this value: we have Maria, with a white space on the left side and as well on the right side. If you use LTRIM, it removes only the leading spaces. So it just removes the space from the left and leaves the space that we have on the right, because it's only left trimming. Let's go to the next one. It's exactly the opposite, but the syntax is almost the same. We have RTRIM; it accepts the string field, and the output is going to be as well a string value. If we stay with the same example, it's going to remove only the trailing space. The space on the left side is going to stay in this example. Now let's move to the last one. I think you already got it. We're going to use only TRIM here, not left or right, so both of them. It accepts as well a string field, and the output is going to be a string value. The example is the following: Maria with the left and right spaces. What happens? We're going to remove the left space and as well the right space.
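The comparison above maps closely onto Python's string methods, which makes for a quick sketch of the behavior. This is not Tableau syntax; passing " " to the methods restricts them to plain space characters, which is an assumption made here to match the "spaces" wording of the lesson.

```python
value = " Maria "  # one leading and one trailing space, as in the lesson's example

left_trimmed = value.lstrip(" ")   # leading spaces removed, trailing space kept
right_trimmed = value.rstrip(" ")  # trailing spaces removed, leading space kept
trimmed = value.strip(" ")         # both sides removed

# repr() makes the remaining spaces visible in the printed output
for label, text in [("ltrim", left_trimmed), ("rtrim", right_trimmed), ("trim", trimmed)]:
    print(label, repr(text), len(text))
```

Chaining the two one-sided calls gives the same result as the two-sided one, which is the nested calculation the lesson builds later.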
Those functions are really easy to use and very important to improve your data quality in the visualizations. Let's go back to Tableau and start practicing. Okay, first, make sure to select the right data source. We can stay with products low quality, since I prepared the examples there. Now we're going to go with product two: just drag and drop it here in the view. As you can see, we now have four products for the keyboard. It's really hard to see where those white spaces are. For the first two, you can see they are a little bit shifted to the right, but for the second two keyboards, we are not sure whether there is a white space on the right side or not. The situation can be really bad if we switch to different visualizations. Let's take the quantity, and now in the bar chart it's almost impossible to see whether there are any white spaces. If I'm facing this situation in my projects, I start by counting how many characters I have in each product. I calculate the length of each word. In order to do that, we can create a new calculated field. Let's create a new one and call it Products Length. The keyword to calculate the length is LEN. That's it. It accepts only one field, a string field, and the output is going to be a number. Our field is going to be product two; make sure to select the correct one. And that's it, the calculation is valid. Let's click OK. Since the output is a number, Tableau is going to create a continuous measure. So I'm just going to remove the quantity from the view, and let's bring our new calculated field to the view. The length of the first one is nine, so this means we have only one white space. The second one has two white spaces. The third one is correct. The fourth one as well has one white space. With the length function, we can easily detect whether there are dirty spaces in our words.
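The length check can be sketched in Python to show why it exposes spaces the eye cannot see. The four sample values below are an assumption reconstructed from the lengths mentioned in the lesson (9, two extra spaces, 8, and one extra space); they are not copied from the course file, and space_report is a hypothetical helper.

```python
# Assumed reconstruction of the four "Keyboard" variants.
products = [" Keyboard", " Keyboard ", "Keyboard", "Keyboard "]


def space_report(name: str) -> tuple[int, int, int]:
    """Return (total length, leading spaces, trailing spaces) for one value."""
    leading = len(name) - len(name.lstrip(" "))
    trailing = len(name) - len(name.rstrip(" "))
    return len(name), leading, trailing


for name in products:
    # repr() shows the quotes, so the stray spaces become visible
    print(repr(name), space_report(name))
```

A clean "Keyboard" has length 8, so any larger number signals extra characters, and the leading and trailing counts show on which side they sit.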
Now in order to remove and clean up those problems, we're going to use the trim functions. Let's start with the left trim and create a new calculated field. Let's go and do that. We're going to call it Products Left Trim. We start with the syntax LTRIM, and it accepts only one string field. It's going to be product two; make sure to select the correct one. The calculation is valid. Let's hit OK. Now notice that Tableau created a new dimension, because the output is a string. Let's put it here in the view. Now what happens to the values inside the products? All the spaces on the left side are removed, or trimmed. But again, it's really hard to see from the view whether everything is fine. So we're going to calculate again the length of the new field. Let's change the calculation inside our calculated field. Instead of having product two, we can remove it and insert the new dimension. Let's click OK. All right, so now let's check the result. As you can see, we have some values fixed. The first one we now have as eight. The second one still has a space. The third one is correct anyway. The fourth one is as well incorrect. As you can see, the situation is now a little bit better, but we still have spaces. That means we have spaces on the right side. In order to fix this, we're going to trim from the right side. Let's go back to our calculation, the left trim. Let's edit it and add the right trim. So over here we're going to have a nested calculation: RTRIM, and we want the result from the LTRIM. Let's hit OK. But maybe I'm going to change the name to Trim. Let's hit OK. So what happens to the values inside the products? We are trimming everything from the left and as well from the right, as you can see. Now the length is as well correct. All those values have the length of eight.
In order to test this as well, we're going to remove product two from the view, and we have here only three values. Of course the length doesn't make any sense here, because we are summing the lengths of all the products inside the orders. Instead of having it as a measure, maybe we can convert it to a dimension so we don't have any aggregation. I'm just going to remove it from here and add the product length. As you can see, everything is fine. Now, of course, for this scenario we have an easier solution. We can just use TRIM instead of using the left and right trim in one calculation. Let's go and do that. We're going to go back to our calculation and edit it. We're just going to remove everything. We're going to use the keyword TRIM, and it accepts only one field, which is going to be product two. As you can see, the calculation is valid. Let's click OK. As you can see, nothing changes in the view. We get exactly the same results. With that, we have cleaned up the values inside the products by removing any dirty or unwanted spaces. All right, I want to show you one more method to detect whether there is bad quality in your data caused by unwanted spaces. That's especially useful if you have a big data source. If you have a lot of values, it's really hard to detect those things using the length function. I'm going to show you now how I usually do it. If I have a suspicion about one field where I think the users are manually entering the values, I count the distinct values inside this field. Let me show you how I usually do it. Let's create a new calculated field and call it Products CountD. The syntax for that is COUNTD, count with the letter D: we are counting the distinct values inside our products. The field is going to be product two. The output for that is going to be a number. The calculation is valid. Let's hit OK.
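The distinct-count check introduced here works by comparing the count before and after cleaning. A Python sketch of the idea follows; the six values are an assumption chosen to be consistent with the keyboard variants discussed above, and count_distinct is a hypothetical helper rather than a Tableau function.

```python
# Assumed sample column: four spacing variants of one product plus two clean ones.
products = [" Keyboard", " Keyboard ", "Keyboard", "Keyboard ", "Monitor", "Mouse"]


def count_distinct(values, clean=lambda text: text):
    """Count distinct values after applying an optional cleaning step."""
    return len({clean(value) for value in values})


raw_count = count_distinct(products)
after_left_trim = count_distinct(products, lambda text: text.lstrip(" "))
after_full_trim = count_distinct(products, lambda text: text.strip(" "))

# A drop after left trimming means leading spaces exist;
# a further drop after full trimming means trailing spaces exist too.
print(raw_count, after_left_trim, after_full_trim)
```

If the count does not move after a cleaning step, that step was not needed for this field, which is the signal this method is looking for.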
As you can see, on the left side we have a new continuous measure. It's going to count how many distinct values we have inside the products. Let's see the result. I'm just going to remove everything from the view, take the count, and put it on the text. Now the result says I have six different products inside my data source, but I have suspicions about it. What I'm going to do now is start trimming the values inside the products, and my expectation is the following. If the number stays the same, then we don't have any spaces. But if the number gets smaller, then we have unwanted spaces inside the products. Let's start testing that. We're going to go to our calculation and start adding our trims. We always start with the left trim or the right trim. Why don't we go immediately to the trim? Because if you are trimming everything from the left and the right, this can have bad performance in Tableau, because it needs resources. If you are only left trimming or only right trimming, it's going to be easier for Tableau to do it. But if you always go immediately to the trim, you might have bad performance. That's why I always start with the left trim. So let's go to the left trim and check the result. I'm just going to add it to the product over here. With that, we are first left trimming product two, then we are counting how many distinct values we see inside this data. The calculation is valid. Let's hit OK. All right, so now we moved from six to four products. This is an alert for me: it means there are leading spaces. The next step, what I usually do, is to test whether there are any spaces on the right side. For that, either I add a right trim or I simply use the trim. Now if we add the right trim or the trim and the number stays the same, four, that means we have a problem only with the left spaces.
But if the number gets smaller, that means we have as well right spaces. Now what we can do is go again to our measure and edit the calculation. Instead of having the left trim, I'm just going to have now a trim, to test as well the right spaces. Let's hit OK. Now as you can see, we went from four to three. That means we have as well right spaces, not only left but as well right. So the total number of products went from six to four to three. This is how I usually decide whether I'm going to use only the left trim, or the right trim, or both of them, instead of using the trim immediately. I have seen a lot of projects, and a lot of developers tend to overreact with this. If they see a string value, they immediately trim it just in order to have a correct result in the Tableau visualization. But believe me, if you always do this, you're going to have bad performance in Tableau. Take a little time to investigate whether it's really necessary or not. All right, so that's all about how to clean up our data by removing unwanted spaces using the three functions LTRIM, RTRIM, and TRIM. Next we're going to talk about another group: LEFT, RIGHT, and MID. 135. Tableau | Extract Substring: LEFT, RIGHT, MID: Now we're going to cover another group of string functions in Tableau, to extract a specific substring from the text using the three functions LEFT, RIGHT, and MID. As usual, let's understand the concept, then we can practice in Tableau. Let's go. All right everyone. In real scenarios and real-life projects, the data that comes from the source systems is usually way more complicated than the data that you find in samples, tutorials, courses, and so on, because the processes in real projects are way more complicated. The example that we see here could be the product name inside your projects. Here you can see we have a lot of information in only one field. For example, we have the Canon; this could be the product name.
The next one is the product ID, and the third one is the product code. We might find all that information underneath the product name, in only one field. In the visualization, we might be interested in only one piece of information, not the whole thing. We could be interested only in the Canon, the product name. Or we need only the ID, 789. Or we want only the code in the visualizations. We need functions or tools in Tableau in order to extract those pieces of information and split the one field into three fields. In Tableau, there are a lot of functions and ways to achieve this goal. One of them is to use the functions LEFT, RIGHT, and MID in order to cut this field into multiple fields. We're going to start now with the first one. Let's understand the left. The first thing to understand is that each character in our string has a position number. For example, the C has the position number one, then two, three, and so on until we reach the last character, 5, which has the position 14. We are counting from the left until we get to the right. Now in this example, we are interested only in the product name, so we're going to focus on this part. As you can see, it ends at position five. The syntax in Tableau for the left is the following. It starts with LEFT, then it needs two arguments. The first one is the field itself, the string itself. Then comes the number of characters that we want to keep in the output. The result is going to be a string value. For example, we take LEFT, then our value, and the number of characters is going to be five. We are keeping five characters from the left side. Let's see how this works. We start counting from the left and move to the right. The starting character is C, and we start counting: one, two, three, four, five. This is exactly the number of characters, and we make a cut here. Anything after position five, after the n, is removed. We keep here only five characters.
We get the output Canon. In this example, we are cutting off all the values after the character at position number five. All right, so this is how the LEFT function works in Tableau. Let's move on to the next function. It's exactly the opposite: the RIGHT function. Let's say we are no longer interested in the product name. We would like to extract the product code, the last four characters of our string. If you use the RIGHT function, what happens? The position numbers of the characters are exactly reversed. We start counting from the right side as we move to the left. The first character is the 5, the second one is the R, and so on, and the last character, number 14, is the C. Now we want to focus on the product code, so we're going to use the RIGHT function. The syntax for RIGHT is very similar to LEFT. It starts with the RIGHT keyword, then we need our field, the string field, then the number of characters. The output is going to be a string value as well. This time the example looks like this: RIGHT, our string, and the number of characters that we want to keep from the right side is four. Let's see how this works. The RIGHT function starts counting from the right side and moves to the left. We start counting from here: one, two, three, four. And that's it. Here we make the cut. All the characters after position number four will be ignored and will not be part of the result. At the end, you get only four characters from the right side, the product code ending in R5. This is how the RIGHT function works in Tableau. We start counting from the right side and keep only, for example here, four characters. All right, so now we're going to move to the third one. We have the MID function.
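The counting described for LEFT and RIGHT can be sketched in Python. This is only an analogue of the Tableau functions, not Tableau itself: the helper names `left` and `right` are made up for illustration, and the sample value is an assumption that only mirrors the layout in the walkthrough (a five-character name, a three-digit ID, and a four-character code ending in R5).

```python
def left(text: str, n: int) -> str:
    """Keep the first n characters, counting from the left."""
    return text[:max(n, 0)]


def right(text: str, n: int) -> str:
    """Keep the last n characters, counting from the right."""
    return text[-n:] if n > 0 else ""


# Hypothetical sample value: 14 characters, dashes at positions 6 and 10.
product = "Canon-789-AER5"

print(left(product, 5))   # Canon
print(right(product, 4))  # AER5
```

In a Tableau calculated field the equivalent expressions would be `LEFT([Product Name], 5)` and `RIGHT([Product Name], 4)`, assuming the field is called Product Name.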
All right, so now we want to extract the last piece of information that we have in our string, the product ID, the one in the middle. We are not interested in the first part, the product name, or the last part, the code. We want exactly this information in the middle. If you are using MID, we count from left to right, exactly like the LEFT function. The first character is the C, the last character is the 5. The syntax in Tableau is slightly different from LEFT or RIGHT. We start with MID, then we have three arguments. The first one, as usual, is the string value that we want to manipulate. The next one is new: the start point, where we define the position from which we start keeping characters. Then we have the length. It's like the number of characters, but this time it is optional. If you leave it out, we take everything after the start point. If you specify it, we get exactly the number of characters that you define. The output is going to be a string value here as well. Let's take an example. We have MID, then our value. We want to start counting from seven and we want to keep only three characters in the output. Now let's see how this works. The start position is position number seven. We start from this character and count three characters, one, two, three, and cut. What we are doing is cutting at two places, the starting position and the end position. That means all the characters before the starting point will be ignored and will not be in the result, and all the characters after the final cut will be ignored as well. The output is going to be 789. With that, we extracted the information in the middle of our string. This is how the MID function works. As you can see, with those three functions, those three tools in Tableau, we can cut anything in our string and generate new data.
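The MID behavior described above, a 1-based start position plus an optional length, can be sketched the same way. The helper `mid` and the sample value are assumptions for illustration; the sketch only mirrors what the walkthrough describes.

```python
from typing import Optional


def mid(text: str, start: int, length: Optional[int] = None) -> str:
    """Return characters beginning at the 1-based position `start`.

    If `length` is omitted, everything from `start` to the end is kept.
    """
    begin = max(start - 1, 0)  # convert the 1-based position to a 0-based index
    if length is None:
        return text[begin:]
    return text[begin:begin + max(length, 0)]


# Hypothetical sample value with the ID at positions 7 to 9.
product = "Canon-789-AER5"

print(mid(product, 7, 3))  # 789
print(mid(product, 11))    # AER5
```

The second call shows the optional length: leaving it out keeps everything from position 11 onward.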
Let's go to Tableau and start practicing. There are many use cases for those three functions. For example, let's start by working with a URL. A URL usually has a structure, and we want to extract part of the information inside it. In our data sources we have a URL for the images. If you go to the small data source and go to the products, we have the product image. Let's drag and drop it on the rows and check the structure. A standard URL usually starts with the protocol. Then we have a domain, and at the end we have a file or something similar. Our files here are all images, like the ones we practiced with in the image role. The first task is to extract only the protocol from our URL. The protocol is on the left side, so I think you already know that we want to use the LEFT function. We can go and count how many characters we want to keep. We need five characters. Let's create a new calculated field, because we need a new field. We're going to call it URL Protocol. It starts with LEFT, and then it needs two arguments. The data that we need is Product Image, which we have over here, and we want to keep five characters, so we specify five. As you can see, the calculation is valid. Let's try that out and hit OK. On the left side we have our new dimension, our new calculated field. Let's bring it to the view. Drag and drop it on the rows beside the image. As you can see, we now have a new field in our data source with the protocol information from our URL. Everything is working fine, and this is how we work with the LEFT function. Let's go to the next use case, where we want to extract the file extension from our URL. We want to get this part at the end of the URL. Since we are speaking about the right side, we're going to use the RIGHT function. Here we need to extract three characters.
Let's create the calculated field. We're going to create a new one and call it URL File Extension. It starts with the keyword RIGHT, and it needs two arguments as well. The string, our field, is going to be Product Image. And how many characters do we want? We want three, so comma, three. With that, you can see the calculated field is valid. Let's hit OK. As usual, we have a new calculated field, a new dimension in our data source, just for the file extensions. Let's check the values to see if everything is fine. As you can see, we are getting all the file extensions from the URL. It's really simple. With that, we are generating new information and new fields that we can use in our analysis, and they are based on the original data that we get from the data sources. All right, so now let's move to the next task, where we want the URLs starting from the domain name, without the protocol. We want to keep everything after the double slashes in the string. This time we're going to use the Tableau function MID. Let's create a new calculated field. We're going to call it Product Domain. Here we start with the keyword MID. It takes three arguments. The first one, as usual, is Product Image. Then, where do we start cutting? Here we have to specify the number: one, two, three, four, five, six, seven, eight, nine. We start cutting from nine. The last argument is optional. I'm going to leave it out, so we keep everything afterward and will not cut anything from the right side. That's it. The calculation is valid, so hit OK. As usual, we get a new dimension, a new calculated field, ready to be used in the analysis. Let's grab it and put it in the rows to check the values. As you can see, we start from the domain name and the protocol is cut off. The whole value is the rest. Next we have the following tasks for you. All right, so the first task is to extract the last four digits of the phone numbers from the customers.
The second task is to go to the addresses and extract only the street name, so we remove the code and the word street. Now you can pause the video in order to complete the tasks, and once you are done, you can resume it. All right, I think it's really easy. Let's go to the small data source. We go to the customers and bring the phone to the view. We want to extract the last four characters, so we are speaking about the right side. That means we're going to use the RIGHT function. Let's create a new calculated field. We're going to call it Phone Code. We use the RIGHT function to cut from the right. The string value is Phone. We want four digits, so the number of characters is going to be four. Now the calculation is valid. Let's hit OK and bring it to the results. As you can see, it's really easy. We got the last four digits of the phone number. All right, so now we're going to solve the next task. We need only the street names from the address. As you can see over here, we have the code, then the word street, and then the street name. We want only this last piece of information. Since we want to start cutting over here, we're going to use the MID function to define the starting point of the cut. Let's create a new calculated field. We're going to call it Address Street, and we use the function MID. The first value is the field Address, then the starting point is nine. The rest we leave as it is. So that's it. Let's apply and check the values. Drag and drop it in the view. As you can see, we now have only the streets from the address, and the first part is cut off. If you solved the task using eight instead of nine, that's because you forgot to count the white space. If I change it to eight, I get almost the same results, but with a leading white space, which is not really good. The space counts, so it should be nine.
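The URL and phone exercises above can be retraced in Python. The URL and phone number below are hypothetical values, chosen only so that the protocol has five characters and the domain starts at position nine, as in the walkthrough; the helpers are illustrative stand-ins for the Tableau functions.

```python
from typing import Optional


def left(text: str, n: int) -> str:
    return text[:max(n, 0)]


def right(text: str, n: int) -> str:
    return text[-n:] if n > 0 else ""


def mid(text: str, start: int, length: Optional[int] = None) -> str:
    begin = max(start - 1, 0)  # 1-based position to 0-based index
    return text[begin:] if length is None else text[begin:begin + max(length, 0)]


# Hypothetical values, not taken from the course data source.
url = "https://example.com/images/product1.png"
phone = "+33-612-3456"

print(left(url, 5))     # https
print(right(url, 3))    # png
print(mid(url, 9))      # example.com/images/product1.png
print(mid(url, 8))      # /example.com/images/product1.png  (start is one too early)
print(right(phone, 4))  # 3456
```

The `mid(url, 8)` line shows the same off-by-one mistake as using eight instead of nine in the address task: one unwanted character stays at the front of the result.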
That's it, this is really simple. This is how you can extract information in Tableau. All right, that's all about this use case: how to extract a specific substring from the text using the three functions LEFT, RIGHT, and MID. Next we can start talking about a bunch of functions for searching for specific patterns in Tableau. 136. Tableau | Search: STARTSWITH, ENDSWITH, CONTAIN, FIND, FINDNTH: Guys, so now we're going to move to the next use case, where we're going to learn how to search for specific patterns in our text using calculated fields. Here we have five functions: STARTSWITH, ENDSWITH, CONTAINS, FIND, and FINDNTH. As usual, first we have to understand the concept behind them. Then we're going to practice in Tableau. Let's go. All right everyone. The search functions in Tableau can be split into two groups. The first group returns whether the substring exists in our text or not. Here we have three functions: STARTSWITH, ENDSWITH, and CONTAINS. The output of those three functions is always either true or false, so we have a Boolean. For example, we have the function CONTAINS, we have our string, and we are searching for dashes. The output is going to be either true or false. In this example it is true, since the dash appears twice. Then we have a second group of functions, which return the position of the substring. Here we have two functions, FIND and FINDNTH. The output is going to be the position number, so we get numbers out of those two functions. For example, if we take the function FIND for the same string and search for the dash, we get the output six. So we are not getting true or false, we are getting the position of the substring. In this example that is the first dash, which has position number six. As you can see, both groups can be used to search for a specific thing in our text, but they answer different questions.
The first group answers the question of whether the substring exists in my text: yes or no, true or false. The second group answers the question of where I find my substring, so here we get the position number of the search term. Now let's focus on the first group of functions: STARTSWITH, ENDSWITH, and CONTAINS. Okay, we're going to start with the first one, STARTSWITH. Let's say that we have the following text: Monitor LG 4K. The syntax in Tableau is very simple. It starts with the keyword STARTSWITH, and it accepts two arguments. The first one is the string field, the text that we want to search inside. The second one is the substring, where we specify what we are searching for. The output, as we learned, is going to be either true or false. It is a Boolean. Let's take an example. We have STARTSWITH, our text, and we are searching for the word Monitor. Let's see how this works. It's really easy. We start searching from the left and move to the right. The start position for the search is the first character, M. Now Tableau starts matching Monitor against our text, starting from M. As you can see, the first part of our text matches the substring that we are searching for. Our text starts with Monitor, which is correct. That's why Tableau returns true. Okay, now let's take another one. Here we are asking, does our text start with the substring LG? Of course, if you check our text and search from left to right, it does not start with LG. Tableau will not find a match and answers with false. That's it. It's simple, right? We are just asking a question. We ask Tableau something, and Tableau answers with either yes or no. Okay, so now let's move to the next function. We have ENDSWITH, and it's exactly the opposite. All right, we're going to work with the same example.
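The two STARTSWITH questions above can be reproduced with Python's built-in `str.startswith`. This is a minimal sketch, not Tableau's implementation; the exact spacing of the sample text is an assumption, since the walkthrough only reads it aloud.

```python
def starts_with(text: str, substring: str) -> bool:
    """True only if `text` begins with `substring` (case sensitive)."""
    return text.startswith(substring)


# Assumed spelling of the sample text from the walkthrough.
text = "Monitor LG 4K"

print(starts_with(text, "Monitor"))  # True
print(starts_with(text, "LG"))       # False
```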
And the syntax in Tableau is very similar. Here it starts with ENDSWITH, and it accepts two arguments as well: the string field that we search inside, and the substring, where we specify what we are searching for. The output is going to be true or false as well. So let's start with the first example. We are asking here, does our text end with 4K? Here Tableau looks at the right side of the text. Does our text end with 4K? Yes, the last two characters are 4K. That's why Tableau answers with yes. The output, the result, is true. Let's ask another question. Does our text end with LG? Well, if you check the text over here, it does not end with LG. LG is in the middle, so the last two characters are not LG. That's why Tableau answers with false. The answer is no. As you can see, it's really easy. We are just asking questions and Tableau is answering with either yes or no. Let's move to the next one. We have CONTAINS. Okay, so now we are working with the same example, and the syntax is very similar to the other two. Here it starts with CONTAINS, and it accepts two things. First we specify the text that we are searching inside, and next we specify what we are searching for. The output is going to be a Boolean as well, true or false, yes or no. Okay, now let's ask Tableau the following question. Does our text contain the word Monitor? What Tableau is going to do is search everywhere. It will not search only at the start or at the end. It searches everywhere, and if the word is found anywhere inside our text, Tableau answers with yes, with true. Does our text contain the word Monitor? As you can see, it's true. Tableau returns yes. Now let's ask another question. Does our text contain the word LG? Well, if you search over here, you can find it in the middle. That's why Tableau answers with true as well.
Yes, our text contains the word LG. Okay, let's move on and ask the following question. Does our text contain the substring 4G? If you check the text over here, we have the 4 and we have the G, but they are not together. That's why Tableau answers no, we don't have 4G in our text. As you can see, the function CONTAINS does not have any restriction. It searches everywhere. It's not like STARTSWITH and ENDSWITH: the substring does not have to be at the start or at the end. If the substring exists anywhere, it's true. If not, it's false. So that's it, this is all about the three functions. Let's go to Tableau now and start practicing. All right guys, so now you might ask me, what are the use cases for those three functions? Well, I use them in two scenarios. The first use case is when I'm exploring new data. The second use case is when I'm offering new filters to the users. Okay, so let's start with the first one, exploring the data. This is especially useful if you are new to a project or if you have a new data source. The first step is usually to explore the data and learn the content of the data source. If you are in this situation, you might have a lot of questions about the data, and you have those three functions, those three tools, to explore the new data that you have. Okay, then let's explore the products inside our big data source. We have a lot of products there, and I would like to understand the content of my data source. So let's take the product name to the rows. As you can see, Tableau is warning that there are a lot of members and recommends showing only 1,000, but I would like to see everything, so I'm going to say add all members to the view. Now, as you can see, we have a lot of products inside our data source, and I would like to understand the scope of my project. What is the content of those products? I would like to know whether we have Apple products inside our data source.
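The ENDSWITH and CONTAINS questions from the concept part can be retraced with Python's `str.endswith` and the `in` operator. This is a sketch under the same assumption about the sample text; the helper names are illustrative only.

```python
def ends_with(text: str, substring: str) -> bool:
    """True only if `text` finishes with `substring`."""
    return text.endswith(substring)


def contains(text: str, substring: str) -> bool:
    """True if `substring` appears anywhere in `text`."""
    return substring in text


# Assumed spelling of the sample text from the walkthrough.
text = "Monitor LG 4K"

print(ends_with(text, "4K"))      # True
print(ends_with(text, "LG"))      # False
print(contains(text, "Monitor"))  # True
print(contains(text, "LG"))       # True
print(contains(text, "4G"))       # False: the 4 and the G are not adjacent
```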
So we're going to create a new calculated field to answer that. We're going to call it Products Starts With Apple. That's it. We use the function STARTSWITH. It needs two arguments. The first one is the text that we search inside. That is our product name, so we are searching inside the product name. What we are searching for is the word Apple. I'm going to write it like this. Everything is fine, and you can see the calculation is valid. Let's click OK. As you can see on the left side, we have a dimension with the data type Boolean, because we have yes or no, true or false. Let's take it to the rows and check the results. You can see over here we have a lot of falses. I'm going to sort it in order to see the trues. We can see over here that we have four products where the product name starts with Apple. The others do not start with Apple. Now we have a little more insight into our data. Let's ask the follow-up question. Does the product name contain the word Apple anywhere? Not only at the start or at the end, but anywhere. To ask the question, we're going to create another calculated field. We're going to call it Products Contains Apple. We use the function CONTAINS. It needs two arguments. The string that we are searching inside is our product name. What we are searching for is Apple. That's it, and the calculation is valid. Let's hit OK. Again, here we have a dimension with the data type true and false, so Boolean. Let's drag and drop it here. But first I'm going to make it a little bit bigger to see the header of the field. As you can see, the first one is contains, the second one is starts with. Let's sort by contains. As you can see, we have around seven products where the product name contains the word Apple. Now let's check the result. In the first one, we have the word Apple over here.
The second one is over here, and the third is over here as well. And the rest, those four products, all start with the word Apple. As you can see, with the CONTAINS function we get more results than with STARTSWITH. All right, so we are learning more about the products inside our data source. We have seven products from the company Apple. Let's ask the follow-up question: do the product names end with the word Apple? To do that, we again create a new calculated field. Let's call it Products Ends With Apple. This time we use the function ENDSWITH. Again, we have the product name and we are asking whether the product name ends with the word Apple. The calculation is valid. Again, we have a Boolean here. Let's drag and drop it in the view to check the results. I'm just going to make it a little bit wider to see. Okay, this is the ends with. Let's sort it. As I'm sorting, we don't have any true. All the values are false. That means we don't have any product that ends with the word Apple. With that we understand that the word Apple exists only at the start of the product name or in the middle. As you can see, those three functions are really great for understanding our data. Now let's ask the follow-up question. Does the product name contain the word Samsung anywhere? Here we are searching for the products from the company Samsung. To do that, I think you already know it, we create a new calculated field. We're going to call it Products Contains Samsung. We use the function CONTAINS and search inside the field Product Name. This time we are searching for the word Samsung. As you can see, the calculation is valid. Let's hit OK and bring it to the view. Now I'm going to make it a little bit bigger to see what we're talking about here. It's about Samsung.
Let's sort the results. Wow, we can see that we have a lot of products from the company Samsung. So we have more products from Samsung than from Apple in our data source. Let's check the results again. Here we have Samsung over here, and Samsung over here. Then we have a lot of products that start with the word Samsung, and again here it is in the middle, but the names never end with the word Samsung. Okay guys, there's one more thing that I usually use inside the calculations when I'm searching or exploring the data, and that is the case functions, UPPER and LOWER, that we learned before. That is because Tableau is case sensitive in the search. We have to pay attention to how we are writing the search term. To overcome this problem, we're going to use the case functions. Let me show you an example. Now we can ask the question: does the product name contain the word Black anywhere? Let's create a new calculated field. As usual, we're going to call it Products Black. This time we use CONTAINS, the string is the product name, and we are searching for the word Black. That's it. Let's hit OK. We have it as a new dimension. Let's check the result. As usual, I'm going to make it a little bit wider to see the results. Now we have a lot of falses and a lot of trues. There are a lot of products that have the word, as you can see over here. We have it here, we have it over here as well, the word Black at the end, and so on. So there are a lot of products with the word Black. The case here is that only the character B is capitalized. Let's change the case in the search term. We're going to edit the calculation. Now, instead of the first character capitalized, we have everything in lowercase. Let's hit Apply. As you can see in the results, we have only one product with the word black in lowercase. Tableau is case sensitive in the search term.
If we switch everything, for example, to uppercase BLACK, let's search. As you can see, all the products that we have are now false. We don't have any product that contains the word in uppercase. Tableau is case sensitive in the search term. To fix this, instead of changing the case of the search term each time, lowercase, uppercase, capitalized, and so on, we go to the product name and force it to be uppercase or lowercase using LOWER or UPPER. We go over here and add, for example, LOWER. You can use UPPER if you want, as long as the search term uses the same case, and we get the same results. With that, we first force the product name to lowercase, and then we search for the word black in lowercase. With that, I'm covering all the scenarios inside my data source. Let's hit OK. With this, I get all the products that contain the word black. It doesn't matter whether it is lowercase or uppercase. We get everything. So with that, I'm sure I find every string containing the word black and we are not missing anything. That's why I include UPPER or LOWER inside the calculations before I start searching. So that's it for the first use case. This is how I usually use those three functions to explore and learn the content of my new data source. Let's go now to the second use case, where we use those three functions to offer new filters to the users. For example, let's create a filter for the companies inside the product name. Let's create a new calculated field. We're going to call it Companies. This time it's going to be a little more complicated than before, but we're going to do it step by step. First we are searching for the company Apple. So we have CONTAINS, product name, and the search term is apple in lowercase. But we have to lowercase the product name as well, right, with LOWER. We have it like this. This is the first one.
I'm just going to copy it and paste it for the next company. We have Samsung, and then we have Microsoft. We are searching for those three companies, and that's it. So now we have those three companies. But as you know, the output of CONTAINS is always true or false, and I would like to have values in my filter called Samsung, Apple, and Microsoft. To do that, we're going to use the logical IF ELSE statements. Don't worry about them. We will have a dedicated tutorial for that later, but we have to use them now. For now, just follow along. We're going to use them to evaluate those conditions. It starts with IF for the first one: CONTAINS, product name, apple. What happens then? THEN I would like to see the value Apple. If it's not true, we go to the next one, ELSEIF. We evaluate this condition, and if it's true, THEN it's going to be Samsung. If it's false, of course we use another ELSEIF. We evaluate this one, and the output, if it's true, is going to be Microsoft. If the name doesn't fulfill any of those conditions, we have the ELSE, let's say Unknown. That's it. We close it with END. Again, don't worry about this logic, we're going to talk about it later. With that, I get those three values instead of true and false, and we are evaluating those conditions. Let's hit OK. As you can see, we now have a new dimension. The data type is not Boolean, not true and false, and that's because the output of the calculation is now string values. Let's show it as a filter. Now we have those values, as you can see: Apple, Microsoft, Samsung, and Unknown. I'm going to add it to the view as well to see the results. Let's grab it over here. Now the users can start filtering the data based on the companies. Let's remove everything and start with Apple.
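The Companies calculation just described, lowercase the name first and then walk through the conditions in order, can be sketched in Python. The product names below are invented for illustration and are not from the course data source; the function name `company` is likewise an assumption.

```python
def company(product_name: str) -> str:
    """Map a product name to a company label, ignoring letter case."""
    name = product_name.lower()  # same idea as wrapping the field in LOWER(...)
    if "apple" in name:
        return "Apple"
    elif "samsung" in name:
        return "Samsung"
    elif "microsoft" in name:
        return "Microsoft"
    else:
        return "Unknown"


# Hypothetical product names.
products = [
    "Apple iPhone case",
    "Samsung Galaxy cover Black",
    "Microsoft Surface pen",
    "Smart watch for APPLE phones",
    "Canon camera bag black",
]

for p in products:
    print(p, "->", company(p))

# A case-sensitive search misses "Black"; lowercasing the text first finds it.
print("black" in products[1])          # False
print("black" in products[1].lower())  # True
```

As in the IF / ELSEIF chain, the first matching condition wins, so a name that mentions two companies gets the label of the earlier condition.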
With that, we get all the products with the word Apple inside the name. Or we take Microsoft, and now we can see that those products are from Microsoft. The same goes for Samsung. With that, we are filtering based on the companies, and we use the product name as the basis for that. For Unknown, I think there are going to be a lot of values. You can go step by step, adding more companies to the filter, but for now I'm just showing you an example. This is exactly the power of calculated fields in Tableau. We introduce new information based on the functions. This is all for this use case, how to create filters based on those three functions. All right, so now we're going to focus on the second group of search functions in Tableau. We have the two functions FIND and FINDNTH. Here we are answering the question, where do I find my search term? We are searching for the position number of the search term. This time we are not getting true and false, we are getting the position number. Let's understand why we need this. All right, now let's quickly understand the differences between FIND and FINDNTH. In FIND, we return the position number of the first occurrence. In FINDNTH, we return the position number of a specific occurrence. For example, let's say that we want to search for the position number of the dash inside this string. The result is going to be six, because the first occurrence is at this position. On the other hand, we can use the function FINDNTH for the same text, searching for the dash again, but now asking for the position of the second occurrence. So the first occurrence is ignored. We get the position of the second occurrence, and that's going to be ten. This is the main difference between those two functions. In FIND, we always search for the first occurrence, but in FINDNTH, we can specify which occurrence we are searching for.
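The difference between FIND and FINDNTH can be sketched in Python as follows. These helpers are analogues written for illustration, under the assumption that both functions return a 1-based position and 0 when nothing is found; the sample value is hypothetical and only places the dashes at positions 6 and 10 as in the walkthrough.

```python
def find(text: str, substring: str, start: int = 1) -> int:
    """1-based position of the first occurrence at or after `start`, 0 if none."""
    return text.find(substring, max(start, 1) - 1) + 1


def find_nth(text: str, substring: str, occurrence: int) -> int:
    """1-based position of the requested occurrence, 0 if there are fewer."""
    if occurrence < 1:
        return 0
    index = -1
    for _ in range(occurrence):
        index = text.find(substring, index + 1)  # continue after the previous hit
        if index == -1:
            return 0
    return index + 1


# Hypothetical sample value with dashes at positions 6 and 10.
product = "Canon-789-AER5"

print(find(product, "-"))         # 6
print(find(product, "-", 7))      # 10
print(find_nth(product, "-", 2))  # 10
print(find_nth(product, "-", 3))  # 0
```

Note that the position is always counted from the start of the string, even when the search begins at position seven.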
Let's go into more detail about the function FIND. All right, so we have this example again. As you know, each character in the string has a position. C has position number one, and the character 5 has position number 14. The syntax for FIND in Tableau is very simple as well. It starts with the keyword FIND, and here we have three arguments. The last one is optional. The string is the text we search inside. The substring is what we are searching for. Then comes the start position of the search, which, as I said, is optional. The output is going to be a number. For example, let's say that we want to know the position of the dash inside this text. How does this work? It's really easy. It always starts from the left side. Since we didn't specify anything for the starting position, it starts from the first character. Tableau starts searching. Okay, at the first character we don't find the dash. We find it at position number six, so the output is six. All right, now let's take another example where we specify the start position of the search for Tableau. We have the same thing again, but this time we say, start from position number seven. So what happens? We start searching from here, and Tableau goes from left to right, so we find the dash over here at position number ten. The output is going to be ten instead of six, because we started searching from this position. All right, so that's all for the function FIND. Let's move to the next one. We have FINDNTH, and we're going to work with the same example. The syntax is a little bit different. It starts with the keyword FINDNTH, then the string value that we search inside, then we specify what we are searching for. But this time we also specify the occurrence. Here we have to tell Tableau which occurrence we are interested in. Let's take an example.
We have the following question. Find the position number of the dash inside the string, but we are interested in the second occurrence. How is this going to work? We start searching from left to right, as usual. Here we cannot specify the start position of the search. We don't have this option. It always starts from the first character. As we search from left to right, we meet the first occurrence of this character at position number six. The output will not be position number six, because we told Tableau we are interested in the second occurrence, not the first one. Tableau keeps searching for the dash in the string, and we find it at position number ten. Here is the second occurrence of the dash inside our text. This is exactly what we are looking for. The output is going to be position number ten. That's it, this is how this function works. With FINDNTH we can search for a specific occurrence. With FIND we always get the first occurrence, but there we can specify where to start the search. Now let's go to Tableau and start practicing. All right, so now we have the following example. We're going to start with the small data source. Let's go to the customers. I would like to get their first name and the phones as well. The task is to extract the country code from the phone and put it in an extra field. So we are interested in this information: the plus 33, plus 1, plus 49, and so on. As we learned before, we can use the function LEFT to extract information from the left side of the text. Let's create that. We create a new calculated field and call it Phone Country Code. We use the function LEFT. We have to specify the string, so it's going to be the phone. Next we have to specify the number of characters that we want to extract, and here is exactly where the problem comes.
Sometimes it's three characters and sometimes it's two. Let's go, for example, with three. Let's hit OK. We have it over here as a new dimension. Let's bring it to the view, and here we can see exactly the issue, right? The first one is fine, and the third one is fine as well. But for these countries it's not working: we have the dash inside it, which is not correct. To fix this, we're going to use the magic of the function FIND. If you check over here, we always want the characters before the dash, right? We can search for the position number of the dash and then use it in the LEFT function. Let me show you what I mean. We create a new calculated field and call it Phone Find Dash. Now we find the position number of the dash. As we learned, we start with FIND. We have to specify where to search, so we are searching in the phone. What are we searching for? The dash. And that's it. We are not interested in the start position, so the search starts from the first character. As you can see, the calculation is valid. Let's hit OK. Since the output is a number, we get it as a continuous measure. Let's drag and drop it over here and see the results. The position number of the dash inside the first phone is 4, for the second one it is 3, and so on. Everything is fine. For the next step, we bring those two calculations, the LEFT and the FIND, into one calculation. I copy the syntax from Phone Find Dash and go back to the first calculation, the country code. Let's edit it. Instead of having the 3 as a static value, we make it a variable using the FIND function. Let's add it over here. Now, how is Tableau going to execute this calculation?
It starts with the inner function, FIND, so it first finds the position number of the dash inside the phone. Afterwards it goes to the outer function, LEFT, and cuts everything up to this position number. All right, now let's check the results. As you can see, we are almost there. We have +49-, +1-, +33-. The dashes are everywhere, and that's because we are cutting everything up to and including the dash position. That means we are always one step further than needed. To fix it is really easy. We go back to our calculation. We are getting the position number here, which is correct, but we want to go one step back. To do that, we add minus one. Let's hit OK. All right, so with this we get exactly what we want, right? +33, +1, +49. With that, we are more dynamic in the function LEFT by using the FIND function, and we can see how to bring those functions together in one calculation to achieve such a goal. All right, now let's try out the second function that we have, FINDNTH. Let's say that we want to get the position number of the dash, but for the second occurrence. Let's create a new calculated field. We start with the keyword FINDNTH. It needs three arguments. The first one is the text we search inside, so it's going to be the phone. Then we are searching for the dash. And with the third one we specify which occurrence we are interested in: the second one. That's it, the calculation is valid. Let's click OK. Since the output is a number, we get a new continuous measure. Let's bring it to the view over here. Now let's check the results. For the first phone, the second occurrence of the dash is at position number 8, which is correct.
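For readers who want to replay the phone practice outside Tableau, here is a rough Python imitation of the three calculations used so far: LEFT with a fixed length, LEFT combined with FIND minus one, and FINDNTH. The helper names and the two sample phone values are assumptions (picked so the dash positions match the 4, 8, 12 and 3, 7 mentioned in the lesson), and returning 0 for a missing occurrence is an assumption of this sketch as well.

```python
def left(text: str, count: int) -> str:
    """Imitate LEFT: the first `count` characters of the text."""
    return text[:max(count, 0)]


def find(text: str, substring: str) -> int:
    """Imitate FIND without a start position: 1-based position, 0 if absent."""
    return text.find(substring) + 1


def find_nth(text: str, substring: str, occurrence: int) -> int:
    """Imitate FINDNTH: 1-based position of the nth occurrence, 0 if absent."""
    index = -1
    for _ in range(occurrence):
        index = text.find(substring, index + 1)
        if index == -1:
            return 0
    return index + 1


# Assumed sample values, not taken from the course data source.
phones = ["+33-555-123-4567", "+1-555-234-5678"]

for phone in phones:
    static_cut = left(phone, 3)                       # breaks for "+1-..."
    with_dash = left(phone, find(phone, "-"))         # one character too many
    country_code = left(phone, find(phone, "-") - 1)  # the fixed version
    second_dash = find_nth(phone, "-", 2)
    print(static_cut, with_dash, country_code, second_dash)
```

Asking `find_nth` for occurrence 1 gives the same position as `find`, which mirrors the comparison made in the lesson.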
And as you can see, FIND gives 4 because the first occurrence is at position number 4. For the second phone, the second occurrence is at position number 7, which is correct as well. Now let's start changing the occurrence. Let's edit it again. I would now like to get the third occurrence; as you can see, we have a third dash over here. Let's change it to 3 and hit Apply. You can see we are now getting position number 12 for the last dash in the phone number, so we are getting the third occurrence of the dash inside our text. But if we switch it to 1, what happens? We get exactly the same result as FIND, because FIND always returns the first occurrence, and here we are saying we are interested in the first occurrence. Okay, so that's it for those two functions, FIND and FINDNTH. They are really useful for getting the position number of a specific substring, and I usually use them inside another calculation, so they are like supporting functions. All right, so with that we have learned how to search for specific patterns in our text using Tableau calculations. Next we're going to talk about another group: how to combine and split data in Tableau.

137. Tableau | CONCAT & SPLIT: Now we're going to learn how to combine and split text in Tableau using the concatenation operator, the plus, and the SPLIT function. But as usual, let's understand the concept behind them, then we can practice in Tableau. Let's go. All right, so now we're going to talk about concatenation in Tableau. It's very simple. We use the plus operator to combine multiple texts into one text. For example, in our database we could have the following scenario, where the first name and the last name are stored separately in different fields, and we would like to have only one field called, for example, the full name.
In order to do that, we can use the plus operator to combine the first name, Michael, with the last name, Scott. As the end result, we get the full name, MichaelScott, with nothing in between. But if you check the full name, we would always like to have a separation between the first name and the last name in the output. Inside the full name, we usually use a space between them. We can do the same; we just add one more plus operator. We have Michael, space, Scott. Between Michael and the space we have the plus operator, and between the space and the last name we have another plus operator. The output is going to be Michael Scott. As you can see, with the plus operator we can build anything we want by combining multiple string values. That's it; this is really easy. Let's go back to Tableau and start practicing. All right, so now we go to the small data source over here and to our customers. We would like to have the first name and the last name in the view, and as you can see, this information is separated into two different fields. The task now is to create only one field for the customer name, the full name, instead of having two. In order to do that, as usual, we create a new calculated field and call it Full Name. First we need the first part, the first name. After that we have the plus operator. Then we want a separator between them, an empty space, so we add it like this. Then another plus operator, and the last part is the last name. Let's take the last name and put it over here. That's it. It's important that the calculation is valid, so everything is fine. Let's hit OK. Now, as you can see, in the data pane we have a new calculated field, a new dimension called Full Name. Let's check the values. We drag it over here onto the rows.
And as you can see, now we have a very nice full name: George Pips, John Steel, and so on. It's really simple, right? Now, if you change your mind and would like to have a dash between the names, what do we do? We edit the field, and instead of the white space in the middle we put the dash. That's it. Let's hit Apply. Now we can see in the full name that the first name and the last name are separated with a dash. So it's really simple. Let's now take a quick task. The task is to combine the category and the product using the following rule. As usual, you can pause the video to complete the task, and once you are done, you can resume it. All right, so now let's check the solution. It's very simple. We go to the products. Let's first see the raw data: we have the category and the product name. Now we create a new calculated field and call it Full Product Name. The rule starts with the category, then we have our plus operator. After that, the separator is the colon, but after the colon we have a white space, so I add it over here, and then we have the product name. Let's check the results. The calculation is valid, OK. Here we have our new dimension. Let's drag and drop it over here and check the results. I'll make it a little bit bigger so we can see the results. As you can see, our product name now starts with the category, a colon, then the product name, and that's it. This is how we work with concatenation in Tableau. It's very simple, right? Now we're going to learn the exact opposite: how to split one field into multiple fields using SPLIT. All right, so now we're going to talk about the SPLIT function in Tableau. It's a very important function and a lot of people get confused about it, but I think it's simple. So let's check this example.
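As a quick recap of the concatenation part, the same plus-operator idea can be tried in Python, where `+` also joins strings. This is only an illustrative sketch: the function names and the category and product values are hypothetical, not taken from the course data.

```python
def full_name(first_name: str, last_name: str, separator: str = " ") -> str:
    """Join first and last name with a separator, like [First] + " " + [Last]."""
    return first_name + separator + last_name


def full_product_name(category: str, product_name: str) -> str:
    """Follow the task's rule: category, then a colon and a space, then the product."""
    return category + ": " + product_name


print("Michael" + "Scott")              # MichaelScott, no separator
print(full_name("Michael", "Scott"))    # Michael Scott
print(full_name("Michael", "Scott", "-"))  # Michael-Scott

# Hypothetical category and product values.
print(full_product_name("Furniture", "Office Chair"))  # Furniture: Office Chair
```

In Tableau, as in Python, the plus operator only concatenates when every operand is a string, so a numeric field would need to be converted first (in Tableau with STR).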
Here we have one field with a lot of information: the product name, the product ID, and the product code, all in one field. In many situations, in analyses and visualizations, I would like to split this information into three fields instead of having it in one. In order to do that, we can use the SPLIT function. As we learned before, we could also do that with LEFT, RIGHT, and MID, but the SPLIT function is easier in such a situation. We want to split this field into the product name, the product ID, and the product code. Now let's check the syntax in Tableau. It starts with the keyword SPLIT and it needs three arguments. The first one is the string or the field that we want to split. The second one is the delimiter. The last one is the token number. The output is a string value. Now let's take an example. I would like to split this text, the delimiter is the dash, and I would like to have token number 1. Tableau needs two pieces of information from you: the delimiter and the token number. The delimiter is the separator between words. For example, we have a separator between Canon and the ID, the dash, and we have another separator between the ID and the code. Those dashes are the delimiter that splits my text. Tableau wants to understand from you how the words are separated. Now let's move to the next piece of information that is needed, the token number. Here, Tableau wants to understand which part of the information you are interested in. Is it the first part, the second part, or the last part? We have something like an ID, or token, for each piece of information. The first one has token number 1, the second one has token number 2, and the last one has token number 3.
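The delimiter and token idea described above can be sketched in a few lines of Python. This is an imitation for illustration only: the helper name `split_token`, the sample text `Canon-123-AB45`, and the choice to return an empty string for a token that does not exist are all assumptions of the sketch.

```python
def split_token(text: str, delimiter: str, token_number: int) -> str:
    """Imitate SPLIT for positive token numbers: the nth piece between delimiters."""
    parts = text.split(delimiter)
    # Token numbers are 1-based, Python list indexes are 0-based.
    if 1 <= token_number <= len(parts):
        return parts[token_number - 1]
    return ""  # assumption of this sketch for a token that does not exist


# Hypothetical product text: name, ID, and code separated by dashes.
product = "Canon-123-AB45"

print(split_token(product, "-", 1))  # Canon  (product name)
print(split_token(product, "-", 2))  # 123    (product ID)
print(split_token(product, "-", 3))  # AB45   (product code)
```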
In this example we said we are interested in token number 1, which means we are interested in the product name, so the output is the product name. Of course, if you're interested in the product ID in the middle, you could say, okay, I'm interested in token number 2. If you specify it like this, you get the product ID. And if you're interested in the last one, the product code, you can specify token number 3 to get the product code. As you can see, once you understand it, it's really easy. We just need two pieces of information: what the separator between the words is, and which token number you are interested in. Now let's go back to Tableau and start practicing. All right everyone, there are three ways to split your data inside Tableau. The first one is by creating a new calculated field. The second one is the automatic split. The third is the custom split. We start with the first one, how to split your data using a new calculated field. We take the following example and stay with the small data source. Let's go to the customers and grab the phones over here. The phone numbers have a structure: a country code, an area code, and the phone number itself. We would like to split those three pieces of information into three new fields. Okay, so let's see how we can do that. As usual, we create a new calculated field for the first part, Phone Country Code. We start with the SPLIT keyword, and it needs three arguments. The first one is the string that we want to manipulate, so it's the phone number; I add it like this. Then the delimiter. The delimiter here is the dash; as you can see, the parts are separated with the dash, so let's add it over here. Then Tableau needs a token number from me. The first part has token number 1, then 2, 3, 4. So we have four sections, and we are interested in the first token.
So let's add 1, and that's it. As you can see, the calculation is valid. Let's hit OK. Now we can see that in our data pane we have our new field, the country code. Let's grab it to the view and check the result. With that, we are extracting the first token, the first part of the phone, and we have our country code. Everything is perfect. In the next step we would like to extract the area code, token number 2. We create a new calculated field, but first I would like to copy the old code, because we only want to adjust the token number; everything else can stay the same. Let's create a new one and call it Phone Area Code. Then we put our code over here. The phone stays the same, and the dash as separator as well. We only change the token number to 2, because we are speaking about the second part. Let's hit OK and check the results. We have our new field here again, so drag and drop it on the view, and as you can see, we are now getting the second part. We have 555 here, and over here as well. So with that, we got the second part of our phone. We now have the country code and the area code as well. Next we have the following task for you: create a new field in the data source to extract the phone number part without the country and area codes. Now you can pause the video to complete the task, and once you are done, resume it. All right, so now we create a new calculated field and call it Phone Number. We can use the same script, SPLIT on the phone, but this time we are interested in both token 3 and token 4. How can we do that in Tableau? We can only specify one token at a time. So first we change this to 3.
Since we need both pieces of information in one field, we can use the plus operator. So we add a plus over here, and then we add the same code again, but this time for token number 4. With that we are getting both tokens in one field. The calculation is valid, so let's hit OK, and as usual we get a new field in our data source. Let's check the result over here. We can see that we now have the phone numbers. As you can see, the first one is 1234567, and over here we have the same phone number. But you might say, you know what, we are missing the dashes, right? So we can add them in our calculated field. Let's edit it: we just add another plus operator, and between the two tokens we put the dash. As you can see, the calculation is valid. Let's hit OK. With that, we get exactly the same structure as in the phone. That's it for the first method, how to split your data using a new calculated field. You can see that from one field we have extracted three new fields. Now let's go to the second method, where we split the data using the automatic split. All right, we stay with the small data source; this time we need the URL. So let's take the product image from here and drag and drop it into the view. We know that a URL contains a lot of information, and we could use SPLIT here as well. But instead of creating those calculated fields manually, there is a really nice feature in Tableau where we can split the data automatically. In order to do that, we go to our field, the product image, and right-click on it. Here we have the Transform option, since we are manipulating the data, and inside it we have two options, Split and Custom Split. Split is the automatic way. Wow.
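To recap the first method, the three calculated fields from the phone practice can be imitated in Python as follows. The sample phone value and the helper name `split_token` are assumptions for illustration; the steps follow the lesson: token 1 for the country code, token 2 for the area code, and tokens 3 and 4 joined with a dash for the phone number itself.

```python
def split_token(text: str, delimiter: str, token_number: int) -> str:
    """Imitate SPLIT for positive token numbers (empty string if out of range)."""
    parts = text.split(delimiter)
    if 1 <= token_number <= len(parts):
        return parts[token_number - 1]
    return ""


phone = "+33-555-123-4567"  # assumed sample value

country_code = split_token(phone, "-", 1)
area_code = split_token(phone, "-", 2)
# Only one token can be requested per call, so tokens 3 and 4 are joined with +.
number_plain = split_token(phone, "-", 3) + split_token(phone, "-", 4)
number_dashed = split_token(phone, "-", 3) + "-" + split_token(phone, "-", 4)

print(country_code)   # +33
print(area_code)      # 555
print(number_plain)   # 1234567
print(number_dashed)  # 123-4567
```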
We now have a lot of new fields in our data source, and that's because Tableau automatically split the data and understood the content of the data as well. You can see here the product image domain, then fragment, path, query, and scheme. All of these are parts of the structure of a URL. Now let's check this information. We take, for example, the domain and drag it onto the view, and as you can see, Tableau got it correctly, right? We now get only the domain information from the whole URL, which is really nice. We can take the scheme over here as well, and we have the protocols from the start of the URL. As you can see, Tableau got it right. Some of those fields are going to be empty, I think because that part doesn't exist in our URLs. So Tableau did the automatic split, and if we would like to learn how Tableau did it, we can find that inside the field as well, because it is a calculated field. Let's see how Tableau split out the domain: right-click and edit. As we can see here, Tableau is using two splits to get the domain information. The first split is this one: Tableau is splitting the protocol from the rest of the URL. The separator is the colon and the two forward slashes, and we take token 2, so we get the second part. Once we have the second part, it's really easy. The separator, as you can see, is the forward slash. We now split on the forward slash and take only the first part. It's really easy; you can go and try it yourself. That's it. Let's click OK. So Tableau is, in some cases but not in all cases, smart enough to split your data into new fields automatically. That's it for this method, the automatic split. Next we're going to see the custom split. Okay, we stay with the small data source and go to the customers again. Here we want to split the phones using the custom split. Let's bring the field to the view.
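The two-step split for the domain described above can be tried out in Python. This is a sketch of the described logic, not the exact formula Tableau generates: the URL is a made-up example, and `split_token` is a hypothetical helper that imitates SPLIT for positive token numbers.

```python
def split_token(text: str, delimiter: str, token_number: int) -> str:
    """Imitate SPLIT for positive token numbers (empty string if out of range)."""
    parts = text.split(delimiter)
    if 1 <= token_number <= len(parts):
        return parts[token_number - 1]
    return ""


# Made-up URL standing in for a product image value.
url = "https://www.example.com/images/product-1.png"

scheme = split_token(url, "://", 1)             # part before "://"
after_scheme = split_token(url, "://", 2)       # first split: drop the protocol
domain = split_token(after_scheme, "/", 1)      # second split: keep the first part

print(scheme)  # https
print(domain)  # www.example.com
```

Nesting the first call inside the second, as Tableau does in its generated field, gives the same result in a single expression.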
Then, in order to customize the split, we go to the data pane, to the field that we want to manipulate, and right-click on it. Here we have Transform. Before, we used the automatic split; this time we are interested in the custom split. Let's go inside, and we get a new window to customize the split. It's like the syntax of the calculation: Tableau needs two pieces of information from us. First the separator, and second what exactly you want to get, the token numbers. The first one, the separator or delimiter, is the dash in this example; all of these parts are separated with dashes. Let's enter a dash. For the second piece of information we have the following options under Split off, and here we have three options. Do you want the first parts, the last parts, or everything? It depends on what you want. If you want to split everything and have a new field for each piece of information, you go with the option All. Now let's say that you are interested in only two pieces of information, the country code and the area code, and you don't want the rest in the data source. In order to get the first two parts, we go over here and select First, and here you can specify 2. So we are interested in the first two columns, the first two pieces of information from the left side. But now let's say that you are interested in the last two parts, so you would like to get fields for the last two pieces of information. Then you go over here and select Last, and select 2 as well. With that you are specifying for Tableau what exactly you want to get as a result: how many fields from the start, from the end, or everything. In this example I'm interested in everything, so we go with the option All. And that's it. Let's hit OK. Once we do that, Tableau goes and creates a lot of new fields.
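The three choices in the custom split window (First, Last, All) can be pictured with a small Python sketch. The function name, its parameters, and the sample phone value are hypothetical; the sketch only illustrates which pieces each option keeps, not how Tableau builds the resulting fields.

```python
def custom_split(text: str, separator: str, split_off: str = "all", count: int = 1) -> list[str]:
    """Return the pieces a custom split would keep: first N, last N, or all."""
    if count < 1:
        raise ValueError("count must be at least 1")
    parts = text.split(separator)
    if split_off == "first":
        return parts[:count]
    if split_off == "last":
        return parts[-count:]
    return parts  # "all": one piece per new field


phone = "+33-555-123-4567"  # assumed sample value

print(custom_split(phone, "-", "first", 2))  # ['+33', '555']
print(custom_split(phone, "-", "last", 2))   # ['123', '4567']
print(custom_split(phone, "-", "all"))       # ['+33', '555', '123', '4567']
```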
Tableau managed to split the phone number into four parts. Let's check this information. Drag and drop the fields over here onto the rows. As you can see, the first part is the country code and the second one is the area code, and then Tableau split the rest of the number into two more fields. Here it's not like the second method, where we blindly split everything automatically. Here we specify a few rules for Tableau, and then Tableau automatically splits the data, which gives better quality in the fields. And of course, if you are interested in how Tableau did the split, we can always go to the data pane. All of these are calculated fields, and we can go inside them and check the code. We can go over here and edit it, and as you can see, the delimiter is the dash and Tableau takes the first token to get the country code. All right, so those are the three methods for splitting the data inside your data source. They are really useful for generating new information and for splitting complex structures in the original data source into a new structure for analyses and visualizations. All right, so that's it; this is how you combine and split text in Tableau. Next we're going to talk about the last string function in Tableau, REPLACE.

138. Tableau | REPLACE: Now we're going to learn about the last use case for the string functions: how to replace a specific substring with another substring using the REPLACE function. As usual, let's understand the concept behind it, then we're going to practice in Tableau. Let's go. Okay, the REPLACE function in Tableau. It's very simple: it replaces one substring with another one. For example, we have the following address, and as you can see, in the middle we have the abbreviation for street, St. I would like to have the normal wording instead of the abbreviation.
I would like to have the complete word, Street. We can do that using the REPLACE function in Tableau. Let's check out the syntax. It starts with the keyword REPLACE and it needs three arguments. The first one is the string, the original text that you want to manipulate. The second one is the substring, the one that you want to replace. The third one is the replacement: this is the new substring, the new word. The output is a string value as well. To solve the task in this example, we use REPLACE, then our text, then the old substring, the abbreviation St., and the new one, the word Street. How does this work? Tableau first has to search for the substring that we want to replace. It searches the whole text to find the substring; in this example, of course, we find it over here in the middle. In the next step Tableau replaces this substring with the replacement: it takes the St. and replaces it with the complete word Street. At the end we get Louis Street, Paris. As you can see, it's really simple. We are replacing the old value with a new value, and at the end the string has the complete word Street instead of St. Now of course, the question is what happens to the output if we don't find anything. For example, we have this address, Paris. We are searching for the St., but we don't have it inside the text. Here Tableau returns the original text without changing anything. Nothing happens. That's it. It's really simple, right? Let's go back to Tableau to practice the REPLACE function. Okay, now we practice with the small data source. Let's go to the customers, and we manipulate the phone number of the customers again.
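The REPLACE concept explained above, including the case where nothing is found, maps closely onto Python's built-in string replacement. The sketch below is only an illustration: the wrapper name `replace` and both address values are hypothetical stand-ins for the slide example.

```python
def replace(text: str, substring: str, replacement: str) -> str:
    """Imitate REPLACE: swap every occurrence of substring for replacement."""
    # If the substring does not occur, str.replace returns the text unchanged.
    return text.replace(substring, replacement)


# Hypothetical addresses standing in for the slide example.
with_abbreviation = "12 Louis St., Paris"
without_abbreviation = "5 Main Avenue, Paris"

print(replace(with_abbreviation, "St.", "Street"))     # 12 Louis Street, Paris
print(replace(without_abbreviation, "St.", "Street"))  # unchanged
```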
As you can see, the phone number always starts with the plus as the prefix for international calls. Now we have the requirement to replace the plus with 00 as the prefix. In order to do that, we use the REPLACE function in Tableau. Let's create a new calculated field and call it Phone Replace. We start with the keyword REPLACE. Now we need the field that we want to manipulate; it's the phone number, so we have it over here. Next we specify for Tableau the substring, the old value, which is the plus sign. Then we specify the replacement, the new value, which is 00. That's it. Tableau shows the calculation as valid. Let's hit OK, and with that, as usual, we have created a new calculated field in our data pane. Let's check the results. Drag and drop it onto the rows, and now we can see the result: instead of the plus sign, we have 00 everywhere. With that, we have fulfilled the requirement. Now we might get another requirement where they say, you know what, I don't want those dashes inside the phone number, so it would be nice to remove them. In order to do that, we do the same thing and use the REPLACE function. The old value is the dash and the new value is nothing, an empty string. Let's see how we can do that. We edit our calculated field; we just want to add a new REPLACE function. It doesn't matter whether we replace the plus or the dash first. When I'm doing a nested REPLACE, I usually do it like this: I first write a second REPLACE on the phone number, where instead of the dash we have nothing, so we are replacing the old value, the dash, with nothing. Then, to make it nested, I take the first REPLACE and put it in place of the phone.
With that, we have nested calculations. First we replace the plus sign, and second we replace the dash. Let's move it to the first row, and since the calculation is valid, let's hit OK. As you can see in the results, we no longer have any dashes or plus signs, so we have a whole number without any special characters. With that, we have solved the second requirement. It's easy, right? It's not that hard. We can do a lot of things with the REPLACE function; it's a great function for string values in Tableau. Now we have the following task for you: in the big data source, in the product name, we would like to replace the hash symbol with the abbreviation for number, Nr. You can pause the video to complete the task, and once you are done, you can resume it. All right, so this time we go to the big data source, to the products, and we need the product name. Let's drag and drop it on the view and check the values. I'll make it a little bit bigger in order to see more values. We have some hashes, for example at the start, and we want to replace them with Nr. In order to do that, we create a new calculated field. Let's click the arrow over here and create a new calculated field. We can call it Products Replace. We start with the REPLACE keyword. Then we need the string that we want to manipulate, which is the product name. Next we need the old value, which is the hash. Then the replacement is the abbreviation for number, Nr. So that's it. As you can see, the calculation is valid. Let's hit OK. We have a new dimension, a new calculated field, in our data pane. Let's drag and drop it into the view and check the values. We see over here that instead of the hash we have the abbreviation for number.
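The nested REPLACE from the phone practice can be replayed in Python to see that the order of the two replacements does not change the result. The phone value is an assumed sample, and the chained `str.replace` calls merely stand in for the nested Tableau calculation.

```python
phone = "+33-555-123-4567"  # assumed sample value

# Inner step first: swap the plus for 00, then remove the dashes.
plus_then_dash = phone.replace("+", "00").replace("-", "")

# The other order: remove the dashes first, then swap the plus.
dash_then_plus = phone.replace("-", "").replace("+", "00")

print(plus_then_dash)  # 00335551234567
print(dash_then_plus)  # 00335551234567
```

The order is interchangeable here only because neither replacement introduces or removes a character that the other one is looking for.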
So with that, we have learned that the REPLACE function is very simple and also very important in many use cases. I use it a lot when I want to clean up data. Sometimes we get bad quality from the sources and there are a lot of special characters; I can always use REPLACE to clean up the data and replace those special characters with something more meaningful in the visualization, like we did in this example. I also use it a lot to change the format of something. For example, here we have the phone numbers, and we changed the format from having dashes to having none, and instead of the plus we have the 00. With that, we are not cleaning up the phone; we are changing the format, how we present the phones in the visualizations. On the left side we have the plus and the dashes; on the right side we don't have them. So we usually use the REPLACE function to change the structure, the format, of a field. It is just an amazing and very important tool in Tableau. All right everyone, that's all for the REPLACE function. With that, we have covered all the use cases of the string functions. We have learned around 16 string functions to manipulate, transform, and clean up text values in Tableau. Next, we're going to jump to another group of functions in Tableau, the date functions.

139. Tableau | Extract Dateparts: DATENAME, DATEPART, DATETRUNC, DAY: Now we're going to talk about the third group of functions under the category row-level calculations, the date functions. There are three use cases for the date functions in Tableau. The first one is to extract a specific date part from our date, like the day, year, or month. For that, we have six different functions in Tableau: DATEPART, DATENAME, DATETRUNC, DAY, MONTH, and YEAR. The second use case is to add and subtract date values in our data source.
Here we have two functions, DATEADD and DATEDIFF. The last use case is to find and fetch the current date and time, and here we have two functions, TODAY and NOW. Those date functions give us tools to manipulate and transform date values in Tableau. We start now with the first use case, how to extract specific parts from dates using those functions. As usual, it's really important to understand the concept behind them, then we can practice in Tableau. So let's go. All right everyone, in Tableau there are two ways to manipulate and transform fields with the data type date. The first one is to do it globally in the data source, for all worksheets, all workbooks. The other way is to do it locally, in only one worksheet, in only one view. For the first one, if you are manipulating the date and you want to reuse it in different worksheets, we create a new calculated field using the date functions. On the other hand, if the transformation is not that important and you don't want to reuse it in any other worksheet, because you need it only once in one view, then instead of creating a new calculated field in the data source with the date functions, you can simply change the date format directly in the view, which is easier and quicker. As you can see, there are two methods to manipulate and transform dates in Tableau: either using the date functions or changing the date format. Now, if you ask me which method to use, you always have to ask the following question: is the transformation going to be needed in different worksheets? If yes, create a new calculated field using the date functions. But if the transformation is only needed for one view, then change the date format directly in the visualization.
Now we're going to go and focus on the date functions, since we're talking about the calculations, and at the end we're going to talk about the date formats. So in Tableau we've got a bunch of date functions that all have the same goal, to extract date parts from specific fields, and we can use them to generate such a view. As we can see over here, we have the years, we have the months, the quarters; all that information comes from only one field, the order date. And from all the new information that we extracted, we can build a lot of analyses and insights about our data, like the one that we are seeing here, the heat map. So now let's first understand those functions and then we come back to Tableau. All right, okay, so now we're going to talk about the first date function in Tableau, DATEPART. We can use it to extract a piece of information from our date fields. For example, we have the following date, structured from a year, a month and a day. We can use DATEPART to extract one piece of information, for example the year. If you are extracting the year, the output is going to be 2025. If you're extracting the month, we're going to get August, so 8. If you're extracting the day, we're going to get 20 here. It's very important to understand that if you are using DATEPART, the output is going to be a number. The year is going to be a number. The month will not be August, it's going to be 8. Same thing for the day, so you will get 20 as a number. Let's see the syntax in Tableau; it's very simple. It starts with DATEPART. Then Tableau needs two pieces of information from you. The first is the date part: here Tableau asks which piece of information you are interested in, whether you would like to have the year, month, day, and so on. The second argument is going to be the date field that we want to manipulate. The output, the result of this function, is going to be a number. Now let's take an example. We're going to take DATEPART.
Now we are interested in the day. We would like to extract the day information. Then our date is going to look like this, and the output is going to be 20. If we want the month, then we have to specify month as the date part. And if we do it on this date, we will get the month 8. The same thing if you want to get the year: here we specify the year at the start, then our date, and the output is going to be 2025. So that's it for DATEPART. This is one method to extract a date part from a specific date. Let's move to the next one. We have DATENAME. Let's see the syntax in Tableau; it's exactly the same. It starts with DATENAME as a keyword. Then Tableau needs two pieces of information from you: which part of the date you are interested in, and the field that you want to manipulate. But this time the output is going to be a string value. Let's take an example. Let's say that we are interested in the year part of our date. The output is going to be, again, 2025, but the value is going to be in the data type string. Now if you say, you know what, I'm interested in the month, you specify month as the date part. This time Tableau is going to answer with August instead of 8, because the output here is a string, so you will get the name of the month as an output. And now the next one: if you say I'm interested in the day, and you specify day in the date part instead of month, you will get 20 as well, but as a string value. So that's it for DATENAME. It's very similar to DATEPART, right? The only difference is that there you are getting a number, but with DATENAME you are getting a string value. This is another method to extract the date parts from a date. Let's move now to another set of functions that can be used to achieve the same goal, to extract date parts from a date. This time we have three quick functions to extract the date part from a date quickly. They are my favorite.
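To make the difference between DATEPART and DATENAME concrete, here is a minimal Python sketch that imitates the behavior described above for the example date. It is not Tableau code: the helper names `datepart` and `datename` are hypothetical stand-ins for the Tableau formulas `DATEPART('month', [Order Date])` and `DATENAME('month', [Order Date])`, and only the three parts year, month and day are covered.

```python
from datetime import datetime

MONTH_NAMES = ("January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December")

def datepart(part: str, d: datetime) -> int:
    """Imitates DATEPART for year/month/day: the result is always a number."""
    return {"year": d.year, "month": d.month, "day": d.day}[part]

def datename(part: str, d: datetime) -> str:
    """Imitates DATENAME for year/month/day: the result is always a string."""
    if part == "month":
        return MONTH_NAMES[d.month - 1]  # full month name instead of the number
    return str(datepart(part, d))        # same digits, but as text

order_date = datetime(2025, 8, 20)  # the example date used in the lesson
for part in ("year", "month", "day"):
    print(part, datepart(part, order_date), repr(datename(part, order_date)))
```

The only visible difference is at the month level, where the number 8 becomes the text August; for year and day the digits are the same and only the data type changes.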
I always tend to use them compared to the other two because they are really easy to write. The syntax in Tableau is going to look like this. The first function, DAY, accepts only one argument, a date. Same thing for MONTH and for YEAR. The output is going to be a number; it's like the DATEPART function. For example, if I'm interested in the day, I can do it like this: I use the function DAY, then the date that we want to manipulate, and the output is going to be 20. As you can see, compared to the others it's really quick to create, right? Here we don't have to specify the date part in the syntax for Tableau, because the function itself is called DAY. The same thing for the month. If I'm interested only in the month, I can just use the function MONTH to extract August, so 8. And for the last one, if I'm interested in the year, I can use the function YEAR. As you can see, they are really easy and quick to create if you compare them to the other two. Let's move on to the next one. This is going to be slightly different from all the others. We have DATETRUNC. Okay, some facts about this function. It is a little bit complicated. A lot of people don't know about it, but I tend to use it a lot. It's a very useful function, but it is not that famous. Think about DATETRUNC as a rounding function, like for numbers: if you have a lot of details in one date, you can round the date to a specific level. What does this mean? If we have the following date and time, we have something like a hierarchy, right? We have a year, month, day, hour, minute and second. We are seeing a lot of information in this date. Sometimes you are not interested in a lot of details, like seeing the seconds, minutes and hours. You would like to see it only at the month level. What we can do is use DATETRUNC to round those values. Let's first check the syntax in Tableau. It's very similar to the others; it looks like this: DATETRUNC.
Then you specify the date part and then the date that you want to manipulate. The output this time will not be a number or a string; it's going to be a date and time, okay? The best way to understand this function is to have some examples. Let's say that we specified day as the date part, and then we have our date and time over here. What is going to happen? What you are telling Tableau is that the time information is too detailed for me and I'm interested in seeing this piece of information only at the day level. I'm interested only in the day information, not in the time. What happens in the output is that Tableau is going to return the same information, but this time it's going to reset everything in the time. So you can see we are keeping all the information about the year, month and day, but anything below the day is going to be reset to zero. As I said, it's like rounding numbers, right? You are rounding the information to a specific level. Now, let's move to the next level, where you say, you know what, I'm interested in the month level. You specify month as the date part, then we have the same information over here. What you are saying to Tableau is that I'm not interested in the details of the day; I would like to see my information at the month level. So we're going to get 1 August 2025. Now we're going to go one more step, where we say we are interested only in the year level. So if you go and specify year as the date part, what is going to happen? You tell Tableau I'm not interested in anything else, I'm just interested in the year. I think you already got it. Everything below the year is going to be reset: the month and the day are reset to one, and the time to zero. The only value we keep is 2025. So that's it for this function. DATETRUNC is very useful in many calculations. Now let's go and compare all those functions side by side.
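The rounding idea behind DATETRUNC can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not Tableau code: the helper `datetrunc` is a hypothetical stand-in for the Tableau formula `DATETRUNC('month', [Order Date])`, it handles only the day, month and year levels, and the time of day in the sample value is illustrative.

```python
from datetime import datetime

def datetrunc(part: str, d: datetime) -> datetime:
    """Imitates DATETRUNC: keep everything down to `part`, reset the rest."""
    if part == "year":
        return datetime(d.year, 1, 1)            # month and day back to 1, time to 0
    if part == "month":
        return datetime(d.year, d.month, 1)      # day back to 1, time to 0
    if part == "day":
        return datetime(d.year, d.month, d.day)  # only the time is reset
    raise ValueError(f"unsupported date part: {part}")

stamp = datetime(2025, 8, 20, 9, 45, 21)  # 20 August 2025 plus an illustrative time
for part in ("day", "month", "year"):
    print(part, datetrunc(part, stamp))
```

Unlike the extraction functions, the result is still a full date and time, which is why it can be used on a continuous date axis.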
We have here as rows the date parts, so we have year, quarter, month, day, and so on. And then on the columns we have those different functions. I don't include the DAY, MONTH and YEAR functions here because they are very similar to DATEPART. The first thing to understand is that the DATEPART output is going to be a number, the DATENAME output is going to be a string, and the DATETRUNC output is going to be a date and time. And we can work with the same example, so we have the following information about the date and time. Now let's go and see the output of those functions at the different levels of the date part. Let's start with the first level, the year. If you say I would like to have the DATEPART of this information, you will get 2025. The same thing for DATENAME. But this time, for DATETRUNC, you're going to reset everything below the year, so you will get 1 January 2025. Let's move to the next level. We have the quarter. The DATEPART quarter of this date is going to be 3. The same for DATENAME; it's going to be 3. But this time it's interesting, right? Because in a date and time we usually don't have the quarter information. So this time DATETRUNC is going to reset to the first month of the quarter, which is going to be month number 7. Let's move to the next one. We are at the month level, so if you use DATEPART, you will get 8. If you use DATENAME, you will get the full name of the month, August. And if you use DATETRUNC, you're going to reset everything below the month and you will get the first day of August. Moving on to the day: if you use DATEPART, you will get the number 20; with DATENAME, you will get the string value 20. And this time with DATETRUNC you are resetting the whole time. Moving on to the next one, we have an alternative for the day, and here we're going to get the weekday, the number of the day inside a week. Here we're going to get the number 4 from DATEPART, because it is Wednesday.
So if you're using DATENAME, you will get the full name of the day, Wednesday. And for DATETRUNC, nothing is going to change; we are just going to reset the time as well. Now, if you are moving into the details and you extract the hour, for DATEPART and DATENAME you will get 9. And here, as you can see, with DATETRUNC we are now resetting only the minute and the second, because you are not interested in them. Moving on to the next one, the minute: we'll get 45 in DATEPART and DATENAME, and here we are resetting only the seconds. As you can see, only the seconds are zeros. Now let's move to the lowest level in the hierarchy. We have the second, so we're going to get 21 and 21, and the DATETRUNC output is going to be exactly the same value as the input. So with that you can see the big picture of those three functions, what the main differences between them are, and what you can expect if you are using them. Now let's go back to Tableau and start practicing those functions. Okay, so now we're going to go to our source. Let's go to the orders, and we will be manipulating the order date. Let's take it to the view. Tableau is going to convert it immediately to a year. We are not seeing the original data; we are seeing only the year part of the order date, because Tableau always wants to make visualizations, and of course it makes sense to have years instead of all the dates inside our data source. But now, in order to show all the data like in our data source, we're going to go over here and switch it back to the exact date. Let's click on it, and Tableau is going to convert it to continuous, but I would like to see all the values, so we're going to switch it to discrete. Now as you can see, we get all the values exactly like in the source system. We have around five years of data. So now we're going to go and practice by extracting the date parts. We're going to start with the year, so let's go and extract those years. We're going to go and create a new calculated field. Let's call it Order Date Year.
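The quarter and weekday rows of the comparison are the least obvious ones, so here is a small Python sketch that reproduces the values mentioned for the example date (quarter 3, reset to month 7, weekday 4, Wednesday). It is not Tableau code: the helper names are hypothetical, and the weekday numbering assumes the week starts on Sunday, which is what makes Wednesday come out as 4.

```python
from datetime import datetime

WEEKDAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday")

def quarter_part(d: datetime) -> int:
    """Quarter number 1..4, like the 'quarter' date part."""
    return (d.month - 1) // 3 + 1

def quarter_trunc(d: datetime) -> datetime:
    """First day of the first month of the quarter, time reset to zero."""
    first_month = 3 * (quarter_part(d) - 1) + 1
    return datetime(d.year, first_month, 1)

def weekday_part(d: datetime) -> int:
    """Day number inside the week, assuming Sunday = 1 ... Saturday = 7."""
    return d.isoweekday() % 7 + 1

def weekday_name(d: datetime) -> str:
    """Full English name of the weekday."""
    return WEEKDAY_NAMES[d.weekday()]

stamp = datetime(2025, 8, 20, 9, 45, 21)
print(quarter_part(stamp), quarter_trunc(stamp))
print(weekday_part(stamp), weekday_name(stamp))
```

If the week is configured to start on Monday instead, the same Wednesday would be day 3, so the start of the week matters for this date part.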
So here we have a lot of ways to get this information: we can use DATEPART, DATENAME, DATETRUNC, or even the YEAR function. All right, so now we're going to start with DATEPART. As you can see, it accepts two arguments, and a third one is optional; there you can define the start of the week, but I usually leave it empty. The date part that we want to extract now is the year. Then the date that we want to manipulate is the order date. That's it, and as you can see the calculation is valid, so let's go and hit OK. As we learned, the output of DATEPART is going to be a number; that's why Tableau is going to create a new continuous measure. But in the visualizations I would like to see the distinct values of the years, so I'm going to go and convert it to a dimension. Now as you can see, it jumps to the dimensions and we have it as a discrete dimension. Let's bring it into the view and check the results. As we can see, we now have all the years extracted from the order dates. Now let's go and try the other methods. Let's replace DATEPART with DATENAME. Here it's very important to understand that the data type is going to change. Here we have it as a number; if we switch it to DATENAME, we get it as a string. Let's go and change our calculation. Instead of DATEPART, I'm going to write DATENAME. Let's hit Apply. And as you can see, the data type immediately switches to a string value. But in the view we're going to get exactly the same result, right? Nothing is going to change, only the data type. Now we're going to move to the easiest one. The quickest one is to use the YEAR function instead of the whole thing. Over here we can write YEAR, and we don't have to specify the date part; that's why we're getting an error. We need only the date that we want to modify. That's it. Let's hit Apply as well. Nothing is going to change in the view, but the data type is going to switch to number, because the output of this function is a number.
Now you might ask me, okay, which one should I use? I recommend always using the quick one, of course. But what is more important is the data type. The data type number is always faster than the data type string. The data type string is the worst; it is the slowest data type of all. We always try to avoid the data type string in the visualizations so that we don't get bad performance in our views. If you are thinking about those three functions, I would always avoid DATENAME. Now we are left with two functions, DATEPART and the quick function. I would always go with the quick one, right? Because it's easier to write. In this situation I would prefer to use YEAR on the date, like I'm showing it in the view. But of course, in a lot of situations you want to show, for example, the day name or the month name. It really depends on the requirement, but if you can avoid it, don't use DATENAME. So these are my recommendations to you and what I usually do. Now let's close this and extract another part from the date. We're going to have the quarter. Here again we have the three options, and all three deliver the same information. So I would go and create a new calculated field; let's call it Order Date Quarter. And this time I'm going to use the quick one as well: QUARTER, then the order date. That's it, it's really simple, right? Let's hit OK, and now we have again a new continuous measure. I would really like Tableau to create a dimension here immediately. So I'm going to go and convert it again to a dimension, because I use it in the view as a dimension. Let's check the results, and we can see we now have the quarter number, which is correct. All right, so now let's go and extract another piece of information from our date. We're going to get the month. Let's go and create a new calculated field again. We're going to call it Order Date Month. This time we can use the MONTH function and our field, the order date. It's very simple, right? So let's go and hit OK.
And we're going to convert it again to a dimension and bring it to the view. With that, we are extracting the month information from the order date. Everything looks fine. Here we have September, August, and that's it. And here we are usually in the situation where the users would like to see the months as a full name. So instead of having the month number, we would like to have the month name, which I really agree with, because it's easier to read the month name than the number. In order to change it now, we can use the DATENAME function. So let's go and change our calculation. Let's go and edit it. Now, instead of MONTH, which I can just remove, let's have DATENAME; then the date part is going to be month, and then we have our order date. So let's hit OK. And now, of course, what happened? We changed the data type and also the values inside this field. We are now getting the complete name of the month, so we have January, February, and so on. So that's it. This is how we can extract the different date parts from our original date field. The question is how to use this new information in our views. All right, so now we're going to go and create a view from three pieces of information, category, order date, and sales, using a heat map, or highlight table. The first thing that I would like to do is to remove the order date. This is a lot of detail; we don't need it in the view. Then we have the year on the rows. I'm going to leave it, but I will take the quarter to the columns, and the month as well. And of course, what is missing now is to fill those gaps using a measure. Our measure is going to be the sales. Let's drag and drop it over here. Now, in order to convert it to a heat map, we have to add it to the colors. Let's take the sales again and put it on the colors, or you can hold Control and drag it to the colors; we're going to get the same result. Now we are almost there.
Instead of text, I would like to have squares in order to get the heat map. With that, we got a heat map. We can change the colors if we want. So let's go to Colors, Edit Colors, and I would like to have it as blue. Hit OK. So with that, we have created our heat map using only one field, the order date. We have the years from the order date, we have the months from the order date, and the quarter as well. As you can see, those parts that we extract from the dates are really useful for making visualizations. Now we can go and add the final touch to this view, and that is making abbreviations of the month names. As you can see here, February is really big for the cell over here, so we can make it shorter. In order to do that, we can use the LEFT function. So let's go to our calculated field and edit it. Before the expression we're going to add LEFT, and then at the end we're going to add 3, so I get only three characters from each month. Let's go and hit OK. Perfect. Now we have abbreviations for each month and the view looks more professional. There is one more thing that we have to add, and I promise it's the last one. It is the category; we forgot about it. So let's go to the categories and just drag it before the year. With that, we got those categories really nicely, and we can see how those categories are developing over time. So with that, we got a really nice heat map, with all that information coming from the date. Now we have a lot of new information about the order date in our data source, which we can use almost everywhere. We also have another very common use case for this new information, where we can use those date parts as a filter. Let me show you what I mean. Let's go again to our orders. We're going to go to the month, right-click on it and show it as a filter. We're going to do the same thing for the year: right-click on it and show it as a filter as well.
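The abbreviation step wraps the month name in LEFT with a length of 3, which in Tableau would be a formula along the lines of `LEFT(DATENAME('month', [Order Date]), 3)`. The Python sketch below only imitates that idea: the helper `month_abbreviation` is hypothetical and the month list is hard-coded in English.

```python
MONTH_NAMES = ("January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December")

def month_abbreviation(month_number: int, length: int = 3) -> str:
    """Keep only the first `length` characters of the full month name."""
    return MONTH_NAMES[month_number - 1][:length]

# Build the header labels for all twelve months.
labels = [month_abbreviation(m) for m in range(1, 13)]
print(labels)
```

Because the abbreviation is still text, it sorts alphabetically unless the sort order is set from the month number.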
Now we can see that information on the left side, and the logical order is very important: first the year, then the month. Since the month has a lot of values, let's go and switch it to a dropdown with multiple values. Now, using those filters, the users can go and specify the scope of this view by changing the values of the year and also of the month. This is a very common use case for the date parts in Tableau. That's it for those functions. Now let's move to the last one; we have DATETRUNC. Okay, now in order to see the effect of DATETRUNC, let's go to the big data source and get the order date to the view. I would like to see the exact date, so let's switch it to exact date, and again to discrete to see the values. All right, so next we're going to take the sales to the view as well. With that, you can see we are seeing all the information that we have inside, and we have a lot of details. Now let's say that I'm not interested in the days. I would like to see one date for each month; we would like to have this date at the month level. In order to do that, we're going to go and create a new calculated field, and we're going to use DATETRUNC. Let's go and do that. We're going to call it Order Date Trunc. Then the syntax is going to be like this: DATETRUNC, and it accepts two arguments. The first one is going to be the date part, which level we want to see in the view. We want to have the month, so let's specify month here. Then comes the date that we want to manipulate, which is the order date. That's it, and the calculation is valid. Let's go and hit OK. And on the left side we've got a new dimension with the data type date and time. What we're going to do now is replace the order date with this new field; just put it on top of it. Again, here we have to do the same thing: right-click on it, switch it to exact date, and then again to discrete.
Now we have a new date field where everything is at the month level; we always have the first of the month. So we have 1 January, 1 February, and so on. As you can see, the list is now short, right? Because we now have one row for each month; before, we had one row for each day. Now, I'm not interested in those zeros in the view; I would like to get rid of them. In order to do that, we can change the data type. Let's go to our DATETRUNC field and switch it from date and time to date. Let's go and do that. As you can see, now we have a date field and all the time is gone. Now, let's say that I would like to have a date only at the year level. I don't care about the days and the months; I would like to have one row for each year. In order to do that, we're going to go and edit our calculated field, and we're simply going to change the value from month to year. That's it; let's go and hit Apply. And you're going to see over here that we now have one row for each year. So now we have a field that is always at the year level, and we got around five years. As you can see, with DATETRUNC we can control the level of the date field. So let's say that we want to switch it to day. We're going to go and switch the year to day. And with that we're going to get all the details: we have one row for each date, and with that we have a lot of details. We are back to something like the original order date field. So this is how we work with DATETRUNC in Tableau. Okay, so there's another way to visualize the effect of DATETRUNC, so let me show you how to do it. Let's first close this thing here. And then we're going to switch the order date trunc to a continuous field. So let's go and do that. Now let's go and flip everything, so we're going to have the order date on the columns and the sum of sales on the rows. And instead of having a bar, let's have a line. Now, in the visualization, we have a lot of marks.
If you mouse over those points, you can see we have one mark for each day. And that's because we have defined in the order date trunc that we are at the day level. And you can see here in the details that we have around 1,800 marks in this one view. Now if you say this is a lot of detail, let's switch to month. Let's go to our calculated field, edit it, and just move it over here on top; instead of day, we're going to have month. Let's go and hit Apply. Let me just close this from here and let's check the view. We now have one mark for each month; we are at the month level, and the marks are reduced to only 60 instead of thousands of marks. With this, we don't see a lot of details in the view; we have one mark for only one month. This is the power of DATETRUNC. Let's say that we want to go to the years, and I think you already know how many marks we're going to get. We're going to get only five marks; each point, each mark, represents a year. This is the power of DATETRUNC: to control your view and which level of detail we are talking about. All right, so that's it for those functions. They are really great for extracting specific parts from a date, and as you can see, they are really useful for the visualizations. Now, we've used a lot of calculated fields. As you can see on the left side, we have a lot of new dates in our data source, globally. That means if I go to any other worksheet, or even to any other workbook connected to my data source, I'm going to see the exact fields that I created using the calculated fields, and I can go immediately and start reusing them in my visualizations, which is going to save a lot of time on formatting and so on. So that's how to extract the date parts using calculated fields so that they are available globally. Next we're going to start talking about how to do it quickly and locally, for only one view, by formatting the field. Okay, so now we're going to start from scratch; we're going to go to our big data source.
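The drop in the number of marks comes from grouping: once every date is truncated to the same level, all rows that share a truncated date collapse into one mark. The Python sketch below illustrates this with a few made-up rows; the helper names and the sample sales values are hypothetical and are not taken from the course data.

```python
from collections import defaultdict
from datetime import date

def truncate(part: str, d: date) -> date:
    """Round a date down to the day, month or year level."""
    if part == "year":
        return date(d.year, 1, 1)
    if part == "month":
        return date(d.year, d.month, 1)
    if part == "day":
        return d
    raise ValueError(f"unsupported date part: {part}")

def aggregate(rows, part: str) -> dict:
    """Sum sales per truncated date; each key corresponds to one mark in the view."""
    totals = defaultdict(float)
    for order_date, sales in rows:
        totals[truncate(part, order_date)] += sales
    return dict(totals)

# Hypothetical sample rows: (order date, sales)
rows = [
    (date(2025, 1, 5), 100.0),
    (date(2025, 1, 20), 50.0),
    (date(2025, 2, 3), 75.0),
    (date(2026, 2, 3), 25.0),
]

for part in ("day", "month", "year"):
    print(part, len(aggregate(rows, part)), "marks")
```

The same logic explains the numbers in the lesson: roughly five years of daily data give about 1,800 daily marks, 60 monthly marks and 5 yearly marks.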
Let's go to the orders and get the original order date field to the columns. And again, let's take the sales to the rows. Now as you can see, Tableau always brings it in as a year. That's because it wants to visualize only a small amount of data at the start, and then you decide what you need. Here we can go and manipulate the order date directly in the view by changing the format, instead of going and creating calculated fields. Now, in order to format the date, we're going to click on the dimension itself, so right-click on it. And now we have two important sections here. The first section is a discrete section, where it's going to use the function DATEPART, and the other section is a continuous section, where it's going to use DATETRUNC. And here on the right side, as you can see, we always have those gray examples to show you which format is going to be presented in the visualizations. For example, there's no difference between this year and this year, but here we have quarter two, and here we have the quarter plus the year. So you can see the formats that Tableau is going to use for the presentation in the view. Now let's go and check the differences between this month and this one. Let's start with the first one. Let's click on Month. Now as you can see, our field stays blue, which means it's discrete, and we have those values, January, February, March, and so on. We have it as text. If you would like to know how Tableau created this, you can go over here to the month, double-click on it, and you can see the formula Tableau is using: DATEPART, month, then the order date. So you can see the syntax that Tableau is using to quickly format your view. Now let's go to the next one. We can have the month as a continuous field: right-click on it again, and now we can have the month plus the year. Let's go and click. Now you see that our field is continuous, and if you double-click on it, you can see that Tableau is using DATETRUNC.
Now we see the years on the axis, and each mark, each of those points, is a month. As you can see, it's very easy. We are just clicking around and we are changing the whole format of our dates. What I usually do is go and select different formats until I'm convinced about the correct format to represent my data. And there are a lot of different formats as well, so let me show you. Let's go to the order date. As you can see, we have year, quarter, month, but here we have the option More. You can see we have a week number and a weekday, and you get more options if you go to Custom. Here you're going to get a list of all the possible formats that we can use to change the structure of our dates. The same thing, of course, applies to the continuous field. So if you go again, you can see we have More here as well, so you click Custom and you can change the different formats as well. Of course, any decision that you make now in the view is going to stay only in this view. If you switch to any other worksheet, you will not find what you have already formatted. This is the only disadvantage: if you make a lot of decisions in one sheet, you will not have them in the next sheets. There are more options for formatting the fields as well. For example, let's go to the order date, right-click on it, and let's choose this month as a full name. Then I'm just going to switch the columns with the rows. Now we can see that in the header we have the full name of the month. But we can go and change the format of those headers by just right-clicking on one, then going to Format. And then on the left side, we can change the display format of the header. For example, on this one, the dates, if you click on it, you will get different options, like abbreviation here. Once you click on it, you can see we now have an abbreviation of the month name. Or we can get the first letter of each month if we want.
To make it really small, we can go over here and change it to first letter. With that, we're going to get the first character of each month. Of course, those formats are not only for the month. Let's take, for example, the weekday. We're going to go over here, then switch it to weekday. Here we have the full text of the day; in order to abbreviate it, we're going to go to the left side again and switch it to abbreviation. And with that, we're going to get a short form of the weekday. So as you can see, by just clicking around, we can change and manipulate the values of the dates from our data source without writing anything, without writing any syntax or creating new calculated fields. We can just do it quickly in one view. But if you find yourself repeating the same format over and over in different sheets, I recommend that you go and create a new calculated field for it, to store it in the data source and use it whenever you need it. All right, okay, so that's it for those functions and how to format the dates. So what have we learned? How to extract a specific date part from our date field. Next we're going to talk about two functions, DATEADD and DATEDIFF. 140. Tableau | Add & Subtract Dates: DATEDIFF, DATEADD: Now we're going to learn how to add and subtract dates in Tableau using the two functions DATEADD and DATEDIFF. But as usual, let's understand the concept, then we can practice. All right, so now we're going to talk about the function DATEADD. We can use it to do mathematical operations on our date field. For example, we can add three days to our date, or we can, for example, subtract two months from our date. We can manipulate our date by adding or subtracting specific intervals. Now let's see the syntax in Tableau and take some examples in order to understand it. It starts with DATEADD as a keyword, and it needs three arguments. First, the date part that we are interested in manipulating.
The interval is how many days, how many months you want to add. Then we have the date field itself that we want to change. The output, the result, is going to be a date field. So for example, let's say that we want to add three years to our date. We specify year as the date part, then the interval is going to be three, and then our date. What's going to happen? Tableau is going to go and add three years to our date field. We are adding three years to this piece of information, the year, and the rest, the month and the day, is going to stay as it is. Let's move on. Let's say that we want to add three months instead of three years. What we're going to do is specify month as the date part, then three as the interval, then our date as well. So what's going to happen? We're going to change only this piece of information. Instead of having August, we're going to have November; we are changing only the month. The rest is going to stay as it is. Now we can move to the last one, the day. We would like to add three days. I think you already got it. So what is going to happen? We are going to add three days, so we're going to have 23 instead of 20, and it changes only at the day level; the rest is going to stay the same. With this, you can see we can add different intervals to different date parts of our date field. In our examples we were working with positive numbers, but in Tableau we can use negative numbers as well, so that we subtract intervals from the date. Let's take an example. Let's say that we want to subtract three years from our date. So we're going to have the interval here as a negative three, minus three. And in the output, instead of the year 2025, we will get 2022. Of course, we can do the same thing on the day. We would like to subtract three days from our date, so instead of having the day 20, we're going to have 17.
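The DATEADD examples above can be retraced with a short Python sketch. It is not Tableau code: `dateadd` is a hypothetical helper that mirrors the argument order of the Tableau formula `DATEADD('month', 3, [Order Date])`, it supports only day, month and year, and the end-of-month clamping (for example 31 January plus one month landing on the last day of February) is an assumption of this sketch.

```python
import calendar
from datetime import datetime, timedelta

def dateadd(part: str, interval: int, d: datetime) -> datetime:
    """Add (or, with a negative interval, subtract) whole days, months or years."""
    if part == "day":
        return d + timedelta(days=interval)
    if part == "year":
        part, interval = "month", interval * 12  # a year is twelve months
    if part == "month":
        total = d.year * 12 + (d.month - 1) + interval  # months since year 0
        year, month_index = divmod(total, 12)
        month = month_index + 1
        last_day = calendar.monthrange(year, month)[1]
        # Assumption: clamp the day when the target month is shorter.
        return d.replace(year=year, month=month, day=min(d.day, last_day))
    raise ValueError(f"unsupported date part: {part}")

order_date = datetime(2025, 8, 20)
print(dateadd("year", 3, order_date))    # year moves to 2028
print(dateadd("month", 3, order_date))   # August becomes November
print(dateadd("day", 3, order_date))     # day 20 becomes 23
print(dateadd("year", -3, order_date))   # year moves back to 2022
print(dateadd("day", -3, order_date))    # day 20 becomes 17
```

Only the chosen part changes in these examples; the other parts of the date stay as they are.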
So as you can see, we can use DATEADD in order to add new intervals, but as well to subtract intervals. It's a very important function in Tableau in order to compare things together. Like we can compare this year with the next year. So we're going to go and add one year to our field. That means we're going to get two fields, the field with the current year and the field with the next year. We will see that in the next examples. So that's it for DATEADD. Let's move on to DATEDIFF. The DATEDIFF function in Tableau has a very simple task, and that is to subtract two different dates. So for example, let's say that we have two dates, the order date and the shipping date, in our data source. So let's say that you ordered something on this date, in November 2025, and you received your order in the next year, in February. So now if I ask you how long it took to ship your products to your house, you're going to subtract those two dates in order to give me the number. This is exactly what DATEDIFF does in Tableau. So the syntax is going to look like this: DATEDIFF, then we have three pieces of information. First, which date part you would like to subtract. Then we have the start date, in this example the order date, and then the end date, the shipping date. The output is always going to be a number. As usual, we're going to have examples in order to understand it. So here we're going to ask Tableau how many years it took to deliver, to ship this product. So here we are interested in how many years, so we are interested in the year part. Then the start date is going to be the order date and the end date is going to be the shipping date. If you do that in Tableau, you're going to get one. So it took one year to ship the product. So here we are talking at the year level, you will get one. Now let's go to the next level. Let's say how many months does it take to do the shipment. So here we are specifying at the date part a month.
We have as well the same information for the start and the end date. And this time you're going to get three months. So the answer is going to be, it took three months to ship the product to the customers. All right. The next question is going to be how many days it takes to ship the product to the customers. And this time it's going to be 68. So now we are talking at the day level. So the result is going to be, it took 68 days to ship the product from the order date to the shipping date. So in this situation, it makes sense to use the day, because we always want to understand how many days exactly it took to send the product to the customers. Because if you have like a year, you're going to think it took the whole year to send the shipment. That's it. This is how this function works. It's very simple and very useful in the visualizations. Now let's go back to Tableau and start practicing those two functions. All right, now let's go and see how we can create that in Tableau. We can stay at the big data source. Let's go to the orders, and we can manipulate the order date. Let's bring it to the view over here and we're going to show the exact date. So we're going to go and switch it to exact date to see all details. And I would like to have it as discrete to see all the values inside our data source. Now it's really simple. Let's say that I would like to add one year to my order date. In order to do that, we're going to go and create a new calculated field, so we're going to call it order date plus one year. We're going to use the function DATEADD. It needs three arguments. For the date part, we are adding one year, so the date part is going to be year, the interval is going to be one, and the date that should be manipulated is the order date. It's very simple. As you can see, the calculation is valid. Let's hit OK and check the results. As you can see, we've got a new field in our data source with the data type date and time. Let's check the results.
We're going to grab it to the view, but I would like to see as well the details. I would like to see the exact date. Again, we have to switch it to discrete in order to see the results. Let's switch it to discrete now. As you can see, we have a date and time. If you want to get rid of the time, we can cast it to date. In order to do that, let's go to our Data pane. This is our field. Click on the icon of the data type and switch it from date and time to date. Let's do that. And as you can see, now the time did disappear. At the results, we see that everything is plus one year. We have here 2018, and as the result 2019. We can check other dates. If we sort this as descending, we can see that we have the value as 2022 and here we have it as 2023. That's it. This is how we can create a new field with plus one year. Let's add one month. Now let's go and edit our new calculated field. Right click, Edit, and let's change the name from year to month. Now instead of the date part year, we can have month. It's very easy to switch. And if you select Apply, now we can see that we are adding one month to the date. If I sort it again like before, you can see here we have January, and now we have it as February. We can do the same if you switch to day. If you want to add only one day, let's apply and check the results. You can see that we are adding everywhere plus one day. Of course, we can use negative numbers for the interval. Let's say we would like to have minus one day. Let's apply and check the results. As we can see in the results, the new calculated field is always one day behind the original field of the order date. This is how we can work with DATEADD. It's very simple. All right, so now we're going to go and create a new view to analyze the average days to ship per subcategory. It's really important for inventory management, optimizing operations, allocation of resources, and so on. So we can create that using DATEDIFF in Tableau.
But first let's bring a lot of data to the view in order to understand how this works. We're going to stay with the big data source. Let's go to the orders. And here we need our two dates. The first one is going to be the order date and the second is going to be the shipping date. Let's add as well the order ID at the front. Yeah, we have everything to see the results. As usual, Tableau shows it as a year. We would like to see all the details. That's why we're going to go and convert it to exact date. For the first one, we're going to set it to exact date. It might take a little bit of time because we have a lot of data, and we have it now as continuous. I would like to see all distinct values. Let's convert it to discrete and do the same thing for the shipping date. We're going to convert it as well to exact date, and then we're going to go and move it to discrete. All right, so now we have all the information that we need. We have for each order one row. Now we're going to go and create our new calculated field in order to find the difference between the order date and the shipping date. Let's go and do that. We're going to go and create a new calculated field called days to ship. And we're going to use the function DATEDIFF, and it needs three arguments. The first one is the date part. Of course, since we are saying days to ship, we are interested in the days, how many days it took to get the shipment to the users. So we can enter here day. The start date is going to be, of course, the order date. And the end date is going to be the shipping date. We have it like this, and let's check the validation. The calculation is valid, everything is fine. Let's go and hit OK. And since the output is going to be a number, Tableau did create it as a continuous measure. Let's take it and put it on our view and check the results. Let's take, for example, this order. The customer did order on December 7, and after four days, the customer did receive the shipment.
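The DATEDIFF behavior described above can be sketched in Python as well. This is only an illustration under stated assumptions: the helper name `date_diff` is made up, and the lesson only says the order was placed in November 2025 and received the following February, so the two dates below are hypothetical values picked so that the year, month, and day results line up with the 1, 3, and 68 mentioned earlier. The sketch counts how many date-part boundaries lie between the two dates, which is why the year result is 1 even though far less than a full year has passed.

```python
from datetime import date


def date_diff(date_part: str, start: date, end: date) -> int:
    """Hypothetical stand-in for DATEDIFF(date_part, start_date, end_date)."""
    if date_part == "year":
        # Number of year boundaries crossed, not full years elapsed.
        return end.year - start.year
    if date_part == "month":
        # Number of month boundaries crossed.
        return (end.year - start.year) * 12 + (end.month - start.month)
    if date_part == "day":
        return (end - start).days
    raise ValueError(f"unsupported date part: {date_part}")


order_date = date(2025, 11, 25)    # hypothetical order date
shipping_date = date(2026, 2, 1)   # hypothetical shipping date

print(date_diff("year", order_date, shipping_date))   # 1
print(date_diff("month", order_date, shipping_date))  # 3
print(date_diff("day", order_date, shipping_date))    # 68
```

The same shipment reads as one year, three months, or 68 days depending on the date part, which is why the day level is the most informative choice for a days-to-ship measure.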
With that, you can see the difference between those two dates is four days, everything looks good. Let's take another value. Maybe some recent orders, so I'm going to sort it descending by the order date. As you can see here, the customer did place an order on the last day of 2022. And after 24 days, the customer did receive the shipment. We can see here the days to ship is 24. This is how DATEDIFF works. Now we're going to go and create our visual. We want to show the average days to ship per subcategory. Now we want to get rid of all those details. We don't need them, we just need our measure. Now we need the subcategory from the products. Let's get the subcategory over here. And then we're going to take our measure and put it on the columns. But now we have it as a sum. We would like to have it as an average. Click on the measure, then go to Measure (Sum), and here we have the average. Let's switch it to that. Now we're going to add some more information. Let's add a label. And as well, let's change the colors. Let's take the average days to ship, hold Control, and then put it on the colors. Since it's a bad thing, we're going to switch the colors to red. Let's go to the colors over here, Edit Colors. Now instead of Automatic, we're going to switch it to red. All right, let's click OK. And then we're going to go and sort the list like this. Now let's go and check the data. As you can see, the worst subcategory we have in our data is the copiers. It takes a longer time to be delivered to the customers compared to the other subcategories. So now the question is, we have five years of data inside our data source. Was it always like this, that the copiers were the worst, or did something change with time? Now, in order to compare the years, we can add the years to the view. We have already the year prepared from the last time. So we have the order date year. Let's just bring it to the view, to the columns.
Now if you check the data, it's very interesting. If you focus on the copiers again, you can see that in 2018 and 2019 the performance was really good. It was even one of the best performances in 2019, it gets this light red, but something changed in 2020. From 2020 onward, you can see it's always dark red. There is like a change, maybe in the resources or in the inventory management. We can see it is one of the worst performances compared to the other subcategories. With that, you can compare the years together as well to understand whether it was always like this or something changed. As you can see, using the visualizations, the coloring, and as well those functions that we have in Tableau to manipulate the dates, we can uncover those trends inside our data. Maybe it's really hard to find them in the raw data, right? But if you bring everything with colors into the visualizations, it's going to be really easy to detect. So this is exactly the power of visualizations and those functions. All right everyone. So with that we have learned how to add and subtract dates in Tableau. Next we're going to talk about two functions, TODAY and NOW. 141. Tableau | TODAY & NOW: Now we're going to learn about two cool functions in Tableau, TODAY and NOW, in order to get the current date or the current date and time. Let's go. All right guys, one of the very famous use cases of the TODAY function in Tableau is to make something like this. You can highlight the current date in the visualizations. So we can see here like a separator in the visualization at the current date of today. And with that you can draw the attention of the users by highlighting one of those parts. Now let's go and understand quickly what the TODAY function is. All right, so we have those two functions, TODAY and NOW. They are the easiest and the simplest functions in Tableau that will not manipulate or transform anything. There is no concept behind them.
They will just deliver for you the current date and time information as you execute them. So for example, we have the first one, TODAY. It does not need any argument. As you can see, it's very simple. The output is going to be a date. So you will get the current date information. Now, as I'm recording, we are at the end of May 2023. But if you are interested to have as well the time information, you have to execute NOW, with no argument inside it. You will get the date and time. So as I'm recording, it is 06:00 P.M., 10 minutes and 40 seconds. So that is it about the two functions. Let's go back to Tableau and start practicing to see when you use them. All right, so now we're going to see how we can use the TODAY function in our visualization. So the first thing is to create the calculated field. So let's go and create a new one. And we call it today, then we need the function that's called TODAY as well. As you can see, it's very easy. We don't need to add anything else. And by the way, this is always the first calculation that I create in each new data source without knowing the requirement or anything. I just go and create this one because I'm sure that I will end up using this function. So it's really one of the first things that I usually do for each new data source. Let's go and hit OK. Everything is fine. As you can see, we got it on the left side as a new dimension with the data type date. Let's check the current information, so we can bring it into the view. Tableau converts it to a year. So I always have to switch it to exact date and then to discrete in order to see the value. And as you can see, we are at the end of May 2023. So now it's very interesting in which year you are now checking the video and following me in those steps. Okay, so this is how you can create the TODAY function in Tableau.
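For comparison, the same two ideas look like this in Python's standard library. This is only an analogy, not Tableau code: `date.today()` plays the role of TODAY, `datetime.now()` plays the role of NOW, and the `split_at` helper is a made-up name that mimics the "separator at the current date" use case mentioned at the start of this lesson.

```python
from datetime import date, datetime


def today() -> date:
    """Current date only, comparable to TODAY()."""
    return date.today()


def now() -> datetime:
    """Current date and time, comparable to NOW()."""
    return datetime.now()


def split_at(reference, dates):
    """Hypothetical helper: split dates into (past or current, future)."""
    past = [d for d in dates if d <= reference]
    future = [d for d in dates if d > reference]
    return past, future


print(today())  # a date without a time part
print(now())    # a date with a time part

# A fixed reference date keeps this example reproducible.
past, future = split_at(date(2023, 5, 31),
                        [date(2023, 1, 15), date(2023, 12, 1)])
print(past)     # [datetime.date(2023, 1, 15)]
print(future)   # [datetime.date(2023, 12, 1)]
```

Both functions take no arguments, and the values change every time they are evaluated, which is what makes them useful as a moving reference point.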
Now we're going to use it in a reference line in one view in order to show you how powerful this function is, and we can create a view about the number of orders over the shipping date. Let's go and create it. I'm going to remove the today field from here. And then we can add the shipping date from the orders to the columns. Then let's take the number of orders, the orders count. Let's take it to the rows. Now, instead of having the years, I would like to have months. I'm going to do now a quick format. Let's go to the field and then we're going to go and pick this one, month. Let's click on it, and the visualization type looks good as well. Now let's go and create a new reference line. In order to do that, we're going to go to the axis over here, right click on it. And then we have here the option to add a reference line. Here, the most important thing to customize is the value of the reference line. I would like to have the value of today as a reference line to indicate the current information, the current date. But if we go to the values over here, you will see that I can either create a new parameter or I can use only the shipping date. And that's because our new field today is not yet in the visual, so we have to add it to the visual. In order to do that, we can close this first. Then we take the today field and drag and drop it on the details. But we are not there yet, because Tableau did convert it to a year, and I would like to have in the reference line the exact date of today. In order to do that, we're going to convert it to exact date. Right click on it, and we have here the option exact date. This is the requirement to add it in the reference line. Let's go and add again the reference line. And we go to the values. Let's check. Yeah, we got the today value, let's select it. And then hit OK. So now here on the right side we got a very nice reference line indicating the date of today. But still there's like a problem, right?
Because all of the data is behind the reference line, because the data is a little bit old. Now, in order to make it more interesting, I'm going to add two years to the shipping date to make the visual look better. In order to do that, as we learned before, we're going to go and create a new calculated field. Let's call it shipping date plus two years. Here we can add DATEADD. First, we need the date part. So we are saying plus two years, we are talking about years. The interval is going to be two and the date is going to be the shipping date. All right, with that we are done, the calculation is valid. Let's click OK. So we have it now on the left side. And what we're going to do, we can replace the old value with it. Let's just remove the shipping date and put the new one in its place. We're going to do the same steps, so we're going to convert it again to month. Let's do that now. As you can see, we have values for 2024 and 2025. Let's add again the reference line. Right click on the axis, add reference line. Let's go to the values. Let's select today. Now we've got a very nice cut in our visual in between our data to show the past, today, and the future. Now we can go and add a few customizations just to make it look better. For example, as you can see, we have a label over here for the reference line. It says minimum Today. I would like to show immediately the value of the current date. In order to do that, right click on the line and then go to Edit. Then change the label over here. Instead of the computation, let's change it to the value. With that, as you can see on the right side, we get immediately the current value of today. The next step, I would like to add some coloring to the reference line. Right click on the reference line and let's go to Format. Then we have here three things to customize. The first one is the line itself. Then fill above, that means all the information on the right side. Fill below is going to be all the information on the left side.
For example, let's start with the line. I would like to have it dotted and as well red. The opacity, I'm just going to make it 100. Now the next value is going to be the fill above. I would like to highlight it with green. Let's go and pick the color green over here. And then the next one is going to be the fill below. You can leave it white or you can make it gray in order to show this is history. With that, as you can see, the visual looks more professional. So we are highlighting the future, and the history is like grayed out. So that's it. With a small function in Tableau, like the TODAY function, you can create amazing dashboards and visuals for your users. And this is one of the most common use cases of the TODAY function in Tableau, to highlight the data. Okay everyone. So that's it for the TODAY and NOW functions. With that, we have learned all the use cases for the date functions in Tableau. We have covered around ten functions in Tableau. Next we're going to jump to the next group, where we learn about the null functions. 142. Tableau | NULL Functions: ZN, IFNULL, ISNULL: Now we're going to focus on another group of functions under the category row level calculations, the null functions. The main purpose of the null functions in Tableau is to handle and manipulate the missing values in our data, the nulls. We can have missing values everywhere, in text, dates, numbers. Any field in our data source can have missing values. Why handle the missing values? Handling the nulls is a very important step in the analysis. And that's because of two things. First, the calculation accuracy. Null values can affect the calculations and the aggregations in the results. If we have null values in our data and we ignore them, we don't do anything about it, what can happen? We can have incorrect calculations and corrupt results. The second reason is to improve the data quality and to achieve completeness.
Identifying the data gaps that come from wrong data entry and from issues in the data collection can help the overall data quality and can improve as well the completeness of the data visualizations. That's why the null functions in Tableau are very important to have accurate and correct analysis in the data visualizations. As usual, let's understand the concept, then we can practice. Let's go. Let's go and understand those three functions, ZN, IFNULL, and ISNULL, in order to handle our missing values. As usual, we're going to go with an example because it is the best way to understand those functions. All right, so now we're going to have four customers and their sales. As you can see, only Maria has a missing value in the sales. We have here a null. In order to handle this null, we have the first function in Tableau, ZN. It stands for zero nulls. It replaces the null values with zero. It's very simple. If you use now the ZN function for the sales, for the first value we will not change anything, right? We will get exactly the same value. But for the next one, since it's a null, it's going to replace it automatically with zero. For the next two customers, we will get the exact values because they are not nulls. So as you can see, very simple, we are just replacing the null values with a zero. So this is a very quick way to replace the nulls. But here the problem is we have no control over what we are replacing them with. So here we cannot specify something else. We will always get a zero. In order to specify our own value, we can use the second function that we have in Tableau, IFNULL. It replaces the null value with a specific value from us. If you use this function on the sales, it has the following syntax. It needs two arguments, the value that we want to manipulate and the replacement value that we specify. In this example, I'm going to specify it as zero.
It doesn't make sense because we could use ZN, but it's just to show you that we're going to get the same results, so you can go over here and put anything you want. So for the first customer, we're going to get exactly the same result. For the second customer, we're going to get again zero because we specified that, we have the control over that. And then for the last two customers, we're going to get the exact results. And here the output is a number because the field that we want to manipulate is a number. But let's say that we take another field which is a string. The output is going to be as well a string. Here is exactly the difference between ZN and IFNULL. ZN accepts only numbers, but IFNULL accepts any field from your data source. For example, let's say that we have the countries. John has no value in the country. Same for Martin. We have information inside the field country only for Maria and George. Here we cannot go and use the ZN function because it's not a number, it's a string. In order to manipulate those values or to replace the null values, we're going to go and use IFNULL. The syntax is going to look like this: IFNULL, country, then we have N/A, the abbreviation of not applicable. The output here is going to be a string value. For the first customer, we're going to replace the null with N/A. The next one is going to stay the same because there is nothing to replace. For the third one we're going to get as well not applicable, and for the last one we will get France, so nothing is changed. This is exactly the difference between the IFNULL function and the ZN function in Tableau. Now we're going to go to the last function, ISNULL. Sometimes we might be in a situation where we want to check whether the field has null values or not. So we don't want to take any action yet, we are just checking, right? ISNULL in Tableau is going to return true if the value is null and false otherwise.
That means if there is no value, if we have a missing value, we get true. If there is a value, we will get false. So the output of this function is going to be of the data type Boolean, with only two values, either true or false. So let's check the example and the syntax in Tableau. It's going to accept only one argument, the country, and that's it. So the question for the first customer: is it a null? Yes, it's null, so that's why we're going to get true. For the next customer, is it a null in the country? Well, no, so we're going to get false. The same for the third one, we're going to get true. And for the last one we're going to get false because we have a value in the country. So that's it for ISNULL. So we have three functions, three tools to manipulate or to check the null values inside our fields. And they are really useful to improve the quality and the completeness of your visualizations. So now let's go to Tableau and start practicing them. This time we're going to go to the small data source. Let's check the order information. So we're going to take the order ID, and we're going to take this time the profit. Drag and drop the profit over the Abc to see the values. Now if you check our data, you can see that order seven doesn't have any profit information. And as well order ten doesn't have anything. We have here missing data, we have nulls. Now let's do something about it and fix it. Instead of having null, we want to have zero. Here we have two functions to do it. Let's start with the first one, the zero nulls. Now we're going to fix it and create a new calculated field. We're going to call it profit ZN. The syntax starts with the function ZN and it needs only one argument, the field that we need to fix. It's going to be the profit. With that, we are changing all the null values to zero. Again, in this function, we don't have control to change the value to something else. It's always going to be zero. The calculation is valid, everything is nice.
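The three null functions covered so far can be mirrored in a few lines of Python, using `None` in place of a Tableau null. This is a sketch for intuition only: the lowercase helper names are made up, the sales figures and Maria's country are hypothetical placeholders (the lesson only says which values are missing and that George's country is France), and the customer order follows the walkthrough above.

```python
def zn(value):
    """Like ZN: replace a missing number with zero."""
    return 0 if value is None else value


def ifnull(value, replacement):
    """Like IFNULL: replace a missing value with a value we choose."""
    return replacement if value is None else value


def isnull(value):
    """Like ISNULL: only check, never replace."""
    return value is None


# Hypothetical sample data; None marks the missing values.
customers = ["John", "Maria", "Martin", "George"]
sales = [100, None, 250, 300]
countries = [None, "Germany", None, "France"]

print([zn(s) for s in sales])                # [100, 0, 250, 300]
print([ifnull(s, 0) for s in sales])         # [100, 0, 250, 300]
print([ifnull(c, "N/A") for c in countries]) # ['N/A', 'Germany', 'N/A', 'France']
print([isnull(c) for c in countries])        # [True, False, True, False]
```

`zn` and `ifnull(..., 0)` give the same result on numbers, while `ifnull` also works on the string field, which is the difference pointed out above.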
Let's click OK. And as usual, we're going to get a new measure, since the output is going to be as well the profit information. Drag and drop this new information to the view, and now we can see in the results, all those values are going to stay the same. We are manipulating only the nulls. We are replacing the nulls with zero here. As well, for order number ten we had null, now we have a zero. It's an easy and quick fix. All right, so now you might say, you know what, why are we making all this effort to replace those missing values with zero? So what is the big deal? I could just leave it as a null and the users might accept it. Why are we doing this? Well, it's not only that the visual is going to be better, but also having missing values is going to bring wrong and inaccurate aggregations. Let me show you what I mean. Let's just remove the order ID. Now you can say, okay, we got the same numbers, right? We got the same aggregation. So everything is accurate and fine. Well, not exactly. This is only for the sum. Now let's go and switch them both to the average. We're going to go over here and switch it to average, and we're going to do the same for the corrected one. Now I'm going to just make the headers wider to see the values. As you can see, now we are getting different values. With the ZN function, we got a different average from the original data. And that's because in the original average we are not counting the orders with the missing values. With the ZN, we are counting now the orders with the missing values. That means by replacing the missing values with zeros, we will get accurate results for the average in the aggregations compared to the old one. That's exactly why we go and replace the missing values with zeros, especially for aggregations and calculations. All right, that's why we do it. Now let's go and try another function. We can use IFNULL in order to replace the null values with zeros. And now I'm going to just bring the order ID to the view, to see all the orders.
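The effect on the sum and the average can be reproduced with a small Python sketch. The ten profit values below are hypothetical (the lesson only tells us that orders seven and ten are missing), and the helper names are made up. The assumption being illustrated is the one stated above: the aggregation skips missing values, so they count neither toward the total nor toward the number of rows.

```python
def zn(value):
    """Like ZN: replace a missing number with zero."""
    return 0 if value is None else value


def sum_ignoring_nulls(values):
    return sum(v for v in values if v is not None)


def avg_ignoring_nulls(values):
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


# Hypothetical profits for orders 1..10; orders 7 and 10 are missing.
profits = [10, 20, 30, 40, 50, 60, None, 70, 80, None]
profits_zn = [zn(p) for p in profits]

print(sum_ignoring_nulls(profits))     # 360
print(sum_ignoring_nulls(profits_zn))  # 360  (same sum)
print(avg_ignoring_nulls(profits))     # 45.0 (divided by 8 orders)
print(avg_ignoring_nulls(profits_zn))  # 36.0 (divided by all 10 orders)
```

The sum is identical either way, but the average changes because the zero-filled version divides by all ten orders instead of the eight that have a value.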
Let's go and create the new calculated field. And we're going to call it profit IFNULL. And the syntax starts with IFNULL. And it needs two pieces of information. The first one is going to be the field that we want to manipulate, so it's going to be the profit again. For the next one, we have to specify which value replaces the null. In this example, we're going to stay with zero. The calculation is valid. Let's hit OK, and we got again our new calculated field. Let's bring it to the view and check the results. As you can see, it is identical to the ZN. For order number seven, instead of null, we got zero. The same for ten, we got as well zero. In this situation, if we want to replace it with zeros, I would go with the ZN since it's just faster to write. Now let's move to the next scenario. We want to replace the nulls with the value one. This time we cannot use the ZN because it automatically converts them to zero. We're going to stick with IFNULL. Let's go and edit our calculation. Instead of zero, here we can specify one. Let's go and hit OK. Now we can see instead of having zero, we have the value one. Instead of null we have one. This is the advantage of IFNULL. We can control which value is going to be the replacement for the null. All right, the next advantage of IFNULL is that we can replace not only number values, we can replace as well any other data type. Let's take an example. We're going to go to the customers and let's get the customer email to the view. As you can see here, we have some nulls. We don't have the emails from all customers. But now the task is to replace those nulls with Unknown. Let's go and create a new calculated field in order to replace those values. Let's call it customer email IFNULL, and the syntax is again IFNULL. It accepts two arguments. The field that we want to manipulate, it's going to be the customer email, this one over here. Which value are we going to use in order to replace the nulls?
It's going to be Unknown. That's it, the calculation is valid, so we can replace all the nulls with this value. Let's go and hit OK. We have again here a new dimension in our data source. Let's grab it to the view and check the values. Now if you just compare those two columns, you can see instead of null, we are getting Unknown, the same here, and the third one over here. And the others will not be affected because we have a value inside the field. As you can see, it's a really nice and quick way to replace those bad nulls in the view. That's all for IFNULL. Now let's check the last one we have, ISNULL. ISNULL will not replace the values with anything. It's just to check whether there is a null or not. Let's say that we want to check whether in the field profit we have any nulls. In order to do that, we're going to go and create again a new calculated field. Let's call it profit is null, and the syntax for that is very easy: ISNULL, and it accepts only one argument. It's going to be the field that we want to check. So we are checking the field profit. The calculation is valid and that's it. It's really simple. We are checking whether this field has any nulls inside it. The output is going to be either true or false. It's going to be a Boolean. Let's hit OK. And as you can see, on the left side we have a new field with the data type Boolean because we have only true and false. Let's drag it and put it on the view over here. And here we can see quickly that all those orders are false because we have a value inside the profit, but here we have a null, that's why we are getting true. And here again we have a true. That means we can check immediately whether we have nulls inside our data or not. So let's go and show it as a filter. This is what I usually do. If I see there is true, I'm interested to see those values, so I can see, all right, we have two orders where we have nulls inside the field profit.
This is a really quick way to check whether we have any problems, any nulls inside our fields, in order to make a plan for what we can do about it. But here in the small data source, it's really easy to see all the orders in the visual, we have only ten orders. But imagine you have thousands or millions of orders inside your data. In the visual it can be really hard to see. Let's take an example in the big data source. We're going to go over here. Take again the order ID, and as well let's check this time the sales. Drag and drop it in the view. As you can see, it's really hard to check now in the view whether we have nulls or not. Instead of that we can do a check. We're going to go and create a new calculated field. Let's call it sales is null. We can use the function ISNULL. This time the field is going to be sales. We are checking the sales. Let's go and hit OK, and now we're going to show this field as a filter. Now in the filter, we can see immediately that we have only one value, False, so we don't have True. That means we don't have any nulls inside our data. So this is a very quick check inside our data to see whether there are nulls, instead of just scrolling down and checking all the orders. That's why we need the ISNULL function. So with that, we've covered all the three functions that deal with and handle the nulls. This is very important to improve the quality of your visualizations and to bring accurate data into the aggregations. All right, so with that, we have covered everything about how to handle the missing values, the nulls, in Tableau. Next we're going to move to another group of functions, the logical functions. 143. Tableau | Logical Functions: IF, ELSE, ELSEIF, IIF, CASEWHEN: Now we're going to talk about the last group of functions under the category row level calculations in Tableau. We have the logical functions. The main purpose of the logical functions in Tableau is to make logical decisions based on conditions. Here we have two use cases.
The first group is the conditional operations. Here we have IF, CASE WHEN, and so on. The main focus here is to create conditional logic and make decisions based on those conditions in order to manipulate the data. The second group is the logical operators. Here we have three operators, AND, OR, and NOT, and the main purpose of this group is to evaluate and combine multiple conditions in Tableau. Now let's focus on the first group, the conditional operations. As usual, first we have to understand the concept behind them, then we can practice in Tableau. Let's go. All right everyone. So now we're going to do a deep dive into those logical functions in order to understand how they work and how they're executed. We're going to start with the simplest form of the IF statement, where we have only one condition. In this example, the condition is: if the sales is higher than 1,000, then we want the value High, and then END. Now let's see the flow chart of how this is executed. We start by checking the condition. Here we always have two ways, either false or true. If the condition is fulfilled, if the sales is higher than 1,000, then we go down this path where we get the value High, and then everything ends. On the other path, if the sales is not higher than 1,000, then it's false, and we skip everything. That means nothing happens. Let's take the following example. Let's say that the sales has the value 1,200. First we check the condition: is the sales higher than 1,000? Well, yes, it's true. What happens? We execute the High and it ends. And if you look at the chart over here, first we ask the question, is the sales higher than 1,000? The answer is true, so we take the green path, this one, where we execute the High. Let's take another example where the sales equals 700.
So we start over here again. We ask the question: is the sales higher than 1,000? This time it's not true, so it does not fulfill the condition, and we go down the path on the right side. What happens? Nothing happens. The High value is not returned, and in the output we get the value null, because there is nothing to execute. It's really simple, right? You are always asking a question that can be answered with yes or no, true or false, so each condition always has two paths. This is the simplest form of the IF statement. Let's move to the next level, where we have IF ELSE statements. We're going to stay with the same condition. If it is fulfilled, then we get the value High. But let's say this time, if it is not fulfilled, if it is false, I would like to get a value instead of null. Here we can add the keyword ELSE. Between IF and END we add an ELSE statement to say: okay, if it is not fulfilled, give me the value Low. Let's check how the flow chart looks. We start by checking the condition. If it is true, we take the first path and get the value High. But if it is not true, this time, instead of jumping immediately to the end, I would like to get a value using the ELSE. That means the output of the IF ELSE statement is always a value, either High or Low. We will never get a null. Let's take an example. Let's say that the sales is 1,200. It fulfills our condition, so we get the value High and the program ends. On the right side it's the same thing: we check the condition, and since it is true, we get the value High and the program ends. The output is the value High. Here it's like the last one. But now if the sales equals 700, the condition is not fulfilled, and instead of jumping immediately to the end, it jumps to the ELSE statement.
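The two flow charts described so far (IF on its own, and IF with ELSE) can be traced with a small Python sketch. This is only an analogy to the Tableau calculation, not Tableau syntax: the function names are made up for illustration and `None` stands in for a Tableau null. The Tableau form is shown in the comments.

```python
def sales_if(sales):
    # Tableau: IF [Sales] > 1000 THEN "High" END
    if sales > 1000:
        return "High"
    # No ELSE branch: nothing is executed, so the result is null.
    return None


def sales_if_else(sales):
    # Tableau: IF [Sales] > 1000 THEN "High" ELSE "Low" END
    if sales > 1000:
        return "High"
    else:
        # ELSE has no condition of its own; it is the only remaining path.
        return "Low"


for sales in (1200, 700):
    print(sales, sales_if(sales), sales_if_else(sales))
```

With the two sample values from the lesson, 1,200 takes the true path in both versions, while 700 ends up as a null in the first version and as Low in the second.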
So now let's check another value, where the sales equals 700. The condition is not fulfilled. It fails because the sales is not higher than 1,000. So what happens this time? We execute the ELSE statement. We don't jump immediately to the end; we go to the ELSE and execute it. In the chart, we checked the condition and took the right path, where it is false. Once we are at the ELSE statement, it's not like the IF: here we don't have any condition. We have only one path, so we return Low and the program exits. So what happens? We just get the value Low and we end. The output is the Low value instead of a null. So ELSE is always executed if the conditions are not fulfilled. That's it for the ELSE statement, it's very simple. Now we're going to go to the next level, where we want to add multiple conditions to our statements. All right, so now we're going to talk about the ELSEIF statement. We can use it to add multiple conditions to our statements. So far, in the previous examples, we worked with only one condition. We are checking whether the sales is higher than 1,000, and if we are using the IF ELSE statement, we get either High or Low. Let's say that we want to introduce another condition in our statement to get the value Medium. So now we would like to add a new condition between IF and ELSE, right after the IF statement. But we cannot use IF again as a keyword. To add anything after the IF, we use the ELSEIF statement, which adds more conditions. For example, we can add the following condition in between: ELSEIF the sales is higher than 500, then we get the value Medium. That means in the whole statement we can have one and only one ELSE, but we can have multiple ELSEIFs in between if we want to add multiple conditions. Now let's see how the workflow looks.
We start as usual with the first condition in the statement. If it is true, what happens? We get the value High and everything ends. If that first condition is not fulfilled, we jump to the next condition, in the ELSEIF. Here we check whether the sales is higher than 500, and again we have two ways out of this. Either it's true, the condition is fulfilled, so we get the value Medium and then it ends. Or, if this condition is not fulfilled either, we go and execute the ELSE statement. As usual, the ELSE statement does not have any condition. It just returns the value and ends. Let's see a few examples in order to understand how this works. The first one is the sales equal to 1,200. We are checking the IF condition. As you can see, it's fulfilled, so we get the value High and that's it. What happens next? We just skip everything else and go to the end. If we look at the workflow, we check the first condition and take this path. Everything else is ignored and will not be executed. We just get the value High at the output. All right, now let's take another value, the sales equal to 700. We are at the first condition. It fails, so we do not get the High value. Instead, we jump to the next statement, the ELSEIF. So we are now on the right path, and the true path is deactivated. Here we have another check: is the sales higher than 500? Well, this time it's fulfilled. So what happens? We get the value Medium and then the program skips to the end. With that, we are on this path where we get the value Medium as the output. So this means again that the ELSE statement will not be executed. All right, moving on to the next example, where the sales equals 350.
Again, we are at the first check. 350 is not higher than 1,000, so this one fails. Then we jump to the next one to check whether it fulfills this condition. The sales here is not higher than 500 either, so this fails as well. Since both of them fail, what happens? We go to the default. The default value is the ELSE, so this jumps to the ELSE, it is executed, and we get the Low value from our statement. Let's check the right side, the workflow. As you can see, we are at the first condition and it failed. We go to the second one, and it failed as well. Then we go to the last option that we have, the ELSE statement, and we get the value Low. That's all about the ELSEIF statement. If you have a third condition, you can just add it after the ELSEIF or before it. With that, you can add multiple conditions to your statements. Understanding the logical workflow behind those statements is very important for understanding those functions. All you are doing here is evaluating different conditions, and based on the evaluations you get different values in the output. In this example, we have three possible values: High, Medium, and Low. All right, the CASE WHEN statement is very similar to the IF statement. Here we evaluate multiple logical conditions as well, and based on our evaluation we get an output value. Let's take an example in order to understand the syntax. It always starts with CASE, then the field that we want to evaluate. Here we're going to evaluate the values inside the Country. The first condition looks like this: we write WHEN, then the value. If the value inside the Country is Germany, then the output is DE. Here we are trying to produce abbreviations of the countries in the output. Now we're going to add another condition for another value. Inside this dimension, we can evaluate the value France.
If it is equal to France, then the output is FR. Then, moving on to the next condition, we can evaluate the USA value inside this dimension. If it is equal to this value, then the output should be US. As you can see, using the CASE WHEN we are evaluating the members, the values, of a dimension. In each of those conditions we are evaluating a scenario: what happens if the value of the country is Germany, and so on. So far we have three conditions. Let's say you are done and you would like to have a default value if none of those conditions is fulfilled. If the value of the country does not fulfill any of those three conditions, what happens? We execute the ELSE statement, and at the end we have an END as well. As you can see, it's really easy to read and easy to write as well. All right, now let's take an example in order to understand how the execution is done. Let's say that we have the value Germany inside the country. The code is executed from top to bottom. That means we first evaluate the first condition: WHEN Germany THEN DE. As the values match, we get the value DE at the output, and the code skips everything else, so we do not check France, USA, and so on. The code goes to the end, and as output we get DE. It is very similar to the IF ELSE statement, right? Let's take another example, where we have France in the country. Here we move from top to bottom again. The first condition is checked: WHEN Germany THEN DE. This time we don't have a match. Here we have France and here Germany, so it fails. We get false. What happens then? We jump to the next condition to evaluate the next value. Here we check WHEN the value is France THEN FR. This time we have a match, so we get true. With that, the application skips the other conditions and goes to the end.
That means in the result we see FR. Now let's move to the last example, where we evaluate the value Spain in the country. What happens? Again, top down. This time none of those conditions is fulfilled. From the first one we jump to the second because it is false, and from the second to the third because it is false as well. The third is false too, which means we go and execute the ELSE. ELSE is executed if none of the conditions is fulfilled, so in the output we get N/A, not applicable. It's very similar to the IF ELSE statement. Now we're going to compare all of those side by side. We're going to compare three functions: the IF statement, IIF, and CASE WHEN. I know that we didn't talk about IIF yet, but we're going to check the syntax now in order to understand the differences between it and the IF statement. Let's start with the first one, the IF statement. In the syntax here we have multiple conditions, two of them: IF sales is higher than 1,000 THEN High, ELSEIF sales is higher than 500 THEN Medium, ELSE Low, END. With that, we are evaluating multiple conditions in one statement. Now let's move to the next one, the IIF. IIF is very similar to the IF ELSE statement. We get the same output, but we write it in a different and easier syntax. Let's see the syntax. As you can see, it's very small. It starts with IIF, then the condition itself, so the sales higher than 1,000. Then we have two outputs, depending on whether it's true or false. The first one is for true: if the condition is fulfilled, we get the High value. The second one is for false: if the condition is not fulfilled, we get the Low value. Compared to the IF ELSE statement, it is easier to write and shorter as well. Here we don't have keywords like ELSE, and at the end we don't have the keyword END. It's really short and quick to create. But of course, we can evaluate only one condition.
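To see the three forms next to each other, here is a hedged Python sketch of each one, using the thresholds and country values from the lesson. The function names, the dictionary, and the spelling "N/A" for the not-applicable value are assumptions made for this illustration. The Tableau expression each one imitates is given in a comment.

```python
def sales_level(sales):
    # Tableau: IF [Sales] > 1000 THEN "High"
    #          ELSEIF [Sales] > 500 THEN "Medium"
    #          ELSE "Low" END
    if sales > 1000:
        return "High"      # first match wins, the rest is skipped
    elif sales > 500:
        return "Medium"    # only checked when the first condition failed
    else:
        return "Low"       # default when every condition failed


def iif(test, then_value, else_value):
    # Tableau: IIF([Sales] > 1000, "High", "Low")
    # One test, one value for true, one value for false.
    return then_value if test else else_value


# Tableau: CASE [Country] WHEN "Germany" THEN "DE" WHEN "France" THEN "FR"
#          WHEN "USA" THEN "US" ELSE "N/A" END
COUNTRY_CODES = {"Germany": "DE", "France": "FR", "USA": "US"}


def country_code(country):
    # The dictionary lookup plays the role of the WHEN branches,
    # the default argument plays the role of ELSE.
    return COUNTRY_CODES.get(country, "N/A")


print([sales_level(s) for s in (1200, 700, 350)])
print(iif(700 > 1000, "High", "Low"))
print([country_code(c) for c in ("Germany", "France", "Spain")])
```

The lookup table is used here because a CASE WHEN compares one field against a list of fixed values, which is exactly what a dictionary does; the IF chain, by contrast, can test any condition on each branch.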
Now we can move to the CASE WHEN. As we learned before, it evaluates the values, the members, of a dimension. Here we evaluate the Country. Then we have multiple conditions. If none of them is fulfilled, we go to the ELSE statement, and then we have an END. Now let's learn the main differences between them. The first one is whether it supports multiple conditions. As you can see, in the IF ELSE statement we can add as many conditions as we want. It supports multiple conditions. The IIF supports only one condition. The CASE WHEN supports multiple conditions as well. Now let's move to the next one: whether it supports multiple fields. The IF ELSE statement can support multiple fields, so in the conditions we can have not only the sales but something else as well, like the country. The same goes for the IIF: it supports multiple fields as well. But the CASE WHEN supports only one dimension. We cannot evaluate multiple dimensions in the same CASE WHEN statement. Here we are talking only about the country, and we cannot add any other fields inside this statement. So here we have a limitation of the CASE WHEN statement compared to the other two. Now let's talk about the supported data types. The IF ELSE statement and the IIF both support any data type. That's why I said they can evaluate multiple fields: it could be a dimension, a measure, any field that you have in your data source. It can be evaluated inside those conditions. But with the CASE WHEN we have another limitation. We can evaluate only string values, only dimensions. Here we cannot evaluate, for example, the sales or profit or quantity, any measure. We cannot use it inside the CASE WHEN statement; it has to be a string. We cannot even use, for example, a date like the order date. The field should be a string value. Now let's check the main advantage of each method.
The first one, the IF ELSE statement: as you can see, we don't have any limitation. For the IIF, the advantage is that it's easy and quick to write. For the CASE WHEN, we again have the advantage that it's easy to write and to read. If you look at the CASE WHEN statement and at the IF ELSE statement, you can see that the CASE WHEN is organized and easy to read. It has a flow, compared to the IF ELSE. There we have a lot of different keywords, and it's not as easy as the CASE WHEN. My recommendation for you: if you are evaluating only one condition with an output of two values, then always use IIF. It's very quick to create. But if you have multiple conditions to evaluate, then think about the CASE WHEN. Is the data type string? Are you evaluating only one field? If that's the case, then use CASE WHEN. It's easier to read and to write as well. But if you are working with multiple fields and not only string values, then you have to go to the IF ELSE statement. Always start with the IIF, then the CASE WHEN, and then, if you don't have any other option, go to the IF ELSE statement. All right, so that's all about those methods. We're going to go now and practice in Tableau. All right. Let's go to the small data source. We're going to go to our customers. Let's drag the First Name to the view, and the Country information as well. Now the task is to create country abbreviations, shortcuts of the original values that we have inside the Country. In order to do that, we can use the IF ELSE statement, and we're going to do it step by step. Let's first create a new calculated field. Let's call it Country IF. Now we're going to use the keyword IF. After that we have to specify our condition. The first condition is: if the country equals Germany, then the abbreviation is DE. Let's create that. IF the field Country equals the value Germany. Make sure to write it exactly like ours, capitalized, because Tableau here is case sensitive.
Now what happens if the country equals Germany? We would like to see the word DE in the output. If it is true, we get DE. If it's not true, then, as in the first example, we just exit. We don't have any ELSE statement or any other condition, since this is the simplest form of the IF statement. Let's hit OK. As usual, we get a discrete dimension in the data pane with the data type string. Because the output is a string, we have the abbreviations. Let's drag and drop it on our view to see the values. All right, so now let's check the values. For the first customer, you can see that the value is not equal to Germany. It is not fulfilling the requirement, so we get null. The same thing for John: USA is not fulfilling the requirement, so we get null as well. For the next two customers, you see they fulfill the condition, which is why we get the value DE for both of them. For the last customer, Peter, you can see the value is not fulfilling the condition, so we get null. As you can see, we are getting only one value; otherwise it's null. All right guys, now let's go to the next step. I would like to get rid of those nulls. I want to see a real value in the visualization: if the condition is not fulfilled, I want to see the value not applicable, N/A. In order to do that, we have to use the ELSE statement in our calculation. Now let's go to our field, and instead of changing the calculation inside this field, I would like to duplicate it and make a new one. Let's duplicate it and then edit the new one. I'm just going to call it IF ELSE. We have the same condition again: if the country equals Germany, we get DE. Otherwise we will not just skip; we add the ELSE statement. It always goes before the END.
After that, we don't add any condition. We just have to add the value, the value to return if the condition is not met: not applicable, N/A. That's it. That means if it's true we get DE, else we get not applicable. Let's click OK, and let's check the values in the view as well. I'll just make it a little bit bigger to see the information. Now as you can see, instead of nulls, we now have a value, which is really better for the visualization and for the user experience as well. Nulls are always ugly in the views. And with that, we control which value is presented to the end users if the conditions are not fulfilled. So now, as I recommended before, if you have only one condition where the output is only two values, then the best way is to use IIF. Let's go and create it. We're going to create a new calculated field and call it Country IIF. Let's see the syntax. It starts with the keyword IIF. As you can see, it needs three arguments. The first is the test, which is the condition. The second argument specifies what happens if the condition is fulfilled. The third one specifies what happens if the condition is not fulfilled. The condition is: Country equals Germany. What happens if this is true? Then we get DE. The next step is to define what happens if the condition is not fulfilled, when the country is not Germany. It's going to be N/A. As you can see, it's very quick to create such a condition compared to the IF ELSE and so on. So this is the quickest way to create such a condition. Let's hit OK and check the results. With that, again, we get a new dimension. Let's drag and drop it over here on the view to check the results. I'm just going to make it a little bit bigger. As you can see.
We get exactly the same result as with the IF ELSE statement. The first two countries are not fulfilling the condition, so we get the text N/A. Two customers are from Germany, so we get DE, and the last customer is not from Germany, so we get N/A. This is the magic of the IIF. Not a lot of people use it, actually. It's not that common, but it is a very nice way to quickly create conditions in Tableau. I totally recommend using it. All right guys, now we're going to move one more step, where we add another condition. So we don't have only one; we can have multiple conditions. That's why we cannot use the IIF. We have to go back to the IF ELSE statement. Let's see how we can create it. I'm going to duplicate one of those fields again. Let's do that, and then let's edit it and give it a new name. We stay with the same information: first we are checking Germany, so this is the first condition, and the ELSE is N/A. Now we add a new line between the IF and the ELSE, and we add a new condition using the keyword ELSEIF. Just like in the IF statement, we can write our condition. If the country this time equals, let's say, France, then what happens? We get the abbreviation FR. That's it, we have added our second condition. As usual, the execution goes from top to bottom. The first condition to be checked is whether the country equals Germany. If it is not, then it jumps to the ELSEIF. Let's hit OK to check the results. Let's grab it from the data pane and drop it on the view. Now we can see that there is one customer with a new value. As you can see, for George from France we got the abbreviation FR, and that's because the country equals France. With that, we are fulfilling the second condition. The USA for John and Peter still doesn't fulfill any of those conditions.
They are always served by the ELSE, while Maria and Martin are served by the first condition, where the answer is DE. So that's it. Now we're going to do the final step, where we add the third condition for the country USA, because we are still getting not applicable for those two customers. I'm going to go to the same field. This time I will not duplicate it, so let's edit it. We just have to add one more condition, right? So I'm just going to copy this line, and the next condition is going to be an ELSEIF as well: ELSEIF Country equals, this time, USA. Then what happens if this condition is fulfilled? We get the abbreviation US. You can see it's very simple to add one more condition with the ELSEIF. Let's hit OK. Now we can see in the results that all the customers that come from the USA have the US abbreviation. With that, we have covered every value with a condition, and none of those customers falls through to the ELSE. So we don't have N/A anywhere in the output, which is really nice. And now we can see very nicely in the view how we started with the simplest form of the IF statement and ended up with the complete form of the IF statement. Next we're going to solve the same task, but this time using the CASE WHEN statement. All right, so now let's create a new calculated field. We're going to call it Country WHEN. The syntax starts with CASE, then we have to specify the field that we want to evaluate. It's going to be the Country. Once we do that, we start defining our conditions. The first condition is the Germany value: WHEN the value equals Germany, THEN what happens? We get the abbreviation DE. That's it. The next condition is WHEN the country equals France, THEN the abbreviation is FR. And then the last condition: WHEN the country equals USA, THEN the value is US. That's it.
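The step-by-step build-up of the IF calculated field in this exercise can be replayed outside Tableau with a short Python sketch. The customer names and countries are the ones mentioned in the lesson, but their pairing and row order are inferred from the narration, and the function names are made up for illustration. `None` again stands in for a null.

```python
# (first_name, country), as inferred from the walkthrough.
customers = [
    ("George", "France"),
    ("John", "USA"),
    ("Maria", "Germany"),
    ("Martin", "Germany"),
    ("Peter", "USA"),
]


def step_if(country):
    # Tableau: IF [Country] = "Germany" THEN "DE" END
    return "DE" if country == "Germany" else None


def step_if_else(country):
    # Tableau: IF [Country] = "Germany" THEN "DE" ELSE "N/A" END
    return "DE" if country == "Germany" else "N/A"


def step_full_chain(country):
    # Tableau: IF [Country] = "Germany" THEN "DE"
    #          ELSEIF [Country] = "France" THEN "FR"
    #          ELSEIF [Country] = "USA" THEN "US"
    #          ELSE "N/A" END
    if country == "Germany":
        return "DE"
    elif country == "France":
        return "FR"
    elif country == "USA":
        return "US"
    else:
        return "N/A"


for name, country in customers:
    print(name, step_if(country), step_if_else(country), step_full_chain(country))
```

String comparison in this sketch is case sensitive, so `step_full_chain("germany")` falls through to "N/A". That mirrors the lesson's advice to type the country values exactly as they appear in the data.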
You see how quickly we defined three conditions using the CASE WHEN. It is very logical and very easy to create as well. Now, if none of those conditions is fulfilled, let's return not applicable with ELSE, and then we have to end it with END. That's it. As you can see, the calculation is valid, and it's as easy to read as it is to write. Everything is structured. I like using CASE WHEN statements a lot compared to the IF ELSE. So that's it. Let's hit OK to check the results. Now we've got a new dimension, as usual, from the calculated field. Let's put it in the view to check the results. As you can see, we get the same results. But in this situation, for this task, I recommend using the CASE WHEN, since, as you can see, it's very easy to write and to adjust later, or to add more conditions if needed. So with that, we have learned how to use all those logical operations in order to create new logical conditions. All right everyone, now I'm going to show you a very common use case that you might find in many projects, where you create the colors of the KPIs using logical conditions. Let's go to the big data source. We need the Sub-Category from the products, as usual, on the rows. Then we need the Sales from the orders. Let's put it on the columns. Then we sort it and add the labels. And now we need a color for this KPI. Let's create our new calculated field. We can call it KPI Colors. The logic is the following: if the sum of sales is higher than 200K, I would like to see the color green. Anything between 200K and 100K is going to be orange, and anything below 100K is going to be red. So now we have to decide on the method that we want to use in our calculation. As I recommended, always start with the IIF. But in this logic we have multiple conditions, so we cannot use it.
IIF is only suitable if we have only one condition, so IIF is out. The next one is the CASE WHEN. But since the conditions are based on the sum of sales, which is a number, we cannot use the CASE WHEN, because CASE WHEN can accept only string values. So this one is out as well, and we are left with only the IF ELSE statement. That's why we're going to build this calculation with IF and ELSEIF. Let's do that. We start the calculation over here with the IF, and then we have to specify our first condition. Anything higher than 200K should be green. Now we are talking about the field Sales, but as a sum, because in the visualization we have the sum of sales. So IF the sum of sales is higher than 200K, THEN what happens? We get the color green. That's it for the first condition. Now we have to specify the condition for orange. Anything between 200K and 100K should be orange. So let's specify that: ELSEIF, again with the same field, sum of sales higher than 100K, THEN it's going to be orange. Now you might say: the condition that you just described has two boundaries, right? Higher than 100K and lower than 200K. Well, the upper boundary is already covered by the first condition. If the value is higher than 200K, it gets green. So anything that reaches this second check is already lower than or equal to 200K. That's why I specified only the lower boundary here. That's it for orange. The last one is: if the sum of sales is lower than 100K, what happens? We get red. Let's specify that. We add another ELSEIF, sum of sales lower than or equal to 100K, THEN it's going to be red. With that we have covered the third condition, the third color, and we have covered everything, all possible values that could occur. That's why it doesn't make sense to add an ELSE statement. We can just go and end it.
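The KPI color logic just described can be checked with a small Python sketch, including the boundary values at exactly 100K and 200K. This is an illustration, not the Tableau calculation itself: the function name, the sub-category names, and the sales figures are invented sample data, and the aggregation loop stands in for `SUM([Sales])` per sub-category.

```python
from collections import defaultdict


def kpi_color(total_sales):
    # Tableau: IF SUM([Sales]) > 200000 THEN "Green"
    #          ELSEIF SUM([Sales]) > 100000 THEN "Orange"
    #          ELSEIF SUM([Sales]) <= 100000 THEN "Red" END
    if total_sales is None:
        return None  # a missing total matches no branch
    if total_sales > 200_000:
        return "Green"
    elif total_sales > 100_000:
        # Only reached when the total is <= 200,000, so no upper bound is needed.
        return "Orange"
    elif total_sales <= 100_000:
        return "Red"
    return None  # no ELSE branch in the original calculation


# Hypothetical order lines: (sub_category, sales).
order_lines = [
    ("Phones", 150_000.0),
    ("Phones", 80_000.0),
    ("Chairs", 120_000.0),
    ("Chairs", 30_000.0),
    ("Labels", 12_000.0),
]

# Sum the sales per sub-category, like SUM([Sales]) in the view.
totals = defaultdict(float)
for sub_category, sales in order_lines:
    totals[sub_category] += sales

colors = {sub_category: kpi_color(total) for sub_category, total in totals.items()}
print(colors)
```

Because the branches are tested from top to bottom, a total of exactly 200,000 lands in Orange and a total of exactly 100,000 lands in Red, which is worth knowing when the thresholds are business targets.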
Now let's check that everything is fine. We've got an error. I think I forgot to close it here. Let's check it again. The calculation is valid. That's it. We have three conditions for three colors. Let's hit OK. All right, now we have our dimension over here, and we're going to use it for the coloring, right? Let's drag and drop it on Color over here. Now, as you can see, our colors are splitting the view. Tableau got it almost correct: we have orange and red, but this one is blue, not green. Let's change it. We go to Color, then Edit Colors. Now, instead of blue for Green, we assign a real green. Let's hit OK. With that, we've got the colors of our KPI. As you can see, all the subcategories with sales higher than 200K are green. Anything between 200K and 100K is orange, and anything below is red. So as we can see, we can do a lot using those logical conditions. We can use them to create the coloring in Tableau, and we can use them to create new information, like the country abbreviations, that is necessary for understanding the data. All right, so far we have learned how to create conditional logic in Tableau and how to evaluate it in order to manipulate our data based on decisions. Next we're going to start talking about the logical operators AND, OR, and NOT. 144. Tableau | Logical Operators: AND, OR, NOT: Now we're going to learn how to combine and evaluate multiple conditions in Tableau using the logical operators AND and OR, and then we'll learn about the NOT operator. Let's understand the concept first, then we can practice. Let's go. Let's start with the AND and OR operators. Let's take the following scenario. Let's say that we have one condition where we are checking whether the sales is higher than 1,000, and a second condition where we are checking whether the country is Germany.
Now if you want to evaluate both of them, if you want to combine those two conditions so that they work together, you can use the AND or the OR operator in between. We can use those two operators to combine condition A with condition B. The output is, as usual, a Boolean, true or false. AND and OR are logical operators that are used to combine multiple conditions. Now let's say that we're going to use them in IF ELSE statements. Let's see how the syntax looks. Let's start with the AND operator. As you can see, we have the IF statement here, then our two conditions, and in between them the AND operator. It combines both conditions in one statement. If the sales is higher than 1,000 and the country equals Germany, then we get the value High. That is, if it is true; otherwise it ends and we get null. The same goes for the OR operator. We are saying here: if the sales is higher than 1,000 or the country equals Germany, then we get the value High. As you can see, it's really simple. Let's check an example in order to understand the differences between AND and OR. We have four customers in our table, with their sales information and countries. The first condition checks whether the sales is higher than 1K. For the first two customers we get true, because the sales is higher than 1,000, and for the last two we get false, because it is below 1,000. This is the information from the first condition. Then the second condition checks whether the country equals Germany. The first customer is from Germany, that's why it's true. The second one is not, so we have false. Then the next one is from Germany, true, and the last one is false. As you can see, we first evaluate the table in order to get the result of each single condition.
But now what we can do is combine those two conditions to generate new results. If you use the AND operator, it returns true only if both conditions are true, and false otherwise. So let's go and combine those two conditions using the AND operator. Let's check the first customer: condition A is true, and condition B is true as well. We are fulfilling the requirement, so for the first customer we get the output true. For the next customer, Maria, condition A is true but condition B is false, so it does not fulfill the requirement. Both of them should be true to get true, that's why it's going to be false. For the next one, Martin, it's the same. Condition A is false, B is true, and both of them should be true. That's why we get false. For the last one, both of them are false anyway, so we get false. As you can see, the AND operator is very restrictive. Both of the conditions should be true in order to get true. Otherwise, you immediately get false. This is how the AND operator works. Let's go to the next one, the OR operator. The OR operator returns true if at least one condition is true. Otherwise, it's going to be false. That means we need at least one true to get true in the output. Let's go and check the example again. For the first customer, we are fulfilling the requirement. We have more than one: both of them are true. That's why in the output we get true. For the next one, we have true at condition A and false at condition B. We have at least one, so we are fulfilling the requirement. It's going to be true as well. The third one is the same: we have at least one true, at condition B. That's why for Martin we get true. But for the last customer, George, both of them are false. We need at least one true to get true, that's why the output is going to be false.
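The truth table from this walkthrough can be sketched in plain Python (not Tableau's calculation language). This is only an illustration: the sales figures, the countries of Maria and George, and the label "Customer 1" are made up so that the true/false pattern matches the example; only the two conditions and the names Maria, Martin and George come from the lesson.

```python
# Hypothetical rows: the sales values and some countries are invented so that
# condition A is True, True, False, False and condition B is True, False, True, False.
customers = [
    {"name": "Customer 1", "sales": 1500, "country": "Germany"},
    {"name": "Maria", "sales": 1200, "country": "France"},
    {"name": "Martin", "sales": 800, "country": "Germany"},
    {"name": "George", "sales": 500, "country": "USA"},
]


def evaluate(row):
    """Evaluate both conditions first, then combine their results."""
    cond_a = row["sales"] > 1000           # condition A: sales higher than 1,000
    cond_b = row["country"] == "Germany"   # condition B: country equal to Germany
    return {
        "name": row["name"],
        "A": cond_a,
        "B": cond_b,
        "AND": cond_a and cond_b,  # true only if both are true
        "OR": cond_a or cond_b,    # true if at least one is true
    }


results = [evaluate(row) for row in customers]
for result in results:
    print(result)
```

Note that AND and OR work on the results of the two conditions, not on the table values themselves, which is why the conditions are computed first.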
As you can see, the OR operator is less restrictive than the AND. We need at least one true to get true at the output. This is how the AND and OR operators work in Tableau in order to combine multiple conditions. One more thing to notice here is that if you are using AND and OR, we are evaluating the end results of the conditions. We are not evaluating the table itself. We are evaluating the results that we got from the conditions. Now we're going to talk about the third operator, the NOT operator. Let's take an example. We have the following table, and we have our condition where the sales is higher than 1,000. We will not use the NOT operator to combine two conditions, like with the AND or the OR operator. This time we're going to reverse the result of the condition. The NOT operator is a reverse logical operator. It returns true if the result of the condition is false, and it returns false if the condition is true. If you tell it to go right, it's going to go left. If you tell it to go left, it's going to go right. It does exactly the opposite. So let's see what happens if we say NOT this condition. If you use the NOT operator for the first customer, you get false because the value is true. The same goes for the second customer: you get false. But for the next two customers, you get true because the output of this condition is false. As you can see in the result, we flip the truth. We get exactly the opposite if we use NOT. It looks like this in the calculation in Tableau. Here again we have our IF statement and our condition, but just before the condition we put NOT. With that, you are reversing everything. What you are saying in this condition is: if the sales is not higher than 1,000, then we get the value Low. That means anything equal to 1,000 or smaller than 1,000 is going to be Low. We are reversing the results.
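As a rough illustration of this reversal, here is a small Python sketch (again, not Tableau syntax). The sales values are made up; the value 1,000 is included on purpose to show the boundary case mentioned above. The function name `label_low` is hypothetical.

```python
def label_low(sales):
    """Return "Low" when sales is NOT higher than 1,000, otherwise None.

    None stands in for the null that an IF statement without an ELSE
    branch produces when its condition is false.
    """
    return "Low" if not (sales > 1000) else None


# Hypothetical sales values: two above 1,000, one exactly 1,000, one below.
sales_values = [1500, 1200, 1000, 500]

condition = [s > 1000 for s in sales_values]   # the original condition
reversed_condition = [not c for c in condition]  # NOT flips every result
labels = [label_low(s) for s in sales_values]

print(condition)
print(reversed_condition)
print(labels)
```

Because the condition is "higher than 1,000", its reverse covers both "equal to 1,000" and "smaller than 1,000".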
That's it, this is how the NOT operator works. Now let's go back to Tableau and practice those three operators. All right, so now we're going to go to our big data source. Let's grab the information of the customers to the view. We're going to get the customer ID, the first name, the country, and the scores as well. But I would like to show the discrete values of the scores, so let's switch it to discrete. Then we need a measure. Let's go to the orders, get the sales, and put it on the columns. As you can see, now we have for each customer the total sales that they ordered. Now the task is not to show all the sales of all customers. We want to focus on a specific group of customers. We want to show the sales only for customers that come from Germany and whose score is higher than 50. With that, we have two conditions and we can use the AND or the OR operator in order to combine them. As usual, we're going to create our new calculated field, and we're going to call it Sales AND. We're going to start with the IF statement. Now we need to write our conditions. The first condition: the country should be equal to Germany. The country field, we have it over here, must be equal to Germany. Since we are saying "and" in the task, it's going to be AND here as well, in order to connect it with the second condition: the field score should be higher than 50. Now we have our two conditions, and both of them are connected with the AND operator. If both of them are true, what happens? We show the value of the sales. So next we're going to say THEN sales, and otherwise it's going to be null. That's it. We're going to end the statement, and we can see that the calculation is valid, everything is fine. So let's go and try what happens. Let's click OK. Now we have our new field in the data pane on the left side. It's going to be a continuous measure because the output is the sales. Now we're going to go and check the values.
But first I would like to get rid of those bar diagrams. I'm just going to move the sales to Detail and then move it again to the view over here at the Abc. So now we have those values. Let's get our new sales field with the AND operator and put it on the view as well. Let's just make it a little bit bigger to see the headers. All right, so now let's go and check our customers. Let's take customer number two. You can see the country is equal to Germany, so we have the first true, and the score is higher than 50 as well, so we have another true. With that, we get true at the output. That's why we are seeing the value of the sales at the output. Let's move to the next one, customer number three. You can see the country is not Germany; we have here France. So the first condition is false. Immediately, the output is false, because both of them should be true. But we can check the second value: you can see the score is not higher than 50 either. Both of them fail, and the output fails as well. That's why we are getting null; we are not getting the sales. All right, now let's move to another customer, number 23. You can see the customer comes from Germany. The first condition is fulfilled, so we have our first true, but the score is not higher than 50. The second condition failed. That's why we didn't get any result. As you can see, the AND operator is very restrictive. Everything should be true in order to get the result. That's it. This is how the AND operator works. Let's move to the next task. We want to show the sales only for the customers that come from Germany, or whose score is higher than 50. The logic is very simple, right? But here we have to change the operator that combines those two conditions. The rest is going to be the same. That's why I'm going to go to the sales field, duplicate it, and then we go and edit it.
We're going to change the name to OR, and we have the same conditions: if the country is equal to Germany, but this time OR the score is higher than 50. That's why I'm going to go over here and change it to the OR operator. Now I would like to mention something: those logical functions are very close to the English language. If you just read this code, it's like you are saying a sentence in English. What you are saying here is: if the country is equal to Germany, or the score is higher than 50, then show the sales. That's it. You see, it's like translating an English sentence into code. It's really easy to write and to read as well, so it's really logical. Now let's check our calculation. You can see it is valid. Let's go and hit OK. Immediately we can see in the view that with OR we are getting more values than with AND, because AND is very restrictive. Now let's go and check some customers. For the first one, the country is not equal to Germany; the customer comes from France. The first condition failed, so let's hope for the next one. The score is higher than 50, which means this customer fulfills the requirement. It's enough to have only one true. That's why we have the sales at the output. The next customer fulfills both of the conditions: comes from Germany, and the score is higher than 50. That's why we have the sales, like with the AND operator. But for the third customer, as you can see, the first condition failed because the country is France, and the second failed as well because the score is not higher than 50. Both of them failed, and we don't have any result. We have to have at least one true to get something at the output. So that's it, this is how the OR operator works. All right, now we have the following task for you: show the sales only for customers who come from either Germany or France. You can pause the video now in order to complete the task, and once you're done, you can resume it. Okay, so let's see how we can do that.
We can go and create a new calculated field. We can call it Sales Country. We're going to start with the IF statement. Then we have the two conditions. The customer should be either from Germany or France. The first one is the country equal to Germany, and the operator is OR, since the customer could be either from Germany or France. Then the country equal to France. What happens if one of those conditions is fulfilled? We get the sales, so THEN sales, and that's it. Let's end it. As you can see, very simple. Let's go and hit OK. As usual, we're going to check the values. Let's drag and drop it over here in the view. We have it here in the middle. Let's just make it a little bit bigger and see the customers. Now we are checking only one field, but in two conditions: the country is either France or Germany. The first customer, we can see, comes from France, so we get the value. For the second one as well, we get the sales value. France, then USA: here we will not get any value because it's not part of the condition. As you can see, now we are getting the sales of all customers that come from either France or Germany. Okay, now I'm going to show you something quickly. Let's go back to our calculated field, Sales Country, and edit it. Now instead of OR we're going to use the AND operator. What we are saying now is that the customer should come from Germany and, at the same time, from France. It sounds weird, right? So let's go and try it. Let's hit OK and check the results. You can see that Sales Country is completely empty. We don't see any values, because in our situation a customer can come from only one country. We cannot have this condition logically. From the data perspective, this is not possible. All right guys, that's what we have learned about AND. Let's move next to the NOT operator. Okay, so now we have the following task. Show the sales of all customers who don't come from Germany.
If the customer comes from any other country, we're going to see the sales in the view. But if the customer is from Germany, it should be null. All right, so now let's go and create a new calculated field. We're going to call it Sales Germany. We're going to have the IF statement here as well. Now we have two ways to do it. The first option is the long one, where we create a condition for each value inside the country field besides Germany. We're going to do something like this: country equal to USA. Then we're going to say OR country equal to, for example, Italy. And then for the next one, OR country equal to France. As you can see, I'm creating a condition for each value from the dimension country. Of course, if you have a long list of countries, you're going to end up making a lot of conditions as well. And what happens if a new country enters your data source? You always have to go to the calculation and add it as a condition. In this option, we are including all the values that we want to see in the view. But there is a better way to do that, where we exclude only Germany. Let's go and remove everything from here. We're going to say IF the country is equal to Germany, and this time, before the condition, we're going to add the NOT operator. We're going to reverse everything. If the customers don't come from Germany, what happens? We show the sales, so THEN sales, and that's it. As you can see, it's very short and simple. We are just excluding one value. We don't have to add all the values. We don't have to worry about a new country value appearing inside the data source. For anything that is not Germany, we're going to show the sales. Let's go and check the values. I'm going to hit OK. Now, as usual, we get a new calculated field in our data source. Let's drag and drop it to the view to check the values. Just make the header a little bit bigger to read it.
Then scroll up. The first customers come from France, so we get the sales information. The next one is from Germany, so we have null. Here we have as well customer five from Germany, and six as well from Germany. We don't have any sales information for them. So we can see that all the customers that don't come from Germany have the sales in this field. We can check that by sorting the countries. It's sorted like this, and for all those values from France we always get the sales information. And if we go to Germany, you see that all the customers from Germany don't have any sales information in this field. For the USA we get the values again. As you can see, it's really easy to use and really useful to make filters and so on, and as well to focus on a specific group of customers in our views. That's it for the three operators. They are really nice to use. All right everyone, that's all for the logical operators. With that, we have covered all eight logical functions in Tableau. They are really important functions since they help us make data-driven decisions in the analysis. And with that, we have covered the last group of functions under the category row-level calculations. We learned around 40 Tableau functions. Next we're going to learn about the aggregate calculations in Tableau.

145. Tableau | Aggregate Functions: SUM, AVG; COUNT, COUNTD, MAX, MIN: All right, so now we're going to talk about the second type of calculations that we have in Tableau, the aggregate calculations. I split the functions into two groups. The first group aggregates the measures in our data source, so we have the sum, average, count, and so on. The second group is where we aggregate the dimensions of our data source, and here we have only one function: the attribute. So now we're going to focus on the first group, how to aggregate the measures in Tableau.
All right, so the first question is, what are aggregate calculations in Tableau? If you use those calculations, you aggregate the rows of the data source and put the result at the visualization's level of detail. That means the dimension that you are using in the view controls the granularity of the measure. Let's have a quick example in order to understand it. Let's say that we have the orders table inside our data source, and we would like to find the total sales by product. In this example, the sales is a measure and the product is the dimension. In order to find the total sales, we can use the function SUM in Tableau. It looks like this: we use the sum of sales. In the view, we have one dimension, the product. It is the one that controls the level of detail in the view. Then we have the result of the function SUM, where we put the results of the aggregations. With this, Tableau is going to group up the rows of the orders by product. As you can see, the first group is based on product number one. Then we have the groups for products two, three, and four. As you can see, the orders are now divided into groups. At the visualization level, we have exactly one row for each group. That means for product one we have only one row, and Tableau is going to summarize all the sales inside this group. At the end, as the result, we have the value of 40. As you can see, the aggregate calculation is grouping up the rows from the data source and presenting them as one row at the output in the visualization. Let's move to the next group. For P2 we have only one row, and the sum of the sales is 50. The same thing happens for product three: we have here two rows, and the sum is 45. And for P4 as well, we have one row in the visualization with only 15 as the total sales.
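The grouping step described here can be sketched in Python with a plain dictionary. This is only an analogy for what the view does, not how Tableau is implemented. The individual order values are invented so that the group totals match the ones from the example (40, 50, 45, 15); the lesson only gives the totals.

```python
from collections import defaultdict

# Hypothetical order rows as (product, sales); values chosen to reproduce
# the totals mentioned in the lesson.
orders = [
    ("P1", 10),
    ("P1", 30),
    ("P2", 50),
    ("P3", 20),
    ("P3", 25),
    ("P4", 15),
]

# Group the rows by the dimension (product) and sum the measure (sales).
total_sales = defaultdict(int)
for product, sales in orders:
    total_sales[product] += sales

# One output row per product, no matter how many source rows it had.
for product, total in sorted(total_sales.items()):
    print(product, total)
```

Six source rows collapse into four output rows, because the product dimension defines the level of detail.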
As you can see, the aggregate calculation groups up the rows of the data source and presents them as one value in the visualization. The level of detail depends on the dimension that is used in the view. That's why we say that aggregate calculations bring the data to the visualization's level of detail. It's not like the functions in the row-level calculations, where we compute each value on the same row, so we don't change anything and the number of rows stays exactly as before. So this is how the aggregate calculations work. We don't have only one function; we have multiple functions here. The first one is the SUM that we just learned. It returns the total sum of all values within a field. Then we have another one, the average. It returns the average of all values. Then we have the count. It counts the number of values within a field. Then we have another very similar function called COUNTD. This time we count the number of unique values within a field. Then we have the MAX and MIN. They return the maximum value or the minimum value within a field. Now if you check the syntax of those aggregate functions, it's the easiest one if you compare it to any other functions. They all follow the same pattern: they always start with the name of the function, for example SUM, AVG, COUNT, and so on, and they all accept only one field. So as you can see, we have the sum of sales, the average of sales, and so on. We have only one argument, and it's very simple. So now let's go to Tableau and start practicing those aggregate functions. Okay, so back to our small data source. Let's go to the products, and as usual we're going to get the category and the product name as well. Those two dimensions define the level of detail, and the product name is the one that is controlling it. So here we have the five products inside our data source.
Now, in order to create aggregate calculations in Tableau, there are two ways. You can do it locally, directly and only for this view, or globally by creating a new calculated field, which is then available for all other worksheets. So now let's go and check the first method, where we create a quick aggregate calculation. We're going to go to the orders and take the sales. Just drag and drop it here on the view. As you might have already noticed, Tableau always tries to aggregate the data in the visualization, and for that, Tableau uses the aggregate functions. So as you can see, we have the sales, but in front of it we have the sum: sum of sales. That means Tableau is using the function SUM in order to aggregate the data in the view. This is Tableau's default method to aggregate the data. That means in Tableau, the default type of calculation used on a measure is the aggregate calculation, and the default function that is always used is the SUM. Now in order to change the function that is used in the aggregation, we can go to the measure over here and right-click on it. Here we see that our field is a measure using the SUM function. In order to change that, let's go to Measure, and we can find here a list of all the different aggregate functions that we have in Tableau. We have the sum, the average, the count, count distinct, minimum, maximum, and so on. Now, for example, we can go over here and change it to the average. Now instead of sum of sales, we have average of sales, and at the output we get the averages. As you can see, it's very simple. With just one click, we changed the aggregation function. It also doesn't need a lot of configuration, unlike what we're going to see later with the table calculations, for example, or the LOD expressions. So this one is really easy. If you want to change the function, just go to the measure and right-click on it.
And then here you have a list of all the functions that you can configure. Of course, anything that I'm choosing now from those functions will not affect any other sheets and will not affect our data source. Here we still have the sales. We don't have any field called average sales, so it is only locally available for this visualization. That brings us to the second method, where we create an aggregate function that is globally available for all other worksheets or workbooks connected to the data source. All right, so now let's say that I would like to have an extra field inside my data source to find the total sales. In order to do that, we're going to create a new calculated field. It's really simple. We're going to call it Total Sales. In order to see the aggregate functions in Tableau, we can check the documentation over here. Let's go to All, and then let's choose Aggregate. With that, you can find all the aggregate functions in Tableau. Inside it, you can find the LOD expressions as well; we have here FIXED, INCLUDE, and so on. To find the total sales, we're going to use the function SUM, and as you can see it needs one expression. It's going to be only one field, the sales. And that's it. As you can see, the calculation is valid. Let's go and hit OK. With that we got a new continuous measure inside our data source. But here is the difference between aggregate calculations and row-level calculations: aggregate calculations happen on the fly, whereas a row-level calculation stores the data inside the data source. That means if you go and check the data of the data source, or if you view the data from here, you can see that we don't have any information about the total sales. If you browse the data, we don't have any extra field called Total Sales, because that information is not precalculated by Tableau and stored inside the data source.
It happens on the fly as you bring the field to the visualization. That means Tableau will not immediately execute the aggregate calculations as you are creating them and then put the result in the data source. Tableau does it on the fly. That's because Tableau doesn't know the level of detail that you need in the visualization. As you know, the data source has its own level of detail. That's why only one type of calculation, the row-level calculation, can be pre-executed and stored inside the data source, and the rest stay on the fly. That means our new calculated field using the aggregate functions will not store any data inside the data source. The data is calculated once you drag and drop the field inside the view; it stays empty as long as you don't use it. Let's go and close this over here, and let's drag and drop the field to the view to check the results. Now in this view, we got the total sales by product, because the product name controls the level of detail. Let's say that you would like to have the total sales by category. In this view, you have to remove the product name. So we're going to remove the product name from the view, and with that we got the total sales for each category. That means the aggregate calculation, or the granularity of the measure, depends on the level of detail of the visualization. The dimension controls everything: it controls the level of detail that we see in the view. So now let's go and understand how Tableau brought those numbers to the view. Okay, so in the data source we have 15 orders. In the visualization we said, okay, we would like to have the category. Tableau gets the category to the visualization, and inside it there are two values, so we get the accessories and the monitors. With that, we have only two rows. Then we can have the sales, the total sales.
Tableau is going to aggregate the sales for each category. So as you can see, Tableau splits the orders into two groups: one with the category accessories and the other one with the monitors. Now in order to find the total sales of the accessories, Tableau simply aggregates all those values of the sales and puts the result at the output. The first one is around 2,377. For the next group Tableau does the same. It goes through all those orders underneath the category monitors and aggregates all those values, and we get around 4,129. As you can see, Tableau splits the rows by the dimension that is used in the visualization. In this example it's the category, so it splits the rows into two groups, and then it applies the aggregate function. Let's move to the next one. We would like to find the average sales for each category. In order to do that, we're going to create a new calculated field, and we're going to call it Average Sales. The function is very simple. It is the AVG, the average. Then we have our field sales, and that's it. It's pretty simple. Let's go and hit OK. As usual, we get a new empty field inside the data source, but once we drag and drop it on the view, the calculation happens. Let's do that. We can find the average sales for each category. How Tableau did the calculation is very simple. Tableau again splits the rows inside the orders into two groups. For the first group, the accessories, it adds up all those values inside the sales, and then the sum is divided by the total number of orders inside this category. Here we have eight orders. The final value is around 297. The same thing happens for the second group: Tableau adds up all those values, then divides by seven, because we have only seven orders for the monitors, and we get 590 as a result.
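The arithmetic behind those two averages can be checked with a few lines of Python. This sketch uses only the approximate totals and the order counts quoted in the lesson, so the results are approximate as well; the dictionary layout and variable names are just for illustration.

```python
# Approximate group totals and order counts as quoted in the lesson.
groups = {
    "Accessories": {"total_sales": 2377, "orders": 8},
    "Monitors": {"total_sales": 4129, "orders": 7},
}

# Average per category = sum of sales in the group / number of orders in the group.
average_sales = {
    category: values["total_sales"] / values["orders"]
    for category, values in groups.items()
}

for category, average in average_sales.items():
    print(category, round(average))
```

The division uses the number of rows in each group, which is why the same measure gives a different average per category.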
We can see again that the dimension category is deciding how the calculation happens and how the data is split up. That's all for the average function. Let's move to the next one. We have the count. Let's say that we would like to find the number of orders for each category. In order to do that, we can again create a new calculated field, and we're going to call it Number of Orders. The function is really simple: we're going to use COUNT, and inside it we need only one field. This time we're going to count the order IDs. In order to do that, we use the order ID, and that's it. We are counting how many order IDs we have inside our data source. The calculation is valid, so let's go and hit OK. As usual, we get a continuous measure in our data source. Let's drop it to the view and check the results. We can see that in the accessories we got eight orders, and in the monitors we got seven orders. Now let's see how Tableau is doing that. It's very simple. Again, our data is split into two groups, and Tableau simply starts counting the rows. How many rows do we have inside the accessories? Eight rows, so we have here eight orders. And if you count the rows of the monitors, you get seven orders. With the count function, we are just simply counting the rows. So that means in the accessories we got eight rows, and in the monitors we got seven. There is one more special thing about the count. Let's say that inside our data we have nulls. Let's say that we don't have an order ID; it's empty, it's null. What happens here? Tableau will not count it. So in this example, instead of seven, Tableau is going to count only six. This is going to affect the previous function as well, the average. As we learned before, it adds up all those values and then divides by the number of orders. So let's say that we have here a null this time.
Tableau will not divide by seven. Tableau is going to divide by six. Here again, a reminder that we have to handle the nulls inside our data, as we learned before, using ZN or IFNULL and so on. If we divide by six, the result is different from dividing by seven, which is more correct, since we have seven orders, not six. That means pay attention to the field that you are aggregating, whether it has nulls or not, because with a null here we get inaccurate results. We don't have six orders; we have seven orders inside the monitors. All right, so that's all for this function, the count. Now we're going to move to a very similar function in Tableau called COUNTD. It returns the number of unique or distinct values within a field. It sounds very similar to the count, but there is a difference between them: here we are counting only the distinct values. Let's have an example in order to understand the difference. We would like now to show the number of products for each category. Let's go and create a new calculated field. Let's call it Number of Products. This time I'm going to start with the function COUNT first, to show you the difference between them. We're going to use the field product ID. Let's go and select that, and then hit OK. Again, we got a new calculated field. Let's show it in the results. We can see that the result is very similar to the number of orders. Again, we have eight products for the accessories and seven products for the monitors. Now what happened here? Well, if you check the data inside the orders, we have only two products in the accessories and only two products in the monitors. Why did we get eight and seven? That's because Tableau counts the number of rows. Whether they are duplicates or not, it doesn't matter. So Tableau is going to go and count.
Okay, here we have eight rows, that means we have eight products. That's why we cannot use the count function for this task. We have to use another one: the COUNTD. Let's go and change it. I'm going to go to the calculated field and just add a D after the COUNT to use the next function. So we have COUNTD of product ID. Let's go and hit OK. As you can see in the result, now we got two for the accessories and two for the monitors. So let's see how Tableau works here. Tableau counts the distinct or unique values within the field. This time Tableau pays attention to the content of the field. So it starts counting. Okay, here we have the USB mouse. This is one. Then for the next one we have the same information, so Tableau will not count it at all. The same for the third. Then for the fourth order we have a new product. Here we have a new value, the Logitech keyboard, so here we have two. Then it moves on, and the rest is the same stuff. We have the same values, so Tableau will not count them. At the end, Tableau counted two unique values here. We have two products for the accessories, that's why Tableau puts two at the output. For the next category it starts the same way. We have the LG Full HD monitor. This is one product. The second one is the same value, so Tableau will not count it. Then it moves to the third one. As you can see, it's a new product, a new value, so it counts two. For the rest it will not count anything, because they are duplicates as well. Tableau counts the number of unique values within the field. That's why we have two here as well, which is more accurate. We have only two products for the accessories and only two products for the monitors. This is the difference between COUNT and COUNTD. COUNT just blindly counts how many rows we have inside each category. COUNTD checks the content, and it counts only the unique, distinct values.
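The difference between the two functions, including the null behaviour mentioned earlier, can be sketched in Python. The helper functions `count` and `countd` below are stand-ins written for this illustration, not Tableau functions, and `None` plays the role of null. The exact order of the products after the fourth row and the name "Second Monitor" are assumptions; the lesson only says there are two distinct products per category.

```python
def count(values):
    """Count the rows that have a value; nulls (None) are skipped."""
    return sum(1 for value in values if value is not None)


def countd(values):
    """Count the distinct values; nulls (None) are skipped."""
    return len({value for value in values if value is not None})


# Hypothetical product column per category: 8 accessory rows, 7 monitor rows.
accessories = ["USB Mouse"] * 3 + ["Logitech Keyboard"] * 5
monitors = ["LG Full HD Monitor"] * 2 + ["Second Monitor"] * 5

print(count(accessories), countd(accessories))
print(count(monitors), countd(monitors))

# If one of the seven monitor rows has a null, the count drops to six.
monitors_with_null = monitors[:-1] + [None]
print(count(monitors_with_null))
```

COUNT looks only at whether a row has a value, while COUNTD looks at the content of the value, which is why duplicates stop mattering.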
All right, so now we're going to move to the last two. We have MAX and MIN. They are very simple functions in Tableau. MAX finds the highest value within a field and MIN finds the lowest value within a field. Let's go and check how they work. Let's say that we would like to show the highest sales for each category. In order to do that, we're going to create a new calculated field. Let's call it Highest Sales. Then we can use the MAX function on the Sales field. It's very simple: it always needs one field, and that's it. Let's hit OK and check the results. Let's put it on the view. We can see that the highest sales inside the accessories is 525 and the highest sales for the monitor is 1691. So let's see how this works. As usual, our data is split into two groups. We start with the first group. Tableau is going to check all those values: what is the highest value among those sales? It's going to be 525, and Tableau is going to present it as the result. Then we move to the second group. Tableau is going to take all those values and compare them to each other in order to find the highest value. It's going to be this one, order number two, as the highest sales inside our data for the category monitor. This is how the MAX function works in Tableau. Let's go to the next one, to find the lowest sales for each category. We're going to do the same thing. We create a new calculated field, Lowest Sales. This time we use the MIN function and then our field Sales. That's it, click OK. Let's present it in the view as well to compare. We can see that the lowest sales in the accessories is 56 and the lowest for the monitor is 40. It's the same thing: Tableau is going to check all those values for the first group and ask, what is the lowest sales? As you can see, it's going to be this order, order number ten, with the lowest value.
And then Tableau is going to check the second group of values in order to find the lowest value. It's going to be this one, 39. Tableau is just rounding the numbers, that's why we see 40 here, but in reality it is 39.97. So that's it. This is how MAX and MIN work in Tableau. As you can see, the aggregate functions in Tableau are very simple. I think this is the easiest tutorial that I made in the Tableau series. All right guys, so that's all for these six functions that aggregate the measures of our data source. Next we're going to talk about how to aggregate the dimensions using the very confusing function, the attribute. 146. Tableau | ATTR Attribute Function: We're going to talk about another aggregate function in Tableau. But this time the function is very special, and it is very confusing. A lot of people get confused about the attribute function in Tableau. As usual, first we will understand the concept behind it and then we can practice in Tableau. Previously, we learned that an aggregate function is going to aggregate the numbers, the measures inside our data source. This makes sense, right? For example, to have the total sales in the view. But what about aggregating the values of the dimensions, for example the customers or the products? How do we aggregate those values? We cannot use the SUM function to aggregate dimensions. Instead we can use the attribute function, ATTR. The attribute function in Tableau is going to aggregate the values of a dimension of the data source and present the result in the view. This time I would like to aggregate the values of the customers by the products. In order to do that, we can use the attribute function on the customers. In the view we have two fields. First we have the dimension product. This one is going to define the level of detail of this view.
Here we have another field where we have the result of aggregating the customers, the attribute of the customer. The function has two options. The first one: if all values are the same, then it's going to return a single value, that same value. Or, if we have multiple values, then it's going to return an asterisk. This might sound very confusing or complex, but don't worry about it. Let's just follow the example. Since we are grouping the data by the products, Tableau is going to group the orders by the products: the first group for product P1, the second group for P2, and so on. In the visualization, we're going to have only one row for each group, like with any other aggregate function. Now for the first group, we're going to have one row, P1, and Tableau is going to check the values inside the customers for this group. As you can see, we have the same information in those three rows. We have John, John, John. We have the same value, so we are at the first option: if all values are the same, then it returns a single value. That's why Tableau is going to return John in the output. With that, Tableau applied the first option. Let's go to the next group, P2. As you can see, in the customers for P2 we have different values. The first one is John, the second one is Maria, and then Maria again. We don't have the same values, right? We have different values. That's why Tableau is going to execute the second option, because we have multiple values, and Tableau is going to return an asterisk. That's why we have an asterisk as the result here. This is how the attribute function works in Tableau. Let's move on to the next product. We have P3, and as you can see we have two different values again, John and Maria. They are not the same, so the second option is activated, and Tableau is going to show the asterisk as the result. For product P4, let's check. We have Maria and Maria, so we have the same value.
That's why Tableau is going to execute the first option, where all the values are the same, and we get that same value in the output. That's why we have Maria. That's it for the attribute function. It's really simple, right? Once you have an example, everything becomes clear. Again, if the values are the same, like John here, then we get that same value. And if the values are different, so you have multiple values, then Tableau is going to show the asterisk. Now you might ask what this asterisk means in the view. Well, Tableau uses it as a highlight or a warning to tell you that there are more details in this field, inside the customers. The asterisk can also help you understand the relationship between dimensions, for example between the customers and the products. As you can see, for product P2 we have multiple values, so it is a one-to-many relationship. But for product P1 we have a one-to-one relationship: only one customer for only one product. With that, you can understand the relationship between dimensions. All right, with that we have understood that in Tableau we can, of course, aggregate the measures, like with the SUM function. But we can also aggregate the dimensions inside the data source using the attribute function. This is the main task that we usually use the attribute function for: aggregating the dimensions. Now let's go back to Tableau in order to practice this function. All right, so I'm going to show you a very quick example of how to create the attribute in Tableau. Let's stick with the small data source. This time let's go to the customers. We're going to take the countries and the cities to the view. Now I would like to aggregate the dimension City inside this view. In order to do that, we can use the attribute function. There are two ways to do it, locally or globally. As usual, locally means only for this view, globally means for all other worksheets.
Let's see the quick one, the local one. In order to do that, we go to the City field over here and right-click on it, and then you can find this option between Dimension and Measure. This time we have Attribute. Again, this is not a third option of the metadata that we learned before, dimensions and measures. This is simply an aggregate function that Tableau put between those two options. It is not a third option, it is an aggregate function. Let's go and click on that. Now we can see from the name of the field that the attribute function is applied to the field City. The level of detail in our visualization is no longer the city like before. It is now the country, and the city is going to have an aggregated value. For France we have Paris, and for Germany and USA we have the asterisk. Let's see quickly how Tableau did that. Okay, here is what is very special about the attribute function in Tableau. It's not like all other aggregate functions, where we start from the data source. Here we start from the visualization. Tableau does the calculation depending on the level of detail that we have inside the view. Here in the visualization we have the country and the city, so it's going to focus only on those two dimensions. At the start, we have France with Paris, two values for Germany, and two values for USA. Since the country is the only dimension that we have in the view and the city is now an aggregation, the level of detail is going to be the country. That means we're going to have only three rows, only three values. Tableau is going to show us, as we can see here on the left side, France, Germany and USA. Now, as we learned, Tableau is going to check the values. If all values are the same, we get that same value. For France, we have only one value, so it's going to be that same value, and Tableau is going to put it in the output. Then the next one, Germany: we have this group of rows. We have two rows, Berlin and Stuttgart.
We have two different values. That's why Tableau is going to put the asterisk in the output. The same goes for the USA. As you can see, we have two different values, so we have multiple values, and for that Tableau shows the asterisk in the output as well. That's why we have only Paris for France and two asterisks for the other two countries. So you can see this is very simple. Let's go to another example to understand the use cases of the attribute. All right everyone, so now you might ask: okay, nice, we can now aggregate the dimensions, but where do I use it in my dashboards? What are the real use cases for the attribute function in Tableau? Well, I tend to use the attribute function in two use cases. The first one is inside the tooltip, where I want to show the users more details about the aggregations. Let me show you how I usually do it. Let's go to the big data source and then to the customers. Let's take, for example, the country and the city, all the information about the location, and the postal code as well. Then, as usual, we would like to show the sales information. So let's go to the orders and take the sales to the columns. We're going to show the labels and also color by the sales. Now we can see that the level of detail of our visualization is based on the postal code, since it brings us to the lowest level of detail. Let's say the requirement is to have the level of detail at the city and not at the postal code. There are two ways to do it. First, we can remove the postal code from the view over here. With that, we get the level of detail of the city. But now let's say that I still want to bring the postal code information into this visual as a detail for the users. I cannot just drag it and drop it here, because it's going to split the data, right? You can see here that for Paris we have two values.
Instead of that, we can use the attribute function in Tableau if we still need to present the postal code information in this visualization. As we learned before, we can go over here and quickly switch the field to Attribute, or we can do it globally to reuse it in different worksheets. Let's choose that. We're going to create a new calculated field. I'm going to call it Attribute Postal Code. The function is very easy. It's going to be ATTR, and it accepts only one field, the postal code. It should be a dimension. That's it, the calculation is valid. Let's go and hit OK. So we've got a new calculated field, a new dimension. Let's bring it to the view, and I remove the postal code. Now we can understand quickly from the view that the postal code and the city are almost at the same level of detail. As you can see, we almost always have values, and there are only two cities where we have the asterisk: Paris and Portland. With that, we understand the relationship between the postal code and the city. They are almost at the same level, but sometimes we have more details. In Paris we have different values for the postal code, and the same for Portland. Now, in order to show those details to the users, we can either leave it as a field over here, as a header, or, as a better way to save some space in the visualization and not show a lot of headers, we can show it in the tooltip. In order to do that, we're going to drag our field and drop it on Detail. Then we have this option over here to configure our tooltip. Let's go inside it. As you can see, we have four pieces of information: city, country, sales, and our new field, the attribute postal code. I would like to rename it in order to make it easier for the users to read, so it's going to be Postal Code Information. Let's go and hit OK. Now the users can hover the mouse over this information.
You can see that we have more details about the city. We have the postal code information inside it, and if we have multiple values, like in Paris, we get the asterisk. I usually explain it to the users like this: if you find the asterisk, it means we have more details behind the aggregation, which may raise the curiosity of the users to go for a more detailed analysis of the postal codes instead of the cities. With that, we are presenting the postal code information even though the level of detail in the visualization is the city. This is a very common use case for the attribute, where you can present more details in the visualization even if you have highly aggregated data in the view. For that we use the attribute function in Tableau. But sometimes, actually in most situations, the users want to see that information. They want to see those postal codes and their sales. In order to do that, we do the following. We create a new sheet, and this time we create a view where the postal code is the level of detail. All we need is the postal code and the sales. Drag and drop the sales to the view. Let's make it a little bit bigger to see the header information. So that's it. Let's call it Sales by Postal Code. This view can now be embedded in the original view. In order to do that, we go back to our view where we have the city as the level of detail. Now we want to embed the worksheet inside this view, inside the tooltip. Let's go to the tooltip over here and add a new line. Then we go to this menu over here, Insert. As the first option we have Sheets, and Tableau is going to show us all the sheets that we have in this workbook. It's going to be the last one, Sales by Postal Code. Let's click on that. Now we have embedded another worksheet inside the view using the tooltip. That's it. It's very simple. Let's go and hit OK.
Now let's hover the mouse over those cities. As you can see, we now have a table, a small view, inside the tooltip. If you go to Paris, we now see the two postal codes and the sales of those postal codes as well. This is how I usually do it as the next step if the users want to see more details. But of course, putting one view inside another needs more calculations and more resources in Tableau. If the users are happy with the asterisk, then stay with the attribute. But if they need more details, then you have to create another view and put it inside the tooltip. All right, so that's it for the first use case. We use the attribute to show more details to the users when we have a high level of aggregation in the view, and we usually use it in the tooltip. All right, now let's move on to the second use case where I usually use the attribute function in my projects: checking the data quality inside the data sources. Usually, if you are working with data, you have some expectations about the data quality. If you have any suspicions, you can use the attribute function to investigate the situation. For example, let's say the expectation in our data is to have only one country for each customer. The data should not allow multiple countries for one customer. If you are skeptical about this, or you want to check the quality of the data that you get, you can use the attribute function like this. We can go, for example, and take the customer ID. We can take the first name and the last name. Now we would like to check the quality of the country. But since we have a lot of data inside our data source, it can be really hard to understand, just by checking the values, whether we have multiple values for a customer or whether it is a one-to-one relationship. Instead of that, we can aggregate the country using the attribute function. Let's do it this time the quick way.
Right-click on the country and let's apply the attribute function. At the start, you might say: okay, nothing has changed. But now, in order to validate the data quickly, we can use it as a filter. Right-click on the country over here and choose Show Filter. Now on the right side Tableau is going to show us all the possible values that could occur in this view. Here we have the asterisk, and we have France, Germany, Italy, and USA. Of course, the interesting one is the first one, so I'm going to deselect everything and select only the asterisk. Now we can see that, with the asterisk selected, we don't get any data. This is perfect. That means the data quality inside our data is perfect and we have exactly one country for each customer. But if we start getting data for the asterisk, it means we have multiple values for some customers and we can investigate the situation. So this is a one-time analysis to check the data quality. But let's say that the next day or the next month we get a lot of new customers and we always want to check this information. We can build a data quality dashboard, for us or for the users, to check whether our expectation is correct, selecting only the asterisk. We can explain that we expect this view to always be empty. If this view is not empty, then we have a data quality issue. We can add this information to the title. We can call it Data Quality Check, say that it's about multiple countries, and that it is expected to be empty. If it's empty, then everything is fine. That's all for the second use case for the attribute function in Tableau. As you can see, it's really handy in projects, right? To understand your data, to do data quality checks and so on, or to show more details to the users inside the tooltip. All right, so that's all for the attribute function in Tableau. With that, we have covered many important functions under the category of aggregate calculations.
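The attribute rule and the data quality check from this lesson can be sketched in a few lines of Python. This is only an illustration of the logic, not how Tableau implements it. The USA city names and the customer rows are made up, the helper names `attr` and `attr_by` are hypothetical, and treating missing values as ignored is an assumption of this sketch.

```python
from collections import defaultdict


def attr(values):
    """Mimics the attribute rule: one shared value is returned, otherwise '*'.

    Assumption of this sketch: None values are ignored, and a group
    that contains only None values yields None.
    """
    distinct = {v for v in values if v is not None}
    if not distinct:
        return None
    if len(distinct) == 1:
        return next(iter(distinct))
    return "*"


def attr_by(rows, group_key, value_key):
    """Groups rows by one dimension and applies attr() to another dimension."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {group: attr(values) for group, values in groups.items()}


# Country/city example (the USA cities are hypothetical).
customers = [
    {"country": "France", "city": "Paris"},
    {"country": "Germany", "city": "Berlin"},
    {"country": "Germany", "city": "Stuttgart"},
    {"country": "USA", "city": "Portland"},
    {"country": "USA", "city": "Seattle"},
]
city_per_country = attr_by(customers, "country", "city")
print(city_per_country)

# Data quality check: we expect exactly one country per customer.
profiles = [
    {"customer_id": 1, "country": "France"},
    {"customer_id": 1, "country": "France"},
    {"customer_id": 2, "country": "Germany"},
    {"customer_id": 2, "country": "USA"},
]
country_per_customer = attr_by(profiles, "customer_id", "country")
violations = [cid for cid, country in country_per_customer.items() if country == "*"]
print(violations)  # an empty list would mean the expectation holds
```

Filtering on the asterisk, as in the dashboard above, corresponds to the `violations` list here: if it is empty, every customer has exactly one country.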
Next we can start talking about the LOD calculations in Tableau. They are really interesting and important to understand. 147. Tableau | Introduction to LOD Expressions: All right everyone. So now we're going to talk about the third type of Tableau calculations. We have the LOD expressions, or LOD calculations. It is another way to aggregate the data in Tableau. Here we have only three functions: FIXED, INCLUDE and EXCLUDE. As usual, first we have to understand the concept behind them. Then we can go through enough examples in Tableau. So let's go. All right guys, so now we can understand when we need LOD expressions in Tableau using this very simple example. Let's say we are building a view where we have the category and the product name, and we are showing the total sales for each product. By looking at those two dimensions, you can understand that the product name is controlling the level of detail in our view. We have five products, and with that we get five rows. So the product name is splitting the rows of this table. But now we come to the issue. Suppose that in the same view, with the same dimensions and setup, you want to show the total sales for each category. Well, we cannot do that as long as we have the product name inside this view, because the product name is splitting the view into products. In order to show the total sales for each category, you would have to remove the product name from the view by just dragging it away. You can see that now we get the total sales for each category. But if you say, wait, wait, we need to have the product information in the view, we cannot drop it, then let's bring it back over here. If you need to have the product name and you still want to have the total sales for each category, we have to use LOD expressions. This is exactly the situation where we need the help of LOD expressions to control the level of detail of our aggregations.
Now let's go further and understand how LOD works. Okay, now we're going to go through some quick facts about the LOD calculations. First, an LOD calculation is going to aggregate the rows of the data source at the dimension level that we specify inside the calculation. That means the dimensions of the visualization will not control the level of detail. This time we have the level of detail of the LOD expression. For LOD calculations, like for aggregate calculations, Tableau is going to go to the data source in order to query the data there, and then bring the result to the visualization. The calculation happens on the fly. That means Tableau executes the calculation only if you bring the field to the visualization. Tableau will not precalculate and store the results inside the data source. Again, how it works: the visualization sends a query to the data source and the data source answers with the results. This is how Tableau executes the LOD calculations. All right everyone, we talked about the level of detail many times during the tutorials, but now let's understand what exactly we mean by the level of detail. Let's say that we use only a measure in Tableau, without any dimensions. With that, we are at level one and we get, for example, the total sales. If you are using the measure Sales, Tableau is going to summarize all the sales inside the data source and present them as only one row, one value. Without using any dimensions, we get the highest level of aggregation. Let's go to the next level. Let's say that we use a dimension like the category. In our small data source, it has only two values. Tableau splits this one value into two values. Here we can see more details about our sales. It's not only one value anymore, now we have two values. So that means this dimension is going to split our view into two rows. Moving on to the third level, let's say that you use the country inside the data source.
We have three countries. That means we are going to have three rows. We have more details now about the sales. As you can see, the sales are split into three rows. So the level of detail of the category is different from the country. With the category we have two rows, with the country we have three rows. Moving on to the last level: if you bring the order ID to the visualization, you get the highest level of detail. It is exactly the level of detail that we have inside the data source. We don't have any dimension in our data model that can break these rows into more detail. So we are now at the bottom, at the highest level of detail. And we have exactly 15 rows, because we have 15 orders. So each of those dimensions is going to break the visualization into a different level of detail. The category breaks it into two rows, the country into three, the product name into four, and the order ID into 15 rows. That means the level of detail is the highest at the order ID and the lowest if you don't use any dimensions. It's the opposite if you're talking about aggregation. You get the highest level of aggregation if you don't use any dimensions, and you get the lowest level of aggregation if you use a dimension like the order ID. So we understood that each dimension brings us to a different level of detail. This is what we mean by the level of detail in Tableau. All right guys, now we're going to understand the LOD functions in Tableau. First we can split those three functions into two categories. The first one is the static category, where we have only one function, FIXED. The second one is the dynamic calculations, where we have the two functions INCLUDE and EXCLUDE. If you want to have a fixed or static calculation, you can use FIXED. But if you need something more dynamic, then you have to use INCLUDE or EXCLUDE.
The dimensions inside our visualizations or in the LOD expressions define the level of detail, and each dimension has a different level of detail. For example, the category has only two values. That means the level of detail here is very low compared to the order ID, where we have the highest level of detail. Let's say that the current level of detail inside the view is the country, so we are at level three. We can use the LOD expressions in order to bring the calculation to a lower level of detail. We can use the EXCLUDE or the FIXED function to bring it, for example, to level two, the category. But now, in order to present the calculation in the current view, what happens? The values get duplicated, or replicated, like we have seen in the last use case, where we had the tables and all the values were replicated. Or we can use the LOD expressions to bring us to a higher level of detail, using INCLUDE or FIXED. But now, if we want to bring the calculation back to the current view, we have to aggregate, like we did with the average number of customers for each category, since the customers have a higher level of detail than the category. So you have to pay attention to the dimensions that you are using inside the LOD calculation. If it brings the aggregation to a higher level of detail, then you have to focus on the aggregate function that you are using to bring the result back to the current level of detail in the view. That means we always have to aggregate data in order to go back to a lower level of detail, that is, a higher level of aggregation. In that direction, we always have to use aggregate functions in order to come back to the current level of detail. But if the LOD calculation sits above the view, it's easy: the values are just replicated. All right guys, I hope that was clear. This is one of the most complicated concepts that we have in Tableau, compared to all other concepts.
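The idea of levels of detail can be tried out with a small Python sketch. The five orders below are made up for illustration (the course data has 15 orders), and `sum_sales` and `avg_per_category` are hypothetical helpers, not Tableau functions. The sketch shows that each set of dimensions produces a different number of rows, and that a result computed at a finer level has to be aggregated again before it fits a coarser view.

```python
from collections import defaultdict

# Hypothetical orders; not the course data set.
orders = [
    {"order_id": 1, "category": "Accessories", "country": "France", "sales": 100.0},
    {"order_id": 2, "category": "Accessories", "country": "Germany", "sales": 50.0},
    {"order_id": 3, "category": "Monitor", "country": "Germany", "sales": 300.0},
    {"order_id": 4, "category": "Monitor", "country": "USA", "sales": 200.0},
    {"order_id": 5, "category": "Accessories", "country": "USA", "sales": 150.0},
]


def sum_sales(rows, dims):
    """Sums sales at the level of detail defined by the given dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dims)
        totals[key] += row["sales"]
    return dict(totals)


# More dimensions (or a more detailed dimension) means more rows.
for dims in ([], ["category"], ["country"], ["order_id"]):
    print(dims, "->", len(sum_sales(orders, dims)), "row(s)")


def avg_per_category(per_order):
    """Aggregates order-level totals back up to the category level with an average."""
    buckets = defaultdict(list)
    for (category, _order_id), total in per_order.items():
        buckets[category].append(total)
    return {category: sum(v) / len(v) for category, v in buckets.items()}


# Finer level (category + order) brought back to the coarser category view.
per_order = sum_sales(orders, ["category", "order_id"])
print(avg_per_category(per_order))
```

Going the other way, from a coarser calculation to a finer view, needs no extra aggregation: the coarser value is simply repeated on every matching row.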
All right guys, now we're going to understand the syntax of the LOD expressions. They start with the function name, so it's going to be either FIXED, INCLUDE or EXCLUDE. After that we list the dimensions and then we have the colon. Then we have to define the aggregation. It's like in the aggregate calculations, something like the sum of sales, the average of sales, the maximum and so on. The most common aggregation that we use here is the sum of something. The whole expression is wrapped in curly braces. Let's go through a few examples. We can go with the following: we say FIXED, we don't specify any dimensions, and then we specify the aggregation, in this example the sum of sales, so { FIXED : SUM([Sales]) }. Now think about an LOD expression as if you were building a view in Tableau. You always have to specify the dimensions and the measures of the aggregation. Here we are telling Tableau to do the sum of sales without considering any dimensions. Now let's add a dimension inside the calculation, for example the category: { FIXED [Category] : SUM([Sales]) }. Again the same analogy: it's like you are building a view from the dimension category and the aggregation sum of sales. Of course, you can add more dimensions, like the category and the product name, separated by a comma. The same analogy applies: we have two dimensions in the view, category and product name, and then we have the sum of sales. And of course, we can use the other functions, INCLUDE or EXCLUDE, in those examples, or other aggregations like the average of sales and so on. So as you can see, building an LOD expression is very similar to building any view. You always have to define the dimensions and the aggregation of the measures. So that's all about the syntax of the LOD expressions. 148. Tableau | FIXED LOD Expression: All right, so there are two types of level of detail (LOD).
The first one is the one that we define inside our visualization. We call it the LOD of the viz. The other one is the one that we define inside the calculation. We call it the LOD expression. Now let's say that inside the visualization we have two dimensions, category and country, and we have the sales. On the right side, in the LOD expression, we use the FIXED function. Let's say that we have FIXED category, sum of sales. What we have done here is exactly like building any other view. You always need a dimension and an aggregation. With that, Tableau is going to, let's say, internally create a hidden view with the dimension category and the aggregation sum of sales. Since we say it is a FIXED function, Tableau will ignore the dimensions that we have in the view, so it works completely independently of the dimensions that are presented in the view. That means the calculation is going to be very static, and it doesn't matter what you do in the visualization. Nothing is going to change in the calculation of the LOD expression. What do I really mean? Let's say that you have added a new dimension to the view, let's say the product. Now you have made a change in the visualization. We have three dimensions: product, category and country. But the LOD expression will not change at all. It's going to get exactly the same results, because it still has the category and the aggregation of sales. So this is the main purpose of the FIXED function: to make the calculation independent of the dimensions that we have inside the view. Everything is going to be static. This is exactly the main difference between this function and the other two, INCLUDE and EXCLUDE. So as you can see, building LOD expressions is very easy. It's very similar to building visualizations in Tableau, where you drag the dimensions and aggregations. Here, instead, you have to define them inside the calculation. And you always have to define the dimensions and the aggregation.
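The "hidden view" idea can be sketched in Python to show why a FIXED result does not react to the dimensions in the view. This is only a mental model under stated assumptions, not Tableau's implementation: the rows are made up, `fixed_sum` and `build_view` are hypothetical helpers, and the sketch assumes the view always contains the FIXED dimension (here the category), so the fixed value can be attached to each view row without any further aggregation.

```python
from collections import defaultdict

# Hypothetical data source rows; not the course data set.
rows = [
    {"category": "Accessories", "product": "USB Mouse", "country": "France", "sales": 20.0},
    {"category": "Accessories", "product": "USB Mouse", "country": "Germany", "sales": 30.0},
    {"category": "Accessories", "product": "Keyboard", "country": "Germany", "sales": 50.0},
    {"category": "Monitor", "product": "Full HD Monitor", "country": "USA", "sales": 300.0},
    {"category": "Monitor", "product": "Full HD Monitor", "country": "France", "sales": 200.0},
]


def fixed_sum(rows, fixed_dims):
    """The 'hidden view': sum of sales at the FIXED dimensions only."""
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[d] for d in fixed_dims)] += row["sales"]
    return dict(totals)


def build_view(rows, view_dims, fixed_dims):
    """Builds the visible view and attaches the fixed value to each view row.

    Assumption: view_dims contains every dimension in fixed_dims.
    """
    fixed = fixed_sum(rows, fixed_dims)
    view = {}
    for row in rows:
        key = tuple(row[d] for d in view_dims)
        entry = view.setdefault(key, {"sum_sales": 0.0, "fixed_sales": None})
        entry["sum_sales"] += row["sales"]  # regular aggregate, follows the view
        entry["fixed_sales"] = fixed[tuple(row[d] for d in fixed_dims)]  # ignores the view
    return view


by_product = build_view(rows, ["category", "product"], ["category"])
by_product_country = build_view(rows, ["category", "product", "country"], ["category"])

for key, values in by_product.items():
    print(key, values)
```

Adding the country to the view changes `sum_sales` on every row, while `fixed_sales` stays at the category totals, which is the static behavior described above.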
So it's really simple once you understand it. Let's move to the next one, to the EXCLUDE. All right everyone, now back to our view where we have the product name. In this visualization, we cannot use aggregate calculations to show the total sales by category. In order to solve this, we're going to use an LOD expression with the FIXED function. Let's go and create a new calculated field. We will call it Sales by Category. Now we're going to use the FIXED function. So let's start typing FIXED and use this suggestion from here. Next we have to define the dimension. Since we say sales by category, we need the category. Let's add the dimension category, then the colon, and the aggregation is the sum of sales. At the end, we have to close the brackets. As you can see, it's very simple. We have to define the dimension and the aggregation that we need in the visualization. Let's go and hit OK. As usual, we get a new calculated field under the measures, and it's going to be calculated on the fly. That means Tableau will not store the results in the data source. Let's take the result and drag and drop it to the view over here. Now we see the sales by category in the results. We are ignoring the dimension product name, and the result is based completely on the dimension category. When I work with LOD expressions, in order to understand them, I always imagine that Tableau is creating a separate view to calculate the LOD expression, and then adds it to the current view. Let me show you what I mean by that. Let's open our calculated field again. On the right side over here we have the data source, since Tableau is going to query this data. We are saying FIXED category, so that means we grab the dimension category. Inside it there are two values, the accessories and the monitor. Next we have the sum of sales.
This is the aggregation: Tableau is going to grab the sales and start doing the aggregation, so it's going to sum up all those values. For the first section, Accessories, we get the total sales of Accessories. Then Tableau is going to sum up all the sales for the second category, and with that we get the total sales of Monitor. The output of our calculation, the LOD expression, can look something like this. As you can see, the level of detail in the LOD expression is completely different from the view. Here we have only two rows, and in the view we have five rows. In the next step, Tableau is going to merge those results into the view. The first three products belong to the category Accessories. That's why we see the total sales of Accessories for them in the view. The next two products belong to the category Monitor. That's why we see the total sales of Monitor. This is how I usually do it in order to understand LOD expressions if things get complicated. Now one more thing about the fixed calculations. We say that it is static, it is fixed. So it doesn't matter what I'm presenting in the view; we will always get the same results, and nothing changes in the LOD expression. What do I mean by that? Let's go and change a few things. Let's take the product name away. You can see we still get the same values. Let's add, for example, the country to the view. As you can see, nothing changes. The LOD expression has exactly the same values, and it is static. All right guys, that's how the fixed LOD expression works in Tableau. All right, the following case: I would like to create a histogram to measure customer loyalty. That means I would like to have the data distribution of the number of customers by the number of orders.
I would like to understand here what number of orders the majority of my customers are placing. That means I would like to understand the behavior of my customers. In order to build such a thing, we need two measures: the number of customers and the number of orders. Well, we have learned before how to build histograms, but only from one measure. If you have two measures, this time we have to go and create LOD expressions. So now let's do it step by step in order to learn how to build such a visual. All right guys, so first let's understand the data that we have. Let's show the number of orders for each customer. So let's go to the customers. Over here we are at the big data source. Then let's take, for example, the customer ID. With that, we have a list of all customers inside the data source. And then let's go to the orders and grab the order count. With that, we get the count of orders for each customer. Now let's sort the data. We can see we have only one customer with the highest number of orders, 29. Then we have 28: three customers ordered the same amount. Then we have one customer that ordered 26 times. Then we have, over here, five customers that ordered the same amount, 25 orders. Now since we have two measures, the number of orders and the number of customers, we have to turn one of them into a dimension. So I'm going to work with the number of orders and turn it into a dimension. We want those values, the 29, 28, 26, 25. In order to do that, we can create an LOD expression using the fixed function. Let's go and create a new calculated field. We can call it Number of Orders per Customer. We're going to build something very similar to this view using the LOD expression. We start with the fixed function, then our dimension is going to be the customer ID, like in the view.
And then our aggregation is going to be the count of orders. You can go with the count distinct if you are not sure whether there are duplicates inside the orders, but I'll stick with the count, and then we have the order ID. And then let's close it. With that, the calculation is valid; we have just built exactly this view. Let's go and hit OK. Now with that we've got our new field over here, the number of orders. Let's check the results. It's going to be exactly the same data that we have inside our view, but this time we have an LOD expression, which gives us more control over this measure. Now we're going to drop everything from the view. We just need the new calculated field. Now let's switch it to a dimension in order to have distinct values, then change it to discrete. With that, we've got something very similar to the bins: right here we have the distinct values of the number of orders. What is missing is, of course, the number of customers in order to have a histogram. So let's go to the customers count over here and just drop it on the rows. With that we've got exactly what we want, the data distribution of the number of customers. As you can see over here, for example, we have three customers that ordered four times. And here again, we have only one customer that ordered 29 times, if you remember the example. And then we have here those three customers that ordered 28 times. So you can quickly understand the behavior of the customers just by checking the view. We can see that most of our customers are ordering 11 to 16 times, which is really good. We don't have a lot of customers that are ordering only once. The left side over here is really low, which is very good. And of course, now we are summarizing all the data that we have inside the data source, across the five years. And now you might have the question: does the behavior of the customers change over time?
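The two steps of the histogram built so far can be sketched in Python as follows. This is an illustration under assumptions, not Tableau code: in Tableau the first step would typically be an expression such as `{ FIXED [Customer ID] : COUNT([Order ID]) }` (field names assumed), and the order pairs below are invented.

```python
from collections import Counter

# Hypothetical (customer_id, order_id) pairs, invented for illustration.
orders = [
    ("C1", "O1"), ("C1", "O2"),
    ("C2", "O3"),
    ("C3", "O4"), ("C3", "O5"),
    ("C4", "O6"), ("C4", "O7"), ("C4", "O8"),
]

# Step 1: like fixing on the customer ID and counting the orders.
orders_per_customer = Counter(customer for customer, _ in orders)

# Step 2: treat the order count as a dimension and count customers per value.
customers_per_order_count = Counter(orders_per_customer.values())

for n_orders in sorted(customers_per_order_count):
    print(n_orders, "orders:", customers_per_order_count[n_orders], "customer(s)")
```

The second counter is the histogram: its keys play the role of the bins and its values are the number of customers in each bin.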
In order to answer this question, you have to bring in the time. So we have to bring the order date; let's drag and drop it to the rows over here. And now we can see very quickly that the behavior of the customers is not changing over time. As you can see, the histograms look identical, right? Most of the customers are ordering 11 to 15 times, and that holds over the years. We cannot do such an analysis without LOD expressions. So you can see the power of LOD.
149. Tableau | EXCLUDE LOD Expression: In the visualization, we're going to have exactly the same view with the two dimensions, category and country. But now in the LOD expression we're going to use the exclude function, where we say exclude category, sum of sales. What we are telling Tableau is to go and exclude the dimension category from the visualization. That means in the LOD expression on the right side, we get all the dimensions from the visualization and then exclude the category. We remove the category from the dimensions of the LOD expression. Now in this example, the country is going to control the level of detail in the LOD expression, and Tableau is going to do the aggregation depending on this dimension. That means the exclude function will always remove the dimensions that are specified in the calculation. Here is the big difference between the exclude and the fixed: exclude depends on the dimensions that we have in the view. Let's say that we have added another dimension to the view, so now we have product, category, and country. What happens to the LOD expression? Tableau is going to take all those dimensions and exclude only the category. That means the calculation now depends only on the product and the country. You can see it is very dynamic, and it depends on the visualization. The exclude will always react to the dimensions that are in the visualization and remove the dimensions that we specify in the calculation.
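A small Python sketch of the exclude behavior described above, assuming invented sample rows and a hypothetical helper `exclude_sum`. In Tableau the expression would typically be written as `{ EXCLUDE [Category] : SUM([Sales]) }` (field names assumed). The point is only that the grouping dimensions are the view's dimensions minus the excluded ones.

```python
from collections import defaultdict

# Hypothetical sample rows, invented for illustration (not the course data).
rows = [
    {"product": "Mouse",    "category": "Accessories", "country": "France",  "sales": 10},
    {"product": "Keyboard", "category": "Accessories", "country": "Germany", "sales": 20},
    {"product": "Mouse",    "category": "Accessories", "country": "Germany", "sales": 5},
    {"product": "LCD",      "category": "Monitor",     "country": "France",  "sales": 30},
    {"product": "OLED",     "category": "Monitor",     "country": "Germany", "sales": 40},
]


def exclude_sum(rows, view_dims, excluded):
    """Sum sales over the view's dimensions minus the excluded ones."""
    lod_dims = [d for d in view_dims if d not in excluded]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[d] for d in lod_dims)] += row["sales"]
    return lod_dims, dict(totals)


# View with category and country: only country is left to group by.
dims_a, totals_a = exclude_sum(rows, ["category", "country"], ["category"])

# View with product, category, and country: product and country are left.
dims_b, totals_b = exclude_sum(rows, ["product", "category", "country"], ["category"])

print(dims_a, totals_a)
print(dims_b, totals_b)
```

The same expression gives different results for the two views, which is the dynamic behavior that distinguishes exclude from fixed.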
Moving on to the second LOD function that we have, the exclude. Let's say that I would like to have the total sales inside the view, but I would like to ignore the dimension category. In order to do that, we can use the exclude function. Let's go and create a new calculated field. Let's call it Sales Exclude Category. We start with the function exclude; let's select that. Then we have to specify the dimension that should be excluded. It's going to be the category. After that, as usual, we have to define the aggregate calculation. It's going to be the sum of sales. Let's close the brackets. So it's really simple. We are telling Tableau to always ignore the category in the calculation. Everything is valid, so let's go and hit OK. As usual, we get our new calculated field in the data pane. Let's drop it on the view in order to check the results. If you check the new results, you can see we got different numbers from the sales by category and from the original sales. What is going on over here? Since we are using the exclude function, the LOD calculation is going to depend on the dimensions of the view. Let's open our calculated field again, and let's see what Tableau is going to do. Tableau is going to depend on the dimensions that we have inside the view, so in the LOD calculation we would have the country and the category. But since we are saying here, OK, go and remove the category, Tableau removes the dimension category, and with that we are left only with the dimension country. Since we have duplicates here, we end up with only three countries. In the LOD expression we will have three rows. Now what Tableau is going to do is find the total sales for each country. The data source is going to be split into three groups, one for each country: we have France, Germany, and USA.
That means Tableau is going to go, for example, to France, sum up all the sales for those three orders, and put the result in the output. Then it does the same for Germany: it takes all those sales, sums them up, and gets the resulting sales for Germany. And then for the USA we have those four orders, and it sums up the sales for them. So the output of the expression is going to look like this: we have the country and the total sales of each country. Now if you compare the view to the results that we have, you can see that, as we exclude the category, we get the total sales for each country. Here for France we have 172, and for the second category, where we also have France, we get exactly the same total sales. The same thing is going to happen for Germany, so we have exactly the same values in both categories. For Germany we get this value, and for Monitor in Germany we get this value as well. As you can see, once you understand what is going on in the background, you will understand the results in the view. As we said, the exclude is dynamic. It is not like the fixed. We will not always get those results. It really depends on the dimensions that we have in the view. For example, let's add another dimension to the view. Let's go to the customers, take the first name, and drop it over here. Now if you look closely at the data, you can see that nothing changed in the sales by category, because it is always fixed to the category dimension, but the exclude this time has different numbers. If you compare with what we had at the start, the total sales for the countries, you don't find those numbers anymore in the sales over here. And that's because we have added a new dimension. We don't have only the country; we have the first name of the customers as well.
So that means now we have two dimensions in the LOD expression, the country and the first name. As a result, the output of the LOD expression can look like this. We have two dimensions, country and first name. We don't have the category; we excluded it. We removed it from the view's dimensions. And then we have the total sales for this combination of dimensions: the total sales for George from France, the total sales for Maria from Germany, and so on. Those numbers are exactly the same ones that you're seeing in the view. As you can see, the exclude function is dynamic and depends on the dimensions that are presented inside the view. This is how it works. Now let's move to the next one. We have the include.
150. Tableau | INCLUDE LOD Expression: All right, so now let's move to the include function. It is exactly the opposite of exclude. We're going to have the same example: in the visualization, we have the two dimensions, category and country. And now we're going to say to Tableau: include the customer dimension. And we're going to have the same aggregation, the sum of sales. What we are telling Tableau with this calculation is to add one more dimension to the visualization: to add the dimension customer to the two other dimensions that we have inside the visualization. Here again it's very dynamic. Tableau is going to take the dimensions that are presented in the visualization, the category and the country, and add a new dimension to them, the customer. The function include is very similar to the exclude. It is dynamic. It depends on the dimensions that we have inside the visualization. Again, the same example: if we add one more dimension, the product, we end up having three dimensions in the visualization, and Tableau is going to add one more dimension in the LOD expression, so at the end we have four dimensions: customer, product, category, and country.
So that means with the include function, we are saying: do the aggregation over all the dimensions that we have inside the visualization, plus one more dimension that comes from the calculation. So it's really easy, right? Now to summarize: the fixed function is very static. It doesn't care about the dimensions that we have inside the visualization. It is completely independent, so it's going to stay the same as you change the visualization. But the exclude and include depend on the visualization. Exclude is going to remove a dimension from the dimensions that are presented in the visualization, whereas include is going to add one more dimension to the dimensions that are presented in the visualization. So now we have an understanding of how those three functions work in Tableau. Now we're going to go back to Tableau in order to practice those three functions. So let's go. All right, so now we need to pay more attention to this function. The include is more difficult than the exclude and fixed, so let's have some coffee. Let's go. All right, so as we learned before, each dimension has a different level of detail. For example, the first name has more details than the country or the category. So now comes the issue. Say you want to remove such details from the visualization: you want to remove the customers' names and stick only with the category and the country. But still, you want to introduce an aggregation that has to do with the customers, with a dimension that has a lot of details. For example, we want to bring in an aggregation that shows the average sales of customers for each country and category, but without showing the customer information as a dimension. Let's go and remove the first name from here. We don't have any customer information here. But still we want to bring the aggregation to the customer level by calculating the average sales of customers.
In this case, if your aggregation is based on a dimension with a high level of detail, like the customers or the order ID, then you have to use the function include. So let's see how we can do that. Let's go and create a new calculated field, and we can call it Average Sales of Customers. We can use the function include, so let's select the include. Now we have to tell Tableau which dimension should be included in the view. Currently we have the category and the country. We would like to add the first name, or you can add the customer ID, it doesn't matter. Let's add the first name. And then we have to add the aggregation. This time we're going to use the sum of sales. Now you might ask, why do we have the sum of sales when we are talking about the average? Well, the average is going to be the second aggregation that we do on top of this LOD expression. First, we have to sum up the values that we have inside the data source, and then we can do the average on top of it. We're going to do it step by step, don't worry about it. Then we have to close the brackets like this. As you can see, now the calculation is valid. Let's go and hit OK. With that, as usual, we get a new calculated field. Let's drag and drop it to the view. We are still not there, because here we have the average sales of customers, but the function that is used in Tableau is the sum. We have to switch it to the average function. Let's go and do that. With that, we get the average sales of customers for each category and country. Now we're going to see, step by step, how Tableau executed the include. The include depends on the dimensions of the view; we have here the category and the country. That means Tableau starts with something like this: we have the category and the country. In the next step, Tableau is going to check the LOD function. Let's go and open it again.
We are now telling Tableau to include the first name in the dimensions that are displayed in the view. Tableau is going to grab that information, the first name, and present it in the output, so we have three dimensions: first name, category, and country. We get something like this. Now if you compare the number of rows of the LOD expression with the view, you can see that we now have more details in the LOD expression, since we added the first name. Here we have around eight rows, but in the view we have six rows. The level of detail of the LOD expression is higher than that of the view. Tableau is going to go to the next step and say, OK, we have to have the sum of sales. We have the sales as well over here, and Tableau is going to start aggregating the rows. For example, first we have George, Accessories, France. It's only this row over here; we don't have it anywhere else, so we get the 91. Then we have Maria, Accessories, Germany. For that, we have three rows. Tableau is going to aggregate those three rows. In the output we get something like this, and so on. So Tableau is going to sum up those values based on those three dimensions, and at the end we get something like this in the output. That is how Tableau calculated the sum of sales by including the first name in the dimensions that are presented in the view. Here we come to the issue that we have more details in the LOD expression than in the view. In order to bring those results to the view, we have to aggregate them again. We have to either sum them up or take the average and so on. We cannot bring those details over here without doing any aggregation. In this example, we want to find the average sales of customers for each category and country. That's why we have used the average function. That means if you are using the include function, or you have more details in the LOD expression, you have to aggregate the data in order to bring it to the visualization.
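The two aggregation steps described above (sum at the finer level, then average back to the view) can be sketched in Python. The rows and the helper names `include_sum` and `average_to_view` are hypothetical, and the numbers are invented rather than taken from the course data. In Tableau this would typically correspond to `AVG({ INCLUDE [First Name] : SUM([Sales]) })`, with assumed field names.

```python
from collections import defaultdict

# Hypothetical sample rows, invented for illustration (not the course data).
rows = [
    {"first_name": "George", "category": "Accessories", "country": "France",  "sales": 10},
    {"first_name": "Maria",  "category": "Accessories", "country": "Germany", "sales": 20},
    {"first_name": "Maria",  "category": "Accessories", "country": "Germany", "sales": 40},
    {"first_name": "John",   "category": "Accessories", "country": "Germany", "sales": 30},
    {"first_name": "Maria",  "category": "Monitor",     "country": "Germany", "sales": 50},
]

view_dims = ["category", "country"]


def include_sum(rows, view_dims, included):
    """Step 1: sum sales at the view's dimensions plus the included one."""
    lod_dims = view_dims + [included]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[d] for d in lod_dims)] += row["sales"]
    return dict(totals)


def average_to_view(lod_totals, n_view_dims):
    """Step 2: average the finer-grained totals back to the view's level."""
    groups = defaultdict(list)
    for key, total in lod_totals.items():
        groups[key[:n_view_dims]].append(total)
    return {key: sum(values) / len(values) for key, values in groups.items()}


per_customer = include_sum(rows, view_dims, "first_name")
avg_sales_of_customers = average_to_view(per_customer, len(view_dims))

print(per_customer)
print(avg_sales_of_customers)
```

For Accessories in Germany the sketch first gets one total per customer (60 for Maria, 30 for John) and only then averages them, which is why the inner aggregation is a sum and the outer one is an average.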
But on the other hand, if you are using exclude or fixed and the output of the LOD expression has a lower level of detail than the view, then what happens? We get duplicates. For example, you can see over here in the sales by category that we have duplicates. So it doesn't matter which function we use, sum or average, we will always get those duplicates. The same goes for the exclude. We had a lower level of detail in the expression compared to the view. That's why you can see duplicates. We have the same numbers over here: those three rows are repeated over here for the second category. This is the effect of the LOD expressions. If the level of detail in the expression is higher than in the visualization, then we have to aggregate the data. But if the level of detail in the LOD expression is lower than in the view, then we get duplicates. Now let's get back to our example. Tableau is going to find the average of those values. The first value is going to stay the same, because we have it only as one row. But now for those two rows, as you can see, Germany, Accessories, Tableau is going to find the average of those two values, and we get 954. Then for the next row, we have Accessories, USA. In the output we have only one row. That's why the average is exactly the same. The same goes for Monitor, France: the same value. But for the next value we have Monitor, Germany. Here we have two values. Tableau finds the average of those two values, and we get 433. And for the last one we have only one value. That's why we get exactly the same number. So as you can see, if you get more details as a result of the LOD expression, things get more complicated, and you have to be careful about which aggregation you are using in the visualization. All right, so with that we have learned how Tableau executes those three functions step by step.
Next we're going to learn real use cases of those functions. All right everyone. In this use case, we want to compare the sales of all subcategories to the sales of one specific subcategory, like the one selected here, Tables, in order to understand how the sales of the other subcategories are doing compared to this specific one. In order to build such a view, we have to use the power of LOD expressions. This time we can use the exclude. Let's learn step by step how to create such a view. All right, let's start with the first step, where we want to show the sales by subcategory. This is the easiest one. Let's drag the subcategory to the rows, and let's take the sales to the columns. Then we're going to sort the sales. Let's go and do that. Now our task is to find the difference between each subcategory and the specific subcategory Tables. For example, we're going to find the difference between the sales of Phones and the sales of Tables. That means in order to find the difference in each row, we need two measures. The first measure is going to be the sales of the current subcategory, for example the sales of Phones. As the second measure, we need the sales of Tables, and we need it to be in the same row as well. The first measure we have already, right? We have here the sales for each subcategory. But the second one we don't have yet. We need to have the sales of Tables in each row. In order to do that, we're going to create a new calculated field. Let's call it Sales of Tables. What we want to check now is whether the current subcategory is Tables. If yes, then show the sales. We're going to use an if statement, and we want to check the subcategory: if it equals Tables, and you should write it exactly like the value that we have inside the data source. Then what should happen?
We want to show the sales. Otherwise, do nothing: we want to have nulls if the subcategory is not Tables. What we are doing now is isolating the sales of the subcategory Tables. Let's go and hit OK, and let's bring it to the view over here. As you can see, we have isolated the sales of Tables in this new measure. But we still have the problem that we would like to repeat this value for each row. As you can see, we have it only where the subcategory equals Tables. Now, in order to repeat this value for all the rows, here comes the trick, or the magic, of the LOD expression. As you learned before, the exclude repeats the values, right? We can use this trick. What we can tell Tableau is: imagine that there is no subcategory in this view. What happens then? This measure is going to be repeated for all rows. Let's go and do that. Let's create a new calculated field. We can call it Exclude Subcategory. Now we have to use nested calculations, because if you put everything in one calculation, it's going to be really complicated. We want to tell Tableau: imagine that we don't have the subcategory in our view. So exclude subcategory, and the aggregation is going to be the sum, but this time of the new measure that we created for the tables: sum of Sales of Tables. And then we have to close it, something like this. We are telling Tableau to exclude the subcategory from the view and do the aggregation. Let's see what happens. Hit OK, and drag and drop it to the view over here. As you can see, since we are ignoring the subcategory completely, we have only one value, and we get the same value repeated for each row. So now we have all that we need to find the differences, right? We have the sales of each subcategory and the sales of the specific subcategory, Tables. So now we're going to move to the last step, which is going to be the easiest part, where we want to find the difference between those two measures. So we're going to subtract them.
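The three steps of this use case (isolate the Tables sales, repeat that value for every row, subtract) can be sketched in Python. The rows and numbers are invented, and the variable names only mirror the calculated field names used in the walkthrough; this is an illustration of the logic rather than Tableau code.

```python
from collections import defaultdict

# Hypothetical (subcategory, sales) rows, invented for illustration.
rows = [("Phones", 300), ("Phones", 100), ("Chairs", 250), ("Tables", 200)]

# The view: sum of sales per subcategory.
sales_by_subcategory = defaultdict(int)
for subcategory, sales in rows:
    sales_by_subcategory[subcategory] += sales

# Step 1, "Sales of Tables": keep the sales only for Tables, else null (None).
sales_of_tables = [sales if sub == "Tables" else None for sub, sales in rows]

# Step 2, "Exclude Subcategory": with the subcategory removed there is a single
# group, so the sum of the non-null values is repeated for every row of the view.
exclude_subcategory = sum(value for value in sales_of_tables if value is not None)

# Step 3, "Difference": sales of each subcategory minus the sales of Tables.
difference = {
    sub: total - exclude_subcategory for sub, total in sales_by_subcategory.items()
}

print(difference)
```

As in the walkthrough, the difference for Tables itself comes out as zero because the same total is subtracted from itself.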
Let's go and create a new calculated field. Let's call it Difference. We subtract from the first value, which is simply the sum of sales. This is the first value that we have over here. Then we subtract our new measure: it's going to be the sum of our exclude function, Exclude Subcategory. And that's it. Let's go and hit OK, and let's drop it to the view. With that we have solved the task. We have the differences between the sales of each subcategory and the sales of the specific subcategory, Tables. Of course, you can see that for Tables it is zero over here, because we are subtracting exactly the same sales from the sum of sales. It is a little bit tricky, but if you understand how the LOD expression works, you can really do such an analysis. Now let's drop everything else from here. We don't need those intermediate steps, so I'm just going to remove them. Of course, we can add coloring over here. Let's go to the measure on the right side and take it to the colors, and with that we can nicely see the differences between the subcategories and Tables. Now if you'd like to highlight Tables, since it's our main subcategory that we are comparing all the others to, we can make use of the Sales of Tables. Let's switch to this measure over here, the sum of sales, in the marks. Then let's take the Sales of Tables and put it on the colors, and with that you are highlighting the main subcategory. With that, we have done a really complicated analysis using LOD expressions.
151. Tableau | Table Calculations: FIRST, LAST, INDEX, RANK: Everyone, so now we're going to talk about the last type of calculations that we have in Tableau, the table calculations. Here we have different functions, like running, window, rank, first, last, index, and lookup. We're going to talk about all those functions in this tutorial. As usual, first we're going to understand the concept behind table calculations.
Then we're going to go back to Tableau in order to start practicing. Let's go. The first question is, what are table calculations? Well, they are calculations that are executed after the aggregation is done in the visualization. So they kind of aggregate the aggregations in Tableau. It's important to understand the level of detail here: it depends on the visualization. That means here again, the dimensions in the view control the level of detail. Now to the big difference between table calculations and the others. These calculations are performed on the data that we see in the view. Tableau will not go to the data source to query the data. Tableau queries the data that is presented in the view. That means the view is querying the view itself. It sends a query to the data inside the visualization, and the view returns the result back to the view itself. We are not going back to the data source; everything is queried inside the view. The other three types of calculations, the aggregate calculations, LOD expressions, and row-level calculations, always query the data from the data source and bring the result to the view. Only this type of calculation queries the data in the view. All right guys, in order to create table calculations, we have to define two things. First, the scope. Second, the direction. The scope means which data is included in one calculation. For example, we have the following view. It looks like a table, right? We have rows and we have multiple columns. But here we can see that our data is split into groups. Each group is defined by the dimension quarter, so we have quarters 1, 2, 3, and 4. Now the first option that we have is the whole table. That means the calculation includes everything inside the table. It will ignore any partitions that we have inside the table.
It's going to start from the first value and end with the last value. Moving on to the next scope, the next option: we have the pane. This time, the calculation is going to focus on a smaller scope, on the partition or group of data that is defined by the quarter. That means the table calculation is done for each group separately. We do the calculation for those three rows, then we move to the second group, to the third group, and so on. Moving on to the last scope, we have the cell. It's only one value inside the view, so the scope is very small, including only one individual value. Here we have to define the scope of the calculation for Tableau: is it going to be the whole table, only the pane, the group of data, or only one cell? All right, the next thing that Tableau needs from us is the direction of the calculation: how the calculation moves through our table. Here we have four different options. The first one is down. That means we start from the top value and move down until we reach the bottom. That of course depends on the scope, whether we are running through the whole table or only a group of values, like we have in the pane. In this example, we have table down. That means we are processing all the values in one calculation from top to bottom. Then it resets and moves to the second column, and does the same thing for the next year. The second direction is across. This time the calculation moves through the columns in one go: it starts from the first year and ends with the last one. Then it resets and starts with the next row, and so on. We are moving from left to right. Those two methods are the basics: either you move down or you move right. The next two directions mix those two methods. The first one is down, then across.
That means first we go down through the table and then we go across. It starts from the top and goes to the bottom. But this time it will not reset when it moves to the next column. It continues doing the aggregation: it goes to the right, across, then it moves again from top to bottom. Then across, top to bottom, until we reach the last value. That means here we don't have any resets; it continues the calculation through all values. It's not like the first two methods, where we have a reset for each row or for each column. This time the starting value is the top left and the last value is the bottom right. Moving on to the last direction that we have, I think you got it already. It's exactly the opposite. First we go across, then we go down. Here again, there are no resets. We start with the first value on the top left and go to the right first. Then we jump to the next row and go to the right again. We jump down and go right until we reach the last value on the bottom right. So that means the calculation first moves right and then jumps down to the next row. All right, so as you can see, it's not that hard once you get it. We have four different directions and three different scopes that Tableau needs from us in order to create table calculations. All right guys, in Tableau we have different methods for creating table calculations, depending on the difficulty. The first method that we have is the quick table calculations. As the name says, they are very quick and easy to create. Here we have a list of different table calculations. You don't have to configure anything; you just have to click on the function that you need, and Tableau does the rest. Here we have very common table calculations like the running total, the difference, rank, moving average, and so on. The second method is not going to be that quick.
We have to configure a few things. But still we are not writing any functions or any calculations; we are still clicking around. Here we have more options and more control to configure the table calculation compared to the first method, where you just select the function and that's it. Here again we have very similar functions: rank, running total, moving calculation. And we can define different options, like the scope, which dimensions control the table calculation, and so on. Moving on to the last method of creating table calculations: we can create a new calculated field and then use the functions that are made for table calculations. Here we have a list of many functions that you can use in order to do table calculations, but they are a little bit harder compared to the first two methods. As you can see, as you move from left to right, things get harder, but with that you get full control and all the options. Next, we will go back to Tableau in order to try those three methods, and we're going to try a few of the table calculation functions.
All right guys, so back to Tableau. Let's go to the big data source. Let's go to the products and get the usual stuff: we're going to get the category, the subcategory, and the sales as usual. So I'm going to show you the different methods for creating table calculations, and we're going to start with the first one, the quick table calculation, which is the easiest. We're going to do it on the view, so it's only going to be locally available for this view. It's not like creating a new calculated field. So we go to our measure over here and right-click on it. Here we have two options. The first one says Add Table Calculation and the second one is Quick Table Calculation.
The first one is the middle method that I showed you previously in the presentation, where you have to configure different things. The second one is the easiest and quickest, where we can create a table calculation with only one click. Now let's check the quick table calculations. If you go over here, you will find a list of different table calculations. Let's check, for example, the running total. Click on that. Here there are two things to notice. First, the numbers changed, because here we have a different function. Second, we have a new icon on the measure. Tableau wants us to quickly identify whether the measure is using an aggregate calculation or a table calculation. If you see the triangle, that means this measure is using a table calculation. As you can see, with only one click we have created a table calculation. Here we have the running total. Don't worry about it, I'm going to explain it step by step later. Now you might say: you know what, we didn't define anything, neither the scope nor the direction of the calculation. So how can we do that? If you go back to our measure, the table calculation, and right-click on it, you can find that we now have more options once we have converted it to a table calculation. Exactly here, under Compute Using, we have those options. Here we can define the scope (table, pane, cell) and the direction as well. You can also see that we have other options, like Clear Table Calculation, if you want to remove it and go back to the aggregate calculation. Once you do that, you can see we got back our sum of sales without the icon. That means we are no longer using the table calculation; we are now using the aggregate calculation. So that's all for the first method, how to quickly create table calculations in Tableau. But we don't have a lot of options to configure. That's why we have the second method, where we have more options to control the table calculation.
But again, we're going to create it locally, only for this view. It will not be available in the data source. All right, so before I show you how to do that, we're going to add one more dimension to our view. Let's get the years of the order date. I would like to have only three years, so I'm going to show it as a filter and remove the first two years in order to have less data in the view. Now, in order to create a table calculation only for this view, with more options, we go back to our measure, the sum of sales. Currently it is an aggregate calculation, but we want to convert it to a table calculation, so right-click on it, and this time we go to the first option, Add Table Calculation. You can see we have this small icon indicating that this is a table calculation. Click on that and we get a new window to configure our table calculation. So what do we have here? The first thing we have to define is the calculation type. We have here a menu of different functions for table calculations: again the running total, the rank, the difference, and so on. Let's stick with the first one, Difference From. Here we have to define two things for Tableau, the scope and the direction. They are always together; they are not split into separate options. The first one is Table (across). Tableau did a really great job here by highlighting how the calculation is going to work. As you can see, Tableau is highlighting in yellow how the calculation is going to be performed, just to help you understand how it's going to work. It's really great. We have table across, from left to right; then we have table down, from top to bottom. Then we have the option of across then down. As you can see, it's going to affect the whole table, since we move from the top left to the bottom right. Then we can define the other scope, like for example pane down, as you can see.
Now the scope is smaller compared to table down. Table down includes everything in this column, but pane down includes only this group. As you can see, our view is split into three groups based on the category. We have the first group over here, the second, and the third, and Tableau is highlighting the first group. It is like a partition. As another option, we have the cell, where Tableau highlights only one value. Or we can define specific dimensions to do the calculation. Here we have a list of all the dimensions that we have inside the view, and you can select what the scope is going to be, whether it's the subcategory or the year of the order date. Then each function has more specifications. For example here: which values are relevant for this calculation? Again, don't worry about it, I'm going to explain how the difference works as well. In Tableau you have to define whether it's previous, next, first, and so on. Each function in Tableau has different options. For example, if you go to the rank, you will find that we no longer have previous, next, and so on; instead we have different options to configure the rank. Each table calculation function has a different set of options to be configured. All right, that's all for this method. As you can see, we got more options compared to the first one. Let's close this.
Now let's say that we are interested in having this calculation for all other worksheets; we want to reuse it. In order to do that, we go to our measure and just drag and drop it onto the data pane. With that, we get a new calculated field. This time we are using the rank of sales. I can rename it Rank of Sales. And with that, we've got a new field in our data pane and we can reuse it in different worksheets. All right, so now we can move to the last method of creating table calculations in Tableau.
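To make the four table directions concrete outside of Tableau, here is a minimal Python sketch of the order in which each direction visits the cells of a small grid. It is only an illustration of the idea, not how Tableau implements it; the function name `traversal` and the direction labels are made up for this example. Each inner list is one run of the calculation, so a new inner list means the calculation resets.

```python
def traversal(n_rows, n_cols, direction):
    """Return the runs of (row, col) cells visited for a given direction.

    Hypothetical helper for illustration only. Each inner list is one
    run of the calculation; a new inner list means a reset.
    """
    if direction == "table_down":
        # Top to bottom within a column, then reset for the next column.
        return [[(r, c) for r in range(n_rows)] for c in range(n_cols)]
    if direction == "table_across":
        # Left to right within a row, then reset for the next row.
        return [[(r, c) for c in range(n_cols)] for r in range(n_rows)]
    if direction == "down_then_across":
        # One single run: down each column, then on to the next column.
        return [[(r, c) for c in range(n_cols) for r in range(n_rows)]]
    if direction == "across_then_down":
        # One single run: across each row, then on to the next row.
        return [[(r, c) for r in range(n_rows) for c in range(n_cols)]]
    raise ValueError(f"unknown direction: {direction}")


for name in ("table_down", "table_across", "down_then_across", "across_then_down"):
    print(name, traversal(2, 2, name))
```

On a 2 x 2 grid, the first two directions give two runs each (one per column or one per row), while the two mixed directions give a single run that starts at the top left and ends at the bottom right.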
We're going to create a new calculated field and use functions. So let's do that. We will start with the function INDEX. Let's create a new calculated field. We can call it Index. The syntax is very simple: start with INDEX and that's it. We don't need to specify anything for this function. You can see the calculation is valid. Let's click OK. With that, we got a new measure, a new calculated field. Let's check the results, so I'm just going to drag and drop it on the view. What this function does is return the position number of the current value. That means the first position in this view is the first row; as we move from top to bottom, this is position number one, then position number two, three, four, and so on, until we get the last value as the last position. Now you might notice that we are counting all the rows in the table; we are using the scope of the table. We can check that if we go over here to our measure and right-click on it. We can see that Compute Using is Table (down). Let's say that we would like to have an index for each group, not for the whole table. Let's switch it to Pane (down). Now as you can see, the calculation runs on the pane, not the whole table. For the first group, we have the first row, the bookcases, then the second, third, fourth, and so on. Then it resets for the second group. In the second group, this row is number one, and the last position, or index, in this group is the supplies, not the last row of the table, the phones. As you can see, it always resets for each group, because we have specified the scope as the pane only. Now if you switch it to the cell, so Compute Using, Cell, you can see that each cell is the first value: the position number for each row is one. This is how the scope works in Tableau. All right, now let's switch it back to Table using Compute Using.
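The way the position number changes with the scope can be sketched in a few lines of Python. This is only a rough model of the behavior shown in the view, not Tableau's implementation; the function name `index_by_scope` and the sample groups are assumptions for illustration.

```python
def index_by_scope(groups, scope):
    """Return 1-based position numbers for every row, in view order.

    groups: list of panes, each pane being a list of row labels.
    scope:  "table", "pane", or "cell" (hypothetical labels).
    """
    if scope == "table":
        # One counter for the whole table, never reset.
        total = sum(len(pane) for pane in groups)
        return list(range(1, total + 1))
    if scope == "pane":
        # The counter resets at the start of every pane.
        return [i for pane in groups for i in range(1, len(pane) + 1)]
    if scope == "cell":
        # Every cell is its own scope, so every position is 1.
        return [1 for pane in groups for _ in pane]
    raise ValueError(f"unknown scope: {scope}")


# Illustrative panes; the labels are only sample data.
panes = [["Bookcases", "Chairs"], ["Art", "Binders", "Supplies"]]
for scope in ("table", "pane", "cell"):
    print(scope, index_by_scope(panes, scope))
```

With these two sample panes, the table scope counts 1 to 5, the pane scope restarts at 1 for the second group, and the cell scope gives 1 everywhere.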
As you can see, it's very simple. Let's try another function in Tableau. This time we're going to use the FIRST function, so let's create a new calculated field. We're going to call it First. The function is really easy as well: it's FIRST and that's it. It's like INDEX; you don't have to specify anything inside the calculation. The calculation is valid. Let's hit OK and check the result in the view. Let's drag and drop it over here. Now we can see that Tableau assigns the first row the value zero, and as we move down, the numbers are decreasing. Those numbers show how many steps we need to get back to the top, to the zero. Here, for example, we need three steps to reach the first row, and here we have -11 until we reach the top value. So here we have the distance between each row and the first row.
In Tableau there is another function that does exactly the opposite: LAST. So let's try it. Let's create a new calculated field. It's going to be the LAST function, and we'll call it Last as well. It doesn't need any fields inside it, so that's all; the calculation is valid. Let's hit OK and drag and drop it on the view over here. Now we can see that it has exactly the opposite effect of FIRST. Tableau assigns the last value in our view the zero, and as you move toward the top, the values increase. Here again we have the distance, or how many steps we need to reach the last value.
Okay guys, we have one more function that is very similar to LAST, FIRST, and INDEX, in that it gives us the position number of the rows: the RANK function. Let's create a new calculated field. We're going to call it Ranks. It starts with the keyword RANK.
As you can see, we have five different functions for ranking the data. We're going to start with the easiest one, the first one, so let's select RANK. Here we can specify two things for Tableau. The first one is the expression, or the aggregate function. In this view we have the sum of sales, so let's define that: SUM of Sales. The second piece of information is optional: how to sort, ascending or descending. If you leave it empty, Tableau uses descending as the default, so let's stay with the default. That's all; the calculation is valid. Let's hit OK. With that we got a new calculated field. Let's drag and drop it to the view to check the results. Now we can see that Tableau ranks all the subcategories based on the sum of sales. We can see over here that phones have the highest sales, so they get rank one, and then the second highest sales, with rank two, are the chairs.
All right guys, so now if you look at those four functions and their results, you can see that they are very similar to each other, right? They define the position number of the rows using different methods. Now you might ask, what are the use cases of those four functions? Well, generally there are two. First, we can use them as a filter in visualizations, and second, we can use them in other calculations. For the first use case, for example, let's pick the rank and show it as a filter to the users. They can then specify, for example, the top five subcategories in the visual. You already know that there are different methods for showing the top products or the top subcategories in visualizations. This is one of them. Or we might be in a situation where we have a very big visualization with a lot of rows, and I would like to show the users only the first five rows.
Without any specifications or ranking or anything, we can just show the first five rows. In order to do that, we go to First and show it as a filter. Let's reset the rank. We can go over here and define: okay, I would like to see the first five rows. Or the opposite: we want to show the last five rows, so we go to Last and show it as a filter. Let's reset First. Now we can go over here and say, okay, I would like to see the last five rows inside my view. So this is the first use case for these very simple table calculation functions: we can use them as a filter.
All right guys, moving on to the second use case for these functions. I usually use them in other calculations to generate a reference line. Let's have a quick example. Let's create a new worksheet. We're going to take the order date to the columns and the sales to the rows. This time we're going to have the months as well, so let's change it from year to month. I would like to have it as a bar diagram. As usual, I want to show the labels and take the colors from the measure. The task now is to show a reference line based on the first value in the diagram. The first value is 21,000. I would like to have it as a reference in order to compare the other months with it. We can do that using the FIRST function, but we have to use it inside another calculation. Now, in order to make it simpler to see how this works, I'm just going to duplicate this view and turn it into a table. Let's go to Show Me over here and switch it to a table. Then I'm going to take the months to the rows. Now we have a very nice table. I would now like to have the first value as a new calculated field. I would also like to add to this view the values from the FIRST function. Let's get the field that we already created and drop it on the view. You can see the first row in this table is January 2018.
So we have the value of zero. Now I would like to show the sales only for this row. I'm not interested in the other rows; only for the first row do we have to show the sales. In order to do that, we have to create a new calculated field. Let's call it First Sales. The logic is like this: we check whether the FIRST function equals zero, that is, whether we are at the first row, where as you can see we have the value zero. What happens then? We want to show the sales, so after THEN we have the field sales. Otherwise we don't want to show the sales, so we can end the statement there. As you can see, if the position number is zero, like in the first row, then show the sales. Otherwise don't show anything. Let's hit OK. With that, as usual, we got our new measure. Let's drag and drop it to the view over here. As you can see, Tableau shows the sales only if FIRST equals zero. If not, as you can see, we don't have anything. With that, we got the first value of the sales, and now we can use it as a reference line.
In order to do that, we go back to our original sheet and add our new calculated field to the details. Then let's go to the sales axis, right-click on it, and add a reference line. The value should be based on our new calculated field, so let's switch it to First Sales. We can also change the label from Computation to Custom and say, okay, this is the first. That's it. Let's hit OK. Now as you can see, we got our new reference line, and its value is always based on the first value. As you can see, it's 21,000. So now we can compare the other values to our reference line. And this is very dynamic. For example, let's add a filter to our view.
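The First Sales logic can be sketched in Python as follows. This is only an illustration of the idea (take each row's offset from the first visible row, then keep the value where that offset is zero), not Tableau's implementation. The helper names are made up, and apart from the 21,000 starting value the sample numbers are assumptions.

```python
def first_offsets(n):
    """Offset of each row from the first row: 0, -1, -2, ... (FIRST-style)."""
    return [-i for i in range(n)]


def last_offsets(n):
    """Offset of each row to the last row: n-1, ..., 1, 0 (LAST-style)."""
    return [n - 1 - i for i in range(n)]


def first_sales(sales):
    """Keep the sales value only where the FIRST-style offset is zero."""
    offsets = first_offsets(len(sales))
    return [value if offset == 0 else None
            for value, offset in zip(sales, offsets)]


def reference_value(sales):
    """Value for the reference line: the one non-empty First Sales entry."""
    kept = [value for value in first_sales(sales) if value is not None]
    return kept[0] if kept else None


# Only 21000 comes from the lesson; the other values are made-up samples.
monthly_sales = [21000, 15000, 30000, 18000]
print(first_offsets(len(monthly_sales)))
print(last_offsets(len(monthly_sales)))
print(first_sales(monthly_sales))
print(reference_value(monthly_sales))
```

Because the offsets are computed on whatever rows are passed in, calling `reference_value` on a filtered list (for example `monthly_sales[2:]`) returns the first value of the remaining rows, which mirrors how the reference line follows the visible data.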
Let's go to the order date and show the filter. Now what happens if we deselect 2018? The first value is now from January 2019, so we get 47,000 as the reference line. With that, we can understand the power of table calculations. They are based on the visualization, not on the data source. Anything you change in the visual, the table calculation reacts to it, which makes it very dynamic. This is another use case for those four functions: FIRST, LAST, INDEX, and RANK. For example, you can say, let's base the reference line on the last value in the table, so you can switch it. That's it for those four functions.
152. Tableau | Table Calculations: RUNNING TOTAL: Guys, now we're going to talk about a very important and very common table calculation in Tableau: the running total. The running total sums all the values as they progress over time. For example, in this view we can track the performance of our business by comparing the three categories of our products. We can see here the development, or progress, of customers and of orders, in order to quickly understand whether our business is growing or declining. If you compare those three categories in this view, you can see that office supplies is growing very fast compared to the other two. So you can see that using the running total in our view helps us understand the progress and performance of our business. Now let's understand how this function works in Tableau.
Okay guys, so how does the running total calculation work? It adds each value to the sum of all previous values. Let's have an example in order to understand it. We have here the months and the sales, and we want to build the running sum.
We start with the first value, so we are currently at the first row, and since we don't have any previous sum of values, the result is exactly the same value. The calculation is: the current running total equals the sales value. That means in the output we get exactly the same value, 2,067. On to the next month, February. Currently we are at this level, at the sales of 523, and the previous running total is the old one from January. To get the running total for February, we simply add those two values: the sales value plus the previous running total. With that we get 2,590. So as you can see, we are simply adding the current sales to the previous running total. Let's move to the next month. We have a new current value, 6,422, and we add it again here to the previous running total. So we have the same formula again, and with that we get 9,013. As you can see, we are just adding the current sales to the running total from the previous month. We can proceed through our table like this until we reach the last row. It's exactly the same. We are currently at December, and this is our current value. We add it to the previous running total from the previous month, November, and we get the last value. With that we have the final value of the running total. As you can see, we have built the progress, or development, of the sales over the months. This is how the calculation of the running total works.
Let's go back to Tableau in order to learn how to create it and build the visualization using the running total. Let's start with the big data source and go to the products. Here we're going to get our category to the rows, and then we need the date, so we get the order date from the orders table and put it on the columns. We need it as a continuous month, so right-click on it.
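As a side note to the walkthrough of the running total, the rule of adding the current value to the previous running total can be sketched in Python. This is a minimal illustration using the standard library, not Tableau's implementation; the function names and the sample numbers are assumptions made up for the example.

```python
from itertools import accumulate


def running_total(values):
    """Each output is the current value plus the previous running total."""
    return list(accumulate(values))


def running_total_by_group(groups):
    """Running total computed separately for each group (for example,
    per category), restarting from zero in every group."""
    return {name: running_total(values) for name, values in groups.items()}


# Made-up monthly values, not the numbers from the lesson.
print(running_total([100, 50, 200]))

# Made-up groups, showing that each one gets its own running total.
print(running_total_by_group({"A": [1, 2, 3], "B": [10, 20]}))
```

The per-group variant corresponds to the situation where the view is split by a dimension such as category, so each group builds up its own total.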
Then let's switch it to this option over here. Now we need the measures. Because we are tracking the progress of customers, we want the count of customers. We go to the customers over here, grab this measure, the customers count, and put it in the view. Now we change the visual from line to bar, so we go to the Marks over here and change it to Bar. Now we have the total number of customers for each month. We still don't have the running total. To get it is very simple: we can use the quick table calculation, the easiest method. Right-click on the customers over here, then go to Quick Table Calculation and simply pick Running Total. Now we can see that Tableau converted it to running totals for each category, and we can see immediately that the progress of customers in office supplies is the best. As you can see, it's very simple.
What we are missing now is the count of orders, the number of orders. So let's get our second measure, the orders count. Let's grab it and put it near the customers over here. But as you can see, both measures look very similar, so we have to change the visual for the orders in order to see the differences between the two measures. How do we do that? If you go to the Marks over here, you can see we have three sections. The first one is All. That means anything I configure here affects everything, both of the measures. But since we want to change the visual only for the orders, we switch the marks to the orders. Let's click on that. In this tab I'm now configuring the running total of the orders. Instead of a bar, I would like to have it as a line. If you go to the colors over here, we can add this dotted line in order to see the differences between the measures. And I can also reduce the opacity of this line.
All right, so now for the next step we're going to change the colors, because both of them are blue. Let's go to All, grab Measure Names from the left side, and put it over here on the colors. The next thing we can do is merge the two axes for each category into one. I would like to have only one axis. In order to do that, let's go to the orders and right-click on it. Here we have an option called Dual Axis. What it does is merge those two axes into one. Let's click on it. Now as you can see, we've got only one axis for each category. We no longer have the split between the two measures; everything is in one view. Now we can see that the axes are on the left and on the right. The next step, which we usually but not always do, is to synchronize those axes. Right-click on it and here we have the option Synchronize Axis. With that, both axes are at the same level. We can now hide the right one, because it is useless to have the same information twice, on the left and on the right. I will hide the header on the right side. And maybe we can get rid of the information that we have on the axis. So edit the axis and remove the title. That's it, let's close. I'm just minimizing the information that we have inside one view. That's it. As you can see, now we can track the progress of the customers and orders by category using a function that is very commonly used, the running total.
153. Tableau | Table Calculations: DIFFERENCES: All right everyone, so we're going to talk about the last table calculation function: the difference. The difference is very simple. It finds the difference between two data points. There are many use cases for this function, but the most famous one is to compare two things.
For example, to compare period to period. A very common one is to compare the sales or profit month by month, or year over year, in order to uncover seasonality or cyclical patterns. Now let's understand how this function works.
All right, in order to understand how the calculation works, we're going to have the following example, where we have the sales by month. Let's say that we are currently at the month of May, so the current value is this value. In order to create the difference, Tableau always needs two data points. The first one is always the current value, in this example the current sales of May. For the second data point we have more freedom: we can select which value is going to be compared to the current value. In Tableau, we have four different options. With the first one, we compare the current month with the previous month. In this example, we compare May with April. If you define it like this, with Previous, Tableau simply finds the difference between the current and the previous value; it just subtracts those two values. This is the first option. The second option is to compare the current value with the next month. In this example, we compare the month of May, the current one, with the month of June. Tableau simply finds the difference between the current and the next month by subtracting the values. Now moving on to the third option: we can compare the current month with the first month, the first value that we have inside the table. That means in this example, if we define First for Tableau, it finds the difference between the current sales, the sales of May, and the first value, which is January, and subtracts the values. Now moving on to the last one, I think you already got it.
We're going to compare the current month, May, with the last month, the month of December. Tableau finds the difference between the current value of May and the last value inside the visualization, December, so it subtracts the two values. As you can see, we have four different options for which value we compare with the current one: the previous value, the next value, the first value, or the last value. That means in Tableau we get really great control over which data points are compared to each other. Now let's go back to Tableau in order to start practicing this function.
All right everyone, so now we're going to create a view in order to compare the sales over time, over the years. We're going with the big data source. Let's go to the orders and take the order date to the columns to have the years. Then on the rows we would like to have the months and the quarters, so hold Control and duplicate it twice. The first one is going to be the quarter, so let's change the format to quarter, and the second one is going to be the month, so we change it to month as well. Now I would like to make the table a little bit bigger, so I'm just going to stretch it from the rows and from the columns. Now what is missing? Of course, our measure. Let's get the sales and put it in the view. Now we have the sales aggregated by month and spread across the years. Now we have to create the difference between those years. In order to do that, we go to our measure and right-click on it. This time we're going to use the option with more control over the calculation, Add Table Calculation. Let's do that. Now we have to configure a few things. First, we have to choose the calculation type. It's going to be Difference From, which is the default, so that's correct. Then Compute Using: which scope and which direction do we want? We want the direction from left to right.
We want to compare the years, which is currently correct. We don't want to compare the months with each other. If we wanted that, we could switch it to Table (down); with that, we would be comparing the months with each other. But now we want to compare the years, so let's select Table (across). Then we have to specify Relative To for Tableau. Here we have to define one of the four options that we learned before: previous, next, first, and last. In this example we want to compare the current year with the previous year, so we stay with Previous. That means, for example, if we pick this value, it's the difference between the sales of January 2022 and the same month of the year before, so the difference between this year and January 2021. And that's why for the whole year of 2018 we don't have any values. Because in this view we don't have 2017, there is no previous year. It's the first year; that's why it's completely empty. All right, so with that we have created the table calculation.
But as usual, we're going to change the view that we are presenting to the users. What I would do now is reduce the number of years to only two. So let's apply a filter, show the filter, and I would pick the last two years. Now I would like to add to the view the total sales for each month. In order to do that, let's grab the sales and add it to the view. Now on the left side we have the differences in sales, and next to that the aggregated sales. Now we can see very easily where those numbers come from: they are the differences between those two years. All right, as the next step, let's replace those numbers with visuals, with bars. In order to do that, we take our measures and put them on the columns: this is the first, and the second. Then let's change the visual.
Instead of line, we want bar, so let's go to the Marks over here and say we would like to have bars. All right. As you can see, all the measures have the same coloring. Instead of that, I would like to change the coloring of the differences. Let's go to the sum of sales over here; as you can see, it has the icon of table calculations. Then let's drag and drop the sum of sales with the table calculation to the color while holding Control. Let's also change the color of the first measure. So let's switch to the sum of sales, the aggregation, and go to the colors, and let's pick any color from here, like for example the blue. This information comes from the total sales, from the aggregate calculation, and this one comes from the table calculation. It's very simple to create, and with that we can compare the years for the sales. Now if you would like to analyze the differences between those two years, you can see that in January, for example, there's no big difference between 2021 and 2022. There is a small growth. But if you go, for example, to February, you can see there is a big difference between the two years; we made a lot of sales in this month. Another thing to notice here is that in November we made fewer sales than the year before. So as you can see, we can very quickly find the differences between the sales in 2022 and the sales of the year before. This is the power of the difference function. It helps us compare two things, like years, or maybe categories, months, and so on. All right, so that's all for the difference function in Tableau.
All right everyone, so that's all. We have covered the four types of Tableau calculations. With that, you have learned around 60 different functions in Tableau, so you have enough tools to create new fields in your data source and to manipulate your data. And with that, we have completed the section on Tableau calculations.
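As a short recap of the difference calculation, the four Relative To options (previous, next, first, last) can be sketched in Python. This is a simplified model for illustration, not Tableau's implementation; the function name `difference_from`, the option labels, and the sample numbers are assumptions. The sketch returns `None` where there is nothing to compare against, which matches the empty first year seen in the view.

```python
def difference_from(values, relative_to="previous"):
    """Return current value minus the comparison value for each position.

    relative_to: "previous", "next", "first", or "last" (hypothetical labels).
    Positions without a comparison value yield None.
    """
    n = len(values)
    result = []
    for i, current in enumerate(values):
        if relative_to == "previous":
            j = i - 1
        elif relative_to == "next":
            j = i + 1
        elif relative_to == "first":
            j = 0
        elif relative_to == "last":
            j = n - 1
        else:
            raise ValueError(f"unknown option: {relative_to}")
        # Subtract only when the comparison position exists.
        result.append(current - values[j] if 0 <= j < n else None)
    return result


# Made-up yearly values for one month, not numbers from the lesson.
sales = [10, 15, 12]
for option in ("previous", "next", "first", "last"):
    print(option, difference_from(sales, option))
```

With Previous, the first position has no earlier value and stays empty; with Next, the same happens for the last position, while First and Last always have a value to compare against.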
And now in the next section, things are going to get really interesting, because we're going to go and build around 63 Tableau charts. We're going to start with the basic charts, like bar charts, and we're going to progress to more complex charts in Tableau. 154. Tableau | Section: Tableau Charts: Let's jump in immediately and start building charts in Tableau. We're going to cover around 63 charts. So let's have a sneak peek at some of the visualizations and charts that are going to be covered in this course. You will start by creating some basic charts, like different bar charts: we have columns, rows, and stacked bar charts. After that, you're going to learn how to create different line charts, and we're going to have area charts as well. Then we're going to learn how to combine different types of charts, like for example a bar chart and a line chart. Moving on, we will be creating different maps in Tableau. Then you will go to the next level, where you're going to start building charts like scatter plots, slope charts, bubble charts, poly charts, and calendar charts. After that, we're going to go to the last level, to the advanced charts. For example, we have Pareto charts, waterfall charts, butterfly or tornado charts, quadrant charts, and funnel charts. So as you can see, we're going to cover a lot of Tableau charts and visualizations in this course. So now let's jump in and get started. 155. Tableau | Multiple Measures in One View: Before we start learning how to build charts in Tableau, we have to understand some basics, like for example how to add multiple measures to one single view. I have seen many new Tableau developers get confused about how to add a second measure to the visualization, because in Tableau we have different places and different methods for adding multiple measures to one single view. Here in Tableau we have three methods. The first one is to use individual axes for each measure.
The second method is to use one single shared axis using Measure Values and Measure Names. And the third one is to use a dual axis in Tableau. So now we're going to go and learn those methods step by step, and we're going to learn as well the advantages and disadvantages of each method. So let's go. All right guys, now we're going to start with the first method: individual axes for each measure. Let's see how we can create it and what it's going to look like. Let's go, for example, to our big data source. Let's put the order date on the columns. Now, in order to create individual axes for each measure, we're going to drag and drop the measures onto the rows or the columns. So for example, we're going to take the sales and put it on the rows. And let's take the profit as well and drag and drop it onto the rows too. Now we can see in our view that each measure has its own axis. That's why we call it individual axes for each measure. We can see that for the sales we have this axis that goes from 0 to 1 million, and for the profit it goes from 0 to 100K. Those two axes for those two measures are completely separated from each other. There is no overlapping or anything. Now, of course, since we have two measures, we can go and add a third, a fourth, and so on. There is no limit on how many measures we can add to our visualization. We can see now we have four measures, and each of those measures has a different axis with a different range. I would like you to understand something very important in Tableau: once you add multiple measures to the view, you will get multiple pages in the Marks. The Marks in Tableau is the place where you can go and customize the visualization, to customize the charts that we have over here. In our view, since we have multiple measures, we get multiple pages in the Marks. Let's check what we have over here. The first one is All. Then we have an individual Marks page for each measure that we have inside our view.
Now let's understand how this works. Let's start with the first one, All. On this page, anything that you change in the setup is reflected in all measures, in all charts. For example, instead of having lines, I would like to have bars. If I change it to Bar, as you can see, all the measures are changed to bar charts. Or if you go over here, for example, to the colors and change it to black, you can see that all our measures are now black, and so on. If you go to the size and reduce it, you can see the size of all our measures is reduced. So anything that I change in All is reflected in all measures in the view. But since we have individual axes for each measure, we can go and customize each of those charts individually. So for example, let's say that I would like to change only the sales. I can go to the Marks of the sales over here. So let's switch to the page Sum of Sales, and then instead of a bar, I would like to have it as a line. Now we can see we have changed the chart type only for the sales. Everything else stays as a bar chart. And the same thing for the profit. You can go over here to the profit and say, okay, instead of black, I would like to have it, for example, in blue. As you can see, this customization is applied only to this measure, only to the profit. And then the same thing for the other measures. Say that for the quantity I would like to change the chart type to something like an area. So let's switch to the quantity and then let's go to Area over here. With that, we have changed the chart type only for the quantity. So you can see those Marks are really helpful in order to customize our charts. You can go and do that individually for each measure, or you can go to All over here and then make the changes for all measures together. That's all for the Marks.
They are really important in order to customize the charts inside our visualizations. One more thing that's important to understand: we have four tabs inside the Marks here because we have four measures, and more precisely because we have four continuous measures. For the years, for example, we don't have any tab to customize them, because that field is discrete. For example, let's go and switch the Sum of Sales from continuous to discrete. Right-click on it and go to Discrete. With that, you can see that the Sum of Sales disappears from the Marks. That means we cannot customize it anymore, because it is discrete. Let's go and change it back to continuous. With that, we get it again in the Marks. So you can customize continuous fields. All right guys, as you can see, with this method we can go and customize our charts individually, as we want. Another advantage is that we can go and add as many measures as we want to our visualization. But the disadvantage is that we have separate axes, and in some situations it's really hard to compare the measures with each other if they are split like this. That's why we have different methods in Tableau to combine and merge the axes and the charts together. So that's all for the first method, where we have individual axes for each measure. All right guys, moving on to another method for combining multiple measures in one view, and that is sharing the same axis. We can do that using Measure Names and Measure Values. If you look at the Data pane, in each data source in Tableau you will always find two fields: Measure Names and Measure Values. Those two fields are automatically generated by Tableau. They don't come from the original source of your data. What are those fields? Measure Names is a discrete dimension that contains the names of all the measures that you have inside your data source.
On the other hand, we have Measure Values. It is a continuous measure that contains the values of all the measures that you have inside your data source in Tableau. There are two ways to use Measure Names and Measure Values. The first one is to simply drag and drop them from the Data pane into the view. Let's take, for example, Measure Names to the rows. As you can see, currently no measure values are selected, because we don't have anything in the view. Now what we're going to do is go to Measure Values and drag and drop it onto Text over here. And now you can see in the view all the measures that we have inside our data source: the count of customers, count of orders, discount, profit, sales, and so on. Those are all the available measures that Tableau can find inside your data source. Again, the names of the measures, the count of customers, count of orders, discount, and profit, come from Measure Names. And the values that we have inside this view come from Measure Values. So as you can see, it's very simple. Here you can control things. For example, you can go and remove any measure that you don't want to see inside the view. Let's go and remove the Sum of Unit Price: just drag and drop it somewhere outside. And as you can see, Tableau immediately created a filter. If you go over here to the filters and edit it, you will see a list of all the measures that we have inside our data source as well. If you want to remove some measures, you can go and deselect the measures that you don't want to see inside the view. Let's go and hit OK. With that, we have reduced the number of measures inside the view to four.
One more thing that we can do over here is to change the order of the measures inside our view. For example, let's take the count of customers from the top and put it at the bottom. You can see we just changed the order of the measures inside the view. All right, so this is one way to use Measure Names and Measure Values inside the visualization: just drag and drop them into the view. But there is another, quicker way to use this information. Let me show you what I mean. I'm just going to go and remove everything from our view and then start from scratch. Let's take the order date to the columns, and let's take, for example, the sales to the rows. So far we have only one measure in our view, and everything is as normal. But now let's say that I would like to add another measure to the view. Before, we learned that we can take the profit and put it next to the sales. But with that, we have learned that Tableau is going to go and create two individual axes. We don't want that, so let me just remove it. I would like to have one axis for both of the measures. In order to do that, we can use Measure Values and Measure Names. To generate that quickly, let's take the profit and, very slowly, drag it onto the axis of the sales. As you can see now, Tableau is going to show us two green vertical lines. With that, we are telling Tableau that we would like to share the same axis for two different measures. So let's just drop it on the axis. Here Tableau is going to go and convert everything: we don't have the Sum of Sales here anymore, we now have Measure Values, and in the filters we have Measure Names. Inside it we get only two measures, the profit and the sales. So as you can see, Tableau can prepare everything for us. This is a quick way to use multiple measures using Measure Values and Measure Names. And we can see as well, here in the Measure Values, that we have only those two measures. So now let's check the visual.
As you can see, we have only one axis for two measures. The green one is the sales, and the grey one is the profit. So that means those two measures are sharing the same axis. And of course, we are not limited to two: we can go and add more measures to our view. We can take, for example, the discount and drop it inside the Measure Values, as the last one for example. With that we get three lines: three measures are sharing the same axis. It's a really nice and compact way to compare multiple measures using the same axis. But of course you have to pay attention to the scale of the axis. For example, look at the scale of the sales. As you can see, the green one is really huge, 0 to 1 million. Now if you take the discount, as you can see, everything is almost zero, because its scale compared to the sales is very small. That's why, with this method, it makes sense to put multiple measures on the same axis only if they have a similar scale. If there is a big difference in the scales, it will not make sense to compare the measures visually. So in this example, it doesn't really make sense to use the discount inside this visualization, because we cannot really compare it. It has a really small scale. One more disadvantage of this method: if you check the Marks over here, you can see that we have only one tab for everything. We don't have individual Marks for each measure. That means we cannot go and customize each measure as we want, like we saw before in method one, where we could use, for example, a line diagram for one measure and a bar diagram for another, and so on. So we cannot customize each measure individually. Instead, all those measures share the same setup for the visualization. For example, let's go and change the size. If we do that, it's going to affect all measures inside the view, and I cannot change it individually.
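As a loose analogy outside Tableau, Measure Names and Measure Values behave like a wide table reshaped into a long one, with one record per measure. The sketch below is plain Python with made-up rows and numbers, and the function name `to_measure_names_and_values` is hypothetical. It also shows why a measure with a much smaller scale, like the discount, ends up almost flat on a shared axis.

```python
# Hypothetical yearly rows with three measures; the numbers are made up.
rows = [
    {"year": 2021, "Sales": 900_000.0, "Profit": 90_000.0, "Discount": 40.0},
    {"year": 2022, "Sales": 1_000_000.0, "Profit": 100_000.0, "Discount": 50.0},
]
measures = ["Sales", "Profit", "Discount"]


def to_measure_names_and_values(rows, measures):
    """Reshape wide rows into one (year, measure_name, measure_value) record per measure."""
    return [
        {"year": row["year"], "measure_name": name, "measure_value": row[name]}
        for row in rows
        for name in measures
    ]


long_rows = to_measure_names_and_values(rows, measures)

# One shared axis has to be tall enough for the largest value of any measure.
axis_max = max(r["measure_value"] for r in long_rows)

# How far up the shared axis the largest value of each measure reaches.
for name in measures:
    peak = max(r["measure_value"] for r in long_rows if r["measure_name"] == name)
    print(f"{name}: reaches {peak / axis_max:.4%} of the shared axis")
```

With numbers like these, the sales fill the whole axis while the discount stays a tiny fraction of it, which is why measures on a shared axis should have a similar scale.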
So everything that you make or change here in the visual affects all the measures. For example, let's go and change it to a bar diagram, and so on. The only thing that you can customize is the colors. If you go to the colors over here and edit the colors, you can assign a different color to each measure. But that's all. We cannot go and customize the charts as we want. So if you use Measure Values and Measure Names, pay attention: you don't have the freedom to change the visuals of your charts individually. But it's still very useful in many cases where you want to have multiple measures sharing one single axis. All right, so with that, I hope it's clearer now why we have Measure Values and Measure Names in Tableau. All right, now moving on to the last method. In order to combine multiple measures in one view, we can use a dual axis. A dual axis is a really great option and very useful in many scenarios where you want to compare two measures with each other. Let's see how this works in Tableau. There are two ways to create a dual axis in Tableau. The first one I'm going to show you now. Let's take, for example, the order date to the columns, and then let's take the sales information to the rows. Now I would like to get another measure inside our view. So let's take the profit and just put it on the rows, side by side, next to the sales. So here we are back at method one, where we have two measures separated on two individual axes. As you can see, those two measures are separated from each other. I would like to bring those two visuals on top of each other. How do we do it? Let's go back to our measures. As you can see, we have two measures, the sales and the profit. We're going to go to the profit, the one on the right side, and right-click on it. Here we have the option Dual Axis. So let's go and click on that. As you can see, those two charts are now on top of each other using a dual axis.
We now have the axis for the sales and the axis for the profit, one on each side. And we can see as well that the shape of the pills has changed: instead of two separate green pills, we now have the two measures, the sales and the profit, joined into one green pill. Now, if you check the scales of those dual axes, you can see that the sales go, as usual, from 0 to 1 million and the profit from 0 to 100K. So here we have two options. Either you can leave it as it is with two different scales, or you can go and make them the same. And this is what we do in most situations: we go and synchronize those two axes. In order to do that, let's go to the profit over here on this axis. Right-click on it, and here we have the option Synchronize Axis. Let's go and select that. As you can see, the profit axis now has exactly the same scale as the sales. It goes from 0 to 1 million, and the marks, the visual, adjusted as well to the new scale. So as you can see, now we have it at the bottom, where before we had it near the sales. Now you might ask: why use a dual axis at all? I can just go and use the Measure Values, like in method two, and I can add as many measures as I want to the view. So why do we have the dual axis? Well, there are two reasons. First, here you have the option to decide whether you want to synchronize the axes or not. If you go to method two with the Measure Values, you can see that everything is synchronized, you have only one axis, and you cannot change that. But if we go back to the dual axis, we always have the option to synchronize the axes or not. So this is one benefit. The second, the major benefit of the dual axis, is that I can go and customize each measure as I want. If you check the Marks, we have here, again, a tab for each measure. Again, All is going to customize both of the measures. But if you go to the Sum of Sales, we can decide the visual setup of this measure only. For example, I can go over here and change the size.
Or I can go to the Sum of Profit and say that instead of the line diagram, I would like to get a bar diagram. This is exactly the advantage of the dual axis: we can go and customize the chart for each measure individually while still using the same axis. You don't have this option if you are using the Measure Values, because there you have to make one decision, one setup, for all measures. The disadvantage here is that it is a dual axis, so it works for only two measures. But it's still a great way to compare two measures in Tableau. I would like to show you now the second way to quickly create a dual axis in Tableau. So let's go and remove all of this, and then let's take the sales again. Now for the second measure, instead of dragging and dropping it here next to the sales and then switching it to a dual axis, we're going to drag it onto the visual over here. If you move it to the right side, you can see that we have one vertical line here. Be careful: if you move it onto the axis, you get two vertical lines, which gives you Measure Values and Measure Names. We don't want that, we want a dual axis, so just move it to the right side, the opposite side from the axis. You can see we have one vertical green line. If you drop it, Tableau is going to go and immediately create a dual axis between those two measures. So this is how you can create a dual axis in Tableau quickly. One last point about the dual axis: the order of the measures has an effect on the visual. Let me show you what I mean. I'm going to go now to the profit and change it from a bar diagram to a line diagram. As you can see, the red line from the profit is in front of the sales. So that means the measure sales is in the back and the profit is in the front. If you want to switch that in the visual, you just switch the order of the dual axis. Let's take the sales from the left and just put it on the right.
As you can see, now the bar diagram is in the front and the line diagram is in the background, and in this situation it's not really nice to have the line behind the bars. Now let's go and switch it again, with the profit on the right side, so that we get it in the front and the sales in the back. All right, that's all for the dual axis. Now of course, in Tableau you can go and mix all those methods together in a single view. Here we have a dual axis. In this example, I can now bring in the Measure Values: instead of having only the profit, we can have the Measure Values, method two. In order to do that, let's take, for example, the quantity, and let's drag and drop it onto the axis of the profit. Let's drop it over here. As you can see, Tableau immediately switched the Sum of Profit to Measure Values. But on the left side we still have the sales. Now we have a dual axis between the sales and a bunch of measures. We can go and add more measures to the Measure Values. Let's take the unit price and add it over here. We can add the discount. Now let's just change the colors in order to make it clear. I am on the tab of the Measure Values. Click on the colors and edit them. The quantity, I'm going to give it green. The unit price, let's give it gray. The discount, this color. And that's all. With that, as you can see, we have different lines, but all of them are lines. We cannot change that, because they are Measure Values, so all of them share the same setup. And in the background we have the Sum of Sales from the dual axis. That means you can go and combine these methods. And of course we can go and add method one. Let's take the count of orders and just drag and drop it onto the rows over here, and you can see that Tableau went and created an individual axis for the count of orders. So if you look now at our measures in this view: for the first one, the Sum of Sales, we are using the dual axis. It is this bar diagram, the blue one.
And then on the right side of the dual axis we have a bunch, a bundle, of measures. Here we have the Sum of Profit, quantity, unit price, and discount. So we have a group of measures as one part of the dual axis, using the Measure Values. The count of orders is completely separate and is not sharing its axis with the others. We have it on an individual axis, using method one. All right, so as you can see, you can mix these methods. And this is exactly the power of Tableau: we have a high degree of customization in how we visualize our data. All right everyone, now let's have a quick summary. In order to combine multiple measures in a single view, in a single visualization in Tableau, we have three methods. The first one is to use individual axes. That means we have a different, separate, independent axis for each measure. The advantage of this method is that we can go to each measure and decide about the visuals: which visual type to use, the colors, the sizing, and so on. So the customization of the measures is independent. The second benefit is that we can go and add as many measures as we want to one view. But the weak point of this method is that it's really hard to compare those measures with each other. That's why we have the second method, where we can go and compare all those measures with each other using one shared, single axis. We can create such a visualization using Measure Names and Measure Values. So we have only one axis, and we can have multiple measures sharing it. The main benefit is that we can add as many measures as we want, and we can compare those measures better than with method one, since they share the same axis. But the disadvantage of this method is that we cannot go and customize each of those measures independently. That means all those measures share the same configuration of the visualization. So we cannot use a line here, then a bar there, and change something else.
We always have to use the same visualization for all measures. And that's why we have the third method in Tableau, the dual axis. The main benefit of the dual axis is that we can compare two measures closely with each other. We can decide whether to synchronize the axes or not. And the advantage compared to the previous one, the single axis, is that we can customize the visuals for each measure independently. So here we have a line diagram together with a bar diagram. The only disadvantage of this method is that we can compare only two measures. All right, okay, so those were the different methods for adding multiple measures to one single view, and when to use them. Next we're going to start building basic charts, and first we'll have the bar charts. 156. Tableau | Bar Charts: All right, so now we're going to start with the easy stuff, where we're going to build a bar chart in rows. Let's start with the big data source and let's take the subcategory to the rows. Then we need a measure, so let's take the sales and put it on the columns. With that, we got the sales by subcategory. Now in order to make it bigger, I'm just going to go over here, and instead of Standard, let's take Entire View. As you can see, we have bars in the rows. Tableau uses the bar chart as a default here, but in case you get something else, you can go to the Marks over here and, instead of Automatic, switch it to Bar. Let's go and click on that. Nothing is going to change, because it is currently a bar chart already. We usually use bar charts in rows in order to make a ranking. In order to do that, let's go to the sales and sort our data. With that, we've got a very nice ranking in our chart. One more thing that I usually add is coloring. So I take the measure, the Sum of Sales, hold Control, and put it on Color. That's all for the bar charts in rows. Okay, next we have the bar charts in columns. It's very easy and very similar to the rows. I just duplicated the worksheet.
Now here, instead of having the dimension on the rows, we have to move it to the columns. We have to swap the measure and the dimension. It's very simple to do: let's go to the quick menu over here and just swap them. With that, we got the bars on the columns. As you can see, it's very simple. We usually use this for ranking as well, of course. Now the question is when to use columns and when to use rows. If you have a dimension with low cardinality, like the subcategory here, you can go and use columns. But if your dimension has a high cardinality, a lot of values, you can go and use the rows in order to have a long list that you can scroll down. It's always better to scroll down than to scroll to the right side. So if you have a lot of values inside your dimension, go with the row bars. But if you have a low number of values inside your dimension, go with the column bars. All right, moving on to another bar chart: the side-by-side bars. In the previous bar charts, we used only one dimension. This time we're going to go and use two dimensions. Let's go and build it. First I would like to get the dimension country to the columns. And then let's go and get our measure, the sales, to the rows. With that, we got the normal bar chart. But now, if you go and add another dimension to the columns, you will get side-by-side bar charts. The second dimension is going to be the years of the order date. Drag and drop the order date to the columns. As you can see, Tableau converted it to line charts. We don't want that, we want bar charts. That's why we go to the Marks over here, and instead of Automatic, we're going to switch it to Bar. Again, I would like to make it Entire View here. Now we have a lot of data inside the view. We have five years of data. I would like to have only two values: I would like to compare the last two years. Let's drag the years to the filters. Then I'm going to filter using the years.
Select Years, click Next, and let's keep only the last two years. Click OK. The last thing that I would like to add is the coloring. Since we have two years, I would like to have a color for each year. Let's take the years, hold Control, and put them on Color, and that's it. We now have a really nice separation between the values. As you can see, we've got side-by-side bars, and they are really useful for comparing multiple values in each category. With that, we can really easily compare the last two years in each country. In this type of chart, try not to have a lot of data, otherwise it's going to be really hard to compare. You can see we just have a filter on the data in order to compare only the last two years. That's it for the side-by-side bar charts. All right, moving on to the next one: the bar chart over time. It's a very famous one. You can find it in almost every dashboard. So let's see how we're going to build it. We're going to go to the order date and put it on the columns. As usual, we're going to have the years. Let's go and get our measure, the sales, and put it on the rows. Here, as a default, Tableau is going to show it as a line. Let's go and switch it to Bar, since we are working on bar charts. With that, we got the sales over the years very nicely. But we usually add more details, because this data is very aggregated. Let's go and add another dimension. In order to do that, let's just drill down on the years. Click on this sign, and with that we got the second dimension, the quarter. Here we can see more details about how the sales are changing over time. The main use case of this bar chart is to show how the data is changing over time, to show trends. If you have such a requirement, go with the bar chart over time. Okay, moving on to the next one: the stacked bar chart. The requirement for this one is similar to the side-by-side bars: we use two different dimensions. Now let's go and build it.
I would like to see the total sales of each month for this year. In order to do that, let's take the order date to the columns, and let's take the sales to the rows. Now I'm going to go and switch the years to months. Right-click on it, and let's select the format Month. With that, we got those bars that represent the total sales for each month of this year. But now I'd like to add more information to this view in order to compare the categories as well. So let's go and get the category. There is always the question of where to place it. If you put it on the columns, you will get side-by-side bars. We don't want that, we want to get a stacked chart. In order to do that, let's take the category and just put it on Color. Let's go and do that. With that, we got this information, this dimension, as a color inside each bar. And with that, we have the stacked bar chart. As you can see, the main purpose of the stacked bar chart is first to show the total sales over time. We can compare the months and how the sales are developing over time. Then the second task, which is not the main task, is to go and compare the categories, to see how each category contributes to the total sales of each month. That's all for the stacked bar chart. All right, now we have a chart very similar to the previous one: the full stacked bar chart, or as we sometimes call it, the 100% stacked bar chart. I just duplicated the previous one, and as you can see, in the normal stacked bar chart each bar starts and ends differently from month to month. The total sales is not really important in this chart. What is important now is to compare the categories over time. A very nice way to do that is to have full stacked bars. That means each bar in our visualization has exactly the same length, and it goes from 0% to 100%. In order to do that, let's go to the Sum of Sales and right-click on it.
And then let's go to Quick Table Calculation and pick Percent of Total. With that, we got the percent of total instead of the total sales as a value. But we're still not there, because those bars don't have the same length. To fix that, let's go back to the Sum of Sales, right-click on it, and go to Edit Table Calculation. Let's go inside. What we're going to do over here is, instead of having Table (across), use Specific Dimensions. Let's go and switch to that. We're going to select only the category. Since we are focusing only on the category, let's remove Month of Order Date. As you can see, we immediately get a full stack. Let's go and close this. Now as you can see, all those bars have exactly the same length, and they all start at 0% and end at 100%. We call this type of chart a part-to-whole chart. That means I would like to see and understand how each category relates to the whole sales of each month. Now let's quickly summarize when to use which chart. If you want to focus on comparing the categories over time, then go with the full 100% stacked bar chart. But if it's more important to show the total of each month and then compare the categories, go with the normal stacked bar chart. All right, moving on to the last type of bars: the small multiple bar charts, many bar charts inside one visualization. We can build that by adding more than two dimensions. Let's start with the first dimension. We're going to go to the country in the Data pane and put it on the columns. With that, we got the values of the country as columns. Now I would like to add rows from the category. Let's get the second dimension, the category, to the rows. Now I would like to fill this with information in order to see some data. Let's go and get our measure, the sales, and drag and drop it onto the rows over here. As you can see, our bars are still not really small.
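Looking back at the full stacked bar chart for a moment: computing Percent of Total along the category only means that each month is divided by its own total, so the shares within every month add up to 100%. Here is a minimal sketch of that idea in plain Python. The category names, the numbers, and the function name `percent_of_total_by_month` are made up for illustration.

```python
# Hypothetical sales per (month, category); names and numbers are made up.
sales = {
    ("Jan", "Furniture"): 20.0,
    ("Jan", "Technology"): 30.0,
    ("Jan", "Office Supplies"): 50.0,
    ("Feb", "Furniture"): 10.0,
    ("Feb", "Technology"): 10.0,
    ("Feb", "Office Supplies"): 20.0,
}


def percent_of_total_by_month(sales):
    """Divide each (month, category) value by the total of its own month."""
    month_totals = {}
    for (month, _category), value in sales.items():
        month_totals[month] = month_totals.get(month, 0.0) + value
    return {
        (month, category): value / month_totals[month]
        for (month, category), value in sales.items()
    }


shares = percent_of_total_by_month(sales)
for key, share in shares.items():
    print(key, f"{share:.0%}")
```

Dividing by the grand total of the whole table instead would give bars of different lengths, which matches what the view showed before the table calculation was edited.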
We have big bars inside our view, and we can always check how many marks, or how many bars, we have inside our view. By checking this information over here, we can see that we have 12 marks. Now let's go and get our third dimension. It's going to be the order date. Let's get the order date to the columns. Now we went from 12 to 60 marks, or 60 data points. Tableau switched it to lines, and I would like to bring it back to bars. Let's go to the Marks and switch it to bars. But still, our bars are not really mini or small. In order to go into more detail inside our view, instead of using the years, we're going to go with the month. Let's change the format: right-click on it and choose this format, the continuous one, the month. So now if you check again, we went from 60 to 707 marks, mini bars inside our view. I would like to add some coloring to it as well. Let's go and get the country to the colors. So that's it. With that, we got small multiple bar charts. As you can see, as you are adding more dimensions to the view, you are splitting the measure into more and more detail. 157. Tableau | Bar-in-Bar Chart: Okay, next we have the bar-in-bar chart. Previously we compared two dimensions inside our view, but now how about comparing two measures in our view using bars? Let's see how we can do that. As usual, we're going to take our subcategory to the rows, and then let's take the first measure. It's going to be the sales, to the columns. With that, we got our standard bar chart. Let's go and sort it by the sales. Now we need our second measure. Let's take the quantity and put it on the columns as well. With that, we got an individual axis for each measure, and we can go and compare the data. But if you have two measures and you want to compare them, it's way better to use the dual axis, as we learned in the previous material. Let's go and use the dual axis. We're going to go to the quantity, right-click on it, and select Dual Axis.
Now here, Tableau decided to go with another visualization, since we have Automatic. Instead of that, I would like to switch it back to bars. As you know, with the dual axis we get different tabs inside our marks. Since both of them are going to be bars, we're going to go to All and then, instead of Automatic, select the bars. But as you can see, we are not there yet. It looks like a stacked bar, but actually it's not stacked. In order to change that, we're going to go to each individual measure and change the setup. But first, I would like to change the coloring. I don't like the current colors, so let's go to the quantity and make it orange. The sales is going to be blue. Let's hit okay. Now, in order to have bar in bar, we're going to change the size of the quantity. Let's go to the quantity over here, go to the size, and just make it a little bit smaller. So now we can see the big blue bar in the background, and in the front we have this small orange bar. With that we got something like a bar-in-bar chart, which is really great for comparing two measures using the dual axis. For example, if you check the subcategory Art, you can see the quantity is really huge, but we are generating very few sales compared, for example, to the Copiers. There we have less quantity ordered, but we have huge sales. So it's a really nice way to compare measures. 158. Tableau | Barcode Chart: All right, the next one is a fun one, where we're going to create barcode charts. We usually use them in order to show more details inside each bar. So let's see how we can do that. As usual, we're going to get the same information, subcategories to the rows and sales to the columns. I think you already got it. Let's go and sort it. Now what I would like to bring is a dimension with high cardinality, like the product name. Let's go and bring it, for example, to the rows over here.
As you can see, Tableau is warning us and telling us there are a lot of members inside the product name. If you go and say okay, add all members, what's going to happen? The view is going to be broken, and it's not really informative. Instead of that, we can take the product name and put it on the details. So let's go and do that. With that we have built something like barcodes, where we have the product information inside each bar, which is sometimes useful to show all those details in one view. So that's how you build barcode charts. 159. Tableau | Line Charts: All right, so now we can start talking about the line charts in Tableau. They are very basic and very standard for showing the change over time. Now let's go and build a very simple line chart in Tableau. Since we are saying change over time, that means we need a date. Let's go and get the order date to the columns. And then on the rows, we need our measure, Sum of Sales. By default, as usual, Tableau is going to show the years. But instead of that, in order to make it more interesting, we're going to switch it to months. Let's change the format to month, continuous, so click on that. With that, we got our line chart. If for some reason at your end you are not getting a line chart, go to the marks and then, instead of Automatic, choose the line. Once you do that, you will get a line chart exactly like mine. This is the most basic line chart in Tableau that shows the changes over time. Okay, next I would like to show you the different visuals that we can add to our line. For that, let's get more measures into our view. Currently we have the sum of sales. Let's get the discount and the profit, and let's take the unit price and the orders as well.
Now as you know, since we have five measures in our view, we get five tabs in the marks as well, in order to set up the visuals individually. For the sum of sales, we're going to leave it as it is, as a standard line chart. But for the next one, we're going to change the path, or the visual, of the line. If you go over here to the path and click on it, we get different types of lines. The first one is the standard one, the linear, but the second one is a step. Let's go and select that. Now if you check the discount over here, we don't have a linear chart like the sales. We have steps: it jumps up, then it steps down. All right, so let's move next to the profit over here. Let's switch the tab to the profit. Now we're going to go again to the path. And here we have two sections, the line type and the line pattern. In the line pattern, we have the solid line, or we can make a dashed line. Let's go and select the dashed line. And as you can see now in the visuals, we have a very nice dashed line. So this is one more way to present lines in Tableau. Let's move to the next measure, the unit price. Let's switch there. Now what we can do over here: for each point that we have in the chart, we can make a marker, like a small circle. In order to add the markers, we're going to go to the colors over here, and then here we have the effects. The first one is automatic, the second one is to have markers, and the last one is to have no markers. Let's go and switch it to markers. With that, you can see the line chart of the unit price has small circles, small data points. This is one more visual effect on lines in Tableau. Let's move to the last one, the count of the orders. Let's switch there. Now what we can do: we can change the size of the line depending on the values. In order to do that, let's take the count of orders.
So hold Control, drag and drop it, and put it on the size. Now if you check the last line, we're going to see a really nice effect. If the values are small, we have a thin line. But if the values are high, we get a heavy line, which looks really nice. All right guys, as you can see, Tableau is very rich in visualizations, and with a few clicks we can change the visual representation of the lines. All right, now we're going to build the multiple line chart in Tableau. I'm always duplicating the sheets in order not to build everything from scratch each time. Previously, in the standard line chart, we could see the changes over time, but sometimes we want to add more information. We want to compare the values of one dimension inside this view, and we can do that by having multiple lines. Let's say that I would like to compare the values inside the category. Let's go to the category in our data pane, and drag and drop it on the colors. As you can see, by doing that Tableau is going to plot three lines, one for each value inside this dimension. With that, we got multiple lines inside one view. Now we can see that it's not really informative, because we have a lot of lines and a lot of zigzags. In order to reduce that, we're going to switch the format to, let's say, the quarter. Now it's a little bit cleaner, you can see how the data is changing over time, and you can compare the values of one dimension. The number of lines really depends on the values inside this dimension. One more thing about how to create those three lines: you don't always have to have the dimension on the colors. If you move the category from the colors and put it on the details, you're going to get the same effect, where Tableau creates one line for each value, but this time without colors. This is another method for creating different lines in Tableau.
But I think it makes more sense to have it on the colors, to have a separate color for each line. This is how we can create multiple lines in Tableau using a dimension. All right, the next one: we can have dual line charts. This time we're going to compare two different measures in one view, so we're going to create one line for each measure. I'm going to stick with the same view, where we have the sum of sales and the quarter of the order date. Now we'd like to compare, in this view, two measures, the sum of sales and the profit. Let's take the profit and put it side by side with the sales. With that, we've got two different lines, one for each measure, but I would like to have them on top of each other. In order to do that, we're going to use the dual axis. Let's go to the profit, right-click on it, and here we have the option of Dual Axis. So as you can see, it's very simple. We've got a dual line chart, and here you can add more stuff. For example, you can synchronize those two axes by going to the profit, right-clicking on it, and selecting Synchronize Axis. Or of course we can set up each line differently. So let's go to the profit over here, go to the path, and make it a dashed line. As we learned previously, using the dual axis we get the freedom of changing the visual of each measure individually. And this is a really great way to compare two measures. Okay, moving on to the next one, we have the cumulative line chart. Currently, in the standard line chart, we are using the month and the sum of sales, and we can see the total sales for each month. But sometimes we would like to understand how things are developing or growing with time. If we want to see the growth over time, we have to use a cumulative line chart. In order to do that, we're going to go to the Sum of Sales.
And instead of having the Sum of Sales as a plain aggregate function, we're going to create a quick table calculation to have the running total. Let's go and switch to that. As you can see, we get a very nice cumulative line chart, where you can see how things are growing over time. But of course, to make things more interesting, we're going to add more information to our view. Let's go and get the category to generate different lines. We can drop it on the colors, and now we can see how the different categories are growing over time. What we can add as well to the cumulative line chart is a label at the ending point of each line. In order to do that, we're going to go to the Marks, click on the labels, and select Show mark labels. But as you can see, we have one label for each month. We don't want that; we want only the end of each line. In order to do that, we're going to switch it from All to Line Ends. Now if you check our lines, you can see we have this information at the start and at the end. But the starting point is not really interesting, so we can go and disable Label start of line. With that, we have the total sales of each category at the end of the line, and we can analyze the growth over time for each category. Okay, so now we're going to create small multiple line charts. As we've done for the bar charts, we're going to do it now for the lines. We're going to bring at least three dimensions to the view in order to break down the sales into smaller lines. Let's go and do that. We're going to get, as usual, the order date to our view. Let's get the sum of sales to the rows. And then we can get another dimension, the category, to the rows as well. As you can see, as we are adding more dimensions, we are splitting the lines. Let's go and get the country and put it on the columns as well.
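As a side note on the cumulative line chart: the running total is nothing more than adding each month's sales to everything that came before it, separately for each category, and the line-end label is simply the last running value. Here is a small Python sketch of that idea, outside of Tableau. The numbers and the names `monthly_sales` and `running_totals` are invented for illustration.

```python
from itertools import accumulate

# Hypothetical monthly sales per category, already ordered by month.
monthly_sales = {
    "Furniture": [100.0, 150.0, 50.0],
    "Technology": [200.0, 0.0, 300.0],
}


def running_totals(series_by_category):
    """Cumulative sum of each category's series, one line per category."""
    return {
        category: list(accumulate(values))
        for category, values in series_by_category.items()
    }


cumulative = running_totals(monthly_sales)

# The label at the end of each line is the last running total,
# which equals the total sales of that category.
line_end_labels = {category: totals[-1] for category, totals in cumulative.items()}

print(cumulative)
print(line_end_labels)
```

This also shows why the starting label is not very interesting: the first running value is just the first month's sales.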
So now we've got more charts, but Tableau is going to show them as bars since we have Automatic. Let's go and switch it to lines. Now we have a discrete line. Instead of that, let's get a continuous line. In order to do that, let's go to the date and switch it to something like the month as continuous. Let's change the format. With that, as you can see, we get very interesting multiple line charts. I would like to add colors as well. Let's go and get the country, for example, and add it to the colors. Now, just to enhance the visual, let's remove the grid. Right-click over here, and then let's go to Format. Then we can go over here to the lines, and then we have the Rows tab. Let's go to the grid lines and set them to None. With that we have removed those grid lines, which are really annoying when there are a lot of them. Then the last thing that we can do: we can show the total sales of the last point. In order to do that, let's take the sum of sales, hold Control, and put it on the labels. Then we're going to go to the labels over here and select Min/Max. We're going to have it by the order date, so let's switch from Automatic to the month. And let's have only the maximum value; let's remove the minimum value. With that we've got, for each chart, the total sales of the last month. So we have created very nice small multiple line charts in Tableau. 160. Tableau | Highlighted Line Charts: All right, moving on to the next one, we have the highlighted line charts in Tableau. This is especially important if you have multiple lines in one single view, and there are different methods for doing it. I'm going to show a quick one and a professional one. Let's start with the quick one. Let's have multiple lines in our chart. I'm going to take, this time, the country, and put it on the colors. With that we got one line for each value inside the country dimension.
And now I would like to give the users the ability to highlight one of those values. It's very simple. Go to the country over here and right-click on it. Here we have the option Show Highlighter. Click on that. With that, if you check the right side, we get a small box. In order to highlight the values inside the country, the users can go over here and select one of those values, for example Germany. As you can see, Tableau is going to highlight the line of Germany and blur all the other lines. This is a really nice way to highlight different values in Tableau in order to focus on one value, especially if we have a lot of lines. That's it. This is how you can quickly create a highlighted line chart in Tableau. All right, so now we're going to talk about the second method for creating highlighted line charts, but this time professionally. I just duplicated the old line chart, where we have the quarter, the sum of sales, and the country on the colors. But this time we're going to get rid of this highlighter, so I'm just going to remove it. Now we have to give the users a list of all countries to select from, and the selected country is going to be highlighted in the view. In order to do that, we're going to create a parameter. Let's go to the data pane, right-click over here, then Create Parameter. Here we're going to give it a name, Select Country. Since the country values are strings, the data type is going to be a string as well. Next we're going to create a list of all countries that we have inside the dimension. Here we have four values. We have France. Be careful to use the exact case: the first letter capitalized and the rest small. We have Germany, Italy, and the last one is USA. That's it for our parameter.
Let's go and hit okay. With that we've got our new parameter on the left side. Right-click on it and select Show Parameter in order to see it here on the right side. Now the users can go over here and select one of those countries, but as you can see, nothing is changing in the view, because we haven't connected it to our view yet. In order to connect it, we have to create a new calculated field. Let's go to the data pane again and create a calculated field. Let's call it Highlighted Country. Here we can have a very simple condition where we say the country equals our parameter, Select Country, so [Country] = [Select Country]. What we are saying is that if the selected country from the parameter equals the value of the country, then we get true; otherwise it's going to be false. For example, currently we have the value France selected in the parameter. That means the country France is going to be true, and all other countries are going to be false. Let's go and hit okay. So now we're going to work on highlighting the selected country. Let's start with the coloring. Currently we have the coloring on the country. I'm going to move it to the details. That means the countries are now just creating the lines and are not responsible for the coloring of the lines. In order to bring in the coloring, we're going to get our new calculated field, the Highlighted Country, and put it on the colors. Now we can see that we have only two colors, because we have false and true. If it's true, it's orange; if it's false, it's blue. But I would like to change those colors to get the highlight effect. Let's go to the colors. False is going to be gray, and true is going to be, let's say, blue. Let's hit okay. Now we get a highlight effect. All other lines are gray, and only the one that we select is blue. But now let's go and test our parameter.
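To make the logic of the Highlighted Country field a bit more tangible, here is a tiny Python sketch of the same comparison, outside of Tableau. It is only an illustration: the function name `highlighted_country` and the color names are assumptions, and the comparison is an exact, case-sensitive string match, which is why the parameter values need the exact same spelling as the country values.

```python
def highlighted_country(country: str, selected_country: str) -> bool:
    """True only for the country that matches the parameter value exactly."""
    return country == selected_country


countries = ["France", "Germany", "Italy", "USA"]
selected = "France"  # plays the role of the Select Country parameter

# True gets the highlight color, False gets gray, as in the view.
colors = {
    country: "blue" if highlighted_country(country, selected) else "gray"
    for country in countries
}
print(colors)

# A different case does not match, so nothing would be highlighted.
print(highlighted_country("France", "france"))
```

Changing `selected` to another country moves the highlight, which is what the parameter control does for the users.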
We have France selected here currently. Let's select Germany. As you can see, now the highlighted line is Germany. Let's select Italy, and USA. As you can see, our parameter is working. Now here we have a little issue, where the highlighted line is behind the gray lines. I would like to have the highlighted line in the front and the gray lines in the back. We're just going to go to the legend over here. If you don't have it, you can go to Analysis, and there we have the option of the legends; make sure to select the colors. Currently it's selected on my side. So what we're going to do: we're just going to switch those two values. Let's take the true and put it on top, so that we have sorted those two values. As you can see in the chart, the blue color is in the front and the gray color is in the back. Now the next step in order to create this highlight effect in Tableau: we're going to change the size. In order to do that, we're going to use our new calculated field. So take the Highlighted Country and drag and drop it on the size while holding Control. With that, we've got a different size for the highlighted line compared to the others. But here we have the opposite effect, and we don't want that. We want the rest to be thin and the highlighted line to be heavy. In order to do that, let's go to the legend over here and just double-click here. As you can see, the true is thin and the false is heavy. In order to switch it, we're going to go to Reversed. Let's click on that and hit okay. With that, you can see the highlighted line is way heavier than the rest. You can change the size if you don't like it like this. We can reduce the sizing a little bit, and now it looks nicer. All right, so that's all on how to create a highlighted line in Tableau more professionally than the previous one, where you have more control over the sizing and the coloring. The users can go over here and start changing the value.
And with that we are highlighting one line compared to the others. That's it. 161. Tableau | Bump Chart: All right, next we have a fun one, where we're going to build a bump chart using lines in order to do ranking between different values. For example, I would like to rank the countries over time. In order to do that, we're going to have the same view, where we have the quarter and the sales, and we have a line. The first thing that we're going to do is grab the country and put it on the colors in order to create those different lines. Now, since the analysis is about ranking, not the total sales, we're going to go to the sum of sales over here and create a quick table calculation. Here we have the rank function, so let's go and select that. Now we have a ranking that depends on the whole table, on the whole view. I don't want that. I would like to rank between only four values. In order to do that, let's go to the Sum of Sales over here, right-click on it, and select Edit Table Calculation. Let's go inside. Now instead of having Table (across), I'm going to specify a dimension. We would like to have a ranking using only the country, so we're going to have only four values. I'm just going to go as well and deselect the order date. Let's go and close this. Now we have some kind of bump chart effect, but we are not there yet. As you can see, the ranks go from the bottom to the top. I would like to reverse that. In order to do that, right-click on the axis, select Edit Axis, and then check Reversed. That's all. Let's close this. As you can see, now we have the top rank at the top, and at the bottom we have the lowest rank. Now in order to have this bump effect, we have to have lines, which we have already, but we also have to have circles on the data points. There is one easy way.
In order to do that, let's go to the colors and change the markers to circles. Now as you can see, we've got our small circles on each data point, and we get the bump effect. But sometimes we go more advanced in these charts, where we make our own customizations for those circles: we want to make those circles, those data points, a little bit bigger, with the rank inside them. In order to do that, let's first hide those small circles. We don't want them. Let's go to the colors and just have a line without markers. Now in order to get the circles, we have to have the same measure again in our view. Let's take the sum of sales, hold Control, and put it on the right side. With that, we've got two charts, one for each measure. Let's go to the second one, to the Sum of Sales over here. Instead of having lines, let's switch the marks here to circles. As you can see, now we've got those circles very nicely, and we can go and change the size of those circles. All right, that looks nice. The next step is to put them on top of each other, and we can do that using the dual axis. Let's go to the Sum of Sales on the right side, right-click on it, and select Dual Axis. Now you have those circles very nicely on top of our line. But the positions are not correct yet, because those two axes are not synchronized. Let's go to the right side, right-click on it, and select Synchronize Axis. Now we've got those circles perfectly on our lines. I would like to hide the right axis: right-click on it, and let's hide the header. The next step: we can add numbers on those circles. I'm going to stick with the second measure, the circles. Let's go to the labels and show the labels. Next, I would like to put those numbers inside the circles. Go to the alignment over here, then the vertical, and set it to the center. With that we got those numbers inside the circles.
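As a side note on the rank calculation behind the bump chart: ranking "using only the country" means that inside each quarter the countries are ordered by their sales, and the biggest one gets rank 1. Here is a simplified Python sketch of that idea, outside of Tableau. The sales numbers and the function name `rank_within_quarter` are made up, and the sketch assumes there are no ties, so it does not try to reproduce how Tableau handles equal values.

```python
# Hypothetical sales per quarter and country; the numbers are invented.
sales_by_quarter = {
    "2022 Q1": {"France": 500.0, "Germany": 400.0, "Italy": 100.0, "USA": 300.0},
    "2022 Q2": {"France": 350.0, "Germany": 450.0, "Italy": 120.0, "USA": 300.0},
}


def rank_within_quarter(sales_by_quarter):
    """Rank the countries inside each quarter, highest sales first (rank 1)."""
    ranks = {}
    for quarter, sales_by_country in sales_by_quarter.items():
        # Sort the country names by their sales, descending.
        ordered = sorted(sales_by_country, key=sales_by_country.get, reverse=True)
        ranks[quarter] = {
            country: position for position, country in enumerate(ordered, start=1)
        }
    return ranks


ranks = rank_within_quarter(sales_by_quarter)
for quarter, quarter_ranks in ranks.items():
    print(quarter, quarter_ranks)
```

In this made-up data France drops from rank 1 to rank 2 between the two quarters, which is exactly the kind of movement the bump chart makes visible. Reversing the axis only changes where rank 1 is drawn, not the ranks themselves.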
And we can change the coloring and the fonts over here as well. Let's make it white. Next, I would like to change the sizing of those circles again. So let's make them a little bit bigger until it looks nice. All right, that's enough. With that, we got a really professional bump chart, and we are controlling the size of those circles. Now we can very nicely check the ranks of those countries. As you can see, France was rank number one at the first data point, then it dropped to two, then three, then back to one. We can see the development of the sales between countries, and we can see very nicely that Italy always has the lowest rank in sales in our business. All right, so this is how we can create a bump chart in Tableau. 162. Tableau | Sparkline Chart: All right, so now we're going to learn how to create a sparkline chart in Tableau. Sparkline charts are really compact visuals for showing the trend, the changes over time, and you're going to find them in a lot of dashboards in order to show KPIs. Now let's see how we can create one. It's really simple. We're going to take a dimension like the country and put it on the rows, just to split those lines into a smaller size. In sparklines, it's very important to have the information of the sales at the start and at the end of each line. Let's go and do that. Let's take the sum of sales and drag and drop it to the labels over here, holding Control. Now we have the sales on each quarter, at each data point. We don't want that. Let's go to the labels over here, go to Min/Max, and select that. Now we can see that we have two values for each line, the minimum and the maximum. But here it relies on the sum of sales. Instead of that, I would like the min and max to depend on the value of the order date. Let's go and switch that. We can go to the field over here, and instead of Automatic, let's select the quarter.
As you can see, with that we got exactly our sparklines. We have the starting value and the end value of each line. But usually sparklines are really compact visuals; they are really small lines. In order to change that, let's switch from Entire View to Standard. Now we're going to go very carefully to the end of our axis until our mouse turns into the resize cursor, and then let's reduce the width completely. With that we've got our compact lines. I would like to remove those lines in our chart as well, so right-click on it over here and go to Format. Then on the left side we're going to go to the lines. We are on the rows, and I would like to remove those row lines. So make sure to select the Rows tab, and to remove those grid lines, we can go over here and select None. With that we got really clean sparklines without any grid. We can hide the axis with the sales information as well. Let's right-click on it and disable Show Header. That's it. Now I'm happy with that. We got a very nice sparkline chart in Tableau. As you can see, they are compact visuals for quickly identifying trends, which we usually use inside KPIs. 163. Tableau | Barbell Chart: All right, so now we're going to go more advanced in building visualizations in Tableau. We're going to learn how to create barbell charts in Tableau. Barbell charts are really amazing for comparing two data points and finding the differences between them. It's like before and after, and it works perfectly if you have categories. Now we would like to compare two years, 2021 and 2022, by the categories. Let's start by taking the subcategory rather than the category, in order to have more values. Next we need two measures, the first one for the year 2021 and the second for 2022. In order to do that, we have to create a new calculated field. Let's go to the data pane again, click over here, Create Calculated Field. I'm going to call the first one Sales 2021.
And the formula is going to be very easy. We're going to use the IF condition: if the order date, but now we are talking about the year of the order date, so let's wrap it in YEAR, if the year of the order date equals 2021. Now what happens if the condition is correct? We're going to show the sales, so THEN sales, and otherwise it's going to be null. That's it, let's go and END it. So the calculation reads IF YEAR([Order Date]) = 2021 THEN [Sales] END. In this calculated field, we get the sales only if the year is 2021. Let's go and copy it, because we need it for the next one. Then hit okay. With that, we got a new calculated measure in the data pane for the sales of 2021. Let's create the one for the next year; it's going to be Sales 2022. Paste the same calculation, but now we're going to say: if the year is 2022, then show the sales. So that's it, let's hit okay. With that, we got our second measure, for the sales of 2022. Now we want to compare both of those sales in our view. Let's take the Sales 2021 to our columns. In the barbell chart, we're going to have two circles and a line between them in order to show the difference. First, let's start with the circles. Instead of having bars, we're going to go to the marks over here and change it to circle. With that, we've got the first circle in our view, for the year 2021. What is missing now is the second circle. In order to get it, we're going to take our Sales 2022 and move it to the axis in order to generate the Measure Values and Measure Names. Just drag and drop it over here. With that, we've got our second point. The first one, the blue one, is for 2021, and the second one is for 2022. All right, with that we have built the first part of the barbell chart, where we have the starting point and the end point. Now in order to show the difference, or the distance, between those two values, we have to have a line between them. That means we need another type of chart inside our view.
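To see what the two calculated fields are doing, here is a small Python sketch of the same idea, outside of Tableau: a row contributes its sales only if the order year matches, otherwise it contributes nothing (null), and then we sum per subcategory. The order rows and the helper names `sales_for_year` and `totals_by_subcategory` are invented for illustration.

```python
from datetime import date

# Hypothetical (order date, subcategory, sales) rows; numbers are made up.
orders = [
    (date(2021, 3, 1), "Phones", 100.0),
    (date(2022, 5, 1), "Phones", 250.0),
    (date(2021, 7, 9), "Envelopes", 40.0),
    (date(2022, 1, 2), "Envelopes", 40.0),
]


def sales_for_year(order_date, sales, year):
    """Mirror of the IF ... THEN ... END logic: sales if the year matches, else None."""
    return sales if order_date.year == year else None


def totals_by_subcategory(orders, year):
    """Sum the year-filtered sales per subcategory, skipping the None values."""
    totals = {}
    for order_date, subcategory, sales in orders:
        value = sales_for_year(order_date, sales, year)
        if value is not None:
            totals[subcategory] = totals.get(subcategory, 0.0) + value
    return totals


sales_2021 = totals_by_subcategory(orders, 2021)
sales_2022 = totals_by_subcategory(orders, 2022)

# The length of each barbell is the gap between the two circles.
# This assumes every subcategory has sales in both years.
gaps = {sub: sales_2022[sub] - sales_2021[sub] for sub in sales_2021}
print(sales_2021, sales_2022, gaps)
```

In this made-up data the Envelopes barbell has no length at all, while the Phones barbell is long, which is the kind of difference the line between the two circles is meant to show.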
In order to do that, we're going to duplicate the Measure Values. Hold Control, drag and drop it, and just put it beside the first one. Now we have the same data on the left and on the right. On the right, we're going to have a different visual: instead of circles, we're going to have a line. Let's go to the tabs over here on the marks, to the second one, and change the visual from circle to line. With that, we got our lines, but we are not there yet. I would like to have a line spanning the distance between the two values. In order to do that, we're going to take our Measure Names from the colors and drag and drop it on the path. With that, we got exactly what we want. We now have a line between the two points. All right, so now the final step: we're going to merge those two charts into one. In order to do that, as we learned, we're going to use the dual axis. Let's go to the Measure Values over here on the right side, right-click on it, and select Dual Axis. Now we got a perfect line to show the distance, the difference between the starting point and the end point. But we still have small issues in the visual. I would like to make those circles a little bit bigger. So let's switch to the circles, go to the size over here, and make it a little bit bigger. All right, that's enough. Now, as you can see, the line is on top of the circles, which is not really correct. In order to put it behind, we have to switch the order of the two axes. So let's take the right one and put it on the left. All right, with that we've got a perfect barbell chart in Tableau. We can analyze the differences between two data points, between the sales of 2021 and 2022, and we have this very nice line to indicate the distance between them. You can see, for example, in the Envelopes, there is no change in the sales between those two years.
But if you go to the phones over here, you can see a huge change in sales between those two years, and the visual really makes that information clear. So that's it; this is how and why we create barbell charts in Tableau. 164. Tableau | Rounded Bar Chart: All right, so now we're going to build rounded bar charts. Previously we learned how to build standard bar charts, but now we're going to go more advanced and build rounded bar charts, and we will use lines to do that. I know it sounds a little bit strange, but let's go and build it. First, as usual, we get the subcategories into our view, and I'm going to switch to Entire View in order to see the whole view over here. Then let's get the sum of sales to the columns. So far, as you can see, this is a very nice standard bar chart. Now, instead of those classical bars, we're going to have bars that are rounded at the start and at the end. How are we going to do that? We're going to add a dummy value, the average of zero. Next, we merge those two measures on one single axis. In order to do that, let's drag the average and put it on top of the sales axis in order to generate Measure Values and Measure Names. Now we convert the bar chart to a line chart: let's go to the Marks card and switch to Line. Then we take Measure Names and put it on Path, so now we are almost there. All that's left is to increase the size of those lines, so let's make them bigger. With that, as you can see, we've got a rounded bar chart in Tableau. We also get a very nice color effect if we take Measure Values, hold Control, and drag and drop it onto Color. With that, we've got a really nice rounded bar chart in Tableau.
Well, if you ask about the use case, it's exactly the same as for standard bar charts. For example, here we can make a ranking list of the subcategories; we've just changed the look of it. So that's how you can build a rounded bar chart in Tableau. 165. Tableau | Slope Chart: All right guys, so now we're going to learn how to build slope charts in Tableau. Slope charts are perfect for showing how the ranking changes over time for different categories. So let's see how we can do that. Since it's ranking over time, we need the order date, so let's bring the order date to our view. Then, as usual, we get our measure, the sales, to the rows. We want to compare the last two years. In order to do that, let's filter the data: show the filter for the years and select the last two years. Now we have to decide which categories we want to compare. We could go with the product categories, or we could go with the countries. Let's pick the country and put it on Detail. Next, I'm just going to make the view a little bit bigger in order to compare those two years. The next step is to show the country names, so let's hold Control on the country and drop it on Label. Now we can see the country name at the end of each line, but I would like to have it at the start as well in order to get the slope chart. So let's go to Label. What we have to do is put the labels at the line ends, so instead of All, let's switch it to Line Ends, and let's close it. Now we can see that each line starts with the country name and ends with the country name as well. The last step is to add a small circle to each line. In order to do that, as we learned before, we go to Color and turn on the markers, so now we have a small circle at the start and at the end of each line.
And this is the easiest way to build a slope chart in Tableau. Again, the use case of the slope chart is that we can see how the ranks change over time. In 2021, you can see France was first, then USA, then Germany, and the last was Italy. And now we can see the change over time: in 2022, Germany went from place number three to place number one, France moved to number two, and USA moved to number three. And as you can see, for Italy nothing changed. So this is the power of the slope chart: seeing how rankings change over time. Of course, in Tableau we can go more advanced and add more complicated stuff in order to get more customization. For example, you might say, you know what, I would like to have bigger circles. In order to do that, we need two charts, one for the lines and one for the circles. Let me show you how we can do that. Let's take the sum of sales, hold Control, and duplicate it. The first one is going to be the lines and the second one is going to be the circles. For the second measure, instead of Automatic, we select Circle. It's too big for our visual, so let's go to Size and reduce it in order to have smaller circles. A little bit more, and that's it. Now we're going to bring those two charts into one, so let's merge them using the dual axis. I'm going to go to the second one over here, right-click on it, and choose Dual Axis. If you look closely, those axes are not 100% synchronized, so we right-click over here and choose Synchronize Axis. Now we've got the circles exactly in the place where we need them. Since we have two axes that show the same information, I'm going to hide one of them, so let's disable Show Header. Now you've got full customization of the chart. You can say, you know what, for the lines, I would like to have another color.
For example, let's use a gray color. Or you might say, let's make it a dashed line, so we go to Path over here and switch it to the dashed line. With that, we get full customization of our chart. But usually, for slope charts, we have a solid line in between. This is how we can create a slope chart in Tableau. 166. Tableau | Bar & Line Charts: Okay, so now we're going to learn how to combine different types of charts in one single view. Here we're going to mix bars with lines. There are different methods for doing that, depending on the use case. The first one is using the average line. First, let's build a standard bar chart over time. In order to do that, let's get the order date to the columns and the sales to the rows. Then let's switch the years to continuous months. Now, instead of the line, we're going to switch to a bar chart, so let's go to the Marks card and switch it to Bar. Great. With that, we've got our bar chart. The second step is to add a line, and this line is going to be the average line. In Tableau, that's very simple. Let's go to the Analytics pane, where we have the Average Line option. Let's drop it onto our view for the whole table. And that's it. As you can see, it's very easy. With that, we've got a nice average line combined with the bar chart. All right, moving on to the next method: we're going to combine the bars and lines using the dual axis, and here we're going to compare two different measures. This time, for a change, we're going to compare the number of orders with the number of customers. Let's get the order date in order to see the changes over time. Next, we get the count of the orders to the rows. Now let's change the format of the order date to months and change the chart to bars as well. With that, we've got our first chart, the bar chart.
Let's get our second measure, which we're going to show as a line. In order to do that, let's take the count of the customers and put it on the rows. With that, we've split our view into two charts. Let's change the second one to a line: we go to the Marks card, switch to the second measure, and instead of bars, we switch to Line. Now we have our two charts, the bar chart and the line chart. As usual, we want to merge them together in one single view, and to do that we use the dual axis. Let's go to the customers, right-click on it, and choose Dual Axis. With that, as you can see, we have a bar chart together with a line chart. Of course, with the dual axis we could go to the right side and synchronize those two axes, but here it makes no sense. Now we can add more customization. For example, for the line, we can add markers: let's go to Color over here and turn on the markers. So now we can start comparing the number of orders with the number of customers in one single view using two different chart types. 167. Tableau | Bullet Chart: Okay, so now we're going to build bullet charts in Tableau. Here we're again going to combine bars with lines. Bullet charts are really important for comparing the current value with a target, or the current year with the previous year. Let's get, as usual, our subcategory to the rows. Now I would like to compare the current year with the previous year, so let's take the sales of 2022 from our data pane to the columns. Let's sort by the axis, so we have a ranking, and then we're going to compare it to the sales of 2021. What we do is take the sales of 2021 to Detail, and then we add a reference line. So let's go to the axis of the sales of 2022, right-click on it, and add a reference line.
Let's move the dialog a little bit to the right so we can also see those reference lines. For the value, instead of the sum of sales 2022, we're going to use the sales of 2021, so let's click that. Now we've got one line for the average. We don't want that; we want the total sales for each subcategory. In order to change that, instead of Per Pane, we're going to use Per Cell, so let's switch it. Now we have a line for each bar, which is great, but let's customize it. I don't want to see any labels, so let's go to Label and switch it to None. Then let's format the lines. We go over here and pick, for example, the orange color. Then let's change the opacity to 100% to get a solid color, and let's make the line heavier so it's easier to see. I'm just going to go with the thickest one. That's it; let's close this. As you can see, with that we've very easily got a bullet chart in Tableau, where you can compare the bars of the current year with the lines of the previous year. This is how we can create a very nice bullet chart by combining bars and lines. 168. Tableau | Lollipop Chart: All right, so now we're going to learn how to create a lollipop chart in Tableau. There are two types of these charts, horizontal and vertical. We build this type of chart by combining bars and circles. It's like a stick with a big circle at the end, and we use the circle in order to highlight a data value. Let's go and create that; it's very simple. Let's take the subcategories to the rows. Our measure is going to be the sales, as usual, so let's put it on the columns. With that, we already have our bar chart; if not, go to the Marks card and change it. Let's sort it in order to have a ranking. Since it's a lollipop, we need sticks, so let's make the bars thinner. Let's go to Size over here and reduce the size.
What is missing in the lollipop now is the circle at the end. For that we need another chart, so we take the sum of sales and duplicate it. Hold Control and just drag and drop the sum of sales. With that, we've got our two measures. Next, we change the second one to circles. Let's go to the Marks card, to the second sum of sales, and instead of Automatic, choose Circle. Now we've got those circles very nicely, but they are really small, so let's make them bigger. A little bit smaller. All right, maybe this is fine. What is the next step? We need to merge the two together in one single view, and as usual, we use the dual axis. Let's go to the second sum of sales, right-click on it, and choose Dual Axis. As you can see, things got destroyed. We don't have the bars anymore, and that's because for the first sum of sales we didn't tell Tableau that it is a bar; it was set to Automatic. With Automatic, Tableau makes a guess about the most suitable visual for the current data, and here the guess is wrong. So we go to the first measure and tell Tableau that it's not Automatic; we always want it to be a bar. Let's switch it. As you can see, we already have the shape of the lollipop. There are still a few small things to do, which is not a big deal. We forgot about synchronizing the axis, so let's go to the second one, right-click on it, and synchronize it, just to make sure that everything matches correctly. Now I have two axes with exactly the same information, so I'm just going to hide one of them in order to show it only once. The key thing about the lollipop is that it shows information at the end, on the circle. Here we can put anything, for example any measure: we can have the total sales, or the total number of orders, and so on.
But in this example, I would like to have the name of the subcategory on those circles. How are we going to do that? We go to the circle marks over here and put the subcategory on Label by holding Control and dropping the subcategory on Label. Now, as you can see, we have the header information on the circles as well, so we can hide the headers: right-click and uncheck Show Header. With that, we have removed the headers, and we now have the subcategories on the circles. One more thing we can do is add coloring. Let's take the sum of sales and put it on Color. With that, we have a really nice ranking chart for the subcategories. Okay, now let's quickly look at the second type, the vertical lollipop chart. I just duplicated the previous one. All we have to do is go to the toolbar over here and swap the rows and the columns. All right, so now we have everything vertical, but we have really big circles, so let's change that. Let's go to the second sum of sales and reduce the size over here. We can make the sticks thinner as well: let's go to the first sum of sales, to Size, and reduce it. Now it looks really nice, but we still have a problem with the labels. Let's go to the circles again, go to Label, and change the alignment from Automatic so that the labels sit on top of the circles. Now we have the labels on top of those circles, but we still don't see all of them, because the text is too big. So let's go to the font over here and change it from 10 to 8. One of them is still missing, so you can reduce the size of the circles. That's it. This is how you can create lollipop charts in Tableau. And here you can see the power of Tableau: we can combine different types of charts in one single view, like here, where we are combining circles with bars.
That means we have an endless number of combinations, and this opens the door to innovation in Tableau, where you can create amazing charts and visuals. This is exactly the magic of Tableau. 169. Tableau | Area Charts: All right, so now we're going to talk about area charts in Tableau. They are like line charts: we can use them to see how the data changes over time, but under the line we get a filled area that makes it easier to visualize the numbers. We're going to start with a very basic area chart. Since it shows change over time, we get the order date to our view, and then, as usual, we get the sum of sales to the rows. Instead of year, we switch it to continuous month. Right now we have a line, because the mark type is Automatic. If you go to the Marks card, you can see we have a chart type called Area. Let's switch to it. So this is the most basic area chart you can have in Tableau. Okay, so now you might say, you know what, the basic area chart in Tableau doesn't have a line, and usually an area chart has a line, with a filled area between the line and the axis. The basic area chart in Tableau doesn't have this look. In order to recreate this design, we can create a line on top of our area chart, so here we have two types of charts, the line and the area. Let's go and create that. We take the sum of sales and duplicate it by holding Control. Now we have our two charts. The first one is going to stay an area chart, and the second one is going to be a line chart. Let's go to the second sum of sales and, instead of Area, choose Line. I think you already know the next step: we have to merge those two charts in one single view. How are we going to do that? Using the dual axis. Let's go to the second sum of sales, right-click on it, and choose Dual Axis.
The next step is to go to the area chart and reduce the opacity. Let's go to Color and reduce the opacity. With that, we get a perfect area chart in Tableau, where you have a line and, between the line and the axis, a filled area. It's way better than the basic area chart in Tableau. All right, moving on to the next one: stacked area charts. It's like with the bar charts: we can add more information to our visualization by adding a dimension to Color. We start from the basic area chart, where we have the sum of sales by month over time. Now we add a dimension. Let's take the category and put it on Color. With that, we've got three area charts stacked on top of each other, because this dimension has three values. As for the design, we can go to Color over here and increase the opacity. That's it; this is how we can create a stacked area chart in Tableau. All right, next we're going to build full, 100% stacked charts. If the total of the sales is not important, and what matters is comparing the different categories with each other, we can use the full stacked chart. Let's see how we can do that. We go to the sum of sales and choose Quick Table Calculation, Percent of Total. Let's click on that. We are not there yet. As you can see, we have the percentage on the left side, but we want it to run from 0 to 100. In order to do that, we go to the sum of sales again, right-click on it, and edit the table calculation. We switch it to Specific Dimensions, and the dimension is going to be the category, so let's deselect the month of order date. Let's close it. With that, you can see the range now runs from 0 to 100, and you have it as one block. Now we can very easily compare the three different categories.
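The effect of computing Percent of Total along the category only can be sketched in a few lines of Python. This is an illustration under assumptions, not Tableau's implementation: the rows, the category names, and the function name `percent_of_total_by_month` are made up, and the data is assumed to hold exactly one row per month and category.

```python
from collections import defaultdict

# Hypothetical pre-aggregated rows: (month, category, sum_of_sales).
rows = [
    ("2022-01", "Furniture", 200.0),
    ("2022-01", "Office Supplies", 300.0),
    ("2022-01", "Technology", 500.0),
    ("2022-02", "Furniture", 100.0),
    ("2022-02", "Office Supplies", 100.0),
    ("2022-02", "Technology", 200.0),
]


def percent_of_total_by_month(rows):
    """Divide each category's sales by the total of its own month."""
    month_totals = defaultdict(float)
    for month, _category, sales in rows:
        month_totals[month] += sales
    return {
        (month, category): 100.0 * sales / month_totals[month]
        for month, category, sales in rows
    }


shares = percent_of_total_by_month(rows)
for key, share in shares.items():
    print(key, share)
```

Because the denominator restarts for every month, the shares within each month add up to 100, which is why the stacked areas fill the axis from 0 to 100 as one block.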
Here we can see very clearly how each category relates to the whole, the total sales of each month. This is how we can very easily create a full, or 100%, stacked chart in Tableau. All right, so now we're going to create small multiple area charts by adding multiple dimensions. Let's get the first dimension, the country, to the columns. Let's get the order date to the columns as well. And then to the rows, we get the category. Those are our three dimensions. Then I'm going to switch from Standard to Entire View. Now let's get the numbers into our view: it's going to be the sum of sales, so let's put it on the rows. By default, Tableau shows it as lines, so let's go to the Marks card and switch it to Area. With that, we get our mini area charts in Tableau. Now let's add more detail, since we want to see the months. Let's go to the year over here and change the format to continuous month. Next, we add the coloring, so let's hold Control and drag and drop the country onto Color. In a visualization like this, it makes no sense to show the grid lines. So right-click on the view, go to Format, then to Lines, make sure to select Rows, and set Grid Lines to None. With that, we have created small multiple area charts in Tableau. It's very similar to doing it with lines or bars. 170. Tableau | Scatter Plots: Okay, so now we're going to learn how to create scatter plots in Tableau. Scatter plots are one of the fundamental charts for understanding the relationship between two continuous measures. That means the main task of a scatter plot is to find correlations between two continuous fields. Another task of the scatter plot is to find the outliers in your data. Let's now create a very basic scatter plot in Tableau.
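The correlation that a scatter plot lets you judge by eye can also be expressed as a number. The sketch below computes the Pearson correlation coefficient in plain Python; the per-customer values and the function name `pearson` are made up for illustration, and nothing in the lesson says Tableau computes it this way.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-movement of the two measures around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-customer totals, one entry per customer.
sales = [100.0, 200.0, 300.0, 400.0]
profit = [10.0, 25.0, 28.0, 45.0]

r = pearson(sales, profit)
print(round(r, 3))
```

A value near 1 means the points rise together along a line, a value near -1 means one measure falls as the other rises, and a value near 0 means there is no linear relationship.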
As I said, we need two measures for that, and our two measures are going to be the sales and the profit. Let's get the sales to the columns and the profit to the rows. With that, we've got our two axes, which give us a two-dimensional graph. What is missing now is, of course, our data, the data points. Here we're going to go with the customer ID. Let's take the customer ID and put it on Detail. And here is the power of Tableau compared to other tools: Tableau plots all the data points that we have in our data without any restrictions, so that we can see the correlation between the sales and the profit, and also find the outliers, for example those points that stand alone. All right, with that we have created a very basic scatter plot in Tableau. Now let's add more to the design of the scatter plot: we're going to change the colors and the size, add circles, and so on. First, we're going to change the size of each data point, and it's going to depend on a third measure, the count of orders. Let's take the order count and drag and drop it onto Size. Each customer gets a different size, depending on how many orders this customer placed. This is one thing we can add to our scatter plot. Another thing we can add is coloring. Here we have different options for adding color: either we add a dimension or we create clusters. For example, let's get the dimension country and place it on Color. We can also give the data points different shapes in our visual. Currently we have a circle for everything, so we can take the country and drag and drop it onto Shape. Now we can see in the scatter plot that the countries have not only different colors but different shapes as well. But what we usually see in scatter plots is that each data point is represented as a filled circle.
That means we're going to change the visual. Let's go to the Marks card and change it from Shape to Circle. Now, as you can see, we have everything as filled circles, but we are not there yet. Let's make the size a little bit bigger. Now, what do we have here? We have a lot of points, and what we usually do is reduce the opacity of the colors. Let's go to Color over here and reduce it. With that, you can see things very nicely: for example, those two points are overlapping each other. One more thing we can add to those circles is a border around each circle. In order to do that, we go to Color again, where we have an effect called Border. Instead of Automatic, let's pick something like this gray color. With that, you can see we have a very nice border around each data point. All right, so those are some of the options for customizing scatter plots. 171. Tableau | Dot Plot: Okay, so now we're going to create a dot plot in Tableau. A dot plot is a one-dimensional graph for seeing the distribution of your data across different categories, and each dot represents one data point. Let's look at the sales by order date, with the order ID as a detail, so that we see the distribution of orders by date. Let's take the order date to the rows this time, and let's change it to continuous month. Then we get our measure to the columns. By default, we get a line. Instead of that, we're going to use circles. We are not there yet: we have to add more detail to the view, and we do that by moving the order ID to Detail. Since we have a lot of orders in our dataset, Tableau asks us, do you really want to do that? Well, yes: Add All Members. Now, as you can see, we have a very nice dot plot.
We can add more information. For example, let's take the category and put it on Color as well. Since there is a lot of overlapping, we can go to Color and reduce the opacity. With that, each data point, each circle, represents one order, and you can see very clearly and very quickly which orders have the highest sales. This is how you can create a dot plot in Tableau. 172. Tableau | Circle Timeline: All right, so now we're going to learn how to build a circle, or bubble, timeline. We usually use the circle timeline to analyze changes over time, and to show the values as circles of different sizes across multiple categories. So let's see how we can build that. Since we said it's change over time, we need a date, so let's get the order date to the columns. We need one more dimension, so let's take, for example, the subcategories to the rows. And then we need our measure, which is going to be the sales. But now, instead of dropping it on the columns or the rows, we drop it on Size. Since each data point has a different size, Tableau shows them as squares, so let's switch to circles. Now, in order to have more data points in our view, we're going to change the years to something finer. Let's take, for example, the quarter as continuous, and click on that. Now I'm going to change the size of our view. I'm just going to go to the header and make it a little bit bigger. Then we go to the axis and make it a little bit smaller in order to get some overlapping. Now let's go to Size and increase the size, or make it a little bit smaller. Then we can go to Color and reduce the opacity. And now we can customize the design further. For example, let's take the sum of sales and put it on Color. Then let's increase the opacity a little bit so it looks better. Beyond that, it depends on how you like it.
Maybe you want to add some borders, so let's go to Border over here. I like the dark ones, so I'm just going to make it a darker gray. Of course, here you can customize different things. For example, you can use two measures: instead of having the sum of sales on Color, we can use the sum of profit. So let's put the sum of profit on Color. Now we can see a lot in this one chart. We can see the change over time, and we can also see the correlation between two measures in order to understand the relationship between them, where the size indicates the sales and the color indicates the profit. This makes the circle timeline a really powerful tool for analysis in Tableau. 173. Tableau | Pie & Donut Charts: All right, so now we're going to talk about the pie chart in Tableau. It is a very easy and common way to analyze or show part-to-whole data. Let's see how we can build that in Tableau. There is an easy way, a cheating way, to do that: go to Show Me over here and click on the pie chart. We will not do that. We will create it on our own so that we understand how Tableau works. Let's not take the shortcut, so I'm just going to close it. In order to build a pie chart in Tableau, first let's go to the Marks card and change it from Automatic to Pie. With that, we get a small icon called Angle, and this is where we drop our field. In this example, we're going to build a pie chart from the sales and then split it by country. Let's take the sales and put it on Angle. With that, we've got our pie chart. It is just a circle, and it's not divided yet. Let's switch from Standard to Entire View in order to get a bigger pie chart. The next step is to divide the pie chart into sections, and our dimension is going to be the country.
Let's go to the customers, grab the country, and put it on Color. With that, our pie is divided into multiple sections, and the size of each section indicates the sales of that country. This type of chart is used to analyze part to whole. For example, here we can analyze how the USA contributes to the total sales. As you can see, it's really easy to build, and it's very commonly used in dashboards. We can, for example, add some labels over here and of course change the design of the pie chart. One more thing I would like to show you: sometimes in dashboards you see multiple pie charts in one view. In order to do that, you just grab any dimension and put it on the rows or the columns. For example, let's take the category and put it on the columns. With that, we immediately get three pie charts, one for each of the three categories. This is how we usually work with pie charts: we have one dimension that splits the pie chart into sections and another one that duplicates the pie chart. All right guys, so that's all for pie charts in Tableau. Okay, so now moving on to the next one, the donut chart. The donut chart is very similar to the pie chart. You still have this part-to-whole analysis; you have a circle, and you have different segments. But many people prefer the donut chart, because we can add one extra piece of information in the middle of the circle. All right, in order to build it, we need two charts. The first one is going to be the pie chart, and the second one is going to be the empty space in the middle. Let's start with the pie chart. As we learned previously, we have to switch from Automatic to Pie. Then we take our measure, the sum of sales, to Angle. Next, we take the divider, the country, to Color. With that, we've got our pie chart.
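Putting a measure on Angle amounts to giving every section a slice of the circle proportional to its share of the total. A minimal sketch of that arithmetic in Python, with made-up sales figures and a hypothetical function name `pie_angles`:

```python
def pie_angles(sales_by_country):
    """Return each country's slice of the circle in degrees."""
    total = sum(sales_by_country.values())
    return {
        country: 360.0 * sales / total
        for country, sales in sales_by_country.items()
    }


# Hypothetical totals per country; the real values come from the dataset.
sales_by_country = {
    "France": 250.0,
    "Germany": 250.0,
    "USA": 400.0,
    "Italy": 100.0,
}

angles = pie_angles(sales_by_country)
for country, angle in angles.items():
    print(country, angle)
```

The slices always add up to the full 360 degrees, which is what makes the pie a part-to-whole chart; the total that gets divided here is the same number the donut chart later shows in its middle.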
Okay, next I'm going to switch from Standard to Entire View. This is our first chart. Now, in order to get the empty circle in the middle, we have to create another chart inside this view. So we're going to create an empty measure, just to have a second chart. In order to do that, let's go to the columns over here and write the average of zero. On the Marks card we still have only one visual. In order to get a second one, we duplicate the measure. With that, we've got our two measures, one for the pie chart, and the second one for the empty space. Now we're going to merge them in one place, because we want only one donut. So right-click on the average and choose Dual Axis. As usual, we synchronize things, so let's synchronize the axis. Now let's get rid of the axes. We don't want them, so turn off Show Header, and do the same for the one at the bottom. Now we have the two charts in one place. It's a little bit small, so let's go to Size and make it a little bit bigger. All right, now let's make the empty space in the middle. Let's switch to the second Marks card over here. The second chart will not be a pie; it's going to be a circle, so let's switch it to Circle. Let's get rid of all the information on it. Now if you check our view, we don't see the pie chart, and that's because the charts overlap and the pie chart is behind our circle. In order to show it, we go to the circle, go to Size, and start reducing the size of the circle. As you can see, now we are getting the shape of a donut, but our donut should be white in the middle. Let's change the circle color to white. Perfect. Now we've got the donut shape in our view. But let's get rid of all those lines.
Right click over here on the empty space and go to Format. Then let's go to the left side. Let's start with the lines over here, the zero line. Let's switch it to None. Then we still have one more line on the columns. Let's switch to the columns, and for the grid lines, let's set it to None. Then in order to get rid of those borders, let's switch to the borders. Let's go to the row divider and make it None as well. For the column divider, it's None too. And with that, we got a very clean donut shape in Tableau. Now let's add some labels and some data to our donut chart. Let's go to the pie chart first. Here we're going to show the information for those sections. So what we're going to do, we're going to bring, for example, the country to the labels. As well, we can get the sum of sales: hold Control and drag and drop it to the labels. Now we can change the font format, of course. If we go to the labels over here and click on the three dots, let's make, for example, the sum of sales bold. And that's it. So far, there is nothing new compared to the pie chart. We are just showing the information of each section. But now here comes the power of the donut chart. We can show information here inside the circle, and usually it's the total of the measure, the total sales. Now let's switch to the circle over here. Let's get the sum of sales and put it on the labels. Now you can see the sum of sales here, strangely on the right side, because we didn't customize it yet. So let's go to the labels, then to the alignment over here, and set everything to the middle. With that, as you can see, we got the total sales in the middle. Let's customize the text a little bit. So let's go inside. What we can do, we can write Total Sales at the start. Then we can make the actual number, the real value, bold. Let's make everything a little bit bigger, 16, and click OK.
Now as you can see, we've added another piece of information to the pie chart, where we have the total sum of sales in the middle, and we can see very nicely the different sections around this number. That's it, this is how you can create donut charts in Tableau. This type of chart is used way more than the pie chart, since you can add one extra piece of information in the middle.

174. Tableau | Heat & Treemap Charts: Okay, so now we have another chart to analyze part to whole, using the tree map. We usually work with tree maps in order to show the hierarchical data inside our dataset. Let's see how we can build that. Let's first start with the marks. Let's switch it to Square. The next step, we're going to go to the sales and put it on the size. With that, we got one blue square for the total sales in our data. Now of course, we want to split this square into multiple pieces. And here we can work with the hierarchy of the products. Let's start with the first dimension, the category. Let's drag and drop it to the colors. As you can see, we already got a tree map. The colors of the tree map are decided by the category, and the size of those blocks is decided by the sales. Now, of course, in this tree map we want to represent the hierarchy. The next dimension is going to be the subcategory. But this time we will not move it to the colors, we will move it to the details. Let's go and do that. Now, as you can see, each of those blocks is divided into more blocks, where we have the subcategory information. That means the data will keep splitting in the tree map the more dimensions we add from the hierarchy. For example, let's grab the product name and put it on the details. Now we can see that we have a lot of mini blocks that represent the product names. With that, we have represented our product hierarchy in a tree map.
And we can see that each category, for example the red one, is split into multiple subcategories, and each subcategory is split further into products. But of course, the disadvantage here is that the more details you add, the harder it's going to be to read this visualization. I don't recommend going down to the product name. In such visualizations, the category and the subcategory should be enough. Of course, like any other chart, we can have multiple tree maps in one view by adding a dimension to either columns or rows. For example, let's get the order date to the rows. And with that, we got multiple tree maps split by year, which is really not a useful visualization. So let's remove it. Okay, so moving on to the heat map. It is like a matrix with colors inside it, and we usually use it to show correlations between two categories. Let's see how we can build that. We need two categories, that means we need two dimensions. Let's say the first one is going to be the country. Let's drag and drop it to the columns. And then the second dimension is going to be, for example, the subcategory. Let's drag and drop it to the rows. And with that, we got our matrix. Let's switch to Entire View. We have rows, we have columns. What is missing now, of course, is our measure, the data. In order to create the effect of the heat map, we're going to take the sum of sales and put it on the colors. With that, we've got our heat map. And we can see from the colors the correlation between the countries and the subcategories, where we can see immediately that the highest sales are where we have the dark color. So for example, we have high sales for the country France and the subcategory Phones.
And the lowest sales we can see, for example, here for Envelopes and Italy. Here we can see again the power of visualizations, where we can now read the trends and the correlations in our data, which is way better than having only numbers. But of course, if you want to add some numbers to this matrix, we can go to the labels over here and show the mark labels. And if you want to move them to the middle, let's go to the alignment and set everything to the middle. That's it. As you can see, it's really simple, and this is how we can create a heat map in Tableau.

175. Tableau | Bubble Charts: Bubble charts in Tableau. They are a really great way to add a lot of dimensions and measures to one single view. Bubble charts are like circles, and we can define a lot of things on the circle, like the color and the size, and we can put text inside it. Let's have an example. We're going to start with the marks. Instead of Automatic, let's switch it to circles, since the bubbles are circles. Let's start with the first piece of information. We're going to get the measure Sales and put it on the size. With that, we got our small bubble, or circle. Let me switch it to Entire View. Now we have one piece of information, the total sales in our data. Let's add another one, like a dimension. So let's add the subcategories to our view. I'm going to take this dimension and put it on the details. As you can see, we got more bubbles, and we now get a bubble for each subcategory. All right, so now let's keep adding more information to our bubbles. Let's say that I would like to add coloring to the bubbles, and this should come from another measure. Let's take the profit and put it on the colors. With that, we've got different colors, depending on the values of the profit. And now, how about adding one more piece of information inside those bubbles? Let's say the category. Let's go and get the dimension Category.
And now let's put it on the labels. Now we can see the category of each bubble, of each subcategory. As you can see, we have four different pieces of information inside our bubbles. First, the color of the bubbles indicates the profit. Then the size of the bubbles shows us the sales. The number of bubbles is decided by the subcategory, so we have all those subcategories in our data. And finally, the text inside the bubble comes from the category. This is the power of bubble charts, where you find a lot of information in one view. So now we have another fun one, called stacked bubble charts. Here we're going to add a lot of dimensions to the details. So let's see how we can build that. Let's go to Automatic as usual, then switch it to circles. Let's take the sum of sales and put it on the size. We are just creating our bubbles again. This time we're going to get the country and put it on the colors. So far we have those four colors for four countries. Now if we bring any dimension to the details, it's going to split these bubbles into more, smaller bubbles, depending on the cardinality of the dimension. For example, let's take the category, it has a very low cardinality, and with that we get just a few bubbles. If we remove it and take the subcategory instead, as you can see, we are getting way more bubbles than with the category, and that's because we have more values in the subcategory. Now let's go with a higher cardinality. So let's just remove the subcategory and get, for example, the product name. Once you do that, you will get a lot of small bubbles, and they are all stacked together. And of course, you can sort the bubbles differently. If you go to the country over here, right click on it and go to Sort. Let me just move it to the left side a little bit and change the sort. As you can see, the color is going to change as well.
So here you can sort the bubbles as you want. Now of course we can go with more details. If we take the lowest level of detail, the order ID, let's drop the product name and get the order ID. With that, Tableau asks us: do you really want all of that data? Yes, add all members. Now you will get a small bubble for each order inside our visualization. Okay, so this is another way to represent your data visually, using the stacked bubble chart. And if you look at it, you will find it looks like the sun. All right, so that's all for the stacked bubble charts.

176. Tableau | Maps: Okay, so now we're going to talk about Tableau maps. First, let's get the data in order to plot the maps. Let's go and create a third data source. I am on the Data Source page. Let's go over here to this small icon, New Data Source. Then let's go to Text File and then to the data that we downloaded. Let's go to the big folder, and then we have over here USA Sales. Let's select this CSV file and click Open. It's a really simple table where we have the orders, country, region, state and sales. That's it. Let's go back to our view and create a very basic map in Tableau. Again, we could cheat using Show Me, but we're going to create it from scratch. Now if you have a look, you can find that we have two automatically generated fields, the latitude and the longitude. They are the geographical coordinates used to plot the map, the Earth. The latitude is responsible for plotting the horizontal lines, and the longitude is responsible for plotting the vertical lines. What you can do is go and use them on the columns and rows. Let's take the longitude to the columns and the latitude to the rows. With that, you can see that Tableau is now able to plot the Earth. Next we have to specify for Tableau the country, the states, that geographical information. Let's take, for example, the country to the details.
And with that, you can see that Tableau is now focusing only on the United States, because we only have information about the USA. Now let's take the states as well and put them on the details. As you can see, Tableau is now focusing with those points on each state. All right, so the next step: instead of having circles, I would like to have a map chart. Let's go to the marks and switch it from Automatic to Map. With that we have the whole area covered with color. Now you can add coloring depending on the dimension that you want. For example, we can go to the region over here and put it on the colors. Now we can see that the map is split by region. What is missing here is the sales information. Let's go and get the sales. But see, we have a small problem: the sales field is a discrete dimension because of its data type. Let's switch it to Number (whole) and then convert it to continuous. Then the last thing, we have to convert it to a measure as well, because it is still a dimension. So now everything is fine. Let's get the sales to the labels. And with that, we got very nicely the total sales for each state. This is how you can create a very basic map in Tableau. Okay, moving on to the next one. We can create maps in Tableau with symbols. I just duplicated the previous one. Let's switch the visual from Map to, for example, circles. Then the size of the circles is going to be decided by the sales. Let's take the sales and put it on the size. As the next step, let's make the circles a little bit bigger. Now we can add another measure to the circles. Let's say the number of orders. We're going to take over here the count field of the USA Sales data and put it on the colors. Now the scale of the color defines the number of orders, and the size of the circle is defined by the sales. This is one way to represent this information as circles or bubbles.
We can also choose different shapes. Let's go over here to the marks and go to the shapes. For example, let's see what we have over here. Let's go with the stars. As you can see, we have a lot of options here for which symbol is shown on our map. This is how we can add symbols to maps in Tableau. All right guys, maps in Tableau are very rich in customizations. There are a lot of options for how to plot the maps in the view. I'm going to show you a few possibilities for plotting maps in Tableau. The first one is about how to have a map without any background noise. Let's go and do that. If you take the country field and just throw it here in the middle, Tableau can understand that we are talking about a map, and we automatically get everything on the columns and the rows. As the next step, let's take the states as usual over here, and then we're going to color it by putting the region on the colors. If you check the map, you can see there are a lot of grayed-out areas inside the map that are not used directly. If you want to remove all that information, what we're going to do, we're going to go to the main menu. You have here the Map options, and then here we have Background Layers. Let's click on that. Then here on the left side we get many options for customizing the maps. I really recommend you go and click around. It's really fun to work with maps in Tableau. Now the task is to remove all that background information. What we're going to do, we will just deselect all those selected layers. Let's just remove everything. With that, as you can see, we have removed the background and we have only the relevant information in our view. There's another way to remove the background. Let me just go back and restore all those settings. I think with that we got all the information back.
Another way to remove the background information is to go to the Washout and move it from 0 to 100. Now as you can see, the background inside our map disappeared. This is how we can remove the background information inside our map, and you get a really clean map in order to focus on the relevant data. Okay, the next one is also about customizing maps in Tableau. So now let's create a night vision map. It is just fun to work with maps in Tableau. So let's go again and get the country in the middle of the view, and then the states to the details. Now in Tableau, we have different types of maps, not only one. If you go to the main menu over here, to the maps, either you check Background Maps, where we have the different modes, or you go again to Background Layers, and on the left side you can see the styles. Currently it is white and gray, it's Light. If you click over here, you can find the different modes. We have the normal one, and then we have styles like Dark, Streets, Outdoors, and Satellite. It's really nice to have different styles. What we're going to do now, since it's night vision, we're going to go with the Dark mode. Now the next thing, I would like to reduce some information, like the United States and Mexico labels. Let's go and remove those on the left side. What we're going to do next, we're going to add some measures to our view. Let's close the background layers over here. Let's get the sales to the size. With that we are getting those nice circles. Let's make them a little bit bigger. Then we can add the sales to the colors as well. So hold Control, put it on the colors, and let's change the coloring. Let's go to Edit Colors. Now let's go to Automatic over here and change it to another palette. For example, let's take the blue green over here. Click OK. Okay. Now we're going to add more customizations to our map. For example, let's say that I would like to change the color of the borders of those states.
I would like to make it red in order to make it more interesting. I cannot do that in the current view, because if I change anything about the border, it's going to change the border of the circles and not the border of the states. In order to do that, we need two maps, one for the circles and one for the states. All right, now let's see how we can do that. We're going to go to the longitude and duplicate it. With that we've got two maps, the left and the right. Let's configure the right one. Let's switch the marks to the second map. Instead of having circles, we want to have a map, so let's switch it to Map. As you can see, we now have two different types of maps. But I would like to have only the border information here, so I'm not interested in the sales. Let's remove it, and the same for the sizing. Now as you can see, we have a gray color filling the map. So let's go to the colors and reduce the opacity to 0%, so that we don't have any color on the map. What we need is the color of the border. So let's go again to the colors, go to the borders over here, and make it red. I'm not really happy with this color, I want it to be more red. So let's go to More Colors and get the real red. Now the question is how to merge those two maps into one map. Well, the answer is the dual axis again. So let's go to the right one over here, right click on it, and choose Dual Axis. All right, so with that we got one map, but I'm still not happy. You can see that the circles are behind the lines. In order to have them in the front, let's switch those two measures. And now you can see that the circles are in the front. All right, so with that we have created our night vision map.
And with that you have also learned how many possibilities we have in Tableau to customize maps. With all those different options that we have inside the maps, I really recommend you go and explore them in Tableau. It's really fun.

177. Tableau | Histograms: Okay, now we're going to learn how to create histograms in Tableau. There are two ways, one quick way and one advanced way: the quick way if you have one measure, the advanced way if you have two measures. Histograms are a really great way to show the distribution of your data using bar charts. So let's see how we can do that. Let's work with the one measure, the quantity. Right click on it, then go to Create, and then to Bins. Here we can configure our bins. I'm going to leave it at the default that Tableau suggests. Let's click OK. With that, we have created a new bin dimension in our Data pane. Now what we can do, we're going to grab it to the columns, and here we can find the size of our bins. Then we're going to get the quantity to the rows. And then the next and last step we can do: we're going to go to the quantity and convert it from discrete to continuous. Right click on it and switch it to Continuous. So with that, we have created a very simple and nice histogram to see the distribution of our data using the measure Quantity. All right, the next one is going to be a little bit more advanced, where we're going to create a histogram using two different measures: the number of customers by the number of orders. We want to cluster our customers based on the number of orders that they placed. In order to do that, we have to create our bins, but now we're going to use a calculated field, using the LOD expression FIXED. Let's go and create a new calculated field. Let me just move it a little bit over here.
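As a rough illustration of what this second method computes, here is a minimal Python sketch, not Tableau's own implementation. It first counts the orders per customer (in Tableau's calculation syntax this would read roughly `{ FIXED [Customer ID] : COUNT([Order ID]) }`, assuming the fields are named that way) and then counts how many customers fall on each order count. The function names and the sample rows are hypothetical.

```python
from collections import Counter


def orders_per_customer(order_rows):
    """Count order rows per customer.

    order_rows: iterable of (customer_id, order_id) tuples.
    Like COUNT, this counts rows, so a repeated order ID is counted again.
    """
    return Counter(customer_id for customer_id, _ in order_rows)


def customers_per_order_count(order_rows):
    """Histogram: number of orders -> number of customers with that many orders."""
    per_customer = orders_per_customer(order_rows)
    return dict(sorted(Counter(per_customer.values()).items()))


# Hypothetical sample data, for illustration only.
sample_orders = [
    ("C1", "O1"), ("C1", "O2"),
    ("C2", "O3"),
    ("C3", "O4"), ("C3", "O5"),
    ("C4", "O6"),
]
histogram = customers_per_order_count(sample_orders)
print(histogram)
```

The first aggregation plays the role of the bins, and the second one does the counting on top of it.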
What we're going to find out is the number of orders per customer. In order to do that, we can use the LOD function FIXED. It starts with FIXED, let me select that. Then, for each customer, we want to count the number of orders. So we're going to take the customer ID, and the aggregation is going to be the number of orders. That means we're going to count the order ID. All right, so that's it. Let's hit OK. With that, Tableau created a continuous measure, but I would like to convert it to a discrete dimension. Right click on it and convert it to a dimension. And that's it. Now let's grab it to our view and check the information. All right, so with that we can see that we already have our bins, and those are the different numbers of orders that the customers placed. As the next step, we need our second measure. It's going to be the number of customers. Let's go to the customer count over here and drag and drop it to the rows. As well, let's take the customer count to the labels. And with that, we got a very nice histogram in Tableau using two measures. Again, if you want to build a histogram from two different measures, one of those measures has to be the basis, the bins of the histogram, and the second measure is used to do the counting. So now we can see very quickly that most of our customers place between 13 and about 16 orders. All right. So those are the two methods for creating histograms, the easy way and the slightly more complicated way.

178. Tableau | Calendar Chart: Okay, so now we're going to learn how to create a calendar in Tableau. We're going to build this calendar using the order date. Let's take the order date first to the columns. Now in the columns we have to have the days. Right click on it in order to change the format, then go to More, and then let's get the Weekday. With that we got Monday, Tuesday and so on.
Then we need to build the rows of the calendar, and that's going to be the week number. Let's hold Control and duplicate it to the rows. Instead of the weekday, let's switch the format again, over here to More and then Week Number. With that we got our matrix, our calendar. You can see we have all the weeks here. I would like to reduce it to only one month. That means we're going to add some filters to our view. Let's take the order date and put it on the filters. The first filter is going to be on the years. Go and select Years. Let's select the last year and hit OK. And we can of course offer it to the users: right click over here and show the filter on the right side. We can do the same for the months. Let's take the order date and put it on the filters. Let's go for the month next, and select only one month. And then offer it to the users as well. All right, with that we got only one month. Let's switch it from Standard to Entire View. Now as usual we need a measure in order to fill our calendar. It's going to be the sum of sales. So drag and drop it and put it on the colors. All right, so with that we can already see that we have a heat map inside our calendar. Now we just need to add a few things. For example, let's add a white border between those boxes. Go to the colors, then go to the border and add a white color, so that we get a nice separation between the days. And let's add the day number to each box as well. In order to do that, we're going to take the order date and put it on the labels over here, and here Tableau switched the marks automatically to Text. Let's switch it back to Square. And instead of having the years, we have to format our date. So right click on it and select the Day. Then as the next step, let's place those day numbers in the top right corner. So let's go to the label alignment and set it to right and then top.
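The layout built above (weekdays as columns, weeks as rows, one month at a time, sales as the color value, day number as the label) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard library's `calendar.monthcalendar`, whose weeks start on Monday by default, which may differ from the week start configured in Tableau, and the function name and sample sales figures are hypothetical.

```python
import calendar
import datetime


def calendar_grid(year, month, sales_by_date):
    """Build the calendar matrix for one month.

    Returns a list of weeks (rows); each week is a list of 7 cells (columns,
    Monday first). A cell is None for days outside the month, otherwise a
    (day_number, sales) tuple, with 0 for days that have no sales.
    """
    grid = []
    for week in calendar.monthcalendar(year, month):
        row = []
        for day in week:
            if day == 0:
                # monthcalendar pads days outside the month with 0
                row.append(None)
            else:
                date = datetime.date(year, month, day)
                row.append((day, sales_by_date.get(date, 0)))
        grid.append(row)
    return grid


# Hypothetical sales for February 2021, for illustration only.
sample_sales = {
    datetime.date(2021, 2, 1): 100.0,
    datetime.date(2021, 2, 14): 250.0,
}
grid = calendar_grid(2021, 2, sample_sales)
for row in grid:
    print(row)
```

Changing the year or month argument corresponds to switching the two filters in the view.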
All right, so with that we got a really nice calendar in Tableau. Of course you can switch to another month, let's say for example February, or check another year, 2021. And that's it, this is how you can create a calendar in Tableau.

179. Tableau | Waterfall Chart: All right, now we're going to create waterfall charts in Tableau. They are very useful for showing the flow of a process in your data, and also for part-to-whole analysis. Let's see how we can create one. First, we need a dimension like the subcategory. Let's move it to the columns. Then we need a measure. This time, let's take the profit and drag and drop it to the rows. Then let's change it from Standard to Entire View. Now in order to have a waterfall in our view, we need the running total. In order to do that, let's go to the profit over here, right click on it, go to Quick Table Calculation, and switch it to Running Total. With that you can see that we now have a running total of our data, but it is still not a waterfall. In order to get that, we have to switch away from the classic bars. So let's go to the marks over here and switch to Gantt Bar. All right, so with that we got the basis for our waterfall, but now the size of each bar has to depend on the profit. Let's go again and grab the profit to the size. But now if you check it closely, we can see that those bars are not forming a waterfall, because they go in the opposite direction. We would like it to start from zero, from bottom to top. In order to get this effect, let's go to the sum of profit over here, double click on it, and put a minus in front of it. Click on that. Now we got exactly what we want. It starts from the bottom and goes to the top, and with that we are forming the shape of a waterfall. Now we have to add some coloring. Let's get the profit and put it on the colors. Now, what we want to do with the colors: if the numbers are positive, then it's going to stay blue.
But if it's negative, it should be red. In order to do that, let's go to the colors and Edit Colors. And now we're going to do the following setup. Let's go over here and make it only two steps. Then let's go to Advanced over here and make sure that the center is set, so it is zero over here. And that's it. So let's hit OK. With that, we can see very easily where the negative values in our waterfall are and where the positive values are. You can, of course, make it green and red. So now the last thing that we have to add to our waterfall is the total. That's really simple. Let's go to Analysis in the main menu, then go to Totals over here, and select Show Row Grand Totals. By doing that, we get our total on the right side, and with that we get a perfect waterfall chart in Tableau.

180. Tableau | Pareto Charts: Now we have the Pareto chart. It is a very famous chart in statistics, and it is based on the Pareto principle, which uses the 80/20 rule. The principle says that 80% of the outcomes are generated from 20% of the work or effort. One way to visualize a Pareto chart is to use two different charts. The first one is going to be the bar chart and the second is going to be the line chart. Let's see how we can build that in Tableau. First we can start with the dimension Subcategory: drag and drop it to the columns. Then we need our measure. Let's drag and drop the Sales to the rows. Now in order to have the Pareto effect, we have to sort the data descending: first comes the data with the highest sales, and then we go descending to the right side. What we can do, we can go to the Sales over here and sort it. Perfect. Now we have the bar chart. The next step we want to do is to build the line chart. In order to do that, we're going to take the sum of sales and duplicate it. So hold Control and duplicate this field. And with that we've got our two charts.
Since the second chart is going to be a line chart, let's switch it. So I'm going to switch the second sum of sales, and instead of Automatic, we're going to have it as a line. As well, I'm going to change the color to orange. Perfect. As usual, we have to merge those two charts together. So let's go to the sum of sales, right click on it, and choose Dual Axis. And here our chart is broken, because the first chart is set to Automatic. So let's go to the first one over here and switch it back to bars. All right, so we are not there yet, because we have to work on the line. The line should be the percentage of the running total. Doing that in Tableau is really easy. Let's go to the sum of sales over here, right click, and choose Add Table Calculation. All right, so now we're going to configure our table calculation for the second measure. And as I said, we have to do two things here. First we have to calculate the running total, and then we have to apply the percentage. In order to do that, let's change the calculation type to Running Total. Let's select that. With that, as you can see in the background, we have a running total. But the principle here is based on the percentage of the running total, so we have to switch this to a percentage. In order to do that, we can click over here and say Add Secondary Calculation. Let's click on that. We get a primary and a secondary calculation. The first one is executed as a running total, and then on top of that we want to get the percentage. Let's switch the secondary one from Difference From to Percent of Total. Let's click on that. That's it for the table calculations. Let's close it. With that, we have built our Pareto chart, but let's understand what is going on over here. In order to read this easily, I'm going to go to the second one, to the line, and put the labels on top of it.
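The table calculation configured above (data sorted by descending sales, a running total as the primary calculation, then percent of total as the secondary one) can be sketched outside Tableau as follows. This is a minimal Python sketch under stated assumptions, not Tableau's own implementation; the function names, the `target` parameter, and the sample sales figures are hypothetical.

```python
def pareto_cumulative_share(sales_by_item):
    """Return [(item, cumulative share of total sales)], highest sales first.

    Step 1: sort descending by sales (the Pareto ordering).
    Step 2: running total (primary calculation).
    Step 3: divide by the grand total (secondary calculation, percent of total).
    """
    ordered = sorted(sales_by_item.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(value for _, value in ordered)
    if total == 0:
        raise ValueError("total sales must be non-zero")
    running = 0
    shares = []
    for item, value in ordered:
        running += value
        shares.append((item, running / total))
    return shares


def items_needed_for_share(sales_by_item, target=0.8):
    """Number of top items needed to reach the target share of sales."""
    for count, (_, share) in enumerate(pareto_cumulative_share(sales_by_item), start=1):
        if share >= target:
            return count
    return len(sales_by_item)


# Hypothetical sales per subcategory, for illustration only.
sample = {"Phones": 50, "Chairs": 30, "Binders": 15, "Envelopes": 5}
shares = pareto_cumulative_share(sample)
needed = items_needed_for_share(sample)
print(shares, needed)
```

Comparing `needed` with the number of items answers the 80/20 question: here 2 of 4 subcategories are needed to reach 80%, so this sample would not follow the rule either.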
And of course, the principle says 80/20, which means 20% of those subcategories should make up 80% of the sales. As you can see, we cannot see that in this business. If you check our subcategories in this example, it's not 20%. We need around nine subcategories in order to reach the 80%. So in this example, our business does not follow the principle that 80% of the sales are covered by 20% of the subcategories. All right? So this is one method for creating a Pareto chart in Tableau, and this is how you can read it. All right. So now we're going to learn another method for creating a Pareto chart in Tableau. This time we're going to use two different measures with only one line. Let's see how we can do that. We have the business question, and it asks us: do 20% of the products make up 80% of the sales? Let's get the answer from the data. In order to do that, let's first get our first measure. It's going to be the sum of sales. Drag and drop it to the rows. Now let's get our second measure. It's going to be the count of products. In order to do that, let's take, for example, the product name to the columns, and Tableau tells us that we have a lot of members here. Add all members. Now as you can see, we have a dimension, but we want to count how many products we have in our data. So right click on it, go to Measure, and then select Count (Distinct). With that, we got our two measures. There is one more thing that we need, on the details, in order to do the calculations: we need the product name on the details in order to use it. All right, so I'm going to go over here and switch it to Entire View. Let's go to the first measure, right click on it, and choose Add Table Calculation. Here again we have the same steps. We can switch it to a running total, and then we're going to add a secondary calculation. The secondary calculation is going to be the percent of total. As well, let's specify the dimension.
Let's go to Specific Dimensions and select the product name. The same as well for the right side, it's going to be the product name. All right, so with that, we got everything ready for the first calculation. Let's go and close it. Now as you can see, we already have the percent of the running total for the products. Let's do the same stuff for the sales. Right-click on the Sales, and then let's go and add a table calculation. Let's go to Running Total. The specific dimension is going to be the product name. Let's go and add the secondary calculation. It's going to be the Percent of Total. Then the same stuff, we have to go to Specific Dimensions and specify the product name. All right, so with that we have prepared everything for the second calculation. Let's go and close it. Now we have to go and switch it to a line, since we have it as Automatic and Tableau decided to go with shapes. Let's go and switch it to Line. Now with that, we are almost there. We have the running total for both of the measures and we have our line, but as you can see, the line is a little bit jittery. And that's because we haven't sorted the data yet. It's very important for Pareto charts that we sort the data, like we have done in method one. Now let's go and sort the product names by their sales. In order to do that, right-click over here and go to Sort. Then we can sort it by the sales: let's switch it to Field, select Sales from the field name over here, and make it descending. Perfect. Now we got exactly the Pareto chart that we need. So now we have to check whether it's true that 20% of our products make up 80% of our sales. In order to check that quickly and easily in the view, we can use the support of reference lines. Let's go and add some reference lines. Let's go to the Analytics pane over here. Let's take a reference line here. Let's drag and drop it first to the first value.
Now, instead of having the average, let's go and switch it to Constant. Here we're going to check the 20%, so it's going to be 0.2. And now with that, we're going to get a reference line exactly on the 20% of the products. Let's go and close that. As you can see, we have a very nice line that indicates exactly the 20% of the products. As the next step, we're going to go and add another reference line for the sales. So let's take a reference line, drag and drop it exactly on top of the Sum of Sales. And now we're going to do the same stuff: instead of Average, let's switch it to Constant, and since we need 80%, it's going to be 0.8. So with that, we got exactly the 80% of the sales. So perfect. Now we have our Pareto chart, and we can easily answer this question from our data. So we can say, yes, 20% of our products are covering 80% of the sales, which exactly matches the 80/20 rule, the Pareto principle. All right, so these are the two methods on how to create Pareto charts in Tableau and analyze your business.

181. Tableau | Butterfly (Tornado) Charts: All right, now we have the butterfly chart, or as we sometimes call it, the tornado chart. It is a great chart in order to analyze two different measures by a specific dimension. So for example, if you want to compare the number of customers with the number of orders by the category, then the butterfly chart is your chart. What do you need? First, the dimension. It's going to be, as usual, the subcategory. Let's move it to the rows, and then as usual, we're going to switch it to Entire View. Then we need our two measures. The first one is going to be the customer count. Let's move it to the columns. Then the second one is going to be the order count. All right, so with that, we have our two measures and the subcategory. Now in order to form the shape of the butterfly, we have to have the dimension exactly in the middle.
And then on the right side we have one measure, and on the left side we can have another measure. In order to do that, we're going to use the placeholder, the average of zero. Let's have it over here, and let's go and place it exactly in the middle. Now with that, we have a measure on the left, a measure on the right, and something empty in the middle. And then let's go and configure the charts. It's going to be the middle one, the average of zero. Let's go and switch it to Text. And now as the next thing, we have to go and get the dimension to the Text over here. And with that you can see we've got now the spine of the butterfly. So let's go and make it a little bit more bold. So I'm going to go over here and just make it bold. But now we have to have the two wings, one on the right and one on the left. You can see the right side is okay, so we have it as a wing. Let's go and sort the data, by the way. But the left wing is not correct yet. In order to fix that, let's go to the count of customers over here on the axis. Let's edit the axis, and let's go and reverse the scale so that we get exactly the opposite in the scale. Let's go and close it, and as you can see, now we got it perfect. On the left side we have the wing of the customers, and on the right side we have the orders. Now the next step is what we usually do, which is to add some coloring. For example, let's stay at the customers over here and drag the count of customers, holding Control, to the colors. As well, we can go to the orders over here and drag and drop the orders, holding Control, to the colors. But of course, we can go and customize the right side using different coloring. Let's go to the colors over here and change the palette, maybe to orange, let's say. Okay. As well, we can go and make the text in the middle a little bit bigger. Let's go to the middle, and then let's make it maybe something like 15. Now we can see those subcategories in the middle very clearly.
But since we have it in the middle, we don't need it on the right. So let's go and hide it. Right-click on it and then let's go and disable Show Header. We can go to the axis over here and as well disable the headers. And of course we can add more formatting in order to remove those grid lines. Right-click over here on the empty space and go to Format. And then we can go to the Columns tab and as well remove the grid lines. With that we've got a clean chart that represents a butterfly or a tornado, depending on how you see it, where you can go and compare two different measures by a specific dimension.

All right, now in method two, we're going to bring those two wings together. In order to do that, we're going to get exactly the same information. Let's go and get the subcategories to the rows, and then as usual, switch it to Entire View. Let's go and get our measures. So the first one is going to be the count of customers, and then the second one is going to be the count of orders. But we have to put them now on top of each other. Since we are using the same type of chart, we're going to use Measure Names and Measure Values. Take the order count and drag and drop it on top of the axis over here, in order to generate the Measure Names and Measure Values. All right, so we have this information. Now we're going to go and take the Measure Names. We don't need it on the rows, so drag and drop it to the colors over here. And just to make sure that everything stays as bars, I'm going to go from here and switch it from Automatic to Bar. And now as the next step, we're going to go and sort the data. So click on the axis over here, and then sort the data descending. Now both of the values, or the wings, are on the right side. In order to have the effect of left and right, we don't have two axes here. What we're going to do is a very small trick. Let's go to the customers over here. Double-click on it and just go to the front, before the count, and put a minus.
Let's go and hit Enter. So with that, we get again the effect of the butterfly, where we have the left and the right wings together. But of course what is missing here is the spine, the dimension, the subcategory. In order to get that, we're going to do the same. We're going to go and have the average of zero as a placeholder. We have it now on the right side. Let's switch to it, and then we can switch it to Text, since we want to have the text of the subcategory. And then as the next step, we're going to go and get the text. It's going to come from the subcategory, so drag and drop it on top of the Text. And with that we got the values, or the spine, of the butterfly. The next step is that we're going to go and merge them together in one chart. What we're going to do is use the dual axis. Right-click on the average, and then here we use Dual Axis. But as you can see, those values are not yet in the middle. And that's because we haven't synchronized the axes. Go to the average over here, and then let's select Synchronize Axis. And with that we got the spine exactly in the middle, but it's not really clear because it's red. So let's go and change those colors. Let's go to the average over here. Double-click on it. And let's select complete white. That's it. Click OK. And now as the next step, as usual, we're going to go and start hiding stuff, because all of this information is not necessary. So the average over here, let's go and hide it. And as well, we don't need the header information because we have it already in the middle. So right-click over here and disable Show Header. And with that we get a very elegant and nice butterfly chart in Tableau with both of the wings together. And now we can go and analyze the correlation between the number of orders and the number of customers by the category. All right, so this is how we can create butterfly or tornado charts in Tableau using two methods.

182.
Tableau | Quadrant Chart: All right, so now we're going to go and learn how to build quadrant charts in Tableau. This type of chart is going to present a lot of data points in one view using two measures. Then we go and split the chart into four different quadrants, and we compare those data points based on their position in the quadrants. This type of chart is really great in order to do strategic planning or risk management, or as well to find some trends. So now let's go and check in Tableau how we can build that. The first thing that we need is two different measures. For the first one, let's take the discount and put it on the columns. Let's go and find the average of the discount. Right-click on it, and let's go to Average instead of Sum. So this is our first measure. Now we need another measure. This time it's going to be the profit ratio. We don't have it in our data, so let's go and quickly create it. Create a new calculated field, Profit Ratio. And it's very simple. It's going to be the sum of profit divided by the sum of sales. That's it, let's go and hit OK. Then let's go and bring it to our rows. With that we got our two axes, but I would like to have them as percentages. Let's go and change the format. Let's go first to the profit ratio. Then instead of Number, let's go and switch it to Percentage. Then let's go and remove those decimals. Let's do the same thing for the average of discount. So let's go and format it as well to Percentage and remove those decimals. All right, so that's all for the axes. What we need now is the customers as data points. In order to do that, let's go and get the customer ID and let's put it on the Details. Now as you can see, each of our customers is presented as a data point. Let's go and change the visual of that. Instead of shapes, let's have circles. And let's go and reduce the opacity in order to see the overlapping between those points as well.
We can go and make it a little bit bigger. So now we need two values in order to split this chart into four different quadrants. Now here, since we want it to be dynamic, we want to offer it to the users as parameters in order to specify those two values. So now let's go and create two parameters in the Data pane. We're going to create the first one. Let's call it Select Discount. It's going to stay as Float, and the display format is going to be a percentage. Let's reduce the decimals, and then let's say that the default is going to be 0.15, so with that we're going to get 15%. So that's it for the first one. We're going to do exactly the same for the second one in order to get the profit ratio. So let's create another parameter, and we're going to call it Select Profit Ratio. We have the same stuff again, so we can have it as a percentage and reduce the decimals. Let's have it as 10%, so 0.1. That's it for this one. Let's go and close it and show both in our view: Show Parameter, and Show Parameter. Now we have them on the right side. Next, we have to create a separation in our view in order to show how the data is split. In order to do that, we can add two reference lines. Let's start with the profit ratio. Right-click on it and add a reference line. Then the value is going to depend, of course, on our new parameter, Select Profit Ratio. And then let's go and make the label empty. And then we can go and change the format. Instead of having a solid line, let's have a dashed one, then let's have it black, and then increase the opacity. And that's it. Let's hit OK. And do the same as well for the discount. Right-click on the discount and add a reference line. We need our parameter, Select Discount. Remove the label. And as well, do the same stuff on the customization, so we can have it dashed and as well have it clear in our view. All right, now let's go and hit OK. All right.
Now as you can see, we already have our quadrant chart, where we have split our data into four different sections. Of course, we can go now and move those splitters using the parameters. Let's go to the profit ratio and change it to 0.2. With that, we move it to 20%. Now of course, what is missing in our quadrant chart is the coloring of those points. Each section should have its own color. In order to do that, we have to go and create another calculated field to have those four values. Let's go and create one. Let's call it Quadrant Color. Now we have to go and identify the position of each data point inside our quadrants. Let me just move it a little bit over here. In order to do that, we can use IF statements. Let's start first by identifying all those points on the upper right. How are we going to do it? We say: if the profit ratio is higher or equal to the parameter value that is selected by the users, so we type "select" and pick Select Profit Ratio. That means we are checking whether the customer is in the upper section, and now we have to check whether it's on the left or the right. So now we're going to talk about the discount: and the average discount as well is higher or equal to the value selected from the parameter, so we type "select" and pick Select Discount. Now we are targeting all the customers on the upper right. So what happens if the condition is fulfilled? We're going to say upper right. All right, so now we're going to go and do the same stuff for all the other three sections. Let's go and just copy it from here. Then we're going to say ELSEIF, and let's go and paste it. Let me just make it a little bit bigger in order to see it. Now we're going to go and target the upper left. In order to do that, we have to go and change the discount condition to smaller. Now we are saying: if the discount is smaller than the selected value in the middle, that means we are on the left side. What's going to happen? We will just go and flag it with the following value, upper left.
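The quadrant rule being typed into this calculated field can be sketched outside Tableau as a plain function. This is only an illustration in Python, not the Tableau formula itself: the function and argument names are made up, the defaults 0.10 and 0.15 are the parameter defaults chosen above, and the two bottom branches follow the same pattern as the two upper ones described so far.

```python
def quadrant(profit_ratio, avg_discount,
             select_profit_ratio=0.10, select_discount=0.15):
    """Label a data point by its position relative to the two split values."""
    upper = profit_ratio >= select_profit_ratio  # above the horizontal line
    right = avg_discount >= select_discount      # right of the vertical line
    if upper and right:
        return "Upper Right"
    elif upper:
        return "Upper Left"
    elif right:
        return "Bottom Right"
    else:
        # Nothing else matched, so no explicit condition is needed here.
        return "Bottom Left"


# Hypothetical customers: (profit ratio, average discount).
for point in [(0.20, 0.30), (0.20, 0.05), (-0.10, 0.30), (-0.10, 0.05)]:
    print(point, "->", quadrant(*point))
```

Because the split values are arguments, changing them re-labels every point, which is the same reason the coloring in the view reacts when a user changes a parameter.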
Then we have to do the same stuff for the next one. Let's say now we're going to go and target the bottom right, so let's call it bottom right. The discount part is not correct yet, so let's change it like this in order to be on the right side. For the ratio, in order to be at the bottom, this time it's going to be smaller. With that, we are at the bottom right. For the last section, in order to target it, we don't have to go and specify a condition. We just simply say ELSE, because if none of those conditions is fulfilled, we will end up in the last one. We're going to call it bottom left. That's all. Let's go and end our IF statement, and the calculation is valid. Let's go and hit OK. And with that we got our new calculated field. Let's go and drag and drop it to the colors. Now as you can see, we have a dedicated color for each section inside our quadrants. And of course, if the user goes over here and changes the values of the parameters, the coloring will react as well, since we have the parameters inside our calculated field. For example, instead of 15%, let's have it as 0.25. Now as you can see, the reference line goes to the right side, to the 25%, and as well, the coloring is adjusted. That's all. This is how you can create a very nice dynamic quadrant chart in Tableau.

183. Tableau | Box Plot: Now we're going to talk about the box plot in Tableau, or as we sometimes call it, the box and whisker plot. This type of chart is going to help you to understand the data distribution of your dataset. This chart has like a box and two whiskers, on the top and on the bottom. And then we have the median in the middle and the edges of the box, so that we will get five different numbers on how our data is distributed. Let's see how we're going to build that in Tableau. It's really easy. Let's start as usual with the sales. Let's drag and drop it to the rows, and then we're going to see how the subcategories are distributed on those sales.
Let's take the subcategory to the Details first, and then we have to change the visual to circles. Let's go to the Marks over here and change it to Circle. Now in order to have different charts, I would like to add the category to the columns over here. And then let's go and make it a little bit bigger, to the middle over here. Now let's go and reduce the size of those circles a little bit in order to have it more clear. And with that, we have the first part of the box plot, where we have circles. Next we have to get those numbers, or the shape of the box and the whiskers. In order to do that, we have to add a reference line. Let's go to the Sales over here, right-click on it, and add a reference line. And here everything is prepared by Tableau. Just go to Box Plot over here, and that's it, let's click OK. And that's it, actually. With that, we got a box plot in Tableau. Now if you go and mouse over the chart, you will get the five different values: the upper whisker, the lower whisker, the median, and so on. All right, so now the question is how to read the box plot. Well, there is a lot of information over here, but the first thing that you can do is to compare the position of the median of each box. If you have a look over here, you can see that those two boxes are at the same level, right? So they are very similar categories. But if you check Office Supplies, you can see that the median, or the box itself, is below those two other boxes. That indicates for us that Furniture and Technology have a similar distribution, but Office Supplies has a different one. Another thing that you can check is the size of the box itself. If the box is tall, or the length of the box is long, then that means the subcategories inside this category are not really similar and they are far away from each other. But if you check Office Supplies, you can see that the box is shorter, so the length of this box is smaller compared to the other two.
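To see where those five values come from, here is a small Python sketch that computes them for a list of numbers. It assumes the whiskers stop at the most extreme data points within 1.5 times the interquartile range, and it uses Python's "inclusive" quartile method; Tableau's own quartile calculation may differ slightly, and the function name and sample numbers are made up for illustration.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Return whiskers, hinges, median, and outliers for a box plot."""
    data = sorted(values)
    # Quartiles: lower hinge (Q1), median, upper hinge (Q3).
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers reach the furthest points that are still inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "lower_whisker": min(inside),
        "q1": q1,
        "median": median,
        "q3": q3,
        "upper_whisker": max(inside),
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical sales values per subcategory, with one extreme value.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
for key, value in summary.items():
    print(key, value)
```

A short box means Q1 and Q3 are close together, which is the "similar behavior" reading described in this lesson; points outside the whiskers are the outliers.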
That's going to give us the information, or the hint, that the subcategories of this category, Office Supplies, have similar sales. That means if we have a shorter box, the members of this category are going to have a similar behavior. But if we have a big or tall box, that's going to suggest that the members of this category have different sales, so a different behavior. And of course, this type of chart is going to help us to find the outliers, especially beyond the upper and the lower whiskers. All right, so that's all about the box plot in Tableau.

184. Tableau | KPI: Okay, so now we're going to talk about the KPI chart, the Key Performance Indicator. We usually use it in order to analyze the performance of our business, whether it is succeeding or failing. All right, so now let's go and build a KPI in order to track the performance of the sales in our business. So let's go and do that. As usual, we're going to go and get the subcategories to the rows. Let's take the sales as well to see the numbers. And then as the next step, let's say that we want to check the sum of sales for each country. So let's go and grab the country field to the columns. And then as the next step, we have to define the core of the KPI: the rule for when the sales are going to be considered as a success, and when they are going to be considered as a failure, or maybe in between. So what we have to do now is to go and create a new calculated field in order to define the KPI rule. Let's go and call it KPI Colors. Now by checking the data, let's say that if the sum of sales is higher than 50K, then it's going to be considered as a success. Or if we are talking about colors, it's going to be green. We're going to work with IF statements, so we're going to check whether the sum of sales is higher than 50,000. Then what happens? We're going to say it's green. So now as the next step, we have to define the second rule.
Let's say that if the sales are between 10K and 50K, this can be medium, or let's say orange. Let's go and build that using ELSEIF: the sum of sales is less or equal to 50K, and, since we are making like a range, the sum of sales is higher than 10K. Let me just make it a little bit bigger. Then what happens? It's going to be orange. All right, then we have the third rule. If it's not in between and not higher than 50,000, then it's going to be less or equal to 10K. So what we're going to do at the end is say ELSE, and it's going to be red. That's it, let's end it. This is our KPI rule in order to track the performance of the sales. Let's go and hit OK. And with that, we got a dimension here on the left side, the KPI Colors. Let's go and grab it and put it on the colors. As the next step, let's go and assign the correct colors. Tableau is almost correct. Let's edit the colors. The orange is orange, red is red, but the green is blue. Let's go and switch that. And with that, we can immediately track the performance of the sales, where we can see immediately where we are performing well, by those green numbers, or where we are performing badly, by the red numbers. But if you have seen any KPI dashboard, you will have seen that they are using a lot of shapes. Now, instead of those numbers, let's go and get shapes assigned to those three values. That means we're going to go to the Marks over here and switch it to Shape. Now, things are ugly currently, so let's go and take the sum of sales to the Details. And then we're going to take the KPI Colors to define the shape of our visual. So with that, we've got different shapes for each level of our KPI. But I would like to change them. So let's go to the shapes over here, and then let's go to the Default palette and switch it to KPI. Now we have better icons for our KPI. Let's go and switch stuff. So for green it's going to be this icon, for orange it's going to be this one, and then for red it's going to be the red one. All right, so that's it, let's go and hit OK.
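The three-level rule defined in this calculated field can be sketched as a small function. This is a Python illustration of the same thresholds (above 50K is green, above 10K up to 50K is orange, everything else is red), not the Tableau formula; the function name and the sample totals are assumptions for illustration.

```python
def kpi_color(sum_sales):
    """Map a sales total to the KPI level described in the lesson."""
    if sum_sales > 50_000:
        return "Green"   # success
    elif sum_sales > 10_000:
        # Already known to be <= 50K here, so this is the middle range.
        return "Orange"
    else:
        return "Red"     # 10K or less


# Hypothetical sales totals per subcategory and country.
for total in (72_500, 50_000, 10_001, 10_000, 800):
    print(total, "->", kpi_color(total))
```

Note that the middle branch does not need to repeat the "less or equal to 50K" check, because the first branch has already handled everything above 50K; writing it out explicitly, as in the lesson, gives the same result.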
And now we can go over here and make it Entire View and as well change the size of our KPI. With that we've got a nice KPI, where we can see immediately where we are doing well and where we are doing badly. This is how we can create a KPI in Tableau.

185. Tableau | KPI & Bars: All right, so now we're going to learn how to combine a KPI together with any other type of chart, like for example the bar chart. So now we're going to go and build a view in order to compare two years. In order to do that, we're going to get the same stuff. So let's get the subcategories to the rows. Then here we have the sales of 2022. Move it to the columns over here. With that, we've got our bar chart, but I would like to move it from Automatic to Bar in order to make everything stable and not break our visualization later. As the next step, I would like to go and add the coloring as well. Let's take the sum of sales 2022 and put it on the colors. Now as the next step, let's take 2021 as a reference inside our view. Let's move it to the Details. And then let's go to the axis, right-click on it, and let's add a reference line. Here we would like to have the value of 2021 for each subcategory. So let's switch it to Per Cell and then select the 2021 sales. And then let's go and hide the labels. The rest is only customization. Let's move it to a little bit heavier line, then increase the opacity, and as well change it to orange. That's it. Let's go and hit OK. Now in order to see the data better, let's switch it from Standard to Entire View. And with that we got a reference from the previous year, and the bars are the current year, so that you can quickly see the differences between the two years. But we are not done yet. This is only the bar chart. Now we have to go and add a KPI to it. So here we have to define the rule of the KPI. And this time it's going to be easy. If the current year is less than the previous year, then it's going to be red. If it is more or equal, it's going to be green.
Let's go and define this rule as usual. We're going to go and create a new calculated field. We can call it KPI Colors. Now we're going to go and define that rule. We're going to use the IF statement as well. If the sum of sales of 2022 is higher or equal to the sum of sales of 2021, then we are safe. It's going to be green. Let me just make it a little bit bigger in order to see everything. But if the condition is not fulfilled, what's going to happen? We will have bad performance, so it's going to be ELSE red, and then END. So this is our rule. Let's go and hit OK. Now for the KPI, we need another chart inside this view. But since it is like a dimension, if we bring it to the view, it will not split into two different visuals. In order to generate another chart, we will use the trick of using the average of zero. So we have to create a placeholder, the average of zero. And with that, as you can see, we get a new chart on the right side. On this measure, we will go and configure our KPI. Let's go and switch to its Marks. And now we're going to switch it from bars to shapes. It's like we are building any other KPI. I will go and get rid of this information. And now we're going to go and get our new calculated field, the KPI rule, and put it on the shapes. Next we're going to go and define the shapes of our KPI. Let's click on Shape. Let's say if it's green, then it's going to go up. And if it's red, it's going to go down. That's it for the shapes. Click OK. As well, we want to change the coloring of that stuff. Let's take the KPI Colors, hold Control, and put it on the colors. Let's go and assign it: Edit Colors. Green is going to be green and red is going to be red. That's it. Click OK. Now we have our KPI on the right side. We can go and make it a little bit bigger in order to see the shapes. Now we have two different charts. As the next step, we're going to go and use the dual axis. That's because they have different shapes.
So let's go to the right side and select Dual Axis. And as usual, we're going to go and synchronize the axes and remove one of them. Let's go to the average as well and then go and disable Show Header. With that we hide it, and we got the KPI and the bars on top of each other. But still, here we have an issue. As you can see, the icons of the KPI are exactly on top of the edge of the bars. And that's because everything is starting from zero, and we have here the average of zero. Now what we're going to do is move it a little bit to the left side using a negative value. Let's go to the average of zero and switch it from zero to minus 10K. We can see our KPI is perfectly on the left side of the bars, and we can see immediately where we are doing badly. Here we can see that almost all of the subcategories are doing great. We have all those green icons, but only two, the envelopes and the machines, are doing badly. That's because the sales of the current year are less than the sales of the previous year. With that, we have learned how to combine the KPI chart with any other chart. It does not have to be a bar chart; it could be an area or a line chart.

186. Tableau | BANS: Okay, so now we're going to create BANs in Tableau. They are those big numbers that you can usually see in KPIs or in dashboards, where you're going to see the total of something, like the total sales, the total profit, or how many customers we have inside our dataset. So it's very common, and you can see it in almost every dashboard. So let's go and create it. What we're going to do first is to go and switch our visual from Automatic to Text, since we are working with text and there are no charts or any visuals. Let's take the sales and put it on the Text. Now with that, we've got one number. Without any charts, only one big number, the total sales of our data. Now we can go and split it by a dimension like the country.
Let's take the country and put it on the columns, so now we can see the total sales of each country. Now since we are talking about BANs, those numbers should be really big. In order to change that, let's go to the Text over here. Click on those three dots, and then let's go to the Sales and make it really big. We're going to go to the size over here. Let's take, for example, 22 and make it bold. Then you can check the size of those numbers by hitting Apply. It looks good. Now let's go and hit OK. And let's make the alignment correct. So let's have everything centered on the horizontal and the vertical. Now we can go and change the format of those numbers. Let's go to the Sum of Sales over here and go to Format. And then we can go to Numbers over here in order to change the format. Let's go for Custom. There should be no decimal places, so let's make it zero. And then let's say we're going to display the units as thousands, as a K. Then we can add the dollar sign as the prefix over here. So let's go and do that. That's all about the format. Let's go and close it from here. Now with that, we have created really nice BANs for our dashboards. We can go and make the view a little bit bigger to see those numbers. Now you might say, you know what, I would like to have those texts beneath the numbers, not on top of them. To do that, we're going to take the country again and put it on the Text, and we're going to get the text below the number. But of course we have to make it really small. Let's go to the Text over here, then to the three dots. And then let's go to the country, remove the bold, and let's make it, for example, 12. All right, now let's go and hit Apply in order to check the format. Now as you can see, we've got that small text beneath those numbers. But we can go and as well reduce it to ten to make it really small beneath those big numbers. Now let's hit OK, and with that, we got really nice small text below our numbers.
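The custom number format chosen above (zero decimal places, units in thousands shown as K, and a dollar sign prefix) can be mimicked with a short Python function. This is only a sketch of what the formatted label should look like; the function name is made up, the sales totals are hypothetical, and adding a thousands separator for very large values is an assumption.

```python
def format_ban(value, prefix="$"):
    """Format a total the way the BAN label is configured: $ prefix, K units."""
    thousands = value / 1000
    # No decimal places; a comma groups digits if the number is large.
    return f"{prefix}{thousands:,.0f}K"


# Hypothetical country totals, only for illustration.
for country, total in {"Germany": 733_215, "France": 2_297_201}.items():
    print(country, format_ban(total))
```

The point of the compact format is that a BAN is read at a glance, so "$733K" serves the dashboard better than the full unrounded number.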
But we still have an issue where we have the header information. In order to remove it, just go to any value, like Germany over here, right-click on it, and disable Show Header. And with that we got really nice BANs where the text is below the big numbers. So as you can see here, we didn't use any type of chart, we just used the Text in Tableau.

187. Tableau | Funnel Chart: Now we're going to learn how to build a funnel chart in Tableau. Funnel charts are really great in order to show the progress of your data through different stages. Let's see how we can build that. Let's take the sales and put it on the rows. Now we want to see how the sales are progressing through the different subcategories. Let's take the subcategories from the products and put it on the colors. Now as the next step, we would like to change the size of those blocks based on the sum of sales. In order to do that, let's take the Sum of Sales, holding Control, and put it on the Size. Now let's go and switch it from Standard to Entire View in order to see the size of each block. Now we need to form the shape of the funnel. In order to do that, we're going to go and sort the data descending, so the biggest one is going to be on top, and then we go to the small ones. In order to do that, let's go to the subcategory over here and right-click on it. And let's go and sort it. We have to change Sort By to Field, then move it to Descending. That's it. As you can see in the background, we now have the shape of the funnel. Now the next and the most important step in the funnel chart: we want to show the percentage of total for each block. In order to do that, let's take the sum of sales as well and put it on the Text. With that we got the total sales for each subcategory, but we don't want that. We want the percent of total. In order to do that, right-click on it and let's go to Quick Table Calculation. Then let's pick Percent of Total.
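The two steps that shape the funnel, sorting the blocks descending and labeling each one with its percent of total, can be sketched in a few lines of Python. The function name and the sales figures below are made up for illustration; this only mirrors the arithmetic behind the quick table calculation.

```python
def funnel_blocks(sales_by_subcategory):
    """Return (name, sales, share of total) tuples, biggest block first."""
    total = sum(sales_by_subcategory.values())  # assumes total is not zero
    ordered = sorted(sales_by_subcategory.items(),
                     key=lambda kv: kv[1], reverse=True)
    return [(name, value, value / total) for name, value in ordered]


# Hypothetical sales per subcategory.
blocks = funnel_blocks({"Chairs": 300, "Phones": 500, "Paper": 200})
for name, value, share in blocks:
    print(f"{name:8s} {value:5d} {share:.0%}")
```

Unlike the Pareto chart, the shares here are not cumulative: each block shows only its own slice of the total.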
Great, now we have those percentages on the funnel, which is very nice. And in the funnel chart, let's go and add the text of the subcategory as well. Let's take the subcategory and put it on the label. Now we can go and customize our view a little bit, where we say, okay, let's put the text of the subcategory on top of the sales, so switch the order. Then let's go and change the label and make the subcategory a little bit bigger and bold, and hit OK. As well, we can go and remove those grid lines, so right-click over here and go to Format. Let's go to the lines, and then let's go to the zero lines over here and set them to None. All right, so that is cleaner. What we can also do is add the category to the filter. Let's go to the category and show it as a filter. And with that, we can go and select a specific category in order to see the data. With that we get fewer blocks inside the funnel chart, or you can go and add all of them. That's it. This is how we can create funnel charts in Tableau in order to track and check the progress of your data. 188. Tableau | Progressbar: In our KPI dashboards we can add stuff like a progress bar. Let's see how we can build that in Tableau. Now let's go and get a dimension like the country to the rows. And then we're going to go and track the progress of our sales as a progress bar. In each progress bar, you have two bars: the one in the background for the 100%, and then your actual progress. That means we need two bar charts. Let's stick with the first one and switch it to bar. And as well, let's show the text. But now instead of the total sales, let's go and switch it to a percent of total. Let's go and switch our sales to a quick table calculation, Percent of Total. Now for the next step, we're going to go and add the background bar. In order to do that, let's go and add a placeholder. It's going to be the average of one. Now we've got our background on the right side, and on the left side we've got the actual progress.
Let's go and merge them together using the dual axis. Right-click on the right one and then pick Dual Axis, okay? As usual, we're going to go and synchronize those two axes. And let's go and make it a little bit bigger in order to see the bars. Now we can see that the average, the background, is in the front. In order to switch that, let's go to the axis of the average. Right-click on it and then here we can say Move Marks to Back. All right, so now the next step: in order to get the effect of the progress bar, we have to change the coloring of the background. Let's go to the colors and edit them. And then let's select the average. Instead of this blue, let's select something lighter. So let's take a light blue and hit Apply, then OK. All right, so with that we get the effect of the progress bar. Let's go and hide a few things, like for example the average axis over here. As well, let's hide those numbers on the background. So let's go to the labels and hide them. All right, so that's it. This is how we can create a really nice progress bar in Tableau that you can put inside your dashboards. 189. Tableau | Choose The Right Chart: We learned how to build 63 charts in Tableau and what their use cases are. But you might still be overwhelmed with all those options and all those charts in Tableau, and it's still not that clear how to answer the question: how do we know which chart, which visualization, we have to pick? That's why we're going to go now and summarize and group all those charts under different categories. We have the change over time, magnitude, part to whole, correlation, ranking, distribution, spatial and flow. And each of those categories is going to focus on a specific question, a specific problem, in order to answer it using visualizations. So now let's go through all those categories one by one in order to understand them. All right, so now we're going to start with the first one and the most basic category we have, the change over time, or sometimes we call it trends over time.
This category is going to show us the trends or the patterns over a continuous period. And it usually answers the question, how does the data change over time? Or another one: are there any trends or patterns that we can uncover from the data over time? If you have this kind of question, then you are talking about the category change over time, and the best chart in this category is the line chart. Because the line chart focuses on only one thing, the changes over time, the trends over time, and nothing else. And visually, it makes it really easy to spot trends. As we learned before, we have multiple charts that cover the topic of change over time. Of course, all the line charts are usually about change over time, so we have the line chart as the perfect one. Then we have the sparkline charts as well. We can use them if we want to have a compact chart for the trend analysis over time. Or we can use the slope charts to see how the ranks are changing over time. Or we can use the bar charts as well in order to analyze the changes over time, and also to go and compare different time periods together. Not only the bar charts, we can use other types of charts too. For example, the stacked area charts. Here we have different use cases: one of them is the change over time, and another is to go and compare different categories together. As well, we can go and use the calendar chart or the circle timeline in order to visualize the change over time. So as you can see, if you want to have only one use case inside your visualization, to show the change or the trend over time, then go with the line charts. If you want to go and cover multiple use cases in one chart, then you can go and use the area chart, the bar chart, or the circle timeline charts, because they don't focus on only one use case. They can cover multiple use cases, and one of them is the change over time.
All right, so now we have the magnitude. Sometimes we call it the size category. And it uses size in order to compare values. We could use relative or absolute values in this category. So for example, if you have the following task or question: find out the highest and the lowest sales of the categories. Or we have to go and compare the different categories by sales in one chart. If you have such questions or tasks, then we are talking about the category magnitude. And the best chart for this question is the bar chart, because it makes it very easy and clean in visualizations to compare values. You can compare the data very easily by comparing the length of the bars of each category. Under this category, we can find multiple charts, and most of them are bar charts. So we can use the row bar chart as the main one, or we can use a column bar chart. As we learned before, if you have a dimension with high cardinality, you can go with rows. But if you have a dimension with low cardinality, then go with columns. Those two charts only cover one dimension. But if you have multiple dimensions, then you can go with the side-by-side bars, or the stacked bar charts, or the full stacked bar charts as well. Then we have different charts under this category, like the lollipop charts, bubble charts, and the scatter plots. And you might ask, why the scatter plot and why the bubble chart? Because the size of the bubble can be used in this analysis. We can see immediately from the size of the bubble that technology and furniture have the highest sales. The same thing goes for the scatter plot here. Again, it really depends on how many questions you want to cover in one visualization. If it's only one use case, to go and compare the data, then go with the row bar chart or the column bar chart. But if the size comparison is not the only use case that you want to cover, you want to cover multiple things like adding multiple dimensions and measures.
Then you can go with the other charts under this category. All right, now we have the category part to whole. It shows how a whole or a value breaks down into its components and how each component contributes to the whole, to the total. So if you have a question like, how does the value contribute to the total, we are talking about the part to whole category. And the best chart to visualize the answer is the pie chart. Because visually it's very easy and also very effective to show how each slice of the pie contributes to the whole pie. In this category, the part to whole, we have different chart types. As we said, the main one is the pie chart. But we can go and use the donut charts, especially if you want to show the information of the whole, the total. So you can present it in the middle, and around it you're going to have the slices. Or we can go and use the bar chart, for example the full stacked bar chart, or the area charts, the full stacked area charts as well. You can go to the tree map if you want to analyze not only the part to whole, but also want to show hierarchical data. As well, we can go to the waterfall in order to show the part to whole and also the flow of the data. Here again, if you want to focus only on the part to whole use case, go with the pie chart. But if you want to add more information and analyze different use cases, then you can go with the others. All right, now we're going to talk about a very important category. We have the correlations. It can show the relationship between two or more measures in one visualization. This category can answer questions like, is there any relationship between two measures? Or how strongly related are two variables or two measures? If you have such questions, then we are talking about the category correlation, and the best chart in order to visualize the correlation is the scatter plot.
The scatter plot is very effective in order to show the relationship between two measures. And it covers a lot of use cases, like discovering the outliers. It's very flexible. We can add a lot of information to each data point. And as well, it can help us to build clusters. If the question is to show the relationship between two measures, the best chart to use is the scatter plot. And underneath this category, we can find different types of charts. Not only the scatter plot, but the scatter plot is the favorite one. We have the quadrant charts. We can use them as well to analyze two measures, and also to cluster our data or to split it into four sections. Or we can go and use the dual line chart if you want to see changes over time as well. Not only the correlation, but you can see the trends as well. So we can go and use two lines in order to analyze the correlation between two measures. Or we can go and use one line and one bar chart to see the correlation, and as well, we can go and compare the sizes of each bar. Moving on to another chart, which is very beautiful: in order to go and compare two measures, we can use the butterfly or tornado charts. And the last one, you can use the histogram as well in order to find the correlation between two measures and also to show the distribution of your data. Again, if you want to focus only on the correlation, nothing else, you can go and use the scatter plots. But if you want to go and add different use cases, like the change over time or the distribution or comparing the sizes, then you can go and use the other ones. Moving on, we have another category called ranking. We use this category if the most important thing to show is the position of the item in a sorted list. So for example, if you want to show the ranking of customers, the top ten customers by sales or the lowest ten products by sales, then we can use the ranking category in order to solve those tasks.
And the best chart in this category is the bar chart, because bar charts are really amazing in order to build a list and also to go and compare different ranks together. All right, so in order to show the ranking we have different types of charts. The basic one, as we saw, is the bar chart, whether it's rows or columns. And then we have different charts if you want to add more information or more use cases in one chart. For example, the lollipop charts, where you can go and put one extra piece of information inside the circles, or you can use the slope charts. Here, not only are we seeing the ranks between countries, but we can see how they are changing over time. And we have other charts like the funnel chart or the bump charts as well. Here we can show the ranks and how they are changing over time. As the last one, we can use the butterfly as well in order to show the ranking of the categories, for example, here, and also the correlation between two measures. Again, as usual, if you want to focus only on ranking, you can go and use the bar charts. But if you want to cover multiple use cases in one visual, then you can go and use the other charts. All right, so now we have the distribution category. We can use it in order to show the values of a dataset and the frequency of their occurrence. So if you have the following question, like what is the distribution of customers' ages? Or if the question is, what is the busiest time in the work day? If you have this type of question, then we are talking about the distribution category, and the best chart to visualize those questions and the answers is the histogram. Histograms are an amazing way to show the patterns using bins. And they make it very easy to understand the distribution of the data. Under the distribution category, we can find different types of charts, the main one being the histogram.
And we can go and use different types of plots, like the box plots, in order to see the distribution of data, and the dot plot for the distribution over time. As well, we can go and use the scatter plots or the quadrant charts in order to see the distribution of our data, and also to show the correlation between two measures. We can go and use the barcode charts as well. For example, here we can see the distribution of each product in each subcategory. As well, the bubble chart is considered to be a distribution chart. Again, if you want to focus only on the distribution, then go and use the histogram. But if you want to cover multiple use cases in one view, you can go and use the other charts. Moving on, we have the spatial category. Use it when the geospatial pattern of your data is the most important thing that you want to show. If you have questions or tasks that involve information about the location, like countries, cities, or states, for example you want to show which city has the highest sales, then we're going to go with this category, the spatial category. Of course, the chart that you're going to use in this type of visualization is the map. And in this course we have built four different maps. The first one is the filled map, or we call it the choropleth map. As you can see, the states are filled with colors. Or we can go and use symbols, like here we are using the star in order to show the sales for each state. And then we have learned how to customize the maps. For example, here we have created the night vision map. All right, so now we're going to talk about another category. We have the flow. We're going to use it in order to visualize the movement or the flow of our data. So if you have a question like, how is the data moving from one point to another point, then we are talking about the flow category. And as one very common chart to show the flow of the data or the process of the data, we can go and use the waterfall charts.
With this chart, you can see the movement of the data or the flow of the process of your data. As well, we can analyze the part to whole here. All right, so with that we have covered the eight different categories, and we mapped the different charts that we have learned in this course to those categories. As you can see, the process is really simple. In order to understand which chart or visualization you need in your projects, first you have to understand the questions that should be answered. Once you have understood the task or the business question, you can go and map it to one of those eight categories. And after that, you're going to go and choose the best chart within that category in order to answer the question. And with that, you have learned the process of choosing the right visualization, the right chart for the question. Make sure to check the description. I left a link there for the visualization cheat sheet. As well, you will find the Tableau file where I have sorted all those charts under the eight categories. All right, so with that, we have learned how to choose the right chart for your requirements. And with that we have completed the Tableau chart section. In the next section in our plan, we can learn how to create and design our dashboards in Tableau. 190. Tableau | Section: Tableau Dashboard: Tableau dashboard. Now we can learn the basic principles of how to structure our charts inside dashboards in Tableau. And we can focus on the containers in order to structure our dashboard. Once we have built all those beautiful charts, we can go and group them in one place using a Tableau dashboard. So let's go. Okay. So if you create a new dashboard, you will get different options on how to customize and design your dashboard. So for example, we usually go and start by changing the size of our dashboard, of this white space. In order to do that, if you go to the left side, we have here three different options: fixed size, automatic, and range.
What I usually do is go to the fixed size. Here we can go and customize the width and the height. For example, let's go with 1,200 for the width and 800 for the height. And then beneath that, we have a list of all the worksheets that we have. And then here, what is really important is the objects that we have in Tableau. So here we have a list of different objects like containers, text, extensions, images, blanks, and so on. You can use those objects in order to build up your dashboards in Tableau. And the very important objects here are the containers. In Tableau they are really confusing if you are new to this tool, so we will be focusing on how to work with the containers in order to build the structure of our dashboards. The first question is, what are containers? Containers in Tableau allow you to group different Tableau objects together in one place. The object could be anything, like worksheets, blanks, text, images, or even another container. Once you have all those different objects in one place, you can do many things. Like, for example, moving them all together from one position to another one using the container. Let's have a quick example. Let's take one of those containers. Let's take the horizontal container and drop it in the middle. And here the first thing to notice is the coloring in Tableau. As you can see, we have now a dark blue border around this space. The blue border indicates that this is a container. Now, we can go and drop anything inside this container. It could be a worksheet, it could be a text, anything. Let's go with any sheet. For example, I have one prepared, so drag and drop it exactly in the middle of the container. Now you might notice that we don't have the blue color, the blue border, anymore. We have now a gray border. That means that currently in Tableau I'm selecting an object that is not a container. So now we can go and grab anything like, for example, a text.
Let's take this object and drag and drop it on top of this chart. Here, let's write anything like "Sales Dashboard" and just make it a little bit bigger, then hit OK. So now, as you can see, we have another object that contains only a text. And it has a gray border as well. So that means we have one object with a gray border and another one with a gray border. So now the question is how to select the container that has those two objects. There are many ways to do that. So for example, let's say we are selecting the text. Go over here to those two lines and double-click on them. Once we do that, as you can see, we now have this blue border again. That means we are now selecting the whole container. So by double-clicking on this small icon over here, you are going back to the container that's grouping those objects. And there's another way to select the container. So now let's go inside it again and only click on the sheet. Over here we again have this gray border. Now if you go to this small arrow over here, we're going to get more options. And then here we have the option Select Container: Vertical. Once we do that, we go back again to the container where we have those objects inside it. This is another way to select the current container. All right, so now you might ask, you know what, why are we selecting the container? Well, for the following reason. For example, if you are just selecting this chart, you can go over here and you will get different options about the worksheet. For example, you can show the titles, the filters, the highlighters. You can configure only this worksheet. Those options are only related to this object. But now, if you want to go and configure the whole container, you have to go to the container. For example, let's go and select it. If you go to the options over here, we will get a completely different list of options.
And anything that you are selecting here is applied to all objects inside this container. For example, in the current container, there is still space left that could be filled. The whole space over here is not used, which is not really good. As you can see, the text object is way smaller than the worksheet object, which is fine for now. But what you can do in Tableau is split everything evenly. In the container options, you can see over here Distribute Contents Evenly. If you select that, what happens? As you can see, Tableau is going to go and automatically split the size of the container evenly for all objects. This is really helpful if you have different charts in one container: Tableau is going to go and split the space evenly for all objects. As you can see, the options of the container affect all the objects inside the container. One more thing to notice in Tableau: Tableau creates a unique container, always on the right side. This container is a special one where Tableau puts all the filters, legends, highlighters, and parameters as well, always beneath each other on the right side. So for example, in the subcategories we have the filter of the order date. And immediately Tableau creates a special container on the right side and places the filter inside it. So for example, if you take any other chart that contains this information, let's take this one over here and put it at the bottom, you will see Tableau is immediately going to go and add the filters of this worksheet. Beneath the first one, here we have the filter of the categories that comes from this chart. If we take the next one, the customer distribution, as you can see, we will get a lot of filters in Tableau on the right side. And the legends as well. So here we have the profit size. Here we have the country colors and so on. All parameters, all legends, all filters go on the right side.
And of course, if you want to customize the container that Tableau creates on the right side, you can go to any object and then double-click on it. And then you can go and customize it. For example, I can go over here and split everything evenly. All right, moving on. In Tableau, we have two different types of containers, the horizontal container and the vertical container. Let's start with the first one, the horizontal container. If you use this type, what happens? All objects inside your horizontal container are going to be side by side, next to each other. Let's try that. Let's take the horizontal container and drag and drop it to our dashboard. And then let's take one sheet, for example, the subcategory over here. And then let's take another one. Once you select it, as you can see, Tableau offers you either to put it to the left or to the right. For example, let's go and drop it to the right. With that we've got two charts side by side, next to each other, using the horizontal container. Of course, if you go and add anything else, it's going to be either to the left, or to the right, or in the middle as well. Once you drop it, you will get it side by side as well. This is how the horizontal container works in Tableau. Okay, next we have the vertical container. What happens here? All objects inside this container are going to be on top of each other, like they are stacked. So let's have a quick example. Let's take the vertical container and drop it on the dashboard. And then let's take any chart, and drop it over here as well. And now once we select another one, we can put it, for example, below it. And the third one, either below, in the middle, or on top. Let's drop it on top. As you can see, with the vertical container we are putting those objects or those charts on top of each other, so we are stacking the objects on top of each other. And this is how the vertical container works.
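To make the difference between the two container types concrete, here is a small Python sketch of how the space could be split when the contents are distributed evenly: a horizontal container places its objects side by side, a vertical container stacks them. This is only a mental model, not how Tableau is implemented; the names `Box` and `distribute_evenly` and the 1000 x 800 size are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Position and size of one object inside a container (in pixels)."""
    name: str
    x: float
    y: float
    width: float
    height: float


def distribute_evenly(kind: str, names: list[str], width: float, height: float) -> list[Box]:
    """Split a container's space evenly between its objects.

    kind="horizontal": objects sit side by side, each gets an equal share of the width.
    kind="vertical":   objects are stacked, each gets an equal share of the height.
    """
    if kind not in ("horizontal", "vertical"):
        raise ValueError(f"unknown container type: {kind}")
    if not names:
        return []
    count = len(names)
    boxes = []
    for index, name in enumerate(names):
        if kind == "horizontal":
            share = width / count
            boxes.append(Box(name, index * share, 0.0, share, height))
        else:
            share = height / count
            boxes.append(Box(name, 0.0, index * share, width, share))
    return boxes


# Two sheets side by side, then the same two sheets stacked.
for box in distribute_evenly("horizontal", ["Subcategory", "Customers"], 1000, 800):
    print(box)
for box in distribute_evenly("vertical", ["Subcategory", "Customers"], 1000, 800):
    print(box)
```

In the horizontal case only the x position and width change between objects; in the vertical case only the y position and height change.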
One more thing about the type of containers, which is very confusing if you are a beginner in Tableau, is that you can decide on the type of the container as you are dropping the second object. So let me show you what I mean. Let's take, for example, the horizontal container and drag and drop it to our dashboard. So now we can go and drop different sheets next to each other, right? So let's take the first one as usual and put it over here. And now we come to the second sheet, and our expectation is that we can put it either to the left or to the right, because we have a horizontal container. Well, the second sheet or the second object is a special one. You can use it in order to change the type of the container. Let's take, for example, this one over here. You can see we can put it left, we can put it right, but we can also put it on the top or on the bottom. Once I drop it at the bottom, what happens? Tableau is going to go and convert the type of this container to a vertical container. So now we cannot go and change our mind. It's going to be fixed. This is going to be a vertical container. So for example, if I take the third one, I cannot change my mind by putting it to the left or to the right. I can put it only on the top or on the bottom, so the container stays vertical. The third one will not change the container type. I can drop it, for example, here at the bottom. With the second sheet, we still have the option to change our mind and make it either a horizontal or a vertical container, depending on how we are dropping the sheet. But after that, for the third sheet, you don't have those options anymore. Where you can drop it depends only on the container type. All right, now the more things we put inside our container, the more complicated things get. In order to control the structure of our dashboards, there will be a lot of nested containers on top of each other, and you will lose control over time.
For that, Tableau provides a view of the current structure of our dashboard. Now we are currently at the Dashboard tab. In order to go to the view, let's go to the Layout tab. So let's switch to that. Then here at the bottom, we have something called the item hierarchy. Here we will see the structure of our dashboard. It starts with Tiled, since we are using this method. If you click on that, you can see Tableau is going to go immediately and select the current object. In the hierarchy, this is the highest container, where we have everything in our dashboard inside it. Let's go and expand our hierarchy. You can see that it then splits into a horizontal container. As you can see clearly, we have one container for all those filters, legends and so on. And on the left side, we have a container for all our worksheets. And you can see that by just moving this slider over here. As you can see, the first object is a horizontal container. And then inside the horizontal container, we have two vertical containers. The first one is going to be this container for the charts. And as you can see, things are stacked up on top of each other. So this is our first vertical container. If you click on the second one, now we are selecting the container on the right side. It's a vertical container as well; as you can see, all those filters and so on are beneath each other. Then of course, we can go and expand those containers to see the content. So as you can see, we have here three sheets inside the first container. And in the second one we have three filters. And then we have those two legends. Having this item hierarchy can help us with a lot of things. For example, it can help us to understand the structure of our containers, how things are nested inside each other.
And another use as well is to understand whether we have made any errors while building the containers. As you are dropping things inside your dashboard, weird stuff might happen in Tableau where you are creating way more containers than you need. It can help us to select things as well. For example, if I would like to select the horizontal container, it can be a little bit harder by double-clicking on those different objects. It's going to be easier if I go into the item hierarchy and just click on the horizontal container. As you can see, it's really easy to go and select things inside the item hierarchy. Here we can go and get the options as well. For example, let's go to the subcategories over here and right-click on it. And with that we'll get all the options of the worksheet. Or if you go to the containers, you will get the container options. The item hierarchy is really important in order to structure our dashboards. All right, moving on, we're going to go and learn how to drop objects inside the container. Now just to make things easier, I went through all the worksheets and removed all the filters, legends, and so on, just to keep things simple. For example, let's go and start with the horizontal container. Drag and drop it to the dashboard. Let's take an object like the sheet and drag it to the view. Tableau is going to show you different visuals to indicate what will happen if you drop it. For now, everything is gray and we have a clear border of the container. That means now we are dropping the object inside the container. Once I release it over here, what happens? If we go to the layout, you can see the horizontal container contains the worksheet. That means with this action, we placed the object inside the container. Let's check the other options. Let's go to the dashboard over here and take another sheet. Now if you drag it, as you are moving your mouse you'll find different shapes and different highlights.
For example, if you move your mouse a little bit to the left, you can see that the gray area is on the left side and the container, the blue container, is marked. That means if you drop it, Tableau is going to add it inside the container on the left side. If you move it to the right, the same thing happens, but on the right side. As long as Tableau is highlighting the border in dark blue, it means we are dropping the object inside the container. But now check this. If you keep moving your mouse to the right side, you will see that Tableau changes the color from dark blue to light blue. That means now we are dropping the object outside the container. So let's go and do that. I'm just going to drop it on the right side. Now let's go to the layout in order to understand what happened. As you can see, the first sheet is inside the horizontal container, but the second sheet is completely outside of the container. If you just minimize it over here, you can see that it's not inside the horizontal container. That means you have to be really careful how you are dropping the objects inside dashboards. Tableau reacts differently depending on the shapes. Now let's go and drag a third one. Let's take the customer distribution. Now as we are dragging, here you can see that Tableau is highlighting the container because the mouse is inside the container. Here you can drop it either to the left, right, bottom, or top. But if I move my mouse completely outside, Tableau is going to drop it outside of the container. For example, I can put it to the left, to the right, or to the bottom, but all of those places are not inside the container. Now let's go back to our container. I will drop it at the bottom, so let's go and do that. And of course, to check what happened, we're going to go to the layout in order to check the item hierarchy. Now as you can see, Tableau changed it from a horizontal to a vertical container because we have dropped it below.
And you can see that this object, this sheet, is inside the container. All right, so that said, be careful how you drag and drop things inside Tableau dashboards. Okay, moving on to the next one. In Tableau, we have two different options for how to arrange our objects inside dashboards: tiled and floating. By default, Tableau uses the tiled option for all our objects, but you can switch it to floating. What do those options mean? Let's start with the first one, the tiled option. If you use tiled, Tableau automatically arranges your objects in a grid layout. That means, for example, if you resize the dashboard, Tableau automatically changes the size of all objects inside the containers and dashboards. Let's take an example. Now we are selecting tiled. If you take anything, like the sheet over here, and place it inside our dashboard, Tableau automatically uses the whole space. So the worksheet is going to take the size of the dashboard, because Tableau says, okay, we have a lot of space, let's use everything. The other option is floating. Here, on the other hand, you have the freedom and flexibility to customize the objects. The advantage of floating is that we can overlap different objects. The disadvantage of floating is that it's time consuming and you have to do everything manually. So now let's check how this works. Make sure to select floating, then take another sheet and just drop it wherever you want. As you can see, we now have a gray box that indicates where we are putting the chart. Let's drop it over here. Now we have full control over where to position the object. For example, let's grab this one over here and just drop it on top of the old one. As you can see, we are now overlapping. Or we can change the size as we want. 
So I can just make it like this. As you can see, we have full control of this chart, of the objects, without any limitations. Now the question is, should I use floating or tiled? Well, in Tableau projects you can end up using both of them. We normally use floating for the big containers inside the dashboard layout, and tiled for all the objects that we have inside those containers. All right, so those are the main options for working with containers in Tableau. But of course, the best way to understand containers in Tableau is to work on real projects. That's why next we're going to do a mini project in order to understand how to design and build the layout of our dashboards using containers. All right, so those were the basics of Tableau dashboards and how to deal with containers. Next we're going to build a simple dashboard and learn the dashboard development process. 191. Tableau | Tableau Dashboard Project: All right, so the task, or the project, is to create a dashboard for the sales. One of the first steps that we usually do in order to plan our dashboard is to create a sketch. Here we're going to draw a very simple sketch for the sales dashboard. First, for example, we have the title of the dashboard, like Sales Performance, and then beneath it we can have three big numbers, or three BANs: the total sales, the total profit, and the total quantity. Beneath that, we're going to have three different charts. On the left, we're going to have a bar chart in order to show a ranking, the top sales by category. On the right side we have two charts. The first one is going to be a line chart where we compare the sales with the profits. Below that we're going to show the sales by category using a pie chart. With that we have a sketch, a plan for how to visualize our information inside the dashboard. 
Now in the next step, we have to plan the structure of our dashboard in Tableau using containers. If we translate this sketch to containers, we're going to have one big vertical container that has three objects on top of each other: the title, then the BANs, and then the charts. Since they are on top of each other, we're going to use a vertical container. Now we're going to go into more detail for each piece of information. Let's start with the first one, the text. For the text, we don't have any other information beneath it or side by side. That's why we will not use any container here. Then moving on to the next piece of information, the BANs. As you can see, they are side by side. That means we can use a horizontal container here, so the horizontal container is inside the vertical container. Okay, moving on to the next one, we have the charts. Here it's going to be a little bit tricky. First, if you check the sketch, we have charts side by side, left and right. That means we're going to use a horizontal container again, and this horizontal container is going to be inside the big vertical container. Now if you check the right side, you can see that we have two charts on top of each other. That means on the right side we can use a vertical container in order to cover those two charts. So this vertical container is going to be inside the horizontal container, and both of them are going to be inside the one big vertical container. As you can see, everything makes sense if you are organized and you start by sketching and planning your dashboards. So now we have enough of a plan. Let's go to Tableau and start creating this structure. All right, so now we're going to start from scratch. We have one empty dashboard. Now let's follow our plan, where first we're going to have the main container, the vertical container. 
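The container plan described here can be written down as a small tree before touching Tableau. The sketch below is only an illustration in Python: the `Item` class, the `render` helper, and the item names are hypothetical and are not part of any Tableau API. It mirrors the planned nesting: a main vertical container holding the title, a horizontal container for the BANs, and a horizontal container for the charts with a vertical container on its right side.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """One node of the planned item hierarchy (hypothetical model, not a Tableau API)."""
    name: str
    kind: str  # "vertical", "horizontal", "text", or "blank"
    children: List["Item"] = field(default_factory=list)


def render(item: Item, depth: int = 0) -> List[str]:
    """Return the hierarchy as indented lines, roughly like the tree in the Layout pane."""
    lines = ["  " * depth + f"{item.name} ({item.kind})"]
    for child in item.children:
        lines.extend(render(child, depth + 1))
    return lines


# The plan from the sketch: title, three BAN placeholders, then the charts.
plan = Item("Main", "vertical", [
    Item("Title", "text"),
    Item("BANs", "horizontal", [Item(f"BAN {i}", "blank") for i in (1, 2, 3)]),
    Item("Charts", "horizontal", [
        Item("Left chart", "blank"),
        Item("Right charts", "vertical", [
            Item("Top right chart", "blank"),
            Item("Bottom right chart", "blank"),
        ]),
    ]),
])

if __name__ == "__main__":
    print("\n".join(render(plan)))
```

Writing the plan this way makes it easy to compare against the item hierarchy in Tableau after each drop: if the tree in the Layout pane has a different nesting than the plan, something landed outside its container.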
So let's take the vertical container from Objects and drag and drop it onto the dashboard. As you can see, if you don't select anything, it's still a white page. In order to have an identifier for this container and make it easier to see during the design, we're going to go to the layout over here, select the container, and give it a border. So let's go to the border over here and make it a line. Then let's make it a little bit heavy and give it the color orange. Now if I deselect, you will see that we have one big container, the orange one. The color also indicates to me that this is a vertical container. We can also go to the item hierarchy over here and give it a name. So let's call it the main vertical container. All right, so what do we have inside this container? Three pieces of information. The first one is going to be a text, the title of the dashboard. Let's go to the dashboard over here, grab our text object, and drop it inside this container. Let's call it Sales Performance and make it a little bit bigger. Let's make it 2022 bold. Okay, that is the first piece of information. For the second one, we're going to add a horizontal container for the different BANs. Let's go to the objects over here, grab the horizontal container, and just put it beneath the text. Now that we've got a horizontal container, let's make an identifier for it. Let's go to the layout and make a border, and this time we're going to give it the color blue. So now we can see that we have a blue container inside the orange container. We can also give it a name. Let's go to the hierarchy and give it the name BANs. Next, we're going to add blanks inside this container in order to have placeholders for the actual BANs. In our plan, we have three BANs, so we're going to go to the dashboard. 
Let's go and add three blanks. As you can see, the first one is very small since it's a blank, so let's make it a little bit bigger. Let's add the second one to the right side, and another one to the right side. Now we're going to go to the layout and check the structure over here. As you can see, everything is fine. Those blanks are inside the horizontal container. All right, that's all for the container for the BANs. The next piece of information is the charts. Here, following our plan, we're going to add a horizontal container beneath this one over here. As usual, we're going to go to the layout and give it a border and a color. As you can see, we have one container beneath another container, and both of them are horizontal containers. Let's give it a name; we're going to call it charts. Now we're going to add the blanks, the placeholders for the charts. We're going to grab a blank over here. Again it's small, so let's make it bigger. Then we add the second one to the right side, and with that we've got the left and right. Now, as usual, go back to the layout and check whether everything is fine. You can see those two blanks are under the horizontal container. As you can see, I'm always going back to the hierarchy in order to check whether everything is fine. And this is exactly my tip for you: always check, and don't leave it until the end. So don't check the item hierarchy only at the end, after you have dropped everything, including the charts. I promise you will see stuff there that you didn't plan. Each time you drop something new into the dashboard, go and check in the item hierarchy whether everything is fine. All right, now on the right side only, over here, we're going to have two charts on top of each other. That means we need a vertical container only on the right side. Let's go to the dashboard over here and remove the right blank, because instead of that we're going to have the vertical container. 
Let's click on this blank over here and remove it. Then let's get our vertical container and put it on the right side. Make sure it's placed on the right side and that we are still inside the horizontal container, then drop it. Now you can see we have something on the right and something on the left. Let's make it a little bit bigger, up to the middle over here. Let's go back to the layout and check that everything is fine. You can see we have the horizontal container, this main one, and inside it, on the left there is a blank, and on the right we have the vertical container. Let's go to the right side and give it a color. It's going to be a border, and this time it's going to be orange. In this container we're going to have two charts, so I'm going to take blanks again and put them inside here, underneath each other. Now let's go back to the layout. As you can see, we have those two blanks for the charts on the right side and one big blank for the left one. In the next step, we're going to make sure that everything is distributed evenly. Let's start with the container on the right side over here. Right-click on it and click Distribute Contents Evenly. Then let's go to the next one, the horizontal container for the charts, right-click on it, and distribute the contents evenly. Then we go to the next one, right-click, and distribute its contents evenly as well. For the last one, the main container, I will not do that, because things here have different sizes. The text can be smaller than the BANs, and the charts are going to take most of the space. All right, so with that, as you can see, we have built the foundation of our dashboard and implemented our plan. Now for the last step, we're going to bring the content inside our containers. Let's go to the dashboard over here and start with the BANs. Let's take the BAN for sales, then the profit and the quantity. 
Then we're going to remove those blanks, since we don't need them anymore. Now things here don't look really nice, because we have titles. So let's remove the titles from each one of them as well. We would also like to have everything in the center. In order to do that, click on the object and switch from Standard to Entire View. Or, for example, you can go over here to More Options, then Fit, and then Entire View. For the quantity, we're going to switch it to Entire View as well. With that we have our three BANs as planned. Next we're going to have the bar chart on the left side in order to show some ranking. So let's grab our bar chart and remove the placeholder, the blank. In the next step, we're going to add the last two charts. First we have the line chart, which is going to be Sales versus Profits over here, and again I'm going to remove the blank. The last one is going to be the pie chart, Sales by Category. Let's drop it over here and remove its blank. In the next step, we're going to make sure that everything is set to Entire View. Same for the pie. All right, so as you can see, once we have a solid structure, everything else is easy. We just drag and drop things and remove the blanks. With that, we have everything. Now let's remove those borders. Let's go to the layout and go to the first one. Let's remove the border of the horizontal one, and remove this one as well, until the borders of all our containers are removed. All right, so with that we have our dashboard, and of course we can add a lot of design and a lot of customizations. For example, we can add a border for all those BANs. Let's do it quickly. We can add a gray border for each one of them in order to separate them. With that, we have built a very organized and simple dashboard in Tableau using the power of containers. 
So as you can see, it's very easy once you organize your work and do it step by step. If instead you rush things and drop your charts immediately into the dashboard without any plan, it's going to be really hard to control. The look and feel of your dashboard is also going to be really bad, especially if you want to add more elements over time. It's going to be really hard to extend your dashboard. So slow down, make a plan, then implement it using the containers in Tableau, and at the end bring in your content. All right, so that's all about Tableau dashboards. With that, we have a solid foundation in Tableau dashboards. In the next section, we're going to do a real Tableau project where you're going to learn how to execute Tableau projects step by step. 192. Tableau | Section: Tableau Project: Now we're going to work together in order to implement a Tableau project. What's special about this project is that you will not only learn how to work with Tableau, but you will also learn how I usually implement projects in big companies. I'm currently leading big data and business intelligence projects at Mercedes-Benz. That means I'm sharing with you real-life knowledge and skills on how we implement things in real projects. It's not just another online course. I'm going to take you through the project from the starting point, the user requirements, and we're going to end up with a wonderful Tableau dashboard. In the first step, we're going to analyze the user requirements. Then we're going to design and draw dashboard mockups. Then, as the first step of the implementation, we're going to prepare our data source. After that, we're going to start building the different charts. Once we have all the charts, we're going to plan our dashboard containers and then start building and designing the dashboard. So let's start first by understanding the phases, the steps of any Tableau project. 
So now let's go. 193. Tableau | Tableau Project Steps: Tableau projects are like any other projects, for example building a house. The first thing is that we have to sit with the users and understand their requirements and wishes. That means we have to analyze the user requirements. Then, before starting to construct the house, the architect creates a blueprint and the layout by defining the structure of the house and the rooms. After everything is planned, the foundation of the house is created, and this is a very crucial step in the construction. Once the foundation is finally stable, the construction starts by building the floors, walls, roof, and so on. The last phase is the finishing touches: adding doors, adding electricity, choosing the paint colors and the decorations. The phases of building a house are very similar to Tableau projects, and I'm going to show you now the different phases that I usually have in each Tableau project. In the first phase of each Tableau project, we start with collecting and analyzing the requirements. First, we have to understand the user requirements. Then we have to decide which chart types we're going to use for each requirement. Then, together with the users, we draw the first mockup of our dashboards and decide on the colors as well. Once we have understood the requirements, we can start building things in Tableau. We start by preparing the data source, and here we have the following steps. First we have to connect our data, then we have to build a data model, and as the last step we have to understand the data model and the data inside our data source. Once we have a solid data source, we can start building our charts. Here we have different steps. First, we have to check whether we have all the data inside the data source or whether we have to create new calculated fields. 
Once we create those calculated fields, we have to test them first, before we start building any charts. After that, once we have all the data that we need, we can start building the charts. Once we have the basic charts, we start formatting them by adding colors, removing the grid lines, and editing the axes and the headers. Once we have built all our charts using the worksheets, we go to the last phase, where we can start building our dashboards. For this phase, you have to slow down and plan everything step by step. Rushing through this phase will not help you at all. First we plan the whole structure of the dashboard by planning the containers. Once we have a plan, we go to the next step, where we build the foundation: we build the containers of the dashboard. Once we have a solid structure, we start adding the content to the dashboard. After that comes the step where we take care of the filters and the interactivity inside our dashboard. In the last step of building a dashboard, we add the final touches, like icons for the logo, icons for the filters, or icons for navigating between dashboards. All right, so those are the main phases of building a dashboard in Tableau. Of course, my recommendation is to take it step by step and not rush things, otherwise you're going to end up with chaos, and it can be really hard to maintain the dashboard later. So don't rush building the dashboards. Always take time for analyzing the requirements, understanding the data, planning the structure, and planning the mockups. With that, I promise you're going to deliver professional work. 194. 
Tableau | #1 Step - Requirements Analysis: All right, so I'm going to start the Tableau project from scratch, where I'm going to show you step by step how I usually implement projects using Tableau, and we start right now. The first step in each project is that we sit with the users in order to understand their requirements and wishes. We usually document the requirements in something called a user story. So now we're going to go through these requirements (I'm going to leave the link in the description), and then we're going to choose the right charts for each requirement. The user story, or the project, is about sales performance. In the introduction it says that we have to build two different dashboards using Tableau to help the managers, the stakeholders, analyze the sales performance and the customers. So that means we're going to build two dashboards inside Tableau. Let's start with the first one, the Sales dashboard. The main purpose of this dashboard is to provide an overview of the sales metrics and trends. Here it says, in order to analyze year-over-year sales performance. So that means here we are comparing two years with each other. Let's check the key requirements of this dashboard. The first one is to provide a KPI overview, where we have to display a summary of total sales, profit, and quantity for the current year and compare it to the previous year. That means in the dashboard we don't have to present all the sales. We have to present only the sales of the current year and the previous year. Now let's decide which type of chart to use. For this requirement, we can go with BANs. BANs are very useful for showing the main metrics, like the total sales, profit, and quantity, as big numbers. So for this requirement, we're going to create BANs. Let's go to the next one. 
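To make the KPI requirement concrete, here is a minimal sketch in Python of the arithmetic behind one BAN: the total for a selected year and its percentage change against the previous year. The sample rows, the function names, and the choice to return `None` when the previous year has no sales are all assumptions for illustration; in the course this logic would be built with Tableau calculated fields, which are not shown here.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical sample rows (order_year, sales); not the course data.
ORDERS: List[Tuple[int, float]] = [
    (2021, 75.0),
    (2022, 100.0),
    (2022, 150.0),
    (2023, 200.0),
    (2023, 100.0),
]


def total_by_year(rows: List[Tuple[int, float]]) -> Dict[int, float]:
    """Sum the sales per order year."""
    totals: Dict[int, float] = {}
    for year, sales in rows:
        totals[year] = totals.get(year, 0.0) + sales
    return totals


def yoy_percent(totals: Dict[int, float], selected_year: int) -> Optional[float]:
    """Percentage change of the selected year against the year before it.

    Returns None when the previous year has no sales, because the
    percentage is undefined in that case (an assumption of this sketch).
    """
    current = totals.get(selected_year, 0.0)
    previous = totals.get(selected_year - 1, 0.0)
    if previous == 0:
        return None
    return (current - previous) / previous * 100.0


if __name__ == "__main__":
    totals = total_by_year(ORDERS)
    print(totals[2023], yoy_percent(totals, 2023))
```

The same two numbers, the total and the percentage difference, are computed for profit and quantity by swapping the measure.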
We have the Sales Trends. Here we have to present the data of each KPI, meaning the total sales, profit, and quantity, on a monthly basis. So here we are talking about change over time, right, for both the current year and compared to the previous year. They also want us to identify the months with the highest and the lowest sales. That means we now have to choose a chart that presents a change over time. For this, you can of course discuss it with the users and show them different types of charts, as we learned before. For now I'm going to go with line charts, and more precisely we're going to use sparkline charts in order to highlight the max and min values. All right, moving on to the third requirement, we have the product subcategory comparison. Here we have to compare the sales of different subcategories for the current year and the previous year. It also says that we have to include the profit in the comparison. So here we are comparing multiple things. First, the subcategories with each other. We have two measures, the sales of the current year and of the previous year, and also the profit. So we can understand that we are comparing the members of the subcategories, and for that we can use bar charts. Since we have two values, the current year and the previous year, we can use, for example, bar-in-bar charts. Then for the second point, in order to compare the sales with the profit, we can present another bar chart side by side with the sales in order to show the profit information. All right, so moving on to the last one, we have the weekly trends for sales and profit. The requirement says we have to present the weekly sales and profit data for the current year. So here we are talking about change over time, because we have the time aspect, and we have to display the average weekly values as well. 
We have to highlight the weeks that are above and below the average in order to understand the trends in our charts. So here again we are talking about change over time, but on a weekly basis; before, we had it monthly. So here we can go with a line chart as well, in order to compare the sales and the profit. All right, so with that we have covered the main requirements of the sales dashboard, and we have a plan for which chart to use for which requirement. Now we're going to move to another type of requirement, the interactivity requirements. Here it says that the dashboard should allow the users to check the historical data by allowing them to select any desired year, not limited to just the current year or the last year. That means the dashboard should be dynamic, where the users select the year that they want to compare with its previous year. So it should not always be the latest year. For that, we can use parameters. Then we have the second requirement. It says that we have to give the users the ability to navigate through the dashboards very easily. For that we usually add buttons inside our dashboards in order to switch back and forth between the dashboards. The next point about interactivity is that the users should be able to filter the data using the charts, and for that we can use dashboard filters. Moving on to the last one, it's about data filters. We should allow the users to filter the data by product information, like category and subcategory, and by location, like region, state, and city. That means we have to provide all those filters inside our dashboard as well. All right guys, with that, we have covered the first two steps of our project: we understood the user requirements, and we chose the right chart for each requirement. Let's move to the third step, where we're going to build a mockup for our dashboard. 
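The weekly-trend requirement discussed above (show the weekly values, the weekly average, and which weeks sit above or below it) comes down to a small calculation. The following Python sketch shows one way to express it, under stated assumptions: the sample numbers are invented, `flag_weeks` is a hypothetical helper, and a week that is exactly equal to the average is labeled "below", which is an arbitrary choice of this sketch rather than something the requirements specify.

```python
from statistics import mean
from typing import Dict, List, Tuple


def flag_weeks(weekly: Dict[int, float]) -> List[Tuple[int, float, str]]:
    """Label each week as 'above' or 'below' the average of all weeks.

    Weeks equal to the average are labeled 'below' (assumption of this sketch).
    """
    avg = mean(weekly.values())
    flagged = []
    for week in sorted(weekly):
        value = weekly[week]
        label = "above" if value > avg else "below"
        flagged.append((week, value, label))
    return flagged


# Hypothetical weekly sales for four weeks of the selected year.
WEEKLY_SALES = {1: 100.0, 2: 300.0, 3: 200.0, 4: 400.0}

if __name__ == "__main__":
    for week, value, label in flag_weeks(WEEKLY_SALES):
        print(week, value, label)
```

In the dashboard, the average would be drawn as a reference line and the label would drive the color of each point; the same calculation is repeated separately for profit.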
This is how I usually draw a mockup for a dashboard in Tableau. As usual, it starts with the title. It's going to be Sales Dashboard, and we can also put in the title which year is currently selected, so it can be, for example, the current year 2023. Below that, we can have our BANs, right. We can have three sections, or three BANs, for the total sales, total profit, and total quantity. In each of those blocks, we're going to show the following information. First, of course, we have to show the total, so we're going to show the total sales as a big number. Then below it, we're going to show the difference in percentage to the previous year. Since we're talking about KPIs, we always have to show a symbol to indicate the performance of the current year, so it's going to be either up or down. With that, we have covered the first requirement. The second requirement is to present the data on a monthly basis and compare the current year with the previous year. For that we're going to use sparklines in order to show the curves and the progress of each line. So we're going to have two lines, one for the previous year and one for the current year. We're going to show the max and the min values using something like a circle that we can position on the lines. With that, we have covered the second requirement as well. We're going to do the same thing for each KPI, so for the profit and for the quantity as well. All right, moving on to the third requirement, we have to present the subcategory comparison. We're going to use bar-in-bar charts in order to compare the current year with the previous year. For that we're going to have the background bar present the previous year, and the current year is going to be the one in front. What is missing here is the profit. So we can present the profit side by side with the sales, on the right side. 
It will use a bar chart as well, and the profit can be positive or negative. So the next piece of information we present in this chart is the profit, side by side with the sales, as bar charts that can have plus and minus values. All right, moving on to the last requirement, we have the Weekly Trends for Sales and Profit. Here as well, we can use line charts, since it's change over time. We can have two sections, one for the sales and one for the profit. We will not bring them together in one chart, because we want to show the average line for each metric. That means we can have a reference line to show the average for the sales and another one for the profit. Then we have to highlight, using colors, the data that is above the average line and the data that is below it. All right, so with that, we have covered all the charts inside our mockup. Of course we have to add other things, like filters. Since we have a lot of filters and there will be no space inside our dashboard, I'm sure about that, we're going to have an icon in order to show and hide the filters. That means we're going to have a dedicated section where we can put all our parameters and filters, like the product filters and the location filters, and the users can hit the button in order to show or hide this section. Now we come to a very interesting part of the design of our dashboard: we have to decide on the coloring. It's very important to decide on the coloring at the start of the project so that you don't have to adjust a lot of things later. So you have to decide on the coloring as you are creating the mockups together with the users. What I usually do is use a maximum of four colors inside the dashboards. The first two colors are the basic colors, and they really depend on the background color of the dashboard in Tableau. 
If you are using white as the background color inside the dashboards, then I usually go with a very dark gray and a light gray. Those two colors are the basics that I usually use in each dashboard that I create. The other two colors really depend on the users' preferences. You can let the users decide on those two colors, or you can take them from their logo. So as you can see in the mockup, we are not only designing the chart types and the position of the charts inside the dashboard, but also the coloring of the dashboards. Now here is the final touch that we can add to our mockup: we can add a logo for the dashboards. We can also add the navigation, where we can switch to another dashboard by using buttons. As the requirements say, we have two dashboards, the sales dashboard and the customer dashboard. We can introduce two buttons in the header of the dashboard in order to switch between those two dashboards. If the user clicks on Customers, it switches to the customer dashboard, and if the user clicks again on Sales, it switches back to the sales dashboard. All right. We will not design the customer dashboard now. I'm going to leave it for you to practice. We are focusing only on the first part of the requirements, the sales dashboard. All right guys, so now we have a mockup, we have a blueprint. If the users agree on the blueprint, we can execute our plan and start building it in Tableau. We will start by preparing the Tableau data source. 195. Tableau | #2 Step - Building Data Source: All right, so far we have understood the requirements and we have a mockup for our dashboard. The next step is that we go to Tableau and start building things. All right guys, so the first step is to prepare our data source. 
As I promised you, we start from scratch, and that's why we're going to start Tableau Public empty, with nothing inside it. The first thing, of course, is that we need our data. Go to the link in the description and download the data that I left there for the project. Then we're going to connect to it. In order to do that, we go to the left side over here, so make sure you are at the home page, the start page of Tableau. Let's go to Text File. Previously we worked with the big and small data sources. Now we're going to work with the Tableau Projects Sales dashboard. Let's go inside it. Here we get files which have similar information to the old data sources. Let's select something over here and click Open. Now we are on the data source page, and as you can see, we have connected our data to Tableau. All right, the next step is that we create our data model inside the data source. Here we have to understand our data. I'm just going to remove this from here in order to have everything from scratch. So we have to understand the data inside those files to know what is a dimension and what is a fact. Let's go to the customers over here and click View Data. As you can see, we have only two columns, customer ID and customer name. This is a dimension; it doesn't have any facts. That means the customers table is a dimension. Let's close it and go to the next one. We have the locations, so let's go inside and check the data. As you can see, we have city, country, region, state, and so on. This is dimensional information as well, because we don't have any events inside it, so it's not really a fact. Let's close it. Let's check the third one, the orders. Now we can see over here that we have some IDs, like the customer ID, order ID, and product ID. 
Then we have some dates, for example the order date and the ship date, and as well some numbers like sales, quantity, profit and so on. So this is an indicator that this table is a fact, because we have a lot of measures and we have dates, which indicate that this table contains events. So once you see such a setup where you have IDs, dates and measures, this is a big indicator that the table is a fact. So the orders are facts. Let's go to the last one, the products. We can see that we have the product ID, category, product name and so on. This information is dimensional. So that means this table, the products, is a dimension table. All right, so now we have an overview of our data and we can start modeling in the Tableau data source. We can start by dragging and dropping the fact. That means we're going to get the orders and put it in the data model over here. After that, we start bringing all the dimensions into the data model. Let's take the customers, for example. Just drag and drop it over here as a relationship. Now as you can see, Tableau is going to create a relationship. It's very important to check the relationship. As you can see, we have customer ID equal to customer ID, which is correct. We will leave all the other options over here under performance as default, since we don't deal with performance now. First we have to build stuff and then check whether the performance is bad or good. At the start, leave everything as default. Let's go to the next one. Get the locations, and drag and drop it over here as well. And we're going to check the relationship as well; it's going to be postal code equal to postal code as the key. And the last one: we're going to get the last dimension, the products, and drop it into the data model as well. We can check the relationship. As you can see, we have product ID equal to product ID.
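The data model described above (one fact table, three dimensions, each related through a shared key) can be sketched outside Tableau. The following is only an illustration in Python: the rows are invented, the helper names `index_by` and `attach_dimensions` are hypothetical, and only the key columns (customer ID, product ID, postal code) come from the transcript. Tableau relationships are resolved when a view queries the data rather than being materialized like this, so the sketch shows the key matching, not Tableau's internals.

```python
# Hypothetical miniature versions of the four files; column names follow the
# transcript, the rows themselves are invented for illustration.
orders = [  # fact table: IDs plus measures
    {"Order ID": "O-1", "Customer ID": "C-1", "Product ID": "P-1",
     "Postal Code": "10001", "Sales": 120.0},
    {"Order ID": "O-2", "Customer ID": "C-2", "Product ID": "P-2",
     "Postal Code": "94105", "Sales": 80.0},
]
customers = [  # dimension
    {"Customer ID": "C-1", "Customer Name": "Ann"},
    {"Customer ID": "C-2", "Customer Name": "Bob"},
]
products = [  # dimension
    {"Product ID": "P-1", "Category": "Furniture"},
    {"Product ID": "P-2", "Category": "Technology"},
]
locations = [  # dimension
    {"Postal Code": "10001", "City": "New York", "Region": "East"},
    {"Postal Code": "94105", "City": "San Francisco", "Region": "West"},
]


def index_by(rows, key):
    """Build a lookup from a key value to its dimension row."""
    return {row[key]: row for row in rows}


def attach_dimensions(facts, dimensions):
    """Return fact rows enriched with the attributes of each dimension.

    `dimensions` maps the shared key column to that dimension's rows.
    """
    lookups = {key: index_by(rows, key) for key, rows in dimensions.items()}
    enriched = []
    for fact in facts:
        row = dict(fact)
        for key, lookup in lookups.items():
            # A fact row without a matching dimension row keeps only its own columns.
            row.update(lookup.get(fact[key], {}))
        enriched.append(row)
    return enriched


model = attach_dimensions(
    orders,
    {"Customer ID": customers, "Product ID": products, "Postal Code": locations},
)
```

Each enriched row still corresponds to exactly one order, which is why the fact table is placed first and the dimensions are attached to it.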
All right, so we have our data model where we have one fact, and all the dimensions are connected to this fact. Now in the next step I'm going to start changing the names around. For example, let's rename our data source to Sales Data Source. Then we're going to go to the table names and remove the CSV. Right click on it and rename; let's remove the extension. And do the same for everything, just to have a nice data model. With that we have very nice naming in the tables. All right, so that's the renaming. In the next step we're going to check the data types of the fields, whether they are correct or not. Sometimes, if you have bad data quality from the sources, you will get strange data types, which can cause a lot of issues later if you don't check the data quality at the start. So let's do it quickly. We're going to go to the products. As you can see, everything here is characters and the data type is string, so everything is fine for the products. Let's go to the locations. Now we can see that all this information is geographical. And as you can see, all the data types are correct except the region over here. So we can switch it to a region: let's click on it and go to Geographic Role. Here we have the type Country/Region. Let's select that. Next, the customers: we can see that all of the fields contain characters and have the data type string, so everything is fine as well. Let's go to the orders. Here we have a lot of fields. What is very important here is to focus on the date fields. As you can see, the order date and the ship date both have the data type date, which is really perfect. In many situations I see a lot of date information where the data type is string, and that's because we have corrupt data inside those fields. And now the next important thing to check inside our data: we have to check our numbers.
So let's make sure that all our numbers have the data type number. As you can see, all our fields have the data type number. And this is really important, because we want those numbers to be continuous measures in order to build the charts. If any of this information comes in as a string, what can happen is that Tableau will think it is a dimension. And then you cannot use it in your visuals to do aggregations like sum and average, because it's a dimension. So that's why it's really important to check that all your numbers have the data type number, in order to have them as continuous measures. All right, so with that we have a very good and solid data source. The next step is that I go and try to understand the data before I start building visualizations. So let me show you what I mean. Let's go to the worksheet page and just randomly check the data inside the data source. All I want now is to get closer to the data, to the content of those tables. Because normally in projects we have a lot of tables. If you don't understand the content of the tables, it can be really hard to find your information and build the correct charts. I know that you have practiced with most of this information before, but I wanted to show you the steps that I usually follow in projects in order to build really nice visualizations. So now I go, for example, and check: okay, what is category? Which values are inside it? With that, I can see that we have three values. That means we have low cardinality inside the category. Then I check another example. Let's say the subcategory: drag and drop it, and I can see that there's a hierarchy between those two dimensions. Then I go and take something else, like the segment over here. Now we can see that we have a lot of duplicates inside the data. Which means maybe there's no relationship between those two dimensions and the segment.
If I drag it to the start, there are still duplicates, so there's no relationship between this information. So I remove those fields. I can see we have three segments. Those are actually segments of the customers and not of the products. As you can see, step by step, we are learning the data inside our data source. Then the next question, which is interesting: do we have a lot of countries inside our data source? So let's drag and drop the country. As you can see, we have only one country. This data is about the USA. Then, which regions do we have inside the data? We have four regions, and states and so on. So as you can see, I'm just browsing the data. This is a really important step in order to understand the business and start discussions with the users of the dashboards that you are creating: reading your data and understanding your data before creating any charts or visualizations. All right, so once we are done browsing and understanding the content of our data, we can go to the next step, where we're going to start building our charts. 196. Tableau | #3 Step - Building Charts: All right, so now we're going to start implementing the requirements by creating the charts. And we're going to start with the first chart, where we're going to build the BANs. The requirement says: display a summary of total sales, profit and quantity for the current year and the previous year. Let's not forget the other requirement: the dashboard should allow users to check historical data by offering them the option to select the desired year as the current year. Now let's start with the first BAN, where we're going to focus on the total sales. Let's go to our data. Let's go to the orders and check the information that we have inside the sales. Let's drag it to Text over here. With that we get the total sales inside our data for all years. But the requirement says we have to show the total sales for the current year.
So let's take, for example, the order date and put it on the rows over here. As you can see, now we have the sales for all years and not only for the current year. That means I need a field that shows only the sales for the latest year, 2023. In order to do that, we have to create a new calculated field. So let's go and do that. We're going to call it Current Year Sales. The function is really easy. We're going to check whether the year is 2023. If it's true, then we're going to show the sales. Otherwise we will show nothing. For that we're going to use the IF condition. So let's use that. Then what we need is the year of the order date, since the condition is based on the year. So if the year equals 2023, then what happens? We get the sales, right? Otherwise, if it's not 2023, I don't want anything, so it's going to be null. So that's it; let's end it. Again, the logic is very easy. We are checking the year of the order date. If it is 2023, then show the sales. If it's false, then don't show anything; it's going to be null. So let's hit OK. With that we've got a new calculated field, the current year sales. Let's drag it to the view over here to check the data. Now as you can see, this field is showing us only the sales for the current year, 2023. That's the first field, but the requirements say we need to show the sales of the previous year as well. That means we have to show the sales of 2022. In order to do that, we again have to create a new calculated field to fulfill this requirement. So let's go to the current year sales and duplicate it in order to create the new calculated field. Let's edit it. What we're going to do now is really simple. Instead of having 2023, we're going to make it one year less, so it's 2022. All right, so let's hit OK. With that, we have the previous year sales.
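The logic of the two calculated fields described above can be sketched in Python. This is a minimal illustration under stated assumptions: the order rows are invented, the names `sales_for_year` and `total` are hypothetical, and the Tableau expression quoted in the comment is a plausible reading of what is typed in the video, not a copy of it.

```python
from datetime import date

# Invented sample rows; only the idea of an order date and a sales amount
# comes from the transcript.
orders = [
    {"order_date": date(2021, 5, 3), "sales": 100.0},
    {"order_date": date(2022, 7, 9), "sales": 250.0},
    {"order_date": date(2023, 1, 15), "sales": 300.0},
    {"order_date": date(2023, 11, 2), "sales": 50.0},
]


def sales_for_year(row, year):
    """Row-level check: return the sales if the order falls in `year`, else None.

    Roughly what an expression such as
    IF YEAR([Order Date]) = 2023 THEN [Sales] END
    does: without an ELSE branch, non-matching rows yield null.
    """
    return row["sales"] if row["order_date"].year == year else None


def total(values):
    """Sum while skipping nulls, as an aggregate SUM would."""
    return sum(v for v in values if v is not None)


current_year_sales = total(sales_for_year(r, 2023) for r in orders)
previous_year_sales = total(sales_for_year(r, 2022) for r in orders)
```

Because non-matching rows become null instead of zero, they simply drop out of the aggregation; the year filter lives inside the field itself rather than in a filter on the view.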
Now let's check the values. I'm just going to take it and put it here in between those two values. With that, as you can see, we have the previous year sales, the sales of 2022. So now we have the two main calculations for the project: the current year and the previous year sales. How do we make those two fields dynamic? We can use parameters in Tableau. Now, before we create the parameter, we have to create one more calculated field in order to have the years of the order date, so that later we can use it inside the parameter. So let me show you what I mean. Let's create a new calculated field. Let's call it Order Date Years. Then we can use the function YEAR, and inside it we're going to have the order date. This field always returns the year of the order date. That's it. Let's hit OK. Now we're going to create our parameter. Right click over here and choose Create Parameter. We have to give it a name; it's going to be Select Year. The data type is going to be integer, since it holds years, so there is no float. Now we have to define what is allowed to be used as a value inside this parameter. If you leave it at All, then the users can insert anything, which is not really good, because then the users have to guess which years we have inside our data. Instead of that, we have to give them a predefined list of all the years that we have inside our data. For that, we're going to select List over here. Then the values inside this parameter are going to come from the new calculated field that holds the years of the order date. Let's go over here to Add values from, then pick our new calculated field. This is really good. First, because it is automatic, you don't have to add all those years manually. And second, later you may get a new year inside your data.
And you don't have to add this information manually; it's going to be added to the list automatically. We are almost done, but I'm not really happy with the format. As you can see, we have the thousands separator here. Let's go to the display format; what we're going to do is go to Number (Custom) over here. Let's remove all those decimal places, set the display units to None, and remove the thousands separator as well. All right, so that's all. Let's click over here. As you can see, we now have the years without any separator. The last thing is that we have to make the current value the latest year. Let's go to the current value over here and select 2023. That's all for this parameter. Let's hit OK. As you can see, we have it on the left side. Now let's show the parameter to the users: right click on it and choose Show Parameter to add it to the view. Now the users can go over here and start selecting the current year. As you can see, if I'm selecting the years, nothing is changing inside our view. That's because we haven't yet linked this parameter to the calculation. And this is exactly our second step. Let's go and do that. Let's go to the current year sales over here and edit it. Now, instead of the static value 2023, we're going to add our parameter. Let's write the name of the parameter; it is Select Year, and that's it. So what we are saying now is: if the year of the order date equals the selection from the user, then show the sales; otherwise show nothing. Let's hit OK and try that. Let's focus on the current year sales and change the value to 2022. As you can see, the current year for the sales is now 2022. And the same if you go over here and make it 2021.
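The parameter set up above (an integer restricted to the years present in the data, defaulting to the latest year, and replacing the hard-coded 2023) can be sketched as follows. The data and the function names `allowed_years`, `default_year` and `current_year_sales` are hypothetical; rejecting a value outside the list stands in for Tableau only offering the listed values in the parameter control.

```python
from datetime import date

# Invented sample rows for illustration.
ORDERS = [
    {"order_date": date(2021, 5, 3), "sales": 100.0},
    {"order_date": date(2022, 7, 9), "sales": 250.0},
    {"order_date": date(2023, 1, 15), "sales": 300.0},
    {"order_date": date(2023, 11, 2), "sales": 50.0},
]


def allowed_years(orders):
    """The predefined list: every distinct order year found in the data."""
    return sorted({o["order_date"].year for o in orders})


def default_year(orders):
    """The parameter's current value: the latest year in the list."""
    return allowed_years(orders)[-1]


def current_year_sales(orders, select_year):
    """Total sales for the year the user selected through the parameter."""
    if select_year not in allowed_years(orders):
        raise ValueError(f"{select_year} is not one of {allowed_years(orders)}")
    return sum(o["sales"] for o in orders if o["order_date"].year == select_year)
```

Calling `current_year_sales(ORDERS, default_year(ORDERS))` reproduces the original static behaviour, while passing 2022 or 2021 mirrors the user changing the selection.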
So as you can see, everything is dynamic, and the users can now select the current year. Now the next step: we're going to integrate the parameter into the previous year as well. Let's go to the previous year and edit it. It's the same thing: instead of 2022, we're going to use Select Year. But since we are talking about the previous year, we're going to subtract one year. That's it. Let's hit OK and test again. With 2023, everything is fine. Let's switch the current year to 2022. Now we can see that both of those values react to our selection: the previous year is 2021 and the current year is 2022. With that we have completed the first requirement in our user story, where the users can decide which year is the current year. And we made it completely dynamic using parameters. All right, so with that we have our main calculations for this project, where we have the current year and the previous year sales. Now the next step: as we decided in the mockup, we want to show the difference between the current and the previous year. And we're going to have it as a percentage in order to show the KPI. Let's create a new calculated field, and we're going to call it percent difference sales. The calculation is really easy: we're going to subtract the previous year sales from the current year sales. But since we want to present it as a percentage, we have to divide it by the previous year. Let's add opening and closing brackets around the difference, divided by the sum of the previous year sales. With that, we get the percentage difference between the current year and the previous year sales. Let's hit OK. With that we've got our new calculated field. Now we're going to change the format to percentage. Right click on that.
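The percent-difference calculation described above, (current year minus previous year) divided by previous year, can be written out as a short sketch. The yearly totals are invented and the names `percent_difference` and `kpi_for` are hypothetical. Returning `None` when there is no previous-year value is a choice made for this sketch so that it never divides by zero.

```python
def percent_difference(current, previous):
    """(current - previous) / previous, or None when previous is missing or zero."""
    if not previous:
        return None
    return (current - previous) / previous


def kpi_for(yearly_sales, select_year):
    """Current-year total, previous-year total and their percent difference.

    The previous year is always the selected year minus one, as in the transcript.
    """
    current = yearly_sales.get(select_year, 0.0)
    previous = yearly_sales.get(select_year - 1, 0.0)
    return current, previous, percent_difference(current, previous)


# Invented yearly totals for illustration.
yearly = {2021: 200.0, 2022: 250.0, 2023: 300.0}
current, previous, diff = kpi_for(yearly, 2023)
```

A positive result means the selected year is ahead of the year before it; a negative result means it is behind, which is what later drives the direction of the KPI arrow.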
And then let's go to Default Properties, Number Format. Now let's go to Percentage, and let's have only one decimal. Let's hit OK. Now, in order to show those values here, let's remove the year. And let's check the value of the difference between the current and the previous year. With that, as you can see, the difference between the current year and the previous year is around 29%. Again, we can change our parameter to see whether everything is working fine. So let's go to 2023. As you can see, the difference now is only 20%. All right. With that we have almost everything that we need in order to build our first BAN. I'm going to name this first sheet Test, since it is just for testing the data. Let's create a new worksheet, KPI Sales, and we can start building our first chart. Now if you check our mockup, the first part of our KPI is the BAN, where we have the big numbers, and the second part is the sparkline. Here we have two options. Either we make a dedicated sheet for each section, or we make everything in one sheet, so the whole KPI is in one sheet. And that is what we're going to do. The title is going to be the BAN: we're going to put all the information of the BAN inside the title, and then inside the view we're going to build our sparkline. Let's start with the BAN first. The information we need is the current year sales. Let's drag it to Detail. Then the second piece of information that we need is the difference of sales. So let's drag it to Detail over here as well. And that's it for now. Let's go to the title now and start building the BAN; double click on the title. In the first line we're going to give the name of the measure, so it's going to be Total Sales. Then the second piece of information is going to be the current year sales.
So let's go to Insert over here and add the sum of the current year sales. The third piece of information is going to be the difference. So, on a new line, let's add our calculation, the difference of sales. Now let's hit Apply in order to see the information. As you can see, now we have Total Sales, then the total number of sales for this year, and at the end we have the difference. Now we're going to start formatting this BAN. We're going to go over here to Total Sales. Let's set the font to Tableau Book. Then let's reduce it a little bit more, to 14. Next we're going to go to the total and make it really big. Let's select it. Let's set the font to Tableau Bold. Then let's increase the font size and make it bold as well. Here we really have to make it big. Let's hit Apply, just to check the numbers. As you can see, a small Total Sales label, then a big number, which is really great. Now for the next one, we can select it. Let's choose, for example, Tableau Semibold, and then set the size. Then we're going to add the text "vs. previous year". All right, let's hit Apply. Now, everything looks fine, but this text is not really relevant information, and it looks very bold. So let's go over here and change the font back to Tableau Book, and let's change the coloring as well, to something like this, a really light gray. As you can see, everything looks fine. Now let's change the coloring and the format of the other text as well, because it is not really relevant information either. So we're going to go over here and change it again to Tableau Book. Then let's go to the coloring and make it a little bit light gray. Let's hit OK. Now you can see that our BAN looks really nice. What I'm going to do next is change the format of the total sales, right?
Click on the current year sales, and then let's go to Format. Then, instead of the Axis tab, let's go to the Pane tab over here and go to the number format. Let's go to Number (Custom), remove the decimal places, set the display units to thousands to make it easier to read, and add the dollar sign as the prefix. Now things look more professional: we have the dollar sign, and the number is rounded to thousands. All right, so what is still missing inside our KPI? If you look at the mockup, we decided to add a KPI symbol. We need an icon to indicate whether the sales are going up or going down. In order to do that, we're going to change the format of the difference. So let's go to the difference, to Format. Then let's go to the number format over here, and go to Custom. Then we're going to add the following format in order to indicate the KPI. I will leave this format in the description as well so that you can copy and paste it. Here is what we are saying: if the percentage is a positive number, the arrow is going to be up. If it is a negative number, it is going to be down. And of course, if you want to add more decimals to the percentage, you can go over here and add a zero. As you can see, once I add a zero, the format changes. But here I would like to have only one decimal. All right, so that's all. As you can see, we now have a really professional BAN where we have the total sales of the current year, and as well the difference between the current year and the previous year using a really nice KPI indicator. Of course, we can test it. Let's show the parameter on the right side. Let's go, for example, to 2022. As you can see, everything is changing perfectly. Now 2021. And you can see the arrow is down, because the previous year was higher than the current year. Perfect. With that, as you can see, we have created the BAN inside the title.
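The two display formats described above (sales shown in thousands with a dollar prefix, and the percent difference shown with an up or down indicator and one decimal) can be mimicked in Python to make the behaviour concrete. The exact custom format string used in the video is not shown in the transcript, so this sketch only reproduces the described outcome; the arrow glyphs, the `K` suffix, the treatment of zero as "up", and the function names are all assumptions.

```python
def format_sales_in_thousands(value):
    """Show a sales total rounded to thousands with a dollar prefix, e.g. $733K."""
    return f"${value / 1000:,.0f}K"


def format_kpi(pct, decimals=1):
    """Prefix a percentage with an up or down indicator.

    Non-negative values get the up arrow, negative values the down arrow.
    Treating exactly zero as "up" is a choice made for this sketch.
    """
    arrow = "▲" if pct >= 0 else "▼"
    return f"{arrow} {pct:.{decimals}%}"


sales_label = format_sales_in_thousands(733215.0)
up_label = format_kpi(0.2)
down_label = format_kpi(-0.123)
```

Raising `decimals` corresponds to adding another zero to the custom format, as demonstrated in the video.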
Now in the next step we're going to create the sparkline. All right, so let's build our sparkline. It's going to be based on the months; don't forget the requirement. It's to show the current sales by month and compare them to the sales of the previous year. So first let's switch the parameter to 2023. Let's drag our order date to the columns. Now, instead of having years, let's switch it to months. Then we can grab the first measure. It's going to be the current year sales. Let's put it on the rows. Now, instead of having a discrete line, I would like to have a continuous line. So let's go to the month over here, right click on it, and switch it to continuous. Now we want to compare it to the previous year. In order to do that, let's get the previous year sales. Since both of the charts are going to be line charts on top of each other, we're going to use Measure Names and Measure Values. So let's drop it on the axis over here. Now you might notice that we have broken our BAN: here we now have a range between the lowest value and the highest value. We don't want that, but we will fix it later. Don't worry about it. For now let's keep focusing on the sparkline, where we have our two lines. What is missing is to highlight the highest value and the lowest value of the current year. In order to get those two circles on top of our view, we have to add another measure. But first we have to calculate it using a calculated field. So let's create a new calculated field, and we're going to call it min max sales. Now we're going to search for the highest and the lowest values of the sales. In order to do that, we're going to check a condition using IF and ELSE statements. So let's start with the first one.
We're going to say: IF the sum of the current year sales, and now we're going to check whether this value is the highest among all the other current year sales. So we can use the function WINDOW_MAX, since we are searching for the highest value. Inside it we are comparing all the current year sales. So we are just checking whether you are the highest value. If it's true, then what happens? Then show the value of the current year sales. That means if you are the highest value, then show yourself, show the value. Otherwise, we're going to search for the lowest value: ELSEIF. We're going to take the same stuff, the sum of the current year sales equals, but now instead of WINDOW_MAX we're going to use WINDOW_MIN. I'm just going to copy everything from here and replace the max with min. Now what happens if you are the lowest value? We're going to do the same: show yourself. So we're going to show the value of the current year sales as well. Otherwise we don't want to see any value, so we simply end the statement. That's it; the calculation is valid. Let's hit OK. We have it as a new field, but I would like to test whether it's working instead of throwing it straight into the visual. Let's go to another sheet. Let's drag the order date to the rows and switch it to month. I just want to check whether everything is fine. Let's drag the current year sales to the view. With that, we have the sales of each month. Now let's grab the new calculated field, the min max, and drop it over here. Now let's check the table. What is the lowest value? It's February; as you can see, we have the min. And what is the highest value? It is November. So as you can see, this calculation is working.
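The min/max field described above keeps a month's value only when it equals the highest or the lowest value across all months in the view, and returns nothing otherwise. A minimal Python sketch of that logic follows, with invented monthly totals and a hypothetical function name `min_max_marks`; `max` and `min` over the whole series play the role of `WINDOW_MAX` and `WINDOW_MIN`.

```python
def min_max_marks(monthly_sales):
    """Keep only the highest and lowest monthly values; everything else is None.

    Mirrors the IF / ELSEIF structure from the transcript: a month "shows
    itself" only when its value equals the window maximum or window minimum.
    """
    highest = max(monthly_sales.values())
    lowest = min(monthly_sales.values())
    marks = {}
    for month, value in monthly_sales.items():
        if value == highest:
            marks[month] = value
        elif value == lowest:
            marks[month] = value
        else:
            marks[month] = None  # no ELSE branch in the calculation, so null
    return marks


# Invented monthly totals for illustration.
monthly = {"Jan": 40.0, "Feb": 10.0, "Mar": 55.0, "Nov": 120.0, "Dec": 90.0}
marks = min_max_marks(monthly)
```

Listing the result next to the monthly totals is the same check performed in the test table: exactly two months carry a value, one for the minimum and one for the maximum (more only if several months tie).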
My recommendation for you: if you are creating something complicated, always test it in a table in order to see the numbers before you switch to circles or lines. With tables we can validate better. Let's go back to our KPI Sales sheet, grab our new field, min max sales, and drop it on the rows. With that, we get a new chart, because we have a new measure over here. We also have a new tab in the Marks card for the min max. Now let's go to this tab in order to configure the min max. Instead of Automatic, we want to have circles, and we're going to make them a little bit bigger in order to see them. Here we have the min and the max. Now let's go to the first chart. We're going to switch it over here and make sure that, instead of Automatic, it's a line, because next we're going to merge those two charts into one. In order to do that, we're going to use the dual axis. Right click on the min max over here and choose Dual Axis. On the right side, let's just hide the axis over here. As you can see, we now have those circles on top of our line charts. With that, we are highlighting the highest and the lowest value inside our sparkline. Now we have our sparkline, so let's go back to our BAN and fix it. As you can see, we have a range. That's because inside the view we are using the month as a continuous field, and Tableau is going to show the value as a range. This is the disadvantage of having everything in one chart, where the parts are related to each other. We're going to fix it by doing the following. We're going to use a trick in order to make the value fixed, so that it does not react to the things that we have inside our view. Let's double click on the first one. We're going to add brackets at the end, and add them at the start as well. And let's hit OK.
And nothing has changed yet, because we have to go inside the title and change it there, but let's keep changing those fields first. Let's go to the second one: double click on it, add brackets at the end, and add them at the start. So let's hit OK. Now in the next step we're going to go inside the title and start fixing it. Double click. As you can see, we have missing fields, because for Tableau these are new fields. Side by side, I'm going to add the sum of the current year sales, and then I'm going to remove the missing field. The same thing for the second one: we're going to add the difference and remove the missing field as well. We have to change the coloring again from red, because it was a warning; let's make it black, for the second one as well. All right, so let's hit OK. Now as you can see, everything is back to normal and we have our BAN again. All right, so with that, we have built our chart. The next step is to format it in order to make it a beautiful chart, right? And this includes a lot of stuff like removing the lines, removing the grids, removing the headers and axes, adding coloring, and simplifying everything, right? So let's start with the easy stuff, where we're going to remove those grids and those lines. Right click here on the empty space and go to Format. Then we're going to go to the left side over here. Let's go to the lines. Let's set the zero lines to None. Let's go to the rows and remove the grid lines as well. As you can see, we don't have any lines here in the middle. Let's go to the borders over here, go to the sheet, and start removing everything; any line should be None. With that, we are removing everything inside our grid. All right. As you can see, we have cleaned up all those lines inside our chart and everything looks really clean. In the next step we're going to work on the axes and headers. Let's remove the axis over here.
So right click on it and let's hide the header. Now you might ask why we are removing a lot of stuff. That's because in dashboards, if you add a lot of information, you're going to distract the users. And they will not focus on the important stuff, which is the trend inside the view. So we have to reduce the information and present only what is relevant. Here we really have to be very minimalist in the design. What is left now is the months over here. So right click on the axis and go to Edit Axis. We want to remove the title from it, so let's remove that as well. Next we're going to indicate that this information is months: right click on it and choose Format. Then let's go to the dates over here and choose Abbreviation. You can see that we now have the abbreviation of each month. Let's close this. Now the goal is to show the users that this sparkline is based on months, and we don't want to show all this information. It's enough to show only a few values. So I would like to show only January and December, and remove all the other information. Once you see January and December, you immediately understand that this is based on months. So we're going to edit the axis again and change it. Let's go to the tick marks over here and go to Fixed. Next we're going to change the ticks: it's going to start from January, and it's going to show the value of December after an interval of 11 values, so it shows the last month. As you can see, we are now showing only January and December, and everything in between is not shown. So that's it. Let's close it. We also have those nulls; let's remove them. Right click and choose Hide Indicator. Now as you can see, we have everything cleaned up and we have only the line charts, and here we are indicating that it's based on months. What is left now is the coloring of our charts.
So as I said, I'm following only four colors here. Here we have our basic colors, but now let's change them. We're going to change the lines: let's go to the lines over here and start working on the colors. We'd like the current year sales to be a very dark gray, and the previous year to sit in the background as a light gray. In order to do that, let's double click on the first value. Now we can add our colors inside the custom colors over here, in order to configure them only once and keep using them in all the other charts. Let's start configuring the colors. Click on the first cell over here, and make sure you are selecting it. Then let's make it something like this, a very dark gray. Next, we're going to click Add to Custom Colors. With that, as you can see, we have defined the first color. Let's hit OK. Now let's go to the previous year sales and make a new color as well. Let's go to the cell beneath it over here, and let's make it something like this. It's going to be the light gray. Let's make it a bit lighter. All right, something like this. Let's click Add to Custom Colors and hit OK. All right, now let's hit OK again. With that, as you can see, the current year is the black one, or the very dark gray, and in the background we have the previous year sales. Next we're going to change the coloring of those two circles. So let's go to the min max in the Marks card over here, grab the min max sales by holding Control, and put it on Color. All right, now let's go to Color. Instead of Automatic, let's switch it to Custom over here, the last one. Then we're going to change the steps to only two steps.
So now we're going to start with the right color, where we define the max value. Let's go inside, and now we can define our third color. Click on an empty cell over here and add the code of our third color, the turquoise. Then let's add it to the custom colors over here. As you can see, we have our third color. Let's click OK. Now we have to define the left color, which is going to be the min value. Click on it, and we're going to define our fourth color. Click on the empty cell over here, add the code for the orange, and then add it to the custom colors. With that, we've got our four colors that we can use in all our charts inside this project. That's it. Let's hit OK, and OK again. Now as you can see, we've got our two circles, the highest value and the min value, using our coloring. The last touch that I'm going to add to this chart is to reduce the opacity of those two circles. Let's go to Color over here and reduce it from 100 to something like 70%. That's it. All right, so the next step after formatting our chart is to work on the tooltip. If you hover anywhere over the lines, you can see that we have a tooltip, and it's not really nice. As you can see, it looks like calculations and is not human readable. What we're going to do now is edit that information. In order to do that, let's go to Tooltip over here in the Marks, and then we get this box. As you can see, this window is very similar to editing a title or any other text in Tableau. Here you have two different types of text. The text that is not highlighted is static, and the text that is highlighted with this light gray background comes from the chart. What we're going to do, we're going to remove all that information and start creating our own tooltip. 
Let's start with the first one: "Sales", and then we're going to have "of". Then we're going to add the month. Let's go over here to Insert and insert the month of the order date. Here we also want to add the current year. We could use, for example, the parameter for the selected year, but then we would have a problem when we show the sales of the previous year. So in order to show the years inside the tooltip, we're going to create some calculated fields. Let's just close this, and we will come back to it later. Now just check the tooltip. As you can see, we get "Sales of March", "April", and so on, so we don't have a lot of information yet. Now let's create the calculated fields. We're going to call the first one Current Year, and it's going to be really simple. It's the value that the user selected from the parameter, that is Select Year. That's it, okay? As you can see, we have the Current Year in the data pane. Let's create another one for the previous year: Previous Year. It's going to be Select Year as well, but this time we're going to subtract one year from it. That's it, so let's hit OK. Now I would like to change them to dimensions, because they are not measures. Right-click on Current Year and change it to a dimension, and do the same for Previous Year. Let's convert both of them to dimensions. All right, so now we're going to grab all the information that we need in the tooltip and drop it on this box over here, the Tooltip. The same for the previous year: just drag and drop it on top of this box here. Let's also bring in the current sales, the previous sales, and the difference between them. All right, so now we have all the information that we need for the tooltip. Let's go inside the Tooltip and start configuring it. Let's go over here. Now, after the month, we're going to have a comma. 
And then let's mention the year. It's going to be the current year, this one over here. All right, after that, let's have a colon. Let's go and insert the current sales. Go to Insert, and now make sure to select the current year of sales, this one over here, and not the fixed one. The fixed one is what we use in the title, but in the tooltip we would like to show the sales of the current month. In order to do that, we select the sum of the current year sales without any FIXED. So let's select that. Now we're going to do the same for the previous year. "Sales of", then again the month, so let's grab the month. Then a comma, and then we add the previous year, so it's going to be this one over here, Previous Year. Then a colon, and then let's get the sales of the previous year. Okay, the next line is going to be the sales difference. Let's say "Difference", then a colon, and now let's add the difference here. Again, make sure not to use the fixed one that we have inside the title. Let's get the variable one, the one that we added from the data pane, this one. All right, the last information that we're going to show inside our tooltip is the min and max values. "Highest/Lowest Sales", then a colon, and let's grab our measure. It's going to be the Min/Max Sales, so let's select that. All right, that's all the information that we want inside our tooltip. Let's hit OK and check the results. For example, let's go to a point over here. Now we can see that the sales of the current year for the month of November are at this value, and it can be compared with the sales of the previous year for the same month. Then we can see the sales difference and which value is the highest or lowest. 
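To make the tooltip logic above easier to follow, here is a small Python sketch of the same idea. It is not Tableau calculation syntax. The function name and the sample numbers are invented for illustration, and the sign of the difference (current minus previous) is an assumption, since the lesson only says "difference". The Min/Max line is left out because its calculation is not shown in this part of the lesson.

```python
def tooltip_lines(month, selected_year, current_sales, previous_sales):
    """Build the first three tooltip lines from the selected year."""
    current_year = selected_year        # the value picked in the year parameter
    previous_year = selected_year - 1   # one year before the selection
    difference = current_sales - previous_sales  # assumed sign convention
    return [
        f"Sales of {month}, {current_year}: {current_sales}",
        f"Sales of {month}, {previous_year}: {previous_sales}",
        f"Difference: {difference}",
    ]

# Illustrative numbers only.
for line in tooltip_lines("November", 2023, 1200, 1000):
    print(line)
```

Deriving the previous year from the same selected value is what keeps both tooltip lines in sync when the user changes the year.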
So now as you can see, as we move to different months, the values inside the tooltip change. But as you can see, the format and the design of our tooltip are not really nice, right? For example, we have the thousands dots, and everything is bold, so it's not really easy to read. The alignment of the values is not really nice either. So now we can go and format it. All right, let's start with formatting the current and the previous year. Let's go to Current Year, open Default Properties and then Number Format, and set it to Custom. Let's reduce the decimals to zero and also uncheck Include thousands separators. All right, now let's hit OK and test. As you can see, 2023 doesn't have any dot now. Let's do the same for the previous year. Go to Default Properties and then Number Format. Choose Number (Custom), reduce the decimals, and remove the thousands separator. Next, we're going to adjust the format of the sales numbers. As you can see, the current sales have a different format than the previous sales. In order to fix that, let's go to the previous sales over here and right-click on it. Let's go again to Default Properties, Number Format, and again choose Number (Custom). Let's remove the decimals, set the display units to thousands, and add the dollar sign as a prefix. Then hit OK. Now let's check again. We can see that both numbers have the same format. Let's check the max and min. You can see that the max and min have the same problem. Let's go to the Min/Max value, again to Default Properties, Number Format. Choose Custom, remove the decimals, add the dollar sign, and don't forget to set the units to thousands. Let's hit OK. All right. 
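The two number formats configured above can be mimicked in a few lines of Python. This is only a sketch of the intended result: the function names are hypothetical, and the "K" suffix for the thousands unit is an assumption about how the formatted value is displayed.

```python
def format_year(value):
    """Year format: no decimals and no thousands separator."""
    return f"{value:.0f}"

def format_sales_thousands(value):
    """Sales format: dollar prefix, no decimals, displayed in thousands."""
    sign = "-" if value < 0 else ""
    return f"{sign}${abs(value) / 1000:.0f}K"

print(format_year(2023))              # 2023
print(format_sales_thousands(79400))  # $79K
```

Applying the same function to current sales, previous sales, and the min/max value is the code equivalent of giving all three measures the same default number format.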
So now all our numbers have exactly the same format, and what we're going to do next is format the text. Let's go back to the Tooltip over here. All right, we're going to work with two colors, the light gray and the dark gray. Let's select the first part, where we have text and not a value. This is going to get the light gray. Let's pick this color over here, and let's remove the bold as well. All right. Now let's do the same for all the other labels. We select the text, give it the light gray, and remove the bold. The same for the next labels. All right. Next, the values. As you can see, they already have exactly the color that we need, and they are bold. Make sure that every value has the dark gray and is bold as well. Everything so far is fine. Let's hit OK and test. Let's hover over here. Now as you can see, it's really easy to read, because we have different coloring for the text and the values. All right, so the last thing that we're going to do inside the tooltip is change the alignment of the numbers. As you can see, all those numbers start from different positions. Let's go and change the alignment. In order to do that, let's go again to the Tooltip. What we can do is add a tab directly after the colon, and make sure there are no white spaces. Let's go over here to the first line and add a tab. Now let's go to the second one. I believe we have an empty space here, so let's remove it and add a tab. All right, for the next one, I believe I have a space too. Let's remove it and add a tab. And for the last one, the same thing: remove the space and add a tab. The tab will automatically align all those numbers. That's it, we have all the tabs, so let's hit OK and test. As you can see, all the numbers start from the same position. Let's go to a point over here as well. As you can see, everything looks really nice. 
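Why a tab after the colon lines the values up can be shown with a short Python sketch. It relies on `str.expandtabs`, which advances to the next tab stop; picking a tab stop wider than the longest label is an assumption made here so that every value lands in the same column. The sample labels and values are illustrative only.

```python
def align_with_tabs(rows):
    """Put a tab right after each 'label:' and expand it to one shared column."""
    labels = [label.rstrip() + ":" for label, _ in rows]   # no space before the tab
    tabsize = max(len(label) for label in labels) + 2      # one stop past the longest label
    return [f"{label}\t{value}".expandtabs(tabsize)
            for label, (_, value) in zip(labels, rows)]

rows = [
    ("Sales of November, 2023", "$80K"),
    ("Sales of November, 2022", "$62K"),
    ("Difference", "$18K"),
]
for line in align_with_tabs(rows):
    print(line)
```

A stray space between the colon and the tab would be kept as extra text in front of the tab, which is why the lesson removes the spaces before adding the tabs.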
All right, so with that we are done, and we have added a very nice and readable tooltip inside our chart. Let's do a quick summary of the steps. First, we create our calculated fields that 197. Tableau | #4 Step - Building Sales Dashboard: All right, so we're going to start talking about building the dashboards. The first step is to plan the structure and the containers of our dashboard. So let's start sketching the container structure. The first one is, as usual, the main container, and it's going to be a vertical container. Then we go from top to bottom. First we have a title and two buttons. For that we can include a horizontal container that holds the title and the buttons. Moving on, below that we have the KPIs. These are side-by-side objects, so again we're going to use another horizontal container in order to have all those KPIs side by side. Moving down, below that we have the charts, right? It's again two charts side by side, and we will use a third horizontal container for them. These are the main objects that we have inside the main vertical container. But of course, in our dashboards we also have a lot of filters. What we're going to do, we're going to build a vertical container where we put all the filters for the dashboard. This container is going to be outside of the main vertical container. For that, we will use the floating option, and also the ability to hide it or show it. I would say we will go with this plan. Of course, as we are building the dashboard, we will sometimes add an extra container to organize stuff. So the plan will not cover everything 100%, but it covers the main stuff. All right, so with that we have a plan for our dashboards. 
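The plan sketched above is essentially a small tree, so it can be written down as data before touching Tableau. The following Python outline is only a planning aid: the class name `Item`, the item names, and the `outline` helper are all hypothetical and have nothing to do with any Tableau API.

```python
from dataclasses import dataclass, field

@dataclass
class Item:
    name: str
    kind: str = "object"        # "vertical", "horizontal", or "object"
    floating: bool = False
    children: list = field(default_factory=list)

def outline(item, depth=0):
    """Return the hierarchy as indented text lines."""
    tag = f"{item.kind}, floating" if item.floating else item.kind
    lines = [f"{'  ' * depth}{item.name} ({tag})"]
    for child in item.children:
        lines.extend(outline(child, depth + 1))
    return lines

plan = [
    Item("Main", "vertical", children=[
        Item("Title", "horizontal",
             children=[Item("Title text"), Item("Button 1"), Item("Button 2")]),
        Item("KPIs", "horizontal",
             children=[Item("KPI 1"), Item("KPI 2"), Item("KPI 3")]),
        Item("Charts", "horizontal",
             children=[Item("Chart 1"), Item("Chart 2")]),
    ]),
    Item("Filters", "vertical", floating=True),  # outside the main container
]

for top in plan:
    print("\n".join(outline(top)))
```

Writing the plan this way makes it easy to compare against the item hierarchy in the Layout pane later on.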
Let's go and implement it in Tableau. All right, now let's create a new dashboard, and we'll call it Sales Dashboard. The first step that I usually do is fixing the size. Let's go to Size on the left side and change it from Range to Fixed size. For the width I usually go with 1,200, and for the height let's go with 800. Okay, so with that, we've got enough white space for our dashboard. I usually start with the main container, but since we have a container for the filters that is going to be hidden and shown, I'm going to start with that first. Now, in order to create this vertical container, I have a quick way to get it. What we're going to do, we're going to take any worksheet. Let's go, for example, with KPI Sales, and drag and drop it in the middle. As you can see, Tableau automatically creates a vertical container on the right side where it puts everything: the parameters, filters, legends and so on. This is the container that we can use for our filters. Now we're going to convert it to a floating container. In order to do that, hold Shift, click on this icon over here, and then just move it. As you can see, now it's freed, so let's drop it anywhere. Let's just move it here to the end. Next we're going to remove this chart, because we now have to build the main container. Let's just remove it. As you can see, we still have the container here on the right side. Now what we can do is color the container. Make sure to select the container over here, then go to Layout. Let's go to the border, make it a line, and then choose any color, for example the purple one. Let's also put a background on it, maybe purple as well. With that we can see that we have a floating container here on the right side. 
In the next step, we're going to give it a name. We have it here in the item hierarchy. Let's go to the vertical container, click on it, and give it the name Filter. All right, now we have our first container. Let's go back and build the main container for the dashboard. Go back to the Dashboard tab and grab a vertical container for the main one. Let's drop it here in the middle. Now we're going to add coloring to it. Let's go to Layout, go to the border, and make it orange. I would also like to add a background color, so let's take the orange as well. With that we have our main container. On the left side, you can see we have Tiled and then the vertical container. Let's rename it. I'm just going to make this a little bigger over here. So we're going to say: you are the main container. All right, so in the next step we're going to add blanks in order to have placeholders for the elements inside this container. Let's just add one. Then let's go with the first container inside the main one, the horizontal container for the title. Take a horizontal container and drag and drop it here, below. Make sure that it is inside the main container. Do that carefully. All right, so we have our horizontal container. Let's put some coloring on it. Layout, border, let's make it blue. For the background, let's have blue as well. Of course, let's check things over here. We have the vertical container, we have our blank on top, and then we have the horizontal container. Let's rename it: you are the container for the title. All right, now let's go inside it and add some content. We need a text, so let's drag and drop it inside the horizontal container. Let's say: you are the Sales Dashboard. We will format everything later. That's it, let's hit OK. 
Now as you can see, our container is very small, so let's make it a little bit bigger. Now we have to add the two buttons. Let's go with the Navigation object. Make sure to add it inside, on the right side, because it is a horizontal container. Let's drop it. We need another one, so let's drop it as well, on the right side or in the middle, it doesn't matter. Now let's quickly check the layout to make sure that everything is fine. Inside the title container, we have a text and then two buttons. Great. Now let's go to the next content. We're going to have another container for the KPIs. Let's go again to the Dashboard tab, take a Horizontal container, and make sure to put it beneath the first container. Let's drop it over here. Now make sure to select it, and let's add the coloring to it. The border is going to be a line, blue as well, and the background is going to be blue too. All right, in the next step we're going to give it a name again. Let's go inside: you are the container for the KPIs. Okay, now let's add some content inside it using blanks. Take the first blank and make sure to drop it inside the second horizontal container. Now it's very small, so let's extend it. Then let's grab another one and make sure to put it on the right side. Now we have two blanks. Let's grab the third one and drop it on the right side. With that we have our three placeholders for the KPIs. Again, I always go back to the layout to check that everything is fine. As you can see, those three blanks are inside the KPI container, so everything is clean. Let's go back to the Dashboard tab and add the last container, the one for the charts. We're going to grab a horizontal container again and drop it below the middle one. Let's add some coloring to it. Go to Layout, add a blue border, and a background as well. Now let's give it a name: you are the container for the charts. 
Okay, now let's add some blanks in order to have some content inside it. Drop the first blank inside it. It's very small, so let's extend it, and then drop the second blank on the right side. Now we have two places for our charts. Let's go to the layout and check. As you can see, we have the two blanks underneath the charts container. All right, with that, we have the three containers for our content. Let's remove the first blank, since we don't need it anymore. We have it over here, so let's remove it. With that, we have built the foundation, the structure of our dashboard. We have the container for the title, the three KPIs, and then the place for the two charts. We also have our floating container for the filters here on the right side. All right, so as you can see, it's really easy. Just do it slowly, step by step, and check everything. Give each container a name. Don't rush it. All right, so that's all for this step. Now finally, let's go to the step where we put everything together and place the content inside our dashboard. Okay, so now let's put all our content inside our dashboard. Don't worry about the filters. We're going to do them at the end. Let's start with the KPIs, right? We're going to take the first one, the KPI of sales. Make sure to put it next to the blanks. Then let's grab the second one and put it next to it, and the quantity next to that as well. Let's go to the layout to check everything. As you can see, we have this container for the KPIs, and inside it we have our three KPIs. Now we don't need the blanks anymore, so let's start deleting them. All right, now let's keep going and put the other charts inside our dashboard. Let's take the Subcategory chart. Make sure to be inside the third horizontal container, so let's drop it over here. The last chart is going to be the Weekly Trends. Let's drop it side by side over here. 
So let's go to the layout and check. You can see that the horizontal container for the charts has our two charts and the two blanks. Let's remove the blanks. Great. Now you can check our structure again in the item hierarchy to see that everything looks like this. We have the main container, and inside it three horizontal containers. The title container should have the title and the two buttons. The KPI container should have the three KPIs. The charts container should have the two charts. If you have it like this, that means everything so far is clean and we are on the right track. All right guys, that's it for this step. We have the main content inside our dashboard, and it was very easy and fast. In the next step, things are going to get interesting, because we start formatting, coloring, and positioning things in order to have a clean and professional dashboard. Okay, now let's start formatting our dashboard. The first step is to make sure that our content is distributed evenly in each container. Let's go to the KPI container over here. Make sure to select it, go to the small arrow, and click on Distribute Contents Evenly. All right, let's move to the next one. As you can see, those two charts are not distributed evenly. Let's select the container, go to More Options, and choose Distribute Contents Evenly. With that, we get a fair alignment for all charts. We will not do that for the first container, because the title should be bigger than the navigation buttons. Let's work from top to bottom and start with the title. Let's go inside the title over here and start formatting it. We're going to call it Sales Dashboard, then let's have a pipe, and then the year, the current year that the user selects. What we're going to do, we go to Insert and add our parameter. Now let's change the font size. Let's select everything and make it, for example, 24. 
Now let's change the coloring. Let's go to the colors and pick our coloring, right? Let's pick the dark one. For the year, let's set the font to Tableau Medium and pick the other color that we use. All right, so we have our title. Let's hit OK and check how it looks. Yeah, I think it looks fine. Let's make it a little bit smaller. That's all for those two containers. Now let's check the buttons. We have to make sure that those buttons have exactly the same size, which is really hard to configure by hand. So what we're going to do, we're going to grab a mini horizontal container, put those two buttons inside it, and distribute it evenly, so that we get a perfect sizing. Let's go to the Dashboard tab and get a horizontal container. Make sure to drop it on the right side. With that we have a small container, so let's make it a little bit bigger to see it. I'm just going to remove stuff now. Next we're going to move those buttons inside it. Let's drop the first one inside it, then pick the second one and put it on the right side. Of course, let's quickly check that everything is fine. Let me close all this stuff. In the title container, we have our title, and then we have the mini horizontal container. Inside it, we have the two buttons. All right, great. Now let's make everything distributed evenly. Let's go to the horizontal container. Let me just quickly give it a name: you are the horizontal container for the buttons. Okay, perfect. Now let's distribute this container evenly. Make sure to select the horizontal container, go to the options, and choose Distribute Contents Evenly. Now as you can see, those two buttons get exactly the same size. As I'm making the container smaller or bigger, both of them keep exactly the same size. Let's just make it a little bit smaller. Now let's change the design of those buttons. Click on the first one and let's edit the button. Okay. 
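The idea behind distributing contents evenly is simply an equal split of the container's size among its items. The Python sketch below shows that arithmetic under stated assumptions: sizes are whole pixels, and any leftover pixels are handed to the first items. How Tableau itself deals with remainders is not covered in the lesson, and the function name is hypothetical.

```python
def distribute_evenly(container_width, n_items):
    """Split a width in pixels into n_items nearly equal parts."""
    base, extra = divmod(container_width, n_items)
    # Assumption: leftover pixels go to the first `extra` items.
    return [base + (1 if i < extra else 0) for i in range(n_items)]

print(distribute_evenly(1200, 3))  # [400, 400, 400]  e.g. three KPIs
print(distribute_evenly(200, 2))   # [100, 100]       e.g. two buttons
```

Because the widths are derived from the container, resizing the container resizes both buttons together, which matches what the mini container achieves in the dashboard.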
Now let's say the first button is going to be for the Sales Dashboard, so let's select it in the Navigate to option. It's going to be the Sales Dashboard. Now let's give it a title, or a name. It's going to be Sales Dashboard. Now let's format the font. It's going to be white, so everything is fine. Let's go to the background and pick our color. Go to More Colors and pick our blue. Okay. What else? Let's go again to the font, and instead of 12, let's make it 10. All right, that's it. Let's hit OK. With that, we have configured the first button, so let's go to the second one. Let's edit the button. Now, since we still don't have the Customer Dashboard, we cannot select it. But I still want to format it. Let's go to the font, make it 10, and this time I'm going to make it black. Let's give it a title. It's going to be Customer Dashboard. For the background, it's going to be white, and let's add a border so it has a line, something like this maybe, and make it gray. Okay. Now let's add a tooltip. It's going to be "Go to Customer Dashboard". Okay, let's check that. As you can see, the second button is grayed out because we haven't selected any dashboard. Once we have a dashboard, it's going to be white. Now let's make it a little bit bigger. Select the container and just make it a little bit bigger. Okay, that's it. We will revisit it later once we have the Customer Dashboard. All right, so that's all for now for the first container. What I'm going to do now is remove the background coloring of the container. Let's select the title container, remove the border, and the background color as well. Let's set it to None. All right. Now, let's move to the next one. We have our KPIs. The first thing that I'm going to do is make the container a little bit bigger, maybe something like this. Then we're going to add the background color. 
So as you can see, we have a white color here, but we don't have any coloring behind the title. In order to fix that, let's click on each one of them, go to the background, and make it white. Then the next one, and the third one is going to be white as well. Okay, so now we have one big card, or KPI, holding all the information for each one of them. All right, so in the next step we're going to remove the coloring of this container. Let's remove the border and the background as well. All right, now let's go to the container with the charts over here. What I'm going to do, I will add a background color for those two charts as well. It's going to be white. Now, while configuring this stuff, we still have this container, which is really bothering me. Let's select the whole container and move it to the top over here. Then let's go to More Options and select this one, Add Show/Hide Button. Let's click on that. Once you do that, you get a small icon to show and hide the whole container. What we're going to do, we're going to hide it. Click again on the options and hide it. Now the whole container is behind this icon. I will just place it over here so that we can work on our charts. All right, so in the next step I would like to go through each chart and make sure that it fits the entire view. Let's go to the first one. You can check it from here, and you can see it is Entire View. The next one as well. The third one, as you can see, is Standard, so let's switch it to Entire View. The same for the Weekly Trends, it is Entire View. With that, we make sure that Tableau is using the whole space. We can also make this one a little bit bigger, because we still have a little bit of space. So let's go to the middle over here and make the KPIs a little bit bigger in order to use the whole white space. All right, so with that we have a perfect positioning for each chart. 
I'm really happy with that. All right, so in the next step we're going to add some nice legends to our charts. For the first chart, we have to give the users the following information: the dark gray is the current year, and the background color is the previous year. So now I'm going to create a custom legend. I will not use the one that comes from Tableau, because I want to customize it. For that we're going to quickly create a sheet for the legend. Let's create a new sheet. All we need is the text of the current year and the previous year, and we have those as calculated fields. Let's move the Current Year to Text, and the Previous Year to Text as well. Now let's customize this information. Okay, we're going to start on the left side, so let's set the alignment to left. I'm going to start with the first piece, the current year. We're going to say the current year, then "Sales". Let's make it bigger, and let's change the font to something like Tableau Medium. The coloring should follow the pattern of the chart. The current year of sales was the dark one, so let's pick our dark color. For the previous year, it was the light color, so let's do that. Let's make the current year bold. Okay, let's test it. Let's hit Apply. Now Tableau Public shows it as hashes because the size is really small. So let's hit OK, then go to Standard and switch it to Entire View. Now we can see it over here: 2023 Sales versus 2022 Sales. As you can see, it's the current year versus the previous year. Okay, there is one thing that I'm not really happy about. Let's go inside it and remove the bold. Okay, let's give the sheet a name. This can be the legend category chart. That's it. Now let's go back to the dashboard in order to use it. I would like to have the legend between the title and the chart. We cannot do that directly. 
Instead, we're going to make an extra container for those three pieces of information. We have a title, a legend, and then the chart. As I said, we cannot plan everything at the start. As you are building the dashboard, you will understand the needs and adjust things. Now what we can do, instead of having this chart directly, is to have a vertical container inside the horizontal container. Let's grab a vertical container and drop it here in the middle. Then we grab the chart and put it inside this container, so make sure to drop it inside this container. Of course, let's quickly check the layout to see that everything is fine. It's inside Tiled, Main, Charts. Now, in place of the first chart, we have a vertical container. Let's quickly give it a name: you are the container of, let's say, chart one. Inside it, you can see we have our chart. Now, our vertical container is going to start with a title. Let's grab a text object and put it on top. We're going to give it the name Sales and Profit by Subcategory. Now let's format it. The font is going to be Tableau Medium, the size is going to be 14, and the color is the dark one. So let's select that. Okay, that's it. All right, so that means we don't need the title of our chart anymore, right? Right-click on it and hide the title. Great, so now finally we can grab the legend. But in this chart, I would also like to have a legend on the right side for the profit. That means we have a legend on the left and a legend on the right. In order to do that, we need another container, so that we can put those two legends side by side. We cannot do it currently because we have a vertical container. So let's grab a horizontal container and put it in the middle over here. Just resize it, make sure to select the container, and let's put the first legend inside it. Okay, so now we have a title for the small legend. 
Let's hide it. Great. Now let's make everything smaller. All right, so with that, we have a really nice legend that tells the users we are comparing the sales of 2023 with 2022. All right, now let's configure the right legend. We have to tell the users that this is profit information, that the blue color indicates profit, and that the orange indicates loss. For this legend, I'm just going to use a text object. So let's drag the text object and make sure to put it inside this mini container, on the right side. First let's indicate the current year. Let's go to Insert and add the parameter, because here we have the profit only for the current year. Next we're going to add a circle, and this is going to be "Profit", and then another circle, and this is going to be "Loss". Okay? Now let's make sure that the font is Tableau Medium and the size is 9. Let's also make sure that the coloring that is used is the dark one. But now let's change the coloring of the circles. The first one is going to be the blue, and the loss is orange, our orange. Okay. So now let's hit OK and test it. All right. As you can see, it's really big, so let's make it smaller. All right. With this legend, the users can see immediately that we are talking about 2023, that the blue one is the profit, and that the losses are orange. All right, I'm really happy with the first chart. Of course, we still have the coloring of the background. Let's go to the layout and make sure that the containers are all correct. Let's go to chart one. As you can see, we have a vertical container. Inside it we have a text, and then a horizontal container for both of the legends. Inside that, you can see we have the sheet for the first legend and the text of the second. Then below that we have our chart. If you have it like this, you are following me correctly. 
Now we're going to give a background color to the whole container of the first chart. Let's go to the background over here and make it white. With that, the users get the feeling that everything is one unit, one chart. All right, so this is for the first chart. Let's do the same for the right one. In order to do that, let's grab a container and drop it in the middle over here. With that, we have our container. Let's grab our chart and put it in the new container that we have created. Now that we have our chart inside the new container, let's check the layout to make sure that everything is fine. Let's go to the charts. We have chart one, and the new one is for chart two. Let's rename it: you are the container for chart two. Okay, inside it we have our chart, so perfect. Next we're going to grab a text object and drop it on top of our chart inside the new container. Let's call it Sales and Profits Trends over Time. Now we're going to start formatting it. Let's pick Tableau Medium as well, and the size is going to be 14. Let's pick our color. It's going to be the dark one, so that we get exactly the same title as the left one. Okay, the next step. Let's hide the old title from the chart. Next we're going to add our legend, and it's going to be a text object. Let's put it in the middle between the title and the chart. In the legend, let's insert a parameter in order to show the year. After that we're going to have a circle, and we're going to say this is the above. And another one is going to be the below. With that, we indicate whether the line is above the average or below the average. We are using the coloring. The above is the blue one, so let's choose it. And the below is the orange, our orange color.
Now we can make sure that we are following the same font. So it's going to be Tableau Medium, and the size is nine. All right, so that's all. Let's hit OK. I think we missed the coloring of the 2023. Let's go inside it and make sure to choose the dark color for it. All right, let's hit OK. So now we've got a quick explanation of the coloring inside our chart on the right side. Next we're going to select the whole container and change the background color to white in order to have this one-unit feeling in the chart. So let's go to Layout, go to the background, and choose the white color. All right, so with that we are done with the containers of the charts. Now we're going to select the whole container that holds them and remove the border and the background color as well. Okay, so now looking at the charts inside our dashboard, we are still missing some information about the KPIs. We have to present a legend explaining those two points and also the coloring of those two lines. So we will have something very similar to the other legends, where we're going to say 2023 versus 2022 in order to explain those two lines, and then we can explain those two circles. In order to create the legend, we're going to go to the legend of Subcategory and duplicate it. Let's give it a name: it can be the Legend KPI. Let's just move the dashboard to the end in order to have all the sheets on the left side. Let's go to the Legend KPI and start formatting it. Now, since we have different KPIs, not only the sales, I'm going to remove the word sales from our text. Let's go to the text, to the three dots, and remove the sales so that we have only the years. And then let's add our circle, and we're going to say highest month. And another circle for the lowest month.
Now as usual, we're going to format this information. It's going to be Tableau Medium and nine, so everything is fine. Let's change the color of those circles. The highest is going to be the blue and the lowest is going to be the orange. Let's hit OK and check the results. Looks nice, right? But I think I have an extra space here. Let's go to the text again and keep only one space. All right, let's hit OK. Now let's use it inside our dashboard. So what are we going to do? We're going to go to the dashboard over here. Let's grab the Legend KPI and drop it just below the title. We can have it between the two horizontal containers. Let's drop it first. Next we're going to remove the title, so let's hide it. Now, it's really small between those two containers. In order to select it, let's go to the item hierarchy. Now we can see that we have the container for the title, the container for the KPIs, and in the middle we have our legend. All right, now maybe let's make the title just a little bit smaller, like this. Let's go to the Legend KPI and drag it a little bit lower. All right, so now it looks fine and we have an explanation for the three KPIs. With that, we have everything ready inside our main container. What is missing, of course, is the hidden container where we have the filters, but I will leave that until the end. Now we're going to go to the main container, make sure it's selected, and remove the border and the background as well. So let's set both to none. All right, so now the final touch, the last step of formatting this dashboard. We're going to add spaces between the charts. Adding spaces between the charts is going to have a huge effect on the user experience of your dashboards. And as you can see, those two charts are really close to each other, like they are not able to breathe, right?
So adding space between those two charts will not only add balance between the items, it's also going to make the dashboard easier to read for the users. So now let's go and fix that. The first thing we're going to do is change the background color of the whole dashboard. In order to do that, let's go to the main menu over here, to Dashboard, and then to the Format option. The default is white. Let's move it to the lightest gray and select that. With that, we are separating the charts from the background, and we can immediately see the spacing between the charts. If you look at the three KPIs, you can see we have a minimal space between them. But between those two charts, there is no space at all. Now let's fix the spacing from top to bottom. First, I would like the background color of this legend to be gray. In order to do that, let's go to the sheet. I already have the format pane open, but if you don't have it open, just right-click on the white space, go to Format, and go to Shading. Here we can color the background of the worksheet. So let's say none. All right, let's go back to our dashboard. As you can see, the legend over here no longer has a background color. We need a white background only for the charts. All right, so now let's start working on those three KPIs in order to increase the spaces between them. Let's select the first one, close the format pane, and stay at the Layout tab. Here we have two options, the outer padding and the inner padding. The outer is the space between the objects and the inner is the space inside the chart itself. So what do we need? We need to increase the spacing between those three KPIs and also the spacing between the KPIs and the charts. All right, so let's start with the outer padding. Let's click on it.
Now here, as you increase the numbers, you can see the padding, the space between this chart and the neighboring charts, is increasing. And as you can see, it's increasing for top, right, bottom, and left. Everything is connected together. If you change something here, it's going to change for all values, and that's because All sides equal is enabled. Here it's very important to understand that you have to make a decision about the spacing between your charts and you have to commit to that decision for the whole dashboard. This is really important, otherwise the dashboard is going to be ugly. So now we're going to go with the value 20 for all the charts inside this dashboard. Let me show you how we can do that. Let's set everything to ten. Now this chart is taking a ten on the left, right, top, and bottom, and our goal is to have a 20. If this chart is taking a ten on the right side and the neighboring KPI is also taking a ten on its left side, then we will have a 20. That means in order to have a 20 between all our charts, each one of them should have a ten. But I care only about the spaces between the charts and not the legend over here. So we're going to go to the outer padding over here, disable All sides equal, and for the top, I really don't care, so let's make it zero. Our chart is not taking any space at the top, only on the right, bottom, and left. Now let's do exactly the same for each KPI. Let's go to the profit, go to the padding, and set it to ten. Then let's disable All sides equal, since we don't need any space at the top. All right, let's move to the next one. The same again: make it ten, and remove the top. Now we can see clearly that there is a space between all three KPIs and this space is equal to 20. Next let's add spaces to the two charts over here. So make sure to select the whole container. Now the same thing.
We're going to go to the padding over here and make it a ten. This time we do want the top to be ten in order to have a 20 between this chart and the KPI above. All right, so that's all for this chart. Let's go to the next one and do the same. So make sure to select the whole container and set it to ten. All right, perfect, let's deselect. As you can see, the whole look and feel of our dashboard is more professional and easier to read. And this is exactly why we add spacing between our charts. Okay guys, not only is the spacing between the charts important, the inner spacing, the inner padding between the content and the border of the content, is important as well. Adding spacing inside the container or the content is going to make things look better. For example, let's go to this KPI over here. You can see the total of sales is very close to the border right now. What we're going to do is go to the inner padding. Let's increase the size a little bit and see how things look. Let's make it maybe seven. As you can see, as I'm increasing those numbers, the content is getting pressed and moves away from the border. If you increase it, for example, to 20, you can see we have a lot of space between the title and the border of the content. Now let's move it back to seven. We will do the same for all the other KPIs. Let's go to the right one and make it seven. And the third one, let's make it seven too. So as you can see, moving the content away from the border a little bit makes everything breathe better. Let's do the same for all the other charts. I'm going to go over here to the whole container and add seven, and add a seven over here as well. All right, so that's all. With that, we are done formatting our dashboard. As the next step, we're going to start working on the filters and the interactivity.
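The padding rule from the formatting step can be checked with a little arithmetic: the visible gap between two neighboring items is the sum of the outer padding each one contributes on the shared side. The sketch below is only an illustration in Python, not something Tableau exposes; the dictionaries and the function name `gap_between` are made up for this example, while the values (10 on the sides, 0 on top of the KPIs, 10 on top of the charts) are the ones used in the lesson.

```python
def gap_between(outer_a: int, outer_b: int) -> int:
    """Visible gap between two adjacent items: each item adds its own
    outer padding on the side they share."""
    return outer_a + outer_b


# Outer padding used in the lesson (hypothetical dict representation).
kpi = {"top": 0, "right": 10, "bottom": 10, "left": 10}
chart = {"top": 10, "right": 10, "bottom": 10, "left": 10}

# Two KPIs side by side: right padding of one plus left padding of the next.
horizontal_gap = gap_between(kpi["right"], kpi["left"])

# A KPI above a chart: bottom padding of the KPI plus top padding of the chart.
vertical_gap = gap_between(kpi["bottom"], chart["top"])

print(horizontal_gap, vertical_gap)
```

Both gaps come out to the 20 that was chosen as the spacing for the whole dashboard, which is why every item only needs a 10.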
Now let's quickly check what the requirements were. We have to allow the users to filter the data by the product information, like category and subcategory, and also by the location information, like region, state, and city. And we have another requirement about interactivity and filtering. It says we have to allow the users to use the charts and the visuals as filters. All right, now let's add the requested filters. We didn't add any filters inside our worksheets, so let's go to any of those worksheets, for example the KPI sales, and start adding the filters. The first one is about the product information. So let's get the category and choose Show Filter. Then let's go to the location information. Let's add the country. All right, so those are the filters that are requested by the users. As the next step, we're going to apply them to all worksheets, since all those filters are relevant for all our charts. So let's go to the first one, right-click on it, and go to Apply to Worksheets. Then let's say All Using This Data Source. Let's select that. And as yo 198. Tableau | #5 Step - Building Customer Dashboard: All right, so now I hope you are done building the customer dashboard. Now I'm going to show you my version and how I implemented it. Let's have a quick overview of the requirements. Let's start with the key requirements. Here we have the same as before: it says that we have to show KPIs, where the KPIs should display the total number of customers, the sales per customer, and the total number of orders for the current year and the previous year. The next requirement is about the trend. We have to present the data on a monthly basis, where we have to compare the current and previous years, and we have to identify or highlight the highest and lowest values. So those two requirements are exactly like the sales requirements, but with different measures.
So for the chart type here, we're going to go exactly like the sales dashboard, where we have BANs and also sparklines with small circles. All right, moving on to the third requirement, we have the customer distribution by number of orders. Here we have to present the distribution of customers based on the number of orders. So we are talking about data distribution, and for that we have a perfect chart: the histogram. Okay, so now for the last requirement, we have to show the top ten customers by profit. Here we have to show the top ten customers with the highest profit. They also need a lot of information, like the rank, number of orders, current sales, current profit, and the last order date. In this requirement, we have to present a lot of details about the top ten customers. For this, I have decided to go with a simple table where we have rows and columns. All right, so this is about analyzing the requirements and deciding on the chart types. For the next step, the mockup and the coloring, we're going to use exactly the same as in the sales dashboard. That's because the two dashboards are in the same project and it makes no sense to create a new mockup for each new dashboard. We have to follow one mockup for all our dashboards in order to have the same look and feel across the project. As you can see, things get easier for the next dashboards. Now we can start implementing the charts in Tableau. All right, so now for the first charts we have the three KPIs: customers, sales per customer, and orders. They are the usual work like before. It's just copy, paste, and switching the measures. Of course, if you are interested in how I implemented it, I'm going to leave the file in the project as well, or you can go to my public profile and download it from there. Maybe one interesting thing to show you is how I calculated the sales per customer.
So let's go over here. Since we now have a lot of fields, we can search for customer in order to check the calculated fields. First we have to decide which customers ordered in the current year and which ones ordered in the previous year. It's really simple. If we go over here to the current year customers and edit it, you can see we have the same condition as before: if the year is equal to the selected year from the parameter, then show the customer ID, otherwise it's null. For the previous year, we have exactly the same pattern, subtracting one year. So this is the first step. Then in the next step, we calculate the current year sales per customer. We have it over here. Let's check inside it. We have the following calculation: we divide the current year sales by the distinct count of the current year customers. With that, you get the average sales per customer. We do the same for the previous year. And the rest is as usual: finding the differences and finding the min and max values. So that's it for the sales per customer. Now let's start implementing the first chart, using the histogram to show the data distribution for the customers. So let's create a new sheet and call it customer distribution. All right, since we are talking about two measures, the count of customers and the count of orders, we have to use LOD expressions in order to generate the bins. I explained that in detail in the LOD expressions lesson on FIXED, so make sure to check that in order to understand the LOD expression that we're going to use now. We're going to convert the number of orders into bins using a calculated field. In order to do that, let me just remove the search and create a new calculated field.
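The sales-per-customer logic described above (keep only the rows of the selected year, then divide the summed sales by the distinct count of customers) can be sketched outside Tableau. This is a minimal Python illustration, not the calculated field itself; the rows, the tuple layout, and the function name `sales_per_customer` are assumptions made up for the example.

```python
# Made-up rows: (customer_id, order_id, order_year, sales)
orders = [
    ("C1", "O1", 2023, 100.0),
    ("C1", "O2", 2023, 50.0),
    ("C2", "O3", 2023, 150.0),
    ("C2", "O4", 2022, 80.0),
    ("C3", "O5", 2022, 40.0),
]


def sales_per_customer(rows, year):
    """Sum of sales in `year` divided by the distinct customers who ordered in `year`."""
    total_sales = sum(sales for _, _, y, sales in rows if y == year)
    customers = {customer for customer, _, y, _ in rows if y == year}
    # No customers in that year: return None, mirroring a null result.
    return total_sales / len(customers) if customers else None


selected_year = 2023  # plays the role of the year parameter
current = sales_per_customer(orders, selected_year)        # 300 / 2 customers
previous = sales_per_customer(orders, selected_year - 1)   # 120 / 2 customers
print(current, previous)
```

Using a set for the customers is what stands in for the distinct count: a customer with several orders in the year is only counted once.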
Here we want to find, for each customer, how many orders they placed, and of course we are talking about the current year. For that we're going to use FIXED from the LOD expressions. Then we have to define the dimension. It's going to be the current year customers. So here we have all the customers that ordered in the current year. After that we have to do the aggregation, and it's going to be the number of orders. So we're going to count distinct the current year orders. The current year orders field is like the customers one: all the orders that were placed in this year. All right, so that's all. Let's close the FIXED over here. So again, what we are doing here is, for each customer, finding the number of orders that were placed in the current year. All right, so now let's hit OK. Now we have it over here as a continuous measure. Let's change it to a dimension. So right-click on it and make it a dimension, because bins in histograms are usually discrete values. Next we're going to test the values. Let's drag and drop it to the view. Okay, so we got our bins for the histogram, but I would like to test the data. In order to do that, let's create a new sheet and call it test histogram. We're going to check our customers, so pick the customer name. Let's also grab the order ID over here and show all the values. We need the date as well, so let's pick the order date over here in order to see the year. Then we're going to check our new calculated field. Let's drop it over here, switch it to a measure, and drop it on the labels. All right, so now let's check one of those customers. Let's focus on Adam Hart. Right-click on him and choose Keep Only. Now we can check all the orders of Adam.
As you can see, we have a lot of orders in the history and none of them is counted in our calculated field, because we are focusing only on the current year. We start counting from 2023. In 2023 we have five orders: one, two, three, four, five. So the measure is returning a correct value. We can test the other years as well. For example, let's show the parameter and switch to 2022. You can see that in 2022 we have only three orders. Let's switch it to 2021, and here we have only one order. So that means our calculated field is working as intended and we can use it now for the histogram. This is what I usually do once I create a new calculated field, especially if it is an LOD: I test it. I create a simple table in order to see the data and focus, for example, on one customer instead of testing directly in the histogram, because it's really hard to test the data in visuals. All right, now let's go back to our customer distribution and get our bars. In order to do that, we're going to go over here to the rows and say count distinct, and we're going to count the customers for the current year, so the current year customers. We have to change the visual to bars, since histograms are bars. And with that we've got our histogram. That's it. Next we're going to start formatting our histogram. The first thing, as usual, is to remove the lines. So let's go to Format, then to Lines, then to Rows, and remove the grid lines. All right, that's all for the lines. Next we're going to go over here and remove the headers. Let's take those bins and make them more readable. So let's go to Format. Maybe I'm going to make them bold and change the color. All right, now we have the name of the dimension over here. We can hide it. Okay, now let's start with the coloring. Let's hold Control and drag the customers to the colors.
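The two-step logic behind the histogram (first count the distinct orders per customer for the selected year, then count how many customers fall into each order count) can be sketched as follows. This is only a Python illustration of the idea, not the Tableau LOD expression; the rows and the function names `orders_per_customer` and `customer_distribution` are hypothetical.

```python
from collections import Counter

# Made-up rows: (customer, order_id, order_year).
# An order can appear on several rows, one per line item.
orders = [
    ("A", "A-1", 2023),
    ("A", "A-1", 2023),  # second line item of the same order
    ("A", "A-2", 2023),
    ("B", "B-1", 2023),
    ("C", "C-1", 2023),
    ("C", "C-2", 2023),
    ("B", "B-0", 2022),
]


def orders_per_customer(rows, year):
    """For each customer, the distinct number of orders placed in `year`."""
    order_ids = {}
    for customer, order_id, order_year in rows:
        if order_year == year:
            order_ids.setdefault(customer, set()).add(order_id)
    return {customer: len(ids) for customer, ids in order_ids.items()}


def customer_distribution(rows, year):
    """Histogram data: number of orders (the bin) -> number of customers."""
    counts = orders_per_customer(rows, year)
    return dict(sorted(Counter(counts.values()).items()))


print(orders_per_customer(orders, 2023))
print(customer_distribution(orders, 2023))
```

Printing the per-customer counts first plays the same role as the test sheet in the lesson: it is easier to verify one customer's orders in a table than to read them off the bars.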
Of course, we're going to use our own coloring. Let's edit it and pick one. All right, so that's it. Next we can add some borders to those bars. So let's go to the colors, to the borders, and make it something like this. All right, next I'm going to add some labels. So let's get the customers to the labels. I think with that we are done with the histogram. We can test it by adding the parameter. Let's select another year, like 2023, and as you can see, everything is reacting. And that's it for this requirement. Now we are showing the users the distribution of customers by the number of orders. Let's move to the next requirement, where we're going to show the top ten customers by profit. All right, let's create a new worksheet and call it Top Customers. We need our customers on the rows, and we're going to show only the top by the profit for the current year. Let's get our measure. It is the current year profit. Let's drop it on the text over here. Next we're going to make the filter in order to show only the top ten customers. Hold Control, then drag and drop the customer name to the filters. Here we're going to go to the Top tab and switch it to By field, so we have top ten by the profit, and the aggregation is going to be the sum. So this is exactly what we need. Let's hit OK. With that, we get a very simple list of the top ten customers by profit. Let's change the format in order to see the whole number. So let's go to Format, where I'm going to remove the unit, remove the decimals, and have the dollar sign at the start. All right, so now we can see the whole number. Let's sort the list by the profit. In order to do that, go to the customer name, then to Sort, and we're going to sort by a field.
In order to have a ranking, we're going to switch the sort order to descending and make sure that the field name is the current year profit. All right, so that's all. Let's close it, and as you can see, the first customer on top is the top customer. As the next step, we're going to add the rank to this list. In order to do that, we're going to use the function INDEX. Let's go to the rows over here and just write INDEX, and that's it. Then let's switch it to discrete and put it at the front. With that we have a ranking from 1 to 10. All right, so now we're going to add additional information for each customer, like the sales for the current year. So let's go to our data pane and grab the current year sales. Drag and drop it on top of those numbers so that we can see the sales for the current year as well. Let's just make it a little bit bigger. The next information that we're going to add is the number of orders that the customers placed in the current year. In order to do that, let's go to the Measure Values over here, double-click on the empty space, and write down count distinct in order to count the orders. So we're going to type the current year orders. All right, let's hit OK, and now we can see the number of orders that each customer placed in the current year. All right. The next information that we're going to add is the last order date of the customer. So we need the order date. Right-click on it, go to the measures, and get the maximum. With that we can see when the customer last ordered from our business. All right. So with that, we've got all the information that we need inside our chart. As the next step, we're going to start formatting it. First we're going to start with the lines and the grids as usual. So right-click on it and go to Format.
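Putting the pieces of the top-customers table together (filter to the top N by current-year profit, sort descending, number the rows, and add sales, order count, and last order date) can be sketched like this. It is a rough Python illustration, not the Tableau worksheet; the rows and the function name `top_customers` are made up. Two assumptions are worth flagging: customers with no orders in the selected year are left out, and the last order date is taken over all years, since the lesson takes the plain maximum of the order date.

```python
from datetime import date

# Made-up rows: (customer, order_id, order_date, sales, profit)
rows = [
    ("Ann", "O1", date(2023, 1, 5), 200.0, 50.0),
    ("Ann", "O2", date(2023, 3, 9), 100.0, -10.0),
    ("Bob", "O3", date(2023, 2, 1), 500.0, 120.0),
    ("Cid", "O4", date(2022, 7, 7), 300.0, 90.0),
    ("Bob", "O5", date(2022, 12, 30), 50.0, 5.0),
]


def top_customers(rows, year, n=10):
    """Top `n` customers by summed profit in `year`, with rank and extra details."""
    summary = {}
    last_order = {}
    for customer, order_id, order_date, sales, profit in rows:
        # Assumption: last order date is the maximum over all years.
        if customer not in last_order or order_date > last_order[customer]:
            last_order[customer] = order_date
        if order_date.year != year:
            continue
        entry = summary.setdefault(
            customer, {"sales": 0.0, "profit": 0.0, "orders": set()}
        )
        entry["sales"] += sales
        entry["profit"] += profit
        entry["orders"].add(order_id)

    # Sort descending by profit and keep the first n, like the Top filter plus sort.
    ranked = sorted(summary.items(), key=lambda kv: kv[1]["profit"], reverse=True)[:n]

    # enumerate(..., start=1) numbers the sorted rows, similar in spirit to INDEX().
    return [
        {
            "rank": rank,
            "customer": customer,
            "profit": entry["profit"],
            "sales": entry["sales"],
            "orders": len(entry["orders"]),
            "last_order": last_order[customer],
        }
        for rank, (customer, entry) in enumerate(ranked, start=1)
    ]


result = top_customers(rows, 2023, n=2)
for line in result:
    print(line)
```

The rank only means something because the list is sorted first; numbering unsorted rows would give positions, not a ranking, which is why the lesson sorts by profit before adding the index.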
Now I would like to get rid of this line in the middle between the measures and the dimensions. So let's go to the borders, go to the column divider, and remove it. With that, we don't have the line in between. As the next step, we're going to get rid of the gray background color. Let's go to Shading, and here we're going to go to the row banding and reduce the size to the minimum. As you can see, the background color disappeared. All right, so that's all for the lines and the grid. Let's start formatting the fonts and the colors of our fonts. First, I would like to format the index over here. Let's go to it and choose Format. Let's make sure that we are selecting the correct field. Yes, we are selecting it. Let's go to Pane. Now, let's go to the numbers over here. In Number (Custom), let's remove the decimals and add a hash as the prefix in order to have a ranking. That's it. What else we can add to this ranking is a background color. Go to the shading over here and make it a very light gray. All right, that's all for the ranking. Let's go to the next one and change the font color. Choose Format and go to the font. We can leave it as Tableau Book and change the color to something like black. That's it. Let's go to the next one, choose Format, and over here make it black as well. All right, moving on to the measures. Let's remove the unit from the sales. So let's go to the sales over here, choose Format, and format it as usual: Number (Custom), remove the decimals, and add a dollar sign. All right, and for the number of orders, we're going to leave it as it is. So that's it. Let's just keep it very simple. With that we have a really nice detailed table to show the top ten customers with additional information. All right, so with that we are done building all the charts.
As the next step, we're going to start building the dashboard. Okay, so in order to create the customer dashboard, we will not create everything from scratch. We're going to duplicate the sales dashboard in order to have the structure. Let's go to the sales dashboard, right-click on it, and duplicate. With that, we've got two identical dashboards. Let's go to the second one and start formatting it. First we're going to start with the naming. It's going to be the Customer Dashboard. Now let's work from top to bottom, starting with the title. Let's go over here and change it from Sales Dashboard to Customer Dashboard. As you can see, creating the second dashboard can be very easy once you have a really solid structure. All right, so next we have the three KPIs. We're going to replace them all with the new ones. The first one is going to be the KPI customers, so let's just drop it at the start. Of course, Tableau is going to start adding things to our filter container. Don't worry about it. We're going to delete them later. Let's get the next KPIs, the sales per customer and the orders. All right. Now let's hide this container. So right-click on the icon and hide it. Now we can remove the old KPIs from this dashboard. With that, we've got our three KPIs. Let's keep moving and add our charts. The first is going to be the histogram, so let's drag and drop it below the legend over here. Then we can remove the old chart. We don't need the legends either, so let's remove the whole container for both of the legends. And let's change the title to Customer Distribution by Number of Orders. Let's hit OK, and let's remove the title from the chart. As you can see, this container keeps popping up because we have new legends and new items. Let's hide it again. Now let's work on the right chart.
It's going to be the detailed list of the top customers. Let's drop it over here and remove the old one. Next we're going to check that everything is set to fit Entire View. Let's check one by one: Entire View, Entire View, this one as well. Everything looks fine. Let's check the last table. It's on Standard, so let's switch it to Entire View to use the whole space. All right, so now we have put everything together in one dashboard. As the next step, we're going to format this dashboard. It will not be that hard because we have almost everything. Let's start with the first chart. Let's give everything a white background. Let's go to Layout and change it to white, and the same for the next KPI, just to make sure that we have done it for every one. All right, with that, we've got something like a card for each KPI. As the next step, I would say let's immediately start working on the spacing between those charts. Let's click on the first one. If you remember, in the sales dashboard we agreed to have a 20 between all charts. Let's go to the outer padding and make everything ten, except on the top, where we don't need the extra space. Let's disable All sides equal and make it zero only for the top. As well, as we said, the inner padding is always going to be seven. Let's leave it like this and do the same for the others. The outer is ten, the top is zero, and the inner padding is seven as well. For the last one, the outer is ten, remove it for the top, and the inner is seven as well. All right, so with that we are done formatting the three KPIs. Let's move on to the charts now. Let's select the whole container. As you can see, we have everything done as before: the outer padding is ten and the inner padding is seven. Great, let's check the right one. It's correct as well. As you can see, things go really fast when you build the second dashboard on a solid structure.
All right, so now we're going to do one more thing for the Top Ten Customers by Profit. As you can see, the header information, the field names, is not really nice. We're going to remove it and build our own custom field names. Let me show you how we're going to do that. Let's go to Dashboard and grab a horizontal container and put it on top of our table. Inside this container we're going to put the field names. Let's just make it a little bit smaller. Let's start adding texts. So this is the first text. The first piece of information is going to be the rank, so let's type Rank. Let's change the font to Tableau Medium, the size to ten, and make the color a little bit lighter. All right, let's go with this and hit OK. Let's add another one for the next field, so make sure to be on the right side. This is the customers, and we're going to do the same: it's going to be Medium, ten, and this color, which we can copy for the next one. Let's hit OK. Now let's keep adding our fields. The next one is going to be the last order date. Let's paste the old one and call it Last Order Date. That's it, let's hit OK. Then we have the current profit. Let's grab a text. Instead of writing current profit, I'm going to add the parameter and then the word Profit. Let's make sure that everything has the same format. So it's going to be Tableau Medium, ten, and the same coloring. Let's copy it for the next one. We're going to add another text for the sales. Paste, and let's have the sales. And the last one is going to be the number of orders. Let's write it like this: paste, and remove the year, since we don't need it here. As you can see, we've got our titles. Now we're going to remove the titles from the original table. Let's hide the field labels and hide the header as well. All right, next we're going to start working on the alignment between the titles and the detail list.
So we're going to start moving stuff around. First I'm going to make it a little bit bigger, and then we're going to start moving those boxes until everything matches. Let's move the last order date a little bit to the right side. Maybe make this filter a little bit smaller. Then let's push the sales a little bit to the right side, and the profit as well. Now we're going to push this one a little bit to the right side. You can see we don't have any more space for the number of orders, so let's just call it Orders. All right? And we're going to move it again a little bit to the top. Okay, I'm happy with that. Everything is perfect. Now we have formatted all the charts that we have inside the customer dashboard. Next we're going to start cleaning up the filter information. Let's show the filters and see what is happening here. Okay, what we're going to do now is remove all the additional information that Tableau added to our new container. We don't need all of that information. Let's remove the items one by one. With that, we've got exactly the same container as before. And of course, you can start testing your dashboard again. We can switch it, for example, to 2022. As you can see, everything changed, and we even have a new top ten customers. We can add, for example, different subcategories, and everything is reacting, so everything is perfect. Let's put everything back to 2023. With that, we have fixed our filter. Let's close it and hide it. All right, so the next step is to add interactivity to those charts. Make sure to select the histogram and use it as a filter. With that, the users can go anywhere and start selecting stuff, for example those two, and as you can see, the dashboard is reacting. Let's deselect. All right. So now let's do the same for our top list. Let's use it as a filter.
And now we can select our top customer, and we're going to have a quick analysis only for this customer, which is really nice. So let's deselect that. With that, we are done with the interactivity inside our dashboard. Now we move on to the last step, where we're going to work with the icons in order to make navigating between our two dashboards very easy. Okay, so now let's fix this icon over here. Double-click on it. Now we can finally see that it navigates to the Customer dashboard. And since we are at the Customer dashboard, we're going to show an icon that looks like an active icon. In order to do that, let's choose the icons. As you can see, this one is going to be the active icon when the user selects the Customer dashboard. So let's select that. Now everything looks good, so let's click OK. With that, you can see we have a new icon that indicates we are now at the Customer dashboard. All right. Next we're going to fix the Sales dashboard icons over here. So let's go inside it and navigate to the Customer dashboard. And let's choose the one that is not active. So we're going to select this icon. All right, so that's all. Okay, so now let's go to the Sales dashboard over here and change it to an active icon. We're going to choose this one over here, Sales dashboard active. So select that and click OK. All right. So that's it. With that, we have fixed the icons. On the Sales dashboard, the Sales icon is going to be active. If you go to the Customer dashboard, it's going to be exactly the other way around. All right. Okay. So with that, we are done with the second dashboard inside our project. Let's test everything. So let's go into presentation mode over here and check the data. All right, so now we are at the Customer dashboard. Let's click on this container over here. As you can see, everything is working nicely. So now let's switch back to the Sales dashboard.
So let's click on this icon. And now, as you can see, we are back at the Sales dashboard. So with that, the user does not have to go to the tabs to switch between those two dashboards. The users can just click on those icons in order to switch between the two dashboards. And with that, I'm really happy to announce that our project is completed and we have fulfilled all the requirements. I will leave this project on Tableau Public, or you can get it from the download link. All right, so with that, we have completed our Tableau project, and we walked through all the phases that I usually follow in order to implement any Tableau project from scratch, from the requirements until the delivery of the dashboards. And here again, my recommendation is to not rush the project by immediately building charts and dashboards without having a clear, organized plan. So do it step by step in order to deliver clean work.

199. HR Project | Introduction: Friends, so today we're going to implement an amazing Tableau project, where we're going to build an HR dashboard using Tableau. And what's special about this project is that you will not only learn how to use Tableau in order to create visualizations, but you can also learn how I usually implement professional Tableau projects at my work. If you are new here, welcome. My name is Baraa, and I lead Big Data and BI projects at Mercedes-Benz. I'm here to share everything that I know about working with data. So make sure to subscribe so you don't miss anything. In this Tableau project, I'm going to guide you step by step, starting from the user requirements. Then we're going to draw the concepts and the mockups of the dashboards, and at the end, we're going to have a fantastic dynamic dashboard using Tableau. That means by the end of the project, I'm going to leave you with a Tableau dashboard and also real-life skills on how to implement Tableau projects, my friends.
Before we jump to the project, I would like to take a moment and say the following. Everything in this project is for free. I also highly recommend that you follow along with me in this project, step by step, because just sitting and watching will not really help. You have to get your hands dirty. And, hey, this is your project, so feel free to share it on any platform you want, like LinkedIn or Tableau Public, as a portfolio. So that's all for now, let's jump in and get started with the project. Now, my friends, at the start of each project, I first decide on the coloring. The first decision that I make is whether we want to have a dark or a light theme in the dashboard. And since the last sales project was a light theme, this time we're going to go with the dark theme. After that, we have to decide on four colors, not more, and we divide them into two categories. The first category is the basic category, and here we have two colors, black and white. Usually, I go with gray coloring, so we have a dark gray and a very light gray. The second category is the custom category, and here we have the two colors of our own style. So for this project, I'm going to go with green and pink. But wait, wait, here we have an issue. My wife said this is not green, this is Persian green, and the other one is not pink, this is royal fuchsia. So sorry. All right. So those are the colors that I've decided on for this dashboard. Of course, you can add your own style. You don't have to follow my coloring. All right, friends, Tableau projects have mainly three phases. The first one is preparing our data, where we connect our data to Tableau using a data source. We always have to do this step before building any charts or doing any analysis. In the second phase, we're going to build many different charts and visualizations based on the user requirements.
And in the last phase, we're going to put all the charts into one single consolidated dashboard. This phase includes a lot of formatting and refining in order to make the dashboards user friendly and effective. So let's start with the first phase, where we're going to build the Tableau data source for our project.

200. HR Project | Build Data Source: All right, friends, now we're going to build the data source for our project, and here is what we're going to do. As the first step, we need data. We're going to download the data for the project, and then we're going to connect the data to Tableau using a data source. After that, we're going to check the quality of the data and the data types. And as the last step, we have to understand and explore our data before building any visualizations. Okay. For the first step of building a data source in Tableau, we have to get the data. To be honest, I've checked a lot of projects and datasets, and I didn't find anything that is suitable for this project. That's why I decided to generate my own data. Of course, I have a personal assistant to help me with this task, and that is ChatGPT. I asked ChatGPT to generate Python code in order to generate a dataset. After a long chat and some back and forth, I finally got really nice Python code that uses the library Faker in order to generate data. If you want the Python code that I've used and the prompts for ChatGPT, you can find everything in the project link. Friends, as you can see, ChatGPT here helped me generate a dataset for practicing. Now let's get the data. In the video description, you can find a link to this page, where I've collected everything that you need for this project. As you can see here, we have a ZIP folder where you have all the files for this project, and if you scroll down over here, we have the user story for this project.
Here we're going to build a Tableau dashboard for human resources based on those user requirements. Let's download the ZIP folder, it's over here. Let's click on it, and you will have it in the Downloads folder. As the next step, we can right-click on it, choose Extract All, and then Extract. We have it over here. Now, what I usually do is move this folder somewhere else, because I tend to clean up the Downloads folder, and if you lose the connection between Tableau and the data, you will get a lot of errors. Let's do that. I will just copy it and put it somewhere like here. Now let's go inside it and check what we have. We have icons and images, so you can find all the stuff that we need later for the dashboard. You can also find the Tableau project file, and of course, you can download it from Tableau Public. And here we have our data, HumanResources.csv. This is the data of our project, and you can find the dashboard mockups that I've created using draw.io. All right. So with that, we have our data for this project, and as the next step, we're going to connect Tableau to our data. All right. So as the first step, we're going to start Tableau Public. We are now at the landing page. Let's connect to our file using Text File. Then we're going to open the downloaded data, HumanResources.csv. Let's open it. Now, usually, the next step is to build a data model from the files. But for this project, we have only one file. That means we don't have to worry about relationships, joins, unions, and so on. Our data model has only one table, one file for the whole project. As the next step, we're going to check the quality of the data inside this table. The first thing is, of course, that if you are using text files, the column names should be correct. We can see over here that everything looks fine, right?
We have employee ID, first name, last name, gender, state, and so on. So the names look okay. And if you don't have it like this, we have to check the properties of the file. In order to do that, right-click on the table. Usually in text or CSV files, the first row should hold the field names, or the column names. So make sure this is checked, and then we're going to go to this option, Text File Properties. Let's go inside it. Here, it's very important that you have the same setup that I'm showing now. The field separator should be the semicolon. If for any reason Tableau selected something else, make sure to select Semicolon. The third option is important too: it is the encoding of the file, and it should be UTF-8. If you have the options like this, you should be safe, so let's close it. That means Tableau is reading the file correctly and the column names are correct. As the next step, we're going to check for each field whether Tableau assigned the correct data type. Let's have a look. The first column, the employee ID, is a string, and that is correct, because here we have a character between the numbers, so we cannot have it as a number. First name, last name, gender: all of that information has characters inside, and of course, it is a string. Let's move to the right side. Now we can see we have two columns about the locations. As you can see, Tableau assigned these correctly to a geographic role. If you don't have it like this, it's very simple. Go over here to this icon, where we have the option Geographic Role, and make sure that you assign it to the correct information. Now, let's keep moving. We have here the education level, which is correct. It is a string. After that, and this is very important, we have several dates. We have the birth date, the hire date, and the termination date, and all of them have the correct data type. Now let's keep moving to the right side.
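The text file properties described above (first row as header, semicolon as field separator, UTF-8 encoding) can be checked outside Tableau as well. The following is only a minimal sketch using Python's standard `csv` module; the sample rows and the column names `employee_id`, `first_name`, and `state` are hypothetical and are not taken from the course file.

```python
import csv
import io

# Hypothetical sample; the real file's headers and values may differ.
SAMPLE = "employee_id;first_name;state\nE-001;Ana;New York\nE-002;Ben;Texas\n"


def read_semicolon_csv(stream):
    """Parse semicolon-separated text whose first row holds the column names."""
    return list(csv.DictReader(stream, delimiter=";"))


rows = read_semicolon_csv(io.StringIO(SAMPLE))
for row in rows:
    print(row["employee_id"], row["state"])

# For a file on disk, open it with the matching encoding and pass the handle in:
#   with open(path, encoding="utf-8", newline="") as handle:
#       rows = read_semicolon_csv(handle)
```

If the separator were wrong, every row would collapse into a single column, which is the same symptom you would see in the Tableau preview.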
And as you see, we have department and job titles, all of them strings, and we have salaries. The salary is the only field inside our dataset that has the data type number. The last one is the performance rating. It is a string, which is correct. As you can see, Tableau did a wonderful job mapping the correct data types to the columns, and having the correct data types is very important in your project in order to do the calculations correctly and to have good data quality inside your dashboards. So, good: we have built our data source and everything looks really great. As the next step, before I start building any charts, I would like to understand and explore the data. What I usually do is create a sheet over here, and then I start dropping fields onto the sheet in order to explore the data. For example, which departments do we have inside the data? As you can see, we have seven departments: customer service, finance, HR, and so on. Then, what is interesting is, for example, the job titles. Drop it over here. Now we can see all those job titles, but we can also understand that there is a relationship between the departments and the job titles, right? So what we can do over here, if you have a relationship between columns like that, is create a hierarchy. Let's do that. It's very simple. Let's take the job title and drag and drop it on top of the department like this. Then you have to assign a name for it. I'm just going to leave it like this. Let's click OK. Now on the left side, we have a hierarchy that starts with the department and ends with the job title, so the order of the hierarchy is correct as well. Let's keep exploring. Let's get the education level, for example, over here. We can see there is not really a relationship between the education level and the job titles and departments. I'm going to drop it in the view in order to see.
In our data, we have four education levels: bachelor, high school, master, and PhD. As you can see, we are just browsing and exploring the data. My recommendation now is to pause the video and go through all the fields. Only after we understand the content of the data are we going to proceed with the next steps. I hope that we now have a better understanding of the project data, and with that we have a solid data source in order to start building charts in Tableau.

201. HR Project | Build Charts - Part1: All right. So now we're going to build the charts for the first dashboard, the summary dashboard, and here is what we're going to do. First, we have to analyze and understand the requirements in order to decide on the charts. After that, only one time, we're going to do some initial steps by formatting the worksheet in order to use it as a template. After that, we have to make sure that we have all the dimensions and measures needed to build the charts, and if not, we have to create calculated fields. Only after that can we build our charts. As the last step, we have to take care of the format. So now let's start with the first step, where we have to analyze and understand the requirements and decide on the charts. Okay. So the first step, before building anything, is to understand the requirements. So let's have a look at the user story. What do we have over here? We have to build a dashboard for the HR managers in order to analyze the human resources data. And we have to provide them with two views: a summary view for high-level insights, and a detailed view that shows a list of employee records for in-depth analysis. So that means we might end up building two dashboards, but we will see. Let's start by focusing on the first section, the summary view. The summary view should be divided into three main sections. This is about the dashboard.
We should have an overview section, demographics, and the income analysis. The first requirement, for the first chart, is to display the total number of hired, active, and terminated employees. It sounds like we have different statuses for the employees: active and terminated. As the next step, we're going to decide on the chart type. Since we are talking about the total number of employees, it's like a big number that we should present in the dashboard, so we can use BANs. BANs are a great way to highlight the big numbers, the big measures of our data, in the dashboard. Back to Tableau. But before we start implementing any requirement, before we build any sheets or charts, we have to do an initial step, and that is formatting the first sheet to be used as a template for all other requirements and all other sheets. That means we're going to define the background, the colors, the fonts, everything, in advance. That's of course better than creating the sheets from scratch each time. For the first preparation, we're going to go to Format in the menu over here, and then to Workbook. Now we're going to define the font for the whole project. Let's go over here to All and then open the dropdown list. For this project, I've decided to go with Trebuchet MS. Let's select it. Now everything that I'm creating in dashboards and sheets is going to use this font. All right. The next step is to add the colors that we have defined for this project. Let's go to the Marks card over here and select Color. Let's go to More Colors. Now we're going to add our four colors. Let's start with the first cell over here: click on it, then add the color code, and with that, we have the green color over here. Then let's click Add to Custom Colors.
This, of course, helps us to have quick access to the colors that we defined for the project. Now let's add the second color. Again, the same steps: let's select the cell below it and add the code, and with that, we have the pink color. Let's click on Add to Custom Colors. The next two colors are going to be our basic colors. Select the next cell, add the code, and with that we have our gray, and then Add to Custom Colors. Now let's add the last one. The fourth one is going to be the light gray, and again Add to Custom Colors. With that we have our custom colors to be used in the whole project, those four colors. Let's hit OK. What we're going to do now is define the default font color for the whole project. Again, we're going to go to the font over here, then to More Colors, pick the gray, and then select it. So that's all for the colors and for the fonts. The next step is to define the color of the background. As we decided at the start, this project is going to have a dark theme. Let's go again to Format and then to Shading, then go to Worksheet over here and pick the first dark color. Now let's move to the next step. We want to change how the sheet fits the view. For dashboarding, it's always good to have it as Entire View. By default, Tableau shows it as Standard, so let's change it to Entire View. Let's click on that. With that, the chart can always take the whole space that is available in the view. Now maybe one more thing, about the title. We don't want to show any titles in our dashboards. We're going to create our own style. So right-click on it and choose Hide Title. All right, so with that we have done the initial steps, and we now have a template to be used for all other sheets. Now I would say let's save our work, and this is a really amazing new feature from Tableau.
We are now allowed in Tableau Public to store and save our work locally on our PC without publishing. Let's do that. This saves a lot of time. Let's go to File over here and Save As, and then we're going to go to the types over here and make sure that we select Tableau Packaged Workbook (.twbx). Now we can see over here that we have a second option called Tableau Workbook (.twb). I have a dedicated video explaining the differences between them, but we will go with the packaged workbook, because I would like to have everything: the data, the data source, and the visuals. If you go with the second option, you will not save the data. You'll be saving only your work, and it's going to be really hard if you lose the connection to the data. Let's store everything in one file and choose the Tableau Packaged Workbook, and let's give it a name: HR Dashboards. Let's save it. And with that we are done, so let's start implementing the first requirement. All right. So now, as the first step, we're going to ask ourselves: do we have all the data needed to build our visual? So what do we need? We need the total hired employees, the total active employees, and the total terminated employees. Now if you check our data over here, we don't have any information about the status of the employee, right? That means we now have to create calculated fields in order to derive and generate that information. The first one is total hired employees, which is the number of records available in this dataset. We have this as a default field over here, but I would like to create a new one. Let's go ahead and create a new calculated field. Let's give it the name Total Hired, and this is going to be very easy: it's going to be the COUNT function on the employee IDs. So that's it. Let's go ahead and click OK. Now the next one: we want the total number of employees that are terminated. We have to take a look at our data in order to choose a column to build this logic on. We have here the termination date.
The logic can be very simple: if we have a termination date for the employee, then this employee is terminated. Otherwise, the employee is active. Let's create this logic. So let's call it Total Terminated, and now we're going to have the following logic. Since it's logic, we're going to use the function IF: IF NOT ISNULL for the termination date. So we are saying: if the termination date is not null, meaning we have a value inside it, what should happen? THEN show the employee ID. And that's it, so let's add an END. That means if it is null, so we have a null value inside it, we will get a null as well. Let's test the logic. I'm going to just click OK. And of course, in order to test stuff, I'm going to have a test worksheet to check the data. So I need the records of the employees. Let's get the employee ID, and yes, Add All Members. Now let's bring the termination date over here as well, and add our new field Total Terminated to the output. Now as you can see over here, we have all the employee IDs. This is normal, and then we have the termination date. You can see that if it is null, then our new field is going to be null as well. Since we don't have termination dates for those employees, they are active, so we have nulls here. Only if we have a date is our new field going to show the ID. We are doing that because we want to count how many IDs we have inside this new column. That means our logic is working. What we're going to do now is edit the calculation again and, on top of it over here, just add COUNT. So we are counting how many employee IDs are returned by this logic. That's it. This is the total terminated. To get the total active employees, those that are hired and not terminated, we're going to use exactly the same logic, but the other way around. Let's copy everything from here and click OK.
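In Tableau, the field described above could be written roughly as `COUNT(IF NOT ISNULL([Termination Date]) THEN [Employee ID] END)`, where the field names in brackets are assumptions about how the columns are called. To make the counting rule itself easier to follow, here is a small Python sketch of the same idea on hypothetical records: an ID is counted as terminated only when a termination date exists, and as active only when it is missing.

```python
from datetime import date

# Hypothetical records: (employee_id, termination_date); None means "no termination date".
employees = [
    ("E-001", None),
    ("E-002", date(2021, 5, 31)),
    ("E-003", None),
    ("E-004", date(2023, 1, 15)),
]


def total_hired(records):
    """Count every employee ID in the data set."""
    return sum(1 for emp_id, _ in records if emp_id is not None)


def total_terminated(records):
    """Count IDs only where a termination date is present."""
    return sum(1 for emp_id, term in records if emp_id is not None and term is not None)


def total_active(records):
    """Same rule the other way around: count IDs where the termination date is missing."""
    return sum(1 for emp_id, term in records if emp_id is not None and term is None)


print(total_hired(employees), total_terminated(employees), total_active(employees))
```

A quick sanity check that also works on the real data: terminated plus active should always equal hired, because every employee falls into exactly one of the two groups.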
Of course, the pill in the view turns red, because Tableau was using the field as a dimension and that no longer works. So let's remove it. One more thing: as you can see here, Total Terminated is shown as a blue pill. Let's convert it to continuous, because it is a measure, not a dimension. Now let's create our third one, which is going to be Total Active. Let's use the same logic. But before we start counting, I'll remove that part, because I would like to test the logic first. So IF ISNULL: if the termination date is empty, then show the employee ID. Let's test it. I'm going to click OK. And the same thing, let's drop it into the view over here. Now as you can see here, we have the exact opposite. If the termination date is empty, the employee ID is shown. And if we have a value, like here for this employee, then no value is shown. Now, the same thing: we're going to summarize all those values. So let's edit it again and add a COUNT, like this, and hit OK. Again, it will not work over here, and we have to change it from a blue pill to a green one, to continuous, as well. With that we've got our three new measures that we're going to use inside our BANs. Let's go back to our template over here. Since a BAN is only one number, we don't need any dimensions in the view. Let's remove the education level. The first one is going to be Total Hired. Let's drop it on Text. Of course, I would not leave the mark type as Automatic. I'm going to make sure it's always Text. Our number is here on the right side. Let's change the setup. Let's first go to Text and click the three dots, and now we're going to change the font size to 18 and the color to our light gray. Let's hit OK. Now we still have it on the right side, but it's way bigger than before. Let's go to the alignment and set everything to center and middle. That's it.
This is the first big number from our dataset: the total number of employees inside our dataset is 8,950. Let's give the sheet a name as well. It's going to be the BAN of hired. So we are done with the first one. Let's go to the second one. We want to have the total active. Instead of creating a new sheet from scratch, we're going to duplicate this one. So right-click on it and choose Duplicate. What we have to do is take Total Active, drop it on Text over here, remove the old one, and go inside the text editor to make sure that everything is fine. We have a new line at the start here, so let's remove it and hit OK. That's it. Let's give it a name: this is the BAN of active. Now, let's create the last one. Let's duplicate it again. This is the BAN of terminated. Let's drag Total Terminated to Text over here, drop the old one, and remove the new line as well. That means the total number of terminated employees inside our data is 966. All right. So those are the three big numbers, the three BANs for the first requirement: the hired, active, and terminated employees. All right. Moving on to the next requirement, and it says: visualize the total number of employees hired and terminated over the years. We have to display how the number of employees is developing over time, and the best type of chart for this type of analysis is the line chart. You can go with a bar chart as well, but the line chart is the best for visualizing a trend over time. So back to Tableau, let's create our line chart. What we're going to do at the start is duplicate one of those sheets in order to have the same style, and then let's rename it. It's going to be Hired by Year. Let's remove the measure over here, and now we have an empty chart. Since it's over time, we need a date field, and this is going to be the hire date.
Let's drag and drop it to Columns over here, and then the next one: we need a measure, and it's going to be Total Hired. Let's drop it on Rows. Of course, our chart is a line chart. Let's go to the Marks card over here and make it a line. Now, looking at the chart, we have a lot of unnecessary space over here that we don't need. Let's edit this axis and uncheck Include Zero, like this. Now the data looks way better. As the next step, we're going to edit the design of this chart. First, let's go to Color over here and pick our color, so More Colors, and let's pick the green. As the next step, I would like to highlight the whole area below the line. Let's add an area chart below it. It's just for the design. In order to do that, go to our measure, hold Ctrl, and drag it to duplicate it as a second measure. With that, we have, of course, two charts. One is going to stay as a line, but the second one is going to be an area chart. Let's go to the second one over here and change the mark type to Area. The next step is to merge those two charts into one using a dual axis. Let's go to the right measure over here and choose Dual Axis. Of course, now things are not matching, because we have removed the zero on only one axis. Let's go to the right axis, right-click on it, and choose Synchronize Axis. Now the line chart exactly matches the area chart. Now we can get rid of all those lines and stuff, so let's remove the headers from the left side, and from the years as well. And we want to get rid of all those grid lines. So right-click over here and go to Format. Now we go to Lines and then to Rows. Let's remove the grid lines by setting them to None. But now, looking at the chart, there is something like a white box around it. What are we going to do? We're going to go to Borders over here, then go to Sheet, and remove everything from here.
So remove the row divider and the column divider as well. With that, it looks really clean, but it still does not look like a line chart. It looks like an area chart. Let's change that. Let's go to the area chart, go to Color, and reduce the opacity to 15%, like this. One more thing: we can reduce the size of the line. Let's go to the line over here and make it a little bit thinner. I'm happy with that. It looks nice. With that we've got the total hired employees over time. Now we need the same chart, not for the hired but for the terminated employees. What we can do is duplicate this sheet, and let's give it a name. It's going to be Terminated by Year. And of course, we have to change all that information. We have to replace the hire date with the termination date. So let's replace it. You can drop the new field on top of the old one in order to replace it. Now we have the termination date instead of the hire date, and now we have to replace the measures as well. We need Total Terminated on top of the first one, and the same thing on top of the second one. Looking at the data, we have nulls here, because we have employees without any termination. We don't need that. Let's hide it: right-click on it and click Hide. We don't need to remove any zeros, because the first value is one and it's very close. We are fine with that. Let's hide all that information left and right, and from here as well, so remove the headers. Now let's change the color of this one as well. Instead of green, we can have pink for the terminated employees. Let's stay on All, then go to Color and More Colors, pick our second color over here, and click OK. With that, we are applying the same color to both charts, the line and the area. All right. We are almost there, but there's a white dotted line over here. Let's remove it. Let's go to Format, and I believe it is a line, and it is the zero line.
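Both trend charts rest on the same aggregation: group the employees by the year of a date field and count them, skipping employees whose date is missing (the nulls that are hidden in the terminated chart). The following Python sketch shows that aggregation on hypothetical dates; it is only an illustration of the idea, not how Tableau computes it internally.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (hire_date, termination_date); None means still active.
records = [
    (date(2019, 3, 1), None),
    (date(2019, 7, 15), date(2021, 2, 28)),
    (date(2020, 1, 10), None),
    (date(2021, 6, 5), date(2021, 12, 31)),
]


def count_by_year(dates):
    """Count dates per calendar year, ignoring missing (None) values."""
    counts = Counter(d.year for d in dates if d is not None)
    return dict(sorted(counts.items()))


hired_by_year = count_by_year(hire for hire, _ in records)
terminated_by_year = count_by_year(term for _, term in records)

print(hired_by_year)
print(terminated_by_year)
```

Note that years without any hires or terminations simply do not appear in the result, which is also why the termination chart can start at a different year than the hiring chart.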
Let's go to the sheet and remove the zero lines, and let's set it to none. Perfect. With that we are done, we have now the total terminated employees over time by year. With that, the requirement is solved. Let's move to the next task and it says, present a breakdown of total employees by department and job titles. This means we have to go and analyze and compare the values between different categories, the departments. That means we are talking about the category magnitude, and the best chart in this category is the bar chart. Now, my friends, if you need deeper knowledge on how to choose the correct chart, I have made a dedicated tutorial about this topic, explaining the different types of chart categories, when to use which category, and what is the best chart for each category. So now let's go and build a bar chart for this requirement. Let's go and build it. We're going to duplicate as usual, and let's give it a name. It's going to be the departments. And as well what we're going to do, we're going to go and remove everything, all those dimensions and measures. Now, it's very simple. Let's go and get the departments to the rows, and we need the total hired on the columns. Of course, we have to go and change the marks to bars. Now, of course, because of the previous charts, we go and change the opacity to 100%, and as well, let's go and pick the green color for this chart. Now since we are using the bar chart, it would be nice if we go and sort the data. Let's go to the axis over here and click on sort. With that it is descending, we have the department with the most employees first, and the last one has the fewest. Now since we are using a bar chart, it looks like a ranking. We are ranking the departments by the employees. We can go now and add a nice index, a nice rank number near those departments. 
In order to do that, let's go to the rows over here to the empty space, double click on it, and now we can go and use the function INDEX. We can use it for ranking. So let's go and hit OK, and of course, it breaks everything because it's a measure. Let's go and convert it to discrete. Now as you can see, we have a nice rank for those departments, so we have 1, 2, 3 and so on. We can go and move it to the left side of the names of the departments, and it's like a quick indicator for the ranks. Now let's go and format the chart by removing all that unnecessary stuff. We're going to go to the axis, remove the header. Let's go to this department over here, right click on it and hide field labels. Of course, we're going to go and remove all those lines. Let's go to format, and now let's go to the left side to the lines. Let's go to columns and set the grid lines to none. All right. So that's it. Now we can see the total number of employees by department, and we have a nice rank for it. Okay. Moving on to the next requirement, it says compare the total employees between HQ and the branches. And here as an info, New York is the HQ. It's like the previous analysis where we have to compare the values between different categories, the HQ and the branches, and the bar chart here is the best type of chart for this analysis. Now let's go and create it as usual, we're going to create a new sheet by duplicating any of the previous ones. Let's call it location. And of course, the first question is, do we have the information in the dataset? We don't have any fields about the HQ and the branches. About the locations, we have only two pieces of information, the city and the state. But in the requirement, we have a hint where it says the state New York is the HQ. That means all the other states are branches. So again, we have to go and create this logic. So let's go back to our test sheet over here, and let's go and get the states to the list. 
And now we're going to create very simple logic where we are checking the value of the state. If it is New York, then it's HQ. Otherwise, it is a branch. So let's go and create a new calculated field. Let's give it the name location. And now since we are evaluating a value from a column, we're going to go and use the logical function, the CASE statement. So we're going to say CASE. And then what are we evaluating? We are evaluating the state, right. Let's write state. Now let's evaluate the first value, which is New York, right. Make sure to write it exactly like we have it in the dataset. So the first letter is capital, and as well here. What happens if the state is New York? Then it is the HQ, right? It's like this. Now if the state is not New York, then it is a branch. So we're going to go and use the default, ELSE, like this, and it's going to be the branch. So that's it, and don't forget to add an END like this. So let's go and hit OK. Now with that, we got a new field called location. Let's go and test it, of course, on the right side over here. Now we can see in this field, we have branches and HQ next to all the values of the states. I don't want to see all the employees, so let's go and remove all that information, and now we can see very nicely how the states are mapped to the location. So only New York is HQ, all other states are branches. Now we have the field that we need for the requirement, right. Let's go back to the locations sheet over here, and let's get rid of those dimensions. We don't need them. We're going to stay with the total hired, but now we need our new calculated field on the rows. Now, I would like to go and swap this chart so that we have the locations on the columns. So go and click on this. And they are swapped. That's it, as you can see, we can now go and compare the total employees between the HQ and the branches. As you can see, in the HQ we have way more employees than in the branches. 
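The CASE logic described above can be sketched outside Tableau as a small Python function. This is only an illustration of the mapping, not the Tableau formula itself: the function name `location` and the sample states are assumptions, and the exact, case-sensitive match is an assumption that mirrors the advice to type the value exactly as it appears in the dataset.

```python
def location(state: str) -> str:
    """Mirror of the CASE logic: New York is the HQ, every other state is a branch."""
    # Exact match on the spelling used in the dataset (assumed to be "New York").
    if state == "New York":
        return "HQ"
    # Default branch of the CASE statement (the ELSE part).
    return "Branch"


# Hypothetical sample values, only to show the mapping.
states = ["New York", "California", "Texas", "new york"]
mapping = {s: location(s) for s in states}
```

Note that in this sketch the misspelled `"new york"` falls through to `"Branch"`, which is the kind of silent mismatch the exact-spelling advice is meant to avoid.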
Of course, now, the next step, we're going to go and change the design over here. Let's take the location and put it on the colors by holding control, of course. Then let's go to the colors and edit colors. Now, let's go to the HQ, double click it in order to get our green, and as well the branches, double click, and let's get the gray for the branches. I would like to sort the data the other way around. I would like to have the HQ first, then the branch. Let's go to the location, right click on it. Then go to sort, and we're going to go and sort it manually. I would like always to have the HQ on the left side, so HQ on top and then the branches. Now let's go and remove some header information from here. Of course, as usual, we're going to go and get rid of those white lines. Let's go to format, and then let's go to the lines and then here, the axis rulers. Let's go and select none. As well, I'm going to go to the next one, axis ticks, and let's set it to none as well. Now on the right side over here, you can see we have a legend, we're going to go and hide it since we want to design our own legends in the dashboard. Let's go over here to this small arrow and hide card. So that's it for this requirement. Okay, let's go to the next requirement, and it says, show the distribution of employees by city and state. Now since we are talking about location information like the states and the cities, here we are talking about spatial analysis. And of course, maps are the best visual for this type of analysis. All right. So now let's go and create a map in Tableau. We're going to go and duplicate the sheet in order to have the same design. Let's give it a name. Map states. Let's go and remove everything in order to start from zero. Now in order to plot a map in Tableau, we have to go and get those two pieces of information, the longitude to the columns, and the latitude to the rows. With that, Tableau is going to plot the world map in the view. 
Now what do we need? We need the locations. Let's go and get the state first to the details. Let's drop it over here. And now depending on your location, you're going to get different results. For me, since I'm now in Germany, it's going to say you have now eight unknown. How are we going to solve it? We're going to go to the map in the menu over here, and then we're going to go to this option, edit locations. Let's go there. Now it's currently set to Germany, I'm going to go and change it to USA. Let's search for USA and that's it. Now as you can see, we have everything mapped correctly between my locations and the information from Tableau. If you hit OK over here, the unknown stuff will disappear. Let's go and do that. Now as you can see, Tableau understood the information and zoomed into the USA. But here we have very funny bars on the map. It's not correct. Let's go to the marks over here and switch it to a map. Now as you can see, Tableau is highlighting the states from our data with a green color. So now I would like to go and change the design of this map. Let's go to the menu and then map, and then we're going to go to this option, background layers. Since the style of our dashboard is going to be dark, I'm going to go and change the style from light to dark, and I would like to go and get rid of all the information that I don't need. Let's go and deselect everything from the layers. So we don't need anything. With that I'm happy, we got a very clean map with only the states and information that we need. Now let's go and add the stuff that we want. The first thing that I would like to add again is the name of the states. So hold control, drag and drop the state to the labels. Now with that, we got only the states from our data highlighted in the map. The next step, I'm going to go and change as well the color based on the hired employees. Let's close this over here and get the hired employees to the colors. 
Now Tableau is using other colors than we want, let's go to the colors, edit colors. Now instead of having automatic, we're going to have our custom coloring, right. So let's go to the blue over here, click on it, and we're going to have our green again. That's it. With that we got our coloring. Now it's really bright, what I'm going to do, I'm going to go to the colors again, and let's go and reduce the opacity. Let's just reduce it and maybe more. Let's go and reduce more to maybe 30. All right. What else can we do? We can just highlight the borders. It looks really nice. Let's go to border and choose this color over here, and with that we have nice borders between the states. That's it, we have now the total employees for each state, but now we have to have it as well for the city, right. Let's go to the city over here and add it as a new layer on top of our map. So let's drop it over here. Now we don't have enough points. What we're going to do, we can add as well the states to the details. Now with this, Tableau is able to map all the cities to the states, and as you can see, we have those small circles. Now let's go and add, for example, the total hired to the size. If the circle is bigger, that means we have more employees, but I would like to increase it a little bit more like this, maybe. As well, let's go and add the coloring. Maybe we're going to go with the location information. Let's go and get the locations to the colors. That means the gray dots are the branches, and only the green one is the HQ. Now, let's go and change the design of those circles a little bit. Let's go to the colors. Now let's go and add the border for it. Using our colors, it's going to be the green one. Then let's go and reduce the opacity, maybe something like this, way back to around maybe 30. All right. I'm happy with that. On the right side, as you can see we have those legends. Let's go and remove them. So hide and as well hide. So far, I'm happy with this design. 
We got the total employees by the states and as well by the cities and we fulfilled the requirements. 202. HR Project | Build Charts - Part2: So with that we have covered all the requirements of the overview section. Now let's move to the next one. We have the demographics. The first requirement in the demographic section is present the gender ratio in the company. We have to analyze the gender proportions in our data and we call this type of analysis part-to-whole analysis. And the pie chart is a wonderful chart in order to do this type of analysis. Okay, let's create a pie chart in Tableau. We can go to the locations over here and duplicate it in order to use the same setup. Move it to the right side, and let's give it the name gender, like this. Let's get rid of all that information to start from zero. Of course, the question is, do we have the data? Well, yes, we have the gender information in our data, so we don't have to go and create any calculated field. Let's start with the marks. I would change it from bar to pie. Now in order to create a pie chart in Tableau, we have to go and do some tricks. Let's go to the columns, double click on it, and let's write the average of zero. It is a placeholder for a visual or chart in Tableau. Now for the pie chart, I have a full detailed video on how to create it step by step. Now we have to do it a little bit quickly. For the pie chart, we need two circles, one for the inner circle and another one for the outer circle. That means we need two visuals, and that's why I'm going to have two placeholders for it. So hold control and duplicate it. With that, we have two circles and now let's go and have a dual axis for both of them and make sure to synchronize the axis and as well to hide it, and from below as well. Now we have two circles on top of each other. Now let's go and configure them. Let's go to All first, to the size, and make it a little bit bigger like this. Here we have two marks. 
The first one is for the outer circle, and the second is for the inner circle. In order to see the coloring, we're going to go and change the inner circle to something dark, and as well what we're going to do, we're going to go to the size over here and reduce it in order to see it. As you can see, we have already a pie chart, right. Now, usually in the pie chart, we show the total aggregation in the middle, and that is the total hired. Take the total hired and put it on the labels over here. Now as you can see, we have a very nice number in the middle. Now let's go and configure the outer circle, right. Let's go to the first chart over here. Of course, we want to divide the chart by the gender. Let's go and take the gender and put it on the colors. Now let's go and edit the colors. Now, of course, I will not go with pink and green because pink means terminated employees in our dashboard and we cannot use it over here. We're going to stay with the green. Let's go to male over here. Let's go and get the green, but this time I'm going to make it a little bit darker like this. And then hit OK. Now let's go to the female. We're going to take green as well, but make it lighter. Maybe something like this, way lighter. As you can see, the circle is split into two parts. Now we need as well a few pieces of information on top of this circle. Let's go and get the gender, or let's copy it from here, hold control and put it on the labels. As well, we need the percentage of the employees. Let's go and get the total hired to the label over here. But we don't need it as an absolute number. We would like it as a percentage. Right click on the measure, and let's go and have a quick table calculation, percent of total. With that we got a percentage for male and female. I would like to round those numbers. Again, let's go to our measure and format it. Then let's go to the left side over here, instead of automatic, let's go to percentage and reduce the decimal places. 
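The percentage label described above amounts to each slice's count divided by the total of all slices, then formatted without decimals. Here is a minimal Python sketch of that arithmetic; the function name `percent_of_total` and the employee counts are hypothetical and not taken from the course dataset.

```python
def percent_of_total(counts: dict[str, int]) -> dict[str, float]:
    """Return each category's share of the overall total (0.0 to 1.0)."""
    total = sum(counts.values())
    if total == 0:
        # Avoid dividing by zero when there is no data at all.
        return {category: 0.0 for category in counts}
    return {category: value / total for category, value in counts.items()}


# Hypothetical counts, only for illustration.
hired = {"Male": 540, "Female": 460}
shares = percent_of_total(hired)

# Format as whole percentages, similar to reducing the decimal places to zero.
labels = {category: f"{share:.0%}" for category, share in shares.items()}
```

With these made-up counts, `labels` maps `"Male"` to `"54%"` and `"Female"` to `"46%"`.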
With that, we are rounding the percentage. So as you can see in the chart we have 54 for the male and 46 for the female. It looks really nice, so let's go and close it. Now this calculation, I think we're going to need it later in other charts. I would like to have it in the data source, so that I don't have to go each time and format and create this table calculation. Let's go and drag and drop it to our data source. Now as you can see on the left side, we have a new measure called calculation one. Let's go and rename it, so let's call it percentage total hired. This is really nice in order to reuse the stuff that we have already created, and it is a new calculated field. In order to check the formula for that, let's go and edit the field, and you can see. It's very simple, the total hired divided by the total of the total hired. All right. That's it for this requirement. Now, we have a really nice pie chart in order to see the distribution of employees between genders. Wait, wait. Sorry, one more thing, we have to remove the legends, so we are not done yet. So let's go and hide them. All right. That's it. Moving on to the next requirement and it says, display the distribution of employees across age groups and education levels. Now we have to show the relationship, the correlation between two categories, two dimensions, the age groups and the education levels. One of the best charts for this type of analysis is the heat map, in order to show the relationship and correlation between two dimensions. Okay, let's go and build the heat map. As usual, we're going to go and duplicate stuff. Let's give it a name. It'll be age versus education. Now let's go and get rid of everything like this. Now, the first question is, do we have all the information in the data source? Well, we have something about the education level, so we are safe with this, but we don't have ages. 
Of course, we can go and calculate the age from the birth date. Here we have the birth date information, and we can use it in order to generate the ages. We have to go to our test sheet again in order to see whether everything is working fine. Let's go and add again the employee ID in order to have the level of employees, and let's go and get the birth date to the view. Now let's go and create the logic of the age. We're going to go and create a new calculated field, and let's call it age. Now of course, how do we calculate the age? It is the number of years between the birth date and today. Let's go and do that. We have to go and subtract the birth date from today, and we can go and use the DATEDIFF function. Of course, the age is based on the number of years. We have to specify here the date part. So it's going to be year. What is the starting date? It is the birth date, and what is the end date? It's going to be the function TODAY. The TODAY function is a Tableau function that generates the current date as we are speaking now. That's it. It's very simple, right. Let's go and hit OK. With that, we've got a measure, a continuous measure, because of course, it's ages. So let's drop it to the output in order to see the results. Now we have it as a measure. I would like to have it as a dimension, so let's convert it to dimension and as well to discrete in order to see the numbers. Let's put it beside the birth dates. Now we have ages, right. I think this is the simplest one. If you check this employee over here, you can see the birth year is 2000 and we have around 24 years. Of course, if you are doing this project in 2025, you will get the age of 25. As I'm recording this video, we are in 2024. It's really interesting to know when you are doing this project, write it in the comments below. Of course, the task says, we need age groups. We don't need ages. In order to create age groups, we have to go and create again a new calculated field on top of the age. 
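One detail worth knowing about the age logic above: a year-based date difference is usually computed by counting calendar-year boundaries, so it can be one higher than the number of birthdays a person has actually had. The Python sketch below contrasts the two ideas. It is an illustration under that assumption, not Tableau code, and the function names and dates are made up.

```python
from datetime import date


def year_boundary_diff(birth: date, today: date) -> int:
    """Difference in calendar years only, ignoring month and day."""
    return today.year - birth.year


def age_in_full_years(birth: date, today: date) -> int:
    """Age counted in completed years (subtract one if the birthday is still ahead)."""
    had_birthday = (today.month, today.day) >= (birth.month, birth.day)
    return today.year - birth.year - (0 if had_birthday else 1)


# Hypothetical employee born in September 2000, checked in March 2024.
birth = date(2000, 9, 15)
today = date(2024, 3, 1)

approx_age = year_boundary_diff(birth, today)  # 24, matches "around 24 years"
exact_age = age_in_full_years(birth, today)    # 23, birthday not reached yet
```

For the course's purpose of building ten-year age groups, the approximate version is usually good enough, which is presumably why the narration says "around 24 years".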
Let's go and create a new calculated field. Let's give it the name age groups, and we're going to go and use the IF, ELSEIF statements in order to group the employees into specific ranges. Let's start with the first one, the youngest employees. All the employees whose age is below 25 are going to be in one range. We're going to say IF the age, like this, is smaller than 25, THEN they belong to the group younger than 25. Like this. Now let's go and define the second group. It's all employees from 25 to 35. So we have ten years in between. All employees whose age is greater than or equal to 25, and whose age is as well smaller than 35, like this, all belong to one group, which is 25-34, because here we are not including the 35. That's it for this group. Let's go to the next group. I'm just going to go and copy paste it over here. We will just increase the number of years to 35 and 45, and the same thing over here, 35-44. Let's go and add another group, it's going to be between 45 and 55. Let's just increase everything by ten years, and as well over here. Now let's move to the last group, to the nicest group, where we have all employees who are older than or equal to the age of 55. ELSEIF age is greater than or equal to 55, then we're going to have 55 plus. That's it. Now we have covered all the groups that we have inside our data. Let's go and validate, of course, right. Everything is valid. Let's go and hit OK. And with that we have now a new dimension, which is on the top over here, age groups. Let's go and put it in the output in order to check the results. What else am I going to do in order to test? Let's show it as a filter, and let's start with the youngest generation, the employees who are younger than 25. Now as you can see, all those ages are less than 25, which is correct. Let's move to the last one as well, to the oldest employees over here. As you can see, they are all older than or equal to 55. So, as you can see, it is working as well. 
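The grouping rules above can be written as a small Python function, mainly to make the boundaries explicit: each range includes its lower bound and excludes its upper bound. The function name and the exact label strings are assumptions for illustration.

```python
def age_group(age: int) -> str:
    """Assign an age to one of five groups; lower bound inclusive, upper bound exclusive."""
    if age < 25:
        return "<25"
    elif age < 35:   # reached only when age >= 25
        return "25-34"
    elif age < 45:   # reached only when age >= 35
        return "35-44"
    elif age < 55:   # reached only when age >= 45
        return "45-54"
    else:            # age >= 55
        return "55+"


# Boundary checks similar to the filter test in the narration.
samples = {age: age_group(age) for age in (24, 25, 34, 35, 44, 45, 54, 55, 63)}
```

Because each branch is only reached when the earlier conditions failed, the explicit "greater than or equal" check from the narration becomes implicit here; both forms produce the same groups.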
Let's check another one over here. So employees 35-44, and everything looks nice. Let's check this one, 25-34. As you can see, everything looks perfect. Now let's go back to our sheet, age versus education. Let's get first the age groups to the columns, and then let's get the education levels to the rows. Now we have our matrix. Next, in order to have a heat map, let's go and change it from pie to circles. Nothing will change, it's just to make sure we are not talking about a pie. Now of course, what controls those circles is the number of employees. Let's go and get the total hired to the size. Now we have our heat map, but as you can see, those dimensions are not sorted correctly. Let's go and sort them. Let's go to the age group, right click on it and go to sort, and then we want to sort it manually. The first is the youngest group, then 25, 35, so it looks good, let's close it. The same thing for the education level, let's go and sort it as well. As well, manual. For education, we're going to start with the high school, then bachelor, master, and PhD. Now it looks better. Let's go and close it. Now for the design, we don't have any axes or anything. I will just go and change the colors, because I would like to decide later in the dashboard. I would say let's go with the gray. Let's go and hit OK. Of course, don't forget about this legend, let's remove it, so hide it. Check the data. It's very interesting. You have the most employees in the category 35-44 as an age group, and most of them have the bachelor. So with that, we can go and analyze the correlation and relationship between the age groups and the education levels of the employees. Let's move to the next one and it says, show the total number of employees within each age group. 
Again, here we have the comparison analysis in order to compare the values within a category, and as usual, the bar chart is the best one. Let's go and build it as usual, duplicate one of those charts, and let's rename it to age groups. This one is going to be very simple, so we need the age groups, but we don't need the education level. Let's go and remove the size as well. We need the total hired on the rows, and instead of circles, we need bars. That's it. It's very simple as well. It's already sorted because I've duplicated the previous one. The sorting of the age group is correct. Let's go and hide this axis over here, and that's it for this requirement. Let's jump to the next one. It's very similar. It says, show the total number of employees within each education level. So we're going to go with the same visual, the bar chart, in order to compare the different values within a category. All right. So we're going to do the same stuff. Let's go and duplicate this one over here, and let's call it education levels, and we have to go and replace this dimension with the education level instead of age groups. We're going to have it like this. But of course, we have lost the sorting of this dimension. Let's go and sort it again. So let's go to sort, and it's going to be manual. And the high school is first, then bachelor, master, PhD, which is correct. So again, bar charts are really easy. Okay, let's move to the last requirement in this section, and it says, present the correlation between employees' education levels and their performance rating. So for this requirement, we're going to go again with the heat map, since we have to show the relationship between two dimensions, two categories. Okay, so let's build another heat map. So as usual, we're going to go and duplicate stuff, and we're going to rename it to education versus performance. So of course, the first question, do we have all that information? Yes, we have the performance and as well the education. 
So we don't have to go and create any calculated fields. So we need the two dimensions. The education, we have it already over here. Let's go and get the performance rating, and let's change the marks from bars to maybe squares like this. And let's go and get the total hired to the size. All right. So now by checking the data, we have to go and sort, I think, the performance. It's not correct. Let's go and sort it again manually. It starts with excellent, good and then satisfactory. That means we're going to have it a step above needs improvement. That looks good. Let's go and close it. Now as you can see, the biggest group is between bachelor and good, which is okay because we have a lot of employees having the bachelor compared to the PhD. Instead of having the absolute numbers, let's go and get the percentage instead, which is going to show the correlation more accurately. Instead of having the total hired, I will just remove it. Let's go and get the percentage total hired to the size. Now the percentage doesn't really make a lot of sense because here we have 72%, 65%. I think this is table across, so let's go to the measure over here, right click on it, compute using, and yes, table across. So instead of that, let's go and change the calculation to performance rating. Because we are focusing on the performance, let's go and click on that. Now it looks more accurate. If you go, for example, to the employees with PhD, as you can see, 48% of them have an excellent rating, and then the next ones, we have good, satisfactory, and as well the last one, needs improvement, only 5%. As you can see, the biggest group of employees with PhD has the excellent rating. Let's go now and check the high school. Here we can see this group is smaller compared to the PhD. We have only 13% of employees with high school education having an excellent rating, whereas we see here a big bubble, where we have 34% of employees with high school that need improvement. 
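Changing what the percentage is computed along changes the denominator. With the calculation running along the performance rating, each education level is treated as its own 100%, so the four ratings inside one education level add up to 100%. Here is a minimal Python sketch of that idea; the function name and all counts are hypothetical and do not come from the course dataset.

```python
def share_within_group(counts: dict[tuple[str, str], int]) -> dict[tuple[str, str], float]:
    """Divide each (education, rating) count by the total of its education level."""
    totals: dict[str, int] = {}
    for (education, _rating), n in counts.items():
        totals[education] = totals.get(education, 0) + n
    return {
        (education, rating): n / totals[education]
        for (education, rating), n in counts.items()
        if totals[education] > 0  # skip empty groups instead of dividing by zero
    }


# Hypothetical employee counts per education level and performance rating.
counts = {
    ("PhD", "Excellent"): 48, ("PhD", "Good"): 32,
    ("PhD", "Satisfactory"): 15, ("PhD", "Needs Improvement"): 5,
    ("High School", "Excellent"): 26, ("High School", "Good"): 50,
    ("High School", "Satisfactory"): 56, ("High School", "Needs Improvement"): 68,
}
shares = share_within_group(counts)
```

Read this way, a small group such as PhD holders can still show a large percentage in one cell, because the percentage describes the mix inside that group rather than its size in the whole company.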
We can understand from this data, which is generated by AI, that there is a correlation between the education level and the performance rating. A higher education level might enhance and increase the performance rating. But of course, this is not a rule, it depends on a lot of stuff like the field of work, the skills, and so on. It's not only the education level that's going to improve the performance, but in this data, we can see there is a correlation. Of course, one more thing before we close, we have to go and hide the legend, right. With that, we are done with this requirement. All right, friends, let's move to the third section, and we have the income analysis. So in this section, we're going to focus on the salary-based metrics, and we have here two requirements. The first requirement says, compare the salaries across different education levels for both genders to identify any discrepancies or patterns. In this requirement, we want to see the differences in salary between the different genders. This is not only correlation, we are talking as well about something called gap analysis, and the best chart, the best visual for gap analysis is the barbell chart. This is exactly why I go with the barbell chart instead of the heat map, because with the barbell chart, I can very clearly and easily show the distance between values. And as well, we can show the correlation between two different dimensions and categories. For this requirement, I will not go with the heat map, since I cannot show the distance between values, I will go with the barbell chart. Okay, so let's build a barbell chart in Tableau. We're going to go and duplicate stuff as usual, and let's give it a name. It's going to be gender versus education level. So that's it, and let's go and clean everything from here. But we're still going to need the education level on the rows because we have it already sorted correctly. What is a barbell chart? It contains two points and the distance between them as a line. 
So we need two charts, one for the line, and another one for the points. Let's go and create it. We need the salary information. So as you can see, we have it over here. Let's go and drop it to the columns, and we don't need the sum of salaries. We need the average salary, so let's go and change the calculation of the measure from sum to average. Since we need two charts, we need two measures, and we are using the same measure, so let's go hold control and duplicate it. With that we have two charts. As we said before, one is going to be a line and the other is going to be the data points. Let's start with the first one. Let's go over here and change it from square to a line. Now since we want to show the distance between the gender values, we need to go and get the gender information and put it on the path. With that we got the lines, the distance, the gap between points. Let's go and make it bigger in order to see it, to the max. So now let's move to the next one where we're going to configure the points of the genders, right. Let's go to the second mark over here. Instead of square, let's go and get the shapes. Now for the shapes, we're going to have the gender information. Let's go and drag and drop the gender to the shapes. Now as you can see, we have our two genders, but I think we have better shapes for that. Let's go to the shapes. Instead of default, let's go over here and we have already gender shapes from Tableau. Let's go over here. That's it. Let's hit OK. As you can see we have those signs, but they are really dark. Let's go and get as well the gender to the colors, so hold control and put it on the colors. As you can see on the right side, we have now those symbols, but they are really small. Let's go and change the size of that, something like maybe to the middle. Like this. Now the next step, we're going to go and put everything in one chart. Now they are split. 
Let's go to one of those and use the dual axis and make sure that we synchronize the axis as well. Now we still have here a huge space that's not used. Let's go and configure the axis, edit axis, and make sure to uncheck include zero. That's it. Now it looks really nice. Now, of course, we can go and add a label for the average salary. Let's go over here, and let's get the average salary, hold control and put it on the labels. It's not really clear, so let's go and change the font. Let's go to label and go inside it. Let's go and use our second gray. Let's get the light gray. Okay. Now we can see the numbers are really big. Let's go and change the format of the salary. So right click on it and go to format. Let's go to the numbers over here, and as well to the custom number. Let's go and remove the decimals, and now the display units can be thousands. I'm still not happy about the symbols and the text. Let's go to the labels and change the alignment. Currently, it's middle center. Let's go and change it to automatic. It's way better. With that, we have the symbols and as well the numbers beside them. Of course, don't forget about the final touch. Let's go and remove all those headers from top and bottom. Let's not forget about the legends. Let's go and remove them. And now we have a very clean chart. All right. So now let's understand the result of this insight. As you can see, the average salaries of male and female with high school education are relatively equal, right. But now if you go and check the bachelor, you can see the average salary for male is way higher than for female. As you can see, the barbell chart is really amazing. You can see immediately the gap, the distance between those two values. The males are getting way higher salaries than the females with the education level of bachelor. Let's go and check another huge distance between the genders if you check the education level PhD. As you can see, we have a huge distance, a gap between the genders. 
But this time it is the other way round. On average, the female doctors are earning around 25% more than the male doctors. As you can see, the barbell chart is amazing for understanding the distance and the gap between data points, and for correlation analysis as well. This is an amazing visual, and that's all for this requirement. Friends, now we're going to move to the second requirement of the income analysis, which is the last requirement in the summary view. It says: present how the age correlates with the salary for employees in each department. This time we want to show the correlation, the relationship, between two measures, not two dimensions like in the heat map. Of course, the best type of chart here is the scatter plot. The scatter plot is amazing for showing the correlation between measures. All right, now let's go and build a scatter plot in Tableau. As usual, we're going to go and duplicate the sheet, and we're going to rename it to age versus salary. So do we have that information in our data? Well, yes, we have the age and the salary. We don't have to create any calculated fields. Let's go and clean up the sheet. Let's remove everything. We don't need all that stuff. So now let's start from scratch. Since it's a correlation between two measures, we have to go and add our two measures. The first one is going to be the salary. Let's go and drop it to the rows, and we need the ages, so let's go and drop it to the columns. Of course, we don't need the sum of salary and ages. We need the average. Let's go and change that: change the salary from sum to average, and the same for the age, from sum to average. Great. Now we have our two axes, our two measures, and make sure that we are using the shape mark type. We got it from the previous chart. Now what is missing? We need the data points, and it's going to be the job title. Let's go and get the job title and put it on the details.
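For readers who want to check the idea outside Tableau, the aggregation behind this scatter plot can be sketched in Python: one point per job title, placed at the average age and the average salary. This is only an illustration under assumptions. The rows are made up, and `scatter_points` is a hypothetical helper, not something from the course files.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical employee rows: (job_title, age, salary). All values are made up.
employees = [
    ("IT Manager", 45, 98000),
    ("IT Manager", 49, 104000),
    ("HR Manager", 31, 90000),
    ("HR Manager", 35, 94000),
    ("Software Developer", 38, 88000),
    ("Software Developer", 42, 92000),
    ("Sales Assistant", 36, 52000),
    ("Sales Assistant", 44, 56000),
]


def scatter_points(rows):
    """Return {job_title: (avg_age, avg_salary)}, i.e. one point per job title."""
    grouped = defaultdict(list)
    for title, age, salary in rows:
        grouped[title].append((age, salary))
    return {
        title: (mean(a for a, _ in pairs), mean(s for _, s in pairs))
        for title, pairs in grouped.items()
    }


points = scatter_points(employees)
for title, (avg_age, avg_salary) in sorted(points.items()):
    # x = average age (columns), y = average salary (rows)
    print(f"{title:20s} age={avg_age:5.1f} salary={avg_salary:9.1f}")
```

Putting the job title on the detail is what turns the single aggregated mark into one mark per job title, which is the grouping step in the sketch.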
Now as you can see, we got our data points, but we have a huge wasted space here, and that's because we are including the zero in the axes. Let's go and clean that up: edit axis, remove the zero, and the same thing for the other axis. Edit the axis and remove the zero, like this. Now let's go and change the shape. Instead of a circle, let's get a filled diamond, like this. Now sometimes points overlap, so it would be nice if we reduce the opacity to something like 75. Now let's go and add labels for those data points, and it's going to be the job title. Hold Control and drag the job title to the labels. Now let's go and reduce the font size, maybe from 9 to 8, something like that. Now, of course, in order to get the effect of a scatter plot, let's go and add reference lines for both of the axes. Let's go to the salary over here, right click on it, and let's add a reference line. So let's go and check the settings. It's an average line. Let's remove the label, and maybe we can have a custom tooltip like this: average, and let's go and insert the value. So now let's go and format it. It's going to be a dashed one, a thin one, and let's use our gray color, like this. So that's it, let's hit OK. And with that we have a very thin average line. Let's do the same for the ages. So add a reference line, no label, and let's add a tooltip like this: average, and the value. And the same format for the line: it's going to be dashed, thin, and in our gray color as well. So that's it, let's hit OK. With that, we have created a really nice scatter plot. So now if you check the jobs above the line, most of them are managers, right? We have the IT manager, finance manager, HR manager, and so on. So most of them are managers, but we have three types of jobs that are getting a high salary but are not managers: the software developer, the system administrator, and the finance analyst. As you can see, below the line we have different types of jobs, but none of them are managers.
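The two average reference lines split the plot into four areas, and the reading above is essentially a classification of each job title against those lines. A minimal Python sketch of that reading follows. It assumes each line sits at the plain mean of the plotted points, which may not match how Tableau computes its reference lines. The job titles echo the lesson, but every number and the `quadrant` helper are invented for illustration.

```python
from statistics import mean

# Hypothetical per-job-title points: title -> (average age, average salary).
points = {
    "Finance Manager": (50.0, 110000.0),
    "IT Manager": (47.0, 101000.0),
    "HR Manager": (33.0, 92000.0),
    "Software Developer": (40.0, 90000.0),
    "Sales Assistant": (40.0, 54000.0),
    "Accountant": (44.0, 63000.0),
}

# Assumption: each reference line is the plain mean of the plotted points.
avg_age = mean(age for age, _ in points.values())
avg_salary = mean(salary for _, salary in points.values())


def quadrant(age, salary):
    """Describe where a point falls relative to the two average lines."""
    vertical = "high salary" if salary > avg_salary else "low salary"
    horizontal = "older" if age > avg_age else "younger"
    return f"{vertical}, {horizontal}"


quadrants = {title: quadrant(age, salary) for title, (age, salary) in points.items()}
for title, label in quadrants.items():
    print(f"{title:20s} {label}")
```

With these invented numbers, the HR manager lands in the high-salary, younger area and the finance manager in the high-salary, older area, which mirrors the kind of extremes the scatter plot is meant to expose.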
It makes sense, of course, that managers are getting higher salaries than the other jobs, but still there are some non-manager jobs that are getting high salaries. So far we were just checking the salary, only one measure. Now let's check the correlation between the age and the salary, thinking about the two things together. If you take a look, we have a group of jobs that is centered in the middle, which is okay. But here we have extremes, like the HR manager and the finance manager. HR managers are getting a high salary even though they are young employees. It is also the only manager group with a young age; if you compare it to the other manager jobs, those are around 40. So this is one extreme in the data. Now let's go and check the top right. We have the finance managers. On average they are getting the highest salaries in our data, and their average age is relatively old as well. So this is another extreme. And as you can see, we have another position, the IT manager, that is moving toward this direction as well, right? So, my friends, this is what we can understand from our data from the scatter plot, and that's all for this requirement. All right, friends. So with that, we have covered all the requirements for the first dashboard, the summary dashboard, and we have built the charts as well. After that, we have to go and put all those charts in one single consolidated Tableau dashboard.

203. HR Project | Sketch Mockup of Summary Dashboard: All right, so now we're going to go and build the summary dashboard, and here is what we're going to do. First, we have to create a plan, where we're going to go and sketch out the mockups for the dashboard and the containers in order to have a plan for the layout. After that, we're going to go and create the container structure of the dashboard in order to put all those charts in one single view. And after we have all the charts in one place, we will start with the refining and fine-tuning process.
So we're going to go and tweak and twist a lot of stuff like the text, colors, icons, legends, and filters to get everything looking just right. So, are you ready? Let's start with the first step, where we're going to go and plan the dashboard for the summary view. For this project, I have decided to have around 15 charts in one single dashboard. It is definitely a challenge, but don't worry about it. We can do it step by step. Now, of course, we will not jump immediately into creating the dashboard, because we will struggle without a plan. Any professional in any project knows that. Before building anything, we have to have a plan. We have to have a blueprint. And of course, we want to be professionals, right? That's why we have to go and plan the dashboard by sketching the mockups of the containers and of the dashboard. So of course, the question is, how are we going to do it? You can go old style with just pen and paper and draw the sketch of the dashboard. You can go and use digital tools like, for example, PowerPoint, or, like I'm doing here, Procreate on my tablet, or you can go and use tools like Figma or Draw.io. Any tool that helps you to design and sketch the mockup of your dashboard and suits your fancy will do. So let's go and sketch the mockup of our dashboard. The background is going to be dark gray, and that's because we are making a dark theme. Now we can have the usual stuff, where we have a title for the dashboard: human resources dashboard. In the summary requirements, we have three sections, and that's why we're going to go now and divide our dashboard into three main sections. We have overview, demographics, and income. Now let's focus on the overview and put everything that is required in this one section. We're going to start with the big numbers, the BANs. The first one is going to be the active employees, and here we have a big number, and then we're going to split it into two sections.
The left side is going to be the hired employees, and on the right side we're going to have another big number for the terminated employees. Now in order to have the effect of a KPI, we're going to put the line charts exactly below those big numbers. Below that, we're going to have another section for the departments. We're going to have our ranking of the departments using the bar chart. Then below it, we're going to have the last section in the overview, the location. Here we have two charts. We have the bar chart, where we show the number of employees in the HQ and the branches, and the other chart is a map. We're going to put the map and the bar chart side by side in this subsection. As you can see, it's not really easy to fit everything in one place. So that's all for the overview. Now, let's go to the right section, to the demographics, and here we have a big challenge. We have to fit five different charts in this section. The first one is about the gender, so we have our pie chart. But now for the age and education, we have two separate bar charts. What we can do here is integrate those three charts in one block. In the center, we can have the heat map, and on the top and to the right, we can have those bar charts. With that, we have all three charts in one subsection. Now on the right side, in the last subsection, we're going to have the performance and education, and here we have another heat map. Let's move to the last section, the income analysis. It's pretty easy. We have only two charts here. The first one, the gender and education, we can have on the left side, and on the right side, we're going to have our scatter plot, the age versus salary. With that, as you can see, in one dashboard we are showing almost 15 different charts.
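As a quick check of the "around 15 charts" figure, the mockup described above can be written down as a small inventory and tallied. This is a sketch in Python under one assumption: each BAN is counted as its own sheet. The chart labels are paraphrased from the lesson (the name of the terminated line chart in particular is a guess), and the `mockup` dictionary is purely illustrative.

```python
# Chart inventory for the mockup, grouped by dashboard section.
# Assumption: each BAN (big number) counts as one chart/sheet.
mockup = {
    "Overview": [
        "Active employees (BAN)",
        "Hired employees (BAN)",
        "Terminated employees (BAN)",
        "Hired by year (line chart)",
        "Terminated by year (line chart)",  # name assumed for illustration
        "Departments ranking (bar chart)",
        "Location: HQ vs. branches (bar chart)",
        "Location (map)",
    ],
    "Demographics": [
        "Gender (pie chart)",
        "Age groups (bar chart)",
        "Education levels (bar chart)",
        "Age vs. education (heat map)",
        "Performance vs. education (heat map)",
    ],
    "Income": [
        "Gender and education (barbell chart)",
        "Age vs. salary (scatter plot)",
    ],
}

counts = {section: len(charts) for section, charts in mockup.items()}
total = sum(counts.values())

for section, n in counts.items():
    print(f"{section:13s} {n} charts")
print(f"{'Total':13s} {total} charts")
```

Counted this way, the overview holds eight charts, the demographics five, and the income section two, which adds up to the 15 mentioned in the lesson.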
Of course, in our dashboard, we have to have a section on the left side for the logos and for the navigation between the two dashboards, the summary and the detailed views. Of course, we can go and add more functionality, like exporting the dashboard, or icons where we can put our links. We will not forget about the filters, so on the top right, we can have something like a switch in order to show or hide the filters. All right, friends, on to the next step. We are not done planning our dashboard yet. We have to go and sketch the mockup of the container structure. Building a dashboard in Tableau requires knowing how to control and manage the containers. If you don't have a plan, I promise you things can get chaotic. That's why we have to plan the container structure, and this time I'm going to sketch the mockup using Draw.io. Draw.io is an amazing tool, and free as well, for creating professional charts and concepts, and I usually use it in my projects as well. Okay, so now we are inside Draw.io, and I just put our mockup here as a reference for us. Working with Draw.io is pretty simple. The first step that I usually do is go to the style over here and make it a sketch. What this does is make all the shapes that we have on the left side look hand-drawn. So at the end, your concept is going to look really cool and not boring. Now, for our containers, we're going to have three different objects. The first one is going to be the horizontal container. So here we have the horizontal container, and I usually give it the color blue. Let's first remove the fill and go to the colors, choose blue, and maybe make it thicker. So this is the first type. The other one is the vertical container, right? So, vertical container, and we're going to give it the color orange. So maybe something like this. Okay. And the last box is going to be our objects. It could be anything. It could be an icon, a text, an image. So I would like it to be gray.
Let's have something like this. So we can see that our whole dashboard is split into two sections: the left section, where we have the logos and the icons, and then the rest on the right side. That means we're going to start with a horizontal container for the whole dashboard. So we're going to make it like this, and we're going to make it this big. All right, so let me just remove the text over here and give it a name. This is the whole dashboard. This is the first step. Now let's start with the left one, where we have the icons and the logos. It's vertical: we have all the objects below each other. So we're going to take a vertical container for the left side. We're going to call it Nav, for navigation, like this, and let's make it a little bit smaller. Inside, we're going to have different objects, like a logo. Let's make it smaller. I will go and add a fill for that, so let's click on fill and gray, and the same here. Now we can zoom in and add more icons in order to navigate between dashboards, to export the dashboard, to put links, and so on. So we're going to have multiple links and stuff in the navigation. This is everything about the navigation. Now, on the right side, what do we have? First we have a title and a filter, and then below that, we have a whole section of charts. That means we have two objects below each other, and for that, we're going to need a vertical container again. For the whole thing over here, we're going to have one big vertical container like this, and we're going to call it header and charts. Okay, something like this. Now let's start with the header. It looks like we have a header, and beside it we have filters. That's why we're going to go with a horizontal container, right? We're going to have it like this, and what do we have inside it? We have the header and the filter, right? So we have the title.
And here on the right side, we're going to have a few icons, or maybe one icon, we will see. Now let's have a look at our charts over here. Here we have three sections, right, but actually they are split into two sides: the left side, where we have the overview, and the right side, where we have two sections. That means we have two objects side by side, and for that, we're going to take another horizontal container. Let's do it like this. It's going to be the main splitter between the left side and the right side. Let's start with the left side. As you can see, the objects are beneath each other, and that means we're going to go and use a vertical container. For the left side, we're going to have a vertical container like this. Let me just remove the name, and let's go and call it the overview. Inside the overview we have a lot of charts. We can have multiple charts like this, and all of them are beneath each other. We will not drill down into each detail now. We will just have a rough plan for the containers. Now let's check the right side. On the right side, as you can see, we have two main sections, the demographics and the income. That means we're going to go and have a vertical container as well. For the right side, we can have a vertical one like this, and we're going to remove the name here. Now let's go and check each section. As you can see, we have first a title, and below it we have different objects. Again, here we have a horizontal container. We're going to have it like this. It is very nested because it's a little bit complicated. We're going to have the same for the section below, for the income. We're going to have a title and then charts. Let's give it a name. This is the demographics, and below it, we have the same thing. We have a section for the income. What do we have underneath that title? We have charts side by side here. That means we can go and use a horizontal container for that, right?
We're going to have a horizontal container below it like this, and inside it, we have our different charts. We have charts like this; let's have three like this. For the income as well: since we're going to have only two charts and they are objects side by side, we're going to need a horizontal container as well, and we can have our two charts. All right, guys. I think we have a plan, right? We have a blueprint for our dashboard, and we have a lot of layers, around six layers. We will not fine-tune the plan now; it is just a rough plan. But one thing that I would like to zoom in on a little bit is each chart. So as you can see, for example, for this one, we always have a title and below it a chart. The same thing goes for the gender: we have a title and a chart. That means we have a vertical container for each chart. If we go and zoom in on those charts, we will not place the charts immediately. We're always going to have a vertical container like this, where the first object is going to be the title of the chart, like this, and below it, we can have the chart itself. All right, my friends. So now we have a rough plan. Now let's go and implement those containers in Tableau. All right, friends. So finally, we now have a rough plan for our dashboard. But of course, it doesn't contain all the details, so we will be twisting and tweaking stuff as we are building the dashboard. So let's go back to Tableau in order to build the dashboard.

204. HR Project | Build the Summary Dashboard: Okay, friends, let's go and create a new dashboard, and let's call it HR summary. Like this. Now, the first step is to define the size of the dashboard. So let's go over here on the left side. Instead of range, let's go and select a fixed size, and this time we'll go with a width of 1,400 and a height of 800. All right. So let's start with the first container. It is the horizontal container for the whole dashboard.
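Before building, it can help to write the container plan from the mockup down as a nested structure, so the nesting is easy to compare against the layout tree in Tableau. The following Python sketch is one hypothetical way to do that. `Node`, `card`, `outline`, and `container_depth` are illustrative names, and this is a plain data model, not a Tableau API. The demographics and income sections are modeled as a title above a horizontal row of charts.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str  # "H" = horizontal container, "V" = vertical container, "obj" = text/chart/image
    children: list = field(default_factory=list)


def card(title):
    """Each chart sits in its own vertical container: title on top, chart below."""
    return Node(title, "V", [Node("title", "obj"), Node("chart", "obj")])


def outline(node, depth=0):
    """Return the tree as indented lines, similar to a layout hierarchy."""
    lines = [f"{'  ' * depth}[{node.kind}] {node.name}"]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


def container_depth(node):
    """Count nested container layers, ignoring leaf objects."""
    if node.kind == "obj":
        return 0
    return 1 + max((container_depth(c) for c in node.children), default=0)


plan = Node("Whole dashboard", "H", [
    Node("Nav", "V", [Node("logo", "obj"), Node("icons", "obj")]),
    Node("Header and charts", "V", [
        Node("Header", "H", [Node("title", "obj"), Node("filter icon", "obj")]),
        Node("Left and right sections", "H", [
            Node("Overview", "V", [Node("overview charts", "obj")]),
            Node("Demographics and income", "V", [
                Node("Demographics", "V", [
                    Node("section title", "obj"),
                    Node("Demographics charts", "H", [
                        card("Gender"),
                        card("Education and age"),
                        card("Education and performance"),
                    ]),
                ]),
                Node("Income", "V", [
                    Node("section title", "obj"),
                    Node("Income charts", "H", [
                        card("Education and gender"),
                        card("Age vs. salary"),
                    ]),
                ]),
            ]),
        ]),
    ]),
])

print("\n".join(outline(plan)))
print("container layers:", container_depth(plan))
```

In this model the deepest path has seven container layers once every chart gets its own vertical container; without those per-chart containers it is six, in line with the "around six layers" of the rough plan.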
What I usually do is go over here and switch it to floating, because having everything in one floating container makes it more dynamic, and we can go and change the background as we want. So make sure to switch it to floating, then let's take the horizontal container and drop it in the middle. As you can see, it's a little bit small. What we can do is change its size in order to fit our dashboard. Let's go to layout, and the width is going to be exactly like the dashboard, 1,400, and 800 for the height. For the position, it's going to be zero, zero, in order to have it exactly on top of our dashboard. Now in this phase, as we are adding the structure of our containers, I usually go and add borders to each container in order to see whether we are doing everything correctly. Let's go and do that. Let's go to the borders and add a line, a thick one, and blue. With that, we can see a blue horizontal container. Of course, let's go and give it a name, so let's rename it to whole dashboard. Okay. Now in order to avoid mistakenly converting the horizontal container to a vertical container, I go and add blanks inside it in order to fix it as a horizontal container. Let's go and do that. Go to the dashboard, and now let's switch it back to tiled. Only the first main container is going to be floating; the rest are going to be tiled. The first blank goes to the middle. Now make sure that the second blank is exactly on the right side. Let's go back and check in the layout. You can see we have blanks inside our whole dashboard. Now let's go to the next level and start adding the containers inside the whole dashboard, and here we have two vertical containers. One is for the Nav, so let's go and do that. We can have one vertical container over here. As usual, I go and add blanks inside it. Let's go and add the first blank. It's a little bit small like this, so let's go and expand it. Let's go and add another blank below it. Make sure it's below the first blank.
Let's go and check the layout. Now as you can see, we have a vertical container and two blanks inside it, which is correct. And let's go and rename it. Let's give it the name Nav, and we can go and remove the first blank over here. We don't need it anymore, so let's remove it. Of course, we can go and add a border color for it. This time it's going to be orange. This is the container for the Nav. Now let's go and add another one for the right side, for the rest. So, let's have a vertical container and two blanks inside it, one in the middle and one exactly below it. Now it's very small. Let's go and click the vertical container and make it wider, like this. Let's give it a name now. It's going to be header and charts. Of course, we're going to go and give it a color like this, and it's going to be orange as well. Now if you look at the tree over here, we have the whole dashboard, and inside it, we have the Nav, and on the right side, we have the header and charts. Let's go and remove this blank from here. We don't need it anymore. We will not focus on the Nav now, since we don't have a lot of containers there; we have only logos and icons and so on. We will focus now on the header and charts, because here we have the real content and a lot of containers. What do we have inside it? We have two containers, one for the header and another one for all the charts, and both of them are horizontal containers. Let's start with the header, so we're going to go and add a horizontal container in the middle. This time, instead of adding blanks, we're going to add one text for the title of the dashboard. It's going to be human resources dashboard. Let's add the word overview. Let's have it like this, and let's set the size to 20. Now we're going to go and add a blank to the right side. Make sure you drop it exactly on the right side inside this container. Let's go to the layout and check what we have.
As you can see, we now have a text and a blank underneath the horizontal container. Let's go and give it a name now. This is the header, and of course, we're going to go and add a color for it; it's going to be blue. Now we can go and remove this upper blank, like this. Now let's go and add another container for the charts. It's going to be a horizontal container as well, so drop it beneath the header. As usual, we're going to go and add our blanks. One here, let's make it bigger, and one on the right side. And we go to the layout and check. We have two blanks inside the horizontal container. Now let's go and give it a name. Here we have everything: left and right sections. Okay, and we're going to go and add the borders as usual. So with that, we have our two containers, and we can go and remove this placeholder from here. Now, let's keep drilling down, and we're going to focus on this container, the left and right sections, and here we have two vertical containers. So let's start with the left container. We're going to have it for the overview, so a vertical container. And now let's drop a text instead of a blank and call it overview. And maybe let's make it size 12. Now below it, another blank, in order to make sure this is a vertical container. Let's go to the layout and check. In the vertical container, we have a title and a blank, and let's give it the name overview left section, like this. Let's go and remove this blank from our dashboard, and don't forget about the color of the border. We can have it orange. That's it. Let's make it a little bit smaller, like this. Now let's go to the right side, and we can have a vertical container as well, like this, with the same stuff: a blank, and below it another blank. We go to the layout, and it's the same: we have two blanks. Let's give it a name, demo and income sections. As usual, the border is orange as well, and we're going to go now and remove the placeholder, like this.
Let's adjust the sizes, so the left section, the overview, should be smaller, like this, and then we have the right section. With that, we have everything on the left side. What is left is designing the containers of those two sections. Here we have two vertical containers. Let's go and do that. The first one we're going to drop here in the middle. Let's go and add a text for it. It's going to be demographics, and the size is going to be 12. Okay. Now let's make it bigger, like this. Let's drop a blank. Make sure to drop it exactly here, and let's go to the layout, and everything is fine. As you can see, I'm just making it a little bit thicker. We have the text and the blank here. Let's go now and give it a name. It's going to be the demo section, like this, and we're going to give it a color as well, since it's a vertical container as well. Let's go and remove this placeholder, and we need to do the exact same thing for the second section. Let's go and add a vertical container and a text, which is going to be income, size 12, and we're going to make it bigger, like this. We're going to bring in a blank as well. Make sure to drop it inside the container. Let's check the layout, so everything is fine. Now we're going to go and rename it as usual: income section. Don't forget the coloring, like this. And with that, we are done. Let's go and remove the last blank. Here we still have spacing. Let's go and adjust the size, so the demo is going to be in the middle and the income is going to take the whole space as well. Okay, guys, I promise you this is the last drill-down, where we're going to add a horizontal container for the charts. For the demographics, we're going to have one horizontal container here inside. Let's go and add a few blanks inside it. The first blank is small, and one goes on the right side. So let's go and check that. We have the horizontal container; give it a border color. Now we're going to go and do the exact same thing for the income. We need a horizontal container inside it as well, and two blanks.
One here. Let me just make it bigger, and one exactly on the right side. And we're going to check the layout. We have two blanks inside the horizontal container. Give it a name, income charts, like this, give it a color, and remove the placeholder. So let's go and remove it. Okay, friends, so we are done. Let's go and have a final check on the structure. We have the whole dashboard, and inside it, we have the left section for the Nav and the right section for everything else, header and charts. Inside that, we have two horizontal containers, one for the header and another one for the left and right sections. Let's drill down. We can see here we have the left section as a vertical container, and then we have a right section for the demo and income sections. Then we go and split it into the demo section and the income section, and each one of them has a title and a horizontal container as well. The same thing goes for the income. So if you have it exactly like me, we can proceed. If not, then go back and do it step by step. Okay. Now in the next step, we're going to do the first iteration on the dashboard, where we're going to put all the charts inside our dashboard. We will not care a lot about the design. It's all about placing the charts inside the containers. So let's start with the first section, the overview, so make sure to select it. And first, let's make it a little bit bigger. We're going to go from top to bottom. We're going to go to dashboard, and let's go and add a title. For the first BAN, it's going to be the active employees, so: active employees. And let's center it. Now below this title, we're going to have the BAN of active employees. Let's drop this chart below it. Of course, we're going to go and hide the title. We don't need it. Nice. Now below it, we can have two KPIs, left and right, and for that, we need a horizontal container.
But before that, we're going to go and have a small separator between this BAN and the two BANs below. We're going to have a blank below it. Let's go and make it smaller, like this, and we're going to go and style it as follows. Let's go to the background colors, pick our gray, and set the opacity to something around 60. All right. Let's go and set the outer padding to zero. And we're going to go and give it the name divider. All right. So below it, we're going to have a horizontal container for the two KPIs. Drag and drop it below, like this. As usual, we're going to go and add our two blanks: one, and the second one, making sure it's exactly on the right side. So let's go to the layout and check. Here we have the horizontal container. We're going to call it KPI section, like this. Of course, we're going to go and add a border for it, just to see it. All right. As you can see, things are now squashed. Let's go and reorganize it. We're going to make this new container a little bit bigger, like this. Now let's focus on those two KPIs. What do we need for each KPI? We need a title, a BAN, and a line chart. So we have to have a vertical container. Let's go and grab one and put it inside. Let's start adding stuff immediately. We need a text. It's going to be hired, and align it to the center. Below it, we need the BAN, so drag and drop the BAN, and of course, make sure to hide the title. Below that, we need the line chart. It is hired by year, and drop it exactly below the BAN. And we hide the title. Now this is the first container. Let's go and check the layout. We have here a vertical container with the title, the BAN, and the line chart as well. Let's go and give it a name; it's going to be hired KPI, like this. Let's go and remove the first placeholder, the blank. So remove it. Now, don't worry about the size and the coloring.
We're going to do a second iteration on the dashboard in order to do the fine-tuning. For now we can just adjust the size of the line chart a little bit, like this. Now on the right side we need the same KPI again, with the same steps. Let's go and grab a vertical container and drop it on the right side, making sure to drop it inside the container, and we need a text. It's going to be terminated, in the center. What else do we need? We need a BAN, so make sure it is exactly below the text, and hide the title as well. Let's go and resize this small zone in the container on the left side. The blank should be smaller as well. Now what do we need? We need the line chart. So let's go and drop the line chart below the BAN, remove the title, and make it a little bit smaller. Now let's go and check the layout. We have a vertical container with a title, a BAN, and another chart as well. Let's go and rename it. This is the term KPI. Okay. Now one more thing: I would like to go to this blank and rename it to divider, like this. Let's give it the same coloring. It's going to be the dark gray, with the opacity at 60, like this. Let's go and remove the outer padding. Now what do we have below that? We have the departments title with lines left and right. For that, we need a horizontal container. What do we need? We need a text in the middle. It's going to be departments, and it should be in the middle, and left and right we're going to go and add blanks. Make sure to drop one exactly on the left and one exactly on the right. Let's go and check the layout. We have here a horizontal container with blank, departments, blank. So let's go and color those blanks in order to see them. It's going to be the dark gray at 60 without any outer padding, and the same thing for the next one: 60 and no outer padding. We can go and call it department title. Now what do we have below it? We have the chart of the departments. Let's go and drop it beneath, and of course, go and remove the title, like this.
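The departments title is a small reusable pattern: a horizontal container holding a styled blank, a centered text, and another styled blank. As a sketch of that pattern, the Python below describes it as plain dictionaries. Nothing here is a Tableau API; `blank`, `title_row`, and all the key names are hypothetical and only illustrate the settings named in the lesson (gray background, opacity 60, no outer padding).

```python
def blank(color="dark gray", opacity=60, outer_padding=0):
    """Describe a blank object styled as a thin line next to a title."""
    return {
        "type": "blank",
        "background": color,
        "opacity": opacity,
        "outer_padding": outer_padding,
    }


def title_row(label):
    """Describe a horizontal container: styled blank, centered text, styled blank."""
    return {
        "type": "horizontal",
        "name": f"{label} title",
        "children": [
            blank(),
            {"type": "text", "value": label, "align": "center"},
            blank(),
        ],
    }


departments = title_row("Departments")
location = title_row("Location")

for row in (departments, location):
    kinds = ", ".join(child["type"] for child in row["children"])
    print(f"{row['name']}: {kinds}")
```

Writing the pattern once and reusing it for each section title is the same idea as repeating the identical drag-and-drop steps for every title row on the dashboard.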
Now below that, we can have the location title, and it can be exactly like the departments. What do we need? We need a horizontal container. We need a text. Let's call it location, like this, and center it. We need two blanks, left and right, like this, and we go to the layout. We have blank, location, blank, and we can rename it to location title. And we're going to style those blanks, so make it gray, 60, and remove the padding. The same thing for the next one: 60 as well, and remove the padding. Now, below that, we have two charts, one a map and the other a bar chart. What do we need? We need a horizontal container below it, and we need the two charts. Let's get the location chart to the right side and remove the title. Let's go and get the map exactly on the left side, and remove the title. Now let's go and check what we have done. We have a horizontal container and the two charts. Let's go and rename it; it can be the location charts. And now we can go and remove the last blank. It's just a placeholder, so remove it. That's it, we now have all the stuff inside the overview section. As you can see, if you don't do it slowly and step by step, planning everything, this can be chaos. But with the planning, everything is going to be easy. Now let's move to another section, the demographics. Here we have a lot of charts. Let's do it step by step. We are in this section over here. What do we have? We have a title, and then we have multiple charts side by side. As usual, each chart is a vertical container: we have a title and the chart itself. Let's go and add the first vertical container over here, and then we need a text inside it. So make sure to drop it here. This is going to be gender, centered. And below it, we need the chart. Let's go and pick our pie chart for the gender, and drag and drop it beneath. Of course, we're going to go and remove the title. Great. Now before we go to the next chart, we're going to go and have a divider, like this.
Let's give it the colors: gray, 60, like this, and remove the outer padding. Now to the next charts. We need a vertical container on the right side as well, so make sure to drop it right of the divider, and here we need three charts. Let's do it step by step. First, we need the title. It's going to be Education and Age, centered as well. Below it we have the first bar chart, which is the age groups. Drag and drop it beneath the title and remove the chart title as well. Beneath it, there are two charts: the heat map and the bar chart of the education. Since they are side by side, we're going to put a horizontal container beneath, so drop the horizontal container exactly beneath the bar chart. Now things are getting resized, left or right and so on; don't worry about it. The main thing is that we are placing the charts in the right container. Let's first get Age versus Education, put it in this new container, and remove its title. To the right side, we need the education levels, so make sure to place that chart on the right side and remove its title as well. Now let's resize this divider in order to have a little bit of space, like this. Now we have to change a few things in those bar charts, like hiding the headers. For example, click on the first one, right-click on the header, and remove it. For the second chart, I would like to switch things around. Let's go inside this chart by clicking this arrow. Now I'm going to swap rows and columns, and we're going to hide the header as well. Let's remove it and go back to our dashboard. We're going to stay with this for now, but we will configure it later, in the second iteration. Now let's have a look at the layout to make sure that everything is correct. This is the vertical container for the education and age. Let's rename it to Education and Age Charts, like this.
It should have a title, then the first chart, which is the bar chart, and then below it a horizontal container with two charts side by side: the heat map and the bar chart. If we get it like this, then we can proceed. Now we need another chart to the right side, which is the last chart in this section, but we need a divider between them. So let's get a blank and drag and drop it exactly to the right side. Make sure you drop it correctly, and let's check the layout. We set the color to gray and 60, and the outer padding to zero. As you can see, our blank comes after Education and Age Charts, so let's rename it to Divider. As usual, we need a container, so it's going to be a vertical container on the right side, and we need a text. It's going to be Education and Performance, like this, centered. This one is going to be very simple. We're going to put the chart just below it, like this, and remove the title. Of course, you can make the divider a little bit smaller, left and right. Okay, let's check the layout again to see whether everything is fine. We have a vertical container for the last chart, with a title and, beneath it, the chart. Okay, we are done with this section. Now let's move to the last section, the income. What do we have over here? Let me just close this, and this as well. We have the income. We have a title and, beneath it, a container. We need two charts here. As usual, we have the vertical container for the first one, and we need a title, so let's drop a text inside it. It's going to be Education and Gender, centered. Now we need our chart. Let's drop it beneath the title and remove the chart title. Before we go to the next chart, we need a separator, or divider. Let's style it as usual: 60, and the padding to zero. Now we need to build the last chart. As usual, we get a vertical container on the right side. We need a title.
It's going to be Age versus Salary, centered. Okay. And of course, we need our chart, so let's drop it beneath the title, remove the chart title, and make the divider smaller, like this. Okay, that's it for this section, and now we have all our charts inside our containers as we planned. All right, friends. With that, we have all the charts in one place, in one dashboard. Now we're going to start the process of refining and fine-tuning the dashboard, where we're going to tweak many things in order to have a professional dashboard. 205. HR Project | Fine Tuning The Summary Dashboard: All right, friends. With that, we have all the charts in one place, in one dashboard. Now we're going to start the process of refining and fine-tuning the dashboard, where we're going to tweak many things in order to have a professional dashboard. Okay, so now, as the first step, we're going to add background colors to the dashboard containers, and we're going to remove all the background colors of the worksheets. Let's do that. We're going to start with the whole dashboard over here. It's going to be a dark gray, so I will go with this one over here. So the background is dark gray, and the sections are going to be black. Let's go to the next step. We're going to go to the nav over here. The nav is going to be its own section, which is why we're going to make it black, like this. On the right side, we will not make everything black, only the three sections: overview, demographics, and income. That's why I will not change anything over here. Let's go to the sections, and we're going to start with the overview over here. We're going to make it black. Then we need those two sections: the demo section is going to be black as well, and the income section can be black too.
With that, as you can see, we are now getting the dark theme of our dashboard. As the next step, we're going to remove all the background colors inside our worksheets. We added them at the start in order to get a feeling for the dark theme, but now we will not use the background colors of the worksheets, only those of the dashboard. Now we have a boring task, where we go through all the sheets and remove the background. Let's start from the top left. We're going to start with the BAN: right-click on it and go to Format, then go to Shading and set the worksheet color to None. Now we're going to go through all the worksheets that we have and remove the background color. You can do it here in the dashboard, or you can visit each of those sheets one by one. Here we have the last one; remove it, like this. We are done. Now we have fixed the background colors of the dashboard and the worksheets. All right. Moving on to the next step, we're going to fix the font size and color. Let's start with the title of our dashboard. Let's select the whole thing, use our light gray, and make sure the size is 20. Let's make the first part, the title itself, bold, and leave the overview part as it is. So that's it. Now we're going to edit the title of each section. Here we have three sections, overview, demographics, and income, and we're going to do the following. Let's go to the overview and make it light gray, like this, and make it 14 and bold as well. Let's go to the next one and do the same: bold, change the color to light gray, and make it 14. And for the last section: 14, bold, and we pick the color. The sections now look exactly the same. Now we're going to edit the titles of each chart. Let's start with the gender over here.
We're going to make it light gray as well, with a font size of 11. Let's do the same for each one of them: 11 and light gray. The same for the next one, and the next. 11 for the age and gender. All right. And don't forget about the departments over here: 11 and light gray, and the same for the location, 11. Now we are done with the titles. Next, let's check the font size inside our charts, and I would say we can make it smaller. We have to go through them all again. Let's start with the department. Go to Format, and instead of nine, let's make it eight. Let's go to the axis as well and change it to eight. I would say let's make it bold. All right. Now let's go to this pie chart and make it eight, and the same for the map: click somewhere, go to Format, and make it eight. Now for the pie chart, I would go inside it, and we're going to go to the outer circle. There we're going to change the font size to eight. But the big number inside we're going to leave as it is. Maybe we'll make it even a little bit bigger; let's make it ten. Let's go back to our dashboard and continue with the next charts. Make everything eight. Same for the age. Now to the next one, same thing. Eight as well for the income, and for the ages. Everything should be eight. I think it looks really nice. We are done now with the font sizes and colors. All right. As the next step, we're going to visit all the charts again in order to enhance them, refine them, and maybe add extra details. Let's have a look at the departments over here. What we can do is add the status of the employees for each department, so that this bar also shows the total terminated. In order to do that, let's go inside the chart again. Now we need a status dimension in order to control the colors inside those bars. We don't have it yet, so we're going to create a new one. Let's call it Status.
It's going to be the same logic. Let's write an IF statement: if the termination date, termdate, is null, then the employee is hired; otherwise, terminated, like this. Let's end it, and now we take the status and put it on Color over here. Let's assign the colors: hired is going to be green and terminated is going to be pink. What else am I going to do? I will just swap the order of those two statuses. Let's do that. I would also like to show the total hired in the label. Let's get it, and we can change the color of this label to light gray, and maybe make it seven, something like this, and we can still make the axis smaller. Let's go back to our charts. Now we can also see the number of terminated employees in these bars. I would say let's make the axis a little bit smaller, like this. That's all for this chart. Let's move to the next one. We're going to go inside this chart. I would say let's add the percentage information to the columns. Let's get the total hired, put it near the location, and switch it to discrete, so that we have the percentages here and the header information on top. Next we're going to change the format of those percentages; let's remove the decimals. Let's make those bars a little bit smaller. I'll go with something like this. Let's go back and check the dashboard. They look nice; maybe we'll make the font size smaller. Instead of nine, we usually have eight. And we can make it narrower, so that we have more room for the map, something like this. For the map, everything looks nice, so we don't have to change anything. Let's go now to the gender information. What we can do is make two pie charts, one for each gender, and then show the percentage of terminated employees. Let's try that.
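The status logic described above can be sketched outside Tableau as well. The following is a minimal Python illustration, not the calculated field itself: the function name `employee_status` and the sample rows are made up for the example, and the only rule taken from the lesson is that a missing termination date means hired and anything else means terminated.

```python
from datetime import date
from typing import Optional


def employee_status(term_date: Optional[date]) -> str:
    """Return 'Hired' when there is no termination date, else 'Terminated'."""
    return "Hired" if term_date is None else "Terminated"


# Hypothetical sample rows: (employee id, termination date or None).
employees = [
    (1, None),
    (2, date(2020, 5, 31)),
    (3, None),
]

# One status per employee, which is what ends up driving the bar colors.
statuses = {emp_id: employee_status(term) for emp_id, term in employees}
print(statuses)
```

In the dashboard the same two values are then mapped to green and pink on the Color shelf.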
Maybe it can look nice, so let's go inside. In order to do that, we need the gender on Rows. Of course, now our pie chart is broken, so let's go to the outer circle and repair it first. We don't need the gender information there; we have it here as a dimension. What do we need for the colors? We need the status of the employee, and we also need the total hired as a percentage, placed on the pie. Something like this. For the big numbers inside those circles, we can switch to the percentage, right? Let's replace it with a percentage, something like this, and format it: set it to percentage and remove all the decimals. It looks nice. Right now, we can see the percentage of terminated for each gender. Let's have a look at our dashboard. It looks like it needs more space, so we can rotate the labels first. With that we have enough space; maybe you can make it a little bit bigger. We're going to fix the spacing between charts later. One more thing that I just noticed: the inner circles of the pie are not really black. Let's go to the chart again, to the inner circle, to the colors, and change it to black. Let's go back. With that, we are done with the gender chart. As you can see, we are really rethinking the charts as we see all the information in one place in the dashboard. Now we come to the fun one, where we have three charts on top of each other. First of all, let's give it more space, like this, and maybe make it a little bit bigger. What do we have? We have four values here, and for the age we have five values. First we're going to give it more space, and I'm thinking that maybe we should swap those two dimensions. Maybe it's going to look better. Let's go inside the chart again and flip it, like this. Let's go back to our charts.
Now it looks nicer. Let me just make this smaller, something like this. Now we can see that High School is taking a lot of space inside our chart, so we can edit the alias for it: right-click on it and choose Edit Alias. Let's have it like this, as an abbreviation. Okay. So now we have more space. We have to fight for space inside this dashboard. As the next step, I would like to highlight the highest value. As you can see, right now everything is gray, and if we highlight the highest value, it's going to be very clear. So let's go inside this chart. In order to highlight the highest value, we have to create a new calculated field. Let's give it the name Highlight Max. We need the max function, but the window version. What is our measure? It is the total hired. We are searching for the highest value, and if the current value is equal to the highest value, we get true; otherwise, we get false. Let's hit OK and put this field on Color. Now let's change the coloring first. If it is false, it should be dark gray. If it's true, we want it green. Now if you check the view, we have multiple values marked as the highest value. We would like to have only one. Let's change how the calculation is computed: right-click on it and choose Edit Table Calculation. Let's go to Specific Dimensions and include both of the dimensions. With that, we have only one value, which is exactly what we want. Let's hide the legend; we don't want it in the dashboard. I would say let's show a label for the highest value as well. Let's take the total hired as a percentage and put it on the label, and of course, we're going to change the table calculation so that it considers both of the dimensions. Let's close it, and we're going to change the format as usual. We don't want all those decimals.
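To see why the view first showed several highlighted cells and then only one, here is a small Python sketch of the idea. It is only an analogy to the windowed max in the calculated field, under the assumption that the default computation runs separately per education level; the counts and the function names are invented for illustration.

```python
# Hypothetical counts of hired employees per (education, age group) cell.
total_hired = {
    ("High School", "25-34"): 120,
    ("High School", "35-44"): 90,
    ("Bachelor", "25-34"): 310,
    ("Bachelor", "35-44"): 260,
}


def highlight_max_per_row(cells):
    """Window limited to one dimension: one maximum per education level."""
    row_max = {}
    for (education, _age), value in cells.items():
        row_max[education] = max(row_max.get(education, value), value)
    return {key: value == row_max[key[0]] for key, value in cells.items()}


def highlight_max_overall(cells):
    """Window over both dimensions: a single maximum for the whole table."""
    overall = max(cells.values())
    return {key: value == overall for key, value in cells.items()}


per_row = highlight_max_per_row(total_hired)
overall = highlight_max_overall(total_hired)
print(sum(per_row.values()), "cells highlighted per row")
print(sum(overall.values()), "cell highlighted overall")
```

Including both dimensions in the table calculation corresponds to the second function: the comparison runs against one maximum for the entire heat map, so only one cell turns green (unless two cells tie).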
Let's remove them, and let's change the font. Let's go with seven and a light gray. We don't need all the values; we only need the min and max. Switch it from All to Min/Max and remove the minimum value, so that only the highest value has this label. I think we are done. Let's go back and check how it looks in the dashboard. It's fine, right? Now let's fix those bar charts on the left and the right. We have swapped the dimensions here, so we have to swap these as well. Make sure to do it correctly: we're going to bring this one down, and the other one should go up. We're also going to swap the dimensions inside, like this. This is for the first chart, and the same for the next chart, like this. Now let's highlight the highest value as well. Let's go back to this chart. We're going to put the highlight field on Color. Of course, we're going to hide the legend as well; let's remove it. I would say let's reduce the size of those bars so that they fit inside our chart. I will go with something around here; we will see. Let's go back to our charts and do the same for the ages. We're going to put the highlight field on Color, and we have to change the colors over here, so false is going to be gray and true is going to be green. Let's remove the legend as well, and we also have to reduce the size of those bars, maybe something like this. All right. Let's go back and check. As you can see, with the highlight effect it looks really nice. You can also see that the bars are not sitting exactly on top of those values. We will fix the spacing and the positions later, as the next step, so we can leave it as it is for now and move to the next chart. Let's go inside it, and I would say let's highlight those values as well.
Now, we cannot use the same highlight field, because here we have a percentage, and our highlight is based on the absolute numbers. So what we can do is duplicate it and rename it to the percentage version. I'll remove the copy suffix from it as well. Let's edit it. Now instead of the total hired, we can have the percentage of total hired, right? We're going to take this measure. I'll remove the percentage from here. Let's copy it and put it in the comparison as well. Hit OK and let's move it to Color. Now, of course, we have to set the coloring as usual: false is gray and true is going to be green, and we're going to hide the legend as well. Now let's check whether the table calculation is configured correctly: Edit Table Calculation. This one should be based on the performance rating, like this. Now I'd say let's add a label for this chart. We're going to take the same measure, hold Control, and drop it on Label, and let's adjust the style: it's going to be light gray, size eight, and we don't need all those values. Let's have only the min and max. Now we have the min value and the max value, but I don't want the min value, so we keep only the max value, like this. That's it. Let's go back to our charts, and I think everything looks nice. Now let's go to Education versus Gender. I think I would not add anything to this chart. It looks really nice. But I would change the size of the labels; we forgot about that. Let's make it eight instead of nine. Okay. Now for the last chart over here, I think we have to add some coloring to the dots. I will just add our green color and maybe reduce the opacity to something like 50. Very nice. And maybe reduce the size of those labels again, to something like seven. Now I would like to add a line for the axis. Let's go to Format.
Let's go to Lines over here, and on the Sheet tab, we're going to go to the axis settings. We can add a line for it, making sure that we select our dark gray. Maybe reduce the opacity as well, to somewhere around 60. Let's go back to our charts and rename those axes. Instead of Average Age, we're going to have only Age, and the same for the salary, so we have only Salary, like this. That's it for this chart. As you can see, we just revisited all the charts and added extra details, some refinement and fine-tuning. All right, everyone. In the next step, we're going to start working with pixels in order to add more spacing between all those sections and containers, using the inner and outer padding. The distance between the main sections should always be 20. Let's start doing that on the left side, with the navigation. Make sure to select the navigation over here. First, we're going to get rid of all those borders; we don't need them. Now we have to add 20 as space between this section and the edge of the dashboard. We're going to go to the outer padding over here and just add 20 everywhere: top, left, bottom, right. As the next step, I'm going to set a fixed width for this container. Let's go to this small arrow over here and edit the width, and we're going to use the value 100. Let's do it like this. Now, as you can see, we have spacing between the container and the border of the dashboard. Now let's go all the way to the right side. Let's select Headers and Charts and remove the border; we don't need it. As you can see, we have a lot of space on the right side, so we're going to edit the width. Instead of this value, let's go with 1,300, like this. Now if you take the whole container, we need spacing on the right side, and it's going to be exactly 20. Let's go to the outer padding over here.
Deselect All sides equal, because we already have space between those two sections; we need 20 only on the right side. Now let's go inside all those containers and start adjusting things. The next step is the header. We're going to remove the border, and I would say let's give it a fixed height, so change it to fixed and set it to 65, something like that. We have a little bit of spacing between the charts and the title. I'm happy with that. Now let's go to the next section, with the left and right parts. We can see that we have enough spacing around the dashboard for the whole container. Let's remove its border. I would say let's jump to the next one. Let's go to the overview on the left side. What do we need here? On the left side we have 20, so we are safe on the top and bottom, but on the right side we don't have enough space between the sections. That's why we're going to adjust it. But first, let's remove the border, and then we go to the outer padding, uncheck All sides equal, and on the right side I need 20. Now we can see that we have enough spacing between left and right. That looks really good for now. I would also change the container color of those elements, so that they don't have any. Now let's go to the right side and select the whole container. We are at the demo and income sections; remove the border. I think we are done with this. Let's go inside those sections. Let's go to the demo section and remove the border. Of course, we now need spacing between the demographics and the income. On the bottom, we need 20. Let's go to the outer padding, deselect All sides equal, and only on the bottom we need 20. Looks really nice so far. Of course, let's remove all those borders; we don't need them anymore. This one as well, we don't need borders, and here. I think we have to go one level up, like this. If I deselect, we still have one border, which is the whole dashboard.
So let's just remove it. As you can see, adding spacing is like giving air to your dashboard, so it can breathe. Now we're going to add inner padding inside those sections. We will ignore the navigation for now, because the icons are going to be another story. If you check those sections, you can see that the text is very close to the border of the section, right? We have to add some spacing here. We will do that only for the three main sections. We're going to go first to the overview, like here, and this time we go to the inner padding and add seven, something like that. You can see that as we move the values away from the border, it becomes easier to read. We can do the same for this section over here. We are at the demo section; give it seven as well. The same for the income: the income section over here, let's give it seven. Now we can see that those values, Male and Female, are no longer right on top of the border. Let's have another look. I think we can add spacing between those chart titles and the title of the section, right? Let's select the whole container, Demo Charts, and add padding on the top, only the top, something like five. We have a nice space here. Now as you can see, in Demo Charts we still have some empty space below, right? What we can do is edit the height. Instead of this value, we can increase it to 300, so that we use the whole space. Now let's go to the other section, the income. Let's select the whole container, Income Charts, and do the same, so we add five on top. Now we have some spacing between the title of the main section and those charts. If we sit back and check all the sections and the spaces between them, we can see that everything is perfect. We have 20 everywhere, and only here do we have a problem, right?
As you can see here, Tableau shows it as hatched lines, which means there is an issue with the spacing, so we have to fill it. What we can do is just click on one of those charts and move it downward. We just keep pushing until we reach the limit, right? The spacing between those sections is perfect. That's all about the spacing between the sections. Now we have to focus on the spacing inside each of those sections, between the charts. Of course, we're going to fix all those dividers between the charts. I would say let's start with this section, the demographics. My rule is that inside one section we have ten between the charts. Let's do that. We're going to go from left to right, so we select the gender over here and set the outer padding on the right side to five. Let's set it like this, and then on to the next one, our divider. Our dividers always have ten outer padding on the top and ten on the bottom as well, and we now have to make it really thin, so we're going to edit the width and set it to only one. With that, we get a really fine line between the charts. Now let's move to the next chart over here. We're going to have five on the left and five on the right. With that, we have a total of ten between the charts. That's it; let's go to the next one. Here we have a divider. As usual, we're going to have ten on the top and ten on the bottom, and we have to make it thin, so we edit the width to one. Now let's go to the last chart over here, so the whole container. On the left side, we're going to have five, and that's it. On the right side, we don't have to do anything. As you can see, we now have really nice separation between all those charts, with enough spacing between them. Finally, we can adjust this middle chart, since the spacing is now perfect.
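The spacing rule above is simple arithmetic, and a short Python sketch makes it explicit. The `Padding` class and the variable names are hypothetical and exist only for this illustration; the sketch also assumes the divider itself has no left or right outer padding, which the lesson does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class Padding:
    """Outer padding in pixels around one dashboard item."""
    left: int = 0
    top: int = 0
    right: int = 0
    bottom: int = 0


# Values taken from the walkthrough: 5 px towards each neighbour,
# dividers padded 10 px top and bottom and only 1 px wide.
gender_chart = Padding(right=5)
divider = Padding(top=10, bottom=10)
education_chart = Padding(left=5, right=5)
DIVIDER_WIDTH = 1


def blank_space_between(left_item: Padding, right_item: Padding) -> int:
    """Empty space separating two neighbouring charts, excluding the divider line."""
    return left_item.right + right_item.left


gap = blank_space_between(gender_chart, education_chart)
print(gap)                  # blank pixels between the two charts
print(gap + DIVIDER_WIDTH)  # total distance including the 1 px divider
```

The point of splitting the ten pixels as five plus five is that every chart carries its own half of the gap, so charts can be reordered without recalculating the spacing.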
We're going to do it like this. We can select the top chart and just reduce its size a little bit, like this. Now we're going to squeeze this chart from left and right until it matches the values. Let's go to the outer padding over here, deselect All sides equal, and let's start with something like 40, 70. We are almost there. We have to keep pushing between those values. Maybe like this. Yeah, we are almost there, but we are shifted a little bit to the right. Let's increase the right and maybe the left, and, come on. So now we have it perfect. Now if I deselect, it looks like the bar chart sits exactly on top of those values. Now we're going to do the same for the right side. I think we have to push more from the top. Let's go over here to the outer padding and deselect All sides equal. Let's start with 20. I think we are almost there. Let's go with 25, maybe one more: 26. Perfect. So now we have it exactly on the rows of the ages, and the chart looks really amazing. Okay, we are done with the demographics. Let's go to the income. We're going to do the same thing. We select the whole container of the chart, and on the right side we're going to have five, like this. Then we edit the separator: from the top we're going to have ten, from the bottom ten as well, and of course the width is going to be one. Let's do it like this. Now let's go to the right container, and on the left side we're going to have five. With that, we have a total of ten. I would say we can push on the spacing on the left side a little bit. All right, now with that, I'm happy. A final look at the income: I would say we can increase the overall height of those charts. Select the whole container and push the height up. Let's go with 300 again. We are done with the income section. Now let's go to the left side.
Let's start with the first BAN over here. We're going to have five between the charts as well, but this time vertically. We have four over here, but we can make it five in order to stick with the rule, and let's make it a little bit bigger to see the BAN. Then we have our divider. This time, we're going to have ten on the left and on the right, and a height of one, like this. Now we're going to center everything, so make sure to have it something like this. We also have to change this divider: ten on the top, ten below as well, and the width is going to be one as usual. Then we have to make sure again that the containers have the same size, something like this, with the divider in the middle. Perfect. Now let's go to this title over here. Select the whole container and add five on the top. I would say, since it's a line, we're going to have ten on the left and ten on the right, like any other divider. We're going to have ten here, and ten as well. Now, since we cannot edit the height here, only the width, we're going to squeeze it from top and bottom. How are we going to do that? Let's select those separators and go to the outer padding. Let's have 15 on the top and 14 on the bottom, and with that we get the line effect. The same for the other separator: 15 on the top, 14 on the bottom. With that, we have a line. Here, there is no other spacing. Let's go to the other title, the locations. We can do the same thing. On the top we're going to have five, not ten, and on the left and right we're going to have ten, since it's a separator. Now we're going to do the same for the separators: 15 on the top, 14 on the bottom, and the same over here, so 15 and 14. Nice. Okay, great. Now let's have a look at the whole dashboard. Let's go to presentation mode.
Now sit back and check whether you can find any problem with the spacing. From my point of view, we have a perfect dashboard. We are done with the spacing between the containers, charts, sections, and everything. It looks really professional, right? Okay, as the next step, we're going to add tooltips to all our charts, and I think you would agree with me when I say that adding tooltips is a little bit boring. But they provide really useful information for the users. Let's do it. We're going to start with our BANs, beginning with the active employees. Let's go to the chart. Now let's go over here to Tooltip, and we're going to do the following. We're going to say Total number of active employees, and then insert our measure. It's very important that we always follow the same standards when we use tooltips. I would say that normal text should never be bold. Only the words that you want to highlight should be bold, for example, here. What is important is the active employees. Of course, the measure itself is already bold. As for the coloring, we're going to use two different grays. For the normal text over here, let's go to the colors and choose this gray over here. Let's select it. Then for the highlights, we're going to use our dark gray, like this, and the same for the measure. For now we are done. Let's copy it, because we're going to use it in the next chart. Click OK, then let's go back to our dashboard and just hover over it. You can see very nicely the total number of active employees, and then the number. Now let's go to the next BAN, the hired employees. Let's go to the tooltip and replace the whole thing with this one. Instead of active, we're going to have hired. Let's give it the color that we usually use for hired, the green one.
Of course, we don't use the total active, so we're going to go and insert the total hired and, of course, remove the active one. That's all. Let's go and copy it for the next one, and of course, we have to go and test. So let's go. As you can see, the total number of hired employees, and we have the number. Let's go to the next one. Here we have the terminated, so we're going to use "terminated", and for that we need to use the pink color. And here, of course, we don't have the hired, we're going to have the terminated, like this. Let's hit OK and check the result in the dashboard. Everything is perfect. Now let's go to the line charts, and we're going to go to the tooltip, but make sure that you are not selecting the tooltip of any of those marks. Make sure to select All, so that we have the same tooltip for both of the charts. Stay at All and go to Tooltip. Now let's go and add it as a new line. We go and remove this one, but we need the year. Of course, now we have a chart, and depending on where our mouse is, we can have the year displayed. Let's go and make it bigger, like maybe 11, and let's make it green as well. Let's go and hit OK. Let's go and test it. As you can see, we have here 2017, 2020. You know what? I would like to go and add the percentage side by side with the number. Let's go and get the total hired and drop it on the tooltip, and then let's go to the tooltip and add a pipe. Then we're going to go and insert the percentage. Let's go and test it. Now, as you can see, we are getting both the percentage and the absolute number. But I would like to go and get rid of the decimals. Let's do it from the data source. Right click on the field, go to the default properties, then to the number format, remove the two decimals from the percentage, and then hit OK. With that, as you can see, we don't have any decimals with the percentage. Perfect. Now let's go and copy the whole thing for the next charts.
Of course, we're going to go and test it on the dashboard. As you can see, it looks really nice. Let's go to the next one. And the same here: make sure to select All, then go to the tooltip and insert the whole thing. Now instead of the hire date, we need the year of the termination date. Like this, and I remove the old one. Now we're going to have "terminated". Of course, we go and change the color to the pink, like this. Here we have the wrong measure, so let's get the total terminated, like this, but make sure to select the same color, so it is our dark color. And we have to create a new percentage for the terminated. Click OK for now and we can go and test it. As you can see, the total hired is not working. Let's go and fix it. We're going to go over here to the total hired with the percentage and duplicate it, and we're going to go and edit it to total terminated. Here instead of hired, it's going to be total terminated divided by the total of total terminated, like this. Let's go and hit OK, and let's go and grab the total terminated percentage to the tooltip, and let's go and edit it. We have to go and insert it and remove the hired one, like this. Now we have a nice percentage in our tooltip as well. Let's go and test it in the dashboard as well. It looks nice. Now let's go to the departments. This is going to be interesting. Let's go to the sheet. Now what we're going to do is go to the tooltip and insert our template. Now what is the main dimension over here? It is the department. Let's go and insert it and remove the hire date. Now here, depending on where our mouse is, we're going to get either the hired or the terminated employees. We cannot have it static like this. We're going to go and insert the status over here. Now it's going to be dynamic. Let's go and make it bold and make sure that we have the right color, so it's going to be the dark gray, and I think we can leave it like this. Let's go and test. So let's go to Operations over here.
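As a rough illustration of the percentage field duplicated in this step (each year's terminated count divided by the total terminated count), here is a minimal Python sketch. It is not the Tableau formula itself; the function name `share_of_total` and the sample counts are made up for the example.

```python
def share_of_total(counts):
    """Return each key's share of the overall total, as a fraction."""
    total = sum(counts.values())
    if total == 0:
        # Avoid dividing by zero when there is nothing to count.
        return {key: 0.0 for key in counts}
    return {key: value / total for key, value in counts.items()}


# Hypothetical terminated counts per year (not taken from the course data).
terminated_by_year = {2017: 10, 2018: 30, 2019: 60}
shares = share_of_total(terminated_by_year)

for year, share in shares.items():
    # Absolute number, a pipe, then the percentage without decimals,
    # mirroring the tooltip layout described in the lesson.
    print(f"{year}: {terminated_by_year[year]} | {share:.0%}")
```

The same helper works for the hired counts, which is why the lesson can duplicate the hired percentage field and only swap the measure.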
As you can see, we have Operations and the total number of hired employees, but the percentage is not working. Now let's go to the terminated employees, and as you can see it is dynamic and changes to terminated employees. So far it is working, but we have to go and fix the percentage. That's because we don't have it in the chart, so drop it on the tooltip. Let's go and check. It's still not working. I think we have to go and insert it again. Let's go and insert it and remove the old one. All right. So let's go and hit OK and test. Now it is working. All right. Now here is a best practice as well. If the dimension in your chart has a hierarchy, as you can see here, where we have department and job title, we can go and add the dimension that is next in the hierarchy as a tooltip. We can go and build a special chart for the job title and include it in the tooltip. This is a really amazing technique to quickly drill down to the next dimension without changing the whole dashboard. Let's go and do that. It's very simple. We're going to go and duplicate the departments sheet. Now let's go and give it the name "job titles". Now what we're going to do is go and replace the department with the job title. Let's go and do that. Now I would say we're going to go and reduce it a little bit. We don't need the status as a color at all, so let's go and remove it. But we still have to go and sort the data, which is not correct right now. Let's go to Sort, then we're going to go with Field, descending, and, of course, go and select the correct field, which is the total hired, since we are using it in the chart. Let's hit OK. Now about the coloring, I would like to go and highlight only, let's say, the top two jobs. In order to do that, let's go and create a new calculated field. Let's call it top two, and the function is very simple, so we're going to have the rank function.
Then, what are we ranking? We are ranking the total hired. If this is smaller than or equal to two, then it's going to be true. Otherwise it's going to be false. Let's go and call it rank top two. Now with that we have a new dimension. Let's go and grab it to the colors. Now as you can see, we are highlighting the top two, and of course, we have to go and change the coloring for that. If it is false, it's going to be the gray, and if it's true, it's going to be the green. That's it. Let's hit OK and of course, go and remove the legend. I would like to see the labels at the end of the bar. Instead of center, let's have them on the right side, and let's go and change the color to our gray color, like this. All right. Now for the next step, we're going to go and add the whole chart inside the tooltip of the departments. Let's go back to our departments sheet and its tooltip. Now what we're going to do: let's have a new line. Let's call it "total by job titles". Now we have to make sure that the coloring is okay, so we're going to use this gray, and the job titles part is going to be our dark gray, and only "job titles" is bold, like this. Now for the next step, we're going to go and add our chart. So let's go and do that. Go to Insert, to Sheets, and then we're going to go and add the job titles from here. So let's hit OK and check the results. 206. HR Project | Build the Table: Now let's check the second section of the user story and the requirements. So here we have the employee records view. It says that we have to provide a list of all employees with necessary information such as name, department, position, gender, age, education, and salary. Another point in the requirements is about the interactivity: the users should be able to filter the list based on the available columns. Here we don't have to build any visualizations or charts or anything.
We only have to provide a list of all employees with the important information, and on top of it, we need filters. It sounds very simple. Let's check how we can build lists in Tableau. Let's start building the chart immediately. Here we have two methods. Either we're going to go and build a simple list, where we have a simple table in Tableau, where we're going to go and add, for example, let's say the employee ID, then go and add the locations. As you can see, we are just adding dimensions side by side. So of course, we can say this is the detailed list of the employees, and the job is done. But I cannot go and put in each cell, like, two pieces of information underneath each other, and I cannot go and add icons and so on. So it is a nice, quick way, but it is very limited. And now the other method is that we're going to go and use some tricks in order to customize the list. It is time consuming, but the end result is really nice in Tableau. So since it's an advanced project, I'm going to go with the advanced techniques. So now, what are we going to do? We're going to leave the employee ID as a starter, and make sure we are selecting Standard and not Entire View. Otherwise, we're going to have all the employees in one view. This will not work. So make it Standard. Let's go and remove the header. And of course, I'm going to go and change the design of our worksheet. So let's go somewhere here and say Format, and we're going to go to the shading and let's make it black. Of course, we're going to change that later once we have everything in the dashboard. So what do we see here first? We have the IDs of the employees. Let's go and hide the header as well. And the coloring of this dimension is going to be our light gray. So let's change that. Now, this is the only dimension that we're going to use as a row, and everything else is going to be columns, and we're going to do the following trick. So we're going to go over here and say average and -1.0 like this.
Now as we learned, this format is going to add a placeholder for a shape, for a visual. Now for the chart type, we're going to go with the shapes. So now we have shapes here. Now we have, like, circles everywhere. This is our placeholder. I'm going to go and change the format of our grid as well. So what do we need for the lines? I make sure everything is None, just to make sure that we don't have anything. Then we're going to go to the columns and remove the grid, and we're going to go and add a fine line for the rows, but I'm going to go and make it really dark. Now it looks nice. Let's go and hide the header information as well. So the first column is going to hold all the information about the demographics. What do we need? We need the first name and the last name, since they are the most basic information about each employee. Now we have the first name and the last name separated. What I'm going to do is go and create a new calculated field. I'm going to call it full name. Now I'm going to go and merge both of them, like a concat of both of those fields. We have the first name, then we're going to have a plus and a space between the first name and the last name, and then we're going to get the last name inside our calculation. With that we have the full name. We have it as a new field. Let's go and drop it on the labels over here. So as you can see, we have the full names of the employees. Now, for the shape, let's go and add the gender. So we're going to go and have the gender on Shape over here. We cannot see it yet because of the colors, so let's add it to the coloring as well. So now we have the same shapes that we used in the income analysis. Now, what else do we want to add? For example, the age. Let's go and drop the age to the label as well. And as the last information about the demographics, we're going to have the education level. So let's drop it to the labels as well.
Now as you can see, we have a lot of information that is not really nice, and there's a lot of overlapping. So we have to go and format it. Let's go first to the labels, and we're going to go inside in order to customize the information. Everything is going to be aligned to the left side, and then we're going to have the age and the education side by side, split by a pipe. About the style: the first row is going to be bold and use the light gray, and the second row will not be bold, but we're going to go and use our dark gray. This is going to be our style for all columns. Let's go and hit OK. Now as you can see it looks nice. We have the full name, and below it we have a little more information about the employee. But still, as you can see, the alignment between the information and the ID is not correct. What we're going to do is go to one of those rows and just slightly increase the size until it fits the screen. I'm going to go with one more increase. With that, as you can see, one row holds all the information and there's no overlapping, and you keep doing that until you don't have any overlapping between the employees. As you can see, it already looks very nice compared to having a simple list. Now on the right side, we have those legends. Let's go ahead and remove them. We don't need them. Now we're going to go to the second column. It's going to be a bunch of information as well. What we're going to do is just copy it. Hold Control and just drag it side by side. Now as you can see, we have two columns. I'm going to go and format the grid as well, so we're going to go to the grid over here, to the columns, and we're going to remove the column divider. I'm going to go and remove the rows as well. Let's go to the rows and remove it. It looks cleaner. What are we going to do with the second column? Let's go and add the whole dimension of the department and the job titles.
Make sure to select the correct one. The first one is for the demographics and the second one is going to be for the departments and jobs. Let's go and remove everything from it. Now we're going to go and drop in the information. Let's get the job title to the label first. It's more important than the department. Then the second one is going to be the department. As usual, we're going to go and design it. Everything goes to the left, the first row is going to be bold and light gray, and the second row is going to be dark gray and not bold. That's it. Let's hit OK. As you can see, it looks really nice. Now the question is, do we have an icon for the departments and jobs? Well, I don't have one, so that's why I'm going to go and hide it. If you have one, you can go and add it. What we're going to do is go to the size and reduce it completely. But we still have a fine dot. We have to hide it with the opacity. Now if I remove it like this, you will not find it anymore. This is the trick, and it looks really nice. Now, let's go and add another column. This time it's going to be about the location dimension. Same thing. Let's go and switch to it. I'm going to go and add the location as a color this time, and then the city to the label. We're going to get both of them as a label. Now let's go immediately and start formatting. Both go to the left side. I wish to have the city first, then the state. As usual, the first one is going to be light and bold, and the second one is going to be the dark one. All right. Now let's have a look. Everything looks nice. I'm going to go and change the design of the shapes. It's going to be a filled circle, and it's a little bit big, so I'm going to go and reduce the size of this one. If it is HQ, it's going to be green, and if it's a branch, it's going to be gray. You can see it's not that complicated, right? It's easy. Let's add another piece of information. I think now we can go and add the salary, but sadly we cannot go and add anything else to the salary.
So we have to go and use it alone. Let's go and add the salary to the labels. Here we have those numbers. I would like to format them, so let's go and format the numbers. Let's go to Numbers, and then we're going to go to Number (Custom), reduce the decimals, and as a prefix, let's add the dollar sign. The number looks nice. Let's go to the label and design it. Here we have the information from the previous column. We don't need it. We have only the salary, and since it's in the first row, it's going to be light gray and bold as well. Let's hit OK. For now, I don't have any shapes for that. That's why we're going to go and reduce the size and set the opacity to zero. Now to the next column. What are we going to have? We're going to have the status of the employee, the hire date and the termination date. The status of the employee we're going to use as a color. With that we have the gray and the green, and we're going to make the circle a filled circle and reduce the size, something like this. Now I would like to add it to the label as well. Now what do we need? We need the hire date on the label as well, and the termination date too. But here we have it as a year, and I would like to have the exact date. We're going to go and switch it to Exact Date and then to Discrete, and the same thing for the termination date: to Exact Date, and then to Discrete. Now we have all the information. Let's go inside and start configuring it. Now we have here the status, the hire date and the term date. Let's move everything to the left side, and we're going to put the hire date, then a minus between them, then the term date. We're going to go and design it as usual, so the one below is going to be the dark one. Okay. Let's hit OK and check. Now we can see in the output that we have the hire date, and let's look at a terminated employee. As you can see, we have a termination date here side by side. All right.
Now the last column is going to be interesting. We're going to have a bar chart indicating the length of hire. We're going to go and calculate the duration of the employment in years. Let's go and create a new calculated field. We're going to call it the length of hire. Here we have two calculations. If the employee is hired and not terminated, we're going to go and calculate the years between today and the hire date. Let's go and do that. We're going to need an IF statement, and then we're going to check whether the employee is still hired or not using the following logic, as usual: ISNULL. So we are checking the termination date. If it is null, then the employee is not yet terminated. So what happens then? We're going to calculate the difference between today and the hire date. DATEDIFF, and we're going to have a year. I'm going to go and add it as a new row. What we are calculating is between the hire date and today. This is the formula for the employees that are not terminated, and now we're going to have the otherwise, the ELSE. We're going to have the DATEDIFF again, and now it's not between today and the hire date, it's going to be between the hire date and the termination date. It's going to be the same thing: year, hire date, and termination date. It's very simple. Let's go and end it. Let's hit OK. So now we have a new measure, and I would like to go and test it first. Remember the first sheet, where we test stuff. I'm going to remove a few things. We need the hire date, the termination date, and our nice new column. I'm going to show it as discrete. Now, of course, depending on the year in which you are doing the tutorial, you might get different results. Now as you can see here, we have six years, two years, two years, and so on. Since we have a termination date here, we have a zero. Everything is working, so let's go back to our detailed list. Now we need a new column, but this time we will not use the placeholder because we already have a measure.
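To make the two branches of the length-of-hire field easier to follow outside Tableau, here is a minimal Python sketch of the same logic. It is an illustration only: the function name and sample dates are made up, and it assumes the year difference is counted as calendar-year boundaries crossed (which, as far as I know, is how a year-level DATEDIFF behaves), not as completed years.

```python
from datetime import date
from typing import Optional


def length_of_hire(hire_date: date,
                   term_date: Optional[date] = None,
                   today: Optional[date] = None) -> int:
    """Years of employment, counted as calendar-year boundaries crossed."""
    if term_date is None:
        # Not terminated yet: measure from the hire date up to today.
        end = today if today is not None else date.today()
    else:
        # Terminated: measure from the hire date to the termination date.
        end = term_date
    return end.year - hire_date.year


# Hypothetical employees (not from the course data).
print(length_of_hire(date(2018, 5, 1), today=date(2024, 1, 15)))      # still employed
print(length_of_hire(date(2020, 3, 1), term_date=date(2020, 11, 1)))  # left in the same year
```

Passing `today` explicitly keeps the result reproducible; as noted in the lesson, a calculation based on the current date gives different numbers depending on the year you run it.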
We already have the length of hire, so let's drag and drop it side by side. Now we have to go and configure the chart type. It will not be a shape. Let's go and use the bar. Now we have a bar in our chart. I'm going to go and reduce the size of it. Maybe more. Now let's go and add content to those bars. Let's start with the status. I'm going to put it on the colors, and we need the label as well, so we're going to take the length of hire to the label too. Now let's go and edit it, so let's go inside. We don't need all this information. We have the number of years here, so let's go and make it bold and change the color to light gray as well. After that, we're going to have "years", like this, and maybe not bold. That's it. Let's go and hit OK. Now we have the years at the end of the bars. But what we can do is go and change the alignment completely to left and center. All right. Now let's go and check the results. As you can see in the list, we have the two colors. Here, for example, we have one year with a termination, and here as well. The legend is working. Now, as you can see, things might be very tight. What I'm going to do is go and change the size of all those texts. Let's go to All, then to the label, then to the font, and let's make it eight instead of nine. With that we're going to have better spacing between those columns. Now for the next step, I'm going to go and remove all the information here on the axis. Let's go and uncheck Show Header, and we are done. Now we have a really nice list of the employees. Again, this is the method that is time consuming, but as you can see, we have nice bars, we have a lot of icons, and we have multiple pieces of information in one column. It is a little bit confusing at the start how to build it. But once you understand it, you can go and make amazing lists. And of course, having a simple list is fine as well. 207.
HR Project | Sketch Mockup of Detailed Dashboard: So now we can plan the mockup for the second dashboard, and this one can be really easy. We have the same title, but at the end we're going to swap it with "details". Now in the middle, we're going to have only one section called the employee list, and here we have only one type of chart. We have a list, so we're going to have multiple rows and multiple columns, and information in each cell. Now, of course, if you have a detailed list, it would be nice if we could filter the list. That's why we're going to put on top of each column an option for the users to filter the information that we can see inside the cells. In the end, as you can see, it's very simple. We have only one list, and on top of it we have filters. That's it for the dashboard mockup. As you can see, it's really easy. Let's move to the second mockup where we're going to plan the containers back to Toyo. Now I have a screenshot of our new mockup, and I copied a lot of stuff from the previous design. Now let's dive in and see how we can do it. We're going to focus on the black box in the middle. What do we have here? We have a title, then filters and a list. We need a vertical container for that. Let's go and do it. This is the main vertical container, like this. Now what do we need? We need a title. First, let's start with one title. It's going to be on the left side as well. I'm going to make it like this. Now what do we have below it? We have different filters side by side, so we need a horizontal container. Below it, we're going to have a horizontal container like this, and let's remove this, and inside it we're going to have multiple filters. It's going to be filters, and they are all going to be side by side. Of course, there are way more details than what I'm showing you now, and we can talk about them later. Here, we are talking about the rough design of the containers. Now what do we have below the filters? We have our chart, the list.
It's going to be only one object without any container, so below it, we will have a big list like this. That's it. Now let's go and focus on what we can have inside the filter. I just took a copy of a filter, so let's design the container for this. As you can see, it's, like, things below each other, so we need a vertical container for the whole filter, like this. Now inside it, we're going to have a title side by side with an icon. For that, we're going to go and get a horizontal container. So inside it there is going to be a horizontal container like this. We're going to have a title for the filter, and side by side with it a very small green icon. Now to the next one. What do we have? We have filters underneath each other, and that's why we're going to go with a vertical container for the filters. It's going to be like this. And inside it, we're going to have multiple small filters: filter one, and another one below it. This is the design of each of those filters that we have on top of the list. All right guys. With this we have a rough plan for the container structure and for the dashboard itself as well. Now let's go back to Tableau in order to build our dashboard. 208. HR Project | Build The Detailed Dashboard : Now, we're going to go and create the dashboard for the detailed list. But this time we will not do it from scratch. We're going to go and duplicate the whole work that we have done and only make a few adjustments for the new dashboard. It's time consuming only for the first dashboard, but once you have it, you can go and duplicate it for the rest. Let's go and do that. We're going to go and duplicate this dashboard, and we're going to go and rename it to HR Details. So now for the first step, we're going to go and prepare the containers as usual. Let's go and make this bigger, and let's go to the layout. Now of course, we are not going to change the nav container. We're going to go and work with the container in the middle.
Let's go to the whole dashboard over here and drill down, so here is the nav, and here we have the header and charts. It's fine. Let's go inside it. Now we have the header here, and it's going to stay as it is, but this container is going to be dropped completely, so right click on it and remove. Well, yes. What is left over here is this legend. I'm just going to take it and put it here on top. Maybe we're going to use it later. Now let's focus on creating the content in the middle. What do we need? First we need a vertical container. Let's drag and drop it exactly below the title. Then as usual, we're going to go and drop in some blanks. This is the first blank and then the second blank. We can of course go and mark it if we want. The whole thing is going to have the border, the orange one. Now we can go and rename it as well: filters and list. Now, for the filters, we need one horizontal container. Let's go and drop it here on top. Of course, we're going to go and add some blanks inside it. This is the first blank. We have it somewhere here. Then the right blank, in order to have it fixed. Select the whole thing, and we're going to mark it as a blue container. Now what is below the filters is going to be our list. Let's go to the dashboard pane, and we're going to go and grab the details. Let's drop it beneath the filters. Let's go back to the layout and check it. As you can see, we have the filters and the details, so now we can go and remove the blanks. We don't need them anymore. And looking at the chart, we can go and remove the title. These are the main containers for the dashboard. Now what we're going to do is go inside the filters container, and we're going to build one container for each group of columns in order to have the filters for it. Now for the first two groups of columns, I'm going to do it slowly, step by step, but for the rest, I'm going to speed up the video. Now let's start with the first container, for the employee ID.
What do we need? We need a container, of course. It's going to be a vertical container, and inside it we have two blanks. And make sure to have it exactly below. This is our container. Let's make it a little bit bigger, and we can of course go and mark it in order to see the borders. It's going to be this one, in orange, and we're going to go and rename it like this: employee ID filter. Now, what we need inside this is two containers. The first one is a horizontal container for the title of the filter. We're going to have a text inside it immediately. Let's call it employee ID. Let's take it to the middle, change the color to light gray and maybe make it size ten for now, so hit OK. Next we need a second container, but this one is going to be a vertical one, exactly below it. Let's go and add a few blanks inside it as well, just to make sure that we have it as a vertical container. Let's go and rename things. This is going to be the title, and below it we're going to have the filters. Of course, we can go and add the borders in order to see everything. Let's go and remove those placeholders, so remove this blank and this blank as well. Now for the next step, we're going to go and add a button for the second container, to be placed in the first container. Let me show you that. Make sure to select the filters container, right click on it and choose Add Show/Hide Button. Now we have a small button over here. We have to go and remove the floating from it, so it lands somewhere here. Now, drag it and put it side by side with the title. Let's go and make the whole thing a little bit smaller. Now in order to understand what I mean with this button, we're going to go and add a filter inside the second container. What we're going to do is go to our list and to the small arrow, and then let's go to Filters, and let's grab employee ID. Now as you can see, our filter is now inside the filters container.
It's very important to make sure that everything is in the correct container. Let's go and test it out. Now why do we have this button? Check this out. If I click on it, we don't see any filters, so we are hiding the filters, and if we click on it again, we can see the filters. That's why we have to have this icon outside of the container, in order to control the visibility of this container. This button controls whether we are showing the filters or not. Now, let's make the design a little bit better, so let's go inside it, and this time we're going to go to the button, so let's go and edit it. If it is shown, I have an image for that. It's going to be this arrow, the green arrow, so let's go and select it, and if it is hidden, then we have the gray one, like this. So let's go and hit OK. Now we have to make sure that the whole container of the title is fixed. As you can see it has a fixed height, which is correct. Now let's go and test it. As you can see, the arrow is now active, but once I click on it, it's going to be inactive, and it has a really nice effect. Now we need to fix something. If you look here, I'm hiding the filter, but there's a lot of wasted space. What we're going to do is make things more dynamic and flexible. If I'm not showing any filters, this space should be used for the list. Currently, we are wasting a lot of space. Let's see how we can fix that. So let's go back to our dashboard. Now for the first step, we have to make sure that our list is flexible. Let's go to this small arrow over here, and we have to make sure there is nothing selected here, so Fixed Height is not selected, which is correct. Now for the next step, we're going to go to the filters container over here, select the whole thing and make sure it is without a fixed height as well. Go over here. You can see it has a fixed height, so let's go and remove it. Now as you can see, Tableau uses the whole space, so now it's more variable and dynamic.
Now one more thing that I would like to do is go to the filters and remove all those blanks, so remove this one and this one as well. Let's go and test again. Now we are using the whole space because we are not showing any filters, but once I click on the button, what happens? It's going to use the space in order to show the filter. This is very dynamic and looks really nice. That's all for the first filter. Let's go and make everything smaller. And I'm going to go and do the same thing for the second filter. Here we have a bunch of information, around four pieces of information, so we need four filters for that. Now we're going to go and do the same thing. We need a vertical container side by side. Let's go and add a few blanks inside it. It is this very small one. I'm going to go and select it and maybe change the color of that as well. Like this, it's still small, so make it bigger. All right. So the first container inside is going to be the horizontal container. I'm going to go and add the text for that. This one is going to be the demographics, in the middle and light gray, and let's make it size ten for now as well. Hit OK. Then for the next step, we're going to go and add another container, and this time it's going to be the vertical container below it, and here we're going to have a lot of filters. Let's go to our list again. The first thing we need is the full name. It's dropped over here, so let's go and drop it where we want, and we're going to change it to a drop-down list. Now for the next step we need to go and get the gender filter. Let's go and get it. Now we have it over here, so drag and drop it exactly below the full name. I'm going to go and remove this blank. Otherwise, it's going to confuse us, so remove it from the dashboard, and the second one as well. Now it's fine. Let's go and edit the gender. It's going to be a drop-down list. Now for the next one we need the age. I'm going to say, let's go and get the age group. Let's go to Filters.
We don't have it yet because we don't have it in the list. We have to go inside the worksheet. Let's go to all and drop the age group somewhere in the details here. Then we should be able to find it. Let's check the filters again. And now we have the age group. Of course, we can have it on the first filter. Let's go and drop it exactly below the others. Always make sure that you are dropping everything inside this vertical container. Let's go and rename them as well. It's going to be the filters, and the one above is the title, and the main one is the demographic filters. Let's go back to our filter, make it a drop-down list, and we need the last one. It's going to be the education level. We're going to have it here as well, drop it exactly below the others, and make it a drop-down list. Great. Now for the next step, we're going to go to the filters and add a button for that. Let's go and do it, add a button. We have it over here, change it from floating to tiled. We have it over here. Let's drop it side by side with the title. It's not working, so we'll drop it somewhere here first, maybe, and then take it near the title. Great. Now, let's select the whole container, make it smaller, and we're going to go and work with the icon. Let's use the green one for shown. And the hidden one should be the gray. And we can of course go and test it. So now close it, and show it. We have to go and fix the height in order to not have this strange effect. So fix the height, and now we will not have it. Hide it and show it. All right. Now what we're going to do is fix the design of those two filters, and we're going to follow the same design for all the other filters. Let's see how we can do that. First of all, I'm going to go and give a background color to the whole section. Let's go and check the whole section; it is filter and list. So let's go to the background over here and pick the black one.
Now, for the next step, I'm going to go and remove the background color of the worksheet. Let's go to Format, then to Shading, and remove the worksheet color. Now let's go step by step for those two filters. First, I'm going to go and switch the title and the icon. I would like to have the icon on the left, the same as over here. Now for the next step, those icons are really big. Let's go and give them a fixed width, and let's have a value like 25, the same as over here, so fixed and 25. For the next step, I'm going to go and work with those titles. Let's move it to the left and make it smaller, to nine. The same thing here: instead of employee ID, let's have only ID. We don't have a lot of space, so make it nine and move it to the left side. Now for the next step, we're going to go and work with the coloring. Let's pick one of those filters, then go to the formatting for filters and set controls. Now for the title, we're going to make it smaller, to eight, and the color is going to be the dark one. Now for the body, it's going to be eight as well. And this time, the color is going to be the light gray. It seems the title changed again, that's strange, so let's go and change it back to the dark gray and test. So the color of the values is okay and the titles are darker. Nice, great. Now for the next step, we're going to go and place the filter exactly on top of the column itself. Let's go and do that: select the whole container, and let's place it exactly on top of the IDs, something like this, and the same thing here. Let's move it, maybe around here. But we still have a divider between them. Let's go and check the layout. So we're always going to have it like this, a filter and then a divider between them. Let's call it divider. How are we going to style the divider? It's going to be, as usual, a dark gray. Now let's go to the outer padding and set everything to zero. Change the width to one.
So we have it very thin, and then we're going to go and add an outer padding to the left and right. Let's have something around 36 on the left and six on the right. We have a small separation between them. Of course, as the last step, we're going to go and remove all those borders. We are done with that. We have a border here as well, and the same thing for the next filter. We have a border here. Now we can see we still have space between the filters and the list, so we can go and select the whole thing. Just to make sure that we are selecting it. Let's just shift it to the education level. All right. Now, checking it, that divider doesn't look good. So let's go back to the divider and have ten on the top as well, and ten below it as well. So let's check the design again. All right, so we are done with the first two filters. We have to go and repeat the same stuff for all the other columns. So what's going to happen is that I'm going to speed up the video while I'm creating all those filters. That was a lot of filters inside our dashboard. Now let's go and test it, so we have all those filters. We can go and hide all those filters as well, but we still have an issue. It is not flexible anymore. I think we still have a fixed height. Let's go and fix that. Let's go and select the whole container. It was the filters container, and it should not be fixed. Yeah, here is the issue, so let's go and remove it, and let's go and test again. We open the first filter, the second, the third. And we are almost there. We still have a lot of wasted space here, so let's go and check the containers. It should not be fixed, and we have it as fixed, so let's remove it. The first one is not fixed, so it's fine. The second one, remove the fixed height, and here as well, it's not fixed, fine. And so is the last one. Great. Let's go and do the final tests. If we close everything, the list should be bigger. Now let's go and add spacing inside our dashboard.
Let's go and do that, and we're going to go and remove all those borders. Let's go and select the whole container, filters and list. And we're going to go and remove the border. Now as you can see at the bottom, we don't have any spacing, so we have to go and add an outer padding. Let's remove the two. We need only 20 at the bottom. Great, now we have space. On the right side it looks good, and on the top as well, now it looks good. Now let's go and add an inner spacing, and it's going to be the number seven for all sides. Great. Let's go and remove the blue container here. We don't need the border. Let's go and expand everything again to see whether we have any borders. We don't have any border colors, great. Let's go and close it. Now we'd like to go and add a title for this list. Let's go and grab a text and carefully put it on top of the current container. We're going to say employee list and then a pipe, and then we're going to tell the users to click on the arrows, so click arrows for filter options. Now we have to go and change the coloring. This is going to be a light gray, bold, and it should be 14 for the size. For the rest, it's going to be a dark gray. Let's go with an eight. All right. Looks fine. Now, let's go and add spacing between those three sections. We have a title, we have the filters, and the list. Let's start with the employee title. I'm going to go and add a padding at the bottom, maybe around ten. Looks nice. Now let's go to the group of filters, select the whole container, and let's go with a padding at the bottom of around ten. With that, we have spacing between all those objects, and it looks way better. Now for the next step, we're going to talk about the legends. I'm not going to use any legends in these charts, so let's go and remove them. We don't need any filters here either, since we have enough filters, so let's remove that as well. And this icon as well. With that, we're done with the main part of our dashboard.
Now we're going to go and check our navigation and the title. Of course, we forgot about the title. Instead of overview, it is details. Let's go and change the size of this word to 16 and maybe make it something darker. I'm going to go and change it to something like this. Yeah, it looks way nicer than before. I'm going to go and take the number of the color, and we have, of course, to change that for the first dashboard. Let's go over here, make it 16, and change the color to the same color as well. It's a little bit darker and it looks way nicer. Now on the left side, we have an easy job. What we're going to do is go to the first icon and deactivate it. Let's go and edit the button, and now instead of active, we have to have it as inactive. Now as you can see it is inactive, and for the second button, we're going to go and make it active. This is going to be the green table. Of course, now we can go and map it. We have this dashboard. Let's go and map it to the details. All right. It looks really nice. Let's go back to the first dashboard, and of course, we have to do the same mapping. Let's go and edit the button, and we're going to map it to our new dashboard, details. Now I would like to go and add one more nice thing in order to indicate that this icon is active. I'm going to go to the dashboard, to floating, and let's grab a blank. Click on the blank and let's go and pick the green color as the background color. Now we're going to go and decrease the size of this to be a small indicator, like this, maybe. And we're going to move it over here. I'm going to say let's make the height 40 and place it exactly near the icon. Maybe something like this. Now let's go and check the dashboard. I'm going to go and reduce the width of that, so let's make it thinner, maybe like this. With that, we have a small indicator that this icon is active. Let's go and do the same thing for the second dashboard.
We're going to grab a blank again as well, and we're going to make its color green. The width is going to be six and the height is going to be 40, and now we're going to go and place it exactly near the active icon. Something like this. All right. Let's go and check the design. It looks really nice. Let's have a final look at our dashboard. Here we have a nice filter and the main dashboard. Here we have this nice information. We can go and download stuff, we can go and follow, and the whole dashboard is interactive. Now if the users want to go to the second dashboard, all they have to do is click on this icon. And we are now on the detail list about the employees, and everything here is very interactive. Let's go and hide all that information, and it looks wonderful.
209. HR Project | Bonus - Build Background Layers using FIGMA: All right, friends, now we have a bonus section, where we're going to go and customize a background image for the layout of our new dashboard, and that's going to make the overall design of our dashboard look really cool and professional. This time, we're going to use another tool in order to create the layouts. We're going to go and use Figma. What is Figma? Figma is a design tool that is used by many UI and UX designers in order to create concepts and mockups for user interfaces. And it is an amazing tool for sharing your work with others in order to work and collaborate as a team. You can find the link to my work with the other links in the project materials. Of course, don't worry about the cost. There is a free plan for starters. Now we will not do a deep dive into how to use Figma. I will just show you how I usually use it for Tableau. Let's go. Now we're going to start with an empty file, and we're going to put in a screenshot of our dashboard. Now for the next step, we need a frame. So let's go and get a frame exactly on top of our dashboard. Now we can go and hide the image.
Now we need a color for our dashboard, so it's going to be something maybe like this. Or let's increase it a little bit. Now what we're going to do is add lighting from the corners. In order to do that, we're going to take the shape of a circle or ellipse and make it like this, maybe a little bit bigger, and send it to the back. Let's go and change the color of this to something here, like in the middle. Then we're going to go and add an effect in order to have something like a glow. We're going to have a blur, and we're going to go and change the value to something around 1,500. So if you check, we have a glow, or like a light, that comes from this corner. Now let's go and add the same in the other corner; we can do it like here. Now let's go and increase the size of this one. Something like this. We need more light coming from the right side, and we still have to make it bigger and a bit darker. All right. With that, we have a background. Next, we're going to go and add the background colors of each section. We need our image again, and now we have to go and zoom in. Now, what we need is a rectangle, and we have to be very careful that we meet the exact edges of our dashboard. So let's get it like this. I'm going to go and reduce the opacity to something around 50 just to see the borders. Yeah. Nice. Now we're going to go and increase it to 100, and now we need the color to be complete black. Now what we're going to do is use the gradient instead of the solid. So let's go and do this. Now we're going to go and work with the lower value. We have to decrease it like this, maybe a little bit more, like this. Now for the next step, we're going to go and add a corner radius for our container, maybe 20, great. Now let's go and repeat the same things for the other containers. We're going to have it for the overview. Maybe reduce the opacity again to see the borders. So like this, and here as well. It's going to meet the same borders.
So now let's go and copy this to the second section. So increase it like this, and we have to meet the edges perfectly. Let's go and do the same for the last section. Something like this. Now we are done. We have to go and increase it to 100 everywhere. Of course, we're going to go and remove the background. We are almost there. What we're going to do is change the coloring of each of those containers. Let's go to the linear gradient, and maybe we're going to take the lower value, like, outside, and this one here. It's going to get a little bit darker. For the next one as well, go to the linear gradient. We're going to have it somewhere here, and the low value is going to be outside. Now what I'm going to do is take those ellipses and put them somewhere like here, and let's keep working on the coloring. Let's move to the next one, to the linear gradient. Let's move this somewhere here and check the colors. We can put it like this, and on to the last one. Like this here. I'm going to have it here, like, rotated. Great. Now let's have a look. It looks very nice. Now I'm going to go and add our second dashboard over here and make sure to place it exactly on top of our dashboard. Let's move it here and let's close some of that information. I'm going to have only the. Now we need one more for the list. Let's go and do this. Decrease the opacity to around 40 to see through. Let's go and meet the borders. Yes. Okay. That's it. We're going to go and increase the opacity again to 100. Now for the fill, we're going to do something like this. And the low value is going to be a little bit outside. That's it. Now we have to go and export those background images. We're going to do it like this. For the first dashboard, what do we need? We need the nav and we need those two, and we have to go and hide all the images. That's it. Click on the container, and we have the option of exporting here. Let's go and export it.
Now we have to go and export again for the second dashboard. So we're going to go and hide that information. We need this, and that's it, so let's go and export again. All right, back to Tableau. We're going to first remove all the background colors of each container before adding the background image. Let's go and do that. Let's start with the whole dashboard. We're going to remove it, and then we're going to go and select the nav and remove it as well. None, and on to the overview. None, on to the next one. And to the last one. It's none. With that, we don't have any background color for the containers, but you still see gray here, and that comes from the default color of the dashboard. If you go to Format Dashboard, you can see we have it as a default. This is nice: if you go to the presentation mode, you're going to have everything as gray. We're going to leave it as it is, and now we're going to go and add the background image. We're going to have it as a floating image in the middle, make sure it is fit and centered, and then choose the image. We're going to go with the background summary. Next, we're going to go and change the size to our dashboard size. And then set the position to zero. Of course, now we are not seeing any of the content, and that's because of the order of the floating objects. As you can see, it is on top, so let's go and move it to the back, and with that, we see the background image of our dashboard. I think it's really nice. Now let's go and do the same things for the next dashboard. The whole dashboard's background is going to be removed, the nav's is going to be removed, and the list's can be removed. With that, we don't have any background colors. Let's go and add our floating image for the background. Center, fit, and we're going to have our image. Same things: the size, the height, and the position set to zero. Now, of course, we are not seeing anything. We have to go and sort the floating objects. It's going to be in the back.
All right, so that's it, I'm really happy about the results. Let's go to the presentation mode. So, guys, what do you think? We have an amazing dashboard, and this is the power of using a background image for your dashboards. We have way more options to add shadows, rounded edges like here, and some lighting. So let's go and switch it. As you can see, it looks amazing. All right, my friends. If you're still here, congrats, you have just completed the Tableau project from scratch, from the requirements to having this amazing dashboard. And with that, you have experienced all the phases of the Tableau projects that I usually do in my real-world projects. So, friends, I cannot stress enough how important it is to take time planning the project before rushing into building the charts and the dashboards. Without a clear plan for the project, things can lead to chaos. So take your time planning it step by step. Of course, feel free to share your project on any platform that you prefer. Use it as a portfolio piece for your Tableau Public profile or on LinkedIn as well. And it would be nice of you if you share and mention my channel to spread the knowledge. So if you like this project and you want me to make more content like this, please support the channel by subscribing, liking, and commenting. This really helps with the YouTube algorithm, and it helps me to reach others as well. And of course, don't be a stranger. You can connect with and follow me on LinkedIn. So, my friends, nothing left to say besides thank you so much for watching the tutorial, and I will see you in the next video. Bye.
210. Congratulations & THANK YOU Video: Hi, I'm very proud of you that you made it to the end. I hope you enjoyed the journey. And I know it wasn't easy going through all those complex tutorials, but you made it to the end. And now I can say that you have learned everything that you need to start doing amazing projects in Tableau.
And you have also learned everything that I know about Tableau and how I usually implement real-life projects in Tableau. So now I'm going to ask you for one more thing. If you found this video helpful and it helped you to start working with Tableau, I would really appreciate it if you like it and share the content with others. And of course, if you have any questions or suggestions for the next topic that you want me to cover in the future, or you want to give me feedback, make sure to use the comments below. Well, nothing left to say. Thank you so much for watching this course and I will see you in the next course, bye.
211. Advanced SQL | Download SQL Server & SSMS: Hey, friends, so now we're going to go and prepare your PC with everything that you need in order to start practicing SQL with me using SQL Server. And of course, everything is free. So for the first step, we're going to go download and install Microsoft SQL Server locally on your PC. Then in the next step we're going to go download and install another piece of software called SSMS. It is like a client in order to interact with SQL Server. And of course, after that, what do we need? We need data. That's why we're going to go download and create three different databases for you to practice advanced topics in SQL. And in the last step, I'm going to take you on a tour of the interface of SSMS so you can get familiar with the interface of the client. So, guys, let's start with the first step. We're going to go download and install Microsoft SQL Server locally on our PC. So let's go. So what is SQL Server? SQL Server is a database management system: it runs the database, and it stores the data as well. So it is basically where the database lives. In companies, they usually install SQL Server on one of their own on-prem servers, or they use a cloud service that runs an SQL Server. And, of course, don't worry, we will not buy any cloud services and we will not use any powerful servers.
What we're going to do, and for free, is go download and install SQL Server on our PC locally in order to practice SQL. Let's go and download it. Either go to Google and search for SQL Server downloads, or go to the link in the description where I've collected all the links that we need. The first one: we're going to go to download SQL Server. Let's go and open that. Now we're going to land on the Microsoft page where we can see the different offerings for Microsoft SQL Server. Either we have it on Azure or we can download it on premises. But we don't want that stuff, so just scroll down to see those two options. The first option, on the left side, is the Developer edition. You will get all the features and services that Microsoft offers with SQL Server. It is free as well, but the installation here is a little bit complicated. The second option, on the right side, is the Express edition. The installation here is going to be really fast and very easy. You will get all the stuff that you need for practicing and learning SQL as well. Both of the options are free. It's just a matter of the installation. We will go now with the Express edition. Go and click Download now. It's a very small file. So let's go and start it, and now the installation starts. So we have Basic, Custom, and Download Media. Download Media means download now and do the installation later. Custom means we have more control over how to download and install the stuff. Basic is the easiest one and the quickest one. Let's go with Basic and click on that. Let's go and accept all that stuff. Now, let's click on Install. Now it's going to install the applications, drivers, and so on. It may take a little bit of time. All right, so with that we are done with the first step: we have downloaded and installed SQL Server locally on our PC. So now everything is up and running.
Let's move to the next step, where we're going to go and download SQL Server Management Studio, SSMS. It is a graphical interface where you can go and start interacting with the database, where you can see the data, write queries, solve tasks, and so on. So in order to do that, let's go and click on Install SSMS. Let's click on that. You can find, of course, this link as well with the other links that I have collected. So now we are again on Microsoft's page. Let's go and scroll down. And now we will see the following link, Free Download for SQL Server Management Studio (SSMS). Let's go and click on that. And then it's going to go and download it. Let's go and start it. The first thing that we have to define is the location. I will go with the default stuff. Let's click on Install. Okay, setup completed. We just installed SSMS. Let's go and close it. Now let's go and start it: if you go to your menu over here and search for SQL Server, you will find it here, SQL Server Management Studio. Let's go and start it. Okay, now we're going to get this window in order to connect to our server. Again, what is our server? It is the one we installed in the first step, SQL Server Express. That's why you're going to see your PC name in the server name; of course, it's not going to be my PC name. But here we have something called SQLEXPRESS. This is the server we just installed. In the first option, we have Database Engine, and we have Reporting Services. Those are different things from Microsoft. We're going to leave it as Database Engine, and it should be like this, SQLEXPRESS. Now, how do we access this database? We have the following options. We can do that using Windows Authentication or SQL Server Authentication. I'm going to say, let's stick with Windows Authentication. The user name is going to be the PC name and the Windows user as well. If for some reason you don't have that information, you can go to your search and search for CMD. Then here you can type whoami.
With that, you will get the PC name and the user that you are currently logged in as. And this is exactly what I'm seeing over here. So we will not change anything. Let's go and hit Connect. Perfect. Very nice. I didn't get an error, and if you have the same, that means we are now connected to our SQL Server.
212. Advanced SQL | Create Databases: Okay. So with that we are done with the second step, where we have downloaded and installed SSMS. So we have all the software now running on our PC. In the next step, we're going to go and get data. So we're going to go download and restore three different databases. We have different sources for the databases: one that I have prepared, and another one from Microsoft. So, the one that I've prepared is a very simple database with a few records for sales, and I made it in order to practice SQL. So let's go and download it. Let's just click on Download course data. And below that, we have the data model of the course. So let's go and click on this link. And what we can see over here is the data model of the database. As you can see, it is very simple. Those are the tables and the relationships between them. So it's very classic: we have in the middle the central table, a very important one, the orders, and to the left and right a few tables like the products, customers, and employees, and all of them have a relationship to the orders table. So as you can see, it's a very simple database. Let's go to the next link, where we're going to go now and download the databases from Microsoft. Let's download project data. Here we have, again, a Microsoft page where it says AdventureWorks sample databases. Let me just scroll down. As you can see here, we have three types of databases. We have OLTP, Data Warehouse, and Lightweight, and you can see the latest version of each type. Now, let me just explain quickly what OLTP and Data Warehouse are. What is OLTP? OLTP stands for online transaction processing.
It is a classic: if you go to any company, you're going to find a few operational databases there that deal with day-to-day business and transactions. It is a traditional operational database that you can find everywhere, in each company, and it is optimized for read and write requests. On the other hand, we have another type of database called data warehouses, or OLAP. What is OLAP? OLAP stands for online analytical processing. These types of databases are optimized to handle large amounts of data in order to do data analytics and business intelligence, maybe to build reports and dashboards, and usually they contain a data model that consists of dimensions and facts. They form something like this, a cube. This cube can help you do analytics, slice the data, filter the data, and so on. Now let's go and download them. Let's click on the OLTP AdventureWorks, and on the data warehouse as well. I would say let's download both of them. Now we have several databases in our download folder. Let's go over there, and we can see we have the two AdventureWorks files from Microsoft and the one zip file that we have just downloaded. This is the simple database that I've created. Let me just extract it first in order to get the file. Let's just get the file. Over here. Now we have the three databases, and they all end with the same format, BAK. This format, .bak, stands for backup. That means we have a backup of the databases and we have to go and restore them on our server. Or let's say install them. In order to do that, we have to go to a specific folder. We need the path for that. I've prepared that as well in the link. Just go and copy this path, and let's go back to our explorer. Let's just go over there. You can see we don't have any backups. Now what we're going to do is copy those files into this path. If I just go back, let's copy and go to the path and just paste them. Great. Now we have the files in the correct place.
If the path didn't work for you, maybe you have a different version of SQL Express than I have. Make sure to go to Program Files, then Microsoft SQL Server, then check the SQL Express folder, then MSSQL, and then Backup. You should have something very similar to it. Now let's go back to SSMS and restore the databases. Let's open our application again. As you can see, we have the server, and inside it we have the databases. Let's go to Databases; inside it, we don't find anything yet. What are we going to do? We're going to click on Databases. Right-click on it, and we will go and restore the three databases, but we have to do it one by one. Let me show you the steps. Click on Restore Database. Here we have the sources. We're going to go to Device, select Device, and then we're going to go to this button, the three dots. Click on that. After that, we're going to go and click on Add. As you can see now, we can see the three databases. Let's go with the first one. Click OK again, and now we have the database over here. Let's go and hit OK. So now the restore, or installation, of the database was successful. So if you click over here, you can see we now have a new database called AdventureWorks2022. This is the OLTP one. Okay, so now we have to get the other databases. Let's keep doing the same stuff: Databases, Restore Database. I'm just going to do it quickly. Device, three dots, Add, and then the DW, the data warehouse. Okay. So it was successful, and we now have our second database on the left side. You can see it over here. Let's go and import, or restore, the last one, the one that I've prepared, the simple one. So add the sales database. And one more OK. So now we have three databases on the left side.
213. Advanced SQL | Tour in the Interface of SSMS: All right, friends. With that, we are done with the third step. We have data now. We have databases in order to start selecting and querying the data. We have the application, we have the data.
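The restore steps from the previous lesson can also be scripted instead of clicked through. This is only a rough T-SQL sketch, not what the video does: the folder in the path is a hypothetical placeholder (use the Backup folder from the course links), and the file name `AdventureWorks2022.bak` is assumed from the database name that shows up after the restore.

```sql
-- Sketch only: the folder below is a hypothetical placeholder,
-- replace it with the Backup folder of your own SQL Server installation.
USE master;
GO

-- Optional: list the data and log files stored inside the backup.
RESTORE FILELISTONLY
FROM DISK = N'C:\path\to\MSSQL\Backup\AdventureWorks2022.bak';
GO

-- Restore the backup as a database (file name assumed, see above).
RESTORE DATABASE AdventureWorks2022
FROM DISK = N'C:\path\to\MSSQL\Backup\AdventureWorks2022.bak'
WITH RECOVERY;
GO
```

`GO` is a batch separator understood by SSMS, not a T-SQL statement. If the restore complains about file locations, the Restore Database dialog shown in the video is the safer route, or the statement may need a `WITH MOVE` clause that points the files listed by `RESTORE FILELISTONLY` to folders that exist on your machine.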
Now what are we going to do? I'm going to take you on a very quick tour of the interface of the client, SSMS. Let's go. Now in order to see and check the data, it's like a hierarchy. If you go to SalesDB, let's go inside it, and now we can find a lot of stuff like tables, views, and so on. The main one is going to be the tables. Let's go inside Tables. And here, you can find our tables, the customers, employees, orders, and so on. Now in order to see the data, go, for example, to the orders and right-click on it. Here we have different options. What we're going to do is choose Select Top 1000 Rows. Let's click on that. Great, finally we can see some data. As you can see, we have the query editor over here. You're going to write your queries over here, your SELECT statements, and then we have the result grid here. What we're going to do is write the query over here; for example, let me just remove a few things. And then once we are done with the query, we have to execute it. In order to do that, we can go over here and click Execute, very simple. As you can see, SQL executed the query, and we get the new result here in the result grid. Let's say that you need to write another query in a new tab. What you're going to do is go over here to New Query. And with that, we're going to get a clean new window in order to write our SQL. One more thing that is very important to understand, especially if you have multiple databases in SQL Server, is that you select the correct database for your query. For example, over here, if we go, you can see that we are currently selecting the SalesDB database. Now, anything that I'm querying should be a table inside this database. So customers, let's execute. Now we are selecting a table that is inside SalesDB. Now if you want to select a table which comes from another database, make sure to switch the databases.
Let's go over here and switch it to, for example, AdventureWorks. Now, if I go and execute this, it will say that it can't find the table in this database. So if you are confused and say, I can see the sales customers table over here, and I'm still getting the error from SQL that it's not finding it, it's because you have selected the wrong database. Now, what happens if you want to work with multiple databases in the same query? You can put the database at the start of the name. So you can say SalesDB.Sales.Customers. That means we have a hierarchy: first the database, then the schema, then the table name. Now if I go and execute this, even though a different database is selected, SQL understands that this table comes from another database, and we get the results. That means in one query, you can query multiple tables from multiple databases. Either you switch the database from the drop-down here, or you can use a statement. I can say USE SalesDB, and with that I'm telling SQL to use this database instead of the other one. As you can see, SQL goes and switches it. Now since I am inside the database, it makes no sense to tell SQL about the database again. I just go and remove it, and it still works. All right, so with that we have prepared your environment, and you have everything to start doing amazing work in SQL. Now I would say, just go and explore the other databases, and do some random selects in order to understand what content we have inside those databases. And if you would like to see the data model of AdventureWorks, I have provided it as a link as well. Over here, if you go to the data warehouse data model, you can see all the tables that are available, and we have a lot of tables. Since it's a data warehouse, you have dimensions and facts. I have it for the OLTP database as well. If you click over here, you will find a huge operational database with a lot of stuff. Here the areas are marked as sales, persons, products, purchases, and so on.
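The two ways of pointing a query at the right database, as described in the SSMS tour, can be sketched in T-SQL like this. The exact spellings `SalesDB`, `Sales`, and `Customers` are assumptions based on how the names are spoken in the lesson, so adjust them to what your Object Explorer shows.

```sql
-- Option 1: fully qualify the table as database.schema.table.
-- This works no matter which database is currently selected.
SELECT *
FROM SalesDB.Sales.Customers;

-- Option 2: switch the session to the database first...
USE SalesDB;

-- ...after which schema.table is enough.
SELECT *
FROM Sales.Customers;
```

The fully qualified form is the one to reach for when a single query has to read tables from more than one database.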
All right, friends. So with that, we have prepared your PC with everything that you need in order to start practicing SQL. We have SQL Server, the client SSMS, and the data, the three databases. Now you are ready to practice advanced topics in SQL with me. In the next chapter, we're going to deep dive into the world of window functions in SQL. They are the most important group of functions that you need for data analysis, so I really want you to focus on this. You will end up using these functions in real projects, I promise you. 214. Advanced SQL | What are Window Functions: Window functions, or as we sometimes call them, analytical functions, are very important functions in SQL. Everyone must know them, especially if you are doing data analysis. Each time I write a SQL script in order to do data analytics, I end up using them. As usual, we're going to first understand the concept behind them, and then we're going to start practicing. Let's go. Okay, guys. Let's start with the first question: what are SQL window functions? They are functions that allow you to do calculations like aggregations on top of a subset of data without losing the level of detail of the rows. It is something very similar to GROUP BY, but here we have a special case: you don't lose the level of detail. Now in order to understand the definition, let's have a very simple example. Okay. First let's understand how SQL works with the GROUP BY clause. We have four orders, two orders for caps and two orders for gloves. Let's say that I would like to see the total sales for each product. If we decide to use GROUP BY, what is SQL going to do? It's going to take the first two orders for the caps and put them in one row. In the output, we're going to have only one row for the caps, with the total sales of 40. And the same thing happens for the gloves.
It's going to take the two rows of the gloves from the input, and in the output, we're going to have only one row for the gloves. That means the number of rows depends on the number of products we have in our data. We have two products, so we get two rows. SQL is really smashing or squeezing the results in the output. And this is exactly what GROUP BY does to our data: it aggregates the rows into a different level of detail. On the left side we see four rows, on the right side we have two rows, and with that we are losing some details in the results, but we have still solved the task. So now let's see what happens if you use a window function in SQL. Okay, so now we have the same data and the same task: we have to find the total sales for each product. If you use a window function, SQL is going to do the following: it's going to process each row individually. It starts with the first row, the order ID one. In the output, we get the same row, the order ID one, but we also get the total sales for the caps. Here the total sales is ten plus 30, so we get 40. Then it jumps to the second row and processes it as well. In the output, we get the order ID two, the product caps, and the same aggregation, since we are talking about the same product: we get 40. Then it goes to the third order, and here we have the gloves. In the output, again, we have the order ID three, the product gloves, and the total sales this time is five plus 20, so we get 25. Then it goes to the last row, the order ID number four, and in the output we get four, gloves, and again 25. Now we can notice that if you use the window function, you do not lose the level of detail of your data. So we are doing something called row-level calculations.
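The caps-and-gloves walkthrough can be reproduced with a small sketch. The temporary table `#Orders` and its column names are hypothetical; only the four sales values (10, 30, 5, 20) come from the example. The `PARTITION BY` keyword used in the second query is explained later in the lesson.

```sql
-- Hypothetical temp table holding the four sample orders.
DROP TABLE IF EXISTS #Orders;
CREATE TABLE #Orders (OrderID INT, Product VARCHAR(20), Sales INT);

INSERT INTO #Orders (OrderID, Product, Sales) VALUES
    (1, 'Caps',   10),
    (2, 'Caps',   30),
    (3, 'Gloves',  5),
    (4, 'Gloves', 20);

-- GROUP BY: four input rows are squeezed into two output rows
-- (Caps = 40, Gloves = 25); OrderID is no longer available.
SELECT Product, SUM(Sales) AS TotalSales
FROM #Orders
GROUP BY Product;

-- Window function: all four rows survive, and each row carries
-- the total of its own product (40, 40, 25, 25).
SELECT OrderID,
       Product,
       SUM(Sales) OVER (PARTITION BY Product) AS TotalSales
FROM #Orders;

DROP TABLE #Orders;
```

Running both queries side by side makes the change in granularity visible: the first result is at product level, the second stays at order level.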
So if we have four orders in the input data, we get four orders in the output, and we still get our aggregations correctly. Now if you compare both methods side by side, we can see that we are solving the same task: we are finding the total sales for each product. But with GROUP BY, we are smashing and squeezing the results from four rows into two rows, one row for each product. That means with GROUP BY, the granularity is changing. In the input, the order ID controls the level of detail, but in the output of GROUP BY, the product controls the level of detail. So we have a different granularity. On the other hand, with window functions we are still able to do aggregations, but we are not losing the level of detail: the granularity of the output stays the same as the input. This is exactly the main difference between GROUP BY and window functions. If you just want to do simple aggregations, then go with GROUP BY. But if you care about the level of detail and you need to add more details to your results, then go with the window function, where you can do aggregations plus keep more details. Now, if you compare the functions available for the window and for GROUP BY, we can find that both of them have exactly the same functions for aggregations: COUNT, SUM, AVG, MIN, MAX. Here comes another difference between the window and GROUP BY. GROUP BY has only the aggregate functions. That's it. But with window functions, we have way more functions to use for analytics. For example, we have the ranking functions, and we have another group of functions for values, which we call value or analytical functions. That means with SQL window functions, we can cover a lot of analytical use cases and advanced, complex stuff. But with GROUP BY, we have only the aggregate functions, for simple use cases.
This is another difference between GROUP BY and the window. Use GROUP BY if you have simple aggregations. Window functions we can use for more advanced data analysis, where we can cover a lot of use cases. All right, now we're going to work through a few tasks in order to understand one thing: why do we need SQL window functions? Why is GROUP BY not enough in some scenarios, so that we have to use SQL window functions? Let's go. All right, so let's start with a very simple task. It says: find the total sales across all orders. So we need one value with the total sales. Let's see how we can do that. First, make sure that you are using the right database, so USE SalesDB in case you have closed the client, so that we don't get any errors. Now we're going to start with the first thing: we're going to select the sales. You're going to find it in the table Sales.Orders. So let's just query the data. As you can see, we have ten orders with ten sales values. We didn't aggregate anything yet, so we have the raw data. Now in order to solve the task, we're going to use the function SUM. So SUM of sales, and we're going to give it a new name, total sales. We don't have to use GROUP BY because we don't have to group anything. That's it. Let's go and execute that. As you can see, SQL returns one value, 380. This is the total sales that we have in our data, and this is the highest level of aggregation. So with that, we have solved the task: we have the total sales across all orders, and we didn't have to group anything. Let's move to the next example. In the next task, we want to find the total sales for each product, not for all orders together. This time we don't need only one value; we need one value for each product. In order to do that, we're going to use the GROUP BY clause, and we're going to group by the product ID.
GROUP BY needs the dimension in the selection as well. We can do it like this. That's it. Let's go and execute the query. Now, as you can see in the results, we don't have one value, so we are not at the highest level of aggregation. This time we are drilling down to the next level of detail, which is the product ID. We have one row for each product. For the first product we have 140, for the next one 105, and so on. As you can see, we are now splitting the data at the level of the product ID. We went from ten orders to four rows in the results, and that's because we have four products. So the number of rows in the output is defined by the dimension, the product ID. And with that, we have solved the task: we have the total sales for each product. All right, guys. So let's keep progressing with our examples. The next one is going to be a little bit more advanced, with the same aggregation: find the total sales for each product, and additionally provide details such as the order ID and the order date. As you can see, we have already solved the first part. We are finding the total sales for each product. Now we just have to add some additional information like the order ID and the order date. Let's go over here and just add them to our select: the order ID, and then the order date. Let's go and execute that. I'm just going to make it a little bit bigger. Let's go. But now, as you can see, SQL is not happy and throws an error saying that the columns you are adding to your select are not included in the GROUP BY. In the GROUP BY, we have only one dimension, or one field, the product ID. But in our selection we have three dimensions: the order ID, the order date, and the product ID. So the select and the GROUP BY don't match, and SQL will not allow it. Now you might say: you know what, let's add everything to the GROUP BY. With that, we're going to get our aggregation, and we're going to get our details as well. Let's try that.
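The GROUP BY steps of this walkthrough can be sketched as follows. This assumes the course database has been restored, and the names `SalesDB`, `Sales.Orders`, `OrderID`, `OrderDate`, `ProductID`, and `Sales` are assumed spellings of what is spoken in the lesson.

```sql
USE SalesDB;

-- Task 1: one value for the whole table (the lesson reports 380).
SELECT SUM(Sales) AS TotalSales
FROM Sales.Orders;

-- Task 2: one row per product; the grouped column must also be selected
-- if it should appear in the output.
SELECT ProductID,
       SUM(Sales) AS TotalSales
FROM Sales.Orders
GROUP BY ProductID;

-- Task 3, first attempt: detail columns that are not part of GROUP BY.
-- SQL Server rejects this query because OrderID and OrderDate are neither
-- aggregated nor grouped, so it is left commented out here.
-- SELECT OrderID, OrderDate, ProductID, SUM(Sales) AS TotalSales
-- FROM Sales.Orders
-- GROUP BY ProductID;
```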
I'm just going to zoom out a little bit. Instead of having only the product ID, let's add everything: the order ID, the order date, and the product ID. Now the select and the GROUP BY match, and SQL should not throw any error. Let's go and execute it. Now let's check whether we have solved the task. The task has two parts, right? We have to do the aggregation and provide details. You can see we have solved the second part: we have the details, the order ID and the order date. But now the first part, finding the total sales for each product, is destroyed. If you check the results, the product ID 101 has total sales of ten in one row, but in the third order we have 20 for the same product. So actually, the data is not aggregated by product. That's because we are aggregating at a different level, and we have included way more columns than we need for the aggregation. We are aggregating at the order ID level. So as you can see, we are now hitting the limits of GROUP BY. We cannot provide aggregations and additional information from our data at the same time. You have to pick one. That's why we have to go to the second option, where we use window functions. To do that, I'm just going to get rid of the GROUP BY part and all the extra fields. Let's go back to the start. Now we have the SUM of sales, and if I execute this, I get one value, so we are at the highest level of aggregation. Now we need to use the window function. I'm just going to remove the name, and now we're going to tell SQL that this is a window function. Using OVER after the aggregation or function tells SQL that we are talking about a window function. Let's just execute it like this, and with that we get ten rows, because we have ten orders, and for each row we have exactly the same value. We have the total sales of all orders in each row. As you can see, SQL understands that this is a window function, so SQL should not group all the data into one row.
It should keep exactly the same number of rows as the input. With that, we have the window function, but we still have to split the data by product. Now we're going to use the keyword PARTITION BY. It's like GROUP BY, just with another wording. Then product ID, the same dimension. And we give it the name total sales by products. Let's go and execute this. As you can see in the output, we still have the same number of rows: we have ten orders, so we have ten rows. But the result did change, because now we are aggregating the data at the level of the product ID. In order to understand the results, we have to add more information to our select. Let's add the same dimension, the product ID. I'm just going to add it at the front over here. Let's execute, and as you can see, now it makes more sense. The rows of one product always have exactly the same total sales, and the same goes for the next product, and so on. Now here comes the magic of the window function: we can add more information to our select statement without getting any errors. We need additional information like the order ID. We can go over here and say order ID, order date; any type of column, you can add it to your select. Let's go and execute. You can see we now get the result, even though those three dimensions in the select are not part of the window aggregation. With that, we have solved the task. We have the additional information, the order ID and the order date, and we have the first part of the task as well, the total sales for each product. Each of those values is the total sales for one product. And this is exactly why we need window functions. In real projects, things get really complicated. You are doing different tasks in one query: you are doing aggregations, and you are doing some other stuff. So just focusing on the aggregations is not going to be enough. You always have to add additional information to your query.
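The window-function version of the same task can be sketched in two steps. As before, this assumes the course database is available, and the table and column spellings are assumptions based on the spoken names.

```sql
USE SalesDB;

-- Step 1: OVER() with nothing inside turns SUM into a window function.
-- Every one of the ten order rows is kept and carries the grand total.
SELECT OrderID,
       OrderDate,
       SUM(Sales) OVER () AS TotalSales
FROM Sales.Orders;

-- Step 2: PARTITION BY splits the rows into one window per product.
-- Detail columns can be selected freely; no GROUP BY is involved.
SELECT OrderID,
       OrderDate,
       ProductID,
       SUM(Sales) OVER (PARTITION BY ProductID) AS TotalSalesByProducts
FROM Sales.Orders;
```

Unlike the rejected GROUP BY attempt, nothing here forces the selected columns to match the partitioning columns.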
As you can see, we use GROUP BY to do simple analyses, but as things get complicated in the analytics, we use window functions in order to show the aggregations and add additional information as well. 215. Advanced SQL | Syntax of Window Functions: All right, so we're going to deep dive into the syntax of SQL window functions. We're going to cover everything, each part of the syntax, so that you understand how to use them. Let's go. All right. Let's start by understanding the basic components, the basic parts of the window syntax. Mainly, we have two parts. The first part is the window function, for example the average and so on. The second main part is the OVER clause. Inside the OVER clause, we have three different parts: the first is the partition clause, the second is the order clause, and the last is the frame clause. Those are all the components that you can use in a window function. So, two main parts, the window function and the OVER clause, and inside the OVER we have partitioning, ordering, and framing. Let's go into more detail. For example, we have the following window function. You can see we have a lot of stuff going on here. We're going to understand it step by step, component by component. Let's start from the left, with the first one. What do we have over here? We have a function, a window function. What is a window function? Here we have the average. It's like any other function in SQL. You can use it in order to do calculations on top of the window. The first thing to define in a window is the function. As we learned before, we have a long list of window functions available in SQL, and we group them into three groups.
In the first group, we have the aggregate functions: COUNT, AVG, MAX, MIN. All those functions are available for GROUP BY as well, and they are used for aggregations. In the second group, we have the ranking functions: ROW_NUMBER, RANK, NTILE, and so on. We can use this group in order to give a rank to our data. The last group we call value functions, or sometimes analytics functions. Here we have very important functions like LEAD, LAG, FIRST_VALUE, and LAST_VALUE, in order to access a specific value. Of course, we're going to learn all of them one by one: understanding the concepts, some examples, and when to use them for data analysis. Now let's keep moving and understand the other parts of the window syntax. Inside the function average, we have a field name or column name, sales. This is called a function expression. It's like a value, parameter, or argument that we pass to the function. Here we can use several different things, depending on the function, of course. It could be empty, like in the ranking functions: RANK doesn't allow an expression, so it should always be empty. Or we can use a column. In the example, we use the column name sales as an argument or expression for the average, so we are finding the average of sales. Or we could use a number: in NTILE, we are allowed to use only numbers. Or we could have multiple arguments. For example, in LEAD, we can have sales, then numbers, and so on. Things get complicated, but don't worry about it, I'm going to explain that. Or we can have a whole conditional logic. For example, here we have CASE WHEN and so on inside the SUM. The whole thing over here is called an expression for the SUM. As you can see, we can build a complex logic here, and the output of this logic is passed to the function SUM. That means as an expression for the function, we can use different things.
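The different kinds of function expressions can be put side by side in one query. This is only a sketch: the inline sample rows, the column names, and the specific arguments (the bucket count 2, the offset 2, the default 10, the threshold 50) are all hypothetical and chosen for illustration.

```sql
-- Hypothetical inline sample data, so the query runs without any table.
WITH Orders AS (
    SELECT *
    FROM (VALUES
        (1, 10),
        (2, 30),
        (3, 60),
        (4, 90)
    ) AS v (OrderID, Sales)
)
SELECT
    OrderID,
    Sales,
    -- Empty expression: ranking functions such as RANK take no argument.
    RANK()             OVER (ORDER BY Sales DESC) AS RankSales,
    -- A column as the expression.
    AVG(Sales)         OVER ()                    AS AvgSales,
    -- A number as the expression: NTILE needs the number of buckets.
    NTILE(2)           OVER (ORDER BY Sales DESC) AS Bucket,
    -- Multiple arguments: column, offset, and a default value.
    LEAD(Sales, 2, 10) OVER (ORDER BY OrderID)    AS SalesTwoRowsAhead,
    -- Conditional logic as the expression passed to SUM.
    SUM(CASE WHEN Sales > 50 THEN 1 ELSE 0 END) OVER () AS HighSalesOrders
FROM Orders;
```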
Of course, it depends on whether the function allows it or not. Now, let's have a quick overview in order to understand which data types are allowed in the expressions for those functions. Let's look at the aggregate functions. As you can see, the COUNT function accepts any data type. But the others, like SUM and AVG, allow only numeric data types; MIN and MAX are listed here with numeric values too, although in SQL Server they also accept dates and text. Now let's move to the rank functions. The expressions are pretty easy: they should be empty. These functions don't allow any argument inside. As you can see, all of them are empty, except for one that accepts a numeric value, which is NTILE. There you have to define a numeric value. Moving on to the last type, we have the value functions. They accept any data type inside the expressions. As you can see, each function has its own specifications, and you have to be careful about which data type you are using in the expressions. So now let's keep moving to the next one. We have a very important part in the window syntax. So far, what do we have? We have a function and we have an expression. It's the usual stuff; we have done that before using GROUP BY. Now we have to tell SQL that we are dealing with a window function, not a normal one. In order to do that, we have to specify the keyword OVER. The second main part of the syntax is the OVER clause, and we use it in order to define a window. Inside it, we can define multiple things like the PARTITION BY, the ORDER BY, and the frame. But all of those are optional; we can skip them and leave it empty. The main task of the OVER clause is, first, to tell SQL that we are dealing with a window function here, and second, you can use it in order to define a window of your data. Now we're going to cover everything inside the OVER clause, and we're going to start with the first one, the PARTITION BY. 216. Advanced SQL | Window Functions: PARTITION BY: All right. Now we're going to learn how to define a window inside the OVER clause.
The first part that we can define is the PARTITION BY. For example, here we have PARTITION BY category; we have to define the dimension. It's very similar to GROUP BY in wording. So the first part is the partition clause. What it does is divide the entire data set into groups, which you can call windows or partitions. Here we tell SQL how to divide our data, and we have two options. Let me just show you. The first option is not to use anything, so we leave it empty. You see OVER, and PARTITION BY is not used. What happens is that SQL uses the entire data in order to do the calculations. The whole data set counts as one window. We are telling SQL: don't divide anything, leave it as it is. The second option is to divide the data with PARTITION BY. If we define the window like this, PARTITION BY product for example, SQL divides the entire data into different windows, for example here two windows. This time, the calculation, the SUM of sales, is not applied to the entire data set. It is applied to the different windows individually. We find the sum of sales for window one separately from the total sales of window two. All right. So now we have this very simple example. We have three fields here: the month, the product, and the sales. It's really simple information. And we have the following SQL window function: we have SUM of sales, and inside the OVER clause we are not using anything, so we are not using PARTITION BY. How is SQL going to define the window now? SQL is going to say: I don't have to divide anything, the entire data set is one window. So SQL goes over here and says the whole thing is one window. There are no partitions. We have only one window, and the entire data is aggregated. This is what happens if you don't use PARTITION BY and you leave the OVER clause empty: the entire data is one window. All right. Now let's move to the next example.
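The two options, an empty `OVER()` and an `OVER` with `PARTITION BY`, can be compared on a small month/product/sales table. The temporary table `#MonthlySales`, its column names, the product names, and the sales values are illustrative assumptions, not the exact data shown on the slide.

```sql
-- Hypothetical sample data with three fields: month, product, sales.
DROP TABLE IF EXISTS #MonthlySales;
CREATE TABLE #MonthlySales (SalesMonth VARCHAR(10), Product VARCHAR(20), Sales INT);

INSERT INTO #MonthlySales (SalesMonth, Product, Sales) VALUES
    ('Jan', 'Caps',   20),
    ('Jan', 'Gloves', 10),
    ('Jan', 'Socks',  30),
    ('Feb', 'Caps',   70),
    ('Feb', 'Gloves', 40),
    ('Feb', 'Socks',   5);

SELECT
    SalesMonth,
    Product,
    Sales,
    -- Option 1: empty OVER(), the whole data set is one window (175 on every row).
    SUM(Sales) OVER ()                        AS TotalSales,
    -- Option 2: one window per month (60 for Jan rows, 115 for Feb rows).
    SUM(Sales) OVER (PARTITION BY SalesMonth) AS TotalSalesByMonth
FROM #MonthlySales;

DROP TABLE #MonthlySales;
```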
We don't want to have only one window. We would like to have multiple windows, so we have to divide the data by something. In the OVER clause, we're going to define the window as follows: PARTITION BY month. It's not empty anymore. We are now dividing the data by the field month. The values inside this column divide the data set. Here we have two months, January and February. So SQL divides the data into two sets. The first window is going to be this one, January. We have the first window, let me make it smaller, and the second window is going to be February. There are two windows inside our data, and the calculation happens on each window separately. So here, as you can see, we are using the month in order to divide our data set into two windows, one window for January and another window for February. Now let's have a quick overview of the options that we have with PARTITION BY. The first option, as we learned: we can just skip it. Without PARTITION BY, for example here, total sales across all rows, and we don't find anything inside the OVER. The second option: we can use one field, one column, for example PARTITION BY product. Here we are using one dimension, but we can also mix things. We can use multiple columns or multiple dimensions in the PARTITION BY, for example PARTITION BY product and order status. So with PARTITION BY, we can define a list of dimensions that are used in order to divide our data. In this example, we are saying: find the total sales for each combination of product and order status. Those are the different options for working with PARTITION BY. Now let's look at this overview again. For all those functions, the PARTITION BY is optional. If you don't use PARTITION BY in any of those functions, you will not get any errors. Now let's go back to SQL in order to start practicing with this clause. We have the following task.
Find the total sales across all orders, and provide additional information like the order ID and the order date. Let's solve it step by step. First, I would like to provide the details. I'm going to select the order ID and the order date from the table Sales.Orders. Next, we're going to work on the aggregation. We need to find the total sales across all orders. Again, since we have both details and aggregations here, we cannot use GROUP BY; we have to use a window function. So we're going to use the function SUM for sales, and now we have to tell SQL that we are working with a window function. That's why we're going to use the OVER clause. The next step is to think about defining the window. Let's check the task. It says total sales across all orders. That means we don't have to partition or divide the data set into chunks or partitions. We have to leave it as it is, so the whole data is going to be one window. That's why we don't use PARTITION BY inside the definition. We're going to leave it empty. Let's go now and give it a name: total sales. Let's go and execute this. Now in the results, as you can see, we have all the orders, all the details, and we have the total sales across all orders as well. With that, we have solved the task: we have the total sales and some details about the order. Now let's move to the next task. It's going to be very similar. It says: find the total sales for each product, and provide additional information like the order ID and the order date. But this time, we have to divide the entire data into windows, and that's going to be by product, since we are saying total sales for each product. So this time we have to divide the data. We're going to define the window like this: PARTITION BY, and we use the dimension product ID. Let's go and execute this.
Now you can see that in the total sales, we no longer have the total sales of the whole data; the values are divided. But in order to understand the results, let's include the product ID in the results. Product ID, and execute. Now by looking at the results, you can see that the data is divided into four windows. Let's look at them. The division is by the product ID, so this dimension controls the partitions. The first window is the product ID 101, and we have the total sales for this product, 140. The next window is 102, the third one is 104, and the last window has only one row, the 105, with total sales of 60. With that, we have solved the task: we have the total sales for each product, and we have some details as well. Now I would like to show you how dynamic window functions are. We can add multiple aggregations at multiple levels. Let me show you what I mean. Let's stay with the same example, but we're going to find the total sales across all orders and also the total sales for each product. What we can do is use window functions at different levels, for example by removing the whole window definition here. So here we have the total sales for the entire data, for the first task, and the next one is going to be the total sales divided by the product ID. Let's rename it to sales by products. Let's go and execute this. Now you know what, I'm going to add the sales as well, just to explain the flexibility of the window function. Let's add the sales and execute it again. Now by looking at the results, you can see we have the sales information three times, but with different granularities. The first one is the sales itself, without any aggregation. It is the highest level of detail of the sales, and we have the sales for each order. The next one is the total sales with the window function. Here we have the highest level of aggregation: we have the total sales of all orders.
The last one, the total sales by product, is something in the middle. We are aggregating over a window, and the window is defined by the product ID. As you can see, we have different granularities of the aggregations, and this is exactly the flexibility that we have with the window function. We can do all of that in one query. Now let's keep moving and adding to our task. It says: find the total sales for each combination of product and order status. This time, we have to divide the data not only by the product but also by another dimension, the order status. Now let's see how we can do that. I'm just going to show the dimension order status in the results. And we're going to add the following: SUM of sales, then OVER since it's a window function, and let's now define the window. PARTITION BY, and we have the product ID again, but not only this dimension; we add the order status as well. Let's call it sales by products and status. Let me just rename those columns. Okay. Let's go and execute. All right. So now let's check the results. It is the last aggregation over here. As you can see, this aggregation has a different granularity than the previous one, and we have more details. This time we are splitting the data by two dimensions. The first window is the product ID together with the order status, and it's only those two rows. We have the product ID 101 and the order status delivered. The total sales for this is ten plus 20, so we have 30. The next window is the same product, but with a different status. It's the 101, shipped, and we sum up those two values and get 110. The next combination of product and order status is the 102, and we have only one row. 102 delivered, it's only one, so the total is the same as the row's value.
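The query built up across these tasks, with one detail column and three aggregation levels, might look like the sketch below. It assumes the course database is available; `OrderStatus` and the other identifiers are assumed spellings of the spoken names, and the aliases are illustrative.

```sql
USE SalesDB;

SELECT
    OrderID,
    OrderDate,
    ProductID,
    OrderStatus,
    -- Highest level of detail: the raw value of each order.
    Sales,
    -- Highest level of aggregation: the whole table is one window.
    SUM(Sales) OVER ()                                    AS TotalSales,
    -- In between: one window per product.
    SUM(Sales) OVER (PARTITION BY ProductID)              AS SalesByProducts,
    -- Finer again: one window per product and order status combination.
    SUM(Sales) OVER (PARTITION BY ProductID, OrderStatus) AS SalesByProductsAndStatus
FROM Sales.Orders;
```

Each `OVER` clause defines its own window, which is why several granularities can sit next to each other in a single result set.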
The next partition or window is going to be two rows, 102 with shipped. It's those two values, 60 plus 15, so we have 75. As you can see, the product ID and the order status together control how many windows we get. Here we get around six windows. With only the product ID, we got four windows, and without anything inside the OVER clause, we get only one window. This is how PARTITION BY works. 217. Advanced SQL | Window Functions: Order BY: All right. That was the first part of the window definition within the OVER clause. Let's move to the next part, the ORDER BY. For example, we can use ORDER BY order date. It's just a field. The order clause is very important in order to sort your data within a window. The ORDER BY is also required by many functions. Just check the overview over here: for the aggregate functions, it's optional, so you can leave it out or add it. But for the rank functions and for the value functions, it is a must. If you want to use those functions, you must use the order clause, because it makes no sense, for example, to rank the data without sorting it first. Okay, guys, now back to our very simple example, and we have the following query. The function this time is RANK, so we have to rank the data, and the definition of the window is PARTITION BY month. That means we divide the data by the month, which we have over here, and then the second part is ORDER BY sales descending. We have to sort each window in descending order. That means we start with the highest value and we end with the lowest value. Let's see how SQL executes this. First, PARTITION BY month. It divides the data into two partitions, because we have two values for the month. Let's see how this looks: one window for January and another window for February. All right.
Next, SQL goes to the second part and executes ORDER BY sales descending. What happens is that SQL goes through each window separately and sorts the data from the highest to the lowest without checking the other window. Of those three values, the highest one is this one, so it goes on top. Let me just sort it. This one is the lowest, and this one is in the middle. So SQL sorts this window separately from the next one, and once it's done, it goes to the second one. There the highest value is this one, and this one is the lowest. Let me just do it like this. So SQL sorts it like this: the highest one is 70, the next one is 40, and the last one is five. With that, SQL is done with the definition of the window. The data is split by month and each window is sorted by sales. The next step is that SQL ranks those values. It's really simple in the output. It ranks the data like this: the first value gets one, the next one gets two, and the third one gets three. As you can see, SQL ranks only this window, and then it repeats the same thing for the second window. So each rank is separate from the others. You can see it's very simple. This is how SQL executes PARTITION BY together with ORDER BY for the RANK function. Now, let's have a quick task for ORDER BY. It says: rank each order based on its sales from the highest to the lowest. We have to provide additional information like order ID and order date. Let's see how we can write the query. We have the basic stuff, order ID, order date, and sales, and now we can rank the data using a window function. We can use the function RANK. Then we tell SQL that this is a window function, and inside it we now have to provide the definition of the window. Checking the task, you can see that we don't have to divide the data, so we don't have to use PARTITION BY.
We just have to use RANK, and with RANK we have to use ORDER BY. It is a must. So we use ORDER BY, the field is sales, and the order is from the highest to the lowest. Let's call it rank sales, and let's execute this. As you can see, our result is sorted from the highest to the lowest, so you can see the sales of 90 at the top and the lowest is ten. And we have a rank as well: the top rank is one, and the lowest rank is ten. As you can see, we just quickly created a rank in SQL. It's very simple. The whole thing is one window since we are not using PARTITION BY. Of course, if you want ascending order, from the lowest to the highest, you can just remove the descending keyword, because the default is ascending. Let's execute the query. Now we can see the orders are sorted the other way around, so we start with the lowest and end with the highest. Of course, we get the same results if you go over here and add ascending explicitly. If you execute it, you see we get exactly the same results. This is how you use ORDER BY inside the window definition.
218. 3 5 window frame: Okay, guys. With that, you have covered the second part of the window definition. Now we're going to the last part, the most advanced part of the window, and we have the following: ROWS UNBOUNDED PRECEDING. We call this the frame clause or window frame. What we are doing here is defining a subset of rows within each window that is relevant for the calculation. I totally understand if this is confusing or complex at the start; it was for me as well. What we're going to do is dive deep into the concept in order to understand how it works, and we're going to do it step by step, so don't worry about it. All right. So now let's understand what is going on with the frame clause, starting from the basics.
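As a rough sketch of the two ranking queries from the ORDER BY lesson, the following Python script runs them against an in-memory SQLite database. The table layout, the sample values, and the names `order_id`, `order_date`, `sales`, `rank_sales`, and `rank_in_month` are assumptions made for illustration; only the February values 70, 40, and 5 come from the walkthrough, and SQLite is a stand-in for whatever database the course uses.

```python
import sqlite3

# Hypothetical sample data: the lesson's real table is not shown in the transcript.
orders = [
    (1, "2025-01-05", 30),
    (2, "2025-01-12", 10),
    (3, "2025-01-20", 50),
    (4, "2025-02-03", 70),
    (5, "2025-02-14", 5),
    (6, "2025-02-25", 40),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (order_id INTEGER, order_date TEXT, sales INTEGER)")
con.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)

# One window over the whole result: no PARTITION BY, only ORDER BY.
rank_all = con.execute("""
    SELECT order_id, order_date, sales,
           RANK() OVER (ORDER BY sales DESC) AS rank_sales
    FROM orders
    ORDER BY rank_sales
""").fetchall()

# One window per month: each month is sorted and ranked independently.
rank_by_month = con.execute("""
    SELECT strftime('%m', order_date) AS month, sales,
           RANK() OVER (PARTITION BY strftime('%m', order_date)
                        ORDER BY sales DESC) AS rank_in_month
    FROM orders
    ORDER BY month, rank_in_month
""").fetchall()

for row in rank_all:
    print(row)
for row in rank_by_month:
    print(row)
```

In the second query the rank restarts at one for each month, which is the per-window behaviour described for PARTITION BY combined with ORDER BY.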
Now, if you do aggregations and you don't use window functions, you consider the entire data, all rows inside the table. But what we can do is divide the data using PARTITION BY into windows. For example, here we have window one and window two. Now, if you do aggregations, all the rows in window one are aggregated, and then SQL goes to window two and aggregates all its rows. What we can also do in SQL is say: you know what, I don't want all rows inside the window, I want a subset of rows inside the window. What we are doing here is that we have those two windows, but we specify a scope, a subset of data, from each window to be involved in the aggregations. Of course, not only aggregations; we can do ranking and other things, so I mean calculations in general. So here it's like we have a window inside a window. We are defining a scope of rows: not all rows should be involved in the calculation, only a specific subset of the data. And we can do that using the frame clause. So again, you use PARTITION BY in order to divide the entire data set into multiple windows. And if you don't want to consider all the rows within each window in the calculation, and you want to focus on only a subset of data within each window, then you use the frame clause. All right. So now let's understand the syntax of the frame clause. Let's take the following example. The window function is the average of sales, and then we define the window. We have the first part, PARTITION BY category, then ORDER BY order date, and then we have the frame clause. It is the following: ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING. First there is the frame type, and we have two types, ROWS and RANGE. Then we have BETWEEN and the range of boundaries.
The first part of the range is the frame boundary with the lower value, and it accepts three kinds of keywords: CURRENT ROW, a number followed by PRECEDING, or UNBOUNDED PRECEDING. Then we have the other frame boundary, the higher value, and it accepts the following: CURRENT ROW, a number followed by FOLLOWING, or UNBOUNDED FOLLOWING. As you can see, we are defining a boundary, or a range, from a lower value to a higher value. Now we have some rules. We cannot use the frame clause without ORDER BY; ORDER BY must exist in the definition in order to use the frame clause. The second rule says the lower boundary must come before the higher boundary. So we always start with the lower boundary and end with the higher boundary. You cannot switch that. Okay, so now we have a very simple example. We have the month and the sales and the following query: SUM of sales. This is the window function, and the definition of the window is ORDER BY month. We are not using PARTITION BY, just to make our life easier. And the frame clause is defined like this: ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING. Now let's see how SQL executes this. The first definition is ORDER BY month, and as you can see, the months are sorted already. Now SQL works with the frame definition, current row and two following. SQL processes this row by row. It starts with the first row, which is our current row, as here in the query. So this is our current row, and we say the range extends two rows further, the two following rows. Those are February and March. That means the pointer for the two following is over here. With that, we have the frame boundaries, and SQL has the following scope for the first row. We have three rows, and the sum of those three rows is 60. We get 60 for the first row because the scope is not all rows, only this subset of data.
With that, SQL is done with the first row and jumps to the second row. The pointer for the current row is at February, and the two following ends at April. With that, as you can see, we are sliding down through the subset of data, or the window. We have a new scope, a new subset, and the sum of all those values is 45. That's it; I think you get it already. SQL goes to the next one: the pointer is on March, and the two following lands on June, and the frame slides like this. We have those three rows in the scope, and their sum is 105. Now, things get interesting for the next row. The pointer for the current row is at April, but the two following would land after the end of the table. So as we slide down, as you can see, the scope, the subset of the frame, is now only two rows, and the output is 75. And finally, at the last row, it is the current row and we have only one row in the subset, because the two following is outside of the table, and we get the row's own value as the sum. As you can see, it's very simple, right? We use the frame in order to scope which rows are involved in the calculation. What you have to do is define the boundaries of the frame, the lower and the upper boundary. Let's see what other options we have with frames. Here we have the same example, but we redefine the boundaries of the frame like this: ROWS BETWEEN CURRENT ROW, which is the first boundary, AND UNBOUNDED FOLLOWING. This means that we are always targeting the last record in the window or in the table. UNBOUNDED FOLLOWING is always static, and in this example it points to June. SQL goes row by row, and the current row starts at January and then moves to February.
I'm just going to take this example. The pointer is on February, and the subset, or the frame, is those four rows: February, March, April, June. So it's four rows, and their total is 115. Previously the frame was more flexible; it was two following. But this time we have UNBOUNDED FOLLOWING. That means the boundary is always the last record. As we move through the records, the frame gets smaller and smaller, and at the last one both boundaries are on the same record: the current record is also the unbounded following. Let's see the next one. The definition of the frame is the following: ROWS BETWEEN 1 PRECEDING AND CURRENT ROW. Here it's the other way around; one preceding is lower than the current row. Let's see how SQL executes this. Let's say that we are currently at March. This is the current row, and we are saying between one preceding, which means one row before the current row, and the current row. So the frame looks like this, and we have only two rows. The value is the sum of those two rows, which is 40. That means we are always targeting the rows before the current row. Okay, now let's keep going with the other options in order to understand everything about the frame. We redefine it like this: ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. UNBOUNDED PRECEDING is the first row in the table or in the window. So it's static like this; it's the first one, January. Let's say that we are at this current row in March. The subset looks like this: those three rows, and their total is 60. Now, as SQL proceeds to the next one, it keeps the first boundary fixed.
It always points to January, and the subset gets a little bit bigger until we reach the last row, and with that the subset is all the rows. With that, we get really great flexibility in how to define the subset and how the subset shifts through the window. Okay, now we are just having fun, playing around with the boundaries. We don't always have to use the current row as a boundary. We can use, for example, this definition: ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING. So the current row is not one of the boundaries at all. Let's say again our current row is March. One preceding is February and one following is April. With that, our frame is those three rows, let me mark it like this, and the aggregation is 45. So as you can see, the boundaries are one preceding and one following; a boundary does not always have to be the current row. All right, now I think you already get what the last option is. We take everything. The definition of the frame is ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. What do we have here? The unbounded preceding is January, and the unbounded following is June. Now the frame is everything, all the rows. And it doesn't matter where we are with the current row; it's always a fixed subset, always everything. So whether we are here, at February, or at March, we are considering all rows, and the total sales is 135. We get the exact same result for all rows. So with that, I think it's not that complicated, right? We just have to provide the boundaries, and the calculation depends on the frame, the subset of data. Okay, guys, now let's go back to SQL and start practicing in order to understand how frames work.
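The frame variants from this walkthrough can be replayed with a short Python script against an in-memory SQLite database. This is only a sketch: the sales values 20, 10, 30, 5, and 70 are reconstructed from the totals quoted in the walkthrough (40, 45, 105, 75, 115, 135), so treat them as an assumption, and the table name `monthly_sales`, the column `month_no`, and the helper `run_frame` are hypothetical.

```python
import sqlite3

# Assumed data: January to June without May, values reconstructed from the quoted sums.
monthly = [(1, "Jan", 20), (2, "Feb", 10), (3, "Mar", 30), (4, "Apr", 5), (6, "Jun", 70)]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE monthly_sales (month_no INTEGER, month TEXT, sales INTEGER)")
con.executemany("INSERT INTO monthly_sales VALUES (?, ?, ?)", monthly)

# The six frame definitions discussed in the lesson.
frames = {
    "current_to_2_following": "ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING",
    "current_to_end": "ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING",
    "1_preceding_to_current": "ROWS BETWEEN 1 PRECEDING AND CURRENT ROW",
    "start_to_current": "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW",
    "1_preceding_to_1_following": "ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING",
    "everything": "ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING",
}

def run_frame(frame):
    """Return SUM(sales) per row, in month order, for the given frame clause."""
    sql = f"""
        SELECT SUM(sales) OVER (ORDER BY month_no {frame})
        FROM monthly_sales
        ORDER BY month_no
    """
    return [row[0] for row in con.execute(sql)]

results = {name: run_frame(frame) for name, frame in frames.items()}

for name, sums in results.items():
    print(f"{name:28s} {sums}")
```

Each printed list has one value per month, so the third entry of every list is the value for March, the row used as the current row in most of the examples.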
So let's define a window like this: SUM of sales, and the window definition like this: we divide the data by order status, and let's say we sort it by order date. Let's define a frame like this: ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING. Let's give it a name, total sales, and execute it. Now let's look at the data. You see that SQL divides our results into two sections, two windows: Delivered and Shipped. You can see that the data is sorted by order date. As you can see over here, for example, for the status Delivered, we can see the first of January, 10, and so on. Then the third part: we have defined a frame in each window. For example, let's take the first row. This is the current row. We say the frame is between the current row and the two following orders. That means the scope looks like this: ten plus 20 plus 25, which is 55. What is interesting to check here as well is the last record of each window. Let's take this window over here; the last record is number seven, this order. Let's say this is the current record. We say the frame is between the current record and the two following. But since it is the last record of this window, SQL will not consider the next two orders, because those two orders are outside of the window. That's why we have 30 here, and SQL didn't sum all those values. We have 30 and there is nothing after it, so we get 30. As you can see, the frame is calculated within one window, so it will not consider anything outside of that window. This is how the frame works within partitions. Now, I would like to show you a few more things about frames. We can use shortcuts, but only with PRECEDING. For example, let's say I change the definition like this: 2 PRECEDING AND CURRENT ROW. Let's execute it, and we get those results.
Now if you want to check the results quickly, let's take, for example, this order over here. We are always summing the current order with the two previous orders. That means those three orders are involved in the frame, and the output is 55. Now there is a shortcut in SQL, but only for PRECEDING, where we can remove the range: we remove everything else and leave it like this: ROWS 2 PRECEDING. If you execute it, we get exactly the same results. This is a quick way, a shortcut, to define a frame, but it only works with PRECEDING. For example, if I go over here and say UNBOUNDED, it works, and we get the results between UNBOUNDED PRECEDING and the current row. But if you go over here and say, you know what, let's have UNBOUNDED FOLLOWING, SQL says there's an error, and the same thing happens if you remove the UNBOUNDED and say, for example, 1 FOLLOWING. SQL will not like it. You can use the shortcut only with PRECEDING. And one last thing about frames: there is a default frame. If you don't specify any frame and you use ORDER BY, SQL uses a default frame. If you check the result, you will notice that for this window over here, those values are not the total of the sales for the whole window. There is a frame, a hidden frame. The default frame in SQL is like this: ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. This is the default frame if you use ORDER BY. Now if you execute it, you will see that we get exactly the same results. Be careful: once you use ORDER BY with the aggregate functions, there is a hidden frame, a default frame like this, between UNBOUNDED PRECEDING and the current row. That means there are three ways to get the frame between UNBOUNDED PRECEDING and the current row: either write it out like this, or use the shortcut like this. Let me just execute it.
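To check the shortcut and the default frame side by side, here is a small Python sketch using an in-memory SQLite database. The table `monthly_sales`, its values, and the helper `running_sum` are assumptions for illustration. One caveat: SQLite documents its default frame as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW rather than ROWS; the two give the same result here only because the ORDER BY values are unique.

```python
import sqlite3

# Hypothetical monthly data with unique month numbers (no ties in ORDER BY).
monthly = [(1, 20), (2, 10), (3, 30), (4, 5), (6, 70)]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE monthly_sales (month_no INTEGER, sales INTEGER)")
con.executemany("INSERT INTO monthly_sales VALUES (?, ?)", monthly)

def running_sum(frame_clause):
    """SUM(sales) per row in month order; an empty string means no frame clause."""
    sql = f"""
        SELECT SUM(sales) OVER (ORDER BY month_no {frame_clause})
        FROM monthly_sales
        ORDER BY month_no
    """
    return [row[0] for row in con.execute(sql)]

# Three spellings of the same running total.
explicit = running_sum("ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW")
shortcut = running_sum("ROWS UNBOUNDED PRECEDING")
default = running_sum("")  # ORDER BY only, so the default frame applies

# The shortcut also works with a fixed number of preceding rows.
two_explicit = running_sum("ROWS BETWEEN 2 PRECEDING AND CURRENT ROW")
two_shortcut = running_sum("ROWS 2 PRECEDING")

print(explicit, shortcut, default)
print(two_explicit, two_shortcut)
```

With duplicate values in the ORDER BY column, a RANGE frame would include all tied rows, so writing the ROWS frame explicitly is the safer choice when a strict row-by-row running total is intended.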
So we get the same result. Or just remove it completely, and we get the same results as well. Again, the hidden frame, the default frame, only applies with ORDER BY. If you go here, for example, and remove the ORDER BY, let's see the results: the whole window is aggregated. Let me just select it, so you can see that SQL considers all the rows in the aggregation and we get the total sales for the whole window, so there is no frame defined. It is only present once you use ORDER BY. All right, friends. With the frame clause, we have now covered all the components of how to define a window inside an OVER clause, and with that we have covered everything about the syntax of window functions.
219. 3 6 window Rules: Okay, guys, now we're going to understand the rules, or let's say the limitations, of window functions. So let's learn what you are not allowed to do while using window functions. Okay, the first rule: you are allowed to use a window function only in the SELECT clause and in the ORDER BY clause. Here we have, again, the same example where we find the total sales by order status. As you can see, we used the window function in the SELECT clause, and we didn't get an error, right? Now we can use it in the ORDER BY as well. So let's say ORDER BY, and let's copy everything but the name into the ORDER BY. If I execute this, there are no errors and SQL allows it. As you can see, the result didn't change. Let's sort it, for example, descending. I'm going to write descending here, and let's execute. Now we have the total sales with the highest values first, then the lowest values. Having this rule, that we can use it only in SELECT and ORDER BY, means we cannot use window functions to filter data. Let me show you. For example, instead of ORDER BY, let's have a WHERE clause: WHERE total sales, let's say, bigger than 100.
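The first rule can be tried out with a minimal Python sketch on an in-memory SQLite database. The `orders` table and its values are hypothetical, and the exact error text depends on the database engine; SQLite is used here only as a stand-in.

```python
import sqlite3

# Hypothetical orders: three Delivered (total 55) and three Shipped (total 125).
orders = [
    (1, "Delivered", 10), (2, "Shipped", 15), (3, "Delivered", 20),
    (4, "Shipped", 60), (5, "Delivered", 25), (6, "Shipped", 50),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (order_id INTEGER, order_status TEXT, sales INTEGER)")
con.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)

# Allowed: the same window function in SELECT and again in ORDER BY.
allowed = con.execute("""
    SELECT order_id, order_status, sales,
           SUM(sales) OVER (PARTITION BY order_status) AS total_sales
    FROM orders
    ORDER BY SUM(sales) OVER (PARTITION BY order_status) DESC, order_id
""").fetchall()

# Not allowed: a window function used as a filter in WHERE.
try:
    con.execute("""
        SELECT order_id, sales
        FROM orders
        WHERE SUM(sales) OVER (PARTITION BY order_status) > 100
    """)
    where_error = None
except sqlite3.OperationalError as exc:
    where_error = str(exc)

for row in allowed:
    print(row)
print("WHERE attempt failed with:", where_error)
```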
Let's execute this. As you can see, SQL says no, you are not allowed to do that. You can do that only in SELECT and ORDER BY. So we are not allowed to use it for filtering data in the WHERE clause. You are also not allowed to use it in the GROUP BY. So if I add a GROUP BY and remove the condition over here, and execute it, you get the same error: you are not allowed to use a window function in the GROUP BY, only in the ORDER BY or in the SELECT clause. Okay, now to the second rule: you cannot use a window function inside another window function. That means you cannot nest window functions. Let me show you what I mean by that. Let's remove the GROUP BY. Now everything should be working. Let's copy the whole window function over here and nest it: instead of sales, we now have a window function inside another window function. As you can see, this is the inner window function, and the rest, the outside, is the outer window function. If I execute this, you'll see that SQL tells us you cannot use a window function in the context of another window function. So we cannot do nesting with window functions. As you can see, this is another limitation of those functions. All right. Moving on to the third rule, or let's say a piece of info: the window function is executed after the data is filtered by the WHERE clause. Let's have an example. Let's say that I would like to have the same information, the total sales for each status, but only for two products, 101 and 102. Let's do that. We use the WHERE clause, and then we say product ID IN, and we specify 101 and 102. Let's execute this. Now we can see we still have two partitions, one for Delivered and one for Shipped, but the total sales is reduced because we are focusing on only two products and we filtered the whole data set.
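The filter-then-window behaviour seen in this example can be sketched in Python with an in-memory SQLite database. The `orders` rows below are hypothetical values chosen for illustration, not the course data.

```python
import sqlite3

# Hypothetical orders with a product ID, an order status, and a sales amount.
orders = [
    (1, 101, "Delivered", 10), (2, 102, "Shipped", 15), (3, 101, "Delivered", 20),
    (4, 105, "Shipped", 60), (5, 104, "Delivered", 25), (6, 101, "Shipped", 50),
]

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE orders (order_id INTEGER, product_id INTEGER, "
    "order_status TEXT, sales INTEGER)"
)
con.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", orders)

query = """
    SELECT order_id, order_status,
           SUM(sales) OVER (PARTITION BY order_status) AS total_sales
    FROM orders
    {where}
    ORDER BY order_id
"""

# Without a filter, every row contributes to its status window.
unfiltered = con.execute(query.format(where="")).fetchall()

# With a filter, rows are removed first and the windows are built from what is left.
filtered = con.execute(
    query.format(where="WHERE product_id IN (101, 102)")
).fetchall()

print(unfiltered)
print(filtered)
```

The totals per status drop in the filtered result because the rows for products 104 and 105 never reach the window function.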
So this is how SQL works: first the WHERE clause is executed, and then the window function is calculated. That means first filtering and then aggregation. Okay, guys, now we move to the last rule, the most interesting one, and it says the following: you are allowed to use a window function together with the GROUP BY clause only if you use the same columns. Let me explain what I mean, but first, some coffee. Let's have the following task: rank the customers based on their total sales. Now, it sounds really easy, but if you check it, you have two calculations here. First, you have to rank the customers, and second, there is an aggregation: you have to find the total sales for each customer. I'm going to show you step by step how I usually solve such tasks. Let's look at the total sales. It is an aggregation, right, so we can use the SUM function, and this function is available both with GROUP BY and as a window function. For now, I'm going to go with GROUP BY, and that's because the task is very simple. We don't have to show any other details, right? It's all about aggregation, so why not use GROUP BY. Now to the first part, where we have to rank the customers. We cannot use the RANK function with GROUP BY, right? GROUP BY works only with aggregations. So here we are forced to use a window function. That means for the rank I'm going to use a window function, and for the total sales I'm going to use GROUP BY. Now let's do it step by step. First, we have to find the total sales for each customer using GROUP BY. It's very simple. I'm just going to remove all that stuff from our SELECT statement. We need the customer ID, and we don't need a window function over here. Then, after the FROM, we have a GROUP BY customer ID. So now I'm just grouping the customers and finding the sum of all sales. Let's execute this.
Now we see in the results that we have four customers, which is why we have four rows, and we have the total sales as well. So let's say half of the task is already solved, right? What is missing is the rank. So let's build that. In the second step, we use the RANK function, and we define a window for it with OVER. Inside it we do not partition the data at all, because it's already grouped. So what do we do? OVER, ORDER BY. The RANK function always needs an ORDER BY; don't worry about it, we can talk about it later. So now we are ranking the data based on the total sales, that is, the sum of sales. So let's just copy this and put it after the ORDER BY. Now we have to decide between ascending and descending. It's going to be descending, so the highest sales first and then the lowest sales. So now, as you can see, we have a rank of customers, and we have a window function together with GROUP BY. Now let's execute this and see whether SQL allows it. Let's run it, and as you can see, SQL runs it, and we get the rank for each customer. Customer three has the highest total sales, then customer number one, and the last one is customer number two with the lowest total sales. All right, we solved the task: we have now ranked the customers based on their total sales. So as you can see, SQL allows you to use a window function together with GROUP BY, but with one rule: anything that you use inside the window function should be part of the GROUP BY. For example, we fulfill the rule because we are using the sum of sales, and the sum of sales is part of the GROUP BY query. If I break the rule by not using the SUM, just using sales, SQL will not allow it, because sales is not part of the GROUP BY. As you can see, SQL is very strict about this.
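The two steps above (GROUP BY first, then the window function on top of the grouped result) can be sketched in Python with an in-memory SQLite database. The order values and the names `total_sales` and `rank_customers` are assumptions; the data is only chosen so that customer 3 comes first and customer 2 last, as in the walkthrough.

```python
import sqlite3

# Hypothetical orders for four customers.
orders = [
    (1, 1, 40), (2, 1, 50),   # customer 1: total 90
    (3, 2, 10), (4, 2, 20),   # customer 2: total 30
    (5, 3, 60), (6, 3, 65),   # customer 3: total 125
    (7, 4, 30), (8, 4, 25),   # customer 4: total 55
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, sales INTEGER)")
con.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)

# Step 1: aggregate per customer with GROUP BY.
totals = con.execute("""
    SELECT customer_id, SUM(sales) AS total_sales
    FROM orders
    GROUP BY customer_id
    ORDER BY customer_id
""").fetchall()

# Step 2: add RANK() on top of the grouped result. The window only refers to
# SUM(sales), which is part of the grouped query.
ranked = con.execute("""
    SELECT customer_id,
           SUM(sales) AS total_sales,
           RANK() OVER (ORDER BY SUM(sales) DESC) AS rank_customers
    FROM orders
    GROUP BY customer_id
    ORDER BY rank_customers
""").fetchall()

print(totals)
print(ranked)
```

Note that SQLite is more permissive than some other engines about ungrouped columns, so the failing variant described in the lesson (ordering the window by plain sales) is not reproduced here.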
If you want to do everything in one query without using subqueries and so on, you have to use exactly the same columns. For example, if instead of sales I use the customer ID here, SQL allows it, since the customer ID is part of the GROUP BY. So be careful when using a window function together with GROUP BY; as long as you are using the same columns, nothing goes wrong, and SQL allows it. Okay, now I'm just going to fix this, and let's run it. As you can see, it's really easy if you follow those steps. First, build the query using GROUP BY. Don't think about the window function yet; just build the GROUP BY. Then, in the next step, the last one, you define and build the window function. With that, you can solve really nice analytical use cases with one simple query; without having to build subqueries and so on, you can use GROUP BY together with window functions. All right, guys. Those are the four rules for SQL window functions.
220. 3 7 window summary: All right, friends. Now let's have a quick recap of SQL window functions. Let's start with the definition. We perform calculations, like aggregations, on top of a subset of data without losing the level of detail. That means we can do aggregations and, at the same time, we do not lose the details. Now, of course, there is a lot of similarity between window functions and GROUP BY. But the main difference is that window functions are very powerful and dynamic compared to GROUP BY. We have way more functions than with GROUP BY. If you are doing data analysis and you have an advanced use case, then you should use window functions; they are more suitable for complex and advanced data analysis. On the other hand, if you have a simple question, a simple data analysis, then you can use the aggregate functions with GROUP BY. Of course, you can use them in the same query, in the same SELECT.
You can mix GROUP BY with a window function under only one rule: you have to use the same columns. Of course, the first step is to do the GROUP BY, and then you add the window function in the same query. Now to the next point, the window components. We have two main components. The first one is the window function, and the second is the window definition using the OVER clause. Inside the OVER clause, we can define three things. If you want to divide the data to create windows, you use PARTITION BY. Second, we have ORDER BY in order to sort your data, and last, you can specify a subset of data, a frame, within each window. Now let's move to the last part: the rules for SQL window functions. The first is that if you have two or more window functions, you cannot nest them. You have to use multiple subqueries instead. The next point is that you can use a window function only in the SELECT and ORDER BY clauses. For example, you cannot use a window function in the WHERE clause to filter the data. Talking about filtering data, when does SQL execute the window function? It's always after SQL filters the data. All right. Those are the basics of SQL window functions. With that, you have covered the basics: what window functions are, why we need them, the syntax, and the main components. Moving on to the next topic, we're going to learn how to aggregate your data using the window aggregate functions. Here we have five functions, and we will go through the syntax, how they work, the use cases, and everything.
221. 4 1 win aggr what is: Hey, friends, we're now going to learn how to aggregate your data using five different window aggregate functions: COUNT, SUM, average, MIN, and MAX. As usual, first, we have to understand the concept behind them.
After that, we're going to talk about the syntax, and we're going to cover the most important use cases that I collected from my real-life projects. So first, let's understand why they are called aggregate functions. Let's go. Okay, guys. Let's say that in our data we have the following information: the months and the sales. Now, if you apply any aggregate function in SQL, SQL goes through all rows of the window, or of the entire data, and aggregates them. That means in the output SQL gives you one single aggregated value. SQL sums up all those values, and in the output you find, for example, that the total sales is 175, or you can use the average, or count the data, and so on. So an aggregate function delivers one aggregated value at the end for a window or for the entire data. Now, let's have a quick overview of the syntax of all aggregate functions. Most of them follow the same rules. First, as usual, we have to specify the function name, and in this example we have the average. Next, we have to specify the expression inside it. We cannot leave it empty. Here we are using sales, and the second rule, for all functions besides COUNT, is that the data type of this field should be a number. And this, of course, makes sense, right? We cannot find the average of the first names of customers or something like that. So we have to pass a number. Next we have to define the window. We have PARTITION BY, and it is optional, so you can use it or leave it out; it depends. Then we have ORDER BY, which is optional as well. It is not required, so you can use it or leave it out. That means the whole definition of the window can be empty for the aggregate functions. Let's have a look at all the functions: we have COUNT, SUM, average, MIN, and MAX.
As you can see, only COUNT accepts all data types as its expression or argument. All the others require a number as the data type. For all functions, PARTITION BY is optional, and the same goes for ORDER BY and the frame, so everything is optional here. What we're going to do now is dive deep into each of those functions in order to understand how they work and what the use cases are, and of course we're going to practice in SQL. We start with the first one, the function COUNT.
222. 4 2 win aggr count: Okay, so what is the COUNT function? It's really simple. It returns the number of rows within each window. It helps you understand how many rows you have within each subset of data. Now let's understand how SQL works with this function. All right, we have again this very simple example for the orders, with the following information: the products and the sales. Now we want to solve a very simple task: how many orders do we have for each product? To solve it, we can use the function COUNT as follows. We say COUNT, and then we pass it an argument or expression, the star. With that we are telling SQL to count how many rows we have in our table, but we have a window definition like this: OVER, PARTITION BY product. So what SQL does is divide the data set into two partitions. We have one partition for the caps and another one for the gloves. With that, SQL has prepared our data in windows, and we are ready to aggregate. How many rows do we have within each window? For this window it's three rows, and for the next window we have three rows as well, so we get three, three, and three. It's very simple, right, guys? We are just finding the number of rows within each window.
But now, with the aggregate functions, we have to be very careful with the null values. For COUNT star, as you can see over here, we are not specifying anything about the sales. We are just saying: find the number of rows. That means SQL will simply count a null as one row. So if we are using the star as the argument for the function COUNT, the nulls will not affect anything. Whether we have nulls or not, we are just counting how many rows we have inside our data. But in some scenarios, we should be ignoring the nulls in our count. For example, let's say that I would like to count how many sales we have for each product. That means if we have nulls, they should not be counted. In order to achieve this, instead of the star over here, we pass the field sales. With this, we are telling SQL: don't just blindly count how many rows we have within each window. Be careful with the values, and find how many sales we have within each window. Now let's see what happens. For the first window, we have three sales, so three values, and the number of rows is correct. But for the next one, how many sales do we have? We have two. We have the 30 and then the 70, but the last one is null, so it will not be counted. It will be ignored. That's why we get the value two in the output. We have two sales. You can see the result did change, and we are now sensitive to the null values. So be careful what you specify for the count. If you use a column name like this, it will ignore the nulls. But if you use the star, it just finds how many rows we have within each partition. If you compare the results side by side, you can see that if you specify a column within the COUNT function, it is sensitive to the nulls. It ignores them and will not use them within the aggregation.
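To make the side-by-side comparison concrete, here is a minimal sketch that puts COUNT(*) and COUNT(sales) in the same query. As before, the SQLite setup and the names `orders`, `product`, and `sales` are assumptions for illustration, not the course database itself.

```python
import sqlite3

# Assumption: in-memory SQLite stand-in with hypothetical table and column names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (product TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO orders (product, sales) VALUES (?, ?)",
    [("Caps", 20), ("Caps", 10), ("Caps", 5),
     ("Gloves", 30), ("Gloves", 70), ("Gloves", None)],
)

rows = conn.execute(
    """
    SELECT product,
           sales,
           COUNT(*)     OVER (PARTITION BY product) AS total_rows,   -- counts every row, nulls included
           COUNT(sales) OVER (PARTITION BY product) AS total_sales   -- skips rows where sales is NULL
    FROM orders
    ORDER BY product
    """
).fetchall()

# Collapse the repeated window values to one pair per product.
counts = {product: (total_rows, total_sales)
          for product, _, total_rows, total_sales in rows}
print(counts)
```

For the gloves, the two counts differ by exactly the number of null sales in that window.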
That's why COUNT of sales gives us only two for the gloves. But if you use the star within the COUNT function, what happens? SQL just counts everything. We find the number of rows that we have inside our table. And there is one more way to do the same thing as here on the left side. Instead of the star, you can use one. You might find that people are using COUNT one with the same window definition, and we get exactly the same results. The nulls are counted and not ignored. So now you might ask me, which one should I use, the one or the star? Well, I would say it doesn't matter. We are getting the same results. And if you are thinking about the performance, I hardly find any difference between them. You can try both of them and stick with the one that gives you better performance. Now, we have a special case for the COUNT function compared to all other aggregate functions. It allows any data type. That means we can use numbers, characters, dates, and so on. So we can specify something like the product for the count instead of the sales. We can go over here and say product, and it's going to count how many rows have a product value. It's going to be three over here. And since we don't have any nulls here, it counts all of them, so we have three rows. And be careful: we are not counting the unique values. We are just counting the rows that we have inside our data. So the caps will not be counted as one, and the gloves will not be counted as one either. We have the caps three times. That's why we get three. Okay. So now we have this very simple task: find the total number of orders. This is a very simple task in order to find how many rows, how many records, we have inside the table orders. So let's go and solve it. Let's start by just selecting star from the table orders, without anything else, like this.
As you can see, we have ten orders. It's very simple and very easy. But now, let's say that you have thousands or millions of rows. You cannot do it like this, by just checking the rows. What you do instead is use the function COUNT. So we can go over here and say COUNT star, and let's give it the name total orders. Let's go and execute it. As you can see, we got only one record, one value, and we don't see any other details. We got ten, so this is the total number of orders. This is very helpful in order to understand the content of your data. We call this overall analysis, or let's say, having the big numbers about your business. For example, how many orders do we have, how many customers, products, employees, and so on. Having those big numbers helps us track our business and understand how well we are doing with the orders, with the customers, and so on. This is the basics of reporting. Now, let's extend our task by saying: provide details such as the order ID and the order date. So let's go and do that: select order ID, order date. And now, of course, we cannot do it like this. Let me just execute it. We get an error, because we have different levels of detail in our select. In order to solve this, we use the OVER clause, and with that we are telling SQL that this is a window function. Now let's go and execute it. With that, we have solved the task. We have details: the order ID and the order date. This is the highest level of detail, since we have the order ID. And as well, we have the highest level of aggregation: the total number of orders in the entire table orders. Now let's keep going and add more to our task. Let's say that we want to find the total number of orders, but for each customer. That means this time, we have to divide our data by the customers.
So let's go and do that. We can use a window function as well, so COUNT star, OVER, and we have to divide the data using PARTITION BY. We're going to use the field customer ID. Let's call it orders by customers. And I would like to see the customer information in the query as well, so I'm going to add the customer ID. All right, that's all. Let's go and execute it. Now, as we learned before, SQL is first going to divide the data. We have four customers, so we get four windows. The first window is for the customer ID number one. As you can see, we have three rows, and that's why we have three orders here. The same thing for customer two, three orders, and customer three, three orders. Only for the last customer, the customer ID number four, do we have a single row, so one order. Now, if you look at the total orders and the orders by customers, you can see that we are not doing the overall analysis anymore. We are doing a comparison between different categories. And, of course, in this example, the category is the customers. With that, we can also understand the behavior of our customers. You can see that we have three customers that have exactly the same number of orders, so they are very similar. But we have one outlier, the customer ID number four. This customer has only one order, so this is the only customer that behaves differently from all the others. So you see, with a very simple query, we are now able to analyze our business and understand the behavior of our customers. If you divide the data with PARTITION BY and use COUNT, you can compare things with each other. All right. Let's keep moving. Next, we look at the special cases that we have with the function COUNT. We have this very simple task. It says: find the total number of customers, and additionally, provide all customer details.
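Before moving on to that task, here is a sketch of the two counts on the orders table from above: the overall total and the total per customer, side by side with the order details. The SQLite setup, the lower-case table and column names, and the sample dates are assumptions for illustration. The customer assignments are simply chosen so that three customers have three orders each and one customer has a single order, as in the walkthrough.

```python
import sqlite3

# Assumption: in-memory SQLite stand-in; names and sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, order_date TEXT, customer_id INTEGER)"
)
customer_ids = [2, 3, 1, 1, 3, 1, 2, 4, 2, 3]  # ten orders, four customers
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, f"2025-01-{i:02d}", c) for i, c in enumerate(customer_ids, start=1)],
)

rows = conn.execute(
    """
    SELECT order_id,
           order_date,
           customer_id,
           COUNT(*) OVER ()                         AS total_orders,       -- empty window: whole table
           COUNT(*) OVER (PARTITION BY customer_id) AS orders_by_customer  -- one window per customer
    FROM orders
    ORDER BY order_id
    """
).fetchall()

orders_by_customer = {customer_id: per_customer
                      for _, _, customer_id, _, per_customer in rows}
print(rows[0])
print(orders_by_customer)
```

The first count is the same on every row, while the second changes from customer to customer, which is what makes the comparison between categories possible.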
I think it's very easy to solve. What are we going to do? We select star, since we need all details, from Sales.Customers. Let's just have a look. We have five customers. The function is COUNT star, OVER, and we don't have to divide the data, since we have to find the total number of customers for the entire table. It's going to be total customers. So nothing new. That's it. We have five customers. Now, as we learned before, if you pass the star to the COUNT function, you are telling SQL to count how many rows we have inside the table customers. SQL just starts counting. It's going to say: we have five customers, five rows. It doesn't matter whether we have nulls inside our data, like in the last name or the score. It just counts the number of rows. Now, let's say that we have the following task: find the total number of scores for the customers. What we need for this task is to find out how many scores we have inside our data. As you can see, we have four scores, but the last customer doesn't have any score, so we have it as a null. The result should be four, so we cannot use the star for it, because we would get five. We have to count the scores. Let's see how we can do that. We use COUNT as well, but this time with the score, and the definition of the window is going to be empty. So total scores, and let's go and execute this. Now we can see in the results that we got four scores, which is correct, because SQL did ignore the null. SQL was focusing only on one column, and by focusing on those values, the nulls are not counted. This is really great in order to check the quality of your data. Let's say that you are not expecting any nulls inside your data. Instead of going manually through all the records, you can find the total number of customers like this.
And then you can count the total number of scores, and you can see there is a difference. So just by checking those two numbers, I can say: you know what, we have one null, without checking every record in our data. With that, we can check the quality of our data and understand very quickly how many nulls we have in the field score. You can do the same thing, for example, for the first name. Let me show it to you. I'm just going to copy this and say first name. Let's say country, actually. So I will go with the country, and call it total countries. Let's go and execute this. Now if you check the result, you can see we have five rows with a country. SQL focuses on the countries and does not find any nulls. So we have complete data here. We don't have any nulls, because the total number of customers is equal to the total number of values within the country. And I can immediately see that the data quality of the country is very good. All right. One more thing about the COUNT function that we have learned before: we can use either the star or one in order to count how many rows we have. Let's just try it. I'm going to duplicate it, and instead of the star, let's have one. I'll just give each a name: here it's going to be one, and here star. Let's go and execute it. If you check the output, we got exactly identical results. There is no difference between those two queries. It's up to you; you can try it and check the performance. I usually go with the star instead of one. Okay, now we're going to talk about a very important use case for the SQL window function COUNT that I frequently use in my real projects. The data that we use for data analysis usually has bad data quality. And if we don't find those data quality issues and clean the data before doing the analysis, we are going to deliver bad results and bad analyses, which can lead to bad decisions.
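One such quality check, the null check on the customers table described above, can be sketched as follows. This is only an illustration under assumptions: an in-memory SQLite database, hypothetical lower-case table and column names, and invented names, countries, and scores. The only property carried over from the walkthrough is that there are five customers, one missing score, and no missing country.

```python
import sqlite3

# Assumption: in-memory SQLite stand-in; all names and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers "
    "(customer_id INTEGER, last_name TEXT, country TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [(1, "Brown", "Germany", 350),
     (2, None, "USA", 900),
     (3, "Walters", "USA", 750),
     (4, "Bennett", "Germany", 500),
     (5, "Stone", "USA", None)],
)

rows = conn.execute(
    """
    SELECT customer_id,
           COUNT(*)       OVER () AS total_customers_star,  -- all rows
           COUNT(1)       OVER () AS total_customers_one,   -- same result as COUNT(*)
           COUNT(score)   OVER () AS total_scores,          -- ignores NULL scores
           COUNT(country) OVER () AS total_countries        -- ignores NULL countries
    FROM customers
    ORDER BY customer_id
    """
).fetchall()

_, total_star, total_one, total_scores, total_countries = rows[0]
print("rows:", total_star, "| scores:", total_scores, "| countries:", total_countries)
print("missing scores:", total_star - total_scores)
```

The difference between the row count and a column count is the number of nulls in that column, so no manual inspection of the records is needed.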
One very common data quality issue that you might encounter in your projects or in your data is having duplicates. Duplicates are really bad for data analysis. In order to discover, or let's say identify, the duplicates in our data, we can use the SQL window function COUNT. Let's have some examples. The task says: check whether the table orders contains any duplicate rows. How are we going to do that? By checking the table orders over here, we can see that there are many orders, but how do we find the duplicates? Well, the first step is to understand what the primary key of the table orders is. What we usually do is check the data model, if there is one. For example, for this course, we have the following data model, and we can see that the order ID is defined as the primary key for the orders, and the product ID is the primary key for the products. That means for our table orders, we have the order ID as the primary key, and it should be unique. It should not contain any duplicates. Now let's go to our data and check the order ID. By just looking at the data, you can see that we don't have any duplicates; all of them are unique. We have one, two, three, four, and so on. But of course, in real projects, you cannot do it like this. You have to build a query in order to find out whether the primary key is unique. Now you might say that primary keys are usually unique, because we can define that in the DDL, in the rules for building the table. Well, that's true. If you have it like this, then you don't have to look for duplicates. But usually in data analysis, we export a lot of files and a lot of data into an extra database, and we don't build such rules. In order to check the quality of the primary keys that you get from the source, we can use the COUNT function. So let's go and build it.
I'm just going to select the order ID first as a detail, and now we're going to do the following: COUNT, then star, and let's define the window. It's going to be PARTITION BY, and the field here is the primary key, so the order ID. I'm now checking the quality of this field. It should not contain any duplicates. And we're going to give it the name check primary key. My expectation is that the result of this should be at most one. That means we have one row for each primary key, and that means as well that it is unique. If you get anything more than one, then we have duplicates. Let's go and run the query. As you can see in the results, we get one for each primary key. That's great. It means we don't have any duplicates inside our data and the primary key is unique. So the table orders is clean, and we don't have any duplicates inside it. Now, let's check our database. We have another table here called orders archive. Let's go and check that table. First, I'm just going to select the data: SELECT star FROM Sales.OrdersArchive. Let's check the results. Here we can see that we have exactly the same structure as the table orders. Now let's check whether the data quality is good. What are we going to do? We use exactly the same query as before, but instead of the table orders, we take the orders archive. That's it. Let's go and execute it. By checking the data, you can see that we don't have one everywhere. Sometimes we have two rows for the same primary key, which is really bad. For the order ID four, we have two orders with the same order ID. As well, for the order ID six, we have three orders. That means those rows are duplicates, and they go against our data model. Now, what else can we do with that? We can generate a list specifically for the data quality issue, with only the rows where we have duplicates.
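A sketch of both steps, the per-key count and the narrowed-down list of offending keys, could look like this. The SQLite setup, the table name `orders_archive`, the column names, and the sales values are assumptions for illustration. The sample keys are chosen so that order ID 4 appears twice and order ID 6 three times, as in the walkthrough.

```python
import sqlite3

# Assumption: in-memory SQLite stand-in; names and sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_archive (order_id INTEGER, sales INTEGER)")
conn.executemany(
    "INSERT INTO orders_archive VALUES (?, ?)",
    [(1, 10), (2, 15), (3, 20), (4, 60), (4, 60),
     (5, 25), (6, 50), (6, 50), (6, 50), (7, 30)],
)

# Step 1: count the rows per primary key value. Anything above 1 is a duplicate.
check = conn.execute(
    """
    SELECT order_id,
           COUNT(*) OVER (PARTITION BY order_id) AS check_pk
    FROM orders_archive
    ORDER BY order_id
    """
).fetchall()

# Step 2: a window function cannot be filtered in the same WHERE clause,
# so the check is wrapped in a subquery and filtered from the outside.
duplicates = conn.execute(
    """
    SELECT *
    FROM (
        SELECT order_id,
               COUNT(*) OVER (PARTITION BY order_id) AS check_pk
        FROM orders_archive
    ) AS t
    WHERE check_pk > 1
    ORDER BY order_id
    """
).fetchall()

print(sorted(set(duplicates)))
```

The filtered result still contains one row per duplicated record, which is handy when the goal is to inspect or clean those records rather than only list the keys.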
Anything that has one, we are not interested in. In order to do that, we're going to use a subquery. Let's say SELECT star FROM, and then we use the first query as a subquery. And in our filter we say WHERE the check primary key is higher than one. That means I want only the order IDs where we have duplicates. Let's go and execute this. Now we have a list of the primary keys where we have duplicates. We have the order ID four, and as well the order ID six. Guys, as you can see, the window function COUNT is wonderful for finding data quality issues like duplicates. All right, guys. Those are the four most important use cases of the SQL window function COUNT. First, we can use it to do overall analysis. Second, we can use it to do category analysis, like the analysis we did on the customer behavior. Third, we can use it to check for nulls inside our data. And the last use case: we can use it to identify or discover the data quality issue of duplicates in our data. Now let's go and check the next function. We have the SUM.
223. 4 3 win aggr sum: All right. So now let's understand what the SUM function is. It's very simple. It returns the sum of all values within each window. Now let's go and understand how SQL works with this function. All right, this is very easy, and we are using the same simple example. Now we would like to find the total sales for each product. We can define it like this: SUM of sales, since we are finding the total sales, and then we define the window like this: OVER, PARTITION BY product. As we learned, SQL is first going to divide our data into two windows, one window for the caps and another window for the gloves, right? After SQL has defined the windows, it starts aggregating the data, so the sum of sales.
That means for the first window, we have the three sales, and it simply adds up all those values. We are adding 20 plus 10 plus 5, and we get the result 35. In the output, we get 35 on every row. So that's it for the first window. And as you can see, SQL aggregates the data within each window separately. While we are aggregating the data for the caps, SQL does not check anything for the gloves, so they are completely separated. Now it goes to the next window, and here we have two values and a null. Again, the null is just ignored. So we have 30 plus 70, and the total sales for that window is 100. As you can see, it is very simple, right? 100, 100, 100. And guys, that's it. It's really simple. We don't have a lot of special cases here like with the COUNT function. It's only that it ignores the nulls in the calculation, and as well, the requirement is that it allows only integers, or let's say numbers. We cannot say SUM of the products, since the products are not numbers, they are characters. So you can only use numbers with the SUM function. Let's now have some tasks and some use cases in order to practice in SQL. Find the total sales across all orders. As well, find the total sales for each product. Additionally, we have to provide some details like the order ID and the order date. Let's go and do that: select order ID, order date. Let's get the sales as well, and now we have to find the total sales across all orders. That means we can use the window function SUM of sales, and the definition of the window is going to be empty, since we don't have to divide the data. That's it: total sales. And we have to select from the table Sales.Orders. So that's it, let's go and execute it. With that, as you can see, we got all the details that we need, and as well the total sales, the sum of all those sales in one field.
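The arithmetic from the caps and gloves walkthrough (20 + 10 + 5 = 35, and 30 + 70 = 100 with the null ignored) can be checked with a small sketch. The SQLite setup and the names `orders`, `product`, and `sales` are assumptions for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stand-in with hypothetical table and column names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (product TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO orders (product, sales) VALUES (?, ?)",
    [("Caps", 20), ("Caps", 10), ("Caps", 5),
     ("Gloves", 30), ("Gloves", 70), ("Gloves", None)],
)

rows = conn.execute(
    """
    SELECT product,
           sales,
           SUM(sales) OVER ()                     AS total_sales,       -- empty window: all rows
           SUM(sales) OVER (PARTITION BY product) AS sales_by_product   -- NULLs are skipped
    FROM orders
    ORDER BY product
    """
).fetchall()

sales_by_product = {product: per_product for product, _, _, per_product in rows}
print(sales_by_product)
print("overall:", rows[0][2])
```

Each window is aggregated on its own, so the caps total is unaffected by the gloves rows and the other way around.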
With that, we have our overall analysis, one big number for our reporting. We know how many sales we made in the entire business. Now let's go to the next task. It says: total sales for each product. I think you already know what we're going to do. SUM of sales, OVER, and we do it like this: PARTITION BY product ID. That's it. We're going to call it sales by products. With that, we are dividing the data by the product. Let's go and execute it. As you can see, we don't have the product information, so let's add the product ID to the query, just in order to analyze the results. We can see from the data that the winner is the product ID 101. As you can see, it has the highest sales if you compare it with the other products, and the lowest one is the product ID 105. So as you can see, we can use the window function SUM together with PARTITION BY in order to compare things, to do a comparison between the products in order to understand, for example, the performance of the products. It's a really great analysis for the performance. All right, now we're going to move to a very interesting use case for the aggregate functions, not only for the SUM but for the others as well. It is the comparison analysis. Okay, so let's quickly understand what the comparison use case is. It compares the current value with an aggregated value. For example, let's say we are currently in the month of March, and the sales are 30. We compare this value, the current sales, with an aggregated value, for example the total sales using the SUM function. If you compare the current value with the total sales, you are doing an analysis called part-to-whole analysis, which helps us understand how important the sales in this month were compared to the total sales. Or we can compare it to the best month, to the highest value.
For example, the highest value is in June, and we can compare this month with the best month of the year, or with the lowest month of the year. Or we can compare the sales of the current month with the average, in order to understand whether we are above the typical sales or below the average. This is a very important analysis in order to study and understand the performance of the current data. Let's have an example in order to understand the use case. Find the percentage contribution of each product's sales to the total sales. Let's go and solve it step by step. We select the order ID, and as well the product ID and the sales, just like this, from Sales.Orders. Let's go and execute it. As you can see in the results, we got the first part of the equation. We have the sales, so nothing crazy over here. Now, we need the total sales of all data. What are we going to do? We take the SUM of sales, and the window definition is going to be empty. This is the total sales. Let's go and execute it. Now we have everything for the equation. We have the sales and as well the total sales, and that is enough in order to find the percentage contribution. The calculation is very simple: we divide the sales by the total sales. Let's go and do that. It's going to be the sales divided by the total sales, so we copy the whole window function over here, and then we multiply it by 100. That's it. Let's go and execute it. Now you notice that in the output we got zero. This is because of the data type. If we go to our table over here on the left side, you can see that the sales column has the data type integer. If you divide integers, you will not get a float or decimal number, so you have to change the data type.
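Here is a minimal sketch of the part-to-whole calculation, showing both the integer-division problem and one way around it. It is an illustration under assumptions: SQLite is used as a stand-in (it also truncates integer division, and uses the type name REAL where other databases say FLOAT), the table and column names are hypothetical, and the ten sales values are sample data chosen to add up to a total that matches the averages quoted in the walkthrough.

```python
import sqlite3

# Assumption: in-memory SQLite stand-in; names and sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, product_id INTEGER, sales INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 101, 10), (2, 102, 15), (3, 101, 20), (4, 105, 60), (5, 104, 25),
     (6, 104, 50), (7, 102, 30), (8, 101, 90), (9, 101, 20), (10, 102, 60)],
)

rows = conn.execute(
    """
    SELECT order_id,
           product_id,
           sales,
           SUM(sales) OVER () AS total_sales,
           -- integer / integer is truncated, so every share collapses to 0
           sales / SUM(sales) OVER () * 100 AS pct_truncated,
           -- casting one operand to a floating-point type keeps the decimals
           ROUND(CAST(sales AS REAL) / SUM(sales) OVER () * 100, 2) AS pct_of_total
    FROM orders
    ORDER BY order_id
    """
).fetchall()

top_order = max(rows, key=lambda r: r[5])
for row in rows:
    print(row)
print("highest contributor:", top_order[0], top_order[5])
```

Casting only one side of the division is enough, because the other operand is then promoted before the division happens.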
So what we're going to do now is change the data type for one of them. It's enough to do it for the sales over here. We use the following statement: CAST sales AS FLOAT. That's it. I'm just converting the integer to a float. Let me just give it a name: percentage of total. That's it. Let's go and execute. In the output, you can see that we now got the percentage of the total, or let's say the percentage contribution. Next, we're going to round those numbers, because we have a lot of decimals. In order to do that, we use the ROUND function like this, with two decimals, and let's go and execute it. As you can see, it is really easier to read, because we have only two decimals, and we can see immediately that order eight is the highest contributor to the total. This is what we call part-to-whole analysis, where we find the percentage of the total. It is a very common analysis in order to understand the performance of each order compared to the total. This is an example of how the window function helps us compare the current value with an aggregated value. All right. That's all for the window function SUM. Next, we're going to talk about the average function.
224. 4 4 win aggr avg: All right. So now let's understand what the average function is. As the name says, it finds the average of the values within each window. Now let's go and understand how SQL works with the average. All right. Back to our very simple example, and the task says: find the average sales for each product. It's really easy. We use the average, pass it the column sales, and define the window like this: PARTITION BY product. The first thing SQL does is define the windows, so it divides our data into two partitions, one for the caps and one for the gloves.
Now, I hope that everyone knows how to calculate the average. As you know, it adds up all the values and divides the sum by the number of rows. So it adds 20 plus 10 plus 5 and divides by three rows, and the output is going to be 11 (35 divided by 3, shown here as a whole number). We get it for each row. As you can see, SQL just ignored everything in the next window. We are focusing only on the caps. Now it goes to the second window and starts doing the same aggregation. But here we have the special case of the null. The null is ignored in the calculation, so it looks like this: 30 plus 70, and we are including just two rows, so it is divided by two, and the average is 50. We get the result 50 for each row, and we are completely ignoring the null. But now you might be in a scenario where your users understand the business like this: if we find a null in the sales, it means zero. There are no sales, so it is actually a zero, but we store it in the database as a null. That means the average that you have provided is not really correct. We have to divide by three. So first we have to handle the nulls before doing the aggregation, before finding the average. Now, we're going to have a whole chapter on how to handle nulls in SQL and what the different functions are. But for now, we're going to go with the function COALESCE. What we're going to do is not use the sales as it is; first, we handle the nulls. That means we use COALESCE on the sales and replace the nulls with zeros. As you can see, we are not using the sales directly. We handle it first, and then we find the average. SQL goes over here, and if it finds any null, it replaces it with zero, and that then has an effect on our average. It's going to be 30 plus 70, but now plus zero.
Now we have three rows. Instead of dividing by two, it divides by three, and the result is 33. That means we get 33 in the output for each row. And with that, we are now fulfilling the expectation of the business: if you have a null, it is handled as zero, and the result is more accurate. You see, right? It is very tricky. If you are doing analysis and aggregations, be very careful with the nulls. Understand them, understand what they mean for the business, and handle them correctly in order to get correct results in your analysis. Now, let's go back and practice SQL using some tasks and use cases. Okay, let's start with the basics. We have the following task: find the average sales across all orders. As well, find the average sales for each product, and don't forget the details. Let's go and solve it step by step: select order ID, order date. Let's get the sales as well. Now let's find the average sales. It's going to be a window function with the sales inside it, the usual stuff, and the window is going to be empty. We're going to call it average sales. The table is Sales.Orders. So that's it, let's go and execute it. Oh, we have to select everything, of course. What SQL did for the output was add up all those values and then divide by ten. With that, we have the average sales of 38. Very easy. This is, again, what we call an overall analysis. Let's move to the next one: find the average sales for each product. Again, we build the window function like this: average of sales, and we divide it by product ID, and we call it average sales by products. And we add the product ID to the query. That's it, let's go and execute. We missed something here: the PARTITION BY. Let's execute again. With that, we have the following data. SQL is going to divide the data.
For example, for this product, we have those four orders. What happens is that SQL adds up the four values and then divides by four. That's why we have 35 here. The same thing for the next orders: it divides by three. For the last one, it just divides by one. That's why we have 60. As you can see, the aggregation is done separately for each window, and this is a very nice way to compare the averages between the different products. Now let's have an example in order to learn how to deal with the nulls. Let's say that we have the following task: find the average score of the customers, and show additional information as well, like the customer ID and the last name. Let's go and solve this. We are now targeting the table customers. Let's just select it first, like this. Now let's include the customer ID and the last name. Let's have the score as well. This time, we're going to find the average score. So it's the average of the score, and since we don't partition the data, we leave the window definition empty like this and call it average score. That's it, let's go and execute it. As you can see, we have an average score of 625. SQL adds up the four values and divides by four. But here we have a null. Now we have to understand the business, or ask about it: what does the null mean in the scores of the customers? Is it zero, or is it something empty? If it's zero, then the average that we have is wrong, because it should be divided by five and not four. Let's say it's zero. That means we have to handle the nulls. What we're going to do now is use the function COALESCE on the score and replace the null with zero. We call it customer score. Let's go and execute this. You can see that if there is a value, it stays exactly the same value, and only if you have a null is it replaced with zero. Now let's go and correct the average.
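A sketch of the corrected average might look like the following. It assumes, as the walkthrough does at this point, that a null score means zero for the business. The SQLite setup, the table and column names, the last names, and the four score values are hypothetical; the scores are only chosen so that the plain average comes out at 625.

```python
import sqlite3

# Assumption: in-memory SQLite stand-in; names and sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, last_name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Brown", 350), (2, None, 900), (3, "Walters", 750),
     (4, "Bennett", 500), (5, "Stone", None)],
)

rows = conn.execute(
    """
    SELECT customer_id,
           last_name,
           score,
           AVG(score) OVER ()              AS avg_score,              -- NULL skipped: divides by 4
           COALESCE(score, 0)              AS customer_score,         -- NULL replaced with 0
           AVG(COALESCE(score, 0)) OVER () AS avg_score_without_null  -- divides by 5
    FROM customers
    ORDER BY customer_id
    """
).fetchall()

avg_score, avg_score_without_null = rows[0][3], rows[0][5]
print(avg_score, avg_score_without_null)
```

Which of the two averages is the right one depends entirely on what a missing score means in the business, so the query alone cannot decide it.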
I'm just going to do it like this. Let's copy the whole thing, but now, instead of using the score, we use the score where the nulls are handled. I'm just going to replace it like this, and name it without nulls. Let's go and execute it. As you can see, we are getting a more valid result in the output compared to the previous one, and this holds only for the case where the null means zero. Guys, as you see, be very careful with the nulls, especially if you are doing aggregations, and handle them correctly before doing any aggregation like the average. Moving on to the next use case, we have the comparison analysis, and the task says: find all orders where the sales are higher than the average sales across all orders. That means we have to compare the current sales with an aggregated value, this time the average of the sales. Now let's do it step by step. What are we going to do? We select, of course, the order ID. What else do we need? Let's take the product ID, and we need the current sales. It's going to be the sales as it is. That's it for now, from Sales.Orders. Let's go and execute it. By checking the result, you can see that we got the first part of the equation, right? We have the sales for each order. Now we need the second part, the average sales across all orders. In order to do that, we use the window function average of sales, and we use OVER. Since it is across all orders, the window is going to be empty. Let's give it the name average sales. Let's go ahead and execute it. Now in the output we got the average sales, which is 38. Now we need all the orders that are higher than the average. As you can see, for example, order one is not higher, but order four is higher than the average. In order to filter the data, we cannot use the window function in the WHERE clause. What we're going to do, sadly, is use a subquery.
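A sketch of that subquery approach is shown below. The setup is assumed for illustration: an in-memory SQLite database, hypothetical lower-case table and column names, and ten sample orders whose average works out to 38, the figure quoted in the walkthrough.

```python
import sqlite3

# Assumption: in-memory SQLite stand-in; names and sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, product_id INTEGER, sales INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 101, 10), (2, 102, 15), (3, 101, 20), (4, 105, 60), (5, 104, 25),
     (6, 104, 50), (7, 102, 30), (8, 101, 90), (9, 101, 20), (10, 102, 60)],
)

# The inner query attaches the overall average to every order.
# The outer query can then filter on it like on any ordinary column.
above_average = conn.execute(
    """
    SELECT *
    FROM (
        SELECT order_id,
               product_id,
               sales,
               AVG(sales) OVER () AS avg_sales
        FROM orders
    ) AS t
    WHERE sales > avg_sales
    ORDER BY order_id
    """
).fetchall()

for row in above_average:
    print(row)
```

The extra nesting is needed because the WHERE clause is evaluated before window functions are computed, so the window result only becomes filterable one level up.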
It's going to be like this: SELECT star FROM our first query, and then we define the condition outside the subquery. It's going to be WHERE the sales is higher than the average sales. That's it, let's go and execute it. Now as you can see, it's very simple. We got all the orders that are higher than the average; you can see all those sales are higher than the average. It would be nice if we could do all of this in the first query, but since we cannot, we need to use a subquery in order to filter the data afterward. With that we can understand the importance of the comparison analysis. For example, here we are evaluating whether the data points are above or below the average, and this is very important in business analysis. All right, everyone, that's all for the window function average. Next, we're going to talk about two very interesting functions, MIN and MAX.
225. 4 5 win aggr min max: All right, guys. So what are the MIN and MAX functions? They are very simple, yet very powerful functions for analytics. MIN is simply the function that returns the minimum, or let's say the lowest value within a window, whereas MAX is exactly the opposite: it finds the maximum, or the highest value within a window. Now let's go and understand how SQL works with these functions.
All right. So now we have the same data, and we have two tasks. First, we have to find the lowest sales for each product. Second, side by side, we would like to find the highest sales for each product. So we're going to use MIN and MAX, and as you can see the syntax is very simple: MIN of the sales, then the partition is by the product, and the same stuff again but with MAX. Okay, so now let's see how SQL is going to execute the first query. As usual, first it prepares the data, so it splits the data into two windows, one for the caps and another one for the gloves.
After that, it searches for the lowest sales within each window separately. For the first window we have the values 20, 10, and 5, and of course the lowest value is the 5. So SQL finds it over here, and everywhere in this window the output is going to be the value 5. We have it as the lowest sales for the product caps. Now we jump to the next window, for the gloves, and start searching the values. As you can see, we have 30, 70, and null. Null will be ignored, so it will not be considered as the lowest value. SQL finds the lowest sales to be the 30, which is actually the first row within this window, and the output value is going to be 30 for each row. So that's it, it's very simple, right?
Now let's move to the next one. We have the same stuff, but using MAX. The data is partitioned, and for the first partition, what is the highest value? It's the first row, the 20. SQL finds it, and in the output we get the highest sales, 20, for this window. Then it goes to the second window and searches for the highest value. Here we have two values, 30 and 70, and it's going to be the 70, right? So SQL points to it over here, and in the output we get 70 everywhere. So, guys, it's really simple, right?
Now let's go back to our scenario from the average, where in our business we understand nulls as zero in the sales. That means we first have to handle the nulls and replace them with zero, and then search for the value. So what's going to happen? We replace the nulls with zero. For the MAX, nothing changes: the highest value is still 70, and we get the same output. But for the MIN, we now have a new lowest value. It's no longer the 30, it's actually the zero. So SQL goes over here and replaces the 30 with zero in the output: zero is now the lowest sales for the product gloves.
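The walkthrough above can be reproduced with a small sketch. It uses the caps and gloves values from the example (20, 10, 5 and 30, 70, NULL); the table name `product_sales` and the column aliases are assumptions, and COALESCE stands in for whichever null-handling function your database offers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the values from the walkthrough.
conn.execute("CREATE TABLE product_sales (product TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO product_sales VALUES (?, ?)",
    [("Caps", 20), ("Caps", 10), ("Caps", 5),
     ("Gloves", 30), ("Gloves", 70), ("Gloves", None)],
)

query = """
SELECT
    product,
    sales,
    MIN(sales) OVER (PARTITION BY product)              AS lowest_sales,   -- NULL ignored
    MAX(sales) OVER (PARTITION BY product)              AS highest_sales,
    MIN(COALESCE(sales, 0)) OVER (PARTITION BY product) AS lowest_sales_null_as_zero
FROM product_sales
ORDER BY product, sales
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)

# One (product, lowest, highest, lowest-with-null-as-zero) tuple per window.
per_window = {(r[0], r[2], r[3], r[4]) for r in rows}
```

For gloves, the plain MIN reports 30 while the null-as-zero version reports 0, which is the difference the lesson warns about; MAX is unaffected either way.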
Again, guys, the nulls are very tricky, and these functions are really sensitive to nulls. Understand what the nulls mean and handle them correctly so that you get correct results in the output. With that said, let's go back to SQL to have some tasks and use cases for practice.
All right, everyone, let's start with the basic stuff. Find the highest and lowest sales of all orders, find the highest and lowest sales for each product, and provide additional information as well. So let's go and solve it: select the order ID and the order date, and let's take the product ID as well. Now let's find the highest sales of all orders. It's going to be the MAX function on the sales, and the window is going to be empty, since it's the sales of all orders. Let's name it highest sales. Now let's go for the lowest sales of all orders. It's going to be exactly the opposite: the MIN function on the sales, OVER, and then we have the lowest sales. I'm just going to make it a bit bigger. Then the table, Sales.Orders. I think that's it. Actually, let's have the sales as well. All right, so now let's go and execute it.
Now this is very simple, right? These are all the sales. What is the highest sales? We have the 90 of order eight. As you can see, we now have the highest sales, the 90, and the lowest sales is the 10 of the first order. It's very easy. Now we're going to repeat the same stuff for the products, so we have to partition the data by the product ID. I'm just going to copy and paste stuff around. The first one is going to be PARTITION BY the product ID, so highest sales by product. And the next one is the same stuff, copy and paste, by the product. So that's it, let's go and execute it. So again, the data is partitioned and divided by the product. For the first window, what is the highest sales? It's the 90, and the lowest sales is the 10. So it's exactly like the overall values, right?
Now let's go to the second window over here. We can see that the highest sales is the 60, the first one, and the lowest this time is 15. This is great for seeing that SQL executes each of those functions for each window separately. So let's go to the last window. The sales is 60 and we have only one row, so it's going to be the highest and the lowest sales as well. With that, as you can see, we can define a range for each product, and the ranges differ from one product to another. For example, for the product 101 the range is from 10 until 90, but for the second product it's from 15 to 60.
Okay, guys, let's move to the next one, which is one of my favorites in the window functions, where we filter the data using the MIN and MAX functions. Let's have the following task. It says: show the employees who have the highest salaries. This sounds very simple, but we can use the help of window functions in order to solve it. We are now working with the table employees, so let's just select the data: SELECT star FROM Sales.Employees. That's it, let's go and execute it. Now we have five employees with their different salaries. Let's go and find the highest salary: MAX of the salary. Let's use the window function with OVER, but we don't partition the data at all, so it's going to be like this, and we name it highest salary. Let's go and execute it.
Now by checking the results, we got a new column called highest salary, and inside it we have the 90k. If you check those five salaries, you can see that the highest is from the employee Michael. But the task is still not solved: we have to show only the employees who have the highest salaries. We somehow have to filter the data and show only this employee. In order to do that, we have to use a subquery, since we cannot use the window function in the WHERE clause. What we're going to do is SELECT star FROM, and then our first query becomes the inner query. Then we have the following condition.
The salary should be equal to the highest salary. So it's very simple: with that we are comparing the salaries with the highest salary, and if there is a match, the data is presented. So let's go and execute that. And that's it: as you can see, we got the employee with the highest salary. If there are multiple employees with the same salary of 90k, of course we're going to get all of them in the results. I think Michael is going to need a new job, right? So this is another use case for the window functions MIN and MAX.
All right. So now we come to the use case of the comparison analysis, where we want to compare the current sales with the highest and the lowest value. We have the following task. It says: find the deviation of each sale from the minimum and the maximum sales amount. As you can see, this is our sales, this is the highest, and this is the lowest. So now we just have to subtract the values from each other in order to get the deviation. It's very simple. Let's get the first deviation, where we take the sales and subtract the lowest value. So it's going to be like this. What we are doing over here is subtracting the lowest sales of all records from the sales, and we're going to call it deviation from min. Let's go and execute it.
Now we can see from those values how far the current value is from the extreme, where the extreme here is the lowest value. This is a really great way to analyze the extremes in your data. As we get nearer to the extreme, the value gets lower. As you can see, here we have a zero. This is the lowest, because it is exactly the extreme itself, the 10. The next one, 15, is a little bit away from the extreme, so we have a 5 here. This is not far away from our extreme value. And then if you check this value over here, we have 80.
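A minimal sketch of the deviation calculation, assuming a cut-down `orders` table with four made-up rows whose lowest and highest sales match the values mentioned in the lesson (10 and 90). It also includes the mirror-image deviation from the maximum, computed the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical subset of orders; lowest sale is 10, highest is 90.
conn.execute("CREATE TABLE orders (order_id INTEGER, sales INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10), (2, 15), (3, 20), (8, 90)],
)

query = """
SELECT
    order_id,
    sales,
    sales - MIN(sales) OVER () AS deviation_from_min,  -- 0 means "this row is the minimum"
    MAX(sales) OVER () - sales AS deviation_from_max   -- 0 means "this row is the maximum"
FROM orders
ORDER BY order_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Both columns are distances to an extreme: small numbers sit close to it, large numbers sit far away.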
The distance is very far away from our extreme value, the lowest sales. This is a really nice way to analyze and evaluate the sales in your data. Now, of course, we can evaluate our data against the other extreme, which is the highest sales. In order to do that, we take the highest sales and subtract the sales from it, and name it deviation from max. Let's go and execute it. Now we can see in the output that we get exactly the opposite distances. Order number one is the farthest from the extreme: as you can see, we have the value of 80. And order eight is the extreme itself, so that's why we have the distance of zero. Now we can also see very quickly which data points are the nearest to the extreme, the highest sales. As you can see, guys, using the window functions MIN and MAX is very powerful for understanding and evaluating your data points against the extremes.
226. 4 6 win aggr rolling running: All right, everyone. So now we can focus on a very important use case. One of the must-know use cases for data aggregation is doing the running total and the rolling total. These two concepts are very important for data analysis and reporting, so you must know them. The key use case for those two concepts is tracking. For example, we can track the current total sales against the target sales in our business, and it's also great for doing historical analysis of trends.
Okay, now the question is: what are the running and rolling totals? They are basically very similar. They both aggregate a sequence of members, and the aggregation gets updated each time we add a new member to the sequence. A sequence could be, for example, a time sequence. That's why we call this type of analysis analysis over time. So now we still have the question: what is the difference between the running and the rolling totals?
The running total aggregates everything from the beginning until the current data point, without dropping off any old data. The rolling total, on the other hand, focuses on a specific time window, like the last 30 days or the last two months. Each time we add a new member, a new data point, to the window, we drop off the oldest data point in the window. With this we get the effect of a rolling, or let's say shifting, window. Okay, I totally understand if this sounds complicated. Let's have a very simple example in order to understand this concept, and also how we can solve it using SQL.
All right, guys. So now we have a very simple example. We have the months and the sales, and we have them twice because I want to show you side by side how SQL works with the running total and the rolling total. So what is the task? On the left side, we want to find the running total of sales for each month. On the right side, we would like to find the three-month rolling total of sales for each month. They sound very similar, but on the right side we have a fixed window.
Now, how can we solve this using SQL? On the left side we can use SUM of sales, since we want to aggregate all the sales using the SUM function, and the definition of the window is going to be ORDER BY month. Of course, you can use any aggregate here. If you use an average with ORDER BY, you get the running average, and likewise the running max, the running count, and so on. That means whenever you mix an aggregate function together with an ORDER BY, you generate the effect of a running total. Now, on the right side, we can have the same stuff: an aggregate function together with ORDER BY, so SUM of sales, ORDER BY month. So far we have everything like the left side, right? But now you might ask: why does SQL generate this effect, the running total?
We didn't specify any crazy stuff, right? It's all about the definition of the frame clause. Do you remember, if you use an ORDER BY and you don't specify a frame clause, you get a hidden, or let's say default, frame clause, and it looks like this: ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. And what was the definition of the running total? It aggregates all the data from the very beginning, which is the unbounded preceding, until the current position, the current row, without dropping off any old members. So the definition of the running total is exactly the definition of the default frame clause. That's why SQL generates the effect of the running total.
Now let's go to the right side, the rolling total. Here again we have the same stuff, right? We aggregate the data using the SUM function, and we sort the data with ORDER BY month. With that we are again generating the effect of a running total, as happens each time you use ORDER BY with an aggregate. But in the rolling total we always want to specify a frame, in this example three months. That means when we get a new month, we don't want to keep including the oldest months; we always want a fixed window. In order to have this fixed-window effect, we have to redefine the frame clause, because if you leave it as the default, like in the running total, the frame is going to keep extending. You will see this effect in the example. So we define it like this: ROWS BETWEEN 2 PRECEDING AND CURRENT ROW. The total number of rows included in each window is going to be a maximum of three months.
Now I know you might say, Baraa, what are you talking about? If you didn't get anything, that's totally normal. You will understand it only with an example. So let's start with the left side.
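Here is a small sketch that puts the two window definitions side by side. It uses only the first four months from the example (sales of 20, 10, 30, and 5); the table name `monthly_sales` and the numeric `month_no` column are assumptions made so the rows sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; month_no 1..4 stands for January..April.
conn.execute("CREATE TABLE monthly_sales (month_no INTEGER, sales INTEGER)")
conn.executemany(
    "INSERT INTO monthly_sales VALUES (?, ?)",
    [(1, 20), (2, 10), (3, 30), (4, 5)],
)

query = """
SELECT
    month_no,
    sales,
    -- Running total: ORDER BY with no explicit frame, so the frame keeps growing.
    SUM(sales) OVER (ORDER BY month_no) AS running_total,
    -- Rolling total: explicit frame of at most three rows (two before + current).
    SUM(sales) OVER (
        ORDER BY month_no
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS rolling_total
FROM monthly_sales
ORDER BY month_no
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The two columns agree for the first three months and split in April, when January falls out of the rolling frame. One caveat worth knowing: SQLite and several other engines document the implicit frame as RANGE rather than ROWS; the results are the same here because the month numbers are unique, but with ties in the ORDER BY column the two can differ.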
First, SQL sorts the data, so everything is sorted from the earliest month to the latest one, from January until July. Everything is good. Now SQL starts working with the frame. The frame says unbounded preceding, so this boundary is static: it always points to January. This is the unbounded preceding, the first row in the dataset. And of course, we go from top to bottom, so the current row points to January as well. So the frame looks like this: it's only one row, and the total sales of this row is 20. That's why we have 20 in the output. Now let's move to the right side. The current row is January, and what is the two preceding? We don't have it yet, so it's pointing somewhere before the table. So again, what is the frame? It's one row as well, so in the output we get exactly the same result, 20. So far, there are no differences between the running total and the rolling total.
But let's keep going. Now we go to the next row over here. What happens to our frame? It extends, right? So we now have two months in this frame. And what is the total sales over here? It's 30, because we added a new member. You can calculate it in two ways: either sum all the sales within the frame, or say this is the previous aggregated value plus the new member. The previous one is 20, the new member is 10, so we get 30. Both of them are correct. Now let's move to the right side. What happens? We are at February as well. The two preceding is still pointing somewhere outside, and here the window extends like this, so we have two months. The same aggregation happens, so we have 30. So far, nothing crazy, right? Let's go to the next month, March. The frame is extended, so we now have three months.
The aggregation is either the sum of everything here, 60, or 30 plus 30, so we get the running total of 60. Now on the right side, what happens? The current row points to March as well, and this time the two preceding points to January. This is the first time we are getting the whole fixed frame, right? We have three months in this frame. So what is the total? It's 60. Okay, so now you say, we're still getting the same result, so there's no difference. I'm going to say, wait for it. It's coming in the next one.
As we go to April, the effect on the left side is that the frame gets extended to four months, because we always start from the first month until the current month without dropping any member. So what is the total of this? It's 65. Now on the right side, what happens? We add a new member, April. But we are at the maximum size of the window: we can have only three rows, and that's because the two preceding shifts down as well. So the boundary is from February until April, and with that we are dropping off January. Now you can see the effect. The frame is sliding, rolling, or shifting from top to bottom, and that's because the boundaries are shifting as well. So you can see now the effect of the rolling total: the newest member is added, and the oldest member is out. We are allowed to have only three months. So what is the total of this? It's 45. This time we are not aggregating the previous value, the 60, together with the 5. We are aggregating only the values within the window.
So now let's keep going. Now we are at June. What happens on the left side? The frame gets bigger, and with that we get the result of 135. So the frame is getting really big. But on the right side we have a fixed frame, so we are just sliding, shifting, and rolling. With that, we are adding a new member.
Another member, the oldest one, is leaving, and the total over here is 105. Now we go to the last row. On the left side we have everything in the total, so the whole dataset is aggregated. This is the maximum we're going to get, around 175. But on the right side the window just keeps shifting until we reach the last record, so the frame shifts like this, and the total of this is 105.
Okay, guys. So you see, it's very simple. The running total always considers everything from the starting position until the current row without dropping any member. The rolling total always drops the oldest member in order to add something new, and the window keeps shifting. The running total is great for tracking, for example budget tracking, or checking the current total sales against a target or something like that. So we are always considering the whole dataset. But with the rolling total we always do a focused analysis: we are always interested in the window of three months. So they might seem very similar, but they have completely different scopes for analysis. Both of them are doing aggregations over time, so they can help us do analysis over time, like checking whether our business is growing or declining over time. So, guys, as you can see, using very simple SQL with the window functions, we can do really great analysis on our data. This stuff is really fundamental for data analysis or reporting for our business. So window functions are really powerful for data analytics.
227. 4 7 win aggr moving avg: Okay. So now we have the following task, and it says: calculate the moving average of sales for each product over time. So here we have something called the moving average. It is very similar to the running total. In the running total we used COUNT and SUM and so on.
But here, we're going to use the function average, and instead of calling it the running average, we call it the moving average. So let's go and solve the task. Let's start, as always, by selecting the usual stuff. Let's get the order ID and the product ID. And since it's over time, I will get the order date as well. And the last one is the sales, from the table Sales.Orders. So that's it, let's go and execute it. Now we got our ten orders with the products, order date, and sales.
Let's start building our window function step by step. Which function do we need? We need the average. This is the easy part: it says moving average of sales, so it's going to be the average of sales. Now let's define the window. Do we have to divide the data, partition the data? Well, yes, it says for each product. That means we're going to use the PARTITION BY clause with the product ID. I would say that's it for the first step, the average by product. So let's go and execute it. Now if you check the result, you can see that we got our windows. The first one is for the product 101, and the overall average of the sales is 35. So we have one aggregated value for each window, the same thing for the next product, and for the next, and so on. So we don't have any progress over time or something like a moving average, right? We don't have this effect. We have just one average for each window.
Now in order to have the effect of the moving average, it's going to be like the running total: we have to use the aggregate function together with the ORDER BY. I'm just going to make it in a new column, so I'm going to copy everything like here, and now we add ORDER BY. Since it's over time, we're going to use the order date, and we're going to have it ascending, because over time always starts with the earliest dates and ends with the latest dates.
So from the lowest to the highest. We're going to leave it like this, and let's call it moving average. Now let's go ahead and execute it. We got an extra comma here because of the copy and paste, so let's execute it again. All right, now let's check the results. Let's take the first window over here, and you can see that in the moving average we have something like progress. It goes 10, 15, 40, 35. So there is a moving average: we don't have one solid number for the average, we have different values.
Now, how is SQL going to solve this? It's really simple. It goes row by row. For the first row, what is the average of 10? It's 10. Then moving on to the next one, it's 10 plus 20, divided by 2, and you get 15. Now moving to the third one, all three values are summed and divided by 3, and you get 40. And for the last row in the window, it sums all four values and divides by 4, and you get 35. This is exactly the same value as in the previous column, the average by product, where we don't have ORDER BY. There you also got 35, exactly like the last row, because it is the same calculation: summing all four values and dividing by four.
But now the next value is interesting. As you can see, the next value comes from another window. You see here we have 15 for the product 102, and the average is 15 as well. So SQL is not considering the old values from the other window; SQL calculates each window separately. So again, this is the first value of this window, 15, so the average is 15, and then the same stuff, right? Summing those values, dividing by two, and so on. In data analysis we call this last field over here a moving average, and you can implement it very simply using an average function together with the ORDER BY.
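The moving average just described can be sketched like this. The sales values for product 101 (10, 20, 90, 20) and the first value for product 102 (15) follow the numbers in the walkthrough; the order dates, order IDs, and the second row for product 102 are assumptions added so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical orders; dates are invented and unique within each product.
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, product_id INTEGER, order_date TEXT, sales INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, 101, "2025-01-01", 10), (3, 101, "2025-01-10", 20),
     (8, 101, "2025-02-18", 90), (9, 101, "2025-03-10", 20),
     (2, 102, "2025-01-05", 15), (7, 102, "2025-02-15", 60)],
)

query = """
SELECT
    order_id,
    product_id,
    order_date,
    sales,
    -- One fixed average per product window.
    AVG(sales) OVER (PARTITION BY product_id) AS avg_by_product,
    -- Adding ORDER BY turns it into a moving (running) average inside each window.
    AVG(sales) OVER (PARTITION BY product_id ORDER BY order_date) AS moving_avg
FROM orders
ORDER BY product_id, order_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Within product 101 the moving average climbs and falls row by row and ends on the same value as the per-product average; at the first row of product 102 it restarts from that row's own sales.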
All right, let's move to the next task, and it says: calculate the moving average of sales for each product over time, including only the next order. As you can see, the first part we've already done, right? We have the moving average, and the data is divided with PARTITION BY for the products. But here we have more specifications. It says including only the next order. That means we're talking about the current order and the next order. So here we have a fixed frame, or fixed window. We don't need the average of the whole window; we need a maximum of two orders in each calculation. How are we going to do that? We can have our custom frame clause inside our window function. That means we cannot leave it as the default; we have to specify it.
So let's go and do that. I will just copy the old definition of the window, because we have the exact same stuff: the average of sales OVER, PARTITION BY product ID, ORDER BY order date. This is the first part. Now we would like to have this fixed window, so we're going to define our frame clause. I'm just going to zoom out a little bit. It's going to be ROWS BETWEEN, and then we have the boundaries of the frame. It says including the next order, so we're going to use FOLLOWING. The first boundary is going to be the current row, and since it's the next order, the second one is going to be 1 FOLLOWING. So that is our frame, including only the next order: CURRENT ROW AND 1 FOLLOWING. Let's call it rolling average. So that's it, let's go and execute.
Now let's check the result. You can see that the moving average has completely different values from the rolling average. Let's go and understand why. You can do it row by row. Let's take the first row over here. The sales here is 10, and the rolling average is 15. Why is that? Because in the calculation we are considering the next value: 10 plus 20, divided by 2, gives 15. That means SQL defined the frame like this.
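As a sketch of the custom frame built in this task, the snippet below computes the rolling average over the current order and the next one. The four sales values for product 101 follow the walkthrough (10, 20, 90, 20); the order IDs and dates are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical orders for a single product; dates are made up.
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, product_id INTEGER, order_date TEXT, sales INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, 101, "2025-01-01", 10), (3, 101, "2025-01-10", 20),
     (8, 101, "2025-02-18", 90), (9, 101, "2025-03-10", 20)],
)

query = """
SELECT
    order_id,
    order_date,
    sales,
    AVG(sales) OVER (
        PARTITION BY product_id
        ORDER BY order_date
        ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING   -- current order plus the next one
    ) AS rolling_avg
FROM orders
ORDER BY order_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each frame holds at most two rows. The last row of the partition has no following row, so its average is just its own sales value.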
Those two rows are used in the calculation for the first row. Now moving on to the second row, SQL includes the third one, the next one, as well. But since the window is only two orders, it drops the first row. So the next frame is going to be like this: as you can see, it's 20 plus 90, divided by 2, and you get 55. We can see the effect of the rolling average. Now for the next one it's exactly the same. We are at the third row, and it includes the next one, and we get the same value, because 90 plus 20, divided by 2, gives 55. Now the interesting part is the last row in the window over here: it will not consider the next value, because that is outside of the window. So the sales is 20, and the average stays 20 as well. That's it. All right, guys. So with that, we have learned about the moving average, the rolling average, and those amazing concepts using the window functions.
All right, now we can have a quick overview of the different use cases of the aggregate functions and how the definition of the window changes the whole use case. The first use case is finding the overall total. Here, if you don't define anything in the window, if you leave it empty, you are doing an overall analysis. You aggregate the whole dataset and then provide this aggregation for each row. This is what happens if you leave it empty and don't define anything: you are aggregating the whole dataset. Now, moving to the next step, we can do an analysis called total per group. What we do is add PARTITION BY to the definition of the window. By adding, for example, PARTITION BY product here, what happens? The data is split into two categories, or two groups, and the aggregation is done for each window separately. This is, of course, a great analysis for comparing different products, like here the caps and gloves.
So this is helpful in order to compare categories. You can do this total per group analysis if you use PARTITION BY. Now, if you use ORDER BY, you land in the third use case. As we learned, we will be doing a running total. As you can see here in the output, we are building a cumulative value for the sales, and this can help us do progress-over-time analysis in order to understand the performance of our business. Now moving on to the last use case, the final phase of the window function with the aggregation. Here you have the aggregate function together with ORDER BY and a customized fixed window. Of course, we can use it to help us build progress over time within a specific fixed window. And you get the same effects in all those use cases if you use the other functions, not only the SUM: you can use AVG, COUNT, MAX, MIN, so all the aggregate functions. Guys, as you can see, the window functions in SQL are very important for data analytics. Just by changing part of the window definition, you are generating a whole new use case for data analytics.
228. 4 8 win aggr summary: All right, friends. So now let's do a quick recap of the window aggregate functions. What do they do? They aggregate a set of values and return a single aggregated value for each row. So it's very similar to GROUP BY, but here we don't lose the details. Now, to the next point: what are the rules for the syntax? About the expressions: they all expect a number in the expression, so you have to pass a number like the sales or any integer. Only for the COUNT can you use any data type.
The rules for the aggregate functions are very simple. Everything is optional inside the definition of the OVER clause, the definition of the window. You can use PARTITION BY, ORDER BY, and frames, or not, or just leave everything empty. Everything is optional. Now, as we learned, we have a lot of use cases for the aggregate functions, and they are really amazing for analytics. The first one, the simplest one: you can do an overall analysis if you just leave the window definition empty, so you get one big number about your business. In the next use case, we can do total-per-group analysis. As we learned, we can use PARTITION BY to compare categories with each other, like comparing products or customers and so on. Moving on to the next one, we can do part-to-whole analysis. We can compare the performance of each data point with the overall. You can, for example, compare the sales to the total sales in the window or in the whole data set. And we have many comparison analyses. We can compare the current value with the average, or we can compare it to the extremes, to the highest sales or the lowest sales, and so on. Another use case: we can identify data quality issues in our data. We can, for example, identify duplicates using the COUNT function. Moving on to the next use case, we have outlier detection. We can find out which data points are above the average and below the average and so on. Then the next one is the running total. As we learned, it is a great tool to track progress or to monitor the performance of our business over time. Or if you want to be more specific, you can use the rolling total to have a specific window and only track this window, like three months or something like that. And the last use case: we can calculate the moving average of our data.
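The first four use cases recapped above differ only in what goes inside OVER. Here is a minimal sketch of that idea, run through Python's built-in sqlite3 module so it can be tried anywhere (window functions need SQLite 3.25 or newer). The table `orders`, its columns, and the sample values are hypothetical and are not the course database.

```python
import sqlite3

# Hypothetical sample data: five orders for two products.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, product TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Caps", 20), (2, "Caps", 10), (3, "Gloves", 30), (4, "Gloves", 70), (5, "Caps", 5)],
)

query = """
SELECT
    order_id, product, sales,
    SUM(sales) OVER ()                     AS overall_total,      -- empty window
    SUM(sales) OVER (PARTITION BY product) AS total_per_product,  -- total per group
    SUM(sales) OVER (ORDER BY order_id)    AS running_total,      -- cumulative value
    SUM(sales) OVER (ORDER BY order_id
                     ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS rolling_total  -- fixed window of 2 rows
FROM orders
ORDER BY order_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Swapping SUM for AVG in the last column would give a moving average over the same two-row window.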
So it's really amazing how ORDER BY and aggregate functions can open the door to advanced analyses. So, guys, as you can see, we have a lot of use cases for the window aggregate functions in the world of data analytics. All right, so with that you have learned how to aggregate your data using four different SQL window functions and their use cases. Moving on to the next topic, we're going to learn how to rank your data using six different SQL window functions. As usual, we're going to deep dive into the syntax, how SQL works, and the different use cases; they are amazing for data analytics. 229. 5 1 win rank what is: Hey, friends. So now we're going to learn how to rank your data using six different window ranking functions. We have ROW_NUMBER, RANK, DENSE_RANK, NTILE, CUME_DIST, and as well PERCENT_RANK. As usual, first we have to understand the concept behind them. After that, we're going to learn the syntax, and we're going to go through the most important use cases for the ranking functions that I collected from my projects. So now let's start with the first question: why do we call them ranking functions? So let's go. All right. Let's say we have the following data: products and their sales. If you want to rank your products, first you have to sort the data based on something, for example ranking the products based on their sales. So that means SQL first sorts your data from the highest to the lowest. Sorting the data is always the first thing SQL has to do before ranking anything. Now, in order to rank our data, we have two methods. The first method we call integer-based ranking. That means SQL assigns each row an integer, a whole number, based on the position of the row. Looking at the example, the first row, the product with sales of 70, is going to be rank number one.
Then the next row, product B with sales of 30, gets rank number two. Then the next one is three, then four, and the last one is five. So that means SQL here is assigning an integer to each row based on its position in the sorted list. This method we call integer-based ranking. Now, let's go to the second method: percentage-based ranking. In this method, SQL first calculates the relative position of the row compared to all others and then assigns a percentage to each row. So in the output, SQL assigns percentages instead of integers, and we have a scale from 0 to 1. Now, if you compare both methods, you can see that on the left side, with integer-based ranking, we have discrete, distinct values. It starts at one, then two, three, and ends in this example at five, so it really depends on how many rows we have in the result. It could be five, it could be 500, 5 million, and so on. But on the right side, we always have the same scale, from 0 to 1. Between 0 and 1 we have an infinite number of data points. This scale we call a normalized scale, or a continuous scale with continuous values. Now the question is when to use which method. Percentage-based ranking is great for answering questions like: find the top 20% of products based on their sales. This method is a great way to understand the contribution of data values to the overall total. We call this kind of analysis a distribution analysis. On the other hand, with integer-based ranking we can answer questions like: find the top three products. With this question, we are not interested in the contribution of each product to the overall total. We are just interested in the position of the value within a list. This is a very commonly used analysis in reporting. We call it top/bottom-N analysis. So now let's group our ranking functions based on those two methods.
In the first group, integer-based ranking, we have four functions: ROW_NUMBER, RANK, DENSE_RANK, and NTILE. On the other hand, we have only two functions that generate percentage-based ranking: CUME_DIST and PERCENT_RANK. So that was an introduction and overview of those methods and how we group the ranking functions. Next, we're going to learn about the syntax of the ranking functions. Most of them follow the same rules. For example, we always start with the function name, so here we have RANK. But as you can see, we don't use any expression: these functions don't allow any argument inside them, so the parentheses must be empty. This is the first rule for using rank functions. Then, about the definition of the window. As usual, PARTITION BY is optional. You can use it or leave it. And now to the second part, ORDER BY. It is required. You must order, or sort, your data in order to do ranking, so you cannot leave it out. That means the definition of the window should have at least an ORDER BY, for example here on sales. So we cannot leave it empty. All right. So there are two requirements: you cannot use any expression for those functions, and you have to sort your data using ORDER BY. Now let's have an overview of all the functions. As you can see, all of them are ranking functions, and almost all of them don't allow any expression inside them. The exception is this function here, NTILE. It accepts a number inside it. That means you cannot use it empty; you have to pass a number. All the others must be empty. For PARTITION BY, it is optional for all of them, and for ORDER BY, it is required for all of them. So you must use ORDER BY. The frame clause is not allowed in the ranking functions, so you cannot change the definition of the frame inside the window function.
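The syntax rules above can be sketched with all six ranking functions side by side. This is a minimal example using Python's built-in sqlite3 module (SQLite 3.25 or newer); the table `products` and the sales values other than 70 and 30 are made up for illustration. Note that SQLite is more lenient than the lesson's rule and does not reject a ranking function without ORDER BY, so the sketch only demonstrates the form described here.

```python
import sqlite3

# Hypothetical products table; only the first two sales values come from the lesson.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("A", 70), ("B", 30), ("C", 20), ("D", 10), ("E", 5)],
)

query = """
SELECT
    product, sales,
    ROW_NUMBER()   OVER (ORDER BY sales DESC) AS row_num,      -- integer based, no argument
    RANK()         OVER (ORDER BY sales DESC) AS rnk,          -- integer based, no argument
    DENSE_RANK()   OVER (ORDER BY sales DESC) AS dense_rnk,    -- integer based, no argument
    NTILE(2)       OVER (ORDER BY sales DESC) AS bucket,       -- integer based, needs a number
    CUME_DIST()    OVER (ORDER BY sales DESC) AS cume_dist,    -- percentage based, 0 to 1
    PERCENT_RANK() OVER (ORDER BY sales DESC) AS percent_rank  -- percentage based, 0 to 1
FROM products
ORDER BY sales DESC
"""
ranking_rows = conn.execute(query).fetchall()
for row in ranking_rows:
    print(row)
```

With five distinct sales values the three integer rankings agree (1 to 5), NTILE(2) splits the rows into buckets of three and two, and the two percentage-based columns stay between 0 and 1 regardless of how many rows the table has.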
So now, as usual, we're going to deep dive into all of those functions in order to understand when to use them and what the use cases are, and practice in SQL as well. We're going to start with the first one, ROW_NUMBER. 230. 5 2 win rank row number: All right, so what is ROW_NUMBER? In SQL, the ROW_NUMBER function assigns each row a unique number as a rank, and it doesn't care at all about ties. That means if you have two rows sharing the same value, they will not share the same rank. Okay, so now we have a very simple example. We have a list of sales, and we have the following query. It starts with the ranking function, ROW_NUMBER. It doesn't accept any argument inside it, and the definition of the window is ORDER BY sales DESC. That means we sort the data descending, from the highest to the lowest. SQL does the following: the highest is the 100, the lowest is the 20, and here we have the 80 twice. Once SQL is done sorting the data, it starts assigning ranks. ROW_NUMBER assigns a unique number to each row. That means it starts with the first one: the 100 is rank number one. The next one, the first 80, is rank number two. The second 80 is rank number three, the 50 is four, and the last one is five. Now, if you check the output, you can see that all those numbers are unique. We don't have any repetitions: one, two, three, four, five. They are unique, distinct values. And as well, no ranks are skipped. We have one, two, three; there's no jumping to six or seven or something. It is a clear sequence of distinct values, and there are no gaps. But still there is something special in our data. We can see that in the sales we have the same value twice.
So we have two rows with the same sales. As you can see, with ROW_NUMBER they get distinct values, so they do not share the same rank. That means ROW_NUMBER does not handle ties. If you have multiple rows sharing the same value, they will not share the same rank; they get distinct, different ranks. So this is how ROW_NUMBER works in SQL. It generates a unique rank for each row, it does not handle ties, and it doesn't leave any gaps, so no ranks are skipped. Now let's go to SQL for a few examples and use cases. All right, so now we have the following task. It's very simple: rank the orders based on their sales from the highest to the lowest. This is very easy. We first select the data: order ID, product ID, and let's take the sales as well, and then select the table, which is sales orders. Let's go and execute it. With that we've got all our orders. What we're going to do now is assign a rank to each row. That means we need a column here that contains the rank for each row. In order to do that, we use the window function ROW_NUMBER. It doesn't accept any argument inside it, so it should be empty, and then we have to define the window. As we learned, with the ranking functions we cannot leave it empty. We have to sort the data using ORDER BY. ORDER BY is a must. We don't have to use any PARTITION BY, so we can rank all the data that we have inside the table. So how do we sort the data? The task says it should be based on sales from highest to lowest. That means we order by sales, and since it's from highest to lowest, we have to use descending. Now we give it a name: sales rank, and let's say row, since we are using ROW_NUMBER. So that's it, it's very simple. Let's go and execute it. Now let's have a look at the results. Before, SQL sorted the data by the order ID, since we didn't define anything.
But since we now order by sales descending, SQL sorted the data by the sales from the highest to the lowest and started assigning a rank, or let's say a unique integer, to each row. The highest order is order ID eight, with sales of 90. This is the highest one. You can see we have one, two, three, four, five, up to ten. By checking the result, you can see that the ranking here is unique. There are no duplicates, and there are no skipped ranks or gaps. We have everything from 1 to 10, even though in our data we have a couple of sales that share the same value. For example, we have those two orders. You can see that both of them have sales of 60, but they don't share the same rank, right? And here as well, orders nine and three share the same value, 20, but they don't share the same rank. So with that, we have solved the task. It's very simple. We now have a rank based on the sales from highest to lowest. 231. 5 3 win rank rank func: All right, so what is the RANK function? In SQL, the RANK function assigns each row a number as a rank, and this time it handles ties. That means if you have two rows in your data with the same value, they share the same rank. One thing about the RANK function is that it leaves gaps in the ranking, so there's a possibility of skipped ranks. In order to understand how the RANK function works in SQL, we're going to have a very simple example. All right. So again the same data, but with a different function. Our window looks like this. It starts with the function RANK, which doesn't accept any argument inside it. Then we have the window: ORDER BY sales descending, from the highest to the lowest, and our data is already sorted like that. So now, how is SQL going to assign the ranks? The first row gets the highest rank.
So the value 100 is one, then the second one is two. But now for the third one, as you can see, we have two values that are the same. We have a tie, and this time SQL lets them share the same rank. Both of them are rank two. It's not like ROW_NUMBER, where we had three over here. This time we have two, because we have a tie. Having the same value means they share the same rank. Now the next value is the tricky one, because if you check over here, you might think the next rank should be three. We have one, two, and so the next value generated in the rank should be three. But SQL says: you know what, the position of this value is number four. You can see, one, two, three, four. So the position number here is actually four, and SQL gives it the rank of four. With that, SQL leaves a gap in the ranking. You can see we are skipping rank number three. This always happens once you have a tie where rows share the same rank. The next one is easy: it is row number five, so it gets rank five. Now, looking at the output of the RANK function, you can see that we don't have unique ranks here. We have shared ranks in the case of ties, so it handles ties. But here we have gaps in the ranks, so we are skipping ranks. When I think about the RANK function, I think about the Olympics. If two athletes tie for the gold medal, the first place, there will be no silver medal for second place. The next medal given is the bronze, for third place. All right, so now let's go into SQL to practice the RANK function. All right, now we're going to solve the same task, but using the RANK function.
We're going to stay with the same example over here and rank the orders based on their sales from highest to lowest, but this time using the RANK function. So we use RANK, and everything inside it is empty. Then our window is exactly the same as before: OVER, ORDER BY sales DESC. Let's give it a name: sales rank, and, yeah, let's add rank. So that's it. As you can see, the syntax is very simple and very similar to ROW_NUMBER. We just changed the function. Now let's execute this and check the results. Looking at the new rank and comparing it with the old one, we can see that some ranks are shared, right? We have the two twice here. We have rank number two twice because we have the same value over here: 60 and 60, so we get two and two. But if you compare with ROW_NUMBER, you can see that it does not share the same rank. So this is one difference, and here as well it's the same thing. These rows have the same value, sales of 20, so we have rank number seven twice, whereas ROW_NUMBER gives them different values. And with the next value, as you can see, we are skipping a rank. There is a gap: there is no rank eight. You can see that this is row number nine, and that's why it gets the nine. The same thing happens over here. If you check those two ranks, the next one should be three, but since it is in row number four, it gets rank four. So by checking the results, we can see that rows with the same value share the same rank, and as well we have gaps. This is how RANK works. 232. 5 4 win rank dense rank func: All right, so what is DENSE_RANK? It is very similar to the RANK function. It assigns each row a number as a rank, and it handles ties as well. So rows with the same value share the same rank.
But this time it doesn't leave any gaps like the RANK function does. DENSE_RANK will not leave any gaps; it will not skip any rank. In order to understand this, we're going to have a very simple example. So let's go. All right, so again the same data, but with a different function. This time we have the ranking function DENSE_RANK, and the window is the same: ORDER BY sales descending, from the highest to the lowest. The data is again already sorted. Let's see how SQL assigns the ranks. As usual, the first row is rank number one, and the second is rank number two. But again, here we have the same values, and it's like RANK: they share the same rank. So both of them have rank number two. Now you might say, well, this is very similar to the RANK function, so why do we have DENSE_RANK? I'm going to say: wait for it. The difference shows in the next value. SQL comes over here; this value is exactly after the tie. With RANK, SQL took the position number, so the row number, which was four, right? One, two, three, four. But this time, with DENSE_RANK, SQL will not leave gaps in the ranking, so there is no skipping. The next rank in the sequence is three, so that's why this value gets rank three. As you can see, there is no gap. We have one, two, and three. We are not skipping, we are not leaving any gaps, and the last one is four. This is exactly the difference between DENSE_RANK and RANK. Now, checking the output of DENSE_RANK, you can see that we don't have unique ranks. We have shared ranks here; as you can see, there is repetition. So it handles ties, and it doesn't leave any gaps. It doesn't skip anything in the ranking. Okay, so that's it. Now, let's go back to SQL to practice DENSE_RANK.
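The three walkthroughs on the same five-row list can be reproduced in one query. This is a minimal sketch using Python's built-in sqlite3 module (SQLite 3.25 or newer). The table name `sales_list` is hypothetical, and the value 50 for the fourth row is an assumption; the lesson clearly names only 100, the tied 80s, and 20.

```python
import sqlite3

# Hypothetical five-row list with one tie (80 appears twice); 50 is an assumed value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_list (sales INTEGER)")
conn.executemany("INSERT INTO sales_list VALUES (?)", [(100,), (80,), (80,), (50,), (20,)])

query = """
SELECT
    sales,
    ROW_NUMBER() OVER (ORDER BY sales DESC) AS row_num,    -- unique, ignores ties, no gaps
    RANK()       OVER (ORDER BY sales DESC) AS rnk,        -- shares ranks on ties, leaves gaps
    DENSE_RANK() OVER (ORDER BY sales DESC) AS dense_rnk   -- shares ranks on ties, no gaps
FROM sales_list
ORDER BY row_num
"""
comparison = conn.execute(query).fetchall()
for row in comparison:
    print(row)
```

The tied rows get 2 and 3 from ROW_NUMBER but 2 and 2 from the other two functions; the row after the tie gets 4 from RANK (its position) and 3 from DENSE_RANK (the next rank in sequence).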
Now we have the same task: rank the orders based on their sales from highest to lowest. We're going to do the same thing, but this time using the function DENSE_RANK. The parentheses of DENSE_RANK are empty, and then we define the window like all the others: OVER, ORDER BY sales DESC. Then we give it the name sales rank dense. That's it. As you can see, all of those functions have exactly the same syntax, right? So let's go and execute it. Okay. So now let's check the results. We got our new rank using DENSE_RANK. Just by checking the results, you can see that it handles ties. We have two twice, right? Let's check the example over here. We have sales of 60 twice. That's why they share the same rank in DENSE_RANK and as well in the normal RANK. But what is interesting is the value after the tie. As you can see over here, with DENSE_RANK we have three. We didn't skip a rank, and we don't have any gap: one, two, and then three. RANK, though, just focuses on the position number: this is row number four, so the rank is four, and with that we have a gap. So as you can see, we don't have any gaps in DENSE_RANK. We have three, four, five, and then over here we have the same value twice again. We have sales of 20 and 20, and they share rank six. As you can see, there's now a difference between DENSE_RANK and RANK. Here, in RANK, we have seven and seven, but in DENSE_RANK we are at six and six. The difference comes from the fact that earlier we skipped rank number three in RANK. And for the rest you can see we have seven and eight. Now, if you compare those three rankings, you can see that they all start with rank number one, but they don't all end with the same rank. ROW_NUMBER and RANK really focus on the position number, or the row number, of the orders. You can see over here that it is row number ten. That's why we have ten here. Ten.
So the scale is 1 to 10, and that is exactly the same for ROW_NUMBER, 1 to 10. But with DENSE_RANK over here, we have 1 to 8, and that's because rows shared the same rank, and with that we wasted, let's say, a few ranks. So the scale is different from the two others, and that's because we have ties twice. This is one tie, and over here we have another tie. That's why we are missing two ranks here. So this is how DENSE_RANK works, and you can now compare all three together in order to understand how those ranks work. 233. 5 5 win rank compare ranking: All right. Now let's quickly compare the three functions side by side. Let's start with the first point, the uniqueness of the rank. If you compare those three, you can see that only ROW_NUMBER generates a unique, distinct rank. With the two others, we have duplicates, or let's say shared ranks. Now the second point, whether the function handles ties: the only one that doesn't handle ties is ROW_NUMBER. This one doesn't handle ties, and the two others handle ties since they offer shared ranks. Now we have the last point, about leaving gaps or skipping ranks. If you check ROW_NUMBER and DENSE_RANK, you can see there is no skipping. There are no gaps for ROW_NUMBER, and none for DENSE_RANK either. Only with the RANK function, the middle one, are we skipping ranks and leaving gaps. That's it, guys. These are the differences between those three functions. I usually tend to work with ROW_NUMBER more often than the two others. 234. 5 6 win rank top bottom analysis: All right, guys. So now, I had a look at those three functions, and I checked my projects, real projects, and I found that there are many more use cases for the function ROW_NUMBER compared to the other functions, DENSE_RANK and RANK. So now what are we going to do?
I'm going to show you a few use cases for ROW_NUMBER that I usually use in my real projects, so that you understand how important the ROW_NUMBER function is. So let's go to SQL. All right. So now let's start with the first use case, and we have the task: find the top highest sales for each product. This is very classic in reporting or data analysis. We call this top-N analysis. Here, the managers or decision makers would like to see the best performers or the biggest successes in the data. For example, the top five customers or the top five products or categories and so on. This is a very important analysis in order to focus on the best products or on the most important customers and so on. And this is, as I said, very classic and very important for making decisions in the business. So now let's see how we can solve this. We're going to start with the usual steps. Let's first select the data: select order ID, let's take the product ID as well, and the sales, from sales orders. Let's go and execute this. Now, as we know, for each product we have multiple orders, and we have multiple sales. But we are interested only in the highest sales for each product. So we have to create a rank. In order to do that, we can use the ROW_NUMBER function, and we have to define the window now. Do we need PARTITION BY? Check the task. It says for each product. That means we have to divide the data by the product ID. So let's use PARTITION BY product ID. And now we must use ORDER BY. How do we sort the data? By sales, and from the highest to the lowest. So let's take sales, descending, from highest to lowest. Let's give it a name; it's going to be rank by products. Let's go and execute this. Now, looking at the results, you can see that SQL divided the data by the product ID. So we have four windows here.
In the first one over here, you can see that the rank starts at one and ends at four; the highest rank goes to order number eight with sales of 90, and then it goes down to four. Now, as you can see, in the second window we have a new ranking. It resets. The first is order number ten, and the last one is order number two. So as you can see, each window has its own ranking, and in the last window we have only one row. Of course, in the task we have to return the highest, so we are not interested in the others. We have to return this row, and this one, and this one, and this one. As you can see, we have to return everything that has rank one. We are not interested in ranks two, three, four, and so on. We would like to have the highest. So now, in order to filter the data, we're going to use a subquery. So SELECT star FROM the query, and then we add the following condition: WHERE rank by products equals one. We are interested only in rank number one. Let's go and execute it. And with that, since we have four products in our data, we get only four rows, and we have the highest sales. As you can see, we have only number one over here, and those sales are the highest for each product. With that, we have solved the task with a top-N analysis. Okay. Moving on to the next use case, we have the following task: find the lowest two customers based on their total sales. So now we have the exact opposite use case. We call it bottom-N analysis. In this example, the decision makers in the business want to optimize costs, want to cut costs. For that, they have to analyze the lowest performers among the products or the lowest performers among the employees in order to cut costs. So with this analysis, the decision makers are not focusing on the most successful stuff. We are focusing on the lowest stuff.
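Going back to the first task above (the highest sale for each product), the query can be sketched as follows with Python's built-in sqlite3 module (SQLite 3.25 or newer). The table `orders`, its snake_case column names, and the ten rows are an illustrative data set chosen to be consistent with the figures mentioned in the lesson; the actual course table may differ.

```python
import sqlite3

# Illustrative data: (order_id, product_id, sales); an assumption, not the course database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, product_id INTEGER, sales INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 101, 10), (2, 102, 15), (3, 101, 20), (4, 105, 60), (5, 104, 25),
     (6, 104, 50), (7, 102, 30), (8, 101, 90), (9, 101, 20), (10, 102, 60)],
)

query = """
SELECT *
FROM (
    SELECT
        order_id, product_id, sales,
        -- the ranking restarts at 1 inside every product window
        ROW_NUMBER() OVER (PARTITION BY product_id ORDER BY sales DESC) AS rank_by_product
    FROM orders
) AS ranked
WHERE rank_by_product = 1   -- keep only the top order of each product
ORDER BY product_id
"""
top_per_product = conn.execute(query).fetchall()
for row in top_per_product:
    print(row)
```

The subquery is needed because the rank column does not exist yet when the inner WHERE clause would be evaluated. Changing the filter to `rank_by_product <= 3` would turn this into a top-three-per-product query.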
The lowest performers. So now let's solve this task. If you check the question, we have multiple things, right? We have the total sales, and we have to find the lowest two customers. So we have ranking and aggregation as well. Remember, we can combine window functions with GROUP BY. So now let's do it step by step. First, let's select the data, right? What do we need? Order ID, customer ID, and the sales from sales orders. Let's go and execute this. Now, if you check the customers over here, we have four customers, and they have multiple sales. We would like to have the total sales for each customer in order to find the lowest two. So let's start with the aggregation. We aggregate the sales: the sum of sales. Let's call it total sales. And in order to do the GROUP BY, we can keep only the customer. So GROUP BY customer ID. It is a very simple GROUP BY statement. Let's go and execute this. By checking the results, we can see that SQL aggregated the data. We have four rows, because we have four customers, and we have their total sales. So we have solved the first part of the task: we have the total sales for each customer. Now let's move to the second part. It says lowest two customers. That means we have to use the ranking functions in order to rank those customers. We are not interested in all customers, only in the lowest two. In order to do that, we use the window function ROW_NUMBER, and then OVER. Now, do we have to partition the data? Well, no, we don't have to do that. We have to sort the data, so ORDER BY. This time we use the aggregation in the ORDER BY, so the sum of sales, and we want it sorted from the lowest to the highest, so I'm just going to use the default, which is ascending. Now, let's call it rank customers.
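The query assembled so far, wrapped in the subquery filter that the task (lowest two customers) calls for, might look like the sketch below. It uses Python's built-in sqlite3 module (SQLite 3.25 or newer). The table `orders`, its column names, and the rows are illustrative assumptions chosen to be consistent with the totals mentioned in the lesson.

```python
import sqlite3

# Illustrative data: (order_id, customer_id, sales); an assumption, not the course database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, sales INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 2, 10), (2, 3, 15), (3, 1, 20), (4, 1, 60), (5, 2, 25),
     (6, 3, 50), (7, 1, 30), (8, 4, 90), (9, 2, 20), (10, 3, 60)],
)

query = """
SELECT *
FROM (
    SELECT
        customer_id,
        SUM(sales) AS total_sales,
        -- the window function runs after GROUP BY, so it can sort by the aggregate
        ROW_NUMBER() OVER (ORDER BY SUM(sales)) AS rank_customers
    FROM orders
    GROUP BY customer_id
) AS ranked
WHERE rank_customers <= 2   -- the two customers with the lowest totals
ORDER BY rank_customers
"""
lowest_two = conn.execute(query).fetchall()
for row in lowest_two:
    print(row)
```

Because ORDER BY defaults to ascending, rank 1 is the customer with the smallest total; adding DESC would turn the same query into a top-two analysis.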
So that's it. Again, the rule here is: if you are using a window function together with GROUP BY, you can only use columns that are part of the GROUP BY. So this should work. Let's go and execute it. Now, as you can see in the results, we got an extra column for the rank. The lowest customer is customer number two, the second one is customer four with total sales of 90, and the customer with the highest sales is the last one, customer number three with 125. Now we have almost everything, but the list should contain only the lowest two. So in order to filter the data, we use a subquery: SELECT star FROM the query, and then we define the condition: WHERE rank customers is smaller than or equal to two. With that, we get the first two. Let's go and execute this. And with that, we got the lowest two customers based on their total sales: customer ID two and customer ID four. That's it, we have solved the task, and we have now done a bottom-N analysis. 235. 5 7 win rank unquie ids: Okay, let's keep moving to the next use case, and we have the following task: assign unique IDs to the rows of the table orders archive. So now, guys, you might be in a situation where you have a table without any primary key, and you would like to create an ID for each row. In order to do that, we can use the function ROW_NUMBER to generate a unique identifier for each row inside our table if we don't have one. Generating such an ID for each row is very important for things like importing and exporting data, maybe joining tables using this ID, or, let's say, optimizing the performance of a query using the ID. So now let's see how we can generate that in SQL. Okay. Let's first select from the table orders archive in order to understand the content. So SELECT star FROM sales orders archive. Let's go and execute.
Now, checking the results, you can see that we have ten orders, and we have repetitions in the order ID over here, so it is not really a primary key. As you can see, we have ID four twice, and here we have ID six three times. What we're going to do now is generate a unique identifier for each row. In order to do that, we go over here and say ROW_NUMBER, and then we define the window. We don't partition the data at all, but we have to sort the data, so ORDER BY order ID. Or you can use something else as well, like the order date; it doesn't matter. Let's add the order date to it as well. Let's call it unique ID. Let's go and execute this. Now, checking the data, you can see that we have a new ID over here that comes from ROW_NUMBER, and it is a unique identifier. As you can see, we have ten rows, and with that we have ten different, distinct, unique IDs. With this, as you can see, we have solved the task, and we now have a unique identifier for the table orders archive. Having this ID, we can do many things, like joining tables, or something special and important called pagination. Imagine we have a huge table, and we would like to retrieve the data. In order not to fetch all the data in one go, we can divide the data by the primary ID or by the unique identifier. For example, we can make one page from 1 to 100,000, and then the second page runs from 100k to 200k. By dividing the data, we can maybe improve exporting or importing data, or we can have faster retrieval for the users. We don't want to have the whole data in one go, in one page. So pagination has a lot of benefits, and we can do it only if we have a nice ID like this. 236. 5 8 win rank identify duplicates: All right. I'm going to show you the last use case for the function ROW_NUMBER that I usually use in my real projects.
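Looking back at the unique-ID and pagination idea from the previous lesson, here is a minimal sketch using Python's built-in sqlite3 module (SQLite 3.25 or newer). The table `orders_archive`, its columns, the generated dates, and the helper `fetch_page` are all hypothetical; only the pattern of repeated order IDs (four twice, six three times, ten rows in total) follows the lesson.

```python
import sqlite3

# Hypothetical archive table without a primary key: order_id repeats.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_archive (order_id INTEGER, order_date TEXT)")
order_ids = [1, 2, 3, 4, 4, 5, 6, 6, 6, 7]
conn.executemany(
    "INSERT INTO orders_archive VALUES (?, ?)",
    [(oid, f"2024-01-{oid:02d}") for oid in order_ids],
)

# ROW_NUMBER over the whole table (no PARTITION BY) yields one distinct ID per row.
numbered = """
SELECT ROW_NUMBER() OVER (ORDER BY order_id, order_date) AS unique_id,
       order_id, order_date
FROM orders_archive
"""
all_rows = conn.execute(numbered + " ORDER BY unique_id").fetchall()

def fetch_page(page, page_size):
    """Return one page of rows, selected by a range of the generated ID."""
    first = (page - 1) * page_size + 1
    last = page * page_size
    return conn.execute(
        f"SELECT * FROM ({numbered}) AS t WHERE unique_id BETWEEN ? AND ? ORDER BY unique_id",
        (first, last),
    ).fetchall()

page_1 = fetch_page(1, 5)
page_2 = fetch_page(2, 5)
print(page_1)
print(page_2)
```

Since the ID is computed at query time, it only stays stable between runs if the ORDER BY inside the window is deterministic; rows that tie on every sort column may swap IDs.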
Sometimes if you're doing data analysis, you're going to find out that there are data quality issues, especially with duplicates. So what I usually do is use ROW_NUMBER in order to identify the duplicates. Not only that, I can use it in order to delete the duplicates. So we can use it to do data cleansing. And this is an essential task for every data engineer, not only data analysts, in order to prepare and clean up the data before doing data analysis. So let's have the following task: identify duplicate rows in the table orders archive and return a clean result without any duplicates. So not only do we have to identify the duplicates, we have to return no duplicates in our results. Let's see how we can do this. Let's first select the data: select star from sales orders archive. Let's go and execute. Now by looking at the data, you can see that we have duplicates. We have an issue. The order ID four is twice in our database. It doesn't make sense, right? It should be only one. Which one is the correct one? If you check the data over here, you can see that this order is shipped and then delivered. So it looks like the last one is the correct one. So how can we do that? If you just scroll to the right, you can see that we have a creation time, and we usually use such a timestamp in order to identify the last valid order. And then we can see immediately that this creation time is later than the previous one, which means this is the more up-to-date one, right? The more current. So what we're going to do is rank our data for each order ID and sort the data by the creation time in order to find the last inserted or created row for this order. So let's see how we can do that. We're going to go over here and say, let's have a ROW_NUMBER, and then OVER, and we're going to partition by the primary key: PARTITION BY order ID.
And as we said, we have to order the data by the timestamp, so at the end, after the PARTITION BY, ORDER BY creation time, descending. So we want the highest first, then the lowest. That's it. Let's call it RN and execute the query. Now by checking the data, if everything is clean and we don't have duplicates, everything should be one, because for each primary key we should have at most one row. But you can see we have a two here, and here a three and a two, and that is an indicator that we have duplicates inside our data. So now let's check one by one. As you can see, order ID one appears only once, with the rank one, and the second one as well has the rank one. But here we have the issue. As you can see, we now have two ranks for the order ID four. Now, which one is the correct one? In our logic, we say it is the last row that was inserted into our data, and this is rank number one. If you scroll to the right side, you can see that the creation time here is later than the second one. With that, we have identified what we want: the last inserted row for each ID. And now let's check this one over here. Here we have it three times. So it says the first one has the latest creation date. If you go to the right side and compare those timestamps, you can see that the first record is the latest one that was inserted into our data. So as you can see, this one is the one that we need. The other two we don't need, because they are old information. So everything that doesn't have rank number one is not valid. It's something old. It's actually bad data quality, so we want to remove it or not select it. So now, in order to have clean data, we're going to select the following as a subselect: select star from that query. And now we are interested only in rank number one. We don't need anything else. So let's go and execute. Now if you check the results, you can check the order ID over here.
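The deduplication pattern described above can be sketched as follows. This is a minimal illustration, not the course's exact query: it uses Python's built-in sqlite3 module (SQLite 3.25+ for window functions), and the table name `orders_archive`, the column names, and all timestamps are hypothetical. Only the shape of the data follows the lesson: order ID four appears twice and order ID six three times, and the latest creation time wins.

```python
import sqlite3

# (order_id, order_status, creation_time): hypothetical values.
rows = [
    (1, "Delivered", "2025-01-01 10:00:00"),
    (2, "Shipped",   "2025-01-05 09:00:00"),
    (4, "Shipped",   "2025-01-20 05:22:04"),
    (4, "Delivered", "2025-01-20 14:50:33"),
    (6, "Shipped",   "2025-02-05 23:25:15"),
    (6, "Shipped",   "2025-02-06 07:10:00"),
    (6, "Delivered", "2025-02-06 12:34:56"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders_archive "
    "(order_id INTEGER, order_status TEXT, creation_time TEXT)"
)
conn.executemany("INSERT INTO orders_archive VALUES (?, ?, ?)", rows)

# Number the rows inside each order_id, newest creation_time first.
# rn = 1 is the latest row per order; rn > 1 marks the outdated duplicates.
RANKED = """
    SELECT ROW_NUMBER() OVER (PARTITION BY order_id
                              ORDER BY creation_time DESC) AS rn,
           order_id, order_status, creation_time
    FROM orders_archive
"""

clean = conn.execute(
    f"SELECT order_id, order_status FROM ({RANKED}) "
    "WHERE rn = 1 ORDER BY order_id"
).fetchall()

bad = conn.execute(
    f"SELECT order_id, order_status FROM ({RANKED}) "
    "WHERE rn > 1 ORDER BY order_id, creation_time"
).fetchall()
```

With this sample data, `clean` holds one row per order ID (the latest one), and `bad` holds the three outdated rows that could be reported back to the source system.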
It is unique. We don't have any duplicates, right? One, two, three, four, five, six, seven. There are no duplicates at all, and we now have only the latest inserted data for the orders, and we don't have any duplicates or data quality issues. So now, of course, we can take these results and do further analysis, and this is exactly what data engineers usually do: clean up the data and prepare the data before doing any data analysis. Of course, if you want to communicate those data quality issues to the source of the data, let's say you are not the owner of that information, you can generate a list of all the bad data, and you can send it to the source system and tell them to clean it up at the source. Now in order to select the bad data, we can just change the condition here and say, if the rank is higher than one, then it is bad data. Let's go and execute this. Now with this we have in the results all records that shouldn't exist in the data in the first place. So we can export it and communicate it to the source and tell them: check here, you have something wrong in your system, and this information should not be inserted in the data. So, everyone, it is very strong, right? It's very powerful. I use it a lot in my projects. There are many use cases for the ROW_NUMBER function in SQL. We can use it for top-N analysis and bottom-N analysis, to find the best performers and the worst performers. As well, we can assign unique IDs to do paginating, or we can use it in order to discover data quality issues and clean up our data. So it is an amazing function in SQL and you're going to use it a lot. So that's it for the three functions ROW_NUMBER, RANK, and DENSE_RANK. Now we're going to talk about NTILE.
237. 5 9 untile: Okay, so what is NTILE? NTILE in SQL is very simple. It's going to divide your rows, your data, into a specific number of almost equal groups, or, as we sometimes call them, buckets.
So now in order to understand this and how SQL works with this function, we can have a very simple example. So let's go. Okay, we have the following setup. We have four rows, four sales, and we would like to divide them into two groups, or two buckets. In order to do that, we can use the NTILE function. It has a different syntax than the other ranking functions. It starts with NTILE. Then we must define a number, so we cannot leave it empty like the other ranking functions. Here we have two buckets, then OVER. And here again, we have to sort the data, so the ORDER BY is a must: ORDER BY sales descending, from the highest to the lowest. So now, as usual, SQL is going to sort the data. We have it already sorted in this example. Then it can start assigning each of those rows to the two buckets. But SQL first has to calculate the bucket size, so how many rows we can put inside each bucket. The calculation is very simple. It says the bucket size equals the number of rows divided by the number of buckets. So what is the number of rows here? We have four rows. So we have four over here. Then the number of buckets we define in the syntax of the query. Here we define two buckets. We need two groups. So that means we are dividing four by two, and the size of the bucket is going to be two. Now with this, SQL is ready and is going to start assigning each row to a bucket. It's going to start at the top. The first row is going to be in bucket number one. Then it goes to the next one. It's going to say, okay, we still have enough space in the bucket, so it assigns it to one as well. But with this, we reach the maximum number of rows within a bucket. The next row is going to be assigned to another bucket. It's going to be two, and the last one is going to be two as well. As you can see, it's very simple. We have just assigned our sales, based on the sorting, of course, into two buckets.
These two sales belong to bucket number one and the other two belong to bucket number two. Very easy. It's very straightforward because we are dividing an even number, and we got perfectly sized buckets. But now, what happens if we have an odd number? So we have here five rows instead of four. The bucket size is going to be five divided by two, so we're going to get 2.5. And now, of course, SQL will not put two and a half rows in each bucket, because then we would be splitting a row across two buckets. Of course, this will not work. We should now have a bucket with three rows and another bucket with two. So the rule in SQL makes it very clear. It says larger groups come first, then smaller. That means, if we have an odd number like this, the larger group is going to be the first group. So it's going to look like this. It's going to reset everything. Let's see what happens. The first one is going to be one, the second one is one as well. The third one is going to be one as well. So the first bucket is larger than the second one. Then the rest are going to be two. As you can see, the larger group comes first, then the smaller, and this is how SQL works if you have odd numbers. You don't have perfectly sized buckets here. You have approximately, or roughly, equally sized buckets. This is how NTILE works. Now, let's go back to SQL in order to practice this function. So now let's have some fun working with this function. We're just going to select something like order ID and sales from sales orders. Let's go and execute it, and with that we got our ten rows. Now, let's say that I would like to create only one bucket from the data. So NTILE with only one bucket, then OVER. No PARTITION BY. Let's take ORDER BY sales descending. That's it. I'm going to call it One Bucket. Let's go and execute it. As usual, it's going to sort the data and then calculate the bucket size. It's going to be ten rows divided by one.
So the size of the bucket is going to be ten. That's why you see one everywhere here, because all those rows fit into one bucket. This is very simple. We have only one bucket. Now let's have two buckets. I'm just going to copy and paste. Instead of one, we're going to have two, and let's call it Two Buckets. Let's go and execute this. Now here, again, what is the size of the buckets? It is ten divided by two. We will get perfectly grouped buckets. The first bucket is going to be five rows, and the second one is going to be the next five rows. So it is perfect. Let's go to the next one. Let's have three buckets. Three. Let's go and execute. So now it's going to divide ten by three in order to get the size of the bucket, and it's going to be 3.3, so it's a decimal, and we will not get perfectly sized buckets. So again, the larger group comes first, and then the smaller. As you can see, we have to fit four rows into the first group in order to get the others with three. That's why the first bucket is going to be the biggest one. So four rows go into the first bucket, then the next three rows go into bucket two, and the last three go into bucket three. As you can see, the larger group is going to be the first bucket. So now let's keep playing with the data. Let's take four now. We would like to have four buckets. Now things get interesting. SQL is going to divide ten by four, and we'll get 2.5. So again, we will not get perfectly sized groups. SQL now has to fit ten rows into four groups. The first three rows go into bucket number one, and the next three rows, like this, go into bucket number two. And then you can see over here we have two buckets with the size of two. With that, we can fit ten rows into four groups. And again, you can see the larger groups come first, like this one.
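The "larger groups come first" rule from the examples above can be written down as a few lines of plain Python. This is a sketch of the arithmetic only, not how any database implements NTILE internally, and the helper names `ntile_sizes` and `ntile_assign` are made up for this illustration.

```python
def ntile_sizes(n_rows, n_buckets):
    """Bucket sizes for NTILE: every bucket gets the whole-number share,
    and the leftover rows go one each to the first buckets."""
    base, extra = divmod(n_rows, n_buckets)
    return [base + 1 if b < extra else base for b in range(n_buckets)]

def ntile_assign(n_rows, n_buckets):
    """Bucket number for each row, in sorted order."""
    assignment = []
    for number, size in enumerate(ntile_sizes(n_rows, n_buckets), start=1):
        assignment.extend([number] * size)
    return assignment

print(ntile_sizes(4, 2))    # [2, 2]        even split
print(ntile_sizes(5, 2))    # [3, 2]        larger bucket first
print(ntile_sizes(10, 3))   # [4, 3, 3]
print(ntile_sizes(10, 4))   # [3, 3, 2, 2]
print(ntile_assign(5, 2))   # [1, 1, 1, 2, 2]
```

The sizes match the walkthrough: ten rows in three buckets give four, three, and three rows, and ten rows in four buckets give three, three, two, and two.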
And then the smaller ones come later. Okay, so this is how NTILE works in SQL. And now you might say, you know what? Why do I need buckets in the first place? What is the use case?
238. 5 10 ntile use case data segementation: There are two use cases for the NTILE function in my projects. On one hand, if I'm a data analyst, I'm going to use the NTILE function in order to segment my data. On the other hand, if I'm a data engineer, I'm going to use the NTILE function in order to do ETL processing and, as well, load balancing. So now let's start with the first use case as a data analyst, where you want to do segmentation with the NTILE function. Segmentation is a very nice way to understand your data. You can segment your data into different buckets or groups. For example, doing segmentation for the customers: you can group your customers depending on their behavior, like the total sales or the total number of orders. With that, you can make, for example, a high section, then the medium, and then the low. So now in order to understand the segmentation use case, let's have the following task. Okay. The task says: segment all orders into three categories, high, medium, and low sales. In order to solve this, let's do the basic stuff, right? Select order ID. Let's take the sales from our table sales orders, and let's go and execute it. As usual, we got our ten sales. So now, if you check the task, it says we need three categories. That means we need three buckets, and it says high, medium, and low sales. So that means we are dividing by the sales. Let's go and do it step by step. We're going to use NTILE since we need to segment the data. Three categories means three buckets, and then let's define the window. OVER; we don't have to divide the data with PARTITION BY. We just need to sort it by the sales.
So it's going to be ORDER BY sales, and let's take descending, since we want to sort it from the highest to the lowest. That's it. Let's call it Buckets. Let's go and execute this. Now if you check the data, you can see that the rows are segmented into three buckets. The first bucket contains all orders with high sales. The second one has all orders with medium sales. And the last one has all orders with low sales. So as you can see, we have already categorized our data into three groups. But now, as you can see, we have numbers, and maybe the user is expecting to have text: high, medium, low. So what we're going to do now is translate those numbers into text, into words. And of course, we cannot do that inside the window function. We're going to use a data transformation with the CASE WHEN statement. Don't worry about it, we will have a complete dedicated section explaining CASE WHEN. For now just follow me in order to see how this works. We're going to use a subquery. So it's going to be SELECT, and let's take the star for everything, and then let's have the following logic. WHEN Buckets equals one, THEN it is high. The sales are high. We are just mapping the numbers to text. Next, WHEN the bucket is equal to two, THEN we are targeting the medium. Medium. And then the last group: WHEN Buckets equals three, THEN those sales are low. Let's close it with END, and let's call it Sales Segmentations. That's it. Let me just make it a little bit smaller in order for you to see it. Then FROM, and we have our subquery like this. So as you can see, we just mapped the numbers to text. We are just doing translations. So let's go and execute it. And now by checking the results, we got our three categories for the users. The first category is the high sales, the second one is the medium sales, and the third one is the low sales.
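The segmentation query built above can be sketched end to end like this. It is an illustration under assumptions: the SQL runs through Python's built-in sqlite3 module (SQLite 3.25+), the table name `orders`, the column names, and the ten sales values are hypothetical, and the sales are chosen to be distinct so that the bucket boundaries are unambiguous.

```python
import sqlite3

# Ten hypothetical orders with distinct sales values.
orders = [(1, 10), (2, 15), (3, 20), (4, 60), (5, 25),
          (6, 50), (7, 30), (8, 90), (9, 35), (10, 80)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, sales INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)

segments = conn.execute("""
    SELECT order_id,
           sales,
           CASE WHEN buckets = 1 THEN 'High'
                WHEN buckets = 2 THEN 'Medium'
                WHEN buckets = 3 THEN 'Low'
           END AS sales_segmentation
    FROM (
        -- Inner query: split the orders into three buckets by sales.
        SELECT order_id, sales,
               NTILE(3) OVER (ORDER BY sales DESC) AS buckets
        FROM orders
    )
    ORDER BY sales DESC
""").fetchall()

for order_id, sales, segment in segments:
    print(order_id, sales, segment)
```

Because ten rows do not divide evenly by three, the High segment gets four orders and the other two get three each, following the larger-groups-first rule. The CASE expression sits in the outer query because the bucket number only exists after the window function has been evaluated in the subquery.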
So, guys, you see, NTILE is very powerful for segmenting our data. Now you can go and segment things like customers by their total sales, products by their prices, employees by their salaries, and so on.
239. 5 11 ntile use case data load: All right, so this is the first use case for the NTILE function as a data analyst, where you segment your data in order to understand the behavior. Now, on the other hand, if you are a data engineer, you can use the NTILE function in order to do load balancing in your ETL. So now I'm just going to explain it with a very simple sketch. All right, we have the following scenario where we have two databases, and we would like to move one big table from database A to database B. In this case, I'm doing something called a full load, which means I'm loading all the rows from one database to another. If you do it in one go, what could happen? It could take a long time, so it could take hours or sometimes even days. And maybe at the end, you will get some network errors, because you have stressed the network between those two databases, and everything is going to break, you're going to lose the load, and you have to start again. So now instead of loading this table in one go, we can split it into fractions, or let's say buckets. We can split this table, for example, into four small tables using the function NTILE. Now after we split this big table into small tables, we can start moving those small tables one after another, and with that, we are not stressing the network, and it's going to succeed. After loading everything, at the end in the target database we're going to have those small tables, and of course, we can use UNION in order to merge them and have again the big table that we had in the original database. This is a very common use case for NTILE: splitting the load and balancing the processing of extracting data. All right.
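The batching scenario above can be sketched with two in-memory databases standing in for database A and database B. Everything here is a toy illustration under assumptions: Python's built-in sqlite3 module (SQLite 3.25+) plays both databases, and the table names `orders` and `orders_part_1` to `orders_part_4`, the columns, and the data are hypothetical. A real ETL tool would handle the transfer, retries, and logging differently.

```python
import sqlite3

source = sqlite3.connect(":memory:")   # stands in for database A
target = sqlite3.connect(":memory:")   # stands in for database B

source.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, sales INTEGER)")
source.executemany("INSERT INTO orders VALUES (?, ?)",
                   [(i, i * 10) for i in range(1, 11)])

N_BATCHES = 4
batch_sizes = []

for b in range(1, N_BATCHES + 1):
    # Extract one bucket at a time; sorting by the primary key gives
    # each batch a contiguous range of order IDs.
    batch = source.execute(f"""
        SELECT order_id, sales
        FROM (SELECT order_id, sales,
                     NTILE({N_BATCHES}) OVER (ORDER BY order_id) AS bucket
              FROM orders)
        WHERE bucket = ?
        ORDER BY order_id
    """, (b,)).fetchall()
    batch_sizes.append(len(batch))

    # Load the batch into its own small table in the target database.
    target.execute(f"CREATE TABLE orders_part_{b} (order_id INTEGER, sales INTEGER)")
    target.executemany(f"INSERT INTO orders_part_{b} VALUES (?, ?)", batch)

# Merge the small tables back into one result with UNION ALL.
union_sql = " UNION ALL ".join(
    f"SELECT order_id, sales FROM orders_part_{b}" for b in range(1, N_BATCHES + 1)
)
merged = target.execute(union_sql + " ORDER BY 1").fetchall()
original = source.execute("SELECT order_id, sales FROM orders ORDER BY 1").fetchall()
```

UNION ALL is used for the merge because the batches never overlap, so there is nothing to deduplicate. Comparing `merged` with `original` at the end is a simple way to confirm that no rows were lost along the way.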
So now we have the following SQL task. It says: in order to export the data, divide the orders into two groups. So let's go and do that. First, we can select everything from the table in order to see the data: sales orders. Let's go and execute it. So now we got our ten orders, and what we have to do is split them into two groups. In order to do that, we can use the NTILE function. Two groups means two buckets. Let's define the window. Here we don't have to partition the data using PARTITION BY, but we have to specify the ORDER BY. Now, which column are we going to use in order to sort the data? Of course, there is no rule here. You can split the data by sales, by the order status, by date, by anything you want. But we usually use the primary key. It's just more systematic and cleaner, especially if you have a sequence of numbers in the order ID, so you can export the first range of the orders, then go to the next group, and so on. So let's go with the order ID, and let's give it the name Buckets. That's it. Let's go and execute. Now, as you can see, it's very simple. We got our two groups, so this is the first batch of the data, and this is the second batch of data. Now we can select the first batch, export it, and import it into the next system. And then after that, we go with the second batch. And of course, if you still suffer from the size of those buckets, you can split the data into smaller pieces, so you can go over here and make it four. With that, we're going to get smaller buckets, and it might be easier to export the data. So this is a really great use case for the NTILE function. All right, everyone. So with this, you have learned the two use cases for the NTILE function that I usually follow in my projects. As a data analyst, you can use it in order to do segmentation. And as a data engineer, you can use it in order to do load balancing of the ETL.
240. 5 12 win rank cume dist: Okay, everyone.
So with that, we have covered everything about the integer-based ranking functions. Now we're going to talk about the second method. We have the percentage-based ranking functions. And here we have two functions, CUME_DIST and PERCENT_RANK. So now let's have a quick recap. With percentage-based ranking, SQL is going to calculate a relative position as a percentage and assign it to each row. The output is going to be a continuous, normalized scale from 0 to 1, and this is really amazing for distribution analysis. Those functions consider in their calculation the overall total, the whole size of the dataset, which can help us find out the contribution of each value to the overall total. Now in SQL, in order to generate the percentage, we have two different formulas. On one hand, we have the function CUME_DIST, and on the other hand, we have PERCENT_RANK. That means we have two different functions with different formulas in order to calculate the percentage. Now let's start with the first function, CUME_DIST. All right, everyone. CUME_DIST stands for cumulative distribution. It's going to calculate the distribution of your data points within a window. In order to understand what this means, we're going to have a very simple example to see how SQL works with this function. So let's go. All right. Again, we have our very simple example of the sales, and we have the following query. So CUME_DIST, and we don't pass any argument to it, so it stays empty. The window is going to be, like usual, ORDER BY sales descending, from the highest to the lowest, and the ORDER BY is a must. The first step is that SQL sorts the data; we have it already sorted from the highest to the lowest. The next step is that SQL starts calculating the percentage for each row. And we have a very simple formula.
It says, CUME_DIST equals the position number of the value divided by the number of rows. It's very simple. Let's do it step by step. SQL is going to start with the first value in our list. So it's going to be calculated like this. What is the position number of the first value? It's going to be one, right? This is the first value in our list. And what is the total number of rows? We have five rows, right? One, two, three, four, five. So we're going to divide one by five, and the result is going to be 0.2. This is going to be the value for the first row. Okay, so now SQL goes to the next row, and this time, we get a special case. As you can see, we have the 80 twice. So we have a tie here. Now, first, we need the position number. As you can see, we are at position number two, right? But since we have the 80 multiple times, SQL is going to take the last position where we see the value 80. And the last position is the record number three. That's why SQL is going to say, for this record, it's position number three, not two. Then it divides by five, and we get the value 0.6. So this is the most confusing thing about this function. If SQL finds a tie, it completely ignores the current position number, so we don't have two. It takes the last position number for the same value. And the last one in our list is the record number three. That's why we have three over here. Okay, now let's keep moving. Let's go to the third row, and as you can see, we are again in the tie. But this time, this is the last time we see 80. Next, we don't have 80. So what's going to happen? We're going to get the exact same result.
So it's going to be 3/5. As you can see, if we have a tie, the rows share the same percentage. That means with CUME_DIST, if you have the same values, they share the same rank. So let's keep moving to the fourth one. Now, what is the position number of the 50? We are at record four. So position number four divided by five, and we get 0.8. Okay. Now let's move to the last one, and it is the easiest one. Which position do we have over here? It is position number five, it's the last one, and the number of rows is five. That's why we get one. So, guys, that's it. This is how the cumulative distribution works. Once you understand the formula, it's going to be very easy to understand the output. As you can see, calculating the percentage always depends on the total size of our dataset. You can see here the number of rows. With that, we get an output that helps us understand the distribution of our data points within the dataset.
241. 5 13 win rank percent rank: All right, everyone. So now we're going to focus on the second function that generates a percentage as a rank. We have PERCENT_RANK. PERCENT_RANK focuses on generating the relative position of each row within a window. In order to understand what this means, we can have a very simple example to see how SQL works with this function. So let's go. Okay, again, we have those sales, a very simple example, and the syntax is like this: PERCENT_RANK, and inside it, we don't use any arguments. And the window is going to be like this: ORDER BY, which is a must, sales descending, from the highest to the lowest. The first step that SQL does is sort the data from the highest to the lowest, and we have it already like this. Next, SQL starts calculating the percentage, which is very similar to the cumulative distribution. But this time it's going to be like this.
Position number minus one, divided by the number of rows minus one. So it's almost exactly the same formula, but we are subtracting one from both numbers. Okay, so now, let's go through all the rows step by step and see the output. It's going to start with the first row, right? What is the position number of the first row? It's going to be one. Then we have to subtract one. That's why we get zero. Now, what is the total number of rows? We have five rows here, and we subtract one. That's why we get four. Now, zero divided by any value gives zero. So that's why for the first value, we get zero. All right, now let's move to the second row over here, and here we have our special case where we have a tie. We have two sales sharing the same value, 80. Now, for PERCENT_RANK, SQL has a different behavior than for CUME_DIST. Remember, in CUME_DIST, SQL searched for the last position of the shared value. It was position number three, since this is the last time we see 80. But now with PERCENT_RANK, SQL sticks with the first occurrence of the shared value. So now by checking those two 80s, what is the first occurrence? It is the record number two. That's why we have position number two; subtracting one, we get one. The same goes for the total number of rows: we have five, subtracting one, we have four. So now if you divide one by four, you get the result 0.25. This is the percentage for this value. So now let's go to the third row. Here we have, again, the tie. SQL sticks with position number two, the first occurrence. So it's going to be the same: two minus one, we get one, and as well, the total number of rows, five, minus one, we have four. That's why we get the exact same result.
So here, as you can see, with PERCENT_RANK, like with CUME_DIST, the shared values share the same percent rank as well. Now, let's move to the fourth one, where we have the value 50. What is the position of this? It's the record number four. Subtracting one, we get three, and if you divide three by four, you get 0.75. And now moving to the last value over here, it's going to be easy. What is the position number of the 30? It is five; five minus one is four. And as well, we have four here for the total number of rows minus one. So if you divide four by four, you get one. So that's it, guys. This is how PERCENT_RANK works. It always has the scale from 0 to 1. It's always like this. It doesn't matter which values we have inside, and it's going to be a continuous scale. And again, if you have a tie, the rows share the same percent rank. Okay, guys. So now if you compare those two functions, you can see that they are really similar to each other. With both functions we are generating a percentage-based ranking, and both of them handle ties, so the tied rows share the same percentage rank. If you check the syntax, they are very similar. And by checking the formulas of both of them, we are always considering the overall size of the dataset. So here, the size is considered in the calculation to help us find the relative position of each value compared to the overall. And this is very important in analysis in order to measure the contribution of each value to the overall. Now about the use cases: if you want to focus on the distribution of your data points, go with the cumulative distribution. But if you want to focus on the relative position of each row, then go with PERCENT_RANK.
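To put the two walkthroughs side by side, here is a small sketch that computes both functions on the five-row example. Some assumptions are worth flagging: the SQL runs through Python's built-in sqlite3 module (SQLite 3.25+), the table name `sales_demo` is hypothetical, and the first sales value (100) is assumed, since the lesson only names 80, 80, 50, and 30. The two helper functions restate the formulas from the lesson in plain Python and are not part of any library.

```python
import sqlite3

sales = [100, 80, 80, 50, 30]   # 100 is an assumed value for the first row

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_demo (sales INTEGER)")
conn.executemany("INSERT INTO sales_demo VALUES (?)", [(s,) for s in sales])

result = conn.execute("""
    SELECT sales,
           CUME_DIST()    OVER (ORDER BY sales DESC) AS dist_rank,
           PERCENT_RANK() OVER (ORDER BY sales DESC) AS pct_rank
    FROM sales_demo
    ORDER BY sales DESC
""").fetchall()
rounded = [(s, round(c, 2), round(p, 2)) for s, c, p in result]

def cume_dist(values, i):
    """Last position of the value (1-based) divided by the number of rows."""
    last_position = max(pos for pos, v in enumerate(values, 1) if v == values[i])
    return last_position / len(values)

def percent_rank(values, i):
    """(First position of the value - 1) divided by (number of rows - 1)."""
    first_position = values.index(values[i]) + 1
    return (first_position - 1) / (len(values) - 1)

by_formula = [(v, round(cume_dist(sales, i), 2), round(percent_rank(sales, i), 2))
              for i, v in enumerate(sales)]

for row in rounded:
    print(row)
# (100, 0.2, 0.0)
# (80, 0.6, 0.25)
# (80, 0.6, 0.25)
# (50, 0.8, 0.75)
# (30, 1.0, 1.0)
```

The tie on 80 shows the difference between the two: CUME_DIST uses the last position of the tied value (three), while PERCENT_RANK uses the first (two).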
All right, now, there is one more difference between CUME_DIST and PERCENT_RANK, and that is, if you check the formulas, you can see that CUME_DIST is more inclusive. We always consider the position number of the row. But with PERCENT_RANK we don't consider the current row; we kind of skip it, or make it exclusive. So we say PERCENT_RANK is more exclusive, and the cumulative distribution is more inclusive. Now if you ask me the hard question, which one do you use? I'm going to say, if you want to be more inclusive, go with the cumulative distribution. If you want to be more exclusive with the current row, go with PERCENT_RANK. They are very similar to each other. If you want to calculate the distribution of your data, go with the cumulative distribution. If you want to find the relative position of each row, then go with PERCENT_RANK. All right. So now we have the following task that says: find the products that fall within the highest 40% of the prices. Let's go and solve this. Now we are targeting the table products, and I will just select two columns, product and price, from sales products. That's it. Let's go and execute this. As you can see, we got five products and the prices, and the task says, find the highest 40%. So we have to generate a percentage rank. In order to do that, we have the two functions, CUME_DIST and PERCENT_RANK. I will go this time with CUME_DIST. Let's go and do that. So CUME_DIST, and then let's define the window like this. It's going to be ORDER BY. We are targeting the prices now, right? So ORDER BY the price from the highest to the lowest. And let's give it the name Dist Rank. Let's go and execute this. With that, SQL generates for us a percentage ranking using the formula that we just learned. In the output, we are getting all the products, but the task says we have to get only the products that are in the highest 40%.
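The task just stated can be sketched end to end as follows; the walkthrough of the individual steps continues right after. This is an illustration under assumptions: the SQL runs through Python's built-in sqlite3 module (SQLite 3.25+), the table name `products`, the three product names other than gloves and caps, and all five prices are hypothetical, and the helper `top_40_percent` is made up for this sketch.

```python
import sqlite3

# Five hypothetical products with distinct prices.
products = [("Bottle", 10), ("Tire", 15), ("Socks", 20),
            ("Caps", 25), ("Gloves", 30)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product TEXT, price INTEGER)")
conn.executemany("INSERT INTO products VALUES (?, ?)", products)

def top_40_percent(rank_function):
    """Products whose percentage rank by price (highest first) is <= 0.4.

    rank_function is either 'CUME_DIST' or 'PERCENT_RANK'.
    """
    return conn.execute(f"""
        SELECT product, price
        FROM (
            SELECT product, price,
                   {rank_function}() OVER (ORDER BY price DESC) AS dist_rank
            FROM products
        )
        WHERE dist_rank <= 0.4
        ORDER BY price DESC
    """).fetchall()

with_cume_dist = top_40_percent("CUME_DIST")
with_percent_rank = top_40_percent("PERCENT_RANK")
```

The filter has to live in an outer query because a window function cannot be used directly in a WHERE clause. With five distinct prices, CUME_DIST gives 0.2, 0.4, 0.6, 0.8, 1.0 and PERCENT_RANK gives 0, 0.25, 0.5, 0.75, 1.0, so both keep the same two products here. With other row counts or with ties, the two functions can select different rows.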
So that means the first row, the second row, and that's it. Those rows are in the highest 40%. The rest are below that. In order to filter the data, we can use a subquery. So select star from, and then we have our subquery like this, and then our filter is going to be Dist Rank smaller than or equal to 0.4. This is our threshold for getting the data. So let's go and execute this. Now, as you can see, we got the top products, the top 40%. Now, of course, you can format the percentage. We can do that like this. Let's take the Dist Rank and multiply it by 100. Let's go and execute this. As you can see, we got 20 and 40. We can add the percentage character to it as well, right? So we can say CONCAT, and we add the character after that, like this, and let's call it Dist Rank Percentage. That's it. Let's go and execute it. With that, we have solved the task: we have the products that fall within the highest 40%. Now, of course, you can try PERCENT_RANK. It's very simple. We just have to switch the cumulative distribution with the function PERCENT_RANK. Let's go and execute it. Now as you can see, we get the exact same results, so we're still getting the gloves and caps as the highest products within the top 40% of the prices. So, guys, that's it, it's very simple, right?
242. 5 14 win rank summary: All right, friends. So now let's have a quick recap of the window ranking functions. What do they do? They assign a rank to each row within a window. And we have two types of ranking, right? The first one is the integer-based ranking. It assigns a number, an integer, to each row. And here we have four functions: ROW_NUMBER, RANK, DENSE_RANK, and NTILE. The second type of ranking is the percentage-based ranking. Here SQL calculates a rank as a percentage and then assigns it to each row.
And here we have two functions. We have CUME_DIST, the cumulative distribution, and the second one is PERCENT_RANK. Now, to the next point, the rules of the syntax: the expression should be empty, so we should not pass any argument to the functions. We must use ORDER BY in order to sort our data, so it is required, and frame clauses are not allowed, so you cannot customize a frame within the window function. And as we learned, there are many use cases for the ranking functions. For example, we have the top-N analysis and the bottom-N analysis in order to identify our best performers or the worst performers in our business. Another use case: using ROW_NUMBER, we can identify and remove duplicates in our data, so we can use it to find data quality issues and to improve the quality as well. Another use case: if our table doesn't have a clean primary key, we can generate unique IDs using ROW_NUMBER. One more use case was data segmentation. You can use NTILE in order to segment your customers, your products, employees, and so on. And another use case: we can do data distribution analysis. As we learned, we can use CUME_DIST in order to understand the distribution of our data points compared to the overall. And the last use case is more for data engineering. We can use the NTILE function in order to equalize the loading process of our ETLs. So as you can see, there are many use cases for the ranking functions. Alright, so with that, you have learned how to rank your data using six different SQL window functions, and their use cases. They are amazing for data analytics. Now moving on to the next one, we have the last group of window functions. We have the value functions. They are, my friends, the most important group for data analytics compared to the other two. So here we're going to focus on four functions.
We're going to learn how SQL works with them, the syntax, and the use cases as well.
243. 6 1 win value what is: Hey, friends. Now we're going to talk about the most important category of window functions for data analytics. We have the value functions, or sometimes we call them window analytical functions. Here we're going to cover four different functions: LEAD, LAG, FIRST_VALUE, and LAST_VALUE. And as usual, we're going to learn the concept behind them and how SQL executes them behind the scenes, then we'll learn the syntax, and we're going to cover the most important use cases for the value functions that I collected from my projects. So now let's start with the first question: why do we call them value functions? So let's go. All right, everyone. Now we have this very simple example. We have the months and the sales. We can use the value functions in order to access a value from another row. In order to understand it, let's say that SQL is now processing the months, and we are currently at the month of March. Now, for example, I would like to access the value from the previous month, from February. In order to do that, we can use the LAG function to get the value of ten. So with that we have in the same row the current sales of the month March and, as well, the sales from the previous month, February. Maybe in other cases, I would like to get the sales of the next month, April. In order to do that, we can use the function LEAD, and we will get in the same row the value five. So now I can very quickly compare the current month with the previous month and with the next month as well. In other cases, you might be interested in the first month of your list, so here it's going to be January. In order to get the sales of the first month, you can use the function FIRST_VALUE. So we're going to get 20 in the same row.
And now for the last option, I think you already get it: we can get the value of sales of the last month. Here that is July. For that, we're going to use the function LAST_VALUE, and we will get the value of 40. So this is exactly the purpose of the value functions, or analytical functions. We can access a value from another row. And it's very important to note as well that the value functions are like the ranking functions: we have to use ORDER BY in order to sort the data, so that SQL understands what is the first row and what is the last row. In this example, the data is sorted by the month. So, guys, the value functions are really important for analytics. You can use them to access a value from other rows in order to do comparisons. Alright, now let's have a quick overview of the syntax and the rules for the value functions. Here we have four functions: LEAD, LAG, FIRST_VALUE, and LAST_VALUE. You can see we can group them into two groups. We have LEAD and LAG; they are very similar to each other. Especially with the syntax, we can use three arguments inside them, expression, offset, and default, for both of them. For FIRST_VALUE and LAST_VALUE, we can use only an expression. That means we have to pass a value to those functions. You cannot leave it empty. Now, about the expression data type, you can use any field with any data type. There is no restriction such as, for example, using only numbers. Any data type is allowed. Now, about the definition of the window. PARTITION BY, as usual, is optional like in any other group. ORDER BY here is a must. You must define an ORDER BY. It's like the ranking functions. Here, you cannot leave it empty. Now we come to the last one, the frame clause. Things are really different over here. For the first two functions, LEAD and LAG, you are not allowed to define any frame. So you are not allowed to define any subset of data. It's very similar to the ranking functions.
So you must use ORDER BY, but you cannot define the frame of the window. But for the other two functions, FIRST_VALUE and LAST_VALUE, the frame is optional. You can go and use it, and for LAST_VALUE, it is recommended to define a frame clause. Don't worry about it. We're going to have enough examples in order to understand. You can see those functions have different requirements, so there's no generic rule for all of them. But the one thing they all agree on is that you must use ORDER BY. Now, as usual, what are we going to do? We're going to deep dive into those functions. We're going to address first the two functions LEAD and LAG because they are very similar to each other. We'll understand the use cases and when to use them, and of course, we're going to practice in SQL. Let's go.
244. 6 2 win value min max: The LEAD and LAG functions. The LEAD function allows you to access a value from the next row within a window, whereas the LAG function is exactly the opposite. It allows you to access a value from a previous row within a window. It sounds very easy, right? So let's understand how SQL is going to execute those functions. Okay. Now let's have a quick overview of the syntax for both functions, LEAD and LAG. We have here a very simple example for the LEAD function. As usual, we start with the function name, so it's going to be LEAD. After that, we pass the arguments. And as you can see, we have multiple things here. So let's do it step by step. The first thing that we specify is an expression, and the data type could be any data type. It could be a number like here, the sales; it could be a character like names, or dates, or anything. This is required. We have to specify an expression, we cannot leave it empty, and we can use any data type. Now, moving on to the next one, we have here a number. So what is it? This is the offset, and this offset is optional, so you can go and skip it.
So what does offset mean, what are we doing over here? We are specifying for SQL the number of rows forward or backward from the current row. Here in this example, we are specifying the offset as two, using LEAD, and with that we are telling SQL: jump two rows ahead and get me the value. And if you're using LAG, you're telling SQL: go back two rows and get me the value. So here you are telling SQL how many rows it needs to jump. And if you don't specify anything and leave it empty, SQL is going to use one. So the default of the offset is one if you don't specify it. All right. Moving on to the last one, the third one, which is optional as well. You can go and leave it empty. Here, it is the default value. Now, what happens with those functions? Sometimes SQL jumps two rows ahead or something like that, and SQL doesn't find anything. There are no more rows available to access. And with that, SQL is going to return a null. That means if SQL goes to the next rows or goes to the previous rows and doesn't find anything, SQL, as a default, is going to return a null. So if you don't specify anything over here, in those scenarios you will get a null as the return value of the whole function. But in some scenarios, you don't want to have a null. You would like to have a value. So here you are defining the default value. It should not be a null. It should be a ten. So, SQL, if you don't find anything, return a ten. Don't return a null. So again, guys, the default value and the offset are optional for you to configure, but you should know the defaults if you don't use anything: for the offset it's going to be one, for the default value it's going to be null. But you must specify an expression. Here you cannot leave it empty. All right. So that's all about the arguments that you can pass to the LEAD or LAG functions.
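The three arguments described above can be sketched like this. It is only an illustration under stated assumptions: the SQL runs through Python's built-in sqlite3 module, the table `monthly_sales` and its four rows are made up (months are stored as numbers 1 to 4), and the column aliases are hypothetical. The offset 2 and the default value 10 are the ones used in the explanation.

```python
import sqlite3

# Hypothetical sample data: month number and sales for January to April.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monthly_sales (month INTEGER, sales INTEGER)")
conn.executemany(
    "INSERT INTO monthly_sales VALUES (?, ?)",
    [(1, 20), (2, 10), (3, 30), (4, 5)],
)

lead_rows = conn.execute("""
SELECT
    month,
    sales,
    LEAD(sales)        OVER (ORDER BY month) AS next_1,       -- offset omitted: 1, default omitted: NULL
    LEAD(sales, 2)     OVER (ORDER BY month) AS next_2,       -- two rows ahead, NULL when there is no such row
    LEAD(sales, 2, 10) OVER (ORDER BY month) AS next_2_or_10  -- two rows ahead, 10 when there is no such row
FROM monthly_sales
ORDER BY month
""").fetchall()

for row in lead_rows:
    print(row)
```

Only the expression is mandatory; the last two columns differ just in what comes back once the window runs out of rows.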
Then the next things are the standard ones. We have the OVER clause. Then we have PARTITION BY. As usual, PARTITION BY is optional. And then on to ORDER BY. Those functions are like the rank functions. They require you to sort the data. So it is a must to sort the data. Otherwise, SQL will not know what is the next row and what are the previous rows. So we have to sort the data. It is required. You cannot skip this, so it is not optional. Alright, so the syntax is not crazy, right? We have the usual things, and in addition we can configure the default value and the offset. Okay, guys, now we have a very simple example. We have months and sales, and we're going to understand how SQL works for both functions, LEAD and LAG, side by side. In the first example, we are interested in the sales of the next month. In order to do that, we're going to use the LEAD function. So LEAD, and then we specify the argument. It is the sales. We want the value of sales. And then we define the window like this, ORDER BY month. So it's going to be ascending. Now on the right side, we're interested in the sales of the previous month. In order to do that, we're going to use the LAG function. It's going to be very similar to LEAD. We have LAG and then the sales, since we are interested in the sales, and we sort the data by the month. So now let's see how SQL is going to process this side by side and row by row. It's going to start with the first row over here. What is the next month after January? It is February, and we are interested in the sales of this row. So SQL takes the value from the next row, and we get the value of ten. So now, by looking at January, we can see the sales of the next month, February, in the same row. Now let's check the right side over here.
Now, we are interested in the previous month. So what is the previous month of the first row? It will be nothing, right? So we cannot point to anything. That's why SQL is going to say, this is null. There is no previous month for the current row, and we're going to have it as null. Okay, so now SQL goes to the next row. We are at February. What is the next month? It's going to be March, and SQL points to it. So we will get 30 as the sales of the next month, March. And on the right side, what is the previous month of February? It's going to be January, right? So SQL gets the value, the sales of the previous month. And here we will get 20. So as you can see, it is very simple. With LEAD, we are always checking the next value. With LAG, we are always checking the previous value. Let's keep going. We are currently at March. What is the next month? It's going to be April. SQL points to it like this, and we will get the sales of the next month, April. For March on the right side, what is the previous month? It is February, right? So SQL points to February, and we will get the sales of ten. Now, interestingly, in the last row over here, you can see that we are at April. What is the next month after April? There is nothing, because we are at the end of our table, right? Since there's no month after this, we will get a null in the output. But for LAG, we still have a previous month for April. What is the previous month? It is March, and we will get the sales of March. So it's going to be 30. So that's it, guys. It's really simple, right? They are just doing opposite things. So now, if you check those values side by side, you can see that with LEAD, we will always get a value for the first row. But for the last row, it will always be empty, because there is no next value; we are at the end of the table.
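The side-by-side walkthrough above can be reproduced with a small sketch. As before, this is an assumption-laden illustration rather than the course's script: the SQL runs through Python's sqlite3 module, and the table `monthly_sales`, its rows (months stored as numbers), and the aliases are made up to match the values quoted in the walkthrough.

```python
import sqlite3

# Hypothetical sample data: January 20, February 10, March 30, April 5.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monthly_sales (month INTEGER, sales INTEGER)")
conn.executemany(
    "INSERT INTO monthly_sales VALUES (?, ?)",
    [(1, 20), (2, 10), (3, 30), (4, 5)],
)

side_by_side = conn.execute("""
SELECT
    month,
    sales,
    LEAD(sales) OVER (ORDER BY month) AS next_month_sales,     -- value from the following row
    LAG(sales)  OVER (ORDER BY month) AS previous_month_sales  -- value from the preceding row
FROM monthly_sales
ORDER BY month
""").fetchall()

for row in side_by_side:
    print(row)
```

The NULLs show up at opposite ends: LEAD has nothing to read after the last row, and LAG has nothing to read before the first one.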
But if you check LAG, for the first row we will always get a null, because there is no previous value or previous record before the first row. And for the last record, as you can see, we're always going to get a value, because we will have a previous value. Okay, let's move on in order to understand how SQL works this time with the offset and the default value. Now we have the same data, but we have a different task. On the left side, we would like to get the sales of two months ahead. So it's not the next month. It's going to be two months. And we would like to tell SQL: if you don't find any value, don't return null. Return a zero for us. So this is going to be our default. Now, if you check the syntax, it's going to be exactly like before, but we are now adding an offset of two because we are interested in two months ahead. And we are specifying a default value of zero. So if you don't find anything, put zero. Don't put null. Now, on the right side, we have the exact opposite. We are interested in the sales of two months ago. So we are not interested in the directly previous month. We need the sales of two months ago. And here, the same thing: if you don't find anything, don't return a null, give us a zero. So you can see, we have the same syntax, but using the function LAG. So now let's understand how SQL executes this step by step and side by side. SQL is going to start with the first month, January. Now SQL is going to ask, what is the sales of two months ahead? We are at January. It will not be February. It's going to be the month of March. So SQL points to it like this, and we will get the value of 30; 30 is the sales of two months ahead. Now on the right side, we are at January as well. SQL is going to ask the question, what is the sales of two months ago? We don't have any previous data, right? So we will not get anything. SQL would return null, but it's going to check: do we have a default value? Well, yes.
This time, SQL will not return null. It returns the default value, and this time it's going to be zero. All right. Now let's go to the next row. We are currently at February. What is the sales of two months ahead? It will not be March. It's going to be April. So SQL points to it like this, and we will get the value of five. Now on the right side, we are currently at February. The question is, what is the sales of two months ago? We have history. We have the previous month, but we don't have two months of history. That's why we will still get zero in the output, the default value. Okay, so now let's keep going to the next row. We are currently at March. SQL asks, what is the sales of two months ahead? We have only one month after that, but we don't have two months. That's why SQL will not find anything, and it would return null. But it's going to use the default. So here we get the value of zero. There is no more data available in the table. But now on the right side, we are currently at March, and we are asking what are the sales of two months ago. Now we have enough history in the past, and SQL gets January's value of 20. All right. So now let's go to the last month in our table, April. What is the sales of two months ahead? We don't have any data, so it's going to be zero as well. But now on the right side, we are currently at April. What is the sales of two months ago? We have enough history. That's why SQL points to it like this, and we get February's value, which is ten. That's it. This is how SQL works with LEAD and LAG using an offset and a default value as well. Let's go back to SQL in order to practice those two functions.
245. 6 3 win value MoM: Okay. So now we have the following task, and it says: analyze the month-over-month performance by finding the percentage change in sales between the current and the previous month.
So that means we have to compare the current month with the previous month. The main use case for LEAD and LAG is comparison analysis, and we have a very common use case. It's called time series analysis. It is the method of analyzing our business, our data, in order to understand the patterns and trends over time. And one of the most important and classic questions that you're going to get from the decision makers or the business is to do year-over-year analysis or month-over-month analysis. The year-over-year analysis helps us understand the overall growth or decline in the performance of our business over the years. On the other hand, we have month-over-month analysis in order to do short-term trend analysis and to discover patterns in the seasonality as well. So the main focus is to understand the performance of our business over time. Now let's go back to SQL in order to solve the task. Okay, guys. Let's do it step by step. What is the first step? Before we compare things, we have to collect the data. We have to do the calculations first. So we have to find out first the total sales for the current month, and then the total sales for the previous month. After that, we can compare them. So now let's start with the easy part. We have to find out the sales for the current month. In order to do that, let's just do a very simple select. What do we need? Let's take the order ID. Let's take the order date, because inside it we have the month. Let's collect the sales. So that's it for now, from Sales.Orders. Let's go and execute this. In the result, we got the usual data. We have ten orders, sales, and order date, but the order date is on the level of days, and we are not interested in the whole date. We would like to get only the month in order to calculate the total sales for the month.
Now we're going to use a function in order to extract the month from a date. Don't worry about it. We will have a dedicated chapter to show you how to deal with date formats in SQL. So now, what we're going to do: we will use a very simple function called MONTH on the order date. And let's call it order month. That's it. Let's go and execute it. Now, as you can see, we've got a new field where we have only the month information. Here we have January, February, and March. The next step is that we want to find the total sales for each month. So what are we going to do? We're going to use a GROUP BY. Let's do that. We say we want the sum of sales, and I'm just going to call it current month sales. And let's get rid of all the other columns. We're going to group by the month, right? So GROUP BY, and then the month. That's it. Let's go and execute it. It's very simple, right? We now have the three months and the total sales of each month. With that, we got the first piece of information that we need in order to do the comparison. We have for each row the total sales for the current month. The next thing that we're going to do is to find the total sales for the previous month, side by side in the same row. And in order to do that, as we have learned, we can use the LAG function. So we're going to integrate the LAG window function into the same GROUP BY query. We do it like this: LAG, since we are now interested in the previous month, and we put the sum of sales as the expression inside it. After that, we define the window. It is OVER, and ORDER BY is a must, so we sort the data by the month. Let's go and do it. And let's call it previous month sales. So now let's go and execute it in order to see the results. All right. So now let's check the results.
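The query built in the steps above can be sketched as follows. It is a minimal sketch under stated assumptions: the SQL runs through Python's sqlite3 module, so SQLite's `strftime('%m', ...)` stands in for the `MONTH` function used in the lesson, and the table `orders` with its ten rows is hypothetical sample data chosen so that the monthly totals come out as 105, 195, and 80.

```python
import sqlite3

# Hypothetical sample data: (order_id, customer_id, order_date, sales).
orders = [
    (1, 2, "2025-01-01", 10), (2, 3, "2025-01-05", 15),
    (3, 1, "2025-01-10", 20), (4, 1, "2025-01-20", 60),
    (5, 2, "2025-02-01", 25), (6, 3, "2025-02-05", 50),
    (7, 1, "2025-02-15", 30), (8, 4, "2025-02-18", 90),
    (9, 2, "2025-03-10", 20), (10, 3, "2025-03-14", 60),
]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, order_date TEXT, sales INTEGER)"
)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", orders)

monthly = conn.execute("""
SELECT
    CAST(strftime('%m', order_date) AS INTEGER) AS order_month,  -- SQLite stand-in for MONTH(order_date)
    SUM(sales) AS current_month_sales,
    -- LAG runs over the grouped rows, so it reads the previous month's total
    LAG(SUM(sales)) OVER (
        ORDER BY CAST(strftime('%m', order_date) AS INTEGER)
    ) AS previous_month_sales
FROM orders
GROUP BY CAST(strftime('%m', order_date) AS INTEGER)
ORDER BY order_month
""").fetchall()

for row in monthly:
    print(row)
```

The window function is evaluated after the grouping, which is why `LAG(SUM(sales))` can sit in the same SELECT as the aggregate.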
The first row: what is the previous month? There is no previous month. We are at the first record and the first month. That's why we have null. Now, let's go to February. What is the sales of the previous month, January? It is 105. So this is correct. And now the last row, March: what is the sales of February, the previous month? It is 195. So with that we got the two pieces of information; we have the current month and the previous month as well. So, guys, as you can see, it's magic, right? It's very simple. We can use the LEAD and LAG functions in order to access values from other rows without doing any complicated joins and so on. Okay, so now, what is the next step? We're going to subtract the total sales of the previous month from the current month. In order to do that, we're going to use a subquery like this. So SELECT * FROM, and we wrap the whole query as a subquery. And now the calculation is very simple. Let me just move this a little bit down. It is the current month minus the previous month. Let's call it month-over-month change. So that's it. Let's go and execute this. Now let's check the results. For the first month, you can see that we don't have any value, and that is correct because the previous month is empty, so there is no change. Now moving on to February, you can see over here we got plus 90. That means we have an improvement in the performance of our sales. Now moving on to the last one, it's really bad. We have a decline in our performance. We can see that we have -115. That means the current month is doing really badly compared to the previous month. So March is a really bad month. Okay. So now as you can see in the output, we got the absolute numbers, but the task says, find the percentage change. So we have to convert this to a percentage, and we can do it like this. It's very simple. Let's do it in a new column. I'm just going to zoom out a little bit.
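The subquery step, together with the percentage column the task asks for, might look like the sketch below. The same assumptions apply as before: Python's sqlite3 module runs the SQL, `strftime` replaces `MONTH`, `CAST(... AS REAL)` plays the role of the cast to a floating-point type, and the `orders` rows and all alias names are hypothetical. The extra column `mom_perc_int` is there only to show what integer division does to the result.

```python
import sqlite3

# Hypothetical sample data: (order_id, customer_id, order_date, sales).
orders = [
    (1, 2, "2025-01-01", 10), (2, 3, "2025-01-05", 15),
    (3, 1, "2025-01-10", 20), (4, 1, "2025-01-20", 60),
    (5, 2, "2025-02-01", 25), (6, 3, "2025-02-05", 50),
    (7, 1, "2025-02-15", 30), (8, 4, "2025-02-18", 90),
    (9, 2, "2025-03-10", 20), (10, 3, "2025-03-14", 60),
]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, order_date TEXT, sales INTEGER)"
)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", orders)

mom = conn.execute("""
SELECT
    *,
    current_month_sales - previous_month_sales AS mom_change,
    -- integer / integer truncates, so this column collapses to 0
    (current_month_sales - previous_month_sales) / previous_month_sales * 100 AS mom_perc_int,
    -- casting one operand to a floating-point type keeps the fraction
    ROUND(CAST(current_month_sales - previous_month_sales AS REAL)
          / previous_month_sales * 100, 1) AS mom_perc
FROM (
    SELECT
        CAST(strftime('%m', order_date) AS INTEGER) AS order_month,
        SUM(sales) AS current_month_sales,
        LAG(SUM(sales)) OVER (
            ORDER BY CAST(strftime('%m', order_date) AS INTEGER)
        ) AS previous_month_sales
    FROM orders
    GROUP BY CAST(strftime('%m', order_date) AS INTEGER)
) AS monthly
ORDER BY order_month
""").fetchall()

for row in mom:
    print(row)
```

The first month stays NULL in every derived column, because any arithmetic with a NULL previous-month value yields NULL.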
It's going to be the change, the difference, divided by the previous month's sales. And then let's multiply it by 100 in order to get the percentage, like this. And now as you can see, we got zeros, and that's because those numbers are integers. So we have to cast one of those values. I'm just going to do it for the first one, so CAST and then AS FLOAT. So that's it. Let's go and execute it again. Now the result looks better. We have the percentages, but we have a lot of decimals, so let's round the number as well. Let's say one decimal only. Let's give it a name, month-over-month percentage. So let's execute. Now you can see things look better, and with that, we've calculated the percentage change in sales between the current and the previous month. And this is how we do month-over-month analysis.
246. 6 4 win value customer retention: Alright, so now we have another use case for the LEAD and LAG functions. We can use them in order to do customer retention analysis. It's all about measuring customer behavior and loyalty. So we are helping the business and decision makers to build strong relationships with the loyal customers and to focus on their needs as well. So now let's see how we can use the LEAD and LAG functions in order to do customer retention analysis. So, let's go. Alright, now we have the following task, and it says: in order to analyze customer loyalty, rank customers based on the average days between their orders. There are a lot of things going on over here. Let's do it step by step. And I always like to start with a very simple select. So let's select information like the order ID. Let's get the customer ID. And since we want the days, we would like to have the date as well. So order date from the table Sales.Orders, and let's sort the data with ORDER BY customer ID and order date. So that's it, let's go and execute. Now, as usual, we got our ten orders, the customers, and when they ordered.
So now let's check the task. Let's solve this part over here: days between the orders. We have to find how many days are between two orders. For example, if we check customer number one over here, he ordered on the 10th of January, and the second order is ten days later, on the 20th of January. So we have to subtract those two dates. Now, in order to subtract those values and do calculations, we have to have everything in the same row. For example, if we are at the first row over here, I would like to have one more column with the next order, so the date of the next order. So we have to access a value from another row. Of course, we could do joins, but we have the LEAD and LAG functions. And for this scenario, we're going to use the LEAD window function. So let's go and do that. I'm going to call the order date over here current order, and let's calculate the LEAD. I would like to get the next order date. I would like to get this value over here in the same row. That's why this time we pass the order date. Now let's define the window. We have to partition the data because we are analyzing each customer separately, right? So that's why we have to partition by the customer ID. Of course, in order to do the LEAD, we have to use ORDER BY. Let's define that as well. ORDER BY, and it's going to be by the order date. Now we have to give it a name. The order date here is the current order; this is going to be the next order. So next order. Let me zoom out a little bit and make this smaller. So let's go and execute it. As you can see in the output, we got a new column called next order. And with that, we got the current order, the current row, and the value from the next row as well. So what is the next row? It's going to be the 20th of January. The same thing, of course, for the next row over here: we have the current order date and the next order date.
So this value is going to be exactly the same as the next one over here, the 15th of February. And then, since we are working with a window, and this is the whole window over here, the last order for this customer is the 15th of February. There is no next order, so this is going to be null. The same thing if you check the other customers: you can see the last order never has a next order. So it looks like everything is fine. And the last customer has only one order. Now with this, we've got all the information for our calculations. We have the current order and the next order in the same row. Now we can subtract them in order to get the days between those two orders. In order to subtract dates, we have to use the function DATEDIFF. Don't worry about those functions. We will explain all of that in the next chapters. For now, just follow me through the steps. What we're going to do: we're going to subtract this order date from the whole thing over here. The whole thing here is the next order. Let's do it on a new line. And it's going to be very simple. So DATEDIFF; we are finding the difference between two dates. The syntax is going to be like this. First, we have to define what we are talking about: days, months, years, and so on. So we have to tell SQL, find me the difference in days. Now we have to specify two dates. The first one is going to be the order date. This is the current date, and the second date is going to be the whole thing from here. Let's take it and put it side by side. And this calculation is going to give us a number of days. We're going to call this days until next order. All right. So now let's go and execute the whole thing. Now let's check the result. As you can see over here, we got ten. So there are ten days between those two dates, and for the next one, we have 26 days. Here we have a null because we don't have a date here. And for the next one, we have 31 days, so we have a whole month over here. So everything is working perfectly.
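The steps above, LEAD per customer followed by a date difference, can be sketched as follows. This is an illustration under stated assumptions: the SQL runs through Python's sqlite3 module, where subtracting two `julianday()` values stands in for `DATEDIFF` with a day unit, and the `orders` rows are hypothetical sample data chosen so that the first customer shows the 10-day and 26-day gaps mentioned above.

```python
import sqlite3

# Hypothetical sample data: (order_id, customer_id, order_date, sales).
orders = [
    (1, 2, "2025-01-01", 10), (2, 3, "2025-01-05", 15),
    (3, 1, "2025-01-10", 20), (4, 1, "2025-01-20", 60),
    (5, 2, "2025-02-01", 25), (6, 3, "2025-02-05", 50),
    (7, 1, "2025-02-15", 30), (8, 4, "2025-02-18", 90),
    (9, 2, "2025-03-10", 20), (10, 3, "2025-03-14", 60),
]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, order_date TEXT, sales INTEGER)"
)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", orders)

gaps = conn.execute("""
SELECT
    order_id,
    customer_id,
    order_date AS current_order,
    -- next order of the same customer, NULL for that customer's last order
    LEAD(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) AS next_order,
    -- SQLite stand-in for a day-level date difference between current and next order
    CAST(
        julianday(LEAD(order_date) OVER (PARTITION BY customer_id ORDER BY order_date))
        - julianday(order_date)
        AS INTEGER
    ) AS days_until_next_order
FROM orders
ORDER BY customer_id, order_date
""").fetchall()

for row in gaps:
    print(row)
```

Because the window is partitioned by customer, the last order of every customer has no next order and therefore no day count.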
And with that, we have solved only this part: days between the orders. So, guys, you see, right? This is the magic of the LEAD and LAG functions. We can very easily access any information we need in the same row in order to do such an important analysis. And with a very simple query; we are not doing any crazy things like joins. We are just specifying the LEAD function. So now that we've got all the information that we need, next we're going to calculate the average of those days. In order to do that, we have to use a subquery. Let me just zoom out. So let's SELECT * and prepare the subquery. The whole thing is going to be a subquery. I'm just going to get rid of the ORDER BY. It's not necessary now. So let me just put it like this and indent it. Now, what do we need? We need the average of the days. So we need the average of this value. What can we do? We can use a GROUP BY. So customer ID, since we have to find the average for each customer, and we take this value and say average of days until next order, and we call it average days. And we have to group here: GROUP BY customer ID. So like this; let me make this a little bit smaller and zoom in here. So that's it. Now we are just doing a very simple average and GROUP BY statement. Let's go and execute it. You can see SQL aggregates the data. We now have only four customers. And for each customer, we have the average days between the orders. So now what is missing in our task? If you check over here, it says, rank the customers based on this average. So we have to use the RANK function. Here, again, is another window function that we have to use, and we're going to use it together with the GROUP BY. Let me just make this a little bit smaller. And then let's do it over here. I'm going to go with the RANK function. Then we define the window like this: OVER, ORDER BY, and then we sort the data by the average days.
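The average-and-rank step can be sketched like this. The same caveats apply: the SQL runs through Python's sqlite3 module with `julianday()` subtraction standing in for `DATEDIFF`, the `orders` rows are hypothetical, and the alias names are made up. SQLite sorts NULLs first in ascending order, so the customer with a single order lands at rank 1 here; the second ranking column shows one possible way to push that customer to the end by replacing a missing average with a very large number (999999 is an arbitrary choice).

```python
import sqlite3

# Hypothetical sample data: (order_id, customer_id, order_date, sales).
orders = [
    (1, 2, "2025-01-01", 10), (2, 3, "2025-01-05", 15),
    (3, 1, "2025-01-10", 20), (4, 1, "2025-01-20", 60),
    (5, 2, "2025-02-01", 25), (6, 3, "2025-02-05", 50),
    (7, 1, "2025-02-15", 30), (8, 4, "2025-02-18", 90),
    (9, 2, "2025-03-10", 20), (10, 3, "2025-03-14", 60),
]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, order_date TEXT, sales INTEGER)"
)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", orders)

ranking = conn.execute("""
SELECT
    customer_id,
    AVG(days_until_next_order) AS avg_days,  -- AVG ignores the NULL of each last order
    RANK() OVER (ORDER BY AVG(days_until_next_order)) AS rank_avg,
    -- a missing average is replaced by a huge number so it sorts last
    RANK() OVER (ORDER BY COALESCE(AVG(days_until_next_order), 999999)) AS rank_nulls_last
FROM (
    SELECT
        customer_id,
        CAST(
            julianday(LEAD(order_date) OVER (PARTITION BY customer_id ORDER BY order_date))
            - julianday(order_date)
            AS INTEGER
        ) AS days_until_next_order
    FROM orders
) AS gaps
GROUP BY customer_id
ORDER BY rank_avg, customer_id
""").fetchall()

for row in ranking:
    print(row)
```

In this sample data, customers 2 and 3 have the same average, so RANK gives them the same position in both ranking columns.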
So that means we take this calculation over here and put it in the ORDER BY. It's going to be ascending, so we are focusing on the lowest average days. So that's it. Let's call it rank average. Now, let's go and execute this. By checking the result, you can see we now have a ranking for the average. And here SQL says that the number one customer, the most loyal customer, is customer number four, which is not really correct, because we don't have a lot of information about this customer. He or she ordered only once. Either you filter the data and remove this customer, where you say, if the average is null, then don't put it in the ranking, or we can replace this value with a very huge value in order to put it at the end of our list. For example, we can go over here and replace the null with COALESCE like this, and we say, if the average is null, then give me a crazy number like this, a very huge one. So that's it. Let's go and execute. Now, as you can see, this customer is at the end of our list, and we can see that the most loyal customer is number one, and then the other two customers are at rank two. Here they are sharing the same rank since they have the same average. So guys, with that, we have solved the task, and we have ranked the customers based on the average days between the orders. We now have a really nice ranking, and we can understand the behavior of the customers, and maybe we have to focus on customer number one and understand his or her needs. And of course, the function that helped us here to do this customer retention analysis is the LEAD function, which finds the next order so that we can calculate the days. So this is how you use the LEAD function for such a use case.
247. 6 5 win value first last: The FIRST_VALUE and LAST_VALUE functions. I think the name says everything, right?
So FIRST_VALUE allows you to access a value from the first row within a window, whereas LAST_VALUE is exactly the opposite: it allows you to access a value from the last row within a window. All right. So now let's understand how SQL executes those functions. As usual, we have this very simple example with the months and sales, and we have it twice because we would like to compare the two functions, FIRST_VALUE and LAST_VALUE, side by side. On the left side, we would like to get the sales of the first month, and on the right side, we would like to get the sales of the last month. For the first task, we can use FIRST_VALUE. It's very simple: the FIRST_VALUE function, then the argument is going to be the sales, since we want the sales, and then the window is defined like this, ORDER BY month, because we want to get the first month. So, as usual, we must use ORDER BY. Now, on the right side, in order to get the sales of the last month, we can use LAST_VALUE, right? So the same thing: LAST_VALUE of sales, OVER, ORDER BY month. As you can see, on the left and the right we don't use any frame definition, so the default one is going to be used. All right. Now, let's see how SQL is going to process both of those queries side by side. In the first step, SQL sorts the data. It is already sorted from the lowest month to the highest. Then in the next step, it goes row by row, finding the first value on the left side. So what is the UNBOUNDED PRECEDING? It's static and always pointing to January. We have it on both sides like this. And what is the current row? At the start, it's the first row. And on the right side, it's the same thing. So the window is only one row, right? So what is the first value in this window? It is 20, right? And the same thing on the right side.
What is the last value in this window? It is 20 as well. So we get exactly the same result. Now, let's move to the second row. The current row is pointing to February, and the frame is extended like this. What is the first value in this frame? It's 20 as well, so in the output we get 20. Now on the right side, the current row is pointing to February as well, and the window gets extended. So what is the last value of this frame? It's 10. Now, let's keep going. We go to March and the window gets extended. What is the first value? It's always the same, 20. On the right side, the window gets extended. What is the last value? It's 30. So as you can see, the default definition always has a static start, always the same start of the subset, and as we move with the current row, the frame gets extended. Now moving to the last row: with that, we get the whole dataset inside the frame, and the first value is 20. On the right side it's the same thing, the frame gets extended like this, and this time the last value is 5, the sales of April. So now if you compare them side by side, you see that on the left side the task is solved and everything is working correctly, right? For each row, we always have the sales of the first row. And what is the first row? It is January. So we have 20 everywhere, which is correct. But if you check the right side, you can see there is something wrong, right? We are not getting the last value. We should always get April; we should have 5 everywhere. Instead, we have exactly the same values as the sales, so it's really useless to use it like this, right? And that's, of course, because SQL is using the default definition of the window frame. LAST_VALUE is the only one of the window functions that you cannot use with the default frame definition.
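The side-by-side walk-through above can be reproduced with a small sketch. It is written under assumptions: the SQL runs through Python's built-in sqlite3 module, the table `MonthlySales` and its column names are made up, and a numeric `MonthNo` column is added only so that the months sort correctly. The four sales values (20, 10, 30, 5) are the ones from the example.

```python
import sqlite3

# In-memory database; window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MonthlySales (MonthNo INTEGER, MonthName TEXT, Sales INTEGER);
    INSERT INTO MonthlySales VALUES
        (1, 'Jan', 20), (2, 'Feb', 10), (3, 'Mar', 30), (4, 'Apr', 5);
""")

rows = conn.execute("""
    SELECT
        MonthName,
        Sales,
        -- No frame clause on either function: the default frame ends at the current row
        FIRST_VALUE(Sales) OVER (ORDER BY MonthNo) AS FirstSales,
        LAST_VALUE(Sales)  OVER (ORDER BY MonthNo) AS LastSalesDefaultFrame
    FROM MonthlySales
    ORDER BY MonthNo
""").fetchall()

for row in rows:
    print(row)
```

With the default frame, the `FirstSales` column is 20 in every row, while `LastSalesDefaultFrame` simply repeats each row's own sales value, which is the problem described above.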
You have to customize the frame definition in order to get the effect of the last value. For FIRST_VALUE, everything works if you're using the default frame, if you're not specifying anything. But for LAST_VALUE, you will not get the correct effect without customizing the window frame. So my friends, you can use the FIRST_VALUE function like all other window functions: without defining a frame, you can go with the default, and you will get the effect of the first value. But for LAST_VALUE, you have to define a frame. So let's see how we can solve that. All right, in order to solve this, we're going to define the frame like this: ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING. So we just switch things around. Now let's see how this works. Of course, SQL is going to sort the data and so on. Now SQL has a pointer to the UNBOUNDED FOLLOWING, so it always points to the last row in our dataset. Then it proceeds step by step. At the first row, the frame is the whole thing, right? From the current row until the UNBOUNDED FOLLOWING. So what is the last value? It's the last row, the 5 from April. So we get 5 in the output. Now, let's proceed to the next row. The frame gets shorter. And what is the last value? It's 5 as well, right? Now we jump to the next one, and the frame looks like this. What is the last value? 5 as well. And then we reach the last row like this: the current row is equal to the UNBOUNDED FOLLOWING, so we have only one row, and it's 5 as well. So as you can see, it's very simple: fix the frame clause, and LAST_VALUE works as expected. So this is how SQL does it. Now, let's go back to SQL and start practicing. All right, now we have the following task.
It says: find the lowest and highest sales for each product. So now let's see how we can do this. As usual, we're going to start with a very simple SELECT statement. So select the order ID, the product ID, and the sales as well, from the sales orders table. That's it. Let's go and execute this. In the output, we get our orders, products, and sales. So now let's start with the first part of the task: find the lowest sales for each product. In order to do that, we can use the FIRST_VALUE function. Let's go and do that: FIRST_VALUE. Then we have to give it an expression. We need the lowest and highest sales, so let's put the sales inside it. And now we have to define the window with OVER. Since we are saying for each product, that means we have to make windows, so we divide the data using PARTITION BY product ID, and then we must use an ORDER BY. So we sort the data by the sales. Since the first value should be the lowest value, we sort ascending, from the lowest sales to the highest sales. So we're just going to leave it like this, as the default, and we're going to call it lowest sales. Let's go and execute this. Now let's check our results. First, SQL partitions the data by the product ID, so as you can see, we get four windows here. Then it sorts the data by the sales, so the data is sorted from the lowest to the highest, from 10 to 90. Now, what is the first value of the sales? It is the first row, right? So it's 10. That's why we have 10 everywhere. Let's check another one; let's take this one here. This window has two rows, it is sorted, and the lowest sales, or let's say the first value, is 25. So with that, we have solved the first part of the task, finding the lowest sales for each product. Let's go to the next one. We have to find the highest sales for each product. So let's use LAST_VALUE for this. Let's have a new line.
We're going to have LAST_VALUE, again with the sales. Then we define the window. It's going to be the exact same window: we have to partition the data by the product ID and order the data by sales. Let's just copy the previous one. Let's call it, for now, highest sales. Let's go and execute it. Now if you check the results, you will see our issue again. We are not getting the highest sales for this window. The highest sales is 90, but as you can see, we are getting exactly the same values as the sales, as we explained in the previous example. In order to fix this, we're going to add the frame: ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING. Now, let's go and execute this, and let's check the result. As you can see over here, we get the highest sales correctly. For this window, the highest one is 90, for this window it's 60, and so on. With that, we have solved both parts of the task, the lowest and the highest sales. But now, I would like to give you my honest opinion about this task. I would not use LAST_VALUE to find the highest sales. Let me show you how I usually do it: I use FIRST_VALUE in order to find the last value. Let me show you what I mean. Let's add a new row. I will just take the whole thing from the lowest sales, but I'm going to change the order. That means we will not sort the data ascending, from the lowest sales to the highest sales. We're going to switch it and sort the data from the highest sales to the lowest sales. And with that, the first value is going to be the highest sales. So let me just rename it: highest sales, let's call it number two. Let's go and execute this. Now you can see over here, we get exactly the same results, because we sort the data differently and take the first value. This gives you exactly the same effect as LAST_VALUE.
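The three columns built so far (lowest sales, highest sales via LAST_VALUE with a custom frame, and highest sales via FIRST_VALUE with a flipped sort) can be sketched like this. The sketch rests on assumptions: the SQL runs through Python's built-in sqlite3 module, the table is simply called `Orders`, and the ten sample rows are made up, chosen only so that they agree with the numbers mentioned in the walk-through (four product windows, 10 to 90, 25, 60).

```python
import sqlite3

# In-memory database; window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (OrderID INTEGER, ProductID INTEGER, Sales INTEGER);
    -- Hypothetical sample rows consistent with the values mentioned in the lesson
    INSERT INTO Orders VALUES
        (1, 101, 10), (2, 102, 15), (3, 101, 20), (4, 105, 60), (5, 104, 25),
        (6, 104, 50), (7, 102, 30), (8, 101, 90), (9, 101, 20), (10, 102, 60);
""")

rows = conn.execute("""
    SELECT
        OrderID,
        ProductID,
        Sales,
        -- Lowest: first row of each product window, sorted ascending (default frame is fine)
        FIRST_VALUE(Sales) OVER (PARTITION BY ProductID ORDER BY Sales) AS LowestSales,
        -- Highest, option 1: LAST_VALUE needs a frame that reaches the end of the window
        LAST_VALUE(Sales) OVER (
            PARTITION BY ProductID ORDER BY Sales
            ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
        ) AS HighestSales,
        -- Highest, option 2: FIRST_VALUE with the sort flipped to descending
        FIRST_VALUE(Sales) OVER (PARTITION BY ProductID ORDER BY Sales DESC) AS HighestSales2
    FROM Orders
    ORDER BY ProductID, Sales, OrderID
""").fetchall()

for row in rows:
    print(row)
```

Both highest-sales columns agree on every row, which is the point of the comparison: flipping the ORDER BY avoids having to write a frame clause at all.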
As you can see, I don't have to define any frame or anything like that. I can stick with the default frame and just flip the ORDER BY. This is how you can do it as well, using only FIRST_VALUE. Now, just for the sake of this task, there's another possibility for solving this. You can use the MIN and MAX functions. Let me just take the same one again, the lowest sales. We can say: you know what, let's get the MAX. We are saying, find me the maximum sales, and we don't have to sort anything, so we can just partition the data like this. Let's give it another number. Let's go and execute it. As you can see, we get exactly the same results as the other two highest sales columns. So we can solve this task using three different functions: either use LAST_VALUE, where you have to define the frame; or use FIRST_VALUE, where you flip the ORDER BY; or simply use the MAX function in order to get the highest sales. So, guys, as you can see, we can use FIRST_VALUE and LAST_VALUE in order to find the extremes, like here in this example, the lowest and the highest sales. So there is a similarity between those two functions and MIN and MAX. Of course, what we can do with this value over here is compare it with the current sales. For example, we can extend our task and say: find the difference in sales between the current and the lowest sales. In order to do that, let me just clean up all this stuff. Let's stick with the first value and the highest value, like this. Now we have to compare the current sales, which is this field over here, the original sales, with the lowest sales, the whole expression from here. So let's go and do that. We're going to have a new line, and we simply subtract the lowest sales from the sales, like this, and let's give it the name sales difference.
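The MAX alternative and the sales difference column can be sketched together. As before, this is a sketch under assumptions: the SQL runs through Python's built-in sqlite3 module, and the `Orders` table with its ten rows is made-up sample data chosen to be consistent with the values mentioned in the lesson.

```python
import sqlite3

# In-memory database; window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (OrderID INTEGER, ProductID INTEGER, Sales INTEGER);
    -- Hypothetical sample rows consistent with the values mentioned in the lesson
    INSERT INTO Orders VALUES
        (1, 101, 10), (2, 102, 15), (3, 101, 20), (4, 105, 60), (5, 104, 25),
        (6, 104, 50), (7, 102, 30), (8, 101, 90), (9, 101, 20), (10, 102, 60);
""")

rows = conn.execute("""
    SELECT
        OrderID,
        ProductID,
        Sales,
        -- Highest, option 3: an aggregate window function, no ORDER BY needed
        MAX(Sales) OVER (PARTITION BY ProductID) AS HighestSales,
        FIRST_VALUE(Sales) OVER (PARTITION BY ProductID ORDER BY Sales) AS LowestSales,
        -- Distance of the current sales from the lowest sales of the same product
        Sales - FIRST_VALUE(Sales) OVER (PARTITION BY ProductID ORDER BY Sales) AS SalesDifference
    FROM Orders
    ORDER BY ProductID, Sales, OrderID
""").fetchall()

# Index the result by OrderID to look at single rows
by_order = {r[0]: r for r in rows}
print(by_order[8])
```

In this sample, the order with sales of 90 belongs to a product whose lowest sales is 10, so its difference is 80.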
So that's it. Let's go and execute it. Now, as you can see in the results, in this row I'm comparing the current sales, which is 90, with the lowest sales of this product, which is 10. With that we get the distance, let's say, between those two values, and it's 80. For the next one, the distance between this value and the lowest value is shorter, so we are near the lowest value. So as you can see, we can compare the current sales with one extreme in order to find the distance between two values. So this is again a very important technique for comparison analyses.

248. 6 6 win value suzmmary: All right, friends. So now let's do a quick recap of the value functions, or as we sometimes call them, analytical functions. What they do: they allow you to access a specific value from another row. This helps you do complex calculations with very simple SQL, without joining tables together or doing self joins. For the value functions, we have four types, or let's say four functions. The first one allows you to access the previous value, like the previous month, using the LAG function. The next one allows you to access the next value, like the next month, using the LEAD function. Then we have another one that allows you to access the first value in a subset, using the FIRST_VALUE function. And as another option, we can access the last value in a subset, using the LAST_VALUE function. Moving on, we have the rules of the syntax. The first point is the expression. We can use any data type: it could be a number, a string, a date, anything. Now, in order to perform those functions, we have to sort the data using ORDER BY. So ORDER BY is required; it is a must. Then for the frame: you are allowed to use it, so it is optional.
I would say always leave the frame empty, except for LAST_VALUE, where you have to customize it. Otherwise, it will not work. Now, to the next point, the use cases. We have some very important use cases for the value functions in data analytics. We can do time series analysis: as we learned, we can do month-over-month and year-over-year analyses. Those analyses are classics, and it's always the first question in data analysis in order to measure: are we growing with the business or are we declining? How is the performance of the current year compared to the previous year? So you can see we are always doing comparisons using those window functions. The next use case is about time as well: we can do time gap analysis, as when we analyzed the customer behavior, the customer retention, where we calculated the average days between two orders. The last use case is about comparison analysis as well. We can use the value functions in order to compare the current value with an extreme, like comparing the current sales with the highest sales or the lowest sales. So my friends, those analyses are essential in data analysis. You will encounter them in every company. In every business, you have to answer those questions, and you can do that very easily using the SQL window functions.

249. 8 1 intro case : Friends, now we're going to learn how to build conditional logic in SQL using the CASE statement. We're going to start with the basics, like understanding how it works, the syntax, and how SQL executes the CASE statement behind the scenes. After that, I'm going to show you many use cases for the CASE statement that I use in my projects. So let's start with the first question: what is a CASE statement? The CASE statement allows you to build conditional logic in your SQL query by evaluating a list of conditions one by one and returning a value when the first condition is met.
So now let's understand the syntax of the CASE statement and what this means.

250. 8 2 syntax case: Now let's see the syntax step by step. It starts with the keyword CASE. This CASE indicates that we are starting conditional logic in SQL. It's like programming languages, where you start with IF; the IF is the keyword of the logic. The whole logic ends with another keyword called END. Once SQL sees the END, this is the end of the conditional logic. So CASE is the start and END is the end. Now what we have in between is the conditional logic. The conditional logic starts with the keyword WHEN. Now we are telling SQL that we have a condition to be evaluated, and then we specify the condition. We have to tell SQL what happens if this condition is fulfilled, so we use another keyword called THEN. Now we are telling SQL: show this result if the condition is true. As you can see, it's very simple. It's like natural language, like English: when condition one is met, then show the result. It's very logical. Now of course, we can add a second condition inside our CASE statement, with the same setup: when condition two is true, then show result number two. We specify the keyword WHEN, then we have a second condition, and if this condition is true, we tell SQL to show another result. Of course, it's very important to understand in the syntax that SQL processes the conditions from top to bottom. So the first, most important condition should be at the start. SQL first checks this condition. If it fails and is not true, then it jumps to the second condition. The order of the conditions is very important in your logic. Now of course we can add multiple conditions, depending on the logic, using the keyword WHEN. And once we are done defining all the conditions, we can specify an ELSE keyword.
The ELSE introduces the default value, and it is optional; you can skip it. The value of the ELSE, the default, is used only if all the conditions fail. That means if all our conditions are not true and nothing is fulfilled, then SQL uses the value from the ELSE. So it is the default value that is used if all conditions are false. So those are the keywords that you must use inside each CASE statement: CASE, WHEN, THEN, and END. Only the ELSE is optional, so you can use it or skip it. This is the main structure and syntax of each CASE statement.

251. 8 3 howitworks: Now, let's have a very simple example in order to understand how SQL executes the CASE statement behind the scenes. All right. Let's have this very simple example where we have only one condition. As you can see in the syntax, it starts with CASE and ends with END, and in between we have only one condition, and here we are evaluating the sales. The condition says: if the sales is higher than 50, then show the value High as the result. It's very simple, only one condition, and on the right side we have a flowchart in order to understand how the logic is executed. Now, what we're going to do is evaluate those four sales values through this logic and see what the output of the CASE statement is going to be. Let's do it one by one. Let's start with the first sales value. It is 60. So here we check: is 60 higher than 50? Well, yes. That means the sales meets this condition, we get true, and in the output we get the value High. That means the first sales value fulfills the condition, and SQL gives us the value from this condition. All right. Now SQL goes to the next value, and we start evaluating the 30. We ask the same question, the same condition: is 30 higher than 50? Well, no.
That means for this condition we get false, so we take the path of the false. Now, if you take the path of the false, we don't get any value, right? That means the output is going to be a NULL. So the output for the 30 is NULL. And that's because we didn't define anything about the default option in our logic; we don't have an ELSE here. And this is what happens if you don't use ELSE: you get a NULL in the output of the CASE statement. Now let's move to the next one. It's the same thing: 15 is smaller than 50, so it's not fulfilling the condition, and we get a NULL as well. And for the last one, since it's NULL, we get a NULL as well, since it will not fulfill the condition. So after evaluating all those sales values, only the first one fulfills the condition, and that's why we have only one value, the High. All right. So now let's keep moving and add more to our CASE statement. Now we are adding a second condition. It says: after checking whether the sales is higher than 50, and that fails, check again whether the sales is higher than 20. If yes, then show the value Medium. So in our workflow, we are adding a second condition to be checked if the first one is false. Now let's evaluate our sales again and check the output. The first one is the 60. As you can see, 60 is higher than 50, so we are fulfilling the first condition. That's why we get the value High, the same as before. So here we get High in the output. Now here it's very important to understand one thing: in this scenario, SQL didn't evaluate the second condition. SQL didn't waste any time checking the other condition. It skipped everything once it got a true from one condition. This is exactly how SQL processes the CASE.
It checks each condition from top to bottom, and once it finds a true one, it stops everything immediately and shows the value from this condition, and it will not evaluate any other conditions. Now SQL jumps to the next value, the 30. Let's evaluate the conditions. Is 30 higher than 50? Well, it's not, so it's false. Now what happens? SQL jumps to the next condition and starts evaluating the second one, whether it's true or false. So we check here: is 30 higher than 20? Well, yes. The condition is fulfilled and we get the value Medium. SQL stops everything and shows Medium in the output for this value. In this scenario, we have evaluated both of the conditions that we have in the CASE statement. Now it goes to the third one. We have 15. Is 15 higher than 50? Well, no. We get false for the first condition. Then we jump to the second condition and check: is 15 higher than 20? Well, no as well. Now what happens? We take the false path here and we don't get any value in return, so we get NULL in the output. Now for the last one, we have NULL, and we get NULL as well, because it does not fulfill any of those conditions and because we didn't define an ELSE in the CASE statement. So if we define the conditions like this, we get the category Medium for the 30. This is how SQL evaluates multiple conditions in the CASE statement. All right, now we go to the final form of our CASE statement, and we add an ELSE, so we have a default value. We are saying here: if the sales is not higher than 50 and not higher than 20, then show the default value Low. That means any sales value that is equal to or smaller than 20 gets the value Low. Now, very interesting: if you check the workflow over here, you can see that we now have a value for each path.
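The two-condition CASE without an ELSE and the final form with an ELSE can be compared on the four sales values from the example (60, 30, 15, and NULL). This is a minimal sketch under assumptions: the SQL runs through Python's built-in sqlite3 module, and the table name `Demo` is made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Demo (Sales INTEGER);
    INSERT INTO Demo VALUES (60), (30), (15), (NULL);
""")

rows = conn.execute("""
    SELECT
        Sales,
        -- Two conditions, no ELSE: rows that match nothing produce NULL
        CASE
            WHEN Sales > 50 THEN 'High'
            WHEN Sales > 20 THEN 'Medium'
        END AS WithoutElse,
        -- Same conditions plus a default value
        CASE
            WHEN Sales > 50 THEN 'High'
            WHEN Sales > 20 THEN 'Medium'
            ELSE 'Low'
        END AS WithElse
    FROM Demo
    ORDER BY rowid
""").fetchall()

for row in rows:
    print(row)
```

Python shows SQL NULL as `None`. The conditions are checked from top to bottom, so 60 never reaches the second WHEN, and a NULL sales value satisfies neither comparison, so it falls through to the ELSE when there is one.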
For the first condition, we get High, for the second one Medium, and if nothing is fulfilled, we always get the value Low. So there is no way in this chart to get any NULLs, right? So let's evaluate our values again. I think you already get it. The 60 fulfills the first condition, and SQL stops everything immediately and just shows the value High. So on the right side over here, nothing gets evaluated because the first condition is true. Here in the output we get the value High. Nothing changed compared to the two previous examples. Now SQL goes to the next value, the 30. We evaluate the first condition; it's false. The next one, is it higher than 20? It is true, and that's why SQL shows the value Medium, the same as in the previous example. Now SQL moves to the next one, and here things get interesting: the value 15. We evaluate the first condition: is it higher than 50? Well, no. Is it higher than 20? Well, no. Now we are in a scenario where none of those conditions is true. That's why SQL executes the ELSE. If you check our chart, it's false and we get the value Low. So in the output, this time we don't get a NULL; because we have an ELSE, we get the value Low. The same thing now for the NULL. NULL does not fulfill the first condition or the second condition, and that's why we get the value from the ELSE as well. So here in the output, we get the value Low as well. So now, as you can see, if you use an ELSE inside the CASE statement, you make sure that there will be no NULLs in the output. With that, you have learned the different options that we have inside the CASE statement and how SQL executes the CASE behind the scenes.

252. 8 4 usecase 1: All right, friends.
So now we come to the part where I'm going to show you the most useful use cases of the CASE statement that I usually use in my projects. So let's start. The main purpose of the CASE statement is to do data transformations. Data transformation is a very important process in every data project. And one very important task in data transformation is that we can generate new information: we can create new columns based on the existing data that we have in the database using the CASE statement. This, of course, helps us derive new information for our analyses without modifying the source database, only for analytics. So my friends, the main purpose of the CASE statement is to do data transformations by creating and generating new columns. Now let's start with the first use case, the most important and famous one: we use the CASE statement in order to categorize the data. This means we group the data into different categories based on certain conditions. Now you might ask why this use case is important. Well, classifying and grouping data is fundamental in data analysis and reporting, because it makes the data easier to understand and to track. But what's more important, it helps us aggregate the data based on the categories. All right. Now let's have the following task. It says: generate a report showing the total sales for each of the following categories: High if the sales is over 50, Medium if the sales is above 20 and up to 50, and Low if the sales is 20 or less; and sort the categories from the highest sales to the lowest. Let's do it step by step. Before we do any data aggregation, we have to create a new column called category, because we don't have it in the database. Now let's start with a very simple SELECT statement. What do we need? Let's take the order ID and the sales, and that's it for now. So, from sales orders. Let's go and execute it.
And now we have our ten orders, and we have to create a new column called category, and we're going to do that using the CASE statement. So let's take a new line and start with CASE, and then again a new line in order to define the first condition using WHEN. The first condition is the High, where the sales is over 50, so it's very simple: WHEN the sales is higher than 50. What happens if this is true? We want to show the value High. So this is the first condition. Then let's move to the second one. If the sales is higher than 20, which at this point means it's at most 50 and higher than 20, then we want to see the value Medium. Now for the last category, the Low, we don't have to create a condition, because if those two fail, then the sales is either equal to 20 or less. So we just use a simple ELSE and show the value Low. Like this; let me make this a little bit smaller. Now what is missing in our CASE is, of course, the END. Without it, you're going to get an error. So END, and let's give it the name category. We are ready. Let's go and execute it. Now let's check a few rows at random. As you can see here, we have the sales of 15. It is Low, which is correct. Then here we have 60; it's above 50, and we have the category High. Now if you check order number six, we have sales of 50, and it's Medium because it is not higher than 50; it falls between 20 and 50. So as you can see, we have now classified our orders using the category. In the next step, we're going to aggregate the data. How are we going to do that? We will use a subquery. Let's do it like this. We're going to SELECT, and of course we're going to group the data by the category, so we select the category, and we need the total sales, which means we use the SUM function on the sales, and we're going to call it total sales.
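The classification above, together with the report that the task statement asks for (total sales per category, sorted from highest to lowest), can be sketched as follows. It is a sketch under assumptions: the SQL runs through Python's built-in sqlite3 module, the table is simply called `Orders`, and the ten rows are made-up sample data chosen to be consistent with the values mentioned in the lesson.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (OrderID INTEGER, ProductID INTEGER, Sales INTEGER);
    -- Hypothetical sample rows consistent with the values mentioned in the lesson
    INSERT INTO Orders VALUES
        (1, 101, 10), (2, 102, 15), (3, 101, 20), (4, 105, 60), (5, 104, 25),
        (6, 104, 50), (7, 102, 30), (8, 101, 90), (9, 101, 20), (10, 102, 60);
""")

# The inner query adds the Category column; it is reused as a subquery below.
classify_sql = """
    SELECT
        OrderID,
        Sales,
        CASE
            WHEN Sales > 50 THEN 'High'
            WHEN Sales > 20 THEN 'Medium'
            ELSE 'Low'
        END AS Category
    FROM Orders
"""

classified = conn.execute(classify_sql + " ORDER BY OrderID").fetchall()

# Outer query: aggregate by the new column and sort by the total, highest first.
report = conn.execute(f"""
    SELECT Category, SUM(Sales) AS TotalSales
    FROM ({classify_sql}) AS t
    GROUP BY Category
    ORDER BY TotalSales DESC
""").fetchall()

for row in report:
    print(row)
```

Order 6, with sales of exactly 50, lands in Medium because the first condition is strictly greater than 50.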
Now we have to nest the queries together: FROM, then this is our query like this, and then we have to close it and add the GROUP BY. So we are grouping by the category. With that, we are now aggregating the sales by category. It's very simple. Let's go and execute it. Now in the result, we have only three categories. We don't have the ten orders, because now we are doing data aggregation; the granularity is now at the level of the category. Now we can see the total sales: for High it is 210, for Low we have 65, and for Medium we have 105. Of course, we are not done yet, because the task says: sort the categories from the highest sales to the lowest. That means we have to use an ORDER BY at the end, and we're going to sort the data by the total sales from the highest to the lowest, which means descending. So that's it, let's go and execute. With that, we have our report. We are showing the total sales by category, and the data is sorted from the highest to the lowest. The highest category is High, then Medium, and the last one is Low. My friends, as you can see, with the help of the CASE we have created new information from our data, the category, and then we have created insights, a report, based on this new information, where we have aggregated our data. The use case of categorizing data using CASE statements is fundamental and very important in every data project.

253. 8 5 Rules: Okay. So now, one more thing before we jump to the next use case: there is one rule to follow if you are using CASE statements. And that is, the data types of the results must match. What does this mean? If we check our example over here again, we can see that the result of each condition is a string. We have High, Medium, and Low here, and all of those values have the same data type, so it is correct. Now let me break this rule. For example, after this THEN, let's have the value 2.
So now we have a number, and we have characters. Let's go and execute it. Now, of course, we get an error, because SQL is trying to convert the value Low to an integer, which is not possible. So the data types of the results must match, and that includes not only the value after the THEN but also the value after the ELSE, because this value is part of the output as well. So let's put Medium back here again. Now, let's change this one to, let's say, 1. Let's go and execute it. Again, SQL throws an error, because this is an integer and the others are strings. So this is the rule for using the CASE statement: the data types after THEN and after ELSE must match. And if you ask me whether there's any restriction on where you can use the CASE statement, in which clauses: you can use it everywhere, in SELECT, in JOINs, in WHERE, GROUP BY, ORDER BY, everywhere. So there are no restrictions, and we have only this one rule.

254. 8 6 usecase2: Okay, friends. Another use case for the CASE statement: we can use it in order to map values. We can use the CASE statement to transform the data from one form to another in order to make it more readable and more usable for analytics. One scenario for mapping values is that sometimes the database developers store the values inside the database as codes and flags. For example, the status of an order could be stored as 1 and 0 instead of active and inactive. This is one technique for optimizing the performance of the database for the application, because storing 1 and 0 is more compact and faster than storing the whole string. But in data analysis, we usually generate reports to be read by humans. So instead of showing the data as 0 and 1, it's nicer and more readable if you show the data as active and inactive.
For these scenarios, we're going to use the CASE statement to translate those cryptic, technical values into readable terms. Otherwise, everyone who consumes your report is going to ask you what you mean by the 0 and 1. Let's have the following task: retrieve employee details with gender displayed as full text. Now let's go and solve it. First, we're going to explore some of the information. Let's show the employee ID, and let's take the first name, the last name, and we need the gender information, so gender, from Sales.Employees. That's it. Let's go and execute it. Now, as you can see in the results, we've got our five employees, and the gender information is stored as only one character, F and M. Of course, it's easy to understand that F is female and M is male, but we would like to show it in the report as full text, Female and Male, instead of those abbreviations. In order to do that, we're going to use the CASE statement to do the mapping between the old value and the new value. Let's create a new column using CASE. We're going to have two conditions here because we have two values. Let's start with the first one, so we're going to have a new line and WHEN: when the gender is equal to F, this is the first, then Female. For the second value it's going to be exactly the same: when gender is equal to M, then we're going to have Male. Be careful with the case sensitivity of the values. Of course, we will not end this without an ELSE, where we can have the default value. We can have the default value not available; it's better than having nulls. What we are missing is the END, so we're going to have an END over here, and we're going to call it gender full text. So that's it, let's go and execute it. Now, if you check the results, we have done the mapping between the old format of the value and the new format. So instead of F and M, we have Female and Male.
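The mapping described above can be sketched as follows. The inline rows, the names, and the snake_case column names are made up for illustration (the course table has its own data); the third row has a NULL gender only to show when the ELSE default is used.

```sql
-- Hypothetical sample rows standing in for the employees table.
WITH employees AS (
    SELECT 1 AS employee_id, 'Anna' AS first_name, 'F' AS gender UNION ALL
    SELECT 2, 'Ben', 'M' UNION ALL
    SELECT 3, 'Chris', NULL
)
SELECT
    employee_id,
    first_name,
    gender,
    CASE
        WHEN gender = 'F' THEN 'Female'
        WHEN gender = 'M' THEN 'Male'
        ELSE 'Not Available'   -- default for NULLs and unexpected codes
    END AS gender_full_text
FROM employees;
```

Whether 'f' would also match 'F' depends on the collation of the database, which is why it is safest to write the values exactly as they are stored.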
Of course, we don't have any nulls here, which is why we don't see not available in the data, but if you have huge data you are of course going to have a null somewhere, and then you will get this default value. This is how you can map values very easily using CASE statements. Let's have another task for the mapping use case: retrieve customer details with abbreviated country code. Sometimes, as we are generating reports, maybe using Power BI or Tableau, we don't have enough space to use the full names of values. What do we need? We need abbreviations, a short form of the values, and in SQL we can use the CASE statement to map the full value to an abbreviated value. It's like the previous example, but the other way around. All right, now let's go and solve it. We're going to select a few details like the customer ID, and let's take the first name and last name. And what do we need? We need the country information, from Sales.Customers. That's it. Let's go and execute it. As you can see, we get our five customers, and we have the country information as a full name. Now, of course, for the reports we need abbreviated values, so we're going to map those full country names to a short form. But in real projects you might get big tables with thousands or millions of records, so you cannot just check the values like this. How I usually do it: I retrieve a distinct list of all values from the column, and I usually have a separate query for that. So we're going to have SELECT DISTINCT country from the table Sales.Customers. It's just for me to see all the possible values inside the database. Now you see in the second result over here that we have only two values, Germany and USA, and then I can map the data correctly. Whenever you are mapping data using CASE WHEN, you have to understand all the possible values that you have inside the table.
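The exploration step described above can be sketched as a small helper query. The inline rows are hypothetical and only stand in for the customers table.

```sql
-- Hypothetical sample rows standing in for the customers table.
WITH customers AS (
    SELECT 1 AS customer_id, 'Germany' AS country UNION ALL
    SELECT 2, 'USA' UNION ALL
    SELECT 3, 'Germany' UNION ALL
    SELECT 4, 'USA'
)
-- List every distinct value once, so that each one can get a WHEN branch.
SELECT DISTINCT country
FROM customers;
```

On a large table this short list is much easier to review than the raw rows, and it also reveals spelling variants that would otherwise fall through to the ELSE branch.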
So let's go and generate this new information. It starts with CASE, and then on a new line, WHEN country is equal to the first value, which is going to be Germany. Make sure you write it exactly as it is in the database: the first character is capital and the rest is lowercase. So what happens then? We're going to have the abbreviation of Germany, which is going to be DE. So this is for the first value. Let's move to the second one: it's going to be country equal to USA. It's already abbreviated, but maybe we can take only two characters, so US, like this. Now let's add an ELSE. It's optional, but in case we have nulls in the data, we get a value. So ELSE, not available. That's it, and never forget about the END, and the name is going to be country abbreviation. Let me just get rid of the other query. The mapping is correct. Let's go and execute it. Now if you check the results, we've got a new column called country abbreviation, and as you can see, the mapping is working: here we have Germany with DE, and for USA we have US. With that, we have solved the task and done the mapping correctly between the old value and the new value.

255. 8 7 quickform: All right, friends, now there is a special case in the syntax of CASE statements if you are using them for mapping values. Let's go and check it. Let's say that we have a lot of different distinct values inside the country column, not only two. If you are mapping the values using CASE WHEN, you're going to end up always writing the same thing: country equals Germany, country equals India, country equals United States, and so on. We are always using the column country. The conditions over here always use one column, and the operator is always equals. Only for this scenario, we have another syntax for the CASE statement, and it looks like this.
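A sketch of both syntaxes side by side, applied to the country mapping from the previous task. The inline rows, the column aliases, and the 'n/a' default are assumptions for illustration.

```sql
-- Hypothetical sample rows standing in for the customers table.
WITH customers AS (
    SELECT 1 AS customer_id, 'Germany' AS country UNION ALL
    SELECT 2, 'USA' UNION ALL
    SELECT 3, NULL
)
SELECT
    customer_id,
    country,
    -- Full form: every WHEN carries a complete condition.
    CASE
        WHEN country = 'Germany' THEN 'DE'
        WHEN country = 'USA'     THEN 'US'
        ELSE 'n/a'
    END AS country_abbr,
    -- Quick form: the column is named once, each WHEN lists only a value.
    CASE country
        WHEN 'Germany' THEN 'DE'
        WHEN 'USA'     THEN 'US'
        ELSE 'n/a'
    END AS country_abbr_2
FROM customers;
```

Both columns return the same values for every row, including the NULL country, which falls through to the ELSE in either form.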
We start with the keyword CASE, but immediately after that we put the column that we want to evaluate. Here you can use only one column; you cannot use multiple columns. We are telling SQL that we are now evaluating one column, the country. Then for each condition we have the following: we say WHEN Germany, which means when country is equal to Germany, THEN DE. As you can see, we don't have the whole condition here, only a possible value that can appear in the country column. We are asking: is the value of country Germany? If that's true, then show DE. The next one: is it India? Then IN. United States, US, and so on. We call this syntax the quick form of the CASE statement, and the one on the left side we call the full form. Of course, the restriction of the quick form is that you can use only one column, and only with the equals operator. That means you can use the quick form only for these scenarios. If things get a little more complicated and you have to mix conditions and build complex logic, you cannot use the quick form. I would say that if you are sure the logic will not get complicated and you can always stay with the same column, you can go with the quick form. But I would recommend always going with the full form, for one simple reason: if you add even one small piece of logic, you have to rewrite the whole CASE statement back to the full form. Of course, there is nothing wrong with using the quick form if the logic stays static, you are sure you are using only one column, and you are just doing mapping with no extra logic. Now let's write the quick form of the CASE statement for the previous example. I will just copy everything to a new column and rename it with a 2. Now, how are we going to do it?
It's going to be CASE, but this time we write country right after it, and then inside the WHEN we have only the values, so there is no need for the condition. It's going to be like this. That's it. As you can see, it's shorter and quicker than writing the whole condition each time. Now let's go and execute this. As you can see in the result, we get identical values. Now we know one more trick for the CASE statement.

256. 8 8 usecase3: All right. Moving on to the next use case for CASE statements: we can use them to handle nulls. Handling nulls means replacing a null with a value. As we learned before with the window aggregate functions, nulls sometimes lead to incorrect calculations and results, which leads to wrong decision making. Later we're going to have a dedicated chapter on how to handle nulls in SQL, but now we're going to learn how to handle nulls using CASE statements. So let's have the following task: find the average score of customers and treat nulls as zero, and additionally provide details such as the customer ID and the last name. Okay, now let's solve it step by step. Again, we have details here, and we have to do aggregation as well. That means we have to use window functions, and we must not forget that we have to treat the nulls, so we have to handle them. Let's start with a very simple SELECT: select the customer ID, we need the last name, and we need the scores as well, from Sales.Customers. Let's go and execute it. As usual, we have our five customers and the scores, and here we have a null. Now we're going to write the window function, but without handling the nulls, just in order to see the difference. We need the average function. For what? For the scores. Do we have to partition the data? Well, no, so we're going to leave it empty. We need the average score of all customers. That's it, let's give it a name and then execute it.
I think I have a mistake: it is score, not scores. So now, as you can see, we have an average of 625. As you learned before, SQL is going to sum all those four values and divide by four. But our business understands the nulls as zero, not as missing information, so we have to handle the null. Let's create a new column for the scores, but this time we're going to use the CASE statement. It's going to be very simple. We say: WHEN the score IS NULL, THEN 0. In SQL we don't write equals null; we say IS NULL. With that, we are replacing the nulls with zero. Now, what happens otherwise? If it's not null, we need the score as it is. We should not manipulate anything. So the default value is the score itself, if the score is not null. Now let's end it and call it score clean. Let's go and execute it. If you check the result over here, it's almost identical to the score. We don't have new values for the scores; only the nulls are now zero. All other values are not affected: we didn't touch them, we didn't transform them at all. This is what we mean by handling nulls: replacing nulls with another value. Now, in order to finish the task, we have to take the average of score clean and not of the original score. How are we going to do it? Let's copy the whole CASE statement. I'm just going to do it in another column. Let's have an average, and inside it we have the CASE statement, like this. Let me just arrange it like this. What is missing now is the OVER, and it's going to be empty. Let's call it average customer clean. This is the logic. Let me just make everything smaller. As you can see, it's exactly like the previous one, but instead of using the original score, we are now using the column that we created. But of course, we don't need the AS over here, so we have to remove it.
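Put together, the steps above might look like the sketch below. The inline rows, the names, and the snake_case aliases are assumptions for illustration, and the scores are chosen so that both averages come out as whole numbers. The query needs a database with window function support (for example SQL Server, PostgreSQL, MySQL 8 or later).

```sql
-- Hypothetical sample rows standing in for the customers table.
WITH customers AS (
    SELECT 1 AS customer_id, 'Lee' AS last_name, 300 AS score UNION ALL
    SELECT 2, 'Kim', 900 UNION ALL
    SELECT 3, 'Ray', NULL
)
SELECT
    customer_id,
    last_name,
    score,
    -- Replace NULL with 0, keep every other score untouched.
    CASE WHEN score IS NULL THEN 0 ELSE score END AS score_clean,
    -- AVG skips NULLs: (300 + 900) / 2
    AVG(score) OVER () AS avg_customer,
    -- The same CASE inside AVG, without the alias: (300 + 900 + 0) / 3
    AVG(CASE WHEN score IS NULL THEN 0 ELSE score END) OVER () AS avg_customer_clean
FROM customers;
```

With these sample rows the plain average is 600 and the cleaned one is 400, the same kind of gap as in the lesson's result. The empty OVER () makes the window cover all rows, so every customer row carries the overall average next to its details.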
So it starts with CASE. Let's go and execute it, and now you can see in the output that we got a new value for the average, and it is more accurate for the business. Now we have 500; previously we had 625. As you can see, you have to understand what nulls mean in your business and handle them correctly. Otherwise, you will get wrong results. That's it: we used CASE statements to handle the nulls inside our data.

256. 8 9 usecase4: Conditional aggregation means we can apply an aggregate function in SQL, like SUM, AVG, or COUNT, but this time only on a subset of the data that meets specific conditions. This technique is great for deep-dive analysis or targeted analysis on a specific subset of the data. So let's have the following SQL task to understand this use case. The task says: count how many times each customer has made an order with sales greater than 30. All right. As usual, we can do it step by step. What do we need? We need the orders, so let's get the order ID, and let's get the customer ID as well, and the sales, from Sales.Orders. Let's go and execute it. What else am I going to do? I'm going to order the data by customer ID. Let's execute it again. Okay. The task sounds easy, but it's a little bit tricky. We have to count the number of orders for each customer where the sales is higher than 30. Let's have an example: customer number one. The total number of orders is three, right, but we have to count only the orders where the sales is higher than 30. In this example, we have only one order where the sales is higher than 30, order number four. So the count for customer ID number one should be one. Now let's check another customer, for example number two. As you can see, we have three orders, but none of them has sales higher than 30, so the count here should be zero. How are we going to do that?
We have to flag each row according to whether it's higher than 30 or not. If it's higher than 30, it gets the flag 1. If it's less than or equal to 30, it gets 0, and then we're going to sum up all those flags in order to get the count. Let's do it step by step. Let's first create the flag. We're going to use CASE, and our condition is very easy. We say WHEN, and what is the condition? Sales greater than 30. THEN what happens? We flag it with 1, because later we're going to sum up the ones. Now ELSE: if it's not higher than 30, so equal to 30 or less, it gets 0. All right, now let's END it and call it sales flag. Let's go and execute it and check the results. If you check the results, we now have a very nice flag showing which orders have sales higher than 30. For example, let's take customer ID number one. As you can see, only order number four has sales higher than 30 and is flagged with 1, and all the others are 0. Now let's take customer ID number three: as you can see, we have two orders where the sales is higher than 30, and we have the 1 twice. We can use this flag to do the aggregation. If you sum up the flag for customer ID number three, you get two, which is the count of orders where the sales is higher than 30. Let's take another example, customer ID number two: we have 0 everywhere, and if we sum up those values we get zero, which is the correct count of orders where the sales is higher than 30. So as you can see, first we built an extra column to help us do the aggregation, and in the next step we're going to aggregate this column. Let's go and do that. We don't need all of this information.
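The flag and the aggregation it is building towards can be sketched in one query. The inline rows are hypothetical and only mimic the pattern described above (one qualifying order for customer 1, none for customer 2, two for customer 3), and the alias high_sales_orders is an illustrative name.

```sql
-- Hypothetical sample rows standing in for the orders table.
WITH orders AS (
    SELECT 1 AS order_id, 1 AS customer_id, 20 AS sales UNION ALL
    SELECT 2, 1, 60 UNION ALL
    SELECT 3, 1, 30 UNION ALL
    SELECT 4, 2, 10 UNION ALL
    SELECT 5, 2, 25 UNION ALL
    SELECT 6, 3, 50 UNION ALL
    SELECT 7, 3, 60
)
SELECT
    customer_id,
    -- The CASE is the row-level flag (1 when sales > 30, otherwise 0);
    -- summing the flag per customer counts only the qualifying orders.
    SUM(CASE WHEN sales > 30 THEN 1 ELSE 0 END) AS high_sales_orders
FROM orders
GROUP BY customer_id
ORDER BY customer_id;
```

With these rows, customer 1 gets 1, customer 2 gets 0 and customer 3 gets 2. Note that an order with sales of exactly 30 is flagged with 0, because the condition is strictly greater than 30.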
We can drop the order ID, but we need the customer ID because it is the granularity of the aggregation, and let's remove the ORDER BY. Now let's group the data by customer ID. But of course, we need the aggregate function. How are we going to do it? We're going to sum up the whole flag. And of course we're going to rename this, since it is now an aggregated column, so we're going to call it total orders. Let's go and execute it, and let's check the result. As you can see, we have our four customers. For customer ID number one, we count only one order higher than 30. The second one has no orders higher than 30. The third has two, and the last one has one. With that, we have solved the task. Now I would like to add one more thing to our query in order to see the normal aggregation next to the conditional aggregation. Usually we count, for example, the star in order to get the total orders, so let's rename the previous one to high sales. Let's go and execute it. Here we are just doing an aggregation without any conditions, and we can see how many orders each customer made. We can see that customer ID number one ordered three times, but only one order was higher than 30. This is a normal aggregation, and this is a conditional aggregation using the CASE statement.

256. 8 10 summary: All right, friends, now let's do a recap of the CASE statement. CASE statements evaluate a list of conditions one by one and return a value as soon as the first condition is met. As for the rules of using CASE statements, we have only one: the data types of the results after THEN and ELSE must match. And if we talk about the use cases of CASE statements, the main use case is data transformation, especially creating new columns and deriving new information. As we saw, there are great use cases for CASE statements. For example, we can use them to categorize our data.
As we learned, we can create new groups of data that are then aggregated for our reports. Then we saw that another use case is mapping values. We can use the CASE statement to help us map the cryptic, technical values that are stored in databases to new values that are more readable and friendlier to use. The next use case that we learned is handling nulls. We can use the CASE statement to replace nulls with a value to make our aggregations more accurate. The last use case that we learned, and I think the most used one in my projects, is conditional aggregation, where we can aggregate a subset of the data that meets specific conditions in order to do focused and targeted analysis. All right, so as you can see, the CASE statement is a very powerful tool for creating conditional logic, and it's great for deriving and generating new information for analysis. In the next chapter, we're going to learn all the functions and techniques for handling nulls in SQL. It's very important to clean up our data before doing any data analysis.