6 Essential Skills to Make A Great BI Analyst Series 2 | KAIMA LM | Skillshare

6 Essential Skills to Make A Great BI Analyst Series 2

KAIMA LM, BI/Data Scientist

27 Lessons (2h 8m)
    • 1. Introduction: What does this course cover?

    • 2. Understanding Population and Sample Data Copy

    • 3. Various Types of Data and Levels of Measurement

    • 4. Visualisation Techniques for Categorical and Numeric Variables

    • 5. Calculating Measures of Central Tendency Copy

    • 6. Calculating the Measures of Asymmetry

    • 7. How to quantify Variability

    • 8. Standard Deviation and Coefficient of Variation

    • 9. Measures of the Relationships between two Variables

    • 10. Correlation Coefficient

    • 11. Distribution

    • 12. Normal Distribution

    • 13. Standardisation

    • 14. Central Limit Theorem

    • 15. Standard Error

    • 16. Estimators and Estimates

    • 17. Confidence Intervals

    • 18. Confidence Intervals Clarification and Student's T Distribution

    • 19. The Null and Alternative Hypothesis

    • 20. Establishing a Rejection Region

    • 21. Type I and Type II Error

    • 22. Test for the Mean Population Variance Known

    • 23. What is the P-Value and Why is it an Important Tool in Statistics

    • 24. Test for the Mean Population Variance Unknown

    • 25. Test for the Mean Dependent Samples

    • 26. Test for the Mean Independent Sample Part 1

    • 27. Test for the Mean independent Sample Part 2

About This Class

Hi! Welcome to our Business Intelligence Analyst Course Series. This course covers the six essential skills you need to become a great BI analyst.

We are excited to present you a course series that stands out.

This BI program is different from the rest of the materials available online.

These are the precise technical skills recruiters are looking for when hiring BI Analysts. And today, you have the chance of acquiring an invaluable advantage to get ahead of other candidates. This course will be the secret to your success. And your success is our success, so let’s make it happen!  

  • Introduction to Data and Data Science  
  • Statistics and Excel  
  • Database theory  
  • SQL  
  • Tableau  
  • SQL + Tableau  

Here are some more details of what you get with The Business Intelligence Analyst Course:   

  • Introduction to Data and Data Science – Make sense of terms like business intelligence, traditional and big data, traditional statistical methods, machine learning, predictive analytics, supervised learning, unsupervised learning, reinforcement learning, and many more.
  • Statistics and Excel – Understand statistical testing and build a solid foundation. Modern software packages and programming languages automate most of these activities, but this part of the course gives you something more valuable: critical thinking abilities.
  • Database theory – Before you start using SQL, it is highly beneficial to learn the underlying database theory and understand why databases are created and how they help us manage data.
  • SQL – When you can work with SQL, you don’t have to rely on others to send you data and execute queries for you. You can do that on your own. This allows you to be independent and dig deeper into the data to answer questions that might improve the way your company does business.
  • Tableau – One of the most powerful and intuitive data visualization tools available. Almost all large companies use such tools to enhance their BI capabilities. Tableau is the #1 best-in-class solution that helps you create powerful charts and dashboards.
  • SQL + Tableau – Learning a programming language is meaningless without putting it to use. That’s why we integrate SQL and Tableau and perform several real-life Business Intelligence tasks.

Sounds awesome, right? Our courses are unique because we:

  • Work with real-life examples  
  • Provide easy to understand and complete explanations  
  • Create beautiful and engaging animations  
  • Prepare exercises, course notes, quizzes, and other materials that will enhance your course taking experience  
  • Be there for you and provide support whenever necessary  

We love teaching and we are really excited about this journey. It will get your foot in the door of an exciting and rising profession. Don’t hesitate and subscribe today. The only regret you will have is that you didn’t find this course sooner!

What you’ll learn

  • Become an expert in Statistics, SQL, Tableau, and problem solving
  • Boost your resume with in-demand skills
  • Gather, organize, analyze and visualize data
  • Use data for improved business decision-making
  • Present information in the form of metrics, KPIs, reports, and dashboards
  • Perform quantitative and qualitative business analysis
  • Analyze current and historical data
  • Discover how to find trends, market conditions, and research competitor positioning
  • Understand the fundamentals of database theory
  • Use SQL to create, design, and manipulate SQL databases
  • Extract data from a database writing your own queries
  • Create powerful professional visualizations in Tableau
  • Combine SQL and Tableau to visualize data from the source
  • Solve real-world business analysis tasks in SQL and Tableau

Are there any course requirements or prerequisites?

  • No prior experience is required. We will start from the very basics
  • You’ll need to install MySQL, Tableau Public, and Anaconda. We will show you how to do it step by step
  • Microsoft Excel 2003, 2010, 2013, 2016, or 365

Who this course is for:

  • Beginners to programming and data science
  • Students eager to learn about job opportunities in the field of data science
  • Candidates willing to boost their resume by learning how to combine the knowledge of Statistics, SQL, and Tableau in a real-world working environment
  • SQL Programmers who want to develop business reasoning and apply their knowledge to the solution of various business tasks
  • People interested in a Business Intelligence Analyst career

Meet Your Teacher

BI/Data Scientist




My name is Karima and I am super-psyched that you are reading this!

Professionally, I am a Business Intelligence Analyst with over five years of experience in e-commerce, retail, information technology, and other industries. Today I leverage Big Data to drive business strategy, revamp customer experience and revolutionise existing operational processes.

I combine my real-life experience and academic background to deliver professional step-by-step coaching in the space of Data Science. I am also passionate about regularly presenting Big Data to individuals and groups with no background knowledge in the field.

To sum up, I am absolutely and utterly passionate about Data Science and I am looking forward to sharing my passion and knowledge ...

1. Introduction: What does this course cover?: Hi, and welcome to our Business Intelligence Analyst course series. This course covers the six essential skills you need to make a great BI analyst. My name is Karima, and I am a business intelligence analyst. I have spent a significant amount of time working on business intelligence tasks requiring skills like statistics, Excel, database theory, Tableau, Python, the combination of SQL and Tableau, and even the combination of SQL with Tableau and Python. I'm really excited to present to you a course series that stands out, because each series teaches you a standalone skill you need as a BI analyst, and by the end of each series you will know how to apply it in a real-world working environment.

The course curriculum that has been prepared for you consists of a diverse set of topics. Our series starts by getting you to know the world of data and data science. We explain all the data science jargon and areas of activity before diving into more sophisticated analytical tasks. Once we have built a solid theoretical foundation, we go through statistics applied in Excel and teach database management in SQL. Then you will be ready to learn the basics of data visualization and reporting by creating professional visualizations with Tableau Public, one of the most popular BI tools available today. What I consider to be one of the main benefits of this program is that you will learn how to leverage your SQL skills and combine them with Tableau to visualize the data contained in database files. Once you know how to do that, it is time to start coding in Python. But pretty soon you will understand that learning a programming language is meaningless without putting it to use. That's why, in the final series of this course, we will integrate SQL, Python and Tableau, which will allow us to build a model that predicts clients' default rates.
We then visualize the findings in Tableau, and this will take your preparation to the next level. This course series is truly an awesome adventure. To get the most out of this journey, please don't skip any of the lessons in any series, as we will gradually build your knowledge. Given that we teach several skills, it is essential that you start at the beginning, as these lessons are the foundation you need to tackle the more advanced topics you will find later in the program. Many of our lessons contain downloadable resources that will help reinforce what you've learned, such as course notes, exercise files, PDF materials and notebook files. Everything is included and can be downloaded easily. I strongly suggest you complete all exercises, as they are designed not only for practice but also as an additional source of information that will enhance your BI problem-solving skills. Are you excited? Awesome. Let's begin this journey together. See you there.

2. Understanding Population and Sample Data Copy: Before processing any data and making decisions, we should introduce some key definitions. The first step of every statistical analysis you perform is to determine whether the data you are dealing with is a population or a sample. A population is the collection of all items of interest to our study and is usually denoted with an uppercase N. The numbers we obtain when using a population are called parameters. A sample is a subset of the population and is denoted with a lowercase n, and the numbers we obtain when working with a sample are called statistics. Now you know why the field we're studying is called statistics.

Let's say we want to perform a survey of the job prospects of students studying at New York University. What is the population? You can simply walk into New York University and find every student, right? Well, that would surely not be the population of NYU students.
The population of interest includes not only the students on campus, but also the ones at home, on exchange, abroad, distance education students, part-time students, even the ones who have enrolled but are still in high school. Populations are hard to define and hard to observe in real life. A sample, however, is much easier to gather. It is less time-consuming and less costly. Time and resources are the main reasons we prefer drawing samples to analyzing an entire population.

So let's draw a sample, then. As we first wanted to do, we can just go to the New York University campus and enter the canteen, because we know it will be full of people. We can then interview 50 of them. Cool. This is a sample drawn from the population of NYU students. Good job. Populations are hard to observe and contact. That's why statistical tests are designed to work with incomplete data, and you will almost always be working with sample data and making data-driven decisions and inferences based on it.

Since statistical tests are usually based on sample data, samples are key to accurate insights. They have two defining characteristics: randomness and representativeness. A sample must be both random and representative for an insight to be precise. A random sample is collected when each member of the sample is chosen from the population strictly by chance. A representative sample is a subset of the population that accurately reflects the members of the entire population.

Let's go back to the sample we just discussed, the 50 students from the NYU canteen. We walked into the university canteen and violated both conditions. People were not chosen by chance; they were a group of NYU students who were there for lunch. Most members of the population did not even get the chance to be chosen, as they were not in the canteen. Thus, we conclude that the sample was not random. But was it representative? Well, it represented a group of people, but definitely not all students in the university.
It represented the people who have lunch at the university canteen. Had our survey been about the job prospects of NYU students who eat in the university canteen, we would have done well. Okay. You must be wondering how to draw a sample that is both random and representative. Well, the safest way would be to get access to the student database and contact individuals in a random manner. However, such surveys are almost impossible to conduct without assistance from the university. All right. Throughout this course, we will explore both sample and population statistics. After completing this course, samples and populations will be a piece of cake for you. Thanks for watching.

3. Various Types of Data and Levels of Measurement: You may be watching this course because you want to use the knowledge acquired as a stepping stone to a career in business analysis, business intelligence or data science. Either way, before we can start any analysis or testing, we have to get acquainted with the types of variables we usually encounter. Different types of variables require different types of statistical and visualization approaches. Therefore, being able to classify the data you're working with is key. We can classify data in two main ways: based on its type and on its measurement level.

Let's start with the types of data we can have: categorical and numerical. Categorical data describes categories or groups. One example is car brands like Mercedes, BMW and Audi; they represent different categories. Another example is answers to yes-or-no questions. If I ask questions like "Are you currently enrolled in a university?" or "Do you own a car?", yes and no would be the two groups of answers that can be obtained. This is categorical data. Numerical data, on the other hand, as its name suggests, represents numbers. It is further divided into two subsets: discrete and continuous. Discrete data can usually be counted in a finite manner.
A good example would be the number of children you want to have. Even if you don't know exactly how many, you're absolutely sure the value will be an integer such as 0, 1, 2 or even 10. Another instance is the grade on the SAT exam: you may get 1000, 1560 or 2400. What is important for a variable to be defined as discrete is that you can imagine each member of the dataset. Knowing that SAT scores range from 600 to 2400 and that a fixed number of points separates consecutive scores, you can imagine every possible value.

It's easier to understand discrete data by saying it's the opposite of continuous data. Continuous data is infinite and impossible to count. For instance, your weight can take on every value in some range. Let's dig a bit deeper into this. You get on the scale and the scale shows 150 pounds, or about 68 kilograms, but this is just an approximation. If you gain 0.01 pounds, the figure on the scale is unlikely to change, but your new weight will be 150.01 pounds. Now think about sweating. Every drop of sweat reduces your weight by the weight of that drop, but once again, a scale is unlikely to capture that change. So your weight is a continuous variable. It can take on an infinite number of values, no matter how many digits there are after the decimal point. To conclude, your weight can vary by incomprehensibly small amounts and is continuous, while the number of children you want to have is directly understandable and is discrete.

Just to make sure, here are other examples of discrete and continuous data. Grades at university are discrete: A, B, C, D, E, F, or 0 to 100%. The number of objects in general, no matter if bottles, glasses, tables or cars, can only take integer values. Money can be considered both discrete and continuous, but physical money like banknotes and coins is definitely discrete. You can't pay $1.243; you can only pay $1.24. That's because the difference between two sums of money cannot be smaller than one cent. What else is continuous? Apart from weight, other measurements are also continuous.
Examples are height, area, distance and time. All of these can vary by infinitely small amounts, incomprehensible for a human. The time on a clock is discrete, but time in general isn't. It can be anything, like 72.123456 seconds. We are constrained in measuring weight, height, area, distance and time by our technology, but in general they can take on any value.

All right, these were the types of data. So let's explore the levels of measurement. So far, we have been able to distinguish between categorical and numerical data. Furthermore, we saw that numerical data can be discrete or continuous. So it's time to move on to the other classification: levels of measurement. These can be split into two groups: qualitative and quantitative data.

Qualitative data can be nominal or ordinal. Nominal variables are like the categories we talked about just now, Mercedes, BMW and Audi, or the four seasons: winter, spring, summer and autumn. They aren't numbers and cannot be ordered. Ordinal data, on the other hand, consists of groups and categories which follow a strict order. Imagine you have been asked to rate your lunch, and the options are: disgusting, unappetizing, neutral, tasty and delicious. Although we have words and not numbers, it's obvious that these preferences are ordered from negative to positive. Thus, the level of measurement is qualitative, ordinal.

Okay. So what about quantitative variables? Well, they are also split into two groups: interval and ratio. Interval and ratio are both represented by numbers, but have one major difference: ratios have a true zero and intervals don't. Most things we observe in the real world are ratios. Their name comes from the fact that they can represent ratios of things. For instance, if I have two apples and you have six apples, you have three times as many as I do. How did I find that out? Well, the ratio of six to two is three. Other examples are the number of objects in general, distance and time. Intervals are not as common.
Temperature is the most common example of an interval variable. Remember, it cannot represent a ratio of things and doesn't have a true zero. Let me explain. Usually, temperature is expressed in Celsius or Fahrenheit. They are both interval variables. Say today is 5 degrees Celsius, or 41 degrees Fahrenheit, and yesterday was 10 degrees Celsius, or 50 degrees Fahrenheit. In terms of Celsius, it seems today is twice as cold, but in terms of Fahrenheit, not really. The issue comes from the fact that zero degrees Celsius and zero degrees Fahrenheit are not true zeros. These scales were artificially created by humans for convenience. Now, there is another scale called Kelvin, which has a true zero. Zero Kelvin is the temperature at which atoms stop moving, and nothing can be colder than zero Kelvin. This equals -273.15 degrees Celsius, or -459.67 degrees Fahrenheit. Variables expressed in Kelvin are ratios, as we have a true zero and we can make the claim that one temperature is two times more than another. Celsius and Fahrenheit have no true zero and are intervals. Finally, numbers like 2, 3, 10, 10.5 and so on can be either interval or ratio, but you have to be careful with the context you are operating in. So we've gone through the types of data and the measurement levels. In the next lesson, we'll see the types of charts and graphs that are used most often. Thanks for watching.

4. Visualisation Techniques for Categorical and Numeric Variables: Now that we've seen the different types of data and levels of measurement, we are ready to explore different graphs and tables which will allow us to visually represent the data we are working with. Visualising data is the most intuitive way to interpret it, so it is an invaluable skill. It is much easier to visualize data if you know its type and measurement level. As you may recall, there are two types of variables: categorical and numerical.
So let's begin with categorical variables. Some of the most common ways to visualize them are frequency distribution tables, bar charts, pie charts and Pareto diagrams.

Okay. First, let's see what a frequency distribution table looks like. It has two columns: the category itself and the corresponding frequency. Imagine you own a car shop and you sell only German cars. The table shows the categories of cars, Audi, BMW and Mercedes, and the number of units sold, or the frequency. By organizing your data in this way, you can compare the different brands and see that Audi has been sold the most. So that is a frequency distribution table.

Using the same table, we can construct a bar chart, also known as a column chart. The vertical axis shows the number of units sold, while each bar represents a different category, indicated on the horizontal axis. In this way, it is much clearer that Audi is the best-selling brand.

Okay, let's represent the same data as a pie chart. In order to build one, we need to calculate what percentage of the total each brand represents. In statistics, this is known as relative frequency. Not surprisingly, all relative frequencies add up to 100%. Pie charts are especially useful when we want to not only compare items with each other, but also see their share of the total. This example could easily be transformed into the business example of market share. Market share is so predominantly represented by pie charts that if you search for market share on Google Images, you would get almost only pie charts. Imagine that the data in our table represents the sales of Audi, BMW and Mercedes in a single German city. The chart shows the market share that each of these brands has.

Lastly, we have the Pareto diagram. In fact, a Pareto diagram is nothing more than a special type of bar chart where categories are shown in descending order of frequency. By frequency, statisticians mean the number of occurrences of each item.
As we said earlier, in our example that's exactly the number of units sold. Let's go back to our frequency distribution table and order the brands by frequency. Now we can create a bar chart based on the reordered table and, voilà, it is almost a Pareto diagram. There is one last touch to make it one: a curve on the same graph showing the cumulative frequency. The cumulative frequency is the running sum of the relative frequencies. It starts as the frequency of the first brand, then we add the second, the third and so on, until it finishes at 100%. See how the Pareto diagram combines the strong sides of the bar chart and the pie chart. It is easy to compare the data both between categories and as a part of the total. Furthermore, if this were a market share graph, you could easily see the market share of the top two out of five companies. The diagram is linked to the Pareto principle, also known as the 80-20 rule, which states that 80% of the effects come from 20% of the causes. A real-life example is a statement by Microsoft that by fixing 20% of its software bugs, they managed to solve 80% of the problems customers experience. A Pareto diagram can reveal information like that. It is designed to show how subtotals change with each additional category and give us a better understanding of our data.

Okay. These are the main ways in which we can visually represent categorical data. What about numerical variables? We already know how to create graphs and tables for categorical variables, so let's do the same for numerical variables. Whenever we want to plot data, it is best to first order it in a table. So, as we did with categorical variables, let's start by creating a frequency distribution table. Here is a list of 20 different numbers. If we arranged them in a frequency table like the one we used for categorical variables, we would obtain a table with 20 rows, each of them representing one number with a corresponding frequency of one, as each number occurs exactly once. This table would be impractical for any analysis.
So when we deal with numerical variables, it makes much more sense to group the data into intervals and then find the corresponding frequencies. In this way, we make a summary of the data that allows for a meaningful visual representation. How do we choose these intervals? Generally, statisticians prefer working with groups of data that contain 5 to 20 intervals. This way, the summary can be useful. However, this varies from case to case, and the correct choice of intervals largely depends on the amount of data we're working with. In our example, we'll divide the data into five intervals of equal width. The simple formula that we use is as follows: the interval width is equal to the largest number minus the smallest number, divided by the number of desired intervals. In our case, the width of the interval should be (100 - 1) / 5. The result is 19.8. Now, we want to round this number up in order to reach a neat representation. Therefore, the intervals will be as follows: 1 to 21, 21 to 41, 41 to 61, 61 to 81 and 81 to 101. Each interval has a width of 20.

The most common graph used to represent numerical data is the histogram. First, we'll look at how it is created, and then we'll describe the way the data is represented. We're going to use the frequency distribution table from the previous example to plot it. Here it is. As you can see, it looks like a bar chart, but it actually conveys very different information. As in any bar chart, the vertical axis is numerical and shows the absolute frequency. This time, though, the horizontal axis is numerical too. So each bar has a width equal to the interval and a height equal to the frequency. Notice how the different bars are touching. This is to show that there is continuation between the intervals. Each interval ends where the next one starts, whereas in the bar chart we saw earlier, different bars represented different categories, so the bars were completely separate.
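
The interval-width rule from this lesson can be sketched in a few lines of Python. This is only an illustration and not part of the course materials: the function name `interval_edges` is made up here, and rounding the raw width up to the next whole number is an assumption chosen to match the 19.8 to 20 step in the example.

```python
import math

def interval_edges(smallest, largest, n_intervals):
    """Return the raw width, the rounded-up width and the interval edges."""
    # Lesson formula: (largest - smallest) / number of desired intervals
    raw_width = (largest - smallest) / n_intervals
    # Assumption: round up so the intervals cover the whole range neatly
    width = math.ceil(raw_width)
    # Each interval ends where the next one starts
    edges = [smallest + i * width for i in range(n_intervals + 1)]
    return raw_width, width, edges

raw_width, width, edges = interval_edges(1, 100, 5)
print(raw_width)  # 19.8
print(width)      # 20
print(edges)      # [1, 21, 41, 61, 81, 101]
```

With the lesson's numbers (smallest 1, largest 100, five intervals) the sketch reproduces the width of 20 and the boundaries 1, 21, 41, 61, 81 and 101.
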
Okay, sometimes it is useful to plot the intervals against the relative rather than the absolute frequency. As you can see, the histogram looks the same visually but gives different information to the audience. Remember, relative frequency is made up of percentages. This is how we can build a histogram in order to represent numerical data.

Okay. So far, we've covered graphs that represent only one variable. But how do we represent relationships between two variables? Let's explore cross tables and scatter plots. Once again, there is a division between categorical and numerical variables. Let's start with categorical variables. The most common way to represent them is using cross tables or, as some statisticians call them, contingency tables. Imagine you're an investment manager and you manage stocks, bonds and real estate investments for three different investors. Each of them has a different idea of risk, and hence their money is allocated in a different way among the three asset classes. A cross table representing all the data looks the following way. You can clearly see the rows showing the type of investment that's been made and the columns with each investor's allocation. It's good practice to calculate the totals of each row and column, as they are often useful in further analysis. Notice that the subtotals of the rows give us the total investment in stocks, bonds and real estate. On the other hand, the subtotals of the columns give us the holdings of each investor.

Once we have created a cross table, we can proceed by visualizing the data. A very useful chart in such cases is a variation of the bar chart called the side-by-side bar chart. It represents the holdings of each investor in the different types of assets. Stocks are in green, bonds in red and real estate in blue. The name of this type of chart comes from the fact that for each investor, the categories of assets are represented side by side.
In this way, you can easily compare asset holdings for a specific investor or among investors. Easy, right? All graphs are very easy to create and read once you have identified the type of data you're dealing with and decided on the best way to visualize it.

Finally, we would like to conclude with a very important graph: the scatter plot. It is used when representing two numerical variables. For this example, we have gathered the reading and writing SAT scores of 100 individuals. Let me first show you the graph before analyzing it. All right. First, SAT scores by component range from 200 to 800 points, and that's why our data is bounded within the range of 200 to 800. Second, our vertical axis shows the writing scores, while the horizontal axis contains the reading scores. Third, there are 100 students, and their results correspond to specific points on the graph. Each point gives us information about a particular student's performance. For example, this is Jane. She scored 300 on writing but 550 on the reading part.

Scatter plots usually represent lots and lots of observations. When interpreting a scatter plot, a statistician is not expected to look into single data points. They would be much more interested in getting the main idea of how the data is distributed. So the first thing we see is that there is an obvious uptrend. This is because lower writing scores are usually obtained by students with lower reading scores, and higher writing scores have been achieved by students with higher reading scores. This is logical, right? Students are more likely to do well on both because the two tasks are closely related. Second, we notice a concentration of students in the middle of the graph, with scores in the region of 450 to 550 on both reading and writing. Remember, we said that scores can be anywhere between 200 and 800. Well, 500 is the average score one can get, so it makes sense that a lot of students fall into that area.
There is also this group of people with both very high writing and reading scores. The exceptional students tend to be excellent at both components. Finally, we have Jane from a few minutes ago. She is far away from every other observation, as she scored above average on reading but poorly on writing. This observation is called an outlier, as it goes against the logic of the whole dataset. We will learn more about outliers and how to treat them in our analysis later on. So we have gone through the very basics of statistics. We have covered populations, samples, types of variables, graphs and tables, and it is time for us to dive into the heart of statistics: measures of central tendency and variability. Thanks for watching and see you there.

5. Calculating Measures of Central Tendency Copy: This lesson will introduce you to the three measures of central tendency. Don't be terrified by the terminology. We're talking about the mean, the median and the mode. Even if you're familiar with these terms, please stick around, as we will explore their upsides and shortfalls.

The first measure we'll study is the mean, also known as the simple average. It is denoted by the Greek letter mu (μ) for a population and x-bar (x̄) for a sample. We can find the mean of a dataset by adding up all of its components and dividing the sum by their number. The mean is the most common measure of central tendency, but it has a huge downside: it is easily affected by outliers. Let's aid ourselves with an example. These are the prices of pizza at 11 different locations in New York City and 10 different locations in LA. Let's calculate the means of the two datasets using the formula for the mean. In NYC, we get $11, whereas for LA, just $5.50. On average, pizza in New York can't be twice as expensive as in LA, right? Correct. The problem is that in our example, we have included one posh place in New York where they charge $66 for pizza, and this doubled the mean.
What we should take away from this example is that the mean is not enough to make definite conclusions. So how can we protect ourselves from this issue? We can calculate the second measure, the median. The median is basically the middle number in an ordered data set. Let's see how it works for our example. In order to calculate the median, we have to sort our data set in ascending order. The median of the data set is the number at position (n + 1) divided by 2 in the ordered list, where n is the number of observations. Therefore, the median for NYC is at the sixth position, or $6, much closer to the observed prices than the mean of $11. What about LA? We have just 10 observations in LA. According to our formula, the median is at position 5.5. In cases like this, the median is the simple average of the numbers at positions five and six. Therefore, the median of LA prices is $5.50. So we have seen that the median is not affected by extreme prices, which is good when we have posh New York restaurants in a street pizza sample. But we still don't get the full picture. We must introduce another measure, the mode. The mode is the value that occurs most often. It can be used for both numerical and categorical data, but we will stick to a numerical example. After counting the frequencies of each value, we find that the mode of the New York pizza prices is $3. Now that's interesting. The most common price of pizza in NYC is just $3, but the mean and the median led us to believe it was much more expensive. Okay, let's do the same and find the mode of the LA pizza prices. Each price appears only once. How do we find the mode then? Well, we say there is no mode. But can't I say there are 10 modes? Sure I can, but it would be meaningless with 10 observations, and an experienced statistician would never do that. In general, you often have multiple modes. Usually two or three modes are tolerable, but more than that would defeat the purpose of finding a mode.
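The three measures above can be sketched in a few lines of Python. The lesson does not list the individual pizza prices, so the two lists below are assumptions, chosen only so that the summary numbers match the ones quoted (NYC mean 11, median 6, mode 3; LA mean and median 5.5, no mode). The helper name `summarize` is also made up for this sketch.

```python
from statistics import mean, median, multimode

# Assumed prices in dollars (not given in the lesson); the 66 is the "posh place".
nyc_prices = [1, 2, 3, 3, 5, 6, 7, 8, 9, 11, 66]
la_prices = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


def summarize(prices):
    """Return the mean, median and list of modes for a list of prices."""
    modes = multimode(prices)
    # When several distinct values are all tied for most frequent,
    # follow the lesson and report no mode at all.
    if len(modes) > 1 and len(modes) == len(set(prices)):
        modes = []
    return {"mean": mean(prices), "median": median(prices), "modes": modes}


nyc_summary = summarize(nyc_prices)
la_summary = summarize(la_prices)
print("NYC:", nyc_summary)
print("LA: ", la_summary)
```

Removing the 66 from the assumed NYC list and rerunning is a quick way to see how strongly a single outlier pulls the mean while leaving the median and mode almost untouched.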
There is one last question we haven't answered, and that is which measure is best. The example we have just seen shows us that the measures of central tendency should be used together rather than independently. Therefore, there is no best, but using only one is definitely the worst. Now you know about the mean, median and mode. We'll use that knowledge to talk about skewness. Thanks for watching. 6. Calculating the Measures of Asymmetry: After exploring the measures of central tendency, let's move on to the measures of asymmetry. The most commonly used tool to measure asymmetry is skewness. This is the formula to calculate it. Almost always, you will use software that performs the calculation for you, so in this lesson we will not go into the computation, but rather into understanding skewness. Skewness indicates whether the observations in a data set are concentrated on one side. This can be confusing in the beginning, so let's see an example. Here we have three data sets and their respective frequency distributions. We have also calculated the means, medians and modes. The first data set has a mean of 2.79 and a median of 2. Hence, the mean is bigger than the median. We say that this is a positive or right skew. From the graph, you can clearly see that the data points are concentrated on the left side. Note that the direction of the skew is counterintuitive. It does not depend on which side the line is leaning to, but rather on which side its tail is leaning to. So right skewness means that the outliers are to the right. It is interesting to see the measures of central tendency incorporated in the graph. When we have right skewness, the mean is bigger than the median, and the mode is the value with the highest visual representation. In the second graph, we have plotted a data set that has an equal mean, median and mode. The frequency of occurrence is completely symmetrical, and we call this zero or no skew.
Most often, you will hear people say that the distribution is symmetrical. For the third data set, we have a mean of 4.9, a median of 5 and a mode of 6. As the mean is lower than the median, we say that there is a negative or left skew. Once again, the highest point is defined by the mode. Why is it called a left skew again? That's right, because the outliers are to the left. All right, so why is skewness important? Skewness tells us a lot about where the data is situated. As we mentioned in our previous lesson, the mean, median and mode should be used together to get a good understanding of the data set. Measures of asymmetry like skewness are the link between central tendency measures and probability theory, which ultimately allows us to get a more complete understanding of the data we are working with. Thanks for watching. 7. How to quantify Variability: Next on our to-do list are the measures of variability. There are many ways to quantify variability. However, we will focus on the most common ones: variance, standard deviation and coefficient of variation. In the field of statistics, we will typically use different formulas when working with population data and sample data. Let's think about this for a bit. When you have the whole population, each data point is known, so you are 100% sure of the measures you are calculating. When you take a sample of this population and compute a sample statistic, it is interpreted as an approximation of the population parameter. Moreover, if you extract 10 different samples from the same population, you will get 10 different measures. Statisticians have solved the problem by adjusting the algebraic formulas for many statistics to reflect this issue. Therefore, we will explore both population and sample formulas, as they are both used. You must be asking yourself why there are no unique formulas for the mean, median and mode.
Well, actually, the sample mean is the average of the sample data points, while the population mean is the average of the population data points, so technically there are two different formulas, but they are computed in the same way. Okay. Now, after this short clarification, it's time to get on to variance. Variance measures the dispersion of a set of data points around their mean value. Population variance, denoted by sigma squared, is equal to the sum of squared differences between the observed values and the population mean, divided by the total number of observations. Sample variance, on the other hand, is denoted by s squared and is equal to the sum of squared differences between the observed sample values and the sample mean, divided by the number of sample observations minus one. Okay. When you are getting acquainted with statistics, it is hard to grasp everything right away. Therefore, let's stop for a second to examine the formula for the population and try to clarify its meaning. The main part of the formula is its numerator, so that's what we want to comprehend: the sum of the differences between the observations and the mean, squared. So the closer a number is to the mean, the lower the result we will obtain, and the further away from the mean it lies, the larger this difference. But why do we elevate to the second degree? Squaring the differences has two main purposes. First, by squaring the numbers, we always get non-negative computations. Without going too deep into the mathematics of it, it is intuitive that dispersion cannot be negative. Dispersion is about distance, and distance cannot be negative. If, on the other hand, we calculated the differences and did not elevate them to the second degree, we would obtain both positive and negative values that, when summed, would cancel out, leaving us with no information about the dispersion. Second, squaring amplifies the effect of large differences.
For example, if the mean is 0 and you have an observation of 100, the squared spread is 10,000. All right, now it's time for a practical example. We have a population of five observations: 1, 2, 3, 4 and 5. Let's find its variance. We start by calculating the mean: (1 + 2 + 3 + 4 + 5) divided by 5 equals 3. Then we apply the formula we just saw: (1 minus 3) squared plus (2 minus 3) squared plus (3 minus 3) squared plus (4 minus 3) squared plus (5 minus 3) squared. All of these components have to be divided by 5. When we do the math, we get 2, so the population variance of the data set is 2. But what about the sample variance? This would only be suitable if we were told that these five observations were a sample drawn from a population. So let's imagine that is the case. The sample mean is once again 3. The numerator is the same, but the denominator is going to be 4 instead of 5, giving us a sample variance of 2.5. To conclude the variance topic, we should interpret the result. Why is the sample variance bigger than the population variance? In the first case, we knew the population. That is, we had all the data, and we calculated the variance. In the second case, we were told that 1, 2, 3, 4 and 5 was a sample drawn from a bigger population. Imagine the population behind the sample were these ten numbers: 1, 1, 1, 2, 3, 4, 5, 5, 5 and 5. Clearly the numbers are the same, but there is a concentration around the two extremes of the data set, 1 and 5. The variance of this population is 2.96. So our sample variance has rightfully corrected upwards in order to reflect the higher potential variability. This is the reason why there are different formulas for sample and population data. This was a very important lesson, so please make sure you have understood it well. You can reinforce what you've learned by doing the exercise available in the course resource section. Remember, to better understand statistics, you need continuous practice.
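The worked example above can be checked with a short Python sketch. The function names are made up for illustration. The only difference between the two formulas is the denominator: n for a population, n minus 1 for a sample.

```python
def population_variance(values):
    """Sum of squared differences from the mean, divided by n."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared differences from the mean, divided by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)


observations = [1, 2, 3, 4, 5]
# The larger population imagined in the lesson, concentrated at the extremes.
bigger_population = [1, 1, 1, 2, 3, 4, 5, 5, 5, 5]

print(population_variance(observations))               # treated as a population
print(sample_variance(observations))                   # treated as a sample
print(round(population_variance(bigger_population), 2))
```

The standard library offers the same two calculations as `statistics.pvariance` and `statistics.variance`; they are spelled out by hand here only to mirror the formulas described in the lesson.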
Thanks for watching. 8. Standard Deviation and Coefficient of Variation: While variance is a common measure of data dispersion, in most cases the figure you will obtain is pretty large and hard to compare, as the unit of measurement is squared. The easy fix is to calculate its square root and obtain a statistic known as standard deviation. In most analyses you perform, standard deviation will be much more meaningful than variance. As we saw in the previous lesson, there are different measures for population and sample variance. Consequently, there is also a population and a sample standard deviation. The formulas are the square root of the population variance and the square root of the sample variance, respectively. If you have a calculator in your hands, you'll be able to do this job. All right. The other measure we still have to introduce is the coefficient of variation. It is equal to the standard deviation divided by the mean. Another name for the term is relative standard deviation. This is an easy way to remember its formula: it is simply the standard deviation relative to the mean. As you probably guessed, there is a population and a sample formula once again. So standard deviation is the most common measure of variability for a single data set. But why do we need yet another measure, such as the coefficient of variation? Well, comparing the standard deviations of two different data sets is meaningless, but comparing coefficients of variation is not. Here's an example of a comparison between standard deviations. Let's take the prices of pizza at 10 different places in New York. They range from 1 to 11 dollars. Now imagine that you only have Mexican pesos, and to you the prices look more like 18.81 pesos to 206.91 pesos, given the exchange rate of 18.81 pesos for one dollar. Let's combine our knowledge so far and find the standard deviations and coefficients of variation of these two data sets. First, we have to see if this is a sample or a population.
The question is, are there only 10 restaurants in New York? Of course not. This is obviously a sample drawn from all the restaurants in the city. Then we have to use the formulas for sample measures of variability. Second, we have to find the mean. The mean in dollars is equal to 5.5, and the mean in pesos is 103.46. The third step of the process is finding the sample variance. Following the formula that we showed earlier, we can obtain 10.72 dollars squared and 3,793.69 pesos squared. The respective sample standard deviations are 3.27 dollars and 61.59 pesos. Let's make a couple of observations. First, variance gives results in squared units, while standard deviation gives them in the original units. This is the main reason why professionals prefer to use standard deviation as the main measure of variability. It is directly interpretable. Squared dollars mean nothing, even in the field of statistics. Second, we got standard deviations of 3.27 and 61.59 for the same pizza at the same 10 restaurants in New York City. Seems wrong, right? Don't worry. It is time to use our last tool, the coefficient of variation. Dividing the standard deviations by the respective means, we get the two coefficients of variation. The result is the same: 0.60. Notice that it is not dollars, pesos, dollars squared or pesos squared. It is just 0.60. This shows us the great advantage that the coefficient of variation gives us. Now we can confidently say that the two data sets have the same variability, which is what we expected beforehand. Let's recap what we've learned so far. There are three main measures of variability: variance, standard deviation and coefficient of variation. Each of them has different strengths and applications. You should feel confident using all of them, as we are getting closer to more complex statistical topics. And remember Aristotle's advice: involve me, and I will understand. So please don't forget to get involved with the exercises.
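As a rough check of the dollars-versus-pesos comparison, here is a minimal Python sketch. The ten dollar prices are an assumption (the lesson only gives their range and summary statistics); they were picked so that the mean and the sample standard deviation match the quoted 5.5 and 3.27. The exchange rate of 18.81 is the one stated in the lesson, and `coefficient_of_variation` is a helper name invented here.

```python
from statistics import mean, stdev

PESOS_PER_DOLLAR = 18.81  # exchange rate quoted in the lesson

# Assumed sample of ten prices in dollars (not listed in the lesson).
prices_usd = [1, 2, 3, 3, 5, 6, 7, 8, 9, 11]
prices_mxn = [price * PESOS_PER_DOLLAR for price in prices_usd]


def coefficient_of_variation(values):
    """Sample standard deviation relative to the mean (unit-free)."""
    return stdev(values) / mean(values)


for label, prices in (("dollars", prices_usd), ("pesos", prices_mxn)):
    print(
        label,
        "std dev:", round(stdev(prices), 2),
        "coefficient of variation:", round(coefficient_of_variation(prices), 2),
    )
```

Because converting currency multiplies both the standard deviation and the mean by the same factor, the factor cancels in the ratio, which is why the two coefficients agree.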
Also, refer to the notes provided for Excel formulas to easily calculate these measures. Thanks for watching. 9. Measures of the Relationships between two Variables: Okay. We've covered all the univariate measures. Now it's time to see measures that are used when we work with more than one variable. We'll explore measures that can help us explore the relationship between variables. Our focus will be on covariance and the linear correlation coefficient. Let's think of an example that is very easy to understand and will help us grasp the nature of the relationship between two variables a bit better. Think about real estate. Which is one of the main factors that determine house prices? Yes, size, right. Typically, larger houses are more expensive, as people like having extra space. The table you see shows us data about several houses. On the left side, we can see the size of each house, and on the right, we have the price at which it has been listed in the local newspaper. We can present these data points in a scatter plot. The X axis will show a house's size, and the Y axis will provide information about its price. We can certainly notice a pattern. There is a clear relationship between these variables. We say that the two variables are correlated, and the main statistic to measure this correlation is called covariance. Unlike variance, covariance may be positive, equal to zero or negative. To understand the concept better, I would like to show you the formulas that allow us to calculate the covariance between two variables. It's formulas with an S because, once again, there is a population and a sample formula. Here they are. Since this is obviously sample data, we should use the sample covariance formula. Let's apply it in practice for the example we saw earlier. X stands for house size, and Y stands for house price. We need to calculate the mean size and the mean price. I have also computed the sample standard deviations, in case we need them later on. Okay.
Now let's calculate the numerator of the covariance function. Starting with the first house, I'll multiply the difference between its size and the average house size by the difference between the price of the same house and the average house price. Once we're ready, we have to perform this calculation for all the houses that we have in the table and then sum the numbers we've obtained. Now we have to divide the sum by the sample size minus one. The result is the covariance. It gives us a sense of the direction in which the two variables are moving. If they go in the same direction, the covariance will have a positive sign, while if they go in opposite directions, the covariance will have a negative sign. Finally, if their movements are independent, the covariance between the house size and its price will be equal to zero. However, there is just one tiny problem with covariance, though. It could be a number like 5 or 50, but it can also be something like 0.23456 or even over 50 million, as in our example. These are values of a completely different scale. How could one interpret such numbers? In the next lesson, I will answer that question. See you there, and thanks for watching. 10. Correlation Coefficient: Correlation adjusts covariance so that the relationship between the two variables becomes easy and intuitive to interpret. The formula for the correlation coefficient is the covariance divided by the product of the standard deviations of the two variables. This is either sample or population, depending on the data you are working with. We already have the standard deviations of the two data sets. Now we'll use the formula in order to find the sample correlation coefficient. Mathematically, there is no way to obtain a correlation value greater than one or less than minus one. Remember the coefficient of variation we talked about a couple of lessons ago? Well, this concept is similar.
We've manipulated the strange covariance value in order to get something intuitive. Let's examine it for a bit. We got a sample correlation coefficient of 0.87, so there is a strong relationship between the two values. A correlation of 1, also known as perfect positive correlation, means that the entire variability of one variable is explained by the other variable. However, logically, we know that size determines the price. On average, the bigger the house you build, the more expensive it will be. This relationship only goes this way. Once the house is built, if for some reason it becomes more expensive, its size doesn't increase, although there is a positive correlation. Okay. A correlation of 0 between two variables means that there is no linear relationship between them; they move independently of each other. We would expect a correlation of 0 between the price of coffee in Brazil and the price of houses in London. Obviously, the two variables don't have anything in common. Finally, we can have a negative correlation coefficient. It can be a perfect negative correlation of minus 1 or, much more likely, an imperfect negative correlation with a value between minus 1 and 0. Think of the following businesses: a company producing ice cream and a company selling umbrellas. Ice cream tends to be sold more when the weather is very good, and people buy umbrellas when it's rainy. Obviously, there is a negative correlation between the two, and hence, when one of the companies makes more money, the other won't. All right. Before we continue, we must note that the correlation between two variables X and Y is the same as the correlation between Y and X. The formula is completely symmetrical with respect to both variables. Therefore, the correlation of price and size is the same as the one of size and price. This leads us to causality. It is very important for any analyst or researcher to understand the direction of causal relationships. In the housing business, size causes the price, and not vice versa.
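The covariance and correlation steps described in these two lessons can be sketched as follows. The house table itself is not reproduced in the transcript, so the five sizes and prices below are hypothetical values, chosen so that the correlation lands near the 0.87 quoted above; the covariance they produce is not meant to match the lesson's figure. The function names are also assumptions made for this sketch.

```python
from statistics import mean, stdev

# Hypothetical sample: house sizes (square feet) and listed prices (dollars).
sizes = [650, 785, 1200, 720, 975]
prices = [772_000, 998_000, 1_200_000, 800_000, 895_000]


def sample_covariance(xs, ys):
    """Sum of products of deviations from the means, divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return numerator / (len(xs) - 1)


def sample_correlation(xs, ys):
    """Covariance divided by the product of the sample standard deviations."""
    return sample_covariance(xs, ys) / (stdev(xs) * stdev(ys))


print("covariance: ", sample_covariance(sizes, prices))
print("correlation:", round(sample_correlation(sizes, prices), 2))
# Swapping the arguments gives the same value: the formula is symmetrical.
print("swapped:    ", round(sample_correlation(prices, sizes), 2))
```

The covariance comes out as a very large number in "square feet times dollars", which is hard to read on its own, while the correlation is unit-free and always falls between minus 1 and 1. Neither number says anything about which variable causes the other.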
With this, we conclude our class on statistics for business analytics and data science. Thanks for watching. 11. Distribution: Inferential statistics refers to the methods that rely on probability theory, and distributions in particular, to predict population values based on sample data. While this definition may not be completely clear just yet, later in this lesson we will define what a distribution is. Then we will go through a couple of distributions you will likely use at work. This will naturally lead us to point estimates. We will conclude this course with confidence intervals. We will not only present these topics but will also develop a deeper understanding of the statistical processes. This will help you a great deal if you decide to get into data science. I recommend you complete all the exercises provided, as they will reveal additional subjects you will be in charge of researching. This insight will likely be your gateway to the fundamentals of quantitative research and data-driven decision making. So let's begin. Before we can talk about these topics, we have to learn what a distribution is. In statistics, when we use the term distribution, we usually mean a probability distribution. Good examples are the normal distribution, the binomial distribution and the uniform distribution. All right, let's start with a definition. A distribution is a function that shows the possible values for a variable and how often they occur. Think about a die. It has six sides, numbered from 1 to 6. We roll the die. What is the probability of getting a 1? It is one out of six, so one sixth, right? Easy. What is the probability of getting a 2? Once again, one sixth. The same holds for 3, 4, 5 and 6. We have equal chances of getting each of the six outcomes. Now, what is the probability of getting a 7? It is impossible to get a 7 when rolling a single die. Therefore, the probability is zero. Okay. Let's generalize.
The distribution of an event consists not only of the input values that can be observed, but is made up of all possible values. So the distribution of the event of rolling a die will be given by the following table. The probability of getting a 1 is one sixth, or 0.17, the probability of getting a 2 is 0.17, and so on. We are sure that we have exhausted all possible values when the sum of the probabilities is equal to 1, or 100%. Similar to what we discussed about getting a 7, for all other values the probability of occurrence is zero. That is the probability distribution of rolling a die. By the way, it is called a discrete uniform distribution: all outcomes have an equal chance of occurring. Okay. Each probability distribution has a visual representation. It is a graph describing the likelihood of occurrence of every event. Here is the graph for our example. It is crucial to understand that the graph is just a visual representation of a distribution. Often, when we talk about distributions, we make use of the graph. That's why many people believe that a distribution is the graph itself. However, that's not true. A distribution is defined by the underlying probabilities and not by the graph. A graph is just a visual representation. All right. After this short clarification, let's explore a different example. Think about rolling two dice. What are the possible outcomes? One and one, two and one, one and two, and so on. Here is a table with all possible combinations. Let's play a game where we are trying to guess the sum of the two dice. What's the probability of getting a sum of 1? Zero, as this event is impossible. The minimum sum we can get is 2. So what's the probability of getting a sum of 2? There is only one combination that would give us a sum of 2: when both dice are equal to 1. So 1 out of 36 total outcomes, or 0.03. Similarly, the probability of getting a sum of 3 is given by the number of combinations that give a sum of 3, divided by 36.
That is 2 divided by 36, or about 0.06. We can continue in this way until we have the full probability distribution. Let's see the graph associated with it. Looking at it, we can easily understand that when rolling two dice, the probability of getting a 7 is the highest. Moreover, we can also compare different outcomes, such as the probability of getting a 10 and the probability of getting a 5. It is evident that it is less likely that we'll get a 10. So the examples that we saw here were of discrete variables. In the next lesson, we will focus on continuous distributions, as they are more common in inference. See you there. 12. Normal Distribution: Okay, here we go. So far, we've learned that the distribution of a data set shows us the frequency at which possible values occur within an interval. We also said that there are dozens of distributions. Experienced statisticians can immediately distinguish a binomial from a Poisson distribution, as well as a uniform from an exponential distribution, in a quick glimpse at a plot. In this course, we will rather focus on the normal and Student's t distributions, due to the following reasons. They approximate a wide variety of random variables. Distributions of sample means with large enough sample sizes can be approximated to normal. All computable statistics are elegant. Decisions based on normal distribution insights have a good track record. If this sounds too general or technical, don't worry. I assure you things will be much easier once we get started. Here is a visual representation of a normal distribution. You have surely seen a normal distribution before, as it is the most common one. The statistical term for it is Gaussian distribution, but many people call it the bell curve, as it is shaped like a bell. It is symmetrical, and its mean, median and mode are equal. If you remember the lesson about skewness, you will recognize that it has no skew. It is perfectly centered around its mean. All right.
It is denoted in this way: N stands for normal, the tilde sign shows it is a distribution, and in brackets we have the mean and the variance of the distribution. On the plane, you can notice that the highest point is located at the mean, because it coincides with the mode. The spread of the graph is determined by the standard deviation. Now let's try to understand the normal distribution a little bit better. Let's look at this approximately normally distributed histogram. There is a concentration of the observations around the mean, which makes sense, as it is equal to the mode. Moreover, it is symmetrical on both sides of the mean. We used 80 observations to create this histogram. Its mean is 743, and its standard deviation is 140. Okay, great. But what if the mean is smaller or bigger? Let's zoom out a bit by adding the origin of the graph. The origin is the zero point. Adding it to any graph gives perspective. Keeping the standard deviation fixed, or in statistical jargon, controlling for the standard deviation, a lower mean would result in the same shape of the distribution, but on the left side of the plane. In the same way, a bigger mean would move the graph to the right. In our example, this results in two new distributions: one with a mean of 470 and a standard deviation of 140, and one with a mean of 960 and a standard deviation of 140. All right, let's do the opposite. Controlling for the mean, we can change the standard deviation and see what happens. This time, the graph is not moving, but is rather reshaping. A lower standard deviation results in a lower dispersion, so more data in the middle and thinner tails. On the other hand, a higher standard deviation will cause the graph to flatten out, with fewer points in the middle and more towards the ends, or in statistics jargon, fatter tails. Great. These are the basics of the normal distribution. In our next lesson, we'll use this knowledge to talk about standardisation. Stay tuned. 13. Standardisation: Let's talk about standardisation.
Every distribution can be standardised. Say the mean and the variance of a variable are mu and sigma squared, respectively. Standardisation is the process of transforming this variable into one with a mean of 0 and a standard deviation of 1. This formula allows us to do that. Okay. Logically, a normal distribution can also be standardised. The result is called a standard normal distribution. In the last lesson, we explored shifts in the mean and the standard deviation. So if we shift the mean by mu and the standard deviation by sigma for any normal distribution, we will arrive at the standard normal distribution. Great. We use the letter Z to denote it. As we said previously, its mean is 0 and its standard deviation is 1. The standardised variable is called a z-score, and it is equal to the original variable minus its mean, divided by its standard deviation. Let's see an example that will help us get a better grasp of the concept. We'll take an approximately normally distributed set of numbers: 1, 2, 2, 3, 3, 3, 4, 4 and 5. Its mean is 3, and its standard deviation is 1.22. Now let's subtract the mean from all data points. We get a new data set: -2, -1, -1, 0, 0, 0, 1, 1 and 2. Let's calculate its new mean. It is 0, exactly as we anticipated. Showing that on a graph, we have shifted the curve to the left while preserving its shape. Clear? Okay. So far, we have a new distribution, which is still normal, but with a mean of 0 and a standard deviation of 1.22. The next step of the standardisation is to divide all data points by the standard deviation. This will drive the standard deviation of the new data set to 1. Let's go back to our example. Both the original data set and the one we obtained after subtracting the mean from each data point have a standard deviation of 1.22. Remember, adding and subtracting values to all data points does not change the standard deviation. Now let's divide each data point by 1.22. We get a new set of values, starting with -1.63.
It's 20.0 point 0.0 point 820.82 on 1.63 If we calculate Istana division off this new details, it's we'll get one on the mean is to you zero in terms of a curve. We kept it at the same position, but we shaped it a bit Great. This is how we can obtain is done. I'm no more distribution from any normally distributed data. Sit using it makes predictions on inference is much easier on. This will help us a great deal on what we'll see next. Thanks for watching. 14. Central Limit Theorem: so you have a population of use cast in car show. We weren't analyze the car prizes and be able to make some predictions on them. Population perimeters, which may be of interest, are mean car prices, standard division of prices, co variance and so on. Normally in statistics will not have data on the whole population, but rather just example. Let's join sample out on that data. The men. He's $2617. 33 cents. Now a problem arises from the fact that if I take another sample, I might get a completely different mean 3200 or $1.34. Then it's all mean off $2844 on 33 cents. As you can see, the sample mean depends on the incumbents off the sample itself. So taken, a single value is definitely so optimal. What we can do is draw many. Many samples will create a new deficit comprised of sample means These values are distributed in some way. So the RV distribution when we were friends for distribution from my samples to use the time it's sampling distribution for case becomes even more precise were to load a something distribution of the means. So far, so good. Now, if we inspect these values closely, will realize they are different boats cause entreated around is certain value right? Or our case somewhere around $2800 since each of the sample means are not import, approximation of the population means the Value T revolve around is actually the population mean itself. Most probably none of them is a population mean, but taken together, they give a really cool idea. 
In fact, if we take the average of the sample means, we expect to get a very precise approximation of the population mean. Nice. Let me give you some more information. This is a plot of the distribution of the car prices. We haven't seen many distributions, but we know that this is not a normal distribution. It has a right skew, and that's about all we can see. Here's the big revelation. It turns out that if we visualize the distribution of the sample means, we get something else. Something familiar, something useful: a normal distribution. That's what the central limit theorem states. No matter the distribution of the population, binomial, uniform, exponential or another one, the sampling distribution of the mean will approximate a normal distribution. Not only that, but its mean is the same as the population mean. That's something we already noticed. What about the variance? Well, it depends on the size of the samples we draw, but it is quite elegant: it is the population variance divided by the sample size. Since the sample size is in the denominator, the bigger the sample size, the lower the variance or, in other words, the closer the approximation we get. So if you are able to draw bigger samples, your statistical results will be more accurate. Usually, for the CLT to apply, we need a sample size of at least 30 observations. Excellent. Finally, let's finish off with why the central limit theorem is so important. As we already know, the normal distribution has elegant statistics and unmatched applicability in calculating confidence intervals and performing tests. The central limit theorem allows us to perform tests, solve problems and make inferences using the normal distribution, even when the population is not normally distributed. The discovery and proof of the theorem revolutionized statistics as a field, and we will be relying on it a lot in the subsequent lessons. That's all for now. Thanks for watching. 15.
Standard Error: In the previous lesson, we showed that no matter the underlying distribution of the data set, the sampling distribution of the mean would be normal, with a mean equal to the original mean and a variance equal to the original variance divided by the sample size. All right, this lesson will be very short and has the sole purpose of defining what the standard error is. The standard error is the standard deviation of the distribution formed by the sample means, in other words, the standard deviation of the sampling distribution. So how do we find it? We know its variance: sigma squared divided by n. Therefore, the standard deviation is sigma divided by the square root of n. Like a standard deviation, the standard error shows variability. In this case, it is the variability of the means of the different samples we extracted. You can guess that since the term has its own name, it is widely used and very important. Why is it important? Well, it is used for almost all statistical tests because it shows how well you approximated the true mean. More on that in the next lesson. Note that it decreases as the sample size increases. This makes sense, as bigger samples give a better approximation of the population. That's all for now. Thanks for watching. 16. Estimators and Estimates: Okay, great. Let's continue by introducing the concept of an estimator of a population parameter. It is an approximation depending solely on sample information. A specific value is called an estimate. There are two types of estimates: point estimates and confidence interval estimates. The point estimate is a single number, while the confidence interval, naturally, is an interval. The two are closely related. In fact, the point estimate is located exactly in the middle of the confidence interval. However, confidence intervals provide much more information and are preferred when making inferences. Don't worry, we'll have a separate lesson dedicated to confidence intervals. All right, have we seen any estimates?
So far, sure we have. The sample mean x bar is a point estimate of the population mean mu. Moreover, the sample variance s squared was an estimate of the population variance sigma squared. There may be many estimators for the same variable. However, they all have two properties: bias and efficiency. We will not prove them, as the mathematics associated is really out of the scope of this course. However, you should have an idea about the concepts. Estimators are like judges: we are always looking for the most efficient unbiased estimators. An unbiased estimator has an expected value equal to the population parameter. Let's think of a biased estimator to explain that point. What if somebody told you that you can find the average height of Americans by taking a sample, finding its mean and then adding one foot to that result? So the formula is x bar plus one foot. Well, I hope you wouldn't trust them. They gave you an estimator, but a biased one. It makes much more sense that the average height of Americans is approximated just by the sample mean, right? We say that the bias of this estimator is one foot. Clear? Okay, great. Let's move on to efficiency. The most efficient estimators are the ones with the least variability of outcomes. Among the estimators we know so far, we haven't seen any with problematic variance, so it's hard to exemplify. It's enough to know that most efficient means the unbiased estimator with the smallest variance. A final note worth making is about the difference between estimators and statistics. The word statistic is the broader term. A point estimate is a statistic. All right, this is how we can describe estimators and point estimates. Okay, so you've learned about point estimators, right? But as you can guess, they're not very reliable. Imagine visiting 5% of the restaurants in London and saying that the average meal is worth £22.50. You may be close, but chances are that the true value isn't exactly £22.50.
It is more likely somewhere around it, so it's much safer to say that the average meal in London is somewhere between £20 and £25. In this way, you have created a confidence interval around your point estimate of £22.50. The confidence interval is a much more accurate representation of reality. However, there is still some uncertainty left, which we measure in levels of confidence. So, getting back to our example, you may say that you are 95% confident that the population parameter lies between £20 and £25. Keep in mind that you can never be 100% confident unless you go through the entire population. And there is, of course, a 5% chance that the actual population parameter is outside of the £20 to £25 range. We will observe that if the sample we have considered deviates significantly from the entire population. All right, there is one more ingredient needed: the level of confidence. It is denoted by 1 minus alpha and is called the confidence level of the interval. Alpha is a value between 0 and 1. For example, if we want to be 95% confident that the parameter is inside the interval, alpha is 5%. If we want a higher confidence level of, say, 99%, alpha will be 1%. Don't worry, we'll discuss this in more detail in our next lesson. You can't wait until the next lesson? Okay, here's the formula for all confidence intervals: the interval runs from the point estimate minus the reliability factor multiplied by the standard error to the point estimate plus the reliability factor multiplied by the standard error. We know what a point estimate is: a value like x bar, right? We also know what the standard error is. What about the reliability factor? We'll have to introduce it in our next lesson. Thanks for watching. 17. Confidence Intervals: The confidence interval is a range within which you expect the population parameter to be, and its estimation is based on the data we have in our sample.
There are two main situations in which we calculate confidence intervals for a population: when the population variance is known and when it is unknown. Depending on which situation we are in, we would use a different calculation method. Now, the whole field of statistics exists because we almost never have population data. Even if we do have population data, we may not be able to analyze it. It may be so much that it doesn't make sense to use it all at once. In this lesson, we'll explore the confidence interval for a population mean with a known variance. An important assumption in this calculation is that the population is normally distributed. Even if it is not, you should use a large sample and let the central limit theorem do the normalization magic for you. Remember, if you work with a sample which is large enough, you can assume normality of the sample means. All right, let's say you want to become a data scientist and you're interested in the salary you're going to get. Imagine you have certain information that the population standard deviation of data science salaries is equal to $15,000. Furthermore, you know the salaries are normally distributed, and your sample consists of 30 salaries. The formula for the confidence interval with a known variance is as follows: the population mean will fall between the sample mean minus z of alpha divided by 2, multiplied by the standard error, and the sample mean plus z of alpha divided by 2, times the standard error. The sample mean is the point estimate. You all know about the standard error already, so we can compute it using its formula. What we have left is the so-called reliability factor. z of alpha divided by 2 is a statistic we've described earlier: it is the standardized variable that has a standard normal distribution, right? What about alpha? This is the same alpha we had when we defined our confidence interval. So for a confidence level of 95%, alpha will be equal to 5%. Similarly, for a confidence level of 99%, alpha will be equal to 1%.
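The formula just described can be sketched in Python as a rough check on hand calculations. The function name `z_confidence_interval` is hypothetical, and the sketch assumes a known population standard deviation and a normally distributed sample mean. The inputs are the figures from this lesson's salary example (sample mean of $100,200, standard deviation of $15,000, 30 observations).

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, population_sd, n, confidence=0.95):
    """Confidence interval for a mean when the population variance is known."""
    alpha = 1 - confidence
    # Reliability factor: the z value that leaves alpha / 2 in the upper tail.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = population_sd / sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Figures from the lesson's data scientist salary example.
low, high = z_confidence_interval(100_200, 15_000, 30, confidence=0.95)
print(f"95% CI: {low:,.0f} to {high:,.0f}")
```

Because the code uses the exact z value rather than the table value rounded to 1.96, the limits can differ by a dollar or so from a calculation done with a printed z table.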
It all fits into place now, doesn't it? Let's go back to our example. The sample mean is $100,200, and the standard deviation is known to be $15,000. Thus the standard error is $2,739. Having calculated these values, we can take the next step and choose our confidence level. Common confidence levels are 90%, 95% and 99%, with respective alphas of 10%, 5% and 1%. Put another way, the values of alpha are 0.10, 0.05 and 0.01 respectively. Keep in mind that a 95% confidence level means you are sure that in 95% of the cases, the true population parameter would fall into the specified interval. Okay. z of alpha comes from the so-called standard normal distribution table. It is best to see it and then comment on it. Let's say we want to find the values for a 95% confidence interval. Alpha is 0.05. Therefore, we are looking for z of alpha divided by 2, or 0.025. In the table, this will match the value of 1 minus 0.025, or 0.975. The corresponding z comes from the sum of the row and column headers associated with this cell. In our case, the value is 1.9 plus 0.06, or 1.96. A commonly used term for this z is critical value. So we have found the critical value for this confidence interval. Now we can easily substitute in the formula. The final confidence interval becomes $94,833 to $105,568. The interpretation is the following: we are 95% confident that the average data scientist salary will be in the interval from $94,833 to $105,568. Let's repeat the exercise using a higher confidence level. Say we want to be 99% certain of the outcome. Alpha is 0.01. We look in the table for the value of 1 minus 0.005, which is equal to 0.995. There is no such value. When this happens, we just round up to the nearest value available. The corresponding critical value is 2.5 plus 0.08. Thus we have 2.58. We plug it into our formula.
The new confidence interval is equal to $93,135 to $107,265, meaning that we are 99% confident that the average data scientist salary is going to lie in the interval between $93,135 and $107,265. Please note that in this case there is a trade-off between the level of confidence we choose and the estimation precision. The interval we obtained is broader. The opposite is also true: a narrower confidence interval translates to higher uncertainty. Makes sense, right? If we are trying to estimate the population mean and we are picking a larger interval, we are increasing our chances of having an interval that actually includes the mean, and vice versa: if we want to be more specific about the population mean range, this will take away from our confidence in the statement. Okay. This lesson was a bit longer, but very insightful. Please make sure you practice with the exercises provided. They will help you reinforce the knowledge about the concept, which is fundamental for anyone who wants to work with numbers in their day-to-day job. Thanks for watching. 18. Confidence Intervals Clarification and Student's T Distribution: Let's take a step back and try to understand confidence intervals a bit better. Here's a graph of a normal distribution. You know that the sample mean is in the middle of the graph. Now, if we know that a variable is normally distributed, we're basically making a statement that the majority of observations will be around the mean and the rest far away from it. Let's draw a confidence interval. There's a lower limit and an upper limit. A 95% confidence interval would imply that we are 95% confident that the true population mean falls within this interval. There is a 2.5% chance that it will be on the left of the lower limit and a 2.5% chance it will be on the right. Overall, there is a 5% chance.
That is the chance that our confidence interval does not contain the true population mean. So when alpha is 0.05, or 5%, we have alpha divided by 2, or a 2.5% chance, that the true population mean is on the left of the interval and a 2.5% chance that it is on the right. Okay, great. Using the z score and the formula, we are implicitly starting from a standard normal distribution. Therefore, the mean is 0, the lower limit is minus z, while the upper one is z. For a 95% confidence interval, using the z table, we can find that these limits are minus 1.96 and 1.96. That's exactly what we did in the previous lesson. Finally, the formula makes sure that we get back to the original range of values, and we get the interval for our particular data set. Okay, what if we're looking at a 90% confidence interval? In that case, the interval looks like this, and there's a 10% chance that the true mean is outside the interval, actually 5% on each side. This causes the confidence interval to shrink. So when our confidence is lower, the confidence interval itself is smaller. Similarly, for a 99% confidence interval, we would have a higher confidence but a much larger confidence interval. Let's see an example just to make sure we have solidified this knowledge. I don't know your age, dear student, but I am 95% confident that you are between 18 and 55 years old. All I have is the fact that you are taking an online statistics course. There's no more information to begin with, plus I don't have any information about the age of any of the students, hence the wide interval. Okay, I am 95% confident that you are between 18 and 55 years old. Also, I am 99% confident that you are between 10 and 70 years old. I am 100% confident that you are between 0 and 118 years old, which is the age of the oldest person alive at the time of recording. Finally, I am 5% confident that you are 25 years old. Obviously, this is a completely arbitrary number.
As you can see, there's a trade-off between the level of confidence and the range of the interval. 100% confidence means that our interval is completely useless, as I must include all ages possible in order to gain 100% confidence. 99% confidence gives me a much narrower range, but it's still not insightful enough for this particular problem. 25 years old, on the other hand, is a pretty useful estimate, as we have an exact number, but the level of confidence of 5% is too small for us to make use of in any meaningful analysis. There is always a trade-off, which depends on the problem at hand. 95% is the accepted norm, as we don't compromise on accuracy too much but still get a relatively narrow interval. It's time for a short break from all these numbers and calculations. I would like to tell you a story. William Gosset was an English statistician who worked for the brewery of Guinness. He developed different methods for the selection of the best-yielding varieties of barley, an important ingredient in making beer. Gosset found big samples tedious, so he was trying to develop a way to extract small samples but still come up with meaningful predictions. He was a curious and productive researcher and published a number of papers that are still relevant today. However, due to the company's policy, he was not allowed to sign the papers with his own name. Therefore, all of his work was published under the pen name Student. Later on, a friend of his, the famous statistician Ronald Fisher, building on the findings of Gosset, introduced the t statistic, and the name that stuck with the corresponding distribution even today is Student's t. The Student's t distribution is one of the biggest breakthroughs in statistics, as it allowed inference through small samples with an unknown population variance. This setting can be applied to a big part of the statistical problems we face today, and it's an important part of this course. All right.
Visually, the Student's t distribution looks much like a normal distribution but generally has fatter tails. Fatter tails, as you may remember, allow for a higher dispersion of values, and there is more uncertainty. In the same way that the z statistic is related to the standard normal distribution, the t statistic is related to the Student's t distribution. The formula that allows us to calculate it is: t with n minus 1 degrees of freedom and a significance level of alpha equals the sample mean minus the population mean, divided by the standard error of the sample. As you can see, it is very similar to the z statistic. After all, this is an approximation of the normal distribution. The last characteristic of the Student's t statistic is that there are degrees of freedom. Usually, for a sample of n, we have n minus 1 degrees of freedom. So for a sample of 20 observations, the degrees of freedom are 19. Much like the standard normal distribution table, we also have a Student's t table. The rows indicate different degrees of freedom, abbreviated as d.f., while the columns contain common alphas. Please note that after the 30th row, the numbers don't vary that much. Actually, after 30 degrees of freedom, the t statistic table becomes almost the same as the z statistic table. As the degrees of freedom depend on the sample, in essence, the bigger the sample, the closer we get to the actual numbers. A common rule of thumb is that for a sample containing more than 50 observations, we use the z table instead of the t table. All right, in our next lesson we'll apply our knowledge in practice. Thanks for watching. 19. The Null and Alternative Hypothesis: When you're making a decision, you need a yes or no answer. The correct approach in this case is to use a test. In this lesson, we'll learn how to perform one of the fundamental tasks in statistics: hypothesis testing. There are four steps in data-driven decision making. First, you must formulate a hypothesis. Second, once you have formulated a hypothesis,
you will have to find the right test for your hypothesis. Third, you execute the test, and fourth, you make a decision based on the result. Let's start from the beginning. What is a hypothesis? Though there are many ways to define it, the most intuitive I've seen is: a hypothesis is an idea that can be tested. This is not the formal definition, but it explains the point very well. So if I tell you that apples in New York are expensive, this is an idea or a statement, but it is not testable until I have something to compare it with. For instance, if I define expensive as any price higher than $1.75 a pound, then it immediately becomes a hypothesis. All right, what's something that cannot be a hypothesis? An example may be: would the USA do better or worse under a Clinton administration compared to a Trump administration? Statistically speaking, this is an idea, but there is no data to test it. Therefore, it cannot be a hypothesis of a statistical test. Actually, it is more likely to be a topic of another discipline. Conversely, in statistics, we may compare different US presidencies that have already been completed, such as the Obama administration and the Bush administration, as we have data on both. All right, let's get out of politics and get into hypotheses. Here's a simple topic that can be tested. According to Glassdoor, the popular salary information website, the mean data scientist salary in the US is $113,000. So we want to test if their estimate is correct. There are two hypotheses that are made: the null hypothesis, denoted H0, and the alternative hypothesis, denoted H1 or HA. The null hypothesis is the one to be tested, and the alternative is everything else. In our example, the null hypothesis would be: the mean data scientist salary is $113,000, while the alternative would be: the mean data scientist salary is not $113,000. Now, you would want to check if $113,000 is close enough to the true mean predicted by our sample.
In case it is, you would accept the null hypothesis. Otherwise, you would reject the null hypothesis. The concept of the null hypothesis is similar to innocent until proven guilty. We assume that the mean salary is $113,000, and we try to prove otherwise. Okay, this was an example of a two-sided or two-tailed test. You can also form one-sided or one-tailed tests. Say your friend Paul told you that he thinks data scientists earn more than $125,000 a year. You doubt him, so you design a test to see who is right. The null hypothesis of this test would be: the mean data scientist salary is more than $125,000. The alternative will cover everything else, thus: the mean data scientist salary is less than or equal to $125,000. It's important to note that outcomes of tests refer to the population parameter rather than the sample statistic, so the result we get is for the population. Another crucial consideration is that generally the researcher is trying to reject the null hypothesis. Think about the null hypothesis as the status quo and the alternative as the change or innovation that challenges that status quo. In our example, Paul was representing the status quo, which we were challenging. Let me emphasize this once again: in statistics, the null hypothesis is the statement we are trying to reject. Therefore, the null hypothesis is the present state of affairs, while the alternative is our personal opinion. It surely is counterintuitive at the beginning, but later on, when you start doing the exercises, you will understand the mechanics. After this lesson, there will be detailed comments on these two examples. Thanks for watching. 20. Establishing a Rejection Region: By the end of this lesson, we will understand how hypothesis testing works. First, we must define the term significance level. Normally, we aim to reject the null if it is false, right? However, as with any test, there is a small chance that we could get it wrong and reject a null hypothesis that is true.
The significance level is denoted by alpha and is the probability of rejecting the null hypothesis if it is true, that is, the probability of making this error. Typical values for alpha are 0.01, 0.05 and 0.1. It is a value that you select based on the certainty you need. In most cases, the choice of alpha is determined by the context you're operating in, but 0.05 is the most commonly used value. Let's explore an example. Say you need to test if a machine is working properly. You would expect the test to make little or no mistakes, as you want to be very precise. You should pick a low significance level, such as 0.01. The famous Coca-Cola glass bottle is 12 ounces. If the machine pours 12.1 ounces, some of the liquid will be spilled and the label will be damaged as well. So in such a situation, we need to be as accurate as possible. However, if we analyze humans or companies, we would expect more random, or at least uncertain, behavior and hence a higher degree of error. For instance, if we want to predict how much Coca-Cola its consumers drink on average, the difference between 12 ounces and 12.1 ounces will not be that crucial, so we can choose a higher significance level like 0.05 or 0.1. Okay, now that we have an idea about the significance level, let's get to the mechanics of hypothesis testing. Imagine you are consulting a university and want to carry out an analysis on how students are performing on average. The university dean believes that on average, students have a GPA of 70%. Being the data-driven researcher that you are, you can't simply agree with his opinion, so you start testing. The null hypothesis is: the population mean grade is 70%. This is the hypothesized value, and we denote it with mu zero. The alternative hypothesis is: the population mean grade is not 70%, so the population mean differs from mu zero. All right, assuming that the population of grades is normally distributed, all grades received by students should look this way.
That is the true population mean. Now, the test we would normally perform is the z-test. The formula is: Z equals the sample mean minus the hypothesized mean, divided by the standard error. The idea is the following: we are standardizing or scaling the sample mean we got. If the sample mean is close enough to the hypothesized mean, then Z will be close to 0. Otherwise, it will be far away from it. Naturally, if the sample mean is exactly equal to the hypothesized mean, Z will be 0. In these cases, we would accept the null hypothesis. The question here is the following: how big should Z be for us to reject the null hypothesis? Well, there is a cut-off line. Since we're conducting a two-sided or two-tailed test, there are two cut-off lines, one on each side. If the value of Z falls in the middle part, then we cannot reject the null. If it falls outside, in the shaded region, then we reject the null hypothesis. That is why the shaded part is called the rejection region. All right, the area that is cut off actually depends on the significance level. Say the level of significance alpha is 0.05. Then we have alpha divided by 2, or 0.025, on the left side and 0.025 on the right side. Now, these are values we can check from the z table. When alpha divided by 2 is 0.025, z is 1.96. So we have 1.96 on the right side and minus 1.96 on the left side. Therefore, if the value we get for Z from the test is lower than minus 1.96 or higher than 1.96, we will reject the null hypothesis. Otherwise, we will accept it. That's more or less how hypothesis testing works. We scale the sample mean with respect to the hypothesized value. If Z is close to 0, then we cannot reject the null. If it is far away from 0, then we reject the null hypothesis. What about one-sided tests? We have those too. Consider the example from the last lesson: Paul says data scientists earn more than $125,000. So H0 is: mu zero is bigger than $125,000. The alternative is that mu zero is lower than or equal to $125,000. We will use the same level of significance.
This time, the whole rejection region is on the left, so the rejection region has an area of alpha. Looking at the z table, that corresponds to a z score of 1.645, and since it is on the left, it comes with a minus sign. Now, when calculating our test statistic Z, if we get a value lower than minus 1.645, we would reject the null hypothesis, as we have statistical evidence that the data scientist salary is less than $125,000. Otherwise, we would accept it. All right, to exhaust all possibilities, let's explore another example. Say a student told you that the average GPA students get is lower than 70%. In that case, the null hypothesis is: mu zero is lower than 70%, while the alternative is: mu zero is bigger than or equal to 70%. In this situation, the rejection region is on the right side. So if the test statistic is bigger than the cut-off z score, we would reject the null hypothesis. Otherwise, we wouldn't. Cool. That's all for now. Thanks for watching. 21. Type 1 and Type 11 Error: In this lesson, we'll learn about the errors that can be made in hypothesis testing. In general, we can have two types of errors: type I and type II. A type I error is when you reject a true null hypothesis. It is also called a false positive. The probability of making this error is alpha, the level of significance. Since you, the researcher, choose the alpha, the responsibility for making this error lies solely on you. A type II error is when you accept a false null hypothesis. It is also called a false negative. The probability of making this error is denoted by beta. Beta depends mainly on the sample size and the magnitude of the effect. So if your topic is difficult to test due to hard sampling, or the effect you are looking for is almost negligible, it is more likely to make this type of error. We should also mention that the probability of rejecting a false null hypothesis is equal to 1 minus beta. This is the researcher's goal: to reject a false null hypothesis. Therefore, 1 minus beta
is called the power of the test. Most often, researchers increase the power of a test by increasing the sample size. This is a common table statisticians use to summarize the types of errors. You don't really want to make either of the two errors, but it happens sometimes. You should be aware that statistics is very useful, but not perfect. All right, that's all for this lesson. Thanks for watching. 22. Test for the Mean Population Variance Known: All right, theory's done. It's time for some testing. We're going to explore two types of tests: tests drawn from a single population and tests drawn from multiple populations. This is very similar to the confidence interval for a single population and the confidence interval for two populations. In the next few lessons, we'll run tests for a single mean, with both known variance and unknown variance. Let's start with a test in which the variance is known. For this test, we'll use the data scientist salary example. Here's the data set. The sample mean is $100,200. The population variance is known, and its standard deviation is equal to $15,000. Moreover, the sample size is 30. However, according to Glassdoor, the popular salary information website, the mean data scientist salary is $113,000. The sample that's available on Glassdoor is based on self-reported numbers, so you would like to see if their value is correct. We need a two-sided test, as we are interested in knowing both if the salary is significantly less than that or significantly more than that. The null hypothesis is: the population mean salary is $113,000. We denote it as mu zero equals $113,000. The alternative hypothesis is that the population mean salary is different from $113,000. All right, testing is done by standardizing the variable at hand and comparing it to the lowercase z, which follows a standard normal distribution. We standardize a variable by subtracting the mean and dividing by the standard deviation. Since it is a sample, we use the standard error.
Thus, the formula for standardization becomes: uppercase Z is equal to the sample mean minus the value of interest from the null hypothesis, divided by the standard error. In this way, we obtain a distribution with a mean of 0 and a standard deviation of 1. The uppercase Z should not be mistaken for the lowercase z. The uppercase Z is the standardized variable associated with the test and will be called the Z-score from now on. The lowercase z is the one from the table and henceforth will be referred to as the critical value. All right, how does testing work? Think about this. The lowercase z is normally distributed with a mean of 0 and a standard deviation of 1. The uppercase Z is normally distributed with a mean of x bar minus mu zero and a standard deviation of 1. Standardization lets us compare the means. The closer the difference of x bar and mu zero is to 0, the closer the Z-score itself is to 0. This implies a higher chance to accept the null hypothesis. Let's go back to the example. So what's the value of our standardized variable? We plug in the numbers that we have from the beginning of the lesson. What we get is a Z-score of minus 4.67. Now we'll compare the absolute value of minus 4.67 with the lowercase z of alpha divided by 2, where alpha is the significance level. Note that we use the absolute value, as it is much easier to always compare positive Z-scores with positive critical values. Moreover, some z tables don't include negative values. You should be aware that the two statements "minus 4.67 is lower than the negative critical value" and "4.67 is higher than the positive critical value" are the same. Thus our decision rule becomes: reject if the absolute value of the Z-score is higher than the absolute value of the critical value. Using a 5% significance level, alpha is 0.05. Since it is a two-sided test, we check the table for z of 0.025. The corresponding value is 1.96. The last thing we need to do is compare our standardized variable to the critical value.
If the Z-score is higher than 1.96, we would reject the null hypothesis. If it is lower, we would accept it. 4.67 is higher than 1.96. Therefore, we reject the null hypothesis. The answer is that at the 5% significance level, we have rejected the null hypothesis, or at 5% significance, there is no statistical evidence that the mean salary is $113,000. What if we had a different significance level? Using 1% significance, we have an alpha of 0.01, so z of alpha divided by 2 is 2.58. Once again, our Z-score of 4.67 is higher than 2.58, so we would reject the null hypothesis even at 1% significance. But how much further can we go before we can no longer reject the null hypothesis: 0.5%? 0.1%? There is a special technique that allows us to see what the significance level is after which we would be unable to reject the null hypothesis. We will see it in the next lesson, so stay tuned. 23. What is P Value and Why is it an Important Tools in STATISTICS: Now we know how to test hypotheses and how to reject them. Actually, we rejected the null hypothesis at various levels of significance, but we couldn't find the level of significance for which we can no longer do it. This is the right moment to introduce a measure called the p-value. This is the most common way to test hypotheses. Instead of testing at preassigned levels of significance, we can find the smallest level of significance at which we can still reject the null hypothesis, given the observed sample statistic. So how do we do that? Recall the test we did with the data scientist salary. We had a standard error of $2,739, a known population standard deviation of $15,000, a normally distributed population and a sample size of 30. The corresponding Z-score was minus 4.67. We rejected the null hypothesis at significance levels of 0.05 and 0.01, but we wanted to know how much lower we could go. We can check the z table for plus 4.67, which gives us the same result
In most z-tables, you would not even find this value, as it is so large. Thus we round up to the closest value available and get a p-value of about 0.001. But how do we actually test the hypothesis? Well, after choosing a significance level, you should compare the p-value to it. You should reject the null hypothesis if the p-value is lower than the significance level. Therefore, we can safely say that such a result is extremely significant by any measure of significance.

Let's see another example. If our Z-score is 2.12, we will reject the null hypothesis at 5% but will not reject it at 1% significance. Now it becomes more interesting. At this point, we can actually look at the z-table and find the p-value. We look for the value that corresponds to 2.12 and find out it is 0.983. The p-value for a one-sided test is one minus the number we see in the table, so the corresponding p-value is equal to 0.017. The p-value for a two-sided test is equal to one minus the number we see in the table, multiplied by two. Therefore, the p-value would be 0.034. This is also the exact answer to our question.

All right, so where are p-values used? Most statistical software packages run tests and then provide us with a series of results. One of them is the p-value. It is then up to the researcher to decide whether the variable is statistically significant or not. Generally, software is designed to calculate the value to the third digit after the decimal separator. The point is, when you start conducting your own research, you would love to be able to see the three zeros after the dot. The closer to zero the p-value is, the more significant the result obtained.

The final consideration is that the p-value is an extremely powerful measure, as it works for all distributions. No matter if we are dealing with the normal, Student's t, binomial, or uniform distribution, whatever the test, the p-value rationale holds. If the p-value is lower than the level of significance, you reject the null hypothesis.
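The table lookup for a Z-score of 2.12 can be reproduced with a short sketch. It assumes the standard normal CDF from Python's `statistics.NormalDist` plays the role of the z-table; the function name `p_value_from_z` is hypothetical and chosen only for this example.

```python
from statistics import NormalDist


def p_value_from_z(z_score, two_sided=True):
    """p-value for a Z-score: one tail, doubled for a two-sided test."""
    tail = 1 - NormalDist().cdf(abs(z_score))
    return 2 * tail if two_sided else tail


table_value = NormalDist().cdf(2.12)               # the number read from the z-table
one_sided = p_value_from_z(2.12, two_sided=False)
two_sided = p_value_from_z(2.12)

print(round(table_value, 3))   # 0.983
print(round(one_sided, 3))     # 0.017
print(round(two_sided, 3))     # 0.034
```

With a two-sided p-value of 0.034, the null hypothesis is rejected at 5% significance but not at 1%, which matches what the lesson states for this Z-score.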
I will say that you will normally use the p-value in the presence of a digital medium. Throughout this course, I recommend that you use online p-value calculators to support your studies and to check your answers when doing exercises. Please download the PDF that comes with this lesson, as it includes detailed instructions for how to use p-value calculators. Thanks for watching.

24. Test for the Mean Population Variance Unknown: Now that we know what a p-value is and how to use it, we'll get back to hypothesis testing. We saw only one of the two possible cases. Remember, we haven't covered the more commonly observed case, when the population variance is unknown.

All right, imagine you are the marketing analyst of a company, and the manager asks you to estimate whether the email open rate of one of the firm's competitors is above your company's. Your company has an open rate of 40%. An email open rate is a measure of how many people on the email list actually open the emails they have received. At first, you struggle to figure out how to get such specific information about a competitor company. But then you see that one employee of that competitor posted a selfie on Facebook, saying, "The email management software we are using drives me nuts." In the background you can see a screen, and it clearly shows the summaries of the last 10 email campaigns that were sent and their corresponding open rates. Excellent. With your statistical skills, that's all you need: a little help from Facebook.

Let's state the hypotheses. Null hypothesis: the mean open rate is lower than or equal to 40%. Alternative hypothesis: the mean open rate is higher than 40%. Note that in hypothesis testing, we are aiming to reject the null hypothesis. When we want to test if the open rate is higher than 40%, the null hypothesis actually states the opposite. Also, pay attention that this time we are dealing with a one-sided test. All right, your boss told you that 0.05 is an adequate significance level for the test.
So that's what you will use. Here is the data set. You calculate the sample mean and get 37.70%. The sample standard deviation is 13.74%. Thus the standard error is 4.34%. You assume that the population of open rates of sent emails is normally distributed. Like with confidence intervals with variance unknown and a small sample, the correct statistic to use is the t-statistic. Remember, you do not know the variance, and the sample is not big enough. This means that the variable follows the Student's t distribution, and you must employ a t-statistic.

Let's calculate it. It is computed the same way as the Z-score: the T-score is equal to the sample mean minus the hypothesized mean value, divided by the standard error. The result we get is minus 0.53. As we said earlier, it is easier to work with positive numbers, so we should compare the absolute value of minus 0.53 with the appropriate t, with n minus one degrees of freedom, at 0.05 one-sided significance. We quickly navigate through the table and get 1.83 at 5% significance for the critical value. Okay, 0.53 is lower than 1.83. Remember the decision rule: if the absolute value of the T-score is lower than the statistic from the table, we cannot reject the null hypothesis. Therefore we must accept it. What you do next is go and tell your boss that at this level of significance, statistically, we cannot say that the email open rate of our competitor is higher than 40%.

Okay, what about the second measurement we saw? What was that? Ah yes, the p-value. The p-value of this statistic is 0.304. As the p-value is greater than the significance level of 0.05, we come to the same conclusion: we cannot reject the null hypothesis. Therefore we accept it. Let's do a quick check. If the significance level was 0.01, the p-value would still be higher. We wouldn't reject the null hypothesis. This is an important observation that we haven't noted before. If we cannot reject a test at 0.05 significance, we could not reject it at a smaller level either.
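The open-rate calculation can be checked with a small sketch. It assumes the summary figures as read from the lesson (sample mean 37.70%, sample standard deviation 13.74%, 10 campaigns) and takes the critical value 1.83 straight from the t-table rather than computing it; the function name `t_score` is made up for illustration.

```python
import math


def t_score(sample_mean, hypothesized_mean, sample_std, n):
    """T-score: (sample mean - hypothesized mean) / standard error."""
    standard_error = sample_std / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


n = 10                                  # email campaigns on the screenshot
standard_error = 13.74 / math.sqrt(n)   # sample std over sqrt(n)
t = t_score(37.70, 40.0, 13.74, n)
t_critical = 1.83                       # table value: 9 degrees of freedom, 5% one-sided

print(round(standard_error, 2))         # 4.34
print(round(t, 2))                      # -0.53
print(abs(t) > t_critical)              # False, so the null hypothesis is not rejected
```

Because the alternative hypothesis is "higher than 40%", a negative T-score already points away from it, so the outcome of the table comparison is not surprising.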
All right, that's all for now. Thanks for watching.

25. Test for the Mean Dependent Samples: We're not done with hypothesis testing just yet. Single population means are just the beginning. It is now time to switch to multiple population means. We'll start with dependent samples. The most intuitive examples of dependent samples are the ones you might have been through yourself, like weight loss and blood tests. The sample is drawn from weight loss data or nutrient concentration data, but the subject of interest is the same person before and after.

Okay, let's get going. There was this drug company developing a new pill that supposedly increases the magnesium levels of recipients. There were 10 people involved in the study who were taking the drug for some time. Earlier, we calculated confidence intervals. They helped us study the effects of that drug. They indicated the range of possible values for the population mean. However, this time we want to come to a single, definite conclusion about the effectiveness of the drug.

All right, let's state the null hypothesis: the population mean before is greater than or equal to the population mean after. The alternative is that the population mean before is lower than the one after. Once again, we want to know if the magnesium levels are higher. We construct the null and alternative hypotheses in such a way that we are aiming to reject the null hypothesis. We expect the levels to be higher, so the null hypothesis states them to be lower or equal.

Okay, let's rewrite the hypotheses. H zero is: mu before is greater than or equal to mu after. This is equivalent to: mu before minus mu after is equal to zero or positive. We can substitute this with capital D zero, which stands for the hypothesized population mean difference. So we restate the hypotheses using D for simplicity. Now we have our test designed. Let's crunch some numbers. Here is the data set.
We have 10 observations, with levels registered before and after. Naturally, the difference is "before" minus "after". We can calculate the sample mean of the difference; we get minus 0.33. The sample standard deviation is 0.45, and the standard error is 0.14. The appropriate statistic to use here is the t-statistic: we have a small sample, we assume a normal distribution of the population, and we don't know the variance. So the T-score is equal to the following expression. Now we can simply carry out the calculation and find out that its value is equal to minus 2.29.

Since we don't want to choose a level of significance, let's solve the problem with the p-value. In order to find the p-value of this one-sided test, we may go to the table and see that it is somewhere between 0.01 and 0.025. I told you earlier, using software is much easier. So after using an online p-value calculator, I can tell you that it is exactly 0.024. What was the decision rule again? If the p-value is lower than the significance level we're interested in, we reject the null hypothesis. Okay, so if the level of significance is 0.05, the p-value is lower, and we will be able to reject the null hypothesis at 5%. If the level of significance is 0.01, the p-value is higher, so we cannot reject the null hypothesis at the 1% level of significance. The lowest level at which we can reject the null hypothesis is 0.024, which is exactly the p-value.

So what does this tell us? Well, it is up to the researcher to choose the level of significance. In the case of the magnesium pill, we expect that the researcher will be very cautious, as he would want to know if he is designing an effective pill that will actually be able to help people. If we cannot say that the pill works at the 1% significance level, perhaps it's better to take it back to the laboratory. An alternative would be to test again, increasing the sample size for better results. A sample of 100 people would improve the level of precision significantly.
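For readers without an online calculator at hand, the magnesium example can be approximated with the standard library alone. This is a rough sketch under stated assumptions: the Student's t tail area is obtained by numerical integration (Simpson's rule) rather than from a statistics package, and the helper names `t_pdf` and `one_sided_p_value` are invented for this example. Note that the rounded summary figures (minus 0.33 and 0.45) give a T-score of about minus 2.32; the lesson's minus 2.29 presumably comes from the unrounded data.

```python
import math


def t_pdf(x, df):
    """Density of the Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)


def one_sided_p_value(t, df, steps=2000):
    """Tail area beyond |t|, via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


n = 10
mean_diff, std_diff = -0.33, 0.45            # rounded figures from the lesson
standard_error = std_diff / math.sqrt(n)     # about 0.14
t_rounded = mean_diff / standard_error       # about -2.32 with rounded inputs

p = one_sided_p_value(-2.29, df=n - 1)       # p-value for the lesson's T-score
print(round(standard_error, 2))              # 0.14
print(p < 0.05, p < 0.01)                    # True False
```

The printed comparison reproduces the lesson's conclusion: the null hypothesis can be rejected at 5% significance but not at 1%.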
All right, so we did some more hypothesis testing. We also explored some factors that help you determine the significance level of the test. Stay tuned: in our next lesson we'll learn how to test independent samples. Thanks for watching.

26. Test for the Mean Independent Sample Part 1: As you probably expected, this lesson is all about independent samples with known variance. So let's get into the example right away. We are about to test the average grades of students from two different departments in a UK university. I would like to remind you that in the UK, grades are expressed in percentages. The departments are engineering and management. We were told by the dean that engineering is a tougher discipline and people tend to get lower grades. He believes that, on average, management students outperform engineering students by 4 percentage points. Now it's our job to verify if that is the case.

Let's state the two hypotheses. H zero: the difference between the means of the two populations is minus four. By the way, notice that I can make mu zero engineering minus management and get a negative difference, or I can make it mu zero management minus engineering and get a positive difference. Either way works. So that we can see as many different situations as possible, I will keep the difference negative. H one: the population mean difference is different from minus four. Once again, this is a two-sided test, as the research question is not to find a difference, but to check if it is exactly four.

Right, now let's get our hands dirty. Here's a table that summarizes the data. Sample sizes are 100 and 70, respectively. The sample means are 58% and 65%, and the population standard deviations are 10% and 6%, known from past data.
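The summary figures above are enough to run the whole test, so here is a minimal sketch of it. It assumes sample sizes of 100 (engineering) and 70 (management), which is how the table reads where the recording is unclear, and it uses `statistics.NormalDist` in place of a z-table; all variable names are illustrative.

```python
import math
from statistics import NormalDist

# Summary data (sample sizes are an assumption where the recording is unclear)
n_eng, n_mgmt = 100, 70
mean_eng, mean_mgmt = 58.0, 65.0         # sample means, in percent
sigma_eng, sigma_mgmt = 10.0, 6.0        # known population standard deviations
hypothesized_diff = -4.0                 # H0: engineering minus management = -4

# Standard error of the difference for independent samples, known variances
standard_error = math.sqrt(sigma_eng**2 / n_eng + sigma_mgmt**2 / n_mgmt)

# Z-score: (sample difference - hypothesized difference) / standard error
z = ((mean_eng - mean_mgmt) - hypothesized_diff) / standard_error
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

print(round(z, 2))             # -2.44
print(round(p_two_sided, 3))   # 0.015
```

A p-value of 0.015 is below 0.05, so at 5% significance the hypothesis that the difference is exactly minus four would be rejected.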
When the population variance is known for independent samples, the standard error of the difference is equal to the square root of the sum of the variance of engineering divided by its sample size and the variance of management divided by its sample size. All we have left is to compute the test statistic. We have a large sample and known variance. Therefore, we use the Z-statistic. I hope you're getting this point: small samples and unknown variance mean the t-statistic; large samples and known variance mean the Z-statistic. If we have a large sample and unknown variance, it is up to the researcher, but generally it is okay to use Z in that case as well.

All right, here's the formula for the test statistic: the sample difference of the means minus the hypothesized difference of the means, divided by the standard error. We plug in the numbers and get a Z-score of minus 2.44. Let's calculate the p-value once again. I'll just give you the p-value, as usually you will obtain it using software. The p-value of the two-sided test is 0.015. All we can say is that at 5% significance, which is common for such a study, the p-value of 0.015 is lower than 0.05. Thus we reject the null hypothesis. There is enough statistical evidence that the difference of the two means is not 4%.

All right, here is a trick. What if you want to know if the difference is higher or lower than minus four? The sign of the test statistic can give you that information. A minus sign of the test statistic means it's smaller. If you reverse-engineer the standardization process, you find out that the true value is likely to be lower than the hypothesized value. In our case, this translates into: the true mean difference is likely to be lower than minus four. Lower than minus four entails that possible values are minus five, minus six, and so on. This is additional information you can give to the dean. All right, done with this lesson. Let's proceed to the final lesson. See you there.

27. Test for the Mean independent Sample Part 2: This is the last lesson we'll be doing on testing.
The last case we'll examine here is the one with independent samples and unknown variances, which are assumed to be equal. So you are trying to see if apples in New York are more expensive than the ones in L.A. You went to 10 grocery shops in New York, and your friend Paul, who lives in L.A., went to 8 grocery shops there.

Let's state the null and alternative hypotheses. H zero: mu in New York is equal to mu in L.A., or mu New York minus mu L.A. is equal to zero. H one: mu in New York is different from mu in L.A., or mu New York minus mu L.A. differs from zero.

All right, that's our data. We have also calculated the sample means, standard deviations, and sample sizes. What can we do when the variances are unknown but assumed to be equal? We use the pooled variance formula. Yet again, it's all about plugging in numbers. So, to save you the trouble, the pooled variance is 0.05. One last thing we need is the standard error of the difference of the means. It's given by the following formula.

Testing is about understanding; computation is routine. So let's start testing, shall we? Small samples, variance unknown: what statistic do we need? The t-statistic. How many degrees of freedom? Again, you may recall from earlier that it was the combined sample size minus the number of variables. So 10 plus 8 minus 2, which gives us 16. Let's see the t-statistic formula once again: the difference between the sample means, minus the difference between the hypothesized true means, divided by the standard error. After plugging in the numbers, we get a t-statistic of 6.53.

Do we even need to compare it? This is by far the most extreme test statistic we have seen. You will have a hard time finding it in a t-table. For common tests, a rule of thumb is to reject the null hypothesis when the T-score is bigger than two. Generally, for a Z-score or a T-score, a value that is higher than four is extremely significant. Let's see the two-sided p-value anyway.
The p-value of this test is lower than 0.001. In our lesson about the p-value, we said that researchers are always looking for those three zeros after the dot. It means that the test is extremely significant. The probability of making a Type I error is virtually zero. Therefore, we reject the null hypothesis at all common and uncommon levels of significance. There is strong statistical evidence that the price of apples in New York differs from that in L.A.

But such an extreme result may also mean that the hypothesis test is pointless or poorly designed. From the mean values of 3.94 and 3.25, with such small, close standard deviations of around 0.2, we could easily say that the prices are different. No testing is needed. But a much more interesting question would be whether the price of apples in New York is 20% higher than in L.A. I will leave you this example for homework. All right, this is our last lesson on hypothesis testing. Thanks for watching.
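As a closing check, the apple-price test can be sketched from the figures quoted in this lesson. It assumes the pooled variance of 0.05 as given (it is not recomputed from raw prices, which the recording does not list) and 10 and 8 shops respectively; variable names are illustrative. With these rounded inputs the T-score comes out near 6.5, slightly below the lesson's 6.53, which presumably uses unrounded values.

```python
import math

n_ny, n_la = 10, 8                 # grocery shops visited in each city
mean_ny, mean_la = 3.94, 3.25      # sample mean prices
pooled_variance = 0.05             # rounded value quoted in the lesson
hypothesized_diff = 0.0            # H0: the two population means are equal

# Standard error of the difference, using the same pooled variance for both samples
standard_error = math.sqrt(pooled_variance / n_ny + pooled_variance / n_la)

# T-score and degrees of freedom (combined sample size minus two)
t = ((mean_ny - mean_la) - hypothesized_diff) / standard_error
degrees_of_freedom = n_ny + n_la - 2

print(degrees_of_freedom)          # 16
print(t > 4)                       # True: extremely significant by the rule of thumb
```

To try the homework question, one option is to change `hypothesized_diff` from zero to 20% of the L.A. mean and see how far the T-score drops.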