Introduction to Data Science & Machine Learning (For Beginners & Managers) | Fadi Khoury, MSc | Skillshare


Fadi Khoury, MSc, Business & Technology Professional

16 Lessons (45m)
    • 1. Introduction

    • 2. What is data science?

    • 3. What is big data?

    • 4. Analysis Vs Analytics

    • 5. Structured and Unstructured Data

    • 6. Basic probability

    • 7. Expected Value

    • 8. Population and Sample

    • 9. Types of Data

    • 10. Central tendency

    • 11. Skewness

    • 12. Jupyter notebook hands on

    • 13. Introduction to Machine Learning

    • 14. Supervised Learning

    • 15. Unsupervised Learning

    • 16. Azure Machine Learning Studio hands on


About This Class

This is a high-level introduction to Data Science and Machine Learning for beginners and managers. By taking this course, you will learn the basics of big data, probability, statistics, Python coding, and Azure Machine Learning Studio. This is a great way to start your journey towards data literacy!


1. Introduction: Hello, everyone, and welcome to this introduction to data science and machine learning course. I have made this course for three main reasons. The first reason is data literacy: my intention is to improve and increase data literacy among business professionals especially. The second one is that I want to open new horizons for you. You could decide to become a data scientist, data analyst, machine learning engineer, whatever role in this field, so this is a good introduction for you to make a decision, whether you are a professional already or you're still a student who's not sure yet about what career path to take. The third one, last but not least, is understanding machine learning. Unfortunately, not many people understand machine learning, even though it has been on the hype for a few years, and that's why I made this introduction to machine learning. We will be doing a hands-on exercise using Microsoft Azure Machine Learning Studio to better understand how machine learning works and how it's structured. I'm really excited, and I hope you are. Let's jump straight into it and start our course.

2. What is data science?: Hello again and welcome to this course. So what is data science? Let's have a look at the definition from Wikipedia: data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from many structured and unstructured data. So it's an interdisciplinary field. Why? Because a data scientist has to know math, statistics, probability, computer science, and programming. Above all, he or she should also have domain knowledge, or business knowledge, which in my opinion is the most important thing a data scientist should possess in order to be able to extract meaning and insights from data related to that domain or that business, and that will allow him or her to construct a story.
Please remember, guys, that a good data scientist is also a great storyteller. So have a look, please, at this: you can see that data science comes in the center. It sits in the center between math and statistics, probability, computer science, and of course the domain and business knowledge. I hope that summarizes the definition of data science for you, and I'll see you in the next video to continue our lessons.

3. What is big data?: Hello again, everyone. So this chapter will discuss big data. When do we call it big data? Around 40 exabytes of data get generated every single month by a single mobile phone user; imagine, an exabyte is one billion gigabytes, and we have around five billion mobile phone users in the world, so you do the math on the size of the data. That's a tremendous amount of data for traditional computers to handle, and that's what big data is all about. We classify data as big data based on what we call the five Vs, and those are volume, velocity, variety, veracity, and value.

So let's take a look at the first one, which is volume. This is the amount of data we have from different sources; that's a big volume of data, as previously discussed. Velocity is the speed at which the data is being generated, usually in real time or near real time. Then we have variety: these are the multiple sources the data is flowing from, machines, people, processes, and it's a mix of structured and unstructured data. Then comes veracity: this is the quality, consistency, integrity, and reliability of the data. Lastly comes value: of course, this is the usefulness of the data and the ability to extract valuable insights from it. So I hope this summarizes what big data is all about, and I will see you in the next video to continue our course.

4. Analysis Vs Analytics: Hello again, everyone. So, analysis versus analytics. People tend to use analysis and analytics interchangeably, which is of course not correct.
That's due to the lack of understanding of the definition of both, so let's take a look at the definitions. Let's say you take a data set and you split it into three smaller data sets to study correlations; this is what we call analysis. And analysis always examines the past, for example, why there was a drop in sales last year. On the other hand, and as you might have guessed already, analytics deals with the future, so it's used to detect patterns and to formulate predictions. Analytics is performed using qualitative and/or quantitative methods. Qualitative methods usually use intuition to make decisions, while quantitative methods rely on data and numbers. Let's have a look at an example. Say you have a clothing store and you want to introduce a new collection. You could use qualitative analytics to decide on the best design and quantitative analytics to decide on the best price. Great, so now we know the difference between analysis and analytics. See you in the next video to continue our lessons.

5. Structured and Unstructured Data: All right, hello again, everyone. We can't talk about data science and big data without explaining the difference between structured and unstructured data. Let's start with structured data. Well, as its name suggests, it's data that has a structure, normally in columns and rows, where the variables are well defined. Businesses usually deal mostly with structured data, which is stored in databases such as SQL, and you might hear this name, SQL, a lot. Unstructured data, on the other hand, comes in different types and forms, such as images, audio, video, emails, and text files. But here's my question to you: what type of data is more available in the world? Is it the structured or the unstructured? If you said it's the unstructured data, you are absolutely right. Now take another guess on the percentage split between the two. And here comes a shocking number: up to 95% of data in the world is unstructured.
That's massive, right? In enterprises, up to 80% of the data is unstructured, as per a study carried out by Gartner, and that's, by the way, where the real value lies. And that's where the role of the skilled data scientist comes into play: to extract those hidden gems from those piles and piles of data. I hope you now have a clear understanding of the difference between structured and unstructured data. See you in the next video to move on with our lessons.

6. Basic probability: Hello again, everyone. So let's talk a little bit about probability. By definition, probability is the likelihood of an event occurring. Probabilities are usually expressed either as percentages, as fractions, or, preferably, as numbers between zero and one. For example, we can say it's 20%, 1/5, or 0.2. If we are to put the probability of an event occurring on a line between zero and one, zero would represent the absolute certainty of the event not occurring, and one would be the absolute certainty of the event occurring. So if we take an event A, we would denote the probability of this event occurring as P(A), which is: probability of A equals preferred outcomes divided by all possible outcomes.

Let's take the example of flipping a coin; it's a simple one, right? If we consider getting a head as our favorable or preferred outcome, and since the possible outcomes we have here are only two, either head or tail, we would denote P(A) as 1 over 2, which is 0.5. Great. Now let's take another example. Let's say you want to roll a die and get the number three. What is the probability in this case? We have six possible outcomes this time, right? And our preferred outcome is one of them, so we would denote it as P(A) = 1/6, or about 0.17. Now let's say you want to get a number that's divisible by three; that means you need to get either three or six. So in this case, we would denote it as P(A) = 2/6, or 0.33, probability of this occurring.
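The coin and die calculations above can be sketched in a few lines of Python (this snippet is my illustration, not part of the course materials):

```python
from fractions import Fraction

def probability(preferred, possible):
    """P(A) = preferred outcomes / all possible outcomes."""
    return Fraction(preferred, possible)

# Coin flip: one preferred outcome (heads) out of two possible.
print(float(probability(1, 2)))  # 0.5

# Die roll: getting a three, one preferred outcome out of six.
print(float(probability(1, 6)))  # ~0.167

# Die roll: getting a number divisible by three (3 or 6).
print(float(probability(2, 6)))  # ~0.333
```

Using `Fraction` keeps the exact ratio around, so you can print either the fraction or its decimal form.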
I hope this is clear, and see you in the next lesson to continue.

7. Expected Value: Hello again, everyone, and let's talk now about the expected value. That would be the expected outcome of an event if we run the experiment many times. Let's say we don't know the probability of getting a head when we toss a coin, so we decide to toss a coin 20 times as an experiment and we record the outcomes. This outcome would be called the experimental probability, unlike what we saw before, which is called the theoretical probability. We have denoted the theoretical probability as P(A); however, the expected value of an event A is denoted as E(A), which is the outcome we expect to occur when we run an experiment. Let's take an example to understand better. We want to know the expected value of getting the number three when we roll a die, so we decide to roll the die 20 times. The expected value in this case is denoted as E(A) = P(A) times n, where n is the number of times the experiment is conducted, and in this case it is 20. Remember from our previous lesson that P(A) was 1/6, so we can denote this as E(A) = 1/6 multiplied by 20, and that gives us 3.33. So we would expect to get the number three 3.33 times if we throw the die 20 times. I hope this is clear, and I will see you in the next video.

8. Population and Sample: Hello again, everyone. So let's talk a bit about some basic statistics. When we hear the word statistics, we should immediately think about whether we are dealing with what we call a population, usually denoted with a capital N, where the numbers we obtain when working with a population are called parameters; or a sample, which is denoted with a lowercase n, where the numbers obtained when working with a sample are called statistics. So now you know where the term statistics comes from. So why do we normally work with samples?
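The expected-value formula from lesson 7 reduces to a one-line function in Python (again, my sketch rather than course code):

```python
def expected_value(p, n):
    """E(A) = P(A) * n: how often we expect event A over n trials."""
    return p * n

# Expecting the number three over 20 die rolls: 1/6 * 20.
print(round(expected_value(1 / 6, 20), 2))  # 3.33
```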
Well, that's because it's a lot easier, and less time- and money-consuming, to analyze a small sample of the population rather than the entire population. Samples have two defining characteristics, and those are randomness and representativeness. A sample must always be random and representative in order to be precise and reliable. Let's say you want to conduct a survey about the satisfaction of the citizens with the country's health care system. You head to one of the states and you conduct the survey with a number of people. Is that a good sample, in your opinion? You might guess that it's not. But why? Because the sample isn't random, as you have chosen the citizens of that state only. Also, it's not representative, because many people from other states did not get the chance to express their opinion. So that's what population and sample are about. See you in the next video to continue our course.

9. Types of Data: Hello again, everyone. In this lesson we will talk about the types of data, which can be divided into two: we have categorical data and numerical data. Categorical data, like its name suggests, splits data into categories. For example, you ask a question to a group of people leaving a movie theater, whether they liked the movie or not. You get answers such as yes or no, and those are categories. Numerical data, on the other hand, represents numbers, and it's also divided into two subsets: we have discrete and continuous. Discrete data can be counted, like the number of people, for example: 0, 1, 2, 3. Continuous data, on the other hand, is infinite, such as distance, which can be anything like 7.231245, or time, which can be something like 9.45676 seconds. I hope this explains the types of data, and I'll see you in the next lesson.

10. Central tendency: All right, everyone, so hello again. In this lesson, we will learn about three measures of central tendency: the mean, the median, and the mode.
Let's start with the mean. The mean is just the simple average, which is calculated by adding all the components of a data set and then dividing by their number. The mean is the most common measure of central tendency, but it has its own downside, as it gets easily affected by outliers. Take, for example, the price of a pizza in ten different locations in two different cities, L.A. and C.A., and let's calculate the mean price in both cities. The average in L.A. is $12, while in C.A. it's five or six dollars. Can it be that the pizza price in L.A. is double the price in C.A.? Have you noticed the problem here? The restaurant selling the pizza at $70 had a big impact on the average result, and that's our outlier here. The best way to protect ourselves from such a misleading calculation is to calculate the median instead of the mean.

So how do we calculate the median? Well, we just need to arrange our data in ascending order, then we use the formula (n + 1) / 2, where n is simply the number of positions. So, for example, here we have a total of 11 positions in L.A. We will have 11 plus 1 divided by 2, which is 6. So at position six, the price is $6 in L.A. In C.A. we have a total of 10 positions, so we do 10 plus 1 divided by 2, which is 5.5. But we don't have a position 5.5 here, right? So what we do is take the two positions around 5.5, position five and position six, add their values, and divide them by two. So we add five plus six and divide by two, which is equal to 5.5. Now we have the median for the two cities, and that makes a lot more sense than the average, or the mean, right?

So what about the mode? Well, the mode is simply the value that repeats the most in the data set. If we look at L.A., we see that the price of $3 has the highest frequency, so in this case the mode of L.A. is three. But wait, what about C.A.?
Can you find a value that was repeated more than once? No, right? So in this case, we say that there is no mode, or a zero mode. It's also worth noting that we can have multiple modes in a single data set. Great, so now we have learned about the three measures of central tendency. See you in the next video.

11. Skewness: All right, guys, hello again. So, have you heard the word skewness before? Well, in data science, skewness indicates whether the data is concentrated on one side. We look at the tail, which indicates whether the data is left or right skewed, and that's where the outliers would be leaning to. Let's have a look at these three examples. The first one is right-skewed data, also known as a positive skew, which means the outliers are concentrated on the right, and here the mean is greater than the median which, as you can see, is in turn greater than the mode. In the second one we have what we call symmetrical data, also called zero skew; here the mode, the median, and the mean are all equal. And in the third one we have left-skewed data, or a negative skew, where the outliers are concentrated on the left side. This is one of the visuals we will generate when we start our coding lesson, to examine the symmetry of our data sets and decide whether we should use the mean, the median, or the mode when calculating our central tendency. Hope that was clear, and see you in the next video.

12. Jupyter notebook hands on: Hello, everyone. So now we have reached our coding exercise. We are going to use a Jupyter notebook with Python. This is going to be very basic, just to give you some hands-on experience and show you how this works; nothing complex, nothing complicated about this. This is very basic stuff that you can use. If you're a manager, you can maybe use it during your work: say you have a data set that you want to explore, you can use these commands. So let's start and get into it.
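Before we open the notebook, here is a quick preview of the three central-tendency measures in plain Python. The price list below is made up to mimic the L.A. example from lesson 10, including a $70 outlier; it is not the course's exact data:

```python
import statistics

# Hypothetical pizza prices in dollars (illustrative only).
prices = [1, 2, 3, 3, 5, 6, 7, 9, 10, 12, 70]

mean = statistics.mean(prices)      # ~11.6, dragged up by the $70 outlier
median = statistics.median(prices)  # 6, position (11 + 1) / 2 in sorted order
mode = statistics.mode(prices)      # 3, the most frequent value

print(mean, median, mode)
print(mean > median)  # True: a right (positive) skew, as in lesson 11
```

Notice how the mean lands above the median because of the single outlier, which is exactly why the median is the safer measure here.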
In order to open this page, you can simply use the web browser for Jupyter; nothing needs to be downloaded. So just go ahead and type in Google: Jupyter notebook. Once done, go to jupyter.org, the first link, scroll down, and hit "Try it in your browser." Once you've done that, you will be redirected to this screen. The first thing to do here is to upload the data. I have attached the data in the classroom for you; you can go there and download the data set CSV. Once done, come back here, find Open, and then upload the data: hit Upload and upload your data. Once that is done, you are all set to start. So let me cut all of this to start a fresh page.

All right, so what we have here is where we are going to write our code. Anything we write that starts with a hash sign doesn't count as code; it is counted as a note. Another nice thing here: we can write, for example, "Introduction" and change it to be a heading. So now I have a heading; I just hit Shift+Enter, and I have a heading, not code, and then I can code after that. It's nice to organize the notebook that way. A quick overview here: the plus allows me to add code lines, the cut cuts the code lines, we have a copy, and these arrows allow us to change the position of the lines up and down. Plus we have the Run button, so we can quickly hit the Run button, or Shift+Enter, to run the code.

All right, so the first thing we will do here is import a couple of libraries. We import a library called pandas and another one called matplotlib; these libraries are also used later on to do some visualization. So go ahead and type import pandas as pd and also import matplotlib.pyplot as plt, then Shift+Enter. Okay, once the asterisk is gone, that means the processing is complete, and since we don't have an error, we're all good. The second thing is that we will need to import our data set.
We have uploaded it; now we need to import it to work with it, or to call it. So we will call it a data frame: df = pd.read_csv(...). Here we are using the pandas library: we type pd.read_csv, open parentheses, and then the real_estate_dataset.csv file name. Shift+Enter, and now we have the data loaded. The first thing that I would do is have a look at the data structure, so I would call the head, df.head(), and that will allow me to have a look at the first five rows of the data. You can see here the variables: we have street, city, zip code, state, etcetera, and we have the first five rows. So this is how our data set looks.

All right, so what should we do next? Let's do some calculation, or check the mean and median of one of the variables, so we can recall our lessons. To do that, we have to do something called indexing: we take the data frame and open square brackets, and then we put the index that we want to use, the price. So this would call only the price column. Then let's ask for the mean: what's the mean of the price in this data set? The mean is 234,144. Okay, how about the median? We can do the same thing, or let's just copy it, paste it, and then change mean to median. Shift+Enter. Okay, so do you notice that the mean is bigger than the median? Does something come back now from our lessons? Let me refresh your mind with the following. What we will do is say df['price'], but here what we will ask for is a histogram, .hist(), and now it will all come back to you. Ah, I made a mistake here, which is the dot; that's good, because this is how errors come up. I had a mistake and I corrected it. And now you can see: what do we have here?
We have a skew, and the skew is positive, or right-skewed, because as you can see here, the mean is bigger than the median. That means we have outliers to the right. Okay, so it's all coming back, right?

All right, everyone, let's have a look at something else, another plot. Let's go ahead and draw a scatter plot. To do that, we will use the matplotlib library, using plt.scatter for scatter plot, and then open parentheses. We will call the data frame with an index of square feet, which is the size, against the price; let's see how these two correlate. So we will use the price here and just see what we get. Shift+Enter, and there you go. So we have a correlation, of course, between the size and the price: the bigger the size, the higher the price. And we can see here that our data needs some cleaning, because we have some blanks: we have apartments with zero size but with some price. Those are probably blanks that need to be cleaned. And we have one here that has zero price. If we drop these, in general we can see that we can draw a straight line here, and it would be an uptrending straight line that shows the correlation between the price and the size of the apartment. Later on, when we talk about linear regression and go ahead and do some examples using Machine Learning Studio, this is what linear regression uses: it measures the distance from the points. Linear regression works on fitting the best line, so the line can go here, a bit above or below; that's what machine learning does, it fits the line in the best position in order to predict. So later on, for example, once we have the line properly positioned, we can give the model a specific size of apartment in the city, and it would give us an estimate, or it would predict, the price. So those are the commands that we can use in a Jupyter notebook. Those are quick commands; of course, that's not everything.
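The whole notebook session boils down to a handful of pandas calls. Here is a compact, self-contained sketch; the tiny data frame below is only a stand-in for the course's real estate CSV (the column names are my assumption, and in the notebook you would use pd.read_csv("real_estate_dataset.csv") instead):

```python
import pandas as pd

# Hypothetical mini data set standing in for real_estate_dataset.csv.
df = pd.DataFrame({
    "sqft":  [0, 750, 900, 1200, 1500, 2000, 2400, 3000],
    "price": [95000, 120000, 150000, 210000, 240000, 310000, 360000, 450000],
})

print(df.head())             # first five rows, to inspect the structure
print(df["price"].mean())    # 241875.0
print(df["price"].median())  # 225000.0
print(df["price"].skew())    # positive, i.e. right-skewed: mean > median
```

The plotting commands from the lesson, df["price"].hist() and plt.scatter(df["sqft"], df["price"]), work on this same frame once matplotlib.pyplot is imported as plt.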
I hope you enjoyed it, and I will see you in the next video, the next lesson, where we move on to machine learning.

13. Introduction to Machine Learning: Hello again, everyone. So now it's getting more interesting: let's talk about machine learning, a topic that has invaded the business world in the last few years, and you've all heard of it, right? So what is it? Machine learning uses algorithms and statistics to find patterns in massive amounts of data. And data here can be a lot of things: it can be numbers, words, images, videos, clicks, whatever; it can be anything. If it can be digitally stored, it can be fed into a machine learning algorithm; it's always a new source of data, or a new input. Machine learning uses different types of algorithms: we have what we call supervised learning, and we have unsupervised learning. There is also a third type called reinforcement learning, but we will just focus on the first two in this course. Under supervised learning we have two types, classification and regression; under unsupervised learning we also have two types, called association and clustering. Now don't worry, we will talk about each of those algorithms in the following lessons. So stay tuned, and I will see you next.

14. Supervised Learning: Okay, guys, so hello again. We will now talk about supervised learning. Supervised learning uses labeled data to train models and map functions that turn an input variable X into an output variable Y; in other words, it solves for f in the following equation: Y = f(X). Let's start with the classification algorithm. We use classification when we want the model to predict a category, for example, a benign or a malignant tumor. In this case, we can say we have two classes, Class A and Class B, or benign and malignant. Regression, on the other hand, is used to predict the outcome of a given sample when the output variable is in the form of a real value.
For example, a regression model might process input data to predict the value of a house or the height of a person. The example here that you see on the screen is an example of a linear regression model, where a value Y can be predicted for any given value X. We should also mention that supervised learning comes in several common models: we have linear regression, logistic regression, CART, naive Bayes, and the K-nearest neighbors, or what's known as KNN. I hope this gives you a high-level idea about the models, and see you in the next lesson.

15. Unsupervised Learning: Hello again, everyone. So now we are going to talk about unsupervised learning; this is the second type of machine learning that we previously mentioned. Unsupervised learning models are used when we only have the input variable X and no corresponding output variables. They use unlabeled training data to model the underlying structure of that data. Let's talk about the first algorithm, which is called association, if you remember. It's used to discover the probability of the co-occurrence of items in a collection, and it's heavily used in market basket analysis. For example, an association model might be used to discover that if a customer purchases bread, he or she is 80% likely to also purchase milk. The second model is clustering, and it's used to group samples such that objects within the same cluster are more similar to each other than to objects from another cluster. As you can see in this image, we have three clusters that are clearly divided. I hope this explains unsupervised learning, and I'll see you in the next video to start our hands-on exercises and examples.

16. Azure Machine Learning Studio hands on: Hello again, everybody.
So now we have reached the last exercise, our last hands-on, which is the machine learning one. We're going to test our data set, the real estate data set, and see what kind of accuracy we get if we use linear regression. Remember when we did the scatter plot? That's how a linear regression model would look. So this is the Azure Machine Learning Studio. You need to go and create an account on Azure; once you have done that, you will be prompted to this screen. Once on this screen, you have to go to Datasets and go ahead and upload from a local file. Upload the data set, which I have done already; I have here the real estate data set CSV. After that, we move to Experiments. We have to create a new experiment: click New and hit Blank, and there we go, we're now in our experiment. It's like a canvas we can work on, and we are now going to use drag and drop. It's very simple, and the model will soon take shape.

The first thing that goes into this model is, of course, our data set. So we find the data set here, select our data set, drag it, and drop it in here; this is the first thing to do. Then, in machine learning, we do what we call training the model, and in order to evaluate this model, we need to split the data in two. The first part, the majority, will be used to train the model, and the other part will be used to score the model. That's why we need to do a split here, so we will use the Split Data module. I connect the data set to Split Data; you see it lights up in green, which means I'm okay to connect it. So we have these two now. Next we need to train the model, so we look for the Train Model item here; it's called Train Model. I put it here, and then we connect the Split Data to Train Model. Now I have to choose what model, what algorithm, I'm going to use.
As we agreed, we're going to use linear regression; let's see what kind of results we get. So I'm going to search for Linear Regression here, drag it, and put it here, and this will connect to the model training. There you go. Great. Next we need to score our model, the Score Model module, and as you notice, I'll place it here. In order to score the model, as we mentioned, we need to connect the data split and the trained model to it. Lastly, we need a model evaluation, so we have to evaluate this model: we connect it like this to Evaluate Model, and yes, it's accepted. All right, so this is how it looks. Pretty simple, right?

Now notice on the Train Model module it's giving me a tiny note here that says "value required." If I click on it: why is a value required? Because I need to tell the system what variable I am training on. In this case, what do I want to predict from this data set? We're predicting the price, right? So I go here, and yes, this is the column selector, and I'm going to select the price; that's what I'm training on. Once done, you see the notification here is gone, because I have selected the value. So I'm all set now; I go ahead and hit Run. Now the model is getting trained, and you will see those green ticks here; that means my model has been trained. That was pretty fast. So the model is trained; I can now go ahead and right-click on Score Model and visualize the scoring. Let's have a look here and see what happens. So those are our variables; remember, the real estate prices. Here I have the price, and here I have the scored label, or the predicted price. So let's take a look at a few.
The price here says the real price of the apartment is $126,000, and the system, or the algorithm, predicted that this apartment is worth $215,000, which is a shame, because that's a really bad prediction. The second one was pretty close; not bad, right? 212 versus 219. But the third: 189 versus 212, quite far from the real one. And look at this one. Whoa: this apartment is $55,000, that's the real price, and the predicted price is $175,000. So clearly the model is not performing well. I can close this, go to the model evaluation again, visualize it, and I can see that the coefficient of determination is 0.48, or around 48%, which is terrible; that's not a good accuracy.

Now, I want to remind you: if you remember when we did the scatter plot, we noticed something, right? We noticed that many apartments had zero area yet still had a price, and at the same time we had one apartment that was a big outlier, or probably a blank, because it had zero price even though it was close to 6,000 square feet in size. Those outliers are affecting the accuracy of our model. That's why the data needs to be cleaned first, before performing this kind of analysis or before training on the data and using it for prediction.

So this is how Microsoft Azure machine learning is used: as simple as dragging and dropping. Then you have a model, and you can later deploy it, publish it to the gallery, or set up a web service and deploy the model to use it. I hope this was nice and helpful; please go ahead and use it. What you can also do, for fun, is change the algorithm: you can go ahead and remove this one and look for another algorithm. For example, maybe you can use logistic regression, a two-class logistic regression; you connect this one, hit Run again, and see what kind of accuracy you get.
It's probably going to be a bad accuracy again, but this is the way to explore. So thank you very much again for taking this course. I hope you enjoyed it and liked it, and I hope that you will keep on learning about this topic and advance in your career. See you in the next courses.
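For readers who want to reproduce the experiment in code, the drag-and-drop pipeline above (split, train, score, evaluate) maps roughly onto scikit-learn like this. The data is synthetic, since the course's CSV isn't bundled here, so treat it as a sketch of the workflow rather than the actual lesson:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic size/price data with a noisy linear relationship.
rng = np.random.default_rng(0)
sqft = rng.uniform(500, 3000, size=200)
price = 150 * sqft + rng.normal(0, 20000, size=200)

X = sqft.reshape(-1, 1)
X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.3, random_state=0)      # the "Split Data" step

model = LinearRegression().fit(X_train, y_train)  # "Train Model"
predictions = model.predict(X_test)               # "Score Model"
r2 = r2_score(y_test, predictions)                # "Evaluate Model"
print(r2)  # coefficient of determination, high on this clean synthetic data
```

On the real data set, this is also where you would drop the zero-size and zero-price rows before fitting, which is exactly the cleaning issue the lesson points out.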