The Data Science MicroDegree: Introduction To Python, Data Analysis & Visualization | Abhishek Pughazh | Skillshare


The Data Science MicroDegree: Introduction To Python, Data Analysis & Visualization

Abhishek Pughazh, I build cool stuff with code.

Lessons in This Class

66 Lessons (4h 24m)
    • 1. Introduction (1:05)
    • 2. Setting Up Your PC (3:06)
    • 3. Anaconda Installation (3:02)
    • 4. Launching Jupyter Notebook (3:37)
    • 5. Navigating Jupyter NoteBook (5:52)
    • 6. Markdown cells (3:38)
    • 7. Python - Data Types & Arithmetic Operations (5:51)
    • 8. Python - Variables (4:13)
    • 9. Python - Strings & Print Function (3:41)
    • 10. Python - String Splicing (2:55)
    • 11. Python - Lists (6:15)
    • 12. Python - Dictionaries (4:54)
    • 13. Python - Tuples & Sets (3:12)
    • 14. Python - Relational & Logical Operators (5:03)
    • 15. Python - If Else (4:23)
    • 16. Python - For Loops (2:11)
    • 17. Python - While Loops (2:21)
    • 18. Python - In Built Functions (3:26)
    • 19. Python - Creating A Function (5:33)
    • 20. Numpy - Introduction to NumPy (2:20)
    • 21. NumPy - Arrays (2:15)
    • 22. Numpy - Generating NumPy Arrays (2:49)
    • 23. NumPy - Linspace (1:47)
    • 24. Numpy - Identity Matrix (0:39)
    • 25. Numpy - Generating Arrays With Random Values (3:20)
    • 26. Numpy - Reshape, Min and Max (3:38)
    • 27. Numpy - Shape and Dtype (1:20)
    • 28. NumPy - Indexing (3:45)
    • 29. Numpy - Index Broadcasting I (1:11)
    • 30. Numpy - Index Broadcasting II (2:29)
    • 31. Numpy - 2D Indexing (2:48)
    • 32. Numpy - Extracting Submatrices (1:37)
    • 33. Numpy - Conditional Indexing (1:50)
    • 34. NumPy - Operations (2:33)
    • 35. Numpy - Universal Functions (1:36)
    • 36. Pandas - Series I (4:16)
    • 37. Pandas - Series II (4:24)
    • 38. Pandas - Dataframes (5:26)
    • 39. Pandas - Dataframes adding & dropping columns (3:31)
    • 40. Pandas - Loc and iLoc (3:58)
    • 41. Pandas - Conditional Selection (4:50)
    • 42. Pandas - Multiple Conditions (2:34)
    • 43. Pandas - Reset Index & Set Index (4:12)
    • 44. Pandas - dropna & fillna (6:23)
    • 45. Pandas - Group By (4:52)
    • 46. Pandas - Join, Merge & Concatenate (7:50)
    • 47. Pandas - Operations (8:12)
    • 48. Pandas - File Processing (7:23)
    • 49. Matplotlib - Introduction (2:22)
    • 50. Matplotlib - Plotting A Simple Graph (4:57)
    • 51. Matplotlib - Multiple Plots Inside Same Canvas (1:28)
    • 52. Matplotlib - Object Oriented Plots (6:38)
    • 53. Matplotlib - Subplots Using OOP (4:12)
    • 54. Matplotlib - Modifying Figure Size & DPI (3:13)
    • 55. Matplotlib - Saving The Plot (1:26)
    • 56. Matplotlib - Creating A Legend (3:22)
    • 57. Matplotlib - Customization (6:14)
    • 58. Matplotlib - Plot Range (1:57)
    • 59. SeaBorn - Introduction (2:10)
    • 60. SeaBorn - Distribution Plots I (6:51)
    • 61. SeaBorn - Distribution Plots II (5:32)
    • 62. SeaBorn - Categorical Plots I (5:41)
    • 63. SeaBorn - Categorical Plots II (7:00)
    • 64. SeaBorn - Matrix Plots (7:16)
    • 65. SeaBorn - Grids (7:21)
    • 66. SeaBorn - Size & Color (8:28)

14 Students

About This Class

We start from absolute Python scratch and gradually progress into NumPy, Pandas, Matplotlib & Seaborn for data analysis.

///

PLEASE FIND THE RESOURCES YOU'LL NEED TO DOWNLOAD FOR THIS COURSE BELOW

(Make sure you download and open them inside your Jupyter Notebook)

///

What Will You Learn In This Course - (Download Syllabus / Overview)

There are lots of Python courses and lectures out there. However, Python can have a steep learning curve, and students often get overwhelmed. This course is different! This course is truly step-by-step. In every new tutorial, we build on what has already been learned and move one extra step forward. After every video, you learn a valuable new concept that you can apply right away. And the best part is that you learn through live examples.

This comprehensive course will be your guide to learning how to use the power of Python to analyze data and create beautiful visualizations. This course is designed for both beginners with some programming experience and experienced developers looking to make the jump to Data Science!

"Data Scientist" has been ranked the #1 job on Glassdoor, and the average salary of a data scientist is over $120,000 in the United States according to Indeed! Data Science is a rewarding career that allows you to solve some of the world's most interesting problems!

In summary, this course has been designed for all skill levels, and even if you have no programming or statistical background, you can still be successful in this course! I can't wait to see you in class.

In This Course You'll Learn:

  • Programming with Python

  • NumPy with Python

  • Using pandas Data Frames to solve complex tasks

  • Use pandas to handle Excel Files

  • Use matplotlib and seaborn for data visualization

RESOURCES

(Python - Data Types & Arithmetic Operations) - Click Here

(Numpy - Universal Functions) - Click Here

(NumPy - Arrays) - Click Here

(NumPy - Indexing) - Click Here

(Pandas - Series I) - Click Here

(Pandas - Dataframes) - Click Here

(Pandas - Join, Merge & Concatenate) - Click Here

(Pandas - Operations) - Click Here

(Pandas - File Processing) - Click Here

(Matplotlib - Plotting A Simple Graph) - Click Here

(SeaBorn - Distribution Plots I) - Click Here

 

Meet Your Teacher


Abhishek Pughazh

I build cool stuff with code.

Teacher

This is Abhishek, from India. I'm a Python freelancer. I build cool stuff.


Transcripts

1. Introduction: Before we even begin, I'd like to tell you that data science, artificial intelligence, and machine learning are going to be the only technological fields that have stable job vacancies for the next 10 or 20 years. And if you're here, you and I both know that data science is the new buzzword. But data science just so happens to be one of the broadest subjects, which is actually easy to learn, but has a lot of syllabus to cover. And by a lot, I mean a lot. I know you're probably looking for an easy way to study data science, and I completely understand. In this Data Science MicroDegree, I'm going to introduce you to four major data analysis libraries: NumPy, Pandas, Matplotlib, and Seaborn. Apart from teaching you to extract and analyze data frames, I'm also going to show you how to visualize them in an appealing and pleasing way. Hey guys, this is Abhishek, and I'm super [inaudible]. So I'm pretty sure that you're not just going to enjoy it, but you're also going to learn a lot in this Data Science MicroDegree.

2. Setting Up Your PC: Hey guys. So whenever you're trying to learn something new in programming, or a new programming language, the very first step is downloading software from the Internet and installing it on your PC. This software is going to help you convert your programs and code into a language that a computer can understand and process, okay? In short, your computer is stupid. Unless the instructions are extremely precise and clear, your computer will never understand what you're trying to say or what you're trying to do. Okay? You and I can understand English. You and I can communicate in English, but you cannot communicate with your PC using English, which is exactly why we need this third-party software, okay? Most programming beginners have a general misconception. They think that your PC understands Java, Python, C, and C++. These are all just programming languages, and in reality, your PC understands none of these, okay?
Your PC can only understand binary. You understand Python or Java, but the software you install, the IDE as we call it in programming terms together with the interpreter that comes with it, is what is going to convert these languages into binary for a computer to understand and process. In this course, I'm going to be using two things: Anaconda and Jupyter Notebook. These two pieces of software are industry relevant. So what do we mean by the term industry relevant? Most people from the industry, that is, most people from the data science and machine learning community, prefer using Anaconda and Jupyter Notebook in their day-to-day activities. Okay. So what does this mean? It means that you're going to stay fresh in the job market. Your skills are going to be relevant for a really long time. And also, whenever you encounter certain problems, there are going to be a lot of people to help you out. Okay? This is the advantage of using software that is industry relevant. In addition, Anaconda has a lot of tools and libraries that come bundled with the installation. You don't have to browse the Internet and install all of those. Whenever you install Anaconda, there are a lot of tools and libraries that you get for free. It also has its own virtual environment system. We're going to be talking a lot about this later in the course. But right now, I just want you to know that Anaconda has its own virtual environment system. Jupyter, similarly, is an interactive IDE. Now, if you've already used IDEs like VS Code and Atom before, you would know that you can only type in code and you can only see the output. But in Jupyter, there are a lot of other uses. You can add images, you can add text, you can do a lot of other things inside a single document. Jupyter Notebook is also one of the most popular IDEs among data scientists and machine learning experts.
You will actually start realizing what I'm talking about when we get into the course, okay? Jupyter has a lot of interesting features that other IDEs do not. Now, this doesn't mean that you shouldn't use other IDEs. As a matter of fact, I personally prefer using VS Code for web development and things like this, because VS Code and Atom look nice to the eyes. But this is not going to be our purpose for now, because we don't need our IDEs to look visually pleasing. We want them to work the way we want them to, correct? So I highly recommend that you stick with Anaconda and Jupyter in order to follow the course properly. Okay, so in order to install Jupyter and Anaconda, we start with Anaconda first. So go to the link that you currently see on the screen, and we'll meet in the next lesson.

3. Anaconda Installation: Hey guys. So in our previous session, I told you the reasons why you need Anaconda for this course. In this session, let's actually go ahead and install Anaconda. Okay? So the first step is you have to come to this homepage, www.anaconda.com. Or you can go ahead and Google the term Anaconda Python. Click on the first result that you see, and it should bring you to this homepage. If you're learning this course at some point in the future, this homepage might not look exactly the same, because these guys constantly update the website. Okay, so do not worry if the homepage has a different design when you're learning this course, or if it doesn't look exactly the same as you see now. It doesn't matter that the page doesn't look exactly the same; the process is going to be the same, I promise you. First go to Products. And inside Products, click on Individual Edition, okay? You'll be directed to this page.
If you scroll down, you can actually see the download links for the installers. All right, so there are separate download links for Windows, macOS, and Linux. Okay, so it doesn't matter what OS you're using. I'm using a Windows system, so I'm going to be downloading the Windows installer. If you're learning this course on a Mac or on Linux, please download the respective installer. And if you see, in the Windows column both the installers have the name Graphical Installer, but on macOS and Linux you'll have the option to choose between installers. Kindly choose the graphical installer, because this is going to make the installation process a little easier. Okay? So, yeah, I'm going to choose the graphical installer for Windows. Just by clicking on it, you can see that it has started downloading. This is a pretty large file, so it'll take a moment to download. Once the installer has downloaded, just click on it. You'll actually see a window like this. Okay. Click on Next. Click I Agree, because no one really reads software agreements anyway. And then select Just Me. Select the location where you want Anaconda to be installed. And okay, so this is a really important step. By default, this checkbox is not checked. Okay, if you can zoom in and read, Anaconda is actually telling us that this is not recommended. But I strongly suggest that you check this tick box, okay? Because this is going to allow us to execute our code straight from Anaconda itself. We don't have to open a separate terminal in order to run Python. This is the reason why I'm ticking this checkbox, and I highly suggest that you do so too. Okay, so kindly tick this checkbox and click on Install. Now, depending on the speed of your PC, this might take a while, so please hold on. Once it is installed, click on Next.
Now if you see something like this, it is always good to install PyCharm along with Anaconda. I'm clicking on Next. So I am clicking Finish. If you're here, you've successfully completed the installation process. So we'll see how to use Anaconda in the next lesson.

4. Launching Jupyter Notebook: In order to launch Anaconda, you first have to go to your Windows launcher and type Anaconda here, okay? You can see something named Anaconda Navigator here. If you just click on this, it should take a moment to load. And again, this depends on the processing speed of your computer, so do not worry if this takes a while. All right, so if you just give it a few seconds, it'll open, okay? And we're going to be using something named Jupyter Notebook. We've already discussed this before. There's also something called JupyterLab, but we're not going to be using it for now. We're going to be using Jupyter Notebook. But as you can see, this is not the only IDE, or integrated development environment, that's included inside Anaconda Navigator, okay? There's something called Spyder, which is an IDE as well, there's Qt Console, and there are a lot of other IDEs. There's also VS Code here, but we're going to be using Jupyter Notebook for now. If you just click on Launch, it'll open Jupyter Notebook inside Google Chrome or whatever default browser you're using. I highly suggest that you stick with Google Chrome or Mozilla Firefox. It's even okay if you use Microsoft Edge, but do not use Internet Explorer, okay? I highly recommend that you download Google Chrome and make it your default browser. Now, if you pay attention, Jupyter doesn't have its own graphical interface. Instead, it's using Google Chrome as its graphical user interface. Okay? This does not need the Internet to work. It's just borrowing the graphical user interface of Google Chrome, okay? This is how you open Jupyter Notebook, okay?
If you don't want to open it like this, I'll show you another method, which most professionals use. I'm just going to close this one out. I'm even going to close Anaconda Navigator. Okay? I want you to open your command prompt, okay? Mac users ought to open their command line. And inside the command line, you will first have to navigate to the folder where you want your Jupyter Notebook to open, okay? Now, if you can see, there is a default path here. If you want your Jupyter Notebook to open in another path, you can manually go to Downloads or wherever you want. Now, this is called changing directories, okay? In short, when you are changing directories, you should use cd: type cd, followed by the name of the folder. I'm just going to type Downloads. If Downloads is already present inside this path, then just by pressing Tab, your command line will automatically complete typing Downloads. This means that this folder is already inside the path. Now, if you wanted to go backwards, which means I don't want this path, I want to go one step behind, okay, I want to go to Users, to do this you first have to type cd, followed by a double dot. If you press Enter, it will go one step backwards, okay? If you again want to go one step back, you can type cd and a double dot again. Okay? Now, I have already created a folder named Jupyter Notebook inside my C drive. I highly suggest that you have a separate folder for all our lessons. And I'm just going to change my directory to Jupyter. And I just told you that if this folder is already there inside my C drive, you don't even have to fully type it. You just have to give it a hint, and if you press Tab, your command line automatically recognizes the folder and completes your typing. And I'm going to press Enter. Now my path is currently set to the Jupyter Notebook folder on my C drive.
So inside this folder, I'm going to open my Jupyter Notebook. All you have to do is type jupyter notebook and hit Enter. And this will open Jupyter Notebook inside the path that you've set. Okay, so I want you to come here and click New, and you can see something called Notebook. Okay, you can see Python 3 here. If you just click on this, it'll open a Jupyter Notebook where you can type your Python code. Okay, so let's actually start typing Python code in the next session.

5. Navigating Jupyter NoteBook: We have a brand new Jupyter Notebook here, and by default, your Jupyter Notebook is going to be named Untitled. Okay. But it doesn't mean it has to remain the same. By just clicking on it, you can rename it to whatever you want. I'm just going to give it a new name, and I'm going to hit Rename. And this is where you actually type your Python code. These are called code cells, and we'll be typing our Python code inside of code cells like this. I'm just going to start with the good old print Hello World. Printing Hello World. Right? And in order to execute this, you can press Run over here, okay? By just clicking on Run, you can actually see the output of print Hello World being printed. And let's say I want to print 100 plus 100. I want to see the output of 100 plus 100. And by this point you probably know that Python doesn't need you to declare variables for this. If you're coming from another language like C or C++, you would probably assign a equal to 100, and then b equal to 100, and then add them, correct? But Python doesn't need all that. You can just go ahead and execute this right away. I'm just going to press Run. Okay. And you can see the output of 100 plus 100 being displayed. If you pay close attention, when I print Hello World, this doesn't show an Out label, but when I run 100 plus 100, it is showing me an Out label. Okay?
Now, this is a quick little trick to know. When you're using a function like print, you're printing the output yourself. But when you're typing a numerical addition, which doesn't use any function, Jupyter Notebook will show you a label called Out, okay, to specify that this is the output of this expression. Alright? So this is a quick little trick to know. Now, I'm again going to run 1 plus 1. It is painstaking to go ahead and press Run every single time, correct? You don't have to go ahead and press Run every time. All you have to do is press Shift Enter. Shift plus Enter is the same as pressing Run here. Let me again go and print my name. Okay. I don't have to go and press Run here. All I have to do is just press Shift Enter. Okay. I'll get my output right away. And if you pay close attention, every time I press Shift Enter, this number will increase. If you see here, this number is getting increased, correct? So every time I execute, the number of executions will increase. Now, let's say you wanted to insert a code cell somewhere in between here. Until this point I've been pressing Shift Enter, correct? If I press Alt Enter, it will add a code cell immediately after the code cell where I pressed Alt Enter. Let's say I want to insert a cell after this print Hello World, okay? Just by pressing Alt Enter, I can insert a new code cell immediately after the previous code cell. And there, let's assume, I'm just going to print a here and press Shift Enter. Alright. Let us assume you've finished typing your code, okay? And you want your code to get saved, correct? To save your code, you can go to File and press Save. Or you can hit this little floppy disk icon right here. This will save your code.
And your Jupyter Notebook will automatically save your code, I guess, every two minutes or so. But if you want to be sure, press this floppy disk icon, and it'll save your code and create a new checkpoint. And now you've saved your code and you want to download it. Jupyter Notebook will by default save your Python code in the .ipynb format, okay? But if you want your Jupyter Notebook to save your file as a Python file, you'll have to go to File, Download As. And here you can see a bunch of different formats in which you can download your Jupyter Notebook files. By default, it'll save your files in the format I just told you about, correct? The default format of Jupyter Notebook is .ipynb. And this is how your Jupyter Notebook will save your file. But let's say you want your Jupyter Notebook to be saved as a Python file. By just clicking on Python (.py), your Jupyter Notebook will convert this .ipynb file into a .py file, which you can open in another IDE and execute there. Okay? And, okay, let us assume you created an infinite loop by mistake. If you're new to programming and you don't know what an infinite loop is, in short, an infinite loop is when you've created a loop but forgot to type a condition which will stop the execution of the loop. Okay? And now your loop is running an infinite number of times, okay? If this happens and you don't know a way to stop it, you can just go to Kernel and press Restart. Okay? This will restart your entire file over again. So if you've accidentally created an infinite loop and it's been running for God knows how long, just by clicking on Kernel and hitting Restart, it'll stop any infinite loop that is currently running inside your code. Okay? And it's not just infinite loops; if you run into any code which doesn't seem to stop, you can use the same method, okay? You can also use Interrupt.
But I prefer using Restart because it keeps things really clean. If you'd like to know more about Jupyter Notebook than what I just said, you can go to Help and click on Notebook Help. Okay. This will redirect you to a new page, where you can see and learn a lot about Jupyter Notebooks. Okay, and there are a lot more shortcuts that you can use. I just taught you about Shift Enter and Alt Enter, correct? But apart from these, there are a ton of other shortcuts that you can use. If you want to know about them, you can go to Help and open Keyboard Shortcuts, okay? Now here you can find a bunch of different shortcuts that can make things easy for you. All right, inside Help, you can also find a lot of the different Python libraries we will be learning later in this course. If you'd like to know about them in advance, you can simply click on them to learn more. Okay. This would be all for this lesson. We've seen what Jupyter Notebooks are and how to write code inside of them. See you in the next lesson.

6. Markdown cells: Until this point, we've only been typing code inside code cells. I already told you these are called code cells. But I want you to know that Jupyter can do more than just run code. Okay, you can actually type text inside Jupyter. But in order to do this, you'll first have to convert these code cells into what are called Markdown cells. Okay, you can see Markdown here. By just clicking this, this code cell gets converted into a Markdown cell, where you can actually type text. Okay. I'm just going to type Python here, okay? There's also a way you can format this text, okay? You can format a word like Python and make it bold, make it a heading, or make it italic. Okay, this process is called Markdown formatting. If we begin expanding on Markdown formatting, I think it can even be a separate course on its own.
So for now, I'm just going to tell you the basics of it. If you just type a hash and leave a space in between, this will get converted to a heading, okay? This is called H1. And let's say you're going to type two hashes, followed by the same word. I'm sorry, I haven't converted this to a Markdown cell yet. If you do this and hit Shift and Enter, this is a comparatively smaller heading, okay? This is called H2. This is H1 and this is H2. Similarly, if you type three hashes, followed by the word, and hit Shift and Enter, this is called H3, okay? This is the order of headings: H1, H2, and H3. If you already know HTML, you'd probably know what I'm talking about. Okay. And let us type a sentence: this code should give the output one. Okay, let's say this is your sentence and you want to make this particular part of the sentence bold or italic, okay? First, for italics, you have to enclose that part of the sentence with an asterisk on each side. Okay, once you do this and hit Shift and Enter, just this part of the sentence is turned into italics. I'm just going to copy and paste the same sentence, and I'm also going to convert this to a Markdown cell. Let's say you wanted this part to be converted to bold formatting. All you have to do is enclose that part of the sentence with two asterisks on each side. And once you do this and hit Shift plus Enter, you can turn this into bold. Okay, now, these are the basics of Markdown formatting, okay? Obviously there's more to this, but this should be enough to get you started. Okay? So I'm also going to show you a quick code example now. This code should display the current time, right? And I'm going to go ahead and type a short piece of Python code here. I'm just going to import datetime here. And I'm going to print datetime dot datetime dot now. And here I'm just going to press Shift and Enter.
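The cell being typed at this point might look like the following minimal sketch. It assumes the standard-library `datetime` module is what was imported on screen (the transcript does not preserve the exact code), and the variable name `current_time` is made up for illustration.

```python
# Sketch of the "display the current time" cell; the exact on-screen code is an assumption.
import datetime

# datetime.datetime.now() returns the local date and time as a datetime object.
current_time = datetime.datetime.now()
print(current_time)
```

In a notebook, a Markdown cell with a one-line instruction such as "This code should display the current time" would sit directly above this code cell.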
This will give us the current time. Right? Now, starting from here to here is how you should format your code inside Jupyter Notebooks. Okay, you first have to type an instruction for the code. Now you can ask me, why should I even type this? Okay, let us assume you're typing code that has 1,000 lines, okay? Which you will obviously do when you're working in a company. When you do this and come back to a particular function, you might not know what the function does, okay? You'd probably forget. When you start to code, it is so easy to forget or lose track of what you're actually doing. So typing instructions like this in between your code is always a good practice, because when you pass on your code to another employee or a friend of yours to examine, he or she can also understand what your code is all about, which is exactly why it's best to use a combination of code cells and Markdown cells when you're writing code in a Jupyter Notebook.

7. Python - Data Types & Arithmetic Operations: Hey guys, for the next few lectures, I'm going to try and give you a little bit of an introduction to the things that you can do with Python. And to remind you, this is just a refresher course. If you'd like to know more about Python and learn Python in depth, and if you want to build interesting software and applications using Python, try my other course, the Python MicroDegree. Now, that course is going to be descriptive. But right now, within an hour, I'm going to try and cover all the basics in Python. I'm going to try and keep this as concise and clear as possible. So sit tight. So firstly, let's talk about data types. The data type of a value is an attribute that tells what kind of data that value can have. So basically, when you define a data type, you're telling the computer what type of data you're trying to handle. Python can handle five different data types.
They are the numeric, dictionary, Boolean, set, and sequence types. Now the numeric data type has three different subcategories. The first one is the integer, the second one is the complex number, and the third one is the float. And as you might have already guessed, these data types are among the most used, not just in Python, but in other languages as well. Now the sequence data type is again divided into three separate subcategories. The first one is the string, the second one is the list, and the third one is the tuple. Every single data type category that you see on the screen right now has its very own purpose. So this is everything you need to know about data types at a high level. Now let me proceed with the arithmetic operations that you can do with Python. Python can handle about seven simple arithmetic operations: addition, subtraction, multiplication, division, modulus, quotient, and exponent. Now I'm going to display a picture on top of my screen. So if you want, you can pause the screen and have a look at it. Now let me write a simple arithmetic operation. I'm just going to write one plus one. I'm going to hit Shift plus Enter here. Now if you have no idea about what I just did, I just typed a piece of code inside a code cell. This area is called a code cell. And if you want to execute that particular code cell, all you have to do is hit Shift plus Enter. Just by hitting Shift plus Enter at the same time, you can execute the code that you've typed inside a single code cell. Writing a program for addition is as simple as this in Python. You don't even have to allocate variables while performing a simple arithmetic operation. Now let me perform a simple subtraction operation. I'm going to type two minus one, and I'm also going to hit Shift plus Enter. This executes my code cell, as I just told you.
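As a rough illustration of the data type categories listed in this lesson, the built-in `type()` function reports the type Python assigns to a value. The sample values and the `samples` dictionary below are invented for the example and are not taken from the lesson.

```python
# Illustrative values only; none of these come from the lesson itself.
samples = {
    "integer": 7,
    "float": 3.14,
    "complex": 2 + 3j,
    "string": "hello",
    "list": [1, 2, 3],
    "tuple": (1, 2, 3),
    "dictionary": {"a": 1},
    "set": {1, 2, 3},
    "boolean": True,
}

# Print each category label next to the type name Python reports for it.
for label, value in samples.items():
    print(label, "->", type(value).__name__)
```

Each line of output pairs the category named in the lesson with Python's own name for it, for example `int` for integers and `str` for strings.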
And let me do a simple multiplication operation: 2 asterisk 2. Let me hit Shift plus Enter. And let me do one for division as well: 2 forward slash 2. Let me hit Shift plus Enter. Great. So these are the four basic arithmetic operations that you have been learning since school. Now let me show you the modulus operator. Division and modulus are more or less similar, but division is going to display the quotient of the division, whereas the modulus operator is going to display the remainder of your division. Let me show you an example. The modulus operator is nothing but the percent sign. Let us assume I want to find the modulus of 10 divided by 6. In this division, 1 is the quotient and 4 is the remainder, correct? As I just told you, the modulus operation will give you the remainder, which means your output should be 4. Now let me execute this code cell. And as I told you, the modulus operator is giving us the remainder of this division. Now again, what if you want to get the quotient alone, but you don't want the quotient to be a decimal number? You can use the quotient operator, which is a double forward slash. Let me do the same division that I did here with the quotient operator, and let me execute this. If you see, a simple division is giving us the quotient as a decimal number. But when you use the quotient operator, you get only the whole-number part of that quotient, and it's not going to be a decimal or a float. And finally, there's also something called the exponent symbol, which is nothing but a double asterisk. If I take 3 to the power of 3, this is nothing but 3 times 3 times 3. I hope everyone in this course knows what an exponent is. Now let me execute this code cell. Now, what if you want to perform multiple arithmetic operations inside the same code cell? First of all, is it possible?
Of course it is possible. In Python you can perform multiple arithmetic operations in the exact same line, but there is something that you must know about. For example, let us take 1 plus 2 asterisk 3 minus 1 asterisk 2. Now, would I get an error message when I execute this code cell? Of course not, because I just told you that you can execute a lot of arithmetic operations inside the same code cell. Let me execute this right away. I'm getting a valid output, as I just told you. But is this the output that you really want? Keeping Python aside, in general mathematics there is something called BODMAS. I'm going to display a picture on top of the screen. Pause the screen if you want to have a better look at it. Any mathematical operation will only be executed in this particular order. As you see on the screen, brackets are the first things that get executed, followed by "of", or exponents, and then division and multiplication. Addition and subtraction come last. This means that if you're trying to execute a code cell that has a lot of arithmetic operations inside of it, addition and subtraction have the least priority. They are at the very bottom of the hierarchy. Brackets are in the highest position of the hierarchy, followed by exponents, then division and multiplication, and addition and subtraction, as I just told you, are at the very bottom of the hierarchy. Now let us have a look at our code cell. Here, the multiplications would get executed first. Only then would the results of those multiplications get added or subtracted. If you want to stay in complete control of what's happening here, you will have to use brackets. So let me delete this line of code. Whenever you are performing multiple arithmetic operations inside the same code cell, always prefer using brackets.
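The seven operators and the order of operations described above can be sketched in a single cell. This is only an illustration: the variable names and the bracketed expression on the last line are assumptions added here, not something typed in the lecture.

```python
# The seven arithmetic operators covered in the lecture
addition = 1 + 1         # 2
subtraction = 2 - 1      # 1
multiplication = 2 * 2   # 4
division = 2 / 2         # 1.0 (a single slash always gives a float)
remainder = 10 % 6       # 4
quotient = 10 // 6       # 1 (double slash drops the decimal part)
power = 3 ** 3           # 27

# Multiplication runs before addition and subtraction: 1 + 6 - 2
without_brackets = 1 + 2 * 3 - 1 * 2

# Hypothetical variant: brackets force the addition and subtraction to run first
with_brackets = (1 + 2) * (3 - 1) * 2

print(without_brackets)  # 5
print(with_brackets)     # 12
```

The two results differ only because of where the brackets sit, which is why spelling out the intended order with brackets is the safer habit.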
Now inside the brackets, I'm first going to type 3 minus 2, and outside the brackets, I'm going to multiply this result with another set of brackets, which again goes 3 minus 2. I'm placing a subtraction operation inside a bracket, which means that even though it is a subtraction, just because I've enclosed it inside brackets, it would get executed first. The results of both these subtractions would then get multiplied. Now let us try and execute this. This is how you can use brackets while performing multiple arithmetic operations to change the hierarchy of the operations. 8. Python - Variables: Until this point, we have been executing arithmetic operations without even storing those values inside a variable. Now, what really is a variable? A Python variable is a reserved memory location to store values. What is a reserved memory location? Imagine a variable as the address where your data is stored. I'm Abhishek and I live in India. Similarly, you can be a John Doe who's living in the US or the UK. A variable in general is nothing but an address where a value is getting stored. Now, is assigning variables easy? Of course it is. All you have to do is type a word or a letter. I'm just going to type the letter a and then type equals. After this, you can type anything you want. I'm just going to store the value 1 inside of it. All I have to do is hit Shift plus Enter. And right now, every time I call the a variable and execute, this variable will call back the data that I've stored inside of it. This doesn't necessarily have to be an integer. It can be a float value. Let me execute a again. It can also be a string if you want. If you want to provide a string as an input value, you'd have to enclose the string inside quotations. Let me provide my name. Let me execute this, and let me execute a again.
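A minimal sketch of the assignments just described, assuming the float value 1.5 (the lecture does not say which float was typed):

```python
a = 1             # store an integer inside the variable a
print(a)

a = 1.5           # the same name can hold a float instead (value assumed here)
print(a)

a = "Abhishek"    # a string must be enclosed inside quotations
print(a)
```

Each assignment replaces whatever a held before, so the last print shows only the string.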
So in short, assigning a variable is nothing but storing data inside a letter or a word. If you're planning on using this data in multiple locations inside your code, all you have to do is use the variable. You don't have to manually type the data that's stored inside your variable again. Let me modify this to 1 again. In the same code cell, I'm going to assign a new variable called b, and inside the b variable I'm going to store the integer 2. Let me execute this. In the next code cell, I'm going to introduce a new variable called c. Inside the c variable, I'm going to perform a numerical operation: I'm just going to add the data that's stored inside a with the data that's stored inside b. Let me execute this. And finally, I'm going to print everything that's stored inside the c variable. I have used the print statement here just to make it look like I'm programming. But in Jupyter Notebook, you don't actually have to use the print function if you just want to display an output. All you have to do is type the variable's name and execute the cell. This is how easy it is to store a value inside a variable. However, there are certain rules that you must follow while naming a variable. Let us assume the name of my variable is apple. Let me store an integer inside of it and hit execute. This works just fine, because the name of a variable should always start with a letter or an underscore character. It can be apple, or it can be underscore apple. It doesn't make a difference. But your variable can never start with a number. For example, it can never be 1apple. Let me execute this. If I do something like this, I obviously get an error message, because as I just told you, a variable's name cannot start with a number. Now what if you want your variable to have more than one word?
Let us assume you're trying to get the user's name and you're using a variable like "type your name", with spaces. Can you use such a variable? Of course not. If I execute this, I get an error message again, because the name of your variable can never have a space in between. I'm not saying more than one word is impossible. All you have to do is separate your words with an underscore: type underscore your underscore name. If I do this and hit execute, no problem, your code works just fine. And again, your variable can only be made up of three kinds of characters. It can be a letter, it can be a number, or it can be an underscore. You cannot type any other symbol. So what do I mean by this? You cannot separate the words with something like a backslash. If I do this and hit execute, I get an error message again. Remember, your variable can only have letters, numbers, and underscores. Finally, let me demonstrate something to you. Let me erase this first of all, and I'm just going to name my variable abc. Any variable that you declare is case sensitive. So what do I mean by this? abc in small letters is a variable. Similarly, ABC in caps is a completely new variable. Let me assign the same 1 here. Again, ab in small letters with a capital C is a completely new variable. You get my point, right? Any variable you declare is case sensitive. So if you modify the case of even one single letter in your variable's name, you are going to get an error message, because Jupyter Notebook will not be able to locate the data that you're trying to extract. 9. Python - Strings & Print Function: Now let us talk about strings and print statements a little more. I have already told you that whenever you're typing a string, it has to be enclosed inside quotations, correct? Let us assume ABC is my string. In order for Python to recognize ABC as a string, it has to be enclosed inside quotations.
These quotations can be single quotations, which would work just fine, or you can use the industry's standard practice, which is double quotations. Both of these quotations yield the exact same result. But the bottom line is, in order for Python to recognize a particular word or letter as a string, it has to be enclosed inside some form of quotation. Let us say you're typing something like I'm okay. This sentence already has a quotation mark inside of it. Should you do something different? Of course not. All you have to do is enclose this entire string inside double quotations. The moment I execute this, the entire sentence gets recognized as a string, regardless of how many single quotation marks there are inside the sentence. This is pretty much everything you need to know about strings. Strings are a really easy concept. There's nothing more complicated about them. All right, let's move ahead to printing an output. Let me type "Printing" here with a double hash in front of it and convert this into a markdown cell. Now let me hit execute. Before demonstrating this to you, let me store a string inside a variable. My string is going to be, let's say, hello. I want you to clearly understand the difference between displaying an output and printing an output. In Jupyter Notebook, if you want to display an output, all you have to do is type the name of the variable and execute the cell. This will open an output cell, as you can see on your screen right now. Right now you're just displaying the output. But if you want to print your output, you'd have to use the print function, and inside the parentheses, you'll have to provide the name of the variable whose data you want to print. Let me hit execute. As you can see on the screen, whenever you use the print function, you won't explicitly open an output cell, because you're printing the output and not just displaying it.
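A small sketch of the quoting rules and of the display-versus-print difference described above. The variable names are assumptions chosen for illustration.

```python
single = 'ABC'   # single quotations work
double = "ABC"   # double quotations give the exact same string
print(single == double)

# The apostrophe sits safely inside double quotations
sentence = "I'm okay"
print(sentence)

greeting = "hello"
greeting          # as the last line of a Jupyter cell this would be displayed; in a plain script it shows nothing
print(greeting)   # print works the same way in a notebook or a script
```

Writing the same sentence inside single quotations would end the string at the apostrophe and raise a syntax error, which is why double quotations are used for it.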
Jupyter Notebook fortunately allows you to display an output instead of having to print it. But if you're following this course in another editor like Visual Studio Code, you'll have to use the print function whenever you want to display the output. Let me also show you what a dynamic print function is. A dynamic print function is something that can alter its output based on the inputs that are provided. Let me declare two variables here. My first variable is going to be name. As its data, I'm just going to provide John Doe here. Following this, I'm going to type age. Let us assume our John Doe is 40 years old. Let me execute this code cell. Now instead of typing a default print statement, I'm going to include the format method here. First, let me show you the code, so you have a better understanding of what I'm talking about. All I have to do is type the print function again. Inside the print function, I'm going to start my sentence, then open a set of flower brackets, and inside the flower brackets I'm going to type a, but it can be anything you want. After the flower brackets, I'm going to complete my sentence. Then I'm going to open another set of flower brackets, and inside the second set of flower brackets, I'm going to provide the variable b. Please do not worry. I know that we haven't declared the values for a and b here. We are going to be assigning the values for the a and b variables using the format method. In order to use the format method, you have to place the cursor after the closing quotation and type dot format. Inside its parentheses, you'll have to assign data to the variables you have created. So the data for my a variable is going to be name, comma, and the data for my b variable is going to be age. Now let me execute this code cell.
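The cell described above might look like the following sketch. The exact wording of the sentence inside the quotations is an assumption; the placeholder names a and b and the variables name and age follow the lecture.

```python
name = "John Doe"
age = 40

# {a} and {b} are placeholders; format() fills them from the keyword arguments
message = "The age of {a} is {b}".format(a=name, b=age)
print(message)  # The age of John Doe is 40
```

Changing name or age and rerunning the cell changes the printed sentence without touching the print statement itself.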
If you see here, even though I haven't typed John Doe's name and his age inside my print statement, the format method is automatically acquiring the data from the variables and displaying it here. Similarly, you can use the format method to write dynamic print statements. 11. Python - Lists: Allow me to introduce you to one of the most interesting concepts of Python, called lists. A list is a data structure in Python that is a mutable, or changeable, ordered sequence of elements. A list is similar to a string. In a string, you enclose all the elements within quotations, correct? Whereas in a list, you're going to enclose all of those elements inside square brackets, and all of the elements in the square brackets have to be separated with commas. A list is nothing but a container. I'm not making this up. A list is actually called a container, and a list can store almost all the data types. For example, let me create a list here with numeric values. All I have to do is open square brackets, and inside the square brackets, I can type any number of elements I want. All the elements have to be separated with a comma. Creating a list is this simple, and if I want, I can store all these elements inside a variable. Let me store this list inside the variable a, and let me execute this again. Now, every time I display the output of a, I'll be able to print out my list here. As I just told you, a list is a container that can store all data types. So for example, let me create a new list, but this time the elements inside my list are going to be strings. My first element is going to be a, my second element is going to be b, and my third element is going to be c, for example. And let me add three more elements here: d, e, and f. Now let me execute this again and let me print b. Great. Also, a single list can contain more than one data type. Let me create a new variable c.
Inside this list, I'm going to store strings as well as numbers. This is completely possible as long as you separate the elements with a comma. Great, let me execute this and let me print c again. Creating a list is this easy, but what can you actually do with a list? We already have the list a here. What if you wanted to add an extra element to this list? Let us say I want to add the number 6 to this list. All I have to do is use the method called append. Append is a method that's going to help you add an extra element to the end of your list. As an example, let me add the element 6 inside the method, and let me execute this again. If I print the list a, I have added the element 6 at the end of my list. And make a note of this: the append method can only add an element to the end of your list, all right? Now let me give you a really short exercise. Try and add the letter g to the end of the b list. Pause the screen and try to add the letter g to the end of the list. Great. I hope you were successful. All you had to do is type b dot append. But since we are adding a string here, you have to use quotations around the letter. Now if I type g, hit execute, and then display my b list, I have added the element g at the end of my list. This is just one of the properties of lists. A lot of them are similar to strings, so most of the properties that apply to strings apply to lists as well. Let us say I want to display the very first element in the list. All I have to do is type the name of the variable, and inside square brackets, provide the index of the element I want to extract. As I just told you, I want to extract the first element from the list, so I have 0 here. And if I execute, I'll be able to extract the very first element in the list.
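The list operations covered so far can be sketched as follows. The contents of a and c are assumptions for illustration (the lecture only says that a holds numbers and that c mixes strings and numbers); b follows the letters named in the lecture.

```python
a = [1, 2, 3, 4, 5]                   # numeric values (assumed contents)
b = ["a", "b", "c", "d", "e", "f"]    # strings
c = ["a", 1, "b", 2]                  # mixed data types (assumed contents)

a.append(6)      # append adds one element to the end of the list
b.append("g")    # a string element needs quotations

first = a[0]     # indexing starts at 0, so this is the very first element

print(a)      # [1, 2, 3, 4, 5, 6]
print(b)      # ['a', 'b', 'c', 'd', 'e', 'f', 'g']
print(first)  # 1
```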
If you want to extract a group of elements instead of a single element, all you have to do is declare the lower limit and the upper limit using a colon. Let us say I want to extract the elements between the indexes 1 and 4. All I have to do is give 1 as the lower index, and after separating them with a colon, provide 4 as the upper index. If I execute, I'll be able to extract a group of elements from this list. Note that the element at the upper index itself is not included. You also don't necessarily have to provide both an upper limit and a lower limit. Let me demonstrate this to you one more time. Without declaring a lower limit, I'm going to write colon 4 here. If I do this and hit execute, I'll be able to extract all the elements from the start up to the index 4. Now let me execute this. Similarly, I can also do this without giving the upper limit. Right now I'm going to extract all elements from the index 2 until the end. Let me execute this. Great. This is not just an interesting property, but also a really important property, because you are going to be using list slicing in a lot of different places. Now let us say, instead of adding or appending an element to the end of the list, you want to replace an element completely. What can you do? For this example, I'm going to take the b list, and let us say I want to replace the very first element, a. I don't want the element a anymore. All I have to do is type the name of my list, and inside square brackets, I must first provide the index of the element I want to replace. Right now, I want to replace the element at index 0, so I type 0 and then equals. After the equals sign, you'll have to provide the element with which you want to replace the existing element. Let us say you want to replace a with new.
If I display the elements of the b list again, I have replaced a with the new string. This is how we can replace elements of a list instead of appending them. Let me demonstrate another interesting property called nested lists. You can use nested lists if you want to place a list inside of another list. A nested list is nothing but a list that is placed inside an existing list. It will look something like this. Let me create a new list now. This list is going to have the numeric values 1, 2, 3, 4, and 5. The sixth value is not going to be the number 6. Instead, I'm going to create a new list here, and inside this list I'm going to place the numbers 6, 7, 8, and 9. As I just told you, a nested list is nothing but a list that's placed inside an existing list. Let me execute this again. Now let me give you a short exercise. Let us say you wanted to extract the number 7 from this list. What would you do? I cannot just type d and the index here. If I type the index 6 here and hit execute, I receive an error, because there is no element at that index. Just count with me. 1 is at index 0, 2 is at index 1, 3 is at index 2, 4 is at index 3, and 5 is at index 4. When it comes to index 5, there's not just one single element, but an entire list sitting at index 5. So even if you type index 5 here, you will not extract a single element, but the entire list. Let me execute this quickly. As I just told you, you're not extracting a single element, but the entire inner list. So what if you want to extract the element 7? All you have to do is include a new set of square brackets here and include the index of the number 7 in the inner list. 7 is sitting at index 1 of the second list. And if I execute, I'll be able to extract the element 7. 12.
Python - Dictionaries: In Python, just like a list, a dictionary is also called a container. Now you can ask me, Abhishek, if this is also called a container, then what's the difference between a list and a dictionary? I have already told you that a list is a container that has elements separated with commas, correct? Let me demonstrate the syntax of lists again. For this, I'm creating a variable called a, and I'm going to store a list that has the elements 1, 2, 3, 4, up to 5. All of these elements are separated with commas. Let me create a dictionary now. For this, I'm going to create a variable b. The very first difference is in the syntax. For starters, a dictionary does not use square brackets. Instead, we're going to use flower brackets. So I'm going to open and close the flower brackets. In lists, if you want to isolate or locate an element, you have to use the index, correct? You have to use that element's respective index. Let us say you want to extract the element 2. In order to extract the element 2, you'd have to refer to its index, which is 1. This isn't the case when it comes to dictionaries. In dictionaries, the order or the index doesn't matter. Every element inside of a dictionary is a key-value pair. Now, what is a key-value pair? Let me explain this to you with a simple example. Let us assume my key is going to be student one. The key must be separated from the value by using a colon. And the value of this student one key is going to be, let us say, John Doe. Student one is the key, John Doe is the value, and the entire thing is called a key-value pair. A dictionary, unlike a list, is not just a collection of elements. It is a collection of key-value pairs. And you can call a value by using its key. Just like how you call the elements of a list by using their indexes, you can call the value of a dictionary by using its key.
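A minimal sketch of the contrast between a list and a dictionary described above. The exact spelling of the key ("student one") is an assumption, since the transcript does not show what was typed.

```python
a = [1, 2, 3, 4, 5]              # list: square brackets, elements found by index
b = {"student one": "John Doe"}  # dictionary: flower brackets, key: value pairs

print(a[1])              # 2, the element 2 sits at index 1
print(b["student one"])  # John Doe, looked up by its key
```

Both lookups use square brackets; the difference is that the list takes a position while the dictionary takes a key.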
Alright, so first let me create a full-fledged dictionary. I'm going to add another key-value pair here. I'm going to name this key student two, and I'm going to separate the key from the value by using a colon. The name of my student two is going to be Tom. And I'm going to create a third key-value pair: student three, colon, and the value is going to be, let us assume, Jack. Let me execute this. Just like I told you, if you want to locate or isolate a value, you have to use its key. Let us say you want to extract John Doe from this dictionary. All you have to do is type the name of your dictionary, and then within square brackets, instead of declaring an index, provide the key whose value you want to extract. John Doe has the key student one, correct? If I execute this, I'll be able to extract the value that this key is holding. Similarly, let us assume I want to extract the value Tom. Inside square brackets, I have to declare its respective key, which is student two. If I execute this, I'll be able to extract Tom. Now, a value can be anything you want. Let me copy this again, and instead of b, I'm going to type c here. Let me delete the second and the third key-value pairs. I'm also going to delete John Doe. Instead of using a string as a value, I'm going to declare a full-fledged list here. This list is going to have 1, 2, 3, 4, 5. Let me execute this. I'm just doing this to tell you that the value doesn't necessarily have to be a string. A value can be anything you want. It can be a number, it can be a string, or it can even be another container, like a list or a dictionary. Right now, I've used a list as a value. Let us say you want to isolate the element 3 here. What can you do? First, you have to use the key to extract the value, correct? So I'm first using the student one key. If I do this, I'll be able to extract the entire value.
But it's not just the element. It's the entire list. So first, I have to isolate the entire list. After doing this, if I want to isolate the element 3, I have to use the index of that particular element. The index of 3 is 2. If I execute this, I'll be able to extract the number 3. Just like how we saw nested lists, there are also nested dictionaries. Like I just told you, a value doesn't have to be just a list. It can also be a dictionary if you want it to be. Let me copy and paste this one more time. Right now, instead of a list, I'm going to declare a full-fledged dictionary as the value. As I've told you already, the syntax of a dictionary consists of flower brackets. It must first have a key, and the key is going to be age. The value of my key is going to be, let us assume, 20. Let me execute this. And let me give you a short exercise here. I want you all to extract the element 20 from this huge dictionary. What can you do? Pause the screen and try to figure out the answer. I hope you all were successful. But if you still happen to have doubts, let me clear them up for you. First, you have to isolate this entire inner dictionary, correct? This dictionary is actually a value of a bigger dictionary. So first of all, I'm going to extract the inner dictionary using its key, which is student one. Let me execute this. After doing this, in order to extract the value 20, I have to mention its key again. So I'm again going to open square brackets, and inside the square brackets, I'm going to provide the name of my key, which is age. If I execute this, I'll be able to isolate the element 20. This is everything you need to know about dictionaries for now. 13. Python - Tuples & Sets: In this lecture, I'm going to show you another type of container, which is a tuple. If you look at the very first line that I've typed here, a tuple is used to store multiple items in a single variable.
This is actually similar to a list and a dictionary. Even in a list and a dictionary, you can store multiple elements inside a single variable. Then how is a tuple really different from a list and a dictionary? The major difference between a tuple and a list is that once you create a tuple, you cannot modify the elements that are inside it, which is what I've typed in the third line. Once you create a tuple, the elements inside of it are unchangeable. You cannot change the elements, you cannot add elements, and you cannot remove elements. In simple terms, a tuple is immutable. Let me show you how we can write a tuple. The difference begins with the syntax. We write lists with square brackets, and we write dictionaries with flower brackets. Similarly, you have to write a tuple with curved brackets. Not a square bracket, not a flower bracket, but a curved bracket. Everything else is exactly like a list. All you have to do is type a set of elements inside the tuple, and all of those elements have to be separated with a comma. Now let me execute this. Let me also create a list here called b. Inside the list I'm going to store the same set of values, 1, 2, 3, 4, 5. Great. Let me execute this, and now let me show you the difference. Let me try and add a new element with the append method. I'm going to try and add the element 6 to the end of my list b. If I do this, hit execute, and then display the list, I have appended the element to the end of the b list. Let me try and do the same for a tuple. If I try and append the number 6 to our tuple, I receive an error message. This is because you can never change, add, or remove an element from a tuple. Once you've created a tuple, you cannot make any changes. Now let me also show you what sets are.
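The list-versus-tuple comparison above can be sketched as follows. Catching the error with try and except is an addition here so that the cell runs to the end; in the lecture the error message is simply shown in the notebook.

```python
a = (1, 2, 3, 4, 5)   # tuple: curved brackets
b = [1, 2, 3, 4, 5]   # list: square brackets

b.append(6)           # works, lists are mutable
print(b)              # [1, 2, 3, 4, 5, 6]

tuple_error = None
try:
    a.append(6)       # tuples have no append method at all
except AttributeError as error:
    tuple_error = str(error)
    print(tuple_error)

print(a)              # (1, 2, 3, 4, 5), unchanged
```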
Let me add a cell, convert it to a markdown cell, and execute it. Great. A set is the fourth type of container that we're discussing. The first container was a list. The second container was a dictionary. The third container was a tuple, and a set is the fourth container. Now, what is a set? Let me create one. Creating a set is really simple. You have to create a set with flower brackets. But unlike a dictionary, you don't have to provide keys or values. All you have to do is type in a set of numbers or strings or whatever you want. Let me execute this. If I display the variable a, I can see the elements that I have stored inside my set. Now how does a set actually differ from a list or a tuple? There is one major difference. A tuple or a list supports duplicate values. For example, if I create a list, my list can have 10 elements, and all 10 elements can be the number 1. But let us try to do the same with a set. Let us assume I'm creating a set with 10 elements, and all 10 elements are the number 1. If I do this and then display the variable a, I receive just one element in my output, because a set does not support duplicate values. Let me demonstrate this to you one more time. I'm creating a variable b, and inside this variable I'm going to store a set. This set is going to have nine values. The first three values are going to be 1, the next three are going to be 2, and the last three values are going to be 3. Let me execute this. If I display everything that is stored inside the variable, I receive only three elements. This is because a set removes all the duplicate values that are stored inside it. 14. Python - Relational & Logical Operators: I'm going to show you two types of operators in this lecture. The first set of operators are called relational operators, and the second set are called logical operators. First, let me show you what relational operators are. There are six major relational operators.
The first operator is the less than operator. The second operator is the greater than operator. The third operator is the less than or equal to operator. The fourth operator is the greater than or equal to operator, followed by the equal to operator. And last comes the not equal to operator, which is an exclamation mark followed by an equals symbol. Whenever you use relational operators to compare two values, your output is going to be a Boolean value, which is either True or False. Let me demonstrate this to you. Imagine I'm checking whether 1 is less than 2, which is obviously true, correct? If I execute this, I receive True. And if I check whether 2 is greater than 3, this is false, so I'm going to get False here. This is what I was talking about. The output of a relational comparison is going to be a Boolean value, which is either True or False. Let me also demonstrate the other relational operators. The third operator was the less than or equal to operator. Let me check whether 2 is less than or equal to 3, which is going to be True. And let me check whether 3 is greater than or equal to 2. This is going to be True. If I replace the 2 with a 3 and then execute, my output is again going to be True. But if I replace the first 3 with a 2 and then execute, my output is going to be False. You get my point, correct? The fifth relational operator is the equal to operator. This equal to operator has to be used if you're trying to determine whether the two values under comparison are equal. If I compare 2 equal to 2, this is going to be True because 2 and 2 are equal, correct? And if I compare 2 equal to 3, this is going to be False. You get the point again. Now the mistake that most beginners make is using one equals symbol instead of two.
Now if you type one equal sign instead of two and then hit Enter, you are going to get an error message, because Python is going to think that you're trying to assign a value to a variable. So whenever you're trying to use the equal to operator, make sure you use two equal signs and then hit Enter. The final relational operator is the not equal to operator, which goes like this: you must type an exclamation mark followed by an equal sign. Let me try and compare three not equal to four. This is going to be True, because three and four are two separate values. Now if I replace four with three, I'm going to get False, because three and three are the same value and they're not different. Everything we have seen until this point is about relational operators. Now let me show you what logical operators are. For this, I'm going to add a comment in a code cell. I'm going to type logical operators. Great. Now there are two major logical operators I want you to know. The first logical operator is the and operator, and the second logical operator is the or operator. Let me actually show you how we can use these in real time. So first of all, I'm going to do a simple comparison, one less than two, right? Following this, I'm going to use the logical operator and, and after this I'm going to type two greater than one. Now let me hit Enter here. Now, there are two major logical operators that I want you to know about. The first logical operator is the and operator and the second logical operator is the or operator. Let me actually show you how these operators work in real time. Let me delete this. To demonstrate this to you, I'm first going to do a simple comparison. I'm first going to compare whether one is greater than two, which is going to be False. Then I'm going to use the logical operator and, and following this, I'm going to use two greater than one, which is actually True. So one of our comparisons is False and one of our comparisons is True. And if I execute this, I'm going to get False. 
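The comparisons described so far can be collected into one cell. This is a minimal sketch rather than the exact notebook from the lecture; the variable names are illustrative and were chosen only so each result can be inspected.

```python
# Relational operators compare two values and return a Boolean.
less = 1 < 2             # True
greater = 2 > 3          # False
less_equal = 2 <= 3      # True
greater_equal = 3 >= 4   # False
equal = 2 == 2           # True (two equal signs; a single = is assignment)
not_equal = 3 != 4       # True

# Logical `and` is True only when both comparisons are True.
both = (1 > 2) and (2 > 1)   # False, because the first comparison is False

print(less, greater, less_equal, greater_equal, equal, not_equal, both)
```

The brackets around each comparison are optional here, but they make the line easier to read.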
Let me execute this. And as I told you, I'm getting False here. I'm going to display a picture on top of the screen, and this picture is going to show you all the different outputs that you get when you use logical operators. In order for the output of the and operator to be True, both of the values under comparison have to be True. Even if one of those values is False, you're only going to get False in your output. As you can see in the picture, when you compare True and True, you're going to get True. But if you compare True with a False value, you're always going to get False. And even if you compare two False values, you're again going to get False as output. Another major tip: whenever you're trying to perform more than one operation in a single line, as I've told you before, try using brackets. Even if you don't use brackets, this might not affect your execution, but it still makes your code a little more readable. So even if it doesn't affect your execution, always try to use brackets whenever you can. Let me hit execute again. Great. Now the second logical operator is the or operator, correct? So let me try and do the exact same thing here. Let me copy and paste this. Great. And let me copy and paste the second comparison. Great. Let me execute this. If I execute this, I get True. Let me tell you why the output of the or operator is completely different from the output of the and operator. If you look at the picture on the screen, when you compare two False values with the or operator, you'll get False as your output. But even if one of the values under comparison is True, you'll get True as your output. If you compare two True comparisons, you'll get True as your output. And even if only one of those comparisons is True, you'll again get True as your output. In this course, we are going to be using both relational operators and logical operators in a lot of different places. 15. 
Python - If Else: If statements are easily one of the most used features, not just in Python, but in every other programming language. If statements are also called conditional statements, which means they operate based on the condition that you provide. Without any further ado, let me show you what a simple if statement should look like. You must start writing the if statement with the obvious if keyword. Immediately after this, you must provide the condition with which you want to proceed. I'm just going to type a simple condition here, and it's going to check if two is greater than one. Immediately after the condition, you must type a colon. This is really important. This is standard syntax. If you forget to type the colon after your condition, you are going to get a syntax error when you execute this code cell. After doing this, just hit Enter. To make things clear, I am not executing the code cell; I'm just hitting the Enter button alone. I'm not hitting Shift plus Enter, I'm just hitting the Enter button to bring the cursor below. If you have typed a colon here and then hit Enter, your Jupyter notebook will automatically give you an indentation here. If you don't have an indentation, this is a clear sign that you forgot to type a colon at the end of your condition. After the indentation, you must provide a simple task that you want your if statement to perform. Let's just say I want my if statement to print hello, provided that this condition is true. All right, my if statement will only print this message if this condition is true. Let me modify this. Great. Now, obviously my condition is true because two is greater than one. If I execute, my if statement will perform the task I've allocated. But what if this condition turns out to be false? Right now, my condition effectively looks like this: if True, I'm just printing this message hello, correct? This is going to give me the exact same output. 
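The simple if statement described above can be sketched as follows. This is only an illustration; the variable name `condition` does not appear in the lecture and is introduced here to make the Boolean visible.

```python
condition = 2 > 1      # the comparison evaluates to True

if condition:          # the colon after the condition is required
    print("hello")     # indented task, runs only when the condition is True

if True:               # writing True directly gives the same output
    print("hello")
```

Leaving out the colon raises a SyntaxError when the cell is executed.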
But what if I provide a condition that turns out to be false? For example, let me try and check if one is greater than two. This is going to be false, correct? So what do you think is going to happen? Let me execute this code cell. As a beginner, you might think you're going to get an error message, but in reality you are going to get no output at all. But this isn't an ideal scenario, correct? You would still want to get some form of output, even if your given condition turns out to be false. So what can you do? All you have to do is include an else statement here. Even after the else keyword, you must provide a colon. After doing this, just hit Enter and add another simple task. So I'm just going to provide print, task failed. Great. And I'm going to modify this hello to task successful. Great. Now let me execute this code cell and let us analyze this if statement again. First, my if condition is checking if the given condition is true or false, and in our case the condition is false. Since the given condition is false, our if statement is not executing the very first task. Instead, it will check the task that you have provided inside the else statement and execute that. So this is how an if-else statement works in general. But what if you want to check multiple conditions? Right now, I'm just checking one condition, correct? I'm just checking if one is greater than two, which is just one operation. But what if we want to check multiple conditions together inside a single if-else statement? Can you do that? Of course you can, but instead of using this syntax, you might have to modify it a little more. Now this is the first condition. So let me add another condition immediately after that. I'm just hitting Enter here. After this, I'm going to type the elif keyword, and following the elif keyword, you'll have to provide your second condition. Now, let us say my second condition is going to be three greater than four. 
And as you might have guessed, after typing a condition, you must never forget to type a colon. After the colon, just hit Enter and provide a simple task that you want your if statement to perform. I'm just going to type task two successful. Great. Now let me execute this code cell. I'm sorry, I made a mistake. This should have been three less than four. Now let me execute again. Now our code cell is printing task two successful. So let us analyze this if statement one more time. Obviously our first condition was a failure. One is not greater than two, so our if statement is not performing the first task, right? So the first task was a failure. After this, it goes on and checks if your second condition is true or false, and your second condition happens to be true. So it is performing the second task that you've provided. And if both of these conditions turn out to be false, let me change this less than to greater than and then hit execute, both of these tasks would fail, and if this happens, it will finally perform the task that you have provided inside your else statement. So this is how an if-else statement works in general. 16. Python - For Loops: For loops are another interesting and essential feature in programming. You use for loops when you want to iterate through every single element inside an array that you have created and perform an operation on those elements while they are being iterated. Now, iteration might sound like a complicated word to you, but just imagine iterating as cycling through all the elements inside the array. So first of all, to explain things a little more clearly, I'm first going to create an array myself. My array is going to have five elements, the numbers from one to five: one, two, three, four, five. Great. And as I told you, a for loop is going to help you iterate through all the elements inside this array, correct? The syntax of the for loop is as follows. Just type for followed by a temporary variable. 
And this temporary variable can be anything you want. All right, so for now I'm just going to type temp, but you can type anything you want here. After doing this, just type in and provide the name of the array whose elements you want to iterate through. The name of my array is a. After doing this, just like I told you for the if-else statement, you must never forget to type a colon. After the colon, just hit Enter. This is the most crucial step in a for loop. You'll have to provide the operation that you want to perform while iterating through all these elements. So let's just say I just want to print the word hello while iterating through the elements. All right, so let me execute this. Now let me tell you what this for loop is doing. In the very first cycle, my for loop considers the first element. For the first element, I'm printing hello here. In the next cycle, my for loop is going to consider the second element, and for the second element it's going to print a second hello. The same would repeat for the last three elements as well. So let me change this into a dynamic print statement to help you understand things a little better. Instead of hello, I'm just going to type the element, I'm sorry, the element is. After doing this, I'm just going to type curly brackets here. After the quotations, I'm going to use the format method. Inside the format method, I'll have to provide the value that I want to print inside the curly brackets. All right, so I'm just going to provide temp. Now let me execute this. The output of this print statement should help you understand what a for loop is doing. First of all, we are iterating through all the elements inside this array, and for every single iteration, this operation is being performed. 17. Python - While Loops: The syntax of while loops is slightly different from for loops. There are three essential components of a while loop syntax. 
The first component is the entry condition, the second component is the while condition, and the third component is the exit condition. So first you need an entry condition, and this condition can be anything you want. Let's just say my entry condition is a equals 1. All right, so this is my entry condition, and this is going to get my while loop started. Following this condition, I have to provide the while condition. You must provide this condition to tell your while loop until what point you want it to operate. So let's just say I want my while loop to operate as long as a is less than five. This is going to be my while condition. And as you might have guessed already, the next step is to provide a simple task that you want your while loop to perform. So let's just say I want my while loop to print hello as long as a is less than five. So that is my entry condition. Now, do you think I can execute my code cell? I can execute my code cell. There is nothing wrong in executing this code cell at this point. You're not going to get an error, but you are going to get an infinite output. This means that your while loop is going to print hello until the end of time, because you haven't provided an exit condition here. So what do we mean by this exit condition and why is it necessary? My entry condition is a equals 1, correct? So in our first iteration, my while loop is going to check whether a is less than five. Since a equals 1, a is less than five is true in our first iteration. I need an exit condition to change the value of a after every iteration. So let me change this back to a again. So let's just say after every iteration, I want my a value to get incremented by one. Now let me execute this. Great. Let me tell you what's happening. As we just discussed, in our very first iteration the value of a is going to be one. But once the iteration comes to this step, the value of a is going to get incremented by one. 
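The three components can be seen together in a short sketch. The counter `printed` is hypothetical and is not part of the lecture's code; it is added only to show how many times the loop body ran.

```python
a = 1                  # entry condition: the starting value
printed = 0            # hypothetical counter, added for illustration only

while a < 5:           # while condition: keep looping as long as this is True
    print("hello")
    printed += 1
    a = a + 1          # exit condition: without this line the loop never ends

print(a, printed)      # 5 4
```

The loop prints hello four times, for a equal to 1, 2, 3 and 4, and stops once a reaches 5.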
So in the next iteration, the value of a is not going to be one, but it's going to be two. So at some point, this condition is going to become false. At the fifth check, the value of a is going to be five, and five is not less than five, correct? So your iteration is going to get terminated and your output printing is going to stop at that point. This is exactly why an exit condition is important. And by chance, if your exit condition happens to fail, or if you forget to type an exit condition on the whole, there's nothing to worry about. Just go to Kernel and click Restart. This will help you terminate an infinite execution. 18. Python - In Built Functions: Most programming beginners happen to think that functions are a really difficult topic to understand. But in reality, functions are a really easy concept, and let me try and help you understand functions with a really simple analogy. Let us assume your name is Jack. You might have a really long family history. You're going to have a mother, you're going to have a father. You might have an education, you might have debt, you might have assets. You have a lot of other things in your portfolio or in your family history that you generally don't tell people, correct? Even with all this family history, if a friend wants to refer to you, he isn't going to use all of those. He's just going to call you Jack and you're going to respond, correct? So you are basically compressing your entire history into just one name, and that's going to be your name, Jack. That is exactly what functions are trying to do. A function can be defined as an organized block of reusable code, which can be called whenever required. So basically you're just going to type a huge piece of code. Let's say you're going to type 100 lines of code, and you're just going to provide a name for that code. So whenever you want to reuse that code further in your program, you don't have to type those 100 lines. 
Again, you just have to call the name that you provided for your code and you can utilize all its functionality. So this is what's called a function. You're just going to write a piece of code and you're just going to provide a name for this code. Whenever you want to use this code again, you don't have to type the entire code. You just have to type its name and your programming language will automatically bring in all its functionality. Now, Python's developers have already made a lot of things easy for you. There are a lot of inbuilt functions in Python that are readily available for you to use. For example, print is actually an inbuilt Python function. Let us say I have a string. My string is going to be hello in small letters. And let's say you want to convert this entire string to upper case. All you have to do is utilize the inbuilt function upper. Okay? Any function should have parentheses. It is not a function call if it doesn't have parentheses. All right, any inbuilt function or any function that you create will have parentheses. Now all I have to do is hit Enter. And my lowercase word will now get converted to upper case. Now let us say you have the same hello, but in capital letters. If you want to convert this entire word to lower case, all you have to do is use the lower function. lower is another inbuilt function that's going to help you convert uppercase letters to lowercase letters. If I hit Enter, my entire hello in capital letters will now get converted into lowercase letters. Let me show you a few other inbuilt functions that are really useful and handy. Now, let me create a list. My list a is going to have five elements. Let's assume I want to count the number of elements that are there inside this list that I've just created. Visually I can just see that there are five elements inside my list. This is because my list is short and straightforward. 
But imagine having a list that has 1000 elements inside of it. How can you manually count all those elements? You actually don't have to, because Python's developers have already created a function called len, the len function, that's going to do all the work for you. All you have to do is call the len function. Inside the parentheses, you must provide the variable that contains your list, which in my case is the variable a. If I just hit Enter, the len function will automatically count all the elements that are there inside my list. There is also a really interesting function called the sum function. This sum function is going to help you calculate the sum of all the elements that are there inside your list. So inside the parentheses, I have to type the name of my variable that has the list. If I hit Enter, the sum function calculates the sum of all the values that are there inside my list. So these are some of the inbuilt functions. But what if you want to create a function on your own? 19. Python - Creating A Function: In our previous lecture, I showed you a few inbuilt functions, correct? But it won't take you too long to realize that Python doesn't really have an inbuilt function for every task. If you want to perform a specific task in your program, you might have to create your very own function for it. Let me give you a real-world scenario for which you'll need your very own function. Let us assume I'm the user and my name is Jack. I need a function that should greet me every time I execute it. For example, my name is Jack, correct? So whenever I type in my name, I want my function to return good morning Jack. Let us assume my name is Tom. Every time I type the word Tom, I want my function to return good morning Tom. There is no inbuilt function in Python that'll help you with this. You'll have to create your very own function. So how can you create your own function? 
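Before a custom function is defined, the inbuilt calls from the previous lesson can be recapped in a short sketch. The list values are assumed for illustration, since the transcript does not show them. Note that upper and lower are called on the string with a dot, while len and sum take the list as an argument.

```python
greeting = "hello"
print(greeting.upper())      # HELLO
print("HELLO".lower())       # hello

numbers = [1, 2, 3, 4, 5]    # illustrative values, not taken from the lecture
print(len(numbers))          # 5, the number of elements
print(sum(numbers))          # 15, the total of all elements
```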
This process is called defining a function, and you have to start with the keyword def, which stands for define. Following this, you'll have to provide a name for your function. Let us assume the name of my function is going to be greet. Following the name of the function, you have to provide parentheses. Inside the parentheses, you'll have to provide a parameter. In our scenario, our parameter is name, correct? The name of your parameter can actually be anything you want. It's just like defining a variable. It doesn't necessarily have to be name. It can also be something simpler, right? It doesn't matter. So just to keep things really simple for you, I'm explicitly typing name. After this, I'll obviously have to end this line with a colon. Now let us take a step back and have a look at what we did so far. First, I've used the def keyword to start defining the function. I've provided the name of my function. And I've also provided a parameter that I want to process, which is going to be my own name. Now place your cursor after the colon and hit Enter. This is where you define what you really want to do with this parameter. Before we go there, let me type a simple print statement here just to see if the function is working or not. All I'm going to do is just print hello here. So every time the function is called, my function would just print a simple string, hello. This is to see if the created function is working or not. Let me just call my function. To call your function, all you have to do is use the function's name that you provided, followed by the parentheses. And if you have provided a parameter here, you'll again have to provide a value for the parameter. Let us assume the value of my parameter is going to be eight. Now you don't have to do this if your function doesn't have a parameter. So if your function doesn't have a parameter here, you don't have to provide a parameter's value here. 
All right, all you have to do is just provide the name of your function followed by parentheses. This will help you call the function that you have created. Let me execute this code cell. I'm sorry, I forgot to execute this code cell before. Now let me execute this code cell again. As you can see, I've called the function here and my function is performing the task that I have allocated it to do, which is to print a simple string, hello. This is how a function call works. Now, let us go back to our original scenario. I want my function to greet me every time I call it. So I'm going to include a parameter again, name. Instead of using hello, I'm going to replace this with good morning. Every time my function is called, I want this good morning string to be concatenated with my name, which is going to be Jack, right? So after the quotations, I'm going to include a plus symbol followed by the name of the parameter. Let me execute this code cell. Inside my function call, I'm going to provide the name, which is going to be Jack. Now every time I execute this function, the string Jack gets concatenated with the good morning string. So this is just like a program greeting you every time you execute it. Let me execute the second code cell again. And as I just told you, our program is greeting us every time. As you can see, everything seems to be working fine, but all we need is a space in between morning and Jack. So I'm going to add a space before the quotations. Let me execute both the code cells again. Great. So this is how you can create your very own function. Now just to explain things a little better, let me show you another example. Let us try and create a new function that will help us generate the square of the integer value that we provide. So if I provide two as my input value, my function has to return the square of two, which is going to be four. So let us try and create this function. 
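The greeting function built in the steps above can be sketched like this. It follows the lecture's description (def, a parameter called name, and a print with string concatenation), but the exact cell contents are an assumption.

```python
def greet(name):
    # Concatenate the greeting with the name.
    # The space before the closing quote separates "morning" from the name.
    print("Good morning " + name)

greet("Jack")   # Good morning Jack
greet("Tom")    # Good morning Tom
```

Because greet only prints, calling it does not hand any value back to the caller.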
So first you have to start with the def keyword, followed by the name that you want your function to have. Let us say I want my function to have the name square. Inside the parentheses you'll have to provide a parameter, so you'll have to provide a name for the parameter. I'm just going to keep things really simple. I'm just going to provide a here, followed by colon, Enter. In this line, you'll have to define what you want to do with this parameter. As discussed before, I want to print the square value of the parameter. So I'm just going to type a print statement here, a to the power of two. Let me execute this code cell. I'm going to call the function here, square, and inside the parentheses I'm going to provide two. Let us execute this code cell and see if our function is working or not. Great. Now let us modify it to, let's say, 100. With this, I believe most of you would have a general understanding of how you can create your very own functions in Python. As a general practice, programmers would generally not use a print statement at the end of a function. Although there is nothing wrong in printing the output of a function, the ultimate goal should always be to return the value that your function is trying to provide. So let me copy the same code and paste it inside another code cell. But instead of using the print statement, I'm going to use the return keyword. I'm going to remove the parentheses and execute this code cell again. If you use return instead of print, you'll be able to store the value that your function is providing inside another variable. Let us say my other variable is b, and I'm going to store the output of this function inside the variable b. So I'm going to call my function and provide the value, let's say the same 100. Let me execute. Great. And if I display the value that the b variable is storing, I'll be able to retrieve the value of my function. 
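The difference between printing and returning can be sketched as below. The lecture uses the name square for both versions; the name `square_print` is hypothetical and is used here only so the two variants can sit side by side.

```python
def square_print(a):
    print(a ** 2)      # shows the value on screen, but the call itself returns None

def square(a):
    return a ** 2      # hands the value back to the caller

square_print(2)        # prints 4
b = square(100)        # the returned value can be stored in a variable
print(b)               # 10000
```

Storing the result of square_print in a variable would only store None, which is why return is preferred when the value is needed later.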
So as a standard industry practice, instead of using print, always try and use return at the end of your function. 20. Numpy - Introduction to NumPy: Hey guys. So when it comes to data science, NumPy is everything. Almost every data scientist around the world would have used NumPy at least once, and I'm talking about at least once per day. Because data science is all about mathematics, correct? It is all about mathematics, about converting mathematical information and statistical information into something which the user or your employer or recruiter would read and understand. Because you yourself, as a data scientist, will not be able to understand mathematical data unless you convert it into something which you can understand, such as a graph or any other form of readable data. NumPy will help you with this, okay? And there are other valid reasons for why you should use NumPy. NumPy is one of the most important linear algebra libraries in Python. And again, linear algebra as a concept is extremely important when it comes to data science, and almost all the libraries in the PyData ecosystem rely on NumPy, because NumPy arrays are one of the major building blocks. And NumPy is built on top of C libraries, which makes NumPy extremely fast to use. Okay? So before you use NumPy, you should actually install NumPy on your system. I believe most of you would have Anaconda installed on your PC. Okay. If you have Anaconda installed on your PC, all you have to do is follow the steps that you see on the screen. First, you should go to your terminal and type conda install numpy, with spaces in between. Okay? Just watch the screen and follow the steps. As a matter of fact, let's just go ahead and install NumPy right now. Okay. I'm first going to open my command prompt. In my command prompt, I'm going to type conda install numpy. Okay. 
And please note that you must have the Anaconda distribution installed on your PC for this to work. Okay. I have Anaconda installed, so I'm using this. If you don't have Anaconda installed, all you have to do is type pip install numpy. I have Anaconda installed, and I'm going to use Anaconda throughout the course, so I'm going to use conda install numpy, okay? And wait for a minute. Okay, if this is the first time you're installing NumPy, this might take a while. I've already installed it previously, so this installation completed in a second. As you can already see, all requested packages have been installed already. Okay. So this just took a second. If it takes a while, do wait until NumPy gets installed. If you see something like this, NumPy is now installed on your PC. Okay, you're good to go. 21. NumPy - Arrays: In this session, let us actually start exploring the features that NumPy has. The very first feature of NumPy that I'm about to show you is the NumPy array, okay? Most of the people who use NumPy actually use NumPy to access this feature, so the NumPy array is one of the most used features of NumPy. NumPy arrays have two types. The first type is the one-dimensional vector and the second type is the two-dimensional matrix. Okay? Let us actually explore them a bit. Okay, I guess most of you know how to declare a list by now. So I'm going to assign a simple list to a variable called a. Okay? When I print a, I can actually see that list being printed. Let us assume I want to convert this simple list into a NumPy array, okay? First I have to import numpy as np. Okay? I'm just going to import numpy as np. After importing, I'm going to cast that list as a NumPy array, okay? So inside np.array, I'm going to give the variable where we stored our list. If I hit Enter, this list is now a NumPy array. Okay? This is a one-dimensional NumPy vector. 
This is not a matrix, because this is made up of just one row, correct? So this is a one-dimensional vector and it is not a matrix. Let's say you want to create a two-dimensional matrix using NumPy, okay? To do this, you'll first have to create a list inside of a list. Okay? So I've declared this list, and inside this list, I'm going to create two independent lists. Okay. The first list is going to have 1, 2, 3, and the second list is going to have 4, 5, and 6. If I print this list without using NumPy, it'll actually give me a compound list. Okay, but let's see what happens when I cast this into a NumPy array. Okay. I'm just going to do the exact same thing that I did for the a variable. I'm going to cast this b variable as a NumPy array. If I do this, it will automatically get converted into a NumPy array, a two-dimensional NumPy matrix. If you see here, when I print it normally, this is just a compound list. But when you print the same thing as a NumPy array, this is now a matrix, okay? This is how you normally cast a list as a NumPy array. But in most cases you won't be casting lists; you will be generating your own arrays with their own data. Okay, we'll see how to do this in the next session. 22. Numpy - Generating NumPy Arrays: Until this point, we have only been converting regular lists into NumPy arrays, correct? But if you want, you can directly go ahead and create NumPy arrays. There are a lot of methods in NumPy that'll help you with this. To start with, I'm going to be showing you NumPy arange. Okay. People in general call this NumPy arrange, and that makes just as much sense. Okay. Inside NumPy arange, you'll have to provide a lower limit and an upper limit. If I execute this, I get all the integers that are there between the numbers 1 and 10, or the lower limit and upper limit that you've provided. Okay. Let me just go ahead and execute this. 
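The list-to-array conversion and the first arange call described so far can be sketched as follows. The values in the one-dimensional list are assumed, since the transcript does not show them, and the names `vector` and `matrix` are illustrative.

```python
import numpy as np

a = [1, 2, 3]                    # illustrative values for the simple list
vector = np.array(a)             # one-dimensional array (a vector)

b = [[1, 2, 3], [4, 5, 6]]       # a list of two lists
matrix = np.array(b)             # two-dimensional array: 2 rows, 3 columns

print(vector.ndim, matrix.ndim)  # 1 2
print(matrix.shape)              # (2, 3)

# np.arange includes the lower limit but stops before the upper limit.
print(np.arange(1, 10))          # [1 2 3 4 5 6 7 8 9]
```

One detail worth keeping in mind is that the upper limit itself is not part of the arange output, so 10 does not appear above.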
If you see, I have received all the integers that are there between my lower limit and upper limit. Let's say I want my lower limit to be 20 and the upper limit to be 30. Okay? If I execute this, I'll get all the integers that are there between 20 and 30. Now there is a third argument that you can provide. This third argument is called the step size or interval. Let's say I want the interval to be two. Okay? If I execute this, I'll get the numbers between 20 and 30, but I'll receive them in intervals of two. So there'll be an interval of two between every integer that I receive. Let's say I want the interval to be five. If I execute this, I'll receive the numbers between 20 and 30, but I'll receive them in intervals of five. This is something that you can go around and play with to understand it better. Let's say I want all the numbers between 10 and 100, but I want them in intervals of ten. Okay? Now you get my point, right? Now I'll show you another method. There is something in NumPy called NumPy zeros. Okay? This is pretty much straightforward. If I go ahead and give it an argument of three and execute this, I get a NumPy array, but every number inside this NumPy array is going to be 0. This is a one-dimensional vector, correct? If I want to convert this into a matrix, all I have to do is pass a tuple inside of this. I'll have to give it a row and a column. Okay, let's say I want a three by three matrix. If I execute this, I'll get a three by three matrix, because the row count is going to be three and the column count is going to be three as well. Let's say I want to go in for a matrix with two rows and four columns. Obviously every one of the digits is going to be 0. I'm just stressing this for you to understand the rows and columns a little better. Okay? Let us move ahead to NumPy ones, okay? This is pretty similar to NumPy zeros, but the only difference is you're going to be receiving ones instead of zeros. Okay? 
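The step size and the zeros calls described above can be tried in one cell. This is a sketch under the assumption that the calls are np.arange and np.zeros with the arguments mentioned in the lecture.

```python
import numpy as np

print(np.arange(20, 30, 2))     # [20 22 24 26 28]
print(np.arange(20, 30, 5))     # [20 25]
print(np.arange(10, 100, 10))   # [10 20 30 40 50 60 70 80 90]

print(np.zeros(3))              # one-dimensional: [0. 0. 0.]
print(np.zeros((3, 3)))         # a tuple of (rows, columns) gives a 3 by 3 matrix
```

The zeros are stored as floating point numbers by default, which is why they print as 0. rather than 0.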
Well, let me start with the same three here. If I execute this, I'll receive a one-dimensional vector with all ones inside of it. Okay, let's say I want to convert this into a matrix. All I have to do is to give it a tuple, and inside this tuple I have to declare a row and a column. Let's say I want a ten-by-ten matrix, okay? If I execute this, I'll receive a ten-by-ten matrix, that is, a matrix with ten rows and ten columns. That's exactly what this line of code means, okay? Now, this is something that you should know about. 23. NumPy - Linspace: Allow me to introduce you to something called NumPy linspace. Okay? Before we get into this: linspace is really similar to NumPy arange, but it is not quite the same. As a matter of fact, let me just give you a quick recap of what NumPy arange can do. NumPy arange takes three arguments, correct? The first one is the lower limit, the second one is the upper limit, and the last one is the step. Now the step is the size of the interval between values, correct? If I go ahead and execute this, I'll receive all the numbers between 1 and 10, to be more specific all integers between 1 and 10, but I'll receive them in intervals of three, correct? The point to be noted is that, with integer arguments like these, the output of NumPy arange will always be integers. Okay. Now remember this, and I'll show you what NumPy linspace can do. NumPy linspace takes the same number of arguments. It will need a lower limit and an upper limit. But in this case, we won't call the third argument a step size. Instead, we'll call it the number of portions, because the number that we give decides how many evenly spaced values the range from the lower limit to the upper limit gets divided into. Now, in this case we've declared 10, correct? So the range between 1 and 5 is going to get segmented into 10 evenly spaced values. If I execute this, you will get a far better understanding. I'm going to execute this. If you see here, the range from 1 to 5 is being divided into 10 evenly spaced values.
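The contrast between arange and linspace described above can be sketched like this, with the same arguments as the lesson. The variable names are illustrative.

```python
import numpy as np

ones_10x10 = np.ones((10, 10))    # ten rows and ten columns of ones

stepped = np.arange(1, 10, 3)     # third argument is the step between values
spaced = np.linspace(1, 5, 10)    # third argument is how many values to return

print(stepped)   # integers, upper limit excluded
print(spaced)    # ten floats, running from 1.0 to 5.0 inclusive
```

Unlike arange, linspace includes the upper limit by default, so the ten values here are separated by nine equal gaps.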
Another point to be noted is that the output of linspace will not necessarily be integers, but the output of NumPy arange, with integer arguments, will only be integers, okay? Now the output of linspace depends on the number of portions that you give. Let's say I want this to be seven. If I execute this, the range will get segmented into seven evenly spaced values. Okay? You'll get seven evenly spaced numbers here. This is what NumPy linspace can do. 24. Numpy - Identity Matrix: Let us go ahead and generate what's called an identity matrix. Okay, to do this, you'll have to use this syntax, eye. And inside the parentheses, all you need is just one argument, because the number of rows and columns in an identity matrix is always going to be equal. Let's say I want five rows and five columns, okay? If I execute this, I'll receive an identity matrix with five rows and five columns. Okay? People who are familiar with high school mathematics will know that an identity matrix is a two-dimensional square matrix with an equal number of rows and columns. The diagonal elements of your identity matrix are going to be filled with ones, and all the other elements are going to be zeros. Okay? This is what an identity matrix is. 25. Numpy - Generating Arrays With Random Values: When generating NumPy arrays, you won't always need user-defined values. You can also generate NumPy arrays with random values. To do this, you'll first have to call the random module inside NumPy. This random module has several methods that you can use. In order for you to access these methods, you first have to type what I've typed on the screen and then press Tab. If you do this, you'll be able to see all the methods that are there inside the random module. Okay? To start with, I'm going to be using the rand method. Now this rand method follows a uniform distribution, which means I'll get numbers between 0 and 1 that are uniformly distributed. Let's say I'm just going to give it the argument of three.
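As a rough sketch of the two calls introduced above, the identity matrix and the three-element uniform sample look like this. The variable names are illustrative, and the random values will differ on every run.

```python
import numpy as np

identity = np.eye(5)          # 5 x 5 matrix: ones on the diagonal, zeros elsewhere
sample = np.random.rand(3)    # three uniformly distributed values in [0, 1)

print(identity)
print(sample)
```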
And if I execute this, I receive an array with three values, okay? This is a three-element array, and these elements are uniformly distributed values between the numbers 0 and 1. Okay? Now, let's say I don't want a one-dimensional vector, but I want a two-dimensional matrix. If I want a two-dimensional matrix using the rand method, unlike what we've seen before, I don't have to declare tuples, okay? I don't need tuples like this. All you have to do is to declare the number of rows and columns directly in the arguments. Okay, let's say I want a matrix with two rows and three columns. If I execute this, I'll receive a matrix with two rows and three columns. All of these are uniformly distributed values, and these values are generated between the numbers 0 and 1. Okay? There is another method called randn, okay? Now this randn has one difference. rand generates uniformly distributed values between 0 and 1, correct? randn generates normally distributed values, or Gaussian distributed values, centered on 0 with a standard deviation of 1, so they can be negative or larger than 1. Let's say I'm going to declare the same two comma three here. If I execute this, I'll receive numbers centered around 0, and they'll be Gaussian distributed values. Okay? You'll see more about uniform distribution and Gaussian, or normal, distribution when we talk about data visualization, but it is always good to know in advance, okay? Now let's talk about something simple. If you don't need uniformly or normally distributed values and you just need integers, you can use something called the random randint method. Okay, to do this, you first have to declare a lower limit and an upper limit. If I execute this, I'll receive a random integer between the lower limit and the upper limit that I've provided. And you'll have to know that the lower limit is inclusive, but the upper limit is exclusive. Okay.
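A short sketch of the two-row, three-column calls described above. The names `uniform` and `gaussian` are illustrative, and the printed values change on every run.

```python
import numpy as np

uniform = np.random.rand(2, 3)     # rows and columns passed directly, no tuple
gaussian = np.random.randn(2, 3)   # standard normal: mean 0, standard deviation 1

print(uniform)    # every value lies in [0, 1)
print(gaussian)   # values may be negative or greater than 1
```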
Whenever you declare an upper limit and a lower limit, the lower limit is inclusive and the upper limit is exclusive, which means your output can be 1, or whatever lower limit you have provided, but your output will never be the upper limit. Okay? Now let's say I'm just going to give 10 here, and the upper limit is going to be 20. If I do this, the output might be 10, okay? If I keep executing this, the output could be 10 at some point, but it will never be 20, okay, because the upper limit is exclusive. Now this applies in most places where you'll need to provide a lower limit and an upper limit, such as arange and slicing; linspace is an exception, since it includes the upper limit by default. Okay, now let us say you want an array instead of a single integer. All you have to do is to provide a third argument. Let's say I want an array with three values, okay? If I hit Enter, I'll receive an array with three elements inside of it. Okay, now this is how you use the rand, randn, and randint methods inside the random module, okay? 26. Numpy - Reshape, Min and Max: I'm going to create two variables, and inside each variable I'm going to be storing a separate array. For example, I'm going to create an a variable, and inside the a variable I'm going to be generating an array using the arange method. Okay, let's say I want this to be a one-dimensional vector with 25 digits inside of it. Okay? And I'm also going to create a b variable. I'm going to use the random module to generate this array, with the randint method. Okay? Let's say we want numbers between 1 and 100, and I want 10 random numbers, okay? All right. Now, if I print a, I'll receive a one-dimensional vector with 25 digits inside of it. And if I execute b, I'll receive a one-dimensional vector again, with ten digits inside of it. I'm going to introduce you to something called the reshape method. Okay? Now this reshape method is going to allow us to transform the 25 digits that we have here into a five-by-five matrix. To do this, I'll first have to call my a variable.
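The inclusive lower limit and exclusive upper limit of randint can be checked with a sketch like the one below. Drawing 1000 values at once is an addition for illustration, not something done in the lesson.

```python
import numpy as np

single = np.random.randint(10, 20)        # one integer, 10 <= value < 20
three = np.random.randint(10, 20, 3)      # third argument: how many values to draw
many = np.random.randint(10, 20, 1000)    # a large sample to inspect the limits

print(single)
print(three)
print(many.min(), many.max())   # the maximum can reach 19 but never 20
```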
And I'm going to apply the reshape method. Inside this reshape method, I'm going to be declaring five comma five, because I need these 25 digits to be arranged as a five-by-five matrix. If I do this, you can actually see these 25 digits being arranged as a five-by-five matrix. But for this to work, you should have the exact same number of digits, okay? To be more specific, the product of the dimensions of your matrix should match the overall number of digits that you have. For example, I need a five-by-five matrix. The product of five and five is 25, and this matches the overall number of digits you have. Unless this criterion is met, you will not receive an output. You'll receive an error message. Now, to demonstrate this, let me just declare five comma ten, okay? This represents a five-by-ten matrix. If I do this and hit Execute, I'll receive an error. This is because an array of 25 digits cannot be reshaped into this shape, since we just have 25 digits here. For this to be satisfied, we need 50 digits, correct? So this wouldn't work. Now, let me call the b variable, because we haven't used the b variable yet. Inside the b variable, we have 10 random integers between the numbers 1 and 100, correct? All right. Now these are not arranged in any specific order. These are just 10 random numbers between 1 and 100. Now if you want to find the maximum number in this array, all you have to do is to access the max method. Okay? If you do this and hit Enter, this method will help you find the maximum number inside this array, okay? Similarly, let's say you want to find the minimum number stored inside the b array. Just by using the min method, you'll be able to find the minimum number in this array. All right? Now this is how you use the max and min methods.
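A minimal sketch of the reshape rule and of max and min, assuming `a` was built with `np.arange(25)`. The try/except wrapper is an addition so that the failing reshape does not stop the script.

```python
import numpy as np

a = np.arange(25)                    # 25 digits: 0 through 24
b = np.random.randint(1, 100, 10)    # 10 random integers, 1 <= value < 100

grid = a.reshape(5, 5)               # works because 5 * 5 == 25
print(grid)

try:
    a.reshape(5, 10)                 # needs 50 digits, but only 25 exist
except ValueError as err:
    print("reshape failed:", err)

print(b)
print(b.max(), b.min())              # largest and smallest values in b
```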
Now, let us say you don't want to find the numbers themselves, but you want to find the position where these numbers are placed inside the array. If you want to find the position, or the index, of the maximum and minimum numbers, all you have to do is to access the argmax method. If I do this and hit Execute, this will help me find the position of the maximum number. If you see, this has given me an output of 5. Let us count: 0, 1, 2, 3, 4, 5. Now if you see, in the fifth position I have the maximum element. Similarly, if I want to find the position of my minimum number, I'd have to use the argmin method. Okay? If I do this and hit Execute, this will help me find the position of the minimum number. In my case, if you see, the position is four. Let's count again: 0, 1, 2, 3, 4, okay? Now, my minimum number is placed in position four. Now, these are five interesting methods that you'll be using a lot in our course. 27. Numpy - Shape and Dtype: For this lesson, I'm going to be creating a new variable called c, okay? Inside the c variable I'm going to be storing a two-dimensional matrix, so let me just go ahead and copy and paste the same two-dimensional matrix, okay? I'm going to copy this code, and I'm going to paste it inside the c variable. Alright, now if I print my c variable, I'll have a two-dimensional matrix. Okay? Now, we know that this is a five-by-five matrix, but assume you don't know the number of rows and columns in this matrix, okay? That is, you may not know the shape at all. For you to find the shape of this matrix, all you have to do is to use the shape attribute. This shape attribute doesn't need arguments or parentheses. All you have to do is to use the following syntax: after the variable, you'll have to type dot shape. If you hit Shift plus Enter, you can find the shape of this particular matrix being printed here, okay?
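The position lookups and the shape attribute can be sketched as below. The values in `b` are hypothetical stand-ins, chosen so the maximum sits at position 5 and the minimum at position 4 as in the lesson, and `c` is assumed to be the reshaped five-by-five matrix.

```python
import numpy as np

# Hypothetical values standing in for the ten random integers from the lesson.
b = np.array([34, 71, 52, 18, 3, 97, 45, 60, 29, 88])

print(b.argmax())   # position of the largest value (97)
print(b.argmin())   # position of the smallest value (3)

c = np.arange(25).reshape(5, 5)
print(c.shape)      # (rows, columns); an attribute, so no parentheses
```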
The first digit represents the number of rows, and the second digit represents the number of columns. Similarly, if you want to find the data type of the elements that are there inside the matrix, you have to use the dtype attribute. If you do this and hit Shift plus Enter, you'll be able to see that this particular matrix is made up of 32-bit integers. Okay? Now these are, again, two interesting attributes that you'll be using a lot in the following lessons, okay. 28. NumPy - Indexing: People who are already familiar with Python's lists and dictionaries are in luck today, because some of the concepts that you learned will apply to NumPy as well. Okay, for people who are new here, I'll give you a quick demonstration, so you don't have to worry. Okay? So as a first step, I'm going to import numpy as np again. And I'm going to create a new variable, and inside this variable I'm going to create a NumPy array. Okay? Let's say I want this array to have ten digits. Alright? If I print the a variable, I'll receive the NumPy array, and this NumPy array has ten digits. Okay? Now, we already discussed how you can access a particular element using its position. Let's say you want to access this element, four. By knowing the position of this element, you can actually extract this particular element from the array. For example, this four is in the position 0, 1, 2, 3, 4. Now in our case, four happens to be in position four. By using a square bracket and declaring the position four inside of it, you'll be able to extract the element that's sitting in that particular position, okay? In our case, the position equals the value of the digit itself. So we have declared four here, and we've received the same output, which is four. If you have an array with random digits, that won't be the case, okay? Now, you can also extract a set of elements from this array.
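A small sketch of the dtype attribute and of single-element indexing, assuming `a` holds the digits 0 through 9. The exact integer width that dtype reports (for example int32 or int64) depends on the platform and NumPy version.

```python
import numpy as np

c = np.arange(25).reshape(5, 5)
print(c.dtype)    # an integer type; the bit width is platform-dependent

a = np.arange(10)   # ten digits: 0 through 9
print(a[4])         # element at position 4, which here is also the value 4
```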
Let's say you want to extract these particular digits from this entire array, okay? This concept is called slicing, or in NumPy, it's called NumPy indexing. Again, I'm going to use the same variable and the same square brackets, but I need to provide a lower limit and an upper limit. Okay? Let's say I want my lower limit to be 2 and my upper limit to be 10, and I want to separate my lower limit and upper limit using a colon. Okay? Now let's say this colon means everything, okay? Now, this index means I want everything that's stored between 2 and 10. Okay? If I execute this, I'll receive everything that's stored between the positions 2 and 10. Alright? If you see, 0, 1, 2: so 2 is inclusive, the lower limit is inclusive. We discussed this in the previous lessons: the lower limit is always inclusive, but the upper limit is exclusive. This way, we get the element that's stored in position 2, but we don't get the element that's stored in position 10, because 10 is exclusive. Let's use another example. Let's say I want all the elements that are stored between the positions 3 and 6, okay? If I execute this, I can now extract all the elements that are there between 3 and 6, okay? Now this colon literally means everything. Okay? So if you read this colon as the word "everything", it will be easier for you to remember the syntax. Now let us consider another example. In this case, I'm not going to provide an upper limit or a lower limit. Okay. I told you the colon represents everything, correct? If I just execute this, this index will print all the elements, okay? Because this colon represents everything, and since we did not give an upper or lower limit, this index is printing all the digits that are stored inside the variable.
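The three slices discussed above can be sketched as follows, assuming `a` holds the digits 0 through 9.

```python
import numpy as np

a = np.arange(10)

print(a[2:10])   # positions 2 up to, but not including, 10
print(a[3:6])    # positions 3, 4 and 5
print(a[:])      # no limits given: every element
```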
Now, I'm going to use the same syntax, but this time I'm going to be giving it only a lower limit. If I do this and hit Execute, this index is going to print everything from the given position onward. Okay? Now similarly, I'm going to do the exact same thing, but I'm not going to give it a lower limit. Okay? I'm only going to give it an upper limit of, let's say, eight. Okay, if I execute this, I get all the elements that fall before the given position. In our case, we've given the position eight, okay? Which is exactly why we're receiving all the elements that fall before position eight, okay? Now NumPy indexing is another important concept that we'll be using a lot, okay? 29. Numpy - Index Broadcasting I: Now most of you would have a question right now. You might ask me, how is NumPy indexing different from regular Python list indexing? Your question is absolutely valid. There is one major difference that you must know about. NumPy array indexes have the ability to broadcast. I'll show you what this is if we just go ahead and print the variable again. Now we have an array which has ten digits, correct? Now, let me just go ahead and slice a portion of it. Let's say I want to slice the elements between position one and, let's say, six, okay? And I'm going to broadcast the number 1 to it. What do you think this would do? If I go ahead and print the variable again, you can see that the number 1 has now replaced everything that we've sliced here. Everything from position 1 up to position 6 is now overwritten with the element that you broadcast. Let's say I want to replace it with the element a hundred, or let's say I want to replace it with the element 1,000 or 10,000, whatever it is. If I print the variable again, the element that I've used to broadcast will replace every element that we've sliced. Okay? Now this is called broadcasting, okay? This is a feature that is very particular to NumPy indexes.
30. Numpy - Index Broadcasting II: Index broadcasting has a tricky side that you must know about. Now we have an array, okay? This array has 10 digits stored inside of it. Okay? Now I'm going to create a new array called b. But instead of creating new digits, I'm going to take a slice of a. Okay, I'm not going to create a new array from scratch. I'm just going to slice, let's say, the elements two to six, okay? And if I print everything that's stored inside of b, I'll get a new array, good. But this new array is a slice of what's stored inside the a array, okay. Now I'm not going to touch the a array; I'm only going to use the b array. Okay? Let me just apply the concept of broadcasting to b, okay. I'm going to choose everything that's there inside the b array, and I'm going to broadcast the number, let's say, 0 to it, alright? If I print b now, I'll get all zeros. Okay? Now, you might think that this won't affect the a array. But if I print the a array, you can actually see that everything that you did to the b array is affecting the a array as well. Even though you haven't touched the a array up to this point, everything that you've done to the b array is affecting a. This is because you sliced a portion of the a array, okay. Just because you stored the slice inside a new variable, NumPy does not create a new array out of the older array; the slice is a view that shares the original array's memory. NumPy tries to save memory at every possible instance. Okay? So unless you create a new array with a new set of values, your changes will keep affecting your parent array, or in our case, the a array. Okay, if it sounds confusing, just try to think about it for a while. It's actually really easy to grasp. If you want to create a new array from an old array, you should explicitly specify that it is a copy, okay? Now you can ask me how you can do it. I'll tell you how. I'm going to specify that b is a copy of a. Okay, if I do this and then use the concept of broadcasting, it wouldn't affect my original array.
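Here is a rough sketch of broadcasting into a slice and of the shared-memory behaviour described above, assuming `a` starts as the digits 0 through 9. The call to `np.shares_memory` is an addition used only to make the link between the two arrays visible.

```python
import numpy as np

a = np.arange(10)
a[1:6] = 100        # broadcast one value into positions 1 through 5
print(a)

b = a[2:6]          # a slice is a view onto a, not an independent array
b[:] = 0            # broadcast 0 into every position of b
print(b)
print(a)            # positions 2 through 5 of a are now 0 as well

print(np.shares_memory(a, b))   # True: both names refer to the same data
```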
Let's say you want to broadcast to everything that's stored inside b, and I want to broadcast, let's say, 10,000. All right, if I do this and print b, every element that's stored inside b would get overwritten. But if I print a here, this wouldn't be reflected in the a array, okay? Because you've explicitly specified that the b array is now a copy of the a array, okay? Now, this is essential if you're using a new array to store a portion of your older array, okay. 31. Numpy - 2D Indexing: Until this point, we've only been working with one-dimensional arrays, or one-dimensional vectors. So let us start discussing the indexing concepts of two-dimensional arrays, or matrices, okay? So to start with, I'm first going to create a two-dimensional array. I'm going to create a variable, and inside this variable I'm going to be creating an array with three rows and three columns, okay? The first row is going to be 1, 2, 3, the second row is going to be 2, 3, 4, and the third row is going to be 3, 4, 5, right? If I do this and print it, I now have a two-dimensional matrix. Okay? Before we jump into the indexing concept, I first want you to understand the matrix coordinate system. The first column in the matrix is the 0th column, okay? And similarly the first row in the matrix is the 0th row. Now assume we have a five-by-five matrix. If you want to access the element that's sitting inside this particular position, you will have to use the 0th row and the 0th column as the address. Okay? So the position of this particular element is 0 comma 0. And similarly, let's say you want to access the element that's sitting in this particular position. You must use the position 0 comma 1 as the address, because this particular element is sitting in row 0 and in column 1. Okay? So this is how the matrix coordinate system works, okay? Now, assume we want to find the position of this particular element, two, okay?
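A minimal sketch of the copy approach, using the array's `copy` method on the same slice of positions 2 to 6. The lesson does not show the exact call on screen in the transcript, so treat the form `a[2:6].copy()` as one reasonable way to do it.

```python
import numpy as np

a = np.arange(10)
b = a[2:6].copy()   # an independent copy of the slice, with its own memory
b[:] = 10000        # broadcast into the copy only

print(b)            # every element of b is now 10000
print(a)            # a is unchanged
```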
You know that it's sitting in the 0th row, and it's sitting in the first column. If you want to call this particular element, you'd have to use the position 0 comma 1. Okay? Now, there are two ways in which you can call this. You can either use the single bracket method and use 0 comma 1. If you do this and hit Enter, this will extract this particular element from this matrix. There is also another method where we use two sets of brackets. Okay, it's the same thing, but you'll have to divide the row and the column into two separate brackets. Alright? I personally prefer using the single bracket method, because it reduces the possibility of errors and it keeps your code really simple. Okay, now this is how you extract one particular element from this whole matrix. Let's try one more example before we jump to the next lesson. I want you to pause the screen, and I want you to extract this particular element, five, from this entire matrix. Okay? I want you to pause the screen and try experimenting on your own. So we know that this element is sitting in the second row and in the second column. To access this, all I have to do is to type its position, two comma two. Okay? If I do this and hit Enter, I'll be able to extract this particular element from the entire matrix, or the two-dimensional array. Alright. 32. Numpy - Extracting Submatrices: Let us assume that you don't want to extract single elements anymore, and you want to extract chunks of data from this matrix. I'll try to be a little more specific. Let us try and extract this particular square corner of this matrix. Okay, you want 2, 3, 3 and 4 to be an individual matrix. How can you extract a submatrix from this whole matrix? We'll try to do this. In order for you to do this, you'd have to use the comma notation, okay? To the left side of the comma, you'd have to declare the rows, and to the right side, the columns. Now, I know that first I'd have to extract everything up to row two.
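The two ways of addressing a single element can be sketched with the same three-by-three matrix. The variable name `mat` is illustrative.

```python
import numpy as np

mat = np.array([[1, 2, 3],
                [2, 3, 4],
                [3, 4, 5]])

print(mat[0, 1])    # single-bracket form: row 0, column 1
print(mat[0][1])    # two-bracket form: same element
print(mat[2, 2])    # row 2, column 2: the element five
```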
And I need everything from the first column onward. If I do this and hit Enter, I would have extracted a submatrix, that is, this particular portion of the whole matrix. Okay, now, let me break this down for you to make this a little more clear. I'll first extract everything up to row two. If I do this, I would have extracted everything that falls before the second row, that is, the first two rows. I would have captured the first two rows, okay? And after I do this, I'm restricting it to start from the first column, which means I'm removing the 0th column out of the equation and keeping everything from the first column onward, which is the 2, 3, 3, 4. Okay, after I do this, I'll be able to isolate this particular submatrix from the whole matrix, okay? If this sounds confusing to you, please do not worry, because this is an advanced concept that we're not going to be seeing a lot in this course, okay? As a matter of fact, you won't even be using this a lot as a data scientist or business analyst. So if this sounds confusing, do not worry. 33. Numpy - Conditional Indexing: In our previous session, I told you that we wouldn't be using submatrix indexing a lot and there is no need to worry. But conditional indexing is a whole other story. Conditional indexing is important, and we're going to be using it a lot more often. Okay? So what is conditional indexing? Let me tell you. First, I'll create a new variable, and inside this variable I'm going to be generating a new one-dimensional array. Okay. Now this a array has a one-dimensional vector inside of it, with 10 digits. I'm going to be comparing this a variable with a numerical value. Let's say I'm going to compare this a variable with the number five, okay? If I do this and hit Execute, I will receive a Boolean output. True and False are called Boolean outputs. If I compare a vector with a number, I get Boolean outputs.
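The two-step breakdown above (first the rows, then the columns) can be sketched like this. The intermediate name `top_rows` is only for illustration.

```python
import numpy as np

mat = np.array([[1, 2, 3],
                [2, 3, 4],
                [3, 4, 5]])

top_rows = mat[:2]      # everything before row 2: rows 0 and 1
sub = mat[:2, 1:]       # those rows, from column 1 onward

print(top_rows)
print(sub)              # the 2, 3 / 3, 4 corner of the matrix
```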
Alright? I'm going to store all this Boolean output in a new variable called b. Alright? Okay, now this b has all my Boolean output. Okay? Now what I'm going to do is index my a vector with this new vector, which has my Boolean output. If I do this and hit Execute, I will only receive the elements for which the Boolean output was True. Okay? Now this concept is called conditional indexing. Now you don't always have to be this descriptive. You can simplify this process a lot more. Now, I have my a array here. To simplify this, all I have to do is to open a square bracket and type a is greater than five, or any other numeric condition, okay? This will give me the same output. This process is called conditional indexing, and this is a really important concept. Not just in NumPy; we'll also be using this when we study pandas. Okay, now this syntax might look a little weird to you, but it is good if you get familiar with it, because we are going to be using it quite often, okay. 34. NumPy - Operations: There are a lot of simple mathematical operations that you can perform using arrays. You can either perform them using two arrays, or you can perform them with an array and a scalar value. To demonstrate this to you, I'm first going to copy the same line of code here to create a new array. So I have an array with 10 digits inside of it. You can add an array to another array, where every single element gets added to the element in the same position in the other array. Okay? Or you can subtract an array from another array. You can also multiply two arrays together, okay? The process is exactly the same. You can also perform operations with scalar values. Let's say you want to add something to all the digits in an array. All you have to do is just go ahead and add it. Okay? Let's say you want to add 2 to all the digits, okay?
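The long and short forms of conditional indexing can be sketched as follows, assuming `a` holds the digits 0 through 9 and the condition is "greater than five".

```python
import numpy as np

a = np.arange(10)

b = a > 5          # element-wise comparison gives an array of True and False
print(b)

print(a[b])        # long form: index a with the stored Boolean array
print(a[a > 5])    # short form: the condition goes straight into the brackets
```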
Now, be careful: whatever you do here will be reflected across all the digits inside your array. Okay, let's say you want to subtract one from all the digits, okay? Whatever you do here will be reflected across all the elements inside the array. So this is something that you must get used to. And let's say I want to multiply all the digits by 10. Okay? Now this is also doable. Now, in some cases, NumPy will allow you to execute your operation even if you would normally encounter an error. Now, what do I mean? In plain Python, you cannot divide a number by 0. You'll encounter a ZeroDivisionError, because this isn't possible. You can't divide any number by 0, okay? But let us try and divide a by a, because the first element of the array is 0, correct? So we're going to be dividing 0 by 0. Let's see what happens. If I do this, NumPy will issue a warning, but it will still give me my output. I told you that you cannot divide 0 by 0, correct? Which is exactly why NumPy has displayed NaN here, but we'll still receive the output for all the other elements. Your execution will not stop with an error. You will still receive an output, but you'll also receive a warning message with it. Okay, let us try doing the same with one divided by a. If I do this, I'll receive a warning as well, because right now I'm dividing one by 0. The first element of my array is 0. Since I'm performing the operation one by a, I'm dividing one by 0, which is not possible, which is why I am getting an infinity here. But my execution is not stopping with an error. I'm still receiving an output, with a warning message. Okay, this is another feature of NumPy which you must know about. 35. Numpy - Universal Functions: NumPy also has a lot of inbuilt universal functions that you can use to your advantage. Let's say you want to find the square root of every element inside your array. All you have to do is to access the sqrt function. And inside the argument, you'll have to provide the name of your array. Then hit Enter.
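A sketch of the element-wise operations and of the two divisions by zero, assuming `a` holds the digits 0 through 9. The `np.errstate` block is an addition that silences the runtime warnings; without it the same results are produced along with a warning message.

```python
import numpy as np

a = np.arange(10)

print(a + a)     # array with array: added position by position
print(a - a)
print(a * a)
print(a + 2)     # array with scalar: applied to every element
print(a - 1)
print(a * 10)

# Suppress the warnings NumPy would otherwise print for these divisions.
with np.errstate(divide="ignore", invalid="ignore"):
    ratio = a / a      # 0 / 0 at position 0 gives nan
    inverse = 1 / a    # 1 / 0 at position 0 gives inf

print(ratio)
print(inverse)
```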
You can calculate the square root of all the elements inside the array. Similarly, let's say you want to find the exponential of all the elements in your array. All you have to do is to access the exp function, and inside the argument, you have to provide the name of your array. This will give you the exponential of all the elements inside your array. Similarly, there are also trigonometric functions that you can use. Let's say you want to find the sine. You have to access the sin function. This will give you the sine values of all the digits inside the array. Okay? Now, let's say we want to find the log. Log is another universal function inside NumPy. If I hit Enter, I'll receive a warning, because the log of 0 is actually negative infinity, which is exactly why I'm receiving a warning here, but you'll still receive the output even though NumPy issues a warning. Again, this is not an error. This is just a warning message. This is a runtime warning, but this doesn't mean you won't receive your output. You'll receive the output, but you'll also receive a warning to show you what's really happening behind the scenes. There are a lot more inbuilt universal functions like this that you can use. If you want to know more about them, you'll have to visit this page. I'll make sure I leave this link on my display right now, and I'll also give a link to this in my lecture notes. So you can follow this link and learn a lot more about these functions, okay? There are a lot of functions like this. Before you go ahead and start writing your own function, be sure to visit this page and check if the function is already there or not, okay? 36. Pandas - Series I: Hey guys, welcome to our new lesson on Pandas. The very first thing that we're about to learn here is Pandas Series. A Pandas Series is really similar to a NumPy array object. As a matter of fact, it is built on top of NumPy array objects. But there are also a few differences that you must know about.
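The four universal functions mentioned above can be sketched as follows, assuming `a` holds the digits 0 through 9. As before, `np.errstate` is an addition that hides the runtime warning raised by the log of 0.

```python
import numpy as np

a = np.arange(10)

roots = np.sqrt(a)    # square root of every element
exps = np.exp(a)      # e raised to every element
sines = np.sin(a)     # sine of every element, in radians

with np.errstate(divide="ignore"):
    logs = np.log(a)  # log(0) at position 0 gives -inf

print(roots)
print(exps)
print(sines)
print(logs)
```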
For example, the integer indexes that you use to access the data in NumPy array objects are implicit, while in Pandas, the indexes that you use to access the data are explicit. You'll understand this a lot better when we actually start to code. Okay, let me first go ahead and import numpy as np. And I'm also going to import pandas as pd. I'm going to start by creating four Python objects. My first Python object is going to be a list called labels, and it's going to store three values: a, b, and c. My second Python object is also going to be a list, and it's going to store three integer values. My third Python object is going to be a NumPy array. I'm just going to cast the same data into an array here. And my fourth Python object is going to be a dictionary, and I'm going to store three keys and values inside it. The first key is going to be a, and the first value is going to be one. The second key is going to be b, and the second value is going to be two. And the third key is going to be c, and the value for that key is going to be three. Okay? Alright. Now I'm going to access the Pandas Series. The syntax to access a Pandas Series is pd dot Series, but here, when you type Series, the S needs to be in caps. Now inside the parentheses, I'm first going to give labels itself, which means the list that I've created here. Let us see the output that we're going to get. If you see here, the output resembles a NumPy array object. But here the integer indexes are explicit, which means you get the indexes along with all the elements, or all the data that's there in your particular Series. This makes accessing them a lot easier. This feature is not available in NumPy array objects, which is exactly why Pandas is a clearer, more understandable version of NumPy. Let me just do this one more time. But now I'm going to show you something really interesting. If you want to know all the arguments that you can provide inside a method, there is a shortcut.
this particular method, all you have to do is press Shift+Tab here. This doesn't just apply to Series; it applies to all the methods inside Jupyter, okay? If you want to know the arguments that you can give to a particular method, all you have to do is place your cursor inside the parentheses and press Shift+Tab. This opens a help box where you can see all the arguments that you can give. All right, so these are all the arguments that can be given to a pandas Series. If you notice, the very first argument is data and the second argument is index. These are the two important arguments that need to be given to a pandas Series. So for now, my data is going to be the data list itself. All right? My very first argument was data, and this is my output. If you see here, this output is really similar to the previous output, but there is one interesting feature that you must know about. Now I'm going to access Series once again, but this time my data is going to be the same data and my index is going to point to the labels list, this one right here. If I do this and hit Execute, you can see that I've changed the names of my indexes. Usually the indexes start from 0 and go 0, 1, 2, and so on, correct? But if you want, you can go ahead and change the names of the indexes. This is one of the major advantages of using a pandas Series, because once you change the names of the indexes, you can refer to a particular value using your changed index. It makes accessing the values a lot easier. You can simplify this even further. Here data and index are in the correct order, correct? We haven't mixed up the order; we've declared the arguments in the same order as the syntax. So if you want, you can just provide the data, comma, labels directly, okay? If you do this and hit Execute, it will still work.
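Here is a rough sketch of the cells typed so far, in case you want to follow along. The three integers in my_data are assumed (the lesson only says the list holds three integer values), and the variable names s_default, s_labeled, s_positional and s_from_array are made up for this illustration.

```python
import numpy as np
import pandas as pd

labels = ['a', 'b', 'c']   # index labels from the lesson
my_data = [10, 20, 30]     # assumed values: the lesson only says "three integers"
arr = np.array(my_data)    # the same data cast to a NumPy array

s_default = pd.Series(data=my_data)                # default index 0, 1, 2
s_labeled = pd.Series(data=my_data, index=labels)  # explicit index a, b, c
s_positional = pd.Series(my_data, labels)          # same result, arguments given by position
s_from_array = pd.Series(arr, labels)              # a NumPy array also works as data

print(s_labeled)
print(s_labeled['b'])      # look a value up by its label
```

Passing the arguments by position only works because data comes first and index second in the signature; the keyword form is the safer habit if you are unsure of the order.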
You don't have to explicitly declare that you want data to be the data and labels to be the index, because you've typed them in the correct order. If you want to simplify your code even further, this is a quick little tip that you can use.

37. Pandas - Series II: If you didn't know it by now, a pandas Series is extremely flexible. So what do I mean by this? A pandas Series can hold integers. It can also hold strings. And it can also hold dictionaries. And it's not just that; it can hold anything you want. As an example, let me go ahead and provide my own input values. My first input value is going to be the built-in function sum. The second one is going to be print. And the third one is going to be, let's say, len. Okay. Even if I do this and hit Execute, a pandas Series can display it, because a pandas Series is really flexible. It can store and display anything you want. This is another important feature of the pandas Series, and it's what makes it so usable. Let us move ahead to another example. I'm going to create a new series and name it series1. This is going to be a pandas Series, okay? My first argument needs to be data. My data is going to be the numbers 1, 2, 3, 4. And my index is going to be, let's say, names. The first name is going to be David. The second name is going to be John. The third name is going to be Asha, maybe. And the fourth name is going to be, let's say, Victor, right? Let me go ahead and display this. If you see here, you can see all the indexes being displayed along with the data. Okay? Alright, let me modify this to four. Now I'm also going to create another series. Let me just go ahead and copy this. After copy-pasting it, I'm going to modify the name from series1 to series2. And here I'm going to modify the numbers also.
I'm going to make this one five and this one six. I'm also going to purposely change the names in the index. Let's say Roger, and let this one be Asha. Okay, I'm purposely changing the names in the index of the newer series. Let me go ahead and display this as well. Before we get into something else, let me show you how to extract values from a particular series. You can extract values by referencing their indexes. So what do I mean by this? I'm accessing series1, and let's say I want the value of Asha. All I have to do is give the index inside square brackets, okay? This is just like how you access the values of a dictionary using its keys. If you hit Execute, you can access the value that's stored against Asha. Similarly, let's take series2 this time, and I'm going to access the value of Roger. All you have to do is provide its index. By hitting Enter, you'll be able to access the data that's stored against Roger. Alright, now let me go ahead and modify the last name here, okay? I'm going to modify this to, let's say, Natasha, and then execute this. Now I'm going to add series1 and series2 together. Pandas is a mathematical library and it can handle mathematical operations. So let's go ahead and see the output that we're going to get. If you look at the output, we have NaN for four of the indexes, because pandas can only add values from two series if their indexes are the same. Even if some of the indexes are different, it will still add the values for which the indexes match. If you look at the output, we have numerical output for David and John, because we have David and John in both series. So pandas adds the values of David and John together, and we get an added output.
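As a minimal sketch of the two series being added, something like the following should reproduce the pattern. The names series1 and series2 and the last two values of the second series (5 and 6) are my best reading of the recording, so treat them as assumptions.

```python
import pandas as pd

# First series: values 1-4 labelled with names, as described in the lesson.
series1 = pd.Series([1, 2, 3, 4], index=['David', 'John', 'Asha', 'Victor'])

# Second series: some labels changed on purpose (values 5 and 6 are assumed).
series2 = pd.Series([1, 2, 5, 6], index=['David', 'John', 'Roger', 'Natasha'])

print(series1['Asha'])    # label lookup, like a dictionary key
print(series2['Roger'])

# Addition aligns on the index labels, not on position.
total = series1 + series2
print(total)
```

Only David and John appear in both series, so only those two labels get a numeric sum; every other label comes back as NaN.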
But if you look at the output for Asha, Natasha, Roger, and Victor, they have NaN as output, because there is nothing to add them to: there is no Asha present in the second series, for example, okay? Also, if you look at the inputs, they are integers. But if you look at the output, it has been converted to float values. This is another interesting property of both NumPy and pandas. NaN is itself a floating-point value, so when the result of adding two integer series contains missing values, the whole output is stored as floats.

38. Pandas - Dataframes: DataFrames are an integral part of pandas, and they're built on top of the Series object. Everything that we do in pandas will depend on DataFrames in one way or another. So let me start showing you what DataFrames are. First, I'm going to import numpy as np. And then I'm going to import pandas as pd, right? We'll be needing input data. For this, I'm going to import randn from numpy.random. Okay? Once I do this, I'm going to use seed with it. Now, seed is going to make sure that you and I receive the same input data. Which means if you're following me, you'll receive the same input data that I do. Alright, so I'm going to start with the DataFrame command. The basic syntax is that you must first call pandas, as pd. Once you've called pandas, all you have to do is type DataFrame. If you don't want to type it out, you can press Tab here and you can see all the methods that pandas has, and you can find DataFrame there. You can also stop midway and press Tab to complete what you're typing. This is the basic syntax of a pandas DataFrame. Now, if you press Shift+Tab, you can see the arguments that are needed for the DataFrame method. If you see here, the first argument is data, the second argument is index, and the third argument is columns. Okay, imagine a DataFrame as an Excel sheet.
To fill an Excel sheet, you'll first need input data, which is why we've imported the randn function. Then you'll need to declare the index, and then you need to declare the names of the columns, which is exactly what we need here. So first, let me start with the input data. I'm going to call the randn function, and I need a 3 by 3 matrix. Now that this is done, let me go ahead to the index. I want my first index to be a, my second index to be b, and my third index to be c. And let me move ahead to the columns. I want my first column to be named x, my second column to be named y, and my last column to be named z. Now this is pretty much it. If I go ahead and print the DataFrame that I've created, I'll receive a nice, crisp Excel-type output. If you've worked with Excel sheets or spreadsheets before, you should be familiar with this, because it resembles an Excel sheet. You have rows and you have columns, and everything is named, everything is segmented, and everything looks neat. This is why the DataFrame is so important: it makes processing mathematical data really simple. We have three columns here. The first column is x, the second column is y, and the third column is z. And they all share a common index. That's exactly what a DataFrame is: a group of columns that share a common index. Now let us say you want to extract one single column from here. Let's say you want to extract the column x. All you have to do is use square brackets, and inside the square brackets, declare the column that you want to extract. Since this is a string, don't forget to use quotes. Once you do this and hit Enter, you'll be able to extract this particular column from the entire DataFrame. And once you extract a column, it is not just a column anymore; it is a Series, okay?
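Put together, the cells described so far look roughly like this. The seed value 101 is an assumption (the recording never says which number was passed to seed), so your random numbers may not match the ones on screen.

```python
import numpy as np
import pandas as pd
from numpy.random import randn

np.random.seed(101)  # assumed seed value; any fixed number makes the data repeatable

# data, index and columns in the same order as the DataFrame signature
df = pd.DataFrame(data=randn(3, 3),
                  index=['a', 'b', 'c'],
                  columns=['x', 'y', 'z'])
print(df)

# Square brackets with one column name return that column as a Series.
col_x = df['x']
print(col_x)
```

Each column of df is a Series, and all three share the same a, b, c index.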
If you don't trust me, you can use the type function here. Inside the type function, I'm just going to copy-paste this. If I execute this, you'll see that this is not a DataFrame; it is a pandas Series. Okay? So once you extract a single column out of a pandas DataFrame, it is not a DataFrame anymore; it becomes a Series. Now let us say you want to extract two columns, because before we extracted a single column, x, correct? Let's say you want to extract y and z together. You don't want to do it separately; you want to do it together. If this is your requirement, all you have to do is use the same syntax, but inside the square brackets you need to open another pair of square brackets, and inside them type the names of the columns that you want to extract. I highly recommend that you stick to this syntax. There is another syntax in use, which I'll show you in a moment. If you do this and hit Enter, you'll see that you've now extracted two columns from this DataFrame. When you extract two columns together, the result remains a DataFrame, but when you extract one column from a DataFrame, it becomes a pandas Series. Now, I told you that there's another syntax you can use for the exact same thing. Let us say you want to extract the same x column from this DataFrame. You don't have to type square brackets at all. Instead, you can type a dot, and after the dot, name the column that you want to extract. If you do this and hit Enter, you'll get pretty much the same output. But I wouldn't recommend this syntax, because in pandas you generally use the dot to call methods, correct? Pandas has a lot of methods, and every time you use the dot, pandas first looks for a method or attribute with that name, okay?
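To make the three ways of selecting columns concrete, here is a small sketch. It rebuilds the same 3 by 3 frame (with an assumed seed of 101), and the tiny frame called clash with a column named sum is purely hypothetical, added only to show why the dot syntax can surprise you.

```python
import numpy as np
import pandas as pd
from numpy.random import randn

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(randn(3, 3), index=['a', 'b', 'c'], columns=['x', 'y', 'z'])

one_col = df['x']          # single name  -> Series
two_cols = df[['y', 'z']]  # list of names -> DataFrame
dot_col = df.x             # dot syntax, same column as df['x']

print(type(one_col))
print(type(two_cols))

# Hypothetical frame whose column name collides with a DataFrame method.
clash = pd.DataFrame({'sum': [1, 2, 3]})
print(type(clash['sum']))   # the column, as a Series
print(callable(clash.sum))  # True: the dot gives the built-in sum method instead
```

The square-bracket form never has this ambiguity, which is the reason for preferring it.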
Because of all those methods that are there in pandas, every time you use the dot, pandas is going to assume that you're about to type a method. Now let us assume the name of a column is not a single letter but a complete word that matches one of those method names, like count or sum. If you do this, pandas is going to give you the method and not your column, okay? If this happens, you might get an error further down. So I wouldn't recommend the dot syntax; try to use the square-bracket syntax wherever possible. Okay? This is going to make things really easy for you, and it's also going to reduce the chances of you getting an error.

39. Pandas - Dataframes adding & dropping columns: In our previous session, we saw how you can create DataFrames and how you can extract columns from DataFrames. In this session, I'm going to teach you how you can create or add new columns to an existing DataFrame. Suppose you just type a new column name as if you were indexing an existing column. If you do this and execute, you'll encounter an error, and it will say that it's a KeyError, because even though I plan to create a new column here, we have no input data for that new column. So when you're creating a new column, you should also provide all the input data that it needs. Let's assume I want this new column to hold the sum of the elements stored in the x column plus the elements stored in the y column. If I do this and hit Execute, I won't receive an error message, because we have now successfully created a column. If you don't trust me, let me go ahead and print the DataFrame right now. If you see here, we now have a new column, okay? And this new column is the sum of the elements in the x column and the y column. So this is how you create new columns in pandas. Now let us assume that you want to delete or drop the column that you've created.
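Here is a short sketch of the KeyError and the fix described above. The frame is rebuilt with an assumed seed of 101, and the column name new follows the error message quoted later in the lesson.

```python
import numpy as np
import pandas as pd
from numpy.random import randn

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(randn(3, 3), index=['a', 'b', 'c'], columns=['x', 'y', 'z'])

# Reading a column that does not exist yet raises KeyError.
try:
    df['new']
except KeyError as err:
    print('KeyError:', err)

# Assigning data to the same name creates the column.
df['new'] = df['x'] + df['y']
print(df)
```

The addition is done row by row, so each value in new is the x value plus the y value from the same row.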
Let's assume that you don't want this new column anymore and you want to erase it. All you have to do is access the drop method. There is a method called drop in pandas that will help you erase columns. All you have to do is call the drop method and name the column that you want to drop. Now you might think that this would work, but you're going to receive an error message. Let us see what it is. If I hit Shift and Enter, I receive an error message, like I told you. This error message tells us that new is not found in the axis. If I press Shift+Tab here, I can see the arguments, correct? And if you see here, the default axis is set to 0. If you want to erase columns, you'll have to change this 0 to 1, because the default only works if you're trying to delete rows. But we're not trying to delete rows here; we're trying to delete a column, correct? If you want to delete columns, you'll have to explicitly declare that axis is equal to 1. If you do this and hit Execute, you won't receive an error message; you'll receive a proper output with the column that you named dropped. So this is how the drop method works. But let us assume you're trying to delete a row. If you're trying to delete a row, you don't have to do this; you can leave the default, because by default axis equals 0, correct? Only if you're trying to delete a column will you have to manually change this to axis equal to 1. Now you might think that the change we made is permanent, but in reality it's not, because when I print the DataFrame now, it still has the new column inside it. Even though we've dropped the new column here, this change is not reflected in our existing DataFrame. If you want the change to be reflected in the existing DataFrame, you have to add one more argument. This is because pandas won't let you delete columns just like that.
You have to tell pandas that you're absolutely sure you don't want this column anymore. If you want the changes you made to be reflected in your original DataFrame, you have to add an argument called inplace. inplace is another argument, and by default it is set to False. So if you want to change it, you'll have to set it to True. If you only want the changes to be temporary, you can leave it as such, because when you leave it, the changes that you make will not be reflected in your original DataFrame. If you want the changes to be permanent, you have to set inplace to True. Once you do this and then print your DataFrame, you won't see the new column anymore, because the change is now permanent and it cannot be reversed.

40. Pandas - Loc and iLoc: In our previous session, we saw how we can remove or drop columns from our DataFrame. So in this session I'm going to show you how we can drop rows from our DataFrame, okay? All you have to do is the exact same thing: specify the name of the row that you want to delete. Let's say I want to drop the a row. But now you don't have to change the axis to 1, because by default the axis is equal to 0, and axis equal to 0 refers to the rows. When you want to drop rows, you don't have to modify the axis; just leave it as it is. If you do this and hit Execute, you can see that the a row is now dropped from the DataFrame. Obviously, this change is not permanent. If you want it to be permanent, you have to declare inplace here, okay? Now, I also told you how we can extract columns, correct? In our previous sessions, I showed you how we can extract the x column, or the y and z columns together. I hope you remember it, but I forgot to show you how we can extract rows.
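The drop behaviour covered so far (the axis error, axis=1 for columns, inplace, and dropping a row) can be sketched like this. The seed of 101 and the result names dropped and no_a are assumptions for the sake of the example.

```python
import numpy as np
import pandas as pd
from numpy.random import randn

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(randn(3, 3), index=['a', 'b', 'c'], columns=['x', 'y', 'z'])
df['new'] = df['x'] + df['y']

# With the default axis=0, drop looks for a ROW labelled 'new' and fails.
try:
    df.drop('new')
except KeyError as err:
    print('KeyError:', err)

# axis=1 targets columns; the result is a new frame and df itself is untouched.
dropped = df.drop('new', axis=1)
print('new' in df.columns)   # True: the original still has the column

# inplace=True modifies df directly (and returns None).
df.drop('new', axis=1, inplace=True)
print('new' in df.columns)   # False

# Rows use the default axis=0, so no axis argument is needed.
no_a = df.drop('a')
print(no_a)
```

If you prefer not to mutate anything, reassigning the result (df = df.drop('new', axis=1)) has the same visible effect as inplace=True.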
So let me show you how we can extract rows from this DataFrame. Let's say you want to extract the a row from this DataFrame. All you have to do is use the loc method, okay? loc, which refers to location, is one of the methods in pandas that will help you extract a row from an entire DataFrame. Inside its square brackets, you have to declare the row that you want to extract. Let's say I want to extract the a row. Or let's say you want to extract the c row. When you execute this, the row that you requested is returned as a Series, because as I told you, when you extract an individual column or an individual row from a DataFrame, it becomes a pandas Series, correct? This rule applies to rows as well. There is another method that will help you extract rows from a DataFrame, which is iloc. The i here refers to index and loc refers to location. Okay? If you want to extract a row using this method, you have to declare the integer position of the row. Let us assume we want to extract the b row from this DataFrame. The position of b is 1, correct? So you have to specify 1 instead of typing b here; 1 refers to the position, or the index, of the row that you want to extract. Now if I hit Execute, I've extracted the b row out of the DataFrame. All right, so these are two methods that will help you extract rows from a DataFrame. Now let's say you don't want to extract an entire series; instead, you want to extract one single element from the DataFrame. Say you want to extract this element. All you have to do is specify the row and the column that contain the element, using the same loc method. I'm going to type loc, and inside the brackets I'm first going to specify the row, and then the column. The element that I want to extract is sitting in the b row, and it's sitting in the y column, okay?
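In code, the row and element lookups just described look roughly like this. The frame is rebuilt with an assumed seed of 101, and the names row_a, row_b and value are only for illustration.

```python
import numpy as np
import pandas as pd
from numpy.random import randn

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(randn(3, 3), index=['a', 'b', 'c'], columns=['x', 'y', 'z'])

row_a = df.loc['a']       # select a row by its label -> Series
row_b = df.iloc[1]        # select a row by its integer position -> Series
value = df.loc['b', 'y']  # row label first, then column label -> single element

print(row_a)
print(row_b)
print(value)
```

Note that loc and iloc take square brackets, not parentheses, even though the lesson calls them methods.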
If I hit Execute, I've extracted this particular element from the entire DataFrame. Okay? So this is what you should use if you're planning on extracting one single element from a DataFrame. Similarly, let's say you want to extract a submatrix. Let's say you want to extract this portion, these four elements, from the DataFrame. All you have to do is use the same loc method here. But inside the brackets, you have to provide two independent lists. The first list represents the rows and the second list represents the columns, okay? Now, I want these four elements, correct? They're sitting in the a and b rows, so in the first list I'm going to declare a and b. And they're sitting in the x and y columns, so in the second list I'm going to declare x and y. If I do this and hit Execute, I've isolated a submatrix from the entire DataFrame. This is how you use the loc and iloc methods to your advantage: to extract individual elements, to extract submatrices, or to extract rows from a huge DataFrame.

41. Pandas - Conditional Selection: DataFrames are a really huge but really important concept to learn. In our previous sessions, I taught you how you can create DataFrames and extract information from them. In this session, I'm going to teach you more about conditional selection and multiple indexing. As you can see on the screen, I've added two columns and two rows to my existing DataFrame. So if you're following along, just do all the things that I'm doing and you should be fine. We've already discussed conditional indexing when we were learning NumPy, and conditional indexing in pandas is not so different from NumPy arrays. We're going to follow the same principles. Let us start by using a comparison operator. You may have already seen me using this when we were learning NumPy arrays. Okay, the concepts are similar.
Let's say I want to compare the entire DataFrame to 0. I want to check whether the values in the DataFrame are greater than 0. If I do this and hit Execute, I receive a Boolean DataFrame where the values are True and False. Now, let us go ahead and save this into a new DataFrame. I'm going to store this Boolean DataFrame in a new DataFrame called new_df. If I print new_df, I can see the Boolean DataFrame that I've created. Okay? Now I'm going to index the original DataFrame with the Boolean DataFrame. If I do this and hit Execute, I only receive the values of the elements for which the Boolean was True. Once I execute, you'll get a far better understanding. You can see that I've received NaN for the elements for which the Boolean value was False, and numeric data only for the elements where the Boolean value was True. In most cases you won't even be creating a new DataFrame. Instead, you'll follow this syntax: type the DataFrame, and inside its brackets write the comparison directly. Even if you do this, you'll extract the same elements, but you'll have reduced the number of lines of code. In most cases, though, you won't want these NaN results, because they can be confusing. If you're about to give this output to a non-mathematical person, they wouldn't understand what NaN means. So this is not how you'll generally use conditional indexing in pandas. Instead, you'll apply the condition to a particular row or a particular column. Let's say I want to check whether the w column is greater than 0. If I do this and hit Execute, I receive True or False for my comparison. Let me print the w column for us to get a better understanding. If you look here, we've received True for all the elements that are greater than 0, and False for all the elements that are less than 0.
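A rough sketch of the Boolean masks described so far follows. The 5 by 5 frame with rows a to e and columns v to z matches the description given in a later lesson, but the seed of 101 and the fact that the frame is rebuilt from scratch are assumptions, so the True/False pattern will differ from the recording.

```python
import numpy as np
import pandas as pd
from numpy.random import randn

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(randn(5, 5), index=list('abcde'), columns=list('vwxyz'))

new_df = df > 0          # Boolean DataFrame, True where the value is positive
masked = df[new_df]      # keeps values where True, NaN everywhere else
same = df[df > 0]        # the one-line form of the same thing

w_positive = df['w'] > 0 # condition on one column -> Boolean Series
print(new_df)
print(masked)
print(w_positive)
```

Masking with a whole Boolean frame keeps the shape and fills the gaps with NaN, which is why the column-level condition is usually the more practical form.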
Now what I'm going to do is apply this condition to the entire DataFrame. For this, I'm going to copy this line of code and paste it inside the brackets here. When I hit Execute, I only receive the elements for these three rows, and all the rows for which the Boolean value was False are dropped automatically. Once I execute, you'll see what I'm talking about. If you see here, I've only received the rows for which the Boolean value was True. So this is how you'll generally use conditional indexing in pandas DataFrames. Okay, for you to understand this better, let me use another example. This time I'm going to take the values inside, let's say, the x column, and let us extract the values that are less than 0 for a change. If I do this and hit Execute, I receive a Boolean Series. For comparison, let me print the raw data as well. If you see here, we've received True for the elements that are less than 0 and False for the elements that are greater than 0. So we've received an output that follows our condition. Now let us apply it to the DataFrame. I'm going to copy this condition and paste it inside the brackets. Even before I get the output, I know that we'll only receive the rows a and c, because these are the only rows for which the Boolean output is True. Once I execute, as I told you, we've only received the two rows for which the Boolean output was True. So this is how you'll normally use conditional indexing with pandas DataFrames. Now, even from this output, you can again extract columns if you want. What am I talking about? I'll show you. I'm going to create a new DataFrame, and inside it I'm going to store this line of code, okay? If I print new_df, I receive this particular DataFrame.
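Row filtering with a column condition can be sketched as below. The frame is rebuilt with an assumed seed of 101, so which rows survive will not necessarily match the rows named in the recording; rows_w is a name made up for this example.

```python
import numpy as np
import pandas as pd
from numpy.random import randn

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(randn(5, 5), index=list('abcde'), columns=list('vwxyz'))

# A Boolean Series inside the brackets keeps only the rows where it is True.
rows_w = df[df['w'] > 0]

# Same idea with a different column and comparison, stored for later use.
new_df = df[df['x'] < 0]

print(rows_w)
print(new_df)
```

Unlike masking with a whole Boolean frame, this form drops the failing rows entirely, so no NaN values are introduced.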
And from this new_df, I can again extract the v column if I want. Not just the v column; if I want to extract the w column, I can do that too. You can keep extracting submatrices, or keep narrowing down the DataFrame, as much as you want. That is the point I'm trying to make. But again, you don't need to keep declaring new DataFrames. You can use the condition straight away, and once the condition is closed, you can open brackets again and specify the column that you want to extract. This works just fine. If you're a beginner in Python, or if you're just getting introduced to pandas, this may confuse you, because it might be the very first time you're seeing so many brackets in the same line of code. But it's always good to get yourself familiar with this, because this is syntax that we're going to use a lot in this course.

42. Pandas - Multiple Conditions: Let us quickly explore the concept of using multiple conditions together in a single statement. At this point, you are comfortable with using one condition per statement. For example, let me extract the rows where the w column is greater than 0, using a syntax that looks like this: df, and inside its brackets, df's w column greater than 0. You're comfortable with a syntax like this because it only has one condition, correct? But what if you wanted to add one more condition to the statement? What if you also wanted the rows where the y column is greater than 1? Along with this condition, you want to add another condition. What can you do? Most Python beginners would try to do something like this: they use the and operator from Python. Let me try this. I'm going to take the values from the y column that are greater than 1, and enclose this within parentheses. And I'll also do the same for the first condition.
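As a sketch of the statement being typed here: the first part reproduces the and attempt, and the second part uses the element-wise operators that the lesson goes on to introduce. The frame is rebuilt with an assumed seed of 101, and the names both and either are made up for illustration.

```python
import numpy as np
import pandas as pd
from numpy.random import randn

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(randn(5, 5), index=list('abcde'), columns=list('vwxyz'))

# Python's `and` needs a single True/False on each side, so this fails.
try:
    df[(df['w'] > 0) and (df['y'] > 1)]
except ValueError as err:
    print('ValueError:', err)

# `&` and `|` compare the two Boolean Series element by element.
both = df[(df['w'] > 0) & (df['y'] > 1)]
either = df[(df['w'] > 0) | (df['y'] > 1)]

print(both)
print(either)
```

The parentheses around each condition matter: & and | bind more tightly than > in Python, so leaving them out changes the meaning of the expression.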
Now this is what most Python beginners would do. But if I execute this, I'll encounter an error, which says something about a Series being ambiguous. I'm being intentional about my error here, because I already know we're going to receive an error message. Let me just execute this and see what the message is. Let me scroll down. If you see, it says something about the truth value of a Series being ambiguous. This is because the and operator from Python can only compare two Boolean values at a time. What am I talking about? You can do something like True and False. If I do this and hit Execute, I receive a proper Boolean output, because Python's and operator can only compare two Boolean values at a time. But what we're trying to do here is compare a whole series of Boolean values at a time. Each of these conditions returns a series of Boolean values. Let me execute them individually. If you see, this one condition alone returns more than one Boolean value. We're trying to compare sets of Boolean values together, which Python's and operator cannot do. So what can we do? All you have to do is replace the and operator with the ampersand operator. Now if I do this and hit Execute, I receive a proper output. The same goes for the or operation. You cannot use Python's default or operator. Okay? If I do this and execute, I receive the same error message as before. All you have to do is replace the or operator with the pipe operator. The pipe operator can be found above your Enter key: all you have to do is type Shift plus Backslash. Let me execute this. Now this is how we can use multiple conditions together in a single statement.

43.
Pandas - Reset Index & Set Index: In this session, I'm going to show you how we can reset the existing index to the pandas default index, which is nothing but a series of numbers 0, 1, 2, 3, up to the last row. Afterwards, I'm going to show you how we can replace the index with an existing column that is already there in your DataFrame. First, let's talk about how we can reset the index to the default index. To do this, I'm first going to bring back the DataFrame that we already have. Our DataFrame has rows from a to e, and columns from v to z. Our current index is nothing but the letters a to e. Now I'm going to reset this index to the default index. To do this, you have to use the method called reset_index. This pandas method is going to help you replace the existing index with the default index. All right, let me execute this. Our index is now the default index, which is nothing but a set of numbers 0, 1, 2, 3, up to your last row. Our previous index was a set of letters from a to e, correct? This previous index of yours is saved as a new column. This is an advantage of reset_index: first of all, it replaces the existing index with the default index, and after doing this, it stores your old index as a new column inside the DataFrame. But the changes that you've made here will not be reflected in your original DataFrame. What do I mean by this? If I access my DataFrame again, you can see that the change I made is not reflected here. This is because we haven't used inplace, which you've seen already. Let me press Shift+Enter here. If you want to make the change permanent, you have to use the inplace argument and set its value to True, as we saw in one of our previous lessons.
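A minimal sketch of reset_index, assuming the same 5 by 5 frame rebuilt with a seed of 101 (the seed and the result name reset are assumptions):

```python
import numpy as np
import pandas as pd
from numpy.random import randn

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(randn(5, 5), index=list('abcde'), columns=list('vwxyz'))

# Returns a new frame with a 0..4 index; the old labels move into a column
# that pandas names 'index' because the original index had no name.
reset = df.reset_index()
print(reset)

# The original frame still has its a..e index.
print(df.index.tolist())
```

Passing inplace=True, or reassigning with df = df.reset_index(), would make the change stick.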
But right now, I'm not going to make this permanent, so I'm going to leave it as it is. If you want to make the change permanent, you can use the inplace argument to do so; unless you do this, the changes that you make here will not be reflected in your original DataFrame. Let me delete this. Now that we know how to replace the existing index with the default index, I'm going to show you how we can replace the existing index with one of the columns that you already have. To do this, I'm first going to create a new column. I'm going to create a variable called newind, and inside this variable I'm going to type a string of state names separated by gaps, like California, New York, Orlando, Wyoming, Colorado, and so on. Make sure the number of values that you type matches the number of rows that you have. We have five rows and I've typed five state names. If you happen to have six or seven rows in your DataFrame, you must type that many states here. After doing this, I'm going to use a new method called split. The split method is going to break the string apart at the spaces that were left between the names, and it's a quick way to convert a string of values into a list. What do I mean by this? Let me copy-paste this to the next line and execute it. If you see, the split method converts our string of values into a list on its own. Now I'm going to execute the newind variable and access the values inside it. If I execute this, my output is a list and not a single string. Now that we have a list on our hands, let us convert it into a column inside the DataFrame. To replace the index, we must use something called the set_index method. Previously, we used the reset_index method; now I'm going to use the set_index method. But before I go there, let me include this list as a column inside the DataFrame.
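The three steps named above (split the string, add the list as a column, then call set_index) can be sketched as follows. Two-letter state codes are used here as an assumption, because split() breaks on every space and a name like New York would turn into two items; the seed of 101 and the names newind and by_state are likewise assumed.

```python
import numpy as np
import pandas as pd
from numpy.random import randn

np.random.seed(101)  # assumed seed value
df = pd.DataFrame(randn(5, 5), index=list('abcde'), columns=list('vwxyz'))

# split() with no arguments cuts the string at whitespace and returns a list.
newind = 'CA NY WY OR CO'.split()   # assumed abbreviations, one per row
print(newind)

# The list must have exactly as many items as the frame has rows.
df['States'] = newind

# set_index returns a new frame indexed by that column; the old a..e index is dropped.
by_state = df.set_index('States')
print(by_state)
```

As with reset_index, df itself keeps its original index unless you pass inplace=True or reassign the result.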
To do this, I'm going to type df, the name of my column, which is going to be States, and assign new_ind to it. All right, let me execute this. Now if I access my DataFrame, you can see that I've included this list as a column inside the DataFrame. Now that it's here, let us make this column the index. To do this, as I told you, you must use the set_index method, and inside the set_index method, type the name of the column that you want to use as the index. If I execute this, you can see that my original index is now replaced with the values from the States column. While using the reset_index method, our old index was not discarded, correct? It was included as a new column inside the DataFrame. But when you use the set_index method, your old index is completely discarded. So using these two methods, you can replace your existing index with a new index of your own.
44. Pandas - dropna & fillna: Whenever you're working with pandas DataFrames, you are most likely to stumble upon missing values here and there, and your Jupyter Notebook or Python IDE is going to show these values as NaN or null. In this session, I'm going to show you a few ways in which you can handle missing values in your DataFrame. Before we do this, we're first going to create a DataFrame that has missing values inside it. As a first step, I'm going to show you how we can convert a dictionary into a DataFrame. The syntax is fairly simple. First of all, I need a variable, and inside this variable, I'm going to type a dictionary. As you already know, a dictionary has keys and values, correct? The keys of the dictionary are going to be the names of the columns, and the values of the dictionary are going to be the row values of our DataFrame. So I'm going to create a dictionary that has three keys inside it. My first key is going to be A, my second key is going to be B, and my last key is going to be C.
This means that the names of my columns are going to be A, B, and C. And every column requires a set of row values, correct? So that's what we're going to provide now. I want my A column to have the values 1 and 2, and the third value is going to be a null value. To create a null value, you must type np.nan; np.nan allows you to create a null value inside your data. Similarly, I want my B column to have a number followed by np.nan and np.nan. Finally, I want my C column to have no null values, so I'm just going to type 1, 2, and 3. Now let me execute this. After doing this, I'm going to convert this dictionary into a pandas DataFrame. To do this, all I have to do is type df equals pd.DataFrame, and inside the parentheses, type the name of the variable that contains the dictionary. After executing this, if I open my DataFrame, you can see that we've now created a DataFrame that has null values inside it. The A column has one null value, the B column has two null values, and the C column has no null values. Similarly, row 0 has no null values, row 1 has one null value, and row 2 has two null values. So what can you actually do with these NaN values? Whether you're going to work as a data engineer or a data analyst, you are going to encounter a lot of DataFrames like this, and they're going to have a ton of missing data inside them. What can you actually do with this missing data? You cannot always leave it blank. You have to do something, correct? So I'm going to teach you two methods. The first method is the dropna method, and the second method is the fillna method. These two methods will help you handle missing data a little better. All right, first of all, I'm going to show you the dropna method. If I execute the dropna method, it is going to remove all the rows that have a null value inside them.
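Building the DataFrame with missing values, as described above, might look like the sketch below. The first value of column B is not recoverable from the transcript, so the 5 used here is an assumption; the variable name d is also illustrative.

```python
import numpy as np
import pandas as pd

# Keys become column names; each list becomes that column's row values.
# The 5 in column B is an assumed placeholder value.
d = {"A": [1, 2, np.nan],
     "B": [5, np.nan, np.nan],
     "C": [1, 2, 3]}
df = pd.DataFrame(d)
print(df)

# Count the missing values per column and per row.
nulls_per_column = df.isnull().sum()
nulls_per_row = df.isnull().sum(axis=1)
print(nulls_per_column)
print(nulls_per_row)
```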
For example, in our DataFrame, row 1 and row 2 have a NaN value inside them, correct? If we execute this, the dropna method will remove row 1 and row 2 because they contain null values. Let me execute this and show you what I'm talking about. If you look at the output, we are only receiving the row that has no null value inside it. As you would have noticed, we were only able to remove the rows that had null values. But what if you want to remove the columns that have null values? Let me press Shift+Tab here. This will help me see all the arguments that I can pass to the dropna method. If you look, there's something called axis that's going to help us do this. By default, axis equals 0, and 0 represents the rows. If you want to remove the columns that have null values inside them, you'll have to change the 0 to 1. Let me type axis equals 1 here and execute it. We have now removed the columns that had null values inside them, and you're only seeing the column that had no null values. Again, modifying the value of the axis argument is what lets you switch between rows and columns. This is one way to remove any missing values that you may have in your DataFrame. But what if you don't want to remove the missing values, and instead want to replace them with another set of values? This is where you can use the fillna method. Before we go any further, though, I'd like to talk about another argument that dropna has, which is the thresh argument. By default, thresh is set to None, but you can replace this with a numeric value. So what can thresh actually do in your dropna method? I'll show you. First I'm going to type thresh and provide the value 2. Now, if I execute this, my output will contain all the rows that have at least two non-missing values.
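The three dropna variants above can be sketched as follows. The DataFrame is rebuilt here so the example runs on its own, and the 5 in column B is again an assumed value.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, np.nan],
                   "B": [5, np.nan, np.nan],   # 5 is an assumed value
                   "C": [1, 2, 3]})

# Default (axis=0): drop every row that contains at least one NaN.
rows_kept = df.dropna()

# axis=1: drop every column that contains at least one NaN.
cols_kept = df.dropna(axis=1)

# thresh=2: keep only rows with at least two non-missing values.
thresh2 = df.dropna(thresh=2)

print(rows_kept, cols_kept, thresh2, sep="\n\n")
```

None of these calls modifies df; each returns a new DataFrame.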
If you notice, row 0 and row 1 have at least two non-missing values, and that is exactly why we are receiving those rows in our output. But row 2 does not have two non-missing values. It has just one non-missing value, so our dropna method is dropping row 2. This is how we can use the axis and thresh arguments inside the dropna method. But what if we don't want to drop the missing values, and instead want to replace them with values of our own? This is where you can use the fillna method. The fillna method is going to help you replace the missing values with values of your own. Let me press Shift+Tab here to show you what the fillna method can do. If you look, the first argument of the fillna method is something called value. By default, it is set to None. You can replace this None with an integer value, a float value, or a string value, whatever value you want. So let me modify our missing values using the value argument. Let me replace them with a string called fill value. If I execute this, all our missing values will now be replaced with the string that we provided to the value argument. But this doesn't necessarily have to be a string. It can also be a number if you want. So let me show you an example. For this example, I'm only going to take the values of the A column. Let me replace the last, missing value of the A column not with a string, but with the average of the other two values. All I have to do is use the fillna method, and inside the parentheses, I'm going to use the value argument. The value is going to be the mean of the data inside the A column. So let us execute this and see what we get.
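A minimal sketch of the two fillna calls just described. The fill string and the 5 in column B are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, np.nan],
                   "B": [5, np.nan, np.nan],   # 5 is an assumed value
                   "C": [1, 2, 3]})

# Replace every missing value in the DataFrame with a string.
filled_text = df.fillna(value="FILL VALUE")

# Replace the missing value in column A with the mean of the
# non-missing values in that column (mean() skips NaN by default).
a_filled = df["A"].fillna(value=df["A"].mean())

print(filled_text)
print(a_filled)
```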
As you can see, we have replaced the last missing value with the mean of the other data that was inside the A column. So this is how we can use the dropna and fillna methods to work your way around any missing values that the DataFrame may have.
45. Pandas - Group By: In this session, I'm going to be talking about the groupby method and the things that you can do with it. If you're coming from a SQL background, you already have an idea about the GROUP BY clause, but if you have no prior knowledge of SQL, it's completely okay. We can start from scratch. Basically, a group by allows you to group together rows based on a column and then perform some sort of aggregate function on them. An aggregate function is not a complex term. An aggregate function is nothing but a function like sum or mean, which takes a bunch of values and reduces them to one single output. Before we proceed any further, I have already typed the code to create a dictionary. Please pause the screen and type whatever you see on the screen. This code will help us create the dictionary for the DataFrame that we need for this session. With this dictionary, let us go ahead and create the DataFrame that we want. To do this, I'm going to type the syntax that we already know: df equals pd.DataFrame, and inside the parentheses I'm going to type data. All right, let me execute this and display the values inside the DataFrame. As you can see, we have now created our very own DataFrame from the dictionary. Now let me show you how we can actually use the groupby method. Since groupby is a method, first of all I'm going to type df.groupby, and inside the parentheses you must provide the name of the column by which you want to group the data. I want to group the data based on the Company column. If I execute this, our output is going to be a DataFrameGroupBy object. Okay?
This output just shows the location in memory where the grouped data is stored. Now I'm going to store this output in a new variable called, let's say, group, and let me execute this. Every time I execute the group variable, I'll be able to access that object. We have now successfully grouped all our values using the Company column. Now let us go ahead and perform certain operations using aggregate functions. Firstly, let's use the mean method. If I execute this, you can see that my data is now grouped based on the companies, and I've also found the mean values for the sales data. Similarly, you can use the sum method if you want. The output will have the total sales of every individual company. You can also find the standard deviation if you want, using the std method. And if you pay attention, the output is actually a DataFrame, which means you can use .loc as well. I'm going to use .loc to find details about, let's say, the Facebook company. If I execute this, I will be able to access the details of this company, Facebook. Once you become familiar with the syntax, you won't be storing the groupby object inside this variable called group. You will group the data straight away and, instead of storing it inside a variable, perform operations right away. You will still receive a proper output. You don't necessarily have to store the groupby object inside a variable called group. But if you feel like you need time to familiarize yourself with the syntax, you can still choose to store the groupby object inside a variable. That's completely your choice. I personally prefer writing the code in a single line instead of complicating it. Even after finding the sum, I can still use .loc to find the details about, let's say, Facebook again. All right, it's this simple.
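The groupby steps above might be sketched like this. The lesson's on-screen dictionary is not in the transcript, so the companies, names, and sales figures below are assumptions. Also note one adjustment: recent pandas versions raise an error when mean is applied to a text column, so the sketch passes numeric_only=True, which the lesson's code may not have needed.

```python
import pandas as pd

# Hypothetical data with the three columns the lesson mentions.
data = {"Company": ["GOOG", "GOOG", "MSFT", "MSFT", "FB", "FB"],
        "Person": ["Sam", "Charlie", "Amy", "Vanessa", "Carl", "Sarah"],
        "Sales": [200, 120, 340, 124, 243, 350]}
df = pd.DataFrame(data)

# Store the groupby object, then aggregate it.
group = df.groupby("Company")
mean_sales = group.mean(numeric_only=True)
total_sales = group.sum(numeric_only=True)
std_sales = group["Sales"].std()

# The aggregated result is a DataFrame indexed by company, so .loc works.
fb_mean = mean_sales.loc["FB"]

# The same thing in a single line, without the intermediate variable.
fb_total = df.groupby("Company").sum(numeric_only=True).loc["FB"]

print(mean_sales, total_sales, std_sales, fb_mean, fb_total, sep="\n\n")
```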
Now let us have a look at other aggregate functions that you may have to use a lot. Count is another aggregate function that's really important. Right now, the count method is helping us find the number of persons and the number of sales that each company has. Similarly, there's also the max method. If I execute this, the max method will return the maximum sales of each company. But as you can see, we are also receiving the Person column, because Python can compare text based on alphabetical order. For example, S and V fall in the later part of the alphabet, correct? That is exactly why, while using the max method, we're receiving names like these. If I use the min method, I'll receive names that start with letters from the early part of the alphabet. And similarly, the min method will also give you the minimum sales that each company had. Another really important method is the describe method. This is going to help you get the most out of the data that you currently have. You'll understand what I'm talking about if I execute this. As you can see, our describe method is giving us just about every summary statistic that you could want from the DataFrame that you have. Now, if you don't want your companies to be rows and you want to transpose them into columns, all you have to do is use the transpose method. If I execute this, my companies will no longer be rows; they will be transposed to columns. You can use the transpose method to turn rows into columns and columns into rows. And even after doing this, you can further simplify the output by limiting it to an individual company. If I only want the details about the Facebook company, I can just use this syntax. If I execute it, I will only receive the details that belong to the Facebook company. So as a beginner, this is everything you need to know about the groupby method and the aggregate functions that come along with it. 46.
Pandas - Join, Merge & Concatenate: Hey guys, in this session, I'm going to show you how we can merge, join, and concatenate DataFrames using pandas. Unlike other syntaxes, the syntax for merging, joining, and concatenating is really simple. But this session is going to be quite lengthy because, even though the syntax is really simple, for the sake of teaching you I would have to create a lot of DataFrames to explain it. So I have made the job easier for you. What I've done is create all the DataFrames that we need for this lecture in advance and compile them into a Jupyter Notebook file, and I've attached that Jupyter Notebook file in the resources section of this lecture. I want all of you to download the Jupyter Notebook file and open it inside your Jupyter Notebook console. Once you open it, you'll be able to see everything that I have on my screen now. So first of all, download the Jupyter Notebook file from the resources section and open it inside your notebook console. There are three main ways of combining DataFrames together. The first one is merging, the second one is joining, and the third one is concatenating. In this lecture, we'll be learning about all three of these methods with suitable examples. We are going to start this lecture with concatenating. To explain this concept, I have created three separate DataFrames here. I first created dictionaries and then converted them into DataFrames using the pd.DataFrame syntax. So I've created three DataFrames called df1, df2, and df3. df1 has rows from 0 to 3 and columns from A to D. df2 has the exact same column names, but a different set of rows, from 4 to 7. And the same goes for df3: it has the same column names as df1 and df2, but a completely new set of rows, from 8 to 11.
So we have created three DataFrames for the concatenation section. So what really is concatenation? Concatenation basically glues DataFrames together. Keep in mind that the dimensions should match along the axis you're concatenating on. To do concatenation, you must use the pd.concat syntax, and inside the pd.concat function, you must pass in a list of the DataFrames that you want to concatenate. So shift your focus to this line of code. We've already created the three DataFrames that we need for this lecture. All we're doing in this line of code is using the pd.concat function and passing all of these DataFrames to it together. As I just told you, pd.concat is essentially going to glue all of these DataFrames together. If you execute this, you can see that the DataFrames are being placed one on top of the other. Using this syntax, our DataFrames are being concatenated along the rows. But what if you want to concatenate them along the columns? As I told you before, you must use the axis argument. By default, axis is 0, but if you want to glue them together side by side, you must change the axis from 0 to 1. If you execute this, you can see a lot of null values here. This is because all three DataFrames that we have created have separate row labels. So what am I talking about? The first DataFrame has rows from 0 to 3, whereas the second DataFrame has rows from 4 to 7, and the third DataFrame has rows from 8 to 11. This means that all three DataFrames have completely different sets of rows, but their column names, on the other hand, are the same. So pandas is trying to place these three DataFrames side by side, and for all the values that are missing, pandas fills in NaN, or null.
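The concatenation along rows and along columns described above can be sketched as follows. The cell values and the helper function make_frame are assumptions made for illustration; only the row labels (0 to 3, 4 to 7, 8 to 11) and column names (A to D) come from the lesson.

```python
import pandas as pd

def make_frame(start):
    """Hypothetical helper: four rows labelled start..start+3, columns A-D."""
    rows = range(start, start + 4)
    return pd.DataFrame({c: [f"{c}{i}" for i in rows] for c in "ABCD"},
                        index=rows)

df1, df2, df3 = make_frame(0), make_frame(4), make_frame(8)

# axis=0 (the default): stack the frames on top of each other.
stacked = pd.concat([df1, df2, df3])

# axis=1: place the frames side by side, aligned on the row labels.
# The row labels do not overlap, so most cells are filled with NaN.
side_by_side = pd.concat([df1, df2, df3], axis=1)

print(stacked)
print(side_by_side)
```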
So this is how concat works. The steps that we followed might seem really complicated to you. This is because we are creating our own DataFrames here. In reality, you wouldn't have to create the DataFrames yourself; in the real world, all you have to do is use the pd.concat function. This is everything you need to know about pandas concatenation. And if you want to change the direction in which you concatenate, you must use the axis argument. So let us move ahead to the next topic. For the next topic, I'm again creating two different DataFrames. The first DataFrame is named left, and the second DataFrame is named right. Let me just execute this real quick. As you can see, these DataFrames are slightly different from our previous DataFrames, because they have a common column called key. The key column is exactly the same in both of these DataFrames. As a matter of fact, we're going to be merging these DataFrames together using this common key column. So let me show you how. First of all, what is merging? The merge function allows you to merge DataFrames together using logic similar to joining SQL tables. But if you happen to have no idea about SQL, do not worry, because this is a really simple concept. The syntax for the merge function is pd.merge, open parenthesis and close parenthesis. Inside the parentheses, you must first provide the names of the DataFrames that you want to merge. My first DataFrame is left, and my second DataFrame is right. I want them to be merged together using the inner method. People who have an idea about SQL will know what inner means. But if you have no idea about SQL, do not worry: inner is nothing but the default value of the how argument. And we're about to merge both of these DataFrames together based on their common key column.
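A minimal sketch of the single-key merge just described. The names left, right, and key come from the lesson; the cell values are assumptions.

```python
import pandas as pd

# Two hypothetical DataFrames that share a column named "key".
left = pd.DataFrame({"key": ["K0", "K1", "K2", "K3"],
                     "A": ["A0", "A1", "A2", "A3"],
                     "B": ["B0", "B1", "B2", "B3"]})
right = pd.DataFrame({"key": ["K0", "K1", "K2", "K3"],
                      "C": ["C0", "C1", "C2", "C3"],
                      "D": ["D0", "D1", "D2", "D3"]})

# how="inner" is the default for pd.merge; it is written out here for clarity.
merged = pd.merge(left, right, how="inner", on="key")
print(merged)
```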
The on argument actually means "based on": we're going to merge both of these DataFrames together based on their common key column. So if you're going to create a DataFrame that has a different name for its key column, make sure to replace the key column's name here. If I do this and execute, you can see that we're merging both of these DataFrames together based on their common key column. Now, it is completely okay to have more than one key column, because most of the DataFrames that you see in the real world will have more than one key column, or common columns. So here is a similar example. Both of the DataFrames that we're creating here have more than one key column, and all you have to do is mention the names of all the key columns inside your on argument using square brackets. This is the only difference between merging DataFrames that have one key column and merging DataFrames that have two key columns. There's nothing to worry about, because the syntax is really simple. It might look a bit complicated to you right now because we are creating the DataFrames on our own, but in reality, the syntax for merging DataFrames together is short. Okay, this is everything you need to know about merge. Now let us move ahead to joining DataFrames together. Joining is a convenient method for combining the columns of two potentially differently indexed DataFrames into a single result DataFrame. So what do we mean by this? In merging, we combine DataFrames together using a common key column, correct? In joining, you join DataFrames together using their indexes. Let me be really clear. If you are going to merge DataFrames together, you'll merge them using their columns. But when you're going to join DataFrames together, you're going to be joining them based on their index. So for the joining section, I have again created two different DataFrames. The first DataFrame is the left DataFrame, and the second DataFrame is the right DataFrame.
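To make the column-versus-index distinction concrete, here is a sketch of a two-key merge next to an index-based join. All values and the variable names left_i and right_i are assumptions; the how values are written out explicitly so the behaviour does not depend on any default.

```python
import pandas as pd

# Merge on two key columns: pass the column names as a list to `on`.
left = pd.DataFrame({"key1": ["K0", "K0", "K1", "K2"],
                     "key2": ["K0", "K1", "K0", "K1"],
                     "A": ["A0", "A1", "A2", "A3"]})
right = pd.DataFrame({"key1": ["K0", "K1", "K1", "K2"],
                      "key2": ["K0", "K0", "K0", "K0"],
                      "C": ["C0", "C1", "C2", "C3"]})
merged = pd.merge(left, right, on=["key1", "key2"])

# Join works on the index instead of on columns.
left_i = pd.DataFrame({"A": ["A0", "A1", "A2"]}, index=["K0", "K1", "K2"])
right_i = pd.DataFrame({"C": ["C0", "C2", "C3"]}, index=["K0", "K2", "K3"])
inner = left_i.join(right_i, how="inner")   # labels present in both
outer = left_i.join(right_i, how="outer")   # every label from either side

print(merged, inner, outer, sep="\n\n")
```

The inner result keeps only K0 and K2, while the outer result keeps all four labels and fills the gaps with NaN.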
And this is the syntax for joining both of these DataFrames. We type the name of the first DataFrame, then .join, and inside the parentheses you must give the name of your second DataFrame. Now, I have already told you that there is an argument called how, and that inner is its default value in merge. The join method also takes a how argument, but there the default is left, which keeps every row of the DataFrame you call join on. You can also choose to join your DataFrames using an inner join or an outer join; to do that, you'll have to manually set the value of the how argument. And again, if you have no idea about SQL, it is completely okay to not know about the inner and outer keywords. I just want you to know that these are just ways of joining two DataFrames together: inner join is one method and outer join is another. Let me give you a shorthand to understand inner and outer joins a little better, a beginner's version of what they are. When you use an inner join, the output keeps only the index labels that appear in both DataFrames, so the join itself does not introduce any missing values. But when you use an outer join, the output keeps every label from both DataFrames, so it can have rows with missing values inside them. So inner join means no missing values from the join, whereas outer join means that you can still include rows with missing values in the output. This is one way to get a quick introduction to inner and outer joins.
47. Pandas - Operations: In this lecture, we're going to learn some of the important pandas operations, as well as some useful operations that we haven't gone over yet. But before we do this, I want you to create the DataFrame that we need for this lecture. To do this, just copy whatever you see on the screen right now. This code will allow you to create the DataFrame that we need for this lecture. All right, so pause the screen, copy everything that you see on the screen, and just hit Execute.
Once you execute this line of code, you'll be able to replicate the DataFrame that you see on the screen. The first operation that we're about to see is going to help us identify the unique values in a DataFrame. For example, let us assume that I want to find the unique values in column 2. As you can see with the naked eye, column 2 has one value that is repeated: the value 444 appears twice, which means 444 is a repeated value, correct? But what if you want to identify the unique values that are there in a DataFrame? Fortunately, our DataFrame is really small. But imagine having a DataFrame that has hundreds or thousands of rows in it. How do you think you can identify the unique values? To do this, we have a method called unique in the pandas library, and let me show you how it works. For now, I'm only going to take column 2, and to column 2 I'm going to apply the unique method. If I execute this, it gives me the unique values. It won't show a repeated value more than once: 444 is repeated twice, which is exactly why our unique method is eliminating one of those occurrences from our output. Whenever you use the unique method, you will only receive the unique values in your given column. Now, what if you don't want your output as an array, but instead you want to find the number of unique values as an integer? There are two ways in which you can do this. You can use the len function and enclose the entire line of code inside its parentheses. If you execute this, it will give you the number of unique values that you have. This is one way to do it, but instead, you can remove the len function and use nunique. The nunique method will help you identify the number of unique values that you have in your DataFrame or your column. If I execute this, I'll essentially get the same output again.
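The unique and nunique calls above can be sketched as follows. The lesson's on-screen DataFrame is not in the transcript, so the column names and the third column are assumptions; only the values 1 to 4 in the first column and the repeated 444 in the second are taken from the narration.

```python
import pandas as pd

# Hypothetical reconstruction of the lesson's DataFrame.
df = pd.DataFrame({"col1": [1, 2, 3, 4],
                   "col2": [444, 555, 666, 444],
                   "col3": ["abc", "def", "ghi", "xyz"]})

# unique() returns an array in which each distinct value appears once.
uniques = df["col2"].unique()

# Two ways to count the distinct values.
count_with_len = len(df["col2"].unique())
count_with_nunique = df["col2"].nunique()

print(uniques, count_with_len, count_with_nunique)
```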
Now what if you want to create a table of the unique values along with a column that says how many times each has occurred? You can use a method called the value_counts method. To do this, I'm again going to use column 2 of our existing DataFrame, and to it I'm going to apply the value_counts method. If I execute this, I receive a table that has all the unique values along with another column that says how many times each has occurred in my DataFrame. And we can do the same for column 1 as well. Let me just execute this. As you can see, column 1 has no repeated values, which is exactly why we're getting all the row values that are there in column 1, and each value appears just once. Now let us move ahead to another important method that's there inside the pandas library. This is one of the most powerful methods in your pandas tool belt. It is called the apply method. But first, what if you want to create your own function? Let me just create my own function here. I'm going to create a function that calculates the square of any data that I give it. If I provide 2 as an input, it should return 2 to the power of 2, which is going to be 4. So I'm creating a function that returns the square of whatever input I give it. Inside the parentheses, I'm going to provide x, and I want it to return the square of my input. Until this point, we have only been using functions that are already built into pandas, correct? But right now I'm going to show you how we can apply our own custom-made functions to a DataFrame using the apply method that I just told you about. Let me just execute this. Now let us take column 1 of the DataFrame, type .apply, and inside the parentheses provide the name of the custom function that we've created. Let's execute this and see the output that I'm going to get.
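As a sketch of value_counts and of apply with a custom function: the DataFrame and the function name square are assumptions, chosen to match the narration's description of a function that returns its input raised to the power of 2.

```python
import pandas as pd

# Hypothetical reconstruction of the lesson's DataFrame.
df = pd.DataFrame({"col1": [1, 2, 3, 4],
                   "col2": [444, 555, 666, 444],
                   "col3": ["abc", "def", "ghi", "xyz"]})

# value_counts() lists each distinct value with the number of times it occurs.
counts = df["col2"].value_counts()

def square(x):
    """Return the input raised to the power of 2."""
    return x ** 2

# apply() calls the function once for every value in the column.
squared = df["col1"].apply(square)

print(counts)
print(squared)
```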
As you can see, column 1 originally had the values 1, 2, 3, 4. But if you look at our output, our custom function is now returning the square of all the values that were there in the input. Now, Python has another interesting feature called lambda. What if you're only going to use this function once in your entire code? If I'm only going to use a function once, there is no point in actually defining a function, correct? You create a function in the hope that you're going to use it in a lot of different places in your program. But if you're only going to use the function once in your program, you don't actually have to define a function; instead, you can use something called a lambda. Let me show you how we can actually create this. All you have to do is replace the name of the function with lambda, leave a space, and type what you want your function to do. I want my lambda to take the input x, and I want it to return x to the power of 2. Let me just execute this. As you can see, I'm getting the exact same output that I did before. So if you are only going to use this function once in your code, you don't actually have to write a function; instead, you can use the lambda syntax that I just told you about. Now let us move ahead to the next important operation, which is going to be the drop operation. What if you want to drop a certain column or a certain row from your DataFrame? All you have to do is use the drop method. If you want to know about all the arguments that you can provide inside the drop method, just place your cursor inside the parentheses and press Shift plus Tab. Now let us just expand this. This will show you all the arguments that you can provide to your drop method. My first argument is going to be the name of the column that I want to drop, which is going to be column 1, for example.
Now since this is a column, you'll have to change the axis argument from the default 0 to 1, because we're going to be removing a column here and not a row. Now let me just execute this. Column 1 has now been removed from the DataFrame. But again, any change that you make here will not be reflected in your original DataFrame. So what do I mean by this? If I just execute df again, you can see that the change I made is not reflected here. This is because I'm not using the inplace argument. If you want to make the changes permanent, you'll have to change the inplace argument from the default False to True. But we're not going to be doing this right now, so I'm just going to remove this. Now what if you want to extract the names of your columns or your index? Since those are attributes of the DataFrame, all you have to do is type df.columns and hit Enter. This will extract all the column names from the DataFrame. The same goes for the index. Just type df.index and hit Enter. This will extract the values that are there in your index. Since our index is not a custom-made index but the default RangeIndex, we're just receiving the start value, the stop value, and the step size. These two operations will come in handy in a lot of different situations. Let me just bring up my DataFrame again. The last operation for this lecture is going to be sorting. What if you want to sort the values inside a DataFrame in a particular order? For example, let us assume I want to sort my DataFrame in ascending order of the values inside column 2. What I have to do is use the sort_values method. To access it, you have to type df.sort_values, open parenthesis and close parenthesis. Place your cursor inside the parentheses and press Shift plus Tab to see all the arguments that this method can take.
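The lambda, drop, columns, and index steps above can be sketched as follows, again on an assumed reconstruction of the lesson's DataFrame.

```python
import pandas as pd

# Hypothetical reconstruction of the lesson's DataFrame.
df = pd.DataFrame({"col1": [1, 2, 3, 4],
                   "col2": [444, 555, 666, 444],
                   "col3": ["abc", "def", "ghi", "xyz"]})

# A lambda does the same job as a one-off named function.
squared = df["col1"].apply(lambda x: x ** 2)

# axis=1 tells drop to look for "col1" among the columns, not the rows.
# Without inplace=True, df itself keeps the column.
without_col1 = df.drop("col1", axis=1)

# columns and index are attributes, so there are no parentheses.
column_names = df.columns
row_index = df.index   # a RangeIndex with start, stop, and step

print(squared, without_col1, column_names, row_index, sep="\n\n")
```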
As you can see, the first argument is something called by. The by argument is asking you for the name of the column or row by which you want to sort your DataFrame. Since by is the first argument, I don't have to type by equals. I can straight away type the column name that I want. Let us assume I want to sort my DataFrame using the values from column 2. Let me just execute this. Now, even with the naked eye, you can spot the difference between this output and the previous output. In our latest output, we have sorted the entire DataFrame using the values that belong to column 2, and these values are arranged in ascending order. As I told you, typing by equals here would make no difference, because by is the first argument, so you don't actually have to mention it. There's one last method I want to show you in this lecture, which is the isnull method. The isnull method is going to help you spot the null values that are there in your DataFrame. To access it, just type df.isnull, open parenthesis and close parenthesis, and hit Enter. You'll receive a DataFrame of Boolean values. These Boolean values are telling you that there are currently no null values in your DataFrame. If there were any null value, you wouldn't receive False there; instead, you'd receive True. These are all the essential and useful functions and operations that pandas has.
48. Pandas - File Processing: Pandas can read and extract data from a bunch of different file formats. In this lecture, we're going to extract data from CSV, Excel, and HTML files, but there are a lot of other file formats that you can read through pandas. Before you actually do this, you must go to your command prompt and install all the libraries that you see on your screen right here.
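Before the file-processing steps, the sort_values and isnull calls from the operations lesson above can be sketched as follows. The DataFrame is an assumed reconstruction of the one used in that lesson.

```python
import pandas as pd

# Hypothetical reconstruction of the operations lesson's DataFrame.
df = pd.DataFrame({"col1": [1, 2, 3, 4],
                   "col2": [444, 555, 666, 444],
                   "col3": ["abc", "def", "ghi", "xyz"]})

# by is the first argument, so these two calls are equivalent.
sorted_df = df.sort_values("col2")
sorted_df_named = df.sort_values(by="col2")

# isnull() returns a DataFrame of booleans: True wherever a value is missing.
null_mask = df.isnull()

print(sorted_df)
print(null_mask)
```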
Now, most of these libraries would already have been installed along with the Anaconda distribution. But just to be sure, go to a command prompt and try installing them again. Also, download all the files that are given in the resources section and save them inside the path where you've opened the Jupyter Notebook. This is really important. Once you download the files and extract them, they must be stored inside the same path where you've currently opened your Jupyter Notebook. How can you actually identify the path? I'll tell you how. First just import pandas as pd, and then type pwd and hit Enter. Once you do this, you'll be able to see the current path where you've opened your Jupyter Notebook. So all the files that you download from the resources section must be saved in the exact same path that you see on the screen. Now, based on where you opened your Jupyter Notebook, your file path might not match my file path. So do not worry, but make sure you're storing all the files in the same file path that you're getting on your screen. So the first file we're going to try to read is our CSV file. To read a CSV file, just follow the syntax that I'm going to type. Just type pd.read_csv, open parenthesis and close parenthesis, and inside quotation marks, type the file name. But since we're saving the files inside the same location where you opened your Jupyter Notebook, you don't actually have to type the entire file name. Just type a few letters and hit Tab, and this should automatically fill in the rest of the letters. Now after you do this, hit Enter, and you should be able to extract the data that was stored inside the CSV file. If you want to see all the file formats that you can open with pandas, just type pd.read_ and hit Tab. You can actually see a scrollable list displaying all the file formats that you can open using pandas. You can read data from your clipboard.
You can read CSV, Excel, Feather, and a lot of different file formats, and this includes HTML and JSON files. But for now, I'm just going to remove this. I'm going to store this data inside a variable called df, all right, like we always do. And after doing this, I'm just going to print df here. Now, this is the syntax you must follow in order to open a CSV file in pandas. Now what if you want to write to a CSV file? The syntax is really similar. All you have to do is type df.to_, and just like we did before, hit Tab to see all the file formats that you can write to. Right now, our output is going to be a CSV file as well, so df.to_csv, open parenthesis and close parenthesis. The first argument is the name of the new file that you want to create. I'm going to name this my output, but you can name it whatever you want. Essentially, I'm going to store the same data, but in a new CSV file. The second argument must be the index argument, and you must set it to False. Before I do this, let me show you what happens when you don't set index to False. Okay, let me just erase this and execute this. After doing this, let me try to read the CSV file that I've created, using the read_csv method. Inside the method, I have to provide the name of the new CSV file that I've created. All right, let me hit Enter. If you see, since I have not declared index equals False, my old index is being included as a separate column. If I don't want this to happen, I have to type index equals False. I'm going to execute this cell again, and I'm going to execute this cell again as well. If you see, by doing this, pandas is no longer storing your old index as a new column inside your new CSV file. All right, so this is everything you need to know about reading and writing CSV files using pandas. So let me show you how we can open Excel files using pandas.
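To make the index behaviour concrete, here is a small sketch of the CSV round trip. The data and the file names are made up for illustration, and the files go into a temporary folder rather than the notebook's working directory.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data standing in for the course's CSV file.
df = pd.DataFrame({"name": ["a", "b"], "score": [1, 2]})

with tempfile.TemporaryDirectory() as folder:
    with_index = os.path.join(folder, "with_index.csv")
    without_index = os.path.join(folder, "my_output.csv")

    # Default: the row index is written as an extra, unnamed first column.
    df.to_csv(with_index)
    # index=False: only the data columns are written.
    df.to_csv(without_index, index=False)

    cols_with_index = list(pd.read_csv(with_index).columns)
    cols_without_index = list(pd.read_csv(without_index).columns)

print(cols_with_index)     # includes an "Unnamed: 0" column holding the old index
print(cols_without_index)  # just the original columns
```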
For this, the syntax is very similar. Just type pd.read_excel this time, because last time we used read_csv to open a CSV file, and now we're going to use read_excel to open an Excel file. All right, this is the Excel file that I've attached in the resources section. Make sure you download it, and as I already told you, save it in the same path where you opened your notebook. As you can see, the data inside our Excel sheet is really similar to the data that we had in our CSV file. Also, if you notice, Excel files have a feature called sheet names. If you pay close attention to the bottom left corner, you can see the sheet names here. Currently, our data is being stored in Sheet1. But in your case, if you download an Excel file from the internet, it can possibly have multiple sheets, with each sheet having its own data. When you open Excel files, pandas is going to treat each sheet as an individual DataFrame. So while extracting data from Excel sheets, it is always important to properly declare your sheet name. Currently we have just one sheet, so it is not essential for us to mention the sheet name, but as a good practice, you should always mention the sheet name clearly. So let me move to the Jupyter Notebook again. pd.read_excel, open parenthesis, close parenthesis. Inside quotation marks, I'm just going to type Excel and press Tab to autocomplete. As I already told you, the second argument needs to be sheet_name equals Sheet1. Let me execute this. If you see, we have now extracted the data stored inside our Excel file into pandas. But what if you want to write data into an Excel sheet using pandas? All you have to do is type df.to_excel, and inside the parentheses, as the first argument, provide the name of the new Excel file that you want to create. Right now I'm just going to name it Sample 2 dot xlsx.
And again, the second argument has to be sheet_name equals Sheet1. Let me execute this. Let me go to my file path and check if my new Excel file has been created. If you see, the Excel file that I tried to create is now there in my folder. This means that the operation I tried to perform has been successful. So we now know how to extract and write data with CSV and Excel files. The last file format that we're going to see in this session is HTML. As an example, I've selected this HTML page. This HTML page has a list of all the failed banks in the United States. So let me show you how we can extract tables from this website using pandas. The process is really simple. Just type pd.read_, and as you might have guessed already, just type html, open parenthesis, close parenthesis, and inside quotation marks, you must provide the URL of this website. All right, I'm just going to paste the URL here. After you do this, just hit Execute. This might take a while. If you see, pandas has tried its best to extract all the elements that are inside tables. We're going to store this inside a variable called data, and we're going to print data again. Now if you're wondering how pandas is actually doing this, go to the web page, right-click, and click on View Page Source. If you're already familiar with HTML, you'll know that it's actually easy to find table elements. This is the HTML code that the website is made of. Our pandas library is just looking for the table elements inside this particular website. Let me go back to Jupyter again. If you want your output to look a little more streamlined, just use indexing: type 0 here and hit Enter. Right now, we have pulled the table elements from the website into a pandas DataFrame. So this is how we can actually extract and write values with CSV, Excel, and HTML files using pandas. 49. Matplotlib - Introduction: Hey guys.
So I believe you would have learned a lot about creating and processing DataFrames using NumPy and pandas in the previous lessons. But from here on, I want to teach you how we can actually visualize the data inside DataFrames. To do this, Matplotlib is the very first data visualization library we're going to use. So why should one actually use Matplotlib? Matplotlib is one of the most popular plotting libraries in Python. Almost every single data analyst on the face of the earth uses Matplotlib in one way or another. On top of this, it also works well with many operating systems and graphics backends. It can render high-quality graphics and plots to print and view for a range of graphs: histograms, bar charts, pie charts, scatter plots, and heatmaps. Matplotlib is also a multi-platform data visualization tool that's built on NumPy arrays and designed to work with the broader SciPy stack, which means that Matplotlib can be fast and efficient. And the last but most important point is that Matplotlib has a really large community and cross-platform support. So what do I mean by this? Matplotlib has been around for a really long time, which means a lot of people are using and have used Matplotlib in the past. This means that for whatever question you may have now or in the future, there are always people on the internet who can help you. So before we go ahead and use Matplotlib, I want you all to install Matplotlib on your system. First go to your terminal and type the commands that you see on the screen to install Matplotlib on your system. After you do this, I want to give you a quick tour of Matplotlib's website. So this is Matplotlib's official documentation site. When you scroll down, you can actually see a ton of information about Matplotlib, its documentation, libraries, and a lot more. But probably the most important thing on this website is this Examples tab right here.
If you click here, you will be able to see a ton of plot names as you scroll down this website. Let me scroll down so you can get a quick understanding of what I'm talking about. Each plot name has a lot of different examples. So let us say you want to know more about the hat graph right here. All you have to do is click on it, and you will be redirected to a page that has information and code relating to hat graphs. Similarly, this site has hundreds of plots like this that you can visualize using Matplotlib. Later in the course, we'll also be learning about another data visualization library called Seaborn, which is a little more sophisticated than Matplotlib, but Seaborn is again built on top of the Matplotlib library. So if you want to learn Seaborn later in the course, you must first understand Matplotlib and its functionality. 50. Matplotlib - Plotting A Simple Graph: Let's actually start building plots using Matplotlib. I've typed all of this inside my Jupyter Notebook. Don't worry, I'm going to attach this notebook as a resource, so if you want, you can download it and open it inside your Jupyter Notebook. So first of all, I'm going to import Matplotlib using import matplotlib.pyplot as plt, followed by percent matplotlib inline. This command is going to help us display the output of Matplotlib inside the Jupyter Notebook. If you're not following this course in Jupyter Notebook and you're using another editor, you might have to type a command like plt.show every time you want to display your Matplotlib output. Okay? But if you're taking this course in Jupyter Notebook, you don't have to type this command. Do not worry if you're studying this course in another editor; I'm going to show you how we can use plt.show to display our output. For now I'm just going to execute this. Now, before we start plotting, we need something to plot, correct?
So I'm first going to create a few NumPy arrays. Import numpy as np. The first thing I'm going to create is the array x equals np.linspace. This is going to give us linearly spaced values. I'm going to start with 0 and end with 5, and I want 11 values. Following this, I'm going to create a y array, which is going to be the values of x to the power of two. Let me print the x array, and let me also print the y array. Right now, I have two different arrays that I've created using NumPy. So let me actually plot a graph using the x array on the x-axis and the y array on the y-axis. In Matplotlib, there are two ways in which you can plot a graph. The first one is the functional way and the second one is the object-oriented method. The object-oriented method is actually the right way to do things. But since we're absolute beginners, I'm going to teach you the functional method first. To plot with a Matplotlib function, the syntax is really simple. All you have to do is type plt.plot, and inside the parentheses, just provide your x and y inputs. Once you do this and execute, since you're using the matplotlib inline command, Matplotlib displays the output inside your Jupyter Notebook. Now, like I just told you, if you're taking this course in a different IDE, you will have to include another command, plt.show, open parenthesis, close parenthesis. This is nothing but a print statement. You're just providing a command to print the Matplotlib figure as output inside the console. Now if I do this and execute, my Jupyter Notebook would print the output instead of displaying it. Let me just comment this out. If I execute this cell now, you can see something called Out here. But if I include this command and hit Execute, you would not see the Out label anymore, because my Jupyter Notebook is not displaying the output, but printing it.
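Here is a minimal sketch of the functional approach described above, written as a plain script rather than a notebook cell. The Agg backend line is an assumption made only so the script runs without a display; in Jupyter you would use the inline command from the lesson instead.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside Jupyter
import matplotlib.pyplot as plt
import numpy as np

# 11 linearly spaced values from 0 to 5, and their squares.
x = np.linspace(0, 5, 11)
y = x ** 2

# Functional style: pyplot draws on an implicit "current" figure.
lines = plt.plot(x, y)

# In a regular script with a GUI backend, calling plt.show() here opens the window.
print(x)
print(y)
```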
All right, so if you're taking this course in another IDE, make sure you include this line every time you want to print a Matplotlib graph. I'm just going to remove this for now. As a quick analogy, let me show you what displaying and printing an output mean. This is completely unrelated to Matplotlib; I just wanted to show you this. If you just want to display an output, all you have to do is type a string like this and execute. If you see, we are now receiving an output cell here. But instead of doing this, if I include a print function, enclose the string inside the parentheses, and hit Enter, I will no longer be able to see the output cell, because we are printing the output instead of displaying it. All right, so this is the difference between displaying an output and actually printing it. Again, this is totally unrelated to Matplotlib. I just wanted to show you how it works. Let me remove this and execute an empty cell. Great. So let us get back to our Matplotlib graph output. This is just a really simple graph. There are a lot of different things you can do to a graph like this. For example, if you want to change the color of this line, all you have to do is include another argument, a color string such as r for red. If I execute this, you'll be able to see that I've changed the color of the line from blue to red. We're going to be learning about a lot of functionality like this in the future. But right now, for this lecture, I just want to keep things really simple for you. Now, what if you want to provide a name for the x- and y-axis of this graph? And what if you want to provide a title for this graph? The syntax for doing this is plt.xlabel. The xlabel method is going to allow you to give a name to the x-axis. All right? Now I just want my x-axis to be named x label. If you see, my x-axis is now labeled x label. All right, now similarly, let me name my y-axis. plt.ylabel, open brackets, close brackets. I'm just going to name it y label.
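The color and axis-label steps so far might look like the sketch below. The transcript does not make the exact color argument clear, so the "r" format string here is an assumption; the Agg backend line is likewise only there so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for running as a script
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)
y = x ** 2

# Third positional argument is a format string; "r" asks for a red line.
line, = plt.plot(x, y, "r")

# Name the two axes of the current plot.
plt.xlabel("x label")
plt.ylabel("y label")

ax = plt.gca()  # the axes pyplot is currently drawing on
print(ax.get_xlabel(), ax.get_ylabel())
```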
Now finally, let me provide a title for my graph. The syntax for doing this is plt.title. The title method is going to help you name your graphs. I'm just going to name it Title for now. If I execute this, I would have named both my x-axis and y-axis, and I've also provided a title for my graph. Now let me include this line here. Great, and let me execute this right away. So this is how easy it is to plot a graph using Matplotlib and to label the x- and y-axis. 51. Matplotlib - Multiple Plots Inside Same Canvas: Right now we have just drawn one plot on a given canvas. But you can actually draw multiple plots on the same canvas, and let me show you how. For this, we're going to be learning about the concept of subplots. The syntax for subplots is plt.subplot, and inside the parentheses there are three arguments that you must provide. The first argument is the number of rows. I'm just going to provide one here. The second argument is the number of columns, and I want the number of columns to be two. The third argument is the number of the subplot that you're referring to right now. This is going to be the very first subplot that we're drawing, so I'm going to type one here. After doing this, we at least have to provide some inputs that can be plotted. So I'm just going to type plt.plot x comma y. And just to differentiate the plots, I'm going to change the color of the first plot to red. After doing this, I'm going to move ahead to my next plot, plt.subplot one comma two, and this time, this is the second subplot that I'm drawing, so I'm going to type two. If you don't seem to be getting the hang of things right now, do not worry. You'll understand this a lot better when we jump to the object-oriented approach. Right now, I just want to show you what's possible. After doing this, I'm going to provide input for the second plot, plt.plot. But this time I'm going to plot y versus x, not x versus y.
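A sketch of the two side-by-side subplots being built here, in the functional style. The color strings are my reading of the lesson (red for the first plot, a contrasting blue for the second), and the Agg backend line is an assumption for running outside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for running as a script
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)
y = x ** 2

plt.figure()

# subplot(rows, columns, position): 1 row, 2 columns, first slot.
plt.subplot(1, 2, 1)
plt.plot(x, y, "r")

# Same grid, second slot; this time y goes on the horizontal axis.
plt.subplot(1, 2, 2)
plt.plot(y, x, "b")

fig = plt.gcf()
print(len(fig.axes))  # number of subplots on this canvas
```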
Now just to differentiate, I'm going to change the color of this plot to blue. Let me execute this. If you see, we have now drawn multiple plots inside the same canvas. 52. Matplotlib - Object Oriented Plots: Let us move ahead to plotting Matplotlib graphs using the object-oriented approach. One major difference between the functional approach and the object-oriented approach is the level of control that you have over your graphs. Okay, while using the object-oriented approach, you are going to be in complete control of the graphs you create. The way the object-oriented approach works is that you first create a figure object and then call methods off of it. So what do I mean by this? Let me actually start to code and you'll start to understand this a lot better. So like I just told you, first you have to create a figure object. The syntax for creating a figure object is as follows. I'm just going to name my figure fig equals plt.figure, open parenthesis and close parenthesis. If I execute this, you can see that I've actually created a figure object. Inside this figure object is where we're going to create a canvas, and inside the canvas, we're going to plot a graph. So after creating the figure object, we must start adding axes to it. First, you must provide a name for each set of axes that you create. I'm just going to name it axes, equals fig.add_axes. This add_axes method accepts an argument which is actually a list. The list we're about to pass takes four values. The first value is the left position, the second value is the bottom position, and the third and fourth values are the width and height of the axes. Please note that all of these values should be decimal numbers between 0 and 1, because they are fractions of the figure's width and height.
Now I want my left edge to be placed at 0.1, I want my bottom edge to be placed at 0.1, and I want the width and height of the axes to be 0.8 and 0.8. I know that as beginners you won't understand this immediately, but once we see more and more examples of creating axes, you'll understand this a lot better. Let me just execute this right away. If you see, first of all, I created a figure object, and after creating it, I'm adding my very own axes to the figure object. Imagine the figure object as a blank canvas, and you're the artist; you have to draw inside the figure object that you've created. Since we're passing in an entire list, we have complete control over the graph that we're trying to plot. Right now our canvas is empty, so let us go ahead and provide an input for it. I'm going to type axes.plot and then x comma y. This should be enough for now. Let me just execute this. We have essentially done what we previously did using the functional method, but using the object-oriented approach, we have complete control. Okay, that is the point I'm trying to make. Similar to the functional method, you can label your x and y axes and provide a title for your graph as well, but the syntax is slightly different. Let me show you what the syntax is. The syntax for providing a name for the x label is axes.set_xlabel. The syntax is slightly different when it comes to the object-oriented approach. Inside this, I'm going to provide x label, and I'm going to do the exact same for the y label as well. I'm going to name this y label. Let me copy-paste this one more time. I'm going to replace this with set_title, and the title of my plot is going to be object oriented graph. Let me execute this. If you see, I have now provided a name for my x label and y label, and I've also provided a title for my graph. But the syntax was slightly different.
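Put together, the object-oriented version of the same plot might look like this sketch. Note that the four numbers passed to add_axes are left, bottom, width, and height, each as a fraction of the figure; the Agg backend line is an assumption for running outside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for running as a script
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)
y = x ** 2

# Step 1: an empty figure, the blank canvas.
fig = plt.figure()

# Step 2: add axes at [left, bottom, width, height], in figure fractions.
axes = fig.add_axes([0.1, 0.1, 0.8, 0.8])

# Step 3: call methods on the axes object instead of on plt.
axes.plot(x, y)
axes.set_xlabel("x label")
axes.set_ylabel("y label")
axes.set_title("object oriented graph")

print(axes.get_position().bounds)  # the rectangle the axes occupy
```

The set_ prefix is the main syntax difference from the functional calls plt.xlabel, plt.ylabel, and plt.title.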
Now, I know a lot of people won't have understood this particular line of code. I know a lot of you are really confused. So right now I'm going to try to draw multiple plots inside this canvas, and I'm going to show you what you can actually do by modifying the input inside this line of code. Let me just open a new cell. I'm going to create a new figure object, fig equals plt.figure. I've created a new figure object, but this time I'm going to create two axes inside the figure object that I've created. Okay, I'm going to name my first axes axes1 equals fig.add_axes. Inside the first list, I'm going to provide the same inputs that we gave before. Let me just copy-paste this and execute. And let me create the second axes, axes2 equals fig.add_axes. But here I'm not going to give the same inputs; I'm going to alter them slightly. I want my left to be 0.2, I want my bottom to be 0.5, I want my width to be 0.4, and I want my height to be 0.3. There is no compulsion to type the exact same numbers I've typed here. I've just typed a different set of decimal numbers so that you can understand this concept a little better. Let me just execute this. If you see, I have placed the second plot inside the first plot because its dimensions are different. Now I want all of you to pause the screen for a second, modify the decimal numbers, and see the plots that you get. For example, let me give you a quick problem. I want you to place the second graph in the bottom right corner of the first plot. Okay, I just want you to move this plot from here to the bottom right corner here. All right, let me work along with you. First of all, I'm going to try changing the left from 0.2 to 0.8. Let's see what happens. All right, so this doesn't seem to work. Let me reduce this a bit. I'm going to change this from 0.8 to 0.5. Let me just execute this again.
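For anyone following along with the exercise, here is a sketch of the starting point: one large set of axes and a smaller one placed inside it. The variable names axes1 and axes2 follow the lesson; the Agg backend line is an assumption for running outside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for running as a script
import matplotlib.pyplot as plt

fig = plt.figure()

# Each list is [left, bottom, width, height] as fractions of the figure.
axes1 = fig.add_axes([0.1, 0.1, 0.8, 0.8])  # large outer axes
axes2 = fig.add_axes([0.2, 0.5, 0.4, 0.3])  # smaller axes, upper left of the outer one

# Changing the first two numbers moves the small axes;
# changing the last two resizes it.
for ax in fig.axes:
    print([round(v, 2) for v in ax.get_position().bounds])
```

Raising the first value slides the small axes to the right, and lowering the second value slides it down, which is what the exercise is asking you to experiment with.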
All right. So I've brought this to the right, and all I have to do now is bring it to the bottom. So let me decrease the second value, the vertical position, from 0.5 to, let's say, 0.2. I have now moved the plot just by changing the dimensions of my axes. Okay, so this is exactly what happens when you modify the inputs that you provide. You have complete control over the position in which you want your axes to be placed. Let us provide input for our axes. I want my axes1 to have axes1.plot, and inside the parentheses, x comma y. Similarly, I want axes2 to have axes2.plot y comma x. Let me execute. Now this is actually overlapping, so let me change the dimensions back again. I'm going to make this 0.2, and I'm going to make this 0.5. Great, our plots are not intersecting anymore. So this is how we can use the object-oriented approach to plot graphs using Matplotlib. The only difference between the functional method and the object-oriented approach is the level of control that you have over your plots; everything else is exactly the same. As a final step, let me provide a title for both of the plots. I want my axes1 to have a title, axes1.set_title, and I want it to be named, let's say, larger plot. Similarly, I'm going to copy this and paste it here, axes2.set_title. I want this to be named smaller plot. Let me execute this again. Great. 53. Matplotlib - Subplots Using OOP: In our last session, I gave you an introduction to plotting Matplotlib graphs using the object-oriented method. In this session, let us see how we can plot subplots using the object-oriented method. So let me go ahead and remove this. Plotting subplots using the object-oriented method is completely different from plotting subplots using the functional method that we saw before. The syntax is entirely different. In our previous session, to initiate a figure object, this was the syntax that we used, correct?
But when you're plotting subplots, the syntax is totally different. Let me show you what it is. Hash subplots. The syntax for plotting subplots is as follows. First of all, type fig comma axes equals plt.subplots. Inside the parentheses, provide the number of rows and columns that you need. The first argument is nrows, which represents the number of rows. Right now I need one row. The second argument is the number of columns, which is represented by the ncols argument. I need two columns. Let me execute this real quick. If you see, we've now drawn a subplot grid with one row and two columns inside it. What is this syntax actually doing? The subplots method is doing everything that we did here. It automatically creates the axes for us, much like the add_axes calls we made by hand. Similarly, by varying the number of rows and columns, you can determine the number of subplots. For example, I'm going to change the rows to three, and I'm also going to change the columns to three. Let me execute this. Since I've typed three rows and three columns here, I receive nine subplots overall. Based on the numbers you provide for the nrows and ncols arguments, the subplots method creates as many axes as it needs. Now you can see that there is a slight overlap between these subplots. To eliminate the overlap, you can use something called plt.tight_layout. tight_layout is a method that's going to help you eliminate the overlaps between your subplots. Let me execute this. If you see, after including the tight_layout method, I don't have any overlaps between the subplots. Right now, I'm not going to be using this, so I'm going to remove it.
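As a quick sketch of what plt.subplots hands back, the snippet below builds the one-by-two grid and the three-by-three grid from this lesson. The variable names grid_fig and grid_axes are my own, and the Agg backend line is an assumption for running outside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for running as a script
import matplotlib.pyplot as plt

# One row, two columns: returns the figure and an array of two axes.
fig, axes = plt.subplots(nrows=1, ncols=2)

# Three rows, three columns: nine axes in a 3 x 3 array.
grid_fig, grid_axes = plt.subplots(nrows=3, ncols=3)

# Adjust the spacing so tick labels of neighbouring subplots do not overlap.
grid_fig.tight_layout()

print(axes.shape, grid_axes.shape, grid_axes.size)
```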
I'm also going to change the number of rows and columns back to one and two. Now what is really happening here? This is a process called tuple unpacking. To understand this a little better, let me execute axes on its own and have a look at the output. As you can see, when you execute axes, we get an array as output. And like any array, it can be iterated over using a loop, correct? So I'm going to utilize this property and use a for loop here. I'm going to type for a in axes, a.plot x comma y. So this is how you can provide inputs to the plots that we've created. Let me execute this and have a look at the output. All I'm doing is iterating through the array object that I've created, and for every element in the array, I'm providing the x comma y input. Now, just as you can iterate over arrays, you can also index arrays, correct? So this is one way to provide inputs. Let me show you another way. You can do something like axes 0, which represents the zeroth index, or the first element, in this array. Axes 0 dot plot x comma y. I'm going to copy-paste this: axes 1 dot plot y comma x. Axes 1 represents the second element in your axes array. Let me execute this again. This is the second way you can provide inputs to the subplots you create using the object-oriented method. Every other piece of syntax is exactly the same. For example, if you want to include titles for your graphs, or an x label and a y label, the syntax is exactly the same. Let me do it once. Axes 0 dot set_title, and inside the brackets, I'm going to type axis 1. I'm just going to copy-paste this again. Axes 1 dot set_title, and inside the brackets, I'm going to type axis 2. Let me just add a blank line in between for clarity, and let me execute this. Now, as I just told you, the syntax for every other functionality is exactly the same. 54.
Matplotlib - Modifying Figure Size & DPI: In this lecture, I'm going to show you how we can modify the figure size and DPI of the plots that you create using Matplotlib. So first of all, I'm going to type the basic syntax for creating a figure object, fig equals plt.figure. Inside the parentheses, you can provide an argument called figsize. This figsize argument takes a tuple as input, and inside the tuple you can provide the intended figure size for your plot. Right now I'm just going to type one comma one. Remember, these dimensions are in inches. After the figsize argument, you can also add an additional argument called dpi, which stands for dots per inch. This can be whatever number you want, and based on the clarity that you need, your DPI can vary. You can type 100 or 200; it's completely up to you. Right now I'm just going to create a plot with the default DPI, so I'm going to remove this. Inside the figure object we need axes, correct? So I'm just going to create my axes now, ax equals fig.add_axes, and as a list I'm going to pass 0 comma 0 comma 1 comma 1. Let me execute this. As I told you, these one comma one dimensions are in inches. So if I change this to, let's say, three comma two and execute, you can see that the width and height of my plot vary according to the dimensions I provide here. Let us again change this three by two to, let's say, eight by two, and let me execute this again. My plot is now really wide. Let me provide an input for this, ax.plot, and inside the brackets x comma y, and I'll execute this again. So this is how we can use figsize to modify the width and height of the plot that you're trying to create. Now pause the screen and try to play around with these numbers, and have a look at all the different plot sizes that you can work with. Let us see how we can do the same for subplots.
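Here is a small sketch of the figsize and dpi arguments in use. The eight-by-two size comes from the lesson; passing dpi=100 explicitly is my choice for illustration, and the Agg backend line is an assumption for running outside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for running as a script
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)
y = x ** 2

# figsize is (width, height) in inches; dpi is dots per inch.
fig = plt.figure(figsize=(8, 2), dpi=100)

# Axes filling the whole figure: [left, bottom, width, height].
ax = fig.add_axes([0, 0, 1, 1])
ax.plot(x, y)

width_in, height_in = fig.get_size_inches()
print(width_in, height_in, fig.dpi)
```

Width in inches times dpi gives the pixel width of the rendered image, which is why a larger dpi produces a sharper picture at the same figsize.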
So again, I'm just going to write the basic syntax for subplots. Fig comma axes equals plt.subplots, and inside the parentheses, I'm just going to copy-paste the same figsize here. Let me execute this. I have now created a subplot. So let me go ahead and provide the number of rows and columns that I want: nrows equals 1 and ncols equals 2. Now since my plot is really wide, I'm going to change nrows to 2 and the number of columns to 1. Let me execute this again. Just to make things really clear, I'm going to provide an input for both of these plots. So first, axes 0 dot plot x comma y. I'm going to do the same for the second subplot, axes 1 dot plot y comma x. So let me execute this again. This is how we can modify the figsize when you're using subplots. There is a slight overlap happening, so as I told you before, I'm going to use the tight_layout method, plt.tight_layout. Let me execute this again. Now tight_layout is trying its best to eliminate any overlap that's happening. If you don't want overlaps to happen, be mindful of the figsize that you're providing. The longer your figsize is, the more difficult it is going to be for you to remove any overlaps. Okay? So let me just reduce the length here. Let me change this to four comma two. Great. I'm also going to change the number of rows to one and the number of columns to two. Let me execute this again. All right, this looks a bit neater. 55. Matplotlib - Saving The Plot: While introducing the Matplotlib library to you, I told you that Matplotlib can produce high-quality images, correct? So when I say images, there must be a way to save all the images that you're creating. So in this lecture, let me show you how we can save the figure that you're plotting. To do this, I'm just going to copy-paste this. As a matter of fact, let me make this a Markdown cell, saving figure. And I'm just going to copy-paste the code.
I'm going to modify the figure size to (4, 2), and let me execute this. So we have plotted a figure here. Matplotlib can save your figure in a lot of different high-quality file formats, including PNG, JPEG, SVG, PDF, and even EPS. The syntax for saving your plot is as follows. First of all, fig.savefig(). This is a method again. The first argument you must provide is the file name with which you want to save your plot. I'm just going to name my plot my_picture, followed by the file format with which you want to save your plot. It can be PNG, it can be JPEG, it can be whatever you want. I'm just going to set this to .jpeg. Optionally, you can also include a dpi argument, and depending on the clarity you need, you can type any number you want here. I'm just going to set dpi to 200. All right, so after I hit execute, Matplotlib saves this plot in the same file path where my Jupyter Notebook is currently running.
56. Matplotlib - Creating A Legend: Whenever you're trying to create a graph that has more than one plot, you must add something called a graph legend. A graph legend tells a user or a consumer what each plot is trying to represent. So in this lecture, I'm going to show you how we can add a legend to any plot that you create. First of all, let us create a graph that has two plots inside of it. To do this, I'm first going to create a figure object, fig = plt.figure(). Following this, I'm going to create the axes: ax = fig.add_axes([0, 0, 1, 1]). And let us start adding the plots: ax.plot(x, y). Following this, I'm also going to add another plot inside the same canvas with the inputs (y, x). Let me execute this. If you see, we have one single graph with two plots inside of it.
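Going back to the saving step from lesson 55, a minimal sketch of fig.savefig might look like this. The sample data and the temporary output folder are assumptions made so the sketch is self-contained; in the lecture a bare file name is used, which saves next to the running notebook, and the format there is JPEG rather than PNG.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend; in Jupyter use %matplotlib inline instead
import matplotlib.pyplot as plt
import numpy as np

# Assumed sample data; the lecture reuses x and y from earlier lessons.
x = np.linspace(0, 5, 11)
y = x ** 2

fig = plt.figure(figsize=(4, 2))
ax = fig.add_axes([0, 0, 1, 1])
ax.plot(x, y)

# The file extension selects the format; dpi controls the output resolution.
out_path = os.path.join(tempfile.mkdtemp(), "my_picture.png")
fig.savefig(out_path, dpi=200)
```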
Similarly, you might even have more than two plots. In cases like this, it gets extremely confusing, even for you, to understand what each plot is trying to represent, which makes it even more important to add a legend to a plot. Before we actually provide a name for these plots, let us differentiate them a bit, because both of these plots are more or less getting the exact same inputs, correct? So I'm just going to modify the inputs of each of these plots. I'm going to modify this one to x, x squared. And here I'm going to modify the inputs to x, x cubed. Let me execute this. Great. So to add a legend, this is the syntax that you must follow: ax.legend(). Legend is a method. All right, so let me execute this. If you see, I'm getting an error which says no handles with labels found to put in legend. This means that whenever you create a plot, each plot should carry something called a label. The legend method then reads the labels that you've given for each of these plots and forms a legend. So first I'm going to provide the label for the first plot. You cannot change the name of this argument: it is called label, and it takes strings as input. So I'm going to name this X squared, and I'm also going to add a label for the second plot: label = X cubed. Now let me execute this. If you see, we have added a legend to the plot. Now, when a user or consumer sees this for the first time, he or she can first look at this legend to understand what each of these plots is trying to convey. You can also modify the location of this legend using the loc argument. loc stands for location, and by using this argument you can change where the legend is placed. Before we go there, I want you to go to this link. Don't worry, I'm going to attach this link to the Jupyter Notebook. Inside this link, you can see a lot of location strings along with their location codes.
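The labelling steps above can be sketched as follows. The x array is assumed sample data and the label strings are only illustrative; the key point is that ax.legend() builds the legend from the label given to each plot call.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; in Jupyter use %matplotlib inline instead
import matplotlib.pyplot as plt
import numpy as np

# Assumed sample data; the lecture reuses x from earlier lessons.
x = np.linspace(0, 5, 11)

fig = plt.figure()
ax = fig.add_axes([0, 0, 1, 1])

# Each plot carries a label; without labels the legend has nothing to show.
ax.plot(x, x ** 2, label="X squared")
ax.plot(x, x ** 3, label="X cubed")

legend = ax.legend()  # placed automatically unless loc is given
```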
Now for example, let us take the upper right. All I have to do is type loc = 1; 1 represents upper right here. Okay? And if I hit execute, my legend will now get shifted to the upper right corner of the canvas. Similarly, you can play around with all of these location codes. For me, the default location was good enough, so I'm going to remove the loc argument. Great. Now if you're not satisfied with the location codes, you can also manually set the location using a tuple of x and y coordinates. Let me type (0.1, 0.1) here. And if I hit execute, I'll be able to manually modify the location of my legend. So whenever you're trying to plot a graph that has more than one plot inside the same graph, always make sure you create a legend.
57. Matplotlib - Customization: We are almost nearing the conclusion of the Matplotlib series, and I think this is a great time to talk about how we can customize the plots we create. Before we get there, let me first create a plot which we can customize: fig = plt.figure(), ax = fig.add_axes(), and inside the parentheses I'm going to provide the list [0, 0, 1, 1]. Then I'm going to provide the necessary inputs for the plot we're trying to create, which is just x, y. Now, this is the plot we have on our hands right now, and there are a lot of ways in which you can customize it. Let us start with something that we have already worked with before, which is the color of the plot. You can provide an argument called color, which takes a string as input. Matplotlib will generally recognize all simple colors. This includes red, blue, yellow. You get the point, right? And all the other simple colors. Just by hitting Enter, Matplotlib will modify the color of your plot. Let me change this to purple this time. This again proves my point: Matplotlib can recognize all the simple colors.
And let us assume you want a custom color. If this is your case, you'll have to use something called a hex code. There are a lot of tools available online which will help you create a hex code on your own. Let me type a hex code that I remember. A hex code looks something like hash FF8C00, and let me hit enter. Now, if you see, this is a hex code that I remember, and similar to this there are a lot of hex codes that you can get from the internet. There are also hex code creators that help you create a hex code on your own. So go ahead and play with this, and let us move ahead to the next customization that we haven't talked about before, which is the line width. You can modify the width of this line by using the linewidth argument. By default, your line width is going to be one. So if you set it to one and hit Enter, you're not going to see any changes. But let us say you want to step down this line width a bit; all you have to do is make this a decimal number below one. The line is really thin when compared to our previous outputs. Similarly, you can step up your line width if you want. Let us say you want a line width of 20; this is going to be enormous. So choose a line width that you're comfortable with. I'm just going to stick with two. Now if you want to change the level of transparency of this plot, you can use the alpha argument. By using the alpha argument, you'll be able to modify the transparency of your plot. Let us say you want a transparency of 0.5. If I hit enter, our plot has now become rather transparent. So by modifying the number inside the alpha argument, you'll be able to modify the opacity. For now, I'm just going to remove this argument entirely, and let me hit execute again. Looks good. And as a little side note, you don't have to type linewidth entirely; just by typing lw, Matplotlib will recognize that you're trying to modify the line width, and it works just fine.
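The color, line width and transparency arguments described above can be combined in one call. This is a minimal sketch with assumed sample data; the hex value is only an illustrative color.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; in Jupyter use %matplotlib inline instead
import matplotlib.pyplot as plt
import numpy as np

# Assumed sample data; the lecture reuses x and y from earlier lessons.
x = np.linspace(0, 5, 11)
y = x ** 2

fig = plt.figure()
ax = fig.add_axes([0, 0, 1, 1])

# color accepts names ("purple") or hex codes; lw is shorthand for linewidth;
# alpha runs from 0 (fully transparent) to 1 (fully opaque).
line, = ax.plot(x, y, color="#FF8C00", lw=2, alpha=0.5)
```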
Let us move on to another interesting feature, which is called the line style. Let me show you what this can do. A line style that most data analysts use is the double dash. If I do this and execute, you can see dashes reflected here. But there are a lot of different line styles, such as the colon, which turns your plot into tiny dots. Similarly, you can use a hyphen along with a dot; this is another popular line style. And if you include just one hyphen, your line style goes back to the default solid line. And just like how we shortened linewidth to lw, you can shorten linestyle to ls. Matplotlib will recognize that you're trying to modify the line style, and it will reflect in your output. Now let me zoom out a bit. Let us recollect the values inside the array. You can see these are the values available inside the x array that we are using as input for our plot. Now what if you want to plot all these points in your graph as markers? To do this, you can use an argument called the marker argument. This marker argument will help you plot all of these points in your graph. Similar to line style, the marker argument has a lot of different types, but the one that most programmers prefer to use is the small o. If I execute this, you can see that all of these points are now getting plotted in your graph. There are a bunch of different markers you can use, which include plus. Plus is another popular marker, but it is not visible now, so let us try modifying the line width a bit. Let me change this to 0.5 here. And now you can see the plus markers prominently. You can also use other symbols, or you can use 1 here. Let me zoom in a little bit. Let me increase the line width here and also increase the marker size. So first of all, let me increase the line width back to two.
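The line style and marker options described so far can be sketched as follows, using assumed sample data. The shorthand ls stands for linestyle, in the same way that lw stands for linewidth.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; in Jupyter use %matplotlib inline instead
import matplotlib.pyplot as plt
import numpy as np

# Assumed sample data; the lecture reuses x and y from earlier lessons.
x = np.linspace(0, 5, 11)
y = x ** 2

fig = plt.figure()
ax = fig.add_axes([0, 0, 1, 1])

# Line styles: "-" solid (default), "--" dashed, ":" dotted, "-." dash-dot.
# marker="o" draws a small circle at every data point; "+" and "1" also work.
line, = ax.plot(x, y, color="blue", lw=2, ls="--", marker="o")
```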
And I'm also going to increase the marker size by using an argument called, yes, you guessed it right, the markersize argument. This is going to help us increase the size of our markers. Let me set markersize to three. Now if I execute this, our markers are barely visible, so I'm going to jump this up to 10. All right, let me increase this a little more, to 25. Let me change this marker back to the small o. If you want to take some time, just refer to the Internet to find all the different markers and line styles that you can use, and try practicing with them. Now, there are a few more arguments available for you to customize the marker, such as markerfacecolor, markeredgewidth, and markeredgecolor. Let me use these in real time. Let us start with markerfacecolor. This again takes simple strings, such as red, yellow, or whatever you want. I'm going to set this to, let's say, blue. Let me execute this. I'm encountering an error message here. This is because I've made a mistake in the spelling: I shouldn't have included a u here. Now let me execute this again. If you see, we have now modified the marker face color. Similarly, you can also modify the marker edge width and the marker edge color. Let me bring this to the next line. I forgot to include a comma here. Now let me move ahead to markeredgewidth, and I'm going to increase the width to, let's say, five. Let us execute this again. You can visibly see the edge width of the markers increasing here. You can also modify the color of the marker edges by using the markeredgecolor argument: markeredgecolor equals, let us say... Let me execute this again. And we have now successfully modified the marker edge color as well. So these are all the arguments that you can use to customize the plots that you create. Just get creative and play around with all of these.
58.
Matplotlib - Plot Range: So this is going to be the very last lecture of our Matplotlib series, and we're going to proceed to a much more sophisticated data visualization library called Seaborn. But before we get there, I'm going to show you one more feature that's important, and you're going to be using it a lot. This feature is control over the axis range, or the plot range. Before we do this, let me just copy these lines of code, paste them here and execute. But instead of having this really ugly plot, I'm going to make this plot really simple again. To do this, I'm going to remove all of this, and I'm going to remove this comment. Great. Now let me execute this again. And for clarity, let me change this color back to, let's say, blue. Great. Now let us assume you don't want this entire graph in your output, but you want to limit your graph to the x-axis range from 0 to 1. You only want your graph from 0 to 1 on the x-axis. What can you do? You can use another method called set_xlim; lim stands for limit. Let me show you how this works. Let me add a new line here: ax.set_xlim(), because first we're going to limit the plot using its x-axis. Inside the parentheses, we must provide a list, and this list takes two values. These two values are the lower limit and the upper limit with which you want to limit your plot. So as I told you, I'm going to limit the plot to [0, 1] on my x-axis. Let me execute this again. Now, as you can see, we only have the portion of our plot that falls between 0 and 1 on the x-axis. You can do the exact same thing for the y-axis using the set_ylim method. Now I'm going to limit my graph to [0, 2] on my y-axis. Let me execute this. So this is a really useful approach if you only want to view a particular portion of a huge plot.
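The axis-limit steps above can be sketched as follows, again with assumed sample data. Each method takes the lower and upper bound of the visible range.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; in Jupyter use %matplotlib inline instead
import matplotlib.pyplot as plt
import numpy as np

# Assumed sample data; the lecture reuses x and y from earlier lessons.
x = np.linspace(0, 5, 11)
y = x ** 2

fig = plt.figure()
ax = fig.add_axes([0, 0, 1, 1])
ax.plot(x, y, color="blue")

# Show only the part of the plot with 0 <= x <= 1 and 0 <= y <= 2.
ax.set_xlim([0, 1])
ax.set_ylim([0, 2])
```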
And just like I told you in the beginning of this session, this is the last lecture of the Matplotlib series, so let us meet in the first Seaborn lecture.
59. SeaBorn - Introduction: Hey guys, I hope all of you enjoyed our Matplotlib lectures, because we have been gradually building up to Seaborn. We'll have to apply everything that we learned in Matplotlib while learning Seaborn, because Seaborn is a sophisticated data visualization library that's built on top of Matplotlib. So all the principles of Matplotlib apply to Seaborn as well. But what is the difference between Matplotlib and Seaborn, since both of them are data visualization libraries? The difference is that Seaborn is just so pleasing to look at, and the plots that you can draw with Seaborn are far more sophisticated. Seaborn is a statistical plotting library that has beautiful default styles and customization options not available in Matplotlib. It is also designed to work very well with Pandas DataFrame objects. This means we can convert our Pandas DataFrames into beautiful visualizations using Seaborn. Before you can actually learn how to apply Seaborn, you have to install it on your PC. So just go to your terminal and type the commands that you see on the screen to install Seaborn on your PC. And most importantly, the Seaborn library is completely open source. This means that the source code of Seaborn is openly available to all developers. Let us take a quick tour to our browser to see Seaborn's documentation. So like I just told you, Seaborn is open source. This means that you can go to GitHub and see Seaborn's documentation. All you have to do is type GitHub Seaborn Python, and the very first link you see on the screen will take you to Seaborn's GitHub repository. Let's click on the very first link. So this is where all the developers can see the source code on which Seaborn is built. But we're not here for the source code. We are here to explore Seaborn's documentation.
To do this, click on this hyperlink right here. This will take you to the website that has Seaborn's documentation. You might have noticed a similarity between Matplotlib's and Seaborn's documentation pages, because they both have a section called Gallery. If you click on Gallery, you'll be able to explore all the different plots that you can draw with Seaborn. And with just a single look, you can see that the plots Seaborn can draw are far more beautiful and pleasing to look at when compared to Matplotlib's plots. There are a ton of different plots, each having its own use cases. All you have to do is click on a plot, and it will redirect you to a page that has the source code of that particular plot.
60. SeaBorn - Distribution Plots I: The very first thing we're about to learn in Seaborn is distribution plots. Distribution plots covers the different plot types in Seaborn that allow us to visualize the distribution of a dataset. I know it sounds confusing, when in reality it's actually not. First, let me show you how we can import Seaborn. Just type import seaborn as sns. Using sns is a standard industry practice, and I highly recommend you stick to this. Let me execute this. To visualize all our plots inside the Jupyter Notebook, I'm again going to use %matplotlib inline. After doing this, we have to import a dataset into the Jupyter Notebook. Fortunately, Seaborn has a lot of inbuilt datasets that we can import directly. To do this, I'm creating a variable called tips, and the name of the dataset is also tips. So let me show you how it works: sns.load_dataset(), and inside the parentheses provide the name of your dataset, which in our case is tips itself. Let me execute this. Now, to view the dataset that you've loaded, all you have to do is type tips, which is the name of my variable, then .head().
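The import and loading steps above can be sketched as follows. Note that sns.load_dataset fetches the sample data over the network, so this sketch adds an offline fallback: a tiny made-up table with the same column names. That fallback is purely an assumption for self-containment and is not part of the lecture.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; in Jupyter use %matplotlib inline instead
import pandas as pd
import seaborn as sns

try:
    # load_dataset fetches the sample data online (and caches it locally).
    tips = sns.load_dataset("tips")
except Exception:
    # Assumed offline fallback: made-up rows with the same column names.
    tips = pd.DataFrame({
        "total_bill": [12.5, 18.0, 22.4, 9.8],
        "tip": [2.0, 3.0, 3.5, 1.5],
        "sex": ["Female", "Male", "Male", "Female"],
        "smoker": ["No", "No", "Yes", "No"],
        "day": ["Sun", "Sun", "Sat", "Sat"],
        "time": ["Dinner", "Dinner", "Dinner", "Lunch"],
        "size": [2, 3, 4, 2],
    })

print(tips.head())  # first five rows of the DataFrame
```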
Now if I hit Enter, I'll be able to see a glimpse of the dataset that I've imported. We're going to be using this particular dataset to visualize Seaborn's distribution plots. The very first distribution plot we're about to see is the distplot. Now this distplot is going to help us visualize a univariate set of observations. By univariate, I just mean you can only visualize one variable at a time. For example, let us try and visualize the total_bill variable. The syntax for doing this is sns.distplot(), and inside the parentheses you must provide the variable that you want to visualize from the dataset, which in my case is the total_bill variable. Now let me execute this. In most cases, you'll be getting a warning message like this, but there is nothing to worry about. We'll just proceed with this. And this is what a typical distplot looks like. This is a histogram. It's giving us the distribution of the total bills that we've collected. As per the histogram, most of our bills fall in the range of 10 to 30 dollars. This exactly matches what our dataset is telling us here. And as you can see, there's also a line over this histogram. This line is what's called a KDE, or kernel density estimation. If you don't want this line to be there and you want to remove this KDE, all you have to do is include another argument here called kde. By default, kde is True, and if you don't want the KDE, you'll have to set this to False. Now let me execute this again. If you see, I no longer have a kernel density estimation line. Similarly, if you want to expand this histogram a bit, or in other words, if you want to add more bins to the histogram, all you have to do is add another argument called bins and provide a number here. By doing this, you'll be able to increase the number of bins that the data is divided into. Now let me execute this.
And as I just told you, the histogram looks a little more expanded, but still, most of our total bills are in the range of 10 to 20 dollars. While using this argument, you have to be really careful with the number that you provide. If you provide a really large number and execute, you will pretty much end up with a bar for almost every row inside the dataset, which you might not need. The sole purpose of plotting a histogram, or any other plot for that matter, is to get a quick overview of your data. If you do something like this, the whole purpose of plotting a histogram is ruined. So be mindful of the number that you provide here; I believe 30 is a decent number. Now that we know about the distplot, the next plot we're about to see is the joint plot. What is a joint plot and what does it do? A joint plot is going to match up two distplots for bivariate data. I told you the distplot is univariate, correct? That means you can analyze one variable at a time. Joint plots, on the other hand, are a combination of two plots that help you analyze bivariate data, which means two variables at a time. The joint plot also supports an argument called kind. This kind argument is going to help us modify the type of plot that we're drawing. For example, in our distplot we have drawn a histogram, correct? But while you're using jointplot, you can change this to another plot type if you want. Let me show you how we can do this. The syntax for the joint plot is really simple. All you have to do is call sns.jointplot(). Inside the parentheses, you must first provide three basic arguments: the data that you want to provide for the x-axis, followed by the data that you want to provide for the y-axis, and the third argument is the dataset that you're going to be using for this plot. So let's go from the back. The dataset that I'm using is the tips dataset.
Now, these x and y arguments take a string as input. I'm going to provide total_bill for x, and for my y-axis I'm going to provide tip. Let me execute this right away. As I just told you, unlike a distplot, a joint plot analyzes two different columns simultaneously. It is comparing the total bill against its respective tip. This scatterplot makes a lot of sense, because it is telling us that the greater the total bill becomes, the larger the tip is going to be. Previously, our distplot gave us information just about the total bill, whereas the scatterplot is helping us compare the total bill with its corresponding tip size. As I told you, it is helping us compare the amount of tip the waiter receives based on the size of the total bill. Now, like I explained in the beginning, jointplot supports an argument called kind, which is going to help us modify the type of this plot. Right now, we have a scatterplot, but if you want, you can use this argument to change the type of the plot. Let me do this for you. I'm going to use the kind argument, and I'm going to change the scatterplot to, let's say, a hex plot. Now let me execute this. This hex plot looks a little more pleasing when compared to a scatterplot. The hex plot is density-based, which means the darker a hexagon is, the more data it's accommodating. If you see here, the hexagons around this region are darker when compared to the hexagons around the rest of the plot. So what is this trying to tell us? It means that the waiter got most of his tips from customers whose bills were between 10 and 20 dollars. Now if you need, you can again change this hex chart to a regression chart. If you're already familiar with machine learning, you already know what a regression chart is.
But for people who don't know, a regression chart is just a scatterplot that has a regression line inside of it. A regression line is nothing but a straight-line fit to the scattered data. And finally, you can also use another kind called kde, which as I told you before stands for kernel density estimation. Let me execute this and scroll down. This gives us a graph that looks more like waves, and the closer the waves are, the denser the data. This means this region accommodates a lot of data when compared to this larger region. Now, the basic information that we get from all of these charts is more or less the same, but as a developer, it really depends on the way you want to express the data. For example, this KDE plot conveys the density of the points best, whereas someone else might find a scatterplot more interesting. In this session, I just wanted to show you how efficient Seaborn can be.
61. SeaBorn - Distribution Plots II: Hey guys, in this lecture on distribution plots, I'm going to be talking about three distinct plot types. The first one is the pair plot, the second one is the rug plot, and the third one is the KDE plot. I'm also going to show you how a rug plot and a distplot are related. So first let me show you what a pair plot is. The syntax for the pair plot is really simple. It starts the same way, sns.pairplot(), and inside the parentheses you must provide the name of your dataset. You don't have to provide anything for the x and y axes, and let me show you why. In our previous session, we spoke about joint plots, correct? A joint plot had us compare two separate columns from our dataset. A pair plot, on the other hand, is going to perform a joint plot for every possible combination of numerical columns inside your dataset. It's going to plot a joint plot between total bill and tip, between total bill and size, and so on. So basically it's going to compare each and every numerical column inside this dataset.
This is exactly why you don't have to provide anything for the x and y axes. It's going to plot every possible combination. So let me go ahead and execute this for you to see what I'm talking about. Now, based on the size of your dataset, the execution of pairplot might take a while. Our dataset happens to have just a few columns, but if you're working with a dataset that has many columns to compare, the execution of your pairplot might take minutes. Let me scroll down to show you all the beautiful plots that have been plotted. Just like I told you, the pair plot's task is to compare all the columns inside a dataset. For example, it's comparing this size column with total bill, it's also comparing size with tip, and it's also comparing size with size, et cetera. But there is a difference if the columns that the pair plot is comparing are the same, like size versus size: the result is not going to be a scatterplot, it's going to be a histogram, because there is no second variable to compare. The same goes for tip versus tip, and the same goes for total bill versus total bill. If the columns being compared are the same, it's not going to be a scatterplot but a histogram, simply because there is no second variable to compare. So essentially, a pair plot is a bivariate plot, which compares two different variables at a time. The only difference it has from a jointplot is that we don't have to manually declare the x and y axes. It's just going to compare all the columns in the dataset. The pair plot also supports an argument called hue. Before I show you what it can do, let us have a look at the dataset again.
All right, so this hue argument is going to help us provide a different set of colors based on columns like sex. Let us say you want to provide a separate color for male and female; all you have to do is provide the name of that column inside the hue argument. For now, I want my sex column to have different colors, so I'm just going to type sex here, and I'm going to execute this. Now, as you can see, males and females have different colors, and Seaborn is auto-generating a legend here on its own. Remember how in Matplotlib we had to type our own code to generate a legend? You don't have to do this in Seaborn. Seaborn is going to generate its own legend whenever you allocate different colors to different categories. And make sure you don't apply the hue argument to columns like total bill or tip, because those columns hold numerical values rather than categories. Now there's also another argument that I'd like to talk about, which is the palette argument. This palette argument is going to help us select a particular color scheme. We're going to be talking a lot about customizing Seaborn plots later in the course, so I'm not going to go too deep into this. I'm just going to show you a glimpse of what the palette argument can do. For example, coolwarm is a palette, or color scheme, that both Matplotlib and Seaborn support. Let me execute this. This is what the coolwarm color scheme looks like. Similar to coolwarm, there are a lot of color schemes that both Matplotlib and Seaborn support, which we'll be seeing later in the course. Let me move ahead to the next plot that I told you about, which is the rug plot. The syntax for the rug plot is as follows: sns.rugplot(). The rug plot is again a univariate plot, which means it can only analyze one variable at a time. So I'm just going to provide a single input: tips, square brackets, total_bill. And let me execute this.
And let me tell you what this rug plot is doing. It just draws a dash mark for every point of the univariate distribution. Let us compare the output of this rug plot with a distplot, and let me explain this in detail: sns.distplot(), and inside the parentheses, tips, square brackets, total_bill. Let me execute this. I don't need the KDE line, so I'm just going to remove it. If you see here, both the rug plot and the distplot are getting the same input, but there is a slight variation in the way they operate. As I told you before, the rug plot draws a dash mark for every point of the univariate distribution, whereas the histogram has bins; it counts how many dashes fall in each bin and then shows that count as the bar height over here. For example, let us assume the rug plot has approximately 150 dashes between the 10 and 20 marks. The histogram is just converting these 150 dashes into bins and distributing them between 10 and 20. This is the only difference between a rug plot and a distplot. Let me also show you a quick trick. I'm going to include the KDE line again. Now, what if you don't want the histogram, but you only need the KDE line? You will have to use something called the KDE plot. The syntax is exactly the same. I'm just going to copy and paste this into a new cell, remove distplot, and replace it with kdeplot. Now let me execute this. This KDE plot helps us remove the histogram and retain the KDE line.
62. SeaBorn - Categorical Plots I: In our previous lectures, we spoke about distribution plots, which visualize the distribution of a dataset. But from here on, we are going to be talking about categorical plots. With categorical plots, we're mainly going to be focusing on seeing the distribution of data across categorical columns such as sex, smoker, day, time, et cetera.
And we're going to be looking at this distribution of data in reference to either one of the numerical columns, such as total bill, or one of the categorical columns themselves. That is, we can either compare a categorical column such as sex with another categorical column such as smoker, or compare a categorical column such as sex with a numerical column such as total bill. Before the lecture, I typed a few lines of code, nothing too fancy. I first imported Seaborn, I've included %matplotlib inline, and I've imported our dataset. Finally, I've displayed the dataset that we've imported. Let us start with bar plots. The bar plot is one of the most used categorical plots of all time. A bar plot allows you to aggregate data based on a function. By default, this function is the mean. But if you want, you can change this to any other function you like, such as standard deviation or sum. Just think of barplot as a visualization of the group-by action. I know this could have sounded confusing to you, but you'll understand it a lot better when I show you an example. So the syntax for barplot is as follows: sns.barplot(), and inside the parentheses you must first provide a value for the x column, then a value for the y column, followed by the dataset that we want to use, which is going to be the tips dataset. Like I already told you, we can either compare a categorical column with another categorical column, or compare a categorical column with a numerical column. Right now I'm going to compare a categorical column with a numerical column. So I'm going to provide sex for x, and I'm going to provide a numerical column for the y-axis, which is going to be the usual total_bill. Let me execute this and scroll down a bit. This bar plot is telling us the total bill contributions from each of these genders.
And I already told you that this is a mean value. This means the male customers have a mean total bill of about 20 dollars, whereas the female customers have a mean total bill somewhere between 15 and 20 dollars. But as I told you before, you can modify this mean function if you want, using an argument called estimator. The estimator argument lets you change the aggregate function that's currently being performed. Before we do this, we must import NumPy. So I'm going to insert a cell above and import NumPy there: import numpy as np. Let me execute this. Inside the estimator argument, I'm going to type np.std, because, let us say, I want to plot the standard deviation. If I execute this, the aggregate function being performed is no longer the mean but the standard deviation. Even with the standard deviation, the male customers seem to have a higher value than the female customers. The estimator argument accepts NumPy's built-in aggregate functions, so please feel free to try some of those on your own.
Now let us move ahead to another plot called the count plot. The count plot, as the name suggests, tells us the number of rows in each category. The syntax for countplot is much the same: sns.countplot, and inside the parentheses all you have to do is provide a value for the x-axis, because the value of the y-axis is always going to be the count. For the x-axis, I'm going to provide sex, and I want to take this value from the dataset called tips. As I told you, you don't have to provide a value for y, because y is always going to be the count. Let me execute this and scroll down a bit. This count plot, as I told you, is counting the total number of males and females in our dataset. The y-axis is always going to be the count.
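The estimator swap and the count plot described above might look like this in code. It is a sketch under assumptions: `tips_demo` is a made-up stand-in for the tips dataset, and both plots are drawn side by side purely for compactness.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the tips dataset (made-up values)
tips_demo = pd.DataFrame({
    "sex": ["Male", "Male", "Male", "Male", "Female", "Female"],
    "total_bill": [20.0, 24.0, 22.0, 30.0, 15.0, 19.0],
})

fig, (ax_std, ax_count) = plt.subplots(1, 2, figsize=(8, 4))

# Bar plot whose aggregate is the standard deviation instead of the default mean
sns.barplot(data=tips_demo, x="sex", y="total_bill", estimator=np.std, ax=ax_std)

# Count plot: only x is given, the y-axis is always the number of rows per category
sns.countplot(data=tips_demo, x="sex", ax=ax_count)
counts = sorted(int(round(p.get_height())) for p in ax_count.patches if p.get_height() > 0)
```

Any function that reduces a group of values to a single number, such as `np.sum` or `np.median`, can be passed to `estimator` in the same way.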
You cannot modify the y-axis, but you can modify the value that you provide for the x-axis.
Following this, we're going to learn about box plots, which are another interesting way of showing how data is distributed across categories. A box plot is also known as a box-and-whisker plot. This plot displays the five-number summary of a set of data. What do we mean by the five-number summary? It shows us the minimum value, the first quartile, the median, the third quartile, and the maximum. And just like I always tell you, this explanation may be confusing, but you'll understand it a lot better when I show you an example. The syntax for boxplot is sns.boxplot, and inside the parentheses you must provide values for x, y, and data. For the x-axis, I'm going to provide a categorical column; for a change, I'm going to take the day column. For the y-axis, I'm going to provide total_bill. And the third argument is the data. Now let me execute this and scroll down. Let me zoom in a bit. Just like I told you, this box plot is displaying a five-number summary. The five numbers are as follows. The bottom line displays the minimum value. The box itself is built from three values: its bottom edge represents the first quartile, the middle line represents the median, and its top edge represents the third quartile. The line at the top represents the maximum value. The lines extending from the box are called whiskers, and the dots beyond them are outliers; by default, Seaborn leaves the outliers out when it draws the minimum and maximum lines. In simple terms, the box plot is displaying the distribution of the customers' total bills per day. In addition, it is helping us visualize the minimum, the maximum, and the quartile values of the total bill for the customers who came each day. Let me explain this one more time. The bottom line represents the minimum total bill. The top line represents the maximum total bill on that particular day.
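The five-number summary described above can be computed by hand to see what the box plot draws. This is a rough sketch with made-up bill amounts; it assumes the default whisker rule used by Seaborn and matplotlib, where whiskers reach the furthest points within 1.5 times the interquartile range and anything beyond is drawn as an outlier dot.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import seaborn as sns

# Hypothetical total bills for one day (made-up values, one obvious outlier)
bills = np.array([10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0, 60.0])

# Box: first quartile, median, third quartile
q1, median, q3 = np.percentile(bills, [25, 50, 75])
iqr = q3 - q1

# Whiskers: furthest points that still lie within 1.5 * IQR of the box
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
inside = bills[(bills >= lower_limit) & (bills <= upper_limit)]
whisker_low, whisker_high = inside.min(), inside.max()

# Dots: everything outside the whisker limits
outliers = bills[(bills < lower_limit) | (bills > upper_limit)]

# The same data drawn as a single box
ax = sns.boxplot(y=bills)
```

Here the top whisker stops at 24 and the bill of 60 appears as a separate dot, which is why the "maximum" line on a box plot is not always the largest value in the data.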
And the box itself marks the range in which the middle half of the total bills fall on that day. So the box plot is, again, one of the most used categorical plots in Seaborn, and one you'll be using all the time.
63. SeaBorn - Categorical Plots II: In our last lecture, I forgot to show you that boxplot supports another argument called hue. When you use the hue argument, it helps you add another category to the distribution. For example, let me add, for a change, smoker here. If I execute this, your existing distribution gets split into two different categories depending on whether your customer was a smoker or not. Alright, let me execute this again. As I just told you, Seaborn has automatically added a legend, and it is splitting our previous distribution into two separate categories based on whether or not our customer was a smoker. It is also color-coding them accordingly. If a customer was a smoker, it's coloring them blue, and if the customer was not a smoker, it's coloring them in an amber sort of color. This again shows the level of sophistication that Seaborn offers, because the amount of information a user can get from a plot like this is immense. For example, let us take Saturday. On Saturday, the total bills from smokers tend to be greater than the total bills from non-smokers. This is far more information than what we had before. So whenever you can, or whenever you want to, feel free to add the hue argument to your boxplot call.
The next categorical plot we're going to discuss is the violin plot. The syntax is exactly the same: sns.violinplot. The violin plot will again help you compare two categorical columns, or one categorical column with a numerical column. Right now, I'm going to compare a categorical column, day, with a numerical column, total_bill. The third argument is the data, which is going to be tips. And I guess this is enough for now.
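The hue argument for box plots discussed above can be sketched like this. The data frame is a made-up stand-in for the tips dataset, and the legend check at the end is only there to show that Seaborn adds the legend on its own.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the tips dataset (made-up values)
tips_demo = pd.DataFrame({
    "day": ["Sat", "Sat", "Sat", "Sat", "Sun", "Sun", "Sun", "Sun"],
    "smoker": ["Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No"],
    "total_bill": [25.0, 18.0, 30.0, 20.0, 22.0, 21.0, 27.0, 16.0],
})

# hue splits every day's box into one box per smoker category
ax = sns.boxplot(data=tips_demo, x="day", y="total_bill", hue="smoker")

# Seaborn builds the legend automatically from the hue categories
legend_labels = sorted(t.get_text() for t in ax.get_legend().get_texts())
```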
Let me execute this quickly. And let me zoom out a bit so we can compare the output of the box plot with the output of the violin plot. Let me remove the hue here. Alright. If you see, the output of a box plot and the output of a violin plot are more or less the same. A violin plot just displays the density of the distribution as bulges: the greater the bulge, the greater the density. In simple terms, a lot of the customers we had on Thursday had a total bill between 10 and 20 dollars. The bulge shows that a lot of customers have their total bills in that range, and towards the tip our violin thins down, which means not a lot of customers had total bills in that range. With violin plots you can add the hue argument again. This time, let's say I'm going to add sex here. Let me execute this. Just like the output with box plots, the hue argument splits our existing distribution into two separate violin plots and color-codes them accordingly. You can optionally add another argument called split. By default, split is set to False, but we're going to make it True. Now let me execute this. Instead of separating your violin into two separate violins, it displays the male distribution on one side and the female distribution on the other. In my opinion, whenever you use the hue argument, please use the split argument along with it, because it makes your plot look a lot more professional. We also have a clearly defined legend here, so we don't have to worry about getting confused. But I personally wouldn't use violin plots a lot. This is simply because most people won't have seen a violin plot in their lives, even in their professional lives. Use box plots whenever possible, because most of your managers will only be used to box plots, and a violin plot would only confuse them. Unless you absolutely have to, or unless you're sharing your data with other data scientists, stick to basic plots like the box plot.
Now, speaking of complex plots, I'm going to show you a plot called the strip plot. A strip plot is nothing but a scatter plot in which one variable is categorical. Let me explain this with an example, as usual: sns.stripplot. As a matter of fact, let me just copy the first three arguments and paste them inside the parentheses. Let me execute this. Oh, there has been an extra bracket. Alright. So, like I just told you, a strip plot is nothing but a scatter plot, but it's more or less going to look like a violin plot. From the looks of it, you can see that a lot of dots overlap each other. If you don't want your dots to overlap, and you want the points to be easier to tell apart, you can use an optional argument called jitter. Set jitter to True and execute. The jitter argument helps Seaborn do its best to keep the dots from overlapping. As we did before, you can add the hue argument. Here too, the hue you give has to be a categorical column. I'm going to add sex here. Instead of splitting your scatter plot into two different scatter plots, this color-codes your dots. You can also use the split argument again: split equals True. If Jupyter Notebook issues a warning, don't worry, it's not a problem. This split argument splits the scatter plot into two separate scatter plots. Again, if you are a working professional, I would not recommend that you use the strip plot, because not a lot of people would understand what a strip plot is trying to convey. Some people like to combine the idea of the strip plot with the violin plot to produce what's called a swarm plot. So that's exactly what we're going to see now.
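The split violin and the jittered strip plot described above could be sketched as below. The data is randomly generated as a stand-in for the tips dataset. One assumption to flag: in current Seaborn releases the strip plot argument that separates the hue groups is named `dodge` (the older `split` spelling is what triggers the warning mentioned above), so the sketch uses `dodge`.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the tips dataset: 10 random bills per (day, sex) pair
rng = np.random.default_rng(0)
n = 40
tips_demo = pd.DataFrame({
    "day": np.tile(["Thur", "Fri"], n // 2),
    "sex": np.repeat(["Male", "Female"], n // 2),
    "total_bill": rng.uniform(10, 40, size=n),
})

fig, (ax_violin, ax_strip) = plt.subplots(1, 2, figsize=(10, 4))

# split=True draws one half-violin per hue level instead of two full violins
sns.violinplot(data=tips_demo, x="day", y="total_bill",
               hue="sex", split=True, ax=ax_violin)

# jitter spreads overlapping dots sideways; dodge separates the hue groups
sns.stripplot(data=tips_demo, x="day", y="total_bill",
              hue="sex", jitter=True, dodge=True, ax=ax_strip)
```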
The syntax for the swarm plot is as follows: sns.swarmplot, and I'm again going to copy and paste the first three arguments. Great. Let me execute this. Just like I told you, a swarm plot is a combination of violin plots with strip plots. Your dots get arranged into what resembles a violin. The major disadvantage of a swarm plot is that it does not scale, which means that if you're going to use a large dataset, a swarm plot is going to be of no use. It's just going to crowd your entire canvas with dots from which no one's going to get any tangible information. If you want, you can optionally combine a violin plot with the swarm plot. Let me show you how we can do this. I'm going to copy and paste the entire line, and in the first line I'm going to replace swarmplot with violinplot. Now let me execute this. If you see, we've essentially combined a swarm plot with the violin plot. Let me differentiate the swarm plot from the violin plot using the color argument; I'm going to change the color to, let's say, black. Although I don't recommend this, this is how you can combine a swarm plot with a violin plot.
As a conclusion to the categorical plots lecture, I'm going to show you a quick bonus, what's called a factor plot. Let me demonstrate this to you first: sns.factorplot, and I'm going to copy and paste the exact same arguments here. Think of the factor plot syntax as a common syntax that you can use to call any type of plot you want by using the kind argument. Let me explain this again. Remember the factor plot syntax as a common syntax which you can use to access any kind of plot. For example, let us say I want to access the bar plot. All you have to do is mention bar inside the kind argument. If you execute this, your data now gets displayed as a bar plot. Let us say you want the data to be displayed as a violin plot. All you have to do is replace bar with violin, and this is going to display the data as a violin plot.
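The violin-plus-swarm overlay and the common "kind" syntax described above can be sketched as follows. Two assumptions are worth stating: the data is a randomly generated stand-in for the tips dataset, and the sketch calls `sns.catplot`, which is the name current Seaborn releases use for what the lecture calls the factor plot.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the tips dataset (random values)
rng = np.random.default_rng(1)
tips_demo = pd.DataFrame({
    "day": np.tile(["Thur", "Fri"], 15),
    "total_bill": rng.uniform(10, 40, size=30),
})

# Draw both plots on the same axes: the violin first, the black swarm dots on top
fig, ax = plt.subplots()
sns.violinplot(data=tips_demo, x="day", y="total_bill", ax=ax)
sns.swarmplot(data=tips_demo, x="day", y="total_bill", color="black", ax=ax)

# One entry point for every categorical plot: switch plot type with `kind`
bar_grid = sns.catplot(data=tips_demo, x="day", y="total_bill", kind="bar")
violin_grid = sns.catplot(data=tips_demo, x="day", y="total_bill", kind="violin")
```

Unlike the individual plotting functions, `catplot` creates its own figure and returns a grid object rather than a single axes.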
Now, I personally wouldn't recommend that you use the factor plot syntax. I would suggest that you use the syntax of the particular plot that you're trying to draw.
64. SeaBorn - Matrix Plots: Following the distribution and categorical plots, we're going to learn about matrix plots. More importantly, we're going to learn how we can plot heatmaps. For this lecture, I've used another dataset that Seaborn provides, which is the flights dataset. I guess by this point most of you are familiar with this syntax, but I'm going to explain it once again. First, I'm importing Seaborn, and then, to display the output inside the Jupyter Notebook, I've used %matplotlib inline. Following this, I've loaded the flights dataset and stored it inside a variable called flights. And then I'm printing everything that's stored inside the flights variable. As you can see, our flights dataset has 144 rows and three columns. If you don't want all of these rows to get printed, you can use the head method to print only a glimpse of your dataset. This gives us the first five rows of the entire dataset. Now, as I just told you, we're going to focus on plotting heatmaps. But in order for a heatmap to work properly, your data should already be in matrix form. By matrix form, I mean that the index name and the column name should match up so that the cell value actually indicates something that is relevant to both of them. I know that most of you wouldn't have understood a word of that, so let me explain what I just told you. The first thing I told you is that for a heatmap to work properly, your dataset should be in the form of a matrix. Let me show you an example. This is an example of a simple matrix. Now, having the dataset in the form of a matrix is just one condition.
The second condition is that the index name and the column name should match up, so that the cell value actually indicates something that is relevant to both of them. Now, this is my index and this is my column. Based on my index and my column, I can derive tangible information from this matrix. For example, student A has scored 100 marks in the first test, and I can read off student A's marks in the second test in the same way. Similarly, student C has scored 100 marks in the second test. We are able to get some form of tangible information by comparing the row values with the column values, correct? This is the exact format in which your matrix has to be. Now let us go back to our original dataset. Let me zoom in a bit. The dataset that we currently have satisfies the first condition, which means it resembles something like a matrix, but it does not satisfy the second condition: you cannot get any tangible information by comparing the rows with the columns. Even though the columns have useful information inside them, the index is nothing but a running set of numbers starting from 0. You cannot compare the index values with the column values to get any tangible data from your matrix. So our first job is to convert our dataset into a relevant matrix. You can do this using two methods: you can either use correlation or you can use a pivot table. In this lecture, I'm going to show you how to use pivot tables. First, I'm calling the flights variable here, then pivot_table, and inside the parentheses you must provide three arguments. The first argument is the name of the index, the second argument must be the name of your columns, and the third argument should be the name of the values that you want to fill inside your matrix. We have three columns here, so I'm going to name the month column as my index. Right. So my first argument is index, and I'm going to place month as my index.
The second argument is columns, and I'm going to place year inside columns. The third argument is values, and passengers is the only column left, so I'm going to use passengers here. Now let me execute this and scroll down. We have a tangible matrix here, because we can now compare the row values with the column values to get tangible information from it. For example, in the month of June in the year 1950, about 140 passengers used an aircraft. This is what I was talking about. Now that we have our relevant matrix, let me store it inside a variable called, let's say, fv. Let me execute this and we're good to go. Let us go ahead and plot a heatmap of our own. The syntax for heatmap is really simple: sns.heatmap, and inside the parentheses just provide the name of the variable where you stored your dataset. If I execute this, I now have my very own heatmap. As you can see, the legend of your heatmap is slightly different from the legends you've seen in previous lessons. As per this legend, the lighter the color, the more passengers there were. For example, in July of 1960 the heatmap has a really light color, which means about 600 passengers traveled in that month. And if you take January of the very first year in our dataset, that is 1949, the color is extremely dark, almost black. This means that fewer than 200 passengers traveled in that month. In simple terms, a heatmap gives you a distribution of data based on color gradients. Now, there is an argument called cmap that you can use to modify the color gradients of your heatmap. For example, let us use a color scheme that you've already seen before, which is coolwarm. Let me execute this. Coolwarm is a really popular color scheme that most data scientists use.
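The pivot-then-heatmap steps described above can be sketched as follows. This is a minimal illustration: `flights_demo` holds only four illustrative rows in the same three-column layout as the flights dataset, and `flights_matrix` is a name chosen here, not the variable used in the lecture.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Hypothetical long-format data in the same layout as the flights dataset
flights_demo = pd.DataFrame({
    "year": [1949, 1949, 1950, 1950],
    "month": ["Jan", "Feb", "Jan", "Feb"],
    "passengers": [112, 118, 115, 126],
})

# Long format -> matrix: months become the index, years become the columns,
# and each cell holds the passenger count for that (month, year) pair
flights_matrix = flights_demo.pivot_table(
    index="month", columns="year", values="passengers"
)

# The heatmap colors each cell by its value; cmap selects the color gradient
ax = sns.heatmap(flights_matrix, cmap="coolwarm")
```

After the pivot, a lookup such as `flights_matrix.loc["Jan", 1949]` is meaningful because both the row label and the column label describe the cell.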
In this particular color scheme, the warmer (redder) the color gets, the denser the data is. Another common color scheme is magma. Let me type magma and execute this. I guess this is close to what we had before. Finally, there are two arguments I want to show you. The first one is linewidths, and the second is linecolor. On your canvas you can see the heatmap being split up into different boxes, correct? They seem to run into each other. By passing a number to linewidths, you can include a partition between these boxes. For example, let me set linewidths to 1, and I want my partitions to be, let's say, white in color. Let me execute this. This is how you can add a partition between those boxes, and you can modify the color of these partitions as well. For example, let me change this to black.
Now, just like heatmaps, there is another category of plot called the cluster map. The syntax for the cluster map is really similar: sns.clustermap, and inside the parentheses provide the variable where you've stored the dataset. Let me execute this and scroll down. A cluster map, as you can see, groups similar rows and columns together. For example, pay close attention to the years. You can see that they are not in their proper order. 1950 should be followed by 1951, but in our case it is followed by 1953. This is because our cluster map is grouping similar data together. Let me zoom out. Just by looking at the colors, you can tell that 1953 and 1954 follow more or less the same color pattern, correct? This is exactly why our cluster map is grouping both of those years together. Similarly, the years 1949 and 1954 seem to have the same color pattern, and our cluster map is grouping both of these years together. This grouping happens on both the rows and the columns. Let us change the color scheme to have a better look at this. I'm going to change it to coolwarm again. Great.
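The heatmap styling arguments discussed above (the magma color scheme plus partitions between cells) might be combined like this. The small matrix is an illustrative stand-in for the pivoted flights data; the cluster map is left out of the sketch because it needs SciPy to be installed.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Hypothetical matrix in the pivoted month-by-year layout (illustrative values)
matrix_demo = pd.DataFrame(
    [[112, 115], [118, 126]],
    index=["Jan", "Feb"],
    columns=[1949, 1950],
)

# linewidths draws a gap between the cells, linecolor sets the color of that gap
ax = sns.heatmap(matrix_demo, cmap="magma", linewidths=1, linecolor="white")

# The colored cells are stored as a single mesh on the axes
mesh = ax.collections[0]
```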
Now, in this color scheme, the grouping is really visible. The years 1959 and 1960 seem to have the same color pattern, which is exactly why they've been grouped together. Another interesting argument is standard_scale; let me type it here: standard underscore scale. This argument helps us rescale the values shown in the legend. Right now, our legend is in the hundreds, correct? If you want, you can set this to, let's say, 1. Let me execute this and scroll down a bit. If you look at the legend, it's no longer in the hundreds; it now runs between 0 and 1, and everything in between is a decimal number.
65. SeaBorn - Grids: Hey guys. In this lecture I'm going to show you how we can use Seaborn's grid capabilities to automate subplots based on the features that you have in your dataset. I'm also going to introduce a new dataset in this lecture, which is the Iris dataset. As you can already see, I have a few lines of code on the screen. Just pause the video, copy the code into a Jupyter Notebook, and execute it. This will print the dataset that I currently have on my screen. Iris is a really popular dataset. It contains measurements of a bunch of different iris flowers. Let's have a quick look at the dataset. Our dataset has five columns. The first four columns are numerical columns; those are nothing but measurements. The last column, on the other hand, is a categorical column. To be even more specific, the Iris dataset has three major species. To print the unique species that the dataset has, I'm going to type iris, and in brackets, species, and then use the unique method. Let me execute this. As I told you, the Iris dataset has three major species: the first one is setosa, the second is versicolor, and the third is virginica. So this is everything you need to know about the Iris dataset.
Allow me to introduce you to grids. First, let's plot a simple pair plot: sns.pairplot, and inside the parentheses I'm going to provide the name of my dataset. As you may already know, a pair plot gives us a bunch of different subplots that compare multiple aspects of a dataset. Let me execute this. As I told you, we have a lot of different subplots here, but pair plots have one major disadvantage: you as a developer have no control over the subplots that are being drawn. This is exactly why I'm going to introduce you to PairGrid. A PairGrid can do whatever pair plots can do, but you are in complete control. You will understand this a lot better when I actually start to code. The syntax for PairGrid is really simple. I first have to delete pairplot and replace it with PairGrid. Make sure the P and the G are in capital letters. This PairGrid takes all the numerical columns and grids them up. Let me execute this. At first we have plain, empty grids, and we have to map data onto them. This tells us that PairGrid involves a lot of manual steps, whereas pair plots are automatic. With a pair plot you don't have to map anything; just typing the pairplot syntax and providing the name of the dataset is enough, and it plots a lot of different subplots for you. With a PairGrid, you have to do all of this manually. But as an advantage, you are in complete control over the subplots. So let me show you how we can map data onto the PairGrid. Before we do this, I'm going to store it inside a variable called a. Great. Following this, I'm going to type a.map. Let's say I want all of those subplots to be scatter plots, so I'm going to type plt.scatter. Let me execute this real quick. I'm getting an error message here because I haven't imported the matplotlib library yet.
So let me just add a line of code here. I'm going to copy and paste the code for importing matplotlib. Great. Let me execute this code cell, and now let me execute this code again. And finally, there you go: as you can see, all our subplots have now been converted into scatter plots. Let me also show you a quick hint here. You can divide this entire grid into three separate divisions. The first division is the upper triangle, the second division is the diagonal, and the third division is the lower triangle. You can map different kinds of subplots onto each of these three divisions. Let me show you what I'm talking about. I'm going to remove this line completely. Let's start with the diagonal. To map data onto the diagonal alone, the syntax is as follows: a.map_diag, where diag stands for diagonal. Inside the parentheses, I have to provide the type of plot that I want my diagonal subplots to have. For example, let us say I want all my diagonal subplots to be distribution plots. To do so, I have to type sns.distplot. Let me execute this and scroll down a bit. As you can see, all my diagonal subplots are now distribution plots. Similarly, let us map data onto the upper triangle and the lower triangle. The syntax for doing this is much the same. Let's start with the upper triangle: a.map_upper. I want all the subplots in the upper triangle to be scatter plots, so I'm typing plt.scatter. Following this, I'm going to map data onto the lower triangle's subplots: a.map_lower. I want all the lower subplots to be KDE plots, so I'm typing sns.kdeplot. Now let me execute this and scroll down a bit. As per our instructions, all the diagonal subplots are now distribution plots, the subplots in the upper triangle are scatter plots, and the subplots in the lower triangle are KDE plots. This is the level of control you can get by using PairGrid.
Now, before we wrap up this session, I want to show you FacetGrid. To do this, I'm first going to import the tips dataset again. So I'm typing tips equals sns.load_dataset, and inside the parentheses I'm going to provide the name of the tips dataset. And let me call tips.head. Let me execute this. The syntax for FacetGrid is as follows: type sns.FacetGrid, and make sure that the F and the G are in capital letters. FacetGrid takes three arguments. The first argument is your dataset, the second argument is col, and the third argument is row. Our dataset is tips. The second argument, as I told you, should be the column name, and the third argument should be the row name. For col I'm going to provide time, and for row I'm going to provide smoker. Let me execute this. We receive a two-by-two grid, which has four subplots. Let's begin mapping data onto these subplots. To do this, I'm first going to save the grid inside a variable called b. On this b variable, I'm going to call map, and inside the parentheses I must first provide the type of plot that I want. Initially, let me type sns.distplot. A distplot, as you know, can only consider one variable at a time, correct? So I'm going to provide, let's say, total_bill. Let me execute this and scroll down a bit. As you can see, the data distributed inside these subplots is the total bills. The first subplot gives information about the total bills of smokers who visited during lunch. Similarly, the second subplot gives information about smokers who visited during dinner. So row 1 represents smokers and row 2 represents non-smokers. Similarly, column 1 represents lunch and column 2 represents dinner. And the data under consideration is total_bill. Now, we're currently using distplot, which can only consider one variable at a time. But what if you want to compare data from two variables at a time?
Let me show you how we can do this. For this, I'm going to use scatter plots again. So I'm going to type plt.scatter. As you already know, a scatter plot needs two variables. The first variable is total_bill, and the second variable is, let's say, tip. Let me execute this again and scroll down a bit. Now our subplots are simultaneously distributing data from the total_bill column and the tip column. Everything else is exactly the same. So instead of using pair plots, you can use PairGrid and FacetGrid to have complete control over the way in which you distribute data over your subplots.
66. SeaBorn - Size & Color: In one of our previous lectures, I promised that I would show you how to customize your plots later in our Seaborn series. So in this lecture of ours, I'm going to show you how we can change the style and color of a plot. First I'm going to draw a simple plot: sns, dot, let's say, countplot. This count plot needs an input, and my input is going to be, let's say, sex. The second argument has to be data equals tips. Before I do this, I've already typed a few lines of code here. If you want, you can pause the video and type what you see on the screen, so that we're both on the same page. Now let me execute this code. This line is giving us a really simple plot. There's nothing fancy about it. So let me introduce a new line here: sns.set_style, with parentheses. Let me hit Shift+Tab here and expand this. As you can see, the set_style method has a few styles that you can use. The first style is darkgrid, the second style is whitegrid, and there are also dark and white. Let us experiment with these one by one. I'm first going to start with darkgrid. Let me execute this. As you can see, the darkgrid style adds a dark background with a grid behind the count plot. The second style was whitegrid.
Let me execute this again. The whitegrid style, as you could have predicted, has a white background with lines in between to make our job easier. We also had a dark style, so let us see what it does. This is just giving us a plain dark background. There was also another style called white, and I guess we both know what it means. Anyway, let me execute this again. This is just giving us a plain white background. A really useful and important style that most professionals prefer is the ticks style. Let me type ticks here and execute. The ticks style does not draw lines or grids inside the canvas, but it adds small tick marks on the axes of our canvas. This is minimally invasive compared to the grid, but it also helps us read our data a lot better. Now, if you want to remove the spines, that is, the borders that your canvas has, you can use something called the despine method: sns.despine, open parenthesis, close parenthesis. Let me execute this. By default, despine removes your top and right borders first. Let me place my cursor inside the parentheses and hit Shift+Tab. Let me expand this. As you can see, the top and right arguments are set to True by default, and the left and bottom arguments are set to False. So if you want to remove the left and bottom borders as well, you have to manually change this False to True. So first, left equals True, followed by bottom equals True. Let me execute this. This removes all the borders, although I wouldn't recommend it, because in my personal opinion it looks a bit unprofessional. It also makes it a little difficult for people to understand what your plot is trying to convey, or where exactly your data is placed. Even if you want to remove the borders, I highly recommend that you remove only the top and right borders. Let me execute this again.
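The ticks style and the default despine behaviour described above can be sketched like this. The data frame is a made-up stand-in for the tips dataset; the four `*_visible` variables are only there to make the effect of `despine` inspectable.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the tips dataset (made-up values)
tips_demo = pd.DataFrame({"sex": ["Male", "Male", "Female", "Male", "Female"]})

# Other accepted styles: "darkgrid", "whitegrid", "dark", "white"
sns.set_style("ticks")

fig, ax = plt.subplots()
sns.countplot(data=tips_demo, x="sex", ax=ax)

# With no arguments, despine removes only the top and right borders;
# pass left=True and bottom=True to remove the remaining two as well
sns.despine(ax=ax)

top_visible = ax.spines["top"].get_visible()
right_visible = ax.spines["right"].get_visible()
left_visible = ax.spines["left"].get_visible()
bottom_visible = ax.spines["bottom"].get_visible()
```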
Now let us talk about modifying the size and aspect of your Seaborn plots. There are two ways in which you can do this. The first is by modifying the figsize in plt.figure, which we saw in our matplotlib lectures. The second is to use the size and aspect arguments that Seaborn plots inherently have. Let us start with the matplotlib plt.figure figsize method that I told you about. Before we do this, I'm going to copy and paste this line of code again, and let me execute this. As I told you, I'm first going to use the matplotlib method: plt.figure, and inside the parentheses I'm typing figsize equals, which again takes a pair of parentheses. Inside those parentheses you must provide two values. The first value represents the width and the second value represents the height. For example, let me type 10, 5. Let me execute this. This figsize method is one way to modify the size and aspect ratio of your plot. The second way to change how large your plot elements appear is the set_context method that comes with Seaborn. Let me show you how the syntax works. I'll delete this entire line: sns, dot, set, underscore, context, open parenthesis, close parenthesis. Let me hit Shift+Tab real quick and expand this. If I scroll down, you can see that there are four contexts that this method has. The first context is paper, and there are also notebook, talk, and poster. All of these contexts are different scaling presets that you can apply to your canvas. For example, let us consider the poster context. I'm going to type poster here. Let me execute this. Oh, I received an error message, because I should have put this inside quotation marks. Great, let me execute this again and scroll down a bit. So poster is one of the contexts that come with Seaborn, and as we just saw, there are three other contexts that you can use. Let us also try the talk context. So this is how it looks.
There's also something called paper, correct? And this is what the paper context looks like. Now, the output of all of these contexts might have looked similar to you. But when you actually save these plots as JPEG files, you can see that they are saved at completely different scales. The paper, notebook, talk and poster contexts each scale the fonts and lines differently when you save your graphs as JPEG or other image files. In general, the default context is notebook. Great. Now you can also manually modify the font size if you want by using the font_scale argument. Let me just introduce a second argument here called font_scale. Let me type 3 here, and let me execute this. Now, as I've provided 3 here, my font size has increased to three times its original value. If I type 2 here and hit execute, my font size is enlarged to two times its original value. Now finally, let me show you how you can modify the color scheme of your plot. To do this, I'm going to draw a new plot called the lmplot, or the linear model plot. The syntax is as follows: sns.lmplot, and inside the parentheses I must first provide three definite arguments. The first is the x column, the second is the y column, and the third is the dataset that we're going to provide. For the x column I'm going to provide total_bill, I'm going to provide a y column as well, and our dataset is the tips dataset. Let me execute this. An lmplot is nothing but a scatter plot with a regression line inside it. If you want, you can also optionally include a fourth argument called the hue argument. I'm just going to provide sex here, and let me execute this. This hue argument automatically allocates different colors to the values of the categorical column that we provide. In our case, the categorical column only has two values, male and female.
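The font_scale and lmplot steps might look something like this. Treat it as a sketch under stated assumptions: the small DataFrame is a hypothetical stand-in for the tips dataset, tip is my assumed choice for the y column, and the arguments are passed by keyword because recent Seaborn versions expect that.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption so the sketch runs anywhere
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the tips dataset (made-up rows, same column names)
tips = pd.DataFrame({
    "total_bill": [10.0, 15.5, 20.0, 25.5, 30.0, 12.0, 18.0, 22.5, 27.0, 35.0],
    "tip": [1.5, 2.0, 3.0, 3.5, 4.5, 2.0, 2.5, 3.0, 4.0, 5.0],
    "sex": ["Male", "Female"] * 5,
})

# font_scale multiplies the font sizes of the chosen context
sns.set_context("notebook", font_scale=2)
label_size = sns.plotting_context()["axes.labelsize"]  # current, already scaled value

# Scatter plot plus regression line, with one color per value of the hue column
grid = sns.lmplot(x="total_bill", y="tip", data=tips, hue="sex")
ax = grid.axes.flat[0]
```

Changing font_scale to 3 and running the cell again is a quick way to compare the two sizes side by side.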
Now, coming back to our point: to modify the color scheme of this entire plot, you can use something called the palette argument. Let me type in a color palette such as coolwarm, and let me execute this. So this is what the coolwarm color palette looks like. But Seaborn just so happens to have a ton of other color schemes that you can use. So let us quickly open a new tab to discover all the other color schemes that are available in Matplotlib. All you have to do is type matplotlib colormap. Let me execute this and click on the very first link. This will take you to the page that has all the different color schemes available in Matplotlib, and these color schemes are separated into different categories. Every category has color schemes, and every color scheme has a string attached to it. All you have to do is remember the name of the string. For example, let us consider this very first color scheme. Let us go back to our Jupyter notebook. I'm going to replace coolwarm with the color scheme that we just saw, and let me execute this again. Similarly, you can try and play around with all the color schemes that we just saw. Let me show you another example. Now let us use the very first color scheme in the diverging colormaps. Let me erase this and replace it with its name, and let me execute this again. So this is another interesting color scheme. And this is how you can modify the size, color and aspect ratio of your Seaborn plots.
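To round this off, a rough sketch of the palette argument. The DataFrame is again a hypothetical stand-in for the tips dataset with tip as an assumed y column, and coolwarm is the only palette name taken from the lesson.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the tips dataset (made-up rows, same column names)
tips = pd.DataFrame({
    "total_bill": [10.0, 15.5, 20.0, 25.5, 30.0, 12.0, 18.0, 22.5, 27.0, 35.0],
    "tip": [1.5, 2.0, 3.0, 3.5, 4.5, 2.0, 2.5, 3.0, 4.0, 5.0],
    "sex": ["Male", "Female"] * 5,
})

# palette takes the string name of a color scheme
grid = sns.lmplot(x="total_bill", y="tip", data=tips, hue="sex", palette="coolwarm")

# The same string resolves to concrete colors, here one per hue value
two_colors = sns.color_palette("coolwarm", 2)

# The name comes from Matplotlib's colormap registry, so it can be looked up there too
cmap = plt.get_cmap("coolwarm")
```

Swapping the string for any other name from the Matplotlib colormap page should work the same way, as long as the spelling and capitalization match exactly.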