Transcripts
2. The Next Gen Language - Julia: Hi, and welcome to my Skillshare class on the Julia programming language. I'm Dr. Norman. I have a PhD and post-doc in applications of machine learning. I have more than 15 years of teaching experience and several bestseller courses online, including many related to machine learning and data science. I know where people struggle with these courses, since I've guided several thousand students in these areas. Data science and machine learning are fast-paced fields where you need to keep up with the latest trends. If you get stuck with the old tools, languages, or frameworks currently in use in your job, you will have trouble when you try to switch jobs or try to solve new problems. This is why you need to keep yourself updated with the latest tools and languages coming out in the fields of data science and machine learning. The hot new language that shows a lot of promise is the Julia programming language. Julia has been gaining acceptance in the industry very fast. Many people feel that it has the best of both worlds. It's very, very fast, comparable to the fastest languages such as C. And it's also extremely easy to learn, because its basic syntax is very similar to the popular Python language. Due to these two strengths, many feel that it will quickly become the language of choice, or at least a significant player, in the fields of scientific computing, data science, and machine learning very soon. In this course, you will learn the best features of Julia in the smallest amount of time possible. We take a concise approach towards learning that will allow you to learn the basic syntax of Julia, how it's different from Python, how to apply it to the concepts of data science, and last but not least, how its machine learning capabilities can enable you to supercharge your ML career. If you know Julia, you will stand out from everyone else who has worked with data science or machine learning.
Simply a mention of Julia in your profile will show potential employers that you are staying ahead of everyone else and know the state of the art. Knowing Julia would also allow you to work with the latest and greatest data science tools and machine learning models, and also create your own very, very easily. The best part is that there are no prerequisites to taking this course. If you know some basic Python, it will definitely help, but you will still get all of the content if you've never worked with it before. I will, however, assume that you've worked with some basic data science and machine learning models before, since we won't cover the machine learning theory in this course. But if you have worked with any data science tools or machine learning models before, you should have no trouble following the content. This is aimed to be a first course in Julia. It will give you the foundations you need to keep learning on your own even after the course. Everything is practical, no boring slides, very concise, and we'll get you up to speed in no time at all. So jump right in and let's start learning the Julia programming language, the next generation of machine learning and data science tools.
2. Installing Julia (Windows, Linux and MacOS) : Hi, welcome to the course, and thanks for joining. Before we start, we obviously need to set Julia up for our system. It's fairly straightforward, but I'm going to go through this just in case so that you can follow along. So we'll set Julia up for Windows, Linux, and macOS, and you can follow along depending on your own operating system. So let's start off with Windows. To get Julia, head over to julialang.org/downloads. I would recommend that you install the current stable release, which is currently at 1.6. Julia releases come out very frequently, so this might be different depending on when you're viewing this course. In any case, go ahead and download the Windows 64-bit installer. This is an executable file which you can go ahead and simply right-click and run, and it will set up the installer for you. The only thing that you should be doing over here is changing the installation directory. Do something simpler like C:\Julia160 or whatever you feel like. So something simple. Hit Next, this is going to show you some options, and just keep hitting Next and it'll set it up for you. So it's fairly straightforward. So we wait for just a minute while it sets it up. When it's done, simply click on Finish, and then you can head over to C:\Julia160 or wherever you set it up. In the bin directory, there is going to be an executable called julia. You can simply go ahead and run that, and it will show you the Julia prompt that you can see over here. So that's the only platform-specific stuff that you need to do for Windows. Let's head over to the installation on Linux and Mac, and then we'll come back to what you need to do over here. So if you're on Linux, you can simply go ahead and download the current stable release, version 1.6, for your Linux distribution. Typically this is going to be a 64-bit installation on x86. You can also download the ARM-based compressed file.
If you have your Linux set up on an ARM machine, you simply go ahead and download this and then extract the file. Your file name might be different depending on the version that you have or the architecture. But you just need to extract this file, cd into the newly created directory, and then say bin/julia. This is going to execute the Julia interpreter for you and you will arrive at the same prompt as you did for Windows. Similarly, if you are on the Mac, you can go ahead and download the macOS DMG file. This is fairly straightforward. You just click on this and it's going to set it up for you. There are a couple of things that you have to do for Mac. So you can click on this Help link and it will show you the two commands that you need to run to add Julia to the path. So you can say rm -f /usr/local/bin/julia, and then create a symbolic link using this command. You would obviously need to set up the correct directory over here. That's all there is to it. Once you have that, you can go ahead to your terminal in your Mac environment and simply say julia, and you will arrive at the Julia prompt. So irrespective of which platform you're working on, which operating system you have, you will arrive at a Julia prompt over here with which we can start working. So in the next video we're going to take a look at how we can set up the rest of the environment on any of these operating systems and then start working with Julia code.
3. Packages and Interactive Notebook: We installed Julia in the previous video. And irrespective of whether you're on Windows, Linux, or Mac, you will arrive at this prompt. This is the Julia REPL prompt. You can go ahead and enter any command over here. So x = 5, and you can say x, you can say println(x), and you can work with Julia over here. But this isn't very effective. What we want to do is arrive at a notebook-like environment, which you might be familiar with if you've worked with Python. We want to set that up so that we can experiment with our code and learn more effectively. In setting that up, we are also going to take a look at how packages are installed in Julia. For installing packages we have a built-in package which we can import using the using command. So you can say using Pkg. Pkg is the package that is used to install other packages in Julia. You can say Pkg.add and then give it the name of the package that you want to install. So this is going to be IJulia. IJulia is the package that allows us to interact with the Jupyter notebook. The Jupyter notebook itself is not part of Julia. It's going to come from Python, but we'll see how to install that in just a minute. So we say Pkg.add and, in double quotes, we say IJulia. This is going to go ahead and set up the IJulia package for you. This might take a little bit of time because it has to install and set up some of the stuff. So we'll skip ahead in time until this is done. Okay, so this is done. Your output might be slightly different because I've installed IJulia before. But unless you get an error, IJulia should be installed. Now there are other ways of adding packages. You can go into the package mode by hitting the closing square bracket. You won't see it when I enter it, but the prompt is going to change. So I'm going to hit the closing square bracket, and you will see that the prompt changes to pkg.
Now you can simply go ahead and say add IJulia, or whatever other package you want to install. If you do that here, it's already installed, so you're done with the package prompt. When you want the Julia prompt back, you can simply hit backspace and this will drop you back to the Julia prompt. Okay, so that is how you install packages in Julia. Straightforward. You don't have to handle the command line issues as you do with pip. Okay, so to recap, this is all we've done: using Pkg, then Pkg.add("IJulia"). Once IJulia is installed, we can start using this package with the using command. So using IJulia is going to allow you to import IJulia and start using its functions. We'll cover this in depth when we start working with Julia. But for now we're doing this just to set up our JupyterLab environment. So you say using IJulia and then you make the jupyterlab() function call. When you do this function call, you will get a prompt that will ask you whether you want to set up a new environment, a new Python environment for Julia. The way this works is Python is going to provide the Jupyter notebook environment for you and everything else is going to be in Julia. But the environment, the IDE that you work with, is going to come from Python. Julia does have its own environment, but it's slightly glitchy right now because people are working on that. So we're going to stick with the Jupyter notebook environment that you might be more familiar with. So all you have to do is say jupyterlab() and it's going to give you a prompt. I'm not going to get that prompt because I've already set up JupyterLab. But the prompt is going to ask you if a new environment should be set up for Julia. Make sure that you hit y so that you don't run into any conflicts. So when you hit enter, you get a prompt, you say y, and then it's going to go ahead, download Anaconda, set up a basic Python environment for you, and then show you the JupyterLab environment.
You can go ahead and download the resources for this, and it will give you a notebooks folder that is going to have all the IPython notebooks that we are going to work with. So you can go ahead and check to make sure that your Julia package is correctly installed. You can go to 000-sanity-checks.ipynb. s is equal to "hey" in double quotes, so that's going to be a string. And if you say s^3, with the caret or hat, it should give you heyheyhey. This caret is going to ensure that you are working with Julia and not with Python. Okay? And then you can say versioninfo() and you will see what the current version is. I'm working with 1.5.3, but 1.6.0 has just come out and you should be able to work with that as well. There aren't any major changes from 1.5 to 1.6, but since Julia is evolving really quickly, there might be other changes down the line, and we'll cover them as we progress. So once you're here, you can see all the notebooks over here. And as we progress with the course, you can go ahead and look at the individual files that we're working with. You can code along or look at the notebooks that I've provided. I would highly recommend that you type the code yourself so that you can get a thorough understanding of this. If you want to follow along, you can click on Plus, create a Julia 1.5.3 or Julia 1.6.0 notebook, and code along. So let's move to the actual core of the course now and start working with the Julia language.
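As a recap of this section, here is a minimal sketch of the setup steps and the sanity check. The installation lines are left commented out because they need a network connection and only have to run once; the variable name repeated is an illustrative choice, not something from the course notebooks.

```julia
using Pkg                      # Pkg is Julia's built-in package manager

# One-time setup (needs a network connection), so it is commented out here:
# Pkg.add("IJulia")            # same effect as typing `add IJulia` at the pkg> prompt
# using IJulia
# jupyterlab()                 # may offer to install Jupyter via Conda if none is found

# Sanity check that the notebook kernel is really Julia
s = "hey"
repeated = s^3                 # ^ repeats a string in Julia
println(repeated)              # prints heyheyhey
println(VERSION)               # version number of the running Julia
```

Running versioninfo() as in the video prints a longer report that also includes platform details.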
4. Basic Syntax, Variables and Operations: Now that we've set up our environment, we can go ahead and start working with the actual language. We'll take a look at the basics of the language and then move on to the more advanced topics. You can follow the tutorial by typing all this stuff on your own in a new notebook, or you can follow along using the notebooks that are provided with this course. So let's start off with the basic definition of a variable. We can go ahead and say awesome_var = 25. So that defines a variable. As you can see, the variables are dynamically typed so you don't need to define the types, but they do have a type and we'll take a look at that in a minute. You can take a look at the value of the variable using just awesome_var, so that gives you 25. Now one important thing that immediately differentiates Julia from almost any other programming language is that variable names themselves can be in Unicode. For instance, in Python, you can define any variable x to hold a string which has Unicode in it, right? So we're using a Unicode emoji in this case. But the variable name itself in Python cannot be an emoji like this. Here it's very possible to do this. So let's go ahead and run this. As you can see, this is just a variable which holds this string. And if you try to output the value of this variable, it works perfectly fine. Alright, so this seems weird. The way to type this is to go ahead and say backslash, colon, smi and hit Tab. That is going to give you a pop-up and you can select whichever smiley you want. So for instance, if you need this one, you hit Enter and then you hit Tab again, right? So Enter. It seems weird, but you're almost never going to use smileys in your notebooks, right? But in general, Julia does use Unicode variables. For instance, over here we have sigma, and if you want to type sigma, you can say backslash sigma and hit Tab and it's going to turn it into the letter σ.
And because Julia aims to translate mathematical models into code while remaining visually similar, this is very helpful, right? So for instance, if you have a mathematical formula that says σ = 2.5, you can write that exactly as σ = 2.5, instead of having to say s = 2.5 or having to write sigma in full. So it's really useful, and it's the general practice in the Julia ecosystem to actually use Unicode variable names. Okay, so we'll come back to this again and again, so forget about this for a minute. This is how you type it. I'll cover this again. So backslash, colon, smi, and then here you hit Tab, you'll get a drop-down. You hit Enter, that completes it, and then hit Tab again and Enter, and that inputs it. It seems like a lot of work for emojis, but you're not going to use emojis in code. The Latin and Greek symbols are really easy. So you say backslash sigma, hit Tab, and you're done. Okay? Anyway, so we have defined this variable name over here. And if you want to take a look at its type, we can say typeof of this cat, and it's a String. Now, two things to notice over here. One is that this variable is just a variable, it's just the name of the variable. It happens to be an emoji. The other thing is that all variables have a type. Always, they're going to have a type. Julia is somewhere in the middle of Python and C++: it is going to have very strong types, and types have hierarchies in between them, but you don't have to explicitly specify them unless you're doing optimization of the code. Okay, we'll come back to this when we get to the machine learning part. You can say typeof('a'), and this is a character. Now, notice the difference between double quotes and single quotes in Julia. In Julia, strings are defined using double quotes, and single quotes are used for single characters, just like C, okay? typeof("a") with double quotes is a String. We can define σ = 2.5 as we just discussed.
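A small sketch of the variable definitions described above. The string stored in the emoji variable is an assumption for illustration; the video does not spell out its contents.

```julia
awesome_var = 25          # no type annotation needed; the type is inferred
println(awesome_var)

# Variable names may be Unicode. The emoji is entered through the
# \:smi + Tab completion menu in the REPL or a notebook.
😺 = "smiley cat!"        # assumed contents, for illustration only
println(😺)

σ = 2.5                   # typed as \sigma followed by Tab
println(σ)
```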
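To make the points about types concrete, the following sketch queries a few of them. The calls to supertype and the subtype operator are additions for illustration of the type hierarchy and are not shown in the video.

```julia
σ = 2.5
@show typeof(σ)            # Float64
@show typeof("a")          # String: double quotes make a string
@show typeof('a')          # Char: single quotes make one character
@show typeof(25)           # the default integer type of the machine

# Types form a hierarchy (illustrative extras, not from the video)
@show supertype(Float64)   # AbstractFloat
@show Float64 <: Number    # true
```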
And now you can go ahead and pass this to a function. For instance, we have a square root function. We can say sqrt(σ^2), and that is obviously 2.5. Another thing to note over here is that Julia uses the caret symbol, this little hat symbol, for exponentiation. So σ squared is using the caret and not two asterisks, as is the case with some other languages. We can print using the println function. This is also a built-in. You can say "σ = ", then σ, and you get the value out. This is println for printing a line, so it's newline terminated. Okay? If you need help with any built-in or another function, you can type a question mark followed by the function name and you'll get detailed help regarding that. Okay, so it's giving you some examples and all the things that you can do with it. We also have integers. An integer by default is Int64. We also have 32-bit integers. We'll come to that when we do type conversions. We also have real numbers. Those go into Float64 by default, and Float32 is also available. You can do multiple assignments. I understand that I'm going a little quickly, but hopefully this will be very clear for you because this is really basic stuff. If you have any questions, please do feel free to ask. So we can have a, b = 10, 20. 10 goes into a and 20 goes into b. We also have another datatype, which is the literal types. So you can say :4, and this is going to be an Int64 because :4 is the same as 4. But :b is not the same as b. It is not a string and it is not a character; it is what is called a symbol, right? So if you try to print b out, if you say println(b), it's going to print something else entirely, right? So :b is not the ASCII b, not the string b, not the Unicode b. It's just a symbol which you can use. It has certain specific purposes that we'll get to when we try to apply these concepts.
Just take note of this syntax: you can have a colon and then a string of characters. So you can say :best, and that will be a symbol. And you cannot say :best = 34, right? That is an error. So :best itself is not a variable. It's an actual value, if that makes sense. Just as the digit 4 is a literal, :best is now a literal as well. Okay? Okay, let's go ahead and do some basic arithmetic. We can print, and we have addition, subtraction, multiplication, division, and modulus. They do what you would expect them to do. So 3 plus 5 is 8, 3 minus 1 is 2, 3 into 4 is 12. The outputs run together because we don't have any spaces in between them. We have to put spaces in like that to separate them. Okay, so println does not insert spaces in between its arguments as print does in Python. We can output strings, so there is no gap in between these two things. We can pass it a list. A list in Julia is written just like it is in Python, so comma-separated values in square brackets, and that is going to output this over here. Notice one very important thing over here: 3 and 5 are both integers, and when you add them together, you get an integer out. But this is saying 8.0. The reason for that is the lists that you create in Julia are going to be typed. The whole list is going to have one particular element type. Julia does what is called type promotion. So if you have any float in the list, everything is going to be converted into a float. This 3 / 5 leads to 0.6, which is a float, and because of that, the whole list is converted into floats. An easier example for that is [1, 2, 4.0]. The type of this is going to be an Array, which is what Julia calls lists. In the array, the type of all the elements is Float64, and it's a one-dimensional array, right? So it's one-dimensional, as you can see, and the datatype of all the elements is Float64. That is how Julia works. It gives a specific type to a list.
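The operations from this part of the lesson can be sketched as follows. The names root and sym are illustrative and do not come from the course notebook.

```julia
σ = 2.5
root = sqrt(σ^2)          # ^ is exponentiation, so this is sqrt(6.25)
println("σ = ", σ)        # println ends the line with a newline

a, b = 10, 20             # multiple assignment: a gets 10, b gets 20

sym = :best               # a Symbol: neither a String nor a Char
@show typeof(sym)         # Symbol
@show (:4) == 4           # true: quoting a number just gives the number
@show typeof(:b)          # Symbol, unrelated to the variable b above
```

Typing ?sqrt at the REPL or in a notebook cell opens the built-in help for sqrt.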
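A short sketch of the arithmetic and the type promotion just described. The names results and mixed are illustrative choices.

```julia
# println does not insert spaces between its arguments, so add them yourself
println(3 + 5, " ", 3 - 1, " ", 3 * 4)     # prints 8 2 12

# One float (3 / 5) in the literal promotes every element to Float64
results = [3 + 5, 3 - 1, 3 * 4, 3 / 5, 3 % 5]
println(results)
@show eltype(results)      # Float64

mixed = [1, 2, 4.0]
@show typeof(mixed)        # a one-dimensional array of Float64
```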
And the reason for that is it improves the performance by a lot. Okay? Just something to keep in mind. print does the same thing as println, except it does not output a new line at the end. 4 / 2 over here is a division of two integers, and it gives you a float. So 3 / 5 is a division of integers that results in 0.6. But 4 // 2, with the double slash, is actually a fraction. So this is a Rational, which is built into Julia. So 4 // 2 is the fraction 2//1. You can actually add fractions and get the results as actual fractions, right? So there is not going to be any precision loss, because this is an actual rational number and you have built-in support for rationals in Julia. We can go ahead and convert the datatypes among each other. So I can convert an integer to an Int64, and 1 goes into that. I can convert it into a Float64, so 1.0. I can try to convert 1.3 into an Int64, and that is going to give me an error because this 0.3 cannot be discarded. You'll either have to ceil it first or round it first and then go ahead and convert it. Okay? So that is the basic conversion. We do have to do a lot of different conversions that we'll get to later on in the course. We have Boolean operators. So we have two values over here, one that is true and one that is false. And is written using two ampersand signs, so true and false is false. Or is written using two pipe symbols, and not is written using the exclamation mark. So basic stuff. We also have string operations. We have the name, John, over here and the age over here. We can output them using interpolation. Wherever you have a dollar sign and parentheses in your string, they are going to be interpolated and the value of this variable is going to be inserted over here. So the customer's name is John, who is 35 years old, right? That is called string interpolation. It's very useful. We also have string concatenation. Concatenation typically is done using plus, but here it's using the multiply operator. There is a reason for that.
We don't want to get into that, but this is a conscious decision to use the asterisk instead of plus. So that does the concatenation. You can also join multiple strings using the join function. This is a built-in function, and you can give it a list of strings and a delimiter. So it's going to join s1, a space, s2, a space, and so on, right? So s1, space, s2, and it's going to return a string to you. You can create the list outside and give it to join, and this works as well. We've already seen string repetition. Instead of multiply, we use the caret or the hat operator, right? So you can go ahead and do that. This repeats the same string three times. So these are the basic data types that we have, and we're going to obviously use them in detail later on in the course. But just to give you a pro tip, if you are working in the notebook, you can right-click and click on New Console for Notebook. This gives you a console over here, which is based on this notebook. You can drag it out over here to the side, and now you can do all the experimentation over here. So you can say 4. And this is going to go away after you close the console, so you don't have to mess up your notebook over here. So this is really good for experimentation and just taking a look at what's going on. All right, so let's get rid of this. In the next video, we are going to take a look at how these variables can be combined using control structures such as conditionals and loops.
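The following sketch walks through the rationals, conversions, Boolean operators, and string operations discussed here. The variable names, the greeting string, and the try block that captures the conversion error are assumptions added for illustration.

```julia
@show 4 / 2                  # 2.0: dividing integers with / gives a float
@show 4 // 2                 # 2//1: an exact Rational
@show 1//3 + 1//6            # 1//2: no precision loss

@show convert(Float64, 1)           # 1.0
@show convert(Int64, round(1.3))    # 1: round first, then convert

# Converting 1.3 directly fails because the .3 cannot be discarded
conversion_failed = try
    convert(Int64, 1.3)
    false
catch err
    err isa InexactError
end
@show conversion_failed      # true

@show true && false          # false
@show true || false          # true
@show !true                  # false

name, age = "John", 35
println("The customer's name is $(name), who is $(age) years old")
greeting = "Hello, " * name  # * concatenates strings
println(greeting)
```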
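As a quick sketch of join and string repetition, using made-up strings rather than the ones in the course notebook:

```julia
s1, s2, s3 = "Julia", "is", "fast"     # illustrative strings

joined = join([s1, s2, s3], " ")       # list of strings plus a delimiter
println(joined)                        # prints Julia is fast

words = [s1, s2, s3]                   # the list can also be built first
same = join(words, " ")

cheer = "hey "^3                       # ^ repeats a string
println(cheer)
```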
5. Control Structures, Iterations and Ranges: Let's take a look at some control structures, so conditional branching and loops. Most of the stuff you'll already be familiar with if you've worked with any programming language before. But we'll go through this just to make sure that you understand the syntax of Julia. We can define variables and then go ahead and do some conditionals based on those variables. So let's say if x is greater than 20, we print this line, and we need to end the if block using the end statement. So this is the syntax. You start a block and you end it with the end statement. You don't need to say endif, it's just end. Some differences from Python: you don't need a colon over here, and the indentation is ignored by the interpreter. Obviously, you should still do it, right? If you work with Java or C or C++, a difference from those is that you don't need to have parentheses around the conditions in the if statement, okay? So you can do that, and that, as expected, prints x is greater than 20. We can also have elseif. To demonstrate that, let's do the usual FizzBuzz example. Given a number n, we print Fizz if n is divisible by 3, Buzz if it's divisible by 5, and FizzBuzz if it is divisible by both 3 and 5. If it's not divisible by either, then we print just n itself. So if n is fully divisible by 3 and n is fully divisible by 5, we print FizzBuzz. Then elseif. So that's the syntax in Julia, elseif instead of the elif of Python. Elseif n is fully divisible by 3, we print Fizz, and if it's divisible by 5, Buzz, and otherwise we print n. So that's your typical if, elseif ladder. If we do that, we get Fizz for 3, we get FizzBuzz for 30, and we should get the number 13 itself for 13, okay. There's also a ternary operator, so you can use it, but typically it's good practice to just avoid it. So here's the syntax. You have x equal to 3 and y equal to 6. We want to assign the value of the larger number to z.
So if x is greater than y, then the return value of this whole statement is going to be x. Otherwise it's going to be y. So here's the syntax: you have the condition, then the ternary operator, which is the question mark, then the value if the condition is true, a colon, and the value if the condition is false. So if we run that, we get 6 for 3 and 6. And if you change this to 13 and 6, then we'll have the output as 13. Okay, hope that made sense. That's all there is to it for branching. So if you have the understanding of if, elseif, that's all you need for Julia, okay? We can also have iterations. The basic while loop has the exact same syntax, so while and end. Here you put the condition, and this is the increment of the iteration variable. So you can do that, and i goes from 0 up to 9. Okay? So basic stuff, nothing important going on over here. You can also loop over a list. So nums is equal to 1 through 9. You can say for var in nums, no colon over here, just as with if. And you can say println(var). So it goes from 1, 2, 3, 4, 5, 6, 7, 8, 9. And obviously it depends on what values you have over here, okay? So if the value that you have over here is 80, it gets printed out. So fairly straightforward stuff. You can also go from 1 to 10 using this syntax. If you're familiar with Python's range function, this is what it does. So for var in 1:10, it goes from 1 to 10, both included. This is slightly different from the Python syntax. Python stops just one value before the ending point. Julia goes right up to the end point, so both are included. It depends on your preference. For me, this is slightly more understandable if you are coming from a mathematical formula, because math uses one-based indices. So it makes a lot more sense over here if you're coming from that, which is essentially what Julia targets. If you look at just 1:10 itself outside of the actual loop, this is just 1:10. So this is very similar to Python's range. You can say typeof(1:10). So this is a UnitRange.
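The branching examples from this section can be sketched as below. Wrapping FizzBuzz in a function named fizzbuzz is a choice made here so the three cases can be tried in one go; the video writes the if, elseif ladder directly in a notebook cell. The name z for the result of the ternary is also an assumption.

```julia
# if / elseif / else ladder; no colons, no parentheses, closed by `end`
function fizzbuzz(n)
    if n % 3 == 0 && n % 5 == 0
        return "FizzBuzz"
    elseif n % 3 == 0
        return "Fizz"
    elseif n % 5 == 0
        return "Buzz"
    else
        return string(n)
    end
end

for n in (3, 30, 13)
    println(n, " -> ", fizzbuzz(n))
end

# Ternary operator: condition ? value_if_true : value_if_false
x, y = 3, 6
z = x > y ? x : y
println(z)                 # prints 6
```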
Once you loop over it, then it turns into this guy over here. Basic stuff. We are going to come back to this again and again when we actually use them, I just wanted to go over the basic syntax upfront so that you're comfortable with the use cases when you get to the real-world use of these basic building blocks.
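Here is a compact sketch of the loops and ranges from this section, shortened to a few iterations. The global keyword is only there so the snippet also runs as a standalone script; in a notebook or the REPL it can be dropped. Names such as nums and r are illustrative.

```julia
i = 0
while i < 3
    println(i)
    global i += 1          # `global` is only needed when run as a script
end

nums = [1, 2, 3]
for num in nums            # no colon at the end of the line
    println(num)
end

r = 1:10                   # both end points are included
@show typeof(r)            # a UnitRange
@show first(r), last(r)    # (1, 10)
@show collect(1:5)         # materializes the range as an array
```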
6. Data Structures in Julia: Lists/Arrays, Tuples, Named Tuples : Let's do something a little more interesting. In this video we are going to take a look at the data structures that are built into Julia. Again, I'm going to draw some parallels with Python so that you find it easier to understand. But even if you haven't worked with Python before, that's perfectly fine. This should make sense regardless. So let's go ahead and clear this first. Let's first create a basic collection. This is going to be a list, or what Julia calls an array. Okay? So we have this collection over here. You can see that this is a four-element array, and the type of each element within that array is a 64-bit integer. So it's an array of Int64. You will notice that Julia has strict typing for arrays. You can always have different types within the array, but the type of the array itself is going to be defined according to certain rules, which we'll take a look at right now. If you define the same list and change 5 to 5.0, it's going to convert this into Float64. So now all of the elements within the list that you provided are converted or promoted to Float64. The reason for that is if you have lists or arrays which hold elements of the same type, it speeds up the performance by a lot. So Julia by default converts everything into Float64 if you have just one Float64 in your collection, array, or list. Okay? So even though we had integers over here, they have been converted into floats, and Float64 is the default. This is going to create some problems when we get into machine learning, but we'll cover this when we get there. Okay, just keep this in mind. Float64 is the default. You can also have completely different types, so you can have fractions. You will see that because everything is being promoted to a float, this fraction is also promoted to a float and it turns into 1.66667. Okay? What if you tried to use something that cannot be converted into a different type?
So you have integers and a string. Neither of them can be promoted to the other. So what happens is Julia creates a list and the type of all the elements is now Any. Any is at the top of the hierarchy of Julia types. So you have Any and then you have numbers and strings and characters. Any essentially means you can have any type. This is going to be a problem if you use it in actual computation, because this is going to really slow down your code. So it's much better to actually define what kind of data you are going to hold in a list and then work with that specific data. Heterogeneous data in a list isn't really all that useful anyway. Okay, so let's go ahead and do some operations on these lists. We'll define our list as 1, 2, 4, 5. So it's an Int64 array, four elements. Obviously, we can append something to this collection by using the append! built-in function. You will notice that it ends with a bang. This exclamation mark at the end is Julia's convention for naming functions which modify their arguments. So you are passing it the collection, which is a list or an array. I'm going to use both of these terms interchangeably. You are passing it this array, and it is going to be modified by the append! function. So by convention, this append! function has an exclamation mark at the end. Okay? There is nothing special about the exclamation mark from a syntax perspective, it's just a convention. So we're going to append 60 to this collection. Once we do that, you can see that collection now has the 60 at the end. Okay, let's reset this and try to access the first element. If you come from Python or C or Java, you probably would be doing collection[0]. It is extremely important that Julia does not have zero-based indices. So collection[0] is invalid in Julia. Julia uses one-based indexing. So the first element is at index number 1, right? There are reasons for this. Don't get upset about this. It's just how Julia does things.
If you use MATLAB, that uses the same convention, okay, so there are reasons for doing this. Let's not go into that. So we have this, and you can always do slicing. You can go from 1 to 5. This again is different from Python. 1:5 means start from the first element, go all the way up to the fifth element, and both of them are included, right? So collection[1:5] is 10, 20, 30, 40, and 50. Okay. Many languages stop just before the end index. So you have to make sure that you get comfortable with this when you're using Julia. You can also start from a location and go all the way up to the end using the end keyword. So if you say 5:end, it starts from the fifth element. And because this is one-based indexing, we actually start from the fifth element and we go all the way up to the end, okay? You can omit both of these. So you have just the colon, and essentially this whole thing creates a copy of the array that you have. If you do collection[:4] and you're coming from Python, you might be thinking that this is going to start from the start and go up to the fourth element. But that is not the case over here. If you recall, in the basics we saw that you have what are called symbols. Over here, :4 essentially is just the integer 4. So if you put :4 over here, this means just 4. So if you do collection[:4], this gives you the single element at index 4. This is kind of confusing for people coming to Julia from another language. If you want to get all the elements from the start all the way up to the fourth element, you have to do 1:4. Okay? So it's just a matter of getting used to the syntax. Make sure that you understand it, okay? Also, you cannot have negative indices. If you want to count from the end, you can do end - 1 and that will give you the same kind of thing. It depends on who you ask. Some people say this is better, and obviously Python people say that their way is better. You get used to it really quickly. It's not all that troublesome.
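The element type rules and the first array operations can be sketched like this. The array contents are illustrative and may differ from the values typed in the video.

```julia
ints          = [1, 2, 4, 5]
floats        = [1, 2, 4, 5.0]         # one float promotes everything
with_fraction = [1, 2.5, 5//3]         # the Rational is promoted to a float too
mixed         = [1, 2, "three"]        # no common type, so the element type is Any

@show eltype(ints)            # the default integer type
@show eltype(floats)          # Float64
@show eltype(with_fraction)   # Float64
@show eltype(mixed)           # Any

collection = [10, 20, 30, 40, 50]
append!(collection, 60)       # the ! signals that collection is modified in place
@show collection
@show collection[1]           # indices start at 1; collection[0] is a BoundsError
```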
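A short sketch of the slicing rules covered above, on an illustrative six-element array:

```julia
collection = [10, 20, 30, 40, 50, 60]

@show collection[1:5]        # elements 1 through 5, both ends included
@show collection[5:end]      # from the fifth element to the last
@show collection[:]          # a copy of the whole array
@show collection[:4]         # NOT a slice: :4 is just 4, so this is element 4
@show collection[1:4]        # the first four elements
@show collection[end - 1]    # second to last; negative indices are not allowed
```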
You can obviously go ahead and change the individual elements. So you can say collection[1] = 99 and that is going to change the first element. If you want to look at the values of different things, you can use the @show macro. So this is called a macro. If you write @show collection, it prints collection = followed by the whole array, and then it outputs the actual value as well, right? Typically when you're trying to debug stuff, you write @show collection and then you add a semicolon at the end. What that means is that it still does the actual showing, so you'll get the printed line, but the value of the collection itself is not echoed back, so you will not get the second output. Okay? So if you say @show collection; you get just that one line. This is going to make more sense when you output multiple values, which we'll do in a little while. Okay, let's try to create a copy of the collection using copy_of_collection = collection. If you do that, you get copy_of_collection, and it looks like a copy. Now change the first value of copy_of_collection to 100, so the 99 changes to 100. Okay? But the problem is that if you look at collection now, the value of its first element has also changed. The reason is that the assignment does not create a copy; it just creates a new variable that points to the same list. So in RAM you have this list sitting somewhere. collection is pointing to it and copy_of_collection is also pointing to the same list. So if you change this 99, both collection and copy_of_collection are going to change. What we actually want is to create a copy of the whole list, and in order to do that, you have to use the built-in function called copy. So let's create a second copy, and this time we use the copy function. Then we change the first element of second_copy to 9999 and output all three using the @show macro.
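The difference between plain assignment and copy can be sketched as follows. The starting values are illustrative, and the variable names follow the ones spoken in the video as closely as they can be made out.

```julia
collection = [99, 20, 30]            # hypothetical values

# Plain assignment: a second name for the very same array.
copy_of_collection = collection
copy_of_collection[1] = 100

# copy() allocates a new array with the same elements.
second_copy = copy(collection)
second_copy[1] = 9999

# The trailing semicolon suppresses the echoed return value in a REPL or notebook.
@show collection;
@show copy_of_collection;
@show second_copy;
```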
You will see that second_copy has changed. collection remains the same, and so does copy_of_collection, because second_copy is now completely decoupled from our original collection. Okay, that makes sense. There is also the issue of deep copies, but we'll leave that for more advanced videos; when we need it, we'll see what a deep copy is. So those are the basic operations that you can do with arrays or lists. A data structure that is very similar to lists or arrays, but which is immutable, is the tuple. You create tuples using parentheses, or round brackets. So you say collection = (1, 2, 3, 4). And you can have any data type in a tuple, so you can have "Julia", "Python", "C", "Java", and you use one-based indexing just as before. You can try to modify it, but because it's immutable, it's going to give you an error. It's a MethodError: no method matching setindex!, and so on and so forth. What this means is that you cannot set a value at an index, which means you cannot modify the tuple. And that is because tuples are immutable, right? If you look at languages afterwards, it remains the same. You can also have named tuples, which are kind of like a merge between arrays and dictionaries. Python also supports something similar. If you haven't seen this before, it has a very simple syntax. You can say tools = (language = "Julia", ide = "Pluto", explorer = "Perseverance"). If you do that, you can access elements of this tuple using numeric indices, so tools[1] is "Julia", the first element. And you can also write tools.language, so you can use the dot operator to access particular values, right? So this is kind of like a merge between lists, dictionaries and enumerations. They're not used very often, but when they are, you should be aware of this syntax. The syntax is fairly straightforward. You have parentheses instead of square brackets, and you can name the elements.
So language is equal to Julia and so on and so forth. Okay? So those were the two basic data structures. The third one is dictionaries that we'll take a look at in the next video.
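As a quick recap of tuples and named tuples, here is a minimal sketch. The field name ide is an assumption about what was said in the video, and "Rust" is just a placeholder value for the failed assignment.

```julia
languages = ("Julia", "Python", "C", "Java")
println(languages[1])               # one-based indexing, as with arrays

# Tuples are immutable: assignment raises a MethodError (no setindex! method).
assignment_failed = try
    languages[1] = "Rust"
    false
catch err
    err isa MethodError
end
println(assignment_failed)

# A named tuple: elements can be reached by position or by name.
tools = (language = "Julia", ide = "Pluto", explorer = "Perseverance")
println(tools[1])
println(tools.language)
```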
7. Dictionaries (Maps), Symbols in Julia: If you've worked with Python, you will already be very familiar with dictionaries. They're very common, very prevalent in Python. If you've come from Java, C++ or C#, they are called maps or hash maps, okay? They are a collection of key-value pairs. Let's start off with the syntax. You say d = Dict(...), and then you define the keys and the values, which are separated by the arrow operator =>. So key => value, comma, key => value, and so on. You can have multiple key-value pairs in one dictionary, okay? There is no shorthand for dictionaries in Julia as there is for arrays. You just have to define them using Dict. Okay? So you do that, d = Dict(...), and you have this dictionary over here in which we have strings as keys and strings as values. It has two entries, as you can see. You access it as usual using square brackets, so you can say d["language"]. d.language does not work. It does in some languages, but it doesn't in Julia. You can change a value or even add new values. You can say d["explorer"] = "Perseverance" and that is going to change d. You can also modify, or mutate, the object using the pop! function. From the previous video, you will recall that the bang means it's going to modify d. So if you call pop! on d with the key we just added, it returns that value and removes the entry from the dictionary as well. So that has been removed from the dictionary. Okay? You can merge dictionaries together. So if you have d, which is this one over here, and you have another dictionary, e, in which you have os and language, then we have language both in d and in e. If you merge them together, again, merge! is written with a bang, which means it's going to modify d.
So by convention, d is going to be modified and e is going to remain the same. What it does is take d, go through e, and insert all the values from e into d. So e is the source and d is the destination. Anything that is in both is going to be taken from e. It makes a lot more sense if you just take a look at the example. If you merge d and e, the language is now Java, because Java was in e, and e essentially overrides everything that is in d. If you show d, you can see that language has been replaced. If you show e, it is unmodified; it remains the same. Okay, so merge! is going to modify d: it takes everything from e and throws it at d, and if d already had that key, it's going to be overwritten. You can loop over the entries of the dictionary using for (k, v) in d. And you can output them using string interpolation. So you can print "key = $k", a tab character, then "value = $v", and this is going to output all of that for you. Okay? You can also use symbols as keys in dictionaries. So you can define a dictionary in which :lang maps to "Julia" and :ide maps to "Pluto". Here the keys are symbols instead of strings, and you can access them using symbols just as you could with strings. So you index with :lang and you get the value out. Okay? The rest of it is exactly the same. So you can say for (k, v) in that dictionary and print it out to get lang and ide, right? So those are the basics of dictionaries. Obviously there is a lot more you can do with dictionaries, but we don't want to cover that here. This is the basic syntax that you need. That is all there is to it. Once we start using them in the real world, you'll get a lot more practice with them. For now, let's go back to arrays and take a look at the specific functionality that Julia provides for arrays, which is what makes Julia special. So let's do that in the next video.
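The dictionary operations from this video can be sketched like this. The concrete keys and values (in particular "Linux" for os, and the variable name sym) are assumptions made for the example.

```julia
d = Dict("language" => "Julia", "ide" => "Pluto")

# Add a new entry, then remove it again with the mutating pop!.
d["explorer"] = "Perseverance"
removed = pop!(d, "explorer")

# merge! modifies d; entries from e win when a key exists in both.
e = Dict("os" => "Linux", "language" => "Java")
merge!(d, e)

# Loop over key-value pairs and print them with string interpolation.
for (k, v) in d
    println("key = $k \t value = $v")
end

# Symbols work as keys just like strings do.
sym = Dict(:lang => "Julia", :ide => "Pluto")
println(sym[:lang])
```

Dictionaries do not guarantee any particular ordering, so the loop may print the entries in a different order than they were inserted.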
8. Arrays, Matrices, Tensors, Reshaping, Helper Functions: As you know, one of the most important areas in which Julia really shines is numerical computing and machine learning. The reason for this is the excellent built-in support for arrays, matrices, and tensors in Julia. So let's go ahead and take a look at that. We're going to define a list as we did before. These are called arrays in Julia. If we do that, we get an array of type String which is one-dimensional and has three elements. We can take a look at the type of this variable to see that it is indeed an array which holds elements of type String and that it is one-dimensional. So this is a 1D array of string values, okay? Similarly, we can define a 1D array of integer values. This is going to be an Int64 array with one dimension only. We can go ahead and do the usual operations on it, such as push!, which adds an element at the end, and pop!, which removes the last element and modifies the actual array. Right? So let's go ahead and try to turn this into a 2D array. We can do that using the usual syntax of a list of lists. So this is a list, and each element in it is itself a list, right? Let's go ahead and do that. This gives us an array, and the type of each element in that array is an array itself, right? That is how you read this. It's a three-element array and each element is itself an array of integers. So this one element over here is this array of integers, which is one-dimensional. Okay? You can take a look at the type, which gives you the same information. Now notice that this is not a 2D matrix. This is an array of arrays, which means that I can go ahead and change the size of one of the inner arrays, and this still works, right? The size of each inner array can be different because its type is still Array{Int64,1}. Okay? So what if we want to create a 2D matrix, or a 2D array, instead of an array of arrays? The syntax for that is slightly different, but very similar.
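To make the array-of-arrays point concrete, here is a minimal sketch. All values and variable names are illustrative; newer Julia versions print Array{Int64,1} as Vector{Int64}, which is the same type.

```julia
langs = ["Julia", "Python", "C"]       # 1D array of strings
fib = [1, 1, 2, 3, 5, 8, 13]           # 1D array of integers

push!(fib, 21)                          # add an element at the end
last_value = pop!(fib)                  # remove it again, modifying fib

# An array of arrays: the inner arrays may have different lengths.
nested = [[1, 2, 3], [4, 5], [6]]

println(typeof(langs))
println(typeof(nested))
println(length.(nested))                # length of each inner array
```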
So we say x = [1 2 3]. Notice that there are no commas between the elements. So before we had commas, and here there are no commas: x = [1 2 3] rather than [1, 2, 3]. And you will notice that the type is now a 1×3 array, so it's a one-by-three array, each element is an Int64, and the number of dimensions of this array is now two, right? So notice the difference. This is extremely important: the earlier one is an array of arrays, and this one is a 2D array whose size is one by three. So this is a true 2D array with a single row. Now, we can go ahead and transpose this, and that turns it into a column vector, so to say. You will see that the type is something really weird. It's still an array of Int64 which is two-dimensional, but it has been wrapped in another data type. We don't want to go into that just now. We'll come back to this when we need it. We can take this a step further and build a full 2D array. So we can have the rows 1 2 3, then 4 5 6, then 7 8 9. And notice that we are leaving out the commas between the different rows as well, right? So no commas at all. That is going to turn this into a matrix, or a 2D array of Int64, right? So this is a three-by-three array. What that means is that if I try to make one row longer, say 7 8 9 12, this is not going to work. The number of columns in each row must match, right? It has to be equal for all of them. That is how a matrix works. Okay? So let's go ahead and change that back. This is now a matrix, which is a 2D array, each element of which is an Int64. If you change one of these to a float, everything is going to turn into a float. So now this is a Float64 array. All the elements are now floats. Okay, we've seen this promotion before. Now let's go ahead and take a look at the shapes of arrays and a little bit about how reshaping works, because this is going to be very important later on when we do machine learning with Julia. So we can call size(mat).
And this is going to tell us, as a tuple, that this is a three-by-three matrix. We can also take a look at the size of the x that we created, which was essentially a row vector. So this is one by three. And we can ask for the size of fib, which is this one over here. fib is an array which has seven elements, right? So you will notice the difference. This is not one by seven, this is just seven, okay? Because this is a one-dimensional array, and it has one dimension with seven elements. With x we have a single row, but it's a two-dimensional one-by-three matrix, or what we would typically call a row vector. This is a slight distinction, and when you do machine learning it is either going to cause headaches or really make your life easy if you understand it. Okay, so go ahead and revise this so that you understand the difference between the two. Okay? So let's define a matrix X which has the elements 1 to 12, row by row. This is right now a four-by-three array, so four rows and three columns. We can go ahead and reshape this into a two-by-six matrix, right? You will notice the result goes 1, 4, then 7, 10, then 2, 5, then 8, 11, then 3, 6, and 9, 12, right? That is how the reshaping works. It converts the four-by-three array into a two-by-six array according to this rule: it reads down the first column from top to bottom, then moves right to the next column, and fills the new shape in the same order. We can reshape this to a 12 by 1. So it goes 1, 4, 7, 10, 2, 5, 8, 11, 3, 6, 9, 12, right? So starting at the top left, first it goes down, then it goes right and down again, then right and down again. That is how reshaping works. Okay, we can go ahead and reshape back to four by three. And the way these rules work is that if you reshape something and then reshape it back, the rules simply work out to produce the same matrix for you. So if you reshape this into a four by three, it's going to create the first column as 1, 4, 7, 10.
Then the second column is 2, 5, 8, 11, and the last column is 3, 6, 9, 12, which is exactly the same thing that we started with over here. So the reshaping really works out. So why is this reshaping necessary? Sometimes you have a model that expects a specific shape and you have your data in another shape. You don't want to modify the model; you want to modify your data to fit the shape, right? Again, when we get to the machine learning part, we'll see how this works out. Right now we're just putting the building blocks in place. So finally, let's go ahead and take a look at some helper functions for working with matrices. We have the rand function, which here creates a four-by-three array, or matrix, filled with random numbers between 0 and 1, okay? Each element is uniformly distributed between 0 and 1. We can save this in a variable, over here mat, and then try to create a copy of it with a plain assignment to mat2. And if you recall from the last lecture, if you then modify mat2[1, 1] (that is how you access the individual elements within a matrix, row comma column), mat2 is changed and so is mat, right? You will recall this is the same problem of sharing the memory: with the assignment you did not create a copy, you simply created a new reference variable which points to the same matrix. If you want to actually create a copy, which is sometimes necessary, you can go ahead and use the copy function as before. If you do that, changing mat3[1, 1] changes mat3 but does not change mat itself, okay? So mat remains the same as before, and mat3 we can now change freely. So you can use copy if you want to create an actual copy of your whole data. Obviously if you have something like a million-by-million matrix, you don't want to create a copy. You want to somehow optimize your code. Anyway. Just as with other modern languages, we have a way of creating lists or arrays using comprehensions, which are a shortcut for creating lists.
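The shapes and the reshaping order described above can be checked with a short sketch. The matrix contents follow the walk-through (the numbers 1 to 12 laid out row by row), while the variable names wide, tall and back are made up for the example.

```julia
x   = [1 2 3]                      # 1×3 matrix (two-dimensional)
mat = [1 2 3; 4 5 6; 7 8 9]        # 3×3 matrix
fib = [1, 1, 2, 3, 5, 8, 13]       # one-dimensional array with 7 elements

println(size(x))                    # a tuple with two entries
println(size(mat))
println(size(fib))                  # a tuple with a single entry

X = [1 2 3; 4 5 6; 7 8 9; 10 11 12]   # 4 rows, 3 columns

# reshape reads the data column by column and fills the new shape the same way.
wide = reshape(X, 2, 6)
tall = reshape(X, 12, 1)
back = reshape(wide, 4, 3)

println(wide)
println(back == X)
```

Reshaping to a different shape and back again reproduces the original matrix, because both steps use the same column-by-column order.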
If you have not seen comprehensions before, it's a very simple syntax. We can say i for i in 1:5, inside square brackets. This creates a list by going from one to five and collecting a value for each of the elements. It's a lot more difficult to explain than to show, so I'll show it to you. It creates a five-element array. It goes from one to five, and each time it produces i, right? You can go ahead and change this to i squared, and this creates the list of squares for you. You can go all the way up to ten, and this is going to square the numbers from one to ten. Okay, so that is how this works. You can also go ahead and do a nested loop: for i in 1:5, j in 6:10. This essentially runs over every combination of i from one to five and j from six to ten. If you produce (i, j) and read the result row by row, the first row is (1, 6), then (1, 7), then (1, 8), all the way up to (1, 10); the next row is (2, 6), (2, 7), all the way up to (2, 10), and so on, right? So this creates a five-by-five array, because there are five values of i and five values of j. The first row goes (1, 6), (1, 7), (1, 8), (1, 9), (1, 10), and the second row does the same with 2, right? So this is a Cartesian product. It's a very handy way of doing simple manipulations to create matrices. Right now this is a five-by-five array in which each element is a tuple, and within that tuple we have Int64, Int64 values, right? So a Tuple{Int64,Int64} is one element of the array, and the array itself is five by five. I'll repeat that, so make sure that you understand this. We're slowly building up the type system of Julia.
We're building an understanding of this type system so that when you see very large, complicated data types, you understand them. All right, so once again, this is a five-by-five array, because this is a five-by-five matrix. Each element itself is a tuple. So this is that tuple. The array itself is two-dimensional because it has rows and it has columns. And within the tuple there are two elements. The first one is an Int64, and the second one is also an Int64. As before, we can go ahead and change each individual element to be something else. So this can be (i squared, j). Same thing, but now the first entries go 1, then 4, 9, 16, 25. We can also say j plus 1, or whatever function you want to use, right? So 7, 8, 9, 10, 11. You can also divide by two using the rational operator, and all of this works out, right? So you can have fractions. Now you have a five-by-five array in which each element is a tuple. The first element of that tuple is an Int64, and the second is a Rational of Int64, right? So now this is getting really complicated. Make sure that you pause and understand this. Okay? Finally, a couple of operations. We're going to create a random array which is three by three, but instead of between 0 and 1, each element is going to be between 10 and 20, okay? So you have these Int64s, and they are between 10 and 20, both inclusive. You can also get floats by changing the 10 to 10. with a trailing dot. The 10. means start from the float 10.0 and go all the way up to 20, and the type promotion system in Julia converts everything into floats. So the first one is an array of Int64, and this one is an array of Float64, okay? Instead of random values, we can provide a specific value using the fill function. So we can call fill to get a three-element array in which every element has the exact value 10, right? So x is going to have three 10s over here, okay? Now that we have this A and this x, we can go ahead and multiply them together.
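The comprehension and helper-function examples so far can be sketched as below. Variable names such as grid and fractions are made up for the example, and the ranges follow the ones mentioned in the walk-through.

```julia
squares = [i^2 for i in 1:10]                 # one-dimensional comprehension

# Two iterators separated by a comma give a two-dimensional result.
grid = [(i, j) for i in 1:5, j in 6:10]

# Any expression can go in front; // builds a rational number.
fractions = [(i^2, j // 2) for i in 1:5, j in 6:10]

A = rand(10:20, 3, 3)        # random integers from 10 to 20, both included
B = rand(10.0:20.0, 3, 3)    # same range written with floats gives Float64 elements
x = fill(10, 3)              # three elements, each exactly 10

println(squares)
println(typeof(grid))
println(typeof(fractions))
println(x)
```

Reading grid row by row gives (1, 6), (1, 7) and so on in the first row, then (2, 6), (2, 7) and so on in the second.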
As for the syntax, because this is built into Julia, you can simply say A * x and this is just going to work out perfectly, right? This is obviously a matrix-vector multiplication. So you have the typical rule: the first row multiplied by this vector, the second row multiplied by this vector, and the third row multiplied by this vector, to give you the final result, right? So this works really well. We can also do basic operations like the transpose. You can say A' (A prime) to get the transpose. So the row 10, 15, 12 becomes the column 10, 15, 12. Rows into columns, columns into rows. That's the transpose. You can also do A' * A. And because this is a very common operation, Julia provides a shortcut: you can even omit the multiplication sign. You can say A'A, and this is going to do the same operation. This is very typical of Julia. Julia strives to make sure that its syntax matches the actual mathematics as closely as possible, because that helps people translate research papers and research work, which is in mathematics, into Julia code, right? So your Julia code is going to look very similar to the actual mathematics that is in research papers and in state-of-the-art work. You can compute the trace. If we call tr(A) right now, we get an error that tr is not defined, because we need the LinearAlgebra package for that. So we say using LinearAlgebra. Recall that if this gives you an error, you can do Pkg.add("LinearAlgebra") as we did in the very first video of this course. Okay? So now you can take the trace of A, you can calculate the determinant very easily using the det function, and you can calculate the inverse, obviously only if it's possible to compute the inverse, okay? You can solve linear equations directly. For instance, you have A, which is a random matrix. You have an x, which we already have. You can say b = A * x, so that creates b. And if you recall, if you have the equation A x = b, you can take A and b and solve for x, right?
So you can say A \ b (A backslash b) and it gives you the actual values, okay? As you can see, it's very close. It isn't exactly 10 because of floating-point arithmetic in computers, but this works out really well, right? So you get x back. If you don't understand these linear equations, or if you don't remember your linear algebra, it's fine. You don't really need it for this course or even for machine learning, but I just wanted to cover it to let you know that this is available in Julia, right? Finally, we've seen two-dimensional arrays, but you can go further and create tensors, which are higher-dimensional matrices, so to say. So you can have row, column, depth, and then something else, right? So you can have a four-dimensional array as well. Right now this is a four-by-three-by-two array. So these are three dimensions, which you can picture as one array on the screen and, coming out of the screen, another piece of paper. Think of these as two different pieces of paper lying on top of each other. On the first paper this part is written and on the second paper this part is written. So this is now a three-dimensional structure. Instead of a two-dimensional structure which can be written on one sheet of paper, you now have a three-dimensional structure. This is called a tensor, which is where the name TensorFlow comes from, by the way. So these are tensors. If you're not familiar with tensors, we'll get to them when we get to machine learning. But for now, it's sufficient to say that you can create any number of dimensions using this syntax, right? So now you have a four-by-three-by-two-by-five array, right? So a four-dimensional array. You can see a lot more on this in the LinearAlgebra documentation. On this link you have all the functionality that is provided with the LinearAlgebra package in Julia. This is built in; you don't have to install anything for it. So if you want to explore this further, you are welcome to go ahead.
For our purposes, this is all we need. And when we start doing data manipulation and machine learning, we'll come back to this and go through this once more. In the next video, we are going to take a look at some final thoughts about different data types. Those types aren't really used in machine learning and data science, but I'm including them here for the sake of completeness.
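As a recap of the linear algebra helpers and higher-dimensional arrays from this video, here is a minimal sketch. The video uses a random matrix; a small fixed matrix is used here instead (an assumption made purely so the numbers are reproducible), and the names gram, x_solved, T and T4 are made up for the example.

```julia
using LinearAlgebra

# A fixed, invertible matrix chosen for illustration (the video uses random values).
A = [2.0 1.0 0.0;
     1.0 3.0 1.0;
     0.0 1.0 4.0]
x = fill(10.0, 3)

b = A * x            # matrix-vector product
At = A'              # transpose (adjoint) of A
gram = A'A           # same as A' * A; the * may be omitted here

println(tr(A))       # trace: sum of the diagonal
println(det(A))      # determinant
println(inv(A))      # inverse, which exists because det(A) is nonzero

x_solved = A \ b     # solve A * x = b for x
println(x_solved)    # close to 10.0 in every entry, up to floating-point error

# Higher-dimensional arrays (tensors): just pass more sizes.
T  = rand(4, 3, 2)
T4 = rand(4, 3, 2, 5)
println(size(T4))
```

Because of floating-point rounding, compare results like x_solved with ≈ (isapprox) rather than with ==.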
9. Data Type Details, Casting Among Types: So this is a very short video, and even though we don't strictly need it, I hope you do go through it, because there is one very small concept here that might be useful for you later on. We've seen the basic data types. We have Int64, and we have floats, which are Float64, right? The type of this value is Float64. You can convert it into a 32-bit float, which is now a Float32 and is displayed with an f0 suffix. Okay? We can also have complex numbers. So we can define a complex number c = 1 + 3im. im is a built-in constant which marks the imaginary part of a complex number. If you're not familiar with complex numbers, don't worry about this. It is just here for the sake of completeness. We don't really need them for machine learning. If you go into things like physics simulation, you might need them, but then you would already know what a complex number is, okay? So the type of c is a Complex in which each part is of type Int64. You can also have floats here, so that each part of the complex number is itself a float. The way to create that is to say c = Complex{Float64}(1 + 3im). So this defines the type, and you're essentially calling a constructor of this type and giving it the actual value 1 + 3im. That is going to create 1.0 + 3.0im. Both the real and imaginary parts are now floats instead of integers. Okay, so this is a Complex whose parts are Float64, right? You can have a = [1, 2, 3, 4]. These are going to be Int64. But if you are sure that you want them to be floats instead of integers, you can write Float64 in front of the square brackets, and this is going to create a Float64 array. Same values, but now it's going to be a Float64 array. So this is a shortcut very similar to the constructor call above. The complex version would be a = Complex{Float64}[1, 2, 3, 4, 5]. This creates an array of 1 to 5.
And each element is going to be a complex number in which both the real and imaginary parts are Float64s. If you do that, you will see 1.0 + 0.0im as the first element. So this is an array in which each element is a Complex. Even though we provided only the real part, Julia creates the imaginary part itself, and both the real and imaginary parts are floats because of the Float64 over here. Okay. So I hope that makes sense. If not, please do ask in the Q&A and I'll try to clarify further. We can also check whether a value is or is not of a particular data type. So we can use isa and ask: is this value an Int64? Obviously it's not, it's a float. If you ask whether it is a Float64, it will say true. If you ask whether it is a Float32, it says false, because by default all floats on my machine are Float64. Okay, so I hope this made sense. In the next video, we are going to start putting these things together into functions that can solve specific problems for us.
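The conversions and type checks from this video can be sketched as follows. The starting value 4.0 and the variable names are assumptions made for the example.

```julia
f = 4.0                      # Float64 by default
f32 = Float32(f)             # converted to a 32-bit float, displayed with an f0 suffix

c  = 1 + 3im                       # complex number with integer parts
cf = Complex{Float64}(1 + 3im)     # same value with Float64 parts

a  = Float64[1, 2, 3, 4]               # typed array literal: elements become Float64
ca = Complex{Float64}[1, 2, 3, 4, 5]   # each element gets a zero imaginary part

println(typeof(f32))
println(typeof(c))
println(cf)
println(ca[1])

# isa checks whether a value belongs to a type.
println(isa(f, Int64))
println(isa(f, Float64))
println(isa(f, Float32))
```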
10. Defining Functions, Overloading, Multiple-Dispatch: We've already used functions in the previous videos. Now let's go ahead and define functions using the basic syntax. The semantics of most of this will already be clear to you, but we're going to see how Julia approaches it. Alright, so let's first define a basic function. For that you use the keyword function, and you end the function body using the keyword end, right? That's how Julia defines blocks. Within the function you can have multiple lines. Indentation is not significant for blocks. One important difference to note here is that we don't necessarily need a return statement. If we write x * x as the last line of the function, then when the function is called, the value of x * x is automatically returned. You have the parameter x over here, and we can go ahead and call this function. So let's first define it, and then call it using square(4). This is going to return the value 16. So you don't need the return keyword here, but it's still good practice to include it just to make your code a little more readable, okay? It depends on who you ask. Some people prefer explicit returns, and some people say it should be written without the return, but personally I prefer to have a return statement over here, okay? So we have the square function. You will notice that we don't have a data type on the parameter, so it's implicit. If you have a function hello(name) and we simply print the value in it, we can go ahead and call hello with a string parameter. So it's going to print "hello there world". We can also pass it an integer. This is going to work perfectly fine, because name can hold any data type. We can even create a matrix using the rand function and pass that to hello, and that is still going to work.
So anything will print as long as it can be converted into a string, right? So this works perfectly fine. This is probably stuff that you're already familiar with. We're just looking at the syntax that Julia provides for it. But now we get to some of the things that are different in Julia. For instance, you can define one-liner functions using this syntax: the function name and the parameters in parentheses, then an equals sign, and then the one line of function body whose value the function returns, right? It's much easier to just demonstrate this. So half(x) = x // 2. If you run that, you get a function back, which is called half and which takes one argument. So half(5) is going to return, obviously, five over two, which is a rational number. So the value of the body is returned to you. This is very useful when you have one-liner functions, and it makes your code look very similar to mathematics, which, as we saw in the previous video, is one of the objectives of Julia's syntax. Whenever we have functions which try to modify their arguments, by convention we append an exclamation mark to the function name. You've seen these before. Let's take a look at this again, just for the sake of completeness. So we have an array named v over here. We can go ahead and call the sort function on it. It's going to return a sorted array, but you will notice that v remains the same. The reason is that sort does not mutate the variable that you pass to it, okay? However, there is a variation of sort which ends with a bang, and that is a mutating function, which means that if you pass v to this sort!, it's going to actually modify v. This is very pervasive in Julia. Whenever you have functions which end with a bang, they are going to mutate, or they may mutate, your variables. Okay? So that should be fairly straightforward, okay? We can also have overloaded methods.
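The function syntax covered so far can be summarised in a short sketch. The function names follow the ones heard in the video as closely as they can be made out (square, hello, half), and the array values are illustrative.

```julia
# Block form: the value of the last expression is returned; return is optional.
function square(x)
    return x * x
end

# No type annotation on the parameter, so any printable value works.
function hello(name)
    println("Hello there, ", name)
end

# One-liner form; // builds a rational number.
half(x) = x // 2

println(square(4))
hello("world")
hello(42)
hello(rand(2, 2))
println(half(5))

# sort returns a new array; sort! modifies its argument in place.
v = [3, 1, 2]
sorted = sort(v)
untouched = (v == [3, 1, 2])   # v is still in its original order here
sort!(v)
println(v)
```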
This is slightly different from what you might be familiar with if you come from Python. It is very common in typed languages such as Java, C++ and C#. So let's explain this using an example. You have a function called show_number, and it's going to print the value of the integer that you pass to it. Because I've just told you that this function is going to work on an integer, we can declare the data type of the parameter as Int64, right? So a 64-bit integer. The syntax for that is x::Int64. That means that this x can only hold an Int64, right? A 64-bit integer. If you try to pass it anything else, this is not going to work. So let's try that. You define the function and you get the function back. You can call it using show_number(65). This works perfectly well, but if you try to pass it a float, this is not going to work, because a float is not an integer. It cannot go into x, because x has to be an Int64. Julia does not convert the argument for you here, right? That's very important. Method matching looks at the actual type of the argument and does no automatic promotion or conversion, so the same would be true the other way around: if we had declared x as a Float64 and tried to pass it an integer, that would also fail. So if you try to do this, you'll get an error, and it's worth taking a look at the error right now. The error says MethodError: no method matching show_number(::Float64). What this says is that you're trying to call a function named show_number and you're trying to pass a float to it, and there is no method named show_number available that accepts a float, right? So what we can do is go ahead and define such a function. Now we are going to define another show_number, and it is going to have the same number of arguments, which is just x.
Except this time, the datatype of x is going to be a Float64, right? So we can define this function, and now we can call show number with an integer and show number with a float as well. And you will notice that these are two completely different functions. So this body has nothing to do with this body even though the names of both functions are the same. Okay? So the first one is printing this, and the second one is printing this, but they have the same name. The way this works in Julia is you have what are called generic functions. So show number is a generic function and it has two methods, right? One takes a Float64 and the other takes an Int64, right? This is called function overloading. If you come from Java or C++ or C or any of the statically typed languages, you would know what function overloading is. So this comes in very handy when you try to apply a function such as plus on different data types. So for instance, adding two matrices has different logic. Adding two vectors has different logic. Adding two complex numbers has different logic. So all of this logic you don't have to worry about. All you would do is call the function, and the methods are going to be automatically matched by Julia. So the reason for doing this is that it speeds up the different function calls, because you can specify different logic for the different types of data that you can have. And that makes the life of the compiler very easy. And that in turn means that your code is going to run much faster, right? So this is how internally Julia speeds things up. And this is very important, but it's a slightly advanced topic, so we're not going to go into more detail. I'll just show you the methods that are available for the generic function plus. So plus is simply a function. Nothing special going on over here. So plus can work on two floats, it can work on two 16-bit floats.
It can work on a Complex{Bool} and a Bool, it can work with a Complex{Bool} and a Real, it can work even with two missing values. And there are a huge number of things that plus can do depending on the datatype. And whenever you write plus, for instance, 2 plus 4, one of these is going to be called, and there is a hierarchy going on. So if you have a function which does not have a particular method for a datatype, a method at a higher level of the type hierarchy can be called. So for the moment, that's all that you need to understand. We're going to take a look at the specifics of how these are used in machine learning and data science when we get to that, right? So let's go ahead and take a look at some of the other concepts, such as default values. So let's again explain this using an example. So for instance, if you call log of 8, right? So by default in mathematics, whenever you say log, you mean natural log, so log to the base e. If you don't know what logs are, it's not really important. It's just a mathematical function that is very useful in a lot of things. But for the moment, all you need to understand is that this function actually has a hidden parameter in it, right? So log of 8 to the base of what? That "what" by default is e, right? So the mathematical constant e. But you can change that. For instance, in computer science, we are usually concerned with log to the base two. So you can call log of the same number 8 but to the base two, log(2, 8), and this is going to give you three back, right? So essentially if you raise two to the power three, you get eight. If you raise e to the power 2.07, you get roughly eight, right? So that's how it works. So what we're trying to do over here is understand how the same function can work with a different number of parameters. In the previous example, we saw how you can have a method for different datatypes.
And here we are taking a look at different methods for a different number of parameters. Okay? So that is how this works. These are called default values. So for that, let's take an example where we are going to pass a collection or a sequence of numbers to this collection log function. And it's going to calculate the log for all of these numbers, right? So the syntax is fairly straightforward. You're going to pass it a collection, and you're going to pass it a base. And that base by default is going to be two, right? In the body you compute log(base, i) for i in collection. So for all of the values that are passed to it, you calculate the log to the given base, right? So let's run this. So we have a function, we can call this, we can say collection log of 4, 8, 16, and to the base 10. So this is going to return roughly 0.6, 0.9, 1.2. We can go ahead and omit the base. And this is going to now use the default value of two. So if you've come from Python, you will be familiar with these default values. But in Julia there are slight differences, and that is what we're going to take a look at now. For instance, if you have base equal to two over here, you cannot have a parameter that does not have a default value coming after it, right? So all the parameters which have a default value have to go at the end, right? So you have to write collection comma base equal to two, as we defined over here. You cannot have them the other way around. If you try to do that, you're going to get an error: optional positional arguments must occur at end. So it's fairly straightforward. However, if you do need optional parameters at the beginning, you can separate your positional arguments and your keyword-based arguments. So you can have base equal to two before the semicolon, and this is going to be a positional argument, and this, collection, is going to be a keyword-based argument, so you can call this. This works perfectly fine. Notice the difference here. There we have a comma, here we have a semicolon. So these are now two different types.
These are going to be positional, so you can pass them without having to refer to their name. But this collection over here now has to be passed using its name, right? So if you try to do this, this is going to give you an error. It says keyword argument collection is not assigned. You might be expecting that base would keep its default, so base is going to take on the value two, and the only argument that I'm passing over here would go into collection, but that's not how Julia works. The way this works is anything after the semicolon has to be explicitly specified. So you have to say collection is equal to 4, 8, 16. And now this is going to work perfectly fine. You can also pass ten over here. So this is going to go into base. So everything works out perfectly well. I personally would not recommend that you try to do this in your own code. But I've covered this here because you are going to see other people's code which is going to use this syntax. So you should be aware of this minor inconvenience of going from one language to another. So that's all you need to understand for now. As before, once we've covered all the syntax, we are going to take a look at applications of this in data science and machine learning. And this will become a lot more clear. Okay? So in the next video we are going to take a look at another type of function which is very pervasive in Julia. And you might not have worked with them before. Even if you have, we have to take a look at how Julia approaches them. So let's take a look at that in the next video.
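The ideas from this video (one-liner functions, the bang convention, several methods for one generic function, default values, and keyword arguments) can be put together in a short sketch. The names halver, show_number, collection_log, and collection_log_kw are assumptions chosen for illustration and may not match the spelling used on screen; the return strings are likewise made up.

```julia
# One-liner function: name(parameters) = body
halver(x) = x / 2

# sort returns a sorted copy; sort! (with a bang) mutates its argument
v = [3, 1, 2]
sorted_copy = sort(v)   # v is still [3, 1, 2] at this point
sort!(v)                # now v itself is sorted

# Two methods of one generic function, chosen by the argument's type
show_number(x::Int64) = "integer: $x"
show_number(x::Float64) = "float: $x"

# Optional positional argument: parameters with defaults must come last
collection_log(collection, base = 2) = [log(base, i) for i in collection]

# Keyword argument: anything after the semicolon must be passed by name
collection_log_kw(base = 2; collection) = [log(base, i) for i in collection]

println(halver(5))
println(show_number(65), " / ", show_number(65.0))
println(collection_log([4, 8, 16]))                      # default base 2
println(collection_log([4, 8, 16], 10))                  # base 10
println(collection_log_kw(10; collection = [4, 8, 16]))  # keyword by name
```

Calling collection_log_kw([4, 8, 16]) without naming the keyword would raise the "keyword argument not assigned" error mentioned above, since the array would be taken as the positional base.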
11. Anonymous Functions (and their importance), Splatting and Slurping : In the previous video, we started working with functions and took a look at some of the basic stuff. Now let's go ahead and take a look at another type of function, which is called an anonymous function. These are functions which do not have a name. So what good is that? All right, so let's start with an example. So these are very close to what mathematical functions are. So you pass it a parameter x and it's going to square it, add two x, and subtract one. So this is simply an expression that the function is going to return. The problem with this is once you run it, the function is created and then it's gone. Because it's an anonymous function, by definition, it does not have a name. That is why it's called anonymous. So because it does not have a name, now you cannot call it. Alright, so what good is that function then, if you cannot call it, right? So we'll get to why we need this in a minute. But one way to call this is with these parentheses. This whole stuff in parentheses defines the function, and then you call it just as you normally would, right? So this is going to be a function which takes in one argument, and that argument is going to be five. So five is going to go into x. So this is going to be 5 squared plus two into five minus one, right? So this is going to be 34. So 25 plus 10 minus 1 is 34, right? But why do we need this? Let's see an example and then this is going to make sense. We're going to take a look at the use of anonymous functions as well as a couple of other concepts in this very simple example. So for instance, we have a list of numbers over here, right? So we have the numbers. What we want to do is we want to filter out some of the values. For instance, we want to keep only those values which are not multiples of three, right? Or we want to get rid of the values which are not multiples of three, right?
So we want to, let's say, keep 6 and 9 and get rid of everything else, or we want to get rid of 6 and 9 and keep everything else, right? So we want to apply some sort of a filter. So we already have a built-in function called filter with a bang, filter!. You would remember from the previous video that this means it's going to mutate our collection. How is it going to mutate that? It's going to take the collection, then it's going to apply some function to each element in that collection, right? So the filter function is going to keep only those that satisfy the criterion, right? So our criterion is going to be not a multiple of three. So this function is going to have to be defined. So we're going to define a function, not multiple of three. This function takes an argument and returns true if that argument is not a multiple of three, right? So if three does divide it fully, then it is a multiple of three. If it does not divide it fully, then it's not a multiple of three. If it's not a multiple of three, we return true, right? So it's slightly difficult to get your head around it. But think about this and this will make sense, right? So we have a function that returns true if the number that you pass to it is not a multiple of three, right? So that is the function. Now we can call this filter function on these numbers. So once we do that, we get 1, 2, 4, 5, 7, 8. Three wasn't there anyway, six goes away and nine goes away, right? So the way this works is you take one, you pass it to this function. One modulo 3 is not equal to 0. It returns true. And that means this is going to stay in the list. Then you pass 2, 4, 5. When you pass six to it, six goes over here. This returns false and that means it's going to be dropped from the resulting list. Okay, so that's how this works. So you've seen filter, now you understand how this works. The problem that we are concerned with in this video is not filter. What we are concerned with is this function, not multiple of three.
We wanted it just for the sake of this filter. And now our filter call and this function are separated from each other. That is one problem because now you have to go in and take a look at what this not multiple of three is. The other problem is this not multiple of three is going to remain in your memory and you're not even going to use it again. So what we want to do is we want to not have this function sitting over here. We are going to create this as an anonymous function so that it's created, and then once we've used it, it goes away, right? So that is how we're going to do this. Let's go ahead and rewrite the filter. So these are our numbers, the same numbers as we had before. And now over here we are going to pass not a function name, but the function itself. So this is going to do the exact same thing. It's going to take nums, and to each of the elements it's going to apply this function. And this is the exact same function as we had over here. The only difference is that it does not have a name. It takes an argument and it returns true if that argument is not divided fully by three, right? So it's the exact same function except now it does not have a name. So once you do this, you get the exact same result, right? So 1, 2, 4, 5, 7, 8. The good thing with this now is that the semantics of our filtering, the stuff that we are trying to filter out, are in the same line. And this function is going to go away and it's not going to clog up our memory, right? So this is really useful and it's very pervasive in scientific computing. In functional programming, people do this very often. So you're going to come across anonymous functions again and again. And at first glance, this looks really complicated. But once you get used to it a little bit, this is a really good way of writing your code because it brings your logic, your constraints, and the manipulation that you are trying to do into a single line, right? So this is really useful.
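The two versions of the filter described above can be compared side by side. This is a minimal sketch: the list contents are inferred from the spoken results (1, 2, 4, 5, 7, 8 kept, 6 and 9 dropped), and the name not_multiple_of_three is an assumed spelling.

```julia
nums = [1, 2, 4, 5, 6, 7, 8, 9]

# Version 1: a named helper that is only ever used for this one filter
not_multiple_of_three(x) = x % 3 != 0
kept_named = filter(not_multiple_of_three, nums)   # non-mutating, returns a new array

# Version 2: the same predicate passed inline as an anonymous function;
# filter! (with a bang) mutates the array it is given
nums2 = copy(nums)
filter!(x -> x % 3 != 0, nums2)

# An anonymous function can also be called on the spot by wrapping it in parentheses
result = (x -> x^2 + 2x - 1)(5)

println(kept_named)
println(nums2)
println(result)
```

The copy is only there so that both versions can run on the same starting data; in the video the mutating filter! is applied to the list directly.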
So that's the whole concept that we wanted. Let's take a look at something slightly cooler in Julia. So what we can do is, because not equal to, this operator itself, is defined as a function, we can give it another name. So let's have the same numbers again. And we are going to define this guy over here, the not-equal-to symbol ≠, to be equal to not equal to, !=, right? So this is going to be really weird. I'm going to say this again. The symbol ≠, we are defining it as !=, right? So it's the same thing. How do you write this? So you say backslash ne, and then you hit Tab, and it's going to get converted automatically by the Jupyter notebook into the ≠ symbol, right? So you run that, and you get an error over here because ≠ has already been defined in Julia's Main. In some of the previous versions of Julia, this was not defined, so we could define it like this. But the point is this ≠ now is the same as this guy over here. So you can go ahead and rewrite your anonymous function as x maps to x mod 3 ≠ 0. It's the exact same thing except now this is a lot closer to the mathematical definition that we typically see in research papers. So this is going to again work exactly the same. The only difference is in the visual appearance of this versus this. So this is more code oriented and this is more mathematics oriented, right? So that was a side note. Let's go ahead and finish this video by taking a look at another operator which is slightly lesser known because it's kind of unique to Julia. And it really creates problems for people who move to Julia from other languages. So this is called the splatting and slurping operator. Really weird to say this, but very easy to explain using an example. So let's say you have a print vals function over here that takes three values and just outputs them, right? So if you try to call print vals 1, 2, 3, that works just as expected. But what if your values are in a list, right? So we have vals in a list.
You can pass it to print vals and use vals and three dots. So dot-dot-dot, just three dots. What this is going to do is it's going to splat the vals array, essentially creating three different parameters. And these are going to go into a, b, and c, right? So this works just like that, right? So if you try to pass it just vals, it's going to give you an error because you are trying to pass a single array to this print vals function, which it does not accept. It needs three arguments. But if you do this splatting, that essentially unwinds the array and creates three arguments, and those are passed to the function, right? You can also do the opposite of this. For instance, if you have add vals and this expects a collection, but you have the values in different variables. So if you try to do add vals with separate values, this is not going to work. But you can add dot-dot-dot over here to the parameter. And once you try to do this with apple, banana, strawberry, these are going to be slurped together and a single collection (a tuple) is going to be created that is going to be passed to the collection variable. So this works perfectly fine. Now, I do not recommend that you try to do this yourself when you're just getting started with Julia. But you are going to come across these operators, so you should be aware of what they mean. You can actually go ahead and use this right now. So if you say a, b is equal to 1, 2, 4, this is going to go there. So a takes the value of one, b takes the value of two. It's expected that in a future version of Julia you might have a, b... is equal to 1, 2, 4, so a is going to hold one and b is going to hold 2, 4. But for right now, if you try to do this, you're going to get an error, right? So this does not work. But you have to be aware that if you try to pass three things onto two variables, this is going to be different from what you might be comfortable with in Python, right?
So that's all the differences that you have when you're coming to Julia. So this was a lot of syntax. I do recommend that you go through these two videos again so that you are comfortable with this. In the next video, we are going to take a look at how some of these things combine to facilitate functional programming.
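The ≠ operator, splatting, and slurping from this video can be tried out in a few lines. This is a hedged sketch: print_vals and add_vals are assumed spellings of the functions named in the video, and their bodies are made up for illustration.

```julia
# ≠ (typed as \ne followed by Tab) is the same function as !=
is_not_multiple(x) = x % 3 ≠ 0
filtered = filter(is_not_multiple, 1:9)

# Splatting: vals... unpacks a collection into separate arguments
print_vals(a, b, c) = println(a, " ", b, " ", c)
vals = [1, 2, 3]
print_vals(vals...)          # same as print_vals(1, 2, 3)

# Slurping: collection... gathers separate arguments into one tuple
add_vals(collection...) = collect(collection)   # collect turns the tuple into an array
fruits = add_vals("apple", "banana", "strawberry")

# Assigning three values to two names: the extra value is simply dropped
a, b = 1, 2, 4

println(filtered)
println(fruits)
println(a, " ", b)
```

Calling print_vals(vals) without the three dots raises a MethodError, because a single array does not match a method that expects three arguments.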
12. Functional Programming, Broadcasting - Most Important Concept in Julia: In the previous video, we took a look at some basics of functions in Julia. In this video, we are going to take a look at three very important concepts. Two of these are not very often used, but they're still very important for you to understand. The third one is critical to Julia and the way machine learning and data science are structured in Julia. So this is going to be a very important video and hopefully it's going to be fairly straightforward. It's not complicated, but it's very useful. Okay, so the concept is functional programming. And the idea is that you have functions which act on data. And while they're acting on it, they are not affected by the environment. So you can imagine that you have, let's say, one gigabyte of data, which can be divided into five thousand or a million different parts. And each invocation of the function, or each function call, is going to work on one part only. And it's not going to be affected by how the function works on other parts. So this can massively parallelize your code. And that is why functional programming has been gaining traction in the recent past. So Julia takes that a step further. And in this video we're going to see how this works, okay? This is completely transparent. We just have to look at how Julia does things. And it's going to do all the parallelization for us automatically. Okay? This is going to be very useful when you get to machine learning. But here we are just taking a look at the basic syntax. So the three things are, number one, map. So map is fairly straightforward. If you haven't seen this before, let's say you have a list and you want to apply a function, which is the square function, to each element individually, right? So all you have to do is call map. Give it a function that you want to apply to each element and give it a collection of all the elements that you want to apply the function on.
So what that does is it goes ahead and applies this function on each individual element and collects the results in another array. So it's essentially mapping one collection to another collection. Obviously you can save this into a variable, so squares. And then you can go ahead and show this. So squares is equal to 1, 4, 16. So map therefore is a higher-order function. A higher-order function is a function that takes in another function as its argument, right? So this function is being passed to map. So map itself is a higher-order function. So there are other examples. We've seen an example of this before, so that was filter in the previous video. These three, map, reduce, and filter, combined make up like 90 percent of all functional programming, right? So this is very important. And that's all there is to it. Map applies the same function on each individual element of a collection. That's it. Reduce does something similar with a minor change. So let's say you have the same collection, 1, 2, 4 again, and you reduce it with plus. So what that does is, as the name implies, it's going to take the first two elements and it's going to apply this function on them. Whatever result is produced, it's going to take that and 4 and apply the same function again. So essentially it takes the whole collection and reduces it to a final result, right? So it's much easier to demonstrate this. So let's just go ahead and show this. So what plus does is 1 plus 2, that becomes 3, then 3 plus 4, that becomes seven. So if you have, let's say, 10 over here as well, this is going to be one plus two is 3, 3 plus 4 is 7, and 7 plus 10 is going to be 17. So that is the result that we get out of it. So these seem like very basic concepts, but they have been shown to essentially be able to do a wide variety of things that we want to do with machine learning and data science. So you might have heard of Hadoop, that is a massively parallel processing architecture.
And the architecture on top of which Hadoop stands is the MapReduce architecture, or the MapReduce framework, and this is where that name comes from. So just these two things, combined with filter and a couple of other parts of functional programming, are extremely popular, extremely powerful. In this course, we are going to be concerned with them just for simple case studies. We're not going to change our whole framework towards MapReduce. So this is all you need to understand for this course. Okay? The third thing, which is very essential for this course, and we are going to come back to this again and again, is the concept of broadcasting. So again, the concept of broadcast is fairly simple. The first example is going to be very similar to map. So let's say you have the collection 1, 2, 4 and you have a function called f, which simply returns the square, right? So it's exactly the same thing we've done before. So broadcast f on collection essentially means take f and apply it on each individual element of the collection. So this is going to do the same thing as map, right? So 1, 4, 16, except there is a minor difference. First, the collection does not change, obviously. So this creates a new collection for us. Just keep that in mind. Broadcasts are so useful that they have a dedicated syntax in Julia. And that is, you take the function that you want to broadcast or send to each element, you append a dot to it, and then give it the collection on which this function should be applied. So f.(collection) is the same as broadcast(f, collection). Okay? You will notice that you cannot do f(collection), because if you try to do that, collection is going to go into x and the body is going to try and square the array, and the array cannot be squared. So if you try to do f(collection), it's going to give you an error, no method matching the squaring of an array. So you cannot square arrays, right? So you see the difference between these two.
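The three building blocks just described (map, reduce, and broadcast with its dot shorthand) can be checked with the same small collection used in the video. The function name f follows the video; the other variable names are assumptions for illustration.

```julia
collection = [1, 2, 4]
f(x) = x^2

# map: apply f to each element and collect the results in a new array
squares = map(f, collection)

# reduce: fold the collection into one value, ((1 + 2) + 4) + 10
total = reduce(+, [1, 2, 4, 10])

# broadcast and the equivalent dot syntax
b1 = broadcast(f, collection)
b2 = f.(collection)

# Without the dot, f receives the whole array and squaring a vector is not defined
failed = try
    f(collection)
    false
catch err
    err isa MethodError
end

println(squares)
println(total)
println(b1 == b2)
println(failed)
```

The original collection is left untouched by all of these calls; each one returns a new value.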
When people come to Julia, they really have trouble understanding this concept of f dot, the dot operator. Okay? So that is all there is to it. It's a shortcut for the broadcast function. Okay, let's go ahead and take a look at some examples. So if you say M is equal to ones of three comma three, now you have a matrix and you can simply go ahead and try to do f of M. Now, matrices can be squared, right? You know how to square a matrix. You do the arithmetic and the algebra, and it gives you a result, right? So f of M is one into one plus one into one plus one into one. That goes in the first row, first column, and so on. You can write two copies of the matrix side by side and do the squaring. It's fairly straightforward, okay? But f.(M) is different. What this does is it applies the square function on each element individually. So the f.(M) result is 1, 1, 1 and so on, and f(M) is this. So completely different things, two different operations. Okay, let's take a couple more examples. So 1, 2, 3, 4, 5, 6. This is our two-by-three array or matrix. We can go ahead and say A .+ 1. What this is saying is apply the plus operator on all elements of this. And because plus takes two arguments, one argument is going to come from the array, and the second is this one over here. So we can say A .+ 1, and this is going to add one to each element, right? So extremely easy to work with. Plus, you will notice that this is very similar to how people in mathematics write this. So people typically add matrices with scalars, and the idea is that you are going to add that scalar to each element individually. In the same way you have A .* 2, and that is going to multiply each element of A with two. And this is so common in mathematics that people typically write this as 2A. And Julia supports that. And you can do 2A.
So when you see 2A, what this means is we are taking the multiplication operator and we are broadcasting it to the whole matrix, right? And what's the second argument for multiply going to be? That is this guy over here. So a very, very concise syntax that is very close to the syntax of mathematics that people are familiar with. You can take these concepts and bring them together and you can write f.(2A). So what that means is first apply the multiplication to each element individually and then apply or broadcast f on all the different elements as well. So instead of having to write nested map calls and really messy stuff like that, you can just write f.(2A), which seems very, very close to the way mathematics is written. So this comes up again and again in Julia practice. We try to make the syntax of Julia, or our code, as close to the mathematical syntax as possible. That is why we have Unicode variable names, so you can have eta and beta and zeta and all of that in your code, right? So that is all there is to it, but it's very powerful. Okay, so let's go ahead and take a look at one final example. So let's say we have this A, so this is a two-by-three array, or a 2 by 3 matrix. So if you try to multiply this with another vector, you will see that this is going to try and multiply a two-by-three matrix with a one-by-three vector. That obviously does not work. You need a three by one, so you can do the transpose of this, but that is a different operation. What we want to do is we want to multiply this guy element by element. If you do dot star, this is going to take this guy over here and multiply it with this row, and then multiply it with this row. Okay? So let's do that. So you will see 1 gets multiplied by 10, 2 with 20, 3 with 30. And then this vector is repeatedly placed in this location. And then 4 is multiplied by 10, 5 with 20, and 6 with 30, right? So that is what we got over here. So you can keep doing this.
You can say 587. And this is going to give you, so let's get rid of that. You can do that. And this is going to give you 15, 16, 18 to 10. The idea is that with broadcasting, you can think of this as repeating something. So you can repeat a function, you can repeat a scalar, or you can even repeat vectors over here, right? So let's get rid of this. We can also go ahead and change the shape of this from a row vector to a column vector using the transpose. And now what this is going to do is it's going to multiply 10 with 1 and 20 with 4. Then the repeated multiplication is going to be 10 with 2 and 20 with 5, and then one more time, 10 with 3 and 20 with 6. So you get that out. Okay. So there you go. So 10, 20, 30 in the first row and 80, 100, 120 in the second. Okay, so I hope that made sense. You can think of broadcasting as repetition, and that's all there is to it. If you understand this one thing, f.(2A), what that means, you're good to go, okay? So please make sure that you understand the concept of broadcasting. This is going to come up again and again when we go into data science and machine learning. And this is like the foundation of Julia because this is what makes Julia's syntax different from other languages that you might have worked with, such as Python. So people have trouble with this. But once you do get used to it, you're going to appreciate how concise this makes your code and how easy it is to go from a research paper or a mathematician's work to a fully fledged working program.
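The matrix examples from this video can be reproduced in a short sketch. The values of M and A follow what is described above; the row vector [10 20 30] and the two-element column vector are inferred from the spoken multiplications, so treat them as assumptions.

```julia
f(x) = x^2

M = ones(3, 3)
matrix_square = f(M)     # matrix product M * M: each entry is 1*1 + 1*1 + 1*1
elementwise = f.(M)      # f applied to every entry separately

A = [1 2 3; 4 5 6]       # a 2-by-3 matrix
plus_one = A .+ 1        # add the scalar to every element
doubled = 2A             # same result as A .* 2
combined = f.(2A)        # double every element, then square every element

# A 1-by-3 row is repeated down the rows of A
by_row = A .* [10 20 30]

# A 2-by-1 column (transpose of a row) is repeated across the columns of A
by_col = A .* [10 20]'

println(matrix_square)
println(elementwise)
println(combined)
println(by_row)
println(by_col)
```

In both of the last two cases the smaller operand is conceptually repeated until its shape matches A, which is the "broadcasting as repetition" picture from the video.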
13. Interfacing with Python and R: Let's finish off this section on functions by looking at one of the core strengths of Julia. That is, if you already have an ecosystem of Python libraries or other libraries that you are familiar with, you don't have to ditch them completely and move over to Julia in its entirety. The Julia people realized that it's a niche language and some people are going to be interested in using it for specific purposes, but not for everything, right? So they have created this capability of calling Python's libraries, right? So if you have something running in Python, you can call it from Julia. It's very easy. So let's begin by installing the PyCall package. So you can use this syntax or the Pkg.add syntax to install PyCall. I'm not going to do that because I've already done that beforehand. We're going to be using PyCall. And all you have to do is say math is equal to pyimport and give the name of the built-in Python package that you want to use. Okay, so we have pyimport of math, and now you can say math.sin of math.pi by four. And this just works out of the box. So this is being called from the actual Python interpreter. So there is a Python process running, and Julia calls that Python interpreter and that does the work for it, right? So we can do math.sqrt, all of that stuff, right? But these are built-ins. What if we want to use a piece of Python code that we've written ourselves? So for instance, I have written this pyex.py file over here. It's a very simple file which has the square root function. And it simply outputs the square root, right? So it just returns that string. I want to use that. So what we're going to do is we are going to pyimport, and we are going to first give it the path of the current directory. So pwd gets the current directory and we're going to add that into Python's path, right? So this is kind of setting the path for the imports. So we do that.
And then we can say pyex is equal to pyimport of pyex. So this is the name of the file that you have over here. You can pyimport that and then you can simply say pyex dot square root of 36. And this is going to return that value for you. Alright, so this works really well. Now suppose you change the file. So for instance, you go over there and you make some changes over here. Let's say, let's put three dots over here. If you want to call it now, you want to reload it, right, because it has changed. So you can pyimport importlib, and from that you can reload pyex. And now if you try to call it, you get the latest code reflected over here. So as simple as that, you can use any Python code that you have over there. There are no restrictions. And the reason for that is this code is actually being run by the Python interpreter. So whatever Python can run, you can run from Julia as well, right? So very clean, very easy to work with these two languages side by side. So it's the same for R. So I don't have R installed on this machine of mine. But if you do, you can simply go ahead and add the RCall package and then say using RCall. And then you can go ahead and define any matrix in your Julia namespace, have a vector over here, and then simply go ahead and create a y using a linear combination of those. You can output them. So these are Julia variables. Nothing fancy going on over there. Okay. Now you can go ahead and put this into the R runtime using @rput. So this goes into R, this goes into R as well. And then you can do something in R such as modeling, et cetera. So RCall is going to take care of that. And then once it does its thing, the result is going to be in z. And you can do @rget to get that result back and then show it in Julia. So @rput and @rget, and anything in the middle is going to be the R code that you want to run. So again, very transparent. You can work with this really easily.
But when you're just getting started, you don't even need this, especially in this course. We're not going to call any Python or R from our Julia code. We're going to stick to Julia and do everything within this ecosystem so that you can get used to it. In the next video, we are going to go back to the pure Julia portion and take a look at some really interesting stuff, such as plotting. And then we'll start with our concepts of actual data science and later on turn to machine learning.
14. Plotting Basics - Prettier Julia Plots: You've now seen all the core syntax of the Julia language. There are a couple of things left, but we'll get to them when we need them. So now we are in a position to go ahead and do some actual data science and machine learning using the concepts that we've learned. So let's start off with one of the most interesting aspects of data science, which is plotting. If you're familiar with Python and matplotlib, you will know that the matplotlib syntax is a little wonky, but here we have a very clean syntax and semantics for plotting. So in this video, we're going to take a look at that. What you first want to do is install the Plots package, which is essentially the interface to all the plotting in Julia. I've already done that, so I'm not going to run this cell. After that, you need to do using Plots. And then you can create some very simple plots using the plot function. When you run this, you will notice that the first time you try to do a plot, it's going to take a little bit of time. The reason for that is that Julia does all the compilation necessary to speed it up later on. So the first plot is going to take a little bit of time. So let's go ahead and do that. So you call the plot function. The first parameter is the function that should be plotted. And after that comes the range on the x-axis, where it should start and where it should end. So from minus two pi to two pi. So this is the range for which the plot is going to be created for us. So we run that and wait for it to do the precompilation and get everything ready. So this is going to take a couple of seconds and then you see the plot. So as you can see, this looks really clean, a very nice plot out of the box, much prettier than the matplotlib defaults. Now, because sine is a function, we can also pass in an anonymous function, which can be a lot more complicated. So you can pass a function which takes one parameter, which is going to be each individual value on the x-axis.
And it's going to plot sine squared x and cos cubed x added together. So you can do any sort of complex function plotting using this syntax. So this looks really nice. Obviously, when you do real-world plotting, you're going to be interested in some data, and we'll get to that in the next video. For now, let's take a look at some of the other syntax details. So you can give it the label, you can give it the xlabel, the ylabel, and the title. So when you do that, you will see that the label for the negative log function that we are plotting goes in the legend. So minus log x goes over there. The y-label is set using the ylabel argument and the x-label using the xlabel argument, right? So this works out really well. So this minus log x is very useful when you do machine learning and classification. So that's why I just wanted to plot this. So this shows you the basics of the plotting library. Now, let's go ahead and plot some specific values. So for instance, we are going to have x go from one to 10. So this is a range, and the y values are going to be just ten random values. Okay? So let's go ahead and do that. So we have the y and x values. We can go ahead and plot y1 using plot of x, y, and label equal to y1. This is going to do the plot for the values that we have over here. If we want to add another line to this, we can use the bang variant of the plot function. So if you recall, bang means it's going to do some sort of modification, and the modification is going to be on this plot. So the second plot function with a bang is going to modify the plot by adding the second series to it, right? So if this seems too verbose for you, there is an easier way to do this. So you can say plot x, and for the y-axis you can give it a collection of elements, right? So two arrays, each holding the y-values that you want to plot. And you can give them both a label, which we are setting as first and second over here. And notice that there is no comma in between.
So this is kind of a row vector. And the reason for that is that typically when you're working with vectors, it's much easier, right? Obviously, Julia is still under development, so this might change, but this is how it stands at the moment. Okay, so we have the same plot over here. We can convert this into a scatter plot by setting the seriestype as :scatter. If you recall, this is a symbol and you don't have to enclose it in double quotes. It works just as a symbol, okay? So you do the plot, and this is what it looks like, right? So y1, y2, and the colors. I'm not really fond of the particular values over here, so we can go ahead and change them. So we can change the thickness_scaling to 0.7. And we can go ahead and change the opacity to 0.8. And to me it looks a lot cleaner this way. Okay? Finally, we can go ahead and plot a lot of series. So we can have four different series. We can pass them to the plot function and we can set the palette as :Dark2_5. And that gives you this guy over here, right? So it depends on which color palette you're interested in. We have a detailed list of palettes over here, and you can go check them out. Similarly, on this link, you can go ahead and check all the different attributes that are available for the plot function. There is a lot to unpack over here, but these are the basics and this can get you started. We want to move on to more interesting things. And I encourage you to go ahead and follow these links and explore the different options that are available. In the next video, we are going to take a look at loading some data and then working with that data, instead of plotting some random values.
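The calls walked through in this video can be collected into one sketch. It assumes the Plots package is installed; the x-range for the negative log curve, the axis label texts, and the title are illustrative choices, since the transcript does not give them.

```julia
using Plots   # assumes Plots is installed; the first plot call is slow while Julia compiles

# A named function and an anonymous function over the same x-range.
p1 = plot(sin, -2π, 2π)
p2 = plot(x -> sin(x)^2 + cos(x)^3, -2π, 2π)

# Legend label, axis labels, and title (range and texts are assumptions).
p3 = plot(x -> -log(x), 0.01, 1, label = "-log(x)",
          xlabel = "x", ylabel = "y", title = "Negative log")

# Explicit values: plot one series, then add a second with the bang variant.
x = 1:10
y1 = rand(10)
y2 = rand(10)
p4 = plot(x, y1, label = "y1")
plot!(p4, x, y2, label = "y2")

# The same two series in one call; the labels form a 1x2 row, hence no comma.
p5 = plot(x, [y1, y2], label = ["first" "second"])

# Scatter version with the styling attributes mentioned in the video.
p6 = plot(x, [y1, y2], label = ["first" "second"], seriestype = :scatter,
          thickness_scaling = 0.7, alpha = 0.8, palette = :Dark2_5)
```

Writing the labels with a comma would create a column vector instead of a row, which Plots treats differently from one label per series.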
15. Data Wrangling, Reading CSV Files, Descriptive Case Study: We're going to start off doing some basic data manipulation by reading a dataset from a CSV file and then manipulating it just as you would do with pandas or a similar library in another language. For this purpose, we are going to use the DataFrames and CSV packages. So the CSV package is dedicated to reading and writing data from CSV files, whereas the DataFrames package is kind of the equivalent of pandas. It's a package that allows you to read, manipulate, and store back tabular data, right? So for both of these, you might want to install them if they're not already installed. I have them installed. So I'm going to skip those two cells and do using DataFrames and then using CSV. Now we can go ahead and read the CSV file using CSV.File and give it the name of the CSV file that we have. So we've saved this in the data folder, over here, and we are going to read it using the CSV.File function. You will see a weird sort of syntax over here. So this is the pipe symbol. What this does is pass the output of this function call to the DataFrame constructor. So it's kind of a shortcut for taking the output of one function call and sending it to another function. This reduces the lines of code by a lot. And you'll see this again and again, and you'll understand it when you try to run it. So this creates a DataFrame for us, which is essentially like a pandas DataFrame. If you haven't seen DataFrames before, it's simply a tabular structure. Okay, so let's go ahead and run that. And the first time you run it, it's going to take a little bit of time, just as with everything in Julia. But after that, it should be really fast. So this should be a structure that is very familiar to you. So we have some data points over here, and we have some columns over here. So X1 to X4, and Y1 to Y4. We can access a particular series within this DataFrame. So one data series, or you can think of this as a column, using the dot operator.
So we can say df dot and the column name X1. So that gives us the values of just that column. Or you can even use this with a string. So df dot and the string "X1". And that does the same thing. If you want to access particular rows, you can use this syntax. So rows 1 to 4, and the column should be X1. So the X1 column, just the first four rows. This works really well. If you want to access some particular columns based on some pattern, you can use a regular expression. So you can say df and r"x". What that means is I'm going to give you this regex string and anything that matches this pattern of x should be returned. So this is going to return X1, X2, X3, and X4, and the Y's are going to be omitted because they don't match this pattern. Okay? So just some basic stuff. And as you can see, df itself remains the same. So all of this selection does not affect the DataFrame. It's just what is being returned to us, right? Now, you can access particular rows and you can access particular columns. Those are the two basic things that we have to do. There are many other things that we have to do, and we'll get to them as we take a look at other case studies in future videos. But for now, let's go ahead and describe this data. So if we say describe df, it's going to give us some basic statistics about it. So for each of the columns, it's going to tell us the mean, the minimum, the median, the maximum, the number of missing values, and the element type, right? So these are integers and these are floats: all the y values are floats and all the x values are integers. Okay? If you want to get particular statistics out, you can pass them as arguments to the describe function. And this will return just those things for you, right? So just the mean and the standard deviation. If you want to get the number of rows and columns, you can say nrow df for rows and ncol df for columns. You can also just say size df if you want to get both of them at the same time.
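A compact sketch of the selection and summary calls covered so far follows. The small table is a hypothetical stand-in for the CSV data, which in the video comes from `CSV.File(...) |> DataFrame`; the values and the four-row size are assumptions.

```julia
using DataFrames

# Hypothetical stand-in for the table read from the CSV file.
df = DataFrame(X1 = [10, 8, 13, 9], X2 = [10, 8, 13, 9],
               Y1 = [8.04, 6.95, 7.58, 8.81], Y2 = [9.14, 8.14, 8.74, 8.77])

col_a = df.X1                 # a column via the dot operator
col_b = df."X1"               # the same column, addressed with a string
first_rows = df[1:2, :X1]     # selected rows of one column
x_cols = df[:, r"X"]          # every column whose name matches the regex

summary_all = describe(df)               # mean, min, median, max, missing count, eltype
summary_some = describe(df, :mean, :std) # only the requested statistics

println(nrow(df), " rows, ", ncol(df), " columns, size = ", size(df))
```

None of these calls modify `df`; each returns a new object or a view of the existing columns.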
Now, let's go ahead and insert some identifiers into this dataset. So this is an example of how you can insert a new column into an existing DataFrame, right? So if you want to calculate something based on the values and then insert it, this is how you do it. So you can simply say df.id is equal to, and because you are not selecting it but assigning to it, this is going to take these values, so 1 to nrow. So this is going to be 1 to 11, and this is going to go in as the id column of df. So this is going to be inserted into df now. And you will see that the id column has been inserted and goes from one to 11. This was not present over here. Okay, so that is how you can insert columns. What else can we do? We can find the extrema of the values in this whole dataset. So for instance, if we say extrema of all of these, it is going to go ahead and calculate the extreme values. So the minimum in these is two and the maximum is 30. And so it ignores all the rows and columns and just gives us that. So what we want to do is figure out the absolute minimum and maximum values across all of these X's, and the absolute minimum and maximum values across all of these Y's. And the reason for that is that we are going to plot them. And for that we need the limits for the x-axis and the y-axis. So we want to know what the minimum and maximum values of x are across all of the columns combined, and what the minimum and maximum are for the y-values across all the columns combined, because we want to give the same x limits and y limits to all the different plots that we're going to create. So we're going to plot X1 against Y1, X2 against Y2, X3 against Y3, and X4 against Y4. So these are going to be four different plots, but we want them to have the same x and y limits. It will make sense when we see the output. But for now, our objective is to calculate the extrema for the X's separately and the Y's separately. So we can go ahead and first select the x columns.
So select of df, r"x" is going to return all of these. We're going to build this up step by step. Now, from here, we can convert all of this into a matrix. So this is what we had in the previous cell, and now we are going to convert it into a matrix. So that gives us this guy over here. Now we can calculate the extrema just as we did over here. And this is going to give us the extreme minimum and the extreme maximum, right? But when you're actually doing the plotting, you want to slightly overshoot the maximum and slightly undershoot the minimum. So our plot should go from three to 20. So for that we are going to use the broadcast operator. And what this is going to do is add minus one to 4 and plus one to 19, giving us (3, 20) back, right? So this is a very common pattern for getting the minimum and the maximum for plots. So that is why we are covering this as a case study over here. And once you see the plot, all of this is going to make sense, okay? Now we want to convert them into an array because that is what the plotting package expects. So we can simply use the collect function and that is going to go ahead and convert this into a two-element array, both elements of which are Int64. So once we have this for the x limits, we are going to do the exact same thing for the y limits, except this time we're not going to do it step by step. Instead, we'll just do the whole thing at once. So our y limits are 2.1 and 13.74, as you can see over here. So the max limit is 13.74 and the min limit is 2.1 for us. Now what we're going to do is actually do the plot. And we're going to do it using two methods. The first method we saw in the previous video on plotting, so we can use the Plots package, and that does really good plots. But suppose you're coming from a language such as Python and you already have code for matplotlib, and you've already done all of that. In that case, you don't have to re-learn everything.
You can simply use your Python matplotlib code through the PyPlot package. So you can go ahead and say using PyPlot. And then, because we also want a regression line on this plot, we want to do using GLM. So this is generalized linear models. It's not important what this is. You'll see what it means in a minute. So we have using GLM, and because we now have PyPlot, we can use the matplotlib syntax for creating this plot. So fig, axes is going to be plt.subplots. So we're going to create a two-by-two grid, which means four plots in total, one for each of X1, X2, X3, and X4. Okay? We're going to set the layout as a very tight one. And we are going to create the four plots in a loop, for i in 1 to 4. For each of these plots, we are going to take Xi and Yi. And we are going to use this GLM model to predict the regression line, right? So let me go ahead and just plot this first and then I'll show you what we're doing. Okay? So this is the plot. We have X1 against Y1, X3 against Y3, X2 against Y2, and X4 against Y4. So you'll notice over here that all the x limits are the same for all of these plots. And the y limits are also the same. So you'll notice over here that X2 has its maximum at around 15, but we're still going all the way up to 20 because we want all four of these plots to look the same so that we can do the comparison. Okay, so here's what the function does. It gets the x and y symbols out. Then it fits a linear model, which is this guy over here, right? So this orange line is created using this model variable. And this model variable is essentially a linear model which has learned from this dataset, okay? So the relationship between y and x is learned from this dataset using the lm function. So if you're not familiar with linear models or regression, this is just telling us how y changes as you change x. Okay? So it's not really all that important. Now we're going to do the actual plot.
We are going to give it the x limits, and we are going to plot the regression line in orange over here, right? So we're going to predict based on this model and based on this dataset, from the minimum to the maximum. So all this is doing is plotting the regression line. Then we are also plotting the scatter plot using df, all the rows and the x column, and df, all the rows and the y column. Okay, so that's our scatter plot. We are setting the x limits and y limits and giving it the label over here. So this we are doing for each of the plots. Finally, we are doing this calculation over here as well. Because we are assuming y to be linearly dependent on x, we are showing what the model has learned from the data. So as you will see, all four of these are essentially the same. So it's 0.5 and 3 each time, but the actual data in each plot looks very different from the others. So that's the whole point of this. What we are doing over here is essentially looking at how you can go ahead and plot data, how you can create a basic regression model or linear model based on your data, and how you can plot different things using a very compact format. Okay, so that's what we have over here. Now we can do the same thing using Plots, and you will see that the actual code looks a lot cleaner. So we're going to do using Plots. And we are going to create a plot array. And this is going to be an array of type Any. The reason for this is that it just makes our life a lot easier, because we can simply push different plots into this plot array within this loop. So we're going to create four plots, push them into the plot array, and finally use a single plot function to plot all of these together. So once you are comfortable with Julia, you will see that the matplotlib code is a lot messier and the Julia code looks a lot cleaner. If you just look at it, it looks cleaner, okay? So our x values are going to come from our DataFrame. So are our y values. And then we are going to do the plot for the x values and y values.
We're going to tell it that it should be a scatter plot. The x label and y label are set, and the x limits and y limits are set. We are going to put smooth equal to true, which is going to do the regression line for us. We don't have to go to GLM or a linear model. It's going to take care of that for us. Finally, we set the palette as well. Once we have the plot p, we push it into an array. So this is just putting that plot into the array so that we can later on plot them all together. Okay? So once we're done with all four plots, we are going to say plot and we are going to give it the plot array, except we are going to use the dot-dot-dot operator, which we saw in the 0301 functions notebook. There we were splatting the values, so these 1, 2, 3 were converted into three different parameters. Similarly, we are going to convert all the different plots, the four plots that we have over here. Those four are going to be passed to the plot function separately. The layout is going to be two-by-two. The legend is going to be absent and the size is going to be set at 800 by 400. So that creates the plots for us. So when you try to do that, you're going to get this error over here: both PyPlot and Plots export plot. And because you should not be overwriting the global exports, this is going to give you an error. So the really easy fix for that is to simply restart the kernel. So you should pick one or the other. Here I'm presenting both of them so that you have the option. Here, we are going to go ahead and restart the kernel and simply load our data again. So we are going to load our data. We are going to define our x limits and y limits, and then jump straight to our plot and do the actual plotting. So now the error should be gone. And there you go. I think the code, as well as the output, looks pretty clean and it serves the same purpose. Okay?
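Put together, the Plots version might look roughly like this. It is a sketch under assumptions: the two-pair table is a hypothetical stand-in for the four-pair dataset, the variable names are invented, and the limits are passed as tuples rather than collected into arrays.

```julia
using DataFrames, Plots   # run in a fresh session, without PyPlot loaded

# Hypothetical stand-in with two x/y pairs instead of four.
df = DataFrame(X1 = [4, 10, 19], X2 = [5, 9, 15],
               Y1 = [3.1, 8.0, 12.74], Y2 = [4.0, 7.5, 9.0])

# Shared axis limits: extrema over all X (or Y) columns, widened by one on each
# side with a broadcast. collect(...) would turn the tuple into a two-element array.
x_limits = extrema(Matrix(select(df, r"X"))) .+ (-1, 1)
y_limits = extrema(Matrix(select(df, r"Y"))) .+ (-1, 1)

# Build one scatter plot per pair and push it into an array of type Any.
plot_array = Any[]
for i in 1:2
    p = plot(df[:, "X$i"], df[:, "Y$i"], seriestype = :scatter,
             xlabel = "X$i", ylabel = "Y$i",
             xlims = x_limits, ylims = y_limits,
             smooth = true)              # adds the regression line
    push!(plot_array, p)
end

# The splat operator passes each stored plot as a separate argument.
combined = plot(plot_array..., layout = (1, 2), legend = false, size = (800, 400))
```

With the full dataset the loop would run over `1:4` and the layout would be `(2, 2)`, as described in the video.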
So the point of this video was to, number one, read data, then do some basic manipulation, and then see two ways of doing the actual plotting. I personally prefer the Plots package from Julia because one, it's native; two, it looks really clean; and three, it lets you get familiar with Julia's syntax so that you can use the much more powerful libraries which are present in Julia. In the next video, we are going to do some further data manipulation, and we're going to do some in-depth analysis after that.
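For reference, the regression fit behind the orange line in the matplotlib version can be sketched with GLM on its own. The five-row table and the variable names are hypothetical; only `lm`, `@formula`, `coef`, and `predict` are GLM's own functions.

```julia
using DataFrames, GLM   # assumes both packages are installed

# Hypothetical data that roughly follows a straight line.
df = DataFrame(X1 = [4.0, 6.0, 8.0, 10.0, 12.0],
               Y1 = [5.1, 5.9, 7.0, 8.1, 8.9])

# Fit Y1 as a linear function of X1.
model = lm(@formula(Y1 ~ X1), df)

# coef returns the intercept first, then the slope.
intercept, slope = coef(model)
println("intercept = ", intercept, ", slope = ", slope)

# Predict at the two x limits to get the end points of the regression line.
line_y = predict(model, DataFrame(X1 = [3.0, 20.0]))
```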
16. Further Data Manipulation, Apache Arrow, Grouping and analysis: Let's go ahead and do some further data manipulation. We're going to load the data again from a CSV file, but this time we are going to try and save it in a much better format, which is Apache Arrow. So we'll get to that in a minute. But let's take a look at the data that we're starting with. So this is the auto dataset. So it has information about different cars, their mileage, and all of that stuff, right? So we can download this data from this URL using the built-in download function that Julia comes with. And we can save it in data/auto.txt. I've already done this, so I'm not going to run this cell, but we can then go ahead and read all the lines of this file and see what it looks like, just to see if it's there. Okay, so this is what it looks like. So this is a built-in function, so it works out of the box. Now we're going to try and use the CSV and DataFrames packages, just as we did in the previous video, and see if this works. So we do the exact same thing. We give it the headers, which are not present in the CSV. So we are going to hard-code them over here. And we are going to treat the NA values as missing, okay? And then we pipe this to the DataFrame, just as we did previously. So when we do that, you'll see that we get an error. And the reason for that is that there are some tab characters over here and some non-standard formatting going on, because of which we are getting some errors. So here's the warning: parsed expected 9 columns, because I gave it nine columns over here, but didn't reach end of line around this data row, right? So what it's saying is that the columns it found don't line up with the nine that it expected. And the reason for that is this \t over here. If you can't see it, I'm going to zoom in. So this \t over here is causing the problem. And this is a very common problem when you're working with CSV files.
There might be some hidden characters and stuff going on that is causing problems for your dataset. So what we're going to do is see how we can manage this in our code. So let's go ahead and first read this whole thing into a string. So we are going to read this file as a string into our raw_str variable. Okay, then we're going to go ahead and replace all the \t characters with a space. And that's going to essentially clean it up just a little bit. And then we are going to create an IOBuffer. So an IOBuffer is essentially an in-memory buffer in which you put a string, and you can then use it for input and output like a file. Okay? So you'll see what this means in a minute. So we create an IOBuffer based on this string that we just created. And then we are going to do the exact same thing with CSV and DataFrame, except this time we're going to give it the IOBuffer as the source instead of the file. Okay? So this time, because we've converted all the tabs to spaces and we are setting the delimiter as a space character, this works out perfectly well. So now it can read all nine columns and everything works out, right? So there are many, many things like this in data science that you come across slowly and steadily when you do some exploration on your own. I just wanted to cover one use case, which is very common. Now, we have this read, and we want to essentially figure out how many values are missing, right? So some values are missing. You can see a couple over here. So we want to figure out how many are missing. So what we can say is sum of count ismissing for each column in df, right? So for col in eachcol of df, that loops over the columns, and we are going to count ismissing on the column that we are looking at. So essentially, a sum of the missing values in all columns. Okay, this looks really complicated. We have a much cleaner method of doing this. We can convert the DataFrame into a matrix, just as we did previously.
And then just count ismissing, right? So that is a much cleaner way of doing the exact same thing. We can also go ahead and use mapcols. So we're going to take df, and mapcols extracts the columns and passes them to this function over here. And each column is going to turn into x and we are going to count ismissing. And then mapcols is going to return the individual counts of missing values. So miles per gallon is missing eight values, horsepower is missing six values, and all the other columns have no missing values. Okay? So depending on what kind of analysis you're doing, you might be interested in doing this, or this, or this, right? If you are interested in finding out the missing values per column, you use the third method. Now, let's go ahead and do some more interesting stuff, such as adding a column based on data. So previously we added the id column, which was just going from one to the number of values. Here what we are trying to do is extract the brand name from the name of the car, right? For instance, we have the name of the car over here starting with, for instance, Buick, Chrysler, or Ford. We want to extract this because this is the brand, and then we have the car name after it. So you want to extract this word: Ford, or Toyota, or Dodge, or Chevrolet. Okay? Now, the way to do this is that we are going to first see what name looks like. So df.name is this guy over here. This is what it looks like. We want to split it on a space character. So to do that, you can say split dot of df.name. And we need the dot because we want to broadcast the split function over all the different elements within the series, okay? If you don't broadcast it, it's not going to go into the individual elements and it won't work. But because we have a broadcast, it goes ahead and applies split individually to the first string, and the second string, and the third string.
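The cleaning step and the three ways of counting missing values can be sketched as follows. The three-line string is a hypothetical excerpt shaped like the file described above (space-separated fields, a tab before the quoted name); the column names and the `NA` marker are assumptions based on the narration.

```julia
using CSV, DataFrames

# Hypothetical excerpt: a tab separates the numeric fields from the quoted car name.
raw_str = "18.0 8 130.0\t\"chevrolet chevelle malibu\"\n" *
          "NA 4 NA\t\"saab 99e\"\n" *
          "25.0 4 NA\t\"ford pinto\"\n"

# Replace the tabs with spaces, then wrap the string in an IOBuffer
# so that CSV.File can read it as if it were a file.
io = IOBuffer(replace(raw_str, "\t" => " "))
df = CSV.File(io; delim = ' ', missingstring = "NA",
              header = ["mpg", "cylinders", "horsepower", "name"]) |> DataFrame

# 1. Sum the per-column counts with a generator over eachcol.
total_by_loop = sum(count(ismissing, col) for col in eachcol(df))

# 2. Convert to a matrix and count once.
total_by_matrix = count(ismissing, Matrix(df))

# 3. mapcols returns a one-row DataFrame with the count for each column.
per_column = mapcols(x -> count(ismissing, x), df)
```

The first two give a single total, while the third keeps the counts separated by column.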
Okay, once we have that, we can apply the first function, which returns the first element from the array. And we are going to broadcast this as well. Okay? So df.name is this guy over here. So first you split it, then you get the first element out. So that gives you the brand. Okay? We also put that in as a new series within the DataFrame. So that goes in. So now we can go ahead and retrieve some of these values, so we can retrieve name and brand. So name is this guy over here and brand is this guy over here. And we are seeing the first n rows. If you look at the size of df, it has 406 rows. We can go ahead and drop all the missing values. We can say df2 is equal to dropmissing of df. And this is going to give us a slightly smaller DataFrame. Now the 14 missing values are gone. Now if you try to count the missing values in df2, obviously there are no missing values, because we just dropped them. One of the most common things in data science is to extract certain rows based on a criterion. For instance, if you are interested in getting just those rows which have the brand as saab, we can say df2 of df2.brand dot equal-equal saab, which essentially means broadcast the equality function over each individual row. And we are saying if that equals saab, return all the columns. So that's a slightly verbose way of doing this. But this is what people typically do with pandas. So you do that and you get all the rows which have saab as the brand. But a much better way of doing this would be using the filter method. And you will see that filter takes a criterion and a source. We did this when we were looking at functional programming. In the functional programming notebook, 0302 over here, we saw this. If you don't recall it, you can go ahead and take a look at that again. But here we are going to pass this function to filter. And what we're saying is only keep those rows which satisfy this criterion.
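A short sketch of the brand extraction and row selection follows, using a hypothetical four-row table in place of the auto data. The last line shows the `filter` form that the video introduces at this point.

```julia
using DataFrames

# Hypothetical stand-in for the auto data.
df = DataFrame(name = ["ford pinto", "saab 99e", "buick skylark 320", "saab 99le"],
               mpg = [25.0, missing, 15.0, 24.0])

# Broadcast split over every name, then broadcast first to keep the leading word.
df.brand = first.(split.(df.name, " "))

# Drop every row that contains a missing value.
df2 = dropmissing(df)

# Boolean indexing: a broadcast comparison selects the rows, the colon keeps all columns.
saab_indexed = df2[df2.brand .== "saab", :]

# The same selection written with filter and a row-wise predicate.
saab_filtered = filter(row -> row.brand == "saab", df2)
```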
So we give it a row, and when row.brand is equal to saab, we're going to keep it, right? So this returns the same result for us. But you will notice that this is a lot more readable than this guy over here, right? So it's typically recommended that you use filter for doing filtering because it makes more sense. Okay, finally, the actual issue that we started with. We want to load the data, we want to clean it up a little bit, and then we want to write it into another file so that the next time we read it, we don't have to do the cleaning again. So we are going to simply use the CSV.write function to write our df2 to auto-clean.csv. So that's done. So data, auto-clean.csv. So you will see that this is perfectly clean now, and you can take a look at the name and brand columns. So brand has been added over here. However, CSV is not very efficient. We have a new, highly recommended format for storing very large data, which is Apache Arrow. You can go ahead and take a look at this link and it will tell you the benefits of Arrow. Very briefly, you can use Arrow for very large datasets. For instance, if you have 64 gigabytes of data and you want to load it on a system with 16 gigabytes of RAM, you can easily do that using Arrow. And Arrow is going to essentially manage all the swapping in and swapping out for you. So you don't have to worry about loading the required data into RAM, because that becomes very problematic very quickly, right? So it's typically recommended that you go with Arrow when you are doing any data science with very large data, which is the whole point of data science and Julia, to be frank. Anyway, for that we have a package in Julia called Arrow. So we're going to say using Arrow. If you don't have it installed, you can install it using this cell over here. And then we are going to write the Arrow file using Arrow.write. And we are going to give it the name over here. Now, you will notice that this guy is not something that we can open.
And the reason for that is that Arrow is a binary format. It's not a text format, not a character format. So you cannot read it directly over here, but we can read it in code. And we're going to read it just for the sake of understanding over here. Okay? So df2 is equal to Arrow.Table, and we are going to give it the source. And then we are going to convert it into a DataFrame using the pipe syntax, just as we did with CSV. So once you have an Arrow file, this is all you need to do to load it. And because it's binary, it's very unlikely that it's going to have the problems that CSV has. Okay, we can go ahead and do some grouping. So once you've read this and you have the same thing back in a DataFrame, you can go ahead and do some grouping. So for instance, you can group by the brand. So you can say group_brand is equal to groupby. So this is the source that we want to group and this is the criterion, or the column, on which we are going to group. So we do this. And group_brand is going to have some groups. So the first group is where brand is equal to Chevrolet, and these are the guys over here. And this is the Nissan one. What we're trying to do over here in this case study is take a look at how many cars, for instance, are made by Ford, how many by Nissan, and so on and so forth, right? So for instance, if you take a look at group_brand and we get just the Ford values out, these are the cars which are made by Ford, because we grouped by brand and now we are retrieving this group. Notice this comma over here. This comma is very important. If you don't put it, you're going to get an error. And the reason for that is that this key over here has to be a tuple. And if you don't put a comma over here, it's simply a string which is surrounded by brackets. We don't want that. We want a tuple. So this converts it from a string surrounded by brackets to a tuple. So it's very important that you understand this.
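The Arrow round trip and the grouped lookup can be sketched like this. The three-row table and the temporary file path are hypothetical; `group_brand` follows the name used in the narration.

```julia
using Arrow, DataFrames   # assumes the Arrow package is installed

# Hypothetical cleaned table.
df2 = DataFrame(brand = ["ford", "chevrolet", "ford"], mpg = [25.0, 18.0, 21.0])

# Write the table in the binary Arrow format (temporary path used for illustration).
path = joinpath(mktempdir(), "auto.arrow")
Arrow.write(path, df2)

# Read it back: Arrow.Table loads the file and the pipe hands it to DataFrame.
df3 = Arrow.Table(path) |> DataFrame

# Group by brand, then look one group up by its key.
group_brand = groupby(df3, :brand)
ford_cars = group_brand[("ford",)]   # the trailing comma makes this a one-element tuple

println(typeof(("ford",)), " versus ", typeof(("ford")))
```

The final line prints the two types, which shows why the comma matters: without it the parentheses are just grouping around a string.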
You can give it other grouping criteria as well, but for this simple case study, this is enough. Okay, so these are all the Ford cars. Now we can use the Statistics package to calculate, for instance, the mean and other statistics as well. Let's go ahead and calculate the miles per gallon statistics by using the combine function. We have the group brand that we created over here, and we are going to combine it and calculate the mean of the MPG. Because we have the grouping, the results are going to be grouped accordingly. So Chevrolet has this mileage, and so on. We can go ahead and sort them as well, but before that, notice a couple of things. Number one, this is too much output and I don't like it, so we can set ENV["LINES"] to 10, and that is going to show just 10 rows over here. What this function does is take the MPG column from each group and pass it to the mean function, this function over here. Then it creates a mapping from group name to aggregate value. So this is the group name, Chevrolet, and the aggregate is this value over here. The column is called mpg_mean: mpg comes from the column name over here and mean is this function. We can also pass it any function using the anonymous function syntax; we'll see that in a minute. We can do sorting by using the sort! function, that is, sort with a bang. We give it brand MPG and the column to sort on, which is the criterion for sorting, and we ask for an in-place sort in reverse, or descending, order. So these are the brands with the highest mileage. Just to recap, we loaded the data, we grouped it, we combined it using an aggregate on a column, and then we sorted it. That's all we've done. Now let's go ahead and take a look at one of the more powerful features of Julia. You will notice that we have the same names being repeated: brand MPG, group brand, and then brand MPG again. If you change a variable name in one place, you have to change it in all places.
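The group, combine, and sort steps recapped above might look like this on a toy table. The data and the variable names group_brand and brand_mpg are assumptions for illustration; the automatically generated column name mpg_mean is standard DataFrames behavior.

```julia
using DataFrames, Statistics

# Hypothetical toy data.
df = DataFrame(brand = ["ford", "ford", "nissan", "chevrolet", "nissan"],
               mpg   = [18.0, 22.0, 32.0, 17.0, 30.0])

group_brand = groupby(df, :brand)

# One row per brand; the result column is named mpg_mean automatically.
brand_mpg = combine(group_brand, :mpg => mean)

# Sort in place, highest mean mileage first.
sort!(brand_mpg, :mpg_mean, rev = true)

# Limits how many rows a notebook or REPL prints; it does not change the data.
ENV["LINES"] = 10
```

Note how brand_mpg and group_brand each appear several times; that repetition is what the pipe syntax later removes.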
So it becomes very messy very quickly, and we're going to see how we can do this much more easily. For instance, you have an origin column over here. What origin says is whether the car is from the US, from Europe, or from Japan: 1 is for the US, 2 is for Europe, and 3 is for Japan. That is the semantics of this column. What we want to do is figure out different statistics based on this origin. First we're going to group by brand and then check that each brand actually belongs to one origin, because that's how it should be. If you have, for instance, Chevrolet showing up with origin US and also with origin Japan, then there is a problem with our data. So this is just to make sure that our data is fine. What we're interested in right now is the syntax. You have the groupby call over here, which is fairly straightforward. Then we want to combine this grouped DataFrame, but the column is going to be origin, and the function is going to be not mean but the length of the unique values of x. What this is saying is: take x, which is the origin column of the group, do a unique on it, and then see how many values there are. Let's go ahead and run this and you'll see what it means. So Chevrolet has only one origin, Buick has only one origin. All of them have just one origin, and that makes sense. Now let's make this a little prettier, because as you can see, it's not very easy to understand what's going on. That's the whole point of this section of the video. Let's go ahead and clean up the code using the pipe operator. If you don't have the Pipe package installed, you should install it using this syntax. What we're going to use is @pipe. What this macro does is allow us to use this underscore, and I'll explain what that means in a minute. So you take the DataFrame and you pipe it to the groupby function.
And whatever the result of the previous step was, it is put in the underscore location. So the DataFrame was loaded; that was the output of this line, and it goes over here. Then groupby works on that and groups it based on brand. Whatever the output of this whole thing is goes into this next underscore. So now this is going to be a grouped DataFrame, and I don't have to give it a name. Now we are going to combine on origin and count the unique values of origin over here. This makes our code look a lot cleaner. You get the same result out, but the code is much cleaner. What the underscore does is take the previous line's output and put it in place of the underscore, and then the result of all of this goes into the next step. For instance, if you want to sort it, you can add another line over here and that's going to sort; we'll do that in just a minute. You can also calculate the extrema just to make sure that all the values are 1. They are, so we're fine. Going back to the sorting example, you can do the same thing as we did before. For instance, you can have the DataFrame going into groupby, and we are going to group not just by brand, but by origin and brand. Then we are going to combine whatever the result of this was based on the number of rows. What this tells us is how many cars are in each group. Then we are going to sort it by passing the output of this step to the sort function over here. So take the DataFrame, group it, combine it by calculating the number of rows, and then sort it in descending order based on the number of rows. So Ford has 53 data points in this data that we have, Chevrolet has 44, Plymouth has 32, and so on. This is what is called a long format, so it grows downwards. Sometimes we might want to convert it into a wide format. If you're not really sure what that is, don't worry about it; you can skip the rest of the video.
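The two pipelines described above can be sketched as follows. This assumes the Pipe package is installed; the toy data and the output column name n_origins are made up for illustration, while the nrow column name is what combine produces by default.

```julia
using DataFrames
using Pipe: @pipe

# Hypothetical toy data with a brand and an origin code per car.
df = DataFrame(brand  = ["ford", "ford", "toyota", "ford", "toyota", "saab"],
               origin = [1, 1, 3, 1, 3, 2])

# How many distinct origins does each brand have? (Should be 1 everywhere.)
origins = @pipe df |>
    groupby(_, :brand) |>
    combine(_, :origin => (x -> length(unique(x))) => :n_origins)

# How many cars per (origin, brand) pair, largest group first.
counts = @pipe df |>
    groupby(_, [:origin, :brand]) |>
    combine(_, nrow) |>
    sort(_, :nrow, rev = true)
```

Each underscore stands for the result of the stage before it, so no intermediate variable names are needed.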
But if you are aware of what the wide format is and what the long format is, you can go from long format to wide format using the unstack function. You'll see the difference between the two if you just run it. What unstacking does is take Ford and essentially create multiple columns of values over here. So all the 1s are going to go over here, so 53 goes over here, and all the 3s, for instance for Toyota, are going to go over here. Essentially it creates a separate column for each individual origin value. This is called the wide format, and sometimes it's much easier to work with. You also see all of these missing values. We can get rid of them by converting them into zeros using the coalesce function, which is broadcast over the whole DataFrame. When you do that, you get these values out. What this is saying is that Ford has origin 1 for 53 cars, which makes sense, and Chevrolet has 44 with origin 1. So all of these are US cars, all of these are Japanese cars, and this one over here is a European one. The whole point of this video has been to do different bits of manipulation with data. As you might be aware, data science is a huge field and there are a huge number of things that you can do. What we wanted to do over here was get a feel for how Julia approaches different problems and how it structures the code, along with some of the syntax details of Julia. So hopefully after this, whenever you look up a Julia tutorial, you will be able to understand what it does, what the syntax is, and what the semantics are behind the different operations. In the next video, we are going to take up a particular problem, clustering based on map data, and see how we can read a simple dataset and do some clustering on it.
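Going from long to wide format and filling the gaps might look like this. The small counts table is made up for illustration; it mimics the origin, brand, and row-count columns discussed above.

```julia
using DataFrames

# Hypothetical long-format table: one row per (origin, brand) pair.
counts = DataFrame(origin = [1, 3, 2, 1],
                   brand  = ["ford", "toyota", "saab", "chevrolet"],
                   nrow   = [3, 2, 1, 4])

# Wide format: one row per brand, one column per origin value.
wide = unstack(counts, :brand, :origin, :nrow)

# Combinations that never occur come out as missing; replace them with 0.
wide_filled = coalesce.(wide, 0)
```

The new columns are named after the origin values, so they are accessed with strings such as "1" or "3".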
17. Case Study: Clustering for Housing/Map Data: Let's go ahead and apply these concepts to a slightly more involved case study. The example that we've picked is the clustering of housing data. This is the housing data for California, and we are going to take a look at a real-world dataset for clustering. The reason for that is that if you try to do clustering on toy data, it's very easy, it doesn't really tell you the importance of clustering, and it doesn't really help you understand the concept fully. So we are going to load the houses data from this file, data/newhouses.csv, and we are going to convert it into a DataFrame, just as we did previously. The columns that we have in this dataset are longitude, latitude, housing median age, total rooms, number of bedrooms, population, and so on. In this case study, we are interested in the longitude and latitude, which essentially tell us where the house is located, and in the median income or median house value. So we are trying to figure out where the more expensive houses are, or where the rich people live within this area. For this, we are going to use a library called VegaLite, which is really useful for doing map-based visualizations. We want to actually look at the real-world map and plot these values on that map instead of trying to do this on a graph-paper-like visualization. We are going to use the JSON package for reading the California counties information, which we've downloaded and which is available in the data folder. We are going to use the VegaLite and VegaDatasets packages for the maps and map-based information. If you don't have them installed, you can install them using these cells. I already have them installed, so I'm just going to start using them. We are going to read the California counties JSON file. This essentially gives information about the shape of the map.
And then we are going to create this VegaJSONDataset. Essentially, both of these combined give us the information about the map that we want to draw. We don't have to draw it by hand; we just have this dataset available, and Julia provides this functionality to you transparently. So we've done that. Now we are going to use the @vlplot macro, which is provided by the VegaLite package, and we are going to use the syntax of VegaLite to do the plotting. Since we are not really interested in VegaLite itself, I am just providing the syntax over here. You can look at it and try experimenting with it to get a better understanding, but it really isn't our concern. What does concern us are these lines over here, and we'll see what they are in a minute. Just for the sake of completeness: first we set up the base map, on top of which the marks representing houses are going to be drawn. It is going to be a geographical shape with a fill of black and a stroke of white, and the data that we are going to show is the county data that we have over here. So all of this is coming from this dataset. Then the actual plot is overlaid: the housing data is overlaid on top of this map using the plus and the @vlplot syntax. Each house is going to be plotted using a circle. The data is going to come from the houses DataFrame that we have up above. The latitude and longitude are going to define the location of the circle, the size is going to be 12, and the color is going to represent the median house value. Okay, so let's just run this to see what it looks like, and then it will make sense. You might get some warnings over here. They're not really all that important and they don't affect the output. Just notice that this syntax is not Julia, this is Vega, and it's not really our concern. We're just using it for the actual output of the plot.
So this is what it looks like: a black map, white strokes, and the houses over it. The more expensive houses are in darker color, and they are over here, and the least expensive houses are in lighter color, and they are slightly away from the beachfront. This over here on the west is the beachfront, and the more expensive houses are along that beachfront. So it makes sense. What we are trying to do now is cluster these values using two different methods. The first is going to be simple bucketing. We are going to take this whole price range and split it into buckets that are one hundred thousand wide. So from zero to one hundred thousand is going to be one bucket, from one hundred thousand to two hundred thousand is going to be another bucket, and so on. So we are going to have roughly five buckets over here, and instead of plotting all these individual houses, which doesn't really tell us the spread, we are going to put them into these buckets. So let's go ahead and first convert the data into buckets. We are going to build this piece by piece, so look at this piece first. What this does is go into the housing DataFrame and retrieve all the rows. You will recall that for all the rows, we use the colon, but what the colon does is give you a copy of the data back. We don't want to create a copy, because this might be a very large dataset and we just want to use the data. So this is going to be a view into the same structure; it's not going to create a copy. For that we use the exclamation mark. So colon creates a copy, and exclamation mark just gives you a view: you can look at it, but it will not be copied in memory, so it's a lot more efficient. You can use colon as well. It won't affect the result, but it's going to clog up the memory. The column that we're interested in is the median house value. So what this does is give us the median house values of all the houses. That's it.
Then we go ahead and apply the division, dividing by one hundred thousand. The division works on two scalar values, so we have to broadcast it, and it is essentially going to divide all the values by a hundred thousand. After that we convert the result into an integer, and again we have to broadcast, because the integer conversion works on individual scalar values. So essentially we take the median house value of one data point, divide it by 100,000, and convert it into an integer, and that gives us a value between zero and five. Take, for instance, the value of one hundred thousand: divided by a hundred thousand, that gives us 1.0, and converted into an integer, this is going to be 1. So these are the values that we get: 3, 3, 3, 2, 0, 1, and so on. The extrema show that we are going from zero to five, and that makes sense as well. Now we are going to insert these into the data that we have. We are going to use the insertcols! function to insert into the houses DataFrame. The index of the column is going to be 3, the name of the column is going to be cprice, and the actual values are going to come from this structure that we just created. Once we do that, our houses data is going to look like this: one, two, and three, this is the column that has just been inserted, based on the values that we had in the bucket price. Let's go ahead and plot these using the exact same code, except that now we are using cprice as the color. And this n means that this is a nominal, that is, a categorical value, not a quantitative value anymore. So there you go. We get the same warnings, and by doing this, we get an output that is a sort of clustering. It's a very basic clustering. It's not very efficient, but it works. It tells us that the zero-bucket prices are way over here, and then these ones are slightly more expensive.
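The bucketing steps can be sketched on a few made-up rows. This is a sketch under assumptions: the numbers are invented, and integer division with div is used here as one way to get whole-number buckets, which may differ from the exact call used in the video.

```julia
using DataFrames

# Hypothetical toy rows with the three columns that matter here.
houses = DataFrame(longitude = [-122.23, -121.5, -119.8, -118.3],
                   latitude  = [37.88, 38.6, 36.7, 34.1],
                   median_house_value = [452600.0, 85000.0, 100000.0, 241400.0])

# `!` hands back the stored column itself (no copy); `:` makes a copy.
no_copy = houses[!, :median_house_value]
a_copy  = houses[:, :median_house_value]

# Broadcast the division and the integer conversion over every element.
bucket_price = Int.(div.(no_copy, 100_000))

# Insert the buckets as the third column, named cprice.
insertcols!(houses, 3, :cprice => bucket_price)
```

Strictly speaking, the exclamation mark returns the column vector that the DataFrame already holds, which is why nothing is copied.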
And the most expensive ones, which are in bucket five, are again on the beachfront. So it's the same data, but now it's in buckets, so it's a lot easier to understand. This visualization is slightly better than the previous one. We can go ahead and do a typical clustering using k-means. Again, because this is a case study, we are using k-means. You can use another method as well, but k-means is the most common. So let's go ahead and do that. This comes from the Clustering package, so we're going to do using Clustering. We are going to drop the missing values, because k-means does not work with missing values. Then we are going to do the clustering based on the median house values again. So same thing: you go to the housing DataFrame, you get all the rows, and from that you get just the median house value column. Because we have an exclamation mark over here, we are selecting in place and no copy is going to be created. That is what we get in X. So now in X we have twenty thousand values, give or take. The problem is that this, as you can see, is a column: 20,433 rows and just one column, so it's vertically aligned. k-means, as it is implemented in the library, requires a row vector here, with one column per data point. So we are going to convert it using the Matrix function: first it's converted into a matrix, and then we take the transpose and we get the row vector back. This is just data wrangling to make sure that the data works with our library. Then we call kmeans, give it this data, and tell it to sort it into five clusters. That's all you need for clustering, and you get the values out. Now you can go ahead and insert these values into your houses dataset just as before, and this time we give the new column a name for the k-means cluster. So we have this k-means cluster column over here, and these are the cluster values.
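The reshaping and the kmeans call can be sketched like this. It assumes the Clustering package is installed; the nine prices are invented, and the example asks for 3 clusters instead of 5 only because the toy data is so small.

```julia
using Clustering, DataFrames, Random

Random.seed!(42)  # k-means starts from random centers; fix the seed for repeatability

# Hypothetical toy prices forming three obvious groups.
houses = DataFrame(median_house_value =
    Float64[60_000, 75_000, 90_000, 210_000, 230_000, 250_000, 480_000, 490_000, 500_000])

X = houses[!, [:median_house_value]]   # 9 rows, 1 column
data = Matrix(X)'                      # 1 row, 9 columns: one column per house

result = kmeans(data, 3)
clusters = assignments(result)         # cluster number for each house
```

The assignments vector has one entry per house, so it can be added to the DataFrame with insertcols! in the same way as the bucket column.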
As you can see, these are slightly different from the cprice values that we got, but they serve essentially the same purpose. We are going to plot this, and this time the color is going to be based on the k-means cluster. Once you do that, you get a very similar plot out, but the k-means clustering looks slightly better, so you get a more realistic representation of the clusters that you have. These are the mid-range ones, and the more expensive ones are over here, as you can see. So hopefully this gave you a good idea about how to do clustering. The actual clustering is just this one line over here. Give it a row vector and it works out of the box. The rest of the code is just data wrangling to make sure that our data works with the library that we have over here. Hopefully this makes sense. In the next video, we are going to go into a slightly more interesting example, which is classification. After that, we are going to move on to the most popular aspect of Julia, which is machine learning.
18. Classification with Decision Trees/Random Forests: Let's go ahead and apply traditional machine learning to a classification problem on an existing dataset. This will just give you an idea of how the existing libraries for classification work in Julia. So let's go ahead and use the RDatasets and MLBase packages. If you don't have them installed, you can install them using these two cells. I already have them, so I'm just going to start using them. RDatasets provides well-known datasets coming from R, and MLBase is a collection of traditional machine learning utilities. Let's go ahead and load the iris dataset, which has information about different flowers and their species, into our iris variable. So there you go. We can take a look at what the dataset looks like in this table that will show up in just a minute. We have sepal length, sepal width, petal length, petal width, and the species. You might have seen this before as well. Let's go ahead and see what the actual pipeline looks like for doing machine learning on this dataset. We're first going to convert this into a matrix: all the rows and only the first four columns, which are the features, are going to go into X, and the labels are going to come from the fifth column. So there we go. The label map tells us which numeric values correspond to the actual labels: 1 is setosa, 2 is versicolor, and 3 is virginica. We can go ahead and use the labelencode function to convert all the labels into their corresponding numeric values, 1, 2, and 3. Once we do that, y is going to be converted into these values. Now we are going to create the training and testing splits. Essentially what we're going to do is a per-class split, which means that for each class we are going to take about 70 percent of the values and put them in the training set, and the rest we are going to put into the testing set.
So the call is going to look like this: per-class split, we give it y, and we say that 0.7 of the values should go into the training set. This is going to return the IDs from the dataset which should be put into the training set. Let's go ahead and do that and then it'll make sense. We're going to do using Random, because we want a random sample from the whole dataset for our training. What per-class split does is figure out which unique values y has, so 1, 2, 3. For each of these values, it is going to go ahead and find the positions in y where that specific row equals this class. So we do this for each class. And what do we do with them? We take a random subsequence of these positions, and we only pick about 70% of them, because we pass 0.7. So for class number 1, it's going to pick about 70% of the values, for class number 2 it's going to pick about 70% of the values, and the same for class number 3. All of those row IDs, and these are just the IDs, not the whole rows, are going to be pushed into our keep IDs. Let's do that and you'll see what this means. For each of these classes, we are going to get these values: 2, 3, 4, 6, 7, and so on. Some of these are going to be from class 1, some from class 2, and some from class 3. The reason we do it like this is that if you have an imbalanced dataset, where say 90% of your data comes from one class, and you just do a random split, then it's highly likely that most of the values are going to come from just that one class. The per-class split ensures that we have a good representation of every class in our training and testing sets. So we just get these IDs back. We are going to calculate the test IDs now by taking the set difference between all the IDs and the training IDs. So out of 1 to 150, these are the training rows and the rest are going to be the test rows. So that makes sense.
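A per-class split along the lines described above can be written with only the Random standard library. The function name perclass_splits and the toy label vector are assumptions for illustration; the video's own helper may differ in detail.

```julia
using Random

# For each class, randomly keep roughly `fraction` of its row IDs for training.
function perclass_splits(y, fraction)
    keep_ids = Int[]
    for class in unique(y)
        class_ids = findall(y .== class)            # row IDs belonging to this class
        append!(keep_ids, randsubseq(class_ids, fraction))
    end
    return keep_ids
end

Random.seed!(1)
y = repeat([1, 2, 3], inner = 50)                   # 150 labels, 50 per class, like iris

train_ids = perclass_splits(y, 0.7)
test_ids  = setdiff(1:length(y), train_ids)         # everything not used for training
```

Because randsubseq keeps each ID independently with probability 0.7, the training set holds about 70% of each class, not exactly 70%, which is consistent with the 103/47 split mentioned in the video.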
Okay, so we have 2, 3, 4 here, so 1, 8, 9 go over here; 8 and 9 are not going to be over there, and so on. So now we have the IDs for both training and testing. If you check the sizes, 103 go here and 47 go here, which is roughly a 70/30 split. Then we need another helper function for calculating the actual class. Before looking at this function, let me give you an example. Let's say you have a machine learning algorithm, you put a value in, and it predicts 3.1. Because it works with real numbers, it can predict 0.5, it can predict 3.1, it can predict all sorts of different values, but we can only have the classes 1, 2, and 3. So we need to write a function that, given 3.1, will return either 1, 2, or 3 to us. This is that function. But before we go into it, just look at what argmin does. If you say argmin of this list, it looks for the minimum. The minimum of these values is 0.9, but we're not interested in the actual value 0.9; we are interested in its position, 1, 2, or 3. So if you say argmin, it's going to return 3 to us. And argmin when the smallest value, let's say minus 1.1, is in the second position is going to be 2. So argmin essentially returns the position of the minimum. Once we have this, let's go over here. We give it a predicted value, let's say 3.1. It's going to calculate the difference between the predicted value and each of these class values, and the subtraction is going to be broadcast. So 3.1 minus 1 is 2.1, 3.1 minus 2 is 1.1, and 3.1 minus 3 is 0.1. We take the absolute value of that, because we're only interested in absolute differences, and then we do an argmin, and it returns either 1, 2, or 3 depending on whichever class is closest. So we do that, and then we call assign class. If you give it 1, it's going to say 1. If you give it 1.2, it's going to say 1.
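The helper described above fits in one line of plain Julia. The name assign_class and the example list passed to argmin are assumptions chosen to match the walkthrough.

```julia
# argmin returns the position of the smallest element, not the element itself.
position_of_min = argmin([2.5, 1.3, 0.9])

# Distance from the prediction to each class label, then the closest label's position.
# Because the labels are 1, 2, 3, the position is also the class.
assign_class(predicted) = argmin(abs.(predicted .- [1, 2, 3]))

class_for_3_1 = assign_class(3.1)   # differences are about 2.1, 1.1, 0.1
class_for_1_2 = assign_class(1.2)
class_for_1_7 = assign_class(1.7)
```

This trick relies on the labels being exactly 1 to 3; with other label values, the result of argmin would have to be used as an index into the label list.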
If we give it 1.7, it's now going to say 2, because that's the closest. Similarly, 2.7 is going to be 3 and 2.2 is going to be 2. I hope this makes sense. We need this because some of our machine learning algorithms are going to return real values like this to us, and we need to convert them into an actual class. This very simple function does that for us. Now let's go ahead and actually use a classifier. First we're going to use decision trees, and it's very straightforward now. You do using DecisionTree, and the model is going to be a DecisionTreeClassifier with a max depth of 2. If you don't know what a decision tree is, you can look it up and study the model; it's a very powerful model and very useful. For us here, it's not important what the decision tree algorithm does or how it works. We're just trying to use it. So we create the model and we pass it the values. The fit function is going to do the actual learning, and this is the model that we are using. We give it all the training values and the labels for the training values. You will remember that we had the train IDs. So we take all the train ID rows and all the columns from X, and the train ID rows from y. That's all there is to it. We do that and we get the result out. I probably didn't run the cell here, and I have a typo in train IDs. There you go, this is now done. Now we can go ahead and do the testing. We can say the query is equal to the test ID rows and all the columns of X, and then we do the prediction: the prediction of this model for the query is this over here. Then we can find the accuracy. Finding the accuracy is again a very simple function. We give it the predicted values and the ground truth values. We check how many of the predicted values are elementwise equal to the ground truth values and divide by the total number of ground truth values.
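The accuracy calculation just described is a single broadcast comparison. The function name find_accuracy and the two small vectors are made up for illustration.

```julia
# Fraction of predictions that match the ground truth, element by element.
find_accuracy(predicted, truth) = sum(predicted .== truth) / length(truth)

predicted = [1, 2, 3, 3]
truth     = [1, 2, 3, 2]
acc = find_accuracy(predicted, truth)   # 3 of the 4 predictions match
```

The result is a fraction between 0 and 1; multiply by 100 for a percentage.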
So that's all our accuracy is: the fraction of correct answers. We can pass in predictions DT, which we calculated over here, and these are our ground truths. I really should fix this. Once we do that, we get an accuracy of 93.6, which is really good. Decision trees can be combined into random forests, and using them is again very straightforward. You get a RandomForestClassifier, and the number of trees in this classifier is going to be 20. We fit it just as before. So here we go. And we calculate the accuracy for this model as well. There you go. 3.6. Support vector machines are also available. You can do using LIBSVM. You pull out the training and testing values, so those go into the train variables such as y train, and we simply pass them over here. The reason we're doing this is that LIBSVM requires the data in a slightly different layout, so we have to transpose it. You just have to remember that LIBSVM is slightly different because it's traditionally done like that. Then we can go ahead and do the learning, and we can do the accuracy test as well to see how LIBSVM does on this dataset. So we get 95.74 with LIBSVM really quickly. People don't typically use these traditional machine learning models, but if you do want to, you can see that the code is very clean and very straightforward. You can work with this. In the next video, we are going to move on to neural networks. That is where machine learning really shines in Julia, because Julia starts with a very simple concept and that really grows into state-of-the-art machine learning. So in the next video, we're going to do that.
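As a recap of the decision tree and random forest steps in this section, here is a minimal sketch. It assumes the DecisionTree package is installed and uses a tiny invented dataset instead of iris, so the numbers have nothing to do with the accuracies quoted above.

```julia
using DecisionTree

# Hypothetical toy data: two features, two well-separated classes.
X_train = [1.0 1.1; 0.9 1.0; 1.2 0.8; 5.0 5.1; 4.9 5.2; 5.1 4.8]
y_train = [1, 1, 1, 2, 2, 2]
X_test  = [1.0 0.9; 5.0 5.0]

# A single shallow tree.
tree = DecisionTreeClassifier(max_depth = 2)
fit!(tree, X_train, y_train)
tree_predictions = predict(tree, X_test)

# A forest of 20 trees, fitted and queried the same way.
forest = RandomForestClassifier(n_trees = 20)
fit!(forest, X_train, y_train)
forest_predictions = predict(forest, X_test)
```

Both models take one row per sample, unlike LIBSVM, which is why the SVM code in the video transposes the data first.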
19. Writing a Neural Network from Scratch in a Few Lines: In this section of the course, we are going to use Julia to build neural networks from scratch using the machine learning library in Julia, which is called Flux. Now this is not a course on machine learning, so I'm assuming that you're already a little bit familiar with neural networks and that you've worked with other libraries for neural networks. If you haven't worked with neural networks before, there are tons of tutorials on the Internet. You can look them up and get familiar with them before you start with Julia, because Julia is kind of like advanced machine learning. It starts from zero, but it gets up to the state of the art really quickly. I'm going to give you a crash course on the basic concepts of machine learning, but only the concepts that are relevant to our discussion over here. To get started, training a neural network has two passes. One is the forward pass, in which you take some attributes of the data that you have, you multiply them with some weights, you apply some squashing function, and you get an output back. For instance, you might compute W times x plus b, and then apply a sigmoid or another activation function to it. That is your forward pass, and it's fairly straightforward. Then you do a backward pass, in which you calculate the derivative of the loss, or the error, that you've calculated with respect to all the parameters that you have in the model. So the W, or theta, over here needs to change so that when it is multiplied with x, the loss is minimized. You do that using the derivatives of the loss with respect to W. And then you update it using the gradient descent update rule: you take the old value of W, or theta, and you subtract the derivative of the loss with respect to it, multiplied by a learning rate, which is called alpha.
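Written out, the three steps just described look roughly like this. The symbols follow common convention and are assumptions, since the transcript does not show the on-screen formulas: sigma is the activation function, L is the loss, and alpha is the learning rate.

```latex
\begin{aligned}
\hat{y} &= \sigma(Wx + b) && \text{(forward pass)} \\
\nabla_W L &= \frac{\partial L}{\partial W} && \text{(backward pass)} \\
W_{\text{new}} &= W_{\text{old}} - \alpha \, \frac{\partial L}{\partial W} && \text{(gradient descent update)}
\end{aligned}
```

The same update applies to b and to every other parameter, each with its own partial derivative.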
So these are the three things that you have to do. Let's put them together. You have the forward pass, which is essentially just a mathematical function; it's not complicated. You have the backward pass, and then you have the parameter update step, which is also fairly straightforward. The backward pass is the complicated one, because you have a very complicated mathematical function and you have to calculate the derivatives of it. This is where people struggle with machine learning. You have to calculate derivatives with respect to millions of different parameters, and these are partial derivatives, which gets extremely complicated. So: forward pass easy, update step easy, backward pass difficult. The way Julia works, with Flux, is that it lets you specify the forward pass and the update step, but it takes care of the backward pass, which is the difficult part, automatically for you. That makes our life very easy as machine learning developers. Keeping this in mind, let's go ahead and see how Julia does this. We'll start off with a very basic mathematical function, calculate its derivatives, and then move on to the update step as well. The library is Flux. If you haven't installed it, you can install it using add Flux and then start with using Flux. This is our dummy function: 3x squared plus 2x plus 1, just a basic function whose derivative, or gradient, is 6x plus 2. But we're not going to calculate this by hand. We're going to get Julia to calculate it for us. So this is our function, and all you have to do is call the function in Flux which is called gradient, and tell it to calculate the gradient of f with respect to its one parameter, at the value x equal to 2. So this is the derivative of f with respect to x when x is equal to 2. So 6x plus 2 is what is going to be evaluated: six times two is 12, plus 2 is 14.
So we get the right answer over here, right? So that's the whole gradient part, which was the difficult thing, and we did not have to do it by hand. I wrote the derivative here, but we didn't need it. This is just a comment. Okay? We can go ahead and make this a little cleaner, because the result is in a tuple and we need a value. So we can define a new function, df of x. This df is just for ease of understanding. It can be any name. So df of x is equal to this gradient call, and this is the body of the function, right? If you don't remember, recall that this is a function which is defined in a single line. So this is the function name and the parameter, and this is the function body. What the function body does is call the gradient of f with respect to x, so this value, and return the first element, so 14, back. So you can have this, and then when you call df at 2, you get the value 14. Okay? So what df is now is the derivative of f with respect to x at a particular value. Okay? So this is very easy to work with. Make sure that you understand this. If you don't, go back and run this again, and calculate it for different values of x so that you are crystal clear on this. All right, so we're going to build on top of this. Now let's go ahead and take a look at functions which have multiple parameters. So for instance, we have a function f which takes two vectors as its input, x and y. Okay? What the function itself does is calculate the difference of corresponding elements in x and y, square them, and add them up. Okay, so it's your typical squared error. This is a simple function definition, nothing complicated going on over here. The only thing is that x and y are both going to be vectors, okay? No problem. Now, we are going to pass this w as x and this b as y. These are two vectors, this and this. We are going to calculate the derivative of f with respect to these two vectors, okay?
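A sketch of the two ideas above: the one-line df wrapper, and a gradient with respect to two vector arguments. The name sqdiff is a hypothetical stand-in (the video reuses the name f), and the vector values 2, 1 and 2, 0 are taken from the values mentioned in the discussion, written as floats here.

```julia
using Flux

f(x) = 3x^2 + 2x + 1
df(x) = gradient(f, x)[1]   # one-line function: keep the first tuple entry

println(df(2.0))            # derivative 6x + 2 evaluated at x = 2
println(df(5.0))            # another point: 6*5 + 2 = 32

# Two-argument function: sum of squared differences of matching elements
sqdiff(x, y) = sum((x .- y) .^ 2)

w = [2.0, 1.0]
b = [2.0, 0.0]

# One gradient vector comes back per argument
gw, gb = gradient(sqdiff, w, b)
println(gw)   # analytically 2 .* (w .- b)
println(gb)   # analytically -2 .* (w .- b)
```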
And because these are vectors, the derivative is going to be element-wise: a partial derivative with respect to this guy separately, and this guy, and this guy, and this guy. So there are going to be four partial derivatives over here. Again, if you don't recall calculus, don't worry about it. Just remember that we are calculating these derivatives, okay? So if you have one w here and another w here, we are going to calculate the derivative of the loss with respect to this w separately and this w separately. So if this was a vector w1, w2, we are going to calculate the derivative with respect to w1 separately and then w2 separately, and then, for instance here, b1 and b2 separately. So everything is going to be calculated separately, right? Because we need to use those values to apply the parameter update step. Okay? So let's go ahead and do this. The function is this guy over here. So this is the forward pass, and the backward pass is the calculation of the gradient with respect to each individual value. So we're going to do that, and we get four different values. What this is saying is: the gradient, or derivative, of f with respect to these two arguments. And because these are vectors, we calculate the derivative of f with respect to w1, where w1 is equal to 2, so that is the 0; then with respect to w2, where w2 is equal to 1, and that is this; and then with respect to b1 and b2. So all of these are calculated automatically for us. Now, this is slightly difficult to work with, so Flux provides an alternative syntax. Here you have to work out which entry belongs to w1, which to w2, which to b1, and so on. We have an easier method for doing this. What we can do is define a new variable, gs, and this is going to be the body of the function. What this is going to do is calculate the gradient with respect to these parameters, w and b, for this function. The syntax looks slightly convoluted, but it's the exact same thing.
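The alternative syntax described above can be sketched like this. It assumes a Flux release that still supports the implicit-parameter style through Flux.params; newer releases deprecate this style in favour of passing arguments explicitly. The values of w and b are the ones mentioned in the discussion, written as floats.

```julia
using Flux

f(x, y) = sum((x .- y) .^ 2)

w = [2.0, 1.0]
b = [2.0, 0.0]

# Implicit-parameter form: differentiate a zero-argument block with
# respect to the arrays registered through Flux.params
gs = gradient(Flux.params(w, b)) do
    f(w, b)
end

# Look up each gradient by the array it belongs to
println(gs[w])
println(gs[b])
```

Indexing gs by the array itself avoids having to remember the position of each gradient in a tuple.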
The gradient of this function with respect to the parameters w and b is going to be calculated, and it's going to be saved in gs, okay? So when you do that, you get some gradients back. If you say gs.grads, it gives you a huge mess. So it's kind of complicated. But the point of using this syntax is that now you can say gs[w] and you get 0, 2, so the same values back out. You can simply index this to get w1 and w2. Okay? So it works out similarly. You can go ahead and get the derivatives with respect to b, which are these guys over here. So, same idea. You have a function over here, you need the partial derivatives of it with respect to all the parameters, and you use this syntax. When we go ahead with Flux, we are going to get rid of even this stuff over here, and the final syntax is going to be very easy, but we are building this from scratch so that if you have to create your own models later on, you know how to do that, okay? The final syntax of Flux is very clean and very easy to work with. These are just the building blocks, and that's why it looks slightly messy in the beginning. Okay, now let's go ahead and create a basic model for Flux. So we're going to create this stuff. We are going to have some x's going in, and some weights are going to be multiplied with them. So we have some W. These are the random weights, so this is a 2 by 5 matrix, and the biases are a vector. Okay? So what is the prediction? You would have seen this before. We multiply the weights with x and then add b. Okay? So the prediction for x is Wx plus b. This is our basic forward pass for a single neuron. What is the loss? In the loss function, you do the prediction, y-hat, then you calculate the squared error against y, and you sum over all the different values that you have. That is your loss, just a basic loss. By the way, we write this y-hat by typing y, then backslash hat, then Tab, and it converts into ŷ. Okay?
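The model and loss described above need nothing beyond base Julia. A minimal sketch follows; the dummy data line at the end is included only so that the snippet runs on its own.

```julia
# Random weights (2×5 matrix) and biases (length-2 vector)
W = rand(2, 5)
b = rand(2)

# Forward pass of a single linear layer: Wx + b
predict(x) = W * x .+ b

# Squared error, summed over the outputs
function loss(x, y)
    ŷ = predict(x)            # typed as y\hat followed by Tab
    return sum((y .- ŷ) .^ 2)
end

x, y = rand(5), rand(2)       # dummy data
println(loss(x, y))
```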
This looks very similar to the mathematical notation which appears in papers. Julia prefers this, so you should get used to doing it, okay? Anyway, we have the loss function and we have the weights. Now what we want to do is calculate the loss for particular values. So x, y is going to be rand(5), rand(2). This is some dummy data. Our x values are going to be these, and our y values are going to be these over here, okay? So x and y. We can calculate the loss of x and y. What loss does is go over there and predict a y for these x values, and then calculate what the difference is. Because W was initialized randomly, we are going to get some random loss over here, 3.6. It's not good. Obviously, we are going to improve this by applying the backward pass, and for that we need the gradient. So we use this syntax: gradient, with params, for this loss function. So the gradient, or the derivative, of this loss function with respect to W and b is what we need. Okay? We do that. And because this is very common, we have an alternative syntax as well. You can use this one instead. It does the exact same thing as that. You have an anonymous function for the loss, and then the parameters over here, so this is an alternative syntax. It comes up in the documentation, so it's included over here. But if you understand one, the other is exactly the same thing, okay? Now we can say gs.grads and you get this whole big mess. But the point is, we can say the derivatives for W are gs[W]. Okay? We store that in W-bar, which you type as W, backslash bar, Tab. Now we have gs[W], which is this guy over here. Okay? So the derivative of the loss function with respect to each particular weight is in W-bar. Okay? We are going to apply this. So this is your alpha. W .-= this guy over here: old value minus alpha times the derivative. We do that, and all the weights are updated.
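A sketch of the gradient and update step just described, showing both spellings of the gradient call. It assumes a Flux release that still supports Flux.params, and the learning rate of 0.1 is an assumed value, since the video does not state the number used for alpha.

```julia
using Flux

W, b = rand(2, 5), rand(2)
predict(x) = W * x .+ b
loss(x, y) = sum((y .- predict(x)) .^ 2)
x, y = rand(5), rand(2)

before = loss(x, y)

# Two equivalent ways to ask for the gradients with respect to W and b
gs  = gradient(() -> loss(x, y), Flux.params(W, b))
gs2 = gradient(Flux.params(W, b)) do
    loss(x, y)
end

W̄ = gs[W]             # typed as W\bar followed by Tab
W .-= 0.1 .* W̄        # in-place step: old value minus alpha times gradient

after = loss(x, y)
println((before, after))
```

The in-place `.-=` matters here: it changes the contents of the same W that predict refers to, so the next call to loss sees the updated weights.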
If we calculate the loss now, you will see that the loss has gone down, because we have applied the backward pass once. We can also calculate the derivatives with respect to b. So that is this, and we update the biases as well. And now the loss is going to go further down. We can go ahead and do another pass by calculating the derivatives with respect to W and b again. So we're going to do this. You update W, you calculate the loss, and it goes further down. You again get the derivatives with respect to b, update your b, and calculate the loss. It goes further down, right? So each pass is going to be: calculate the gradient with the new W and b values, and update them. And that's your basic neural network iteration: calculate the derivative with respect to the current values of the weights, and update the weights. Let's go ahead and revise this whole thing again. It's very dense, so you might have to look at this again and again, but this forms the building block of the Flux library. And as you can see, if you get rid of everything else, it's just three or four lines of code, right? So let's go ahead and take a look at this. These are our random weights and biases. We initialize them randomly. Our forward pass is simply the multiplication of x with W and then the addition of b. Our loss is just the squared error, which we normally have. x and y are going to be some dummy data, so five dummy values for x and two dummy values for y. These are the values. First we look at what the loss is, then we calculate the derivative of the loss, and then we update the weights first and then the biases. That's all there is to it. So, going back to the figure: we do the forward pass, we calculate the derivative, and we update the weights based on the derivative that we've calculated. We do this again and again, and it becomes very clean. So let's go ahead and put this in one cell so that you understand what's happening.
So we have the weights. There you go. I'm going to do this in real time so that you can see. These are our weights and biases. Our predict function is this. Then we have our loss function. This is our loss function. Then we need the gradients. You can use either of these two syntaxes. So let's go ahead and put these in a separate cell. We take our data as well. So our data is going to be this. Let's put this over here. So I'm putting everything that has to be repeated here and everything that is done once over here. Okay? Finally, we have the weights over here, and we are going to update the weights using this. Similarly for the biases, we have the gradient for the biases and the update as well. Okay? And then we output the loss with respect to x and y. Okay? So hopefully this makes sense. Let's go ahead and output the initial loss as well. There you go. This is our initial loss. Now let's go ahead and do the iteration. The loss goes down; you do it again, and the loss goes further down; and you keep doing that, and the loss is going to keep going down. Okay? Hope you understand this: forward pass, backward pass, and the update. That's all there is to it. This is all you need to write a simple neural network in Julia. If you've done this with any other library, you will notice that this is extremely easy to work with. So in the next video, we are going to take these basic concepts and build a multi-layer neural network instead of a single neuron.
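The repeated cell from this video can also be written as a loop. The sketch below deliberately swaps in the explicit-argument form of gradient (W and b are passed in as arguments) instead of the params form used in the video, because the explicit form does not depend on the implicit-parameter support of a particular Flux release. The learning rate of 0.1 and the five iterations are assumed values.

```julia
using Flux

predict(W, b, x) = W * x .+ b
loss(W, b, x, y) = sum((y .- predict(W, b, x)) .^ 2)

W, b = rand(2, 5), rand(2)   # done once: random weights and biases
x, y = rand(5), rand(2)      # done once: dummy data
α = 0.1                      # assumed learning rate

losses = [loss(W, b, x, y)]  # initial loss
for step in 1:5
    # Backward pass: gradients with respect to the current W and b
    gW, gb = gradient((W, b) -> loss(W, b, x, y), W, b)
    # Update step: old value minus alpha times gradient, in place
    W .-= α .* gW
    b .-= α .* gb
    push!(losses, loss(W, b, x, y))
end
println(losses)
```

Printing the recorded losses shows the same pattern as re-running the cell by hand: each pass lowers the loss a little further.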
20. Multiple Layers, State-of-the-Art in a Few More Lines : We built a single neuron in the previous video. In this video, we are going to join different neurons together to create a multilayer perceptron. So as before, we are going to start with using Flux, and we are going to have a very simple model. We are going to have W1 and b1 for the first layer. These are the weights for the first layer and these are the biases. And our layer is very simply going to be: L1 of x is W1 times x plus b1. So that's our first layer, okay? Now, typically, after a layer has done the linear computation, you apply some sort of squashing function to it. One of the most typical functions is the sigmoid function, and that is built into Flux. You can say sigma of 0 is 0.5, sigma of 1 is about 0.73, and sigma of minus 1 is about 0.27. You can type this as backslash sigma, hit Tab, and it turns into σ. Okay? So this is built in. Now what we can say is: you define L1 of x so that first you do the layer 1 computation, so this computation over here, and then to whatever output you get, you apply a squashing function. Okay? And this dot is the broadcast, so you apply the squashing to all the different values. It works out very well. Okay? So that is all there is to it. Very clean, nice code. Okay? So we have L1. Now similarly, we can define L2 with W2 and b2. This is going to be our layer 2, and layer 2 is not going to have a squashing function. It depends on what you're trying to do. Here, we are not going to squash the second layer. So whenever you want to define the model now, your model is going to be: first we apply L1 on x, and then we apply L2. We don't have a squashing function here. If we did, we would have defined a squashing function over here as well. So that's all there is to it. Now what you can do is take a random five-element vector and pass it to the model, and you get the forward pass in an instant.
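A sketch of the two hand-written layers described above. The function names layer1 and layer2 are chosen for this sketch, and the 3 by 5 and 2 by 3 weight shapes are inferred from the five-in, three-between, two-out sizes discussed in the video. Flux is loaded only for its built-in σ.

```julia
using Flux   # provides σ, the built-in sigmoid

W1, b1 = rand(3, 5), rand(3)
layer1(x) = W1 * x .+ b1        # 5 values in, 3 values out

W2, b2 = rand(2, 3), rand(2)
layer2(x) = W2 * x .+ b2        # 3 values in, 2 values out

# Squash the first layer element-wise (the dot broadcasts σ), then apply layer 2
model(x) = layer2(σ.(layer1(x)))

println(σ(0.0))                 # the sigmoid of zero is one half
out = model(rand(5))
println(size(out))
```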
So you get the forward pass over here. You get two values, because that five-element vector is going to be multiplied with this, you are going to get three values out, and those are going to be squashed. And then these three values are going to go in and two values are going to come out. That is why we have these two values over here. If you're not sure about the mathematics, you can go ahead and look at this, but this is all there is to it. You are multiplying the matrices together and then adding a vector. Okay? This code looks slightly messy even now, so let's go ahead and clean it up. Our linear layer is going to be a function. We're going to define it as a function. So notice that this layer is simply a function. It takes the number of values coming in and the number of values going out, okay? W is going to be randn of out comma in. Notice that this is reversed. Like I said, five come in, three go out. Okay? We reverse this in the arguments because for us humans it's much easier to write the in value first and the out value later. I'm going to run this and then you'll see what this means. We are going to have b with random values, and the function of x is going to be simply this multiplication plus b over here, and this is what is returned. Okay? So let's do this. The reason this is defined as a function is that a function is going to be sent back. So linear of 5 comma 3 is going to be a function. When you say linear of 5 comma 3, it gives you back not a value, but a function that you can call later on. All right, so if you say linear of 5 comma 3, it gives you a function back, right? And what this means is five values coming in and three values going out. Okay? Now linear1 is going to be linear of 5 comma 3. What this means is that our first layer is going to be a linear layer in which five values come in and three values go out.
The second layer obviously has to have three values coming in, and it can have any number of values going out. We are defining this as two, and this three has to be the same, because layer 1 is going to connect to layer 2. Okay? So let's do that. So now we have linear1 and linear2. The good thing over here is that because we have defined these as functions, we can extract their local values as well. So you can say linear1.W. It's very easy, right? These are called closures, by the way, in case you're interested, and you can access the local variables of different functions this way. It's not all that important here. Now, our model is going to be as before: we do linear1 on x, then we squash it, and then we do linear2 over here. All right, so now this is the whole code that we have, very clean and nice. So this is our model. We can take our dummy data over here and send it to the model, and this is going to work out perfectly. An even easier way to write this is: you do linear1 of x, you apply the squashing function, and whatever value you get out, you pipe it to linear2. This is exactly the same as the nested version, except that now it looks a lot more sequential. In the function composition, linear2 is written first but called later, which looks kind of weird. This is much easier to read. You apply linear1 on x, you squash it, and then you pass it to linear2. And if you had more layers over here, you could pipe them to linear3 and so on and so forth. It would make a lot more sense. But for now we have just two layers. So we can do that. We can say modelp of x, and it's exactly the same thing. Okay? Now, we have defined this linear layer ourselves, but Flux already comes with these layers built in. We have done this by hand over here so that you understand how easy it is to create your own type of layers.
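The hand-written linear layer and the two ways of composing it can be sketched as follows. The exact placement of the pipe in modelp is an assumption about what was typed on screen; both definitions compute the same thing. Flux is loaded only for σ.

```julia
using Flux   # only for σ

# A layer factory: returns a function that remembers its own W and b
function linear(in, out)
    W = randn(out, in)   # rows = values going out, columns = values coming in
    b = randn(out)
    return x -> W * x .+ b
end

linear1 = linear(5, 3)   # 5 in, 3 out
linear2 = linear(3, 2)   # 3 in, 2 out (the 3 must match layer 1's output)

# The returned function is a closure; its captured variables are fields
println(size(linear1.W))

model(x)  = linear2(σ.(linear1(x)))     # nested calls, read inside out
modelp(x) = σ.(linear1(x)) |> linear2   # piped, read left to right

x = rand(5)
println(model(x) ≈ modelp(x))
```

Returning a closure keeps each layer's parameters private to that layer while still letting you inspect them, which is the same idea the built-in layers follow.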
Later on, you can go ahead and access everything that is in your neural network using this very simple syntax. Let's go ahead and look at the layers already provided in Flux. So, a couple of examples. You can say using Flux. We can go ahead and just restart the kernel. Okay? So restart and clear all output, so that you're sure that we are using the Flux versions over here. So we're using Flux. It takes a little bit of time. And then we can say layer1 is equal to Dense of 10, 5, and the activation function is σ: the number of values going in, the number of values going out, and the squashing function. We defined this ourselves up above. Here we are using the pre-built layer, but this Dense has been defined in exactly the same way that we defined linear over here. So nothing new, it's just a little bit easier for us to work with. We can go ahead and use this syntax. Or an easier syntax would be to use the Chain function. It does exactly the same thing as this guy over here, the pipe symbol. But again, it depends on what you like. For some, this is much easier to read, and for some, this one over here. Chain is the common method for Flux, so I'm going to stick with it. So we chain the layers. The first layer is going to be a dense layer in which ten values come in and five values go out. Okay? So the input size has to be 10, the number of nodes is five, so five values go out. In the next layer, we are going to have five values coming in and the number of nodes is two, so two values go out, and then we apply the softmax, which you would be familiar with. This is built in as well. So this is our model. And now we can call model2 on these random values. When you run that, you get the forward pass out. In this video, we have taken a look at how to create multiple layers for our neural network. In the next video, we are going to take these building blocks and apply them to MNIST.
So MNIST, as you might be aware, is a very common problem when learning machine learning. And we are going to use the Flux library to do MNIST classification. So that's in the next video.
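The Chain model from this video can be sketched as below. The two-integer form Dense(10, 5, σ) follows the video; newer Flux releases also accept the pair form Dense(10 => 5, σ). The Float32 input is a choice made for this sketch so that it matches the default parameter type of the built-in layers.

```julia
using Flux

model2 = Chain(
    Dense(10, 5, σ),   # 10 values in, 5 out, sigmoid activation
    Dense(5, 2),       # 5 values in, 2 out, no activation
    softmax)           # turn the 2 outputs into probabilities

out = model2(rand(Float32, 10))
println(out)
println(sum(out))      # softmax outputs add up to one
```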
21. Case Study: MNIST, Modifying Data for Model, Avoiding Pitfalls : Having seen the fundamentals of Flux, we are now in a position to use the facilities provided by Flux to do a real-world case study. This is still a toy example, but it's very representative of the stuff that we have to do in the real world. So we are going to start off with the MNIST example. It's the typical machine learning example that people usually do whenever they're starting to learn machine learning. These are handwritten digits from 0 to 9, and each of the images has an associated label, which is the ground truth. The images have a single channel, which is a grayscale channel. And this is typically a 28 by 28 image in which each pixel is represented by a value saying how dark it is. The idea is to take these images and feed them to the machine, give it some examples, and then see if the machine has learned. So that is what we're going to do over here. Let's move over to our example code. We are going to initialize our random number generator with a seed. This is going to ensure experiment reproducibility. You can set this to any value, the particular value is not important, but make sure that you keep it fixed so that you can run the same experiments again and again. Okay, so let's run that and import some of the stuff that is going to be needed. We're going to use Flux and the Statistics package. We are also going to need the MNIST dataset that comes built in with Flux. And we also have the onehotbatch function from Flux, which essentially converts categorical values into one-hot representations. Okay, so let's go ahead and run that. We're going to load the images using MNIST.images and then see what the type of that is. So the type of images in this case is going to be an array, right? This looks really complicated, but we're going to break it down. It's a one-dimensional array, which means it's an array of images.
So all of this stuff over here that you see is going to represent one image. Okay? Make sure that you understand this. We have to break this down a lot. So the whole images variable is an array, a one-dimensional array of images. So this is one image. What is one image composed of? It's again an array, but a two-dimensional array, because it's an image: it has a height and a width. Okay? And what is an individual cell composed of? It's a ColorTypes Gray value. So it's a grayscale image in which each individual value is stored as an 8-bit unsigned integer. Okay? This is not all that important. All you have to understand is that it's a 2D grayscale image, and then we have a collection of 2D grayscale images. Okay, that's what we have as our images collection over here. Let's dissect that a little bit so that we understand it really well. The reason is that whenever you're doing machine learning, you have to really understand your data. Once you understand your data, feeding it to a typical machine learning model will be really straightforward. But if you don't understand your data, it's going to be problematic. So we're going to spend a little bit of time on that over here and see how Julia works with data. Okay, so we're going to import the Plots package and try to plot the first image that we have. So we ask for a plot of images[1], so the first digit image that we have, this guy over here, okay? We're going to plot this, and the size is going to be 200 by 200. Okay? When we do that, Julia just knows that this is an image, so it's going to plot it like an image. So the image is over here. If you look at the size of images, it's 60,000. This is the number of images that we have. And for a particular image, the size is 28 by 28. Okay, well, that makes sense. Now we are going to load and encode the labels.
So if you call MNIST.labels, you get all the labels, and you can look at the first five to see what they look like. They are 5, 0, 4, 1, 9. So the first image that we saw over here is a 5. Obviously it looks like a 5 to us, but for the machine it's going to be difficult. Anyway, what we want to do now is convert this 5 into a one-hot vector representation. Okay? If you take a look at the size of the labels, it's 60,000, which corresponds to the 60,000 images. We are going to use onehotbatch to convert all the labels into a one-hot representation in which the possible categories go from 0 to 9. Okay, let me output the first label here, and then you'll see what this means. We have all zeros and just one value is hot, or on, a one, because this is the value that corresponds to 5. So 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. This is a 5, so we have the entry for 5 over here set to one. Please do not confuse this with the one-based indexing of Julia. Julia still does one-based indexing, but these are the labels 0 through 9, the digits that we're talking about. That's why we have this going from 0 to 9. We can take a look at the second one. The second one has to be 0, so the entry for 0 is going to be one. We can go ahead and look at all five of these, from one to five. The first one should be 5, the second 0, the third 4, and so on. So this is 5, this is 0, this is 4, this is 1, and this guy over here, counting 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, is 9. Okay? So all of these are converted into this representation, which is called the one-hot vector representation. This makes it a lot easier for the machine to work with, and it makes the mathematics really easy. You'll see what this means in a minute. Okay? So the size of labels now is not just 60,000, but 10 by 60,000: ten rows and 60,000 images. Okay? Right?
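The one-hot encoding described above can be tried without downloading the dataset. The sketch below types in the first five labels mentioned in the video by hand; the variable names are chosen for this sketch.

```julia
using Flux: onehotbatch, onecold

raw = [5, 0, 4, 1, 9]              # first five labels, entered by hand
labels = onehotbatch(raw, 0:9)     # one row per digit class, one column per label

println(size(labels))              # ten rows, five columns
println(labels[:, 1])              # column for the digit 5

# Row 6 holds the class "5" because the classes start at 0
# while Julia's indices start at 1
println(labels[6, 1])

# onecold goes the other way: from one-hot columns back to digits
println(onecold(labels, 0:9))
```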
Now what we want to do is create batches for feeding into the machine learning algorithm. Typically we don't want to feed the whole dataset to the algorithm in one go. The reason is that nowadays datasets are huge, and we have to break them into batches to feed them, because your typical dataset is not going to fit in memory. MNIST can fit into memory, but we have to learn how to handle the datasets which do not. So we're going to see how to create batches and prepare our data for when it cannot fit. What we're going to do is use what is called partition from Base.Iterators. Let me first give you an example of this outside of machine learning. If I say using Base.Iterators: partition, and I say for i in partition(1:7, 2), you will see that i goes from one to seven in steps. Each partition is going to be of length two: 1 to 2, 3 to 4, 5 to 6, and then 7 to 7, because that's all we have left over here. Okay? So you can create partitions of any length. From one to ten, you can create partitions of size 2. You can create partitions of size five, and that is going to give you 1 through 5 and 6 through 10. So this is a really easy way of creating partitions, instead of us having to loop over this stuff manually. Okay, so let's go back to our images. So this is an image, a 28 by 28 image. Let's take a look at the type of its first cell over here, so images[1][1, 1]. This is going to be a grayscale value, right? So this one cell over here is a grayscale value. Now, this needs to be converted into a float, because machine learning algorithms work with floats. They do not work with grayscale values or this really weird datatype. They work with floats. So we have to convert it. Let's start off with the type of images. This is the type that we saw earlier. So it's an array of one dimension. These are all the images.
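The partition example mentioned above uses only base Julia and can be run as is. Each chunk is passed through collect before printing so that the elements are listed in full.

```julia
using Base.Iterators: partition

# Walk over 1:7 in chunks of two; the last chunk holds whatever is left
for chunk in partition(1:7, 2)
    println(collect(chunk))
end

# Chunks of five over 1:10 give exactly two full chunks
chunks = [collect(c) for c in partition(1:10, 5)]
println(chunks)
```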
Each image is a 2D array, because it's a 2D image, and it has grayscale values as its individual elements. We want to convert this whole thing into floats. We can simply go ahead and say float.(images). If you do that, you will see that the datatype changes immediately to a proper format. So we have an array which is a one-dimensional array. Each element is going to be one image, and that is a two-dimensional array of floats. Perfectly clean, right? The problem here is that this is Float64, and a lot of what comes with Flux is going to work with Float32. So if you do not do this conversion from Float64 to Float32, it's going to be really problematic down the line and it's going to give you errors. It's one of those things that tutorials typically don't cover, but I want to cover it over here so that you don't have the headaches down the line. So we are going to convert this float.(images), this guy over here, into a one-dimensional array of two-dimensional arrays of Float32, not Float64 but Float32. So this is the difference. Let's read this again. We are going to convert this guy into this guy. Okay? Let me put this on a separate line. So this is going to be a one-dimensional array, which is the collection of images. One image is going to be a two-dimensional array in which each cell has a Float32 value. Okay? So let's convert this. Now if you take a look at the size of the first image, it's still going to be 28 by 28. So that's what we need. What is the type of images now? An array of arrays of Float32. So the difference between this and this is that this is Float64 and this is Float32, which is what we need for our models, okay? Right? Now, as we saw earlier, each individual image is going to be a grayscale image. So it's going to have just one channel, right?
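The Float64 to Float32 conversion can be illustrated without the dataset. In the sketch below, three random 28 by 28 matrices stand in for the result of float.(images); the stand-in data and the variable names are assumptions made for this sketch.

```julia
# Stand-in for float.(images): a vector of 28×28 Float64 matrices
images64 = [rand(28, 28) for _ in 1:3]
println(typeof(images64))

# Convert to a one-dimensional array of two-dimensional Float32 arrays
images32 = convert(Array{Array{Float32,2},1}, images64)
println(typeof(images32))

# The shape of each image is unchanged; only the element type differs
println(size(images32[1]))
println(eltype(images32[1]))
```

A comprehension such as `[Float32.(img) for img in images64]` is another way to get the same result.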
If you're not familiar with channels, a typical color image has three channels: red, green, and blue. So it's going to have this thing repeated three times, once for each color, and those colors combine to form the image that we see as humans as a color image. Okay, so red, green, and blue. That's three channels. Here, we have just one channel, but we have to convert the data into a different format in which the channel is explicitly specified, right? If you take a look at the image, it's just a 2D image over here. We have to convert it so that the channel information is also there. So let's go ahead and do that. For instance, we take this guy over here, images[1:2], so the first two images, and we concatenate them along the fourth dimension. That is what cat does. Right now the data has three dimensions: the height, the width, and the number of images. We have to convert it into four, so an extra dimension is going to be introduced. It seems complicated. I'll output this and then you'll see what this means. Okay? So t1 is now going to be 28 by 28 by 1 by 2. These are two images, because we took two images over here. And each one is 28 by 28 by 1 now, so this 1 has been introduced as an extra dimension over here. The reason for this is that right now it's just one channel of grayscale, but if it was a color image, it would be 28 by 28 by 3, and RGB would be handled perfectly well by our model. Okay, hope that made sense. The syntax over here is that you create a tuple with two elements. The first is going to be the image, or the collection of images, and the second is going to be the collection of labels. Okay? So we look at the first element, and it's 28 by 28 by 1 by 2. So this 1 is the number of channels and this 2 is the number of images. Okay, hope that made sense. Okay. We're going to put all of this together. Our batch size is going to be 1000. And now we can create our training dataset by combining everything together.
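The effect of cat with dims = 4 is easy to check on stand-in data. In this sketch, random Float32 matrices play the role of the images.

```julia
# Stand-in for the converted images: five random 28×28 Float32 matrices
images = [rand(Float32, 28, 28) for _ in 1:5]

# Splat the first two images into cat and stack them along dimension 4.
# Dimension 3 does not exist in the inputs, so it is filled in with size 1;
# that singleton dimension is the single grayscale channel.
t1 = cat(images[1:2]..., dims = 4)

println(size(t1))   # height, width, channels, number of images
```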
We are going to create a list of pairs: the first batch of training images with the first batch of labels, and so on. Where is the batch going to come from? It's going to come from the partition of 1:60_000. You can use an underscore to make the number more readable. We first create batches of batch-size elements each, so we are going to have 60 batches: 1 to 1000, 1001 to 2000, and so on. For each of these, we collect the images, increase the dimensions by one, and pair them with the labels. Okay, so let me run this and then show you what the output is. The training data is now 60 batches. train[1] is the first batch. It has two things, the training images and the training labels. The training images are in train[1][1]. These are 28 by 28 by 1 by 1000. So 1000 images, one channel, and each channel is a 28 by 28 grayscale image. And the second element over here is 10 by 1000. So 10 for the one-hot encoding and 1000 for the images, because that is our batch size. Hope this made sense. Please go through this again and again so that you understand everything that's going on. Okay? Similarly, let's also create the test data. The test data is going to come from MNIST.images(:test). We are only going to read 1000 test images. That's all we need. We are going to convert this into floats. And if you take a look at that, it's Float64, so we have to convert it into Float32 as before. Then we take a look at its type, and now this is Float32. Everything is now Float32, so everything works fine. We are also going to reshape our test images using the same dimension trick, so we have a channel added. And we are going to convert our test labels using onehotbatch as well. Okay? If you do that and take a look at the size of the test images, it's 28 by 28 by 1 by 1000. So that's 28 by 28 images with one channel, and 1000 test images, and the labels are 10 by 1000. So one-hot for this guy and then 1000 images. Okay? So now our data is ready.
We can go ahead and create the actual CNN model and see how this data can be fed to the CNN model, which is going to be fairly straightforward. Now, let's do that in the next video.
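To make the batching steps from this video concrete, here is a minimal sketch in plain Julia. It is not the notebook's code: it assumes random stand-in data (6,000 samples rather than 60,000, to keep it light), and `onehot_columns` is a hypothetical hand-written helper standing in for the library's one-hot routine.

```julia
using Base.Iterators: partition

# Hypothetical stand-ins for the MNIST images and labels.
n_samples  = 6_000
batch_size = 1_000
images = rand(Float32, 28, 28, n_samples)
labels = rand(0:9, n_samples)

# Hand-written one-hot encoder (hypothetical helper): one row per class,
# one column per label.
function onehot_columns(ys, classes)
    out = zeros(Float32, length(classes), length(ys))
    for (j, y) in enumerate(ys)
        out[findfirst(==(y), classes), j] = 1f0
    end
    return out
end

# One (images, labels) tuple per batch of indices. The reshape adds the
# channel dimension, giving 28 x 28 x 1 x batch_size.
train = [(reshape(images[:, :, idx], 28, 28, 1, length(idx)),
          onehot_columns(labels[idx], 0:9))
         for idx in partition(1:n_samples, batch_size)]

println(length(train))       # 6
println(size(train[1][1]))   # (28, 28, 1, 1000)
println(size(train[1][2]))   # (10, 1000)
```

With the full 60,000 samples and the same batch size, the same comprehension would yield the 60 batches mentioned in the video.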
22. MNIST Continued, Creating the Deep Model, Training and Testing : In the previous video, we set up the data that has to be fed to the machine learning model. And here we are going to create the CNN model and everything else that we need to make it work properly. Okay, so first the model. It's going to be a Chain, which is a sequential model. The first layer is a convolution layer in which the filter size is going to be three by three. And it's going to have one channel coming in and 16 channels going out, okay? And the activation is going to be ReLU, the rectified linear unit. The next layer is going to be a pooling layer, so max pooling. If you're not familiar with that, you can look at any machine learning tutorial that explains what max pooling and everything else is. But really it's fairly straightforward. Okay, then we have another convolution layer in which 16 channels come in and eight go out. That is followed by a max pooling layer again, then we flatten it, and then there is a dense layer with 10 outputs because that is the number of classes that we have. And finally we have a softmax. So that is our basic model. We can run this and we get the model out. So this looks really weird, but that's our model that we have defined over here. We are going to need onecold, which is the opposite of onehot. So if you give it a one-hot representation, it's going to return the class for you. We have crossentropy, which is a loss, and we also have throttle, which allows us to perform some tasks while the learning is going on. We'll see what this means in a minute. Okay? So we're going to have m equal to the model. So this is just to make the code look a little cleaner. We are going to have our optimizer as Adam. There are many other options: you can use the basic gradient descent, you can use the momentum-based one, you can use Nesterov, which is really popular, and you can use NADAM as well.
We're going to stick with Adam as our optimizer for now, okay? We also need to define what the accuracy is. So for accuracy, you convert whatever the model predicts back using onecold. So if it predicted a one-hot of five, we are going to convert this into a 5. And we're also going to convert our y back using onecold. And we're going to do an element-wise equality check and average the result. So essentially what this means is, on average, how many we get right. So that's what accuracy is: how many answers you got right. Okay, so that is our accuracy. Our loss is going to be Flux.crossentropy between what the model predicted and what the ground truth was. That's our loss. Okay, so cross-entropy was covered in the previous video or the one before that, where we defined our loss manually. Here we can just use the cross-entropy that comes with Flux, so we don't have to define it by hand. Okay, so that's our loss. We can go ahead and perform the training using the loss, the parameters, the training set, and the optimizer. So this we can do perfectly well. So instead of having to calculate different metrics by hand after a little while, we can give it a callback. So what a callback function does is it's going to be called automatically by the training function at a given interval. So for that we are going to define a new function, evalcb. So this is over here and we're going to pass it to throttle. What throttle does is, every this many seconds, it's going to call this function for you. So that's what throttle does. It asks you for a function and a time. So every 10 seconds, it's going to call this function. And what is this function doing? It's not taking any parameters and it's simply going to @show the accuracy on the test set at that time. So if you recall, we created this test set, and every 10 seconds it's just going to show the accuracy on the test set at that time. Okay, so this makes sense. So once again, we'll define the loss as cross-entropy.
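The accuracy and loss described here can be spelled out by hand, which may help to see what the library calls are doing. The following is a minimal sketch in plain Julia, not Flux's implementation: `onecold_columns`, `accuracy`, and `crossentropy_mean` are hypothetical helpers, the three-sample data is made up, and the library's own cross-entropy may differ in details such as how it averages.

```julia
using Statistics: mean

# Decode each column of a score (or one-hot) matrix to the class whose
# entry is largest.
onecold_columns(scores, classes) =
    [classes[argmax(col)] for col in eachcol(scores)]

# Fraction of samples whose decoded prediction matches the decoded truth.
accuracy(pred, truth, classes) =
    mean(onecold_columns(pred, classes) .== onecold_columns(truth, classes))

# Cross-entropy averaged over samples; `tiny` guards against log(0).
crossentropy_mean(pred, truth; tiny=1f-7) =
    -sum(truth .* log.(pred .+ tiny)) / size(truth, 2)

classes = 0:9

# Made-up ground truth for three samples: labels 5, 0, and 9.
truth = zeros(Float32, 10, 3)
truth[6, 1]  = 1
truth[1, 2]  = 1
truth[10, 3] = 1

# Made-up predictions: each column sums to 1, with 0.55 on one class.
pred = fill(0.05f0, 10, 3)
pred[6, 1] = 0.55f0   # correct (label 5)
pred[1, 2] = 0.55f0   # correct (label 0)
pred[3, 3] = 0.55f0   # wrong (predicts label 2, truth is 9)

println(onecold_columns(pred, classes))   # [5, 0, 2]
println(accuracy(pred, truth, classes))   # two of three samples correct
println(crossentropy_mean(pred, truth))
```

The wrong third sample contributes the largest share of the loss, because the model put only 0.05 on the true class there.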
We have the parameters for model m, which are going to be collected automatically because we have defined the model over here. So we don't have to do it by hand. We have the training set. We have the optimizer, which is currently set to Adam, but you can change it. We have the callback function, which is going to be called automatically every 10 seconds to show the accuracy. Okay? So if you do that, it's going to run the training and every 10 seconds or so it's going to output the accuracy on our test set. So let's see how well this does. So it starts off with a really poor accuracy of 0.05. So that's really horrible. But after a little while, it jumps to 0.64. So you can go ahead and run this again. So if you run this again, it's going to start off from where you left off. So 0.6 goes to 0.72 and then it's going to improve and go all the way up to approximately 90 percent accuracy, which is fairly good for a start, okay? It might look slightly complicated at the moment, but what we're going to do now is clean everything up, get rid of everything that was here for explanation, and make the code look really clean so that you can see what you actually have to do when using Flux. Okay, so let's go ahead and do that. So we're going to restart the kernel and clear all the outputs so that you know that this is working perfectly fine from a fresh start. So this is all we have. We have some imports. We have the function that gets the training and test data, and we pass it the batch size. It goes ahead and loads the MNIST labels. It loads the images, converts them to the proper format, and creates a training set based on the partitions that we have according to the batch size. So that's the training images and training labels. We also have the test images and the test labels. So we do the same thing over here and then return all of these. So that's our function for getting the data.
We have a simple function over here that creates the model. This is just separated out so that we can look at it and change it later on. Finally, we have a train model function which asks you how many iterations it should run. And it also has a default parameter for the optimizer. Currently, we have set this to Adam, but you can pass it a different one. We are going to do m equal to build model. We are going to get all the training and test data. We are going to set the loss. We are going to define what accuracy means. We are going to set our callback and then go ahead and do the training for iterations number of times. Okay, so that's all there is to it. Now you can go ahead and call this with the optimizer set to Adam and iterations equal to three. So you can just run that. It takes a little bit of time, but that's because Julia has to do a lot of compilation the first time around. The next time it runs really fast. So as you can see, it starts off with a poor accuracy of 0.18. And it's going to jump to a really high accuracy pretty quickly. After a while, it finishes with an accuracy of 0.88. So 88 percent accuracy, not bad, but not really state of the art. But we've only run the iterations for a little while. So as you can see, the whole code that we have is very clean, it's a very small piece of code, and it works really well, okay? And the idea is that we can now go ahead and experiment using different optimizers, different numbers of iterations, and other options. So we can go ahead and run this for NADAM, which works really well. We can also run it for RADAM and the basic gradient descent algorithm, as well as a gradient descent algorithm which has been set to learn a little faster. So the alpha value, the learning rate parameter, is set to 0.4. So I'll run all of these and wait until all of these are done and then we can discuss briefly the results.
So all of these have finished running, and we can see that Adam topped out at 88 percent. NADAM went right up to 91.6%. The remaining Adam variant did really poorly, like 66 percent only. The basic gradient descent algorithm went all the way up to 91.8%. And the gradient descent algorithm with the learning rate alpha set to 0.4 went really quickly to 0.959. So this works really well. So this goes to show that even state-of-the-art methods may not perform well if you don't set them up properly. So the basic gradient descent algorithm, which is like 2012 or I guess even older than that, works really well if you set the alpha parameter properly. So the point is that you can go ahead and experiment with all of these different things. And if you want to look at other stuff, you can click on this link, and that will take you to the Flux documentation, where you can take a look at different optimizers and their respective parameters as well. So a lot of stuff to go over here, but you can see that you have the Descent algorithm that we've used, in which you can set the learning rate. They call it eta; we typically call it alpha. You can also have the Momentum optimizer and Nesterov and so on and so forth. Quite a lot of them are available over here. So this documentation is typically in transition, so it can be difficult to work with. But once you understand this code over here, you should be able to read it really well. So that concludes our discussion about how to use machine learning algorithms. All you have to do now is go ahead and change your model over here and experiment further with this. In the next video, we are going to see how you can move your experiment from one machine to another, or continue an experiment that you were running earlier, by saving your model and loading it later on.
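To see why the learning rate matters so much for plain gradient descent, here is a toy sketch in plain Julia. It is an illustration under stated assumptions, not the MNIST experiment: the loss is a made-up one-dimensional quadratic, and `grad` and `descend` are hypothetical names. The update rule itself is the standard one, w ← w − eta · gradient.

```julia
# Gradient of the toy loss f(w) = (w - 3)^2, whose minimum is at w = 3.
grad(w) = 2 * (w - 3)

# Plain gradient descent: repeat w <- w - eta * grad(w).
function descend(eta; w=0.0, steps=5)
    for _ in 1:steps
        w -= eta * grad(w)
    end
    return w
end

slow = descend(0.1)   # small learning rate
fast = descend(0.4)   # larger learning rate, same number of steps

println(abs(slow - 3))   # remaining distance to the minimum
println(abs(fast - 3))   # much smaller after the same five steps
```

On this toy problem the larger rate gets much closer in the same number of steps, but a rate that is too large (above 1.0 here) would overshoot and diverge, so the right value always depends on the problem.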
23. Saving and Loading Models, Exploring More Options : Modern machine learning models require a lot of time for training and prediction. So it's a very common use case that you are working with a machine learning model, it does some learning, and then you have to stop your machine. Or if you're running on the cloud, you have to stop the machine over there, and it's necessary that your progress does not go away. For that, we have to save whatever learning the model has done to a file for persistent storage. So here we are going to take a look at how we can do that using Julia. So I've taken the code from our previous session. So this is the whole model that we have. The actual model is defined over here. I've gone ahead and run this to save time. We call the same function as we did last time. We define the loss, the optimizer, and the callbacks, and then we go ahead and train it. So we've run this model once and it has got up to an accuracy of 0.564, okay, and then we've stopped it. Now what we want to do is save this progress in a file so that we can load it at a later stage. Okay, so for that we are going to use the BSON (binary JSON) library. So this is the recommended way of saving your model's weights. So this is essentially going to save the weights of the model that we have learned. So all you have to do is import the @save macro from the BSON library. You can install the package using this if you haven't already done that. So we can install this and then create a directory called saves. And you will see that this directory has been created over here. Okay? Then all we have to do is say @save, give it the file name, and then give it the model that we have to save. So this is the model, this guy over here. So the parameters have been learned as a result of this train call. And now we can go ahead and save these. So that is all you need. So you save that and it gets saved in this saves directory. Okay?
So if you go over there, you can see that a file named my model 001, with the .bson extension, has been created. Now we need to verify that this is indeed working properly. So what we're going to do is restart the kernel and load this file into our model. So let's go ahead and do that. So Kernel, then Restart and Clear All Outputs, so everything goes away. So hit Restart. Now everything is going to go away and we know that the model will have to start from scratch if you don't load it. So let's go ahead and run this function again and the model as well. So this just gets our helper functions ready. And we are going to go down here and then run this. Okay? So when we do that, you will notice that the training is going to resume from the previous point. And the way we're going to see that is that it's going to start somewhere above the 75 percent accuracy that we achieved before the model was saved. Okay, it will not start from 0.01 as usual. So when we run this, we actually load the model. We are going to use BSON and load the weights into our model using the @load macro this time. Okay? So everything from this file goes into the model. And if you try to display the model, you can see that it's there. We define the loss and accuracy and the optimizer, everything else. And now if you go ahead and do Flux.train!, it's going to resume from the point where we left off, somewhere around the 75 percent accuracy, not from the 0.06 that we had earlier. So it's going to start off from that point where we left off. And you can save it again and load it at a later stage to ensure that everything is working properly. So as you can see, it has resumed from the 75 percent accuracy point. And if you save it now, then the next time it's going to resume from 85 percent. Now, another use case that is very common is that you have a very long-running model that executes, say, overnight.
And you want to make sure that you're saving checkpoints automatically after a certain period of time so that you don't lose your progress if your model starts diverging or if you have some sort of problem in your model. So for that, we want to save checkpoints automatically as our training progresses. And that's very straightforward. We are simply going to modify our callback function. It's not only going to show the accuracy, but it's also going to save the model at regular intervals and mark each file with the timestamp at which it was saved. Okay, so again, we're going to import the @save macro from the BSON library. We are also going to need the now function from the Dates package. And all you have to do is modify the callback function. We had just @show accuracy earlier. So now every 10 seconds it's going to show the accuracy and it's going to save the model for us as well. And then we can go ahead and simply run this as before. So if you work with Keras or something like that, it's very difficult to save checkpoints automatically in a clean way. And here, the callbacks are so nice and clean that they work really well. So we run this and you will notice that our model checkpoints are going to start appearing over here in our saves directory. So as soon as we saw the accuracy, the model was saved over here. And every time the callback is called, a checkpoint is going to be created over here for us so that we can go ahead and access these later. You can modify the callback function according to your own needs so that it saves the model checkpoint more frequently or less frequently or something like that. Now you are in a position to run long-running models, save your checkpoints, and go back to a previous checkpoint if something goes wrong. So that's it for Flux.
And in the next video we are going to wrap this whole thing up and give you a few details on how you can go ahead and expand your knowledge further if you want to continue working in this area.
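The checkpointing idea from this video can be sketched with the standard library alone. This is a rough illustration under stated assumptions, not the notebook's code: Julia's built-in Serialization stands in for BSON's @save and @load, a plain dictionary stands in for the trained model, `throttled` is a hypothetical and much simplified version of a throttle helper, and a temporary directory replaces the saves folder.

```julia
using Dates
using Serialization: serialize, deserialize

# Simplified throttle (hypothetical helper): wraps f so that it runs at
# most once every `interval` seconds; calls in between are skipped.
function throttled(f, interval)
    last_run = -Inf
    return function ()
        t = time()
        if t - last_run >= interval
            last_run = t
            f()
        end
        return nothing
    end
end

save_dir = mktempdir()                              # stand-in for "saves"
weights  = Dict("dense_w" => rand(Float32, 10, 4))  # stand-in for a model

# Callback body: write a checkpoint marked with the current timestamp.
# Serialization is used here in place of BSON.
function checkpoint()
    stamp = Dates.format(Dates.now(), "yyyy-mm-dd_HH-MM-SS")
    serialize(joinpath(save_dir, "model-$stamp.jls"), weights)
end

cb = throttled(checkpoint, 10)
cb()   # first call: a checkpoint is written
cb()   # skipped: fewer than 10 seconds since the last run

files = readdir(save_dir)
println(length(files))   # 1

# Loading the checkpoint back gives the same values that were saved.
restored = deserialize(joinpath(save_dir, first(files)))
println(restored == weights)   # true
```

During real training the callback would be invoked many times by the training loop, and the throttle keeps the number of files on disk manageable by writing one only every few seconds.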
24. Where to Go from Here: Pointers for Further Learning : I hope you're having fun with this course and you are learning a lot in terms of what Julia can do, what Flux can do for you, and how you can take your machine learning game to the next level. Now I'm going to leave you with some parting words and give you some guidelines on how you can continue your learning based on the foundation that you have gotten from this course. The very first thing that you can do is search for Flux Julia model zoo. This includes a lot of examples of how Flux can be used to build the latest state-of-the-art models. So for instance, you search for that and you arrive at this model zoo, which has a lot of stuff going on in it. So you have the usage details over here, how you can access it, all of that stuff. But a lot of this you've already done. For instance, if you go into text slash char-rnn, there is the char-rnn.jl file, so a Julia file, over there. You can just download this script and run it using the command line as well. Or you can take it to your notebook, but you will see that it's almost the same. So you get the data, you build the model, and you define your loss. You set your optimizer, you define your callback function, and then you go train. So we've done all of these bits in our course. So now you should be able to follow this model fairly easily. All the rest of the stuff around it is just loading the data and working with the data. So it's not going to be all that difficult. You can also go ahead and look at the DCGAN model. So the reason I'm showing you this is there are a lot of things that can be done in Julia. And if you start exploring these examples, you will come to understand these. Please feel free to ask questions in the course. I would be happy to help out if you get stuck at any point. So for instance, if you want to use the GPU, all you have to do is define your model. So this is your model.
This function defines the model, and then you say pipe to gpu. That is going to ensure that your model now works on the GPU. It's as simple as that. So I haven't covered this in the course, but if you look at the code, it's going to be extremely straightforward. So all you have to do to take your model to your GPU is define your model and just pipe it to gpu. So here the device is going to be either GPU or CPU depending on what is available on your machine. So it's extremely easy if you start exploring this on your own. So that is related to Flux. You can go to the model zoo and start exploring the different models. If you get stuck, please feel free to ask questions. You can also go ahead and search for JuliaPro. So this is a Julia IDE. It's fairly straightforward and it's state of the art. So Julia seems to be pushing this. You can see that it has an integrated development environment. You can run stuff over here and it will appear over here. I personally like to work with the notebook. That is why I have covered the notebook approach in the course. Because for me, I think it's much easier to work with when you're doing experiments. But if you want to go with an IDE, JuliaPro is a really good way to go. You can also go ahead and take a look at some game-changing packages for Julia, so you can search for that and you will arrive at this link. And you will see that you can do a lot of stuff. So Flux we've already done. You can do automatic differentiation, if you're into that. You can do mathematical optimization. There is, I guess, a package for probabilistic machine learning if you want to do Bayesian statistics or Bayesian inference. If you want to do differential equations, there is a package for that. And all of these have really good documentation. For instance, if you go to Turing, you can see that it has detailed documentation over here. You can go to Getting Started and just add the package.
And this is the code to do Bayesian probabilistic modelling in Julia. So it's fairly straightforward. It works really well. And you can go ahead and start exploring all of these packages. Julia has a huge ecosystem and it's growing day by day. So the more you explore, the more you will learn. I hope you had fun with this course, and I hope you have a really good experience with Julia. Best of luck.