Transcripts
1. Introduction: Welcome to Mastering Regular Expressions in JavaScript. This course is brought to you by All Things JavaScript. I'm Stephen Hancock and I will be your instructor. Now, regular expressions are sometimes a topic that is avoided. The syntax is not easy to understand. So when a regular expression is needed, many go out and search for a regular expression that will accomplish what they want, and thereby avoid creating it themselves. In this course, you will learn how to build your own regular expressions. You will learn to understand the syntax and learn how regular expressions can help solve certain programming problems. What you learn about regular expressions in this course can be applied to a number of different languages, but throughout the course we will look at how to use regular expressions within JavaScript. So here are the topics we will be covering. First, we'll deal with the basics, and that includes a short history and what regular expressions can be used for. We will talk about the different ways you can use regular expressions in JavaScript: both the RegExp object and the String object wrapper can be used with regular expressions. I will show you how to test regular expressions, which can be very helpful when you're writing your own, so you can verify that things will work as you want them to. We'll look at defining patterns. That's really what regular expressions are all about: defining patterns. We'll talk about metacharacters and character sets. We will look at repetition, groupings, and anchored expressions. We will cover lookahead assertions, and we'll look at how to use Unicode within regular expressions. And then we will look at useful regular expressions, the ones that are most commonly used. We'll look at how they're created and help you understand them so that you can adjust them to fit your own needs. Now, throughout the course, I will have you complete assignments and quizzes.
These will give you a chance to practice the concepts that you have been learning. The files for each assignment, and the files for some of the lectures, are available for download. Finally, if you have questions as you move through the course, make sure you ask them. I check for questions frequently and provide answers. Other students may provide their answers as well. So let's get started mastering regular expressions.
2. A Short History on Regular Expressions: Before we dive in and start learning and using regular expressions, let's talk about what they are and where they come from. First, let's look at how we can use regular expressions. That is the best way to understand what they are. So let's say we have this text, this quote by hit guard, and we want to remove all the punctuation from it. One way would be to cycle through each character, compare it to punctuation marks, and then remove it using string methods. But a better way is to create a regular expression that matches punctuation marks and then use the replace method for strings to replace those punctuation marks with nothing. Regular expressions provide a much better way to accomplish certain tasks. Let's say, for example, we need to check a password to see if it meets certain criteria that we've established for passwords. That can be done with a regular expression. Let's say we need to verify the input of an email address to make sure it meets the format of an email address. That can be done with regular expressions. How about if we need to determine the number of times a certain word appears in a phrase? That can be done with regular expressions. What if we need to analyze a log file for certain data? Once again, we can use regular expressions. What if we need to work with a URL to extract certain information from that URL? That can be done with regular expressions. All of these situations can be handled with regular expressions, and so basically they provide another powerful tool for you when you're trying to solve programming problems. So when were regular expressions first developed? The concept was developed in the 1950s by mathematician Stephen Kleene, who described it as a way to identify regular languages. I believe that's where the name regular expressions comes from: he was trying to identify regular languages. And that was, as you can see, many years ago, so the concept has been around for a while.
It became regularly used in Unix as a part of the utilities that ran on Unix for processing text. Now, you may have heard of grep before. You may have even used grep, if you have used Unix in the past. What grep stands for is global search for a regular expression and print matching lines. So grep applied those concepts that were developed by Stephen Kleene and helped to make them popular. That's where they became popular, as a part of Unix. Now, many different variations occurred, but they then became standardized by the POSIX standard. That's an important point, because once something becomes standardized it can be used in a lot of different places, and it catches on because people don't have to learn different ways to do it. A version of regular expressions was used in Perl in the 1980s. That was perhaps the first time they began being used in programming languages. And then in 1997 Philip Hazel developed PCRE, which is a standard for regular expressions and has been used in many, many modern tools. PCRE stands for Perl Compatible Regular Expressions, so it came from that initial work in Perl. Now, the reason this is important, and the reason I present this to you, is that the regular expression syntax you're going to be learning applies to a number of different things. Yes, we will be looking specifically at how to use it in JavaScript. But this syntax, and the grammar that we use to define the patterns in regular expressions, are common across multiple languages and multiple tools. There are slight differences that you can encounter, but the majority of the syntax and grammar is common, and that's an important point to remember. So what you learn about regular expressions here you can apply in other places. So let's get started learning about them.
3. Getting Started: As mentioned previously, regular expressions are a way to represent patterns. When you are working with any type of textual input, you can't always be sure what the value will be, and so you use a pattern. The big part of learning regular expressions is to learn the syntax that describes those patterns. But before we can do that, we need to first create a regular expression. Let's look at how we would do that in JavaScript. So first off, a regular expression is an object in JavaScript. That probably doesn't surprise you if you've been around JavaScript for any length of time, because many things in JavaScript are objects, and regular expressions are no exception. Now, there are two ways to create regular expressions. As you know, with user-defined objects in JavaScript there are two ways to create those. There are two ways to create arrays. One is a literal syntax, and one is using the constructor. That is the same for regular expression objects. So here is the method that uses the constructor. We have a variable that we've declared, and then we set it equal to new RegExp, and inside of the quotes is the pattern that we're passing in, the pattern that we want this regular expression to look for. Now, the literal syntax for regular expressions looks like this. Once again, we declare the variable, we set that equal to, and then we have a forward slash, which is the start of that regular expression pattern. Here's the pattern. Another forward slash marks the end of that regular expression. So those are the two ways to create regular expressions. Now, in these two examples, what is the pattern we're defining? Well, right now, all we're defining is literal text: the literal text Hello and the literal text world. So it's a very simple pattern at this point, but it shows how you would create that regular expression object.
Now, once you have a RegExp object, you can use it with one of the methods that is available on the RegExp object or that is available on the String object wrapper. Those are the two areas within JavaScript where we're able to use regular expression objects. So to get started, let's look at an example. This page, which I'm currently showing, is actually an HTML page. I'm going to jump to Sublime. Sublime is my text editor of choice for the work I do. Here is that HTML page that I was just mentioning. Associated with this HTML page we have an app.js file, and here is the app.js file. So we're going to look at how we can use the regular expressions we just saw as an example within this app.js file. We'll make a change to this code, then refresh this page and see what our results are. So let's first set up those same two regular expressions which we just saw. We declare the variable and we set that equal to. This one uses the constructor, new RegExp, and now we're passing in the pattern, which is Hello. The second way uses the literal syntax, so with this one we use forward slashes to define the regular expression pattern. All right, now let's use both of these objects with one of those methods that I mentioned are available on either the regular expression object or the String object wrapper. I'm just going to log the results to the console. That's all we'll do at this point. So I will get the first regular expression. Since this is a regular expression object, there are certain methods available. One of those methods is test, and this allows us to test some text to see if the pattern which we created matches. The text we're going to pass in is right here. Now we should be able to see if this pattern is matched. And as you can see, there is the word Hello within this string, so it will match. It will say, yes, I found that pattern within this string.
Now the same thing will occur with the second one, because there is the word world. And as I said, both of these patterns are just literal text at this point. We're not doing anything crazy yet. So let's look at that second one as well. We'll test that one and we'll pass in the same text. Now, what is this going to return? What does the test method return? Well, it simply checks to see if this pattern is matched. If it is, it returns true. If it is not, it returns false. So that's all we're going to see. Let's just take a look at that really quick. Save that. I'll jump out and refresh it, and then I'll just open the console. And as you can see, we get a true and a true. So in both cases, those found a match. The pattern Hello is a part of this string, and the pattern world is a part of this string. That's what it's telling us. Now, what if we change this second one to worlds? That's the literal text we're looking for. Save that, refresh. Of course, now the second console log is false. It did not find a match. So this is great. We've created our first regular expression, and we've actually used it with the test method. But how many applications are there really for that test method? There have got to be other ways to use regular expressions, and there are. That's what we will be looking at in the next topic.
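The two creation styles and the test method described in this section can be sketched as follows. This is only an illustrative sketch: the sample string and the variable names (txt, regex1, regex2, regex3) are assumptions, since the lecture's app.js file is not reproduced in the transcript.

```javascript
// Sample text is an assumption; the exact string used on screen is not shown.
const txt = "Programming courses always starts with a hello world example";

// Constructor syntax: the pattern is passed in as a string.
const regex1 = new RegExp("hello");

// Literal syntax: the pattern sits between two forward slashes.
const regex2 = /world/;

// test returns true when the pattern is found in the string, false otherwise.
console.log(regex1.test(txt)); // true
console.log(regex2.test(txt)); // true

// Changing the literal text to "worlds" means the pattern is no longer found.
const regex3 = /worlds/;
console.log(regex3.test(txt)); // false
```

Both styles produce the same kind of RegExp object, so the choice between them is mostly a matter of convenience.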
4. Using Regular Expressions in JavaScript: In the previous movie, we looked at a very simple regular expression and we used it with the test method of the RegExp object. This is one way to use regular expressions on textual data, but test is not the only method that is available. We have several methods we can use. There are two on the RegExp object and four methods available on strings that allow us to work with regular expressions on textual data. We're first going to look at several examples of each of these methods, and then we'll summarize those methods at the end. So here are the regular expressions we were working with in the last movie. Now, I'm going to change this first one to use the same literal method of defining a regular expression. That's the method I prefer. That's the method that most JavaScript developers prefer. It's just simpler to type and much more straightforward. Now, we did test, and basically what we got as the result of test is either true or false. If it found a match in the text, it would return true. And notice that with the methods on the regular expression object, we pass in the string that we want to perform the pattern matching on. So for any method we use from the regular expression object, we create the regular expression first and then we pass in a string, just like we've done here. The second method that's available on regular expression objects is exec. So let me change test to exec and let's take a look at that. Open the console. Notice that in both cases we get an array. If we open that array, we can see the first entry. It only has one entry, one element in the array, and it is the actual text that matched the pattern. Now, since we are using a simple pattern right now, which is just literal text, the actual text that matches is the same as that pattern. So exec returns an array for the match found in the string which we pass in. In this case, we have a single match.
We get one entry in the array, but notice that it also provides additional information. We have a property of index. This is the index of where the match occurred. So if we were to take a look at that string and begin counting with zero, the beginning of the match would be 41, which is what is indicated here in the index. We also have the input. This is basically the string that was passed in. That's stored as a property of that array as well. So we have some additional information on the array, not just the matches that were achieved. We can see the same thing with the second match. So that is the exec method that is available on regular expression objects. So we have two methods, test and exec. Now, we also have methods that are available as a part of a string. We created a string here, txt. When strings are created in JavaScript, they have a String object wrapper which provides some methods we can use to work with that string. Some of those methods allow us to use regular expressions. There are a total of four that we want to look at. So let me comment out these statements, and now we're going to be working with the string. As with the regular expression object, we use dot syntax to access those methods. So txt.match is the first one we're going to look at. With the regular expression object, we passed in the string. On the string methods, we pass in the regular expression. So inside of the parentheses I'll pass in regex1. Now let's see what that produces. This is the match method. Notice it's giving us the exact same thing that exec does on the RegExp object. It gives us the match, the text of the match. It gives us the index, and it gives us the string that was used. So match on a string and exec on the RegExp object are the same. They produce the same type of output, which is an array. Now let's look at search. Search is another method we can use. Let me save that, refresh. Basically, all search does is give us the index.
Now, if we're writing some sort of statement where we need the index of the match, then search is an easier way to get it than trying to pull it from the array that is produced by the match method. So if we just want the index, then we'd use search. Now, the next two methods we're going to look at are probably the most frequently used with regular expressions, especially the next one, and this is replace. The purpose of replace is to allow us to replace some text. Whatever text matches the pattern that we pass in with the regular expression, replace that text with some other text. So replace takes two parameters. The first one is a regular expression. The second one is the text we want to replace it with. So we're going to be looking for a pattern that matches the literal text Hello, and we will replace that with the text hi. Let's go ahead and save that, refresh. Here is our new string with hello replaced with the word hi. Now, something that's important to be aware of here: by typing txt and pressing Return, notice that the string is still the same. So replace didn't make a change to the string, because it's immutable. However, it did return a new string, and that's what we're displaying with our console.log statement: the new string with the hello replaced with the word hi. So replace is a method on strings that is frequently used with regular expressions. All right, one more method we want to look at, and that is split. Now, you have probably used split in the past. Split basically is a method that allows you to turn a string into an array, and the way it does that is you specify what character or characters you want to perform the split on. So which character or characters act as the delimiter for determining the elements of the array. So, as an example, I'll remove the second parameter. We only have one parameter with split. What if we were to use the regular expression with the pattern Hello?
What's going to happen here is it's going to look for the word hello and use it as a delimiter, and so it will create an array with two elements. The first element will have everything up to the hello. The second element will have everything after the hello. The hello is eliminated because it becomes the delimiter. So let's take a look at that. Save that, refresh. Here we have the two elements in the array: Programming courses always starts with a, then a space, and then world example. All right. Now, that's not a very practical use of split. It shows you how it works, but it's not super practical. A frequent application I have of split is taking a string and splitting it on spaces so that I can work with each individual word. This is usually done when I'm working with online courses that I create, and I'm in a quiz doing some testing that requires the user to input a phrase, and I need to verify that phrase. That's one way I will do it. So let me illustrate how that would be done. Let's change this regular expression to indicate a space. The way we indicate a space in a regular expression is with a backslash s (\s). So now when we run the split, it is going to create an array of all of the words in our string. Let me save that, jump out, refresh. And here's our array. It consists of all of the words. So those are the methods that are available for using regular expressions in JavaScript. Let's review those really quick. First, on the RegExp object we have test. It returns true if the pattern is found in the passed string, or false if it is not. So test is the type of method we would use if we were conducting some sort of conditional evaluation. Next is exec. Remember, it returns an array of matches and shows the actual text in the elements of the array, the text that matched the pattern. Then, in addition, there are properties attached to that array: the index of the match and the textual string that the match was performed on.
So those are the two methods that are available on the regular expression object. Now, there is a third method. Because a regular expression object is an object, there is the toString method, but basically what that returns is simply a string of the regular expression syntax. So it would look exactly like this if we did toString on regex2. There's not really anything useful about that, but I thought I would mention it since it is available. Now let's look at the methods that are available on strings. First, we have match. Remember, this is just like exec on the regular expression object. So match returns an array of matches, and it has an index that is assigned to a property on the array, along with the actual string that the match was performed on. Next we have search. Remember, we use search if we just want to get the index of the matched string. All we're concerned about is the index of where it happened, the number of the first character of that match. Zero would represent the first character in the string. Then we have replace. This is the method that is very commonly used with regular expressions, and the purpose is to replace whatever matches we find with the string that we pass in. And then finally, we can split a string into an array, and the division is based on the regular expression pattern. Now, split can be used with just plain text that becomes the delimiter, but we can also use it with a regular expression if we choose to. So those are the four methods that are available on strings. All right, let's move on to the next topic.
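To make the summary above concrete, here is a small sketch that runs each of the six methods once. The sample string is an assumption, chosen so that the word hello starts at index 41 as mentioned in the lecture; the variable names are also assumptions.

```javascript
// Assumed sample text; the lecture's exact string is not shown in the transcript.
const txt = "Programming courses always starts with a hello world example";
const regex1 = /hello/;

// RegExp methods: the string is passed in to the regular expression.
const found = regex1.test(txt);       // true or false
const execResult = regex1.exec(txt);  // array plus index and input properties
console.log(found);                   // true
console.log(execResult[0]);           // "hello"
console.log(execResult.index);        // 41
console.log(execResult.input === txt); // true

// String methods: the regular expression is passed in to the string.
const matchResult = txt.match(regex1); // same shape as exec when there is no g flag
const position = txt.search(regex1);   // just the index of the match
const replaced = txt.replace(regex1, "hi");
const parts = txt.split(regex1);       // the matched text acts as the delimiter
const words = txt.split(/\s/);         // split on whitespace to get each word

console.log(matchResult.index); // 41
console.log(position);          // 41
console.log(replaced);          // "Programming courses always starts with a hi world example"
console.log(parts);             // ["Programming courses always starts with a ", " world example"]
console.log(words.length);      // 9

// Strings are immutable: replace returned a new string and txt is unchanged.
console.log(txt.includes("hello")); // true

// toString returns the regular expression syntax as a string.
console.log(regex1.toString()); // "/hello/"
```

A useful way to remember the direction: on RegExp methods the string goes inside the parentheses, and on String methods the regular expression does.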
5. Understanding Regular Expression Flags: Before we go any farther, we need to talk about regular expression flags. These are sometimes called modifiers as well. These flags affect the way a pattern is matched. Let's take a look at how we use them and how they affect the pattern matching. So first off, if we want to specify a flag for a regular expression pattern, we do it after the closing forward slash. If we're using the constructor method, it comes as a second string inside the parentheses. So either after the last forward slash in defining the pattern, or as a second string in the parentheses. That's how we specify these flags. Now, here are three of the most common flags that you will run into. First off is global. The global flag, g, tells the pattern to match globally, meaning it will find every match for that pattern within the string. If you don't use global, it only finds the first match and then it stops. So global is frequently used. Another one that's quite frequently used is case insensitivity. The i flag tells it to do a case-insensitive match, meaning case no longer matters. It will match both upper case and lower case versions of the pattern. Now, if you use both of these flags together, then you would have gi or ig. Either way works. Finally, a third flag that is seen is the multiline match, and we'll deal with the multiline match when we look at beginning and ending characters. So here's the example we've been working with. Right now, in our first regular expression, we have a space. Let's go ahead and modify that. Let's look for an s followed by a space. So this would not match, but this would, and this, and so on. Let's go ahead and change this to match. Save that and let's refresh. We're getting a single match. That's all. We have one s showing up, and that's at position 18. So it's finding this one here, but it's not finding any others, and that's because we're not using the global flag. So if I add a g, save this, come back out and refresh again, now we see we get three matches.
The array has three in it, so that's what our global flag will do for us. Now, let's say that one of these s's was upper case. Let's save that and see what we get. Of course, now we only get two, so it doesn't find the upper case S because we entered a lower case s. We could enter an upper case S, refresh, and it would find the one. But if we want to get all of the s's and we don't care what the case is, then that's where we use the i flag, the case-insensitive flag. Save that. And now we get all three of them again. Now, I want to show you how the regular expression exec method differs when it comes to the global flag. So let me uncomment this console log statement. I'm going to uncomment this one as well, but I'm going to change this to the same regular expression. I'll go ahead and comment that one out, and we'll save that and refresh. Now notice what we get. We only get one entry, and this is index 18. But the second log statement goes to the second match, index 25. We continue doing that in the same code. We get three showing up. The fourth console log statement returns null, because it did not find a match. So it retains an index of where it left off after the match, and the next time the exec method is invoked, it begins from that index. That's how the regular expression exec method works. So take note of that. It's not common to take advantage of that little feature, but if you need it, it could be quite valuable. All right, let's move on to the next topic.
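The flag behavior walked through above can be sketched like this. The sample strings are assumptions, picked so that an s followed by a space occurs at indexes 18, 25 and 32 as in the lecture; lastIndex is the standard RegExp property where a global regular expression stores the position it will resume from.

```javascript
// Assumed sample text; "s " occurs at indexes 18, 25 and 32.
const txt = "Programming courses always starts with a hello world example";

// Without the g flag, match stops at the first match.
console.log(txt.match(/s /).index); // 18

// With the g flag, match returns every match.
console.log(txt.match(/s /g)); // ["s ", "s ", "s "]

// With one upper case S, the lower case pattern misses it unless i is added.
const mixed = "Programming courseS always starts with a hello world example";
console.log(mixed.match(/s /g).length);  // 2
console.log(mixed.match(/s /gi).length); // 3

// With the constructor, flags are passed as a second string.
const viaConstructor = new RegExp("s ", "gi");
console.log(viaConstructor.flags); // "gi"

// exec with the g flag returns one match per call and remembers its position.
const re = /s /g;
const indexes = [];
let result;
while ((result = re.exec(txt)) !== null) {
  indexes.push(result.index);
}
console.log(indexes);      // [18, 25, 32]
console.log(re.lastIndex); // 0, reset after the call that returned null
```

Because exec keeps that position on the regular expression object itself, reusing one global regular expression across different strings can give surprising results unless lastIndex is back at zero.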
6. Using Regexpal: Before we dive deeper into regular expressions, I need to talk about a tool we will be using throughout this course. This tool is so helpful, I'm certain you will continue to use it. It is called Regexpal. Regexpal provides a way for you to quickly test a regular expression. We will use it throughout the course to learn all the syntax involved in regular expressions. However, we will still have exercises that require you to solve JavaScript problems. But as we experiment and learn the syntax of regular expressions, we will be using Regexpal. So let's take a look at it. Here's the site you go to in order to access Regexpal. Let me just show you some of the parts. Right here, in this field, is where you would enter your regular expression. Down here is where you place a text string to see if the pattern matches any part of that text string. As you can see, JavaScript is chosen by default. So that's the regular expression syntax that will be used, which is very common to almost all regular expression syntax. You also have the ability to add flags. Notice how right now the global flag has been chosen, because it's showing up here in this field where we enter the regular expression. It shows that we're entering it as a literal, which is how you're normally going to be creating them in JavaScript. Now, in other parts of this web page, you can see there's a cheat sheet. This is great as we go through and learn the different syntax of regular expressions. It's a great way to look at those and remember them. You can save a regular expression if you find one that is useful. And then there are some top regular expressions. They're commonly used, and we'll take a look at some of these a bit later.
Now, one thing I do want to note before we actually try this out and show how it works: I like this tool so much, I want to make sure that you recognize who has developed it. You can visit other tools created by this developer by going to danstools.com, this link down here. He's got a number of different utilities and things he's created. This is probably the one I use the most, but there are a lot of other useful ones out there as well. All right, so let's see how Regexpal works. Let me first copy in a statement. I'm going to paste it into this field here. So that's our string that we're going to be checking for a match. Now, let's go ahead and type in something that we can match. There, we've typed in hello and we get a match on the first occurrence. We don't get a match on the second occurrence because, remember, these are literal characters, and right now the h is in lower case. So this is the only one that has an h in lower case. However, if we activate the case-insensitive flag right here, then we match both of those, because we have the i flag and the g flag. Global, so it matches more than one. Case insensitive, so it ignores the case. Now, if we turn off global, obviously it will still only match the first one. Now, if we want to include a punctuation mark, we can. So literal characters are not just letters, they can be other things as well. Now, some of those punctuation marks that you find in text have certain meanings in the regular expression syntax, and we'll be introducing those. When you come across one of those, you'll need to escape it, and we'll talk about that in a later video. Now let me turn off the case flag and turn global back on. And now let's just put in a simple match. Let's see how many a's are in this statement. Only one a. Let's see how many e's. A bunch of e's. So you can see how it works and how it highlights the different things that it will match.
This becomes really helpful as we get into the syntax, which is where we're headed next. But keep Regexpal open in one of your tabs, because we will be coming back to it frequently. All right, let's move on to the next topic.
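The same experiments tried in Regexpal can be reproduced in JavaScript itself. The sketch below is only an approximation of the demo: the sample sentence and the countMatches helper are assumptions, not something taken from the lecture.

```javascript
// Assumed sample text with one capitalized and one lower case "hello".
const sample = "Hello there. I said hello!";

// Literal characters are case sensitive, so only the lower case one matches.
console.log(sample.match(/hello/g)); // ["hello"]

// Adding the i flag ignores case; g keeps going after the first match.
console.log(sample.match(/hello/gi)); // ["Hello", "hello"]

// Without g, only the first match is reported.
console.log(sample.match(/hello/i)[0]); // "Hello"

// A punctuation mark with no special meaning can be matched literally.
console.log(/hello!/.test(sample)); // true

// Hypothetical helper: count the matches of a pattern that has the g flag.
function countMatches(text, globalPattern) {
  const matches = text.match(globalPattern);
  return matches === null ? 0 : matches.length;
}

console.log(countMatches(sample, /a/g)); // 1
console.log(countMatches(sample, /e/g)); // 4
```

The null check in countMatches matters because match returns null, not an empty array, when nothing is found.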
7. Understanding Metacharacters: in this section, we're going to introduce meta characters. But first I want to say a few things about characters and how regular expressions are processed, though many of the simple examples we have used so far are with words. In order to think about regular expressions correctly, you must think about patterns being built of individual characters. Just like words are built from individual characters. When the engine looks for a match, it works through this string one character at a time. As it encounters each character, A determines if it fits the pattern or not. If the pattern consists of multiple characters, then it must determine the first character in a pattern, and then the next character and so on until it finds a match or determines it is not a match. If no match is found, then the engine must rewind the string to the next character and sequence and start all over again. So let me let me illustrate that, cause I think it's an important thing to understand when dealing with regular expressions. So here we have a regular expression, simply the word Hello, but we need to think of it as consisting of five letters. Now here is a string. Let's see how the engine would process this regular expression. So first it starts at the very first letter and it checks to see if it is Imagine. Yes, it's the start of this pattern. And so we have a match. So then it goes to the second letter. Does that fit? Yes, it does. Then it goes to the third letter. Nope, that does not fit the pattern. And so then it has to rewind. It's already tested this one. But it has to rewind to this second character right here and a C checks to see. Is this a match? No, it's not. So then it moves to the third character. Is that a match? No. Is that a match? No. Is a space to match? No. Is this a match? Yes. So then it starts. Checking the match again. Is a second letter a match? Yes, it is. Third letter is a match. Yes, that's a match as well. 
However, the fourth letter does not fit the pattern. So then it rewinds back to this e checks that no match checks that no match checks that no match checks that match then we get here. Is Adam at Jess? Yes, yes, yes, yes, and that fits the full pattern. And so then we have a match, a full match, and it marks it as a match. And if the global flag is not Saturday, then stops at that point. If the global flag is sat, then it will continue on if there is more of the string to evaluate. So that is how the engine processes regular expressions. Looking for matches has to rewind when it's looking for multiple characters to make sure that it's finding matches that that could occur now. Obviously, we would not get very far if we could Onley use the literal value of a character when determining a pattern. So if we could only use what we actually type in is a pattern. There's not a lot of flexibility in that. Therefore, regular expressions have a number of characters, and they're mostly punctuation marks that are used to represent other characters, and these were called meta characters. Many characters make up much of the syntax of regular expressions, and it's these meta characters that can make regular expressions confusing to look at. Let's first take a look at what those medic characters are. Here they are. And, as you can see, they're mostly punctuation marks. Now these air, all separate characters and they all mean different things in regular expressions, and we will be learning what each of these mean as we go through this course. So, as you can imagine, when you look at a regular expression that contains several of these, it appears as something strange and difficult to interpret until you know what the medic characters mean. Once you know what the many characters mean, then you can decide for that regular expression and understand the pattern that it is trying to identify. So these are the meta characters. Let's take a look at the first man of character. 
We're going to learn the wild card, so let's move on to the next topic.
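The try, fail, rewind, and advance process described above can be sketched in a few lines of JavaScript. This is only an illustration of the idea, not how a real regular expression engine is implemented; the `findLiteral` helper and the sample string are assumptions made up for this example, since the string shown on screen is not part of the transcript.

```javascript
// Hypothetical helper (not from the course): a naive literal matcher that
// mimics the "check each character, rewind on failure" behavior described above.
function findLiteral(pattern, text) {
  for (let start = 0; start <= text.length - pattern.length; start++) {
    let offset = 0;
    // Compare one character at a time from this starting position.
    while (offset < pattern.length && text[start + offset] === pattern[offset]) {
      offset++;
    }
    if (offset === pattern.length) {
      return start; // every character fit the pattern: a full match
    }
    // Otherwise rewind: the next attempt begins one character after `start`.
  }
  return -1; // no match anywhere in the string
}

// Assumed sample string; "Help!" starts like the pattern but fails on the 4th letter.
const sample = "Help! Hello there, Hello again";

const firstIndex = findLiteral("Hello", sample);
const withoutGlobal = sample.match(/Hello/);  // stops at the first full match
const withGlobal = sample.match(/Hello/g);    // continues through the rest of the string

console.log(firstIndex, withoutGlobal.index, withGlobal.length);
```

The built-in engine reports the same starting position as the sketch, and the global flag is what makes it keep going after the first full match.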
8. Using the Wildcard: The first metacharacter we're going to look at is the wildcard. The wildcard is represented by a period. Now most everyone is familiar with the concept of a wildcard. It can be anything. So when a period is used in a regular expression, it represents any single character, with the exception of some control characters like new line. It's important to remember that it's a single character, not multiple characters. So let's look at some examples. Let me paste in a string of text. This is what we'll be testing against. Let's go ahead and put in a regular expression. We're going to use an h, then a wildcard character, and then a literal t. With the global flag, we can see that it matches more than one instance within the string. H, a, t is a match, and so is h, o, t. Now, as I mentioned, it can represent almost any character. So let's look at h, then a punctuation mark, and then t. Yes, we get a match there. Let's also look at h, space, t, and we get a match there. And notice, even though these two matches are together, it considers them as two separate matches. You can see how they are separated in RegexPal. Now, as mentioned, it only represents a single character. So if we have h, o, o, t, we don't get a match there. Also, I said it represents almost every character. There are some non-printable characters that it does not match, for example a new line character. And how can we show a new line character? Well, I'm going to type an h, press return, and then type a t, and notice we don't get a match there. There technically is a character there, a non-printable character after that h, which causes the text to go to a new line, and the wildcard is not matching that. However, there are some non-printable characters that it does match, for example a tab. Let me copy the letters h, tab, t and paste those in. So this is an h with a tab character and then the letter t, and that matches.
So the wildcard, represented by a period, matches almost any character. Now what would you do if you wanted to specify that the pattern included a period, not a wildcard but the actual period character? How would you deal with that? Well, let's take a look at that next.
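To try the wildcard in JavaScript rather than in a testing tool, something like the following sketch works. The test string is an assumption that imitates the cases walked through above (a letter, a punctuation mark, a space, two letters, a new line, and a tab between the h and the t).

```javascript
// Assumed test string covering the cases discussed in this lecture.
const text = "hat hot h!t h t hoot h\nt h\tt";

// h, then any single character except a line break, then t.
const matches = text.match(/h.t/g);

console.log(matches);                // the five three-character matches
console.log(/h.t/.test("hoot"));     // two characters between h and t
console.log(/h.t/.test("h\nt"));     // a new line between h and t
```

The first log lists `hat`, `hot`, `h!t`, `h t`, and the h-tab-t sequence; the last two logs print `false`, because the period stands for exactly one character and does not stand for a new line.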
9. Escaping Metacharacters: In a previous movie, we looked at a big, long list of metacharacters. Let's take a look at that list one more time, so here it is. Now what do we do if we want to include one of those metacharacters as part of the pattern, not what the character represents but its literal value? How would we do that? Well, we can accomplish that by escaping the character. The backslash is the escape character. When we use it, it tells the engine that the character that follows the backslash should be used as the literal character, not what it may represent. Let's look at that. Let me copy in another string. So here's the string we're going to use to try to find a match. Now what if I want to find a d that is immediately followed by a period? Well, let's say I enter this. Look, we get two matches. We get this d followed by a space, and we get this d followed by a period, and that's because the period in this case is a wildcard, and so it represents any character. And so we're getting a match that we don't want. We don't want this match here. We only want this one. So what we need to do is escape that character and tell the engine to look only for its literal value, not what it represents. And so we use a backslash to escape it there. Now we only match the d with the period following it, which is what we really wanted to accomplish. Now, how do you know when you should escape a character? Well, if you have a cheat sheet, you can see some of the characters that are used inside of regular expressions. But you may not always have that next to you. For example, "any character except new line" is telling us what the dot represents. And so when we see that and we want to use its literal value, we know we have to escape it. Well, what if we don't know, or we aren't sure? We have some punctuation mark, but we're not sure if we should escape it or not.
Well, it's not going to hurt to escape a punctuation mark even when we don't need to. The backslash simply communicates to the engine: use the literal value for the character that follows, and the engine will do that. Now what if we were looking for a backslash in our pattern? Down here, we want to match a letter followed by a backslash. Well, it's the same principle. If you want to use the literal value of a backslash, you simply escape it. So what you end up with is two backslash characters together. And there we see we get our match. So when you want to use the literal value of a metacharacter in your regular expression, simply escape it using a backslash character. All right, let's move on to the next topic.
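Here is a small JavaScript sketch of the same two cases: a period that should be literal, and a backslash that should be literal. The sentence and the path are made-up sample data, not the strings used on screen.

```javascript
// Assumed sample sentence: one d followed by a space, one d followed by a period.
const sentence = "I said hello. The end.";

const unescaped = sentence.match(/d./g);  // the period acts as the wildcard
const escaped = sentence.match(/d\./g);   // \. means a literal period

// In a JavaScript string literal a backslash is also written as "\\",
// so this string holds the ten characters: docs\notes
const path = "docs\\notes";
const backslashMatch = path.match(/s\\/); // an s followed by one literal backslash

console.log(unescaped);         // two matches: "d " and "d."
console.log(escaped);           // one match: "d."
console.log(backslashMatch[0]); // s followed by a single backslash
```

Note that the doubled backslash appears twice for different reasons: once because of the regular expression syntax, and once because of how JavaScript string literals are written.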
10. Matching Control Characters: In a previous movie, we looked at matching a tab character using the wildcard metacharacter, the period. Well, what would we do if we wanted to match the literal value of the tab? The wildcard would match it because it matches almost every character. But what if we wanted to specify a tab? Well, we can do this with a control character. Let's look at a few control characters. Backslash t (\t) represents a tab, backslash v (\v) a vertical tab, backslash n (\n) a new line, and backslash r (\r) a carriage return. Now notice that each of these control characters uses the escape character, the backslash, to represent something else. When we use a backslash with a punctuation mark, it indicates the literal value of that punctuation mark. When we use it with a letter, such as in this case a t, v, n, or r, it may represent something else, and we can see with these control characters that it does represent something else. So let's look at an example of how we would use these control characters. Here in my test string, I have several versions of the letter h followed by something and then the letter t. In the first case, it's hlt. Then there is h, tab, t; then h, i, t; and then h, carriage return or new line, t. If we used the wildcard like we did previously, it would match three of those. It does not match the last one because the wildcard character cannot match a new line character. But what if I wanted to grab only the one that had a tab in it? Well, I can do that with the control character for tab, which is backslash t, and there we get a match. Now how about if I wanted to grab the h followed by a new line character followed by a t? Well, I could do that with the new line control character, and it matches that. Now, how would you know whether to use new line or carriage return, or what combination of them to use, and what's the difference?
Well, new line and carriage return go back to the days of the typewriter. The carriage return traditionally meant going back to the front of the line. New line meant you jumped down a line. Now, which you use in your regular expression depends on the system you're using. For example, I'm on a Mac, and it's a newer Mac, and so the next line is represented with the new line character. However, on a Windows system, it may be represented by the carriage return and the new line character together, so a backslash r, backslash n. And so what you use to represent a new line or a carriage return depends on the system you're working on. You may just need to test that regular expression to make sure you're getting what you need. And in all reality, you really need to be testing any regular expression to make sure it's doing what you want it to do. One more thing to mention about the new line character: it is the same as the line feed character. Sometimes it's talked about that way. So control characters allow you to create a regular expression that will match the literal value of some non-printable characters, such as tab, carriage return, and new line. All right, let's move on to the next topic.
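The control characters can be tried out in JavaScript with a sketch like this one. The list of sample strings is an assumption modeled on the test string described above, with a Windows-style carriage return plus new line added as a fifth case.

```javascript
// Assumed samples: h, something, t. The last one uses a Windows-style line ending.
const samples = ["hlt", "h\tt", "hit", "h\nt", "h\r\nt"];

const withWildcard = samples.filter((s) => /h.t/.test(s));  // any one character except line breaks
const withTab = samples.filter((s) => /h\tt/.test(s));      // literal tab only
const withNewline = samples.filter((s) => /h\nt/.test(s));  // literal new line (line feed) only
const withCrLf = samples.filter((s) => /h\r\nt/.test(s));   // carriage return followed by new line

console.log(withWildcard.length, withTab.length, withNewline.length, withCrLf.length);
```

With this data the wildcard finds three of the five strings, while each control character pattern finds exactly one, which is why it pays to check which line ending your text actually contains.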
11. Exercise 1 Start: We've come to the first JavaScript exercise. Now, for each of the exercises in this course, we will be using the browser as the JavaScript engine to handle our code. If you're more comfortable with Node or another solution, feel free to use your favorite solution to do the exercises. For each exercise, you will be given an HTML file, a CSS file, and a JavaScript file. Enter your code in the JavaScript file and then open the HTML file in the browser to see if you completed it correctly. To view the results of your code, you're welcome to display them on the HTML page, but in most cases we will just log the final results to the console to keep it simple. So let's look at the first exercise. Here's the JavaScript file for the first exercise and the HTML file. The instructions for the exercise are at the top of the JavaScript file. They say: using the provided array, which you can see right here, this array of phone numbers, create a second array that only includes the numbers with an 801 area code. Now, for those that may not be familiar with area codes, the area code is basically the first three numbers of the phone number. That's what I've highlighted here. So take a few minutes to figure that out, both the JavaScript and the regular expression you will use with the JavaScript to solve this. And when you're ready, you can review the solution.
12. Exercise 1 Finish: All right, let's take a look at the solution. Now, these first several exercises aren't going to be too complex because we haven't covered a lot of regular expression syntax yet. And so with the data I'm providing, I'm making it much easier to identify or find matches. That's the case with this exercise. Now I'll show two ways to do this. You may have chosen to do this with the filter method of arrays, or you may have chosen to do it with a loop, and I'll look at both. But first, let's look at the regular expression we would create. I'm going to declare a variable which will contain that regular expression. Now, if we're trying to match the first three numbers that are 801, obviously we would need the literal values for those. We don't want to use a wildcard because that could end up being something else. However, if this is the only part of our regular expression, then we could match other places within these numbers. For example, here's an 801 that we could match, and it's not the first three numbers, and another one here that we could match that is not the first three numbers. That would create a match on that number, even though the area code is something different. There are ways to deal with this in regular expressions, but with what we've learned so far, we simply need to add another character. With the way the data is entered, we can see that by adding a hyphen, it will only grab those that have the 801 at the start. Now, if we had some numbers where the 801 was in the middle, then we'd have a problem, because there's also a hyphen after that and that could create some issues. Like I said, there are ways to deal with that with regular expressions, and we'll learn those as we go throughout the course. But let's go ahead and finish this off. So here's my regular expression. Now, as I mentioned, my first attempt is going to be using the filter method of arrays.
If you're not familiar with this, then you may learn a little something there as well. So I'm creating a new array variable and setting it equal to the phone numbers array, dot filter. Filter is a method on arrays, so the way I access it is using dot filter. Filter is a higher-order function, and what that means is we need to pass in a function that tells it how to process the array. Basically, what filter will do is go through each element in the array. It will return elements based upon what we enter in the function that we pass in, and those returned elements will become the new array. And so the function we pass in needs to return either true or false. If it returns true for an element, that element will be placed in the new array. If it returns false, it will not be. So here's what I'm going to do. Since I'm using the filter method of arrays, I'm going to use an arrow function. So here it is. Here's the regular expression. I'm going to use the test method of the regular expression object because that returns either true or false depending on whether there is a match. Now, each time we iterate through the array, we're passing in the element, and that's put into the element variable. And so that is what I want to run the test on. And there we go. That is the solution. Let me go ahead and save that and we'll check it out. I'm going to copy the file path, switch to the browser, and paste that in. And then I'm going to open the console and display that new array variable just to see if we've got what we needed. And sure enough, there are five numbers and they all have an 801 at the beginning. There are none with the 435, which was the other area code that we included. So that's one solution, using the filter method, and the one I prefer. It's obviously a lot simpler to enter. But let's look at how we would do this with a loop. For this, I'm going to declare the new array variable up here, set equal to an empty array. And then we'll use a for loop to cycle through the array.
We will go through the loop while i is less than the length of the array. And then here's what we're going to do inside of the loop. We're going to use an if statement to check the regular expression. Once again I'm going to use the same test method because that returns true or false, and what we're testing is the phone numbers array, and then I specify which element of the array I want to test. If that's true, then we simply do new array dot push and push that element onto the new array, like this. All right, that will solve the problem for us as well. Let's go ahead and test to make sure it's working. So I save that, refresh, type the array name, and sure enough, we get the same exact results. Now remember, the regular expression is the critical part of what we're teaching in this course. If you've learned something from the other JavaScript, then great, that's an added bonus. But the regular expression is what's important. Make sure that you entered a regular expression of 801 and then a hyphen. All right, let's move on to the next section.
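Since the exercise files are not part of this transcript, here is one way the two solutions described above might look. The variable names (`phoneNums`, `areaCode`, `filtered`, `looped`) and the phone numbers are assumptions for illustration only; the course file contains its own data and names.

```javascript
// Assumed sample data; the actual array in the exercise file is not shown here.
const phoneNums = [
  "801-766-9754",
  "435-666-8011",
  "801-545-3454",
  "435-235-7801",
  "801-796-8010",
];

// 801 followed by a hyphen. With this data that only occurs at the start,
// whereas a plain /801/ would also match inside "435-666-8011".
const areaCode = /801-/;

// Solution 1: filter keeps the elements for which the callback returns true.
const filtered = phoneNums.filter((elem) => areaCode.test(elem));

// Solution 2: the same test inside a for loop, pushing matches onto a new array.
const looped = [];
for (let i = 0; i < phoneNums.length; i++) {
  if (areaCode.test(phoneNums[i])) {
    looped.push(phoneNums[i]);
  }
}

console.log(filtered);
console.log(looped);
```

As noted in the lecture, the trailing hyphen only works because of how this data is laid out; a number with 801 as its middle group would still slip through until anchors are introduced.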
13. Using Character Sets: In this section, we are going to be talking about character sets, ways to find matches using a group of characters. Let me start with a simple example. Let's say that you are trying to match the word gray in some text. Well, gray could be spelled one of two ways, either with an a or an e, and we should account for both of those. So we need a way to create a match that could be one character out of a group of characters. Now, that is called a character set. So here we have a regular expression, and in place of the a or the e, right now I have a space. So what do we put there? What is it that defines a character set? Well, we can specify a group of characters using square brackets, so it would look something like this right here: the square brackets, and then we have an a and an e inside of them. That indicates: match only one of the characters inside of this character set. Outside of the character set, the engine matches the literal characters. Inside, it matches exactly one of those, so an a or an e could produce a match. Let's take a look at this exact example in RegexPal, and then we'll look at some other examples with character sets as well. All right, here we have a line of text: make the outline for the square grey and the fill for the circle gray. So two words, gray spelled differently. Let's see how we would find a match for those. We enter the g, r, and obviously it grabs the g, r. Then we start our character set, entering an a and an e, then we close our square brackets, and finally a y. So we're matching both of those, both the gray with an a and the grey with an e, because we have the global flag on. Obviously, if we turn that off, we would only match the first one. Now, what would happen if this word here were spelled with an a and an e? Well, then it doesn't find a match, because this character set is saying match only one of the characters in this group.
So it's a group of characters, and the engine is supposed to find one match for it. Let's look at some other examples that help illustrate that. First, let me remove all of this and we'll enter a character set of the letters a, b, c, and d, and that is all. So there are four characters inside the character set, but notice it is only matching one character at a time. Every time it finds an a, a b, a c, or a d, it considers it a match. Now, something else to consider: we can put more than one character set together. For example, we could add a second character set for a second character, and in this one I'm going to put a space and an i and then close that. Now we're matching only two places. This d has a space after it. This c has an i after it, and so it's matching both of those. So we get one character from this group and then one character from this group. Now look at what would happen if we add an e, because there are a lot of e's in here. We get a whole lot of matches, especially because of the space that we have in the second character set. Now we can do more than two. As I mentioned, you can put together multiple character sets. So let's say we have an r and an h up here, an r or an h, I should say. Then you can see the matches we're getting: a lot of them are the last part of a word with a space coming after it, and then we have one here with r, e, and then a space. Now, so far I've only been doing letters, but this is true for any character. So, for example, let's say I wanted to find either a 1, a 2, a 3, or a 4. Now if I were to enter 1234 in the text, notice that that is four different matches. That's not one match for all of those, because once again, it's grabbing one character from that character set. And so we get four different matches right there. Now, looking at this 1234, it makes you wonder: could we do a range? If we want a range, like between 1 and 4, do we have to type out each number?
Well, that is what we're going to look at in the next topic, a simpler way to enter a range like that. But before we leave this topic, I need to mention something that's very important about character sets. And that is that metacharacters don't act as metacharacters inside the square brackets. Inside a character set, they act as the character which they are. Now, there are a few exceptions to that, and we will start dealing with those in the next topic. So, for example, we have learned about the wildcard, the period, which is a metacharacter. Let's see what we would get if we use that as a part of this regular expression. I'm going to go back to our gray example: g, r, then a and e inside the character set, then a y. And now I'm going to create a second character set. This one's going to have a space and a dot. Now notice it's matching both of those, because we can match a space or we can match a period. This period is not acting as a wildcard. If it were a wildcard, it could match any character after the y. So let's just prove that by putting another character, an s, after this y here, and notice we lose the match at that point because s is not in this character set. We have a space, and we have a period that is not a wildcard because it is inside of the square brackets. That may cause you to ask: well, what if we do want a wildcard inside of a character set? Well, basically, you wouldn't really want to do that. It goes against the idea of character sets, where you're specifying a specific group of characters and you want to match one of the characters from that group. If you want to use a wildcard, really, you should just use it outside of a character set. All right, so let's move on to the next topic.
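The character set examples from this lecture can be reproduced in JavaScript roughly as follows. The strings are assumptions chosen to mirror what was typed into the testing tool.

```javascript
// Assumed line of text with both spellings of gray.
const line = "make the outline for the square grey and the fill for the circle gray";

const bothSpellings = line.match(/gr[ae]y/g);     // one character from the set: a or e
const spelledWithBoth = /gr[ae]y/.test("graey");  // a AND e together is not one character

// Each character in the text that belongs to the set is its own match.
const digitMatches = "1234".match(/[1234]/g);

// Inside a set the period is a literal period, not the wildcard.
const endings = "gray. grays gray ".match(/gr[ae]y[ .]/g);

console.log(bothSpellings);   // grey and gray
console.log(spelledWithBoth); // false
console.log(digitMatches);    // four separate one-character matches
console.log(endings);         // "gray." and "gray " but not "grays"
```

The last pattern is the quick proof used above: if the period inside the brackets were still a wildcard, `grays` would have matched as well.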
14. Specifying a Range in a Character Set: In the previous topic, we looked at a character set that had the numbers 1, 2, 3, and 4, and we had to enter each number by itself. Well, a simpler way to do that is using a range. Here's how we would type the same thing as a range. Notice we use the hyphen character in between the starting number and the ending number. Basically, what that specifies is that this character set can match a 1, a 2, a 3, or a 4. It includes all four of those numbers. Now notice something about this example. The hyphen is acting as a metacharacter, not as itself. In the last topic, we mentioned that metacharacters do not function as metacharacters inside a character set, and we used the wildcard character, the period, to illustrate that. Well, one exception is the hyphen. In a character set, a hyphen specifies a range. So if you wanted to include a hyphen as part of the character set, you might need to escape it like this. Now I say might because if the hyphen is in a position where it would not be confused as a range, then there's no need to escape it. So in this example, we actually could do without the escape. It will work with it, but we could leave it out because there's no chance of confusing it with a range, because basically we have a hyphen, a space, and a period. Now, something else to be aware of with ranges is that a range doesn't need to consist only of numbers. You can do a range of letters as well. Here's an example of that. We used a similar example in the previous topic, where we had to enter a, b, c, d, and e to designate that we wanted one of those characters. Well, this is the same thing, but it uses a range. It knows to include all the letters between a and e when we use the hyphen. Let's jump to RegexPal and do some examples. So here I have a simple sentence: There have been 4 or 5 times I've tried, but I will try again.
Let's just play around with ranges inside of character sets. Let's start off with a character set, and let's go 1 to 7 in that character set and see what we match. With the global flag set, we match both the 4 and the 5. Now obviously, if we move this down to 4, we would only match the 4. We would not match the 5; it's outside of that range. Notice as well that because this is a character set, we only match a single character each time. If this were the number 42, for example, that would be two separate matches, because it's looking for a single character in each match. If we wanted more than one character, then we would add a second character set like this. Now only the 42 is matched, because the 42 is the only thing that consists of two numbers between 1 and 6 right together, and so therefore we get that match. Now let me remove that second character set. Let's add some additional characters inside of this one. Let's say we also wanted to find a through z. Notice how that matches about everything, because we're looking for either a 1 through 6 or the letters a through z, and so about everything matches, with the exception of the uppercase characters. We can also add a range of uppercase characters, and now we're matching almost everything. If we then add the space, we grab the spaces, and we can add a hyphen, and we don't have to escape it because it's not in a place where it would be confused as a range. Now let's look at an example where it might be confused as a range. Let's say we had a 1 and a 2, and we also wanted to match a hyphen, and then we wanted to match a 5. Let's see what happens there. Now it sees this as a range from 2 to 5, so it expects 4 to be a part of that as well. If we wanted to change that so it's just looking for a 1, a 2, a hyphen, or a 5, then we would escape the hyphen. And there we get the type of match that we're after. All right, let's do another match that consists of just letters.
Let's do a character set of uppercase letters A through I, and we find two there, and then add a second set of lowercase letters a through i. And of course we get no match, because there are no uppercase characters in that range that are followed by a lowercase character in that range. Now, if we extend both of these ranges, let's go to Z here and extend this one to z as well, you can see we get the Th, because now that T is included in the first character set and the h is included in the second character set. All right, let's say we wanted to grab all the punctuation in there. We could do that in a single character set as well. We can do the hyphen without escaping it, because there's no chance of confusing it with a range when it's at the very start. We can do a comma and we can do a period. We don't have to escape the period because it's inside of a character set. And there it matches all of those. All right, let me change the text that is down here. Let's say we were looking through a log file or something for exceptions. I'll put in just that much of it for now. How could we match exceptions? Well, let's look at how we might do that, and we'll use character sets to do it. First off, the particular exception codes we're looking for start with a 0 and an x. And then, if we wanted to identify a specific code, let's say we wanted to identify something that starts with the letters A through F or the numbers 0 through 9, followed by another character of the same kind. So let's do a character set that would do that: 0 through 9 gives us the numbers, and then A through F. Then we close that character set and do a second character set, 0 through 9, A through F, and close that one. And then we get a match with that exception code. So once again remember, there are two character sets here.
Each character set matches a single character, so the character this one matches can be either any number between 0 and 9 or any uppercase letter between A and F, and the same goes for the second character set. And so we get these two characters here. The 0x that we placed at the front of the regular expression makes the match cover all four characters. Now there's one more thing I want to cover before we conclude this topic. I've changed the string which is here in RegexPal, and basically it contains a question which I want to address: how do we capture the numbers 13 through 20, for example? Well, let's see what happens if we put a character set up here that contains 10 through 20 and then close that character set. Look at what we have matched. We did not match whole numbers, such as the 13 or the 20. We matched individual digits. So keep in mind that it's individual characters that we're specifying in a character set. When we specify a range, it can only be a range between individual characters, not between a 10, for example, a two-digit number, and another two-digit number. The way this is being read is that we're looking for a 1, then the range 0 through 2 (a 0, a 1, or a 2), and then a 0 again, so really we're repeating ourselves. And so it finds a 1, it finds a 2, and it finds a 0: three separate matches. So when you're doing regular expressions, you have to think of each character as an individual character. You cannot think of 13 as a single entity, because it is seen as a 1 and a 3, not as 13. It's a little bit of a change in how you think about numbers if you're dealing with numbers in a regular expression, and I wanted to point that out before we concluded this section. All right, let's move on to the next topic.
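The range examples from this topic can be checked in JavaScript with a sketch like the one below. The sentence, the short digit strings, and the log text are assumptions modeled on what was described in the lecture.

```javascript
// Assumed test sentence, modeled on the one read aloud in the lecture.
const sentence = "There have been 4 or 5 times I've tried, but I will try again.";

const oneToSeven = sentence.match(/[1-7]/g);          // single digits 1 through 7
const singleDigits = "42".match(/[1-6]/g);            // one character per match
const digitPair = "42".match(/[1-6][1-6]/g);          // two sets, two characters
const upperThenLower = sentence.match(/[A-Z][a-z]/g); // the Th in There
const punctuation = sentence.match(/[-,.]/g);         // a leading hyphen is literal

// Inside [12-5] the hyphen builds the range 2-5; escaping it makes it literal.
const asRange = "1-5 and 3".match(/[12-5]/g);
const asLiteral = "1-5 and 3".match(/[12\-5]/g);

// Hypothetical log text with a hexadecimal-looking exception code.
const hexCode = "Exception 0x3F raised".match(/0x[0-9A-F][0-9A-F]/);

// A range works on single characters only: this set is 1, 0-2, and 0 again.
const notNumbers = "13 through 20".match(/[10-20]/g);

console.log(oneToSeven, singleDigits, digitPair, upperThenLower, punctuation);
console.log(asRange, asLiteral, hexCode[0], notNumbers);
```

The last result is the important one: `[10-20]` never sees the numbers 13 or 20, only the individual digits 1, 2, and 0.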
15. Excluding a Character Set: Sometimes we may want to use a character set to specify a group of characters that we don't want to include in the match. We can refer to these as excluded or negated characters. To exclude a group of characters, we use the caret symbol, as shown here in the title. Now the caret symbol is a metacharacter for character sets, just like the hyphen, and like the hyphen, we might need to escape it if we mean to include it as a part of the set. However, the only time we have to escape it is if it comes at the start of the character set. That is because we use the caret at the start of a character set to indicate negation or exclusion of those characters. So let's look at some examples. Here is the example we were using in the previous topic. Now, instead of looking for an exception code whose first character is 0 through 9 or A through F and whose second character is also 0 through 9 or A through F, let's say the first character is not one of those. Well, the way we would do that is we simply put in a caret symbol, and this is the symbol that is usually above the 6 on the keyboard. We put that at the front of the whole character set. So basically it is saying: match 0, match x, and then the next character cannot be a 0 through 9 or an A through F. If it is, then it's not a match, and that's why we're not getting a match here. However, if we change this character to something else, say a G, then we get that match. But if we then put a caret in the second character set, now we're negating those as well. So the second character, the one following the G, must also not be in the set, and we lose the match because it is one of the characters included here, and with the caret we're saying we don't want to match one of those characters. So let's say we had a bunch of letters, almost the entire alphabet, and we had a character set a through z. Obviously we match every single one of those characters, and they are individual matches. We can see that if we turn off the global flag; then we only match the first.
Now the caret is going to do the opposite of that. Notice where we place the caret. It goes inside of the character set, and it is the first character in that character set. And there we no longer get a match, because it's negating everything that is listed here. Now, if we wanted to include the caret as a part of our character set and it is the first character we listed, we would need to escape it like this. Now it doesn't act as an exclusion. In fact, if we type a caret here in the text, it matches it. Now if we remove that escape, the only thing that gets matched is the caret, because we're saying anything but a through z. It's not matching that caret because the caret symbol is in the set. It's matching it because the set says any character other than a through z. That's why we get the match. Now if we remove the caret from the set, we can see it matches all the letters. It doesn't match the caret in the text. But we could also put the caret here at the end of the set, and because it's not the first character in the character set, we don't have to escape it. So sometimes we want to use character sets to specify characters we don't want to include in a match. We do that with the negate character, the caret, and that is always placed at the front of the character set. All right, let's move on to the next topic.
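A short JavaScript sketch of the negation examples follows. The code strings and the letter string are assumed sample data, chosen so that each variation described above produces a different result.

```javascript
// Hypothetical exception-style codes: only the first is valid hexadecimal.
const codes = "0x3F 0xG7 0xZZ";

const firstNotHex = codes.match(/0x[^0-9A-F]/g);          // first character after 0x is excluded
const bothNotHex = codes.match(/0x[^0-9A-F][^0-9A-F]/g);  // both characters are excluded

// Assumed letter string with a literal caret in the middle.
const letters = "abc^xyz";

const onlyLetters = letters.match(/[a-z]/g);     // the six letters
const notLetters = letters.match(/[^a-z]/g);     // anything that is NOT a-z: the caret
const caretEscaped = letters.match(/[\^a-z]/g);  // escaped at the start: caret is a member
const caretAtEnd = letters.match(/[a-z^]/g);     // not first, so no escape is needed

console.log(firstNotHex, bothNotHex);
console.log(onlyLetters.length, notLetters, caretEscaped.length, caretAtEnd.length);
```

Note that `[^a-z]` matches the caret in the text only because a caret is not a lowercase letter, which is the same point made in the lecture.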
16. Escaping Metacharacters in a Character Set: During the past few topics, I've mentioned that when you are creating character sets, there is no need to escape metacharacters, with a few exceptions. We've mentioned two of those exceptions, the hyphen and the caret symbol. The hyphen designates a range. The caret negates characters. In this topic, I want to take a few minutes to identify all the metacharacters that you may need to escape within character sets. So these are the ones that you may need to escape, not those that you don't need to escape. Basically, the general rule is that inside a character set, you don't generally need to escape metacharacters. Now, there are a total of four that you may need to escape. First off is the hyphen, which is used for the range. Remember the reason you may need to escape it: if it is obvious that the hyphen is not being used for a range, then there is no need to escape it. Next is the caret symbol, which negates characters. If it's at the start of the character set, there is a need to escape it, because the start of the character set is where it functions as a metacharacter, indicating that you're looking for a match that does not include these characters. If it's anywhere else in the character set, there's no need to escape it. Now in both of these cases, if you don't want to try to remember when you may not need to escape it, just always escape it. That works. There's no problem with that. Now there are two more metacharacters. These next two you will need to escape if you ever use them inside of a character set. First, the backslash symbol is used to escape characters, so if you ever want to use that symbol within your character set, you need to escape it. You enter two of them, and entering both of them together indicates that you want to use that actual character in your character set. The fourth character is the right bracket.
This character, when seen, indicates that it's the end of the character set. A character set is between square brackets, so if it sees this right square bracket, it thinks the character set has ended. But if you intend that to be one of the characters that is a part of that group, then you need to make sure you escape it. So all four of these characters you can just go ahead and escape any time you want to use the actual character as a part of a character set. As mentioned, the hyphen and caret symbol you can get by without escaping if they're located in certain places within the character set. All right, let's move on to the next topic.
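The four characters discussed in this topic (hyphen, caret, backslash, right bracket) can be tried out with a short sketch like the one below. The sample text and variable names are assumptions chosen for illustration.

```javascript
// Hypothetical sample text. In a JS string literal, "\\" is ONE backslash.
const text = "a-b^c\\d]e";

// All four characters escaped inside one character set.
const allEscaped = /[\-\^\\\]]/g;
const found = text.match(allEscaped);

// Hyphen and caret left unescaped: the caret is not first and the hyphen
// is last, so neither can be read as a metacharacter.
const positional = /[a^-]/g;
const foundPositional = text.match(positional);

console.log(found);           // [ '-', '^', '\\', ']' ]
console.log(foundPositional); // [ 'a', '-', '^' ]
```

Escaping all four every time, as suggested above, costs nothing and avoids having to remember the positional rules.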
17. Using Shorthand for Character Sets: Sometimes with regular expressions, there is more than one way to specify a pattern. In this topic, we're going to look at shorthands that are available for character sets. Many times you will see these shorthands instead of the standard character sets that we've been talking about. Now, there are shorthands available for both those character sets that include and those character sets that negate, or exclude, characters. Let's first look at the inclusive character set shorthands. We have three of them. The first one is the escape character and then the lowercase d, and that represents digits. And so you can see the equivalent character set next to it: a range of zero through nine would be equal to the shorthand escape d. Then we also have the word shorthand. Now, a word in the shorthand represents quite a bit. You can see the sample character set next to it represents all letters, uppercase and lowercase, all numbers, and then also the underscore character. You may not be able to see that, but at the end here we have an underscore character. Those are all considered word characters. Now, the hyphen is not, just the underscore. For some reason, it's considered a word character. Then, finally, the third shorthand we have is the whitespace shorthand, and basically that represents either a blank space, a tab, a carriage return, or a line feed. It can represent any one of those characters. That's what the escape s represents. Now, before we look at the negate characters, let's just try a few of these in RegexPal. So I have two lines, and that was important. I wanted to show that I had a second line. I've included characters, I've included some tabs, I've included an underscore, I've included numbers, and there are a few other characters as well. So let's first try this shorthand for digits. As you probably guessed, that matches every single number in these two lines.
Eleven total matches. So five here, five here, and one here. Once again, we have our global flag on, and so it's matching all of those. Now let's take a look at what the word character will match. Almost everything is matched with the word character. We have all of the uppercase and lowercase letters. We have the underscore right here. We have the digits again. They're matched as a part of the word character. All of that is matched. Now, if we replace this with the whitespace shorthand, then you can see what we match. Every place there is a space between words, we get a match. Here are three tabs. This first tab is not showing as long as the other two, but it's still a tab, and so it's matching those. Another space there. Now notice we indicate up here 14 matches. Now look: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13. Only 13 are showing. Well, the 14th match is this carriage return/line feed area. That's where it's creating another match. It doesn't show up in RegexPal, there's not a way for it to show it, but it is creating that match, and I just wanted to point that out. All right, now, I mentioned there are shorthands for the negated character sets as well. So let's take a look at those. This uses an uppercase D, so each of the shorthands for the negation is the uppercase equivalent. So for digits, it would be uppercase D; words, uppercase W; whitespace, uppercase S. So, as we can see, for the uppercase D we have the negate character, the caret, in front and then zero through nine. Then negate here: uppercase and lowercase letters, numbers, and the underscore. Then negate with space, tab, carriage return, line feed. All right, let's take a look at those in RegexPal as well. So now, first think about this. It takes a little bit different thinking process when we're dealing with these negated character sets and the shorthands, which do exactly the same thing. What we're going to indicate is we want to match everything that is not one of these.
So if we put an uppercase D here, it matches everything except the numbers. Notice it's matching the whitespace. Now, if we put an uppercase W, it's matching everything except the letters, numbers, and the underscore. See, the underscore is not matched here. All the other characters are: there is the open paren and the whitespace, the tab, et cetera, the period over here. All of those are matched because it's everything but this. And then, finally, if we do an uppercase S, we see that it matches everything but those whitespace characters. So those are shorthand characters for character sets, and those are what you'll see quite frequently when you see a regular expression that's been provided on the Internet or somewhere else. Now, before we leave this topic, I want to jump over here to our cheat sheet that is a part of RegexPal. We've learned enough now to recognize all of the items in the first part of this cheat sheet. The wildcard. Here are our shorthand characters for word, digit, and whitespace, and then our shorthand characters for the negation of those: not word, digit, or whitespace. And then here we have the regular character sets, with a range down here. So this cheat sheet can be helpful to you. If you're using RegexPal to figure out a regular expression, you can refer to that if you can't remember exactly what you need to enter for the particular thing you're trying to create a match for. All right, let's move on to the next topic.
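The shorthands from this topic can also be tried directly in JavaScript. The following is a rough sketch that counts matches in a made-up two-line string; the string and the `count` helper are assumptions for illustration, not the lecture's RegexPal data.

```javascript
// Hypothetical two-line sample (not the lecture's test string).
const line = "Call 555-1234 (ext_9)\tnow.\nBye";

// Helper (hypothetical): count the global matches of a pattern in the sample.
const count = (re) => (line.match(re) || []).length;

const digits = count(/\d/g);     // same matches as [0-9]
const words = count(/\w/g);      // same matches as [A-Za-z0-9_]
const spaces = count(/\s/g);     // space, tab, and newline here
const notDigits = count(/\D/g);  // same matches as [^0-9]
const notWords = count(/\W/g);   // same matches as [^A-Za-z0-9_]
const notSpaces = count(/\S/g);  // everything that is not whitespace

console.log({ digits, words, spaces, notDigits, notWords, notSpaces });

// A shorthand and its uppercase negation split the string between them.
console.log(digits + notDigits === line.length); // true
```

One caveat worth knowing: in JavaScript, `\s` covers a few more whitespace characters than the four listed in the lecture, such as the form feed.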
18. Exercise 2 Start: It is time for our second JavaScript exercise. Now, this particular exercise is not going to be too difficult. We're going to build off of the previous JavaScript exercise, so let's take a look at what I'd like you to do. Now, this looks very similar to the completed exercise from Exercise 1. However, I've added a little bit to it. I've added three additional phone numbers, and as you can see, they're incorrect: only three digits on that last one. And so the additional part of this exercise is to make sure that the phone numbers are valid using this format. Obviously, there are a lot of different ways to enter a phone number that we would need to check for validity. But right now we're keeping it simple. We're going to use this format to determine whether it's valid or not. So what's currently happening? Let me just show you that. Open the console and display the new array. We are grabbing these phone numbers here, these last three, and they're not valid phone numbers, and so we want to modify it so that it doesn't grab those. And so basically all you're going to be doing is modifying the regular expression. Take some time to figure that out, and then, when you're ready, view the solution.
19. Exercise 2 Finish: All right, hopefully that wasn't too difficult for you. I tried to keep this one simple. Now, basically, all we need to do in our regular expression is fill out the rest of what we expect. Right now, we just have the first part of the number and a hyphen, and we're just expecting that somewhere in the number. If we fill out more, we can be more certain that we're getting the right type of match. Now, this is something that we would want to do with character sets, because the numbers could be anywhere between zero and nine. So we could do a bunch of these character sets like this, and then do another one and then another one and then a hyphen. That would get us the next part of the telephone number. However, this is a lot to type for that. So this is a case where we would want to use the shorthand, and so we can do that with three digit shorthands. Then we'll do another hyphen, and we'll do four more for the last four numbers of the phone number. All right, let's save that and see what we get now. So we refresh that, we open the console, and display the new array to see what's in there, and we get five. So we're not getting the last three, which is good. So that accomplishes what we're trying to do. However, I'd like to show you something, and we'll learn about how to deal with this later in the course. But I want to show you a problem with this solution. Let's say that this number up here, the reason it was wrong is because it had an additional digit here at the end, so it should have four, and it has five. Will this regular expression prevent that from being a match? Think about it. Let's save this and go ahead and find out what we get. So refresh, display the new array again. Notice we're still grabbing that. It has five digits. Why is it grabbing that? Well, the reason it's grabbing it is because this part is still a match.
And so it matches that. The extra digit doesn't affect it, because we don't have anything coming after this to indicate that this is the end of the four digits. Like I said, later on in the course we'll learn a way to deal with that. You can specify the start and end of a string, and that can help us make this regular expression more usable. All right, let's move on to the next section.
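The behavior shown in this exercise can be reproduced with a small sketch. The phone numbers, the variable names, and the exact pattern below are assumptions for illustration; the course's exercise files may use different data and a different first part of the pattern.

```javascript
// Hypothetical phone numbers (not the exercise's actual data).
const numbers = ["801-555-1234", "801-555-123", "801-555-12345"];

// Assumed pattern: three digits, hyphen, three digits, hyphen, four digits.
const phone = /\d\d\d-\d\d\d-\d\d\d\d/;

// Keep only the entries in which the pattern finds a match.
const kept = numbers.filter((n) => phone.test(n));

console.log(kept); // [ '801-555-1234', '801-555-12345' ]
```

The entry with only three final digits is rejected, but the one with five final digits still passes, because the first four of those digits satisfy the pattern and nothing in the pattern says the number has to end there.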
20. Using Repetitions in Your Pattern: Up to this point, we have not talked about how to repeat characters when we are defining an expression. In this section, we will do just that. At times, you may need a character to repeat in order to find a match. Regular expressions come with three metacharacters that allow you to indicate that the item should repeat and how much it should repeat. Let's take a look at those. First off, the plus symbol. Now, all three of these metacharacters that we're going to look at apply to the item immediately to their left in the regular expression, no more than that. So you put it right after the item you want to repeat, and in the case of the plus symbol, it indicates that we want to look for something that matches one or more occurrences of that item that is on the left. All right, we also have the question mark. That metacharacter indicates we want to match zero or one occurrences. So two possible states: it doesn't exist at all, or it exists only once, and then it will create a match. Finally, the asterisk indicates that we want to match zero or more occurrences. So it could match an unlimited number of occurrences of that particular item. Now, these metacharacters that match zero are interesting. Basically, what you're saying when you use one of these is that this item may be in the data that I'm looking at, and therefore I want to match it, or it may not be. And in the case of the asterisk, it could be there many times. And so watch for that. Now, these metacharacters can be a little bit difficult to understand. They sound pretty intuitive just as we go through them, but understanding how they work can be a little bit more difficult, so it's important that we understand them. In order to do that, we need to take a look at some examples. Now, here we have a string, and I'm not going to try to say it because I will probably mess it up, since it's a tongue twister.
But this will give us some data to be able to understand how these things are working. So, for example, let's start off with something simple. I'm going to put a character set in here of uppercase letters A through Z, and we can see that it matches all of the uppercase letters that we find in this tongue twister. Now, what would happen if we added a repetition character? Let's first add the plus symbol. Now, remember, this matches one or more. It really doesn't change it, right? Because there is not a situation where there is more than one uppercase character in a row. If this happened to be in uppercase there, then it would find a match for it. Without that plus symbol, it's two separate matches. So if we were to turn off the global flag, we just get a match for the S. We put in a plus symbol, and we get a match for both of those. Okay, let me turn our global flag back on. Now, what would happen if we change this to a question mark? Now think about this, and notice this little indication over here, this infinite. When we mouse over that, it says: error, the expression can match zero characters and therefore matches infinitely. So we basically match everything, because we're saying zero or one uppercase characters. And that's the same with the asterisk as well: zero or many, and we get the same error over here. So RegexPal indicates that's an error. Doing that in JavaScript, we don't get an error. It just returns whether there's a match or not. For example, let's create a regular expression. We're simply going to have the same type of thing. Press Return. Now, if we do a test on that regular expression, let's pass in the text E, I, O, and U, all in uppercase. What does it return? It returns true. So it indicates, yes, it finds a match. Even though the regular expression matches infinitely, all it's concerned with is whether there is a match or not.
Now, what would happen if we do exec on that? It basically matches at the start, is what it's showing us: index of zero. That's where it's finding the match. And so that's how JavaScript would handle that particular type of regular expression. It's not something you want to enter, really. It's not providing you with any information, because it could basically match anything. But the reason I use it in here is to give you an idea of how these things are working. Now let's try another one. Let me do an s. Notice all the s's in this tongue twister. Now, if I do after that a through z in lowercase, let's see what we get. So we get every s that is followed by a lowercase letter. Now, what if we were to add a repetition metacharacter? Let's say we did the plus there. Now see what it's matching. It can match as many as possible, and so it goes as far as it can to match all of that. That's called being greedy. Regular expressions are normally greedy; they try to match as much as they can. And then when we run into a space, it can't match anymore, and so then a new match starts over. Obviously, if we remove the global flag, we just get a single match there. Now let's see what would happen with the question mark, because now it can be zero or one. And so now we just get two characters, and also we can get a single s without another character after it, because it can be zero. This represents zero of any of these. Remember, it is the item that is on the left that it is modifying. And so that's why we're getting zero of those. Now, obviously, an asterisk would be very similar to the plus. The main difference from the plus is if we had an s here, notice it picks it up. The plus does not, because it requires one or many of these. The asterisk is zero or many. All right, let's change things up again, just to allow you to see some different applications and help you to understand these better.
All right, now let's do this for a regular expression. We'll do an uppercase letter to start. Then we'll do a lowercase letter. Notice we get two letters in the match. But now what happens if we put a plus symbol after it, where we're indicating we want to match one or more? What we're saying is we want to find names that start with an uppercase letter. If we have a name that starts with an uppercase letter but nothing else comes after it, it doesn't match. Now let's look at an example where we would use the question mark. Let's say we want to find the word apple or apples. We don't care which. We just want to see if apple or apples is in there. Maybe we were doing a test of some sort, some sort of online education, and they had to enter something about apples, and we didn't care if it was plural or singular. Now notice, right now, the way it's entered, it will match apples; it will not match apple. However, if we put a question mark after it, now we'll match both. Both of those become a match because we have our global flag on. All right, one more example, a little bit more practical example with the asterisk. Say we're searching through a log file and we want to find anything that's a warning. Now, the warning could have an exclamation point, or it could have multiple exclamation points after it. Right now, the word warning there is not finding a match. However, if we put an asterisk after it, now that matches. If we then put multiple exclamation points after it, that matches as well, because it's zero or many. And remember, it modifies the character it is next to, so it modifies the exclamation point, nothing else. All of these are single characters. Now, I mentioned just a little bit ago the greediness of regular expressions, and that's an important concept to understand with regular expressions: they're greedy. And so we need to spend a little bit of time discussing that and how it affects when matches are found. So let's move on to the next topic.
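The three repetition metacharacters from this section can be sketched in JavaScript as follows. The sample strings are made up to mirror the apple, warning, and name examples; they are not the lecture's test data.

```javascript
// "?" makes the preceding "s" optional: zero or one occurrence.
const apples = "apple apples".match(/apples?/g);

// "*" allows zero or more "!" after the word.
const warnings = "warning warning! warning!!!".match(/warning!*/g);

// "+" requires at least one lowercase letter after the capital,
// so the lone "B" is skipped.
const names = "Al B Carol".match(/[A-Z][a-z]+/g);

// A pattern that can match zero characters always "matches" somewhere.
const zeroLength = /[A-Z]*/;
const hasMatch = zeroLength.test("no capitals here");
const firstMatch = zeroLength.exec("no capitals here");

console.log(apples);   // [ 'apple', 'apples' ]
console.log(warnings); // [ 'warning', 'warning!', 'warning!!!' ]
console.log(names);    // [ 'Al', 'Carol' ]
console.log(hasMatch, firstMatch.index, firstMatch[0].length); // true 0 0
```

The last line shows what the lecture observes with `test` and `exec`: the match is reported at index 0 and is empty, so it carries no useful information.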
21. Understanding Greediness and Laziness: Before we continue discussing repetition, we need to deal with two important concepts: the greediness of regular expressions and how to make those regular expressions lazy. So by default, regular expressions are greedy. This means they try to match as many characters as possible. Let's take a look at this. Here is RegexPal, and in the test string I have some HTML, basically two paragraph tags: the first one here, the second one there. And I've repeated that a couple of lines down. The reason I've repeated it is because I want to use this line to help illustrate what is going on when we create a regular expression and it tries to find a match. This is going to help us explain the greediness of regular expressions and how to make them lazy. Now, let's say for the purposes of our regular expression, we want to match a paragraph tag. We want to match from the beginning of the paragraph tag to its ending tag here. So that amount. Now, this is a new set; it has another beginning paragraph tag and an ending tag, so we want to deal with them separately. Now, for this example, I'm going to turn off global, because I only want it to match once, and that will help me to illustrate what we're doing. So I have global turned off. And now let's go ahead and enter a regular expression. So if we're trying to match a paragraph tag, everything in between, and then the ending paragraph tag, let's start with the opening paragraph tag. And so that obviously matches where it finds the first one of those, because we don't have global turned on. Now let's go ahead and put in our ending paragraph tag. Now, obviously, this character here we need to escape, right? And so we'll put our escape character in to account for that. Now, what do we put inside this to account for any text that could be here? Well, we have a wildcard character that can represent any of those characters, and we've also learned repetition.
So now let me go ahead and put in an asterisk. Now, look what happened with this regular expression. It did not stop at the first ending paragraph tag. It went all the way to this ending paragraph tag. So this is an illustration of the greediness of regular expressions. Now, what do we mean by that? Well, the regular expression is going to match as much as it can. It's going to try to match everything that it can possibly match. And so we start out by matching this opening paragraph tag. Then we get to the wildcard character, and then we have this repetition of zero or more. And so, since it's greedy, it just starts grabbing everything. Does this here match the wildcard character? Yes. And so does this. And so does the p, and so on. And so it keeps grabbing, grabbing, grabbing until it gets clear here to the end. And then what does it do? Then it realizes, oh, wait, I've grabbed everything on that line, but I also need to look for a match for this last part. So it starts backtracking and starts giving back characters, little by little, until it gives back enough that it can find a match for that last part as well. Now, to show this greediness, look what happens if I were to remove the end paragraph tag. It still matches the whole thing. The end paragraph tag's not doing anything because of the greediness of our regular expression. It's grabbing as many characters as possible. So let me illustrate that again, down here in the line below. So we first find this portion of our text, and that matches the first part of the regular expression. But then we have a wildcard and we indicate zero or more. And so it just starts grabbing everything it can. It matches it all, all right? And that's a match for this regular expression. Now, when we have the ending paragraph tag added back in, here's what happens. So we first match it all, and then it goes, wait, I have some text here that I need to match. And so it gives back a portion.
It gives back that first character. Is that a match on that last part? Nope. Gives back another one. Is that a match on the last part? No. Gives back another one. Is that a match? Not quite. Is that a match? Yes, and then it matches it. And so it first matches everything because of greediness, and then it gives back some characters in order to find a match for the last part of that regular expression. So that is greediness. And it's this greediness that is causing it to match everything and not stop at this first ending paragraph tag. Now, we can make greedy expressions lazy, meaning that instead of grabbing everything it can, it is as lazy as possible, and it grabs as little as it can. That is the difference. Now, the way we express laziness is with the question mark. Now, the question mark we've learned in repetition, and remember, there the question mark matches zero or one occurrence. Now, when we put the question mark after this repetition character here, it forces it to be lazy. Now watch what happens. Now we're selecting just the first paragraph, not the whole thing. If we turned on global, then we would match all of them, but there would be four separate matches, whereas if we turn off global, the single match is just the first one. Now, why did it do that? What's the difference when we make a regular expression lazy? Why does this happen? Well, we explained how it works when it's greedy. Let's explain how it works when it's lazy. So when it's lazy, it starts off: well, I need to match this opening paragraph tag. It finds that, and then it's lazy and says, I can get by with matching nothing, and so I'm going to be lazy and try to match nothing. Well, if it matches nothing, it doesn't find a match for this last part here. Okay, if we just had something like this, let me put this in really quick, then it's going to find a match, because it's lazy. It doesn't need anything between the opening paragraph tag and the closing paragraph tag. It's not necessary.
It's being lazy. So what happens is it finds a match for that, and then it realizes, well, I need to find something that matches the end of this regular expression. So let's grab another character. Do I find a match for the end of that regular expression yet? No. And it keeps going through these characters, one at a time. Then it finds, oh, here's a match. I can stop now. That's how it works when it's lazy. When it's greedy, it grabs everything first because of this wildcard and this repetition character, and then it tries to give back to find a match. When it's lazy, it matches as little as possible. And so that's how it's working. So with that question mark in here to make this expression lazy, what would happen if we remove the closing paragraph tag? It would simply match the opening paragraph tag, and that is all, because it's lazy. It's grabbing as little as possible. When we remove the question mark that makes it lazy, then it grabs as much as possible. So you can see the difference as we go between those two. The difference between greediness and laziness is an important concept in regular expressions, and we're going to come back to it. But first we want to add more to the whole idea of repetition. So let's go on to the next topic.
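The greedy and lazy behavior walked through in this topic can be reproduced with a short sketch. The HTML string below is a made-up single line with two paragraphs, standing in for the lecture's RegexPal test string.

```javascript
// Hypothetical one-line sample with two paragraphs.
const html = "<p>first</p><p>second</p>";

// Greedy: .* grabs to the end of the line, then backtracks to the LAST </p>.
const greedy = html.match(/<p>.*<\/p>/)[0];

// Lazy: .*? grows one character at a time and stops at the FIRST </p>.
const lazy = html.match(/<p>.*?<\/p>/)[0];

// With the global flag, the lazy pattern finds each paragraph separately.
const lazyAll = html.match(/<p>.*?<\/p>/g);

// Without the closing tag in the pattern, the difference is even starker.
const greedyOpen = html.match(/<p>.*/)[0];  // everything to the end
const lazyOpen = html.match(/<p>.*?/)[0];   // just the opening tag

console.log(greedy);     // <p>first</p><p>second</p>
console.log(lazy);       // <p>first</p>
console.log(lazyAll);    // [ '<p>first</p>', '<p>second</p>' ]
console.log(greedyOpen); // <p>first</p><p>second</p>
console.log(lazyOpen);   // <p>
```

The forward slash in the closing tag is escaped here because it would otherwise end the regular expression literal.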
22. Specifying a Repetition Amount: Up to this point with repetition, we have not discussed controlling how many times a pattern repeats. It has either been zero or many, one or many, or zero or one. Well, we can also control the amount. So let's take a look at that. As with the other repetition metacharacters, this applies to the item that came before it in the regular expression. Now, we use curly braces to specify the repetition amount, so let's see what is possible. So first, if inside of the curly braces we put two numbers separated by a comma, then we're able to specify that it matches from min to max occurrences. So say we had a four and a six. It could be something with four, something with five, or something with six. It's anywhere between that min and that max number. We'll take a look at some examples in just a minute. But another way that we can use a repetition amount is we can simply put a single number, and then that specifies that it matches exactly that many occurrences, whatever number we specify. And then, finally, one more option we have is we enter a number and then a comma without a second number after it. Now, in that case, we're saying that it can match min or more occurrences. So it has to at least be the number we enter for min, but it can be any number after that. So, as you can see, this type of repetition amount is going to make it easier to enter certain things. For example, when we wanted to express three digits, we would have to enter three metacharacters that represent digits. Well, now we can simply enter one and then specify the repetition amount. So let's jump to RegexPal and take a look at some examples. I have some text in here: my telephone number is as follows, and then we have a telephone number entered after that. Let's first take a look at some word characters, so I'm going to enter the metacharacter for word characters.
And now let's say we want to find all words that are either three characters, four characters, or five characters. So we would do it this way. So notice that the curly braces that specify the repetition come right after the item that we're indicating we want to repeat. So down here we find one, two, three, four, five letters, and then four letters, and then five letters, which is the max. And then right here, since we have spaces in between, it doesn't match any of those. It also matches numbers, and there's three, three, and then four there. Now notice what happens if we add another word character at the front here. Well, now suddenly we're matching up to six characters, because it has to match one here, and then it has to at least match three more. And so we're getting a minimum of four. Right here's a four. Maximum of six. There's a six. There's a six. There's a six. Okay, so that emphasizes the fact that the repetition amount modifies the item that is to the left. And that's been true with the other repetition metacharacters that we've looked at so far. All right, let's modify this a bit. Let me remove that word character. And now let's just specify that we want to repeat three times, and that is all. So here we have a three, three, three. We can see that it's grabbing in threes. All right, now notice what we could do about finding a telephone number. We put a hyphen in there, and it's finding three characters that are followed by a hyphen, so you can begin to see what is possible with repetition. All right, let's do another example. Let's put in a comma. Now remember, this represents three or more, so we have to at least have three. So that's why these ones aren't selected, because there's not at least three. But then, once we have three, it selects as much as it can until it can't select anymore. All right, so that's what the comma does. Now, one more example. Let's put a six here. Now we can see how that changes it from when we had a five. Notice where the r was left off and the w was left off before we changed that to six.
Obviously, we're going to a larger number that it's matching. Okay, let's move to a different type of example. So I'm going to replace the text that is here in the test string. I'm just going to put in a couple of hexadecimal numbers. Notice how it's matching those. But what if we wanted to specify just hexadecimal numbers? We didn't want to get any other words. For example, if we start typing in something like that, it matches all this text as well. So let's modify this. I'm going to put a character set in. Now, this character set is 0 to 9, A to F. Those are the ranges. That's what can exist as a part of a hexadecimal number. So we'll close our character set. Now we need to specify the repetition, and let's indicate six. Well, there we're matching one hex number. It's not matching this one. Why? Because we have two lowercase letters here. So how could we fix that? How could we change that? Well, we could put in a range of lowercase characters. Or we could just use a flag, our ignore case flag, which is what I'm going to do. So now we have the i right here that ignores the case, and now we're picking up both of those. If we wanted to specify the pound symbol in the front, obviously we could add that. So that's a hexadecimal number. Now, what if we wanted to match a Social Security number, the type of Social Security number we have in the US? So I'm going to enter a fake Social Security number here. Right there. Let's see what we would need to do to match this. We can get rid of our ignore case flag, and let's go ahead and start building this out. So we need digits. This has to be digits. So first, let's use a digit. Right now we're getting single matches on all the digits. We want the first set to repeat three times. Okay, see where that is matching. Then we have a hyphen. Then we want the next set of digits to repeat twice. Then we have a hyphen. Then we want the next set of digits to repeat four times. And there we get a match of a Social Security number.
So there are several examples we've looked at of specifying a repetition amount. Now, just a closing point about the way we specify repetition amounts, and that is that these don't make our patterns more flexible, but they will make them shorter to read. For example, we could have done the same thing without the repetition, like this. Then we put two digits here, and then we put four digits in. But being able to specify a repetition amount is easier, and it can make them shorter and easier to read. All right, let's move on to the next topic.
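The three curly-brace forms from this topic can be sketched in JavaScript like this. All sample strings, including the fake number and the hex values, are made up for illustration and are not the lecture's test strings.

```javascript
// Hypothetical sample text.
const text = "My number is 5551234";

const minMax = text.match(/\w{3,5}/g);  // three to five word characters
const exact = text.match(/\w{3}/g);     // exactly three at a time
const minOnly = text.match(/\w{3,}/g);  // three or more

console.log(minMax);  // [ 'numbe', '55512' ]
console.log(exact);   // [ 'num', 'ber', '555', '123' ]
console.log(minOnly); // [ 'number', '5551234' ]

// Six hex digits; the i flag lets A-F also match a-f.
const colors = "#FF00aa and #00ccFF";
const hexStrict = colors.match(/[0-9A-F]{6}/g);
const hexAnyCase = colors.match(/[0-9A-F]{6}/gi);

console.log(hexStrict);  // null
console.log(hexAnyCase); // [ 'FF00aa', '00ccFF' ]

// The long form and the repetition form accept the same fake number.
const fake = "123-45-6789";
const longForm = /\d\d\d-\d\d-\d\d\d\d/.test(fake);
const shortForm = /\d{3}-\d{2}-\d{4}/.test(fake);

console.log(longForm, shortForm); // true true
```

Notice that without any boundary in the pattern, `\w{3,5}` happily matches the first five characters of a longer word, which is the same chunking effect seen in the lecture.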
23. Revisiting Greedy and Lazy Concepts: I wanted to take just a couple of minutes to revisit the idea of greedy and lazy as it pertains to regular expressions. We talked about this two topics ago. I want to now apply that idea to repetition when we can specify the amount, as we've just learned. This idea of greediness and making regular expressions lazy is a very important concept, and that's why I want to revisit it. So here in RegexPal, I've got some numbers. Now, these numbers are such that there are two digits, then a hyphen, and then it can be between four and six digits. At least, those are the ones I'm trying to grab. This one here only has three; I would not want that one. So based on what we've just learned, you already know how to write a regular expression to capture those. We start with a digit that we repeat twice. Then we have a hyphen. Then we have a digit that we repeat four to six times. Enter that there, and we're able to capture it. Now, what would happen if we told this repetition expression, the four comma six, to be lazy? Let's take a look at that. Notice what it does now. It only grabs four. It does the least amount. It's lazy. So without this, it's greedy. It grabs as much as it can, and as much as it can is six digits. So we can see that laziness being attached to this forces it to go with the minimum amount. Now, what if we had another hyphen after these with a number? Right now, we're still grabbing the same thing. But now if I add a hyphen as a part of this and then another digit, look what it does now. Let me just reduce that down to the hyphen, because suddenly, by putting in a boundary, and this hyphen becomes a boundary to what we're trying to grab here, it is grabbing everything. So it goes to the max again, even though we have a lazy identifier indicating this expression should be lazy. Why does it do that?
Well, being lazy, it first grabs the first four numbers, but then it tries to continue with the match, and the next thing it needs is a hyphen. There is not one, so it picks up the next character, and that fits within the {4,6} range. So it says, okay, I'll keep that. It doesn't match the hyphen yet, but I'll keep that. Then it goes to the next one, and that fits as well, between the four and the six. Then it goes to the next one, finds the hyphen it's looking for, and so it's able to complete the match. So I think it's important, as you're writing regular expressions, to understand how they're working and, of course, to understand that by default they're greedy. But we can force them to be lazy if it helps us, and many times that can help the regular expression be more decisive, which means it will find a match faster and won't use up as much processing power to do that. All right, so that's greediness and laziness again. Let's move on and take a look at an exercise.
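The greedy and lazy behavior described in this topic can be reproduced in a few lines. This is only a sketch: the sample strings are made up to fit the "two digits, hyphen, four to six digits" description, not copied from the RegexPal demo.

```javascript
// Hypothetical sample data shaped like the numbers described in the lecture.
const sample = "12-123456";
const sampleWithTail = "12-123456-7";

// Greedy: {4,6} takes as many digits as it can (six here).
const greedy = /\d{2}-\d{4,6}/;
// Lazy: the trailing ? makes {4,6} take as few digits as it can (four).
const lazy = /\d{2}-\d{4,6}?/;
// Lazy, but followed by a required hyphen: the engine starts with four
// digits, then has to keep extending until the hyphen can match.
const lazyThenHyphen = /\d{2}-\d{4,6}?-/;

const greedyMatch = greedy.exec(sample)[0];
const lazyMatch = lazy.exec(sample)[0];
const boundedMatch = lazyThenHyphen.exec(sampleWithTail)[0];

console.log(greedyMatch);  // "12-123456"
console.log(lazyMatch);    // "12-1234"
console.log(boundedMatch); // "12-123456-"
```

The third pattern shows why the lazy quantifier still ends up with all six digits: laziness only decides what is tried first, and the rest of the pattern must still succeed.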
24. Exercise 3 Start: It is time for exercise three, where you get a chance to apply what we've learned to a JavaScript problem. So let's take a look at what we're going to accomplish in this exercise. First off, I want to show you the HTML page that this exercise is a part of. Basically, what we have here is a field where we're told to enter a phone number. It's going to be a 10-digit phone number, and here's what we want to do with that phone number once it's entered. We want to validate the phone number that's entered into the text field, and we want to validate as the number is entered. So as each number is entered, we check to see if it matches these particular formats. I've laid out some formats here that it should work with: parentheses around the area code, hyphens between the numbers in the appropriate places, periods between the numbers in the appropriate places, or a number without any of those. Those all need to be matches. Now, if the number matches once everything is entered, then the text color should change from red to green. By default the text color is red, so we need to change it to green once we get a match with the regular expression that we create. Just as a hint, you can use the keyup event on the text field to respond to entered text, so you can use that event to check and see if there is a match. So take a few minutes, see if you can solve that, and then when you're ready, we can review the results.
25. Exercise 3 Finish: All right, I hope you were successful in solving that problem. Let me take a few moments and show you how I would solve this. Now, when I do these JavaScript exercises, not only do I include the regular expression material that we've covered, but I also like to include good practices in solving them. So I am going to set this particular solution up with an immediately invoked function expression, because I want this code to execute when the page loads, but I don't want to leave any residual stuff in the global scope. That's the reason I'm doing that. So let me first set up the structure of that immediately invoked function expression. There's the function structure enclosed in parentheses, which makes it valid within JavaScript, and then I use a set of parentheses in order to invoke that function. All right, so there's the structure. Now, inside of that, I can put the solution. First off, I'm going to declare a variable, and this variable is going to contain a reference to the text field that is being typed into, right here, this input text field. The id of that is phone, so it's pretty easy to grab. The variable is going to be phone, so we'll set that equal to document.getElementById and the id. So that's one variable I want to establish. A second one that I want to establish is the regular expression, so let's set that up. This will be our variable, and now we need to define the regular expression. I prefer doing it with a literal, so there's the structure for the literal. Now what do we want inside? Well, we need to account for an open paren at the start of this, or it may not be there, so that's probably the first part we need to deal with. I need to escape it because it's a paren. I'm going to escape that character because we're referring to the actual character. We haven't learned yet what parentheses do in a regular expression, but they are a metacharacter and do something.
So we need to escape it. Now, since this can exist or not exist in the match, I'm going to put a question mark after it. Basically, that tells it to repeat zero or one times. Now we need three digits, and there will always be three digits in the first part. After the three digits there could be a close paren, and once again, this one is not required either, so we will repeat that zero or one time. That's what the question mark is doing in this case. Now, after we have the first three digits, we can get a hyphen, we could get a period, or we could get nothing; it could be the next set of numbers. So we need to have a character set. If we put the hyphen at the start of the character set, then we don't need to escape it. Put the period in there too, because it could be either one of those, and once again, zero or one time. There may be situations where we don't have either of those, but we can have either a hyphen or a period as well, and so that sets up this portion of our number. After the first three digits, then we have three more digits, so we can take care of it like that. Once again, we have the same sort of situation after those three digits, so I'm just going to copy that part there and paste it after those three digits. And at the end we have four digits. So there's the regular expression. This is what's going to match. We've been able to use repetition characters, and an interesting thing about this problem is that we've used repetition characters to indicate that an item may not exist, not simply to indicate how many times it can exist. All right, so those are our two variables. Now, for the rest of this problem, we need to attach an event listener to that text field. The purpose of that event listener is that, on the keyup event, we're going to test and see whether there is a match or not. So let's first add an event listener to phone: phone.addEventListener.
And of course, the event we want is keyup. The second argument we pass into addEventListener is the function that gets executed when that event occurs. So I'm going to pass in an anonymous function, put a semicolon in there, and now we'll define what's inside of this. This part should be pretty simple. We really just need to take the regular expression and test it against the value that is inside of the phone text field. If it comes up true, then we're going to use a couple of classes that I've included in our CSS: the red class and the green class. The red class is already assigned, as we can see here, but we also have a green class that changes the color to green. So if it ends up being true, I'm going to remove the red class and add the green class, and vice versa if it's not true. All right, so let's set that up. We start with an if, and we're going to check our regular expression using the test method. And what are we testing it against? What text are we testing? Well, we want to test it against the value of this input field, and we can get that using phone.value. Now, if that evaluates to true, then we want to change the class that is associated with the field, and we can do that with classList. classList has a remove and an add method associated with it. So we would remove the red class, and then we would add the green class. That will take care of the situation if it evaluates to true, so if we find a match. But if they then delete a character or something like that, we want it to revert back to red, so it indicates that it's not a match. So let's finish this off with the else clause. Basically we just do the opposite of what we're doing here. So I'm going to copy that, and this will be green and this will be red. There we go. Let's go ahead and save that and see how that's working for us. I'm going to refresh. Now let's go ahead and start typing in a phone number. This time I'm going to just use periods, and it turns to green.
That happens as soon as we get a valid phone number. Well, let's delete a period: it stays green. Delete another period: it stays green. What happens if we delete a number, though? Ah, it goes back to red. We add a number, and it goes back to green. How about if we add a paren there and a paren there? It still accepts it. Now let's add a hyphen here. Still accepted. What if we add a hyphen here? It still accepts it; all of that is valid. So it looks like our phone number validation is working. And we were able to set that up in a regular expression and make it more accurate because of the repetition that we just covered in this particular section. All right, let's move on to the next section.
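Pulling the walkthrough together, the solution might look roughly like the sketch below. The pattern is reconstructed from the spoken description, so treat the exact spelling as an assumption; the names `phonePattern` and the `document` guard are additions for illustration, and the pattern is kept outside the function here only so it can be tried without a page. The `phone` id and the `red` and `green` classes come from the lecture.

```javascript
// Pattern reconstructed from the description: optional "(", three digits,
// optional ")", optional "-" or ".", three digits, optional "-" or ".",
// then four digits. It is unanchored, as in the walkthrough.
const phonePattern = /\(?\d{3}\)?[-.]?\d{3}[-.]?\d{4}/;

// Immediately invoked function expression, so nothing else leaks
// into the global scope when the page loads.
(function () {
  // Guard (an addition for this sketch) so the code can also run
  // outside a browser, where there is no document.
  if (typeof document === "undefined") return;

  var phone = document.getElementById("phone");
  if (!phone) return;

  // Re-test the field's value every time a key is released.
  phone.addEventListener("keyup", function () {
    if (phonePattern.test(phone.value)) {
      phone.classList.remove("red");
      phone.classList.add("green");
    } else {
      phone.classList.remove("green");
      phone.classList.add("red");
    }
  });
})();
```

Because this pattern has no anchors, a value with extra digits on either end would still turn green; the anchors covered in the next section are one way to tighten that.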
26. Understanding Anchored Expressions: This section is entitled Anchored Expressions. But what do we mean by that? What is an anchored expression? Well, let me give an example. What if we wanted to find a particular expression, a word, but we only wanted a match if that expression came at the start of a string, or perhaps at the end of a string? How would we do that? We can do that with anchored expressions. This is one example of what is possible with anchored expressions. So this section has a distinct difference from the previous sections: in this section we are learning to define position. Up to this point, we would take a match no matter where in the string it occurred. But now we will learn to put position parameters around that match. It may not just be the start or end of a string; it could also be the start or end of a word. So that is what this section is all about. Let's get to it.
27. Using Start and End Anchors: The first set of anchor metacharacters we're going to look at allows us to specify a match that is at the start of a line or at the end of a line. So these anchor metacharacters specify the position where the match should take place. Let's take a look at these two metacharacters. First off is the caret. The caret (^) anchors the match at the start of the line. The companion metacharacter for the caret is the dollar sign ($), and it anchors the match to the end of the line. So when these metacharacters are used in an expression, basically what they're saying is that the expression has to be located at a certain place in order for the match to occur. When we use the caret, the match has to be located at the very start of the line, whereas with the dollar sign it has to be at the very end. It's the last thing, and if it is the last thing, then it produces a match. Let's take a look at some examples. I'm going to jump over to RegexPal, and I have some text in here. It provides some information on the dot metacharacter, which is some good information, but we want to use this text to work with matches that occur at the start or end of the line. The first thing I'm going to do is put the word "the" in our regular expression. We can see that it matches in two places because we have the global flag as a part of the regular expression. Now, if we also add the ignore-case flag, the i, then we can see multiple matches throughout. Now notice what happens if we add the caret. The caret is going to specify that the word "the" has to be at the start of the line, so I place the caret. Now we only have a single match, because that is the only one that exists at the start of this line of text. Now, this text wraps, but really, the start is right here. For example, if I did "in", we wouldn't get a match right here, because it's not a new line. It's not the start of a line. Now, how about if I entered "first"?
It finds a match there, but now let's add the dollar sign. Because the dollar sign specifies that this will be at the end of the line, we add the metacharacter at the end. So basically what it's doing is it looks and finds a match, and then when it runs into the dollar sign metacharacter, it makes sure that the match is at the end of the line. And notice, it is not the last thing. The t is not the last thing; there's also a period there. So if we were to add a period, now we get a match, because that is at the end of the line. Now, what if we use both of these together? What does that do for us? As you probably guessed, we're not getting a match. But when would we use something like this? When would we use an expression surrounded by both the caret and the dollar sign? Well, that is perhaps if we wanted to find a match where that is the only thing that existed, and we wanted to make sure that there was nothing else with it. For example, if I had a space here, suddenly that is no longer a match, because this has to be the only thing that is there. Delete that space and we get the match again. So these start and end metacharacters can allow you to do some things you would not be able to do otherwise. Now we've introduced the start and end anchoring metacharacters, and these two metacharacters play an important role whenever you use the multiline flag, the m. So we need to talk about these with the multiline flag, and that's what we're going to do in the next topic. So let's move on.
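The anchor behavior from this topic can be checked in JavaScript directly. The sentence below is a made-up stand-in for the RegexPal text, which is not reproduced in the transcript, so the match counts apply to this sample only.

```javascript
// Hypothetical single line of text (no line breaks).
const text = "The cat saw the other cat first.";

// No anchor: "the" is found wherever it occurs, including inside "other".
const anywhere = text.match(/the/gi);
// Caret: only the occurrence at the very start qualifies.
const atStart = text.match(/^the/gi);

// Dollar sign: "first" is not the last thing in the text, the period is.
const endsWithFirst = /first$/.test(text);
const endsWithFirstAndPeriod = /first\.$/.test(text);

// Both anchors: the pattern must be the only thing in the string.
const onlyCat = /^cat$/;
const exact = onlyCat.test("cat");
const withSpace = onlyCat.test("cat ");

console.log(anywhere.length, atStart.length);       // 3 1
console.log(endsWithFirst, endsWithFirstAndPeriod); // false true
console.log(exact, withSpace);                      // true false
```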
28. Using the Multi-line Flag: Now that we have introduced anchored expressions, it is important to explain how they act in multiline mode. The definition of the caret is that it matches the start of the line. I specifically used "the start of the line" to describe it because when we use the multiline mode flag, it will look for a match at the start of each line if there are multiple lines in the text. And that is true of the dollar sign, which matches at the end of the line: it will look for a match at the end of each line. Multiline mode really only affects the caret and dollar sign metacharacters, because they specify a position for a match that is tied to a line of text. Let's look at how this multiline flag affects things. Here in RegexPal, I still have the same text, so let's go ahead and put in the word "the" again, like we were working with in the last topic. Notice we get multiple matches until we add a caret, and obviously that then reduces it down to a single match, and it's at the start of this line. This right now is considered one line, so until we have a line break, there is not a second line. But if I put a line break here, we still do not get a match. There is no match because multiline mode has not been activated using the m flag. So let's go ahead and activate that. Now you can see that we do get a second match at the start of that second line, and you can see we have the m as a part of our regular expression. Now we have the g, the i and the m. The i indicates that we want to be case insensitive. The g indicates global, so as many matches as possible. And the m indicates multiline, which affects the caret and says, now you can find matches not just at the start of all the text, but at the start of each line. That will be true with the dollar sign as well. So let's replace this with a period. Of course, when we enter a period, it is a metacharacter that specifies any character, so let's escape that, and now we can see that we get five matches.
There are five periods in there. But if we wanted to specify the end of a line, we would put the dollar sign after it. Now notice that we only got a match at the very last period. Why did that happen? What's going on here? Well, let's take a look. Notice there's a space after the period. There was a space before the word "the", and when I pressed return and created a line break, the space was still left there. If I remove that space, then we get a match with that period as well. So it has to be the very last thing in the line. Now, it's important to remember that multiline mode, when we add that m flag at the end of a regular expression, affects the caret symbol and the dollar sign symbol, because they are tied to lines; they're specifying a position on a line. Technically speaking, in multiline mode, the caret will match at the start of the text and then after each line break. So here's a line break, right at the end of that, and so the caret will cause it to look for another match right after that line break. The dollar sign, on the other hand, matches at the end of the text, but also right before each line break. So here's a line break, and it's checking right before that; same thing here. So it's either the end of the data or right before a line break. That's what's happening with the dollar sign. Now, if you keep that technical definition with you, I think it helps you understand how those work and how to use them when you're creating regular expressions that you want to anchor at a specific position in the line. There's one more important distinction I need to mention about multiline mode. Let me illustrate this. I'm going to remove the multiline flag, so the m is gone from the regular expression. Notice we have a match now at the very end.
We have multiple lines here, but this is now specifying the end of everything. The dollar sign metacharacter is specifying that because we're not in multiline mode, and so it finds it here and gets a match, but it doesn't get this one because we're not in multiline mode. But look what happens if I enter a return here: now that is no longer at the end, and so no match is given. So in single-line mode, this would not get a match, because a third line has been added and so the period is no longer the last thing in the text. But if we turned on multiline mode, then of course it would get a match there. So in multiline mode, it is looking for a match if a line break occurs immediately after whatever we've specified in our regular expression. All right, let's move on to the next topic.
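Here is a small sketch of the multiline behavior walked through above, using JavaScript's own m flag. The two-line text is invented for the example, including the stray space left before the line break.

```javascript
// Hypothetical two-line text; note the space left before the line break.
const text = "The first line. \nthe second line.";

// Without m, the caret only matches at the start of all the text.
const startsWithoutM = text.match(/^the/gi);
// With m, the caret also matches right after each line break.
const startsWithM = text.match(/^the/gim);

// With m, the dollar sign matches right before each line break, but the
// first line ends in a space, so only the last period qualifies.
const endsWithSpaceLeft = text.match(/\.$/gm);

// Remove the stray space and both periods end their lines.
const trimmed = text.replace(" \n", "\n");
const endsWithM = trimmed.match(/\.$/gm);
// Without m, only the period at the end of everything matches.
const endsWithoutM = trimmed.match(/\.$/g);
// Add a trailing line break: without m nothing is at the very end anymore.
const afterReturn = (trimmed + "\n").match(/\.$/g);

console.log(startsWithoutM.length, startsWithM.length); // 1 2
console.log(endsWithSpaceLeft.length, endsWithM.length); // 1 2
console.log(endsWithoutM.length, afterReturn);           // 1 null
```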
29. Working with word Boundaries: Regular expressions include another anchoring-type metacharacter, and this new anchor defines the boundary of a word. So if we want to make sure our pattern only matches a word, not part of a word, we can use a word boundary. Now, there is also another metacharacter that allows us to specify a non-word boundary, so the left or right side of the pattern will need a word character in order to create a match. This is basically the opposite of a word boundary. So let's look at these metacharacters. First off, our word boundary character: it is a lowercase b that you escape. So we escape a lowercase b (\b) to create our word boundary, and this indicates that the pattern is bounded by a non-word character, depending on which side of the pattern we place it. We can place it on the left side and on the right side. If we place it on both sides, it indicates that whatever's inside needs to be a pattern that is bounded by non-word characters. Now, the opposite of that is just the uppercase B. So we escape the uppercase B (\B), and that indicates that the pattern is bounded by word characters, and that can be placed, of course, on the left and on the right. Now, what do we mean when we say word boundary? Well, as with the last anchoring characters we've talked about, the caret symbol and the dollar sign, these word boundary metacharacters reference a position, not an actual character. That's important to remember with these anchoring-type metacharacters: they're referencing positions, but by referencing positions we're able to define more exact regular expressions. Now, just a reminder: what are the word characters? Well, the word characters match exactly with our word shortcut, the escaped w (\w), and that is equal to uppercase and lowercase A through Z, then zero through nine, and then also an underscore. The underscore is a part of the word characters. So that's what represents word characters.
So let's jump to RegexPal and take a look at some examples. I have a phrase in here, and the reason I chose this phrase is because it uses the word "plan" in multiple places. In one place it is the entire word, but in other places it's just part of a word. So let's go ahead and put "plan" into our regular expression. We can see that it matches in four places. Now, if we wanted to match only "plan" as a word and not as a part of a word, we would use a word boundary to do that. Start with a word boundary on the left: we use the escape, a backslash, and then a b. And now we've limited it to three matches. All three of these matches are at the start of a word, meaning what is on the left side of each is a non-word character. In this case, it's a space. In all three cases it's a space, but it could also be other characters. It could be a hyphen; a lot of the symbols, for example, are non-word characters. It could also be the start of the text. For example, if we had "again", that matches there because that is the start of our phrase, and so the left side of that counts as a non-word character. And so we get that match. Let me go back to "plan". Now, two of these still have a word character on the right-hand side. We can see that the t and the e are both word characters. So if we were to add another word boundary on the right, that would make sure that we only grabbed the one where "plan" is the full word. It's not part of a word; it is the full word. And that's what a word boundary can do for us. Now, word boundaries don't have to be used only with full words. For example, it could be something like that. And of course, in the pattern that we bound with these metacharacters, we can use any of the other things we've learned about regular expressions. Now, with our global flag, we can see that even though we use the word boundary, we can match in more than one place, because both of these patterns, this t-h-i-s, are bounded by non-word characters. All right, let's go back to "plan".
And now let's change these to non-word boundaries. So I'm going to do an uppercase B. Notice right now we're not getting a match at all, and that's because we still have a word boundary at the end here. There's only one place where "plan" is the end of the word. That's not the case with "plant". That's not the case with "planet". That's not the case with "implant". But that standalone "plan" has a non-word character before it, and the uppercase B is specifying that it should be a word character. So, for example, if we just put a letter there, well, there we go, now we get a match. All right. Now, if we change this last part to an uppercase B, we match the "plan" inside of "implant", because it is bounded by word characters. So the uppercase B specifies that the pattern is bounded by a word character. The lowercase b specifies that it's bounded by a non-word character, indicating the pattern in between needs to be a word. That's why we call the lowercase b a word boundary: because we're bounding a word. And the uppercase B, we say, is a non-word boundary, because here we've bounded something that is not fully a word, since there are word characters on both sides of it. Sometimes that distinction can be a bit confusing when working with word boundaries. All right, let's move on to the next topic.
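The sequence of word-boundary experiments above can be replayed with a short script. The phrase is a hypothetical one built to contain "plan", "plant", "planet" and "implant"; the actual phrase from the demo is not in the transcript.

```javascript
// Hypothetical phrase containing plan, plant, planet and implant.
const phrase = "The plan is to plant a seed on a planet and implant an idea.";

// Small helper: how many times does a global pattern match the phrase?
const count = (pattern) => (phrase.match(pattern) || []).length;

const anywhere = count(/plan/g);          // every occurrence
const startOfWord = count(/\bplan/g);     // plan, plant, planet
const wholeWord = count(/\bplan\b/g);     // only the full word
const midStartWordEnd = count(/\Bplan\b/g); // needs a letter before, none after
const insideWord = count(/\Bplan\B/g);    // the "plan" inside "implant"

console.log(anywhere, startOfWord, wholeWord, midStartWordEnd, insideWord);
// 4 3 1 0 1
```

Changing the standalone "plan" to something like "aplan" would give `/\Bplan\b/g` its one match, which mirrors the "just put a letter there" step in the lecture.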
30. Writing Accurate Regular Expressions: We have covered a lot of different techniques that can be used in building regular expressions. We need to take a short moment to discuss an important concept: it is important to try to write the most accurate regular expressions that you can. Now, what do I mean by that? Sometimes we can get lazy with the regular expressions we write. For example, a particular match that you are trying to achieve may have more than one regular expression that will work for the data you are testing. So you quickly write a regular expression, test it, and it seems to work, so you don't think about it anymore and you just move on. But it's possible that that regular expression may produce incorrect matches with a different set of data. So even though your testing is working, make sure you think through your regular expression and that it is as accurate as possible. Now, we have seen numerous examples of this throughout this course already, but let me show another example just to emphasize the point. I'm going to use the United States ZIP code as an example. The U.S. ZIP code generally consists of five numbers, like I have written here. However, it can also have a four-digit extension like this, and that is becoming more common. But generally when we talk about a U.S. ZIP code, we're talking about five numbers. So let's say I were writing a regular expression that would match the ZIP code. Well, I could write it like this: a wildcard character and then a repeat, and that repeat indicates that it matches zero or more times. Now notice we get the match we want. We have a sample ZIP code here and we get the match we want. Even if we were to add the four-digit extension, we would still get that match, so it looks like it's working for us. However, you're already aware that this is not the best regular expression for doing this, because it could match anything.
I can put any characters on the front of that or on the back of it, and it's going to match them, because we've used the wildcard. So let's narrow this down. Let's narrow down what can be matched by putting in a range of digits. Now, if I put letters on the front, it no longer matches them. Does it match the four-digit extension? No, we've now lost that because we're focusing just on digits. Well, we could add the hyphen there, and now we get the match. But even with that, let's say we got a number like this. That is a match, but that is not a ZIP code; it has way too many digits to be a ZIP code. So we need to narrow it down even more. Let's put in the exact number of characters we want to repeat. Now, if we go back to our original ZIP code, yes, it matches. But what if we had more digits? Well, it still finds a match. If we were to take this entire bit of data, it is not a ZIP code, but it's getting a match because the first five digits match our pattern. And so this is where we need to add anchors to the regular expression to force it to be the kind of data we want. So I put a start and an end anchor in, and now we no longer get that match. Now, in order for it to get the match, it needs to be five digits and five digits only, because we're marking the starting point and the ending point of that data. So we narrow what it can be even more. This, of course, causes an issue when we have a four-digit extension such as that. So how would we account for that? Well, then we continue to work with our regular expression, use some of the same techniques, and there we get a match, and this works great for that. However, will it match just a five-digit ZIP code? No, it doesn't. Now we've lost that. So you can see the kind of process we go through here, and the importance of making your regular expressions as accurate as possible. Now, we have learned ways to indicate that data can repeat zero or one times, or zero or many times.
And so for this last part here, what we need to do to make this work is indicate that it is optional, that it doesn't have to be a part of that number. So we want to add a question mark at the end to indicate that it can occur or not. However, the question mark only applies to the item immediately to its left, and we need it to apply to everything in the extension. Well, in the next section we're going to be looking at grouping characters to solve this completely. But basically, what I want to emphasize is the importance of writing accurate regular expressions. There are some rules we can follow when we're doing that, so that we make sure we're covering our bases. For example: when possible, define the quantity of repeated expressions. When we started with this, I had this kind of thing, and I was not specifying the quantity. It became more accurate when I specified the quantity in that regular expression. Okay, so that's rule number one. The second rule: narrow the scope of repeated expressions. We saw that as well when we were working through this. Initially I was using the wildcard, and that matches; that's great. But the scope of this wildcard is very large. It covers a lot of characters, so we narrowed the scope with a character set. We indicated what characters could be a part of that set. That's how we narrowed it. So that's the second rule. And then rule number three: provide clear starting and ending points. That's where we put the anchors into our regular expression. We indicated that the data we're looking for should be all of the data we get. That's why we specified that there was a start and an end, and those anchors were around the data we were looking for, so there couldn't be any other data outside of that when we received this data. So specifying the starting and ending points, using anchors and also word boundaries, is another important rule.
So those three rules can help us be more accurate with our regular expressions. And there's a fourth rule that I did not put on this slide because I think it's intuitive to most of us: we need to test multiple data sets, not just a single data set. Don't just say, oh, it works with that, it must be working fine. So let's strive for accurate regular expressions so you can trust the results you're getting. All right, let's move on to the next topic.
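The ZIP code progression from this topic can be sketched as a series of increasingly strict patterns. The exact patterns typed in the demo are not in the transcript, so these are reconstructions, and the sample values are made up. The last pattern uses a group, which the course only introduces in the next section; it is included here purely as a preview and is an assumption about where the example ends up.

```javascript
// Step 1: wildcard with an open-ended repeat. Matches anything at all.
const loose = /.*/;
// Step 2: digits only, with an exact quantity (rules one and two).
const fiveDigits = /[0-9]{5}/;
// Step 3: anchors added, so the five digits must be the whole value (rule three).
const anchored = /^[0-9]{5}$/;
// Step 4: the extension is required, which loses the plain five-digit form.
const withExtension = /^[0-9]{5}-[0-9]{4}$/;
// Preview (assumption, relies on grouping from the next section):
// the whole "-1234" extension made optional as a unit.
const optionalExtension = /^[0-9]{5}(-[0-9]{4})?$/;

// Rule four: test against more than one data set (hypothetical values).
const samples = ["12345", "12345-6789", "1234567890", "not a zip"];
const report = samples.map((value) => ({
  value,
  loose: loose.test(value),
  fiveDigits: fiveDigits.test(value),
  anchored: anchored.test(value),
  withExtension: withExtension.test(value),
  optionalExtension: optionalExtension.test(value),
}));

console.log(report);
```

Reading the report row by row shows each rule removing a class of false matches that the previous pattern let through.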
31. Exercise 4 Start: All right, we've come to JavaScript exercise four. Now, this exercise is not going to be too difficult. I simply want to give you a chance to apply some of the things we've covered in this section, and we will be doing that by getting you to use the replace method of the String object wrapper. So let's take a look at the assignment. In this exercise, you have an app.js file and a content.js file, and what you need to do is, using the replace method, change the text that is a part of the variable text1, which is in content.js, so that any occurrence of a weekday, meaning Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, or Sunday, is replaced with Monday. Now, you obviously can check to see what is in content.js. Right now they're set to Tuesday, but the idea is that you may not know what that will be. It could be a different day of the week, but for this application, you want to change it to Monday. Go ahead and just display the results to the console; we won't get any more complex than that. So that's the assignment. Go ahead, give it a try, and when you're ready, let's review the results.
32. Exercise 4 Finish: All right, let's take a look at how we might do this. Now, before I begin, I think it's important to mention that the regular expression for this exercise could have been done a number of different ways. So if mine ends up being different from yours, don't let that throw you off. One thing that I think is important, though: make sure that the regular expression you write is accurate enough that it wouldn't match things that you wouldn't want it to match, and also that it's efficient in the way it does its matching. In the next section we're going to learn perhaps a better way to do this, but I wanted you to use what you've already learned about regular expressions. So let's go ahead and take care of this. Since I'm just logging the results to the console, all I'm going to do is set up a variable for the regular expression, set up a variable for the new text after I've replaced it, and then log that new text to the console. So nothing difficult. First, let's set up the regular expression. I'm going to use two flags, both the global and the case-insensitive flag. I don't want to have to worry about what the case of the letters is as I'm working with this. The other thing I'm going to set up is word boundaries. I think it's important that we have word boundaries on each side of this; that's the most accurate way to do it. In fact, if you don't have word boundaries, depending on how you write your regular expression, you could end up grabbing this word here and replacing that. Okay, so those word boundaries are important. Now, at the very end of every weekday is the word "day", so I'm going to have that: the end of my regular expression has to match "day". And the word boundary will make sure that nothing comes after that. Now, the rest of this is where it can be very different, depending on how you chose to approach it. I wanted to be more accurate.
And so character sets are mostly how I did that. To start off with, I entered all the letters that a weekday could start with, right there. Those are all the ones it could start with. So that's the very first thing that needs to be in this expression in order to get a match. And then I went ahead and entered any letter, just to simplify things. I could have gone through and done each letter position this way, but that was probably more work than it's worth. The repetition on that set will be one comma four, and the reason I'm doing one as opposed to two, and four as opposed to five, is because I also have another character set that I want to put after this. This is the very last letter before the word "day" in the days of the week. We can handle that with just four letters, so it worked out pretty nicely that way. So with this letter accounted for and this letter accounted for, that means the letters in the middle number either one, for something like Monday or Friday, or as many as four, for something like Wednesday. So that's how I chose to do this. All right, so that's my regular expression. I'm going to put a comma and declare another variable. This will be the new text variable, and I'm going to set that equal to text1.replace. Then I pass in the regular expression, because the way replace works is that we pass in what we want to replace and then what we're going to replace it with. So that will be Monday. Now, grabbing this variable here works because the variable is declared in content.js, and that JavaScript file is included as part of the HTML file. So that works for us. That's what I need to set up. Now let's go ahead and log to the console. I'm just going to log the new text to the console. Save that, copy this file path so we can test it out, and let me go to the console. Just open that up, and it looks like the text is coming out here: Monday, Monday, Monday, Monday there, and "some days" has not changed, so it looks like that's working for us.
So, as I said, not a very difficult exercise. But it gives you a chance to apply some of the things we've learned, and the part I think is especially important is those word boundaries. All right, let's move on to the next section.
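The walkthrough above can be sketched roughly as follows. The exact pattern is never read out in the lecture, so the regular expression here is a reconstruction from the description (first-letter set, one to four middle letters, last-letter set, then "day", all wrapped in word boundaries), and the contents of text1 are a hypothetical stand-in for whatever content.js actually holds.

```javascript
// Hypothetical stand-in for the text1 variable defined in content.js.
const text1 = "Somedays I work Tuesday, sometimes tuesday and Thursday.";

// Reconstructed from the description, not copied from the course files:
//   \b          word boundary before the day name
//   [mtwfs]     any letter a weekday can start with
//   [a-z]{1,4}  one to four letters in the middle
//   [nsir]      the last letter before "day" (moN, tueS, frI, satuR ...)
//   day\b       the literal "day" followed by a word boundary
// Flags: g = replace every occurrence, i = ignore case.
const dayRegex = /\b[mtwfs][a-z]{1,4}[nsir]day\b/gi;

const newText = text1.replace(dayRegex, "Monday");
console.log(newText);
```

With this sample text, "Somedays" is left alone because the trailing "s" means there is no word boundary right after "day".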
33. Specifying Options: At times you may need to present options in a regular expression. It can be easier to present options than to try to figure out a single regular expression that covers all of them. For example, in the last section we were trying to match all the days of the week. We came up with a pretty good regular expression, but it still could be wrong. Let's take a look at that. So here is the regular expression we came up with. What I did is I specified the first letter and the last letter that comes before the word "day", and then we used a character set for the letters in between, with repetition, and that works great. We also had a word boundary added to both ends, and this regular expression was pretty successful. However, there could be some mistakes. For example, let's say there was a misspelling here. Suddenly that becomes a match, because it fits the criteria that we established here. So in this type of situation, where there are a limited number of options and we know exactly what those options are, it may be better to simply express those alternatives in your regular expression. The pipe symbol splits a regular expression into multiple alternatives, and I think that's an important way to think about the pipe symbol. It's splitting the regular expression into multiple alternatives so that it can match this one or this one or this one. Every time you encounter a pipe symbol, that is an "or" for your alternates. You can think of it like the OR operator in JavaScript, except that in JavaScript we use two pipe symbols, and in a regular expression we only use a single pipe symbol. So let's look at how we would do this using alternates. I'm going to keep the word boundary, and I'll also keep the ignore-case flag and the global flag for now. So the first option is Monday. Then I enter a pipe symbol. The second option is Tuesday.
Now we see the match for all the Tuesdays in there. Pipe symbol again, so every time we put a pipe symbol, that indicates there could be another alternate. As it is going through the text, it is checking everything against each one of those alternates, and the last would be Sunday. And so it knows it needs to have word boundaries on both sides, and then it can simply check each word to see if it matches any of the alternates that we've specified. Now, this regular expression prevents a match on something like this that has a misspelling, so it is much more accurate. The pipe symbol expressing alternates is something that's very valuable in creating our regular expressions. One thing I want to mention: let's see what would happen without the global flag. I turn it off, and of course it only matches the first occurrence. As it encounters each word, because we have the word boundaries, it goes through each of these alternates, and as soon as it matches one of them, it stops at that point. So it doesn't go through and find a match for Monday, Tuesday, Wednesday, Thursday, and Friday. It doesn't do that. It's just looking for a match for any of these alternates, and when it finds a single match, then it's good. Now, obviously, with the global flag, we could have one of these be another day of the week and it finds a match for it. I think that was probably intuitive, but I think it's important to point out, so you understand how these alternates are really working. This example uses a total of seven alternates. In most cases, when you use the pipe, you won't have that many. You'll be choosing between two or three alternates. Those are usually the situations I've seen. Also, in this example, we use whole words: Monday, Tuesday, Wednesday, and so on. But remember, these words are simply a pattern. What's really being expressed here is a pattern. It just happens that we see it as a word. And so what does that mean?
That means that we can use the pipe symbol to propose alternate expressions. So let's look at it another way. This way is much less accurate than what I have here, but it shows that I can set up an expression and then use the pipe symbol to add an alternate expression. So first off, I have a character set, a through z, and for Monday that would need to be repeated three times and then end in the word "day". And so that works; that finds Monday. Now I could add a pipe symbol, and let's account for Tuesday. This time I will use a word boundary again, a through z, and this time there will be a repetition of four before the word "day". That found Tuesday. And then we can continue on with Wednesday, which would be six. Oops, I forgot to include my word boundary. So another word boundary, and then, of course, our character set, then our repetition of six, "day", and end with the word boundary. And we'd continue on that way, so you can see that I've been able to use the pipe symbol to alternate through expressions as well. Hopefully this opens up what is possible with the pipe symbol. The ability to express alternates in a regular expression can not only simplify certain regular expressions but also opens up what you're able to do. All right, let's move on to the next topic.
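As a rough sketch of the difference between the two approaches in this topic, the snippet below compares the character-set pattern from the exercise with a pattern built from explicit alternates. The sample sentence, the misspelling "Moonday", and the variable names are made up for illustration, and each alternate is given its own pair of word boundaries here, which is one of several ways to write it.

```javascript
// Hypothetical sample text containing one misspelled day ("Moonday").
const sample = "Meet on Tuesday or Moonday, not on Friday.";

// The looser character-set pattern reconstructed from the previous exercise.
const looseRegex = /\b[mtwfs][a-z]{1,4}[nsir]day\b/gi;

// Explicit alternates joined with the pipe symbol. Each alternate carries
// its own \b on both sides, so every day must stand alone as a word.
const days = ["monday", "tuesday", "wednesday", "thursday",
              "friday", "saturday", "sunday"];
const altRegex = new RegExp(days.map((d) => `\\b${d}\\b`).join("|"), "gi");

console.log(sample.match(looseRegex)); // includes the misspelled "Moonday"
console.log(sample.match(altRegex));   // only the correctly spelled days
```

Building the pattern from an array is only a convenience; typing the seven alternates out by hand gives the same regular expression.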
34. Using Grouping: It is time to take a look at grouping in regular expressions. Grouping is accomplished with parentheses. The use of parentheses in regular expressions allows you to group portions of that expression. Now, in JavaScript, we can use parentheses to group parts of an expression as well. When we use parentheses in JavaScript, it establishes precedence. Code inside parentheses is evaluated before other operators are considered, and this is true with parentheses in regular expressions as well. It establishes precedence. There are other applications of grouping, but we will be looking at those in later topics. I simply want to introduce the idea in this topic. Let's look at an example that shows the value of grouping. Let's say you are trying to match a number that looks something like the number shown here. Let me give you the rules for this number. It consists of five pairs, and each pair consists of a letter between a and d and then a number between one and five. So here are the pairs of this number. We're looking at five different pairs, and as you can see, each starts with a letter, and they're all between a and d, and then the second item is a number, and that's between one and five. And then it has another pair that follows the same rule. It's five pairs total. Now how would we set this up in a regular expression? Well, let's take a look at that. So here are some numbers. Some of these follow that pattern and some do not, and we're going to try to find out which do and which don't using a regular expression. So we have a pair, and each pair starts with a letter between a and d, so we can do a character set like that, and it finds all of the letters. That is followed by a number between one and five, and we can do a character set for that as well. Now you can see that we're matching each of the individual pairs. We're not matching the entire number. We're matching each individual pair.
So how do we repeat this? Well, let's try repeating it five times and see what happens. Now we're not matching anything. Why did that not work? Because the repetition applies only to the item immediately to its left. It's repeating the number. So if we were to come in here and put five numbers, suddenly it matches: it has a letter and then five numbers, each number one through five. So that's what's going on there. We can solve this using grouping. We can make both of these into a group, and then we can repeat the group five times. So let's put parentheses around that. Sure enough, now we get matches, and those are the three which match the pattern we talked about. There is an "e" here, so that doesn't match. There are two letters together there, so that doesn't match. There's a six, so that doesn't match. There are two numbers together here, so that doesn't match. And so using grouping, we can solve that problem. Now let's look at another example. Let me replace the text. This is similar to the example we dealt with in the last topic. We're going to use days of the week again. We want to match those days of the week, so I am going to put Monday, and then we're going to use the pipe symbol to allow us to specify alternatives. So Tuesday is also a possibility, and Wednesday is also a possibility. Now, I don't want to grab these days of the week if they're not a full word unto themselves. So in this case, "Tuesdays" is something I don't want to grab. I only want to grab it if it is a word. And so we can check that by putting a word boundary around these, right? So I put a word boundary there and a word boundary there. Now, that works on Wednesday, because a period actually qualifies as a word boundary. So Wednesday is working. Monday is working. But why is it grabbing "Tuesday" out of "Tuesdays"? It shouldn't be grabbing that. That's because, once again, the pipe splits the whole expression: Monday has a word boundary before it but not after, and Wednesday has a word boundary after it but not before. And Tuesday?
It doesn't have a word boundary on either side of it, so I could put something ahead of Tuesday as well and it would still match. But if I put something ahead of Monday, it would no longer match because of that word boundary. And if I put "Wednesdays" here at the end, it would no longer match because of that word boundary. So how do we do this correctly with a single set of word boundaries, instead of putting word boundaries around everything? We can do that with grouping again. So I'm going to put all of this inside of parentheses. There we go. Now, if I put something in front of Monday, it doesn't match. "Tuesdays" doesn't match because of the "s". And for Wednesday, if I put something after it, it wouldn't match. So grouping helps us solve this problem again. As you've been able to see, in certain situations grouping can be very valuable. But there are other applications for it as well. So let's move on to the next topic.
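The two grouping examples from this topic can be sketched as below. The test strings are invented for illustration, and the ^ and $ anchors on the pair patterns are an addition (the lecture tests against a block of text in RegexPal) so that each string can be checked as a whole with test.

```javascript
// Pairs example: five pairs of (letter a-d, digit 1-5).
// Without a group, {5} repeats only the digit set to its left.
const ungroupedPairs = /^[a-d][1-5]{5}$/;
// With a group, {5} repeats the whole letter+digit pair.
const groupedPairs = /^([a-d][1-5]){5}$/;

console.log(ungroupedPairs.test("a12345"));   // true: one letter, five digits
console.log(groupedPairs.test("a1b2c3d4a5")); // true: five valid pairs
console.log(groupedPairs.test("a1b2c3d4e5")); // false: "e" is outside a-d
console.log(groupedPairs.test("a1b2c6d4a5")); // false: 6 is outside 1-5

// Days example: without a group the pipe splits the whole expression,
// so \b applies only to the first and last alternatives.
const ungroupedDays = /\bmonday|tuesday|wednesday\b/gi;
const groupedDays = /\b(monday|tuesday|wednesday)\b/gi;

const dayText = "Monday. Tuesdays are busy. Wednesday.";
console.log(dayText.match(ungroupedDays)); // also grabs "Tuesday" from "Tuesdays"
console.log(dayText.match(groupedDays));   // only the stand-alone words
```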
35. Using Grouping with JavaScript: All of the regular expression information that is covered in this course can be used, of course, with JavaScript as well as many other languages. So why have a topic with a specific JavaScript example? Well, grouping presents a unique application in JavaScript that I wanted to show you, so that is what we will be doing in this topic. Let me first explain the scenario. We will keep it somewhat simple. Let's say we are receiving date information as input from a form or something like that, and we want to take that date information, separate out the month, the day, and the year, and store that information separately. Just to keep it simple, let's say the date information that we're receiving will be required to come in a format like this. It starts with the year, four digits. Then it has a separator, and let's say that separator could be a slash, like we show here, or it could be a hyphen, or it could be a period. Then we have a month. Let's say the month could be one digit or two digits. Then another separator, same situation as the first separator, and then the day can also be one or two digits. So that's the pattern we want to match. But in addition to matching that and making sure that the date is entered in that pattern, we then want to separate out the year, the month, and the day. Now, normally in JavaScript, when we separate things out, we use split on a string, so we would receive the data as a string and then split on whatever is used as a separator. But because we're allowing different separators, it makes it a little more difficult to do that split. We would have to use some if statements to figure out what the separator is before we could do the split. But when we're using a regular expression to check the input anyway, to make sure it matches a particular format, we can also use that regular expression to break it out, as long as we're using grouping. And that is because grouping captures data.
So first, let's solve the regular expression, and then we will look at the JavaScript example. I'm going to jump to RegexPal. That's where we'll solve it first. Here I have a sample date entered: year, month, day. Now, this data is coming from an input field of some sort, and I want to make sure that that's the only data we're receiving. So I am going to specify the start and the end of that string. The first thing that we will get is a year. Let's put this inside of a group. So we put the parentheses, and the reason we're putting it into a group is because this is going to capture this data for us as well, so that we can use it. We want to have four digits, so we'll just do it that way. Now, after that year, we can have a separator, and that separator could be one of three characters. We mentioned a hyphen, a dot, or a slash. So obviously we need a character set, and we'll go ahead and put those in like that. Now we have another group, and this will be the month, and that could be one or two digits. So let's do it this way: put a digit, and then we indicate that the repetition can be one or two. Now we have another separator. We're using that same pattern again for the separator, so I will do that as a character set. And then one last group, and that last group is going to look exactly like the month group, because it could be one or two digits as well. And there we have a match. It matches our date. Let's see what happens if we put two digits. It still matches. How about if we have a different separator? Still matches, so we can see that this regular expression is working for us. So we have a regular expression, and now we can transfer this and use it within our JavaScript. I'm going to copy the code of my regular expression, jump to Sublime, and we'll set up the JavaScript. Now, here's some data that we're going to be using as an example. As I mentioned, this data would be coming from a form.
I'm just going to put it in here so we can see how we would work with it. To begin working with it, we need to create a regular expression. I'll go ahead and paste in what we figured out already. There's our regular expression. I'm going to declare another variable as well, and this is going to be an array. This is where I want to show you what we can do with regular expressions. Using the exec method of regular expressions in JavaScript, we can break that information out into an array. Now, when we looked at exec previously, we indicated that exec gave us some additional information. Well, with grouping, we get even more information. It breaks each of these groups out into the array. We already had one element in the array, and that is the matched input. It will save that in position zero, but exec will also break out the groups into positions one, two, and three, and then we'll be able to use them. Okay, let's look at this first, and then we'll come back and finish up this example. So I'm going to save that. Let me just jump back and refresh, and then let's open up the console and take a look at that array. And here's our array. Position zero is the original data, but notice position one is the year, then month, and then day. So we're able to separate those out, and we still have the other information that is provided with the exec method. Knowing that allows us to solve this problem without using split. So let's jump back and finish this off. Just to give us a place to store this information, I'm going to declare three more variables like that, and then we're going to do an if statement to check to see if the data is correct. So: if the regular expression's test method, tested on data, is equal to true. That returns either true or false, if you remember. If that's equal to true, then we go ahead and work with the data; else we would do some sort of error handling.
I'm just going to do a console.log statement here: "Wrong format", something like that. Now here we can go ahead and put the information in the correct place. Position one, the second position in the array, will have the year. We do the same thing with the month, just using two, and for the day, I just use three. So here we have our simple solution set up. Let's go ahead and take a look at those variables. Run it again and take a look at those variables. We're getting the correct information. So we refresh that, and now if we just look: year 2018, month 3, day 9. So there we can see that we're able to solve that same problem without using split. And this is advantageous to us because if we're allowing multiple different separators in our match, then it would be a little more difficult to use split for that. But with the regular expression already created, we can just use the exec method to pull that data out. So that's an important concept about grouping: groups will capture information for us, and we're going to be looking at that in more detail in the next topic. So let's move on to the next topic.
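The date example built across this topic might look roughly like the sketch below. The pattern follows the spoken description (four-digit year, a separator that is a hyphen, period, or slash, then a one- or two-digit month and day); the variable names and the sample value "2018/3/9" are assumptions based on the values read out at the end of the lecture.

```javascript
// Sample input; in the lecture this would come from a form field.
const data = "2018/3/9";

// ^ and $ make sure the date is the only thing in the string.
// Each pair of parentheses is a capturing group: year, month, day.
// Inside the character set, a leading hyphen is a literal hyphen.
const dateRegex = /^(\d{4})[-./](\d{1,2})[-./](\d{1,2})$/;

let year, month, day;

if (dateRegex.test(data)) {
  // exec returns an array: [0] is the full match, [1]..[3] are the groups.
  const parts = dateRegex.exec(data);
  year = parts[1];
  month = parts[2];
  day = parts[3];
} else {
  console.log("Wrong format");
}

console.log(year, month, day);
```

Note that the captured values are strings. Calling exec once and checking the result for null would avoid matching twice, but the sketch keeps the test-then-exec order described in the lecture.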
36. Understanding Capturing Groups: Parentheses in regular expressions are commonly referred to as capturing groups. The reason for the word "capturing" is that they not only group parts of your pattern, but they also capture that portion for use later on if needed. We saw this to some extent with the example we did in the last topic. The exec method used the capturing nature of groups to put the results into an array. In this topic, we're going to look at capturing groups in more detail. Let's start with a simple regular expression. Let me jump to RegexPal, and I'm going to put in a group with two letters, and down here in our text, I'll put the word "yoyo". Right now we're capturing both of those individually, so we get two captures out of that. Now, what if we wanted to capture the full word? Well, this is set up as a group, so we can refer to that group and cause it to be repeated. The way we refer to a previous group is we put our escape character right there, and then we put the number of the group that we want to refer to. Right now in this regular expression, we only have a single group, so that would be group number one. So I put a one, and there we can see that it captures the entire word "yoyo" instead of two separate captures. All right, let's do a more realistic regular expression. In the previous topic, we were working with dates, and we had a regular expression like this. Let me paste it in. The purpose of that was to capture dates that have a four-digit year, a one- or two-digit month, and a one- or two-digit day, and the separator could be any of those three characters. So let's go ahead and put a date in here that we can capture. Just do something like that. Now, as you look at this regular expression, you can see that we have several groups, but we also have places in the regular expression that repeat, that are the same. For example, this separator, this character set that has the separator characters in it.
It's repeated here, and this group is repeated here. This is a great example of how we can use capturing groups. This particular group, which is group number two, we could simply use again here at the end. Let's go ahead and try that. And there we still get the capture. So this indicates that we are repeating what is in group two, and this is group two here, because this is the first group, the first one with parentheses, and this is the second group. Now, we could also repeat this portion, but we need to create a group out of it first. So put parentheses around that. Now that is the second group. So let's repeat the second group there and then repeat the third group. And there we continue to get that capture, so it makes the regular expression a little simpler. There is less to type in by using capturing groups when we have patterns that repeat. So by default, a group captures. However, we can make a group non-capturing, and this is done by putting a question mark and a colon at the start of the group. Let's take a look at that. This group right here we are not repeating anywhere, so we could make it a non-capturing group for this example. Now notice what happens when I make this a non-capturing group. In order to do that, we put a question mark and a colon at the start of the group, and now that is a non-capturing group. But notice we're not getting a match anymore. The reason is that when we turn this group into a non-capturing group, it is no longer a group we can refer to. So this is actually the first group, and this is the second group. We just change those to one and two, and then we get a match. So we turned this into a non-capturing group, and now we're matching this and this. Now let me show another example of the non-capturing group, and let's do it with the example that we used in the previous topic. If you remember, this is the example we were using in the previous topic, and we were using groups to pull out the year, the month, and the day using the exec method.
Now, these groups are capturing, and so we're able to grab them and put them into these variables from the array that is created with exec. Exec will put these into elements in the array because they're capturing groups. Just to show you that that's what it's doing, let's make one of these a non-capturing group and see what happens. Let's go down here to the end and make this one a non-capturing group. Now let's save that, go back to our HTML page and refresh it, and then open the console and look at year. That is correct. Month is correct. Day is undefined, because I turned this into a non-capturing group. So now it no longer puts it in the array, and therefore we're not able to grab it from the array. That shows you how the exec method is using those capturing groups as well. So by default, groups are capturing. Now, when would you want to use a non-capturing group? Why not just always use capturing groups, even if you don't use them later on in your expression or as part of the exec method like we've done here? Well, there are two possible scenarios where you may want to use non-capturing groups. The first one is, if you're creating a group and you know you're not going to use the capture, then you could just make that a non-capturing group. But the second scenario, and one that is probably more applicable, is if you already have a regular expression that you've created, and you're using groups within that regular expression and you're capturing those groups, meaning you're referring to them later on in the regular expression. But then you realize, oh wait, I want to group this information as well. By grouping that information, it would throw off the groups that you already created. So, for example, let's say we were in this sort of situation with a group that is capturing.
And then we decided, wait, I want to make a group out of these four digits, just for clarity's sake or for some other reason. But if we make a group, then we have to change these numbers at the end to make sure they're referring to the right capturing group. So this could be a situation where we use a non-capturing group. We simply create it as non-capturing when we enter it, and then it doesn't throw off the groups that we're referring to later on. So that's probably a more practical reason. All right, let's move on to the next topic.
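A rough sketch of the ideas in this topic follows. The patterns are reconstructed from the spoken description and the sample strings are invented. One thing worth seeing in the output, which the next topic explains: a back reference such as \3 matches the same text that group three captured, so in the date pattern the day has to equal the month for the match to succeed.

```javascript
// A group on its own finds "yo" twice; adding \1 repeats what group 1
// captured, so the whole word is matched once.
console.log("yoyo".match(/(yo)/g));   // two separate matches
console.log("yoyo".match(/(yo)\1/g)); // one match for the full word

// Date pattern rewritten with back references (reconstruction):
//   group 1 = year, group 2 = separator, group 3 = month,
//   \2 = the same separator again, \3 = the same text as the month.
const dateWithBackrefs = /^(\d{4})([-./])(\d{1,2})\2\3$/;
console.log(dateWithBackrefs.test("2018/3/3")); // true
console.log(dateWithBackrefs.test("2018/3-3")); // false: separators differ
console.log(dateWithBackrefs.test("2018/3/9")); // false: day text differs from month

// Making the year group non-capturing with ?: removes it from the
// numbering, so the back references shift down to \1 and \2.
const nonCapturingYear = /^(?:\d{4})([-./])(\d{1,2})\1\2$/;
console.log(nonCapturingYear.test("2018/3/3")); // true

// With exec, a non-capturing group no longer appears in the result array.
const parts = /^(\d{4})[-./](\d{1,2})[-./](?:\d{1,2})$/.exec("2018/3/9");
console.log(parts[1], parts[2], parts[3]); // parts[3] is undefined
```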
37. Understanding Group Back References: Before we move on to a new concept, I want to make sure that there is no confusion about group back references, those back references that we create when we use the escape slash and a number to refer back to a captured group. This is an area that can easily be confused, and here's why. The back reference does not refer to the pattern that you put in the regular expression. It actually refers to the text that was captured. Remember, groups capture text. This part is important to remember. So when you do a back reference, it's referencing what was captured, not the actual pattern. Now let's look at some examples that help reinforce this idea. Here in RegexPal, we have a pattern that we used in a previous exercise. Let me just remove the pattern really quick so we can see the text. Basically, we have a letter followed by a number, followed by a letter, followed by a number, and so on. The letters need to be a through d, and the numbers one through five. And so when I paste these two character sets in, there's a through d and one through five, and we see how many matches we get. It matches each one of those pairs. Now, what we did in a previous example is we put these in a group like this, and then we could cause it to repeat that pattern for three sets, which we've done here, and we can see how that matches up. Now, what if we were to refer to this group? We have a group here. What if we were to refer to this group using a back reference? Could we then accomplish the same thing, if we referred to it in a pattern like that? Well, the answer is no. We get no match whatsoever. And the reason we get no match is because the back reference is not referencing the pattern; it's referencing what was actually captured. In this case, the first thing that is captured is "a1", the first part of this.
So if I were to add an "a1" right here, then we get a match, because it matches this group and then it matches this back reference, which is the exact same text that this group captured. Something else that can reinforce this: if we turn this into a non-capturing group, we have no match. All right, let me do one more example. Let me get rid of this regular expression, and I'm going to copy in some HTML text. Very simple, just a strong tag and an italics tag. That's all we have here, one right after the other. Now, what if I wanted to create a regular expression that would capture the tag and everything in between, and it would make sure that there were no tags in between? It would capture what we have at the start, and then it would realize, oh, we started with strong, so I need to end with strong. Let's look at how we can set that up. This would use a back reference. We're going to capture this portion of the tag, then we're going to indicate that anything can come in between, or certain things can come in between, and then we will use a back reference to make sure this last part is exactly the same. And it will work because we're capturing the actual text, not the pattern. All right, so let's see how we would set that up. The first thing we start with is the opening angle bracket. Then we want to create a group, because this part is going to be our group, and I'm going to use \w because there could be certain word characters in there, and we will do a repetition of one or more. That is the end of my group, and then I want to use the closing angle bracket. And there you can see we capture any tag that is at the front. All right, now, in between the opening and closing tag, we could have word characters or space characters. That's what I'm going to say. So we'll do a word character, and we'll also do space characters like that. And this could be one or more.
Now we're capturing clear up to there. To end this off, we can add an opening angle bracket, then the slash, and then a back reference. And there we go. Now we've captured based upon the tag, so it ends our match as long as the tag at the end matches the starting tag. And that is because the back reference, once again, is referring to the text that was captured, not the actual pattern. So that's a very important concept to remember when you're dealing with back references. All right, let's move on to the next topic.
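Both examples from this topic can be sketched as follows. The sample strings are made up, and the tag pattern is a reconstruction of what is described (opening bracket, a captured tag name, closing bracket, word or space characters, then a closing tag built from a back reference); treat it as an illustration of back references rather than a general HTML matcher.

```javascript
// A back reference repeats the captured TEXT, not the pattern.
const repeatedPattern = /([a-d][1-5]){2}/; // any two valid pairs
const backReference = /([a-d][1-5])\1/;    // the same pair twice

console.log(repeatedPattern.test("a1b2")); // true
console.log(backReference.test("a1b2"));   // false: "b2" is not "a1"
console.log(backReference.test("a1a1"));   // true

// Tag example: \1 forces the closing tag name to equal the opening one.
const tagRegex = /<(\w+)>[\w\s]+<\/\1>/g;

const html = "<strong>bold text</strong><em>italic text</em>";
console.log(html.match(tagRegex)); // one match per well-formed tag pair

const mismatched = "<strong>oops</em>";
console.log(mismatched.match(tagRegex)); // null: closing tag differs
```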
38. Using Lookahead Groups: We need to cover one more topic with grouping in this section, and that has to do with a thing we call lookahead groups. Now, I am not a fan of this term for this feature. I don't think it really describes what this feature does, so let me give a definition, and then we will look at some examples, so you will know when it might be helpful to include a lookahead. A lookahead group allows us to use a particular pattern to determine a match, but everything in that group will not be part of the results. It will affect what is matched but will not be part of the results. Now, how is this different from a non-capturing group? And secondly, why would you want to do this anyway? Let's deal with the first question. With a non-capturing group, no index is created for that group, so we can't refer back to it or include it in our JavaScript solution. However, the non-capturing group is still a part of the results. It is still a part of what gets matched. With a lookahead group, it is not part of the results. It helps determine what is matched, but the pattern that is part of that lookahead group does not become a part of the results. So once again, the question: why would we want to use this? Let's look at an example. First, to create a lookahead group, we use the question mark and equals sign together, and those come at the start of the group. So right after the open parenthesis, we put the question mark and then the equals sign. As you remember, a non-capturing group uses the question mark and the colon. So the difference is the equals sign as opposed to the colon. Now, for this example, let's say we want to match domains, but we only want to include in the match the portion of the domain that comes before the .com.
We'll just use .com for this example. Let's first look at what the regular expression may be like in RegexPal, and then once we've done this in RegexPal, we'll also do this as a JavaScript solution, because I think as a JavaScript solution we'll see how this could be advantageous. So first, the domain itself can include any word character. That's exactly what we're going to put here: a word character, and then one or more of them. However, even though we only want to extract the domain, and the only part we want to match is the domain, not the .com, we want to force the match to include .com, so that it will not match unless that .com is there. We do that with a lookahead group. So let me start by creating the group and then a question mark and equals sign. Now, to do .com, we need to escape the period, because, as you know, the period is a special character which represents a wildcard. So we escape that, because we want to use the actual period, and then we have the word "com". All right, so that's our regular expression. Now let's see what happens if we have allthingsjavascript.com. Notice we get a match, and the only thing that matches is this part of the domain. It doesn't match the .com; however, it does force it to have a .com. If we remove any part of that, we no longer have a match. So this lookahead group here is saying there has to be a .com after this part of the text, which is any word character one or more times. There must be a .com after that. If there's not a .com, then we don't get a match. If there is a .com, the domain could be any length, down to one character. All right, so you can see how that is working. We're capturing the portion before the .com. Now let's look at how that could be useful in a JavaScript example. Let me jump to Sublime and we'll take a look at that. So here we have some data, just a string, and it contains three domains: allthingsjavascript.com,
google.com and youtube.com. Once again, I'm just using .com, just to make it simple so we can see how this works. Now we want to extract the domain portion. That's all we want to get out of that. So let's set up a regular expression. And once again it's going to be the same thing we had in RegexPal. So we're going to have word characters, one or more. And then we're going to create our lookahead group with a question mark and equals sign. And then we've got to escape the period, then com. There's our lookahead group. Let me close our regular expression, and I'm going to use the global flag here because we're going to use match with this JavaScript solution, and when match is used with the global flag, it will create an array of all of the matches. So there's our first declaration. Now let's declare an array, because that's what we're going to put the results into. And we will do match, and remember, match is done on a string. So we'll do data.match and then we pass in the regular expression like that. Okay, now, let's go ahead and save that and let's see what we get. Let's jump out to our web page and refresh it. Let me open the console and just type in the array, and notice we've got an array with three items, and the items that are extracted are the domain portion. All right, so that works well. Let me just make one tweak to this. Let's say this had the other portion of the URL, like this. I'll just do two of them. Save that, refresh, take a look at that array again, and we get the same results. So it's removing the http://. Now why is it doing that? Well, notice our regular expression: we want word characters that are immediately followed by .com, and so therefore it is extracting just this portion of it. All right, so that's a nice feature of a lookahead group, a nice thing you can do in JavaScript with a lookahead group. All right, now we're going to do a second example.
And for the second example, we're going to use RegexPal, so let me jump back there. I'm going to remove what we already have. Let me explain what we're going to try to do. We want to create a regular expression for passwords. Now, obviously, passwords can include a number of different characters, and those characters can be in any order, so it can be difficult to create a pattern, a regular expression, to match passwords. However, with lookaheads we can do this more easily, because a lookahead group does not consume any characters. It helps determine the match, but it does not consume those characters. It does not capture any characters, so we can use several lookahead groups together. What will happen is they will force the password to contain their pattern, the pattern defined in the lookahead group. But then the next lookahead group will work on the entire text, beginning at the start of the text again, looking to see if it has a match. And the reason for that is that when a lookahead group determines a match, it doesn't consume any of those characters. So the next lookahead group will start at the beginning of the text and check to see if it matches. And then the next lookahead group will do the same thing. Okay, maybe as I explain this you're getting an idea of where I'm going. But if not, we're going to do this so you can actually see it. So let's put together this example, and let's say we want to verify passwords that must contain at least eight characters. So we want to check for eight characters, at least one uppercase character, at least one lowercase character, and a number. So those are the four things it must contain. All right, let's look at how we would do that. So, first, for this password, we want to specify the start and end of the string. So we have those two characters for start and end.
In between those, we want to put our lookahead groups, and we want to put them at this point, right after the start character, because we want them to start at the beginning of the string and check the string for that pattern. So for our first lookahead group, let's do one that indicates that it must have eight characters. We first define it as a lookahead group, so that it won't capture any characters, and then that could be any character, so we use our wildcard, and then it can be eight or more, just like that. And then we close our lookahead group. So this is going to make certain that the string has eight characters in it, but it's not going to capture any of those. So then we can define another lookahead group, and this second lookahead group will start at the beginning of the string again, because the first lookahead group did not consume any characters. And let's do this one for an uppercase letter. So we define it as a lookahead group. Before we encounter an uppercase letter, we could encounter many other characters, zero or many. And then we put a character set for uppercase characters, like this. Okay, so see what that is saying: there can be characters ahead of this, but we want at least one uppercase letter between A and Z. That's what it's saying. And since it's a lookahead group, it's not going to consume anything. So the next lookahead group will start at the beginning of the string again, and this one we're going to do for lowercase characters, so it is going to look very similar to the one we just did: question mark, equals sign, wildcard, zero or many, and then a character set a through z, then close our lookahead group. Now we have to check one more thing, and that's a number. So one more lookahead group: question mark and equals sign define it as a lookahead group, then once again we have a wildcard, because other characters could occur before a number, and then another character set of zero through nine.
Now, one more thing we need to add before this regular expression is going to work for us: before the ending anchor, we have to specify that there can be other characters. So let's do a wildcard and then a zero-or-many quantifier. All right, so let's see how our password pattern works. First, let's have a lowercase character, then an uppercase character, and let's have a number, and we don't get a match there because there are not eight characters yet. So if we add a few more characters, we now have six, seven, one more character, and we get a match. So basically we are matching this, but we don't get a match if these lookahead groups are not met. They have to be met, even though they're not part of the capture. Okay, so if we remove one character, just like we saw, it does not match. Let's say we remove the number now and replace it with a letter. We do not get a match. Put a number back, and we now get a match. Let's remove our uppercase letters and see what we get. No match. And so it's forcing the text to have what we define in each of these lookahead groups. And then it matches the text because of our wildcard and repetition character here at the end. So this is a great example of how to create a regular expression for a password, one way to do it, and it's possible because of lookahead groups. It's a pattern that you may want to use in multiple places in your programming work. All right, let's move on.
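The two examples from this lecture can be sketched in JavaScript roughly as follows. This is a minimal sketch, not the instructor's exact file: the sample string, the variable names and the isValidPassword helper are assumptions made for illustration, and the patterns are written out from the spoken description.

```javascript
// Example 1: extract only the part of each domain that precedes ".com".
// The sample data is an assumption modelled on the lecture's three domains.
const data = "http://allthingsjavascript.com http://google.com youtube.com";

// \w+ matches the domain name; (?=\.com) requires ".com" to follow
// without making it part of the match. The g flag makes match()
// return an array of every match.
const domainRegex = /\w+(?=\.com)/g;
const domains = data.match(domainRegex);
console.log(domains); // [ 'allthingsjavascript', 'google', 'youtube' ]

// Example 2: a password must have at least eight characters, one uppercase
// letter, one lowercase letter and one digit. Each lookahead is checked from
// the start of the string because lookaheads consume no characters; the
// trailing .* is what actually matches the text.
const passwordRegex = /^(?=.{8,})(?=.*[A-Z])(?=.*[a-z])(?=.*[0-9]).*$/;

// Hypothetical helper name, used only for this sketch.
function isValidPassword(password) {
  return passwordRegex.test(password);
}

console.log(isValidPassword("aB3defgh")); // true
console.log(isValidPassword("aB3defg"));  // false: only seven characters
console.log(isValidPassword("aBcdefgh")); // false: no digit
console.log(isValidPassword("ab3defgh")); // false: no uppercase letter
```

Keeping each rule in its own lookahead means a rule can be added or removed without touching the others.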
39. Using Negative Lookahead Groups: In the previous topic we described lookahead groups, which basically require the pattern to include what is a part of that group without that information being part of the match. So the match will not happen if the text doesn't meet that lookahead group. Well, we also have negative lookahead groups, for when we want to force the pattern to only be a match if the text does not include something that's part of a lookahead group. So a regular lookahead group has a question mark and equals sign, and a negative lookahead group uses a question mark and exclamation point. So let's take a look at that really quick. I'm going to jump back to RegexPal, with the example we were working on with passwords. Right now, this is forcing the password to have every one of these attributes: at least eight characters, an uppercase letter, a lowercase letter, and a digit zero through nine. So if I add an uppercase letter here, we then get a match. Well, let's change this lookahead group to a negative lookahead group. So we just put an exclamation point there, and now we don't get a match, because there is a number in it. So it's saying: must have this, must have this, must have this, but not this. So if I remove that number, we then get a match. That works. So that is how a negative lookahead group works. There may be times where you're not simply trying to make sure the text includes something, but you're trying to make sure it doesn't include something. So be aware that that is available. Now, before we finish this topic, I just want to come over here to the cheat sheet again, which is a part of RegexPal, because we have now covered almost everything that is a part of this cheat sheet here under groups and lookaround. This is what we've been dealing with in this section: capture groups, back references to a group, non-capturing groups, positive lookaheads and negative lookaheads.
And the reason I wanted to visit this again is to remind you that this does exist. If you forget the exact character for what you're trying to accomplish, take a look at the cheat sheet. It can be hard to remember all these characters because they're not necessarily intuitive, but this cheat sheet can help you remember them and then apply them as you need them. All right, let's move on to the next topic.
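As a rough illustration of the change described above, the sketch below swaps the digit lookahead of the password pattern for a negative lookahead. The variable name and the test strings are assumptions made for this example.

```javascript
// Same password pattern as before, except the digit rule is now negative:
// (?!.*[0-9]) fails the match if a digit appears anywhere in the string.
const noDigitPassword = /^(?=.{8,})(?=.*[A-Z])(?=.*[a-z])(?!.*[0-9]).*$/;

console.log(noDigitPassword.test("aBcdefgh")); // true: no digit present
console.log(noDigitPassword.test("aB3defgh")); // false: contains a digit
console.log(noDigitPassword.test("abcdefgh")); // false: still needs an uppercase letter
```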
40. Exercise 5 Start: We have come to our fifth JavaScript exercise. We're going to take the opportunity to apply some of the things we've learned in this section to a JavaScript example. So let me jump to Sublime, and I'll explain the task. Here is what I'd like you to do in this exercise. I've provided some data. It's an array, and it consists of names, and notice that all the names are listed last name first: last name, comma, first name. That's how we're listing the names. So what I'd like you to do is iterate through the data provided in this data variable and use a regular expression to create a new array that has the names listed first name and then last name. So, basically, you're switching the order of the names in this new array, and you're doing it with a regular expression. So take a few minutes and give that a try. Try to figure that out. When you're ready, you can view the results movie.
41. Exercise 5 Finish: All right, let's take a look at the solution, or at least the solution that I came up with. There are actually multiple ways this could be done. But even if you did it differently than the way I'm going to show you, I think there are things to learn from different solutions. My approach was to try to do this as efficiently as possible. And so since I was mapping one array to another, I chose to use the map method of arrays to do this. So let's take a look at what my solution is. First off, I want to define the regular expression. I would like the regular expression to have capturing groups, so that I can use those capturing groups to reverse the name. By putting the last name and the first name each in a capturing group and then using the exec method of the regular expression, I would be able to reverse them. So that is my approach. Let me set up that regular expression first. Now I want to account for the last name, and that consists of word characters. I could have been even more targeted with this. I could have just done a character set of letters and then a repetition. But I chose to do it this way. So word characters, and there's my repetition. And then there was a comma and a space, so I'm going to include that in the regular expression. And then let's create a group for the first name: word character again, and it repeats. All right, there's the regular expression. Now I want to assign the result to a new array. I'm just going to call the new array newData. And then, as I said, I'm going to use the map method of arrays to solve this. Now, the map method allows us to pass in a function, and that function acts on each value of the array and then returns a result that will be placed in the new array. So basically, map is going to iterate through every element in this array, and it will pass each element, one at a time, to the function that we pass into map. And then that function will act on that data.
So here's the function I want to pass in. I need to create a parameter for the function. This will be the value of the array that is passed in. Now let's go ahead and define the body of the function. First off, I want to run the exec method. So I do it like this, on that value. So the exec method runs on that value and we're storing the results in this variable here. Now, remember, when we use exec it stores the results in an array and then we can access those. Now I'm going to include a little if statement just to make sure that there was a match. If there's not a match, then I don't want to try to return something from the array or I'll get an error. So I'm just going to check to see if this is not equal to null. Remember, if exec does not find a match, it will return null. And so if it's not equal to null, then we can return the name in the correct order, and we do it by returning element two of the array, then we concatenate that with a space, and then element one of the array, like this. Now, if it is not a match, then I'm just going to have it return null. I could have it return the value, and then it would still be last name, first name. But I'll just have it return null for this example. All right, so there is our code that should make this happen. So let's go ahead and save that. Copy the file path here so we can test this out, open up the console, and then I'll just access that newData variable. And then we have an array of values, and notice that it's first name, last name. It has changed the order. All right, so that worked for us. Now, let me just mention one caution here. You would not want to do this with a global flag. If you put a global flag there, that could cause issues, because when you use the global flag with exec or with test, it saves the index position of the last match, and then the next match starts at that index position. That would throw the matching off.
And so you'll find if you include that, it doesn't work for every name. In fact, it works for every other name, and so we would want to make sure that was not part of our regular expression. So if you're having trouble getting this to work and you're doing a similar approach to what I did, it may be that you were using the global flag there, so you may want to check that. All right, so that is the solution to this JavaScript exercise. Let's move on to the next topic.
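The solution described above might look roughly like this. It is a sketch built from the spoken walkthrough: the sample names and the identifiers other than newData are assumptions, and the second half only demonstrates the global-flag caution.

```javascript
// Sample data (made up for this sketch), listed as "Last, First".
const names = ["Hancock, Steven", "Smith, Andrea", "Jones, Maria", "Brown, David"];

// Group 1 captures the last name, group 2 the first name. No g flag.
const nameRegex = /(\w+), (\w+)/;

function reverseName(value) {
  const result = nameRegex.exec(value);
  // exec() returns null when there is no match, so guard before indexing.
  if (result !== null) {
    return result[2] + " " + result[1];
  }
  return null;
}

const newData = names.map(reverseName);
console.log(newData);
// [ 'Steven Hancock', 'Andrea Smith', 'Maria Jones', 'David Brown' ]

// The caution about the global flag: with g, exec() remembers lastIndex
// between calls, even though each call receives a different string.
const globalRegex = /(\w+), (\w+)/g;
const skipped = names.map((value) => {
  const result = globalRegex.exec(value);
  return result !== null ? result[2] + " " + result[1] : null;
});
console.log(skipped);
// [ 'Steven Hancock', null, 'Maria Jones', null ]
```

With this particular data the global version fails on every second name, because the search for the next string starts at the position where the previous match ended.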
42. Introducing Unicode: Before we dive into using Unicode with regular expressions, I want to provide a brief introduction to Unicode. Basically, computers deal with characters as numbers. Before Unicode came along, there were several different systems for representing characters, and these were referred to as character encodings. There were many problems with these encoding systems, but that all changed with Unicode. Unicode provided a standard, uniform way of representing characters. You can think of it as a giant table that shows a number and what character that number represents. For example, here is a Unicode table. Over here on this side is the start of that number. If you mouse over a character, you can see what the Unicode representation is for that character. I can scroll down through a bunch of these, just numerous, numerous characters. Obviously, to have something for each language that is available, it would require a lot of characters. Now, the Unicode standard provides enough room to represent any of the characters you saw in that table. So since we're working with characters in regular expressions, we may need or want to represent them with the Unicode equivalent. So how do we specify a Unicode character in regular expressions? Well, it's pretty simple. It's with an escape sequence, and that escape sequence is \u, a backslash followed by the letter u. So here is a representation of a Unicode character: \u and then the Unicode number 0065, which happens to be the letter e. So now that we know how to represent these characters, let's move into the next topic and look at some examples.
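A quick way to check the escape mentioned above in a JavaScript console; this is only an illustrative sketch, and the variable names are assumptions.

```javascript
// \u0065 is the Unicode escape for the lowercase letter e.
const escaped = "\u0065";
console.log(escaped === "e"); // true

// The same escape works inside a regular expression.
console.log(/\u0065/.test("hello")); // true

// Going the other way: the code of "e" as a hexadecimal number.
const hexCode = "e".charCodeAt(0).toString(16);
console.log(hexCode); // 65
```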
43. Using Unicode Characters: Unicode escape characters can be used like any other character. And of course, the best way to see this is with examples. So let's jump right to RegexPal and take a look at some examples. Now here we have a name. This is left over from a previous example that we did: Andrea Smith. Let's say we wanted to see what a's are available there. We could do that with the Unicode character, using the escape sequence \u0061. That is the Unicode character for a. Now, how do I know that? Well, I look at a chart like this, or your computer may have a way to look up different Unicode characters as well, whatever method you want to use. But I can see in here that a capital A is 0041 and a lowercase a is 0061, and so I entered 0061. Notice it is grabbing both the uppercase and lowercase a. Why is it doing that? Because of our flag. If I were to remove the flag that ignores case, then of course we would get only the lowercase a, because the Unicode character that we're currently matching is lowercase a. Even though we still have the global flag on, it doesn't grab the first A because that's not a match; it grabs the second a. Now, we can also use Unicode characters like we would any other character that we put into a regular expression. So if I wanted to do a character set, I could do it with the Unicode escape sequence as well. Let me just go a few letters farther in the alphabet for the end of that character set, and there we're grabbing d, e and a, all of those as a part of that set. So that allows you to see that Unicode escapes can be used in place of other characters. But that's not normally how we use them. We use them for characters that may not be easy to type, and therefore we want to use the Unicode representation of that character. So let me copy in a few characters that we may not always know how to type. Each of these could be found in a regular expression by simply putting the exact character in the regular expression, like this.
I can put the pound symbol there and it will find it. Or we can use the Unicode equivalent. So once again, that is the Unicode equivalent for the pound symbol. Now, how do I know that? Once again, I look on that table. Let's say we wanted to get the copyright symbol. That would be 00A9, and there it grabs the copyright symbol. So generally this is used for things that we can't always type quickly; that is sometimes why we will use the Unicode escape. But remember, you can also just paste the character into your regular expression. Or, if you know how to type it on your keyboard, type it into the regular expression. Now, one more example. Let's say we have some text that we're trying to correct, and it uses smart quotes. And so we're trying to get rid of the smart quotes because we know that causes problems with whatever program we're processing it in. If we were to try to grab those by just entering a regular set of quotes, it's not going to get them. We would need to enter the smart quote. Once again, we could paste it in or use the Unicode equivalent, which in this case is 201C to get the opening one and 201D to get the closing smart quote. Put those in a character set, and there we can grab them, and then we could use a replace to put in a regular quote. So the important part to know with Unicode is simply how to enter the characters. If you run into a situation where you need to enter the Unicode equivalent, then you'll know how. That's the purpose of this small section on Unicode. There is one more thing we need to talk about, so let's move on to the next topic.
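The examples in this lecture can be reproduced in JavaScript along these lines. This is a sketch: the symbol and quote strings are made up for illustration, and \u00A3 for the pound sign is supplied here because the lecture does not read that number aloud.

```javascript
const fullName = "Andrea Smith";

// \u0061 is lowercase a. With the i flag both cases match.
console.log(fullName.match(/\u0061/gi)); // [ 'A', 'a' ]
console.log(fullName.match(/\u0061/g));  // [ 'a' ]

// Unicode escapes also work as the ends of a character-set range (a to e).
console.log(fullName.match(/[\u0061-\u0065]/gi)); // [ 'A', 'd', 'e', 'a' ]

// Characters that can be awkward to type: pound sign and copyright symbol.
const symbols = "Price \u00A35, \u00A9 Example Ltd";
console.log(/\u00A3/.test(symbols)); // true
console.log(/\u00A9/.test(symbols)); // true

// Replace opening (\u201C) and closing (\u201D) smart quotes with plain ones.
const quoted = "\u201CHello\u201D";
const plain = quoted.replace(/[\u201C\u201D]/g, '"');
console.log(plain); // "Hello"
```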
44. Understanding ES6 Unicode Features: The Unicode character representations we have been using so far support four hexadecimal characters. So we have been using \u and then four characters, each ranging from 0 to F, which allows us to represent hexadecimal numbers. Now, with those hexadecimal numbers, there are plenty of numbers to represent every character you would normally want to use. However, there are some extended characters that can't be represented using four hexadecimal characters. One example of this is the treble clef character used in music. As shown here on this slide, it requires more than four hexadecimal characters to define it. So what happens if we enter more than four? Well, let me open RegexPal. I have a treble clef character here. Now let's say we want to represent that, so we do a \u, and the code for that is 1D11, and so far we're fine. Now the last one we have to enter is an E. You can see what RegexPal does with that: it no longer considers it a part of the Unicode escape. It now becomes an actual character, E. So what do we do about that? Well, traditionally, we have represented characters like that using two sets, like I show here. Here's our first Unicode set, then a second set. The problem with that is that it gets counted as two characters, so that's not a complete solution. Well, as of ES6, we have a new flag for regular expressions that changes how we process Unicode characters. That flag is the letter u. And when that flag is used, it will consider an extended character as a single character. And also, we can represent those extended characters like this, where we have a \u and then we have curly braces, and inside of there we can have more than four hexadecimal digits, and that allows us to represent this treble clef character. Now, RegexPal doesn't support this u flag, this Unicode flag, so we'll take a look at a few examples in the console.
Let me open up the console for this. The first thing I want to do is set up a variable that has this character in it. So gClef now has that character in it, plus a hyphen and then the word clef. So let's take a look at what would happen as we use this inside of a regular expression. First off, I'm going to create a regular expression, and I want this to match at the start of the string, and then a single character, so there's our wildcard character that represents one character, and then -clef. Now we're going to test that against the variable we created. I'm going to leave off the u flag to begin with, and let's see what happens. We get a false. We get a false because, once again, it considers this part here as more than one character, so it doesn't match that regular expression. Now what if we were to include the Unicode flag? Let's take a look at that. Let me add that in; there's our Unicode flag. Now when we press return, we get a true. It matches. So now it sees it as a single character because of that flag. Now, I also mentioned that we can use a new way to represent that character as well if we use the u flag. So let me enter that. Here is our representation. There are five hexadecimal characters there to represent that treble clef character. And we're going to test to see if we get a true, and we get a false, and the reason we get a false is because we haven't included the Unicode flag yet. So once I add that Unicode flag in, then we get a true. So, as mentioned, that is something that became available with ES6. It's not the type of thing you're going to use extensively, but I wanted to make sure you are aware of it, so that if you come across an extended character that you need to use in a regular expression, you will know how to do it. All right, let's move on to the next section.
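The console session described above can be approximated with the sketch below. The exact patterns typed in the lecture are not fully recoverable from the audio, so the ones here are assumptions that reproduce the same true and false results; the surrogate pair \uD834\uDD1E is the standard two-part encoding of U+1D11E.

```javascript
// U+1D11E is the treble clef (musical symbol G clef).
const gClef = "\u{1D11E}-clef";

// Without the u flag the dot matches only half of the character,
// so the rest of the pattern cannot line up.
console.log(/^.-clef$/.test(gClef));  // false
// With the u flag the extended character counts as one character.
console.log(/^.-clef$/u.test(gClef)); // true

// The curly-brace escape is only understood when the u flag is present.
console.log(/\u{1D11E}/.test(gClef));  // false
console.log(/\u{1D11E}/u.test(gClef)); // true

// The older approach: two four-digit escapes (a surrogate pair).
console.log(/\uD834\uDD1E/.test(gClef)); // true

// The string length shows that the clef is stored as two code units.
console.log(gClef.length);      // 7
console.log([...gClef].length); // 6
```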
45. Applying Regular Expressions: In this section of the course, we are going to be applying regular expressions. This will give you a chance to see real regular expressions and how you might use them to match certain criteria. Now, we'll have several examples in this section, and with each example, if you would like, you can pause the video after I introduce the example that we're going to try to match, try it on your own, and then restart the video and see how you did. Be aware, if you choose to go this route, that regular expressions can come in a number of varieties, so your solution may work just fine yet not be exactly like my solution. But the idea is to learn what you can do with regular expressions and see some examples of real-life data that you want to match, and in the process it will give you practice with regular expressions. So let's go ahead and move on to the first example.
46. Matching an Email Address: In this example, we're going to look at matching an email address. Now, this is a regular expression that could be done a number of different ways. It depends on how strict you want to get with your regular expression. So let's first jump to RegexPal. Let me talk first a bit about an email address and the parts that will help us determine what we need to match. Then we'll look at a simple regular expression, and then we'll look at one that is a little more strict. So first off, if we look at the parts of an email address, there is some text, some characters. And then there is the at symbol. And then there is a domain, a period, and then com, net or a number of different things at the end. So for all of these parts, we really don't know what characters or how many characters there will be. Here at the end we do know a little: it's usually three to four characters, though it could be two as well. So we do have some guidelines, but not a lot. And so we probably don't want to limit the number of characters. We probably just want to try to match based upon wildcards, based upon a number of characters that we may not be able to identify. So let's first look at a simple example. If we throw in a wildcard here, obviously it matches every character that's on there, and then if we indicate that the wildcard can be one or more, then we get a single match. Now let's account for the at symbol. The at symbol comes at one place, and it can't come at other places. So if we put an at symbol here, then we get the first part of the email address matching, with the at symbol. So following the at symbol, let's do another wildcard, and that could be one or many, and so that's matching all the way through the dot. Now, we know also that there will for sure be a dot or a period. And so let's specify that that would be part of the email address.
We have to escape that, because if we don't, it will represent the wildcard. So we escape it, and then once again following that, we can have one or many characters, and so there we get a match on our email address. Now, that's a simple regular expression, and it identifies the parts of any email address. But let's expand on this. There are certain characters that you can't have in the different parts. For example, an at symbol can't be in this part or in the domain. It can't exist there. That's illegal. Also, spaces can't be in an email address. And so let's approach the regular expression by eliminating some things that someone might be able to put in that we don't want a match for. For example, if there was a space here, the simple expression would match it, but that's not a correct email address, so let's see if we can address that limitation. We're going to start this by specifying the start and the end of the string, so I'm putting both the start and end characters there. Now I want to create a character set for each part and separate the parts by an at symbol and a period. So let's create our first character set, and as I said, we're going to eliminate the characters that can't be a part of that. So we want to negate this character set so it will match any characters except those that are in this character set. One of those should be a space. We don't want it matching a space, and we don't want it matching the at symbol. Now, there might be other characters that you may want to throw in later on, but this is going to accomplish what we need. So we don't want to match a space or the at symbol. Now we want this to repeat, so not a single character; we want it matching one or more. And then at this point the email address should have an at symbol, prior to the domain. All right, then we want another character set. And in this character set, we want to match everything except a space again, a space-type character.
I should say this could be a new line as well; it's not just a space. We don't want it matching the at symbol. And in this part here, we don't want it matching a period either, because the period will start the end of the address. And so let's put that in this character set as well, and that can repeat one or more times. Then, following that portion, we have a dot. Now, we have to escape the dot here. We don't escape it here because it's inside a character set, but outside that character set we do need to escape it. And then we have one more character set, and we want to match everything except a space character and the at symbol, and that can repeat as well. And there we get a match on our email address. But if there is a space, we don't get a match. If there's an at symbol where it shouldn't be, we don't get a match. So now this works pretty well for us. Now, we could get a lot more restrictive. There may be other characters that should not be included. There may be certain lengths that we want to put on things such as the domain part of the email address. But this works pretty well for us. Now, a couple of things to be aware of, reminders about regular expressions. The caret symbol is used for two different things. The caret symbol is used to mark the beginning of the string, the beginning of the line. It's also used to negate within a character set, so be aware of that. The space character, remember, doesn't just indicate a space; it's whitespace, so it could be a tab, it could be a line break, it could be a space. So it accounts for those. I spoke of it throughout the exercise as a space character, but I want to make sure you realize that it represents the other things we discussed earlier in the course. And then I also noticed that I can put a period here, and we were trying to prevent that by having the period negated in this character set. So it should match everything but a period. So why is that matching it?
Well, I realized I made a slight mistake. What's happening here is that this period is matching right here and then everything else is matching with this last character set, and so we need to negate a period in this part as well. So let me put one there, and there we don't get a match. Now if I remove that, it goes back to a match, so that works better for us. So make sure we add that period in this character set as well. Hopefully you learned something from this, and hopefully it allowed you to review regular expressions. Now let's move on to the next example.
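Written out in JavaScript, the two patterns built in this lecture might look like the sketch below. The addresses and variable names are made up for illustration. Note that because the last two character sets exclude periods, this stricter version also rejects addresses with more than one period after the at symbol, such as a subdomain.

```javascript
// Simple version: anything, @, anything, a literal period, anything.
const simpleEmail = /.+@.+\..+/;

// Stricter version: anchored, with negated character sets that exclude
// whitespace and @ (and also the period in the last two parts).
const stricterEmail = /^[^\s@]+@[^\s@.]+\.[^\s@.]+$/;

console.log(simpleEmail.test("some one@example.com"));   // true: the space slips through
console.log(stricterEmail.test("someone@example.com"));  // true
console.log(stricterEmail.test("some one@example.com")); // false: contains a space
console.log(stricterEmail.test("some@one@example.com")); // false: extra @
console.log(stricterEmail.test("someone@example..com")); // false: extra period
```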
47. Matching a Twitter Handle: This example is going to be a bit easier than the last one we did. This is matching a Twitter name, which is a little bit simpler than an email address. There's not quite as much variation that can happen with a Twitter name. If you'd like to try this for yourself, pause the video and then restart it again when you're ready to check it out. Now here's how we're going to do this. It's pretty simple. I have my Twitter handle sitting here to match. This is not that difficult. We do want to specify the start and end of the string, so let's put both of those in. So we're assuming a Twitter name was entered and we're matching it to make sure it's correct. Now, it starts with the at character, so let's go ahead and place that in. And then we have word characters that can come after that. Now, remember what word characters represent. They represent the letters A through Z and the numbers zero through nine, and they also represent the underscore. So it could be as simple as what we have here. We need to add a repetition character because we want at least one of those to exist, and it can be as many as possible. And so we get a match there. Pretty simple. Now, what if we wanted to extract the Twitter name and store it, and we want to store it the same way every time? Let's say it is possible that somebody may enter the at symbol and somebody may not. And so we want to store the part of the Twitter handle without the at symbol. And then when we use it or display it or whatever, we'll add the at symbol. We could do that by turning the portion that represents the word part of that Twitter handle into a capturing group, and it would capture that. And then with JavaScript, we could simply store that portion of the match. Now, we also want to account for situations where they may not enter the at symbol. And so let's enter a question mark after it, which indicates that this could be zero or one. So if somebody enters two at symbols, it doesn't match.
If somebody enters no at symbol, it does match, and the match also includes a capturing group, so we can use that capturing group with JavaScript to extract what we want and store it. And if an at symbol is entered, the capturing group does not grab it, because it's not a part of the capturing group. So that is matching a Twitter name, with a little bit of discussion around that. All right, let's move on to the next example.
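The pattern built up in this lecture might look something like the sketch below. The exact expression is not shown in the transcript, so this is a reconstruction from the description; the helper name `extractHandle` and the sample handle are assumptions made for illustration.

```javascript
// Optional "@", then one or more word characters, anchored at both ends.
// The parentheses capture the name without the "@".
const handlePattern = /^@?(\w+)$/;

// Hypothetical helper: returns the bare name, or null when the input is invalid.
function extractHandle(input) {
  const result = handlePattern.exec(input);
  return result ? result[1] : null;
}

console.log(extractHandle("@some_user"));  // some_user
console.log(extractHandle("some_user"));   // some_user
console.log(extractHandle("@@some_user")); // null
```

Storing only the captured group means the at symbol can be added back consistently wherever the handle is displayed.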
48. Testing Passwords: The next application we're going to look at is testing passwords. In this application, I'm going to talk about the regular expressions that we would use, but I'm also going to look at a JavaScript solution for testing passwords. Now think about what you would do with passwords. You want to test whether a password matches certain criteria: for example, whether it is of a certain length, and whether it contains uppercase characters, lowercase characters, numbers, special characters. Whatever your password criteria are, that's what you need to verify. Sometimes, in fact most times, it is easier to do this with separate regular expressions. So you create multiple regular expressions, and then you test the password against each one of them to make sure that it is valid, and when it meets all the criteria, then you go ahead and say the password is correct. Knowing that, if you'd like to try this on your own first, go ahead and pause the video. Then once you're ready, restart and you can watch the solution that I came up with. All right. So first, let's talk about the regular expressions that might be used to test a password. Let me jump to RegexPal. I have a password here that was generated using a password generator. I just put the criteria in and it created the password that we have here, and you can see that we have uppercase letters, we have numbers, we have lowercase letters, and we have special characters as well. Now let's say we wanted the passwords to be of a certain length: they need to be at least eight characters, but no more than 32 characters. The way we could easily do that is to put in our character that specifies the starting location, then our wildcard character, and then we specify the length, and we can see that we get a match there. The fact that it matches this regular expression tells us that our password has met the length criteria. Then let's say we want to make sure it has an uppercase character. That's simple. 
We just do a character set, and if we get a match, then we know it meets that criteria. We can do the same thing with lowercase, and we can do the same thing with numbers, like that. So all of those matches occur. Now, what if we want to do special characters? Well, there are two approaches to that. You can list out all of the special characters that you allow; that way, you know it only matches characters that you permit. Or you could simply do something like this: a character set where we negate at the start, so it matches anything that is not a number, an uppercase letter or a lowercase letter, and you can see that we get some matches there. That's one way to do it. All right, so those are the regular expressions that we would use to test a password. Now let's take a look at a JavaScript application of this, so I'm going to jump to Sublime for that. Here's what we have. I've set up a password variable that contains that same password that we were working with in RegexPal. Then I have set up several variables to define the different regular expressions. Here's the one for the length, uppercase, lowercase, numbers, and then this one is for special characters. I went ahead and created a regular expression that has a character set. Here's the outside of the character set, and it lists all of the special characters that could be there. So all of these are special characters. Right here we do some escaping for the special characters that require it, and here we do as well. The hyphen requires escaping, that slash requires escaping, and then the closing square bracket requires escaping. So those are escape sequences in order to specify those characters; the rest are just the actual character that we're trying to match. So I've got all those set up, and then I simply do an if statement, where I use the test method of regular expressions in JavaScript to test the password. 
That is all included in one if statement: if we get a true from this test, a true from this test, a true from this test and so on. If all of those equal true, then it matches; otherwise it does not match. So let's take a look at the results of that really quick. Let me jump out and refresh, and then if we look at the console, we can see that that particular password matches. Now this JavaScript solution could easily be turned into a function that we could call any time we wanted to test a password. So let's see how we'd do that. Let's call it checkPass, and we're going to have a password passed into it that we can then verify. So I've got my function set up. Let me just copy all of this code here, put it inside the function and indent that. Now, what I want to change with my if statement is that I simply want to have it return true or false. If it matches, it returns true. Otherwise it returns false. That way I can use this function whenever I want to check a password and see if it meets our criteria for passwords. So let's go ahead and check this one. We'll just log to the console the result of calling checkPass. We'll pass in the password, this variable here; that will get stored in this variable, and then that will be what is checked in our function. So let's see if we get a true returned for that. Refresh, and there we go. We have a true. So that is a function that could be used in any situation for checking passwords. Obviously, if you have different criteria for your passwords, you would change your regular expressions. But other than that, it could be reused. All right, let's move on to the next topic.
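A function along the lines described above could be sketched as follows. The on-screen code is not part of the transcript, so the variable names, the sample passwords, and the closing `$` on the length check are assumptions. This version uses the negated character set for special characters rather than an explicit list.

```javascript
// One regular expression per criterion (names are illustrative).
const lengthCheck = /^.{8,32}$/;      // 8 to 32 characters; the trailing $ is an assumption
const upperCheck = /[A-Z]/;           // at least one uppercase letter
const lowerCheck = /[a-z]/;           // at least one lowercase letter
const numberCheck = /[0-9]/;          // at least one digit
const specialCheck = /[^0-9A-Za-z]/;  // at least one character that is not a letter or digit

// Returns true only when the password passes every test.
function checkPass(password) {
  return lengthCheck.test(password) &&
    upperCheck.test(password) &&
    lowerCheck.test(password) &&
    numberCheck.test(password) &&
    specialCheck.test(password);
}

console.log(checkPass("Tr4il&Mix9"));   // true
console.log(checkPass("alllowercase")); // false
```

Without the closing anchor, the length pattern would only confirm that at least eight characters are present, so the 32-character maximum would not be enforced.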
49. Using Replace with Regular Expressions: In this application we're going to take a look at using string.replace with regular expressions. Now, before we take a look at the problem we're going to solve, let's just review replace really quick. I'm going to open the console here and enter a statement that uses replace. Let's say I have some HTML code and I want to change the b tag that produces bold to a strong tag. So let's look at how we might do that. I'm just going to use console.log to log the results so we can see what happens. First we have a string. That HTML code is going to be contained in a string, and it's simply going to look like this: something not too complex, but that can illustrate the use of replace. Now replace is a method on the String object wrapper. So on any string that we have, we can use replace to replace a portion of that string with something else, and we can find the match with regular text or we can use a regular expression. So let's look at how we would use a regular expression. To begin using replace, we use the dot syntax. Now, as you can see, I'm doing that directly off the string. It could be a variable that contains a string, but here it is the actual string itself. Then inside of the parentheses, the first part is the part I want to match, and here's where I'm going to use a regular expression. Now I want to replace all occurrences, so I'm going to use the global flag. Then we use a comma. So this is the match. Now we indicate what we want to replace that match with, and that will be a string. Basically what we're going to do is find every occurrence of b followed by the greater-than symbol and replace it with strong followed by the greater-than symbol. Let's end replace with a closing parenthesis, end console.log, and put our semicolon. Let's just see what we got. 
As you can see, it went through and replaced all the b tags with a strong tag. So that's a quick refresher of how replace works. Now remember, that could be a variable as well. So if a variable contained a string, we could use .replace on that. All right, so that's how we use replace. Now, let's look at the problem we're going to try to solve with that. So let me jump to Sublime. What we want to do is take this array of names, and notice that the names are last name then first name. We want to take that array of names and switch the order, so the first name comes before the last name. Now, in my solution, I'm going to use the map method of arrays. If you are unfamiliar with map and would like to try that yourself, I'll include a link to a YouTube video I've created about the map method. The reason I want to use map is because map returns a new array. That's basically what it does. So whenever we have an array and we want to create a new array from it, map is a logical choice. If you'd like to try this first, go ahead and pause, give it a try and then restart when you are ready. All right, let's look at how we would accomplish this, how we would switch the names using a regular expression and using map. As I said in the introduction, map returns a new array, and we want a new array. So let's go ahead and define that new array. I'm just going to call it newNames, and we're going to set that equal to names.map. Now map is a higher-order function, so we pass in a function as the parameter, and that function will process each element in the array and then return the result, and each of those results that is returned will be placed in the new array. So let's go ahead and set up that function. And then in parentheses, I'm going to indicate name as the variable. That variable is going to contain each one of these strings as it cycles through the array one by one. 
And since they are strings, we can use replace on them. So let's go ahead and look at how we would do that. Now, with map, the function needs to return a value, because that value is going to be placed in the new array. So we're going to return name.replace. Now here's where we enter the regular expression. What we're going to do is create a regular expression that uses capturing groups, because we want to reuse those. We want to capture the last name and the first name, and then we want to reuse them, but in the opposite order, and put a space between them. We'll remove the comma and put a space between them. So let's look at how we would do that. Let's set up the regular expression first. We want two capturing groups. The first capturing group will be for the last name. So let's just enter word characters, and we can have one or more of those. Then we want to match a comma and a space; that's what's separating the names. And then we'll create another capturing group, and this will use the same thing: a word character, and one or more of those word characters. So there's the regular expression, and as long as the name has word characters, then a comma and a space, followed by word characters, it will produce a match. And then, because we've used parentheses, it will capture those parts of it, and that will allow us to use them in the second part of the replace. So we put a comma, and then inside of quotes we want to indicate the second capturing group. We do that with dollar sign two. Then we want to separate that with a space, and then we want to indicate the first capturing group. See how we've done that? The second capturing group will be the first name, then we put a space, and the first capturing group will be the last name, which we put at the end. So that's how that works. Let's go ahead and save this, and then let me copy the file path here so we can try this out. 
All right, open the console. Let's take a look at that newNames variable. There we have it. We now have an array of first name, last name, and there's a space between them. There's no longer a comma, since we've put them in the order of first name, last name. So if you were able to work that out on your own, fantastic. If not, I hope you learned some new things about replace, how you can use it with regular expressions, and how you can use capturing groups with the replace method. It can become very helpful in solving certain problems. All right, let's move on to the next topic.
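Both uses of replace from this lecture can be sketched roughly as follows. The HTML string and the names in the array are made up for illustration, since the actual sample data is not in the transcript; the two patterns follow the spoken description.

```javascript
// Review: swap every "b>" for "strong>" so both opening and closing tags change.
const html = "<b>Bold</b> and <b>bolder</b>"; // hypothetical sample string
const updated = html.replace(/b>/g, "strong>");
console.log(updated); // <strong>Bold</strong> and <strong>bolder</strong>

// Problem: switch "Last, First" to "First Last" using two capturing groups.
const names = ["Lovelace, Ada", "Hopper, Grace", "Turing, Alan"]; // hypothetical data
const newNames = names.map(function (name) {
  // $2 is the first name (second group), $1 is the last name (first group).
  return name.replace(/(\w+), (\w+)/, "$2 $1");
});
console.log(newNames); // [ 'Ada Lovelace', 'Grace Hopper', 'Alan Turing' ]
```

Because `\w` only covers letters, digits and the underscore, names containing hyphens, apostrophes or accented letters would need a broader character set.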
50. Matching a Word Next to Another Word: In this application, we're going to take a look at how we would find a match of some word, some set of characters, but make that match contingent on whether another word is close by or next to it. Now, think about the application of that. Think about how you could use this type of regular expression in a number of different situations. We have a specific example here: finding a word, and making sure that another word is close by. That's our specific example, but this could be applied in other ways. We're going to solve this in RegexPal. But before we take a look at the solution, you can go ahead and give it a try. So if you'd like, stop the video now and then restart it when you're ready to continue. All right, let's take a look at this solution. Let me jump to RegexPal first. Let's say that we wanted to match the word "words", this one here, but we only wanted to do that if "together" also appeared next to it. I'm starting with the simplest case first, and then we'll make it a little bit more difficult. The way we would deal with that is to first put word boundaries in the regular expression. Now I'm going to put in parentheses and create a group here, and I don't want it to be a capturing group, so I'm going to make sure that it's a non-capturing group. Obviously, this portion is not necessary for you to solve it, but it makes it cleaner, and that's why I'm doing it. So let's put the first word in, "words". Now, if we just take this example here, we could then have that separated by a non-word character, and we can do one or more. And then we can put in the word "together", and we get a match of both of those. That works great. So that's a pretty simple regular expression, if we're just trying to find a word with another word following it. Now, what if there are words in between? 
If there's the possibility of additional words showing up here, let's put a few words in there. So we have two words in between. Let's see how we would modify this regular expression to make sure it can match those. I'm going to do another group here, and since I don't want it to capture, let me just enter that in. Now let's go ahead and look at the matching portion of the regular expression. We want a word character, and one or more of those, followed by a non-word character, and it could be one or more of those, and that will designate an individual word. Now we can then specify how many of those we're going to allow, so it could be zero to five. Maybe that's what we determine. And there we get a match. We have two words in between "words" and "together", and we get a match because we've been able to designate that there could be other words between them. Now let's expand on this even more. What if the word that we want to be a part of the text in order for there to be a match could come before or after? Let's go ahead and put "together" over here, and let's remove it from here. There. Now we no longer have a match, but the two words are in the same context; they just may not be in the same order. So how can we modify this in order to make sure that that gets a match as well? The way to do that would be to use an OR. So we can find this order, or, there we go, we can reverse the order. Let me just copy this here, paste that in, and I'm going to change the order of these two, "together" and "words". And there we get a match. So now we get a match if it comes before, like that one, or after, like this one. We get a match in both situations. So the real bulk of it is word boundaries, then we have our match here, the two words that we're looking for, and then whatever could come between and how many times words could repeat in between. 
Then we simply repeat that on the other side of the OR, but we reverse the order of the words, and that allows us to match something when another something is with it. And that's the best way to think about this type of regular expression and how you could apply it: match something when something else is with it. That's where you would apply this type of regular expression. All right, let's move on to the next one.
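Put together, the expression described here might look like the sketch below. It is reconstructed from the spoken steps, so the exact on-screen pattern may differ; the variable name `nearby` and the test sentences are assumptions.

```javascript
// Word boundaries around a non-capturing group with two alternatives:
// "words" ... "together", or "together" ... "words",
// allowing zero to five other words in between.
const nearby = /\b(?:words\W+(?:\w+\W+){0,5}together|together\W+(?:\w+\W+){0,5}words)\b/;

console.log(nearby.test("put the words together"));           // true
console.log(nearby.test("words that really belong together")); // true
console.log(nearby.test("together these are words"));          // true
console.log(nearby.test("just words here"));                   // false
```

Raising or lowering the upper bound in `{0,5}` controls how far apart the two words are allowed to be.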
51. Validating Dates: In this application exercise, we're going to take a look at validating dates. Now, dates can be entered in a number of different formats. So the first thing you need to decide is: what am I going to consider valid, and what am I not going to consider valid? For this exercise, let's take a look at what we want to test against, and we'll use RegexPal for this. I've entered a sample date in here. I'll work with this and modify it to see if the solution will match different types of dates. But here's what we want to do. We want to do it in the day, month and year format. And we want to be flexible enough that it can be a single-digit day, a single-digit month, two digits on the year, and anything in between. So any combination between these two, this one and this one, we want to make sure that those are considered valid. This solution is going to use some ORs. So if you'd like to try this on your own, open up RegexPal, go ahead and give it a try, and then when you're ready to continue, restart the movie. All right, let's see what we need to do to make sure that we can match these date formats and any combination in between. I'm going to remove this text here to begin with. We'll just use this example of a date for our sample. One thing I want to do first is put in a starting and ending anchor for the string. So there's my starting anchor, and then the dollar sign would be the ending anchor. Whatever text is entered, we don't want anything outside of that, so we want a start and end anchor point, and then we will have the two slashes. All right, so let's first work on the day. Since I'm going to use OR as a part of this, because we can have a day show up in three different ways, I want to put this inside of parentheses to help organize it. We can have a day that begins with the number three and then either a zero or a one. That is possible. So that's one day situation. 
I'm going to put an OR there, and now let's look at the next one. It could be a one or a two. So I'm doing a character set of one or two, and then the second number could be zero through nine, and so I do a character set of those. Then one more OR, and this third possible situation would be a single digit. Now remember, the single digit could have a zero in front of it, or they may choose not to put the zero in front of it. How do we account for that? Well, we're going to put the zero first, and then we will use a repetition modifier, and basically we're indicating that that zero can occur zero or one time. That accounts for whether they entered it or not. And then, of course, the second or the first number, whichever the case may be, is one through nine. We don't have a zero there this time, because that's not possible. All right, so that is our day. Now let's take a look at the month. With the month, we're either going to start with a one and then some other number, or we'll have a single digit, and we'll do the same thing with that single digit as we did with the day. So let's take care of the one. That could be followed by either a zero, a one or a two. So we'll do a range inside of that character set, and we'll do an OR. Now, once again, we can have the zero, and we're going to modify that so it can occur zero or one time, and then we will have one through nine as the second, or possibly the first, number. Something else I want to do here to group this better: I'm going to put this in parentheses as well. All right. Now, for the last part, this will be the year. The year, we indicated, may have just two digits or four digits. So let's see how we can account for that. Let's take care of the two digits at the start first. I'm going to put this in parentheses because I want to use a repetition character after it, and I want it to apply to everything within that group. 
Then we'll put a character set, zero through nine. Let me explain why I'm using zero: it's because of the repetition that follows. We indicate that it can have two digits, and so the zero could be the second digit. So that's the first two digits in the year. Now, this group, we want it to exist zero or one time. So a group of two digits, that's what we specified here. Notice how we've done that. This is kind of a neat application of regular expressions. We've indicated the numbers that could be part of that two-digit number, we've indicated that there are two of them, and then we put this modifier outside the group to indicate that the group can exist zero or one time. All right, now let's do the last two digits in the year, and that we can simply do with another character set, zero through nine, and a repetition of two, like that. Notice that it immediately matched once I entered that last part in. So that is our regular expression that will allow us to match a date in different formats. Let's see, when I change that to a four, it still matches. What if I do a 34? Well, now it doesn't match. 14 matches. Here I have a zero in front of the month. If I remove that zero, it still matches. If I put a one, that's no longer a valid month, and so it doesn't match. Then in the year, I can have two digits here, just like that. I must have either two or four. As you can see, when I put three, it doesn't match, but it matches with two digits. So that's our regular expression to validate a date that consists of a day, month, year format with slashes between, and we give it the flexibility that the person entering the date can have two digits or one digit in the month and the day, and two or four digits in the year. All right, let's move on to the next one.
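Assembled from the steps above, the date expression could be written roughly as shown here. This is a reconstruction rather than the exact on-screen pattern, and the name `datePattern` and the sample dates are assumptions.

```javascript
// Day:   3[01] | [12][0-9] | 0?[1-9]
// Month: 1[0-2] | 0?[1-9]
// Year:  an optional group of two digits, then two required digits
const datePattern = /^(3[01]|[12][0-9]|0?[1-9])\/(1[0-2]|0?[1-9])\/([0-9]{2})?[0-9]{2}$/;

console.log(datePattern.test("24/03/2019")); // true
console.log(datePattern.test("4/3/19"));     // true
console.log(datePattern.test("34/03/2019")); // false
console.log(datePattern.test("14/13/2019")); // false
console.log(datePattern.test("14/03/201"));  // false
```

Note that this only checks the shape of the input: a date such as 31/02/2019 still matches, so checking days against the actual month would need additional logic.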
52. Capturing Matched Text: In this application, we're going to look at capturing matched text. Throughout the course, we've used capturing groups, and that is one way to capture matched text. But for this exercise, I'm really interested in the match method of strings and how we can use that to capture matched text, which allows us to do some interesting things. So let's take a look at what the assignment is for this application. What we want to do is extract all the numbers from this phrase here. We have a string, and we want to extract all the numbers from it, capture those numbers and sum those numbers into a single number. So 32 plus 100 plus 15. That's what we'd like to do, and we want to do it with the match method of strings. Take a few moments if you'd like and see if you can solve that. Then when you're ready to continue, go ahead and restart this movie. All right, let's take a look at how I would accomplish this. As I've said in the past, there are probably multiple ways to achieve this, to extract the numbers and then sum them together. But let's take a look at my solution. The first thing I'm going to do is create another variable that we're declaring here at the top, and this is the total. This is going to be the sum of those numbers, and I'm going to set it to zero initially. All right, with that done, let's go ahead and set up our regular expression, and I'm going to assign the results of that regular expression to a result variable. We are going to use the match method of strings. So there's the match method, and with the match method we pass in a regular expression, and that's going to act on that string. So let's go ahead and set up this regular expression. Since I want to extract numbers, that's a pretty simple regular expression. I will do this, and then I'm going to put a plus as well, because I want one or more digits when I capture this. 
Also, I want all of the occurrences, so I'm going to use the global flag at the end. So that's the regular expression that will help us get those numbers. Now, as you may remember, or as you may have reviewed, what the match method is going to do is place the results in an array, so it will put things into the result variable here as an array. Let's see how we would then work with that. I'm going to do an if statement, and I'm just going to check to see if there's something in result. If there is, then we'll do something with that. Now, this next part is not necessary; it doesn't really need to be done in order to meet the requirements of this application. But let's say I wanted to extract those numbers from the array and use them separately. Well, I could do that by simply assigning to new variables, something like this. Position zero would be the first number that was matched, and we would just continue in that same sort of pattern, extracting those numbers and placing them in variables. Now notice that I used let to define these variables. So these variables would only be available here. Their scope is the curly braces; that's their defined scope. So technically, what I'm saying is I'm going to use these numbers inside of this if statement. All right, let me finish that off. So that would be a way to get those numbers out and place them into particular variables. But what we're trying to do is to get the sum of those numbers, and there are a lot of different ways to do that. I'm going to use the reduce function of arrays. That's how I'm going to get the sum of the results. So let's look at how I do that. I'm going to place the final result in total, and set that equal to result.reduce. Reduce is a method on arrays, and basically what it allows you to do is take all of the elements in an array and do something with them, combining them into a single value. And here's how we do that. We first have to pass in a function. 
This is a higher-order function. We pass in a function that is then used to act on each element in the array, and I'm going to use an arrow function. I usually do that when I'm working with map, filter and reduce. So let's go ahead and define the parameters of that function. Sum: that variable is going to contain the ongoing sum as it moves through each of the items in the array. It's going to add those to sum, and that will keep the total as it moves through each one of those. And then val: that variable is going to contain the value of the array element that it's on, so as it iterates through that array, it will assign each value to this variable val. So those are the parameters. Then we have the arrow to indicate it's an arrow function. And then this is what we want to return. We want to return the sum plus each value. Now, since we're extracting this from a string, our values are going to be strings themselves. So I want to convert those to a number. I'm going to do that with parseInt, and I convert val using base ten. That's basically what I'm doing there. And then the second thing we need to pass into reduce is the starting value for our sum. Of course, we want that to start at zero, so I will pass in a zero for that. All right, so that should get our total. Let me go ahead and save that, then I'm going to grab the path of the HTML file so we can take a look at it. Pull that up, and let's see what we get here in the console. I'm going to take a look at the result variable first. Let's see what that shows. As you can see, we have an array of three values, 32, 100 and 15, and notice that they're strings. So it extracts those as strings. And then let's take a look at total to see if that got what we wanted: 147. If we add those together, we do get 147. So that is the total of those. Now, something else we did in the code was that we assigned those individual values to variables themselves.
Now, from the top level we're not able to access those variables, because their scope is inside of that if statement, so we can't really take a look to see if that works. But if we change this let to var, then we could do that, because then it will define them at the top of the scope. So let me refresh really quick, and then we can just check and make sure that it was extracting things properly. And there we have a 32, and notice it's a string. So we could convert those to numbers as well when assigning them. But really, the bulk of what we're trying to accomplish is to sum those numbers. Hopefully, you found success with that. All right, let's move on to the next topic.
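The solution walked through above can be sketched as follows. The actual phrase used in the lecture is not in the transcript, so the string below is a hypothetical stand-in that contains the same three numbers.

```javascript
// Hypothetical phrase containing the numbers 32, 100 and 15.
const phrase = "The order has 32 apples, 100 pears and 15 plums";
let total = 0;

// With the global flag, match returns an array of every match (or null if none).
const result = phrase.match(/\d+/g);

if (result) {
  // Matches are strings, so convert each one with parseInt (base ten) before adding.
  total = result.reduce((sum, val) => sum + parseInt(val, 10), 0);
}

console.log(result); // [ '32', '100', '15' ]
console.log(total);  // 147
```

The check on `result` matters because `match` returns null, not an empty array, when nothing is found, and calling `reduce` on null would throw.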
53. Discovering Information about a Match: Sometimes with regular expressions, you want to find out some information about the match, not just whether it matches. So that's what we're going to do in this application. We're going to begin where we left off with the last application, so let me jump to Sublime and we'll take a look at that. You can see that the code is the same, but the assignment is different. We want to retrieve the starting index for the match, the length of the match and the actual match. Just a couple of hints here. You're going to want to use a different method than what we've used in this example, but you can begin with this, and with that different method, you won't be using the global flag. Basically, let's look for digits again, just like we've done here, but let's just find the 32. So look for the first set of digits, and that's all we're going to concern ourselves with. But you want to find the starting index of that, the length of that and the actual match itself. So which method will you use? Go ahead and pause this, take a moment to figure out the best way to do that, and then restart when you're ready to continue. All right, so the trick to solving this particular problem is using the exec method of regular expressions. So we're not going to be using a method of a string but of the regular expression itself. The first thing I'm going to do is set up the variables for the starting index, the length and the actual match. I'm just going to call these matchStart, matchLength and, last, we'll just call it match. That's the information we're going to retrieve, and that's all. Now I want to place the results in the result variable again. But since this is a method of regular expressions, we need to do this in a much different way. So I'm going to remove everything here, and we are just going to call a method of this regular expression. The method we want to use is exec. 
Exec is the method that provides more information about the match, and we pass the string in to that. Now, once we have a match, and I'll remove all this down here, we can go ahead and assign the information to the variables. So matchStart will actually be result.index. Remember, with exec we have an index property that gives us the starting position of the match. So that's what we'll assign to matchStart. Then for matchLength, this one is a little bit trickier. The first position in this array stores the actual match. So if we just get the length of that, then we get how long the match is. And then finally, since the first position of the array stores the actual match, we can just assign that to match, like this. So it's not too difficult once you have the correct method and remember that it's a method of the regular expression, not a method of strings. Let's save this and take a look at those three variables. I'm going to grab the file path, jump out, and let's look at the console. Let me just take a look at matchStart. That's 14, so 14 characters into that string is where it begins: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, and 14 is where 32 begins. Okay. And then, of course, matchLength is two, since 32 is two digits, and then there's the actual match itself, and there we retrieve the number 32. So sometimes you may need to gather information about the match for other purposes, to do something with it. We could use this information to make a change to that string, and that may be what is required. When you're looking to gather information about a match, you would use exec. All right, let's move on.
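A minimal sketch of the exec approach described above is shown here. The phrase is a hypothetical stand-in, chosen so that 32 begins at index 14 as in the lecture; the real phrase is not in the transcript.

```javascript
// Hypothetical phrase; "32" starts at index 14.
const phrase = "The order has 32 apples, 100 pears and 15 plums";

let matchStart, matchLength, match;

// No global flag: exec returns details about the first match only (or null).
const result = /\d+/.exec(phrase);

if (result) {
  matchStart = result.index;      // starting index of the match
  matchLength = result[0].length; // position 0 of the array holds the matched text
  match = result[0];
}

console.log(matchStart);  // 14
console.log(matchLength); // 2
console.log(match);       // 32
```

The matched text is still a string at this point, so it would need converting with something like `parseInt` before being used in arithmetic.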
54. Iterating over Matches: In this application, we're going to take a look at how we can iterate over matches. We're going to use the same problem we've used in the last couple of applications. So let me take a look at that. We have this phrase with three numbers in it. You're going to create a regular expression that matches the numbers. That part we've already done; we're looking at how to apply this in JavaScript. Now, the trick here is to figure out how you're going to iterate over all those matches. Now, there's one obvious way that we talked about before: using the match method, because it creates an array of matches, and then we can iterate over that array using a number of different techniques. So that is the obvious way. In the solution, I am going to introduce an additional way to iterate over those using the lastIndex property of a regular expression. If you'd like to try to figure that out before I present the solution, go right ahead. Either way, pause the video and restart it when you're ready to watch the solution. All right. Now, as I mentioned, an obvious way to iterate over matches is to use the match method of a string, which creates an array of matches. And then you can iterate over that array using a for...in loop or a for loop, or you could use forEach, the method that's part of arrays. So there are a number of different ways you could do that. I'm going to show a solution with lastIndex. Now, lastIndex is a property of a regular expression that is updated by either exec or test when the global flag is set. So those two methods that you can use on a regular expression set the lastIndex property, and lastIndex represents the position after the last match. So the index position after the last match, that's what it represents. I'm going to use that property as a part of the loop I'm creating so that we don't get stuck in an infinite loop if our regular expression produces a zero-length match. So let me first set up the regular expression.
We've done that before, so nothing new there: we're looking for the numbers in this string, and I'm using the global flag. I'm also going to set up match. I'm going to assign it a value of null to indicate it is empty. Now let's set up our loop. So while match, and we're going to set that equal to the regular expression using the exec method on phrase, like that. So while we have something, while it results in a match, then let's go ahead and do something with that. And so what I'm going to have us do is just log the match to the console. Simple as that. So a very simple loop. So let me save that. Let's just take a look at what this produces. Let me open that file in the browser, and let's open the console. So here are the three log statements. There's 32, there's 100, and there's 15. Since we're using the exec method, we get a lot of information about each match, but in all cases we do get the value, so we could use this while loop to act on each of those matches, depending on what we wanted to do. So it works very slick for iterating over those values and doing something with those values. Now, in the introduction of this, I mentioned lastIndex. Now, I haven't used it yet. Why would I mention lastIndex and not use it? Well, let me show you something here. If we were to change this to an asterisk, that indicates we can find zero or many digits. Well, zero would give us a match that is zero length. It can find a match that is zero length, and as such, we would get caught in an infinite loop here. It would not be able to jump out of that. And so, if you're unsure whether your regular expression could get a zero-length match, then I would also add an additional if statement here in the while loop. It would look something like this. We check the index of the match and we see if it's equal to (I'm going to use the non-strict equality operator) the regular expression's lastIndex property. That's lastIndex with a capital I.
So if those things are equal, that's where we would be stuck in an infinite loop. And so then I want to just increment the lastIndex property. That way it will continue to move through the loop. Basically, the reason the while loop works is that the exec method updates lastIndex. The next match begins where that lastIndex is. And so that's how we're able to get all the different matches, because the next search begins right after the first match, then the one after that begins right after the second match, and so on. So if those are equal, then we're in an infinite loop, and we want to increment the lastIndex property. Now, just to show you what's going on, let's also display the lastIndex property. All right, so we'll save that, and we'll see this doesn't get into an infinite loop, but notice how it goes through the entire string one position at a time. Whereas if we use the plus, then there are just three positions. Here's the lastIndex. So the match starts at 14. The number is 32, so 14, 15; the lastIndex is 16. That's where it begins the next match, and so then it finds the next number, 100. That starts at 41, or sorry, that starts at 38, and then the lastIndex is 41, and so on. So that's how that one's working. There is no zero-length match in that case, and so we don't have to worry about being stuck in an infinite loop. But this is a nice if statement to include just to prevent that from happening. So, a nice little way to iterate over matches. All right, let's move on.
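The loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the video: the phrase is made up (written so the numbers start at indices 14 and 38, matching the values read out), collectMatches is a hypothetical helper name, and the comparison uses strict equality rather than the non-strict operator mentioned in the lecture. The behavior is the same here because both sides are numbers.

```javascript
// Hypothetical phrase (not the one from the video).
const phrase = "First we have 32 apples, then another 100 pears, and finally 15 plums.";

// Hypothetical helper: collects every match of a global regex in text.
function collectMatches(regex, text) {
  if (!regex.global) {
    // Without the g flag, exec() never advances lastIndex,
    // so the while loop below would never end.
    throw new Error("collectMatches expects a regex with the global flag");
  }
  regex.lastIndex = 0; // always start searching from the beginning

  const found = [];
  let match = null;
  while ((match = regex.exec(text)) !== null) {
    // A zero-length match leaves lastIndex equal to match.index.
    // Step forward by one so the next exec() call makes progress.
    if (match.index === regex.lastIndex) {
      regex.lastIndex++;
    }
    found.push({
      value: match[0],
      index: match.index,
      lastIndex: regex.lastIndex, // where the next search will begin
    });
  }
  return found;
}

// With +, there are exactly three matches and no zero-length ones.
const numbers = collectMatches(/\d+/g, phrase);
numbers.forEach((m) => console.log(m.value, m.index, m.lastIndex));

// With *, the pattern can match zero digits at every non-digit position.
// The guard keeps the loop finite; the empty matches are filtered out here.
const withEmpty = collectMatches(/\d*/g, phrase);
const nonEmpty = withEmpty.filter((m) => m.value !== "").map((m) => m.value);
console.log(nonEmpty);

// The "obvious" alternative: String.prototype.match with the g flag
// returns an array of matched strings (or null when nothing matches).
const viaMatch = phrase.match(/\d+/g) || [];
viaMatch.forEach((value) => console.log(value));
```

The exec loop is the better fit when the position of each match is needed, since match with the global flag returns only the matched strings.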
55. Conclusion: Congratulations. You've made it to the end of this course. Hopefully, you were able to learn a lot about regular expressions and how to use them with JavaScript. Before signing off, I just want to leave a couple of things with you. First, the syntax for regular expressions is quite difficult. It's not the easiest thing to remember, so it's important that you have a place you can refer to, to remind yourself of some of those syntax elements. I like using RegexPal. I like using the cheat sheet that's available with RegexPal. It's very simplified, and I can easily look at things and remind myself, oh yeah, that's what that is. One of the things I very often mix up is the quantifiers, and so I have to look at them really quick: oh yeah, that's the zero or more, and that's the one or more. Sometimes I mix those up, so there are probably things that you will mix up as well, or that you will forget. Let's be realistic. Regular expressions are not the type of thing you use in every bit of your code, and so when you're not using them a whole lot, it's easy to forget those things, so you need this type of reference. Now, the other comment I want to make about regular expressions is that, as you've seen with some of these applications that we've been dealing with in the last section, sometimes the solution to a problem that uses regular expressions has to do with the method you use in JavaScript. And so get to know and understand those methods well. Review some of those applications if you need to. Review the first part, where we talk about the methods, so that you understand that as well. So it's not so much, well, what is the regular expression I need to use to solve this? It's more: I know this regular expression can match the data I'm after, I'm aware of that, but how do I solve the problem associated with matching that data? Sometimes that can be the trickier part.
Now, admittedly, there are regular expressions that can be quite complex, and one of the best things about going through this course is that you will see some of these regular expressions. When you do some searches on Google for a particular problem, you might find something. Well, now you can read through it and understand what it's doing, and that can help you learn more about regular expressions as well. So those are the two suggestions I would make at the end of this course. Now that you understand how regular expressions work, you understand the syntax involved with them, and you understand the methods that are available with JavaScript for using those regular expressions, you can begin applying them and find more success in your JavaScript coding. So best of luck to you, and thanks for spending this course with me.