Transcripts
1. Introduction to Text Functions: All right. Now you've wrapped up Section five look up reference functions. It's time to move on to section six text formulas. And as you may have guessed, this basically covers any formula or function in Excel that allows you to work with or manipulate text strings. So we're going to start with some basic capitalization based functions, like upper, lower and proper. We'll talk quickly about the trim function. Then we're going to King Cat Nate, which is basically how joined text strings together talk about how to convert text values and values to text using the text and value functions. We'll talk about left mid right, which are really powerful ways to extract pieces of text strings. And last but not least, we're going to search and find and go into some Excel examples. Teoh show you how that works, so as usual to files to download Excel for analysts Text functions, which is a PdF and then Section six text functions, which is an Excel file. So go ahead and give those download Let's get started
2. UPPER/LOWER/PROPER & TRIM: all right, let's kick things off by talking about four pretty simple text functions. Trim upper, lower and proper, starting with trim. Basically, all it does is remove any leading or trailing spaces from a text string. And while that might seem like kind of a trivial thing for function to do, it's actually really, really important. And that's because if to text strings are identical, except for the fact that one has a trailing space at the end, they look exactly the same to you. But Excel will treat them is totally different values that can totally wreak havoc on things like look up or match functions that are trying to find to exact values to match um so trim will make them equivalent. And honestly, it's a nice best practice to get in the habit of using just to standardize your text fields lower, upper and proper. These air just capitalization based functions so lower will just convert the characters of a reference array toe. All lower case Upper will convert all the characters to upper case and proper will capitalize the first letter in each word of a text string. So let's dive into excel and see we've got to work with. Um I've opened the section six text functions Excel File. So go ahead and open that up and follow along Just one tab here. It's called raw data and looks like we've got some some names here. Gender address, city state, zip, email, telephone and national i d. So we're gonna be working exclusively with this one tab and this one file as we work through this section. So our first challenge is to create two new fields for first and last name. And what we want to do is convert to proper case and also trim out any potential leading or trailing spaces. And as you can see, these all look, you know, just like normal first names. But if you select any cell and click in the formula bar, you'll see that Jennifer, for instance, has a trailing space after it. Chad does not. Susan does not. Wayne does. But as you can see, it's really manual. And the only way to tell is by actually clicking into the value of each cell. So what we're gonna do is create a new column. I just right Click Colin be, and I'm gonna do insert. And I'm gonna call this new column proper. Oops. Proper first name. And so I'm gonna start with just the proper function so equals proper. Open up the princess. Select a to and close it. As you can see, that converts Jennifer and all caps to Jennifer with just a capital J. Now, I don't need to create a new column and take the trim of this version. I could just add trim to this and right both functions at once. So I'm gonna say equals trim proper eight to close the parenthesis. And then when they apply this down, Jennifer will no longer have a trailing space. So let's do the exact same thing after column D In a right click e insert, call it proper last name equals trim parenthesis proper parenthesis d two and enclosed the two princes off it. Enter and apply it down. There you go. So reviews proper to clean up the first name and the last name columns
3. The CONCATENATE Function (&): All right. Next up, we've got the concoct innate function, which is honestly, one of my favorite functions. It's one of the most underrated and powerful functions and excel. Really. What it does is combine text cell values or formula outputs into a single text string. You can either type equals can Captain eight if you choose. Or you could just use the ampersand symbol, which does the exact same thing. So note that you can combine just sell references and the strings that they contain. Ah, you can combine those with user inputted text strings surrounded by quotes, or you can combine them with other formulas, which will pull out pieces of text strings to be contaminated. So in our first example here, we've got a first name. Daniel. Last name, right. The first case, We're just combining the two a two and B two, which gives us this falling output. Daniel. Right now, the second example working Captain ating the same two fields except adding a space between eso. The output is Daniel space, right? And then the third and fourth examples integrate the left function, which we're gonna talk about in the next lecture. So let's jump into excel and we're going to do two things here. We're gonna create a full name column and a full address comb. I'm so I'm gonna right click column f insert a new column. I'm gonna call it full name. So the goal here is to create ah, full name field that takes the proper first name, the middle initial and the proper last name with spaces between them. So all I'm gonna do is type equals And the first text during that I want, which is proper first name be too than 10%. And I remember I need a space between this and the middle initial So in quotes, I'm gonna write a space so it looks like nothing's there. But I'm actually inserting as space into my resulting string. And then another and percent to the middle initial and another space one last 10% and in the proper last name. So re reading this formula, I've got be to so proper first name, then a space middle initial been a space proper last name. So I hit. Enter. Expand this out. As you can see, this is working beautifully. A similar case for full address, just going to right? Click. Com. L insert a new column. Name it full address. And in this case, I'd like to do something similar. I just want the street address followed by a comma and a space than the city comma Space State Comma space Zip code. So same idea here. Start with equals. H two and percent comma space in quotes ampersand, City ampersand, comma space and quotes and percent state ampersand, comma space quotes, ampersand sip code. So street address Common Space City. Common Space State kind of space zip code press Enter. Expand this out so that we can see it And there you go. So that's a basic concatenation to pull together multiple text strings.
4. LEFT/MID/RIGHT & LEN: All right. Next up, we've got left mid and right, which return a specific number of characters from a location within a text string. In other words, they extract pieces of text from a larger string. We'll also talk about the Len function or length, which returns the total number of characters in a string. So syntax wise, really straightforward. They just start with the text reference left, basically says, I want to return a certain number of characters from the left of this text ring. So if you were to say, from the left of string a three or the string within a three return to characters, Excel is just going to start at the left of this string and return. The 1st 2 characters that it finds right works exactly the same way. It just starts from the right of a text string and works its way to the left. So if I said to the right of this text during in a seven, give me three characters back, it's gonna return 100 because those air the last three characters in the strength, um and then mid has one additional component to it, which is the start number, and it's basically just what you use when you want to pull a text string or pull characters out of the middle of a strength. So here were saying from the middle of this string in a five, I want to start in Position four and return five characters, and that will return 0 to 215 because zero is in the fourth position in the string. And then last but not least, Len, which speaks for itself, just counts the characters in this case there 17 characters within the sample text ring. So let's jump to excel. We're going to do some work on the telephone column in column. And so I'm actually gonna create three new columns here, and I'm going to name them Area code can come Oh, prefix Income P and line number calling Cube, and I'm going to define area code as these 1st 3 numbers of the telephone number. Ill defined prefix as the second chunk of three numbers with the telephone number and then line number is going to be the last four in the number, so we're gonna practice all three left mid and right. In order to accomplish this so for area code, since it's the 1st 3 characters of the string I'm going to use the left function so equals left and two is my text reference number of characters. It's gonna be three. I just closed that off and apply it down. There you go, prefix. We need a mid function because we're pulling characters out of the middle of this text string. So the text is gonna be same reference. And to the start number is basically that where's the first character that we want to return? And it's in 12345 It's in the fifth position, and the number of characters that we want to return is three. Press enter 376 That looks correct to 89 There we go. And last but not least, line number is just the last four digits of the telephone number, so we use right again text reference is in to and we want to return four numbers from the right. The reserve 64 looks good. Apply it down and there you go. Simple left mid right functions
5. TEXT/VALUE: all right, so I want to take a minute to talk about two functions called text and value, which could be really useful functions when Excel is treating your text like values and your values like text eso. What the text function does is it converts a numeric value to text and the signs of particular format to it value. On the flip side converts a text string that represents a number into a value so that you can treat it as such. So for the text function, UH, two components to the formula, the first pieces the value. This could be a number of formula that evaluates to a number or a reference to a cell containing a number. And the second piece is how you want to format your text, whether you want to format as date as currency, certain number of decimal points or digits, and so on and so forth. So one really common example where you might run into issues between texts and values is dealing with ZIP codes. So let's jump into Excel again and take a look at her zip code field, which isn't column K for those of you don't know that the traditional or standard ZIP code should have five digits. The problem here is that certain ZIP codes, which began with a zero, have gotten that leading zero stripped off by Excel because Excel is treating this as a value and it deems the leading zero to be irrelevant. So what we need to do is create a new column here. So I'm gonna right Click column, Alan, insert a new comb. I'm gonna call it New Zip. And so, as a first effort, let's try just using a text function here. So equals text our values going to be the zip code value in K two and then for format text . I want to see if I can give it five digits. So in quotes, I'm just gonna hit pound or hash sign five times, close the quotes and then close the parenthesis and hit Enter. And this worked for all the five digit ones. The problem is, it didn't add zero to my four digit zip code because zero had already been stripped from. It s so we need to get a little bit more clever here. So what I'm gonna do is incorporate ah logical statement with a Len function to treat the four digit zip codes a little bit differently. So what we're gonna do here is we're gonna start with the text function. Were already have since we've gotten pretty far along and we're just going to start with, if the length So if Len of k two, she's a zip code field close the parenthesis equals five, and the value of True is we're gonna convert it to text with five decimal points. And that's that. When a comma over to value of false, here's where we're gonna tell Excel how to treat visit goods with only four digits. So we're going to say, uh, if the length of K two close that parentheses equals four, then I'm going to just grab this and just copy paste, save me a little bit of time. Then we're gonna convert the K two value to text. But I'm gonna make one edit to the format, and I'm gonna force the first digit to be a zero and then finish with £4 signs and then I'll just close this off with two more parentheses, and that should just about do it. So press enter. Apply it down. And, as you can see, the cases where it had only had four digits. Now have that leading zero added back in So again, just to revisit this formula were saying, if the length of the Zip code equals five, just converted to text with as many digits as you can, um, if the length equals four, converted to text and force the first digit to be a zero, eso used combination of logical operators, the length function and the text function to create a new, proper ZIP code field.
6. SEARCH/FIND: Okay, So we talked about the simple left mid right functions on we determined very specific number of characters to return. In each case, that works great for things like phone numbers, which are extremely consistent and standardized. But realistically, there are many situations where we're gonna deal with, uh, a variable number of characters to return. So in other words, what if we need the three characters after the ampersand sign or all of the characters preceding the dash? Questions like that can't be addressed by the left mid right functions alone. You need something like the search function at to help you accomplish that. So search, which is a personal favorite function of mine and hopefully soon to be favorite of yours is a really powerful function that searches for a given character string. And then, if it finds it, it returns the position of that string. If it doesn't find it returns value error. Eso syntax wise. There three pieces. Basically, it starts with what are you looking for? What character or strength, where you looking for it? And then an optional piece called Start numb. So by default, search will start at the beginning or left of a text string and move left to right until it finds what you're looking for. If you set start number two something different, it will basically start after a certain number of characters. So in this case, down here, we're looking at a text string in cell 11. So we're saying OK, search for the percentage sign within 11. It starts in the left and counts 123456789 Finds it in the ninth position and returns the number nine. In this case, we're using the optional start number component of the formula were saying search for the percentage sign within this same text during this time start at the 10th character. So we're skipping over the first percentage sign finding the second instance of the percentage which lands in position 13. So returns at 13 and then the next two examples start to show you how you can integrate the search function with others like left mid right toe. Add that dynamic element that I was referring to, which is the most common usage of search. You won't often see search on its own a little tip before we jump to excel the find function works exactly the same way, but it's case sensitive. So by default, when case sensitivity isn't a factor, I just use search so jumping to excel. What we want to do here is work with this email column and create a new one right click. Oh, insert a column and we call it user name, and I'm going to define the user name as basically whatever precedes the at sign in the email. So Jennifer J. McGrath at gmail dot com would translate to user name of Jennifer Jayme and Craft. So I obviously want to return characters from the left of the text string and column end. But if I start to write that and I get to the number of characters piece, what the heck do right here? So for Jennifer J. McGrath, it looks like 15 characters close 16. You know I can get her right. But then, if I tried to apply this down, you know I'm getting pieces of the address because obviously it's not a standard number of characters that I need to return. So what I need to do is replace the number of characters part of this function with a search. So I'm going to search four The at sign surrounded by quotes because it's technically a text string. It was search for within and to and then close that off. So now what we're doing. So we're saying, started the left of this email and I want you to basically return the number of characters it takes for you to find the at symbol. So if I were to close this out right now, we'd be close. But you see how we basically include the at symbol with it. That's because it's finding the at sign in the 16th position and returning 16 characters, which by definition, includes the outside. So all I need to do is gonna delete that last parenthesis. And since I'm still in the number of characters component of the formula is gonna add a minus one, then close it off, and all that does is it says Okay, return one less character so that I don't include the outside. Now, when I apply this down there you go. Regardless of where the at sign occurs in the string, the search function allows me to always update how I'm defining user name is that I'm on Lee getting the piece preceding it. So really powerful function. This is just one application of it gonna show you a few more applications in the next few lectures, so stay tuned.
7. Categorizing Data with IF(ISNUMBER(SEARCH)): all right, so it's time to share with you it top secret, very, very powerful combination of formulas that I use just about every single day of my life. Call if is number search, so sounds a little crazy. It's a combination of three functions. If statement is number, statement and search statement all rolled into one, and basically what it does is it helps you classify data based on cells that contain specific strings of text. So, in other words, you might have, ah, crazy, messy string of data that contains bits and pieces of useful information that you can use to create new data dimensions or classify ah, new categories of data. If is number. Search allows you to basically mind through the text string, pull those out and create those dimensions or classifications. However, you need to. So to interpret this formula, which looks a little crazy at first, it's always easiest to start from the inside and work your way out. So let's start with the search function, which we just walked through. Basically, you're just searching for text within a certain string of text, and that will return either a value, which is the position where it found the text you're looking for or an error if it doesn't find the text, So then taking that out one step further to the is number formula. All that saying is, did you get a number when you did that search function Yes or no? If you got a number, it means it's returning the position of that thing you're looking for. Meaning it found it. If you don't get a number, it means you got an error because he didn't find it. So all of those two together are saying is Did you find what you're looking for or not? So then when you take one more step back and look at the if statement, all you're saying is if you found what you're looking for, then return this value of true. Otherwise, if you don't find what you're looking for, return the value if false, so going back to our discussion on if statements keep in mind that you can nest as many of these together as you want, so you can create extremely deep and complex categorical rules using this combination of functions. So let's jump into excel and I'll show you one example of how we can do that. Eso I'm going to create a new column here. Right click. Oh, create a column right after email and I'm gonna call this provider. And basically, what I want to do here is determined, uh, which of these users uses Gmail, which uses a well and which users use hotmail. Um And so what I'm gonna do is I'm going to use a combination of if is number and search functions to search for those keywords Gmail, AOL well, and Hotmail in order to categorize these people into their providers so you can start with if open parenthesis is number open parenthesis search open, princess. And I'm on the fine text component that formula. So what am I looking for? First, I could do this in any order, but the first thing I'm gonna look for is Gmail. I'm gonna look for that text within end to and then syntax wise, it gets a little bit complicated. But as soon as you finish that within text, peace your clothes out to parentheses and then comma over to the value of true. So by this point, my formula is saying if you search for Gmail within cell and two and you find Gmail. What do you want to return? So I'm gonna return Gino, get to the value of false peace. Here's where I can start my next if statement cause I'm gonna nest a few if statements in here. So instead of rewriting if is number search from scratch, go into the formula bar, grab everything I just wrote up to the comma. Copy it and paste it right afterwards. So now the value of false is okay. If you didn't find Gmail, let's do another search. So if his number search for a well, within and to, then we're gonna call that. Yeah, well, and then the exact same logic here for my third if statement gonna copy that piece, paste it after the comma. The last thing I'm gonna look for is Hotmail within into If I find it, call it Hotmail. And now my last comment takes me to that end all be all statement that value it false. If all of these tests are false, what I want to show here and let's just call it other and then close out a bunch of parentheses. 123 to get me to the beginning and hit enter, so I know that was kind of a beast of a formula. But each piece is actually quite simple when you think about what it's really doing. So if I apply this down, as you can see, it's pulling out or identifying the provider from each of these users and classifying them or categorizing them accordingly. So if I look at the drop down, you can see that everyone's falling neatly into one of those three buckets. So there we go. That's just one example of how powerful this combination of if is number and search can be .
8. Combining RIGHT, LEN, and SEARCH: Alright, So I've got one more tip to share with you guys, too. Close out this text function section. But before I do, let's just look at how far we've come in this section kind of scroll through this section six workbook. Considering what we started with, it's actually pretty impressive where we've ended up. So we've been able to create, define and modify a number of different fields and new data dimensions, just using combinations of pretty simple text functions. So hopefully this gives you a sense of just how powerful they are, especially once you learn how to combine and manipulate them in interesting ways. So the last tip that I want to share with you guys is one that's deceivingly difficulty and tricky to get right. So remember when we did the user name formula and Colin P. Where we said I want to return a variable number of characters from the left of the email field and we use the search to say, Give me everything up to the at sign? What if we wanted to return a variable number of characters from the right of a text string ? Um, so let's take a look at how we might do that. Using this national I d field, I'm just gonna create new field right next to it. We call the short I D. And I'm gonna define that as just all of the characters that follow the dash in the national I D. So you can see in some cases three summits for I believe in other cases. It's two or five. The bottom line is that it can't just insert a strict number of characters to return from the right. I have to get more clever than that. So why don't we just start with the right function? Because we know that's the right starting point and we're going to say, All right, return characters from the right of you to or my national I D Field. Now the number of characters again, it can't just be a number, so you might be thinking very well. This is obvious. Just use the search function and search for the dash within the text Boom and hey, you write for road to. But before you go and hold a parade for yourself, let's apply it down and see what's going on. So the problem is, it works great for the 1st 2 but as soon as you get to an I D that requires a four digit short, I d. You're only returning three. So something's not quite right yet. And the reason we're seeing this discrepancy is that the right function is working from right toe left the search function, on the other hand, by default searches from left to right. So when you say search for the Dash, it's finding the dash and the third character. So we're returning three characters from the right every single time. It doesn't care if there are 34 500 characters after the dash because the searches going left to right, not right toe left. So let's step back and put our our mathematical hats on for a second and say Okay, if there's no simple way for us to define via functions or formulas, what number of characters occurs after the dash? What pieces can we easily defined with the tools that we have so a weaken define the number of characters through the dash, which we've already done? Using the search function be? Here's the missing link. We can return the length of the entire string using Len Function. So if we have the length of the whole string and we have the length of the strength through the dash, then think about what the difference of those two equals. And that's exactly what we need. So if you subtract everything up through the dash from the length of the entire string, you'll end up with the number of characters to the right of the dash. So kind of an unusual way to think about it, but that will get the job done. So I'm gonna just delete my search function here and start over. So again, I'm in the number of characters component of this function. So what I'm gonna do is say, Let's take the length of you too. Subtract all the characters up to the dash. So I'm gonna do a search for the dash within you, too, and close it off with two parent disease. So now we're saying, Okay, I want you to know, take a certain number of characters from the right. We're going to find the number of characters as the whole string minus everything up to the dash and return what's left over. And when I apply this down. Boom. There you go. It accounts for the fact that there may be a different number of digits to the right of that dash, so you could do the same thing. For instance, if we wanted to pull everything after the at sign for the email, let's just do, uh, email service here. SE equals right this text length of the text minus search for the at sign within the text boom gmail dot com a well dot com hotmail dot com So, again, a great reminder that if you're creative, there are an infinite number of ways to combine these functions to do almost anything you want to do. So with that that wraps up the text section. I hope you enjoyed it up next. We've got another great section dating time functions, so stay tuned.
9. BONUS: The SUBSTITUTE Function: so there's one other text function that it didn't get a chance to cover, and I wanted to introduce it here in the bonus lecture. And that function is called substitute. And the purpose of the substitute function, as you might expect, is to substitute a character or string of characters from within a text field and replace those characters with something new. Now the beauty of Substitute is that it also allows you to define which instance of a character to change. So, looking at our text function workbook here I'll refer to the telephone number, Field and column are, which has two different instances of a dash. Now let me show you how to substitute formula can replace either the first or the second dash from within this text ring. So just insert a new column after column are and let's call it new phone number, and here it can type a substitute function, and it has four components. First component is the text that I want to edit. In this case, it's telephone number cell are, too, And now here the old text. This is where indicate the character or string of characters that I'd like to change in this case, change the dash and my comma over the new text is what I want to replace that dash with. So instead of a dash, let's put a pipe here. And since this is text, I need to surround it in quotes. And finally, the fourth component of this function is that instance number. So type one and close the parenthesis. As you can see, six one eight dash is replaced with 618 pipe. And if I go back into that function and change the 1 to 2 now, you can see that second dash changed to a pipe. And this is helpful when you're trying to extract very specific text strings out of a field and you need to create a single unique identifier. So in this case have replaced two dashes with one dash and one pipe, and now I can use search mid left right functions using that pipe toe. Help me isolate pieces of this string. So there you go. I could just apply this down. As you can see in every case that replaces the second instance of the dash. And there you go. That's the substitute function
10. HOMEWORK: Text Functions: all right. It's homework time. So for those of you who want a little bit of extra practice with text functions, I've got some good challenges for you here in the Excel homework exercise workbook. So I'm on the text functions tab, and here you can see we've got a bunch of text strings for product keys that really messy. A lot of information that's kind of embedded in there. And what we're gonna do is use our full arsenal of text functions to extract meaningful bits of it. So we're gonna populate these columns B C D. All the way out through column h based entirely on the text strings in the product key column. So, looking at the instructions, we've got six steps here. We're going to use everything we've got left. Mid right search. We're going to use the if is number, search combo. Substitute all of that good stuff. So give it a shot. See how you do if you need to look at the answers that files available for you otherwise, best of luck