Social Media Analytics with Python | Kumaran Ponnambalam | Skillshare
24 Lessons (3h 9m)
    • 1. Introduction to SMAP 2 (5:30)
    • 2. Social Media Data (3:58)
    • 3. Social Media Applications (8:52)
    • 4. Challenges in Developing Social Media applications (7:17)
    • 5. REST API overview (9:11)
    • 6. Oauth Overview (9:37)
    • 7. Twitter API Overview (12:38)
    • 8. Twitter API Usage Examples for Python (11:13)
    • 9. Google Plus API overview (7:36)
    • 10. Google Plus API usage Examples for Python (11:49)
    • 11. Facebook API Overview (9:59)
    • 12. Facebook API Usage Examples for python (12:10)
    • 13. Introduction to use cases (2:30)
    • 14. Frequency Analysis Use Case Python (7:40)
    • 15. Sentiment Analysis Use Case Python (8:01)
    • 16. Link Analysis Use Case Python (6:16)
    • 17. Action Analysis Use Case Python (7:39)
    • 18. Frequent Pattern Mining Use Case Python (10:07)
    • 19. Real time analytics Use Case Python (7:06)
    • 20. Machine Learning Overview (10:05)
    • 21. SMA Classification Use Case Python (8:25)
    • 22. SMA Clustering Use Case Python (6:44)
    • 23. Linking Data (3:37)
    • 24. Closing Remarks SMAP 2 (1:11)

About This Class

Everyone is using social media to share their life experiences, initiate ideas and provide opinions in a free and open way. Businesses are therefore interested in understanding what people think and say about their products and services. They are augmenting their business applications to extract, understand and analyze social media data about them. If you are working or hoping to work in the analytics world, you need to enrich your skill set with social media analytics to improve your market value.

This Social Media Analytics with Python course helps you achieve exactly that! It introduces you to the tools and technologies required to extract social media data. Twitter, Facebook and Google interfaces are covered. It then walks through multiple use cases for analyzing this data and generating business insights. The examples range from simple histograms to advanced machine learning techniques. After completing this course, you will be able to execute end-to-end social media analytics projects and integrate them with existing business applications.

This course requires previous Python experience.

The source code used for this class can be downloaded from: Course Resource Bundle

Transcripts

1. Introduction to SMAP 2: Hi. Welcome to this course, Social Media Analytics with Python. This is your instructor, Kumaran. First of all, thank you for buying this course and trying to learn from it. I hope this is going to be an excellent and rewarding experience for you. So what are the course goals? The goal of this course is to train people in understanding, extracting and analyzing social media data. Social media generates a lot of data; millions and millions of records are being generated every day. The goal here is to understand what this data contains, how you can extract it, how you can analyze it and, most importantly, how you can achieve your business goals using this data, such as understanding your customers and your products. So why learn social media analytics? More and more people are now using social media, both personally and professionally. People are using social media every day, everywhere: at work, at home, on the road. They constantly read about what other people are saying, they write about what they think, and they comment on everything that is happening around them. Obviously, a lot of people are interested in knowing what people are saying about them. Companies are interested in knowing what their customers or potential customers are saying about them or the industry, and what their employees are saying about their companies. There is a lot of information people are putting on the social web, and a lot of people want to know what it is. The importance of social media analytics is growing exponentially inside every organization. All organizations today want to invest in social media analytics. They want to get more and more social media data. They want to understand their customers better. More and more products are today being built for social media analytics.
Existing analytics products are adding social media analytics into their product offerings, and this is a field that is going to grow in importance day by day. What that means is that for you, as an analytics professional or even a computer science professional, knowing how to do social media analytics is going to be a great skill. It is a great skill to add to your resume, and it is becoming almost a mandatory skill if you are going to be an analytics professional. So learning it is going to really help you in your career. What do you learn by taking this course? You understand first what social media data actually contains. You learn how to extract data from various interfaces like Twitter, Facebook and Google. You learn how to perform various kinds of analytics using social media data, from frequency to action, text and link analytics. And, of course, you use Python modules for your data analytics work. Python modules are pretty easy to use to extract data, look at it and analyze it. You also prepare your social media data for advanced machine learning purposes. As for the structure of the course, you start with understanding social media concepts, even though you might already be pretty familiar with them. We look at what kind of data social media contains, how you can extract data from various sources like Twitter, Facebook and Google Plus, and how you can transform the data coming from these sources into formats that are suitable for analytics. There are multiple use cases, from frequency analysis to link analysis to frequent pattern mining. Then, finally, we are also going to be looking at clustering and classification, which are machine learning techniques. As for things that we are not covering: Python basics. We expect you to be already familiar with Python and to know how to use any kind of Python IDE. We will be using Anaconda and Spyder in this particular course.
We do not talk about persisting social media data in various databases, because that is like persisting any other data. There is no difference in terms of how you will persist social media data. And we are not going to focus on advanced or detailed machine learning. Guidelines to students: if you have questions, doubts or concerns, please send us a private message, and we will be happy to respond pretty quickly, or post a discussion question. We are constantly improving our courses, so any kind of feedback is welcome. Please provide feedback through private messages or emails. And at the end of the course, if you like it, please give us a positive review. We want to make a mention of the relationship of this course to other V2 Maestros courses. At V2 Maestros we are putting out a lot of courses around the field of data science and analytics, about technologies, processes and techniques. We try to make our courses as self-sufficient as possible. That means there could be repeating content between our courses. That is by intention, because we do not want to be unnecessarily pointing you to buy 10 different courses to achieve one goal, which is to learn about one particular offering. At the same time, that does not mean that we create one master course that runs 400 hours covering everything. So this is the course we have for now. We hope this course helps you in your career. Best of luck, and I hope this is going to be a great, rewarding experience for you. Thank you.
2. Social Media Data: Hi. Welcome to this lecture about social media content. So what kind of content do you have on social media? You would have been using social media on a regular basis, and you know what you usually deal with. But there are certain things you pay attention to, and there are certain things you do not pay attention to, which are more important when it comes to data mining.
The first kind of data that you have is text data, that is, the content. Text content covers all the tweets, blogs, posts, messages and comments that people write on the web. This is one of the main sets of data that you get from social media, and you want to mine this text to understand what kind of sentiment is there in the text, what kind of feelings people express, what kind of products they reference, and so on. Related to the content are the attributes of the content. Who is the creator of the content? What is the name or the demographics of the person, like their age group, gender and education? Where are they from: which country, which region? What is the date and time at which they posted the message? What are the various hashtags, URLs and references that they use as part of their messages? Hashtags and references are very important because they let you know what kind of things people are tweeting about, and they make it easy for you to collect all those tweets and messages together. Then, who are the consumers of the tweet? Who are the people reading, retweeting, liking and sharing it? That lets you understand who responds to these messages on the Internet. Then comes connections data: who is connected to whom. This starts with the follower paradigm used by Twitter: who is following a celebrity or a specific company, and so on. Then there are friends and acquaintances in networks like Facebook and Google Plus. Who is linked to whom? Are they family, part of a business circle, or more like friends? Are they from the same college? How are they related to each other, and how do they form circles? This is the information you typically use to understand who else might be friends of these people. Then there are business circles.
This is coming from things like LinkedIn, where people are related to one another because they work in the same company, or they work in the same business domain, or they have similar business interests, and so on. And, of course, there is who is sharing and liking whose content, who usually retweets whose tweets, who is responding to whose tweets, and so on. There is also record data, that is, the attributes of the people, like age, education, current designation, what kind of interests they have, what kind of qualifications they have, which college they went to, and so on. Then, finally, there is media data. Even though you would typically not mine the media itself, that is, try to understand photos, videos and the like, you still look at it to understand its attributes. But people have now started mining photos and videos too, to make sense of what is in the photo or the video, or to identify content that is not suitable for everyone. And, of course, they try to understand what people are trying to post. Also important in media data are the attributes of the data: who is posting, what they are posting about, what the associated comments are, who is sharing it, who is liking it, and so on. So these are all the types of data that you find in social media, and this is what you want to mine to generate business insights. You will see more of that in the following lectures. Thank you.
3. Social Media Applications: Hi. Welcome to this lecture on applications for social media. This is your instructor, Kumaran. So what kind of applications do people develop using social media data? The first suite of applications is around customer management. All companies have customers, and they always want to hear what their customers say about their products and services.
The typical method they used before was conducting surveys and polls. But surveys and polls are very costly, they do not cover 100% of the customers, and customers may decline to participate in them. Here is an alternative way by which customers express their opinions very freely, and that is through social media. All businesses today are very interested in what exactly their customers are saying about them on social media, because whatever these customers are saying, there are other people, other customers or potential customers, who are listening to or reading it. So companies are really interested in and concerned about customer opinions, especially those spelled out on social media, and what customers say about their products, services and customer experience. Today it is not uncommon for a person to have a bad customer experience: let's say they contact the company about some product issue, they do not get good support, and they immediately go on social media and start complaining about how bad the support is. That definitely affects the company's name and reputation, and it might actually put off some potential customers. So companies are very sensitive to what people are saying about them on social media. They want to mine this data as quickly as possible, and then go and contact these customers and try to remedy the situation. Contact centers have also evolved today to include social media. Contact centers typically used to handle only phone calls. Now they do chat and email, and they are also doing social media. What contact centers do today is keep looking for people tweeting or posting information about them, and when people do that, the contact centers try to go and respond to these messages.
Say some customer is tweeting something bad about the company. The company's representatives get on social media and reply to the tweet, saying, okay, let us go fix this issue, or whatever. If somebody is giving a good opinion, they acknowledge that. If they see a potential customer who is tweeting something good about the company, they try to go and contact that customer and maybe make some deals with them. So customer management has become very important for companies today, and they do it through social media. The second domain in which social media is becoming very important is marketing. Product launches by companies were typically done through print and television media, but today they are also done through social media. The key thing about social media is that it helps marketers get immediate feedback about their messaging. If you run a print or television commercial, you are not going to get feedback about what customers feel unless you take an independent survey and go look for customers, whereas in social media you get immediate feedback as to what customers are thinking about your messaging. That helps you quickly change the messaging if you want to correct any of your mistakes. Companies also look for the key people among their customers and potential customers who are responding to these messages, and then try to contact them and make some deals with them. So this helps them look at the market: once they put out a message, they see who is responding and in what way, and then maybe go and contact them further, try to establish a relationship and try to sell the products. So marketing is a big area in which social media analytics is playing a big role. Next comes news media. Media today tries to understand public sentiment on events.
Whatever public events are happening, they try to understand public sentiment through social media. Typically, news media went for things like surveys and polls to understand what people think about various happenings. They do, of course, still do that, but today they primarily rely upon what people are saying on social media to understand what the sentiments are about various world happenings, elections or sports. That is pretty easy for them, because they spend less money and gain more information, and this is possible today in real time. For example, if the president is delivering an address to the nation, the news media can look at what is happening on social media and immediately keep giving you real-time analytics as to the overall sentiment of the people about the various messages. The president just said something about national security, and immediately you can see how people are responding to the message: is it positive or negative? They can show you real-time information about what people are thinking. Entertainment media also track celebrities through social media. They follow what celebrities are tweeting and try to analyze what they are saying. So social media is becoming a great source of information for news media. That again means that there are applications that these news media people develop to hook onto the social media platforms, get information and then display it for the people. So what kind of social media applications are typically developed today, and what do they do? They mine social media data in real time or in historical mode, and they extract information about text, connections and media. This is what we are going to be learning in the rest of the course. They try to understand sentiments.
They try to understand what kind of networks exist. They try to identify key actors and contacts in all these messages. More importantly, they want to integrate these applications with their internal applications to build a customer 360 view. Let us say somebody is tweeting about bad service. Now you want to look up your customer database to understand who this customer is. Can I link that tweet to an existing customer? What kind of current problems does he have, what support tickets are open, and why were we not able to solve these support tickets and immediately respond to him? Those are the kind of applications people want to build. One very important thing about these applications is that they all get data from the social media platforms through REST APIs. One good thing about the social media platforms is that almost all of them provide data through a REST API. They all use the same mechanism, which is good, because you do not have to learn 10 different mechanisms to extract data. The content is usually JSON; some providers use XML, but mostly it is JSON. Authentication and authorization are done through a technology called OAuth. These providers offer excellent development support and documentation. The only downside is that they all have rate limits as to how much data you can access and how frequently you can access it. You might want to take a look at the website apigee.com/console, because this console has references to almost all the open REST APIs for all the platforms. You can go there, navigate and look at all these APIs. You can actually play around with the APIs: you can create some OAuth tokens for yourself, and then you can play around and understand what kind of data they give. This is an excellent tool for exploring what is on offer in terms of social media data. So take a look at this one.
So hopefully this lecture has been pretty helpful for you. We will continue dealing with this more in the future lectures. Thank you.
4. Challenges in Developing Social Media applications: Hi, this is your instructor, Kumaran. In this lecture, we will be seeing what kind of special challenges we have in social media analytics. What are the challenges that developers typically face when they are building social media applications? The first thing we need to look at in social media is that there is a lot of unstructured data. If you are an analytics person and have developed enterprise analytics applications before, you have a lot of data that is number based. Any text-based information that you have might also have a corresponding number, like names and IDs, types and IDs, and so on. There is also proper validation that happens at the data entry level, which makes sure that even if you have text data, that text data conforms to certain rules. For example, when you are asking a user to enter his name, you may enforce that the name has to have so many characters and has to have a first name and a last name. Typically, in any text field, you can force the user not to use special characters and the like. But when it comes to social media data, the data is almost all text, and the text may not contain all the relevant information all the time. Everybody is free to write whatever they want. You cannot enforce any kind of standard as to how the text should look. Maybe with something like a blog, the blog might be written in proper English, but in things like tweets and posts, people can write whatever they want, and trying to mine some information out of it is going to be pretty challenging. Any time you want to filter data, the filtering again depends on the hashtags and references used in the text. But people may not use proper hashtags, or may not use hashtags at all.
In that case, it is not possible for you to do correct filtering. Then, of course, the data is going to be multilingual. People tweet in different languages; sometimes, in the same message, tweet or comment, you see multiple languages being used. That is another challenge. And again, the language used is not standard language. If you look at a blog, maybe you will see that people use proper English when they write the blog. But if you are looking at something like a tweet, people are going to use as many shortcuts as possible to fit into the 140-character limit. So you do not always know what they are writing about, and trying to extract meaningful information out of it is going to be very challenging. The other challenge is that the data is incomplete and empty. What I mean by that is that, for example, on Facebook, Twitter or even LinkedIn, there is a place where people can go and put in their profile information, but not everybody is going to fill in the same information, because almost all of those fields are optional. Some people might give their age, and some do not. Some people may give their location, and some do not. Some people may give their educational qualifications, and some do not. If you collect, say, 100 profiles, you cannot expect all the profiles to have all the information. There is a lot of missing information. The information that is provided to you through these platforms is also limited by security and privacy constraints. If 100 people are following a specific account, each of these 100 people might have different security and privacy settings, which may not allow you to get all the data about all the people. So you have to work with what you have. Additionally, these APIs have their own security limits on who can access whose data, and every individual can set their own permissions.
A user can say, okay, you can see my friends, or you can't see my friends, and depending on that, your queries may or may not fail. So there is a lot of additional logic you have to put in to handle these permission issues and denial issues. And finally, of course, the information that is available to you does not conform to an expected format. If you have any kind of proper CRM application or cloud application, data validation happens at the data entry level in terms of what the mandatory fields and optional fields are, what format they should be in, and so on. You enforce proper data using techniques like, for example, giving users drop-downs rather than free-form text. If you want them to fill in a date, you give them a drop-down to select the date, so that the possibility of them giving you invalid dates is zero. Whereas in free-form text like a tweet, people can put a date in any format they want. You do not even know whether they are writing DD/MM or MM/DD. Those kinds of challenges exist, and those are the challenges that developers have to deal with. There is a lot of data cleaning that has to happen. There also has to be a lot of data imputation, which is filling in for the incomplete data, if you are developing a social media application. Then come rate limits. All these platforms are free platforms; anybody can go and use them. But they have established what are called rate limits: when you query these platforms through an API, there is a limit on the number of queries you can run, and there is a limit on the size of the response. So you cannot keep running queries as you wish and bombard the website. There are strict limits on how many queries you can run and what kind of data you can get back.
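One way to live within a rate limit can be sketched in Python as follows. This is only a minimal illustration, not code from the course: the class name `PacedCachedClient`, the `fetch` callable and the `min_interval` parameter are all assumptions made up for the example. It spaces out real calls by a minimum interval and keeps earlier responses in memory so that a repeated query never goes back to the platform.

```python
import time


class PacedCachedClient:
    """Wrap a fetch function with a minimum delay between calls and a cache.

    `fetch`, `min_interval` and this class are illustrative assumptions,
    not part of any real social media SDK.
    """

    def __init__(self, fetch, min_interval=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._fetch = fetch                # callable that performs the real query
        self._min_interval = min_interval  # seconds to leave between real calls
        self._clock = clock
        self._sleep = sleep
        self._last_call = None             # time of the most recent real call
        self._cache = {}                   # query -> previously returned result

    def get(self, query):
        # Serve repeated queries from the cache so they do not count
        # against the rate limit.
        if query in self._cache:
            return self._cache[query]
        # Pace real calls: wait until min_interval has passed since the last one.
        if self._last_call is not None:
            wait = self._min_interval - (self._clock() - self._last_call)
            if wait > 0:
                self._sleep(wait)
        result = self._fetch(query)
        self._last_call = self._clock()
        self._cache[query] = result
        return result
```

In use, `fetch` would be whatever function actually calls the platform, for example `client = PacedCachedClient(my_fetch, min_interval=5.0)` followed by `client.get("some query")`. The clock and sleep functions are injectable only so the pacing logic can be exercised without real waiting.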
Each provider or platform has different API rate limits, and you need to know how to work around them. Developers have to get creative about how to use their available bandwidth and rate limits in a smart manner. How do you pace your queries? How do you cache data, so you do not keep querying for the same data again and again? Those kinds of challenges exist, especially when you are in development mode. You cannot keep querying all the time; you keep getting hit with rate limit exceptions, and then you have to wait for another 15 minutes before you can start querying again. That becomes very painful. So if you are a developer, you want to look at caching and saving data as much as possible, so that you work off that data rather than going to the actual REST API all the time. You can also build some data feed simulators, something similar to the REST API. You can build your own REST API server, or there could be some open source servers available, and you can use them for your development and testing needs rather than going to the actual source. Going to the actual source every time is going to be a painful process, especially when multiple developers are using the same source at about the same time; you are going to be hit like crazy with these rate limits. So this is another pain point that developers have to deal with. Once you go to production, maybe you can pay for getting the data, but I do not know if companies have that luxury. If they are forced to use only data available on a free basis, again, you need to be smart in how you build your algorithms to get only the data that you need. I hope this lecture has been helpful to you. Thank you.
5. REST API overview: Hi. Welcome to this lecture about REST APIs.
When you are trying to get any kind of data from the social media world, you are going to be using REST APIs. REST is something that all social media APIs use. It is not just social media APIs; it is also used by a number of cloud platforms and cloud service providers like Facebook, Amazon or eBay. All of them use REST to provide any kind of data access, whether it is modifying data or accessing data across the web. REST stands for Representational State Transfer, and it is a method of exchanging information between a client and a server. If you are going to be attending interviews, do expect some questions on REST, so be prepared for it. REST is a very simple model where the request says what needs to be done, and the response comes back with whatever the client requested. REST follows a very uniform interface as to how it represents objects and actions. Whenever you are trying to do any kind of activity, you are trying to do something on some object, like a person, a product or a page, and there are actions like create, modify, update and delete. So REST has a very uniform interface for specifying objects and actions. And REST is stateless. What it means to be stateless is that there is no stickiness in the exchange between the client and the server. In other words, one exchange is complete in itself. The client sends all the information the server needs in its request, and the server responds with all the information the client needs in the response. There is no other state that is managed at the server, nothing that needs to be persisted across requests, and no caching of state that needs to go on at the server. No state is maintained: all the information required for the server to execute the request is present as a part of the request.
Nothing needs to be kept on the server side. There is a long history and a long specification on how REST needs to be designed, and some of its attributes are what we are discussing here. REST, of course, is client-server: there is a client that is putting out the request, and there is a server that is responding to the request. REST responses are cacheable, in the sense that it is possible to use cache servers to cache the responses that REST provides. That is how all the platforms on the Internet can scale to accommodate all of these REST requests. Finally, REST is almost always over HTTP. Even though HTTP is not mandatory, almost all implementations of REST today are over HTTP, which means that you basically follow the general HTTP protocol methods like GET and POST in order to build your REST APIs. REST can do the CRUD operations, create, read, update and delete, on various resources, and those are implemented using the GET, POST, DELETE and PUT methods. REST models itself along HTTP lines in terms of what operations it can do. Finally, the resources and actions it needs to work on are identified using URIs. The URI that you provide to REST identifies what objects need to be worked on as a part of the URI itself. If you look at an example of how REST looks, the request looks something like this. It starts with GET, the verb that indicates what action needs to be done. Then there is this long URI. This is the LinkedIn REST API, and here it says what object or resource we are going to be working on. The resource we are going to be working on is people, followed by slash tilde. In the case of LinkedIn, slash tilde means yourself, that is, whoever is the person making the request and the corresponding account.
The request is trying to look at some data related to that account, which is what the field indicates. Then there is this long OAuth authentication token. OAuth is something that we deal with in a separate lecture following this REST discussion. So you look at the OAuth access token: it is a very long token that starts somewhere here and goes all the way to here. We will talk about how to get this access token. Then there are other parameters that you can pass along, like the format. And finally, there is HTTP/1.1 at the end of the request. Most REST APIs respond with a JSON string, so JSON is mostly the standard followed for providing the output. Sometimes they also provide XML. The response here is a JSON object. So the request is for the basic profile for myself, and the response looks like this. Typically, this is what you want to see in a LinkedIn profile, and here you see the same thing as a part of the response: the first name, the headline, the ID, the last name, and then your profile request with the URL for it. So this is how the response of the REST API looks. To continue our discussion about REST, let us go to apigee.com/console. This is a nice URL where you can go and explore the REST APIs provided by various providers. It gives you a nice interface in which you can not only look at what the APIs provide, but also actually simulate them. You can play some actual requests and look at the actual responses. So this is the place where you can go around, play with the data and form your requests. Once you have figured out that this is my request and my response looks OK, you can translate it into your code. If you look at the APIs, here is an example for LinkedIn, and you see a lot of the services here. So I am just in LinkedIn here.
It tells you what the service is. Then you choose the authentication; when you choose that, it is going to go and authenticate itself and create an OAuth key for you. Then here it gives you a list of all the REST API methods that are available to you. So you have things like reading the basic profile data and reading additional profile fields. You can also do some posts: you can share a comment through the REST API, or you can manage some of your company's pages and posts, and stuff like that. So let me go and choose the basic profile one. It starts forming your URL here, and then it can take any additional parameters that you want to pass for this request. Each of the requests has a different set of parameters. It is going to present the parameters; you can fill them out and then finally hit Send. It shows you the request that has been sent, how the request looks, and what the response is. So you can come here and play around with what you want to achieve, and once you've got it working, you can take this and translate it into your actual code. You can look at a lot of other APIs too. For example, there is also Twitter, there is Facebook, there is Reddit. If you look at something like Reddit, for example, it gives you all the Reddit APIs, and you can see what kinds of things you can do: you can get links, look at user information, post a message, post a comment, and work with friends, preferences, and stuff like that. You also have services like Google and Facebook. Take Instagram, for example: they also provide a bunch of APIs, and a lot of them are GETs: get information about users, get information about relationships, and then there are media and comments. So I strongly recommend that you go in here and start playing around with this stuff.
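As a rough sketch of the request shape walked through in this lecture (a verb, a resource URL, and a few query parameters, with a JSON string coming back), the snippet below builds such a URL and parses a sample response. The host `api.example.com`, the parameter name `oauth2_access_token`, the token value, and the sample response body are all placeholders assumed for illustration; they are not real credentials or a confirmed endpoint, and no network call is made.

```python
import json
from urllib.parse import urlencode


def build_rest_url(base_url, resource, params):
    """Join a base URL, a resource path, and query parameters into one request URL."""
    return f"{base_url.rstrip('/')}/{resource.lstrip('/')}?{urlencode(params)}"


# Placeholder host and token: assumptions for illustration only.
url = build_rest_url(
    "https://api.example.com/v1",
    "people/~",  # the resource; "~" stands for the requesting account in the lecture's example
    {"oauth2_access_token": "PLACEHOLDER_TOKEN", "format": "json"},
)
print("GET", url)

# REST APIs typically answer with a JSON string; this body is made up.
sample_response = (
    '{"firstName": "Jane", "lastName": "Doe", '
    '"headline": "Analyst", "id": "abc123"}'
)
profile = json.loads(sample_response)
print(profile["firstName"], profile["lastName"], "-", profile["headline"])
```

Once a request has been firmed up in a console like the one shown in the lecture, the same pieces (resource path and parameters) carry over directly into client code of this kind.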
Once you start exploring, it is going to ask you for authentication. So you basically choose OAuth; once you choose OAuth, you don't have to do any explicit coding. It is going to pick up your current authentication: if you have already authenticated in the browser, for example you have already logged in to Facebook, LinkedIn, and so on, it will use that authentication to generate an access key. We're going to look at all those access keys in the later lectures. It is going to generate the authentication key itself, and then you can play around with the APIs, trying different URLs and different methods, and get things done. This is your sandbox, your playground; you can do all kinds of things here. Once you have firmed up what kinds of things you want to do, you can take this learning and try it in your actual code. So I hope this lecture has been useful to you. Thank you. 6. Oauth Overview: Hi. Welcome to this lecture on OAuth. This is your instructor, Kumaran, here. When you're trying to access data across the public Internet from any of these social media sites, it is very important that security is maintained at all levels. So in order to ensure maximum security in these data exchanges, as well as to ensure that enough functionality is achieved, people have come up with this thing called OAuth. OAuth is the protocol that is followed for authentication and authorization. Typically, websites provide you with user names and passwords: when you go into a browser and access Facebook or Twitter, you log in using your user name and password. But if you're trying to access the same data using a client program, the client program can be very powerful in terms of what kind of damage it can do to the data that is there on your social website.
If somebody hacks your user name and password, they can go and post any kind of information they want on your own Twitter handle or your own Facebook page, and that can cause very damaging issues for your company. So in order to make sure things are secure, people have come up with this open authorization protocol. This open authorization protocol is standard across the Internet; almost everybody follows it these days. It enables applications to obtain authorized access to others' data. In a company, typically the Facebook pages or the Twitter pages of the company are owned by a specific department, maybe an IT department or a marketing department. They have the user name and password to that website, and they control what things are posted or accessed through that login. If you want to develop another application, even within your company, to access the data, you do not want to be using that user name and password as part of your application code to contact these sites and do things, because it comes with a lot of problems. For example, people are supposed to change passwords; if people keep changing passwords every week, then your client application configuration needs to keep changing all the time, and that is a painful thing to do. Sharing passwords with everybody else also creates a problem, because companies have very strict rules on how they secure administration passwords. So this OAuth scheme helps app developers build applications that access the data of other users: you are within the company, but you don't have access to the owner's user name and password, that is, the actual admin user name and password.
The benefit is that you don't have to share your passwords with applications and developers. And of course, it can support web, desktop, and mobile applications; all kinds of applications can be developed to use OAuth. Almost all social websites and cloud services today use OAuth: social media like Twitter, Facebook, LinkedIn, Google, you name it, and cloud applications like Salesforce, Amazon, and PayPal. Everybody uses OAuth. OAuth is the Internet standard, and if you are doing any kind of development these days, it is good for you to know how these standards work. When you look at OAuth, there are various roles involved, and you need to understand what those roles are and who plays them. First is the resource owner. The resource in this case is data, or you can even call it the Twitter handle or the Facebook page. Who owns the data? Typically, it is the marketing department admin or the IT department admin who has the user name and password to your company's account; they are the ones who can authorize anybody else to access the same data. Now, there is a difference when we talk about accessing the data. Anybody in the public can go and look at the messages that are posted on your account or your Twitter handle, but nobody can actually post data. Once you get access to the user name and password, you can actually post, and that is the difference. Second, there is an authorization server that is provided by the web platform, like Facebook or Twitter, whose job is to do authentication and authorization. Authentication is verifying who the user is, and authorization is working out what things they can actually access. Third is the resource server, the server actually providing you the data. It provides data, and it can be, and often will be, different from the authorization server. And finally, there is the client.
The client is the application that you, as a developer, develop and that needs the data. It needs to get the authorization keys from the owner, authenticate, and then finally get access to the resource server and get data. So these are the various roles. So how does the workflow look across all these roles? How do you actually get access to the data? This is a long, complicated workflow, and it is very frustrating as a developer to get these things done. But the reason it is so complex is that people have built it in such a way that it is not possible for other people to hack through the system. It always starts with the owner creating an application. The owner, that is, the administrator for your Twitter account or Facebook account, logs in to Twitter or Facebook and creates an application. If you go to the developer links in Facebook or Twitter (we will see this in the later lectures), you go to the application page and create something called an application. An application is nothing but a third-party application outside, which can get access to the same data. When you create an application, you get what are called access keys. There are a bunch of keys that are created, not just one. They are called the consumer key, consumer secret, OAuth token, and OAuth token secret. So multiple keys are created, or there could be a simpler scheme called an API key; some sites use an API key instead, so it depends on the website you're using. So these keys are created. The owner then takes these keys and passes them on to the developer. They may pass them on through an email or however they want to pass them. Then the developer codes the client to use these access keys. These access keys are typically kept in a configuration file or somewhere similar where they are stored securely.
The developer builds code that reads these access keys and then tries to connect to the service that provides the data. So the client code, the application you built, will first authenticate with the authorization server using these access keys. It goes to the authorization server and says, here are my keys. Then the authorization server will give you back an access token. So you use these keys to get an access token. An access token is a token for temporary access, so you can use an access token for a specified amount of time. There is always a time limit, in hours or whatever, during which the access token is active. So when you go and authenticate, it gives you back an access token, and that access token is going to be active only for the specified time. If the access token expires, you have to get a new access token. Once you get an access token, you can use it to access any resources on the resource server. So if you want to get some tweet information, you use the access token, go to Twitter's resource server, and get data. The access token is going to be live only for a fixed amount of time, and until then you can keep going to the resource server and getting whatever data you need. If the access token expires, you go to the authorization server again, get a new access token, and continue your operation. The resource server's job, of course, is to validate the access token with the authorization server; there is internal communication between the resource server and the authorization server to make sure the token you are passing is actually valid. It is a pretty laborious workflow, as you can see; it involves multiple people and multiple applications. But that's how it is. This is one of the painful parts that you will go through in trying to create this application. One of the most difficult things I have personally encountered, and I want to tell you this up front, is the following.
All these providers keep changing the scheme all the time. So when you're looking at any reference document on this, even source code on the web, if the reference is somewhat old, you'll most probably find that it no longer applies, because they change the web pages, the schemes, and so on very frequently, to make sure things are very secure. So this can be a very frustrating thing to do, this first-time establishment of authentication and authorization and trying to get an access token, so be ready for it. This is one of the most painful parts in the social media analytics world. I hope this lecture has been helpful for you. Thank you. 7. Twitter API Overview: Hi. Welcome to this lecture on Twitter data mining. This is your instructor, Kumaran. Twitter is a very popular microblogging site, which everybody knows, and I think you are very familiar with what Twitter is, what it is capable of, and what kinds of things get done on Twitter. Let's just do a quick walk-through of what is available, specifically from a data mining point of view. So let's start with Twitter data. What is Twitter? Twitter is a microblogging site that allows users to publish events, comments, likes, and dislikes; they can express themselves very easily. Tweets have a 140-character limit. That's one of the things about Twitter: it limits you to 140 characters for every tweet, and that forces people to be very creative in how they express themselves. One of the biggest advantages of Twitter, especially from a data mining perspective, is that it provides the ability for others to see other people's tweets without any permissions, which means you can go and look at what anybody else is tweeting. There is no security or privacy restriction in terms of who can see whose tweets.
There is not a lot of privacy, especially when compared to things like Facebook and LinkedIn, which means mining this data is pretty easy; you are not limited by a lot of security concerns. There are, of course, shares and retweets. People retweet tweets and also share tweets, and shares and retweets indicate how the tweet has been accepted by other people. Twitter also gives you the ability to follow interesting persons and entities: companies, cartoon characters, celebrities, sports people, and so on. People follow them, and whenever those people tweet, you can see what they're saying. Twitter is all about asymmetric relationships, in that you do not need permission to follow someone. If you look at Facebook, typically there is a mutual authorization process as to who can see whom, whereas in Twitter it's easy to follow somebody without seeking their permission. That allows data to be disseminated across the world. When it comes to Twitter data, what kind of data is available from Twitter? You have users, the people who have accounts on Twitter. They are usually identified by the @ symbol followed by the Twitter handle. Then they have timelines. Whenever you go to the home page of any user, what you see is a timeline: a timeline of all the tweets that they made and those in which they have been mentioned. So this is the first and foremost information: you can take a user and look at that user's timeline. And then, of course, there are the tweets. Tweets contain a lot of information. To begin with, there are the 140 characters, of course, but in a tweet you can also have user mentions. Using an @ symbol, you can include another user in the tweet, and what this tells you is that there is some kind of relationship between this user and the other users.
Then there are, of course, hashtags. People use hashtags to tag keywords and popular events or new events, whatever they want to tag. Hashtags are another way by which tweets can be filtered, to look at all tweets corresponding to the same event or happening. Tweets can, of course, include URLs that point to whatever they want to link to. There is media that can be included: pictures, videos, et cetera. And of course, there are retweets: people retweet a tweet to express their comments on it. By looking at the retweets, you can understand how people are responding to the original tweet. So there's a lot of data that comes as part of Twitter. And then, of course, there are timelines, as we talked about, timelines of not only the user but also other users and entities. There are friends and followers: friends are the people whom this Twitter user follows, and followers are the people who are following this user. There are also direct messages between two users of Twitter, and of course you can have lists and favorites. So there is a lot of data there. When you mine Twitter data, you will get into all of these data elements. Now, moving on to the Twitter API, the Twitter REST API. You can access the Twitter REST API documentation at this URL, dev.twitter.com/rest/public. This gives you the documentation for the API. This API is, of course, REST based. It uses OAuth for authorization, and the output is JSON. It allows you to search: you can search on a user, search on a specific hashtag, or search on any kind of string. It gives you the ability to do GET, POST, and update, which means that you can get data out of the API, and you can also post data, that is, create tweets and post them to Twitter. But that's not what we're going to be doing in this course.
We're more focused on the GET part of it. And of course, there are rate limits. Rate limits tell you how much data you can extract and at what frequency. You need to be very cognizant of the rate limits, because you need to build your applications in such a way that you do not get hit by them. So there is some careful planning required in terms of how you build your applications if you want to avoid these rate limits. Now the workflow: how do you create a Twitter-based client application? First, you go and create an application on Twitter itself. This is the role of the owner: the owner goes and creates an application at apps.twitter.com. In that application there are application settings and, of course, keys and tokens. Once you create an application, it gives you a consumer key, a consumer secret, an access token, and an access token secret. So these four pieces of information are given to you once you create an application. This is the information you copy over from this UI to your client application, and within your client application you will use these keys to contact Twitter programmatically. And of course, there are permissions. Permissions are pretty open in Twitter; there aren't any really specific permission settings. You can, of course, access your own information, other users' messages, and other users' followers; a lot of things are open. Now let's start looking at how the actual process works. We are in a browser now, and we go to this website, apps.twitter.com. This is where you go and create and manage your apps. You have this Create New App button here, which you can click to create an app. I've already created one, and I have called it Spark Data Science. So let me go in there and see what I have. So here is the information about my application.
In the details, I needed to give a number of these URLs to create this one. Now, when I go to Keys and Access Tokens, it has created the consumer key and consumer secret for you. So this is where I can go and look at my consumer key and consumer secret. Then there is this access token. By default, it does not create an access token; you need to hit a button to create it. There are also buttons to regenerate the consumer key and secret and to regenerate a different token for you. You can regenerate your tokens any number of times. One of the most important things is that you can recreate the consumer key as well as your access token any number of times. So any time you give access to a developer and they go and develop an application, you can come back here and regenerate the keys, so the old keys are no longer valid. That way you can make sure that those people do not continue to have access. This is one of the great things about having this key-based authentication. Once you have these keys, the consumer key and consumer secret as well as the access token and access token secret, you faithfully copy them over to your program, your client application, and you continue from there. Now, in order to look at the API documentation, you go to dev.twitter.com, to the REST section here. Here you have all the APIs, what kinds of methods there are, and so on. First, let's look at the API rate limits chart, which tells you what kinds of rate limits apply. You see that for every type of request, there is a rate limit on how many requests you can make in a 15-minute window. So you want to make sure that you are always within these limits; otherwise it's going to complain that you are beyond the limits.
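One simple way to stay inside a per-window limit is to space requests evenly. The following is a minimal sketch under stated assumptions, not Twitter's own mechanism: the limit of 15 requests per window is a made-up example figure (the real number differs per request type and should come from the rate limits chart), and `RequestPacer` is a hypothetical helper name.

```python
import time

WINDOW_SECONDS = 15 * 60  # the 15-minute window described in the lecture


def seconds_between_requests(limit_per_window, window_seconds=WINDOW_SECONDS):
    """Even spacing that keeps the request count within the window limit."""
    return window_seconds / limit_per_window


class RequestPacer:
    """Hypothetical helper: call wait() before every API request."""

    def __init__(self, limit_per_window, clock=time.monotonic, sleep=time.sleep):
        self.interval = seconds_between_requests(limit_per_window)
        self.clock = clock      # injectable so the pacing can be checked without waiting
        self.sleep = sleep
        self.next_allowed = None

    def wait(self):
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            # Too early: sleep until the next slot opens.
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval


# Assumed example limit of 15 requests per window: one request per minute.
print(seconds_between_requests(15))
```

Even spacing is conservative: it never bursts, so it wastes some of the allowance, but it is easy to reason about. Handling the "limit exceeded" error itself is still needed as a fallback.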
So inside your program, you need to make sure of two things. First, you pace your program to make only so many requests. Second, when the error occurs saying that you have exceeded your limits, you need to handle that error message. That is something you take care of within your code. Then you look at the APIs themselves. Here there is a nice, handy list of the REST requests. Look at this user timeline. This is where you get a user's timeline. The request to get the data is statuses/user_timeline. Here it shows you what rate limits are there and what parameters there are for that particular request, and it also shows you how the output is going to look. The output, as you can see, is a JSON object, and it captures almost all the information that you can find. For example, it has information about when the account was created. Then it has things like what hashtags and mentions are used, and what your friends and followers counts are. It has a lot of information, so you might want to take a very patient look at what kind of information is available and also understand how it is structured, in terms of what comes inside what. As you can see, there's a lot of stuff: the friends count, the followers count, and so on. You can also look at the timelines and at all the tweets that are in there. You can look at a user timeline, and you can look at the home timeline. The home timeline is my own timeline; it has similar information, but it is only for me. Then I can also get things like my friends and followers. I can get my follower IDs, for example, and then go and query them. We will see more of this in the examples part. There is, of course, the API console, at dev.twitter.com/rest/tools/console.
You can actually go in and play with the requests. This is again powered by Apigee. First of all, I need to do some authentication. If I don't have authentication, I click on this button and it will create the authentication for you on the fly in the tool itself. Then I can go and do things like get my home timeline, for example, and hit Send, and it is going to come back and show you: this is the request I sent and this is the response I got. You see the information about the tweets coming up here. There is a lot of stuff here, and it includes all the pictures, the height and width, all kinds of things about the pictures. And of course, it has all my tweets in there. For every tweet, it gives you a lot of information, so you just need to be patient about how this JSON looks, and you should be able to extract things from that JSON string. So this is the kind of help available to you online. You can patiently go through each of these requests, decide which kind of information you are interested in, and then work on it. Thank you. 8. Twitter API Usage Examples for Python: Hi. Welcome to these code examples on how to use the Twitter API to do your programming on the client side. This code sample is available to you as part of your resource bundle. So how do you go about getting Twitter data? Before you start, you want to go and register the application at apps.twitter.com and get your various API keys. Then you want to install the Twitter Python library, which you do by running pip install twitter. Once that is done, you are ready to get going. So let's start by setting up your home directory, so that you know where your programs start from. Then we go and get the consumer key, consumer secret, OAuth token, and OAuth token secret from your application.
If you try to use the keys that come as part of this file, it is not going to work, because I am going to go and change them. I recommend that you go and create your own application and get your own keys for this one. So let's go and initialize these key values. Then let's import twitter and json. So this is set up now. Next, you go and create an OAuth object and use that to create a Twitter object. For that, you use twitter.oauth.OAuth and pass it all the keys that you just created to create an OAuth object. Then you go and create your Twitter object, based on the authorization info that you just provided. Now let's go and do some querying. Let's first query for my own timeline: twitter_api.statuses.home_timeline is the method. It is actually pretty easy to find out what the methods are in this library, because they are modeled very closely on how the APIs themselves look. In this case it is statuses.home_timeline. If you go to the API reference, you see that it has statuses/home_timeline. So you basically take the path, statuses and then home_timeline, put a dot in the middle, and you get the method. The parameters allowed for the method are the same ones you will find in the API documentation: count, since_id, max_id. These are all the parameters that are expected, and the exact names are used in the library. So you can just look at the API documentation and figure out what this library does. So you go and call twitter_api.statuses.home_timeline, and that gives you your home timeline, limited to the number of records you ask for with count. You can then print the timeline that came back using the json.dumps method with an indent of 3.
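The setup described above can be sketched roughly as follows. This is an illustration rather than the course's own file: the helper name `make_twitter_api` and the sample timeline are made up, the key values are never filled in, and the live call is left commented out because it needs real keys and network access. The `twitter` package is imported lazily so the rest of the sketch runs without it installed.

```python
import json


def make_twitter_api(consumer_key, consumer_secret, oauth_token, oauth_token_secret):
    """Build an authenticated client using the `twitter` package (pip install twitter)."""
    import twitter  # lazy import: only needed when real keys are available

    auth = twitter.oauth.OAuth(
        oauth_token, oauth_token_secret, consumer_key, consumer_secret
    )
    return twitter.Twitter(auth=auth)


# With real keys (not run here), the REST path statuses/home_timeline
# maps to a dotted method call:
#   twitter_api = make_twitter_api(CONSUMER_KEY, CONSUMER_SECRET,
#                                  OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
#   timeline = twitter_api.statuses.home_timeline(count=5)

# Made-up stand-in for a returned timeline, to show the pretty-printing step.
sample_timeline = [
    {"id": 1, "text": "First sample tweet", "user": {"screen_name": "example_user"}},
    {"id": 2, "text": "Second sample tweet", "user": {"screen_name": "example_user"}},
]
pretty = json.dumps(sample_timeline, indent=3)
print(pretty)
```

Printing with an indent makes the nesting levels of the JSON visible, which helps when deciding which fields to extract.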
That is going to print out a lot of data for you, as you can see on the right side. There's a lot of data here. You need some patience walking through this whole JSON document to understand what fields are coming through. There are multiple levels in the JSON document, as you can see, so you walk through them, understand what information you want to extract, and then do the rest. What I'm going to do here is just walk through my timeline and print all the tweets. So I'm walking through the timeline and printing the tweet's text field; that is going to print all the tweets in my timeline. There are only two tweets there, and it is printing them out. These are my tweets on my timeline. Now, how do we get tweets that are on somebody else's timeline? By using twitter_api.statuses.user_timeline. Again, the parameters supported here are the same ones you will find in the API reference. If you go there and look at user_timeline, you see the parameters that are supported in statuses/user_timeline, and the method names are very similar: there it is statuses, a slash, and user_timeline, and I have translated it equivalently with a dot. If you look at user_timeline, you can either pass the user_id, which is a number, or you can pass the screen_name, which is nothing but the Twitter handle. There are other parameters too, like count, the maximum number of tweets you want, and a bunch of others; a lot of them are optional. And of course, you can look at how the result looks here. You really need to look at the results to understand how to extract data out of the result set. Suppose you want just the coordinates: you look at the output, find coordinates, and pull it out.
But there are certain things that are at multiple levels. For example, URLs: urls is a list, and within that list each entry has the URL under url. So you need to walk through this JSON structure to get the data you want. Here we're going to look at the user timeline for Fox Sports, to see what kinds of messages are coming from Fox Sports. We're going to use a count of 5, so we're only getting the latest five tweets, and then we're just going to print the output here. It prints a lot of stuff, and you see that a lot of information comes through: the time zone, when the tweet was created, who created it, the screen name, the favorites count, how many people are following, and so on. All kinds of information is there. Now we're going to walk through these tweets. We're going to take every tweet, print the tweet text, and for that tweet look at the user mentions. A user mention is when, in the tweet, they use an @ and mention a user. So for every tweet we print the tweet text, then take the mentions list in entities, user_mentions, walk through it, and print the screen name of each mention. Let's run this. All right, so you see these are the various tweets. Some tweets don't have any mentions. This particular tweet has a few mentions, and their screen names have been printed. Of course, you can make this printing more user friendly by putting some nice formatting on it. Then let's go on to searching Twitter.
So we looked at my own timeline and at somebody else's timeline. How do we search Twitter? There is one more method, search.tweets, and it takes q, which stands for query, and you can give it a query string. It is the same search that you would do in Twitter in the top right corner. I'm going to be searching for all tweets with the hashtag Messi in them, and I'm going to get the latest five. I'm limiting the number just to make sure the examples do not flood the screen. Once I get this, I'm also going to write it to a file. This is how you get tweets and write them to a file to persist them. I would strongly recommend doing this, because you are going to hit the rate limits pretty quickly. So once you run any kind of query, save the result to a file; then any time later when you are doing some programming, you can load from the file and continue your experimentation, rather than having to go to Twitter every time to get the data and exhaust your quota. So I'm going to run the search, print out the content, and also save it to this file. Once you have saved the file, you can open it and patiently examine what is coming out. You see that there is search_metadata here about when the search happened, and then statuses, and all the fields that come under it: the tweet text, the user mentions, who retweeted it, the number of followers, and all kinds of things, including the attributes of the people who retweeted it. So now I've got this query result for Messi, and I'm just going to walk through it and print the tweet text. So this is our search for Messi, and this is what I get.
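The save-then-reload habit recommended above might look like the sketch below. The `sample_results` dictionary is a made-up stand-in for what a call such as `twitter_api.search.tweets(q="#Messi", count=5)` would return; only the two top-level keys mentioned in the lecture, `search_metadata` and `statuses`, are assumed. The file name and the helper names `save_results` and `load_results` are also illustrative.

```python
import json
import os
import tempfile


def save_results(results, path):
    """Persist a search response so later experiments need no new API call."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(results, handle, indent=3)


def load_results(path):
    """Read a previously saved search response back into Python objects."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Made-up stand-in for a search response; real responses carry many more fields.
sample_results = {
    "search_metadata": {"query": "%23Messi", "count": 5},
    "statuses": [
        {"text": "Sample tweet one #Messi"},
        {"text": "Sample tweet two #Messi"},
    ],
}

path = os.path.join(tempfile.gettempdir(), "search_results_sample.json")
save_results(sample_results, path)

# Later sessions can start from the file instead of hitting the API again.
reloaded = load_results(path)
for status in reloaded["statuses"]:
    print(status["text"])
```

Working from the saved file keeps repeated experiments from counting against the rate limits.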
Wherever there is the hashtag Messi, it comes up here. The next example is how to get the list of followers for a given Twitter handle. That is another Twitter API call, followers.list, and the screen name is Fox Sports. I am going to get the followers for this screen name, up to a maximum of ten. Of course, I don't want to put a lot of load on this, especially when I am just running an example. It is pretty similar again, and you can go and execute it. Now, what I am going to do with the followers list is nested querying. I go to the list of followers, take every follower, and for that follower get a list of who their followers are. So it is multi-level follower tracking: I chain through the list of followers, take every follower, and find their followers. And of course I am limiting myself to ten for the main list and fewer for the sub list, so the output is not flooded. So that is what I am going to do: get the followers list and print each screen name and friends count. Then I call the same followers list again, but for the screen name I use the follower screen name that I got from the earlier query, and that prints the second-level followers. So let's go and execute this one. As you can see, it is giving me a message that the rate limit was exceeded. This happens all the time, because you exceed your limits very quickly when you keep going back and hitting the same call. So we continue after some time and try to execute the same code. Now you see it is starting to print the first-level queries and the second-level queries. And of course, after some time it hits the limits again.
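One way to cope with the rate limit errors seen above is to wrap each call in a retry that pauses before trying again. The sketch below is an illustration only: RateLimitError is a hypothetical stand-in for whatever exception the client library actually raises, flaky_followers simulates a call that fails twice, and real pauses would likely need to be minutes rather than seconds.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error a client raises when limits are hit."""


def call_with_retry(fetch, max_attempts=3, pause_seconds=1.0, sleep=time.sleep):
    """Call fetch(); on a rate limit error, pause and try again.

    The pause grows with each attempt. After max_attempts failures the
    last error is re-raised so the caller can stop cleanly.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except RateLimitError:
            if attempt == max_attempts:
                raise
            sleep(pause_seconds * attempt)


# Simulated follower lookup that fails twice before succeeding.
calls = {"count": 0}


def flaky_followers():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimitError("Rate limit exceeded")
    return ["follower_a", "follower_b"]


# Record the requested pauses instead of really sleeping, to keep the demo fast.
waits = []
followers = call_with_retry(flaky_followers, sleep=waits.append)
print(followers, waits)
```

Passing the sleep function in as a parameter makes the pause easy to replace in tests, while the default still uses time.sleep.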
So this is the issue I was telling you about: you keep hitting these request limits, and then finally your keys might expire, and stuff like that. So you want to watch out for that, and also program your own code so that when rate limit errors happen, you handle them in a graceful manner and are able to continue later: pause your code for some time and then continue the work later, and so on. Thank you. I hope this lecture has been helpful. 9. Google Plus API overview: Okay. Welcome to this lecture on the Google Plus data API. This is your instructor, Kumaran. Google, as you know, has been a really revolutionary company, and it has tried to emulate everything that Twitter, Facebook and LinkedIn do in its Google Plus offering. So what is Google Plus? Google Plus is an online social network service that is trying to be a replacement for Facebook and Twitter. It again connects people, and it has symmetric relationships, very much like Facebook. It has messages and media. You can put in messages, others can do what is called plus-oning, they can comment on them, and then they can reshare them. The relationships in Google Plus are symmetric in the sense that each person has to agree to the other person looking at their data. They give permissions to people as to what they can look at in their account, what kind of resources and friends they can see. You can, of course, create circles like friends, family and acquaintances, and then have messaging going on within that subgroup. Google Plus is a lot more open in terms of its API and its availability. It is not very limited in its privacy settings the way LinkedIn is. Google Plus is, of course, a little more open than the other platforms.
Twitter is the most open platform. Google Plus is slightly more open than the other ones, but not like Twitter. The data that is available in Google Plus is classified into four types. There are people. People are the ones who are linked to a Google account, and they have attributes like age, demographics, geography, the location they are in, the time zone they are in, and stuff like that. Activities are the things that people do: they post data, they share information, and they share videos, photos and other activities. They do what is called a plus one: when somebody else posts a message, you can plus-one it, which takes that message and publishes it to your own group, and it kind of increases the likes. And then finally, you can also put in comments: when somebody posts something, you can comment on what you feel about that specific post or activity. So these are the things available in Google Plus as a data set. As for the Google Plus API, you can go to this website and look at what the API contains. It is console.developers.google.com/apis/api/plus/overview. Google has a number of web APIs, and one of them is the Google Plus API, from where you can extract information about what other people are doing, the activities of these people, and stuff like that. It again uses OAuth, but it also has a simple API key. Facebook is a very convoluted system, whereas Google Plus is a very simple system: you just create one key, and that's it, you can keep using that key. Or, if you want more security, you can go and create OAuth keys, the four different keys, like Twitter does.
Or you can create just one key and be done with it. It again supports JSON, like any other social media service. It supports searches: you can search on people, you can search on activities, and you can search on hashtags. Privacy is pretty open compared to Facebook, so you can actually look at a lot more information about others. Rate limits are there, but they are less strict. Google Plus tends to encourage people to use the API to get and do a lot of stuff, so the rate limits are not as strict as they are in Facebook or even LinkedIn. For documentation, you can go to this website, https://developers.google.com/+/web/api/rest. The web documentation there is practical documentation about what the API is capable of: what the various methods are, what the input parameters are, and what the output JSON looks like. Code samples in various programming languages are also available for you to look at. Now the workflow: how do you set up a Google Plus application? You first go and create an application at this website, console.developers.google.com/apis. You need to remember to enable the Google Plus API, otherwise it is not going to work. So do remember to enable the Google Plus API in the same website. You create an API key of the browser key type, and then you can use the API key in your application and start coding your application to get data from Google Plus. Continuing the discussion on the Google Plus API library, this is the UI you go to in order to understand what Google Plus offers in terms of its libraries: console.developers.google.com/apis. If you go there, it shows you all the APIs that Google has and which APIs have been enabled for you. The most important thing is to make sure the Google Plus API is enabled for you. If it is not, go to the main page.
Find the Google Plus API, click it, and enable it. Once you have enabled it, the button reads Disable; it would have been showing Enable before. Then you can look at its usage in the past 30 days, and at what kind of quotas you have in terms of how many queries you can do, and stuff like that. And then how do you create an app? At the top you have Create New Project, and creating a new project is exactly how you create an app. In this case, I already have an app called Social Media Analytics. Once you open that one, it shows you this page, and you can go into the credentials here and see that I have already created an API key. You can create credentials by going to the Create credentials link and creating an API key for yourself. Once you create that API key, it shows up here, and this is the key that you transfer into your code and start using as part of your client code. As for the API reference, this is the API reference page. You can come in here and look at everything the API does, and it has pretty detailed information. If you click on People here, it shows you the various methods available. For the people search method, it gives you information about what parameters it can take, what the request body is, and what the response looks like. It also has a lot of code information in Java, PHP, Python, Ruby and other languages, so all kinds of code examples are provided for you here. This is a great way for you to explore, and you can execute the query here itself and look at the results. So this is a great place to explore more of the UI and understand its capabilities. Thank you. 10. Google Plus API usage Examples for Python: Hey, welcome to this lecture about how to use the Google Plus APIs.
So in order to use the Google Plus APIs, you need to go to developers.google.com, actually console.developers.google.com, and inside that you have to create your own project. The project is the equivalent of the application here. In the projects drop-down you create a project, and it asks you for some information such as the project name, an email address and so on, and then you create the project. Once you have created a project, you can go into the credentials screen by clicking on the Credentials link here. As we know, you can have different kinds of credentials: you can have an API key, or you can create OAuth keys. In this case I have created an API key here. You can go to Create credentials and pick API key. It will ask you to create a key, and you can create a browser key. The page helps you create one. I have already created a key here. As you can see, I created a browser key, and this is the key that I will be copying over and using as part of my Python code. Before we get there, we also need to go to the overview page, which lists all the different APIs that Google supports, and make sure that the Google Plus API is enabled. If you look at the list of APIs here, you will find Google Plus; you click on it and enable it. Once you enable it, it shows up under your enabled APIs. If it is not enabled, the page gives you an Enable button; otherwise it gives you a Disable button. It needs to be enabled in order for your API calls to work. So once you have the credentials set up and the Google Plus API enabled, you are set. Now, what does the API itself look like? You go to the API reference at this URL, developers.google.com/+/web/api/rest.
This URL is also in the help section of the code file that you have. Once you get there, you see what kinds of APIs it supports. It supports APIs regarding people, so I can query about people. The methods it supports are get, which gets a specific person by user ID; search, which searches for people; and list by activity. The other thing you can do is work with activities, which are what people post on the Google Plus website. Here you can list activities by a specific user ID, get a specific activity, or search for activities. You can click on one of these and drill down, and it shows you the actual API: what the mandatory parameters are and what the optional parameters of a query are. The mandatory parameter is the query term, and then there is a bunch of optional parameters. The ones you would bother about are maxResults, which is how many results you want returned at most; you don't want a flood if you are just playing around. And there is something called a page token. What happens is that if the result set is pretty large, it is only going to give you something like the top 20 results, and then you have to use the page token that comes with that result in the next query, to ask for results starting from that page token. We will have some examples of how that works later, when we look at the use cases. The page then tells you how the request is going to look and how the response is going to look: what the JSON structure is going to be, and what the attributes and values are. Finally, it also has code samples in Java, PHP, Python, Ruby, Dart, .NET, Go, JavaScript: a lot of code samples on how you can use it.
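The page-token loop described above can be sketched without touching the network. In this illustration, fetch_page is a hypothetical callable standing in for one API query, FAKE_PAGES is made-up data, and the response is assumed to carry its results under "items" and the token for the following page under "nextPageToken".

```python
def fetch_all(fetch_page, max_pages=5):
    """Collect items across pages by following the page token.

    fetch_page(page_token) must return a dict with an "items" list and,
    when more results exist, a "nextPageToken" value. max_pages caps the
    number of requests so a large result set cannot exhaust the quota.
    """
    items = []
    token = None  # No token on the first request.
    for _ in range(max_pages):
        page = fetch_page(token)
        items.extend(page.get("items", []))
        token = page.get("nextPageToken")
        if not token:
            break  # Last page reached.
    return items


# Hypothetical in-memory pages standing in for real API responses.
FAKE_PAGES = {
    None: {"items": [{"id": "1"}, {"id": "2"}], "nextPageToken": "page-2"},
    "page-2": {"items": [{"id": "3"}]},
}


def fake_fetch_page(page_token):
    return FAKE_PAGES[page_token]


all_items = fetch_all(fake_fetch_page)
print([item["id"] for item in all_items])
```

With a real client, fetch_page would wrap the search call and pass the token as the page token parameter of the next query.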
So this is pretty impressive in terms of the help it provides, and it is also pretty simple to use. We talked about people and activities. Once you have activities, you can get comments for those activities. You can list all the comments for a specific activity ID, and similarly you can get a specific comment. So this is the API definition, pretty simple and straightforward, and it is also very simple and straightforward to use as part of your code. Now we will jump to the code samples. These are the Google Plus API usage examples, and this is the code for us. The first thing you do is set your home directory to the place where you downloaded this code, as we have done already. Then you copy the API key over here into this api_key variable. Next come the imports. For this you need to install the Google API client for Python, which you can do with pip install google-api-python-client. That installs the Google API Python client library for you. Once you have done that, you import all of these. And then you have the API key, the one you got from the website. You store it in this variable, and then you create a service object. The service object is created by apiclient.discovery.build. You say what kind of service it is, which is the plus service, and the version, and you need to give it an HTTP object to work with, which comes from the httplib2 library. And this is the developer key. So we just run it and initialize it. Once you do this, here come the rest of the query examples, and as I was telling you, it is pretty straightforward. The first thing I am going to do is search for people whose name has Kumaran in it. The people feed comes from a call to service.people().search.
If you look at the API definition, it is pretty straightforward. As you see, it is the service, then people, and then search, so you are asking for people search. In the same way we have people get and people list, and for activities you call get on service.activities() and you get the activity information. The parameters provided by this library are exactly the same parameters you see here in the reference. Everything you see there carries over, and the parameter names are exactly the same, so it is easy to use. So we call service.people().search, I query for everybody whose name is Kumaran, and I set maxResults to three. You create this query object and then call execute on it, and that returns the JSON result, which is stored in the variable people_feed. Now I just run it, and it gives me the data. You can use json.dumps to print out the contents of the people feed. I am not going to print all of it; I am only printing the items in people_feed. Items is the attribute, or the collection, within the JSON that usually has all the information you need. So I print it with json.dumps, and this is how the JSON looks. You can see that I requested only three results, and each one is a person. The kind is person, and there is the display name, the URL for the person, an image, an etag, and an id for every person. You can then query using this id to get more information about the person. There is also the object type. Once you get this people feed, you can iterate through each of the items in it and extract more information.
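Printing the items and iterating over them, as described above, might look like the following. The people_feed dictionary here is hypothetical sample data that only mirrors the fields named in the lecture (kind, displayName, id, objectType); in the course it would come from executing the people search query.

```python
import json

# Hypothetical stand-in for the result of a people search limited to three.
people_feed = {
    "items": [
        {"kind": "plus#person", "displayName": "Example Person A",
         "id": "1001", "objectType": "person"},
        {"kind": "plus#person", "displayName": "Example Person B",
         "id": "1002", "objectType": "person"},
        {"kind": "plus#person", "displayName": "Example Person C",
         "id": "1003", "objectType": "person"},
    ]
}

# Pretty-print only the items collection rather than the whole response.
print(json.dumps(people_feed["items"], indent=2))


def list_people(feed):
    """Return (display name, id) pairs for every person in the feed."""
    return [(person["displayName"], person["id"])
            for person in feed.get("items", [])]


people = list_people(people_feed)
for name, person_id in people:
    # Each id could feed a follow-up query for that specific person.
    print(name, person_id)
```

Keeping the ids in a list makes it straightforward to run a second-level query per person afterwards.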
So what I am going to do in this case is, for each person in the people feed, print their display name and their id. Then I run another query, a get. This is a get query for the specific person, so get is called with the user ID set to that person's id. So for each person I got from the first query, I am doing a get for that specific person using that person's id, and then from the information I got about the person I am printing the person's gender. This way you first query for a list of people, then you take the id of each person, query for that specific user ID, and get more information. So let me just run this and see what comes back. As you can see, these are the names returned by the query for people whose name has Kumaran in it. There are three results because I only asked for three of them. Then I ran a query again to get the gender, all of them being male. Of course, Kumaran is a male name, so that is what it gives you. Now let's query for activities. The syntax for activities is pretty similar: service.activities().search. I am going to query here for all activities that match my search term, whatever has actually been posted with that term in it. I am looking for only five results, just to restrict the size of the query, and I am going to execute this one. This gives me all the posts that contain the search term, and then I can use json.dumps, which prints all the content that comes back. As you can see, an elaborate amount of content comes out: actor, URL, image, display name, everything. Then what I am going to do is take each of the activities and print the actor's display name.
So basically the actor is the person who did the activity, and I print the actor's display name. Then I look at the object. The object is the actual activity, and its content is the actual content, the message, of the activity. The same activity can also get plus-oners. In Google Plus, the way you like things is by doing a plus one, so the total number of plus-oners is the total number of items in that field. We are printing all of them, so let me run it. We print all of them out, as I was saying: the actor, the content, and the plus-oners are all printed. Now, for every activity or post that was printed, I can go and get the comments for the post. That is another recursive, or next-level, query. I call service.comments().list for that specific activity ID, where the activity ID equals the id of the activity I got. For that, I print whatever the comment is. So let's run the whole thing once again. This is how you can progressively query the data. You see it is fetching the comments and printing them out. Surprisingly, there are no comments, so that's what it is. So this is how, in a very simple fashion, you can start querying Google Plus, and you will find that there is a lot of similarity between how this is used and how the Twitter API is used. Both are pretty easy and simple to query. I would recommend you play around more with this one. Just make sure that you do not exceed the rate limits that are there. Please limit yourself to three or five results, especially when you are doing these recursive queries, where you run one query and then run further queries based on its results.
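The fields pulled out of each activity above can be gathered with a small helper. This is a sketch over hypothetical sample data: it assumes each activity carries the actor's name under actor and displayName, the message under object and content, and the plus-one count under object, plusoners and totalItems, as narrated in the lecture.

```python
def summarize_activity(activity):
    """Extract actor name, message content, and plus-one count from an activity."""
    obj = activity.get("object", {})
    return {
        "id": activity.get("id"),
        "actor": activity.get("actor", {}).get("displayName"),
        "content": obj.get("content"),
        # Activities with no plus-ones may omit the count, so default to 0.
        "plusoners": obj.get("plusoners", {}).get("totalItems", 0),
    }


# Hypothetical activities shaped like entries of an activity search result.
sample_activities = [
    {
        "id": "act-1",
        "actor": {"displayName": "Example Poster"},
        "object": {"content": "First post", "plusoners": {"totalItems": 4}},
    },
    {
        "id": "act-2",
        "actor": {"displayName": "Example Poster"},
        "object": {"content": "Second post"},
    },
]

summaries = [summarize_activity(activity) for activity in sample_activities]
for entry in summaries:
    print(entry["actor"], "|", entry["content"], "|", entry["plusoners"])
```

The id kept in each summary is what a follow-up comments query for that activity would need.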
Just make sure you take care of that. Then I recommend you play around with this one and get familiar with it. I hope this has been helpful to you. 11. Facebook API Overview: Hi. Welcome to this lecture on Facebook data mining. This is your instructor, Kumaran. Facebook has been a very popular social media website, and one in six people in the world use Facebook almost every day. That's the amount of data that Facebook generates. As a user of Facebook, you will be very familiar with what kind of things Facebook has, so I will just review it and the kind of data it holds. So what is Facebook? Facebook is an online social network service. It connects people, and it has a symmetric relationship. That is one of the biggest differences between Facebook and Twitter: Facebook is all about a symmetric relationship, in that for two people to interact with each other, they need each other's permission. It's not like you can go and follow anybody you want. The user has to open themselves up for public following, and the user controls who can be a friend, who can see their friends, and who can see their posts. There are very good privacy settings for everybody as to what other people can see with respect to messages, resources and stuff like that. You can share messages and media in Facebook. You can like, unlike, and comment on what other people are saying. You can create circles like friends, family and acquaintances, and then post only to a subgroup of those people. So these are the kinds of things that people do in Facebook. What kind of data is available for all these activities? Of course, there is user-related information, like user demographic information, that is available in Facebook. There are posts, which is what people are posting. The posts contain text.
They also have user mentions with the @ symbol, hashtags, URLs and media, and they have likes, unlikes, comments and stuff like that. There are also timelines. Every user has a timeline, which shows all the things happening with the user in terms of what messages they posted, what they liked, and what they commented about. You can see the entire user's timeline. You can have friends and groups, and see what kind of things people post with respect to their friends and groups. There is, of course, private chat that people do with each other, and this information is also available, provided they have opened it up. And finally there are events. People who are public figures can publish events or schedules as to when they have their public appearances, and even that data is available. Now the Facebook API. The Facebook API is called the Facebook Social Graph API, because they are focused on building a social graph, a social link kind of system as to how people link with each other. The API is available at this website. You can go to /tools/explorer, and it gives you the API, the documentation and everything. It is REST based. It uses OAuth. It supports JSON formats. It has a number of search capabilities. It has a lot of privacy settings as to what can be exposed through the API. And finally, there are rate limits; the rate limits are again as strict as they are in Twitter. The typical workflow that people follow to build a Facebook-based client application is that they first create an access token through developers.facebook.com/tools/explorer. In the application settings they get their access tokens. Then they can create an access token.
At this URL they can query the data visually, then copy the access token and the query over to their code and build analytics. So you go to the tool, you can create an application, and then you can use your access token. There is an API Explorer where you can query visually and see the content and stuff like that. Once you are happy with the query itself, you can take the access token and the query, copy them over to your client code, and then build your code and your analytics. As for permissions, there are a lot of limitations on what you can access, due to privacy settings. You can access your own posts, your own friends and your own comments, but public access to other people's data is heavily limited. So if you are mining your own company's data, yes, with Facebook you can easily mine wherever your company is mentioned and stuff like that. But trying to explore someone else's data using the Facebook API is highly limited in terms of access. Twitter, of course, is the most open, followed by Google Plus. Facebook is very limited. And LinkedIn is so limited that it is practically impossible to build an application using other people's data. So these are the kinds of things we are going to do. Let's now move on and look at what the UIs for creating an application and for the API Explorer actually look like. This is the page you go to in order to create your Facebook apps: developers.facebook.com/apps. Once you get there, it automatically picks up your user ID and logs you in. Here you can set up your apps, set up your permissions and everything. If you go to My Apps, you can add a new app. For the new app it asks you for a lot of information. There is a lot of confirmation going on.
It's pretty frustrating going through this, simply because they want to make sure people are not building robot kinds of applications that create innumerable apps. So they put a lot of security in here. Once you have created the app, there is something called App Review, which asks you for a bunch of approvals, like whether you want to make your app public. You submit for approval what kinds of things the app can see, such as the public profile, user friends and stuff like that. So you create an application, and after you create it, you get your OAuth-related keys and tokens. And then here is the Graph API documentation, at developers.facebook.com/docs/graph-api. Here you have a lot of documentation as to what the API is, what kinds of things it supports, and what it is capable of. From here you can go to the API reference, which gives you all the various things you can query. Let's say you look at event, for example. For an event, you have things like how you can search and what kinds of parameters you can give when querying an event, and then there is a lot about reading, deleting, updating and creating these events. So a lot is provided here in terms of what you can do with these APIs. In addition to that, you can go to the Graph API Explorer, which is at /tools/explorer. What this one does is give you a way to build a query and test it out. When you begin, there is no access token. You can generate an access token by clicking on Get Token. The moment you do this, it gives you an access token here, a long token. This is a temporary token with an expiration time, so you generate this token and then you can use it in your code and keep testing with it for a given amount of time.
After some time, this token will expire, but you can always generate a new one. Once you have a token, you can build a query visually. For example, the query here is me. You can pick the fields for the query, and I can add additional fields from here: I want to add my age range, I want to add my birthday, and then I can submit again and it executes this query. Here it shows me the request that has been sent, and here is my response. You see the request has been sent with fields equal to the ones I picked, and you can just copy this over to your code and continue building your code from there. You can add a lot of these fields. Not only that, you can add connections like friends. You go to connections, then to friends, and then you can add all your friends. Then here is another field list that lets you choose what you want about your friends. Say I want the age range for my friends. I submit again and get all my friends' information as well. So this way you can keep going on and on with what you can do, subject to what the security layer here lets through. So that is one of the things you can do in Facebook. Of course, you can also go to the Apigee console. The Apigee console gives you the same kind of capabilities. You can choose Facebook here, and it gives you all the methods that are there, what kinds of queries you can do on Facebook. Here too you can play around with the request and try to get stuff going. Of course, you need to set up the authentication, OAuth 2, by clicking on it. It generates the token for you, and as long as you are logged on to Facebook there is nothing else you need to do. You just click a couple of agree buttons and you are up and going. So this is how you can set up and query the Facebook Graph API and get things going.
In the next lecture, we will see how to use this information as part of a client application to build actual code. 12. Facebook API Usage Examples for python: Hey, welcome to this lecture on using the Facebook APIs to extract data. In order to extract data from Facebook, if you want to create a permanent key, a permanent OAuth authentication, you go to developers.facebook.com/apps and create your own new app. This is a pretty laborious process: you create your app, then go through the settings, then go through an approval process and so on. The detailed steps are provided at a URL given as part of the example, and you can go through that if you want to create a permanent key and start using it. The second option is to go to the Graph API Explorer, at developers.facebook.com/tools/explorer. Here you can create an access token for yourself and also create a query. The query language is called FQL, or Facebook Query Language. You can create a query yourself in a very visual manner, generate the query, and play around with it. Once you are happy with how the query looks, you can copy the query and the access token over to your own code and continue from there. To create an access token, you come here to the access token field and click Get Token. Once you click Get Token, a new key is generated for you. Remember that this access token has a limited lifetime, after which it expires. The default query it gives you is /me, and the fields are id and name. You click Submit, it executes the query for you, and gives you the result. On the left side you see this plus sign. Here you can change the query. Maybe I want to add an age range, and then maybe add my location.
Then I can also add some connection information: I can go to the connections and, within the connections, look for friends. So plus, friends. Once I click on friends, I can specify what information I need for the friends. With this indented one, I can now click here and see there are modifiers and limits I might want to look at. Then, for my friends, I can ask for the bio or ask for the birthday, or whatever you want, like the first name. I can get all of them, do a Submit here, and this one returns me all the information that I have asked for. Now one thing you realize is that once I have the query ready, the access token for me is here, and the query for me is here, the query I actually want executed with all these fields. You can just take this and implement it as part of your code. This is how you can frame a query and play around with it. The most important thing to remember about Facebook is that it doesn't give you access to a lot of information. So you want to play around here and make sure you are able to get actual data, and not keep getting "not authorized" messages and things like that, before you go and start coding. In this query you see "/me". Instead of "/me", you can actually put any Facebook ID, and then you can keep querying for that particular person's Facebook information. You can find the Facebook ID of any person by going to a website; its name is provided in the help, where there is a place you can go and get Facebook IDs for anybody you want. Then you can get that Facebook ID and use it as part of your query. So this is how Facebook works. We do not see a lot of use cases where people are trying to mine information from Facebook.
The exception is when you're trying to mine information about your own company. Otherwise there are few, because of all the privacy and security limitations that Facebook has. Now let's switch over to the code. So this is the code. As usual, the starting help here says where you can go to register yourself and where you can get your access token. Additionally, you want to install the requests Python library; requests is a general Python library for making HTTP requests. There is also a Facebook SDK that you can install by doing pip install facebook-sdk. So now let's start doing some queries. First I import a bunch of these libraries. This is the access token that I copied over from the Facebook website; I initialize this one, and then I'm going to be running actual HTTP requests. So this is my base URL, exactly what you got from the Explorer website, here at the top. I'm just trying to form this URL, as you see here inside my code. This is my base URL, and these are my fields; I picked the same set of fields that came from the developers.facebook.com Explorer. So I can just form the fields by copying the same exact set over here, and then I frame the final URL as base URL plus fields plus access token. Then I run this GET request and get the data out. So, pretty straightforward. This is my data. I'm just going to run the query, and here is the data that comes back. Compared to Twitter and Google Plus, you see that the information is very minimal. It's not going to come with a lot of stuff here. Now, getting my friends: how do I get my own friends? This is the query again. I formed this query using the Explorer again, form the URL based on the fields and the access token, run it, and print it out. Pretty similar to everything else. So here is the JSON that it dumps.
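As a rough sketch of the URL-building step just described (this is not the course's own code), the final URL can be assembled from a base URL, a list of fields, and the access token. The function name `build_graph_url`, the constant `GRAPH_BASE_URL`, and the placeholder token are assumptions made for illustration; the unversioned host is also an assumption, since the lecture does not say which API version it used.

```python
from urllib.parse import urlencode

# Assumption: unversioned Graph API host; the lecture does not name a version.
GRAPH_BASE_URL = "https://graph.facebook.com"


def build_graph_url(node, fields, access_token):
    """Return the URL for `node` (e.g. "me" or a Facebook ID) with the given fields."""
    query = urlencode({"fields": ",".join(fields), "access_token": access_token})
    return f"{GRAPH_BASE_URL}/{node}?{query}"


if __name__ == "__main__":
    # "PASTE_TOKEN_HERE" is a placeholder, not a real token.
    url = build_graph_url("me", ["id", "name", "age_range"], "PASTE_TOKEN_HERE")
    print(url)
    # Sending it needs the third-party requests package and a valid token:
    #   import requests
    #   data = requests.get(url).json()
```

Keeping the URL construction in one small function makes it easy to swap "me" for another Facebook ID or to change the field list without touching the request code.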
So what I can do now is look at the data that I got for my friends, the friends data from the JSON, and get the gender of each friend and also the location of the friend. You see, I'm doing something here: I'm first setting the gender and location to NA. Then I'm making sure that the gender attribute is there in the friend collection; only if it is there am I going to populate it. Remember that when you are querying social media data, you won't get data for all the attributes. A lot of attributes would be missing. A lot of friends will not have populated the gender information or the location information, and if not, it won't be there. So I'm just marking them as NA, and then I'm checking whether that particular variable is there before retrieving it. Otherwise it's going to keep giving attribute-not-found errors. So you just want to make sure you check all those things. Here I'm running this for my friends. You see male, then male with Rosario, Argentina, then male and male again. Only one person has put in their location information; the others do not have it. The next thing I'm going to do is start querying about some other users. The user I'm picking here is Donald Trump. Donald Trump has his own page, realDonaldTrump, and his Facebook ID is this ID. So the base URL I'm framing is this one: instead of "me", I'm using this Facebook ID, and I'm going to be querying information about his events. He is publishing the whole schedule of events that he will be attending on this page, and I'm just going to query them. This is the query, again framed using the same Explorer in the same way. I form the final URL from this one, then I get the data, and from the data I print the various events, what their start time is, and how many people are expected to attend.
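The defensive check for missing attributes described in this lecture could be sketched roughly as below. The record shape is an assumption: each friend is taken to be a dictionary in which "gender" is a string and "location" is a nested object with a "name" key. The helper name `friend_summary` and the sample records are hypothetical.

```python
def friend_summary(friend):
    """Return (name, gender, location), using "NA" for anything that is missing."""
    name = friend.get("name", "NA")
    gender = friend.get("gender", "NA")
    # Assumption: location, when present, is an object like {"id": ..., "name": ...}.
    location = (friend.get("location") or {}).get("name", "NA")
    return name, gender, location


if __name__ == "__main__":
    # Made-up sample records standing in for the "data" list of a friends query.
    friends = [
        {"name": "Friend A", "gender": "male"},
        {"name": "Friend B", "gender": "male",
         "location": {"id": "1", "name": "Rosario, Argentina"}},
        {"name": "Friend C"},
    ]
    for friend in friends:
        print(friend_summary(friend))
```

Using `dict.get` with a default is one way to do the "set to NA first, overwrite only if present" pattern in a single step.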
So let's go run this one and see the results. This is the list of 10 events he's going to be attending in the various states, and the number of expected attendees. So this is how you can query somebody else, provided they have given you access. Remember that just because you can go to their Facebook page and see information doesn't mean that they have given you access to all of it. That's because when you set up this API key, when you do this Get Token, it is going to ask you about what is permissible and what is not permissible. That also prevents you from getting a lot of data. Facebook is a lot more restricted compared to Twitter and Google Plus. Now, so far we have been using basic requests objects, plain HTTP requests. There is also the Facebook SDK, which was set up earlier. We can try to use that to get the same kind of information. So the first thing I'm going to initialize is this facebook.GraphAPI with the access token. This one makes life easier: you don't have to sit and form all these URLs, and it gives you a better way of querying data. One of the first methods you're going to look at is graph_api.get_object. When I call get_object with "me", I'm getting myself, and then I'm just going to print it. So this is another way of getting my data. Before, I was building the URL and had to bother about that; now I just do graph_api.get_object("me") and it gives me my data. You see that it's almost all my data. It's not looking for any specific fields, and this is the data I get for myself: first name, last name, locale, gender, work, email, birthday. All of them come up. The next query I'm going to be running is getting my connections. So it's graph_api.get_connections, pretty straightforward.
It's get_connections for "me", so myself, and what kind of connections? Get my friends. So get my friends and then just dump it out. These are all JSON strings; you get the data, then you can start processing it yourself. It succeeded: it has got my friends' data and is printing it out. Next, I will try to get Trump's data. I'm going to be using his Facebook ID, print with json.dumps, and get it going. And yes, this is Trump's data printed out: category, public figure, user name, about, how many people are talking about this, and a lot of other things, like his location; his address is there on Facebook. Not everybody has the address on Facebook. You can also do a query for specific strings, a query that is like a search. So here is a Facebook search. In this case, with graph_api.request, I'm asking it to do a search, and in the search I'm saying the query is "messi" and the type is page. So I'm querying for the name Messi in all the pages: which pages have the name Messi in them? A lot of people would have created different pages for Messi; there will be an official one, but a lot of them are unofficial. So search all the pages and print information about them. Let's go and run this one and see what happens. Here it prints a lot of the pages that have "messi" in them. This is interesting: you see there's a name called Chris Messina. Not really the Messi we were looking for, but that also comes up because we are just searching for the term. Then it prints out everybody who has "messi" in the name. Look at this one: this is a Messi community page. OK, so that's what happens, and this is a general search. You get all the information, like we have done for Twitter.
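The SDK calls mentioned in this lecture (get_object, get_connections, and request with a search) might be wrapped roughly as follows. This is a sketch under stated assumptions, not the lecture's code: the functions accept any object that offers those three methods, and the `FakeGraph` class with its canned data is a hypothetical offline stand-in so the example runs without a token or the facebook-sdk package. Whether a real call returns data depends on the permissions Facebook grants.

```python
import json


def profile_and_friends(graph, node="me"):
    """Fetch a node and its friends through a GraphAPI-style object."""
    profile = graph.get_object(node)
    friends = graph.get_connections(node, "friends")
    return profile, friends.get("data", [])


def search_pages(graph, term):
    """Search pages by name and return the list under the "data" key."""
    result = graph.request("search", {"q": term, "type": "page"})
    return result.get("data", [])


class FakeGraph:
    """Hypothetical offline stand-in for facebook.GraphAPI (canned data only)."""

    def get_object(self, id):
        return {"id": "1", "name": "Example User"}

    def get_connections(self, id, connection_name):
        return {"data": [{"id": "2", "name": "Friend One"}]}

    def request(self, path, args=None):
        return {"data": [{"id": "3", "name": "Messi Fan Page"}]}


if __name__ == "__main__":
    # With the real SDK this would be: graph = facebook.GraphAPI(access_token=...)
    graph = FakeGraph()
    profile, friends = profile_and_friends(graph)
    print(json.dumps(profile, indent=2))
    print(json.dumps(friends, indent=2))
    print(json.dumps(search_pages(graph, "messi"), indent=2))
```

Passing the graph object in as a parameter keeps the query logic testable without network access.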
You can get information from one query, like getting their IDs here, and then use that ID for further queries, like doing a get_object on that ID, and so on and so forth. So you can play around more with Facebook. But pretty much, Facebook, like LinkedIn, is a pretty restricted API. Unless you are doing queries on your own website or your own Facebook page, looking for all your friends and for all the posts that refer to your own ID or handle, you're not going to get a lot of information from Facebook. So just remember that when you're doing these queries. Thank you.

13. Introduction to use cases: Hello. Welcome to this introduction to the various use cases that we're going to be using as part of this course. First of all, the use cases we're going to show you are mostly based on Twitter and Google Plus data. We haven't tried to use Facebook data or LinkedIn data, simply because the privacy and security limitations prevent us from extracting other people's information. To give you some valid examples, we need a sufficient amount of data and variety of data, and that is not possible: you cannot access other people's Facebook accounts or other people's LinkedIn accounts without running into security and privacy issues. That is why most of our examples, unfortunately, are based on Twitter and Google Plus. But whatever examples we do on these can equally be applied to the other networks, because you have already seen that the APIs are very similar in terms of the authentication schemes as well as the data you get. So pretty much you can apply the same techniques that you see applied to Twitter data to data from other websites also. You might find these steps a little repetitive from time to time.
You might start seeing various examples looking similar, because the data in the social media world is very limited in terms of what it is: tweets, comments, likes, that's all. It's going to just keep revolving around those. So we're going to be extracting the same kind of data again and again and looking at it from a different dimension. That's what you're going to keep seeing, and that is why we didn't put in more use cases, because you would just start seeing the same steps being followed. The focus for us is to get the data from social media into regular, local data structures. Just remember that once you get the social media data into regular data structures, you can pretty much apply any kind of data mining that you would do on your regular corporate data. So get the data out, convert the strings into numerical representations, and then store them in a database, and it becomes like any other data. You can mine it, use it, and analyze it like any other data that you have within your own corporate databases. Thank you.

14. Frequency Analysis Use Case Python: All right. Welcome to this lecture on the frequency analysis use case for social media. In this example we are going to be using Twitter data. We're going to get some information from Twitter and do frequency analysis on it. Frequency analysis means you take any segmentation; by segmentation I mean you can take any attribute, like age, location, language and so on, information about people, and then try to analyze how many people belong to a specific group. So you can take a list of 100 followers and say how many followers are from the UK versus the US, or how many followers have English as their language versus French.
Or Spanish. You try to analyze the frequencies to understand who your users or followers are. That's what frequency analysis is all about. What we're going to be doing is extracting data and putting it into regular Python variables or data sets, like data frames. Once you get the social media data into these data frames, you can apply regular Python analytics capabilities, as well as graphical capabilities, to do all your analysis. So that's what we're going to focus on in this lecture: how do we get data from social media into Python data structures? To begin with, I'm going to set my home directory. Then I set my Twitter authentication values: consumer key, consumer secret, OAuth token and OAuth secret. Then I import my twitter library and my json library, initialize the Twitter API library, and create the API object. Once I do that, I'm going to query for HP. The company HP has its own Twitter account. We're going to look at 1000 followers of that particular account, and for those 1000 followers we're going to try to understand what kind of place they come from: what their language is, what time zone they belong to, and how many friends they have. One technique we're going to be using here: to get 1000 followers, note that Twitter will only give you 200 followers at most each time you query it. So you have to query it five times. It's called pagination: you query it page by page. When you run the first query, it is going to return something called the next cursor.
It basically says: OK, I have completed up to here and shown you the first page; to begin from the next page, pass me the next cursor. So it is going to continue from where it left off in the previous query and get the next 200, and then the next 200. So how do we do this? Let's start by first importing the pandas library for Python. This is the column list; I'm just setting out what column information I'm going to collect: language, time zone and friends. Then I create an empty pandas data frame with these columns, and as I query information about HP's followers, I keep adding it to this pandas data frame. Initially, I set the next cursor to minus one. That means I'm starting fresh from my first follower. I loop five times, because I'm trying to get 1000 followers and each time I can only get 200. I get a followers list by calling the Twitter API's followers.list, querying for the screen name HP, setting the cursor to the initial value, which is minus one, and giving a count of 200. Once I do this query, I get the next cursor in the result, which tells me: OK, this has completed up to 200 people; if you want the next 200, use this next cursor. It comes as part of the results of the first query. So I set the next cursor to that value. Then I look at the followers list I got for HP, and for each follower I collect what language they speak, which time zone they are in, and how many friends they have, so that I can analyze, for example, which language or time zone has the people with the most friends.
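The cursor loop described above could be sketched as follows. To keep it runnable offline, the page fetcher is passed in as a function; `collect_followers` and `fake_fetch_page` are hypothetical names, and the fake fetcher only imitates a response shaped like a dictionary with "users" and "next_cursor" keys. Stopping when the cursor comes back as 0 reflects how Twitter's cursoring signals the last page; the lecture itself simply loops five times.

```python
def collect_followers(fetch_page, pages=5, count=200):
    """Call fetch_page repeatedly, feeding each next_cursor into the next call."""
    followers = []
    cursor = -1  # -1 asks for the first page
    for _ in range(pages):
        result = fetch_page(cursor=cursor, count=count)
        followers.extend(result["users"])
        cursor = result["next_cursor"]
        if cursor == 0:  # no further pages
            break
    return followers


def fake_fetch_page(cursor, count):
    """Hypothetical stand-in for the real API call; returns made-up users."""
    start = 0 if cursor == -1 else cursor
    users = [
        {"screen_name": f"user{i}", "lang": "en", "friends_count": i}
        for i in range(start, start + count)
    ]
    return {"users": users, "next_cursor": start + count}


if __name__ == "__main__":
    followers = collect_followers(fake_fetch_page, pages=5, count=200)
    print(len(followers), followers[0]["screen_name"], followers[-1]["screen_name"])
```

With a real client, `fetch_page` would wrap the followers list call for a given screen name and return its response unchanged.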
That is what I'm trying to analyze. So I'm just looping through the result set and populating the pandas data frame from this information. Let's go ahead and run all this code. It is running now. It is going to iterate five times, get the data and populate this list, so it is going to take some time. Now it is complete. We can do a count on the followers, which tells you how much information it has collected, and we can look at the first few rows, which show the actual information that is in this data. Once I've got this into a pandas data frame, I have a lot of capabilities I can use: I can do a lot of group-bys and then some analysis; I can do some charting. The options are endless once you get the data into a pandas data frame. What I do first is group this follower information by language and collect some frequency statistics by language. Once I call describe on the grouped data, it gives me, for each language, how many of HP's followers use that language, along with the count, mean, min, max and standard deviation of the friends count for them. So that information is there: for the language Arabic, for example, there are 16 followers of HP. How many friends do those 16 have? The output shows the mean, a min of 22, and the max, so you can start looking at a lot of information about them. Now you can run the same kind of queries by time zone: group by time zone and look at the same kind of information.
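A minimal sketch of the group-by step, assuming pandas is installed. The rows below are made-up sample data, not the lecture's results, and the column names `lang`, `time_zone` and `friends` are chosen for illustration. The lecture calls describe on the grouped data; here `agg` is used with an explicit list of statistics so the output is easier to read.

```python
import pandas as pd

# Made-up sample rows standing in for the collected followers.
followers = pd.DataFrame(
    [
        {"lang": "en", "time_zone": "London", "friends": 100},
        {"lang": "en", "time_zone": "Eastern Time (US & Canada)", "friends": 200},
        {"lang": "en", "time_zone": "London", "friends": 300},
        {"lang": "fr", "time_zone": "Paris", "friends": 50},
        {"lang": "fr", "time_zone": "Paris", "friends": 150},
    ]
)

# Frequency and friend statistics per language.
by_lang = followers.groupby("lang")["friends"].agg(["count", "mean", "min", "max"])

# Number of followers per time zone.
by_zone = followers.groupby("time_zone")["friends"].count()

if __name__ == "__main__":
    print(by_lang)
    print(by_zone)
```

The same two lines work for any other attribute collected into the data frame, which is the point of getting the data into this structure first.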
By time zone, I can look at how the friends statistics are. I can do a count by time zone and look at how many people in each time zone are followers of HP: Amsterdam has eleven, Arizona has five, there are eight in Atlantic Time, and so on. So I can look at the data by time zone. Twitter gives you a lot of attributes for its followers, and you can take all these attributes and try to analyze and understand what kinds of patterns are out there. This is the basic frequency analysis that you can do with social media data. Whether it is Google Plus or Twitter or Facebook, the same technique applies: query it and put it into a standard data frame. Once you put it into a pandas data frame, you can use all the general techniques that you would use for pandas-based data analysis on the data. I hope this has been helpful to you. Thank you.

15. Sentiment Analysis Use Case Python: Hi. Welcome to this lecture on sentiment analysis. Sentiment analysis is a very popular, very frequently used use case when it comes to social media analytics. In social media analytics, what you're focusing on is that people are tweeting all the time, writing something all the time. There is always some text, some sentences and phrases, that people post. What you want to understand is what kind of sentiment the post reflects. Is it a positive sentiment or a negative sentiment? Say you have your own company Twitter page, and people are using the company's Twitter handle and tweeting something about its products or its service. You want to understand what kind of tweet that is. Are people saying something positive about the company, or something negative about the company?
Most importantly, if people are saying something negative, you want to find out why, and maybe reach out to that specific user to understand why they're tweeting so and what the problem is, and try to rectify it. So how do you do sentiment analysis? Fortunately, there are a lot of libraries available to do sentiment analysis; you don't have to break your back trying to do it yourself. These libraries use what is called the bag-of-words technique. What that means is that these libraries have identified a set of words that reflect a positive sentiment, like good, great, awesome, and a set of words that reflect a negative sentiment, like bad, worse, some really obscene words and so on. They look at the tweet or the message and try to find out how many such words are there, positive words and negative words, and come up with a score: how many positive words are used, how many negative words are used, and so what the overall sentiment is. There is a simple library called TextBlob in Python that does sentiment analysis for you. You pass a sentence or a paragraph to that library; it will analyze the whole sentence or paragraph and come up with a sentiment polarity score. The polarity score runs from minus one to plus one: the closer the score is to minus one, the more negative the comment is; the closer it is to plus one, the more positive the comment is. So you can either just use greater than zero or less than zero to decide whether it is positive or negative, or you can use the scale and say, for example, that scores from zero to 0.5 are moderately positive and scores from 0.5 to one are highly positive. Or you can simply go by positive or negative, zero being neutral. So the library computes it for you.
You do not have to break your back to do that. All you do is extract the data from social media, pass it on to the library, and let the library do the magic for you. That's what we're going to be doing in this exercise. So how do we do that? We start off by setting up our home directory. In this one we're going to be using Google Plus, so we set up the Google library and the Google API key, and then set up the Google service. Here we import the TextBlob library. What we're doing in this exercise is looking at 100 activities (activities are what posts are called in Google terminology), the last 100 activities that carry the word "trump", and trying to understand whether the sentiment of each particular post, each particular activity, is positive or negative. Again, the way the activity feed works is that it only gives you 20 results per query at the maximum. So to get 100 results, you need to iterate and run the query five times. Every time you run a query like this, it returns what is called the next page token, which records where this page ends. You take the next page token and pass it on to the next query, so you get the next 20, then the next 20, and so on. This is how you iterate in the case of Google Plus to get a set of multiple pages. We start with the next page token as empty, which means I need the first page. So you set up the activities query, searching for "trump" with max results of 20, and pass in the next page token.
Then you do an execute. Once the first query comes back, you get the next page token from that result and populate the variable, so the next time it executes, it starts from that token. Then, for whatever activities you got, you loop through the activity items and extract the activity object content, which is the actual text. Using this content, you create a TextBlob object and assign it to this activity blob variable. Once you create the TextBlob object, all you do to compute the sentiment is call .sentiment.polarity, which gives you the polarity. Then you can derive a polarity string based on the polarity value: if the polarity is greater than zero, it is positive; less than zero is negative; otherwise it is neutral. You can print all these variables, the polarity, the polarity string and the actual object, as you loop through the values. So let's go run this for 100 posts and see how the polarity shows up for us. OK, so I'm running them. I'm printing the polarity, the polarity string, and the first 80 characters of the activity, because the text is long. Let's see what it looks like. So this is how the polarity looks. A lot of them turn out to be neutral, simply because there are no positive or negative words in the text. Let's look at this one, which is minus 0.5, negative. It has a word like "crook" being used, which tells you it's a negative post. Here there are some obscene words being used, which results in a negative one. Up there, you have a positive with a score of 0.35: "these 200 people could decide whether Donald Trump gets the GOP nomination".
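The polarity-to-label rule described above can be written as a small helper. This is an illustrative sketch: `polarity_label` and `text_polarity` are hypothetical names, and `text_polarity` needs the third-party textblob package, which is why its import sits inside the function and the demo uses hand-picked scores instead.

```python
def polarity_label(polarity):
    """Map a polarity score in [-1, 1] to Positive, Negative or Neutral."""
    if polarity > 0:
        return "Positive"
    if polarity < 0:
        return "Negative"
    return "Neutral"


def text_polarity(text):
    """Score text with TextBlob (third-party; imported lazily on purpose)."""
    from textblob import TextBlob

    return TextBlob(text).sentiment.polarity


if __name__ == "__main__":
    # Hand-picked scores, so the demo runs without textblob installed.
    for score in (0.35, -0.5, 0.0):
        print(score, polarity_label(score))
```

With textblob available, `polarity_label(text_polarity(content))` gives the polarity string for one post, and `content[:80]` gives the 80-character preview mentioned above.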
I think we've been printing only 80 characters; maybe if you look at the full text, you'll find some positive words in there. So this is how the sentiment works. Look at this one, which is more positive: "secret to Donald Trump's success". Some positive words are being used, so that one is marked as positive. But then it's only very mildly positive if you look at the score, compared to this one, where some really strong words are being used. So this is how, in general, sentiment analysis works, the polarity being from minus one to plus one, and it lets you rate posts pretty quickly. This is how we can quickly analyze what kind of things people are tweeting about. You could come up with an overall score also: you can add up the polarity scores for all the tweets and say this is my overall polarity. Or you can compute the average polarity and say whether there is an overall negative perception or positive perception. You can come up with all these metrics once you start doing sentiment analysis. This is a very popular use case you will come across in social media analytics; it is something you will definitely do at some point in time. I hope this lecture has been helpful to you. Thank you.

16. Link Analysis Use Case Python: Hey, welcome to this use case on link analysis with Python. So what is link analysis? When we look at any of these social media sites like Twitter or Facebook or Google Plus, there are people who are linked with each other. In Twitter there are followers, following, friends. Similarly, in Facebook there are friends, and their friends, and their friends. So there is a web of links between various people, and you want to analyze these links to understand some patterns or discover key people, important people from a business point of view.
What you're interested in is analyzing these links to see if there are any key influential people in the whole circle. Then you can reach out to them to sell or push your products or services. In this example for link analysis, we're going to take two Twitter users and, for those two, find which people they are following, that is, who these people are interested in. Then we get that list and find the common people that both of them follow. Among these common people, you query again and see whom those common people follow, and within that network, who is the most popular. You're following link to link to link to find out who is the most popular person in this whole circle, the one everybody is trying to follow. So let's start the example by first setting up the Twitter API object. We're going to be querying for two accounts. One account is vika7, which is for Victoria Azarenka, and the other is for Caroline Wozniacki; both of them are pretty popular tennis players. Given that they are in a common field and have common interests, let us see if they follow the same people, and which other similar people those people follow, and so on. So let's import pandas here. First, I'm going to get the list of friends for Vika. OK, so I was hit by a "rate limit exceeded" message. So I've gone back, waited for some time, and now executed this command again, and now it has come back. So now I have the Vika friends variable, which carries all the friends of Vika, and you can see that it has got all the information about the friends.
I'm then going to walk through this list and populate this list called the Vika list with all the friends' names. So I have created a list of all the friends of Vika, all the screen names. Now let me do the same thing for Caroline, and I hope I don't get hit with the rate limit. Yes, it went through fine. I populate the Caroline list with the list of friends of Caroline. Now I'm going to find the mutual friends between Caroline and Vika. I do that using the set operation here: I do a set of the Vika list to convert it into a set, similarly with Caroline, and then there is the & symbol, which is the intersection operation. It gives me the common entries between both of them. So there are 32 mutual friends, as you can see, between Vika and Caroline. The next thing I'm going to do is take these 32 friends, walk through them and find whom these people follow, and then find who is the most popular among this whole group. So the common friends list is this one, and I'm going to iterate through it. I'm only going to iterate over five people, for fear of hitting the rate limit again, and for these five people I accumulate their friends lists into one whole list. In that list there will, of course, be people who repeat again and again, and among those who repeat I want to see who the most popular ones are. So let's query their friends lists and collect that data into the second-level friends list. OK, looks like it went through fine. Great. What is the total count? There are about 745 entries across all five. But remember that there will be a lot of duplicates, because the same friends might be followed by multiple people. So I'm going to use the Counter feature from the collections module in Python. The Counter lets me count the entries in the given list.
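The two building blocks just mentioned, set intersection for mutual friends and Counter for repeated names, could be sketched like this. The function names `mutual_friends` and `most_followed` are hypothetical, and the single-letter names in the demo are placeholders rather than real accounts.

```python
from collections import Counter


def mutual_friends(friends_a, friends_b):
    """Return the screen names that appear in both friend lists."""
    return set(friends_a) & set(friends_b)


def most_followed(friend_lists, top_n=10):
    """Count how often each name appears across several friend lists."""
    counts = Counter(name for friends in friend_lists for name in friends)
    return counts.most_common(top_n)


if __name__ == "__main__":
    # Placeholder names only.
    common = mutual_friends(["x", "y", "z"], ["y", "z", "w"])
    print(sorted(common))
    print(most_followed([["p", "q"], ["q", "r"], ["q", "p"]], top_n=2))
```

Because `most_common` returns (name, count) pairs sorted by count, the first few entries are the accounts followed by the most people in the sampled group.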
If there are certain entries that occur again and again, the Counter will help me count them by unique keys. So I do that to get the number of times each friend repeats, and then I call most_common on those friend counts. When I say most_common, it is going to give me the top 10 among the list that we just counted, and then I print those values out. So you see that team questions is the most common, with two people, then Gwen Stefani with two people, and so on. This is how you can now start to analyze friends. Given that I only queried for five people, the maximum count is only two. If I had iterated over 50 people, I would have gotten a much better pattern. But to do that for 50 people, I'd have to work through the rate limits, maybe by putting a lot of delays or sleeps in my code before I start querying again. That way I would have collected more data. But I think you've got the gist of how I can do this analysis. So Vika and Caroline follow a set of people; we find the common people; among these common people, we find whom they follow; and in that list we try to find the most popular one. So this is how you can go through the links one by one and analyze patterns of how people are linked to each other. This is another big use case when it comes to social media. Thank you. 17. Action Analysis Use Case Python: Hey, welcome to this use case on action analytics. When it comes to action analytics, whenever people post something on social media websites, there are associated actions, like likes, retweets, user mentions, and shares. These are actions other people take that indicate something good or bad about the original post. If a tweet gets retweeted a lot of times, it means it's a very popular tweet. If a Facebook post gets a lot of likes, it means it's a pretty popular post.
So you're trying to understand which kinds of tweets get more retweets and so on. In this example, we're going to look at tweets for a specific Twitter handle and see which ones are getting retweeted. When we know which ones are getting retweeted, we look at the content of the tweet to see what that tweet has in terms of other user mentions and hashtags, trying to find out which hashtags and which user mentions typically generate a lot of retweets. So that is the kind of action that people take, and how that action is related to the content of the tweet. So let's start by again setting up Twitter here. I'm just going to run through this, setting up all the keys and secrets and then creating the Twitter object. I'm going to create pandas data frames: a couple of data frames to collect data from the tweets, because once you get the data into a data frame, it is easier to analyze. One of them is the hash data frame. In the hash data frame I'm going to collect which hashtags are in the tweets, and for every hashtag, how many times it got retweeted. So I'm going to walk through the tweets and, for every tweet, keep adding records here. If a tweet has two hashtags, I will put two records here, each with the hashtag and the retweet count. The same hashtag might be repeated across multiple tweets, and because I'm creating one record per tweet per hashtag, there will be multiple records for it. Once I collect the information, I can do a group-by and find an overall count. So I have this hash data frame. Similarly, there is the user data frame for user mentions and how many times they got retweeted. So I just create these empty data frames to begin with, and then I can start collecting into them. Now I'm going to be setting this max ID.
I set the max ID here because I'm going to query for a lot more tweets. The Twitter account I'm going to be reading is the realDonaldTrump Twitter handle, or screen name. So I'm going to get the last 1,000 tweets on this account, wherever this account is mentioned. I use this max ID to iterate again and again, so that I get the next set and the next set. This is how I get multiple pages. Every time I query, I only get a count of 200, so how do I get the next 200, and the next 200? The first query returns me information about the last tweet ID that was used, and that ID is then used for the next query. In this case, you see that the tweet ID is there. So whatever last tweet ID I get from this list, I take it minus one and query again and again. So I go and query and get the tweets in the list. Once I get the list, I collect the hashtags and user mentions into their corresponding data frames. And once I have that information, I can do my analysis. Hashtags are available for every tweet: there is an entities hashtags attribute, a collection from which I can extract the hashtags and keep adding them to the list I'm collecting. So let me go and run this whole code so I collect all the information that I need. I hope I don't run into the rate limits. Okay, that ran. Now let me print the hash data frame to see how it looks. As you can see, I've collected all the hashtags. Hashtags will repeat, because a hashtag might be repeated across multiple tweets, and I have a count for all of them: how many times they were retweeted. If a tweet had multiple hashtags and got retweeted X number of times, then I'm counting that against all the hashtags in that particular tweet.
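The max ID paging described here can be sketched without touching the network. In this minimal sketch, `fetch_page` and `FAKE_TIMELINE` are hypothetical stand-ins for a real timeline query; the assumption is that a query returns tweets newest first, with IDs no higher than `max_id`, at most 200 at a time.

```python
# Stand-in for a Twitter timeline: tweet IDs in descending (newest-first) order.
FAKE_TIMELINE = [{"id": i, "text": f"tweet {i}"} for i in range(1000, 0, -1)]


def fetch_page(max_id=None, count=200):
    """Hypothetical stand-in for a timeline query: returns up to `count`
    tweets whose id is <= max_id, newest first."""
    eligible = [t for t in FAKE_TIMELINE if max_id is None or t["id"] <= max_id]
    return eligible[:count]


all_tweets = []
max_id = None  # the first query has no upper bound
for _ in range(5):  # 5 pages x 200 tweets = 1,000 tweets
    page = fetch_page(max_id=max_id)
    if not page:
        break  # nothing older is left
    all_tweets.extend(page)
    # The next page must start just below the oldest tweet seen so far,
    # otherwise that tweet would be returned twice.
    max_id = page[-1]["id"] - 1

print(len(all_tweets))
```

Subtracting one from the oldest ID is what keeps consecutive pages from overlapping.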
So you see that MakeAmericaGreatAgain and Trump2016 came in the same tweet, and 6475 is the number of times that tweet got retweeted, and so on. I'm collecting that information. Once I get it into the data frame, it becomes very easy for me to do any kind of analysis. I go back and do a group-by on hashtag, then I get an overall count by hashtag, and then I order it. So now I can see which hashtags got more retweets. The hashtag Trump2016 got retweeted about 695,000 times. So you can see what the pattern is: which hashtags generate the most retweets. You can see it here. The same thing I can do for the user mentions also. I can do a count and group-by and see which user mention gets the most retweets. This is again realDonaldTrump, which has the most retweets, obviously. And this is where things get interesting. It's Donald Trump's Twitter account, so of course you expect him to be the most popular user mention. But the next one is CNN, and the next one is Fox News. That's when things start getting interesting, as to why these news channels are mentioned along with Donald Trump. Then you have Ted Cruz, who of course is running against Donald Trump in the primaries. This lecture was recorded around the time the 2016 primaries were going on. That's why I'm using a more popular topic, so that I would instantly get a lot of tweets for it. If I used an unpopular topic, I wouldn't get so much data. So I'm using a popular topic of the day to get a lot more information. So that's how the data comes out. This is how I can understand what actions people are taking and what kind of information generates a lot of retweets. And this is something you would use in your company: a company is putting out a lot of marketing messages, and they get retweeted.
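The record-per-tweet-per-hashtag collection and the group-by can be sketched as follows. The lecture uses pandas data frames; this sketch swaps in plain dictionaries from the standard library so it runs anywhere. The tweets, hashtag names, and retweet counts are made up; only the `entities` / `hashtags` / `text` and `retweet_count` field names follow the tweet structure the lecture refers to.

```python
from collections import defaultdict

# Made-up tweets shaped like the fields the lecture reads.
tweets = [
    {"retweet_count": 100,
     "entities": {"hashtags": [{"text": "TagA"}, {"text": "TagB"}]}},
    {"retweet_count": 50,
     "entities": {"hashtags": [{"text": "TagA"}]}},
    {"retweet_count": 30,
     "entities": {"hashtags": []}},
]

# One record per tweet per hashtag: the tweet's retweet count is charged
# to every hashtag that appears in it.
records = []
for tweet in tweets:
    for tag in tweet["entities"]["hashtags"]:
        records.append((tag["text"], tweet["retweet_count"]))

# Equivalent of a group-by on hashtag with a sum of retweets.
totals = defaultdict(int)
for hashtag, retweets in records:
    totals[hashtag] += retweets

# Order by total retweets, highest first.
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
for hashtag, retweets in ranked:
    print(hashtag, retweets)
```

The same loop works for user mentions by reading a mentions collection instead of the hashtags.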
They will have user mentions, they will have hashtags, and you can see which of these generate a lot of interest from people in terms of retweets, shares, and likes, and then try to understand user patterns. That way you can improve your marketing messages to actually generate more of these retweets and likes. Because you know that whenever people retweet or like something, that particular information is propagated to their own friends, their own circles. That's how a tweet or a Facebook post gets spread across the community: when people share it or like it. So you want people to be sharing and liking your tweet, and you want to understand what can generate that kind of interest. When you look at action analytics, that is the goal: to find what kinds of tweets, hashtags, and mentions generate a lot of user actions. I hope this example is helpful to you. Thank you. 18. Frequent Pattern Mining Use Case Python: Hi, this is Kumaran here. This is a lecture about frequent pattern mining with social media data. So what is frequent pattern mining? In frequent pattern mining, we're trying to find things that frequently occur together, as a pattern. Whenever A occurs, B also occurs, and we're trying to find out how many times, when A occurs, B also occurs. One of the simplest things you can look at is user mentions: whenever user A is mentioned in a tweet, user B is also mentioned. We're trying to find these kinds of links or dependencies, so that you can then go and try to understand why they are mentioned together. But before that, you need to find out who the ones frequently mentioned together are. So this is an example where we're going to look at tweets, try to find out which users are frequently mentioned together, and then come up with an analysis of the most frequent patterns, the patterns of users being mentioned together.
So let's start by setting up the Twitter object again, going through the same authentication process. What I'm going to do here is look at the Twitter feed coming from Fox News. I'm going to look at the last 1,000 tweets on the Fox News Twitter handle and see, in all the news feeds coming in, which users are frequently mentioned together. Given that I'm looking for 1,000 tweets, I need to run this call five times, because each call is only going to give me 200. Every time I query, I look at all the tweets. When tweets come out of Twitter, they come in descending order of when they were created, so the tweet IDs themselves are in descending order. When I get the first set of 200 tweets, I take the last tweet's ID, and that last tweet's ID is what I use the next time to tell Twitter the max ID I want to start with, so that I get the tweets after that. I take the last tweet ID, subtract one, and then tell the Twitter feed: go and look for all the tweets that have IDs no higher than this value, which means they are older tweets. I'm going to start with a very high max value, so I get the highest number possible. I'm also setting up something called a basket. What is this basket? For every tweet I read, I create an entry in the basket, and an entry in the basket is a comma-separated list of all the users mentioned in the tweet. If the tweet had three users mentioned, you will have user A, comma, user B, comma, user C as one line in the basket. So I'm actually building something like a CSV file here, with comma-separated values. There is one line per tweet.
For every user that is mentioned, there will be one entry in that line, comma-separated. If the tweet had no user mentions, then it won't have any entry in the basket. That's what I'm trying to do now. So if you look at the code, I'm looping five times and querying 200 tweets every time. For every tweet, I get all the user mentions in that tweet, then I create a CSV line, and then I append this CSV line to the basket. So let's go and execute this code, and hope that we don't hit any rate limits. Okay, we're done here. So let's print and see how the basket looks. You can see how the basket looks here: this is one entry per tweet. In this particular tweet, you see there are two people mentioned, Fox News Politics and one other handle. And here is a tweet where a lot of people are mentioned: Fox and Friends, Dana Perino, Geraldo Rivera, Greta, a lot of people. So I've just collected a list of all the users who are mentioned together across all of these tweets. Once I have this, I'm going to compute some metrics. What kind of metrics? For every user handle that occurs, I want to find how many times that handle has occurred across all the tweets, and then, every time it occurs, which other handles also occur and how many times each of them occurs. So you take every user handle and count how many times it has occurred overall, and every time this handle occurs, which other handles occurred and how many times each of them occurred. You're coming up with numbers like: whenever this handle occurs, 75% of the time this other handle also occurs. You're trying to find this relationship. Before we go any further, I'm going to save the basket information to a file.
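Building the basket and saving it can be sketched like this. The tweets and screen names are made up, and the temporary file path is only there so the example cleans up after itself; only the `entities` / `user_mentions` / `screen_name` field names follow the tweet structure used in the lecture.

```python
import os
import tempfile

# Made-up tweets with the mention fields the lecture reads.
tweets = [
    {"entities": {"user_mentions": [
        {"screen_name": "userA"}, {"screen_name": "userB"}, {"screen_name": "userC"}]}},
    {"entities": {"user_mentions": []}},  # no mentions: contributes no basket line
    {"entities": {"user_mentions": [
        {"screen_name": "userA"}, {"screen_name": "userB"}]}},
]

# One comma-separated line per tweet that mentions at least one user.
basket = []
for tweet in tweets:
    names = [m["screen_name"] for m in tweet["entities"]["user_mentions"]]
    if names:
        basket.append(",".join(names))

# Save the basket so the data can be reloaded without querying again.
path = os.path.join(tempfile.mkdtemp(), "basket.csv")
with open(path, "w", encoding="utf-8") as fh:
    fh.write("\n".join(basket))

# Read it back into a list with the same shape.
with open(path, encoding="utf-8") as fh:
    restored = fh.read().splitlines()

print(restored)
```

Saving right after collection means a later rate-limit error does not cost you the data already fetched.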
It's always a good idea, whenever you collect data from any of these social media websites, to save it somewhere so that you don't have to keep querying again and again and possibly hit a rate limit. So this is how you can save the information in a file. Then you can also read the file back whenever you want, so you can get the file back into the same variable. I'm going to start off with an empty dictionary object where I'm going to collect data. The information you see is a little complicated, so let me explain what is happening. Round one: I open up the basket list I collected and look at each and every user mention, and for every user mention I count one. That's what I'm doing here. Now when I print the dictionary, you will see that it prints a lot of information. You see that Hillary Clinton was mentioned 64 times, and I'm creating one more thing called link, which I'll populate later. In the first round I'm simply counting how many times each user has been mentioned: Hillary Clinton was mentioned 64 times, John Roberts Fox was mentioned five times, Pontifex was mentioned 10 times. I'm just collecting that information. Round two: I walk through this dictionary, pick up each of these mentions, and look at which other people are mentioned whenever this person was mentioned, and how many times. So I'm going to populate this link structure with all the people who get mentioned along with this person and how many times they got mentioned. This is the walk-through here. You see that there's a five- or six-level loop going on: I take every user in the dictionary, then I walk through every record in the basket list and see if that user was in that record. If so, I find who the other people in the record are. Then I look at every other person in the record.
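The two rounds described here can be sketched on a tiny made-up basket. The dictionary layout (a `count` plus a `link` sub-dictionary per user) is an assumption that mirrors the description; the exact structure in the lecture's notebook may differ.

```python
# Made-up basket: one comma-separated line of mentioned users per tweet.
basket = ["a,b", "a,b,c", "a", "b,c"]

mentions = {}

# Round one: count how many times each user is mentioned overall.
for line in basket:
    for user in line.split(","):
        entry = mentions.setdefault(user, {"count": 0, "link": {}})
        entry["count"] += 1

# Round two: for every user, walk every basket line that contains them and
# count each other user who appears in the same line.
for user, entry in mentions.items():
    for line in basket:
        users = line.split(",")
        if user in users:
            for other in users:
                if other != user:
                    entry["link"][other] = entry["link"].get(other, 0) + 1

for user, entry in mentions.items():
    print(user, entry)
```

Here `mentions["a"]["link"]["b"]` is the number of tweets in which `a` and `b` were mentioned together.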
For each of them, I create an entry in this link list. So let's go and run this and see what comes out. Now you see it looks a lot better. Again, Hillary Clinton got mentioned 65 times, and whenever Hillary Clinton was mentioned, a few other handles got mentioned five times each. And you see that Bernie Sanders was mentioned 20 times: whenever Hillary Clinton was mentioned, Bernie Sanders was mentioned along with her 20 times. So you're trying to find that relationship and build this data structure. Once I've built this data structure, all I want to do is find frequently occurring patterns. I can set a support level of 0.5, which means 50%. I could print all the information here, but that would be a lot of information. What I'm rather going to focus on is this: I'm only going to look for patterns where a user handle occurs more than 50% of the time when another user handle occurs. I'm trying to find relationships where two handles are mentioned together more than 50% of the time. So again, I walk through the dictionary. I look at the overall count, the number of times this person was mentioned, and then the number of times the other person was mentioned along with them, and I compute this together value to compare with the support value. It is basically the number of times the other user was mentioned along with a given user, divided by the number of times that given user was mentioned. Then, if together is greater than support, I just print it out. Let's run this code. Okay, I haven't executed the support assignment here; that's why it is complaining. So you see now there's a nice pattern coming out. Every time Dana Perino was mentioned, another handle was mentioned every time. So it's 1.
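The threshold check can be sketched as below, reusing the same assumed `count` / `link` layout with made-up numbers. The variable names `support` and `together` follow the lecture's wording.

```python
# Made-up mention statistics: overall count per user, plus how often each
# other user appeared in the same tweet.
mentions = {
    "a": {"count": 3, "link": {"b": 2, "c": 1}},
    "b": {"count": 3, "link": {"a": 2, "c": 2}},
    "c": {"count": 2, "link": {"a": 1, "b": 2}},
}

support = 0.5  # keep only pairs that co-occur more than 50% of the time

patterns = []
for user, entry in mentions.items():
    for other, times in entry["link"].items():
        # Share of this user's mentions in which the other user also appears.
        together = times / entry["count"]
        if together > support:
            patterns.append((user, other, round(together, 2)))

for user, other, together in patterns:
    print(f"when {user} is mentioned, {other} is also mentioned {together:.0%} of the time")
```

In association-rule terminology this ratio is usually called confidence; the lecture uses the word support for the threshold. Note that the ratio is directional: `c` implies `b` every time here, while `b` implies `c` only two times out of three.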
It has a value of one, which means 100%. So that tells you what kind of pattern it is, how much it is repeating. Then there are things like Bernie Sanders: he was mentioned 30 times, and whenever Bernie Sanders was mentioned, Hillary Clinton was mentioned 66 percent of the time. You see, there's a pattern there. When you're looking at patterns, you're looking at both things. First, how many times the first handle itself was mentioned; you want that to be pretty high to start with. Then you look at which other person is mentioned. This is a major one: Bernie Sanders, 30 times. Speaker Ryan, 10 times: Speaker Ryan and another handle got mentioned together all the time, and we need to understand why. Mike Huckabee, 24 times; Politico, 10 times. Here is another pattern: you're looking at Greta, and with Fox News the value is 0.66, so 66% of the time Greta and Fox News were mentioned together. That makes sense, because Greta is a Fox News commentator, so they typically occur together. For some patterns you can find the meaning instantly; some patterns will be intriguing as to why they're occurring together, and that's when you want to dig in more. So this is how you can do frequent pattern mining. We did frequent pattern mining for user mentions. You can also do it for hashtags; you can actually do it for any of this stuff. It's the same kind of logic you would use to do frequent pattern mining. Hopefully this is helpful. 19. Real time analytics Use Case Python: Hey, welcome to this lecture on real-time analytics for social media. One of the use cases for social media is real-time analytics. When I say real-time analytics, we're trying to look at data in real time. For example, with Twitter, we're going to be frequently monitoring the Twitter feeds.
You monitor the Twitter feeds as tweets happen in real time, and the moment a tweet happens, you want to get that tweet, analyze it, and take some action. Typically, this is what people do in news agencies: they are constantly looking for tweets on popular personalities and popular topics, and whenever tweets come in, they immediately get them, analyze them, and summarize them. Nowadays, in a lot of contact centers and customer support centers, people are monitoring tweets in real time. It's not that they're sitting and extracting tweets on an hour-by-hour basis. Rather, they're plugged into Twitter, receive the tweets immediately as they happen, immediately run some analysis on them, and take action. Those are use cases like customer support. Similarly, when you are launching a marketing campaign and you put out a tweet on Twitter or a Facebook post, you immediately want to know what the reaction is. If people are tweeting badly about it, you want to know immediately and take action. Or if somebody is spamming you or putting out bad messages for whatever reason, you immediately want to take action. So how do we do it? Streaming. Streaming does not use the REST API format; it uses a separate thing called the streaming API. For example, in Twitter there is a separate API called the Streaming API. The way the Streaming API works is that you establish a persistent HTTP connection to Twitter, and the connection is always open. Through the connection, you tell Twitter that you want to listen to tweets happening on a specific topic or on a specific handle. Whenever somebody tweets something on that specific handle, you get that tweet immediately, in real time. And when you get the tweet in real time, you can then go and do your analytics and take some action. So how does this work? Let us go and take a look.
When it comes to the Streaming API, there are three kinds of streams that Twitter supports. There is the public stream, which is called the firehose stream, and which is all the tweets happening on Twitter; we won't do that. There is the user stream, where you can listen for a specific user handle, and that's what you would usually do: look for a specific user handle and get all the tweets happening about that user. The third is the site stream. The site stream is for a collection of users. Say you have your company, and the company has a few handles, like one handle per product, one handle per key executive, or something like that. You want to listen for multiple of those users; that's what a site stream is. Setting up all these streams is very similar, almost the same. The only difference is the URL that you give for these streams. Fortunately, the Twitter library in Python makes this work very easy for you. Let us go and do this specific thing. Again, the setup for this is the same as what you would do for the regular REST API: you set up the Twitter API object the same way, by giving your authentication keys. Once you do this, here is the practical example. We're going to get tweets that happen on a specific user account, which is my own demo user account. As the tweets happen, we want to do real-time sentiment analysis on them. So as a tweet happens, you immediately want to analyze its sentiment and send the result out somewhere. Typically, you would save this into a database, a real-time database or an in-memory database. In this example, we're just going to print it out on the UI itself. So the way you set it up is this: first, we set up the Twitter API.
Then we import TextBlob, which is the library used for sentiment analysis. The Twitter stream is set up with TwitterStream; that is the function call you make. The authorization info is the same thing we set up before, and you give it the domain userstream.twitter.com. That domain says to use the user stream, which means the stream for my own user. You might instead set it up with the site stream domain, in which case you would give a parameter with the list of users you want to listen to. So you just set up the Twitter stream, and once you have set up the Twitter stream, you call user() on it, which gives you an iterator. As the tweets happen, this iterator gets populated, and you can just iterate through it and take some action. I'm setting up the iterator now, and then, for each tweet in the iterator, what I'm doing is getting the text and then doing the sentiment analysis to get the polarity. Then I check whether the polarity value is less than zero, greater than zero, or equal to zero, and print whether it is negative, positive, or neutral, and finally I print the polarity and the tweet text. So I'm running this code, and you will see that the code is waiting here, waiting for tweets to happen. As and when tweets happen, it is going to print. Let's go to the Twitter UI. So here I have Twitter, and this is my account. Whatever I tweet on this account, or if anybody else tweets with my handle, I will get that information here. Let me start tweeting that it has been good. The moment I tweet, see what happens on the right side: immediately, what I tweet on Twitter shows up.
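The per-tweet loop described here can be sketched offline. Two things in this sketch are stand-ins and not the lecture's code: `fake_stream` replaces the live stream iterator, and `toy_polarity` is a deliberately crude word-list scorer replacing TextBlob, whose `TextBlob(text).sentiment.polarity` would supply the score in the real setup. The word lists and sample messages are made up.

```python
# Tiny made-up word lists for the stand-in scorer.
POSITIVE = {"good", "great"}
NEGATIVE = {"bad", "awful"}


def toy_polarity(text):
    """Stand-in for a sentiment library: returns a score in [-1, 1]."""
    words = text.lower().split()
    if not words:
        return 0.0
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return score / len(words)


def label_from_polarity(polarity):
    """Map a polarity score to the three labels used in the lecture."""
    if polarity < 0:
        return "negative"
    if polarity > 0:
        return "positive"
    return "neutral"


def fake_stream():
    """Stand-in for the stream iterator; yields dict messages as they 'arrive'."""
    yield {"friends": []}  # assumed non-tweet message without a text field
    yield {"text": "it has been a good day"}
    yield {"text": "this is a bad day"}
    yield {"text": "hello from the demo account"}


results = []
for message in fake_stream():
    if "text" not in message:
        continue  # skip anything that is not a tweet
    polarity = toy_polarity(message["text"])
    results.append((label_from_polarity(polarity), message["text"]))
    print(label_from_polarity(polarity), round(polarity, 2), message["text"])
```

With a live stream the loop simply blocks until the next message arrives, which is why the program appears to wait between tweets.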
I got the tweet into my program, it immediately ran the sentiment analysis, and I got the output. So this is how fast real-time analytics can be. Let me tweet again. Okay, I'm forcefully trying to do a negative tweet by using the word bad, and you see it immediately comes out there, right? And then another one, to see whether it is positive or negative: it will typically turn out to be neutral, because there are no positive or negative words in it. But the most important thing is the speed at which whatever you tweet here comes to your program, and that you are able to do this analysis in real time and publish something from it. It is really straightforward, and the code itself, as you can see, is very simple to write in Python. Hopefully this is helpful to you. Thank you. 20. Machine Learning Overview: Hi, welcome to this lecture on machine learning overview. So what is machine learning? We have data, and data contains attributes. For example, suppose the data is about an employee: the attributes are things like age, educational qualifications, gender, and performance, a lot of attributes. What these attributes do is give you relationships between the entities. You would typically see that people who have similar attributes have similar behavior. People who belong to the same age group might exhibit similar behavior; people who have similar educational qualifications would exhibit similar behavior. Learning is about understanding this behavior, understanding the relationships between the entities, and understanding how the attributes of people affect how they behave. That is learning in general.
Now, machine learning is about a computer trying to do that: trying to look at data, and from the data understand the attributes and the relationships between the entities based on the attributes, and come up with a model that shows how the attributes relate to what a person will do. Once you have that model, you can use it for grouping and prediction. You use past data to build a model of how these people are expected to behave in the future, and then you try to predict their future behavior based on the model you have built. That's what machine learning is all about. What about data for machine learning? Machines only understand numbers. This becomes very important in social media analytics, because social media data is always text, text strings and so on, but machines only understand numbers. Text data needs to be converted into equivalent numerical representations for machine learning algorithms to work. So how do you convert text data? If it is data about attributes, say you're extracting data about a person and you have age, location, and so on, you can convert it into a table pretty quickly. But if you're looking at post data, the strings, those strings also need to be converted into a numerical representation. For that we use a technique called TF-IDF, and there is a separate lecture on what TF-IDF is. That is how you convert text data into a numerical representation, a numerical array on which machine learning algorithms can work. So machine learning algorithms understand only numeric data, and you need to convert your data into numeric data before it can be used for machine learning. For example, when you're storing strings like excellent, good, and bad, they need to be converted into equivalent numbers.
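Converting rating strings into numbers can be sketched in two common ways: a single numeric code per rating, or one 0/1 flag per rating level. The particular codes and the flag names below are illustrative assumptions, not values from the course.

```python
# Made-up rating strings as they might be stored for four records.
ratings = ["excellent", "good", "bad", "good"]

# Option 1: map each rating to a number (the codes are an arbitrary choice
# that preserves the ordering bad < good < excellent).
RATING_TO_NUMBER = {"bad": 0, "good": 1, "excellent": 2}
numeric = [RATING_TO_NUMBER[r] for r in ratings]

# Option 2: one Boolean-style flag per rating level, exactly one set per record.
flags = [
    {f"{level}_rating": int(r == level) for level in RATING_TO_NUMBER}
    for r in ratings
]

print(numeric)
print(flags[0])
```

The numeric code implies an ordering between ratings, while the flags treat each rating as a separate category; which one fits depends on the algorithm you feed it to.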
Similarly, you can convert them into Boolean variables: you create three separate variables called excellent rating, good rating, and bad rating, and you have flags like 0 and 1 in them. You can do it with Boolean variables or numeric variables, but you need to do it before you can use the data for machine learning. And, of course, there is the document-term matrix: you need to convert documents into a document-term matrix, and there is a separate lecture about that. But once you convert the data into that format, the machine learning concepts themselves may be complex, but using them in your programming language is pretty simple, because people have developed really simple-to-use algorithm implementations. There are two types of machine learning. The first is called unsupervised learning. It's called unsupervised because there is no guidance given to the machine learning algorithm as to how it is going to learn. In unsupervised learning, you allow the machine learning algorithm to explore the data and come up with groups, similarity, or structure within the data. It looks at the data and sees how the items group themselves together. For example, you look at a bunch of tweets to see how these tweets group together in terms of what subject they're about: finance, sports, politics. You want the algorithm to find natural groupings in the data, rather than forcing it by saying this is the group you have defined. Observations are grouped by similarity between entities. In the case of tweets, similarity means similar strings being used, similar hashtags being used, or similar users being referred to. There are a number of similarity algorithms. They measure things like the distance between values, or the presence or absence of a value.
A lot of algorithms exist for unsupervised learning. There are different types of unsupervised learning, like clustering, association rules mining, and collaborative filtering. In this course, we're not going to get into the details of each of them, because that's a course in itself, and there are other courses from V2 Maestros which actually deal with them. Supervised learning is trying to predict unknown data using known data. When we say supervised learning, there is past data that you've already classified or grouped. For example, you can look at the past history of all your website visitors, and you know who bought the product and who did not, so you have already classified them as buyers and non-buyers. Now you can use the past data to look at future data: when a new user comes in and logs in to the website, you get attributes about the user and try to find out if this person is actually going to buy your product or not. So you're using past data to predict future data, but you already know the groups for the past data. That's why it's called supervised: you already know the class for the past data that you have, and using that you build a model and predict on new data. The way you do it is that you build models based on the past data, where both the outcomes and the predictors are known. You know what the grouping is and you know what the attributes are. You use that data to build a model, and then you use the model to predict your outcomes. There are two types of supervised learning techniques: one is called regression, and the other is called classification. There are a number of algorithms in both categories, a vast variety of them, and implementations are available in all programming languages. We are only going to be dealing with one of them.
In this particular course, the supervised learning process works like this. You have historical data that has both predictors and outcomes. You first split the historical data into a training data set and a testing data set. Say you have 100 records: you put 70 of them in training and 30 of them in testing. You use the training data set to learn about the data and build a model. Both the training and the testing data sets have all the attributes and the outcomes: attributes about users like their age, their gender and the location they come from, and whether they bought the product or not. Using the training data set, you build a model, and then you apply the model on the test data set. You take the test data set, remove the outcome, try to predict the outcomes yourself, and then compare your predictions with the actual outcomes to see how accurately your model is able to predict them. So you are testing your model: you apply the model on the test data, try to predict the outcome, and check whether you are able to predict it correctly. Once you are confident that your model is working fairly accurately, then for any new data that comes in, you can apply the model on the new data with all the predictor variables and predict what the outcome should be. For example, a website user comes in and starts browsing your website. You apply this logic to find out whether this person is likely to buy the product or not. If you think there is a very good chance this person is going to buy, maybe you offer a chat with somebody to help him buy, because you do not want to spend the people who can help your buyers on visitors who may not buy the product. That is how you can do some smart decision making. So, training and testing data: historical data has both predictors and outcomes, and you split it into training and testing data.
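The split, build, predict and compare steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the visitor records are invented, `pages_viewed` is a hypothetical predictor, and the "model" is a deliberately simple threshold halfway between the average buyer and the average non-buyer, not one of the algorithms used later in the course.

```python
import random

# Hypothetical historical data: 100 website visitors with one predictor
# (pages_viewed) and a known outcome (bought).
records = [
    {"pages_viewed": i % 20 + 1, "bought": (i % 20 + 1) >= 11}
    for i in range(100)
]

# Random 70/30 split into training and testing data.
rng = random.Random(42)
shuffled = records[:]
rng.shuffle(shuffled)
split_at = len(shuffled) * 70 // 100
train, test = shuffled[:split_at], shuffled[split_at:]


def mean(values):
    return sum(values) / len(values)


# "Build the model" on the training data only: a threshold halfway between
# the average buyer and the average non-buyer.
buyer_mean = mean([r["pages_viewed"] for r in train if r["bought"]])
non_buyer_mean = mean([r["pages_viewed"] for r in train if not r["bought"]])
threshold = (buyer_mean + non_buyer_mean) / 2


def predict(record):
    """Predict the outcome from the predictor alone; the real outcome is ignored."""
    return record["pages_viewed"] > threshold


# Apply the model to the test data and compare predictions with actual outcomes.
predictions = [predict(r) for r in test]
actuals = [r["bought"] for r in test]
accuracy = mean([p == a for p, a in zip(predictions, actuals)])
print(len(train), len(test), round(accuracy, 2))
```

The important point is that the threshold is learned from the 70 training records only, while the accuracy is measured on the 30 records the model has never seen.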
Training data is used to build a model, and testing data is used to test the model. You apply the model on the testing data, try to predict the outcome, and then compare the predicted outcome with the actual value. Then you measure what your accuracy is going to be. As for training and testing best practices, you typically go in for a 70 to 30 split, and you want to split the records in a random fashion. To do the sampling, to do the split in any ratio and to do the random selection, the programming languages again provide you with a lot of tools, so you do not have to break your back to do that. It is pretty simple and straightforward to do as a part of the exercises. The thing we want to mention here is that this particular course is mainly focused on getting social media data, extracting data from the sources and doing some analytics on it. It is not a course on in-depth machine learning. What we want to tell you is that once the text data is transformed into numerical representations, and we are going to be showing you how to do that, you can use any standard machine learning technique on the data like on anything else. In V2 Maestros, we have other courses that deal specifically with data science and with machine learning in depth, with a lot of algorithms, so you can take a look at those other courses. And typically, if you used a coupon to buy this course, maybe that coupon is also applicable to the other courses, so you can try that too. I hope this lecture has been helpful to you. We are going to be seeing a couple of examples, one for clustering and one for classification, as a part of this course. If you are interested in learning more about other machine learning techniques, please take a look at the other courses offered by us. Thank you.

21. SMA Classification Use Case Python: All right.
Welcome to this lecture on classifying tweets. In this lecture, we first have tweets, or posts or messages, classified into predefined types, and using that we are going to build a model. After building the model, we are going to start predicting, for new tweets or new posts that are coming in, what specific category they belong to. For this work, what we have here are tweets which are pre-classified. These tweets are about three different leagues: the NBA, the NFL and the MLB. This is the CSV; it just has league and tweet. Someone has taken a bunch of tweets and classified what each tweet is about: NBA, NFL or MLB. We are going to use this data set to build a model, a model that can then be used when a new tweet comes in to identify whether the tweet is about the NBA, the NFL or MLB. Once you build a model, you store the model in memory, and whenever there are new tweets coming in, you take each tweet and try to identify what specific league that tweet is about. This is how classification works. It is supervised learning, where you have pre-classified information on which you build a model, and then with that model you do some predictions. So let's first load up the data and look at what the data looks like. You see that this data is loaded from the CSV and has the league information and the actual tweet information. The first thing we do here is text preprocessing, for which we use the scikit-learn library to build the TF-IDF vector. For the TF-IDF vector, we first pick up the tweet column from the data frame and create a corpus. On that corpus, we run the TfidfVectorizer and its fit_transform, and that gives you a TF-IDF vector. In this vector, you will see that there are a lot of zeros.
That is because there are a lot of words, and not every record has every word populated. If you look at the size of the vector, it has about 150 rows, one for every tweet, but it has a little over 1,000 columns, one for every word it can find. This is what the TF-IDF data looks like. Now, on to the classification. As we talked about in the theory lecture, we are going to take this data set and first split it randomly into training and test data sets. Then we build a model on the training data set and use the test data set for predictions. That is done to find out how accurately your model is going to predict: for the test data set you do know what the type actually is, but you also use the model to predict what it is, and then you compare the actuals with the predictions to see how accurate your predictions are. For this one, we are going to use the Naive Bayes algorithm that comes with the scikit-learn library. Using the scikit-learn library, you first pick up the predictors. The predictors are the set of attributes that are going to be used for prediction, which is your TF-IDF vector. The target, which is the actual classification that you already know, is the league column of the tweet data that you have. Then you can create your training and test data sets using one command called train_test_split. With train_test_split, you can split both predictors and targets at the same time. You could also split them separately, but you can do both in the same call.
You say test_size is 0.3, which means that 30% of the data will randomly go to the test set and about 70% will go to the training set. As a result, you get four different data sets: the predictor training set, the predictor testing set, the target training set and the target testing set. The predictor train and the target train are used for building the model, and then the predictor test and the target test are used for testing the model. So you do the split, and then you look at the size of each of these vectors. If you look at the training data set, its size is 105, which is 70% of 150, and you can look at the size of all the others in the same way. The target training set has 105 rows and just one column, whereas the predictor training set has a little over 1,000 columns, because the whole set of columns is inherited from the TF-IDF vector. Once you have split the data into the training data set and the testing data set, you first build the model. You initialize the classifier, which is the Gaussian Naive Bayes classifier, and then you build a model by calling classifier.fit, passing it the predictors and the target variable. The classifier takes the predictors and the target variable and builds a model with which it can use the predictor data to predict the targets themselves. So let's go and run the classifier here, and the classifier is built. Now you can do the prediction on the test data set. When you do a prediction, you only pass the predictor test and try to predict the targets, and the predictions are stored in this predictions variable.
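The steps walked through so far (TF-IDF vector, 70/30 split, Gaussian Naive Bayes, prediction) can be sketched as follows. This is not the course notebook: the twelve tweets are invented stand-ins for the course CSV, the variable names are assumptions, and `train_test_split` is imported from `sklearn.model_selection`, which is where current scikit-learn releases keep it.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Hypothetical pre-classified tweets standing in for the course CSV.
tweets = [
    "Huge dunk in the fourth quarter of the basketball game",
    "Three pointer at the buzzer wins the basketball playoff",
    "The point guard had twelve assists and a dunk tonight",
    "Rebounds and a late three pointer decide the playoff",
    "Quarterback throws a touchdown pass in overtime",
    "The defense sacks the quarterback twice before halftime",
    "Field goal and a late touchdown seal the football game",
    "Running back scores a touchdown on the opening drive",
    "Pitcher throws a shutout with ten strikeouts",
    "Home run in the ninth inning wins the baseball game",
    "The shortstop turns a double play to end the inning",
    "Grand slam home run off the relief pitcher",
]
leagues = ["NBA"] * 4 + ["NFL"] * 4 + ["MLB"] * 4

# Text preprocessing: one row per tweet, one column per word, mostly zeros.
vectorizer = TfidfVectorizer()
predictors = vectorizer.fit_transform(tweets).toarray()
targets = leagues

# Split predictors and targets in the same call; 30% goes to the test set.
pred_train, pred_test, tgt_train, tgt_test = train_test_split(
    predictors, targets, test_size=0.3, random_state=42
)

# Build the model on the training data, then predict from the test predictors.
classifier = GaussianNB()
classifier.fit(pred_train, tgt_train)
predictions = classifier.predict(pred_test)
print(predictors.shape, len(pred_train), len(pred_test))
print(list(predictions))
```

`toarray()` turns the sparse TF-IDF result into a dense array, which `GaussianNB` needs. With only twelve toy tweets the predictions will not be very reliable; the point is the shape of the workflow.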
So you build a model, and then with that model you do a prediction on the test data set. Once you do that, you want to figure out how accurate your predictions are. For that you can use scikit-learn's metrics.accuracy_score. You pass it the actual values, that is the target test, which is nothing but the part of the target that was separated into the test set. Then you pass the predictions for the same records. You run this accuracy score and it gives you an accuracy of 0.82, which is 82% accurate. That is good. You can also create a confusion matrix. A confusion matrix is a more detailed analysis for when you have different classes in the data. Here there are NBA, NFL and MLB, three different classes, and you are trying to classify data into these three classes. It tells you what the accuracy is by class. When you run the confusion matrix here, it gives you a matrix like this. What this means is that it is laying out the predictions on the X axis and the actuals on the Y axis: NBA, NFL and MLB on the first, and NBA, NFL and MLB on the second. Where NBA and NBA meet, the value is 12, which means that for these 12 tweets we have correctly predicted NBA as NBA. The other cells are wrong predictions, where the tweet was actually, say, NFL, but you predicted it as NBA. So the cells along the diagonal are all your correct predictions, and everything else is an incorrect prediction. That is what happens with prediction: you are not going to predict 100% correctly all the time. But we have an overall accuracy of about 82%, which is good. So this is how you can take tweets, first convert them into a TF-IDF vector, and then split them into the training data set and the testing data set, splitting the predictors and the targets separately. You build a model using the predictor train and target train.
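The accuracy score and the confusion matrix described above can be tried on a small hand-written example. The labels below are invented for illustration and are not the course results; the `labels` argument is passed explicitly so that the row and column order is NBA, NFL, MLB.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical actual values (target test) and predictions for ten tweets.
actual = ["NBA", "NBA", "NBA", "NFL", "NFL", "NFL", "MLB", "MLB", "MLB", "MLB"]
predicted = ["NBA", "NBA", "NFL", "NFL", "NFL", "NBA", "MLB", "MLB", "MLB", "NFL"]
labels = ["NBA", "NFL", "MLB"]

# Fraction of tweets whose predicted league matches the actual league.
accuracy = accuracy_score(actual, predicted)

# Rows are the actual classes, columns are the predicted classes.
matrix = confusion_matrix(actual, predicted, labels=labels)

print(accuracy)
print(matrix)
```

Seven of the ten predictions match, so the accuracy is 0.7. Those seven sit on the diagonal of the matrix (2 NBA, 2 NFL, 3 MLB), and the three off-diagonal cells show which league each wrong prediction was confused with.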
Then you predict using only the predictor test, and compare the predictions with the actual values. This is how classification works. I hope this has been pretty helpful for you. Thank you.

22. SMA Clustering Use Case Python: Hi. Welcome to this use case on clustering. As we discussed in the clustering lecture, clustering is a way of grouping items that look pretty similar to each other. In this case, we are going to take tweets, group them together and identify tweets that are similar to one another. Similarity is usually determined by the kind of words used in the tweets. When we say words, it is usually also about what kind of users are mentioned in the tweets, what kind of references are made in the tweets, and stuff like that. As complex as machine learning is, Python makes it very easy to do clustering. There is just a simple set of functions to call, which will do the grouping for you. What I am going to do here is walk through a simple use case of how to do this. To begin with, this time I am not going to download tweets directly from the Internet. I have already downloaded them and saved them into a file. This is the usual way you will do things: you download things and store them in a file. For example, you might be downloading thousands of tweets before you do classification, and to download thousands of tweets you need to wait a lot of the time to make sure that you do not hit the rate limit. You collect the tweets in a file like this continuously over a period of a couple of days, and then you can go and do the analysis. So here is a set of tweets I collected about a few topics. You see the tweets here, and I am going to start the clustering. What I do for it is use read_csv to read the tweets and load them into a tweets data frame.
If you look at the head of the data, you will see that it is just a list of tweets that got loaded. So what do you do with the text? We need to build the TF-IDF vector for this text. We had another lecture about how TF-IDF works, so this is a straightforward use of it. The scikit-learn library of Python has a feature extraction capability, and in it there is a TfidfVectorizer. All you have to do to use the TfidfVectorizer is to first create a corpus, which means taking all the tweet strings and creating a corpus of all the strings. Then you just call the TfidfVectorizer, specifying stop words so that it removes all the unnecessary words. When you give the language as English, it uses a standard English collection of stop words and removes all of them before creating the vector. Then you just call fit_transform on the corpus and convert the result to a dense matrix. It creates a vector of all the words that are used in the various tweets and gives you a TF-IDF matrix. This is what the matrix finally looks like. As you see, it is a huge matrix. If you look at the shape of the matrix, it has about 535 columns, which means that 535 words have been measured overall. A lot of the matrix entries are going to be zeros because this is a sparse one; there is not a lot of data in most of these columns. So this is the matrix you create. Then comes the clustering, which is again a pretty straightforward thing to do: in the scikit-learn library there is a KMeans clustering algorithm available. You first create a KMeans model and say how many clusters you need. You ask for three clusters, which means it is going to group the data into three clusters.
Then you call model.fit on the TF-IDF matrix, and this fits the model. Then you call predict on the fitted model, and the prediction is available to you in a simple array. If you look at what the prediction contains, it gives you, for every line in your original corpus, what its prediction is. It puts each line into group 0, 1 or 2, because we asked for three clusters. Now, to identify which tweet got grouped into which cluster, let's run this command, which prints the prediction and the tweet side by side, and see how it has grouped them. Look at all the tweets and let's see what 0, 1 and 2 actually are. For a 1, we are talking about Bangladeshi political refugees. For a 0, we are talking about England. Another 0: again we are talking about the United Kingdom, England. For a 1, we are talking about Russia. Here we are talking about the Wisconsin primary. As you go through them, you start realizing that wherever we see 0, we may be talking about England. Yes, this one was England. Let's look around: most of the zeros have been England. So 0 has grouped tweets that are related to England. Let's see what 2 is: it seems to group tweets that are related to Obama. Here is a 2, again Obama; another 2, again Obama. So you see that there is a pattern. It is trying to group tweets which talk about similar things. I kind of arranged it: when I downloaded the tweets, I picked specific topics so that the groups show up in a pretty straightforward fashion. That will not be the real use case, but to explain how the clustering works I picked tweets from three groups. Typically, when you do this grouping, you download thousands of tweets and then do this analysis. You would not do it with just 100 tweets in a real-life scenario, though you can.
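The clustering walkthrough above can be sketched end to end on a handful of tweets. The nine tweets below are invented and loosely follow the three topics mentioned in the lecture; `random_state` and `n_init` are set only to make the run repeatable and are assumptions, not settings taken from the course.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical tweets on three topics, standing in for the saved CSV file.
tweets = [
    "England win the cricket test match at Lords",
    "Heavy rain expected across England this weekend",
    "England football squad announced for the friendly",
    "Obama speaks about health care reform today",
    "President Obama meets world leaders at the summit",
    "Obama signs the new climate bill into law",
    "Russia announces new trade deal with neighbours",
    "Moscow and Russia prepare for the winter festival",
    "Russia launches a new satellite into orbit",
]

# Build the TF-IDF matrix, dropping standard English stop words.
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(tweets).toarray()

# Ask for three clusters, fit the model, then predict a cluster per tweet.
model = KMeans(n_clusters=3, n_init=10, random_state=42)
model.fit(tfidf)
predictions = model.predict(tfidf)

# Print the cluster number and the tweet side by side.
for cluster, tweet in zip(predictions, tweets):
    print(cluster, tweet)
```

The cluster numbers 0, 1 and 2 are arbitrary labels. As in the lecture, you have to read the tweets in each group to work out what topic, if any, a cluster represents.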
Typically, this kind of grouping is used to classify information by type, let's say by a specific topic. The topic might be a person; it may be something like finance or politics, or maybe a country. It can be anything. So you can use this clustering to group information. Again, clustering works by itself. There is no supervision as to how you want the data to be grouped. It groups the data by itself into three groups; that is all it will do. Sometimes the groups will not make sense, but that is how the data is, and there is not much you can do about it. In this case, though, you do see a pattern. You really do not take all tweets in general and try to group them. Rather, you get a subset of tweets, for example for a specific user handle or for a specific query, and then try to group them. That is how you would usually work with this one. I hope this has been helpful for you. Thank you.

23. Linking Data: Hi. In this lecture, let's talk about linking data. When you are looking at social media analytics, the data that you get from any social media site is pretty much limited to the tweets that people are posting and the attributes of those people. That data by itself has limited use. If you are just going to use that data alone, you will not get far; you need to be able to take that data and get more information about those people. For example, somebody is tweeting about your product, and you want to know: is this person an existing customer? Have they been using the product? Do they have any problems with the product? To find out all of that, you should be able to take the social media data and link it to your existing CRM databases or customer databases, so that you can find some correlation between the two sets of data.
But remember that when you have your own customer data or CRM data, the information that you have about the person is typically their phone numbers or their email IDs, whereas when you go to social media, you are getting Twitter handles and Facebook handles, and you may not have their email IDs. So how do you link those two records? That is going to be the challenge that you have. One of the things that you can do is, when you are registering a customer, try to get their Twitter handle and Facebook handle. But that is data they may not be willing to share with you. Still, linking data between your CRM, your customer databases and your marketing databases and social media is pretty critical, so that you can find out whether there is an unsatisfied customer who bought your product, whether they have any open support tickets, or whether this is a prospective customer whom you want to engage more. In order to link data between the Twitter and Facebook handles and your email IDs and phone numbers, there is a company, FullContact, that offers a service called the Person API. You can Google for it. When you use this Person API, you again use the same REST API schemes we talked about in this course, with the same authentication keys, API keys and stuff like that. With this API, you can query with one specific link for a person. A person has many links: phone numbers, email IDs, Twitter, Facebook and LinkedIn handles and stuff like that. You can query with one link, so you can provide the person's Twitter handle and get the person's phone number, or you can do it the other way around. What this helps you do is query the service and build a database yourself of who your existing customers are, who your marketing prospects are, and what their corresponding Twitter IDs and Facebook IDs are.
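Once a handle-to-customer lookup exists, the linking step itself is small. The sketch below does not call FullContact or any other real service; it assumes the lookup table has already been filled in, either at registration time or from such a service, and every name, handle and field in it is hypothetical.

```python
# Hypothetical CRM records keyed by email ID.
customers = {
    "ann@example.com": {"name": "Ann", "open_tickets": 2},
    "bob@example.com": {"name": "Bob", "open_tickets": 0},
}

# Hypothetical lookup table from Twitter handle to email ID, collected at
# customer registration or obtained from a person-lookup service.
handle_to_email = {
    "@ann_tweets": "ann@example.com",
}


def link_tweet(tweet):
    """Attach the matching CRM record (or None) to an incoming tweet."""
    email = handle_to_email.get(tweet["handle"].lower())
    customer = customers.get(email) if email else None
    return {**tweet, "email": email, "customer": customer}


incoming = [
    {"handle": "@Ann_Tweets", "text": "Still waiting on support for my order"},
    {"handle": "@stranger", "text": "Thinking about buying this product"},
]

linked = [link_tweet(tweet) for tweet in incoming]
for item in linked:
    status = "existing customer" if item["customer"] else "prospect"
    print(item["handle"], status)
```

A tweet that links to a customer with open support tickets can be routed to support, while an unlinked handle can be treated as a prospect.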
With such a database, any time a tweet happens, you can immediately look up that tweet, link it to your internal databases and find some correlation. Why is this person tweeting? Do they really have some problem now? Or there is a person who is tweeting something positive about your product: is he an existing customer, or is he a prospective customer, in which case you make sure to go out, reach him and try to sell some products? That linking has to happen. So in your real-world experience, you would be linking data. Either you can collect the Twitter handles and Facebook handles as a part of your customer registration process, or you can use this API to link the data. Let me also make a disclosure: I do not work for the FullContact company, nor do I have any interest in them. But I find this linking very important in order to link between these two worlds, the social media world and the CRM world. I hope this information has been pretty useful to you.

24. Closing Remarks SMAP 2: Hi. Welcome to the conclusion of the course Social Media Analytics with Python. I hope you had a great experience with this course. We went through a number of things in the course, starting from social media concepts, social media data, extracting data from various sources, and transforming data into formats suitable for analytics. We went through multiple use cases, and we looked at classification and clustering. I hope you were able to go through the exercises, play around with them and actually execute some code, get some of your own data from social media websites and analyze it. The next step we recommend is to continue learning about big data science and analytics. Try the exercises with new data sets that you can get from various websites. We tried a lot of Twitter and Google Plus data; you can also try data from other social media sites.
If you can get access to such data, do try it, and do learn about other data science and analytics topics as well. So congratulations on finishing this course. We hope this course helps you advance your career. Thank you so much, and best of luck.