Streamlining Workflow: Automated Information Fetcher From Scratch

Zichen Liu

7 Lessons (26m)
    • 1. Introduction (0:55)
    • 2. Super Fast Setup (2:02)
    • 3. Fetching From Google (8:17)
    • 4. Script Usage (2:10)
    • 5. Fetching From Amazon (4:21)
    • 6. Fetching the News (7:28)
    • 7. Project Ideas & Wrap Up (0:26)

26 Students

-- Projects

About This Class

Check certain websites frequently? You can boost your productivity and focus by using a simple script to fetch the key information from the websites you visit. This eliminates the effort of doing that repetitive checking by hand.

By the end of the class, you will have set up a data fetcher for:

  • the news headlines,
  • the price of a pair of headphones on Amazon,
  • the stock market index.

Check out the intro video to see what we'll achieve!

Meet Your Teacher

Zichen Liu

Teacher



Transcripts

1. Introduction: When it comes to productivity, using technology to automate repetitive tasks can be one of your most powerful tools. Say each morning you check the news headlines, the stock market index, and the price of the new pair of headphones you want. I'll show you the tools that I use to automate the fetching of these pieces of information and help you downsize that routine. We'll look for ways to retrieve the data from the websites, and then write a script for you to run to fetch that data. And of course, once you're comfortable with that, you can extend it to hundreds of websites, all at one time. We'll be using Python to do the automation, so it will help if you have programmed even a little in the past. But for those of you who haven't, I hope the result is exciting enough to convince you to give it a go.

2. Super Fast Setup: So the first thing we've got to do is get Python and PyCharm installed, if you don't have these applications yet. I'm going to go to python.org, then go to Downloads, and we're just going to download Python 3.9. Then for the text editor, if you don't have one already, I would go ahead and get PyCharm. We'll just Google PyCharm, and the Community Edition is free to download. So please go ahead and download and install both applications. Next, after you have installed both Python and PyCharm, we're going to go into our command prompt, and we're going to use pip, the Python package manager, to install lxml. It's going to tell me I already have it. We're going to install beautifulsoup4, same as before. And then we'll do pip install requests. So after you have these three dependencies, we should be all ready to go. After you've completed the setup, let's go ahead and create our project. We'll create a new folder for the data fetcher, and then open it up in PyCharm. We'll click on Open and just select the data fetcher folder. Great. So let's go ahead and create our first Python file. We can call it fetcher.py.

3. Fetching From Google: Okay, now with the setup complete, let's go ahead and actually fetch some data. There are lots of things that people regularly look for, and I'll go through three examples. My first one is going to be the stock market index, and let's use Google as the data source. So I'm going to go ahead and search for Dow Jones, and it takes me to this page. I'm just going to copy the URL here, because we are going to be making requests to this URL. So in Python, we're going to type url equals this. And we're going to use requests, which we downloaded previously, to make a request to this web page. One thing that we'll need is the headers. I'm just going to paste these from an existing source; they just say that we're making this request through a Chrome browser. Then we're going to do requests.get with the URL and the headers, and we're going to log the response. Let's have a look at it: we'll print response.text and give it a run. Okay, great. So this is the website that we fetched, and we would like to retrieve the stock market number from it. Well, this is quite a large page and there's no way we can parse the entire thing by eye, so let's go and look for clues on where we might find it. This is our webpage, and we would like to get at this particular number in this section of the page. So I'm going to click on this inspect button here, and it's going to allow me to select this particular number, and it's going to tell me which box contains the number.
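For reference, here is a minimal sketch of what fetcher.py might look like at this point in the lesson. The URL is a stand-in for whatever you copied from your own "Dow Jones" search, and the User-Agent is just one example of a Chrome-style header; neither is the exact string used in the video.

    # fetcher.py, first pass: request the results page and print the raw HTML.
    # URL and User-Agent below are illustrative placeholders, not the video's values.
    import requests

    URL = "https://www.google.com/search?q=dow+jones"  # placeholder, paste your own
    HEADERS = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                      "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0 Safari/537.36"
    }

    response = requests.get(URL, headers=HEADERS)
    print(response.text)  # dump the fetched HTML so we can look for the number in it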
If we look for strings here that are characteristic of this particular box, we'll be able to search for them in the HTML that we fetched. So why don't we search for this one? I'm going to copy this and search for it here. And sure enough, we can see that this is the same as what we saw before. This is the span, and the market index is contained here: 29,910. So that looks great. Let's go and try to fetch this within our code. We're going to do from bs4 import BeautifulSoup, and this is the library that we're going to use to look for this particular number. First we want to create a BeautifulSoup object, which we do with BeautifulSoup of response.text, and we're going to use the parsing library which we also just downloaded, called lxml; we'll set that here. Then we want to go ahead and use soup.find_all. You'll remember that previously, when we looked at the page, this span had a jsname attribute equal to this string. So we pass jsname equals this string, and we'll call the result values. Here we have a list of all of the elements that carry this particular tag. Well, by the looks of it there's only one of them, so we're just going to take the first element of the list. Let's have a look at it first. You can see it's selected exactly the right tag that we wanted from the entire file. Now we want everything between these two span tags, so we're going to say .contents, and that's going to give us a singleton list with one single element. So we'll take the first item, and we'll get exactly the number here. That's great, that's exactly what we wanted. So let's go ahead and use this. It's time to tidy up our code a little bit. Let's create a function to make the web request: we'll do def make_web_request of some URL, and we're just going to return the soup that we created, passing in the URL as well. Then soup is going to be make_web_request of the URL we have. Okay, great. This should do exactly the same thing as before; it's just that we've extracted it to its own function. Next, let's extract the Dow lookup to its own function so we can reuse it later: def get_dow_value. The return of this function should be exactly this clean number. So we'll take this section and just stick it into here. Great. Now whenever we do a print of get_dow_value, we're going to get exactly this value.

4. Script Usage: Now the thing is, we would like to be able to execute the script and get this value, say in the morning at 5:00 AM when we wake up; we want to execute the script immediately and get this number. Otherwise, I could just go to Google, type in Dow Jones and get it anyway. So we want something that is essentially quick, and opening up PyCharm to run the script isn't exactly quick. If we go back to the folder where we created the project and open it up, we can see the script that we've been working on, called fetcher.py. We can actually double-click this script, and although it will run, it will disappear very quickly because it's not waiting for user input. We just need to add a very small line here that will simply say this is the end of the program. All this will do is wait for user input once it has fetched the Dow Jones value. So let's see: just double-click this.
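Here is a rough sketch of the refactored script as described above. The jsname value is a hypothetical placeholder for whatever attribute value the inspector shows on your page, and the URL and header are the same stand-ins as before.

    # fetcher.py, refactored: a reusable request helper plus a Dow Jones lookup.
    # "SOME_JSNAME_VALUE" is a placeholder for the attribute value copied from the
    # page inspector; Google's markup changes, so check it yourself.
    import requests
    from bs4 import BeautifulSoup

    HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/87.0"}
    DOW_URL = "https://www.google.com/search?q=dow+jones"  # placeholder URL


    def make_web_request(url):
        # Fetch the page and return it parsed with the lxml parser we installed.
        response = requests.get(url, headers=HEADERS)
        return BeautifulSoup(response.text, "lxml")


    def get_dow_value():
        # The index sits inside the span tagged with the jsname attribute we found.
        soup = make_web_request(DOW_URL)
        values = soup.find_all(jsname="SOME_JSNAME_VALUE")  # placeholder value
        return values[0].contents[0]


    print("Today the value of the Dow Jones is", get_dow_value())
    input("End of program")  # keep the console window open until Enter is pressed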
And as you can see, it's what we saw before. We have the end-of-program prompt, and if we just hit Enter, we'll be able to quit. So that's great, we've done exactly what we wanted. Let's just tidy things up a little bit, since we'll do this for a couple of other things: it will tell us that today the value of the Dow Jones is this, press Enter to quit. So when we wake up, we'll just click this button and get the value.

5. Fetching From Amazon: So it's Black Friday and you've got your eyes on a pair of headphones that you really want, while the price keeps going up and down, which is really annoying. So why don't we use this exact same tactic to monitor the price of that? Let's head over to Amazon and find a pair of headphones. I wonder which one is good; let's have a look at this one. I'm going to want to monitor this price here, the deal price. And like before, we're going to examine which box this one belongs to. It looks like for this one we have an id that we can use: id equals priceblock_dealprice. That sounds like exactly what we want, so we'll copy this. We're going to reuse make_web_request, but this time I'm going to create a new function similar to get_dow_value, called def get_headphones_price. We're going to do soup equals make_web_request of the website, and I'll type that in in just a second. And we're going to say the price tag is soup.find_all with id equals priceblock_dealprice. Let's get the URL of that pair of headphones right here, and have a quick look at what it gives us. We'll comment out these two lines, call get_headphones_price, and just print out the price tag to see what we get. Okay, great. It comes back as a singleton array with the exact price right here. So let's go ahead and add a [0], which will retrieve the singleton element; then we'll do .contents, which will retrieve the 219 pounds here; and then we'll take the first element of that, which will retrieve the string itself. You can see what we've ended up with here is exactly the same shape as before. Let's have a look at it. Perfect, as we would expect: 219 quid for a pair of headphones. Not too bad. So we'll just tidy this up, and this right here is the price tag. Let's incorporate it back into our program. For something like this the price fluctuates a lot, so you might want to check it maybe every hour or so. We can say: right now, the value of the headphones you want is get_headphones_price. Great. Let's save this, go back to the project folder and double-click the script as we might do each morning. You can see it just takes a little bit of time to visit each website, and it's retrieved these two items here. Let's do one final one.

6. Fetching the News: So the first two that we did were both single numbers. This final one, which I know a lot of people do on a very regular basis, is checking the news. You look at a website which contains a lot of news headlines, and you might look at it a couple of times a day. Well, let's go ahead and retrieve all of the headlines only; once you've seen the headlines, you can decide whether it's something worth looking at. You can go to your favourite outlet; I'll just use ITV News. Maybe we can go to topics and look at the business news. And you can see we've got a couple of stories right here.
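A sketch of the Amazon lookup as described, appended to the fetcher.py sketch above (it reuses make_web_request and the imports already shown). The product URL is a hypothetical placeholder, and the priceblock_dealprice id is the one inspected in the video; Amazon's markup and its tolerance of scripted requests change often, so treat this purely as an illustration.

    # get_headphones_price(): same pattern as get_dow_value(), keyed on the element id
    # seen in the video. The product URL below is a hypothetical placeholder.
    HEADPHONES_URL = "https://www.amazon.co.uk/dp/EXAMPLE"  # placeholder product page


    def get_headphones_price():
        soup = make_web_request(HEADPHONES_URL)
        price_tag = soup.find_all(id="priceblock_dealprice")
        return price_tag[0].contents[0]  # e.g. the "£219.00" text inside the tag


    print("Right now, the value of the headphones you want is", get_headphones_price())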
So first of all, our URL is this: itv.com/news/business. Just as before, we're going to do a def get_news_headlines, and inside it we'll do soup equals make_web_request, this time linking to ITV News. Then we're going to say headlines equals soup.find_all, and go and see what we want to find. Within the page, the most important information is really this section here, this bit of text. One problem that you can sometimes face is that you can't select the text exactly; when that happens, you can click on the nearest section you can select, come in here, look through all of the tags that are around it, and find the one that you want. Clearly, what we want here is this one, and it contains the headline. This time we have a class, and it's simple enough: we can just do class equals this. What's different is that this is going to be the same for each one of these headlines. It's not obvious now, but once you put this into Python, we'll be able to see that. So let's go ahead and do that. We'll copy this class name and say soup.find_all with class. The issue is that class is a Python reserved keyword, and whenever you hit that problem, all you've got to do is add an underscore and you'll be okay. So class_ equals the thing that we just saw, this heading. I just wanted to check whether there's a space here; yes, there is a space here. And we're going to print the headlines and see what we get. Nothing; I forgot to call it. get_news_headlines, let's run it again. You can see what we have is quite a number of headlines. The first one is about Arcadia, and the second one is about 13,000 jobs at risk because of Arcadia; if we look back on the website, that is actually the second headline. So that looks good, that looks like something we could use. Let's go ahead and do the same thing we did before to extract exactly the sentences. So take the headlines, and for each headline, that's each one of these items, we want to do something to it. We're going to use a list comprehension: for headline in headlines. This headline here is going to be a single element of the list, a single headline. And for each single headline we're going to do more or less what we did before: headline.contents, and take the first item of that. Let's have a look at what that looks like. Okay, great, that's what we wanted: we've got the individual text headlines in a list. Now we would ideally want to join all of these together with "\n", which is the newline character, so that they will be displayed one after another in sequence. So one trick is just to do "\n".join on this, and what you get is a string constructed out of all of them. You'll notice that each one of these headlines corresponds exactly to the headlines that are on this webpage. So that's a great result. Let's finally incorporate this into our program: we'll add one print that says today's headlines, then a new line, and on the new line, get_news_headlines. Fantastic. Let's go to our project folder and give it a run: the data fetcher that, every morning, we're going to open up. Double-click fetcher.py and just wait for it to run.
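And a sketch of the news fetcher, again appended to the same script. The headline class name is a hypothetical placeholder for whatever the inspector shows on the ITV page at the time you check, and the final print goes above the existing input("End of program") line.

    # get_news_headlines(): fetch the ITV business page and join the headline text,
    # one per line. "SOME-HEADLINE-CLASS" is a placeholder for the CSS class copied
    # from the inspector; news sites rename these classes regularly.
    NEWS_URL = "https://www.itv.com/news/business"


    def get_news_headlines():
        soup = make_web_request(NEWS_URL)
        headlines = soup.find_all(class_="SOME-HEADLINE-CLASS")  # placeholder class
        # Pull the text out of each tag and stack the headlines one per line.
        return "\n".join(headline.contents[0] for headline in headlines)


    print("Today's headlines:\n" + get_news_headlines())
    # keep input("End of program") as the last line of the script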
So we have the stock index, the price of our beloved headphones, and all of the headlines we care about. That's a great result.

7. Project Ideas & Wrap Up: There are lots of project ideas that you could try with this. You could create a program to check, for example, whether a website has changed its information, or a program to grab festival tickets for you as soon as they come out. Whatever you do, I'd love to see it in the project section. Thank you for taking this class, and we'll see you in the next class.
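As a starting point for the first project idea, here is a minimal change-detection sketch under the same assumptions: it reuses the make_web_request helper from the script above, the URL and file name are examples only, and hashing the page text is just one simple way to notice a change (heavily dynamic pages may need a more targeted comparison).

    # Hash the page text and compare it with the hash saved on the previous run.
    import hashlib
    from pathlib import Path

    WATCH_URL = "https://example.com/page-to-watch"  # placeholder
    STATE_FILE = Path("last_hash.txt")


    def page_has_changed():
        soup = make_web_request(WATCH_URL)
        current = hashlib.sha256(soup.get_text().encode("utf-8")).hexdigest()
        previous = STATE_FILE.read_text() if STATE_FILE.exists() else ""
        STATE_FILE.write_text(current)  # remember this version for next time
        return current != previous


    if page_has_changed():
        print("The page has changed since you last checked!")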