Elasticsearch 7 and the Elastic Stack: Hands On | Frank Kane | Skillshare


Elasticsearch 7 and the Elastic Stack: Hands On

teacher avatar Frank Kane, Machine Learning & Big Data, ex-Amazon



Lessons in This Class

97 Lessons (8h 44m)
    • 1. Introduction

    • 2. Section Introduction: Getting Set Up

    • 3. Installing Elasticsearch [Step by Step]

    • 4. Intro to HTTP and RESTful APIs

    • 5. Elasticsearch Basics: Logical Concepts

    • 6. Elasticsearch Overview

    • 7. Term Frequency / Inverse Document Frequency (TF/IDF)

    • 8. Using Elasticsearch

    • 9. What's New in Elasticsearch 7

    • 10. How Elasticsearch Scales

    • 11. Quiz: Elasticsearch Concepts and Architecture

    • 12. Section 1 Wrapup

    • 13. Intro: Mapping and Indexing Data

    • 14. Connecting to your Cluster

    • 15. Introducing the MovieLens Data Set

    • 16. Analyzers

    • 17. Import a Single Movie via JSON / REST

    • 18. Insert Many Movies at Once with the Bulk API

    • 19. Updating Data in Elasticsearch

    • 20. Deleting Data in Elasticsearch

    • 21. [Exercise] Insert, Update and Delete a Movie

    • 22. Dealing with Concurrency

    • 23. Using Analyzers and Tokenizers

    • 24. Data Modeling and Parent/Child Relationships, Part 1

    • 25. Data Modeling and Parent/Child Relationships, Part 2

    • 26. Section 2 Wrapup

    • 27. Intro: Searching with Elasticsearch

    • 28. "Query Lite" interface

    • 29. JSON Search In-Depth

    • 30. Phrase Matching

    • 31. [Exercise] Querying in Different Ways

    • 32. Pagination

    • 33. Sorting

    • 34. More with Filters

    • 35. [Exercise] Using Filters

    • 36. Fuzzy Queries

    • 37. Partial Matching

    • 38. Query-time Search As You Type

    • 39. N-Grams, Part 1

    • 40. N-Grams, Part 2

    • 41. Section 3 Wrapup

    • 42. Intro: Importing Data

    • 43. Importing Data with a Script

    • 44. Importing with Client Libraries

    • 45. [Exercise] Importing with a Script

    • 46. Introducing Logstash

    • 47. Installing Logstash

    • 48. Running Logstash

    • 49. Logstash and MySQL, Part 1

    • 50. Logstash and MySQL, Part 2

    • 51. Logstash and S3

    • 52. Elasticsearch and Kafka, Part 1

    • 53. Elasticsearch and Kafka, Part 2

    • 54. Elasticsearch and Apache Spark, Part 1

    • 55. Elasticsearch and Apache Spark, Part 2

    • 56. [Exercise] Importing Data with Spark

    • 57. Section 4 Wrapup

    • 58. Intro: Aggregation

    • 59. Aggregations, Buckets, and Metrics

    • 60. Histograms

    • 61. Time Series

    • 62. [Exercise] Generating Histogram Data

    • 63. Nested Aggregations, Part 1

    • 64. Nested Aggregations, Part 2

    • 65. Section 5 Wrapup

    • 66. Intro: Using Kibana

    • 67. Installing Kibana

    • 68. Playing with Kibana

    • 69. [Exercise] Log analysis with Kibana

    • 70. Section 6 Wrapup

    • 71. Intro: Analyzing Log Data with the Elastic Stack

    • 72. FileBeat and the Elastic Stack Architecture

    • 73. X-Pack Security

    • 74. Installing FileBeat

    • 75. Analyzing Logs with Kibana Dashboards

    • 76. [Exercise] Log analysis with Kibana

    • 77. Section 7 Wrapup

    • 78. Intro: Elasticsearch Operations and SQL Support

    • 79. Choosing the Right Number of Shards

    • 80. Adding Indices as a Scaling Strategy

    • 81. Index Alias Rotation

    • 82. Index Lifecycle Management

    • 83. Choosing your Cluster's Hardware

    • 84. Heap Sizing

    • 85. Monitoring

    • 86. Elasticsearch SQL

    • 87. Failover in Action, Part 1

    • 88. Failover in Action, Part 2

    • 89. Snapshots

    • 90. Rolling Restarts

    • 91. Section 8 Wrapup

    • 92. Intro: Elasticsearch in the Cloud

    • 93. Amazon Elasticsearch Service, Part 1

    • 94. Amazon Elasticsearch Service, Part 2

    • 95. The Elastic Cloud

    • 96. Section 9 Wrapup

    • 97. Wrapping Up






About This Class

New for 2019! Elasticsearch 7 is a powerful tool not only for powering search on big websites, but also for analyzing big data sets in a matter of milliseconds! It's an increasingly popular technology, and a valuable skill to have in today's job market. This comprehensive course covers it all, from installation to operations, with over 90 lectures including 8 hours of video.

We'll cover setting up search indices on an Elasticsearch 7 cluster (if you need Elasticsearch 5 or 6 - we have other courses on that), and querying that data in many different ways. Fuzzy searches, partial matches, search-as-you-type, pagination, sorting - you name it. And it's not just theory, every lesson has hands-on examples where you'll practice each skill using a virtual machine running Elasticsearch on your own PC.

We'll explore what's new in Elasticsearch 7 - including index lifecycle management, the deprecation of types and type mappings, and a hands-on activity with Elasticsearch SQL. We've also added much more depth on managing security with the Elastic Stack, and how backpressure works with Beats.

We cover, in depth, the often-overlooked problem of importing data into an Elasticsearch index. Whether it's via raw RESTful queries, scripts using Elasticsearch APIs, or integration with other "big data" systems like Spark and Kafka - you'll see many ways to get Elasticsearch started from large, existing data sets at scale. We'll also stream data into Elasticsearch using Logstash and Filebeat - commonly referred to as the "ELK Stack" (Elasticsearch / Logstash / Kibana) or the "Elastic Stack".

Elasticsearch isn't just for search anymore - it has powerful aggregation capabilities for structured data. We'll bucket and analyze data using Elasticsearch, and visualize it using the Elastic Stack's web UI, Kibana.

You'll learn how to manage operations on your Elastic Stack, using X-Pack to monitor your cluster's health, and how to perform operational tasks like scaling up your cluster, and doing rolling restarts. We'll also spin up Elasticsearch clusters in the cloud using Amazon Elasticsearch Service and the Elastic Cloud.

Elasticsearch is positioning itself to be a much faster alternative to Hadoop, Spark, and Flink for many common data analysis requirements. It's an important tool to understand, and it's easy to use! Dive in with me and I'll show you what it's all about.

Meet Your Teacher


Frank Kane

Machine Learning & Big Data, ex-Amazon


Frank spent 9 years at Amazon and IMDb, developing and managing the technology that automatically delivers product and movie recommendations to hundreds of millions of customers, all the time. Frank holds 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, Frank left to start his own successful company, Sundog Software, which focuses on virtual reality environment technology, and teaching others about big data analysis.

See full profile




1. Introduction: Hi, I'm Frank Kane. I've used my decade of experience at Amazon.com and IMDb.com to teach 100,000 people around the world about big data and machine learning. Elasticsearch is a hot technology you need to know about in the field of big data. It's not just used for powering full-text search on big websites anymore; increasingly, it's being used as a real-time alternative to more complex systems like Hadoop and Spark. Elasticsearch can aggregate and graph structured data quickly and at massive scale. In this course, you'll gain hands-on experience with Elasticsearch all the way from installation to advanced usage. We'll create search indices and mappings, import data into Elasticsearch in several different ways, aggregate structured data, and use hosted Elasticsearch clusters from Amazon and elastic.co. You'll also get your hands dirty with the entire Elastic Stack, including Elasticsearch, Logstash, X-Pack, Kibana, and the Beats framework. Together, these technologies form a complete system for collecting, aggregating, monitoring, and visualizing your big data. I designed this course for any technologist who wants to add Elasticsearch and the Elastic Stack to their tool chest for analyzing big data. And we all know these are highly valuable skills to have in today's job market. A few activities will include the use of the Python programming language, so some programming experience will be helpful but isn't strictly required. Some experience with Linux will also come in handy. And if you're already an Elasticsearch expert, you'll probably find this course too basic, but check the curriculum to see if there are topics that are new to you. I'm excited to share this powerful and increasingly popular technology with you. Thank you. I'll see you in class. 2. Section Introduction: Getting Set Up: Let's dive right in. 
In the real world, you'll probably be using Elasticsearch on a cluster of Linux machines, so we'll be using Linux in this course, Ubuntu in particular. Now, if you don't have an Ubuntu system handy, that's totally OK. I'm going to walk you through setting up a virtual machine on your Windows or Mac PC that lets you run Ubuntu inside your existing operating system. It's actually really easy to do. Once we've got an Ubuntu machine up and running, we'll install Elasticsearch, and just for fun we'll create a search index of the complete works of William Shakespeare and mess around with it. After that, we'll take a step back and talk about Elasticsearch and its architecture at a high level, so you have all the basics you need for later sections of this course. Roll up your sleeves and let's get to work. 3. Installing Elasticsearch [Step by Step]: Let's dive in and get Elasticsearch installed on your home PC so you can follow along in this course if you'd like to. Now, Elasticsearch is going to be running on an Ubuntu Linux system for this course. But if you don't already have an Ubuntu system sitting around, that's okay. What we're going to show you is how to install VirtualBox on your Mac or Windows PC, and that will allow you to install Ubuntu running right on your own desktop within a little virtual environment. Once we have Ubuntu installed inside VirtualBox, we'll install Elasticsearch on it. And after that, we'll install the complete works of William Shakespeare into Elasticsearch and see if we can successfully search it. So that's a lot to do in one lecture, but I'll get you through it. Let's talk briefly about system requirements. Pretty much any PC should be able to handle this; you don't need a ton of resources for Elasticsearch. If you do run into trouble, however, make sure that you have virtualization enabled in the BIOS settings on your PC. 
And specifically, make sure that Hyper-V virtualization is off if that's an option in your BIOS. But just try following along first; you shouldn't need to tinker with your BIOS settings unless you run into problems. Also be aware that the antivirus program called Avast is known to conflict with VirtualBox, so you'll need to switch to a different one, or turn it off while using this course, if you're going to be using Avast antivirus. With that, let's dive in and get this set up. So if you're running on a Windows or Mac PC, you need to install a virtual environment to run Ubuntu within first, and to do that we're going to use VirtualBox. So if you're already on Ubuntu, you don't need to do this, but for those of you on Windows or Macs, you'll need to do this step first. Head on over to VirtualBox.org and click on the big friendly download button. It is free software, and I am on Windows, so I'm going to go ahead and download the Windows version of the binary. Once the installer for your operating system has downloaded, go ahead and run it. There's nothing special about it, really; just go through the steps that it walks you through. Next, choose where you want to install it; all these defaults are A-OK. It will interrupt your network connection while installing, so make sure you're okay with that, and go ahead and install. Give it any permissions it needs, and off it goes. That was easy. So let's go ahead and hit Finish here, and VirtualBox is sitting there waiting for us to add some virtual machines to it. So next we need to actually download an operating system to run within our virtual environment. For that, head on over to ubuntu.com. From the Ubuntu homepage, just look for the download link. We want to use Ubuntu Server, and we're looking for the 18.04 long-term support (LTS) version. Go ahead and wait for that to download. This will take longer because it's a much bigger download; we'll come back when that's done. 
So the image for the Ubuntu installation media has downloaded successfully. Just make sure you know where it went on your PC. Now we're going to go back to VirtualBox and set it up. So from the VirtualBox Manager, let's go to the Machine menu and say New to add a new virtual machine. We'll give it a name: ElasticSearch. I'm also going to change the machine folder to another drive that has more space, since my C drive is almost full; make sure that you're using a hard drive that has plenty of space for this. For me, my E drive has the most room, and you'll need to create a folder in there to put this in. I already have a VirtualBox VMs folder, and let's go ahead and create a new folder inside it to put this stuff in. We'll call it elasticsearch7 and select that folder. Linux is the correct type, but we want the version to be Ubuntu 64-bit. There we go; hit Next. And I'm going to allocate about half of my physical memory to this virtual machine, so for me that's going to be eight gigabytes, 8192 megabytes. We will go ahead and create that virtual hard disk for it. We're going to need about 20 gigabytes for that, so again, make sure we're putting it someplace that has plenty of room. The default is fine there; dynamically allocated is fine; and we are going to select a different home for this. Make sure that it's on a drive that has plenty of space. And I'm going to increase this from 10 gigabytes to 20 gigabytes, because ten just isn't enough. It doesn't have to be exact. All right, so we've got things set up. Let's go ahead and kick that off; hit the big friendly green Start button. We're going to now select the installation image that we downloaded. So click on the folder icon here, navigate to your downloads, and select the image of the Ubuntu 18.04.2 server. And off it goes. All right, after a couple of minutes of booting up, we're into the installation menu here. Go ahead and select your language; for me, that's English. And my keyboard layout is also English (US). 
Just hitting Enter selects those Done selections there. Hit Enter again. We want to install Ubuntu, and the defaults should be fine for the network configuration. We do not have a proxy server, so again, hit Enter to skip past that. The default mirror address is fine; hit Enter. And we will use the entire virtual disk. Remember, this is the virtual disk we're setting up; there's no risk of actually corrupting our main operating system disk here. So hit Enter again, and one more time. Everything looks fine, so hit Enter again to accept the Done selection there. Now here we need to use the down arrow to select Continue, to say yes, I'm sure I want to do this, and hit Enter. Type in your name; we'll say your name is "student". Your server's name can be whatever you want; "es7" sounds good to me. Username: student. Password: use whatever you want here. When you're done, hit Tab to select the Done selection here, and Enter. We do want the OpenSSH server, so go ahead and hit the spacebar to select that, and hit Tab, and Tab again, to select Done. We'll install the software that we need by hand, so go ahead and hit Tab to go to the Done option there and hit Enter again. And now it's off doing its thing, installing, so we have to wait for this to finish. Several minutes later, that initial installation is done and it's asking us to reboot now. So go ahead and hit Enter, and don't worry about those "failed" messages; they are perfectly okay. Hit Enter again, and that will reboot, hopefully into our brand spanking new Ubuntu environment. After a minute or so, it looks like it's finished booting up. Just hit Enter to get a login prompt here, and we'll log in as the user named student that we set up, with the password that you also set up during installation. And we're in. Cool. All right, so we have Ubuntu up and running; now we just need to install Elasticsearch itself on our new system. 
Now, if you'd like to follow along from written instructions from this point, you can head over to my website at media.sundog-soft.com/elasticsearch.html, and you'll find written steps there of what we're about to do. Or if you prefer to follow along with the video, you can do that as well. So here we go. Now that we're logged in, let's go ahead and first start off by telling Ubuntu where to find the Elasticsearch binaries. So we're going to type in: wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -  (note that's an uppercase letter O after the -q, not a zero, and the pipe symbol is shift-backslash). Make sure you pay attention to capitalization and spaces and everything here; one wrong keystroke and it won't work. So again, double-check everything: make sure all the spaces are right, all the dashes are where they should be, and things are uppercase where they should be. We'll type in our password again. All right, step one is done. Next step: sudo apt install apt-transport-https. Next we'll say: echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list  (the 7.x is because this is Elasticsearch 7). And finally: sudo apt-get update && sudo apt install elasticsearch, which will actually go out and install Elasticsearch. That appears to have worked. So now that we've installed Elasticsearch, we need to configure it. To do that, we'll say: sudo vi /etc/elasticsearch/elasticsearch.yml. We need to make the following changes. 
Go ahead and use the arrow keys to move down to where it says node.name, move over to the "n" in "node", and hit the I key to enter insert mode in the vi editor, then backspace to get rid of that comment character, thereby naming our node node-1. Very creative. We'll keep scrolling down, and next we're going to look for the network.host setting. There it is; go ahead and uncomment that and change it to 0.0.0.0. That just makes sure that everything works fine in our virtual environment. Next, we're going to go down to discovery and uncomment discovery.seed_hosts, and change that from host one and host two to "127.0.0.1", inside quotation marks, just like that. And finally, we'll go to cluster.initial_master_nodes, uncomment that as well, and set it to just node-1, because we only have one node in our little virtual cluster here. All right, that should do the job. Let's go ahead and hit the Escape key to get out of insert mode, and then type in :wq to write and quit. Now we're ready to actually start Elasticsearch up. So let's say: sudo /bin/systemctl daemon-reload. Next we'll say: sudo /bin/systemctl enable elasticsearch.service. And finally: sudo /bin/systemctl start elasticsearch.service. This will make sure that Elasticsearch boots up automatically when we start our machine in the future. It generally takes a minute or two for Elasticsearch to actually start up successfully. We can test whether it's up and running yet by doing the following: curl -XGET 127.0.0.1:9200. Right now we're getting a connection refused error because it hasn't started up yet, so I'm just going to try this again in another minute or two. Once it actually gives me back a successful response, we'll know that we're ready to move forward. All right. After a couple of minutes, I actually got back this response instead of a connection refused message. 
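For reference, the four elasticsearch.yml changes described here end up looking like this once edited (a sketch of the relevant lines only, matching the single-node setup in this lecture; adjust the node name and addresses for your own cluster):

```yaml
# /etc/elasticsearch/elasticsearch.yml -- only the lines we changed

# Name this (single) node:
node.name: node-1

# Bind to all interfaces so the service is reachable inside the VM:
network.host: 0.0.0.0

# Discover only the local machine:
discovery.seed_hosts: ["127.0.0.1"]

# Our one node is the initial master:
cluster.initial_master_nodes: ["node-1"]
```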
So once you see this, you know you're ready to keep on going. You should get this default response back that ends with "You know, for search." All right, so now that we have Elasticsearch running, we need to actually have some data to search. Let's go and download the complete works of William Shakespeare and import that. Type in the following to get it: wget http://media.sundog-soft.com/es7/shakes-mapping.json. This just defines the schema of the data that we're about to install. Now that we've downloaded that data type mapping, let's go ahead and submit it to Elasticsearch, thusly: curl -H "Content-Type: application/json" -XPUT 127.0.0.1:9200/shakespeare --data-binary @shakes-mapping.json  (shakespeare will be the name of our index). And that has submitted that data type mapping into Elasticsearch, so it knows how to interpret the data that we're about to give it. Let's go ahead and download the actual works of William Shakespeare with: wget http://media.sundog-soft.com/es7/shakespeare_7.json. And that's everything Shakespeare has ever written, in JSON format. Let's go ahead and submit that to our index: curl -H "Content-Type: application/json" -XPOST '127.0.0.1:9200/shakespeare/_bulk' --data-binary @shakespeare_7.json. We'll talk about what this is all doing later on; right now, I just want to get you up and running and doing something cool. So that's going to go ahead and chew on the entire works of William Shakespeare and index them into Elasticsearch. That will, of course, take a few minutes, so I'll come back when that's done. 
All right, it took about 15 minutes for all that data to get indexed. But compared to the amount of time that it probably took William Shakespeare to write all of that, I guess that's nothing, right? Let's hit Enter just to get a nice clean prompt back here, and let's get some payoff from all this work. We've done a lot so far today: we've installed an Ubuntu system running in a virtual environment on your own PC, installed Elasticsearch from scratch, and indexed the entire works of William Shakespeare. So let's try to actually search that data now and do something with it. Let's issue the following command to search for "to be or not to be" and see what play that came from. I think you might know the answer, but let's just see that it works. Type in: curl -H "Content-Type: application/json" '127.0.0.1:9200/shakespeare/_search?pretty' -d ' and hit Enter. Basically, what we're saying so far is that we're going to issue a JSON request to our Elasticsearch server running on 127.0.0.1, against the shakespeare index; we're going to issue a search query and get the results back nicely formatted. Now we start the body of our request with an open curly bracket, then "query", colon, another open curly bracket, then "match_phrase", colon, another curly bracket, and then "text_entry": "to be or not to be". You see what's going on here: basically we're sending a query to Elasticsearch to match the phrase "to be or not to be". Hit Enter, close off those curly brackets, one, two, three of them, and a final single quote to close that off, and let's see what we get back. Hey, it worked. So cool. 
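If typing that query body at the shell is error-prone, the same request can be built programmatically. Here's a minimal sketch in Python using only the standard library; it assumes the single-node cluster from this lecture is running on 127.0.0.1:9200, so the part that actually sends the request is left commented out:

```python
import json

# The same match_phrase query we typed into curl, as a Python dict:
query = {
    "query": {
        "match_phrase": {
            "text_entry": "to be or not to be"
        }
    }
}

# Serialize it to the JSON body that curl's -d flag would carry:
body = json.dumps(query)
print(body)

# To actually send it (requires the Elasticsearch cluster to be up):
# import urllib.request
# req = urllib.request.Request(
#     "http://127.0.0.1:9200/shakespeare/_search?pretty",
#     data=body.encode("utf-8"),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode("utf-8"))
```

The dict mirrors the curly-bracket nesting typed at the command line: query, then match_phrase, then the field name and the phrase to match.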
You can see here that "to be or not to be" came back from the play named Hamlet; the speaker was Hamlet, and the full line there was "To be, or not to be, that is the question." And apparently Elasticsearch has chosen "to be." During this lecture, we have successfully set it up from scratch on your own little Ubuntu system. And now that we have Elasticsearch running, we can start to learn more about how it works, start experimenting with it, and do more and more stuff with it. So keep on going, guys; it's about to get interesting. If you're done for now, however, the way to safely shut this down is to go to the Machine menu of your virtual terminal here and say ACPI shutdown. That will send a shutdown message to the host and cleanly shut things down. And when it's done, you're free to close the VirtualBox Manager as well. 4. Intro to HTTP and RESTful APIs: So before we can talk about Elasticsearch, we need to talk about REST and RESTful APIs. The reason is that Elasticsearch is built on top of a RESTful interface, and that means that to communicate with Elasticsearch, you need to communicate with it through HTTP requests that adhere to a REST interface. So let's talk about what that means. Let's talk about HTTP requests at a more high level here. Whenever you request a web page from your browser, what's going on is that your web browser is sending an HTTP request to a web server somewhere, requesting the contents of that web page you want to look at. And Elasticsearch works the same way. So instead of talking to a web server, you're talking to an Elasticsearch server, but it's the same exact protocol. Now, an HTTP request contains a bunch of different stuff, more than you might think. One is the method, and that's basically the verb of the request: what you're asking the server to do. 
So in the case of actually getting a web page back from a web server, you'd be sending a GET request, saying that I want to get information back from the server; I'm not going to change any information on the server, I just want to get information back from it. You might also have a POST verb, which means that you want to either insert or replace data that's stored on the server, or PUT, which means to always create new data on the server. Or you can even send a DELETE verb, which means to remove information from the server. Normally you wouldn't be doing that from a web browser, but from an Elasticsearch client it's a totally valid thing to do. It also includes a protocol, specifically what version of HTTP you're sending this request in; that might be HTTP/1.1, for example. You will be sending that request to a specific host, of course. So if you're requesting a web page from our website, that might be sundog-education.com. And the URL is basically what resource you are requesting from that server, what you want that server to do. So in the case, again, of a web server, that might be the path to the web page that you want on that host. There's also a request body you can send along. You don't normally see that with a web page request, but you can send extra data, in whatever structured format you want, to the server within the body of that request as well. And finally, there are headers associated with each request that contain metadata about the request itself: for example, information about the client itself (that would be in the user agent for a web browser), what format the body is in (that might be in the content type), stuff like that. So let's look at a concrete example. Again, getting back to the example of a browser wanting to display a website, this is what an HTTP request for that might contain. In that example, we're sending a GET verb to our web server, and we're requesting the resource /index.html from the server, meaning we want to get the home page. 
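To make that anatomy concrete, here is the example request assembled field by field. This is an illustrative sketch; the header values are typical placeholders, not an exact browser capture:

```python
# The pieces of an HTTP request, as described above:
method = "GET"               # the verb: what we're asking the server to do
resource = "/index.html"     # the URL path: which resource we want
protocol = "HTTP/1.1"        # which version of HTTP we're speaking
host = "sundog-education.com"

# Headers carry metadata about the request itself:
headers = {
    "Host": host,
    "User-Agent": "ExampleBrowser/1.0",  # info about the client
    "Accept": "text/html",               # formats we can accept back
}

# An HTTP/1.1 request is just lines of text: a request line,
# header lines, a blank line, then an (optional, here empty) body.
request_line = f"{method} {resource} {protocol}"
lines = [request_line] + [f"{k}: {v}" for k, v in headers.items()] + ["", ""]
raw_request = "\r\n".join(lines)
print(raw_request)
```

Running this prints the raw text a browser would actually put on the wire for this page request; there is no body, because the request line and headers already say everything the server needs.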
We say that we're sending this in the HTTP/1.1 protocol, and we're sending it to a specific host, and that's our website, sundog-education.com. In this example, there is no body being sent across, because all the information we need to fulfill this request has already been specified. And there'll be a whole slew of headers being sent along as well, containing information about the browser itself, what types of information and languages it can accept back in return from the server, information about caching, cookies that might be associated with this site, things like that. So a bunch of information about you is being sent around the Internet whenever you request a web page. But fortunately, with Elasticsearch, our use of headers is pretty minimal. So with that, let's talk about RESTful APIs, now that we understand HTTP requests. The really pragmatic, practical definition of a RESTful API is simply that you're using HTTP requests to communicate with a web service of some sort. So because we're communicating with Elasticsearch using HTTP requests and responses, that means that we're basically using a RESTful API. Now, there's more to it than that; we'll get to that. But at a very simple level, that's all it means. It sounds fancy, but that's really it. So, for example, if I want to get information back from my Elasticsearch cluster, like search results (I'm actually conducting a search), I would send a GET verb along with that request, saying I want to get this information from Elasticsearch. If I want to insert information into it, I would send a PUT request instead, and the information that I'm inserting would be within the request body. And if I want to actually remove information from my Elasticsearch index, I would send a DELETE request to get rid of it. Now, like I said, there's more to REST than that, so let's get into the more computer-sciency aspect of it. REST stands for Representational State Transfer. 
And it has six guiding constraints. Well, to be honest, these aren't all really constraints; some of them are a little bit fuzzy, and we'll talk about that. Obviously, it needs to be a client-server architecture. We're dealing with the concept of sending requests and responses from clients to servers; it doesn't really make sense unless we're talking about a client-server architecture, and that is what Elasticsearch offers. We have an Elasticsearch server, or maybe even a whole cluster of servers, and several clients that are interacting with that server. It must be stateless, and that means that every request and response must be self-contained. You can't assume that there's any memory on the client or the server of the sequence of events that has happened. So you have to make sure that all the information you need to fulfill a request is contained within the request itself; you're not keeping track of state between different requests. Cacheability: this is more of a fuzzy one. It doesn't mean that your responses need to be cached on the client; it just means that the system allows for that. So maybe your responses include information about whether or not that information can be cached. Again, not really a requirement, but it's on this list of REST constraints. Layered system: again, not a requirement, but it just means that when you talk to, for example, sundog-education.com, that doesn't mean you're talking to a specific individual server. That request might get routed behind the scenes to one of an entire fleet of servers, so you can't assume that your request is going to a specific physical host. And again, this is why statelessness is important, because one host might not know what's going on in the others, necessarily; they might not be talking to each other, really. Another sort of fuzzy constraint is code on demand, and this just means that you have the capability of sending code across as a payload in your responses. 
So for example, a server might send back JavaScript code as part of its response body that could then inform the client how to actually process that data. We're not actually going to be doing that with Elasticsearch, obviously, but REST says you can do that if you want to. And finally, it demands a uniform interface. What that means is a pretty long topic, but at a fundamental level it just means that the data you're sending along has some structured nature that is predictable, so you can deal with changes to it in a structured way. So at a high level, that's all it is. With that out of the way, why are we talking about REST at all here? Well, the reason is that we're going to do this whole course just talking about the HTTP requests and responses themselves. By dealing with that very low level of how the RESTful API of Elasticsearch itself works, we can avoid getting mired in the details of how any specific language or system might be interacting with Elasticsearch. Pretty much any language out there, Java, JavaScript, Python, whatever you want to use, is going to have some way of sending HTTP requests. So it really doesn't matter what language you're using; what matters more is understanding how to use Elasticsearch, how to construct these requests, and how to interpret the responses that come back from it. The mechanics of how you send that request and get the response back are trivial, right? Any language can do that. If you're a Java developer, you can go look up how to do that. So we're not going to get mired in the details of how to write a Java client for Elasticsearch. Instead, what we're going to teach you in this course is how to construct HTTP requests and parse the responses you get back from Elasticsearch in a meaningful way. And by doing that, you'll be able to transfer this knowledge to any language and any system you want very easily.
Some languages may have a dedicated client library for Elasticsearch that provides sort of a higher-level wrapper over the actual HTTP requests and responses, but those will generally be pretty thin wrappers, so you still need to understand what's going on under the hood to use Elasticsearch successfully. A lot of people get confused about that in this course, but there's a very good reason why we're just focusing on the actual HTTP requests and responses and not the details of how to do it from a specific language. The Elasticsearch documentation is done in the same style, and the books you can find about Elasticsearch take the same approach. There's a good reason for that. So the way we are going to interact with Elasticsearch in this course is just using the curl command on the command line. Again, instead of using any specific programming language or client library, we're just going to use curl, which is a Linux command for sending HTTP requests right from the command line. So we're just going to bash out curl commands, send out requests on the fly to our service, get responses back, and see what comes back. The structure of a curl command looks like this. Basically, it's curl -H followed by any headers you need to send, and for Elasticsearch that will always be a content type of application/json, meaning that whatever is in the body is going to be interpreted as JSON format. It will always be that, and in fact we're going to show you a little bit of a hack for making that header get specified automatically for you in curl, to save you some typing. That will be followed by the URL, which contains both the host that you're sending this request to, which in this course will usually be the localhost, 127.0.0.1, followed by any information the server will need to actually fulfill that request: what index do I want to talk to, what data type, what sort of command am I asking it to do?
And finally, we will pass -d and then the actual message body within quotes; that will be JSON-formatted data with additional information the service needs to figure out what to give back to you, or what to insert into Elasticsearch. Let's look at some concrete examples to make that more real. So in this first one at the top here, we're basically querying the shakespeare index for the phrase "to be or not to be." Let's take a closer look at that curl command and what's in it. Again, we're saying curl -H "Content-Type: application/json"; that's sending an HTTP header that says the data in the body is going to be in JSON format. -XGET means that we're using the GET method, or the GET verb, depending on your terminology, meaning that we just want to retrieve information back from Elasticsearch; we're not asking it to change anything. And the URL, as you can see, includes the host we're talking to, in this case 127.0.0.1, which is the local loopback address for your localhost. Elasticsearch runs on port 9200 by default. That's followed by the index name, which is shakespeare, and then by _search, meaning that we want to process a search query as part of this request. The question mark pretty is a query line parameter that means we want to get the results back in a nicely formatted, human-readable form, because we want to be looking at it on the command line. And finally, we have the request body itself, specified after a -d and between single quotes. If you've never seen JSON before, this is what it looks like. It's just a structured data format where each level is contained within curly brackets. So it's always contained by curly brackets at the top level, and then we're saying we have a query level, and within those brackets we're saying we have a match_phrase command that matches the text_entry "to be or not to be."
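Pieced together, that search request looks like the sketch below. The body is checked locally with python3's json.tool just to prove it's well-formed; the curl line itself, shown commented out, assumes a cluster running on 127.0.0.1:9200 with the shakespeare index loaded.

```shell
# The "to be or not to be" query body, built up as a shell variable.
BODY='{
  "query": {
    "match_phrase": {
      "text_entry": "to be or not to be"
    }
  }
}'

# Confirm the body is well-formed JSON before sending it anywhere:
echo "$BODY" | python3 -m json.tool

# The full request (requires a local Elasticsearch with the shakespeare index):
# curl -H "Content-Type: application/json" -XGET \
#   '127.0.0.1:9200/shakespeare/_search?pretty' -d "$BODY"
```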
So that is how you would construct a real search query in Elasticsearch using nothing but an HTTP request. In another example here, we're going to be inserting data. In this one we're using a PUT verb, again to 127.0.0.1 on port 9200. This time we're talking to an index called movies, a data type called movie, and it's using a unique identifier for this new entry of 109487. And under movie ID 109487 we're including the following information in the message body. The genre is actually a list of genres, and in JSON that will be a comma-delimited list of items enclosed in square brackets. So this particular movie is in both the IMAX and Sci-Fi categories, its title is Interstellar, and it came out in the year 2014. So that's what some real HTTP requests look like when you're dealing with Elasticsearch. Now that you know what to expect and how we're actually going to use Elasticsearch and communicate with it, we can talk more about how Elasticsearch works and what it's all about. We'll do that next. 5. Elasticsearch Basics: Logical Concepts: So before we start playing with our shiny new Elasticsearch server, let's go over some basics of Elasticsearch. First, we'll understand the concepts of how it works, what it's all about, and how it's architected, and when we're done with that we'll have a quick little quiz to reinforce what you learned. After that, we'll start messing around with it. So there are two main logical concepts behind Elasticsearch. The first is the document. If you're used to thinking of things in terms of databases, a document is a lot like a row in a database that represents a given entity, something that you're searching for. And remember, in Elasticsearch it's not just about text; any structured data can work. Now, Elasticsearch works on top of JSON-formatted data.
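To give you an early taste of that, here is the Interstellar movie from the insert example above expressed as a single JSON document, with a quick local check that it parses cleanly. The field names mirror the ones we just walked through; this is a sketch, not a command against a real cluster.

```shell
# One Elasticsearch document is just one JSON object. This mirrors the
# Interstellar example above; python3 -m json.tool pretty-prints it and
# fails loudly if the JSON is malformed.
DOC='{
  "genre": ["IMAX", "Sci-Fi"],
  "title": "Interstellar",
  "year": 2014
}'

echo "$DOC" | python3 -m json.tool
```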
If you're familiar with JSON, it's basically just a way of encoding structured data that may contain strings or numbers or dates or what have you, in a way that you can cleanly transmit across the web. And you'll see a ton of examples of this throughout the course, so it'll make more sense later on. Now, every document can have a unique ID, and you can either explicitly assign a unique ID to it yourself or allow Elasticsearch to assign one for you. The second concept is the index. An index is the highest-level entity that you can query against in Elasticsearch, and it can contain a collection of documents. So again, bringing this back to an analogy of a database, you can think of an index as a database table and a document as a row in that table. The schema that defines the data types in your documents also belongs to the index. You can only have one type of document within a single index in Elasticsearch. So if you're used to the world of databases, you'll find Elasticsearch to have similar concepts. Think of your cluster as a database, its indices as tables, and documents as rows in those tables. It's just different terminology. But as you'll soon see, even though the concepts are similar, how Elasticsearch works under the hood is very different from a traditional database. 6. Elasticsearch Overview: Let's start off with sort of a 30,000-foot view of the Elastic Stack, the components within it, and how they fit together. So Elasticsearch is just one piece of this system. It started off as basically a scalable version of the Lucene open-source search framework; it just added the ability to horizontally scale Lucene. So we'll talk about shards in Elasticsearch; each shard in Elasticsearch is just a single Lucene inverted index of documents, so every shard is an actual Lucene instance of its own. However, Elasticsearch has evolved to be much more than just Lucene spread out across a cluster.
It can be used for much more than full-text search now; it can actually handle structured data and aggregate data very quickly. So it's not just for searching; it can handle structured data of any type, and you'll see it's often used for things like aggregating logs. And what's really cool is that it's often a much faster solution than things like Hadoop or Spark or Flink. They're actually building new things into Elasticsearch all the time, like graph visualization and machine learning, that actually make Elasticsearch a competitor for things like Hadoop and Spark and Flink, only it can give you an answer in milliseconds instead of in hours. So for the right sorts of use cases, Elasticsearch can be a very powerful tool, and not just for search. So let's zoom in and see what Elasticsearch is really about at a low level. It's really just about handling JSON requests. So we're not talking about pretty UIs or graphical interfaces when we're talking about Elasticsearch itself; we're talking about a server that can process JSON requests and give you back JSON data, and it's up to you to actually do something useful with that. So, for example, we're using curl here to actually issue a REST request with a GET verb for a given index called tags, and we're just searching everything that's in it. And you can see the results come back in JSON format here, and it's up to you to parse all this. So, for example, we did get one result here, for the movie Swimming to Cambodia, with a given user ID and a tag of Cambodia. So if this is part of a tags index that we're searching, this is what a result might actually look like. Just to make it real, that's the sort of output you can expect from Elasticsearch itself. But there's more to it than just Elasticsearch. There's also Kibana, which sits on top of Elasticsearch, and that's what gives you a pretty web UI.
So if you're not building your own application on top of Elasticsearch, or your own web application, Kibana can be used just for searching and visualizing what's in your search index graphically, and it can do very complex aggregations o