Complete ElasticSearch Guide with Hive, Pig, MR, LogStash and Kibana integrations

DataShark Academy, A play ground for Data Engineers

123 Videos (4h 19m)
    • Chapter 1 - Let's Get Started

      0:38
    • What is a Search

      1:00
    • What is a Search Engine

      1:09
    • A look inside a Search Engine

      1:24
    • What is MetaData

      1:56
    • What is ElasticSearch

      1:21
    • ElasticSearch's Scalability

      1:41
    • High availability of ElasticSearch

      1:31
    • What is Multi-Tenancy

      0:57
    • Let's do a Full Text Search

      0:52
    • Real time analytics with ElasticSearch

      0:49
    • Chapter 1 - Summary

      0:42
    • Chapter 2 - Setting up Development Environment: Introduction

      0:51
    • Let's list the Installations needed

      0:33
    • How to Install Homebrew

      1:52
    • Let's install wget

      1:44
    • Time to check the Java Version

      0:53
    • Checking & Enabling SSH

      3:39
    • Downloading Hadoop

      4:09
    • Apache Hadoop Configuration - Part 1

      5:00
    • Apache Hadoop Configuration - Part 2

      2:48
    • Apache Hadoop Configuration - Part 3

      5:48
    • Checking the Hadoop Daemon

      0:40
    • How to download ElasticSearch

      1:32
    • ElasticSearch Configuration

      5:03
    • Let's install ElasticSearch Head Plugin

      1:26
    • How to install ElasticSearch Marvel

      1:01
    • It's time to check on ElasticSearch Daemon and wake it up

      2:40
    • Chapter 2 - Summary

      0:29
    • Chapter 3 - ElasticSearch Building Blocks

      0:14
    • RDBMS vs ElasticSearch

      1:39
    • What does a Data Record look like?

      1:19
    • Marvels of Inverted Index

      2:02
    • What is a Shard?

      0:41
    • ElasticSearch Node

      1:54
    • What is Cluster

      0:54
    • Monitoring ElasticSearch Cluster's Health

      2:44
    • Scaling an ElasticSearch Cluster

      1:43
    • RestAPIs in ElasticSearch

      0:41
    • Chapter 3 - Summary

      0:32
    • Chapter 4 - Introduction

      0:25
    • What is ShardID

      1:50
    • Types of Operations in ElasticSearch

      0:44
    • Inside Operations : <Write> & <Delete>

      2:25
    • Hands on with Write Operation

      4:50
    • Inside Operation : <Read>

      2:30
    • Let's Try Read Operation - Part 1

      4:55
    • Let's Try Read Operation - Part 2

      1:43
    • Inside Operation : <Update>

      3:29
    • Let's Try Update Operation

      3:43
    • Let's Try Delete Operation

      3:19
    • Concept of Mapping in ElasticSearch

      1:49
    • Hands on with Mapping in ElasticSearch

      6:29
    • Data consistency using Templates

      1:10
    • Using Templates

      4:53
    • Chapter 4 - Summary

      0:58
    • Chapter 5 - Introduction

      1:19
    • Types of Search Queries

      0:51
    • Creating Dataset for Search

      6:17
    • Using QueryString Part-1: Select All

      6:46
    • Using QueryString Part-2: Filter Specific Fields

      3:30
    • Word of Caution with QueryStrings

      1:26
    • Using DSL Queries Part-1

      5:49
    • Using DSL Queries Part-2

      4:30
    • Chapter 5 - Summary

      0:34
    • Chapter 6 - Basics about Data Pipeline

      0:49
    • Chapter 7 - Introduction to Data pipeline

      0:22
    • Setting Objectives

      0:42
    • Installing Apache Hive

      1:52
    • Configuring Apache Hive Part-1

      2:48
    • Configuring Apache Hive Part-2

      3:59
    • Getting ElasticSearch Connector JAR

      3:40
    • Where to get free datasets for exercises

      1:03
    • Understanding Our Dataset

      1:13
    • Creating a Data Flow from Hive to ElasticSearch Index Part-1

      9:56
    • Creating a Data Flow from Hive to ElasticSearch Index Part-2

      10:17
    • Looking at ingested data inside ElasticSearch Cluster

      3:30
    • Chapter 8 - Introduction

      0:17
    • Setting Objectives

      0:27
    • Indexing Data inside ElasticSearch using Bulk API

      3:36
    • Creating Data Flow from ElasticSearch Index to Hive table

      7:05
    • Chapter 8 - Summary

      0:42
    • Chapter 9 - Introduction

      0:21
    • Our Objective

      0:29
    • Basics about Apache PIG

      1:09
    • Installing Apache PIG

      2:45
    • Configuring Apache PIG

      2:01
    • Let's go up a level by introducing Apache Hive into the picture

      0:57
    • Getting Dataset for the exercise

      0:54
    • Creating data flow from Apache PIG to ElasticSearch Part-1

      2:45
    • Creating data flow from Apache PIG to ElasticSearch Part-2

      10:03
    • Chapter 10 - Introduction

      0:17
    • Getting the Dataset

      0:35
    • Creating a Data Flow from ElasticSearch to Apache PIG to HDFS

      6:50
    • Chapter 10 - Summary

      0:53
    • Chapter 11 - Introduction

      0:22
    • Basics about MapReduce

      2:53
    • Pre-requisites

      2:21
    • Key Classes in MapReduce & ElasticSearch Flow

      1:47
    • Getting and Saving Dataset to HDFS

      1:47
    • Creating an ElasticSearch Index and Mapping

      1:24
    • Creating a Maven POM.xml file

      4:07
    • Creating a Mapper Class

      4:08
    • Creating a MapReduce Driver Class

      3:26
    • Building MapReduce Program using Maven

      2:33
    • Running MapReduce Application on Hadoop cluster

      6:29
    • Chapter 12 - Setting Objectives

      1:04
    • Getting Dataset for the exercise

      0:46
    • Creating Mapper Class

      3:22
    • Creating Driver Class

      3:47
    • Building ES2MapReduce Program using Maven

      1:32
    • Upload DSL query file to HDFS

      1:48
    • Running ES2MR Application on Hadoop cluster

      4:30
    • Chapter 12 - Summary

      0:22
    • Chapter 13 - Objectives

      0:40
    • Basics about LogStash

      1:54
    • Installing LogStash

      2:27
    • Configuring LogStash

      3:55
    • Creating a simple data pipeline: STDIN to LogStash to ElasticSearch

      5:18
    • Creating data pipeline using a File

      6:30
    • 13

      0:58
    • LogStashing same file multiple times

      5:08
    • Chapter 13 - Summary

      1:05

About This Class

This course will teach you everything you need to become a Big Data Engineer.

This is the only course on the Internet that covers integrating ElasticSearch with Hadoop technologies and building real-world applications with them. Major corporations are looking for trained Big Data engineers, and this course will help you become one.

In this course, you will learn ElasticSearch step by step. It is a search engine technology, similar to Google Search, that is in high demand and used by large enterprises. You will also learn how to integrate ElasticSearch with the major components of the Hadoop ecosystem. Using real-world examples, you will learn how to build data pipelines such as:

Ingestion Flows (to ElasticSearch)

Apache Hive to ElasticSearch
Apache PIG to ElasticSearch
MapReduce to ElasticSearch
LogStash to ElasticSearch


Egression Flows (from ElasticSearch)

ElasticSearch to Apache Hive
ElasticSearch to Apache PIG
ElasticSearch to MapReduce
ElasticSearch to LogStash

Data Visualization

Collecting a large amount of data is only part of the job; the next step is to generate meaningful insights from it. We will show you how to do this by visualizing data with powerful tools such as Kibana and creating real-time Business Intelligence dashboards.


Production Cluster Monitor tool

As a Big Data Engineer, you will also need to support production activities, so we will cover the following:

Cluster health monitoring at the index, shard, and node levels
Parsing ElasticSearch cluster statistics using Linux utilities
Setting up a wait-for-trigger mechanism, and much more

You will also learn about the search capabilities offered by ElasticSearch and how to query a vast index of data in real time. This will be really fun!

We will cover plenty of basics to build the foundation required to understand ElasticSearch. You will also learn what happens behind the scenes: how a search engine, and ElasticSearch specifically, works in a single-node or multi-node cluster.

You will also get step-by-step instructions for installing all the required tools and components, so that you can run every example in this course on your own computer. Each video explains the entire process in a detailed, easy-to-understand manner.

You will get access to working code that you can play with and expand on. All code examples are demonstrated in the video lessons.

Windows users will need to install a free virtual machine tool to set up a single-node Hadoop cluster. More details are available on the Hortonworks website.

DataShark Academy

A play ground for Data Engineers

At DataShark Academy, we have more than 15 years of experience in technology, including product design and engineering. Over the years, we have trained more than 10,000 students across the globe. With our unique teaching approach, our students have achieved maximum results in the shortest possible time. Our training materials are high quality and carefully designed with students in mind.

Don't waste hours of valuable time watching long boring videos only to find they won't help you in career g...
