Create and Manage Redshift Clusters on AWS

Harshit Srivastava, Developer on IBM Cloud, Bluemix

Lessons in This Class

1. Introduction (1:28)
2. Create an IAM role on AWS (7:30)
3. Create a Redshift Cluster (7:13)
4. Redshift Query Editor and JDBC odbc connection (7:00)
5. Cluster Properties and Actions (6:51)
6. Setting up cloudwatch alarms (2:18)
7. Execute SQL queries on Redshift (7:53)
8. Save and Schedule SQL queries (4:19)
9. Database connections with new tab in Query editor (3:24)

About This Class

Class Overview:

In this course, you will learn about building and managing Redshift clusters on Amazon Web Services (AWS). You will learn to create and customize both single-node and multi-node clusters on the cloud, and to perform operations on them such as storage, retrieval, and analysis of datasets kept as records and tables in databases. You will also learn to manage these clusters and perform actions such as resizing the cluster, setting up alarms, scheduling maintenance, running routine checkups, and monitoring performance.

A Redshift cluster needs to be linked with a Virtual Private Cloud (VPC) and with an Identity and Access Management (IAM) role on AWS. You will learn to create an IAM role for adding security and authentication to your clusters, and a VPC for optimal performance on dedicated network parameters, where you can customize subnets, internet gateways, and other network components. When your cluster is ready and running on the cloud, you can write and run SQL queries in the Redshift query editor.

What You Will Learn:

You will learn to write and execute multiple SQL queries to find insights in the dataset and to perform analytical operations based on certain conditions. You will also learn to save and schedule SQL queries so that they execute at a predefined date and time. Scheduled queries are important for routine backups, maintenance, creating shards, analysis, and other tasks.

This course takes a pragmatic approach: everything is taught with practical examples to help learners understand these advanced concepts easily and quickly, without spending much time on long theoretical lessons. You may find these skills useful in various domains related to database administration and cloud computing where Redshift clusters are involved.

Prerequisite:

Familiarity with any relational database and SQL is useful before taking this course.

Who This Class is For:

Anyone who is curious to learn about Amazon Redshift, including students and IT professionals interested in learning data warehouse services on AWS.

What is Amazon Redshift?

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. It enables you to use your data to acquire new insights. Redshift is an advanced and popular data warehousing service on AWS. Redshift is read optimized, which makes it much faster for read-heavy workloads than traditional relational databases that are write optimized.

What is Data Warehousing?

Data warehousing is used to store very large datasets, generally meant either for archival or for analytics. The warehouse becomes a library of historical data that can be retrieved and analyzed to inform business decision-making.

Meet Your Teacher

Harshit Srivastava

Developer on IBM Cloud, Bluemix

Teacher

I am a self-taught developer who has worked on various platforms using varied languages and has been involved in various projects, both open source and proprietary.

I have developed web and Android applications and a Chrome extension, worked on various frameworks, fixed bugs for some projects, and explored numerous others. I think education and learning should be free and open, not bound by restrictions like attending classes or going to college. People of all age groups, genders, faiths, races, nations, etc. must get equal privilege. When the entire world acts this way, like a single FAMILY, we will truly realise the VALUE of Knowledge and Human Life.

Level: All Levels

Hands-on Class Project

Create Database Clusters on AWS Redshift

Project Description

Your class project is to create and manage clusters and nodes on Amazon Redshift. You may also run SQL queries for database operations.

Getting Started:

To create this class project, follow the steps below:

1) Before creating a cluster, create an IAM (Identity and Access Management) role on AWS for the Redshift cluster.

2) Create a cluster on Redshift, defining the number of nodes, such as 2 or 4. Attach the IAM role created in the previous step to this cluster.

3) Go to the cluster properties to explore the cluster actions and the JDBC and ODBC connections.

4) Set up a CloudWatch alarm based on certain metrics.

5) Create and execute SQL queries on the Redshift cluster and database.

Sharing your work:

You can share your work by uploading progress shots to the "Your project" section. If you have any questions or doubts, please let me know! I am happy to help.

Keep Learning!

Transcripts

1. Introduction: Hi there, welcome to this course on creating and managing clusters on Amazon Redshift. My name is Dan, and I'm the instructor for this class. In this course we will learn about Amazon Redshift, a popular data warehousing tool for storing, retrieving, and analyzing infrequently accessed data. Generally it is meant for archival, or it can be used for highly scalable relational datasets that are read optimized. There are various use cases and applications for Amazon Redshift. In this course you will learn to create and customize Identity and Access Management (IAM) roles. You will learn to create and manage Redshift clusters. You will learn to write SQL queries using the Query Editor on Redshift. You will also learn to customize cluster properties and set up alarms on Redshift. You will also learn to save and schedule SQL queries that can be run at a predefined time. You will also learn to create connections and integrations and find various options on Redshift, such as ODBC and JDBC connections. If you are curious to learn these skills on Amazon Redshift, you can start learning right now. See you in the class.

2. Create an IAM role on AWS: Hi there, welcome to this lesson. We're going to learn about creating an IAM role, an Identity and Access Management role, on Amazon Cloud. An IAM role is very important when you have to authorize any service with another service or any other account on AWS. So it is one of the core services and building blocks of Amazon Cloud. You can go to the security and identity management category, or you can just search for IAM in the AWS search bar, and you will find this dashboard for the IAM console. On the left-hand side, you've got various options for access management and access reports. To create a role, just navigate to the Roles option, just below User groups and Users. As you can see here, I've already created multiple IAM roles that are attached to various kinds of cloud services on AWS.
If you want a separate role for accessing resources on the cloud, you can create one. A role is like a user account, but it's not a user account. It has attached permissions to use and perform certain operations on the respective cloud services. Just go to Create role. This gives you four different options for the trusted entity. The first is an AWS service: you can attach the role to any of the AWS services listed below. You can also use a role to authorize access for another AWS account. You can also attach it to a web identity or to SAML federation. So if you want to provide access to some resources, maybe you are working on a team where you have hundreds of users, developers, and testers working with you on AWS, and they all have separate accounts, you can grant permission to certain resources on the cloud using IAM. If you have a single account, you can also provide access to various services. This time we are selecting Redshift. Redshift is a popular service on AWS that can be used for storage, retrieval of information, and a lot of other things. So just select Redshift. You could choose any other service, like DynamoDB or S3. Once you have selected it, you can attach permission policies. These are various policies that grant certain kinds of access, like read access, invoke access, CloudWatch access, VPN gateway access, write access, execute access, SDK access, and a lot of other things. We are using Redshift, so we can select the respective permissions. If you want to attach S3, you can attach the S3 full access policy. This way we have access to read, write, delete, or perform any kind of operation on the content kept in the S3 bucket. If you want to provide only read-only access, you can attach the Amazon S3 read-only access policy instead.
Say, for example, you're building a web application or a mobile application that derives information from the data kept in the S3 bucket. You can just provide read-only access. In that case it will only be able to retrieve information. It is not allowed to delete or modify anything or add any information, as that permission is blocked. It is very secure. Implementing security comes down to just one service, and that is IAM. If you have implemented Identity and Access Management properly, your cloud system or cloud integration environment is almost secure. There are other security services as well, dedicated to DDoS protection and other things, but IAM is the core service. That's why it is categorized under the security and identity management services on AWS. Just provide the name of this role. You can also provide a role description. As we are creating a role for Redshift, let's name it Redshift role. That makes it easy to tell that it is related to Redshift. Otherwise we could get confused. If you want to add a one-line description, or a more elaborate one, you can provide it. If you have fifty or hundreds of IAM roles, you may get confused about which role is attached to which service or which account, and the description keeps you informed. Review the role, the description, and the policies that have been attached, and create the role. OK, so we have created this new role. Just search for it: Redshift role. This IAM role has been created and there's no last activity, as we have not used it yet. Just go to the properties, and you will find that the first one is the role ARN, the Amazon Resource Name. You have to copy this whenever you want to attach your IAM role to a different service for authentication. If you create a Redshift cluster, or you create a DynamoDB database instance, and you want to give it the permissions attached to this role, you can just paste the ARN there.
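
The console steps above can also be scripted. The following is a minimal sketch, not the instructor's method: it assumes a boto3 IAM client is passed in as `iam_client`, and the role name and description are hypothetical. It builds the trust policy that lets the Redshift service assume the role, attaches the managed S3 read-only policy, and returns the role ARN.

```python
import json

# AWS managed policy that grants read-only access to S3.
S3_READ_ONLY_POLICY_ARN = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"


def redshift_trust_policy() -> dict:
    """Trust policy that lets the Redshift service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "redshift.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_redshift_role(iam_client, role_name: str, description: str) -> str:
    """Create the role, attach S3 read-only access, and return the role ARN.

    `iam_client` is assumed to be a boto3 IAM client, e.g. boto3.client("iam").
    """
    response = iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(redshift_trust_policy()),
        Description=description,
    )
    iam_client.attach_role_policy(
        RoleName=role_name, PolicyArn=S3_READ_ONLY_POLICY_ARN
    )
    return response["Role"]["Arn"]


if __name__ == "__main__":
    # Printing the trust policy needs no AWS credentials.
    print(json.dumps(redshift_trust_policy(), indent=2))
```

Swapping the policy ARN for a full-access policy would mirror the other choice described in the lesson. Read-only is the narrower grant.
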
You also get other options. If you want to customize the policies later on, you can go to Attach policies and customize them. Say, for example, we granted only read access to the S3 bucket, and in future we want to change it. We can do so just by clicking Attach policies. You can also find other things like Access Advisor and tags. You can revoke sessions, and there are other properties as well. If you want to dive deeper into this IAM role, you also get the JSON option. This is the JSON created for this role. You can also create and operate IAM roles using the AWS CLI, the command line interface, if you don't want to go through the console. This is how you create an IAM role on AWS. Now you know how to attach policies, provide permissions for read, write, execute, and other things, and how to find the ARN for the role. Keep learning and keep moving ahead. You are going to learn more in the coming lessons.

3. Create a Redshift Cluster: Hey, welcome back, friends. In this lesson you're going to learn about creating a cluster on Amazon Redshift. Let's start with this: what is Redshift? Redshift is a relational database service that supports SQL, and it's generally used as a data warehouse. You can also consider it a data warehouse service on AWS. It is read optimized, which makes it fast compared to other relational database services such as MySQL, which are write optimized. It is a petabyte-scale data warehouse service, so you can keep a lot of data on Amazon Redshift. To use Redshift, you first need to create a Redshift cluster, and later on you can perform other operations. You can create nodes, select the node type, and optimize them. Just go to Amazon Redshift on the AWS console. Here you can create a cluster. This is the dashboard for Redshift.
If you have already created clusters, they will be visible here. If you are creating a Redshift cluster for the first time, this is how the homepage looks. So go to Create cluster on Redshift. This will allow you to customize various properties of the Redshift cluster. Data warehouses are generally used for data kept for archival purposes: data that has to be kept for record keeping, auditing, and other things, but that is generally not used in the live production environment. You can also use Redshift in a production environment where read efficiency is more important than write efficiency. If your data store needs to be read efficient, very quick and very fast, you can use it for that. When you create a cluster, you have to configure various things. If you are planning to use this cluster for a production environment, you can select that option. Otherwise, if you want to use it for trial purposes, you can select that option instead. You can choose the size of the cluster. You can choose a node type: it could be large or extra large, it could be a legacy type, it could be medium or small. It depends on your requirement. If you have a huge requirement, you can go with the extra-large option. Otherwise, just go with the entry-level option. If you are doing it for trial purposes, it will show you the option for a small cluster. There are extra-large and other options. You can select the number of nodes. You can select the data source: if you want to load sample data for practice and learning, you can select the sample data. Otherwise, you can create a cluster and upload your own data for practice. As with other database services, you need to provide a database admin username and password. You use these to access your resources later on. Then you can provide cluster permissions, if you want to select them. If you use the trial version, you can see what is set up.
Various things are already configured for you, and this is the calculated configuration. A dc2.large one-node cluster is a single-node cluster. It uses a compute instance with two virtual CPUs, and you've got the sample data. One node has 160 GB of storage, so you can store up to 160 GB of content on this Redshift cluster. If you want petabyte scale or a very large scale, you can choose the extra-large node size with multiple nodes. The sample dataset here is Ticket, which is 28 MB, and it allows you to perform various query operations. You can choose the admin username and password for accessing your database resources. Redshift is fairly similar to MySQL, as it is also relational and supports SQL queries. So Redshift allows you to write and run SQL queries for analytics: if you want to retrieve some information for analytics purposes and other things, you can run SQL queries. You can also access Redshift resources using the AWS console, where you get a visual interface. If you prefer the command line interface, you can run commands there and get the output, just like a traditional SQL interface. Once you have configured everything, just go to Create cluster. This will create the cluster. If there is any error, refer back to that error and try to fix it. There could be a password error, for instance: the password should be in a certain format, you may have to include special characters, and so on. For these parameters, you have to provide an adequate cluster size. The number of nodes should not be very large. If you are a beginner or doing it for trial purposes, one or two nodes are sufficient. But if you are doing it for a production environment, for real-life applications, it depends on how you want to use it. Once you are done configuring the Redshift cluster, just hit Create cluster and it will be created.
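
The same configuration can be expressed as keyword arguments for boto3's Redshift `create_cluster` call. This is a hedged sketch rather than the course's own procedure: the cluster identifier, credentials, and role ARN used with it are hypothetical placeholders, and the database name `dev` is an assumption. The helper only builds the settings, so it can be inspected without touching AWS.

```python
def cluster_settings(identifier, admin_user, admin_password, role_arn, nodes=1):
    """Build keyword arguments for a boto3 Redshift create_cluster call."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    settings = {
        "ClusterIdentifier": identifier,
        "NodeType": "dc2.large",  # the entry-level node type from the lesson
        "MasterUsername": admin_user,
        "MasterUserPassword": admin_password,
        "DBName": "dev",  # assumed database name
        "IamRoles": [role_arn],
    }
    if nodes == 1:
        settings["ClusterType"] = "single-node"
    else:
        # NumberOfNodes is only supplied for multi-node clusters.
        settings["ClusterType"] = "multi-node"
        settings["NumberOfNodes"] = nodes
    return settings


def create_cluster(redshift_client, settings):
    """Send the request; `redshift_client` is assumed to be boto3.client("redshift")."""
    return redshift_client.create_cluster(**settings)
```

Keeping the settings separate from the call makes it easy to review the node count and node type before anything billable is created.
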
It will navigate you to the dashboard and the Clusters option, where the cluster will appear. Once your cluster is available, you can go into the properties, configure it, and start querying. You can upload data and perform various things. You can also monitor the CPU utilization and storage capacity. You can check the status: currently the cluster is being created, and the status shows Modifying. There is also a cluster namespace that you can use to identify this cluster. You can also attach it to other clients, like SQL clients that can be used for ETL (extract, transform, and load) operations. We can use it with JDBC or ODBC drivers as well. But here we are using the AWS console for querying. This is how you create a cluster on Redshift. Try creating your own AWS Redshift cluster for data storage, data warehousing, and other things. Keep learning and keep moving ahead.

4. Redshift Query Editor and JDBC odbc connection: Hi, welcome back, friends. In this lesson you're going to learn about writing and running SQL queries on Redshift using various techniques. There are various ways to run SQL queries against a Redshift cluster. The first is to use a JDBC or ODBC driver to connect to the Redshift cluster from any client tool, such as SQL clients, ETL tools, IDEs, or code editors. For that, you can download the JDBC driver JAR file for this Redshift cluster, or the ODBC driver, without the AWS SDK. If you use the AWS SDK or the AWS command line interface, you can also connect through that. Otherwise, you can use the Query Editor on the AWS console. First check the cluster details. Once you open the cluster properties, you will find various details such as the node type, the number of nodes, the endpoint, and the JDBC and ODBC URLs. If you want to go to the Query Editor, you can click here.
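
For reference, the JDBC URL shown in the cluster properties follows the pattern `jdbc:redshift://endpoint:port/database`. The sketch below assembles one from its parts. The endpoint in the usage example is a made-up placeholder, and 5439 is assumed as the port because it is the Redshift default.

```python
def redshift_jdbc_url(endpoint: str, database: str, port: int = 5439) -> str:
    """Assemble a JDBC URL for a Redshift cluster endpoint."""
    if not endpoint or not database:
        raise ValueError("endpoint and database are required")
    return f"jdbc:redshift://{endpoint}:{port}/{database}"


if __name__ == "__main__":
    # Hypothetical endpoint; copy the real one from the cluster properties.
    print(redshift_jdbc_url(
        "examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com", "dev"
    ))
```

A SQL client or ETL tool would take this URL together with the admin username and password chosen when the cluster was created.
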
Before starting with the Query Editor on the Redshift console, you may need to configure a KMS (Key Management Service) encryption key. You can enable it, configure it, and move forward. The Query Editor will then allow you to run SQL queries using the AWS console, and you will be ready to go. You can find KMS, the AWS Key Management Service, for encryption applications. You can choose to create a symmetric or asymmetric key type. You can define the key usage, find advanced options such as the key material origin, and provide a description. Just hit Next. You can keep the default settings. Don't customize them until you are familiar with them. Once you are done, you can open the Redshift console. On the left-hand side, we've got three different options: the first is Database, then Queries, and the third is Charts. By default, we are in the Redshift cluster. On the right-hand side we've got the editor where we can write a script. Just write: select * from tickets. So we have the Ticket database, and we can access it. Ticket is a table that has been created by default as a sample dataset. If you want to limit your query output to 100 rows, you can set the limit, or you can leave it. You can also save this query by providing a name, in case you want to use it later on any other cluster. You can also run it manually or at a scheduled time. You can also customize this query. If you want to make a connection with a database, you have to provide the name of the database, the username, and the authentication details. When you created the cluster on Redshift, you provided a username and password. You can use the same credentials to access your database resources on the cloud. As with MySQL or any other kind of database, you create a connection as a database admin or database user. This will allow you to run SQL queries.
For this connection, you can use temporary credentials, among other options. You can create the connection here. Once it is there, you can start executing your SQL queries. There are various kinds of SQL queries that you can run. You can perform insert operations. You can drop a table. You can modify a table or write SQL procedures. You can create joins between tables: left, outer, and inner joins, and some advanced queries. You can use a where clause and a lot of other things. Once you are done and there is a connection, you can run the query on this cluster. Now let's move to the query options. Here we have the saved queries. You can go to the charts, if you have already created a chart. You can change the display mode from night mode to day mode: in day mode you get a white screen, and in night mode you get everything in a dark tone. You can edit the workspace as you like. You may create multiple folders to arrange your queries systematically. If you're working in a real production environment for an organization or on some other project, you may need to run thousands of queries, and they need to be arranged. So here we can create various folders, such as one for data and one for queries. We can change the variables when we want, and we can write queries for analysis and other things. You can find the various queries you have created, and they can be shared with your teammates. This is the general Query Editor for Redshift. You can run SQL queries in various ways on a Redshift database. On Redshift you can have multiple clusters. You can store a lot of databases there. And it is read optimized, so it is very fast.
You can have a lot of data stored on a Redshift cluster, at petabyte scale, used for archival purposes. If you want real-time read and write operations on the database, don't use Redshift. You can use DynamoDB or other databases for better performance. Keep learning and keep moving ahead.

5. Cluster Properties and Actions: Hey, welcome back, friends. In this lesson you're going to learn about various cluster properties on Amazon Redshift. You will learn about monitoring various things that are running on a cluster, like queries and other things. You can keep track of them, and you can also resize your cluster whenever you want. Just go to the cluster dashboard for Amazon Redshift and select the cluster. We have gone into the details. Here is the general information about your cluster. You can find resource names and other things. Then just scroll down, and you will find various options such as Cluster performance, Query monitoring, Schedules, Maintenance, and Properties. Under Query monitoring, you've got the query history, the history of queries executed on your cluster. You can view it on a weekly, daily, or hourly basis, and so on. It will show you the workload concurrency for queued and running queries on the cluster: the queries that are currently queued and the queries that are currently running. It shows these as various metrics. Currently we don't have any queued or running query, so everything is empty. You can also check database performance and various schedules. You can resize the cluster any time you want. You can find various kinds of things. You have scheduled queries as well. So you can create a SQL query and run it at a predefined time. Say you write a SQL query right now and you want it to execute seven days from today, or on the 17th of every month at 10:15 AM. You can schedule it and it will run at the predefined time.
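
The two schedules mentioned above can be written down concretely. This is a small sketch under one stated assumption: that the schedule is given as an Amazon EventBridge-style cron expression, whose fields are minutes, hours, day-of-month, month, day-of-week, and year, evaluated in UTC. The helper names are hypothetical.

```python
from datetime import date, timedelta


def monthly_cron(day_of_month: int, hour: int, minute: int) -> str:
    """Return a cron expression for one run per month (time in UTC)."""
    if not 1 <= day_of_month <= 31:
        raise ValueError("day_of_month must be between 1 and 31")
    if not 0 <= hour <= 23 or not 0 <= minute <= 59:
        raise ValueError("hour or minute out of range")
    # Field order: minutes hours day-of-month month day-of-week year.
    # Day-of-week is "?" because the day-of-month is already fixed.
    return f"cron({minute} {hour} {day_of_month} * ? *)"


def run_date_after(days: int, today: date) -> date:
    """Date of a one-off run a given number of days after `today`."""
    return today + timedelta(days=days)


if __name__ == "__main__":
    print(monthly_cron(17, 10, 15))          # 17th of every month at 10:15
    print(run_date_after(7, date.today()))   # seven days from today
```
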
That way you can create automation. So scheduled queries provide you with automation capability. You can also find the maintenance details. You can check when your cluster is due to be maintained. If you want a backup of your cluster, it will create a backup. Whenever you want to edit your properties or cluster configuration, you can edit them here. If you want to edit the backup properties, you can set a custom value for the number of days for which a snapshot is kept. A snapshot captures the present state of the cluster: the cluster data and other information are stored as a backup. The cluster also has connected network and security parameters such as the IAM role and VPC. You could also provide a schedule limit. Once you're done, you can save the changes. Otherwise you can recheck these things. These properties are important when you are running a cluster on the cloud. There is a wide range of applications that can be deployed on Redshift clusters. A cluster could be linked to a lot of things, and there are a lot of people involved in maintaining these things. In a manager's role, you can check various parameters to see whether your cluster is running properly. You could be a solution architect, a cloud developer, or responsible for database administration, or anything like that. You would be in a situation where you need to check the performance: whether your cluster is up and your queries are running. Say you want to check whether your developers and other people have executed queries, but everything has been moved to the queue. You can monitor it here and identify what the problem is. If any problem has occurred, you can troubleshoot it by going through the various properties. There are various actions that you can perform on a cluster, like resizing the cluster or rebooting the cluster. You can defer the maintenance, you can pause it, you can delete it.
You can use backup and disaster recovery options such as restoring a table, creating a snapshot, and configuring cross-region snapshots. You can manage the permissions. You can manage the Identity and Access Management roles for access to your cluster. There are other things as well: you can modify the publicly accessible setting and other things. Let us look at Manage IAM roles. You can attach multiple IAM roles to your cluster based on certain parameters. Say your organization has a team of ten developers, but only four are currently working on this cluster. You can create ten different roles, but allow only four of them to take charge of this Redshift cluster. They would access a certain segment of your cluster and could modify it. They can perform the tasks that they are assigned. This is why you have to associate IAM roles. Also, if you want to integrate one cloud service on AWS with another, you require an IAM role. If you want to reboot or resize your cluster, you can easily do it. Say, for example, there is a problem, some error that you can't identify: all the queries have gone into the queue, but no query is running. You can reboot the cluster as the first step of troubleshooting. Say you're using a Windows-based desktop and your system hangs and everything freezes. What did you do in earlier days? You simply turned your computer off and restarted it. The same thing can also be done on the cluster. Maybe your query load is very high and you need more nodes. You can also change that. You can resize the cluster in such situations. If there is a problem with your cluster, you can also delete it. You can pause it.
While paused, it won't be available to take new requests and so on. Check the various cluster properties on Redshift. Keep learning and keep moving ahead.

6. Setting up cloudwatch alarms: Welcome back, friends. Now let's move forward with creating alarms. Previously you learned about various cluster properties on Redshift. Now you will learn to create alarms. You can create an alarm using this option. You can monitor your cluster using CloudWatch. You have to define various things such as the cluster identifier. You can set the alarm for a certain metric. Here, for example, you can pick the maximum of CPU utilization. You've got various options: CPU utilization, database connections, read latency, read throughput, and other metrics as well. Let us take CPU utilization for now. It will inform you once your CPU utilization crosses the threshold: when the metric value is greater than 80%, it will trigger the alarm. It can notify you through various actions. You can create an SNS notification on Amazon Cloud or you can disable it. So it acts in two ways: it notifies you, and it records the alarm in CloudWatch. Amazon CloudWatch is a dedicated service that keeps track of various activities that happen on the cloud. It can be used for auditing purposes and for setting up alarms. For example, if your developers are currently working on a Redshift cluster and your project manager is watching the CloudWatch logs and other things, they can be informed. So they can keep track if anything goes beyond a predefined threshold. You can set various thresholds. You can create multiple alarms on Redshift clusters, for CPU utilization, read latency, write latency, and so on. You can have a discussion with your teammates and set various kinds of alarms. Once they are triggered, you can revisit your policies: if the demand is very high, for instance, you can plan accordingly whether to increase the node size or not.
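
The alarm described in this lesson maps onto the parameters of CloudWatch's `put_metric_alarm` call in boto3. The sketch below is an illustration under assumptions, not the console's exact behavior: the alarm name, the five-minute period, and the single evaluation period are hypothetical choices, and the SNS topic ARN is optional.

```python
def cpu_alarm_settings(cluster_identifier, threshold_percent=80.0, sns_topic_arn=None):
    """Build keyword arguments for a boto3 CloudWatch put_metric_alarm call."""
    settings = {
        "AlarmName": f"{cluster_identifier}-high-cpu",  # hypothetical naming scheme
        "Namespace": "AWS/Redshift",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "ClusterIdentifier", "Value": cluster_identifier}],
        "Statistic": "Maximum",
        "Period": 300,            # seconds per data point (assumed)
        "EvaluationPeriods": 1,   # alarm after one breaching period (assumed)
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        # Notifications are enabled only when a topic is supplied.
        "ActionsEnabled": sns_topic_arn is not None,
    }
    if sns_topic_arn is not None:
        settings["AlarmActions"] = [sns_topic_arn]
    return settings


def create_alarm(cloudwatch_client, settings):
    """Send the request; `cloudwatch_client` is assumed to be boto3.client("cloudwatch")."""
    return cloudwatch_client.put_metric_alarm(**settings)
```

Calling `cpu_alarm_settings` again with a different `MetricName` substituted in the result is one way to add the read latency or write latency alarms mentioned above.
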
Keep learning and keep moving ahead.

7. Execute SQL queries on Redshift: Hi, welcome back, friends. In this lesson you are going to learn about running SQL queries on an Amazon Redshift cluster. So let's start. We have already created a Redshift cluster, and we can write and run SQL queries on it. Go to the Query editor option on the Redshift dashboard. You will find it just below the Clusters option. The first option is Queries, and the second is Editor. You can go to the Query editor option. Here you will find this kind of view. On the right-hand side, you will find the editor where you can write your SQL code. You've got various options, such as query history, saved queries, and scheduled queries. These options can be used to save a SQL query, find the query history and execution history, and schedule a query as well. First of all, we need to select the database on which we want to run our SQL queries. This is the Ticket sample database that we are going to use. We need to create a connection. To create a connection, you connect to a database. If there are multiple databases on your cluster, you can switch between them by using the Change connection option. You can authenticate and log in to your database. Once you're authenticated, you can start running your queries. We have selected the database development and the schema public. We are using the table ticket. Here you've got various options. You can insert predefined SQL queries and run them. Once you run a SQL query, it will produce output on the same page. You can scroll down and find it. You can show the table details and the query result. So let's go to the table details option and insert a predefined query into the editor. It will generate some SQL code for you, and you can execute it. You can also write your SQL code right from scratch. But what if you don't have time and you are in a hurry?
You can use some of the predefined SQL queries very easily. Maybe you are unsure about the structure of your table or the data schema, or you make a spelling mistake, and then your query will not execute and no output will appear. In that case you can use the predefined SQL queries. You can also save various SQL queries. Here we execute this SQL query, select asterisk from the table, where we have not used a WHERE clause but we have used LIMIT. Because we are using LIMIT, it will show only ten records. You can change the limit, and you can add other things as well. So you can run various SQL queries here. Let's try a WHERE clause. You can put in a certain condition. Once you're done with the SQL query, you can start executing. You can run one SQL query at a time. If you want to run multiple SQL queries, you can do that as well. You can schedule them by adding them to the scheduled queries. They can be executed simultaneously, or you can save multiple queries and run them one by one using this option. On Redshift, you can run your SQL queries using various methods. You can use the AWS command line interface. You can use two different query editors, the Query Editor V2 and the query editor we are using here. Just put in the condition that quantity sold is greater than ten. We execute it, and it produces a result. We can visualize results as well. Just scroll down. Once your query is executed, it may show you that there is no output. You can change your parameters, define various settings, and so on. You can also clear your SQL queries if you want to remove everything and write it once again. You might already be familiar with SQL syntax. If you're not, you should learn SQL programming and SQL queries. These are basic things. This is not an advanced programming language like Java or Python. You should be aware of these things. Most of you might already be aware of SQL queries.
You have already written SQL in college, and in your corporate life you have used it. You can leverage it on a Redshift cluster. These are the various columns and rows returned when we run this SQL query. We can also download this output in CSV format, which can be used for further data analysis and other things. By writing SQL queries, you can perform various operations, like analyzing information. You can add new rows and columns. You can find certain information based on a filter, based on whether certain conditions match, and so on. If you have 1 million records and you want to narrow them down, you can. You can view query results in different ways. You can also visualize them in the form of a chart. Generally this is not used so much, but if you want to visualize, it is there. You can also find how your query execution performed, how it is executing, and so on. This can be used by the project manager. It can be used for auditing and other things, such as checking how the query execution performed. You can see the plan of your queries as text. The execution time and execution timeline can also be found here, and you can extract them as data. SQL queries are widely used for performing analysis. They can be used to modify the data on your clusters. You can also run SQL queries from various applications, such as web or mobile applications. But before connecting to another application, you can write and run the SQL queries in the query editor to check whether they produce the right output or not. You can also create multiple tables based on big data. Say you have millions of customer records and you want to filter them based on certain conditions, like geographical conditions. You may want to divide them into different tables and databases separated by geography, nation, and roles, which you can define there. You can also define them on various other conditions, and so on.
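Here is a small sketch of the query pattern described above: a filter with a WHERE clause, a LIMIT of ten, and the result saved in CSV format. It uses Python's built-in SQLite database as a local stand-in, not Redshift itself, so you can try it without a cluster. The table name `sales`, the column names, and the sample rows are assumptions made up for this example; only the idea of filtering on quantity sold greater than ten comes from the lesson.

```python
import csv
import io
import sqlite3

# In-memory SQLite database used as a local stand-in for a Redshift database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (salesid INTEGER, qtysold INTEGER, pricepaid REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, 4, 728.0), (2, 12, 76.0), (3, 15, 350.0), (4, 2, 175.0)],  # made-up rows
)


def run_query(connection, min_qty, limit):
    """Return (column names, rows) where quantity sold is greater than min_qty."""
    cursor = connection.execute(
        "SELECT salesid, qtysold, pricepaid FROM sales "
        "WHERE qtysold > ? ORDER BY salesid LIMIT ?",
        (min_qty, limit),
    )
    header = [column[0] for column in cursor.description]
    return header, cursor.fetchall()


def to_csv(header, rows):
    """Serialize a query result as CSV text, like the editor's download option."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


header, rows = run_query(conn, min_qty=10, limit=10)
print(to_csv(header, rows))
```

Changing `min_qty` or `limit` plays the same role as editing the WHERE condition or the LIMIT value in the query editor.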
So SQL is a very versatile query language that can be used with large relational tables. You can also arrange your SQL queries the way you want. If you want to write them on a single line, you can. If you want indentation, you can have that too. This is how you can write SQL queries on a Redshift cluster. Keep learning, and keep moving ahead.

8. Save and Schedule SQL queries: Hi, welcome back, friends. In this lesson, you are going to learn about scheduling SQL queries so that they are executed at a predefined time. Let's start. You already know how to write SQL queries on a Redshift cluster and how to run them. You can save a SQL query by clicking the Save option. You just have to provide a name for the query so that you can refer to it by that name, for example a name that says it displays data based on some parameter. You can define various conditions and save multiple SQL queries. You don't need to type things over and over again. Although you can change a few parameters and conditions each time, it is much better to have the query saved. After you have saved queries, you can refer back to them and check the history of query execution. You can find the various queries that have been executed, in order of execution. For scheduled queries, you define a SQL query and provide the time and the frequency. You can define the schedule based on the run frequency or the cron format. You can also have those SQL queries run repeatedly. As you can see, there are three different options just next to Run: Save, Schedule, and Clear. At the top corner, you have the saved queries, the query history, and the scheduled queries. You can always download the queries in SQL format. Once a query has been saved, you can use it anywhere else. So go to the Schedule queries option. Here you need to provide the IAM role. You can go to the Identity and Access Management dashboard to create an IAM role.
Or, if you already have an existing role, you can find its ARN, the Amazon Resource Name, and just provide it here. Here we have the Redshift role that we created previously, and we can use this role. Just copy the role ARN for this IAM role and paste it into the ARN field for the scheduled query. To run a query, you need access to certain resources, and you can provide that access using the IAM role. Then you need to define the query. Here you can write the SQL query. This is the most important thing. If the query is very long, you can also upload it in text format. Then you have the scheduling options: you can schedule the query by run frequency or by cron format. If you select run frequency, you can repeat it on a monthly basis or on a daily basis at a certain time. You can repeat the query every Monday, and so on. You define the time in UTC, and you may be in a different time zone. If you are working in a different time zone, like I'm working in Indian Standard Time, IST, you just adjust by certain values to convert it to UTC. You also need to provide the name of this scheduled query, for whenever you are required to run it. You can have multiple queries saved, and you can have multiple queries scheduled. Once you have a large cluster on Redshift with a lot of databases, you need not just write SQL queries; you can schedule them at a predefined time. This can be used for maintenance, for auditing, and for various other things. Say, for example, you have a large cluster with a large user database, where the limit is, say, 1 million records, and you know that on a monthly basis it exceeds 1 million records. Then you can create a new database. For various purposes, this can be used. Keep learning and keep going.

9. Database connections with new tab in Query editor: Welcome back, friends.
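For the scheduling step from the previous lesson, here is a minimal sketch of converting a daily run time from IST to UTC. IST is a fixed offset of UTC+5:30 with no daylight saving, so a fixed-offset time zone is enough. The function names are hypothetical, and the `cron(minutes hours day-of-month month day-of-week year)` layout is an assumption about what the cron format field accepts, so check it against the console before relying on it.

```python
from datetime import date, datetime, time, timedelta, timezone

# Indian Standard Time as a fixed offset from UTC (no daylight saving).
IST = timezone(timedelta(hours=5, minutes=30), name="IST")


def ist_to_utc(local_time: time) -> time:
    """Convert a wall-clock time in IST to the matching wall-clock time in UTC."""
    # An arbitrary date is attached only so the offset arithmetic can be done.
    local_dt = datetime.combine(date(2024, 1, 1), local_time, tzinfo=IST)
    return local_dt.astimezone(timezone.utc).time()


def daily_cron_utc(local_time: time) -> str:
    """Build a daily cron-style expression in UTC (assumed field layout)."""
    utc = ist_to_utc(local_time)
    return f"cron({utc.minute} {utc.hour} * * ? *)"


# A query meant to run every day at 09:00 IST runs at 03:30 UTC.
print(ist_to_utc(time(9, 0)))      # 03:30:00
print(daily_cron_utc(time(9, 0)))  # cron(30 3 * * ? *)
```

Note that an early-morning IST time falls on the previous day in UTC. That makes no difference for a daily schedule, but it matters if you schedule on a specific weekday.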
Now, going forward with our scheduled queries: we have written a schedule, provided the Identity and Access Management role, and filled it in. You can find the saved queries, and you can find the scheduled queries in this category. We have one saved query. We created the scheduled query, but we have not finalized it, so it is not available. You can delete various kinds of queries. You can delete them from saved queries, and you can delete them from scheduled queries as well. But you cannot delete anything from the query history. You can also create a new tab in the query editor, just like you create a new tab in your web browser. What will a new tab do? It will allow you to run multiple queries simultaneously. You can run one query in one tab and have another query running in another tab. You can also change the connection to your database. Say you have ten tables in a database and you have ten different databases, so ten times ten, 100 tables across your databases; you can switch the connection using a new tab. You can run various queries simultaneously on different tables and databases. Also, you can have the scheduled queries running in the background. You can automate various queries using scheduled queries that are executed on a routine basis. These can be used for routine checkups, maintenance, creating backups, and anything else. Then you can run certain queries on the go, whenever you want. If you want to run various queries on a maintenance basis, you can do that too. So let us write a query. Select venue city, venue state from the table. Here we set the limit, and it will be executed. This is how we write a query here. Sometimes there could be a syntax error, or there could be connection issues. You can change the connection or refresh it, and it will work. Here various rows are returned for the venue city and the venue state. We can visualize it.
We can export the results in CSV format, and they can be used for another table as well. Try creating your own SQL queries. Save them. Create scheduled queries, run them on the Redshift cluster, use your own database tables, and perform various kinds of operations. Amazon Redshift can be used for a wide range of applications. As you already know, you can do a lot of things on Amazon Redshift. Try to perform various tasks on Redshift, from creating database tables, adding records, and connecting to a VPC and an IAM role, to running various queries and scheduling queries. Keep learning and keep moving.
