What’s one thing hospitals, universities, online retailers and government agencies have in common? They all rely on data analysis to learn from past events, perform their best in the present and achieve their goals in the future. 

That’s because data analysts don’t just look at data. With the help of various tools, they also prepare, clean, process and interpret it so the organizations they’re working with can make smarter decisions and forge a more successful path. 

What Is Data Analysis? 

You’ve probably already guessed that data analysis involves, well, analyzing data. But what does it actually entail? 

While the exact tasks vary from project to project, data analysis can involve inspecting, cleansing, processing, condensing and interpreting raw data in order to extract useful information from it. 

As the U.S. Office of Research Integrity (ORI) explains, data analysts are able to accomplish those goals by applying logical and statistical techniques.  

Data Analysis in Action

Need an example of data analysis? No problem. 

Let’s say a hospital wants to know how many staff members it should have working on any given day. Too many, and the hospital will lose money. Too few, and patients won’t receive the care they need. 

To learn the answer, a data analyst could collect as much admissions data as possible. Then, they could evaluate the patient-to-staff ratio at different times of day, days of the week and so on. Based on their findings, they could help predict when the hospital should have more staff members on-site and when they should bring in fewer people. 

That’s not just a hypothetical example, either—data analysts did exactly that for the largest university hospital in Europe, Assistance Publique-Hôpitaux de Paris (AP-HP). 
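The staffing analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the admissions log, the staffing numbers and the function name `patients_per_staff` are all made up for the example and do not come from AP-HP or any real hospital.

```python
from collections import defaultdict

# Hypothetical admissions log: (day of week, hour of day, patients admitted).
admissions = [
    ("Mon", 9, 12), ("Mon", 21, 4),
    ("Sat", 9, 7), ("Sat", 21, 15),
    ("Mon", 9, 14), ("Sat", 21, 17),
]

# Hypothetical staffing plan: staff on duty for each (day, hour) slot.
staff_on_duty = {("Mon", 9): 5, ("Mon", 21): 4, ("Sat", 9): 5, ("Sat", 21): 4}


def patients_per_staff(admissions, staff_on_duty):
    """Average admitted patients per staff member for each (day, hour) slot."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for day, hour, patients in admissions:
        totals[(day, hour)] += patients
        counts[(day, hour)] += 1
    return {
        slot: (totals[slot] / counts[slot]) / staff_on_duty[slot]
        for slot in totals
    }


ratios = patients_per_staff(admissions, staff_on_duty)
busiest = max(ratios, key=ratios.get)    # slot with the most patients per staff member
quietest = min(ratios, key=ratios.get)   # slot with the fewest
```

With this sample data, the busiest slot would be a candidate for extra staff and the quietest slot a candidate for fewer. A real analysis would use far more data and account for seasonality, patient needs and other factors.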

Data analysis isn’t just applicable to healthcare, though. It’s useful in just about every industry imaginable, from manufacturing to retail to transportation. Knowing that, it’s no wonder that overall employment of statisticians (a category to which data analysts belong) is projected to grow by a whopping 33 percent from 2020 to 2030. 

Types of Data Analysis

It’s important to know that not all data analysis is the same. In fact, there are several main types of analysis, all of which involve different sources and processes. 

Descriptive Data Analysis

If a data analyst is asked to perform descriptive analysis, that means they’re tasked with describing and summarizing raw, quantitative (i.e., numerical) data. 

For instance, a data analyst could use descriptive analysis to summarize the average age and income of a business’s customers, along with their gender breakdown. 
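As a rough sketch of what that summary might look like in Python, the example below uses only the standard library. The customer records and the `summarize` function are hypothetical and exist only for illustration.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical customer records.
customers = [
    {"age": 34, "income": 52000, "gender": "F"},
    {"age": 45, "income": 61000, "gender": "M"},
    {"age": 29, "income": 48000, "gender": "F"},
    {"age": 52, "income": 75000, "gender": "M"},
    {"age": 40, "income": 64000, "gender": "F"},
]


def summarize(customers):
    """Return a few descriptive statistics for a list of customer records."""
    ages = [c["age"] for c in customers]
    incomes = [c["income"] for c in customers]
    return {
        "mean_age": mean(ages),
        "median_income": median(incomes),
        # Gender is a category, so it is counted rather than averaged.
        "gender_counts": Counter(c["gender"] for c in customers),
    }


summary = summarize(customers)
```

Numeric fields such as age and income can be averaged, while categorical fields such as gender are better described with counts or percentages.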

Exploratory Data Analysis

Like descriptive data analysis, exploratory data analysis summarizes data. The major difference is that it leans heavily on data visualizations, rather than summary statistics alone, to reveal patterns. 

In other words, a data analyst who’s performing exploratory analysis would turn their findings into easy-to-understand charts and graphs instead of long spreadsheets filled with numbers. 
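To give a feel for the idea without any charting software, here is a minimal sketch that draws a text-only histogram in Python. The list of ages and the function `age_histogram` are assumptions made up for this example. In practice, an analyst would more likely use a charting tool or library.

```python
from collections import Counter

# Hypothetical customer ages.
ages = [23, 27, 31, 34, 35, 38, 41, 42, 44, 47, 52, 58]


def age_histogram(ages, bin_width=10):
    """Return one text bar per age band, e.g. '30-39 | ####'."""
    # Map each age to the start of its band (34 -> 30 when bin_width is 10).
    bins = Counter((age // bin_width) * bin_width for age in ages)
    lines = []
    for start in sorted(bins):
        label = f"{start}-{start + bin_width - 1}"
        lines.append(f"{label} | {'#' * bins[start]}")
    return lines


chart = age_histogram(ages)
print("\n".join(chart))
```

Even this crude chart makes the shape of the data easier to see at a glance than the raw list of numbers.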

Secondary Data Analysis

When a data analyst uses data gathered by someone else to perform their analysis, that’s known as secondary data analysis. 

So a data analyst asked to analyze old data gathered by other analysts would be engaging in secondary data analysis, while one who’s gathering previously untouched data would be performing primary data analysis. 

Diagnostic Data Analysis 

If a data analyst wants to know why something happened, then they use diagnostic data analysis to find out. 

For example, an analyst could use diagnostic analysis to determine why a company’s sales increase during certain months for no discernible reason. 
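One simple way to start a diagnostic analysis is to compare the outcome with and without a suspected cause. The sketch below assumes the analyst suspects promotions are behind the sales spikes. The monthly figures, the promotion flags and the function `average_sales_by_factor` are all hypothetical.

```python
from statistics import mean

# Hypothetical monthly records: (month, sales, whether a promotion ran).
monthly = [
    ("Jan", 100, False), ("Feb", 105, False), ("Mar", 160, True),
    ("Apr", 98, False), ("May", 155, True), ("Jun", 102, False),
]


def average_sales_by_factor(records):
    """Return (average sales with the factor, average sales without it)."""
    with_factor = [sales for _, sales, flag in records if flag]
    without_factor = [sales for _, sales, flag in records if not flag]
    return mean(with_factor), mean(without_factor)


promo_avg, baseline_avg = average_sales_by_factor(monthly)
lift = promo_avg - baseline_avg  # gap between promotion and non-promotion months
```

A large gap suggests the factor is worth investigating, but it does not prove the factor caused the increase. Other explanations, such as seasonality, would still need to be ruled out.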

Prescriptive Data Analysis 

Sometimes data analysts use the insights they’ve gleaned to make recommendations for the next course of action. When they do, it’s called prescriptive analysis. 

For instance, a data analyst might analyze a school’s attendance data and use their findings to recommend that class sizes be kept as small as possible in order to increase attendance. 

Predictive Data Analysis 

If a data analyst wants to use data to make predictions about the future, then they’ll use predictive data analysis. 

An analyst could use qualitative (i.e., descriptive) data, such as customer feedback surveys, to predict which products will sell the best in the coming year—that would be considered predictive data analysis. 
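Predictions can also be made from numerical data. As a minimal sketch, the example below swaps the survey scenario for a simpler one: it fits a straight-line trend to past yearly sales using ordinary least squares and extends the line one year ahead. The sales figures and the `fit_line` helper are hypothetical, and real forecasting usually calls for more data and more careful models.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical yearly unit sales for one product.
years = [2019, 2020, 2021, 2022]
units_sold = [100, 120, 140, 160]

slope, intercept = fit_line(years, units_sold)
forecast_2023 = slope * 2023 + intercept  # extend the trend by one year
```

Because the sample sales grow by the same amount every year, the fitted line simply continues that pattern. Real data is rarely this tidy.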

Big Data Analysis

As its name suggests, big data analysis involves analyzing very large data sets, whether to gather basic statistical information, make recommendations, predict future trends or any other purpose. 

A data analyst who analyzes data gathered from the patients of a nationwide network of hospitals is performing big data analysis, as is one who analyzes traffic patterns in a metropolitan area. 

Tools of the Trade 

Whether they’re working from home or a corporate office, data analysts can use a wide range of tools to perform their analyses accurately and efficiently. 

Microsoft Excel

For data analysts, the spreadsheet program Microsoft Excel is an indispensable tool. It’s usually best suited to small and medium-sized data sets, and one of its biggest benefits is that you don’t need to know how to code in order to use Excel like a pro.

Image: An Excel spreadsheet filled with several rows and columns of both numerical and text-based data. (Still from the Skillshare class The Basics of Data Analytics in Excel: Sort, Filter & Pivots by Ruben Wollerich.)

Python

The general-purpose programming language Python is often used by data analysts, sometimes in tandem with add-on libraries such as pandas, which are designed to make data analysis easier. 

Image: The programming language Python being used within the Atom text editor. (Still from the Skillshare class Learn Python for Data Analysis and Visualization by Tony Staunton.)

R

Another programming language often used for data analysis is R, which was created specifically for statistical analysis, graphics and reporting. 

Image: The programming language R being used within a text editing program. (Still from the Skillshare class Statistics for Data Analysis using R Programming by Venkat Murugan.)

Structured Query Language (SQL) 

Designed to manage data held within relational databases (i.e., collections of data organized into tables that can be linked to one another), Structured Query Language (SQL) first appeared in 1974 and is one of the oldest programming languages still in widespread use today. 

Image: The programming language SQL being used within the software program MySQL Workbench. (Still from the Skillshare class Beginners Data Analysis Bootcamp with SQL by AMG Inc.)
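To make the idea of related tables more concrete, here is a small sketch that runs SQL from Python using the built-in sqlite3 module. It uses SQLite rather than MySQL purely for convenience, and the `customers` and `orders` tables and their contents are hypothetical.

```python
import sqlite3

# An in-memory database that disappears when the connection closes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (1, 1, 20.0), (2, 1, 35.0), (3, 2, 50.0);
""")

# Link the two tables on the customer id and total up each customer's orders.
rows = conn.execute("""
    SELECT c.name, COUNT(o.id) AS order_count, SUM(o.amount) AS total_spent
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY total_spent DESC
""").fetchall()
conn.close()
```

The `JOIN` is what makes the database relational in practice: each order points back to a customer, so a single query can combine information from both tables.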

Apache Spark 

Rather than being a programming language itself, Apache Spark is an engine that accepts multiple languages and can be used for data analysis, engineering and science, as well as machine learning. 

Image: Data being filtered using the Apache Spark engine. (Still from the Skillshare class Big data analysis with Apache spark – PySpark Python by Ankit Mistry.)

Microsoft Power BI

For data analysts looking to transform data into aesthetically pleasing charts, graphs and diagrams, Microsoft’s data visualization tool Power BI is invaluable. 

Image: One bar graph and one pie chart being created within the program Power BI. (Still from the Skillshare class Microsoft Excel: Master Power BI Dashboards in 120 Minutes by Bash (BizTech Matters).)

The Data Analysis Process 

So how do data analysts actually go about doing their job?

Depending on the type of analysis they’re going to perform, the process usually involves some or all of the following steps. 

Gathering Requirements

Before digging into even a shred of data, every data analyst must first understand why they’re analyzing the data to begin with. They should also note the desired end goal of their analysis, whether it be polished visualizations, custom-tailored recommendations or detailed predictions. 

Gathering Data

Once an analyst has nailed down their requirements, they can start collecting the data itself. Depending on the task at hand, this can involve a wide array of sources, from social media insights to customer surveys to paper records. The analyst will typically import all the data into their preferred program, such as Microsoft Excel. 

Cleaning Up 

No dataset is perfect right off the bat. Nearly all of them include imperfections such as blank cells, duplicate entries and the like. It’s up to data analysts to fix or remove those imperfections so they don’t cause complications down the road. 
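A minimal sketch of this step in Python might look like the following. It assumes every field is text, and it simply discards any row with a blank field. Both are choices made for this example, and the sample rows and the `clean_rows` function are hypothetical.

```python
def clean_rows(rows):
    """Drop rows with blank fields, then drop exact duplicates, keeping order."""
    seen = set()
    cleaned = []
    for row in rows:
        # Trim stray whitespace so "Grace " and "Grace" count as the same value.
        normalized = tuple(value.strip() for value in row)
        if any(value == "" for value in normalized):
            continue  # skip rows with a blank field
        if normalized in seen:
            continue  # skip rows already kept
        seen.add(normalized)
        cleaned.append(normalized)
    return cleaned


# Hypothetical raw contact list with a duplicate, a blank and stray whitespace.
raw = [
    ("Ada", "ada@example.com"),
    ("Grace ", "grace@example.com"),
    ("Ada", "ada@example.com"),
    ("Linus", ""),
    ("Grace", "grace@example.com"),
]

cleaned = clean_rows(raw)
```

Whether to delete incomplete rows or fill in the missing values depends on the analysis, so a real project would make that decision deliberately rather than dropping data by default.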

Performing Analysis 

After thoroughly cleaning their data, data analysts are finally able to start analyzing it. They can do so using any of the tools outlined above, whether it’s a language like Python or an engine like Apache Spark. 

Interpreting Results 

Now it’s time for what’s arguably every data analyst’s most difficult task: interpreting the results of their analysis. This requires plenty of careful consideration, and may involve creating visualizations, identifying trends or forming recommendations. 

Want to Learn Data Analysis? You’re Already on Your Way 

If you’re thinking about becoming a data analyst, it may seem like a daunting task. But the truth is that you don’t have to be a master programmer or expert mathematician to do so. With the right data analysis tools and some helpful instruction, anyone can learn data analysis.

So whether you’re looking to start a new career or learn a new skill, you’ve already taken the first step of becoming a data analyst by learning what data analysis is and how it’s done.

Written By

Carrie Buchholz-Powers
