# Big Data 104: Making Predictions (3-day course)

### By Joaquin Roca · I help startups gracefully scale human systems

Big Data 104, the final course in the Big Data series, is a three-part class on using correlation and regression to build statistical models that make predictions. You will learn to turn the data you have already collected into powerful predictions, and we will use R to create and analyze the regression models.

I'm very grateful to SumAll for graciously hosting this class. We will cover statistics from both a theoretical and a practical perspective, and you will learn to use R to run correlations and create regressions. R is a very powerful and versatile open source tool that is available for free. Please come with R and RStudio loaded on your laptop. Don’t worry if you don’t have a laptop: you can follow along in class, and I’ll send you the command lines to look over at home.

##### Prerequisites

You should know how to calculate standard deviations and z-scores, which I cover in Big Data 101. Don’t let your fear of math stop you from being great at stats! All of the math you need for this course you learned by the 10th grade.
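
As a quick refresher on that prerequisite, here is a minimal R sketch of computing a standard deviation and z-scores. The `heights` vector is made-up illustration data, not course material, and the variable names are assumptions chosen for this example.

```r
# Hypothetical sample: heights in inches (made up for illustration)
heights <- c(62, 65, 68, 70, 75)

# Sample mean and sample standard deviation (sd() divides by n - 1)
m <- mean(heights)
s <- sd(heights)

# z-score: how many standard deviations each value sits from the mean
z <- (heights - m) / s

print(round(z, 2))
```

By construction, z-scores computed this way have a mean of 0 and a standard deviation of 1, which is a handy sanity check on your own data.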

##### Schedule
###### Correlation is not causation. But it is correlation.

While you can't infer causation from a correlation alone, a correlation is still a prerequisite for any causal claim. Knowing how strongly two things are related is a powerful bit of information. It is also the first step toward making predictions.

###### Statistical soothsaying.

How helpful would it be if you could tell the future (with a certain degree of certainty, that is)? Well, this class will let you do just that. We will look at regression models, which allow us to make statistical predictions. How cool is that?

###### And I just can't seem to get enough...

No, we will not be having a New Wave sing-along. Seriously, no. Okay, maybe at the end of class. But before that we will be finishing up regression. Like two-way ANOVA, learning regression is a two-class endeavor. If you have data, you might want to bring it in. Working with real data is awesome. Regression is awesome. Running regressions with real data... When I'm with you baby, I go out of my head. I just can't get enough, I just can't get enough.

Please don't blame me if you are singing that song three days from now.
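
To give a flavor of what the schedule above builds toward, here is a minimal R sketch, assuming a small made-up data set. The `ads` data frame and its `spend` and `sales` columns are hypothetical, not course data. The sketch uses base R's `cor()`, `lm()`, and `predict()`; the class itself may well approach these steps differently.

```r
# Hypothetical data: weekly ad spend and sales (made up for illustration)
ads <- data.frame(
  spend = c(10, 20, 30, 40, 50),
  sales = c(25, 41, 62, 79, 103)
)

# Correlation: how strongly are the two variables related? (Pearson's r)
r <- cor(ads$spend, ads$sales)

# Regression: fit a simple linear model of sales on spend
model <- lm(sales ~ spend, data = ads)
print(summary(model))  # coefficients, R-squared, p-values

# Prediction for a new spend value, with a 95% prediction interval
new_week <- data.frame(spend = 60)
pred <- predict(model, newdata = new_week,
                interval = "prediction", level = 0.95)
print(pred)
```

The prediction interval is the "certain degree of certainty" part: `predict()` returns a point estimate (`fit`) along with lower and upper bounds (`lwr`, `upr`). Note that 60 lies outside the range of the made-up spend values, so this prediction is an extrapolation and should be read with extra caution.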

### Joaquin Roca

I taught statistics to psychology majors at Hunter College for years, and I also taught a class on data-based consulting and applied research to MS students at Baruch. I have also worked at AMEX and Pfizer running statistical analyses for their training departments. If you have two groups of friends, I can absolutely tell you if one is (statistically) significantly taller than the other.

