Data Science and Machine Learning with Python - Hands On!
Frank Kane, Founder of Sundog Education, ex-Amazon


1. Introduction
2:44 
2. Windows Setup Instructions
10:43 
3. Mac Setup Instructions
8:17 
4. Linux Setup Instructions
9:11 
5. Please follow me on SkillShare!
0:16 
6. Python Basics, Part 1
4:59 
7. Python Basics, Part 2
5:17 
8. Python Basics, Part 3
2:46 
9. Python Basics, Part 4
4:02 
10. Intro to Pandas
10:08 
11. Types of Data
6:58 
12. Mean, Median, Mode
5:26 
13. Using mean, median, and mode in Python
8:20 
14. Variation and Standard Deviation
11:12 
15. Probability Density Function; Probability Mass Function
3:27 
16. Common Data Distributions
7:45 
17. Percentiles and Moments
12:32 
18. A Crash Course in matplotlib
13:46 
19. Data Visualization with Seaborn
17:30 
20. Covariance and Correlation
11:31 
21. Exercise: Conditional Probability
16:04 
22. Exercise Solution: Conditional Probability
2:20 
23. Bayes' Theorem
5:23 
24. Linear Regression
11:01 
25. Polynomial Regression
8:04 
26. Multiple Regression
11:26 
27. Multi-Level Models
4:36 
28. Supervised vs. Unsupervised Learning, Train / Test
8:57 
29. Using Train/Test to Prevent Overfitting
5:47 
30. Bayesian Methods: Concepts
3:59 
31. Implementing a Spam Classifier with Naive Bayes
8:05 
32. K-Means Clustering
7:23 
33. Clustering People by Income and Age
5:14 
34. Measuring Entropy
3:09 
35. Windows: Installing Graphviz
0:22 
36. Mac: Installing Graphviz
1:16 
37. Linux: Installing Graphviz
0:54 
38. Decision Trees: Concepts
8:43 
39. Decision Trees: Predicting Hiring Decisions
9:47 
40. Ensemble Learning
5:59 
41. Support Vector Machines (SVM) Overview
4:27 
42. Using SVM to Cluster People
9:29 
43. User-Based Collaborative Filtering
7:57 
44. Item-Based Collaborative Filtering
8:15 
45. Finding Movie Similarities
9:08 
46. Improving the Results of Movie Similarities
7:59 
47. Making Movie Recommendations to People
10:22 
48. Improving the Recommender's Results
5:29 
49. K-Nearest-Neighbors: Concepts
3:44 
50. Using KNN to Predict a Rating for a Movie
12:29 
51. Dimensionality Reduction; Principal Component Analysis
5:44 
52. PCA Example with the Iris Data Set
9:05 
53. Data Warehousing; ETL and ELT
9:05 
54. Reinforcement Learning
12:44 
55. Hands-On with Q-Learning
12:56 
56. Bias / Variance Tradeoff
6:15 
57. K-Fold Cross Validation
10:26 
58. Data Cleaning and Normalization
7:10 
59. Cleaning Web Log Data
10:56 
60. Normalizing Numerical Data
3:22 
61. Detecting Outliers
6:21 
62. Important Spark Installation Notes
5:00 
63. Installing Spark - Part 1
6:59 
64. Installing Spark - Part 2
7:20 
65. Spark Introduction
9:10 
66. Spark and the Resilient Distributed Dataset (RDD)
11:42 
67. Introducing MLlib
5:09 
68. Decision Trees in Spark
16:15 
69. K-Means Clustering in Spark
11:23 
70. TF / IDF
6:43 
71. Searching Wikipedia with Spark
8:21 
72. Using the Spark 2 DataFrame API for MLlib
8:07 
73. Deploying Models to Production
8:42 
74. A/B Testing Concepts
8:23 
75. T-Tests and P-Values
5:59 
76. Hands-On with T-Tests
6:03 
77. Determining How Long to Run an Experiment
3:24 
78. A/B Test Gotchas
9:26 
79. Where to Go From Here
2:59 
80. Let's Stay in Touch
0:46

About This Class
Data Scientists enjoy one of the top-paying jobs, with an average salary of $120,000 according to Glassdoor and Indeed. That's just the average! And it's not just about money; it's interesting work too!
If you've got some programming or scripting experience, this course will teach you the techniques used by real data scientists in the tech industry, and prepare you for a move into this hot career path. This comprehensive course includes 68 lectures spanning almost 9 hours of video, and most topics include hands-on Python code examples you can use for reference and for practice. I’ll draw on my 9 years of experience at Amazon and IMDb to guide you through what matters, and what doesn’t.
Each concept is introduced in plain English, avoiding confusing mathematical notation and jargon. It’s then demonstrated using Python code you can experiment with and build upon, along with notes you can keep for future reference. You won't find academic, deeply mathematical coverage of these algorithms in this course; the focus is on understanding them in practice and applying them.
The topics in this course come from an analysis of real requirements in data scientist job listings from the biggest tech employers. We'll cover the machine learning and data mining techniques real employers are looking for, including:
 Regression analysis
 K-Means Clustering
 Principal Component Analysis
 Train/Test and cross validation
 Bayesian Methods
 Decision Trees and Random Forests
 Multivariate Regression
 Multi-Level Models
 Support Vector Machines
 Reinforcement Learning
 Collaborative Filtering
 K-Nearest Neighbor
 Bias/Variance Tradeoff
 Ensemble Learning
 Term Frequency / Inverse Document Frequency
 Experimental Design and A/B Tests
...and much more! There's also an entire section on machine learning with Apache Spark, which lets you scale up these techniques to "big data" analyzed on a computing cluster.
If you're new to Python, don't worry: the course starts with a crash course. If you've done some programming before, you should pick it up quickly. This course shows you how to get set up on Microsoft Windows-based PCs; the sample code will also run on macOS or Linux desktop systems, but I can't provide OS-specific support for them.
If you’re a programmer looking to switch into an exciting new career track, or a data analyst looking to make the transition into the tech industry, this course will teach you the basic techniques used by real-world industry data scientists. I think you'll enjoy it!