11

--

My Answers to the Challenge Questions

NOTE: I'm having trouble (or Skillshare is) with formatting my text and tables - the Skillshare page layout was breaking and it was truncating my last table. And it doesn't seem to like double-spaces. So those triple slashes you see have been added to fix the problem. But the last table of data doesn't contain all the results.

///

Here are my answers to the challenge questions. I had a harder time using JOINs than I did using WHERE statements; WHERE/AND/OR make more logical sense to me, I guess, and I had a hard time stringing JOIN statements together like WHERE/AND (and I don't know if it's possible or not).

If anyone sees any errors or has suggestions on how to improve these, I'd love to get your feedback!

///

CHALLENGE QUESTION #1

Who were our best 10 customers in the last three months of the year?

I chose to use revenue and not number of orderes as my measure for "best customer." I made the assumption that "last the months of the year" meant last three months of last year, or October 1 - December 31, 2012.

Query:

SELECT customer_name, SUM(movie_price) AS revenue

FROM order_data AS od, customer_data AS cd, movie_data AS md

WHERE od.customer_id = cd.customer_id

AND od.movie_id = md.movie_id

AND datetime > '2012-10-01'

GROUP BY customer_name

ORDER BY revenue DESC

LIMIT 0, 10;

Data:

customer_name

revenue

George Holm

26.89

Larry Kozikowski

13.9

Amish Haffner

10.94

Sarah Curran

10.93

Natasha Simpson

9.97

Jan Smith

9.95

Jeff Southern

9.94

Rick Haffner

9.93

Lindsey Prentice

9.93

Rebecca Davis

9.91

///

CHALLENGE QUESTION #2

What is the worst-selling movie on our website?

Logically, this one was pretty straight-forward.

Query:

SELECT movie_title, COUNT(*) as sales

FROM order_data AS od, movie_data AS md

WHERE od.movie_id = md.movie_id

GROUP BY movie_title

ORDER BY sales ASC

LIMIT 0,1;

Data:

movie_title

sales

Patton

32

 ///

CHALLENGE QUESTION #3

Summarize the total number of purchases by price; do you think price makes a difference in order behavior?

Clearly, price does have an influence on purchase behavior - in particular at the upper and lower bounds.

Query:

SELECT movie_price, COUNT(*) as orders

FROM movie_data AS md, order_data AS od

WHERE od.movie_id = md.movie_id

GROUP BY movie_price

ORDER BY orders DESC;

Data:

movie_price

orders

0.99

2238

1.99

1399

3.99

1044

5.99

319

///

CHALLENGE QUESTION #4

Summarize sales on our website by hour of the day, i.e., $X of movies were sold between 12 and 1AM, $Y between 1 and 2AM, and so on.

I thought it was interesting that there was very little variance based on time of day. Depending on whether your customers were more local or international, I would guess you would see more variance with real data. Also, I didn't recall HOUR(datetime) being covered in the class, but it worked similarly (as one would expect) as MONTH(datetime).

Query:

SELECT HOUR(datetime), SUM(movie_price)

FROM movie_data AS md, order_data AS od

WHERE md.movie_id = od.movie_id

GROUP BY HOUR(datetime);

Data:

HOUR(datetime)

SUM(movie_price)

0

518.800000000001

1

508.610000000001

2

466.930000000001

3

473.960000000001

4

335.140000000001

5

450.020000000001

6

505.640000000001

7

438.970000000001

8

481.830000000001

9

424.020000000001

10

448.080000000001

11

400.080000000001

12

511.800000000001

13

472.870000000001

14

473.870000000001

15

382.130000000001

16

435.900000000001

17

452.930000000001

18

462.020000000001

19

510.740000000001

20

454.930000000001

21

472.020000000001

22

488.850000000001

23

505.860000000001

///

CHALLENGE QUESTION #5

What is the average price of a movie order across each month of the year?

I first ran a query grouped by YEAR(datetime) to see how many years were in the database. I'm not really sure why people are purchasing more expensive movies in Q1 of each year.

Query:

SELECT YEAR(datetime), MONTH(datetime), AVG(movie_price)

FROM movie_data AS md, order_data AS od

WHERE md.movie_id = od.movie_id

GROUP BY YEAR(datetime), MONTH(datetime);

Data:

YEAR(datetime)

MONTH(datetime)

AVG(movie_price)

2011

1

5.36950138504156

2011

2

3.36293729372938

2011

3

3.02804347826088

2011

4

2.79836236933799

2011

5

1.40042345276874

2011

6

1.94789473684211

2011

7

1.83527220630373

2011

8

1.10309523809524

2011

9

0.990000000000004

2011

10

1.07881578947369

2011

11

1.14037593984963

2011

12

1.92910256410257

2012

1

5.41857142857143

2012

2

3.28166666666667

2012

3

2.92548387096775

2012

4

2.82146067415731

2012

5

1.43943820224719

2012

6

1.87571428571429

2012

7

1.9

2012

8

1.07653846153846

2012

9

0.989999999999999

2012

10

1.06692307692308

2012

11

1.24471698113207

2012

12

2.01941176470589

///

CHALLENGE QUESTION #6

How many times was each individual movie ordered in each month of each year? I.e., Amadeus was ordered twice in 1/2011, three times in 2/2011, etc.

I *think* this solution is correct. If anyone has a better solution, please let me know!

Query:

SELECT movie_title, YEAR(datetime), MONTH(datetime), COUNT(*)

FROM order_data AS od, movie_data AS md

WHERE od.movie_id = md.movie_id

GROUP BY movie_title, YEAR(datetime), MONTH(datetime)

ORDER BY movie_title;

Data:

movie_title

YEAR(datetime)

MONTH(datetime)

COUNT(*)

A Beautiful Mind

2011

5

120

A Beautiful Mind

2012

5

36

Amadeus

2011

10

73

Amadeus

2012

10

24

American Beauty

2011

5

69

American Beauty

2011

6

30

American Beauty

2011

7

1

American Beauty

2012

5

15

American Beauty

2012

6

16

Annie Hall

2011

11

3

Annie Hall

2011

12

36

Annie Hall

2012

11

8

Annie Hall

2012

12

17

Braveheart

2011

7

92

Braveheart

2012

7

29

Chariots of Fire

2011

11

61

Chariots of Fire

2011

12

2

Chariots of Fire

2012

11

14

Chariots of Fire

2012

12

1

Chicago

2011

4

108

Chicago

2011

5

3

Chicago

2012

4

32

Chicago

2012

5

1

Crash

2011

3

104

Crash

2012

3

41

Dances with Wolves

2011

8

108

Dances with Wolves

2012

8

31

Driving Miss Daisy

2011

8

7

Driving Miss Daisy

2011

9

86

Driving Miss Daisy

2011

10

3

Driving Miss Daisy

2012

8

2

Driving Miss Daisy

2012

9

28

Forrest Gump

2011

7

117

Forrest Gump

2012

7

35

Gandhi

2011

10

27

Gandhi

2011

11

37

Gandhi

2011

12

2

Gandhi

2012

10

8

Gandhi

2012

11

19

Gladiator

2011

5

109

Gladiator

2012

5

36

Kramer vs. Kramer

2011

11

52

Kramer vs. Kramer

2011

12

2

Kramer vs. Kramer

2012

11

25

Kramer vs. Kramer

2012

12

2

LOTR: Return of the King

2011

4

116

LOTR: Return of the King

2011

5

4

LOTR: Return of the King

2012

4

37

LOTR: Return of the King

2012

5

1

Million Dollar Baby

2011

3

49

Million Dollar Baby

2011

4

63

Million Dollar Baby

2011

5

2

Million Dollar Baby

2012

3

25

Million Dollar Baby

2012

4

20

No Country For Old Men

2011

2

95

No Country For Old Men

2011

3

24

No Country For Old Men

2012

2

34

One Flew Over the Cuckoo's Nest

2011

12

48

One Flew Over the Cuckoo's Nest

2012

12

24

Ordinary People

2011

11

60

Ordinary People

2012

11

23

Out of Africa

2011

10

77

Out of Africa

2012

10

27

Patton

2011

12

23

Patton

2012

12

9

Platoon

2011

9

28

Platoon

2011

10

44

Platoon

2012

9

10

Platoon

2012

10

21

Rain Man

2011

9

83

Rain Man

2011

10

6

Rain Man

2012

9

26

Rain Man

2012

10

1

Rocky

2011

12

45

Rocky

2012

12

24

Schindlers List

2011

7

83

Schindlers List

2011

8

38

Schindlers List

2012

7

24

Schindlers List

2012

8

9

Shakespeare in Love

2011

6

119

Shakespeare in Love

2011

7

4

Shakespeare in Love

2012

6

40

Slumdog Millionaire

2011

2

125

Slumdog Millionaire

2011

3

6

Slumdog Millionaire

2012

2

34

Slumdog Millionaire

2012

3

1

Terms of Endearment

2011

10

72

Terms of Endearment

2012

10

22

The Artist

2011

1

126

The Artist

2012

1

41

The Deer Hunter

2011

11

53

The Deer Hunter

2011

12

2

The Deer Hunter

2012

11

17

The Deer Hunter

2012

12

1

The Departed

2011

3

172

The Departed

2012

3

55

The English Patient

2011

6

45

The English Patient

2011

7

51

The English Patient

2012

6

18

The English Patient

2012

7

11

///

CHALLENGE QUESTION #7

Show off! What is the most interesting insight you can draw from the database?

I thought it would be interesting to see which customers purchased the higher priced movies and which purchased only the lower priced movies. In the real world, you could use this data to filter recommendations by price, or suggest higher-priced movies to customers who were less price sensitive. Unfortunately, the data shows that customers pretty much behave the same when it comes to pricing. My guess is that there would be more variance in a real data set of customer behavior.

Here was my challenge: Summarize the # of orders by price for each customer

You could throw this data in a pivot table and get a better summary.

Query:

SELECT customer_name, movie_price, COUNT(movie_title)

FROM order_data AS od, movie_data AS md, customer_data AS cd

WHERE md.movie_id = od.movie_id

AND cd.customer_id = od.customer_id

GROUP BY customer_name, movie_price

ORDER BY customer_name, movie_price;

Data:

Note: This is just a sample of the data, because Skillshare is choking on the entire table...

customer_name

movie_price

COUNT(movie_title)

Alicia Fisher

0.99

28

Alicia Fisher

1.99

16

Alicia Fisher

3.99

11

Alicia Fisher

5.99

9

Alicia Smith

0.99

14

Alicia Smith

1.99

17

Alicia Smith

3.99

6

Alicia Smith

5.99

7

Alicia Weller

0.99

25

Alicia Weller

1.99

12

Alicia Weller

3.99

13

Alicia Weller

5.99

1

Amish Fisher

0.99

35

Amish Fisher

1.99

19

Amish Fisher

3.99

17

Amish Fisher

5.99

5

Amish Haffner

0.99

19

Amish Haffner

1.99

15

Amish Haffner

3.99

14

Amish Haffner

5.99

1

Amish Heitzmann

0.99

21

Amish Heitzmann

1.99

9

Amish Heitzmann

3.99

13

Amish Heitzmann

5.99

4

Amish Prentice

0.99

21

Amish Prentice

1.99

12

Amish Prentice

3.99

9

Amish Prentice

5.99

3

Carol Haffner

0.99

26

Carol Haffner

1.99

16

Carol Haffner

3.99

8

Carol Haffner

5.99

5

Carol Holm

0.99

20

Carol Holm

1.99

16

Carol Holm

3.99

16

Carol Holm

5.99

4

Chris Chen

0.99

22

Chris Chen

1.99

14

Chris Chen

3.99

6

Chris Chen

5.99

2

///

///

Comments

Please sign in or sign up to comment.