Course: CSCI-866-001: Data Mining & Knowledge Discovery
Lecturer: Sothea HAS, PhD
Objective: This lab aims to reproduce what we saw in class about the overall patterns and connections that exist in the Amazon dataset. You will also investigate some interesting aspects of this dataset.
1. Amazon dataset
Let’s begin by importing the dataset into our environment.
0 | A39HTATAQ9V7YF | 0205616461 | 5.0 | 1369699200
1 | A3JM6GV9MNOF9X | 0558925278 | 3.0 | 1355443200
2 | A1Z513UWSAAO0F | 0558925278 | 5.0 | 1404691200
3 | A1WMRR494NWEWV | 0733001998 | 4.0 | 1382572800
4 | A3IAAVS479H7M7 | 0737104473 | 1.0 | 1274227200
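To make the structure of these rows concrete, here is a minimal sketch that turns the five sample rows into records and converts the last field to a calendar date. The field names (user_id, product_id, rating, timestamp) are assumptions, since the preview shows no column headers, and reading the last field as Unix seconds is also an assumption, although the values are consistent with it.

```python
from datetime import datetime, timezone

# Field names are assumptions; the preview above has no header row.
FIELDS = ("user_id", "product_id", "rating", "timestamp")

# The five rows from the preview (row index omitted).
SAMPLE_ROWS = [
    ("A39HTATAQ9V7YF", "0205616461", 5.0, 1369699200),
    ("A3JM6GV9MNOF9X", "0558925278", 3.0, 1355443200),
    ("A1Z513UWSAAO0F", "0558925278", 5.0, 1404691200),
    ("A1WMRR494NWEWV", "0733001998", 4.0, 1382572800),
    ("A3IAAVS479H7M7", "0737104473", 1.0, 1274227200),
]


def to_record(row):
    """Map one row to a dict and add a UTC date derived from the timestamp."""
    rec = dict(zip(FIELDS, row))
    # Assumption: the timestamp counts seconds since 1970-01-01 UTC.
    rec["date"] = datetime.fromtimestamp(rec["timestamp"], tz=timezone.utc).date()
    return rec


records = [to_record(row) for row in SAMPLE_ROWS]
for rec in records:
    print(rec["product_id"], rec["rating"], rec["date"])
```

With the full dataset, the same conversion would typically be done once on the whole timestamp column right after loading.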
A. Reproduce the result and graph shown on slide 26 of the course.
B. Product Analysis
B.1. Visualize the rating distribution for the most popular product.
B.2. Repeat the previous question for the 2nd and 3rd most popular products.
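One possible way to approach part B is sketched below, assuming that "most popular" means "has the most ratings". The toy data and the helper names top_products and rating_distribution are hypothetical, and the text bars only stand in for a real bar chart drawn with a plotting library.

```python
from collections import Counter

# Toy (product_id, rating) pairs invented for illustration; not real data.
ratings = [
    ("P1", 5.0), ("P1", 4.0), ("P1", 5.0), ("P1", 1.0),
    ("P2", 3.0), ("P2", 3.0), ("P2", 5.0),
    ("P3", 2.0), ("P3", 4.0),
    ("P4", 5.0),
]


def top_products(pairs, k):
    """Return the k product ids with the most ratings.

    Assumption: popularity is measured by the number of ratings.
    """
    counts = Counter(pid for pid, _ in pairs)
    return [pid for pid, _ in counts.most_common(k)]


def rating_distribution(pairs, product_id):
    """Count how often each rating value occurs for one product."""
    dist = Counter(r for pid, r in pairs if pid == product_id)
    return dict(sorted(dist.items()))


# Print a small text histogram for the three most-rated products.
for rank, pid in enumerate(top_products(ratings, 3), start=1):
    print(f"#{rank} {pid}")
    for value, n in rating_distribution(ratings, pid).items():
        print(f"  {value:.1f} {'#' * n} ({n})")
```

The per-product dictionaries map each rating value to its count, so they can be passed directly to a bar-plot function for B.1 and B.2.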
C. Time Evolution
C.1. Visualize the rating distribution over time.
C.2. Do recent products appear to be rated higher than older ones?
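For part C, one simple starting point is to group ratings by review year and compare the yearly averages. The sketch below uses the (rating, timestamp) pairs from the five preview rows and assumes the timestamps are Unix seconds; the function name mean_rating_by_year is hypothetical. Five rows are far too few to support any conclusion, so this only illustrates the grouping step.

```python
from collections import defaultdict
from datetime import datetime, timezone

# (rating, timestamp) pairs taken from the five preview rows.
SAMPLE = [
    (5.0, 1369699200),
    (3.0, 1355443200),
    (5.0, 1404691200),
    (4.0, 1382572800),
    (1.0, 1274227200),
]


def mean_rating_by_year(pairs):
    """Average the ratings within each calendar year (UTC).

    Assumption: timestamps count seconds since 1970-01-01 UTC.
    """
    by_year = defaultdict(list)
    for rating, ts in pairs:
        year = datetime.fromtimestamp(ts, tz=timezone.utc).year
        by_year[year].append(rating)
    return {year: sum(vals) / len(vals) for year, vals in sorted(by_year.items())}


for year, mean in mean_rating_by_year(SAMPLE).items():
    print(year, round(mean, 2))
```

This groups by the date of each rating. To compare recent and older products specifically, a product's age could instead be approximated by its earliest rating timestamp, which is a modeling choice rather than something the dataset states.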