The Auto-MPG dataset contains the specifications of various cars and is available on Kaggle. For more, read here. The data can be downloaded as follows:
import kagglehub  # To load the data
import pandas as pd  # To handle and manipulate the data

# Download latest version
path = kagglehub.dataset_download("uciml/autompg-dataset")
auto = pd.read_csv(path + '/auto-mpg.csv')
auto.head()
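|   | mpg  | cylinders | displacement | horsepower | weight | acceleration | model year | origin | car name                  |
|---|------|-----------|--------------|------------|--------|--------------|------------|--------|---------------------------|
| 0 | 18.0 | 8         | 307.0        | 130        | 3504   | 12.0         | 70         | 1      | chevrolet chevelle malibu |
| 1 | 15.0 | 8         | 350.0        | 165        | 3693   | 11.5         | 70         | 1      | buick skylark 320         |
| 2 | 18.0 | 8         | 318.0        | 150        | 3436   | 11.0         | 70         | 1      | plymouth satellite        |
| 3 | 16.0 | 8         | 304.0        | 150        | 3433   | 12.0         | 70         | 1      | amc rebel sst             |
| 4 | 17.0 | 8         | 302.0        | 140        | 3449   | 10.5         | 70         | 1      | ford torino               |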
A. Do all the columns have the correct data type?
# To do
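One possible starting point is to list the dtype of every column and flag the ones pandas did not parse as numbers. The sketch below does not download anything: `sample` is a stand-in for `auto` built from the five rows shown above, and its horsepower values are stored as strings on purpose, as an assumption that mimics a numeric column being read as text.

```python
import pandas as pd

# Stand-in for `auto`: the five rows shown above. horsepower is kept as
# strings on purpose (an assumption mimicking a column read as text).
sample = pd.DataFrame({
    "mpg": [18.0, 15.0, 18.0, 16.0, 17.0],
    "cylinders": [8, 8, 8, 8, 8],
    "displacement": [307.0, 350.0, 318.0, 304.0, 302.0],
    "horsepower": ["130", "165", "150", "150", "140"],
    "weight": [3504, 3693, 3436, 3433, 3449],
    "acceleration": [12.0, 11.5, 11.0, 12.0, 10.5],
    "model year": [70, 70, 70, 70, 70],
    "origin": [1, 1, 1, 1, 1],
    "car name": ["chevrolet chevelle malibu", "buick skylark 320",
                 "plymouth satellite", "amc rebel sst", "ford torino"],
})

# One dtype per column.
print(sample.dtypes)

# Columns that pandas did not parse as numbers deserve a closer look.
text_columns = sample.select_dtypes(exclude="number").columns.tolist()
print(text_columns)
```

On the real data, the same two calls can be run on `auto` directly. A text dtype is expected for car name, while a text dtype on a measurement column is a hint that something in it is not a number.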
B. If any columns are wrongly encoded, convert them to a suitable data type.
# To do
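Which columns count as wrongly encoded is a judgment call. As a rough sketch, the example below assumes that origin is a code for a group rather than a quantity, so it is cast to a categorical type, and that car name is best kept as text. The frame `sample` is a hypothetical miniature of `auto`, not the full dataset.

```python
import pandas as pd

# Hypothetical miniature of `auto` with only the columns touched here.
sample = pd.DataFrame({
    "origin": [1, 1, 1, 1, 1],
    "car name": ["chevrolet chevelle malibu", "buick skylark 320",
                 "plymouth satellite", "amc rebel sst", "ford torino"],
})

# Cast several columns at once by passing a column -> dtype mapping.
converted = sample.astype({"origin": "category", "car name": "string"})
print(converted.dtypes)
```

`astype` returns a new frame, so the result has to be assigned back (for example to `auto`) for the change to persist.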
C. What seems to be the problem with horsepower? Solve the problem carefully.
# To do
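A common reason for a numeric column to load as text is a placeholder used for missing values. The sketch below assumes the placeholder is "?"; it is worth confirming this on your own copy of the data before relying on it. The series `horsepower` is hypothetical, and the two options at the end are alternatives, not a recommendation.

```python
import pandas as pd

# Hypothetical horsepower values; the "?" entry is an assumed placeholder
# for a missing measurement, not a row copied from the dataset.
horsepower = pd.Series(["130", "165", "?", "150", "140"], name="horsepower")

# Parse what can be parsed; anything else becomes NaN instead of raising.
parsed = pd.to_numeric(horsepower, errors="coerce")

# Inspect the raw entries that failed to parse before changing anything.
bad_values = horsepower[parsed.isna()].unique().tolist()
print(bad_values)

# Option 1: keep the rows and fill the gaps with the column median.
filled = parsed.fillna(parsed.median())

# Option 2: drop the rows whose horsepower is unknown.
dropped = parsed.dropna()
```

Looking at `bad_values` first is the careful part: it shows exactly which entries will be turned into missing values, so nothing is silently discarded. Whether to fill or drop depends on how many rows are affected and on what the data will be used for.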
D. Make sure all the columns now have the correct data type.
# To do
E. Are there any duplicated rows? How would you handle them if there are any?
# To do
F. Compute descriptive statistics of the data and visualize their distributions.
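The last two questions, on duplicates and on descriptive statistics, can be sketched together. The frame `sample` below is a hypothetical miniature of `auto` in which the last row deliberately repeats the first, so that the duplicate check has something to find; the real data may contain no duplicates at all.

```python
import pandas as pd

# Hypothetical miniature of `auto`; the last row repeats the first on
# purpose so the duplicate check has something to find.
sample = pd.DataFrame({
    "mpg": [18.0, 15.0, 18.0, 16.0, 18.0],
    "weight": [3504, 3693, 3436, 3433, 3504],
    "car name": ["chevrolet chevelle malibu", "buick skylark 320",
                 "plymouth satellite", "amc rebel sst",
                 "chevrolet chevelle malibu"],
})

# Count rows that are exact copies of an earlier row.
n_duplicates = int(sample.duplicated().sum())
print(n_duplicates)

# Keep the first occurrence of each row and renumber the index.
deduplicated = sample.drop_duplicates().reset_index(drop=True)

# Summary statistics (count, mean, std, quartiles) for numeric columns.
summary = deduplicated.describe()
print(summary)
```

For the visual part, `deduplicated.hist(figsize=(10, 8))` draws one histogram per numeric column in a notebook; this call needs matplotlib to be installed.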