TP5 - Linear Regression

Course: EDA & Unsupervised Learning
M1-DAS
Lecturer: HAS Sothea, PhD


Objective: In this TP, you will learn how to implement SLR, MLR and enhance your practical skills in pushing the performance of your models on new unseen data using techniques introduced in the course.

The Jupyter Notebook for this Lab can be downloaded here: TP5_Linear_Regression.ipynb.


1. Importing Abalone dataset

You need internet to load the data by running the following codes. For more information about this data, read Abalone dataset.

# %pip install kagglehub   # if you have not installed "kagglehub" module yet
import kagglehub

# Download latest version
path = kagglehub.dataset_download("rodolfomendes/abalone-dataset")

# Import data
import pandas as pd
data = pd.read_csv(path + "/abalone.csv")
data.head()
Warning: Looks like you're using an outdated `kagglehub` version (installed: 0.3.6), please consider upgrading to the latest version (0.3.7).
Sex Length Diameter Height Whole weight Shucked weight Viscera weight Shell weight Rings
0 M 0.455 0.365 0.095 0.5140 0.2245 0.1010 0.150 15
1 M 0.350 0.265 0.090 0.2255 0.0995 0.0485 0.070 7
2 F 0.530 0.420 0.135 0.6770 0.2565 0.1415 0.210 9
3 M 0.440 0.365 0.125 0.5160 0.2155 0.1140 0.155 10
4 I 0.330 0.255 0.080 0.2050 0.0895 0.0395 0.055 7

A. What’s the dimension of this dataset? How many quantitative and qualitative variables are there in this dataset?

print(f'dimension of the data: {data.shape}')
data.dtypes
dimension of the data: (4177, 9)
Sex                object
Length            float64
Diameter          float64
Height            float64
Whole weight      float64
Shucked weight    float64
Viscera weight    float64
Shell weight      float64
Rings               int64
dtype: object
print(f'Number of qualitative columns: {len(data.select_dtypes(exclude="number").columns)}')
print(f'Number of quantitative columns: {len(data.select_dtypes(include="number").columns)}')
Number of qualitative columns: 1
Number of quantitative columns: 8

B. Perform univariate analysis (compute statistical values and plot the distribution) on each variables of your dataset. The goal is to understand your data (the scale and how each column is distributed), detect missing values and outliers, etc.

  • Qualitative columns
data.select_dtypes(exclude='number').value_counts(normalize=True) * 100
Sex
M      36.581278
I      32.128322
F      31.290400
Name: proportion, dtype: float64
import seaborn as sns                 # For plotting
import matplotlib.pyplot as plt       # control panel for graphic
sns.set(style="whitegrid")
_, ax  = plt.subplots(1,1, figsize=(5,3))
sns.countplot(data, x="Sex", ax = ax)
ax.bar_label(ax.containers[0])
ax.set_title("Variable Sex")
plt.show()

  • Quantitative columns
data.select_dtypes(include='number').describe().drop(index=['count'])
Length Diameter Height Whole weight Shucked weight Viscera weight Shell weight Rings
mean 0.523992 0.407881 0.139516 0.828742 0.359367 0.180594 0.238831 9.933684
std 0.120093 0.099240 0.041827 0.490389 0.221963 0.109614 0.139203 3.224169
min 0.075000 0.055000 0.000000 0.002000 0.001000 0.000500 0.001500 1.000000
25% 0.450000 0.350000 0.115000 0.441500 0.186000 0.093500 0.130000 8.000000
50% 0.545000 0.425000 0.140000 0.799500 0.336000 0.171000 0.234000 9.000000
75% 0.615000 0.480000 0.165000 1.153000 0.502000 0.253000 0.329000 11.000000
max 0.815000 0.650000 1.130000 2.825500 1.488000 0.760000 1.005000 29.000000
  • We will remove abalone with 0 heights.
data.query('Height > 0', inplace=True)
data.describe()
Length Diameter Height Whole weight Shucked weight Viscera weight Shell weight Rings
count 4175.000000 4175.00000 4175.000000 4175.000000 4175.000000 4175.000000 4175.000000 4175.000000
mean 0.524065 0.40794 0.139583 0.829005 0.359476 0.180653 0.238834 9.935090
std 0.120069 0.09922 0.041725 0.490349 0.221954 0.109605 0.139212 3.224227
min 0.075000 0.05500 0.010000 0.002000 0.001000 0.000500 0.001500 1.000000
25% 0.450000 0.35000 0.115000 0.442250 0.186250 0.093500 0.130000 8.000000
50% 0.545000 0.42500 0.140000 0.800000 0.336000 0.171000 0.234000 9.000000
75% 0.615000 0.48000 0.165000 1.153500 0.502000 0.253000 0.328750 11.000000
max 0.815000 0.65000 1.130000 2.825500 1.488000 0.760000 1.005000 29.000000
fig, axs = plt.subplots(2, 4, figsize=(15,5))
for i, va in enumerate(data.select_dtypes(include="number").columns):
    sns.histplot(data, x=va, ax=axs[i//4, i%4], kde=True)
    axs[i//4, i%4].set_title(va)

plt.tight_layout()
plt.show()

Note: Most abalones are around 0.001 to 0.25 in height except for a few abalones that are much taller in height than others. THose can be considered outliers.

2. Correlation matrix, SLR and MLR

A. Compute the correlation matrix of this data using pd.corr() function. Describe what you observed from this correlation matrix.

data.select_dtypes(include='number').corr()
Length Diameter Height Whole weight Shucked weight Viscera weight Shell weight Rings
Length 1.000000 0.986802 0.828108 0.925217 0.897859 0.902960 0.898419 0.556464
Diameter 0.986802 1.000000 0.834298 0.925414 0.893108 0.899672 0.906084 0.574418
Height 0.828108 0.834298 1.000000 0.819886 0.775621 0.798908 0.819596 0.557625
Whole weight 0.925217 0.925414 0.819886 1.000000 0.969389 0.966354 0.955924 0.540151
Shucked weight 0.897859 0.893108 0.775621 0.969389 1.000000 0.931924 0.883129 0.420597
Viscera weight 0.902960 0.899672 0.798908 0.966354 0.931924 1.000000 0.908186 0.503562
Shell weight 0.898419 0.906084 0.819596 0.955924 0.883129 0.908186 1.000000 0.627928
Rings 0.556464 0.574418 0.557625 0.540151 0.420597 0.503562 0.627928 1.000000
data.select_dtypes(include='number').corr(method="spearman")
Length Diameter Height Whole weight Shucked weight Viscera weight Shell weight Rings
Length 1.000000 0.983299 0.888124 0.972600 0.956790 0.952600 0.948662 0.603954
Diameter 0.983299 1.000000 0.895651 0.971290 0.950422 0.948328 0.954896 0.622487
Height 0.888124 0.895651 1.000000 0.915963 0.874163 0.900535 0.922130 0.657403
Whole weight 0.972600 0.971290 0.915963 1.000000 0.977038 0.975222 0.970197 0.630440
Shucked weight 0.956790 0.950422 0.874163 0.977038 1.000000 0.947581 0.918465 0.538956
Viscera weight 0.952600 0.948328 0.900535 0.975222 0.947581 1.000000 0.938893 0.613930
Shell weight 0.948662 0.954896 0.922130 0.970197 0.918465 0.938893 1.000000 0.692991
Rings 0.603954 0.622487 0.657403 0.630440 0.538956 0.613930 0.692991 1.000000

B. Draw the pairs of scatterplots of the target Rings vs the all quantitative inputs.

  • Comment your findings and handle the problems if there is any in the graph.
sns.pairplot(data=data, hue='Sex')

There seems to be outliers in Height, therefore we should remove them.

data.query("Height < 0.5", inplace=True)
sns.pairplot(data, hue="Sex")

data.select_dtypes(include="number").corr()
Length Diameter Height Whole weight Shucked weight Viscera weight Shell weight Rings
Length 1.000000 0.986794 0.900868 0.925328 0.898129 0.903033 0.898363 0.556572
Diameter 0.986794 1.000000 0.907187 0.925499 0.893330 0.899716 0.906026 0.574551
Height 0.900868 0.907187 1.000000 0.888850 0.837485 0.866757 0.891857 0.610107
Whole weight 0.925328 0.925499 0.888850 1.000000 0.969370 0.966290 0.955954 0.540621
Shucked weight 0.898129 0.893330 0.837485 0.969370 1.000000 0.931831 0.883194 0.421156
Viscera weight 0.903033 0.899716 0.866757 0.966290 0.931831 1.000000 0.908133 0.503977
Shell weight 0.898363 0.906026 0.891857 0.955954 0.883194 0.908133 1.000000 0.628169
Rings 0.556572 0.574551 0.610107 0.540621 0.421156 0.503977 0.628169 1.000000
  • Here, I split the data into two parts. You are allowed to use only the training part for building models. The testing part will be used to evaluate the models’ performance.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
                                        data.iloc[:,:8], 
                                        data['Rings'], 
                                        test_size=0.2, 
                                        random_state=42)
X_train = pd.concat([pd.get_dummies(X_train[['Sex']], drop_first=True), X_train.iloc[:,1:]], axis=1)
X_test = pd.concat([pd.get_dummies(X_test[['Sex']], drop_first=True), X_test.iloc[:,1:]], axis=1)

C. Fit a SLR model using the most correlated input to predict the Rings of abalone.

  • Draw the fitted line on the previous scatterplot.
  • Compute \(R^2\) and explain the observed value.
  • Compute Mean Suared Error on the test data (Test MSE).
  • Analyze the residuals and conclude.
from sklearn.linear_model import LinearRegression as LR
slr = LR()
slr = slr.fit(X_train[['Shell weight']], y_train)
y_pred = slr.predict(X_test[['Shell weight']])

ax = sns.scatterplot(data, x="Shell weight", y = "Rings")
plt.plot(X_test['Shell weight'], y_pred, c='r')
ax.set_title("Fitted line")
plt.show()

from sklearn.metrics import mean_squared_error, r2_score
df_mse = pd.DataFrame(
    {'SLR': [
        mean_squared_error(y_test, y_pred), 
        r2_score(y_test, y_pred), 
        'NA']},
     index=['Test MSE', 'R-square', 'Adj-R-Squared'])
df_mse
SLR
Test MSE 6.960663
R-square 0.419115
Adj-R-Squared NA
y_pred_tr = slr.predict(X_train[['Shell weight']])
res0 = y_train - y_pred_tr

df_slr = pd.DataFrame(
    {
        'Fitted': y_pred_tr,
        'Residual': res0,
        'Actual': y_train
    }
)

_, axs = plt.subplots(1, 3, figsize=(12, 4))
sns.scatterplot(df_slr, x="Actual", y="Fitted", ax=axs[0])
axs[0].plot(df_slr['Actual'], df_slr['Actual'], 'r')
axs[0].set_title("Actual vs Prediction")

sns.scatterplot(df_slr, x="Fitted", y="Residual", ax=axs[1])
axs[1].set_title("Residual vs Prediction")
axs[1].plot(df_slr['Actual'], [0]*len(df_slr['Actual']), 'r')

sns.histplot(df_slr, x="Residual", ax=axs[2], kde=True)
axs[2].set_title("Residual distribution")
plt.show()

D. Fit MLR using all inputs. Compute:

  • Compute \(R^2\), adjusted \(R_{\text{adj}}^2\) and explain the observed value.
  • Compute Mean Suared Error on the test data (Test MSE).
  • Analyze the residuals and conclude.
mlr = LR().fit(X_train, y_train)
y_pred_mlr = mlr.predict(X_test)

def adjusted_r2(y1, y2, d):
    return 1-(len(y1)-1)/(len(y1)-d-1)*(1-r2_score(y1,y2))

print(f'R-squared: {r2_score(y_test, y_pred_mlr)}')
print(f'R-squared: {adjusted_r2(y_test, y_pred_mlr, X_test.shape[1])}')
print(f'Test MSE: {mean_squared_error(y_test, y_pred_mlr)}')

df_mse = pd.concat([df_mse, pd.DataFrame(
    {'MLR': [
        mean_squared_error(y_test, y_pred_mlr), 
        r2_score(y_test, y_pred_mlr), 
        adjusted_r2(y_test, y_pred_mlr, X_test.shape[1])]},
     index=['Test MSE', 'R-square', 'Adj-R-Squared'])], axis=1)
df_mse
R-squared: 0.5852978270364873
R-squared: 0.5807738033314307
Test MSE: 4.969320719101582
SLR MLR
Test MSE 6.960663 4.969321
R-square 0.419115 0.585298
Adj-R-Squared NA 0.580774
y_pred_tr = mlr.predict(X_train)
res1 = y_train - y_pred_tr

df_mlr = pd.DataFrame(
    {
        'Fitted': y_pred_tr,
        'Residual': res1,
        'Actual': y_train
    }
)

_, axs = plt.subplots(1, 3, figsize=(12, 4))
sns.scatterplot(df_mlr, x="Actual", y="Fitted", ax=axs[0])
axs[0].plot(df_mlr['Actual'], df_mlr['Actual'], 'r')
axs[0].set_title("Actual vs Prediction")

sns.scatterplot(df_mlr, x="Fitted", y="Residual", ax=axs[1])
axs[1].set_title("Residual vs Prediction")
axs[1].plot(df_mlr['Actual'], [0]*len(df_mlr['Actual']), 'r')

sns.histplot(df_mlr, x="Residual", ax=axs[2], kde=True)
axs[2].set_title("Residual distribution")
plt.show()

3. Polynomial Regression and Regularized Linear Models

A. Build polynomial regression with different degree \(n\in\{2,3,...,10\}\) of the best correlated input to predict Rings. Compute Test MSE for each case.

  • Perform \(10\)-fold Cross-validation to select the best degree \(n\). Hint: see slide 47.
from sklearn.preprocessing import PolynomialFeatures

degrees = list(range(2, 11))
preds = []
preds_test = []
poly_dim = []
mse = []
for deg in degrees:
    poly = PolynomialFeatures(degree=deg)
    X_poly = poly.fit_transform(X_train[['Shell weight']])
    poly_dim.append(X_poly.shape[1])
    model = LR().fit(X_poly, y_train)
    preds.append(model.predict(X_poly))
    X_poly_test = poly.transform(X_test[['Shell weight']])
    y_test_pred = model.predict(X_poly_test)
    preds_test.append(y_test_pred)
    mse.append(mean_squared_error(y_test, y_test_pred))

df_poly = pd.DataFrame(
    {
        'Degree' : degrees,
        'Test MSE': mse
    }
)

ax = sns.lineplot(df_poly, x="Degree", y="Test MSE")
ax.set_title("Test MSE vs Degree")
plt.show()

import numpy as np
ax = sns.scatterplot(x=X_train['Shell weight'], y=y_train)
ax.set_title('Fitted polynomial regression')
cols = sns.color_palette()
id_sort = X_train['Shell weight'].argsort()
for i, pred in enumerate(preds):
    ax.plot(X_train['Shell weight'].iloc[id_sort], pred[id_sort], c=cols[i], label=f"Poly {i+2}")
    df_mse = pd.concat(
        [df_mse, pd.DataFrame(
            {'Poly' + str(i+2): [mean_squared_error(y_test, preds_test[i]), 
                     r2_score(y_test, preds_test[i]), 
                     adjusted_r2(y_test, preds_test[i], poly_dim[i])]},
     index=['Test MSE', 'R-square', 'Adj-R-Squared'])], axis=1)
    ax.legend()

df_mse
SLR MLR Poly2 Poly3 Poly4 Poly5 Poly6 Poly7 Poly8 Poly9 Poly10
Test MSE 6.960663 4.969321 6.753474 6.595747 6.558894 6.581101 6.599169 6.599699 6.579606 6.573815 6.595559
R-square 0.419115 0.585298 0.436406 0.449569 0.452644 0.450791 0.449283 0.449239 0.450915 0.451399 0.449584
Adj-R-Squared NA 0.580774 0.434371 0.446916 0.449343 0.446811 0.444621 0.443904 0.444925 0.444741 0.442228
  • Cross-validation
from sklearn.model_selection import cross_val_score
# List to store all losses
loss = [] 
for deg in degrees:
    pf = PolynomialFeatures(degree=deg)
    X_poly = pf.fit_transform(X_train[['Shell weight']])
    model = LR()
    score = -cross_val_score(model, X_poly, y_train, cv=10, 
                scoring='neg_mean_squared_error').mean()
    loss.append(score)

df_loss_cv = pd.DataFrame(
    {
        'CV-MSE': loss,
        'Degree': degrees
    },
    index=['Poly' +str(i) for i in degrees]
)

best_deg = degrees[np.argmin(loss)]
ax = sns.lineplot(df_loss_cv, x="Degree", y="CV-MSE", markers=True)
ax.vlines(x=[best_deg], ymin=np.min(loss), ymax=np.max(loss), color='r')
ax.set_yscale("log")
ax.set_title("CV-MSE vs Degree")
plt.show()

B. Perform \(10\)-fold Cross-validation to tune the best penalization strength \(\alpha\) of Ridge regression model for predicting Rings. Hint: see slide 51.

  • Compute Test MSE and compare it to the previous models.
from sklearn.linear_model import Ridge
# List of all degrees to search over
alphas = list(np.linspace(0.00001, 3, 50)) + list(np.linspace(3.1, 100, 50))
# List to store all losses
loss = []
test_mse = []
test_r2 = []
test_r2_squared = []
coefficients = {f'alpha={alpha}': [] for alpha in alphas}
for alp in alphas:
    model = Ridge(alpha=alp)
    mse = -cross_val_score(model, X_train, y_train, cv=10, 
                scoring='neg_mean_squared_error').mean()
    loss.append(mse)
    
    # Fit
    model.fit(X_train, y_train)
    coefficients[f'alpha={alp}'] = model.coef_
    y_hat_ridge = model.predict(X_test)
    test_mse.append(mean_squared_error(y_test, y_hat_ridge))
    test_r2.append(r2_score(y_test, y_hat_ridge))
    test_r2_squared.append(adjusted_r2(y_test, y_hat_ridge, X_train.shape[1]))
fig, axes = plt.subplots(1, 3, figsize=(12,4))
pd_coef = pd.DataFrame(coefficients, index=X_train.columns)
# Test error
axes[0].plot(alphas, test_mse, label="Test MSE")
axes[0].set_title("Test MSE at each penalty strength alpha")
axes[0].legend()
axes[0].set_xlabel("alpha")
axes[0].set_xscale('log')
axes[0].set_ylabel("Test MSE")

id_min = np.argmin(loss)
print(f'Optimal regularization strength alpha for Ridge regression: {alphas[id_min]}')
axes[1].plot(alphas, loss, label="CV MSE")
axes[1].vlines(x=alphas[id_min], ymin=np.min(loss), ymax=np.max(loss), label="optimal alpha", color="red", linestyle="--")
axes[1].set_title("CV MSE at each penalty strength alpha")
axes[1].legend()
axes[1].set_xscale('log')
axes[1].set_xlabel("alpha")
axes[1].set_ylabel("CV MSE")

axes[2].plot(alphas, pd_coef.transpose(), label=pd_coef.index)
axes[2].vlines(x=alphas[id_min], ymin=np.min(pd_coef), ymax=np.max(pd_coef), label="optimal alpha", color="red", linestyle="--")
axes[2].set_title("Coefs at each penalty strength alpha")
axes[2].legend()
axes[2].set_xscale('log')
axes[2].set_xlabel("alpha")
axes[2].set_ylabel("Coef")
plt.show()
Optimal regularization strength alpha for Ridge regression: 0.06123428571428571

The first graph is the performance of Ridge regression at each value of on the testing data. On the other hand, the second graph illustrates the performance of Ridge regression on cross-validation subsets. The last graph shows the values of the coefficients of the model at each value of penalty parameter \(\alpha\).

We can see that optimal strength \(\alpha\approx 0.0612\) by cross-validation method. The average CV MSE at this level of penalty strength is given below.

df_mse['Ridge'] = [test_mse[id_min], test_r2[id_min], test_r2_squared[id_min]]
df_mse
SLR MLR Poly2 Poly3 Poly4 Poly5 Poly6 Poly7 Poly8 Poly9 Poly10 Ridge
Test MSE 6.960663 4.969321 6.753474 6.595747 6.558894 6.581101 6.599169 6.599699 6.579606 6.573815 6.595559 4.962656
R-square 0.419115 0.585298 0.436406 0.449569 0.452644 0.450791 0.449283 0.449239 0.450915 0.451399 0.449584 0.585854
Adj-R-Squared NA 0.580774 0.434371 0.446916 0.449343 0.446811 0.444621 0.443904 0.444925 0.444741 0.442228 0.581336

C. Perform \(10\)-fold Cross-validation to tune the best penalization strength \(\alpha\) of Lasso regression model for predicting Rings. Hint: see slide 51.

  • Compute Test MSE and compare it to the previous models.
  • How many inputs are retained by Lasso?
from sklearn.linear_model import Lasso
# List of all degrees to search over
alphas = list(np.linspace(0.001, 3, 100))
# List to store all losses
loss = []
test_mse = []
test_r2 = []
test_r2_squared = []
coefficients = {f'alpha={alpha}': [] for alpha in alphas}
for alp in alphas:
    model = Lasso(alpha=alp)
    mse = -cross_val_score(model, X_train, y_train, cv=10, 
                scoring='neg_mean_squared_error').mean()
    loss.append(mse)

    # Fit
    model.fit(X_train, y_train)
    coefficients[f'alpha={alp}'] = model.coef_
    y_hat_lasso = model.predict(X_test)
    test_mse.append(mean_squared_error(y_test, y_hat_lasso))
    test_r2.append(r2_score(y_test, y_hat_lasso))
    test_r2_squared.append(adjusted_r2(y_test, y_hat_lasso, X_train.shape[1]))
fig, axes = plt.subplots(1, 3, figsize=(12,4))
pd_coef = pd.DataFrame(coefficients, index=X_train.columns)
# Test error
axes[0].plot(alphas, test_mse, label="Test MSE")
axes[0].set_title("Test MSE at each penalty strength alpha")
axes[0].legend()
axes[0].set_xlabel("alpha")
axes[0].set_xscale('log')
axes[0].set_ylabel("Test MSE")

id_min = np.argmin(loss)
print(f'Optimal regularization strength alpha for LASSO regression: {alphas[id_min]}')
axes[1].plot(alphas, loss, label="CV MSE")
axes[1].vlines(x=alphas[id_min], ymin=np.min(loss), ymax=np.max(loss), label="optimal alpha", color="red", linestyle="--")
axes[1].set_title("CV MSE at each penalty strength alpha")
axes[1].legend()
axes[1].set_xscale('log')
axes[1].set_xlabel("alpha")
axes[1].set_ylabel("CV MSE")

axes[2].plot(alphas, pd_coef.transpose(), label=pd_coef.index)
axes[2].vlines(x=alphas[id_min], ymin=np.min(pd_coef), ymax=np.max(pd_coef), label="optimal alpha", color="red", linestyle="--")
axes[2].set_title("Coefs at each penalty strength alpha")
axes[2].legend(loc='upper right')
axes[2].set_xscale('log')
axes[2].set_xlabel("alpha")
axes[2].set_ylabel("Coef")
plt.show()
Optimal regularization strength alpha for LASSO regression: 0.001

df_mse['LASSO'] = [test_mse[id_min], test_r2[id_min], test_r2_squared[id_min]]
df_mse
SLR MLR Poly2 Poly3 Poly4 Poly5 Poly6 Poly7 Poly8 Poly9 Poly10 Ridge LASSO
Test MSE 6.960663 4.969321 6.753474 6.595747 6.558894 6.581101 6.599169 6.599699 6.579606 6.573815 6.595559 4.962656 4.955233
R-square 0.419115 0.585298 0.436406 0.449569 0.452644 0.450791 0.449283 0.449239 0.450915 0.451399 0.449584 0.585854 0.586473
Adj-R-Squared NA 0.580774 0.434371 0.446916 0.449343 0.446811 0.444621 0.443904 0.444925 0.444741 0.442228 0.581336 0.581962

4. Auto-MPG Dataset

Apply what you have learn to predict fuel efficiency (Miles Per Gallon, MPG) in vehicles of the kaggle AUTO-MPG Dataset.

import kagglehub

# Download latest version
path = kagglehub.dataset_download("uciml/autompg-dataset")

data = pd.read_csv(path + '/auto-mpg.csv')
data.head()
mpg cylinders displacement horsepower weight acceleration model year origin car name
0 18.0 8 307.0 130 3504 12.0 70 1 chevrolet chevelle malibu
1 15.0 8 350.0 165 3693 11.5 70 1 buick skylark 320
2 18.0 8 318.0 150 3436 11.0 70 1 plymouth satellite
3 16.0 8 304.0 150 3433 12.0 70 1 amc rebel sst
4 17.0 8 302.0 140 3449 10.5 70 1 ford torino

You should try to work on this problem by yourself.

Further readings