from keras.datasets import mnist
(X_train, label_train), (X_test, label_test) = mnist.load_data()

# Visualize the first training image
import matplotlib.pyplot as plt
plt.imshow(X_train[0,:,:])
plt.axis("off")
plt.show()
Lab - Deep Learning
ITM-370: Data Analytics
Lecturer: HAS Sothea, PhD
Objective: The goal of this practical lab is to introduce you to building Multilayer Perceptron (MLP) models. After this, you will be familiar with how to get started with a model architecture and with the influence of activation functions, the learning rate, the batch size… We will try to understand all these concepts using the classic handwritten digit dataset Mnist, which can be imported from the keras module.
The Jupyter Notebook for this Lab can be downloaded here: Lab_Deep_Learning.ipynb.
Or you can work with this notebook in Google Colab here: Lab_Deep_Learning.ipynb.
1. Importing Mnist dataset
You can import the Mnist dataset from keras as in the code snippet at the top of this page.
A. What are the dimensions of X_train and X_test? Can you guess what label_train and label_test are? How many distinct values do these labels take?
# To do
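A minimal sketch of how one might check this (numpy is assumed to be available, which it is whenever keras is installed):

import numpy as np
print(X_train.shape, X_test.shape)            # dimensions of the image arrays
print(label_train.shape, label_test.shape)
print(np.unique(label_train))                 # the distinct label values
print(len(np.unique(label_train)))            # number of distinct classes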
B. Visualize the first 8 images of the training data in two rows.
import matplotlib.pyplot as plt
# To do
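If you get stuck, here is one possible sketch using matplotlib subplots (the 2 by 4 grid and the gray colormap are arbitrary choices):

fig, axes = plt.subplots(2, 4, figsize=(8, 4))
for i, ax in enumerate(axes.flat):
    ax.imshow(X_train[i, :, :], cmap='gray')
    ax.set_title(str(label_train[i]))
    ax.axis('off')
plt.tight_layout()
plt.show()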
C. Reshape the training data and testing data from 3-dimensional into 2-dimensional arrays using x.reshape(), i.e., \(n\times 28\times 28\to n\times 784\), so that each row is the \(784\)-dimensional vector of one image.
# To do
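One possible sketch (the number of rows is read from the arrays rather than hard-coded):

X_train = X_train.reshape(X_train.shape[0], 28 * 28)   # (n, 28, 28) -> (n, 784)
X_test = X_test.reshape(X_test.shape[0], 28 * 28)
print(X_train.shape, X_test.shape)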
D. Scale the training and testing input pixels to take values between \(0\) and \(1\). This helps the training process converge more quickly.
# To do
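Since the raw Mnist pixels are integers in \([0, 255]\), one simple sketch is to divide by 255:

X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0
print(X_train.min(), X_train.max())   # should now be 0.0 and 1.0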
2. Model
Here, we build an MLP model with the following architecture and hyperparameters:
A. Designing a network
- Network layers: \([d, 32, 32, M]\) where \(d\) is the input dimension. What are the values of \(d\) and \(M\)?
- Activation function: ReLU for the hidden layers and Softmax on the output layer.
- Optimization method: Adam
- Learning rate: \(0.001\)
- Loss: ‘sparse_categorical_crossentropy’ to be minimized.
- Metric: [‘accuracy’] is the metric used to keep track of the model’s performance during the training process.
- Minibatch size: \(b=32\) (subset used to approximate the full loss function when training)
- Number of epochs: \(50\).
- Validation size: \(0.2\), for keeping track of the model’s state during training.
Let me set up all these for you.
from keras.models import Sequential
from keras.layers import Input, Dense

d = 784   # Input dimensions
M = 10    # Number of classes

model = Sequential()
# Input layer
model.add(Input(shape=(d,)))

# Hidden layers
# Add hidden layer of size 32
model.add(Dense(32, activation='relu'))
# Add another hidden layer of size 32
model.add(Dense(32, activation='relu'))
# Add one last layer (output) of size 10
model.add(Dense(10, activation='softmax'))

# Compiling the model: set up the optimization, loss and metric
from keras.optimizers import Adam, SGD
lr = 0.001
model.compile(
    optimizer=Adam(learning_rate=lr),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'])

# Fit the model
b = 32
n_epoch = 50
val_size = 0.2
history = model.fit(X_train, label_train, epochs=n_epoch, batch_size=b, validation_split=val_size)
B. Learning curves can be extracted from the history object.
Create learning curves as shown in slide 29.
What information do you get from the curves?
Compute the testing accuracy.
Visualize images that are wrongly predicted.
# To do
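If needed, here is one possible sketch covering the three tasks above; it relies on the history, model, X_test and label_test objects defined earlier, and the plot layout is an arbitrary choice:

import numpy as np
import matplotlib.pyplot as plt

# Learning curves: training vs. validation loss and accuracy
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(history.history['loss'], label='train loss')
ax1.plot(history.history['val_loss'], label='validation loss')
ax1.set_xlabel('epoch')
ax1.legend()
ax2.plot(history.history['accuracy'], label='train accuracy')
ax2.plot(history.history['val_accuracy'], label='validation accuracy')
ax2.set_xlabel('epoch')
ax2.legend()
plt.show()

# Testing accuracy
test_loss, test_acc = model.evaluate(X_test, label_test)
print("Test accuracy:", test_acc)

# Visualize a few wrongly predicted images
pred = np.argmax(model.predict(X_test), axis=1)
wrong = np.where(pred != label_test)[0]
for i in wrong[:8]:
    plt.imshow(X_test[i].reshape(28, 28), cmap='gray')
    plt.title(f"true: {label_test[i]}, predicted: {pred[i]}")
    plt.axis('off')
    plt.show()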
C. The network’s architecture and hyperparameters above were chosen nearly at random. Now it is your turn to study the impact of these choices.
Increase the number of neurons within each hidden layer.
Change the activation function.
Change the learning rate.
Change the optimization algorithm.
Change the minibatch size \(b\).
In each case, compute the test performance of the model (one possible template for such an experiment is sketched below).
# To do
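As a template, here is one possible sketch of a single experiment; in this example the hidden layers are widened to 64 neurons and the optimizer is switched to SGD, and the other variations follow the same pattern:

from keras.models import Sequential
from keras.layers import Input, Dense
from keras.optimizers import SGD

model2 = Sequential([
    Input(shape=(784,)),
    Dense(64, activation='relu'),    # wider hidden layer
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])
model2.compile(optimizer=SGD(learning_rate=0.01),
               loss='sparse_categorical_crossentropy',
               metrics=['accuracy'])
model2.fit(X_train, label_train, epochs=50, batch_size=64, validation_split=0.2)
print("Test accuracy:", model2.evaluate(X_test, label_test)[1])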
Further readings
- Graphical tools: Mnist dataset
- Deep Learning