MNIST Handwritten Character Classification with a Multi-Layer Perceptron
Learning Outcomes
1. Evaluate and articulate the issues and challenges in machine learning, including model selection, complexity and feature selection.
2. Demonstrate a working knowledge of the variety of mathematical techniques normally adopted for machine learning problems, and of their application to creating effective solutions.
3. Critically evaluate the performance and drawbacks of a proposed solution to a machine learning problem.
4. Create solutions to machine learning problems using appropriate software.
Data set
The MNIST dataset contains images of handwritten digits 0-9. The task is to take an image as input and determine which of the digits 0-9 is written in it. The dataset is an industry standard and one of the most common benchmarks for new classification algorithms. The dataset and information about it can be found here:
http://yann.lecun.com/exdb/mnist/ .
The dataset is already pre-structured into training and test sets, which can be downloaded from the homepage. The performance of common algorithms on this dataset is well known and documented as error rates on the test set on the homepage. The dataset can easily be read into Matlab or Octave with commonly available helper scripts such as those at
http://ufldl.stanford.edu/wiki/index.php/Using_the_MNIST_Dataset .
% Change the filenames if you've saved the files under different names
% On some platforms, the files might be saved as
% train-images.idx3-ubyte / train-labels.idx1-ubyte
images = loadMNISTImages('train-images-idx3-ubyte');
labels = loadMNISTLabels('train-labels-idx1-ubyte');

% We are using display_network from the autoencoder code
display_network(images(:,1:100)); % Show the first 100 images
disp(labels(1:10));               % Show the first 10 labels
Machine Learning and Evaluation
For this coursework you will program and use the Backpropagation learning algorithm for Multi-Layer Perceptrons (a.k.a. "Deep Learning") in Matlab/Octave. The attached Matlab file provides a stub for the neural network code with data members and a constructor already in place. Fill in the methods to initialize the weights, to compute the output of the network for a given input, and to train the network by means of the backpropagation algorithm. Use of the template is compulsory. Implementations outside the template design will not be accepted unless explicitly and individually agreed with the module leader. The network implementation must have at least one hidden layer (as provided in the stub). Backpropagation may be implemented as an online algorithm.
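As one possible reference point, a single online backpropagation step for a network with one hidden layer could look as follows. This is a minimal sketch under the usual sigmoid/squared-error formulation, not the provided template; all names (W1, W2, b1, b2, eta) are illustrative, and `images`/`labels` are assumed to have been loaded as shown above.

% One online backpropagation step for a 784-30-10 MLP (illustrative
% sketch only; the coursework must use the provided template).
nIn = 784; nHidden = 30; nOut = 10; eta = 0.1;
W1 = 0.01 * randn(nHidden, nIn);  b1 = zeros(nHidden, 1);
W2 = 0.01 * randn(nOut, nHidden); b2 = zeros(nOut, 1);
sigmoid = @(z) 1 ./ (1 + exp(-z));

x = images(:, 1);                     % one MNIST image as a column vector
t = zeros(nOut, 1);
t(labels(1) + 1) = 1;                 % one-hot target for its label

% Forward pass
a1 = sigmoid(W1 * x + b1);            % hidden activations
y  = sigmoid(W2 * a1 + b2);           % output activations

% Backward pass (squared-error loss; sigmoid derivative is y .* (1 - y))
delta2 = (y - t) .* y .* (1 - y);
delta1 = (W2' * delta2) .* a1 .* (1 - a1);

% Online (per-sample) gradient-descent update
W2 = W2 - eta * delta2 * a1';  b2 = b2 - eta * delta2;
W1 = W1 - eta * delta1 * x';   b1 = b1 - eta * delta1;

Repeating this step over many samples and epochs, with the step embedded in the template's training method, is the core of the required implementation.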
Experiments must at least show:
• The training and test error (a minimal sketch of the error-rate computation follows this list)
• A comparison of different hidden layer sizes
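As a minimal sketch of how the test error rate could be computed, assuming the network's outputs for N test images have been collected in a 10-by-N matrix (the names `outputs` and `testLabels` below are illustrative, not part of the template):

% Classification error rate from network outputs (illustrative names).
% `outputs` is 10-by-N (one column of 10 class scores per test image),
% `testLabels` is N-by-1 with the true digits 0-9.
[~, predicted] = max(outputs);   % index of the strongest output per column
predicted = predicted' - 1;      % map row indices 1-10 back to digits 0-9
errorRate = mean(predicted ~= testLabels);
fprintf('Test error rate: %.2f%%\n', 100 * errorRate);

The same computation on the training images gives the training error, and running it for several hidden layer sizes gives the required comparison.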
The entire experiment must be submitted as a Matlab script file from which it can be reproduced. Indicate whether you developed in Matlab or Octave.
Bonus points are given for the implementation of more network layers, an additional batch-gradient implementation of backpropagation, or meaningful pre-processing steps. Further bonus points are given for the use of a separate validation set, for cross-validation, or for the experimental evaluation of any other relevant parameters.
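For the batch-gradient bonus, the per-sample update from the earlier sketch generalises by propagating all samples at once and averaging the gradients. The following illustrative sketch (reusing the names W1, W2, b1, b2, eta and sigmoid from above; it relies on implicit broadcasting, i.e. Octave or Matlab R2016b+) shows one full-batch step:

% One batch-gradient step over N samples (illustrative, not the template).
N = 1000;                          % batch size
X = images(:, 1:N);                % 784-by-N inputs
T = full(sparse(labels(1:N)' + 1, 1:N, 1, 10, N));  % 10-by-N one-hot targets

A1 = sigmoid(W1 * X + b1);         % hidden activations; b1 broadcasts
Y  = sigmoid(W2 * A1 + b2);        % output activations

D2 = (Y - T) .* Y .* (1 - Y);      % output-layer deltas, one column per sample
D1 = (W2' * D2) .* A1 .* (1 - A1); % hidden-layer deltas

% Average the per-sample gradients before updating
W2 = W2 - (eta / N) * D2 * A1';    b2 = b2 - (eta / N) * sum(D2, 2);
W1 = W1 - (eta / N) * D1 * X';     b1 = b1 - (eta / N) * sum(D1, 2);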
Report structure and assessment (70% of module mark)
1) Write a brief introduction that (10%)
   1) Explains what MNIST is about, what its contents are, and what its relevant size characteristics are.
   2) Explains why MNIST is more challenging than the Iris dataset.
   3) Briefly discusses key algorithm performances using the listing on the dataset homepage and explains what can be expected for the following experiment.
2) Implement and document a multi-layer perceptron and the backpropagation training algorithm in Matlab. (20%)
   1) Build your code up systematically, step by step, and test it. Provide evidence of this process.
   2) 5 of the 20 marks in this section are reserved as a speed bonus for those who can live-demo an MLP successfully learning XOR (or a similar function) by 14/11/19 (a minimal XOR sanity check is sketched at the end of this brief).
3) Realize and describe an experiment in Matlab that evaluates the classification error rate of the MLP on the MNIST dataset. Use appropriate illustrations and diagrams as well as statistics. (20%)
   1) Make sure you have one successfully learning parameter set first, and explore systematically from there. Pay particular attention to finding an appropriate learning rate early on.
   2) This experiment can be conducted without a full backpropagation implementation as long as the forward propagation and the learning of the output layer work, although results will differ from the intended experiment.
4) Bonus points for additional features of the MLP or the experiment; see above. (10%)
5) Write a brief conclusion on the results and compare them to the results documented for other algorithms and MLP configurations on the dataset homepage. Explain possible current limitations of your solution and possible further strategies to improve on the results. (10%)
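For the XOR speed bonus mentioned in section 2, a minimal self-contained sanity check along the following lines may help. It is a sketch only, reusing the online update from the earlier sketch, and assumes nothing from the provided template:

% Train a 2-4-1 MLP on XOR with online backpropagation (illustrative).
X = [0 0 1 1; 0 1 0 1];            % the four XOR inputs as columns
T = [0 1 1 0];                     % XOR targets
sigmoid = @(z) 1 ./ (1 + exp(-z));
W1 = randn(4, 2); b1 = zeros(4, 1);
W2 = randn(1, 4); b2 = 0;
eta = 0.5;

for epoch = 1:20000
  for i = 1:4                      % one online update per pattern
    x = X(:, i); t = T(i);
    a1 = sigmoid(W1 * x + b1);
    y  = sigmoid(W2 * a1 + b2);
    d2 = (y - t) * y * (1 - y);
    d1 = (W2' * d2) .* a1 .* (1 - a1);
    W2 = W2 - eta * d2 * a1';  b2 = b2 - eta * d2;
    W1 = W1 - eta * d1 * x';   b1 = b1 - eta * d1;
  end
end

% Should print 0 1 1 0 after successful training
% (W1 * X + b1 relies on broadcasting: Octave or Matlab R2016b+).
disp(round(sigmoid(W2 * sigmoid(W1 * X + b1) + b2)))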