MNIST: its contents and relevant size characteristics
MNIST stands for "Modified National Institute of Standards and Technology". It is a large database of handwritten digits that is widely used for training machine learning and image processing systems, and it is an important standard benchmark for classification, learning, and computer vision. MNIST is derived from a larger dataset known as NIST Special Database 19, which contains digits as well as handwritten uppercase and lowercase letters. A variant known as Extended MNIST (EMNIST) follows the same conversion procedure that was used to produce MNIST (Cohen et al., 2017).
The MNIST dataset contains 70,000 images of handwritten digits (0-9), which have been size-normalized and centred in a square grid of 28x28 pixels. Every image is an array of floating-point values representing greyscale intensities ranging from 0 (black) to 1 (white). The target data consists of one-hot binary vectors of size 10, corresponding to the digit classes 0-9. The figure below shows some examples from MNIST. This report concerns the classification of handwritten MNIST digits with a multilayer perceptron network (Conx.Readthedocs, 2017).
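As a brief illustration of this target format, here is a minimal MATLAB sketch that converts digit labels into one-hot vectors (the variable names labels and targets are hypothetical, not from the original report):

% Minimal sketch: one-hot encoding of digit labels (0-9).
labels = [5; 0; 4];                      % hypothetical example labels
N = numel(labels);
targets = zeros(N, 10);                  % one row per image, one column per digit class
targets(sub2ind(size(targets), (1:N)', labels + 1)) = 1;
disp(targets)                            % each row contains a single 1 at position digit+1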
There are two subsets of the MNIST dataset:
· Training dataset
· Testing dataset
The training dataset contains 60,000 images and the testing dataset contains 10,000 images. In several studies the training dataset is further divided into two sets: 50,000 images for training and 10,000 images for validation. Cireşan et al. train their networks on deformed images generated in an on-line fashion, while the undeformed training set is used for validation, so no training images are wasted (Cireşan et al., 2012).
MNIST is more challenging than the Iris dataset
Compared with the Iris dataset, MNIST presents a considerably more challenging classification task. Its extension EMNIST goes further still, including digits and letters that share the same image parameters and structure as the original MNIST task, which allows direct compatibility with existing classifiers and systems. The MNIST data itself is derived from a small subset of the numerical digits contained in the NIST datasets, using the methods outlined by its creators. The original images were subjected to pre-processing, including a procedure that normalizes each image to fit into a 20x20-pixel box while preserving its aspect ratio. MNIST has long been used as a benchmark for testing the behaviour of different classifier implementations, and there have been ongoing efforts to publish rankings of their performance; the metric used to report performance on MNIST is the test error rate (Baldominos et al., 2019).
The Iris dataset, by contrast, is small and simple: a single function call loads it into NumPy arrays. For classifying Iris, the relevant characteristics of the dataset are:
· Petal length
· Sepal length
· Petal width
· Sepal width
· A discrete target variable (the species)
· A small number of samples (150)
A MATLAB equivalent of this loading step is sketched below.
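For consistency with the MATLAB code used elsewhere in this report, here is a minimal sketch that loads the Iris data as numeric arrays, assuming the Statistics and Machine Learning Toolbox and its built-in fisheriris sample data:

% Minimal sketch: loading the Iris dataset in MATLAB.
load fisheriris                          % provides meas (150x4) and species (150x1)
X = meas;                                % sepal length/width, petal length/width
y = grp2idx(species);                    % discrete target variable as integers 1-3
fprintf('%d samples, %d features, %d classes\n', size(X, 1), size(X, 2), numel(unique(y)));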
The MNIST database is more challenging than the Iris database: it consists of images of handwritten digits, with a training set of 60,000 images and a test set of 10,000 images, rather than a few hundred low-dimensional samples. MNIST is a good database for people who want to try learning techniques and pattern recognition methods on real-world data (LeCun et al., 2019).
Key algorithm performances on MNIST with the multi-layer perceptron
For the MNIST dataset, machine learning studies have established several different algorithms, such as:
· KNN (K-Nearest Neighbour)
· Decision Tree (DT)
· Neural Network (NN)
Experiments and analysis of the results have shown that the NN algorithm is the most accurate at recognizing the digits in the MNIST dataset, and that it also achieves the lowest error rate. The KNN algorithm is a non-parametric method that is used for both regression and classification. Each algorithm uses different techniques to determine the relationships among the features and build a predictive model for accurate digit recognition (AL-Behadili, 2016). The error rates obtained by each model were evaluated using the MNIST test dataset, as shown in the figure below; a code sketch of the KNN approach follows the figure.
Figure: MNIST test dataset
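As a minimal sketch of one of these algorithms, assuming MATLAB's Statistics and Machine Learning Toolbox and hypothetical pre-loaded variables Xtrain, ytrain, Xtest, ytest for the MNIST images and labels, a KNN classifier and its test error rate could look like this:

% Minimal sketch: KNN digit classification and its test error rate.
% Xtrain/Xtest: N-by-784 matrices of greyscale intensities in [0, 1];
% ytrain/ytest: N-by-1 vectors of digit labels (0-9). All hypothetical names.
mdl = fitcknn(Xtrain, ytrain, 'NumNeighbors', 5);   % 5 nearest neighbours
pred = predict(mdl, Xtest);                         % classify the test images
errorRate = mean(pred ~= ytest);                    % fraction misclassified
fprintf('KNN test error rate: %.2f%%\n', 100 * errorRate);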
2). Multi-layer perceptron and the backpropagation training algorithm in MATLAB
The multilayer perceptron has several hidden layers, as sketched in the MATLAB code below. The l-th hidden layer consists of m^{(l)} hidden units, and the output value of the i-th hidden unit in this layer is

y_i^{(l)} = f\big(z_i^{(l)}\big), \quad z_i^{(l)} = \sum_{k=1}^{m^{(l-1)}} w_{i,k}^{(l)} \, y_k^{(l-1)},

where f is the activation function and w_{i,k}^{(l)} is the weight connecting unit k of layer (l-1) to unit i of layer l. Layer (L+1) is the output layer, and its output units are set in the same way. Thus the feed-forward multilayer perceptron, as a model of a neural network, computes a model function of the form

y(x, w) = \big(y_1^{(L+1)}(x, w), \ldots, y_C^{(L+1)}(x, w)\big)^T,

where w is the vector comprising all the weights, and the outputs y_1^{(0)}, \ldots, y_D^{(0)} of the input units are the components of the input x (Stutz, 2014).
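The original MATLAB screenshots are not reproduced here; as a minimal sketch of the forward pass under these definitions, assuming a single hidden layer with a logistic sigmoid activation and a softmax output (the names mlp_forward, W1, b1, W2, b2 are placeholders, not taken from the original code):

% Minimal sketch: forward pass of a one-hidden-layer MLP (save as mlp_forward.m).
% x : 784-by-1 input vector (a flattened 28x28 MNIST image)
% W1: H-by-784 hidden-layer weights, b1: H-by-1 hidden-layer biases
% W2: 10-by-H output-layer weights,  b2: 10-by-1 output-layer biases
function y = mlp_forward(x, W1, b1, W2, b2)
    z1 = W1 * x + b1;               % hidden pre-activations z^(1)
    y1 = 1 ./ (1 + exp(-z1));       % logistic sigmoid activation f
    z2 = W2 * y1 + b2;              % output pre-activations z^(2)
    y  = exp(z2) ./ sum(exp(z2));   % softmax over the 10 digit classes
end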
3). MATLAB code that evaluates the classification error rate for the MLP on the MNIST dataset
Even without a full back-propagation implementation, the classification error rate can be evaluated, as long as the forward propagation and the learning of the output layer work.
The back-propagation algorithm is used here with a classical feed-forward multi-layer perceptron on the MNIST dataset. It is a technique that is still used to train large machine learning networks. Back-propagation, from the field of artificial neural networks, is the supervised learning method used for multi-layer feed-forward networks. Feed-forward neural networks are inspired by the information processing of one or more neural cells, each called a neuron. The principle of the back-propagation method is to model a given function by modifying the internal weights applied to the input signals so as to produce the expected output signals (Brownlee, 2016).
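A minimal sketch of the error-rate evaluation, reusing the hypothetical mlp_forward function sketched above and assuming Xtest holds the test images as columns and ytest the true labels:

% Minimal sketch: classification error rate of the MLP on the MNIST test set.
% Xtest: 784-by-N matrix (one image per column); ytest: 1-by-N labels (0-9).
N = size(Xtest, 2);
pred = zeros(1, N);
for n = 1:N
    p = mlp_forward(Xtest(:, n), W1, b1, W2, b2);   % class probabilities
    [~, idx] = max(p);                              % most probable class
    pred(n) = idx - 1;                              % map index 1..10 to digit 0..9
end
errorRate = mean(pred ~= ytest);                    % fraction misclassified
fprintf('MLP test error rate: %.2f%%\n', 100 * errorRate);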
4). Feature selection for the MLP on the MNIST dataset
Feature selection is the method of selecting the subset of relevant features to use from the MNIST dataset. The features fed into the MLP (multi-layer perceptron) have a large influence on the results obtained. Feature selection keeps the features that contribute most to the prediction variable, and performing it before modelling the MNIST dataset has the following benefits:
· Reduced overfitting: less redundant data means fewer decisions based on noise.
· Improved accuracy: less misleading data means better modelling accuracy.
· Reduced training time: less data means the algorithm trains faster.
The methods used for feature selection with the MLP are:
· Filter methods
· Wrapper methods (Alsaafin, 2017)
There are further methods used for feature selection with the MLP:
· Fisher score
· Mutual information
· Maximum output information
The Fisher score and mutual information are well-known filter methods, while the maximum output information method is a wrapper method (Yang et al., 2010). A sketch of the Fisher-score filter is given below.
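As a minimal sketch of a filter method, assuming hypothetical variables Xtrain (one image per row) and ytrain (digit labels), the Fisher score of each pixel feature can be computed and a top-scoring subset retained:

% Minimal sketch: Fisher-score filter for feature selection.
% Xtrain: N-by-784 matrix of training images; ytrain: N-by-1 labels.
classes = unique(ytrain);
D = size(Xtrain, 2);
overallMean = mean(Xtrain, 1);
num = zeros(1, D);                       % between-class scatter per feature
den = zeros(1, D);                       % within-class scatter per feature
for c = classes'
    Xc = Xtrain(ytrain == c, :);         % samples of one digit class
    nc = size(Xc, 1);
    num = num + nc * (mean(Xc, 1) - overallMean).^2;
    den = den + nc * var(Xc, 1, 1);      % biased per-class variance
end
score = num ./ max(den, eps);            % guard against division by zero
[~, order] = sort(score, 'descend');
selected = order(1:round(0.6 * D));      % keep the top ~60% of features
XtrainSel = Xtrain(:, selected);         % reduced training matrix for the MLP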
5). Conclusion
Summing up the discussion, this report has provided an exhaustive overview of the state of the art of the MNIST database. The MNIST database of handwritten digits was introduced about two decades ago. Accuracy on MNIST is now very close to 100% and will hardly increase further. The report also introduced the EMNIST datasets, a suite of six datasets intended to provide a more challenging successor to MNIST. The characters of NIST Special Database 19 were converted into a format that matches the MNIST dataset, making EMNIST compatible with any network capable of working with the original MNIST dataset. A subset of approximately 60% of the features is enough to train the classification techniques for digit recognition, and this has been implemented for the different algorithms. All the objectives required for the report have been fulfilled; in particular, MATLAB code for the multi-layer perceptron on the MNIST dataset is sketched in the sections above.
References
AL-Behadili, H. N. (2016). Classification algorithms for determining handwritten digit. Iraq Journal of Electrical and Electronic Engineering, 12(1).
Alsaafin, A. (2017). A minimal subset of features using feature selection for handwritten digit recognition. Journal of Intelligent Learning Systems and Applications, 9(4).
Baldominos, A., et al. (2019). A survey of handwritten character recognition with MNIST and EMNIST. Applied Sciences (MDPI).
Brownlee, J. (2016, November 7). How to code a neural network with backpropagation in Python. Retrieved from https://machinelearningmastery.com/implement-backpropagation-algorithm-scratch-python/
Cireşan, D. C., et al. (2012). Deep big multilayer perceptrons for digit recognition. Neural Networks: Tricks of the Trade, 581–598.
Cohen, G., et al. (2017). EMNIST: an extension of MNIST to handwritten letters. arXiv:1702.05373v2.
Conx.Readthedocs. (2017). The MNIST dataset. Retrieved from https://conx.readthedocs.io/en/latest/MNIST.html
LeCun, Y., et al. (2019). The MNIST database of handwritten digits. Retrieved from http://yann.lecun.com/exdb/mnist/
Stutz, D. (2014). Introduction to neural networks. RWTH Aachen University.
Yang, J., et al. (2010). Feature selection for MLP neural network: the use of random permutation of probabilistic outputs. IEEE Transactions on Neural Networks, 21(12), 1911–1922.