In the last blog, we completed our implementation of a neural network. In this blog we will test our neural network against a real dataset: the MNIST dataset. It is a collection of images of handwritten digits between 0 and 9, which our network will learn to classify.

Some examples of what the data looks like:

Each image is a 28×28 greyscale image, with every pixel taking a value in [0, 255], where 0 is white and 255 is black.

Let’s break this down:

- Download the dataset and load it in Java
- Preprocess the values so that the pixels are flattened into a 1D input matrix and the values lie between 0 and 1
- Configure the neural network to have 2 hidden layers of 16 neurons each. What about input and output? Well, the input will be 784 neurons (flattened 28 * 28 pixels) and the output will be 10 neurons, where the ith neuron indicates how likely it is that the current input is the digit i.
- Repeat the training process for ‘N’ epochs and then validate the network with unseen samples.
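The preprocessing step above can be sketched with plain arrays. This is only an illustration and does not use the `Matrix` class from this series; the class and method names (`PreprocessSketch`, `flattenAndScale`, `oneHot`) are made up for the example.

```java
public class PreprocessSketch {

    // Flattens a rows x cols image in row-major order and scales
    // every pixel from [0, 255] down to [0, 1].
    public static double[] flattenAndScale(int[][] pixels) {
        int rows = pixels.length;
        int cols = pixels[0].length;
        double[] input = new double[rows * cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                input[i * cols + j] = pixels[i][j] / 255.0;
            }
        }
        return input;
    }

    // Builds the expected output: 1 at index `label`, 0 everywhere else.
    public static double[] oneHot(int label, int classes) {
        double[] target = new double[classes];
        target[label] = 1.0;
        return target;
    }

    public static void main(String[] args) {
        int[][] image = new int[28][28]; // all pixels start at 0
        image[0][1] = 255;               // one fully dark pixel

        double[] input = flattenAndScale(image);
        double[] target = oneHot(3, 10);

        System.out.println(input.length); // 784
        System.out.println(input[1]);     // 1.0
        System.out.println(target[3]);    // 1.0
    }
}
```

The 784 values become the input layer, and the 10-element target is what the output layer is trained towards.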

The dataset is free to download here. After downloading and extracting it, we write a `MnistReader` class:

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Matrix, Pair and MathUtils come from the earlier posts in this series.
public class MnistReader {

  // Loads `samples` image/label pairs; pass -1 to load every item in the file.
  public static List<Pair<Matrix, Matrix>> getDataForNN(
      String imagePath, String labelsPath, int samples) {
    try {
      return getDataForNNHelper(imagePath, labelsPath, samples);
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }

  private static List<Pair<Matrix, Matrix>> getDataForNNHelper(
      String imagesPath, String labelsPath, int samples) throws IOException {
    List<Pair<Matrix, Matrix>> data = new ArrayList<>();

    try (DataInputStream trainingDis =
            new DataInputStream(new BufferedInputStream(new FileInputStream(imagesPath)));
        DataInputStream labelDis =
            new DataInputStream(new BufferedInputStream(new FileInputStream(labelsPath)))) {

      // Image file header: magic number, item count, rows, columns.
      int magicNumber = trainingDis.readInt();
      int numberOfItems = trainingDis.readInt();
      int nRows = trainingDis.readInt();
      int nCols = trainingDis.readInt();

      // Label file header: magic number, label count.
      int labelMagicNumber = labelDis.readInt();
      int numberOfLabels = labelDis.readInt();

      numberOfItems = samples == -1 ? numberOfItems : samples;

      for (int t = 0; t < numberOfItems; t++) {
        // Read one image, one unsigned byte per pixel.
        double[][] imageContent = new double[nRows][nCols];
        for (int i = 0; i < nRows; i++) {
          for (int j = 0; j < nCols; j++) {
            imageContent[i][j] = trainingDis.readUnsignedByte();
          }
        }

        // Scale pixels to [0, 1] and flatten into a single column.
        Matrix imageData =
            new Matrix(imageContent)
                .apply(pixel -> MathUtils.scaleValue(pixel, 0, 255, 0, 1))
                .flatten()
                .transpose();

        // One-hot encode the label as a 10 x 1 column.
        int label = labelDis.readUnsignedByte();
        double[] output = new double[10];
        output[label] = 1;
        Matrix outputMatrix = new Matrix(new double[][] {output}).transpose();

        data.add(Pair.of(imageData, outputMatrix));
      }
    }
    return data;
  }
}
```
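The first few `readInt()` calls in the reader consume the IDX file header. MNIST files store these as big-endian 32-bit integers, which is what `DataInputStream.readInt()` reads, and the magic number is 2051 for image files and 2049 for label files. The reader above ignores the magic numbers; a minimal sketch of validating them could look like this (the class and method names are made up for the example, and the header is built in memory so the snippet runs without the dataset):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class IdxHeaderSketch {

    public static final int IMAGE_MAGIC = 2051; // 0x00000803
    public static final int LABEL_MAGIC = 2049; // 0x00000801

    // Reads an image-file header and returns {itemCount, rows, cols}.
    public static int[] readImageHeader(DataInputStream in) throws IOException {
        int magic = in.readInt();
        if (magic != IMAGE_MAGIC) {
            throw new IOException("Not an IDX image file, magic number = " + magic);
        }
        int items = in.readInt();
        int rows = in.readInt();
        int cols = in.readInt();
        return new int[] {items, rows, cols};
    }

    // Reads a label-file header and returns the number of labels.
    public static int readLabelHeader(DataInputStream in) throws IOException {
        int magic = in.readInt();
        if (magic != LABEL_MAGIC) {
            throw new IOException("Not an IDX label file, magic number = " + magic);
        }
        return in.readInt();
    }

    public static void main(String[] args) throws IOException {
        // Build a fake image header in memory instead of opening a real file.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(IMAGE_MAGIC);
        out.writeInt(60000);
        out.writeInt(28);
        out.writeInt(28);

        DataInputStream in =
            new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        int[] header = readImageHeader(in);
        System.out.println(header[0] + " images of " + header[1] + "x" + header[2]);
    }
}
```

Failing fast on a wrong magic number helps catch a swapped images/labels path before training starts.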

Now, because our matrices are immutable and return a new matrix after every operation, the pre-processing becomes a one-liner:

```java
Matrix imageData =
    new Matrix(imageContent)
        .apply(pixel -> MathUtils.scaleValue(pixel, 0, 255, 0, 1))
        .flatten()
        .transpose();
```

We don’t need an explanation here as the function names are self-explanatory.

We now implement the `MnistTrainer` class, which trains on the loaded input and adjusts the weights and biases:
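For readers who want to see the scaling spelled out anyway: the body of `MathUtils.scaleValue` is not shown in this post, so the following is only a guess at what a linear rescale with that argument order might look like. `ScaleSketch` is a hypothetical stand-in, not the actual helper.

```java
public class ScaleSketch {

    // Linearly maps `value` from the range [inMin, inMax] to [outMin, outMax].
    public static double scaleValue(
            double value, double inMin, double inMax, double outMin, double outMax) {
        return outMin + (value - inMin) * (outMax - outMin) / (inMax - inMin);
    }

    public static void main(String[] args) {
        System.out.println(scaleValue(0, 0, 255, 0, 1));   // 0.0
        System.out.println(scaleValue(255, 0, 255, 0, 1)); // 1.0
    }
}
```

For the MNIST case this reduces to dividing each pixel by 255.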

```java
import java.util.List;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

@Builder
@AllArgsConstructor
@NoArgsConstructor
@Data
public class MnistTrainer {
  private NeuralNetwork neuralNetwork;
  private int iterations;
  private double learningRate;

  public void train(List<Pair<Matrix, Matrix>> trainingData) {
    // Log roughly 100 times over the whole run (every epoch if iterations < 100).
    int mod = iterations / 100 == 0 ? 1 : iterations / 100;
    double error = 0;

    for (int t = 0; t < iterations; t++) {
      for (Pair<Matrix, Matrix> trainingDatum : trainingData) {
        neuralNetwork.trainForOneInput(trainingDatum, learningRate);

        // Accumulate this sample's share of the mean squared error.
        double errorAdditionTerm =
            neuralNetwork.getOutputErrorDiff().apply(x -> x * x).sum()
                / trainingData.size();
        error += errorAdditionTerm;
      }
      neuralNetwork.setAverageError(error);

      if ((t == 0) || ((t + 1) % mod == 0)) {
        System.out.println("after " + (t + 1) + " epochs, average error: " + error);
      }

      // Reset for the next epoch and present the samples in a new order.
      error = 0;
      trainingData = MathUtils.shuffle(trainingData);
    }
  }
}
```
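The "average error" printed by the trainer is the squared difference between the network output and the expected output, summed over the output neurons and averaged over all training samples. A small sketch with plain arrays, assuming that reading of the trainer code (the class and method names are made up for the example):

```java
public class EpochErrorSketch {

    // Sum of squared differences over the output neurons of one sample.
    public static double squaredError(double[] predicted, double[] expected) {
        double sum = 0;
        for (int i = 0; i < predicted.length; i++) {
            double diff = predicted[i] - expected[i];
            sum += diff * diff;
        }
        return sum;
    }

    // Mean of the per-sample squared errors across the whole dataset.
    public static double averageError(double[][] predictions, double[][] targets) {
        double error = 0;
        for (int s = 0; s < predictions.length; s++) {
            error += squaredError(predictions[s], targets[s]) / predictions.length;
        }
        return error;
    }

    public static void main(String[] args) {
        double[][] predictions = {{0.9, 0.1}, {0.2, 0.8}};
        double[][] targets = {{1, 0}, {0, 1}};
        // Per-sample errors are about 0.02 and 0.08, so the mean is about 0.05.
        System.out.println(averageError(predictions, targets));
    }
}
```

One caveat: in the trainer the weights change after every sample, so the logged value mixes outputs from slightly different versions of the network within an epoch. It is a running estimate, not an exact evaluation of the final weights.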

This will be called from the main method.

```java
import java.io.IOException;
import java.time.Duration;
import java.time.Instant;
import java.util.List;

public class Main {
  public static void main(String[] args) throws IOException {
    String rootPath = "/Users/satvik.nema/Documents/mnist_dataset/";
    String trainImagesPath = rootPath + "train-images.idx3-ubyte";
    String trainLabelsPath = rootPath + "train-labels.idx1-ubyte";

    List<Pair<Matrix, Matrix>> mnistTrainingData =
        MnistReader.getDataForNN(trainImagesPath, trainLabelsPath, 60000);

    // Two hidden layers of 16 neurons each.
    List<Integer> hiddenLayersNeuronsCount = List.of(16, 16);

    // 784 input rows and 10 output rows, taken from the first sample.
    int inputRows = mnistTrainingData.getFirst().getA().getRows();
    int outputRows = mnistTrainingData.getFirst().getB().getRows();

    MnistTrainer mnistTrainer =
        MnistTrainer.builder()
            .neuralNetwork(
                NNBuilder.create(inputRows, outputRows, hiddenLayersNeuronsCount))
            .iterations(100)
            .learningRate(0.01)
            .build();

    Instant start = Instant.now();
    mnistTrainer.train(mnistTrainingData);
    Instant end = Instant.now();

    // Start first, then end, so the duration is positive.
    long seconds = Duration.between(start, end).getSeconds();
    System.out.println("Time taken for training: " + seconds + "s");
  }
}
```

Notice how we set the 2 hidden layers with 16 neurons each in `hiddenLayersNeuronsCount`.

The MNIST dataset also includes a separate set of 10,000 testing samples. We’ll use them to test how our trained network performs on unseen data.

Starting with a `MnistTester`:
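To get a feel for how small this network is, we can count its trainable parameters. The sketch below assumes fully connected layers with one bias per neuron; `ParameterCountSketch` is a made-up name and is not part of the series' code.

```java
import java.util.List;

public class ParameterCountSketch {

    // Counts weights and biases of a fully connected network:
    // each layer has (previous size * size) weights plus `size` biases.
    public static int countParameters(int inputs, List<Integer> hiddenLayers, int outputs) {
        int total = 0;
        int previous = inputs;
        for (int size : hiddenLayers) {
            total += previous * size + size;
            previous = size;
        }
        total += previous * outputs + outputs;
        return total;
    }

    public static void main(String[] args) {
        // 784*16 + 16 = 12560, 16*16 + 16 = 272, 16*10 + 10 = 170
        System.out.println(countParameters(784, List.of(16, 16), 10)); // 13002
    }
}
```

Almost all of the roughly 13k parameters sit between the input layer and the first hidden layer.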

```java
import java.util.ArrayList;
import java.util.List;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

@Builder
@AllArgsConstructor
@NoArgsConstructor
@Data
public class MnistTester implements NeuralNetworkTester {
  private NeuralNetwork neuralNetwork;

  public double validate(List<Pair<Matrix, Matrix>> testData) {
    double error = 0;
    int countMissed = 0;
    // (index, actual, predicted) for every misclassified sample.
    List<String> missedIndexes = new ArrayList<>();
    int index = 0;

    for (Pair<Matrix, Matrix> testDatum : testData) {
      neuralNetwork.feedforward(testDatum.getA());
      Matrix output = neuralNetwork.getLayerOutputs().getLast();

      // The predicted digit is the row index of the largest output.
      int predicted = output.max().getB()[0];
      int actual = testDatum.getB().max().getB()[0];
      if (predicted != actual) {
        countMissed++;
        missedIndexes.add("(" + index + ", " + actual + ", " + predicted + ")");
      }

      Matrix errorMatrix = output.subtract(testDatum.getB());
      error += errorMatrix.apply(x -> x * x).sum() / testData.size();
      index++;
    }

    System.out.printf("Total: %s, wrong: %s%n", testData.size(), countMissed);
    return error;
  }
}
```
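The prediction step in the tester boils down to an argmax over the 10 output neurons, and the accuracy follows from the number of misses. Here is a minimal sketch with plain arrays instead of the `Matrix` class (`ArgmaxSketch` and its methods are made-up names for illustration):

```java
public class ArgmaxSketch {

    // Returns the index of the largest value, i.e. the predicted digit.
    public static int argmax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    // Accuracy in percent, given the number of misclassified samples.
    public static double accuracy(int total, int wrong) {
        return 100.0 * (total - wrong) / total;
    }

    public static void main(String[] args) {
        double[] output = {0.01, 0.02, 0.05, 0.80, 0.01, 0.03, 0.02, 0.02, 0.03, 0.01};
        System.out.println(argmax(output));       // 3
        System.out.println(accuracy(10000, 613)); // 93.87
    }
}
```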

And running our validation:

```java
String testImagesPath = rootPath + "t10k-images.idx3-ubyte";
String testLabelsPath = rootPath + "t10k-labels.idx1-ubyte";

// -1 loads every sample in the test files.
List<Pair<Matrix, Matrix>> mnistTestingData =
    MnistReader.getDataForNN(testImagesPath, testLabelsPath, -1);

// trainedNetwork is the network that was just trained,
// e.g. mnistTrainer.getNeuralNetwork() (getter generated by Lombok's @Data).
MnistTester mnistTester = MnistTester.builder().neuralNetwork(trainedNetwork).build();
double error = mnistTester.validate(mnistTestingData);
System.out.println("Error: " + error);
```

The accuracy is pretty good:

```
after 1 epochs, average error: 0.6263571412645461
after 100 epochs, average error: 0.07471539583255844
after 200 epochs, average error: 0.060457042431757556
after 300 epochs, average error: 0.052867280710867826
after 400 epochs, average error: 0.04818163691903281
after 500 epochs, average error: 0.04496163434230489
after 600 epochs, average error: 0.04240323875238682
after 700 epochs, average error: 0.04034903547585861
after 800 epochs, average error: 0.03881550591240332
after 900 epochs, average error: 0.037430996099864056
after 1000 epochs, average error: 0.03629978820188779
Time taken for training: 4676s
Testing:
Total: 10000, wrong: 613
```

So out of the 10k testing samples, we got only 613 wrong! That’s an accuracy of ~93.9%. Not so bad for a homegrown neural network, is it?

Let’s look at which samples were wrong.

For instance, this one was supposed to be a 4, which our network classified as 8:

And this one was supposed to be a 5, which the network classified as 3:

Well, these mistakes can be forgiven, can’t they? xD

People have actually achieved an accuracy of over 99% on this dataset. Here are the general benchmarks on this dataset (we break into the top 40).

And this concludes our Neural Networks from scratch series 🙂

