Thursday, 14 April 2016

Busting Past 98% Accuracy

I was working on a document that I expected would take a few hours ... so I thought, why not run a longer neural network training session in the meantime to see if I could break past 98% accuracy.

The neural network architecture and training were boosted to:
  • 300 hidden layer nodes
  • 30 training epochs
  • training images rotated +/- 10 degrees.
That took about 3 hours on my laptop!
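The rotation trick is easy to reproduce. The code published with the book uses scipy.ndimage to rotate each image, but a self-contained plain-numpy sketch of the same idea looks roughly like this (nearest-neighbour sampling; the function names are mine, and 28x28 is the standard MNIST image shape):

```python
import numpy as np

def rotate_image(img, angle_deg):
    """Rotate a 2-D image about its centre using nearest-neighbour
    sampling; pixels mapped from outside the image become 0.0."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = np.radians(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    ys, xs = np.mgrid[0:h, 0:w]
    # inverse mapping: for each output pixel, find its source pixel
    sx = c * (xs - cx) + s * (ys - cy) + cx
    sy = -s * (xs - cx) + c * (ys - cy) + cy
    sxi, syi = np.rint(sx).astype(int), np.rint(sy).astype(int)
    out = np.zeros_like(img)
    ok = (sxi >= 0) & (sxi < w) & (syi >= 0) & (syi < h)
    out[ys[ok], xs[ok]] = img[syi[ok], sxi[ok]]
    return out

def augmented(record):
    """One flat 784-pixel MNIST record -> original plus +/-10 degree copies."""
    img = np.asarray(record, dtype=float).reshape(28, 28)
    return [img.reshape(784),
            rotate_image(img, 10.0).reshape(784),
            rotate_image(img, -10.0).reshape(784)]
```

Each training record then yields three training examples instead of one, which is why the training run takes so much longer.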

The resulting performance did indeed break the previous record, at 98.03%.



  1. I recently coded up the neural network in your book with one hidden layer. Then, as a challenge, I coded a net that allows the user to define any number of layers with any number of nodes. It seems to work, and I can repeat the results you get when you use just one hidden layer. I want to give my net a more robust challenge though! I would be interested in trying to repeat your results above. Could you let me know the full details of the architecture you used? Regards, John

    1. hi there - great job on adding in additional layers in a user-definable way.

      the architecture is simple: 300 hidden nodes, 30 training epochs on the full MNIST data, with each input image rotated +/- 10 degrees to create new training data

      the code for this result is on GitHub along with the rest of the examples:

      that code will tell you the details like how we selected starting random weights, and the learning rate

    2. and let me know how you get on with more difficult challenges like recognising faces .. which deeper networks are better at

    3. sorry that link should have been

  2. Thanks for the advice, I'll try to replicate your work with my code. I'd like to have a go at the face recognition problem as well. Are there any training data sets online I can use for this?
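John's user-definable-layers idea, together with the weight-initialisation scheme mentioned in the replies above, can be sketched roughly like this. This is a hedged illustration, not John's or the book's actual code: the class name and layer sizes are placeholders, and the GitHub repository has the real details such as the learning rate. Initial weights here are drawn from a normal distribution centred on zero with standard deviation 1/sqrt(incoming links), one common choice:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class FlexibleNet:
    """Feed-forward network with any number of layers,
    e.g. sizes = [784, 300, 10] for the single-hidden-layer setup above."""
    def __init__(self, sizes):
        # one weight matrix per adjacent pair of layers,
        # sampled from N(0, 1/sqrt(incoming links))
        self.weights = [np.random.normal(0.0, n_in ** -0.5, (n_out, n_in))
                        for n_in, n_out in zip(sizes[:-1], sizes[1:])]

    def query(self, inputs):
        # propagate the signal through every layer in turn
        a = np.array(inputs, ndmin=2).T
        for w in self.weights:
            a = sigmoid(w @ a)
        return a
```

Adding entries to the sizes list adds layers, e.g. `FlexibleNet([784, 300, 100, 10])` for two hidden layers.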