Sunday, 1 January 2017

Errata #4 .. Lots of Updates

I've been lucky to have readers who take the time to provide feedback, error fixes, and suggestions for things that could be made clearer.

I am really pleased that this happens - it means people are interested, that they care, and want to share their insights.

A few suggestions had built up over recent weeks, and I've now updated the content. This is a bigger update than normal.


Thanks

Thanks go to Prof A Abu-Hanna, "His Divine Shadow", Andy, Joshua, Luther, ... and many others who provided valuable ideas and fixes for errors, including in the blog comment sections.


Key Updates

Some of the key updates worth mentioning are:
  • An error in the calculus introduction appendix, in the example explaining how to differentiate $s = t^3$. The second line of working out on page 204 shows $\frac{6 t^2 \Delta  x + 4 \Delta x^3}{2\Delta x}$, which should be $\frac{6 t^2 \Delta  x + 2 \Delta x^3}{2\Delta x}$ - that 4 should be a 2 (see the worked expansion after this list).
  • Another error in the calculus appendix section on functions of functions ... showed $(x^2 + x)$ which should have been $(x^3 + x)$.
  • Small error on page 65 where $w_{3,1}$ is said to be 0.1 when it should be 0.4. 
  • Page 99 shows the summarised update expression as $\Delta{w_{jk}} = \alpha \cdot sigmoid(O_k) \cdot (1 - sigmoid(O_k)) \cdot O_j^T$ .. it should have been the much simpler expression shown below.
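
The point of that fix is that $O_k$ is already the output of a sigmoid, so the sigmoid doesn't need applying again. Writing $E_k$ for the output error, the corrected expression should look something like

$$\Delta w_{jk} = \alpha \cdot E_k \cdot O_k \cdot (1 - O_k) \cdot O_j^T$$

which is the same update the Python code in the book performs.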
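
And for the calculus fix in the first bullet, here's where the 2 comes from. Assuming the gradient is taken as the slope between the points $t - \Delta x$ and $t + \Delta x$ (which is what the $2\Delta x$ in the denominator suggests), the working expands as

$$\frac{(t + \Delta x)^3 - (t - \Delta x)^3}{2\Delta x} = \frac{6 t^2 \Delta x + 2 \Delta x^3}{2\Delta x} = 3t^2 + \Delta x^2$$

The two $\Delta x^3$ terms add to give $2 \Delta x^3$, not $4 \Delta x^3$, and as $\Delta x$ shrinks to zero the gradient approaches the expected $3t^2$.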




Worked Examples Using Output Errors - Updated!

A few readers noticed that the error value used in the example illustrating the weight update process is not realistic.

Why? How? Here is an example diagram used in the book.


The output error from the first output layer node (top right) is shown as 1.5. Since the output of that node comes from a sigmoid function, it must be between 0 and 1 (not including 0 or 1 themselves). The target values must also be within this range. That means the error .. the difference between the target and actual values .. can't be as large as 1.5. At the very worst the error can't be bigger than 0.99999... That's why $e_1 = 1.5$ is unrealistic.
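
As a quick illustration - a rough sketch using numpy rather than code from the book - the sigmoid output, and therefore the target minus actual difference, can never be anywhere near 1.5:

```python
import numpy as np

def sigmoid(x):
    # logistic function - the output is always strictly between 0 and 1
    return 1.0 / (1.0 + np.exp(-x))

# even for large positive or negative inputs the output never quite
# reaches 0 or 1
node_inputs = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
node_outputs = sigmoid(node_inputs)
print(node_outputs)

# targets also sit inside (0, 1) - using 0.01 and 0.99 as illustrative
# values - so the error (target - actual) always has a magnitude below 1,
# never anything like 1.5
targets = np.array([0.01, 0.99])
errors = targets[:, None] - node_outputs[None, :]
print(np.max(np.abs(errors)))   # strictly less than 1.0
```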

The calculations illustrating how we do backpropagation are still ok. The error values were chosen at random ... but it would be better if we had chosen a more realistic error.

The examples in the book have been updated with a new output error of 0.8.


Updated Book

The book will be updated with these fixes as soon as the Appendix on how to run the neural networks and the MNIST challenge on the Raspberry Pi Zero is updated too - the Raspbian software has seen quite a few updates and probably doesn't need the workarounds described there.