Today was a happy milestone! The code is far from finished, but I did my first cut of the back propagation updating the weights.
To check this works, my code prints out lots of info on internal variables, like the weights, the outputs of the hidden layer, and the final output. I used the overall network error - the difference between the training target and the actual output of the network - to see if it got smaller with each iteration of the back propagation (each iteration is called an epoch).
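As a rough sketch of the kind of check I mean (illustrative names only, not the actual project code - the real version updates the weights via back propagation rather than the dummy step here):

```python
import numpy as np

# illustrative sketch only - made-up names, not the real project code
target = np.array([1.0])     # training target
output = np.array([0.4])     # network's actual output (dummy starting value)

for epoch in range(5):
    error = target - output  # overall network error
    print(f"epoch {epoch}: error = {error}")
    # the real code updates the weights via back propagation here;
    # this dummy step just nudges the output towards the target
    output = output + 0.3 * error
```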
I was delighted to see it did:
The code is still early, and the above example used a very simplistic training set (all 1s as input and a 1 as output for a 3-2-1 node network), but the fact that the overall error falls with each training epoch is a very good sign.
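For the record, that simplistic example looks roughly like this (a sketch - the real code may organise things differently):

```python
import numpy as np

# sketch of the 3-2-1 example above (illustrative, not the real project code)
inputs = np.array([[1.0], [1.0], [1.0]])   # all 1s as input (3 input nodes)
target = np.array([[1.0]])                 # a 1 as the training target (1 output node)

# random initial weights: 2x3 for input->hidden, 1x2 for hidden->output
w_input_hidden  = np.random.rand(2, 3) - 0.5
w_hidden_output = np.random.rand(1, 2) - 0.5
```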
Onwards and upwards.
Next steps ... I'm worried my code doesn't handle the division in the weight update equation well if the denominator is small or zero:
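Whatever form that equation finally takes, one simple safeguard I'm considering is to clamp the denominator away from zero before dividing - a rough sketch, not the final code:

```python
import numpy as np

# rough sketch of a guard against a small or zero denominator (not the final code)
EPSILON = 1e-9

def safe_divide(numerator, denominator):
    # replace denominators that are effectively zero with a tiny value,
    # keeping the original sign, so the division can't blow up to infinity
    denominator = np.where(np.abs(denominator) < EPSILON,
                           np.copysign(EPSILON, denominator),
                           denominator)
    return numerator / denominator
```

This only stops the division returning infinity; if the denominator really is tiny the resulting update is still huge, so I may need to rethink that part of the equation itself.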