This is the eighth part of the ML series. For your convenience, you can find the other parts in the table of contents in Part 1 – Linear regression in MXNet.
Last time we saw forward propagation in a neural net. Today we are going to extend the process to backpropagate the errors. Let’s begin.
To calculate the output error we need one more definition, a table holding the expected value for each output node:
```sql
-- Expected value for each output node
CREATE TABLE outputs (
    outputNode NUMERIC,
    outputValue NUMERIC
);

INSERT INTO outputs VALUES
    (1, 290)
    ,(2, 399)
    ,(3, 505)
;
```
Before we see some SQL code, let’s do some math. We had three layers (input, hidden, output). The input and output layers used a linear activation function, and the hidden layer used ReLU.
We start by calculating the loss function. We use the ordinary squared error:
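Judging by the errorValue expression in phase 5 of the query further down, the loss can be sketched as follows. The symbols are notation assumed here for illustration: $y_k$ is the output of output node $k$ (outputLayerOutput) and $t_k$ is its expected value (outputValue).

```latex
\[
E_k = \frac{1}{2}\,(y_k - t_k)^2, \qquad E = \sum_{k} E_k
\]
```

The factor $\frac{1}{2}$ is a common convenience: it cancels the 2 that appears when the square is differentiated.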
Now let’s calculate the partial derivatives needed to update the weights between the hidden layer and the output layer:
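A sketch of this derivative, consistent with the phase 6 expression for weight2Value in the query further down. The notation is assumed here: $E = \frac{1}{2}\sum_k (y_k - t_k)^2$ is the squared error, $h_j$ is the output of hidden node $j$ (hiddenLayerOutput), $w_{jk}$ and $b_{jk}$ are the weight and bias stored for the connection from hidden node $j$ to output node $k$, $z_k$ is the input of output node $k$, $y_k$ is its output, and $t_k$ is its expected value.

```latex
\[
z_k = \sum_{j} \left( w_{jk}\, h_j + b_{jk} \right), \qquad y_k = z_k
\]
\[
\frac{\partial E}{\partial w_{jk}}
  = \frac{\partial E}{\partial y_k} \cdot \frac{\partial y_k}{\partial z_k} \cdot \frac{\partial z_k}{\partial w_{jk}}
  = (y_k - t_k) \cdot 1 \cdot h_j
\]
```

The middle factor is 1 because the output layer uses a linear activation, which is presumably why the query multiplies by a literal 1.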
Now, the same for biases:
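A sketch of the bias derivative under the same chain rule. The notation is assumed here: $E = \frac{1}{2}\sum_k (y_k - t_k)^2$, $y_k = z_k = \sum_j (w_{jk} h_j + b_{jk})$ is the output of output node $k$, $t_k$ is its expected value, and $b_{jk}$ is the bias stored for the connection from hidden node $j$ to output node $k$ (weight2Bias).

```latex
\[
\frac{\partial E}{\partial b_{jk}}
  = \frac{\partial E}{\partial y_k} \cdot \frac{\partial y_k}{\partial z_k} \cdot \frac{\partial z_k}{\partial b_{jk}}
  = (y_k - t_k) \cdot 1 \cdot 1
\]
```

The only difference from the weight case is the last factor: $z_k$ depends on $b_{jk}$ with coefficient 1 instead of $h_j$.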
That was easy. Now we use a learning rate equal to 0.1 (the constant that appears in the code below), and we can update both the weights and the biases between the hidden layer and the output layer.
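The update step can be sketched as plain gradient descent. The symbols are assumed here: $\eta$ is the learning rate (the constant 0.1 in the query further down), $y_k$ and $t_k$ are the actual and expected outputs of output node $k$, $h_j$ is the output of hidden node $j$, and $w_{jk}$, $b_{jk}$ are the weight and bias on the connection between them.

```latex
\[
w_{jk} \leftarrow w_{jk} - \eta\,(y_k - t_k)\,h_j, \qquad
b_{jk} \leftarrow b_{jk} - \eta\,(y_k - t_k), \qquad
\eta = 0.1
\]
```

Each parameter moves against its own partial derivative, so a node whose output is too high gets its incoming weights and biases reduced.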
The remaining updates follow the same pattern. If you are lost, you can find a great explanation here.
Let’s now see the code:
```sql
WITH RECURSIVE currentPhase AS (
    SELECT CAST(0 AS NUMERIC) AS phase
),
oneRow AS (
    SELECT CAST(NULL AS NUMERIC) AS rowValue
),
solution AS (
    -- Anchor: one row per (input, hidden, output) node combination; computed columns start as NULL
    SELECT
        I.*,
        O1.rowValue AS inputLayerOutput,
        W1.*,
        I2.rowValue AS hiddenLayerInput,
        O2.rowValue AS hiddenLayerOutput,
        W2.*,
        I3.rowValue AS outputLayerInput,
        O3.rowValue AS outputLayerOutput,
        O.*,
        E.rowValue AS errorValue,
        P.*
    FROM inputs AS I
    CROSS JOIN oneRow AS O1
    JOIN weights1 AS W1 ON W1.weight1InputNodeNumber = I.inputNode
    CROSS JOIN oneRow AS I2
    CROSS JOIN oneRow AS O2
    JOIN weights2 AS W2 ON W2.weight2InputNodeNumber = W1.weight1OutputNodeNumber
    CROSS JOIN oneRow AS I3
    CROSS JOIN oneRow AS O3
    JOIN outputs AS O ON O.outputNode = W2.weight2OutputNodeNumber
    CROSS JOIN oneRow AS E
    CROSS JOIN currentPhase AS P

    UNION ALL

    -- Each recursion step runs one phase and carries every other column over unchanged
    SELECT
        inputNode,
        inputValue,
        -- Phase 0: linear activation in the input layer
        CASE
            WHEN phase = 0 THEN inputValue
            ELSE inputLayerOutput
        END AS inputLayerOutput,
        weight1InputNodeNumber,
        weight1OutputNodeNumber,
        -- Phase 6: update weights between the input and hidden layers (learning rate 0.1)
        CASE
            WHEN phase = 6 THEN weight1Value
                - 0.1
                * (SUM(outputLayerOutput - outputValue) OVER (PARTITION BY weight1InputNodeNumber, weight1OutputNodeNumber))
                * 1
                * weight2Value
                * (CASE WHEN hiddenLayerInput > 0 THEN 1 ELSE 0 END) -- ReLU derivative
                * inputLayerOutput
            ELSE weight1Value
        END AS weight1Value,
        -- Phase 6: update biases between the input and hidden layers
        CASE
            WHEN phase = 6 THEN weight1Bias
                - 0.1
                * (SUM(outputLayerOutput - outputValue) OVER (PARTITION BY weight1InputNodeNumber, weight1OutputNodeNumber))
                * 1
                * weight2Value
                * (CASE WHEN hiddenLayerInput > 0 THEN 1 ELSE 0 END) -- ReLU derivative
                * 1
            ELSE weight1Bias
        END AS weight1Bias,
        -- Phase 1: input of the hidden layer
        CASE
            WHEN phase = 1 THEN SUM(weight1Value * inputLayerOutput + weight1Bias) OVER (PARTITION BY weight1OutputNodeNumber, phase) / 3
            ELSE hiddenLayerInput
        END AS hiddenLayerInput,
        -- Phase 2: ReLU activation in the hidden layer
        CASE
            WHEN phase = 2 THEN CASE WHEN hiddenLayerInput > 0 THEN hiddenLayerInput ELSE 0 END
            ELSE hiddenLayerOutput
        END AS hiddenLayerOutput,
        weight2InputNodeNumber,
        weight2OutputNodeNumber,
        -- Phase 6: update weights between the hidden and output layers
        CASE
            WHEN phase = 6 THEN weight2Value - 0.1 * (outputLayerOutput - outputValue) * 1 * hiddenLayerOutput
            ELSE weight2Value
        END AS weight2Value,
        -- Phase 6: update biases between the hidden and output layers
        CASE
            WHEN phase = 6 THEN weight2Bias - 0.1 * (outputLayerOutput - outputValue) * 1 * 1
            ELSE weight2Bias
        END AS weight2Bias,
        -- Phase 3: input of the output layer
        CASE
            WHEN phase = 3 THEN SUM(weight2Value * hiddenLayerOutput + weight2Bias) OVER (PARTITION BY weight2OutputNodeNumber, phase) / 3
            ELSE outputLayerInput
        END AS outputLayerInput,
        -- Phase 4: linear activation in the output layer
        CASE
            WHEN phase = 4 THEN outputLayerInput
            ELSE outputLayerOutput
        END AS outputLayerOutput,
        outputNode,
        outputValue,
        -- Phase 5: squared error per output node
        CASE
            WHEN phase = 5 THEN (outputLayerOutput - outputValue) * (outputLayerOutput - outputValue) / 2
            ELSE errorValue
        END AS errorValue,
        phase + 1 AS phase
    FROM solution
    WHERE phase <= 6
)
SELECT DISTINCT *
FROM solution
WHERE phase = 7
ORDER BY weight1InputNodeNumber, weight1OutputNodeNumber, weight2OutputNodeNumber
```
It is very similar to the solution from the previous post. This time we calculate the error in phase 5 and update the weights and biases in phase 6. You can find results here.