This is the eighth part of the ML series. For your convenience, you can find the other parts in the table of contents in Part 1 – Linear regression in MXNet.
Last time we saw forward propagation in a neural net. Today we are going to extend the process to backpropagate the errors. Let’s begin.
We need to add one more definition with the expected output values:
CREATE TABLE outputs(
outputNode NUMERIC,
outputValue NUMERIC
);
INSERT INTO outputs VALUES
(1,290)
,(2,399)
,(3,505)
;
Before we see some SQL code, let’s do some math. We had three layers (input, hidden, output); in the input and output layers we used a linear activation function, and the hidden layer used ReLU.
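As a cross-check of what the SQL computes, here is a minimal dependency-free Python sketch of the same forward pass. Only the structure (linear input, ReLU hidden, linear output) comes from the post; the layer sizes and the concrete weights below are made up for illustration.

```python
# Forward pass through a 3-layer net: linear input, ReLU hidden, linear output.
# All weights, biases, and sizes here are illustrative, not the post's values.

def relu(x):
    return x if x > 0 else 0.0

def forward(inputs, w1, b1, w2, b2):
    # hidden layer: weighted sum of inputs plus bias, then ReLU
    hidden_in = [sum(w1[i][j] * inputs[i] for i in range(len(inputs))) + b1[j]
                 for j in range(len(b1))]
    hidden_out = [relu(h) for h in hidden_in]
    # output layer: linear activation, so the output equals the weighted sum
    output = [sum(w2[j][k] * hidden_out[j] for j in range(len(hidden_out))) + b2[k]
              for k in range(len(b2))]
    return hidden_in, hidden_out, output

inputs = [1.0, 2.0, 3.0]
w1 = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]   # 3 inputs -> 2 hidden nodes
b1 = [0.1, 0.1]
w2 = [[1.0, -1.0, 0.5], [0.5, 1.0, -0.5]]   # 2 hidden -> 3 outputs
b2 = [0.0, 0.0, 0.0]

hidden_in, hidden_out, output = forward(inputs, w1, b1, w2, b2)
print(output)
```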
We start with calculating the loss function. We use the standard squared error: E = (outputLayerOutput − outputValue)² / 2.
Now let’s calculate the partial derivatives needed to update the weights between the hidden layer and the output layer. Since the output activation is linear (its derivative is 1), the chain rule gives ∂E/∂weight2Value = (outputLayerOutput − outputValue) · hiddenLayerOutput.
Now, the same for the biases: ∂E/∂weight2Bias = outputLayerOutput − outputValue.
That was easy. Now we use a learning rate equal to 0.1 and we can update both the weights and biases between the hidden layer and the output layer. For the weights and biases between the input layer and the hidden layer we apply the chain rule one step further: the error signal is multiplied by weight2Value and by the derivative of ReLU (1 when hiddenLayerInput > 0, otherwise 0).
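The update rule for the hidden-to-output layer can be sanity-checked numerically. The sketch below applies the derivatives from the text — (o − y) · hiddenOut for a weight, (o − y) for a bias — and compares one analytic gradient against a finite difference; all the concrete numbers are illustrative.

```python
# One gradient-descent update for the hidden->output layer, using the
# derivatives from the text: dE/dw2 = (o - y) * hidden_out, dE/db2 = (o - y).
# The concrete numbers are illustrative.

learning_rate = 0.1
hidden_out = [2.0, 3.0]          # outputs of the hidden layer
w2 = [0.5, 0.25]                 # weights into a single output node
b2 = 0.1
target = 4.0

def predict(w, b):
    # linear output activation, so its derivative is 1
    return sum(wi * hi for wi, hi in zip(w, hidden_out)) + b

o = predict(w2, b2)
grad_w = [(o - target) * h for h in hidden_out]
grad_b = o - target

# finite-difference check of the first weight's gradient
eps = 1e-6
def loss(w, b):
    d = predict(w, b) - target
    return d * d / 2

numeric = (loss([w2[0] + eps, w2[1]], b2) - loss([w2[0] - eps, w2[1]], b2)) / (2 * eps)
print(abs(grad_w[0] - numeric) < 1e-4)   # the two gradients should agree

# apply the update
w2 = [w - learning_rate * g for w, g in zip(w2, grad_w)]
b2 = b2 - learning_rate * grad_b
```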
WHEN phase = 6 THEN weight1Value - 0.1 * (SUM(outputLayerOutput - outputValue) OVER (PARTITION BY weight1InputNodeNumber, weight1OutputNodeNumber)) * 1 * weight2Value * (CASE WHEN hiddenLayerInput > 0 THEN 1 ELSE 0 END) * inputLayerOutput
ELSE weight1Value
END AS weight1Value,
CASE
WHEN phase = 6 THEN weight1Bias - 0.1 * (SUM(outputLayerOutput - outputValue) OVER (PARTITION BY weight1InputNodeNumber, weight1OutputNodeNumber)) * 1 * weight2Value * (CASE WHEN hiddenLayerInput > 0 THEN 1 ELSE 0 END) * 1
ELSE weight1Bias
END AS weight1Bias,
CASE
WHEN phase = 1 THEN SUM(weight1Value * inputLayerOutput + weight1Bias) OVER (PARTITION BY weight1OutputNodeNumber, phase) / 3
ELSE hiddenLayerInput
END AS hiddenLayerInput,
CASE
WHEN phase = 2 THEN CASE WHEN hiddenLayerInput > 0 THEN hiddenLayerInput ELSE 0 END
ELSE hiddenLayerOutput
END AS hiddenLayerOutput,
weight2InputNodeNumber,
weight2OutputNodeNumber,
CASE
WHEN phase = 6 THEN weight2Value - 0.1 * (outputLayerOutput - outputValue) * 1 * hiddenLayerOutput
ELSE weight2Value
END AS weight2Value,
CASE
WHEN phase = 6 THEN weight2Bias - 0.1 * (outputLayerOutput - outputValue) * 1 * 1
ELSE weight2Bias
END AS weight2Bias,
CASE
WHEN phase = 3 THEN SUM(weight2Value * hiddenLayerOutput + weight2Bias) OVER (PARTITION BY weight2OutputNodeNumber, phase) / 3
ELSE outputLayerInput
END AS outputLayerInput,
CASE
WHEN phase = 4 THEN outputLayerInput
ELSE outputLayerOutput
END AS outputLayerOutput,
outputNode,
outputValue,
CASE
WHEN phase = 5 THEN (outputLayerOutput - outputValue) * (outputLayerOutput - outputValue) / 2
ELSE errorValue
END AS errorValue,
phase + 1 AS phase
FROM solution
WHERE phase <= 6
)
SELECT DISTINCT *
FROM solution WHERE phase=7
ORDER BY weight1InputNodeNumber,weight1OutputNodeNumber,weight2OutputNodeNumber
It is very similar to the solution from the previous post. This time, in phase 5 we calculate the error, and in phase 6 we update the weights and biases. You can find the results here.
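For readers who prefer procedural code, the recursive query’s phase mechanism can be mimicked with an explicit loop: each iteration fills in one column, exactly as the CASE WHEN phase = n branches do, and phase 6 updates the parameters. This is only a sketch of a single training step for a 1-input / 1-hidden / 1-output slice of the network; the numbers are made up.

```python
# Mimic the recursive CTE: each phase computes one value, later phases
# reuse earlier results, and phase 6 performs the gradient-descent update.
# State for a tiny 1-input / 1-hidden / 1-output slice; values illustrative.
state = {
    "inputLayerOutput": 2.0,
    "weight1Value": 0.5, "weight1Bias": 0.1,
    "weight2Value": 0.8, "weight2Bias": 0.2,
    "outputValue": 3.0,           # expected output
}

for phase in range(1, 7):
    if phase == 1:    # hidden layer input
        state["hiddenLayerInput"] = state["weight1Value"] * state["inputLayerOutput"] + state["weight1Bias"]
    elif phase == 2:  # ReLU activation
        state["hiddenLayerOutput"] = max(state["hiddenLayerInput"], 0.0)
    elif phase == 3:  # output layer input
        state["outputLayerInput"] = state["weight2Value"] * state["hiddenLayerOutput"] + state["weight2Bias"]
    elif phase == 4:  # linear output activation
        state["outputLayerOutput"] = state["outputLayerInput"]
    elif phase == 5:  # squared error
        diff = state["outputLayerOutput"] - state["outputValue"]
        state["errorValue"] = diff * diff / 2
    elif phase == 6:  # updates with learning rate 0.1; the SQL computes all
        # new values from the old ones in one SELECT, so keep the old weight2
        diff = state["outputLayerOutput"] - state["outputValue"]
        relu_grad = 1.0 if state["hiddenLayerInput"] > 0 else 0.0
        w2_old = state["weight2Value"]
        state["weight2Value"] = w2_old - 0.1 * diff * state["hiddenLayerOutput"]
        state["weight2Bias"] -= 0.1 * diff
        state["weight1Value"] -= 0.1 * diff * w2_old * relu_grad * state["inputLayerOutput"]
        state["weight1Bias"] -= 0.1 * diff * w2_old * relu_grad

print(state["errorValue"])
```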