This is the seventh part of the ML series. For your convenience you can find other parts in the table of contents in Part 1 – Linear regression in MXNet
Today we are going to create a neural net and calculate forward propagation using PostgreSQL. Let’s go.
We start with definition of the network: we will have input layer, hidden layer, and output layer. Input layer will have 3 nodes, hidden layer will have 2, output layer will have 3. In input layer we don’t do any transformation on the input data, in hidden layer we use ReLU, in output layer we use linear activation function (so no transformation).
Let’s start with the following definitions:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 |
DROP TABLE IF EXISTS inputs; DROP TABLE IF EXISTS weights1; DROP TABLE IF EXISTS weights2; DROP TABLE IF EXISTS biases; CREATE TABLE inputs ( inputNode NUMERIC, inputValue NUMERIC ); INSERT INTO inputs VALUES (1, 1) ,(2, 3) ,(3, 5) ; CREATE TABLE weights1 ( weight1InputNodeNumber NUMERIC, weight1OutputNodeNumber NUMERIC, weight1Value NUMERIC, weight1Bias NUMERIC ); INSERT INTO weights1 VALUES (1, 1, 2, 1) ,(1, 2, 3, 1) ,(2, 1, 4, 2) ,(2, 2, 5, 2) ,(3, 1, 6, 3) ,(3, 2, 7, 3) ; CREATE TABLE weights2 ( weight2InputNodeNumber NUMERIC, weight2OutputNodeNumber NUMERIC, weight2Value NUMERIC, weight2Bias NUMERIC ); INSERT INTO weights2 VALUES (1, 1, 1, 2) ,(1, 2, 2, 2) ,(1, 3, 3, 2) ,(2, 1, 4, 3) ,(2, 2, 5, 3) ,(2, 3, 6, 3) ; |
We define some input values, weights and biases. Values are completely made up and do not make a difference.
Before we write SQL code, let’s calculate result manually.
We have the following variables:
Now, let’s calculate input for hidden layer:
Now, we use ReLU activation function for hidden layer:
We carry on with calculating input for output layer:
Activation function for output layer is linear, so it is easy now:
We will calculate errors next time.
Now, let’s calculate the result:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 |
WITH RECURSIVE currentPhase AS( SELECT CAST(0 AS NUMERIC) AS phase ), oneRow AS( SELECT CAST(NULL AS NUMERIC) AS rowValue ), solution AS ( SELECT I.*, O1.rowValue AS inputLayerOutput, W1.*, I2.rowValue AS hiddenLayerInput, O2.rowValue AS hiddenLayerOutput, W2.*, I3.rowValue AS outputLayerInput, O3.rowValue AS outputLayerOutput, P.* FROM inputs AS I CROSS JOIN oneRow AS O1 JOIN weights1 AS W1 ON W1.weight1InputNodeNumber = I.inputNode CROSS JOIN oneRow AS I2 CROSS JOIN oneRow AS O2 JOIN weights2 AS W2 ON W2.weight2InputNodeNumber = W1.weight1OutputNodeNumber CROSS JOIN oneRow AS I3 CROSS JOIN oneRow AS O3 CROSS JOIN currentPhase AS P UNION ALL SELECT inputNode, inputValue, CASE WHEN phase = 0 THEN inputValue ELSE inputLayerOutput END AS inputLayerOutput, weight1InputNodeNumber, weight1OutputNodeNumber, weight1Value, weight1Bias, CASE WHEN phase = 1 THEN SUM(weight1Value * inputLayerOutput + weight1Bias) OVER (PARTITION BY weight1OutputNodeNumber, phase) / 3 ELSE hiddenLayerInput END AS hiddenLayerInput, CASE WHEN phase = 2 THEN CASE WHEN hiddenLayerInput > 0 THEN hiddenLayerInput ELSE 0 END ELSE hiddenLayerOutput END AS hiddenLayerOutput, weight2InputNodeNumber, weight2OutputNodeNumber, weight2Value, weight2Bias, CASE WHEN phase = 3 THEN SUM(weight2Value * hiddenLayerOutput + weight2Bias) OVER (PARTITION BY weight2OutputNodeNumber, phase) / 3 ELSE outputLayerInput END AS outputLayerInput, CASE WHEN phase = 4 THEN outputLayerInput ELSE outputLayerOutput END AS outputLayerOutput, phase + 1 AS phase FROM solution WHERE phase <= 4 ) SELECT DISTINCT weight2OutputNodeNumber, outputLayerOutput FROM solution WHERE phase = 5 |
This is actually very easy. We divide the process into multiple phases. Each row of CTE represents one complete path from some input node to some output node. Initially row carries some metadata and input value, in each phase we fill some next value using different case expressions.
In phase 0 we get the input and transform it into output, since input layer has no logic, we just copy the value.
In phase 1 we calculate inputs for next layer by multiplying weights and values.
In phase 2 we activate hidden layer. Since we use ReLU, we perform a very simple comparison.
In phase 3 we once again use weights and values to calculate input for next layer, this time we use different weights.
In phase 4 we activate output layer, which just copies values (since we use a linear activation function).
So in our query we start by defining a schema. We simply join all tables and cross join dummy table with one row which we use to define additional column. We fill these columns later throughout the process.
In recursive part of CTE we simply either rewrite values or do some logic depending on the phase number.
Next time we will see how to backpropagate errors.