Aborted transactions in Redshift due to Serializable isolation level

I was working with a seemingly simple load job in Redshift. I was generating some data and wanted to store the output in a separate table, just to persist it for debugging purposes. However, since the job could be rerun if needed, I wanted to remove the rows generated previously and add new ones. So I tried the following:

CREATE TABLE locks_tests (
  market INT,
  value INT
);

INSERT INTO locks_tests(market, value) VALUES
  (1, 1),
  (1, 2),
  (2, 1),
  (2, 2)
;

This is our table. Let's assume that the job can generate data either for market 1 or for market 2. The job for market 1 does this:

DELETE FROM locks_tests WHERE market = 1;
INSERT INTO locks_tests VALUES (1, 1), (1,2);

Similarly, the job for market 2 does this:

DELETE FROM locks_tests WHERE market = 2;
INSERT INTO locks_tests VALUES (2, 1), (2,2);

As you can see, both jobs use the same table but access different rows.

Now the thing is, if you run both transactions in parallel, you will most likely get an error saying that the transactions collided and one of them had to be aborted.
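
For instance, an interleaving like the one sketched below (two sessions, one per job) ends with one of the transactions being aborted; the exact timing does not really matter:

-- Session 1 (job for market 1):
BEGIN;
DELETE FROM locks_tests WHERE market = 1;
INSERT INTO locks_tests VALUES (1, 1), (1, 2);

-- Session 2 (job for market 2), started before session 1 commits:
BEGIN;
DELETE FROM locks_tests WHERE market = 2;
INSERT INTO locks_tests VALUES (2, 1), (2, 2);

-- Now both sessions COMMIT; one of them gets aborted due to the serializable isolation level.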

I didn't understand why it works this way, as in my understanding there is nothing wrong here. Either Redshift should take a table lock and make one transaction wait for the other, or it should use some range locks and figure out that the two transactions don't actually conflict.

Unfortunately, it is not that simple. Since Redshift is based on PostgreSQL, we can reproduce the problem there as well. For instance, see dba.stackexchange.

Okay, we know what the problem is, so how do we solve it? I considered locking the tables explicitly with the LOCK statement, but I never found that approach nice. I checked whether it is possible to control the partitions of the table, but that doesn't seem to be allowed in Redshift. I thought about doing so-called "poor man's clustering" with multiple tables joined by a single view, but I didn't want to increase the complexity.
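
For reference, the explicit locking variant would look more or less like the sketch below. It serializes both jobs on the whole table, which is exactly what I wanted to avoid:

BEGIN;
LOCK locks_tests;                               -- exclusive table lock, the other job waits here
DELETE FROM locks_tests WHERE market = 1;
INSERT INTO locks_tests VALUES (1, 1), (1, 2);
COMMIT;                                         -- the lock is released when the transaction ends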

Finally, I decided to go with a simple MVCC-like approach. Add a new column to the table indicating the timestamp of the inserted row:

CREATE TABLE locks_tests (
  market INT,
  value INT,
  insert_time TIMESTAMP
);

Now, when inserting data into the table, include the timestamp:

INSERT INTO locks_tests(market, value, insert_time) VALUES
  (1, 1, getdate())
;
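
The important part is that a rerun of the job no longer issues a DELETE at all, so the two jobs stop conflicting. A rerun for market 1 simply becomes another insert with a fresh timestamp:

INSERT INTO locks_tests(market, value, insert_time) VALUES
  (1, 1, getdate()),
  (1, 2, getdate())
;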

Finally, prepare a view extracting only the latest data:

CREATE OR REPLACE VIEW locks_tests_view AS (
  WITH max_timestamps AS (
    SELECT market, MAX(insert_time) AS max_insert_time
    FROM locks_tests
    GROUP BY market
  )
  SELECT C.*
  FROM locks_tests AS C
  JOIN max_timestamps AS T ON T.market = C.market
  WHERE C.insert_time = T.max_insert_time
);

Now query the view instead of the original table. You can also run a cleanup job periodically to delete the stale rows if needed.
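
Such a cleanup job could be as simple as the sketch below, which drops everything older than the latest batch for each market (written PostgreSQL-style; I would double check the query plan before running it on a big Redshift table):

DELETE FROM locks_tests
WHERE insert_time < (
  SELECT MAX(insert_time)
  FROM locks_tests AS newer
  WHERE newer.market = locks_tests.market
);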

Machine Learning Part 8 — Backpropagation in neural net in SQL

This is the eighth part of the ML series. For your convenience, you can find the other parts in the table of contents in Part 1 – Linear regression in MXNet.

Last time we saw forward propagation in a neural net. Today we are going to extend the process to backpropagate the errors. Let's begin.

We need to add one more definition, the expected (target) outputs:

CREATE TABLE outputs (
  outputNode NUMERIC,
  outputValue NUMERIC
);

INSERT INTO outputs VALUES
    (1, 290)
   ,(2, 399)
   ,(3, 505)
;
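
For reference, the query below also uses the inputs, weights1, and weights2 tables defined in the previous post. Judging from the column names used in the query, their shape is roughly the following (a sketch; the actual values come from the previous post):

CREATE TABLE inputs (
  inputNode NUMERIC,
  inputValue NUMERIC
);

CREATE TABLE weights1 (
  weight1InputNodeNumber NUMERIC,
  weight1OutputNodeNumber NUMERIC,
  weight1Value NUMERIC,
  weight1Bias NUMERIC
);

CREATE TABLE weights2 (
  weight2InputNodeNumber NUMERIC,
  weight2OutputNodeNumber NUMERIC,
  weight2Value NUMERIC,
  weight2Bias NUMERIC
);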

Before we see some SQL code, let's do some math. We had three layers (input, hidden, output); the input and output layers used a linear activation function, and the hidden layer used ReLU.
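
Recall that ReLU and its derivative (which we will need for the chain rule below) are:

    \begin{gather*} \mathrm{ReLU}(x) = \max(0, x) \qquad \mathrm{ReLU}'(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases} \end{gather*}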

We start by calculating the loss function. We use the usual squared error:

    \begin{gather*} Loss = \left[\begin{array}{c} \frac{\left(y^{out}_1 - target_1\right)^2 }{ 2 } \\ \frac{\left(y^{out}_2 - target_2\right)^2 }{ 2 } \\ \frac{\left(y^{out}_3 - target_3\right)^2 }{ 2 } \end{array}\right] \end{gather*}

Now let’s calculate partial derivatives to update weights between hidden layer and output layer:

    \begin{gather*} \left[\begin{array}{ccc} \frac{\partial Loss}{\partial W^2_{1,1}} & \frac{\partial Loss}{\partial W^2_{1,2}} & \frac{\partial Loss}{\partial W^2_{1,3}} \\ \frac{\partial Loss}{\partial W^2_{2,1}} & \frac{\partial Loss}{\partial W^2_{2,2}} & \frac{\partial Loss}{\partial W^2_{2,3}} \end{array}\right] =  \left[\begin{array}{ccc}  \frac{\partial Loss}{\partial y^{out}_1 } \frac{\partial y^{out}_1 }{\partial y^{in}_1} \frac{\partial y^{in}_1}{\partial W^2_{1,1}} & \frac{\partial Loss}{\partial y^{out}_2 } \frac{\partial y^{out}_2 }{\partial y^{in}_2} \frac{\partial y^{in}_2}{\partial W^2_{1,2}} & \frac{\partial Loss}{\partial y^{out}_3 } \frac{\partial y^{out}_3 }{\partial y^{in}_3} \frac{\partial y^{in}_3}{\partial W^2_{1,3}} \\ \frac{\partial Loss}{\partial y^{out}_1 } \frac{\partial y^{out}_1 }{\partial y^{in}_1} \frac{\partial y^{in}_1}{\partial W^2_{2,1}} & \frac{\partial Loss}{\partial y^{out}_2 } \frac{\partial y^{out}_2 }{\partial y^{in}_2} \frac{\partial y^{in}_2}{\partial W^2_{2,2}} & \frac{\partial Loss}{\partial y^{out}_3 } \frac{\partial y^{out}_3 }{\partial y^{in}_3} \frac{\partial y^{in}_3}{\partial W^2_{2,3}} \end{array}\right]  =\\ \left[\begin{array}{ccc}  (y^{out}_1 - target_1) \cdot 1 \cdot h^{out}_1 & (y^{out}_2 - target_2) \cdot 1 \cdot h^{out}_1 & (y^{out}_3 - target_3) \cdot 1 \cdot h^{out}_1 \\ (y^{out}_1 - target_1) \cdot 1 \cdot h^{out}_2 & (y^{out}_2 - target_2) \cdot 1 \cdot h^{out}_2 & (y^{out}_3 - target_3) \cdot 1 \cdot h^{out}_2 \\ \end{array}\right]  \end{gather*}

Now, the same for biases:

    \begin{gather*} \left[\begin{array}{ccc} \frac{\partial Loss}{\partial b^2_{1,1}} & \frac{\partial Loss}{\partial b^2_{1,2}} & \frac{\partial Loss}{\partial b^2_{1,3}} \\ \frac{\partial Loss}{\partial b^2_{2,1}} & \frac{\partial Loss}{\partial b^2_{2,2}} & \frac{\partial Loss}{\partial b^2_{2,3}} \end{array}\right] =  \left[\begin{array}{ccc}  \frac{\partial Loss}{\partial y^{out}_1 } \frac{\partial y^{out}_1 }{\partial y^{in}_1} \frac{\partial y^{in}_1}{\partial b^2_{1,1}} & \frac{\partial Loss}{\partial y^{out}_2 } \frac{\partial y^{out}_2 }{\partial y^{in}_2} \frac{\partial y^{in}_2}{\partial b^2_{1,2}} & \frac{\partial Loss}{\partial y^{out}_3 } \frac{\partial y^{out}_3 }{\partial y^{in}_3} \frac{\partial y^{in}_3}{\partial b^2_{1,3}} \\ \frac{\partial Loss}{\partial y^{out}_1 } \frac{\partial y^{out}_1 }{\partial y^{in}_1} \frac{\partial y^{in}_1}{\partial b^2_{2,1}} & \frac{\partial Loss}{\partial y^{out}_2 } \frac{\partial y^{out}_2 }{\partial y^{in}_2} \frac{\partial y^{in}_2}{\partial b^2_{2,2}} & \frac{\partial Loss}{\partial y^{out}_3 } \frac{\partial y^{out}_3 }{\partial y^{in}_3} \frac{\partial y^{in}_3}{\partial b^2_{2,3}} \end{array}\right]  =\\ \left[\begin{array}{ccc}  (y^{out}_1 - target_1) \cdot 1 \cdot 1 & (y^{out}_2 - target_2) \cdot 1 \cdot 1 & (y^{out}_3 - target_3) \cdot 1 \cdot 1 \\ (y^{out}_1 - target_1) \cdot 1 \cdot 1 & (y^{out}_2 - target_2) \cdot 1 \cdot 1 & (y^{out}_3 - target_3) \cdot 1 \cdot 1 \\ \end{array}\right]  \end{gather*}

That was easy. Now we use a learning rate equal to 0.1 and we can update both the weights and the biases between the hidden layer and the output layer.
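
Written out explicitly, each weight and bias simply takes a gradient descent step with the learning rate plugged in:

    \begin{gather*} W^2_{i,j} \leftarrow W^2_{i,j} - 0.1 \cdot \frac{\partial Loss}{\partial W^2_{i,j}} \qquad b^2_{i,j} \leftarrow b^2_{i,j} - 0.1 \cdot \frac{\partial Loss}{\partial b^2_{i,j}} \end{gather*}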

Similar derivations apply to the other updates. If you are lost, you can find a great explanation here.
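
For example, a weight between the input and hidden layer follows the same chain rule, just with one extra hop through the ReLU. Here is a sketch using the notation above, where x_i denotes the i-th input and the bracket is 1 when the ReLU input is positive and 0 otherwise:

    \begin{gather*} \frac{\partial Loss}{\partial W^1_{i,j}} = \sum_{k} \frac{\partial Loss}{\partial y^{out}_k} \frac{\partial y^{out}_k}{\partial y^{in}_k} \frac{\partial y^{in}_k}{\partial h^{out}_j} \frac{\partial h^{out}_j}{\partial h^{in}_j} \frac{\partial h^{in}_j}{\partial W^1_{i,j}} = \sum_{k} \left(y^{out}_k - target_k\right) \cdot 1 \cdot W^2_{j,k} \cdot \left[h^{in}_j > 0\right] \cdot x_i \end{gather*}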

Let’s now see the code:

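-- Phases: 0 = input layer output, 1 = hidden layer input, 2 = hidden layer output (ReLU),
-- 3 = output layer input, 4 = output layer output, 5 = error, 6 = weight and bias updates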
WITH RECURSIVE currentPhase AS(
	SELECT CAST(0 AS NUMERIC) AS phase
),
oneRow AS(
	SELECT CAST(NULL AS NUMERIC) AS rowValue
),
solution AS (
	SELECT I.*, O1.rowValue AS inputLayerOutput, W1.*, I2.rowValue AS hiddenLayerInput, O2.rowValue AS hiddenLayerOutput, W2.*, I3.rowValue AS outputLayerInput, O3.rowValue AS outputLayerOutput, O.*, E.rowValue AS errorValue, P.*
	FROM inputs AS I
	CROSS JOIN oneRow AS O1
	JOIN weights1 AS W1 ON W1.weight1InputNodeNumber = I.inputNode
	CROSS JOIN oneRow AS I2
	CROSS JOIN oneRow AS O2
	JOIN weights2 AS W2 ON W2.weight2InputNodeNumber = W1.weight1OutputNodeNumber
	CROSS JOIN oneRow AS I3
	CROSS JOIN oneRow AS O3
	JOIN outputs AS O ON O.outputNode = W2.weight2OutputNodeNumber
	CROSS JOIN oneRow AS E
	CROSS JOIN currentPhase AS P

	UNION ALL
	
    SELECT
		inputNode,
		inputValue,

		-- Phase 0: the input layer uses a linear activation, so its output is just the input value
		CASE
			WHEN phase = 0 THEN inputValue
			ELSE inputLayerOutput
		END AS inputLayerOutput,

		weight1InputNodeNumber,
		weight1OutputNodeNumber,
		
		-- Phase 6: gradient descent step on the input->hidden weights
		CASE
			WHEN phase = 6 THEN weight1Value - 0.1 * (SUM(outputLayerOutput - outputValue) OVER (PARTITION BY weight1InputNodeNumber, weight1OutputNodeNumber)) * 1 * weight2Value * (CASE WHEN hiddenLayerInput > 0 THEN 1 ELSE 0 END) * inputLayerOutput
			ELSE weight1Value
		END AS weight1Value,

		-- Phase 6: gradient descent step on the input->hidden biases
		CASE
			WHEN phase = 6 THEN weight1Bias - 0.1 * (SUM(outputLayerOutput - outputValue) OVER (PARTITION BY weight1InputNodeNumber, weight1OutputNodeNumber)) * 1 * weight2Value * (CASE WHEN hiddenLayerInput > 0 THEN 1 ELSE 0 END) * 1
			ELSE weight1Bias
		END AS weight1Bias,

		-- Phase 1: hidden layer input = weighted sum of the inputs plus bias
		CASE
			WHEN phase = 1 THEN SUM(weight1Value * inputLayerOutput + weight1Bias) OVER (PARTITION BY weight1OutputNodeNumber, phase) / 3
			ELSE hiddenLayerInput
		END AS hiddenLayerInput,

		-- Phase 2: hidden layer output = ReLU applied to the hidden layer input
		CASE
			WHEN phase = 2 THEN CASE WHEN hiddenLayerInput > 0 THEN hiddenLayerInput ELSE 0 END
			ELSE hiddenLayerOutput
		END AS hiddenLayerOutput,

		weight2InputNodeNumber,
		weight2OutputNodeNumber,
		
		-- Phase 6: gradient descent step on the hidden->output weights
		CASE
			WHEN phase = 6 THEN weight2Value - 0.1 * (outputLayerOutput - outputValue) * 1 * hiddenLayerOutput
			ELSE weight2Value
		END AS weight2Value,

		-- Phase 6: gradient descent step on the hidden->output biases
		CASE
			WHEN phase = 6 THEN weight2Bias - 0.1 * (outputLayerOutput - outputValue) * 1 * 1
			ELSE weight2Bias
		END AS weight2Bias,

		-- Phase 3: output layer input = weighted sum of the hidden outputs plus bias
		CASE
			WHEN phase = 3 THEN SUM(weight2Value * hiddenLayerOutput + weight2Bias) OVER (PARTITION BY weight2OutputNodeNumber, phase) / 3
			ELSE outputLayerInput
		END AS outputLayerInput,

		-- Phase 4: the output layer uses a linear activation, so output = input
		CASE
			WHEN phase = 4 THEN outputLayerInput
			ELSE outputLayerOutput
		END AS outputLayerOutput,

		outputNode,
		outputValue,

		-- Phase 5: squared error for each output node
		CASE
			WHEN phase = 5 THEN (outputLayerOutput - outputValue) * (outputLayerOutput - outputValue) / 2
			ELSE errorValue
		END AS errorValue,

		phase + 1 AS phase

	FROM solution
	WHERE phase <= 6
)
SELECT DISTINCT *
FROM solution WHERE phase = 7
ORDER BY weight1InputNodeNumber, weight1OutputNodeNumber, weight2OutputNodeNumber

It is very similar to the solution from the previous post. This time, in phase 5 we calculate the error, and in phase 6 we update the weights and biases. You can find the results here.
