ILP Part 35 — Approximation

This is the thirty-fifth part of the ILP series. For your convenience, you can find other parts in the table of contents in Part 1 – Boolean algebra

Last time we saw how to implement special ordered sets in ILP. Today we are going to use this knowledge to implement piecewise linear approximation in one and two dimensions. Let’s start.

Idea

Let’s take a simple non-linear function f(x) = x^2. We already know how to multiply numbers in ILP, but representing x^2 this way might be computationally expensive, so we would like to avoid such a formulation. However, if we don’t need to be 100% precise and we know the possible range of x, it might be better to approximate the function. Let’s assume that we have these points:

    \begin{gather*} x_0, x_1, \ldots, x_n \in \mathbb{R} \\ x_0 \le x_1 \le \ldots \le x_n \end{gather*}

we are now able to calculate the following values:

    \begin{gather*} y_0 = f(x_0) = x_0^2 \\ y_1 = f(x_1) = x_1^2 \\ \cdots \\ y_n = f(x_n) = x_n^2 \end{gather*}

So for every known argument we simply calculate the value of f. If all x_i are integers this looks trivial; however, if the x_i are reals and f is a bit more complicated, then precalculating the function values might greatly simplify the ILP model for our problem.
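As a quick sketch (the breakpoint grid and the helper name below are arbitrary choices made for this illustration, not anything the model dictates), the precalculation happens entirely outside of ILP:

    # Precompute breakpoints of f(x) = x^2 outside the ILP model.
    # The grid of x values is an arbitrary choice made for this example.
    def breakpoints(f, xs):
        """Return the (x_i, y_i = f(x_i)) pairs used by the approximation."""
        return [(x, f(x)) for x in xs]

    points = breakpoints(lambda x: x * x, [0, 1, 2, 2.5])
    # points == [(0, 0), (1, 1), (2, 4), (2.5, 6.25)]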

Having all these points, we declare a variable representing the actual argument:

    \begin{gather*} x \in \mathbb{R} \\ x_0 \le x \le x_n \end{gather*}

So we define x as a real number in the range [x_0; x_n], which we will use as an argument of the function f. However, instead of calculating x^2 directly, we will approximate it using linear functions on the segments [x_0; x_1], [x_1; x_2], \ldots, [x_{n-1}; x_n].

1D Approximation

We will use the following trick: for the breakpoints of the linear segments we define “weights” and use them to approximate the function. They are defined in the following manner:

    \begin{gather*} \lambda_0, \lambda_1, \ldots, \lambda_n \in \mathbb{R} \\ 0 \le \lambda_0, \lambda_1, \ldots, \lambda_n \le 1 \\ x = x_0\lambda_0 + x_1 \lambda_1 + \cdots + x_n \lambda_n \\ y = y_0 \lambda_0 + y_1 \lambda_1 + \cdots + y_n \lambda_n \\ \lambda_0 + \lambda_1 + \cdots + \lambda_n = 1 \\ \Lambda = \{\lambda_0, \lambda_1, \ldots, \lambda_n \} \\ \Lambda \text{ is a SOS2 } \end{gather*}

The introduced lambda variables guarantee that x will lie in one of the segments [x_0; x_1], \ldots, [x_{n-1}; x_n] thanks to the SOS2 constraint. Without this stipulation, x could be expressed as a combination of non-adjacent breakpoints spanning multiple segments. The same goes for the y variable: thanks to the lambda variables we approximate f correctly.
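To make this concrete, here is a minimal sketch in Python with PuLP. The helper names (add_sos2, approximate_1d) are my own, and the SOS2 condition is emulated with binary segment-selection variables in the spirit of the previous part of the series; a solver with native SOS2 support could declare the set directly instead.

    import pulp

    def add_sos2(model, lambdas, name):
        """Emulate SOS2 with binary segment-selection variables:
        lambda_i may be non-zero only if segment i-1 or segment i is selected."""
        n = len(lambdas) - 1
        seg = [pulp.LpVariable(f"{name}_seg_{s}", cat="Binary") for s in range(n)]
        model += pulp.lpSum(seg) == 1
        for i, lam in enumerate(lambdas):
            adjacent = [seg[s] for s in (i - 1, i) if 0 <= s < n]
            model += lam <= pulp.lpSum(adjacent)

    def approximate_1d(model, points, name="f"):
        """Add x and y variables with y being the piecewise linear
        approximation of f over the given (x_i, y_i) breakpoints."""
        lams = [pulp.LpVariable(f"{name}_lam_{i}", lowBound=0, upBound=1)
                for i in range(len(points))]
        x = pulp.LpVariable(f"{name}_x")
        y = pulp.LpVariable(f"{name}_y")
        model += pulp.lpSum(lams) == 1
        model += x == pulp.lpSum(px * l for (px, _), l in zip(points, lams))
        model += y == pulp.lpSum(py * l for (_, py), l in zip(points, lams))
        add_sos2(model, lams, name)
        return x, y

The returned y can then be used anywhere in the model where we would otherwise need f(x).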

Let’s consider an example: x_0 = 0, x_1 = 1, x_2 = 2, x_3 = 2.5 with the function f(x) = x^2 and values y_0 = 0, y_1 = 1, y_2 = 4, y_3 = 6.25. We have the following relations:

    \begin{gather*} x = 0\lambda_0 + 1\lambda_1 + 2\lambda_2 + 2.5\lambda_3\\ y = 0\lambda_0 + 1\lambda_1 + 4\lambda_2 + 6.25\lambda_3 \\ \lambda_0 + \lambda_1 + \lambda_2 + \lambda_3 = 1\\ \Lambda \text{ is a SOS2 } \end{gather*}

Now imagine that \lambda_1 = 0.5 and \lambda_2 = 0.5. We get x = 0.5 + 1 = 1.5 and y = 0.5 + 2 = 2.5, while the exact value is 1.5^2 = 2.25, so the error is only 0.25, which is quite a good approximation. However, if x happens to fall exactly on a breakpoint (which we can force by making the \lambda_i binary variables), we get the exact value.
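Just to double-check the arithmetic of this example, here is a tiny stand-alone snippet (no solver needed) evaluating the convex combination:

    # Plain-Python check of the worked example: lambda_1 = lambda_2 = 0.5.
    xs = [0, 1, 2, 2.5]
    ys = [0, 1, 4, 6.25]
    lams = [0, 0.5, 0.5, 0]

    x = sum(l * v for l, v in zip(lams, xs))   # 1.5
    y = sum(l * v for l, v in zip(lams, ys))   # 2.5, while 1.5 ** 2 == 2.25
    print(x, y, x ** 2)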

The approximated value is represented by the variable y, so in the rest of our ILP model we use y in place of f(x) = x^2.

2D Approximation

Approximation in two dimensions works almost exactly the same way. We have the function f(x, y) = x^2 + y^2 and the following points:

    \begin{gather*} (x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n) \in \mathbb{R} \times \mathbb{R} \\ x_0 \le x_1 \le \ldots \le x_n \\ y_0 \le y_1 \le \ldots \le y_n \\ z_{0,0} = f(x_0, y_0) = x_0^2 + y_0^2 \\ z_{0, 1} = f(x_0, y_1) = x_0^2 + y_1^2 \\ \cdots \\ z_{n, n} = f(x_n, y_n) = x_n^2 + y_n^2 \end{gather*}

We introduce the following variables:

    \begin{gather*} \lambda_{0, 0}, \lambda_{0,1}, \ldots, \lambda_{n, n} \in \mathbb{R} \\ 0 \le \lambda_{i, j} \le 1 \\ x = \displaystyle\mathop{\sum}_{0 \le i \le n} \displaystyle\mathop{\sum}_{0 \le j \le n} x_i \lambda_{i, j} \\ y = \displaystyle\mathop{\sum}_{0 \le i \le n} \displaystyle\mathop{\sum}_{0 \le j \le n} y_j \lambda_{i, j} \\ z = \displaystyle\mathop{\sum}_{0 \le i \le n} \displaystyle\mathop{\sum}_{0 \le j \le n} f(x_i, y_j) \lambda_{i, j} \\ \displaystyle\mathop{\sum}_{0 \le i \le n} \displaystyle\mathop{\sum}_{0 \le j \le n} \lambda_{i, j} = 1 \end{gather*}

We now need to add a constraint similar to the SOS2 used in the 1D approximation. There the constraint simply meant “at most two neighbouring variables are allowed to be non-zero”. Now we need a constraint saying “at most four neighbouring variables (forming one cell of the grid) are allowed to be non-zero”, which is simply a generalization of SOS2 (bonus chatter: you might guess how to approximate a 3D function). We introduce the following variables:

    \begin{gather*} \zeta_i = \displaystyle\mathop{\sum}_{0 \le j \le n} \lambda_{i, j} \\ Z = \{ \zeta_0, \ldots, \zeta_n \} \\ \eta_j = \displaystyle\mathop{\sum}_{0 \le i \le n} \lambda_{i, j} \\ H = \{ \eta_0, \ldots, \eta_n \} \\ Z, H \text{ are SOS2 } \end{gather*}

If we allow the lambdas to be real numbers (i.e., we don’t require x and y to lie on the grid), we also need to make sure that the non-zero lambda variables lie within a single triangle of the selected grid cell. We can do that by introducing one more SOS2 set, built from the sums along the diagonals of the grid:

    \begin{gather*} \xi_k = \displaystyle\mathop{\sum}_{j - i = k} \lambda_{i, j} \\ \Xi = \{ \xi_{-n}, \xi_{-n+1}, \ldots, \xi_n \} \\ \Xi \text{ is a SOS2 } \end{gather*}
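Putting the whole 2D formulation together, here is a sketch in the same spirit as the 1D code above. Again, add_sos2 and approximate_2d are my own helper names and SOS2 is emulated with binaries; the row sums play the role of the zeta variables, the column sums the role of eta, and the diagonal sums the role of xi.

    import pulp

    def add_sos2(model, members, name):
        """SOS2 via binary segment selection; works for variables or
        for linear expressions bounded between 0 and 1."""
        n = len(members) - 1
        seg = [pulp.LpVariable(f"{name}_seg_{s}", cat="Binary") for s in range(n)]
        model += pulp.lpSum(seg) == 1
        for i, member in enumerate(members):
            adjacent = [seg[s] for s in (i - 1, i) if 0 <= s < n]
            model += member <= pulp.lpSum(adjacent)

    def approximate_2d(model, xs, ys, f, name="f"):
        """Add x, y, z variables with z approximating f(x, y) over the grid xs x ys."""
        n, m = len(xs), len(ys)
        lam = [[pulp.LpVariable(f"{name}_lam_{i}_{j}", lowBound=0, upBound=1)
                for j in range(m)] for i in range(n)]
        x, y, z = (pulp.LpVariable(f"{name}_{v}") for v in "xyz")
        cells = [(i, j) for i in range(n) for j in range(m)]
        model += pulp.lpSum(lam[i][j] for i, j in cells) == 1
        model += x == pulp.lpSum(xs[i] * lam[i][j] for i, j in cells)
        model += y == pulp.lpSum(ys[j] * lam[i][j] for i, j in cells)
        model += z == pulp.lpSum(f(xs[i], ys[j]) * lam[i][j] for i, j in cells)
        # Row sums (zeta) and column sums (eta) as SOS2: select a single 2x2 cell.
        add_sos2(model, [pulp.lpSum(lam[i]) for i in range(n)], f"{name}_zeta")
        add_sos2(model, [pulp.lpSum(lam[i][j] for i in range(n)) for j in range(m)], f"{name}_eta")
        # Diagonal sums (xi) as SOS2: pin the non-zero lambdas to one triangle of that cell.
        add_sos2(model, [pulp.lpSum(lam[i][j] for i, j in cells if j - i == k)
                         for k in range(-(n - 1), m)], f"{name}_xi")
        return x, y, z

The returned z can then replace f(x, y) in the rest of the model, exactly as y did in the 1D case.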

Summary

We now know how to approximate 1D and 2D functions using special ordered sets of type 2. By using this approach we can greatly reduce the size of ILP models, which may significantly decrease the solving time.