ILP Part 52 — Unsigned magnitude decomposition for real variables

This is the fifty second part of the ILP series. For your convenience you can find other parts in the table of contents in Part 1 – Boolean algebra

In part 2 we saw how to calculate unsigned magnitude decomposition for non-negative integer variables. Today we are going to do the same for real variables. It sounds like an easy task to do but as we will see it is very tricky to do.

When comparing real numbers in Part 14 we introduced variable s for setting precision. We also have K as maximum integer value.

We want to decompose variable in the following form:

    \begin{gather*} d = a_0 + 2\cdot a_1 + 4\cdot a_2 + \cdots + 2^{n-1}\cdot a_{n-1} + 2^{-1} \cdot b_1 + 2^{-2} \cdot b_2 + \ldots + 2^{-m} \cdot b_m \\ 2^{-m} \ge s \end{gather*}

We could take as many negative powers as possible. Now we add the following constraint:

    \begin{gather*} d \le a < d + 2^{-m} \end{gather*}

It looks reasonable but it doesn’t work. Let’s take the following variables: a = \frac{1}{3}, s = 0.001. The result should be 0.010101010 in binary which gives 0.33203125. And now when we add lowest power, we get 0.333984375. But when we now compare this by calculating 0.333984375 - 0.333333333 we get 0.000651042 < 0.001. So we cannot determine those two numbers are different, we consider them equal!

But if we choose d \le a \le d + 2^{-m} then we have two representations of 0.5. One is 0.1000\ldots, the other one is 0.0111\ldots. So with this approach we cannot approximate numbers reliably.