Triple Precision Accumlated Inner Product

Posted by Cleve Moler, March 2, 2015

4 views (last 30 days) | 0 Likes | 1 comment

Single and double precision are combined to facilitate a triple precision accumulated inner product.

Iterative refinement
Example with double precision
Triple precision
dot3p
Example with triple precision
residual3p

Iterative refinement

In my previous post on iterative refinement I showed the need for an inner product routine that is more accurate than double precision. This post is about such a function.

Example with double precision

The example I am going to use is contrived so that the first and third terms in the inner product exactly cancel each other, leaving the much smaller second term to arise from the ashes.

   format long e
   x = [1 1/3 1]'
   y = [1 3e-9 -1]'

x =
     1.000000000000000e+00
     3.333333333333333e-01
     1.000000000000000e+00
y =
     1.000000000000000e+00
     3.000000000000000e-09
    -1.000000000000000e+00

Computing the inner product with a conventional MATLAB statement shows the intended difficulty.

   dot2p = x'*y

dot2p =
     1.000000082740371e-09

The result should be 1.000000000000000e-09. We're getting only about half of the significant digits correct.

Of course, the problem is that the second intermediate sum is

   s2 = 1 + 1/3*3e-9

s2 =
     1.000000001000000e+00

That's OK in decimal, but not in binary. There is not enough room in one double precision floating point number to store the bits that are going to be needed when that leading one is cancelled by the third term in the sum.

Triple precision

I do not have a full blown triple precision arithmetic package by any means. I have just enough to do this one task. Here is the basic idea. A double precision floating point number has 14 hexadecimal digits in its fraction. I can use single and double to break a double into 6 high order hex digits and 8 low order hex digits, like this.

   format hex
   x = 1/3
   xhi = double(single(x))
   xlo = x - xhi

x =
   3fd5555555555555
xhi =
   3fd5555560000000
xlo =
   be45555556000000

Two doubles with more than half of their low order bits equal to zero can be multiplied together with no roundoff error. For example

   pihi = double(single(pi))
   pio3hi = xhi*pihi

pihi =
   400921fb60000000
pio3hi =
   3ff0c1524860a920

That trailing zero in pio3hi is an indication that the result is exact.

Additions are not exact, when the two numbers differ by several orders of magnitude. This fact will eventually be the limiting factor of our inner product routine.

dot3p

My inner production routine is both accumulated, which means it uses extra precise arithmetic, and extended, which means an extra scalar is added to the result using the extra precision. You can download this function from here.

   dbtype dot3p

1     function s = dot3p(x,y,s)
2     % DOT3P s = dot3p(x,y,s)  Triple precision extended inner product.
3     % s = x'*y + s for vectors x and y and scalar s.
4     
5        shi = double(single(s));
6        slo = s - shi;
7        for k = 1:length(x)
8           xhi = double(single(x(k)));
9           xlo = x(k) - xhi;
10          yhi = double(single(y(k)));
11          ylo = y(k) - yhi;
12          tmp = xhi*yhi;
13          zhi = double(single(tmp));
14          zlo = tmp - zhi + xhi*ylo + xlo*yhi + xlo*ylo;
15    
16          tmp = shi + zhi;
17          del = tmp - shi - zhi;
18          shi = double(single(tmp));
19          slo = tmp - shi + slo + zlo - del;
20       end
21       s = shi +  slo;

Example with triple precision

Let's run my example with dot3p in the debugger and look at some intermediate results.

%  dbstop 16 dot3p
%  dbstop 20 dot3p
%  dot3p(x,y,0)

The variables shi and slo carry the sum in triple precision. The first time through the loop there are no roundoff errors, and shi and slo are set to 1.0 and 0.0.

Let's look at the second pass through the loop. Halfway through, at statement 16.

K>> format hex
K>> xhi,xlo,yhi,ylo,tmp,zhi,zlo
xhi =
   3fd5555560000000
xlo =
   be45555556000000
yhi =
   3e29c511e0000000
ylo =
   bc7e2df108000000
tmp =
   3e112e0bf341b0a0
zhi =
   3e112e0c00000000
zlo =
   bc97d9296b9a0d85
K>>
K>> format long e
K>> xhi,xlo,yhi,ylo,tmp,zhi,zlo
xhi =
     3.333333432674408e-01
xlo =
    -9.934107481068821e-09
yhi =
     3.000000026176508e-09
ylo =
    -2.617650809219019e-17
tmp =
     1.000000038527825e-09
zhi =
     1.000000082740371e-09
zlo =
    -8.274037106125165e-17

We can see that xhi is the first six hex digits of 1/3 and xlo is the remaining digits. Similarly, yhi is the first six hex digits of 3e-9 and ylo is the remaing digits. zhi is the first six hex digits of the xhi*yhi and zlo is a crucial correction term.

Stopping at the end of the second pass through the loop, at statement 20.

K>> format hex
K>> tmp, del, shi, slo
tmp =
   3ff000000044b830
del =
   0000000000000000
shi =
   3ff0000000000000
slo =
   3e112e0be826d694
K>>
K>> format long e,
K>> tmp, del, shi, slo
tmp =
     1.000000001000000e+00
del =
     0
shi =
     1
slo =
     9.999999999999999e-10

tmp is 1.0 plus some of the bits of 1.0e-10. del is zero because it is not needed in this example. It is involved when the terms vary over an even wider range. shi is exactly 1.0, which is the high order part of the evolving sum. And slo has become 1.0e-10 to full double precison accuracy.

On the third time through the loop there will be no roundoff errors, shi will be completely cancelled, and slo will bear the full responsibility for providing the final answer.

Of course, this example is contrived and unusual. Ordinarily, we can expect some cancellation (otherwise there would be no need for an accumulated inner product), with the high order part losing at least a few digits and the low order part filling them in.

residual3p

With a dot product in hand, it is easy to write the residual function. Here the extended feature is essential because we expect extreme, if not total, cancellation when the right hand side is subtracted from the dot product.

   type residual3p

function r = residual3p(A,x,b)
% RESIDUAL3p Triple precision residual, A*x - b.
% r = residual3p(A,x,b) for matrix A, vectors x and b. 

   m = size(A,1);
   r = zeros(m,1);
   for k = 1:m
      r(k) = dot3p(A(k,:),x,-b(k));
   end

end

You can download this function from here.

Published with MATLAB® R2014b