Hilbert Matrices

Posted by Cleve Moler, February 2, 2013

8 views (last 30 days) | 0 Likes | 6 comments

The inverse Hilbert matrix, invhilb, has recently made surprise appearances in Cody, the programming game on MATLAB Central, and one of Ned's posts in the MATLAB Spoken Here blog. Inverse Hilbert matrices had nearly been forgotten in MATLAB. Their comeback is due to the sign pattern of their entries. But I want to take you back to their original role demonstrating ill conditioning in numerical calculation.

Hilbert Matrix
Badly Conditioned
Inverse Hilbert Matrix
An Old Program
Roundoff Error
My Project
Project Reexamined
Checkerboard
References

   T = invhilb(8);
   imagesc(sign(T))
   axis image, axis off
   colormap(copper)

Hilbert Matrix

The Hilbert matrix is the first matrix I ever knew by name. I met it in my first numerical analysis course, when I was a junior at Caltech in 1959. The matrix comes from least squares approximation by monomials, $x^{j}$, on the unit interval $[0,1]$. The elements of $H_n$ are inner products

$$ h_{i,j} = \int_0^1 x^{i-1} x^{j-1} dx = \frac{1}{i+j-1} $$

Here is the code to generate the 6-by-6 Hilbert matrix in a single array operation. Similar code is used in the function hilb that is distributed with MATLAB. I am using single precision so we can see roundoff error in matrices that print nicely in the width of one blog page.

   format compact
   format long
   n = 6
   J = 1:n;
   J = J(ones(n,1),:);
   I = J';
   E = single(ones(n,n));
   H = E./(I+J-1)

n =
     6
H =
   1.0000000   0.5000000   0.3333333   0.2500000   0.2000000   0.1666667
   0.5000000   0.3333333   0.2500000   0.2000000   0.1666667   0.1428571
   0.3333333   0.2500000   0.2000000   0.1666667   0.1428571   0.1250000
   0.2500000   0.2000000   0.1666667   0.1428571   0.1250000   0.1111111
   0.2000000   0.1666667   0.1428571   0.1250000   0.1111111   0.1000000
   0.1666667   0.1428571   0.1250000   0.1111111   0.1000000   0.0909091

Badly Conditioned

The monomials $x^{j}$ are nearly linearly dependent on $[0,1]$. So the Hilbert matrices are nearly singular. The condition of $H_6$ is comparable with single precision roundoff error and the condition of $H_{12}$ is comparable with double precision roundoff error.

   format short e
   [1/cond(hilb(6)) eps(single(1))]
   [1/cond(hilb(12)) eps(double(1))]

ans =
  6.6885e-08  1.1921e-07
ans =
   5.7268e-17   2.2204e-16

The $l_2$ condition number, $\kappa$, of $H_n$ grows exponentially.

$$ \kappa(H_n) \approx 0.01133e^{3.49n} $$

Inverse Hilbert Matrix

A Hilbert matrix qualifies as a Cauchy matrix, which is a matrix whose entries are of the form

$$ a_{i,j} = \frac{1}{x_i - y_j} $$

A classic Knuth homework problem or the Wikipedia entry on Cauchy matrices (see References) shows how it is possible to express the elements of the inverse of a Cauchy matrix in terms of products involving the $x_i$'s and $y_j$'s. For a Hilbert matrix, those products become binomial coefficients.

Here is a program that generates the inverse Hilbert matrix using doubly nested for loops and many scalar evaluations of binomial coefficients.

type invh

function T = invh(n)
   for i = 1:n
      for j = 1:n
         T(i,j) = (-1)^(i+j)*(i+j-1)*nchoosek(n+i-1,n-j)* ...
            nchoosek(n+j-1,n-i)*nchoosek(i+j-2,i-1)^2;
      end
   end
end

An Old Program

One of the first programs that I ever wrote generated the inverse Hilbert matrix from these binomial coefficients. It was written in absolute numeric machine language for the Burroughs 205 Datatron at Caltech around 1959. I needed to avoid time consuming floating point arithmetic, so I used recursion relationships among the coefficients. Here is the core of the MATLAB version of that old program. For a long time this has been distributed with MATLAB as invhilb. This is the function that the Cody participants discovered.

type invhilb_code

   T = zeros(n,n);
   p = n;
   for i = 1:n
       r = p*p;
       T(i,i) = r/(2*i-1);
       for j = i+1:n
           r = -((n-j+1)*r*(n+j-1))/(j-1)^2;
           T(i,j) = r/(i+j-1);
           T(j,i) = r/(i+j-1);
       end
       p = ((n-i)*p*(n+i))/(i^2);
   end

We can use either code to generate the exact inverse of $H_6$.

T = invhilb(6)

T =
          36        -630        3360       -7560        7560       -2772
        -630       14700      -88200      211680     -220500       83160
        3360      -88200      564480    -1411200     1512000     -582120
       -7560      211680    -1411200     3628800    -3969000     1552320
        7560     -220500     1512000    -3969000     4410000    -1746360
       -2772       83160     -582120     1552320    -1746360      698544

Since the elements of a Hilbert matrix are rational fractions, the elements of its inverse must also be rational fractions. But it turns out that the denominators of all the fractions are equal to one, so, as you can see, the inverse has integer entries. The integers are large, hence the large condition number.

Roundoff Error

Let's invert T and see how close we get to H.

   format long
   inv(single(T))

Warning: Matrix is close to singular or badly scaled. Results may be inaccurate.
RCOND =  3.380340e-08. 
ans =
   1.0102043   0.5085138   0.3406332   0.2563883   0.2056789   0.1717779
   0.5085138   0.3404373   0.2560914   0.2053309   0.1714057   0.1471225
   0.3406332   0.2560914   0.2052232   0.1712378   0.1469208   0.1286576
   0.2563883   0.2053309   0.1712378   0.1468577   0.1285564   0.1143121
   0.2056789   0.1714057   0.1469208   0.1285564   0.1142728   0.1028457
   0.1717779   0.1471225   0.1286576   0.1143121   0.1028457   0.0934704

We can just recognize one or two digits of H. Since the condition number of T, is comparable with single precision roundoff level, we might have expected to lose all the digits. Roundoff error observed in actual matrix computation is usually not as bad as estimates based on condition numbers predict.

My Project

Over 50 years ago, after my first numerical analysis course, where I was introduced to matrix computation by Prof. John Todd, I did an individual undergrad research project under his direction. Part of the project involved studying the roundoff error in matrix inversion. First, I wrote a matrix inversion program. Then, in order to assess roundoff error, I generated Hilbert matrices, inverted them with my program, and compared the computed inverses with the exact inverses generated by the program that I've just described. If I'd had MATLAB at the time, the project would have gone something like this.

   n = 6;
   H = single(hilb(n));
   X = inv(H);
   T = single(invhilb(n));
   relerr = norm(X-T,inf)/norm(T)

Warning: Matrix is close to singular or badly scaled. Results may be inaccurate.
RCOND =  3.570602e-08. 
relerr =
   0.0470157

(My old matrix inversion program would not have estimated condition and issued a warning message.)

The Datatron floating point arithmetic had eight decimal digits, so I used values of n up to 7 and reported the observed relerr's as the result of roundoff error from matrix inversion by Gaussian elimination for this particular badly conditioned matrix. Case closed. Project grade: A.

Project Reexamined

I went on to grad school at Stanford and met George Forsythe and later Jim Wilkinson. As I came to appreciate the work that Wilkinson was doing, I realized that in my project at Caltech I had made a subtle but important mistake. The same mistake is still being made today by others. I had not actually inverted the Hilbert matrix. I had inverted the floating point approximation to the Hilbert matrix. It turns out that the initial step -- converting the fractions that are the elements of the Hilbert matrix to floating point numbers -- has a larger effect on the computed inverse than the inversion process itself. Even if my inversion program could somehow compute the exact inverse of the matrix it is given, it could not match the result generated by my theoretical inverse Hilbert matrix program.

I should have inverted the inverse, because I would not have to make any roundoff errors to generate the input. Since the elements of the starting matrix are integers, they can be represented exactly in floating point, at least if n is not too large. I should have interchanged the roles of T and H in my project.

   n = 6;
   T = single(invhilb(n));
   X = inv(T);
   H = single(hilb(n));
   relerr = norm(X-H,inf)/norm(H)

Warning: Matrix is close to singular or badly scaled. Results may be inaccurate.
RCOND =  3.380340e-08. 
relerr =
   0.0266826

This confirms what Wilkinson had been saying, and proving. The roundoff error caused by inverting a matrix using Gaussian elimination with partial pivoting is comparable with -- in this case actually smaller than -- the error caused by rounding the data in the first place.

Checkerboard

You can see the +/- sign pattern of the elements in the inverse Hilbert matrix by examining T or the (-1)^(i+j) multiplying the binomial coefficients in invh. That's what caught the attention of the folks playing Cody. I have no idea how they came across it.