Solving Commodious Linear Systems

Posted by Cleve Moler, April 28, 2021

6 views (last 30 days) | 0 Likes | 2 comments

This is about linear systems with fewer equations than variables; A*x = b where the m -by- n matrix A has fewer rows that columns, so m < n . I have always called such systems wide or fat, but this is not respectful. So I consulted the Merriam-Webster Thesaurus and found commodious.

This is also one my first posts to use the Live Editor™. Here is a a link to the .mlx file.

Commodious linear systems
Generate examples
Backslash
Pseudoinverse
SVD
Questions
1.
2.
3.
4.

Commodious linear systems

Commodious linear systems with fewer rows than columns do not have unique solutions. It is hard to compute something if you don't quite know what it is you're trying to compute. You can pin down the solution by adding an additional condition: minimize its Euclidean or $\ell_2$ norm. This is the Moore-Penrose solution. It is unique and has many other desirable properties.

But recent years have seen many situations where a different condition is desirable: minimize the number of nonzero components of the solution. Such a solution is called sparse. Actually minimizing the number of nonzero components is combinatorically difficult. (It is NP-hard.) However, the MATLAB backslash operator produces a solution with at most m nonzero components. This is often the minimum.

Generate examples

Pick the dimensions.

   m = 3
   n = 10

   assert(m < n)
   format short

Generate an m-by-n matrix with random one-digit positive and negative integer entries. Every time this script is executed, a different A is created.

   A = randi(18,m,n) - 9

A =
     0     2     7    -8     2    -4    -6     3     0     0
     2     3    -7     6     7     3    -3     9    -3    -1
     4     1     3    -8    -2    -7    -6     1    -8     5

Generate a random right hand side with the same number of rows.

   b = randi(18,m,1) - 9

Save a copy of the rhs.

   b_copy = b;

Backslash

Use backslash to compute one of many possible solutions to A*x = b.

   x = A\b

This x is sparse; it has only m nonzero components.

How was x computed? Compute the QR decomposition with column pivoting. That is A*P = Q*R with orthogonal Q, right trapezoidal R, and permutation P.

   [Q,R,P] = qr(A)

Q =
   -0.6247   -0.4341   -0.6491
    0.4685   -0.8734    0.1331
   -0.6247   -0.2209    0.7490
R =
  Columns 1 through 7
   12.8062    1.7179    3.5920   -1.5617    3.2796    8.2772    6.0908
         0   -9.3834    4.3876   -2.6305   -6.5398    0.6628    6.5502
         0         0   -6.3911    3.2621   -1.8641   -2.2469   -0.9986
  Columns 8 through 10
   -0.4685   -9.5266   -3.5920
   -3.7092    2.4121   -0.2313
   -0.1498   -3.2289    3.6117
P =
     0     0     0     1     0     0     0     0     0     0
     0     0     0     0     0     0     0     1     0     0
     0     0     0     0     0     0     0     0     1     0
     1     0     0     0     0     0     0     0     0     0
     0     0     0     0     1     0     0     0     0     0
     0     0     0     0     0     1     0     0     0     0
     0     0     0     0     0     0     1     0     0     0
     0     1     0     0     0     0     0     0     0     0
     0     0     1     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0     1

Keep only the first m columns of R.

   R = R(:,1:m)

R =
   12.8062    1.7179    3.5920
         0   -9.3834    4.3876
         0         0   -6.3911

Apply Q to right hand side.

   b = Q'*b

Back substitution solves the square, triangular system R*x = b.

   x = R\b

Tack some zeros on the end of x, matching the number of columns deleted from R.

   x(m+1:n) = 0

Undo the permutations.

   x = P*x

Pseudoinverse

Restore the rhs.

   b = b_copy;

For comparison, the minimum 2-norm solution can be obtained with the Moore-Penrose pseudoinverse. This is usually completely full; none of its components are zero.

   y = pinv(A)*b

Both x and y are solutions to the original system.

   Ax = A*x
   Ay = A*y

However norm(y,2) is less than norm(x,2).

   normyx = [norm(y) norm(x)]

normyx =
    0.4112    0.6222

SVD

To compute y without computing the pseudoinverse, use the SVD.

   [U,S,V] = svd(A)

U =
   -0.5601    0.2006   -0.8038
    0.3872    0.9211   -0.0399
   -0.7324    0.3336    0.5936
S =
  Columns 1 through 7
   20.2128         0         0         0         0         0         0
         0   15.1459         0         0         0         0         0
         0         0    8.3091         0         0         0         0
  Columns 8 through 10
         0         0         0
         0         0         0
         0         0         0
V =
  Columns 1 through 7
   -0.1066    0.2097    0.2761    0.5246    0.0327    0.4104    0.4231
   -0.0342    0.2310   -0.1365   -0.2150   -0.5533   -0.1705    0.3003
   -0.4368   -0.2670   -0.4292    0.4955   -0.0865    0.1040    0.1695
    0.6265    0.0828    0.1735    0.5967   -0.1499   -0.2918   -0.1873
    0.1511    0.4082   -0.3700   -0.0089    0.7338   -0.0647    0.1593
    0.4219   -0.0247   -0.1275   -0.1803   -0.1248    0.8221   -0.0885
    0.3262   -0.3940    0.1662   -0.1806    0.0969   -0.1350    0.7663
    0.0530    0.6091   -0.2620   -0.0174   -0.2910   -0.0377    0.2134
    0.2324   -0.3586   -0.5571    0.1057   -0.0492   -0.0517   -0.0236
   -0.2003    0.0493    0.3620   -0.0193    0.1110    0.0645   -0.0210
  Columns 8 through 10
   -0.1425    0.4068   -0.2429
   -0.6536    0.0165    0.1666
    0.0216   -0.4643    0.2127
   -0.0817   -0.1943    0.1656
   -0.2858   -0.0944    0.1226
   -0.0609   -0.2152    0.1524
    0.1853   -0.1585    0.0614
    0.6475    0.0133    0.0785
    0.0450    0.6800    0.1605
    0.0664    0.1951    0.8756

Then, with appropriate parentheses, and with a backslash involving a diagonal matrix, finding the minimum norm solution is a one-liner, .

   y = V*(S\(U'*b))

Questions

There are lots of things that I don't know about commodious systems.

1.

Is there a bound on norm(x)/norm(y)?

2.

The condition number of A with respect to any norm could be defined by

cond(A) = max(norm(A*x)/norm(x))/min(norm(A*x)/norm(x))

We know that for the 2-norm this is the ratio of singular values. What about the 1-norm or the inf-norm? How can we compute this?

3.

What can be proved about the residual and the error of the solution computed by backslash? I suspect the residual is small, but what is the error when the solution is not unique?

4.

What is this distribution?

   kmax = 10000;
   kappa = zeros(kmax,1);
   for k = 1:kmax
       A = randi(18,m,n) - 9;
       kappa(k) = cond(A);
   end
   histogram(kappa)
   title('cond(A)')
   ax = axis;
   text(0.7*ax(2),0.8*ax(4),sprintf('m =  %2d, n = %2d',m,n))