Solving Commodious Linear Systems
This is about linear systems with fewer equations than variables; A*x = b where the m -by- n matrix A has fewer rows that columns, so m < n . I have always called such systems wide or fat, but this is not respectful. So I consulted the Merriam-Webster Thesaurus and found commodious.
This is also one my first posts to use the Live Editor™. Here is a a link to the .mlx file.
Contents
Commodious linear systems
Commodious linear systems with fewer rows than columns do not have unique solutions. It is hard to compute something if you don't quite know what it is you're trying to compute. You can pin down the solution by adding an additional condition: minimize its Euclidean or $\ell_2$ norm. This is the Moore-Penrose solution. It is unique and has many other desirable properties.
But recent years have seen many situations where a different condition is desirable: minimize the number of nonzero components of the solution. Such a solution is called sparse. Actually minimizing the number of nonzero components is combinatorically difficult. (It is NP-hard.) However, the MATLAB backslash operator produces a solution with at most m nonzero components. This is often the minimum.
Generate examples
Pick the dimensions.
m = 3 n = 10
m = 3 n = 10
assert(m < n)
format short
Generate an m-by-n matrix with random one-digit positive and negative integer entries. Every time this script is executed, a different A is created.
A = randi(18,m,n) - 9
A = 0 2 7 -8 2 -4 -6 3 0 0 2 3 -7 6 7 3 -3 9 -3 -1 4 1 3 -8 -2 -7 -6 1 -8 5
Generate a random right hand side with the same number of rows.
b = randi(18,m,1) - 9
b = 5 0 6
Save a copy of the rhs.
b_copy = b;
Backslash
Use backslash to compute one of many possible solutions to A*x = b.
x = A\b
x = 0 0 0 -0.5195 0 0 0 0.2812 -0.1953 0
This x is sparse; it has only m nonzero components.
How was x computed? Compute the QR decomposition with column pivoting. That is A*P = Q*R with orthogonal Q, right trapezoidal R, and permutation P.
[Q,R,P] = qr(A)
Q = -0.6247 -0.4341 -0.6491 0.4685 -0.8734 0.1331 -0.6247 -0.2209 0.7490 R = Columns 1 through 7 12.8062 1.7179 3.5920 -1.5617 3.2796 8.2772 6.0908 0 -9.3834 4.3876 -2.6305 -6.5398 0.6628 6.5502 0 0 -6.3911 3.2621 -1.8641 -2.2469 -0.9986 Columns 8 through 10 -0.4685 -9.5266 -3.5920 -3.7092 2.4121 -0.2313 -0.1498 -3.2289 3.6117 P = 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Keep only the first m columns of R.
R = R(:,1:m)
R = 12.8062 1.7179 3.5920 0 -9.3834 4.3876 0 0 -6.3911
Apply Q to right hand side.
b = Q'*b
b = -6.8716 -3.4960 1.2483
Back substitution solves the square, triangular system R*x = b.
x = R\b
x = -0.5195 0.2812 -0.1953
Tack some zeros on the end of x, matching the number of columns deleted from R.
x(m+1:n) = 0
x = -0.5195 0.2812 -0.1953 0 0 0 0 0 0 0
Undo the permutations.
x = P*x
x = 0 0 0 -0.5195 0 0 0 0.2812 -0.1953 0
Pseudoinverse
Restore the rhs.
b = b_copy;
For comparison, the minimum 2-norm solution can be obtained with the Moore-Penrose pseudoinverse. This is usually completely full; none of its components are zero.
y = pinv(A)*b
y = 0.0644 0.0655 0.1261 -0.2161 0.0475 -0.1481 -0.2034 0.1164 -0.1232 0.0612
Both x and y are solutions to the original system.
Ax = A*x Ay = A*y
Ax = 5.0000 0.0000 6.0000 Ay = 5.0000 -0.0000 6.0000
However norm(y,2) is less than norm(x,2).
normyx = [norm(y) norm(x)]
normyx = 0.4112 0.6222
SVD
To compute y without computing the pseudoinverse, use the SVD.
[U,S,V] = svd(A)
U = -0.5601 0.2006 -0.8038 0.3872 0.9211 -0.0399 -0.7324 0.3336 0.5936 S = Columns 1 through 7 20.2128 0 0 0 0 0 0 0 15.1459 0 0 0 0 0 0 0 8.3091 0 0 0 0 Columns 8 through 10 0 0 0 0 0 0 0 0 0 V = Columns 1 through 7 -0.1066 0.2097 0.2761 0.5246 0.0327 0.4104 0.4231 -0.0342 0.2310 -0.1365 -0.2150 -0.5533 -0.1705 0.3003 -0.4368 -0.2670 -0.4292 0.4955 -0.0865 0.1040 0.1695 0.6265 0.0828 0.1735 0.5967 -0.1499 -0.2918 -0.1873 0.1511 0.4082 -0.3700 -0.0089 0.7338 -0.0647 0.1593 0.4219 -0.0247 -0.1275 -0.1803 -0.1248 0.8221 -0.0885 0.3262 -0.3940 0.1662 -0.1806 0.0969 -0.1350 0.7663 0.0530 0.6091 -0.2620 -0.0174 -0.2910 -0.0377 0.2134 0.2324 -0.3586 -0.5571 0.1057 -0.0492 -0.0517 -0.0236 -0.2003 0.0493 0.3620 -0.0193 0.1110 0.0645 -0.0210 Columns 8 through 10 -0.1425 0.4068 -0.2429 -0.6536 0.0165 0.1666 0.0216 -0.4643 0.2127 -0.0817 -0.1943 0.1656 -0.2858 -0.0944 0.1226 -0.0609 -0.2152 0.1524 0.1853 -0.1585 0.0614 0.6475 0.0133 0.0785 0.0450 0.6800 0.1605 0.0664 0.1951 0.8756
Then, with appropriate parentheses, and with a backslash involving a diagonal matrix, finding the minimum norm solution is a one-liner, .
y = V*(S\(U'*b))
y = 0.0644 0.0655 0.1261 -0.2161 0.0475 -0.1481 -0.2034 0.1164 -0.1232 0.0612
Questions
There are lots of things that I don't know about commodious systems.
1.
Is there a bound on norm(x)/norm(y)?
2.
The condition number of A with respect to any norm could be defined by
cond(A) = max(norm(A*x)/norm(x))/min(norm(A*x)/norm(x))
We know that for the 2-norm this is the ratio of singular values. What about the 1-norm or the inf-norm? How can we compute this?
3.
What can be proved about the residual and the error of the solution computed by backslash? I suspect the residual is small, but what is the error when the solution is not unique?
4.
What is this distribution?
kmax = 10000; kappa = zeros(kmax,1); for k = 1:kmax A = randi(18,m,n) - 9; kappa(k) = cond(A); end histogram(kappa) title('cond(A)') ax = axis; text(0.7*ax(2),0.8*ax(4),sprintf('m = %2d, n = %2d',m,n))
Comments
To leave a comment, please click here to sign in to your MathWorks Account or create a new one.