A friend is investigating the use of generative AI in his classes. I asked two different popular chatbots to write MATLAB programs for a mathematically nontrivial problem. Both chatbots understood my query and both wrote plausible MATLAB programs, but one of the programs was not correct. My recommendation for coursework: carefully read and test programs produced by generative AI and repair any incorrect ones.
Frequent readers of this blog know about my recent obsession with Redheffer matrices and the Mertens conjecture. So, when my friend Stan Ahalt visited from North Carolina State a few days ago, and wanted to discuss the use of generative AI in the new Data Science curriculum, I suggested asking chatbots to write MATLAB programs that check the Mertens conjecture.
We logged onto the web sites for two different AI companies. Our query was the same at both sites.
Please write a MATLAB program to test the Mertens conjecture.
I was pleasantly surprised when both systems knew about MATLAB and understood the query. One of the systems replied:
Certainly! I'll create a MATLAB program to test the Mertens conjecture. First, let me briefly explain what the Mertens conjecture is, and then I'll provide the code. The Mertens conjecture states that the Mertens function M(n) is bounded by √n for all positive integers n. Mathematically, it can be expressed as:
|M(n)| ≤ √n for all n > 1
Where M(n) is the Mertens function, defined as the sum of the Möbius function μ(k) for all positive integers k up to n:
M(n) = Σ μ(k) for k = 1 to n
Here's a MATLAB program to test this conjecture.
The program from one of the web sites begins:
function M = test_Mertens(n)
% Function to test the Mertens conjecture for a given n
% Returns true if the conjecture holds for all values up to n, false otherwise
This is followed by code to sum the Möbius function and verify that abs(M) < sqrt(n). The crucial mobius function is a straightforward implementation of the definition. The code relies on the MATLAB built-in functions factor and unique.
function mu = mobius(k)
% Function to calculate the Möbius function of k
    % Factorize k
    p = factor(k);
    % Check for square factors
    if length(p) ~= length(unique(p))
        mu = 0;
    else
        mu = (-1)^length(p);
    end
end
I was very pleased. The entire program is only 40 lines long, eminently readable, and correct.
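For readers who want to experiment outside MATLAB, here is a rough Python sketch of the same factorization-based approach. The function names are mine, and trial division stands in for MATLAB's factor.

```python
def mobius(k):
    # Möbius mu(k) by trial-division factorization:
    # 0 if k has a squared prime factor,
    # otherwise (-1)^(number of distinct prime factors)
    count = 0
    d = 2
    while d * d <= k:
        if k % d == 0:
            k //= d
            if k % d == 0:      # squared prime factor
                return 0
            count += 1
        d += 1
    if k > 1:                   # one prime factor left over
        count += 1
    return (-1) ** count

def mertens_conjecture_holds(n):
    # check |M(m)| <= sqrt(m) for m = 2 .. n
    M = 0
    for m in range(1, n + 1):
        M += mobius(m)
        if m > 1 and abs(M) > m ** 0.5:
            return False
    return True
```

Like the chatbot's program, this is O(sqrt(k)) per value, so it is only practical for modest n.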
The program from the other web site is more sophisticated. It knows enough about MATLAB to use its cumulative summation.
M = cumsum(mu);
Here is the crucial mobius function. At first glance, it looks OK. Is it?
function mu = mobius(n)
% Calculate Möbius function values up to n
    mu = ones(1, n);
    for i = 2:sqrt(n)
        if mu(i) == 1
            for j = i^2:i:n
                if mod(j, i^2) == 0
                    mu(j) = 0;
                else
                    mu(j) = -mu(j);
                end
            end
        end
    end
end
As you have guessed, program number two is not correct. Even its own plot reveals that something is amiss.
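For comparison, here is one way a Möbius sieve can be written correctly — sketched in Python rather than MATLAB, with names of my choosing. Note the differences from the program above: a separate primality sieve decides which values to process, and the inner loop visits every multiple of each prime, not just the multiples starting at the square.

```python
from itertools import accumulate

def mobius_sieve(n):
    # Sieve for mu(1..n).  A dedicated primality sieve identifies the
    # primes, and every multiple of each prime is visited once.
    mu = [0, 1] + [1] * (n - 1)        # mu[0] is unused padding
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for j in range(p, n + 1, p):
                if j > p:
                    is_prime[j] = False
                # p^2 | j kills the term; otherwise flip the sign
                mu[j] = 0 if j % (p * p) == 0 else -mu[j]
    return mu[1:]

# Mertens function as a running sum, like MATLAB's cumsum(mu)
M = list(accumulate(mobius_sieve(8)))
```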
Let's return to the second web site and make a more specific query.
Please write a MATLAB program that uses Redheffer matrices to test the Mertens conjecture.
The resulting program generates a Redheffer matrix with
R = zeros(n,n);
for i = 1:n
    for j = 1:n
        if mod(i, j) == 0
            R(i, j) = 1;
        end
    end
end
It computes the determinant to obtain the Mertens value, M(n). It is able to confirm the conjecture for all the n in the given range.
What's wrong now? Look at the plot. The computed M(n) is always equal to one. Why?
Program number three can be repaired by adding one line to the code that generates R. What is that line? Respond in the comments.
Look at the good side. Instead of just using AI to do homework, we can challenge students to check the results carefully and repair any incorrect programs.
Get the MATLAB code
Published with MATLAB® R2024a
Shortly after I published the second post about the Mertens conjecture, a reader's comment suggested a new approach to computing Redheffer determinants and the Mertens function. It is now possible to compute a half-million values of the Mertens function in about five hours.
The comment references the Wikipedia article on block matrices.
You could also consider the matrix as a 2x2 block matrix and use the formula for the determinant of a block matrix [1].

A = redheffer(n);
M = full(A(1,1) - A(1, 2:end) * (A(2:end,2:end) \ A(2:end, 1)));

Since the (n-1)x(n-1) block is upper triangular, the solve becomes a back-substitution.
My new program is named redmert, an abbreviation of Redheffer-Mertens. It uses the fact that redheffer(n) is obtained from redheffer(n-1) by appending the last column.
Let R(n) denote the upper or right-triangular part of redheffer(n).
R(n) = triu(redheffer(n))
R(n) is sparse, upper triangular and has ones on the diagonal. The indices of the nonzeros in the last column of R(n) are the factors of n. For example, here is R(8).
R8 = full(triu(redheffer(8)))
R8 =

     1     1     1     1     1     1     1     1
     0     1     0     1     0     1     0     1
     0     0     1     0     0     1     0     0
     0     0     0     1     0     0     0     1
     0     0     0     0     1     0     0     0
     0     0     0     0     0     1     0     0
     0     0     0     0     0     0     1     0
     0     0     0     0     0     0     0     1
The idea behind redmert is to compute a sequence of Redheffer matrices, R, and associated values of the Mertens function, M.
[M,R] = redmert(p,R)
The input is a scalar integer p, the desired sequence length, and a sparse matrix R, the upper triangle of a Redheffer matrix of some order, n. The output is an integer vector of values M(n+1:n+p) and the upper triangle of the Redheffer matrix of order n+p. This output R can then be used as the input R in another call to redmert.
The sequence is started with an empty R.
For example,
[M,R] = redmert(8,[]);
The output is mertens(n), n = 1:8, and R8 from the example above.
M
R8 = full(R)
M =

     1     0    -1    -1    -2    -1    -2    -2

R8 =

     1     1     1     1     1     1     1     1
     0     1     0     1     0     1     0     1
     0     0     1     0     0     1     0     0
     0     0     0     1     0     0     0     1
     0     0     0     0     1     0     0     0
     0     0     0     0     0     1     0     0
     0     0     0     0     0     0     1     0
     0     0     0     0     0     0     0     1
The entire code for redmert is twelve lines long. It manipulates sparse matrices and uses sparse backslash to solve a triangular system. Nothing else is required.
Lines 7 and 8 generate the last column of R and lines 9 and 10 implement the new idea about block matrices.
dbtype redmert
1     function [M,R] = redmert(p,Rin)
2     M = zeros(p,1);
3     R = sparse(triu(Rin));
4     n = size(R,1);
5     for q = 1:p
6        n = n+1;
7        k = (mod(n,1:n) == 0);
8        R(k,n) = 1;
9        e = ones(n-1,1);
10       M(q) = R(1,1) - e'*(R(2:n,2:n)\e);
11    end
12    end
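The same incremental idea can be sketched in plain Python. Dense lists of lists stand in for MATLAB's sparse storage, the back-substitution is written out by hand, and the names are mine.

```python
def redmert(p, R=None):
    # Grow the upper triangle of the Redheffer matrix one column at a
    # time and read off each Mertens value from the block formula
    # M(n) = R(1,1) - e' * (R(2:n,2:n) \ e), e a vector of ones.
    R = [row[:] for row in R] if R else []
    n = len(R)
    M = []
    for _ in range(p):
        n += 1
        for row in R:
            row.append(0)              # new column, initially zero
        R.append([0] * n)              # new row
        for k in range(1, n + 1):
            if n % k == 0:             # divisors of n mark the new column
                R[k - 1][n - 1] = 1
        # back-substitution: solve the unit upper triangular system
        # R(2:n,2:n) * y = e
        y = [0.0] * (n - 1)
        for i in range(n - 2, -1, -1):
            s = sum(R[i + 1][j + 1] * y[j] for j in range(i + 1, n - 1))
            y[i] = 1.0 - s
        M.append(round(R[0][0] - sum(y)))
    return M, R
```

As in the MATLAB version, the returned R can be fed back in to continue the sequence where it left off.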
It takes about five hours for redmert to compute half a million values on my laptop.
n = 0.5e6;
p = 0.5e4;
R = sparse([]);
M = [];
for k = p:p:n
    disp(k)
    [Mout,R] = redmert(p,R);
    M = [M; Mout];
    mertens_plot(M)
end
mertens_plot
I started this project by being surprised to find myself computing determinants. Now I am back to my long-time position disparaging determinants. They have been replaced by a good friend, backslash.
Shortly after I posted Redheffer, Mertens and One-Million Dollars a few days ago, MathWorks' Pat Quillen made an important observation about computing the Mertens function.
The elements in the first column of the Redheffer matrix, A = redheffer(n), are all equal to one. That dense column does not make MATLAB happy about computing det(A). However, the last column of A has only a few nonzero elements and so Pat suggested interchanging the first and last columns before computing the determinant. This makes a world of difference. (Thanks, Pat.)
type mertens
function M = mertens(n)
    A = redheffer(n);
    M = A(1,1) - A(1,2:n)*(A(2:n,2:n)\A(2:n,1));
end
The time required to compute det(A) varies with the sparsity of the last column, but it is only a little more than the time to compute redheffer(n) in the first place.
mertens2_time
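Here is a hypothetical Python rendering of the same block-elimination formula, with the trailing unit upper triangular solve written out as a back-substitution. The first row and first column of the Redheffer matrix supply the vectors of ones.

```python
def redheffer(n):
    # dense Redheffer matrix: A[i][j] = 1 if (i+1) divides (j+1),
    # plus an all-ones first column
    A = [[1 if (j + 1) % (i + 1) == 0 else 0 for j in range(n)]
         for i in range(n)]
    for i in range(n):
        A[i][0] = 1
    return A

def mertens(n):
    # M(n) = A(1,1) - A(1,2:n) * (A(2:n,2:n) \ A(2:n,1))
    A = redheffer(n)
    y = [0.0] * (n - 1)
    for i in range(n - 2, -1, -1):       # solve A(2:n,2:n) y = A(2:n,1)
        s = sum(A[i + 1][j + 1] * y[j] for j in range(i + 1, n - 1))
        y[i] = A[i + 1][0] - s           # right-hand side is all ones
    return round(A[0][0] - sum(A[0][j + 1] * y[j] for j in range(n - 1)))
```

The O(n^2) back-substitution replaces the O(n^3) dense determinant.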
Pat's change makes it possible to take n up to a quarter of a million, and beyond. Here is a new plot of the Mertens function M(n) and the sqrt(n) bounds of the Mertens conjecture.
mertens_plot
There are a quarter of a million points in the data for this plot. Fortunately, the .PNG file used for the blog only needs to sample the data.
The job that I ran on my laptop to compute one-quarter of a million values of M(n) is still running. It currently is past 0.35 million and takes less than two seconds for each value. I may keep the job running over the weekend, just to see how far it gets.
The task is embarrassingly parallel. If I had a pool with a million processors, I could have each processor compute one value. I would then just have to collect the results, but that doesn't involve any arithmetic.
You can see from the plot why late 19th- and early 20th-century mathematicians believed that the Mertens conjecture,
|M(n)| < sqrt(n) for all n,
might be true. It is hard to imagine that the plot of M(n) ever escapes sqrt(n).
We now know that M(n) eventually does escape, but only barely and only briefly. We also know that all the computation we can do with determinants of Redheffer's matrix will never prove or disprove the conjecture or win that million-dollar prize.
I didn't know anything about these topics until a couple of weeks ago. Now I can't stop thinking about them.
Raymond Redheffer (1921-2005) was an American mathematician, a professor at UCLA for 55 years, the author of several popular textbooks, the builder of an electronic machine that plays the game Nim and, with Ray and Charles Eames, the creator of a four-meter-long poster, Men of Modern Mathematics.
Redheffer's matrix is a matrix of zeros and ones. A(k,j) equals one if k is a divisor of j. In addition, the first column is all ones.
function A = redheffer(n)
    k = 1:n;
    j = k';
    A = (mod(k,j) == 0);
    A(:,1) = 1;
end
Or
A = gallery('redheff',n)
Here is the 10-by-10.
A = redheffer(10);
disp(full(A))
     1     1     1     1     1     1     1     1     1     1
     1     1     0     1     0     1     0     1     0     1
     1     0     1     0     0     1     0     0     1     0
     1     0     0     1     0     0     0     1     0     0
     1     0     0     0     1     0     0     0     0     1
     1     0     0     0     0     1     0     0     0     0
     1     0     0     0     0     0     1     0     0     0
     1     0     0     0     0     0     0     1     0     0
     1     0     0     0     0     0     0     0     1     0
     1     0     0     0     0     0     0     0     0     1
And here is a spy plot of the 200-by-200.
A = redheffer(200);
spy(A)
title('redheffer(200)')
The Möbius function was introduced in 1832 by the German mathematician August Möbius and is ubiquitous in the study of prime numbers. For a positive integer n, μ(n) is
+1 if n is square-free with an even number of prime factors,
-1 if n is square-free with an odd number of prime factors,
 0 if n has a squared prime factor.
Franz Mertens (1840-1927) was a Polish mathematician who spent most of his career in Austria at the University of Vienna. Here is a link to a biography of Mertens at the University of St. Andrews MacTutor project.
The Mertens function is the sum of the values of the Möbius function. For a positive integer n, the Mertens function is
M(n) = sum(μ(1:n))
So M(n) is the difference between the number of square-free integers up to n with an even number of prime factors and the number with an odd number.
This graphic shows M(n) for n = 1:100000.
mertens_plot_2
The Mertens function M(n) fluctuates wildly and grows slowly with increasing n. The graphic shows that M(n) is easily bounded by sqrt(n) and -sqrt(n), at least for n less than 100,000. The Mertens conjecture is that this continues for larger n.
-sqrt(n) < M(n) < sqrt(n) for all n
The conjecture was included in a letter from Stieltjes to Hermite in 1895 and in a paper by Mertens in 1897. The result is important since a proof of the Mertens conjecture would imply the truth of the Riemann hypothesis.
I became interested in all this when I learned that the determinant of a MATLAB Gallery matrix, one that I had ignored for years, is related to the Riemann hypothesis and the million-dollar prize.
M(n) = det(gallery('redheff',n))
I know very little about the distribution of prime numbers and computing values of the Möbius function. On the other hand, I know a lot about numerical linear algebra and computing determinants.
In general, I am dead set against computing determinants. They are often used to check for singularity or to somehow compute eigenvalues. But here the determinant is an integer counter of modest size.
Computing M(n) directly with det(redheffer(n)) requires O(n^2) space and O(n^3) time and is not practical for large n.
However, A = redheffer(n) is modestly sparse. Here is the fraction of nonzeros.
s = nnz(A)/n^2
disp(sparsity)
       n          s
    _____     ________
    10000     0.001037
    20000     0.000553
    30000     0.000382
    40000     0.000294
    50000     0.000239
    60000     0.000203
    70000     0.000176
    80000     0.000156
    90000     0.000139
    1e+05     0.000127
Taking advantage of this sparsity with the MATLAB tools for sparse matrix computation provides linear space complexity and perhaps O(n^2) time complexity.
Here is MATLAB code to generate a sparse redheffer(n) without creating any full intermediate matrices.
type redheffer
function A = redheffer(n)
    j(1:n) = (1:n)';
    k(1:n) = 1;
    m = n;
    for i = 2:n
        t = [1 i:i:n]';
        p = length(t);
        j(m+(1:p)) = t;
        k(m+(1:p)) = i;
        m = m+p;
    end
    A = sparse(k,j,1,n,n);
end
As expected, we see that the execution time for redheffer(n) is a linear function of n. (The space required also grows linearly.)
redheffer_time
The MATLAB Gaussian elimination function lu with one sparse input and four sparse outputs is designed for solving sparse linear systems.
[L,U,P,Q] = lu(A)
Written primarily by Tim Davis and included in his UMFPACK package, the function uses an unsymmetric pattern multifrontal pivoting strategy to find permutations P and Q so that L is lower triangular, U is upper triangular and
P*A*Q = L*U
Consequently, the determinant of A is the product of four easily computed determinants.
det(A) = det(L)*det(U)*det(P)*det(Q)
The pivoting strategy aims to reduce fill-in while maintaining numerical stability.
For example, here are L and U for the Redheffer matrix in the spy plot near the top of this blog post.
close
A = redheffer(200);
[L,U,P,Q] = lu(A);
spy(L|U)
title('L|U')
And here are the four determinants.
dets = [det(L), det(U), det(P), det(Q)];
disp(dets)
1 -8 -1 -1
Finally, M(200) is
M_200 = det(L)*det(U)*det(P)*det(Q)
M_200 = -8
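The four-determinant identity can be sketched in Python. This toy version uses dense Gaussian elimination with partial pivoting only, so there is no column permutation (Q = I) and det(A) = det(P)*prod(diag(U)); the real computation uses UMFPACK's sparse P*A*Q = L*U factorization instead.

```python
def redheffer(n):
    # dense Redheffer matrix: A[i][j] = 1 if j == 0 or (i+1) divides (j+1)
    return [[1.0 if j == 0 or (j + 1) % (i + 1) == 0 else 0.0
             for j in range(n)] for i in range(n)]

def det_via_lu(A):
    # Determinant from P*A = L*U with partial pivoting.
    # L has unit diagonal, so det(A) = det(P) * prod(diag(U)),
    # where det(P) is +1 or -1 according to the number of row swaps.
    n = len(A)
    U = [row[:] for row in A]
    sign = 1.0                       # running value of det(P)
    det = 1.0
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(U[i][k]))
        if U[p][k] == 0.0:
            return 0.0               # singular
        if p != k:
            U[k], U[p] = U[p], U[k]
            sign = -sign
        det *= U[k][k]
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]
            for j in range(k, n):
                U[i][j] -= m * U[k][j]
    return sign * det
```

Since the determinant is a modest integer, rounding the floating point result recovers M(n) exactly for these sizes.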
The Mertens function can be computed with four lines of code.
type mertens
function M = mertens(n)
    der = @(x) full(round(prod(diag(x))));
    A = redheffer(n);
    [L,U,P,Q] = lu(A);
    M = der(L)*der(U)*det(P)*det(Q);
end
Execution time for mertens is dominated by the time to generate the sparse matrix and the time spent in sparse lu. The time required to compute the four determinants is an order of magnitude smaller than the other two.
Experimentally, we see that the time complexity of sparse lu is O(n^2), but we have no proof.
mertens_time2
The Mertens conjecture stood for nearly 100 years before it was proved false in 1985 by Andrew Odlyzko and Herman te Riele. The authors present indirect proofs that
lim sup (n -> inf) M(n)/sqrt(n) > 1.06
lim inf (n -> inf) M(n)/sqrt(n) < -1.009
Odlyzko and te Riele do not actually produce any value of n for which M(n) > sqrt(n). They suspect that any counterexample to the Mertens conjecture requires n > 10^30, which is far beyond any computation possible today.
Odlyzko and te Riele also describe several complete tabulations of M(n) for n as large as 7.8 x 10^9. These computations do not use Redheffer determinants.
To tell the truth, I did not really expect to find any Mertens or Riemann counterexamples. I did, however, enjoy computing determinants for the first time and discovering an unexpected use for sparse LU.
Thanks a lot to Tim Davis for his help with this post.
A. M. Odlyzko and H. J. J. te Riele, "Disproof of the Mertens conjecture", Journal für die reine und angewandte Mathematik, Vol. 357 (1985), pages 138-160. https://eudml.org/doc/152712.
Timothy A. Davis, "A Column Pre-Ordering Strategy for the Unsymmetric-Pattern Multifrontal Method", ACM Transactions on Mathematical Software, Vol. 30, No. 2, June 2004, Pages 165–195. https://dl.acm.org/doi/abs/10.1145/992200.992205.
The NA-Digest is an electronic newsletter for the numerical analysis and scientific software community. The NA-Digest is one of the world's first examples of social networking. The Digest is one of the forces that makes our community a living, viable community.
The Digest is part of NA-Net, which also includes Netlib, a collection of mathematical software, papers, and databases.
For its first forty years, the NA-Digest has had only four editors. Now, we are adding two more. As we do that, I would like to take a personal look back at the history of the NA-Digest.
Like many other developments in the numerical analysis world, the Digest began with Gene Golub. In the early 1980's, Golub was chair of Stanford's Computer Science Department. Email was a new thing and Gene maintained a list of email addresses for his many friends around the world. Email addresses came in many different formats; today's system of domain names was not yet in widespread use.
In 1984, Mark Kent, one of Gene's grad students, with help from Ray Tuminaro, Mark Crispin and Dan Kolkowitz, wrote software that used Gene's list in an email forwarding service. Mail sent to
na.name@su-score
would be forwarded to the person with that last name. And email sent to
na@su-score
would be forwarded to everyone on the list.
Gene and Mark Kent began to gather contributions together and send the collection out periodically. By February 1987, this had evolved into a moderated weekly newsletter. Gene dubbed these email services the NA-Net.
Nick Trefethen has this memory.
Early in the days of email, domain names were all over the place. I think there was a period when Gene was using xxx.na for the names of numerical analysts. Then somebody decided addresses should end with the country, giving us .uk and .fr and .ch and all that. For a period, we found that a lot of our numerical analysis emails were being directed to Namibia!
In 1987, Gene asked me to moderate NA-Digest temporarily while he went on a sabbatical. That temporary position ultimately lasted 18 years, until 2005.
Jack Dongarra began his career at Argonne National Laboratory. Jack's colleague, Eric Grosse, began his career at Bell Labs. In 1984, Jack and Eric created Netlib, a software repository and distribution service, and merged it with NA-Net. In 1989, Jack and the NA-Net system moved from Argonne to Oak Ridge National Lab and the University of Tennessee.
Keith Moore, at the University of Tennessee, rewrote the NA-Net software and maintained the servers for many years.
Gerald Ragghianti, the Technical Services Leader at Tennessee's Innovative Computer Lab, currently maintains the NA-Net software and servers.
In 2005, I asked Tammy Kolda, who was then at Sandia Labs in Livermore, to be the NA-Digest editor. Tammy's Wikipedia page reveals that her given name is "Tamara", but everybody calls her "Tammy". She left Sandia in 2021 and now has her own consulting company, MathSci.ai.
In 2010, Tammy recommended that Danny Dunlavy, from Sandia Labs in Albuquerque, take over as editor. He has been the editor for 14 years. Danny's day job at Sandia's Center for Computing Research involves a wide range of fields including computer architecture, cognitive modeling and discrete mathematics.
Starting next week, NA-Digest and NA-Net will move lock, stock, and barrel to Cornell University. The new editors are David Bindel and Alex Townsend. Cornell's IT organization will be taking over the logistics.
David, Alex and Anil Damle are also the hosts for Householder XXII, June 8-13, 2025.
Every issue of NA-Digest since February 1987 is available at https://www.netlib.org/na-digest-html.
When I succeeded Gene as editor in 1987, there were about 800 names on the NA-Net mailing list. Today, in 2024, there are a little over 10,000. Discontinuities in the size of the list result when unused and illegitimate names are removed.
I have made three personally important announcements in the Digest over the years.
October 29, 1989
In 1989 I was working at Ardent Computer, a startup in Silicon Valley. I announced in NA-Digest that MathWorks was looking for a numerical analyst. (Note the MathWorks telephone number near the end of this announcement.)
From: Cleve Moler <na.moler@na-net.stanford.edu>
Date: Sun Oct 29 10:39:38 PST 1989
Subject: Positions at The MathWorks
The MathWorks is the company which develops and markets MATLAB. The company currently employs about 30 people and expects to add three or four more soon. The company headquarters is in East Natick, Massachusetts, which is about a half hour drive west of Boston.
The background and interests expected for the various positions available range from numerical linear algebra and matrix computation to systems programming and graphics. Educational level and experience expected range from inexperienced beginner willing to learn to seasoned Ph.D. with a personal library of M-files.
For more information, send email to na.moler@na-net.stanford.edu or phone me at 408-732-0400. Or, contact the MathWorks' president, John Little, with email to jnl@mathworks.com, phone 508-653-1415, or write to:
The MathWorks
21 Eliot Street
South Natick, MA 01760
November 26, 1989
Shortly after that announcement, Ardent imploded and I said that I was taking the job myself.
From: Cleve Moler <moler@mathworks.com>
Date: Sun Nov 26 09:16:32 PST 1989
Subject: Change of Address for Moler
A couple of weeks ago, I made an announcement here that The MathWorks, the MATLAB company, was looking to fill several positions, including one for a numerical analyst. Now, I've decided to take that slot myself. I'm one of the founders of MathWorks, and have always been a consultant to the company, but now I'll be a full time employee. For a while at least, I'll be working out of my home in California, even though MathWorks headquarters is in Massachusetts. (There are already a couple of other MathWorks developers in the Bay Area.) . . . My electronic address is "moler@mathworks.com". If your mailer can't find the route via uunet to mathworks.com, you can still use "na.moler@na-net.stanford.edu".
November 16, 2007
In November 2007 I was attending the Super Computing conference in Reno. I had rented a car and intended to drive to the Bay Area after the conference. But my wife called me and said, "Hey Cleve, have you heard that Gene is in the hospital?" I left SC immediately and drove to Palo Alto. Two days later we sent out a special issue of the Digest:
From: Cleve Moler <Cleve.Moler@mathworks.com>
Date: Fri, 16 Nov 2007 17:55:42 -0500
Subject: Gene Golub, 1932 - 2007
Gene Golub, founder of the NA Digest, passed away today, Friday, November 16, at the Stanford University Hospital. He was 75 years old.
Gene returned home to Stanford recently from a trip to Hong Kong. He was planning to leave again Tuesday on another trip, this one to Zurich where the ETH was to honor him with a special degree. Instead, Sunday night he went to the emergency room because he was "feeling lousy". On Tuesday, he was found to have AML, acute myelogenous leukemia, a form of cancer that affects the white blood cells. This is a potentially curable disease and he was expecting to begin chemotherapy today. But serious complications developed suddenly over night.
I was able to see Gene for an hour last night and he was in reasonably good spirits. Mike Saunders was trying to get Gene's laptop to use dial-up over the hospital's phone system because Gene said he was a couple of days behind on his email. I was planning to get a wireless card for his machine today. None of us had any idea how suddenly the situation would worsen.
The Stanford ICME students have created a memorial blog at http://genehgolub.blogspot.com .
Our community has lost its foremost member. He was a valued colleague and friend. Goodbye, Gene.
-- Cleve Moler
Jack Dongarra, Gene Golub, Eric Grosse, Cleve Moler and Keith Moore. "Netlib and NA-Net: Building a Scientific Computing Community", IEEE Annals of the History of Computing, (Volume: 30, Issue: 2, April-June 2008). A PDF is available here.
Mark Kent, The Numerical Analysis Net (NA-NET), Technical Report 85, ETH Zurich, Institut fuer Informatik, 1988.
A few days ago, a bug report from our office in Cambridge caught my attention. Computing the singular values and singular vectors of a particular matrix would sometimes cause MATLAB to crash.
I use two different computers regularly. The machine in my home office is a Lenovo ThinkPad® model T14, loaded with two external monitors, several external disc drives, a sound bar and a dedicated internet connection. My traveling machine is a Lenovo ThinkPad X1 Nano with no external hardware.
The report of a crash in the SVD became even more interesting when I found that it happens on my office computer, but not on the portable. A quick check revealed that the CPUs on the two machines come from different manufacturers. The office computer uses an AMD® Ryzen Pro 5 while the traveling machine uses an Intel® Core i7.
The crash occurs several software layers deep in CGESVD, the LAPACK driver for single precision complex SVD. Various chip manufacturers provide math libraries that have been optimized for their CPUs. However, by default, MATLAB uses the reportedly faster Intel Math Kernel Library, MKL. It is possible to switch to other math libraries.
We have experts at MathWorks who know far more about the details of these libraries than I do. They will soon sort this all out. In the meantime, here is what I have learned about the offending matrix.
We refer to the matrix in the crash report by its case number in our bug tracking system. The matrix is of modest size but is otherwise unusual for several reasons. It is rectangular with fewer rows than columns, it is in single precision, and it is complex.
clear
load g3366394 X
whos
  Name        Size              Bytes  Class     Attributes

  X         219x384            672768  single    complex
The following code calling SVD with three outputs will crash MATLAB on my T14, but not on my X1.
trysvd = false
if trysvd
    [U,S,V] = svd(X);
    R = U*S*V' - X;
    relres = norm(R)/norm(X)
end
trysvd =

  logical

   0
Computing the singular values without the vectors can be done on either machine. The following code uses double precision and then plots the singular values on a logarithmic scale with a line at single precision roundoff level.
S = svd(X);
semilogy(S,'.-')
ep = eps('single');
line([0 230],[ep ep])
axis([0 230 1e-15 10])
legend({'singular values','eps(''single'')'})
We see that the matrix is far from full rank. About half of its singular values are negligible. This is confirmed by
xrank = rank(X)
xrank = 110
The cause of the low rank is easy to find. This surf plot reveals that almost half of the rows are flat zero.
Let's remove the zero rows.
c = sum(abs(X),2)==0;
nnzc = nnz(c)
X(c>0,:) = [];
nnzc = 109
The remaining matrix is full rank and it is safe to compute its singular vectors.
[U,S,V] = svd(X);
R = U*S*V' - X;
resnorm = norm(R)
resnorm = 2.8191e-06
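The zero-row workaround is easy to reproduce outside MATLAB. Here is a numpy sketch on a synthetic stand-in matrix with the same shape, precision, and zero-row structure (the actual matrix from the bug report is not public, so the random data here is only an assumption).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the g3366394 matrix: single-precision complex,
# 219-by-384, with 109 rows that are identically zero.
X = rng.standard_normal((219, 384)) + 1j * rng.standard_normal((219, 384))
X = X.astype(np.complex64)
X[110:, :] = 0

zero_rows = np.abs(X).sum(axis=1) == 0   # find the all-zero rows
Xr = X[~zero_rows]                       # and remove them

# SVD of the remaining full-rank matrix, with a residual check
U, s, Vh = np.linalg.svd(Xr, full_matrices=False)
resnorm = np.linalg.norm(U @ np.diag(s) @ Vh - Xr, 2)
```

The residual norm is at single-precision roundoff level, as in the MATLAB run above.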
Removing the zero rows was the first work-around that I tried. There are many others. You can replace the zeros with any nonzero "fuzz".
fuzz = 1.e-20;
[U,S,V] = svd(X + fuzz*randn(size(X)));
resnorm = norm(U*S*V'-X)
resnorm = 2.8222e-06
You can reorder the matrix so that the zero rows are not at the beginning.
[U,S,V] = svd(flipud(X));
U = flipud(U);
resnorm = norm(U*S*V'-X)
resnorm = 2.3809e-06
How to avoid the crash is not the most important question. What causes the crash with the original matrix? We will find out and get it fixed.
Floating point arithmetic doesn't get the respect it deserves. Many people consider it mysterious, fuzzy, unpredictable. These misgivings often occur in discussions of vector sums. Our provocatively named SuperSum is intended to calm these fears.
A ledger is a list of transactions in an account. Auditing the ledger involves comparing the total of the items with the change in the account balance.
If v is a vector of transaction amounts, then we need to compute
sum(v)
If this sum is equal to the change in the balance, then it is reasonable to assume that the ledger is correct. If not, the ledger must be examined line-by-line.
Have you ever used a checksum for a file transfer? One checksum is computed before the file is sent. After the file has been sent over a questionable channel, a second checksum is computed on the receiving end. If the two checksums agree, the transmission was probably okay. If not, the file must be sent again.
Floating point addition is not associative. This means
(a + b) + c
is not necessarily the same as
a + (b + c).
So the order of doing a computation is important.
Computers are deterministic devices. If the same computation is done in the same order more than once, the results must be the same. If you get different sums from different runs on a fixed computer, then it must be because the order has been changed.
In recent years, we have made the built-in function sum(x) faster by parallelizing it. The input vector is broken into several pieces, the sum of each piece is computed separately and simultaneously, then the partial sums are combined to give the final result. If the number and size of the pieces vary from run to run, the order varies and consequently the computed sums may vary.
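Here is a small Python illustration of the same effect. Summing one vector strictly left to right, and then in two blocks whose partial sums are combined the way a parallelized sum might do it, gives two different answers; neither matches the correctly rounded sum.

```python
import math

x = [0.1] * 10 + [1e16, -1e16]

# one running sum, strictly left to right
seq = 0.0
for v in x:
    seq += v          # the 0.1's are absorbed and lost at 1e16

# the same data summed in two blocks whose partial sums are combined
half = len(x) // 2
blocked = sum(x[:half]) + sum(x[half:])

exact = math.fsum(x)  # correctly rounded sum, for reference
```

The sequential sum returns 0.0 because the accumulated 0.1's are swamped when 1e16 is added; the blocked sum keeps the first block's total intact.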
Here are three replacements for nansum(x), the version of sum(x) that skips over NaNs and infs.
simplesum
Avoid the effect of blocking in the built-in sum(x).
function s = simplesum(x)
% simplesum.  s = simplesum(x), for vector x.
% Force sequential order for sum(x).
% Skips NaNs and infs.
    s = 0;
    for k = 1:length(x)
        if isfinite(x(k))
            s = s + x(k);
        end
    end
end
KahanSum
This is the Kahan summation algorithm. The sum is accumulated in two words, the more significant s and the correction c. If you were able to form s + c exactly, it would be more accurate than s alone.
function s = KahanSum(x)
% KahanSum.  s = KahanSum(x), for vector x.
% More accurate, but slower, than sum(x).
% Skips NaNs and infs.
% https://en.wikipedia.org/wiki/Kahan_summation_algorithm.
    s = 0;   % sum
    c = 0;   % correction
    for k = 1:length(x)
        if isfinite(x(k))
            y = x(k) - c;
            t = s + y;
            c = (t - s) - y;
            s = t;
        end
    end
end
SuperSum
I gave it a pretentious name to grab attention. Use extended precision, with enough digits to hold any MATLAB double.
function s = SuperSum(x)
% SuperSum.  s = SuperSum(x), for vector x.
% Symbolic Math Toolbox extended precision.
% Skips NaNs and infs.
%
% 632 decimal digits holds every IEEE-754 double.
% 632 = ceil(log10(realmax) - log10(eps*realmin));
    din = digits(632);   % Remember current setting
    s = double(sum(sym(x(isfinite(x)),'D')));
    digits(din)          % Restore
end
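Outside MATLAB, the same trick can be played with exact rational arithmetic. Python's fractions module makes a short sketch (the name supersum is mine): every double is an exact rational, the rationals are added exactly, and a single correctly rounded conversion happens at the very end.

```python
import math
from fractions import Fraction

def supersum(x):
    # Sum exactly in rational arithmetic, skipping NaNs and infs,
    # then commit a single rounding back to double at the end.
    exact = sum((Fraction(v) for v in x if math.isfinite(v)), Fraction(0))
    return float(exact)
```

Python's built-in math.fsum plays the same role with a faster algorithm.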
A test case here at MathWorks, known by a French-Canadian name that translates to "toy example", has a vector e2 of length 4243 and values that range from -3.3e7 to 6.8e9.
When running tests involving floating point numbers it is a good idea to set the output format to hex so we can see every last bit.
format hex
load jeuTest e2
x = e2;
[nansum(x) simplesum(x) KahanSum(x) SuperSum(x)]
ans =
   423785e8206150e2   423785e8206150e0   423785e8206150e1   423785e8206150e1
For jeuTest, Kahan summation gets the same result as SuperSum, while nansum and simplesum differ in the last bit or two.
Here is a second example. The vector length is only three, but the third term completely cancels the first, and the second term rises from obscurity. In this situation, KahanSum is no more accurate than sum.
format hex
x = [1 1e-14 -1]'
[nansum(x) simplesum(x) KahanSum(x) SuperSum(x)]
x =
   3ff0000000000000
   3d06849b86a12b9b
   bff0000000000000

ans =
   3d06800000000000   3d06800000000000   3d06800000000000   3d06849b86a12b9b
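The same cancellation is easy to reproduce in Python, where the standard library's math.fsum plays the role of SuperSum (a sketch, not part of the post):

```python
import math

x = [1.0, 1e-14, -1.0]

naive = sum(x)        # left to right; low bits of 1e-14 are lost in 1 + 1e-14
exact = math.fsum(x)  # exact summation, rounded once at the end

print(naive == 1e-14)  # False
print(exact == 1e-14)  # True
```

fsum tracks the running sum as a list of exact partial values, so the only rounding error is the single rounding of the final result.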
I will leave careful timing for another day. Let's just say that in situations like jeuTest, KahanSum is probably all you need. It is usually more accurate than sum, and not much slower.
However, for complete reliability, use SuperSum. There is no substitute for the right answer.
Get the MATLAB code
Published with MATLAB® R2024a
Our technical support group recently received a request for a tool that would convert IBM System/360 hexadecimal floating point numbers to the IEEE-754 format. I am probably the only one left at MathWorks that actually used IBM mainframe computers. I thought we had seen the last of hexadecimal arithmetic years ago. But, it turns out that the hexadecimal floating point format is alive and well.
The System/360 is a family of mainframe computers that IBM introduced in 1965 and that dominated the computer industry until PCs came along twenty years later. The models ranged in size from desk-sized units to systems that filled a large room.
Here is a photo of a mid-sized model.
System/360, Model 60. Photo from Ken Shirriff's blog, IBM 360/System Summary.
The System/360 architecture is byte-oriented, so it can handle business data processing as well as scientific and engineering computation. This leads to base-16, rather than base-2 or base-10, floating point arithmetic.
* Binary        f*2^e    1/2  <= f < 1
* Decimal       f*10^e   1/10 <= f < 1
* Hexadecimal   f*16^e   1/16 <= f < 1
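These normalizations can be checked with a few lines of Python (the helper function is mine, for illustration only):

```python
import math

def normalize(x, base):
    """Split positive x into (f, e) with x = f * base**e and 1/base <= f < 1."""
    e = math.floor(math.log(x, base)) + 1
    f = x / base**e
    if f >= 1.0:          # guard against log() rounding at exact powers
        f /= base; e += 1
    elif f < 1.0/base:
        f *= base; e -= 1
    return f, e

for base in (2, 10, 16):
    f, e = normalize(118.625, base)
    print(base, f, e)
```

For example, 118.625 normalizes to 0.9267578125 * 2^7 in binary, 0.118625 * 10^3 in decimal, and 0.46337890625 * 16^2 in hexadecimal.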
Floating point formats played an important role in technical computing in the early days. This table from FMM lists formats that were in use in the 1970s, before IEEE-754 was introduced in 1985.
The System/360 hexadecimal format is used in many industries for the preservation of data files.
CREWES. Teaching exploration seismology. Comprehensive MATLAB toolbox for use with the textbook "Numerical Methods of Exploration Seismology with algorithms in Matlab" by Gary F. Margrave, a geoscience professor at the University of Calgary.
Library of Congress. Government.
NASA. Astronautics.
SAS. Statistics and business analytics. SAS wrappers for C.
Enthought. Python wrappers for C.
Hex_ieee. I have two twenty-line MATLAB functions, ieee2ibm and ibm2ieee, that convert IEEE-754 floating point to and from IBM hexadecimal format.
Three statements in the middle of ieee2ibm are the key to the entire operation. The first statement is
[~,e] = log2(x)
With two output arguments, log2 returns the mantissa and exponent of an IEEE-754 floating point number. The mantissa is not needed here.
The second key statement
e = ceil(e/4)
rounds e/4 up to the next integer. This turns e into the appropriate hexadecimal exponent so that the third statement
f = x.*16.^(-e)
can produce the hexadecimal mantissa.
function z = ieee2ibm(x)
% Convert IEEE-754 to IBM System/360 hexadecimal.
% z = ieee2ibm(x)
% Input x, real column vector.
% Output z, length(x)-by-16 char.
% Example: ieee2ibm(-118.625) = 'C276A00000000000'.
    s = sign(x);                      % -1, 0, or 1
    x = abs(x);
    x(x < 16^(-65)) = 0;              % Underflow
    x(x >= 16^63) = (1-eps/2)*16^63;  % Overflow
    [~,e] = log2(x);                  % base 2 exponent
    e = ceil(e/4);                    % base 16 exponent
    f = x.*16.^(-e);                  % base 16 mantissa
    E = uint64((e+64)*2^56);          % Assemble output
    F = uint64(f*2^56);
    S = uint64((1-s)*2^62);           % 1 or 0
    z = dec2hex(S + E + F);           % z = 'ZZFFFFFFFFFFFFFF'
end
function x = ibm2ieee(z)
% Convert IBM System/360 hexadecimal to IEEE-754.
% x = ibm2ieee(z)
% Input z, n-by-16 char.
% Output x, n-by-1 double.
% Example: ibm2ieee('C276A00000000000') = -118.625.
    E = hex2dec(z(:,1:2));       % Disassemble input
    F1 = hex2dec(z(:,3:8));      % < 16^6
    F2 = hex2dec(z(:,9:end));    % < 16^8
    s = sign(128-E);             % -1 or 1
    e = E-(s>0)*64-(s<0)*192;    % base 16 exponent
    f = F1/16^6 + F2/16^14;      % base 16 mantissa
    x = s.*f.*16.^e;
end
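For comparison, here is a rough Python translation of the pair (my sketch, not from the post; it assumes nonzero inputs within the IBM exponent range and omits the underflow and overflow handling):

```python
import math

def ieee2ibm(x):
    """Encode a nonzero float as 16 hex digits of IBM System/360 double format."""
    s = 0x80 if x < 0 else 0x00
    x = abs(x)
    _, e2 = math.frexp(x)           # x = m * 2**e2 with 0.5 <= m < 1
    e = math.ceil(e2 / 4)           # base-16 exponent
    f = x * 16.0**(-e)              # base-16 mantissa, 1/16 <= f < 1
    F = int(f * 16**14)             # 56 fraction bits
    return '{:02X}{:014X}'.format(s | (e + 64), F)

def ibm2ieee(z):
    """Decode 16 hex digits of IBM System/360 double format to a float."""
    E = int(z[0:2], 16)             # sign bit plus biased exponent
    F = int(z[2:16], 16)            # 56-bit fraction
    s = -1.0 if E & 0x80 else 1.0
    e = (E & 0x7F) - 64             # remove the bias of 64
    return s * (F / 16**14) * 16.0**e

print(ieee2ibm(-118.625))           # 'C276A00000000000'
print(ibm2ieee('C276A00000000000')) # -118.625
```

The structure mirrors the MATLAB code: the sign occupies the top bit of the first byte, the biased base-16 exponent the remaining seven bits, and the 14 hex digits of mantissa follow.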
Underflow. Anything < 16^(-65) is too small and is flushed to zero. There are no denormals.
Overflow. Anything >= 16^63 is too large. There is no inf or NaN.
* 1.0          4110000000000000
* 0.1          401999999999999A
* -pi          C13243F6A8885A30
* 5.3976e-79   0010000000000000
* 7.2370e+75   7FFFFFFFFFFFFFF8
S/360 hexadecimal has 7 exponent bits, while IEEE-754 has 11. Consequently, hexadecimal has a much smaller range, 5.4e-79 to 7.2e+75 versus 2.2e-308 to 1.8e+308.
The base-16 normalization implies that hexadecimal effectively has between 53 and 56 mantissa bits. Counting the hidden bit, IEEE-754 also has 53. So, the accuracy of the two is pretty much the same.
My functions ieee2ibm and ibm2ieee described above, modified to handle both single and double, plus hex_test, which does what its name implies, are available at Hex_ieee.
Homework: What happens?
ok = 0;
for k = 1:10
    x = single(k/10);
    ok(k) = hex_test(x);
end
ok
The graphics in my post about R^2 were produced by an updated version of a sixty-year old program involving the U.S. census. Originally, the program was based on census data from 1900 to 1960 and sought to predict the population in 1970. The software back then was written in Fortran, the predominant technical programming language a half century ago. I have updated the MATLAB version of the program so that it now uses census data from 1900 to 2020.
The latest version of the census application is now available at censusapp2024. Here are the data and the opening screenshot.
[t,p] = UScensus;
fprintf('%12d%12.3f\n',[t,p]')
        1900      75.995
        1910      91.972
        1920     105.711
        1930     123.203
        1940     131.669
        1950     150.697
        1960     179.323
        1970     203.212
        1980     226.505
        1990     249.633
        2000     281.422
        2010     308.746
        2020     331.449
Today, MATLAB makes it easier to vary parameters and visualize results, but the underlying mathematical principles are unchanged:
One new observation is added to the data every 10 years, when the United States does the decennial census. Originally there were only 7 observations; today there are 13. The program now allows you to fit the data exactly by interpolation with a polynomial of degree 12 or fit it approximately by polynomials of degree less than 12.
Here are the least-squares fits with linear, cubic, and degree seven polynomials and the interpolating polynomial. As the polynomial degree increases, so does R^2, until R^2 reaches one with the exact fit.
Do any of these fits look like they could be used to predict future population growth?
In addition to polynomials, you can choose interpolation by three different piecewise Hermite cubics.
Since these fits interpolate the data, all their R^2 values are one. But before 1900 and after 2020 these functions are cubic polynomials that are not designed for extrapolation.
It is also possible to do nonlinear least squares fits by an exponential, a logistic sigmoid, and an exponential of an exponential known as the Gompertz model.
An article by Kathleen and Even Tjørve, from the Inland Norway University of Applied Sciences in Elverum, Norway, in the journal PLOS ONE has this to say about Gompertz. "The Gompertz model has been in use as a growth model even longer than its better known relative, the logistic model. The model, referred to at the time as the Gompertz theoretical law of mortality, was first suggested and first applied by Mr. Benjamin Gompertz in 1825. He fitted it to the relationship between increasing death rate and age, what he referred to as 'the average exhaustions of a man's power to avoid death' or the 'portion of his remaining power to oppose destruction.'"
Which fits are suitable for predicting future population size?
Despite their large R^2 values, polynomials of any degree are not suitable because outside the time interval of the data they do what polynomials always do: grow without bound. They do not provide realistic predictions.
Splines were never intended for extrapolation.
That leaves the exponentials. The simple exponential model grows exponentially and is not suitable. The Gompertz fit does approach a finite asymptotic limit, but the value is an astronomical a = 2101, corresponding to 2.1 $\times 10^9$ inhabitants. Hopefully, that is out of the question.
The logistic fit has an asymptotic limit of a = 655.7. We recently passed the value of t where p(t) reaches a/2, namely c = 2018. So, the logistic model predicts that the long-term size of the U.S. population will be about twice its current value. Is that realistic? Probably not.
The British statistician George Box once said, "All models are wrong, but some are useful." This is true of the models of the U.S. Census that I have discussed over the past sixty years.
Here is censusapp2024 after all its buttons have been pushed. The extrapolation date is set to 2040. White noise has been added to the data. The model is a fourth-degree polynomial with an R^2 = 0.99. The R^2 value and the error estimates produced by errs account for errors in the data, but not in the model.
This particular model does a lousy job of predicting even twenty years in the future. Some of the other models are better, many are worse. Hopefully, their study is worthwhile.
I have made blog posts about the census before, in 2020 and in 2017.
Predicting population growth is featured in Computer Methods for Mathematical Computations, by George Forsythe, Mike Malcolm and myself, published by Prentice-Hall in 1977. That textbook is now available from an interesting smorgasbord of sources, including Google Scholar, Amazon, dizhasneatstuff, Abe Books, Internet Archive, PDAS, WorldCat (Chinese).
censusapp2024 is available at censusapp2024.
The coefficient of determination, R-squared or R^2, is a popular statistic that describes how well a regression model fits data. It measures the proportion of variation in data that is predicted by a model. However, that is all that R^2 measures. It is not appropriate for any other use. For example, it does not support extrapolation beyond the domain of the data. It does not suggest that one model is preferable to another.
I recently watched high school students participate in the final round of a national mathematical modeling competition. The teams' presentations were excellent; they were well-prepared, mathematically sophisticated, and informative. Unfortunately, many of the presentations abused R^2. It was used to compare different fits, to justify extrapolation, and to recommend public policy.
This was not the first time that I have seen abuses of R^2. As educators and authors of mathematical software, we must do more to expose its limitations. There are dozens of pages and videos on the web describing R^2, but few of them warn about possible misuse.
R^2 is easily computed. If y is a vector of observations, f is a fit to the data and ybar = mean(y), then
R^2 = 1 - norm(y-f)^2/norm(y-ybar)^2
If the data are centered, then ybar = 0 and R^2 is between zero and one.
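The formula translates directly into a few lines of Python (an illustrative sketch, standard library only):

```python
def r_squared(y, f):
    """Coefficient of determination for a fit f to data y."""
    ybar = sum(y) / len(y)
    ss_res = sum((yi - fi)**2 for yi, fi in zip(y, f))  # residual variation
    ss_tot = sum((yi - ybar)**2 for yi in y)            # total variation
    return 1.0 - ss_res / ss_tot

# A fit that merely predicts the mean has R^2 = 0;
# a fit that reproduces the data exactly has R^2 = 1.
y = [75.995, 91.972, 105.711, 123.203]
mean_fit  = [sum(y)/len(y)] * len(y)
exact_fit = list(y)
print(r_squared(y, mean_fit))   # 0.0
print(r_squared(y, exact_fit))  # 1.0
```

Note that R^2 says nothing about how either fit would behave outside the range of the data, which is exactly the limitation discussed here.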
One of my favorite examples is the United States Census. Here is the population, in millions, every ten years since 1900.
  t          p
 ____     _______
 1900      75.995
 1910      91.972
 1920     105.711
 1930     123.203
 1940     131.669
 1950     150.697
 1960     179.323
 1970     203.212
 1980     226.505
 1990     249.633
 2000     281.422
 2010     308.746
 2020     331.449
There are 13 observations. So, we can do a least-squares fit by a polynomial of any degree less than 12 and can interpolate by a polynomial of degree 12. Here are four such fits and the corresponding R^2 values. As the degree increases, so does R^2. Interpolation fits the data exactly and earns a perfect score.
Which fit would you choose to predict the population in 2030, or even to estimate the population between census years?
R2_census
Thanks to Peter Perkins and Tom Lane for help with this post.