Today's post will take us on an historical tour of the function zeros, and pertains, in various ways, to the related functions ones, eye, false, true, rand, randn, and complex.
You might think "How can a function like zeros need to evolve at all?", but you will see some of the MATLAB changes that led to its evolution.
In Cleve's original Fortran MATLAB, there was no zeros function. However, the function ones had the following help entry:
ONES All ones. ONES(N) is an N by N matrix of ones. ONES(M,N) is an M by N matrix of ones. ONES(A) is the same size as A and all ones.
When I first started using MATLAB in 1987 (version 3.05a), the behavior of the function zeros matched this ones design, the obvious difference being that the matrix had 0 instead of 1 for all of its entries.
Notice what this design means. I always get an array the same size as A unless A is a scalar, in which case I get an A-by-A array. This means that if we ever wanted a scalar zero, we had to program defensively: check whether A is a scalar, and if so create zeros(1), else create zeros(A). Constantly. This was a source of both confusion and bugs in code where people didn't recognize the discontinuity in behavior when they were programming.
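For comparison, here is a small sketch of how the same-size case is usually written in current MATLAB, where passing size(A) behaves the same way for scalars and matrices. The variable names are just for illustration.

```matlab
% Same-size allocation that treats scalars and matrices uniformly.
A_scalar = 5;
A_matrix = [1 2 3; 4 5 6];

Z1 = zeros(size(A_scalar));   % 1-by-1, not 5-by-5
Z2 = zeros(size(A_matrix));   % 2-by-3

disp(size(Z1))
disp(size(Z2))
```

No scalar check is needed because the size vector, not the value of A, determines the shape.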
To alleviate this nuisance, in MATLAB Version 4, we no longer used the syntax zeros(A) but instead expected the inputs to be non-negative integers. MATLAB still only had 2-dimensional double arrays (with an attribute describing when to interpret the values as characters). With this design, there was no more need for such defensive programming.
Once we introduced higher dimensional arrays in MATLAB (version 5), we changed the input syntax for zeros to allow 2 scalar non-negative integer inputs OR a vector of such values for N-Dimensional arrays. While it was nice to be able to create higher dimensional arrays this way, it was tricky to create them for non-double datatypes, another new feature in that release of MATLAB.
The most common way to generate an array of zeros of another type was to first create the correctly sized double array and then convert it to, let's say uint8. This was inefficient, first creating an array using 8 times more memory than necessary, and then converting it to the uint8 array, with its smaller datatype.
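To make the overhead concrete, here is a rough sketch of the two-step approach that compares the storage of the intermediate double array with the final uint8 array. The 100-by-100 size is an arbitrary choice for illustration.

```matlab
% Two-step approach: allocate doubles first, then convert to uint8.
D = zeros(100, 100);      % double: 8 bytes per element
U = uint8(D);             % uint8: 1 byte per element

% whos reports the storage used by each variable.
infoD = whos('D');
infoU = whos('U');
ratio = infoD.bytes / infoU.bytes;

fprintf('double: %d bytes, uint8: %d bytes, ratio: %d\n', ...
    infoD.bytes, infoU.bytes, ratio);
```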
Of course there were ways around this too. You could create a scalar zero of the right datatype and then assign it to the final element of a newly created array, like this for a 4-D array:
B(m,n,p,q) = uint8(zeros(1));
Effective, but perhaps a bit cryptic.
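Here is a small sketch of that trick with concrete sizes, which are assumptions chosen only for illustration. The clear is there because the assignment only produces an all-zero array when B does not already exist.

```matlab
m = 2; n = 3; p = 4; q = 5;

clear B                    % the trick is only safe if B does not exist yet
B(m,n,p,q) = uint8(0);     % assigning the last element sets the size and class

disp(class(B))             % class taken from the assigned value
disp(size(B))              % size taken from the indices
disp(any(B(:)))            % no nonzero elements
```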
In MATLAB 7, we introduced another new syntax for zeros and companions, allowing you to specify the datatype of the output, provided that it was one of the built-in datatypes. Here's a snippet of the relevant information.
X = zeros(..., classname) classname restricted to built in types: 'double', 'single', 'int8', 'uint8', 'int16', 'uint16', 'int32', 'uint32', 'int64', or 'uint64'
This was definitely more memory efficient than creating an array of doubles and then converting them. And less mysterious than setting the end element of a new array to a zero of the right type.
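As a quick sketch of the classname syntax, the following creates arrays directly in the requested type, with no intermediate double array. The sizes are arbitrary.

```matlab
% Allocate directly in the target datatype.
Z = zeros(2, 3, 'uint8');
O = ones(2, 2, 'single');

disp(class(Z))
disp(class(O))
```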
There was a remaining limitation however. It was still cumbersome to create an array of zeros of a user-defined datatype, including classes that shipped as part of our products.
As of R2013a, in addition to creating zero-arrays in all the ways possible since MATLAB Version 4, you can now create an array of zeros to be like another array, saving you the hassle of finding out that array's size and type and passing that information along. It also allows you to create arrays of any datatype, built-in or user-defined. How do you do this? Here's one of the possible syntax choices.
X = zeros(sz1,...,szN,'like',p)
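Here is a minimal sketch of the 'like' syntax, assuming R2013a or later. The prototype p is a made-up value, chosen so that both its class and its complexity show up in the result.

```matlab
% Prototype value: single precision and complex.
p = single(1 + 2i);

% The output takes its class and complexity from p, and its size from the inputs.
X = zeros(2, 3, 'like', p);

disp(class(X))
disp(isreal(X))
disp(size(X))
```

Only the attributes of p are used here; its value does not affect the contents of X.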
Have you been able to take advantage of some of these changes over time? Did you know about them? I'd love to hear your thoughts here.
Published with MATLAB® R2013a
12 Comments
Interesting to see the evolution. I use these types of functions quite a lot for pre-sizing arrays, but even after 8 years using Matlab I have to say the syntax of e.g. zeros(N) is still not intuitive to me. I keep making the mistake of using it when I want a vector of N zeros (where I don’t really care if it is a row or a column) before realising I have a square matrix instead!
If I want to have an array of zeros the same size as an array A, I just use B = 0*A. That also ensures that B is in the same class as A. Is there any reason to use zeros instead?
Thank you for the historical review.
I still wonder why don’t you allow creating a matrix with no initialization which should be much faster.
Namely just the memory allocation, with no need to run through it to set a value.
I often have bits of code that look something like x = zeros(size(A)); or x = ones(size(A));
So, the behavior that would be most convenient for me is for x = zeros('like',A); to be the same as x = zeros(size(A),'like',A); instead of returning a scalar. What’s the reasoning behind returning a scalar? Typing x = zeros(1,'like',A); would still be possible if a scalar is desired.
Thanks for the replies.
Adam, Yes – zeros(n) has indeed caused confusion.
Andrew, Not sure if the performance will differ at all – it might especially if you have non-finite elements in your matrix. And then you might not get all 0s as well.
Drazick, MATLAB doesn’t have arrays with uninitialized values in its definition, at least for now. We’d need to add some library functions, at least one, to ask if the array was uninitialized. That might clutter most input syntax checking – being sure inputs have values.
Sky, I think we were trying to unify the older and newer behaviors with similar (parallel) syntaxes.
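To illustrate the point about non-finite elements in the 0*A approach, here is a small sketch with a made-up matrix. Multiplying Inf or NaN by zero yields NaN under IEEE arithmetic, so the result is not all zeros, whereas zeros with 'like' (R2013a or later) is.

```matlab
A = [1 Inf; NaN 4];

B = 0 * A;                       % Inf and NaN entries become NaN
Z = zeros(size(A), 'like', A);   % all zeros, same class as A

disp(B)
disp(Z)
```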
The pattern B(m,n,p,q) = uint8(0); (or even just B(m,n,p,q) = 0) still works and can be used as a “trick” for fast zero allocation of large arrays. It only explicitly assigns the last element of the array in order to set the size. This only works for zero/false values of a datatype. The one major caveat is that, unlike the zeros function that writes (or overwrites) a zero in each element, this can only be used safely if the matrix to be allocated does not already exist.
You mentioned complex, rand, and randn as being “related functions.” However, these don’t seem to have received the ‘like’ syntax extension in R2013a. It would be nice if all of the RandStream class functions supported this for the appropriate types. I often try to check for single/double precision and use the appropriate allocation as, from experience, there’s nothing worse than a function needlessly ignoring a datatype.
I said rand and randn since they use the single input syntax to create a 2-D output. They do not have the like syntax since they only generate singles or doubles and not the long list of possible datatypes. I mentioned complex since that is one of the attributes that the ‘like’ syntax will inspect.
You are right that the trick of initializing a matrix to zero values by setting its end element to a zero of the right type will only work correctly for a non-existing array.
NOTE: I just got a note from Yair Altman suggesting I mention another change to zeros, et al. that is not visible in the syntax but is a documented change – that is, we made these functions multi-threaded for large enough outputs in MATLAB R2007A – thereby speeding them up.
I just posted a detailed analysis of the performance of zeros() vs. ones() for different allocation sizes, showing a surprising effect of multi-threading on zeros:
There is definitely more to zeros (and ones) than one would naively assume. These functions are deceptively simple. We can only speculate how much more effort has been invested in more complex built-in functions, if such simple functions as zeros() and ones() have so much more internal complexity than meets the eye.
Interesting assessment Yair, but it turns out that the reasons for the behavior changes aren’t what you thought. The performance change for zeros in R2008b resulted from a change in the underlying MATLAB memory management architecture at that time.
Michelle, do you have any thoughts on if there’s a way to improve the performance of ‘ones’ for >100k elements, as shown by Yair?