Loren on the Art of MATLAB

Calculus with Empty Arrays

Posted by Loren Shure

MATLAB has had empty arrays since before I started using the program. When I started, the only size empty array was 0x0. When version 5 was released, empty arrays came along for the N-dimensional ride and got more shapely.

Even relatively simple expressions involving empty arrays cause confusion from time to time, especially in concert with other rules in MATLAB (such as the rule that NaN values usually propagate from inputs to outputs). Let's play around a little with some empty arrays to get some insight.

From the Newsgroup

This post on the MATLAB newsgroup motivated me to talk about empty arrays. Here is an excerpt from the original post:

 I knew that Nan+4 = NaN, but why is []+4 = [] ? Is there more 'black hole
 behaviour' I should know about?

Dimensions Matter in MATLAB

Dimensions matter in MATLAB and you will get error messages when dimensions don't agree. Early on, we found it was convenient to treat scalars as an exception with operators (e.g., +, -, .*) and to treat them as if they were expanded to the size of the other operand. So

rand(3,3) + pi
ans =
          4.10648118878907          4.09875960183274          3.28347899221701
          3.29920573526734          3.62696830231263          3.56335393621607
          4.11218543535041          3.94187312247859          4.05732817877886

adds the value pi to the random 3x3 just created. It's as if a 3x3 constant array filled with the value pi was created and added elementwise to the other array.
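
A quick way to check this equivalence is to build the constant array explicitly and compare. This is only a minimal sketch; the names A, viaScalar, viaArray, and sameResult are made up for illustration.

```matlab
A = rand(3,3);                            % any 3x3 array will do
viaScalar = A + pi;                       % scalar expansion
viaArray  = A + repmat(pi, size(A));      % explicit 3x3 array filled with pi
sameResult = isequal(viaScalar, viaArray) % true
```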

So what does that mean for an empty operand? Its size has at least one 0 value. So MATLAB expands pi in this expression [] + pi to be the same size as [] (which happens to be 0x0 here). When that happens, MATLAB creates a second empty array, and then adds the two empty arrays. Hence we get

[] + pi
ans =
     []

returning an empty output.
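
The same reasoning can be tried on a few shapes, including the NaN case from the newsgroup question. This is a small sketch with illustrative variable names (e1, e2, e3), not output from the original post.

```matlab
e1 = [] + pi;           % 0x0 operand, so pi is treated as a 0x0 array
e2 = zeros(0,3) + pi;   % 0x3 operand, so the result is 0x3
e3 = NaN + [];          % the scalar NaN is treated as 0x0 as well
size(e1)                % [0 0]
size(e2)                % [0 3]
isempty(e3)             % true: there is no element left to hold a NaN
```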

MATLAB applies the scalar expansion rule to operators. So, for example, size(anything+scalar) is size(anything). As a logical but sometimes surprising consequence, empty arrays often (but not always) propagate through calculations. When doesn't that happen? With some functions where mathematically logical analysis demands different results.

sum([])
ans =
     0
prod([])
ans =
     1

So, why is the sum zero and the product 1? Because they are the identity elements (as in group theory) for sum and product respectively. Think about how to start the computation for a sum or a product and how you would initialize the first value. That's the value given before adding or multiplying any of the array elements, hence the values 0 and 1 respectively.
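
Written out as an explicit loop, the initialization argument might look like the following sketch (the accumulator names s and p are arbitrary). With an empty input the loop body never runs, so the initial values are what comes back.

```matlab
x = [];                  % nothing to accumulate
s = 0;                   % additive identity
p = 1;                   % multiplicative identity
for k = 1:numel(x)       % 1:0 is empty, so the body never executes
    s = s + x(k);
    p = p * x(k);
end
isequal(s, sum(x))       % true, both are 0
isequal(p, prod(x))      % true, both are 1
```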

Empty Array Shapes

Empty arrays can be N-dimensional, and don't need all dimensions to be 0. However, they still must obey the rules about dimensions needing to agree. There are consequences that may surprise you, but they do follow logically. Here's an example of trying to add two empty arrays, ending up in an error!

a = zeros(0,1)
b = zeros(1,0)
try
    a+b
catch MEplus
    fprintf('\n')
    disp(MEplus.message)
end
a =
   Empty matrix: 0-by-1
b =
   Empty matrix: 1-by-0

Matrix dimensions must agree.
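
Note that the output above was produced with MATLAB 7.9. Releases with implicit expansion (R2016b and later) treat a dimension of size 1 as expandable, so a 0-by-1 plus a 1-by-0 is expected to give a 0-by-0 result there instead of an error. Empties whose mismatched dimensions are neither equal nor 1 should fail under either set of rules. A hedged sketch, with variable names chosen for illustration:

```matlab
c = zeros(0,2);
d = zeros(0,3);
try
    c + d;                  % column counts 2 and 3 cannot be reconciled
    failed = false;
catch MEsize
    failed = true;
    disp(MEsize.message)    % exact wording varies between releases
end
failed                      % true
```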

Reference

For more information about empty arrays, check out this page in the documentation.

Got an Empty Question?

Got any empty questions (really, questions about empty!)? Or comments? Post them here.


Published with MATLAB® 7.9

25 Comments

Loren,

I frequently accumulate index values within an if statement, nested within a for loop. So, I’ll use the technique of setting the resulting index variable to [] just before the for loop & concatenating to it via the if statement.


indx = []
for ii = 1 : 3
    if something
          indx = [indx, ii]
    end
end

It appears the Empty Matrix construction of


zeros(1, 0)

behaves the same way as the [] construction.
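
A concrete version of this pattern, run side by side with both starting values, could look like the sketch below. The sample data and the test data(ii) > 2 are stand-ins for the "something" condition above, not part of the original comment.

```matlab
data = [4 -1 7 0 3];             % hypothetical sample data
indxA = [];                      % start from the 0x0 empty
indxB = zeros(1,0);              % start from a 1x0 empty
for ii = 1:numel(data)
    if data(ii) > 2              % stand-in for "something"
        indxA = [indxA, ii];
        indxB = [indxB, ii];
    end
end
isequal(indxA, indxB, [1 3 5])   % true
```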

I reviewed the referenced documentation & I don’t think it goes into enough detail on this topic.

Loren,

How can I use a matrix + vector operation as the matrix + scalar operation

randn(3,3) + pi

The following example shows an error message when adding a vector to a matrix

randn(3,3) + [1:3]

This can be fixed by creating a matrix from the vector and adding it to the original matrix.

randn(3,3) + ones(3,1)*[1:3]

However, this operation is undesirable when large matrices are involved.
Is there a way to add matrix to vec, without for loops, that does not incur a huge penalty in terms of memory?

I generally find MATLAB’s handling of empties to be fairly consistent. It helps to think of an M-by-N array as an M-by-N-by-1-by-1-by-… array, where size() just truncates the infinite trailing ones. To this end, you can define a function like sizex:

function s=sizex(x);s=size(x);s(end+1:5)=1;

This highlights the occasional case of the 0-by-0-by-1-by-… empty being considered somehow superior to other empties. Consider this sequence:

sizex( num2cell( zeros( 3, 1 ) ) ) = [3 1 1 1 1]
sizex( num2cell( zeros( 2, 1 ) ) ) = [2 1 1 1 1]
sizex( num2cell( zeros( 1, 1 ) ) ) = [1 1 1 1 1]
sizex( num2cell( zeros( 0, 1 ) ) ) = [0 0 1 1 1]  % expected: [0 1 1 1 1]

This num2cell quirk often trips up my code and forces me to insert special-case handling for empties.

Janti-

Check out the function bsxfun. It will do exactly what you are hoping – not expanding vectors to arrays in order to add different constants to each column, for example.
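
For the matrix-plus-row-vector case from the question, a call along these lines should work (a minimal sketch; M, v, viaBsxfun, and viaOnes are illustrative names). In releases with implicit expansion (R2016b and later), M + v is expected to give the same result directly.

```matlab
M = randn(3,3);
v = 1:3;                            % one constant per column
viaBsxfun = bsxfun(@plus, M, v);    % v is expanded virtually along the rows
viaOnes   = M + ones(3,1)*v;        % explicit expansion from the question
isequal(viaBsxfun, viaOnes)         % true
```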

–Loren

Wes-

Implicitly there are trailing 1 dimensions for all arrays in MATLAB. But they do get dropped by ndims, for example, after dimension 2. So it can trip up code like you have seen.
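
The implicit trailing dimensions can be seen by asking size for a specific dimension, as in this short sketch:

```matlab
x = zeros(3,1);
size(x)                  % [3 1]
size(x, 3)               % 1, the implicit third dimension
ndims(zeros(3,1,1,1))    % 2, trailing singleton dimensions are dropped
size(zeros(0,1), 4)      % 1, empty arrays have trailing ones too
```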

–Loren

OysterEngineer (love the name, by the way!)-

Initializing with an empty to start a concatenation process makes lots of sense and is good coding style in cases where you can’t predetermine the final size, and therefore preallocate.

–Loren

Hi Loren!

IMHO, sum([]) and prod([]) should throw an error, as most of the time it shows that some kind of error happened before and somehow the variable became unexpectedly empty.

It is very much like asking to calculate the sum of a string.
:)

Andrey-

Thanks for your thoughts.

One way to think about why empty shouldn’t error for sum is the following. Suppose I have a list of some entities. And I want the sum for all the ones greater than a certain value, say 0. That list may be empty, but the sum is then 0. Works for me without having to special case code then.
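
A tiny illustration of that situation, using made-up data in which no entry exceeds 0:

```matlab
vals = [-3 -1 -2];           % hypothetical list, nothing greater than 0
selected = vals(vals > 0);   % 1x0 empty
total = sum(selected)        % 0, with no special-case code
```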

–Loren

You know, I really do farm oysters part time. Give me some advance notice next time you are in this part of the world & I’ll give you a tour of the beach.

OysterEngineer-

I will SO take you up on that offer. Can’t wait for a good reason to visit now. Especially if the shucking tools are around as well :-)

–Loren

Loren/Andrey,

A further advantage of having sum([])==0 and prod([])==1 is that it’s consistent with array concatenation, e.g. prod([A;B]) and (prod(A).*prod(B)) remain equal when A or B are empty.
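
This consistency can be checked with a small example (a sketch with illustrative values; here B is an empty column):

```matlab
A = [2; 3; 4];
B = zeros(0,1);                            % empty column
isequal(prod([A; B]), prod(A).*prod(B))    % true: 24 equals 24*1
isequal(sum([A; B]),  sum(A) + sum(B))     % true: 9 equals 9 + 0
```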

I often find sum([]) and prod([]) useful, but the help documents are strangely silent on their treatment. Is it unofficial/unsupported functionality because of the controversy, or just something that was forgotten to be mentioned in the help docs?

Ben-

The reference link in my post documents the behavior of sum([]) and prod([]) (although the prod part only says the result is a nonzero value).

–Loren

Loren,

Are there any aspects of empty matrices that may be tricky when they are used as indices into other arrays? For example, what happens in the statement:

x(ii) = y(jj);

if ii and/or jj are empty? Does it matter if ii/jj have one or more non-zero dimensions? Does it matter if ii/jj are logical or numeric? Is there ever a need to precede this type of statement with

if ~isempty(ii)

Does checking for emptiness of ii first save any time by not having to deal with the right hand side for large matrices?

Thanks,
Paul

Paul-

There *are* issues depending on the sizes of ii and jj. And it’s a bit complicated, but really follows the rules above.

If the left-hand side index is empty and the right is a scalar, nothing happens since you are doing scalar expansion, but into a 0-length array.

If the LHS index is empty and right is an array with length>0, then you get an error. Scalar expansion doesn’t apply and the number of elements on right and left need to agree. They don’t.

If the LHS index is empty and the RHS is empty, then the array being assigned to is unchanged. This may surprise some people because using the literal '[]' on the right hand side causes elements on the left to get deleted. For that to happen, again, the literal expression [] (or '' for strings) must be there. The reason is that sometimes a program computes some condition before doing the assignment. If nothing was found, then the intent is not to change anything on the left. So,

ii = []
x = 1:3
x(ii) = ii

leaves x unchanged, while

x(3) = []

deletes the 3rd element of x.

Now let’s look at the index on the right hand side. If it’s empty, and the LHS is not, you get an error – again dimensions mismatch and the RHS is not a scalar, so scalar expansion doesn’t apply.

We already covered the case where the RHS and LHS are empty.
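
Three of the cases above can be tried directly. This is a hedged sketch with illustrative names (unchanged, errored, MEassign); the exact error text depends on the release, so only the fact of the error is recorded.

```matlab
x = 1:5;
x([]) = 99;                    % empty index, scalar RHS: nothing to assign
unchanged = isequal(x, 1:5)    % true

try
    x([]) = [1 2];             % empty index, two-element RHS: counts differ
    errored = false;
catch MEassign
    errored = true;
end
errored                        % true

x(3) = [];                     % literal [] on the right deletes element 3
x                              % 1 2 4 5
```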

I don’t believe the dimensions of emptiness matter for these situations. An empty logical doesn’t appear to change the situation I described. Do you have that situation arise? I can’t think of one where I’ve seen that before.

I don’t understand your last question about size. Which matrix is large? Have you timed it?

–Loren

I’ve often been puzzled about how sub-matrix-assignment works for full and sparse arrays:

clear
A = zeros (4,0) ;
try
    A (5) = pi
catch
    disp (lasterr)
end
S = sparse (A) ;
S (4) = pi
whos 
full (S)
clear
A = zeros (4,0) ;
S = sparse (A) ;
S (5) = pi
whos
full (S)

Gives the following:


 In an assignment  A(I) = B, a matrix A cannot be resized.

S =

   (4,1)       3.1416

  Name      Size            Bytes  Class     Attributes

  A         4x0                 0  double              
  S         4x1                32  double    sparse    


ans =

         0
         0
         0
    3.1416


S =

   (1,2)       3.1416

  Name      Size            Bytes  Class     Attributes

  A         4x0                 0  double              
  S         4x2                40  double    sparse    


ans =

         0    3.1416
         0         0
         0         0
         0         0

>> 

A(5)=pi gives an error; A(5,:)=pi doesn’t, but it causes A to become 5-by-0. That seems reasonable. I suppose.

But then what are S(4)=pi and S(5)=pi doing?

Dear Sir/Madam,
I have a problem with find function in the following code:

The result shows the "Empty matrix: 1-by-0" message for some values of Tuhs, such as Tuhs(1)=0.1, which seems illogical. I would be pleased to have your comments.

Tim-

That sparse behavior is, as you guessed, a bug. It’s entered in our database and will get fixed (not sure which release). Thanks for pointing it out.

–Loren

Loren,

1. In your response (14) to my questions, you said “Now let’s look at the index on the right hand side. If it’s empty, and the LHS is not, you get an error …” Did you really mean “and the index on the LHS is not…?” This seems to be true, but I might be a little surprised about this:
>> x=1:5;y=1:5;
>> x(zeros(1,0))
ans =
Empty matrix: 1-by-0
>> y(zeros(1,0,1,0))
ans =
Empty array: 1-by-0-by-1-by-0
>> % assignment with incompatible dimensions?
>> x(zeros(1,0))=y(zeros(1,0,1,0))
x =
1 2 3 4 5

2. The source of my question in general goes back to the dark ages when Matlab didn’t have logical indexing and we would write code like this because we weren’t quite sure what to make of the assignment statement with an empty index.

ii=find(x>100);
if ~isempty(ii)
y(ii) = rhs
end

Even now, we sometimes write:
ii = x>100;
if any(ii),
y(ii) = rhs;
end

This code was the motivation for my reference to “empty logicals” when I really meant a logical vector that’s all false. I guess in this case ii has to have dimensions compatible with y and rhs and then y(ii) = rhs is basically a no-op (unless evaluation of rhs has side effects)?

3. Which brings me to the question about the size of the matrices involved. Using the if(any) construction above potentially removes the need to compute the rhs, which could be expensive. My question was really if Matlab sees the expression x(ii) = rhs and sees that ii is empty (or all false), will it short circuit the assignment and not even bother to evaluate the rhs. From what you described above, and what I see from some quick experiments, is that the short circuit does not occur because Matlab still checks that the assignment follows the rules for dimensions, and I guess it doesn’t know about potential side effects of evaluating rhs. I suppose this makes sense and it’s easy enough to use the if(any) if wanted.

Paul-

MATLAB always evaluates the RHS before looking to see if it can assign into the LHS, as far as I know (and according to what you found out also). Therefore MATLAB will not do the kind of short-circuiting to not evaluate the RHS, even if it’s ultimately irrelevant for the LHS assignment.
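
So if the right-hand side is costly, the guard from the question is the way to skip it. A minimal sketch, where sqrt merely stands in for an expensive computation and the data are made up:

```matlab
x = [1 5 20];
y = zeros(size(x));
mask = x > 100;                 % all false for this data
if any(mask)
    y(mask) = sqrt(x(mask));    % skipped entirely, so sqrt is never called
end
y                               % still [0 0 0]
```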

–Loren

Hi! How are you?

I need your help. I’m trying to do a countdown as in the code:

i=8;
resultado=[];
while (i>1)
i=i-1;
resultado=[resultado i]
end

This code shows the response: [7 6 5 4 3 2 1]
but I would like it to show: [8 7 6 5 4 3 2 1]
i.e., the input value should appear as the first element of the response.

I would be very grateful if you can help, and thanks for your time.

Luis-

Your code is very close. All you need to do is initialize resultado to i (=8) before the while loop instead of to the empty array.

i=8;
resultado=i;
while (i>1)
i=i-1;
resultado=[resultado i]
end

There are other ways as well, for example, updating resultado in the loop before decrementing i. Then the loop would need to be i>=1 as well, I think. (Or i>0.)
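
That second variant might be sketched as follows, keeping the empty initialization and appending before the decrement:

```matlab
i = 8;
resultado = [];
while (i >= 1)
    resultado = [resultado i];   % append first, so 8 is included
    i = i - 1;
end
resultado                        % 8 7 6 5 4 3 2 1
```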

–Loren

Loren,

I’m a late arrival to this thread, but discovered an unexpected result when concatenating matrices and thought I’d ask about it.

>> cat (1, zeros(1,0), 1)
??? Error using ==> cat
CAT arguments dimensions are not consistent.
 
>> [zeros(1,0); 1]

ans =

     1

Is this a bug, a feature, or by design?

Ben-

This is intentional, to deal with backward compatibility details from before the function cat existed. cat strictly enforces the dimension matching, while square-bracket concatenation, which uses horzcat and vertcat, used to have to deal only with 0×0 empties; later some empties arose from deleting elements in an array and became 0×1 or 1×0. To have people’s code still work, you see the result you noted.

–Loren

These postings are the author's and don't necessarily represent the opinions of MathWorks.