Loren on the Art of MATLAB

July 26th, 2007

Assignment with Repeated Indices

I have had customers ask me occasionally about what happens during an indexed assignment when indices are repeated. The answer is that it depends on how you do the assignment.

Contents

Repeated Indices with a Single Assignment

I initialize an output matrix to zeros and create two vectors, one with locations (indices) and one for the corresponding data. Now let's look at the results of an indexed assignment.

A = zeros(3);
loc = [1 3 7 9 1];
data = 1:5;
A(loc) = A(loc) + data
A =
     5     0     3
     0     0     0
     2     0     4

You see in this case that even though index 1 was repeated, its output value is only 5, and not 6. Let me reset A so I can show you what's going on.

A = zeros(3);

Here's what "essentially" happens during an assignment in MATLAB. On the right-hand side, A(loc) is created.

A(loc)
ans =
     0     0     0     0     0

This means that there are 2 copies of A(1) currently and an element of data will be added to each of these. The important thing to realize is that now the values for the right-hand side have been calculated and set.

A(loc) + data
ans =
     1     2     3     4     5

Following this, the 5-element right-hand side is assigned into A(loc) on the left-hand side. Values are currently placed into the output vector by marching down the left-hand side locations, replacing one element at a time. So the first element gets replaced not once, but twice, with the final one being the value that you see.

A(loc) = A(loc) + data
A =
     5     0     3
     0     0     0
     2     0     4

Repeated Indices Using a for Loop for the Assignment

Here's another way to do the assignment, but in this case, the

B = zeros(3);
for n = 1:length(loc)
  B(loc(n)) = B(loc(n)) + data(n);
end
B
B =
     6     0     3
     0     0     0
     2     0     4

Notice that the output here for B does not match that from the vectorized assignment above. This is because the values for the right-hand side get updated one at a time, and can therefore accumulate as additions made to a particular index.

isequal(A,B)
ans =
     0

Using accumarray to Accumulate Results with Repeated Indices

What if you want the accumulation effect, but you'd prefer to use vectorized code? That's what the function accumarray is for.

C = accumarray(loc',data,[9 1]);
C = reshape(C,[3 3])
C =
     6     0     3
     0     0     0
     2     0     4

We see the same results here as those from the for loop example.

isequal(B,C)
ans =
     1

Repeated Index Usage

Did you know this about MATLAB already? I'm curious to hear your experiences in this area. Post them here.


Get the MATLAB code

Published with MATLAB® 7.4

15 Responses to “Assignment with Repeated Indices”

  1. Oliver A. Chapman, P.E. replied on :

    Loren,

    Nice topic. But, you are miss-directing the reader by making A a square matrix but illustrating your points with a vector of indices! I guess I’ve been vaguely aware that MatLab won’t bomb when you reference, for example, A(6). But, it is a technique I avoid because it isn’t obvious which element of A index #6 is. Frankly, it appears MatLab does it backwards from what seems logical to me.

    Since the primary point of the column is to illustrate how MatLab responds to repeated indices, your examples should have been vectors, not the 3×3 matrices.

    Oliver

  2. Mike replied on :

    Is there any difference between these two approaches? Is one more efficient than the other for large matrices?

    coordinates = [1 1; 1 2; 1 1];
    A = accumarray(coordinates, 1, [3 3]);
    B = full(sparse(coordinates(:,1), coordinates(:,2), 1, 3, 3);

  3. Luca Citi replied on :

    Hi Loren,
    thanks for the nice post.

    To Oliver,
    I agree with you that having a square matrix but using a vector of indices can be misleanding. Nevertheless it is not exact what you state: “it is a technique I avoid because it isn’t obvious which element of A index #6 is”. Actually if you address a matrix using only one index, the variable is considered as a single column of elements composed of all of the columns from the matrix, each appended to the last. It is a documented consistent behaviour and you can take advantage of it without fear.
    A common use is when you want to force an input argument of a function to be handled as a column:
    a = a(:);
    Regards,
    Luca

  4. Luca Citi replied on :

    I found the page in the documentation:
    http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/matlab_prog/f1-85462.html

  5. Loren replied on :

    Thanks to Oliver and Luca for your comments and contributions on this topic.

    Mike, if you’re not using sparse in your problem otherwise, you will likely find that using full(sparse…)) is less efficient than accumarray because of the burden of creating and then transforming a sparse data structure.

    –Loren

  6. Piotr replied on :

    Hi,
    The solution with accumarray will work only in case your initial matrix is zero. For more general problem of
    A(loc)=A(loc)+data, where A is non zero at the begining
    I think the solution would be to use sparse:

    loc=[1 1 3 3 1;1 3 1 3 1];
    a=a+sparse(loc(1,:),loc(2,:),data,3,3);

    Piotr

  7. Urs (us) Schwarz replied on :

    The solution with accumarray will work only in case your initial matrix is zero…

    in which case you would simply do this

    a=magic(5);
    loc=[1 1 3 3 1;1 3 1 3 1];
    data=pi*(1:size(loc,2));
    as=a+accumarray(loc.’,data,size(a));
    a
    as
    (as-a)./pi

    us

  8. Piotr replied on :

    Urs,
    You are right that accumarray can be used for adding to non zero matrices, but it is the slowest solution. Sparse is a little better but not much. For my problem I came up with much faster solution, however requiring additional memory (is’nt it always true?)

    My problem can be simulated by:

    n=1000;k=50;
    a=zeros(n,n);
    for i=1:1000
    x=randsample(n,k,true);
    y=randsample(n,k,true);
    % a=a+accumarray([x,y],1,[n n]);
    a=a+sparse(x,y,1,n,n);

    end

    using accumarray cputime is ~34s
    using sparse it is ~21s

    adding another index to matrix a to deal with repetitions
    I could get the code running 10 times faster:

    n=1000;k=50;
    a=zeros(n,n,k);
    for i=1:1000
    x=randsample(n,k,true);
    y=randsample(n,k,true);
    l=sub2ind([n n k],x’,y’,1:k);
    a(l)=a(l)+1;
    end
    a=sum(a,3);

    it runs in 2s.
    My real problem has loop closer to 100,000 so time savings
    are significant.

  9. Oliver A. Chapman, P.E. replied on :

    Luca,

    I agree that MatLab’s documentation does adequately explain the convention for mapping a vector of indices into a variable of higher dimensions. However, some people who read my MatLab code may not have read that section of MatLab documentation, so it won’t be clear to them how this works. That is why I avoid this technique. As Dykstra said, “. . . it is not only the programmer’s task to produce a correct program but also to demonstrate its correctness in a convincing manner . . . “

  10. Tim Davis replied on :

    accumarray for Piotr’s example is slower than sparse, only because each of the ith (i = 1:1000) submatrices are sparse.
    accumarry and sparse are being used 1000 times each.

    Faster than any of these methods, including the sub2ind method, is to avoid assignment with indices altogether, and do this:

    n = 1000 ; k = 50 ; iters = 1000 ;
    X = zeros (iters,k) ;
    Y = zeros (iters,k) ;
    for i=1:iters
    X (i,:) = randsample (n,k,true) ;
    Y (i,:) = randsample (n,k,true) ;
    end
    C = accumarray ([X(:) Y(:)], 1, [n n]) ;

    On my machine, the for i=1:1000 code above using sparse inside the for loop takes 21.5 seconds, your sub2ind code takes 1.3 seconds, and the above code takes 0.2 seconds. The above code also uses far less memory than your sub2ind code, too.

    I find that it’s often faster to avoid fancy indexing, and do as many whole-matrix operations as possible (such as the above accumarray).

  11. Tim Davis replied on :

    Loren replied on July 26th, 2007 at 11:29 am :
    > Mike, if you’re not using sparse in your problem otherwise
    > you will likely find that using full(sparse…)) is less
    > efficient than accumarray because of the burden of
    > creating and then transforming a sparse data structure.

    In MATLAB 7.4, if you use the magic word

    spparms(’chmodsp’,1)

    and then try full(sparse( …)), then you’ll find that accumary and full(sparse) are about the same speed. That’s the case for the C=accumarray(…) above, even when iters is 100,000 for this example.

  12. Ravi replied on :

    I have a problem in understanding the accumarray function. In case the first argument is a vector, I have managed with the help section to follow what the function does. But when the first argument is a matrix (”subs” in the help documentation for the function), I am just unable to understand just what the function does. Loren, can you direct me to some documentation, apart from that in the help, that is more helpful for ordinary mortals like me.

  13. Loren replied on :

    Ravi-

    Sorry, I don’t know of any others. I recommend you pull apart the example(s) that you find confusing into smaller pieces and make sure you understand each piece. :ooking at each piece separately should be helpful.

    –Loren

  14. Steve L replied on :

    Ravi,

    The first column of the subs matrix is the row indices of the elements you want to create in the result matrices. The second column is the column indices. The third column is the page indices (dimension 3.) In general, the nth column of the subs input contains the indices in the nth dimension of the result matrix.

    For instance, if one of the rows of the SUBS matrix is [1 2 3 4 5], then the entry in the corresponding row of VAL will be “accumulated” into element A(1, 2, 3, 4, 5) of the output matrix A of your call to ACCUMARRAY. Try:

    A = accumarray([1 2 3;4 5 6], [1;2]);
    A(1, 2, 3)
    A(4, 5, 6)

  15. Ravi replied on :

    Steve,
    Thanks a lot. Your tip put me on track. I now understand that a simple extension of logic is all that is required to proceed from a vector argument to a matrix one. I just did not get this before.

Leave a Reply

Wrap code fragments inside <pre> tags, like this:

<pre class="code">
a = magic(3);
sum(a)
</pre>

If you have a "<" character in your code, either follow it with a space or replace it with "&lt;" (including the semicolon).


Loren Shure works on design of the MATLAB language at The MathWorks. She writes here about once a week on MATLAB programming and related topics.

  • Jun: I totally can not believe it, Loren. You are really helpful. Thank you so much, MATLAB master!
  • Loren: Wow folks- Always lots of interest when there’s a quickie to try out! I will only make 2 general...
  • Loren: Jun- ismember is your friend here: >> [aa,ind] = ismember(Array2,Arra y1) aa = 1 1 1 1 1 1 1 ind = 1 2 1 4 4 3...
  • Dan: I like the first way better than the second way. Combining the arrays into one and running any is nice, although...
  • James Myatt: How about I = (a == 0 | b == 0); a(I) = []; b(I) = [];
  • Tunc: Hello Loren, love your blog because of such inspiring and challenging comments to such ’small’...
  • Pekka Kumpulainen: Here is my tradeoff. I usually want to keep the original variables as they are most probably...
  • Iain: Followup: Of course, to allow NaNs (counting them as non-zero): mask = (a~=0) & (b~=0); The mask says “a...
  • Matt Fig: I would usually go with something like this: y = a&b; x = a(y); y = b(y); But I was surprised to find...
  • kk: c=all([a;b]) a(c) a(b)

These postings are the author's and don't necessarily represent the opinions of The MathWorks.