Skip to Main Content Skip to Search
File Exchange
MATLAB Newsgroup
Link Exchange
  Blogs  
 Contest 
MathWorks.com

Loren on the Art of MATLAB

December 1st, 2006

Let Me Count the Ways

In the course of many applications, we sometimes need to tally up counts for different purposes, e.g., investigating the distribution of some data or equalizing gray tones in an image.

Contents

In version 4 of MATLAB, we introduced sparse matrices. A bit later, we introduced the first release of the Image Processing Toolbox; and for even more information, see Steve's blog. We then had several says we could count up the number of pixels in an indexed image for each color in the colormap. A few of the ways are

  • Run a for loop over the number of colors and find the elements in the image with that value.
  • Use hist to create a histogram with the bins corresponding to the number of colors.
  • Use sparse matrices to tally the bins, especially since sparse performs accumulation as it goes.

Counting by Color - Brute Force

Here's code to show how to bin the data by brute force.

Load the data.

clown = load('clown');
lcmap = clown.map;
nc = length(lcmap);
X = clown.X;

Set up the output array for collecting the binned information.

count = zeros(1,nc);

Loop over the image elements and augment the colormap entry by 1 for each hit.

M = numel(X);
for ind=1:M
    count(X(ind)) = count(X(ind))+1;
end

Counting by Color - Accumulation Using sparse

Think first about creating an M-by-nc array, mostly zeros, with one 1 per row, in the color column to identify the color.

  S = sparse(1:M,X(:),1,M,nc);

We could do this and then sum each column. However, sparse automatically accumulates duplicate elements rather than replacing them, so instead we can create a 1-by-nc sparse array, and then convert it to full when we're done.

counts = full(sparse(1,X(:),1,1,nc));

If you looked at this M-file, you would now see an mlint message for the previous line of code. Here it is.

mlhint = mlint('meanValues');
disp(mlhint.message)
ACCUMARRAY([r,c],v,[m,n]) is faster than FULL(SPARSE(r,c,v,m,n)).

It suggests to us the next technique to use.

Counting by Color - Accumulation Using accumarray

Following the message as closely as we can (though we can't have the first column of the first input scalar since the second column is a long vector), we get try the following code. We can also use the help for accumarray to guide us.

counta = accumarray([ones(M,1) X(:)], 1 , [1 nc]);

Let's make sure all the techniques get the same answer.

agreement = isequal(count, counts, counta)
agreement =
     1

Computing Mean Values

There was a thread recently on the MATLAB newsgroup asking how to compute the mean value of portions of an array, the partitions described via arrays of indices. Sounds like an image histogram, only not -- because of computing the mean value instead of the total number. Here's the posted advice. help accumarray

 HTH,
 John D'Errico

While I know that information helped the user that day, I suspect not everyone knew exactly what to do with it. So let's try it here.

Here are the details and the original code (except for the sizes). Original sizes: m = 1245; n = 1200; maxnum = 50;

m = 100;
n = 100;
maxnum = 22;

Create the input matrices.

observed = rand(m,n);
num = 1:maxnum;
arr1 = num(ceil(maxnum*rand(size(observed))));
arr2 = num(ceil(maxnum*rand(size(observed))));

Computing Mean Values - Straight-forward Double Loop

meanMatrix=zeros(maxnum,maxnum);
tic
for i=1:maxnum
    for ii=1:maxnum
        meanMatrix(i,ii) = mean(observed(arr1==i & arr2 ==ii));
    end
end
looptime = toc
looptime =
    0.1494

Computing Mean Values - Using accumarray

There are a whole class of functions I can use with accumarray to calculate different information about these arrays. These functions fall into a class of operations that other languages, e.g., APL, call reduction operations and they consist of functions such as sum, mean, and max, for example that return a scalar value for vector input.

tic
means = accumarray([arr1(:),arr2(:)], observed(:), [maxnum maxnum], @mean, 0);
accumtime = toc
accumtime =
    0.0129

Notice how much faster the accumarray solution is than the loop, even with the output pre-allocated.

Ensuring Identical Results

If we compare the results, and not just the times, for the different ways to produce the mean values, we see the results are not identical.

norm(means-meanMatrix)
ans =
  7.3041e-016

What's going on here? I will show you that the difference depends on the ordering of the elements by doing the same calculation with a different program to produce the mean. In this program, we make sure the data are sorted before we calculate the result.

dbtype sortmean
1     function y = sortmean(x)
2     %SORTMEAN Sort data and find mean.
3     y = mean(sort(x));

Do the calculation with sortmean with loops and with accumarray and compare results.

for i=1:maxnum
    for ii=1:maxnum
        meanMatrix(i,ii) = sortmean(observed(arr1==i & arr2 ==ii));
    end
end

means = accumarray([arr1(:),arr2(:)], observed(:), [maxnum maxnum], @sortmean, 0);

We can see now that if we tightly control the order of the calculations for the mean value, we get the same answers, though the loop version still takes longer.

equalmeans = isequal(means,meanMatrix)
equalmeans =
     1

Conclusion

Once again, there are many ways to reach the same outcome in MATLAB, with different characteristics in terms of readability, performance, etc. Which of the solutions used here would you use, or do you have some other favorite techniques? Post them here.


Get the MATLAB code

Published with MATLAB® 7.3

9 Responses to “Let Me Count the Ways”

  1. Urs (us) Schwarz replied on :

    -and- dear loren, let us not forget our beloved and fast HISTC…
    % take the clowny bit with the COUNT
    n=histc(X(:),1:nc);
    isequal(n,count.’)
    just a thought
    us

  2. Loren replied on :

    Folks-

    Just as an update, I ran this M-file as a script and as a function with some timings of the first counting problem and get the following results on my dual-core T60 laptop:

    timesScript =
         for      sparse   accumarray
        0.2425    0.0478    0.0210
    
    timesFunction =
         for      sparse   accumarray
         0.0063    0.0334    0.0175
    

    In any case, there still remain ways to do accomplish the same goal, some of which may be better suited to your needs than others.

    –Loren

  3. sharareh replied on :

    please guide me how can write program to show histogram of an image

  4. Loren replied on :

    Sharareh-

    Please look at the documentation in which there are examples for just your problem. You may also be interested in reading Steve’s blog on image processing.

    –Loren

  5. Wes Campaigne replied on :

    Was accumarray substantially written in some recent release? Previously when I’d used it for computing very wide histograms on very large images it was slow and memory intensive and would not accept integer-typed inputs. I’m happy to see that in R2007a all of that is no longer the case.

    There is, however, one oddity about the example given here: Why did you use two dimensional indexing by concatenating with ones(M,1)? The transpose of the result from using one-dimensional indices (i.e., just X(:)) is equivalent, and for large images on my computer, seems to take only ~25% of the time:

    >> tic; H = accumarray([ones(M,1) X(:)], 1 , [1 nc]); toc
    Elapsed time is 0.483881 seconds.
    >> tic; H2 = accumarray(X(:), 1, [nc 1])’; toc
    Elapsed time is 0.127087 seconds.
    >> isequal(H,H2)

    ans =

    1

  6. Peter Perkins replied on :

    Wes, ACCUMARRAY was indeed substantially rewritten; as I recall it was for R14 (maybe R14SP1), which was a couple years ago. Glad to hear you like it. Faster, yes; accepts more data types, you bet; and it also accepts function handles, with which you can do some interesting things. Loren touched on a few of them, but examples like these from the help:

    subs = [1 1; 2 1; 2 3; 2 1; 2 3];
    A = accumarray(subs, 101:105, [2 4], @(x)length(x)>1)

    subs = [1 1; 2 1; 2 3; 2 1; 2 3];
    A = accumarray(subs, 101:105, [2 4], @(x){x})

    go even further.

    You are right, that color counting example does benefit from using the 1-D form of the indexing arg (even Loren, who, believe me, knows a lot about MATLAB, misses simple things sometimes). It’s faster, though perhaps not for the reason you were thinking: it’s the creation of the 2-column indexing argument that seems to take the bulk of the extra time in the cases I looked at.

    And, as you probably noticed, specifying the size of the output (via the third input arg) saves ACCUMARRAY the trouble of figuring it out, also saving time.

    - Peter

  7. Wes Campaigne replied on :

    Thanks for the response, Peter. I did assume that it was the call to ‘ones’ and the concatenation that caused the slowdown.

    I just did a quick search, and found that accumarray was first introduced in R14, and then the interesting upgrades to it came with R14SP3. I first used accumarray with a version prior to that, which is when I had issues with it. I’m pretty sure my university skipped R14SP3, and by the point they upgraded to a later release, I no longer depended on accumarray enough to notice the improvements.

    It seems like a fantastically versatile function but for some reason I’ve found it quite difficult to wrap my head around it. These examples help quite a bit. Thanks again!

    -Wes

  8. Alex Reid replied on :

    How would I go about counting red pixels please?

    Thanks

    Alex

  9. Loren replied on :

    Alex-

    You’d have to define red to begin with. Is it only [1 0 0] or are there other choices? Once you have the choices, you can then apply logic to see which pixels are red.

    –Loren

Leave a Reply


Loren Shure works on design of the MATLAB language at The MathWorks. She writes here about once a week on MATLAB programming and related topics.

  • J.B. Brown: Ah, and I am at fault for simply testing collinearity with the origin in the example above.
  • J.B. Brown: Indeed, > collinear( [0 3],[0 8],[0 -1e21+2e-15] ) ans = 1 > collinear( [0 3],[0 8],[0 -1e22+2e-15]...
  • OkinawaDolphin: Loren, thank you for telling me where to download timeit. Here are the two functions I just tested...
  • Loren: JB- It looks to me like Ilya’s solution and therefore yours are equivalent to the determinant. As Tim...
  • Loren: OkinawaDolphin, timeit can be downloaded from the File Exchange. Steve Eddins is the author. It does not ship...
  • OkinawaDolphin: It seems that neither R2007a nor R2007b have the function timeit, but I investigated computation time...
  • J.B. Brown: It would appear to me that Ilya Rozenfeld’s solution would be the cleanest. Just to help those who...
  • Loren: Markus- Congratulations on winning! And a nice illustration of how the size matters. Small enough, and the...
  • Markus: Hi Loren, which version is fastest also depends very much on the matrix dimensions. Look at my test function:...
  • Duncan: OkinawaDolphin, Regarding why your third example is slower than your second example, the result is in fact...

These postings are the author's and don't necessarily represent the opinions of The MathWorks.

Related Topics