Steve on Image Processing

July 6th, 2009

A new look for connected component labeling in R2009a

One of the changes we made to the Image Processing Toolbox for R2009a was partially inspired by comments received on this blog. Specifically, several blog readers complained about the memory required by the function bwlabel.

The function bwlabel labels connected components (or objects, or blobs) in a binary image. (I wrote a series of posts a couple of years ago about connected component labeling algorithms.)

I'll illustrate what bwlabel does using a very small binary image.

BW = [1     1     1     0     0     0     0     0
    1     1     1     0     1     1     0     0
    1     1     1     0     1     1     0     0
    0     0     0     0     0     0     1     0
    0     0     0     0     0     0     1     0
    0     0     0     0     0     0     1     0
    0     1     1     0     0     1     1     0
    0     1     1     0     0     0     0     0]
BW =

     1     1     1     0     0     0     0     0
     1     1     1     0     1     1     0     0
     1     1     1     0     1     1     0     0
     0     0     0     0     0     0     1     0
     0     0     0     0     0     0     1     0
     0     0     0     0     0     0     1     0
     0     1     1     0     0     1     1     0
     0     1     1     0     0     0     0     0

L = bwlabel(BW)
L =

     1     1     1     0     0     0     0     0
     1     1     1     0     3     3     0     0
     1     1     1     0     3     3     0     0
     0     0     0     0     0     0     3     0
     0     0     0     0     0     0     3     0
     0     0     0     0     0     0     3     0
     0     2     2     0     0     3     3     0
     0     2     2     0     0     0     0     0

The output, L, of bwlabel is called a label matrix. A label matrix contains nonnegative integer values. 0s correspond to background pixels in the binary image, and positive values identify each labeled object. For example, the elements of L that equal 2 correspond to the second object found by bwlabel.

L == 2
ans =

     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0
     0     1     1     0     0     0     0     0
     0     1     1     0     0     0     0     0

Very often, the output of bwlabel is used as the input to regionprops, a function that measures geometric properties of regions. For example, we can compute the area of each labeled object:

s = regionprops(L, 'Area')
s = 

3x1 struct array with fields:
    Area

[s.Area]   % comma-separated list syntax for structures
ans =

     9     4     9

As I mentioned in the first paragraph above, we occasionally hear complaints about the amount of memory required by bwlabel. Because bwlabel always returns L as a double-precision matrix, the memory required is 8 bytes per image pixel.

Why is L double-precision? Because bwlabel was introduced to the Image Processing Toolbox at a time when support for integer matrices was just getting off the ground. In 1998 or so, the only non-double data type supported by the Image Processing Toolbox was uint8. It did not seem reasonable to limit label matrices to 255 objects, and we were uncomfortable with a design that called for the data type of L to vary according to the number of objects detected.

If there had been more than minimal support at the time for uint32 or single matrices, we might have chosen one of those types. But there wasn't, so we went with double, and that's what it's been ever since.

The memory complaints have persisted, though, particularly with users who are doing multidimensional image processing. 8 bytes per image pixel seemed to be a high price to pay, especially when there were relatively few foreground pixels.

The usual request was to change the data type L to be uint8 when there were only a few objects. We thought about doing that, but there were other issues we wanted to fix:

1. The memory requirement was proportional to the total number of image pixels instead of the number of object pixels. This would still be the case even we used uint8 in some cases.

2. We knew that an early step inside the implementation of regionprops was to compute a list of all the pixel indices corresponding to each object. This information is known inside the bwlabel computation, but it gets lost the label matrix doesn't represent it directly. Recomputing the pixel indices causes a speed penalty.

3. We knew that most users didn't use the label matrix directly. Often, the only reason for computing it was to hand it off to regionprops.

So we looked for a new design that would address all these issues and that would not cause compatibility problems.

Next time I'll show you what we did.


Get the MATLAB code

Published with MATLAB® 7.8

4 Responses to “A new look for connected component labeling in R2009a”

  1. Ahmed Mostafa replied on :

    Hello Steve
    i really like your articles, it helped me a lot through my study. i kinda have a problem with bwconncomp and isosurface, it will be great if you can have time to look at it

    http://www.mathworks.com/matlabcentral/newsreader/view_thread/258281#672407

    Thanks

  2. Steve replied on :

    Ahmed—You need to break down your code a line at a time, looking carefully at each output to see if it matches your expectations. For example, what is the number of elements in the output of bwconncomp? That will tell you how many objects it detected.

  3. Saber replied on :

    Hi Steve,
    First of all, thanks a lot for your posts on image processing, I really learned a lot through this blog.
    Secondly, could you please refer me to the theory (e.g. papers or articles) upon which REGIONPROPS features like:
    - ConvexHull
    - Eccentricity
    - Major-, MinorAxisLength
    - Orientation
    are implemented.
    Best regards

  4. Steve replied on :

    Saber—The concept of convex hull is well described in many, many places. You could just start with Wikipedia for that. Eccentricity, MajorAxisLength, MinorAxisLength, and Orientation are all based on fitting an ellipse to the region by matching second-order moments. Equations were drawn from Haralick and Shapiro, Computer and Robot Vision vol I, Addison-Wesley 1992, Appendix A. See the subfunction ComputeEllipseParams in regionprops.m.

Leave a Reply

Wrap code fragments inside <pre> tags, like this:

<pre class="code">
a = magic(3);
sum(a)
</pre>

If you have a "<" character in your code, either follow it with a space or replace it with "&lt;" (including the semicolon).


Steve Eddins manages the Image & Geospatial development team at The MathWorks and coauthored Digital Image Processing Using MATLAB. He writes here about image processing concepts, algorithm implementations, and MATLAB.

  • Steve: Dansheng—The following code will give you the binary image corresponding to a single object. L =...
  • Dansheng Song: Hi Steve, Your posts are very helpful for Matlab users. Do you know how can I get the binary image by...
  • DP: Thanks Steve. it did work. now can easily calculate DSC.
  • Steve: DP— A & B
  • DP: Hi Steve, Dice similarity coefficient is defined to find the overlapped regions between two objects in an image....
  • Steve: Searching— axis on
  • Steve: DP—Sorry, but I’ve never heard of dice similarity coefficients.
  • searching: Hey, I found this place extremely useful. I have a question though. Although I manage to plot my points on...
  • DP: Hi Steve, thanks for your email. i could not seperate those connected objects. Instead i am trying to compare...
  • Gopesh: Excellent code..was very helpful to me to change the background color of mri images and thereby improving the...

These postings are the author's and don't necessarily represent the opinions of The MathWorks.