Steve on Image Processing

July 17th, 2007

bwlabeln - design decision

In a comment on my "Connected component labeling - Part 6" post, Martin Isenburg asked "what is the rationale behind the design decision to have bwlabeln work on individual pixels rather than on runs of pixels?"

Excellent question. I'm happy to answer "why did they do it that way" algorithm questions—when I know the answer, that is!

To summarize the issue: bwlabel performs two-dimensional connected component labeling by analyzing the adjacency relationships of runs of pixels. bwlabeln, on the other hand, analyzes the adjacency relationships of individual pixels.

Background: Late in the development cycle for Image Processing Toolbox version 3, we decided, based on a rising tide of user feedback, to add multidimensional support to the toolbox. This was a lot of work, and we only had a few months to do it. Connected-component labeling was on the list. 2-D labeling already existed in the toolbox, but we needed to add support for labeling in arbitrary dimensions.

Several toolbox functions, including connected-component labeling, depended upon some definition of pixel connectivity. 2-D functions that already existed accepted 4 or 8 as connectivity definitions. For 3-D, we could add 6, 18, and 26 to the list, but what should we do for arbitrary dimensions? We decided to support a very broad notion of connectivity. Specifically, to define the desired multidimensional connectivity, you could provide a connectivity array, called CONN in the documentation. CONN is a 3-by-3-by- ... -by-3 logical array, symmetric about its center element. This gives the user the ability to define whatever connectivity they desire.
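To make the idea of a connectivity array concrete, here is a minimal Python sketch (not toolbox code). It represents a connectivity as the set of neighbor offsets in {-1, 0, 1}^N instead of a logical array, which amounts to listing the true elements of CONN other than the center. The names `conn_offsets` and `is_symmetric` are made up for this illustration.

```python
from itertools import product

def conn_offsets(ndim, max_nonzero):
    """Neighbor offsets in {-1, 0, 1}^ndim with 1..max_nonzero nonzero components.

    Hypothetical helper: max_nonzero=1 keeps face neighbors only, and
    max_nonzero=ndim keeps every neighbor in the 3-by-...-by-3 block.
    """
    return {
        off
        for off in product((-1, 0, 1), repeat=ndim)
        if 1 <= sum(c != 0 for c in off) <= max_nonzero
    }

def is_symmetric(offsets):
    """True if every offset appears together with its negation."""
    return all(tuple(-c for c in off) in offsets for off in offsets)

# The familiar 2-D and 3-D neighbor counts fall out of the same rule.
print(len(conn_offsets(2, 1)), len(conn_offsets(2, 2)))  # 4 8
print(len(conn_offsets(3, 1)), len(conn_offsets(3, 2)), len(conn_offsets(3, 3)))  # 6 18 26
# The rule extends past 3-D, for example face neighbors in 4-D.
print(len(conn_offsets(4, 1)))  # 8
```

A custom connectivity fits the same representation. For instance, `{(-1, 0), (1, 0)}` connects pixels along the first dimension only and passes `is_symmetric`, while `{(1, 0)}` on its own does not.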

As part of the overall multidimensional development effort, we had already created code for doing arbitrary-dimensional neighborhood iterating (see toolbox/images/images/private/neighborhood.cpp for the details). By combining that code with the union-find technique, it was possible to implement arbitrary-dimension arbitrary-connectivity labeling quickly and robustly, without much code.
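The combination described above, neighborhood iteration plus union-find, can be sketched in a few lines of Python. This is an illustration of the general technique under simplifying assumptions, not the toolbox's C++ implementation: the image is given as a set of foreground coordinates, the connectivity as a symmetric list of neighbor offsets, and the function name `label_components` is hypothetical. Label numbering here follows sorted coordinate order and is not meant to match bwlabeln's numbering.

```python
from itertools import product

def label_components(foreground, offsets):
    """Label connected components of a set of N-D foreground coordinates.

    foreground: set of coordinate tuples; offsets: symmetric neighbor offsets.
    Returns a dict mapping each coordinate to a label 1, 2, 3, ...
    """
    parent = {p: p for p in foreground}

    def find(p):
        root = p
        while parent[root] != root:
            root = parent[root]
        while parent[p] != root:  # path compression
            parent[p], p = root, parent[p]
        return root

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # Visit every foreground pixel and merge it with its foreground neighbors.
    for p in foreground:
        for off in offsets:
            q = tuple(c + d for c, d in zip(p, off))
            if q in foreground:
                union(p, q)

    # Assign consecutive labels to the distinct roots.
    root_label, labels = {}, {}
    for p in sorted(foreground):
        root = find(p)
        if root not in root_label:
            root_label[root] = len(root_label) + 1
        labels[p] = root_label[root]
    return labels

FOUR = [(-1, 0), (1, 0), (0, -1), (0, 1)]
EIGHT = [off for off in product((-1, 0, 1), repeat=2) if any(off)]

# Foreground pixels of the 2-by-3 image [1 1 0; 0 0 1].
pixels = {(0, 0), (0, 1), (1, 2)}
print(max(label_components(pixels, FOUR).values()))   # 2
print(max(label_components(pixels, EIGHT).values()))  # 1
```

Nothing in the loop depends on the number of dimensions or on which offsets are present, which is why this approach generalizes with so little code.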

I thought about carrying over the run-length encoding idea to multiple dimensions, but I suspected the code needed to perform arbitrary-dimension arbitrary-connectivity adjacency analysis on runs would be complicated and error-prone. I was also pretty sure that there would be cases, depending on the dimensionality, the connectivity definition, and the specific image characteristics, where doing a run-length encoding pass would be counterproductive.
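One way to see how image characteristics affect the value of run-length encoding is to count runs against foreground pixels. The Python sketch below is only an illustration: the helper `row_runs` is hypothetical, and it extracts runs along rows of a 2-D image purely for convenience.

```python
def row_runs(image):
    """Return (row, start_col, end_col) for each maximal run of nonzero pixels."""
    runs = []
    for r, row in enumerate(image):
        start = None
        for c, value in enumerate(row):
            if value and start is None:
                start = c                      # a run begins
            elif not value and start is not None:
                runs.append((r, start, c - 1))  # a run ends
                start = None
        if start is not None:                   # run reaches the end of the row
            runs.append((r, start, len(row) - 1))
    return runs

solid = [[1] * 6 for _ in range(4)]
checkerboard = [[(r + c) % 2 == 0 for c in range(6)] for r in range(4)]

for name, image in (("solid", solid), ("checkerboard", checkerboard)):
    n_pixels = sum(bool(v) for row in image for v in row)
    print(name, "pixels:", n_pixels, "runs:", len(row_runs(image)))
# solid pixels: 24 runs: 4
# checkerboard pixels: 12 runs: 12
```

For the solid block, the runs summarize the image compactly. For the checkerboard, every run is a single pixel, so the encoding pass does extra work without reducing the number of items to analyze.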

So it all came down to a judgment call. Based on available time and resources, the long list of functions that needed multidimensional implementations, and the possibility that the run-length technique might be inefficient in some cases, we decided to go for the pixel-wise union-find implementation.

13 Responses to “bwlabeln - design decision”

  1. Pete replied on :

    Hi, I’ve just found this blog recently and see it’s full of great stuff!

    Regarding bwlabeln, I was wondering: is it necessary for the output to be of class double?

    I run into problems because I am trying to process pretty large image stacks (e.g. 300 x 300 x 1500 pixels). I use single precision, but after thresholding I can’t find connected regions using bwlabeln because it uses double and I run out of memory.

    I’ll have a look in more detail at the posts in this series and see if I can implement something that works for me, but I wondered if there was a quick solution or an explanation for why double is used rather than an integer class, or even single.

  2. Steve replied on :

    Pete—You make a good point. The reason that bwlabeln outputs double is that support for integer types and single-precision was very limited in MATLAB when these functions were introduced back in 2001. In fact, until 1997 double was the only numeric type in MATLAB.

    I’ve been indecisive about whether to change this function (and bwlabel) now, because such a change would be incompatible.

    I’m also indecisive about what new type to use. These days it wouldn’t be hard to construct an image for which uint32 or single would be inadequate to represent the number of objects.

  3. Pete replied on :

    Thanks very much for the reply. It makes sense and it’s good to know I’m not missing anything really obvious. I’m a newcomer to MATLAB.

    For my work, I’d like to be able to use the smallest class possible - perhaps even specifying it myself, and then suffering the consequences if I specify a class that is too small. uint16 or perhaps even uint8 would be enough for me, as any more distinct objects than that would be too many to process and would suggest my segmentation hadn’t been very good. I personally would prefer to contend with an upper boundary on the number of detectable objects than memory limitations for this reason, but I realise my needs may be rather specific.

    Anyway, thanks again, I’m glad you did the series on connected component labeling - it helps a lot to see the code.

  4. Steve replied on :

    Pete—So would you like to see the output class dependent on how many objects there are?

  5. Pete replied on :

    Yes, I think that would be better.

    I actually find myself drawn to the idea that the default class remains double, but the user can specify another class if they consider it necessary - that way one always knows in advance what class to expect. But that may be symptomatic of a personal need-for-control issue which I ought to overcome, since I am struggling to think of an occasion when using anything other than the smallest required class would actually be useful. Unless it simplifies implementation or backwards compatibility.

    Sorry, I’m not good at deciding even my own wishes. But yes, the option of using something smaller than double, either automatically or otherwise, is something I would find useful.

  6. Steve replied on :

    Pete—Thanks.

  7. beverley replied on :

    Hi Steve,

    I have a question about bwlabeln. I applied it to my image using bwlabeln(L,4) and got an image labeled with {17 20 21 23 25 26}, since six patches are found in the image. I don't know how to separate the clusters so that I can use regionprops to determine the area. Also, my algorithm will be applied to different images, so the labels for the patches will vary. What should I do to pull out these clusters so I can start feature extraction?
    Thanks, Steve
    -beverley

  8. Steve replied on :

    Beverley—If your image is already labeled, there’s nothing else you need to do before calling regionprops. Also, you might consider calling bwlabel, which is usually faster than bwlabeln for two-dimensional images. I don’t understand the second part of your question.

  9. Jeremy replied on :

    Steve,
    I just want to say that I agree with Pete: it would be nice if bwlabeln would output something smaller than a double. I too am working with rather large images and I would like to process them in parallel, but if each of the labs calls bwlabeln at the same time I quickly run out of memory.

    I know that my image should have fewer than 2000 objects. I also know that if the number of objects approaches 2^16, something went seriously wrong. Thus, as you can see, having more control over the output class from bwlabeln would be beneficial.

    Thanks

  10. Steve replied on :

    Jeremy—Thanks for your comments. It’s on my list of things to think about.

  11. jiang replied on :

    I have changed the output data type of the bwlabel C function to uint16, and the output now supports uint16.

    That way the matrix can be large.

  12. jiang replied on :

    bwlabeln would output something smaller than a double.

    Another choice would be for bwlabeln to be implemented
    using the indices of the connected components, or a sparse
    matrix. That way, the matrix could be rather large.
    Maybe bwlabeln would also have improved performance.

  13. Steve replied on :

    Jiang—Thanks for your comments.
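The memory concern raised in the comments can be made concrete with a little arithmetic. The Python sketch below is only an illustration: the helper names are hypothetical, the byte sizes are the standard sizes of these numeric classes, and the stack size (300 x 300 x 1500) and object count (2000) are the figures quoted by Pete and Jeremy. It also shows why single is a questionable label type: it represents consecutive integers exactly only up to 2^24.

```python
import struct

BYTES = {"double": 8, "single": 4, "uint32": 4, "uint16": 2, "uint8": 1}
MAX_LABEL = {"uint8": 2**8 - 1, "uint16": 2**16 - 1, "uint32": 2**32 - 1}

def label_matrix_bytes(shape, cls):
    """Bytes needed for a label matrix of the given shape and numeric class."""
    n = 1
    for d in shape:
        n *= d
    return n * BYTES[cls]

def smallest_unsigned_class(num_objects):
    """Smallest unsigned class that can hold labels 0..num_objects."""
    for cls in ("uint8", "uint16", "uint32"):
        if num_objects <= MAX_LABEL[cls]:
            return cls
    return "double"  # exact for integers up to 2**53

def single_roundtrip(n):
    """Value obtained after storing the integer n in single precision."""
    return int(struct.unpack("f", struct.pack("f", n))[0])

shape = (300, 300, 1500)
print(label_matrix_bytes(shape, "double"))  # 1080000000
print(label_matrix_bytes(shape, "uint16"))  # 270000000
print(smallest_unsigned_class(2000))        # uint16
print(single_roundtrip(2**24 + 1))          # 16777216
```

For a stack of that size, a uint16 label matrix needs a quarter of the memory of a double one, at the cost of a hard ceiling of 65535 objects.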



Steve Eddins manages the Image & Geospatial development team at The MathWorks and coauthored Digital Image Processing Using MATLAB. He writes here about image processing concepts, algorithm implementations, and MATLAB.


These postings are the author's and don't necessarily represent the opinions of The MathWorks.