{"id":274,"date":"2009-07-06T13:26:02","date_gmt":"2009-07-06T17:26:02","guid":{"rendered":"https:\/\/blogs.mathworks.com\/steve\/2009\/07\/06\/a-new-look-for-connected-component-labeling-in-r2009a\/"},"modified":"2019-10-28T15:37:54","modified_gmt":"2019-10-28T19:37:54","slug":"a-new-look-for-connected-component-labeling-in-r2009a","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/steve\/2009\/07\/06\/a-new-look-for-connected-component-labeling-in-r2009a\/","title":{"rendered":"A new look for connected component labeling in R2009a"},"content":{"rendered":"<div xmlns:mwsh=\"https:\/\/www.mathworks.com\/namespace\/mcode\/v1\/syntaxhighlight.dtd\" class=\"content\">\r\n   <p>One of the changes we made to the Image Processing Toolbox for R2009a was partially inspired by comments received on this\r\n      blog.  Specifically, several blog readers complained about the memory required by the function <tt>bwlabel<\/tt>.\r\n   <\/p>\r\n   <p>The function <tt>bwlabel<\/tt> labels connected components (or objects, or blobs) in a binary image.  (I wrote a <a href=\"https:\/\/blogs.mathworks.com\/steve\/category\/connected-components\/\">series of posts<\/a> a couple of years ago about connected component labeling algorithms.)\r\n   <\/p>\r\n   <p>I'll illustrate what <tt>bwlabel<\/tt> does using a very small binary image.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">BW = [1     1     1     0     0     0     0     0\r\n    1     1     1     0     1     1     0     0\r\n    1     1     1     0     1     1     0     0\r\n    0     0     0     0     0     0     1     0\r\n    0     0     0     0     0     0     1     0\r\n    0     0     0     0     0     0     1     0\r\n    0     1     1     0     0     1     1     0\r\n    0     1     1     0     0     0     0     0]<\/pre><pre style=\"font-style:oblique\">\r\nBW =\r\n\r\n     1     1     1     0     0     0     0     0\r\n     1     1     1     0     1     1     0     0\r\n     1     1     1     0     1     1     0     0\r\n     0     0     0     0     0     0     1     0\r\n     0     0     0     0     0     0     1     0\r\n     0     0     0     0     0     0     1     0\r\n     0     1     1     0     0     1     1     0\r\n     0     1     1     0     0     0     0     0\r\n\r\n<\/pre><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">L = bwlabel(BW)<\/pre><pre style=\"font-style:oblique\">\r\nL =\r\n\r\n     1     1     1     0     0     0     0     0\r\n     1     1     1     0     3     3     0     0\r\n     1     1     1     0     3     3     0     0\r\n     0     0     0     0     0     0     3     0\r\n     0     0     0     0     0     0     3     0\r\n     0     0     0     0     0     0     3     0\r\n     0     2     2     0     0     3     3     0\r\n     0     2     2     0     0     0     0     0\r\n\r\n<\/pre><p>The output, <tt>L<\/tt>, of <tt>bwlabel<\/tt> is called a <i>label matrix.<\/i> A label matrix contains nonnegative integer values.  0s correspond to background pixels in the binary image, and positive\r\n      values identify each labeled object.  For example, the elements of <tt>L<\/tt> that equal 2 correspond to the second object found by <tt>bwlabel<\/tt>.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">L == 2<\/pre><pre style=\"font-style:oblique\">\r\nans =\r\n\r\n     0     0     0     0     0     0     0     0\r\n     0     0     0     0     0     0     0     0\r\n     0     0     0     0     0     0     0     0\r\n     0     0     0     0     0     0     0     0\r\n     0     0     0     0     0     0     0     0\r\n     0     0     0     0     0     0     0     0\r\n     0     1     1     0     0     0     0     0\r\n     0     1     1     0     0     0     0     0\r\n\r\n<\/pre><p>Very often, the output of <tt>bwlabel<\/tt> is used as the input to <tt>regionprops<\/tt>, a function that measures geometric properties of regions. For example, we can compute the area of each labeled object:\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">s = regionprops(L, <span style=\"color: #A020F0\">'Area'<\/span>)<\/pre><pre style=\"font-style:oblique\">\r\ns = \r\n\r\n3x1 struct array with fields:\r\n    Area\r\n\r\n<\/pre><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">[s.Area]   <span style=\"color: #228B22\">% comma-separated list syntax for structures<\/span><\/pre><pre style=\"font-style:oblique\">\r\nans =\r\n\r\n     9     4     9\r\n\r\n<\/pre><p>As I mentioned in the first paragraph above, we occasionally hear complaints about the amount of memory required by <tt>bwlabel<\/tt>.  Because <tt>bwlabel<\/tt> always returns <tt>L<\/tt> as a double-precision matrix, the memory required is 8 bytes per image pixel.\r\n   <\/p>\r\n   <p>Why is <tt>L<\/tt> double-precision?  Because <tt>bwlabel<\/tt> was introduced to the Image Processing Toolbox at a time when support for integer matrices was just getting off the ground.\r\n       In 1998 or so, the only non-double data type supported by the Image Processing Toolbox was <tt>uint8<\/tt>.  It did not seem reasonable to limit label matrices to 255 objects, and we were uncomfortable with a design that called\r\n      for the data type of <tt>L<\/tt> to vary according to the number of objects detected.\r\n   <\/p>\r\n   <p>If there had been more than minimal support at the time for <tt>uint32<\/tt> or <tt>single<\/tt> matrices, we might have chosen one of those types. But there wasn't, so we went with <tt>double<\/tt>, and that's what it's been ever since.\r\n   <\/p>\r\n   <p>The memory complaints have persisted, though, particularly with users who are doing multidimensional image processing. 8 bytes\r\n      per image pixel seemed to be a high price to pay, especially when there were relatively few foreground pixels.\r\n   <\/p>\r\n   <p>The usual request was to change the data type <tt>L<\/tt> to be <tt>uint8<\/tt> when there were only a few objects.  We thought about doing that, but there were other issues we wanted to fix:\r\n   <\/p>\r\n   <p>1. The memory requirement was proportional to the total number of image pixels instead of the number of object pixels.  This\r\n      would still be the case even we used <tt>uint8<\/tt> in some cases.\r\n   <\/p>\r\n   <p>2. We knew that an early step inside the implementation of <tt>regionprops<\/tt> was to compute a list of all the pixel indices corresponding to each object.  This information is known inside the <tt>bwlabel<\/tt> computation, but it gets lost the label matrix doesn't represent it directly. Recomputing the pixel indices causes a speed\r\n      penalty.\r\n   <\/p>\r\n   <p>3. We knew that most users didn't use the label matrix directly.  Often, the only reason for computing it was to hand it off\r\n      to <tt>regionprops<\/tt>.\r\n   <\/p>\r\n   <p>So we looked for a new design that would address all these issues and that would not cause compatibility problems.<\/p>\r\n   <p>Next time I'll show you what we did.<\/p><script language=\"JavaScript\">\r\n<!--\r\n\r\n    function grabCode_c237c6d08e5c447ca80a58fc44934216() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='c237c6d08e5c447ca80a58fc44934216 ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' c237c6d08e5c447ca80a58fc44934216';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. \r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        author = 'Steve Eddins';\r\n        copyright = '';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add author and copyright lines at the bottom if specified.\r\n        if ((author.length > 0) || (copyright.length > 0)) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (author.length > 0) {\r\n                d.writeln('% _' + author + '_');\r\n            }\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n      \r\n      d.title = title + ' (MATLAB code)';\r\n      d.close();\r\n      }   \r\n      \r\n-->\r\n<\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_c237c6d08e5c447ca80a58fc44934216()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n            the MATLAB code \r\n            <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; 7.8<br><\/p>\r\n<\/div>\r\n<!--\r\nc237c6d08e5c447ca80a58fc44934216 ##### SOURCE BEGIN #####\r\n%%\r\n% One of the changes we made to the Image Processing Toolbox for R2009a was\r\n% partially inspired by comments received on this blog.  Specifically, several\r\n% blog readers complained about the memory required by the function |bwlabel|.\r\n%\r\n% The function |bwlabel| labels connected components (or objects, or blobs) in a\r\n% binary image.  (I wrote a \r\n% <https:\/\/blogs.mathworks.com\/steve\/category\/connected-components\/ series of\r\n% posts> a couple of years ago about connected component labeling algorithms.) \r\n%\r\n% I'll illustrate what |bwlabel| does using a very small binary image.\r\n\r\nBW = [1     1     1     0     0     0     0     0\r\n    1     1     1     0     1     1     0     0\r\n    1     1     1     0     1     1     0     0\r\n    0     0     0     0     0     0     1     0\r\n    0     0     0     0     0     0     1     0\r\n    0     0     0     0     0     0     1     0\r\n    0     1     1     0     0     1     1     0\r\n    0     1     1     0     0     0     0     0]\r\n\r\n%%\r\nL = bwlabel(BW)\r\n\r\n%%\r\n% The output, |L|, of |bwlabel| is called a _label matrix._ A label matrix\r\n% contains nonnegative integer values.  0s correspond to background pixels in\r\n% the binary image, and positive values identify each labeled object.  For\r\n% example, the elements of |L| that equal 2 correspond to the second object\r\n% found by |bwlabel|.\r\n\r\nL == 2\r\n\r\n%%\r\n% Very often, the output of |bwlabel| is used as the input to |regionprops|, a\r\n% function that measures geometric properties of regions. For\r\n% example, we can compute the area of each labeled object:\r\n\r\ns = regionprops(L, 'Area')\r\n\r\n%%\r\n\r\n[s.Area]   % comma-separated list syntax for structures\r\n\r\n%%\r\n% As I mentioned in the first paragraph above, we occasionally hear complaints\r\n% about the amount of memory required by |bwlabel|.  Because |bwlabel| always\r\n% returns |L| as a double-precision matrix, the memory required is 8 bytes per\r\n% image pixel.\r\n%\r\n% Why is |L| double-precision?  Because |bwlabel| was introduced to the Image\r\n% Processing Toolbox at a time when support for integer matrices was just\r\n% getting off the ground.  In 1998 or so, the only non-double data type\r\n% supported by the Image Processing Toolbox was |uint8|.  It did not seem\r\n% reasonable to limit label matrices to 255 objects, and we were uncomfortable\r\n% with a design that called for the data type of |L| to vary according to the\r\n% number of objects detected.\r\n%\r\n% If there had been more than minimal support at the time for |uint32| or\r\n% |single| matrices, we might have chosen one of those types. But there wasn't,\r\n% so we went with |double|, and that's what it's been ever since.\r\n%\r\n% The memory complaints have persisted, though, particularly with users who are\r\n% doing multidimensional image processing. 8 bytes per image pixel seemed to be\r\n% a high price to pay, especially when there were relatively few foreground\r\n% pixels.\r\n%\r\n% The usual request was to change the data type |L| to be |uint8| when there\r\n% were only a few objects.  We thought about doing that, but there were other\r\n% issues we wanted to fix:\r\n%\r\n% 1. The memory requirement was proportional to the\r\n% total number of image pixels instead of the number of object pixels.  This\r\n% would still be the case even we used |uint8| in some cases.\r\n%\r\n% 2. We knew that an early step inside the implementation of |regionprops| was to\r\n% compute a list of all the pixel indices corresponding to each object.  This\r\n% information is known inside the |bwlabel| computation, but it gets lost\r\n% the label matrix doesn't represent it directly.\r\n% Recomputing the pixel indices causes a speed penalty.\r\n%\r\n% 3. We knew that most users didn't use the label matrix directly.  Often, the\r\n% only reason for computing it was to hand it off to |regionprops|.\r\n%\r\n% So we looked for a new design that would address all these issues and that\r\n% would not cause compatibility problems.\r\n%\r\n% Next time I'll show you what we did.\r\n##### SOURCE END ##### c237c6d08e5c447ca80a58fc44934216\r\n-->","protected":false},"excerpt":{"rendered":"<p>\r\n   One of the changes we made to the Image Processing Toolbox for R2009a was partially inspired by comments received on this\r\n      blog.  Specifically, several blog readers complained about the... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/steve\/2009\/07\/06\/a-new-look-for-connected-component-labeling-in-r2009a\/\">read more >><\/a><\/p>","protected":false},"author":42,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[166,402,168,559,557,555],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/274"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/users\/42"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/comments?post=274"}],"version-history":[{"count":1,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/274\/revisions"}],"predecessor-version":[{"id":3639,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/274\/revisions\/3639"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/media?parent=274"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/categories?post=274"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/tags?post=274"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}