{"id":137,"date":"2007-05-25T12:02:13","date_gmt":"2007-05-25T16:02:13","guid":{"rendered":"https:\/\/blogs.mathworks.com\/steve\/2007\/05\/25\/connected-component-labeling-part-6\/"},"modified":"2019-10-23T12:56:09","modified_gmt":"2019-10-23T16:56:09","slug":"connected-component-labeling-part-6","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/steve\/2007\/05\/25\/connected-component-labeling-part-6\/","title":{"rendered":"Connected component labeling &#8211; Part 6"},"content":{"rendered":"<div xmlns:mwsh=\"https:\/\/www.mathworks.com\/namespace\/mcode\/v1\/syntaxhighlight.dtd\" class=\"content\">\r\n   <introduction>\r\n      <p>In this part of the <a href=\"https:\/\/blogs.mathworks.com\/steve\/2007\/05\/11\/connected-component-labeling-part-5\/\">connected component labeling series<\/a>, I'll finally get to one of the algorithms actually used in the Image Processing Toolbox.  It's based on a technique called\r\n         <i>union-find<\/i>, as described in Sedgewick's <i>Algorithms in C<\/i>, Addison-Wesley, 1998, pp. 11-20.  This algorithm is designed to be able to quickly form the union of two sets, and also\r\n         to be able to quickly find which set contains a given element.\r\n      <\/p>\r\n      <p>Here's the example I'll use to illustrate the method.  On the left is a small binary image containing a single eight-connected\r\n         component.  We start by making each nonzero pixel a node in a graph.  The nodes are numbered as shown on the right.\r\n      <\/p>\r\n      <p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2007\/May\/cclabel_part6_fig1.png\"> <\/p>\r\n   <\/introduction>\r\n   <p>In its initial state, the graph contains 7 unconnected nodes, one for each nonzero pixel:<\/p>\r\n   <p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2007\/May\/cclabel_part6_fig2.png\"> <\/p>\r\n   <p>We can represent this graph in MATLAB by a vector:<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">nodes = 1:7<\/pre><pre style=\"font-style:oblique\">\r\nnodes =\r\n\r\n     1     2     3     4     5     6     7\r\n\r\n<\/pre><p>In this representation, <tt>nodes(k)<\/tt> gives the parent of node <i>k<\/i>.  A node that is its own parent (that is, <tt>nodes(k)<\/tt> equals <i>k<\/i>) is a root node.\r\n   <\/p>\r\n   <p>As we scan the image columnwise, we encounter pairs of pixels that are connected to each other.  The first such pair is that\r\n      pixel 2 is connected to pixel 1.  We find the root node for the set containing pixel 2 (node 2) and the root node for the\r\n      set containing pixel 1 (node 1).  Since they are different, we form the union of those two sets simply by making the root\r\n      node of one of the sets point to the root node of the other.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">nodes(2) = 1<\/pre><pre style=\"font-style:oblique\">\r\nnodes =\r\n\r\n     1     1     3     4     5     6     7\r\n\r\n<\/pre><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2007\/May\/cclabel_part6_fig3.png\"> <\/p>\r\n   <p>The second connection encountered in the scan is that pixel 3 is connected to pixel 2.  The root node for the set containing\r\n      pixel 3 is node 3; the root node for the set containing pixel 2 is node 1. So we connect node 3 to node 1.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">nodes(3) = 1<\/pre><pre style=\"font-style:oblique\">\r\nnodes =\r\n\r\n     1     1     1     4     5     6     7\r\n\r\n<\/pre><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2007\/May\/cclabel_part6_fig4.png\"> <\/p>\r\n   <p>Now continue to process connections.  Pixel 4 is connected to pixel 3, so connect node 4 to node 1:<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">nodes(4) = 1<\/pre><pre style=\"font-style:oblique\">\r\nnodes =\r\n\r\n     1     1     1     1     5     6     7\r\n\r\n<\/pre><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2007\/May\/cclabel_part6_fig5.png\"> <\/p>\r\n   <p>Pixel 6 is connected to pixel 5, so connect node 6 to node 5:<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">nodes(6) = 5<\/pre><pre style=\"font-style:oblique\">\r\nnodes =\r\n\r\n     1     1     1     1     5     5     7\r\n\r\n<\/pre><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2007\/May\/cclabel_part6_fig6.png\"> <\/p>\r\n   <p>Pixel 7 is connected to pixel 6, so connect node 7 to node 5:<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">nodes(7) = 5<\/pre><pre style=\"font-style:oblique\">\r\nnodes =\r\n\r\n     1     1     1     1     5     5     5\r\n\r\n<\/pre><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2007\/May\/cclabel_part6_fig7.png\"> <\/p>\r\n   <p>Now for the moment of truth.  So far our graph still contains two sets (trees). But now we see that pixel 7 is connected to\r\n      pixel 4.  When we connect their corresponding roots (5 to 1), we are finally left with a single set:\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">nodes(5) = 1<\/pre><pre style=\"font-style:oblique\">\r\nnodes =\r\n\r\n     1     1     1     1     1     5     5\r\n\r\n<\/pre><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2007\/May\/cclabel_part6_fig8.png\"> <\/p>\r\n   <p>According to Sedgewick, there are two important things to do to the graph while processing pairs:<\/p>\r\n   <div>\r\n      <ul>\r\n         <li>When forming a union by connecting two trees in the graph, always connect the smaller of the two trees to the larger one.\r\n             This prevents the growth of long paths in the trees.\r\n         <\/li>\r\n         <li>When \"walking\" up a chain of nodes to find the root node, \"flatten out\" the tree using a technique called <i>path compression<\/i>.  Here's an example showing <i>path compression by halving<\/i>.  The tree on the left has a long path from node 7 up to node 1.  As you follow the path, take two nodes at a time and make\r\n            the bottom node point to the same parent as the top node. That give you the tree on the right.\r\n         <\/li>\r\n      <\/ul>\r\n   <\/div>\r\n   <p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2007\/May\/cclabel_part6_fig9.png\"> <\/p>\r\n   <p>If you keep doing this, then over time the trees in the graph stay almost completely flat.  The flat tree is what makes the\r\n      \"find\" operation as quick as the \"union\" operation.\r\n   <\/p>\r\n   <p>The Image Processing Toolbox function <tt>bwlabeln<\/tt>, which performs connected component labeling for arbitrary dimension and arbitrary connectivity, uses the union-find technique.\r\n       If you have the product, you can see the key parts of the implementation in the source files bwlabelnmex.cpp and unionfind.c,\r\n      which are in toolbox\\images\\images\\private.\r\n   <\/p>\r\n   <p>I'll try to wrap this series up with one more post, which will be about the two-dimensional algorithm used by the function\r\n      <tt>bwlabel<\/tt>.\r\n   <\/p><script language=\"JavaScript\">\r\n<!--\r\n\r\n    function grabCode_651a653d96a4485286702241f2a015a8() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='651a653d96a4485286702241f2a015a8 ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' 651a653d96a4485286702241f2a015a8';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. \r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        author = 'Steve Eddins';\r\n        copyright = 'Copyright 2007 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add author and copyright lines at the bottom if specified.\r\n        if ((author.length > 0) || (copyright.length > 0)) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (author.length > 0) {\r\n                d.writeln('% _' + author + '_');\r\n            }\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n      \r\n      d.title = title + ' (MATLAB code)';\r\n      d.close();\r\n      }   \r\n      \r\n-->\r\n<\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_651a653d96a4485286702241f2a015a8()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n            the MATLAB code \r\n            <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; 7.4<br><\/p>\r\n<\/div>\r\n<!--\r\n651a653d96a4485286702241f2a015a8 ##### SOURCE BEGIN #####\r\n%%\r\n% In this part of the \r\n% <https:\/\/blogs.mathworks.com\/steve\/2007\/05\/11\/connected-component-labeling-part-5\/ \r\n% connected component labeling series>, I'll finally\r\n% get to one of the algorithms actually used in the Image Processing\r\n% Toolbox.  It's based on a technique called _union-find_, as described in\r\n% Sedgewick's _Algorithms in C_, Addison-Wesley, 1998, pp. 11-20.  This\r\n% algorithm is designed to be able to quickly form the union of two sets,\r\n% and also to be able to quickly find which set contains a given element.\r\n%\r\n% Here's the example I'll use to illustrate the method.  On the left is a\r\n% small binary image containing a single eight-connected component.  We\r\n% start by making each nonzero pixel a node in a graph.  The nodes are\r\n% numbered as shown on the right.\r\n%\r\n% << cclabel_part6_fig1.png>>\r\n\r\n%%\r\n% In its initial state, the graph contains 7 unconnected nodes, one for \r\n% each nonzero pixel:\r\n%\r\n% <<cclabel_part6_fig2.png>>\r\n%\r\n% We can represent this graph in MATLAB by a vector:\r\n\r\nnodes = 1:7\r\n\r\n%%\r\n% In this representation, |nodes(k)| gives the parent of node _k_.  A node that\r\n% is its own parent (that is, |nodes(k)| equals _k_) is a root node.\r\n\r\n%%\r\n% As we scan the image columnwise, we encounter pairs of pixels that are\r\n% connected to each other.  The first such pair is that pixel 2 is\r\n% connected to pixel 1.  We find the root node for the set containing \r\n% pixel 2 (node 2) and the root node for the set containing pixel 1 (node\r\n% 1).  Since they are different, we form the union of those two sets\r\n% simply by making the root node of one of the sets point to the root node\r\n% of the other.\r\n\r\nnodes(2) = 1\r\n\r\n%%\r\n% <<cclabel_part6_fig3.png>>\r\n\r\n%%\r\n% The second connection encountered in the scan is that pixel 3 is \r\n% connected to pixel 2.  The root node for the set containing pixel 3 is\r\n% node 3; the root node for the set containing pixel 2 is node 1. So we\r\n% connect node 3 to node 1.\r\n\r\nnodes(3) = 1\r\n\r\n%%\r\n% <<cclabel_part6_fig4.png>>\r\n\r\n%%\r\n% Now continue to process connections.  Pixel 4 is connected to pixel 3, so\r\n% connect node 4 to node 1:\r\nnodes(4) = 1\r\n\r\n%%\r\n% <<cclabel_part6_fig5.png>>\r\n\r\n%%\r\n% Pixel 6 is connected to pixel 5, so connect node 6 to node 5:\r\nnodes(6) = 5\r\n\r\n%%\r\n% <<cclabel_part6_fig6.png>>\r\n\r\n%%\r\n% Pixel 7 is connected to pixel 6, so connect node 7 to node 5:\r\nnodes(7) = 5\r\n\r\n%%\r\n% <<cclabel_part6_fig7.png>>\r\n\r\n%%\r\n% Now for the moment of truth.  So far our graph still contains two sets\r\n% (trees). But now we see that pixel 7 is connected to pixel 4.  When we\r\n% connect their corresponding roots (5 to 1), we are finally left with a\r\n% single set:\r\n\r\nnodes(5) = 1\r\n\r\n%%\r\n% <<cclabel_part6_fig8.png>>\r\n\r\n%%\r\n% According to Sedgewick, there are two important things to do to the graph\r\n% while processing pairs:\r\n%\r\n% * When forming a union by connecting two trees in the graph, always\r\n% connect the smaller of the two trees to the larger one.  This prevents\r\n% the growth of long paths in the trees.\r\n% * When \"walking\" up a chain of nodes to find the root node, \"flatten out\"\r\n% the tree using a technique called _path compression_.  Here's an example\r\n% showing _path compression by halving_.  The tree on the left has a long\r\n% path from node 7 up to node 1.  As you follow the path, take two nodes at \r\n% a time and make the bottom node point to the same parent as the top node.\r\n% That give you the tree on the right.\r\n%\r\n% <<cclabel_part6_fig9.png>>\r\n%\r\n% If you keep doing this, then over time the trees in the graph stay almost\r\n% completely flat.  The flat tree is what makes the \"find\" operation as\r\n% quick as the \"union\" operation.\r\n%\r\n% The Image Processing Toolbox function |bwlabeln|, which performs\r\n% connected component labeling for arbitrary dimension and arbitrary\r\n% connectivity, uses the union-find technique.  If you have the product,\r\n% you can see the key parts of the implementation in the source files\r\n% bwlabelnmex.cpp and unionfind.c, which are in\r\n% toolbox\\images\\images\\private.\r\n%\r\n% I'll try to wrap this series up with one more post, which will be about\r\n% the two-dimensional algorithm used by the function |bwlabel|.\r\n\r\n\r\n##### SOURCE END ##### 651a653d96a4485286702241f2a015a8\r\n-->","protected":false},"excerpt":{"rendered":"<p>\r\n   \r\n      In this part of the connected component labeling series, I'll finally get to one of the algorithms actually used in the Image Processing Toolbox.  It's based on a technique called\r\n     ... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/steve\/2007\/05\/25\/connected-component-labeling-part-6\/\">read more >><\/a><\/p>","protected":false},"author":42,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[16],"tags":[352],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/137"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/users\/42"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/comments?post=137"}],"version-history":[{"count":1,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/137\/revisions"}],"predecessor-version":[{"id":3532,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/137\/revisions\/3532"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/media?parent=137"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/categories?post=137"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/tags?post=137"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}