{"id":203,"date":"2008-03-25T10:45:04","date_gmt":"2008-03-25T14:45:04","guid":{"rendered":"https:\/\/blogs.mathworks.com\/steve\/2008\/03\/25\/bwlabel-search-order\/"},"modified":"2019-10-24T14:01:24","modified_gmt":"2019-10-24T18:01:24","slug":"bwlabel-search-order","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/steve\/2008\/03\/25\/bwlabel-search-order\/","title":{"rendered":"bwlabel search order"},"content":{"rendered":"<div xmlns:mwsh=\"https:\/\/www.mathworks.com\/namespace\/mcode\/v1\/syntaxhighlight.dtd\" class=\"content\">\r\n   <p>I've received several questions over the past months about the search order for the function <tt>bwlabel<\/tt> and whether it can be changed.  Today's post discusses the search-order issue, how useful it might or might not be to change\r\n      it, and how to use <tt>regionprops<\/tt> to reorder labeled objects.\r\n   <\/p>\r\n   <p>First, though, a brief review of <tt>bwlabel<\/tt>: It takes a binary image, <tt>bw<\/tt>, and produces a <i>label matrix<\/i>, <tt>L<\/tt>.  The elements of L are integer values greater than or equal to 0. The elements equal to 0 are the background. The elements\r\n      equal to 1 make up one object, the elements equal to 2 make up a second object, and so on.  Here's a small example.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">bw = false(5, 5);\r\nbw(1:2, 1:2) = true;\r\nbw(4:5, 1:2) = true;\r\nbw(1:2, 4:5) = true;\r\nbw(4:5, 4:5) = true<\/pre><pre style=\"font-style:oblique\">\r\nbw =\r\n\r\n     1     1     0     1     1\r\n     1     1     0     1     1\r\n     0     0     0     0     0\r\n     1     1     0     1     1\r\n     1     1     0     1     1\r\n\r\n<\/pre><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">L = bwlabel(bw)<\/pre><pre style=\"font-style:oblique\">\r\nL =\r\n\r\n     1     1     0     3     3\r\n     1     1     0     3     3\r\n     0     0     0     0     0\r\n     2     2     0     4     4\r\n     2     2     0     4     4\r\n\r\n<\/pre><p>So <tt>bwlabel<\/tt> found four connected groups of foreground pixels in <tt>bw<\/tt>.  Since <tt>bwlabel<\/tt> scans the pixels of <tt>bw<\/tt> of column order, it finds the object in the lower-left corner second, which is why those pixels are labeled with a \"2\" in\r\n      <tt>L<\/tt>.\r\n   <\/p>\r\n   <p>Sometimes people ask if <tt>bwlabel<\/tt> can be made to search along the rows instead of along the columns, because they want the object in the upper right to be\r\n      found second.\r\n   <\/p>\r\n   <p>The search order for <tt>bwlabel<\/tt> is determined by the internal memory storage order of elements in MATLAB.  Searching along rows would substantially slow\r\n      the function down.  However, it is possible, with the assistance of <tt>regionprops<\/tt>, to efficiently postprocess the output of <tt>bwlabel<\/tt> so that objects are ordered however you wish.\r\n   <\/p>\r\n   <p>Before I show that technique, though, let me show an example of how ordering objects in two dimensions is not always straightforward,\r\n      no matter which way you search.\r\n   <\/p>\r\n   <p>Here's a snippet of a scanned text image we've look at before:<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">url = <span style=\"color: #A020F0\">'https:\/\/blogs.mathworks.com\/images\/steve\/186\/scanned_page.png'<\/span>;\r\nbw = imread(url);\r\nbw = ~bw(1107:1194, 17:135);\r\nimshow(bw)<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2008\/bwlabel_search_order_01.png\"> <p>One might expect that <tt>bwlabel<\/tt> would label the upper-left corner \"s\" as the first object found.  But that doesn't happen.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">L = bwlabel(bw);\r\n\r\n<span style=\"color: #228B22\">% Show the object labeled first.<\/span>\r\nimshow(L == 1)<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2008\/bwlabel_search_order_02.png\"> <p>Why is the F found first?  We can see why if we zoom in on the upper-left corner of the image.<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">set(gcf, <span style=\"color: #A020F0\">'Position'<\/span>, [100 100 400 300])\r\nimshow(bw(1:50, 1:20), <span style=\"color: #A020F0\">'InitialMag'<\/span>, <span style=\"color: #A020F0\">'fit'<\/span>)<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2008\/bwlabel_search_order_03.png\"> <p>The serif on the upper-left of the \"F\" extends to the left of the letter \"s.\"  Therefore, the columnwise scan performed by\r\n      <tt>bwlabel<\/tt> finds the \"F\" first.\r\n   <\/p>\r\n   <p>We could force a row-wise labeling search by transposing the binary image and then the output of <tt>bwlabel<\/tt>, like this:\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">L = bwlabel(bw')';<\/pre><p>But another surprise awaits; the first object found still isn't the \"s\"!<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">imshow(L == 1)<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2008\/bwlabel_search_order_04.png\"> <p>The \"t\" is found first, because the vertical stem of the \"t\" extends higher than the top of the \"s\" character.<\/p>\r\n   <p>Also, <tt>bwlabel<\/tt> will merge two or more initially assigned labels if it discovers along the way that two or more apparently separate objects\r\n      are actually connected.  This can change the labeling order in unexpected ways.\r\n   <\/p>\r\n   <p>With those cautionary notes in mind, let me show you how to postprocess the output of <tt>bwlabel<\/tt> to sort objects as desired.\r\n   <\/p>\r\n   <p>First, let's use <tt>regionprops<\/tt> to get some object measurements that might be useful for object ordering.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">s = regionprops(L, <span style=\"color: #A020F0\">'BoundingBox'<\/span>, <span style=\"color: #A020F0\">'Extrema'<\/span>, <span style=\"color: #A020F0\">'Centroid'<\/span>);<\/pre><p>We could sort the objects from left to right according to the left edge of the bounding box.<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">boxes = cat(1, s.BoundingBox);\r\nleft_edge = boxes(:,1);\r\n[sorted, sort_order] = sort(left_edge);\r\ns2 = s(sort_order);<\/pre><p>Let's visualize that sort order.<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">I = im2uint8(bw);\r\nI(~bw) = 200;\r\nI(bw) = 240;\r\nset(gcf, <span style=\"color: #A020F0\">'Position'<\/span>, [100 100 400 300]);\r\nimshow(I, <span style=\"color: #A020F0\">'InitialMag'<\/span>, <span style=\"color: #A020F0\">'fit'<\/span>)\r\nhold <span style=\"color: #A020F0\">on<\/span>\r\n<span style=\"color: #0000FF\">for<\/span> k = 1:numel(s2)\r\n   centroid = s2(k).Centroid;\r\n   text(centroid(1), centroid(2), sprintf(<span style=\"color: #A020F0\">'%d'<\/span>, k));\r\n<span style=\"color: #0000FF\">end<\/span>\r\nhold <span style=\"color: #A020F0\">off<\/span><\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2008\/bwlabel_search_order_05.png\"> <p>To sort according to a row-wise search order, sort lexicographically (using <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/sortrows.html\"><tt>sortrows<\/tt><\/a> instead of <tt>sort<\/tt>) on the left-most top extremum of each object. This extremum is the first row of the <tt>'Extrema'<\/tt> output from <tt>regionprops<\/tt>.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">extrema = cat(1, s.Extrema);\r\nleft_most_top = extrema(1:8:end, :);\r\n[sorted, sort_order] = sortrows(fliplr(left_most_top));\r\ns2 = s(sort_order);\r\nimshow(I, <span style=\"color: #A020F0\">'InitialMag'<\/span>, <span style=\"color: #A020F0\">'fit'<\/span>)\r\nhold <span style=\"color: #A020F0\">on<\/span>\r\n<span style=\"color: #0000FF\">for<\/span> k = 1:numel(s2)\r\n   centroid = s2(k).Centroid;\r\n   text(centroid(1), centroid(2), sprintf(<span style=\"color: #A020F0\">'%d'<\/span>, k));\r\n<span style=\"color: #0000FF\">end<\/span>\r\nhold <span style=\"color: #A020F0\">off<\/span><\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2008\/bwlabel_search_order_06.png\"> <p>Notice how the characters still aren't well-ordered from left to right.  That's because the taller ones get sorted earlier.\r\n       Let's try sorting first on the bottom of each character instead of the top, and quantizing the bottom coordinate.  The left-most\r\n      bottom extremum is the 6th row in <tt>'Extrema'<\/tt> measurement.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">left_most_bottom = extrema(6:8:end, :);\r\nleft = left_most_bottom(:, 1);\r\nbottom = left_most_bottom(:, 2);\r\n<span style=\"color: #228B22\">% quantize the bottom coordinate<\/span>\r\nbottom = 6 * round(bottom \/ 6);\r\n[sorted, sort_order] = sortrows([bottom left]);\r\ns2 = s(sort_order);\r\nimshow(I, <span style=\"color: #A020F0\">'InitialMag'<\/span>, <span style=\"color: #A020F0\">'fit'<\/span>)\r\nhold <span style=\"color: #A020F0\">on<\/span>\r\n<span style=\"color: #0000FF\">for<\/span> k = 1:numel(s2)\r\n   centroid = s2(k).Centroid;\r\n   text(centroid(1), centroid(2), sprintf(<span style=\"color: #A020F0\">'%d'<\/span>, k));\r\n<span style=\"color: #0000FF\">end<\/span>\r\nhold <span style=\"color: #A020F0\">off<\/span><\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2008\/bwlabel_search_order_07.png\"> <p>That's somewhat better, but to get further I imagine that some higher-level analysis of the structure of text lines in the\r\n      image will be necessary.\r\n   <\/p>\r\n   <p>Notice that I did not modify the label matrix, <tt>L<\/tt>, in these postprocessing steps.  Next time I'll show how to relabel <tt>L<\/tt>.\r\n   <\/p><script language=\"JavaScript\">\r\n<!--\r\n\r\n    function grabCode_c6bb749d971745ffab7ffa97903f164f() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='c6bb749d971745ffab7ffa97903f164f ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' c6bb749d971745ffab7ffa97903f164f';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. \r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        author = 'Steve Eddins';\r\n        copyright = 'Copyright 2008 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add author and copyright lines at the bottom if specified.\r\n        if ((author.length > 0) || (copyright.length > 0)) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (author.length > 0) {\r\n                d.writeln('% _' + author + '_');\r\n            }\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n      \r\n      d.title = title + ' (MATLAB code)';\r\n      d.close();\r\n      }   \r\n      \r\n-->\r\n<\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_c6bb749d971745ffab7ffa97903f164f()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n            the MATLAB code \r\n            <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; 7.6<br><\/p>\r\n<\/div>\r\n<!--\r\nc6bb749d971745ffab7ffa97903f164f ##### SOURCE BEGIN #####\r\n%%\r\n% I've received several questions over the past months about the search order\r\n% for the function \r\n% <https:\/\/www.mathworks.com\/help\/images\/index.htmlbwlabel.html \r\n% |bwlabel|> and whether it can be changed.  Today's post discusses the\r\n% search-order issue, how useful it might or might not be to change it, and how\r\n% to use |regionprops| to reorder labeled objects.\r\n%\r\n% First, though, a brief review of |bwlabel|: It takes a binary\r\n% image, |bw|, and produces a _label matrix_, |L|.  The elements of\r\n% L are integer values greater than or equal to 0. The elements\r\n% equal to 0 are the background. The elements equal to 1 make up one\r\n% object, the elements equal to 2 make up a second object, and so\r\n% on.  Here's a small example.\r\n\r\nbw = false(5, 5);\r\nbw(1:2, 1:2) = true;\r\nbw(4:5, 1:2) = true;\r\nbw(1:2, 4:5) = true;\r\nbw(4:5, 4:5) = true\r\n\r\n%%\r\nL = bwlabel(bw)\r\n\r\n%%\r\n% So |bwlabel| found four connected groups of foreground pixels in\r\n% |bw|.  Since |bwlabel| scans the pixels of |bw| of column order,\r\n% it finds the object in the lower-left corner second, which is why\r\n% those pixels are labeled with a \"2\" in |L|.\r\n%\r\n% Sometimes people ask if |bwlabel| can be made to search along the\r\n% rows instead of along the columns, because they want the object in\r\n% the upper right to be found second.\r\n%\r\n% The search order for |bwlabel| is determined by the internal\r\n% memory storage order of elements in MATLAB.  Searching along rows\r\n% would substantially slow the function down.  However, it is\r\n% possible, with the assistance of |regionprops|, to efficiently\r\n% postprocess the output of |bwlabel| so that objects are ordered\r\n% however you wish.\r\n%\r\n% Before I show that technique, though, let me show an example of\r\n% how ordering objects in two dimensions is not always\r\n% straightforward, no matter which way you search.\r\n%\r\n% Here's a snippet of a scanned text image we've look at before:\r\n\r\nurl = 'https:\/\/blogs.mathworks.com\/images\/steve\/186\/scanned_page.png';\r\nbw = imread(url);\r\nbw = ~bw(1107:1194, 17:135);\r\nimshow(bw)\r\n\r\n%%\r\n% One might expect that |bwlabel| would label the upper-left corner\r\n% \"s\" as the first object found.  But that doesn't happen.\r\n\r\nL = bwlabel(bw);\r\n\r\n% Show the object labeled first.\r\nimshow(L == 1)\r\n\r\n%%\r\n% Why is the F found first?  We can see why if we zoom in on the\r\n% upper-left corner of the image.\r\n\r\nset(gcf, 'Position', [100 100 400 300])\r\nimshow(bw(1:50, 1:20), 'InitialMag', 'fit')\r\n\r\n%%\r\n% The serif on the upper-left of the \"F\" extends to the left of the\r\n% letter \"s.\"  Therefore, the columnwise scan performed by\r\n% |bwlabel| finds the \"F\" first.\r\n%\r\n% We could force a row-wise labeling search by transposing the\r\n% binary image and then the output of |bwlabel|, like this:\r\n\r\nL = bwlabel(bw')';\r\n\r\n%%\r\n% But another surprise awaits; the first object found still isn't\r\n% the \"s\"!\r\n\r\nimshow(L == 1)\r\n\r\n%%\r\n% The \"t\" is found first, because the vertical stem of the \"t\"\r\n% extends higher than the top of the \"s\" character.\r\n%\r\n% Also, |bwlabel| will merge two or more initially assigned labels\r\n% if it discovers along the way that two or more apparently separate\r\n% objects are actually connected.  This can change the labeling\r\n% order in unexpected ways.\r\n%\r\n% With those cautionary notes in mind, let me show you how to\r\n% postprocess the output of |bwlabel| to sort objects as desired.\r\n%\r\n% First, let's use \r\n% <https:\/\/www.mathworks.com\/help\/images\/index.htmlregionprops.html \r\n% |regionprops|> to get some object measurements\r\n% that might be useful for object ordering.\r\n\r\ns = regionprops(L, 'BoundingBox', 'Extrema', 'Centroid');\r\n\r\n%%\r\n% We could sort the objects from left to right according to the left\r\n% edge of the bounding box.\r\n\r\nboxes = cat(1, s.BoundingBox);\r\nleft_edge = boxes(:,1);\r\n[sorted, sort_order] = sort(left_edge);\r\ns2 = s(sort_order);\r\n\r\n%%\r\n% Let's visualize that sort order.\r\n\r\nI = im2uint8(bw);\r\nI(~bw) = 200;\r\nI(bw) = 240;\r\nset(gcf, 'Position', [100 100 400 300]);\r\nimshow(I, 'InitialMag', 'fit')\r\nhold on\r\nfor k = 1:numel(s2)\r\n   centroid = s2(k).Centroid;\r\n   text(centroid(1), centroid(2), sprintf('%d', k));\r\nend\r\nhold off\r\n\r\n%%\r\n% To sort according to a row-wise search order, sort\r\n% lexicographically (using \r\n% <https:\/\/www.mathworks.com\/help\/matlab\/ref\/sortrows.html \r\n% |sortrows|> instead of |sort|) on the\r\n% left-most top extremum of each object. This extremum is the first\r\n% row of the |'Extrema'| output from |regionprops|. \r\n\r\nextrema = cat(1, s.Extrema);\r\nleft_most_top = extrema(1:8:end, :);\r\n[sorted, sort_order] = sortrows(fliplr(left_most_top));\r\ns2 = s(sort_order);\r\nimshow(I, 'InitialMag', 'fit')\r\nhold on\r\nfor k = 1:numel(s2)\r\n   centroid = s2(k).Centroid;\r\n   text(centroid(1), centroid(2), sprintf('%d', k));\r\nend\r\nhold off\r\n\r\n%%\r\n% Notice how the characters still aren't well-ordered from left to\r\n% right.  That's because the taller ones get sorted earlier.  Let's\r\n% try sorting first on the bottom of each character instead of the\r\n% top, and quantizing the bottom coordinate.  The left-most bottom\r\n% extremum is the 6th row in |'Extrema'| measurement.\r\n\r\nleft_most_bottom = extrema(6:8:end, :);\r\nleft = left_most_bottom(:, 1);\r\nbottom = left_most_bottom(:, 2);\r\n% quantize the bottom coordinate\r\nbottom = 6 * round(bottom \/ 6);\r\n[sorted, sort_order] = sortrows([bottom left]);\r\ns2 = s(sort_order);\r\nimshow(I, 'InitialMag', 'fit')\r\nhold on\r\nfor k = 1:numel(s2)\r\n   centroid = s2(k).Centroid;\r\n   text(centroid(1), centroid(2), sprintf('%d', k));\r\nend\r\nhold off\r\n\r\n%%\r\n% That's somewhat better, but to get further I imagine that some\r\n% higher-level analysis of the structure of text lines in the image\r\n% will be necessary.\r\n%\r\n% Notice that I did not modify the label matrix, |L|, in these\r\n% postprocessing steps.  Next time I'll show how to relabel |L|.\r\n##### SOURCE END ##### c6bb749d971745ffab7ffa97903f164f\r\n-->","protected":false},"excerpt":{"rendered":"<p>\r\n   I've received several questions over the past months about the search order for the function bwlabel and whether it can be changed.  Today's post discusses the search-order issue, how useful it... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/steve\/2008\/03\/25\/bwlabel-search-order\/\">read more >><\/a><\/p>","protected":false},"author":42,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[16],"tags":[166,46,102,192,172,90,452,76,36,162,168,188,484,486,78,100],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/203"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/users\/42"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/comments?post=203"}],"version-history":[{"count":2,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/203\/revisions"}],"predecessor-version":[{"id":2259,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/203\/revisions\/2259"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/media?parent=203"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/categories?post=203"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/tags?post=203"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}