{"id":2351,"date":"2016-10-10T11:01:44","date_gmt":"2016-10-10T15:01:44","guid":{"rendered":"https:\/\/blogs.mathworks.com\/steve\/?p=2351"},"modified":"2019-11-01T16:49:32","modified_gmt":"2019-11-01T20:49:32","slug":"filling-holes-in-outline-text","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/steve\/2016\/10\/10\/filling-holes-in-outline-text\/","title":{"rendered":"Filling holes in outline text"},"content":{"rendered":"<div class=\"content\"><p>Intrepid MathWorks application engineer <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/profile\/authors\/845693-brett-shoelson\">Brett Shoelson<\/a> recently got a user question that caught my attention. Consider an image containing text characters in outline form, such as this:<\/p><pre class=\"codeinput\">url = <span class=\"string\">'https:\/\/blogs.mathworks.com\/steve\/files\/MathWorks-address-binary.png'<\/span>;\r\nbw = imread(url);\r\nimshow(bw)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/steve\/files\/filling_text_holes_01.png\" alt=\"\"> <p>How can we fill in the text characters from their outlines without filling in the internal holes? If we just use <tt>imfill<\/tt> with the <tt>'holes'<\/tt> option, you can see that it doesn't give us the desired result.<\/p><pre class=\"codeinput\">bw_filled = imfill(bw,<span class=\"string\">'holes'<\/span>);\r\nimshow(bw_filled)\r\ntitle(<span class=\"string\">'Original with holes filled'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/steve\/files\/filling_text_holes_02.png\" alt=\"\"> <p>When I saw this problem, I thought that some combination of <tt>imfill<\/tt>, <tt>imclearborder<\/tt>, and logical operators could possibly solve it.<\/p><p>You've already seen <tt>imfill<\/tt>. 
Here's how <tt>imclearborder<\/tt> works.<\/p><pre class=\"codeinput\">url_sample = <span class=\"string\">'https:\/\/blogs.mathworks.com\/images\/steve\/168\/aug31.png'<\/span>;\r\nbw_sample = imread(url_sample);\r\nimshow(bw_sample)\r\ntitle(<span class=\"string\">'imclearborder demonstration - input image'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/steve\/files\/filling_text_holes_03.png\" alt=\"\"> <pre class=\"codeinput\">bw_sample_clearborder = imclearborder(bw_sample);\r\nimshow(bw_sample_clearborder)\r\ntitle(<span class=\"string\">'imclearborder demonstration - output image'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/steve\/files\/filling_text_holes_04.png\" alt=\"\"> <p>You can see that any connected component touching any image border has been removed.<\/p><p>Going back to our task, I'm going to proceed first by identifying the background pixels that are inside the text characters. 
Roughly speaking, I'll tackle this phase by working \"from the outside in.\"<\/p><p>First, let's identify and remove the pixels that are external to the characters.<\/p><pre class=\"codeinput\">bw2 = ~bw;\r\nimshow(bw2)\r\ntitle(<span class=\"string\">'Complement of original image'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/steve\/files\/filling_text_holes_05.png\" alt=\"\"> <pre class=\"codeinput\">bw3 = imclearborder(bw2);\r\nimshow(bw3)\r\ntitle(<span class=\"string\">'External pixels removed from the foreground'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/steve\/files\/filling_text_holes_06.png\" alt=\"\"> <p>Now let's complement the image and clear the borders again.<\/p><pre class=\"codeinput\">bw4 = ~bw3;\r\nimshow(bw4)\r\ntitle(<span class=\"string\">'Complement of bw3'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/steve\/files\/filling_text_holes_07.png\" alt=\"\"> <pre class=\"codeinput\">bw5 = imclearborder(bw4);\r\nimshow(bw5)\r\ntitle(<span class=\"string\">'After second border-clearing'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/steve\/files\/filling_text_holes_08.png\" alt=\"\"> <p>If we fill the holes in the image <tt>bw5<\/tt> above, and then take the exclusive-or of the result with <tt>bw5<\/tt>, we'll be left only with the internal hole pixels inside the characters.<\/p><pre class=\"codeinput\">bw6 = imfill(bw5,<span class=\"string\">'holes'<\/span>);\r\nbw7 = xor(bw6,bw5);\r\nimshow(bw7)\r\ntitle(<span class=\"string\">'Internal hole pixels'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/steve\/files\/filling_text_holes_09.png\" alt=\"\"> <p>We're almost there. 
We can now use <tt>bw7<\/tt> to \"fix up\" the initial filled result, <tt>bw_filled<\/tt>, using an exclusive-or operation.<\/p><pre class=\"codeinput\">bw_final = xor(bw_filled,bw7);\r\nimshow(bw_final)\r\ntitle(<span class=\"string\">'Presto!'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/steve\/files\/filling_text_holes_10.png\" alt=\"\"> <p>Readers, how would you solve this problem? Brett and I think there might be a couple of other reasonable approaches. Let us know in the comments.<\/p><script language=\"JavaScript\"> <!-- \r\n    function grabCode_b16b835b81954357abe39ad417e798f5() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='b16b835b81954357abe39ad417e798f5 ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' b16b835b81954357abe39ad417e798f5';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. 
\r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        copyright = 'Copyright 2016 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add copyright line at the bottom if specified.\r\n        if (copyright.length > 0) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n\r\n        d.title = title + ' (MATLAB code)';\r\n        d.close();\r\n    }   \r\n     --> <\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_b16b835b81954357abe39ad417e798f5()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n      the MATLAB code <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; R2016b<br><\/p><\/div><!--\r\nb16b835b81954357abe39ad417e798f5 ##### SOURCE BEGIN #####\r\n%%\r\n% Intrepid MathWorks application engineer\r\n% <https:\/\/www.mathworks.com\/matlabcentral\/profile\/authors\/845693-brett-shoelson\r\n% Brett Shoelson> recently got a user question that caught my attention.\r\n% Consider an image containing text characters in outline form, such as\r\n% this:\r\n\r\nurl = 'https:\/\/blogs.mathworks.com\/steve\/files\/MathWorks-address-binary.png';\r\nbw = imread(url);\r\nimshow(bw)\r\n\r\n%%\r\n% How can we fill in the text characters from their outlines without\r\n% filling in the internal holes? 
If we just use |imfill| with the |'holes'|\r\n% option, you can see that it doesn't give us the desired result.\r\n\r\nbw_filled = imfill(bw,'holes');\r\nimshow(bw_filled)\r\ntitle('Original with holes filled')\r\n\r\n%%\r\n% When I saw this problem, I thought that some combination of |imfill|,\r\n% |imclearborder|, and logical operators could possibly solve it.\r\n%\r\n% You've already seen |imfill|. Here's how |imclearborder| works.\r\n\r\nurl_sample = 'https:\/\/blogs.mathworks.com\/images\/steve\/168\/aug31.png';\r\nbw_sample = imread(url_sample);\r\nimshow(bw_sample)\r\ntitle('imclearborder demonstration - input image')\r\n\r\n%%\r\nbw_sample_clearborder = imclearborder(bw_sample);\r\nimshow(bw_sample_clearborder)\r\ntitle('imclearborder demonstration - output image')\r\n\r\n%%\r\n% You can see that any connected component touching any image border has\r\n% been removed.\r\n%\r\n% Going back to our task, I'm going to proceed first by identifying the\r\n% background pixels that are inside the text characters. Roughly speaking,\r\n% I'll tackle this phase by working \"from the outside in.\"\r\n%\r\n% First, let's identify and remove the pixels that are external to the\r\n% characters.\r\n\r\nbw2 = ~bw;\r\nimshow(bw2)\r\ntitle('Complement of original image')\r\n\r\n%%\r\n\r\nbw3 = imclearborder(bw2);\r\nimshow(bw3)\r\ntitle('External pixels removed from the foreground')\r\n\r\n%%\r\n% Now let's complement the image and clear the borders again.\r\n\r\nbw4 = ~bw3;\r\nimshow(bw4)\r\ntitle('Complement of bw3')\r\n\r\n%%\r\n\r\nbw5 = imclearborder(bw4);\r\nimshow(bw5)\r\ntitle('After second border-clearing')\r\n\r\n%%\r\n% If we fill the holes in the image |bw5| above, and then take the \r\n% exclusive-or of the result with |bw5|, we'll be left only with\r\n% the internal hole pixels inside the characters.\r\n\r\nbw6 = imfill(bw5,'holes');\r\nbw7 = xor(bw6,bw5);\r\nimshow(bw7)\r\ntitle('Internal hole pixels')\r\n\r\n%%\r\n% We're almost there. 
We can now use |bw7| to \"fix up\" the initial filled\r\n% result, |bw_filled|, using an exclusive-or operation.\r\n\r\nbw_final = xor(bw_filled,bw7);\r\nimshow(bw_final)\r\ntitle('Presto!')\r\n\r\n%%\r\n% Readers, how would you solve this problem? Brett and I think there\r\n% might be a couple of other reasonable approaches. Let us know in the\r\n% comments.\r\n\r\n##### SOURCE END ##### b16b835b81954357abe39ad417e798f5\r\n-->","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img src=\"https:\/\/blogs.mathworks.com\/steve\/files\/filling_text_holes_06.png\" class=\"img-responsive attachment-post-thumbnail size-post-thumbnail wp-post-image\" alt=\"\" decoding=\"async\" loading=\"lazy\" \/><\/div><p>Intrepid MathWorks application engineer Brett Shoelson recently got a user question that caught my attention. Consider an image containing text characters in outline form, such as this: url =... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/steve\/2016\/10\/10\/filling-holes-in-outline-text\/\">read more 
>><\/a><\/p>","protected":false},"author":42,"featured_media":2358,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[404,136,76,36,52,454],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/2351"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/users\/42"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/comments?post=2351"}],"version-history":[{"count":2,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/2351\/revisions"}],"predecessor-version":[{"id":2363,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/2351\/revisions\/2363"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/media\/2358"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/media?parent=2351"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/categories?post=2351"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/tags?post=2351"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}