{"id":241,"date":"2008-12-19T19:17:39","date_gmt":"2008-12-20T00:17:39","guid":{"rendered":"https:\/\/blogs.mathworks.com\/steve\/2008\/12\/19\/packed-dilation-and-erosion\/"},"modified":"2019-10-28T09:49:00","modified_gmt":"2019-10-28T13:49:00","slug":"packed-dilation-and-erosion","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/steve\/2008\/12\/19\/packed-dilation-and-erosion\/","title":{"rendered":"Packed dilation and erosion"},"content":{"rendered":"<div xmlns:mwsh=\"https:\/\/www.mathworks.com\/namespace\/mcode\/v1\/syntaxhighlight.dtd\" class=\"content\">\r\n   <p><i>This post is another in my <a href=\"https:\/\/blogs.mathworks.com\/steve\/category\/dilation-algorithms\/\">series on morphological dilation and erosion algorithms<\/a>.<\/i><\/p>\r\n   <p>One of the algorithm techniques used by <tt>imdilate<\/tt> and <tt>imerode<\/tt> is binary image bit packing.  In bit packing, groups of 32 binary image pixels are stored as bits in unsigned 32-bit integers.\r\n   <\/p>\r\n   <p>The Image Processing Toolbox functions <tt>bwpack<\/tt> and <tt>bwunpack<\/tt> do the packing and unpacking. Let's look at a simple example, starting with reading in text.png and then extracting a 32-by-4\r\n      subimage.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">bw = imread(<span style=\"color: #A020F0\">'text.png'<\/span>);\r\nimshow(bw)<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2008\/dilate_packed_01.png\"> <pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">bw_sub = bw(12:41, 30:33)<\/pre><pre style=\"font-style:oblique\">\r\nbw_sub =\r\n\r\n     0     0     0     0\r\n     0     0     1     1\r\n     0     0     1     1\r\n     0     1     1     1\r\n     0     1     1     1\r\n     0     1     1     1\r\n     0     1     1     1\r\n     0     1     1     1\r\n     0     0     1     1\r\n     0     0     1     1\r\n     0     0     0     0\r\n     0     0     0     0\r\n     0     0     0     0\r\n     0     0     0     0\r\n     0     0     0     0\r\n     0     0     0     0\r\n     0     0     0     0\r\n     0     0     0     0\r\n     0     0     0     0\r\n     1     0     0     0\r\n     1     0     0     0\r\n     0     0     0     0\r\n     0     0     0     0\r\n     0     0     1     1\r\n     0     0     1     1\r\n     0     1     1     1\r\n     0     1     1     1\r\n     0     1     1     1\r\n     0     1     1     1\r\n     0     1     1     1\r\n\r\n<\/pre><p>Now pass the subimage to <tt>bwpack<\/tt>.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">bw_sub_packed = bwpack(bw_sub)\r\nwhos<\/pre><pre style=\"font-style:oblique\">\r\nbw_sub_packed =\r\n\r\n     1572864  1040187640  1065354238  1065354238\r\n\r\n  Name                 Size             Bytes  Class      Attributes\r\n\r\n  bw                 256x256            65536  logical              \r\n  bw_sub              30x4                120  logical              \r\n  bw_sub_packed        1x4                 16  uint32               \r\n\r\n<\/pre><p><tt>bw_sub_packed<\/tt> is just a 1-by-4 <tt>uint32<\/tt> matrix, but those 4 values contain all the values in the original 32-by-4 subimage. You can see the correspondence by using\r\n      the <tt>dec2bin<\/tt> function, which converts a decimal number to binary form (represented as a string).\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\"><span style=\"color: #228B22\">% Convert the bw_sub_packed(1,2) to binary form.<\/span>\r\ndec2bin(bw_sub_packed(1, 2))<\/pre><pre style=\"font-style:oblique\">\r\nans =\r\n\r\n111110000000000000000011111000\r\n\r\n<\/pre><p>You can see that these are the 0 and 1 values from the second column of <tt>bw_sub<\/tt>.\r\n   <\/p>\r\n   <p>So what does this have to do with dilation and erosion? Binary image dilation and erosion can be expressed in terms of translations,\r\n      logical ANDs, and logical ORs of pixel values.  The C and C++ operators <tt>&gt;&gt;<\/tt>, <tt>&lt;&lt;<\/tt>, <tt>&amp;<\/tt>, and <tt>|<\/tt> operate on unsigned integer values in bitwise fashion.  Therefore, by packing binary image pixel values into the bits of\r\n      unsigned 32-bit integers, we can write C or C++ code that dilates and erodes faster, because it is essentially operating on\r\n      multiple pixel values at a time.\r\n   <\/p>\r\n   <p>Let's try dilating text.png by packing it, passing the packed image to <tt>imdilate<\/tt>, and then unpacking the result.  (Note that <tt>imdilate<\/tt> supports 32-bit integer images, so we have to explicitly tell <tt>imdilate<\/tt> to treat the input as packed.)  We'll use a diagonal cross structuring element.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">bw_packed = bwpack(bw);\r\nse = strel(eye(9) | rot90(eye(9)));\r\nbw_packed_dilated = imdilate(bw_packed, se, <span style=\"color: #A020F0\">'ispacked'<\/span>);\r\nbw_dilated = bwunpack(bw_packed_dilated);\r\nimshow(bw_dilated)<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2008\/dilate_packed_02.png\"> <p>Let's time packed dilation and compare it to the ordinary version, using a 2k-by-2k image. \r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">bw = repmat(bw, 8, 8);\r\nbw_packed = bwpack(bw);\r\npacked_dilation_time = timeit(@() imdilate(bw_packed, se, <span style=\"color: #A020F0\">'ispacked'<\/span>))<\/pre><pre style=\"font-style:oblique\">\r\npacked_dilation_time =\r\n\r\n    0.0127\r\n\r\n<\/pre><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">ordinary_dilation_time = timeit(@() imdilate(bw, se))<\/pre><pre style=\"font-style:oblique\">\r\nordinary_dilation_time =\r\n\r\n    0.0396\r\n\r\n<\/pre><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">speedup = ordinary_dilation_time \/ packed_dilation_time<\/pre><pre style=\"font-style:oblique\">\r\nspeedup =\r\n\r\n    3.1251\r\n\r\n<\/pre><p>The measured speedup is good, but much less than a factor of 32, which we might have expected.  There are two reasons for\r\n      the smaller speedup factor:\r\n   <\/p>\r\n   <div>\r\n      <ul>\r\n         <li>Input argument parsing and other overhead in imdilate.m<\/li>\r\n         <li>The C++ code used by the toolbox for ordinary dilation uses a method that does better when there are relatively few white\r\n            input pixels, and this method can't be used as effectively for the packed version.\r\n         <\/li>\r\n      <\/ul>\r\n   <\/div>\r\n   <p>Here's a design question for you to consider: Should <tt>imdilate<\/tt> always use the packed form?  In other words, if you pass it a <tt>logical<\/tt> input, should <tt>imdilate<\/tt> pack it, performed packed dilation, and then unpack it?\r\n   <\/p>\r\n   <p>Well, the packing and unpacking steps take time. For example:<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">timeit(@() bwpack(bw))<\/pre><pre style=\"font-style:oblique\">\r\nans =\r\n\r\n    0.0198\r\n\r\n<\/pre><p>So it's conceivable that packing, packed dilation, and unpacking could together take longer than the ordinary dilation implementation.<\/p>\r\n   <p>We decided that <tt>imdilate<\/tt> (and <tt>imerode<\/tt>) should automatically do bit packing only if the structuring element is decomposed into at least two structuring elements.\r\n       That way, the cost of packing and unpacking is amortized across the cost of at least two packed dilation operations.\r\n   <\/p>\r\n   <p>Lately, we've been taking a fresh look at various aspects of our morphology operations, and we're optimistic that we can make\r\n      them go faster in future releases of the Image Processing Toolbox.\r\n   <\/p><script language=\"JavaScript\">\r\n<!--\r\n\r\n    function grabCode_b88648a38601485eabe1525f60875aff() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='b88648a38601485eabe1525f60875aff ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' b88648a38601485eabe1525f60875aff';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. \r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        author = 'Steve Eddins';\r\n        copyright = 'Copyright 2008 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add author and copyright lines at the bottom if specified.\r\n        if ((author.length > 0) || (copyright.length > 0)) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (author.length > 0) {\r\n                d.writeln('% _' + author + '_');\r\n            }\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n      \r\n      d.title = title + ' (MATLAB code)';\r\n      d.close();\r\n      }   \r\n      \r\n-->\r\n<\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_b88648a38601485eabe1525f60875aff()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n            the MATLAB code \r\n            <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; 7.7<br><\/p>\r\n<\/div>\r\n<!--\r\nb88648a38601485eabe1525f60875aff ##### SOURCE BEGIN #####\r\n%% \r\n% _This post is another in my <https:\/\/blogs.mathworks.com\/steve\/category\/dilation-algorithms\/ \r\n% series on morphological dilation and erosion\r\n% algorithms>._\r\n%\r\n% One of the algorithm techniques used by |imdilate| and |imerode|\r\n% is binary image bit packing.  In bit packing, groups of 32 binary\r\n% image pixels are stored as bits in unsigned 32-bit integers.\r\n%\r\n% The Image Processing Toolbox functions |bwpack| and |bwunpack| do\r\n% the packing and unpacking. Let's look at a simple example,\r\n% starting with reading in text.png and then extracting a 32-by-4\r\n% subimage.\r\n\r\nbw = imread('text.png');\r\nimshow(bw)\r\n\r\n%%\r\nbw_sub = bw(12:41, 30:33)\r\n\r\n%%\r\n% Now pass the subimage to |bwpack|.\r\n\r\nbw_sub_packed = bwpack(bw_sub)\r\nwhos\r\n\r\n%%\r\n% |bw_sub_packed| is just a 1-by-4 |uint32| matrix, but those 4\r\n% values contain all the values in the original 32-by-4 subimage.\r\n% You can see the correspondence by using the |dec2bin| function,\r\n% which converts a decimal number to binary form (represented as a\r\n% string).\r\n\r\n% Convert the bw_sub_packed(1,2) to binary form.\r\ndec2bin(bw_sub_packed(1, 2))\r\n\r\n%%\r\n% You can see that these are the 0 and 1 values from the second\r\n% column of |bw_sub|. \r\n\r\n%% \r\n% So what does this have to do with dilation and erosion? Binary\r\n% image dilation and erosion can be expressed in terms of\r\n% translations, logical ANDs, and logical ORs of pixel values.  The\r\n% C and C++ operators |>>|, |<<|, |&|, and ||| operate on unsigned\r\n% integer values in bitwise fashion.  Therefore, by packing binary\r\n% image pixel values into the bits of unsigned 32-bit integers, we\r\n% can write C or C++ code that dilates and erodes faster, because it\r\n% is essentially operating on multiple pixel values at a time.\r\n%\r\n% Let's try dilating text.png by packing it, passing the packed\r\n% image to |imdilate|, and then unpacking the result.  (Note that\r\n% |imdilate| supports 32-bit integer images, so we have to\r\n% explicitly tell |imdilate| to treat the input as packed.)  We'll\r\n% use a diagonal cross structuring element.\r\n\r\nbw_packed = bwpack(bw);\r\nse = strel(eye(9) | rot90(eye(9)));\r\nbw_packed_dilated = imdilate(bw_packed, se, 'ispacked');\r\nbw_dilated = bwunpack(bw_packed_dilated);\r\nimshow(bw_dilated)\r\n\r\n%%\r\n% Let's time packed dilation and compare it to the ordinary\r\n% version, using a 2k-by-2k image.\r\n% (You can get my timing function, |timeit|, from the \r\n% <https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/18798 \r\n% MATLAB Central\r\n% File Exchange>.)\r\n\r\nbw = repmat(bw, 8, 8);\r\nbw_packed = bwpack(bw);\r\npacked_dilation_time = timeit(@() imdilate(bw_packed, se, 'ispacked'))\r\n\r\n%%\r\n\r\nordinary_dilation_time = timeit(@() imdilate(bw, se))\r\n\r\n%%\r\n\r\nspeedup = ordinary_dilation_time \/ packed_dilation_time\r\n\r\n%%\r\n% The measured speedup is good, but much less than a factor of 32,\r\n% which we might have expected.  There are two reasons for the\r\n% smaller speedup factor:\r\n%\r\n% * Input argument parsing and other overhead in imdilate.m\r\n% * The C++ code used by the toolbox for ordinary dilation uses a method\r\n% that does better when there are relatively few white input pixels,\r\n% and this method can't be used as effectively for the packed\r\n% version.\r\n\r\n%% \r\n% Here's a design question for you to consider: Should |imdilate|\r\n% always use the packed form?  In other words, if you pass it a\r\n% |logical| input, should |imdilate| pack it, performed packed\r\n% dilation, and then unpack it?\r\n%\r\n% Well, the packing and unpacking steps take time. For example:\r\n\r\ntimeit(@() bwpack(bw))\r\n\r\n%%\r\n% So it's conceivable that packing, packed dilation, and unpacking\r\n% could together take longer than the ordinary dilation\r\n% implementation.\r\n%\r\n% We decided that |imdilate| (and |imerode|) should automatically do\r\n% bit packing only if the structuring element is decomposed into \r\n% at least two structuring elements.  That way, the cost of packing\r\n% and unpacking is amortized across the cost of at least two packed\r\n% dilation operations.\r\n%\r\n% Lately, we've been taking a fresh look at various aspects of our\r\n% morphology operations, and we're optimistic that we can make them\r\n% go faster in future releases of the Image Processing Toolbox.\r\n\r\n##### SOURCE END ##### b88648a38601485eabe1525f60875aff\r\n-->","protected":false},"excerpt":{"rendered":"<p>\r\n   This post is another in my series on morphological dilation and erosion algorithms.\r\n   One of the algorithm techniques used by imdilate and imerode is binary image bit packing.  In bit packing,... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/steve\/2008\/12\/19\/packed-dilation-and-erosion\/\">read more >><\/a><\/p>","protected":false},"author":42,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[18],"tags":[524,528,262,300,124,246,76,36,116,526,106,474,306],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/241"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/users\/42"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/comments?post=241"}],"version-history":[{"count":1,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/241\/revisions"}],"predecessor-version":[{"id":2642,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/241\/revisions\/2642"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/media?parent=241"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/categories?post=241"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/tags?post=241"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}