{"id":3198,"date":"2019-04-02T07:00:18","date_gmt":"2019-04-02T11:00:18","guid":{"rendered":"https:\/\/blogs.mathworks.com\/steve\/?p=3198"},"modified":"2019-11-01T22:36:45","modified_gmt":"2019-11-02T02:36:45","slug":"multiresolution-image-pyramids-and-impyramid-part-1","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/steve\/2019\/04\/02\/multiresolution-image-pyramids-and-impyramid-part-1\/","title":{"rendered":"Multiresolution image pyramids and impyramid &#8211; part 1"},"content":{"rendered":"<div class=\"content\"><p>There's a function in the Image Processing Toolbox that I'm not especially fond of: <tt>impyramid<\/tt>.<\/p><p>I shouldn't tell you who designed this function, but I'll give you a hint: his initials are \"Steve Eddins.\"<\/p><p>Years ago, we had occasional user requests for creating multiresolution image pyramids, such as Gaussian pyramids and Laplacian pyramids. I thought these requests would be easy to satisfy by <tt>impyramid<\/tt>, but I was wrong. I was reminded of the problems by some code written by one our Pilot engineers, Steve Kuznicki. Steve made a nice of use <tt>impyramid<\/tt> to implement a multiresolution image blending method, but weaknesses in the design of <tt>impyramid<\/tt> made some of the code awkward. Today I want to discuss those weaknesses. In a follow-up post later, I'll demonstrate some possible improvements.<\/p><p>I'll start by explaining multiresolution pyramids. Below an example of one. This form is sometimes called a lowpass pyramid.<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/steve\/files\/mr-pyramid-example.jpg\" alt=\"\"> <\/p><p>The original image is shown in the upper left. The image is lowpass filtered and then subsampled by a factor of 2 in each direction to form the image in the upper right. That process is repeated several times, reaching the tiny version of the original image at the lower right.<\/p><p>So, I think \"multiresolution pyramid\" and \"lowpass pyramid\" are both sensible terms, but why is this frequently called a \"Gaussian pyramid\"? The answer comes from an implementation technique described in Burt, \"Fast Filter Transforms for Image Processing,\" <i>Computer Graphics and Image Processing<\/i>, vol. 16, pp. 20-51, 1981. Burt described a procedure for forming a multiresolution pyramid by filtering at each step with the kernel $h_1(x)$ in Fig. 2 from the paper:<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/steve\/files\/Burt-Figure-2.png\" alt=\"\"> <\/p><p>The coefficients of $h_1(x)$ are $1\/4 - a\/2$, 0.25, and $a$, with typical values for $a$ around 0.4.<\/p><p>As you successively apply this filter at each stage of a multiresolution pyramid, the overall effect becomes similar to filtering with a continuous Gaussian-shaped kernel, and this is where the name <i>Gaussian pyramid<\/i> comes from.<\/p><p>At the time this paper was written, computers were vastly slower than they are now, especially with floating-point computation. The floating-point convolution of a 512x512 image (considered large then, and considered quite small now) with a filter of modest size could take minutes.<\/p><p>That explains part of the significance of Burt's implementation method. Not only is the filter applied at each pyramid stage very small, but when $a = 0.375$, all the filter coefficients are exact binary fractions and so can be implemented by direct processor bit-shift instructions on the integer pixel values. That allowed for reasonably fast implementations on the computers of that time.<\/p><p>When I try to put <tt>impyramid<\/tt> to good use, or when I have seen other people try to use it, the following problems stand out to me:<\/p><div><ul><li>Each call to the function <tt>impyramid<\/tt> implements a single level, either a reduction or an expansion. That leaves it up to the caller to handle a lot tedious code for deciding how many levels there should be, keeping track of the size of the original image, and handling cases where the original image dimensions are not powers-of-two multiples of the size of the smallest image in the pyramid.<\/li><li>When you call <tt>impyramid<\/tt> to reduce an image by a level, and then you call <tt>impyramid<\/tt> on the reduced image to expand it, you don't get the same image size that you started with. For example, if you start with a 1360x1904 image, you end up with a 1359x1903 image after reducing and then expanding again. I distinctly remember convincing myself that there was a legit reason for this. Sigh. It just adds a lot more tedious image padding code around many uses of <tt>impyramid<\/tt>.<\/li><li>As image resizing filters go, the one used by <tt>impyramid<\/tt> (described above) is not a very good one for downsampling applications. I understand its use in 1981, but on modern computers, we can do better.<\/li><li>Finally, there needs to be an easier way to visualize the multiresolution pyramid. This would be useful as an aid to general understanding, as well as to debugging specific issues. The <tt>imtile<\/tt> function that I <a href=\"https:\/\/blogs.mathworks.com\/steve\/2019\/03\/17\/a-new-image-tiling-function-in-matlab\/\">discussed on March 17<\/a> doesn't quite do it because it automatically resizes each image tile to be the same size.<\/li><\/ul><\/div><p>Extra credit: Do you know what building is in the image above? If so, leave a note in the comments.<\/p><script language=\"JavaScript\"> <!-- \r\n    function grabCode_10e284a915754568b6612eaef8f3c7c7() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='10e284a915754568b6612eaef8f3c7c7 ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' 10e284a915754568b6612eaef8f3c7c7';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. \r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        copyright = 'Copyright 2019 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add copyright line at the bottom if specified.\r\n        if (copyright.length > 0) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n\r\n        d.title = title + ' (MATLAB code)';\r\n        d.close();\r\n    }   \r\n     --> <\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_10e284a915754568b6612eaef8f3c7c7()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n      the MATLAB code <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; R2019a<br><\/p><\/div><!--\r\n10e284a915754568b6612eaef8f3c7c7 ##### SOURCE BEGIN #####\r\n%%\r\n% There's a function in the Image Processing Toolbox that I'm not\r\n% especially fond of: |impyramid|.\r\n%\r\n% I shouldn't tell you who designed this function, but I'll give you\r\n% a hint: his initials are \"Steve Eddins.\"\r\n%\r\n% Years ago, we had occasional user requests for creating multiresolution\r\n% image pyramids, such as Gaussian pyramids and Laplacian pyramids. I\r\n% thought these requests would be easy to satisfy by |impyramid|, but I was\r\n% wrong. I was reminded of the problems by some code written by one our\r\n% Pilot engineers, Steve Kuznicki. Steve made a nice of use |impyramid| to\r\n% implement a multiresolution image blending method, but weaknesses in the\r\n% design of |impyramid| made some of the code awkward. Today I want to\r\n% discuss those weaknesses. In a follow-up post later, I'll demonstrate\r\n% some possible improvements.\r\n%\r\n% I'll start by explaining multiresolution pyramids. Below an example of\r\n% one. This form is sometimes called a lowpass pyramid.\r\n%\r\n% <<https:\/\/blogs.mathworks.com\/steve\/files\/mr-pyramid-example.jpg>>\r\n%\r\n% The original image is shown in the upper left. The image is lowpass\r\n% filtered and then subsampled by a factor of 2 in each direction to form\r\n% the image in the upper right. That process is repeated several times,\r\n% reaching the tiny version of the original image at the lower right.\r\n%\r\n% So, I think \"multiresolution pyramid\" and \"lowpass pyramid\" are both\r\n% sensible terms, but why is this frequently called a \"Gaussian pyramid\"?\r\n% The answer comes from an implementation technique described in Burt,\r\n% \"Fast Filter Transforms for Image Processing,\" _Computer Graphics and\r\n% Image Processing_, vol. 16, pp. 20-51, 1981. Burt described a procedure\r\n% for forming a multiresolution pyramid by filtering at each step with the\r\n% kernel $h_1(x)$ in Fig. 2 from the paper:\r\n%\r\n% <<https:\/\/blogs.mathworks.com\/steve\/files\/Burt-Figure-2.png>>\r\n%\r\n% The coefficients of $h_1(x)$ are $1\/4 - a\/2$, 0.25, and $a$, with typical\r\n% values for $a$ around 0.4.\r\n%\r\n% As you successively apply this filter at each stage of a multiresolution\r\n% pyramid, the overall effect becomes similar to filtering with a\r\n% continuous Gaussian-shaped kernel, and this is where the name _Gaussian\r\n% pyramid_ comes from.\r\n%\r\n% At the time this paper was written, computers were vastly slower than\r\n% they are now, especially with floating-point computation. The\r\n% floating-point convolution of a 512x512 image (considered large then, and\r\n% considered quite small now) with a filter of modest size could take\r\n% minutes.\r\n%\r\n% That explains part of the significance of Burt's implementation method.\r\n% Not only is the filter applied at each pyramid stage very small, but when\r\n% $a = 0.375$, all the filter coefficients are exact binary fractions and\r\n% so can be implemented by direct processor bit-shift instructions on the\r\n% integer pixel values. That allowed for reasonably fast implementations on\r\n% the computers of that time.\r\n%\r\n% When I try to put |impyramid| to good use, or when I have seen other\r\n% people try to use it, the following problems stand out to me:\r\n%\r\n% * Each call to the function |impyramid| implements a single level, either\r\n% a reduction or an expansion. That leaves it up to the caller to handle a\r\n% lot tedious code for deciding how many levels there should be, keeping\r\n% track of the size of the original image, and handling cases where the\r\n% original image dimensions are not powers-of-two multiples of the size of\r\n% the smallest image in the pyramid.\r\n% * When you call |impyramid| to reduce an image by a level, and then you\r\n% call |impyramid| on the reduced image to expand it, you don't get the\r\n% same image size that you started with. For example, if you start with a\r\n% 1360x1904 image, you end up with a 1359x1903 image after reducing and\r\n% then expanding again. I distinctly remember convincing myself that there\r\n% was a legit reason for this. Sigh. It just adds a lot more tedious image\r\n% padding code around many uses of |impyramid|.\r\n% * As image resizing filters go, the one used by |impyramid| (described\r\n% above) is not a very good one for downsampling applications. I understand\r\n% its use in 1981, but on modern computers, we can do better.\r\n% * Finally, there needs to be an easier way to visualize the\r\n% multiresolution pyramid. This would be useful as an aid to general\r\n% understanding, as well as to debugging specific issues. The |imtile|\r\n% function that I \r\n% <https:\/\/blogs.mathworks.com\/steve\/2019\/03\/17\/a-new-image-tiling-function-in-matlab\/ \r\n% discussed on March 17> doesn't quite do it because it\r\n% automatically resizes each image tile to be the same size.\r\n%\r\n% Extra credit: Do you know what building is in the image above? If so,\r\n% leave a note in the comments.\r\n##### SOURCE END ##### 10e284a915754568b6612eaef8f3c7c7\r\n-->","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img src=\"https:\/\/blogs.mathworks.com\/steve\/files\/mr-pyramid-example.jpg\" class=\"img-responsive attachment-post-thumbnail size-post-thumbnail wp-post-image\" alt=\"\" decoding=\"async\" loading=\"lazy\" \/><\/div><p>There's a function in the Image Processing Toolbox that I'm not especially fond of: impyramid.I shouldn't tell you who designed this function, but I'll give you a hint: his initials are \"Steve... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/steve\/2019\/04\/02\/multiresolution-image-pyramids-and-impyramid-part-1\/\">read more >><\/a><\/p>","protected":false},"author":42,"featured_media":3192,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[1237,1235],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/3198"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/users\/42"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/comments?post=3198"}],"version-history":[{"count":2,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/3198\/revisions"}],"predecessor-version":[{"id":3202,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/3198\/revisions\/3202"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/media\/3192"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/media?parent=3198"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/categories?post=3198"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/tags?post=3198"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}