{"id":242,"date":"2010-08-22T11:30:27","date_gmt":"2010-08-22T11:30:27","guid":{"rendered":"https:\/\/blogs.mathworks.com\/loren\/2010\/08\/22\/by-all-means\/"},"modified":"2010-08-22T11:30:52","modified_gmt":"2010-08-22T11:30:52","slug":"by-all-means","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/loren\/2010\/08\/22\/by-all-means\/","title":{"rendered":"By All Means"},"content":{"rendered":"<div xmlns:mwsh=\"https:\/\/www.mathworks.com\/namespace\/mcode\/v1\/syntaxhighlight.dtd\" class=\"content\">\r\n   <introduction>\r\n      <p>Ever find yourself wanting to get some sense of some data, but not sure the arithmetic <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2010a\/techdoc\/ref\/mean.html\"><tt>mean<\/tt><\/a> is what you want?  You might also consider the <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2010a\/toolbox\/stats\/geomean.html\">geometric mean<\/a> (<tt>geomean<\/tt> from Statistics Toolbox).  In the image processing world, I understand that some think that images often look crisper when\r\n         the geometric mean is applied versus the arithmetic mean. Today I want to talk about how to get accurate results for the geometric\r\n         mean.\r\n      <\/p>\r\n   <\/introduction>\r\n   <h3>Contents<\/h3>\r\n   <div>\r\n      <ul>\r\n         <li><a href=\"#1\">Geometric Mean<\/a><\/li>\r\n         <li><a href=\"#3\">Some Data<\/a><\/li>\r\n         <li><a href=\"#4\">More Challenging Data<\/a><\/li>\r\n         <li><a href=\"#6\">Safer Expression for Geometric Mean<\/a><\/li>\r\n         <li><a href=\"#9\">How Do You Average Data?<\/a><\/li>\r\n      <\/ul>\r\n   <\/div>\r\n   <h3>Geometric Mean<a name=\"1\"><\/a><\/h3>\r\n   <p>Let's assume we have a vector <tt>x<\/tt> so we can ignore dealing with different dimensions.  I will first create function handles for the arithmetic mean and for the standard expression\r\n      for the geometric mean.  
Here's the handle for the arithmetic mean\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">amn = @(x) mean(x)<\/pre><pre style=\"font-style:oblique\">amn = \r\n    @(x)mean(x)\r\n<\/pre><p>and for the geometric mean.<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">gmn = @(x) prod(x)^(1\/numel(x))<\/pre><pre style=\"font-style:oblique\">gmn = \r\n    @(x)prod(x)^(1\/numel(x))\r\n<\/pre><h3>Some Data<a name=\"3\"><\/a><\/h3>\r\n   <p>Now let's create some data and compute the means.<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">xsmall = 100*rand(10,1);\r\nmeans = [amn(xsmall) gmn(xsmall)]<\/pre><pre style=\"font-style:oblique\">means =\r\n       42.403       27.898\r\n<\/pre><h3>More Challenging Data<a name=\"4\"><\/a><\/h3>\r\n   <p>Let's suppose we have some data with much larger values and compute the means.<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">xlarge = 1e300*rand(1000,1);\r\nmeans = [amn(xlarge) gmn(xlarge)]<\/pre><pre style=\"font-style:oblique\">means =\r\n  5.1363e+299          Inf\r\n<\/pre><p>While we got a finite answer for the arithmetic mean, we got <tt>Inf<\/tt> for the geometric mean.  If you look at the expression for the geometric mean, we first calculated the product of all the\r\n      numbers and then took the nth root.  So the intermediate product exceeded <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2010a\/techdoc\/ref\/realmax.html\"><tt>realmax<\/tt><\/a> in the calculation, hence the infinite result.  Is there a way to circumvent this, at least for a while?  Yes!\r\n   <\/p>\r\n   <h3>Safer Expression for Geometric Mean<a name=\"6\"><\/a><\/h3>\r\n   <p>We can recast the calculation of the product of some numbers to be the <tt>sum<\/tt> of their natural logs and then exponentiate that result.  
To get the nth root, we divide the <tt>sum<\/tt> by <tt>n<\/tt>, the number of elements, before exponentiating.  Here's a new expression for the geometric mean.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">gm2 = @(x) exp(sum(log(x))\/numel(x))<\/pre><pre style=\"font-style:oblique\">gm2 = \r\n    @(x)exp(sum(log(x))\/numel(x))\r\n<\/pre><p>Here's the geometric mean applied to our two datasets.<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">[gm2(xsmall) gm2(xlarge)]<\/pre><pre style=\"font-style:oblique\">ans =\r\n       27.898  3.8763e+299\r\n<\/pre><p>You can see that we get the same result for the perhaps more typical data, and have insulated ourselves from overflow\r\n      with the larger data values.\r\n   <\/p>\r\n   <h3>How Do You Average Data?<a name=\"9\"><\/a><\/h3>\r\n   <p>If you have data that may contain <tt>NaN<\/tt> values, you can use <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2010a\/toolbox\/stats\/nanmean.html\"><tt>nanmean<\/tt><\/a> from Statistics Toolbox.  Do you have other expressions that are appropriate for averaging your datasets?  
Let me know <a href=\"https:\/\/blogs.mathworks.com\/loren\/?p=242#respond\">here<\/a>.\r\n   <\/p><script language=\"JavaScript\">\r\n<!--\r\n\r\n    function grabCode_8f60f8ad0d984bf29e13998dd6f7048b() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='8f60f8ad0d984bf29e13998dd6f7048b ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' 8f60f8ad0d984bf29e13998dd6f7048b';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. 
\r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        author = 'Loren Shure';\r\n        copyright = 'Copyright 2010 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add author and copyright lines at the bottom if specified.\r\n        if ((author.length > 0) || (copyright.length > 0)) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (author.length > 0) {\r\n                d.writeln('% _' + author + '_');\r\n            }\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n      \r\n      d.title = title + ' (MATLAB code)';\r\n      d.close();\r\n      }   \r\n      \r\n-->\r\n<\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_8f60f8ad0d984bf29e13998dd6f7048b()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n            the MATLAB code \r\n            <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; 7.10<br><\/p>\r\n<\/div>\r\n<!--\r\n8f60f8ad0d984bf29e13998dd6f7048b ##### SOURCE BEGIN #####\r\n%% By All Means\r\n% Ever find yourself wanting to get some sense of some data, but not sure\r\n% the arithmetic \r\n% <https:\/\/www.mathworks.com\/help\/releases\/R2010a\/techdoc\/ref\/mean.html\r\n% |mean|> is what you want?  You might also consider the\r\n% <https:\/\/www.mathworks.com\/help\/releases\/R2010a\/toolbox\/stats\/geomean.html\r\n% geometric mean> (|geomean| from Statistics Toolbox).  
In the image\r\n% processing world, I understand that some think that images often look\r\n% crisper when the geometric mean is applied versus the arithmetic mean.\r\n% Today I want to talk about how to get accurate results for the geometric\r\n% mean.\r\n%% Geometric Mean\r\n% Let's assume we have a vector |x| so we can ignore dealing with different\r\n% dimensions.  I will first create function handles for the arithmetic mean\r\n% and for the standard expression for the geometric mean.  Here's the handle\r\n% for the arithmetic mean\r\namn = @(x) mean(x)\r\n%%\r\n% and for the geometric mean.\r\ngmn = @(x) prod(x)^(1\/numel(x))\r\n%% Some Data\r\n% Now let's create some data and compute the means.\r\nxsmall = 100*rand(10,1);\r\nmeans = [amn(xsmall) gmn(xsmall)]\r\n%% More Challenging Data\r\n% Let's suppose we have some data with much larger values and compute the\r\n% means.\r\nxlarge = 1e300*rand(1000,1);\r\nmeans = [amn(xlarge) gmn(xlarge)]\r\n%%\r\n% While we got a finite answer for the arithmetic mean, we got |Inf| for\r\n% the geometric mean.  If you look at the expression for the geometric\r\n% mean, we first calculated the product of all the numbers and then took\r\n% the nth root.  So the intermediate product exceeded\r\n% <https:\/\/www.mathworks.com\/help\/releases\/R2010a\/techdoc\/ref\/realmax.html\r\n% |realmax|> in the calculation, hence the infinite result.  Is there a\r\n% way to circumvent this, at least for a while?  Yes!\r\n%% Safer Expression for Geometric Mean\r\n% We can recast the calculation of the product of some numbers to be the\r\n% |sum| of their natural logs and then exponentiate that result.  To get\r\n% the nth root, we divide the |sum| by |n|, the number of elements, before\r\n% exponentiating.  
Here's\r\n% a new expression for the geometric mean.\r\ngm2 = @(x) exp(sum(log(x))\/numel(x))\r\n%%\r\n% Here's the geometric mean applied to our two datasets.\r\n[gm2(xsmall) gm2(xlarge)]\r\n%%\r\n% You can see that we get the same result for the perhaps more typical\r\n% data, and have insulated ourselves from overflow with the larger data\r\n% values.\r\n%% How Do You Average Data?\r\n% If you have data that may contain |NaN| values, you can use\r\n% <https:\/\/www.mathworks.com\/help\/releases\/R2010a\/toolbox\/stats\/nanmean.html\r\n% |nanmean|> from Statistics Toolbox.  Do you have other expressions that\r\n% are appropriate for averaging your datasets?  Let me know\r\n% <https:\/\/blogs.mathworks.com\/loren\/?p=242#respond here>.\r\n##### SOURCE END ##### 8f60f8ad0d984bf29e13998dd6f7048b\r\n-->","protected":false},"excerpt":{"rendered":"<p>\r\n   \r\n      Ever find yourself wanting to get some sense of some data, but not sure the arithmetic mean is what you want?  You might also consider the geometric mean (geomean from Statistics... 
<a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/loren\/2010\/08\/22\/by-all-means\/\">read more >><\/a><\/p>","protected":false},"author":39,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[26],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/242"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/users\/39"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/comments?post=242"}],"version-history":[{"count":0,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/242\/revisions"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/media?parent=242"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/categories?post=242"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/tags?post=242"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}