Loren on the Art of MATLAB

Turn ideas into MATLAB

By All Means

Ever find yourself wanting to get a sense of some data, but not sure the arithmetic mean is what you want? You might also consider the geometric mean (geomean from Statistics Toolbox). In the image processing world, I understand that some people think images often look crisper when the geometric mean is applied instead of the arithmetic mean. Today I want to talk about how to get accurate results for the geometric mean.


Geometric Mean

Let's assume we have a vector x so we can ignore dealing with different dimensions. I will first create function handles for the arithmetic mean and for the standard expression for the geometric mean. Here's the handle for the arithmetic mean

amn = @(x) mean(x)
amn = 
    @(x)mean(x)

and for the geometric mean.

gmn = @(x) prod(x)^(1/numel(x))
gmn = 
    @(x)prod(x)^(1/numel(x))

Some Data

Now let's create some data and compute the means.

xsmall = 100*rand(10,1);
means = [amn(xsmall) gmn(xsmall)]
means =
       42.403       27.898

More Challenging Data

Let's suppose we have some data that are much larger in magnitude and compute the means.

xlarge = 1e300*rand(1000,1);
means = [amn(xlarge) gmn(xlarge)]
means =
  5.1363e+299          Inf

While we got a finite answer for the arithmetic mean, we got Inf for the geometric mean. If you look at the expression for the geometric mean, we first calculated the product of all the numbers and then took the nth root. So we exceeded realmax in the calculation, hence the infinite result. Is there a way to circumvent this, at least for a while? Yes!
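To see the overflow in isolation, here is a minimal sketch. The vector x below is hypothetical data chosen for illustration (1000 copies of 1e299), not the random data used above, so the true geometric mean is known to be 1e299.

```matlab
% Hypothetical data: every element is 1e299, so the true geometric mean is 1e299.
x = 1e299*ones(1000,1);

% The true product is about 1e299000, far beyond realmax (about 1.8e308),
% so the intermediate result overflows to Inf.
p = prod(x);
overflowed = isinf(p)

% Taking the nth root of Inf cannot recover the answer.
g = p^(1/numel(x))
```

Once the intermediate product has become Inf, the information about the individual values is gone, which is why the fix has to avoid forming the product at all.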

Safer Expression for Geometric Mean

We can recast the calculation of the product of some numbers to be the sum of their natural logs and then exponentiate that result. To get the nth root, we divide the sum by n, the number of elements. Here's a new expression for the geometric mean.

gm2 = @(x) exp(sum(log(x))/numel(x))
gm2 = 
    @(x)exp(sum(log(x))/numel(x))
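The identity behind this expression can be written out as follows. It assumes every element of x is positive, so that the logarithm is real; n stands for numel(x).

```latex
\left(\prod_{i=1}^{n} x_i\right)^{1/n}
  = \exp\!\left(\ln\left(\prod_{i=1}^{n} x_i\right)^{1/n}\right)
  = \exp\!\left(\frac{1}{n}\sum_{i=1}^{n} \ln x_i\right),
  \qquad x_i > 0 .
```

The sum of logs grows only linearly with the number of elements, so it stays well within floating-point range in cases where the product itself would overflow.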

Here's the geometric mean applied to our two datasets.

[gm2(xsmall) gm2(xlarge)]
ans =
       27.898  3.8763e+299

You can see that we get the same result for the perhaps more typical data, and have insulated ourselves from poor numerical results with the larger data values.
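The same reasoning should apply at the other end of the floating-point range, where the product underflows to zero instead of overflowing. Here is a small sketch of that case; xtiny is hypothetical data chosen for illustration and does not appear in the examples above.

```matlab
% The two expressions for the geometric mean from this post.
gmn = @(x) prod(x)^(1/numel(x));
gm2 = @(x) exp(sum(log(x))/numel(x));

% Hypothetical data: every element is 1e-300, so the true geometric mean is 1e-300.
xtiny = 1e-300*ones(1000,1);

% The product underflows to 0, so the standard expression returns 0.
gProd = gmn(xtiny)

% The log-based expression stays in range and returns a value close to 1e-300.
gLog = gm2(xtiny)
```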

How Do You Average Data?

If you have data that may contain NaN values, you can use nanmean from Statistics Toolbox. Do you have other expressions that are appropriate for averaging your datasets? Let me know here.
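As one possible variation, the log-based expression can be adapted to skip NaN values without any toolbox functions. This is only a sketch: the name gm2nan and the sample vector are hypothetical, and the data are assumed to be positive apart from the NaN entries.

```matlab
% Hypothetical helper: geometric mean of the non-NaN elements of x.
% Assumes the remaining elements are positive.
gm2nan = @(x) exp(mean(log(x(~isnan(x)))));

% Sample data with one missing value; the geometric mean of 1, 10, and 100 is 10.
x = [1 10 NaN 100];
g = gm2nan(x)   % approximately 10
```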

Published with MATLAB® 7.10
