# Binning data in MATLAB

Posted by **Doug Hull**

Given **x** and **y**, bin the **x** values as if with a histogram. Then take the corresponding **y** values in each bin and compute their mean. This uses HISTC and indexing. Mostly this is applying skills from earlier videos. I really like the Stack Overflow community, and am glad to see the MATLAB questions are being discussed there. Do you have any favorite communities, apart from MATLAB Central, that you like to use?
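The steps described above can be sketched roughly as follows. This is a minimal sketch rather than the code from the video: the data values and the names `botEdge`, `topEdge`, `numBins`, `binEdges` and `binMean` are assumptions chosen for illustration (only `whichBin` appears in the comments below).

```matlab
% Illustrative data, made up for this sketch.
x = [0.1 0.2 0.6 0.7 0.9];
y = [1 3 10 20 30];

% Assumed bin layout: two equal-width bins spanning [0, 1].
botEdge = 0;
topEdge = 1;
numBins = 2;
binEdges = linspace(botEdge, topEdge, numBins + 1);

% The second output of HISTC is the bin index of each x value
% (0 for values that fall outside the edges).
[~, whichBin] = histc(x, binEdges);

% Average the y values whose x fell in each bin.
binMean = nan(1, numBins);
for i = 1:numBins
    binMembers = y(whichBin == i);
    binMean(i) = mean(binMembers);
end
```

Note that HISTC puts values exactly equal to the last edge into an extra bin of their own (index `numBins + 1`), so the loop above would skip them. Newer MATLAB documentation recommends `histcounts` or `discretize` in place of HISTC.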

## 21 Comments

3 years ago we built a community for French MATLAB users here:

http://www.developpez.net/forums/f148/environnements-developpement/matlab/

It is like MATLAB Central, except that five of us can moderate the forum (checking spelling, no SMS-style messages, use of HTML markup to insert code or error messages…). It is currently divided into 4 parts: MATLAB, GUI, Image and Signal Processing. For each part we promote the new releases of MATLAB and, of course, your blogs!

I just checked out Stack Overflow last night, and it looks like a pretty cool community. I think I may start getting active over there.

Thanks for pointing it out!

Ken

The last loop could have been done more simply:

binMean = accumarray(whichBin, y, [], @mean)
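For readers who, like Doug, need to look up ACCUMARRAY: a slightly more defensive version of this one-liner might look like the sketch below. The data and the names `inRange`, `subs` and `vals` are assumptions for illustration. ACCUMARRAY needs column vectors and positive integer subscripts, so points that HISTC left unbinned (index 0) are dropped first.

```matlab
% Illustrative data, made up for this sketch.
x = [0.1 0.2 0.6 0.7 0.9];
y = [1 3 10 20 30];
binEdges = [0 0.5 1];
numBins = numel(binEdges) - 1;

[~, whichBin] = histc(x, binEdges);

% Keep only points that landed in one of the numBins regular bins.
inRange = whichBin >= 1 & whichBin <= numBins;

% ACCUMARRAY expects column vectors for subscripts and values.
subs = whichBin(inRange);
subs = subs(:);
vals = y(inRange);
vals = vals(:);

% Group vals by bin index and apply MEAN to each group;
% bins with no members are filled with NaN instead of the default 0.
binMean = accumarray(subs, vals, [numBins 1], @mean, NaN);
```

Passing the size `[numBins 1]` explicitly keeps the output length fixed even when the last bins are empty.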

Ilya,

There are very often ways in MATLAB to accomplish tasks in fewer commands. It is mostly a matter of taste as to which to use.

I tend to go for things that are easily read and understood by the most people. I needed to look up the doc on ACCUMARRAY to understand the above code. So, I suspect many people, myself included, would take longer to understand the shorter code. It is all trade-offs, and knowing several ways of doing something is great.

Good find, I will have to add that to my bag of MATLAB tricks.

-Thanks,

Doug

Hey Doug

Thanks so much for the video. It made me finally understand how to bin in MATLAB. One question, though: how would you bin x-y-z data?

Thanks again

Darlene

@Darlene

http://blogs.mathworks.com/videos/2007/12/13/visualizing-the-density-of-a-data-cloud/

Enjoy,

Doug

I just loved this explanation! Thanks a lot, Doug!

Exactly the solution I was looking for!

Many thanks,

Geoff

Hi Doug, thanks for the explanation. I have a question related to your post. After getting the corresponding frequency of occurrence of y in each of the bins on the x-axis, I wanted to plot the data as a histogram. I want to come up with a histogram plot that looks like the ones in the following figure. I am not sure how to do it, though. Any suggestion will be very helpful.

http://imageshack.us/photo/my-images/30/sampleplot.gif/

@irene,

How about a bar plot with bar.m?
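A rough sketch of that suggestion, assuming the counts come from HISTC and that one bar per bin centered on the bin is wanted (the data and the names `counts` and `binCenters` are made up for illustration):

```matlab
% Illustrative data, made up for this sketch.
x = [0.1 0.2 0.6 0.7 0.9];
binEdges = 0:0.25:1;

% The first output of HISTC is the number of x values in each bin.
% Its last element counts values exactly equal to the last edge.
counts = histc(x, binEdges);

% Place each bar at the center of its bin.
binCenters = binEdges(1:end-1) + diff(binEdges) / 2;

% A relative bar width of 1 makes neighboring bars touch, like a histogram.
bar(binCenters, counts(1:end-1), 1);
xlabel('x');
ylabel('number of points in bin');
```

The same call works for per-bin means: pass the vector of means instead of `counts(1:end-1)`.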

Hi Doug,

Thanks for posting this! I am trying to solve the same problem, but with three dimensional data. I have points in x,y coordinate space and right now I am plotting a 2D histogram with color label corresponding to the number of points in each bin. Instead, I would like to have color correspond to the mean of a third vector z. This should be simple given that you’ve solved the main problem, but I am getting confused with the indexing once the extra dimension is added. Any help would be much appreciated!

thanks again,

megan

@Megan,

If it is just the indexing into MATLAB n-dimensional matrices, try this:

http://www.mathworks.com/products/matlab/demos.html?file=/products/demos/shipping/matlab/nddemo.html

Doug
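For the specific case of coloring a 2D grid by the mean of a third vector, one possible approach is to bin x and y separately and let ACCUMARRAY handle the two-dimensional indexing. This is only a sketch under assumed data. The names `xEdges`, `yEdges`, `xBin`, `yBin` and `zMean` are hypothetical.

```matlab
% Illustrative column-vector data, made up for this sketch.
x = [0.1; 0.2; 0.8; 0.9];
y = [0.1; 0.3; 0.7; 0.9];
z = [1; 3; 10; 20];

xEdges = [0 0.5 1];
yEdges = [0 0.5 1];
nx = numel(xEdges) - 1;
ny = numel(yEdges) - 1;

% Bin each coordinate independently.
[~, xBin] = histc(x, xEdges);
[~, yBin] = histc(y, yEdges);

% Drop points that fall outside the grid (or exactly on the last edge).
keep = xBin >= 1 & xBin <= nx & yBin >= 1 & yBin <= ny;

% Each row of the subscript matrix is a (row, column) cell address:
% rows index y bins, columns index x bins.
subs = [yBin(keep), xBin(keep)];
zMean = accumarray(subs, z(keep), [ny nx], @mean, NaN);

% Show the grid with color given by the mean of z in each cell.
xCenters = xEdges(1:end-1) + diff(xEdges) / 2;
yCenters = yEdges(1:end-1) + diff(yEdges) / 2;
imagesc(xCenters, yCenters, zMean);
axis xy;
colorbar;
```

Cells with no points come out as NaN, which keeps them distinguishable from cells whose mean happens to be 0.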

I tried to use this method and I end up getting the following error:

Error using .*

Matrix dimensions must agree.

Error in linspace (line 31)

y = d1 + (0:n1).*(d2 - d1)/n1;

So I debugged into the linspace code, and it looks like it breaks if you have negative values. And I have negative values in my continuous matrix. Any solution?

You need to tell us the values of the scalar variables and the sizes of the vectors before we can help much here.
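Without those details the cause is only a guess. One possibility (an assumption, not something the poster confirmed) is that the endpoints passed to LINSPACE were not scalars, for example because `min` and `max` were applied to a matrix and returned one value per column. Negative scalar endpoints by themselves are fine, as this sketch with made-up data suggests:

```matlab
% Made-up matrix containing negative values.
data = [-3 2; 5 -8];

% min(data) alone would return a 1-by-2 row vector (one value per column).
% Flattening with (:) first gives a single scalar for the whole matrix.
botEdge = min(data(:));
topEdge = max(data(:));

numBins = 4;
binEdges = linspace(botEdge, topEdge, numBins + 1);
```

Checking `size(botEdge)` and `size(topEdge)` before the LINSPACE call is a quick way to test this guess.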

This code is very helpful. Thank you.

Now that I know how to bin data for one data set (x and y), I would like to bin multiple data sets together. I have three data sets that are time series at three different locations. For all the data sets, x is years and y is temperature. The time steps are not evenly spaced between the different data sets. Is there a way to bin the data from all the data sets that were taken in a given time step? For example, can I find the mean of all three data sets taken between 1950 and 2000?

Is the best way to bin each data set first and then just find the mean during that time step or is there a shorter way?

Thank you.

@Julie,

Please contact support@mathworks.com with more details. It is not clear what you need, and that is a better venue to get help.
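If the goal is simply one mean per time window over all measurements, regardless of location, one possible reading is to concatenate the data sets and bin once. The sketch below assumes that interpretation. All data and variable names are made up.

```matlab
% Made-up time series: years and temperatures at three locations.
yearsA = [1950 1960 2010];  tempA = [10 12 14];
yearsB = [1955 1995];       tempB = [20 22];
yearsC = [1970 2005];       tempC = [30 40];

% Pool all measurements into one pair of vectors.
allYears = [yearsA yearsB yearsC];
allTemps = [tempA tempB tempC];

% Assumed time windows: [1950, 2000) and [2000, 2050).
binEdges = [1950 2000 2050];
numBins = numel(binEdges) - 1;
[~, whichBin] = histc(allYears, binEdges);

% Mean temperature of all pooled measurements in each window.
binMean = nan(1, numBins);
for i = 1:numBins
    binMean(i) = mean(allTemps(whichBin == i));
end
```

Pooling weights every measurement equally, so a location with more samples in a window has more influence. Binning each data set first and then averaging the three bin means weights every location equally instead, and the two results generally differ.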

This is great. Would it accomplish this task? I have x = aspect (1 to 360 degrees) and y = slope values. I want to see if certain slopes are correlated to certain aspects by binning them in 20-degree increments. So if my edges were 0 and 360 and numBins = 18, then this would work?
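As a quick check of the arithmetic: 360 degrees divided into 18 bins does give 20-degree bins, with edges at 0, 20, ..., 360. A minimal sketch with made-up aspect and slope values (the variable names are assumptions) could look like this:

```matlab
% Made-up data: aspect in degrees and the slope measured at each aspect.
aspect = [5 15 25 190 359];
slope  = [10 20 30 40 50];

botEdge = 0;
topEdge = 360;
numBins = 18;
binEdges = linspace(botEdge, topEdge, numBins + 1);   % 0, 20, 40, ..., 360

[~, whichBin] = histc(aspect, binEdges);

% Mean slope per 20-degree aspect bin; empty bins stay NaN.
binMean = nan(1, numBins);
for i = 1:numBins
    binMean(i) = mean(slope(whichBin == i));
end
```

An aspect of exactly 360 would land in HISTC's extra last bin rather than in bin 18. Since 360 and 0 are the same direction, applying `mod(aspect, 360)` first would move such values into bin 1.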

Hi Doug, this is very helpful. I’m trying to do essentially the same thing you’ve done here, but find the weighted mean of all the values in the bin (each value has an associated uncertainty, e.g. 13.5 +/- 0.3). The weighted mean is calculated by first finding the inverse of the sum of the inverse squares of all the errors, then multiplying that by the sum of each value divided by the square of its error. This means that the number of elements in each bin depends on the number of values in the bin. Do you have any suggestions for how to code the weighted mean calculation in the loop?

Many thanks!

@Zach,

It sounds like you know the math. What part of this is giving you problems? Feel free to e-mail me directly with the code you have written and show me what is going wrong for you.

Doug
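The weighted mean Zach describes (weights equal to one over the squared error) can be dropped into the same kind of loop. This is a sketch under assumed data, not code from the video. The names `err`, `inBin`, `w` and `binWMean` are hypothetical.

```matlab
% Made-up data: x positions, y values, and the uncertainty of each y value.
x   = [0.1 0.2 0.6 0.7];
y   = [10 20 30 50];
err = [1 1 1 2];

binEdges = [0 0.5 1];
numBins = numel(binEdges) - 1;
[~, whichBin] = histc(x, binEdges);

binWMean = nan(1, numBins);
for i = 1:numBins
    inBin = (whichBin == i);
    if any(inBin)
        % Inverse-variance weights for the members of this bin.
        w = 1 ./ err(inBin).^2;
        % (1 / sum of weights) times the sum of weighted values.
        binWMean(i) = sum(w .* y(inBin)) / sum(w);
    end
end
```

Because the logical mask `inBin` selects a different number of elements in each bin, the varying bin sizes need no special handling. Empty bins are simply left as NaN.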

Thanks for this brilliant post.

Now I’m trying to perform it for three vectors: x, y, and z. That is, bin the x and y values and then calculate the mean of the corresponding z values in each x-y bin. I am wondering whether that is possible.

@Nafi,

It is not clear what you are proposing, but it sounds doable.