# Advanced: making a 2d or 3d histogram to visualize data density

Posted by **Doug Hull**,

This short video makes a 2d histogram as an alternative to plotting data points and visually estimating where the most data is.

**Category:**
- Format: Video
- Level: Advanced

## 39 Comments

Daniel Armyr
replied on

Good tutorial. This is one type of problem I often find myself trying to solve.
In fact, I find I try to solve it so often that I took the time to write a properly optimized and documented function for the file exchange:
https://www.mathworks.com/matlabcentral/fileexchange/23238-cloudplot : **1**of 39
John A
replied on

Doug, I am a little confused. I assume x and y are just random coordinates. You then created xd and yd; what are these, and how did you create them? Then in linspace you use x and y again, but in interp1 you use xd and yd. I think I understand the concept, but I'm lost in the details. Can you post all the code so I can run what you did? : **2**of 39
dhull
replied on

@John,
XD and YD are the random points. When used in the interp1 function, XD and YD are the raw data that is then interpolated to the nearest nicely spaced data points in XI.
Make sense?
Doug : **3**of 39
John A
replied on

Doug, in the linspace calls you should then use xd, yd rather than x, y. Was that just a typo? If so, it all makes sense, thx : **4**of 39
dhull
replied on

@John,
I can see where that would work too. For simplicity, I had hidden where the data came from. X and XD both had the same range, so it did not matter.
I had originally planned on showing where the data came from, but in production the movie got too long, so it ended up on the cutting room floor!
Good catch!
Doug : **5**of 39
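For readers who want something to paste and run, the steps discussed so far can be sketched as follows. The data are synthetic (two `randn` vectors standing in for the data set that was not shown in the video), and `n = 49` is the arbitrary grid size that appears later in this thread. Treat it as a reconstruction under those assumptions, not the exact script from the video.

```matlab
% Synthetic stand-in data (assumption: the video's data set was not posted).
rng(0);                          % fixed seed so the run is repeatable
xd = randn(100000, 1);
yd = randn(100000, 1);

n  = 49;                         % number of grid points per axis (arbitrary)
xi = linspace(min(xd), max(xd), n);
yi = linspace(min(yd), max(yd), n);

% Map each data point to the index of the nearest grid point.
xr = interp1(xi, 1:n, xd, 'nearest');
yr = interp1(yi, 1:n, yd, 'nearest');

% Count how many points land on each (x index, y index) pair.
z = accumarray([xr yr], 1, [n n]);

figure
surf(z)
```

Because `xd` and `yd` are column vectors here, `xr` and `yr` come back as columns and `[xr yr]` is already an N-by-2 list of subscripts.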
Joe Pohedra
replied on

Sorry, I'm lost. I could not follow what's happening without the data set to reproduce the example. Fortunately for me, the hist3 function seems to do all the heavy lifting I need.
randx = randn(100000,1);
randy = randn(100000,1);
z = hist3([randx randy],[40 40]);
surf(z)
: **6**of 39

Douglas Neal
replied on

Doug,
This is a great code! I am also lost though. When I run the line
>> z = accumarray([xr yr], 1, [n n]);
I get the following error:
??? Error using ==> accumarray
Third input SZ must be a full row vector with one element for each column of SUBS.
Also, can you comment how this differs from hist3.m? : **7**of 39

dhull
replied on

@Doug
Please post all the code so I can reproduce it by copying and pasting.
Doug : **8**of 39
Douglas Neal
replied on

@dhull
Here is what I was doing to try and duplicate your code:
xd = randn(100000,1);
yd = randn(100000,1);
n=49;
xi = linspace(min(xd(:)),max(xd(:)),n);
yi = linspace(min(yd(:)),max(yd(:)),n);
xr = interp1(xi,1:numel(xi),xd,'nearest')';
yr = interp1(yi,1:numel(yi),yd,'nearest')';
z = accumarray([xr yr], 1, [n n]);
figure(2)
surf(z)
I copied everything save the declarations of xd, yd out of your example in the video. Thanks! : **9**of 39

dhull
replied on

@Douglas,
z = accumarray([xr' yr'], 1, [n n]);
You needed to transpose xr and yr so they are columns when you put them together. The way it was done made them a long row, not two columns. Works fine now!
Doug : **10**of 39
Douglas Neal
replied on

@dhull
Thanks! Works great now -- very nice code! : **11**of 39
Doug Carter
replied on

I received this error:
Error in ==> testrun at 22
z = accumarray([xr' yr'], 1,[n n]);
Is there any way to get around this, or what is the best way to figure out how to maximize the amount of data without exceeding the memory limits?
Thanks,
Doug (Ohio State University) : **12**of 39
dhull
replied on

@Doug,
Without more context, I can not really answer the question. Please contact https://www.mathworks.com/support.html with all the relevant files to reproduce, and we will take a look at it.
Thanks,
doug : **13**of 39
Colin Norris
replied on

I'm trying to use this function (which I've modified below) but I'm getting an error
xd = EngSpd;
yd = EngTorq;
xi = SpdBP;
yi = TrqBP;
xr = interp1(xi,1:numel(xi),xd,'nearest')';
yr = interp1(yi,1:numel(yi),yd,'nearest')';
z = accumarray([xr' yr'], 1, [length(SpdBP) length(TrqBP)]);
figure(5)
surf(z)
??? Error using ==> accumarray
First input SUBS must contain positive integer subscripts.
EngSpd & EngTorq are independent 24000 x 1 vectors
SpdBP is a 1x16 vector
TrqBP is a 1x21 vector
Any ideas on how to fix this issue?
Thanks
Colin : **14**of 39

dhull
replied on

@Colin,
I would put a break point on the line with accumarray. Once stopped in the debugger, figure out what the inputs to accumarray are. It looks like the first one is not what is expected.
Doug : **15**of 39
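One possible cause, offered as a guess since the data were not posted: with the `'nearest'` method, `interp1` returns `NaN` for query points that fall outside the range of its first input, and `accumarray` rejects `NaN` subscripts with this same message. With fixed breakpoints, any sample below the first or above the last breakpoint would trigger it. A small sketch with made-up breakpoints and samples:

```matlab
% Hypothetical breakpoints and samples, for illustration only.
xi = [0 10 20 30];               % breakpoints used as bin centers
xd = [-5; 3; 12; 29; 42];        % -5 and 42 lie outside [0, 30]

xr = interp1(xi, 1:numel(xi), xd, 'nearest');
outOfRange = isnan(xr);          % true where interp1 could not assign a bin

% Keep only the samples that received a valid bin index before counting.
counts = accumarray(xr(~outOfRange), 1, [numel(xi) 1]);
```

In the 2-D case the mask would need to combine both axes, for example `isnan(xr) | isnan(yr)`, so that a point is dropped if either coordinate is out of range.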
Andrea
replied on

Thanks for the tutorial. Nice thing, I never came up against accumarray which seems to be very useful!
But actually there is a mistake when calculating the BIN-number: since interp1 assigns the nearest value in every dimension, all data points say in x-direction belonging to BIN 1 and half of them in BIN 2 will be assigned to (1,:). Correct should be interpolating to
xr = interp1(xi,0.5:numel(xi)-0.5,xd,'nearest')';
yr = interp1(yi,0.5:numel(yi)-0.5,yd,'nearest')';
which requires
z = accumarray([xr yr] + 0.5, 1, [n n]);
Best Andrea : **16**of 39

dhull
replied on

@Andrea,
Thank you for catching a subtlety there!
Doug : **17**of 39
Thomas Smith
replied on

In the example code, the data from xd will be plotted on the Y axis, and yd on the X axis. To get a properly labeled graph, with the X data along the X axis, you actually need to say:
surf(xi, yi, Z');
Why is that? In the example, we have
Z = accumarray([xr yr], 1, [n n]);
For instance, if something falls in the 1st X-bin and the 5th Y-bin, it will get counted in Z(1,5). But, the MATLAB plotting commands like SURF and CONTOUR take their inputs in the opposite way: To have something show up at (1,5) in the graph, it should be in Z(5,1). Hope this is useful to the next person who's thinking sideways! -Thomas : **18**of 39
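A tiny example makes the orientation concrete. The sketch below uses three hand-picked points (hypothetical data) that all fall in the first x bin and the last y bin, with different bin counts on each axis so that a missing transpose would show up as a size mismatch.

```matlab
% Three hypothetical points: small x, large y.
xi = 1:3;                        % x bin centers
yi = 1:4;                        % y bin centers
xr = [1; 1; 1];                  % x bin index of each point
yr = [4; 4; 4];                  % y bin index of each point

Z = accumarray([xr yr], 1, [numel(xi) numel(yi)]);   % Z(xbin, ybin), 3-by-4

% surf(x, y, C) expects size(C) == [numel(y) numel(x)], hence the transpose.
figure
surf(xi, yi, Z')
xlabel('x'); ylabel('y'); zlabel('count')
```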

Thomas Smith
replied on

I made the changes suggested by Andrea and myself, and put them into a nice copy-pasteable (and editable!) file on GitHub:
https://gist.github.com/883933 : **19**of 39
Jbrand
replied on

I am trying to plot out a point spread function, there are positive and negative x and y values and I get this message
??? Error using ==> accumarray
First input SUBS must contain positive integer
subscripts.
any help? : **20**of 39
dhull
replied on

@Jbrand,
You can not pass negative numbers to accumarray. Check your inputs.
Doug : **21**of 39
Megan
replied on

Thanks for the video! I was able to use it effectively for 2 vectors (xd & yd).
I had some other questions about the extent to which I could use this code, or if there are other options available.
1. Is there an alternative to accumarray that will take negative inputs -or am I able to edit accumarray in a way that would allow it to take negative inputs?
2. Is there any way to have such a histogram plot for 3 vectors (xd, yd and zd)? What I would like to have is the x and y coordinates with z as the height and color coded by the frequency distribution. I am basically trying to visualize a special euclidean(2) space (since my z vector lies on polar coordinates). Is there any way to adapt your code to work for 3 vectors in this way?
I know these questions don't directly apply to the working of the posted tutorial, but any suggestions would be very much appreciated!
Thanks in advance! : **22**of 39
Rod
replied on

I believe I have implemented the 3d analogy for use with a 3d color histogram. Correct me if this is wrong but it seems to be working. Hope this helps someone.
colorimage = imread('rgb.jpg');
nbins = 16;
redmat = colorimage(:,:,1);
greenmat = colorimage(:,:,2);
bluemat = colorimage(:,:,3);
ri = linspace(0,255,nbins);
gi = linspace(0,255,nbins);
bi = linspace(0,255,nbins);
rtp = interp1(ri,1:numel(ri),double(redmat(:)),'nearest');
gtp = interp1(gi,1:numel(gi),double(greenmat(:)),'nearest');
btp = interp1(bi,1:numel(bi),double(bluemat(:)),'nearest');
Z=accumarray([rtp,gtp,btp],1,[nbins,nbins,nbins]);
The only issue I have found is displaying this behemoth, since surf() only takes in mxn or mxnx3. : **23**of 39
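One way to look at a 3-D count array, sketched here with a small synthetic array rather than image data, is to draw only the non-empty bins with `scatter3` and let color carry the count. The variable names and bin count below are illustrative.

```matlab
% Small synthetic 3-D histogram (assumption: stands in for a real count array).
rng(0);
nbins = 8;
subs  = randi(nbins, 5000, 3);           % random bin indices in each dimension
Z     = accumarray(subs, 1, [nbins nbins nbins]);

% Locate the occupied bins and recover their three subscripts.
idx = find(Z > 0);
[r, g, b] = ind2sub(size(Z), idx);

% One marker per occupied bin, colored by the number of points it holds.
figure
scatter3(r, g, b, 30, Z(idx), 'filled')
xlabel('bin 1'); ylabel('bin 2'); zlabel('bin 3')
colorbar
```

For dense arrays this can get crowded; raising the threshold in `find(Z > 0)` to show only the fuller bins is one way to thin it out.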
Catherine
replied on

Hi there, I would like to ask if you know how to label the peak values of the data points in your 3D plot. I can't seem to find any commands in MATLAB that are able to perform that function. Thank you. : **24**of 39
Doug
replied on

Check out the command >>annotation : **25**of 39
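As an alternative to `annotation`, which positions objects in figure coordinates, `text` accepts data coordinates, so a label can be placed directly on the highest bin. A minimal sketch with a made-up count matrix:

```matlab
% Synthetic count matrix with a known peak (illustrative data).
z = zeros(5, 6);
z(2, 4) = 7;                     % peak at row 2, column 4
z(3, 1) = 2;

[peak, k]  = max(z(:));          % peak height and its linear index
[row, col] = ind2sub(size(z), k);

figure
surf(z)
% surf(z) plots columns along x and rows along y, so the label goes at (col, row).
text(col, row, peak, sprintf('  peak = %d', peak))
```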
Raj
replied on

Hi,
Thank you very much for this post. It's wonderful and exactly what I need to look at some 3.5 million data points I have. I have gone through each comment on this page and summarized the code as given below. But I am getting the error "Error using accumarray, First input SUBS must contain positive integer subscripts." I know there are no negative values in xr and yr, and as suggested I am adding 0.5. How can I fix it? Please help. Thanks in advance.
x = MAge000(:);
y = DSTS000(:);
% NaN values taken out by
x = x(isfinite(x(:, 1)), :);
y = y(isfinite(x(:, 1)), :);
% Number of bins
n = 49;
xi = linspace(min(x(:)), max(x(:)), n);
yi = linspace(min(y(:)), max(y(:)), n);
xr = interp1(xi, 0.5:numel(xi)-0.5, x, 'nearest');
yr = interp1(yi, 0.5:numel(yi)-0.5, y, 'nearest');
% [yr xr]: Y then X because the plotting commands take matrices that look like Z(y-coord, x-coord)
Z = accumarray([yr xr] + 0.5, 1, [n n]);
figure(1);
surf(xi, yi, Z);
: **26**of 39
Raj
replied on

Hi,
Never mind, I found my mistake, and fixed it. It should be
y = y(isfinite(y(:, 1)), :);
not
y = y(isfinite(x(:, 1)), :);
But I do have a question regarding the selection of n. Why n = 49 and not 50? Maybe it's a stupid question, but I don't understand. If you can explain it to me, that would be great.
Thanks again,
Raj : **27**of 39
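A caution on the clean-up step: filtering `x` and `y` separately can leave them with different lengths or mispaired, since a `NaN` in one vector removes an entry only from that vector. A single shared mask avoids this. The sketch uses short made-up vectors in place of the real data:

```matlab
% Made-up paired observations with missing values (illustration only).
x = [1; NaN; 3; 4; 5];
y = [10; 20; NaN; 40; 50];

keep = isfinite(x) & isfinite(y);   % true only where both values are usable
x = x(keep);
y = y(keep);                        % pairs (1,10), (4,40), (5,50) survive
```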
Doug
replied on

@Raj,
It is an arbitrary number.
Doug : **28**of 39
Vipin Iyer
replied on

Hi Doug,
This might be a dumb question but before I spend the time to implement this, do you think this will work for an array having 2 million points?
Thanks
-Vipin : **29**of 39
Doug
replied on

d = rand(1,2000000);
hist(d)
2 million is not a big deal to MATLAB. : **30**of 39
Neda
replied on

Hi,
Thank you for your useful codes. My problem is that when I used the code there was an error:
??? Error using ==> interp1 at 125
X and Y must be of the same length.
When I ran the code line by line I noticed that the error happened because of this line in my program:
xr = interp1(xi,1,numel(xi),a,'nearest')
I clicked on the error and it went through the function named interp1
Do you have any idea why this happened and what should I do?
I would be thankful if you could send me an email.
Regards : **31**of 39
Doug
replied on

Make sure X and Y (The first two inputs) are vectors of the same length. See the doc for interp1. : **32**of 39
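Looking at the line quoted in the question, the second argument appears to be written as `1,numel(xi)` (two separate arguments) where the earlier code in this thread has the range `1:numel(xi)`. If that is what happened, `interp1` would receive the scalar `1` as its second input, which would explain the length mismatch. A sketch of the intended call, with placeholder data for `a`:

```matlab
% Placeholder data; 'a' stands for the poster's data vector.
a  = [0.1; 0.4; 0.9];
xi = linspace(0, 1, 5);             % grid points 0, 0.25, 0.5, 0.75, 1

% The second input is the index range 1:numel(xi), written with a colon.
xr = interp1(xi, 1:numel(xi), a, 'nearest');
```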
Iain
replied on

@Thomas Smith, Andrea and Doug
I know this thread is a little old now, but I was wondering if you had any suggestions on keeping the heatmap colour consistent between plots. To elaborate, if I look at two different data sets using the code you outlined, will the colour be relative to the data set or to an external control? That is, if I see yellow on one data set, does this amount to the same yellow/density on another data set, or are they effectively arbitrary and cannot be compared? Thanks : **33**of 39
Doug
replied on

You can look at clim. Just like you would hold ylim constant between two plots, same with clim (color limit). : **34**of 39
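In practice this can be done by setting the axes `CLim` property to the same range on every plot. A minimal sketch with two synthetic count matrices (the sizes and names are made up for illustration):

```matlab
% Two synthetic count matrices with different peak heights (illustration only).
rng(0);
z1 = accumarray(randi(10, 2000, 2), 1, [10 10]);
z2 = accumarray(randi(10, 8000, 2), 1, [10 10]);

% Shared color limits spanning both data sets.
lims = [0, max([z1(:); z2(:)])];

figure
ax1 = subplot(1, 2, 1); surf(z1); colorbar
ax2 = subplot(1, 2, 2); surf(z2); colorbar
set([ax1 ax2], 'CLim', lims)        % same count -> same color in both plots
```

Without the shared limits, each plot scales its colormap to its own minimum and maximum, so the same color would mean different counts in the two plots.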
Pinkee
replied on

Hi Doug,
Thanks for such great videos. They are truly of great help. I have a question that, although not directly related to this video, is related to density functions. How would you plot the PDF of Z = sqrt(X.^2+Y.^2) if X and Y are uniformly distributed over (0,10)? I tried using hist3, but I was not sure if that was the right way. I hope your advice will help me figure it out. Thank you once again. : **35**of 39
Doug
replied on

I am not a great stats guy, so not sure I can answer. Seems more stats than programming.
Doug : **36**of 39
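On the programming side, one option is a simulation: since Z is a single variable, a 1-D histogram of simulated values, rescaled to unit area, gives an estimate of its density. The sketch below assumes X and Y are independent, and the sample size and bin count are arbitrary choices. It is a Monte Carlo estimate, not an analytic answer.

```matlab
% Monte Carlo estimate of the density of Z = sqrt(X.^2 + Y.^2).
rng(0);
N = 1e6;
X = 10 * rand(N, 1);                % uniform on (0, 10)
Y = 10 * rand(N, 1);                % independent of X (assumption)
Z = sqrt(X.^2 + Y.^2);

[counts, centers] = hist(Z, 100);   % 100 equal-width bins
binWidth = centers(2) - centers(1);
pdfEst   = counts / (N * binWidth); % scale so the bars integrate to 1

figure
bar(centers, pdfEst, 1)
xlabel('z'); ylabel('estimated density')
```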
Claudia
replied on

Hi everyone,
I have found this code very useful while working with a 2D matrix.
Now I would like to reproduce the same results with a 3D matrix: I have an X,Y,Z matrix and I would like to accumulate my events in a voxel/volume. I have tried a few things with no success. Can anyone help me with this?
It does not really work if I use accumarray by just adding one dimension!
Thanks in advance! : **37**of 39
Paul
replied on

Minor nit: the first and last bins are not the same width as the rest. One easy fix is to replace
xi = linspace(min(x(:)), max(x(:)), n);
with
xi = linspace(min(x(:)), max(x(:)), 2*n+1); xi = xi(2:2:2*n);
(Similar for yi.) : **38**of 39
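One thing to watch with centers that sit inside the data range: points between the data minimum and the first center (and likewise at the top end) fall outside the range `interp1` is given, and the `'nearest'` method returns `NaN` there by default. An alternative sketch that sidesteps `interp1` is to compute equal-width bin indices arithmetically. The data and the names `lo` and `hi` below are illustrative.

```matlab
% Equal-width binning by direct arithmetic (an alternative to interp1).
x = [0; 0.1; 0.5; 0.99; 1];         % made-up data spanning [0, 1]
n = 4;                              % number of bins

lo = min(x);
hi = max(x);
xr = floor((x - lo) / (hi - lo) * n) + 1;   % bin index in 1..n+1
xr = min(xr, n);                            % put x == hi into the last bin

counts = accumarray(xr, 1, [n 1]);
```

Each bin then covers an interval of width `(hi - lo) / n`, closed on the left, with the maximum value folded into the last bin.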
TeoNoobis
replied on

@Daniel Armyr Sir, I would like to thank you very much! This is the exact thing I was looking for and there is nothing like it in Matlab. I have been looking for it for days and have found nothing till now. Thank you very much! : **39**of 39