gramm examples

Examples and how-tos for gramm

Here we plot the evolution of fuel economy of new cars between 1970 and 1982 (carbig dataset). gramm makes it easy to separate groups by the number of cylinders of the cars (color) and by their region of origin (subplot columns). Both the raw data (points) and a GLM fit with 95% confidence interval (line and shaded area) are plotted.

We start by loading the sample data (a structure created from the carbig dataset)

Create a gramm object, provide x (year of production) and y (fuel economy) data, color grouping data (number of cylinders) and select a subset of the data

g=gramm('x',cars.Model_Year,'y',cars.MPG,'color',cars.Cylinders,'subset',cars.Cylinders~=3 & cars.Cylinders~=5);
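The 'subset' argument in the call above is a logical vector with one element per observation. A minimal sketch of how such a mask behaves, using a short made-up vector of cylinder counts (hypothetical values, not the carbig data):

```matlab
% Hypothetical cylinder counts for eight cars (illustrative only)
demo_cyl = [4 6 8 3 4 5 8 6];

% Same kind of mask as in the gramm() call: drop 3- and 5-cylinder cars
demo_keep = demo_cyl~=3 & demo_cyl~=5;

% Equivalent formulation with ismember()
demo_keep_alt = ~ismember(demo_cyl,[3 5]);

fprintf('%d of %d observations kept\n',sum(demo_keep),numel(demo_cyl));
```

The ismember() form can be easier to read when more than two values have to be excluded.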

Subdivide the data in subplots horizontally by region of origin using facet_grid()

g.facet_grid([],cars.Origin_Region);

Plot raw data as points

g.geom_point();

Plot linear fits of the data with associated confidence intervals

g.stat_glm();
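For intuition about what a linear fit amounts to, the sketch below fits a straight line by least squares with polyfit() on made-up data. It is only an illustration with hypothetical variable names; it is not how stat_glm() is implemented and it does not compute confidence intervals.

```matlab
% Made-up data lying exactly on the line y = 2*x + 1
demo_x = (0:5)';
demo_y = 2*demo_x + 1;

% Degree-1 polynomial fit: p(1) is the slope, p(2) the intercept
demo_p = polyfit(demo_x,demo_y,1);

% Evaluate the fitted line at new x values
demo_fit = polyval(demo_p,[6 7]);
```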

Set appropriate names for legends

g.set_names('column','Origin','x','Year of production','y','Fuel economy (MPG)','color','# Cylinders');

Set figure title

g.set_title('Fuel economy of new cars between 1970 and 1982');

Do the actual drawing

figure('Position',[100 100 800 400]);

g.draw();

Grouping options in gramm

With gramm there are a lot of ways to map groups to visual properties of plotted data, or even to subplots. Grouping variables that change visual properties are provided in the constructor call gramm(). Grouping variables that determine subplotting are provided by calls to the facet_grid() or facet_wrap() methods. Note that all the mappings presented below can be combined, i.e. it's possible to provide a different variable to each of the options.

In order to plot multiple, different gramm objects in the same figure, an array of gramm objects is created, and the draw() function is called at the end on the whole array

clear g

g(1,1)=gramm('x',cars.Horsepower,'y',cars.MPG,'subset',cars.Cylinders~=3 & cars.Cylinders~=5);

g(1,1).geom_point();

g(1,1).set_names('x','Horsepower','y','MPG');

g(1,1).set_title('No groups');

g(1,2)=gramm('x',cars.Horsepower,'y',cars.MPG,'subset',cars.Cylinders~=3 & cars.Cylinders~=5,'color',cars.Cylinders);

g(1,2).geom_point();

g(1,2).set_names('x','Horsepower','y','MPG','color','# Cyl');

g(1,2).set_title('color');

g(1,3)=gramm('x',cars.Horsepower,'y',cars.MPG,'subset',cars.Cylinders~=3 & cars.Cylinders~=5,'lightness',cars.Cylinders);

g(1,3).geom_point();

g(1,3).set_names('x','Horsepower','y','MPG','lightness','# Cyl');

g(1,3).set_title('lightness');

g(2,1)=gramm('x',cars.Horsepower,'y',cars.MPG,'subset',cars.Cylinders~=3 & cars.Cylinders~=5,'size',cars.Cylinders);

g(2,1).geom_point();

g(2,1).set_names('x','Horsepower','y','MPG','size','# Cyl');

g(2,1).set_title('size');

g(2,2)=gramm('x',cars.Horsepower,'y',cars.MPG,'subset',cars.Cylinders~=3 & cars.Cylinders~=5,'marker',cars.Cylinders);

g(2,2).geom_point();

g(2,2).set_names('x','Horsepower','y','MPG','marker','# Cyl');

g(2,2).set_title('marker');

g(2,3)=gramm('x',cars.Horsepower,'y',cars.MPG,'subset',cars.Cylinders~=3 & cars.Cylinders~=5,'linestyle',cars.Cylinders);

g(2,3).geom_line();

g(2,3).set_names('x','Horsepower','y','MPG','linestyle','# Cyl');

g(2,3).set_title('linestyle');

g(3,1)=gramm('x',cars.Horsepower,'y',cars.MPG,'subset',cars.Cylinders~=3 & cars.Cylinders~=5);

g(3,1).facet_grid(cars.Cylinders,[]);

g(3,1).geom_point();

g(3,1).set_names('x','Horsepower','y','MPG','row','# Cyl');

g(3,1).set_title('subplot rows');

g(3,2)=gramm('x',cars.Horsepower,'y',cars.MPG,'subset',cars.Cylinders~=3 & cars.Cylinders~=5);

g(3,2).facet_grid([],cars.Cylinders);

g(3,2).geom_point();

g(3,2).set_names('x','Horsepower','y','MPG','column','# Cyl');

g(3,2).set_title('subplot columns');

figure('Position',[100 100 800 800]);

g.draw();
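As noted at the start of this section, the mappings can be combined. A minimal sketch with synthetic data, mapping one grouping variable to color, a second to marker, and a third to subplot columns. The variables demo_n, demo_px, demo_py, grp_a, grp_b, grp_c and gc are made up for illustration; the gramm calls are the ones already used in this section, and the plotting part is skipped when gramm is not on the MATLAB path.

```matlab
% Synthetic data: 120 points with three independent grouping variables
demo_n = 120;
demo_px = randn(demo_n,1);
demo_py = demo_px + randn(demo_n,1);
grp_a = randi(2,demo_n,1);   % mapped to color
grp_b = randi(2,demo_n,1);   % mapped to marker
grp_c = randi(2,demo_n,1);   % mapped to subplot columns

if ~isempty(which('gramm'))   % only plot when gramm is available
    gc = gramm('x',demo_px,'y',demo_py,'color',grp_a,'marker',grp_b);
    gc.facet_grid([],grp_c);
    gc.geom_point();
    gc.set_names('x','x','y','y','color','A','marker','B','column','C');
    gc.set_title('color + marker + subplot columns');
    figure('Position',[100 100 800 400]);
    gc.draw();
end
```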

Methods for visualizing Y~X relationships with X as categorical variable

The following methods can be used when Y data is continuous and X data discrete/categorical.

Here we also use an array of gramm objects in order to have multiple gramm plots on the same figure. The gramm objects use the same data, so we copy them after construction using the copy() method

clear g

g(1,1)=gramm('x',cars.Origin_Region,'y',cars.Horsepower,'color',cars.Cylinders,'subset',cars.Cylinders~=3 & cars.Cylinders~=5);

g(1,2)=copy(g(1));

g(2,1)=copy(g(1));

g(2,2)=copy(g(1));

%Raw data as scatter plot

g(1,1).geom_point();

g(1,1).set_title('geom_point()');

%Jittered scatter plot

g(1,2).geom_jitter('width',0.4,'height',0);

g(1,2).set_title('geom_jitter()');

%Averages with confidence interval

g(2,1).stat_summary('geom',{'bar','black_errorbar'});

g(2,1).set_title('stat_summary()');

%Boxplots

g(2,2).stat_boxplot();

g(2,2).set_title('stat_boxplot()');

%These functions can be called on arrays of gramm objects

g.set_names('x','Origin','y','Horsepower','color','# Cyl');

g.set_title('Visualization of Y~X relationships with X as categorical variable');

figure('Position',[100 100 800 550]);

g.draw();
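The copy() calls above matter under the assumption that gramm objects are handle objects, for which plain assignment creates a second reference to the same object rather than an independent duplicate. The sketch below illustrates that aliasing behavior with containers.Map, a built-in handle class used here purely as a stand-in (it is not part of gramm):

```matlab
% containers.Map is a built-in handle class, used as a stand-in for illustration
m1 = containers.Map();
m1('title') = 'first';

% Plain assignment copies the handle, not the underlying object
m2 = m1;
m2('title') = 'changed';

% The change made through m2 is visible through m1 as well
disp(m1('title'));
```

With handle objects of this kind, layers added to one array element would otherwise show up in every element that shares the same reference.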

Methods for visualizing X densities

The following methods can be used in order to represent the density of a continuous variable. Note that here we represent the same data as in the previous figure, this time with Horsepower as X (over which the densities are represented), and separating the region of origin with subplots.

clear g

g(1,1)=gramm('x',cars.Horsepower,'color',cars.Cylinders,'subset',cars.Cylinders~=3 & cars.Cylinders~=5);

g(1,2)=copy(g(1));

g(2,1)=copy(g(1));

g(2,2)=copy(g(1));

%Raw data as raster plot

g(1,1).facet_grid(cars.Origin_Region,[]);

g(1,1).geom_raster();

g(1,1).set_title('geom_raster()');

%Histogram

g(1,2).facet_grid(cars.Origin_Region,[]);

g(1,2).stat_bin('nbins',8);

g(1,2).set_title('stat_bin()');

%Kernel smoothing density estimate

g(2,1).facet_grid(cars.Origin_Region,[]);

g(2,1).stat_density();

g(2,1).set_title('stat_density()');

% Q-Q plot for normality

g(2,2).facet_grid(cars.Origin_Region,[]);

g(2,2).stat_qq();

g(2,2).axe_property('XLim',[-5 5]);

g(2,2).set_title('stat_qq()');

g.set_names('x','Horsepower','color','# Cyl','row','','y','');

g.set_title('Visualization of X densities');

figure('Position',[100 100 800 550]);

g.draw();

Methods for visualizing Y~X relationship with both X and Y as continuous variables

The following methods can be used when both X and Y data are continuous

clear g

%Raw data as scatter plot

g(1,1)=gramm('x',cars.Horsepower,'y',cars.Acceleration,'color',cars.Cylinders,'subset',cars.Cylinders~=3 & cars.Cylinders~=5);

g(1,2)=copy(g(1));

g(1,3)=copy(g(1));

g(2,1)=copy(g(1));

g(2,2)=copy(g(1));

g(1,1).geom_point();

g(1,1).set_title('geom_point()');

%Generalized linear model fit

g(1,2).stat_glm();

g(1,2).set_title('stat_glm()');

%Custom fit with provided function

g(1,3).stat_fit('fun',@(a,b,c,x)a./(x+b)+c,'intopt','functional');

g(1,3).set_title('stat_fit(''fun'',@(a,b,c,x)a./(x+b)+c)');

%Spline smoothing

g(2,1).stat_smooth();

g(2,1).set_title('stat_smooth()');

%Moving average

g(2,2).stat_summary('bin_in',10);

g(2,2).set_title('stat_summary(''bin_in'',10)');

g.set_names('x','Horsepower','y','Acceleration','color','# Cylinders');

g.set_title('Visualization of Y~X relationship with both X and Y as continuous variables');

figure('Position',[100 100 800 550]);

g.draw();

Warning: Start point not provided, choosing random start point.
Warning: Start point not provided, choosing random start point.
Warning: Start point not provided, choosing random start point.
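In the stat_fit() call above, the function handle lists the parameters to estimate first (a, b, c) and the independent variable x last. To get a feel for the model before fitting, it can be evaluated directly. The parameter values and the horsepower-like inputs below are arbitrary, chosen only for illustration.

```matlab
% Same model as in the stat_fit() call: a hyperbola with an offset
demo_model = @(a,b,c,x) a./(x+b)+c;

% Arbitrary, illustrative parameter values (not fitted values)
a0 = 1000;
b0 = 10;
c0 = 5;

% Evaluate the model over a few horsepower-like inputs
demo_hp = [40 90 190];
demo_acc = demo_model(a0,b0,c0,demo_hp);
```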

Methods for visualizing 2D densities

The following methods can be used to visualize 2D densities for bivariate data

%Create point cloud with two categories

N=10^4;

x=randn(1,N);

y=x+randn(1,N);

test=repmat([0 1 0 0],1,N/4);

y(test==0)=y(test==0)+3;

clear g

% Display points and 95% percentile confidence ellipse

g(1,1)=gramm('x',x,'y',y,'color',test);

g(1,1).set_names('color','grp');

g(1,1).geom_point();

%'patch_opts' can be used to provide more options to the patch() internal

%call

g(1,1).stat_ellipse('type','95percentile','geom','area','patch_opts',{'FaceAlpha',0.1,'LineWidth',2});

g(1,1).set_title('stat_ellipse()');

%Plot point density as contour plot

g(1,2)=gramm('x',x,'y',y,'color',test);

g(1,2).stat_bin2d('nbins',[10 10],'geom','contour');

g(1,2).set_names('color','grp');

g(1,2).set_title('stat_bin2d(''geom'',''contour'')');

% %Plot density as point size (looks good only when axes have the same

% %scale, hence the 'DataAspectRatio' option, equivalent to axis equal)

% g(2,1)=gramm('x',x,'y',y,'color',test);

% g(2,1).stat_bin2d('nbins',{-10:0.4:10 ; -10:0.4:10},'geom','point');

% g(2,1).axe_property('DataAspectRatio',[1 1 1]);

% g(2,1).set_names('color','grp');

% g(2,1).set_title('stat_bin2d(''geom'',''point'')');

%Plot density as heatmaps (heatmaps don't work with multiple colors, so we separate

%the categories with facets). The heatmap gives a better view of the

%distribution in high-density areas

g(2,1)=gramm('x',x,'y',y);

g(2,1).facet_grid([],test);

g(2,1).stat_bin2d('nbins',[20 20],'geom','image');

%g(2,1).set_continuous_color('LCH_colormap',[0 100 ; 100 20 ;30 20]); %Let's try a custom LCH colormap !

g(2,1).set_names('column','grp','color','count');

g(2,1).set_title('stat_bin2d(''geom'',''image'')');

g.set_title('Visualization of 2D densities');

figure('Position',[100 100 800 600])

g.draw();

%We change the point size in the first graph a posteriori

set([g(1,1).results.geom_point_handle],'MarkerSize',2);

Methods for visualizing repeated trajectories

gramm supports 2D inputs for X and Y data (as 2D array or cell of arrays), which is particularly useful for representing repeated trajectories. Here for example we generate 50 trajectories, each of length 40. The grouping data is then given per trajectory and not per data point. Here the color grouping variable is thus given as a 1x50 cellstr.

%We generate 50 trajectories of length 40, with 3 groups

N=50;

nx=40;

cval={'A' 'B' 'C'};

cind=randi(3,N,1);

c=cval(cind);

x=linspace(0,3,nx);

y=arrayfun(@(c)sin(x*c)+randn(1,nx)/10+x*randn/5,cind,'UniformOutput',false);

clear g

g(1,1)=gramm('x',x,'y',y,'color',c);

g(1,2)=copy(g(1));

g(2,1)=copy(g(1));

g(2,2)=copy(g(1));

g(1,1).geom_point();

g(1,1).set_title('geom_point()');

g(1,2).geom_line();

g(1,2).set_title('geom_line()');

g(2,1).stat_smooth();

g(2,1).set_title('stat_smooth()');

g(2,2).stat_summary();

g(2,2).set_title('stat_summary()');

g.set_title('Visualization of repeated trajectories ');

figure('Position',[100 100 800 550]);

g.draw();
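The example above passes y as a cell of arrays. Since 2D arrays are also accepted, trajectories of equal length can instead be stacked in a matrix. The sketch below assumes one trajectory per row, which is worth checking against the gramm documentation, and uses made-up variable names. The plotting part is skipped when gramm is not on the MATLAB path.

```matlab
% A small set of trajectories: 6 trajectories of length 40, 3 groups
demo_nt = 6;
demo_np = 40;
demo_labels = {'A' 'B' 'C'};
demo_gi = [1 2 3 1 2 3]';
demo_tx = linspace(0,3,demo_np);
demo_ty_cell = arrayfun(@(k)sin(demo_tx*k)+randn(1,demo_np)/10,demo_gi,'UniformOutput',false);

% Stack the 1-by-np rows into an nt-by-np matrix (one trajectory per row)
demo_ty = cell2mat(demo_ty_cell);

% Repeat the shared x values so that x has the same size as y
demo_tx_mat = repmat(demo_tx,demo_nt,1);

if ~isempty(which('gramm'))   % only plot when gramm is available
    gt = gramm('x',demo_tx_mat,'y',demo_ty,'color',demo_labels(demo_gi));
    gt.geom_line();
    figure('Position',[100 100 400 300]);
    gt.draw();
end
```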

Methods for visualizing repeated densities (e.g. spike densities)

With the support of 2D inputs for X and gramm's functionality for representing the density of data, useful neuroscientific plots can be generated when the provided X corresponds to spike trains: raster plots and peristimulus time histograms (PSTHs).

%We generate 50 spike trains, with 3 groups

N=50;

cval={'A' 'B' 'C'};

cind=randi(3,N,1);

c=cval(cind);

train_template=[zeros(1,300) ones(1,200)];

%Pseudo-poisson spike trains

spike_train=cell(N,1);

for k=1:N

temp_train=rand*0.05+train_template/(cind(k)*8);

U=rand(size(temp_train));

spike_train{k}=find(U<temp_train);

end

clear g

g(1,1)=gramm('x',spike_train,'color',c);

g(1,1).geom_raster();

g(1,1).set_title('geom_raster()');

g(1,2)=gramm('x',spike_train,'color',c);

g(1,2).stat_bin('nbins',25,'geom','line');

g(1,2).set_title('stat_bin()');

g.set_names('x','Time','y','');

g.set_title('Visualization of spike densities');

figure('Position',[100 100 800 350]);

g.draw();
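In the loop above, each of the 500 time bins receives a spike when a uniform random number falls below that bin's probability, so the expected number of spikes in a train is the sum of the per-bin probabilities. A small sketch of that arithmetic, with a fixed baseline of 0.02 standing in for the random rand*0.05 term (an illustrative choice, as are the variable names):

```matlab
% Same template as above: 300 'off' bins followed by 200 'on' bins
demo_template = [zeros(1,300) ones(1,200)];

% Illustrative fixed baseline instead of rand*0.05
demo_baseline = 0.02;

% Per-bin spike probability for groups 1, 2 and 3 (one row per group)
demo_grp = (1:3)';
demo_prob = demo_baseline + bsxfun(@rdivide,demo_template,demo_grp*8);

% Expected spike count per train = sum of the per-bin probabilities
demo_expected = sum(demo_prob,2);
```

The group index divides the 'on' probability, so group A trains are expected to carry the most spikes and group C trains the fewest.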

Options for separating groups across subplots with facet_grid()

To separate groups across rows and columns of subplots, the grouping variables just need to be passed to facet_grid(group_rows,group_columns) or facet_wrap(group_columns). Both have multiple options concerning the scaling of data between subplots.

• 'scale','fixed' (default): all subplots have the same limits
• 'scale','free_x': subplots on the same columns have the same x limits
• 'scale','free_y': subplots on the same rows have the same y limits
• 'scale','free': subplots on the same rows have the same y limits, subplots on the same columns have the same x limits
• 'scale','independent': subplots have independent limits

In facet_grid(), the 'space' option sets how the subplot axes themselves scale with the data. It should be used in conjunction with the corresponding 'scale' option.
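To make the difference between the 'scale' options more concrete, the sketch below computes, for made-up data, the x range shared by all subplots versus the x range of each column group taken separately. It only illustrates why per-column limits can help when groups live on different scales; it does not reproduce gramm's internal limit computation, and all names are hypothetical.

```matlab
% Made-up x values with a column group index per point
demo_xv  = [1 2 3 10 12 14 100 110 120]';
demo_col = [1 1 1  2  2  2   3   3   3]';

% One common range for every subplot (the situation with shared limits)
demo_shared_lim = [min(demo_xv) max(demo_xv)];

% One range per column group (the situation with per-column x limits)
demo_col_lim = [accumarray(demo_col,demo_xv,[],@min) accumarray(demo_col,demo_xv,[],@max)];
```

With the shared range, the first two groups would be squeezed into a small fraction of the axis; with per-column ranges each group fills its own subplot.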

% Generating fake data

N=2000;

colval={'A' 'B' 'C'};

rowval={'I' 'II'};

cind=randi(3,N,1);

c=colval(cind);

rind=randi(2,N,1);

r=rowval(rind);

x=randn(N,1);

y=randn(N,1);

x(cind==1 & rind==1)=x(cind==1 & rind==1)*5;

x=x+cind*3;

y(cind==3 & rind==2)=y(cind==3 & rind==2)*3;

y=y-rind*4;

clear g

g(1,1)=gramm('x',x,'y',y,'color',c,'lightness',r);

g(1,2)=copy(g(1));

g(2,1)=copy(g(1));

g(2,2)=copy(g(1));

g(3,1)=copy(g(1));

g(3,2)=copy(g(1));

g(1,1).geom_point();

g(1,1).set_title('No facets');

g(1,2).facet_grid(r,c);

g(1,2).geom_point();

g(1,2).no_legend();

g(1,2).set_title('facet_grid()');

g(2,1).facet_grid(r,c,'scale','free');

g(2,1).geom_point();

g(2,1).no_legend();

g(2,1).set_title('facet_grid(''scale'',''free'')');

g(2,2).facet_grid(r,c,'scale','free','space','free');

g(2,2).geom_point();

g(2,2).no_legend();

g(2,2).set_title('facet_grid(''scale'',''free'',''space'',''free'')');

g(3,1).facet_grid(r,c,'scale','free_x');

g(3,1).geom_point();

g(3,1).no_legend();

g(3,1).set_title('facet_grid(''scale'',''free_x'')');

g(3,2).facet_grid(r,c,'scale','independent');

g(3,2).geom_point();

g(3,2).no_legend();

g(3,2).set_title('facet_grid(''scale'',''independent'')');

g.set_color_options('lightness_range',[40 80],'chroma_range',[80 40]);

g.set_names('column','','row','');

%g.axe_property('color',[0.9 0.9 0.9],'XGrid','on','YGrid','on','GridColor',[1 1 1],'GridAlpha',0.8,'TickLength',[0 0],'XColor',[0.3 0.3 0.3],'YColor',[0.3 0.3 0.3])

g.set_title('facet_grid() options');

figure('Position',[100 100 800 800]);

g.draw();