For quite a while now, we have made MathWorks product documentation available from the support page on our web site. Though we started with MATLAB documentation initially, we have added documentation for all the rest
of our products and have hosted a complete set on the mathworks.com for quite some time now.
As you likely know, within the MathWorks family of products, you can often find more than one way, or even more than one product,
to accomplish a particular goal. However, it's been difficult to see the whole landscape of possibilities, including functions,
examples, information in the User Guide, etc., for any given topic - until now, that is. I want to show you our more integrated
approach for seeking information in this post.
If you go to MathWorks website, you can search in the box in the upper right corner for the term "documention". Here's a snippet of the screen you see
next.
Find New Beta Documentation Center
Choose the second link, 'Online Documentation for MathWorks Product Family', and you reach a page that looks like this one.
See the prominent green box near the top 'Try the Documentation Center Beta' to see the new way to search for topics in our
documentation and you will see a page that looks like this one.
Documentation Center Features
In Release 2011b, the documentation already has many features that allow you to pinpoint relevant information quickly.
Browse Documentation Topics by category
Help topics are organized by categories to supports your tasks
Categories include function reference pages, examples, and other related content * Consistent access to Getting Started, Troubleshooting,
Featured Examples, and reference pages
Search Documentation Topics
* Get search suggestions as you type, organized by content type * Refine search results by product, category, and content
type * Single-click access to Functions and Blocks in a product * Expand search results to all Support content
Learning About Curve Fitting
Let's see what happens when we search for 'curve fitting' in the Doc Center. Here you see a screen with several different
areas of focus.
You can see relevant documentation in MathWorks products in the main, right-hand portion of the window. In addition, the
left side contains information clustered in other ways, allowing you to refine the search overall, by the type of information
(functions, blocks, etc.), or by product.
Documentation Center
We hope you like the presentation of information in the Documentation Center and find it helpful. We have plans for more features
for the Documentation Center. Please feel free to post your thoughts here.
I thought I would share my top goto list of things I try to do when I write MATLAB code. And checking with other MathWorks
folks whose code I admire, I found they basically used the same mental list that I use. You can find blog posts on all of
these topics by selecting relevant categories from the right side of The Art of MATLAB blog site.
Clearly (at least to me), this is not everything you generally need to do. You still need to comment the code, add good help
information and examples, etc. But these are the main coding practices and tools I always rely on.
Vectorize (but sensibly).
Use bsxfun in lieu of repmat where possible.
When looping through an array, loop down columns to access memory in the same order that MATLAB stores the data in.
Profile the code. I am often surprised about what is taking up the time.
Pay attention to messages from the Code Analyzer.
Use functions instead of scripts.
Don't "poof" variables into any workspaces. Translation, don't use load without a left-hand side; avoid eval, evalin, and assignin.
Use logical indexing instead of find.
Avoid global variables.
Don't use equality checks with floating point values.
Missing from Your List? Additions to My List?
What's on my list that you don't currently do? Do you have a major addition to my list (there can't be too many, or I won't remember to do them all!)? Let me know here.
Applications developed with the MATLAB Compiler direct their output to the console window. While this may suffice for batch
or background tasks, it is often not suitable for interactive or graphical environments. In an earlier post, I demonstrated how to use message handlers to redirect program output and error messages to a log file. For many batch or background applications, simply capturing
output to a log file provides sufficient information for validation and debugging. However, if you need to more closely monitor
the progress of your application while it is running, you might want to redirect output to a visible window.
I've written a short C program in monitor.c to demonstrate a few applications of this idea: on UNIX, the program sends output to an X terminal; on Windows, I created two programs. The simplest of the Windows programs displays the output in a Windows dialog box. The
other, slightly more complex, sends the output as events to the Windows Event Log.
In all three cases (UNIX and Windows), the monitor main program calls a MATLABfunction, sendmessages, contained in a shared library generated by MATLAB Compiler.
To follow the rest of this posting, download the source code for the main program that calls sendmessages from MATLAB Central. See the monitorREADME in the ZIP file for a description of each of the downloaded files and complete directions for building the example application.
The first command builds the dialog box version (monitorWinDialog.exe) and the second command builds the event log version (monitorWinEvents.exe).
You can't run monitorWinEvents.exe on Microsoft Windows until you register the event source. (UNIX readers can skip this step.) To register the event source,
follow the step-by-step instructions in the monitorREADME file in the source code distribution ZIP file.
Displaying Messages in a Visible Window
The main program in monitor.c relies on three external functions:
void StartMonitoring(void) void StopMonitoring(void) int
MonitorHandler(const char *msg)
StartMonitoring initializes the monitoring system, MonitorHandler redirects the program's output to the monitoring window and StopMonitoring releases the resources required to capture the program's output. The implementation of these functions varies by platform.
On UNIX, for example, the StartMonitor function starts an xterm which runs tail -f to monitor the log file:
The UNIX message handler, MonitorHandler, is identical to the message handler from the previous post's example, except that it writes messages to a temporary log file instead of a persistent one.
On Windows, the simplest version of the monitoring system sends messages to a dialog box. StartMonitoring creates the dialog box and starts it running in a separate thread. MonitorHandler calls LogMessage to add lines of text to the ListBox contained in the dialog box. StopMonitoring waits for the dialog box's thread to terminate. LogMessage sends the lines of text to the ListBox using the Windows messaging system. Since each item in a list box only displays a single line of text, LogMessage sends multi-line text as a sequence of separate messages, one message per line.
Now run it. The program takes three input arguments: the informational message, the warning and the error message.
(UNIX): ./monitor information "look out!" oops
(Windows): monitorWinDialog.exe information "look out!" oops
On UNIX machines and when running the dialog box version (monitorWinDialog) on a Windows machine, you'll see a window pop up and display the messages. To view the output of the event log version on
a Windows PC, you'll need to open the Event Logger and click on the Application category.
Next time: handling errors differently than information and warning messages. Until then, let me know how your application makes sure error messages get delivered to users. Do you need a permanent record, like the Windows event
logger provides?
If you've been following the comments from my recent post on chaotic maps, you know that there were several entries to the challenge of coming up with an interesting visualization.
The entries are each interesting, displaying schemes, including an animation, and extra information placed on the plots in
various ways. I have conferred with a few colleagues here and we have chosen the contribution from Rafael Oliveira as the
winning entry. Why? Because the code is interesting, he made interesting use interesting of some functions, the and the
plot is cute (at least for rng(42).
I decided to make a pixelated gingerbread man with the output of the chaotic map, using a lemniscate of Gerono as his green
bow tie :)
The Code
I took the liberty of adding a small number of comments to Rafael's code to highlight some parts of the plot.
type gingerR
function gingerR
%function pixelatedGBM from Rafael
rng(42) % could comment this out to get other "men"
[x,y] = gingerbreadman;
% scale and rotate our gingerbread man
r = [x y] * [cosd(135) -sind(135); sind(135) cosd(135)];
minR = min(r); maxR = max(r);
r = (r - repmat(minR,size(r,1),1))./repmat(maxR-minR,size(r,1),1);
r(:,2) = r(:,2).^1.5;
% create a pixel representation of it
N = 25;
b = linspace(0,1,N);
dif = b(2)-b(1);
[xb,yb] = meshgrid(b,b);
C = zeros(N);
x = r(:,1); y = r(:,2);
for i = 1:numel(xb)
C(i) = length(find(x >= xb(i) & x < xb(i) +dif & y >= yb(i) ...
& y < yb(i)+dif));
end
C = reshape(C,size(xb));
% smooth a little for better results
smooth = @(A,L) ((eye(size(A,1)) + ...
L ^ 2 * diff(eye(size(A,1)),2)' * diff(eye(size(A,1)),2) + ...
2 * L * diff(eye(size(A,1)))' * diff(eye(size(A,1)))) \ A);
D = smooth(smooth(C,0.1)',0.1)';
% let's draw it :)
set(figure,'Position',[0 0 300 400],'Color',[1 1 1])
movegui(gcf,'center')
set(surf(xb,yb,zeros(size(D)),D),'ZData',xb.*0-0.01);
view(2); shading flat; grid off; axis off equal;
colormap([1 1 1; pink(19)])
hold on
t = linspace(0,2*pi,50);
% bowtie
set(fill((sin(t))/9+0.5,(sin(2*t))/18+0.65,[0 .8 0]),...
'EdgeAlpha',0)
set(fill((sin(t))/30+0.5,(cos(t))/30+0.65,[0 .9 0]),...
'EdgeAlpha',0)
t = linspace(0,2*pi,5);
% buttons
set(fill((sin(t))/20+0.5,(cos(t))/20+0.5,[.8 0 0]),'EdgeAlpha',0)
set(fill((sin(t))/20+0.5,(cos(t))/20+0.3,[.8 0 0]),'EdgeAlpha',0)
% sugar
plot3(r(1:10:end,1),r(1:10:end,2),r(1:10:end,1)*0-0.005,'.',...
'MarkerSize',3,'MarkerEdgeColor',[1 1 1]);
end
Thanks for Participating
I really appreciate all the effort the contributors made. The decision was tough. I encourage you all to grab some of the
contributions from the comments in the previous post and play with them yourselves.
Let me also take this moment to wish all of you a happy, healthy holiday season and new year!
Chaos theory has found uses across a broad set of scientific fields to explain some complicated observed behavior. In geophysics, my
background, it can help explain the reversals of Earth's magnetic field, for example. Today I thought I'd share a chaotic
system, called Gingerbreadman maps, whose equations make the system seem simple. That is, until you do some simulations.
The equations for the gingerbreadman map look simple enough. For any given point in space: , define the next point in the sequence by . I now show the same equations in this MATLAB code, accounting for some initial values x0,y0 to seed the calculation.
dbtype 17:25gingerbreadman
17
18 % main calculation
19 for i = 1:n
20 if i == 1
21 x(i) = 1 - y0 + abs(x0);
22 y(i) = x0;
23 else
24 x(i) = 1 - y(i-1) + abs(x(i-1));
25 y(i) = x(i-1);
Plot Results
I want to show several results from different starting values for x0,y0. First I'll set the random number generator seed for repeatable results.
rng(42)
gingerbreadman
gingerbreadman
gingerbreadman
gingerbreadman
gingerbreadman
Code Listing
Here's the full code listing for gingerbreadman. You can see it allows you to specify the initial conditions, and return points and initial conditions should you choose.
With no output arguments, gingerbreadman creates plots like you've been seeing.
type gingerbreadman
function [xout,yout, x0, y0] = gingerbreadman(x0,y0)
% Gingerbreadman map producing a chaotic 2-D map.
% Copyright 2011 The MathWorks, Inc.
% if not enough inputs, assign random numbers
if nargin < 2
x0 = randn();
y0 = randn();
end
% iteration counter
n = 10000;
x = zeros(n,1);
y = zeros(n,1);
% main calculation
for i = 1:n
if i == 1
x(i) = 1 - y0 + abs(x0);
y(i) = x0;
else
x(i) = 1 - y(i-1) + abs(x(i-1));
y(i) = x(i-1);
end
end
% if output is requested, return gingerbread x,y values and
% x0, y0 initial conditions
%
% otherwise plot results
if nargout > 0
xout = x;
yout = y;
else
scatter(x, y, '.');
end
Instead of using the plotting function scatter, post your thoughts (in code) here for a chance to win some MATLAB bling. I will look at entries up through (was: Sunday, November 27, 2011) Wednesday December 14, 2011 and announce the winner
(the one whose plot I find most interesting) shortly after that.
This week Richard Willey from technical marketing will finish his two part presentation on subset selection and regularization.
In a recent posting, we examined how to use sequential feature selection to improve predictive accuracy when modeling wide data sets with highly
correlated variables. This week, we're going to solve the same problems using regularization algorithms such as lasso, the
elastic net, and ridge regression. Mathematically, these algorithms work by penalizing the size of the regression coefficients
in the model.
Standard linear regression works by estimating a set of coefficients that minimize the sum of the squared error between the
observed values and the fitted values from the model. Regularization techniques like ridge regression, lasso, and the elastic
net introduce an additional term to this minimization problem.
Ridge regression identifies a set of regression coefficients that minimize the sum of the squared errors plus the sum of the
squared regression coefficients multiplied by a weight parameter . can take any value between zero and one. A value of zero is equivalent to a standard linear regression. As increases in size, regression coefficients shrink towards zero.
Lasso minimizes the sum of the squared errors plus the sum of the absolute value of the regression coefficients.
The elastic net is a weighted average of the lasso and the ridge solutions.
The introduction of this additional term forces the regression coefficients towards zero generating a simpler model with greater
predictive accuracy.
Let's see regularization in action by using lasso to solve the same problem we looked at last week.
clear all
clc
rng(1998);
mu = [0 0 0 0 0 0 0 0];
i = 1:8;
matrix = abs(bsxfun(@minus,i',i));
covariance = repmat(.5,8,8).^matrix;
X = mvnrnd(mu, covariance, 20);
Beta = [3; 1.5; 0; 0; 2; 0; 0; 0];
ds = dataset(Beta);
Y = X * Beta + 3 * randn(20,1);
b = regress(Y,X);
ds.Linear = b;
Use Lasso to Fit the Model
The syntax for the lasso command is very similar to that used by linear regression. In this line of code, I am going estimate a set of coefficients
B that models Y as a function of X. To avoid over fitting, I'm going to apply five-fold cross validation.
[B Stats] = lasso(X,Y, 'CV', 5);
When we perform a linear regression, we generate a single set of regression coefficients. By default lasso will create 100 different models. Each model was estimated using a slightly larger . All of the model coefficients are stored in array B. The rest of the information about the model is stored in a structure named Stats.
Let's look at the first five sets of coefficients inside of B. As you traverse the rows you can see that as increases, the value model coefficients are usually shrinking towards zero.
Create a Plot Showing Mean Square Error Versus Lambda
The natural question to ask at this point in time is "OK, which of these 100 different models should I use?". We can answer
that question using lassoPlot. lassoPlot generates a plot that displays the relationship between and the cross validated mean square error (MSE) of the resulting model. Each of the red dots shows the MSE for the resulting
model. The vertical line segments stretching out from each dot are error bars for each estimate.
You can also see a pair of vertical dashed lines. The line on the right identifies the value that minimizes the cross validated MSE. The line on the left indicates the highest value of whose MSE is within one standard error of the minimum MSE. In general, people will chose the that minimizes the MSE. On occasion, if a more parsimonious model is considered particularly advantageous, a user might
choose some other value that falls between the two line segments.
lassoPlot(B, Stats, 'PlotType', 'CV')
Use the Stats Structure to Extract a Set of Model Coefficients.
The value that minimizes the MSE is stored in the Stats structure. You can use this information to index into Beta and extract the set of coefficients that minimize the MSE.
Much as in the feature selection example, we can see that the lasso algorithm has eliminated four of the five distractors
from the resulting model. This new, more parsimonious model will be significantly more accurate for prediction than a standard
linear regression.
Here, once again, it's very dangerous to base any kind of analysis on a single observation. Let's use a simulation to compare
the accuracy of a linear regression with the lasso. We'll start by preallocating some variables.
Next, we'll generate 100 different models and estimate the number of coefficients contained in the lasso model as well as
the difference in the cross validated MSE between a standard linear regression and the lasso model.
As you can see, on average, the lasso model only contains 4.5 terms (the standard linear regression model includes 8). More
importantly, the cross validated MSE for the linear regression model is about 30% larger than that generated from the lasso. This is an incredibly powerful results. The lasso algorithm is every bit as easy to apply as standard linear regression, however, it offers significant improvements in predictive
accuracy compared to regression.
rng(1998);
for i = 1 : 100
X = mvnrnd(mu, covariance, 20);
Y = X * Beta + randn(20,1);
[B Stats] = lasso(X,Y, 'CV', 5);
Betas(:,i) = B(:,Stats.IndexMinMSE) > 0;
Coeff_Num(i) = sum(B(:,Stats.IndexMinMSE) > 0);
MSE(i) = Stats.MSE(:, Stats.IndexMinMSE);
regf = @(XTRAIN, ytrain, XTEST)(XTEST*regress(ytrain,XTRAIN));
cv_Reg_MSE(i) = crossval('mse',X,Y,'predfun',regf, 'kfold', 5);
end
Number_Lasso_Coefficients = mean(Coeff_Num);
disp(Number_Lasso_Coefficients)
MSE_Ratio = median(cv_Reg_MSE)/median(MSE);
disp(MSE_Ratio)
4.57
1.2831
Choosing the Best Technique
Regularization methods and feature selection techniques both have unique strengths and weaknesses. Let's close this blog
posting with some practical guidance regarding pros and cons for the various techniques.
Regularization techniques have two major advantages compared to feature selection.
Regularization techniques are able to operate on much larger datasets than feature selection methods. Lasso and ridge regression
can be applied to datasets that contains thousands - even tens of thousands of variables. Even sequential feature selection
is usually too slow to cope with this many possible predictors.
Regularization algorithms often generate more accurate predictive models than feature selection. Regularization operates
over a continuous space while feature selection operates over a discrete space. As a result, regularization is often able
to fine tune the model and produce more accurate estimates.
However, feature selection methods also have their advantages
Regularization tehcniques are only available for a small number of model types. Notably, regularization can be applied to
linear regression and logistic regression. However, if you're working some other modeling technique - say a boosted decision
tree - you'll typically need to apply feature selection techiques.
Feature selection is easier to understand and explain to third parties. Never underestimate the importance of being able to
describe your methods when sharing your results.
With this said and done, each of the three regularization techniques also offers its own unique advantages and disadvantages.
Because lasso uses an L1 norm it tends to force individual coefficient values completely towards zero. As a result, lasso
works very well as a feature selection algorithm. It quickly identifies a small number of key variables.
In contrast, ridge regression uses an L2 norm for the coefficients (you're minimizing the sum of the squared errors). Ridge
regression tends to spread coefficient shrinkage across a larger number of coefficients. If you think that your model should
contain a large number of coefficients, ridge regression is probably a better choice than lasso.
Last, but not least, we have the elastic net which is able to compensate for a very specific limitation of lasso. Lasso is
unable to identify more predictors than you have coefficients.
Let's assume that you are running a cancer research study.
You have genes sequences for 500 different cancer patients
You're trying to determine which of 15,000 different genes have a signficant impact on the progression of the disease.
Sequential feature selection is completely impractical with this many different variables. You can't use ridge regression
because it won't force coefficients completely to zero quickly enough. At the same time, you can't use lasso since you might
need to identify more than 500 different genes. The elastic net is one possible solution.
This week Richard Willey from technical marketing will be guest blogging about subset selection and regularization. This week's blog posting is motivated by a pair of common challenges that occur in applied curve fitting.
The first challenge is how best to create accurate predictive models when your independent variables exhibit strong correlation.
In many cases you can improve upon the results of an ordinary least square regression if you reduce the number of predictors
or, alternatively, shrink the coefficient values towards zero.
The second challenge involves generating an accurate predictive model when you are constrained with respect to the number
of predictors you are working with. For example, assume that you need to embed your model onto a controller. You have 20
possible predictors to chose from, but you only have enough memory to allow for 8 independent variables.
This blog post will show two different sets of techniques to solve these related challenges. The first set of techniques
are based on a combination of feature selection and cross validation. The second set of techniques are use regularization
algorithms like ridge regression, lasso and the elastic net. All of these algorithms can be found in Statistics Toolbox.
To start with, I am going to generate a dataset that is deliberately engineered to be very difficult to fit using traditional
linear regression. The dataset has three characteristics that present challenges of linear regression
The dataset is very "wide". There are relatively large numbers of (possible) predictors compared to the number of observations.
The predictors are strongly correlated with one another
There is a large amount of noise in the dataset
clear all
clc
rng(1998);
Create eight X variables. The mean of each variable will be equal to zero.
mu = [0 0 0 0 0 0 0 0];
The variables are correlated with one another. The covariance matrix is specified as
i = 1:8;
matrix = abs(bsxfun(@minus,i',i));
covariance = repmat(.5,8,8).^matrix;
Use these parameters to generate a set of multivariate normal random numbers.
X = mvnrnd(mu, covariance, 20);
Create a hyperplane that describes Y = f(X) and add in a noise vector. Note that only three of the eight predictors have coefficient values > 0. This means that five
of the eight predictors have zero impact on the actual model.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Royal. Statist. Soc B., Vol. 58, No. 1, pages
267-288).
Let's model Y = f(X) using ordinary least squares regression. We'll start by using traditional linear regression to model Y = f(X) and compare the coefficients that were estimated by the regression model to the known vector Beta.
As expected, the regression model is performing very poorly.
The regression is estimating coefficient values for all eight predictors. Objectively, we "know" that five of these coefficients
should be zero.
Furthermore, the coefficients that are being estimated are very far removed from the true value.
Let's see if we can improve on these results using a technique called "All possible subset regression". First, I'm going to
generate 255 different regession models which encompass every possible combinate of the eight predictors. Next, I am going
to measure the accuracy of the resulting regression using the cross validated mean squared error. Finally, I'm going to select
the regression model that produced the most accurate results and compare this to our original regression model.
Create an index for the regression subsets.
index = dec2bin(1:255);
index = index == '1';
results = double(index);
results(:,9) = zeros(length(results),1);
Generate the regression models.
for i = 1:length(index)
foo = index(i,:);
rng(1981);
regf = @(XTRAIN, YTRAIN, XTEST)(XTEST*regress(YTRAIN,XTRAIN));
results(i,9) = crossval('mse', X(:,foo), Y,'kfold', 5, 'predfun',regf);
end
Sort the outputs by MSE. I'll show you the first few.
Our new model only includes 4 predictors. Variables 3,4, 7 and 8 have been eliminated from the regression model. Furthermore,
the variables that were exluded are ones that we "know" should be zeroed out.
Equally significant, the coefficients that are being estimated from the resulting regression model are much closer to the
"true" values.
All possible subset regression appears to have generated a significantly better model. This R^2 value for this regression model isn't as good as the original linear regression; however, if we're trying to generate predictions
from a new data set, we'd expect this model to perform significantly better.
Perform a Simulation
It's always dangerous to rely on the results of a single observation. If we were to change either of the noise vectors that
we used to generate our original dataset we might end up with radically different results. This next block of code will repeat
our original experiment a hundred times, each time adding in different noise vectors. Our output will be a bar chart that
shows how frequently each of the different variables was included in the resulting model. I'm using parallel computing to
speed things up though you could perform this (more slowly) in serial.
matlabpool open
sumin = zeros(1,8);
parfor j=1:100
mu = [0 0 0 0 0 0 0 0];
Beta = [3; 1.5; 0; 0; 2; 0; 0; 0];
i = 1:8;
matrix = abs(bsxfun(@minus,i',i));
covariance = repmat(.5,8,8).^matrix;
X = mvnrnd(mu, covariance, 20);
Y = X * Beta + 3 * randn(size(X,1),1);
index = dec2bin(1:255);
index = index == '1';
results = double(index);
results(:,9) = zeros(length(results),1);
for k = 1:length(index)
foo = index(k,:);
regf = @(XTRAIN, YTRAIN, XTEST)(XTEST*regress(YTRAIN,XTRAIN));
results(k,9) = crossval('mse', X(:,foo), Y,'kfold', 5, ...'predfun',regf);
end% Sort the outputs by MSE
index = sortrows(results, 9);
sumin = sumin + index(1,1:8);
end% parfor
matlabpool close
Starting matlabpool using the 'local4' configuration ... connected to 3 labs.
Sending a stop signal to all the labs ... stopped.
%Plot results.
bar(sumin)
The results clearly show that "All possible subset selection" is doing a really good job identifying the key variables that
drive our model. Variables 1, 2, and 5 show up in most of the models. Individual distractors work their way into the models
as well, however, their frequency is no where near as high as the real predictors.
Introducing Sequential Feature Selection
All possible subset selection suffers from one enormous limitation. It's computationally infeasible to apply this technique
to datasets that contain large numbers of (potential) predictors. Our first example had eight possible predictors. Testing
all possible subsets required generate (2^8)-1 = 255 different models. Suppose that we were evaluating a model with 20 possible
predictors. All possible subsets would require use to test 1,048,575 different models. With 40 predictors we'd be looking
at a billion different combinations.
Sequential feature selection is a more modern approach that tries to define a smart path through the search space. In turn,
this should allow us to identify a good set of cofficients but ensure that the problem is still computationally feasible.
Sequential feature selection works as follows:
Start by testing each possible predictor one at a time. Identify the single predictor that generates the most accurate model.
This predictor is automatically added to the model.
Next, one at a time, add each of the remaining predictors to a model that includes the single best variable. Identify the
variable that improves the accuracy of the model the most.
Test the two models for statistical significance. If the new model is not significantly more accurate that the original model,
stop the process. If, however, the new model is statistically more significant, go and search for the third best variable.
Repeat this process until you can't identify a new variable that has a statistically significant impact on the model.
% This next bit of code will show you how to use sequential feature% selection to solve a significantly larger problem.
Generate a Data Set
clear all
clc
rng(1981);
Z12 = randn(100,40);
Z1 = randn(100,1);
X = Z12 + repmat(Z1, 1,40);
B0 = zeros(10,1);
B1 = ones(10,1);
Beta = 2 * vertcat(B0,B1,B0,B1);
ds = dataset(Beta);
Y = X * Beta + 15*randn(100,1);
Use linear regression to fit the model.
b = regress(Y,X);
ds.Linear = b;
Use sequential feature selection to fit the model.
opts = statset('display','iter');
fun = @(x0,y0,x1,y1) norm(y1-x1*(x0\y0))^2; % residual sum of squares
[in,history] = sequentialfs(fun,X,Y,'cv',5, 'options',opts)
Start forward sequential feature selection:
Initial columns included: none
Columns that can not be included: none
Step 1, added column 26, criterion value 868.909
Step 2, added column 31, criterion value 566.166
Step 3, added column 9, criterion value 425.096
Step 4, added column 35, criterion value 369.597
Step 5, added column 16, criterion value 314.445
Step 6, added column 5, criterion value 283.324
Step 7, added column 18, criterion value 269.421
Step 8, added column 15, criterion value 253.78
Step 9, added column 40, criterion value 242.04
Step 10, added column 27, criterion value 234.184
Step 11, added column 12, criterion value 223.271
Step 12, added column 17, criterion value 216.464
Step 13, added column 14, criterion value 212.635
Step 14, added column 22, criterion value 207.885
Step 15, added column 7, criterion value 205.253
Step 16, added column 33, criterion value 204.179
Final columns included: 5 7 9 12 14 15 16 17 18 22 26 27 31 33 35 40
in =
Columns 1 through 12
0 0 0 0 1 0 1 0 1 0 0 1
Columns 13 through 24
0 1 1 1 1 1 0 0 0 1 0 0
Columns 25 through 36
0 1 1 0 0 0 1 0 1 0 1 0
Columns 37 through 40
0 0 0 1
history =
In: [16x40 logical]
Crit: [1x16 double]
I've configured sequentialfs to show each iteration of the feature selection process. You can see each of the variables as it gets added to the model,
along with the change in the cross validated mean squared error.
Generate a linear regression using the optimal set of features
As you can see, sequentialfs has generated a much more parsimonious model than the ordinary least squares regression. We also have a model that should
generate significantly more accurate predictions. Moreover, if we needed to (hypothetically) extract the five most important
variables we could simple grab the output from Steps 1:5.
Run a Simulation
As a last step, we'll once again run a simulation to benchmark the technique. Here, once again, we can see that the feature
selection technique is doing a very good job identifying the true predictors while screening out the distractors.
matlabpool open
sumin = 0;
parfor j=1:100
Z12 = randn(100,40);
Z1 = randn(100,1);
X = Z12 + repmat(Z1, 1,40);
B0 = zeros(10,1);
B1 = ones(10,1);
Beta = 2 * vertcat(B0,B1,B0,B1);
Y = X * Beta + 15*randn(100,1);
fun = @(x0,y0,x1,y1) norm(y1-x1*(x0\y0))^2; % residual sum of squares
[in,history] = sequentialfs(fun,X,Y,'cv',5);
sumin = in + sumin;
end
bar(sumin)
matlabpool close
Starting matlabpool using the 'local4' configuration ... connected to 3 labs.
Sending a stop signal to all the labs ... stopped.
One last comment: As you can see, neither sequentialfs nor all possible subset selection is perfect. Neither of these techniques was able to capture all of the true variables
while excluding the distractors. It's important to recognize that these data sets were deliberately designed to be difficult
to fit. [We don't want to benchmark performance using "easy" problems where any old technique might succeed.]
Next week's blog post will focus on a radically different way to solve these same types of problems. Stay tuned for a look
at regularization techniques such as ridge regression, lasso and the elastic net.
Here are a couple questions that you might want to consider: # All possible subset methods are O(2^n). What is the order
of complexity of sequential feature selection? # Other than scaling issues, can you think of any possible problems with sequential
feature selection? Are there situations where the algorithm might miss important variables?
I am pleased to introduce guest blogger Arvind Ananthan. Arvind is the Product Marketing Manager for MATLAB Coder and Fixed-Point Toolbox. His main focus in this post is to introduce basics of MATLAB Coder, talk about the workflow, its use cases, and show examples
of generated C code.
In April, 2011, MathWorks introduced MATLAB Coder as a stand-alone product to generate C code from MATLAB code. This tool
lets user generate readable, portable, and customizable C code from their MATLAB algorithms.
Many astute readers will notice that C code generation from MATLAB isn't really brand new - and that we've had this capability
to generate C code from MATLAB for quite some time now. Yes, that's true - we've been incubating this technology for quite
some time till we felt it was ready to debut as a stand alone tool. Here's a timeline of this technology over the past few
years:
2004 - Introduced the Embedded MATLAB Function block in Simulink
2007 - Introduced the emlc function in Real-Time Workshop (now called Simulink Coder) to generate stand alone C from MATLAB
2011 - Released MATLAB Coder, the first stand-alone product from MathWorks to generate portable and readable C code from MATLAB
code.
A Simple Example
Let me introduce the basics of using MATLAB Coder through a very simple example that multiplies two variables. Let's generate
C code from the following MATLAB function that multiplies two inputs:
dbtype simpleProduct.m
1 function c = simpleProduct(a,b) %#codegen
2 c = a*b;
To generate C code from this function using MATLAB Coder, I first have to specify the size and data types of my inputs - I
can do this through the MATLAB Coder UI (shown in the section below) where I specify my inputs as a [1x5] and [5x2] single
precision matrices, and subsequently generate the following C code:
MATLAB is a polymorphic language which means a single function, such as simpleProduct, can accept input arguments of different size and data types and output correct results. This function can behave as a simple
scalar product, a dot product, or a matrix multiplier depending on what inputs you pass.
Languages like C are said to be “stongly typed,” which requires you to create a different version of a function for every
variation of input data size and types. Hence, when you write C code to implement simpleProduct, you have to know ahead of time the sizes and the data types of your inputs so you can implement the right variant.
While MATLAB Coder helps you move from the highly flexible world of MATLAB to a strongly typed world of C, you still have
to specify all of the constraints C expects. You do this in MATLAB Coder by creating a MATLAB Coder project file (simpleProduct.prj in this case) where you can specify the various code generation parameters including the sizes and data types of your inputs
as shown below.
You can also edit the default configuration parameters by clicking on the 'More Settings' link which appears on the 'Build' tab of this project UI .
In the example above, I turned off a few default options such as including comments in the generated code just so you can
see how compact the code is. You not only get the option to generate comments in the resulting C code, but also inlcude the
original MATLAB code as comments in the corresponding sections of the generated code. This helps with traceability of your
C code to your original algorithms.
Detailed MATLAB Coder Worfklow
The simple example above quickly illustrates the process of generating code with MATLAB coder and shows how the resulting
C code looks.
Naturally, your real-world functions are going to be much more involved and may run into hundreds or even thousands of lines
of MATLAB Code. To help you handle that level of complexity, you would need an iterative process, like the 3-step workflow
described here, that guides you through the task of code generation incrementally:
Prepare: First, prepare your code to ensure that you can indeed generate code from your MATLAB algorithm. The convenience of MATLAB
language doesn't map directly to the constrained behavior of C. You may have to re-write portions of your MATLAB code so it
uses the MATLAB language features that support this mapping from MATLAB to C.
Test & Verify: Next, you generate a MEX function to test that your preparation step is correct. If the tool is successful in generating
a MEX function then you are ready to verify the results of this MEX function against the original MATLAB code. If not, you'd
have to iterate on the previous step till you can successfully generate a MEX function.
Generate: Finally, you generate C source code and further iterate upon your MATLAB code in order to control the look and feel or the
performance of your C code.You can also generate an optimized MEX function by turning off certain memory integrity checks
and debug options that could slow down its execution at this stage.
I'll now demonstrate this concept using a slightly more involved example. The following MATLAB code implements the Newton-Raphson
numerical technique for computing the n-th root of a real valued number.
dbtype newtonSearchAlgorithm.m
1 function [x,h] = newtonSearchAlgorithm(b,n,tol) %#codegen
2 % Given, "a", this function finds the nth root of a
3 % number by finding where: x^n-a=0.
4
5 notDone = 1;
6 aNew = 0; %Refined Guess Initialization
7 a = 1; %Initial Guess
8 count = 0;
9 h(1)=a;
10 while notDone
11 count = count+1;
12 [curVal,slope] = fcnDerivative(a,b,n); %square
13 yint = curVal-slope*a;
14 aNew = -yint/slope; %The new guess
15 h(count)=aNew;
16 if (abs(aNew-a) < tol) %Break if it's converged
17 notDone = 0;
18 elseif count>49 %after 50 iterations, stop
19 notDone = 0;
20 aNew = 0;
21 else
22 a = aNew;
23 end
24 end
25 x = aNew;
26
27 function [f,df] = fcnDerivative(a,b,n)
28 % Our function is f=a^n-b and it's derivative is n*a^(n-1).
29
30 f = a^n-b;
31 df = n*a^(n-1);
Our first step towards generating C code from this file is to prepare it for code generation. For code generation, each variable
inside your MATLAB code must be intialized, which means specifying its size and data type. In this case, I'll choose to initialize
the variable h to a static maximum size, h = zeros(1,50), as it doesn't grow beyond a length of 50 inside the for loop.
You can invoke the code generation function using the GUI or the command line (through the codegen command). Without worrying about the details of this approach, let's look at the generated C code:
The subfunction fcnDerivative is inlined in the generated C code. You can choose to not inline the code by putting this command coder.inline('never') in the MATLAB file.
Support for Dynamic Sizing
If you have variable in your MATLAB code that needs to vary its size during execution, you can choose to deal with this in
three different ways in the generated C code:
Static allocation with fixed maximum size: you can initialize variables to the maximum possible static size (like I did in the example above). In the generated code,
memory is prealloacted to this size.
Variable sizing with maximum size allocation: this option will declare the memory for the variable in the generated code to its maximum possible size, but you can dynamically
grow or shrink the variable size within this allocated maximum.
Variable sizing with dynamic memory allocation: this results in the generated code using malloc to allocate the memory for the variables that change in size during code execution.
The last two options can be enabled by turning on the variable-sizing feature in the configuration parameters. However, do remember that enabling this option also makes the resulting C code bigger
in size - so if you can avoid it, you can get much more compact and possibly more efficient code.
Use Cases of MATLAB Coder
The primary use of MATLAB Coder is to enable algorithm developers and system engineers working in MATLAB to quickly generate
readable and portable C code. The generated C code can be used in different ways supporting different workflows. Here are
a few use cases of MATLAB Coder:
Create standalone executables from your algorithms for prototyping on a PC
Speed up your MATLAB algorithm code (or portions of it) by generating and executing the MEX functions
Integrate your MATLAB algorithms as C source code or compiled library with your hand-written software
I won't be talking about creating standalone executables or creating libraries from MATLAB Coder in this blog; I'll, however,
say a few words on generating MEX functions.
Generating MEX Functions for Simulation Acceleration
The MEX interface enables you to integrate C code into MATLAB. One common use of MEX is to speed up performance bottlenecks
in MATLAB code by re-writing them in C and executing them back in MATLAB as MEX functions. MATLAB Coder can save you time
and effort by automatically generating MEX functions from your MATLAB code (and allowing you to import legacy C code easily
as well).
The speed-up you get through automatic MEX generation can vary quite a bit depending on the application. In some cases, you
may not get any speed up at all, or possibly even a slow-down, as MATLAB language has gotten quite smart and efficient in
computing many built-in functions by automatically taking advantage of processor specific routines (such as Intel Performance
Primitives) and multithreading to utilize multiple cores.
A few guidelines that you can use to determine if your algorithm is a good candidate for speed-up with MEX are:
your algorithm contains multiply nested for loops, and possibly handles state calculations (making each iteration dependent on the previous one)
your MATLAB code is hard/impossible to vectorize
most of the processing cycles in your algorithm are not spent on already optimized built-in functions such as fft or inv, etc.
you don't rely heavily on toolbox functions (especially those unsupported for code generation) in your algorithms
In these cases, you can expect to see a speed-up. Let's look at a realistic example that illustrates this concept.
The example I chose here is one that my colleague Sarah Zaranek had highlighted in her guest blog on vectorization.
I modified the original MATLAB script from that article into a function as only functions are supported by MATLAB Coder. The
other change that I made to the original script was to initialize the output variables subs and vals (as all variables in your code has to be defined/intialized once when used with MATLAB Coder). And by turning on the variable
sizing feature (using dynamic memory allocation), I didn't have to preallocate it to a sufficiently large enough size to accomodate
its growth, which would use more memory (often a lot) than needed.
The following statements create and use a MATLAB Coder configuration object with support for variable size arrays and dynamic
memory allocation to generate a MEX function.
% Create a MEX configuration object
cfg = coder.config('mex');
% Turn on dynamic memory allocation
cfg.DynamicMemoryAllocation = 'AllVariableSizeArrays';
% Generate MEX function
codegen -configcfggridSpeedUpFcn-args{10}
disp('MEX generation complete!')
Warning: There is no short path form of 'H:\Documents\LOREN\MyJob\Art of
MATLAB' available. Short path names do not include embedded spaces. The
short path name feature may be disabled in your operating system. Short
path names are required for successful code generation, therefore it may
not be possible to successfully complete the code generation process. To
avoid this problem, enable short path name support in your operating
system, or do not attempt code generation in paths containing spaces.
MEX generation complete!
This generates a MEX function in the current folder. I'll use tic, and toc to estimate the execution times of the original MATLAB function and its MEX version.
I run each function multiple times inside a for loop and use only the last few runs for our computation time calculation so we can minimize the effects of initialization
and caching.
MATLAB code execution time (with nx=10): 0.31652
MEX code executing time (with nx=10): 0.0027057
Speed up factor: 116.9806
The speed up we see in this example is more pronounced for larger values of nx (and would vary between different computers).
Vectorization and Code Generation
Sarah, in her blog article, explains quite nicely on how to vectorize this code using meshgrid and ndgrid to explode the variables in support of vectorization, and get similar levels of speed up.
Sarah paid a small price in code readability for her vectorization efforts, but a much larger price in memory consumption.
This points to a common trade-off when vectorizing MATLAB code: execution time vs. memory consumption. The temporary variables
which enable us to remove loops through matrix math are often very large, effectively limiting the number of iterations you
can take before running out of memory. For instance, if we run her algorithm for a 1,000x1,000 grid instead of a 100x100 grid,
we would create 9 double precision variables that are about 7.5 GB each!
Code generation can effectively deliver similar speed-up benefits (and with less effort in some cases), as it's trying to
explicitly achieve the same for loop operations in C code that a vectorized code achieves. However, the downside for most users with the code generation
approach is that it's available with a separate product (higher cost), and only MATLAB code that supports code generation
can automatically be converted to MEX as well.
Customizing and Optimizing the Generated Code
Now with most of the basics of code generation out of the way, a few questions might be on your mind:
"If I want the resulting code to look different, can I just edit the generated code?" You can certainly edit the generated code; as you have just seen in the examples shown here, the code is very readable. However,
once you modify it, you lose the connectivity to the original MATLAB code that generated it.
"How can I optimize the code, or customize its look and feel?" One easy way to modify the generated code is to change MATLAB Coder configuration parameters. Another way to customize the
generated code is to directly edit the MATLAB code such as explicitly breaking up certain compact vector/matrix operations
into for loops or other constructs you'd like to see in the generated code.
"What about incorporating hand-optimized or processor specific intrinsics for certain portions of the code?"coder.ceval is a command in MATLAB Coder that let's you integrate custom C code into your MATLAB code for both simulation and code generation.
Using this you can use hand-optimized or legacy C code in your algorithms. Embedded Coder is another product that adds many code optimization and customization capabilities to MATLAB Coder such as using code replacement
technology to incorporate optimized C code. It also lets you to customize the look-and-feel of the code.
Learning More About MATLAB Coder
You can find more information about MATLAB Coder on the product page, which includes demos and recorded webinars. The product documentation is also an excellent resource to find answers to questions you might have when starting with this product.
Please feel free to share your thoughts and comments on topics discussed here in this posting here.
In many cases, particularly for console applications on Windows and UNIX, this works well enough: the output appears in the DOS window or UNIX terminal. However, non-console
applications (and any type of shared library or builder component) may be unable to access either of the standard output streams; messages sent to those streams will be lost. To avoid losing
messages when the standard streams are unavailable, deployed applications register output and error message handlers. A message handler directs formatted text messages to a persistent or visible location, typically a log file or an output
window.
In this post, I'll show you how to send both error and informational messages to a log file.
A Function with Output
To demonstrate output redirection, I wrote the sendmessages function, which displays an informational message and a warning, and then issues an error:
function sendmessages(info, warn, err)
disp(info);
warning(warn);
error(err);
In order to follow the rest of this posting, you'll need the source code from MATLAB Central, particularly the C main program, log.c. See logREADME in the ZIP file for a complete description of distributed files and step-by-step directions for building the application.
Where are the Handlers?
Create a C shared library from the example function with MATLAB Compiler:
Open the generated wrapper file, libmsgfcn.c, in a text editor. The wrapper file contains two default handlers, mclDefaultErrorHandler and mclDefaultPrintHandler. The generated code contains two library initialization functions: libmsgfcnInitialize and libmsgfcnInitializeWithHandlers. You redirect program output by passing your custom message handlers to libmsgfcnInitializeWithHandlers. All message handlers conform to the same interface:
static int MessageHandler(const char *message)
You never need to call the message handler directly. The MATLAB Compiler Runtime (MCR) calls your message handler automatically.
Logging Messages to a File
A common use for a message handler is to record messages and the time they occurred in a log file. This type of message handler
requires only a few lines of C code:
static FILE *logFile = NULL;
static int LogFileHandler(const char *s)
{
if (logFile != NULL)
{
time_t now = time(NULL);
fprintf(logFile, "%s", ctime(&now));
fprintf(logFile, "%s\n", s);
}
}
Your main program needs to open and close the log file, of course. Open the log file before calling the library initialization
function and close it after calling mclTerminateApplication. See the full implementation in log.c.
The example program creates the log file MessageLog.txt to capture program output.
Fri Aug 19 11:53:32 2011
information
Fri Aug 19 11:53:32 2011
Warning: lookout!
> In sendmessages at 3
Fri Aug 19 11:53:32 2011
Error using sendmessages (line 4)
oops.
Next time, I'll show you how to display log messages in a persistent window, and how to distinguish between errors and informational
messages. In the meantime, let me know how your applications manage output. Does your code need to distinguish between program output and error messages?
Today I’d like to introduce my guest bloggers Sarah Wait Zaranek and Jiro Doke. Sarah previously has written about speeding up code from a customer to get acceptable performance and Jiro has previously written about making presentation
quality graphics and is a co-author of File Exchange Pick of the Week.
In some parts of the world, people celebrate Halloween, which occurs on October 31. In the spirit (pun intended) of the holiday, we created a Halloween-inspired function.
If you have ever used why in MATLAB, you will definitely recognize the other inspiration for our function.
Let's see what it does
I wonder what Loren is going to be for Halloween!
halloween Loren
Loren, you are scary enough without a costume!
How about not just any Loren, but our Loren?
halloween LorenShure
Loren Shure, for Halloween, you will be a screaming Gremlin.
Oh my! What about Sarah and Jiro?
halloween Sarah
halloween Jiro
Sarah, for Halloween, you will be a sleepy Bogeyman swimming in blood.
Jiro, for Halloween, you will be a Dracula baring teeth to the ends of the earth.
Note that the function works with both Command and Function syntax, which is explained here.
What happens if we don't put a name in?
halloween
Nameless, you are scary enough without a costume!
It still works. Looks like Nameless is going to have a fun holiday.
Dissecting the function
The main function does several things
loads Halloween data from a MAT-file
generates a Halloween name based on the name entered and the current time
prints the Halloween name to the screen
It also includes an "easter egg".
Let's look inside halloween.
type halloween
function halloween(varargin)
% HALLOWEEN Print out your Halloween name.
% HALLOWEEN NAME prints a randomly generated Halloween name based on
% NAME (optional). NAME must be a character array or multiple character
% arrays.
%
% Example:
% halloween('Loren')
% halloween('Sarah Zaranek')
% halloween Jiro Doke
% Copyright 2011 The MathWorks, Inc.
% Default name
if nargin == 0
name = 'Nameless';
else
% Make sure all inputs are character arrays
cellfun(@(x,y) validateattributes(x, {'char'}, {'row'}, y), ...
varargin, num2cell(1:nargin));
% Construct a single name from multiple strings
name = deblank(sprintf('%s ', varargin{:}));
end
% Special case
[~, ~, ~, ~, ~, sec] = datevec(now);
rng(sec*1000);
if randi(100) > 95 % 5 percent chance of getting this special message.
fprintf('%s, you are scary enough without a costume!', name);
return;
end
% Set seed based on name and time.
seed = sum(name) + floor(now);
rng(seed);
% Load Halloween data
data = load('HalloweenData');
% Generate Halloween name
halloween_name = getName(data);
% Print out Halloween name
printName(name, halloween_name);
We will dig into getName and printName shortly, but let's just see what halloween does.
Default Name
First, we check to see if the user supplied a name. This uses nargin which is highly useful when writing your own custom functions. nargin gives the number of arguments the user inputs when calling the function. If it is zero, we assign a default name of "Nameless".
This is an easy way to set default behavior for your function - allowing users to either run it with or without input arguments.
Take a look at this previous post by Loren on this topic.
If an input (or inputs) is provided, we do a check to see if they are valid by using validateattributes. Also, to provide a nice way of calling using Command syntax, we allow multiple input arguments to construct a longer string
(see the example above with "Loren Shure").
if nargin == 0
name = 'Nameless';
else
% Make sure all inputs are character arrays
cellfun(@(x,y) validateattributes(x, {'char'}, {'row'}, y), ...
varargin, num2cell(1:nargin));
% Construct a single name from multiple strings
name = deblank(sprintf('%s ', varargin{:}));
end
Easter Egg
Then, we generate a random number based on the current time (using now) to let us print a special message (Easter Egg) roughly 5% of the time. We use the new-ish rng function which allows us to easily set a single seed for rand, randi, and randn. The seed is generated using the millisecond output from now.
[~, ~, ~, ~, ~, sec] = datevec(now);
rng(sec*1000);
if randi(100) > 95
fprintf('%s, you are scary enough without a costume!', name);
return;
end
Set Random Seed
Then if the special case isn't reached, we generate a new seed for our random number generator. This time, it is based off
of the input name and the time (again given by now, but this time we just take the day information). We add up the letters of the name to create a single number, using sum. Note that if the array is a character string, the result is automatically converted to a double.
seed = sum(name) + floor(now);
rng(seed);
Create Halloween Name
We are using the work by Steven Savage as inspiration. The Halloween name can have 4 different parts:
a creature (like "ghost")
a descriptor (like "ghastly")
an action (like "floating")
a phrase (like "above your head")
You are always a creature, and you have one of the following three configurations:
a creature + a descriptor
a creature + an action
a creature + a descriptor and an action
We use randi to randomly choose between the three configurations. Since the output of the function is 1, 2, or 3, this is a perfect case
for switch - case construct. Within each case block, we call randi again (we love that function!) to randomly select the specific entry from the loaded list of descriptors and actions. In
the "Action" block, we call randi once again to add a prepositional phrase 50% of the time.
configID = randi(3);
switch configID
case 1 % Descriptor
<choose a descriptor>
case 2 % Action
<choose an action>
if randi(2) == 1 % Additional phrase
<choose a phrase>
end
case 3 % Descriptor + Action
<choose a descriptor and an action>
end
Let's see the whole function. Note that we sorted the Halloween name parts at the beginning, because Sarah likes things in
alphabetical order. The creature names were inspired by this web site.
type getName
function halloween_name = getName(data)
% Copyright 2011 The MathWorks, Inc.
% Sort data alphabetically
data.Creatures = sort(data.Creatures);
data.Descriptors = sort(data.Descriptors);
data.Actions = sort(data.Actions);
data.Phrases = sort(data.Phrases);
% Choose a random creature
creatureID = randi(length(data.Creatures));
% Choose one of 3: Descriptor, Action, Descriptor+Action
configID = randi(3);
switch configID
case 1 % Descriptor
descriptorID = randi(length(data.Descriptors));
halloween_name = [data.Descriptors{descriptorID}, ' ', ...
data.Creatures{creatureID}];
case 2 % Action
actionID = randi(length(data.Actions));
halloween_name = [data.Creatures{creatureID}, ' ', ...
data.Actions{actionID}];
% Additional phrase
if randi(2) == 1
rand_phrase = randi(length(data.Phrases));
halloween_name = [halloween_name, ' ', ...
data.Phrases{rand_phrase}];
end
case 3 % Descriptor + Action
descriptorID = randi(length(data.Descriptors));
actionID = randi(length(data.Actions));
halloween_name = [data.Descriptors{descriptorID}, ' ', ...
data.Creatures{creatureID}, ' ', data.Actions{actionID}];
end
Print Halloween Name
Finally, we print out your Halloween name to the screen. We do a simple grammar check to use the appropriate article based
on the first letter of the name.
type printName
function printName(name, halloween_name)
% Copyright 2011 The MathWorks, Inc.
% Check the first letter of the name
if ismember(lower(halloween_name(1)), {'a', 'e', 'i', 'o', 'u'})
fprintf('%s, for Halloween, you will be an %s.\n', name, halloween_name);
else
fprintf('%s, for Halloween, you will be a %s.\n', name, halloween_name);
end
The files
Download all the necessary files here to try it out yourself! You can use your own words by creating a MAT file named "HalloweenData.mat" containing four cell
arrays: Creatures, Descriptors, Actions, and Phrases.
Fun with RAND
What kind of interesting things have you done using a random number generator? Tell us about it here.
Recent Comments