{"id":1211,"date":"2015-08-04T14:41:47","date_gmt":"2015-08-04T19:41:47","guid":{"rendered":"https:\/\/blogs.mathworks.com\/loren\/?p=1211"},"modified":"2018-11-04T20:04:51","modified_gmt":"2018-11-05T01:04:51","slug":"artificial-neural-networks-for-beginners","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/loren\/2015\/08\/04\/artificial-neural-networks-for-beginners\/","title":{"rendered":"Artificial Neural Networks for Beginners"},"content":{"rendered":"<div class=\"content\"><!--introduction--><p><a href=\"https:\/\/en.wikipedia.org\/wiki\/Deep_learning\">Deep Learning<\/a> is a very hot topic these days, especially in computer vision applications, and you have probably seen it in the news and gotten curious. Now the question is, how do you get started with it? Today's guest blogger, Toshi Takeuchi, gives us a quick tutorial on <a href=\"https:\/\/en.wikipedia.org\/wiki\/Artificial_neural_network\">artificial neural networks<\/a> as a starting point for your study of deep learning.<\/p><!--\/introduction--><h3>Contents<\/h3><div><ul><li><a href=\"#1168dbb4-1365-4b63-8326-140263e2072f\">MNIST Dataset<\/a><\/li><li><a href=\"#4c67c5b3-c5b2-4f81-967c-8aab28a1a8ab\">Data Preparation<\/a><\/li><li><a href=\"#cdcd71d6-a1b2-46e3-a963-ccc7e08a699d\">Using the Deep Learning Toolbox GUI App<\/a><\/li><li><a href=\"#b1e50c47-53e5-4a09-a45f-b7286640feb6\">Visualizing the Learned Weights<\/a><\/li><li><a href=\"#c1bdce45-d5ec-4683-a80a-39887fdb374c\">Computing the Categorization Accuracy<\/a><\/li><li><a href=\"#f518ac98-768f-48c5-a6dd-e3769161e2a2\">Network Architecture<\/a><\/li><li><a href=\"#85b25a96-41ab-4044-8f06-51b712253669\">The Next Step - an Autoencoder Example<\/a><\/li><li><a href=\"#91869b46-1d87-4662-a74e-3ae83ec310b8\">Sudoku Solver: a Real-time Processing Example<\/a><\/li><li><a href=\"#b7442718-32fb-4db8-9016-93a0e71903a6\">Submitting Your Entry to Kaggle<\/a><\/li><li><a 
href=\"#24263b1d-b7e0-4b29-bbf1-e3770536c607\">Closing<\/a><\/li><\/ul><\/div><h4>MNIST Dataset<a name=\"1168dbb4-1365-4b63-8326-140263e2072f\"><\/a><\/h4><p>Many of us tend to learn better with a concrete example. Let me give you a quick step-by-step tutorial to build intuition using the popular <a href=\"http:\/\/yann.lecun.com\/exdb\/mnist\/index.html\">MNIST handwritten digit dataset<\/a>. Kaggle happens to use this very dataset in the <a href=\"https:\/\/www.kaggle.com\/c\/digit-recognizer\">Digit Recognizer<\/a> tutorial competition. Let's use it in this example. You can download the competition dataset from the <a href=\"https:\/\/www.kaggle.com\/c\/digit-recognizer\/data\">\"Get the Data\"<\/a> page:<\/p><div><ul><li>train.csv - training data<\/li><li>test.csv  - test data for submission<\/li><\/ul><\/div><p>Load the training and test data into MATLAB; I assume the files were downloaded into the current folder. The test data is used to generate your submissions.<\/p><pre class=\"codeinput\">tr = csvread(<span class=\"string\">'train.csv'<\/span>, 1, 0);                  <span class=\"comment\">% read train.csv<\/span>\r\nsub = csvread(<span class=\"string\">'test.csv'<\/span>, 1, 0);                  <span class=\"comment\">% read test.csv<\/span>\r\n<\/pre><p>The first column is the label that shows the correct digit for each sample in the dataset, and each row is a sample. In the remaining columns, a row represents a 28 x 28 image of a handwritten digit, but all pixels are placed in a single row, rather than in the original rectangular form. To visualize the digits, we need to reshape the rows into 28 x 28 matrices. 
You can use <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/reshape.html\">reshape<\/a> for that, except that we need to transpose the data, because <tt>reshape<\/tt> operates column-wise rather than row-wise.<\/p><pre class=\"codeinput\">figure                                          <span class=\"comment\">% plot images<\/span>\r\ncolormap(gray)                                  <span class=\"comment\">% set to grayscale<\/span>\r\n<span class=\"keyword\">for<\/span> i = 1:25                                    <span class=\"comment\">% preview first 25 samples<\/span>\r\n    subplot(5,5,i)                              <span class=\"comment\">% plot them in a 5 x 5 grid<\/span>\r\n    digit = reshape(tr(i, 2:end), [28,28])';    <span class=\"comment\">% row = 28 x 28 image<\/span>\r\n    imagesc(digit)                              <span class=\"comment\">% show the image<\/span>\r\n    title(num2str(tr(i, 1)))                    <span class=\"comment\">% show the label<\/span>\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2015\/neuralnetFinal_01.png\" alt=\"\"> <h4>Data Preparation<a name=\"4c67c5b3-c5b2-4f81-967c-8aab28a1a8ab\"><\/a><\/h4><p>You will be using the <a href=\"https:\/\/www.mathworks.com\/help\/nnet\/ref\/nprtool.html\">nprtool<\/a> pattern recognition app from <a href=\"https:\/\/www.mathworks.com\/products\/deep-learning.html\">Deep Learning Toolbox<\/a>. The app expects two sets of data:<\/p><div><ul><li>inputs - a numeric matrix with each column representing a sample and each row a feature. These are the scanned images of handwritten digits.<\/li><li>targets - a numeric matrix of 0s and 1s that maps to the specific labels the images represent. This is also known as a dummy variable. 
Deep Learning Toolbox also expects labels stored in columns, rather than in rows.<\/li><\/ul><\/div><p>The labels range from 0 to 9, but we will use '10' to represent '0' because MATLAB indexing is 1-based.<\/p><pre class=\"language-matlab\">1 --&gt; [1; 0; 0; 0; 0; 0; 0; 0; 0; 0]\r\n2 --&gt; [0; 1; 0; 0; 0; 0; 0; 0; 0; 0]\r\n3 --&gt; [0; 0; 1; 0; 0; 0; 0; 0; 0; 0]\r\n            :\r\n0 --&gt; [0; 0; 0; 0; 0; 0; 0; 0; 0; 1]\r\n<\/pre><p>The dataset stores samples in rows rather than in columns, so you need to transpose it. Then you will partition the data so that you hold out 1\/3 of the data for model evaluation, and use only 2\/3 for training the artificial neural network model.<\/p><pre class=\"codeinput\">n = size(tr, 1);                    <span class=\"comment\">% number of samples in the dataset<\/span>\r\ntargets  = tr(:,1);                 <span class=\"comment\">% 1st column is |label|<\/span>\r\ntargets(targets == 0) = 10;         <span class=\"comment\">% use '10' to represent '0'<\/span>\r\ntargetsd = dummyvar(targets);       <span class=\"comment\">% convert label into a dummy variable<\/span>\r\ninputs = tr(:,2:end);               <span class=\"comment\">% the rest of columns are predictors<\/span>\r\n\r\ninputs = inputs';                   <span class=\"comment\">% transpose input<\/span>\r\ntargets = targets';                 <span class=\"comment\">% transpose target<\/span>\r\ntargetsd = targetsd';               <span class=\"comment\">% transpose dummy variable<\/span>\r\n\r\nrng(1);                             <span class=\"comment\">% for reproducibility<\/span>\r\nc = cvpartition(n,<span class=\"string\">'Holdout'<\/span>,n\/3);   <span class=\"comment\">% hold out 1\/3 of the dataset<\/span>\r\n\r\nXtrain = inputs(:, training(c));    <span class=\"comment\">% 2\/3 of the input for training<\/span>\r\nYtrain = targetsd(:, training(c));  <span class=\"comment\">% 2\/3 of the target for training<\/span>\r\nXtest = inputs(:, test(c));  
       <span class=\"comment\">% 1\/3 of the input for testing<\/span>\r\nYtest = targets(test(c));           <span class=\"comment\">% 1\/3 of the target for testing<\/span>\r\nYtestd = targetsd(:, test(c));      <span class=\"comment\">% 1\/3 of the dummy variable for testing<\/span>\r\n<\/pre><h4>Using the Deep Learning Toolbox GUI App<a name=\"cdcd71d6-a1b2-46e3-a963-ccc7e08a699d\"><\/a><\/h4><div><ol><li>You can start the Neural Network Start GUI by typing the command <a href=\"https:\/\/www.mathworks.com\/help\/nnet\/ref\/nnstart.html\">nnstart<\/a>.<\/li><li>You then click the Pattern Recognition Tool to open the Neural Network Pattern Recognition Tool. You can also use the command <a href=\"https:\/\/www.mathworks.com\/help\/nnet\/ref\/nprtool.html\">nprtool<\/a> to open it directly.<\/li><li>Click \"Next\" in the welcome screen and go to \"Select Data\".<\/li><li>For <tt>inputs<\/tt>, select <tt>Xtrain<\/tt> and for <tt>targets<\/tt>, select <tt>Ytrain<\/tt>.<\/li><li>Click \"Next\" and go to \"Validation and Test Data\". Accept the default settings and click \"Next\" again. This will split the data 70-15-15 into training, validation and testing sets.<\/li><li>In \"Network Architecture\", change the number of hidden neurons to 100 and click \"Next\" again.<\/li><li>In \"Train Network\", click the \"Train\" button to start the training. When finished, click \"Next\". Skip \"Evaluate Network\" and click \"Next\".<\/li><li>In \"Deploy Solution\", select \"MATLAB Matrix-Only Function\" and save the generated code. I saved it as <a href=\"https:\/\/blogs.mathworks.com\/images\/loren\/2015\/myNNfun.m\">myNNfun.m<\/a>.<\/li><li>If you click \"Next\" and go to \"Save Results\", you can also save the script as well as the model you just created. 
I saved the simple script as <a href=\"https:\/\/blogs.mathworks.com\/images\/loren\/2015\/myNNscript.m\">myNNscript.m<\/a>.<\/li><\/ol><\/div><p>Here is the diagram of the artificial neural network model you created with the Pattern Recognition Tool. It has 784 input neurons, 100 hidden layer neurons, and 10 output layer neurons.<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2015\/network_diagram.png\" alt=\"\"> <\/p><p>Through training, your model learns the weights that produce the correct output.<\/p><p><tt>W<\/tt> in the diagram stands for <i>weights<\/i> and <tt>b<\/tt> for <i>bias units<\/i>, which are part of individual neurons. Individual neurons in the hidden layer look like this: 784 inputs and corresponding weights, 1 bias unit, and an activation output that fans out to the 10 output-layer neurons.<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2015\/neuron.png\" alt=\"\"> <\/p><h4>Visualizing the Learned Weights<a name=\"b1e50c47-53e5-4a09-a45f-b7286640feb6\"><\/a><\/h4><p>If you look inside <tt>myNNfun.m<\/tt>, you see variables like <tt>IW1_1<\/tt> and <tt>x1_step1_keep<\/tt> that represent the weights your artificial neural network model learned through training. Because we have 784 inputs and 100 neurons, the full layer 1 weights will be a 100 x 784 matrix. Let's visualize them. 
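If you are curious what a single hidden neuron in that diagram actually computes, here is a minimal sketch: a weighted sum of the 784 inputs plus the bias, squashed by a sigmoid-like function (patternnet's default hidden-layer transfer function is tansig, which is mathematically tanh). The sketch is written in Python purely for illustration; all variable names and values here are made up, not taken from myNNfun.m:

```python
import math
import random

random.seed(0)

n_inputs = 784                                        # one weight per input pixel
x = [random.random() for _ in range(n_inputs)]        # a stand-in "image" as a flat vector
w = [random.gauss(0, 0.01) for _ in range(n_inputs)]  # this neuron's learned weights (fake here)
b = 0.1                                               # the neuron's bias unit

# weighted sum of inputs plus bias, squashed by tanh (MATLAB's tansig)
z = sum(wi * xi for wi, xi in zip(w, x)) + b
a = math.tanh(z)                                      # the neuron's activation, always in (-1, 1)
print(a)
```

Training adjusts `w` and `b` so that, collectively, the 100 hidden activations make the correct output neuron fire.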
This is what our neurons are learning!<\/p><pre class=\"codeinput\">load <span class=\"string\">myWeights<\/span>                          <span class=\"comment\">% load the learned weights<\/span>\r\nW1 = zeros(100, 28*28);                 <span class=\"comment\">% pre-allocation<\/span>\r\nW1(:, x1_step1_keep) = IW1_1;           <span class=\"comment\">% reconstruct the full matrix<\/span>\r\nfigure                                  <span class=\"comment\">% plot images<\/span>\r\ncolormap(gray)                          <span class=\"comment\">% set to grayscale<\/span>\r\n<span class=\"keyword\">for<\/span> i = 1:25                            <span class=\"comment\">% preview first 25 samples<\/span>\r\n    subplot(5,5,i)                      <span class=\"comment\">% plot them in a 5 x 5 grid<\/span>\r\n    digit = reshape(W1(i,:), [28,28])'; <span class=\"comment\">% row = 28 x 28 image<\/span>\r\n    imagesc(digit)                      <span class=\"comment\">% show the image<\/span>\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2015\/neuralnetFinal_02.png\" alt=\"\"> <h4>Computing the Categorization Accuracy<a name=\"c1bdce45-d5ec-4683-a80a-39887fdb374c\"><\/a><\/h4><p>Now you are ready to use <tt>myNNfun.m<\/tt> to predict labels for the held-out data in <tt>Xtest<\/tt> and compare them to the actual labels in <tt>Ytest<\/tt>. That gives you a realistic measure of predictive performance on unseen data. This is also the metric Kaggle uses to score submissions.<\/p><p>First, you see the actual output from the network, which shows the probability for each possible label. You simply choose the most probable label as your prediction and then compare it to the actual label. 
You should see 95% categorization accuracy.<\/p><pre class=\"codeinput\">Ypred = myNNfun(Xtest);             <span class=\"comment\">% predicts probability for each label<\/span>\r\nYpred(:, 1:5)                       <span class=\"comment\">% display the first 5 columns<\/span>\r\n[~, Ypred] = max(Ypred);            <span class=\"comment\">% find the indices of max probabilities<\/span>\r\nsum(Ytest == Ypred) \/ length(Ytest) <span class=\"comment\">% compare the predicted vs. actual<\/span>\r\n<\/pre><pre class=\"codeoutput\">ans =\r\n   1.3988e-09   6.1336e-05   1.4421e-07   1.5035e-07   2.6808e-08\r\n   1.9521e-05     0.018117   3.5323e-09   2.9139e-06    0.0017353\r\n   2.2202e-07   0.00054599     0.012391   0.00049678   0.00024934\r\n   1.5338e-09      0.46156   0.00058973   4.5171e-07   0.00025153\r\n   4.5265e-08      0.11546      0.91769   2.1261e-05   0.00031076\r\n   1.1247e-08      0.25335   1.9205e-06   1.1014e-06      0.99325\r\n   2.1627e-08    0.0045572    1.733e-08   3.7744e-07   1.7282e-07\r\n   2.2329e-09   7.6692e-05   0.00011479      0.98698   1.7328e-06\r\n   1.9634e-05    0.0011708     0.069215      0.01249   0.00084255\r\n      0.99996      0.14511   1.0106e-07   2.9687e-06    0.0033565\r\nans =\r\n      0.95293\r\n<\/pre><h4>Network Architecture<a name=\"f518ac98-768f-48c5-a6dd-e3769161e2a2\"><\/a><\/h4><p>You probably noticed that the artificial neural network model generated from the Pattern Recognition Tool has only one hidden layer. You can build a custom model with more layers if you would like, but this simple architecture is sufficient for most common problems.<\/p><p>The next question you may ask is how I picked 100 for the number of hidden neurons. The general rule of thumb is to pick a number between the number of input neurons, 784, and the number of output neurons, 10; I just picked 100 arbitrarily. That means you might do better if you try other values. Let's do this programmatically this time. 
<tt>myNNscript.m<\/tt> will be handy for this - you can simply adapt the script to do a parameter sweep.<\/p><pre class=\"codeinput\">sweep = [10,50:50:300];                 <span class=\"comment\">% parameter values to test<\/span>\r\nscores = zeros(length(sweep), 1);       <span class=\"comment\">% pre-allocation<\/span>\r\nmodels = cell(length(sweep), 1);        <span class=\"comment\">% pre-allocation<\/span>\r\nx = Xtrain;                             <span class=\"comment\">% inputs<\/span>\r\nt = Ytrain;                             <span class=\"comment\">% targets<\/span>\r\ntrainFcn = <span class=\"string\">'trainscg'<\/span>;                  <span class=\"comment\">% scaled conjugate gradient<\/span>\r\n<span class=\"keyword\">for<\/span> i = 1:length(sweep)\r\n    hiddenLayerSize = sweep(i);         <span class=\"comment\">% number of hidden layer neurons<\/span>\r\n    net = patternnet(hiddenLayerSize, trainFcn); <span class=\"comment\">% pattern recognition network<\/span>\r\n    net.divideParam.trainRatio = 70\/100;<span class=\"comment\">% 70% of data for training<\/span>\r\n    net.divideParam.valRatio = 15\/100;  <span class=\"comment\">% 15% of data for validation<\/span>\r\n    net.divideParam.testRatio = 15\/100; <span class=\"comment\">% 15% of data for testing<\/span>\r\n    net = train(net, x, t);             <span class=\"comment\">% train the network<\/span>\r\n    models{i} = net;                    <span class=\"comment\">% store the trained network<\/span>\r\n    p = net(Xtest);                     <span class=\"comment\">% predictions<\/span>\r\n    [~, p] = max(p);                    <span class=\"comment\">% predicted labels<\/span>\r\n    scores(i) = sum(Ytest == p) \/<span class=\"keyword\">...<\/span><span class=\"comment\">    % categorization accuracy<\/span>\r\n        length(Ytest);\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><p>Let's now plot how the categorization accuracy changes versus the number of neurons in the hidden 
layer.<\/p><pre class=\"codeinput\">figure\r\nplot(sweep, scores, <span class=\"string\">'.-'<\/span>)\r\nxlabel(<span class=\"string\">'number of hidden neurons'<\/span>)\r\nylabel(<span class=\"string\">'categorization accuracy'<\/span>)\r\ntitle(<span class=\"string\">'Number of hidden neurons vs. accuracy'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2015\/neuralnetFinal_03.png\" alt=\"\"> <p>It looks like you get the best result around 250 neurons, with a best score of about 0.96 for this basic artificial neural network model.<\/p><p>As you can see, accuracy improves as you increase the number of hidden neurons, but only up to a point, after which it decreases (your result may differ a bit due to random initialization of the weights). As you increase the number of neurons, your model can capture more features, but if it captures too many, it ends up overfitting to the training data and won't do well on unseen data. Let's examine the learned weights with 300 hidden neurons. 
You see more details, but you also see more noise.<\/p><pre class=\"codeinput\">net = models{end};                      <span class=\"comment\">% restore the last model<\/span>\r\nW1 = zeros(sweep(end), 28*28);          <span class=\"comment\">% pre-allocation<\/span>\r\nW1(:, x1_step1_keep) = net.IW{1};       <span class=\"comment\">% reconstruct the full matrix<\/span>\r\nfigure                                  <span class=\"comment\">% plot images<\/span>\r\ncolormap(gray)                          <span class=\"comment\">% set to grayscale<\/span>\r\n<span class=\"keyword\">for<\/span> i = 1:25                            <span class=\"comment\">% preview first 25 samples<\/span>\r\n    subplot(5,5,i)                      <span class=\"comment\">% plot them in a 5 x 5 grid<\/span>\r\n    digit = reshape(W1(i,:), [28,28])'; <span class=\"comment\">% row = 28 x 28 image<\/span>\r\n    imagesc(digit)                      <span class=\"comment\">% show the image<\/span>\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2015\/neuralnetFinal_04.png\" alt=\"\"> <h4>The Next Step - an Autoencoder Example<a name=\"85b25a96-41ab-4044-8f06-51b712253669\"><\/a><\/h4><p>You now have some intuition on artificial neural networks - a network automatically learns the relevant features from the inputs and generates a sparse representation that maps to the output labels. What if we use the inputs as the target values? That eliminates the need for training labels and turns this into an unsupervised learning algorithm. This is known as an autoencoder, and it becomes a building block of a deep learning network. 
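To make the inputs-as-targets idea concrete, here is a minimal sketch of that training loop, written in Python rather than with the Deep Learning Toolbox, and using made-up toy data: a linear autoencoder with a single hidden unit and tied weights, fitted by plain gradient descent. A real autoencoder would use nonlinear activations and many more hidden units; this only illustrates that training toward the inputs themselves needs no labels:

```python
import random

random.seed(0)

# toy 2-D samples lying near the line y = 2x -- these are the "inputs"
data = [(i / 10, 2 * i / 10 + random.gauss(0, 0.05)) for i in range(-10, 11)]

# tied-weight linear autoencoder with ONE hidden unit:
#   code:           h = w[0]*x + w[1]*y
#   reconstruction: (w[0]*h, w[1]*h)
# the targets are the inputs themselves, so the loss is reconstruction error
def loss(w):
    total = 0.0
    for x, y in data:
        h = w[0] * x + w[1] * y
        total += (w[0] * h - x) ** 2 + (w[1] * h - y) ** 2
    return total / len(data)

w = [random.random(), random.random()]    # random initial weights
initial = loss(w)
for _ in range(500):                      # plain gradient descent
    grad = []
    for j in range(2):                    # finite-difference gradient (only 2 params)
        wp = list(w)
        wp[j] += 1e-6
        grad.append((loss(wp) - loss(w)) / 1e-6)
    w = [w[j] - 0.01 * grad[j] for j in range(2)]
print(initial, loss(w))                   # reconstruction error should drop
```

The learned `w` ends up aligned with the dominant direction of the data, which is exactly the "relevant feature" the network discovers without ever seeing a label.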
There is an excellent example of autoencoders on the <a href=\"https:\/\/www.mathworks.com\/help\/nnet\/examples\/create-simple-deep-learning-network-for-classification.html\">Training a Deep Neural Network for Digit Classification<\/a> page in the Deep Learning Toolbox documentation, which also uses the MNIST dataset. For more details, Stanford provides an excellent <a href=\"http:\/\/deeplearning.stanford.edu\/tutorial\/\">UFLDL Tutorial<\/a> that also uses the same dataset and MATLAB-based starter code.<\/p><h4>Sudoku Solver: a Real-time Processing Example<a name=\"91869b46-1d87-4662-a74e-3ae83ec310b8\"><\/a><\/h4><p>Beyond understanding the algorithms, there is also the practical question of how to generate the input data in the first place. Someone spent a lot of time preparing the MNIST dataset to ensure uniform sizing, scaling, contrast, etc. To use the model you built from this dataset in practical applications, you have to be able to repeat the same processing on new data. How do you do such preparation yourself?<\/p><p>There is a fun video that shows how you can solve Sudoku puzzles with a webcam, using a different character recognition technique. Instead of static images, our colleague <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/profile\/authors\/1905880-teja-muppirala\">Teja Muppirala<\/a> uses a live video feed in real time to do it, and he walks you through the pre-processing steps one by one. 
You should definitely check it out: <a href=\"https:\/\/www.mathworks.com\/videos\/solving-a-sudoku-puzzle-using-a-webcam-68773.html\">Solving a Sudoku Puzzle Using a Webcam<\/a>.<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2015\/solveSudokuWebcam.png\" alt=\"\"> <\/p><h4>Submitting Your Entry to Kaggle<a name=\"b7442718-32fb-4db8-9016-93a0e71903a6\"><\/a><\/h4><p>You got a 96% categorization accuracy by simply accepting the default settings except for the number of hidden neurons. Not bad for a first try. Since you are using a Kaggle dataset, you can now submit your result to Kaggle.<\/p><pre class=\"codeinput\">n = size(sub, 1);                                   <span class=\"comment\">% num of samples<\/span>\r\nsub = sub';                                         <span class=\"comment\">% transpose<\/span>\r\n[~, highest] = max(scores);                         <span class=\"comment\">% highest scoring model<\/span>\r\nnet = models{highest};                              <span class=\"comment\">% restore the model<\/span>\r\nYpred = net(sub);                                   <span class=\"comment\">% label probabilities<\/span>\r\n[~, Label] = max(Ypred);                            <span class=\"comment\">% predicted labels<\/span>\r\nLabel = Label';                                     <span class=\"comment\">% transpose Label<\/span>\r\nLabel(Label == 10) = 0;                             <span class=\"comment\">% change '10' to '0'<\/span>\r\nImageId = 1:n; ImageId = ImageId';                  <span class=\"comment\">% image ids<\/span>\r\nwritetable(table(ImageId, Label), <span class=\"string\">'submission.csv'<\/span>);<span class=\"comment\">% write to csv<\/span>\r\n<\/pre><p>You can now submit <tt>submission.csv<\/tt> on <a href=\"https:\/\/www.kaggle.com\/c\/digit-recognizer\/submissions\/attach\">Kaggle's entry submission page<\/a>.<\/p><h4>Closing<a 
name=\"24263b1d-b7e0-4b29-bbf1-e3770536c607\"><\/a><\/h4><p>In this example we focused on getting a high level intuition on artificial neural network using a concrete example of handwritten digit recognition. We didn&#8217;t go into details such as how the inputs weights and bias units are combined, how activation works, how you train such a network, etc. But you now know enough to use Deep Learning Toolbox in MATLAB to participate in a Kaggle competition.<\/p><script language=\"JavaScript\"> <!-- \r\n    function grabCode_d9fc5f63c3c446458693ea27d7d40897() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='d9fc5f63c3c446458693ea27d7d40897 ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' d9fc5f63c3c446458693ea27d7d40897';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. 
\r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        copyright = 'Copyright 2015 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add copyright line at the bottom if specified.\r\n        if (copyright.length > 0) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n\r\n        d.title = title + ' (MATLAB code)';\r\n        d.close();\r\n    }   \r\n     --> <\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_d9fc5f63c3c446458693ea27d7d40897()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n      the MATLAB code <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; R2015a<br><\/p><\/div><!--\r\nd9fc5f63c3c446458693ea27d7d40897 ##### SOURCE BEGIN #####\r\n%% Artificial Neural Networks for Beginners\r\n% <https:\/\/en.wikipedia.org\/wiki\/Deep_learning Deep Learning> is a very hot\r\n% topic these days especially in computer vision applications and you\r\n% probably see it in the news and get curious. Now the question is, how do\r\n% you get started with it? Today's guest blogger, Toshi Takeuchi, gives us\r\n% a quick tutorial on\r\n% <https:\/\/en.wikipedia.org\/wiki\/Artificial_neural_network artificial\r\n% neural networks> as a starting point for your study of deep learning. \r\n\r\n%% MNIST Dataset\r\n% Many of us tend to learn better with a concrete example. Let me give you\r\n% a quick step-by-step tutorial to get intuition using a popular\r\n% <http:\/\/yann.lecun.com\/exdb\/mnist\/index.html MNIST handwritten digit\r\n% dataset>. 
Kaggle happens to use this very dataset in the\r\n% <https:\/\/www.kaggle.com\/c\/digit-recognizer Digit Recognizer> tutorial\r\n% competition. Let's use it in this example. You can download the\r\n% competition dataset from <https:\/\/www.kaggle.com\/c\/digit-recognizer\/data\r\n% \"Get the Data\"> page:\r\n%\r\n% * train.csv - training data\r\n% * test.csv  - test data for submission\r\n% \r\n% Load the training and test data into MATLAB, which I assume was\r\n% downloaded into the current folder. The test data is used to generate\r\n% your submissions.\r\n\r\ntr = csvread('train.csv', 1, 0);                  % read train.csv\r\nsub = csvread('test.csv', 1, 0);                  % read test.csv\r\n\r\n%%\r\n% The first column is the label that shows the correct digit for each\r\n% sample in the dataset, and each row is a sample. In the remaining\r\n% columns, a row represents a 28 x 28 image of a handwritten digit, but all\r\n% pixels are placed in a single row, rather than in the original\r\n% rectangular form. To visualize the digits, we need to reshape the rows\r\n% into 28 x 28 matrices. 
You can use\r\n% <https:\/\/www.mathworks.com\/help\/matlab\/ref\/reshape.html reshape> for that,\r\n% except that we need to transpose the data, because |reshape| operates by\r\n% column-wise rather than row-wise.\r\n\r\nfigure                                          % plot images\r\ncolormap(gray)                                  % set to grayscale\r\nfor i = 1:25                                    % preview first 25 samples \r\n    subplot(5,5,i)                              % plot them in 6 x 6 grid\r\n    digit = reshape(tr(i, 2:end), [28,28])';    % row = 28 x 28 image\r\n    imagesc(digit)                              % show the image\r\n    title(num2str(tr(i, 1)))                    % show the label\r\nend\r\n\r\n%% Data Preparation\r\n% You will be using the\r\n% <https:\/\/www.mathworks.com\/help\/nnet\/ref\/nprtool.html nprtool> pattern\r\n% recognition app from <https:\/\/www.mathworks.com\/products\/deep-learning.html\r\n% Deep Learning Toolbox>. The app expects two sets of data:\r\n%\r\n% * inputs - a numeric matrix, each column representing the samples and\r\n% rows the features. This is the scanned images of handwritten digits.\r\n% * targets - a numeric matrix of 0 and 1 that maps to specific labels that\r\n% images represent. This is also known as a dummy variable. Deep Learning\r\n% Toolbox also expects labels stored in columns, rather than in rows.\r\n%\r\n% The labels range from 0 to 9, but we will use '10' to represent '0'\r\n% because MATLAB is indexing is 1-based.\r\n% \r\n%   1 REPLACE_WITH_DASH_DASH> [1; 0; 0; 0; 0; 0; 0; 0; 0; 0]\r\n%   2 REPLACE_WITH_DASH_DASH> [0; 1; 0; 0; 0; 0; 0; 0; 0; 0]\r\n%   3 REPLACE_WITH_DASH_DASH> [0; 0; 1; 0; 0; 0; 0; 0; 0; 0]\r\n%               :\r\n%   0 REPLACE_WITH_DASH_DASH> [0; 0; 0; 0; 0; 0; 0; 0; 0; 1]\r\n% \r\n% The dataset stores samples in rows rather than in columns, so you need to\r\n% transpose it. 
Then you will partition the data so that you hold out 1\/3\r\n% of the data for model evaluation, and you will only use 2\/3 for training\r\n% our artificial neural network model.\r\n\r\nn = size(tr, 1);                    % number of samples in the dataset\r\ntargets  = tr(:,1);                 % 1st column is |label|\r\ntargets(targets == 0) = 10;         % use '10' to present '0'\r\ntargetsd = dummyvar(targets);       % convert label into a dummy variable\r\ninputs = tr(:,2:end);               % the rest of columns are predictors\r\n\r\ninputs = inputs';                   % transpose input\r\ntargets = targets';                 % transpose target\r\ntargetsd = targetsd';               % transpose dummy variable\r\n\r\nrng(1);                             % for reproducibility \r\nc = cvpartition(n,'Holdout',n\/3);   % hold out 1\/3 of the dataset\r\n\r\nXtrain = inputs(:, training(c));    % 2\/3 of the input for training\r\nYtrain = targetsd(:, training(c));  % 2\/3 of the target for training\r\nXtest = inputs(:, test(c));         % 1\/3 of the input for testing\r\nYtest = targets(test(c));           % 1\/3 of the target for testing\r\nYtestd = targetsd(:, test(c));      % 1\/3 of the dummy variable for testing\r\n\r\n%% Using the Deep Learning Toolbox GUI App\r\n% \r\n% # You can start the Neural Network Start GUI by typing the command\r\n% <https:\/\/www.mathworks.com\/help\/nnet\/ref\/nnstart.html nnstart>.\r\n% # You then click the Pattern Recognition Tool to open the Neural Network\r\n% Pattern Recognition Tool. You can also use the command\r\n% <https:\/\/www.mathworks.com\/help\/nnet\/ref\/nprtool.html nprtool> to open it\r\n% directly.\r\n% # Click \"Next\" in the welcome screen and go to \"Select Data\". \r\n% # For |inputs|, select |Xtrain| and for |targets|, select |Ytrain|.\r\n% # Click \"Next\" and go to \"Validation and Test Data\". Accept the default\r\n% settings and click \"Next\" again. 
This will split the data into 70-15-15\r\n% for the training, validation and testing sets. \r\n% # In the \"Network Architecture\", change the value for the number of\r\n% hidden neurons, 100, and click \"Next\" again. \r\n% # In the \"Train Network\", click the \"Train\" button to start the training.\r\n% When finished, click \"Next\". Skip \"Evaluate Network\" and click next.\r\n% # In \"Deploy Solution\", select \"MATLAB Matrix-Only Function\" and save the\r\n% generated code. I save it as\r\n% <https:\/\/blogs.mathworks.com\/images\/loren\/2015\/myNNfun.m myNNfun.m>.\r\n% # If you click \"Next\" and go to \"Save Results\", you can also save the\r\n% script as well as the model you just created. I saved the simple script\r\n% as <https:\/\/blogs.mathworks.com\/images\/loren\/2015\/myNNscript.m\r\n% myNNscript.m>\r\n%\r\n% Here is the diagram of this artificial neural network model you created\r\n% with the Pattern Recognition Tool. It has 784 input neurons, 100 hidden\r\n% layer neurons, and 10 output layer neurons.\r\n% \r\n% <<network_diagram.png>>\r\n%\r\n% Your model learns through training the weights to produce the correct\r\n% output.\r\n%\r\n% |W| in the diagram stands for _weights_ and |b| for _bias units_, which\r\n% are part of individual neurons. Individual neurons in the hidden layer\r\n% look like this - 784 inputs and corresponding weights, 1 bias unit,\r\n% and 10 activation outputs.\r\n% \r\n% <<neuron.png>>\r\n\r\n%% Visualizing the Learned Weights\r\n% If you look inside |myNNfun.m|, you see variables like |IW1_1| and\r\n% |x1_step1_keep| that represent the weights your artificial neural\r\n% network model learned through training. Because we have 784 inputs and\r\n% 100 neurons, the full layer 1 weights will be a 100 x 784 matrix. Let's\r\n% visualize them. 
This is what our neurons are learning!\r\n\r\nload myWeights                          % load the learned weights\r\nW1 = zeros(100, 28*28);                 % pre-allocation\r\nW1(:, x1_step1_keep) = IW1_1;           % reconstruct the full matrix\r\nfigure                                  % plot images\r\ncolormap(gray)                          % set to grayscale\r\nfor i = 1:25                            % preview first 25 samples \r\n    subplot(5,5,i)                      % plot them in a 5 x 5 grid\r\n    digit = reshape(W1(i,:), [28,28])'; % row = 28 x 28 image\r\n    imagesc(digit)                      % show the image\r\nend\r\n\r\n%% Computing the Categorization Accuracy\r\n% Now you are ready to use |myNNfun.m| to predict labels for the held-out\r\n% data in |Xtest| and compare them to the actual labels in |Ytest|. That\r\n% gives you a realistic estimate of predictive performance on unseen data.\r\n% This is also the metric Kaggle uses to score submissions.\r\n%\r\n% First, you see the actual output from the network, which shows the\r\n% probability for each possible label. You simply choose the most probable\r\n% label as your prediction and then compare it to the actual label. You\r\n% should see about 95% categorization accuracy. \r\n\r\nYpred = myNNfun(Xtest);             % predicts probability for each label\r\nYpred(:, 1:5)                       % display the first 5 columns\r\n[~, Ypred] = max(Ypred);            % find the indices of max probabilities\r\nsum(Ytest == Ypred) \/ length(Ytest) % compare the predicted vs. actual\r\n\r\n%% Network Architecture\r\n% You probably noticed that the artificial neural network model generated\r\n% from the Pattern Recognition Tool has only one hidden layer. You can\r\n% build a custom model with more layers if you would like, but this simple\r\n% architecture is sufficient for most common problems.\r\n% \r\n% The next question you may ask is how I picked 100 for the number of\r\n% hidden neurons. 
The general rule of thumb is to pick a number between the\r\n% number of input neurons, 784, and the number of output neurons, 10, and I\r\n% just picked 100 arbitrarily. That means you might do better if you try\r\n% other values. Let's do this programmatically this time. |myNNscript.m|\r\n% will be handy for this - you can simply adapt the script to do a\r\n% parameter sweep.\r\n\r\nsweep = [10,50:50:300];                 % parameter values to test\r\nscores = zeros(length(sweep), 1);       % pre-allocation\r\nmodels = cell(length(sweep), 1);        % pre-allocation\r\nx = Xtrain;                             % inputs\r\nt = Ytrain;                             % targets\r\ntrainFcn = 'trainscg';                  % scaled conjugate gradient\r\nfor i = 1:length(sweep)\r\n    hiddenLayerSize = sweep(i);         % number of hidden layer neurons\r\n    net = patternnet(hiddenLayerSize, trainFcn); % pattern recognition network\r\n    net.divideParam.trainRatio = 70\/100;% 70% of data for training\r\n    net.divideParam.valRatio = 15\/100;  % 15% of data for validation\r\n    net.divideParam.testRatio = 15\/100; % 15% of data for testing\r\n    net = train(net, x, t);             % train the network\r\n    models{i} = net;                    % store the trained network\r\n    p = net(Xtest);                     % predictions\r\n    [~, p] = max(p);                    % predicted labels\r\n    scores(i) = sum(Ytest == p) \/...    % categorization accuracy\r\n        length(Ytest);\r\nend\r\n\r\n%% \r\n% Let's now plot how the categorization accuracy changes versus the number\r\n% of neurons in the hidden layer. \r\nfigure\r\nplot(sweep, scores, '.-')\r\nxlabel('number of hidden neurons')\r\nylabel('categorization accuracy')\r\ntitle('Number of hidden neurons vs. 
accuracy')\r\n\r\n%%\r\n% It looks like you get the best result around 250 neurons, and\r\n% the best score will be around 0.96 with this basic artificial neural\r\n% network model.\r\n%\r\n% As you can see, you gain more accuracy if you increase the number of\r\n% hidden neurons, but then the accuracy decreases at some point (your\r\n% result may differ a bit due to random initialization of weights). As you\r\n% increase the number of neurons, your model will be able to capture more\r\n% features, but if you capture too many features, then you end up\r\n% overfitting your model to the training data and it won't do well with\r\n% unseen data. Let's examine the learned weights with 300 hidden neurons.\r\n% You see more details, but you also see more noise.\r\n\r\nnet = models{end};                      % retrieve the last model\r\nW1 = zeros(sweep(end), 28*28);          % pre-allocation\r\nW1(:, x1_step1_keep) = net.IW{1};       % reconstruct the full matrix\r\nfigure                                  % plot images\r\ncolormap(gray)                          % set to grayscale\r\nfor i = 1:25                            % preview first 25 samples \r\n    subplot(5,5,i)                      % plot them in a 5 x 5 grid\r\n    digit = reshape(W1(i,:), [28,28])'; % row = 28 x 28 image\r\n    imagesc(digit)                      % show the image\r\nend\r\n\r\n%% The Next Step - an Autoencoder Example\r\n% You now have some intuition on artificial neural networks - a network\r\n% automatically learns the relevant features from the inputs and generates\r\n% a sparse representation that maps to the output labels. What if we use\r\n% the inputs as the target values? That eliminates the need for training\r\n% labels and turns this into an unsupervised learning algorithm. This is\r\n% known as an autoencoder, and it becomes a building block of a deep\r\n% learning network. 
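The "inputs as targets" idea fits in a few lines of code. Below is a minimal, hypothetical NumPy sketch (not the Neural Network Toolbox's own autoencoder functions): a tiny one-hidden-layer network trained by plain batch gradient descent to reconstruct its own input, where the shrinking reconstruction error shows the 4-unit hidden layer learning a compressed code for 8-dimensional data. All sizes and the learning rate are arbitrary stand-ins chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((64, 8))              # 64 samples, 8 "pixels" each (toy data)
W1 = rng.normal(0, 0.1, (8, 4))      # encoder weights: 8 inputs -> 4 hidden units
b1 = np.zeros(4)
W2 = rng.normal(0, 0.1, (4, 8))      # decoder weights: 4 hidden -> 8 outputs
b2 = np.zeros(8)

losses = []
for _ in range(300):                 # plain batch gradient descent
    H = np.tanh(X @ W1 + b1)         # hidden code (the learned representation)
    Xhat = H @ W2 + b2               # linear reconstruction of the input
    err = Xhat - X                   # the targets are the inputs themselves
    losses.append((err ** 2).mean())
    dW2 = H.T @ err / len(X)         # gradients of the reconstruction error
    db2 = err.mean(axis=0)
    dH = (err @ W2.T) * (1 - H ** 2) # backpropagate through tanh
    dW1 = X.T @ dH / len(X)
    db1 = dH.mean(axis=0)
    W1 -= 0.1 * dW1; b1 -= 0.1 * db1
    W2 -= 0.1 * dW2; b2 -= 0.1 * db2

print(losses[0], losses[-1])         # reconstruction error shrinks with training
```

Because the hidden layer is narrower than the input, the network cannot simply copy its input; it is forced to learn a compact representation, which is exactly what makes autoencoders useful as building blocks for deeper networks.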
There is an excellent example of autoencoders on the\r\n% <https:\/\/www.mathworks.com\/help\/nnet\/examples\/create-simple-deep-learning-network-for-classification.html\r\n% Training a Deep Neural Network for Digit Classification> page in the\r\n% Neural Network Toolbox documentation, which also uses MNIST dataset. For\r\n% more details, Stanford provides an excellent\r\n% <http:\/\/deeplearning.stanford.edu\/tutorial\/ UFLDL Tutorial> that also\r\n% uses the same dataset and MATLAB-based starter code.\r\n% \r\n%% Sudoku Solver: a Real-time Processing Example\r\n% Beyond understanding the algorithms, there is also a practical question\r\n% of how to generate the input data in the first place. Someone spent a lot\r\n% of time to prepare the MNIST dataset to ensure uniform sizing, scaling,\r\n% contrast, etc. To use the model you built from this dataset in practical\r\n% applications, you have to be able to repeat the same set of processing on\r\n% new data. How do you do such preparation yourself?\r\n%\r\n% There is a fun video that shows you how you can solve Sudoku puzzles\r\n% using a webcam that uses a different character recognition technique.\r\n% Instead of static images, our colleague\r\n% <https:\/\/www.mathworks.com\/matlabcentral\/profile\/authors\/1905880-teja-muppirala\r\n% Teja Muppirala> uses a live video feed in real time to do it and he walks\r\n% you through the pre-processing steps one by one. You should definitely\r\n% check it out:\r\n% <https:\/\/www.mathworks.com\/videos\/solving-a-sudoku-puzzle-using-a-webcam-68773.html\r\n% Solving a Sudoku Puzzle Using a Webcam>.\r\n%\r\n% <<solveSudokuWebcam.png>>\r\n\r\n%% Submitting Your Entry to Kaggle\r\n% You got 96% categorization accuracy rate by simply accepting the default\r\n% settings except for the number of hidden neurons. Not bad for the first\r\n% try. Since you are using a Kaggle dataset, you can now submit your result\r\n% to Kaggle. 
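Before generating the submission file, it may help to see what a call like `net(sub)` computes under the hood: a one-hidden-layer pattern recognition network is just two affine maps with nonlinearities (in recent Toolbox versions, a tansig hidden layer followed by a softmax output). Here is a hedged NumPy sketch of that forward pass with random stand-in weights, not the trained ones, and omitting the input preprocessing the Toolbox applies:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(0, 0.05, (100, 784))  # hidden layer weights (random stand-ins)
b1 = np.zeros((100, 1))
W2 = rng.normal(0, 0.05, (10, 100))   # output layer weights
b2 = np.zeros((10, 1))

def predict(X):
    """X is 784 x n, one column per image, like Xtest in this post."""
    H = np.tanh(W1 @ X + b1)          # tansig hidden activations
    Z = W2 @ H + b2                   # 10 output scores per column
    E = np.exp(Z - Z.max(axis=0))     # numerically stable softmax
    return E / E.sum(axis=0)          # each column sums to 1, like net(Xtest)

X = rng.random((784, 5))              # five fake "digit" columns
P = predict(X)
labels = P.argmax(axis=0) + 1         # MATLAB-style 1..10, where 10 stands for '0'
```

Each column of `P` is a probability distribution over the ten labels, which is why the MATLAB code takes `max` down the columns and then maps label 10 back to digit 0 before writing the CSV.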
\r\n\r\nn = size(sub, 1);                                   % num of samples\r\nsub = sub';                                         % transpose\r\n[~, highest] = max(scores);                         % highest scoring model\r\nnet = models{highest};                              % retrieve the model\r\nYpred = net(sub);                                   % label probabilities\r\n[~, Label] = max(Ypred);                            % predicted labels\r\nLabel = Label';                                     % transpose Label\r\nLabel(Label == 10) = 0;                             % change '10' to '0'\r\nImageId = 1:n; ImageId = ImageId';                  % image ids\r\nwritetable(table(ImageId, Label), 'submission.csv');% write to csv\r\n\r\n%% \r\n% You can now submit the |submission.csv| on\r\n% <https:\/\/www.kaggle.com\/c\/digit-recognizer\/submissions\/attach Kaggle's\r\n% entry submission page>.\r\n\r\n%% Closing\r\n% In this example we focused on getting a high-level intuition about\r\n% artificial neural networks using a concrete example of handwritten digit\r\n% recognition. We didn't go into details such as how the input weights and\r\n% bias units are combined, how activation works, or how you train such a\r\n% network. But you now know enough to use Deep Learning Toolbox in\r\n% MATLAB to participate in a Kaggle competition.\r\n##### SOURCE END ##### d9fc5f63c3c446458693ea27d7d40897\r\n-->","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img decoding=\"async\"  class=\"img-responsive\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2015\/neuralnetFinal_04.png\" onError=\"this.style.display ='none';\" \/><\/div><!--introduction--><p><a href=\"https:\/\/en.wikipedia.org\/wiki\/Deep_learning\">Deep Learning<\/a> is a very hot topic these days especially in computer vision applications and you probably see it in the news and get curious. Now the question is, how do you get started with it? 
Today's guest blogger, Toshi Takeuchi, gives us a quick tutorial on <a href=\"https:\/\/en.wikipedia.org\/wiki\/Artificial_neural_network\">artificial neural networks<\/a> as a starting point for your study of deep learning.... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/loren\/2015\/08\/04\/artificial-neural-networks-for-beginners\/\">read more >><\/a><\/p>","protected":false},"author":39,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[43],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/1211"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/users\/39"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/comments?post=1211"}],"version-history":[{"count":6,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/1211\/revisions"}],"predecessor-version":[{"id":3132,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/1211\/revisions\/3132"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/media?parent=1211"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/categories?post=1211"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/tags?post=1211"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}