{"id":6,"date":"2017-09-21T23:00:42","date_gmt":"2017-09-21T23:00:42","guid":{"rendered":"https:\/\/blogs.mathworks.com\/deep-learning\/?p=6"},"modified":"2021-04-06T15:53:01","modified_gmt":"2021-04-06T19:53:01","slug":"jumping-into-the-deep-end","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/deep-learning\/2017\/09\/21\/jumping-into-the-deep-end\/","title":{"rendered":"Jumping into the Deep End"},"content":{"rendered":"<div class=\"content\"><p>Hello, and welcome to the new MATLAB Central blog on deep learning! In my 24th year of MATLAB and toolbox development and design, I am excited to be tackling this new project.<\/p><p><i>Deep learning<\/i> refers to a collection of machine learning techniques that are based on neural networks that have a large number of layers (hence \"deep\"). When trained on labeled data sets, these networks can achieve state-of-the-art accuracy on classification tasks using images, text, and sound as inputs.<\/p><p>Because of my background in image processing, I have followed the rapid progress in deep learning over the past several years with great interest. There is much that I would like to learn and share with you about the area, especially with respect to exploring deep learning ideas with MATLAB. To that end, several developers have volunteered to lend a hand with topics, code, and technical guidance as we explore. They are building deep learning capabilities as fast as they can in products like:<\/p><div><ul><li>Neural Network Toolbox<\/li><li>Parallel Computing Toolbox<\/li><li>Image Processing Toolbox<\/li><li>Computer Vision System Toolbox<\/li><li>Automated Driving System Toolbox<\/li><li>GPU Coder<\/li><\/ul><\/div><p>I will be introducing them to you as we get into the details of deep learning with MATLAB.<\/p><p>If you have followed my <a href=\"https:\/\/blogs.mathworks.com\/steve\">image processing blog posts<\/a>, you can expect a similar style here. 
Topics will be a mix of concept tutorials, examples and case studies, feature exploration, and tips. I imagine we'll discuss things like performance, GPU hardware, and online data sets. Maybe we'll do some things just for fun, like the LSTM network built last month by a MathWorks developer that spouts Shakespeare-like verse.<\/p><p>To subscribe, either using email or RSS, click on the \"Subscribe\" link at the top of the page.<\/p><p>I'll leave you with a little teaser based on AlexNet. I just plugged in a webcam and connected to it in MATLAB.<\/p><pre class=\"codeinput\">c = webcam\r\n<\/pre><pre class=\"codeoutput\">\r\nc = \r\n\r\n  webcam with properties:\r\n\r\n                     Name: 'Microsoft&reg; LifeCam Cinema(TM)'\r\n               Resolution: '640x480'\r\n     AvailableResolutions: {1&times;11 cell}\r\n             ExposureMode: 'auto'\r\n         WhiteBalanceMode: 'auto'\r\n                    Focus: 33\r\n    BacklightCompensation: 5\r\n                Sharpness: 25\r\n                     Zoom: 0\r\n                FocusMode: 'auto'\r\n                     Tilt: 0\r\n               Brightness: 143\r\n                      Pan: 0\r\n             WhiteBalance: 4500\r\n               Saturation: 83\r\n                 Exposure: -6\r\n                 Contrast: 5\r\n\r\n<\/pre><p>Next, I loaded an AlexNet network that has been pretrained with a million images. The network can classify images into 1,000 different object categories.<\/p><pre class=\"codeinput\">nnet = alexnet\r\n<\/pre><pre class=\"codeoutput\">\r\nnnet = \r\n\r\n  SeriesNetwork with properties:\r\n\r\n    Layers: [25&times;1 nnet.cnn.layer.Layer]\r\n\r\n<\/pre><p>You could also try other networks. 
For example, after you have upgraded to R2017b, you could experiment with GoogLeNet by using <tt>net = googlenet<\/tt>.<\/p><p>What do these 25 network layers look like?<\/p><pre class=\"codeinput\">nnet.Layers\r\n<\/pre><pre class=\"codeoutput\">\r\nans = \r\n\r\n  25x1 Layer array with layers:\r\n\r\n     1   'data'     Image Input                   227x227x3 images with 'zerocenter' normalization\r\n     2   'conv1'    Convolution                   96 11x11x3 convolutions with stride [4  4] and padding [0  0  0  0]\r\n     3   'relu1'    ReLU                          ReLU\r\n     4   'norm1'    Cross Channel Normalization   cross channel normalization with 5 channels per element\r\n     5   'pool1'    Max Pooling                   3x3 max pooling with stride [2  2] and padding [0  0  0  0]\r\n     6   'conv2'    Convolution                   256 5x5x48 convolutions with stride [1  1] and padding [2  2  2  2]\r\n     7   'relu2'    ReLU                          ReLU\r\n     8   'norm2'    Cross Channel Normalization   cross channel normalization with 5 channels per element\r\n     9   'pool2'    Max Pooling                   3x3 max pooling with stride [2  2] and padding [0  0  0  0]\r\n    10   'conv3'    Convolution                   384 3x3x256 convolutions with stride [1  1] and padding [1  1  1  1]\r\n    11   'relu3'    ReLU                          ReLU\r\n    12   'conv4'    Convolution                   384 3x3x192 convolutions with stride [1  1] and padding [1  1  1  1]\r\n    13   'relu4'    ReLU                          ReLU\r\n    14   'conv5'    Convolution                   256 3x3x192 convolutions with stride [1  1] and padding [1  1  1  1]\r\n    15   'relu5'    ReLU                          ReLU\r\n    16   'pool5'    Max Pooling                   3x3 max pooling with stride [2  2] and padding [0  0  0  0]\r\n    17   'fc6'      Fully Connected               4096 fully connected layer\r\n    18   'relu6'    ReLU                          ReLU\r\n 
   19   'drop6'    Dropout                       50% dropout\r\n    20   'fc7'      Fully Connected               4096 fully connected layer\r\n    21   'relu7'    ReLU                          ReLU\r\n    22   'drop7'    Dropout                       50% dropout\r\n    23   'fc8'      Fully Connected               1000 fully connected layer\r\n    24   'prob'     Softmax                       softmax\r\n    25   'output'   Classification Output         crossentropyex with 'tench', 'goldfish', and 998 other classes\r\n<\/pre><p>I happen to know that 'coffee mug' is one of the categories. How will the network do with the 23-year-old MATLAB \"Picture the Power\" mug from my bookshelf?<\/p><p>Here's the snapshot I took with my webcam using <tt>pic = snapshot(c)<\/tt>.<\/p><pre class=\"codeinput\">imshow(pic)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2017\/09\/introduction_01.png\" alt=\"\"> <p>The first layer accepts inputs. It will tell us the image size that the network accepts.<\/p><pre class=\"codeinput\">nnet.Layers(1)\r\n<\/pre><pre class=\"codeoutput\">\r\nans = \r\n\r\n  ImageInputLayer with properties:\r\n\r\n                Name: 'data'\r\n           InputSize: [227 227 3]\r\n\r\n   Hyperparameters\r\n    DataAugmentation: 'none'\r\n       Normalization: 'zerocenter'\r\n\r\n<\/pre><p>So I need to resize the snapshot to be 227x227 before I feed it to the network.<\/p><pre class=\"codeinput\">pic2 = imresize(pic,[227 227]);\r\nimshow(pic2)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2017\/09\/introduction_02.png\" alt=\"\"> <p>Now I can try to classify it.<\/p><pre class=\"codeinput\">label = classify(nnet,pic2)\r\n<\/pre><pre class=\"codeoutput\">\r\nlabel = \r\n\r\n  categorical\r\n\r\n     coffee mug \r\n\r\n<\/pre><p>OK! But I wonder what else the network thought it might be? 
The <tt>predict<\/tt> function can return the scores for all the categories.<\/p><pre class=\"codeinput\">p = predict(nnet,pic2);\r\nplot(p)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2017\/09\/introduction_03.png\" alt=\"\"> <p>There are several notable prediction peaks. I'll use the <tt>maxk<\/tt> function (new in R2017b) to find where they are, and then I'll look up those locations in the list of category labels in the network's last layer.<\/p><pre class=\"codeinput\">[p3,i3] = maxk(p,3);\r\n<\/pre><pre class=\"codeinput\">p3\r\n<\/pre><pre class=\"codeoutput\">\r\np3 =\r\n\r\n  1&times;3 single row vector\r\n\r\n    0.2469    0.1446    0.1377\r\n\r\n<\/pre><pre class=\"codeinput\">i3\r\n<\/pre><pre class=\"codeoutput\">\r\ni3 =\r\n\r\n   505   733   623\r\n\r\n<\/pre><pre class=\"codeinput\">nnet.Layers(end)\r\n<\/pre><pre class=\"codeoutput\">\r\nans = \r\n\r\n  ClassificationOutputLayer with properties:\r\n\r\n            Name: 'output'\r\n      ClassNames: {1000&times;1 cell}\r\n      OutputSize: 1000\r\n\r\n   Hyperparameters\r\n    LossFunction: 'crossentropyex'\r\n\r\n<\/pre><pre class=\"codeinput\">nnet.Layers(end).ClassNames(i3)\r\n<\/pre><pre class=\"codeoutput\">\r\nans =\r\n\r\n  3&times;1 cell array\r\n\r\n    {'coffee mug'     }\r\n    {'Polaroid camera'}\r\n    {'lens cap'       }\r\n\r\n<\/pre><p>Hmm. I'm glad coffee mug came out on top. 
I can't pour coffee into a camera or a lens cap!<\/p><p>Remember, for options to follow along with this new blog, click on the \"Subscribe\" link at the top of the page.<\/p><p>Finally, a note for my <a href=\"https:\/\/blogs.mathworks.com\/steve\">image processing blog<\/a> readers: Don't worry, I will continue to write for that blog, too.<\/p><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br>Published with MATLAB&reg; R2017b<br><\/p><\/div>","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2017\/09\/introduction_01.png\" class=\"img-responsive attachment-post-thumbnail size-post-thumbnail wp-post-image\" alt=\"\" decoding=\"async\" loading=\"lazy\" \/><\/div><p>Hello, and welcome to the new MATLAB Central blog on deep learning! In my 24th year of MATLAB and toolbox development and design, I am excited to be tackling this new project.Deep learning refers to... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/deep-learning\/2017\/09\/21\/jumping-into-the-deep-end\/\">read more 
>><\/a><\/p>","protected":false},"author":42,"featured_media":11,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[9],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/6"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/users\/42"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/comments?post=6"}],"version-history":[{"count":3,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/6\/revisions"}],"predecessor-version":[{"id":16,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/6\/revisions\/16"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/media\/11"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/media?parent=6"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/categories?post=6"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/tags?post=6"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}