{"id":519,"date":"2018-08-21T13:24:17","date_gmt":"2018-08-21T13:24:17","guid":{"rendered":"https:\/\/blogs.mathworks.com\/deep-learning\/?p=519"},"modified":"2021-04-06T15:51:39","modified_gmt":"2021-04-06T19:51:39","slug":"deep-learning-in-action-part-3","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/deep-learning\/2018\/08\/21\/deep-learning-in-action-part-3\/","title":{"rendered":"Deep Learning in Action &#8211; part 3"},"content":{"rendered":"<span style=\"font-size: 14px;\">\r\nHello Everyone! It's <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/profile\/authors\/4758135-johanna-pingel\">Johanna<\/a>, and Steve has allowed me to take over the blog from time to time to talk about deep learning.\r\n<\/span>\r\n<span style=\"font-size: 14px;\">\r\nI'm back for another episode of:\r\n<\/span>\r\n<p style=\"margin: 1% 13%; background-color: #86c5da; text-align: center;\"><span style=\"color: #ffffff; font-size: 20px;\">\"Deep Learning in Action:<\/span>\r\n<span style=\"color: #ffffff; font-size: 16px;\">Cool projects created at MathWorks\"\r\n<\/span><\/p>\r\n<span style=\"font-size: 14px;\">\r\nThis aims to give you insight into what we\u2019re working on at MathWorks.\r\n<\/span>\r\n<span style=\"font-size: 14px;\">\r\nToday\u2019s demo is called <strong>\"Wheel of Fortune\"<\/strong> or alternatively <strong>\"Do you sign MATLAB?\"<\/strong> and it\u2019s the third article in a series of posts, including:<\/span>\r\n<h6><\/h6>\r\n&nbsp;\r\n<ul>\r\n \t<li>3D Point Cloud Segmentation using CNNs<\/li>\r\n \t<li>GPU Coder<\/li>\r\n \t<li><a href=\"https:\/\/blogs.mathworks.com\/deep-learning\/2018\/07\/20\/deep-learning-in-action-part-2\/\">Sentiment Analysis<\/a><\/li>\r\n \t<li><a href=\"https:\/\/blogs.mathworks.com\/deep-learning\/2018\/06\/22\/deep-learning-in-action-part-1\/\">Pictionary<\/a><\/li>\r\n<\/ul>\r\n<span style=\"font-size: 14px;\">The developer of the demo is Joshua Wang who led a team that participated in a 
MathWorks Hack Day, a fun day where developers at MathWorks get 24 hours to work on a project of their choice related to MATLAB. The team decided to work on a sign language project, and I was drawn to this example because #1) this demo uses images, #2) this demo uses deep learning, and #3) this demo uses MATLAB.<\/span>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">\r\nWhen I reached out to Josh initially, I got this response...<\/span>\r\n<h6><\/h6>\r\n<blockquote>\u2026Not only is it cool that MathWorks tools made it possible to do all of this in a day (our coding all happened on a Wednesday), but it certainly ties in well with our social mission.<\/blockquote>\r\n<span style=\"font-size: 14px;\">I was intrigued by how they got this up and running in under 24 hours, so I asked to see the code. <\/span>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">\r\nAfter viewing and running the code, it appeared that a significant portion of the work was a nice user interface in MATLAB that looks like Wheel of Fortune, complete with a spinning wheel and the ability to play the game with an opponent. See the game in action here: <\/span>\r\n<h6><\/h6>\r\n<div style=\"width: 1280px;\" class=\"wp-video\"><!--[if lt IE 9]><script>document.createElement('video');<\/script><![endif]-->\n<video class=\"wp-video-shortcode\" id=\"video-519-1\" width=\"1280\" height=\"720\" preload=\"metadata\" controls=\"controls\"><source type=\"video\/mp4\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2018\/08\/WoFinAction.mp4?_=1\" \/><a href=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2018\/08\/WoFinAction.mp4\">https:\/\/blogs.mathworks.com\/deep-learning\/files\/2018\/08\/WoFinAction.mp4<\/a><\/video><\/div>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">\r\nUser interfaces in MATLAB are great, but not unique to deep learning. So for the remainder of this post, I want to walk through the deep learning portion of the application: how they built the CNN to recognize the letters. I'll ask Josh a few questions, and you'll have a chance to ask Josh and the team your own questions in the comments section.\r\n<\/span>\r\n<h6><\/h6>\r\n\r\n<hr width=\"50%\" \/>\r\n\r\n<span style=\"color: #e67e22; font-size: 20px;\"><strong>Demo: Sign Language in MATLAB<\/strong><\/span>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">\r\nThe basis of this demo is to have a CNN determine which letter is being signed, A through Z. Here are a few randomly selected letters and their corresponding sample images: <\/span>\r\n<h6><\/h6>\r\n<pre>&gt;&gt; samples = imds.splitEachLabel(1,'randomize',true);\r\n&gt;&gt; montage(samples)\r\n<\/pre>\r\n<h6><\/h6>\r\n<img decoding=\"async\" loading=\"lazy\" width=\"838\" height=\"703\" class=\"alignnone size-full wp-image-539\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2018\/08\/2018-08-20_16-09-51.png\" alt=\"\" \/>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\"> These images are from a training dataset that can be downloaded from GitHub <a href=\"https:\/\/github.com\/loicmarie\/sign-language-alphabet-recognizer\">here.<\/a> <\/span>\r\n<h6><\/h6>\r\n\r\n<hr width=\"50%\" \/>\r\n\r\n<span style=\"color: #e67e22; font-size: 18px;\"><strong>Deep Learning Code<\/strong><\/span>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\"> This section walks through the code to create and train the network in 4 parts:<\/span>\r\n<h6><\/h6>\r\n<ol>\r\n \t<li>Load the dataset<\/li>\r\n \t<li>Load the network<\/li>\r\n \t<li>Modify the network<\/li>\r\n \t<li>Set training options<\/li>\r\n<\/ol>\r\n&nbsp;\r\n<h6><\/h6>\r\n<pre><span style=\"color: #5e994a;\">%% Load Data<\/span>\r\nimds = imageDatastore(<span style=\"color: #a020f0;\">'dataset'<\/span>, ...\r\n    <span style=\"color: #a020f0;\">'IncludeSubfolders'<\/span>,true, ...\r\n    <span style=\"color: 
#a020f0;\">'LabelSource','foldernames'<\/span>);\r\n[imdsTrain,imdsValidation] = splitEachLabel(imds,0.7,<span style=\"color: #a020f0;\">'randomized'<\/span>);\r\n<\/pre>\r\n<pre><span style=\"color: #5e994a;\">%% Load network<\/span>\r\nnet = inceptionv3();\r\nlgraph = layerGraph(net);\r\nfigure(<span style=\"color: #a020f0;\">'Units','normalized','Position'<\/span>,[0.1 0.1 0.8 0.8]);\r\nplot(lgraph)\r\n<\/pre>\r\n<div id=\"attachment_537\" style=\"width: 1034px\" class=\"wp-caption aligncenter\"><img aria-describedby=\"caption-attachment-537\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-537 size-large\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2018\/08\/2018-08-20_15-46-00-1024x679.png\" alt=\"\" width=\"1024\" height=\"679\" \/><p id=\"caption-attachment-537\" class=\"wp-caption-text\">Yikes! Inception-v3 is a complicated structure.<\/p><\/div>\r\n\r\n<span style=\"font-size: 14px;\"> Note: if you don't have Inception-v3 downloaded, simply typing <\/span>\r\n<pre> &gt;&gt; inceptionv3<\/pre>\r\n<span style=\"font-size: 14px;\">on the command line will provide a link to download the model.<\/span>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">A list of all models (including the new ONNX model converter) can be found here: <a href=\"https:\/\/www.mathworks.com\/solutions\/deep-learning\/models.html\">https:\/\/www.mathworks.com\/solutions\/deep-learning\/models.html<\/a><\/span>\r\n\r\n<hr width=\"50%\" \/>\r\n\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">Next, change the final layers to reflect the number of classes in the dataset. Since this is a DAG network, add the new layers and then verify that the network is reconnected correctly. 
<\/span>\r\n<h6><\/h6>\r\n<pre><span style=\"color: #5e994a;\">%% Edit the architecture<\/span>\r\ninputSize = net.Layers(1).InputSize;\r\n\r\nlgraph = removeLayers(lgraph, {<span style=\"color: #a020f0;\">'predictions','predictions_softmax','ClassificationLayer_predictions'<\/span>});\r\n\r\nnumClasses = numel(categories(imdsTrain.Labels));\r\nnewLayers = [\r\n    fullyConnectedLayer(numClasses,<span style=\"color: #a020f0;\">'Name','fc','WeightLearnRateFactor'<\/span>,10,<span style=\"color: #a020f0;\">'BiasLearnRateFactor'<\/span>,10)\r\n    softmaxLayer(<span style=\"color: #a020f0;\">'Name','softmax'<\/span>)\r\n    classificationLayer(<span style=\"color: #a020f0;\">'Name','classoutput'<\/span>)];\r\nlgraph = addLayers(lgraph,newLayers);\r\n\r\nlgraph = connectLayers(lgraph,<span style=\"color: #a020f0;\">'avg_pool','fc'<\/span>);\r\n<\/pre>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">Next, freeze the weights of the earlier layers, set up the data augmentation, and specify the training options. (The freezeWeights and createLgraphUsingConnections helper functions come from the MATLAB transfer learning examples.)<\/span>\r\n<h6><\/h6>\r\n<pre>layers = lgraph.Layers;\r\nconnections = lgraph.Connections;\r\n\r\nlayers(1:110) = freezeWeights(layers(1:110));\r\nlgraph = createLgraphUsingConnections(layers,connections);\r\n\r\npixelRange = [-30 30];\r\nimageAugmenter = imageDataAugmenter( ...\r\n    <span style=\"color: #a020f0;\">'RandXReflection'<\/span>,true, ...\r\n    <span style=\"color: #a020f0;\">'RandXTranslation'<\/span>,pixelRange, ...\r\n    <span style=\"color: #a020f0;\">'RandYTranslation'<\/span>,pixelRange);\r\naugimdsTrain = augmentedImageDatastore(inputSize(1:2),imdsTrain, ...\r\n    <span style=\"color: #a020f0;\">'DataAugmentation'<\/span>,imageAugmenter);\r\n\r\naugimdsValidation = augmentedImageDatastore(inputSize(1:2),imdsValidation);\r\noptions = trainingOptions(<span style=\"color: #a020f0;\">'sgdm'<\/span>, ...\r\n    <span style=\"color: #a020f0;\">'MiniBatchSize'<\/span>,10, ...\r\n    <span style=\"color: #a020f0;\">'MaxEpochs'<\/span>,6, ...\r\n    <span style=\"color: #a020f0;\">'InitialLearnRate'<\/span>,1e-4, ...\r\n    <span style=\"color: #a020f0;\">'ValidationData'<\/span>,augimdsValidation, ...\r\n    
<span style=\"color: #a020f0;\">'Verbose'<\/span>,true);\r\n<\/pre>\r\n<span style=\"font-size: 12px;\">\r\nI'll be honest: I'm still not in love with the augmented image datastore, since it adds extra lines of code to what would otherwise be a very simple, easy-to-read section. But it's growing on me in this example, since it lets you create extra image samples using translation and reflection, and it also resizes all the images to the input size required by the network.\r\n<\/span>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">\r\nFinally, train the network. <\/span>\r\n<h6><\/h6>\r\n<pre>net = trainNetwork(augimdsTrain,lgraph,options);\r\n<\/pre>\r\n&nbsp;\r\n<h6><\/h6>\r\n<span style=\"font-size: 12px;\">\r\n<em>Note: I trained the network using my old Tesla K40 GPU, and it took roughly 1 hour 15 minutes to run. I cut the training data size significantly since things appeared to be taking longer than I'd like, so I'd imagine this would take even longer with the full training set.<\/em>\r\n<\/span>\r\n<h6><\/h6>\r\n\r\n<hr width=\"50%\" \/>\r\n\r\n<span style=\"color: #e67e22; font-size: 18px;\"><strong>Q&amp;A with Josh<\/strong><\/span>\r\n\r\n&nbsp;\r\n<h6><\/h6>\r\n1. First I have to ask, what is quality engineering? What do you do?\r\n<h6><\/h6>\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td style=\"border-left: 2px solid #476b6b; padding: 10px;\"><strong>Quality Engineering at MathWorks is a group of software engineers who build the infrastructure and comprehensive test environment to support and champion MathWorks\u2019 primary goal of delivering bug-free, feature-rich software to our customers. Specifically, I work on the web and cloud services which power the MathWorks\u2019 online offerings like MATLAB Online, MATLAB Mobile, and MATLAB Grader. <\/strong><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n&nbsp;\r\n\r\n&nbsp;\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">\r\n2. What is your relationship to deep learning? 
Do you work on deep learning in your role, or just interested in it outside of work, or both?\r\n<\/span>\r\n<h6><\/h6>\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td style=\"border-left: 2px solid #476b6b; padding: 10px;\"><strong>I don\u2019t directly work on deep learning in my role at MathWorks. However, recent advances in machine learning have great potential to transform how customers use our products in an increasingly connected world, and our hack day project was designed to demonstrate one way we could use deep learning to make scientific computing more intuitive, contextual, and accessible. <\/strong><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">\r\n3. Whose idea was a sign language project, and why?\r\n<\/span>\r\n<h6><\/h6>\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td style=\"border-left: 2px solid #476b6b; padding: 10px;\"><strong>Making MATLAB accessible to anyone, including those with impaired hearing, is an important part of our mission to provide the ultimate computing environment for technical computation, visualization, design, simulation, and implementation. In addition, making MATLAB more accessible via gesture control would accelerate the pace of engineering and science by enabling its use in environments where a person may not have easy access to traditional input mechanisms, such as an operating room or factory floor. <\/strong><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">\r\n4. Why did you choose to transfer learn on Inception-v3?\r\n<\/span>\r\n<h6><\/h6>\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td style=\"border-left: 2px solid #476b6b; padding: 10px;\"><strong>Inception v3 has higher accuracy than models like GoogLeNet, and is easily available in MATLAB with examples.<\/strong><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">\r\n5. 
What was your approximate validation accuracy?\r\n<\/span>\r\n<h6><\/h6>\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td style=\"border-left: 2px solid #476b6b; padding: 10px;\"><strong>For someone unfamiliar with American Sign Language, probably around 70%. Certain letters were very similar and therefore more difficult for our model to distinguish, like M and N, and our training data set was fairly homogeneous \u2013 from the right hand of a single person in front of the same background. Given that this was a single-day effort, we didn\u2019t spend a lot of time tuning the model or improving the training data.<\/strong><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">\r\n6. What was your rationale for 6 epochs? Was it training time, or accuracy? If you had more time, would you train longer?\r\n<\/span>\r\n<h6><\/h6>\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td style=\"border-left: 2px solid #476b6b; padding: 10px;\"><strong>I don\u2019t think we had any real reason for this, nor did everyone even realize we had chosen 6 epochs \u2013 we just took the defaults that were available in MATLAB or the GitHub project that we used for an initial prototype. <\/strong><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">\r\nThanks to the team for the demo (the full team: Anil Patro, Oral Dalay, Harshad Tambekar, Krishan Sharma, Rohit Kudva, Michael Broshi, and Sara Burke), and thanks to Josh for taking the time to walk me through it! I hope you enjoyed it as well. Anything else you'd like to ask the team? Leave a comment below!<\/span>\r\n\r\n<img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-541 size-large\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2018\/08\/20180621_135809-1024x576.jpg\" alt=\"\" width=\"1024\" height=\"576\" \/>\r\n\r\n(Hope they don't mind me putting in a picture of the team. 
Congrats on a job well done!)\r\n\r\n&nbsp;\r\n<h6><\/h6>\r\n<h6><\/h6>","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img decoding=\"async\"  class=\"img-responsive\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2018\/08\/2018-08-20_16-09-51.png\" onError=\"this.style.display ='none';\" \/><\/div><p>\r\nHello Everyone! It's Johanna, and Steve has allowed me to take over the blog from time to time to talk about deep learning.\r\n\r\n\r\nI'm back for another episode of:\r\n\r\n\"Deep Learning in Action:\r\nCool... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/deep-learning\/2018\/08\/21\/deep-learning-in-action-part-3\/\">read more >><\/a><\/p>","protected":false},"author":156,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[9],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/519"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/users\/156"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/comments?post=519"}],"version-history":[{"count":17,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/519\/revisions"}],"predecessor-version":[{"id":603,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/519\/revisions\/603"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/media?parent=519"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/categories?post=519"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/bl
ogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/tags?post=519"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}