{"id":1052,"date":"2019-01-31T15:26:21","date_gmt":"2019-01-31T15:26:21","guid":{"rendered":"https:\/\/blogs.mathworks.com\/deep-learning\/?p=1052"},"modified":"2021-04-06T15:51:10","modified_gmt":"2021-04-06T19:51:10","slug":"deep-learning-visualizations-cam-visualization","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/deep-learning\/2019\/01\/31\/deep-learning-visualizations-cam-visualization\/","title":{"rendered":"Deep Learning Visualizations: CAM Visualization"},"content":{"rendered":"<span style=\"font-size: 14px;\">I\u2019m hoping by now you\u2019ve heard that MATLAB has great visualizations, which can be helpful in deep learning for uncovering what\u2019s going on inside your neural network.<\/span>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">Last post, we discussed visualizations of features learned by a neural network. Today, I\u2019d like to write about another visualization you can do in MATLAB for deep learning that you won\u2019t find by reading the documentation*.<\/span>\r\n<h6><\/h6>\r\n<h6><\/h6>\r\n<h3>CAM Visualizations<\/h3>\r\n<span style=\"font-size: 14px;\">This is to help answer the question: \u201cHow did my network decide which category an image falls under?\u201d With class activation mapping, or CAM, you can uncover which region of an image most strongly influenced the network prediction. I was surprised at how easy this code was to understand: just a few lines of code provide insight into a network. 
The end result will look something like this:<\/span>\r\n\r\n<img decoding=\"async\" loading=\"lazy\" width=\"1456\" height=\"617\" class=\"alignnone size-full wp-image-1106\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2019\/01\/blog_vis2.png\" alt=\"\" \/>\r\n\r\n<span style=\"font-size: 14px;\">This shows which areas of an image accounted for the network's prediction.<\/span>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">To make it more fun, let\u2019s do this \u201clive\u201d using a webcam. Of course, you can always feed an image into these lines instead of a webcam.<\/span>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">First, read in a pretrained network for image classification. SqueezeNet, GoogLeNet, and ResNet-18 are good choices, since they\u2019re relatively fast.<\/span>\r\n<pre>netName = 'squeezenet';\r\nnet = eval(netName);   % calls the function named by netName, e.g. squeezenet()\r\n<\/pre>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">Then we get our webcam running.<\/span>\r\n<pre>cam = webcam;\r\npreview(cam);\r\n<\/pre>\r\n<div id=\"attachment_1086\" style=\"width: 310px\" class=\"wp-caption aligncenter\"><img aria-describedby=\"caption-attachment-1086\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-1086 size-medium\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2019\/01\/blog1-300x169.png\" alt=\"\" width=\"300\" height=\"169\" \/><p id=\"caption-attachment-1086\" class=\"wp-caption-text\">Here\u2019s me running this example.<\/p><\/div>\r\n<h6><\/h6>\r\n<h6><\/h6>\r\n<em>*And to be clear, this is all documented functionality - we're not going off the grid here! <\/em>\r\n<h6><\/h6>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">Grab the input size, the class names, and the layer name from which we want to extract the activations. 
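<\/span>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">That setup is just three lines; the layer name shown here assumes SqueezeNet (see the table below for the other networks):<\/span>\r\n<pre>inputSize = net.Layers(1).InputSize(1:2);   % e.g. [227 227] for SqueezeNet\r\nclasses = net.Layers(end).Classes;          % class names from the output layer\r\nlayerName = 'relu_conv10';                  % assumes SqueezeNet; see the table below\r\n<\/pre>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">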
We want the ReLU layer that follows the last convolutional layer.<\/span>\r\n\r\n<span style=\"font-size: 14px;\">I considered adding a helper function to get the layer name, but it\u2019ll take you 2 seconds to grab the name from this table:<\/span>\r\n<h6><\/h6>\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td style=\"padding: 0px 5px; text-align: center;\"><strong><u>Network Name<\/u><\/strong><\/td>\r\n<td style=\"padding: 0px 5px; text-align: center;\"><strong><u>Activation Layer Name<\/u><\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"padding: 0px 5px;\">googlenet<\/td>\r\n<td style=\"padding: 0px 5px;\">'inception_5b-output'<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"padding: 0px 5px;\">squeezenet<\/td>\r\n<td style=\"padding: 0px 5px;\">'relu_conv10'<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"padding: 0px 5px;\">resnet18<\/td>\r\n<td style=\"padding: 0px 5px;\">'res5b'<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<h6><\/h6>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">Now let\u2019s run this on an image.<\/span>\r\n<pre>im = imread('peppers.png');\r\nimResized = imresize(im,[inputSize(1),NaN]);\r\nimageActivations = activations(net,imResized,layerName);\r\n<\/pre>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">The class activation map for a specific class is the activation map of the ReLU layer, weighted by how much each activation contributes to the final score of the class.<\/span>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">The weights are from the final fully connected layer of the network for that class. 
SqueezeNet doesn\u2019t have a final fully connected layer, so the output of the ReLU layer is already the class activation map.<\/span>\r\n<h6><\/h6>\r\n<pre>scores = squeeze(mean(imageActivations,[1 2]));\r\n[~,classIds] = maxk(scores,3);\r\nclassActivationMap = imageActivations(:,:,classIds(1));\r\n<\/pre>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">If you\u2019re using another network (not SqueezeNet), it looks like this. Note that the top class IDs are found only after applying the fully connected weights, since the raw channel activations don\u2019t line up with class indices:<\/span>\r\n<pre>scores = squeeze(mean(imageActivations,[1 2]));\r\n\r\nif ~strcmp(netName,'squeezenet')   % char vectors of different lengths can't be compared with ~=\r\n    fcWeights = net.Layers(end-2).Weights;\r\n    fcBias = net.Layers(end-2).Bias;\r\n    scores = fcWeights*scores + fcBias;\r\n\r\n    [~,classIds] = maxk(scores,3);\r\n    weightVector = shiftdim(fcWeights(classIds(1),:),-1);\r\n    classActivationMap = sum(imageActivations.*weightVector,3);\r\nend\r\n<\/pre>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">Calculate the top class labels and the final normalized class scores.<\/span>\r\n<h6><\/h6>\r\n<pre>scores = exp(scores)\/sum(exp(scores));\r\nmaxScores = scores(classIds);\r\nlabels = classes(classIds);\r\n<\/pre>\r\n<span style=\"font-size: 14px;\">And visualize the results.<\/span>\r\n<pre>subplot(1,2,1);\r\nimshow(im);\r\nsubplot(1,2,2);\r\nCAMshow(im,classActivationMap);\r\ntitle(string(labels) + \", \" + string(maxScores));\r\n\r\ndrawnow;\r\n<\/pre>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">The activations for the top prediction are visualized. 
The top three predictions and their confidence scores are displayed in the title of the plot.<\/span>\r\n<h6><\/h6>\r\n<img decoding=\"async\" loading=\"lazy\" width=\"1456\" height=\"617\" class=\"alignnone size-full wp-image-1106\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2019\/01\/blog_vis2.png\" alt=\"\" \/>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">Now you can switch from a still image to a webcam, with these lines of code:<\/span>\r\n<h6><\/h6>\r\n<pre>h = figure('Units','normalized','Position',[.05 .05 .9 .8]);\r\nwhile ishandle(h)\r\n% im = imread('peppers.png'); &lt;-- remove this line\r\nim = snapshot(cam);<\/pre>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">Put an \u2018end\u2019 after drawnow to close the loop, and you\u2019re good to run this continuously. <em>If I lost you with any of these steps, a link to the full file is at the bottom of this page.<\/em><\/span>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">It\u2019s also interesting to note that you can do this for any class of the network. Take a look at this image below. I have a coffee cup that is being accurately predicted as a coffee mug. You can see those class activations. But why is it also being highly classified as an iPod? 
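<\/span>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">To dig in, you can point the class activation map at the runner-up class instead, a one-line tweak to the SqueezeNet code above (classIds(2) holds the second-highest class):<\/span>\r\n<pre>classActivationMap = imageActivations(:,:,classIds(2));   % second most likely class\r\n<\/pre>\r\n<span style=\"font-size: 14px;\">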
<\/span>\r\n<h6><\/h6>\r\n<img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-1092 size-large\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2019\/01\/activations1-1024x586.png\" alt=\"\" width=\"1024\" height=\"586\" \/>\r\n<h6><\/h6>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">You can switch the class activations to visualize the <em>second <\/em>most likely class, and see which features are triggering that prediction.<\/span>\r\n<h6><\/h6>\r\n<h6><\/h6>\r\n<img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-1090 size-large\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2019\/01\/activations2-1024x613.png\" alt=\"\" width=\"1024\" height=\"613\" \/>\r\n<h6><\/h6>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">Finally, you\u2019ll need two helper functions to make this example run:<\/span>\r\n<pre>function CAMshow(im,CAM)\r\n  imSize = size(im);\r\n  CAM = imresize(CAM,imSize(1:2));        % upsample the map to the image size\r\n  CAM = normalizeImage(CAM);\r\n  CAM(CAM &lt; .2) = 0;                      % suppress weak activations\r\n  cmap = jet(255).*linspace(0,1,255)';    % fade the low end of the colormap to black\r\n  CAM = ind2rgb(uint8(CAM*255),cmap)*255;\r\n\r\n  combinedImage = double(rgb2gray(im))\/2 + CAM;   % overlay the map on a dimmed grayscale image\r\n  combinedImage = normalizeImage(combinedImage)*255;\r\n  imshow(uint8(combinedImage));\r\nend\r\n\r\nfunction N = normalizeImage(I)\r\n  minimum = min(I(:));\r\n  maximum = max(I(:));\r\n  N = (I-minimum)\/(maximum-minimum);      % rescale to [0,1]\r\nend\r\n<\/pre>\r\n<span style=\"font-size: 14px;\">Grab the entire code with the blue \"Get the MATLAB Code\" link on the right. 
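<\/span>\r\n<h6><\/h6>\r\n<span style=\"font-size: 14px;\">One last bit of housekeeping from the full file: when you close the figure and exit the loop, release the webcam.<\/span>\r\n<pre>clear cam;   % disconnects and releases the webcam\r\n<\/pre>\r\n<span style=\"font-size: 14px;\">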
Happy visualization!<\/span>\r\n\r\n<span style=\"font-size: 14px;\">Leave a comment below, or follow me on Twitter!<\/span> <p><a href=\"https:\/\/twitter.com\/jo_pings?ref_src=twsrc%5Etfw\" class=\"twitter-follow-button\" data-size=\"large\" data-show-count=\"false\">Follow @jo_pings<\/a><script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/p>\r\n<script language=\"JavaScript\"> <!-- \r\n    function grabCode_a710f144b80042c592b9fe35aab1fc59() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='a710f144b80042c592b9fe35aab1fc59 ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' a710f144b80042c592b9fe35aab1fc59';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. 
\r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        copyright = 'Copyright 2019 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add copyright line at the bottom if specified.\r\n        if (copyright.length > 0) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n\r\n        d.title = title + ' (MATLAB code)';\r\n        d.close();\r\n    }   \r\n     --> <\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\">Copyright 2018 The MathWorks, Inc.<br><a href=\"javascript:grabCode_a710f144b80042c592b9fe35aab1fc59()\"><span class=\"get_ml_code\">Get the MATLAB code <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      <br><\/p><!--\r\na710f144b80042c592b9fe35aab1fc59 ##### SOURCE BEGIN #####\r\n\r\n% Code to run CAM visualizations\r\n% Runs on a webcam, but you can swap this out to be a still image instead\r\nnetName = 'squeezenet';\r\nnet = eval(netName);\r\n\r\ncam = webcam;\r\npreview(cam);\r\n\r\nisSqueezeNet = netName == \"squeezenet\";\r\n\r\ninputSize = net.Layers(1).InputSize(1:2);\r\nclasses = net.Layers(end).Classes;\r\nlayerName = 'relu_conv10';\r\n\r\nh = figure('Units','normalized','Position',[.05 .05 .9 .8]);\r\nwhile ishandle(h)\r\n    %im = imread('peppers.png');\r\n    im = snapshot(cam);\r\n    imResized = imresize(im,[inputSize(1),NaN]);\r\n    imageActivations = activations(net,imResized,layerName);\r\n    \r\n    scores = squeeze(mean(imageActivations,[1 2]));\r\n    \r\n    if isSqueezeNet\r\n        [~,classIds] = maxk(scores,3);\r\n        classActivationMap = imageActivations(:,:,classIds(1));\r\n    else\r\n        fcWeights = net.Layers(end-2).Weights;\r\n        fcBias = net.Layers(end-2).Bias;\r\n        scores = fcWeights*scores + fcBias;\r\n        \r\n        [~,classIds] = maxk(scores,3);\r\n        weightVector = shiftdim(fcWeights(classIds(1),:),-1);\r\n        classActivationMap = sum(imageActivations.*weightVector,3);\r\n    end\r\n    \r\n    scores = exp(scores)\/sum(exp(scores));\r\n    maxScores = scores(classIds);\r\n    labels = classes(classIds);\r\n    \r\n    subplot(1,2,1);\r\n    imshow(im);\r\n    subplot(1,2,2);\r\n    CAMshow(im,classActivationMap);\r\n    title(string(labels) + \", \" + string(maxScores));\r\n    \r\n    drawnow;\r\nend\r\n\r\nclear cam;\r\n\r\nfunction CAMshow(im,CAM)\r\nimSize = size(im);\r\nCAM = imresize(CAM,imSize(1:2));\r\nCAM = normalizeImage(CAM);\r\nCAM(CAM < .2) = 0;\r\ncmap = jet(255).*linspace(0,1,255)';\r\nCAM = ind2rgb(uint8(CAM*255),cmap)*255;\r\n\r\ncombinedImage = double(rgb2gray(im))\/2 + CAM;\r\ncombinedImage = normalizeImage(combinedImage)*255;\r\nimshow(uint8(combinedImage));\r\nend\r\nfunction N = normalizeImage(I)\r\nminimum = min(I(:));\r\nmaximum = max(I(:));\r\nN = (I-minimum)\/(maximum-minimum);\r\nend\r\n##### SOURCE END ##### a710f144b80042c592b9fe35aab1fc59\r\n--><!-- AddThis Sharing Buttons below -->","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img decoding=\"async\"  class=\"img-responsive\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2019\/01\/blog_vis2.png\" onError=\"this.style.display ='none';\" \/><\/div><p>I\u2019m hoping by now you\u2019ve heard that MATLAB has great visualizations, which can be helpful in deep learning to help uncover what\u2019s going on inside your neural network.\r\n\r\nLast post, we discussed... 
<a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/deep-learning\/2019\/01\/31\/deep-learning-visualizations-cam-visualization\/\">read more >><\/a><\/p>","protected":false},"author":156,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[9],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/1052"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/users\/156"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/comments?post=1052"}],"version-history":[{"count":63,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/1052\/revisions"}],"predecessor-version":[{"id":1270,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/1052\/revisions\/1270"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/media?parent=1052"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/categories?post=1052"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/tags?post=1052"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}