{"id":400,"date":"2018-06-08T17:46:23","date_gmt":"2018-06-08T17:46:23","guid":{"rendered":"https:\/\/blogs.mathworks.com\/deep-learning\/?p=400"},"modified":"2021-04-06T15:51:57","modified_gmt":"2021-04-06T19:51:57","slug":"semantic-segmentation-using-deep-learning","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/deep-learning\/2018\/06\/08\/semantic-segmentation-using-deep-learning\/","title":{"rendered":"Semantic Segmentation Using Deep Learning"},"content":{"rendered":"<p>Today I want to show you a <a href=\"https:\/\/www.mathworks.com\/help\/vision\/examples\/semantic-segmentation-using-deep-learning.html\">documentation example<\/a> that demonstrates how to train a semantic segmentation network using deep learning and the Computer Vision System Toolbox. <\/p>\r\n      <p>A semantic segmentation network classifies every pixel in an image, resulting in an image that is segmented by class. Applications for semantic segmentation include road segmentation for autonomous driving and cancer cell segmentation for medical diagnosis. To learn more, see <a href=\"https:\/\/www.mathworks.com\/help\/vision\/ug\/semantic-segmentation-basics.html\">Semantic Segmentation Basics<\/a>. <\/p>\r\n      <p>To illustrate the training procedure, this example trains SegNet, one type of convolutional neural network (CNN) designed for semantic image segmentation. Other types of networks for semantic segmentation include fully convolutional networks (FCN) and U-Net. The training procedure shown here can be applied to those networks too. <\/p>\r\n      <p>This example uses the <a href=\"http:\/\/mi.eng.cam.ac.uk\/research\/projects\/VideoRec\/CamVid\/\">CamVid dataset<\/a> from the University of Cambridge for training. This dataset is a collection of images containing street-level views obtained while driving. The dataset provides pixel-level labels for 32 semantic classes including car, pedestrian, and road. 
<\/p>\r\n      <h3>Setup<\/h3>\r\n      <p>This example creates the SegNet network with weights initialized from the VGG-16 network. To get VGG-16, install <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/61733-neural-network-toolbox-model-for-vgg-16-network\">Neural Network Toolbox\u2122 Model for VGG-16 Network<\/a>. After installation is complete, run the following code to verify that the installation is correct. <\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">vgg16();\r\n<\/pre><p>In addition, download a pretrained version of SegNet. The pretrained model allows you to run the entire example without having to wait for training to complete. <\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">pretrainedURL = <span style=\"color:rgb(160, 32, 240);\">'https:\/\/www.mathworks.com\/supportfiles\/vision\/data\/segnetVGG16CamVid.mat'<\/span>;\r\npretrainedFolder = fullfile(tempdir,<span style=\"color:rgb(160, 32, 240);\">'pretrainedSegNet'<\/span>);\r\npretrainedSegNet = fullfile(pretrainedFolder,<span style=\"color:rgb(160, 32, 240);\">'segnetVGG16CamVid.mat'<\/span>); \r\n<span style=\"color:rgb(0, 0, 255);\">if<\/span> ~exist(pretrainedFolder,<span style=\"color:rgb(160, 32, 240);\">'dir'<\/span>)\r\n    mkdir(pretrainedFolder);\r\n    disp(<span style=\"color:rgb(160, 32, 240);\">'Downloading pretrained SegNet (107 MB)...'<\/span>);\r\n    websave(pretrainedSegNet,pretrainedURL);\r\n<span style=\"color:rgb(0, 0, 255);\">end<\/span>\r\n<\/pre><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">Downloading 
pretrained SegNet (107 MB)...\r\n<\/pre><p>A CUDA-capable NVIDIA\u2122 GPU with compute capability 3.0 or higher is highly recommended for running this example. Use of a GPU requires Parallel Computing Toolbox\u2122. <\/p>\r\n      <h3>Download CamVid Dataset<\/h3>\r\n      <p>Download the CamVid dataset from the following URLs.<\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">imageURL = <span style=\"color:rgb(160, 32, 240);\">'http:\/\/web4.cs.ucl.ac.uk\/staff\/g.brostow\/MotionSegRecData\/files\/701_StillsRaw_full.zip'<\/span>;\r\nlabelURL = <span style=\"color:rgb(160, 32, 240);\">'http:\/\/web4.cs.ucl.ac.uk\/staff\/g.brostow\/MotionSegRecData\/data\/LabeledApproved_full.zip'<\/span>;\r\noutputFolder = fullfile(tempdir,<span style=\"color:rgb(160, 32, 240);\">'CamVid'<\/span>);\r\n<span style=\"color:rgb(0, 0, 255);\">if<\/span> ~exist(outputFolder, <span style=\"color:rgb(160, 32, 240);\">'dir'<\/span>)\r\n   \r\n    mkdir(outputFolder)\r\n    labelsZip = fullfile(outputFolder,<span style=\"color:rgb(160, 32, 240);\">'labels.zip'<\/span>);\r\n    imagesZip = fullfile(outputFolder,<span style=\"color:rgb(160, 32, 240);\">'images.zip'<\/span>);   \r\n    \r\n    disp(<span style=\"color:rgb(160, 32, 240);\">'Downloading 16 MB CamVid dataset labels...'<\/span>); \r\n    websave(labelsZip, labelURL);\r\n    unzip(labelsZip, fullfile(outputFolder,<span style=\"color:rgb(160, 32, 240);\">'labels'<\/span>));\r\n    \r\n    disp(<span style=\"color:rgb(160, 32, 240);\">'Downloading 557 MB CamVid dataset images...'<\/span>);  \r\n    websave(imagesZip, imageURL);       \r\n    unzip(imagesZip, fullfile(outputFolder,<span style=\"color:rgb(160, 32, 240);\">'images'<\/span>));    \r\n<span style=\"color:rgb(0, 0, 255);\">end<\/span>\r\n<\/pre><p>Note: Download time of the data 
depends on your Internet connection. The commands used above block MATLAB until the download is complete. Alternatively, you can use your web browser to first download the dataset to your local disk. To use the file you downloaded from the web, change the <inline style=\"font-family: monospace, monospace; font-size: inherit;\">outputFolder<\/inline> variable above to the location of the downloaded file. <\/p>\r\n      <h3>Load CamVid Images<\/h3>\r\n      <p>Use <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/matlab.io.datastore.imagedatastore.html\">imageDatastore<\/a> to load CamVid images. The <inline style=\"font-family: monospace, monospace; font-size: inherit;\">imageDatastore<\/inline> enables you to efficiently load a large collection of images on disk. <\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">imgDir = fullfile(outputFolder,<span style=\"color:rgb(160, 32, 240);\">'images'<\/span>,<span style=\"color:rgb(160, 32, 240);\">'701_StillsRaw_full'<\/span>);\r\nimds = imageDatastore(imgDir);\r\n<\/pre><p>Display one of the images.<\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">I = readimage(imds,1);\r\nI = histeq(I);\r\nimshow(I)\r\n<\/pre><img decoding=\"async\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2018\/06\/SemanticSegmentationUsingDeepLearningExample_1.png\"><h3>Load CamVid Pixel-Labeled Images<\/h3>\r\n      <p>Use <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/matlab.io.datastore.imagedatastore.html\"><tt>imageDatastore<\/tt><\/a> to load CamVid pixel label image data. 
A <inline style=\"font-family: monospace, monospace; font-size: inherit;\">pixelLabelDatastore<\/inline> encapsulates the pixel label data and the mapping from label IDs to class names. <\/p>\r\n      <p>Following the procedure used in the original SegNet paper (Badrinarayanan, Vijay, Alex Kendall, and Roberto Cipolla. \"SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation.\" arXiv preprint arXiv:1511.00561, 2015), group the 32 original classes in CamVid into 11 classes. Specify these classes. <\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">classes = [\r\n    \"Sky\"\r\n    \"Building\"\r\n    \"Pole\"\r\n    \"Road\"\r\n    \"Pavement\"\r\n    \"Tree\"\r\n    \"SignSymbol\"\r\n    \"Fence\"\r\n    \"Car\"\r\n    \"Pedestrian\"\r\n    \"Bicyclist\"\r\n    ];\r\n<\/pre><p>To reduce the 32 classes to 11, multiple classes from the original dataset are grouped together. For example, \"Car\" is a combination of \"Car\", \"SUVPickupTruck\", \"Truck_Bus\", \"Train\", and \"OtherMoving\". Return the grouped label IDs by using the supporting function <inline style=\"font-family: monospace, monospace; font-size: inherit;\">camvidPixelLabelIDs<\/inline>, which is listed at the end of this example. 
<\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">labelIDs = camvidPixelLabelIDs();\r\n<\/pre><p>Use the classes and label IDs to create the <inline style=\"font-family: monospace, monospace; font-size: inherit;\">pixelLabelDatastore.<\/inline> <\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">labelDir = fullfile(outputFolder,<span style=\"color:rgb(160, 32, 240);\">'labels'<\/span>);\r\npxds = pixelLabelDatastore(labelDir,classes,labelIDs);\r\n<\/pre><p>Read and display one of the pixel-labeled images by overlaying it on top of an image.<\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">C = readimage(pxds,1);\r\ncmap = camvidColorMap;\r\nB = labeloverlay(I,C,<span style=\"color:rgb(160, 32, 240);\">'ColorMap'<\/span>,cmap);\r\nimshow(B)\r\npixelLabelColorbar(cmap,classes);\r\n<\/pre><img decoding=\"async\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2018\/06\/SemanticSegmentationUsingDeepLearningExample_2.png\"><p>Areas with no color overlay do not have pixel labels and are not used during training.<\/p>\r\n      <h3>Analyze Dataset Statistics<\/h3>\r\n      <p>To see the distribution of class labels in the CamVid dataset, use <a href=\"https:\/\/www.mathworks.com\/help\/vision\/ref\/pixellabelimagedatastore.counteachlabel.html\"><tt>countEachLabel<\/tt><\/a>. This function counts the number of pixels by class label. 
<\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">tbl = countEachLabel(pxds)\r\n<\/pre><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">tbl=<em>11\u00d73 table<\/em>\r\n        Name        PixelCount    ImagePixelCount\r\n    ____________    __________    _______________\r\n    'Sky'            76801167        483148800   \r\n    'Building'      117373718        483148800   \r\n    'Pole'            4798742        483148800   \r\n    'Road'          140535728        484531200   \r\n    'Pavement'       33614414        472089600   \r\n    'Tree'           54258673        447897600   \r\n    'SignSymbol'      5224247        468633600   \r\n    'Fence'           6921061        251596800   \r\n    'Car'            24436957        483148800   \r\n    'Pedestrian'      3402909        444441600   \r\n    'Bicyclist'       2591222        261964800   \r\n<\/pre><p>Visualize the pixel counts by class.<\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">frequency = tbl.PixelCount\/sum(tbl.PixelCount);\r\nbar(1:numel(classes),frequency)\r\nxticks(1:numel(classes)) \r\nxticklabels(tbl.Name)\r\nxtickangle(45)\r\nylabel(<span style=\"color:rgb(160, 32, 240);\">'Frequency'<\/span>)\r\n<\/pre><img decoding=\"async\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2018\/06\/SemanticSegmentationUsingDeepLearningExample_3.png\"><p>Ideally, all classes would have an equal number of observations. However, the classes in CamVid are imbalanced, which is a common issue in automotive datasets of street scenes. 
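<\/p><p>To quantify the imbalance, it can help to compare the most and least frequent classes using the <inline style=\"font-family: monospace, monospace; font-size: inherit;\">frequency<\/inline> vector computed above. The following diagnostic is a sketch added for illustration and is not part of the original example. <\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">% Sketch (not part of the original example): ratio between the most\r\n% and least frequent classes, a quick measure of class imbalance.\r\n[maxFreq,iMax] = max(frequency);\r\n[minFreq,iMin] = min(frequency);\r\nfprintf('%s is roughly %.0f times more frequent than %s.\\n', ...\r\n    tbl.Name{iMax},maxFreq\/minFreq,tbl.Name{iMin});\r\n<\/pre><p>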
Such scenes have more sky, building, and road pixels than pedestrian and bicyclist pixels because sky, buildings and roads cover more area in the image. If not handled correctly, this imbalance can be detrimental to the learning process because the learning is biased in favor of the dominant classes. Later on in this example, you will use class weighting to handle this issue. <\/p>\r\n      <h3>Resize CamVid Data<\/h3>\r\n      <p>The images in the CamVid data set are 720 by 960. To reduce training time and memory usage, resize the images and pixel label images to 360 by 480. <inline style=\"font-family: monospace, monospace; font-size: inherit;\">resizeCamVidImages<\/inline> and <inline style=\"font-family: monospace, monospace; font-size: inherit;\">resizeCamVidPixelLabels<\/inline> are supporting functions listed at the end of this example. <\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">imageFolder = fullfile(outputFolder,<span style=\"color:rgb(160, 32, 240);\">'imagesResized'<\/span>,filesep);\r\nimds = resizeCamVidImages(imds,imageFolder);\r\nlabelFolder = fullfile(outputFolder,<span style=\"color:rgb(160, 32, 240);\">'labelsResized'<\/span>,filesep);\r\npxds = resizeCamVidPixelLabels(pxds,labelFolder);\r\n<\/pre><h3>Prepare Training and Test Sets<\/h3>\r\n      <p>SegNet is trained using 60% of the images from the dataset. The rest of the images are used for testing. The following code randomly splits the image and pixel label data into a training and test set. 
<\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">[imdsTrain,imdsTest,pxdsTrain,pxdsTest] = partitionCamVidData(imds,pxds);\r\n<\/pre><p>The 60\/40 split results in the following number of training and test images:<\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">numTrainingImages = numel(imdsTrain.Files)\r\n<\/pre><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">numTrainingImages = \r\n   421\r\n<\/pre><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">numTestingImages = numel(imdsTest.Files)\r\n<\/pre><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">numTestingImages = \r\n   280\r\n<\/pre><h3>Create the Network<\/h3>\r\n      <p>Use <a href=\"https:\/\/www.mathworks.com\/help\/vision\/ref\/segnetlayers.html\"><tt>segnetLayers<\/tt><\/a> to create a SegNet network initialized using VGG-16 weights. <inline style=\"font-family: monospace, monospace; font-size: inherit;\">segnetLayers<\/inline> automatically performs the network surgery needed to transfer the weights from VGG-16 and adds the additional layers required for semantic segmentation. 
<\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">imageSize = [360 480 3];\r\nnumClasses = numel(classes);\r\nlgraph = segnetLayers(imageSize,numClasses,<span style=\"color:rgb(160, 32, 240);\">'vgg16'<\/span>);\r\n<\/pre><p>The image size is selected based on the size of the images in the dataset. The number of classes is selected based on the classes in CamVid. <\/p>\r\n      <h3>Balance Classes Using Class Weighting<\/h3>\r\n      <p>As shown earlier, the classes in CamVid are not balanced. To improve training, you can use class weighting to balance the classes. Use the pixel label counts computed earlier with <a href=\"https:\/\/www.mathworks.com\/help\/vision\/ref\/pixellabelimagedatastore.counteachlabel.html\"><tt>countEachLabel<\/tt><\/a> and calculate the median frequency class weights. <\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">imageFreq = tbl.PixelCount .\/ tbl.ImagePixelCount;\r\nclassWeights = median(imageFreq) .\/ imageFreq\r\n<\/pre><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">classWeights = <em>11\u00d71<\/em>\r\n\r\n   0.318184709354742\r\n   0.208197860785155\r\n   5.092367332938507\r\n   0.174381825257403\r\n   0.710338097812948\r\n   0.417518560687874\r\n   4.537074815482926\r\n   1.838648261914560\r\n   1.000000000000000\r\n   6.605878573155874\r\n      \u22ee\r\n\r\n<\/pre><p>Specify the class weights using a <a href=\"https:\/\/www.mathworks.com\/help\/vision\/ref\/nnet.cnn.layer.pixelclassificationlayer.html\"><tt>pixelClassificationLayer<\/tt><\/a>. 
<\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">pxLayer = pixelClassificationLayer(<span style=\"color:rgb(160, 32, 240);\">'Name'<\/span>,<span style=\"color:rgb(160, 32, 240);\">'labels'<\/span>,<span style=\"color:rgb(160, 32, 240);\">'ClassNames'<\/span>,tbl.Name,<span style=\"color:rgb(160, 32, 240);\">'ClassWeights'<\/span>,classWeights)\r\n<\/pre><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">pxLayer = \r\n  PixelClassificationLayer with properties:\r\n\r\n            Name: 'labels'\r\n      ClassNames: {11\u00d71 cell}\r\n    ClassWeights: [11\u00d71 double]\r\n      OutputSize: 'auto'\r\n   Hyperparameters\r\n    LossFunction: 'crossentropyex'\r\n<\/pre><p>Update the SegNet network with the new <inline style=\"font-family: monospace, monospace; font-size: inherit;\">pixelClassificationLayer<\/inline> by removing the current <inline style=\"font-family: monospace, monospace; font-size: inherit;\">pixelClassificationLayer<\/inline> and adding the new layer. The current <inline style=\"font-family: monospace, monospace; font-size: inherit;\">pixelClassificationLayer<\/inline> is named 'pixelLabels'. Remove it using <a title=\"https:\/\/www.mathworks.com\/help\/nnet\/ref\/removelayers.html (link no longer works)\"><tt>removeLayers<\/tt><\/a>, add the new one using <a title=\"https:\/\/www.mathworks.com\/help\/nnet\/ref\/addlayers.html (link no longer works)\">addLayers<\/a>, and connect the new layer to the rest of the network using <a title=\"https:\/\/www.mathworks.com\/help\/nnet\/ref\/connectlayers.html (link no longer works)\">connectLayers<\/a>. 
<\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">lgraph = removeLayers(lgraph,<span style=\"color:rgb(160, 32, 240);\">'pixelLabels'<\/span>);\r\nlgraph = addLayers(lgraph, pxLayer);\r\nlgraph = connectLayers(lgraph,<span style=\"color:rgb(160, 32, 240);\">'softmax'<\/span>,<span style=\"color:rgb(160, 32, 240);\">'labels'<\/span>);\r\n<\/pre><h3>Select Training Options<\/h3>\r\n      <p>The optimization algorithm used for training is stochastic gradient descent with momentum (SGDM). Use <a href=\"https:\/\/www.mathworks.com\/help\/nnet\/ref\/trainingoptions.html\"><tt>trainingOptions<\/tt><\/a> to specify the hyperparameters used for SGDM. <\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">options = trainingOptions(<span style=\"color:rgb(160, 32, 240);\">'sgdm'<\/span>, <span style=\"color:rgb(0, 0, 255);\">...<\/span>\r\n    <span style=\"color:rgb(160, 32, 240);\">'Momentum'<\/span>,0.9, <span style=\"color:rgb(0, 0, 255);\">...<\/span>\r\n    <span style=\"color:rgb(160, 32, 240);\">'InitialLearnRate'<\/span>,1e-3, <span style=\"color:rgb(0, 0, 255);\">...<\/span>\r\n    <span style=\"color:rgb(160, 32, 240);\">'L2Regularization'<\/span>,0.0005, <span style=\"color:rgb(0, 0, 255);\">...<\/span>\r\n    <span style=\"color:rgb(160, 32, 240);\">'MaxEpochs'<\/span>,100, <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\">  <\/span>\r\n    <span style=\"color:rgb(160, 32, 240);\">'MiniBatchSize'<\/span>,4, <span style=\"color:rgb(0, 0, 255);\">...<\/span>\r\n    <span style=\"color:rgb(160, 32, 240);\">'Shuffle'<\/span>,<span style=\"color:rgb(160, 32, 
240);\">'every-epoch'<\/span>, <span style=\"color:rgb(0, 0, 255);\">...<\/span>\r\n    <span style=\"color:rgb(160, 32, 240);\">'VerboseFrequency'<\/span>,2);\r\n<\/pre><p>A minibatch size of 4 is used to reduce memory usage while training. You can increase or decrease this value based on the amount of GPU memory you have on your system. <\/p>\r\n      <h3>Data Augmentation<\/h3>\r\n      <p>Data augmentation is used during training to provide more examples to the network, which helps improve the accuracy of the network. Here, random left\/right reflection and random X\/Y translation of +\/- 10 pixels are used for data augmentation. <\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">augmenter = imageDataAugmenter(<span style=\"color:rgb(160, 32, 240);\">'RandXReflection'<\/span>,true,<span style=\"color:rgb(0, 0, 255);\">...<\/span>\r\n    <span style=\"color:rgb(160, 32, 240);\">'RandXTranslation'<\/span>,[-10 10],<span style=\"color:rgb(160, 32, 240);\">'RandYTranslation'<\/span>,[-10 10]);\r\n<\/pre><p> <inline style=\"font-family: monospace, monospace; font-size: inherit;\">imageDataAugmenter<\/inline> supports several other types of data augmentation. Choosing among them requires empirical analysis and is another level of hyperparameter tuning. <\/p>\r\n      <h3>Start Training<\/h3>\r\n      <p>Combine the training data and data augmentation selections using <a href=\"https:\/\/www.mathworks.com\/help\/vision\/ref\/pixellabelimagedatastore.html\"><tt>pixelLabelImageDatastore<\/tt><\/a>. The <inline style=\"font-family: monospace, monospace; font-size: inherit;\">pixelLabelImageDatastore<\/inline> reads batches of training data, applies data augmentation, and sends the augmented data to the training algorithm. 
<\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">pximds = pixelLabelImageDatastore(imdsTrain,pxdsTrain,<span style=\"color:rgb(0, 0, 255);\">...<\/span>\r\n    <span style=\"color:rgb(160, 32, 240);\">'DataAugmentation'<\/span>,augmenter);\r\n<\/pre><p>Start training if the <inline style=\"font-family: monospace, monospace; font-size: inherit;\">doTraining<\/inline> flag is true. Otherwise, load a pretrained network. Note: Training takes about 5 hours on an NVIDIA\u2122 Titan X and can take even longer depending on your GPU hardware. <\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">doTraining = false;\r\n<span style=\"color:rgb(0, 0, 255);\">if<\/span> doTraining    \r\n    [net, info] = trainNetwork(pximds,lgraph,options);\r\n<span style=\"color:rgb(0, 0, 255);\">else<\/span>\r\n    data = load(pretrainedSegNet);\r\n    net = data.net;\r\n<span style=\"color:rgb(0, 0, 255);\">end<\/span>\r\n<\/pre><h3>Test Network on One Image<\/h3>\r\n      <p>As a quick sanity check, run the trained network on one test image. 
<\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">I = read(imdsTest);\r\nC = semanticseg(I, net);\r\n<\/pre><p>Display the results.<\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">B = labeloverlay(I,C,<span style=\"color:rgb(160, 32, 240);\">'Colormap'<\/span>,cmap,<span style=\"color:rgb(160, 32, 240);\">'Transparency'<\/span>,0.4);\r\nimshow(B)\r\npixelLabelColorbar(cmap, classes);\r\n<\/pre><img decoding=\"async\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2018\/06\/SemanticSegmentationUsingDeepLearningExample_4.png\"><p>Compare the results in <inline style=\"font-family: monospace, monospace; font-size: inherit;\">C<\/inline> with the expected ground truth stored in <inline style=\"font-family: monospace, monospace; font-size: inherit;\">pxdsTest<\/inline>. The green and magenta regions highlight areas where the segmentation results differ from the expected ground truth. <\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">expectedResult = read(pxdsTest);\r\nactual = uint8(C);\r\nexpected = uint8(expectedResult);\r\nimshowpair(actual, expected)\r\n<\/pre><img decoding=\"async\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2018\/06\/SemanticSegmentationUsingDeepLearningExample_5.png\"><p>Visually, the semantic segmentation results overlap well for classes such as road, sky, and building. However, smaller objects like pedestrians and cars are not as accurate. 
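<\/p><p>Before computing formal metrics, a simple sanity number is the fraction of labeled pixels that match the ground truth. The following snippet is a sketch added for illustration and is not part of the original example; it reuses the categorical arrays <inline style=\"font-family: monospace, monospace; font-size: inherit;\">C<\/inline> and <inline style=\"font-family: monospace, monospace; font-size: inherit;\">expectedResult<\/inline> from above. <\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">% Sketch (not part of the original example): per-image pixel accuracy\r\n% over labeled pixels only. Pixels without labels are undefined in the\r\n% categorical ground truth and are excluded from the comparison.\r\nvalid = ~isundefined(expectedResult);\r\npixelAccuracy = mean(C(valid) == expectedResult(valid))\r\n<\/pre><p>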
The amount of overlap per class can be measured using the intersection-over-union (IoU) metric, also known as the Jaccard index. Use the <a href=\"https:\/\/www.mathworks.com\/help\/images\/ref\/jaccard.html\"><tt>jaccard<\/tt><\/a> function to measure IoU. <\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">iou = jaccard(C, expectedResult);\r\ntable(classes,iou)\r\n<\/pre><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">ans=<em>11\u00d72 table<\/em>\r\n      classes              iou        \r\n    ____________    __________________\r\n    \"Sky\"            0.926585343977038\r\n    \"Building\"       0.798698991022729\r\n    \"Pole\"           0.169776501947919\r\n    \"Road\"           0.951766120547122\r\n    \"Pavement\"       0.418766821629557\r\n    \"Tree\"           0.434014251781473\r\n    \"SignSymbol\"     0.325092056812204\r\n    \"Fence\"           0.49200469780468\r\n    \"Car\"           0.0687557042896258\r\n    \"Pedestrian\"                     0\r\n    \"Bicyclist\"                      0\r\n<\/pre><p>The IoU metric confirms the visual results. Road, sky, and building classes have high IoU scores, while classes such as pedestrian and car have low scores. Other common segmentation metrics include the <a href=\"https:\/\/www.mathworks.com\/help\/images\/ref\/dice.html\">Dice index<\/a> and the <a href=\"https:\/\/www.mathworks.com\/help\/images\/ref\/bfscore.html\">Boundary-F1<\/a> contour matching score. <\/p>\r\n      <h3>Evaluate Trained Network<\/h3>\r\n      <p>To measure accuracy for multiple test images, run <a href=\"https:\/\/www.mathworks.com\/help\/vision\/ref\/semanticseg.html\"><tt>semanticseg<\/tt><\/a> on the entire test set. 
<\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">pxdsResults = semanticseg(imdsTest,net,<span style=\"color:rgb(160, 32, 240);\">'MiniBatchSize'<\/span>,4,<span style=\"color:rgb(160, 32, 240);\">'WriteLocation'<\/span>,tempdir,<span style=\"color:rgb(160, 32, 240);\">'Verbose'<\/span>,false);\r\n<\/pre><p> <inline style=\"font-family: monospace, monospace; font-size: inherit;\">semanticseg<\/inline> returns the results for the test set as a <inline style=\"font-family: monospace, monospace; font-size: inherit;\">pixelLabelDatastore<\/inline> object. The actual pixel label data for each test image in <inline style=\"font-family: monospace, monospace; font-size: inherit;\">imdsTest<\/inline> is written to disk in the location specified by the <inline style=\"font-family: monospace, monospace; font-size: inherit;\">'WriteLocation'<\/inline> parameter. Use <a href=\"https:\/\/www.mathworks.com\/help\/vision\/ref\/evaluatesemanticsegmentation.html\"><tt>evaluateSemanticSegmentation<\/tt><\/a> to measure semantic segmentation metrics on the test set results. <\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">metrics = evaluateSemanticSegmentation(pxdsResults,pxdsTest,<span style=\"color:rgb(160, 32, 240);\">'Verbose'<\/span>,false);\r\n<\/pre><p> <inline style=\"font-family: monospace, monospace; font-size: inherit;\">evaluateSemanticSegmentation<\/inline> returns various metrics for the entire dataset, for individual classes, and for each test image. 
To see the dataset-level metrics, inspect <inline style=\"font-family: monospace, monospace; font-size: inherit;\">metrics.DataSetMetrics<\/inline>. <\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">metrics.DataSetMetrics\r\n<\/pre><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">ans=<em>1\u00d75 table<\/em>\r\n     GlobalAccuracy        MeanAccuracy            MeanIoU            WeightedIoU         MeanBFScore   \r\n    _________________    _________________    _________________    _________________    ________________\r\n    0.882035049405331    0.850970241394654    0.608927281006314    0.797947090677593    0.60980715338674\r\n<\/pre><p>The dataset metrics provide a high-level overview of the network performance. To see the impact each class has on the overall performance, inspect the per-class metrics using <inline style=\"font-family: monospace, monospace; font-size: inherit;\">metrics.ClassMetrics<\/inline>. 
<\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">metrics.ClassMetrics\r\n<\/pre><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">ans=<em>11\u00d73 table<\/em>\r\n                      Accuracy                IoU              MeanBFScore   \r\n                  _________________    _________________    _________________\r\n    Sky           0.934932109589398    0.892435212043741    0.881521241030993\r\n    Building      0.797763575866624    0.752633046400693    0.597070806633627\r\n    Pole          0.726347220018996    0.186622256135469    0.522519568793497\r\n    Road          0.936763259117679    0.906720411900943    0.710433513101952\r\n    Pavement      0.906740772559168    0.728650096831083    0.703619961786386\r\n    Tree          0.866574402823008    0.737468334515386    0.664211092196979\r\n    SignSymbol    0.755895966085333    0.345193190798607    0.434011059025598\r\n    Fence         0.828068989656379    0.505920925889568     0.50829520978596\r\n    Car           0.911873566421394    0.750012303035288    0.643524410331899\r\n    Pedestrian     0.84866313766479    0.350461157529184     0.45550879471499\r\n    Bicyclist     0.847049655538425    0.542083155989493    0.468181589716695\r\n<\/pre><p>Although the overall dataset performance is quite high, the class metrics show that underrepresented classes such as <inline style=\"font-family: monospace, monospace; font-size: inherit;\">Pedestrian<\/inline>, <inline style=\"font-family: monospace, monospace; font-size: inherit;\">Bicyclist<\/inline>, and <inline style=\"font-family: monospace, monospace; font-size: inherit;\">Car<\/inline> are not segmented as well as classes such as <inline style=\"font-family: monospace, monospace; font-size: 
inherit;\">Road<\/inline>, <inline style=\"font-family: monospace, monospace; font-size: inherit;\">Sky<\/inline>, and <inline style=\"font-family: monospace, monospace; font-size: inherit;\">Building<\/inline>. Additional data that includes more samples of the underrepresented classes might help improve the results. <\/p>\r\n      <h3>Supporting Functions<\/h3><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\"><span style=\"color:rgb(0, 0, 255);\">function<\/span> labelIDs = camvidPixelLabelIDs()\r\n<span style=\"color:rgb(34, 139, 34);\">% Return the label IDs corresponding to each class.<\/span>\r\n<span style=\"color:rgb(34, 139, 34);\">%<\/span>\r\n<span style=\"color:rgb(34, 139, 34);\">% The CamVid dataset has 32 classes. Group them into 11 classes following<\/span>\r\n<span style=\"color:rgb(34, 139, 34);\">% the original SegNet training methodology [1].<\/span>\r\n<span style=\"color:rgb(34, 139, 34);\">%<\/span>\r\n<span style=\"color:rgb(34, 139, 34);\">% The 11 classes are:<\/span>\r\n<span style=\"color:rgb(34, 139, 34);\">%   \"Sky\" \"Building\", \"Pole\", \"Road\", \"Pavement\", \"Tree\", \"SignSymbol\",<\/span>\r\n<span style=\"color:rgb(34, 139, 34);\">%   \"Fence\", \"Car\", \"Pedestrian\",  and \"Bicyclist\".<\/span>\r\n<span style=\"color:rgb(34, 139, 34);\">%<\/span>\r\n<span style=\"color:rgb(34, 139, 34);\">% CamVid pixel label IDs are provided as RGB color values. Group them into<\/span>\r\n<span style=\"color:rgb(34, 139, 34);\">% 11 classes and return them as a cell array of M-by-3 matrices. The<\/span>\r\n<span style=\"color:rgb(34, 139, 34);\">% original CamVid class names are listed alongside each RGB value. 
Note<\/span>\r\n<span style=\"color:rgb(34, 139, 34);\">% that the Other\/Void classes are excluded below.<\/span>\r\nlabelIDs = { <span style=\"color:rgb(0, 0, 255);\">...<\/span>\r\n    \r\n    <span style=\"color:rgb(34, 139, 34);\">% \"Sky\"<\/span>\r\n    [\r\n    128 128 128; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"Sky\"<\/span>\r\n    ]\r\n    \r\n    <span style=\"color:rgb(34, 139, 34);\">% \"Building\" <\/span>\r\n    [\r\n    000 128 064; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"Bridge\"<\/span>\r\n    128 000 000; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"Building\"<\/span>\r\n    064 192 000; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"Wall\"<\/span>\r\n    064 000 064; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"Tunnel\"<\/span>\r\n    192 000 128; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"Archway\"<\/span>\r\n    ]\r\n    \r\n    <span style=\"color:rgb(34, 139, 34);\">% \"Pole\"<\/span>\r\n    [\r\n    192 192 128; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"Column_Pole\"<\/span>\r\n    000 000 064; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"TrafficCone\"<\/span>\r\n    ]\r\n    \r\n    <span style=\"color:rgb(34, 139, 34);\">% \"Road\"<\/span>\r\n    [\r\n    128 064 128; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"Road\"<\/span>\r\n    128 000 192; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"LaneMkgsDriv\"<\/span>\r\n    192 000 064; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % 
\"LaneMkgsNonDriv\"<\/span>\r\n    ]\r\n    \r\n    <span style=\"color:rgb(34, 139, 34);\">% \"Pavement\"<\/span>\r\n    [\r\n    000 000 192; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"Sidewalk\" <\/span>\r\n    064 192 128; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"ParkingBlock\"<\/span>\r\n    128 128 192; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"RoadShoulder\"<\/span>\r\n    ]\r\n        \r\n    <span style=\"color:rgb(34, 139, 34);\">% \"Tree\"<\/span>\r\n    [\r\n    128 128 000; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"Tree\"<\/span>\r\n    192 192 000; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"VegetationMisc\"<\/span>\r\n    ]\r\n    \r\n    <span style=\"color:rgb(34, 139, 34);\">% \"SignSymbol\"<\/span>\r\n    [\r\n    192 128 128; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"SignSymbol\"<\/span>\r\n    128 128 064; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"Misc_Text\"<\/span>\r\n    000 064 064; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"TrafficLight\"<\/span>\r\n    ]\r\n    \r\n    <span style=\"color:rgb(34, 139, 34);\">% \"Fence\"<\/span>\r\n    [\r\n    064 064 128; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"Fence\"<\/span>\r\n    ]\r\n    \r\n    <span style=\"color:rgb(34, 139, 34);\">% \"Car\"<\/span>\r\n    [\r\n    064 000 128; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"Car\"<\/span>\r\n    064 128 192; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"SUVPickupTruck\"<\/span>\r\n    192 128 
192; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"Truck_Bus\"<\/span>\r\n    192 064 128; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"Train\"<\/span>\r\n    128 064 064; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"OtherMoving\"<\/span>\r\n    ]\r\n    \r\n    <span style=\"color:rgb(34, 139, 34);\">% \"Pedestrian\"<\/span>\r\n    [\r\n    064 064 000; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"Pedestrian\"<\/span>\r\n    192 128 064; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"Child\"<\/span>\r\n    064 000 192; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"CartLuggagePram\"<\/span>\r\n    064 128 064; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"Animal\"<\/span>\r\n    ]\r\n    \r\n    <span style=\"color:rgb(34, 139, 34);\">% \"Bicyclist\"<\/span>\r\n    [\r\n    000 128 192; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"Bicyclist\"<\/span>\r\n    192 000 192; <span style=\"color:rgb(0, 0, 255);\">...<\/span><span style=\"color:rgb(34, 139, 34);\"> % \"MotorcycleScooter\"<\/span>\r\n    ]\r\n    \r\n    };\r\n<span style=\"color:rgb(0, 0, 255);\">end<\/span>\r\n<span style=\"color:rgb(0, 0, 255);\">function<\/span> pixelLabelColorbar(cmap, classNames)\r\n<span style=\"color:rgb(34, 139, 34);\">% Add a colorbar to the current axis. 
The colorbar is formatted<\/span>\r\n<span style=\"color:rgb(34, 139, 34);\">% to display the class names with the color.<\/span>\r\ncolormap(gca,cmap)\r\n<span style=\"color:rgb(34, 139, 34);\">% Add colorbar to current figure.<\/span>\r\nc = colorbar(<span style=\"color:rgb(160, 32, 240);\">'peer'<\/span>, gca);\r\n<span style=\"color:rgb(34, 139, 34);\">% Use class names for tick marks.<\/span>\r\nc.TickLabels = classNames;\r\nnumClasses = size(cmap,1);\r\n<span style=\"color:rgb(34, 139, 34);\">% Center tick labels.<\/span>\r\nc.Ticks = 1\/(numClasses*2):1\/numClasses:1;\r\n<span style=\"color:rgb(34, 139, 34);\">% Remove tick mark.<\/span>\r\nc.TickLength = 0;\r\n<span style=\"color:rgb(0, 0, 255);\">end<\/span>\r\n<span style=\"color:rgb(0, 0, 255);\">function<\/span> cmap = camvidColorMap()\r\n<span style=\"color:rgb(34, 139, 34);\">% Define the colormap used by CamVid dataset.<\/span>\r\ncmap = [\r\n    128 128 128   <span style=\"color:rgb(34, 139, 34);\">% Sky<\/span>\r\n    128 0 0       <span style=\"color:rgb(34, 139, 34);\">% Building<\/span>\r\n    192 192 192   <span style=\"color:rgb(34, 139, 34);\">% Pole<\/span>\r\n    128 64 128    <span style=\"color:rgb(34, 139, 34);\">% Road<\/span>\r\n    60 40 222     <span style=\"color:rgb(34, 139, 34);\">% Pavement<\/span>\r\n    128 128 0     <span style=\"color:rgb(34, 139, 34);\">% Tree<\/span>\r\n    192 128 128   <span style=\"color:rgb(34, 139, 34);\">% SignSymbol<\/span>\r\n    64 64 128     <span style=\"color:rgb(34, 139, 34);\">% Fence<\/span>\r\n    64 0 128      <span style=\"color:rgb(34, 139, 34);\">% Car<\/span>\r\n    64 64 0       <span style=\"color:rgb(34, 139, 34);\">% Pedestrian<\/span>\r\n    0 128 192     <span style=\"color:rgb(34, 139, 34);\">% Bicyclist<\/span>\r\n    ];\r\n<span style=\"color:rgb(34, 139, 34);\">% Normalize between [0 1].<\/span>\r\ncmap = cmap .\/ 255;\r\n<span style=\"color:rgb(0, 0, 255);\">end<\/span>\r\n<span style=\"color:rgb(0, 0, 
255);\">function<\/span> imds = resizeCamVidImages(imds, imageFolder)\r\n<span style=\"color:rgb(34, 139, 34);\">% Resize images to [360 480].<\/span>\r\n<span style=\"color:rgb(0, 0, 255);\">if<\/span> ~exist(imageFolder,<span style=\"color:rgb(160, 32, 240);\">'dir'<\/span>) \r\n    mkdir(imageFolder)\r\n<span style=\"color:rgb(0, 0, 255);\">else<\/span>\r\n    imds = imageDatastore(imageFolder);\r\n    <span style=\"color:rgb(0, 0, 255);\">return<\/span>; <span style=\"color:rgb(34, 139, 34);\">% Skip if images already resized<\/span>\r\n<span style=\"color:rgb(0, 0, 255);\">end<\/span>\r\nreset(imds)\r\n<span style=\"color:rgb(0, 0, 255);\">while<\/span> hasdata(imds)\r\n    <span style=\"color:rgb(34, 139, 34);\">% Read an image.<\/span>\r\n    [I,info] = read(imds);     \r\n    \r\n    <span style=\"color:rgb(34, 139, 34);\">% Resize image.<\/span>\r\n    I = imresize(I,[360 480]);    \r\n    \r\n    <span style=\"color:rgb(34, 139, 34);\">% Write to disk.<\/span>\r\n    [~, filename, ext] = fileparts(info.Filename);\r\n    imwrite(I,[imageFolder filename ext])\r\n<span style=\"color:rgb(0, 0, 255);\">end<\/span>\r\nimds = imageDatastore(imageFolder);\r\n<span style=\"color:rgb(0, 0, 255);\">end<\/span>\r\n<span style=\"color:rgb(0, 0, 255);\">function<\/span> pxds = resizeCamVidPixelLabels(pxds, labelFolder)\r\n<span style=\"color:rgb(34, 139, 34);\">% Resize pixel label data to [360 480].<\/span>\r\nclasses = pxds.ClassNames;\r\nlabelIDs = 1:numel(classes);\r\n<span style=\"color:rgb(0, 0, 255);\">if<\/span> ~exist(labelFolder,<span style=\"color:rgb(160, 32, 240);\">'dir'<\/span>)\r\n    mkdir(labelFolder)\r\n<span style=\"color:rgb(0, 0, 255);\">else<\/span>\r\n    pxds = pixelLabelDatastore(labelFolder,classes,labelIDs);\r\n    <span style=\"color:rgb(0, 0, 255);\">return<\/span>; <span style=\"color:rgb(34, 139, 34);\">% Skip if images already resized<\/span>\r\n<span style=\"color:rgb(0, 0, 255);\">end<\/span>\r\nreset(pxds)\r\n<span 
style=\"color:rgb(0, 0, 255);\">while<\/span> hasdata(pxds)\r\n    <span style=\"color:rgb(34, 139, 34);\">% Read the pixel data.<\/span>\r\n    [C,info] = read(pxds);\r\n    \r\n    <span style=\"color:rgb(34, 139, 34);\">% Convert from categorical to uint8.<\/span>\r\n    L = uint8(C);\r\n    \r\n    <span style=\"color:rgb(34, 139, 34);\">% Resize the data. Use 'nearest' interpolation to<\/span>\r\n    <span style=\"color:rgb(34, 139, 34);\">% preserve label IDs.<\/span>\r\n    L = imresize(L,[360 480],<span style=\"color:rgb(160, 32, 240);\">'nearest'<\/span>);\r\n    \r\n    <span style=\"color:rgb(34, 139, 34);\">% Write the data to disk.<\/span>\r\n    [~, filename, ext] = fileparts(info.Filename);\r\n    imwrite(L,[labelFolder filename ext])\r\n<span style=\"color:rgb(0, 0, 255);\">end<\/span>\r\nlabelIDs = 1:numel(classes);\r\npxds = pixelLabelDatastore(labelFolder,classes,labelIDs);\r\n<span style=\"color:rgb(0, 0, 255);\">end<\/span>\r\n<span style=\"color:rgb(0, 0, 255);\">function<\/span> [imdsTrain, imdsTest, pxdsTrain, pxdsTest] = partitionCamVidData(imds,pxds)\r\n<span style=\"color:rgb(34, 139, 34);\">% Partition CamVid data by randomly selecting 60% of the data for training. 
The<\/span>\r\n<span style=\"color:rgb(34, 139, 34);\">% rest is used for testing.<\/span>\r\n    \r\n<span style=\"color:rgb(34, 139, 34);\">% Set initial random state for example reproducibility.<\/span>\r\nrng(0); \r\nnumFiles = numel(imds.Files);\r\nshuffledIndices = randperm(numFiles);\r\n<span style=\"color:rgb(34, 139, 34);\">% Use 60% of the images for training.<\/span>\r\nN = round(0.60 * numFiles);\r\ntrainingIdx = shuffledIndices(1:N);\r\n<span style=\"color:rgb(34, 139, 34);\">% Use the rest for testing.<\/span>\r\ntestIdx = shuffledIndices(N+1:end);\r\n<span style=\"color:rgb(34, 139, 34);\">% Create image datastores for training and test.<\/span>\r\ntrainingImages = imds.Files(trainingIdx);\r\ntestImages = imds.Files(testIdx);\r\nimdsTrain = imageDatastore(trainingImages);\r\nimdsTest = imageDatastore(testImages);\r\n<span style=\"color:rgb(34, 139, 34);\">% Extract class and label IDs info.<\/span>\r\nclasses = pxds.ClassNames;\r\nlabelIDs = 1:numel(pxds.ClassNames);\r\n<span style=\"color:rgb(34, 139, 34);\">% Create pixel label datastores for training and test.<\/span>\r\ntrainingLabels = pxds.Files(trainingIdx);\r\ntestLabels = pxds.Files(testIdx);\r\npxdsTrain = pixelLabelDatastore(trainingLabels, classes, labelIDs);\r\npxdsTest = pixelLabelDatastore(testLabels, classes, labelIDs);\r\n<span style=\"color:rgb(0, 0, 255);\">end<\/span>\r\n<\/pre>","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img decoding=\"async\"  class=\"img-responsive\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2018\/06\/SemanticSegmentationUsingDeepLearningExample_1.png\" onError=\"this.style.display ='none';\" \/><\/div><p>Today I want to show you a documentation example that shows how to train a semantic segmentation network using deep learning and the Computer Vision System Toolbox. \r\n      A semantic segmentation... 
<a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/deep-learning\/2018\/06\/08\/semantic-segmentation-using-deep-learning\/\">read more >><\/a><\/p>","protected":false},"author":42,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[9],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/400"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/users\/42"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/comments?post=400"}],"version-history":[{"count":12,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/400\/revisions"}],"predecessor-version":[{"id":692,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/400\/revisions\/692"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/media?parent=400"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/categories?post=400"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/tags?post=400"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}