{"id":2720,"date":"2019-08-22T06:15:06","date_gmt":"2019-08-22T06:15:06","guid":{"rendered":"https:\/\/blogs.mathworks.com\/deep-learning\/?p=2720"},"modified":"2021-04-06T15:49:43","modified_gmt":"2021-04-06T19:49:43","slug":"data-augmentation-for-image-classification-applications-using-deep-learning","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/deep-learning\/2019\/08\/22\/data-augmentation-for-image-classification-applications-using-deep-learning\/","title":{"rendered":"Data Augmentation for Image Classification Applications Using Deep Learning"},"content":{"rendered":"<span style=\"font-family: georgia;\">This post is from <a href=\"http:\/\/www.ogemarques.com\/\">Oge Marques, PhD<\/a> and Professor of Engineering and Computer Science at FAU. Oge is an ACM Distinguished Speaker, <a href=\"http:\/\/www.ogemarques.com\/books\/\">book author<\/a>, and <a href=\"https:\/\/www.aaas.org\/page\/2019-2020-leshner-leadership-institute-public-engagement-fellows-human-augmentation\">2019-20 AAAS Leshner Fellow<\/a>. He also happens to be a MATLAB aficionado and has been using MATLAB in his classroom for more than 20 years. You can also follow him on Twitter (<a href=\"https:\/\/twitter.com\/ProfessorOge\">@ProfessorOge<\/a>) <\/span>\r\n\r\n<!--more-->\r\n<h6><\/h6>\r\nThe popularization of deep learning for image classification and many other computer vision tasks can be attributed, in part, to the availability of very large volumes of training data. The general consensus in the machine learning and deep learning community is that, all other things being equal, the more training data you have the better your model (and, consequently, its performance on the test set) will be.\r\n<h6><\/h6>\r\nThere are many cases, however, in which you might not have a large enough data set for training a model that can learn the desired classes in a way that will generalize well to new examples, i.e. will not overfit to the training set. 
This can often be due to the difficulty and <em>cost<\/em> associated with the acquisition and labeling of images to build large training sets, whether cost is expressed in terms of dollars, human effort, computational resources, or the time consumed in the process.\r\n<h6><\/h6>\r\nIn this blog post we will focus on a technique called <em>data augmentation<\/em>, which is used to augment the existing dataset in a way that is more cost-effective than further data collection. In the case of image classification applications, data augmentation is usually accomplished using simple geometric transformation techniques applied to the original images, such as cropping, rotating, resizing, translating, and flipping, which we'll discuss in more detail below.\r\n<h6><\/h6>\r\nThe effectiveness and benefits of data augmentation have been extensively documented in the literature: it has been shown that data augmentation can act as a regularizer in preventing overfitting in neural networks [1, 2] and improve performance in imbalanced class problems [3]. Moreover, the winning approaches to highly visible challenges and competitions in image classification (such as ImageNet) throughout the years have used data augmentation. This has also motivated recent work on the development of methods to allow a neural network to learn augmentations that best improve the classifier [4].\r\n<h6><\/h6>\r\n<h2><strong>Data Augmentation Implementation in MATLAB<\/strong><\/h2>\r\n<h6><\/h6>\r\nImage data augmentation can be achieved in two ways [5]:\r\n<h6><\/h6>\r\n<ol>\r\n \t<li><em>offline augmentation<\/em>: which consists of performing the transformations to the images (potentially using MATLAB's batch image processing capabilities [6]) and saving the results on disk, thereby increasing the size of the dataset by a factor equal to the number of transformations performed. 
This can be acceptable for smaller datasets.<\/li>\r\n \t<li><em>online augmentation<\/em> or <em>augmentation on the fly<\/em>: which consists of performing transformations on the mini-batches that would be fed to the model during training. This method is preferred for larger datasets, to avoid a potentially explosive increase in storage requirements.<\/li>\r\n<\/ol>\r\n&nbsp;\r\n\r\nMATLAB provides an elegant and easy-to-use solution for online image data augmentation, which consists of two main components:\r\n<ul>\r\n \t<li><strong>augmentedImageDatastore<\/strong>: which generates batches of new images, after preprocessing the original training images using operations such as rotation, translation, shearing, resizing, or reflection (flipping).<\/li>\r\n \t<li><strong>imageDataAugmenter<\/strong>: which is used to configure the selected preprocessing operations for image data augmentation.<\/li>\r\n<\/ul>\r\n<h6><\/h6>\r\nThe following image data augmentation options are available in MATLAB using the <strong>imageDataAugmenter<\/strong> object:\r\n<h6><\/h6>\r\n<ul>\r\n \t<li>Rotation<\/li>\r\n \t<li>X reflection (left-right flip) or Y reflection (upside-down flip)<\/li>\r\n \t<li>Horizontal and vertical scaling<\/li>\r\n \t<li>Horizontal and vertical shearing<\/li>\r\n \t<li>Horizontal and vertical translation<\/li>\r\n<\/ul>\r\n<h6><\/h6>\r\nA few points are worth mentioning here:\r\n<h6><\/h6>\r\n<ul>\r\n \t<li>When you initialize the <strong>imageDataAugmenter<\/strong> variable, you can choose one or more options, e.g., only X and Y reflection and horizontal and vertical scaling, as in the code snippet below.<\/li>\r\n<\/ul>\r\n<pre>imageAugmenter = imageDataAugmenter( ...\r\n\u00a0\u00a0\u00a0 'RandXReflection', true, ...\r\n\u00a0\u00a0\u00a0 'RandXScale',[1,2], ...\r\n\u00a0\u00a0\u00a0 'RandYReflection', true, ...\r\n\u00a0\u00a0\u00a0 'RandYScale',[1,2]);<\/pre>\r\n<ul>\r\n \t<li>The values that you pass as parameters to some of the options (e.g., 
[1 2] for the X and Y scaling above) are meant to represent a <em>range of values<\/em> from which a <em>random<\/em> sample will be picked during the preprocessing step, if that transformation is applied to an image.<\/li>\r\n \t<li>There is also an option in <strong>imageDataAugmenter<\/strong> for providing a function that determines the range of values for a particular parameter, for example a random rotation between -5 and 5 degrees (see code snippet below).<\/li>\r\n<\/ul>\r\n<pre>imageAugmenter = imageDataAugmenter('RandRotation',@() -5 + 10 * rand);<\/pre>\r\n<ul>\r\n \t<li>There are two ways to access the actual preprocessed images (for inspection and display, for example):<\/li>\r\n<\/ul>\r\n<ol start=\"1\">\r\n \t<li>Starting in R2018a, there are read\/preview methods on <strong>augmentedImageDatastore<\/strong> that allow you to obtain an example batch of images (see code snippet below, which produces a tiled image such as the one in Fig. 1, using the Flower Classification example [8]).<\/li>\r\n<\/ol>\r\n<h6><\/h6>\r\n<pre>imageAugmenter = imageDataAugmenter('RandRotation',@() -20+40*rand);\r\n\r\naugImds = augmentedImageDatastore(imageSize,imds, ...\r\n\u00a0\u00a0\u00a0 'DataAugmentation',imageAugmenter);\r\n\r\n<span class=\"comment\">% Preview augmentation results <\/span>\r\nbatchedData = preview(augImds);\r\nimshow(imtile(batchedData.input))<\/pre>\r\n<h6><\/h6>\r\n&nbsp;\r\n<ol start=\"2\">\r\n \t<li>Starting in R2018b, a new method (augment) was added to the <strong>imageDataAugmenter<\/strong>, which serves two purposes: it functions as a standalone function as well as a configuration object for <strong>augmentedImageDatastore<\/strong> (see code snippet below, which might** produce a left-right flipped image such as the one in Fig. 
2).<\/li>\r\n<\/ol>\r\n<pre>In = imread(which('peppers.png'));\r\nAug = imageDataAugmenter('RandXReflection',true);\r\nOut = augment(Aug,In);\r\nfigure, montage({In, Out})<\/pre>\r\n<a href=\"#_ftnref1\" name=\"_ftn1\"><\/a>\r\n<h6>**\u00a0Since the flip operation is randomly applied to the input image, you might have to run the snippet several times until an actual flip occurs.<\/h6>\r\n<img decoding=\"async\" loading=\"lazy\" class=\"wp-image-2804 size-full aligncenter\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2019\/08\/Fig1_Flowers-2.png\" alt=\"\" width=\"739\" height=\"727\" \/>\r\n<p style=\"text-align: center;\">Fig 1. Preview of augmented images processed with random rotation between -20 and 20 degrees.<\/p>\r\n\r\n<h6><\/h6>\r\n<img decoding=\"async\" loading=\"lazy\" class=\"wp-image-2798 size-full aligncenter\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2019\/08\/Fig2_Peppers-1.png\" alt=\"\" width=\"1271\" height=\"497\" \/>\r\n<p style=\"text-align: center;\">Fig 2. Example of random reflection ('RandXReflection') around the vertical axis.<\/p>\r\n\r\n<h6><\/h6>\r\n\r\n\r\nThe <strong>augmentedImageDatastore<\/strong> and the <strong>imageDataAugmenter <\/strong>integrate nicely with the neural network training workflow, which consists of [7]:\r\n<h6><\/h6>\r\n<ol>\r\n \t<li>Choose your training images, which you can store as an <strong>ImageDatastore<\/strong>, an object used to manage a collection of image files, where each individual image fits in memory, but the entire collection of images does not necessarily fit. 
This functionality, available since R2015b, was designed to read batches of images for faster processing in machine learning and computer vision applications.<\/li>\r\n \t<li>Select and configure the desired image preprocessing options (for example, range of rotation angles, in degrees, or range of horizontal translation distances, in pixels, from which specific values will be picked randomly) and create an <strong>imageDataAugmenter<\/strong> object initialized with the proper syntax.<\/li>\r\n \t<li>Create an\u00a0<strong>augmentedImageDatastore<\/strong>, specifying the training images, the size of output images, and the\u00a0<strong>imageDataAugmenter<\/strong> to be used. The size of output images must be compatible with the size expected by the input layer of the network.<\/li>\r\n \t<li>Train the network, specifying the <strong>augmentedImageDatastore<\/strong> as the data source for the\u00a0<strong>trainNetwork<\/strong> function. For each iteration of training, the augmented image datastore generates one mini-batch of training data by applying random transformations to the original images in the underlying data from which <strong>augmentedImageDatastore<\/strong> was constructed (see Fig. 3).<\/li>\r\n<\/ol>\r\n<h6><\/h6>\r\n<img decoding=\"async\" loading=\"lazy\" width=\"1144\" height=\"438\" class=\"alignnone size-full wp-image-2728\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2019\/08\/Fig3_workflowimage.png\" alt=\"\" \/>\r\n<h6><\/h6>\r\nFig 3. Typical workflow for\u00a0training a network using an augmented image datastore (from [7]).\r\n<h6><\/h6>\r\nFor a complete example of an image classification problem using a small dataset of flower images, with and without image data augmentation, check my MATLAB File Exchange contribution [8].\r\n<h6><\/h6>\r\n<strong>To summarize<\/strong>, data augmentation can be a useful technique when dealing with less than ideal amounts of training data. 
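\r\n<h6><\/h6>\r\nAs a recap, the four training-workflow steps above can be sketched end to end in a few lines. This is a minimal sketch rather than the code from the File Exchange example: the folder name <strong>flower_photos<\/strong> is a placeholder, and <strong>layers<\/strong> and <strong>opts<\/strong> stand in for a user-defined network architecture and training options.\r\n<pre>% 1. Manage the training images with an imageDatastore (placeholder folder name)\r\nimds = imageDatastore('flower_photos', ...\r\n\u00a0\u00a0\u00a0 'IncludeSubfolders',true,'LabelSource','foldernames');\r\n\r\n% 2. Configure the random preprocessing operations\r\nimageAugmenter = imageDataAugmenter( ...\r\n\u00a0\u00a0\u00a0 'RandRotation',[-20 20], ...\r\n\u00a0\u00a0\u00a0 'RandXReflection',true);\r\n\r\n% 3. Wrap the datastore; the output size must match the network input layer\r\nimageSize = [227 227 3];\r\naugImds = augmentedImageDatastore(imageSize,imds, ...\r\n\u00a0\u00a0\u00a0 'DataAugmentation',imageAugmenter);\r\n\r\n% 4. Train, with the augmented datastore as the data source\r\nnet = trainNetwork(augImds,layers,opts);<\/pre>\r\n<h6><\/h6>\r\n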
The references below provide links to materials with more details.\r\n<h6><\/h6>\r\n<span style=\"font-family: courier;\">Thanks again to Oge for going in-depth into data augmentation. Do you have any questions for Oge? Leave a comment below!<\/span>\r\n<h6><\/h6>\r\n&nbsp;\r\n<h6><\/h6>\r\n&nbsp;\r\n<h1>References<\/h1>\r\n<ul>\r\n \t<li style=\"list-style-type: none !important;\">[1] P. Y. Simard, D. Steinkraus, and J. C. Platt, \"Best practices for convolutional neural networks applied to visual document analysis,\" in Seventh International Conference on Document Analysis and Recognition (ICDAR 2003). IEEE Computer Society, 2003, pp. 958-963.<\/li>\r\n \t<li style=\"list-style-type: none !important;\">[2] D. C. Ciresan, U. Meier, L. M. Gambardella, and J. Schmidhuber, \"Deep, big, simple neural nets for handwritten digit recognition,\" Neural Computation, vol. 22, no. 12, pp. 3207-3220, 2010.<\/li>\r\n \t<li style=\"list-style-type: none !important;\">[3] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, \"SMOTE: synthetic minority over-sampling technique,\" Journal of Artificial Intelligence Research, vol. 16, pp. 321-357, 2002.<\/li>\r\n \t<li style=\"list-style-type: none !important;\">[4] J. Wang and L. Perez, \"The Effectiveness of Data Augmentation in Image Classification using Deep Learning\", 2017. <a href=\"https:\/\/arxiv.org\/pdf\/1712.04621.pdf\">https:\/\/arxiv.org\/pdf\/1712.04621.pdf<\/a><\/li>\r\n \t<li style=\"list-style-type: none !important;\">[5] B. Raj, \"Data Augmentation | How to use Deep Learning when you have Limited Data - Part\u00a02\". <a href=\"https:\/\/medium.com\/nanonets\/how-to-use-deep-learning-when-you-have-limited-data-part-2-data-augmentation-c26971dc8ced\">https:\/\/medium.com\/nanonets\/how-to-use-deep-learning-when-you-have-limited-data-part-2-data-augmentation-c26971dc8ced<\/a><\/li>\r\n \t<li style=\"list-style-type: none !important;\">[6] MathWorks. 
\"Batch Processing Using the Image Batch Processor App\". <a href=\"https:\/\/www.mathworks.com\/help\/images\/batch-processing-using-the-image-batch-processor-app.html\">https:\/\/www.mathworks.com\/help\/images\/batch-processing-using-the-image-batch-processor-app.html<\/a><\/li>\r\n \t<li style=\"list-style-type: none !important;\">[7] MathWorks. \"Preprocess Images for Deep Learning\". <a href=\"https:\/\/www.mathworks.com\/help\/nnet\/ug\/preprocess-images-for-deep-learning.html\">https:\/\/www.mathworks.com\/help\/nnet\/ug\/preprocess-images-for-deep-learning.html<\/a><\/li>\r\n \t<li style=\"list-style-type: none !important;\">[8] O. Marques, \"Image classification using data augmentation, version 1.1.0\", MATLAB Central File Exchange, 2019. <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/68728-image-classification-using-data-augmentation\">https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/68728-image-classification-using-data-augmentation<\/a><\/li>\r\n<\/ul>\r\n&nbsp;\r\n\r\n&nbsp;","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img decoding=\"async\"  class=\"img-responsive\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2019\/08\/Fig1_Flowers-2.png\" onError=\"this.style.display ='none';\" \/><\/div><p>This post is from Oge Marques, PhD and Professor of Engineering and Computer Science at FAU. Oge is an ACM Distinguished Speaker, book author, and 2019-20 AAAS Leshner Fellow. He also happens to be a... 
<a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/deep-learning\/2019\/08\/22\/data-augmentation-for-image-classification-applications-using-deep-learning\/\">read more >><\/a><\/p>","protected":false},"author":156,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[9],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/2720"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/users\/156"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/comments?post=2720"}],"version-history":[{"count":43,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/2720\/revisions"}],"predecessor-version":[{"id":6111,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/2720\/revisions\/6111"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/media?parent=2720"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/categories?post=2720"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/tags?post=2720"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}