{"id":13187,"date":"2023-10-03T11:37:33","date_gmt":"2023-10-03T15:37:33","guid":{"rendered":"https:\/\/blogs.mathworks.com\/deep-learning\/?p=13187"},"modified":"2025-02-27T22:28:48","modified_gmt":"2025-02-28T03:28:48","slug":"verification-and-validation-for-ai-from-requirements-to-robust-modeling","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/deep-learning\/2023\/10\/03\/verification-and-validation-for-ai-from-requirements-to-robust-modeling\/","title":{"rendered":"Verification and Validation for AI: From requirements to robust modeling"},"content":{"rendered":"<em>The following post is from <\/em><a href=\"https:\/\/www.linkedin.com\/in\/lucas-garcia-phd\/\"><em>Lucas Garc\u00eda<\/em><\/a><em>, Product Manager for Deep Learning Toolbox.\u00a0<\/em>\r\n<h6><\/h6>\r\nThis is the second post in a 4-post series on Verification and Validation for AI. In the <a href=\"https:\/\/blogs.mathworks.com\/deep-learning\/2023\/07\/11\/the-road-to-ai-certification-the-importance-of-verification-and-validation-in-ai\/\">first post<\/a>, we gave an overview of the importance of Verification and Validation in AI within the context of AI Certification. We also introduced the W-shaped development process, an adaptation of the classical V-cycle to AI applications.\r\n<h6><\/h6>\r\n<img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-12572 \" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2023\/07\/Picture1.png\" alt=\"W-shaped development process for AI\" width=\"636\" height=\"349\" \/>\r\n<h6><\/h6>\r\n<strong>Figure 1:<\/strong> W-shaped development process. Credit: EASA, Daedalean.\r\n<h6><\/h6>\r\nThis blog post focuses on the steps you need to complete and iterate on starting with collecting requirements and up to creating a robust AI model. 
To illustrate the verification and validation steps from requirements to modeling, we will use the design of a pneumonia detector as an example.\r\n<h6><\/h6>\r\n&nbsp;\r\n<h6><\/h6>\r\n<p style=\"font-size: 18px; color: #004b87;\"><strong>Verification and Validation for Pneumonia Detection<\/strong><\/p>\r\nOur goal is to verify a deep learning model that identifies whether a patient is suffering from pneumonia by examining chest X-ray images. The image classification model needs to be not only accurate but also highly robust to avoid the potentially severe consequences of a misdiagnosis. We\u2019ll identify the problem and take it through all the steps in the W-shaped development process (W-cycle for short).\r\n<h6><\/h6>\r\nThe dataset we will be using is the MedMNISTv2 dataset. If you are familiar with MNIST for digit classification, <a href=\"https:\/\/medmnist.com\/\">MedMNIST<\/a> is a similar collection of labeled, lightweight 28\u00d728 2D and 3D biomedical images. We decided to use this dataset because its simplicity lets us iterate rapidly over the design. 
More specifically, we\u2019ll use the PneumoniaMNIST dataset, which is part of the MedMNISTv2 collection.\r\n<h6><\/h6>\r\n<h6><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-13190 \" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2023\/10\/MedMNIST.png\" alt=\"MedMNIST v2: A large-scale lightweight benchmark for 2D and 3D biomedical image classification\" width=\"666\" height=\"184\" \/><\/h6>\r\n<h6><\/h6>\r\n<strong>Figure 2:<\/strong> MedMNISTv2 dataset \u2013 The dataset is licensed under\u202fCreative Commons Attribution 4.0 International\u202f(<a href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY 4.0<\/a>).\r\n<h6><\/h6>\r\nIn this post, we\u2019ll address the steps on the left-hand side of the W-cycle to create a Pneumonia Detector using the MedMNIST dataset, starting with <strong>Requirements allocated to ML component management<\/strong> all the way down to <strong>Model training<\/strong>. However, note that this is not a linear process, particularly when we evaluate the results of the training phase, so we\u2019ll have to iterate to refine our approach.\r\n<h6><\/h6>\r\n&nbsp;\r\n<h6><\/h6>\r\n<p style=\"font-size: 18px; color: #004b87;\"><strong>Requirements Allocated to ML Component Management<\/strong><\/p>\r\nWe\u2019ll start with the first step in the W-cycle related to AI and Machine Learning: collecting the requirements specific to the Machine Learning component. 
Note that for any non-Machine Learning component items, you can follow the V-cycle frequently used for development assurance of traditional software.\r\n<h6><\/h6>\r\nAt this stage, key questions to consider are:\r\n<ul>\r\n \t<li>Are all the requirements implemented?<\/li>\r\n \t<li>How are the requirements going to be tested?<\/li>\r\n \t<li>Can the model behavior be explained?<\/li>\r\n<\/ul>\r\n<a href=\"https:\/\/www.mathworks.com\/products\/requirements-toolbox.html\">Requirements Toolbox<\/a> lets you author, link, and validate requirements within MATLAB\u202for Simulink. You can create requirements using rich text with custom attributes or import them using requirements management tools.\r\n<h6><\/h6>\r\nAs you can see in the screenshot of the <a href=\"https:\/\/www.mathworks.com\/help\/slrequirements\/ref\/requirementseditor-app.html\">Requirements Editor<\/a> app below, we have already collected a few requirements related to input and output data, accuracy, robustness, latency, and implementation. For each requirement, you can also add a description that better explains what that specific requirement intends to accomplish.\r\n<h6><\/h6>\r\n<h6><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-13199 size-full\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2023\/10\/requirements_editor.png\" alt=\"Screenshot of the Requirements Editor app\" width=\"1095\" height=\"990\" \/><\/h6>\r\n<h6><\/h6>\r\n<strong>Figure 3:<\/strong>\u00a0Requirements Editor: Capturing requirements for the Machine Learning component.\r\n<h6><\/h6>\r\nWe\u2019ve set a slightly audacious goal with the test precision requirement, aiming to surpass 90% accuracy (the <a href=\"https:\/\/arxiv.org\/abs\/2110.14795\">original paper<\/a> achieved 88% accuracy for similar models). 
At the same time, we have introduced robustness requirements and other Machine Learning-related requirements we must simultaneously satisfy.\r\n<h6><\/h6>\r\n&nbsp;\r\n<h6><\/h6>\r\n<p style=\"font-size: 18px; color: #004b87;\"><strong>Data Management<\/strong><\/p>\r\nThe next step in the W-cycle is <strong>Data management<\/strong>. Since we are solving a supervised learning problem, we need labeled data for training the model. MATLAB offers various labeling apps (including <a href=\"https:\/\/www.mathworks.com\/help\/vision\/ug\/get-started-with-the-image-labeler.html\">Image Labeler<\/a> and <a href=\"https:\/\/www.mathworks.com\/help\/signal\/ug\/use-signal-labeler.html\">Signal Labeler<\/a>) that are extremely useful at this point, allowing you to label your dataset interactively (and with automation).\r\n<h6><\/h6>\r\nThankfully, the data has already been labeled into \u201cpneumonia\u201d and \u201cnormal\u201d images. Otherwise, I would have had to seek expert advice to label the X-ray images or find the right algorithm to automate the process. The dataset has also been partitioned into training, validation, and testing sets, so we don\u2019t need to worry about that either. All we need to do at this point is conveniently manage our images.\r\n<h6><\/h6>\r\nThe <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/matlab.io.datastore.imagedatastore.html\">imageDatastore<\/a> object allows you to manage a collection of image files where each individual image fits in memory, but the entire collection does not necessarily fit. Indeed, the MedMNIST images are small and will all fit in memory, but using a datastore shows how you can create a scalable process for more realistic workflows. 
By indicating the folder structure and that the label source can be inferred from the folder names, we can create a MATLAB object that acts as an image data repository.\r\n<h6><\/h6>\r\n<pre>trainingDataFolder = \"pneumoniamnist\\Train\";\r\nimdsTrain = imageDatastore(trainingDataFolder,IncludeSubfolders=true,LabelSource=\"foldernames\");\r\ncountEachLabel(imdsTrain) \r\n<\/pre>\r\n<h6><\/h6>\r\n<pre class=\"brush: python\" style=\"background-color: white; border: white;\">ans = \r\n \r\n  2\u00d72 table \r\n \r\n      Label      Count \r\n    _________    _____ \r\n \r\n    normal       1214  \r\n    pneumonia    3494 \r\n\r\n<\/pre>\r\n<h6><\/h6>\r\nNote that the dataset is imbalanced towards more pneumonia samples. So, this should be considered in the loss function as we train the model.\r\n<h6><\/h6>\r\n&nbsp;\r\n<h6><\/h6>\r\n<p style=\"font-size: 18px; color: #004b87;\"><strong>Learning Process Management<\/strong><\/p>\r\nAt this stage, we\u2019d like to account for all the preparatory work before the training phase. We\u2019ll focus on developing the network architecture and choosing the training options (training algorithm, loss function, hyperparameters, etc.).\r\n<h6><\/h6>\r\nYou can easily design and visualize the network interactively using the <a href=\"https:\/\/www.mathworks.com\/help\/deeplearning\/ref\/deepnetworkdesigner-app.html\">Deep Network Designer<\/a> app. 
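\r\n<h6><\/h6>\r\nBefore moving on to the architecture, a brief aside on the class imbalance noted above: one common mitigation is an inverse-frequency weighted loss. The sketch below is illustrative \u2013 the normalization choice and the custom-loss syntax for trainnet (R2023b+) are our assumptions, not necessarily what the original experiment used:\r\n<h6><\/h6>\r\n<pre>tbl = countEachLabel(imdsTrain);\r\n% Inverse-frequency weights: rarer classes get larger weights\r\nclassWeights = sum(tbl.Count)./(numel(tbl.Count)*tbl.Count);\r\n% Weighted cross-entropy loss that could be passed to trainnet\r\nlossFcn = @(Y,T) crossentropy(Y,T,classWeights,WeightsFormat=\"C\");\r\n<\/pre>\r\n<h6><\/h6>\r\n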
Once you have designed the network (in this case, a simple CNN for image classification), MATLAB code can be generated for training.\r\n<h6><\/h6>\r\n<pre>numClasses = numel(classNames);\r\nlayers = [ \r\n       imageInputLayer(imageSize,Normalization=\"none\") \r\n       convolution2dLayer(7,64,Padding=0) \r\n       batchNormalizationLayer() \r\n       reluLayer() \r\n       dropoutLayer(0.5) \r\n       averagePooling2dLayer(2,Stride=2) \r\n       convolution2dLayer(7,128,Padding=0) \r\n       batchNormalizationLayer() \r\n       reluLayer() \r\n       dropoutLayer(0.5) \r\n       averagePooling2dLayer(2,Stride=2) \r\n       fullyConnectedLayer(numClasses) \r\n       softmaxLayer];\r\n<\/pre>\r\n<h6><\/h6>\r\nHowever, coming up with the optimal hyperparameters might not be so straightforward. The\u00a0<a href=\"https:\/\/www.mathworks.com\/help\/deeplearning\/ref\/experimentmanager-app.html\">Experiment Manager<\/a> app helps you find the optimal training options for neural networks by sweeping through a range of hyperparameter values or using Bayesian optimization. You can run different training configurations, even in parallel, if you have access to the necessary hardware.\r\n<h6><\/h6>\r\n<img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-13217 size-full\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2023\/10\/VV_EM.png\" alt=\"Hyperparameter tuning with the Experiment Manager app\" width=\"1430\" height=\"709\" \/>\r\n<h6><\/h6>\r\n<strong>Figure 4:<\/strong> Setting up the problem in Experiment Manager to find an optimal set of hyperparameters from the exported architecture in Deep Network Designer.\r\n<h6><\/h6>\r\n&nbsp;\r\n<h6><\/h6>\r\n<p style=\"font-size: 18px; color: #004b87;\"><strong>Model Training<\/strong><\/p>\r\nIt is now time to train the model - or more accurately, models. We first run the experiment we have configured in the Experiment Manager app. 
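\r\n<h6><\/h6>\r\nIf you prefer scripting to the app, a comparable training run might look like the following sketch. The hyperparameter values are placeholders rather than the tuned ones, imdsValidation is a hypothetical validation datastore built like imdsTrain, and we assume the trainnet workflow introduced in R2023b:\r\n<h6><\/h6>\r\n<pre>options = trainingOptions(\"adam\", ...\r\n    InitialLearnRate=1e-3, ...\r\n    MaxEpochs=20, ...\r\n    Shuffle=\"every-epoch\", ...\r\n    ValidationData=imdsValidation, ...\r\n    Plots=\"training-progress\");\r\n% A weighted loss could be substituted for the built-in loss here\r\nnet = trainnet(imdsTrain,layers,\"crossentropy\",options);\r\n<\/pre>\r\n<h6><\/h6>\r\n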
This gives us an excellent model to start with.\r\n<h6><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-13220 size-full\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2023\/10\/VV_EM_training.png\" alt=\"Training AI models with the Experiment Manager app\" width=\"1430\" height=\"805\" \/><\/h6>\r\n<strong>Figure 5:<\/strong>\u00a0Finding an initial model with the Experiment Manager app.\r\n<h6><\/h6>\r\nAlthough we seem to have obtained good results with our model (~96% accuracy on the validation dataset), this model will fail to comply with some of the other requirements we established earlier (e.g., robustness).\r\n<h6><\/h6>\r\nWe mentioned before that even though the W-cycle seems linear, we often must iterate on our design. To do so, we explored additional training techniques. First, we used <a href=\"https:\/\/www.mathworks.com\/help\/deeplearning\/ug\/preprocess-images-for-deep-learning.html\">data-augmented training<\/a>; that is, we applied meaningful transformations to the images (rotation, translation, scaling, etc.). This results in better generalization, less overfitting, and improved model robustness.\r\n<h6><\/h6>\r\n<img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-13223 size-full\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2023\/10\/W_shaped_iterative.png\" alt=\"W-shaped development process for AI is iterative\" width=\"624\" height=\"348\" \/>\r\n<h6><\/h6>\r\n<strong>Figure 6:<\/strong>\u00a0An iterative approach towards building an accurate and robust model.\r\n<h6><\/h6>\r\nHowever, as we\u2019ll see in a future blog post, this data-augmented training will not be enough for our purposes. So, our last iteration will involve using a training algorithm called the Fast Gradient Sign Method (FGSM) for Adversarial Training (<a href=\"https:\/\/www.mathworks.com\/help\/deeplearning\/ug\/train-network-robust-to-adversarial-examples.html\">learn more<\/a>). 
The goal is to generate adversarial examples during training, which are visually similar to the original input data but can cause the model to make incorrect predictions.\r\n<h6><\/h6>\r\nStay tuned for our next blog post. We\u2019ll address the next stage in the W-cycle, the exciting topic of <strong>Learning process verification<\/strong>.","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2023\/10\/W_shaped_iterative.png\" class=\"img-responsive attachment-post-thumbnail size-post-thumbnail wp-post-image\" alt=\"\" decoding=\"async\" loading=\"lazy\" \/><\/div><p>The following post is from Lucas Garc\u00eda, Product Manager for Deep Learning Toolbox.\u00a0\r\n\r\nThis is the second post in a 4-post series on Verification and Validation for AI. In the first post, we gave an... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/deep-learning\/2023\/10\/03\/verification-and-validation-for-ai-from-requirements-to-robust-modeling\/\">read more 
>><\/a><\/p>","protected":false},"author":194,"featured_media":13223,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[63,54,9,12],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/13187"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/users\/194"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/comments?post=13187"}],"version-history":[{"count":28,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/13187\/revisions"}],"predecessor-version":[{"id":16997,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/13187\/revisions\/16997"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/media\/13223"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/media?parent=13187"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/categories?post=13187"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/tags?post=13187"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}