{"id":4015,"date":"2020-05-20T21:38:00","date_gmt":"2020-05-20T19:38:00","guid":{"rendered":"https:\/\/blogs.mathworks.com\/student-lounge\/?p=4015"},"modified":"2020-11-06T21:39:23","modified_gmt":"2020-11-06T20:39:23","slug":"advance-alzheimers-research-with-stall-catchers-matlab-benchmark-code","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/student-lounge\/2020\/05\/20\/advance-alzheimers-research-with-stall-catchers-matlab-benchmark-code\/","title":{"rendered":"Advance Alzheimer\u2019s Research with Stall Catchers &#8211; MATLAB Benchmark Code"},"content":{"rendered":"<p>Today\u2019s blog is written by Neha Goel, Deep Learning Technical Evangelist on the Student Competition team at MathWorks.<\/p>\n<p>Hello all! We at MathWorks, in collaboration with DrivenData, are excited to bring you this challenge. Through this challenge, you could help bring an Alzheimer\u2019s treatment target within reach in the next year or two. You would also gain real-world experience working with a dataset of live videos of mouse brains. We also encourage you to train your model in MATLAB, and we provide complimentary MATLAB licenses for the challenge.<\/p>\n<p>The objective of this challenge is to classify the outlined blood vessel segment as<\/p>\n<ul>\n<li>&#8216;flowing&#8217; &#8211; if blood is moving through the vessel, or<\/li>\n<li>&#8216;stalled&#8217; &#8211; if the vessel has no blood flow.<\/li>\n<\/ul>\n<p>The main asset for solving this challenge is the videos themselves! Each video is identified by its <em>filename<\/em>, which is a numeric string followed by .mp4, e.g., <em>100000.mp4<\/em>. <strong>All videos are hosted in a public s3 bucket.<\/strong><\/p>\n<p>The full training dataset contains over 580,000 videos, which is around 1.5 terabytes! 
To facilitate faster model prototyping, there are two subset versions of the dataset, referred to as <em>nano<\/em> and <em>micro<\/em>.<\/p>\n<p>In addition to the videos, you are provided with &#8220;<em>train_metadata.csv<\/em>&#8221; and &#8220;<em>test_metadata.csv<\/em>&#8221; files. These files contain information such as the filename, the URL of each file, the number of frames in each video, and indicators for the <em>nano<\/em> and <em>micro<\/em> subsets. &#8220;<em>train_labels.csv<\/em>&#8221; contains the labels for the training data.<\/p>\n<p>For further details about the dataset, check out the <a href=\"https:\/\/www.drivendata.org\/competitions\/65\/clog-loss-alzheimers-research\/page\/217\/\">Problem Description<\/a> on the competition webpage.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-4111 size-full\" src=\"https:\/\/blogs.mathworks.com\/racing-lounge\/files\/2020\/05\/StallCatchers_gif2.gif\" alt=\"gif\" width=\"500\" height=\"500\" \/><\/p>\n<h1>Getting Started with MATLAB<\/h1>\n<p>We are providing basic benchmark starter code in MATLAB for the <em>nano<\/em> subset version of the dataset. In this code, we walk through a basic classification model that combines a pre-trained image classification model and an LSTM network. We then use this model to predict the type of each vessel in the test data and save a CSV file in the format required for the challenge. You can also <a href=\"https:\/\/github.com\/drivendataorg\/clog-loss-stall-catchers-benchmark\">download this MATLAB benchmark code here.<\/a><\/p>\n<p>This can serve as a baseline from which you can start analyzing the data and work towards a more efficient, optimized, and accurate model that uses more of the available training data. Additionally, we have provided a few tips and tricks for working with the complete 1.5TB dataset. 
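Before diving into modeling, it can help to check how balanced the labels are. Here is a minimal sketch (not part of the official benchmark code; it assumes <em>train_labels.csv<\/em>, described above, is in your current folder and that its <em>stalled<\/em> column holds 0 and 1 values):

```matlab
% Sketch: inspect the class balance of the training labels.
% Assumes train_labels.csv (described above) is in the current folder
% and that its "stalled" column holds 0/1 values.
trainlabels = readtable("train_labels.csv");
disp(groupcounts(trainlabels, "stalled"))          % number of rows per class
fractionStalled = mean(trainlabels.stalled == 1)   % fraction of 'stalled' examples
```

If the classes turn out to be imbalanced, that is worth keeping in mind when choosing training options and evaluation metrics later on.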
On the challenge&#8217;s <a href=\"https:\/\/www.drivendata.org\/competitions\/65\/clog-loss-alzheimers-research\/page\/217\/\">Problem Description<\/a> page, all required details about the videos, labels, performance metric, and submission format are provided.<\/p>\n<p>So, let&#8217;s get started with this dataset!<\/p>\n<h1>Load Training Data<\/h1>\n<p>To access the variable values from the file <em>train_metadata.csv<\/em>, load the file in the form of a <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2020a\/matlab\/ref\/matlab.io.datastore.tabulartextdatastore.html\">tabularTextDatastore<\/a> in the workspace.<\/p>\n<pre>ttds = tabularTextDatastore(\"train_metadata.csv\",\"ReadSize\",'file',\"TextType\",\"string\");\r\ntrain = read(ttds);<\/pre>\n<p>We can then preview the datastore. This displays the first 8 rows of the file.<\/p>\n<pre>preview(ttds)\r\n<\/pre>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-4035 size-large\" src=\"https:\/\/blogs.mathworks.com\/racing-lounge\/files\/2020\/05\/preview_ttds-1024x224.png\" alt=\"table\" width=\"1024\" height=\"224\" \/><\/p>\n<p>You can also import CSV files in MATLAB using the <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2020a\/matlab\/ref\/readtable.html\">readtable<\/a> function. Here, we create the training labels from the <em>train_labels.csv<\/em> file and store them in a table. We then convert the values of the variable <em>stalled<\/em> to categorical, as most of the deep learning functions used here accept categorical values.<\/p>\n<pre>trainlabels = readtable(\"train_labels.csv\");\r\ntrainlabels.stalled = categorical(trainlabels.stalled);<\/pre>\n<p>In this starter code, we will be using the nano subset of the dataset. Here, we retrieve the files and labels for the nano subset from the tables created above and save them in the variables <em>nanotrain<\/em> and <em>nanotrainlabels<\/em>. 
(To work with the complete dataset, you will not need this step.)<\/p>\n<pre>nanotrain = train(train.nano == 'True',:);\r\nnanotrainlabels = trainlabels(train.nano == 'True',:);<\/pre>\n<h1>Access &amp; Process Video Files<\/h1>\n<p>Datastores in MATLAB are a convenient way of working with and representing collections of data that are too large to fit in memory at one time. A datastore is an object for reading a single file or a collection of files or data. The datastore acts as a repository for data that has the same structure and formatting. To learn more about different datastores, check out the documents below:<\/p>\n<ol>\n<li><a href=\"https:\/\/www.mathworks.com\/help\/matlab\/import_export\/what-is-a-datastore.html\">Getting Started with Datastore<\/a><\/li>\n<li><a href=\"https:\/\/www.mathworks.com\/help\/matlab\/import_export\/select-datastore-for-file-format-or-application.html\">Select Datastore for File Format or Application<\/a><\/li>\n<li><a href=\"https:\/\/www.mathworks.com\/help\/deeplearning\/ug\/datastores-for-deep-learning.html\">Datastores for Deep Learning<\/a><\/li>\n<\/ol>\n<p>In this blog, we use the <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/matlab.io.datastore.filedatastore.html\">fileDatastore<\/a> to read each file using its URL. Each file is then processed using the <a href=\"#MW_H_9ABC9CB1\">readVideo<\/a> helper function, defined at the end of this blog.<\/p>\n<p>We save the datastore in a MAT-file in <em>tempdir<\/em> or the current folder before proceeding to the next sections. 
If the MAT-file already exists, we load the datastore from it without recreating it.<\/p>\n<pre>tempfds = fullfile(tempdir,\"fds_nano.mat\");\r\n\r\nif exist(tempfds,'file')\r\n    load(tempfds,'fds')\r\nelse\r\n    fds = fileDatastore(nanotrain.url,'ReadFcn', @readVideo);\r\n    files = fds.Files;\r\n\r\n    save(tempfds,\"fds\");\r\nend<\/pre>\n<p><strong>Tip:<\/strong> When working with the complete dataset (~1.5TB), create the datastore with the folder location of the training data (&#8216;<em>s3:\/\/drivendata-competition-clog-loss\/train<\/em>&#8216;) rather than with each individual URL, to save time and memory. This step can take a long time to run.<\/p>\n<p>(Optional) We can preview the datastore and confirm that each video frame is now cropped to the outlined segment.<\/p>\n<pre>dataOut = preview(fds);\r\ntile = imtile(dataOut);\r\nimshow(tile);<\/pre>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-4031 size-full\" src=\"https:\/\/blogs.mathworks.com\/racing-lounge\/files\/2020\/05\/preview_fds.png\" alt=\"screenshot\" width=\"560\" height=\"577\" \/><\/p>\n<h1>Classification<\/h1>\n<p>To create a deep learning network for video classification:<\/p>\n<ol>\n<li>Convert videos to sequences of feature vectors using a pretrained convolutional neural network, such as GoogLeNet, to extract features from each frame.<\/li>\n<li>Train an <a href=\"https:\/\/www.mathworks.com\/help\/deeplearning\/ug\/long-short-term-memory-networks.html\">Long Short Term Memory (LSTM)<\/a> network on the sequences to predict the video labels.<\/li>\n<li>Assemble a network that classifies videos directly by combining layers from both networks.<\/li>\n<\/ol>\n<p>The following diagram illustrates the network architecture.<\/p>\n<ul>\n<li>To input image sequences to the network, use a sequence input layer.<\/li>\n<li>To use convolutional layers to extract features, that is, to apply the convolutional 
operations to each frame of the videos independently, use a sequence folding layer followed by the convolutional layers.<\/li>\n<li>To restore the sequence structure and reshape the output to vector sequences, use a sequence unfolding layer and a flatten layer.<\/li>\n<li>To classify the resulting vector sequences, include the LSTM layers followed by the output layers.<\/li>\n<\/ul>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-4081 size-full\" src=\"https:\/\/blogs.mathworks.com\/racing-lounge\/files\/2020\/05\/Image_1.png\" alt=\"flowchart\" width=\"878\" height=\"94\" \/><\/p>\n<h2>Load Pretrained Convolutional Network<\/h2>\n<p>To convert frames of videos to feature vectors, we use the activations of a pretrained network. Load a pretrained GoogLeNet model using the <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2020a\/deeplearning\/ref\/googlenet.html\">googlenet<\/a> function. This function requires the Deep Learning Toolbox\u2122 Model for <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/64456-deep-learning-toolbox-model-for-googlenet-network\">GoogLeNet Network support package<\/a>.<\/p>\n<pre>netCNN = googlenet;<\/pre>\n<h2>Convert Frames to Feature Vectors<\/h2>\n<p>Use the convolutional network as a feature extractor by getting the activations when inputting the video frames to the network.<\/p>\n<p>This diagram illustrates the data flow through the network.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-4083 size-full\" src=\"https:\/\/blogs.mathworks.com\/racing-lounge\/files\/2020\/05\/Image_2.png\" alt=\"flowchart\" width=\"491\" height=\"103\" \/><\/p>\n<p>The input size should match the input size of the pretrained network, here the GoogLeNet network. The datastore is then resized to the input size using the <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2020a\/ros\/ref\/transform.html\"><em>transform<\/em><\/a> 
function.<\/p>\n<pre>inputSize = netCNN.Layers(1).InputSize(1:2);\r\nfdsReSz = transform(fds,@(x) imresize(x,inputSize));<\/pre>\n<p>Convert the videos to sequences of feature vectors, where the feature vectors are the output of the <em><a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2020a\/deeplearning\/ref\/activations.html\">activations<\/a><\/em> function on the last pooling layer of the GoogLeNet network (&#8220;<em>pool5-7x7_s1<\/em>&#8220;). To analyze every size and location of the clogged vessels within the outlined segment, we do not modify the lengths of the sequences here.<\/p>\n<p><strong>Tip:<\/strong> After converting the videos to sequences, save the sequences in a MAT-file in the <em>tempdir<\/em> folder. If the MAT-file already exists, then load the sequences from it without reconverting them. This step can take a long time to run.<\/p>\n<pre>layerName = \"pool5-7x7_s1\";\r\n\r\ntempFile = fullfile(tempdir,\"sequences_nano.mat\");\r\n\r\nif exist(tempFile,'file')\r\n    load(tempFile,\"sequences\")\r\nelse\r\n    numFiles = numel(files);\r\n    sequences = cell(numFiles,1);\r\n\r\n    for i = 1:numFiles\r\n        fprintf(\"Reading file %d of %d...\\n\", i, numFiles);\r\n        sequences{i,1} = activations(netCNN,read(fdsReSz),layerName,'OutputAs','columns','ExecutionEnvironment','auto');\r\n    end\r\n\r\n    save(tempFile,\"sequences\");\r\nend<\/pre>\n<p>We then view the sizes of the first few sequences. Each sequence is a <em>D<\/em>-by-<em>S<\/em> array, where <em>D<\/em> is the number of features (the output size of the pooling layer) and <em>S<\/em> is the number of frames of the video.<\/p>\n<pre>sequences(1:10)<\/pre>\n<p><img decoding=\"async\" loading=\"lazy\" width=\"189\" height=\"300\" class=\"alignnone size-medium wp-image-4103\" src=\"https:\/\/blogs.mathworks.com\/racing-lounge\/files\/2020\/05\/preview_sequences-1-189x300.png\" alt=\"table\" \/><\/p>\n<h2>Prepare Training Data<\/h2>\n<p>Here, we prepare the data for training by partitioning it into training and validation partitions. We assign 90% of the data to the training partition and 10% to the validation partition.<\/p>\n<pre>labels = nanotrainlabels.stalled;\r\n\r\nnumObservations = numel(sequences);\r\nidx = randperm(numObservations);\r\nN = floor(0.9 * numObservations);\r\n\r\nidxTrain = idx(1:N);\r\nsequencesTrain = sequences(idxTrain);\r\nlabelsTrain = labels(idxTrain);\r\n\r\nidxValidation = idx(N+1:end);\r\nsequencesValidation = sequences(idxValidation);\r\nlabelsValidation = labels(idxValidation);<\/pre>\n<p>We then get the sequence lengths of the training data and visualize them in a histogram.<\/p>\n<pre>numObservationsTrain = numel(sequencesTrain);\r\nsequenceLengths = zeros(1,numObservationsTrain);\r\n\r\nfor i = 1:numObservationsTrain\r\n    sequence = sequencesTrain{i};\r\n    sequenceLengths(i) = size(sequence,2);\r\nend\r\n\r\nfigure\r\nhistogram(sequenceLengths)\r\ntitle(\"Sequence Lengths\")\r\nxlabel(\"Sequence Length\")\r\nylabel(\"Frequency\")\r\n<\/pre>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-4107 size-full\" src=\"https:\/\/blogs.mathworks.com\/racing-lounge\/files\/2020\/05\/sequence-lengths.png\" 
alt=\"graph\" width=\"700\" height=\"569\" \/><\/p>\n<h2>Create LSTM Network<\/h2>\n<p>Next, create a <a href=\"https:\/\/www.mathworks.com\/help\/deeplearning\/ug\/long-short-term-memory-networks.html\">Long Short Term Memory (LSTM)<\/a> network that can classify the sequences of feature vectors representing the videos.<\/p>\n<p>We then define the LSTM network architecture and specify the following network layers.<\/p>\n<ul>\n<li>A <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2020a\/deeplearning\/ref\/nnet.cnn.layer.sequenceinputlayer.html\">sequence input layer<\/a> with an input size corresponding to the feature dimension of the feature vectors<\/li>\n<li>A <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2020a\/deeplearning\/ref\/nnet.cnn.layer.bilstmlayer.html\">BiLSTM layer<\/a> with 2000 hidden units, followed by a <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2020a\/deeplearning\/ref\/nnet.cnn.layer.dropoutlayer.html\">dropout layer<\/a>. To output only one label for each sequence, set the &#8216;<em>OutputMode<\/em>&#8216; option of the BiLSTM layer to &#8216;<em>last<\/em>&#8216;.<\/li>\n<li>A <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2020a\/deeplearning\/ref\/nnet.cnn.layer.fullyconnectedlayer.html\">fully connected layer<\/a> with an output size corresponding to the number of classes, a <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2020a\/deeplearning\/ref\/nnet.cnn.layer.softmaxlayer.html\">softmax layer<\/a>, and a <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2020a\/deeplearning\/ref\/classificationlayer.html\">classification layer<\/a>.<\/li>\n<\/ul>\n<pre>numFeatures = size(sequencesTrain{1},1);\r\nnumClasses = 2;\r\n\r\nlayers = [\r\n    sequenceInputLayer(numFeatures,'Name','sequence')\r\n    bilstmLayer(2000,'OutputMode','last','Name','bilstm')\r\n    dropoutLayer(0.5,'Name','drop')\r\n    
fullyConnectedLayer(numClasses,'Name','fc')\r\n    softmaxLayer('Name','softmax')\r\n    classificationLayer('Name','classification')];<\/pre>\n<h2>Specify Training Options<\/h2>\n<p>As the next step, we specify the training options using the <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2020a\/deeplearning\/ref\/trainingoptions.html\">trainingOptions<\/a> function:<\/p>\n<ul>\n<li>Set a mini-batch size of 16, an initial learning rate of 0.0001, and a gradient threshold of 2 (to prevent the gradients from exploding).<\/li>\n<li>Truncate the sequences in each mini-batch to have the same length as the shortest sequence.<\/li>\n<li>Shuffle the data every epoch.<\/li>\n<li>Validate the network once per epoch.<\/li>\n<li>Display the training progress in a plot and suppress verbose output.<\/li>\n<\/ul>\n<pre>miniBatchSize = 16;\r\nnumObservations = numel(sequencesTrain);\r\nnumIterationsPerEpoch = floor(numObservations \/ miniBatchSize);\r\n\r\noptions = trainingOptions('adam', ...\r\n    'MiniBatchSize',miniBatchSize, ...\r\n    'InitialLearnRate',1e-4, ...\r\n    'GradientThreshold',2, ...\r\n    'Shuffle','every-epoch', ...\r\n    'ValidationData',{sequencesValidation,labelsValidation}, ...\r\n    'ValidationFrequency',numIterationsPerEpoch, ...\r\n    'Plots','training-progress', ...\r\n    'Verbose',false, ...\r\n    'ExecutionEnvironment','auto');<\/pre>\n<h2>Train LSTM Network<\/h2>\n<p>We then train the network using the <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2020a\/deeplearning\/ref\/trainnetwork.html\">trainNetwork<\/a> function. 
Note that this function can take a long time to run due to the computations involved.<\/p>\n<pre>[netLSTM,info] = trainNetwork(sequencesTrain,labelsTrain,layers,options);<\/pre>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-4105 size-large\" src=\"https:\/\/blogs.mathworks.com\/racing-lounge\/files\/2020\/05\/TP_GPU_5_19-1024x532.png\" alt=\"graph\" width=\"1024\" height=\"532\" \/><\/p>\n<p>As the next step, we calculate the classification accuracy of the network on the validation set using the <em><a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2020a\/deeplearning\/ref\/classify.html\">classify<\/a><\/em> function. You can use the same mini-batch size as you used for training.<\/p>\n<pre>YPred = classify(netLSTM,sequencesValidation,'MiniBatchSize',miniBatchSize);\r\nYValidation = labelsValidation;\r\naccuracy = mean(YPred == YValidation)<\/pre>\n<h2>Assemble Video Classification Network<\/h2>\n<p>To create a network that classifies videos directly, assemble a network using layers from both of the created networks. 
Use the layers from the convolutional network to transform the videos into vector sequences and the layers from the LSTM network to classify the vector sequences.<\/p>\n<p>The following diagram illustrates the network architecture.<\/p>\n<ul>\n<li>To input image sequences to the network, use a sequence input layer.<\/li>\n<li>To use convolutional layers to extract features, that is, to apply the convolutional operations to each frame of the videos independently, use a sequence folding layer followed by the convolutional layers.<\/li>\n<li>To restore the sequence structure and reshape the output to vector sequences, use a sequence unfolding layer and a flatten layer.<\/li>\n<li>To classify the resulting vector sequences, include the LSTM layers followed by the output layers.<\/li>\n<\/ul>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-4085 size-full\" src=\"https:\/\/blogs.mathworks.com\/racing-lounge\/files\/2020\/05\/Image_3.png\" alt=\"flowchart\" width=\"879\" height=\"98\" \/><\/p>\n<p><strong>Add Convolutional Layers<\/strong><\/p>\n<p>First, we create a layer graph of the GoogLeNet network.<\/p>\n<pre>cnnLayers = layerGraph(netCNN);<\/pre>\n<p>Then we remove the input layer (&#8220;<em>data<\/em>&#8220;) and the layers after the pooling layer used for the activations (&#8220;<em>pool5-drop_7x7_s1<\/em>&#8220;, &#8220;<em>loss3-classifier<\/em>&#8220;, &#8220;<em>prob<\/em>&#8220;, and &#8220;<em>output<\/em>&#8220;).<\/p>\n<pre>layerNames = [\"data\" \"pool5-drop_7x7_s1\" \"loss3-classifier\" \"prob\" \"output\"];\r\ncnnLayers = removeLayers(cnnLayers,layerNames);<\/pre>\n<p><strong>Add Sequence Input Layer<\/strong><\/p>\n<p>We create a <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2020a\/deeplearning\/ref\/nnet.cnn.layer.sequenceinputlayer.html\">sequence input layer<\/a> that accepts image sequences containing images of the same input size as the GoogLeNet network. 
To normalize the images using the same average image as the GoogLeNet network, set the &#8216;<em>Normalization<\/em>&#8216; option of the sequence input layer to &#8216;<em>zerocenter<\/em>&#8216; and the &#8216;<em>Mean<\/em>&#8216; option to the average image of the input layer of GoogLeNet.<\/p>\n<pre>inputSize = netCNN.Layers(1).InputSize(1:2);\r\naverageImage = netCNN.Layers(1).Mean;\r\n\r\ninputLayer = sequenceInputLayer([inputSize 3], ...\r\n\u00a0\u00a0\u00a0 'Normalization','zerocenter', ...\r\n\u00a0\u00a0\u00a0 'Mean',averageImage, ...\r\n\u00a0\u00a0\u00a0 'Name','input');<\/pre>\n<p>We then add the sequence input layer to the layer graph. To apply the convolutional layers to the images of the sequences independently, we remove the sequence structure of the image sequences by including a sequence folding layer between the sequence input layer and the convolutional layers. We then connect the output of the sequence folding layer to the input of the first convolutional layer (&#8220;<em>conv1-7x7_s2<\/em>&#8220;).<\/p>\n<pre>layers = [\r\n\u00a0\u00a0\u00a0 inputLayer\r\n\u00a0\u00a0\u00a0 sequenceFoldingLayer('Name','fold')];\r\n\r\nlgraph = addLayers(cnnLayers,layers);\r\nlgraph = connectLayers(lgraph,\"fold\/out\",\"conv1-7x7_s2\");<\/pre>\n<p><strong>Add LSTM Layers<\/strong><\/p>\n<p>The next step is to add the LSTM layers to the layer graph by removing the sequence input layer of the LSTM network. To restore the sequence structure removed by the sequence folding layer, we can include a sequence unfolding layer after the convolution layers. The LSTM layers expect sequences of vectors. 
To reshape the output of the sequence unfolding layer to vector sequences, we include a flatten layer after the sequence unfolding layer.<\/p>\n<p>We take the layers from the LSTM network and remove the sequence input layer.<\/p>\n<pre>lstmLayers = netLSTM.Layers;\r\nlstmLayers(1) = [];<\/pre>\n<p>Add the <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2020a\/deeplearning\/ref\/nnet.cnn.layer.sequenceunfoldinglayer.html\">sequence unfolding layer<\/a>, the flatten layer, and the LSTM layers to the layer graph. Connect the last convolutional layer (&#8220;<em>pool5-7x7_s1<\/em>&#8220;) to the input of the sequence unfolding layer (&#8220;<em>unfold\/in<\/em>&#8220;).<\/p>\n<pre>layers = [\r\n    sequenceUnfoldingLayer('Name','unfold')\r\n    flattenLayer('Name','flatten')\r\n    lstmLayers];\r\n\r\nlgraph = addLayers(lgraph,layers);\r\nlgraph = connectLayers(lgraph,\"pool5-7x7_s1\",\"unfold\/in\");<\/pre>\n<p>To enable the unfolding layer to restore the sequence structure, connect the &#8220;<em>miniBatchSize<\/em>&#8221; output of the sequence folding layer to the corresponding input of the sequence unfolding layer.<\/p>\n<pre>lgraph = connectLayers(lgraph,\"fold\/miniBatchSize\",\"unfold\/miniBatchSize\");<\/pre>\n<p><strong>Assemble Network<\/strong><\/p>\n<p>We then check that the network is valid using the <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2020a\/deeplearning\/ref\/analyzenetwork.html\">analyzeNetwork<\/a> function.<\/p>\n<pre>analyzeNetwork(lgraph)\r\n<\/pre>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-4099 size-large\" src=\"https:\/\/blogs.mathworks.com\/racing-lounge\/files\/2020\/05\/net_v1_5_19-1024x647.png\" alt=\"screenshot\" width=\"1024\" height=\"647\" \/><\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-4101 size-large\" src=\"https:\/\/blogs.mathworks.com\/racing-lounge\/files\/2020\/05\/net_v2_5_19-1024x565.png\" 
alt=\"screenshot\" width=\"1024\" height=\"565\" \/><\/p>\n<p>Then, assemble the network so that it is ready for prediction using the <em>assembleNetwork<\/em> function.<\/p>\n<pre>net = assembleNetwork(lgraph)\r\n\r\n<\/pre>\n<p><img decoding=\"async\" loading=\"lazy\" width=\"300\" height=\"113\" class=\"alignnone size-medium wp-image-4025\" src=\"https:\/\/blogs.mathworks.com\/racing-lounge\/files\/2020\/05\/DAG_network-300x113.png\" alt=\"code screenshot\" \/><\/p>\n<h2>Prepare Test Data<\/h2>\n<p>Just as we did with the training data, we will now read the test files into a fileDatastore and crop the frames using the <em>readVideo<\/em> function. The cropped-files datastore is then resized to the inputSize using the transform function.<\/p>\n<p>When we use all the files from a folder, we can directly give the s3 folder URL while creating the datastore. This speeds up the creation of the datastore.<\/p>\n<pre>testfds = fileDatastore('s3:\/\/drivendata-competition-clog-loss\/test\/','ReadFcn', @readVideo);\r\ntestfdsReSz = transform(testfds,@(x) {imresize(x,inputSize)});<\/pre>\n<h2>Classify Using Test Data<\/h2>\n<p>Once we have our trained network, we can perform predictions on our test set. To do so, we classify the test set videos using the assembled network. The <em><a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2020a\/deeplearning\/ref\/classify.html\">classify<\/a><\/em> function expects a cell array containing the input videos, so you must input a 1-by-1 cell array containing each video.<\/p>\n<pre>testFiles = testfds.Files;\r\nnumTestFiles = numel(testFiles);\r\n\r\nYPred = cell(numTestFiles,1);\r\n\r\nfor i = 1:numTestFiles\r\n    fprintf(\"Reading file %d of %d...\\n\", i, numTestFiles);\r\n    YPred{i,1} = classify(net,read(testfdsReSz),'ExecutionEnvironment','auto');\r\nend<\/pre>\n<h1>Save Submission to File<\/h1>\n<p>We create a table of the results based on the filenames and predictions. The desired file format for submission is a CSV file with the column names <em>filename<\/em> and <em>stalled<\/em>.<\/p>\n<p>We place all the test results in a MATLAB table, which makes it easy to visualize and to write to the desired file format.<\/p>\n<pre>test = readtable(\"test_metadata.csv\");\r\ntestResults = table(test.filename,YPred(:,1),'VariableNames',{'filename','stalled'});<\/pre>\n<p>We then write the results to a CSV file. 
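One caveat worth checking first (a hedged sketch, not part of the original benchmark): classify returns categorical labels, so the stalled column built from YPred holds categorical values. If the submission expects numeric 0\/1 values, as the training labels were before the categorical conversion, you can convert before writing:

```matlab
% Sketch: convert the cell array of categorical predictions ('0'/'1')
% back to numeric 0/1 before writing the submission file.
% Verify the expected encoding on the competition's submission page.
testResults.stalled = cellfun(@(c) double(string(c)), YPred(:,1));
```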
This is the file you will submit for the challenge.<\/p>\n<pre>writetable(testResults,'testResults.csv');<\/pre>\n<h1>Helper Functions<\/h1>\n<p>The <em>readVideo<\/em> function reads the video in <em>filename<\/em> and returns a cropped 4-D array of frames.<\/p>\n<p>The area of interest in each frame of the video is the orange-outlined area, which we want to extract and use for the rest of our training. Hence, each frame is cropped to the bounding box of the segment, calculated using the <em>detectROI<\/em> function.<\/p>\n<pre>function video = readVideo(filename)\r\n\r\nvr = VideoReader(filename);\r\ni = 0;\r\n\r\nwhile hasFrame(vr)\r\n    i = i+1;\r\n    frame = readFrame(vr);\r\n    if i == 1\r\n        % Detect the region of interest from the first frame only\r\n        Bbox = detectROI(frame);\r\n    end\r\n    frame = imcrop(frame, Bbox);\r\n    video(:,:,:,i) = frame;\r\nend\r\n\r\nend<\/pre>\n<p>The <em>detectROI<\/em> function detects the outlined segment and thresholds the image based on the defined threshold values. This function was generated using MATLAB&#8217;s <a href=\"https:\/\/www.mathworks.com\/help\/images\/ref\/colorthresholder-app.html\">color thresholder app<\/a>. For detecting the area specified by the orange colored markers, we used <a href=\"https:\/\/in.mathworks.com\/help\/vision\/ref\/vision.blobanalysis-system-object.html\">Blob Analysis<\/a>. We chose to keep the area of interest as the blob with the largest major axis. 
Check out the following videos to learn more.<\/p>\n<ul>\n<li><a href=\"https:\/\/www.youtube.com\/watch?v=HHpmRjQAiRs&amp;list=PLn8PRpmsu08oBSjfGe8WIMN-2_rwWFSgr&amp;index=8\">How to Segment Images Using Color Thresholding<\/a><\/li>\n<li><a href=\"https:\/\/www.mathworks.com\/videos\/object-detection-using-blob-analysis-108372.html\">Object Detection using Blob analysis<\/a><\/li>\n<\/ul>\n<pre>function [Bbox] = detectROI(frameIn)\r\n\r\n%% Set up the detector and initialize variables\r\npersistent detector\r\nif isempty(detector)\r\n    detector = vision.BlobAnalysis('BoundingBoxOutputPort',true,'MajorAxisLengthOutputPort',true);\r\nend\r\n\r\nthreshold = [104 255; 13 143; 9 98];\r\n\r\nmask = (frameIn(:,:,1) &gt;= threshold(1,1)) &amp; (frameIn(:,:,1) &lt;= threshold(1,2)) &amp; ...\r\n    (frameIn(:,:,2) &gt;= threshold(2,1)) &amp; (frameIn(:,:,2) &lt;= threshold(2,2)) &amp; ...\r\n    (frameIn(:,:,3) &gt;= threshold(3,1)) &amp; (frameIn(:,:,3) &lt;= threshold(3,2));\r\n\r\n[~, ~, Bbox1, majorAxis] = detector(mask);\r\n\r\nif ~isempty(majorAxis)\r\n    % Identify the largest blob\r\n    [~,mIdx] = max(majorAxis);\r\n    Bbox = Bbox1(mIdx,:);\r\nelse\r\n    % Fall back to the full frame if no blob is detected,\r\n    % so that Bbox is always defined\r\n    Bbox = [1 1 size(frameIn,2) size(frameIn,1)];\r\nend\r\n\r\nend<\/pre>\n<p>Thanks for following along with this blog. We are excited to find out how you will modify this starter code and make it yours. We strongly recommend looking at our <a href=\"https:\/\/www.mathworks.com\/help\/deeplearning\/ug\/deep-learning-tips-and-tricks.html\">Deep Learning Tips &amp; Tricks page<\/a> for more ideas on how you can improve the benchmark model. 
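As one concrete starting point for such improvements, you can look beyond plain accuracy on the held-out set. A hedged sketch using the YPred and YValidation variables from the validation step earlier (confusionmat requires Statistics and Machine Learning Toolbox; the Matthews correlation coefficient shown here is a common summary for imbalanced binary problems, and the official competition metric is defined on the Problem Description page):

```matlab
% Sketch: confusion matrix and Matthews correlation coefficient (MCC)
% for the validation predictions computed earlier, assuming the
% category order is {'0','1'} (flowing, stalled).
C = confusionmat(YValidation, YPred);   % 2x2: rows = true class, cols = predicted
TN = C(1,1); FP = C(1,2); FN = C(2,1); TP = C(2,2);
mcc = (TP*TN - FP*FN) / sqrt((TP+FP)*(TP+FN)*(TN+FP)*(TN+FN))
```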
You can also check out this blog: <a href=\"https:\/\/towardsdatascience.com\/are-you-ready-for-a-video-classification-challenge-d044e3b202b6?source=friends_link&amp;sk=fe34a055327cd53dc1c94818a56fb1d4\">Are you ready for a Video Classification Challenge<\/a>, to learn 5 different methods for Video Classification.<\/p>\n<p>Feel free to reach out to us in the <a href=\"https:\/\/community.drivendata.org\">DrivenData forum<\/a> if you have any further questions.<\/p>\n","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img src=\"https:\/\/blogs.mathworks.com\/student-lounge\/files\/2020\/05\/StallCatchers_gif2.gif\" class=\"img-responsive attachment-post-thumbnail size-post-thumbnail wp-post-image\" alt=\"gif\" decoding=\"async\" loading=\"lazy\" \/><\/div>\n<p>Today\u2019s blog is written by Neha Goel, Deep Learning Technical Evangelist on the Student Competition team at MathWorks.<br \/>\nHello all! We at MathWorks, in collaboration with DrivenData, are excited to&#8230; <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/student-lounge\/2020\/05\/20\/advance-alzheimers-research-with-stall-catchers-matlab-benchmark-code\/\">read more 
>><\/a><\/p>\n","protected":false},"author":163,"featured_media":4111,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[365],"tags":[399,363,104,401,128,397,395],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/student-lounge\/wp-json\/wp\/v2\/posts\/4015"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/student-lounge\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/student-lounge\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/student-lounge\/wp-json\/wp\/v2\/users\/163"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/student-lounge\/wp-json\/wp\/v2\/comments?post=4015"}],"version-history":[{"count":37,"href":"https:\/\/blogs.mathworks.com\/student-lounge\/wp-json\/wp\/v2\/posts\/4015\/revisions"}],"predecessor-version":[{"id":4737,"href":"https:\/\/blogs.mathworks.com\/student-lounge\/wp-json\/wp\/v2\/posts\/4015\/revisions\/4737"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/student-lounge\/wp-json\/wp\/v2\/media\/4111"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/student-lounge\/wp-json\/wp\/v2\/media?parent=4015"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/student-lounge\/wp-json\/wp\/v2\/categories?post=4015"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/student-lounge\/wp-json\/wp\/v2\/tags?post=4015"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}