Activity Classification Demo

Posted by Johanna Pingel, January 30, 2020

Every January, our company hosts a “kickoff”: an opportunity for sales, marketing, and folks in customer-facing roles to meet in Boston and learn about all the new and exciting features coming in 2020 for their relevant product areas. Being on the deep learning marketing team, we decided to put words into action by giving everyone the opportunity to interact with a neural network (some for the first time!).

Starting with an example from deep learning doc.

Here's how we wanted the demo to work: Each team competes to correctly classify as many activities as possible in 10 minutes. Each team has a laptop & webcam. They record a small video of a team member performing an activity, and then send the video to a network for classification.

Requirements of the demo

Sometimes, I’ll admit, I focus heavily on the deep learning model and not on the entire system. This project was an opportunity for me to learn to build an entire application, from front end to deployment. We had three requirements:

Anyone can interact with the neural network, regardless of technical background
Many people can run the demo at the same time
Network prediction is reasonably fast (which may require GPU support)

Heather (@heathergorr) and I spent roughly 1 week putting this demo together from start to finish. We started with an example in the documentation (link https://www.mathworks.com/help/deeplearning/examples/classify-videos-using-deep-learning.html) and retrained the network to recognize 8 activities.

Creating the model

The example in doc does a nice job explaining how the network identifies activities in video streams: it first uses GoogLeNet to pull activations from the video files, and then uses an LSTM network to classify those activations into their activity classes.

The activations code looks like this:

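for i = 1:numFiles
  fprintf("Reading file %d of %d...\n", i, numFiles)

  % Read the video and center-crop it to match the CNN input size
  video = readVideo(files(i));
  video = centerCrop(video,inputSize);

  % Store the CNN activations, one column per video frame
  sequences{i,1} = activations(netCNN,video,layerName,'OutputAs','columns');
end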
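
The centerCrop helper comes from the documentation example and isn't shown here. As a rough idea of what the cropping step might involve, here is a minimal sketch that trims a toy video array to a centered square. The variable names and the toy video size are assumptions for illustration, and the real helper presumably also resizes the result to inputSize, which this sketch leaves out.

```matlab
% Toy video: height-by-width-by-channels-by-frames (sizes are made up)
video = rand(240, 320, 3, 10);

h = size(video, 1);
w = size(video, 2);
side = min(h, w);   % edge length of the centered square

% First row and column of the centered square
rowStart = floor((h - side) / 2) + 1;
colStart = floor((w - side) / 2) + 1;

% Keep all channels and frames; crop only the spatial dimensions
videoCropped = video(rowStart:rowStart+side-1, colStart:colStart+side-1, :, :);
```

Cropping before resizing keeps the aspect ratio intact, so people in the frame are not stretched or squashed before the frames reach the network.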

The LSTM network looks like this:

layers = [
  sequenceInputLayer(numFeatures,'Name','sequence')
  bilstmLayer(2000,'OutputMode','last','Name','bilstm')
  dropoutLayer(0.5,'Name','drop')
  fullyConnectedLayer(numClasses,'Name','fc')
  softmaxLayer('Name','softmax')
  classificationLayer('Name','classification')];

And the training looks like this:

miniBatchSize = 16;
numObservations = numel(sequencesTrain);
numIterationsPerEpoch = floor(numObservations / miniBatchSize);

options = trainingOptions('adam', ...
  'MiniBatchSize',miniBatchSize, ...
  'InitialLearnRate',1e-4, ...
  'GradientThreshold',2, ...
  'Shuffle','every-epoch', ...
  'ValidationData',{sequencesValidation,labelsValidation}, ...
  'ValidationFrequency',numIterationsPerEpoch, ...
  'Plots','training-progress', ...
  'Verbose',false);

[netLSTM,info] = trainNetwork(sequencesTrain,labelsTrain,layers,options);

The original model, trained on 51 activities, left a lot to be desired, with an accuracy of 67.8%. Looking at the training data set, you could see why: there is a lot of variation between the performers of the activities, and more data might have helped too.
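
For reference, an accuracy figure like this is usually just the fraction of validation videos whose predicted label matches the true label. Here is a minimal sketch with made-up labels; in the real workflow, the predictions would presumably come from calling classify on the trained network with the validation sequences.

```matlab
% Made-up class names and labels for five validation videos (illustration only)
classNames = ["jump", "run", "smoke", "wave"];
labelsTrue = categorical(["run"; "jump"; "wave"; "run"; "smoke"], classNames);
labelsPred = categorical(["run"; "jump"; "run";  "run"; "smoke"], classNames);

% Fraction of predictions that match the true labels
accuracy = mean(labelsPred == labelsTrue);
fprintf("Validation accuracy: %.1f%%\n", 100 * accuracy);
```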

Side note: we didn’t see a lot of improvement in accuracy with hyperparameter tuning. It seemed like the documentation example already has the parameters set to deliver the highest accuracy model.

We then trained on 20 activities (71%) and finally on 8 activities, reaching an accuracy of roughly 80%. We chose the 8 activities based on what we thought could be accomplished in the 10 minutes allotted for the game and what would give the best chance of high classification accuracy.
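
Cutting the training set down to a subset of activities is mostly a filtering step on the labels. Here is a minimal sketch, assuming the sequences are stored in a cell array with a matching categorical label vector, as in the training code above. The activity names and toy data are placeholders, not our actual list.

```matlab
% Toy data: one feature sequence per video plus its label (placeholders)
sequences = {rand(4,3); rand(4,5); rand(4,2); rand(4,6); rand(4,4)};
labels = categorical(["smoke"; "run"; "wave"; "smoke"; "climb"]);

% Hypothetical subset of activities to keep
keepClasses = ["smoke", "wave"];

% Keep only the videos whose label is in the subset
keep = ismember(labels, keepClasses);
sequencesSubset = sequences(keep);

% Drop the unused categories so the class count matches the subset
labelsSubset = removecats(labels(keep));
numClasses = numel(categories(labelsSubset));
```

The resulting numClasses is what the fullyConnectedLayer in the LSTM network needs, so the layer definition itself does not have to change between the 51, 20, and 8 class versions.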

Accuracy of the 8 class model. Overfitting?

The front end, created in App Designer (video overview link: https://www.mathworks.com/videos/app-designer-overview-1510748719083.html), was intentionally simple and allowed users to quickly choose an activity, record a small video, and receive classification results.

Heather and Johanna demonstrating the app to a room of demo participants!

Here is the app and network in action:

Deploying the app

Now – with a room of 300 people: how do you get them access to the files and run the app?

Enter MATLAB Online: information on the product is here: https://www.mathworks.com/products/matlab-online.html and you can run MATLAB Online here: https://matlab.mathworks.com/

If you have access to MATLAB, you have access to MATLAB Online, which is a convenient way to run MATLAB away from your standard setup. We were pleasantly surprised that the network prediction was quite speedy: under 5 seconds per prediction without needing GPUs, which suited our requirements quite well.
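
If you want to check prediction time in your own setup, tic and toc are enough for a rough measurement. In this sketch, predictFcn is a hypothetical stand-in for the real work (extracting activations and classifying a video), so the number it prints says nothing about our app.

```matlab
% Hypothetical stand-in for the real prediction step
predictFcn = @(x) sum(x(:));

% Toy input; the sizes are made up
video = rand(224, 224, 3, 30);

% Time a single call
tic
result = predictFcn(video);
elapsedSeconds = toc;

fprintf("Prediction took %.3f seconds\n", elapsedSeconds);
```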

We considered a few other options as well:

A MATLAB Web App (link: https://www.mathworks.com/help/compiler/webapps/install-matlab-web-app-server.html) could meet the needs of having everyone in your office run an app from a browser, regardless of MATLAB access. This is perfect for running a web app internally, but bringing it to a convention center is not the intended use case of the server, and we didn’t want to risk any issues with needing a VPN.
With MATLAB Compiler (https://www.mathworks.com/products/compiler.html), you can create an app that runs locally on everyone’s computer (without the need for internet or a MATLAB license on the final computer). For the purpose of the game, we didn’t want people to need to install an app for so short a time slot.

Since we decided on MATLAB Online, we could share a link to the app, which made sharing the app and code much easier. You can also take a look at the code and run the app too! A link to the read-only code is here.

The game was a success: we have fun images of people trying the activities. The model was quite finicky, though: in the first session, everyone was classified as “smoking” regardless of the actual activity, but later sessions were able to get 6 out of 8 activities correct in 10 minutes. We also realized that when you put your hand near your face, it’s often classified as smoking or brushing hair! This was a great opportunity for people less familiar with deep learning to get hands-on experience interacting with a neural network, and it prompted some discussions among participants on why deep learning isn't always 100% accurate.

Demo Day Success!! We had a room full of people acting out all 8 activities.

In general, it was nice to think about the entire package: not only the model, but also how that model would be used, from the app people interact with to the way they access the model. I encourage you to try out MATLAB Online this week and let me know your thoughts.

P.S. R2020a is just around the corner this spring, and from our meetings last week, it looks to be shaping up as a great release. Stay tuned!