Tennis Analysis with AI: Interactive Ground Truth Labeling
    
This blog post is from Cory Hoi, Engineer at MathWorks Engineering Development Group.
 
With the rapid advancement of artificial intelligence (AI), harnessing its power is more accessible than ever. I imagine that the arrival of the personal computer was equally transformative. Advances in areas like computer vision and natural language processing (NLP) are now being applied in chatbots, healthcare research, transportation, education, and sports. AI is everywhere we look, so integrated into our daily lives that many of us hardly notice its presence. For example, just turn on your TV to your favorite sports broadcast.
For me, it’s tennis. At this past Wimbledon, line calls were fully automated, eliminating the need to argue with judges over close calls. Ball tracking and real-time data analysis have also become integral parts of the game, representing a leap forward from sports analytics just five years ago. Players now use AI to study playing patterns and refine their tactics. Yet, as an avid fan, I can’t help but wonder: how does all of this actually work?
In this two-part blog post series, I will show you how to build and leverage deep neural networks in MATLAB for object detection. This first blog post focuses on the initial steps of labeling the data. It will go over the many tools available in MATLAB to ease the typical pains of data labeling.
 
 
 
Object Detection in Sports
As you may have guessed, my sport of choice is tennis. However, the object detection methods described here have many applications, both in other sports like basketball and football and beyond sports, for example in autonomous vehicles. Object detection in tennis has greatly improved my viewing experience in recent years. This is evident in instant replays of tennis points, where the ball is clearly marked and tracked. Data preparation is a crucial but often overlooked step in any AI task. It involves labeling the dataset to create ground truth data for training and putting the dataset into the correct form.
Interactive Video Labeling
In MATLAB, video labeling is made easy with the Video Labeler app. The app lets you interactively label regions of interest (ROIs) using rectangle, polyline, polygon, and pixel ROI labels. In this post, we use Video Labeler to label the tennis ball and the tennis court in our dataset. Open the Video Labeler app from the Apps tab, under Image Processing and Computer Vision. Create a new project and import the video from the trainingData folder. After loading the video, the app displays the first frame in the middle of the screen. In the panel just below, you can navigate between frames with the left and right arrows.
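If you prefer the command line, the same starting point can be reached with a few lines of code. This is a minimal sketch, assuming a video named tennisRally.mp4 inside the trainingData folder; that file name is hypothetical, so substitute your own. It previews the first frame with VideoReader and then opens the video in Video Labeler.

```matlab
% Hypothetical file name: replace with a video from your trainingData folder.
videoFile = fullfile('trainingData', 'tennisRally.mp4');

if isfile(videoFile)
    % Preview the first frame, which is the frame the app shows after import.
    reader = VideoReader(videoFile);
    firstFrame = read(reader, 1);
    imshow(firstFrame)
    title('First frame of the training video')

    % Open the Video Labeler app with the video already loaded.
    videoLabeler(videoFile)
else
    disp('Video not found. Check the path, or import the video from inside the app.')
end
```

Opening the app this way is only a convenience. The labeling itself still happens interactively, as described in the rest of this post.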
 
Video Labeler
 
Manual vs Automatic Labeling
In MATLAB, there are numerous ways to label a dataset. For example, you can draw bounding boxes around the people playing tennis and give them the label name “person”. You can also draw lines to label the lines on the court as linear objects. For this project, however, let’s use a pixel label for the tennis ball: create the label, name it “ball”, and set its color to green. After you add the label, it appears in the ROI Label Definitions pane on the left. This pane lets you easily switch between labels when a single image contains multiple objects. To draw the labels, multiple manual and automatic algorithms are available.
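The label types mentioned above can also be defined in code with labelDefinitionCreator from Computer Vision Toolbox, which can be handy when you want the same definitions across several labeling sessions. The sketch below is an illustration rather than the workflow used in this post. The label name “courtLine” is my own placeholder, and the label color is left to be set in the app.

```matlab
% Build label definitions that mirror the three label types described above.
ldc = labelDefinitionCreator();

% Bounding box around each player.
addLabel(ldc, 'person', labelType.Rectangle);

% Line label for the court lines ('courtLine' is a placeholder name).
addLabel(ldc, 'courtLine', labelType.Line);

% Pixel label for the tennis ball, the label used in this post.
addLabel(ldc, 'ball', labelType.PixelLabel);

% Create the label definitions table and display it.
labelDefs = create(ldc);
disp(labelDefs)
```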
Manual algorithms, such as Polygon and Brush, let you define the exact area to label, while automation algorithms use a range of techniques to speed up labeling and ease its typical pains. Examples of automation algorithms include Superpixel, Segment Anything, and Assisted Freehand.
For example, in the following video, the ball and the person are manually labeled with the Brush and Polygon tools. The ball is easy to label, since it takes only a single click with the Brush tool. Labeling the person is more challenging: the Polygon tool draws straight lines that don't automatically snap to the person's edges. This label could be improved by using much shorter line segments, though that would take more time.
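To get a feel for the trade-off between segment length and label quality, here is a small experiment on synthetic data. It is not part of the Video Labeler workflow. It assumes the object is a filled disk, approximates it with polygons of 6 and 60 vertices using poly2mask from Image Processing Toolbox, and compares each polygon mask with the true mask using intersection over union (IoU).

```matlab
% Synthetic "object": a filled disk on a 200-by-200 pixel grid.
imageSize = [200 200];
center = [100 100];   % [x y]
radius = 60;
[X, Y] = meshgrid(1:imageSize(2), 1:imageSize(1));
trueMask = (X - center(1)).^2 + (Y - center(2)).^2 <= radius^2;

% Few vertices means long segments; many vertices means short segments.
vertexCounts = [6 60];
iou = zeros(size(vertexCounts));

for k = 1:numel(vertexCounts)
    % Place the vertices evenly on the true outline of the object.
    theta = linspace(0, 2*pi, vertexCounts(k) + 1);
    theta(end) = [];
    x = center(1) + radius * cos(theta);
    y = center(2) + radius * sin(theta);

    % Rasterize the polygon into a pixel mask.
    polyMask = poly2mask(x, y, imageSize(1), imageSize(2));

    % Intersection over union between the polygon label and the true mask.
    iou(k) = nnz(polyMask & trueMask) / nnz(polyMask | trueMask);
end

fprintf('IoU with %d vertices: %.3f\n', vertexCounts(1), iou(1));
fprintf('IoU with %d vertices: %.3f\n', vertexCounts(2), iou(2));
```

The polygon with more vertices overlaps the disk more closely, at the cost of ten times as many clicks for a human labeler.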
- The Flood Fill algorithm efficiently labels a group of connected pixels that have a similar color. However, it is not recommended when the object and its surroundings have similar color values, because the fill can spread beyond the object.
- The Smart Polygon algorithm can be a good alternative for discriminating between similarly colored objects. It estimates the shape of the object of interest within the polygon that you draw, which is useful when the object is not a simple polygon.
- The Superpixel algorithm overlays a grid of superpixels with an adjustable size. For example, in the following GIF, the superpixel grid is initially too coarse. After the grid is refined, the court can be labeled more accurately.
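The behavior of the Flood Fill and Superpixel tools can be approximated outside the app with Image Processing Toolbox functions. The sketch below is an analogy under my own assumptions, not the app's internal implementation: grayconnected stands in for Flood Fill on a synthetic grayscale frame, and superpixels is run on peppers.png, a sample image that ships with MATLAB, to show how the requested superpixel count changes the grid.

```matlab
% Synthetic grayscale frame: a bright "ball" on a darker background.
frame = 60 * ones(100, 100, 'uint8');
[X, Y] = meshgrid(1:100, 1:100);
ballMask = (X - 50).^2 + (Y - 50).^2 <= 8^2;
frame(ballMask) = 220;

% Flood fill from a seed inside the ball, accepting connected pixels
% within 20 gray levels of the seed value.
filled = grayconnected(frame, 50, 50, 20);
matchesBall = isequal(filled, ballMask);

% Same ball, but the background is now almost as bright as the ball.
lowContrast = frame;
lowContrast(~ballMask) = 210;
leaked = grayconnected(lowContrast, 50, 50, 20);
leakedFraction = nnz(leaked) / numel(leaked);

fprintf('High contrast: fill matches the ball exactly = %d\n', matchesBall);
fprintf('Low contrast: fraction of the frame filled = %.2f\n', leakedFraction);

% Superpixels: a smaller requested count gives a coarser grid.
img = imread('peppers.png');
[~, numCoarse] = superpixels(img, 50);
[~, numFine] = superpixels(img, 500);
fprintf('Superpixels computed: %d (coarse) and %d (fine)\n', numCoarse, numFine);
```

In the low-contrast case the fill covers the whole frame, which is the situation where Smart Polygon or a finer superpixel grid is the better choice.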
Saving Ground Truth
Labeling the remaining frames is straightforward. To label the next frame, click the next frame button. After all the frames have been labeled, you can save the project and export the data. Saving the project lets you reopen it later in the same state you left it. To export the data, click the green checkmark in the toolbar and export the labels to a file. This creates a gTruth MAT-file that can be loaded into the workspace later when training the neural network.
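Once the export is done, you can check the result from the command line. The sketch below makes two assumptions: the file is named gTruth.mat (use whatever name you chose when exporting), and it stores the groundTruth object in a variable named gTruth. It prints the label definitions and the first rows of label data, then builds datastores for the pixel labels with pixelLabelTrainingData from Computer Vision Toolbox.

```matlab
% Hypothetical file name: use the name you chose when exporting.
matFile = 'gTruth.mat';

if isfile(matFile)
    % Assumption: the groundTruth object is stored in a variable named gTruth.
    exported = load(matFile);
    gTruth = exported.gTruth;

    % Inspect what was labeled.
    disp(gTruth.LabelDefinitions)   % label names and types
    disp(head(gTruth.LabelData))    % first rows of the per-frame labels

    % Build datastores of frames and pixel label masks. For a video source,
    % this writes the labeled frames as image files to the current folder.
    [imds, pxds] = pixelLabelTrainingData(gTruth);
else
    disp('No exported ground truth found. Export the labels from Video Labeler first.')
end
```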
 
Conclusion
This workflow illustrated how the Video Labeler app can be used to label different shapes. It offers a variety of labeling algorithms, each with advantages and disadvantages in different labeling scenarios. Stay tuned for the next blog post, where the labeled dataset is used to train and test a deep neural network that tracks the tennis ball.

 
                
               
               
               
               
               
              