Incremental Learning: Adaptive and real-time machine learning

By Sivylla Paraskevopoulou, March 4, 2024


Incremental learning is a machine learning approach that addresses the challenge of adaptively fitting models to new incoming data. The approach is particularly useful to engineers who need to model streaming data. Engineers and other AI practitioners often deploy machine learning models to target devices, and incremental learning helps the models continue to work as intended when the data changes.

In this blog post, we are going to explain what incremental learning is, why it is useful, and how to implement incremental learning with MATLAB tools and Simulink blocks.

What is Incremental Learning?

Incremental learning is a machine learning approach that enables machine learning models (and deep learning models) to continuously learn by processing incoming non-stationary data from a data stream. With incremental learning, you can create artificial intelligence (AI) systems that continuously update to integrate new knowledge while maintaining previous knowledge.

Diagram of the incremental learning workflow showing how a machine learning model learns with streaming data while maintaining previous knowledge.

Figure: Incremental learning workflow.

Incremental Learning vs Traditional Machine Learning

A traditional machine learning model is trained on a batch of data, and its generalization to new data (that is, avoiding overfitting or underfitting) is ensured by methods like cross-validation, regularization, and hyperparameter tuning.

On the other hand, incremental learning adapts to new data in real time, and therefore it provides certain benefits compared to traditional machine learning. Incremental learning is flexible, quick, and adaptive to new data. An incremental learning model fits to data quickly and efficiently, which means it can adapt in real time to changes (or drifts) in the data distribution. It is also more efficient when little information is known about the training data. For example, class names might not be known until after the model processes observations.
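To make the contrast with batch training concrete, here is a minimal Python sketch of a model that is updated chunk by chunk as data arrives. It is an illustration only, not the MATLAB implementation: the class name `OnlineLogisticRegression`, the helper `make_chunk`, the learning rate, and the synthetic data are all assumptions made for this example.

```python
import math
import random


class OnlineLogisticRegression:
    """Binary logistic regression updated one observation at a time with SGD."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp so exp() stays well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def predict(self, x):
        return 1 if self.predict_proba(x) >= 0.5 else 0

    def fit_chunk(self, chunk):
        """Update the model from a chunk of (features, label) pairs.

        The chunk is not stored; the caller can discard it afterwards.
        """
        for x, y in chunk:
            error = self.predict_proba(x) - y  # gradient of the log loss w.r.t. z
            self.weights = [w - self.learning_rate * error * xi
                            for w, xi in zip(self.weights, x)]
            self.bias -= self.learning_rate * error


def make_chunk(rng, size):
    """Synthetic data (assumption): the label is 1 when x0 + x1 > 0."""
    chunk = []
    for _ in range(size):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        chunk.append((x, 1 if x[0] + x[1] > 0 else 0))
    return chunk


rng = random.Random(0)
model = OnlineLogisticRegression(n_features=2)
for _ in range(50):  # 50 chunks of 20 observations arrive over time
    model.fit_chunk(make_chunk(rng, 20))

test_set = make_chunk(rng, 500)
accuracy = sum(model.predict(x) == y for x, y in test_set) / len(test_set)
print(f"accuracy after 1000 streamed observations: {accuracy:.2f}")
```

The model never sees the full data set at once. Each chunk nudges the weights and can then be thrown away, which is what lets the model follow later changes in the data.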

Additionally, incremental learning has these benefits:

Protecting the privacy of end-user data.
Allowing devices to learn even with limited or no internet connectivity.
Allowing the design of advanced devices with personalization and smart features.

Challenges in Incremental Learning

Incremental learning comes with its own challenges, two of which are data storage and catastrophic forgetting.

Data storage – Data arrives in a stream and the sample size is unknown and possibly large, which makes data storage difficult. Therefore, the incremental learning algorithm must process the data when they are available and before they get discarded.

Catastrophic forgetting – An incremental learning model can’t access previous data while learning on new data. The model can therefore overfit to the new data and lose what it learned earlier, which results in poor model performance on data that resembles the earlier observations.
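The data storage challenge can be sketched with a running-statistics example: each observation is folded into a small summary and then discarded, so memory use stays constant however long the stream is. The sketch below uses Welford's online algorithm for the mean and variance. The class name `RunningStats` and the sample readings are hypothetical.

```python
class RunningStats:
    """Mean and sample variance of a stream, computed without storing the stream."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value):
        """Fold one observation into the summary (Welford's update)."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self):
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0


def sensor_stream():
    """Stand-in for a data stream of unknown length (hypothetical readings)."""
    for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
        yield value


stats = RunningStats()
for reading in sensor_stream():
    stats.update(reading)  # the reading is not kept after this call

print(stats.count, stats.mean, stats.variance)
```

Only three numbers are kept no matter how many readings arrive. Incremental models follow the same pattern, with model parameters in place of the mean and variance.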

Incremental Anomaly Detection

Incremental anomaly detection is a branch of machine learning that, similarly to incremental learning, involves processing incoming data from a data stream. In incremental anomaly detection, instead of fitting a machine learning model, the algorithm computes anomaly scores in real time.
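As a rough illustration of scoring observations in real time, the sketch below keeps an exponentially weighted mean and variance of the stream and scores each new value by how many standard deviations it lies from the running mean. This is one simple scoring scheme chosen for the example, not necessarily the algorithm a particular toolbox uses. The class name `StreamingAnomalyScorer`, the parameters `alpha` and `warmup`, and the synthetic readings are assumptions.

```python
import math


class StreamingAnomalyScorer:
    """Scores each value against exponentially weighted running statistics."""

    def __init__(self, alpha=0.05, warmup=10):
        self.alpha = alpha    # weight given to the newest observation
        self.warmup = warmup  # observations to see before reporting scores
        self.mean = 0.0
        self.variance = 0.0
        self.seen = 0

    def score(self, value):
        """Return the anomaly score of `value`, then update the statistics."""
        if self.seen == 0:
            self.mean = value
            result = 0.0
        else:
            std = math.sqrt(self.variance)
            ready = self.seen >= self.warmup and std > 0
            result = abs(value - self.mean) / std if ready else 0.0
            # Exponentially weighted update of the mean and variance.
            diff = value - self.mean
            incr = self.alpha * diff
            self.mean += incr
            self.variance = (1 - self.alpha) * (self.variance + diff * incr)
        self.seen += 1
        return result


# Synthetic stream (assumption): a small oscillation with one spike at index 50.
readings = [10 + 0.1 * math.sin(i) for i in range(50)]
readings += [15.0]
readings += [10 + 0.1 * math.sin(i) for i in range(51, 56)]

scorer = StreamingAnomalyScorer()
scores = [scorer.score(r) for r in readings]
spike_index = scores.index(max(scores))
print(f"highest score at index {spike_index}")
```

Each value is scored before it updates the statistics, so an anomaly cannot mask itself by shifting the mean it is compared against.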

Learn More About Incremental Learning

To learn more about what incremental learning is and get started with an example, see:

Showing the cumulative and windowed classification error decreasing for incremental learning.

Figure: Classification error for incremental learning model using flexible workflow by updating the performance metrics.
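The two error curves in the figure can be reproduced in spirit with a small tracker. It assumes each prediction is compared with the true label before the model is fit to that observation. The cumulative error covers every observation so far, while the windowed error covers only the most recent ones. The class name `PrequentialError`, the window size, and the sample outcomes are hypothetical.

```python
from collections import deque


class PrequentialError:
    """Tracks cumulative and windowed classification error on a stream."""

    def __init__(self, window_size=200):
        self.window = deque(maxlen=window_size)  # most recent outcomes only
        self.total = 0
        self.mistakes = 0

    def update(self, predicted, actual):
        mistake = int(predicted != actual)
        self.total += 1
        self.mistakes += mistake
        self.window.append(mistake)  # the oldest outcome drops out when full

    @property
    def cumulative(self):
        return self.mistakes / self.total if self.total else float("nan")

    @property
    def windowed(self):
        return sum(self.window) / len(self.window) if self.window else float("nan")


tracker = PrequentialError(window_size=5)
# Hypothetical (predicted, actual) pairs: four early mistakes, then six correct.
outcomes = [(1, 0)] * 4 + [(1, 1)] * 6
for predicted, actual in outcomes:
    tracker.update(predicted, actual)

print(tracker.cumulative, tracker.windowed)
```

The windowed error responds faster to recent behavior. In this run the early mistakes still count toward the cumulative error, but they have already left the five-observation window.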

Why Is Incremental Learning Useful?

To solve real world problems, machine learning models must leave the desktop and go into production. When a machine learning model is operating on its target device, such as on the cloud or an edge device, the machine learning model is likely to receive non-stationary streaming data. This is when incremental learning is particularly useful.

Applications of Incremental Learning

Lithium-ion batteries are everywhere today, from wearable electronics, mobile phones, and laptops to electric vehicles and smart grids. Let’s say you are designing a virtual sensor using AI to estimate the battery’s State-Of-Charge (SOC). An SOC virtual sensor is a key component of a battery management system (BMS) that ensures the safe and efficient operation of a battery. The virtual sensor receives voltage, current, and temperature measurements from other sensors. These measurements are likely to change over time and the model that you have deployed should adapt to these changes.

Diagram of a virtual sensor with inputs voltage, current, and temperature measurements, and output the State of Charge of a battery.

Figure: Designing a virtual sensor for battery State-Of-Charge (SOC) estimation using AI.

The design of virtual sensors is just one potential application of incremental learning. Other applications include:


Signal Processing
Predictive Maintenance
Wireless Communications

An example from my personal experience is using incremental learning in the design of implantable brain-machine interfaces (BMIs). During my PhD research, I developed algorithms and designed chips for implantable BMIs. The algorithms aimed to model very noisy brain signals and cluster brain activity to identify which neuron fired and when. Because all the preprocessing and machine learning must happen on an ultra-low power and tiny chip, the algorithms must be computationally efficient, have a small footprint, and process data in real time.

As part of my work, I developed an incremental learning algorithm that clustered the incoming neural signals in real time, while retaining information (like the cluster centers and statistical dependencies) of previously clustered activity. I wish MATLAB had offered built-in algorithms for incremental learning ten years ago. The next section covers the tools that are available in MATLAB today.
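The idea of clustering a stream while retaining only summary information such as the cluster centers can be sketched with sequential (online) k-means. This is a generic textbook technique used here for illustration, not the algorithm from the research described above. The class name `OnlineKMeans`, the initial centers, and the points are hypothetical.

```python
import math


class OnlineKMeans:
    """Sequential k-means: each point updates one center and is then discarded."""

    def __init__(self, initial_centers):
        self.centers = [list(c) for c in initial_centers]
        self.counts = [0] * len(self.centers)  # points assigned to each center

    def update(self, point):
        """Assign `point` to its nearest center, move that center, return its index."""
        distances = [math.dist(center, point) for center in self.centers]
        k = distances.index(min(distances))
        self.counts[k] += 1
        step = 1.0 / self.counts[k]  # makes the center the mean of its points
        self.centers[k] = [c + step * (p - c)
                           for c, p in zip(self.centers[k], point)]
        return k


clusterer = OnlineKMeans(initial_centers=[(1.0, 1.0), (4.0, 4.0)])
points = [(0.1, 0.2), (5.1, 4.9), (-0.2, 0.0),
          (4.8, 5.2), (0.1, -0.2), (5.1, 4.9)]
labels = [clusterer.update(p) for p in points]

print(labels)
print(clusterer.centers)
```

Only the centers and the per-cluster counts are kept in memory, which suits the small-footprint, real-time constraints described above. The initial centers act only as seeds, because the first point assigned to a cluster replaces its seed.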

Incremental Learning and MLOps

MLOps is a set of practices that automate the process of taking machine learning models to production and managing the models once they are in production. As part of MLOps, machine learning models in production are constantly monitored. By using incremental learning algorithms, the machine learning models can be updated on the fly, which potentially reduces errors.

MLOps loop showing steps for machine learning and operations.

Figure: The MLOps lifecycle.

In real-world applications, data is often dynamic and always changing, so drift can be a big issue for machine learning models. Data drift can happen for many reasons, such as changes over time in the distribution of the input data or in the relationship between the input and the desired output.

With incremental learning, the model is updated when the input changes.

Video: What is MLOps?

How to Implement Incremental Learning

Now that you understand what incremental learning is and how useful it is for modeling streaming data, we will describe MATLAB and Simulink tools so that you can easily implement incremental learning in your application.

Incremental Learning with MATLAB

Using algorithms from Statistics and Machine Learning Toolbox, you can create flexible, efficient, and adaptive incremental learning models for classification and regression. These include linear support vector machine (SVM), logistic regression, and naive Bayes classifiers, as well as least-squares and linear SVM regression models. Alternatively, you can convert a traditionally trained model to an incremental learning model by using the incrementalLearner function. To learn more about these incremental learning models, see the documentation topic Incremental Learning Overview.
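The idea of starting from a traditionally trained model and continuing to learn incrementally can be illustrated with a language-neutral sketch in Python. It is only an analogy for the conversion workflow, not the MATLAB API: `fit_batch`, `OnlineLinearRegression`, the learning rate, and the synthetic data are assumptions made for this example.

```python
class OnlineLinearRegression:
    """Model y = slope * x + intercept, updated with one SGD step per observation."""

    def __init__(self, slope=0.0, intercept=0.0, learning_rate=0.1):
        self.slope = slope
        self.intercept = intercept
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.slope * x + self.intercept

    def fit_one(self, x, y):
        error = self.predict(x) - y  # gradient of 0.5 * error**2 w.r.t. prediction
        self.slope -= self.learning_rate * error * x
        self.intercept -= self.learning_rate * error


def fit_batch(xs, ys):
    """Ordinary least squares for one feature, computed from the whole batch."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# 1) Traditional training on a stored batch that follows y = 2x + 1.
batch_x = [i / 10 for i in range(11)]
batch_y = [2 * x + 1 for x in batch_x]
batch_slope, batch_intercept = fit_batch(batch_x, batch_y)

# 2) Warm-start an incremental model from the batch-trained parameters.
model = OnlineLinearRegression(batch_slope, batch_intercept)

# 3) Keep learning from a stream whose relationship has drifted to y = 3x + 1.
for step in range(2000):
    x = (step % 11) / 10
    model.fit_one(x, 3 * x + 1)

print(f"batch slope: {batch_slope:.2f}, slope after streaming: {model.slope:.2f}")
```

Warm-starting means the incremental model begins with everything the batch model already learned, and the streaming updates only have to track what changes afterwards.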

With Statistics and Machine Learning Toolbox, you can detect concept drift for incremental learning models, that is, detect when the data has changed so that the model is no longer valid. Also, you can automatically generate C/C++ code for incremental learning models. To learn more, see the example Code Generation for Incremental Learning.

Graph with concept drift detection for incremental learning showing stable, warning, and drift status for different observations.

Figure: Concept drift detection for incremental learning with MATLAB.
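The stable, warning, and drift statuses in the figure can be illustrated with a sketch in the style of the Drift Detection Method (DDM), which monitors the model's running error rate. This is an assumption made for illustration, and the detector in the toolbox may work differently. The class name `DriftDetector`, its thresholds, and the synthetic stream of mistakes are hypothetical.

```python
import math


class DriftDetector:
    """DDM-style detector driven by a stream of 0/1 prediction mistakes."""

    def __init__(self, min_samples=30, warning_level=2.0, drift_level=3.0):
        self.min_samples = min_samples
        self.warning_level = warning_level
        self.drift_level = drift_level
        self.reset()

    def reset(self):
        self.n = 0
        self.error_rate = 0.0
        self.min_rate = float("inf")  # error rate at the best point seen so far
        self.min_std = float("inf")   # its standard deviation

    def update(self, mistake):
        """Feed 1 for a wrong prediction, 0 for a correct one; return the status."""
        self.n += 1
        self.error_rate += (mistake - self.error_rate) / self.n
        spread = max(0.0, self.error_rate * (1 - self.error_rate))
        std = math.sqrt(spread / self.n)
        if self.n < self.min_samples:
            return "stable"
        level = self.error_rate + std
        if level <= self.min_rate + self.min_std:
            self.min_rate, self.min_std = self.error_rate, std
        if level > self.min_rate + self.drift_level * self.min_std:
            self.reset()  # start fresh statistics after a detected drift
            return "drift"
        if level > self.min_rate + self.warning_level * self.min_std:
            return "warning"
        return "stable"


# Synthetic stream (assumption): 10% mistakes, then 50% mistakes after index 200.
detector = DriftDetector()
statuses = []
for i in range(300):
    if i < 200:
        mistake = 1 if i % 10 == 0 else 0
    else:
        mistake = 1 if i % 2 == 0 else 0
    statuses.append(detector.update(mistake))

first_drift = statuses.index("drift")
print(f"first drift reported at observation {first_drift}")
```

In a monitoring loop, a warning status could trigger closer inspection and a drift status could trigger a model reset or retraining. Both responses are design choices rather than part of the detector.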

Incremental Learning with Simulink

Using Simulink blocks provided in Statistics and Machine Learning Toolbox, you can integrate incremental learning into the design, simulation, and test of complex AI engineered systems, such as in the design of virtual sensors. To learn more, see the following examples:


Incremental Learning in Simulink for Classification
Incremental Learning in Simulink for Regression

Takeaways

Incremental learning addresses the challenge of fitting machine learning models adaptively to incoming streaming data.
Incremental learning can reduce errors when machine learning models are operating in production.
MATLAB and Simulink provide tools, functions, and blocks to create incremental learning models, integrate them into system-level design, and deploy them to hardware.