
Back in 2021, Loren Shure posted an article that introduced the first page-wise matrix functions in MATLAB: a page-wise matrix multiply, pagemtimes, along with a page-wise transpose, pagetranspose, and complex conjugate transpose, pagectranspose, all of which were added in MATLAB R2020b.

The following diagram is helpful for visualizing what's going on in the 3D case. Each 3D array is treated as a set of matrices, or pages.

A demo I like to show is this one. Imagine that A and B are stacks of 100,000 matrices, each of size 3x3. We want to multiply each pair together to form the stack C. Here's the old way of doing this using a loop.

A = rand(3,3,100000);

B = rand(3,3,100000);

C = zeros(3,3,100000); % preallocate the results

tic

for i=1:100000

C(:,:,i) = A(:,:,i) * B(:,:,i);

end

loopTime = toc

and this is how I'd do it using pagemtimes.

tic

C1 = pagemtimes(A,B);

pagedTime = toc

The paged version is a lot faster.

fprintf("The pagemtimes version is %f times faster than the loop\n",loopTime/pagedTime)
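As a quick sanity check (my addition, not part of the original demo), it's worth confirming that the looped and paged results agree:

```matlab
% Compare the looped result C with the paged result C1, elementwise.
% Expect zero or something on the order of eps, since both compute the
% same products (possibly with a different order of operations).
maxDiff = max(abs(C - C1), [], "all")
```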

That's a lot of speedup! Indeed, this demo has been carefully selected to show roughly the maximum speed-up you can expect from pagemtimes on your machine, for reasons we'll get into later.

Page-wise matrix functions turned out to be both popular and useful! Here at MathWorks, we found uses for them all over our products, and the user community requested several more variants. Page-wise matrix multiply was a great start, but you also wanted page-wise SVD, backslash, eig and more. The MathWorks development team delivered!

So, let's take a closer look at the state of play for paged matrix functions as of R2024a.

Here's a list of all of the page-wise matrix functions we have in MATLAB, along with when they were introduced.

- pagetranspose (R2020b) Page-wise transpose
- pagectranspose (R2020b) Page-wise complex conjugate transpose
- pageeig (R2023a) Page-wise eigenvalues and eigenvectors
- pagelsqminnorm (R2024a) Page-wise minimum-norm least-squares solution to linear equation
- pagemtimes (R2020b) Page-wise matrix multiplication
- pagemldivide (R2022a) Page-wise left matrix divide
- pagemrdivide (R2022a) Page-wise right matrix divide
- pagenorm (R2022b) Page-wise matrix or vector norm
- pagepinv (R2024a) Page-wise Moore-Penrose pseudoinverse
- pagesvd (R2021b) Page-wise singular value decomposition
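All of these follow the same calling pattern as pagemtimes: pass in an N-D array and the function operates on each page independently. A quick sketch (the array sizes here are my own illustrative choices):

```matlab
A = rand(4,4,1000);          % a stack of 1,000 4x4 matrices
s = pagesvd(A);              % 4x1x1000: singular values of every page at once
[U,S,V] = pagesvd(A);        % full decompositions, one per page
b = rand(4,1,1000);
x = pagemldivide(A,b);       % solves A(:,:,k)*x(:,:,k) = b(:,:,k) for each k
```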

From the very beginning, a natural question to ask was "Instead of having a bunch of different page functions, why not have just one called, say, pagefun that takes the required function as an argument?" Indeed, MATLAB has such a function in Parallel Computing Toolbox that works on distributed and GPU arrays.

We did consider that, but one reason we decided against it was option handling. Some options only apply to certain functions, so the options documentation would be long, with most of it irrelevant to any given task. Another reason is performance; the current design allows us to go directly to the underlying function.

Much of the original blog post on page-wise matrix functions focused on how much faster they could be than using loops. My demo above showed that in certain cases, this speed-up could be significant. So where does this speed-up come from?

Argument checking: The first optimization is rather mundane: argument checking. In the page-wise case, you only need to do it once, whereas if you loop over 100,000 pages, you do argument checking 100,000 times. For large matrices, this overhead is insignificant, but when you are dealing with thousands of very small matrices, the time to check arguments is on par with the computation time.

More parallelism: The second reason for the speed-up is parallelism. We can choose to run the internal for-loop over all of the pages in parallel if we wish. This is where things get a little more complicated. Most of MATLAB's linear algebra functions already run in parallel. Multiply two large matrices together while monitoring your CPU workload and you'll see this in action. This leaves us with a decision: either let the underlying BLAS/LAPACK routine run multi-threaded and leave the loop over the pages unthreaded, or run BLAS/LAPACK single-threaded and thread the outer loop.

With very small matrices, this decision is easy. There is probably no benefit at all from parallelism when multiplying two very small matrices together. When you have 100,000 pairs of them, however, the route to parallelism is clear.

Small matrix math tricks: Since the main speedup of paged functions happens for many small matrices, we focused on the performance for such small matrices (e.g. 3x3) using various mathematical tricks. These speed-ups also found their way into standard MATLAB functions such as inv and mldivide. Head over to the documentation for mldivide, for example, and you'll see a note to this effect.

As a result of all of this, the most impressive results come from huge numbers of small matrices and hence the structure of my initial demo in this blog post.

With all of that said, however, there is still useful speed-up to be found with smaller numbers of larger matrices, as this demo of pageeig shows.

mset = 25:25:500;

msetSize = numel(mset);

numPages = 50;

tLoop = zeros(msetSize,1);

tPage = zeros(msetSize,1);

for i=1:msetSize

A = rand(mset(i),mset(i),numPages);

CPage = pageeig(A);

f = @() pageeig(A);

tPage(i) = timeit(f);

CLoop = eigLoop(A);

f = @() eigLoop(A);

tLoop(i) = timeit(f);

end

function C = eigLoop(A)

[rows,~,pages] = size(A);

C = zeros(rows,1,pages);

for i=1:pages

C(:,1,i) = eig(A(:,:,i));

end

end

figure('Position',[100,100,1200,450])

t = tiledlayout(1,2);

nexttile

plot(mset,tPage,'*');

hold on

plot(mset,tLoop,'o');

legend({'pageeig','for-loop'},'Location','northwest');

title('Times')

xlabel('m')

ylabel('Seconds')

hold off

nexttile

plot(mset,tLoop./tPage)

title('Speed-ups of pageeig over for-loop')

xlabel('m')

ylim([0,inf])

titleStr = sprintf('Compute eigenvalues of %d pages of m x m matrices',numPages);

title(t,titleStr)

One aspect of my job is to work with MATLAB users around the world on making their code go faster, and I've used paged functions several times since they were introduced. I usually can't discuss users' code, however, so I asked around internally to see where MathWorks uses them. The answer was "Almost everywhere!"

For example, it turns out that over 100 shipping functions across the various toolboxes make use of pagemtimes. It's used in 5G Toolbox, Aerospace Toolbox, Antenna Toolbox, Audio Toolbox, Communications Toolbox, System Identification Toolbox, Image Processing Toolbox, Lidar Toolbox, Medical Imaging Toolbox, Navigation Toolbox, Deep Learning Toolbox, Partial Differential Equation Toolbox, Phased Array System Toolbox, Radar Toolbox, RF Toolbox, Text Analytics Toolbox, Computer Vision Toolbox, WLAN Toolbox and the MATLAB Support Package for Quantum Computing. That's a big list and shows that the concept of page-wise matrix functions is useful in many diverse areas of technical computing.

One of our developers reached out to tell me that modern communications systems use a technique called OFDM, where data is sent over the air in parallel on sets of adjacent frequencies. At each frequency there is an equalization process in which the receiver tries to undo distortion from the channel; to do this, we solve a least squares problem at each frequency (solve Ax = b, or Ax + n = b where n represents additive Gaussian noise).

These least squares problems can be solved in parallel using routines like pagesvd across all relevant frequencies. This led to substantial speedups in toolbox functions like ofdmEqualize (Communications Toolbox), nrEqualizeMMSE (5G Toolbox) and wlanHEEqualize (WLAN Toolbox), which in turn led to significant speedups in popular shipping examples such as NR PDSCH Throughput - MATLAB & Simulink (mathworks.com).
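Although I can't show the toolbox code itself, the general pattern is easy to sketch with pagelsqminnorm. The sizes and variable names here are hypothetical, not taken from the shipping functions:

```matlab
nSub = 1024;                                   % number of subcarriers (hypothetical)
H = randn(4,2,nSub) + 1i*randn(4,2,nSub);      % per-subcarrier channel estimates
y = randn(4,1,nSub) + 1i*randn(4,1,nSub);      % received symbols at each subcarrier
x = pagelsqminnorm(H,y);                       % one least-squares solve per subcarrier
```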

Page-wise matrix functions have become extremely useful additions to the toolkit of many MathWorkers and I encourage you to look through your own code to see where they might be applied. Also, if you have found them useful in your own code, please do let me know in the comments.

As you know, it all started with pagemtimes. From there, internal and external users of MATLAB requested the other functions and we've been steadily adding them. The primary driver for which functions we actually implement is applications: a page-wise function that could be used in several well-defined application areas gets much higher priority than "foo is a matrix function, so why not implement pagefoo?"

So, if this article has got your interest and you have an application that could benefit from an as-yet-unimplemented page-wise function, do let us know the details.


Joining us again is Eric Ludlam, development manager of MATLAB’s charting team. Discover more about Eric on our contributors bio page. Last time Eric was here, daffodils were on his mind. Now, he focuses on tulips.

Spring is here in Natick and the tulips are blooming! While tulips appear only briefly here in Massachusetts, they provide a lot of bright and diverse colors and shapes. To celebrate this cheerful flower, here's some code to create your own tulip!

This script uses some fun tricks, like extracting a subset of the 'hot' colormap for a red-to-yellow colormap, and specifying several material properties to give it a more sunny day look. There are also the sinpi and cospi functions that Mike likes so much!

Explore the code and create some new breeds of tulips for your MATLAB.

%% MATLAB Tulip

np = 3; % number of petals

n=np*20+1; % Theta resolution

theta=linspace(0, np*(1/np)*2,n);

r=linspace(0,1,80)';

newplot

for k=[0 1/np] % 2 layers of petals

x=1-(.8*(1-mod(np*(theta+k),2)).^3-.05).^2/2;

Z1=((x.*r).^6);

R2=x.*r*(1-k*.38);

R3=x.*(r.^8)*.4.*Z1;

X=R2.*cospi(theta)-R3.*cospi(theta);

Y=R2.*sinpi(theta)-R3.*sinpi(theta);

C=repmat(r,1,n).^25;

%% Petals

surface(X,Y,Z1*2,C,FaceColor="interp",EdgeColor="none",FaceLighting="gouraud");

line(X(end,:),Y(end,:),Z1(end,:)*2,Color="#cc0",LineWidth=1);

end

%% Stem

[CX,CY,CZ]=cylinder;

surface(CX*.1,CY*.1,(CZ-1)*2,[],FaceColor="#9b8",EdgeColor="none");

%% Decorate!

cmap=hot(200); % Colors of tulip

colormap(cmap(80:150,:)) % Extract subset of hot colormap.

axis off equal

view([0,10]);

material([.6 .9 .3 2 .5])

camlight(40,40)


Yesterday morning, I overheard my kids talking about a tic/toc ban which surprised me because they are not MATLAB users and not particularly well connected at MathWorks. How would they know about something as big as this before me?

The idea of tic/toc is simple. tic starts a timer while toc stops it and reports the result. I use it all the time, both in day to day work and when discussing performance related issues. Here's an example for timing the generation of a random matrix

tic % start the timer

data = rand(2000);

toc % stops the timer and reports the result

It's great and I love it! However, there are issues with using it for performance work. There's usually variation in the time of an individual run. The first time you run any command, for example, is often slower than all subsequent times for various reasons. There's also variation because of all the other programs your computer is running or maybe the CPU is switching to different clock speeds as it hits some internal thermal limit and so on.

For this reason, it is usually much better to run the code many times and look at the statistics, and this is why the timeit command was created. In order to perform a robust measurement, timeit calls the specified function multiple times and returns the median of the measurements. Here's how it's used

f = @() rand(2000);

timeit(f)

Run that several times and you'll get much less variation, which is what you'd expect given that it's more robust.

Some of my fellow MathWorkers love timeit and I'm frequently advised to use it instead of tic/toc in blog posts. Loren Shure blogged about it back in 2013, just after it was introduced, and she recently messaged me after I'd published something, essentially saying "Great post kid, but you should ditch tic/toc and use timeit." You'll notice that my recent post about SUNDIALS integration in MATLAB uses timeit whenever I talk about performance. I'm trying to be good, honest I am!

But banning tic/toc? That's just too much! I reached out to MATLAB Head of Product, Michelle Hirsch, and asked what on earth was going on because here's the thing....

I understand that timeit is more robust but I just don't like it and here's why:

- I don't feel like I get more for my effort. tic/toc gives me a number. timeit gives me a number. It may be 'more robust' but I have no sense of how much more robust it is. How many times was it run and why can't I control it? What was the maximum run time and what was the minimum? In short, I want some evidence for the multiple runs that the documentation tells me have taken place and I want to be able to control the number of runs.
- tic/toc is easy to use, even in the middle of 10,000 lines of janky code, I can just insert it wherever I feel like I need it. timeit requires me to wrap things into a function. This is fine for simple cases but more hassle than I can be bothered with when deep in the trenches.

So, I'll use timeit when I have to but please, don't take tic/toc away from me!
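If, like me, you want evidence of the multiple runs and control over how many there are, a DIY middle ground is easy enough to sketch using nothing but tic/toc and median:

```matlab
% Time a snippet nRuns times and report the statistics yourself.
nRuns = 20;                   % I get to choose the number of runs
t = zeros(nRuns,1);
for k = 1:nRuns
    tic
    data = rand(2000);        % the code under test
    t(k) = toc;
end
fprintf("median %.4gs (min %.4gs, max %.4gs) over %d runs\n", ...
    median(t), min(t), max(t), nRuns)
```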

Turns out that my kids were referring to something else entirely, totally unrelated to MATLAB! Something to do with video sharing. Dancing cats and so on. Michelle assured me that tic/toc is not going anywhere and put me in touch with the development team who look after timeit to discuss ideas for improvement in the future. I've already fed back my two cents' worth, but I'm curious what you'd like to see in a timeit improvement/replacement?

Michelle also reminded me that for more serious performance related work, MATLAB has a Performance Testing Framework which I'll take a closer look at in the meantime.


Today's post is from Vijay Iyer, a principal academic discipline manager (Neuroscience) who is also leading the MATLAB Community Toolbox program.

Over my past few years at MathWorks, I’ve been incubating the MATLAB Community Toolbox program. This is the first blog post sharing stories about what we’re seeing and learning while working with the (over 50 to date) open-source toolboxes built by MATLAB users for other MATLAB users.

In my two decades as a research software engineer, various “driver” paradigms for software development have appeared on the horizon. Back in the aughts, I learned about domain-driven design. More recently in the teens, I’ve been learning about test-driven development.

While working with several open-source toolboxes to plan short collaborative development cycles, a pattern appeared that strikes me as a new ‘driver’ paradigm. Let’s call it example-driven development.

Not unlike test-driven development, the name itself sounds a bit like putting the cart before the horse. But in its first few years, the program has often found that open-source MATLAB codebases of all ages and sizes can benefit from making one or more rich examples a driving goal of their development cycles.

The beauty of putting examples front & center is that they can serve three purposes at once:

- Documentation, via the example’s narrative text which motivates & lightly explains the code
- Smoke Testing, i.e., exercising core functionalities of the toolbox
- Dissemination, via engaging images, structure, & figure outputs to ‘hook’ potential new users

Let’s put each of these aspects in context, while exploring a few early…err…examples of example-driven development.

Credit where credit is due. Example-driven development is really an outgrowth of another recent trend: the growing use of computational notebooks, such as Jupyter notebooks and more recently MATLAB live scripts. Computational notebooks have all the elements (narrative text, code, equations, graphical figure outputs, interactive controls, and more) to author rich examples, i.e., examples that walk through a common use case or typical workflow including guidance about motivations overall and at each step.

While teachers and researchers have been the lead adopters (for teaching technical concepts and conveying code underlying published research, respectively), software tool builders have also embraced computational notebooks for software documentation. Our development community at MathWorks is no exception: examples based on live scripts have rapidly become a cornerstone for documentation across the MATLAB platform.

Left: Gallery of live script examples in the Wavelet Toolbox documentation. Right: Web tutorial for EEGLAB, a MATLAB community toolbox (credit: Swartz Center for Computational Neuroscience, UCSD).

In the early days of our program, we’ve found most MATLAB community toolboxes built by researchers for researchers are so far lagging this trend. Rich examples authored as live scripts thus became an early focus for the program. But it’s worth noting that the core idea – software documentation that actively teaches its new and existing users – has long roots in the MATLAB ecosystem. For instance, the widely-used EEGLAB community toolbox has an extensive library of web tutorials recently implemented in web markdown syntax, which is well-suited for their predominantly app-based workflows. The program is proud to have helped support this upgrade of the EEGLAB web documentation.

Many MATLAB community toolboxes, such as PPML and Homer3, offer their users command-line interfaces for scripting and programming. These two tools historically relied on other approaches, such as script-based examples and wiki-based documentation respectively, to teach their users. As the program worked with each to sponsor discrete, well-defined projects around feature-enhancement goals that their lead authors identified as important for their research community, we asked them to add some live script examples in the process.

Rather than simply being an additional task, creating live script examples proved quite complementary:

- Homer3 supports users of functional near-infrared spectroscopy (fNIRS), a neuroimaging modality. Their project focused on adding support for the new SNIRF data standard. One of the new live script examples helped to test and document this new capability.
- PPML applies electromagnetic (EM) modeling for a class of layered photonic nanostructures. Their development cycle focused on a new diagram to visualize parametrically designed nanostructures and improved reflectance plots. Their live scripts incorporated live controls enabling domain experts to assess the new visualizations across a range of input parameters.

Another community toolbox sponsored by the program was DeepInterpolation with MATLAB, a framework for denoising various modalities of raw neuroscience data using deep learning. The reference implementation for the Deep Interpolation principle is coded in Python and includes Jupyter notebook examples for different modalities and use cases. DeepInterpolation with MATLAB was developed from the ground up to make the principle readily available to MATLAB users. As part of this, live scripts analogous to the original’s Jupyter notebooks were a central requirement, enabling both scientific reproducibility (users can compare the results on the same sample data) and tailoring to the individual languages (e.g., the MATLAB version used the datastore workflow central to the Deep Learning Toolbox).

Live script examples created for Homer3 (left), PPML (top right), and DeepInterpolation with MATLAB (bottom right) as part of program development cycles

In the lingo of software testing methodology, examples like these which helped to verify key new capabilities being developed can be considered smoke tests. In contrast to unit testing, which comprehensively exercises the functions in a library with many small-scale test functions, smoke testing focuses on exercising core functionalities of a software package.

Alongside the program’s first projects, a new development team here began interviewing several research software builders to better understand their MATLAB requirements. It quickly became clear to them that good community toolbox examples often make good smoke tests. From this insight, the Examples Driven Tester was born, which connects a library’s live script examples to the MATLAB unit testing framework under the hood. Tool builders can benefit from automated testing without becoming full-blown testing experts. This utility is freely available on GitHub, with early users and feedback most welcome.

Last but not least, centering rich examples for research software tools can aid with dissemination, i.e., attracting new users. A key enabler for this was first announced in this blog: Open in MATLAB Online from GitHub. This connector allows end users to run MATLAB code from a GitHub repository (where most research tools are hosted today) in their web browser, without installing MATLAB or navigating GitHub source control. Authors can add an Open in MATLAB Online badge to their GitHub repository inviting curious browsers to quickly give their tool a try. Similarly, File Exchange highlights this fast web-based access for all GitHub repositories linked to File Exchange with a new Open in MATLAB Online button:

Open in MATLAB Online badge in a GitHub repository (left) and Open in MATLAB Online button in its linked File Exchange entry (right). Both buttons clone the latest repository code and open it in the browser-based MATLAB Online.

MATLAB Online comes in two versions: a basic version available to anyone worldwide and the full version (available to all academic users). Thus, for many community toolboxes, MATLAB Online enables authors to serve their examples to anyone worldwide.

Some toolboxes have multiple examples. For these cases, our program has begun recommending that projects tabulate their top examples prominently in their GitHub README. For instance, the README for DeepInterpolation with MATLAB shows individual Open in MATLAB Online links for three examples applying denoising models to different kinds of neuroscience data:

Snippet from the README file of the DeepInterpolation with MATLAB community toolbox showing individual view and run links for a library of lightweight live script examples corresponding to specific use cases appealing to different audiences

In this way, there are runnable examples tailored to the interests of each potential user! Note the examples are also viewable by anyone worldwide via a File Exchange rendering service for live scripts in GitHub repositories.

To show their tool in the best light, tool authors must take care that their (beautiful) examples run through and generate the expected outputs at each release. In other words: testing, documentation, and dissemination are intertwined. That’s example-driven development.

Some MATLAB community toolboxes have additional more compute- or data-intensive examples, or may have several third-party software dependencies that are not (yet!) registered as MATLAB add-ons. In either of these cases, research software tools may turn to domain-focused compute environments, sometimes termed science gateways, as a place to point their users to runnable examples. More on this topic to come in future posts.

Whether it’s on MATLAB Online or beyond, we’re excited to start seeing the impact that rich and runnable examples will have to help software tool builders grow their user communities.



Late last year I introduced the new solution framework for solving Ordinary Differential Equations (ODEs) that made its debut in MATLAB R2023b. I demonstrated how it allowed users to do all kinds of things much more easily than before but stressed that the R2023b release was mostly about the new interface. I told you "There are no new solvers or any fundamentally new functionality....yet!". A few people picked up on the 'yet', correctly guessing that there would be new solvers and functionality soon. In R2024a, we've made a start on this by adding support for some of the SUNDIALS solvers.

SUNDIALS is a SUite of Nonlinear and DIfferential/ALgebraic equation Solvers, an award-winning set of open source ODE solvers from Lawrence Livermore National Laboratory. Functionality from SUNDIALS has been available via SimBiology for a while, and now we've brought it to all MATLAB users via the new ODE solver interface. There are several solvers in the SUNDIALS suite and we've added support for three of them via new values of the Solver property of the ode class: "cvodesstiff", "cvodesnonstiff" and "idas".

- "cvodesnonstiff" Variable-step, variable-order (VSVO) solver using Adams-Moulton formulas, with the order varying between 1 and 12. See CVODE and CVODES.
- "cvodesstiff" Variable-step, variable-order (VSVO) solver using Backward Differentiation Formulas (BDFs) in fixed-leading coefficient form, with order varying between 1 and 5. See CVODE and CVODES.
- "idas" Variable-order, variable-coefficient solver using Backward Differentiation Formulas (BDFs) in fixed-leading coefficient form, with order varying between 1 and 5. See IDA and IDAS.

Let's begin by solving the classic predator-prey equations

$$\begin{array}{l}\frac{\mathrm{dx}}{\mathrm{dt}}={\mathit{p}}_{1}\mathit{x}-{\mathit{p}}_{2}\mathrm{xy}\\ \frac{\mathrm{dy}}{\mathrm{dt}}={\mathit{p}}_{3}\mathrm{xy}-{\mathit{p}}_{4}\mathit{y}\end{array}$$

with parameters ${\mathit{p}}_{1}=0.4,{\mathit{p}}_{2}=0.4,{\mathit{p}}_{3}=0.09,{\mathit{p}}_{4}=0.1$ and initial conditions $\mathit{x}\left(0\right)=1,\mathit{y}\left(0\right)=1$.

Setting up and solving the problem using the ode class looks like this

d = ode(ODEFcn=@(t,y,p) [p(1)*y(1) - p(2)*y(1)*y(2); ...

p(3)*y(1)*y(2) - p(4)*y(2)]);

% Set initial conditions and parameters

d.InitialTime = 0;

d.InitialValue = [1;1];

d.Parameters = [.4 .4 .09 .1];

% Set the relative tolerance

d.RelativeTolerance = 1e-6;

At this point, I could just solve the system of ODEs with the solve function and let MATLAB attempt to choose a suitable solver for me; in R2024a it happens to choose ode45 for this problem. This could change in future versions, though, so let's explicitly choose ode45 so we know exactly what we'll be using.

% Select the ode45 solver

d.Solver = "ode45";

% Solve over a short time span

sol = solve(d,0,100);

plot(sol.Time,sol.Solution,LineWidth=2)

To switch to the non-stiff version of SUNDIALS' CVODES solver, I just need to do

% select the cvodesnonstiff solver

d.Solver = "cvodesnonstiff";

and then solve as before

% Solve over a short time span

sol = solve(d,0,100);

plot(sol.Time,sol.Solution,LineWidth=2)

Not much appears to have changed! So what are the benefits of using the SUNDIALS solvers?

The first potential benefit of the new SUNDIALS solvers is performance. I'm going to switch back to ode45, solve over a much larger time span and use the timeit function to get the solution time.

d.Solver = "ode45";

timeOde45Fcn = @() solve(d,0,15000);

ode45Time = timeit(timeOde45Fcn)

Now let's solve the exact same problem with SUNDIALS' CVODES non-stiff solver.

d.Solver = "cvodesnonstiff";

timeCvodesFcn = @() solve(d,0,15000);

cvodesTime = timeit(timeCvodesFcn)

Let's see what the speed-up is

fprintf("The SUNDIALS solver cvodesnonstiff is %f times faster than ode45 for this problem\n",ode45Time/cvodesTime)

I was pretty pleased with this! Internal presentations about the new solvers suggested that the speed-up would be ~2x or so for most benchmarks, and I've gotten way more than that here. The reasons cvodesnonstiff is faster than ode45 are primarily that it requires fewer steps and it's slightly faster per step. The SUNDIALS suite is written in C, and we've also performed some JIT magic to reduce the overhead of calling a MATLAB function from a C++ function.

We can get some insight into this by looking at the solutions. Over the time span [0,15000], ode45 takes 33409 steps for the tolerance I have set

d.Solver = "ode45";

ode45Sol = solve(d,0,15000)

while cvodesnonstiff takes 11503 steps

d.Solver = "cvodesnonstiff";

cvodesSol = solve(d,0,15000)

That's 2.9x fewer steps than ode45, but the speed-up is more than 2.9x, so the speed difference is clearly not just because the new solver requires fewer steps.
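The step counts quoted here can be read straight off the solution objects computed above, since each solution carries the time points at which the solver stepped:

```matlab
% numel(sol.Time) counts the output points, which by default are the
% solver's own steps, so it tracks the step count for each solver.
fprintf("ode45: %d points, cvodesnonstiff: %d points\n", ...
    numel(ode45Sol.Time), numel(cvodesSol.Time))
```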

I've been comparing against ode45 because that's the algorithm I always reach for first. However, cvodesnonstiff is most similar to ode113 as far as the algorithm under the hood is concerned, and it turns out that ode113 requires about the same number of steps as cvodesnonstiff.

d.Solver = "ode113";

ode113Sol = solve(d,0,15000)

Very close in terms of the number of steps, but it turns out that ode113 is even slower than ode45 for this problem

d.Solver = "ode113";

timeOde113Fcn = @() solve(d,0,15000);

timeOde113 = timeit(timeOde113Fcn)

fprintf("The SUNDIALS solver cvodesnonstiff is %f times faster than ode113 for this problem\n",timeOde113/cvodesTime)

When I spoke to development about these speed differences, they told me that I happened to choose a problem that really allows the new SUNDIALS integration to shine! For example, I chose an integration time that was large enough to dwarf the startup cost. For various reasons, the overhead of getting from the ode object into the SUNDIALS solver is generally bigger than getting into ode45 and friends. Once we are in SUNDIALS, however, things can really fly for small problems like this.

For bigger, stiff problems, we find less of a performance difference between the MATLAB and SUNDIALS solvers. The way this was explained to me is to consider what the solver actually does: it evaluates a function handle and then does some vector-vector or matrix-vector operations, solves, etc. When the problems get big, the linear algebra starts to dominate and the time per step will be similar for both solvers. At that point, the winning solver is the one that happens to take fewer steps or trigger fewer matrix factorizations, which, again, is problem dependent.

So, the SUNDIALS solvers can be a lot faster than ode45 and friends... and sometimes they're not. The exact speed difference between solvers depends heavily on the problem, and I encourage you to investigate on your own problems and let me know what you find.
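If you want to try the stiff variant yourself, switching solvers is a one-liner. Here's a minimal sketch using the classic van der Pol oscillator with a large mu as a stand-in stiff problem; this is my own choice of test problem, not one from the benchmarks above:

```matlab
mu = 1000;                                         % large mu makes the problem stiff
vdp = ode(ODEFcn=@(t,y) [y(2); mu*(1-y(1)^2)*y(2) - y(1)]);
vdp.InitialValue = [2; 0];
vdp.Solver = "cvodesstiff";                        % SUNDIALS' BDF-based stiff solver
sol = solve(vdp,0,3000);
plot(sol.Time,sol.Solution(1,:),LineWidth=2)
```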

Having a faster ODE solver in MATLAB is great, but there is another reason to love the SUNDIALS integration: sensitivity analysis of parameters. This is functionality that SimBiology users have had for a while, but now some of it is in MATLAB itself. Sensitivity analysis examines how changes in parameter values affect the solution of the ODE system.

The MATLAB documentation demonstrates this with a sensitivity analysis of an SIR epidemic model, so I went looking for a different model to play with. I soon found the CARRGO model [1] via a paper about methods for assessing sensitivity in biological models [2]. It models the interaction of cancer cells with the CAR T-cells used to destroy them and is a variation of the Lotka-Volterra predator-prey equations. The system of ODEs is

$$\begin{array}{l}\frac{dx}{dt}=\rho x\left(1-\frac{x}{\gamma}\right)-\kappa_{1}xy\\ \frac{dy}{dt}=\kappa_{2}xy-\theta y\end{array}$$

x is the density of cancer cells and y is the density of CAR T-cells. In MATLAB, I'll refer to the parameters as a vector p and, since there are 5 parameters here, we'll have a 5-element vector. The meaning of each parameter, and the number I'll give it in MATLAB, is as follows

- ${\kappa}_{1}$= p(1): The killing rate of the CAR T-cells.
- ${\kappa}_{2}$= p(2): Net rate of proliferation of CAR T-cells
- θ= p(3): Death rate of CAR T-cells
- ρ= p(4): Net growth rate of cancer cells
- γ= p(5): Cancer cell carrying capacity

Sensitivity analysis will give us the derivatives of x (density of cancer cells) and y (density of CAR T-cells) with respect to these five parameters.

Setting the equations up in MATLAB:

F = ode();

F.ODEFcn = @(t,x,p) [ p(4) * x(1) * (1-x(1) / p(5))-p(1) * x(1) * x(2); p(2)* x(1) * x(2)-p(3) * x(2)];

Sahoo et al [1] fitted this model to empirical data (using MATLAB and fmincon, so the paper tells us) and found the following parameters and initial values

F.InitialValue = [1.25*10^4 , 6.25*10^2];

F.Parameters = [6*10^-9, 3*10^-11, 1*10^-6, 6*10^-2, 1*10^9];

Set the solver to cvodesnonstiff and we can get the solution to the system of equations as usual

F.Solver = "cvodesnonstiff";

sol = solve(F,0,1000)

Next, I plot these solutions

tiledlayout(2,1)

nexttile

plot(sol.Time,sol.Solution(1,:),LineWidth=2)

title("x : density of cancer cells")

xlabel("days")

ylim([-0.5*10^8 11*10^8])

nexttile

plot(sol.Time,sol.Solution(2,:),LineWidth=2)

title("y : density of CAR T-cells")

xlabel("days")

ylim([-0.5*10^6 16*10^6])

sgtitle("Solutions of the CARRGO system of ODEs")

Computing the sensitivities of the system of equations to all 5 parameters is very easy. We simply add the following to the ODE object.

F.Sensitivity = odeSensitivity();

and run the solve command as before.

sol = solve(F,0,1000)

Note that this time, our ODEResults has a Sensitivity field alongside the Solution. There are 2 x 5 sets of Sensitivity results, corresponding to the partial derivatives of each of our 2 variables with respect to the 5 parameters. If I wanted to consider only a few of the parameters, I would change ParameterIndices accordingly.
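For example, here's a sketch (illustrative only, not run as part of this post) of restricting the computation to the first two parameters via the ParameterIndices property of odeSensitivity:

```matlab
% Illustrative sketch: compute sensitivities for kappa_1 and kappa_2 only.
F.Sensitivity = odeSensitivity(ParameterIndices=[1 2]);
sol2 = solve(F, 0, 1000);      % sol2.Sensitivity is now 2 x 2 x numSteps
F.Sensitivity = odeSensitivity();   % restore all 5 parameters
```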

Let's get the result for the first variable, x, and the first parameter, ${\kappa}_{1}$, i.e. $\frac{\partial \mathit{x}}{\partial {\kappa}_{1}}$

dXdKappa1 = sol.Sensitivity(1,1,:);

sensitivitySize = size(dXdKappa1)

The problem with this is that it is a 1 x 1 x 225 array so plotting it directly won't work!

plot(sol.Time,dXdKappa1)

You have to first squeeze it to get rid of the redundant singleton dimension. You'll find that there is a lot of squeezing to be done when dealing with sensitivities!

figure

plot(sol.Time,squeeze(dXdKappa1),LineWidth=2)

ylabel("$\frac{\partial x}{\partial\kappa_1}$",Interpreter="latex")

xlabel("days");

title("Sensitivity of Cancer Cells in the CARRGO Model to the $\kappa_1$ parameter",Interpreter="latex")

The result is the partial derivative of x with respect to the parameter ${\kappa}_{1}$

The epidemic example in the MATLAB documentation for Sensitivity Analysis tells us that sensitivity functions are often normalized such that the result describes the approximate percentage change in each solution component due to a small change in the parameter. This helps us compare the effect of each of the parameters on the overall model. Refer to the doc page or Hearne [3] for the mathematics. The code looks like this for the first variable, $\mathit{x}$, that represents cancer cells.
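In symbols, the normalization used in the code below takes the form (my sketch of the formula, following the shape of the code rather than quoting Hearne directly):

$$\tilde{S}_{\kappa_{1}}(t)=\frac{\partial x}{\partial \kappa_{1}}\cdot \frac{\kappa_{1}}{x(t)}$$

with the analogous expression for each of the other four parameters.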

p = F.Parameters;

normalisedSensitivityKappa1 = squeeze(sol.Sensitivity(1,1,:))'.*(p(1)./sol.Solution(1,:));

normalisedSensitivityKappa2 = squeeze(sol.Sensitivity(1,2,:))'.*(p(2)./sol.Solution(1,:));

normalisedSensitivityTheta = squeeze(sol.Sensitivity(1,3,:))'.*(p(3)./sol.Solution(1,:));

normalisedSensitivityRho = squeeze(sol.Sensitivity(1,4,:))'.*(p(4)./sol.Solution(1,:));

normalisedSensitivityGamma = squeeze(sol.Sensitivity(1,5,:))'.*(p(5)./sol.Solution(1,:));
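As an aside, the five assignments above can be collapsed into a single loop over the parameter index. Here's a sketch (normSens is a hypothetical variable name of my own):

```matlab
% Sketch: compute all five normalised sensitivities for variable 1 at once.
p = F.Parameters;
normSens = zeros(5, numel(sol.Time));
for k = 1:5
    normSens(k,:) = squeeze(sol.Sensitivity(1,k,:))' .* (p(k) ./ sol.Solution(1,:));
end
```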

Plot each one of these against time

figure

t = tiledlayout(3,2);

title(t,"Normalised sensitivity functions for Cancer in the CARRGO model",Interpreter="latex")

xlabel(t,"Time (days)",Interpreter="latex")

ylabel(t,"\% Change in Eqn",Interpreter="latex")

nexttile

plot(sol.Time,normalisedSensitivityKappa1,LineWidth=2)

title("$\kappa_1$",Interpreter="latex")

ylim([-1.5 0.5])

nexttile

plot(sol.Time,normalisedSensitivityKappa2,LineWidth=2)

title("$\kappa_2$",Interpreter="latex")

ylim([-30 0.5])

nexttile

plot(sol.Time,normalisedSensitivityTheta,LineWidth=2)

title("$\theta$",Interpreter="latex")

ylim([-0.005 0.01]);

nexttile

plot(sol.Time,normalisedSensitivityRho,LineWidth=2)

title("$\rho$",Interpreter="latex")

ylim([-6 10])

nexttile

plot(sol.Time,normalisedSensitivityGamma,LineWidth=2)

title("$\gamma$",Interpreter="latex")

ylim([-21 5])

I am not a biologist but my naive expectation was that ${\kappa}_{1}$, the killing rate of the CAR T-cells, would have a strong effect on the model. However, the above plots suggest that this isn't the case. For almost half of the simulation, this rate has no effect at all and, when it does kick in, it's much weaker than the other parameters. $ \kappa_2~ $, the net rate of proliferation, is much more important, and that too only starts to have an effect roughly half way through the simulation. The parameter θ seems to have very little effect at all.

To explicitly see these conclusions, and any others you might draw from the above plots, I used live controls to create a mini application based on this model that allows the user to vary all of the parameters around the values we've been working with. Sure enough, θ doesn't make much of a difference and the observations around $ \kappa_1 $ and $ \kappa_2 $ play out too.

Click on the image below to open this model up in MATLAB Online to have a play yourself.

The type of sensitivity analysis considered here is a “local, forward sensitivity analysis”. There's a lot more to sensitivity analysis than I've shown here and you may be interested in features more advanced than those added to the ode class in R2024a. If so, I suggest you take a look at the sensitivity analysis functionality in SimBiology, which includes much more.

References

[1] Sahoo P, Yang X, Abler D, Maestrini D, Adhikarla V, Frankhouser D, Cho H, Machuca V, Wang D, Barish M, Gutova M, Branciamore S, Brown CE, Rockne RC. Mathematical deconvolution of CAR T-cell proliferation and exhaustion from real-time killing assay data. J R Soc Interface. 2020 Jan;17(162):20190734. doi: 10.1098/rsif.2019.0734. Epub 2020 Jan 15. PMID: 31937234; PMCID: PMC7014796.

[2] Mester R, Landeros A, Rackauckas C, Lange K. Differential methods for assessing sensitivity in biological models. PLoS Comput Biol. 2022 Jun 13;18(6):e1009598. doi: 10.1371/journal.pcbi.1009598. PMID: 35696417; PMCID: PMC9232177.

[3] Hearne, J. W. “Sensitivity Analysis of Parameter Combinations.” Applied Mathematical Modelling 9, no. 2 (April 1985): pp. 106–8. https://doi.org/10.1016/0307-904X(85)90121-0


Everyone has them! MATLAB features that you've been requesting for ages. You email MathWorks support, you post long rants on discussion forums, you even join MathWorks and spam the internal bug/feature request database before turning up to the office and sitting on the developer's desk to make your case. That last one might just be me though!

Every six months, you scan the release notes looking to see if they are there. Sure, there are 1,914 new features in R2024a but is what I want in there? My obsessions? My 'white whales'?

I've been working in technical computing for almost 25 years so you may think that the sort of functionality I'm desperate for is related to advanced mathematics or high performance computing. I have some desires there, to be sure, but since I use MATLAB Live Scripts every day, including to write the articles on this blog, one of the desires that burned in my soul is rather more mundane. I wanted spell checking.

There's actually a programmatic way of doing a spell check via Text Analytics Toolbox using the correctSpelling function, which has been available since R2020a:

% Requires Text Analytics Toolbox

str = [

"A documnent containing some misspelled worrds."

"Another documnent cntaining typos."];

documents = tokenizedDocument(str)

updatedDocuments = correctSpelling(documents)

While this is very cool and useful in a different context, it's not quite what I had in mind. I just wanted my misspelled words to be underlined so that I can click on them and get them corrected. I'm a simple man with simple needs. I'm not the only one: MATLAB Central user Alex Pedcenko also wanted spell checking, and this is just the tip of the iceberg. Spell checking in MATLAB has been requested a lot!

Of course, you know where I'm going. As of R2024a, MATLAB has spell checking. It's not switched on by default though. In Live Editor, you need to click on the View tab and click on Spelling.

It may seem simple but this is a game-changer for me!

Way back at the beginning of my MathWorks career, I wrote a guest blog post for Loren Shure's The Art of MATLAB. It was great training for me and the first step in Loren handing the MathWorks blogging baton to me. I, however, was a little upset. I was upset because I couldn't define a local function in the middle of a script, like this:

function result = MyIsPrime(n)

% Function to check if n is prime

if n <= 1

result = false;

return

end

q = floor(sqrt(n));

n = uint32(n);

for i = uint32(2):q

if mod(n,i) == 0

result = false;

return

end

end

result = true;

end

%Run the function

MyIsPrime(29)

In previous versions of MATLAB, all local functions had to be at the end of a script. You couldn't define them anywhere you wanted unless you used anonymous functions. This led to what I call the 'Helper function anti-pattern' where blog posts and live scripts containing demonstrations would have a bunch of functions at the very end with a title like 'Helper functions' or 'Support functions' or something similar. Sure, I could just define them as separate .m files but then the reader of my document wouldn't be able to see them. This limitation bothered me because it inhibited how I taught or told stories.

Now, you can define them almost anywhere you like. This is a major quality-of-life change for me and I have been heaping praise on the internal team who did the extensive engineering behind the scenes to make it happen. I've been saying 'almost anywhere' though. So where can't you define functions?

You still can't define functions in for-loops or conditionals. So, for example, the following will not work

if 1

function inConditional(x) % <- Error, not top level

end

end

This restriction is a deliberate design decision, made to preserve clarity of scope and declaration. A function defined inside a conditional block would leave it unclear whether the function is available outside the block, and under which conditions it is defined at all.

This update changes the rules (a bit) for what makes an .m file a script. Previously, if you opened an .m file and saw that it started with a function definition, you knew it was a function file. Now, a script is any code file that contains a MATLAB command or expression outside the body of a function. For example, I could have any number of functions at the top of my file, but as soon as I add a single line of code that is not inside one of those functions, the file becomes a script, whereas previously you would have got an error message. Once it becomes a script, the function at the top is only usable from inside that script since it is now a local function. That is, it's no longer a function file but a script file.
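To make the rule concrete, here's a sketch of a file that R2024a treats as a script (the file name and function are hypothetical examples of my own):

```matlab
% myAnalysis.m -- a function may now appear at the top of the file.
% The top-level statement after it is what makes this file a script
% (and doubleIt a local function) rather than a function file.
function y = doubleIt(x)
    y = 2*x;
end

doubleIt(21)   % any statement outside a function body => script file
```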

R2024a has made me much happier now that two of my 'White Whale' features have been implemented. What are yours though? Join us in the new Ideas channel in the Discussions area of MATLAB Central and tell us about the features you wish MATLAB had.

Few things say 'MATLAB' better than the backslash. So good, we put it on a T-shirt.

As most people will be aware, the backslash operator solves a system of linear equations. That is, it finds the x that solves A*x = b. For example

A = [2 1 1;

-1 1 -1;

1 2 3];

b = [2;3;-10];

x = A\b

So simple and yet there is a lot of numerical linear algebra behind backslash. You see, exactly how one writes an algorithm to solve a linear system of equations depends very much on the structure and type of the matrix A. Is A Hermitian? If so, is its diagonal all positive or all negative and so on? Then you have to ask is A single, double or complex? Is the matrix full or sparse? The answers to these questions determine which algorithm is best to use. Get it right and you get the correct answer very quickly, making maximal use of both your computer hardware and decades of numerical linear algebra research.

In the older MATLAB R2023b, the logic that backslash uses for full (i.e. not sparse) matrices is as follows

In the latest version of MATLAB, R2024a, we rearranged this flow-chart a little and added a test for tridiagonal matrices. This isn't a particularly expensive test since we are already some of the way there by doing the upper Hessenberg check.

As such, the backslash flow chart for full matrices is as follows (These flowcharts are from the documentation for mldivide by the way)

The practical upshot is that backslash is now significantly faster when the input matrix is dense and tridiagonal. In general, you are better off using sparse matrices when dealing with tridiagonal matrices (see later in this post) but sometimes you happen to have a dense one (e.g. from the output of ldl) and we wanted to ensure that backslash would deal with it efficiently.

Tridiagonal matrices are zero everywhere apart from the main diagonal and the two diagonals either side of it (the so-called superdiagonal and subdiagonal). Here's an example generated using MATLAB's gallery function

full(gallery("tridiag",8))

Tridiagonal matrices are particularly important because they appear in many areas of science and engineering including quantum mechanics, partial differential equations and splines to name a few.

To see how much faster we've made things, let's construct a problem involving a random tridiagonal matrix

n = 1e4;

mainDiag = rand(n, 1);

upperDiag = rand(n-1, 1);

lowerDiag = rand(n-1, 1);

triDiag = diag(mainDiag) + diag(upperDiag, 1) + diag(lowerDiag, -1);

Just to make sure that I haven't made a mistake while generating triDiag, I turn to the isbanded function. The lower and upper bandwidths of a tridiagonal matrix are both 1 so I can test as follows

isbanded(triDiag,1,1) % If this returns true, triDiag is tridiagonal

OK, so now I'm convinced that triDiag is tridiagonal and that I can use it to time this update to backslash

b = rand(n,1); % Construct the RHS of the equation

f = @() triDiag\b;

runtime = timeit(f)

On the same machine, R2023b did this in 0.5637 seconds so we have

- R2024a: 0.0716 seconds
- R2023b: 0.5637 seconds

Almost 7.9x speed-up! This is a little more than the 6.5x speed-up discussed in the release notes. The author of that section of the release notes used a Windows 11, AMD EPYC 74F3 24-Core Processor @ 3.19 GHz test system whereas my system is rather more modest.

cpuinfo

Clearly, how much faster the new version of backslash is depends on your system but whatever machine you use, it will definitely be faster in R2024a.

Going sparse for maximum speed

In linear algebra, the path to speed often relies on taking advantage of matrix structure and there is one aspect of the structure of tridiagonal matrices that is important -- the fact they are very sparse. We haven't sped this up in 24a but it's worth pointing out that if you can use sparse matrices for these problems then you really should. It's so much faster, even taking into account the speed-ups shown above. The memory benefits are huge too!

Let's convert triDiag to a sparse matrix to see

spTriDiag = sparse(triDiag);

f = @() spTriDiag\b;

runtime = timeit(f)

That's almost 100x faster than the full-matrix version in R2024a and almost 800x faster than the full-matrix version in R2023b.
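The memory side of the story is just as dramatic. A full n x n double matrix stores all n^2 entries, while the sparse version stores only the roughly 3n nonzeros plus some index overhead. A quick sketch of how you might check this yourself:

```matlab
% Sketch: compare the storage of the full and sparse tridiagonal matrices.
% For n = 1e4, the full matrix holds 1e8 doubles (~800 MB); the sparse
% one holds only ~3e4 nonzeros.
fullBytes   = whos("triDiag").bytes;
sparseBytes = whos("spTriDiag").bytes;
fprintf("full: %.0f MB, sparse: %.2f MB\n", fullBytes/1e6, sparseBytes/1e6)
```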

More on backslash

- Backslash - A historical post where Cleve tells us how backslash came to represent the solution of linear systems in MATLAB
- What is A\A? - The answer: A\A is always I, except when it isn't.


This month, Steve Eddins is retiring from MathWorks after 30+ years on the job. When he joined, MathWorks was only 10 years old and had 190 staff. We recently celebrated our 40th anniversary and now have almost 7,000 staff working all over the world. Steve has seen some massive transformations.

Long before I joined MathWorks, I knew Steve through his popular blog, Steve on Image Processing with MATLAB which started way back in 2006. He is actually one of the people who inspired me to start blogging back then. As such, it has been a privilege to work alongside him as one of MathWorks’ official bloggers over the last couple of years. Sometimes, it really is a good thing to meet your heroes.

I recently had a chat with Steve, asking him about his time at MathWorks. A write up of this conversation is below.

Tell us about your career before MathWorks and how did it lead you here?

Mike, before I dive in, I want to say that I was an avid follower of your work in research software engineering. I always enjoyed your Walking Randomly blog. I am so happy that you have found a place with MathWorks, and especially that you are writing our MATLAB blog!

I was introduced to MATLAB sometime around 1988, when I was pursuing my Ph.D. in the digital signal processing research group at Georgia Tech. When Jim McLellan joined the DSP faculty, he brought with him his enthusiasm for MATLAB, and he convinced some of us grad students to join for a group purchase of MATLAB. I used MATLAB for some of my graduate studies, and there are some MATLAB plots in my dissertation.

In late 1990, I joined the faculty of the Electrical Engineering and Computer Science department at the University of Illinois at Chicago (UIC). I used MATLAB for image processing research work and for preparing course materials.

By 1993, I was ready to move on from UIC, and I was considering leaving academia. MathWorks came to mind because of my MATLAB experience. I was impressed with how company and MATLAB development leaders Jack Little, Cleve Moler, Loren Shure, and Clay Thompson actively participated in the new Usenet newsgroup, comp.soft-sys.matlab. I met Loren at a signal processing conference that spring, and I sent her my resume.

When Loren told me that no positions were available, I asked to participate in the beta program for the new Image Processing Toolbox. I was a highly motivated beta tester, providing detailed reports, algorithm suggestions, and MATLAB code. Shortly after the beta program ended, I was invited to interview. That went well, and I joined MathWorks in December 1993.

What was your first role here?

I took over development of Image Processing Toolbox. I spent my first MathWorks decade working to make the toolbox faster and more memory efficient and adding the algorithms and visualization tools needed for general research and development. Because the development organization was so small then, I also had an opportunity to do MATLAB work. I created the second-generation MATLAB profiler, optimized multidimensional array processing functions, implemented image and scientific data I/O functions, helped to design image display improvements, and integrated the FFTW library.

What was your final position at MathWorks?

The same as my first position! In the summer of 2020, the first pandemic year, I realized that I really missed doing image processing and toolbox development work. I approached Julianna, the manager of the team responsible today for the Image Processing Toolbox, about coming back to the team. I am grateful to her and to the other development managers who smoothed the way for me to rejoin the Image Processing Toolbox team in February 2021.

For about a dozen years prior to that, I was mostly doing MATLAB development. I managed several MATLAB teams. I was on the language design team, and I was one of a small group of senior designers who reviewed every feature going into MATLAB. I also helped create design standards for MATLAB.

What is it about MathWorks that made you stay so long?

It’s been the same answer throughout my career: I love making tools that make a real difference in the worlds of engineering and science, and I love doing that work with the kind of people who are here at MathWorks.

What advice would you give to a new MathWorker?

Now you’re really making me think hard, Mike. I hesitated for a long time about this, wondering what I could say that could be useful to any new MathWorker, whatever they might be doing and wherever around the world they might be doing it.

It finally dawned on me that MathWorks works hard to give helpful information to new MathWorkers, and the best thing I can do is to add my own perspective to that.

First, look at the company’s mission, which tells you the fundamental things that the company is trying to accomplish in the technology, business, human, and social spheres. These have not changed during my 30 years with the company. With my engineering background, I tend to focus on the technology piece: “Our purpose is to change the world by accelerating the pace of discovery, innovation, development, and learning in engineering and science. We work to provide the ultimate computing environment for technical computation, visualization, design, simulation, and implementation. We use this environment to provide innovative solutions in a wide range of application areas.” But all four areas are important.

Your next step is to think about the company’s values. These tell you about the kind of people that MathWorks prefers to hire, and how MathWorks expects them to behave. If you can align your work with the company mission and plans, and if your personal approach to work life and interaction with others is consistent with the company values, then you have a great chance for a long, rewarding MathWorks career.

You started the blog in 2006. What led you to do this?

I have always enjoyed taking time to think carefully about something and then write an explanation. Blogging within MathWorks became a thing sometime around 2005, and I actively participated in that. Also, this was soon after the publication of Digital Image Processing Using MATLAB, which I co-authored. So, when Ned Gulley (who writes over at the MATLAB Community blog) started talking about creating technical blogs on mathworks.com, I approached him about it. I want to thank Ned for convincing MathWorks to create this technical blogging space, and I also want to express my gratitude for his encouragement of me, then and in the years since.

What have you got out of blogging over the years?

The biggest thing is a sense of connection with people around the world who are interested in MATLAB and image processing. This connection has been rewarding and motivating for me.

I also find it rewarding to be able to help other people in meaningful ways. It has been a joyful coincidence that some of the things that I personally find interesting have been also interesting and helpful to others. As I have been preparing for my retirement over the last couple of months, I have been delighted to hear from many MathWorkers that my blog helped them to learn about MATLAB and image processing. In some cases, it helped them to find their career here.

Also, the process of working out and writing an explanation of some technical topic almost always teaches me something new about it. So, writing the blog has helped to advance my own knowledge.

What was your most popular blog post?

Popularity is a bit hard for me to measure, as web traffic measurements are affected by search in unexpected and hard-to-interpret ways. If you don’t mind, let me just offer some highlights (from approximately 600 posts over 18 years).

Some fun deep dives:

- roipoly and poly2mask (part 1, part 2, part 3)
- upslope area
- continental divide
- Fourier transforms

Most controversial topics:

- New default MATLAB colormap
- Implicit expansion (Guests on Loren's Art of MATLAB blog, part 1, part 2)
- MATLAB arithmetic expands in R2016b
- More thoughts about implicit expansion

What skills do you have now that you couldn’t imagine having 30 years ago?

Well, that must be, believe it or not, naming things. I actually created a course for teaching software developers how to name API elements expressively, accurately, and effectively. I have taught that course about three dozen times.

Do you have a favourite function that you helped develop?

It’s impossible for me to pick a single favourite (or favorite) function. Here are a few that have special meaning to me.

- repmat Before repmat came along, MATLAB power users replicated a vector to form a matrix using a peculiar-looking indexing expression that was called “Tony’s Trick.”
- fftshift (support for arbitrary number of dimensions) When MATLAB 5 expanded the MATLAB world from matrices to multidimensional arrays, I worked on adding multidimensional array support to several functions. This experience taught me how to think about and implement generic multidimensional array manipulation, which in turn influenced Image Processing Toolbox designs.
- imresize I have revisited image resizing several times over the years. The heart of this Image Processing Toolbox function is now doing some heavy lifting in several ways throughout MATLAB, and I think it has influenced implementations elsewhere in the world, such as in deep learning. My colleague Alex and I jokingly ask ourselves whether it is possible to make an entire career about image resizing. I think the answer might be yes.
- poly2mask I have learned repeatedly, during my career, how difficult computational geometry problems can be when working on a discrete grid. I enjoyed finding creative algorithms to solve this particular problem in a way that is geometrically self-consistent and free from floating-point issues. The resulting function has been a central workhorse in Image Processing Toolbox for years.

What's next for you?

There are two things coming up next. The first is playing French horn. I have been working intensively to improve my horn playing for about the past eight years, and I will be studying and playing even more after retirement. Today, I play in the Concord Orchestra and the Melrose Symphony, both near Boston, Massachusetts. I will continue to do that and also look for other opportunities to perform and study. I write about studying and playing horn in the Horn Journey blog.

Second, I will continue working with MATLAB and image processing. It will be as a hobbyist, though, and not as my day job. I look forward to continuing to contribute to the MATLAB community. Look for me at MATLAB Answers, the File Exchange, and Discussions. My new MATLAB Central profile is here, and my LinkedIn profile is here. I plan to continue writing about MATLAB and image processing on my new Matrix Values blog.


The latest version of MATLAB is now available for download and it's our biggest update yet. I have to tell you, I'm really excited by this one! It has got some features that I've been wanting for a long, long time. I'll be doing deeper dives into some of my favourite things over the next few weeks but, for now, here's an overview of some of the features that got me excited for R2024a.

These are just a few of my personal highlights out of thousands of updates. The official release highlights page is here but even that is just a subset of the full release notes.

MATLAB

- New ODE Solvers: Last release, we brought you a totally new and improved interface to the traditional MATLAB ODE solvers. Now we bring you the first of the algorithmic updates with new solvers from the SUNDIALS suite, which also brings the ability to perform sensitivity analysis.
- Better Backslash: The iconic MATLAB operator has been improved for full tridiagonal matrices.
- REST function service: You can call custom MATLAB functions from any programming language or application that can make a REST call.
- Local functions (almost) anywhere: They used to have to be at the end of a script. Now they can be added anywhere in the script except within conditional contexts, such as if statements or for loops.
- MATLAB and Python: Automatic conversion between pandas dataframes and MATLAB tables. It's easier than ever to mix Python and MATLAB code thanks to the new Python Live Task. You can also now convert between Python dictionaries and MATLAB dictionaries.
- MATLAB and Fortran: Fortran has been part of my life since I was an undergraduate! Now MATLAB supports the free MinGW64 Compiler on Windows.
- Easier import of HDF5 data: MATLAB has supported HDF5 data for a long time. Now it's easier than ever thanks to new GUI import tools that can also generate code.
- Spellcheck in Live Scripts: One of the most requested live script features has finally dropped!
- Fastest MATLAB yet: Dozens of functions have been made faster. New algorithms, new libraries, more GPU support and more parallelisation.

My favourite toolbox updates

- Deep Learning Toolbox: Starting in R2024a, there is a new recommended workflow to build, train, and make predictions with neural networks that uses the trainnet function (introduced in R2023b) and dlnetwork objects. You can also run pretrained TensorFlow, PyTorch and ONNX models in Simulink.
- Optimization Toolbox: The HiGHS library is now the default Linear Programming (linprog) and Mixed Integer Programming (intlinprog) solver making these functions faster than before. See HiGHS' announcement here.
- Parallel Computing Toolbox: New GPU support in functions from MATLAB, Statistics and Machine Learning Toolbox, Communications Toolbox, 5G Toolbox, Audio Toolbox and Wavelet Toolbox. There's now well over 1,000 functions with GPU support across all of MATLAB and the various toolboxes.
- Statistics and Machine Learning Toolbox: You can now run pretrained models from Scikit-learn in Simulink. There's also a new function to perform incremental principal component analysis: incrementalPCA.
- Simscape: graphImporter, which allows you to extract data points from graph images, drag the picked points, interpolate multiple data lines on a common x-axis, and export them as a table. Use it to digitize datasheets and seamlessly integrate data into MATLAB.

- Simulink 3D animation: This isn't a product I'd used before but just a glance at the video below made me reach out to the team and ask "Tell me everything!"

Official release video for R2024a


This is a guest blog post by Michael Hosea, a numerical analyst at MathWorks. He works on MATLAB Coder and on MATLAB’s ODE and integral solvers.

MATLAB's ODE solvers have tolerances that the user can change. Users are often reluctant to set these tolerances, perhaps because they think they are only for power users, and they are reluctant to alter “factory settings.” The sentiment is understandable. Tolerances do occur in many different contexts in MATLAB, and some of them are that sort of thing. For example, you probably won't need or want to change the default tolerance in the rank function. But as we're going to see, that's not the situation with ODE solvers. It may surprise you to hear that we don't actually know how you should set your tolerances in an ODE solver.

Most of what follows will be geared towards understanding how to do exactly that, but the concepts apply much more broadly. The application is direct with integral, ismembertol, and uniquetol, to name a few, despite differences in how tolerances are supplied and used. Even more generally, the principles apply in unit testing when one has exact or reference values to compare with results from software being tested.

Of course we did have reasons for the default tolerances used in the ODE solvers. We wanted the default tolerances to request enough accuracy for plotting, and it probably doesn't take a lot of accuracy to do that. Also, we're using the same defaults for all the solvers. Consequently, they needed to work with low-order and high-order solvers alike. Loose tolerances tend to work well enough with high-order solvers. On the other hand, I don't know if you've ever tried using tight tolerances with low-order solvers, but patience is a virtue. Barring enough of that, ctrl-c is your friend. The resulting default tolerances are rather loose. Unfortunately, we don't know the context of the problem you are trying to solve. You might want more accuracy than these default tolerances are requesting, particularly if you are using one of the higher order methods. So let's talk about the tolerances and how to go about setting them for your purposes.

Fortunately, no great understanding of the software is needed to set tolerances in a reasonable way, only an understanding of the problem you are trying to solve and what these tolerances mean. We should be able to set reasonable tolerances if we can answer the following questions about the problem we are trying to solve:

- How small of a value do we consider negligible?
- How precise is the data that defines the problem?
- What percentage error is small enough to ignore? Or, put differently, how many significant digits of accuracy do we need?

Our ODE solvers allow you to set two different tolerances. One is the absolute tolerance and the other is the relative tolerance. These two work together. Absolute tolerance is the simpler of the two, so let's start there.

We say an approximate result y is within an absolute tolerance A of an exact or reference value ${\mathit{y}}_{0}$ if

$${\mathit{y}}_{0}-\mathit{A}\le \mathit{y}\le {\mathit{y}}_{0}+\mathit{A}$$

Here we assume $\mathit{A}\ge 0.$ Subtracting ${\mathit{y}}_{0}$ from each term gives us

$$-\mathit{A}\le \mathit{y}-{\mathit{y}}_{0}\le \mathit{A}$$

which we can write succinctly as

$$|\mathit{y}-{\mathit{y}}_{0}|\le \mathit{A}$$

When ${\mathit{y}}_{0}$ is the exact value we are trying to approximate with y, we call the quantity on the left the absolute error.

Before we go too far, let's just point out the obvious. We don't usually know ${\mathit{y}}_{0}$. If we did, there wouldn't be much point in doing any work to approximate it unless we were just testing the code. Codes like integral and the ODE solvers compute an estimate of the absolute error and use that in place of $|\mathit{y}-{\mathit{y}}_{0}|$. Sometimes it is an approximate bound on the error rather than an estimate per se. Exactly how this is done isn't important to our discussion here except to say that if things are going well, this estimate should be about the same size as the actual error.

So, we're done, right? Surely an absolute tolerance can describe whatever accuracy we need, no matter what we're doing. Well, yes and no. Suppose you're solving a problem where the correct answer is, say, 10 meters, and you'd like an answer to be correct to within $\pm 0.5\mathrm{mm}$. So, for that, you'd set your absolute tolerance to $\mathit{A}=0.0005\mathrm{m}$. Notice that the absolute tolerance in a physical context has units, in this case meters because y and ${\mathit{y}}_{0}$ are in meters.

So we write some code that defines

AbsoluteTolerance = 0.0005;

and we keep the units in our head. We're used to doing that.
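In MATLAB, the acceptance test itself is just a comparison. Here's a small sketch with made-up values:

```matlab
y0 = 10;                    % reference value, in meters
y  = 10.0003;               % an approximation of it
AbsoluteTolerance = 0.0005; % half a millimeter, expressed in meters
abs(y - y0) <= AbsoluteTolerance   % true: the error is 0.0003 m
```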

But what if the solution to our problem isn't just a scalar value but an array of values representing different kinds of quantities: maybe lengths, areas, volumes, mass, temperature, you name it. Assuming the software allows you to set a different absolute tolerance for each element of a solution array in the first place, you'd need a pretty good idea of the size of each solution element to decide ahead of time what its absolute tolerance should be. Probably we can't just take a number like 0.0005 and think it's going to be appropriate for everything.

Moreover, how did we decide that $\pm 0.5\mathrm{mm}$ was reasonable in the first place? For something 10m long, that's rather precise for many practical purposes, often enough even for something only 1m long, but it isn't for something 0.001m long. In that case it covers a substantial difference percentage-wise. Sometimes it's not so much the absolute error that concerns us, rather the error in a "percentage" sense.

And wouldn't it be convenient to have a way of describing the accuracy we want using a dimensionless tolerance? It would be nice to have something that automatically adjusts itself for different scaling, something that we don't have to change if we merely restate the same problem using different units of measurement. After all, it shouldn't be any harder or easier to compute something to 0.5mm than 0.0005m; that's the same distance.

You're probably familiar with the concept of significant digits or significant figures when reporting data. The basic idea is that if you see the measurement of some kind, say $\mathit{x}=1.5\mathrm{mm}$, it corresponds to an unspecified exact quantity that has (hopefully) been correctly rounded to $1.5\mathrm{mm}$. This means that

$$1.45\mathrm{mm}\le {\mathit{x}}_{\mathrm{exact}}<1.55\mathrm{mm}$$

since that is the interval of numbers that rounds to $1.5\mathrm{mm}$. Expressed in the parlance of absolute error, we're saying that the absolute error

$|\mathit{x}-{\mathit{x}}_{\mathrm{exact}}|\le 0.05\mathrm{mm}$.

That formulation includes the upper bound of $1.55\mathrm{mm}$, but we needn't quibble about that insofar as tolerances are ultimately used with error estimates, anyway. If we had instead written $\mathit{x}=1.500\mathrm{mm}$, we would be implying that

$1.4995\mathrm{mm}\le {\mathit{x}}_{\mathrm{exact}}<1.5005\mathrm{mm}$,

hence that

$$|\mathit{x}-{\mathit{x}}_{\mathrm{exact}}|\le 0.0005\mathrm{mm}.$$

Here again we have, without apology, added in the upper end point. In the first case we say that we have 2 significant digits, and the latter 4.

Note that 0.05 can be written as $5\times {10}^{-2}$, and 0.0005 as $5\times {10}^{-4}$. The exponents seem to indicate the number of significant digits. Unfortunately, the conjecture doesn't hold up. To see this, let's just change units of measurement. That shouldn't change the precision of anything. Switching from millimeters to microns, we have in the first case

$$1450\mu \le {\mathit{x}}_{\mathrm{exact}}<1550\mu $$

and in the second,

$1499.5\mu \le {\mathit{x}}_{\mathrm{exact}}<1500.5\mu $ .

And again, in the language of absolute error, we have

$$|\mathit{x}-{\mathit{x}}_{\mathrm{exact}}|\le 50\mu $$

and

$|\mathit{x}-{\mathit{x}}_{\mathrm{exact}}|\le 0.5\mu $,

respectively. The "absolute tolerances" on the right are $5\times {10}^{+1}\mu$ and $5\times {10}^{-1}\mu$. Now the exponents don't seem to bear any relationship to the number of significant digits.

The problem is clearly scaling. Because absolute error has units of measurement, merely changing units of measurement rescales the data. So how can we express the idea that we want a certain number of digits to be correct? We need something that isn't sensitive to scaling. Enter the idea of relative error:

$$\frac{\left|\mathit{y}-{\mathit{y}}_{0}\right|}{|{\mathit{y}}_{0}|}$$

All we've done here is divide the absolute error by $\left|{\mathit{y}}_{0}\right|$. Obviously we are assuming that ${\mathit{y}}_{0}\ne 0$. More on that later.

Note that relative error is just the percentage difference formula with absolute values (and without the final multiplication by 100% to convert to percent).

This observation may help later when we discuss how to choose the relative tolerance.

A hopeful sign is that the relative error is now dimensionless because the numerator and denominator have the same units. Obviously any scaling of y cancels out. Let's use R from here on out to denote a relative error tolerance. So we want

$$\frac{\left|\mathit{y}-{\mathit{y}}_{0}\right|}{|{\mathit{y}}_{0}|}\le \mathit{R}$$

Clearing the fraction, we get

$$\left|\mathit{y}-{\mathit{y}}_{0}\right|\le \mathit{R}\cdot |{\mathit{y}}_{0}|$$

That's usually how we use relative error in software, since it avoids the division.
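Here is that test in MATLAB, along with a quick demonstration that it is insensitive to scaling; the particular values are made up for illustration:

```matlab
y0 = 1.5;  y = 1.4996;  R = 5e-4;  % reference, approximation, relative tolerance
abs(y - y0) <= R*abs(y0)           % true: 0.0004 <= 0.00075
% Rescale both values, as if changing units; the outcome is unchanged.
s = 1000;
abs(s*y - s*y0) <= R*abs(s*y0)     % still true
```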

Let's try this out on an example. Suppose ${\mathit{y}}_{0}=1234500$. Looks like 5 significant digits there. Let's try $\mathit{R}=5\times {10}^{-5}$. Plugging in, this becomes

$$\left|\mathit{y}-1234500\right|\le 61.725$$

Which means that

$$1234500-61.725\le \mathit{y}\le 1234500+61.725$$

or

$$1234438.275\le \mathit{y}\le 1234561.725$$

If we were to change the units on y, we would also scale those end points by the same factor as y itself is changed, just as if we had multiplied the inequality through by the scaling factor.

Unfortunately, those end points don't round to 1234500 when expressed with 5 significant digits. The interval is wider than we want it to be if we wanted to guarantee 5 significant digits correct. If, instead, we tighten it up to $\mathit{R}=5\times {10}^{-6}$ and repeat the exercise, we end up with

$$1234493.8275\le \mathit{y}\le 1234506.1725$$

Now it's tighter than we needed, but those endpoints do round to 1234500 written to 5 significant digits. This generalizes: to request $\mathit{n}$ correct significant digits, use the relative tolerance $\mathit{R}=5\times {10}^{-\left(\mathit{n}+1\right)}$.

You can use this relative tolerance when you don't know what ${\mathit{y}}_{0}$ is or if you want the same formula to work when ${\mathit{y}}_{0}$ varies.

Just as an aside, the interval we would have wanted to see was

$$1234450\le \mathit{y}\le 1234550$$

You might be curious what relative tolerance gives this. If you're not the least bit curious, skip to the next section, because we won't be using this later.

Define a preliminary tolerance $\mathit{R}=5\times {10}^{-\left(\mathit{n}+1\right)}$ as above. Then the relative tolerance that gives the loosest bounds that require n correct significant digits is

$$\mathit{R}\cdot {10}^{\left(\mathrm{ceil}\left(\mathit{c}\right)-\mathit{c}\right)}$$

where

$$\mathit{c}={\mathrm{log}}_{10}\left(\left|{\mathit{y}}_{0}\right|\right)$$

The exponent there is just the fractional part of c. In our example the tolerance works out to be about 8.1 times larger than R. This should be no surprise since we already knew from our experimentation that R was too tight and $10\mathit{R}$ too loose. There's nothing very nice or memorable there, and we can only construct this "ideal" relative tolerance if we know ${\mathit{y}}_{0}$, so we will not make further use of this fact, but if you were setting up a unit test comparing software output to a known correct result, you could make use of it.
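Here's that computation for our example, sketched in MATLAB (the variable names are mine):

```matlab
y0 = 1234500;
n  = 5;                         % significant digits wanted
R  = 5*10^-(n + 1);             % preliminary tolerance, 5e-6
c  = log10(abs(y0));            % about 6.09
Rloosest = R*10^(ceil(c) - c)   % about 4.05e-5, roughly 8.1 times R
Rloosest*abs(y0)                % about 50, the half-width we wanted
```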

The definition of relative error has a serious problem when the reference value ${\mathit{y}}_{0}$ is zero. The requirement

$$\frac{\left|\mathit{y}-{\mathit{y}}_{0}\right|}{|{\mathit{y}}_{0}|}\le \mathit{R}$$

becomes impossible to meet. As previously mentioned, we normally avoid the division, anyway, and evaluate

$$|\mathit{y}-{\mathit{y}}_{0}|\le \mathit{R}\cdot |{\mathit{y}}_{0}|$$

That avoids the division by zero, but it hardly helps very much when ${\mathit{y}}_{0}=0$ because the test is only satisfied when ${\mathit{y}=\mathit{y}}_{0}$, i.e. when the absolute error is zero.

It might be helpful to think about this as a limiting case. Consider a sequence of problems where the exact solution is not zero but is smaller and smaller in magnitude. Let's say the exact solution is ${\mathit{y}}_{0}=12345\times {10}^{-\mathit{K}}$ for increasing values of K. You might imagine that we're rewriting the same physical quantity in larger and larger units of measurement as K increases.

K | $12345\times {10}^{-\mathit{K}}$
10 | 0.0000012345
20 | 0.00000000000000012345
30 | 0.000000000000000000000000012345
40 | 0.0000000000000000000000000000000000012345

Leading zeros to the right of the decimal point aren't "significant" in the sense of significant digits, but obviously they determine the magnitude of the number. Relative error control is fine for all these values as long as all those leading zeros are computationally free to obtain. That might sound unlikely, but it actually could be the case if this were only a matter of scaling, say when switching units from millimeters to light years.

In the limiting case of ${\mathit{y}}_{0}=0$, however, relative error control assumes that all those zeros are "insignificant", computationally free to obtain. Unfortunately, when ${\mathit{y}}_{0}=0$ those leading zeros are all the digits that there are!

It still could be the case that they are free. For example, if everything in the problem is zero, then the intermediate computations are probably going to yield zero at every stage. Or it might happen if there is symmetry that results in perfect cancellation, e.g. integrating an odd function over the interval [-1,1]. There an exact zero is obtained not because the intermediate calculations are exact, rather because the intermediate results cancel out perfectly, i.e., are equal and opposite, regardless of how inexact they might be.

These are common enough scenarios, but they are not always the case. Generally speaking, it is not easier to compute the ideal result $\mathit{y}=0$ when ${\mathit{y}}_{0}=0$ than it is to compute the ideal result $\mathit{y}=1.234567890$ when ${\mathit{y}}_{0}=1.234567890$. Just as it is usually impractical to require that the absolute error is zero, so it is impractical to impose any relative tolerance when ${\mathit{y}}_{0}=0$. As convenient as relative error control is, we have a hole at zero that we need to fill somehow.

Absolute error is easy to understand but can be difficult to use when the problem contains values of different scales, and we need to know something about the magnitude of the solution values before we have computed them. Controlling the error in the relative sense instead rectifies these limitations of absolute error control, but it doesn't work in any practical way when the desired solution happens to be very close to zero.

Fortunately, ${\mathit{y}}_{0}=0$ is a value that doesn't have any scaling to worry about. Theoretically, then, absolute error should work fine at and very near zero, while relative error works everywhere else. So let's splice them together. Historically this has been accomplished in more than one way, but more commonly today it is done like this:

$$|\mathit{y}-{\mathit{y}}_{0}|\le \mathrm{max}\left(\mathit{A},\mathit{R}\cdot |{\mathit{y}}_{0}|\right)$$

or in MATLAB code

abs(y - y0) <= max(AbsoluteTolerance,RelativeTolerance*abs(y0))

As previously mentioned, we don't generally know ${\mathit{y}}_{0}$, and the left-hand side will be an estimate of the absolute error obtained somehow or other. If we also assume that the value of y that we have computed at least has the right magnitude, we can substitute it for ${\mathit{y}}_{0}$ in that expression. So in software the test will take the form

errorEstimate <= max(AbsoluteTolerance,RelativeTolerance*abs(y))

Here errorEstimate is an estimate of the absolute error, or possibly an approximate upper bound on it. Using max, not min, ensures that we choose the least restrictive of the two tolerances. This is important because the hole we are trying to fill occurs because relative error control becomes too restrictive near ${\mathit{y}}_{0}=0$.

Perhaps it is interesting to observe where the switchover from using the absolute tolerance to using the relative tolerance occurs. It occurs when $\mathrm{AbsoluteTolerance}=\mathrm{RelativeTolerance}\cdot \mathrm{abs}\left(\mathit{y}\right)$. In other words, it occurs when

$$\mathrm{abs}\left(\mathit{y}\right)=\frac{\mathrm{AbsoluteTolerance}}{\mathrm{RelativeTolerance}}$$

For example, the default tolerances for the integral function are AbsoluteTolerance = 1e-10 and RelativeTolerance = 1e-6. Consequently when we estimate that a quantity is less than 1e-10/1e-6 = 1e-4 in magnitude, we use AbsoluteTolerance and control the error in the absolute sense, otherwise we use RelativeTolerance and control it in the relative sense.
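A small sketch of how the combined test behaves on either side of that switchover, using integral's default tolerances:

```matlab
A = 1e-10;  R = 1e-6;     % integral's default tolerances
max(A, R*abs(2e-5))       % 1e-10: below the switchover, A governs
max(A, R*abs(3))          % 3e-06: above the switchover, R governs
```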

Not all error control strategies in numerical software have been formulated that way, but they can have similar effects. Often only one tolerance was accepted, and in that case one could adapt the above strategy to

errorEstimate <= tol*max(1,abs(y))

This effectively makes the two tolerances the same, so that absolute error control is used for $\mathrm{abs}\left(\mathit{y}\right)\le 1$ and relative error control for $\mathrm{abs}\left(\mathit{y}\right)\ge 1$. Since error control is not an exact affair to begin with, and we have even substituted y for ${\mathit{y}}_{0}$, addition can be used instead of the max function:

errorEstimate <= tol*(abs(y) + 1)

If $\mathrm{abs}\left(\mathit{y}\right)$ is very small, the factor on the right is just a little larger than 1, hence the test is essentially comparing the estimate of the absolute error with the tolerance. That's absolute error control. If, on the other hand, $\mathrm{abs}\left(\mathit{y}\right)$ is very large, the addition of 1 matters but little, and so the effect is relative error control.

Floating point arithmetic has limitations that may affect the tolerances we choose, so let's review what they are. MATLAB provides the functions eps, realmin, and realmax that tell you about limits in the floating point type.

Double precision:

eps | 2.220446049250313e-16
realmin | 2.225073858507201e-308
realmax | 1.797693134862316e+308

Single precision:

eps("single") | 1.1920929e-07
realmin("single") | 1.1754944e-38
realmax("single") | 3.4028235e+38

We used the eps function above without numeric inputs, but writing just eps with no inputs is the same as writing eps(1), and writing eps("single") is the same as writing eps(single(1)). In what follows we'll stick to the double precision case, but the same principles apply to single precision. The documentation says:

"eps(X) is the positive distance from abs(X) to the next larger in magnitude floating point number of the same precision as X."

Let's unpack this in case you've never thought about it. We use floating point numbers to model the real numbers, but there are infinitely many real values and only a finite number of floating point values. In double precision there are ${2}^{52}$ evenly spaced numbers in the interval $1\le \mathit{x}<2$. This goes for any consecutive powers of 2, i.e. there are ${2}^{52}$ evenly spaced double precision floating point numbers in the interval ${2}^{\mathit{n}}\le \mathit{x}<{2}^{\mathit{n}+1}$, so long as $-1022\le \mathit{n}\le 1023$. In MATLAB you also have that many in the interval $0\le \mathit{x}<{2}^{-1022}$, which is the same as $0\le \mathit{x}<\mathrm{realmin}$.

In every case, the very next larger floating point number after $\mathit{x}\ge 0$ is $\mathit{x}+\mathrm{eps}\left(\mathit{x}\right)$, so the spacing between the floating point numbers in the interval ${2}^{\mathit{n}}\le \mathit{x}<{2}^{\mathit{n}+1}$ is $\mathrm{eps}\left({2}^{\mathit{n}}\right)$. The value of $\mathrm{eps}\left(\mathit{x}\right)$ is the same throughout the interval. For example:

n = 3;

x = linspace(2^n,2^(n+1),5)

eps(x)

There are no floating point numbers between, say, 10 and $10+\mathrm{eps}\left(10\right)$. If you try to compute the midpoint between $\mathit{a}=\mathit{x}$ and $\mathit{b}=\mathit{x}+\mathrm{eps}\left(\mathit{x}\right)$, $\mathit{c}=\frac{\left(\mathit{a}+\mathit{b}\right)}{2}$, the result will be either $\mathit{c}=\mathit{x}$ or $\mathit{c}=\mathit{x}+\mathrm{eps}\left(\mathit{x}\right)$.
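You can verify that directly:

```matlab
x = 10;
b = x + eps(x);        % the next double after 10
c = (x + b)/2;         % the "midpoint" has nowhere else to land...
c == x || c == b       % ...so this is true
```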

OK. So what's the point again? We're talking about floating point limitations on tolerances. With extremely tight tolerances, we will, in effect, be asking the code to return one of a small set of discrete values. For example, if we choose an absolute tolerance of eps when ${\mathit{y}}_{0}=2$, then by definition, we are asking for a result y such that

$$2-\mathrm{eps}\le \mathit{y}\le 2+\mathrm{eps}$$

The floating point number $2-\mathrm{eps}$ is less than 2, but it turns out to be the one floating point number immediately preceding 2. At the upper end point, however, $\mathit{2}+\mathrm{eps}=2$ in floating point, so the range above is just another way of saying

$$\mathit{y}\in \left\{2-\mathrm{eps},2\right\}$$

Some "range", that! We have allowed for only two possible values. Similarly, for a relative tolerance of eps, we get from applying the definition that

$$2-2\text{\hspace{0.17em}}\mathrm{eps}\le \mathit{y}\le 2+2\text{\hspace{0.17em}}\mathrm{eps}$$

which turns out to be satisfied by just 4 possible values of y:

$$\mathit{y}\in \left\{2-2\text{\hspace{0.17em}}\mathrm{eps},\text{\hspace{0.17em}}2-\mathrm{eps},\text{\hspace{0.17em}}2,\text{\hspace{0.17em}}2+2\text{\hspace{0.17em}}\mathrm{eps}\right\}$$
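We can confirm the count by walking the floating point numbers one spacing at a time (a sketch, not production code):

```matlab
y0 = 2;  R = eps;
vals = y0 - 2*eps;     % lower end of the allowed interval
while vals(end) + eps(vals(end)) <= y0 + R*y0
    vals(end+1) = vals(end) + eps(vals(end));
end
numel(vals)            % 4
```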

The upshot is that we will need to be careful when setting our tolerances so that we are asking for something reasonable.

We will always want to choose $\mathit{R}\ge \mathrm{eps}$. Even in testing scenarios where you have ${\mathit{y}}_{0}$ in hand and you're just comparing outputs, you will probably want to use at least a small multiple of eps. The ODE and integral solvers will not allow anything smaller than $100\,\mathrm{eps}$.

With absolute tolerances things are more complicated because we must take account of the scale of values. An absolute tolerance can be smaller than eps when the solution values will be much smaller than 1. It can also be larger than 1. Consider a problem where the exact solution is 1 meter and anything smaller than 10 microns is negligible. If you were solving the problem using meters as your unit of measurement, then you'd set your absolute tolerance to 0.00001 meters because that's 10 microns. But now rescale the problem to use microns as your unit of measurement instead of meters. Now the exact solution is 1e6 microns, and your absolute tolerance would be 10 microns. Conversely, if you were to solve the problem in petameters, the exact answer is now 1e-15 petameters, and your absolute tolerance would be 1e-20 petameters. We're just changing units to illustrate the point. Your problem will probably have units that are natural for the problem, and you'll need to deal with that as it comes.

Whereas the value of eps places a natural limit on the relative tolerance, the realmin and realmax values present limitations on what you can use for an absolute tolerance. As with eps and the relative tolerance, you should avoid these extremes. Rescale the problem if you find yourself wanting to set an absolute tolerance anywhere near realmin or realmax.

But how does one go about actually choosing tolerances for a computation that solves some real world problem? To answer that, we just need to answer a few questions.

How small is small enough that it is nothing for all practical intents and purposes? This might be different for different pieces of the solution, but it provides a reasonable value for the absolute tolerance. If the solution to your problem contains a weight for some powder that is 1.23456g, and your scale is only accurate to 0.1g, there's not much you can do with the extra 0.03456g in the result even if those numbers are correct. What we have there is 1.2g plus a negligible quantity. It makes sense to use tolerances that only ask for accuracy to, say, 0.01g. Suppose we did that and got 1.23654 instead. Now 0.00654 of that is "noise", but it's the same 1.2g when rounded to the 0.1g that our scale can actually measure.

Recall that an absolute tolerance is not limited by eps. It has the same units as the solution, and it can be freely set to whatever value makes sense, whether 1e-40 or 1e+40 depends on how the problem is scaled. Sometimes you might even be allowed to make it exactly zero, but that's a bold move unless you know the solution isn't close to zero.

If there are only a few significant digits of precision in the data that defines the problem to be solved, it's not likely that a more accurate solution will be more useful than one computed to about the same accuracy as the problem itself is specified. Of course, it's not really that simple. It may be that solution curves are very sensitive or very insensitive to variations in different problem data. It usually is not clear how the error in the problem specification manifests in the solution. Nevertheless, we are typically computing an approximate solution to an approximate problem, and that has implications. Given that we're starting with some error, how much are we willing to pay for extra digits of accuracy from the solver? The answer is probably "not a lot", and yet these extra digits will most likely cost us additional time and maybe computing resources to obtain. If the numerical data that defines the problem is only accurate to, say, 5 significant digits, we might reasonably decide to ask the solver for no more than 5 or 6 digits of accuracy when solving it. If 5 digits of accuracy seems OK to us, we could set $\mathit{R}=5\cdot {10}^{-6}$ or $\mathit{R}=5\cdot {10}^{-7}$ for good measure.

If you said that a 0.01% error is small enough to ignore, then a reasonable relative tolerance would be 0.0001, as long as that isn't much beyond the precision of the data that defines the problem. If you prefer to think in terms of significant digits, and need, say, 5 significant digits, then a reasonable relative tolerance might be $5\times {10}^{-6}$.

Then combine this with your answer to the previous question. Since there is some hand-waving about how the solution reacts to errors in the problem specification, one can make equally good arguments for picking the larger or the smaller of the two. If accuracy is especially important to you, use the smaller.
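Put together as a rule of thumb, the answers translate into settings like these; this is my formulation for illustration, not an official recommendation:

```matlab
nDigits = 5;             % significant digits we need
negligible = 1e-10;      % magnitude we consider nothing, in problem units
RelativeTolerance = 5*10^-(nDigits + 1)   % 5e-6
AbsoluteTolerance = negligible
```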

This is just a way of setting "reasonable" tolerance values. It provides a baseline of sorts. Nothing prevents you from choosing tighter or looser tolerances, and there can be good reasons to go with tighter or looser tolerances. For one thing, tolerances are based on error estimates, and in some cases not even error estimates for the final solution values. In ODE solvers, for example, they are used "locally", step by step. This can end up limiting the error in the final solution values in different ways to different degrees, so for some "insurance", slightly tighter tolerances can be a good thing. On the other hand, run times tend to go up with tighter tolerances. Memory requirements may also increase. If the solution changes but little and time is of the essence, looser tolerances might be justified, provided you are comfortable with the potential for larger errors. If the computation is occurring in a real-time system, for example, you would want to experiment with tolerances to see how the run-time and solution values react to tighter and looser tolerances.

Let's take a look at a first order system of ordinary differential equations.

$$\frac{\mathrm{d}}{\mathrm{d}\mathit{t}}\mathit{y}=\mathit{f}\left(\mathit{t},\mathit{y}\right),\mathit{y}\left(0\right)={\mathit{y}}_{0}$$

This example has been constructed so that we know the exact solution.

function yp = f(t,y,param)

yp = zeros(size(y));

yp(1) = 2*t*(1 - y(2)) + 1;

yp(2) = 2*t*(y(1) - t - param);

end

function y = exactSolution(t,param)

y = zeros(2,1);

y(1) = cos(t.*t) + t + param;

y(2) = sin(t.*t) + 1;

end

And let's use this little comparison function I wrote so that we can easily see the accuracy of the results.

function compare(y,yExact)

fprintf(1," y(1) = %14.9f, y(2) = %14.9f\n",y(1),y(2));

fprintf(1,"exact y(1) = %14.9f, exact y(2) = %14.9f\n",yExact(1),yExact(2));

absoluteError = abs(y - yExact);

relativeError = absoluteError./abs(yExact);

fprintf(1,"absolute error y(1) = %5.0e, absolute error y(2) = %5.0e\n",absoluteError(1),absoluteError(2));

fprintf(1,"relative error y(1) = %5.0e, relative error y(2) = %5.0e\n\n",relativeError(1),relativeError(2));

end

Let's define this specific problem using as many digits as double precision allows. This will represent a problem that is specified exactly.

param = 1.234567890123456;

y0 = exactSolution(0,param); % "exact" initial value

D = ode(ODEFcn=@f,Parameters=param,InitialValue=y0);

t1 = 5; % Solve for the solution at this time point for comparisons.

yExact = exactSolution(t1,param); % The exact solution to our differential equation at t1.

% Problem specified to full precision. Using default tolerances.

sol = solve(D,t1);

compare(sol.Solution,yExact);

The default tolerances are not tight, and it shows in the accuracy we get back. We got 3 digits correct in y(1) and just 1 digit correct in y(2). Let's suppose we want 8 digits. So we'll set the relative tolerance to 5e-9.

We've just made up this problem, but what amounts to a "negligible amount" in a system like this could in theory be different for y(1) than y(2). A little known fact is that the MATLAB ODE solvers support having different absolute tolerances for different solution components. This problem doesn't really depend on it, but just to show how it's done, assume that for y(1) we think a negligible amount is 1e-10 and for y(2), 1e-11.

% Problem specified to full precision. Asking for about 8 significant

% digits of accuracy.

D.RelativeTolerance = 5e-9;

D.AbsoluteTolerance = [1e-10,1e-11];

sol = solve(D,t1);

compare(sol.Solution,yExact);

That's more like it. We got 9 digits correct in y(1) and 8 in y(2). But this is with the problem data at full precision. Your problem may have some measurements that aren't exact. Let's simulate that by tweaking some of the problem data while leaving the exact solution unchanged. Replacing the parameter and initial value with "approximate" values:

% Problem specified with 6 significant digits. Still asking for 8.

% significant digits.

param = 1.23457;

D.InitialValue = [param + 1; 1];

sol = solve(D,t1);

compare(sol.Solution,yExact);

Maybe we solved the approximate problem just as accurately as before, but comparing to the "true" solution from the full precision problem, we only got 6 digits correct in y(1) and 5 digits correct in y(2). This is why you might want to let the precision of the data in the problem moderate your tolerance choices. Reducing our request to 6 digits, we get:

% Problem specified with 6 significant digits, now asking for 6

% significant digits.

D.RelativeTolerance = 5e-7;

D.AbsoluteTolerance = 1e-9;

sol = solve(D,t1);

compare(sol.Solution,yExact);

This time we got 6 digits correct in y(1). We only got 5 digits correct in y(2), though, so maybe we should have set the relative tolerance at 5e-8. It's not a bad idea to ask for 7 digits when you need 6.

A failure related to tolerances usually occurs when the problem is too difficult to solve to the requested accuracy. If that is the case, then one must either use an algorithm that is able to solve that problem more easily, or one must relax the tolerances, or both. Failing that, the problem itself will have to be reformulated.

Sometimes, paradoxically, one must tighten the tolerances to succeed. There is an example of this here:

The tolerances for ode15s are specified in that example because the integration fails with the default tolerances, which are looser.

How is that even possible, you ask? Well, when you're driving a car on a mountain road, you need to keep the car on the road, and that's going to require a practical minimum amount of accuracy. If your tolerances are too loose and you drive off the cliff, then you won't get where you're going. How much accuracy you need in a mathematical problem is less clear than how to keep a car on the road, but the point is that if you get a failure, and if loosening the tolerances doesn't help, try tightening them. What have you got to lose? It might work. Probably the worst that's going to happen is that it fails quicker.

Now get out there and set those tolerances!
