Finding if all elements of a matrix are finite, fast!

Posted by Mike Croucher, May 12, 2022


Today, I'm going to focus on three new functions that were added to the MATLAB programming language in R2022a: allfinite, anynan and anymissing. These functions are more concise, and usually faster, than the alternative methods.

Are all matrix elements finite? Doing it the old way.

How do you check if all the elements of a matrix are finite or not?  Here's a matrix where the answer is 'No' by design.

a=rand(3);a(2,2)=inf

a = 3×3
    0.6945    0.8625    0.0988
    0.0722       Inf    0.7485
    0.8283    0.2797    0.5190

A common pattern for testing if every element in a matrix is finite is to first apply the isfinite function which returns a logical array:

checkFinite = isfinite(a)

checkFinite = 3×3 logical array
   1   1   1
   1   0   1
   1   1   1

and then pass this to the all function, which checks whether all the entries of each column are true:

checkAllRowsFinite = all(checkFinite)

checkAllRowsFinite = 1×3 logical array
   1   0   1

We then apply the all function one more time to get the result we want:

checkAllFinite = all(checkAllRowsFinite)

checkAllFinite = logical
   0

No!  Not all the entries of our original matrix are finite!  Usually, all of the above is put into one line:

checkAllFinite = all(all(isfinite(a)))

checkAllFinite = logical
   0

This was a common pattern for many years, and things could get out of hand when dealing with multidimensional arrays.  For example, to check if every element of

a=rand(3,3,3,3);

is finite, we'd need to do

checkAllFinite = all(all(all(all(isfinite(a)))))

checkAllFinite = logical
   1

That's a lot of all!  Some people use the concise but arguably cryptic

checkAllFinite = all(isfinite(a(:)))

checkAllFinite = logical
   1

So, in R2018b, we introduced a new way of doing this that's nicer to read

checkAllFinite = all(isfinite(a),'all')

checkAllFinite = logical
   1
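
As a quick sanity check, the three pre-R2022a idioms above should always agree. The sketch below plants a single Inf in a 4-D array (the index is arbitrary and chosen only for illustration) and confirms that all three give the same answer.

```matlab
% Compare the three older idioms on a 4-D array that contains one Inf.
a = rand(3,3,3,3);
a(2,2,1,3) = Inf;                            % arbitrary position

nested  = all(all(all(all(isfinite(a)))));   % one all() per dimension
colon   = all(isfinite(a(:)));               % flatten to a column vector first
allFlag = all(isfinite(a), 'all');           % R2018b and later

% All three should report that the array is not entirely finite.
assert(isequal(nested, colon, allFlag));
assert(~allFlag);
```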

New in R2022a: allfinite - Finding if all is finite, fast

We see these patterns a lot in both our code and users' code.  So much so that in R2022a, we've developed another function to make it even easier to perform this common operation: allfinite

a = rand(3);a(2,2)=inf;

CheckAllFinite = allfinite(a)

CheckAllFinite = logical
   0
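
If your code also has to run on releases older than R2022a, one option is to fall back to the R2018b syntax when allfinite isn't available. This is only a sketch under that assumption: allFiniteCompat is a hypothetical name, and the exist check is just one way to detect the function.

```matlab
% allFiniteCompat is a hypothetical wrapper name, not a MATLAB function.
if exist('allfinite', 'builtin') || exist('allfinite', 'file')
    allFiniteCompat = @(x) allfinite(x);             % R2022a and later
else
    allFiniteCompat = @(x) all(isfinite(x), 'all');  % R2018b and later
end

a = rand(3);
a(2,2) = Inf;
tf = allFiniteCompat(a);   % false, because of the Inf at (2,2)
```

Bear in mind that calling through an anonymous function adds overhead of its own, so a wrapper like this gives up some of the speed benefit for small inputs.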

We didn't do this just to save on typing though. We did it because it's faster! You see the biggest difference with a large number of small matrices. Let's look at 10 million 3 x 3 matrices.

a3 = rand(3);

tic
for i = 1:1e7
    tf = all(isfinite(a3), 'all');
end
oldMethodTime = toc

oldMethodTime = 0.7560

tic
for i = 1:1e7
    tf = allfinite(a3);
end
newMethodTime = toc

newMethodTime = 0.0847

fprintf("allfinite(a3) is %.2f times faster than all(isfinite(a3), 'all') for " + ...
    "10 million small matrices\n",oldMethodTime/newMethodTime)

allfinite(a3) is 8.93 times faster than all(isfinite(a3), 'all') for 10 million small matrices

That can add up to quite a difference in functions that are called a lot.  The benefit is reduced for larger matrices, but it's still useful.

a2000 = rand(2000);

tic
for i = 1:1e3
    tf = all(isfinite(a2000), 'all');
end
oldMethodTime = toc

oldMethodTime = 2.9144

tic
for i = 1:1e3
    tf = allfinite(a2000);
end
newMethodTime = toc

newMethodTime = 0.8977

fprintf("allfinite(a2000) is %.2f times faster than all(isfinite(a2000), 'all') for " + ...
    "1000 large matrices\n",oldMethodTime/newMethodTime)

allfinite(a2000) is 3.25 times faster than all(isfinite(a2000), 'all') for 1000 large matrices
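
The tic/toc loops above are easy to read, but the timeit function is another way to take this kind of measurement, since it calls the function several times and returns a typical run time. Here is a minimal sketch; tOld and tNew are illustrative names, and because the anonymous function handles add call overhead of their own, expect the ratio to differ somewhat from the loop-based numbers.

```matlab
% Time both approaches on one large matrix using timeit.
a2000 = rand(2000);

tOld = timeit(@() all(isfinite(a2000), 'all'));   % R2018b syntax
tNew = timeit(@() allfinite(a2000));              % R2022a function

fprintf("timeit: allfinite is %.2f times faster on a 2000 x 2000 matrix\n", tOld/tNew)
```

For very small inputs such as a 3 x 3 matrix, timeit may warn that the operation is too fast to measure accurately, which is one reason the explicit loops are still handy there.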

Are there any NaNs or is anything missing?

Along with allfinite, the MATLAB math team identified two other patterns that could be optimised in a similar way and came up with the functions anynan and anymissing. When working with arrays, there will almost always be a speedup although you may have to run it many times to see it.

B = 0./[-2 -1 0 1 2]

B = 1×5
     0     0   NaN     0     0

tic
repeats = 1e6;
for i = 1:repeats
    tf = any(isnan(B), 'all');
end
oldMethodTime = toc

oldMethodTime = 0.0887

tic
for i = 1:repeats
    tf = anynan(B);
end
newMethodTime = toc

newMethodTime = 0.0126

fprintf("anynan(B) is %.2f times faster than any(isnan(B), 'all') for " + ...
    "%d small vectors\n",oldMethodTime/newMethodTime,repeats)

anynan(B) is 7.03 times faster than any(isnan(B), 'all') for 1000000 small vectors
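
It's worth keeping in mind that anynan and allfinite answer different questions: Inf is not finite, but it isn't NaN either. The short sketch below illustrates the difference; the vector C is made up for the example.

```matlab
% anynan should agree with the older any(isnan(...)) idiom.
B = 0 ./ [-2 -1 0 1 2];            % 0/0 produces the single NaN
assert(anynan(B) == any(isnan(B), 'all'));
assert(anynan(B));

% Inf is not NaN, but it is not finite either.
C = [1 Inf 3];                     % illustrative vector
assert(~anynan(C));
assert(~allfinite(C));
```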

Is anynan faster for all datatypes?

For some data types, you might not see a speedup but it will be, at worst, as good as existing methods.  Consider this example of checking for missing values in a table:

singleVar = single([1;3;5;7;9;11;13]);

cellstrVar = {'one';'three';'';'seven';'nine';'eleven';'thirteen'};

categoryVar = categorical({'red';'yellow';'blue';'violet';'';'ultraviolet';'orange'});

dateVar = [datetime(2015,1:7,15)]';

stringVar = ["a";"b";"c";"d";"e";"f";"g"];

mytable = table(singleVar,cellstrVar,categoryVar,dateVar,stringVar)

mytable = 7×5 table
     singleVar    cellstrVar    categoryVar    dateVar        stringVar
1    1            'one'         red            15-Jan-2015    "a"
2    3            'three'       yellow         15-Feb-2015    "b"
3    5            ''            blue           15-Mar-2015    "c"
4    7            'seven'       violet         15-Apr-2015    "d"
5    9            'nine'        <undefined>    15-May-2015    "e"
6    11           'eleven'      ultraviolet    15-Jun-2015    "f"
7    13           'thirteen'    orange         15-Jul-2015    "g"

Previously, you might have checked this as follows

TF = any(ismissing(mytable),'all')

TF = logical
   1

Now, we can do

TF = anymissing(mytable)

TF = logical
   1
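
What counts as "missing" depends on the data type, which is why the table above reports missing values even though none of its entries are NaN. The sketch below checks a few types individually; the small arrays are made up for illustration.

```matlab
% Each data type has its own notion of a missing value.
assert(anymissing({'one'; ''}));                 % '' in a cell array of char vectors
assert(anymissing(categorical({'red'; ''})));    % '' becomes <undefined>
assert(anymissing([datetime(2015,1,15); NaT]));  % NaT for datetime
assert(anymissing(single([1; NaN])));            % NaN for floating point
assert(anymissing(["a"; missing]));              % <missing> for string

% An empty string "" is a valid value, not a missing one.
assert(~anymissing(["a"; ""]));
```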

The times are about the same, so we can safely use these new functions all the time. Let's run both versions 5000 times to see:

tic
repeats = 5e3;
for i = 1:repeats
    tf = any(ismissing(mytable),'all');
end
oldMethodTime = toc

oldMethodTime = 0.4411

tic
for i = 1:repeats
    tf = anymissing(mytable);
end
newMethodTime = toc

newMethodTime = 0.3786

newMethodTime = 0.3786

Updating MathWorks code to use these new functions

We've started using these new functions straight away in many MATLAB functions.  They're used in argument validation in functions like expm, logm and median among others.  See for yourself by opening the source code with, for example:

edit sqrtm
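
To give a feel for what this kind of input check can look like in your own code, here is a minimal sketch. It is not taken from any MathWorks source file: safeMean and its error identifier are hypothetical names invented for the example.

```matlab
function m = safeMean(x)
% safeMean  Hypothetical helper: mean of all elements, rejecting
% inputs that contain Inf, -Inf or NaN.
    if ~allfinite(x)
        error("safeMean:nonFiniteInput", ...
            "Input must contain only finite values.");
    end
    m = mean(x, 'all');   % mean over every element (R2018b and later)
end
```

Saved as safeMean.m, calling safeMean([1 2; 3 4]) returns 2.5, while safeMean([1 Inf]) raises the error instead of silently returning Inf.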

Over to you

In a release as big as R2022a that has many shiny features, simple functions such as these can often get overlooked.  They are, however, useful new ways of performing very common operations. Teams at MathWorks are already starting to use them throughout the codebase to make improvements here and there, and I am sure I'll get the chance to put them to good use in users' code soon enough.

Do you think they'll be useful in your workflow?  Are there any similar functions you wish MATLAB had?

Category: MATLAB Programming Language, New Features, Performance
