Timing Extraction of Parts of an Array

Posted by Loren Shure, September 25, 2008

2 views (last 30 days) | 0 Likes | 10 comments

In Sarah's blog, Dan asked about speed of removing elements. There are a number of ways of deleting elements in MATLAB. So, what's the "best" way?

Vector Data
Methods That Come to Mind
Test Various Methods
Keep Most Elements
Keep Few Elements
Commentary and Question

Vector Data

I only work with vector, or 1-D data for this blog. And I use linear indexing (i.e., indexing with only 1 number). First let me generate some points, and perform an fft.

nPts = 1e7;
cutoff = round(nPts/2);
A = rand(nPts,1);
B = fft(A);

I want to keep a copy of B off to the side so I can do more experiments on it later. I want to be sure the copy doesn't share memory with the array being used later on, so I modify the first element (keeping it the same value).

Borig = B;
Borig(1) = Borig(1) + sin(0);

Methods That Come to Mind

Let me list the methods of paring down an array that I can think of quickly.

Find the values to keep and assign them to a new array.

Find the values to delete and set them to empty ([]) using end.

Find the values to delete and specify them explicitly, i.e., without using end.

Identify the original values and place them in a new output array.

Test Various Methods

t1 = tic;
B=B(1:cutoff);
toc(t1)
B = Borig;
Borig(1) = Borig(1) + sin(0);
t2 = tic;
B(cutoff+1:end)=[];
toc(t2)
B = Borig;
Borig(1) = Borig(1) + sin(0);
t3 = tic;
B(cutoff+1:nPts)=[];
toc(t3)
B = Borig;
Borig(1) = Borig(1) + sin(0);
t4 = tic;
C = B(1:cutoff);
toc(t4)

Elapsed time is 0.131063 seconds.
Elapsed time is 0.322803 seconds.
Elapsed time is 0.327507 seconds.
Elapsed time is 0.125420 seconds.

It doesn't seem like using end vs. the actual number for doing the indexing makes nearly as much difference as whether or not I copy the elements to keep or delete the elements to remove.

Keep Most Elements

Let's repeat the analysis but retain most of the elements this time.

B = Borig;
Borig(1) = Borig(1) + sin(0);
cutoff  = round(0.95*nPts)

cutoff =
     9500000

Test the various methods.

t1 = tic;
B=B(1:cutoff);
toc(t1)
B = Borig;
Borig(1) = Borig(1) + sin(0);
t2 = tic;
B(cutoff+1:end)=[];
toc(t2)
B = Borig;
Borig(1) = Borig(1) + sin(0);
t3 = tic;
B(cutoff+1:nPts)=[];
toc(t3)
B = Borig;
Borig(1) = Borig(1) + sin(0);
t4 = tic;
C = B(1:cutoff);
toc(t4)

Elapsed time is 0.240744 seconds.
Elapsed time is 0.510874 seconds.
Elapsed time is 0.440639 seconds.
Elapsed time is 0.243518 seconds.

Interesing. I see roughly the same time pattern as earlier, when I saved half the elements.

Keep Few Elements

Finally, sticking to the vector case, keep very few of the elements this time.

B = Borig;
Borig(1) = Borig(1) + sin(0);
cutoff  = round(0.05*nPts)

cutoff =
      500000

Test the various methods.

t1 = tic;
B=B(1:cutoff);
toc(t1)
B = Borig;
Borig(1) = Borig(1) + sin(0);
t2 = tic;
B(cutoff+1:end)=[];
toc(t2)
B = Borig;
Borig(1) = Borig(1) + sin(0);
t3 = tic;
B(cutoff+1:nPts)=[];
toc(t3)
B = Borig;
Borig(1) = Borig(1) + sin(0);
t4 = tic;
C = B(1:cutoff);
toc(t4)

Elapsed time is 0.026702 seconds.
Elapsed time is 0.194346 seconds.
Elapsed time is 0.195731 seconds.
Elapsed time is 0.026779 seconds.

Commentary and Question

I did this experiment with 10 million data elements. If I do it with orders of magnitude fewer points, the results are less dramatically different, though similar. My conclusion is that it is faster to keep the elements you want than to delete the ones you don't. However, it depends on how you reach the state of culling the data.

Suppose I'm handed the indices of the unwanted elements. Am I better off just using these to delete values or taking the hit to identify the wanted ones and using the these to keep elements? Here's my thinking. If I'm given the unwanted indices, to create the desired ones, I need to make an array of all indices, and then deleted the unwanted ones from this larger array to get the keepers. Then I have do to the actual indexing for the values. So, if presented with indices already, whether ones to be kept or deleted, use them. If not, calculate the keepers and assign these values to the required array.

Do you find yourself in situations where it is unnatural to calculate or specify the values you want to keep? If so, please post here because I am curious about those situations.

Published with MATLAB® 7.6