Timing Extraction of Parts of an Array
In Sarah's blog, Dan asked about speed of removing elements. There are a number of ways of deleting elements in MATLAB. So, what's the "best" way?
Contents
Vector Data
I only work with vector, or 1-D data for this blog. And I use linear indexing (i.e., indexing with only 1 number). First let me generate some points, and perform an fft.
nPts = 1e7; cutoff = round(nPts/2); A = rand(nPts,1); B = fft(A);
I want to keep a copy of B off to the side so I can do more experiments on it later. I want to be sure the copy doesn't share memory with the array being used later on, so I modify the first element (keeping it the same value).
Borig = B; Borig(1) = Borig(1) + sin(0);
Methods That Come to Mind
Let me list the methods of paring down an array that I can think of quickly.
Test Various Methods
t1 = tic; B=B(1:cutoff); toc(t1) B = Borig; Borig(1) = Borig(1) + sin(0); t2 = tic; B(cutoff+1:end)=[]; toc(t2) B = Borig; Borig(1) = Borig(1) + sin(0); t3 = tic; B(cutoff+1:nPts)=[]; toc(t3) B = Borig; Borig(1) = Borig(1) + sin(0); t4 = tic; C = B(1:cutoff); toc(t4)
Elapsed time is 0.131063 seconds. Elapsed time is 0.322803 seconds. Elapsed time is 0.327507 seconds. Elapsed time is 0.125420 seconds.
It doesn't seem like using end vs. the actual number for doing the indexing makes nearly as much difference as whether or not I copy the elements to keep or delete the elements to remove.
Keep Most Elements
Let's repeat the analysis but retain most of the elements this time.
B = Borig; Borig(1) = Borig(1) + sin(0); cutoff = round(0.95*nPts)
cutoff =
     9500000
Test the various methods.
t1 = tic; B=B(1:cutoff); toc(t1) B = Borig; Borig(1) = Borig(1) + sin(0); t2 = tic; B(cutoff+1:end)=[]; toc(t2) B = Borig; Borig(1) = Borig(1) + sin(0); t3 = tic; B(cutoff+1:nPts)=[]; toc(t3) B = Borig; Borig(1) = Borig(1) + sin(0); t4 = tic; C = B(1:cutoff); toc(t4)
Elapsed time is 0.240744 seconds. Elapsed time is 0.510874 seconds. Elapsed time is 0.440639 seconds. Elapsed time is 0.243518 seconds.
Interesing. I see roughly the same time pattern as earlier, when I saved half the elements.
Keep Few Elements
Finally, sticking to the vector case, keep very few of the elements this time.
B = Borig; Borig(1) = Borig(1) + sin(0); cutoff = round(0.05*nPts)
cutoff =
      500000
Test the various methods.
t1 = tic; B=B(1:cutoff); toc(t1) B = Borig; Borig(1) = Borig(1) + sin(0); t2 = tic; B(cutoff+1:end)=[]; toc(t2) B = Borig; Borig(1) = Borig(1) + sin(0); t3 = tic; B(cutoff+1:nPts)=[]; toc(t3) B = Borig; Borig(1) = Borig(1) + sin(0); t4 = tic; C = B(1:cutoff); toc(t4)
Elapsed time is 0.026702 seconds. Elapsed time is 0.194346 seconds. Elapsed time is 0.195731 seconds. Elapsed time is 0.026779 seconds.
Commentary and Question
I did this experiment with 10 million data elements. If I do it with orders of magnitude fewer points, the results are less dramatically different, though similar. My conclusion is that it is faster to keep the elements you want than to delete the ones you don't. However, it depends on how you reach the state of culling the data.
Suppose I'm handed the indices of the unwanted elements. Am I better off just using these to delete values or taking the hit to identify the wanted ones and using the these to keep elements? Here's my thinking. If I'm given the unwanted indices, to create the desired ones, I need to make an array of all indices, and then deleted the unwanted ones from this larger array to get the keepers. Then I have do to the actual indexing for the values. So, if presented with indices already, whether ones to be kept or deleted, use them. If not, calculate the keepers and assign these values to the required array.
Do you find yourself in situations where it is unnatural to calculate or specify the values you want to keep? If so, please post here because I am curious about those situations.
- 범주:
- Efficiency,
- Indexing


 
                
               
               
               
              
댓글
댓글을 남기려면 링크 를 클릭하여 MathWorks 계정에 로그인하거나 계정을 새로 만드십시오.