{"id":155,"date":"2008-09-25T07:32:20","date_gmt":"2008-09-25T12:32:20","guid":{"rendered":"https:\/\/blogs.mathworks.com\/loren\/2008\/09\/25\/timing-extraction-of-parts-of-an-array\/"},"modified":"2016-08-03T15:03:07","modified_gmt":"2016-08-03T20:03:07","slug":"timing-extraction-of-parts-of-an-array","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/loren\/2008\/09\/25\/timing-extraction-of-parts-of-an-array\/","title":{"rendered":"Timing Extraction of Parts of an Array"},"content":{"rendered":"<div xmlns:mwsh=\"https:\/\/www.mathworks.com\/namespace\/mcode\/v1\/syntaxhighlight.dtd\" class=\"content\">\r\n   <introduction>\r\n      <p>In <a href=\"https:\/\/blogs.mathworks.com\/loren\/2008\/06\/25\/speeding-up-matlab-applications\/\">Sarah's blog<\/a>, Dan asked about speed of removing elements.  There are a number of ways of deleting elements in MATLAB.  So, what's the\r\n         \"best\" way?\r\n      <\/p>\r\n   <\/introduction>\r\n   <h3>Contents<\/h3>\r\n   <div>\r\n      <ul>\r\n         <li><a href=\"#1\">Vector Data<\/a><\/li>\r\n         <li><a href=\"#3\">Methods That Come to Mind<\/a><\/li>\r\n         <li><a href=\"#4\">Test Various Methods<\/a><\/li>\r\n         <li><a href=\"#6\">Keep Most Elements<\/a><\/li>\r\n         <li><a href=\"#9\">Keep Few Elements<\/a><\/li>\r\n         <li><a href=\"#11\">Commentary and Question<\/a><\/li>\r\n      <\/ul>\r\n   <\/div>\r\n   <h3>Vector Data<a name=\"1\"><\/a><\/h3>\r\n   <p>I only work with vector, or 1-D data for this blog.  And I use linear <a href=\"https:\/\/blogs.mathworks.com\/loren\/category\/indexing\/\">indexing<\/a> (i.e., indexing with only 1 number). First let me generate some points, and perform an <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/fft.html\">fft<\/a>.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">nPts = 1e7;\r\ncutoff = round(nPts\/2);\r\nA = rand(nPts,1);\r\nB = fft(A);<\/pre><p>I want to keep a copy of <tt>B<\/tt> off to the side so I can do more experiments on it later.  I want to be sure the copy doesn't share memory with the array\r\n      being used later on, so I modify the first element (keeping it the same value).\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">Borig = B;\r\nBorig(1) = Borig(1) + sin(0);<\/pre><h3>Methods That Come to Mind<a name=\"3\"><\/a><\/h3>\r\n   <p>Let me list the methods of paring down an array that I can think of quickly.<\/p>\r\n   <li>Find the values to keep and assign them to a new array.<\/li>\r\n   <li>Find the values to delete and set them to empty (<tt>[]<\/tt>) using <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/end.html\"><tt>end<\/tt><\/a>.\r\n   <\/li>\r\n   <li>Find the values to delete and specify them explicitly, i.e., without using <tt>end<\/tt>.\r\n   <\/li>\r\n   <li>Identify the original values and place them in a new output array.<\/li>\r\n   <h3>Test Various Methods<a name=\"4\"><\/a><\/h3><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">t1 = tic;\r\nB=B(1:cutoff);\r\ntoc(t1)\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\nt2 = tic;\r\nB(cutoff+1:end)=[];\r\ntoc(t2)\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\nt3 = tic;\r\nB(cutoff+1:nPts)=[];\r\ntoc(t3)\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\nt4 = tic;\r\nC = B(1:cutoff);\r\ntoc(t4)<\/pre><pre style=\"font-style:oblique\">Elapsed time is 0.131063 seconds.\r\nElapsed time is 0.322803 seconds.\r\nElapsed time is 0.327507 seconds.\r\nElapsed time is 0.125420 seconds.\r\n<\/pre><p>It doesn't seem like using <tt>end<\/tt> vs. the actual number for doing the indexing makes nearly as much difference as whether or not I copy the elements to keep\r\n      or delete the elements to remove.\r\n   <\/p>\r\n   <h3>Keep Most Elements<a name=\"6\"><\/a><\/h3>\r\n   <p>Let's repeat the analysis but retain most of the elements this time.<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">B = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\ncutoff  = round(0.95*nPts)<\/pre><pre style=\"font-style:oblique\">cutoff =\r\n     9500000\r\n<\/pre><p>Test the various methods.<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">t1 = tic;\r\nB=B(1:cutoff);\r\ntoc(t1)\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\nt2 = tic;\r\nB(cutoff+1:end)=[];\r\ntoc(t2)\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\nt3 = tic;\r\nB(cutoff+1:nPts)=[];\r\ntoc(t3)\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\nt4 = tic;\r\nC = B(1:cutoff);\r\ntoc(t4)<\/pre><pre style=\"font-style:oblique\">Elapsed time is 0.240744 seconds.\r\nElapsed time is 0.510874 seconds.\r\nElapsed time is 0.440639 seconds.\r\nElapsed time is 0.243518 seconds.\r\n<\/pre><p>Interesing.  I see roughly the same time pattern as earlier, when I saved half the elements.<\/p>\r\n   <h3>Keep Few Elements<a name=\"9\"><\/a><\/h3>\r\n   <p>Finally, sticking to the vector case, keep very few of the elements this time.<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">B = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\ncutoff  = round(0.05*nPts)<\/pre><pre style=\"font-style:oblique\">cutoff =\r\n      500000\r\n<\/pre><p>Test the various methods.<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">t1 = tic;\r\nB=B(1:cutoff);\r\ntoc(t1)\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\nt2 = tic;\r\nB(cutoff+1:end)=[];\r\ntoc(t2)\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\nt3 = tic;\r\nB(cutoff+1:nPts)=[];\r\ntoc(t3)\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\nt4 = tic;\r\nC = B(1:cutoff);\r\ntoc(t4)<\/pre><pre style=\"font-style:oblique\">Elapsed time is 0.026702 seconds.\r\nElapsed time is 0.194346 seconds.\r\nElapsed time is 0.195731 seconds.\r\nElapsed time is 0.026779 seconds.\r\n<\/pre><h3>Commentary and Question<a name=\"11\"><\/a><\/h3>\r\n   <p>I did this experiment with 10 million data elements.  If I do it with orders of magnitude fewer points, the results are less\r\n      dramatically different, though similar.  My conclusion is that it is faster to keep the elements you want than to delete the\r\n      ones you don't.  However, it depends on how you reach the state of culling the data.\r\n   <\/p>\r\n   <p>Suppose I'm handed the indices of the unwanted elements.  Am I better off just using these to delete values or taking the\r\n      hit to identify the wanted ones and using the these to keep elements?  Here's my thinking. If I'm given the unwanted indices,\r\n      to create the desired ones, I need to make an array of all indices, and then deleted the unwanted ones from this larger array\r\n      to get the keepers.  Then I have do to the actual indexing for the values.  So, if presented with indices already, whether\r\n      ones to be kept or deleted, use them.  If not, calculate the keepers and assign these values to the required array.\r\n   <\/p>\r\n   <p>Do you find yourself in situations where it is unnatural to calculate or specify the values you want to keep?  If so, please\r\n      post <a href=\"blogs.mathworks.com\/loren\/?p=155#respond\">here<\/a> because I am curious about those situations.\r\n   <\/p><script language=\"JavaScript\">\r\n<!--\r\n\r\n    function grabCode_cd16eeffb6314cd0989203fb8fa96b45() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='cd16eeffb6314cd0989203fb8fa96b45 ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' cd16eeffb6314cd0989203fb8fa96b45';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. \r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        author = 'Loren Shure';\r\n        copyright = 'Copyright 2008 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add author and copyright lines at the bottom if specified.\r\n        if ((author.length > 0) || (copyright.length > 0)) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (author.length > 0) {\r\n                d.writeln('% _' + author + '_');\r\n            }\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n      \r\n      d.title = title + ' (MATLAB code)';\r\n      d.close();\r\n      }   \r\n      \r\n-->\r\n<\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_cd16eeffb6314cd0989203fb8fa96b45()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n            the MATLAB code \r\n            <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; 7.6<br><\/p>\r\n<\/div>\r\n<!--\r\ncd16eeffb6314cd0989203fb8fa96b45 ##### SOURCE BEGIN #####\r\n%% Timing Extraction of Parts of an Array\r\n% In <https:\/\/blogs.mathworks.com\/loren\/2008\/06\/25\/speeding-up-matlab-applications\/ Sarah's blog>,\r\n% Dan asked about speed of removing elements.  There are a number of ways\r\n% of deleting elements in MATLAB.  So, what's the \"best\" way?\r\n%% Vector Data\r\n% I only work with vector, or 1-D data for this blog.  And I use\r\n% <https:\/\/www.mathworks.com\/access\/helpdesk\/help\/techdoc\/matlab_prog\/f1-85462.html#f1-85511 linear>\r\n% <https:\/\/blogs.mathworks.com\/loren\/category\/indexing\/ indexing> (i.e.,\r\n% indexing with only 1 number). First let me generate some points, and\r\n% perform an\r\n% <https:\/\/www.mathworks.com\/help\/matlab\/ref\/fft.html fft>.\r\nnPts = 1e7;\r\ncutoff = round(nPts\/2);\r\nA = rand(nPts,1);\r\nB = fft(A); \r\n%% \r\n% I want to keep a copy of |B| off to the side so I can do more experiments\r\n% on it later.  I want to be sure the copy doesn't share memory with the\r\n% array being used later on, so I modify the first element (keeping it the\r\n% same value).\r\nBorig = B;\r\nBorig(1) = Borig(1) + sin(0);\r\n%% Methods That Come to Mind\r\n% Let me list the methods of paring down an array that I can think of\r\n% quickly.\r\n%\r\n% # Find the values to keep and assign them to a new array.\r\n% # Find the values to delete and set them to empty (|[]|) using \r\n% <https:\/\/www.mathworks.com\/help\/matlab\/ref\/end.html |end|>.\r\n% # Find the values to delete and specify them explicitly, i.e., without using\r\n% |end|.\r\n% # Identify the original values and place them in a new output array.\r\n%% Test Various Methods\r\nt1 = tic;\r\nB=B(1:cutoff); \r\ntoc(t1)\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\nt2 = tic;\r\nB(cutoff+1:end)=[]; \r\ntoc(t2)\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\nt3 = tic;\r\nB(cutoff+1:nPts)=[]; \r\ntoc(t3)\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\nt4 = tic;\r\nC = B(1:cutoff); \r\ntoc(t4)\r\n%%\r\n% It doesn't seem like using |end| vs. the actual number for doing the\r\n% indexing makes nearly as much difference as whether or not I copy the\r\n% elements to keep or delete the elements to remove.\r\n%% Keep Most Elements\r\n% Let's repeat the analysis but retain most of the elements this time.\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\ncutoff  = round(0.95*nPts)\r\n%%\r\n% Test the various methods.\r\nt1 = tic;\r\nB=B(1:cutoff); \r\ntoc(t1)\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\nt2 = tic;\r\nB(cutoff+1:end)=[]; \r\ntoc(t2)\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\nt3 = tic;\r\nB(cutoff+1:nPts)=[]; \r\ntoc(t3)\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\nt4 = tic;\r\nC = B(1:cutoff); \r\ntoc(t4)\r\n%%\r\n% Interesing.  I see roughly the same time pattern as earlier, when I saved\r\n% half the elements.\r\n%% Keep Few Elements\r\n% Finally, sticking to the vector case, keep very few of the elements this\r\n% time.\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\ncutoff  = round(0.05*nPts)\r\n%%\r\n% Test the various methods.\r\nt1 = tic;\r\nB=B(1:cutoff); \r\ntoc(t1)\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\nt2 = tic;\r\nB(cutoff+1:end)=[]; \r\ntoc(t2)\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\nt3 = tic;\r\nB(cutoff+1:nPts)=[]; \r\ntoc(t3)\r\nB = Borig;\r\nBorig(1) = Borig(1) + sin(0);\r\nt4 = tic;\r\nC = B(1:cutoff); \r\ntoc(t4)\r\n%% Commentary and Question\r\n% I did this experiment with 10 million data elements.  If I do it with\r\n% orders of magnitude fewer points, the results are less dramatically\r\n% different, though similar.  My conclusion is that it is faster to keep\r\n% the elements you want than to delete the ones you don't.  However, it\r\n% depends on how you reach the state of culling the data.\r\n%% \r\n% Suppose I'm handed the indices of the unwanted elements.  Am I better off\r\n% just using these to delete values or taking the hit to identify the \r\n% wanted ones and using the these to keep elements?  Here's my thinking. If\r\n% I'm given the unwanted indices, to create the desired ones, I need to\r\n% make an array of all indices, and then deleted the unwanted ones from\r\n% this larger array to get the keepers.  Then I have do to the actual\r\n% indexing for the values.  So, if presented with indices already, whether\r\n% ones to be kept or deleted, use them.  If not, calculate the keepers and\r\n% assign these values to the required array. \r\n%% \r\n% Do you find yourself in situations where it is unnatural to calculate or\r\n% specify the values you want to keep?  If so, please post\r\n% <blogs.mathworks.com\/loren\/?p=155#respond here> because I am curious\r\n% about those situations.\r\n\r\n##### SOURCE END ##### cd16eeffb6314cd0989203fb8fa96b45\r\n-->","protected":false},"excerpt":{"rendered":"<p>\r\n   \r\n      In Sarah's blog, Dan asked about speed of removing elements.  There are a number of ways of deleting elements in MATLAB.  So, what's the\r\n         \"best\" way?\r\n    ... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/loren\/2008\/09\/25\/timing-extraction-of-parts-of-an-array\/\">read more >><\/a><\/p>","protected":false},"author":39,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[10,4],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/155"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/users\/39"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/comments?post=155"}],"version-history":[{"count":1,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/155\/revisions"}],"predecessor-version":[{"id":1929,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/155\/revisions\/1929"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/media?parent=155"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/categories?post=155"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/tags?post=155"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}