{"id":3667,"date":"2020-04-30T06:13:15","date_gmt":"2020-04-30T11:13:15","guid":{"rendered":"https:\/\/blogs.mathworks.com\/loren\/?p=3667"},"modified":"2020-05-04T13:40:53","modified_gmt":"2020-05-04T17:40:53","slug":"faster-indexing-in-tables-datetime-arrays-and-other-data-types","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/loren\/2020\/04\/30\/faster-indexing-in-tables-datetime-arrays-and-other-data-types\/","title":{"rendered":"Faster Indexing in Tables, datetime Arrays, and Other Data Types"},"content":{"rendered":"<div class=\"content\"><!--introduction--><p>Today I'd like to introduce a guest blogger, Stephen Doe, who works for the MATLAB Documentation team here at MathWorks. In today's post, Stephen discusses how to take advantage of recent performance improvements when indexing into tables. The same approach applies to many different data types. While the release notes describe the performance improvements, in today's post Stephen also offers further advice based on a simple code example.<\/p><!--\/introduction--><h3>Contents<\/h3><div><ul><li><a href=\"#a76be6c7-9ca1-4be1-a0ec-ccfc4943757e\">So, What Has Improved, and How?<\/a><\/li><li><a href=\"#ba45ce3c-d9c3-4f66-85f2-3759100dd287\">Assign Table Elements in <tt>for<\/tt>-Loop<\/a><\/li><li><a href=\"#36429fea-baa9-41ed-9dbe-b3308f06d5c2\">Scripts and <tt>try-catch<\/tt> Considered Harmful<\/a><\/li><li><a href=\"#56dfd26a-6e08-49b5-a4a8-3a34b52a9f9f\">Best Practices<\/a><\/li><li><a href=\"#53499a28-cb90-4412-957b-9422d6919136\">Code Samples<\/a><\/li><\/ul><\/div><h4>So, What Has Improved, and How?<a name=\"a76be6c7-9ca1-4be1-a0ec-ccfc4943757e\"><\/a><\/h4><p>As of R2020a, the MATLAB data types team has delivered substantial performance improvements for indexing into certain types of arrays. The improved performance comes from in-place optimizations. They are most apparent when you access or assign values to many array elements within a <tt>for<\/tt>-loop. 
The improved data types are:<\/p><div><ul><li><tt><a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/calendarduration.html\">calendarDuration<\/a><\/tt><\/li><li><tt><a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/categorical.html\">categorical<\/a><\/tt><\/li><li><tt><a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/datetime.html\">datetime<\/a><\/tt><\/li><li><tt><a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/duration.html\">duration<\/a><\/tt><\/li><li><tt><a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/table.html\">table<\/a><\/tt><\/li><li><tt><a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/timetable.html\">timetable<\/a><\/tt><\/li><\/ul><\/div><p>The improvements were delivered over two releases (R2019b and R2020a). So, in this post I compare the performance of R2020a to R2019a, the latest release without any of these improvements. Overall, we see that for these data types, assignments to array elements are many times faster than in R2019a.<\/p><div><ul><li><b>1.5-2x<\/b> faster when assigning to elements of a <tt>table<\/tt> or <tt>timetable<\/tt> variable<\/li><li><b>Order of magnitude<\/b> faster (at least) when assigning to elements of <tt>calendarDuration<\/tt>, <tt>categorical<\/tt>, <tt>datetime<\/tt>, and <tt>duration<\/tt> arrays<\/li><\/ul><\/div><p>You can find the full details, including test code that illustrates the performance improvements between releases, in the release notes. As it happens, we have a new format for describing performance enhancements with more quantitative detail. 
These links take you directly to the performance release notes for R2019b and R2020a:<\/p><div><ul><li><a href=\"https:\/\/www.mathworks.com\/help\/matlab\/release-notes.html?category=performance&amp;rntext=&amp;startrelease=R2019b&amp;endrelease=R2019b&amp;groupby=release&amp;sortby=descending&amp;searchHighlight=\">R2019b Performance Notes<\/a><\/li><li><a href=\"https:\/\/www.mathworks.com\/help\/matlab\/release-notes.html?category=performance&amp;rntext=&amp;startrelease=R2020a&amp;endrelease=R2020a&amp;groupby=release&amp;sortby=descending&amp;searchHighlight=\">R2020a Performance Notes<\/a><\/li><\/ul><\/div><p>With that kind of detail, is there more that needs to be said? <b>Yes<\/b>. I'm going to show you how best to take advantage of these performance improvements. I'm also going to explain some circumstances where these improvements <b>don't<\/b> take effect.<\/p><h4>Assign Table Elements in <tt>for<\/tt>-Loop<a name=\"ba45ce3c-d9c3-4f66-85f2-3759100dd287\"><\/a><\/h4><p>Here's a simple example that takes full advantage of the indexing performance improvements. In this example, I calculate the position of a projectile at regular time steps. I use a formula where the position and velocity at each step depend on the position and velocity of the previous step. I create a table and I assign the positions and velocities to rows of the table using a <tt>for<\/tt>-loop. The figure shows the path taken by the projectile in this example.<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2020\/projectile.png\" alt=\"\"> <\/p><p>Here is a function, named <tt>projectile<\/tt>, that calculates positions and velocities and then assigns them to a table. (It depends on another function, <tt>step<\/tt>, that calculates position and velocity at one time step.) Note that in the <tt>for<\/tt>-loop, I use dot notation to access the table variables. 
Then I use the loop counter to index into the variables (<tt>T.x(i)<\/tt>, <tt>T.y(i)<\/tt>, and so on). (I've attached the <tt>projectile<\/tt> and <tt>step<\/tt> functions to the end of this blog post.)<\/p><pre class=\"language-matlab\">\r\n<span class=\"keyword\">function<\/span> T = projectile(v,angle)\r\n<span class=\"comment\">% PROJECTILE.M - Function to create table of projectile positions and<\/span>\r\n<span class=\"comment\">% velocities under free fall.<\/span>\r\ndt = 0.001;    <span class=\"comment\">% 0.001 seconds<\/span>\r\nT = table(<span class=\"string\">'Size'<\/span>,[30\/dt 4],<span class=\"string\">'VariableTypes'<\/span>,[<span class=\"string\">\"double\"<\/span>,<span class=\"string\">\"double\"<\/span>,<span class=\"string\">\"double\"<\/span>,<span class=\"string\">\"double\"<\/span>],<span class=\"string\">'VariableNames'<\/span>,[<span class=\"string\">\"x\"<\/span>,<span class=\"string\">\"y\"<\/span>,<span class=\"string\">\"vx\"<\/span>,<span class=\"string\">\"vy\"<\/span>]);\r\n\r\nT{1,:} = [0 0 v*cosd(angle) v*sind(angle)];\r\n<span class=\"keyword\">for<\/span> i = 2:height(T)\r\n    [T.x(i), T.y(i), T.vx(i), T.vy(i)] = step(T.x(i-1), T.y(i-1), T.vx(i-1), T.vy(i-1), dt);\r\n<span class=\"keyword\">end<\/span>\r\n\r\n<span class=\"keyword\">end<\/span>\r\n\r\n<\/pre><p>Let's call the <tt>projectile<\/tt> function with a starting velocity of 50 m\/s and an angle of 45 degrees. With this call, the output table has 30,000 rows. 
I'll use the <tt>head<\/tt> function to display the first five rows.<\/p><pre class=\"codeinput\">T = projectile(50,45);\r\nhead(T,5)\r\n<\/pre><pre class=\"codeoutput\">ans =\r\n  5&times;4 table\r\n       x           y          vx        vy  \r\n    ________    ________    ______    ______\r\n           0           0    35.355    35.355\r\n    0.035355    0.035341    35.355    35.346\r\n    0.070711    0.070671    35.355    35.336\r\n     0.10607     0.10599    35.355    35.326\r\n     0.14142      0.1413    35.355    35.316\r\n<\/pre><p>Now let's use <tt>tic<\/tt> and <tt>toc<\/tt> to estimate the execution time for this function in two different versions of MATLAB: R2019a (before recent improvements) and R2020a.<\/p><pre class=\"language-matlab\">tic\r\nT = projectile(50,45);\r\ntoc\r\n<\/pre><p><b>R2019a:<\/b> <b>13.06<\/b> seconds<\/p><p><b>R2020a:<\/b> <b>7.64<\/b> seconds<\/p><p>(I made both calls on the same machine: a Windows 10, Intel&reg; Xeon&reg; W-2133 @ 3.60 GHz test system.)<\/p><p>This code is about 1.7x faster in R2020a! And you'll see performance improvements that are at least this good with very large tables and timetables (millions of rows), and also in <tt>categorical<\/tt> and <tt>datetime<\/tt> arrays.<\/p><p>Now, note that I wrote this example in such a way as to force use of a <tt>for<\/tt>-loop. If you can, the best strategy of all is to <i>vectorize<\/i> your code. \"Vectorizing\" code pretty much means operating on arrays instead of elements of an array. For example, it's much more efficient to call <tt>z = x + y<\/tt> instead of calling <tt>z(i) = x(i) + y(i)<\/tt> in a <tt>for<\/tt>-loop.<\/p><p>But if you have code that you can't vectorize, because it accesses many other table or array elements--like the code in my example--then the performance improvements for these data types will help your code run faster.<\/p><p>So now we're good to go, right? 
Not so fast.<\/p><h4>Scripts and <tt>try-catch<\/tt> Considered Harmful<a name=\"36429fea-baa9-41ed-9dbe-b3308f06d5c2\"><\/a><\/h4><p>All right, <i>harmful<\/i> is an exaggeration. It's perfectly fine to put your code in a script, or within a <tt>try-catch<\/tt> block, or to work with variables in the workspace. In all these cases, the code I wrote for the <tt>projectile<\/tt> function returns the exact same table.<\/p><p>But if that code is in a script or a <tt>try-catch<\/tt> block, or if you are interactively working with workspace variables, then you see little or no performance enhancement. For example, let me rewrite the code as a script, in the file <tt>projectile_script.m<\/tt>. Within the script, I assign the same starting velocity and angle as above. (I've attached a copy of <tt>projectile_script.m<\/tt> to the end of this blog post.)<\/p><pre class=\"language-matlab\">\r\n<span class=\"comment\">% PROJECTILE_SCRIPT.M - Script that creates table of projectile positions<\/span>\r\n<span class=\"comment\">% and velocities under free fall.<\/span>\r\ndt = 0.001;    <span class=\"comment\">% 0.001 seconds<\/span>\r\nv = 50;        <span class=\"comment\">% 50 m\/s<\/span>\r\nangle = 45;    <span class=\"comment\">% 45 degrees<\/span>\r\n\r\nT = table(<span class=\"string\">'Size'<\/span>,[30\/dt 4],<span class=\"string\">'VariableTypes'<\/span>,[<span class=\"string\">\"double\"<\/span>,<span class=\"string\">\"double\"<\/span>,<span class=\"string\">\"double\"<\/span>,<span class=\"string\">\"double\"<\/span>],<span class=\"string\">'VariableNames'<\/span>,[<span class=\"string\">\"x\"<\/span>,<span class=\"string\">\"y\"<\/span>,<span class=\"string\">\"vx\"<\/span>,<span class=\"string\">\"vy\"<\/span>]);\r\n\r\nT{1,:} = [0 0 v*cosd(angle) v*sind(angle)];\r\n<span class=\"keyword\">for<\/span> i = 2:height(T)\r\n    [T.x(i), T.y(i), T.vx(i), T.vy(i)] = step(T.x(i-1), T.y(i-1), T.vx(i-1), T.vy(i-1), dt);\r\n<span class=\"keyword\">end<\/span>\r\n\r\n<\/pre><p>Now I call the 
script and time it:<\/p><pre>tic\r\nprojectile_script\r\ntoc<\/pre><p><b>R2019a:<\/b> <b>12.36<\/b> seconds<\/p><p><b>R2020a:<\/b> <b>11.83<\/b> seconds<\/p><p>The difference between the two releases is now <b>much<\/b> smaller! What happened?<\/p><p>Well, it's complicated. In essence, MATLAB applies in-place optimizations to the indexing done in this line of code:<\/p><pre class=\"language-matlab\">[T.x(i), T.y(i), T.vx(i), T.vy(i)] = step(T.x(i-1), T.y(i-1), T.vx(i-1), T.vy(i-1), dt);\r\n<\/pre><p>To take advantage of the in-place optimizations for these data types, you must perform the indexing within a function. If you instead index into workspace variables or index within a script, you won't see the full performance improvement.<\/p><p>Also, the improvement is lost when the indexing code is within a <tt>try-catch<\/tt> block, even when that block is itself within a function. But in such cases, you can get the performance back by putting the indexing code into a separate function. (P.S. The same is true for nested functions.)<\/p><p>I won't go into the full details about <tt>try-catch<\/tt> blocks in this post. However, I have attached two files to the end of this post, <tt>projectile_try_catch.m<\/tt> and <tt>projectile_try_regained.m<\/tt>. 
These files show the problem and its workaround.<\/p><p>To sum up, this table shows the different pieces of code that I wrote for this post, and the performance you can expect in each case.<\/p><p>\r\n<table border=1><tr><td><b>Example File<\/b><\/td><td><b>Type of Code<\/b><\/td><td><b>Execution Time<\/b><\/td><\/tr>\r\n<tr><td><tt>projectile.m<\/tt><\/td><td>function<\/td><td>7.64 s<\/td><\/tr>\r\n<tr><td><tt>projectile_script.m<\/tt><\/td><td>script<\/td><td>11.83 s<\/td><\/tr>\r\n<tr><td><tt>projectile_try_catch.m<\/tt><\/td><td>indexing code in <tt>try-catch<\/tt> block<\/td><td>11.75 s<\/td><\/tr>\r\n<tr><td><tt>projectile_try_regained.m<\/tt><\/td><td><tt>try-catch<\/tt> block, but indexing code in separate function<\/td><td>7.43 s<\/td><\/tr><\/table>\r\n<\/p><h4>Best Practices<a name=\"56dfd26a-6e08-49b5-a4a8-3a34b52a9f9f\"><\/a><\/h4><p>Here are the best practices to keep in mind to get the best performance when writing code for data types such as <tt>table<\/tt>, <tt>datetime<\/tt>, and <tt>categorical<\/tt>:<\/p><div><ul><li><b>DO<\/b> vectorize code when you can. For example, operate on table variables (<tt>T.X<\/tt>) instead of elements of table variables (<tt>T.X(i)<\/tt>).<\/li><li><b>DO<\/b> put your <tt>table<\/tt>, <tt>datetime<\/tt>, and <tt>categorical<\/tt> indexing code in functions, if you're doing a lot of indexing and can't vectorize your code.<\/li><li><b>AVOID<\/b> scripts, at least for code that does a lot of indexing. Put the indexing code in a function.<\/li><li><b>AVOID<\/b> try-catch blocks for code that does a lot of indexing. Put it in its own function.<\/li><\/ul><\/div><p>Since the MATLAB data types team continues to work on improving performance, we'd love to hear more about your experience with our data types. 
Please tell us more about your challenges using tables, timetables, <tt>datetime<\/tt>, or <tt>categorical<\/tt> arrays <a href=\"https:\/\/blogs.mathworks.com\/loren\/?p=3667#respond\">here<\/a>.<\/p><h4>Code Samples<a name=\"53499a28-cb90-4412-957b-9422d6919136\"><\/a><\/h4><pre class=\"language-matlab\">\r\n<span class=\"keyword\">function<\/span> [x, y, vx, vy] = step(x,y,vx,vy,dt)\r\n<span class=\"comment\">% STEP.M - Calculate position and velocity based on input position, <\/span>\r\n<span class=\"comment\">% velocity, and local gravitational acceleration. Positions in meters,<\/span>\r\n<span class=\"comment\">% velocities in m\/s, dt in seconds. <\/span>\r\ng = -9.8;  <span class=\"comment\">% -9.8 m\/s^2<\/span>\r\n\r\nvy = vy + g*dt;\r\ny = y + vy*dt + (g\/2)*dt^2;\r\n\r\nx = x + vx*dt;\r\n<span class=\"keyword\">end<\/span>\r\n\r\n\r\n\r\n<span class=\"keyword\">function<\/span> T = projectile(v,angle)\r\n<span class=\"comment\">% PROJECTILE.M - Function to create table of projectile positions and<\/span>\r\n<span class=\"comment\">% velocities under free fall.<\/span>\r\ndt = 0.001;    <span class=\"comment\">% 0.001 seconds<\/span>\r\nT = table(<span class=\"string\">'Size'<\/span>,[30\/dt 4],<span class=\"string\">'VariableTypes'<\/span>,[<span class=\"string\">\"double\"<\/span>,<span class=\"string\">\"double\"<\/span>,<span class=\"string\">\"double\"<\/span>,<span class=\"string\">\"double\"<\/span>],<span class=\"string\">'VariableNames'<\/span>,[<span class=\"string\">\"x\"<\/span>,<span class=\"string\">\"y\"<\/span>,<span class=\"string\">\"vx\"<\/span>,<span class=\"string\">\"vy\"<\/span>]);\r\n\r\nT{1,:} = [0 0 v*cosd(angle) v*sind(angle)];\r\n<span class=\"keyword\">for<\/span> i = 2:height(T)\r\n    [T.x(i), T.y(i), T.vx(i), T.vy(i)] = step(T.x(i-1), T.y(i-1), T.vx(i-1), T.vy(i-1), dt);\r\n<span class=\"keyword\">end<\/span>\r\n\r\n<span class=\"keyword\">end<\/span>\r\n\r\n\r\n\r\n<span class=\"comment\">% PROJECTILE_SCRIPT.M - Script 
that creates table of projectile positions<\/span>\r\n<span class=\"comment\">% and velocities under free fall.<\/span>\r\ndt = 0.001;    <span class=\"comment\">% 0.001 seconds<\/span>\r\nv = 50;        <span class=\"comment\">% 50 m\/s<\/span>\r\nangle = 45;    <span class=\"comment\">% 45 degrees<\/span>\r\n\r\nT = table(<span class=\"string\">'Size'<\/span>,[30\/dt 4],<span class=\"string\">'VariableTypes'<\/span>,[<span class=\"string\">\"double\"<\/span>,<span class=\"string\">\"double\"<\/span>,<span class=\"string\">\"double\"<\/span>,<span class=\"string\">\"double\"<\/span>],<span class=\"string\">'VariableNames'<\/span>,[<span class=\"string\">\"x\"<\/span>,<span class=\"string\">\"y\"<\/span>,<span class=\"string\">\"vx\"<\/span>,<span class=\"string\">\"vy\"<\/span>]);\r\n\r\nT{1,:} = [0 0 v*cosd(angle) v*sind(angle)];\r\n<span class=\"keyword\">for<\/span> i = 2:height(T)\r\n    [T.x(i), T.y(i), T.vx(i), T.vy(i)] = step(T.x(i-1), T.y(i-1), T.vx(i-1), T.vy(i-1), dt);\r\n<span class=\"keyword\">end<\/span>\r\n\r\n\r\n\r\n<span class=\"keyword\">function<\/span> T = projectile_try_catch(v,angle)\r\n<span class=\"comment\">% PROJECTILE_TRY_CATCH.M - A copy of PROJECTILE.M, but with a try-catch<\/span>\r\n<span class=\"comment\">% block. 
R2020a in-place optimizations are LOST because of the try-catch<\/span>\r\n<span class=\"comment\">% block.<\/span>\r\ndt = 0.001;    <span class=\"comment\">% 0.001 seconds<\/span>\r\nT = table(<span class=\"string\">'Size'<\/span>,[30\/dt 4],<span class=\"string\">'VariableTypes'<\/span>,[<span class=\"string\">\"double\"<\/span>,<span class=\"string\">\"double\"<\/span>,<span class=\"string\">\"double\"<\/span>,<span class=\"string\">\"double\"<\/span>],<span class=\"string\">'VariableNames'<\/span>,[<span class=\"string\">\"x\"<\/span>,<span class=\"string\">\"y\"<\/span>,<span class=\"string\">\"vx\"<\/span>,<span class=\"string\">\"vy\"<\/span>]);\r\n\r\n<span class=\"keyword\">try<\/span>\r\n    T{1,:} = [0 0 v*cosd(angle) v*sind(angle)];\r\n    <span class=\"keyword\">for<\/span> i = 2:height(T)\r\n        [T.x(i), T.y(i), T.vx(i), T.vy(i)] = step(T.x(i-1), T.y(i-1), T.vx(i-1), T.vy(i-1), dt);\r\n    <span class=\"keyword\">end<\/span>\r\n<span class=\"keyword\">catch<\/span>\r\n<span class=\"keyword\">end<\/span>\r\n\r\n<span class=\"keyword\">end<\/span>\r\n\r\n\r\n\r\n<span class=\"keyword\">function<\/span> T = projectile_try_regained(v,angle)\r\n<span class=\"comment\">% PROJECTILE_TRY_REGAINED.M - A copy of PROJECTILE.M, but with a try-catch<\/span>\r\n<span class=\"comment\">% block placed in a separate function. 
R2020a in-place optimizations are<\/span>\r\n<span class=\"comment\">% REGAINED because the try-catch block is in a separate, local function.<\/span>\r\n<span class=\"keyword\">try<\/span>\r\n    T = projectile_local(v,angle);\r\n<span class=\"keyword\">catch<\/span>\r\n<span class=\"keyword\">end<\/span>\r\n\r\n<span class=\"keyword\">end<\/span>\r\n\r\n<span class=\"keyword\">function<\/span> T = projectile_local(v,angle)\r\n<span class=\"comment\">% This is the code that creates the table for PROJECTILE_TRY_REGAINED.<\/span>\r\n<span class=\"comment\">% For best performance, do not use try-catch here, but rather in the<\/span>\r\n<span class=\"comment\">% calling function.<\/span>\r\ndt = 0.001;\r\nT = table(<span class=\"string\">'Size'<\/span>,[30\/dt 4],<span class=\"string\">'VariableTypes'<\/span>,[<span class=\"string\">\"double\"<\/span>,<span class=\"string\">\"double\"<\/span>,<span class=\"string\">\"double\"<\/span>,<span class=\"string\">\"double\"<\/span>],<span class=\"string\">'VariableNames'<\/span>,[<span class=\"string\">\"x\"<\/span>,<span class=\"string\">\"y\"<\/span>,<span class=\"string\">\"vx\"<\/span>,<span class=\"string\">\"vy\"<\/span>]);\r\n    \r\nT{1,:} = [0 0 v*cosd(angle) v*sind(angle)];\r\n<span class=\"keyword\">for<\/span> i = 2:height(T)\r\n    [T.x(i), T.y(i), T.vx(i), T.vy(i)] = step(T.x(i-1), T.y(i-1), T.vx(i-1), T.vy(i-1), dt);\r\n<span class=\"keyword\">end<\/span>\r\n\r\n<span class=\"keyword\">end<\/span>\r\n\r\n\r\n<\/pre><script language=\"JavaScript\"> <!-- \r\n    function grabCode_d0bc1b53a3df44ccae6629f21a9fd83d() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='d0bc1b53a3df44ccae6629f21a9fd83d ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' 
#####' + ' d0bc1b53a3df44ccae6629f21a9fd83d';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. \r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        copyright = 'Copyright 2020 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add copyright line at the bottom if specified.\r\n        if (copyright.length > 0) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n\r\n        d.title = title + ' (MATLAB code)';\r\n        d.close();\r\n    }   \r\n     --> <\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_d0bc1b53a3df44ccae6629f21a9fd83d()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n      the MATLAB code <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; R2020a<br><\/p><\/div><!--\r\nd0bc1b53a3df44ccae6629f21a9fd83d ##### SOURCE BEGIN #####\r\n%% Faster Indexing in Tables, datetime Arrays, and Other Data Types\r\n% Today I'd like to introduce a guest blogger, Stephen Doe, who works for\r\n% the MATLAB Documentation team here at MathWorks. 
In today's post, Stephen\r\n% discusses how to take advantage of recent performance improvements when\r\n% indexing into tables. The same approach applies to many different data\r\n% types. While the release notes describe the performance improvements, in\r\n% today's post Stephen also offers further advice based on a simple code\r\n% example.\r\n%\r\n%% So, What Has Improved, and How?\r\n% As of R2020a, the MATLAB data types team has delivered substantial\r\n% performance improvements for indexing into certain types of arrays. The\r\n% improved performance comes from in-place optimizations. They are most\r\n% apparent when you access or assign values to many array elements within a\r\n% |for|-loop. The improved data types are:\r\n%\r\n% * |<https:\/\/www.mathworks.com\/help\/matlab\/ref\/calendarduration.html calendarDuration>|\r\n% * |<https:\/\/www.mathworks.com\/help\/matlab\/ref\/categorical.html categorical>|\r\n% * |<https:\/\/www.mathworks.com\/help\/matlab\/ref\/datetime.html datetime>|\r\n% * |<https:\/\/www.mathworks.com\/help\/matlab\/ref\/duration.html duration>|\r\n% * |<https:\/\/www.mathworks.com\/help\/matlab\/ref\/table.html table>|\r\n% * |<https:\/\/www.mathworks.com\/help\/matlab\/ref\/timetable.html timetable>|\r\n%\r\n% The improvements were delivered over two releases (R2019b and R2020a).\r\n% So, in this post I compare the performance of R2020a to R2019a, the\r\n% latest release without any of these improvements. Overall, we see that\r\n% for these data types, assignments to array elements are many times faster\r\n% than in R2019a.\r\n%\r\n% * *1.5-2x* faster when assigning to elements of a |table| or |timetable|\r\n% variable\r\n% * *Order of magnitude* faster (at least) when assigning to elements of |calendarDuration|, |categorical|,\r\n% |datetime|, and |duration| arrays\r\n% \r\n% You can find the full details, including test code that illustrates the\r\n% performance improvements between releases, in the release notes. 
As it\r\n% happens, we have a new format for describing performance enhancements\r\n% with more quantitative detail. These links take you directly to the\r\n% performance release notes for R2019b and R2020a:\r\n%\r\n% * <https:\/\/www.mathworks.com\/help\/matlab\/release-notes.html?category=performance&rntext=&startrelease=R2019b&endrelease=R2019b&groupby=release&sortby=descending&searchHighlight=\r\n% R2019b Performance Notes>\r\n% * <https:\/\/www.mathworks.com\/help\/matlab\/release-notes.html?category=performance&rntext=&startrelease=R2020a&endrelease=R2020a&groupby=release&sortby=descending&searchHighlight=\r\n% R2020a Performance Notes>\r\n%\r\n% With that kind of detail, is there more that needs to be said? *Yes*. I'm\r\n% going to show you how best to take advantage of these performance\r\n% improvements. I'm also going to explain some circumstances where these\r\n% improvements *don't* take effect.\r\n%% Assign Table Elements in |for|-Loop\r\n% Here's a simple example that takes full advantage of the indexing\r\n% performance improvements. In this example, I calculate the position of a\r\n% projectile at regular time steps. I use a formula where the position\r\n% and velocity at each step depend on the position and velocity of the\r\n% previous step. I create a table and I assign the positions and velocities\r\n% to rows of the table using a |for|-loop. The figure shows the path taken\r\n% by the projectile in this example.\r\n%\r\n% <<projectile.png>>\r\n%\r\n% Here is a function, named |projectile|, that calculates positions and\r\n% velocities and then assigns them to a table. (It depends on another\r\n% function, |step|, that calculates position and velocity at one time\r\n% step.) Note that in the |for|-loop, I use dot notation to access the\r\n% table variables. Then I use the loop counter to index into the variables\r\n% (|T.x(i)|, |T.y(i)|, and so on). 
(I've attached the |projectile| and\r\n% |step| functions to the end of this blog post.)\r\n%\r\n% <include>projectile.m<\/include>\r\n%\r\n% Let's call the |projectile| function with a starting velocity of 50 m\/s\r\n% and an angle of 45 degrees. With this call, the output table has 30,000\r\n% rows. I'll use the |head| function to display the first five rows.\r\n%\r\nT = projectile(50,45);\r\nhead(T,5)\r\n%%\r\n% Now let's use |tic| and |toc| to estimate the execution time for this\r\n% function in two different versions of MATLAB: R2019a (before recent\r\n% improvements) and R2020a.\r\n%\r\n%   tic\r\n%   T = projectile(50,45);\r\n%   toc\r\n%\r\n% *R2019a:* *13.06* seconds\r\n%\r\n% *R2020a:* *7.64* seconds\r\n%\r\n% (I made both calls on the same machine: a Windows 10, Intel\u00ae Xeon\u00ae W-2133\r\n% @ 3.60 GHz test system.)\r\n%\r\n% This code is about 1.7x faster in R2020a! And you'll see\r\n% performance improvements that are at least this good with very large\r\n% tables and timetables (millions of rows), and also in |categorical| and\r\n% |datetime| arrays.\r\n%\r\n% Now, note that I wrote this example in such a way as to force use of a\r\n% |for|-loop. If you can, the best strategy of all is to _vectorize_ your\r\n% code. \"Vectorizing\" code pretty much means\r\n% operating on arrays instead of elements of an array. For example, it's\r\n% much more efficient to call |z = x + y| instead of calling |z(i) = x(i) +\r\n% y(i)| in a |for|-loop.\r\n%\r\n% But if you have code that you can't vectorize, because it accesses many\r\n% other table or array elementsREPLACE_WITH_DASH_DASHlike the code in my exampleREPLACE_WITH_DASH_DASHthen the\r\n% performance improvements for these data types will help your code run\r\n% faster.\r\n% \r\n% So now we're good to go, right? Not so fast.\r\n%% Scripts and |try-catch| Considered Harmful\r\n% All right, _harmful_ is an exaggeration. 
It's perfectly fine to put your\r\n% code in a script, or within a |try-catch| block, or to work with\r\n% variables in the workspace. In all these cases, the code I wrote for the\r\n% |projectile| function returns the exact same table.\r\n%\r\n% But if that code is in a script or a |try-catch| block, or if you are\r\n% interactively working with workspace variables, then you see little or no\r\n% performance enhancement. For example, let me rewrite the code as a\r\n% script, in the file |projectile_script.m|. Within the script, I assign\r\n% the same starting velocity and angle as above. (I've attached a copy of\r\n% |projectile_script.m| to the end of this blog post.)\r\n%\r\n% <include>projectile_script.m<\/include>\r\n%\r\n% Now I call the script and time it:\r\n%\r\n%  tic\r\n%  projectile_script\r\n%  toc\r\n%\r\n% *R2019a:* *12.36* seconds\r\n%\r\n% *R2020a:* *11.83* seconds\r\n%\r\n% The difference between the two releases is now *much* smaller! What\r\n% happened?\r\n%\r\n% Well, it's complicated. In essence, MATLAB applies in-place optimizations\r\n% to the indexing done in this line of code: \r\n%\r\n%   [T.x(i), T.y(i), T.vx(i), T.vy(i)] = step(T.x(i-1), T.y(i-1), T.vx(i-1), T.vy(i-1), dt);\r\n%\r\n% To take advantage of the in-place optimizations for these data types, you\r\n% must perform the indexing within a function. If you instead index into workspace\r\n% variables or index within a script, you won't see the full performance\r\n% improvement.\r\n%\r\n% Also, the improvement is lost when the indexing code is within a\r\n% |try-catch| block, even when that block is itself within a function. But\r\n% in such cases, you can get the performance back by putting the indexing\r\n% code into a separate function.\r\n%\r\n% I won't go into the full details about |try-catch| blocks in this post.\r\n% However, I have attached two files to the end of this post,\r\n% |projectile_try_catch.m| and |projectile_try_regained.m|. 
These files\r\n% show the problem and its workaround.\r\n%\r\n% To sum up, this table shows the different pieces of code that I wrote for\r\n% this post, and the performance you can expect in each case.\r\n%\r\n% <html>\r\n% <table border=1><tr><td><b>Example File<\/b><\/td><td><b>Type of Code<\/b><\/td><td><b>Execution Time<\/b><\/td><\/tr>\r\n% <tr><td><tt>projectile.m<\/tt><\/td><td>function<\/td><td>7.64 s<\/td><\/tr>\r\n% <tr><td><tt>projectile_script.m<\/tt><\/td><td>script<\/td><td>11.83 s<\/td><\/tr>\r\n% <tr><td><tt>projectile_try_catch.m<\/tt><\/td><td>indexing code in <tt>try-catch<\/tt> block<\/td><td>11.75 s<\/td><\/tr>\r\n% <tr><td><tt>projectile_try_regained.m<\/tt><\/td><td><tt>try-catch<\/tt> block, but indexing code in separate function<\/td><td>7.43 s<\/td><\/tr><\/table>\r\n% <\/html>\r\n%\r\n%% Best Practices\r\n% Here are the best practices to keep in mind to get the best\r\n% performance when writing code for data types such as |table|, |datetime|,\r\n% and |categorical|:\r\n%\r\n% * *DO* vectorize code when you can. For example, operate on table\r\n% variables (|T.X|) instead of elements of table variables (|T.X(i)|).\r\n% * *DO* put your |table|, |datetime|, and |categorical| indexing code in\r\n% functions, if you're doing a lot of indexing and can't vectorize your\r\n% code.\r\n% * *AVOID* scripts, at least for code that does a lot of indexing. Put the\r\n% indexing code in a function.\r\n% * *AVOID* try-catch blocks for code that does a lot of indexing. Put it\r\n% in its own function.\r\n%\r\n% Since the MATLAB data types team continues to work on improving\r\n% performance, we'd love to hear more about your experience with our data\r\n% types. 
Please tell us more about your challenges using tables,\r\n% timetables, |datetime|, or |categorical| arrays\r\n% <https:\/\/blogs.mathworks.com\/loren\/?p=3667#respond here>.\r\n%% Code Samples\r\n%\r\n% <include>step.m<\/include>\r\n%\r\n% <include>projectile.m<\/include>\r\n%\r\n% <include>projectile_script.m<\/include>\r\n%\r\n% <include>projectile_try_catch.m<\/include>\r\n%\r\n% <include>projectile_try_regained.m<\/include>\r\n##### SOURCE END ##### d0bc1b53a3df44ccae6629f21a9fd83d\r\n-->","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img decoding=\"async\"  class=\"img-responsive\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2020\/projectile.png\" onError=\"this.style.display ='none';\" \/><\/div><!--introduction--><p>Today I'd like to introduce a guest blogger, Stephen Doe, who works for the MATLAB Documentation team here at MathWorks. In today's post, Stephen discusses how to take advantage of recent performance improvements when indexing into tables. The same approach applies to many different data types. While the release notes describe the performance improvements, in today's post Stephen also offers further advice based on a simple code example.... 
<a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/loren\/2020\/04\/30\/faster-indexing-in-tables-datetime-arrays-and-other-data-types\/\">read more >><\/a><\/p>","protected":false},"author":39,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[16,4,58],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/3667"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/users\/39"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/comments?post=3667"}],"version-history":[{"count":6,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/3667\/revisions"}],"predecessor-version":[{"id":3689,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/3667\/revisions\/3689"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/media?parent=3667"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/categories?post=3667"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/tags?post=3667"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}