{"id":1007,"date":"2014-04-07T16:40:03","date_gmt":"2014-04-07T20:40:03","guid":{"rendered":"https:\/\/blogs.mathworks.com\/steve\/?p=1007"},"modified":"2019-11-01T11:03:04","modified_gmt":"2019-11-01T15:03:04","slug":"timing-the-fft","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/steve\/2014\/04\/07\/timing-the-fft\/","title":{"rendered":"Timing the FFT"},"content":{"rendered":"<div class=\"content\"><p>I've seen two questions recently about the speed of the <tt>fft<\/tt> function in MATLAB. First, a tech support question was forwarded to development. The user wanted to know how to predict the computation time for an FFT of a given length, N. This user was interested in values of N in the neighborhood of 4096 (2^12).<\/p><p>The second was a <a>post in the MATLAB newsgroup comp.soft-sys.matlab<\/a>. This user wondered why padding to the next power of two wasn't always the fastest way to compute the FFT.<\/p><p>Inspired by these questions, I want to show you today how to do some FFT benchmarking in MATLAB.<\/p><p>It turns out that, in general, the time required to execute an N-point FFT is proportional to N*log(N). For any particular value of N, though, the execution time can be hard to predict and depends on the number of prime factors of N (very roughly speaking). The variation in time between two close values of N can be as much as an order of magnitude.<\/p><p>Whenever I do FFT benchmarking, I find it very helpful to look at three sets of numbers:<\/p><div><ul><li>Powers of 2<\/li><li>Composite numbers that are not powers of 2<\/li><li>Prime numbers<\/li><\/ul><\/div><p>Also, I have learned to look at plots that are log scale in N and that have roughly the same number of test values within each octave (or doubling) of N.<\/p><p>Constructing sets of N values along these lines takes a little thought. Here's some code.<\/p><p>First, how many powers of 2 do we want to examine? Based on the customer questions I saw, I want to examine the range from 1024 (2^10) to 131072 (2^17).<\/p><pre class=\"codeinput\">low2 = 10;\r\nhigh2 = 17;\r\nn2 = 2.^(low2:high2);\r\n<\/pre><p>Next, I want to pick 10 composite numbers and 10 prime numbers in between successive powers of 2. I'd like to pick the numbers \"randomly,\" but I also want my experiment to be repeatable. To satisfy these seemingly contradictory constraints, I'll reset the MATLAB random number generator before beginning.<\/p><pre class=\"codeinput\">rng(<span class=\"string\">'default'<\/span>);\r\n\r\n<span class=\"comment\">% Initialize the vectors holding the prime N's and composite N's.<\/span>\r\nnp = [];\r\nnc = [];\r\n\r\n<span class=\"keyword\">for<\/span> m = low2:high2\r\n    k = (2^m):(2^(m+1));\r\n    kp = k(2:end-1);\r\n    isp = isprime(kp);\r\n    primes = kp(isp);\r\n    composites = kp(~isp);\r\n\r\n    <span class=\"comment\">% Use randperm to pick out 10 values from the vector of primes and 10<\/span>\r\n    <span class=\"comment\">% values from the vector of composites.<\/span>\r\n    new_np = primes(randperm(length(primes),10));\r\n    new_nc = composites(randperm(length(composites),10));\r\n\r\n    np = [np new_np];\r\n    nc = [nc new_nc];\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><p>Now let's use the function <tt>timeit<\/tt> to measure the execution time required to compute FFTs for all these values of N. (If you don't have a recent version of MATLAB that has <tt>timeit<\/tt>, you can get a version of it from the File Exchange.)<\/p><pre class=\"codeinput\">t2 = zeros(size(n2));\r\n<span class=\"keyword\">for<\/span> k = 1:length(n2)\r\n    x = rand(n2(k),1);\r\n    t2(k) = timeit(@() fft(x));\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><pre class=\"codeinput\">tp = zeros(size(np));\r\n<span class=\"keyword\">for<\/span> k = 1:length(np)\r\n    x = rand(np(k),1);\r\n    tp(k) = timeit(@() fft(x));\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><pre class=\"codeinput\">tc = zeros(size(np));\r\n<span class=\"keyword\">for<\/span> k = 1:length(nc)\r\n    x = rand(nc(k),1);\r\n    tc(k) = timeit(@() fft(x));\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><p>Now do a loglog plot of all these times.<\/p><pre class=\"codeinput\">loglog(n2,t2,<span class=\"string\">'o'<\/span>)\r\nset(gca,<span class=\"string\">'xtick'<\/span>,2.^(10:17))\r\nxlim([2^10 2^17])\r\n\r\nhold <span class=\"string\">on<\/span>\r\n\r\nloglog(nc,tc,<span class=\"string\">'+'<\/span>)\r\nloglog(np,tp,<span class=\"string\">'*'<\/span>)\r\n\r\nhold <span class=\"string\">off<\/span>\r\n\r\nlegend({<span class=\"string\">'Powers of 2'<\/span>,<span class=\"string\">'Composite numbers'<\/span>,<span class=\"string\">'Prime numbers'<\/span>}, <span class=\"keyword\">...<\/span>\r\n    <span class=\"string\">'Location'<\/span>,<span class=\"string\">'NorthWest'<\/span>)\r\nxlabel(<span class=\"string\">'N'<\/span>)\r\nylabel(<span class=\"string\">'Execution time (s)'<\/span>)\r\ntitle(<span class=\"string\">'FFT execution time as a function of N'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2014\/fft_speed_01.png\" alt=\"\"> <p>You can see that there's a wide spread of execution times for the values of N that are not powers of 2.<\/p><p>One thing I'm not seeing is what the MATLAB Newsgroup poster reported. That is, I don't see a non-power-of-2 time that's faster than the next highest power of 2.<\/p><p>So let's look a little harder for composite numbers that are faster than what we've seen so far. Specifically, I'm going to look for values of N with prime factors no bigger than 3.<\/p><pre class=\"codeinput\">nn = [];\r\n<span class=\"keyword\">for<\/span> m = low2:high2\r\n    k = (2^m):(2^(m+1));\r\n    kp = k(2:end-1);\r\n\r\n    kp = kp(randperm(length(kp)));\r\n    nn_m = [];\r\n    <span class=\"keyword\">for<\/span> q = 1:length(kp)\r\n        <span class=\"keyword\">if<\/span> max(factor(kp(q))) &lt;= 3\r\n            nn_m = [nn_m kp(q)];\r\n\r\n            <span class=\"keyword\">if<\/span> length(nn_m) &gt;= 4\r\n                <span class=\"comment\">% We've found enough in this part of the range.<\/span>\r\n                <span class=\"keyword\">break<\/span>\r\n            <span class=\"keyword\">end<\/span>\r\n        <span class=\"keyword\">end<\/span>\r\n    <span class=\"keyword\">end<\/span>\r\n\r\n    nn = [nn nn_m];\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><p>Measure execution times for these \"highly composite\" numbers.<\/p><pre class=\"codeinput\">tn = zeros(length(nn),1);\r\n<span class=\"keyword\">for<\/span> k = 1:length(nn)\r\n    x = rand(nn(k),1);\r\n    tn(k) = timeit(@() fft(x));\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><p>Add the times to the plot.<\/p><pre class=\"codeinput\">hold <span class=\"string\">on<\/span>\r\nloglog(nn,tn,<span class=\"string\">'d'<\/span>)\r\nhold <span class=\"string\">off<\/span>\r\n\r\nlegend({<span class=\"string\">'Powers of 2'<\/span>,<span class=\"string\">'Composite numbers'<\/span>,<span class=\"string\">'Prime numbers'<\/span>, <span class=\"keyword\">...<\/span>\r\n    <span class=\"string\">'Highly composite numbers'<\/span>},<span class=\"string\">'Location'<\/span>,<span class=\"string\">'NorthWest'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2014\/fft_speed_02.png\" alt=\"\"> <p>You can see that <i>sometimes<\/i> a non-power-of-2 can be computed very fast, faster than the next higher power of 2. In this experiment we found one such value of N: 39366. This number has 10 prime factors:<\/p><pre class=\"codeinput\">factor(39366)\r\n<\/pre><pre class=\"codeoutput\">\r\nans =\r\n\r\n     2     3     3     3     3     3     3     3     3     3\r\n\r\n<\/pre><p>I hope you enjoyed these experiments with FFT benchmarking. I can tell you from personal experience that it can turn into almost a full-time hobby!<\/p><script language=\"JavaScript\"> <!-- \r\n    function grabCode_2036b1bb196a48e0a17b58397a17a2da() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='2036b1bb196a48e0a17b58397a17a2da ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' 2036b1bb196a48e0a17b58397a17a2da';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. \r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        copyright = 'Copyright 2014 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add copyright line at the bottom if specified.\r\n        if (copyright.length > 0) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n\r\n        d.title = title + ' (MATLAB code)';\r\n        d.close();\r\n    }   \r\n     --> <\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_2036b1bb196a48e0a17b58397a17a2da()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n      the MATLAB code <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; R2014a<br><\/p><\/div><!--\r\n2036b1bb196a48e0a17b58397a17a2da ##### SOURCE BEGIN #####\r\n%%\r\n% I've seen two questions recently about the speed of the |fft| function in\r\n% MATLAB. First, a tech support question was forwarded to development. The\r\n% user wanted to know how to predict the computation time for an FFT of a\r\n% given length, N. This user was interested in values of N in the\r\n% neighborhood of 4096 (2^12).\r\n%\r\n% The second was a\r\n% <http:\/\/view_thread\/335102\r\n% post in the MATLAB newsgroup comp.soft-sys.matlab>. This user wondered\r\n% why padding to the next power of two wasn't always the fastest way to\r\n% compute the FFT.\r\n%\r\n% Inspired by these questions, I want to show you today how to do some FFT\r\n% benchmarking in MATLAB.\r\n%\r\n% It turns out that, in general, the time required to execute an N-point\r\n% FFT is proportional to N*log(N). For any particular value of N, though,\r\n% the execution time can be hard to predict and depends on the number of\r\n% prime factors of N (very roughly speaking). The variation in time between\r\n% two close values of N as much as an order of magnitude.\r\n%\r\n% Whenever I do FFT benchmarking, I find it very helpful to look at three\r\n% sets of numbers:\r\n%\r\n% * Powers of 2\r\n% * Composite numbers that are not powers of 2\r\n% * Prime numbers\r\n%\r\n% Also, I have learned to look at plots that are log scale in N, which\r\n% roughly the same number of test values within each octave (or doubling)\r\n% of N.\r\n%\r\n% Constructing sets of N values along these lines takes a little thought.\r\n% Here's some code.\r\n%\r\n% First, how many powers of 2 do we want to examine? Based on the customer\r\n% questions I saw, I want to examine the range from 1024 (2^10) to 131072\r\n% (2^17).\r\n\r\nlow2 = 10;\r\nhigh2 = 17;\r\nn2 = 2.^(low2:high2);\r\n\r\n%%\r\n% Next, I want to pick 10 composite numbers and 10 prime numbers in each\r\n% decade. I'd like to pick the numbers \"randomly,\" but I also want my\r\n% experiment to be repeatable. To satisfy these seemingly contradictory\r\n% constraints, I'll reset the MATLAB random number generator before\r\n% beginning.\r\n\r\nrng('default');\r\n\r\n% Initialize the vectors holding the prime N's and composite N's.\r\nnp = [];\r\nnc = [];\r\n\r\nfor m = low2:high2\r\n    k = (2^m):(2^(m+1));\r\n    kp = k(2:end-1);\r\n    isp = isprime(kp);\r\n    primes = kp(isp);\r\n    composites = kp(~isp);\r\n    \r\n    % Use randperm to pick out 10 values from the vector of primes and 10\r\n    % values from the vector of composites.\r\n    new_np = primes(randperm(length(primes),10));\r\n    new_nc = composites(randperm(length(composites),10));\r\n    \r\n    np = [np new_np];\r\n    nc = [nc new_nc];\r\nend\r\n\r\n%%\r\n% Now let's use the function |timeit| to measure the execution time\r\n% required to compute FFTs for all these values of N. (If you don't have a\r\n% recent version of MATLAB that has |timeit|, you can get a version of it\r\n% from the\r\n% <https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/18798-timeit-benchmarking-function\r\n% File Exchange>.)\r\n\r\n%%\r\nt2 = zeros(size(n2));\r\nfor k = 1:length(n2)\r\n    x = rand(n2(k),1);\r\n    t2(k) = timeit(@() fft(x));\r\nend\r\n\r\n%%\r\ntp = zeros(size(np));\r\nfor k = 1:length(np)\r\n    x = rand(np(k),1);\r\n    tp(k) = timeit(@() fft(x));\r\nend\r\n\r\n%%\r\ntc = zeros(size(np));\r\nfor k = 1:length(nc)\r\n    x = rand(nc(k),1);\r\n    tc(k) = timeit(@() fft(x));\r\nend\r\n\r\n%%\r\n% Now do a loglog plot of all these times.\r\n\r\n%%\r\nloglog(n2,t2,'o')\r\nset(gca,'xtick',2.^(10:17))\r\nxlim([2^10 2^17])\r\n\r\nhold on\r\n\r\nloglog(nc,tc,'+')\r\nloglog(np,tp,'*')\r\n\r\nhold off\r\n\r\nlegend({'Powers of 2','Composite numbers','Prime numbers'}, ...\r\n    'Location','NorthWest')\r\nxlabel('N')\r\nylabel('Execution time (s)')\r\ntitle('FFT execution time as a function of N')\r\n\r\n%%\r\n% You can see that there's a wide spread of execution times for the values\r\n% of N that are not powers of 2.\r\n%\r\n% One thing I'm not seeing is what the MATLAB Newsgroup poster reported.\r\n% That is, I don't see a non-power-of-2 time that's faster than the next\r\n% highest power of 2.\r\n%\r\n% So let's look a little harder for composite numbers that are faster than\r\n% what we've seen so far. Specifically, I'm going to look for values of N\r\n% with prime factors no bigger than 3.\r\n\r\nnn = [];\r\nfor m = low2:high2\r\n    k = (2^m):(2^(m+1));\r\n    kp = k(2:end-1);\r\n\r\n    kp = kp(randperm(length(kp)));\r\n    nn_m = [];\r\n    for q = 1:length(kp)\r\n        if max(factor(kp(q))) <= 3\r\n            nn_m = [nn_m kp(q)];\r\n            \r\n            if length(nn_m) >= 4\r\n                % We've found enough in this part of the range.\r\n                break\r\n            end\r\n        end\r\n    end\r\n    \r\n    nn = [nn nn_m];\r\nend\r\n        \r\n%%\r\n% Measure execution times for these \"highly composite\" numbers.\r\n\r\ntn = zeros(length(nn),1);\r\nfor k = 1:length(nn)\r\n    x = rand(nn(k),1);\r\n    tn(k) = timeit(@() fft(x));\r\nend\r\n\r\n%%\r\n% Add the times to the plot.\r\n\r\nhold on\r\nloglog(nn,tn,'d')\r\nhold off\r\n\r\nlegend({'Powers of 2','Composite numbers','Prime numbers', ...\r\n    'Highly composite numbers'},'Location','NorthWest')\r\n\r\n%%\r\n% You can see that _sometimes_ a non-power-of-2 can be computed very fast,\r\n% faster than the next higher power of 2. In this experiment we found one\r\n% such value of N: 39366. This number has 10 prime factors:\r\n\r\nfactor(39366)\r\n\r\n%%\r\n% I hope you enjoyed these experiments with FFT benchmarking. I can tell\r\n% you from personal experience that it can turn into almost a full-time\r\n% hobby!\r\n\r\n##### SOURCE END ##### 2036b1bb196a48e0a17b58397a17a2da\r\n-->","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img decoding=\"async\"  class=\"img-responsive\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2014\/fft_speed_02.png\" onError=\"this.style.display ='none';\" \/><\/div><p>I've seen two questions recently about the speed of the fft function in MATLAB. First, a tech support question was forwarded to development. The user wanted to know how to predict the computation... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/steve\/2014\/04\/07\/timing-the-fft\/\">read more >><\/a><\/p>","protected":false},"author":42,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[1079,426,80,90,1075,92,705,1077,122,460,1037,807,190,474,52,94,360,96,130],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/1007"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/users\/42"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/comments?post=1007"}],"version-history":[{"count":11,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/1007\/revisions"}],"predecessor-version":[{"id":2760,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/1007\/revisions\/2760"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/media?parent=1007"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/categories?post=1007"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/tags?post=1007"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}