{"id":2750,"date":"2024-09-26T07:24:00","date_gmt":"2024-09-26T11:24:00","guid":{"rendered":"https:\/\/blogs.mathworks.com\/matlab\/?p=2750"},"modified":"2024-09-26T07:24:00","modified_gmt":"2024-09-26T11:24:00","slug":"matlab-now-has-over-1000-functions-that-just-work-on-nvidia-gpus","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/matlab\/2024\/09\/26\/matlab-now-has-over-1000-functions-that-just-work-on-nvidia-gpus\/","title":{"rendered":"MATLAB now has over 1,000 functions that Just Work on NVIDIA GPUs"},"content":{"rendered":"<div class = rtcContent><h2  style = 'margin: 20px 10px 5px 4px; padding: 0px; line-height: 20px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 20px; font-weight: 700; text-align: left; '><span>GPU support in MATLAB started in R2010b<\/span><\/h2><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>Back in R2010b, the first GPU enabled functions were made available in MATLAB via <\/span><a href = \"https:\/\/uk.mathworks.com\/products\/parallel-computing.html\"><span>Parallel Computing Toolbox<\/span><\/a><span>. The idea was then, as it is now, to overload existing MATLAB functions such that they accept the <\/span><a href = \"https:\/\/uk.mathworks.com\/help\/parallel-computing\/gpuarray.html\"><span style=' font-family: monospace;'>gpuArray<\/span><\/a><span> type. If you gave a <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span> to a function then it would automatically work on the GPU without the user having to do anything else. 
That is, if you had MATLAB code like this:<\/span><\/div><div style=\"background-color: #F5F5F5; margin: 10px 15px 10px 0; display: inline-block\"><div class=\"inlineWrapper\"><div  style = 'border-left: 1px solid rgb(217, 217, 217); border-right: 1px solid rgb(217, 217, 217); border-top: 1px solid rgb(217, 217, 217); border-bottom: 1px solid rgb(217, 217, 217); border-radius: 4px; padding: 6px 45px 4px 13px; line-height: 18.004px; min-height: 0px; white-space: nowrap; color: rgb(33, 33, 33); font-family: Menlo, Monaco, Consolas, \"Courier New\", monospace; font-size: 14px; '><span style=\"white-space: pre\"><span >y = fft(x);                <\/span><span style=\"color: #008013;\">% Compute the fast Fourier transform of x on the CPU<\/span><\/span><\/div><\/div><\/div><div  style = 'margin: 10px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>then all you need to do to run that on the GPU is use <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span> and <\/span><a href = \"https:\/\/uk.mathworks.com\/help\/parallel-computing\/gpuarray.gather.html\"><span style=' font-family: monospace;'>gather<\/span><\/a><span> to take care of the transfer to and from the GPU. 
Other than that, the MATLAB code is the same<\/span><\/div><div style=\"background-color: #F5F5F5; margin: 10px 15px 10px 0; display: inline-block\"><div class=\"inlineWrapper\"><div  style = 'border-left: 1px solid rgb(217, 217, 217); border-right: 1px solid rgb(217, 217, 217); border-top: 1px solid rgb(217, 217, 217); border-bottom: 0px none rgb(33, 33, 33); border-radius: 4px 4px 0px 0px; padding: 6px 45px 0px 13px; line-height: 18.004px; min-height: 0px; white-space: nowrap; color: rgb(33, 33, 33); font-family: Menlo, Monaco, Consolas, \"Courier New\", monospace; font-size: 14px; '><span style=\"white-space: pre\"><span >gpuX = gpuArray(x);         <\/span><span style=\"color: #008013;\">% Transfer the array x to the GPU<\/span><\/span><\/div><\/div><div class=\"inlineWrapper\"><div  style = 'border-left: 1px solid rgb(217, 217, 217); border-right: 1px solid rgb(217, 217, 217); border-top: 0px none rgb(33, 33, 33); border-bottom: 0px none rgb(33, 33, 33); border-radius: 0px; padding: 0px 45px 0px 13px; line-height: 18.004px; min-height: 0px; white-space: nowrap; color: rgb(33, 33, 33); font-family: Menlo, Monaco, Consolas, \"Courier New\", monospace; font-size: 14px; '>&nbsp;<\/div><\/div><div class=\"inlineWrapper\"><div  style = 'border-left: 1px solid rgb(217, 217, 217); border-right: 1px solid rgb(217, 217, 217); border-top: 0px none rgb(33, 33, 33); border-bottom: 0px none rgb(33, 33, 33); border-radius: 0px; padding: 0px 45px 0px 13px; line-height: 18.004px; min-height: 0px; white-space: nowrap; color: rgb(33, 33, 33); font-family: Menlo, Monaco, Consolas, \"Courier New\", monospace; font-size: 14px; '><span style=\"white-space: pre\"><span >gpuY = fft(gpuX);           <\/span><span style=\"color: #008013;\">% fft is now performed on the GPU<\/span><\/span><\/div><\/div><div class=\"inlineWrapper\"><div  style = 'border-left: 1px solid rgb(217, 217, 217); border-right: 1px solid rgb(217, 217, 217); border-top: 0px none rgb(33, 33, 33); border-bottom: 0px 
none rgb(33, 33, 33); border-radius: 0px; padding: 0px 45px 0px 13px; line-height: 18.004px; min-height: 0px; white-space: nowrap; color: rgb(33, 33, 33); font-family: Menlo, Monaco, Consolas, \"Courier New\", monospace; font-size: 14px; '>&nbsp;<\/div><\/div><div class=\"inlineWrapper\"><div  style = 'border-left: 1px solid rgb(217, 217, 217); border-right: 1px solid rgb(217, 217, 217); border-top: 0px none rgb(33, 33, 33); border-bottom: 1px solid rgb(217, 217, 217); border-radius: 0px 0px 4px 4px; padding: 0px 45px 4px 13px; line-height: 18.004px; min-height: 0px; white-space: nowrap; color: rgb(33, 33, 33); font-family: Menlo, Monaco, Consolas, \"Courier New\", monospace; font-size: 14px; '><span style=\"white-space: pre\"><span >y = gather(gpuY);           <\/span><span style=\"color: #008013;\">% Gather the result from the GPU<\/span><\/span><\/div><\/div><\/div><div  style = 'margin: 10px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>R2010b provided support for 123 such functions, most of which were element-wise math such as <\/span><span style=' font-family: monospace;'>sin<\/span><span> and <\/span><span style=' font-family: monospace;'>exp<\/span><span>, along with arithmetic operators such as <\/span><span style=' font-family: monospace;'>+<\/span><span> and <\/span><span style=' font-family: monospace;'>-<\/span><span>. There were also some more interesting things such as <\/span><a href = \"https:\/\/uk.mathworks.com\/help\/matlab\/ref\/fft.html\"><span style=' font-family: monospace;'>fft<\/span><\/a><span>,<\/span><a href = \"https:\/\/uk.mathworks.com\/help\/matlab\/ref\/mtimes.html\"><span> <\/span><span style=' font-family: monospace;'>mtimes<\/span><\/a><span> (matrix-matrix multiplication) and the all-important <\/span><a href = 
\"https:\/\/uk.mathworks.com\/help\/matlab\/ref\/mldivide.html\"><span style=' font-family: monospace;'>mldivide<\/span><\/a><span>, the full name for backslash, <\/span><a href = \"https:\/\/blogs.mathworks.com\/matlab\/2024\/04\/02\/how-we-made-a-better-backslash-in-matlab-r2024a\/\"><span>possibly the most famous operator in MATLAB<\/span><\/a><span>.<\/span><\/div><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>It was a great start, but there were also some awkward omissions, the most glaring of which were <\/span><span style=' font-family: monospace;'>subsref<\/span><span> and <\/span><span style=' font-family: monospace;'>subsasgn<\/span><span>. This meant indexing of <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span>s was not supported. Imagine trying to do anything useful in MATLAB without ever indexing anything! My colleagues tell me that this made for an interesting trip to the SC 2010 supercomputing conference, where they introduced MATLAB's GPU functionality for the first time.<\/span><\/div><h2  style = 'margin: 20px 10px 5px 4px; padding: 0px; line-height: 20px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 20px; font-weight: 700; text-align: left; '><span>As of R2024b, 1195 MATLAB functions have gpuArray support<\/span><\/h2><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>14 years later and things have changed drastically. 
As you browse the MATLAB documentation, you'll note that many functions have an <\/span><span style=' font-weight: bold;'>Extended Capabilities <\/span><span>section. Here's that section for <\/span><a href = \"https:\/\/uk.mathworks.com\/help\/matlab\/ref\/fft.html\"><span style=' font-family: monospace;'>fft<\/span><\/a><span> in MATLAB R2024b, where you can see that there are not just one but two entries related to GPUs: <\/span><span style=' font-weight: bold;'>GPU Code Generation<\/span><span> and <\/span><span style=' font-weight: bold;'>GPU Arrays<\/span><span>. We'll talk about GPU Code Generation another time. <\/span><\/div><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><img class = \"imageNode\" src = \"http:\/\/blogs.mathworks.com\/matlab\/files\/2024\/09\/GPU_1195_functions_1.png\" width = \"746\" height = \"264\" alt = \"Screenshot 2024-07-05 at 13.22.22.png\" style = \"vertical-align: baseline; width: 746px; height: 264px;\"><\/img><\/div><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>Today's post focuses on GPU Arrays. In R2024b, 
<\/span><span style=' font-weight: bold;'>1195 functions across 14 toolboxes now support <\/span><span style=' font-weight: bold; font-family: monospace;'>gpuArray<\/span><span>, which is a lot!<\/span><\/div><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>The chart below shows this growth over time.<\/span><\/div><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><img class = \"imageNode\" src = \"http:\/\/blogs.mathworks.com\/matlab\/files\/2024\/09\/GPU_1195_functions_2.png\" width = \"710\" height = \"524\" alt = \"\" style = \"vertical-align: baseline; width: 710px; height: 524px;\"><\/img><\/div><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>Working with <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span> in this way can be extremely effective. One <\/span><a href = \"https:\/\/uk.mathworks.com\/company\/user_stories\/nasa-langley-research-center-accelerates-acoustic-data-analysis-with-gpu-computing.html?s_tid=srchtitle_site_search_1_nasa_langley\"><span>public case study I can point to is from NASA's Langley Research Center<\/span><\/a><span>, who tell us that it took them 30 minutes to get their MATLAB algorithm working on the GPU, resulting in a 40x speedup. 
<\/span><\/div><h2  style = 'margin: 20px 10px 5px 4px; padding: 0px; line-height: 20px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 20px; font-weight: 700; text-align: left; '><span>What does gpuArray support really mean?<\/span><\/h2><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>When you start digging down into the details, what we mean by 'support for <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span>' is both really simple and full of complications. At the simple end of the spectrum, it means that the function in question can accept <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span>s and do something with them. The complexity starts to creep in when we ask what that 'something' is.<\/span><\/div><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>Of the 1195 functions with <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span> support, <\/span><span style=' font-weight: bold;'>729 of them are supported with no limitations<\/span><span>. Anything you can do on the CPU with a normal MATLAB array, you can do on the GPU with a <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span>. 
This can make it incredibly easy to switch from using the CPU to the GPU for quite a lot of MATLAB code.<\/span><\/div><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>The other<\/span><span style=' font-weight: bold;'> 466 functions have some kind of restriction or caveat<\/span><span>. The <\/span><span style=' font-family: monospace;'>fft<\/span><span> function is one of these. Expand the <\/span><span style=' font-weight: bold;'>GPU Arrays<\/span><span> section in <\/span><span style=' font-weight: bold;'>Extended Capabilities<\/span><span> and you'll see what its restrictions are.<\/span><\/div><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><img class = \"imageNode\" src = \"http:\/\/blogs.mathworks.com\/matlab\/files\/2024\/09\/GPU_1195_functions_3.png\" width = \"747\" height = \"145\" alt = \"\" style = \"vertical-align: baseline; width: 747px; height: 145px;\"><\/img><\/div><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>Many of these caveats are minor. I don't consider the detail for <\/span><span style=' font-family: monospace;'>fft<\/span><span> to be a big deal, although if you disagree, do let me know, and tell me why. Other restrictions might be more problematic. 
The restriction for the <\/span><a href = \"https:\/\/uk.mathworks.com\/help\/matlab\/ref\/lu.html\"><span style=' font-family: monospace;'>lu<\/span><span> function<\/span><\/a><span>, for example, is that it doesn't accept sparse <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span> inputs. The <\/span><a href = \"https:\/\/uk.mathworks.com\/help\/matlab\/ref\/inv.html\"><span style=' font-family: monospace;'>inv<\/span><span> function<\/span><\/a><span> also doesn't accept sparse <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span> inputs, and you also have to be more careful when dealing with badly scaled or nearly singular matrices, since the <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span> version of <\/span><span style=' font-family: monospace;'>inv<\/span><span> won't warn you about them, whereas the standard version will. Even <\/span><span style=' font-family: monospace;'>mtimes<\/span><span>, the full name of MATLAB's matrix-matrix multiplication operator, has a restriction: it doesn't support <\/span><a href = \"https:\/\/uk.mathworks.com\/help\/matlab\/ref\/int64.html\"><span style=' font-family: monospace;'>int64<\/span><\/a><span> on the GPU.<\/span><\/div><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>You may reasonably ask \"Why do these restrictions and caveats exist?\" Indeed, the reviewer of this blog post asked exactly that! Often, it's for performance reasons. Take <\/span><span style=' font-family: monospace;'>fft<\/span><span>, for example. To <\/span><span style=' font-weight: bold;'>know<\/span><span> that the result has an all-zero imaginary part, we'd have to run another GPU kernel to go and check all the values. We would then need to wait for this operation to finish (i.e. 
synchronize the device) to see what the answer is. If the imaginary parts are all zero, we would then need to run another kernel to de-interlace the array and drop the imaginary part. On a GPU, this could result in a significant slowdown, and yet most of the time the imaginary parts are non-zero. That is, <\/span><span style=' font-weight: bold;'>everything would be made much slower for a convenience that's rarely needed<\/span><span>. Of course, some restrictions are there simply because it would take a lot of effort to lift them and we haven't seen the demand yet. As always, your feedback is essential here! <\/span><\/div><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>At the other end of the scale, we have around <\/span><span style=' font-weight: bold;'>140-150 functions that accept a GPU array but don't actually run on the GPU<\/span><span>! These are essentially convenience functions such as <\/span><span style=' font-family: monospace;'>plot<\/span><span> that have been modified so that they Just Work when you send them a <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span>. 
This means that, for example, you can do <\/span><\/div><div style=\"background-color: #F5F5F5; margin: 10px 15px 10px 0; display: inline-block\"><div class=\"inlineWrapper\"><div  style = 'border-left: 1px solid rgb(217, 217, 217); border-right: 1px solid rgb(217, 217, 217); border-top: 1px solid rgb(217, 217, 217); border-bottom: 0px none rgb(33, 33, 33); border-radius: 4px 4px 0px 0px; padding: 6px 45px 0px 13px; line-height: 18.004px; min-height: 0px; white-space: nowrap; color: rgb(33, 33, 33); font-family: Menlo, Monaco, Consolas, \"Courier New\", monospace; font-size: 14px; '><span style=\"white-space: pre\"><span >x = gpuArray.linspace(-pi,pi,1000);                <\/span><span style=\"color: #008013;\">% Construct x directly on the GPU<\/span><\/span><\/div><\/div><div class=\"inlineWrapper\"><div  style = 'border-left: 1px solid rgb(217, 217, 217); border-right: 1px solid rgb(217, 217, 217); border-top: 0px none rgb(33, 33, 33); border-bottom: 0px none rgb(33, 33, 33); border-radius: 0px; padding: 0px 45px 0px 13px; line-height: 18.004px; min-height: 0px; white-space: nowrap; color: rgb(33, 33, 33); font-family: Menlo, Monaco, Consolas, \"Courier New\", monospace; font-size: 14px; '><span style=\"white-space: pre\"><span >y = sin(x);                                        <\/span><span style=\"color: #008013;\">% Compute sin(x) on the GPU<\/span><\/span><\/div><\/div><div class=\"inlineWrapper\"><div  style = 'border-left: 1px solid rgb(217, 217, 217); border-right: 1px solid rgb(217, 217, 217); border-top: 0px none rgb(33, 33, 33); border-bottom: 1px solid rgb(217, 217, 217); border-radius: 0px 0px 4px 4px; padding: 0px 45px 4px 13px; line-height: 18.004px; min-height: 0px; white-space: nowrap; color: rgb(33, 33, 33); font-family: Menlo, Monaco, Consolas, \"Courier New\", monospace; font-size: 14px; '><span style=\"white-space: pre\"><span >plot(x,y);                                         <\/span><span style=\"color: #008013;\">% produce the 
plot<\/span><\/span><\/div><\/div><\/div><div  style = 'margin: 10px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>Instead of <\/span><\/div><div style=\"background-color: #F5F5F5; margin: 10px 15px 10px 0; display: inline-block\"><div class=\"inlineWrapper\"><div  style = 'border-left: 1px solid rgb(217, 217, 217); border-right: 1px solid rgb(217, 217, 217); border-top: 1px solid rgb(217, 217, 217); border-bottom: 0px none rgb(33, 33, 33); border-radius: 4px 4px 0px 0px; padding: 6px 45px 0px 13px; line-height: 18.004px; min-height: 0px; white-space: nowrap; color: rgb(33, 33, 33); font-family: Menlo, Monaco, Consolas, \"Courier New\", monospace; font-size: 14px; '><span style=\"white-space: pre\"><span >x = gpuArray.linspace(-pi,pi,1000);                <\/span><span style=\"color: #008013;\">% Construct x directly on the GPU<\/span><\/span><\/div><\/div><div class=\"inlineWrapper\"><div  style = 'border-left: 1px solid rgb(217, 217, 217); border-right: 1px solid rgb(217, 217, 217); border-top: 0px none rgb(33, 33, 33); border-bottom: 0px none rgb(33, 33, 33); border-radius: 0px; padding: 0px 45px 0px 13px; line-height: 18.004px; min-height: 0px; white-space: nowrap; color: rgb(33, 33, 33); font-family: Menlo, Monaco, Consolas, \"Courier New\", monospace; font-size: 14px; '><span style=\"white-space: pre\"><span >y = sin(x);                                        <\/span><span style=\"color: #008013;\">% Compute sin(x) on the GPU<\/span><\/span><\/div><\/div><div class=\"inlineWrapper\"><div  style = 'border-left: 1px solid rgb(217, 217, 217); border-right: 1px solid rgb(217, 217, 217); border-top: 0px none rgb(33, 33, 33); border-bottom: 0px none rgb(33, 33, 33); border-radius: 0px; padding: 0px 45px 0px 13px; line-height: 18.004px; min-height: 0px; white-space: nowrap; color: 
rgb(33, 33, 33); font-family: Menlo, Monaco, Consolas, \"Courier New\", monospace; font-size: 14px; '><span style=\"white-space: pre\"><span >CPUx = gather(x);                                  <\/span><span style=\"color: #008013;\">% Bring x from GPU into main memory<\/span><\/span><\/div><\/div><div class=\"inlineWrapper\"><div  style = 'border-left: 1px solid rgb(217, 217, 217); border-right: 1px solid rgb(217, 217, 217); border-top: 0px none rgb(33, 33, 33); border-bottom: 0px none rgb(33, 33, 33); border-radius: 0px; padding: 0px 45px 0px 13px; line-height: 18.004px; min-height: 0px; white-space: nowrap; color: rgb(33, 33, 33); font-family: Menlo, Monaco, Consolas, \"Courier New\", monospace; font-size: 14px; '><span style=\"white-space: pre\"><span >CPUy = gather(y);                                  <\/span><span style=\"color: #008013;\">% Bring y from GPU into main memory<\/span><\/span><\/div><\/div><div class=\"inlineWrapper\"><div  style = 'border-left: 1px solid rgb(217, 217, 217); border-right: 1px solid rgb(217, 217, 217); border-top: 0px none rgb(33, 33, 33); border-bottom: 1px solid rgb(217, 217, 217); border-radius: 0px 0px 4px 4px; padding: 0px 45px 4px 13px; line-height: 18.004px; min-height: 0px; white-space: nowrap; color: rgb(33, 33, 33); font-family: Menlo, Monaco, Consolas, \"Courier New\", monospace; font-size: 14px; '><span style=\"white-space: pre\"><span >plot(CPUx,CPUy);                                   <\/span><span style=\"color: #008013;\">% Do the plot<\/span><\/span><\/div><\/div><\/div><div  style = 'margin: 10px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>When you send <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span>s to <\/span><span style=' font-family: monospace;'>plot<\/span><span>, the <\/span><span style=' 
font-family: monospace;'>gather<\/span><span> operations are done for you behind the scenes. So, <\/span><span style=' font-family: monospace;'>plot<\/span><span> supports <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span> even though the plot operation is not done by the GPU.<\/span><\/div><h2  style = 'margin: 20px 10px 5px 4px; padding: 0px; line-height: 20px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 20px; font-weight: 700; text-align: left; '><span>Pagewise backslash - the 1000th function to support gpuArray<\/span><\/h2><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>While gathering the numbers for this post, a few of us got fixated on what the 1000th function might be. It turns out to be slightly tricky to figure this out but we are reasonably confident that the 1000th MATLAB function to support <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span> is <\/span><a href = \"https:\/\/uk.mathworks.com\/help\/matlab\/ref\/pagemldivide.html\"><span style=' font-family: monospace;'>pagemldivide<\/span><\/a><span> -- the pagewise version of <\/span><a href = \"https:\/\/uk.mathworks.com\/help\/matlab\/ref\/double.mldivide.html\"><span style=' font-family: monospace;'>mldivide<\/span><\/a><span>, known more commonly as the backslash operator. 
That the 1000th <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span> function is directly related to the most iconic of MATLAB functions, and one of the first functions that ever received <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span> support, seems rather fitting.<\/span><\/div><h2  style = 'margin: 20px 10px 5px 4px; padding: 0px; line-height: 20px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 20px; font-weight: 700; text-align: left; '><span>Moving beyond gpuArray<\/span><\/h2><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span style=' font-family: monospace;'>gpuArray<\/span><span> is the easiest way of getting many applications ported to use the GPU, but there's more to GPUs in MATLAB than <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span>. 
Here are some pointers to where you might go next:<\/span><\/div><ul  style = 'margin: 10px 0px 20px; padding-left: 0px; font-family: Helvetica, Arial, sans-serif; font-size: 14px; '><li  style = 'margin-left: 56px; line-height: 21px; min-height: 0px; text-align: left; white-space: pre-wrap; '><a href = \"https:\/\/uk.mathworks.com\/help\/parallel-computing\/gpuarray.arrayfun.html\"><span style=' font-family: monospace;'>arrayfun<\/span><\/a><span> - Compiles MATLAB functions to native GPU code<\/span><\/li><li  style = 'margin-left: 56px; line-height: 21px; min-height: 0px; text-align: left; white-space: pre-wrap; '><a href = \"https:\/\/uk.mathworks.com\/help\/parallel-computing\/run-cuda-or-ptx-code-on-gpu.html\"><span>Run CUDA or PTX Code on GPU<\/span><\/a><span> - Hand-write your own GPU kernels<\/span><\/li><li  style = 'margin-left: 56px; line-height: 21px; min-height: 0px; text-align: left; white-space: pre-wrap; '><a href = \"https:\/\/uk.mathworks.com\/help\/gpucoder\/\"><span>GPU Coder<\/span><\/a><span> - Generates CUDA Code from MATLAB code and Simulink models<\/span><\/li><\/ul><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>A great example that compares <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span> with <\/span><span style=' font-family: monospace;'>arrayfun<\/span><span> and hand-written CUDA MEX functions is <\/span><a href = \"https:\/\/uk.mathworks.com\/help\/parallel-computing\/illustrating-three-approaches-to-gpu-computing-the-mandelbrot-set.html\"><span>Illustrating Three Approaches to GPU Computing: The Mandelbrot Set<\/span><\/a><span>.<\/span><\/div><h2  style = 'margin: 20px 10px 5px 4px; padding: 0px; line-height: 20px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, 
sans-serif; font-style: normal; font-size: 20px; font-weight: 700; text-align: left; '><span>What functions do you need to be supported on the GPU?<\/span><\/h2><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>There are several drivers behind this growth of GPU-supported functions in MATLAB. In the early days, it was just a case of taking care of the obvious things: matrix operations, Fourier transforms, element-wise operations and so on. Over time we started ensuring that various workflows were taken care of, such as <\/span><a href = \"https:\/\/uk.mathworks.com\/help\/parallel-computing\/using-gpu-arrayfun-for-monte-carlo-simulations.html\"><span>Monte Carlo simulations<\/span><\/a><span>, <\/span><a href = \"https:\/\/uk.mathworks.com\/help\/deeplearning\/ug\/deep-learning-with-matlab-on-multiple-gpus.html\"><span>Deep Learning<\/span><\/a><span>, <\/span><a href = \"https:\/\/uk.mathworks.com\/help\/images\/image-processing-on-a-gpu.html\"><span>Image Processing<\/span><\/a><span> or more specialized things such as <\/span><a href = \"https:\/\/uk.mathworks.com\/help\/comm\/ug\/ldpc-link-simulation-using-gpu-processing.html\"><span>LDPC Link Simulation from Communications Toolbox<\/span><\/a><span>.<\/span><\/div><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>These days we still have some functions in the 'obvious' category: new MATLAB functions that obviously could do with a <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span> version. 
The <\/span><a href = \"https:\/\/uk.mathworks.com\/help\/matlab\/ref\/createarray.html\"><span style=' font-family: monospace;'>createArray<\/span><span> function<\/span><\/a><span>, introduced in R2024a, is an example of such a case. The greatest driver, however, concerning what additional functions get <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span> support is user requests. <\/span><\/div><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span>So, if you have a workflow that you think would benefit from additional <\/span><span style=' font-family: monospace;'>gpuArray<\/span><span> support, get in touch and let us know.<\/span><\/div><div  style = 'margin: 2px 10px 9px 4px; padding: 0px; line-height: 21px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 14px; font-weight: 400; text-align: left; '><span><\/span><\/div><h2  style = 'margin: 20px 10px 5px 4px; padding: 0px; line-height: 20px; min-height: 0px; white-space: pre-wrap; color: rgb(33, 33, 33); font-family: Helvetica, Arial, sans-serif; font-style: normal; font-size: 20px; font-weight: 700; text-align: left; '><\/h2>\n<\/div><script type=\"text\/javascript\">var css = ''; var head = document.head || document.getElementsByTagName('head')[0], style = document.createElement('style'); head.appendChild(style); style.type = 'text\/css'; if (style.styleSheet){ style.styleSheet.cssText = css; } else { style.appendChild(document.createTextNode(css)); }<\/script>","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img src=\"https:\/\/blogs.mathworks.com\/matlab\/files\/2024\/09\/GPU_1195_functions_2.png\" class=\"img-responsive attachment-post-thumbnail size-post-thumbnail 
wp-post-image\" alt=\"\" decoding=\"async\" loading=\"lazy\" \/><\/div><p>GPU support in MATLAB started in R2010b. Back in R2010b, the first GPU enabled functions were made available in MATLAB via Parallel Computing Toolbox. The idea was then, as it is now, to overload... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/matlab\/2024\/09\/26\/matlab-now-has-over-1000-functions-that-just-work-on-nvidia-gpus\/\">read more >><\/a><\/p>","protected":false},"author":176,"featured_media":2741,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[23,42,17,20,14],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/matlab\/wp-json\/wp\/v2\/posts\/2750"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/matlab\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/matlab\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/matlab\/wp-json\/wp\/v2\/users\/176"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/matlab\/wp-json\/wp\/v2\/comments?post=2750"}],"version-history":[{"count":1,"href":"https:\/\/blogs.mathworks.com\/matlab\/wp-json\/wp\/v2\/posts\/2750\/revisions"}],"predecessor-version":[{"id":2753,"href":"https:\/\/blogs.mathworks.com\/matlab\/wp-json\/wp\/v2\/posts\/2750\/revisions\/2753"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/matlab\/wp-json\/wp\/v2\/media\/2741"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/matlab\/wp-json\/wp\/v2\/media?parent=2750"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/matlab\/wp-json\/wp\/v2\/categories?post=2750"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/matlab\/wp-json\/wp\/v2\/tags?post=2750"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}