{"id":17369,"date":"2025-06-16T11:59:20","date_gmt":"2025-06-16T15:59:20","guid":{"rendered":"https:\/\/blogs.mathworks.com\/deep-learning\/?p=17369"},"modified":"2025-07-21T09:32:50","modified_gmt":"2025-07-21T13:32:50","slug":"accelerate-edge-ai-with-the-hexagon-hardware-support-package","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/deep-learning\/2025\/06\/16\/accelerate-edge-ai-with-the-hexagon-hardware-support-package\/","title":{"rendered":"Accelerate Edge AI with the Hexagon Hardware Support Package"},"content":{"rendered":"<h6><\/h6>\r\n<em>The following blog post is from <\/em><a href=\"https:\/\/www.linkedin.com\/in\/reed-axman-43287a13b\/\"><em>Reed Axman<\/em><\/a><em>, Strategic Partner Manager at MathWorks.<\/em>\r\n<h6><\/h6>\r\n&nbsp;\r\n<h6><\/h6>\r\nDeploying AI models to edge devices enables real-time data processing and decision-making without relying on constant cloud connectivity. Edge AI and <a href=\"https:\/\/www.mathworks.com\/solutions\/deep-learning\/embedded-ai.html\">embedded AI<\/a> reduce latency and improve responsiveness for critical applications. Bringing AI models closer to the point of use enhances privacy and security by keeping sensitive data local, minimizing the risk of data breaches during transmission. Additionally, this approach reduces bandwidth usage and operational costs, making intelligent features more accessible in resource-constrained or remote environments.\r\n<h6><\/h6>\r\nHowever, Edge AI comes with inherent challenges, such as limited compute, tight power budgets, and strict latency requirements. Neural processing units (NPUs) can help address these challenges by delivering fast, efficient neural network processing on-device. 
The <a href=\"https:\/\/www.mathworks.com\/hardware-support\/qualcomm-hexagon.html\">Hexagon Hardware Support Package<\/a> (HSP) from MathWorks supports the Qualcomm\u00ae Hexagon\u2122 NPU, enabling integration and deployment of AI models.\r\n<h6><\/h6>\r\nThis blog post covers what an NPU is, NPU applications and benefits, and how to install and set up the HSP. We'll also highlight the tools from Qualcomm integrated with the HSP and the optimizations it provides.\r\n<h6><\/h6>\r\n&nbsp;\r\n<h6><\/h6>\r\n<p style=\"font-size: 20px; color: #c04c0b;\"><strong>Introduction to NPUs<\/strong><\/p>\r\n<p style=\"font-size: 18px;\"><strong>What Is an NPU?<\/strong><\/p>\r\nA Neural Processing Unit (NPU) is a hardware component designed to accelerate AI and machine learning tasks. It is part of a <a href=\"https:\/\/www.mathworks.com\/discovery\/soc-architecture.html\">system-on-chip<\/a> (SoC) architecture and is optimized for deep learning inference at the edge. More specifically, NPUs excel at highly parallel computational tasks such as matrix multiplications and convolutions, which are fundamental operations in deep neural networks.\r\n<h6><\/h6>\r\n<p style=\"font-size: 18px;\"><strong>Why Use an NPU?<\/strong><\/p>\r\nUsing an NPU offers several advantages. NPUs handle AI workloads efficiently, providing faster inference times and lower latency. They consume less power than general-purpose processors, making them ideal for battery-powered devices. 
Additionally, NPUs can run various AI models, from simple neural networks to complex <a href=\"https:\/\/www.mathworks.com\/discovery\/deep-learning.html\">deep learning<\/a> architectures, offering scalability across applications.\r\n<h6><\/h6>\r\n<p style=\"font-size: 18px;\"><strong>Hexagon NPU and Its Applications<\/strong><\/p>\r\nThe <a href=\"https:\/\/www.qualcomm.com\/products\/technology\/processors\/hexagon\">Qualcomm Hexagon NPU<\/a> accelerates the neural network layers and operations found in popular models, such as activation functions, convolutions, fully connected layers, and transformers. It is used in smart speakers to enhance voice command recognition, in smart cameras to improve image processing and object detection, in healthcare devices for real-time patient monitoring and diagnostics, and in automotive systems to support advanced driver assistance systems (ADAS) and autonomous driving.\r\n<h6><\/h6>\r\n&nbsp;\r\n<h6><\/h6>\r\n<p style=\"font-size: 20px; color: #c04c0b;\"><strong>NPU Hardware Support Package<\/strong><\/p>\r\n<p style=\"font-size: 18px;\"><strong>Easy Installation and Setup of the HSP<\/strong><\/p>\r\nThe Hexagon Hardware Support Package is straightforward to install and set up. 
Here\u2019s how to get started:\r\n<ol>\r\n \t<li><strong>Download HSP<\/strong>: Visit <a href=\"http:\/\/mathworks.com\/qualcomm\">mathworks.com\/qualcomm<\/a> or use the Add-On Explorer in MATLAB and download the Hexagon Hardware Support Package.<\/li>\r\n \t<li><strong>Install HSP<\/strong>: Follow the three-step installation wizard.<\/li>\r\n \t<li><strong>Set Up Your Environment<\/strong>: Configure your hardware settings to target the Hexagon hardware or simulator for deployment.<\/li>\r\n \t<li><strong>Deploy Your Model<\/strong>: Use MATLAB and Simulink to design, simulate, and deploy your AI models to the Hexagon NPU.<\/li>\r\n<\/ol>\r\n<img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-17379\" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2025\/06\/Picture1.png\" alt=\"Deploying AI application from MATLAB to NPU\" width=\"689\" height=\"271\" \/>\r\n<h6><\/h6>\r\n<strong>Figure 1:<\/strong>\u00a0High-level workflow for deploying AI to the Hexagon NPU\r\n<h6><\/h6>\r\n&nbsp;\r\n<h6><\/h6>\r\n<p style=\"font-size: 18px;\"><strong>Integrated Tools from Qualcomm<\/strong><\/p>\r\nThe Hexagon Hardware Support Package integrates several tools from Qualcomm: the Hexagon Simulator for testing and validating models in a simulated environment, the transaction layer package (TLP), and sysMon.\r\n<h6><\/h6>\r\n<img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-17382 \" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2025\/06\/Picture2.png\" alt=\"\" width=\"717\" height=\"311\" \/>\r\n<h6><\/h6>\r\n<strong>Figure 2:<\/strong>\u00a0Tools integrated with the Qualcomm Hexagon Hardware Support Package\r\n<h6><\/h6>\r\n&nbsp;\r\n<h6><\/h6>\r\n<p style=\"font-size: 18px;\"><strong>Optimization Benefits<\/strong><\/p>\r\nThe HSP leverages Qualcomm's code libraries to deliver optimized performance for math and DSP operations, including scalar and vector code optimizations. 
The ability to take advantage of the NPU for AI acceleration provides further significant performance benefits.\r\n<h6><\/h6>\r\n<img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-17385 \" src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2025\/06\/Picture3.png\" alt=\"\" width=\"737\" height=\"427\" \/>\r\n<h6><\/h6>\r\n<strong>Figure 3:<\/strong>\u00a0The provided optimizations significantly reduce resource utilization for algorithms deployed to the Qualcomm Hexagon NPU\r\n<h6><\/h6>\r\n&nbsp;\r\n<h6><\/h6>\r\n<p style=\"font-size: 20px; color: #c04c0b;\"><strong>Final Thoughts<\/strong><\/p>\r\nThe Hexagon Hardware Support Package from MathWorks generates optimized code and simplifies deployment to the Qualcomm Hexagon NPU. With easy installation, integrated tools, and optimized performance, it enables developers to accelerate AI development and bring applications to the edge. Explore further resources from MathWorks and Qualcomm to enhance your AI workflow and take your embedded AI applications to production.\r\n<h6><\/h6>","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img src=\"https:\/\/blogs.mathworks.com\/deep-learning\/files\/2025\/06\/Picture1.png\" class=\"img-responsive attachment-post-thumbnail size-post-thumbnail wp-post-image\" alt=\"\" decoding=\"async\" loading=\"lazy\" \/><\/div><p>\r\nThe following blog post is from Reed Axman, Strategic Partner Manager at MathWorks.\r\n\r\n&nbsp;\r\n\r\nDeploying AI models to edge devices enables real-time data processing and decision-making without... 
<a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/deep-learning\/2025\/06\/16\/accelerate-edge-ai-with-the-hexagon-hardware-support-package\/\">read more >><\/a><\/p>","protected":false},"author":194,"featured_media":17379,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[36,9,68],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/17369"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/users\/194"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/comments?post=17369"}],"version-history":[{"count":16,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/17369\/revisions"}],"predecessor-version":[{"id":18233,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/posts\/17369\/revisions\/18233"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/media\/17379"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/media?parent=17369"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/categories?post=17369"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/deep-learning\/wp-json\/wp\/v2\/tags?post=17369"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}