{"id":1910,"date":"2025-10-13T10:22:35","date_gmt":"2025-10-13T10:22:35","guid":{"rendered":"https:\/\/blogs.mathworks.com\/finance\/?p=1910"},"modified":"2025-10-13T10:22:35","modified_gmt":"2025-10-13T10:22:35","slug":"build-a-rag-pipeline-in-matlab-from-document-ingestion-to-llm-driven-insights","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/finance\/2025\/10\/13\/build-a-rag-pipeline-in-matlab-from-document-ingestion-to-llm-driven-insights\/","title":{"rendered":"Build a RAG Pipeline in MATLAB: From Document Ingestion to LLM-Driven Insights"},"content":{"rendered":"<p><em>The following post is from\u00a0<\/em><a href=\"https:\/\/www.linkedin.com\/in\/yuchen-dong-48061582\/\" target=\"_blank\" rel=\"noopener\"><em>Yuchen Dong<\/em><\/a><em>, Senior Finance Application Engineer at MathWorks.<\/em><\/p>\n<p>The example featured in the blog can be found on GitHub\u00a0<a href=\"https:\/\/github.com\/ydong9107\/RAGinFinance\" target=\"_blank\" rel=\"noopener\">here<\/a>.<\/p>\n<p>Retrieval-Augmented Generation (RAG) has emerged as a powerful architecture to ground large language models (LLMs) in trusted, domain-specific data. In finance, pairing LLMs with curated sources, such as Federal Open Market Committee (FOMC) minutes, can drive more reliable insight generation.<\/p>\n<p>In this blog, we\u2019ll walk through how to build a RAG pipeline using MATLAB, from preprocessing FOMC documents all the way to generating responses with LLMs of your choice. 
You can tailor the database with hundreds of diverse, finance-related documents to suit your specific needs.<\/p>\n<h1><strong>Why It Matters<\/strong><\/h1>\n<p>With just a few lines of code, you can:<\/p>\n<ul>\n<li>Store vectorized insights from regulatory reports<\/li>\n<li>Search them efficiently using semantic similarity<\/li>\n<li>Provide LLM responses anchored in real-world financial data<\/li>\n<\/ul>\n<p>This RAG architecture is a powerful tool for compliance analysts, economists, and financial engineers seeking to extract value from unstructured documents.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" width=\"796\" height=\"347\" class=\"alignnone size-full wp-image-1913\" src=\"http:\/\/blogs.mathworks.com\/finance\/files\/2025\/10\/text-data-database-query.png\" alt=\"\" \/><\/p>\n<h1><strong>Step 1: Load Meeting Documents<\/strong><\/h1>\n<p>To begin, we load a document that reflects real-world financial discourse: minutes from an FOMC meeting, where U.S. monetary policy decisions are made. In this example, we use a single FOMC document in .pdf format.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" width=\"1024\" height=\"214\" class=\"alignnone size-large wp-image-1982\" src=\"http:\/\/blogs.mathworks.com\/finance\/files\/2025\/10\/llm_1b-1024x214.png\" alt=\"\" \/><\/p>\n<p>&nbsp;<\/p>\n<h1><strong>Step 2: Preprocess the Text<\/strong><\/h1>\n<p>Raw text data is often messy. It may include stop words, punctuation, and inconsistent word forms that can reduce the accuracy of downstream tasks like embedding and search. To prepare our FOMC meeting notes for analysis, we use the <strong>Preprocess Text Data<\/strong> live task in MATLAB to clean and normalize the content.<\/p>\n<p>This step includes tokenization, lemmatization, and removal of stop words and punctuation. 
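<\/p>\n<p>If you prefer a programmatic workflow over the live task, the same cleanup can be sketched with Text Analytics Toolbox functions (a minimal sketch; the file name is illustrative):<\/p>\n<pre><code class=\"language-matlab\">% Read the FOMC minutes and build tokenized documents\nstr = extractFileText('fomcminutes.pdf');   % illustrative file name\ndocs = tokenizedDocument(str);\n\n% Add part-of-speech details (needed for lemmatization), then normalize\ndocs = addPartOfSpeechDetails(docs);\ndocs = normalizeWords(docs, 'Style', 'lemma');\ndocs = removeStopWords(docs);\ndocs = erasePunctuation(docs);\n\n% Visualize dominant terms\nfigure\nwordcloud(docs);<\/code><\/pre>\n<p>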
The result is a cleaner, more structured text suitable for vectorization.<\/p>\n<p>To visualize what terms dominate the discussion, we generate a <strong>word cloud<\/strong> of the preprocessed content:<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" width=\"741\" height=\"392\" class=\"alignnone size-full wp-image-1919\" src=\"http:\/\/blogs.mathworks.com\/finance\/files\/2025\/10\/preprocess-text-data.png\" alt=\"\" \/><\/p>\n<p>&nbsp;<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-1949\" src=\"http:\/\/blogs.mathworks.com\/finance\/files\/2025\/10\/Word-Cloud-hi-res.png\" alt=\"\" width=\"734\" height=\"528\" \/><\/p>\n<p><em>This highlights dominant themes, such as inflation, policy rates, and employment.<\/em><\/p>\n<p>&nbsp;<\/p>\n<h1><strong>Step 3: Chunk the Text<\/strong><\/h1>\n<p>Before embedding the text, we apply two key steps:<\/p>\n<ol>\n<li><strong>Filter out short fragments<\/strong> that contain fewer than three tokens; these often represent incomplete or uninformative sentences.<\/li>\n<li><strong>Split the cleaned document into fixed-size chunks<\/strong> (in this case, ~128 tokens each), ensuring the resulting segments are manageable for vectorization and LLM input limits.<\/li>\n<\/ol>\n<p><img decoding=\"async\" loading=\"lazy\" width=\"1024\" height=\"387\" class=\"alignnone size-large wp-image-1985\" src=\"http:\/\/blogs.mathworks.com\/finance\/files\/2025\/10\/llm_2-1024x387.png\" alt=\"\" \/><\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-1925\" src=\"http:\/\/blogs.mathworks.com\/finance\/files\/2025\/10\/number-of-tokens.png\" alt=\"\" width=\"632\" height=\"419\" \/><\/p>\n<h1><strong>Step 4: Vectorize the Document Chunks<\/strong><\/h1>\n<p>This step uses a pretrained embedding model that requires the Text Analytics Toolbox Model for all-MiniLM-L6-v2 Network or all-MiniLM-L12-v2 Network support package.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" width=\"1024\" height=\"88\" 
class=\"alignnone size-large wp-image-1991\" src=\"http:\/\/blogs.mathworks.com\/finance\/files\/2025\/10\/llm_3-1024x88.png\" alt=\"\" \/><\/p>\n<p>If the support package is not installed, it can be downloaded from the Add-Ons Menu.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-1931\" src=\"http:\/\/blogs.mathworks.com\/finance\/files\/2025\/10\/text-analytics-toolbox-model.png\" alt=\"\" width=\"605\" height=\"456\" \/><\/p>\n<p>With our text chunks ready, we now convert them into numerical vectors using a <strong>pretrained embedding model<\/strong>. These embeddings form the semantic backbone of our Retrieval-Augmented Generation (RAG) system.<\/p>\n<h1><strong><img decoding=\"async\" loading=\"lazy\" width=\"1024\" height=\"178\" class=\"alignnone size-large wp-image-1988\" src=\"http:\/\/blogs.mathworks.com\/finance\/files\/2025\/10\/llm_4-1024x178.png\" alt=\"\" \/><\/strong><\/h1>\n<h1><strong>Step 5: Store Embeddings in a Vector Database<\/strong><\/h1>\n<p>To support fast and accurate retrieval for RAG workflows, we store the document embeddings in a vector database: specifically, PostgreSQL with the pgvector extension.<\/p>\n<p>The pgvector extension enables you to store and query high-dimensional embedding vectors directly within PostgreSQL, along with any related structured data. 
This streamlines your architecture by combining traditional relational data and semantic search into one system.<\/p>\n<p>Once pgvector is installed, you can use the <strong>Database Explorer App<\/strong> in MATLAB to connect to your PostgreSQL instance and view or query the embedded vectors, such as those we generated from the FOMC meeting notes.<\/p>\n<p>&nbsp;<\/p>\n<h1><strong>Step 6: Retrieve Documents Based on a Query<\/strong><\/h1>\n<p>Suppose we want to ask:<\/p>\n<p><strong>&#8220;Will the Federal Funds rate decrease in the next 3 months?&#8221;<\/strong><\/p>\n<p>We embed the query with the same model and search the vector database for its nearest neighbors. The results are the most semantically relevant document snippets related to the query.<\/p>\n<p>&nbsp;<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-1934\" src=\"http:\/\/blogs.mathworks.com\/finance\/files\/2025\/10\/selected-docs.png\" alt=\"\" width=\"825\" height=\"362\" \/><\/p>\n<p>&nbsp;<\/p>\n<h1><strong>Step 7: Visualize and Validate Similarity<\/strong><\/h1>\n<p>To evaluate the results, we compute cosine similarity and plot a t-SNE map to show how close the query is to the returned documents in vector space.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" width=\"1024\" height=\"293\" class=\"alignnone size-large wp-image-1994\" src=\"http:\/\/blogs.mathworks.com\/finance\/files\/2025\/10\/llm_5-1024x293.png\" alt=\"\" \/><\/p>\n<h1><strong>Step 8: Ask the LLM to Generate a Response<\/strong><\/h1>\n<p>With the top matching chunks, we construct a prompt and generate a grounded answer using an LLM. 
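<\/p>\n<p>Retrieval and generation can be sketched as follows, assuming <code>chunks<\/code>, their embedding matrix <code>emb<\/code>, and the embedded query <code>queryEmb<\/code> from the earlier steps (variable names are illustrative; generation uses the Large Language Models with MATLAB add-on [1] and an OpenAI API key in the environment):<\/p>\n<pre><code class=\"language-matlab\">% Score every chunk against the query and keep the top matches\nscores = cosineSimilarity(emb, queryEmb);\n[~, idx] = maxk(scores, 5);\nretrievedChunks = chunks(idx);\n\n% Assemble the retrieved chunks into grounding context for the prompt\ncontext = strjoin(retrievedChunks, newline);\nquery = 'Will the Federal Funds rate decrease in the next 3 months?';\nprompt = sprintf(['Answer the question using only the context below.' ...\n    '\\n\\nContext:\\n%s\\n\\nQuestion: %s'], context, query);\n\n% Generate a grounded answer with the LLM\nchat = openAIChat('You are a financial analyst assistant.');\nanswer = generate(chat, prompt);\ndisp(answer)<\/code><\/pre>\n<p>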
The output provides a detailed response supported by FOMC context: grounded, explainable, and tailored to financial analysis.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-1937\" src=\"http:\/\/blogs.mathworks.com\/finance\/files\/2025\/10\/output.png\" alt=\"\" width=\"863\" height=\"560\" \/><\/p>\n<p>The example featured in the blog can be found on GitHub\u00a0<a href=\"https:\/\/github.com\/ydong9107\/RAGinFinance\" target=\"_blank\" rel=\"noopener\">here<\/a>.<\/p>\n<h1><strong>References<\/strong><\/h1>\n<p>[1] Large Language Models with MATLAB: <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/163796-large-language-models-llms-with-matlab\" target=\"_blank\" rel=\"noopener\">https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/163796-large-language-models-llms-with-matlab<\/a><\/p>\n<p>[2] Information Retrieval with Document Embeddings:\u00a0<a href=\"https:\/\/www.mathworks.com\/help\/textanalytics\/ug\/information-retrieval-with-document-embeddings.html\" target=\"_blank\" rel=\"noopener\">https:\/\/www.mathworks.com\/help\/textanalytics\/ug\/information-retrieval-with-document-embeddings.html<\/a><\/p>\n<p>[3] Board of Governors of the Federal Reserve System: <a href=\"https:\/\/www.federalreserve.gov\/monetarypolicy\/fomccalendars.htm\" target=\"_blank\" rel=\"noopener\">https:\/\/www.federalreserve.gov\/monetarypolicy\/fomccalendars.htm<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img decoding=\"async\"  class=\"img-responsive\" src=\"http:\/\/blogs.mathworks.com\/finance\/files\/2025\/10\/text-data-database-query.png\" onError=\"this.style.display ='none';\" \/><\/div>\n<p>The following post is from\u00a0Yuchen Dong, Senior Finance Application Engineer at MathWorks.<br \/>\nThe example featured in the blog can be found on GitHub\u00a0here.<br \/>\nRetrieval-Augmented Generation (RAG) has&#8230; <a class=\"read-more\" 
href=\"https:\/\/blogs.mathworks.com\/finance\/2025\/10\/13\/build-a-rag-pipeline-in-matlab-from-document-ingestion-to-llm-driven-insights\/\">read more >><\/a><\/p>\n","protected":false},"author":233,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[10],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/finance\/wp-json\/wp\/v2\/posts\/1910"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/finance\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/finance\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/finance\/wp-json\/wp\/v2\/users\/233"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/finance\/wp-json\/wp\/v2\/comments?post=1910"}],"version-history":[{"count":20,"href":"https:\/\/blogs.mathworks.com\/finance\/wp-json\/wp\/v2\/posts\/1910\/revisions"}],"predecessor-version":[{"id":2021,"href":"https:\/\/blogs.mathworks.com\/finance\/wp-json\/wp\/v2\/posts\/1910\/revisions\/2021"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/finance\/wp-json\/wp\/v2\/media?parent=1910"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/finance\/wp-json\/wp\/v2\/categories?post=1910"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/finance\/wp-json\/wp\/v2\/tags?post=1910"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}