{"id":3347,"date":"2019-05-29T08:27:26","date_gmt":"2019-05-29T13:27:26","guid":{"rendered":"https:\/\/blogs.mathworks.com\/loren\/?p=3347"},"modified":"2019-05-27T18:31:45","modified_gmt":"2019-05-27T23:31:45","slug":"big-data-in-mat-files","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/loren\/2019\/05\/29\/big-data-in-mat-files\/","title":{"rendered":"Big Data in MAT Files"},"content":{"rendered":"\r\n<div class=\"content\"><!--introduction--><p><i>Today's guest blogger is Adam Filion, a Senior Product Manager at MathWorks. Adam helps manage and prioritize our development efforts in data science and big data.<\/i><\/p><p>MAT files are an easy and common way to store MATLAB variables to disk. They support all MATLAB variable types, have good data compression, and can be accessed or created from other applications through an <a href=\"https:\/\/www.mathworks.com\/help\/pdf_doc\/matlab\/matfile_format.pdf\">external API<\/a>. MATLAB users sometimes have so much data stored in MAT files that they can't load all the data at once. 
In this post, we will explore different situations and solutions for analyzing large amounts of data stored in MAT files.<\/p><!--\/introduction--><h3>Contents<\/h3><div><ul><li><a href=\"#7db0337f-e343-4839-ad68-c316e656cfef\">Introduction to MAT Files<\/a><\/li><li><a href=\"#83b12fdd-9422-4800-badc-08eb0f01d0ee\">Large Collections of Small MAT Files<\/a><\/li><li><a href=\"#b24ee99b-5f1c-492a-ac4d-04dfc187502b\">Large MAT Files with Many Small Variables<\/a><\/li><li><a href=\"#c7ed4031-73f2-421f-9bbc-f652ba9f04a6\">Large MAT Files with Large Variables<\/a><\/li><li><a href=\"#c0c7037c-c772-401b-96fd-8a75f05fd20b\">MAT Files Logged from Simulink Simulations<\/a><\/li><li><a href=\"#93ea4f81-5e5f-49ab-bd1c-e31e27e5e47b\">Summary<\/a><\/li><\/ul><\/div><h4>Introduction to MAT Files<a name=\"7db0337f-e343-4839-ad68-c316e656cfef\"><\/a><\/h4><p><b>Using MAT files<\/b><\/p><p>MATLAB provides the ability to save variables to MAT files through the <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/save.html\">save<\/a> command.<\/p><pre class=\"codeinput\">a = pi;\r\nb = rand(1,10);\r\nsave <span class=\"string\">mydata.mat<\/span> <span class=\"string\">a<\/span> <span class=\"string\">b<\/span>\r\n<\/pre><p>These variables can be returned to the workspace using <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/load.html\">load<\/a>.<\/p><pre class=\"codeinput\"><span class=\"comment\">% return all variables to the workspace<\/span>\r\nload <span class=\"string\">mydata.mat<\/span>\r\n<span class=\"comment\">% find which variables are contained in a MAT file<\/span>\r\nvarNames = who(<span class=\"string\">\"-file\"<\/span>,<span class=\"string\">\"mydata.mat\"<\/span>);\r\n<span class=\"comment\">% return only the second variable to the workspace<\/span>\r\nload(<span class=\"string\">\"mydata.mat\"<\/span>,varNames{2})\r\n<\/pre><p><b>MAT file versions<\/b><\/p><p>MAT files have evolved over time and <a 
href=\"https:\/\/www.mathworks.com\/help\/matlab\/import_export\/mat-file-versions.html\">several different versions<\/a> exist. You can change the version to use when saving data by passing an additional flag, such as <tt>\"-v7.3\"<\/tt>, to the <tt>save<\/tt> command. The biggest differences are summarized in the table below.<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2019\/MATfileversions.png\" alt=\"\"> <\/p><p>Version 7 is the default and should be used unless you need the additional functionality provided in Version 7.3. This is because, as mentioned in the documentation, Version 7.3 contains additional header information and may result in larger files than Version 7 when storing small amounts of data. Only create Version 6 or Version 4 MAT files if you need compatibility with older legacy applications.<\/p><p><b>When to store big data in MAT files<\/b><\/p><p>Most users analyzing large amounts of MAT file data did not choose the storage format themselves; but if you could, when would it make sense to store big data in MAT files? They are a good choice when the following three conditions apply:<\/p><div><ol><li><b>The data is originally recorded in MAT files.<\/b> This occurs when saving variables from the MATLAB workspace, logging data from Simulink simulations, or recording data from certain third-party data loggers which generate MAT files automatically. If your data does not naturally come in MAT files, it usually should be left in its original format.<\/li><li><b>The data is staying in the MATLAB ecosystem.<\/b> MAT files are simple to use and are a lossless storage format, meaning that you will never lose any information or accuracy when storing MATLAB variables. However, since they are not easily accessible from other applications, it is typically better to use another file format (e.g. csv, Parquet, etc.) 
when exchanging data with other applications.<\/li><li><b>MAT files easily work in your file storage system.<\/b> Some file systems impose additional requirements on files stored within them. For example, in the Hadoop Distributed File System (HDFS) it is difficult to use files that are not splitable, which is a feature MAT files do not support. In such situations, you should consider if a different file format that supports the file system requirements would be a better choice.<\/li><\/ol><\/div><p><b>Big data in MAT file situations<\/b><\/p><p>If all your MAT file data can be easily loaded into memory and analyzed at the same time, use the <tt>load<\/tt> command outlined at the beginning. For the rest of this post, we will explore the four general situations when MAT file data gets too large to work with at once.<\/p><div><ol><li>Large collections of small MAT files<\/li><li>Large MAT files with many small variables<\/li><li>Large MAT files with large variables<\/li><li>MAT files logged from Simulink simulations<\/li><\/ol><\/div><h4>Large Collections of Small MAT Files<a name=\"83b12fdd-9422-4800-badc-08eb0f01d0ee\"><\/a><\/h4><p>Often data is recorded from different entities (e.g. weather stations, vehicles, simulations, etc.) and each entity is stored in a separate file. Even if each individual MAT file can easily fit into memory, the total collection can grow large enough that we cannot work with all of it at once. When this happens, there are two solutions based on the type of analysis we need to do.<\/p><p><b>Embarrassingly Parallel Analysis<\/b><\/p><p>If the work we are doing is embarrassingly parallel, meaning that each file can be analyzed in isolation, then we can loop through the files one at a time. 
If <a href=\"https:\/\/www.mathworks.com\/products\/parallel-computing.html\">Parallel Computing Toolbox<\/a> is available, we can accelerate the process by using a <a href=\"https:\/\/www.mathworks.com\/help\/parallel-computing\/parfor.html\"><tt>parfor<\/tt><\/a> loop instead of a <tt>for<\/tt> loop.<\/p><pre class=\"codeinput\"><span class=\"comment\">% find .mat files in current directory<\/span>\r\nd = dir(<span class=\"string\">\"*.mat\"<\/span>);\r\n<span class=\"comment\">% loop through with parfor, or use a plain for loop without Parallel Computing Toolbox<\/span>\r\n<span class=\"keyword\">parfor<\/span> ii = 1:length(d)\r\n    <span class=\"comment\">% load the next .mat file into a struct with one field per variable<\/span>\r\n    data = load(d(ii).name);\r\n    <span class=\"comment\">% perform your analysis on each individual file (doAnalysis stands in for your own function)<\/span>\r\n    doAnalysis(data)\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><p><b>Inherently Sequential Analysis<\/b><\/p><p>When our files cannot be analyzed in isolation, we need to change our approach. The <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/matlab.io.datastore.filedatastore.html\"><tt>fileDatastore<\/tt><\/a> gives access to large collections of files by using a custom file reader function. For example, if your analysis only needs the variable <tt>\"b\"<\/tt> from your MAT files, you can use a reader function such as:<\/p><pre class=\"language-matlab\"><span class=\"keyword\">function<\/span> data = myReader(fileName,varName)\r\n  matData = load(fileName,varName);\r\n  data = matData.(varName);\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><p>If your data is stored in more complicated or irregular formats, you can use arbitrary code in your reader function to return the values in the format you need. 
Once we define our reader function, we can create a <tt>fileDatastore<\/tt>, which will read one file at a time using our reader function.<\/p><pre class=\"codeinput\">fds = fileDatastore(<span class=\"string\">\"*.mat\"<\/span>, <span class=\"string\">\"ReadFcn\"<\/span>, @(fn) myReader(fn,<span class=\"string\">\"b\"<\/span>), <span class=\"string\">\"UniformRead\"<\/span>, true);\r\n<\/pre><p>Note that by default the <tt>fileDatastore<\/tt> will return each file's contents as an element in a cell array. The <tt>UniformRead<\/tt> option will instead keep the data's original format and vertically concatenate the data from different files.<\/p><p>After creating the datastore we can read a portion of the dataset with the <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/matlab.io.datastore.read.html\"><tt>read<\/tt><\/a> method or analyze the full out-of-memory dataset with <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/tall-arrays.html\">tall arrays<\/a>.<\/p><pre class=\"codeinput\"><span class=\"comment\">% read once from the datastore<\/span>\r\nt = read(fds);\r\n<span class=\"comment\">% create a tall array<\/span>\r\ntall_t = tall(fds);\r\n<\/pre><p>Unlike the <tt>load<\/tt> command, <tt>fileDatastore<\/tt> also supports <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/import_export\/work-with-remote-data.html\">remote storage systems<\/a> including Amazon S3, Azure Blob Storage and the Hadoop Distributed File System. 
For example, to use Amazon S3 make the following modifications:<\/p><pre class=\"language-matlab\">setenv(<span class=\"string\">\"AWS_ACCESS_KEY_ID\"<\/span>, <span class=\"string\">\"YOUR_AWS_ACCESS_KEY\"<\/span>)\r\nsetenv(<span class=\"string\">\"AWS_SECRET_ACCESS_KEY\"<\/span>, <span class=\"string\">\"YOUR_AWS_SECRET_ACCESS_KEY\"<\/span>)\r\nfds = fileDatastore(<span class=\"string\">\"s3:\/\/bucketname\/dataset\/*.mat\"<\/span>, <span class=\"string\">\"ReadFcn\"<\/span>, @(fn) myReader(fn,<span class=\"string\">\"b\"<\/span>), <span class=\"string\">\"UniformRead\"<\/span>, true);\r\n<\/pre><p>However, note that the <tt>fileDatastore<\/tt> automatically makes a local copy of each file it reads, which may result in downloading the entire dataset when it is stored remotely. If this is problematic, consider rewriting your data to another file format so you can use a datastore that does not require local copies. This is discussed in more detail later in this post.<\/p><h4>Large MAT Files with Many Small Variables<a name=\"b24ee99b-5f1c-492a-ac4d-04dfc187502b\"><\/a><\/h4><p>MAT files can individually be too large to load either because they have many small variables or have large variables. Files with many small variables arise when logging many signals from simulations or data loggers, or by adding more variables to a MAT file over time using the <tt>save<\/tt> command's <tt>-append<\/tt> option.<\/p><pre class=\"codeinput\">c = eye(10);\r\n<span class=\"comment\">% add another variable to the file<\/span>\r\nsave <span class=\"string\">mydata.mat<\/span> <span class=\"string\">c<\/span> <span class=\"string\">-append<\/span>\r\n<\/pre><p><b>Use a Subset of Variables<\/b><\/p><p>When working with MAT files containing too many small variables to load all at once, one approach is to only load certain variables needed for your analysis as we did in the prior section. 
If this reduces the data needed from each individual file such that each call to the <tt>read<\/tt> method fits into memory, then we can use the example from the previous section to avoid running out of memory.<\/p><p><b>Use a Portion of All Variables<\/b><\/p><p>However, if even after selecting only the necessary variables the data from individual files is still too large to fit into memory then we must try a different approach. In the prior section we used <tt>fileDatastore<\/tt> to read entire MAT files with a custom reader function. The <tt>fileDatastore<\/tt> also supports reading only parts of a file at a time. By adding additional logic into our reader function to manage the current state of reading through a large file, we can grab a portion of each variable.<\/p><p>Let's assume that in our collection of MAT files each file contains the same number of variables with the same names. Let's also assume all variables within a particular file are column vectors of the same length. We can then use <tt>matfile<\/tt> objects (described in more detail below) within the following reader function to partially read only a certain number of rows from each variable and concatenate them into a table.<\/p><pre class=\"language-matlab\"><span class=\"keyword\">function<\/span> [data,readCounter,done] = partialReadFcn(filename,readCounter)\r\n    <span class=\"comment\">% create MAT file object<\/span>\r\n    m = matfile(filename);\r\n    <span class=\"comment\">% initialize readCounter<\/span>\r\n    <span class=\"keyword\">if<\/span> isempty(readCounter)\r\n        readCounter = 0;\r\n    <span class=\"keyword\">end<\/span>\r\n    <span class=\"comment\">% default read size in number of rows<\/span>\r\n    readSize = 3e4;\r\n    <span class=\"comment\">% number of rows in the column vectors<\/span>\r\n    arrayLength = size(m,<span class=\"string\">\"x\"<\/span>,1);\r\n    <span class=\"keyword\">if<\/span> (arrayLength - readCounter*readSize) &gt; readSize\r\n        
<span class=\"comment\">% if there's more left to read than readSize, we're not done...<\/span>\r\n        done = false;\r\n    <span class=\"keyword\">else<\/span>\r\n        <span class=\"comment\">% ...otherwise this is the last read from the file<\/span>\r\n        done = true;\r\n    <span class=\"keyword\">end<\/span>\r\n    <span class=\"comment\">% rows to read, computed with the full readSize so the last chunk starts at the right row<\/span>\r\n    firstRow = 1 + readSize*readCounter;\r\n    lastRow = min(firstRow + readSize - 1, arrayLength);\r\n    readRange = firstRow:lastRow;\r\n    <span class=\"comment\">% adjust readSize to the number of rows actually read<\/span>\r\n    readSize = numel(readRange);\r\n    readCounter = readCounter+1;\r\n    <span class=\"comment\">% read portion of all variables<\/span>\r\n    varNames = who(<span class=\"string\">\"-file\"<\/span>,filename);\r\n    data = nan(readSize,length(varNames));\r\n    <span class=\"keyword\">for<\/span> ii = 1:length(varNames)\r\n        data(:,ii) = m.(varNames{ii})(readRange,1);\r\n    <span class=\"keyword\">end<\/span>\r\n    data = array2table(data,<span class=\"string\">\"VariableNames\"<\/span>,varNames);\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><pre class=\"codeinput\">x = rand(1e6,1);\r\ny = rand(1e6,1);\r\nz = rand(1e6,1);\r\nsave <span class=\"string\">smallVars1.mat<\/span> <span class=\"string\">x<\/span> <span class=\"string\">y<\/span> <span class=\"string\">z<\/span> <span class=\"string\">-v7.3<\/span>\r\nsave <span class=\"string\">smallVars2.mat<\/span> <span class=\"string\">x<\/span> <span class=\"string\">y<\/span> <span class=\"string\">z<\/span> <span class=\"string\">-v7.3<\/span>\r\nfds_partial = fileDatastore(<span class=\"string\">\"smallVars*.mat\"<\/span>, <span class=\"string\">\"ReadFcn\"<\/span>, @partialReadFcn, <span class=\"string\">\"UniformRead\"<\/span>, true, <span class=\"string\">\"ReadMode\"<\/span>, <span class=\"string\">\"partialfile\"<\/span>);\r\n<span class=\"comment\">% reads number of rows specified in reader function<\/span>\r\nt_partial = read(fds_partial);\r\nsize(t_partial)\r\n<\/pre><pre class=\"codeoutput\">ans =\r\n       30000      
     3\r\n<\/pre><p>The partial reading of <tt>fileDatastore<\/tt> lets you parse arbitrarily large files with an arbitrary reader function. If you want even more control over how a datastore processes a data source, consider using <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/import_export\/develop-custom-datastore.html\">custom datastores<\/a>. By using custom datastores you get access to the low-level tools that MathWorks developers use when developing datastores for new data sources. While they can be challenging to write from scratch, custom datastores give you complete control over how the datastore behaves.<\/p><h4>Large MAT Files with Large Variables<a name=\"c7ed4031-73f2-421f-9bbc-f652ba9f04a6\"><\/a><\/h4><p><b>MATFILE Objects<\/b><\/p><p>Up through Version 7 MAT files, individual variables are limited to 2GB in size. Version 7.3 removes this restriction, allowing variables to be arbitrarily large. MATLAB's <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/matfile.html\"><tt>matfile<\/tt><\/a> objects enable users to access and change variables stored in Version 7.3 MAT files without loading the entire variable into memory.<\/p><pre class=\"codeinput\">save <span class=\"string\">mydata_7_3.mat<\/span> <span class=\"string\">a<\/span> <span class=\"string\">b<\/span> <span class=\"string\">c<\/span> <span class=\"string\">-v7.3<\/span>\r\nm = matfile(<span class=\"string\">\"mydata_7_3.mat\"<\/span>,<span class=\"string\">\"Writable\"<\/span>,true);\r\n<span class=\"comment\">% read only first three rows of variable \"c\"<\/span>\r\nm.c(1:3,:)\r\n<\/pre><pre class=\"codeoutput\">ans =\r\n     1     0     0     0     0     0     0     0     0     0\r\n     0     1     0     0     0     0     0     0     0     0\r\n     0     0     1     0     0     0     0     0     0     0\r\n<\/pre><pre class=\"codeinput\"><span class=\"comment\">% write values from \"b\" to \"c\"<\/span>\r\nm.c(1:2,:) = [m.b(1,:); m.b(1,:)];\r\nm.c(1:3,:)\r\n<\/pre><pre 
class=\"codeoutput\">ans =\r\n  Columns 1 through 7\r\n    0.3973    0.0812    0.5761    0.3502    0.3579    0.3944    0.0965\r\n    0.3973    0.0812    0.5761    0.3502    0.3579    0.3944    0.0965\r\n         0         0    1.0000         0         0         0         0\r\n  Columns 8 through 10\r\n    0.1076    0.8506    0.1651\r\n    0.1076    0.8506    0.1651\r\n         0         0         0\r\n<\/pre><p>These <tt>matfile<\/tt> objects can then be used in a loop or combined with a <tt>fileDatastore<\/tt> as in the above example to process individual variables that are arbitrarily large.<\/p><p>While <tt>matfile<\/tt> objects are easy to use, they have several <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/matfile.html#bt2ft8s-6\">limitations<\/a> that restrict the situations where they can be used. The biggest restrictions are:<\/p><div><ul><li>Partial reading\/writing of variables is only supported with Version 7.3 MAT files<\/li><li>Partial reading\/writing is not supported for some heterogeneous datatypes such as tables, so those datatypes must be read or written as whole variables<\/li><\/ul><\/div><p><b>Rewriting MAT Files to Another Format<\/b><\/p><p>If <tt>matfile<\/tt> objects don't meet your needs, you could consider the custom datastores mentioned above or rewrite the MAT files in another format. Rewriting your data from MAT files to another file format may make sense when you:<\/p><div><ul><li>Need functionality not available with MAT files (e.g. splitability)<\/li><li>Need to interchange data with other applications<\/li><li>Need to work with remote datasets, and the local file requirements of <tt>fileDatastore<\/tt> are problematic<\/li><\/ul><\/div><p>One such format is Parquet. The <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/parquet-files.html\">Parquet<\/a> file format is a columnar data storage format designed for the Hadoop ecosystem, though Parquet files can be used in any environment. 
They support splitability, fast I\/O performance, and are a common data interchange format. Parquet files are typically kept relatively small as they are meant to fit in the Hadoop Distributed File System's 128MB block size.<\/p><p>As of R2019a MATLAB has built-in support for reading and writing Parquet files. As Parquet is designed for heterogeneous columnar data, it requires a table or timetable variable. You can interact with Parquet files from MATLAB using <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/parquetread.html\"><tt>parquetread<\/tt><\/a>, <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/parquetwrite.html\"><tt>parquetwrite<\/tt><\/a>, and <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/matlab.io.parquet.parquetinfo.html\"><tt>parquetinfo<\/tt><\/a>.<\/p><pre class=\"codeinput\">parquetwrite(<span class=\"string\">\"parquetData.parquet\"<\/span>,t_partial)\r\n<\/pre><p>If you need to do some processing using tall arrays before rewriting the data, the tall array results can be written directly to Parquet using the tall array <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/tall.write.html\"><tt>write<\/tt><\/a> command.<\/p><pre class=\"codeinput\">tall_partial = tall(fds_partial);\r\nwrite(<span class=\"string\">\"data\/p*.parquet\"<\/span>,tall_partial,<span class=\"string\">\"FileType\"<\/span>,<span class=\"string\">\"parquet\"<\/span>)\r\n<\/pre><pre class=\"codeoutput\">Writing tall data to folder C:\\Work\\ArtofMATLAB\\AdamF\\largeMATfiles\\data\r\nEvaluating tall expression using the Local MATLAB Session:\r\n- Pass 1 of 1: Completed in 3.9 sec\r\nEvaluation completed in 4 sec\r\n<\/pre><p>Once you have rewritten your data to Parquet, you can use the full Parquet dataset with the <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/matlab.io.datastore.parquetdatastore.html\"><tt>parquetDatastore<\/tt><\/a>. 
Unlike the <tt>fileDatastore<\/tt>, the <tt>parquetDatastore<\/tt> does not require making a local copy of each file.<\/p><pre class=\"codeinput\">pds = parquetDatastore(<span class=\"string\">\"data\\*.parquet\"<\/span>);\r\nt_parquet = read(pds);\r\ntall_parquet = tall(pds);\r\n<\/pre><p>Switching to another file format does come with its own concerns:<\/p><div><ul><li>Large amounts of data are being duplicated, which can consume large amounts of both time and disk space.<\/li><li>Some information may be lost when writing to a new format. For example, Parquet files do not preserve the timezone property of datetime values. To maintain the timezone information, you must manually save the information to another variable in your table or timetable before writing it to the Parquet file.<\/li><\/ul><\/div><h4>MAT Files Logged from Simulink Simulations<a name=\"c0c7037c-c772-401b-96fd-8a75f05fd20b\"><\/a><\/h4><p>The situations discussed above can arise from many different data sources. One special source of large quantities of MAT file data is data logged from Simulink simulations. In this post we treat this situation differently as MAT files that come from Simulink logging:<\/p><div><ol><li>Store their data in a special, nested data structure within the MAT file<\/li><li>Can use a <a href=\"https:\/\/www.mathworks.com\/help\/simulink\/slref\/matlab.io.datastore.simulationdatastore-class.html\"><tt>simulationDatastore<\/tt><\/a> specifically designed for working with large amounts of MAT files logged from Simulink<\/li><\/ol><\/div><p>A <a href=\"https:\/\/www.mathworks.com\/help\/simulink\/slref\/matlab.io.datastore.simulationdatastore-class.html\"><tt>simulationDatastore<\/tt><\/a> enables a Simulink model to interact with big data. You can load big data as simulation input and log big output data from a simulation. 
The documentation page for <a href=\"https:\/\/www.mathworks.com\/help\/simulink\/ug\/work-with-big-data-for-simulations.html\">Working with Big Data for Simulations<\/a> contains details of creating and using data logged from Simulink. The general idea is to start by generating the data from Simulink:<\/p><pre class=\"codeinput\"><span class=\"comment\">% load Simulink model<\/span>\r\nload_system(<span class=\"string\">\"sldemo_fuelsys\"<\/span>)\r\n<span class=\"comment\">% turn on data logging to a MAT file<\/span>\r\nset_param(<span class=\"string\">\"sldemo_fuelsys\"<\/span>,<span class=\"string\">\"LoggingToFile\"<\/span>,<span class=\"string\">\"on\"<\/span>)\r\n<span class=\"comment\">% run model and log data<\/span>\r\nsim(<span class=\"string\">\"sldemo_fuelsys\"<\/span>)\r\n<span class=\"comment\">% close model without saving<\/span>\r\nclose_system(<span class=\"string\">\"sldemo_fuelsys\"<\/span>,0)\r\n<\/pre><p>Once your data is logged, use <a href=\"https:\/\/www.mathworks.com\/help\/simulink\/slref\/simulink.simulationdata.datasetref-class.html\"><tt>DatasetRef<\/tt><\/a> to access the <tt>simulationDatastore<\/tt> objects.<\/p><pre class=\"codeinput\">DSRef = Simulink.SimulationData.DatasetRef(<span class=\"string\">\"out.mat\"<\/span>,<span class=\"string\">\"sldemo_fuelsys_output\"<\/span>);\r\n<span class=\"comment\">% return a simulationDatastore for fuel signal<\/span>\r\nds = DSRef.getAsDatastore(<span class=\"string\">\"fuel\"<\/span>).Values;\r\n<span class=\"comment\">% take a single read of the fuel signal from the MAT file<\/span>\r\nt_sim = read(ds);\r\n<span class=\"comment\">% treat all the fuel data as a tall variable<\/span>\r\ntall_t_sim = tall(ds);\r\n<\/pre><h4>Summary<a name=\"93ea4f81-5e5f-49ab-bd1c-e31e27e5e47b\"><\/a><\/h4><p>In this post we explored several different situations and solutions when dealing with big data in MAT files. Many of the solutions we explored can be combined. 
General recommendations for which tool to start with are summarized in the figure below. Leave a comment <a href=\"https:\/\/blogs.mathworks.com\/loren\/?p=3347#respond\">here<\/a> and let us know what enhancements you would like to see in the next version of MAT files.<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2019\/OptionsGrid.png\" alt=\"\"> <\/p>
<p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br>\r\n      Published with MATLAB&reg; R2019a<br><\/p><\/div><!--\r\nd64bdf67989c4cb1a92d0bfdb207994a ##### SOURCE BEGIN #####\r\n%% Big Data in MAT Files\r\n% \r\n% _Today's guest blogger is Adam Filion, a Senior Product Manager at MathWorks. \r\n% Adam helps manage and prioritize our development efforts in data science and \r\n% big data._\r\n% \r\n%\r\n% MAT files are an easy and common way to store MATLAB variables to disk. They \r\n% support all MATLAB variable types, have good data compression, and can be accessed \r\n% or created from other applications through an <https:\/\/www.mathworks.com\/help\/pdf_doc\/matlab\/matfile_format.pdf \r\n% external API>. MATLAB users sometimes have so much data stored in MAT files \r\n% that they can't load all the data at once. 
In this post, we will explore different \r\n% situations and solutions for analyzing large amounts of data stored in MAT files.\r\n%% Introduction to MAT Files\r\n% *Using MAT files*\r\n%\r\n% MATLAB provides the ability to save variables to MAT files through the <https:\/\/www.mathworks.com\/help\/matlab\/ref\/save.html \r\n% save> command.\r\n\r\na = pi;\r\nb = rand(1,10);\r\nsave mydata.mat a b\r\n%%\r\n% These variables can be returned to the workspace using <https:\/\/www.mathworks.com\/help\/matlab\/ref\/load.html \r\n% load>.\r\n\r\n% return all variables to the workspace\r\nload mydata.mat\r\n% find which variables are contained in a MAT file\r\nvarNames = who(\"-file\",\"mydata.mat\");\r\n% return only the second variable to the workspace\r\nload(\"mydata.mat\",varNames{2})\r\n\r\n%% \r\n% *MAT file versions*\r\n%\r\n% MAT files have evolved over time and <https:\/\/www.mathworks.com\/help\/matlab\/import_export\/mat-file-versions.html  \r\n% several different versions> exist. You can change the version to use when saving \r\n% data by passing an additional flag, such as |\"-v7.3\"|, to the |save| command. \r\n% The biggest differences are summarized in the table below. \r\n% \r\n% <<MATfileversions.png>>\r\n% \r\n% Version 7 is the default and should be used unless you need the additional \r\n% functionality provided in Version 7.3. This is because, as mentioned in the \r\n% documentation, Version 7.3 contains additional header information and may result \r\n% in larger files than Version 7 when storing small amounts of data. Only create \r\n% Version 6 or Version 4 MAT files if you need compatibility with older legacy \r\n% applications.\r\n% \r\n% *When to store big data in MAT files*\r\n%\r\n% Most users analyzing large amounts of MAT file data did not choose the storage \r\n% format themselves; but if you could, when would it make sense to store big data \r\n% in MAT files? 
They are a good choice when the following three conditions apply:\r\n%\r\n% # *The data is originally recorded in MAT files.* This occurs when saving \r\n% variables from the MATLAB workspace, logging data from Simulink simulations, \r\n% or recording data from certain third-party data loggers which generate MAT files \r\n% automatically. If your data does not naturally come in MAT files, it usually \r\n% should be left in its original format.\r\n% # *The data is staying in the MATLAB ecosystem.* MAT files are simple to use \r\n% and are a lossless storage format, meaning that you will never lose any information \r\n% or accuracy when storing MATLAB variables. However, since they are not easily \r\n% accessible from other applications, it is typically better to use another file \r\n% format (e.g. csv, Parquet, etc.) when exchanging data with other applications.\r\n% # *MAT files easily work in your file storage system.* Some file systems impose \r\n% additional requirements on files stored within them. For example, in the Hadoop \r\n% Distributed File System (HDFS) it is difficult to use files that are not splitable, \r\n% which is a feature MAT files do not support. In such situations, you should \r\n% consider if a different file format that supports the file system requirements \r\n% would be a better choice.\r\n%\r\n% *Big data in MAT file situations*\r\n%\r\n% If all your MAT file data can be easily loaded into memory and analyzed at \r\n% the same time, use the |load| command outlined at the beginning. For the rest \r\n% of this post, we will explore the four general situations when MAT file data \r\n% gets too large to work with at once.\r\n%\r\n% # Large collections of small MAT files\r\n% # Large MAT files with many small variables\r\n% # Large MAT files with large variables\r\n% # MAT files logged from Simulink simulations\r\n%% Large Collections of Small MAT Files\r\n% Often data is recorded from different entities (e.g. 
weather stations, vehicles, \r\n% simulations, etc.) and each entity is stored in a separate file. Even if each \r\n% individual MAT file can easily fit into memory, the total collection can grow \r\n% large enough that we cannot work with all of it at once. When this happens, \r\n% there are two solutions based on the type of analysis we need to do.\r\n% \r\n% *Embarrassingly Parallel Analysis*\r\n%\r\n% If the work we are doing is embarrassingly parallel, meaning that each file \r\n% can be analyzed in isolation, then we can loop through the files one at a time. \r\n% If <https:\/\/www.mathworks.com\/products\/parallel-computing.html \r\n% Parallel Computing Toolbox> is available, we can accelerate the process by using \r\n% a <https:\/\/www.mathworks.com\/help\/parallel-computing\/parfor.html \r\n% |parfor|> loop instead of a |for| loop.\r\n\r\n% find .mat files in current directory\r\nd = dir(\"*.mat\");\r\n% loop through with a for loop, or use parfor\r\nparfor ii = 1:length(d)\r\n    % load the next .mat file\r\n    data = load(d(ii).name);\r\n    % perform your analysis on each individual file\r\n    % (doAnalysis is a placeholder for your own function)\r\n    doAnalysis(data)\r\nend\r\n%% \r\n% *Inherently Sequential Analysis*\r\n%\r\n% When our files cannot be analyzed in isolation, we need to change our approach. \r\n% The <https:\/\/www.mathworks.com\/help\/matlab\/ref\/matlab.io.datastore.filedatastore.html \r\n% |fileDatastore|> gives access to large collections of files by using a custom \r\n% file reader function. For example, if your analysis only needs the variable \r\n% |\"b\"| from your MAT files, you can use a reader function such as:\r\n%%\r\n% \r\n%   function data = myReader(fileName,varName)\r\n%     matData = load(fileName,varName);\r\n%     data = matData.(varName);\r\n%   end\r\n%\r\n%% \r\n% If your data is stored in more complicated or irregular formats, you can use \r\n% arbitrary code in your reader function to return the values in the format \r\n% you need. 
Once we define our reader function, we can create a |fileDatastore|, \r\n% which will read one file at a time using our reader function.\r\n\r\nfds = fileDatastore(\"*.mat\", \"ReadFcn\", @(fn) myReader(fn,\"b\"), \"UniformRead\", true);\r\n%% \r\n% Note that by default the |fileDatastore| will return each file's contents \r\n% as an element in a cell array. The |UniformRead| option will instead keep the \r\n% data's original format and vertically concatenate the data from different files. \r\n% \r\n% After creating the datastore we can read a portion of the dataset with the \r\n% <https:\/\/www.mathworks.com\/help\/matlab\/ref\/matlab.io.datastore.read.html \r\n% |read|> method or analyze the full out-of-memory dataset with <https:\/\/www.mathworks.com\/help\/matlab\/tall-arrays.html \r\n% tall arrays>.\r\n\r\n% read once from the datastore\r\nt = read(fds);\r\n% create a tall array\r\ntall_t = tall(fds);\r\n%% \r\n% Unlike the |load| command, |fileDatastore| also supports <https:\/\/www.mathworks.com\/help\/matlab\/import_export\/work-with-remote-data.html \r\n% remote storage systems> including Amazon S3, Azure Blob Storage and the Hadoop \r\n% Distributed File System. For example, to use Amazon S3 make the following modifications:\r\n%%\r\n% \r\n%   setenv(\"AWS_ACCESS_KEY_ID\", \"YOUR_AWS_ACCESS_KEY\")\r\n%   setenv(\"AWS_SECRET_ACCESS_KEY\", \"YOUR_AWS_SECRET_ACCESS_KEY\")\r\n%   fds = fileDatastore(\"s3:\/\/bucketname\/dataset\/*.mat\", \"ReadFcn\", @(fn) myReader(fn,\"b\"), \"UniformRead\", true);\r\n%\r\n%% \r\n% However, note that the |fileDatastore| automatically makes a local copy of \r\n% each file it reads, which may result in downloading the entire dataset when \r\n% it is stored remotely. If this is problematic, consider rewriting your data \r\n% to another file format so you can use a datastore that does not require local \r\n% copies. 
This is discussed in more detail later in this post.\r\n% \r\n%% Large MAT Files with Many Small Variables\r\n% MAT files can individually be too large to load either because they have many \r\n% small variables or have large variables. Files with many small variables arise \r\n% when logging many signals from simulations or data loggers, or by adding more \r\n% variables to a MAT file over time using the |save| command's |-append| option.\r\n\r\nc = eye(10);\r\n% add another variable to the file\r\nsave mydata.mat c -append\r\n%% \r\n% *Use a Subset of Variables*\r\n%\r\n% When working with MAT files containing too many small variables to load all \r\n% at once, one approach is to only load certain variables needed for your analysis \r\n% as we did in the prior section. If this reduces the data needed from each individual \r\n% file such that each call to the |read| method fits into memory, then we can \r\n% use the example from the previous section to avoid running out of memory.\r\n% \r\n% *Use a Portion of All Variables*\r\n%\r\n% However, if even after selecting only the necessary variables the data from \r\n% individual files is still too large to fit into memory then we must try a different \r\n% approach. In the prior section we used |fileDatastore| to read entire MAT files \r\n% with a custom reader function. The |fileDatastore| also supports reading only \r\n% parts of a file at a time. By adding additional logic into our reader function \r\n% to manage the current state of reading through a large file, we can grab a portion \r\n% of each variable.\r\n% \r\n% Let's assume that in our collection of MAT files each file contains the same \r\n% number of variables with the same names. Let's also assume all variables within \r\n% a particular file are column vectors of the same length. 
We can then use |matfile| \r\n% objects (described in more detail below) within the following reader function \r\n% to partially read only a certain number of rows from each variable and concatenate \r\n% them into a table.\r\n%%\r\n% \r\n%   function [data,readCounter,done] = partialReadFcn(filename,readCounter)\r\n%       % create MAT file object\r\n%       m = matfile(filename);\r\n%       % initialize readCounter\r\n%       if isempty(readCounter)\r\n%           readCounter = 0;\r\n%       end\r\n%       % default read size in number of rows\r\n%       readSize = 3e4;\r\n%       % number of rows in the column vectors (all variables share the length of \"x\")\r\n%       arrayLength = size(m,\"x\",1);\r\n%       % first row of this read, computed before readSize is adjusted\r\n%       startRow = readCounter*readSize + 1;\r\n%       if (arrayLength - readCounter*readSize) > readSize\r\n%           % if there's more left to read than readSize, we're not done...\r\n%           done = false;\r\n%       else\r\n%           % ...otherwise we are\r\n%           done = true;\r\n%           % shrink readSize to the rows remaining in the file\r\n%           readSize = arrayLength - readCounter*readSize;\r\n%       end\r\n%       readRange = startRow : (startRow + readSize - 1);\r\n%       readCounter = readCounter+1;\r\n%       % read portion of all variables\r\n%       varNames = who(\"-file\",filename);\r\n%       data = nan(readSize,length(varNames));\r\n%       for ii = 1:length(varNames)\r\n%           data(:,ii) = m.(varNames{ii})(readRange,1);\r\n%       end\r\n%       data = array2table(data,\"VariableNames\",varNames);\r\n%   end\r\n%\r\n\r\nx = rand(1e6,1);\r\ny = rand(1e6,1);\r\nz = rand(1e6,1);\r\nsave smallVars1.mat x y z -v7.3 \r\nsave smallVars2.mat x y z -v7.3\r\nfds_partial = fileDatastore(\"smallVars*.mat\", \"ReadFcn\", @partialReadFcn, \"UniformRead\", true, \"ReadMode\", \"partialfile\");\r\n% reads number of rows specified in reader function\r\nt_partial = read(fds_partial); \r\nsize(t_partial)\r\n%% \r\n% The partial reading of |fileDatastore| lets you parse arbitrarily large files \r\n% with an 
arbitrary reader function. If you want even more control over how a \r\n% datastore processes a data source, consider using <https:\/\/www.mathworks.com\/help\/matlab\/import_export\/develop-custom-datastore.html  \r\n% custom datastores>. By using custom datastores you get access to the low-level \r\n% tools that MathWorks developers use when developing datastores for new data \r\n% sources. While they can be challenging to write from scratch, custom datastores \r\n% give you complete control over how the datastore behaves.\r\n% \r\n%% Large MAT Files with Large Variables\r\n% *MATFILE Objects*\r\n%\r\n% Up through Version 7 MAT files, individual variables are limited to 2GB in \r\n% size. Version 7.3 removes this restriction, allowing variables to be arbitrarily \r\n% large. MATLAB's <https:\/\/www.mathworks.com\/help\/matlab\/ref\/matfile.html |matfile|> \r\n% objects enable users to access and change variables stored in Version 7.3 MAT \r\n% files without loading the entire variable into memory.\r\n\r\nsave mydata_7_3.mat a b c -v7.3\r\nm = matfile(\"mydata_7_3.mat\",\"Writable\",true);\r\n% read only first three rows of variable \"c\"\r\nm.c(1:3,:) \r\n%%\r\n\r\n% write values from \"b\" to \"c\"\r\nm.c(1:2,:) = [m.b(1,:); m.b(1,:)]; \r\nm.c(1:3,:)\r\n%% \r\n% These |matfile| objects can then be used in a loop or combined with a |fileDatastore| \r\n% as in the above example to process individual variables that are arbitrarily \r\n% large.\r\n% \r\n% While |matfile| objects are easy to use, they have several <https:\/\/www.mathworks.com\/help\/matlab\/ref\/matfile.html#bt2ft8s-6 \r\n% limitations> that restrict the situations where they can be used. 
The biggest \r\n% restrictions include:\r\n%% \r\n% * Partial reading\/writing of variables is supported only with Version 7.3 \r\n% MAT files\r\n% * Partial reading\/writing is not supported for some heterogeneous datatypes \r\n% such as tables, so those datatypes must be read or written as whole variables\r\n%% \r\n% *Rewriting MAT Files to Another Format*\r\n%\r\n% If |matfile| objects don't meet your needs, you could consider the custom \r\n% datastores mentioned above or refactor the MAT files into another format. Rewriting \r\n% your data from MAT files to another file format may make sense when you:\r\n%% \r\n% * Need functionality not available with MAT files (e.g. splittability)\r\n% * Need to interchange data with other applications\r\n% * Need to work with remote datasets, and the local file requirements of |fileDatastore| \r\n% are problematic\r\n%% \r\n% One such format is Parquet. The <https:\/\/www.mathworks.com\/help\/matlab\/parquet-files.html \r\n% Parquet> file format is a columnar data storage format designed for the Hadoop \r\n% ecosystem, though it can be used within any environment. Parquet files support \r\n% splittability and fast I\/O performance, and are a common data interchange format. They \r\n% are typically kept relatively small as they are meant to fit in the Hadoop Distributed \r\n% File System's 128MB block size. \r\n% \r\n% As of R2019a MATLAB has built-in support for reading and writing Parquet files. \r\n% As Parquet is designed for heterogeneous columnar data, writing a Parquet file \r\n% requires a table or timetable variable. 
You can interact with Parquet files from MATLAB using \r\n% <https:\/\/www.mathworks.com\/help\/matlab\/ref\/parquetread.html \r\n% |parquetread|>, <https:\/\/www.mathworks.com\/help\/matlab\/ref\/parquetwrite.html \r\n% |parquetwrite|>, and <https:\/\/www.mathworks.com\/help\/matlab\/ref\/matlab.io.parquet.parquetinfo.html \r\n% |parquetinfo|>.\r\n\r\nparquetwrite(\"parquetData.parquet\",t_partial)\r\n%% \r\n% If you need to do some processing using tall arrays before rewriting the data, \r\n% the tall array results can be written directly to Parquet using the tall array \r\n% <https:\/\/www.mathworks.com\/help\/matlab\/ref\/tall.write.html |write|> \r\n% command.\r\n\r\ntall_partial = tall(fds_partial);\r\nwrite(\"data\/p*.parquet\",tall_partial,\"FileType\",\"parquet\")\r\n%% \r\n% Once you have rewritten your data to Parquet, you can use the full Parquet \r\n% dataset with the <https:\/\/www.mathworks.com\/help\/matlab\/ref\/matlab.io.datastore.parquetdatastore.html \r\n% |parquetDatastore|>. Unlike the |fileDatastore|, the |parquetDatastore| does \r\n% not require making a local copy of each file.\r\n\r\n% forward slashes match the write call above and work on all platforms\r\npds = parquetDatastore(\"data\/*.parquet\");\r\nt_parquet = read(pds);\r\ntall_parquet = tall(pds);\r\n%% \r\n% Switching to another file format does come with its own concerns:\r\n%% \r\n% * Large amounts of data are duplicated, which can consume significant \r\n% time and disk space.\r\n% * Some information may be lost when writing to a new format. For example, \r\n% Parquet files do not preserve the timezone property of datetime values. To maintain \r\n% the timezone information, you must manually save the information to another \r\n% variable in your table or timetable before writing it to the Parquet file.\r\n%% MAT Files Logged from Simulink Simulations\r\n% The situations discussed above can arise from many different data sources. 
\r\n% One special source of large quantities of MAT file data is data logged from \r\n% Simulink simulations. In this post we treat this situation differently as MAT \r\n% files that come from Simulink logging:\r\n%% \r\n% # Store their data in a special, nested data structure within the MAT file\r\n% # Can use a <https:\/\/www.mathworks.com\/help\/simulink\/slref\/matlab.io.datastore.simulationdatastore-class.html \r\n% |simulationDatastore|> specifically designed for working with large amounts \r\n% of MAT files logged from Simulink\r\n%% \r\n% A <https:\/\/www.mathworks.com\/help\/simulink\/slref\/matlab.io.datastore.simulationdatastore-class.html \r\n% |simulationDatastore|> enables a Simulink model to interact with big data. \r\n% You can load big data as simulation input and log big output data from a simulation. \r\n% The documentation page for <https:\/\/www.mathworks.com\/help\/simulink\/ug\/work-with-big-data-for-simulations.html \r\n% Working with Big Data for Simulations> contains details of creating and using \r\n% data logged from Simulink. 
The general idea is to start with generating the \r\n% data from Simulink:\r\n\r\n% load Simulink model\r\nload_system(\"sldemo_fuelsys\")\r\n% turn on data logging\r\nset_param(\"sldemo_fuelsys\",\"LoggingToFile\",\"on\")\r\n% run model and log data\r\nsim(\"sldemo_fuelsys\")\r\n% close model without saving\r\nclose_system(\"sldemo_fuelsys\",0)\r\n%% \r\n% Once your data is logged, use <https:\/\/www.mathworks.com\/help\/simulink\/slref\/simulink.simulationdata.datasetref-class.html \r\n% |DatasetRef|> to access the |simulationDatastores.|\r\n\r\nDSRef = Simulink.SimulationData.DatasetRef(\"out.mat\",\"sldemo_fuelsys_output\");\r\n% return a simulationDatastore for fuel signal\r\nds = DSRef.getAsDatastore(\"fuel\").Values; \r\n% take a single read of the fuel signal from the MAT file\r\nt_sim = read(ds); \r\n% treat all the fuel data as a tall variable\r\ntall_t_sim = tall(ds); \r\n%% Summary\r\n% In this post we explored several different situations and solutions when\r\n% dealing with big data in MAT files. Many of the solutions we explored can\r\n% be mixed and combined together. General recommendations for which tool to\r\n% start with are summarized in the figure below. Leave a comment\r\n% <https:\/\/blogs.mathworks.com\/loren\/?p=3347#respond here> and let us know\r\n% what enhancements you would like to see in the next version of MAT files.\r\n% \r\n% <<OptionsGrid.png>>\r\n##### SOURCE END ##### d64bdf67989c4cb1a92d0bfdb207994a\r\n-->","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img decoding=\"async\"  class=\"img-responsive\" src=\"https:\/\/blogs.mathworks.com\/images\/loren\/2019\/OptionsGrid.png\" onError=\"this.style.display ='none';\" \/><\/div><!--introduction--><p><i>Today's guest blogger is Adam Filion, a Senior Product Manager at MathWorks. Adam helps manage and prioritize our development efforts in data science and big data.<\/i>... 
<a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/loren\/2019\/05\/29\/big-data-in-mat-files\/\">read more >><\/a><\/p>","protected":false},"author":39,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[63],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/3347"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/users\/39"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/comments?post=3347"}],"version-history":[{"count":2,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/3347\/revisions"}],"predecessor-version":[{"id":3351,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/3347\/revisions\/3351"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/media?parent=3347"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/categories?post=3347"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/tags?post=3347"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}