Much of the data on the Internet is stored in some flavor of XML. Fortunately for us, MATLAB has some built-in functions for handling XML files. This will be the first in a series of non-consecutive posts about working with XML in MATLAB. Today I'm going to describe the functions for reading, writing, and transforming XML files.
There are three functions in MATLAB to specifically deal with XML files. The first is xmlread. This function takes either a URL or a filename and creates a Java XML object in the workspace:
xmlfile = fullfile(matlabroot, 'toolbox/matlab/general/info.xml');
xDoc = xmlread(xmlfile)
xDoc =
[#document: null]
Don't worry that the return value says: "[#document: null]". The xmlread function returns a Java object that represents the file's Document Object Model, or DOM. The "null" is simply what the org.apache.xerces.dom.DeferredDocumentImpl's implementation of toString() dumps to the MATLAB Command Window. To learn more about interacting with Java objects in MATLAB, see my previous article.
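If you want to convince yourself that the odd-looking display really is a live Java object, a quick check along these lines should work. This is only a sketch: the temp file name and its tiny contents are made up for illustration, and the class name printed may differ between MATLAB releases.

```matlab
% Write a tiny XML file to the temp folder (hypothetical name and contents).
xmlfile = fullfile(tempdir, 'xmlread_demo.xml');
fid = fopen(xmlfile, 'w');
fprintf(fid, '%s\n', '<productinfo><name>MATLAB</name></productinfo>');
fclose(fid);

% Parse it; the result is a Java DOM object, not a MATLAB struct.
xDoc = xmlread(xmlfile);
disp(isjava(xDoc))   % true when xDoc is a Java object
disp(class(xDoc))    % the Java class behind the "[#document: null]" display

% Ask the DOM for the name of its root element and convert it to a char array.
rootName = char(xDoc.getDocumentElement.getNodeName);
disp(rootName)
```

The char() call converts the java.lang.String returned by getNodeName into an ordinary MATLAB character array.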
Just to make sure this object has all our XML text, let's use the next MATLAB XML function: xmlwrite. Without any additional arguments, xmlwrite will display the contents of the DOM in the Command Window:
xmlwrite(xDoc)
ans =
<?xml version="1.0" encoding="utf-8"?>
<productinfo xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="https://www.mathworks.com/namespace/info/v1/info.xsd">
   <?xml-stylesheet type="text/xsl" href="https://www.mathworks.com/namespace/info/v1/info.xsl"?>
   <matlabrelease>14</matlabrelease>
   <name>MATLAB</name>
   <type>matlab</type>
   <icon>ApplicationIcon.MATLAB</icon>
   <help_location>$docroot/techdoc</help_location>
   <dialogpref_registrar>
      <source>com.mathworks.mde.editor.EditorOptions</source>
   </dialogpref_registrar>
   <!-- cut for brevity..... -->
   </list>
</productinfo>
Of course, you can also use xmlwrite to save the XML document to disk by passing the destination file name as the first argument: xmlwrite(filename, xDoc).
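Here is a minimal sketch of saving a DOM to disk and reading it back. To keep it self-contained it builds an empty document with com.mathworks.xml.XMLUtils.createDocument rather than reusing xDoc, and the output file name is a made-up one in the temp folder; treat both as assumptions for illustration.

```matlab
% Create an in-memory DOM with a single root element
% (createDocument is used here only so the example runs on its own).
docNode = com.mathworks.xml.XMLUtils.createDocument('productinfo');

% Save the DOM to disk: file name first, DOM object second.
outFile = fullfile(tempdir, 'xmlwrite_demo.xml');   % hypothetical file name
xmlwrite(outFile, docNode);

% Show what was written, then read it back to confirm the round trip.
type(outFile)
reread = xmlread(outFile);
rootName = char(reread.getDocumentElement.getNodeName);
disp(rootName)
```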
One gotcha when working with xmlread is that because the created DOM object is a Java object, it is stored in Java memory. The amount of memory set aside for Java is system-dependent, but is generally between 64 and 256 MB. This means that if you read in a 200 MB XML file, you're going to run out of memory, no matter how much free main memory MATLAB says is available. In that case, you just need to increase the amount of memory available to Java, using the new preference panel.
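Before loading a large file, it can help to see how much Java heap you actually have. The sketch below asks the JVM directly through java.lang.Runtime; it assumes MATLAB was started with Java enabled, and the variable names are just for illustration.

```matlab
% Query the JVM that MATLAB is running for its heap limits.
rt = java.lang.Runtime.getRuntime();

% maxMemory is the ceiling the heap may grow to; totalMemory minus
% freeMemory is roughly what is in use right now. Convert bytes to MB.
maxHeapMB  = double(rt.maxMemory()) / 2^20;
usedHeapMB = double(rt.totalMemory() - rt.freeMemory()) / 2^20;

fprintf('Java heap: %.0f MB used of %.0f MB maximum\n', usedHeapMB, maxHeapMB);
```

If the XML file you plan to read is anywhere near the reported maximum, expect xmlread to run out of Java memory, since the DOM needs room for the whole document.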
The final XML function provided by MATLAB deals with transforming XML documents with XSL stylesheets. The xslt function takes a DOM object, filename, or URL specifying an XML document, a filename or URL specifying an XSL stylesheet, and the output destination, and performs the transform. Working with stylesheets is pretty complicated, so I'm not going to delve into them here.
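Just to show the shape of the call, here is a small sketch of xslt in action. The two demo files and the toy stylesheet are invented for this example, and it passes '-tostring' as the destination so the result comes back as text instead of being written to a file.

```matlab
% Hypothetical demo files in the temp folder; none of these names come from the post.
xmlFile = fullfile(tempdir, 'xslt_demo.xml');
xslFile = fullfile(tempdir, 'xslt_demo.xsl');

% A two-item XML document to transform.
fid = fopen(xmlFile, 'w');
fprintf(fid, '%s\n', ...
    '<?xml version="1.0" encoding="utf-8"?>', ...
    '<list><item>alpha</item><item>beta</item></list>');
fclose(fid);

% A toy stylesheet that emits each item's text followed by a semicolon.
fid = fopen(xslFile, 'w');
fprintf(fid, '%s\n', ...
    '<?xml version="1.0" encoding="utf-8"?>', ...
    '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">', ...
    '<xsl:output method="text"/>', ...
    '<xsl:template match="/list">', ...
    '<xsl:for-each select="item"><xsl:value-of select="."/>;</xsl:for-each>', ...
    '</xsl:template>', ...
    '</xsl:stylesheet>');
fclose(fid);

% Source document, stylesheet, destination ('-tostring' returns the output as text).
result = xslt(xmlFile, xslFile, '-tostring');
disp(result)
```

Passing a file name as the third argument instead of '-tostring' writes the transformed document to that file.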
Next time, I'll talk about the DOM object itself and how to add, remove, and query nodes.
22 Comments
import org.xml.sax.InputSource
import javax.xml.parsers.*
import java.io.*
% str is a character string of XML you want to read in.
iS = InputSource();
iS.setCharacterStream( StringReader(str) );
p = xmlread(iS);
edit('xmlwrite')
Then modify the line:
javaMethod('serializeXML',...
    'com.mathworks.xml.XMLUtils',...
    source,result);
to:
javaMethod('serializeXML',...
    'com.mathworks.xml.XMLUtils',...
    source,result,'ISO-8859-1');
If you wanted to go further, you could modify the xmlwrite.m file to take the encoding as a parameter. I've created an enhancement request to have that added to MATLAB.