Mike on the MATLAB Desktop

June 28th, 2010

Using XML in MATLAB

Much of the data on the Internet is stored in some flavor of XML. Fortunately for us, MATLAB has some built-in functions for handling XML files. This will be the first in a series of non-consecutive posts about working with XML in MATLAB. Today I'm going to describe the functions for reading, writing, and transforming XML files.

MATLAB has three functions that deal specifically with XML files. The first is xmlread. This function takes either a URL or a filename and creates a Java XML object in the workspace:

xmlfile = fullfile(matlabroot, 'toolbox/matlab/general/info.xml');
xDoc = xmlread(xmlfile)
xDoc =

[#document: null]

 

Don't worry that the return value says: "[#document: null]". The xmlread function returns a Java object that represents the file's Document Object Model, or DOM. The "null" is simply what the org.apache.xerces.dom.DeferredDocumentImpl's implementation of toString() dumps to the MATLAB Command Window. To learn more about interacting with Java objects in MATLAB, see my previous article.
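As a quick check that the object is not actually empty despite the "null", here is a minimal sketch that pokes at the DOM directly. It writes its own tiny XML file so it runs anywhere; the file contents and variable names are made up for illustration, and the methods called on the object are the standard org.w3c.dom ones.

```matlab
% Sketch: write a tiny XML file, parse it, and inspect the resulting DOM.
% The file contents and variable names are hypothetical.
xmlfile = [tempname '.xml'];
fid = fopen(xmlfile, 'w');
fprintf(fid, '%s\n', '<productinfo><name>MATLAB</name><type>matlab</type></productinfo>');
fclose(fid);

xDoc = xmlread(xmlfile);

% getDocumentElement returns the root element of the document
root = xDoc.getDocumentElement();
rootName = char(root.getNodeName());

% getElementsByTagName returns a NodeList; items are indexed from 0
names = xDoc.getElementsByTagName('name');
firstName = char(names.item(0).getFirstChild().getData());

fprintf('root element: %s, first <name>: %s\n', rootName, firstName);
delete(xmlfile);
```

The char() calls convert the returned java.lang.String objects into ordinary MATLAB strings.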

Just to make sure this object has all our XML text, let's use the next MATLAB XML function: xmlwrite. Called with only the DOM object, xmlwrite returns the serialized XML as a string, which MATLAB displays in the Command Window when the result is not assigned:

xmlwrite(xDoc)
ans =

<?xml version="1.0" encoding="utf-8"?>
<productinfo xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://www.mathworks.com/namespace/info/v1/info.xsd">
<?xml-stylesheet type="text/xsl" href="http://www.mathworks.com/namespace/info/v1/info.xsl"?>

   <matlabrelease>14</matlabrelease>
   <name>MATLAB</name>
   <type>matlab</type>
   <icon>ApplicationIcon.MATLAB</icon>
   <help_location>$docroot/techdoc</help_location>
   <dialogpref_registrar>
      <source>com.mathworks.mde.editor.EditorOptions</source>
   </dialogpref_registrar>
      <!-- cut for brevity..... -->
   </list>
</productinfo>

 

Of course, you can also use xmlwrite to save the XML document to disk, by calling xmlwrite with the following signature:

xmlwrite(outfile,xDoc)
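A small round trip shows both forms of xmlwrite side by side. This is only a sketch: the temporary file names and the sample XML are assumptions made for the example, not part of the info.xml file used above.

```matlab
% Sketch: save a DOM to disk with xmlwrite, then read it back.
srcfile = [tempname '.xml'];
fid = fopen(srcfile, 'w');
fprintf(fid, '%s\n', '<productinfo><name>MATLAB</name></productinfo>');
fclose(fid);
xDoc = xmlread(srcfile);

outfile = [tempname '.xml'];     % hypothetical output location
xmlwrite(outfile, xDoc);         % two inputs: write the DOM to a file

xmlText = xmlwrite(xDoc);        % one input: return the XML as text

% Reading the saved file back should give the same root element
xDoc2 = xmlread(outfile);
sameRoot = strcmp(char(xDoc2.getDocumentElement().getNodeName()), ...
                  char(xDoc.getDocumentElement().getNodeName()));

delete(srcfile);
delete(outfile);
```

Note the argument order: the file name comes first and the DOM object second.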

One gotcha when working with xmlread is that because the created DOM object is a Java object, it is stored in Java memory. The amount of memory set aside for Java is system dependent, but is generally between 64 and 256 MB. This means that if you read in a 200 MB XML file, you're going to run out of memory, no matter how much free main memory MATLAB says is available. In that case, you need to increase the amount of memory available to Java, using the new preference panel.
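If you want a rough idea of whether a file will fit before you call xmlread, you can ask the JVM for its heap limits and compare them with the file size on disk. The sketch below does that with java.lang.Runtime. The factor of 10 is an arbitrary safety margin I picked for illustration (a DOM usually takes more memory than the text it came from, but by how much depends on the document), and the sample file is created just for the example.

```matlab
% Sketch: compare an XML file's size on disk with the Java heap limits.
xmlfile = [tempname '.xml'];
fid = fopen(xmlfile, 'w');
fprintf(fid, '%s\n', '<productinfo><name>MATLAB</name></productinfo>');
fclose(fid);

rt = java.lang.Runtime.getRuntime();
maxHeapMB  = rt.maxMemory()  / 2^20;   % upper limit the Java heap may grow to
freeHeapMB = rt.freeMemory() / 2^20;   % free space in the heap allocated so far

info = dir(xmlfile);
fileMB = info.bytes / 2^20;

safetyFactor = 10;                     % arbitrary margin, not a measured value
if fileMB * safetyFactor > maxHeapMB
    warning('XML file may be too large for the Java heap (max %.0f MB).', maxHeapMB);
else
    xDoc = xmlread(xmlfile);
end

delete(xmlfile);
```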

The final XML function provided by MATLAB deals with transforming XML documents with XSL stylesheets. The xslt function takes a DOM object, filename, or URL specifying an XML document; a filename or URL specifying an XSL stylesheet; and the output destination, and then performs the transform. Working with stylesheets is pretty complicated, so I'm not going to delve into them here.
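Just to show the shape of the call, here is a minimal sketch with a deliberately trivial stylesheet that pulls the text of one element out of a document. Both files are invented for the example. It assumes the '-tostring' destination, which asks xslt to return the output as text instead of writing a file; check the xslt help in your release if that does not work for you.

```matlab
% Sketch: apply a small XSL stylesheet to an XML file and capture the result.
xmlfile = [tempname '.xml'];
xslfile = [tempname '.xsl'];

fid = fopen(xmlfile, 'w');
fprintf(fid, '%s\n', '<productinfo><name>MATLAB</name><type>matlab</type></productinfo>');
fclose(fid);

% A stylesheet that emits only the text of the <name> element
xslLines = { ...
    '<?xml version="1.0"?>', ...
    '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">', ...
    '  <xsl:output method="text"/>', ...
    '  <xsl:template match="/productinfo">', ...
    '    <xsl:value-of select="name"/>', ...
    '  </xsl:template>', ...
    '</xsl:stylesheet>'};
fid = fopen(xslfile, 'w');
fprintf(fid, '%s\n', xslLines{:});
fclose(fid);

% Third argument is the destination; '-tostring' returns the output as text
result = xslt(xmlfile, xslfile, '-tostring');
disp(strtrim(result))

delete(xmlfile);
delete(xslfile);
```

To write the transformed document to disk instead, pass an output filename as the third argument.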

Next time, I'll talk about the DOM object itself and how to add, remove, and query nodes.

22 Responses to “Using XML in MATLAB”

  1. StephenLL replied on :

    One of the first tasks I wanted to do with the xmlread function was to read in the XML from a string. It wasn’t obvious to me. After some research I put together the following code. I’m not sure if it is the best or the ideal but it works.

    import org.xml.sax.InputSource
    import javax.xml.parsers.*
    import java.io.*
    
    % str is a character string of XML you want to read in.
    
    iS = InputSource();
    iS.setCharacterStream( StringReader(str) );
    
    p = xmlread(iS);
    
  2. Yair Altman replied on :

    A useful reference is the official javadoc for the XML functionality: https://jaxp-sources.dev.java.net/nonav/docs/api/
    There are also numerous online resources explaining how to use the XML functionality and its many alternative libraries.

    Java-savvy programmers may be interested in the following related article: http://UndocumentedMatlab.com/blog/undocumented-xml-functionality/
    (note that it relies on unsupported functionality which may change without warning across Matlab releases).

  3. Mike replied on :

    @StephenLL,

    That is one good way to do it, thanks for sharing.

    @Yair,
    Thanks for the reference and for exploring the XML functions in your blog. I’m going to have a follow-up post describing some of the basics of working with XML node objects, but of course, I won’t be able to detail everything.

  4. Simon replied on :

    Hello Mike,

    When is the next blog about the DOM object itself and how to add, remove, and query nodes?

    Thank you in advance,

    Simon

  5. Mike replied on :

    @Simon,

    I don’t have a specific time set yet, but probably in the next two to three weeks.

  6. Arindam replied on :

    Hi Mike,

    Does the Java object also take care of validation of the xml file as per the xsd specs?

    Thanks

    -Arindam

  7. Richie replied on :

    Hey Mike,

    when will be the next update with adding and querying nodes?

    I put something together myself by reading many different threads online, but I haven’t found anything yet describing what the commands do. So it’s really hard to change things when you don’t know what it -really- does.

    Greetz,
    Richie

  8. Mike replied on :

    @Richie,

    I posted the next part this morning on creating nodes.

  9. vishnusekhar replied on :

    Hello Mike,

    This is a very useful post.

    How can we pass the XML object as a parameter to a stored procedure?

    Thanks,
    Vishnu

  10. Mike replied on :

    @Vishnu,

    I don’t understand the question. You can create a reference to the Java XML object with xmlread, as described above. You can then use the resulting variable like any other MATLAB variable.

  11. vishnusekhar replied on :

    Mike,
    I was trying to call a procedure from MATLAB and pass the XML DOM object as a parameter.
    I had declared the input argument as xml type in the stored procedure.
    Here I am copying the code for your reference:
    MATLAB code
    xmlContent=xmlread('tmpfile_11_04_2010.xml')
    x=xmlwrite(xmlContent)
    sqlOutput = fetch(conn,strcat('{call sp_test (', {x},')')

    also I tried
    K>> sqlOutput = runstoredprocedure(conn,'stp_RM_INS_RiskMatlabOutput', {x})
    ??? Error using ==> database.runstoredprocedure at 86
    Procedure may return resultset. Use EXEC and FETCH

    sql code

    Alter proc sp_test(@p_xmlObject varchar(3000))As
    begin
    select 'Sucess'
    end

    I am getting an error from MATLAB. Could you please let me know the syntax for passing an XML object to SQL?

    Thanks in advance.
    Vishnu

  12. vishnusekhar replied on :

    Sorry, here is the corrected code for the stored procedure:

    Alter proc sp_test(@p_xmlObject xml)As
    begin
    select 'Sucess'
    end

  13. Mike replied on :

    @Vishnu,

    I understand now. There may be an escape character issue, either with the single quotes (') that may be in the XML file that aren’t concatenating well, or perhaps with the > and < or & symbols in your XML document. You should contact technical support and they can help you track down the issue and show you how to fix it.

  14. vishnusekhar replied on :

    Hello Mike,

    Exactly. I had a discussion with the support team.

    xmlContent=xmlread(xmlFileName);

    xmlObject=xmlwrite(xmlContent) ;

    conn = database(strDatabaseName,strUserName,strPassword,strDriverName,strcat(strURL,strDatabaseName));

    K>> sqlOutput = runstoredprocedure(conn,'sp_test',{''', xmlObject, ''' },{java.sql.Types.sqlXml})
    ??? No appropriate method, property, or field sqlXml for class java.sql.Types.

    let me know how I can use java.sql.Types.sqlXml ?

    Thanks,
    Vishnu

  15. vishnusekhar replied on :

    We resolved this issue using the code below:

    xmlContent=xmlread(xmlFileName);
    xmlObject=xmlwrite(xmlContent) ;
    conn = database(strDatabaseName,strUserName,strPassword,strDriverName,strcat(strURL,strDatabaseName));
    sqlOutput=fetch(conn, strcat('{call sp_test(''', xmlObject, ''')}'));

    Removed the single quote from the xml. This is working fine.

    Thanks
    Vishnu

  16. Mike replied on :

    @Vishnu,

    Ah, the single quote once again is the culprit! Thanks for sharing your solution.

  17. Beverlyn replied on :

    hey, can we write in encoding ISO-8859-1 instead of encoding UTF-8?

  18. Mike replied on :

    @Beverlyn,

    By design, XMLWRITE uses our default encoding which is UTF-8 when writing the file, regardless of what you might set in the DOM. To work around this issue, you have to modify your XMLWRITE function in the MATLAB toolbox. I recommend saving a backup of it first.

    edit('xmlwrite')
    

    Then modify the line:

    javaMethod('serializeXML',...
        'com.mathworks.xml.XMLUtils',...
        source,result);
    

    to:

    javaMethod('serializeXML',...
        'com.mathworks.xml.XMLUtils',...
        source,result,'ISO-8859-1');
    

    If you want to go further, you can modify the xmlwrite.m file to take the encoding as a parameter. I’ve created an enhancement request to have that added to MATLAB.

  19. Charlie Hogg replied on :

    For reference, links to the rest of this series are here
    http://blogs.mathworks.com/desktop/2010/11/01/xml-and-matlab-navigating-a-tree/

    Mike, This is very helpful. Are there others in the series?

    thanks,
    Charlie

  20. Michael Katz replied on :

    @Charlie,

    There is also this one: http://blogs.mathworks.com/desktop/2010/09/13/simple-xml-node-creation/.

    Are there more topics you’d like me to cover?

  21. Abby replied on :

    Can you answer the question posted at http://www.mathworks.com/matlabcentral/answers/4981-setxmlstandalone-bug ?

    Basically, I want standalone="yes" to show up in the first line of my XML file, but docNode.setXmlStandalone(1); doesn’t seem to have any effect on what is displayed via xmlwrite.

    This seems similar to the encoding problem above… advice?

  22. Michael Katz replied on :

    @Abby,

    I have answered that question. Hope that helps. I’ll try to cover that in a future post.



MathWorks
Mike works on the MATLAB Desktop team.

These postings are the author's and don't necessarily represent the opinions of The MathWorks.