Mike on the MATLAB Desktop

September 13th, 2010

Simple XML Node Creation

Last time in my XML series, I showed you how to use xmlread to create an XML Document object in MATLAB. I also promised to follow up on that with more information on how to use the Document object. Today, I want to show you how to create a DOM from scratch and then build up a simple XML document. To do that, I'm going to use the canonical address book example as our sample XML document.

The first step is to create a new document node. To do this, we need to use the Java method com.mathworks.xml.XMLUtils.createDocument() since MATLAB does not have native XML objects. In this example we create the top-level ("root" or "document") element and name it AddressBook. Calling xmlwrite with a single argument displays the XML document in the Command Window.

docNode = com.mathworks.xml.XMLUtils.createDocument('AddressBook');
xmlwrite(docNode)
ans =

<?xml version="1.0" encoding="utf-8"?>
<AddressBook/>
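
As a minimal sketch of the other ways to call xmlwrite: with an output argument it returns the XML as text, and with a file name as the first argument it writes the document to disk. The temporary file name and the variable names here are just placeholders for illustration.

```matlab
% Build the same one-element document as above.
docNode = com.mathworks.xml.XMLUtils.createDocument('AddressBook');

% With an output argument, xmlwrite returns the serialized XML as text.
xmlText = xmlwrite(docNode);

% With a file name as the first argument, xmlwrite writes to disk instead.
% The temporary file name is only an illustration.
xmlFile = [tempname '.xml'];
xmlwrite(xmlFile, docNode);

% Read the file back with xmlread to confirm that it round-trips.
roundTrip = xmlread(xmlFile);
rootName = char(roundTrip.getDocumentElement.getNodeName);
delete(xmlFile);
```

Capturing the text is handy when you want to inspect or log the XML without touching the file system.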

Now that I have the document element, I want to populate it with child nodes. The document object is also the factory for new nodes, so each time I create a new node, I do so from my docNode object. Also note that I can't append a child element node to the document directly. I first have to call getDocumentElement to get the root element node of the document. It's all a little confusing, but I'm sure it made sense to the original designers of Apache's Xerces DOM (which is the implementation we use).

In the next step, I create an Entry element to represent a single person in my address book, and append it as a child to the root node.

entry_node = docNode.createElement('Entry');
docNode.getDocumentElement.appendChild(entry_node);
xmlwrite(docNode)
ans =

<?xml version="1.0" encoding="utf-8"?>
<AddressBook>
   <Entry/>
</AddressBook>
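
The split between the document (the factory) and its root element (the parent of the entries) can be sketched like this. Holding the root element in a variable, here called root, is simply a convenience I've assumed for the example.

```matlab
docNode = com.mathworks.xml.XMLUtils.createDocument('AddressBook');

% The document creates nodes; the root element is where they are attached.
root = docNode.getDocumentElement;

% Create three empty Entry elements and append each one to the root.
for k = 1:3
    root.appendChild(docNode.createElement('Entry'));
end

rootName = char(root.getNodeName);        % name of the root element
numEntries = root.getChildNodes.getLength; % number of child nodes appended

xmlwrite(docNode)
```

Every createElement call goes through docNode, while every appendChild call goes through an element that is already in the tree.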

Now that there is an Entry in the address book, I will create data elements for it. To do this I create new element nodes and append those to my Entry node. I also create text nodes to represent the text data of each of these elements. For example, I create a "Name" node for the person's name, and put "Friendly J. Mathworker" (a colleague) as a text node child of the Name element. Doing this allows me to build up the XML tree structure. In addition to text nodes, we can create attribute nodes (see next section), as well as CDATA, comments, and other types of XML nodes.

% add name, phone number

name_node = docNode.createElement('Name');
name_text = docNode.createTextNode('Friendly J. Mathworker');
name_node.appendChild(name_text);
entry_node.appendChild(name_node);

phone_number_node = docNode.createElement('PhoneNumber');
phone_number_text = docNode.createTextNode('(508) 647-7000');
phone_number_node.appendChild(phone_number_text);
entry_node.appendChild(phone_number_node);

xmlwrite(docNode)
ans =

<?xml version="1.0" encoding="utf-8"?>
<AddressBook>
   <Entry>
      <Name>Friendly J. Mathworker</Name>
      <PhoneNumber>(508) 647-7000</PhoneNumber>
   </Entry>
</AddressBook>
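
When an entry has many fields, the create/append pattern above can be driven from a table of names and values. The following is a sketch under that assumption; it also shows comment and CDATA nodes coming from the same document factory. The Notes element and its contents are made up for the illustration.

```matlab
docNode = com.mathworks.xml.XMLUtils.createDocument('AddressBook');
entry_node = docNode.createElement('Entry');
docNode.getDocumentElement.appendChild(entry_node);

% Element names and their text, one row per child element.
fields = {'Name',        'Friendly J. Mathworker'; ...
          'PhoneNumber', '(508) 647-7000'};
for k = 1:size(fields, 1)
    child = docNode.createElement(fields{k, 1});
    child.appendChild(docNode.createTextNode(fields{k, 2}));
    entry_node.appendChild(child);
end

% Other node types come from the same factory. The Notes element
% and its contents are hypothetical.
entry_node.appendChild(docNode.createComment('added by script'));
notes_node = docNode.createElement('Notes');
notes_node.appendChild(docNode.createCDATASection('a < b & c'));
entry_node.appendChild(notes_node);

xmlwrite(docNode)
```

The CDATA section lets the text contain characters such as < and & without escaping them by hand.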

For the final step, I'm going to add an address to this person's entry. In addition to a simple text node, I've added some attributes using two different methods. First, I use the convenience method setAttribute(name, value) to indicate that this Address is of type "work". In the second case I use the more formal node structure to create a "hasZip" attribute to indicate that I left off my company's zip code from the address. Note that this implementation of xmlwrite alphabetically sorts the attributes when displaying the document, whereas element nodes stay in the order in which they are appended.

address_node = docNode.createElement('Address');
address_node.setTextContent('3 Apple Hill Dr, Natick MA');
% set an attribute directly
address_node.setAttribute('type','work');

entry_node.appendChild(address_node);

% or create the attribute as a node
has_zip_attribute = docNode.createAttribute('hasZip');
has_zip_attribute.setNodeValue('no');
address_node.setAttributeNode(has_zip_attribute);

xmlwrite(docNode)
ans =

<?xml version="1.0" encoding="utf-8"?>
<AddressBook>
   <Entry>
      <Name>Friendly J. Mathworker</Name>
      <PhoneNumber>(508) 647-7000</PhoneNumber>
      <Address hasZip="no" type="work">3 Apple Hill Dr, Natick MA</Address>
   </Entry>
</AddressBook>
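
Once attributes are set, they can be read back, listed, or removed through the same element. This is a minimal sketch using the standard DOM methods getAttribute, hasAttribute, getAttributes, and removeAttribute; the variable names are my own.

```matlab
docNode = com.mathworks.xml.XMLUtils.createDocument('AddressBook');
address_node = docNode.createElement('Address');
docNode.getDocumentElement.appendChild(address_node);
address_node.setAttribute('type', 'work');
address_node.setAttribute('hasZip', 'no');

% Read a single attribute value back as a MATLAB char array.
addrType = char(address_node.getAttribute('type'));
hasType = address_node.hasAttribute('type');

% List every attribute. Note that item() uses a zero-based index.
attrs = address_node.getAttributes;
for k = 0:attrs.getLength-1
    a = attrs.item(k);
    fprintf('%s = %s\n', char(a.getNodeName), char(a.getNodeValue));
end

% Remove an attribute that is no longer needed.
address_node.removeAttribute('hasZip');
hasZipLeft = address_node.hasAttribute('hasZip');
```

Wrapping the returned values in char() keeps the results as ordinary MATLAB text regardless of how the Java strings come back.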

Next time I post about XML, I'll describe how to navigate a DOM to find data of interest.

12 Responses to “Simple XML Node Creation”

  1. Laurie replied on :

    Hi Mike,

    How do you go about navigating a large xml file in MATLAB? I’m really interested in reading your next post. Any idea when it will be?

    In particular, I’m getting lost in the file as I attempt to navigate to an attribute to set it to a new value.

    Yikes!

    Laurie

  2. Mike replied on :

    @Laurie,

    You can do that using tree navigation methods such as docNode.getElementsByTagName and node.getChildNodes. There is also support for XPath expressions. It will be the topic of my next post in the series. I was going to wait until we were done with all the R2010b features, but I’ll try to move it up.

  3. Laurie replied on :

    @Mike

    I used that to navigate to the closest distinct parent node, process, in the code excerpt below. The element that actually contains the attribute is surrounded by 5 other elements with the same name.

    Here’s an excerpt from the code.

    etc...

    I'm trying to get to the second attribute of the parameter element to change it. Since there are several parameter elements earlier in the file--I don't know how to get to the nth instance of this element (nor do I know which instance it is). I used the tag that you mentioned to get to the process element but can't navigate to the parameter element from it.

    Does that make any sense? :)

    I tried using the tree navigation methods getFirstChild and getNextSibling, but I got lost in the nodes (which appear to include whitespace). Is there an easy way to count nodes?

    Thank you for your assistance.

    Looking forward to your blog.

    Laurie

  4. Laurie replied on :

    @Mike

    Rats! The code didn’t get posted.

    process attribute attribute
    pre attribute attribute
    pre attribute attribute
    pre attribute attribute
    Editor attribute attribute
    parameter attribute attribute

    Here it is. I realize the syntax is wrong. This is just to show the general structure.

    Laurie

  5. Laurie replied on :

    @Mike

    pre elements are children of process and Editor and parameter elements are children of the last pre element

    Laurie

  6. Laurie replied on :

    When you mention that there is support for XPath expressions, what do you mean? Is there a toolbox, and can you post a link?

    Thank you.

    Laurie

  7. Mike replied on :

    Hi Laurie,

    Sorry for not responding sooner. To get the n’th child node, you can use node.getChildNodes.item(N). You’ll have to use that item() method to iterate over the group of child nodes if you don’t know the order of the one you want, and use things like getAttribute(attrName) to find which node you want.

    To use XPath, there’s no need for a separate toolbox. Java has it built in and we can take advantage of that. For example:

    import javax.xml.xpath.*
    factory = XPathFactory.newInstance;
    xpath = factory.newXPath;
    expression = xpath.compile('//pre/process'); % where the string is your xpath expression
    nodeList = expression.evaluate(docNode,XPathConstants.NODESET);
    

    It’s not a straightforward example, and I hope to expand on it more in my next post on the topic, but that’s still like a month away.

    Information on Java’s XPath: http://download.oracle.com/javase/6/docs/api/javax/xml/xpath/package-summary.html.
    XPath info: http://www.w3schools.com/xpath/default.asp.
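
    The child-node iteration described in this reply can be sketched as follows. The sample file is hypothetical and only mirrors the process/pre/Editor/parameter structure described in the comments above; the attribute names (id, name, value) are assumptions made for the example.

    ```matlab
    % Write a small, hypothetical XML file to parse.
    xmlLines = {'<process>', ...
                '  <pre id="a"/>', ...
                '  <pre id="b"/>', ...
                '  <pre id="c">', ...
                '    <Editor name="e"/>', ...
                '    <parameter name="p" value="old"/>', ...
                '  </pre>', ...
                '</process>'};
    xmlFile = [tempname '.xml'];
    fid = fopen(xmlFile, 'w');
    fprintf(fid, '%s\n', xmlLines{:});
    fclose(fid);

    docNode = xmlread(xmlFile);
    delete(xmlFile);
    ELEMENT_NODE = 1;   % value of org.w3c.dom.Node.ELEMENT_NODE

    % Count only the element children of the root, skipping the
    % whitespace text nodes that sit between them.
    kids = docNode.getDocumentElement.getChildNodes;
    nElements = 0;
    for k = 0:kids.getLength-1          % item() is zero-based
        if kids.item(k).getNodeType == ELEMENT_NODE
            nElements = nElements + 1;
        end
    end

    % Find the parameter element by attribute value and change it.
    params = docNode.getElementsByTagName('parameter');
    for k = 0:params.getLength-1
        p = params.item(k);
        if strcmp(char(p.getAttribute('name')), 'p')
            p.setAttribute('value', 'new');
        end
    end
    newValue = char(params.item(0).getAttribute('value'));
    ```

    Because getElementsByTagName searches all descendants, there is no need to walk down through each pre element by hand, and checking the node type avoids getting lost in whitespace nodes.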

  8. Thurston Herricks replied on :

    Hello Mike,

    This is really handy. How do you change the encoding to UTF-16 instead of UTF-8, though? I see there is the setEncoding command, but when I use it the header of the file still displays UTF-8, although the variables in the MATLAB workspace are listed as encoded as UTF-16.

  9. Mike replied on :

    @Thurston,

    I just answered this question on another post, check it out there. http://blogs.mathworks.com/desktop/2010/06/28/using-xml-in-matlab/#comment-7743

  10. fabien replied on :

    Mike says:
    “Note that this implementation of xmlwrite alphabetically sorts the attributes when displaying the document, whereas element nodes stay in the order in which they are appended.”

    I use xmlwrite(filename, DOMnode), so in my XML document the attributes are sorted alphabetically.
    I would like them to stay in the order I chose.
    How can I do that?

    Thanks

  11. Michael Katz replied on :

    @fabien,
    The implementation we use does not support this, so there is no way that I know of.

  12. Oleg replied on :

    How can I assign a stylesheet to the xml document like:


MathWorks
Mike works on the MATLAB Desktop team.

These postings are the author's and don't necessarily represent the opinions of The MathWorks.