Last time in my XML series, I showed you how to use xmlread to create an XML Document object in MATLAB. I also promised to follow up on that with more information on how to use the Document object. Today, I want to show you how to create a DOM from scratch and then build up a simple XML document. To do that, I'm going to use the canonical address book example as our sample XML document.
The first step is to create a new document node. To do this, we need to use the Java method com.mathworks.xml.XMLUtils.createDocument() since MATLAB does not have native XML objects. In this example we create the top-level ("root" or "document") node and name it AddressBook. Using the one-argument version of xmlwrite displays the XML document in the Command Window.
docNode = com.mathworks.xml.XMLUtils.createDocument('AddressBook');
xmlwrite(docNode)

ans =
<?xml version="1.0" encoding="utf-8"?>
<AddressBook/>
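The one-argument form only displays the XML. As a minimal sketch of the two-argument form, xmlwrite(filename, docNode), the following writes the same document to a file and reads it back with xmlread. The temporary file name is only an illustrative choice, not part of the original example.

```matlab
% Create the same one-element document as above.
docNode = com.mathworks.xml.XMLUtils.createDocument('AddressBook');

% Write it to a file (a temporary file here, purely for illustration).
filename = [tempname '.xml'];
xmlwrite(filename, docNode);

% Read the file back and look up the name of the root element.
% getNodeName returns a java.lang.String, so convert it with char().
readBack = xmlread(filename);
rootName = char(readBack.getDocumentElement.getNodeName);

delete(filename);   % clean up the temporary file
```

After this runs, rootName should hold the name passed to createDocument.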
Now that I have the document element, I want to populate it with child nodes. The document object is also the factory for new nodes. Therefore each time I create a new node, I do so from my docNode object. Also note that I can't append a child element node to the document directly; I first have to call getDocumentElement to get the root element node of the document. It's all a little confusing, but I'm sure it made sense to the original designers of Apache's Xerces DOM (which is the implementation we use).
In the next step, I create an Entry element to represent a single person in my address book, and append it as a child to the root node.
entry_node = docNode.createElement('Entry');
docNode.getDocumentElement.appendChild(entry_node);
xmlwrite(docNode)

ans =
<?xml version="1.0" encoding="utf-8"?>
<AddressBook>
   <Entry/>
</AddressBook>
Now that there is an Entry in the address book, I will create data elements for it. To do this I create new element nodes and append those to my Entry node. I also create text nodes to represent the text data of each of these elements. For example, I create a "Name" node for the person's name, and put "Friendly J. Mathworker" (a colleague) as a text node child of the Name element. Doing this allows me to build up the XML tree structure. In addition to text nodes, we can create attribute nodes (see next section), as well as CDATA, comments, and other types of XML nodes.
% add name, phone number
name_node = docNode.createElement('Name');
name_text = docNode.createTextNode('Friendly J. Mathworker');
name_node.appendChild(name_text);
entry_node.appendChild(name_node);

phone_number_node = docNode.createElement('PhoneNumber');
phone_number_text = docNode.createTextNode('(508) 647-7000');
phone_number_node.appendChild(phone_number_text);
entry_node.appendChild(phone_number_node);

xmlwrite(docNode)

ans =
<?xml version="1.0" encoding="utf-8"?>
<AddressBook>
   <Entry>
      <Name>Friendly J. Mathworker</Name>
      <PhoneNumber>(508) 647-7000</PhoneNumber>
   </Entry>
</AddressBook>
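Comment and CDATA nodes are created from the document in the same way as element and text nodes. The following is a small sketch using the standard DOM methods createComment and createCDATASection. The Notes element and the sample text are hypothetical and are not part of the address book example.

```matlab
docNode = com.mathworks.xml.XMLUtils.createDocument('AddressBook');
entry_node = docNode.createElement('Entry');
docNode.getDocumentElement.appendChild(entry_node);

% A comment node, created by the document like any other node.
comment_node = docNode.createComment('Entries are added by hand');
entry_node.appendChild(comment_node);

% A CDATA section holds text that should not be escaped, such as "<" or "&".
% 'Notes' is a hypothetical element name used only for this sketch.
notes_node = docNode.createElement('Notes');
notes_cdata = docNode.createCDATASection('Call before 5 pm & ask for <Friendly>');
notes_node.appendChild(notes_cdata);
entry_node.appendChild(notes_node);

% The text of the element is the unescaped content of the CDATA section.
notes_text = char(notes_node.getTextContent);

xmlwrite(docNode)
```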
For the final step, I'm going to add an address to this person's entry. In addition to a simple text node, I've added some attributes using two different methods. First, I use the convenience method setAttribute(name, value) to indicate that this Address is of type "work". In the second case I use the more formal node structure to create a "hasZip" attribute to indicate that I left off my company's zip code from the address. Note that this implementation of xmlwrite sorts the attributes alphabetically when displaying the document, whereas element nodes stay in the order in which they are appended.
address_node = docNode.createElement('Address');
address_node.setTextContent('3 Apple Hill Dr, Natick MA');

% set an attribute directly
address_node.setAttribute('type','work');
entry_node.appendChild(address_node);

% or create the attribute as a node
has_zip_attribute = docNode.createAttribute('hasZip');
has_zip_attribute.setNodeValue('no');
address_node.setAttributeNode(has_zip_attribute);

xmlwrite(docNode)

ans =
<?xml version="1.0" encoding="utf-8"?>
<AddressBook>
   <Entry>
      <Name>Friendly J. Mathworker</Name>
      <PhoneNumber>(508) 647-7000</PhoneNumber>
      <Address hasZip="no" type="work">3 Apple Hill Dr, Natick MA</Address>
   </Entry>
</AddressBook>
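Once attributes are set, they can be read back with the standard DOM methods getAttribute and hasAttribute. The sketch below also sets an attribute on the root element, which works the same way once you have it from getDocumentElement. The owner attribute is a hypothetical name chosen for illustration.

```matlab
docNode = com.mathworks.xml.XMLUtils.createDocument('AddressBook');
address_node = docNode.createElement('Address');
address_node.setAttribute('type', 'work');
address_node.setAttribute('hasZip', 'no');
docNode.getDocumentElement.appendChild(address_node);

% getAttribute returns a java.lang.String; char() converts it for MATLAB.
address_type = char(address_node.getAttribute('type'));

% hasAttribute reports whether an attribute is present at all.
has_zip_present = address_node.hasAttribute('hasZip');
has_country = address_node.hasAttribute('country');   % never set above

% The root element accepts attributes in the same way.
% 'owner' is a hypothetical attribute name.
root_node = docNode.getDocumentElement;
root_node.setAttribute('owner', 'Friendly J. Mathworker');
owner = char(root_node.getAttribute('owner'));
```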
Next time I post about XML, I'll describe how to navigate a DOM to find data of interest.
17 Comments
How do you go about navigating a large xml file in MATLAB? I’m really interested in reading your next post. Any idea when it will be?
In particular, I’m getting lost in the file as I attempt to navigate to an attribute to set it to a new value.
You can do that using tree navigation methods such as docNode.getElementsByTagName and node.getChildNodes. There is also support for XPath expressions. That will be the topic of my next post in the series. I was going to wait until we were done with all the R2010b features, but I’ll try to move it up.
I used that to navigate to the closest distinct parent node (process in the code excerpt below). The element that actually contains the attribute is surrounded by 5 other elements with the same name.
Here’s an excerpt from the code.
I'm trying to get to the second attribute of the parameter element to change it. Since there are several parameter elements earlier in the file--I don't know how to get to the nth instance of this element (nor do I know which instance it is). I used the tag that you mentioned to get to the process element but can't navigate to the parameter element from it.
Does that make any sense? :)
I tried using the tree navigation methods getFirstChild and getNextSibling, but I got lost in the nodes (which appear to include whitespace). Is there an easy way to count nodes?
Thank you for your assistance.
Looking forward to your blog.
Rats! The code didn’t get posted.
process attribute attribute
pre attribute attribute
pre attribute attribute
pre attribute attribute
Editor attribute attribute
parameter attribute attribute
Here it is. I realize the syntax is wrong. This is just to show the general structure.
pre elements are children of process and Editor and parameter elements are children of the last pre element
When you mention that there is support for XPath expressions, what do you mean? Is there a toolbox, and can you post a link?
Sorry for not responding sooner. To get the n’th child node, you can use node.getChildNodes.item(N). You’ll have to use that item() method to iterate over the group of child nodes if you don’t know the order of the one you want, and use things like getAttribute(attrName) to find which node you want.
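As a rough sketch of that loop: the file contents, element names, and id attribute below are hypothetical stand-ins. Two details are worth noting. item() is zero-based, and a parsed document usually contains whitespace text nodes between elements, so the loop checks the node type before calling getAttribute.

```matlab
% Write a small sample file; the element and attribute names are hypothetical.
filename = [tempname '.xml'];
fid = fopen(filename, 'w');
fprintf(fid, '%s\n', ...
    '<process>', ...
    '  <pre id="a"/>', ...
    '  <pre id="b"/>', ...
    '  <pre id="c"/>', ...
    '</process>');
fclose(fid);

docNode = xmlread(filename);
delete(filename);

ELEMENT_NODE = 1;   % value of org.w3c.dom.Node.ELEMENT_NODE

children = docNode.getDocumentElement.getChildNodes;
numAll = children.getLength;          % includes whitespace text nodes

match = [];
numElements = 0;
for k = 0:children.getLength-1        % item() is zero-based
    child = children.item(k);
    if child.getNodeType ~= ELEMENT_NODE
        continue                      % skip text nodes such as whitespace
    end
    numElements = numElements + 1;
    if strcmp(char(child.getAttribute('id')), 'b')
        match = child;                % the element we were looking for
    end
end

% Change the attribute on the node that was found.
match.setAttribute('id', 'changed');
```

Counting only the nodes whose type is ELEMENT_NODE also gives a simple way to count elements without being thrown off by the whitespace nodes.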
To use Xpath, there’s no need for a separate toolbox. Java has it built in and we can take advantage of that. For example:
import javax.xml.xpath.*
factory = XPathFactory.newInstance;
xpath = factory.newXPath;
expression = xpath.compile('//pre/process'); % where the string is your XPath expression
nodeList = expression.evaluate(docNode, XPathConstants.NODESET);
It’s not a straightforward example, and I hope to expand on it more in my next post on the topic, but that’s still like a month away.
Information on Java’s XPath: http://download.oracle.com/javase/6/docs/api/javax/xml/xpath/package-summary.html.
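For anyone who wants to try the XPath approach end to end, here is a self-contained sketch. The document, the element names, and the id attribute are hypothetical. It uses fully qualified class names instead of import, so it behaves the same in a script or a function.

```matlab
% Build a small document in memory; all names here are hypothetical.
docNode = com.mathworks.xml.XMLUtils.createDocument('process');
root = docNode.getDocumentElement;
for k = 1:3
    pre_node = docNode.createElement('pre');
    pre_node.setAttribute('id', sprintf('p%d', k));
    root.appendChild(pre_node);
end

% Compile an XPath expression and evaluate it against the document.
factory = javax.xml.xpath.XPathFactory.newInstance;
xpath = factory.newXPath;
expression = xpath.compile('/process/pre[@id="p2"]');
nodeList = expression.evaluate(docNode, javax.xml.xpath.XPathConstants.NODESET);

% The result is a NodeList; item() is zero-based.
numFound = nodeList.getLength;
foundId = char(nodeList.item(0).getAttribute('id'));
```

Requesting XPathConstants.NODESET returns a NodeList, so the same item() and getLength calls used for child nodes apply to the result.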
This is really handy. How do you change the encoding to UTF-16 instead of UTF-8, though? I see there is the setEncoding command, but when I use it the header of the file still displays UTF-8, although the variables in the MATLAB workspace are listed as encoded as UTF-16.
I just answered this question on another post, check it out there. http://blogs.mathworks.com/desktop/2010/06/28/using-xml-in-matlab/#comment-7743
Mike says:
“Note that this implementation of xmlwrite alphabetically sorts the attributes when displaying the document, whereas element nodes stay in the order in which they are appended.”
I use xmlwrite(filename, DOMnode), so in my XML document the attributes are sorted alphabetically.
I would like them to stay in the order I chose.
How could I do that?
The implementation we use does not support this, so there is no way that I know of.
How can I assign a stylesheet to the xml document like:
I have used the above code to create an XML file having a set of points as entries, each with a given name and coordinate. When I look at the XML file in, for example, XML Notepad or Internet Explorer, I see what I expect: the given number of points with their names and coordinates as child nodes.
However, if I want to import this XML file into MATLAB again, using the parseXML function suggested in the documentation of xmlread, two things appear wrong…
First of all, I get additional ‘empty’ nodes: if I give one point as input, I expect a structure with one child, but I get one with 3 child structures: the second one has my input, and the first and third ones have the name ‘text’ and are empty. Second, my ‘correct’ child, having the name ‘point’, has 5 child structures while I expect two (‘Name’ and ‘Coordinate’)…
Do you have any idea what went wrong? Did I do anything wrong with using the exact code above and just adapting the name-tags?
Is there a way to add an attribute to the first node? Its name is “AddressBook” in this case.
Cihan Baris TUNCER
My xml file looks like this:
I need to save the NumberOfBaseStations: 3 and the BS attributes: ID="0" X="0" Y="0" ... of all BS nodes in a MATLAB cell array.
configxml = xmlread('ConfigFile.xml');
% Get the "Configuration" node
configuration = configxml.getDocumentElement;
% Get the first child of "Configuration": NumberOfBaseStations
numBSs = configuration.item(1).getChildNodes;
num = str2num(numBSs.getTextContent);
% Get the second child of "Configuration": BaseStations
basestations = configuration.item(3).getChildNodes;
bs = basestations.item(1).getChildNodes;
i = 0;
if strcmpi(bs.getNodeName, 'BS')
???? How to get the attributes???
bs = bs.getNextSibling;
if i == num
When I use
numBSs = configuration.item(0).getChildNodes;
I cannot understand why.
The second problem is how to get the attributes of the BS nodes. My ideas didn’t work.
Please, can you give some advice on how to fix these problems?
I have a problem to post the xml text. Perhaps there is a tag for xml, but I don’t know it. Sorry.
So, the BS nodes are children of BaseStations.
<?xml version="1.0" encoding="utf-8"?>
<BS ID="0" X="0" Y="0"/>
<BS ID="1" X="0" Y="0"/>
<BS ID="2" X="0" Y="0"/>