Python XML Basics

This note covers how to read and write XML files with Python's DOM API (xml.dom.minidom).

First we need the module:

import xml.dom.minidom as dom

To create a document to write to:

writer = dom.Document()

A well-formed XML document must have exactly one root element, which can be created by:

root = writer.createElement("root")

Then add the element to the document:

writer.appendChild(root)

Now save the file:

with open('./foo.xml', "w") as file_out:
    writer.writexml(file_out)

This produces a file containing the XML declaration followed by the empty root element:

<?xml version="1.0" ?><root/>
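
The writing steps above can be put together in one short sketch. To keep it from touching the disk, an io.StringIO buffer stands in for the file object; that substitution is an assumption of this example, not part of the original steps.

```python
import io
import xml.dom.minidom as dom

# Build a document with a single root element, as in the steps above.
writer = dom.Document()
root = writer.createElement("root")
writer.appendChild(root)

# io.StringIO stands in for the opened file (an assumption for this sketch);
# writexml() only needs an object with a write() method.
buffer = io.StringIO()
writer.writexml(buffer)

xml_text = buffer.getvalue()
print(xml_text)
```

Swapping the buffer for a file opened in "w" mode should give the same content in foo.xml.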

Before saving the file, we can add more elements within the root node. Simply create a new element and append it to root as a child:

branch = writer.createElement('branch')
root.appendChild(branch)

It is better to wrap this in a function that returns the new element:

def create_branch(doc, direction):
    # direction is not used yet; it could be stored as an attribute on the element
    branch = doc.createElement("branch")
    return branch

An attribute is set on an element with:

branch.setAttribute(attribName, attribValue)
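
As a small illustration of setAttribute, the sketch below sets one attribute and reads it back. The attribute name "direction" and the value "left" are made up for the example.

```python
import xml.dom.minidom as dom

doc = dom.Document()
branch = doc.createElement("branch")

# "direction" and "left" are illustrative values, not from the original note.
branch.setAttribute("direction", "left")

has_direction = branch.hasAttribute("direction")
branch_xml = branch.toxml()
print(branch_xml)  # <branch direction="left"/>
```

Calling setAttribute again with the same name replaces the old value rather than adding a second attribute.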

Its text content is set with:

text = writer.createTextNode(someText)
branch.appendChild(text)

Note that the text content is itself a child node of the element.

In this way, we can add more branches to the root, and add more branches to a branch.
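
A possible way to combine elements, attributes and text into a nested tree is sketched below. This create_branch variant stores direction as an attribute and accepts an optional text argument; both choices are assumptions made for the example.

```python
import xml.dom.minidom as dom

def create_branch(doc, direction, text=None):
    """Return a new <branch> element.

    Storing 'direction' as an attribute and the optional 'text'
    parameter are assumptions of this sketch.
    """
    branch = doc.createElement("branch")
    branch.setAttribute("direction", direction)
    if text is not None:
        # The text becomes a child node of the element.
        branch.appendChild(doc.createTextNode(text))
    return branch

doc = dom.Document()
root = doc.createElement("root")
doc.appendChild(root)

# A branch under the root, and a branch under that branch.
left = create_branch(doc, "left")
root.appendChild(left)
left.appendChild(create_branch(doc, "up", "a leaf"))

# A second branch directly under the root.
root.appendChild(create_branch(doc, "right", "another leaf"))

tree_xml = root.toxml()
print(tree_xml)
```

For a more readable file, toprettyxml() can be used in place of toxml(); it adds indentation and line breaks to the output.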

Now to read the XML:

with open('./foo.xml', "r") as file_in:
    reader = dom.parse(file_in)
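
When the XML is already in memory, dom.parseString does the same job as dom.parse without a file. The sketch below uses a made-up XML string and also shows documentElement, a shortcut to the root element.

```python
import xml.dom.minidom as dom

# A made-up XML string used only for this example.
xml_text = '<root><branch direction="left">leaf</branch></root>'

reader = dom.parseString(xml_text)

# The document holds the root element as its only child here.
root = reader.documentElement
print(root.tagName)
print(len(reader.childNodes))
```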

The reader is the document object, which holds the root node. To iterate over all children of a node:

for e in reader.childNodes:
    pass  # do something with e

The name of an element node can be obtained with:

e.localName

So the localName of <root/> is 'root'.
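
One thing to watch for: childNodes also contains text nodes, including the whitespace between tags in an indented file, and their localName is None. The sketch below, using a made-up indented string, filters on nodeType to keep only elements.

```python
import xml.dom.minidom as dom

# Indented XML (made up for this example); the line breaks and spaces
# between tags are parsed as text nodes.
reader = dom.parseString("<root>\n  <branch/>\n  <branch/>\n</root>")
root = reader.documentElement

# localName of every child, text nodes included.
names = [e.localName for e in root.childNodes]

# Keep only element nodes.
element_names = [
    e.localName for e in root.childNodes
    if e.nodeType == e.ELEMENT_NODE
]
print(names)
print(element_names)
```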

An attribute can be retrieved with:

e.getAttribute(attribName)

The text content, assuming the element's first child is a text node, is:

e.childNodes[0].toxml()
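
The sketch below reads an attribute and the text content from a made-up element. It also shows a difference worth knowing: toxml() returns the serialized text, so special characters come back escaped, while the data attribute of the text node holds the raw string.

```python
import xml.dom.minidom as dom

# A made-up document; the text contains an escaped ampersand.
reader = dom.parseString(
    '<root><branch direction="left">fish &amp; chips</branch></root>'
)
branch = reader.documentElement.childNodes[0]

direction = branch.getAttribute("direction")
# getAttribute returns an empty string for an attribute that is not set.
missing = branch.getAttribute("colour")

text_node = branch.childNodes[0]
serialized = text_node.toxml()  # escaped form: 'fish &amp; chips'
raw = text_node.data            # raw string: 'fish & chips'
print(direction, repr(missing), serialized, raw)
```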

It is better to use a recursive function to visit all nodes:

def print_a_document(doc):
    for e in doc.childNodes:
        # do something to e
        print_a_document(e)

Pass the reader to this function and implement the reading there.
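
One way the recursive reading could be filled in is sketched below. The function name collect_elements, its depth and found parameters, and the sample XML are all made up for the example; it records each element's depth, name and attributes instead of printing.

```python
import xml.dom.minidom as dom

def collect_elements(node, depth=0, found=None):
    """Recursively gather (depth, name, attributes) for every element.

    A hypothetical helper for this sketch, following the same
    recursion pattern as print_a_document.
    """
    if found is None:
        found = []
    for e in node.childNodes:
        # Skip text nodes and other non-element nodes.
        if e.nodeType == e.ELEMENT_NODE:
            found.append((depth, e.localName, dict(e.attributes.items())))
        collect_elements(e, depth + 1, found)
    return found

reader = dom.parseString(
    '<root>'
    '<branch direction="left"><branch direction="up"/></branch>'
    '<branch direction="right"/>'
    '</root>'
)

elements = collect_elements(reader)
for depth, name, attributes in elements:
    print("  " * depth + name, attributes)
```

Returning a list rather than printing inside the recursion keeps the traversal separate from whatever is done with the results.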
