XMLFilter
Parse and modify XML anywhere
XMLFilter is a module for Python
programs. It augments or stands in for xml.sax in four ways:
- It provides an xml.sax-compatible XML parser even when the installed
version of Python lacks a working copy of xml.sax. This allows your
scripts to work when expat is missing, such as in the version of Python
bundled with Mac OS X 10.2 (Jaguar), which would otherwise fail with the error:
    xml.sax._exceptions.SAXReaderNotAvailable: No parsers found
XMLFilter gets around the problem by falling back to the
pure-Python xmllib and adapting it to match xml.sax’s callbacks. A
test suite verifies call-for-call compatibility, so after substituting
XMLFilter for xml.sax.handler.ContentHandler, your existing scripts
should otherwise run unmodified.
- It allows subclasses to filter, modify, add, and delete content in
an XML file with minimal disruption to the rest of the file. Multiple
filters can be chained in series.
- It allows output to be written to an XML file. This works even
without xml.sax, and avoids an xml.sax.saxutils.XMLGenerator bug in
Python 2.2.
- It allows programs to hint that they want to write particular chunks
of content to an XML file as CDATA, using a method that’s fully
compatible with code (SAX handlers, filters, or output handlers) that
doesn’t have any special CDATA support. This is useful for RSS
files embedding HTML or other data that would
be unwieldy after XML entity encoding.
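To illustrate the CDATA point in the list above, here is a minimal sketch comparing entity encoding with a CDATA section for a chunk of embedded HTML. It uses only the standard library and does not show XMLFilter's own hinting method, which is not documented here; the helper name as_cdata is an assumption made up for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def as_cdata(text):
    """Wrap text in a CDATA section (hypothetical helper, not XMLFilter's API).

    A literal "]]>" cannot appear inside CDATA, so it is split across
    two adjacent sections.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


html = "<p>Tom & Jerry <b>rock</b></p>"

# Entity encoding: every <, > and & is replaced, which is hard to read.
encoded = "<description>" + escape(html) + "</description>"

# CDATA: the HTML stays readable inside the XML file.
wrapped = "<description>" + as_cdata(html) + "</description>"

# Both forms parse back to the same text, so a reader without any
# special CDATA support sees identical content either way.
same = ET.fromstring(encoded).text == ET.fromstring(wrapped).text == html
```

Because a parser reports the same character data for both forms, code downstream never needs to know which one was written.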
If xml.sax is working, the code uses it in preference to the older
xmllib for a factor-of-3 performance boost. Namespaces are fully
supported, and can be switched on or off. Character encodings are
supported on Unicode-aware versions of Python.
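Choosing between the two parsers comes down to asking whether xml.sax can actually create a parser. A minimal sketch of that check, written for current Python and using only standard-library calls; the function name sax_available is an assumption for illustration, not part of XMLFilter:

```python
import xml.sax
from xml.sax import SAXReaderNotAvailable


def sax_available():
    """Return True if xml.sax can create a parser on this interpreter."""
    try:
        xml.sax.make_parser()
    except SAXReaderNotAvailable:
        # Raised with "No parsers found" when no parser driver
        # (normally expat) can be loaded.
        return False
    return True
```

A caller would use the result to decide whether to hand the document to xml.sax or to a fallback parser.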
XMLFilter has been successfully tested with versions of Python
ranging from 1.5.2 to 2.3.
It is distributed under a Python license.
XMLFilter 1.1 Download
[Download .zip Archive] Windows CRLF format, 14K
[Download .tgz Archive] Unix/Linux/Mac OS X LF format, 12K
(The XML test suite has been stripped down to a couple of
files for distribution. If you want, you can put your own XML files into the test
folder, and the test suite will verify that the generated SAX event sequences
match with and without the use of xml.sax.)
Example
For sample code using XMLFilter as a safe drop-in replacement for xml.sax, look
at my PList reader.
For a more complex example, see RSSFilter, which
uses XMLFilter’s filter chaining to perform operations on an RSS file in place,
such as getting all posts that match given criteria, or adding, modifying, or
deleting a post.
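To give a feel for what filter chaining looks like, here is a minimal sketch built on the standard library's xml.sax.saxutils.XMLFilterBase rather than on XMLFilter itself, whose class API is not shown on this page. The class names DropElement and UppercaseText, the function run_chain, and the sample element names are all assumptions invented for the example.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class DropElement(XMLFilterBase):
    """Remove every element with the given name, including its content."""

    def __init__(self, parent, name):
        super().__init__(parent)
        self._name = name
        self._depth = 0  # greater than 0 while inside a dropped element

    def startElement(self, name, attrs):
        if self._depth or name == self._name:
            self._depth += 1
        else:
            super().startElement(name, attrs)

    def endElement(self, name):
        if self._depth:
            self._depth -= 1
        else:
            super().endElement(name)

    def characters(self, content):
        if not self._depth:
            super().characters(content)


class UppercaseText(XMLFilterBase):
    """Pass all events through, upper-casing character data."""

    def characters(self, content):
        super().characters(content.upper())


def run_chain(xml_text):
    """Parse xml_text through both filters and return the rewritten XML."""
    out = io.StringIO()
    parser = xml.sax.make_parser()
    # Events flow: parser -> DropElement -> UppercaseText -> XMLGenerator
    chain = UppercaseText(DropElement(parser, "draft"))
    chain.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    chain.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()


result = run_chain(
    "<feed><item>keep</item><draft><item>drop</item></draft></feed>"
)
```

Each filter only overrides the events it cares about and forwards everything else untouched, which is what keeps the rest of the file undisturbed.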