WordPress 1.2 now has an its own RSS import feature. However, it’s based on a different technique (regular expressions) than the code I contributed in January (which uses a true XML SAX parser). So I’m posting the code here as open source under the GPL license. This code has some additional features:

  • It can import single files from either your local drive or from a URL you specify, or it can import entire folder hierarchies of RSS files (blogBrowser-style: one folder per year, one file per month), making it a general-purpose weblog batch import tool using RSS as the exchange format.
  • It aggregates RSS feeds, if you point one or more copies of it at feeds on the web and set it to run regularly. (Even when run frequently, it won’t import the same item twice.) You can also use this to maintain more than one WordPress site that shares the same content, such as a test site and a production site.
  • It handles time zones in a sophisticated way, preserving the timezone offset so that each item can appear on your weblog under the author’s original local time, while using GMT for all date comparisons.
  • It respects and stores modification dates if given in the RSS file.
  • If modification dates are given in the RSS file, it can optionally import only new or changed posts, leaving posts alone that haven’t been changed or that have been changed more recently on the local machine.
  • Using the above feature and two copies of WordPress, it can synchronize two or more weblogs, bidirectionally or multi-directionally. New and changed posts on any one weblog will automatically show up on the others.
  • It complies with the XML specification, for correct behavior with XML namespaces with arbitrary prefixes and CDATA sections in arbitrary locations, both of which can trip up a regular-expression-based parser.

As long as your RSS feed passes the XML well-formedness test (which it probably does, even if it doesn’t validate according to the RSS Validator), you can use this RSS Import filter. If it’s not well-formed XML, you’re better off with the RSS import filter built into WordPress.

Versions are available for WordPress 0.9 through 1.2.

More Info and Download

blog comments powered by Disqus