Andrew Shearer: Home

Also see the list of articles, none to be taken seriously.

RI Nexus Beta

Check out the beta of the new RI Nexus site, full of news and resources about information technology and digital media in Rhode Island.

I wrote some new Drupal modules to support it. Feedback and suggestions are welcome!

Posted September 11, 2007 at 10:11 PM

Categories: Open Source, Drupal, General, Providence, PHP, Technology

Announcing Migraine, a Drupal Site Migration Tool

Thanks to Noosphere Networks, I’m releasing a script that helps developers of web sites built with Drupal to maintain separate development/test and production sites, pushing changes from test to production as needed. This is challenging with a stock Drupal installation. Changes to PHP code are no problem, because it lives in the filesystem and can be copied or committed to a revision-control system like Subversion. But a lot of Drupal’s configuration work takes place within its web administration interface and is saved to the database, where production content such as user accounts and comments is also stored.

The desire to do this frequently comes up on Drupal’s forums, and the typical workarounds have some large drawbacks (involving some combination of extended downtime on the production site, duplication of work, and the loss of content, comments, and user account changes made in the interim).

This small script attempts to solve that by categorizing Drupal’s tables and moving only the right ones at the right time, while handling details such as merging sequence numbers. It also dumps Drupal’s databases to disk in a format that works well for checkin to a revision control system.
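To make the categorizing idea concrete, here is a rough Python sketch. It is not Migraine's actual code: the two table groupings, the function names, and the reading of "merging sequence numbers" as "keep the higher counter" are all assumptions made for illustration.

```python
# Hypothetical groupings for illustration only; Migraine's real lists may differ.
CONFIG_TABLES = {"variable", "blocks", "menu", "system", "permission"}
CONTENT_TABLES = {"node", "node_revisions", "comments", "users"}

def plan_migration(tables):
    """Split table names into those pushed from test to production,
    those kept as-is on production, and those needing manual review."""
    plan = {"push": [], "keep": [], "review": []}
    for name in sorted(tables):
        if name in CONFIG_TABLES:
            plan["push"].append(name)
        elif name in CONTENT_TABLES:
            plan["keep"].append(name)
        else:
            plan["review"].append(name)  # unknown table, e.g. one created by CCK
    return plan

def merge_sequences(test_seq, prod_seq):
    """Assumed merge rule: keep the higher counter for each sequence name,
    so IDs handed out after the merge cannot collide with existing rows."""
    merged = dict(prod_seq)
    for name, value in test_seq.items():
        merged[name] = max(value, merged.get(name, 0))
    return merged
```

Anything that lands in the "review" bucket is the kind of table a person would need to look at before pushing.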

This is free software, licensed under the GPL.

There’s a more ambitious project called AutoPilot that aims to do this and more in the future, but its ability to merge test sites into production without losing production content isn’t available yet, and I needed something now.

Be warned, though, that this is an alpha release, intended for those with familiarity with MySQL and Drupal’s table layout. If you have CCK fields, there may be some manual work required when you modify your field layout because CCK tends to change your database schema, and Migraine does not currently attempt to automate all of those changes. It will detect them and warn of the problem, however.

See more information at the Migraine project page.

Posted September 04, 2007 at 07:21 AM

Categories: Python, Software, Open Source, PHP, General

Providence PHP September Meetup

I’m holding the next PHP meetup for the RI area tomorrow evening (Tuesday) at Trinity Brewhouse in downtown Providence.

If you’d like to join us for free-form discussion of web development, PHP, and various types of beer, please RSVP.

Posted September 03, 2007 at 10:58 PM

Categories: Providence, PHP, Open Source, General

Providence PHP Meetup: Tuesday, April 3

We have a new location and a guest speaker for our April meetup. Nate Abele of the CakePHP project will be here to show off the rapid web development framework and answer questions. We'll make time for discussion too.

The new location is a really nice conference room at the Johnson & Wales Academic Center, with everything that implies (i.e. a projector).

All programming skill levels welcome. If you're going to be in the Providence, RI area and can make it, please see here for more details and to RSVP:

Providence PHP April Meetup [meetup.com]

Posted March 27, 2007 at 06:28 PM

Categories: Open Source, Software, Technology, Providence, General

Announcing fs2svn: make a Subversion repository from archive folders

fs2svn is a new, free, open-source tool that converts a bunch of archive folders into a Subversion repository.

If you’ve kept a series of historical snapshots of your work in folders, fs2svn can help you upgrade to a full-fledged version control system.

fs2svn goes through all the folders under a given parent folder (in filesystem order) and creates a Subversion revision for each one, backdated to the most recent file’s last modified date. The log message is set to the folder name.

Additions, changes, and deletions between one folder and the next are all recorded in the repository.
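The walk described above can be sketched in Python roughly as follows. This is not fs2svn's source: it only plans the revisions and never talks to Subversion, it visits folders in sorted name order (an assumption standing in for "filesystem order"), and it detects changes by content hash, which is also an assumption. The function names are made up for the example.

```python
import hashlib
import os

def snapshot_files(folder):
    """Map each file's path (relative to folder) to (content hash, mtime)."""
    files = {}
    for root, _dirs, names in os.walk(folder):
        for name in names:
            full = os.path.join(root, name)
            with open(full, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            files[os.path.relpath(full, folder)] = (digest, os.path.getmtime(full))
    return files

def plan_revisions(parent):
    """Yield one planned revision per snapshot folder under parent."""
    previous = {}
    for name in sorted(os.listdir(parent)):
        folder = os.path.join(parent, name)
        if not os.path.isdir(folder):
            continue
        current = snapshot_files(folder)
        yield {
            "log_message": name,  # the folder name becomes the log message
            # backdate to the newest file's modification time (None if empty)
            "date": max((mtime for _h, mtime in current.values()), default=None),
            "added": sorted(set(current) - set(previous)),
            "deleted": sorted(set(previous) - set(current)),
            "changed": sorted(p for p in current
                              if p in previous and current[p][0] != previous[p][0]),
        }
        previous = current
```

Each yielded dictionary holds what one commit would need: a message, a date, and the three lists of paths.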

The input format is very simple. It only covers the mainline trunk, not any tags or branches (though tags for major versions could be manually created later, if your folder names carry enough information).

The format is so simple it could be used as a common intermediary. If you wanted to migrate a mainline trunk from some exotic version control system to Subversion, you could write a script to export it to regular folders, then use this script to import the result into Subversion.

See the main fs2svn page for information, examples, and to download.

Posted June 24, 2005 at 12:14 AM

Categories: Python, Software, Open Source, General

WordPress RSS Import

WordPress 1.2 now has its own RSS import feature. However, it’s based on a different technique (regular expressions) than the code I contributed in January (which uses a true XML SAX parser). So I’m posting the code here as open source under the GPL license. This code has some additional features:

It can import single files from either your local drive or from a URL you specify, or it can import entire folder hierarchies of RSS files (blogBrowser-style: one folder per year, one file per month), making it a general-purpose weblog batch import tool using RSS as the exchange format.
It aggregates RSS feeds, if you point one or more copies of it at feeds on the web and set it to run regularly. (Even when run frequently, it won’t import the same item twice.) You can also use this to maintain more than one WordPress site that shares the same content, such as a test site and a production site.
It handles time zones in a sophisticated way, preserving the timezone offset so that each item can appear on your weblog under the author’s original local time, while using GMT for all date comparisons.
It respects and stores modification dates if given in the RSS file.
If modification dates are given in the RSS file, it can optionally import only new or changed posts, leaving posts alone that haven’t been changed or that have been changed more recently on the local machine.
Using the above feature and two copies of WordPress, it can synchronize two or more weblogs, bidirectionally or multi-directionally. New and changed posts on any one weblog will automatically show up on the others.
It complies with the XML specification, for correct behavior with XML namespaces with arbitrary prefixes and CDATA sections in arbitrary locations, both of which can trip up a regular-expression-based parser.
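The "import only new or changed posts" rule, combined with GMT-based date comparison, might look something like the Python sketch below. The import code itself is PHP; this is only an illustration, and the function name, the GUID-keyed dictionary, and the sample dates are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def should_import(local_posts, guid, modified):
    """Return True if the incoming item is new, or newer than the local copy.

    local_posts maps a post's GUID to its timezone-aware modification time.
    """
    local_modified = local_posts.get(guid)
    if local_modified is None:
        return True  # never seen before: import it
    # Normalize both sides to UTC so the original offsets do not affect the result.
    return modified.astimezone(timezone.utc) > local_modified.astimezone(timezone.utc)

# 08:15 at UTC-5 and 13:15 UTC are the same instant, so nothing is re-imported.
eastern = timezone(timedelta(hours=-5))
local = {"post-1": datetime(2004, 6, 10, 8, 15, tzinfo=eastern)}
same_instant = datetime(2004, 6, 10, 13, 15, tzinfo=timezone.utc)
print(should_import(local, "post-1", same_instant))  # False
```

Because each timestamp keeps its own offset, the post can still be displayed in the author's local time while the comparison happens in UTC.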

As long as your RSS feed passes the XML well-formedness test (which it probably does, even if it doesn’t validate according to the RSS Validator), you can use this RSS Import filter. If it’s not well-formed XML, you’re better off with the RSS import filter built into WordPress.
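To show why a real XML parser copes with arbitrary prefixes and CDATA, here is a small Python sketch using the standard library's SAX parser in namespace mode. It is an illustration only, not the PHP import code; the class and function names are invented, and the sample feed deliberately binds the Dublin Core namespace to the unusual prefix "author".

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core namespace URI

class CreatorCollector(ContentHandler):
    """Collect the text of every creator element in the Dublin Core namespace."""
    def __init__(self):
        super().__init__()
        self.creators = []
        self._buffer = None

    def startElementNS(self, name, qname, attrs):
        # name is a (namespace URI, local name) pair, so the prefix is irrelevant.
        if name == (DC_NS, "creator"):
            self._buffer = []

    def characters(self, content):
        # CDATA sections arrive here as ordinary character data.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElementNS(self, name, qname):
        if name == (DC_NS, "creator"):
            self.creators.append("".join(self._buffer))
            self._buffer = None

def extract_creators(xml_bytes):
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = CreatorCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.creators

feed = (b'<rss xmlns:author="http://purl.org/dc/elements/1.1/"><channel><item>'
        b'<author:creator><![CDATA[A & B]]></author:creator>'
        b'</item></channel></rss>')
print(extract_creators(feed))  # ['A & B']
```

A regular expression looking for a literal dc:creator tag would miss this element entirely. If the input is not well-formed, the parser raises an error instead of guessing.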

Versions are available for WordPress 0.9 through 1.2.

More Info and Download

Posted June 10, 2004 at 08:15 AM

Categories: Software, Open Source, General

RSSFilter

Now available: RSSFilter, an open source Python module for modifying RSS files and blogBrowser-format RSS archives in place. It builds on XMLFilter. (Speaking of which, thanks to Mark Pilgrim for its recent mention in his b-links.)

The module can also be used as an RSS parser for valid XML feeds, though it trades in ultra-liberal parsing for its ability to safely modify files.

Operations such as inserting, modifying, or deleting a post are designed to cause minimal disruption to the rest of the file.
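For a feel of the general filter approach, here is a sketch built on Python's standard xml.sax.saxutils.XMLFilterBase rather than on RSSFilter or XMLFilter themselves, whose APIs are not shown here. The class name, the function name, and the choice of uppercasing titles are all made up for the example. Every SAX event passes through untouched except the one being changed.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator

class TitleUppercaser(XMLFilterBase):
    """Forward every SAX event unchanged, except text inside <title> elements."""
    def __init__(self, parent):
        super().__init__(parent)
        self._in_title = False

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
        super().startElement(name, attrs)

    def endElement(self, name):
        if name == "title":
            self._in_title = False
        super().endElement(name)

    def characters(self, content):
        super().characters(content.upper() if self._in_title else content)

def rewrite_titles(xml_bytes):
    """Parse xml_bytes, apply the filter, and return the regenerated XML text."""
    out = io.StringIO()
    reader = TitleUppercaser(xml.sax.make_parser())
    reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    reader.parse(io.BytesIO(xml_bytes))
    return out.getvalue()
```

A plain SAX round trip like this one regenerates the document from events, so it does not keep the file byte-for-byte identical (CDATA markers and quoting style can change). Keeping the untouched parts of the file stable takes extra care beyond what this sketch does.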

iPhoto comments, flattened with Text File Technology

Here’s a way to back up iPhoto’s image comments into an easy-to-read flat directory structure. (Translation: one big folder.) You’d want to do this when archiving your photos to CD or DVD, or when trying to merge photo libraries, or when leaving iPhoto for another program, or at any other time you want your comments saved in a non-proprietary, easily readable format.

As you may have read last week, when I upgraded to iPhoto 4, all the image descriptions temporarily disappeared from my online photo albums. (I caught the problem on my own staging server before it appeared on this site.) The culprit was a change in the way iPhoto stores photo comments. Comments are now entirely gone from the easy-to-parse AlbumData.xml file; iPhoto now stores them in a binary format that appears to be proprietary.

AppleScript to the rescue. Last week’s script saved the comments to text files and generated a directory structure that exactly paralleled iPhoto’s library, with one text file for each comment. These files were in folders for each day, which were in turn inside folders for each month, etc., guaranteeing there would be no name conflicts. I had rejected using the internal ID of each picture (which would have allowed a flat conflict-free directory structure) because the ID wasn’t user-visible anywhere in the iPhoto interface, making comment files named for the ID difficult to map back to the original pictures.

One of the comments on that post asked for a version that generated the comment files in one folder, based on the image’s filename. That was a good idea. Though the filename is not guaranteed to be unique, it often is in practice. Most digital cameras save unique serial numbers for each picture as part of the filename. So this is enough for most people. (The exceptions would be if you have more than one digital camera using a similar naming convention, or if your camera is configured to reset its numbering between rolls.)
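To check whether a library falls into one of those exceptions, a short Python sketch like this could list the filenames that occur more than once. The function name and the idea of walking the library folder directly are assumptions for illustration, and names are compared case-insensitively.

```python
import os
from collections import defaultdict

def find_duplicate_filenames(library_root, extension=".jpg"):
    """Return {lowercased filename: [paths]} for image names found more than once."""
    seen = defaultdict(list)
    for root, _dirs, names in os.walk(library_root):
        for name in names:
            if name.lower().endswith(extension):
                seen[name.lower()].append(os.path.join(root, name))
    return {name: sorted(paths) for name, paths in seen.items() if len(paths) > 1}
```

Calling it with something like os.path.expanduser("~/Pictures/iPhoto Library") and getting back an empty dictionary would suggest that filename-based comment files will not collide.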

If you like guaranteed accuracy, use my original script; if you like simplicity, use the following alternate script. It will only save one of the conflicting comments if photo filenames are duplicated. Dropping the parallel folder structure simplified the script, since this version doesn’t need to employ any POSIX path manipulation.

Copy the following into Script Editor and run. Tested with iPhoto 4.0 on Mac OS X 10.3. (It may also work with earlier versions; drop me a comment below if you’ve tried it.)

-- Export iPhoto Comments - Flat
-- Creates a text file corresponding to each picture with a comment, containing just the comment. The filenames of the text files correspond to the filenames of the images. So avoid having more than one image with the same filename (taken by two different cameras with similar naming conventions, perhaps). This isn’t a problem for most people, but if it is for you, use the slightly more complex version of the script that duplicates the iPhoto folder hierarchy: <http://www.shearersoftware.com/personal/weblog/2004/01/18/iphoto-4-has-comments-no-more>.
-- Note: this does not remove files in the comments folder when a comment disappears (due to deletion of either the comment or the image). To guard against this, you may want to delete the whole comment folder before rerunning this script. (Using a separate folder rather than storing comment files alongside the image makes this easier; you can flush the whole cache at once.)
-- Written to work around the fact that iPhoto 4 no longer stores photo comments in the AlbumData.xml file.
-- by Andrew Shearer, 2004-01-25 <mailto:ashearerw at shearersoftware dot com>

-- config
set commentsFolderName to "iPhoto Library - My Comments Cache - Flat"
set stripJPG to true --whether to strip .JPG extension
set openFolderInFinder to true
set commentFileSuffix to ".comment.txt"
set requiredAlbumPrefix to "Web-"
-- end config

tell application "Finder"
    --return some folder of (path to pictures folder)
    if not (exists folder named commentsFolderName of (path to pictures folder)) then make new folder at (path to pictures folder) with properties {name:commentsFolderName}
    set commentsFolderPath to folder named commentsFolderName of (path to pictures folder) as text
end tell
--set commentsFolderPath to POSIX path of (path to pictures folder) & commentsFolderName

tell application "iPhoto"
    repeat with theAlbum in (every album whose name starts with requiredAlbumPrefix)
        repeat with thePhoto in (every photo of theAlbum whose comment is not "")
            set commentText to comment of thePhoto as Unicode text
            set commentFilename to image filename of thePhoto
            if stripJPG then
                -- strip .JPG suffix (optionally)
                if commentFilename does not end with ".JPG" then
                    error "Error: file does not end with .JPG: \"" & commentFilename & "\""
                end if
                set commentFilename to text 1 through -5 of commentFilename
            end if
            -- add suffix to comment filename (.txt extension, etc.)
            set commentFilename to commentFilename & commentFileSuffix

            set f to open for access file (commentsFolderPath & commentFilename) with write permission
            set eof f to 0 -- truncate file, or old data can remain
            write commentText to f as Unicode text
            close access f
        end repeat -- photos in album
    end repeat -- albums
end tell

if openFolderInFinder then tell application "Finder" to open folder commentsFolderPath

Posted January 26, 2004 at 11:38 AM

Categories: Pictures, Open Source, Mac OS X, Software

What’s This Site Running?

I’m now using the release version of WordPress 1.0 to generate the content area of this weblog. (The headers, footers, site navigation, and subscription list are generated by ShearerSite.)

In many ways, it’s going from one extreme to the other. My own system is based on static rendering without a database, to the point that the original data itself is kept in RSS-compliant XML files on the site, and HTML files are generated from those. So there’s no programmatic server overhead for retrieval, but there is for authoring, since all the dependent pages have to be re-rendered on the spot. I’m still a fan of this type of system, but I wanted to try something different. WordPress is about as different as you can get: by default, it runs a battery of regular expressions--dozens upon dozens of them--over each post to format it at retrieval time. (Some kind of static caching may be on its way, though, judging from hints in the database schema.) The administration interface is mostly very good, making it much easier to perform administration tasks such as adding new categories than my homegrown config-file-based system did.

Pros of WordPress: very hackable (the good way, by the site owner); terrific setup routines; good navigation controls, easy to set up; well-rounded feature set.

Cons: frequently passes HTML through finicky regular expressions; too much use of addslashes() for my taste, including some double applications; a few bugs in 1.0 (though, to be fair, 1.0.1 final is imminent).

Some changes I made to my own copy include:

Improvements for source code posting, as well as XHTML validity. Made some changes in the regular expressions in the wptexturize and wpautop filters. Unmodified, they kept turning some of my posts into invalid XHTML by adding an extra </p> tag. I also had some problems with snippets of source code that I posted. WordPress’s filters would get too smart, and try to produce curly quotes around strings, as well as em dashes before AppleScript comments. They would also tend to double-space the code, because newlines were turned into <br /> and a newline by wpautop, and the pre element honors both. I modified the code so that any filter could (optionally) avoid <pre> sections in the content, letting them go through unmodified. I did this using a loop and, much as I hated to add them, two more regular expressions.
Site-relative blog home page links, to handle my unorthodox split-directory setup.
Minor permalink change, to send out two-digit days and months.
RSS import and synchronization. (I already contributed my RSS 0.9/1.0/2.0 import and sync. code to the WordPress project, but it was far too near the 1.0 series’ release date to make it in.)
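The <pre>-skipping change in the first item can be illustrated in Python. This is a sketch of the idea, not the PHP modification itself: it uses a single regular expression with re.split instead of a loop with two expressions, and the function name is hypothetical.

```python
import re

# One capturing group, so re.split keeps the <pre> blocks in its result.
PRE_BLOCK = re.compile(r"(<pre\b[^>]*>.*?</pre>)", re.DOTALL | re.IGNORECASE)

def apply_outside_pre(text_filter, content):
    """Run text_filter on everything except <pre>...</pre> sections."""
    parts = PRE_BLOCK.split(content)
    # Odd-indexed parts are the captured <pre> blocks; pass them through as-is.
    return "".join(part if i % 2 else text_filter(part)
                   for i, part in enumerate(parts))

print(apply_outside_pre(str.upper, 'intro <pre>set x to "y" -- note</pre> outro'))
# INTRO <pre>set x to "y" -- note</pre> OUTRO
```

Any filter that takes and returns a string can be wrapped this way, which matches the "any filter could (optionally) avoid <pre> sections" goal.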

Posted January 24, 2004 at 03:14 AM

Categories: Open Source, Software

iPhoto 4 has comments no more

I bought the upgrade to Apple’s iLife suite, released on Friday. Here’s a gotcha for developers who parse iPhoto’s AlbumData.xml file, though it doesn’t directly affect most users. It affects me, because my own code parses AlbumData.xml to generate my web-based photo albums (such as the England trip pictures I just posted).

Though the overall format of iPhoto’s XML file stays the same (and my script had no trouble reading it), the Comments and Date fields are gone! The Date field is renamed and in a different format, which is no problem to work around because the image file’s embedded EXIF data contains the date as well. The missing Comments field is a different story.

From my quick inspection, the comment data seems to be only stored in a newly introduced iPhoto.db file, which is in some binary format. The rationale for this is presumably performance, but that doesn’t completely make sense, since the photo title is still stored in the XML file and it may be changed just as often.

In any case, here’s a workaround that uses AppleScript to write a parallel folder structure holding just the comments, one per text file. Paste the following into a Script Editor window and run. Use this anytime you’d like to protect your comments from the vagaries of software or platform transitions or upgrades. (The parallel folder structure helps this; the script could have used iPhoto’s internal IDs and generated all the files in a single folder, but that wouldn’t have been as forward-compatible.) GPL-licensed.

-- Export iPhoto Comments
-- Creates a parallel folder structure to the iPhoto Library, with a file corresponding to each picture with a comment, containing just the comment.
-- Note: this does not remove files in the parallel folder when a comment disappears (due to deletion of either the comment or the image). To guard against this, you may want to delete the whole comment folder before rerunning this script. (Using a parallel folder structure rather than storing comment files alongside the image makes this easier; you can flush the whole cache at once.)
-- Written to work around the fact that iPhoto 4 no longer stores photo comments in the AlbumData.xml file.
-- For automatic folder creation, requires the BSD subsystem (which is installed by default).
-- by Andrew Shearer, 2004-01-18 <mailto:ashearerw at shearersoftware dot com>

-- config
set commentsFolderName to "iPhoto Library - My Comments Cache"
set stripJPG to false --whether to strip .JPG extension
set openFoldersInFinder to false
set commentFileSuffix to ".comment.txt"
set requiredAlbumPrefix to "Web-"
-- end config

set commonRootPath to POSIX path of (path to pictures folder)
set origFolderName to "iPhoto Library"
set origFolderPath to commonRootPath & origFolderName
set commentsFolderPath to commonRootPath & commentsFolderName

tell application "iPhoto"
    repeat with a in (every album whose name starts with requiredAlbumPrefix)
        repeat with p in (every photo of a whose comment is not "")
            set imagePath to image path of p
            set commentText to comment of p as Unicode text

            if imagePath does not start with origFolderPath then
                -- make sure image is inside iPhoto Library; otherwise we won't know where to put the comment file. This AppleScript comparison is case-insensitive.
                error "Image does not appear to be inside iPhoto Library. Image path: \"" & imagePath & "\". Expected library path: \"" & origFolderPath & "\""
            else
                -- construct new path in parallel folder structure
                set commentFilePath to commentsFolderPath & text from (1 + (length of origFolderPath)) to -1 of imagePath
                if stripJPG then
                    -- strip .JPG suffix (optionally)
                    if commentFilePath does not end with ".JPG" then
                        error "Error: file does not end with .JPG: \"" & imagePath & "\""
                    end if

                    set commentFilePath to "" & text 1 through -5 of commentFilePath
                end if
                -- add suffix to comment filename (.txt extension, etc.)
                set commentFilePath to commentFilePath & commentFileSuffix
                -- create intermediate folders as necessary with mkdir shell command. Finder-based alternatives for this are awkward. +++ This code has not been checked for proper shell escaping, though it does at least enclose its arguments in double quotes.
                do shell script "mkdir -p \"`dirname \"" & commentFilePath & "\"`\""
                if openFoldersInFinder then do shell script "open \"`dirname \"" & commentFilePath & "\"`\""

                -- rest of the code writes out the comment file
                set commentFile to commentFilePath as POSIX file -- make into Mac filespec

                set f to open for access commentFile with write permission
                set eof f to 0 -- truncate file, or old data can remain
                write commentText to f as Unicode text
                close access f
            end if
        end repeat -- photos in album
    end repeat -- albums
end tell

That’s the AppleScript code. The comments are now in a human-readable text format, and you could use a script such as this Python snippet to read a given picture’s comment:

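import os

commentCommonBaseDir = os.path.expanduser("~/Pictures/")
commentOrigDir = os.path.join(commentCommonBaseDir,
    "iPhoto Library")
commentParallelDir = os.path.join(commentCommonBaseDir,
    "iPhoto Library - My Comments Cache")
commentFileSuffix = ".comment.txt"

def getCommentForFile(imagePath):
    """Return the raw bytes of the comment file for imagePath, or b'' if none."""
    # Case-insensitive prefix check, matching the AppleScript comparison.
    if not imagePath.lower().startswith(commentOrigDir.lower()):
        raise ValueError(
            'Error: image does not appear to be in iPhoto Library; '
            'cannot compute comment path. Image: "%s". Library: "%s".'
            % (imagePath, commentOrigDir))
    # Map the image's path inside the library onto the parallel comments folder.
    commentPath = os.path.join(commentParallelDir,
           imagePath[len(commentOrigDir)+1:]) + commentFileSuffix
    if os.path.isfile(commentPath):
        print("Read comment for " + imagePath)
        # The AppleScript writes the comment as Unicode text, so read the
        # raw bytes and leave decoding to the caller.
        with open(commentPath, 'rb') as f:
            return f.read()
    return b''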

Posted January 18, 2004 at 04:57 PM

Categories: Python, Pictures, Open Source, Mac OS X, Software
