Andrew Shearer: Home

Also see the list of articles, none to be taken seriously.

Image Size Reduction for the Web

Tim Bray is looking for a better way to post photos to his web site. To judge from the sample photo, his current method doesn’t antialias the image, so sharp edges in the original look jagged when reduced in size.

I went through the same thing with iPhoto, which has an HTML Export feature that is similarly broken: it doesn’t antialias at all. It’s a strange limitation, considering that the Mac OS X graphics system has fast, high-quality antialiasing everywhere else, including fonts and Dock icons. It’s as if Apple turned off a global switch in iPhoto for better performance when displaying a large number of images onscreen, but forgot to turn it back on for HTML exporting, where quality should count for much more.

In any case, the quality of iPhoto’s exports was poor, so I wrote a Python script to handle the export using the Python Imaging Library. (Contact me if you’d like the code. So far, I’ve publicly released only the general-purpose plist parser that I wrote to handle the AlbumData.xml file.)

The script reads the titles and comments assigned in iPhoto, and parses them for category and other tagging information I’ve appended to the comments. Then it generates date-based and category-based HTML page hierarchies for all the albums whose names start with "Web-", and generates any thumbnails or medium-sized images that are missing.
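
As a rough sketch of the album-selection step (not the actual export script), the following uses the standard library's plistlib in place of the custom plist parser. The key names "List of Albums" and "AlbumName" are assumptions about the layout of AlbumData.xml, and web_albums and load_album_data are hypothetical helper names. It is written for a current Python 3.

```python
import plistlib


def load_album_data(path):
    """Parse an AlbumData.xml file into plain dicts and lists."""
    with open(path, "rb") as f:
        return plistlib.load(f)


def web_albums(album_data, prefix="Web-"):
    """Return the albums whose names start with the given prefix.

    Assumes the parsed plist has a "List of Albums" array whose entries
    carry an "AlbumName" string; both key names are assumptions.
    """
    albums = album_data.get("List of Albums", [])
    return [a for a in albums if a.get("AlbumName", "").startswith(prefix)]
```

With a parsed file in hand, web_albums(load_album_data(path)) would give the list of albums to publish.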

The Python Imaging Library, or PIL, is very easy to install with MacPython 2.3’s Package Manager.

There are some drawbacks, though:

I had to push the JPEG quality setting very high to avoid obvious macro-blocking (squares showing up around detailed areas), and pushing the quality any higher caused PIL to fail by throwing an exception.
The BICUBIC setting for image reduction didn’t appear to work at all. The image ended up non-antialiased, the same as Photoshop’s "Nearest Neighbor" setting. Only ANTIALIAS had any effect. This may result in bilinear instead of bicubic interpolation, but the documentation isn’t clear.
The Thumbnail setting produces images quickly, but they are very low-quality.
The Progressive setting for JPEGs seemed to cause even more exceptions when trying to save at high quality levels, so I was forced not to use it.
It’s not nearly as fast as Mac OS X’s Core Graphics image reduction. But then again, I wouldn’t expect it to be.
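
One way to live with the save-time exceptions described above is to try the preferred settings first and step down when the encoder fails. This is a minimal sketch, not what the export script does: save_jpeg_with_fallback is a hypothetical helper, the quality ladder is an arbitrary choice, and `image` is assumed to be a PIL Image whose save() accepts the quality and progressive options.

```python
def save_jpeg_with_fallback(image, path, qualities=(95, 90, 85)):
    """Try each quality level, progressive first, then baseline.

    Returns the options that worked. Raises OSError if every attempt fails.
    """
    for quality in qualities:
        for progressive in (True, False):
            options = {"quality": quality}
            if progressive:
                options["progressive"] = True
            try:
                image.save(path, "JPEG", **options)
                return options
            except OSError:
                # Encoder failure: fall through to the next, safer setting.
                continue
    raise OSError("could not save %s with any of the requested settings" % path)
```

The returned dictionary shows which combination finally succeeded, which is useful for logging how often the fallback kicks in.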

On the positive side, the antialiasing looks good, and PIL can also read embedded EXIF data. Images that I’ve tagged as deserving more info automatically get the aperture and shutter speed printed on the page.
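
Turning raw EXIF values into the aperture and shutter speed text might look something like the sketch below. The tag numbers 0x829A (ExposureTime) and 0x829D (FNumber) come from the EXIF standard. Everything else is an assumption: that the EXIF data has already been read into a dictionary keyed by tag number, that values are (numerator, denominator) pairs or plain numbers, and the helper names themselves.

```python
from fractions import Fraction

EXPOSURE_TIME = 0x829A  # EXIF ExposureTime tag
F_NUMBER = 0x829D       # EXIF FNumber tag


def as_fraction(value):
    """Accept either a (numerator, denominator) pair or a plain number."""
    if isinstance(value, tuple):
        return Fraction(value[0], value[1])
    return Fraction(value).limit_denominator(10000)


def format_shutter(value):
    """Format an exposure time in seconds, e.g. '1/60 s' or '2 s'."""
    seconds = as_fraction(value)
    if seconds >= 1:
        return "%g s" % float(seconds)
    return "1/%d s" % round(1 / seconds)


def format_aperture(value):
    """Format an f-number, e.g. 'f/2.8'."""
    return "f/%g" % float(as_fraction(value))


def exposure_caption(exif):
    """Build a caption from whichever of the two tags are present."""
    parts = []
    if F_NUMBER in exif:
        parts.append(format_aperture(exif[F_NUMBER]))
    if EXPOSURE_TIME in exif:
        parts.append(format_shutter(exif[EXPOSURE_TIME]))
    return ", ".join(parts)
```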

The code for actually reducing and saving the image, ignoring the EXIF and album manipulations for now, is as simple as this:

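# Only generate the reduced image if it doesn't already exist.
if not os.path.exists(newPath):
    # ANTIALIAS was the only resampling filter that smoothed the result.
    shrunkImage = im.resize(size, resample = PIL.Image.ANTIALIAS)
    shrunkImage.save(newPath, 'JPEG', quality = 90)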
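
The snippet takes `size` as given. One plausible way to compute it, sketched here under the assumption that thumbnails and medium-sized images should fit inside a bounding box while keeping their aspect ratio (fit_within is a hypothetical helper, not part of PIL):

```python
def fit_within(original_size, max_size):
    """Scale (width, height) down to fit inside max_size, keeping proportions.

    Images that already fit are left at their original size.
    """
    width, height = original_size
    max_width, max_height = max_size
    scale = min(max_width / width, max_height / height, 1.0)
    # Never let a dimension collapse to zero on extreme aspect ratios.
    return (max(1, round(width * scale)), max(1, round(height * scale)))
```

The result can be passed straight to resize() as the target size.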

You can see samples in my Pictures section. Check out the first batch of Providence photos for some night examples with shutter speeds and apertures shown, and the Providence and Boston kayaking photos for examples of pictures with lots of edges that would have looked much worse without antialiasing.

Posted October 09, 2003 at 12:12 AM

Categories: Python, Mac OS X, Software, Open Source, General

HTMLFilter 1.1 released

HTMLFilter, in its first public standalone release, is a module for Python programs. It parses an HTML 4 document, allowing subclasses to pass through or modify the text and tags as they go by. The resulting copy will be an otherwise exact replica of the original, including whitespace and comments. ASP, PHP, JSP, or other server-side code will generally survive the round trip. (The only exception is if the code is embedded inside an HTML tag you’re actually modifying, not just passing through, and in most cases any tag attributes not explicitly modified are safe.)

The use can be as simple as adding a <meta> tag to an existing web page without disturbing the rest, or as complex as merging two HTML pages (as it’s used in ShearerSite, which intelligently merges content pages into template pages).

You can also use it to generate HTML from scratch, with HTMLFilter taking care of the attribute encoding for tags.
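
To illustrate the pass-through idea with the <meta> tag case, here is a sketch that uses the standard library's html.parser as a stand-in. It is not HTMLFilter's API (its class and method names aren't shown here), and MetaInserter and add_meta are hypothetical names. Unlike HTMLFilter, this stand-in is only approximately faithful: end tags are re-emitted in lowercase rather than copied verbatim.

```python
from html.parser import HTMLParser


class MetaInserter(HTMLParser):
    """Copy a document through unchanged, adding one tag right after <head>."""

    def __init__(self, meta_tag):
        # Keep entity references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.meta_tag = meta_tag
        self.pieces = []

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the tag exactly as it appeared.
        self.pieces.append(self.get_starttag_text())
        if tag == "head":
            self.pieces.append(self.meta_tag)

    def handle_startendtag(self, tag, attrs):
        self.pieces.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.pieces.append("</%s>" % tag)

    def handle_data(self, data):
        self.pieces.append(data)

    def handle_entityref(self, name):
        self.pieces.append("&%s;" % name)

    def handle_charref(self, name):
        self.pieces.append("&#%s;" % name)

    def handle_comment(self, data):
        self.pieces.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.pieces.append("<!%s>" % decl)

    def handle_pi(self, data):
        self.pieces.append("<?%s>" % data)


def add_meta(html, meta_tag):
    """Return html with meta_tag inserted after the opening <head> tag."""
    parser = MetaInserter(meta_tag)
    parser.feed(html)
    parser.close()
    return "".join(parser.pieces)
```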

HTMLFilter. Python-licensed. Unicode and encoding-savvy. Tested with Python 1.5.2 through 2.3.

Posted September 28, 2003 at 03:03 PM

Categories: Python, Software, General

XMLFilter

I just released version 1.1 of XMLFilter, which marks the first public standalone release. XMLFilter is an open-source Python module you can include with your programs to provide XML parsing even if the target system lacks a working xml.sax package. You can use it to quickly adapt existing xml.sax-compatible scripts to work out of the box, for example, on Jaguar (Mac OS X 10.2), which lacks expat.

It works by using the older xmllib module as a fallback for xml.sax. A test suite verifies call-by-call compatibility no matter which module ends up being used.
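
The fallback decision can be sketched roughly as follows. This is only an illustration of the pattern, not XMLFilter's code: working_sax_parser and choose_backend are hypothetical names, and since xmllib no longer ships with current Pythons, the fallback branch here just reports that one would be needed.

```python
import xml.sax


def working_sax_parser():
    """Return an xml.sax parser, or None if no driver (such as expat) is usable."""
    try:
        return xml.sax.make_parser()
    except xml.sax.SAXReaderNotAvailable:
        return None


def choose_backend():
    """Name the backend to use: xml.sax when it works, otherwise a fallback."""
    if working_sax_parser() is not None:
        return "xml.sax"
    # On a 2003-era Python without expat, this is where xmllib would take over.
    return "fallback"
```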

Other features include XML event-stream filtering, writing, and creation, with support for writing CDATA sections. (Using these classes also avoids bugs in some versions of xml.sax.)
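
For a sense of what event-stream filtering and writing look like, here is a sketch built on the standard library's xml.sax.saxutils classes rather than XMLFilter's own (whose class names aren't given here). DropElements and drop_elements are hypothetical names. The filter sits between the parser and the writer and swallows the events for selected elements.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class DropElements(XMLFilterBase):
    """Pass SAX events through, omitting the named elements and their contents."""

    def __init__(self, parent, names):
        super().__init__(parent)
        self._names = set(names)
        self._depth = 0  # > 0 while inside a dropped element

    def startElement(self, name, attrs):
        if self._depth or name in self._names:
            self._depth += 1
        else:
            super().startElement(name, attrs)

    def endElement(self, name):
        if self._depth:
            self._depth -= 1
        else:
            super().endElement(name)

    def characters(self, content):
        if not self._depth:
            super().characters(content)


def drop_elements(xml_text, names):
    """Re-serialize xml_text without the named elements."""
    out = io.StringIO()
    event_filter = DropElements(xml.sax.make_parser(), names)
    event_filter.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    event_filter.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()
```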

Generally, the newer your version of Python, the faster it goes. For example, if xml.sax and expat are working, they give a factor-of-3 speedup over the pure-Python xmllib, and on Python 2.3, Unicode encoding conversions will use xmlcharrefreplace for faster writing of XML numeric entities.
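
The xmlcharrefreplace error handler is a standard part of Python's codecs from 2.3 onward. A small sketch of how it produces numeric entities when writing to a narrow encoding (to_ascii_xml is a hypothetical helper, and escaping markup characters first with saxutils.escape is an added assumption about what a writer would want):

```python
from xml.sax.saxutils import escape


def to_ascii_xml(text):
    """Escape markup characters, then encode non-ASCII as numeric entities."""
    return escape(text).encode("ascii", "xmlcharrefreplace")


print(to_ascii_xml("Crème & café"))
```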

Python-licensed. Tested all the way down to Python 1.5.2 and up to Python 2.3. xml.sax-compatible, Unicode-savvy (wherever Python is), and optionally namespace-aware.

Posted September 15, 2003 at 08:08 PM

Categories: Python, Mac OS X, Software, General

Netscape is dead, long live Mozilla

Matthew Thomas: Netscape is dead, long live Mozilla.

Netscape’s control over Mozilla was the single biggest factor in making Mozilla’s usability suck, from the project’s inception until two days ago. That’s a pretty tall order — to make an interface design crappier even faster than hundreds of volunteer geeks are — but somehow, Netscape managed it. That was the main reason I used to get so angry, so often.

Posted August 05, 2003 at 06:06 PM

Categories: Software, General

Comment APIs, going once, going twice

Joe Gregorio has a RESTy comment API based on RSS 2.0. His article compares it with the soup of other protocols available: TrackBack, PingBack, and Post-It. One problem: it links to the author’s home page rather than a specific post, so it’s not good as a link-notification mechanism, as TrackBack is. And John Gruber points out that TrackBack isn’t really that good for comments in practice, because the dominant implementation just resends the article summary.

Gruber’s Referrers list, however, continues to show a lot of junk along with the real links, including one user’s local Radio Userland aggregator on port 5335.

Posted June 15, 2003 at 09:09 AM

Categories: Software, General

DirectRSS

Announcing DirectRSS. For when you want as little as possible between you and your RSS.

It’s an open-source MetaWeblog API implementation that modifies RSS 2.0 files in place. It also supports the Blogger and b2 APIs. No database required.

With it, you can use a weblog editing client such as NetNewsWire or w.bloggar to update an RSS feed, then use XSLT or the companion HTML renderer to generate a web site.
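
For readers curious what an editing client actually sends, here is a sketch of a metaWeblog.newPost request built with Python's standard XML-RPC library. The method name and parameter order (blog ID, username, password, post struct, publish flag) follow the MetaWeblog API. The helper name and the sample values are made up for illustration, and nothing here is specific to DirectRSS.

```python
import xmlrpc.client


def new_post_request(blog_id, username, password, title, description, publish=True):
    """Return the XML-RPC request body for a metaWeblog.newPost call."""
    post = {"title": title, "description": description}
    params = (blog_id, username, password, post, publish)
    return xmlrpc.client.dumps(params, methodname="metaWeblog.newPost")


print(new_post_request("1", "someone", "secret", "Hello", "First post."))
```

A real client would POST this body to the server's endpoint, or simply let xmlrpc.client.ServerProxy make the call.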

To handle larger collections of posts, it supports Dave Winer’s blogBrowser format, which, instead of a single file, uses one RSS file per month and one folder per year. To the weblog client, it looks like one big file, with all posts editable. A file containing the few most recent postings is generated automatically, for the benefit of news aggregators.
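
The one-file-per-month, one-folder-per-year idea can be sketched as a mapping from post dates to file paths. The exact file naming used by the blogBrowser format isn't specified here, so the "archive/2003/06.xml" layout below is purely hypothetical, as are the helper names.

```python
from collections import defaultdict
from datetime import datetime
import posixpath


def archive_path(posted, root="archive"):
    """Map a post date to a per-month file inside a per-year folder."""
    return posixpath.join(root, "%04d" % posted.year, "%02d.xml" % posted.month)


def group_by_archive(posts, root="archive"):
    """Group (title, posted) pairs by the archive file they would live in."""
    groups = defaultdict(list)
    for title, posted in posts:
        groups[archive_path(posted, root)].append(title)
    return dict(groups)
```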

It was originally written as an XML experiment, but it’s proven reliable. It’s packaged as a Python CGI script, and comes with its own pre-configured Python web server for running locally. If you already have Python installed, there’s no setup required to run the working tutorial. (If you don’t, it only takes a few minutes to install Python.) It’s compatible with the bundled Python in Mac OS X 10.2 (Jaguar) and other Pythons lacking an expat parser. (It falls back on xmllib.)

New features in this version include full support for namespaces in both the RSS file and the MetaWeblog API, post modification dates, and a tutorial showing how to render the posts into HTML.

Currently, it’s packaged with ShearerSite, the (awkwardly-named) web interface that also performs the rendering into HTML. I may split out just the RSS editing portion if there’s interest. The HTML renderer can display RSS 2.0 files or blogBrowser archives filtered by category, date range, or numeric range.

See the download page (hosted by SourceForge), tutorial, and revision history.

Posted June 14, 2003 at 08:08 PM

Categories: ShearerSite, Software, DirectRSS, General

Trackbacks, Referrers, Comments?

I pointed to Daring Fireball’s Trackback critique, but I didn’t comment. On the Internet, if I don’t do it, someone else will, and this article did. It’s a very well thought-out response.

To summarize: John Gruber’s criticisms of TrackBack are valid, but his referrer system has its own problems. It trades increased ease on the sending side for lower quality on the receiving side.

To improve TrackBack, it should be made easier. I don’t see why all comment forms on sites with TrackBack couldn’t be changed into combined comment/TrackBack forms. TrackBack would almost disappear to the web surfer; it would just be Remote Comments.
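
To make the "Remote Comments" idea concrete: a TrackBack ping is an HTTP POST of form-encoded fields (url, title, excerpt, blog_name), so a combined form could send the full comment text in the excerpt field. The sketch below only builds the request body. trackback_ping_body is a hypothetical helper, and putting the whole comment in the excerpt is an assumption about how such a form might work.

```python
from urllib.parse import urlencode


def trackback_ping_body(url, title=None, excerpt=None, blog_name=None):
    """Build the form-encoded body of a TrackBack ping, skipping empty fields."""
    fields = {"url": url, "title": title, "excerpt": excerpt, "blog_name": blog_name}
    return urlencode({name: value for name, value in fields.items() if value is not None})
```

The body would be sent to the target post's TrackBack URL with a Content-Type of application/x-www-form-urlencoded.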

I hope to find a way on my own site to integrate comments with static main pages. (I have a few ideas.)

Posted June 14, 2003 at 07:07 PM

Categories: Software, General

IE on its way out

No one has been covering Internet Explorer from a web author's perspective as well as Zeldman.

2005? Are they kidding?: “Scoble says Longhorn will be available in 2005. Which is another way of saying IE/Win won't change for at least two years. It is not good enough to stay as it is. ...Can anyone tell us how two more years of flawed standards support is supposed to be a good thing?”

RIP:

...Our friends there [at Microsoft], we knew, were working on improvements, particularly in the areas of CSS and DOM support. Yet no significantly new browser version ever came of their activity. IE6/Win still had trouble with parts of CSS1, still did not support true native PNG transparency, and still did not incorporate Text Zoom...

Over the past weeks, the stories we and others have been covering (including the unavailability of an improved version of IE5/Mac outside the subscription-based MSN pay service, and the news that IE/Win was dead as a standalone product) painted a picture of a product on its way out. And now we know that that is the case.

We know that, after spending billions of dollars to defeat all competitors and to absolutely, positively own the desktop browsing space, Microsoft as a corporation is no longer interested in web browsers...

From here, as it has for several weeks now, it looks like a period of technological stasis and dormancy yawns ahead. Undoubtedly the less popular browsers will continue to improve. But few of us will be able to take advantage of their sophisticated standards support if 85% of the market continues to use an unchanged year 2000 browser.

OK, enough quoting. Go read the articles. It’s getting late, but I’ll comment on one thing. I’ll do it even though it requires another quote.

IE5/Mac, with its Tasman rendering engine, was the first browser to deliver meaningful standards compliance to the market, arriving in March, 2000, a few months ahead of Mozilla 1.0 and Netscape 6... IE5/Mac introduced innovations like DOCTYPE switching and Text Zoom that soon found their way into comparably compliant browsers like Navigator, Konqueror, and Safari. And all but Text Zoom eventually made it into IE6/Win...

Add to that feature list the printer equivalent of Text Zoom: interactive fit-to-page controls in the print preview window. A very useful solution for a problem I saw users on other browsers and platforms (including IE/Win) struggle with frequently.

IE 5/Mac was good because it had to be. It was fighting against a large installed base of Netscape 4 on its merits, and Microsoft couldn’t fall back on their Windows franchise to push it. It was designed to be better than Netscape 4, and it succeeded at that. (Also helping its market share was Microsoft’s public threat to pull Office for Mac, which resulted in Apple shipping IE as their default browser.) Still, the competition made Microsoft produce some of its best work.

Soon after, with the game won (or at least, with everyone but Microsoft having lost sufficiently), Microsoft went home. They may even have done that years ago, quietly.

IE 6/Win wasn’t much of an upgrade. (A CNET review: “Just about the only reason we can figure that IE 6 even deserves the full 6 version number is its release in conjunction with Windows XP. For those of you not upgrading to Windows XP, whether you run IE 5.x or Netscape 6.x, there's no need to rush for this download.”)

Which brings up a question: when was this decision made? It was made public only recently, but could have been in the air in the Microsoft executive suite for much longer. They have the money to keep the development teams going regardless of the outcome. (According to a Think Secret article, IE 6/Mac was largely finished last year, but according to a former developer “We were told by upper management to hold it back until they gave it the green light.”) Aside from a 2001 update just to keep up with the release of Mac OS X, there haven't been any real feature upgrades to Internet Explorer for either Mac or Windows for the past three years. Both of them might as well have been cancelled then.

We’ve been using a dead product all this time and didn’t even know it!

Posted June 14, 2003 at 01:01 AM

Categories: Software, General

TrackBacks, or Referrers?

Daring Fireball: The problems with Movable Type's TrackBack protocol.

Posted June 13, 2003 at 12:12 AM

Categories: Software, General

Cross-platform, cross-browser XML apps

Jon Udell writes: “Let's review what's happening in this screen shot. I'm running Mozilla Firebird on my Mac. The application is a structured search of my OSCOM slides. There's no search engine beyond the browser itself, which provides the JavaScript UI, the XPath-based search, and the XSLT-driven results display.”

This is great. It’s a working XPath lab in your browser. At least, if your browser is IE or a recent Mozilla derivative. (Come on, Apple, implement XSLT and the XMLDocument request APIs next in Safari.)
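
The same kind of structured search can be tried outside the browser. This sketch uses the limited XPath subset in Python's xml.etree.ElementTree. The slide markup is invented for the example (the real OSCOM slides surely look different), and titles_for_topic is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Made-up sample data standing in for a real slide deck.
SLIDES = """<slides>
  <slide topic="xml"><title>XPath in the browser</title></slide>
  <slide topic="python"><title>Scripting</title></slide>
  <slide topic="xml"><title>XSLT results display</title></slide>
</slides>"""


def titles_for_topic(xml_text, topic):
    """Return the titles of all slides tagged with the given topic."""
    root = ET.fromstring(xml_text)
    return [t.text for t in root.findall(".//slide[@topic='%s']/title" % topic)]


print(titles_for_topic(SLIDES, "xml"))
```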

Posted June 08, 2003 at 11:11 PM

Categories: Software, General
