Andrew Shearer: Home

The Hole in Postel’s Law

"Be conservative in what you do, be liberal in what you accept from others."

This law is making the rounds again, with arguments both pro and con. Here are my thoughts.

Postel’s Law is a great, useful principle for writing programs that communicate. However, the law is so elegant and successful that it’s easy to regard it as an absolute. And then, because "be liberal in what you accept" is such an open-ended goal, people go too far. Here’s an analysis of the problem, followed by a suggestion.

The first half of Postel’s Law, "be conservative in what you transmit", is a well-specified rule with a clearly defined goal. The tools to achieve it are specs and validators. But the vague goal of the other half, "be liberal in what you accept", can turn into a bottomless hole. There’s hardly any limit to how loose an interpretation of the spec can get, how cleverly the code can guess at the sender’s intent, and how much code for special cases you can write to fix invalid data. Because such code can provide an immediate user benefit and a market advantage, it turns into an arms race. Often, the code ends up violating the spec itself, intentionally or unintentionally, as we’ll see below.

The Growing Hole

Plenty has been written praising "be liberal in what you accept", so I won’t repeat it. Here are some of the problems:

It enlarges the spec. Every additional error condition fixed by a market leader becomes an (undocumented) part of the spec. Senders come to rely on it. The senders probably don’t even realize that their output is wrong because of the way software is written.

In the edit-run-debug cycle of the typical software development process, testing is often done just by trying the program out, not through any mathematical process or formal validation suite. HTML authoring tends to be done the same way. Though modern XP [Extreme Programming] test-first practices call for a thorough suite of test cases to be written before the actual code, most software still doesn’t have this advantage. HTML is an easy case for validation, with scores of easily accessible validators already written, much easier to test than most program code, yet the bulk of new pages in the world have probably never been through an HTML validator.

The problem is that, even after removing all obvious bugs, the product of this run-test-debug cycle can only run at the "seems-to-work" level. There’s no guarantee that it’s really working, and specifically no proof that the program or web page is being conservative in what it sends. If it’s a program that communicates with other types of programs, the developer will test it with real examples of those programs. So, when a developer writing program Z needs to interoperate with programs such as A and B, and A and B are silently fixing errors in the output of program Z, Z’s developer will declare the code "working" (because to all appearances, it is), and say "ship it!". And everything will be fine until an edge case comes along that program A or program B can’t fix, or that the two interpret differently. Or until someone tries program Z with program C, which didn’t get the memo about all the particular types of errors that programs A and B fix. All this because Z had a latent bug that went unnoticed thanks to the second half of Postel’s Law, because:

It hides violations of the other half of Postel’s Law. In other words, the more liberal the receiver, the harder it becomes to find bugs in the sender.

As an example, Microsoft Internet Explorer sports what some have called a "ridiculous tolerance for errors in HTML markup". Microsoft FrontPage has a well-known tendency to silently create invalid HTML markup. (One of the bugs: FrontPage 98 and 2000 will occasionally go through a valid page with spacer images and replace all of their alt="" attributes with the lone word alt, which is invalid HTML. A developer familiar with the SGML foundations of HTML might think the fix is to parse this as a boolean attribute, alt="alt", but IE and other browsers choose to interpret it as alt="".) Though I doubt that any such bugs are intentional, the tendencies of the two products feed on each other. If the developers of FrontPage were testing with a browser that flagged such errors, it’s likely that the bugs wouldn’t have made it to release.

The bind here is that Postel’s Law tries to make things work as often as possible for users, but people trying to test other programs are users too, and errors are also covered up for them. One way out of this would be some kind of Postel Kill Switch, a strict mode intended for interoperability testing. (Turning off the other half of the law at the same time, causing the program to send out data malformed in various ways, would be harder to switch on programmatically.) Though the strict mode might do some good, it has some drawbacks: it would require a different code path, making it prudent to test both modes; and even without the extra work that would entail, testers might not bother turning the feature on every time in the first place.
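
As a rough illustration of such a kill switch, here is a minimal Python sketch. The `parse_pairs` function, its `strict` flag, and the `key=value; key=value` input format are all made up for this example; the point is only that the liberal repair and the strict refusal sit at the same spot in the code.

```python
class StrictModeError(ValueError):
    """Raised in strict mode when input would otherwise be silently repaired."""


def parse_pairs(text, strict=False):
    """Parse 'key=value; key=value' text into a dict.

    Lenient mode repairs a key that has no '=value' by assuming an empty
    value. Strict mode (the hypothetical kill switch) refuses to repair.
    """
    result = {}
    for part in text.split(";"):
        part = part.strip()
        if not part:
            continue
        key, sep, value = part.partition("=")
        if not sep:
            if strict:
                raise StrictModeError(f"missing '=' after {key!r}")
            value = ""  # the liberal repair
        result[key.strip()] = value.strip()
    return result


print(parse_pairs("a=1; b"))  # lenient: the bare key "b" is quietly accepted
```

Keeping both modes in one loop reduces the extra code path to a single branch, though that branch still needs its own tests, and someone still has to remember to pass `strict=True`.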

Market Forces

Even though it’s usually more work to be more liberal, developers with time or money on their hands will still do it. They are often motivated just to provide convenience for their users, but with competitors in the same market, it has a predictable effect:

It increases the cost of entry. Accepting everything is a greedy strategy. It rewards the incumbents, and makes more work for newcomers. Not only do the newcomers have to catch up with all the error-fixing logic that the market leaders have been writing since the beginning, they have to somehow figure out what all those error conditions are. They’re not in the spec, and it’s almost certain that they’re not publicly documented anywhere. Even if the types of errors to be fixed were known, the new programs would have to fix them exactly the same way as the old ones, even in the face of multiple overlapping errors or ambiguous edge cases. And in some cases, this may require disregarding the spec, deliberately misinterpreting a valid document to match an overzealous fix.

Safety

This leads to one of the most damning consequences:

It makes software unreliable. Even the safest-looking fix can have unexpected consequences once others depend on it. (Which they will, and, unless the fix was added purely on speculation, already do.)

For instance, if you’re writing an HTML parser, and you see a lone ampersand (technically illegal--it should be encoded as &amp;) the liberally accepting thing to do is to display an ampersand, just as if it had been encoded properly. Which is fine, at that moment. If the users knew what had happened, they would probably thank you for soldiering on through the rest of the document and not giving up right there. But in reality they don’t even know it happened, and as the years go by, they will keep turning out pages with unencoded ampersands. (It’s the testers-are-also-users problem again.) New high-end content management systems will be deployed without anyone working with the system even knowing that they’re entering raw HTML into some of the text fields, and that they have to be careful with ampersands (yes, this already happens). A validator may catch the problem if it happens to crop up on the page at the time it’s checked, but most likely, no one will notice until the unlucky day that someone writes a classified for an electric guitar setup saying "For Sale: guitar&amp $200." Then the amp will just mysteriously disappear on the post, putting a guitar and $200 on sale. (If you think the example is contrived, note that in another attempt to apply Postel’s Law, real-world browsers end up expanding the error domain even further: "guitars&amplifiers" will have three letters dropped out of it, because the first browsers judged that to be most likely what the author intended. However, if you added spaces around the punctuation, whole words would show up. This is the kind of bizarre behaviour that makes people distrust computers.)
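
The disappearing amp can be reproduced without a browser. Python’s standard `html.unescape` follows the HTML5 rules for character references, which preserve this legacy tolerance for a missing semicolon, so it serves here as a stand-in for the browser behaviour described above. The input strings are the classified-ad examples with the raw, unencoded ampersand left in.

```python
import html

# "&amp" with no semicolon is still decoded, so the "amp" vanishes.
print(html.unescape("For Sale: guitar&amp $200"))  # For Sale: guitar& $200

# The "amp" at the start of "amplifiers" is taken for an entity name.
print(html.unescape("guitars&amplifiers"))         # guitars&lifiers

# With spaces around the ampersand, nothing is decoded.
print(html.unescape("guitars & amplifiers"))       # guitars & amplifiers
```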

At its root, the ampersand problem is really just confusion over a weakly specified input format. (You can find similar examples on display in comment forums across the web, which often treat visitors to the spectacle of a web developer repeatedly trying to describe an HTML tag, only to have the tag itself disappear.) However, in this case being liberally accepting didn’t fix the problem; it just made its symptoms more rare, and therefore the real problem harder to find, more capricious, and more puzzling.

In an effort to do the right thing, some programs intentionally go against the spec. Internet Explorer (and therefore Outlook, when opening HTML mail) will disbelieve the content type specified by the web server, and choose a different type itself based on heuristics, a behaviour which is even documented. An XHTML document might not be rendered if it starts with a comment that’s too long, or a plain text file might be parsed as HTML because it contained a tag-like sequence of characters. The HTTP spec specifically forbids browsers to second-guess the content type provided by the server, but IE does it anyway. This makes IE compatible with many badly-configured web servers. It also frustrates the owners of well-configured web servers for whom IE always guesses wrongly.
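
To make the failure mode concrete, here is a toy Python sketch of content sniffing. It is not IE’s actual algorithm, which this example makes no attempt to reproduce; `TAG_LIKE`, `trusting_type`, and `sniffing_type` are invented names, and the heuristic is deliberately crude.

```python
import re

# Assumed heuristic: a few tag-like openers near the start of the body.
TAG_LIKE = re.compile(rb"<(html|head|body|script|title|a|p|b)\b", re.IGNORECASE)


def trusting_type(declared, body):
    """Believe the Content-Type the server declared."""
    return declared


def sniffing_type(declared, body):
    """Override the declared type if the first 256 bytes look like HTML."""
    if TAG_LIKE.search(body[:256]):
        return "text/html"
    return declared


# A plain-text file that merely talks about HTML.
body = b"To make text bold, write <b>like this</b>."
print(trusting_type("text/plain", body))  # text/plain
print(sniffing_type("text/plain", body))  # text/html
```

The server said exactly what it meant, and the sniffer still overrode it: a valid response is misread in order to rescue invalid ones.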

In certain cases, outright bugs in complex code designed to tolerate many errors have the ironic effect of limiting the spec. For example, RSS is based on XML, but because of the existence of RSS feeds with invalid XML, liberal RSS parsers can’t be based on real XML parsers. Real XML parsers are thoroughly tested and widely deployed. But instead, the developers have to roll their own quasi-XML parsers (increasing the barrier to entry). The chance of getting some part of the XML spec wrong is high (making the software unreliable). This in turn has made feed developers reluctant at various times to begin using any XML features that don’t already appear in the most common feeds, such as CDATA blocks in the description element, namespaces, and XML comments, because they might break regexp-based parsers. (Mark Pilgrim’s Ultra Liberal Feed Parser is a solution for Python programmers, and while it gets everything right as far as I know, it still doesn’t much help developers in other languages.)

In this example, XML is special, because the XML spec itself violates Postel’s Law. It calls for clients to terminate parsing entirely when they encounter malformed content. While it may have been better if this decision hadn’t been made, that’s the current reality of XML parsers. Replacing them all with less flighty ones would be nice. (Any takers?)

Security

Finally, security. A whole class of security vulnerabilities results from automatically fixing errors in input data. Because the set of errors to be fixed is ill-defined, software downstream can take a radically different action than what the software upstream thought possible. Malicious users can exploit this.

Think of the difficulty just of reliably filtering out dangerous HTML tags and attributes from a comment left on a web site. The browser is working as hard as possible to be liberal in its definition of an HTML tag, working by unknown rules to fix almost-tags. Can the author of such a filter ever be truly certain that nothing gets through? (Thinking about this, the only sure way around it without writing an entire HTML validator would be to fully parse the HTML input into an intermediate HTML-free representation, then write it back out as guaranteed-valid HTML code. The only thing left to worry about: an overzealous fix that would cause the valid code to be misinterpreted.)
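
The parse-then-rewrite idea might look roughly like the Python sketch below, built on the standard library’s `html.parser`. It is an illustration rather than a production sanitizer: the allowlist in `ALLOWED_TAGS` is an arbitrary assumption, all attributes are dropped, and the stream of accepted tag and text events stands in for the intermediate representation.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED_TAGS = {"b", "i", "em", "strong", "p"}  # assumed allowlist


class Rewriter(HTMLParser):
    """Re-emit only allowlisted tags (no attributes) and escaped text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.out.append(f"<{tag}>")  # attributes are discarded
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Only close the innermost open tag; stray end tags are dropped.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data))  # text is always re-escaped


def rewrite_comment(text):
    parser = Rewriter()
    parser.feed(text)
    parser.close()
    while parser.open_tags:  # close anything the author left open
        parser.out.append(f"</{parser.open_tags.pop()}>")
    return "".join(parser.out)


print(rewrite_comment('<b onclick="evil()">hi</b>'))  # <b>hi</b>
```

Because the output is generated from scratch rather than patched, nothing in it depends on how a browser would have repaired the original input.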

The rule: Arbitrary fixes to bad input data will thwart any previous filtering or security checking of the data.

What to do?

A Suggestion

Future specs could require implementations to report whenever they encounter and correct errors, with an interface that could be as simple and non-intrusive as an exclamation point icon. (A newsreader, for instance, would place it next to a suspect newsfeed and link it to the Feed Validator.) There’s nothing particularly new about this kind of interface; several products, such as Opera, already do something similar. The trick would be that it would be required by the spec. The market leaders would be compelled to adopt it, not just the smaller products.

This behavior wouldn’t hamper a program’s ability to accept liberally; it would just let testers and other interested users know that the data had not been sent conservatively. It would thus remove the conflict of interest between the two parts of the law. The feature would be on by default, so testers wouldn’t need to activate it, but it wouldn’t be so annoying that users wanted it off (as a modal alert box would be).
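
In code, the extra bookkeeping could be as small as the following Python sketch. Everything here is hypothetical: the `Accepted` record, the `accept_markup` function, and the choice of bare ampersands as the one error being repaired. The repair still happens; it just leaves a trace that the interface can surface.

```python
import re
from dataclasses import dataclass, field

# An ampersand that does not begin a semicolon-terminated reference.
BARE_AMP = re.compile(r"&(?![A-Za-z][A-Za-z0-9]*;|#[0-9]+;|#[xX][0-9A-Fa-f]+;)")


@dataclass
class Accepted:
    text: str
    corrections: list = field(default_factory=list)

    def indicator(self):
        """The unobtrusive marker: '!' if anything was repaired."""
        return "!" if self.corrections else ""


def accept_markup(raw):
    """Repair bare ampersands, but record that a repair took place."""
    fixed, count = BARE_AMP.subn("&amp;", raw)
    result = Accepted(text=fixed)
    if count:
        result.corrections.append(f"escaped {count} bare ampersand(s)")
    return result


item = accept_markup("Tom & Jerry")
print(item.text, item.indicator())  # Tom &amp; Jerry !
```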

This doesn’t mean that each implementation has to have a full-fledged validator aboard. Only errors detectable by reasonably straightforward means and cases where the implementation goes to extra lengths to make sense of the input would have to be flagged. That does give implementations some wiggle room.

It’s important that this minor error display mechanism be required in order to comply with the spec. It can’t be voluntary on the part of the implementors. There’s nothing in it for them, at least not directly. To record the error as it’s fixed and display the fact takes extra code, albeit not much. Considering that the benefit goes mainly to future implementors as well as users of less liberal implementations that don’t know how to handle the same error, implementors will tend not to write that code unless nudged.

And the developers can be nudged, even for specs without trademarks or an official logo program. Having the requirement enshrined in the spec at least provides some social pressure for implementors to comply.

And some other things that seem to make sense right now:

Developers should also take great care to hold back and not misinterpret technically valid input in an attempt to do the right thing. Internet Explorer’s habit of second-guessing the Content-Type header is the kind of thing to avoid.
By the same token, to provide tolerant XML parsing, use a real standards-compliant XML parser first, and fall back to a handcoded quasi-XML parser only when that fails. (Or, if you can absolutely guarantee that the result will be identical, use the quasi-XML parser alone, but that guarantee is hard to make.)
To avoid unintentionally thwarting security filters, all heroic fixes to input should be made as far upstream in the call chain as possible. If there’s still a danger the downstream code will try to outsmart the upstream code, the upstream code could rewrite the input to be canonical and unmisinterpretable.
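
The strict-first approach to XML from the list above can be sketched in a few lines of Python using the standard `xml.etree.ElementTree` parser. The `lenient_parse` function is only a placeholder for whatever hand-coded quasi-XML parser a project might have, and `parse_feed` is likewise a made-up name.

```python
import xml.etree.ElementTree as ET


def lenient_parse(text):
    """Placeholder for a hand-coded quasi-XML parser (hypothetical)."""
    raise NotImplementedError("no lenient parser in this sketch")


def parse_feed(text, fallback=lenient_parse):
    """Try the real XML parser first; fall back only if it rejects the input.

    Returns (result, used_fallback) so that callers can tell when the
    input was not well-formed.
    """
    try:
        return ET.fromstring(text), False
    except ET.ParseError:
        return fallback(text), True


root, used_fallback = parse_feed("<rss><title>ok</title></rss>")
print(root.tag, used_fallback)  # rss False
```

Well-formed feeds never touch the hand-written code, so valid XML features such as CDATA blocks and namespaces are handled by the parser that actually implements them. The `used_fallback` flag is also a natural hook for the error indicator suggested earlier.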

Posted January 14, 2004 at 07:14 AM

Categories: Technology