Andrew Shearer: Home

Also see the list of articles, none to be taken seriously.

OSCOM Day 3: Semantic Web

Managing the Semantic Web, Sandro Zic

How to ensure usability of distributed content & knowledge management?

Intelligent systems, peer-to-peer, remote programming rather than RPC.

Software agents. I’ve never quite understood the need for these. Why do we need to send code around? What couldn’t be accomplished ahead of time by Googlebot or having a direct interface exposed? Some use for disconnected operation, maybe, but increasingly we’re always connected and want immediate results anyway. Due to lack of time, we never got to this question.

Posted May 31, 2003 at 09:09 AM

Categories: OSCOM, General


OSCOM Day 3: Jon Udell Keynote

Excellent keynote. He started with a simple, obvious thing which we tend to get wrong because we’re blind to it: weblog item doctitles that show up properly in search engines. Then a bunch of specific things we can implement, and a look toward the future. Good, practical stuff.

Talked about the content side of content management. Importance of titles and topic sentences. Communication skills. Don’t hit.

Content is the expression of ideas, request for attention, or attempt to influence. Technologists don’t think hard enough about the effort & the reward of making content.

Showed an entry on Don Box's site that displayed its title perfectly in his aggregator NetNewsWire, but Google didn't see it, because it wasn't in the doctitle. Easy to make this mistake. (Reiterated point: Publishing is essentially engineering. We forget these issues because engineers think from the inside out.) What is the right unit of content? Radio Userland has the day’s posts on one page, with the date as doctitle; Movable Type puts one post per page, so it can use the item's RSS title. Dave Winer's weblog comes in like an IV drip all day, but the audience for most weblogs isn't like that, and they need titles.
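To make the doctitle problem concrete, here is a minimal sketch of the check a weblog tool could run on its own output: pull the text of the title element out of a page and see whether the item's headline is in it. The function names and the sample pages are my own illustrations, not anything shown in the talk.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def doctitle(html):
    """Return the page's doctitle, or an empty string if it has none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip()


def headline_in_doctitle(html, headline):
    """True if the item's headline appears in the doctitle (case-insensitive)."""
    return headline.lower() in doctitle(html).lower()


# Hypothetical pages: a day-per-page layout vs. an item-per-page layout.
daily_page = ("<html><head><title>Saturday, May 31, 2003</title></head>"
              "<body><h3>Keynote notes</h3></body></html>")
item_page = ("<html><head><title>My Weblog: Keynote notes</title></head>"
             "<body><h3>Keynote notes</h3></body></html>")

print(headline_in_doctitle(daily_page, "Keynote notes"))
print(headline_in_doctitle(item_page, "Keynote notes"))
```

The day-per-page layout fails the check because a search engine only ever sees the date as the title.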

This affects how Jon Udell uses Radio Userland. Dave Winer interjected to ask if it would help to have a field to choose the day's title.

Brent’s Law of URLs: the more expensive the CMS, the crappier the URL. Showed a bunch of typical CMS & weblogging system URLs. Tim Bray’s homegrown site was best: example ended with 2002/02/13/NamingFinishing. Vignette’s > $200K product was worst with an awful, long numeric URL.

Structure in doctitles. Search results pages can parse & group the titles. Example: with doctitle like Magazine Name | Date | Dept | title, group search results by magazine issue. Showed good example of this on O'Reilly's site.
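A rough sketch of how a search results page could use that structure, assuming every doctitle follows the Magazine Name | Date | Dept | title pattern from the example. The function names and the sample titles are made up for illustration.

```python
from collections import defaultdict


def parse_doctitle(title, sep="|"):
    """Split 'Magazine | Date | Dept | Headline' into its four fields.

    Returns None when the title does not have all four parts.
    """
    parts = [p.strip() for p in title.split(sep, 3)]
    if len(parts) != 4:
        return None
    magazine, date, dept, headline = parts
    return {"magazine": magazine, "date": date, "dept": dept, "headline": headline}


def group_by_issue(titles):
    """Group headlines by (magazine, date); unstructured titles go in one bucket."""
    groups = defaultdict(list)
    for title in titles:
        parsed = parse_doctitle(title)
        if parsed is None:
            groups[("(unstructured)", "")].append(title)
        else:
            groups[(parsed["magazine"], parsed["date"])].append(parsed["headline"])
    return dict(groups)


# Hypothetical search hits.
hits = [
    "Example Magazine | May 2003 | Features | Titles that search engines see",
    "Example Magazine | May 2003 | Columns | URLs worth typing",
    "Example Magazine | June 2003 | Features | Threads you can scan",
    "Some page with a plain title",
]

for issue, headlines in group_by_issue(hits).items():
    print(issue, headlines)
```

Splitting at most three times keeps a stray separator inside the headline from breaking the parse.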

Great example of broken titles in just about every mailing list archive. All the titles are wrong—they are the same as the last message in the thread. Not scannable. Showed a mockup with meaningful titles.

A few of the examples had the common thread of repetition of data in the user interface. Search results kept repeating the site name in document titles. Discussion board forums kept repeating the same subject lines. The mailing list example he showed was pretty much wall-to-wall repetition of the same thing. Only difference between successive lines was indentation and author name. A better interface would strip it all out, summarize, whatever. I've run into all the things he mentioned and just gotten used to them. I have to look at them with new eyes.

Call to implement ThreadsML.

Discussion of SlideML. Showed his method of generating it, but it isn't usable by “civilians”. No help in writing the actual content apart from typing raw XHTML in Emacs.

CMS systems came from publishing & were ported to web. Weblogs are web-first.

Hypertextual writing is still stuck in 1995. Netscape did as much in 1996 as we're doing today, or more. We need a lightweight web-aware writing tool. Need to advance beyond emacs, TEXTAREAs or the shoddy Windows DHTML edit control. InfoPath still relies on a crummy XHTML editor.

Compound documents: tend to explode to meaningless names because the system has to add them (e.g. slide027.html). Discussion of old Netscape cid: protocol.

CMSs solve refactoring problems “in the large”: making consistent changes to many files, access, etc. Refactoring “in the small” sucks up a huge amount of time: reformatting email messages, etc.

Categorization is a heavyweight operation; there should be other lightweight ad-hoc ways. Example: All Consuming book aggregator finds book references in blogs.

Showed example of searching his SlideML markup with XPath for code examples.
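I didn't copy down his actual query or markup, so this is only a sketch of the idea, using the limited XPath support in Python's ElementTree. The element and class names (div class="slide", pre class="code") are my assumptions for illustration, not necessarily what SlideML uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical slide markup; the real SlideML structure may differ.
SAMPLE = """<slides>
  <div class="slide">
    <h1>Titles matter</h1>
    <p>Put the item title in the doctitle.</p>
  </div>
  <div class="slide">
    <h1>Querying slides</h1>
    <pre class="code">//pre[@class='code']</pre>
  </div>
</slides>"""


def code_examples(xml_text):
    """Return (slide heading, code text) pairs for every code block found."""
    root = ET.fromstring(xml_text)
    results = []
    # ElementTree supports a subset of XPath: descendant search plus
    # attribute-equality predicates, which is enough here.
    for slide in root.findall(".//div[@class='slide']"):
        heading = slide.findtext("h1", default="")
        for pre in slide.findall(".//pre[@class='code']"):
            results.append((heading, "".join(pre.itertext())))
    return results


for heading, code in code_examples(SAMPLE):
    print(heading, "->", code)
```

The point is that once slides are well-formed XML, "find all the code examples" is a one-line query rather than a text search.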

Update: Here are the slides and notes from Bitflux: part one, part two.

Posted May 31, 2003 at 09:09 AM

Categories: Interface, Open Source, Software, OSCOM, General


OSCOM Day 3: OpenLaw

OpenLaw.org. Wendy Seltzer.

Parallels between law and open source software. It's generally public, has a revision history, forks and joins (Supreme Court over differing circuit courts). But process of forming arguments hasn't been public. So they opened up the process to the public in Eldred vs. Ashcroft. Now opening the DeCSS DVD DMCA case.

Developed an annotation system to comment on or rebut other web pages. Looked like a scrollable iframe with the original site on right, with comments in parallel on left. The courts have accepted their amicus briefs, and they have submitted comments to Copyright Office. Archives of case material, opinions, articles, etc. Important take-away from the session: now I know how to pronounce “amicus.” Or I thought I had just learned, but Larry Rosen behind me pronounced it a different way.

ChillingEffects.org.

Often just the threat of monetary losses in cease-and-desist letters is enough to shut the site down, independent of legal merit. “Shadow of the law.” Example: “you are sharing approximately 0 song files”. Little cost to send C & Ds.

So Chilling Effects archives and publicises them, increasing the cost of sending them by shaming the companies. This also spreads knowledge of the issues.

Update: Donna Wentworth at Harvard Law picked up this entry and provided the link for the C & D example. See her entry for more notes. Thanks, Donna!

Posted May 30, 2003 at 04:04 PM

Categories: Society, Law, OSCOM, General


OSCOM Day 2: Other sessions

10 Best Features from Commercial CMS

Browser-based image editing, pre-localized interfaces

Extra credit: In-context editing (Edit This Page), dependency reporting, semblance of autoclassification, relational viewing tools

Reporting: such as Never Logged In

Configurable, forms-based workflow (ingest Visio WFML?)

508/WA compliant output — accessibility. Table headings + row headings, alts, etc.

Browser-based content object development (schema, essentially)

OpenCourse educational site. opencourse.org. “It rhymes with open source!” (The presenter avoided saying this, but I'm sure he wanted to.) Slow-moving.

Dublin Core Metadata in CMS

On oscom.org presentation slide show, different DC formats for XHTML, HTML, RDF XML are linked.

Good reference impl.: DC-dot. Another: Reggie

Elements (such as DC.Subject.Keyword) appearing multiple times, yes. Comma-separated value lists, no.
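As a reminder to myself of what that means in practice, here is a small sketch that emits one meta element per value instead of a single comma-separated list. The helper name is mine, and the XHTML-style self-closing tag is an assumption about the output format.

```python
from html import escape


def dc_meta_tags(name, values):
    """Return one <meta> element per value, repeating the element name."""
    return [
        '<meta name="{}" content="{}" />'.format(
            escape(name, quote=True), escape(value, quote=True)
        )
        for value in values
    ]


# Repeat the element for each keyword rather than joining with commas.
for tag in dc_meta_tags("DC.Subject.Keyword", ["WebDAV", "content management"]):
    print(tag)
```

Repeating the element keeps each keyword a separate value, so a keyword that itself contains a comma can't be misread.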

Discussion on thesauri, search engines, etc. Overall, I didn't get a huge amount out of this session, at least not directly. I'll have to find the reference impls online.

Posted May 29, 2003 at 05:05 PM

Categories: Open Source, OSCOM, Software, Technology, General


OSCOM Day 2: WebDAV

Provides a standard way to place content on a web server, with metadata, file locking, versioning. Also can decouple filesystem layout from author's view. Uses HTTP for all logins, so no need to create full user accounts.

Very few clients support metadata so far. Cadaver does, but it's command-line based. Kcera? KExplorer? support properties.

To check out: Joe Orton's sitecopy. Twingle.

WebDAV for filesharing tested lighter than SMB on network traffic.

Question on ranged PUTs. WebDAV and mod_dav support them, but some servers don't. The Mac OS X WebDAV client can't use ranged PUTs for this reason, or it would risk replacing the entire file with the tiny part that was changed. They're working toward some kind of solution.
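For my own reference, a ranged PUT is just a PUT whose body is a fragment, with a Content-Range header saying where it goes. The sketch below only builds the request and never sends it. The URL is a placeholder and the helper names are mine.

```python
import urllib.request


def content_range(offset, length, total):
    """Return a Content-Range value for `length` bytes starting at `offset`."""
    if length <= 0 or offset < 0 or offset + length > total:
        raise ValueError("range does not fit inside the file")
    return "bytes {}-{}/{}".format(offset, offset + length - 1, total)


def build_ranged_put(url, chunk, offset, total):
    """Build (but do not send) a PUT that carries only the changed bytes."""
    return urllib.request.Request(
        url,
        data=chunk,
        method="PUT",
        headers={"Content-Range": content_range(offset, len(chunk), total)},
    )


# Placeholder URL; 4 changed bytes at offset 100 of a 1000-byte file.
req = build_ranged_put("http://example.com/dav/notes.txt", b"edit", 100, 1000)
print(req.get_method(), req.get_header("Content-range"))
```

A server that ignores the header treats those four bytes as the whole file, which is exactly the risk the Mac OS X client is avoiding.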

Servers include Apache mod_dav (which the speaker wrote) and Zope, Tomcat. Jakarta Slide requires a lot of work to connect its memory-based store to something. Can even handle WebDAV with CGI except for OPTIONS method.

Subversion supports DeltaV WebDAV. You can mount & copy files from vanilla Windows & Mac OS X. But you can't modify them, because the clients don't support DeltaV. (There is an experimental "autoversion" plugin for the server to allow this.)

Extensions: ACL. Remote management of ACLs; close to RFC status. DASL (DAV Searching & Locating). Yet another query language. Further off.

MS WebDAV does a little check for FrontPage first, but is pretty much straight WebDAV otherwise.

My question: best/simplest route to implement a change trigger for a WebDAV server, so I could run a script? Can I plug in easily to any of the existing servers?

A. Zope supports WebDAV and is programmable. It uses its own data store, though, not the filesystem. So the whole system would have to use Zope.

Best answer. Could look at logs / an Apache filter to implement change response. Great idea.

Alternative: Author of FS watch & notify utils suggested those. They only run on Unixes, though. (I need Windows support, so I could look into NT's APIs for filesystem notification too.)
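Here's a first sketch of the log-watching idea, assuming the server writes Apache's common log format. The regular expression, the list of methods treated as changes, and the sample log lines are my own guesses at what I'd need, not something from the session.

```python
import re

# HTTP/WebDAV methods that modify content on the server.
WRITE_METHODS = {"PUT", "DELETE", "MKCOL", "MOVE", "COPY", "PROPPATCH"}

# Matches the quoted request line and the status code that follows it.
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3})\b')


def changes(log_lines):
    """Yield (method, path) for each successful modifying request."""
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue
        if match["method"] in WRITE_METHODS and match["status"].startswith("2"):
            yield match["method"], match["path"]


# Made-up log lines for illustration.
sample_log = [
    '127.0.0.1 - andrew [29/May/2003:17:05:00 -0400] "GET /dav/notes.txt HTTP/1.1" 200 512',
    '127.0.0.1 - andrew [29/May/2003:17:05:02 -0400] "PUT /dav/notes.txt HTTP/1.1" 201 -',
    '127.0.0.1 - andrew [29/May/2003:17:05:04 -0400] "DELETE /dav/old.txt HTTP/1.1" 403 -',
    '127.0.0.1 - andrew [29/May/2003:17:05:06 -0400] "MKCOL /dav/drafts/ HTTP/1.1" 201 -',
]

for method, path in changes(sample_log):
    print("changed:", method, path)  # this is where my script would run
```

Since it only reads the log, it should work the same on Windows as on Unix, which the notify utilities wouldn't.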

Posted May 29, 2003 at 05:05 PM

Categories: Open Source, OSCOM, Software, Technology, General


OSCOM Day 2

We have WiFi access now! I'm posting from Dave Winer's keynote (still the introduction).

Posted May 29, 2003 at 09:09 AM

Categories: OSCOM, General


OSCOM Day 2: Dave Winer Keynote

Dave Winer (introduced as "King of the Blogging World") said that was a great introduction, and he didn't agree with anything in it. Call to open source & commercial software worlds to work with each other. Speaking as a commercial developer who has also released open source.

Q: "Proprietary" label used to be sold as a good word. Open source just used it to differentiate themselves.

"40-person company" is what he recommends would be best for customers. 2-3 people doesn't cut it. But those 40-person companies don't exist anymore. Users look at Unix-style OS and think it must be very difficult to write. But it's actually much harder to write software that's easy to use, while users won't recognize its complexity.

Halley Suitt: Is she missing the marketing for open source? What does Linux look like? There's something with a penguin. Someone helpfully brought up his laptop and opened it for her. "My Linux virginity is gone," she announced.

Internet Explorer: users are stranded. Has a development team, but they don't fix the bugs.

XML-RPC: Dave did design in 2 weeks, met with Don Box et al once. Secret of success: not overloaded with complexity. Extra features were aggressively not included. Has not changed since 1999.
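To show how little there is to the protocol, here is a sketch that serializes a call and parses it back with Python's standard xmlrpc.client module, with no network involved. The method name examples.add is a made-up example.

```python
import xmlrpc.client

# Serialize a call to a hypothetical method with two integer parameters.
payload = xmlrpc.client.dumps((2, 3), methodname="examples.add")
print(payload)

# Parse it back: loads() returns the parameter tuple and the method name.
params, method = xmlrpc.client.loads(payload)
print(method, params)
```

The whole request is a methodCall element holding a method name and a list of typed values.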

Audience member disputed the assertion that there were no 40-person software firms. Many CMS packages (shrinkwrapped) come from such companies.

What audience member wants: to be able to fix software. Even if developer goes bankrupt. Dave: What you want is not to be locked in. You want open file formats. Another audience member: retraining is a large part of the switching cost, not data conversion. Q: Source code escrow?

Q: With IE, doesn't want to be stranded. His weblog won't display properly in IE, and he can't fix it. Dave: Source code for IE should have been put in escrow and released already, because they're not working on it. He had strongly suggested that as a remedy in the MS antitrust trial.

Motivations for Open-Source Developers essay. To do: find link; it scrolled off my NetNewsWire aggregator before I read it.

Q: Audience member complained that Radio Userland has support issues, documentation issues.

Dave: They all do! There's no money in software! It's $39.95; that doesn't pay for a lot of support.

Sound bite about personally not liking Bill Gates or Richard Stallman. Neither of them take baths. This is quoted more accurately elsewhere.

Discussion of unifying variants of RSS.

And here we come to the climactic faceoff of the keynote. Apparently Dave Winer & Bill Kearney have never met in person before. I'll let the record speak for itself (search the web for both their names), but if you've ever seen their online mailing list discussions, you'd expect a matter vs. antimatter reaction if ever they were to meet.

Bill Kearney: I'm Bill Kearney, from Syndic8.

Dave: (no particular reaction) What's Syndic8?

Bill: (explains, happening to mention again that he's Bill Kearney)

Dave: Oh, you're Bill Kearney. My God.

[Bill starts talking about "democracy, rather than benevolent dictatorship"; discussion degenerates into shouting & swearing. Elapsed time: about 15 seconds. The play-by-play doesn't really matter, but if you want one, see Aaron's weblog. When the OSCOM organizer Charlie steps in after a few minutes, Dave is too rattled to move on and ends the session.]

I didn't get to ask my question.

Posted May 29, 2003 at 09:09 AM

Categories: Open Source, Software, OSCOM, General


OSCOM Day 1: Other notes

To come.

Posted May 29, 2003 at 02:02 AM

Categories: Open Source, OSCOM, Technology, General


Legal Issues with Open Source Content Management: Notes

Interesting panel discussion.

#1 - Sleepycat CEO
#2 - Lisa ?; lawyer
#3 - Aaron Swartz
#4 - Larry Rosen

Open Source (free because it's useful, strategic) vs. Free Software (everything should be free) vantage points.

Q. Creative commons vs. source license? Larry Rosen: Courts have confused the issue of software IP by applying both patents and copyright to it. [I'd wondered about this problem; software is kind of in the middle of both and neither is quite right.]

Q. W3C DTD & Schema copyrightable? W3C says yes. But would content using that schema be copyrighted by the W3C? Lisa: Functionality/methods can't be covered by copyright. Maybe that applies to this case.

OpenOffice person in audience. Teddy Ruxpin case—successful contributory copyright lawsuit. Bootleg cassettes made Ruxpin tell different stories, make different movements.

Q on "Infected" code (could open source contain stealth IP)? Topical; SCO lawsuit.

Aggregation. Aaron: It's obviously illegal to put scraped feed contents on your page without attribution, obviously legal to write a tool that scrapes to generate feeds. Dave Winer: case of someone who didn't know RSS was generated automatically by Radio. Got mad when it appeared on someone else's site. After that was explained, problem kind of disappeared.

The RSS topic was starting to get too long and the moderator wanted to switch subjects, before I could get my question in, which was exactly along those lines. He said to defer those questions to Dave Winer’s keynote tomorrow.

Posted May 29, 2003 at 02:02 AM

Categories: Open Source, Law, OSCOM, General
