Thinking about IDL-style descriptions of document formats

I’ve been mulling over IDL-style definitions of document formats in the background for the last few days. Specifically, I’m interested in ways of expressing the structure of a document outside of code, and then generating code to process the specified document. Sort of like lex and yacc, but more flexible and not language specific. This would mean that when you wanted to process a document in your chosen language, you wouldn’t have to deal with things like SWIG; you’d just generate the native code and go for it.
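A rough sketch of the idea in Python, for illustration only. The field list below is a made-up stand-in for a format description that would really live in its own file, and `make_parser` is a hypothetical helper that plays the part of the code generator. It builds the parser at runtime with the standard `struct` module instead of emitting source code, and the header layout (`magic`, `version`, `page_count`) is invented for the example.

```python
import struct

# Hypothetical, language-neutral description of a tiny document header.
# Each entry is (field name, struct format code). A real tool would read
# this from a separate description file rather than a Python list.
SPEC = [
    ("magic", "4s"),       # four-byte file signature
    ("version", "H"),      # unsigned 16-bit integer
    ("page_count", "I"),   # unsigned 32-bit integer
]


def make_parser(spec, byte_order="<"):
    """Build a parse function from a field list (stands in for generated code)."""
    fmt = byte_order + "".join(code for _, code in spec)
    names = [name for name, _ in spec]
    size = struct.calcsize(fmt)

    def parse(data):
        # Read only the bytes the description covers; ignore anything after.
        values = struct.unpack(fmt, data[:size])
        return dict(zip(names, values))

    return parse


parse_header = make_parser(SPEC)

# Build a sample header by hand and parse it back.
sample = struct.pack("<4sHI", b"DEMO", 2, 17)
header = parse_header(sample)
```

The point is that the structure lives in `SPEC`, not in the parsing logic, so the same description could in principle drive a generator for any other language.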

Obviously these ideas aren’t new. DCE RPC’s IDL works like this, as do Google’s Protocol Buffers. However, I want something more generic. Has anyone seen something like this?

Open Source document management from Alfresco

An Alfresco employee (Alfrescoer?) posts about some of the interesting things they’ve learnt along the way about being an open source company. The comments about PR being more effective than cold sales calls are especially interesting. I argued for years at TOWER that we should be paying more attention to people searching for our product, instead of paying pretty boys to drive sports cars to sales presentations that everyone secretly hates. If your product has a good reputation and people can find it online, surely the customers will come to you?