Audio from 2005 continued

Further to the previous post on the 2005 audio, a gentleman named Matt kindly wrote in with these thoughts:

Hi Michael,

Prompted by some of the other Fluendo guys, I took a look at the sample
ogg file you uploaded of the LCA streams. We think we've figured out
roughly what the problem was, so I thought I'd let you know...

We think the machine doing the DV reading/encoding was falling behind
(possibly due to too high CPU usage, but possibly because of an issue in
the dv grabber gstreamer component) - we're discussing what we might do
to solve some of these issues in the future.

Unfortunately, the result of this is that the raw DV data is simply lost
- the remaining data was encoded fine (there are no errors in the vorbis
data according to my validation tools), but there's no possible way to
recover the bits that went missing. There are a couple of minor muxing
problems in the actual ogg files, which can be fixed easily enough (I've
written some tools to help with this, and Conrad Parker has some too,
let me know if you want to know how to fix these up - they confuse some
players a bit), but the talk in that sample file is still pretty much
incomprehensible due to the missing data after repairing the ogg-level


So there you go.

Audio from 2005

I’ve been asked quite a few times about the status of the audio from the 2005 conference, so I thought I would lay out the issues involved here, so that everyone knows. The short answer is that there doesn’t appear to be any audio… Read on for a summary of why.


The 2005 committee did have machines record Speex audio in every lecture theatre at the conference. These were the same machines which were displaying the slide shows when the projector wasn’t in use. It would appear that there was a hardware issue on those machines, as they have all recorded fairly large amounts of garbage data. I can’t comment much further than that, as I wasn’t involved with the setup of the machines, or the diagnosis of the problem.

Video recordings

But wait. There were all those video machines in the back of the theatres (or at least the big three). They’d be recording video, wouldn’t they? Well, they were. It turns out that the audio streams generated from those cameras and their audio system hookup are corrupt. Apparently, and again I haven’t looked into this myself, the timestamps in the files are bogus, so the audio data can’t reliably be extracted. It would seem that only about 25% of each talk can be extracted.
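For anyone wanting to poke at the files, here is a rough illustration of what bogus timestamps might look like at the container level. It is a minimal Python sketch that walks the pages of an Ogg file and flags two things per logical stream: page sequence numbers that skip, and granule positions (the Ogg timestamp field) that go backwards. It is not the tool anyone involved actually used, and the function names are made up for this example. It assumes the page header layout from the Ogg specification (RFC 3533) and does not verify page checksums.

```python
import struct

# Ogg page header up to the segment count (27 bytes, little-endian):
# capture pattern, version, flags, granule position, stream serial,
# page sequence number, CRC, number of segments.
HEADER = struct.Struct("<4sBBqIIIB")


def iter_ogg_pages(data):
    """Yield (offset, serial, sequence, granule) for each page found in data."""
    pos = 0
    while True:
        pos = data.find(b"OggS", pos)
        if pos < 0 or pos + HEADER.size > len(data):
            return
        _, _, _, granule, serial, seq, _, nsegs = HEADER.unpack_from(data, pos)
        table_start = pos + HEADER.size
        seg_table = data[table_start:table_start + nsegs]
        if len(seg_table) < nsegs:
            return  # truncated page header
        yield pos, serial, seq, granule
        # Skip the header, the segment table, and the page body.
        pos = table_start + nsegs + sum(seg_table)


def find_timestamp_problems(data):
    """Return a list of (offset, serial, description) for suspicious pages."""
    last_seq = {}
    last_granule = {}
    problems = []
    for offset, serial, seq, granule in iter_ogg_pages(data):
        if serial in last_seq and seq != last_seq[serial] + 1:
            problems.append((offset, serial, "sequence gap"))
        last_seq[serial] = seq
        # A granule position of -1 means no packet finishes on this page.
        if granule != -1:
            if serial in last_granule and granule < last_granule[serial]:
                problems.append((offset, serial, "granule went backwards"))
            last_granule[serial] = granule
    return problems
```

To try it, read a file in binary mode and pass the bytes in, for example `find_timestamp_problems(open("sample.ogg", "rb").read())`, where `sample.ogg` is a placeholder name. An empty list only means these two crude checks passed. It says nothing about source data that was lost before encoding.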

So where to from here?

Well, all of that is a bit of a bummer really. Our current plan is to put the dodgy ogg video files online for people to download and try to help us out with the extraction of the audio. The problem with this is that we’re talking about a fair bit of data here: 25 gig to be exact. Linux Australia has recently rolled out a mirror project, which I am associated with, that will be able to host these files, but it’s a case of actually getting the hardware (it’s on order), configuring it, testing it, and then deploying it. I would expect this to take around another month from now.

I’ve put a random sample of the ogg video on my site if people want to have a poke before then and see if they have suggestions. This video file, assuming I have worked out the file naming convention properly, should be the start of Eben Moglen’s keynote presentation. The file isn’t too big (around 30 megabytes) so feel free to download it and give it a try.

I do apologise for the inconvenience the loss of data has caused, despite there really being nothing I could have done about it. I do find it a little embarrassing that this has happened. If you could please refer further comments to the conference organisers list that would be nice.

Update: One of the guys at work thinks “I can’t comment much further than that, as I wasn’t involved with the setup of the machines, or the diagnosis of the problem.” sounds self-righteous, so I thought I should clarify and point out that I didn’t mean it that way. What I am trying to convey here is that I would have liked to supply more technical detail as to what happened, but I don’t know any.