Learning from the mistakes that even big projects make

The following is a blog post version of a talk presented at PyCon AU 2018. Slides for the presentation can be found here (as Microsoft PowerPoint, or as PDF), and a video of the talk (thanks NextDayVideo!) is below:

 

OpenStack is an orchestration system for setting up virtual machines and other associated virtual resources, such as networks and storage, on clusters of computers. At a high level, OpenStack is just configuring existing facilities of the host operating system; there isn’t really a lot of difference between OpenStack and a room full of system admins frantically resolving tickets requesting that virtual machines be set up. The only real difference is scale and predictability.

To do its job, OpenStack needs to be able to manipulate parts of the operating system which are normally reserved for administrative users. This talk is the story of how OpenStack has done that over time, what we learnt along the way, and what I’d do differently if I had my time again. Lots of systems need to do these things, so even if you never use OpenStack, hopefully there are things to be learnt here.


Collisions in MD5 sums

  • Post category:Crypto

This is kind of a big deal. Cryptographic hash functions are used in a lot of computer science circles to take a large document and turn it into a relatively small description of the document. The transformation has a couple of interesting properties:

It's one way, which means that I can know that I have your document without checking the contents. There are secure file systems out there that, when you give them a file, give you back an ID for the file, and that's how you access it in the future. Don't know the ID? You can't possibly have seen the file.

The IDs are meant to be unique. Some overlap is unavoidable when there are bazillions of documents and comparatively few IDs available, but it's meant to be very hard to find two documents with the same ID. This is commonly used for CD downloads, for instance, where people want to be sure that you got the complete file as intended, or to make sure that you're not storing information twice. EMC, for instance, has an email storage system which only saves an email if the MD5 ID is new; otherwise it must be a duplicate. It turns…
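To make the de-duplication idea concrete, here is a minimal Python sketch of a store that uses the MD5 sum of a document as its ID and only keeps the document if that ID is new. The `DedupStore` class and its method names are made up for illustration; this is not how EMC's product (or any particular file system) is implemented, just the general shape of the technique.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store (hypothetical): MD5 digest is the ID."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The ID is derived purely from the content of the document.
        doc_id = hashlib.md5(data).hexdigest()
        # Only keep the data if this ID has not been seen before;
        # a repeated ID is assumed to be a duplicate document.
        self._blobs.setdefault(doc_id, data)
        return doc_id

    def get(self, doc_id: str) -> bytes:
        # Without the ID there is no way to ask for the document.
        return self._blobs[doc_id]

    def __len__(self) -> int:
        return len(self._blobs)


store = DedupStore()
first = store.put(b"hello world")
second = store.put(b"hello world")    # same content, same ID, stored once
third = store.put(b"hello world!")    # different content, different ID
```

Note the assumption baked into `put`: two documents with the same ID are treated as the same document. If someone can construct two different documents with the same MD5 sum, a store like this would silently keep only the first one, which is why collisions matter for this kind of system.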
