I really wanted to like etcd, but Andy Pavlo was right

Andy Pavlo of the CMU Database Group is well known for saying that while NoSQL databases go through cycles of popularity, all databases eventually iterate back to a SQL interface. It happened with MongoDB and Google’s BigTable, for example.

I think I have hit that point with etcd. Initially I ported from MySQL to etcd because I really wanted the inexpensive distributed locking and the ability to watch values. However, I never actually watch values in my code any more, and I now spend a huge amount of my time maintaining what my code calls “caches”, but which I can now see are just poorly implemented secondary indexes. The straw that broke the camel’s back was https://github.com/etcd-io/etcd/issues/9043, which changed etcd’s defaults so that a single RPC request can only return 1.5 MB.
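To make the "caches are really secondary indexes" point concrete, here is a minimal sketch in Python. It does not use etcd at all: a plain dict stands in for the key-value store, and the class, table, and field names (HandRolledStore, instances, owner) are hypothetical. It only illustrates how much bookkeeping a hand-maintained index needs compared with asking a SQL engine (sqlite3 here) to maintain one.

```python
import sqlite3


class HandRolledStore:
    """A key-value store plus a manually maintained 'cache' keyed by owner."""

    def __init__(self):
        self.instances = {}  # primary data: instance id -> record
        self.by_owner = {}   # the "cache": owner -> set of instance ids

    def put(self, instance_id, record):
        # Every write has to fix up the index too, including removing the
        # stale entry when the indexed field changes.
        old = self.instances.get(instance_id)
        if old is not None:
            self.by_owner[old["owner"]].discard(instance_id)
        self.instances[instance_id] = record
        self.by_owner.setdefault(record["owner"], set()).add(instance_id)

    def ids_for_owner(self, owner):
        return sorted(self.by_owner.get(owner, set()))


def sql_ids_for_owner(rows, owner):
    """The same lookup where the database maintains the index itself."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE instances (id TEXT PRIMARY KEY, owner TEXT)")
    db.execute("CREATE INDEX instances_by_owner ON instances (owner)")
    db.executemany("INSERT OR REPLACE INTO instances VALUES (?, ?)", rows)
    found = db.execute(
        "SELECT id FROM instances WHERE owner = ? ORDER BY id", (owner,)
    ).fetchall()
    db.close()
    return [row[0] for row in found]
```

In the hand-rolled version, forgetting the stale-entry cleanup in put() silently corrupts the index, which is exactly the class of bug a SQL engine takes off your hands.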

I therefore think it might be time for me to port back to a real SQL database, perhaps keeping etcd to manage distributed locks. Perhaps.

(more…)

Databases built on object stores are officially interesting

  • Post category:Databases

…even if none of my friends seem to think so.

I’ve been off on a bit of a tangent recently. It’s a slow-burn tangent that I am pretty sure was kicked off by this Geek Narrator podcast episode about the design of Turbo Puffer with Simon Eskildsen:

The basic idea is that you can build very large scale database systems using only the primitives provided by an object store such as Amazon S3. Now, the performance might also suck, but you can alleviate some of that with a good caching layer and in return you get massive scale. This first video caused me to discover the work of Andy Pavlo, who was interviewed by the same podcast:

(more…)

Juno nova mid-cycle meetup summary: DB2 support

  • Post category:OpenStack

This post is one part of a series discussing the OpenStack Nova Juno mid-cycle meetup. It's a bit shorter than most of the others, because the next thing on my list to talk about is DB2, and that's relatively contained. IBM is interested in adding DB2 support as a SQL database for Nova. Theoretically, this is a relatively simple thing to do because we use SQLAlchemy to abstract away the specifics of the SQL engine. However, in reality, the abstraction is leaky. The obvious example in this case is that DB2 has different rules for foreign keys than other SQL engines we've used. So, in order to be able to make this change, we need to tighten up our schema for the database. The change that was discussed is the requirement that the UUID column on the instances table be not null. This seems like a relatively obvious thing to allow, given that UUID is the official way to identify an instance, and has been for a really long time. However, there are a few things which make this complicated: we need to understand the state of databases that might have been through a long chain of upgrades from previous…

Exploring a single database migration

  • Post category:OpenStack

Yesterday I was having some trouble with a database migration downgrade step, and Joshua Hesketh suggested I step through the migrations one at a time and see what they were doing to my sqlite test database. That's a great idea, but it wasn't immediately obvious to me how to do it. Now that I've figured out the steps required, I thought I'd document them here.

First off we need a test environment. I'm hacking on nova at the moment, and tend to build throwaway test environments in the cloud because it's cheap and easy. So, I created a new Ubuntu 12.04 server instance in Rackspace's Sydney data center, and then configured it like this:

    $ sudo apt-get update
    $ sudo apt-get install -y git python-pip git-review libxml2-dev libxml2-utils libxslt-dev libmysqlclient-dev pep8 postgresql-server-dev-9.1 python2.7-dev python-coverage python-netaddr python-mysqldb python-git virtualenvwrapper python-numpy virtualenvwrapper sqlite3
    $ source /etc/bash_completion.d/virtualenvwrapper
    $ mkvirtualenv migrate_204
    $ toggleglobalsitepackages

Simple! I should note here that we probably don't need the virtualenv because this machine is disposable, but it's still a good habit to be in. Now I need to fetch the code I am testing. In this case it's from my personal fork of nova, and the git…

Nova database continuous integration

  • Post category:OpenStack

I've had some opportunity recently to spend a little quality time offline, and I spent some of that time working on a side project I've wanted to do for a while -- continuous integration testing of nova database migrations. Now, the code isn't perfect at the moment, but I think it's an interesting direction to take and I will keep pursuing it. One of the problems nova developers have is that we don't have a good way of determining whether a database migration will be painful for deployers. We can eyeball code reviews, but whether code looks reasonable or not, it's still hard to predict how it will perform on real data. Continuous integration is the obvious solution -- if we could test patch sets on real databases as part of the code review process, then reviewers would have more data about whether to approve a patch set or not. So I did that. At the moment the CI implementation I've built isn't posting to code reviews, but that's because I want to be confident that the information it gathers is accurate before wasting other reviewers' time. You can see results at openstack.stillhq.com/ci. For now, I am keeping an…

MythBuntu 8.10 just made me sad

  • Post category:Mythtv

I figured it was time to give MythBuntu a try, so I set up a MythBuntu 8.10 instance in VirtualBox today. That was a mistake. I'm not 100% sure I understand how it happened, but MythBuntu somehow managed to delete my entire mythconverg MySQL database instance. Not pleased. I've restored it from last night's backup, but now I'll need to recover the recordings which happened today, assuming I can be bothered. I'm writing this just as a warning to others -- if you're playing with MythBuntu, back up your MySQL instance if it's not a test one.

Recovering lost MythTV recordings

  • Post category:Mythtv

I had one of those moments tonight, and accidentally dropped the mythconverg database on my production MythTV instance, not the development one. This made me sad. Luckily I had a backup which was only a week old (although I am now running nightly backups of that database). Recovery wasn't too bad once I wrote some code. The steps:

  • Restore from backup
  • Don't run mythfilldatabase (it will clear out old guide data, and we need it)
  • Apply my funky patch to myth.rebuilddatabase.pl
  • Run myth.rebuilddatabase.pl
  • Run mythfilldatabase

And all is well again. The patch uses the guide data from the database to make an educated guess about the title, subtitle and description of the recordings which are missing from the database. Here's the patch:

Index: myth.rebuilddatabase.pl
===================================================================
--- myth.rebuilddatabase.pl (revision 11681)
+++ myth.rebuilddatabase.pl (working copy)
@@ -185,6 +185,7 @@
 'norename'=>\$norename
 );

+print "db = dbi:mysql:database=$database:host=$dbhost user = $user pass = $pass\n";
 my $dbh = DBI->connect("dbi:mysql:database=$database:host=$dbhost", "$user","$pass")
 or die "Cannot connect to database ($!)\n";

@@ -314,6 +315,7 @@
 # have enough to look for an past recording?
 if ($ssecond) {
+ print "Checking for a recording...\n";
 $starttime = "$syear$smonth$sday$shour$sminute$ssecond";
 my $guess = "select title, subtitle, description from oldrecorded where chanid=(?) and starttime=(?)";
@@…
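The lookup at the heart of the patch can be sketched in Python for readers who don't speak Perl. This is an illustration only, not the patch itself: it uses an in-memory sqlite3 database in place of MySQL, and it assumes recording filenames look like <chanid>_<YYYYMMDDhhmmss>.<ext> and that starttime is stored as the same 14-digit string. The query is the one from the patch; the function names and sample row are made up.

```python
import re
import sqlite3

# Assumed filename layout: <chanid>_<YYYYMMDDhhmmss>.<ext>
FILENAME_RE = re.compile(r"^(\d+)_(\d{14})\.\w+$")


def guess_metadata(db, filename):
    """Return (title, subtitle, description) from old guide data, or None."""
    match = FILENAME_RE.match(filename)
    if match is None:
        return None  # not enough information to look for a past recording
    chanid, starttime = int(match.group(1)), match.group(2)
    return db.execute(
        "select title, subtitle, description from oldrecorded "
        "where chanid=(?) and starttime=(?)",
        (chanid, starttime),
    ).fetchone()


def demo_db():
    """Build a throwaway database holding one made-up guide entry."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "create table oldrecorded (chanid integer, starttime text, "
        "title text, subtitle text, description text)"
    )
    db.execute(
        "insert into oldrecorded values (?, ?, ?, ?, ?)",
        (1001, "20070115203000", "Example Show", "Pilot", "A made-up entry"),
    )
    return db
```

This also shows why mythfilldatabase must not run before the rebuild: once the old guide rows are cleared, the lookup has nothing to match against and every guess comes back empty.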
