Andy Pavlo of the CMU Database Group is well known for saying that while NoSQL databases go through cycles of popularity, all databases eventually iterate back to a SQL interface. It happened with MongoDB and Google’s BigTable, for example.
I think I have hit that point with etcd. Initially I ported from MySQL to etcd because I really wanted the inexpensive distributed locking and the ability to watch values. However, I never actually watch values in my code any more, and I now spend a huge amount of my time maintaining what my code calls “caches”, but which I can now see are just poorly implemented secondary indexes. The straw that broke the camel’s back was https://github.com/etcd-io/etcd/issues/9043, which changed etcd’s defaults so that a single RPC request can only return 1.5 MB.
I therefore think it might be time for me to port back to a real SQL database, perhaps keeping etcd to manage distributed locks. Perhaps.
I need to think about this more to be honest, but I think I’ve hit the limit of what you can express in key / value pairs stored directly in etcd. I often want to look up items based on a portion of their value (the values are JSON), but that’s not possible in etcd without the extra indices that I now maintain. On the other hand, as I’ve grown as a programmer, I’ve come to really, really want the Chubby-style check-and-replace transactional multi-table update syntax that etcd offers and that S3 recently introduced as well. So moving back to a pure SQL database would leave me missing that.
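To make the check-and-replace idea concrete, here is a minimal sketch in Python. It does not talk to etcd at all: `TinyKV` is a hypothetical in-memory stand-in that tracks a modification revision per key, and `migrate` and the `/instance/…` and `/node/…` key layout are made up for illustration. The point is only the shape of the transaction: read several keys, then write them all back atomically if and only if none of them changed in the meantime.

```python
import json
import threading


class TinyKV:
    """Hypothetical in-memory stand-in for a store with per-key revisions (not etcd)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}     # key -> (value, mod_revision)
        self._revision = 0  # store-wide counter, bumped on each applied transaction

    def get(self, key):
        """Return (value, mod_revision); a missing key reads as (None, 0)."""
        with self._lock:
            return self._data.get(key, (None, 0))

    def txn(self, compares, puts):
        """Apply every put atomically, but only if each compared key is unchanged.

        compares: dict of key -> mod_revision we saw when we read it
        puts:     dict of key -> new value
        Returns True if the writes were applied, False if any compare failed.
        """
        with self._lock:
            for key, expected in compares.items():
                _, current = self._data.get(key, (None, 0))
                if current != expected:
                    return False
            self._revision += 1
            for key, value in puts.items():
                self._data[key] = (value, self._revision)
            return True


def migrate(kv, instance_id, dst):
    """Move an instance between nodes, updating three keys in one transaction."""
    inst_key = f"/instance/{instance_id}"
    while True:
        raw, inst_rev = kv.get(inst_key)
        if raw is None:
            return False
        inst = json.loads(raw)
        src = inst["node"]
        if src == dst:
            return True
        src_key, dst_key = f"/node/{src}", f"/node/{dst}"
        src_raw, src_rev = kv.get(src_key)
        dst_raw, dst_rev = kv.get(dst_key)
        src_list = json.loads(src_raw) if src_raw else []
        dst_list = json.loads(dst_raw) if dst_raw else []
        inst["node"] = dst
        applied = kv.txn(
            compares={inst_key: inst_rev, src_key: src_rev, dst_key: dst_rev},
            puts={
                inst_key: json.dumps(inst),
                src_key: json.dumps([i for i in src_list if i != instance_id]),
                dst_key: json.dumps(dst_list + [instance_id]),
            },
        )
        if applied:
            return True
        # Someone else wrote one of the keys first: re-read and try again.


kv = TinyKV()
kv.txn({}, {
    "/instance/i-1": json.dumps({"node": "a"}),
    "/node/a": json.dumps(["i-1"]),
    "/node/b": json.dumps([]),
})
migrate(kv, "i-1", "b")
```

The retry loop is the price of optimistic concurrency: nothing is locked while the values are being computed, so a conflicting writer simply forces another pass.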
One alternative to ditching etcd entirely would be to write an RPC service which sat in front of it and abstracted away the underlying data store. If I treated etcd as a storage engine, and then maintained the various indices in that abstracting layer, then I might get to a happier place. This would map somewhat to how modern databases are built, if we thought of the keys in etcd as page locations in a storage engine. etcd would be quite an expensive storage engine however, given its in-memory only attributes.
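As a rough sketch of what that abstracting layer might look like, the Python below keeps secondary index entries next to the records it stores. Everything here is an assumption for illustration: `IndexedStore`, the `/records/…` and `/index/…` key layout, and the use of a plain dict in place of etcd. In a real version the record write and the index writes would need to go into a single transaction, and `find` would be a prefix range read rather than a scan.

```python
import json


class IndexedStore:
    """Hypothetical layer that maintains secondary indexes over JSON records.

    `backend` is any dict-like key/value mapping; a plain dict stands in for etcd.
    """

    def __init__(self, backend, indexed_fields):
        self._kv = backend
        self._fields = tuple(indexed_fields)

    @staticmethod
    def _record_key(record_id):
        return f"/records/{record_id}"

    @staticmethod
    def _index_key(field, value, record_id):
        # One key per (field, value, record) so a prefix read finds every match.
        return f"/index/{field}/{json.dumps(value)}/{record_id}"

    def get(self, record_id):
        raw = self._kv.get(self._record_key(record_id))
        return None if raw is None else json.loads(raw)

    def delete(self, record_id):
        old = self.get(record_id)
        if old is None:
            return
        for field in self._fields:
            if field in old:
                self._kv.pop(self._index_key(field, old[field], record_id), None)
        del self._kv[self._record_key(record_id)]

    def put(self, record_id, record):
        # Not atomic in this sketch: a real store would batch these writes
        # into one transaction so the index can never disagree with the data.
        self.delete(record_id)  # drop index entries for the old value first
        self._kv[self._record_key(record_id)] = json.dumps(record)
        for field in self._fields:
            if field in record:
                self._kv[self._index_key(field, record[field], record_id)] = ""

    def find(self, field, value):
        """Return the ids of records whose `field` equals `value`, using only the index."""
        prefix = f"/index/{field}/{json.dumps(value)}/"
        return sorted(k[len(prefix):] for k in self._kv if k.startswith(prefix))


store = IndexedStore({}, ["state"])
store.put("i-1", {"name": "web", "state": "running"})
store.put("i-2", {"name": "db", "state": "stopped"})
store.put("i-3", {"name": "cache", "state": "running"})
running = store.find("state", "running")

store.put("i-1", {"name": "web", "state": "stopped"})
running_after = store.find("state", "running")
```

Callers only ever see `put`, `get`, `delete` and `find`, so the index bookkeeping lives in one place instead of being scattered through the code as “caches”.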
Oh, and you should all go and watch Andy Pavlo’s excellent lectures on how to build a database storage engine: