Estimating the progress of queries on MySQL

I’ve been doing a lot of batch updates on one of my databases at home recently. show processlist says something like this:

mysql> show processlist;
| Id    | User | Host          | db           | Command | Time  | State    | Info                                     |
| 18354 | root | maui:37403    | smtp_servers | Query   | 57234 | Updating | update ips_218 set reverse_lookup = null |
| 22286 | root | maui:37348    | smtp_servers | Query   | 38103 | Updating | update ips_80 set reverse_lookup = null, |
| 22851 | root | maui:54982    | smtp_servers | Query   | 34091 | Updating | update ips_19 set reverse_lookup = null, |
| 23351 | root | molokai:58232 | smtp_servers | Sleep   |    57 |          | NULL                                     |
| 23496 | root | maui:40923    | smtp_servers | Query   | 29973 | Updating | update ips_62 set reverse_lookup = null, |
| 23906 | root | maui:38068    | smtp_servers | Query   | 26794 | Updating | update ips_83 set reverse_lookup = null, |
| 25675 | root | maui:56438    | smtp_servers | Query   | 12505 | Updating | update ips_82 set reverse_lookup = null, |
| 25846 | root | maui:41334    | smtp_servers | Query   | 10948 | Updating | update ips_90 set reverse_lookup = null, |
| 26437 | root | maui:41139    | smtp_servers | Query   |  6211 | Updating | update ips_66 set reverse_lookup = null, |
| 26773 | root | maui:32885    | smtp_servers | Query   |  3526 | Updating | update ips_76 set reverse_lookup = null, |
| 27073 | root | maui:42607    | smtp_servers | Query   |  1148 | Updating | update ips_11 set reverse_lookup = null, |
| 27202 | root | molokai:50688 | smtp_servers | Query   |     0 | NULL     | show processlist                         |
| 27203 | root | molokai:50689 | smtp_servers | Sleep   |     2 |          | NULL                                     |
14 rows in set (0.20 sec)

Now, wouldn’t it be nice if MySQL provided some extra information about the progress of those queries? Like for example the number of rows which have been updated so far, or an estimate of how long the query has left to run? I’m ok with such queries not being very accurate, but I assume the storage engine has to have some idea of how many rows are in the table and how many it has touched already.

Perhaps something like this already exists and I haven’t noticed? I’m using innodb if that matters.

Update: it seems like innodb can answer this question for me:

mysql> show engine innodb status \G;
---TRANSACTION 0 40056, ACTIVE 39794 sec, process no 22984, OS thread id 3020733328 waiting in InnoDB queue
mysql tables in use 1, locked 1
6672 lock struct(s), heap size 748864, undo log entries 909825
MySQL thread id 22851, query id 351217 maui root Updating
update ips_19 set reverse_lookup = null, reverse = null, reverse_extracted

That doesn’t give you an estimate of percentage complete though. I assume there is a 1:1 correlation between undo row entries and rows altered by the query?

Update: my imperical observation is that the undo rows are not 100% correlated to the number of rows your query changed. Its correlated to the number of rows that were changed kinda near your query. For example, if you’re doing an update, then the number is good enough to trust. However, if you’re doing a select, then the number seems to be the number of rows someone else changed while your select was running (i.e. how many old versions needed to be kept around because of your select transaction).

Also, Jeremy Cole to the rescue!

Done and done

Stewart finished off the last trailing chapter of the MythTV book a week or so ago, and I finished the last copy edit (the process where we learn US grammar) just now. That means we only need to have page layout done for a chapter or so and we’re home. Hurrah!

Time for a status update?

It’s been a while since I’ve commented on the MythTV book project, so it’s worth a quick status update. We’re basically into the home straight now — we’ve decided what to cover in the book (which of course involves missing out on a few things which could be in there if we had infinite space and time), written the chapters (which are all now done), and are now working through technical review and copy edits.

I’ll keep you posted as we get closer to final production.

Book status

I was thinking about the delivery status of the MythTV book as I was going off to sleep last night, and it occurred to me that the book is pretty much done. It’s a good feeling.

Stewart needs to finish poking a chapter on cut lists and so forth, with should be minimal effort, then he needs to work on chapters on MythWeb and VoIP (the VoIP one is half done). I need to write an updated introduction (the contents of the book have deviated from plan, although that always happens) as well as a short conclusion chapter, and help Stewart write a chapter about running the latest and greatest version of the code (and how to interact with the development community).

Then we’re done for the writing. That description can be summarised as “Stewart and I appear to both be about one and a half chapters away from finishing authoring”.

We of course still need to get through technical review, editorial review, replying to things those reviewers need changed, copy editing, page layout, and standing on one leg. We’re getting towards the end of the project though.

When I think about it that way, I guess I read the ImageMagick book around four or five times.

You can pre order a copy of the MythTV book from Amazon for $20 US.