pyconau 2018 call for proposals now open

The pyconau call for proposals is now open, and runs until 28 May. I took my teenagers to pyconau last year and they greatly enjoyed it. I hadn’t been to a pyconau in ages, and ended up really enjoying thinking about things from topic areas I don’t normally need to think about. I think expanding one’s horizons is generally a good idea.

Should I propose something for this year? I am unsure. Some random ideas that immediately spring to mind:

  • something about privsep: I think a generalised way to make privileged calls in unprivileged code is quite interesting, especially in a language which is often used for systems management and integration tasks. That said, perhaps it’s too OpenStacky given how uninterested in OpenStack talks most python people seem to be.
  • nova-warts: for a long time my hobby has been cleaning up historical mistakes made in OpenStack Nova that won’t ever rate as a major feature change. What lessons can other projects learn from a well-funded and heavily staffed project that still thought that exec() was a great way to do important work? There’s definitely an overlap with the privsep talk above, but this would be more general.
  • a talk about how I had to manage some code which only worked in python2 and some other code which only worked in python3, and in the end gave up on venvs and decided that Docker containers are like the ultimate venvs. That said, I suspect this is old hat and was obvious to everyone except me.
  • something else I haven’t thought of.

Anyways, I’m undecided. Comments welcome.

Also, here’s an image for this post. It’s the Stonehenge we found at Guerilla Bay last weekend. I assume it’s in frequent use by tiny, tiny druids.

A pythonic example of recording metrics about ephemeral scripts with prometheus

In my previous post we talked about how to record information from short lived scripts (I call them ephemeral scripts by the way) with prometheus. The example there was a script which checked the SMART status of each of the disks in a machine and reported that via pushgateway. I now want to work through a slightly more complicated example.

I think you hit the limits of reporting simple values in shell scripts via curl requests fairly quickly. For example with the SMART monitoring script, SMART is capable of returning a whole heap of metrics about the performance of a disk, but we boiled that down to a single “health” value. This is largely because writing a parser for all the other values that smartctl returns would be inefficient and fragile in shell. So for this post, we’re going to work through an example of how to report a variety of values from a python script. Those values could be the parsed output of smartctl, but to mix things up a bit, I’m going to use a different script I wrote recently.
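
As a rough illustration of why python is a better fit than shell for this sort of parsing, here is a minimal sketch which turns the attribute table from `smartctl -A` into a dictionary. The sample text and the `parse_smart_attributes` name are made up for this example, and the column layout is an assumption about typical smartctl output, so check it against what your own disks report.

```python
# Hypothetical sample of the attribute table printed by `smartctl -A`.
# The exact columns are an assumption; real output varies between disks.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4521
194 Temperature_Celsius     0x0022   036   045   000    Old_age   Always       -       36
"""


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} for rows with an integer raw value."""
    attributes = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least ten columns,
        # which also skips the header row.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if raw.isdigit():
            attributes[name] = int(raw)
    return attributes


if __name__ == '__main__':
    for name, value in sorted(parse_smart_attributes(SAMPLE).items()):
        print(name, value)
```

Each entry in the resulting dictionary is a plain number, which is the shape of value that a prometheus gauge wants.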

This new script uses the Weather Underground API to look up weather stations near my house, and then generates graphics of the weather forecast. These graphics are displayed on the various Cisco SIP phones I already had around the house. The forecasts look like this:

The script to generate these weather forecasts is relatively simple python, and you can see the source code on github.

My cunning plan here is to use prometheus’ time series database and alert capabilities to drive home automation around my house. The first step for that is to start gathering some simple facts about the home environment so that we can do trending and decision making on them. The code to do this isn’t all that complicated. First off, we need to add the python prometheus client to our python environment, which is hopefully a venv:

pip install prometheus_client
pip install six

That second dependency isn’t a strict requirement for prometheus, but the script I’m working on needs it (because it needs to work out what’s a text value, and python 3 is bonkers).

Next we import the prometheus client in our code and set up the collector registry. At the same time I record when the script was run:

from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

registry = CollectorRegistry()
Gauge('job_last_success_unixtime', 'Last time the weather job ran',
      registry=registry).set_to_current_time()

And then we just add gauges for any values we want to send to the pushgateway:

Gauge('_'.join(field), '', registry=registry).set(value)
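
The `field` in that line is a sequence of name parts which get joined with underscores. One way to produce those parts, sketched here under the assumption that the data is a nested dictionary, is to walk the dictionary and yield the path to every numeric leaf. The `flatten` helper and the `forecast` data are hypothetical and are not taken from the real script.

```python
def flatten(data, prefix=()):
    """Yield (path, value) pairs for every numeric leaf in a nested dict."""
    for key, value in sorted(data.items()):
        path = prefix + (str(key),)
        if isinstance(value, dict):
            # Recurse into nested dictionaries, extending the path as we go.
            yield from flatten(value, path)
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            # Gauges hold numbers, so text values such as 'Sunny' are skipped.
            yield path, value


# Hypothetical forecast data; the real script's structure may differ.
forecast = {
    'today': {'temp': {'high': 31, 'low': 17}, 'conditions': 'Sunny'},
    'tomorrow': {'temp': {'high': 28.5, 'low': 15}},
}

if __name__ == '__main__':
    for field, value in flatten(forecast):
        print('_'.join(field), value)
```

Each pair can then be handed to the Gauge line above. Keep in mind that prometheus metric names may only contain letters, digits, underscores and colons, so keys with other characters would need cleaning up first.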

Finally, the values don’t exist in the pushgateway until we actually push them there, which we do like this:

push_to_gateway('localhost:9091', job='weather', registry=registry)

You can see the entire patch I wrote to add prometheus support on github if you’re interested in an example with more context.

Now we can have pretty graphs of temperature and stuff!

On syncing with Google Contacts

So, I started with a new company a few weeks ago, and one of the things I missed from my previous company was having the entire corporate directory synced onto my phone. It’s really handy when you’re on call to be able to give people a call when something goes wrong, without having to dig around and find their details.

Back in the good old days at Google the way you got this sort of data onto your phone was to run a script written by one of the guys on the gmail team. The script grabbed the LDAP directory, and pushed it into Google contacts, which you could then sync with your phone. Now I wanted something very similar — especially as the contacts sync stuff with Android is pretty reasonable.

However, I’d never coded with the Google public APIs before, and that turned out to be the hardest part of the problem.

First off I wrote a little script which dumped the corporate directory into a text file. I mostly did this because I wanted other people to be able to run the script in as lightweight a manner as possible. For example, if we wanted to roll this out for hundreds of people, then you wouldn’t want to run the LDAP query hundreds of times. The format for my text file is kinda lame to be honest:

Michael Still: {'telephoneNumber': ['+61 123 123 123'], 'ID': ['mikalstill'], 'mail': ['']}

So, you get the user’s name, then a python dictionary with three keys in it. There isn’t any particular reason for having just three keys, it was just the three fields I thought were most interesting at the time. Note that each field is a list. A simple human readable format like this means that I can also grep through the file if I ever quickly want a user’s details, which is a nice side effect.
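
Reading that format back in is straightforward. This is a minimal sketch, not the code from the sync script: the `parse_directory_line` name is made up, and it assumes the user’s name never contains a colon followed by a space.

```python
import ast


def parse_directory_line(line):
    """Split 'Name: {dict}' into (name, dict), parsing the dict safely."""
    # Split on the first ': ' only; later ones belong to the dictionary.
    name, _, details = line.partition(': ')
    # literal_eval only accepts python literals, so it will not run code
    # which happens to be sitting in the file.
    return name, ast.literal_eval(details)


line = ("Michael Still: {'telephoneNumber': ['+61 123 123 123'], "
        "'ID': ['mikalstill'], 'mail': ['']}")

if __name__ == '__main__':
    name, details = parse_directory_line(line)
    print(name, details['ID'][0], details['telephoneNumber'][0])
```

Using `ast.literal_eval` instead of `eval` matters here because the file is meant to be shared between lots of people.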

The most important thing I learnt here is that the ID field is really important. If you don’t have something you feel you can use there, then you might need to synthesize something — perhaps an ascii representation of the user’s name or something. This is important because I discovered that Google rewrites Unicode characters you ask it to store, so if you do a simple text comparison against the user’s name, then you might get a false negative and end up creating more than one entry for that user. That was particularly a problem for me because there are a fair few people in the company with European accented characters in their names.
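
If you do need to synthesize an ID, something like the following sketch might do. It is an assumption about one reasonable approach, not what my script does, and the `ascii_id` name and the sample names are invented for the example.

```python
import re
import unicodedata


def ascii_id(name):
    """Build a plain ASCII, lower-case identifier from a display name."""
    # NFKD splits accented characters into a base letter plus combining marks.
    decomposed = unicodedata.normalize('NFKD', name)
    # Dropping everything outside ASCII removes the combining marks.
    stripped = decomposed.encode('ascii', 'ignore').decode('ascii')
    # Collapse runs of anything which is not a letter or digit into one
    # underscore, and trim underscores from the ends.
    return re.sub(r'[^a-z0-9]+', '_', stripped.lower()).strip('_')


if __name__ == '__main__':
    print(ascii_id('Zoë Müller'))
    print(ascii_id('José  Núñez-García'))
```

Letters with no decomposition, such as ø or ß, are simply dropped by this approach, so two different names could in theory collapse to the same ID. It is worth checking the generated IDs for duplicates before relying on them.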

The docs for the Google contacts API are ok, although I did have to spend some time randomly searching for examples of some of the things I wanted to do. For example, the docs didn’t have an example of how to store a phone number that I could find. Also, I am a little shocked to discover there is no query interface in contacts for contact name. This seems like a pretty massive oversight to me, but here’s what the docs have to say on the issue:

For more information about query parameters, see the Contacts Data API Reference Guide and the Google Data APIs Reference Guide. In particular, there is no support for full-text queries or locating a contact by email address.

Whatever intern wrote the API should have his ball pit rights revoked until he fixes that. After that it was all gravy. Here’s the code:

I note that there is an enterprise shared contacts API (see here), but you have to be a premier customer for it to work.

Building a symlink tree for MythTV recordings

I wanted to build a directory of symlinks that pointed to my MythTV recordings, so I wrote a little python script to do it for me. I figure someone else might find this useful too…

    # Copyright (C) Michael Still 2007
    # Released under the terms of the GNU GPL
    import MySQLdb
    import os
    import re
    from socket import gethostname
    # Connect to the MythTV database based on the MythTV config
    config_values = {}
    home = os.environ.get('HOME')
    config = open(home + '/.mythtv/mysql.txt')
    for line in config.readlines():
      if not line.startswith('#') and len(line) > 5:
        # Split on the first '=' only, in case the value itself contains one
        (key, value) = line.rstrip('\n').split('=', 1)
        config_values[key] = value
    db_connection = MySQLdb.connect(host = config_values['DBHostName'],
                                    user = config_values['DBUserName'],
                                    passwd = config_values['DBPassword'],
                                    db = config_values['DBName'])
    cursor = db_connection.cursor(MySQLdb.cursors.DictCursor)
    # Regexp for what is allowed in the symlink name
    unsafe = re.compile(r'[^a-zA-Z0-9\-:_]+')
    # Find the recordings directory -- this assumes you haven't used an
    # identifier string for this machine...
    # The hostname is passed as a query parameter so the driver quotes it
    cursor.execute('select * from settings where value="RecordFilePrefix" and '
                   'hostname=%s;', (gethostname(),))
    row = cursor.fetchone()
    basedir = row['data']
    # Now find all the recordings we have at the moment
    cursor.execute('select title, subtitle, starttime, basename from recorded;')
    for i in range(cursor.rowcount):
      row = cursor.fetchone()
      title = row['title']
      subtitle = row['subtitle']
      if subtitle == '':
        subtitle = str(row['starttime'])
      title = title.replace(' ', '_')
      title = unsafe.sub('', title)
      subtitle = subtitle.replace(' ', '_')
      subtitle = unsafe.sub('', subtitle)
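      extn = row['basename'].split('.')[1]
      # Symlinks are grouped into one directory per title, so make sure it exists
      if not os.path.isdir(title):
        os.mkdir(title)
      os.symlink('%s/%s' %(basedir, row['basename']),
                 '%s/%s.%s' %(title, subtitle, extn))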

This creates a tree of symlinks in the current directory that looks like this:

    $ find . -type l
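
To make the naming scheme a bit more concrete, here is the same sanitising logic pulled out into a stand-alone function which just returns the symlink path. This is a sketch for illustration only: the `link_path` name and the sample recordings are made up, and nothing here talks to a real MythTV database.

```python
import os
import re

# Same idea as the script above: keep only characters which are safe in a
# file name.
UNSAFE = re.compile(r'[^a-zA-Z0-9\-:_]+')


def link_path(title, subtitle, starttime, basename):
    """Return the relative symlink path 'Title/Subtitle.ext' for a recording."""
    # Recordings without a subtitle are named after their start time instead.
    if not subtitle:
        subtitle = str(starttime)
    title = UNSAFE.sub('', title.replace(' ', '_'))
    subtitle = UNSAFE.sub('', subtitle.replace(' ', '_'))
    extension = os.path.splitext(basename)[1]
    return '%s/%s%s' % (title, subtitle, extension)


if __name__ == '__main__':
    # Hypothetical recordings, not real rows from the recorded table.
    print(link_path('Dr. Who', 'The End of Time', '2007-01-01 19:30:00',
                    '1001_20070101193000.mpg'))
    print(link_path('The News', '', '2007-01-02 18:00:00',
                    '1002_20070102180000.nuv'))
```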

Getting Google Talk working with PyXMPP

Jacek Konieczny has written the wholly fantabulous PyXMPP, which implements Jabber clients and servers in Python. Now, Google Talk is a Jabber server, but it needs TLS support before it works. The code is all there, but the echobot example in the download (look in the examples directory) doesn’t show you how. It’s not that hard though — here’s the patch I needed to make it work:

    ---  2005-12-26 07:25:55.000000000 -0800
    +++ 2006-10-25 04:25:02.000000000 -0700
    @@ -13,6 +13,7 @@
     from pyxmpp.all import JID,Iq,Presence,Message,StreamError
     from pyxmpp.jabber.client import JabberClient
    +from pyxmpp import streamtls
     class Client(JabberClient):
         """Simple bot (client) example. Uses `pyxmpp.jabber.client.JabberClient`
    @@ -28,8 +29,12 @@
             # setup client with provided connection information
             # and identity data
    +        tls = streamtls.TLSSettings(require=True, verify_peer=False)
    +        auth = ['sasl:PLAIN']
             JabberClient.__init__(self, jid, password,
    -                disco_name="PyXMPP example: echo bot", disco_type="bot")
    +                disco_name="PyXMPP example: echo bot", disco_type="bot",
    +                tls_settings=tls, auth_methods=auth)
             # register features to be announced via Service Discovery

That makes the __init__ method for the client:

    def __init__(self, jid, password):
        # if bare JID is provided add a resource -- it is required
        if not jid.resource:
            jid=JID(jid.node, jid.domain, "Echobot")
        # setup client with provided connection information
        # and identity data
        tls = streamtls.TLSSettings(require=True, verify_peer=False)
        auth = ['sasl:PLAIN']
        JabberClient.__init__(self, jid, password,
                disco_name="PyXMPP example: echo bot", disco_type="bot",
                tls_settings=tls, auth_methods=auth)
        # register features to be announced via Service Discovery

Now the client works with a gtalk login:

    $ ./ supersecretthingie
    creating client...
    *** State changed: resolving srv (u'', 'xmpp-client') ***
    *** State changed: resolving '' ***
    *** State changed: connecting ('', 5222) ***
    *** State changed: connected ('', 5222) ***
    *** State changed: tls connecting  ***
    *** State changed: tls connected  ***
    *** State changed: fully connected  ***
    *** State changed: authenticated  ***
    *** State changed: binding u'Echobot' ***
    *** State changed: authorized  *** has become available has become available(away): I'm not at my
    desk at work at the moment. This is probably because I'm at a meeting or
    racing electric scooters. If you IM me I will see the message when I get back.
    My roster: "" subscription=both groups=
    Message from received. Body: "Hello there". Type: "chat".

Too easy.

Update: mbot is a Google Talk bot engine built on top of this.