Playing with the python prometheus query API

The last few days have been a bit icky around here, with my house apparently proudly residing in the major city with the dirtiest air in the world. So, I needed a distraction... It has also been quite hot, so I wondered how my energy usage was going. I have prometheus monitoring of my power draw, so now seemed as good a time as any to learn how to do some historical querying over the API. I ended up with a python script which can output things like this: "Yesterday had a maximum temperature of 38 and we used 28.36 kwh. The average for similar days is 25.56 kwh." The code is on github if it is of interest to others. I am sure I could push more of this processing down into the prometheus engine, but I couldn't see how to do it today. Hints welcome!

Continue ReadingPlaying with the python prometheus query API

Prometheus 2.12, query logging, and startup failures on macos

  • Post author:
  • Post category:Prometheus

Prometheus v2.12 added active query logging. The basic idea is that there is a mmaped JSON file that contains all of the queries currently running. If prometheus was to crash, that file would therefore be a list of the queries running at the time of the crash. Overall, not a bad idea. Some friends had recently added prometheus to their development environments. This is wired up to grafana dashboards for their microservices, and prometheus is configured to store 14 days worth of time series data via a persistent volume from the developer desktops. We did this because it is valuable for the developers to be able to see the history of metrics before and after their changes. Now we have a developer using macos as their primary development platform, and since prometheus 2.12 it hasn't worked. Specifically this developer is using parallels to provide the docker virtual machine on his mac. You can summarise the startup for prometheus in the dev environment like this: $ docker run ...stuff... ...snip... level=error ts=2019-09-15T02:20:23.520Z caller=query_logger.go:94 component=activeQueryTracker msg="Failed to mmap" file=/prometheus-data/data/queries.active Attemptedsize=20001 err="invalid argument" panic: Unable to create mmap-ed active query log goroutine 1 [running]: github.com/prometheus/prometheus/promql.NewActiveQueryTracker(0x7fff9917af38, 0x15, 0x14, 0x2a6b7c0, 0xc00003c7e0, 0x2a6b7c0) /app/promql/query_logger.go:112 +0x4d2 main.main()…

Continue ReadingPrometheus 2.12, query logging, and startup failures on macos

A pythonic example of recording metrics about ephemeral scripts with prometheus

  • Post author:
  • Post category:Prometheus

In my previous post we talked about how to record information from short lived scripts (I call them ephemeral scripts by the way) with prometheus. The example there was a script which checked the SMART status of each of the disks in a machine and reported that via pushgateway. I now want to work through a slightly more complicated example. I think you hit the limits of reporting simple values in shell scripts via curl requests fairly quickly. For example with the SMART monitoring script, SMART is capable of returning a whole heap of metrics about the performance of a disk, but we boiled that down to a single "health" value. This is largely because writing a parser for all the other values that smartctl returns would be inefficient and fragile in shell. So for this post, we're going to work through an example of how to report a variety of values from a python script. Those values could be the parsed output of smartctl, but to mix things up a bit, I'm going to use a different script I wrote recently. This new script uses the Weather Underground API to lookup weather stations near my house, and then generate…

Continue ReadingA pythonic example of recording metrics about ephemeral scripts with prometheus

Recording performance information from short lived processes with prometheus

  • Post author:
  • Post category:Prometheus

Now that I'm recording basic statistics about the behavior of my machines, I now want to start tracking some statistics from various scripts I have lying around in cron jobs. In order to make myself sound smarter, I'm going to call these short lived scripts "ephemeral scripts" throughout this document. You're welcome. The promethean way of doing this is to have a relay process. Prometheus really wants to know where to find web servers to learn things from, and my ephemeral scripts are both not permanently around and also not running web servers. Luckily, prometheus has a thing called the pushgateway which is designed to handle this situation. I can run just one of these, and then have all my little scripts just tell it things to add to its metrics. Then prometheus regularly scrapes this one process and learns things about those scripts. Its like a game of Telephone, but for processes really. First off, let's get the pushgateway running. This is basically the same as the node_exporter from last time: $ wget https://github.com/prometheus/pushgateway/releases/download/v0.3.1/pushgateway-0.3.1.linux-386.tar.gz $ tar xvzf pushgateway-0.3.1.linux-386.tar.gz $ cd pushgateway-0.3.1.linux-386 $ ./pushgateway Let's assume once again that we're all adults and did something nicer than that involving configuration…

Continue ReadingRecording performance information from short lived processes with prometheus

Basic prometheus setup

  • Post author:
  • Post category:Prometheus

I've been playing with prometheus for monitoring. It feels quite familiar to me because its based on an internal google technology called borgmon, but I suspect that means it feels really weird to everyone else. The first thing to realize is that everything at google is a web server. Your short lived tool that copies some files around probably runs a web server. All of these web servers have built in URLs which report the progress and status of the task at hand. Prometheus is built to: scrape those web servers; aggregate the data; store the data into a time series database; and then perform dashboarding, trending and alerting on that data. The most basic example is to just export metrics for each machine on my home network. This is the easiest first step, because we don't need to build any software to do this. First off, let's install node_exporter on each machine. node_exporter is the tool which runs a web server to export metrics for each node. Everything in prometheus land is written in go, which is new to me. However, it does make running node exporter easy -- just grab the relevant binary from https://prometheus.io/download/, untar, and run.…

Continue ReadingBasic prometheus setup

End of content

No more pages to load