My candidacy for Kilo Compute PTL

This is mostly historical at this point, but I forgot to post it here when I emailed it a week or so ago. So, for future reference:

I'd like another term as Compute PTL, if you'll have me.

We live in interesting times. OpenStack has clearly gained a large
amount of mind share in the open cloud marketplace, with Nova being a
very commonly deployed component. Yet, we don't have a fantastic
container solution, which is our biggest feature gap at this point.
Worse -- we have a code base with a huge number of bugs filed against
it, an unreliable gate because of subtle bugs in our code and
interactions with other OpenStack code, and have a continued need to
add features to stay relevant. These are hard problems to solve.

Interestingly, I think the solution to these problems calls for a
social approach, much like I argued for in my Juno PTL candidacy
email. The problems we face aren't purely technical -- we need to work
out how to pay down our technical debt without blocking all new
features. We also need to ask for understanding and patience from
those feature authors as we try and improve the foundation they are
building on.

The specifications process we used in Juno helped with these problems,
but one of the things we've learned from the experiment is that not
all changes need a specification. Let's take an approach where trivial
changes (no API changes, only one review to implement) don't require a
specification. There will of course sometimes be variations on that
rule if we discover a change is trickier than it first appeared, but
it means that many micro-features will be unblocked.

In terms of technical debt, I don't personally believe that pulling
all hypervisor drivers out of Nova fixes the problems we face; it just
moves the technical debt to a different repository. However, we
clearly need to discuss the way forward at the summit, and come up
with some sort of plan. If we do something like this, then I am not
sure that the hypervisor driver interface is the right place to do
that work -- I'd rather see something closer to the hypervisor itself
so that the Nova business logic stays with Nova.

Kilo is also the release where we need to get the v2.1 API work done,
now that we finally have a shared vision for how to progress. It took
us a long time to reach that shared vision, so we need to ensure that
we see the work through to the end.

We live in interesting times, but they're exciting as well.

I have since been elected unopposed, so thanks for that!

Review priorities as we approach juno-3

I just sent this email out to openstack-dev, but I am posting it here in case that makes it more discoverable to people drowning in email:

To: openstack-dev
Subject: [nova] Review priorities as we approach juno-3

Hi.

We're rapidly approaching j-3, so I want to remind people of the
current reviews that are high priority. The definition of high
priority I am using here is blueprints that are marked high priority
in Launchpad and that have outstanding code for review -- I am sure there
are other reviews that are important as well, but I want us to try to
land more blueprints than we have so far. These are listed in the
order they appear in Launchpad.

== Compute Manager uses Objects (Juno Work) ==

https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/compute-manager-objects-juno,n,z

This is ongoing work, but if you're after some quick code review
points, they're very easy to review and help push the project forward
in an important manner.

== Move Virt Drivers to use Objects (Juno Work) ==

I couldn't actually find any code out for review for this one apart
from https://review.openstack.org/#/c/94477/ -- is there more out there?

== Add a virt driver for Ironic ==

This one is in progress, but we need to keep going at it or we won't
get it merged in time.

* https://review.openstack.org/#/c/111223/ was approved, but a rebase
ate it. Should be quick to re-approve.
* https://review.openstack.org/#/c/111423/
* https://review.openstack.org/#/c/111425/
* ...there are more reviews in this series, but I'd be super happy to
see even a few reviewed

== Create Scheduler Python Library ==

* https://review.openstack.org/#/c/82778/
* https://review.openstack.org/#/c/104556/

(There are a few abandoned patches in this series; I think those two
are the active ones, but please correct me if I am wrong.)

== VMware: spawn refactor ==

* https://review.openstack.org/#/c/104145/
* https://review.openstack.org/#/c/104147/ (Dan Smith's -2 on this one
seems procedural to me)
* https://review.openstack.org/#/c/105738/
* ...another chain with many more patches to review

Thanks,
Michael

The actual email thread is at http://lists.openstack.org/pipermail/openstack-dev/2014-August/043098.html.

Expectations of core reviewers

One of the action items from the nova midcycle was that I was asked to
make nova's expectations of core reviewers clearer. This blog post is an
attempt at that.

Nova expects a minimum level of sustained code reviews from cores. In
the past this has generally been held to be on the order of two code
reviews a day, which is a pretty low bar compared to the review
workload of many cores. I feel that existing cores understand this
requirement well, and I am mostly stating it here for completeness.

Additionally, there are increasing levels of concern that cores need to
be on the same page about the criteria we hold code to, as well as the
overall direction of nova. While the weekly meetings help here, it was
agreed that summit attendance is really important to cores. It's the
way we decide where we’re going for the next cycle, as well as a
chance to make sure that people are all pulling in the same direction
and trust each other.

There is also a strong preference for midcycle meetup attendance,
although I understand that can sometimes be hard to arrange. My stance
is that I'd like cores to try to attend, but I understand that
sometimes people will miss one. In response to the increasing
importance of midcycles over time, I commit to trying to get the dates
for these events announced further in advance.

Given that we consider these physical events so important, I’d like
people to let me know if they have travel funding issues. I can then
approach the Foundation about funding travel if that is required.

Thoughts from the PTL

I sent this through to the openstack-dev mailing list (you can see the thread here), but I want to put it here as well for people who don’t actively follow the mailing list.

First off, thanks for electing me as the Nova PTL for Juno. I find the
outcome of the election both flattering and daunting. I'd like to
thank Dan and John for running as PTL candidates as well -- I strongly
believe that a solid democratic process is part of what makes
OpenStack so successful, and that isn't possible without people being
willing to stand up during the election cycle.

I'm hoping to send out regular emails to this list with my thoughts
about our current position in the release process. It's early in the
cycle, so the ideas here aren't fully formed yet -- however I'd rather
get feedback early and often, in case I'm off on the wrong path. What
am I thinking about at the moment? The following things:

* a midcycle meetup. I think the Icehouse meetup was a great success,
and I'd like to see us do this again in Juno. I'd also like to get the
location and venue nailed down as early as possible, so that people
who have complex travel approval processes have a chance to get travel
sorted out. I think it's pretty much a foregone conclusion this meetup
will be somewhere in the continental US. If you're interested in
hosting a meetup in approximately August, please mail me privately so
we can chat.

* specs review. The new blueprint process is a work of genius, and I
think it's already working better than what we've had in previous
releases. However, there are a lot of blueprints there in review, and
we need to focus on making sure these get looked at sooner rather than
later. I'd especially like to encourage operators to take a look at
blueprints relevant to their interests. Phil Day from HP has been
doing a really good job at this, and I'd like to see more of it.

* I promised to look at mentoring newcomers. The first step there is
working out how to identify what newcomers to mentor, and who mentors
them. There's not a lot of point in mentoring someone who writes a
single drive-by patch, so working out who to invest in isn't as
obvious as it might seem at first. Discussing this process for
identifying mentoring targets is a good candidate for a summit
session, so have a ponder. However, if you have ideas let's get
talking about them now instead of waiting for the summit.

* summit session proposals. The deadline for proposing summit sessions
for Nova is April 20, which means we only have a little under a week
to get that done. So, if you're sitting on a summit session proposal,
now is the time to get it in.

* business as usual. We also need to find the time for bug fix code
review, blueprint implementation code review, bug triage and so forth.
Personally, I'm going to focus on bug fix code review more than I have
in the past. I'd like to see cores spend 50% of their code review time
reviewing bug fixes, to make the Juno release as solid as possible.
However, I don't intend to enforce that; it's just me asking real nice.

Thanks for taking the time to read this email, and please do let me
know if you think this sort of communication is useful.

Juno Nova PTL Candidacy

This is a repost of an email to the openstack-dev list, which is mostly here for historical reasons.

Hi.

I would like to run for the OpenStack Compute PTL position as well.

I have been an active nova developer since late 2011, and have been a
core reviewer for quite a while. I am currently serving on the
Technical Committee, where I have recently been spending my time
liaising with the board about how to define what software should be
able to use the OpenStack trademark. I've also served on the
vulnerability management team, and as nova bug czar in the past.

I have extensive experience running Open Source community groups,
having served on the TC, been the Director for linux.conf.au 2013, as
well as serving on the boards of various community groups over the
years.

In Icehouse I hired a team of nine software engineers who are all
working 100% on OpenStack at Rackspace Australia, developed and
deployed the turbo hipster third party CI system along with Joshua
Hesketh, as well as writing nova code. I recognize that if I am
successful I will need to rearrange my work responsibilities, and my
management is supportive of that.

The future
----------

To be honest, I've thought for a while that the PTL role in OpenStack
is poorly named. Specifically, it's the T that bothers me. Sure, we
need strong technical direction for our programs, but putting it in
the title raises technical direction above the other aspects of the
job. Compute at the moment is in an interesting position -- we're
actually pretty good on technical direction and we're doing
interesting things. What we're not doing well on is the social aspects
of the PTL role.

When I first started hacking on nova I came from an operations
background where I hadn't written open source code in quite a while. I
feel like I'm reasonably smart, but nova was certainly the largest
python project I'd ever seen. I submitted my first patch, and it was
rejected -- as it should have been. However, Vishy then took the time
to sit down with me and chat about what needed to change, and how to
improve the patch. That's really why I'm still involved with
OpenStack: Vishy took an interest and was always happy to chat. I'm
told by others that they have had similar experiences.

I think that's what compute is lacking at the moment. For the last few
cycles we've focused on the technical, and now the social aspects are
our biggest problem. I think this is a pendulum, and perhaps in a
release or two we'll swing back to needing to re-emphasise
technical aspects, but for now we're doing poorly on social things.
Some examples:

- we're not keeping up with code reviews because we're reviewing the
wrong things. We have a high volume of patches which are unlikely to
ever land, and we just reject them. So far in the Icehouse cycle we've
seen 2,334 patchsets proposed, of which we approved 1,233. Along the
way, we needed to review 11,747 revisions. We don't spend enough time
working with the proposers to improve the quality of their code so
that it will land. Specifically, whilst review comments in gerrit are
helpful, we need to identify up and coming contributors and help them
build a relationship with a mentor outside gerrit. We can reduce the
number of reviews we need to do by improving the quality of initial
proposals.

- we're not keeping up with bug triage or, worse, with actually closing
bugs. I think part of this is that people want to land their features,
but part of it is also that closing bugs is super frustrating at the
moment. It can take hours (or days) to replicate and then diagnose a
bug. You propose a fix, and then it takes weeks to get reviewed. I'd
like to see us tweak the code review process to prioritise bug fixes
over new features for the Juno cycle. We should still land features,
but we should obsessively track review latency for bug fixes. Compute
fails if we're not producing reliable production grade code.

- I'd like to see us focus more on consensus building. We're a team
after all, and when we argue solely about the technical aspects of a
problem, we ignore the fact that we're teaching the people involved a
behaviour that will continue on. Ultimately if we're not a welcoming
project that people want to code on, we'll run out of developers. I
personally want to be working on compute in five years, and I want the
compute of the future to be a vibrant, friendly, supportive place. We
get there by modelling the behaviour we want to see in the future.
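As an aside, the Icehouse statistics in the first example above are easy to tally once you have the change records in hand. This is a minimal sketch with made-up toy data loosely shaped like Gerrit change records; it is not the script or the data behind the real numbers:

```python
# Hypothetical sketch of the bookkeeping behind cycle statistics like
# those quoted above. The records below are illustrative toy data,
# not real Gerrit output.

def summarise_cycle(changes):
    """Tally proposed changes, approved changes, and total revisions."""
    proposed = len(changes)
    approved = sum(1 for c in changes if c["status"] == "MERGED")
    revisions = sum(c["revisions"] for c in changes)
    return {"proposed": proposed, "approved": approved,
            "revisions": revisions}

# Toy data standing in for a cycle's worth of changes.
sample = [
    {"status": "MERGED", "revisions": 5},
    {"status": "ABANDONED", "revisions": 2},
    {"status": "MERGED", "revisions": 9},
]
print(summarise_cycle(sample))
# -> {'proposed': 3, 'approved': 2, 'revisions': 16}
```

The real numbers would come from querying Gerrit for all changes in the cycle window and feeding them through something like this.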

So, some specific actions I think we should take:

- when we reject a review from a relatively new contributor, we should
try and pair them up with a more experienced developer to get some
coaching. That experienced dev should take point on code reviews for
the new person so that they receive low-latency feedback as they
learn. Once the experienced dev is ok with a review, nova-core can
pile on to actually get the code approved. This will reduce the
workload for nova-core (we're only reviewing things which are of a
known good standard), while improving the experience for new
contributors.

- we should obsessively track review performance for bug fixes, and
prioritise them where possible. Let's not ignore features, but let's
agree that each core should spend at least 50% of their review time
reviewing bug fixes.

- we should work on consensus building, and tracking the progress of
large blueprints. We should not wait until the end of the cycle to
re-assess the v3 API and discover we have concerns. We should be
talking about progress in the weekly meetings and making sure we're
all on the same page. Let's reduce the level of surprise. This also
flows into being clearer about the types of patches we don't want to
see proposed -- for example, if we think that patches that only change
whitespace are a bad idea, then let's document that somewhere so
people know before they put a lot of effort in.
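To make the second action concrete, tracking review latency for bug fixes only needs proposal and merge timestamps per fix. This is an illustrative sketch with toy dates, not real review history or an agreed metric:

```python
# Hypothetical sketch of tracking review latency for bug fixes.
# The timestamps below are made-up illustrative values.
from datetime import datetime
from statistics import median

def review_latency_days(fixes):
    """Median days from proposal to merge for a list of bug fixes."""
    deltas = [
        (datetime.fromisoformat(f["merged"]) -
         datetime.fromisoformat(f["proposed"])).days
        for f in fixes
    ]
    return median(deltas)

sample_fixes = [
    {"proposed": "2014-03-01", "merged": "2014-03-15"},  # 14 days
    {"proposed": "2014-03-02", "merged": "2014-03-05"},  # 3 days
    {"proposed": "2014-03-03", "merged": "2014-04-02"},  # 30 days
]
print(review_latency_days(sample_fixes))  # -> 14
```

Obsessive tracking then just means computing this per week and watching the trend, rather than discovering a month-long backlog at release time.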

Thanks for taking the time to read this email!