With the BBC going all Stargazing Live, I thought I would start this blog by sharing some of my space-based Agile and Kanban thoughts. Being an amateur astronomer, and someone who naturally seeks out metaphor to help understanding, I do like to draw on the universe to help me explain things.
So how did we get to the moon? Well, first of all there was the space race. NASA started behind the USSR space programme, which already had satellites and put Yuri Gagarin into space first, but NASA caught up and overtook it. How did they approach their mission? One mantra captures what NASA is about for me:
"Do one thing at a time, with supreme excellence."
For me, this embodies the spirit of Kanban in a way that nothing else ever has. Let me break it down.
Do one thing at a time
Focus on the finishing. Put your effort into releasing your value. All of these things really boil down to Limit your Work In Progress. We have Little's Law to show us mathematically why limiting your WIP is the most effective thing you can do to affect your Lead Time and your Throughput, but I've never seen such a big, government-funded organisation be so single-minded in what it did as NASA was (and is).
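As a rough illustration of the Little's Law point, here is a minimal Python sketch. It assumes a stable system where the law holds on averages (average lead time = average WIP / average throughput). The team, the throughput of 5 items per week and the WIP levels are made-up numbers for illustration, not figures from NASA or any real team.

```python
def average_lead_time(wip: float, throughput: float) -> float:
    """Little's Law: average lead time = average WIP / average throughput."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return wip / throughput


# Hypothetical team that finishes 5 items per week on average.
THROUGHPUT_PER_WEEK = 5.0

# Same throughput, three different WIP levels (all invented for illustration).
for wip in (20, 10, 5):
    weeks = average_lead_time(wip, THROUGHPUT_PER_WEEK)
    print(f"WIP {wip:>2} -> average lead time {weeks:.1f} weeks")
```

With throughput held constant, halving the work in progress halves the average lead time, which is the mathematical heart of "do one thing at a time".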
They started off with a very simple high level backlog of things to do.
- get a satellite into orbit
- get a living thing into orbit
- get a human into orbit and back
- conduct a space walk
- dock 2 spacecraft in orbit
- put man on the moon
Then they worked out that the back of the list was so far away with so many unknowns that they focused on the value they could deliver, and called it the Mercury Programme, which stopped at getting man into orbit.
They then finished the Mercury programme. And that is all they worked on: the whole of NASA finished Mercury and nothing else. Do one thing at a time.
Even now, NASA no longer works directly on the ISS mission; they use subcontracted private space companies and the Russian space programme to keep that running for them. Now they are focused on getting to Mars. How about the NASA JPL (Jet Propulsion Laboratory), which is in effect a different organisation? Well, it did Hubble ST, then the Mars Rovers, then Mars Curiosity, and now it is focusing on the James Webb Space Telescope. Yes, some missions run for years (look at Voyager 2 - over 14 light hours away from Earth and launched in 1977 - you can even follow it on Twitter) but their FOCUS is on one thing at a time. Getting things Launched has a slightly more immediate meaning at NASA.
With Supreme Excellence
This sounds simple, and indeed it is. If you are going to put humans on top of, or in close proximity to, rocket fuel and liquid oxygen, and measure the amounts of those substances in thousands of tonnes, then you really want to minimise your bugs. In the world of Rocket Science, things seem to either go well or go catastrophically badly, so you must ensure quality.
In Software Engineering, we say the most expensive time to fix bugs is when they are live, and the earlier we catch them, the cheaper they are. The cheapest time to fix a bug is actually before you write it (pairing sceptics take note).
In space missions, it is often impossible to fix a bug once it has gone live and been found the hard way, as evidenced by the price paid by the crews of Columbia, Challenger and Apollo 1, and very nearly Apollo 13. Do you remember the cost of fixing the initial Hubble Space Telescope focus problem, and the media furore about that bug? If they have similar issues on the James Webb Space Telescope, there is no way to fix it. Humans have never been as far away from Earth as JWST will be, so once it's there, it cannot be touched by human hand again.
So focus on the quality. I remember my first read-through of David J Anderson's Kanban book, and being somewhat sceptical about that being the place he suggested readers start.
Then I walked into my first true Kanban implementation at YouView TV, and saw the sheer number of bugs my team was generating. Looking back, it would have been easy to say that the team's primary output was bugs and its secondary output was software. Harsh, and probably not really true, but it makes the point figuratively.
So I duly did focus on the quality with the team, and it made a huge difference. I'll probably do a blog on that separately in the future so I'll avoid too many spoilers, but I did track back one small bug. It took only 3 hours to write the whole feature, and if it had taken 3.5 hours, the bug wouldn't have existed at all. Six months later, it took 3 days to fix the bug. The original developer had left, the rest of the system had moved on, and the cost of (re)working it all out in order to fix the bug rose to 3 days. And of course the downstream integration and system test team also had to spend more effort retesting the bug fix, so there is further hidden cost downstream. What could have cost under 30 minutes of dev time actually cost 3 days of dev time plus some other unquantified QA time.
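To put a rough number on that one bug, here is a back-of-the-envelope sketch in Python. The 8-hour working day is my assumption for illustration (I didn't record the real figure), and the unquantified QA retesting time is left out, so the true multiplier would be higher.

```python
HOURS_PER_DAY = 8  # assumption: length of a working day, not a recorded figure

# Extra effort that would have prevented the bug: 3.5 hours instead of 3.
cost_up_front_hours = 3.5 - 3.0

# Dev effort actually spent fixing it six months later: 3 days.
cost_later_hours = 3 * HOURS_PER_DAY


def cost_multiplier(later_hours: float, up_front_hours: float) -> float:
    """How many times more expensive the late fix was than the early one."""
    return later_hours / up_front_hours


print(cost_multiplier(cost_later_hours, cost_up_front_hours))  # 48.0
```

Under that assumption, the late fix cost about 48 times what doing it properly the first time would have, before counting any QA effort.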
In my career to date, I've never heard an operations / support team complain that the quality of the code delivered was too high, or the business complain that we were spending too little of our development capability on live support and bug fix, or had too little downtime on live systems due to bugs.
People often say that high quality is aspirational, or too expensive. They might even be bright people saying it. It doesn't stop them being wrong. I'm sure Albert Einstein must have come up with some right nonsense from time to time. Let me explain:
Truth be told, Quality IS expensive, and the cost is front and centre where everyone can see it and point at it.
However, developing with low quality is MUCH MORE expensive overall, but the cost tends to be hidden, and late in the day, so it can appear to be cheaper up front.
It is always more expensive to fix things later. We all know that intrinsically. Do you top up the engine oil on your car when it needs it, or wait for the engine to blow up? Do you replace your tyres when they are getting low tread, or wait for a blow out at 70mph on the M3? Do you fix a dodgy bit of brickwork on your house, or wait until it falls down? Should you fix the bug now 'in sprint', or wait until the website goes down right in the middle of the peak sale that accounts for 40% of the company revenue for the year? Do you refactor the basket component as you work, or just leave it until the point that no-one working in the company understands it any more so the risk of making a small change means you have to rewrite the entire basket component, and the interfaces it has to all of the other systems?
Do you build your rocket with a high quality approach, or just hope it doesn't blow up on the launch pad?
NASA choose to focus on quality, but still have accidents. Imagine if they didn't focus on quality…
It wouldn't be as catchy to say Limit your WIP and focus on quality, but that is the essence of the message. Do 1 thing and do it as well as you can also works. If it's important enough to spend money on doing, then don't do anything else, and do it as well as you can. I like to think everything in Agile & Kanban is either plain, simple common sense, or counterintuitive but still common sense.
I've only talked about costs and throughput in this blog, and I've deliberately stopped myself from 'going off on one' about intrinsic motivation amongst the team, craftsmanship and so on. I probably will at some time in the future, but in short, here are the hard facts. When YouView focused on throughput, my team had 18 'funded heads', but we could only maintain 13 real developers at any given time, as every time we found a good one to hire, another left. When we had a quality and craftsmanship focus, we had the same 18 funded heads, and we had 18 people working there. Throughput also went from '8' to '13', so we had happier people and got more done, as well as achieving higher quality. If you love throughput, forget the throughput and focus on the quality instead. Counterintuitive, but common sense nonetheless.
If you want to discuss anything about this, please catch me on Twitter @KanbanDan