# Wednesday, 27 February 2013

If you know me, you'll know I'm a fan of both Scrum and Kanban, and get irritated by those who see them as opposing forces. So this post is definitely not me having a pop at Scrum. However, I do have a problem with the term "Sprint". You'll also know I love a good metaphor.

As dev teams we often focus on running at full speed, trying to do more, faster. That sounds like a good way to eliminate waste and get things done as quickly as possible, but let's have a proper think about it...

A lot of what we are asked to build are big things, and this is where the problem starts for me. 

Let's get to the metaphor. Sprinting comes from athletics, so let's focus on the big one - the 100m sprint. Look at the photo below: the winners of the 100m sprint and the 10,000m at last year's Olympics in London.

Mo Farah and Usain Bolt 2012 Olympics (cropped)

In the 100m, the athletes barely take a breath; it is a largely anaerobic activity. Ask an athlete to try to run the 10,000m that way, and you have a doomed project. It is possible for a human being to run 100m in under 10 seconds, but it doesn't follow that you can run 10,000m in under 1,000 seconds. The world record for 10,000m is 1,577 seconds, which is nearly 58% longer. I suspect that Usain Bolt has never run a 10,000m track race, and if he did, I think it is fair to say he would be significantly slower than 1,577s, and it's a good bet that he would not run the first 100m in 10s. That pace is the definition of unsustainable pace.
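The arithmetic behind that comparison can be checked with a short sketch. It uses only the two figures quoted above (a 10 second 100m and a 1,577 second 10,000m record); the function and variable names are purely illustrative.

```python
# Figures taken from the post: a 10 s 100m and the 1577 s 10,000m world record.
SPRINT_DISTANCE_M = 100
SPRINT_TIME_S = 10.0
LONG_DISTANCE_M = 10_000
LONG_RECORD_S = 1577.0


def seconds_per_100m(distance_m: float, time_s: float) -> float:
    """Average time taken to cover each 100m of a race."""
    return time_s / (distance_m / 100)


# Time for 10,000m if sprint pace could somehow be held the whole way.
extrapolated_s = SPRINT_TIME_S * (LONG_DISTANCE_M / SPRINT_DISTANCE_M)

# How much longer the real record is than that naive extrapolation.
slowdown = LONG_RECORD_S / extrapolated_s - 1

# The pace the record holder actually sustained.
record_pace = seconds_per_100m(LONG_DISTANCE_M, LONG_RECORD_S)

print(f"Extrapolated from sprint pace: {extrapolated_s:.0f} s")
print(f"Actual record: {LONG_RECORD_S:.0f} s ({slowdown:.0%} longer)")
print(f"Sustainable pace: {record_pace:.2f} s per 100m")
```

The sustainable pace works out at a little under 16 seconds per 100m, held for all one hundred of them, which is the point of the metaphor.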

If you have a software development team approaching their work as a series of individual 'out and out' sprints, where they go as fast as they can to deliver software without thinking about the long game, the quality, and the technical debt they are accruing, then they are trying to run a 10,000m like a series of Usain Bolt 10s sprints. Mo Farah might take longer over the first few 100m, but he will run the whole 10,000m distance much faster than Usain could.

All too often we focus on the 'how fast' and forget about the 'how well', but I'm prepared to say that the same people who push us to sprint as fast as we can, would never back Usain over Mo in a 10,000m race. 

The risk of using metaphors is that you stumble into someone else's metaphor. In this case the Scrum Sprint metaphor. I've already stated I'm a fan of Scrum, but Scrum is often implemented badly. As a coach I've had to walk into organisations and help pick up the pieces, so I know this to be true.  If you treat your Scrum Sprints like a series of 100m races, and go on like that for 100 sprints, you will have accrued so much technical debt and a bug list so long that you will no longer be able to deliver any value work in a reasonable timeframe again.

Most of your delivery capability will be tied up dealing with failure demand. A good Scrum implementation will acknowledge this, and therefore ensure quality is front and centre where it should be. It will spend time every sprint keeping the bug list down, and use refactoring and 'boy scouting' to keep the technical debt down. This means not going as fast as you can out of the blocks doing only new feature work, but it also means that more value work is delivered over the time you're delivering your MMF, and that the pace is sustainable.

A good Scrum team or Agile team must be able to maintain its pace indefinitely. In my metaphor, athletic runners are either 'Sprinters', 'Middle Distance runners' or 'Long Distance runners', and the same long distance runners often compete in the 3,000m, 5,000m, 10,000m, Half Marathon and Marathon races. They are showing the ability to use their sustainable running pace on varying lengths of race.

The long distance runner seems to be a better metaphor for most of us to follow than the sprinter. There are situations where the 'out and out' sprint is the right approach - I've heard tell of banking apps that are built very quickly for a single niche purpose, used for a day or so, then discarded. That would suit the 'out and out' sprint as the tech debt and bugs are thrown away with the code, then the team starts afresh. 

So people - let's make sure we are honest with ourselves and our stakeholders about the kind of work we are undertaking, and then approach it with a view to successfully finishing the kind of race we are running.



Tags: Agile | Kanban | Lean | Quality

Wednesday, 27 February 2013 11:29:38 (GMT Standard Time, UTC+00:00)  #   


# Wednesday, 16 January 2013

How did we get to the moon - Part 2

Here's a daft question: which mission first landed humans on the moon? Apollo 11 of course, but let's think about that. It wasn't Apollo 1, it was Apollo 11. What on earth were the other 10 missions, costing millions of dollars, for then?

Apollo 11 footprint

Well, I've had a good think about it, and this is another aspect I find admirable about NASA's approach. As I mentioned in my last post, they had a backlog, took achievable chunks from it, and finished them before moving on to the next.

Mercury was all about getting manned spaceflight, Gemini was what NASA called "a bridge to the moon" - to work out how Apollo could be achieved, and Apollo was about getting to the moon. 

So let's start at Gemini. The mission goals were:

  • To subject man and equipment to space flight up to two weeks in duration.
  • To rendezvous and dock with orbiting vehicles and to maneuver the docked combination by using the target vehicle's propulsion system;
  • To perfect methods of entering the atmosphere and landing at a preselected point on land.

(source NASA)

All of these were the perceived steps needed to prove that a mission to the moon was possible. In a very agile way in 1964, NASA worked out that the original goal to land on land was not needed after all, so the mission was brought to a close early. 

So what did they get from Gemini? They got:

Validated Learning

They got the Validated Learning by delivering real missions. How often have you come across IT projects in a 'proof of concept' phase that deliver no real working value? The team gets to have a whale of a time playing with code that "we don't need to QA because it will never go into production", with the aim of learning how to do the real work, only to find out a month later, when the real work starts, that most of what they learned was invalid and that the reference architecture they built was a bit Mickey Mouse.

Proof Of Concept still means going live

Freeflyer nasa

Not NASA, they built real rockets, with real spacecraft on top, put real people into them and conducted real work that went into production. I can't imagine the rocket scientists saying, "we didn't bother testing the space suit systems properly, because this isn't a production mission to the moon, it's just a proof of concept." The astronauts were going out into the vacuum of space, so everything had to work with quality. 

Gemini had 10 manned space missions in about 18 months. They turned around their iterations quickly in manageable chunks, with the first 4 missions going from 4 hours to 4 days, to 8 days, to 14 days. Goal 1, two week manned spaceflight, achieved; move on to goal 2, docking. Interestingly, part of the 14 day mission was a subgoal attempting to dock craft, which failed. A lesson in limiting WIP for NASA right there ;-)

Keep the team together

The Gemini missions were piloted by names you will find very familiar - Armstrong, Aldrin, Collins, Lovell et al. Names you know from famous Apollo missions (11 and 13 in this case). In other words, they had another Agile principle at heart. They built a great team, finished the mission, then kept the team together and brought the next mission to them.

How often do we put together a team for a project then disband it, and put together a new team for a new project? Good agile has teams staying together and work is brought to the high performance team as it completes other work. 

On to Apollo

Similarly with Apollo, each mission was designed to gain validated learning. 

As13 nasa

Apollo 1 was a disaster with a fire in the cabin killing all of the crew, which led to changes in the cabin design, and better escape systems. A harsh lesson that led to important learnings for the rest of space flight. As with most projects that go wrong, the lessons we learn tend to be the most important ones. 

Apollo 8 was the first mission around the moon - the furthest from Earth that humans had ever travelled up to that point.

Apollo 9 was the first time the Lunar Module was tested separately from the Command Module, launching, manoeuvring, and docking in space. 

Apollo 10 was everything up to but not including lunar landing. They launched the Lunar Module and went down to within 9 miles of the surface of the moon, then came back up, docked and came home. 

Apollo 11 was the biggie. Neil may have fluffed his line ( "one small step for a man," is how it was supposed to be) but mankind was on the moon for the first time. 

Apollo 12 onwards was about finding out how man could spend longer time on the moon, and potentially set up a base there in the future - hence the lunar rover etc. 

Each mission built on the learnings of the previous mission, adding another valuable increment of validated learning, until the final objective was achieved. 

EugeneCernanOnMoon

This equates to a project releasing valuable increments and evolving through the validated learning gained until you have a Minimum Viable Product to launch to market. Gemini and Apollo 1 through 10 gave the increments and validated learning so that Apollo 11 could happen, and that progressed throughout the programme to the final mission to the moon (40 years ago). 

One Final Thought For Mankind

Can you imagine the Big Bang waterfall approach as applied to the Apollo programme? Imagine if Gemini 1 had to go and land man on the moon with no missions in-between. Do you think it would ever have even taken off? I suspect it would have been cancelled after 15 years of theoretical work, and been an abject failure. They just didn't know all of the things they needed to learn up front. It was only by launching the missions that they gained the Validated Learning that made the objective possible. You don't learn everything you need in a theoretical POC phase; you learn by doing things for real. NASA was doing this on the highest profile project in the world in the 1960s. Finishing one small step at a time.

 



Tags: Agile | Kanban | Lean | LKU | NASA | WIP

Wednesday, 16 January 2013 11:22:06 (GMT Standard Time, UTC+00:00)  #   


# Wednesday, 09 January 2013

With the BBC going all Stargazing Live, I thought I would start this blog by sharing some of my space based Agile and Kanban thoughts. Being an amateur astronomer, and someone who naturally seeks out metaphor to help understanding, I do like to draw on the universe to help me explain things.

SaturnVSeparation

So how did we get to the moon? Well, first of all there was the space race. NASA started behind the USSR space programme, which already had satellites and put Yuri Gagarin into space first, but NASA caught up and overtook it. How did they approach their mission? One mantra captures what NASA is about for me:

"Do one thing at a time, with supreme excellence."

For me, this embodies the spirit of Kanban in a way that nothing else ever has. Let me break it down.

Do one thing at a time

Focus on the finishing. Put your effort into releasing your value. All of these things really boil down to Limit your Work In Progress. We have Little's Law to show us mathematically why Limiting your WIP is the most effective thing you can do to affect your Lead Time and your Throughput, but I've never seen such a big and government funded organisation be so single minded in what it did as NASA was (and is).
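As a rough illustration of the Little's Law point, here is a minimal sketch. It assumes a stable system measured over long-run averages and a throughput that stays constant as WIP changes, which is a simplification; the team size and numbers are hypothetical, not taken from any real team.

```python
def average_lead_time(wip: float, throughput: float) -> float:
    """Little's Law: average lead time = average WIP / average throughput."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return wip / throughput


# Hypothetical team that finishes 5 items per week on average.
THROUGHPUT_PER_WEEK = 5.0

# Same throughput, three different amounts of work in progress.
for wip in (20, 10, 5):
    weeks = average_lead_time(wip, THROUGHPUT_PER_WEEK)
    print(f"WIP {wip:>2} items -> average lead time {weeks:.1f} weeks")
```

With throughput held steady, halving the work in progress halves the average time each item spends in the system, which is why limiting WIP shows up so directly in lead time.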

They started off with a very simple high level backlog of things to do.

  • get a satellite into orbit
  • get a living thing into orbit
  • get a human into orbit and back
  • conduct a space walk
  • dock 2 spacecraft in orbit
  • put man on the moon

Then they worked out that the back of the list was so far away with so many unknowns that they focused on the value they could deliver, and called it the Mercury Programme, which stopped at getting man into orbit.

They then finished the Mercury programme. And that is all they worked on, the whole of NASA finished Mercury and nothing else. Do one thing at a time.

Even now, NASA no longer works directly on the ISS mission; they use subcontracted private space companies and the Russian space programme to keep that running for them. Now they are focused on getting to Mars. How about the NASA JPL (Jet Propulsion Laboratory), which is in effect a different organisation? Well, it did Hubble ST, then the Mars Rovers, then Mars Curiosity, and now it is focusing on the James Webb Space Telescope. Yes, some missions run for years (look at Voyager 2 - over 14 light hours away from Earth and launched in 1977 - you can even follow it on Twitter here) but their FOCUS is on one thing at a time. Getting things Launched has a slightly more immediate meaning at NASA.

With Supreme Excellence

This sounds simple, and indeed it is. If you are going to put humans on top of, or in close proximity to, rocket fuel and liquid oxygen, and measure the amounts of those substances in thousands of tonnes, then you really want to minimise your bugs. In the world of Rocket Science, things seem to either go well or go catastrophically badly, so you must ensure quality.

In Software Engineering, we say the most expensive time to fix bugs is when they are live, and the earlier we catch them, the cheaper they are. The cheapest time to fix a bug is actually before you write it (pairing sceptics take note).

In space missions, it is often impossible to fix a bug once it has gone live and been found the hard way, as evidenced by the price paid by the crews of Columbia, Challenger and Apollo 1, and very nearly Apollo 13. Do you remember the cost of fixing the initial Hubble Space Telescope focus problem, and the media furore about that bug? If they have similar issues on the James Webb Space Telescope, there is no way to fix it. Humans have never been as far away from Earth as JWST will be, so once it's there, it cannot be touched by human hand again.

So focus on the quality. I remember my first read through of David J Anderson's Kanban book, and being somewhat sceptical about that being the place he suggested readers start.

Then I walked into my first true Kanban implementation at YouView TV, and saw the sheer number of bugs my team was generating. Looking back, it would have been easy to say that the team's primary output was bugs and its secondary output was software. Harsh, and probably not really true, but it makes the point figuratively.

So I duly did focus on the quality with the team, and it made a huge difference. I'll probably do a blog on that separately in the future so I'll avoid too many spoilers, but I did track back one small bug. It took only 3 hours to write the whole feature, and if it had taken 3.5 hours, the bug wouldn't have existed at all. 6 months later, it took 3 days to fix the bug. The original developer had left, the rest of the system had moved on, and the cost of (re)working it all out in order to fix the bug rose to 3 days. And of course the downstream integration and system test team also had to spend more effort retesting the bug fix, so there is further hidden cost downstream. What could have cost about 30 minutes of dev time actually cost 3 days of dev time plus some other unquantified QA time.
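To put a number on that, here is a back-of-the-envelope sketch using the figures from the story. The 8 hour dev day is an assumption (the length of a dev day isn't stated), and the unquantified QA time is left out, so the real multiplier would be higher.

```python
HOURS_PER_DEV_DAY = 8  # assumption: the length of a dev day is not stated

# Figures from the story above.
feature_hours_actual = 3.0    # time spent writing the feature
feature_hours_careful = 3.5   # time that would have avoided the bug
fix_days_later = 3            # dev time to fix it six months on

# Extra time it would have taken to avoid the bug in the first place.
prevention_hours = feature_hours_careful - feature_hours_actual

# Dev time actually spent fixing it later, excluding QA retesting.
late_fix_hours = fix_days_later * HOURS_PER_DEV_DAY

cost_multiplier = late_fix_hours / prevention_hours

print(f"Prevention: {prevention_hours} h, late fix: {late_fix_hours} h")
print(f"Fixing late cost about {cost_multiplier:.0f}x more dev time")
```

Under that assumption the late fix cost roughly 48 times the dev time that prevention would have, before counting any of the retesting.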

In my career to date, I've never heard an operations / support team complain that the quality of the code delivered was too high, or the business complain that we were spending too little of our development capability on live support and bug fix, or had too little downtime on live systems due to bugs.

People often say that high quality is aspirational, or too expensive. They might even be bright people saying it. It doesn't stop them being wrong. I'm sure Albert Einstein must have come up with some right nonsense from time to time. Let me explain:

Truth be told Quality IS expensive, and the cost is front and centre where everyone can see it and point at it.

However, developing with low quality is MUCH MORE expensive overall, but the cost tends to be hidden, and late in the day, so it can appear to be cheaper up front.

It is always more expensive to fix things later. We all know that intrinsically. Do you top up the engine oil on your car when it needs it, or wait for the engine to blow up? Do you replace your tyres when they are getting low tread, or wait for a blow out at 70mph on the M3? Do you fix a dodgy bit of brickwork on your house, or wait until it falls down? Should you fix the bug now 'in sprint', or wait until the website goes down right in the middle of the peak sale that accounts for 40% of the company revenue for the year? Do you refactor the basket component as you work, or just leave it until the point that no-one working in the company understands it any more so the risk of making a small change means you have to rewrite the entire basket component, and the interfaces it has to all of the other systems?

Do you build your rocket with a high quality approach, or just hope it doesn't blow up on the launch pad?

NASA choose to focus on quality, but still have accidents. Imagine if they didn't focus on quality…

It wouldn't be as catchy to say "Limit your WIP and focus on quality", but that is the essence of the message. "Do 1 thing and do it as well as you can" also works. If it's important enough to spend money on doing, then don't do anything else, and do it as well as you can. I like to think everything in Agile & Kanban is either plain simple common sense, or counter intuitive but still common sense.

I've only talked about costs and throughput in this blog, and I've deliberately stopped myself from 'going off on one' about intrinsic motivation amongst the team, craftsmanship, etc. I probably will at some time in the future, but in short, here are the hard facts. When YouView focused on throughput, my team had 18 'funded heads', but we could only maintain 13 real developers at any given time, as every time we found a good one to hire in, another left. When we had a quality craftsmanship focus, we had the same 18 funded heads, and we had 18 people working there. Throughput also went from '8' to '13', so we had happier people and got more done, as well as the higher quality. If you love throughput, forget the throughput and focus on the quality instead. Counter intuitive, but common sense nonetheless.
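For anyone wondering whether the extra throughput was just down to the extra people, a quick sketch with the figures above suggests not. The throughput unit isn't stated, so only the ratios mean anything here, and treating the headcounts as exactly 13 and 18 throughout each period is a simplifying assumption.

```python
# Figures from the post; the throughput unit is not stated there.
before = {"developers": 13, "throughput": 8}
after = {"developers": 18, "throughput": 13}


def per_developer(period: dict) -> float:
    """Throughput divided by the number of developers actually on the team."""
    return period["throughput"] / period["developers"]


# Change in what the whole team delivered.
total_gain = after["throughput"] / before["throughput"] - 1

# Change in what each developer delivered on average.
per_dev_gain = per_developer(after) / per_developer(before) - 1

print(f"Total throughput gain: {total_gain:.1%}")
print(f"Per-developer throughput gain: {per_dev_gain:.1%}")
```

On those numbers, total throughput rose by about 62% and throughput per developer by about 17%, so the gain came from both retaining people and each person getting more done.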

If you want to discuss anything about this, please catch me on twitter @KanbanDan



Tags: Agile | Kanban | Lean | LKU | NASA | Quality | WIP

Wednesday, 09 January 2013 11:47:39 (GMT Standard Time, UTC+00:00)  #