For clarity, this blog will focus on how to manage technical debt, i.e. how to make it visible, communicate it, quantify it, and prioritise its removal. This is because, in my experience, fixing the debt is only part of the challenge: it can also be very hard to get time and resources assigned to fix it unless a high-profile issue has occurred which highlights the debt.
There are many definitions of technical debt available; my favourite is Martin Fowler’s. Please have a read if you’d like a bit more clarity on the term: http://www.martinfowler.com/bliki/TechnicalDebt.html
Most companies are creating technical debt all the time – let’s take an example:
“The team I’m part of has just started a project. We don’t have a test environment but we’ve got our dev boxes and a build server so we can get cracking. Our definition of done is okay, we’ve said we’ll use coding standards, will build features to the acceptance criteria in the story and will have a review of features with the Product Owner. We’ll write unit tests on new bits of code, and prepare test scripts but because we haven’t got a test environment we won’t be able to do integration testing or regression test the features to ensure nothing has been broken in the process of creating new features. We’ve done a couple of sprints and there’s more bugs in the solution but once the test environment is available we’ll sort them all out.”
For some clients we work with, this is not a rare scenario, but inherent within this approach is the creation of large amounts of technical debt, and it’s growing exponentially. I think the following slide from Colin Bird’s CSM course highlights the problem very succinctly.
Sprint on sprint, these items accumulate: increasing bugs, a lack of regression testing, unclear coding standards, no clear picture of the quality of the code being produced (due to a lack of pair programming, TDD or even peer review), and a complete lack of visibility of how the solution will perform in an integrated, production environment (phew!). The mounting technical debt will start to impede velocity and add more work on to the end of the project in order to move the solution from an immature definition of done to production ready.
The typical result of this approach, left unaddressed, is a set of hardening and stabilisation sprints that take months to get into production. This causes knock-on effects: losing stakeholder trust and support, blocking other projects from starting, keeping resources longer than expected (which messes up resource planning), and generally losing kudos and trust across your internal teams – and that’s before we talk about the impact on consumers of your product!
One of the first major conversations I expect to have with a Scrum team is about the definition of done. Scrum specifies that the team should create ‘potentially shippable product’ every sprint. This is, in essence, the heart of the problem of adopting Scrum. It is the reason why so many technical, engineering and test capabilities must change and adapt in order to be able to achieve this ‘rule’.
An immature Definition of Done speaks volumes about the team’s overall ability to reach hyper-productivity levels of software development. If the gap between a story done in sprint and a story deployed into production is big, then after the final ‘development’ sprint, you should all expect a considerable period of hardening and stabilisation.
We’ve recently been using Value Stream Maps to help articulate the flow of work through development teams. The Value Stream Maps show customer expectations, information flows, physical flows, productivity metrics and value stream metrics including waste.
The physical flow often shows software being created and then moving through the environments of development, test, user acceptance, pre-production and into production. So, with your team, where is code deployed within the sprint? That is, when it’s tested and set to done in sprint, how far away is it from production?
Again, value stream maps come to the fore when considering these questions, as does Application Lifecycle Management (ALM).
In order to avoid the long and painful hardening/stabilisation period, the team must be focussed on continuously evolving the Definition of Done to bring feedback loops and risks into the sprint and reduce the gap from sprint to production so that they are truly creating ‘potentially shippable product’.
Again, I’m afraid I’ll have to refer you to Martin Fowler’s blog which introduces the Technical Debt Quadrant:
Ahh, the 2 x 2 grid! Takes me back to hours upon hours of MBA study! Thank god that’s done with!
I have often used this grid to work with teams on prevention, i.e. preventative measures that reduce future technical debt. As with all preventative activities, these usually take longer to implement but, once adopted, have a significant impact.
Use these categories to decide which sources of debt are acceptable and which are not acceptable within your organisation, and then establish tactics for preventing unacceptable behaviour…
Generally, I don’t have a problem with behaviours that sit in the top right quadrant… assuming that is really where they belong. Often this quadrant is misunderstood. It is not about having to ship because stakeholders are expecting a shipping date, or someone’s bonus depends on it shipping on a set date – those scenarios actually belong in the top left quadrant.
The top right quadrant is reserved for business drivers that have a compelling ROI for why a product has to ship immediately – i.e. responding to a threat to the business bottom line, or exploiting an opportunity that positively affects the bottom line. If in doubt, a good challenge would be to inform a board member of the situation and ask if they are aware that the product is shipping due to a compelling business agenda.
Examples of where the top-right might be justified:
Top left is indicative of poor management: usually corners are being cut in order to hit a deadline that is related to perceived operational needs rather than a clear underlying business case. Teams are rushed because someone, somewhere, has communicated a deadline, and are then driven to hit it for the deadline’s sake rather than because of a compelling business need.
This is a very common cause of Technical Debt. If the board or shareholders were informed that a project was cutting corners and creating technical debt which will slow the company down in the future, they’d want to know why. In fact, I think that if the subject of technical debt were discussed more openly and quantified, there’d be far more examination of portfolio management capabilities.
A lot of the time, this is occurring not because there is a business case for hitting this date, but because of how the company delivers projects. Examples of Reckless & Deliberate:
None of these are necessarily easy to change, but stopping these behaviours will have a significant, positive impact on the long-term velocity and well-being of the company.
Incompetence at one level or another is the key contributor to debt created within this quadrant. You don’t know what you don’t know and could therefore be blissfully unaware that you are creating a huge amount of technical debt. As a manager/leader, I’d want to prevent the reckless & inadvertent creation of technical debt and there are many tools to help me do this.
Essentially, this is about investing in your people, processes and tools – again something not done enough!
Pair programming, code reviews, static code analysis, continuous integration and automated test suites help to provide feedback on code and design quality. Communities of practice, clear role descriptions, personal development plans and training budgets aligned with technical strategy are tactics for helping people get better at what they do. Lightweight iterative methods with visual management help to ensure processes are continuously reviewed and improved.
This is a natural occurrence. Regardless of what walk of life we are in, or what job we do, over time we’ll return to a previous piece of work and see a better way of doing it. It is a natural consequence of gaining more domain knowledge about a particular piece of work.
Just because we know that with hindsight and increased experience in a year or two’s time we’ll look back and see a better way, doesn’t mean that we procrastinate or spend large amounts of time trying to second guess the future.
Rather, we keep to the agile principles of:
So, let’s assume we’ve implemented some tactics for reducing our Reckless & Inadvertent debt and our Reckless & Deliberate debt and now we’re going to spend some time ‘speeding-up’ the company.
This is how I have approached the problem in the past.
The first thing to do is to try to establish how painful a particular piece of technical debt is, i.e. if you have to touch this part of the solution in the future, how ‘painful’ will it be? How much additional time will be spent understanding the area, or re-writing parts to get it to work or integrate with other new parts? This ‘pain’ factor is represented by the yellow part below and we’ll refer to this as the Interest.
Next for each item of debt, we need to understand the effort involved in fixing it, and we’ll refer to this estimate as the Principal.
Having established how much interest is payable upon the use of any part, and knowing what it takes to fix it - the principal - we can now work out which pieces to fix, and why.
Looking at the slide above, which piece of debt do you think should be fixed first?
The answer with this amount of information is A, as it has the highest amount of ‘pain’ (Interest) with the lowest effort to resolve (Principal). However, what we haven’t considered yet is time and the frequency of payments, i.e. what if in the next 6 months item A on the left will only be changed/touched once, but item D on the far right will be touched 10 times?
Now which item of debt do you think should be fixed first?
So we’ve established the frequency of Interest payments and found out that the most painful item of debt is item D on the far right. Have we finished?
No. Because until now we’ve focussed purely on the IT decision-making side of Technical Debt i.e. how painful it is, what it takes to fix it, and how often it hurts our development efforts. What we have yet to consider is whether our prioritisation takes into account business value.
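The IT-side reasoning so far can be sketched in a few lines of Python. This is only an illustration: the figures are made up to mirror the narrative (they are not the values from the slide), the names `DebtItem`, `interest`, `principal` and `touches` are hypothetical, and the two ranking rules are one plausible reading of "pain versus effort", not a definitive formula. Units are arbitrary, for example developer-days.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebtItem:
    name: str
    interest: float   # extra effort paid each time this area is touched
    principal: float  # one-off effort to fix the debt
    touches: int      # expected number of times the area is touched in the period

    def total_interest(self) -> float:
        """Interest paid over the period if the debt is left in place."""
        return self.interest * self.touches


# Hypothetical figures: A has the highest interest and the lowest principal,
# but D is touched far more often over the next six months.
items = [
    DebtItem("A", interest=5, principal=3, touches=1),
    DebtItem("B", interest=3, principal=5, touches=2),
    DebtItem("C", interest=2, principal=6, touches=3),
    DebtItem("D", interest=4, principal=8, touches=10),
]

# Ignoring frequency: most pain per unit of fixing effort comes first.
by_single_payment = sorted(
    items, key=lambda i: i.interest / i.principal, reverse=True
)

# Including frequency: interest paid over the period, net of the effort
# to fix, comes first.
by_net_saving = sorted(
    items, key=lambda i: i.total_interest() - i.principal, reverse=True
)

print([i.name for i in by_single_payment])
print([i.name for i in by_net_saving])
```

With these assumed numbers, A tops the first ranking and D tops the second, which is the shift described above once frequency of payment is taken into account.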
A fair few years ago, when I was Head of Software Development, my team created a product portfolio, i.e. we took all our disparate applications (and there were many, and they were disparate) and grouped them into families of applications that served specific business needs.
This was a very powerful exercise and it helped us to clearly discern where we were adding value and supporting the business. It also allowed us to quantify the value of software applications based on which business units were using them and how much revenue was being generated by those units.
In this way, we were able to prioritise fixing technical debt based on a clear alignment of business value.
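As a rough sketch of that kind of valuation, the snippet below ranks application families by the revenue of the business units that use them. Everything in it is assumed for illustration: the unit names, the family names, the revenue figures, and the simplification that a unit's whole revenue counts towards every family it uses (which double-counts and would need refining in practice).

```python
# Hypothetical annual revenue per business unit (arbitrary currency units).
unit_revenue = {"retail": 400, "wholesale": 250, "support": 50}

# Hypothetical application families and the business units that rely on them.
family_users = {
    "ordering": ["retail", "wholesale"],
    "billing": ["retail", "wholesale", "support"],
    "helpdesk": ["support"],
}


def family_value(family: str) -> int:
    """Revenue generated by the business units that use this family."""
    return sum(unit_revenue[unit] for unit in family_users[family])


# Highest-value families first; debt in these would be looked at first.
ranked = sorted(family_users, key=family_value, reverse=True)
for family in ranked:
    print(f"{family}: {family_value(family)}")
```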
So now we can quantify the size and frequency of Interest payments and we know what it will take to pay off the Principal to remove the debt. In addition to this, we can prioritise the debt in terms of the value of the application to the business at the moment, but what about the foreseeable future?
The final consideration when prioritising technical debt activities across your application portfolio is the future needs of the business upon the application landscape. Which systems are critical to the future of the business? Which will be decommissioned as the portfolio moves through the foreseeable future?
In answering these questions, we will understand the future needs of the business upon the application landscape and be able to focus the entire development effort on improving the highest value systems of the business both for today and tomorrow.
By taking the factors we’ve already discussed: Interest Payment; Principal; Frequency of Payment; Business Value; and Strategic Intent - we can now create a high level, business-derived prioritisation matrix for articulating, quantifying, prioritising and resolving Technical Debt.
The first step is to map current business value/usage against the future strategic needs of the company.
This mapping will provide an initial view of the value of applications within the landscape.
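A minimal sketch of such a mapping is shown below. It assumes each application has been given a 1 to 5 rating for current business value and for future strategic need, and simply places it in a high/low grid. The application names, the ratings, the `THRESHOLD` cut-off and the `quadrant` helper are all hypothetical; the real categories would come out of the discussion with business stakeholders.

```python
from collections import defaultdict

# Hypothetical 1-5 ratings agreed with business stakeholders.
applications = {
    "billing":   {"current_value": 5, "strategic_need": 5},
    "ordering":  {"current_value": 4, "strategic_need": 2},
    "analytics": {"current_value": 2, "strategic_need": 5},
    "helpdesk":  {"current_value": 1, "strategic_need": 1},
}

THRESHOLD = 3  # assumed cut-off between "low" and "high"


def quadrant(current_value: int, strategic_need: int) -> tuple[str, str]:
    """Return (current, future) as a pair of 'high'/'low' labels."""
    current = "high" if current_value >= THRESHOLD else "low"
    future = "high" if strategic_need >= THRESHOLD else "low"
    return current, future


# Group application names by the quadrant they fall into.
grid = defaultdict(list)
for name, rating in applications.items():
    key = quadrant(rating["current_value"], rating["strategic_need"])
    grid[key].append(name)

for (current, future), names in sorted(grid.items()):
    print(f"current={current}, future={future}: {names}")
```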
Once these applications have been reviewed, categorised and the output discussed with business stakeholders, we can initially provide a generic approach to dealing with the debt:
We now have a means of categorising all the applications within the landscape and assigning relative value to them based on current and future usage requirements. Within these categories we are able to specify the ‘pain’ of technical debt, as well as the effort required to fix it.
We also have some strategies for dealing with the debt:
Hide the Work - I’ve seen teams hide the work to fix the Technical Debt. This tactic can be successful but I feel it goes against the grain of transparency and honesty and ultimately the fixing of debt should be a business driven activity. But, I have seen this tactic work so it should be considered within your context.
Leave the Code in a Better State - The on-going development strategy should always be to leave the code in a better state. It’s a simple statement that is rarely adhered to. The rule is that every time you touch a piece of code, you leave it in a better state. It could be as simple as an additional unit test or some clearly articulated comments or as complex as writing a suite of unit tests and refactoring a component. If every single developer adopted this strategy in your organisation the state of the code base would improve significantly over time.
Ask to Leave Code in a Better State – This tactic is closely related to the one above, but is cognizant of the fact that some companies micro-manage their teams to the extent that any time not spent directly working on the addition of features can be rapidly identified and can cause friction. This tactic addresses the scenario where a developer identifies a piece of technical debt during a sprint that they think should be fixed. Rather than just sorting out the problem (which is the desirable outcome), the developer would escalate to the Scrum Master and Product Owner. If you have to resort to this tactic then there is a lot of work to do in educating the business and IT management about the problems of technical debt, as well as a recognition that you have not provided an environment that allows self-organising teams to flourish.
Create Story & Justify – Typically, ‘leave the code in a better state’ is not a fully adopted way of working across development teams, and the more typical way of fixing technical debt (other than complete re-platforming) is to create individual stories and treat them as ordinary backlog items. When you get this up and working fully, involving the Product Owner in the reasoning behind the work, both parties learn a great deal about each other’s perspectives and it can be a valuable lesson in understanding the business and technical domains.
Allocate Release Ratio – Another tactic I have introduced in order to achieve some progress in fixing technical debt is to ‘Allocate a Release Ratio’. When articulating the value of addressing technical debt and prioritising it within a Product Backlog, I have often seen that despite best intentions, the Technical Debt items sink toward the bottom of the backlog and do not get resolved.
There are a number of reasons why this might occur, so to mitigate it I seek agreement from key stakeholders to allocate a certain amount of a release backlog to technical debt items. This approach has certainly been successful in ensuring TD items are addressed. Ultimately though, the best tactic remains to leave the code in a better state, as it is more efficient.
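To make the release ratio concrete, here is a small sketch of how an agreed ratio could be turned into a technical debt budget for a release. The 20% ratio, the 100-point capacity, the backlog items and both function names are assumptions for illustration only; the actual ratio is whatever your stakeholders agree to.

```python
def split_release(capacity: int, debt_ratio: float) -> tuple[int, int]:
    """Split release capacity (e.g. story points) into (feature, debt) budgets."""
    if not 0 <= debt_ratio <= 1:
        raise ValueError("debt_ratio must be between 0 and 1")
    debt_budget = round(capacity * debt_ratio)
    return capacity - debt_budget, debt_budget


def pick_debt_items(backlog: list[tuple[str, int]], budget: int) -> list[str]:
    """Take debt items in priority order, skipping any that no longer fit."""
    chosen = []
    for name, points in backlog:
        if points <= budget:
            chosen.append(name)
            budget -= points
    return chosen


# Hypothetical release: 100 points of capacity, 20% agreed for technical debt.
feature_budget, debt_budget = split_release(100, 0.2)

# Hypothetical debt backlog, already in priority order: (item, points).
debt_backlog = [("D", 8), ("A", 3), ("B", 13), ("C", 6)]
selected = pick_debt_items(debt_backlog, debt_budget)

print(feature_budget, debt_budget, selected)
```

Reserving the budget up front is the point of the tactic: the debt items no longer compete with features for the same slots, so they cannot sink to the bottom of the backlog.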
Ultimately this is about understanding the value of your application landscape, understanding where its weaknesses lie and finding a way within your organisation to ensure it is fit for purpose now and in the future. I hope that the next time you hear the term Technical Debt you’ll have gained a clear understanding of what it is from Martin Fowler’s blogs and you’ll also have some tips for how to address the problem from this blog.
Email Steve Garnett
© Copyright 2020 RippleRock