It’s Time to Rethink Technical Debt Management

Alon Eizenman, CTO & Co-founder | September 12, 2018

Technical debt is a necessary evil. Too much of it can kill your velocity, but none at all is a sign that you’re over-engineering your system, especially in the early stages. The trick is to manage technical debt smartly so you can move faster and get user feedback. That feedback gives you the information you need to decide whether to pay down the debt or dump the functionality.

Over the years, I’ve seen more organizations avoid technical debt than tackle it head-on.

What is Technical Debt?

Technical debt is a metaphor coined by Ward Cunningham that draws a comparison between technical complexity and financial debt. It is used to express the compounded cost of quick and dirty engineering choices. For every shortcut you take today to deliver something faster, you will pay interest in the form of additional software development hours every time you develop in that code area in the future.
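
To make the interest metaphor concrete, here is a minimal sketch of the break-even arithmetic. The function name, the hour figures, and the `growth` factor are illustrative assumptions, not measurements from a real project.

```python
def changes_until_break_even(hours_saved: float,
                             interest_per_change: float,
                             growth: float = 1.0) -> int:
    """Count the future changes after which accumulated interest
    (extra hours) reaches the hours the shortcut originally saved.

    `growth` is a hypothetical multiplier applied to the interest after
    each change; 1.0 means the extra cost stays flat.
    """
    if interest_per_change <= 0:
        raise ValueError("interest_per_change must be positive")
    if growth < 1.0:
        raise ValueError("growth must be >= 1.0")

    paid = 0.0
    interest = interest_per_change
    changes = 0
    while paid < hours_saved:
        paid += interest      # extra hours spent on this change
        interest *= growth    # the next change may cost even more
        changes += 1
    return changes


# Assumed example: a shortcut saves 40 hours and costs 4 extra hours
# on every later change in that code area.
flat = changes_until_break_even(40, 4)
compounding = changes_until_break_even(40, 4, growth=2.0)
```

With a flat 4-hour penalty, the shortcut stops paying for itself after 10 changes. If the penalty doubles each time, it takes only 4.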

[Image: Technical debt management. Source: Dilbert]

Pros of Technical Debt

  • Speed: The entire point of shortcuts is that they get you there faster. Taking on technical debt lets you develop faster, release faster, and get to market faster.
  • Simplifying architecture: Sometimes doing the “right thing” is just too complex, or you don’t have the technical know-how to make the “right thing” work. Monoliths are much easier than microservices, but microservices are better when you scale.

Cons of Technical Debt

  • Delivery velocity: The biggest consequence of technical debt is that it slows down the ability to deliver future features.
  • Unpredictable impact: How much technical debt is too much? The biggest risk of technical debt is that there’s no reliable way to predict how much of it you can take on before your codebase descends into chaos and becomes a development black hole.
  • Defect density: Recklessly adding patches and workarounds will let defects creep into your product. Your team will need to spend more and more time fixing bugs rather than developing new value-added features for your customers.
  • Costs: The cost consequences show up in two ways:
    • Man-hours: You’ll need to invest more man-hours every time you develop, just to keep your app working. Additionally, when you decide to pay down the debt, you’ll need to throw out code you’ve already written and refactor.
    • Infrastructure: Taking architectural shortcuts may result in brute-force infrastructure workarounds. If you ignore multithreading to increase velocity, you will pay the price in CPU when you need to scale. There is a big difference between 10 and 1,000 servers/clusters/nodes.

Importance of Testing

At the end of the day, engineering managers own releases and are responsible for increasing velocity. If your definition of done means ready for production, then testing is part of your software development sprint, not additional overhead. Cutting back on testing will decrease velocity: more issues reach production for your team to fix, debt grows, issues take longer to detect and longer to fix, and quality declines. Without testing, your defect and design debt will spiral out of control.

A Breakdown of Technical Debt Metrics in Agile

Velocity is the clearest KPI of technical debt, as hours can be expressed as cost. On the engineering side, there is a set of common KPIs that engineering teams and managers track and analyze for trends that indicate their technical debt position.

Code Duplication

  • Duplicated code: On the face of it, duplication seems like a good thing: it’s a fast way to reuse software that has already been implemented and tested by adapting it to the new functionality. However, that doesn’t make duplicated code the best development solution. It needlessly increases the size of the software, which raises maintenance time and cost, and when a bug is detected and fixed, every duplicate has to be identified and fixed as well.
  • Copy-pasted code: Unlike duplicated code, code copy-pasted from Stack Overflow and similar sources has not been implemented and tested in your system. It’s like taking candy from a stranger.
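
As a rough illustration of how duplication can be measured, the sketch below flags runs of identical lines that appear in more than one place. It is a naive, assumption-laden approach (whitespace-only normalization, a fixed `window` size, hypothetical file names). Dedicated duplication detectors are far more robust.

```python
from collections import defaultdict


def find_duplicate_blocks(files: dict[str, str], window: int = 4):
    """Map each repeated run of `window` non-blank lines to the
    (path, starting line number) places where it occurs."""
    seen = defaultdict(list)
    for path, source in files.items():
        # Strip indentation and skip blank lines, keeping original line numbers.
        lines = [(number, text.strip())
                 for number, text in enumerate(source.splitlines(), start=1)
                 if text.strip()]
        for i in range(len(lines) - window + 1):
            chunk = tuple(text for _, text in lines[i:i + window])
            seen[chunk].append((path, lines[i][0]))
    # Keep only the chunks that show up more than once.
    return {chunk: places for chunk, places in seen.items() if len(places) > 1}


# Hypothetical two-file project with three lines copied from one file to the other.
example_files = {
    "a.py": "x = 1\ny = 2\nz = x + y\nprint(z)\n",
    "b.py": "def f():\n    x = 1\n    y = 2\n    z = x + y\n    return z\n",
}
duplicates = find_duplicate_blocks(example_files, window=3)
```

Here `duplicates` contains a single entry, pointing at line 1 of `a.py` and line 2 of `b.py`.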

Code Complexity

Cyclomatic complexity indicates the complexity of a program. It is a quantitative measure of the number of linearly independent paths through a code unit.
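
A common approximation is to start at 1 and add 1 for every decision point. The sketch below does that for Python source using the standard `ast` module. The set of node types counted is a simplifying assumption, so dedicated tools may report slightly different numbers.

```python
import ast

# Node types treated as decision points in this simplified count.
_DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` introduces two additional short-circuit branches.
            complexity += len(node.values) - 1
    return complexity


sample = (
    "def f(x):\n"
    "    if x > 0 and x < 10:\n"
    "        return 1\n"
    "    for i in range(x):\n"
    "        pass\n"
    "    return 0\n"
)
score = cyclomatic_complexity(sample)  # 1 + if + and + for
```

For `sample`, the score is 4: the base path, plus the `if`, the `and`, and the `for`.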

Dead Code

Code that end users never reach, or code that is executed but whose result is never used in any computation. Dead code that still executes eats up computation and memory on each execution, and all dead code costs maintenance time and resources.
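
Static analysis can surface dead-code candidates. The sketch below lists functions that are defined in a module but never referenced by name inside it. This is an assumption-heavy heuristic: it looks at a single file, ignores dynamic lookups, and cannot tell you what end users actually exercise in production.

```python
import ast


def unreferenced_functions(source: str) -> set[str]:
    """Return names of functions defined in `source` that are never
    referenced elsewhere in the same source."""
    tree = ast.parse(source)
    defined = {node.name for node in ast.walk(tree)
               if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))}
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):         # plain references: foo()
            used.add(node.id)
        elif isinstance(node, ast.Attribute):  # attribute references: obj.foo()
            used.add(node.attr)
    return defined - used


module = (
    "def used():\n"
    "    return 1\n"
    "\n"
    "def unused():\n"
    "    return 2\n"
    "\n"
    "print(used())\n"
)
candidates = unreferenced_functions(module)
```

Treat the result as a list of candidates to investigate, not as proof that the code is safe to delete.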

Code Review Ratio

The ratio of reviewed pull requests to all pull requests. The simplest implementation is to assign equal weights to all code reviews. However, code review intensity should be dictated by the likelihood of failure and the potential cost of failure. For example, pull requests that touch core computations should be reviewed by multiple team members and carry a higher weight in the ratio.
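
One way to express this is a weighted ratio. The sketch below is only an illustration: the `PullRequest` fields and the weight values are hypothetical, and how you assign risk weights is up to your team.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    reviewed: bool
    risk_weight: float = 1.0  # hypothetical: higher for core computations


def review_ratio(pull_requests: list[PullRequest]) -> float:
    """Weighted share of pull requests that were reviewed (0.0 to 1.0)."""
    total = sum(pr.risk_weight for pr in pull_requests)
    if total == 0:
        return 0.0
    reviewed = sum(pr.risk_weight for pr in pull_requests if pr.reviewed)
    return reviewed / total


# With equal weights, one reviewed PR out of two gives 0.5.
equal = review_ratio([PullRequest(True), PullRequest(False)])

# If the unreviewed PR touches core computations (weight 3), the ratio drops.
weighted = review_ratio([PullRequest(True, 1.0), PullRequest(False, 3.0)])
```

The weighted version makes an unreviewed high-risk change hurt the metric more than an unreviewed cosmetic one.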

Code Coverage

The percent of code covered by unit tests.

  • Line coverage: The share of lines that have been executed at least once by a test case.
  • Branch coverage: The share of conditional branches that have been taken at least once by test cases.
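
The two numbers can diverge, which is why both are worth tracking. The sketch below computes each from plain sets. The data shapes (line numbers, and `(line, outcome)` pairs for branches) are assumptions chosen for illustration, not the output format of any particular coverage tool.

```python
def line_coverage(executable: set[int], executed: set[int]) -> float:
    """Percent of executable lines that ran at least once."""
    if not executable:
        return 100.0
    return 100.0 * len(executable & executed) / len(executable)


def branch_coverage(branches: set[tuple[int, bool]],
                    taken: set[tuple[int, bool]]) -> float:
    """Percent of (line, outcome) branches that were taken at least once."""
    if not branches:
        return 100.0
    return 100.0 * len(branches & taken) / len(branches)


# Imagine this function, exercised by a single test that calls f(True):
#   1: def f(x):
#   2:     y = 0
#   3:     if x:
#   4:         y = 1
#   5:     return y
lines = line_coverage({1, 2, 3, 4, 5}, {1, 2, 3, 4, 5})
branches = branch_coverage({(3, True), (3, False)}, {(3, True)})
```

Every line ran, so line coverage is 100%, but the `False` outcome of the `if` was never exercised, so branch coverage is only 50%.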

Test Coverage

The percent of use cases covered by tests.

Regression Code Coverage

The percent of your code covered by regression tests.

Untested Code Changes

The total number of untested code changes per build.
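
Conceptually, this metric is the difference between what changed in a build and what the tests executed. A minimal sketch, assuming you already have both as per-file sets of line numbers (the file names and data below are hypothetical):

```python
def untested_changes(changed: dict[str, set[int]],
                     covered: dict[str, set[int]]) -> dict[str, set[int]]:
    """Return changed lines, per file, that no test executed."""
    result = {}
    for path, lines in changed.items():
        missing = lines - covered.get(path, set())
        if missing:
            result[path] = missing
    return result


# Hypothetical build: lines touched by the diff vs. lines hit by the test run.
changed_lines = {"billing.py": {10, 11, 12}, "utils.py": {3}}
covered_lines = {"billing.py": {10, 11}}

gaps = untested_changes(changed_lines, covered_lines)
total_untested = sum(len(lines) for lines in gaps.values())
```

Here line 12 of `billing.py` and line 3 of `utils.py` changed without being tested, so `total_untested` is 2.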

Duration

  • Testing: The duration of your test suite execution.
  • Environment provisioning: The idle spin-up time between initiating an environment and having it provisioned.

Dependency Cycles and Coupling

  • Dependency cycles: Unwanted cycles between files, which break the acyclic dependencies principle.
  • Coupling: Coupling is the number of external files that depend on a particular file. The more dependents a file has, the higher the risk that a change to it breaks something.
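
Both metrics fall out of a dependency graph. The sketch below assumes you have already extracted a mapping from each file to the files it imports (the file names are hypothetical). It finds one cycle with a depth-first search and counts dependents per file.

```python
from typing import Optional


def find_cycle(deps: dict[str, set[str]]) -> Optional[list[str]]:
    """Return one dependency cycle as a list of files, or None if acyclic."""
    in_progress, done = set(), set()
    stack: list[str] = []

    def visit(node: str) -> Optional[list[str]]:
        in_progress.add(node)
        stack.append(node)
        for dep in sorted(deps.get(node, ())):
            if dep in in_progress:
                # Found a back edge: the cycle is the tail of the current path.
                return stack[stack.index(dep):] + [dep]
            if dep not in done:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        in_progress.discard(node)
        done.add(node)
        return None

    for node in sorted(deps):
        if node not in done:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


def dependents_count(deps: dict[str, set[str]]) -> dict[str, int]:
    """Count how many other files depend on each file."""
    counts = {path: 0 for path in deps}
    for path, targets in deps.items():
        for target in targets:
            if target != path:
                counts[target] = counts.get(target, 0) + 1
    return counts


cyclic = {"a.py": {"b.py"}, "b.py": {"c.py"}, "c.py": {"a.py"}}
layered = {"api.py": {"core.py"}, "cli.py": {"core.py"}, "core.py": set()}

cycle = find_cycle(cyclic)
coupling = dependents_count(layered)
```

In the `layered` example, `core.py` has two dependents, so a breaking change there carries the most risk.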

Documentation Ratio

The share of the public API that is documented.
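
For a Python codebase, a rough version of this ratio can be computed from docstrings. The sketch below assumes that "public" means any function or class whose name does not start with an underscore, which is a convention rather than a rule.

```python
import ast


def documentation_ratio(source: str) -> float:
    """Share of public functions and classes that have a docstring."""
    tree = ast.parse(source)
    public = [node for node in ast.walk(tree)
              if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
              and not node.name.startswith("_")]
    if not public:
        return 1.0  # nothing public to document
    documented = sum(1 for node in public if ast.get_docstring(node))
    return documented / len(public)


module = (
    "def documented():\n"
    '    """Explain what this does."""\n'
    "\n"
    "def undocumented():\n"
    "    pass\n"
    "\n"
    "def _private_helper():\n"
    "    pass\n"
)
ratio = documentation_ratio(module)
```

The private helper is ignored, so one documented function out of two public ones gives a ratio of 0.5.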

Defect Aging

Known bugs that have been deferred to future releases, and how long they have remained open.

Technical Debt and User Story Backlog Aging and Ratio

How long technical debt and user story items have been pending in the backlog, and the ratio of technical debt work to product work, measured in items or in development hours.
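
Here is a minimal sketch of both measurements, under one possible reading of the ratio (debt hours divided by user story hours). The `BacklogItem` fields, the `"debt"` and `"story"` labels, and the sample dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class BacklogItem:
    kind: str              # "debt" or "story" (hypothetical labels)
    created: date
    estimate_hours: float


def average_age_days(items: list[BacklogItem], kind: str, today: date) -> float:
    """Average number of days items of the given kind have been pending."""
    ages = [(today - item.created).days for item in items if item.kind == kind]
    return sum(ages) / len(ages) if ages else 0.0


def debt_to_story_ratio(items: list[BacklogItem]) -> float:
    """Estimated debt hours divided by estimated user story hours."""
    debt = sum(item.estimate_hours for item in items if item.kind == "debt")
    story = sum(item.estimate_hours for item in items if item.kind == "story")
    return debt / story if story else 0.0


backlog = [
    BacklogItem("debt", date(2024, 1, 1), 8),
    BacklogItem("debt", date(2024, 1, 11), 4),
    BacklogItem("story", date(2024, 1, 21), 24),
]
debt_age = average_age_days(backlog, "debt", today=date(2024, 1, 31))
ratio = debt_to_story_ratio(backlog)
```

In this sample backlog, debt items have waited 25 days on average, and there is half an hour of debt for every hour of user story work.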

Technical Debt Management: Maintaining a Low-Risk Technical Debt

At SeaLights, we believe that low-risk technical debt is actually a good thing, since it allows for increased velocity. Developing and maintaining integration tests is a time-consuming and expensive process, one that may lead to test gaps, increased technical debt, and reduced team efficiency. To prevent this, you must focus your test development on high-risk areas.

Maintaining low-risk technical debt requires you to gather information and metrics on these high-risk areas and concentrate on them during sprint planning. Focus on developing integration tests for high-risk areas: code that was recently changed, wasn’t tested, and sees heavy production use.
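
The three criteria can be combined into a simple filter for sprint planning. The sketch below is an illustration only, not a description of any product's algorithm. The `CodeArea` fields, the `min_hits` threshold, and the sample data are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class CodeArea:
    name: str
    recently_changed: bool
    tested: bool
    production_hits: int  # hypothetical usage count from production monitoring


def high_risk_areas(areas: list[CodeArea], min_hits: int = 1000) -> list[CodeArea]:
    """Areas that changed recently, lack tests, and are heavily used,
    ordered from most used to least used."""
    risky = [area for area in areas
             if area.recently_changed
             and not area.tested
             and area.production_hits >= min_hits]
    return sorted(risky, key=lambda area: area.production_hits, reverse=True)


areas = [
    CodeArea("checkout", recently_changed=True, tested=False, production_hits=50_000),
    CodeArea("login", recently_changed=True, tested=True, production_hits=90_000),
    CodeArea("search", recently_changed=True, tested=False, production_hits=7_000),
    CodeArea("admin_report", recently_changed=True, tested=False, production_hits=12),
]
priorities = [area.name for area in high_risk_areas(areas)]
```

With this sample data, `checkout` and `search` come out as the places to write integration tests first. `login` is already tested, and `admin_report` is barely used.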

Technical debt is a double-edged sword. Applied carefully, it helps you increase velocity, but too much of it will make your software fall to pieces. Now you know what to look for and how to start measuring and managing your technical debt.