Technical Debt + Red October

03/22/2014

There’s been a little bit written about technical debt in an early stage software product. Technical debt is something we think a lot about at Contactually.

Technical debt is a balance of problems that are generated through rapid product development, normally in an early stage product.

“Move fast and break things”

Given how key it is for a startup to test product in front of customers as fast as possible, technical debt is not just a byproduct but a necessity. Given how many unknowns we were encountering in the early days of Contactually (and even now), we adopted one of Facebook’s core mantras and made it our own. While we were still getting to product/market fit, we gave ourselves permission to launch things that were not fully tested, did not have all edge cases satisfied, and frankly, were ugly in both form and function.

Now that we’ve achieved an acceptable level of product/market fit in our core offering, we’ve course corrected. Actually, to be completely honest, it’s not that we woke up one day, saw that we were checking all the boxes, and decided we could clean out the skeletons in our closet. Our users told us: overwhelming support issues, higher churn, shaky metrics, and an incessant stream of bug reports. They loved the promise of the product, but the actual day-to-day usage was bumpy.

Technical debt is usually attributed to the code-level shortcuts taken, marked by a quick TODO and quickly forgotten. Who cares about those? We’re building a user facing product, so our debt was anything that the user would see:

  • As we kept adding + modifying, overall performance started to decline.
  • Bugs appeared.
  • Usability issues were numerous.
  • The design was extended and stretched too far, yielding an overall unattractive mess.

Making the decision

In late 2013, we knew we were in trouble. It was clear, even just in talking to our team, that we were spending more time hearing about issues with the product than about positive results. We had to act fast. We made the decision that, for now, we had reached a level of feature completeness, and just needed to make what we had work.

It was time to pay down. But how?

The squeaky wheel gets the grease

At first, the engineering team and I came up with a list of all the problems we saw in the application, and our wishlist. To no one’s surprise, it yielded a list of internal architectural challenges, refactors, and rewrites.

But what matters to the user?

We changed our tune, quickly. Here’s what we did:

  • Started tracking overall satisfaction - Net Promoter Score is the most straightforward. To this day, the NPS is the clearest indicator of user satisfaction with our product.
  • Tracked application performance, and identified hotspots – The best way to improve performance is pretty obvious – look at what’s slow, and fix it. A few minutes clicking around on New Relic can give an engineer a clear idea of what the slowest pages are, the least performant database queries, and clear areas for code optimization. We started at the top and just worked our way down. We now report on our Apdex score (a measure, reported by New Relic, of how many requests come back within an acceptable response time) weekly as a top-level company metric.
  • Actually fixed bugs – If you haven’t set up a simple exception-reporting tool like Airbrake, Exceptional, or Honeybadger in your application, do so now. To pay down debt, we just… started fixing what we saw. (Note: clearing your backlog of exceptions really helps overall performance, too.) We now wince every time we see a bug report come in that’s anything other than some strange exception.
  • Asked our users – This was the thing I was most excited about, and the hardest pill to swallow at the same time. We had already amassed a collection of issues that we received inbound from users. That was nice, but because we were only hearing what someone had gone out of their way to tell us, we didn’t have a clear signal. So we did something insane – we asked our top 300 users to give us their list of every annoyance, frustration, bug, and blocker they had. We ended up with something on the order of 1200+ items. We looked through every single one, prioritized, grouped, and ended up with a list of what we knew we needed to fix. Granted, we had a lot of feature requests and off-topic improvements (blah blah faster horse…), but we could see what people were having issues with.
  • Internally, we sat down the entire company for a two hour session where everyone went page by page, workflow by workflow, and logged everything they could find.

We made the conscious decision to not build any new features until we fixed the majority of these. In fact, no planned feature enhancements or internal tooling work were done either. We shut down everything to focus on these issues. We called it Red October.

We emerged from it triumphant. Our team felt better not just about the end result, but about the process.

Managing Technical Debt ongoing

We’re now a little bit better with how we manage debt – we have to be. Our core values & culture guide us to help our users as much as possible – and buggy software doesn’t do much for them. So here’s what we do:

  • Track performance as a top-level company metric.
  • Regular meetings with the sales + support teams to understand the “burning issues” and concerns that we’re hearing from customers – and prioritize those.
  • Track our Net Promoter Score, and identify issues resulting from that.
  • Periodically ping both our most active users as well as newly activated customers, to understand what their main concerns are.
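
For anyone who wants to track the same two numbers without waiting on a dashboard, both are simple to compute by hand. What follows is a minimal sketch, not our actual reporting code: the function names are made up for illustration, the NPS buckets follow the standard 0-10 survey convention (9-10 promoters, 0-6 detractors), and the 0.5 second Apdex threshold is only an assumed value that you would tune for your own application.

```python
from typing import Iterable


def net_promoter_score(ratings: Iterable[int]) -> float:
    """Return NPS on a -100..100 scale from 0-10 survey ratings.

    Promoters answer 9 or 10, detractors answer 0 through 6;
    7s and 8s (passives) only count toward the total.
    """
    ratings = list(ratings)
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


def apdex(response_times: Iterable[float], threshold: float = 0.5) -> float:
    """Return an Apdex score between 0 and 1 from response times in seconds.

    Requests at or under the threshold T are "satisfied", requests between
    T and 4T are "tolerating" and count for half, anything slower is
    "frustrated" and counts for nothing. T = 0.5 is an assumption here.
    """
    times = list(response_times)
    if not times:
        raise ValueError("need at least one response time")
    satisfied = sum(1 for t in times if t <= threshold)
    tolerating = sum(1 for t in times if threshold < t <= 4 * threshold)
    return (satisfied + tolerating / 2) / len(times)


if __name__ == "__main__":
    # Hypothetical sample data, purely for illustration.
    print(net_promoter_score([10, 10, 9, 8, 3]))        # 40.0
    print(apdex([0.1, 0.4, 1.0, 3.0], threshold=0.5))   # 0.625
```

Neither number means much on its own. The value is in computing them the same way every week and watching which direction they move.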

While we have no regrets for the path that got us here, we know we had to break a few eggs and disappoint our customers. Moving forward, the burden is on us to deliver value, in both new and existing components of the product.