Rewrite or Renovate: the Programmer’s Dilemma
This article was first published in e27.
After Elon Musk's recent tweet that Twitter "will ultimately need a complete rewrite," coding commentators have taken to comedy, with some suggesting that he has now achieved a "mid-level engineer mindset." In sharing Musk’s tweet, Amit Gupta, CTO of Food Market Hub, suggested that due to software’s continuous evolution, after three to four years, it becomes necessary to discuss a rewrite of a particular component or system.
This generated a lively debate among builders in the F’in Tech community on the perennial question: Is it better to rewrite or renovate existing code? In this piece, I capture the heart of this dilemma as well as some of the top takeaways from our conversation, reflecting the thinking of those who have been in the trenches and have experience to share.
Better The Devil You Know
Every complete rewrite I've been directly involved with, or have knowledge of, has taken much longer than anticipated - assuming it was even completed. Often, when programmers are asked to “fix” or “complete” a rewrite, it's easier to revert to the original (working) code base and instead focus on whatever triggered the rewrite in the first place.
When you undertake a rewrite, you're trading known issues for unknown ones. Modern libraries and frameworks are not immune to bugs and gotchas; in fact, they may have more than their predecessors. Some of these issues may slow down delivery, while others may require workarounds and unsightly edges that are just as bothersome as the original.
The complexity lies beneath the surface. Typically, code bases evolve over time, with fixes and nuanced requirements baked into the code without being consistently documented. Attempting a rewrite without a comprehensive understanding of what's already in place may continuously reveal new requirements, often during user acceptance testing. This can drag on indefinitely, frustrating your users and blowing your timelines. As F'in Tech advisor Sam Simopoulos succinctly put it, "These things never truly finish."
If full rewrites are almost always the wrong approach, then who is pushing for them? Is it a non-technical manager, convinced by a persuasive salesperson? A mid-level engineer who hasn't attempted it before? An ambitious new starter, striving to make their mark? If any of these apply, be cautious. But if it's the seasoned senior who loathes unnecessary changes, then it may be worth considering.
Not Just the Tech
When undertaking a full rewrite (or a significant renovation), it is essential to consider the needs and expectations of the users, and keep the product’s purpose in mind. This maximises the chance that what is being rewritten is fit for purpose. Gerry Eng, founder and CTO at CoinHako, builds on this: “it’s often not just a technical rewrite but a product requirements rewrite as well - making the call to remove legacy features or constraints can help unblock a lot of stuff.” However, this must be a balanced approach, as it further increases the potential risk of the rewrite - especially if the product and technical team’s visions are not 100% aligned.
“What do you do if the system is so entangled that you literally cannot reasonably extract modules from it, and thus forces a complete rewrite?” asked Nino Ulsamer at Stashaway. He believes that, with sufficiently senior backing, enough attention and resources can be focused to make a rewrite work: “The rewrite can still work, but it must be anchored at the highest level. Maybe not possible without a founder-CTO pushing it forward.” Such rewrites are uncommon but not impossible, as Uber showed, with then-CEO Travis Kalanick getting personally involved. Stories like this also reveal the level of camaraderie that can emerge in the trenches when pulling off the impossible.
Don’t Underestimate the Data
As Christian Fischer, Head of Engineering at ADDX, reflects: “The real pain after a rewrite is the data migration!” Without an iterative plan, the complexity of a data migration leaves two possible choices:
- A “big bang” cutover, with a sudden switch from the old system to the new
- A sync between two data sets, with the “old” data set continuing to support old processes
Big bang cutovers are high risk with little room for experimentation. Even when there is no other option, if there is a way to split the dataset into smaller pieces (for example by customer type or geography), then that should be explored as a way of minimising impact if there is an issue.
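As a rough illustration of splitting a cutover into smaller pieces, here is a minimal Python sketch that groups customers by region and migrates one group at a time, stopping at the first group that fails verification. Everything in it is an assumption for illustration: the `plan_cutover_waves` and `run_cutover` helpers, the `region` field, and the in-memory `new_system` dictionary standing in for a real target system.

```python
from collections import defaultdict


def plan_cutover_waves(records, key):
    """Group records into waves by a segment key, keeping first-seen order."""
    waves = defaultdict(list)
    for record in records:
        waves[key(record)].append(record)
    return dict(waves)


def run_cutover(waves, migrate, verify):
    """Migrate one wave at a time and stop at the first wave that fails checks.

    Returns (completed_wave_names, failed_wave_name_or_None).
    """
    completed = []
    for name, batch in waves.items():
        migrate(batch)
        if not verify(batch):
            return completed, name  # later waves stay on the old system
        completed.append(name)
    return completed, None


# Hypothetical data and in-memory stand-ins for the real systems.
customers = [
    {"id": 1, "region": "SG"},
    {"id": 2, "region": "MY"},
    {"id": 3, "region": "SG"},
]
new_system = {}


def migrate(batch):
    for customer in batch:
        new_system[customer["id"]] = dict(customer)


def verify(batch):
    return all(new_system.get(c["id"]) == c for c in batch)


waves = plan_cutover_waves(customers, key=lambda c: c["region"])
completed, failed = run_cutover(waves, migrate, verify)
```

If a wave fails, only that segment is affected, and the remaining segments have not yet been touched. A real cutover would also need a rollback step for the failed wave, which this sketch leaves out.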
Often, engineers will try to sync data sets between old and new systems to retain flexibility, for example, running one process from one data set and another from the other. While sensible, this adds a new layer of complexity in keeping the systems reconciled with one another for consistency, especially if the data models between the two systems differ. So we need to think about approaches that allow us to iterate on the same data set.
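To give a feel for the reconciliation burden when the two data models differ, the following Python sketch maps rows from each system into a common shape and reports where they disagree. The field names (`full_name`, `balance_cents`, `first_name`, `last_name`, `balance`) and both schemas are invented for illustration; a real reconciliation job would be driven by the actual models.

```python
from decimal import Decimal


def normalise_old(row):
    # Hypothetical legacy model: one name field, balance stored in cents.
    return {"name": row["full_name"].strip().lower(),
            "balance_cents": row["balance_cents"]}


def normalise_new(row):
    # Hypothetical new model: split name fields, balance as a decimal string.
    name = f'{row["first_name"]} {row["last_name"]}'.strip().lower()
    return {"name": name,
            "balance_cents": int(Decimal(row["balance"]) * 100)}


def reconcile(old_rows, new_rows):
    """Compare two {id: row} mappings after normalising both sides."""
    old_ids, new_ids = set(old_rows), set(new_rows)
    mismatched = [
        i for i in sorted(old_ids & new_ids)
        if normalise_old(old_rows[i]) != normalise_new(new_rows[i])
    ]
    return {
        "missing_in_new": sorted(old_ids - new_ids),
        "missing_in_old": sorted(new_ids - old_ids),
        "mismatched": mismatched,
    }


old_rows = {
    1: {"full_name": "Ada Lovelace", "balance_cents": 1250},
    2: {"full_name": "Alan Turing", "balance_cents": 900},
    3: {"full_name": "Grace Hopper", "balance_cents": 0},
}
new_rows = {
    1: {"first_name": "Ada", "last_name": "Lovelace", "balance": "12.50"},
    2: {"first_name": "Alan", "last_name": "Turing", "balance": "9.50"},
    4: {"first_name": "Edsger", "last_name": "Dijkstra", "balance": "1.00"},
}

report = reconcile(old_rows, new_rows)
```

Even in this toy case, every difference between the models needs its own normalisation rule, and each rule is a place where the two systems can silently drift apart.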
Sunny Singh, CTO at Headquarters, is currently undergoing a data migration: “We did it by using a combination of feature flags & new tables, along with a process to whitelist clients that were open to using a slightly buggier application. We gave those clients some additional incentives for bearing with us. This is helping us to keep data migration to a minimum as we are experimenting, and eventually we will be migrating everyone over to the stabilised, rewritten modules.” The use of feature flags is a great way to gradually deploy code across data migrations or more typical application releases, reducing the risk of impacting all customers at the same time.
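A feature flag combined with a client allowlist might look something like the sketch below. This is not Headquarters' implementation; the `FeatureFlag` class, the flag name, and the table names are hypothetical, chosen only to show how opted-in clients can be routed to new tables while everyone else stays on the old ones.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FeatureFlag:
    name: str
    enabled: bool = False              # global kill switch
    rolled_out_to_all: bool = False    # flip once the new module is stable
    allowlist: set[int] = field(default_factory=set)  # opted-in client ids

    def is_on_for(self, client_id: int) -> bool:
        if not self.enabled:
            return False
        return self.rolled_out_to_all or client_id in self.allowlist


def invoices_table_for(client_id: int, flag: FeatureFlag) -> str:
    """Pick the table a client's requests should read from and write to."""
    return "invoices_v2" if flag.is_on_for(client_id) else "invoices"


# Hypothetical flag: clients 7 and 42 agreed to try the rewritten module.
new_invoicing = FeatureFlag("new_invoicing", enabled=True, allowlist={7, 42})

table_for_opted_in = invoices_table_for(42, new_invoicing)
table_for_everyone_else = invoices_table_for(99, new_invoicing)
```

Because only allowlisted clients write to the new tables, only their data has to be migrated during the experiment. Setting `enabled` to `False` sends everyone back to the old path, although any data those clients wrote to the new tables in the meantime would still need to be carried back.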
Make It So
As Diego Rojas, CTO at Tribe FinTech, notes, “of course systems/products need to keep evolving, migrating and changing but it's an iterative/gradual process until major risks are mitigated.” Change is inevitable, but how we implement that change is within our control, and minimising risk is the objective.
As Manoj Awasthi, CTO at Julo, explained, most experienced engineers would attempt a rewrite using the Strangler Fig Pattern. In this fascinating ecological phenomenon, a species of fig plant, known as a strangler fig, germinates on a host tree and grows around its trunk. As the fig grows, its roots gradually envelop and constrict the host tree, causing it to weaken and ultimately die. The fig's roots thicken and fuse together, forming a dense lattice that completely surrounds the host tree's trunk, enabling the fig to continue growing by feeding off the host tree.
Applied to software refactoring, the host tree represents a legacy software system that has become outdated or difficult to maintain, while the strangler fig represents a new software system designed to replace it.
To implement this pattern, key functionality of the legacy system is identified and implemented in the new system in a modular way. The original system is kept operational in parallel, while more functionality is migrated to the new system wrapping around it. The reliance on the legacy system decreases until it can eventually be decommissioned. According to Headquarters’ Sunny Singh, the best approach is to rewrite specific modules as you touch specific parts of the codebase. However, considering ROI is critical to avoid exerting excess effort for little return.
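One common way to realise this is a thin routing layer in front of both systems. The Python sketch below is a minimal illustration under that assumption; the `StranglerFacade` class, the operation names, and the handler signatures are all hypothetical.

```python
class StranglerFacade:
    """Routes each operation to the new system if migrated, else to legacy."""

    def __init__(self, legacy_handler, known_operations):
        self._legacy = legacy_handler
        self._known = set(known_operations)
        self._migrated = {}

    def migrate(self, operation, new_handler):
        """Register a rewritten module for a single operation."""
        if operation not in self._known:
            raise ValueError(f"unknown operation: {operation}")
        self._migrated[operation] = new_handler

    def handle(self, operation, payload):
        new_handler = self._migrated.get(operation)
        if new_handler is not None:
            return new_handler(payload)
        return self._legacy(operation, payload)  # not migrated yet

    def remaining_on_legacy(self):
        """Operations still served by the legacy system."""
        return sorted(self._known - set(self._migrated))


def legacy_system(operation, payload):
    # Stand-in for the existing monolith.
    return f"legacy:{operation}"


facade = StranglerFacade(legacy_system, ["quote", "refund", "statement"])
facade.migrate("quote", lambda payload: "new:quote")

quote_result = facade.handle("quote", {})
refund_result = facade.handle("refund", {})
still_legacy = facade.remaining_on_legacy()
```

Callers only ever talk to the facade, so each module can move across independently. Once `remaining_on_legacy()` returns an empty list, nothing is routed to the legacy system any more.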
“Ugly” rewrites can be considered a waste of time: if you’re rewriting an application, then you might as well build it correctly, right? Unfortunately, as Thomas Chia, CTO at Chocolate Finance, pointed out, this violates Gall’s Law, which holds that a complex system that works is invariably found to have evolved from a simple system that worked. By building the "correct" system in one go, we are trying to jump straight to complexity, and thus are bound to fail. A continuous set of small changes must be made to gradually replace legacy systems with new applications. This approach almost always requires a set of "ugly" intermediate steps, yet each iteration is being used by real users - proving it actually works.
So, rewrite or renovate? Collectively, CTOs say: try not to do it! But if your only option is to rewrite then try to do it in small iterative steps, have a plan around your data migration that minimises disruption, and ensure that there is a strong alignment with senior management. But if you’re a mid-level engineer who read this and thought, “these guys aren’t hardcore enough”, then I hear Elon is hiring…
With thanks to Manoj Awasthi, Thomas Chia, Gerry Eng, Christian Fischer, Amit Gupta, Elon Musk, Diego Rojas, Sam Simopoulos, Sunny Singh, Nino Ulsamer and everyone in the F’in Tech Community.