When Rewriting Your App Pays Off and When It Kills You

A founder's framework for choosing refactor, strangler-fig, or full rewrite without betting the roadmap on a guess.

Derrick S. K. Siawor · January 8, 2025 · 8 min read

Two dirt paths diverging through a lush green forest — Photo · Jens Lelie / Unsplash

A founder asks for a rewrite the way a homeowner asks to knock down a wall. The wall is annoying, the room feels small, and a clean slate sounds like relief. Then the contractor finds the wall is load-bearing, and the relief turns into eighteen months of scaffolding while the family lives in a hotel. Software rewrites fail the same way, and the people who pay are rarely the engineers who pushed for them. The roadmap stalls, the competition ships, and the company spends a year producing something that, on a good day, matches what it already had.

The decision between refactoring what you have and rewriting from zero is one of the highest-stakes calls an operator makes, and it usually gets made on vibes. This is the framework I use with clients to make that call on evidence instead of frustration, so the choice protects the roadmap rather than betting it on a guess.

The default is no, and here is why

In 2000, Joel Spolsky wrote what is still the sharpest warning in the field. He called rewriting working software from scratch "the single worst strategic mistake that any software company can make," using Netscape as the corpse. Netscape threw away the code behind its browser and rebuilt it. The gap between its last major release and the next beta ran nearly three years, during which the company shipped nothing, couldn't answer Internet Explorer, and watched its market share evaporate. (Joel Spolsky, Things You Should Never Do, Part I)

His core point holds up two decades later. The ugly parts of an old codebase are not ugly because the previous team was incompetent. Each strange conditional and odd workaround usually encodes a real bug someone hit in production, a browser quirk, a tax edge case, a payment processor that returns success and then reverses the charge. Throw the code away and you throw away that hard-won knowledge, then spend the next year rediscovering it one outage at a time. The new system looks clean right up until it meets the same reality the old one survived.

There is a second trap waiting on the other side. Fred Brooks named it the second-system effect in 1975: the rewrite is where a confident team piles in every feature it deferred, every abstraction it always wanted, every generalization it swore it would build "properly this time." The first system was lean because nobody knew what they were doing yet. The second one collapses under its own ambition. The rewrite that was supposed to simplify becomes the most over-engineered thing the team ever ships.

So the honest default is no. A working application earns the benefit of the doubt.

Count the real cost of staying put

The default is not "do nothing." Living with a decaying codebase has a price, and the only way to make an honest decision is to put a number on it, the same evidence-over-feeling discipline that tells you whether technical debt is quietly killing your roadmap.

Research from Sonar across more than 200 projects puts technical debt at roughly $306,000 per year for a million lines of code, climbing toward $1.5 million over five years. (Sonar, Cost of Technical Debt) JetBrains' 2025 developer survey found engineers spending two to five working days a month wrestling debt, and 71% of developers losing at least a quarter of their time to it. (Pragmatic Coders, cost of tech debt) McKinsey's number is blunter still: companies burn around 20% of IT budgets on the consequences of debt, with the true cost running higher.

For a founder, this converts an abstract feeling into a line item. Pull three real figures from your own team, the kind of grounded signals that measure an engineering team without vanity metrics. How many days a month do engineers spend firefighting instead of building? How long does a typical feature take now versus a year ago? How often does a deploy cause an incident? If those numbers are flat and survivable, the codebase is annoying, not dying, and a rewrite is the wrong tool. If they are climbing steeply and every quarter is worse than the last, you have a real case to evaluate, not just irritation.
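As a rough sketch of that measurement, the three figures can be tracked per quarter and checked for a steady climb. Everything named here (`QuarterSignals`, `is_climbing`, `assess`, the 10% threshold, and the middle "watch" verdict) is an illustrative assumption, not a standard metric or tool.

```python
from dataclasses import dataclass


@dataclass
class QuarterSignals:
    """One quarter of team data. Field names are hypothetical."""
    firefighting_days_per_engineer_month: float
    median_feature_days: float
    deploys: int
    deploys_causing_incident: int

    @property
    def incident_rate(self) -> float:
        # Share of deploys that caused an incident; 0.0 if nothing shipped.
        return self.deploys_causing_incident / self.deploys if self.deploys else 0.0


def is_climbing(values: list[float], min_growth: float = 0.10) -> bool:
    """True if each quarter is worse than the previous one by more than min_growth.

    The 10% default is an assumption; choose a threshold that fits your team.
    """
    return len(values) >= 2 and all(
        later > earlier * (1 + min_growth)
        for earlier, later in zip(values, values[1:])
    )


def assess(quarters: list[QuarterSignals]) -> str:
    """Turn the three signals into a coarse verdict (oldest quarter first)."""
    trends = [
        is_climbing([q.firefighting_days_per_engineer_month for q in quarters]),
        is_climbing([q.median_feature_days for q in quarters]),
        is_climbing([q.incident_rate for q in quarters]),
    ]
    if all(trends):
        return "evaluate"  # every signal worsens each quarter: a real case
    if any(trends):
        return "watch"     # assumption: mixed signals, keep measuring
    return "stay"          # flat and survivable: annoying, not dying


history = [
    QuarterSignals(2, 5, 40, 2),
    QuarterSignals(3, 8, 40, 4),
    QuarterSignals(5, 12, 40, 8),
]
print(assess(history))  # evaluate
```

The point of the sketch is only that the inputs come from your own team's records; the verdict labels are a convenience for discussion, not a decision.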

The questions that actually decide it

A rewrite is justified far less often than it is proposed. These are the conditions that genuinely point toward one.

The platform is a dead end, not just dated. An unsupported language runtime, a framework that no longer gets security patches, a hosting model your provider is sunsetting. This is a true forcing function. Wanting to be on a newer stack is not.
The product itself needs to change, not just the code. Basecamp rebuilt their product several times, but as their team has written, the obstacle was never a crufty codebase. It was that hundreds of thousands of people had built workflows around the old one. (Herb Caudill, Lessons from six software rewrite stories) If what you actually want is a different product, say so, because that is a product decision wearing an engineering costume, the same disguise that trips up the build-versus-buy call when a workflow is really a product question.
You can keep shipping during the work. If your plan requires a feature freeze, you have already lost. The freeze is exactly what handed Netscape's market to Microsoft. This is also where the build, buy, or outsource decision on engineering velocity bites: a rewrite that consumes your whole team for a year is a velocity decision in disguise.
The team that built it understands why it works. A rewrite led by people who never saw the original failures will rediscover every one of them. If the institutional memory is gone, that is an argument against a rewrite, not for it.

If you cannot answer yes to the platform question or the product question, the honest move is to refactor. The good news is that modern technique has made "refactor" far more powerful than the slow, hope-it-improves grind it used to be.

How to modernize without the big-bang bet

The reason the rewrite-versus-refactor argument felt binary for so long is that the alternative to a rewrite used to be weak. It is not anymore. The discipline now is incremental replacement, and the canonical version is the strangler fig pattern, named by Martin Fowler after the Queensland fig that grows around a host tree until it can stand on its own and the original quietly dies away.

The mechanics are simple and the risk profile is the whole point. You put a façade, usually a proxy or router, in front of the old system. New requests for a given slice of functionality get routed to a freshly built service, while everything else keeps hitting the legacy code. Slice by slice, you migrate features behind that façade until the old system has nothing left to do and you switch it off. (Microsoft Azure Architecture Center, Strangler Fig Pattern)

        ┌──────────┐
client ─▶│  façade  │─▶ legacy system (shrinking)
        │  router  │─▶ new service (growing)
        └──────────┘
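The routing in the diagram can be sketched in a few lines. This is a minimal in-process illustration, assuming path-prefix routing; in practice the façade is usually a reverse proxy or API gateway, and `Facade`, `legacy_app`, and `new_billing_service` are hypothetical names.

```python
from typing import Callable

Handler = Callable[[str], str]


def legacy_app(path: str) -> str:
    # Stand-in for the existing system.
    return f"legacy handled {path}"


def new_billing_service(path: str) -> str:
    # Stand-in for one freshly built slice.
    return f"new service handled {path}"


class Facade:
    """Sends migrated path prefixes to new handlers, everything else to legacy."""

    def __init__(self, legacy: Handler) -> None:
        self._legacy = legacy
        self._migrated: dict[str, Handler] = {}

    def migrate(self, prefix: str, handler: Handler) -> None:
        # Start sending one slice of functionality to the new service.
        self._migrated[prefix] = handler

    def roll_back(self, prefix: str) -> None:
        # Route the slice back to legacy; a no-op if it was never migrated.
        self._migrated.pop(prefix, None)

    def handle(self, path: str) -> str:
        # The longest matching prefix wins so nested slices can migrate separately.
        matches = [p for p in self._migrated if path.startswith(p)]
        if matches:
            return self._migrated[max(matches, key=len)](path)
        return self._legacy(path)


facade = Facade(legacy_app)
facade.migrate("/billing/", new_billing_service)
print(facade.handle("/billing/invoices"))  # new service handled /billing/invoices
print(facade.handle("/reports/q3"))        # legacy handled /reports/q3
```

Callers only ever talk to `facade.handle`, so the legacy system can shrink behind it without any client noticing which side answered.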

The difference in outcome is enormous. A full rewrite ships nothing until it is finished, and software estimates being what they are, "finished" arrives late. Incremental modernization delivers visible improvement in weeks because each migrated piece goes live on its own. If a slice goes wrong, you route it back to the old path and the blast radius is one feature, not the company. You can sharpen that safety property further by putting each migrated slice behind a feature flag you can kill in one click. The business never stops. That is the entire game, and it is the spine of how we handle full-stack web apps end to end when a client's product is already carrying real users and revenue.
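A feature flag for a migrated slice can be as small as a rollout percentage with a kill switch. The sketch below is one possible shape, assuming users are bucketed by a hash of their ID; `SliceFlag` and its methods are hypothetical, and a real team would more likely use an existing flag service.

```python
import hashlib


class SliceFlag:
    """Gradual rollout for one migrated slice, with a one-call kill switch."""

    def __init__(self, name: str, rollout_percent: int = 0) -> None:
        self.name = name
        self.rollout_percent = rollout_percent

    def kill(self) -> None:
        # Send everyone back to the legacy path immediately.
        self.rollout_percent = 0

    def routes_to_new(self, user_id: str) -> bool:
        # Hash flag name + user so each user gets a stable bucket from 0 to 99.
        # (Python's built-in hash() is salted per process, so it is not used.)
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode()).digest()
        bucket = int.from_bytes(digest[:2], "big") % 100
        return bucket < self.rollout_percent


flag = SliceFlag("new-billing", rollout_percent=25)
path = "new" if flag.routes_to_new("user-42") else "legacy"

flag.kill()
print(flag.routes_to_new("user-42"))  # False
```

Because the bucket depends only on the flag name and the user ID, a given user stays on the same side of the flag between requests until the percentage changes.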

Two practices make the strangler safe rather than reckless. Before you touch a tangled module, wrap it in characterization tests, tests that pin down what the code does today, bugs and all, so you can change the implementation underneath without changing the behavior on top. And route through an abstraction layer so the old and new implementations can coexist behind one interface while you migrate traffic gradually. With both in place, every step is reversible, which is precisely what a rewrite never is.
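Both practices fit in one small sketch. The legacy function, its truncation quirk, and all names below are invented for illustration; the pinned values stand for outputs recorded by running the legacy code, not values taken from a spec.

```python
from typing import Protocol


def legacy_shipping_fee(weight_kg: float) -> float:
    # Hypothetical legacy code with a quirk: weight is truncated, not rounded.
    return 4.0 + int(weight_kg) * 1.5


class ShippingCalculator(Protocol):
    """The abstraction layer: callers depend on this, not on an implementation."""

    def fee(self, weight_kg: float) -> float: ...


class LegacyCalculator:
    def fee(self, weight_kg: float) -> float:
        return legacy_shipping_fee(weight_kg)


class RoundingCalculator:
    """A well-meaning rewrite that 'fixes' the truncation and changes behavior."""

    def fee(self, weight_kg: float) -> float:
        return 4.0 + round(weight_kg) * 1.5


# Characterization cases: what the legacy code does today, bugs and all.
PINNED = {0.0: 4.0, 0.9: 4.0, 1.0: 5.5, 2.7: 7.0}


def characterization_failures(calc: ShippingCalculator) -> list[str]:
    """Return one message per pinned case the implementation does not reproduce."""
    return [
        f"weight {w}: expected {expected}, got {calc.fee(w)}"
        for w, expected in PINNED.items()
        if calc.fee(w) != expected
    ]


print(characterization_failures(LegacyCalculator()))         # []
print(len(characterization_failures(RoundingCalculator())))  # 2
```

The second result is the useful one: the pinned cases flag the behavior change at 0.9 kg and 2.7 kg before any customer sees a different fee, and the team can then decide deliberately whether the quirk is a bug to fix or behavior to preserve.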

Making the call

Figure: Rewrite-versus-refactor decision tree leading to a strangler fig migration behind a façade

Run the decision in three passes. First, measure: turn the cost of staying put into real numbers from your own team, not a feeling. Second, test the forcing functions: Is the platform genuinely dead? Does the product itself need to change? Can you ship throughout? If none of those land, you are refactoring, and the strangler fig is your route. Third, if a rewrite truly is warranted, scope it as a sequence of small replacements behind a façade, never a single heroic switch, and write down what "done" means before anyone starts so the second-system effect has nowhere to hide.
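The three passes can be written down as a small function, if only to force each answer to be explicit. This is one reading of the framework, not a formula: the field names, the verdict strings, and the ordering of the checks are assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    """Answers gathered before the meeting. All fields are hypothetical."""
    costs_climbing: bool        # pass 1: measured, not felt
    platform_dead: bool         # pass 2: unsupported runtime, no security patches
    product_must_change: bool   # pass 2: a product decision, stated as one
    can_ship_throughout: bool   # pass 2: no feature freeze required
    done_is_written_down: bool  # pass 3: guards against the second-system effect


def make_the_call(s: Situation) -> str:
    forcing_function = s.platform_dead or s.product_must_change
    if not forcing_function:
        # No dead platform and no product change: never a rewrite.
        if s.costs_climbing:
            return "refactor incrementally (strangler fig)"
        return "leave it: annoying, not dying"
    if not s.can_ship_throughout:
        return "rescope: the plan requires a feature freeze"
    if not s.done_is_written_down:
        return "define 'done' before anyone starts"
    return "replace in small slices behind a façade"


print(make_the_call(Situation(True, False, False, True, False)))
# refactor incrementally (strangler fig)
```

Note that no branch returns a big-bang rewrite; even when a forcing function is present, the output is a staged replacement.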

A modernization that outlives the people who run it also needs a plan for handing off software so it survives the person who built it, so the institutional memory that justified refactoring does not walk out the door mid-migration.

If you are staring at this decision with real money and a real roadmap on the line, an outside read from someone who has shipped both outcomes is worth more than another internal debate. That is the kind of call we sit in on during a senior technical consultation, and the answer is more often "modernize in place, keep shipping" than founders expect. The clean slate is seductive. A company that is still standing a year from now is better.

