DERKONLINE

When To Refactor Your Legacy Product And When To Rewrite It

A clear-eyed framework for the most expensive software decision, and why the strangler fig usually beats the doomed rewrite.

Derrick S. K. Siawor, 7 min read

Every software company eventually arrives at the same crossroads. The product that got you here has become hard to change. Adding a feature takes three times as long as it should. New engineers take months to be productive. The codebase feels, to use the technical term, crufty, and you start to wonder whether technical debt is quietly killing your roadmap. And someone, usually the newest senior engineer, says the words that start the most expensive argument in software: "We should just rewrite it."

It is the most seductive idea in the room and one of the most dangerous. The decision between refactoring a legacy product and rewriting it from scratch is among the most consequential a software business makes, because getting it wrong does not cost a sprint; it can cost the company. History is littered with rewrites that killed the products that started them, and the most famous example is a cautionary tale worth carrying with you.

The ghost of Netscape

In 2000, Joel Spolsky wrote an essay called "Things You Should Never Do," in which he called a from-scratch rewrite "the single worst strategic mistake that any software company can make." His evidence was Netscape, which threw out its working browser code and rebuilt from zero. Three years of development produced a buggy, feature-incomplete version while Internet Explorer went from afterthought to total market dominance. By the time the rewrite shipped, the war was lost. The company had spent its most precious years building back to where it already was while a competitor sprinted past.

Spolsky's argument rested on two observations that have aged remarkably well. The first: the crufty-looking parts of an old codebase often embed hard-earned knowledge. Every weird conditional, every strange special case, every ugly workaround usually exists because a real bug happened in production and someone fixed it. That code looks like a mess but is actually accumulated wisdom about corner cases you have forgotten and will rediscover the hard way. Throw out the code and you throw out the bug fixes, then spend a year re-encountering every one of them.

The second: a rewrite is a long undertaking during which you are not improving the product your customers actually use, while your competition keeps shipping. The rewrite buys you nothing the customer can see for a very long time, and that time is a gift to everyone competing with you.

Why rewrites fail more than they should

Beyond Netscape, rewrites fail for a structural reason Fred Brooks named back in 1975: the second-system effect. The team that builds a second version, freed from the constraints of the first, tends to over-engineer it, cramming in every feature and abstraction they wished they had the first time. The replacement balloons in scope, slips its schedule, and becomes its own unmaintainable mess before it even ships.

There is also a simpler trap. What teams underestimate is not how hard the new system is to build; it is how much the old system actually does. A legacy product that has run for years has absorbed thousands of small requirements, edge cases, and integrations that nobody has written down. The rewrite has to match all of it before it can replace the original, and "all of it" is far larger than anyone's mental model. The rewrite reaches feature parity months or years later than estimated, if it reaches it at all.

When refactoring is the right call

For most legacy products, the honest answer is to refactor, not rewrite, and to do it incrementally. Refactoring keeps the product working and earning the entire time. You improve the code in place, module by module, shipping continuously, never going dark. The customer keeps getting a working product, the team keeps its knowledge of why the weird code exists, and you never bet the company on a big-bang cutover.

The pattern that makes this work at scale, the one nearly every successful migration uses, is the strangler fig. Named after the vine that grows around a tree and gradually replaces it, the approach is to build the new functionality alongside the old system, route traffic and behavior to the new pieces incrementally as they prove out, and decommission the old code only once nothing depends on it. The system is always running. There is no flag day where everything switches at once and everyone prays. You replace the engine while the car is still driving, one part at a time, and you can stop, reassess, or reverse at any point. It is the same control a one-click feature flag gives you over a risky change. This is how you get the benefits of new code without the existential risk of a full rewrite.
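To make the routing idea concrete, here is a minimal sketch in Python of a strangler-style router. Everything in it is hypothetical and for illustration only: the StranglerRouter class, the legacy_app and new_billing handlers, the "/billing" prefix, and the percentage-based rollout are assumptions, not a description of any particular product or framework. It shows the three moves described above: send a slice of traffic to the new code, widen the slice as it proves out, and reverse instantly if it does not.

```python
import hashlib
from typing import Callable, Dict, Tuple

# A handler takes (path, user_id) and returns a response body.
Handler = Callable[[str, str], str]


def legacy_app(path: str, user_id: str) -> str:
    # Stand-in for the existing system (hypothetical).
    return f"legacy handled {path}"


def new_billing(path: str, user_id: str) -> str:
    # Stand-in for a newly built replacement module (hypothetical).
    return f"new handled {path}"


class StranglerRouter:
    """Routes each request to new code where it exists, else to legacy."""

    def __init__(self, legacy: Handler) -> None:
        self.legacy = legacy
        # path prefix -> (new handler, percent of users routed to it)
        self.routes: Dict[str, Tuple[Handler, int]] = {}

    def migrate(self, prefix: str, handler: Handler, percent: int) -> None:
        """Send `percent` of users on `prefix` to the new handler."""
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self.routes[prefix] = (handler, percent)

    def rollback(self, prefix: str) -> None:
        """Reverse a migration: all traffic on `prefix` returns to legacy."""
        self.routes.pop(prefix, None)

    @staticmethod
    def _bucket(user_id: str) -> int:
        # Stable bucket in 0..99, so a given user always lands on the same side.
        digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
        return int(digest, 16) % 100

    def handle(self, path: str, user_id: str) -> str:
        for prefix, (handler, percent) in self.routes.items():
            if path.startswith(prefix) and self._bucket(user_id) < percent:
                return handler(path, user_id)
        # Anything not yet migrated keeps running on the old system.
        return self.legacy(path, user_id)


if __name__ == "__main__":
    router = StranglerRouter(legacy_app)
    router.migrate("/billing", new_billing, percent=10)   # start small
    print(router.handle("/billing/invoices", "user-42"))
    router.migrate("/billing", new_billing, percent=100)  # proven out
    print(router.handle("/billing/invoices", "user-42"))
    router.rollback("/billing")                           # or reverse
    print(router.handle("/billing/invoices", "user-42"))
```

In this sketch the old code can be retired only once every prefix it served is routed at 100 percent to a replacement. Until then the legacy handler remains the default, which is what keeps the product running throughout.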

When a rewrite is actually justified

This is not an absolute rule, and pretending it is would be its own kind of dishonesty. There are real cases where a rewrite is the right call. The genuine ones share a quality: continuing with the current system is itself the larger risk.

A rewrite can be justified when the underlying technology is truly dead: a platform with no security updates, a language nobody can hire for, a dependency that no longer exists. A platform that stops receiving security patches is not just a maintenance headache; skipping security carries a real, quantifiable cost that eventually forces the decision. It can be justified when the original architecture cannot support a fundamental new direction the business has committed to, and no amount of refactoring bridges the gap. And it can be justified when the system is small enough that the rewrite is genuinely bounded: a contained service, not the whole product, where the cost and risk are proportionate.

Notice the pattern. The justified rewrites are the constrained ones, the ones with a hard external forcing function or a small, well-understood scope. The doomed rewrites are the ambitious, open-ended "let's rebuild the whole thing the right way this time" projects driven by an aesthetic distaste for the existing code rather than a concrete, bounded need. The same trap shows up in the broader rewrite-versus-rebuild decision, where an unbounded rewrite can quietly kill the company. "This code is ugly" is not a business reason. "This platform stops receiving security patches next year" is.

How to make the decision honestly

Before you commit either way, force three questions onto the table. First, what is the actual business problem the rewrite is meant to solve, stated without reference to code aesthetics? If the answer is "the code is hard to work with," that is usually a refactoring problem. If it is "we cannot do X the business needs and the architecture fundamentally prevents it," that may be a rewrite problem. Second, how much does the current system actually do, including the undocumented edge cases, and is the team being honest about the true scope of matching it? Third, can you get most of the benefit incrementally with a strangler approach, keeping the product alive the whole time? If yes, that is almost always the better bet. One related question is also worth asking: whether the real risk is concentrated in one person who built the system and not in the code at all.

Figure: Decision tree. Refactor by default; rewrite only when the scope is bounded or the platform is dead.
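The questions above can also be written down as a small decision function. The sketch below is one possible reading of the framework, not a definitive rule. The Assessment fields, the Recommendation values, and especially the RESCOPE outcome (for a forced, unbounded, non-incremental situation) are assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Recommendation(Enum):
    REFACTOR = "refactor in place, incrementally"
    STRANGLER_FIG = "replace incrementally behind a strangler fig"
    BOUNDED_REWRITE = "rewrite, with a bounded scope"
    # Assumption: the article names no outcome for this case.
    RESCOPE = "shrink the scope before committing to anything"


@dataclass(frozen=True)
class Assessment:
    platform_is_dead: bool              # no security patches, no hires, dead dependency
    architecture_blocks_business: bool  # cannot support a committed new direction
    scope_is_bounded: bool              # a contained service, not the whole product
    can_migrate_incrementally: bool     # a strangler approach is feasible


def recommend(a: Assessment) -> Recommendation:
    # "This code is ugly" is not an input here on purpose.
    staying_put_is_bigger_risk = (
        a.platform_is_dead or a.architecture_blocks_business
    )
    if not staying_put_is_bigger_risk:
        return Recommendation.REFACTOR       # the default
    if a.can_migrate_incrementally:
        return Recommendation.STRANGLER_FIG  # preferred over a big bang
    if a.scope_is_bounded:
        return Recommendation.BOUNDED_REWRITE
    return Recommendation.RESCOPE


if __name__ == "__main__":
    ugly_but_working = Assessment(False, False, False, True)
    print(recommend(ugly_but_working).value)
```

Note what is absent from the inputs: there is no field for how the code looks or how the team feels about it. Only an external forcing function moves the answer away from refactoring.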

The most expensive decision in software is also the one most often made on emotion: the frustration of working in an old codebase, the optimism that this time the team will get it right. Discipline here is worth more than enthusiasm. Refactor by default, rewrite only when staying put is the bigger risk and the scope is genuinely bounded, and prefer the strangler fig to the big bang every single time.

This is the kind of decision we help teams think through clearly, because we have seen what the wrong call costs. It is the same disciplined trade-off as the build-versus-buy call on a new piece of software: a choice that should turn on bounded need, not on appetite. Whether the right answer is to harden and refactor what you have or to carefully stand up something new, getting the diagnosis right comes first, and it is exactly the conversation our consultation and engineering work is built for. The rewrite that feels brave in the meeting is often the one that hands your market to a competitor while you build back to even. Choose with your eyes open.