The hidden costs of legacy code - and how to stop them

“No one touches the legacy code.”

Many developers have heard these words on their first day at a new job. It’s well known that legacy code (code without tests, left over from developers who have moved on) can be a real pain to deal with on a practical level, and this pain has been captured in dozens of memes.


But increasingly, legacy code is also a huge cost for businesses. So how exactly do you quantify the problems caused by legacy code, and what can be done to overcome this expense?

Maintenance and Refactoring

The real costs of dealing with legacy code come from maintaining and refactoring it, which, because of the nature of legacy code and software decay, is an ongoing and essentially open-ended cost. Refactoring (cleaning up code without changing its behavior) takes hours and is often pushed to the bottom of the to-do list. Manually maintaining legacy code by writing unit tests can even involve learning a whole new programming language.
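To make "cleaning up code without changing its behavior" concrete, here is a minimal Java sketch. The shipping-cost method, its numbers, and every name in it are hypothetical, invented purely for illustration. It shows a tangled legacy method next to a tidier version, plus a check that the two agree on a range of inputs, which is roughly the safety net a unit test suite provides during refactoring.

```java
public class ShippingRefactor {
    // Hypothetical legacy method: nested conditionals and magic numbers.
    static int legacyShippingCents(int weightGrams, boolean express) {
        int c = 0;
        if (weightGrams > 0) {
            if (weightGrams <= 1000) {
                c = 500;
            } else {
                c = 500 + ((weightGrams - 1000) / 500) * 100;
            }
            if (express) {
                c = c * 2;
            }
        }
        return c;
    }

    static final int BASE_CENTS = 500;         // flat fee up to the base weight
    static final int BASE_WEIGHT_GRAMS = 1000; // weight covered by the flat fee
    static final int STEP_GRAMS = 500;         // size of each extra weight step
    static final int STEP_CENTS = 100;         // surcharge per full extra step

    // Refactored version: same rules, named constants, flatter control flow.
    static int refactoredShippingCents(int weightGrams, boolean express) {
        if (weightGrams <= 0) {
            return 0;
        }
        int extraSteps = Math.max(0, weightGrams - BASE_WEIGHT_GRAMS) / STEP_GRAMS;
        int cost = BASE_CENTS + extraSteps * STEP_CENTS;
        return express ? cost * 2 : cost;
    }

    // Compares both versions over a range of weights, with and without express.
    static boolean behaviorPreserved() {
        for (int weight = -10; weight <= 5000; weight++) {
            for (boolean express : new boolean[] {false, true}) {
                if (legacyShippingCents(weight, express)
                        != refactoredShippingCents(weight, express)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("behavior preserved: " + behaviorPreserved());
    }
}
```

Without a check like `behaviorPreserved()`, there is no quick way to know whether the cleanup quietly changed a result, which is one reason refactoring untested legacy code keeps getting postponed.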

Legacy code is almost always accompanied by technical debt: the cost incurred by meeting time constraints with functional but imperfect code in the short term, which will need to be refactored in the long term. One analysis of technical debt in 1,400 applications at 160 organizations found that an average-sized application of 300,000 lines of code has $1,083,000 worth of technical debt, or $3.61 per line of code.
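The per-line figure is simply the total debt divided by the number of lines. The short Java sketch below reproduces that arithmetic and then applies the same rate to a hypothetical 500,000-line codebase. Both the 500,000-line size and the idea that debt scales linearly with size are assumptions made for illustration, not findings of the analysis.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class TechDebtEstimate {
    // Cost per line = total debt / lines of code, rounded to cents.
    static BigDecimal costPerLine(BigDecimal totalDebt, long linesOfCode) {
        return totalDebt.divide(BigDecimal.valueOf(linesOfCode), 2, RoundingMode.HALF_UP);
    }

    // Rough estimate: per-line rate times codebase size (assumes linear scaling).
    static BigDecimal estimatedDebt(BigDecimal perLine, long linesOfCode) {
        return perLine.multiply(BigDecimal.valueOf(linesOfCode));
    }

    public static void main(String[] args) {
        // Figures quoted in the article: $1,083,000 across 300,000 lines.
        BigDecimal perLine = costPerLine(new BigDecimal("1083000"), 300_000);
        System.out.println("Per line: $" + perLine); // Per line: $3.61

        // Hypothetical 500,000-line codebase at the same rate.
        System.out.println("Estimate: $" + estimatedDebt(perLine, 500_000)); // Estimate: $1805000.00
    }
}
```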

In some ways, dealing with legacy code is like doing the laundry: the longer the task is delayed, the larger it becomes, until it builds up like a pile of dirty clothes, and becomes incredibly daunting. The difference is that laundry has a clear-cut solution, but dealing with legacy code becomes more complex the longer it sits—as if the clothes sewed themselves to each other (and to the laundry basket) a little bit more each day.

Research shows that in companies with 100+ developers, and/or an active codebase of 500K+ lines of code, software maintenance will account for more than half of the overall development budget.  

According to research carried out by the Consortium for IT Software Quality, legacy systems cost U.S. companies $596 billion in 2018, compared to $70 billion in 2003.

In 2015, Hitachi Consulting commissioned a study that found legacy systems were holding back 90% of businesses. Maintaining historic legacy code is like pressing the snooze button on your alarm clock: it postpones dealing with the problem, but eventually you’ll have to get up. 

Outages 

Airlines seem to be prone to legacy code disasters because they have huge codebases for their online booking systems. One of the most notable airline outages occurred in 2016, when Delta Air Lines came to a standstill and had to ground its entire fleet after its reservation management system crashed; its IT systems (including flight bookings) were taken offline until the issue was resolved.

These outages can occur even when legacy code is being actively maintained, because the code is so tricky to maintain.

Security Breaches

Legacy code can cause serious issues, including security breaches. This is especially true with banks and financial services that are looking to mitigate any potential risks, particularly when it involves sensitive information like personal finances. A study carried out by Digital Reality found that 31% of IT leaders saw an issue with legacy infrastructure.

One of the world’s largest credit scoring agencies, Equifax, experienced a high-profile security breach in 2017 when hackers compromised the data of 148 million people from around the world. A report by the U.S. Government Accountability Office concluded that the breach was entirely avoidable and was partly caused by legacy code from the website which dated back to the 1970s, making it one of the most notable cases of an avoidable security breach. Security breaches caused by poorly maintained legacy code can have major financial repercussions: the cost of cleaning up after the Equifax breach was estimated to be in the region of $1.3 billion. Besides the financial costs, security breaches also damage trust among consumers and the wider public, and PR disasters can be costly to sort out.

Time 

Maintaining legacy code is not only a financial burden. Studies have shown that approximately 50% of the time developers spend on maintaining code is actually spent trying to understand the code that they are working to maintain. This severely impacts their productivity and prevents them from working on new projects. 

Automating testing is one straightforward way to save time. AI software can write unit tests in a matter of seconds, dramatically reducing the time spent on them.

The Solution 

In a survey carried out by Digital Reality, 62% of IT teams reported that they are investing in addressing legacy issues and modernizing infrastructure. One way to do this is through automated unit testing. Diffblue has developed a tool, Diffblue Cover, that automatically generates unit tests for Java code, including legacy code. By creating a test suite automatically, Diffblue Cover makes messy, expensive, and potentially risky legacy code accessible again, helping to protect businesses from hacks and breaches and saving time and money.