Refactoring mountains of legacy JavaScript code

Brendan McKeown

Senior Technical Lead

March 23, 2022

Illustration of multiple screens stacked up like a mountain with code spilling off them. Illustration by Sarah Kula, Marketing Designer.

Several years ago, our team inherited a project that included over 6,000 lines of JavaScript source code. Not libraries, not packages—6,000 individual lines of source code committed to the project.

The code was a combination of “vanilla” JS and jQuery, built to render data visualizations by manipulating a canvas object on a web page. The datasets powering these visualizations were vast, containing thousands of data points for each page.

The data visualization code included excessive duplication of variables, functions, and statements. There were nested loops and conditional statements up to five levels deep. “Dead code” (unused functions and variables) was located next to the main rendering loop; meanwhile, data processing logic was mixed with data visualization rendering code in the same functions. The icing on the proverbial cake was that there was zero unit test coverage for the source code.

To state the obvious, someone needed to refactor this code.

The opportunity: Improve the experience for developers and customers

A major framework upgrade project created an opening for us to address our JavaScript problem. We knew we wanted to lessen the fear and avoidance that developers felt towards the codebase and reduce the level of effort required for adding new features and performing maintenance in the future.

Because of the age of the JavaScript, we set other goals, too:

Modernize the codebase by refactoring pre-ES6 code to ES6+ code
Replace jQuery modules with React components
Replace the existing data visualization code with an open-source library that uses SVG instead of Canvas to improve responsive rendering
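
To make the first goal concrete, here is a minimal sketch of what a pre-ES6 to ES6+ refactor can look like. The function names and the data shape are hypothetical and are not taken from the project; the point is only that the modern version expresses the same behavior with less ceremony.

```javascript
// Hypothetical example: names and data shape are illustrative only.

// Before: pre-ES6 style with var, a manual loop, and string concatenation.
function buildLabelsLegacy(points) {
  var labels = [];
  for (var i = 0; i < points.length; i++) {
    var point = points[i];
    if (point.value !== null && point.value !== undefined) {
      labels.push(point.name + ': ' + point.value);
    }
  }
  return labels;
}

// After: ES6+ style with const, arrow functions, destructuring,
// and template literals. The behavior is unchanged.
const buildLabels = (points) =>
  points
    .filter(({ value }) => value != null) // drops null and undefined, keeps 0
    .map(({ name, value }) => `${name}: ${value}`);

const sample = [
  { name: 'A', value: 10 },
  { name: 'B', value: null },
  { name: 'C', value: 0 },
];

console.log(buildLabelsLegacy(sample)); // [ 'A: 10', 'C: 0' ]
console.log(buildLabels(sample)); // [ 'A: 10', 'C: 0' ]
```

Keeping the old and new versions side by side for a while, and comparing their output on the same input, is one low-risk way to confirm that a syntax modernization did not change behavior.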

The process: How we refactored vast amounts of JavaScript code

We started with user interface testing and searching through previous documentation of requirements. From there, our team compiled a complete list of the data visualization features on the site.

In the source code, we refactored the JavaScript code in small pieces, consolidating functions and eliminating code duplication, extracting code into separate modules and functions with a single responsibility, and removing unused functions and variables. We began to see which parts of the JavaScript were responsible for specific data visualization features, which allowed us to start separating the data processing logic from the view and rendering logic.
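
The separation described above might look something like the following sketch. Everything here is an assumption made for illustration: the function names, the bar-chart geometry, and the recording stand-in for a canvas context are hypothetical, and only `fillRect` is a real canvas 2D method.

```javascript
// Hypothetical sketch: function names and the data shape are assumptions.

// Before: one function both transforms the data and draws it.
function drawBarsLegacy(ctx, rows) {
  for (var i = 0; i < rows.length; i++) {
    if (rows[i].count > 0) {
      var height = rows[i].count * 2; // scaling mixed into drawing code
      ctx.fillRect(i * 12, 100 - height, 10, height);
    }
  }
}

// After: a pure function computes the bar geometry...
const toBars = (rows, { barWidth = 10, gap = 2, scale = 2, chartHeight = 100 } = {}) =>
  rows
    .map((row, index) => ({
      x: index * (barWidth + gap),
      y: chartHeight - row.count * scale,
      width: barWidth,
      height: row.count * scale,
    }))
    .filter((bar) => bar.height > 0);

// ...and a thin renderer only draws what it is given.
const drawBars = (ctx, bars) => {
  bars.forEach(({ x, y, width, height }) => ctx.fillRect(x, y, width, height));
};

// A stand-in for a canvas 2D context that records calls,
// so the sketch can run outside a browser.
const makeRecordingContext = () => {
  const calls = [];
  return { calls, fillRect: (...args) => calls.push(args) };
};

const rows = [{ count: 5 }, { count: 0 }, { count: 3 }];
const legacyCtx = makeRecordingContext();
const refactoredCtx = makeRecordingContext();

drawBarsLegacy(legacyCtx, rows);
drawBars(refactoredCtx, toBars(rows));

// Both versions issue the same drawing calls.
console.log(JSON.stringify(legacyCtx.calls) === JSON.stringify(refactoredCtx.calls)); // true
```

Once the geometry lives in a pure function like `toBars`, the renderer becomes small enough to swap out, and the data logic can be checked without a browser.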

With the canvas implementation isolated, we could replace it with the new data visualization library of our choice—Victory Charts. Data visualization features were slowly transitioned from vanilla JS, jQuery, and canvas into React and SVG. At the same time, a subset of our team began to focus on the data processing logic, looking for ways to improve the readability and performance of the code.

We replaced the tangle of deeply nested loops with functional pipes, making the code’s intent easy to read and the logic simple to extend. We used visual regression tests to ensure visual parity of the data visualizations between the production and development environments. We also performed manual regression testing to confirm that the refactor introduced no functional regressions.
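
As a rough illustration of the move from nested loops to functional pipes, consider the sketch below. The `pipe` helper, the step functions, and the region data are all hypothetical; the project's actual pipeline is not shown in this article.

```javascript
// Hypothetical sketch: helper names and the data shape are assumptions.

// pipe composes functions left to right: pipe(f, g)(x) equals g(f(x)).
const pipe = (...fns) => (input) => fns.reduce((acc, fn) => fn(acc), input);

// Before: nested loops and conditionals hide the intent.
function totalsByRegionLegacy(regions) {
  var totals = {};
  for (var i = 0; i < regions.length; i++) {
    for (var j = 0; j < regions[i].points.length; j++) {
      var p = regions[i].points[j];
      if (p.active) {
        if (p.value > 0) {
          totals[regions[i].name] = (totals[regions[i].name] || 0) + p.value;
        }
      }
    }
  }
  return totals;
}

// After: each step is a small named function.
const flattenPoints = (regions) =>
  regions.flatMap(({ name, points }) => points.map((p) => ({ ...p, region: name })));
const onlyActive = (points) => points.filter((p) => p.active);
const onlyPositive = (points) => points.filter((p) => p.value > 0);
const sumByRegion = (points) =>
  points.reduce((totals, { region, value }) => {
    // The accumulator is local to this reduce call, so mutating it is safe.
    totals[region] = (totals[region] || 0) + value;
    return totals;
  }, {});

// The pipeline reads as a list of steps, in order.
const totalsByRegion = pipe(flattenPoints, onlyActive, onlyPositive, sumByRegion);

const regions = [
  { name: 'north', points: [{ active: true, value: 4 }, { active: false, value: 9 }] },
  {
    name: 'south',
    points: [{ active: true, value: -1 }, { active: true, value: 6 }, { active: true, value: 1 }],
  },
];

console.log(totalsByRegionLegacy(regions)); // { north: 4, south: 7 }
console.log(totalsByRegion(regions)); // { north: 4, south: 7 }
```

Adding a new rule in the piped version means writing one more small function and inserting it into the `pipe` call, rather than adding another level of nesting.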

The results: Improved user experience

In the end, the refactored data visualizations looked and functioned almost the same as before. With the new data visualization library and the move from canvas to SVG, there were a few minor visual differences, but they all improved the user experience.

Icons are sharper
The data visualization container resizes with the browser viewport
Tooltips are more accurately positioned within the scrolling visualization container

The code is now much (much!) more readable, easier to interpret, and therefore easier to debug, maintain, and build upon. And now that we have isolated functions with discrete responsibilities, we can begin to write unit tests that cover the data processing logic.
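
To show why isolated functions make unit testing straightforward, here is a minimal sketch. The `normalize` function is a hypothetical stand-in for a piece of data processing logic, and the tiny `runTest` and `expectEqual` helpers exist only to keep the example dependency-free; a real project would more likely use an established test framework.

```javascript
// Hypothetical sketch: the function under test and its behavior are assumptions.

// A pure data-processing function: scales values into the range 0..1.
const normalize = (values) => {
  if (values.length === 0) return [];
  const min = Math.min(...values);
  const max = Math.max(...values);
  if (max === min) return values.map(() => 0); // avoid dividing by zero
  return values.map((v) => (v - min) / (max - min));
};

// A tiny stand-in for a test runner so the sketch has no dependencies.
const results = [];
const runTest = (name, fn) => {
  try {
    fn();
    results.push({ name, passed: true });
  } catch (error) {
    results.push({ name, passed: false, message: error.message });
  }
};
const expectEqual = (actual, expected) => {
  const a = JSON.stringify(actual);
  const e = JSON.stringify(expected);
  if (a !== e) throw new Error(`expected ${e} but got ${a}`);
};

runTest('returns an empty array for empty input', () => {
  expectEqual(normalize([]), []);
});
runTest('maps the minimum to 0 and the maximum to 1', () => {
  expectEqual(normalize([10, 20, 30]), [0, 0.5, 1]);
});
runTest('returns zeros when all values are equal', () => {
  expectEqual(normalize([7, 7]), [0, 0]);
});

results.forEach(({ name, passed, message }) =>
  console.log(`${passed ? 'PASS' : 'FAIL'} ${name}${message ? `: ${message}` : ''}`)
);
```

Because `normalize` takes plain data in and returns plain data out, none of these tests needs a browser, a canvas, or a rendered chart.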

Even though we had to spend a significant amount of time on the refactor, this is an investment that will pay for itself, saving minutes and sometimes hours every time a developer has to open up the code and make a change to the site. Maybe just as significantly, developers no longer fear working on the site!

What we learned: Set goals, test, and continuously refactor

Refactoring legacy code can be a valid alternative to an entire rewrite (i.e., trashing everything and starting from scratch).
When you’re getting into the process, set small, incremental, and measurable goals that a team can accomplish during a refactor. Don’t try to tackle the entire problem all at once.
With adequate automated and manual testing during the refactor, you can reduce the risk of regressions.
Account for refactoring in your long-term project planning and your day-to-day tasks. When project code or architecture is neglected, issues will compound over time, and the happiness of your developers will suffer. Refactoring code in little bits over time will always be more economical than doing a rewrite every couple of years.

