The Problem of Code Smell and Secrets to Effective Refactoring.

The SeaLights Software Metrics Guide
for Better and Faster CI/CD

This article is part of our series on Technical Debt. See other articles in the series:

In this article you will learn:

What is code smell and why it’s problematic
Common code smells, how to identify them and why they’re bad
The importance of refactoring
Refactoring flow and best practices for removing code smells
Quality intelligence—the silver bullet to focusing your refactoring efforts

What Is Code Smell?

A code smell is a problem in source code that is not a bug or strictly technically incorrect. Code will still compile and work as expected. But it indicates a violation of design principles that might lead to problems further down the road.

For example, duplicate code that was copy-pasted in several places in the source code, instead of creating one method and referencing it from each of those places, is a blatant violation of good coding practices. Some code smells are more subtle, but still indicate that programmers are taking shortcuts and not investing in quality code.

According to Martin Fowler, code smells are not problematic on their own. They are warning signals that there might be a real problem in the code. For example, long functions are considered a code smell, but not all long functions are necessarily bad or poorly designed. Fowler suggests that junior members of a development team identify code smells and review them together with senior members, who can evaluate if there is really a deeper problem in the code.

Common Code Smells

Duplicated Code and Logic

Duplicated code is considered one of the worse code smells. Beyond blatant copy paste, there are subtle duplications like parallel inheritance hierarchies and repetitive code structures.

Why it’s bad: Makes code more difficult to maintain

Long Methods and Classes

Developers spend much of their time reading code, either written by themselves or their colleagues. Long code components take time to read and fully understand.

Why it’s bad: Hurts code readability and reusability.

Duplicated Methods in the Same or a Different Class

Having multiple methods that do the same thing is sloppy programming, and cause many long term problems because different components will reference different methods.

Why it’s bad: Makes code more difficult to maintain, unpredictable code sprawl

Long Parameter List

A very long list of parameters indicates the method or class has taken on too many responsibilities. It is advisable to break it down into several components, each with a clear cut role.

Why it’s bad: Hurts code readability and reusability

Divergent Change

Divergent change happens when a class takes on more and more functionalities that are unrelated to its original core function.

Why it’s bad: Hurts code readability and reusability

Shotgun Surgery

The opposite of divergent change—functionality that is spread out across multiple components, so making one change requires changing multiple locations in the code.

Why it’s bad: Makes code more difficult to maintain

Feature Envy

A class with feature envy makes extensive use of another class. Repetitive calls to the same class indicates that the other class’s functionality might be rolled into the first class.

Why it’s bad: Makes code more difficult to maintain

Data Clumps

A data clump is a group of parameters that are frequently used together—for example a name, username and password. Packaging them together in a class cleans up many redundant references.

Why it’s bad: Makes code more difficult to maintain, hurts reusability

Switch Statement

When a switch statement such as if-then-else gets big, with a lot of logic executed within branches of the statement, code becomes very difficult to extract into separate classes or methods.

Why it’s bad: Very difficult to maintain and impossible to reuse without major refactoring

Lazy Class

A lazy class is one that doesn’t really do much, or performs only a few trivial operations. Lazy classes should be removed.

Why it’s bad: Makes code difficult to maintain

Message Chains

A message chain is a class that uses another class, which uses another class, and so on. It is more efficient to call the final class directly.

Why it’s bad: Makes code difficult to maintain, hurts readability (not immediately clear what is being called)

Middle Man

A middle man is a class that exists solely to reference other classes. If the class does not have any independent functionality, it should be removed.

Why it’s bad: Makes code difficult to maintain

Code Comments

Many experts regard code comments as a code smell, because quality code should be self-explanatory. In many cases, code components could be restructured and clearly named, so that comments are no longer necessary.

Why it’s bad: Hurts code readability

Speculative Generality

Speculative generality usually involves over-optimized code. For example, code that is excessively optimized for performance when the application doesn’t have a large user base, or algorithms that are optimized to the max when real time performance isn’t really needed.

Why it’s bad: Needlessly increases complexity, hurting maintainability

How to Solve Code Smell—Code Refactoring to the Rescue

Kent Beck defines refactoring as:

“A change to the system that leaves its behavior unchanged, but enhances some nonfunctional quality—simplicity, flexibility, understandability, performance”

Martin Fowler defines it as:

“A change made to the internal structure of software to make it easier to understand and cheaper to modify without changing its observable behavior”

Refactoring does not have tangible value because it does not alter the functionality or features or the code. However, continuous refactoring of code prevents what is known as “design decay”, cleans up and improves the quality of the code, and over time, makes code much more readable, maintainable and less prone to bugs.

Refactoring removes code smells, but is much more than that—it’s about ongoing maintenance of source code as a living system—just like any complex system requires ongoing maintenance to remain healthy.

The Refactoring Flow

The following process can be used to refactor code suffering from quality issues:

Ensure all tests pass—if there are elements of the code that break the build, fix them first.
Find code smells—code that works, but suffers from problems like the ones we listed earlier in this post (of course there can be many other code smells, depending on context, environment and language).
Determine simplification—understand what to change in the code to simplify it and prevent redundancy or over-complexity.
Implement simplification—actually modify the code to remove the code smell.
Ensure all tests still pass—in many cases, refactoring code is complex and many dependencies must be taken into account. Refactoring will commonly break existing code, so it’s essential to test that after the change, everything still works as expected.

Best Practices for Removing Code Smells

Removing a code smell is straightforward to explain—for example, “break up that method into a few smaller ones with distinct responsibilities”—but extremely difficult to carry out. Modifying old code, especially smelly code, can be like untangling a clump of strings.

Here are a few best practices that will help you get started:

- Leverage automation—modern IDEs, such as Eclipse and IntelliJ IDEA, can perform many types of refactoring automatically. For example, they can help you rename methods or classes while automatically changing all the references in the code to those elements.
- Test first—if a test does not exist for the source code you are refactoring, first create a test, and ensure the test passes even after your code changes.

Small discrete changes—make small changes, one at a time, and continuously check that tests do not break.

Don’t be afraid to rewrite—in many cases, refactoring will require rewriting parts of the code. Take the time to do this when necessary.

Approach the job rested and with time available—it’s not possible to refactor under pressure. Team leadership should be aware of the importance of refactoring and assign quiet blocks of time to allow the team to improve code quality.

Where to Start? Quality Intelligence Marks the Spot

Development teams that become aware of code smells, code quality and the importance of refactoring, or inherit legacy code that does not adhere to their own standards, are often at a loss. It’s difficult to prioritize code smells and understand their real impact on the system’s performance, maintainability, and the risk of bugs in production.

Luckily, a new category of tools is emerging called Quality Intelligence Platforms. Quality intelligence can identify code at risk of quality issues, by analyzing how frequently it has changed, and how comprehensively the code is tested.

Risky code is the first place to target when embarking on a refactoring project. That’s where refactoring is likely to have the biggest impact on quality perceived by the end user. Quality intelligence can help you build code quality insights into your agile processes:

In sprint planning, you can focus maintenance work and bug fixes on code that is at high risk of quality issues
In sprint retrospectives, you can look at code produced during the sprint, understand where quality issues might lie, and evaluate your technical debt

SeaLights is a leading Quality Intelligence Platform that can identify test gaps across areas of your code, providing clear visualization of risky code. It saves you time by focusing refactoring efforts on the areas of the product that need it most. Get a live demo and see how to dramatically improve your refactoring efficiency.