Programmers use automated refactorings even when they may break the code

19 Jan 2012 - planet-eclipse

Refactoring is defined as the process of changing the internal design of the code without affecting its external behavior. Because refactorings do not alter the visible behavior of the program, they are known as behavior-preserving transformations. Bill Opdyke wrote the first PhD thesis on refactoring and cataloged a few refactorings. Later, Martin Fowler extended the catalog of refactorings in his book.

I find the behavior-preservation property in the definition of refactoring somewhat vague, because different observers may care about different aspects of a program's behavior. For example, one observer might care about what the program writes to the standard console, while another might care about how a code transformation changes the program's security, memory footprint, or performance. Nevertheless, there has traditionally been an emphasis on the behavior-preservation property of refactorings.
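To make the point about observers concrete, here is a minimal sketch of an Inline Temp style change. All names in it (CountingLookup, before, after) are hypothetical and exist only for illustration; the call counter is a stand-in for cost such as running time.

```python
class CountingLookup:
    """Hypothetical helper that counts how often it is called."""

    def __init__(self):
        self.calls = 0

    def __call__(self, key):
        self.calls += 1
        return len(key)


def before(lookup, key):
    # Original code: the result is stored in a temporary and reused.
    size = lookup(key)
    return size + size


def after(lookup, key):
    # After inlining the temporary: same result, but lookup runs twice.
    return lookup(key) + lookup(key)
```

An observer who only looks at the returned value sees no difference between the two versions. An observer who counts calls to lookup sees that the second version does twice the work.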

Modern IDEs, such as Eclipse, IntelliJ, NetBeans, and ReSharper, provide automated support for many refactorings. Refactoring tools are designed to satisfy the behavior-preservation property of refactorings. That is, before applying a transformation, a refactoring tool runs a few checks, known as preconditions, to make sure that the transformation won't introduce compilation problems or change the behavior of the program. For example, the Rename refactoring checks that the new name won't clash with an existing one. However, refactoring tools cannot always guarantee behavior preservation, for example when the program uses reflection or native code. If the refactoring tool detects a violation of one of its preconditions, it reports the problem to the programmer. For instance, if the Eclipse refactoring tool detects a name conflict while checking the preconditions of a refactoring, it reports the problem to the user with a severity level of ERROR (see the figure below). When the Eclipse refactoring tool reports a problem with a severity level of ERROR, the refactoring is almost certain to break the code (see the Eclipse documentation).

A screenshot of the Eclipse error message when the Rename Local Variable refactoring is about to introduce a name conflict.
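The idea of a precondition check can be sketched in a few lines. The following is not how Eclipse implements Rename; it is a simplified illustration built on Python's standard ast module. The Severity enum and the check_rename function are hypothetical names, and the check only looks at positional parameters and the names used inside one function.

```python
import ast
from enum import Enum


class Severity(Enum):
    """Hypothetical two-level status, loosely modeled on an ERROR report."""
    OK = 0
    ERROR = 1


def local_names(func):
    """Collect positional parameter names and every name used in the function."""
    names = {arg.arg for arg in func.args.args}
    for node in ast.walk(func):
        if isinstance(node, ast.Name):
            names.add(node.id)
    return names


def check_rename(source, func_name, old, new):
    """Check the preconditions of renaming `old` to `new` inside `func_name`."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            names = local_names(node)
            if old not in names:
                return Severity.ERROR, f"'{old}' is not used in {func_name}"
            if new in names:
                return Severity.ERROR, f"'{new}' already exists in {func_name}"
            return Severity.OK, "no name conflict detected"
    return Severity.ERROR, f"function '{func_name}' not found"
```

For a function `f(a)` that assigns to a local named `total`, renaming `total` to `a` would be reported as an ERROR, while renaming it to an unused name would pass the check. What the programmer does with an ERROR is a separate decision, which is the subject of the rest of this post.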

The Eclipse documentation does not recommend continuing an automated refactoring that has reported an ERROR. In other words, the designers of the Eclipse refactoring tool expected programmers to cancel the refactoring, change the code to satisfy the preconditions of the tool, and reinvoke the tool. However, when we analyzed the data that CodingSpectator had captured from our participants, we found that they had continued 66% of the automated refactorings that had reported an ERROR. This result casts doubt on the priority of behavior preservation in the design of refactoring tools.

So, we asked our participants to explain why they usually continued automated refactorings when the tool didn't give any behavior-preservation guarantees. Our interviewees told us that they found it easier to continue the refactoring and fix the potential problems manually than to cancel the refactoring and then reconfigure and reinvoke the tool. They told us that they usually relied on visual inspection, the compiler, and, to a lesser extent, automated tests to verify the correctness of such refactorings. Most of the automated refactorings that our participants performed affected a narrow piece of the code, and our interviewees said that their quick checks were usually enough to validate and correct such small changes. Some of our interviewees even told us that they sometimes relied on the compiler to perform a refactoring step by step. For example, to rename a variable, they changed one occurrence of the variable and then went through the resulting compilation problems one by one to update all other references. Performing a refactoring manually may be slower and more error-prone. However, our interviewees told us that they sometimes did so with the assistance of the incremental compiler in order to fully control and review the change.
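The compiler-driven rename that some interviewees described can be sketched as a small loop. This is only an approximation: Python does not resolve names at compile time, so the hypothetical unresolved_reads function below stands in for the incremental compiler's list of problems. The sketch also assumes that the source consists of a single function and that the first textual occurrence of the variable is its only assignment.

```python
import ast
import re


def unresolved_reads(source, name):
    """Line numbers where `name` is read although the function never binds it.

    Hypothetical stand-in for a compiler's "cannot resolve symbol" problems.
    """
    func = ast.parse(source).body[0]
    uses = [n for n in ast.walk(func) if isinstance(n, ast.Name) and n.id == name]
    if any(isinstance(n.ctx, ast.Store) for n in uses):
        return []  # still assigned somewhere, so nothing is reported
    if name in {arg.arg for arg in func.args.args}:
        return []  # still a parameter, so nothing is reported
    return sorted({n.lineno for n in uses})


def rename_by_following_errors(source, old, new):
    """Rename the declaration, then fix the reported problems one at a time."""
    pattern = re.compile(r"\b%s\b" % re.escape(old))
    lines = source.splitlines()

    # Step 1: change only the first occurrence (assumed to be the declaration).
    first = next(i for i, line in enumerate(lines) if pattern.search(line))
    lines[first] = pattern.sub(new, lines[first], count=1)

    # Step 2: ask the checker for problems and fix the first reported line,
    # repeating until no problems remain.
    steps = 0
    while True:
        problems = unresolved_reads("\n".join(lines), old)
        if not problems:
            break
        line_no = problems[0]
        lines[line_no - 1] = pattern.sub(new, lines[line_no - 1])
        steps += 1
    return "\n".join(lines), steps
```

Each pass through the loop corresponds to the programmer looking at one reported problem and updating that reference by hand. That is slower than an automated Rename, but every change is reviewed as it is made.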

In summary, there has always been an emphasis on the behavior-preservation property of refactorings and refactoring tools. Intuitively, one would expect programmers to adopt behavior-preserving transformations more readily because they are less likely to break the program. Nevertheless, we found that programmers usually continue automated refactorings even when they are not strictly behavior-preserving. Our interviewees told us that a sequence of non-behavior-preserving changes sometimes required less configuration and gave them more predictability and control over a refactoring operation. Our results suggest that more flexible tools that are not too strict about behavior preservation might lead to a better refactoring experience. Hopefully, refactoring tools that are easier to predict and give more control to the programmer will encourage programmers to use automated tools more frequently, especially for complex changes. Moreover, since programmers seem to tolerate some loss of behavior preservation, they might even be willing to use tools that automate non-behavior-preserving transformations. In other words, behavior preservation is not the only factor that determines programmers' use of program transformation tools. Our study identified many other factors, such as awareness, naming, invocation method, predictability, and configuration. By carefully considering all these factors in the design of our tools, we might even be able to go beyond behavior-preserving transformations and build usable tools that automate more complex, non-behavior-preserving transformations.