Blog about readable, versatile, unambiguous, and maintainable code in Java.

5 Refactoring Principles by Example (7 min read)

post-its with the text "change" (source: pixabay)

This post features five (mostly well-known) refactoring principles applied when refactoring real open-source code (Gradle Modules Plugin).

Context

When I worked on separate compilation of module-info.java for Gradle Modules Plugin (PR #73), I noticed potential for some refactoring. As a result, I filed issue #79 and later resolved it with PR #88, where I refactored the code.

As it turned out, the refactoring was much wider than I initially thought. Here, I present parts of this PR as examples of the refactoring principles that I applied there.

Refactoring Principles

Note: the list presented here is by no means comprehensive, and the principles aren’t original (I present them in my own voice and according to my own understanding, though). As I see it, the greatest value of this post is in the real-life examples that accompany the principles.

The five principles presented here are:

  1. Hide “how” with “what”
  2. Aim for consistency
  3. Avoid deep nesting
  4. Separate concerns (= Single Responsibility Principle)
  5. Avoid duplication wisely (= Don’t Repeat Yourself)

1. Hide “How” With “What”

This principle is just a part/rephrasing of the clean code principle, as formulated by Robert Martin.

To me, hiding “how” with “what” means extracting classes and methods whenever:

  • I can identify a distinct, non-trivial function performed by some piece of code, and
  • I can hide this non-triviality behind a method with a meaningful name.

Example 1: updateRelativePath

Here’s a snippet from RunTaskMutator before the refactoring:

mainDistribution.contents(copySpec -> copySpec.filesMatching(patchModuleExtension.getJars(), action -> {
  RelativePath relativePath = action.getRelativePath().getParent().getParent()
      .append(true, "patchlibs", action.getName());
  action.setRelativePath(relativePath);
}));

and here’s the snippet after the refactoring:

mainDistribution.contents(
    copySpec -> copySpec.filesMatching(patchModuleExtension.getJars(), this::updateRelativePath)
);

To sum up, we:

  • hid how to update the relative path
  • with what we do (= the fact that we update it).

Thanks to such refactoring, it’s much easier to grasp what happens to mainDistribution.

For reference, the content of updateRelativePath is available here.

Example 2: buildAddReadsStream & buildAddOpensStream

This is how a part of the TestTask class looked before the refactoring:

TestEngine.select(project).ifPresent(testEngine -> {
  args.addAll(List.of("--add-reads", moduleName + "=" + testEngine.moduleName));

  Set<File> testDirs = testSourceSet.getOutput().getClassesDirs().getFiles();
  getPackages(testDirs).forEach(p -> {
    args.add("--add-opens");
    args.add(String.format("%s/%s=%s", moduleName, p, testEngine.addOpens));
  });
});

and here’s how it looks afterwards:

TestEngine.select(project).ifPresent(testEngine -> Stream.concat(
    buildAddReadsStream(testEngine),
    buildAddOpensStream(testEngine)
).forEach(jvmArgs::add));

Again, we:

  • hid how the values of --add-reads and --add-opens options are specified
  • with what we do (= the fact that we specify them).

For reference, the contents of buildAddReadsStream and buildAddOpensStream are available here.

2. Aim for Consistency

This is very general, but I mean any kind of reasonable consistency that we can get.

For example, Donald Raab‘s blog post about symmetry is a great example of striving for consistency. Needless to say, I agree with his conclusion wholeheartedly:

A large system with good symmetry becomes easier to understand, because you can detect and expect recurring patterns.

Donald Raab, Symmetric Sympathy

In the case of Gradle Modules Plugin, this boiled down primarily to extracting AbstractModulePluginTask base class and unifying the task finding & configuration dispatching procedure.

For example, JavadocTask and TestTask before the refactoring were:

public class JavadocTask {
  public void configureJavaDoc(Project project) {
    Javadoc javadoc = (Javadoc) project.getTasks().findByName(JavaPlugin.JAVADOC_TASK_NAME);
    if (javadoc != null) {
      // ...
    }
  }
}

public class TestTask {
  public void configureTestJava(Project project, String moduleName) {
    Test testJava = (Test) project.getTasks().findByName(JavaPlugin.TEST_TASK_NAME);
    // ... (no null check)
  }
}

and afterwards, they are:

public class JavadocTask extends AbstractModulePluginTask {
  public void configureJavaDoc() {
    helper().findTask(JavaPlugin.JAVADOC_TASK_NAME, Javadoc.class)
        .ifPresent(this::configureJavaDoc);
  }

  private void configureJavaDoc(Javadoc javadoc) { /* ... */ }
}

public class TestTask extends AbstractModulePluginTask {
  public void configureTestJava() {
    helper().findTask(JavaPlugin.TEST_TASK_NAME, Test.class)
        .ifPresent(this::configureTestJava);
  }

  private void configureTestJava(Test testJava) { /* ... */ }
}

For reference: JavaDocTask diff and TestTask diff.

3. Avoid Deep Nesting

This is rather obvious, I guess. For me, deep nesting of control structures is extremely hard to read and grasp.

As a consequence, I refactored the following getPackages method:

private static Set<String> getPackages(Collection<File> dirs) {
  Set<String> packages = new TreeSet<>();
  for (File dir : dirs) {
    if (dir.isDirectory()) {
      Path dirPath = dir.toPath();
      try (Stream<Path> entries = Files.walk(dirPath)) {
        entries.forEach(entry -> {
          if (entry.toFile().isFile()) {
            String path = entry.toString();
            if (isValidClassFileReference(path)) {
              Path relPath = dirPath.relativize(entry.getParent());
              packages.add(relPath.toString().replace(File.separatorChar, '.'));
            }
          }
        });
      } catch (IOException e) {
        throw new GradleException("Failed to scan " + dir, e);
      }
    }
  }
  return packages;
}

like below:

private static Set<String> getPackages(Collection<File> dirs) {
  return dirs.stream()
      .map(File::toPath)
      .filter(Files::isDirectory)
      .flatMap(TestTask::buildRelativePathStream)
      .map(relPath -> relPath.toString().replace(File.separatorChar, '.'))
      .collect(Collectors.toCollection(TreeSet::new));
}

private static Stream<Path> buildRelativePathStream(Path dir) {
  try {
    return Files.walk(dir)
        .filter(Files::isRegularFile)
        .filter(path -> isValidClassFileReference(path.toString()))
        .map(path -> dir.relativize(path.getParent()));
  } catch (IOException e) {
    throw new GradleException("Failed to scan " + dir, e);
  }
}

Full diff available here.

4. Separate Concerns

SRP (Single Responsibility Principle) is a well-known software design principle. Here, we can see its application in extracting StartScriptsMutator from RunTaskMutator.

Before:

public class RunTaskMutator {
  // common fields

  public void configureRun() { /* ... */ }

  public void updateStartScriptsTask(String taskStartScriptsName) { /* ... */ }

  // 12 other methods (incl. 2 common methods)
}

After:

public class RunTaskMutator extends AbstractExecutionMutator {

  public void configureRun() { /* ... */ }
  
  // 2 other methods
}

public class StartScriptsMutator extends AbstractExecutionMutator {

  public void updateStartScriptsTask(String taskStartScriptsName) { /* ... */ }

  // 8 other methods
}

Thanks to extracting StartScriptsMutator, it’s much easier to comprehend the scopes of:

For reference: the commit with the above extraction.

5. Avoid Duplication Wisely

DRY (Don’t Repeat Yourself) is another well-known software development principle. However, in my experience, this principle is sometimes taken too far, which results in code that isn’t duplicated but is also far too complex.

In other words, we should deduplicate only when the cost-gain ratio is positive:

  • cost: refactoring time, resulting complexity, etc.
  • gain: no duplication (or more strictly: single source of truth).

One such example from Gradle Modules Plugin (where the cost-gain ratio is close to zero but still positive, in my opinion) is the introduction of PatchModuleResolver.

Below, there’s a code snippet before refactoring that consists of:

  1. A PatchModuleExtension.configure method.
  2. A place where it’s used (TestTask).
  3. A place where it can’t be used (RunTaskMutator).
  4. Another place where it can’t be used (JavadocTask).
// 1. PatchModuleExtension
public List<String> configure(FileCollection classpath) {
  List<String> args = new ArrayList<>();

  config.forEach(patch -> {
        String[] split = patch.split("=");

        String asPath = classpath.filter(jar -> jar.getName().endsWith(split[1])).getAsPath();

        if (asPath.length() > 0) {
          args.add("--patch-module");
          args.add(split[0] + "=" + asPath);
        }
      }
  );

  return args;
}

// 2. TestTask
args.addAll(patchModuleExtension.configure(testJava.getClasspath()));

// 3. RunTaskMutator
patchModuleExtension.getConfig().forEach(patch -> {
      String[] split = patch.split("=");
      jvmArgs.add("--patch-module");
      jvmArgs.add(split[0] + "=" + PATCH_LIBS_PLACEHOLDER + "/" + split[1]);
    }
);

// 4. JavadocTask
patchModuleExtension.getConfig().forEach(patch -> {
      String[] split = patch.split("=");

      String asPath = javadoc.getClasspath().filter(jar -> jar.getName().endsWith(split[1])).getAsPath();

      if (asPath != null && asPath.length() > 0) {
        options.addStringOption("-patch-module", split[0] + "=" + asPath);
      }
    }
);

After introducing PatchModuleResolver, the code looks as follows:

// 1. PatchModuleExtension
public PatchModuleResolver resolve(FileCollection classpath) {
  return resolve(jarName -> classpath.filter(jar -> jar.getName().endsWith(jarName)).getAsPath());
}

public PatchModuleResolver resolve(UnaryOperator<String> jarNameResolver) {
  return new PatchModuleResolver(this, jarNameResolver);
}

// 2. TestTask
patchModuleExtension.resolve(testJava.getClasspath()).toArgumentStream().forEach(jvmArgs::add);

// 3. RunTaskMutator
patchModuleExtension.resolve(jarName -> PATCH_LIBS_PLACEHOLDER + "/" + jarName).toArgumentStream().forEach(jvmArgs::add);

// 4. JavadocTask
patchModuleExtension.resolve(javadoc.getClasspath()).toValueStream()
    .forEach(value -> options.addStringOption("-patch-module", value));

Thanks to refactoring, now there’s only one place (PatchModuleResolver) where we split the config entries of the PatchModuleExtension class.

For reference: diffs 1, 2, 3, 4.

Summary

In this post, I’ve presented the following five refactoring principles:

  1. Hide “how” with “what”
  2. Aim for consistency
  3. Avoid deep nesting
  4. Separate concerns
  5. Avoid duplication wisely

Each principle was accompanied by a real-life example, which — hopefully — showed how adhering to the principle resulted in neat code.

Leave a comment

Your email address will not be published. Required fields are marked *

One thought on “5 Refactoring Principles by Example”