Blog about readable, versatile, unambiguous, and maintainable code in Java.

Transformer Pattern

transformer pattern illustration: a robot from the Transformers series (source: Pixabay)

The Transformer pattern is a design pattern for Java (and potentially other OO languages with use-site variance only and invariant parameter types) that helps objects within a subtype hierarchy fluently transform themselves into objects of any type.

Context

I was following the OpenJDK threads (Sep 18-21, Nov 12-13, Nov 13-30, Dec 3-4) related to issue JDK-8203703 by Jim Laskey, and an idea came to my mind. Let me recap the relevant parts of the discussion.

Proposal of String.transform

The proposal as per JDK-8203703 boils down to the following addition:

public final class String implements /*...*/ CharSequence {
  // ...
  public <R> R transform(Function<? super String, ? extends R> f) {
    return f.apply(this);
  }
  // ...
}

As you can see, this method simply calls the given Function on itself, and that’s it. Yet, it’s very useful for chaining utility methods, like the ones in StringUtils from Apache Commons:

String result = string
        .toLowerCase()
        .transform(StringUtils::stripAccents)
        .transform(StringUtils::capitalize);

Normally, we’d have to write:

String result = StringUtils.capitalize(StringUtils.stripAccents(string.toLowerCase()));
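For readers without Apache Commons on the classpath, here is a minimal sketch of the same comparison using only the JDK. It assumes Java 12 or later (where String.transform is available). The helpers stripAccents and capitalize are hypothetical stand-ins written for this example; they are not the StringUtils implementations and may differ from them in edge cases.

```java
import java.text.Normalizer;

class StringTransformDemo {

  // Hypothetical stand-in for StringUtils.stripAccents (an assumption, not the
  // Commons implementation): decompose characters, then drop combining marks.
  static String stripAccents(String s) {
    return Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}", "");
  }

  // Hypothetical stand-in for StringUtils.capitalize: upper-case the first char.
  static String capitalize(String s) {
    return s.isEmpty() ? s : Character.toUpperCase(s.charAt(0)) + s.substring(1);
  }

  // Chained form: reads top to bottom, in the order the steps are applied.
  static String chained(String string) {
    return string
        .toLowerCase()
        .transform(StringTransformDemo::stripAccents)
        .transform(StringTransformDemo::capitalize);
  }

  // Nested form: the same steps, but read inside out.
  static String nested(String string) {
    return capitalize(stripAccents(string.toLowerCase()));
  }

  public static void main(String[] args) {
    String input = "\u00C9COLE"; // E with acute accent, followed by "COLE"
    System.out.println(chained(input));
    System.out.println(chained(input).equals(nested(input)));
  }
}
```

Both forms compute the same result; the difference is purely in how the call sequence reads.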

Considering CharSequence.transform

At some point, Alan Bateman raised the issue of potentially defining transform in CharSequence as:

<R> R transform(Function<? super CharSequence, ? extends R> f)

This would have the benefit of being able to apply CharSequence-based utility methods (e.g. StringUtils.isNumeric) on any CharSequence, e.g.:

boolean isNumeric = charSequence
        .transform(s -> StringUtils.defaultIfBlank(s, "0"))
        .transform(StringUtils::isNumeric);

However, as Rémi Forax pointed out, the problem with this signature is that:

  • if it were inherited by String: most utility methods take String as a parameter – such methods wouldn’t work (e.g. StringUtils::capitalize),
  • if it were overridden by String: no useful override could be made because:
    • Function<? super String, R> is a supertype of Function<? super CharSequence, R> (which is actually good),
    • Java doesn’t support contravariant parameter types (which is the true obstacle here).

As a result, the subject of CharSequence.transform has been dropped.
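To make the obstacle concrete, here is a small sketch that reproduces it outside the JDK. Seq and Str are hypothetical stand-ins for CharSequence and String (we cannot modify the real types), so treat this as an illustration of the typing issue rather than of the actual proposal.

```java
import java.util.function.Function;

class InheritedTransformDemo {

  static int lengthViaTransform(Str str) {
    // Works: Seq::length accepts any Seq, so it fits Function<? super Seq, ...>.
    return str.transform(Seq::length);
  }

  public static void main(String[] args) {
    Str str = new Str("abc");
    System.out.println(lengthViaTransform(str));

    // Would not compile: Str::upper needs a Str receiver, but the inherited
    // signature only promises a Seq.
    // String upper = str.transform(Str::upper);
  }
}

// Hypothetical stand-in for CharSequence with the discussed transform method.
interface Seq {
  int length();

  default <R> R transform(Function<? super Seq, ? extends R> f) {
    return f.apply(this);
  }
}

// Hypothetical stand-in for String. Declaring
//   <R> R transform(Function<? super Str, ? extends R> f)
// here would not override Seq.transform: both methods have the same erasure,
// so the compiler reports a name clash instead.
final class Str implements Seq {
  private final String value;

  Str(String value) { this.value = value; }

  @Override
  public int length() { return value.length(); }

  String upper() { return value.toUpperCase(); }
}
```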

Problem

To sum up, the problem is to be able to transform:

  • a CharSequence, using a Function that takes CharSequence or Object (? super CharSequence),
  • a String, using a Function that takes String or any of its supertypes (? super String).

When I looked at those lower bounds here, I realized that I’ve already seen this kind of problem (cf. Filterer Pattern).

So this problem boils down to: how to covariantly specify the contravariant bound for the Function.

Solution

Java doesn’t support contravariant parameter types, and its syntax doesn’t provide a way to covariantly (? extends) specify a contravariant (? super) bound in a single declaration. However, it is possible to do this in two separate declarations, by means of an intermediate helper type.

Assuming we want to solve this for a generic Function<? super T, ? extends R>, we need to:

  • move the above Function parameter to a helper interface parametrized with T,
  • use this helper interface with an upper bound (? extends T) as a return type.

Transformer Interface

I defined such a helper interface (which I dubbed Transformer) as follows:

@FunctionalInterface
interface Transformer<T> {
  <R> R by(Function<? super T, ? extends R> f);
}

Transformable Interface

Having defined Transformer, we can define the following base interface dubbed Transformable:

interface Transformable {
  Transformer<?> transformed();
}

This interface doesn’t do much on its own, but I treat it as a specification for:

  • subtype implementors: it reminds them to override the transformed method with a proper upper bound, and to implement it,
  • subtype users: it reminds them they can call transformed().by(f).

To sum up, this pair (Transformer & Transformable) lets us replace:

  • obj.transform(function)
  • with: obj.transformed().by(function)

Sample Implementation

Before we go back to String, let’s see how easy it is to implement both those interfaces:

class Sample implements Transformable {

  @Override
  public Transformer<Sample> transformed() {
    return this::transform; // method reference
  }

  private <R> R transform(Function<? super Sample, ? extends R> f) {
    return f.apply(this);
  }
}

As you can see, all it takes is a method reference to transform.

The transform method was made private so that there’s no conflict in subtypes when they define their own (appropriately lower-bounded) transform.
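Here is a sketch of how this might play out across a two-level hierarchy. Shape and Circle are hypothetical classes invented for the example; each one keeps its own private transform and narrows the return type of transformed. Note that the classes are deliberately top-level: the private methods must be invisible to each other for the two transform declarations not to clash.

```java
import java.util.function.Function;

class TransformerHierarchyDemo {

  // A utility that accepts any Shape.
  static String label(Shape shape) { return "shape:" + shape.name(); }

  // A utility that accepts Circles only.
  static double area(Circle circle) { return Math.PI * circle.radius * circle.radius; }

  // Through the Shape view, only Shape-accepting functions are allowed.
  static String labelOf(Shape shape) {
    return shape.transformed().by(TransformerHierarchyDemo::label);
  }

  // Through the Circle view, Circle-accepting functions are allowed as well.
  static double areaOf(Circle circle) {
    return circle.transformed().by(TransformerHierarchyDemo::area);
  }

  public static void main(String[] args) {
    Circle circle = new Circle(1.0);
    System.out.println(labelOf(circle));
    System.out.println(areaOf(circle));
  }
}

@FunctionalInterface
interface Transformer<T> {
  <R> R by(Function<? super T, ? extends R> f);
}

interface Transformable {
  Transformer<?> transformed();
}

class Shape implements Transformable {
  String name() { return "shape"; }

  @Override
  public Transformer<? extends Shape> transformed() {
    return this::transform;
  }

  private <R> R transform(Function<? super Shape, ? extends R> f) {
    return f.apply(this);
  }
}

class Circle extends Shape {
  final double radius;

  Circle(double radius) { this.radius = radius; }

  @Override
  String name() { return "circle"; }

  @Override
  public Transformer<Circle> transformed() { // covariant return type
    return this::transform;                  // refers to Circle's own private method
  }

  private <R> R transform(Function<? super Circle, ? extends R> f) {
    return f.apply(this);
  }
}
```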

Solution in Context

Implementation in Context

How could it apply to CharSequence and String? First, we’d make CharSequence extend Transformable:

public interface CharSequence extends Transformable {
  // ...
  @Override
  Transformer<? extends CharSequence> transformed();
  // ...
}

Then, we’d implement transformed in String, returning a method reference to the public transform method (added in JDK 12):

public final class String implements /*...*/ CharSequence {
  // ...
  @Override
  public Transformer<String> transformed() {
    return this::transform;
  }
  // ...
}

Note that we made a covariant change to the return type of transformed: Transformer<? extends CharSequence> → Transformer<String>.

Compatibility Risk

I judge the compatibility risk of adding CharSequence.transformed to be minimal. It could break backwards compatibility only for those CharSequence subclasses that already have a no-argument transformed method (which seems unlikely).

Usage in Context

The usage for String would not change because there’s no point in calling transformed().by() over transform().

The usage for a generic CharSequence, though, would need to resort to transformed().by(): CharSequence may have many implementations, so their transform methods must be private:

boolean isNumeric = charSequence
        .transformed().by(s -> StringUtils.defaultIfBlank(s, "0"))
        .transformed().by(StringUtils::isNumeric);
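Since we can't add methods to CharSequence or String ourselves, the closest runnable approximation is a wrapper. The sketch below assumes a hypothetical Text interface (standing in for the extended CharSequence) and a hypothetical Str wrapper (standing in for String); isNumeric and zeroIfBlank are simplified stand-ins for the StringUtils calls above, not the real ones.

```java
import java.util.function.Function;

class TextTransformDemo {

  // Simplified stand-in for StringUtils.isNumeric: accepts any CharSequence.
  static boolean isNumeric(CharSequence cs) {
    return cs.length() > 0 && cs.chars().allMatch(Character::isDigit);
  }

  // Simplified stand-in for defaultIfBlank(s, "0").
  static Text zeroIfBlank(Text text) {
    return text.toString().trim().isEmpty() ? new Str("0") : text;
  }

  // A Str-only utility: usable through Str, but not through an arbitrary Text.
  static Str upperCase(Str str) {
    return new Str(str.toString().toUpperCase());
  }

  static boolean isNumericOrBlank(Text text) {
    return text
        .transformed().by(TextTransformDemo::zeroIfBlank)
        .transformed().by(TextTransformDemo::isNumeric);
  }

  public static void main(String[] args) {
    System.out.println(isNumericOrBlank(new Str("  ")));
    System.out.println(isNumericOrBlank(new Str("12a")));
    // On the concrete type, the public transform is the shorter spelling.
    System.out.println(new Str("abc").transform(TextTransformDemo::upperCase));
  }
}

@FunctionalInterface
interface Transformer<T> {
  <R> R by(Function<? super T, ? extends R> f);
}

interface Transformable {
  Transformer<?> transformed();
}

// Hypothetical stand-in for a CharSequence that extends Transformable.
interface Text extends CharSequence, Transformable {
  @Override
  Transformer<? extends Text> transformed();
}

// Hypothetical stand-in for String: final, with a public transform.
final class Str implements Text {
  private final String value;

  Str(String value) { this.value = value; }

  @Override public int length() { return value.length(); }
  @Override public char charAt(int index) { return value.charAt(index); }
  @Override public CharSequence subSequence(int start, int end) { return value.subSequence(start, end); }
  @Override public String toString() { return value; }

  @Override
  public Transformer<Str> transformed() {
    return this::transform;
  }

  public <R> R transform(Function<? super Str, ? extends R> f) {
    return f.apply(this);
  }
}
```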

Performance

If you’re unfamiliar with how the JVM (which most often means HotSpot) and its JIT compiler work, you might wonder whether this apparent creation of an extra object (Transformer in transformed) will affect performance.

Fortunately, thanks to escape analysis* and scalar replacement, in JIT-compiled code this object normally never gets allocated on the heap. So the answer is: no, it won’t.

* This Wikipedia entry contains a false statement: “So the compiler can safely allocate both objects on the stack.” As Aleksey Shipilёv explains, Java doesn’t allocate entire objects on the stack.

Benchmark

If you need proof, here’s a little benchmark (using Aleksey Shipilёv’s excellent JMH benchmark harness). Since I couldn’t (easily) add the necessary methods to String, I created a simple wrapper over String, and implemented the benchmark on top of it.

The benchmark tests the toLowerCase() operation:

  • on two strings:
    1. "no change" (a no-op)
    2. "Some Change"
  • using three call types:
    1. direct (baseline)
    2. transform()
    3. transformed().by()

You can find the full source code for this benchmark in this GitHub gist.

Here are the results (run on Oracle JDK 8, took 50 minutes):

Benchmark                         (string)     Mode  Cnt   Score    Error  Units
TransformerBenchmark.baseline     no change    avgt   25  22,215 ±  0,054  ns/op
TransformerBenchmark.transform    no change    avgt   25  22,540 ±  0,039  ns/op
TransformerBenchmark.transformed  no change    avgt   25  22,565 ±  0,059  ns/op

TransformerBenchmark.baseline     Some Change  avgt   25  63,122 ±  0,541  ns/op
TransformerBenchmark.transform    Some Change  avgt   25  63,405 ±  0,196  ns/op
TransformerBenchmark.transformed  Some Change  avgt   25  62,930 ±  0,209  ns/op

As you can see, for both strings, there’s no meaningful performance difference between the three call types: all scores lie within about 1.5% of the baseline.

Summary

I realize that Transformable is probably too “extravagant” to actually make it into the JDK. Actually, even Transformer alone being returned by CharSequence and String probably isn’t worth it, because unary operations over CharSequences don’t seem so common (e.g. StringUtils contains just a few).

However, I found the general idea of Transformer and Transformable quite enticing. So I hope you enjoyed the read, and that you’ll find it useful in certain contexts 🙂

Appendix

Optional reading – feel free to skip it.

Similarity to Filterer

For comparison, let’s recall the Transformer and the generic version of the Filterer:

@FunctionalInterface
interface Transformer<T> {
  <R> R by(Function<? super T, ? extends R> function);
}

@FunctionalInterface
interface Filterer<R, T> {
  R by(Predicate<? super T> predicate);
}

We can see that both interfaces are somewhat similar. Their differences can be summarized in three points:

  1. Placement of type parameter <R>:
    • Transformer: on method,
    • Filterer: on interface;
  2. Type of functional interface:
    • Transformer: Function<T, R>,
    • Filterer: Predicate<T> (structurally: Function<T, boolean>);
  3. Return type of Function related to that of by() (this is the key difference):
    • Transformer: R → R,
    • Filterer: boolean → R.

Transformer as Functional Interface

You might have noticed that I marked an interface with a single generic method (Transformer) as @FunctionalInterface. As explained on StackOverflow, the JLS doesn’t allow a lambda expression to target a functional interface whose single abstract method is generic (i.e. declares its own type parameters).

Fortunately, as Brian Goetz pointed out, this restriction does not apply to method references. So Transformer is a valid functional interface, but it can be used with method references only.
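A minimal sketch of that distinction follows. The helper applyToHello is a hypothetical generic method written only for this illustration; the commented-out line shows the lambda form that the compiler would reject.

```java
import java.util.function.Function;

class GenericFunctionalInterfaceDemo {

  // Hypothetical generic method whose signature matches Transformer<String>.by.
  static <R> R applyToHello(Function<? super String, ? extends R> f) {
    return f.apply("hello");
  }

  static Transformer<String> helloTransformer() {
    // Compiles: a method reference may target a generic function type.
    return GenericFunctionalInterfaceDemo::applyToHello;

    // Would not compile: a lambda cannot target a generic function type.
    // return f -> f.apply("hello");
  }

  static int helloLength() {
    return helloTransformer().by(String::length);
  }

  public static void main(String[] args) {
    System.out.println(helloLength());
  }
}

@FunctionalInterface
interface Transformer<T> {
  <R> R by(Function<? super T, ? extends R> f);
}
```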

It’s good because if we were to use anonymous classes to implement Transformer:
