The Transformer pattern is a design pattern for Java (and potentially other OO languages with use-site variance only and invariant parameter types) that helps objects within a subtype hierarchy fluently transform themselves into objects of any type.
Context
I was following the OpenJDK threads (Sep 18-21, Nov 12-13, Nov 13-30, Dec 3-4) related to issue JDK-8203703 by Jim Laskey, and an idea came to my mind. Let me recap the relevant parts of the discussion.
Proposal of String.transform
The proposal as per JDK-8203703 boils down to the following addition:
public final class String implements /*...*/ CharSequence {
    // ...
    public <R> R transform(Function<? super String, ? extends R> f) {
        return f.apply(this);
    }
    // ...
}
As you can see, this method simply applies the given Function to the string itself, and that’s it. Yet it’s very useful for chaining utility methods, like the ones in StringUtils from Apache Commons:
String result = string
    .toLowerCase()
    .transform(StringUtils::stripAccents)
    .transform(StringUtils::capitalize);
Normally, we’d have to write:
String result = StringUtils.capitalize(StringUtils.stripAccents(string.toLowerCase()));
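For readers without Apache Commons on the classpath, here is a minimal sketch of the same contrast, assuming JDK 12 or later (where String.transform is available). TextHelpers, TransformChainDemo and their methods are hypothetical names introduced only for this illustration; they are not part of the JDK or of StringUtils.

```java
import java.util.Locale;

// Hypothetical stand-ins for the StringUtils helpers used above.
final class TextHelpers {
    private TextHelpers() {}

    // Upper-cases the first character and leaves the rest untouched.
    static String capitalize(String s) {
        return s.isEmpty() ? s : Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    // Removes every run of whitespace.
    static String removeWhitespace(String s) {
        return s.replaceAll("\\s+", "");
    }
}

final class TransformChainDemo {
    private TransformChainDemo() {}

    // Chained form: the steps read top to bottom, in the order they are applied.
    static String chained(String input) {
        return input
                .toLowerCase(Locale.ROOT)
                .transform(TextHelpers::removeWhitespace)
                .transform(TextHelpers::capitalize);
    }

    // Nested form: the same steps, but they have to be read inside out.
    static String nested(String input) {
        return TextHelpers.capitalize(
                TextHelpers.removeWhitespace(input.toLowerCase(Locale.ROOT)));
    }
}
```

Both methods compute the same value; only the reading order differs.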
Considering CharSequence.transform
At some point, Alan Bateman raised the issue of potentially defining transform in CharSequence as:
<R> R transform(Function<? super CharSequence, ? extends R> f)
This would have the benefit of being able to apply CharSequence-based utility methods (e.g. StringUtils.isNumeric) on any CharSequence, e.g.:
boolean isNumeric = charSequence
    .transform(s -> StringUtils.defaultIfBlank(s, "0"))
    .transform(StringUtils::isNumeric);
However, as Rémi Forax pointed out, the problem with this signature is that:
- if it were inherited by String: most utility methods take String as a parameter – such methods wouldn’t work (e.g. StringUtils::capitalize),
- if it were overridden by String: no useful override could be made because:
  - Function<? super String, R> is a supertype of Function<? super CharSequence, R> (which is actually good),
  - Java doesn’t support contravariant parameter types (which is the true obstacle here).
As a result, the subject of CharSequence.transform has been dropped.
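The first obstacle can be reproduced with a small sketch. Seq and Str are hypothetical stand-ins for CharSequence and String (we cannot add methods to the real ones), and InheritedTransformProblem is just a holder for the example; none of these names come from the JDK discussion.

```java
import java.util.Locale;
import java.util.function.Function;

// Hypothetical stand-in for CharSequence, with the proposed method.
interface Seq {
    String text();

    default <R> R transform(Function<? super Seq, ? extends R> f) {
        return f.apply(this);
    }
}

// Hypothetical stand-in for String; it simply inherits transform from Seq.
final class Str implements Seq {
    private final String value;

    Str(String value) {
        this.value = value;
    }

    @Override
    public String text() {
        return value;
    }

    Str upper() {
        return new Str(value.toUpperCase(Locale.ROOT));
    }
}

final class InheritedTransformProblem {
    private InheritedTransformProblem() {}

    // Takes any Seq, so it fits Function<? super Seq, ...>.
    static int length(Seq s) {
        return s.text().length();
    }

    // Takes a Str only, so it does NOT fit Function<? super Seq, ...>.
    static Str shout(Str s) {
        return s.upper();
    }

    static int demo() {
        Str str = new Str("abc");
        // str.transform(InheritedTransformProblem::shout);
        // ^ does not compile: the inherited method expects a function that
        //   accepts every Seq, and shout accepts only Str.
        return str.transform(InheritedTransformProblem::length);
    }
}
```

Only the Seq-based helper is usable here, which mirrors why String-based utilities such as StringUtils::capitalize would not work with an inherited CharSequence.transform.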
Problem
To sum up, the problem consists in being able to transform:
- a CharSequence, using a Function that takes CharSequence or Object (? super CharSequence),
- a String, using a Function that takes String or any of its supertypes (? super String).
When I looked at those lower bounds here, I realized that I’ve already seen this kind of problem (cf. Filterer Pattern).
So this problem boils down to: how to covariantly specify the contravariant bound for the Function.
Solution
Java doesn’t support contravariant parameter types, and its syntax doesn’t provide a way to covariantly (? extends) specify a contravariant (? super) bound in a single declaration. However, it is possible to do this in two separate declarations, by means of an intermediate helper type.
Assuming we want to solve this for a generic Function<? super T, ? extends R>, we need to:
- move the above Function parameter to a helper interface parametrized with T,
- use this helper interface with an upper bound (? extends T) as a return type.
Transformer Interface
I defined such a helper interface (which I dubbed Transformer) as follows:
@FunctionalInterface
interface Transformer<T> {
    <R> R by(Function<? super T, ? extends R> f);
}
Transformable Interface
Having defined Transformer, we can define the following base interface dubbed Transformable:
interface Transformable {
    Transformer<?> transformed();
}
This interface doesn’t do much on its own, but I treat it as a specification for:
- subtype implementors: it reminds them to override the transformed method with a proper upper bound, and to implement it,
- subtype users: it reminds them they can call transformed().by(f).
To sum up, this pair (Transformer & Transformable) lets us replace:
- obj.transform(function)
- with: obj.transformed().by(function)
Sample Implementation
Before we go back to String, let’s see how easy it is to implement both those interfaces:
class Sample implements Transformable {
    @Override
    public Transformer<Sample> transformed() {
        return this::transform; // method reference
    }

    private <R> R transform(Function<? super Sample, ? extends R> f) {
        return f.apply(this);
    }
}
As you can see, all it takes is a method reference to transform.
The transform method was made private so that there’s no conflict in subtypes when they define their own (appropriately lower-bounded) transform.
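To illustrate the point about subtypes, here is a sketch of a two-level hierarchy. Shape, Circle and ShapeDemo are hypothetical names used only for this example. Note one assumption that differs from Sample above: the base class declares its return type with an upper bound (Transformer<? extends Shape>), because otherwise the subclass could not narrow it.

```java
import java.util.function.Function;

@FunctionalInterface
interface Transformer<T> {
    <R> R by(Function<? super T, ? extends R> f);
}

interface Transformable {
    Transformer<?> transformed();
}

// Hypothetical base class. The upper-bounded return type leaves room
// for subclasses to narrow it.
class Shape implements Transformable {
    @Override
    public Transformer<? extends Shape> transformed() {
        return this::transform;
    }

    // Private, so it is not inherited and cannot clash with Circle.transform.
    private <R> R transform(Function<? super Shape, ? extends R> f) {
        return f.apply(this);
    }

    String name() {
        return "shape";
    }
}

// Hypothetical subclass with its own, appropriately lower-bounded transform.
class Circle extends Shape {
    @Override
    public Transformer<Circle> transformed() {
        return this::transform;
    }

    private <R> R transform(Function<? super Circle, ? extends R> f) {
        return f.apply(this);
    }

    @Override
    String name() {
        return "circle";
    }

    double radius() {
        return 1.0;
    }
}

final class ShapeDemo {
    private ShapeDemo() {}

    // Works for any Shape with a Shape-based function.
    static String describe(Shape shape) {
        return shape.transformed().by(Shape::name);
    }

    // Works for a Circle with a Circle-based function.
    static double radiusOf(Circle circle) {
        return circle.transformed().by(Circle::radius);
    }
}
```

If both transform methods were public, Circle would not compile: the two signatures have the same erasure, yet neither overrides the other.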
Solution in Context
Implementation in Context
How could it apply to CharSequence and String? First, we’d make CharSequence extend Transformable:
public interface CharSequence extends Transformable {
    // ...
    @Override
    Transformer<? extends CharSequence> transformed();
    // ...
}
Then, we’d implement transformed in String, returning a method reference to the public transform method (added in JDK 12):
public final class String implements /*...*/ CharSequence {
    // ...
    @Override
    public Transformer<String> transformed() {
        return this::transform;
    }
    // ...
}
Note that we made a covariant change to the return type of transformed: Transformer<? extends CharSequence> → Transformer<String>.
Compatibility Risk
I judge the compatibility risk of adding CharSequence.transformed to be minimal. It could break backwards compatibility only for those CharSequence implementations that already have a no-argument transformed method (which seems unlikely).
Usage in Context
The usage for String would not change because there’s no point in calling transformed().by() over transform().
The usage for a generic CharSequence, though, would need to resort to transformed().by(): CharSequence may have many implementations, so their transform methods must be private:
boolean isNumeric = charSequence
    .transformed().by(s -> StringUtils.defaultIfBlank(s, "0"))
    .transformed().by(StringUtils::isNumeric);
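Since we cannot modify the real CharSequence and String, the following sketch replays this usage with hypothetical stand-ins: Text plays the role of CharSequence, Word the role of String, and TextUtil holds illustrative helpers loosely modelled on the StringUtils methods. All of these names are assumptions made for the example.

```java
import java.util.Locale;
import java.util.function.Function;

@FunctionalInterface
interface Transformer<T> {
    <R> R by(Function<? super T, ? extends R> f);
}

// Hypothetical stand-in for CharSequence.
interface Text {
    String content();

    Transformer<? extends Text> transformed();
}

// Hypothetical stand-in for String: final, so transform can stay public.
final class Word implements Text {
    private final String value;

    Word(String value) {
        this.value = value;
    }

    @Override
    public String content() {
        return value;
    }

    @Override
    public Transformer<Word> transformed() {
        return this::transform;
    }

    public <R> R transform(Function<? super Word, ? extends R> f) {
        return f.apply(this);
    }
}

// Hypothetical helpers: two accept any Text, one accepts only a Word.
final class TextUtil {
    private TextUtil() {}

    static boolean isNumeric(Text t) {
        String s = t.content();
        return !s.isEmpty() && s.chars().allMatch(Character::isDigit);
    }

    static Text defaultIfBlank(Text t, Text fallback) {
        return t.content().trim().isEmpty() ? fallback : t;
    }

    static Word shout(Word w) {
        return new Word(w.content().toUpperCase(Locale.ROOT));
    }
}

final class TextDemo {
    private TextDemo() {}

    // Generic usage: only Text-based functions are accepted here.
    static boolean numericOrDefault(Text text) {
        return text
                .transformed().by(t -> TextUtil.defaultIfBlank(t, new Word("0")))
                .transformed().by(TextUtil::isNumeric);
    }

    // Word usage: Word-based functions are accepted as well.
    static Word shout(Word word) {
        return word.transformed().by(TextUtil::shout);
    }
}
```

The same by() call accepts Text-based functions on a Text and Word-based functions on a Word, which is the combination a single transform signature could not offer.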
Performance
If you’re unfamiliar with how the JVM (which most often means HotSpot) and its JIT compiler work, you might wonder whether this apparent creation of an extra object (the Transformer returned by transformed) affects performance.
Fortunately, thanks to escape analysis* and scalar replacement, once the code is JIT-compiled and the calls are inlined, this object typically doesn’t get allocated on the heap at all. So the answer is: no, it shouldn’t.
* This Wikipedia entry contains a false statement: “So the compiler can safely allocate both objects on the stack.” As Aleksey Shipilёv explains, Java doesn’t allocate entire objects on the stack.
Benchmark
If you need proof, here’s a little benchmark (using Aleksey Shipilёv’s excellent JMH benchmark harness). Since I couldn’t (easily) add the necessary methods to String, I created a simple wrapper over String, and implemented the benchmark on top of it.
The benchmark tests the toLowerCase() operation:
- on two strings:
  - "no change" (a no-op)
  - "Some Change"
- using three call types:
  - direct (baseline)
  - transform()
  - transformed().by()
You can find the full source code for this benchmark in this GitHub gist.
Here are the results (run on Oracle JDK 8, took 50 minutes):
Benchmark                         (string)     Mode  Cnt   Score    Error  Units
TransformerBenchmark.baseline     no change    avgt   25  22,215 ±  0,054  ns/op
TransformerBenchmark.transform    no change    avgt   25  22,540 ±  0,039  ns/op
TransformerBenchmark.transformed  no change    avgt   25  22,565 ±  0,059  ns/op
TransformerBenchmark.baseline     Some Change  avgt   25  63,122 ±  0,541  ns/op
TransformerBenchmark.transform    Some Change  avgt   25  63,405 ±  0,196  ns/op
TransformerBenchmark.transformed  Some Change  avgt   25  62,930 ±  0,209  ns/op
As you can see, for both strings, there’s practically no performance difference between the three call types (all scores are within about 2% of the baseline).
Summary
I realize that Transformable is probably too “extravagant” to actually make it into the JDK. Actually, even Transformer alone being returned by CharSequence and String probably isn’t worth it, because unary operations over CharSequences don’t seem that common (e.g. StringUtils contains just a few).
However, I found the general idea of Transformer and Transformable quite enticing. So I hope you enjoyed the read, and that you’ll find it useful in certain contexts 🙂
Appendix
Optional reading – feel free to skip it.
Similarity to Filterer
For comparison, let’s recall the Transformer and the generic version of the Filterer:
@FunctionalInterface
interface Transformer<T> {
    <R> R by(Function<? super T, ? extends R> function);
}

@FunctionalInterface
interface Filterer<R, T> {
    R by(Predicate<? super T> predicate);
}
We can see that both interfaces are somewhat similar. Their differences can be summarized in three points:
- Placement of type parameter <R>:
  - Transformer: on method,
  - Filterer: on interface;
- Type of functional interface:
  - Transformer: Function<T, R>,
  - Filterer: Predicate<T> (structurally: Function<T, boolean>);
- Return type of Function related to that of by() (this is the key difference):
  - Transformer: R → R,
  - Filterer: boolean → R.
Transformer as Functional Interface
You might have noticed that I marked an interface with a single generic method (Transformer) as @FunctionalInterface. As explained on StackOverflow, the JLS doesn’t allow a lambda expression for a target whose function type has type parameters of its own (i.e. whose single abstract method is generic).
Fortunately, as Brian Goetz pointed out, this restriction does not apply to method references. So Transformer is a valid functional interface, but it can be used with method references only.
It’s good because if we were to use anonymous classes to implement Transformer:
- the implementation would look ugly 🙂
- we’d have a class file generated for every implementation at compile time, whereas classes for method references are created at runtime using the invokedynamic opcode.
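The three ways of implementing Transformer can be compared side by side in the following sketch. Box is a hypothetical class introduced only for this illustration; the lambda variant is left commented out because the compiler rejects it.

```java
import java.util.function.Function;

@FunctionalInterface
interface Transformer<T> {
    <R> R by(Function<? super T, ? extends R> f);
}

// Hypothetical class used only to compare the implementation styles.
final class Box {
    private final int value;

    Box(int value) {
        this.value = value;
    }

    int value() {
        return value;
    }

    <R> R transform(Function<? super Box, ? extends R> f) {
        return f.apply(this);
    }

    // Method reference: accepted, even though Transformer.by is generic.
    Transformer<Box> viaMethodReference() {
        return this::transform;
    }

    // Lambda: rejected by the compiler, because the function type of
    // Transformer declares its own type parameter <R>.
    // Transformer<Box> viaLambda() {
    //     return f -> f.apply(this);
    // }

    // Anonymous class: accepted, but verbose, and compiled to its own class file.
    Transformer<Box> viaAnonymousClass() {
        return new Transformer<Box>() {
            @Override
            public <R> R by(Function<? super Box, ? extends R> f) {
                return f.apply(Box.this);
            }
        };
    }
}
```

Both accepted variants behave identically for callers, e.g. new Box(7).viaMethodReference().by(Box::value) and new Box(7).viaAnonymousClass().by(Box::value) apply the function to the same Box.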
3 thoughts on “Transformer Pattern”
Hi Tomasz, this is great stuff! I also doubt it will make it into JDK (but who knows)… I’ve read your message there, I hope it is well received and that you receive good feedback. I was wondering all the time if you had forgotten that lambdas targeting types with a generic type parameter are not allowed. But there it was, at the end of all. Glad you have taken that into consideration. And I’m also very excited to know that this restriction doesn’t apply to method references. There you have a subtle difference that can end up in a great behavioral change, as you’ve shown in the article.
Thanks for your kind words, Federico! I really appreciate them 🙂
PS. The first thing I did before starting to write the post was building a prototype and seeing whether it compiles 🙂
For reference: discussion on Reddit.