The Transformer pattern is a design pattern for Java (and potentially other OO languages with use-site variance only and invariant parameter types) that helps objects within a subtype hierarchy fluently transform themselves into objects of any type.
Context
I was following the OpenJDK threads (Sep 18-21, Nov 12-13, Nov 13-30, Dec 3-4) related to issue JDK-8203703 by Jim Laskey, and an idea came to my mind. Let me recap the relevant parts of the discussion.
Proposal of String.transform
The proposal as per JDK-8203703 boils down to the following addition:
public final class String implements /*...*/ CharSequence {
// ...
public <R> R transform(Function<? super String, ? extends R> f) {
return f.apply(this);
}
// ...
}
As you can see, this method simply calls the given Function on itself, and that’s it. Yet it’s very useful for chaining utility methods, like the ones in StringUtils from Apache Commons:
String result = string
.toLowerCase()
.transform(StringUtils::stripAccents)
.transform(StringUtils::capitalize);
Normally, we’d have to write:
String result = StringUtils.capitalize(StringUtils.stripAccents(string.toLowerCase()));
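To try the chained and nested forms side by side without Apache Commons on the classpath, here is a minimal sketch. It assumes JDK 12 or later (where String.transform exists); TextUtils and its two methods are hypothetical stand-ins for StringUtils.stripAccents and StringUtils.capitalize, not the library’s actual implementations.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical stand-ins for the Apache Commons utilities (illustration only).
class TextUtils {
    // Decomposes accented characters and drops the combining marks.
    static String stripAccents(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
    }

    // Upper-cases the first character, leaving the rest untouched.
    static String capitalize(String s) {
        return s.isEmpty() ? s : Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }
}

class ChainingDemo {
    // Reads top to bottom, in the order the operations are applied.
    static String chained(String s) {
        return s.toLowerCase(Locale.ROOT)
                .transform(TextUtils::stripAccents)
                .transform(TextUtils::capitalize);
    }

    // The same pipeline written as nested calls, read inside out.
    static String nested(String s) {
        return TextUtils.capitalize(TextUtils.stripAccents(s.toLowerCase(Locale.ROOT)));
    }

    public static void main(String[] args) {
        System.out.println(chained("\u00C9COLE")); // prints "Ecole"
    }
}
```

Both methods compute the same result; the only difference is the reading order.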
Considering CharSequence.transform
At some point, Alan Bateman raised the issue of potentially defining transform in CharSequence as:
<R> R transform(Function<? super CharSequence, ? extends R> f)
This would have the benefit of being able to apply CharSequence-based utility methods (e.g. StringUtils.isNumeric) on any CharSequence, e.g.:
boolean isNumeric = charSequence
.transform(s -> StringUtils.defaultIfBlank(s, "0"))
.transform(StringUtils::isNumeric);
However, as Rémi Forax pointed out, the problem with this signature is that:
- if it were inherited by String: most utility methods take String as a parameter, so such methods wouldn’t work (e.g. StringUtils::capitalize),
- if it were overridden by String: no useful override could be made because:
  - Function<? super String, R> is a supertype of Function<? super CharSequence, R> (which is actually good),
  - Java doesn’t support contravariant parameter types (which is the true obstacle here).
As a result, the subject of CharSequence.transform has been dropped.
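The obstacle can be reproduced on a small scale. In the sketch below, Text and Word are hypothetical stand-ins for CharSequence and String (we can’t edit the real ones); the commented-out lines mark the places where, as far as I can tell, javac refuses to cooperate.

```java
import java.util.function.Function;

// Hypothetical stand-in for CharSequence.
interface Text {
    String value();

    default <R> R transform(Function<? super Text, ? extends R> f) {
        return f.apply(this);
    }
}

// Hypothetical stand-in for String.
final class Word implements Text {
    private final String value;

    Word(String value) { this.value = value; }

    @Override
    public String value() { return value; }

    int length() { return value.length(); }

    // Narrowing the lower bound is neither an override nor a legal overload:
    // both methods erase to transform(Function), so javac reports a name clash.
    // public <R> R transform(Function<? super Word, ? extends R> f) { return f.apply(this); }
}

class ProblemDemo {
    static int textLength(Text t) { return t.value().length(); }

    static int wordLength(Word w) { return w.length(); }

    public static void main(String[] args) {
        Word word = new Word("hello");

        // Works: the function accepts the base type.
        Integer viaText = word.transform(ProblemDemo::textLength);
        System.out.println(viaText);

        // Does not compile: the inherited transform only accepts functions
        // that take Text (or Object), even though the receiver is a Word.
        // Integer viaWord = word.transform(ProblemDemo::wordLength);
    }
}
```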
Problem
To sum up, the problem consists in being able to transform:
- a CharSequence, using a Function that takes CharSequence or Object (? super CharSequence),
- a String, using a Function that takes String or any of its supertypes (? super String).
When I looked at those lower bounds, I realized that I had already seen this kind of problem (cf. Filterer Pattern).
So the problem boils down to: how to covariantly specify the contravariant bound for the Function.
Solution
Java doesn’t support contravariant parameter types, and its syntax doesn’t provide a way to covariantly (? extends) specify a contravariant (? super) bound in a single declaration. However, it is possible to do this in two separate declarations, by means of an intermediate helper type.
Assuming we want to solve this for a generic Function<? super T, ? extends R>, we need to:
- move the above Function parameter to a helper interface parametrized with T,
- use this helper interface with an upper bound (? extends T) as a return type.
Transformer Interface
I defined such a helper interface (which I dubbed Transformer) as follows:
@FunctionalInterface
interface Transformer<T> {
<R> R by(Function<? super T, ? extends R> f);
}
Transformable Interface
Having defined Transformer, we can define the following base interface dubbed Transformable:
interface Transformable {
Transformer<?> transformed();
}
This interface doesn’t do much on its own, but I treat it as a specification for:
- subtype implementors: it reminds them to override the transformed method with a proper upper bound, and to implement it,
- subtype users: it reminds them they can call transformed().by(f).
To sum up, this pair (Transformer & Transformable) lets us replace:
obj.transform(function)
with:
obj.transformed().by(function)
Sample Implementation
Before we go back to String, let’s see how easy it is to implement both those interfaces:
class Sample implements Transformable {
@Override
public Transformer<Sample> transformed() {
return this::transform; // method reference
}
private <R> R transform(Function<? super Sample, ? extends R> f) {
return f.apply(this);
}
}
As you can see, all it takes is a method reference to transform.
The transform method was made private so that there’s no conflict in subtypes when they define their own (appropriately lower-bounded) transform.
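To see the “no conflict in subtypes” part in action, here is a sketch of a two-level hierarchy. Animal, Dog and PatternDemo are hypothetical names invented for this illustration; the classes are deliberately top-level, so that the private transform of one isn’t accessible from the other.

```java
import java.util.function.Function;

@FunctionalInterface
interface Transformer<T> {
    <R> R by(Function<? super T, ? extends R> f);
}

interface Transformable {
    Transformer<?> transformed();
}

// Hypothetical non-final base type: its upper bound is expressed with a wildcard.
class Animal implements Transformable {
    final String name;

    Animal(String name) { this.name = name; }

    @Override
    public Transformer<? extends Animal> transformed() {
        return this::transform;
    }

    private <R> R transform(Function<? super Animal, ? extends R> f) {
        return f.apply(this);
    }
}

// Hypothetical final subtype: it narrows the return type covariantly.
final class Dog extends Animal {
    Dog(String name) { super(name); }

    String bark() { return name + " says woof"; }

    @Override
    public Transformer<Dog> transformed() {
        return this::transform;
    }

    // Private in both classes, so the two declarations never meet.
    private <R> R transform(Function<? super Dog, ? extends R> f) {
        return f.apply(this);
    }
}

class PatternDemo {
    static String describe(Animal a) { return "animal " + a.name; }

    public static void main(String[] args) {
        Dog dog = new Dog("Rex");
        // Statically a Dog: functions that take Dog are accepted.
        String viaDog = dog.transformed().by(Dog::bark);
        System.out.println(viaDog); // prints "Rex says woof"

        Animal animal = dog;
        // Statically an Animal: only functions that take Animal (or Object) are accepted.
        String viaAnimal = animal.transformed().by(PatternDemo::describe);
        System.out.println(viaAnimal); // prints "animal Rex"

        // Does not compile, because the static type is only Animal:
        // animal.transformed().by(Dog::bark);
    }
}
```

The accepted function type follows the static type of the receiver, which is exactly what a single transform declaration couldn’t express.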
Solution in Context
Implementation in Context
How could it apply to CharSequence and String? First, we’d make CharSequence extend Transformable:
public interface CharSequence extends Transformable {
// ...
@Override
Transformer<? extends CharSequence> transformed();
// ...
}
Then, we’d implement transformed in String, returning a method reference to the public transform method (added in JDK 12):
public final class String implements /*...*/ CharSequence {
// ...
@Override
public Transformer<String> transformed() {
return this::transform;
}
// ...
}
Note that we made a covariant change to the return type of transformed: Transformer<? extends CharSequence> → Transformer<String>.
Compatibility Risk
I judge the compatibility risk of adding CharSequence.transformed to be minimal. It could break backwards compatibility only for those CharSequence implementations that already have a no-argument transformed method (which seems unlikely).
Usage in Context
The usage for String would not change because there’s no point in calling transformed().by() over transform().
The usage for generic CharSequence, though, would need to resort to transformed().by(), because CharSequence may have many implementations, so their transform methods must be private:
boolean isNumeric = charSequence
.transformed().by(s -> StringUtils.defaultIfBlank(s, "0"))
.transformed().by(StringUtils::isNumeric);
Performance
If you’re unfamiliar with how the JVM (which most often means HotSpot) and its JIT compiler work, you might wonder whether this apparent creation of an extra object (Transformer in transformed) will affect performance.
Fortunately, thanks to escape analysis* and scalar replacement, once the JIT compiler has inlined the call, this object typically doesn’t get allocated on the heap at all. So the answer is: no, in practice it won’t.
* This Wikipedia entry contains a false statement: “So the compiler can safely allocate both objects on the stack.” As Aleksey Shipilёv explains, Java doesn’t allocate entire objects on the stack.
Benchmark
If you need proof, here’s a little benchmark (using Aleksey Shipilёv’s excellent JMH benchmark harness). Since I couldn’t (easily) add the necessary methods to String, I created a simple wrapper over String, and implemented the benchmark on top of it.
The benchmark tests the toLowerCase() operation:
- on two strings:
  - "no change" (a no-op)
  - "Some Change"
- using three call types:
  - direct (baseline)
  - transform()
  - transformed().by()
You can find the full source code for this benchmark in this GitHub gist.
Here are the results (run on Oracle JDK 8, took 50 minutes):
Benchmark                         (string)     Mode  Cnt   Score    Error  Units
TransformerBenchmark.baseline     no change    avgt   25  22,215 ±  0,054  ns/op
TransformerBenchmark.transform    no change    avgt   25  22,540 ±  0,039  ns/op
TransformerBenchmark.transformed  no change    avgt   25  22,565 ±  0,059  ns/op
TransformerBenchmark.baseline     Some Change  avgt   25  63,122 ±  0,541  ns/op
TransformerBenchmark.transform    Some Change  avgt   25  63,405 ±  0,196  ns/op
TransformerBenchmark.transformed  Some Change  avgt   25  62,930 ±  0,209  ns/op
As you can see, for both strings, the differences between the three call types are negligible (under 2% in the worst case).
Summary
I realize that Transformable is probably too “extravagant” to actually make it into the JDK. Actually, even Transformer alone being returned by CharSequence and String probably isn’t worth it, because unary operations over CharSequences don’t seem that common (e.g. StringUtils contains just a few).
However, I found the general idea of Transformer and Transformable quite enticing. So I hope you enjoyed the read, and that you’ll find it useful in certain contexts 🙂
Appendix
Optional reading – feel free to skip it.
Similarity to Filterer
For comparison, let’s recall the Transformer and the generic version of the Filterer:
@FunctionalInterface
interface Transformer<T> {
<R> R by(Function<? super T, ? extends R> function);
}
@FunctionalInterface
interface Filterer<R, T> {
R by(Predicate<? super T> predicate);
}
We can see that both interfaces are somewhat similar. Their differences can be summarized in three points:
- Placement of type parameter <R>:
  - Transformer: on method,
  - Filterer: on interface;
- Type of functional interface:
  - Transformer: Function<T, R>,
  - Filterer: Predicate<T> (structurally: Function<T, boolean>);
- Return type of Function related to that of by() (this is the key difference):
  - Transformer: R → R,
  - Filterer: boolean → R.
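To get a feel for the two interfaces side by side, here is a rough sketch on a single container. Words, items and filtered() are hypothetical names made up for this comparison (they aren’t taken from the Filterer post), and List.copyOf assumes JDK 10 or later.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

@FunctionalInterface
interface Transformer<T> {
    <R> R by(Function<? super T, ? extends R> function);
}

@FunctionalInterface
interface Filterer<R, T> {
    R by(Predicate<? super T> predicate);
}

// Hypothetical container used only to contrast the two by() methods.
final class Words {
    final List<String> items;

    Words(List<String> items) { this.items = List.copyOf(items); }

    // R is chosen by the caller, per call: by() returns whatever the function returns.
    Transformer<Words> transformed() { return this::transform; }

    // R is fixed by the interface: by() always returns another Words.
    Filterer<Words, String> filtered() { return this::filter; }

    private <R> R transform(Function<? super Words, ? extends R> f) {
        return f.apply(this);
    }

    private Words filter(Predicate<? super String> p) {
        return new Words(items.stream().filter(p).collect(Collectors.toList()));
    }
}
```

With this in place, words.filtered().by(s -> s.length() > 1) yields a Words again, whereas words.transformed().by(w -> w.items.size()) yields whatever the function returns (here an Integer).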
Transformer as Functional Interface
You might have noticed that I marked an interface with a single generic method (Transformer) as @FunctionalInterface. As explained on StackOverflow, the JLS doesn’t allow a lambda expression for a target whose single abstract method is generic (i.e. declares its own type parameters).
Fortunately, as Brian Goetz pointed out, this restriction does not apply to method references. So Transformer is a valid functional interface, but it can be used with method references only.
That’s good, because if we were to use anonymous classes to implement Transformer:
- the implementation would look ugly 🙂
- we’d have a class file generated for every implementation at compile time, whereas classes for method references are created at runtime using the invokedynamic opcode.
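The three ways of implementing Transformer can be compared in one place. In this sketch, Box is a hypothetical class introduced only for the comparison; the lambda variant is left commented out because javac rejects a lambda whose target method is generic.

```java
import java.util.function.Function;

@FunctionalInterface
interface Transformer<T> {
    <R> R by(Function<? super T, ? extends R> f);
}

// Hypothetical class used only to compare the implementation styles.
final class Box {
    final int value;

    Box(int value) { this.value = value; }

    private <R> R transform(Function<? super Box, ? extends R> f) {
        return f.apply(this);
    }

    // Accepted: a method reference may target a generic functional method.
    Transformer<Box> viaMethodReference() {
        return this::transform;
    }

    // Accepted, but verbose, and compiled to its own class file.
    Transformer<Box> viaAnonymousClass() {
        return new Transformer<Box>() {
            @Override
            public <R> R by(Function<? super Box, ? extends R> f) {
                return f.apply(Box.this);
            }
        };
    }

    // Rejected by javac: a lambda has no way to declare the type parameter R.
    // Transformer<Box> viaLambda() {
    //     return f -> f.apply(this);
    // }
}
```

Note that the restriction concerns only the implementation of Transformer itself: the Function passed to by() can still be a lambda, e.g. box.viaMethodReference().by(b -> b.value + 1).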
3 thoughts on “Transformer Pattern”
Hi Tomasz, this is great stuff! I also doubt it will make it into JDK (but who knows)… I’ve read your message there, I hope it is well received and that you receive good feedback. I was wondering all the time if you had forgotten that lambdas targeting types with a generic type parameter are not allowed. But there it was, at the end of all. Glad you have taken that into consideration. And I’m also very excited to know that this restriction doesn’t apply to method references. There you have a subtle difference that can end up in a great behavioral change, as you’ve shown in the article.
Thanks for your kind words, Federico! I really appreciate them 🙂
PS. The first thing I did before starting to write the post was building a prototype and seeing whether it compiles 🙂
For reference: discussion on Reddit.