map ∷ (a→b) → [a] → [b] -- can be implicit
map f [] = []
map f (x:xs) = f x : map f xs
-- instantiation is always invisible
sqrs = map (^2) [1..10]
-- sqrs == [1, 4, 9, 16, ... , 100]
Guest post by Ingo Wechsung.
Advanced Parametric Polymorphism
Every programmer tries to avoid duplicate work. Whenever we identify a common pattern, we try to abstract over it such that we can reuse a generalized solution in many forms (i.e. polymorphically). The two main approaches to polymorphism are object-oriented subtyping and parameterized types.
Parametric polymorphism is established in mainstream programming languages for quite some time, though the term is unfamiliar to many. For example, in Java, it goes under the name of generic types.
The basic idea is simple: in the type declaration of functions and data (or methods and classes), types can be given as type variables as long as their concrete types can be determined later. Such type variables may be introduced at certain places, depending on the language. Java for example requires explicit introduction of type variables. Later, when the function or data is used, those formal type variables are instantiated with concrete types (which could well be other type variables that are valid in the context).
Here is an example in Frege that shows the abstract type declaration for the classical map function, its implementation, and its usage for the concrete type of integer lists and a usage thereof.
Here is an approximation of how this would look like in Java:
// generic types are explicit
public static<A,B> List<B> map(Function<A,B> f, List<A> xxs) {
// ... details omitted ...
}
// instantiation is visible
static List<Integer> sqrs = <Integer, Integer>map(
x -> x * x,
Arrays.asList(1,2,3,4,5,6,7,8,9,10));
Despite Frege type inference
being able to infer the type of map
entirely on its own,
it is customary to provide the type declaration explicitly.
(Tools like the eclipse plugin make this a one-click action.)
The point is that in Frege, the introduction of type variables can be implicit,
and the instantiation of type variables is always invisible.
Nevertheless, it can be deduced that map
is instantiated at type (Int→Int) → [Int] → [Int]
on the right hand side of the definition of sqrs
.
But because instantiation is invisible,
many Haskell and Frege programmers don’t even know that it exists at all.
In contrast, the Java code shows clearly both the mandatory introduction of type variables in the method declaration as well as their instantiation. The Java compiler can sometimes - but not always - infer those instantiations as well.
When generic is not polymorphic enough
Suppose your code contains a repeated use of a function (reverse
or drop n
) in a pattern like
"first apply that function to the first list, then to the second list, and then combine the
resulting lists with zip".
zip (drop n somelist) (drop n otherlist)
zip (reverse somelist) (reverse otherlist)
In the repeated use of this pattern, the types of reverse
and drop n
look identical.
∀a.[a] → [a] --(1)
(1) For the sake of clarity, we provide types using explicit quantification (∀a.
) in this article.
Giving this quantification is optional and usually omitted in real code.
There are even more examples for functions of this type that we might want to use.
reverse
tail
init
drop 10
take 20
id
The intent is clear: we want to combine two lists, but only after having applied the same transformation to both lists. In short, what we have is
zip (f somelist) (f otherlist)
for some appropriate function f
, and now we want to get rid of the parentheses and the
repetition of f
(remember that f
could be a rather lengthy expression itself).
It looks like we need a function fzip
, such that we could write
fzip f somelist otherlist
A possible implementation that comes to mind would be:
fzip
fzip f [] [] = []
fzip f xs ys = zip (f xs) (f ys)
Indeed, what could be easier?
But wait! When we ask the REPL for the type of fzip
, it tells us this:
∀a b.([a]->[b]) -> [a] -> [a] -> [(b,b)]
For some reason, it forces our two lists to have the same element type!
But the expressions that we wanted to refactor didn’t have that restriction. After all, the transformation functions we intent to use are generic in the list element type. They do work on any list.
Anyway, as it stands, applying the planned refactoring would result in type errors in all places where the first and the second list are not of the same type. For example we cannot fzip a list of characters with a list of boolean values.
So, what is wrong with our fzip
?
To understand the case, one needs to remember what has been said about instantiation above. In an expression like
fzip reverse ['a', 'b', 'c'] [false, true]
-- type error in expression [false,true]
-- type is : [Bool]
-- expected: [Char]
at what type should reverse
get instantiated? If we choose
[Char] → [Char]
it wont be able to reverse boolean lists. And if we choose
[Bool] → [Bool]
it wont be able to reverse the character list.
In the above example, the compiler chooses to instantiate reverse
at type [Char] → [Char]
according to the
character list argument, and therefore it expects the remaining argument to have the same type.
After all, this is what the type of fzip
demands! This explains the error message.
But why is this instantiation needed at all? It is needed because of a restriction of type inference in the Hindley-Milner type system, which forms the base of the type systems of languages like ML, Haskell, F# and Frege. This restriction says that lambda bound values (you can read this as "function arguments") are assumed to be monomorphic. And this needs to be so because otherwise type inference would become undecidable.
Ranking Types
Another way to put this is that type inference à la Hindley-Milner (in the following HM for short)
can only deal with polymorphism of rank 1.
Yet another way to put this is that rank 1 types are exactly those polymorphic types that a HM algorithm can infer.
Practically, this means that in languages that obey strictly to HM,
higher order functions can only take monomorphic functions or functions that are instantiated at a monomorphic type.
To be blunt, our fzip
can’t be written in ML or F#!
Using Higher Ranked Types
Fortunately, while type inference is undecidable for higher ranked types, type checking is not. That is, a computer cannot find a higher ranked type for some expression without further information. But given a type and an expression, it can decide whether the expression can possibly be of this type.
The type checkers of GHC (with language extension RankNTypes
) and Frege
employ this fact and allow polymorphic functions as arguments.
To make this work, the type of a function that takes polymorphic arguments must be annotated, or at least the polymorphic argument itself must be annotated accordingly, and type inference will do the rest.
When such an annotation is present, instead of looking for the type a functional argument needs to be instantiated at, the type checker just checks if the argument’s type is at least as general as the annotated type.
Hence, the solution to our problem is simply to point out that we want our function argument f
to be polymorphic.
We can do this by providing the following annotation for fzip
:
fzip ∷ (∀ a.[a] → [a]) → [x] → [y] → [(x,y)]
-- --------------- universally quantified
-- polymorphic type of f
fzip f xs ys = zip (f xs) (f ys)
The code for fzip
stays the same! But the type now says that f
is a function that takes a list
and returns a list of the same type, for all possible list element types.
In addition, the types of the lists f
works on is now completely decoupled from the actual types of the list arguments.
But since f
works on all list types, it is safe to apply it to both arguments.
The crucial point is the universally quantified polymorphic type for the function argument. When in doubt what to write here, one can simply ask the REPL for the type of such a function.
:type reverse
[α] -> [α]
To get the quantified type, add the forall
keyword (that can also be written ∀
, if you like)
and list all type variables that occur in the type. If you don’t like the type variable names,
you can simply rename them. For example, the following types are indistinguishable for the compiler:
forall a.[a] → [a]
forall b.[b] → [b]
∀ quetzalcoatl.[quetzalcoatl] → [quetzalcoatl]
Alternatively, we can write fzip
with an in-line type declaration for f only
but without the full type declaration for fzip
:
fzip (f ∷ ∀a.[a] → [a]) xs ys = zip (f xs) (f ys)
though I find this much less readable.
We can now use fzip
with a variety of functions. But the types of all of them need to be at least as polymorphic
as the annotated type for f
. For example, we cannot use fzip
with a function of the restricted type [Int] → [Int]
,
even if both lists are integer lists.
Here are some examples. I give the type of f
as comment:
fzip id [1..10] ['a'..'z'] -- ∀a. a → a
fzip (drop 3) [1..10] ['a'..'z'] -- ∀a.[a] → [a]
fzip reverse [1..10] ['a'..'z'] -- ∀a.[a] → [a]
fzip (map id) [1..10] ['a'..'z'] -- ∀a.[a] → [a]
fzip tail [1..10] ['a'..'z'] -- ∀a.[a] → [a]
fzip (const []) [1..10] ['a'..'z'] -- ∀a b.a → [b]
This is it. We will come back to this and discuss the shortcomings as well as possible improvements of the higher rank solution.
Meanwhile, here is some homework for the very interested ones:
-
Why not be even more general and let the type of
f
be∀a b.[a] → [b]
? (The last example hints at one reason.) -
(for true Java experts) Write
fzip
in Java without using casts or@SuppressWarnings
so that it compiles without warnings! (Yes, it is possible.)
Java solution |