The merits of type inference

One of the lesser known benefits of Frege (and Haskell) is the way that it does type inference. One reason is that so many languages list type inference in their fact sheet and so having this feature doesn’t get people excited. It must be said, though, that there are many different meanings of the word from meaning almost nothing to meaning everything.

  • In some languages, type inference only means that type declarations do not need to be repeated when definition and assignment happens on the same line.

  • In other languages, the type system can infer the type of a variable based on a literal declaration.

  • Some type systems can locally infer the return type of an expression if the parameter types are given.

  • Frege belongs to the ML-family of languages that enjoy a Hindley-Milner type system. They are solely build from expressions and can globally infer each and every expression in the program - up to the entire program. Only for very special constructions where higher-ranked types are involved do they need some hints from the programmer.

Let’s see some examples why that makes a big difference.

Beginning simply

The simplest example to start with would be f x, just applying a function f to x once. When we define a method (say, once) that is implemented as f x then we can inspect the type that Frege infers for that function.

The simple definition
once f x = f x

We can ask the repl with :type once or the eclipse plugin with Ctrl-Space for the inferred type. The inferred type is also shown in the API documentation that we can create with the fregedoc tool.

The inferred type
once :: (b->a) -> b -> a

The rationale is as follows: in f x the only option is that f is a function. It takes one parameter x. We have no constraints for the type of x nor for the return type of f x. So we just call those a and b.

When f is of type b → a then the first parameter to once is consequently of type (b→a), the second parameter x must be of type b (since it is passed to f), and the return type of once must be the same type as the return type of f, thus a.

We say that the type that Frege infers is as polymorphic as possible. But what should it even mean to be more or less polymorphic?

Being more or less polymorphic

In the code above, we did not declare any types and let the type inferencer do its job. But if we want, we can provide a type declaration ourselves and there are good reasons for doing so.

Including the type declaration
once :: (b->a) -> b -> a
once f x = f x

The first obvious benefit is that any reader of the code immediately sees the type without any need to ask the IDE, the REPL, or to consult the API documentation. The type declaration becomes part of the internal documentation that reveals on which types once can operate and what it may return when given certain argument types (it will return the same type that f returns).

But we can also be more specific, i.e. less polymorphic. Let’s assume that we only care about functions that take an Int and return a String. The implementation can stay exactly the same but the declaration makes our function monomorphic (mono = one, morph = form → "of one form").

Being less polymorphic (monomorphic even)
once :: (Int->String) -> Int -> String
once f x = f x

Now, the interesting part is that the type inferencer will anyway do its job and check whether our implementation complies to the given declaration. If not, we get an error. Explicit declarations are allowed to be more specific (= less polymorphic) but they are not allowed to be more general (= more polymorphic) as long as they are still consistent.

It turns out that even this seemingly simple example is not quite as easy as initially thought and a slight extension makes it even more challenging.

A slight extension

Let’s assume that we want to create a function twice that works just like once except that it applies f to x twice.

Applying f twice
twice f x = f (f x)

Since we now apply f not only to x but also to the result of f x, it should be clear that the result type of f x needs to be the same as the type of x, right?

Going cross-eyed?

This line of argumentation can easily lead to headaches. Take it easy. Read it again until it makes sense.

In other words, the signature of twice must be:

twice :: (a->a) -> a -> a

and this is exactly what the type system infers for us. Yippeah!

What if things go wrong?

It is nice that the type inferencer takes on the headaches for us but what if we somehow got it wrong and made the following erroneous declaration?

Erroneous declaration
falseTwice :: (b->a) -> b -> a
falseTwice f x = f (f x)

Well, even in this case Frege doesn’t let us down. It reports a very detailed description of where we went wrong. Let’s look at the last of the reported errors.

Last error message
E <console>.fr:7: type error in expression f x
    type is : a
    expected: b

Frege took issue with the expression f x since in the type signature we declared f to be of type (b→a) and therefore f x is of type a.

But the type inferencer expects f x to be of type b since it is used in the expression f (f x) as the first argument to the outer f.

Understanding error messages

Languages with type inference like Haskell and Frege are notoriously known for giving error messages that are more difficult to understand than those from let’s say Java. And this is true.
This is partly so because type inference inspects the code not only "from declaration to implementation" but also vice versa. The code is scrutinized much more deeply. But when inconsistencies are found, it is often difficult to say whether the declaration or the implementation is wrong.
However, the topic is under continuous improvement and the Frege project team is happy to receive any examples where the error message could be better.

The ultimate example for type inference

Nothing beats QuickCheck when it comes to examples of type inference.

Let’s assume we have twice implemented without an explicit type signature. Now we need a function that we can apply twice and here is one that prefixes any type that can be prefixed. And, yes, I mean that so!

The prefix function
prefix front x = front ++ x

We do not care at all what to prefix but let Frege figure out the most general type that could be used here.

For the geeks

Frege will infer a rather surprising type for the prefix function: ListSemigroup b ⇒ b a → b a → b a. We ignore it for now. It should suffice to say that it is an algebraic type for things that can be concatenated with ++ in an associative fashion like Strings or arbitrary lists.

Now onto defining the invariant that we would like to have checked on random input: we assume that twice applies any function twice (not surprising). To make things even more difficult for Frege, we give the to-be-tested expression not as a function reference but as a lambda expression. Frege must now infer the type of a sub-expression.

The invariant for twice using prefix
import Test.QuickCheck
applied_twice = property $ \x -> twice (prefix "<") x == "<<" ++ x
quickCheck applied_twice

And QuickCheck dutifully responds with OK, passed 100 tests.

Imagine the sheer amount of intelligence that Frege applies to work this out. Frege needs to find out the type for x such that it can ask it for random values. x is an argument to twice but that is unconstrained, so we get no information out of that directly. But the type of x is also the return type of the first argument to twice (i.e. prefix "<"). But that is the very abstract type ListSemigroup. Only after unifying the String "<" (the first argument to prefix) with it’s return type (ListSemigroup String), QuickCheck knows that (prefix "<") returns a String and thus it must have been fed a String to begin with and thus x is a String and thus it must ask the String type for random String values. Puh.

And Frege arrived at that conclusion without the slightest help from the programmer.

Anyway, the code is guaranteed to be structurally sound - at compile time!


You may want to convince yourself by sending some functions to twice that would be problematic to be called twice.

As a parting thought: imagine doing this with any other JVM language.

results matching ""

    No results matching ""