What does the `forall` keyword in Haskell/GHC do?
HaskellSyntaxTypesGhcForallHaskell Problem Overview
I'm beginning to understand how the forall
keyword is used in socalled "existential types" like this:
data ShowBox = forall s. Show s => SB s
This is only a subset, however, of how forall
is used and I simply cannot wrap my mind around its use in things like this:
runST :: forall a. (forall s. ST s a) > a
Or explaining why these are different:
foo :: (forall a. a > a) > (Char, Bool)
bar :: forall a. ((a > a) > (Char, Bool))
Or the whole RankNTypes
stuff...
I tend to prefer clear, jargonfree English rather than the kinds of language which are normal in academic environments. Most of the explanations I attempt to read on this (the ones I can find through search engines) have these problems:
 They're incomplete. They explain one part of the use of this keyword (like "existential types") which makes me feel happy until I read code that uses it in a completely different way (like
runST
,foo
andbar
above).  They're densely packed with assumptions that I've read the latest in whatever branch of discrete math, category theory or abstract algebra is popular this week. (If I never read the words "consult the paper whatever for details of implementation" again, it will be too soon.)
 They're written in ways that frequently turn even simple concepts into tortuously twisted and fractured grammar and semantics.
So...
On to the actual question. Can anybody completely explain the forall
keyword in clear, plain English (or, if it exists somewhere, point to such a clear explanation which I've missed) that doesn't assume I'm a mathematician steeped in the jargon?
Edited to add:
There were two standout answers from the higherquality ones below, but unfortunately I can only choose one as best. Norman's answer was detailed and useful, explaining things in a way that showed some of the theoretical underpinnings of forall
and at the same time showing me some of the practical implications of it. yairchu's answer covered an area nobody else mentioned (scoped type variables) and illustrated all of the concepts with code and a GHCi session. Were it possible to select both as best, I would. Unfortunately I can't and, after looking over both answers closely, I've decided that yairchu's slightly edges out Norman's because of the illustrative code and attached explanation. This is a bit unfair, however, because really I needed both answers to understand this to the point that forall
doesn't leave me with a faint sense of dread when I see it in a type signature.
Haskell Solutions
Solution 1  Haskell
Let's start with a code example:
foob :: forall a b. (b > b) > b > (a > b) > Maybe a > b
foob postProcess onNothin onJust mval =
postProcess val
where
val :: b
val = maybe onNothin onJust mval
This code doesn't compile (syntax error) in plain Haskell 98. It requires an extension to support the forall
keyword.
Basically, there are 3 different common uses for the forall
keyword (or at least so it seems), and each has its own Haskell extension: ScopedTypeVariables
, RankNTypes
/Rank2Types
, ExistentialQuantification
.
The code above doesn't get a syntax error with either of those enabled, but only typechecks with ScopedTypeVariables
enabled.
Scoped Type Variables:
Scoped type variables helps one specify types for code inside where
clauses. It makes the b
in val :: b
the same one as the b
in foob :: forall a b. (b > b) > b > (a > b) > Maybe a > b
.
A confusing point: you may hear that when you omit the forall
from a type it is actually still implicitly there. (from Norman's answer: "normally these languages omit the forall from polymorphic types"). This claim is correct, but it refers to the other uses of forall
, and not to the ScopedTypeVariables
use.
RankNTypes:
Let's start with that mayb :: b > (a > b) > Maybe a > b
is equivalent to mayb :: forall a b. b > (a > b) > Maybe a > b
, except for when ScopedTypeVariables
is enabled.
This means that it works for every a
and b
.
Let's say you want to do something like this.
ghci> let putInList x = [x]
ghci> liftTup putInList (5, "Blah")
([5], ["Blah"])
What must be the type of this liftTup
? It's liftTup :: (forall x. x > f x) > (a, b) > (f a, f b)
. To see why, let's try to code it:
ghci> let liftTup liftFunc (a, b) = (liftFunc a, liftFunc b)
ghci> liftTup (\x > [x]) (5, "Hello")
No instance for (Num [Char])
...
ghci>  huh?
ghci> :t liftTup
liftTup :: (t > t1) > (t, t) > (t1, t1)
"Hmm.. why does GHC infer that the tuple must contain two of the same type? Let's tell it they don't have to be"
 test.hs
liftTup :: (x > f x) > (a, b) > (f a, f b)
liftTup liftFunc (t, v) = (liftFunc t, liftFunc v)
ghci> :l test.hs
Couldnt match expected type 'x' against inferred type 'b'
...
Hmm. so here GHC doesn't let us apply liftFunc
on v
because v :: b
and liftFunc
wants an x
. We really want our function to get a function that accepts any possible x
!
{# LANGUAGE RankNTypes #}
liftTup :: (forall x. x > f x) > (a, b) > (f a, f b)
liftTup liftFunc (t, v) = (liftFunc t, liftFunc v)
So it's not liftTup
that works for all x
, it's the function that it gets that does.
Existential Quantification:
Let's use an example:
 test.hs
{# LANGUAGE ExistentialQuantification #}
data EQList = forall a. EQList [a]
eqListLen :: EQList > Int
eqListLen (EQList x) = length x
ghci> :l test.hs
ghci> eqListLen $ EQList ["Hello", "World"]
2
How is that different from RankNTypes?
ghci> :set XRankNTypes
ghci> length (["Hello", "World"] :: forall a. [a])
Couldnt match expected type 'a' against inferred type '[Char]'
...
With RankNTypes, forall a
meant that your expression must fit all possible a
s. For example:
ghci> length ([] :: forall a. [a])
0
An empty list does work as a list of any type.
So with ExistentialQuantification, forall
s in data
definitions mean that, the value contained can be of any suitable type, not that it must be of all suitable types.
Solution 2  Haskell
> Can anybody completely explain the forall keyword in clear, plain English?
No. (Well, maybe Don Stewart can.)
Here are the barriers to a simple, clear explanation or forall
:

It's a quantifier. You have a to have at least a little logic (predicate calculus) to have seen a universal or existential quantifier. If you've never seen predicate calculus or are not comfortable with quantifiers (and I have seen students during PhD qualifying exams who are not comfortable), then for you, there's no easy explanation of
forall
. 
It's a type quantifier. If you haven't seen System F and gotten some practice writing polymorphic types, you're going to find
forall
confusing. Experience with Haskell or ML is not enough, because normally these languages omit theforall
from polymorphic types. (In my mind, this is a languagedesign mistake.) 
In Haskell in particular,
forall
is used in ways that I find confusing. (I'm not a type theorist, but my work brings me in contact with a lot of type theory, and I'm quite comfortable with it.) For me, the main source of confusion is thatforall
is used to encode a type that I myself would prefer to write withexists
. It's justified by a tricky bit of type isomorphism involving quantifiers and arrows, and every time I want to understand it, I have to look things up and work out the isomorphism myself.If you are not comfortable with the idea of type isomorphism, or if you don't have any practice thinking about type isomorphisms, this use of
forall
is going to stymie you. 
While the general concept of
forall
is always the same (binding to introduce a type variable), the details of different uses can vary significantly. Informal English is not a very good tool for explaining the variations. To really understand what's going on, you need some mathematics. In this case the relevant mathematics can be found in Benjamin Pierce's introductory text Types and Programming Languages, which is a very good book.
As for your particular examples,

runST
should make your head hurt. Higherrank types (forall to the left of an arrow) are rarely found in the wild. I encourage you to read the paper that introducedrunST
: "Lazy Functional State Threads". This is a really good paper, and it will give you a much better intuition for the type ofrunST
in particular and for higherrank types in general. The explanation take several pages, it's very well done, and I'm not going to try to condense it here. 
Consider
foo :: (forall a. a > a) > (Char,Bool) bar :: forall a. ((a > a) > (Char, Bool))
If I call
bar
, I can simply pick any typea
that I like, and I can pass it a function from typea
to typea
. For example, I can pass the function(+1)
or the functionreverse
. You can think of theforall
as saying "I get to pick the type now". (The technical word for picking the type is instantiating.)The restrictions on calling
foo
are much more stringent: the argument tofoo
must be a polymorphic function. With that type, the only functions I can pass tofoo
areid
or a function that always diverges or errors, likeundefined
. The reason is that withfoo
, theforall
is to the left of the arrow, so as the caller offoo
I don't get to pick whata
is—rather it's the implementation offoo
that gets to pick whata
is. Becauseforall
is to the left of the arrow, rather than above the arrow as inbar
, the instantiation takes place in the body of the function rather than at the call site.
Summary: A complete explanation of the forall
keyword requires math and can be understood only by someone who has studied the math. Even partial explanations are hard to understand without math. But maybe my partial, nonmath explanations help a little. Go read Launchbury and Peyton Jones on runST
!
Addendum: Jargon "above", "below", "to the left of". These have nothing to do with the textual ways types are written and everything to do with abstractsyntax trees. In the abstract syntax, a forall
takes the name of a type variable, and then there is a full type "below" the forall. An arrow takes two types (argument and result type) and forms a new type (the function type). The argument type is "to the left of" the arrow; it is the arrow's left child in the abstractsyntax tree.
Examples:

In
forall a . [a] > [a]
, the forall is above the arrow; what's to the left of the arrow is[a]
. 
In
forall n f e x . (forall e x . n e x > f > Fact x f) > Block n e x > f > Fact x f
the type in parentheses would be called "a forall to the left of an arrow". (I'm using types like this in an optimizer I'm working on.)
Solution 3  Haskell
My original answer:
> Can anybody completely explain the forall keyword in clear, plain English
As Norman indicates, it is very hard to give a clear, plain English explanation of a technical term from type theory. We're all trying though.
There is only really one thing to remember about 'forall': it binds types to some scope. Once you understand that, everything is fairly easy. It is the equivalent of 'lambda' (or a form of 'let') on the type level  Norman Ramsey uses the notion of "left"/"above" to convey this same concept of scope in his excellent answer.
Most uses of 'forall' are very simple, and you can find them introduced in the GHC Users Manual, S7.8., particularly the excellent S7.8.5 on nested forms of 'forall'.
In Haskell, we usually leave off the binder for types, when the type is universally quanitified, like so:
length :: forall a. [a] > Int
is equivalent to:
length :: [a] > Int
That's it.
Since you can bind type variables now to some scope, you can have scopes other than the top level ("universally quantified"), like your first example, where the type variable is only visible within the data structure. This allows for hidden types ("existential types"). Or we can have arbitrary nesting of bindings ("rank N types").
To deeply understand type systems, you will need to learn some jargon. That's the nature of computer science. However, simple uses, like above, should be able to be grasped intuitively, via analogy with 'let' on the value level. A great introduction is Launchbury and Peyton Jones.
Solution 4  Haskell
Here is a quick and dirty explanation in plain terms that you're likely to be already familiar with.
The forall
keyword is really only used in one way in Haskell. It always means the same thing when you see it.
Universal quantification
A universally quantified type is a type of the form forall a. f a
. A value of that type can be thought of as a function that takes a type a
as its argument and returns a value of type f a
. Except that in Haskell these type arguments are passed implicitly by the type system. This "function" has to give you the same value no matter which type it receives, so the value is polymorphic.
For example, consider the type forall a. [a]
. A value of that type takes another type a
and gives you back a list of elements of that same type a
. There is only one possible implementation, of course. It would have to give you the empty list because a
could be absolutely any type. The empty list is the only list value that is polymorphic in its element type (since it has no elements).
Or the type forall a. a > a
. The caller of such a function provides both a type a
and a value of type a
. The implementation then has to return a value of that same type a
. There's only one possible implementation again. It would have to return the same value that it was given.
Existential quantification
An existentially quantified type would have the form exists a. f a
, if Haskell supported that notation. A value of that type can be thought of as a pair (or a "product") consisting of a type a
and a value of type f a
.
For example, if you have a value of type exists a. [a]
, you have a list of elements of some type. It could be any type, but even if you don't know what it is there's a lot you could do to such a list. You could reverse it, or you could count the number of elements, or perform any other list operation that doesn't depend on the type of the elements.
OK, so wait a minute. Why does Haskell use forall
to denote an "existential" type like the following?
data ShowBox = forall s. Show s => SB s
It can be confusing, but it's really describing the type of the data constructor SB
:
SB :: forall s. Show s => s > ShowBox
Once constructed, you can think of a value of type ShowBox
as consisting of two things. It's a type s
together with a value of type s
. In other words, it's a value of an existentially quantified type. ShowBox
could really be written as exists s. Show s => s
, if Haskell supported that notation.
runST
and friends
Given that, how are these different?
foo :: (forall a. a > a) > (Char,Bool)
bar :: forall a. ((a > a) > (Char, Bool))
Let's first take bar
. It takes a type a
and a function of type a > a
, and produces a value of type (Char, Bool)
. We could choose Int
as the a
and give it a function of type Int > Int
for example. But foo
is different. It requires that the implementation of foo
be able to pass any type it wants to the function we give it. So the only function we could reasonably give it is id
.
We should now be able to tackle the meaning of the type of runST
:
runST :: forall a. (forall s. ST s a) > a
So runST
has to be able to produce a value of type a
, no matter what type we give as a
. To do so, it uses an argument of type forall s. ST s a
which certainly must somehow produce the a
. What's more, it must be able to produce a value of type a
no matter what type the implementation of runST
decides to give as s
.
So in effect the runST
function is saying to you: "you can choose the type a
as long as I can choose the type s
."
OK, so what? The benefit is that this puts a constraint on the caller of runST
in that the type a
cannot involve the type s
at all, since you don't know what type s
is going to be. You can't pass it a value of type ST s [s]
, for example. What that means in practice is that the implementation of runST
can know that any value involving the type s
is local to its own implementation, so it's free to do things it otherwise wouldn't be allowed to, like mutate them in place. The type guarantees that the mutation is local to the implementation of runST
.
The type of runST
is an example of a rank2 polymorphic type because the type of its argument contains a forall
quantifier. The type of foo
above is also of rank 2. An ordinary polymorphic type, like that of bar
, is rank1, but it becomes rank2 if the types of arguments are required to be polymorphic, with their own forall
quantifier. And if a function takes rank2 arguments then its type is rank3, and so on. In general, a type that takes polymorphic arguments of rank n
has rank n + 1
.
Solution 5  Haskell
> They're densely packed with assumptions that I've read the latest in whatever branch of discrete math, category theory or abstract algebra is popular this week. (If I never read the words "consult the paper whatever for details of implementation" again, it will be too soon.)
Er, and what about simple firstorder logic? forall
is pretty clearly in reference to universal quantification, and in that context the term existential makes more sense as well, though it would be less awkward if there were an exists
keyword. Whether quantification is effectively universal or existential depends on the placement of the quantifier relative to where the variables are used on which side of a function arrow and it's all a bit confusing.
So, if that doesn't help, or if you just don't like symbolic logic, from a more functional programmingish perspective you can think of type variables as just being (implicit) type parameters to the function. Functions taking type parameters in this sense are traditionally written using a capital lambda for whatever reason, which I'll write here as /\
.
So, consider the id
function:
id :: forall a. a > a
id x = x
We can rewrite it as lambdas, moving the "type parameter" out of the type signature and adding inline type annotations:
id = /\a > (\x > x) :: a > a
Here's the same thing done to const
:
const = /\a b > (\x y > x) :: a > b > a
So your bar
function might be something like this:
bar = /\a > (\f > ('t', True)) :: (a > a) > (Char, Bool)
Note that the type of the function given to bar
as an argument depends on bar
's type parameter. Consider if you had something like this instead:
bar2 = /\a > (\f > (f 't', True)) :: (a > a) > (Char, Bool)
Here bar2
is applying the function to something of type Char
, so giving bar2
any type parameter other than Char
will cause a type error.
On the other hand, here's what foo
might look like:
foo = (\f > (f Char 't', f Bool True))
Unlike bar
, foo
doesn't actually take any type parameters at all! It takes a function that itself takes a type parameter, then applies that function to two different types.
So when you see a forall
in a type signature, just think of it as a lambda expression for type signatures. Just like regular lambdas, the scope of forall
extends as far to the right as possible, up to enclosing parenthesis, and just like variables bound in a regular lambda, the type variables bound by a forall
are only in scope within the quantified expression.
Post scriptum: Perhaps you might wondernow that we're thinking about functions taking type parameters, why can't we do something more interesting with those parameters than put them into a type signature? The answer is that we can!
A function that puts type variables together with a label and returns a new type is a type constructor, which you could write something like this:
Either = /\a b > ...
But we'd need completely new notation, because the way such a type is written, like Either a b
, is already suggestive of "apply the function Either
to these parameters".
On the other hand, a function that sort of "pattern matches" on its type parameters, returning different values for different types, is a method of a type class. A slight expansion to my /\
syntax above suggests something like this:
fmap = /\ f a b > case f of
Maybe > (\g x > case x of
Just y > Just b g y
Nothing > Nothing b) :: (a > b) > Maybe a > Maybe b
[] > (\g x > case x of
(y:ys) > g y : fmap [] a b g ys
[] > [] b) :: (a > b) > [a] > [b]
Personally, I think I prefer Haskell's actual syntax...
A function that "pattern matches" its type parameters and returns an arbitrary, existing type is a type family or functional dependencyin the former case, it even already looks a great deal like a function definition.
Solution 6  Haskell
> Can anybody completely explain the forall keyword in clear, plain English (or, if it exists somewhere, point to such a clear explanation which I've missed) that doesn't assume I'm a mathematician steeped in the jargon?
I am going to try and explain just the meaning and maybe the application of forall
in the context of Haskell and its type systems.
But before you understand that I would like to direct you to a very accessible and nice talk by Runar Bjarnason titled "Constraints Liberate, Liberties Constrain". The talk is full of examples from real world use cases as well as examples in Scala to support this statement, although it doesn't mention forall
. I will try to explain the forall
perspective below.
CONSTRAINTS LIBERATE, LIBERTIES CONSTRAIN
Its very important to digest and believe this statement to proceed with the following explanation, so I urge you to watch the talk(at least parts of it).
Now a very common example, showing the expressiveness of the Haskell type system is this type signature:
foo :: a > a
It is said that given this type signature, there is only one function that can satisfy this type and that is the identity
function or what is more popularly known id
.
In the beginning stages of me learning Haskell, I always wondered the below functions:
foo 5 = 6
foo True = False
they both satisfy the above type signature, then why do Haskell folks claim that it is id
alone which satisfies the type signature?
That is because there is an implicit forall
hidden in the type signature. The actual type is:
id :: forall a. a > a
So, now let us come back to the statement: Constraints liberate, liberties constrain
Translating that to the type system, this statement becomes:
A constraint at the type level, becomes a liberty at the term level
and A liberty at the type level, becomes a constraint at the term level
Let us try to prove the first statement:
A constraint at the type level..
So putting a constraint on our type signature
foo :: (Num a) => a > a
becomes a liberty at the term level gives us the liberty or flexibility to write all of these
foo 5 = 6
foo 4 = 2
foo 7 = 9
...
Same can be observed by constraining a
with any other typeclass etc
So now what this type signature: foo :: (Num a) => a > a
translates to is:
∃a , st a > a, ∀a ∈ Num
This is known as existential quantification, which translates to there exists some instances of a
for which a function when fed something of type a
returns something of the same type, and those instances all belong to the set of Numbers.
Hence we can see adding a constraint(that a
should belong to the set of Numbers), liberates the term level to have multiple possible implementations.
Now coming to the second statement and the one which actually carries the explanation of forall
:
A liberty at the type level, becomes a constraint at the term level
So now let us liberate the the function at the type level:
foo :: forall a. a > a
Now this translates to:
∀a , a > a
which means that the implementation of this type signature should be such that it is a > a
for all circumstances.
So now this starts constraining us at the term level. We can no longer write
foo 5 = 7
because this implementation would not satisfy if we put a
as a Bool
. a
can be a Char
or a [Char]
or a custom datatype. Under all circumstances it should return something of the similar type. This liberty at the type level is what is known as Universal Quantification and the only function which can satisfy this is
foo a = a
which is commonly known as the identity
function
Hence forall
is a liberty
at the type level, whose actual purpose is to constrain
the term level to a particular implementation.
Solution 7  Haskell
The reason why there are different uses of this keyword is that it's actually used in at least two different type system extensions: higherrank types, and existentials.
It's probably best just to read about and understand those two things separately, rather than trying to get an explanation of why 'forall' is an appropriate bit of syntax in both at the same time.
Solution 8  Haskell
How is existential existential?#
> With ExistentialQuantification, forall
s in data
definitions mean
> that, the value contained can be of any suitable type, not
> that it must be of all suitable types.
 yachiru's answer
An explanation of why forall
in data
definitions are isomorphic to (exists a. a)
(pseudoHaskell) can be found in wikibooks's "Haskell/Existentially quantified types".
The following is a brief verbatim summary:
data T = forall a. MkT a  an existential datatype
MkT :: forall a. a > T  the type of the existential constructor
When patternmatching/deconstructing MkT x
, what is the type of x
?
foo (MkT x) = ...   what is the type of x?
x
can be any type (as stated in the forall
), and so it's type is:
x :: exists a. a  (pseudoHaskell)
Therefore, the following are isomorphic:
data T = forall a. MkT a  an existential datatype
data T = MkT (exists a. a)  (pseudoHaskell)
forall means forall
My simple interpretation of all this, is that "forall
really means 'for all'".
An important distinction to make is the impact of forall
on the definition versus function application.
A forall
means the definition of the value or function must be polymorphic.
If the thing being defined is a polymorphic value, then it means that the value must be valid for all suitable a
, which is quite restrictive.
If the thing being defined is a polymorphic function, then it means that the function must be valid for all suitable a
, which isn't that restrictive because just because the function is polymorphic doesn't mean the parameter being applied have to be polymorphic. That is, if the function is valid for all a
, then conversely any suitable a
can be applied to the function. However, the type of the parameter can only be chosen once in the function definition.
If a forall
is inside the function parameter's type (i.e., a Rank2Type
) then it means the applied parameter must be truly polymorphic, to be consistent with the idea of forall
means definition is polymorphic. In this case, the type of the parameter can be chosen more than once in the function definition ("and is chosen by the implementation of the function", as pointed out by Norman)
Therefore, the reason why existential data
definitions allows any a
is because the data constructor is a polymorphic function:
MkT :: forall a. a > T
kind of MkT :: a > *
Which means any a
may be applied to the function. As opposed to, say, a polymorphic value:
valueT :: forall a. [a]
kind of valueT :: a
Which means that the definition of valueT must be polymorphic. In this case, valueT
can be defined as empty list []
of all types.
[] :: [t]
Differences
Even though the meaning for forall
is consistent in ExistentialQuantification
and RankNType
, existentials has a difference since the data
constructor can be used in pattern matching. As documented in the ghc user guide:
> When pattern matching, each pattern match introduces a new, distinct, type for each existential type variable. These types cannot be unified with any other type, nor can they escape from the scope of the pattern match.