- OCaml 99.1%
- Dune 0.9%
| lib | ||
| .gitignore | ||
| .ocamlformat | ||
| dune | ||
| dune-project | ||
| main.ml | ||
| README.org | ||
- Approaches
- The Type language
- Type schemes
- The Type Context
- Free type variables
- Generalization and instantiation
Implementation of the paper "Algorithm W Step by Step" by Martin Grabmüller, with some modifications to support N-ary functions and tuples.
Approaches
The standard "simple" ML approach is to unify as you go. This is prone to error if the language has more advanced features like typeclasses or forward references; in that case, the ordering of the program can change how it types.
An alternative is to separate the typechecking into a separate AST traversal and constraint generation stage, and a constraint solving stage. This way, all constraints are known at solve time regardless of ordernig.
An additional benefit, is that you could formulate the constraints in a language like Z3 and get the solving for free.
Furthermore, this approach scales better, as new language features just map onto the constraint solver; the solver and constraint language remains the same. (SPJ)
The Type language
Type-checking and -inference algorithms are expressed using a type language, somewhat analogous to the expression language of functional programming languages. Terms in the type language, also known as types or monotypes, are recursively defined:
type mono =
| TyVar of string
| TyFun of mono * mono
| Int
| Bool
This language can express things that are familiar to ML programmers. Examples:
intint -> boola -> a
and so on.
Substitutions
A substitution, simply put, is an assignment of type variables to types.
It can be implemented using a Map from string to t, or an association list,
etc.
Transformations on types is expressed in terms of substitutions, and thus the definition of substitutions and the definition of the type language forms the core
let rec apply s = function
| TyVar a as t' ->
( match Map.get a s with
| Some t -> t
| None -> t' )
| TyFun (t1, t2) -> TyFun (apply s t1, apply s t2)
| t -> t
Type schemes
Also known as polytypes.
The idea of type schemes is usually poorly motivated in the literature, and since
the forall can be omitted when writing type signatures in ML, the need might not
be obvious. However, being able to "quantify" a type variable is the ticket to
polymorphism.
Existential and universal quantification
The difference between the monotype a -> a, and the polytype forall a. a -> a,
is that a in a -> a represents an unknown, yet to be determined, concrete type,
which will be resolved later by unification.
The former is somtimes referred to as existential quantification;
the type a "exists but is unknown".
The latter is referred to as universal quantification;
the type a can be any type.
Definition of polytypes
The normal way to define a polytype, is something along the lines
type poly =
| Forall of string * poly
| Type of mono
allowing you to express something like forall a. forall b. a -> b, where
a -> b is a monotype.
Polytypes are considered equal up to ordering and renaming, so another possible
representation is something like type poly = set * mono with a set of quantified
type variables.
Renaming a quantified type is called alpha-conversion.
Unquantified type variables are considered constant under substitution.
Using different notation, where quantified variables are represented by a different set of symbols, specializing a type by substitution is a simple remapping.
The Type Context
Often named Γ, the type context is simply a mapping from symbol names in
the expression language, to (poly)types. It can be implemented as a Map
from string to poly.
The context contains elements of the sort x : σ, called type assignments,
asserting that the variable named x has polytype σ.
Motivation for contexts: Lambdas.
To type fun(x) -> e, x can occur in the expression e, but x does not
have a known type. We need the context to keep track of the types of symbols that
can appear in the body of the expression we are typing.
Free type variables
Type variables occurring in a type without a corresponding quantification are said to be free.