It explains HM algorithm in detail, with step-by-step examples.
You don’t need to read other chapters of the book to understand everything (if you are familiar with OCaml).
I may not explain things that are already described in the book.
I’ll also assume that you have took the CS242 course and have done the assignment.
Type Inference Rules
The CS3110 book introduces some symbols to represent the constraints. I’ll translate them in a style that is more consistent with CS242’s type checking rules. For example, the “if” part of the rule is written below the “then” part in CS3110 while in CS242 we prefer to write “if” above the “then”.
The basic idea of HM algorithm is: instead of checking every type in an expression eagerly, we generate a set of type constraint which is similar to a set of equations, and solve them for the unknown types (in a process called unification).
A constraint is just a claim that two types are equal:
Constraintc::=Typeτ::=∣∣τ=τnumbool…
The following are some example of rules with the format type checking rules => type inference rules. I’ll leave other rules as exercises for readers.
Basic Types
Constants, variables and lambdas add nothing to the constraints:
∅⊢n:num⇒∅⊢n:num⊣∅,(T-Num)Γ⊢x:τx:τ∈Γ⇒Γ⊢x:τ⊣∅x:τ∈Γ(T-Var)Γ⊢(λ(x:τarg).e):τarg→τretΓ,x:τarg⊢e:τret⇒Γ⊢(λx.e):τ→τret⊣CfreshτΓ,x:τ⊢e:τret⊣C,(T-Lam)
where freshτ means τ is a fresh new type variable that haven’t been used. 1
Binary operations add a constraint that the both sides of the operation and the result should be num.
⇒Γ⊢eL⊕eR:numΓ⊢eL:numΓ⊢eR:numΓ⊢eL⊕eR:num⊣CL,CR,τL=num,τR=numΓ⊢eL:τL⊣CLΓ⊢eR:τR⊣CR(T-Binom)
Fixpoints requires that the type of x,e and the entire expression are the same:
Γ⊢fix(x:τ).e:τΓ,x:τ⊢e:τ⇒Γ⊢fixx.e:τ′⊣C,τ′=τfreshτ′Γ,x:τ′⊢e:τ⊣C,(T-Fix)
If-statements add two constraints: the condition must be bool and both arm must have the same type.
⇒Γ⊢ifecondthenethenelseeelse:τΓ⊢econd:boolΓ⊢ethen:τΓ⊢eelse:τΓ⊢ifecondthenethenelseeelse:τ2⊣C1,C2,C3,τ1=bool,τ2=τ3Γ⊢econd:τ1⊣C1Γ⊢ethen:τ2⊣C2Γ⊢eelse:τ3⊣C3(T-If)
For function applications, we define a new type τ for the entire expression and add a constraint that the function must have type τ2→τ:
⇒Γ⊢(elamearg):τΓ⊢elam:τ2→τΓ⊢earg:τ2Γ⊢(elamearg):τ⊣C1,C2,τ1=τ2→τfreshτΓ⊢elam:τ1⊣C1Γ⊢earg:τ2⊣C2(T-App)
De Bruijn Indices
These rules work well until you decided to use the same variable in different expressions:
letf=funx->funy->((funx->x==y)2)||xinftrue1
If we use a set to represent the context Γ, then the type of the inner x (which is num) will overwrite the outer x (which is bool). This isn’t a problem when we can check the user-annotated type without moving out of the scope, but now we have to infer the type of the inner and outer x after visiting the whole expression.
If you have done the interpreter assignment, you should know that the solution is to convert the expression to nameless form using de Bruijn indices:
letf=fun_->fun_->((fun_-><0>==<1>)2)||<0>inftrue1
We also need to change the rules to represent the context as a stack. The type of the variable with index n is the n-th last element of the stack:
Γ,τ⊢⟨0⟩:τ⊣∅(T-Var-Term),Γ⊢⟨n+1⟩:τ′⊣∅Γ=τ∗,ττ∗⊢⟨n⟩:τ′⊣C(T-Var-Next),Γ⊢(λ_.e):τ→τret⊣CfreshτΓ,τ⊢e:τret⊣C(T-Lam),Γ⊢fix_.e:τ′⊣C,τ′=τfreshτ′Γ,τ′⊢e:τ⊣C,(T-Fix)
ADT
Product type constructors are similar to binary operations:
⇒Γ⊢(eL,eR):τL×τRΓ⊢eL:τLΓ⊢eR:τRΓ⊢(eL,eR):τL×τR⊣CL,CRΓ⊢eL:τL⊣CLΓ⊢eR:τR⊣CR,
We can use the same trick as function applications in product type destructors:
Γ⊢e.L:τLΓ⊢e:τL×τR⇒Γ⊢e.L:τL⊣C,τ=τL×τRfreshτL,τRΓ⊢e:τ⊣C
The type annotation of sum types can also be omitted. The type will be known the time user uses it:
Γ⊢inje=LasτL+τR:τL+τRΓ⊢e:τL⇒Γ⊢inje=L:τL+τR⊣CfreshτRΓ⊢e:τL⊣C,(T-Inject-L)⇒Γ⊢casee{L(xL)→eL∣R(xR)→eR}:τΓ⊢e:τL+τRΓ,xL:τL⊢eL:τΓ,xR:τR⊢eR:τΓ⊢casee{L(_)→eL∣R(_)→eR}:τL′⊣Csum,CL,CR,τsum=τL+τR,τL′=τR′freshτL,τR,Γ⊢e:τsum⊣CsumΓ,τL⊢eL:τL′⊣CLΓ,τR⊢eR:τR′⊣CR.(T-Case)
How to Use the Rules
We should know how these rules are used before learning polymorphism.
Collecting Constraints and Type
Remember how we used the type checking rule: for an expression e, first we find a rule whose “then” part matches e, check whether the conditions in its “if” part are satisfied (which are also type checks), then we know that Γ⊢e:τ.
fntype_check_expr(ast:&Expr,ctx:HashMap<Variable,Type>)->Result<Type,String>{matchast{Expr::Num(_)=>Ok(Type::Num),Expr::Addop{binop,left,right}=>{lettau_left=type_check_expr(left,ctx.clone())?;lettau_right=type_check_expr(right,ctx.clone())?;match(tau_left.clone(),tau_right.clone()){(Type::Num,Type::Num)=>Ok(Type::Num),_=>type_mismatch!(tau_left,tau_right,binop),}}// and other rules}}
The first part of type inference works essentially the same, except that we don’t really check the type, only maintain the set of constraints and a type:
pubfnget_constraints(&self,ctx:&mutVec<Type>)->Result<(Type,Vec<Constraint>),String>{let(tau_result,c_result)=matchself{Expr::Num(_)=>Ok((Type::Num,vec![])),Expr::Addop{left,right,..}|Expr::Mulop{left,right,..}=>{let(tau_left,c_left)=left.get_constraints(ctx)?;let(tau_right,c_right)=right.get_constraints(ctx)?;letconstraints=flat!(vec![c_left,c_right,vec![Constraint{type_l:tau_left,type_r:Type::Num,},Constraint{type_l:tau_right,type_r:Type::Num,},]]);Ok((Type::Num,constraints))}// and other rules}}
Unification
Now we got a set of constraints that looks like C={τ1=num,τ2×τ1=(num→bool)×τ3,…} and a type that probabily has some unknown variables, like ∀α.τ1→(α×bool). The next step is to unify these constraints to know what the types τ1,τ2,… are.
The unification algorithm is straightforward since the constraint and types has a simple, tree-like structure. Unsurprisingly, a union-find set is involved in the algorithm:
For every “forall” types ∀α.τ in C, introduce a fresh variable τ′ and remove the quantifier so it becomes [α→τ′]τ.
Initialize the union-find set S with every variable in C.
Initialize a map T={}. T stores the type of variables we have already known. Its keys are the roots in S and its values are types.
While C=∅:
Pop the first constraint τ1=τ2 from C.
If both τ1 and τ2 are the same atom types like bool and num, then continue.
If τ1 or τ2 are variables:
If the variable is not the root in S, find and use its root in S.
If both τ1 and τ2 are variables, we union these two variables in S. If (τ1,τ1′)∈T or (τ2,τ2′)∈T, we change the key of the items and add τ1′=τ2′ to C (if both items exists in T).
If τ1 is variable and τ2 does not contain τ1, we add (τ1,τ2) to the map T.
If τ2 is variable and τ1 does not contain τ2, we add (τ2,τ1) to the map T.
If τ1 is τ11→τ12 and τ2 is τ21→τ22, then add τ11=τ21 and τ12=τ22 to C.
Similar to 7, if τ1 and τ2 falls in the same category of the type’s BNF, then we decompose them into smaller types and add constraints to C.
Otherwise, the expression doesn’t type check.
Get the Type
You don’t need to know the type of the expression in general: if the unification succeeded, then the program is type checked. In some cases (like the polymorphism below), we still need to get the type.
We can’t directly use the type τ generated in the first step because it may contain free type variables (the variable without a quantifier) and whether these variables can be reduced is not known. For example, every fun x -> x in the expressions below generates type a -> a, but their types are different:
(funx->x)1(* int -> int *)funx->x(* forall a . a -> a *)
The following algorithm removes every free variables in τ:
While there is free type variable:
Get the first free variable x.
If S doesn’t contain x, then τ←∀x.τ and continue. Otherwise, let r be the root of x.
If r=x, then τ←[x→r]τ and continue.
If T doesn’t contain r, then τ←∀x.τ and continue. Otherwise, Let τr be the value of r.
τ←[x→τr]τ.
Return τ.
Polymorphism
In previous versions of Lam, the user had to explicitly say that there is polymorphism using the Λα.e syntax, and annotate the type of α. It’s impossible to keep this syntax: we won’t know what the type indicated by the α in Λα.e means if the user doesn’t use α in their program.
In OCaml, polymorphism is obtained by let expression: when binding a name to an expression with forall type, the variable should be able to instantiate to different types in different contexts. For example, the id in the following expression will have type bool -> bool in (id false) and num -> num in (id 1):
letid=funx->xinif(idfalse)then(id1)else1
As a result, let is no longer a syntax sugar of function application (when inferencing types. You can still treat it as function application when evaluating it). If you “desugar” the expression above, you will get num = bool and the unification will fail.
The inferencing rule of let is a bit involved:
Γ⊢let_=evarinein:τin⊣Cin,CvarΓ⊢evar:τvar⊣CvarΓ,generalize(τvar,Cvar)⊢ein:τin⊣Cin,(T-Let)
where generalize(τ,C) unifies C and remove free variables in τ using the algorithm in the previous section.
The instantiation type annotations and the T-Poly-App rule are replaced by instantiating arguments in forall types as fresh variables:
Γ,∀α1…∀αn.τ⊢⟨0⟩:[α1→τ1′,…αn→τn′]τ⊣∅freshτ1′,…,τn′.(T-Poly-Inst)
For example, let’s get the type of <0> (fun _ -> <0>) in the program
The context is ∀α.∀β.(α→β)→α→β (which is the type of fun _ -> fun _ -> <1> <0>) and the expression is <0> (fun _ -> <0>). We can use the rules to get the type τ and constraints C:
∀α.∀β.(α→β)→α→β⊢(⟨0⟩(λ_→⟨0⟩)):τ⊣(α′→β′)→α′→β′=(γ→γ)→τfreshτ∀α.∀β.(α→β)→α→β⊢⟨0⟩:(α′→β′)→α′→β′⊣∅freshα′,freshβ′(T-Poly-Inst)∅⊢(λ_.⟨0⟩):γ→γ⊣∅freshγγ⊢⟨0⟩:γ⊣∅(T-Var-Term)(T-Lam)(T-App)
We can get {α=β=γ,τ=α→α} from the constraint (α′→β′)→α′→β′=(γ→γ)→τ (try to simulate the unification algorithm yourself by hand!), so generalize(τ,{(α′→β′)→α′→β′=(γ→γ)→τ})=∀τ′.τ′→τ′.
Let’s go through the whole process of another example (in de Bruijn form):
let_=fun_-><0>inif(<0>false)then(<0>1)else1
We know that fun _ -> <0> has type τ→τ where τ is a free variable:
∅⊢let_=λ_.⟨0⟩inein∅⊢(λ_.⟨0⟩):τ→τ⊣∅freshττ⊢⟨0⟩:τ⊣∅(T-Var-Term)(T-Lam)(T-Let)
so we generalize it to ∀τ.τ→τ and add it to the context when working on the if expression. Here’s the inference process for the predicate (<0> false):
∀τ.τ→τ⊢(⟨0⟩false):τ1⊣τ1′→τ1′=bool→τ1freshτ1∀τ.τ→τ⊢⟨0⟩:τ1′→τ1′⊣∅freshτ1′(T-Poly-Inst)∀τ.τ→τ⊢false:bool⊣∅(T-Bool)(T-App)
The constraint and the type of (<0> 1) can be got using a similar process. Let’s complete the inference:
∅⊢let_=λ_.⟨0⟩inif(⟨0⟩false)then(⟨0⟩1)else1:τ2⊣τ1′→τ1′=bool→τ1,τ2′→τ2′=num→τ2,τ1=bool,τ2=num∀τ.τ→τ⊢if(⟨0⟩false)then(⟨0⟩1)else1:τ2⊣τ1′→τ1′=bool→τ1,τ2′→τ2′=num→τ2,τ1=bool,τ2=num∀τ.τ→τ⊢(⟨0⟩false):τ1⊣τ1′→τ1′=bool→τ1…∀τ.τ→τ⊢(⟨0⟩1):τ2⊣τ2′→τ2′=num→τ2…∀τ.τ→τ⊢1:num⊣∅(T-If)(T-Let)
Therefore the entire expression has type τ2. The unification algorithm (again, try to simulate it by hand!) will generate a map T where the root of τ2 corresponds to num.
Progress: if ∅⊢e:τ then either eval or there exists e′ such that e↦e′.
Preservation: if ∅⊢e:τ and e↦e′, then ∅⊢e′:τ.
By saying Γ⊢e:τ, we mean that Γ⊢e:τ′⊣C, and τ=generalize(τ′,C).
I’ll try to proof the theorems in later posts. This means that I didn’t have the proof yet, and everything presented in this blog post may be wrong. But I’m already exhausted after spending every weekend of the past month trying to make an interpreter that works 2.
If you are looking at the markdown of this post, you may have notice the difference in the LaTeX code’s style. That’s because I found a LaTeX OCR website. The code is partly generated by OCRing a screenshot of the assignment’s webpage. ↩
I haven’t implemented every feature in Lam for the same reason. Also, recursive types and modules aren’t interesting from type inference perspective. ↩