CS 541 Lecture -*-Outline-*-

* The core language type system for a functional language (like SML)

        see also section 7.4 of Watt's Concepts and Paradigms

        Overview: type inference is a lot like type checking:
                do the same work, but just accept the answer
                instead of checking it against a declaration

        Important notations coming here...

** Type checking

------------------------------------------
   THE TYPE CHECKING PROBLEM & NOTATION

                     E : int
[fun] ----------------------------------
       fun f (x:int):int = E : int -> int
------------------------------------------

        This rule means: to show that the thing under the line checks,
        have to prove that the thing above the line checks.

*** environments (Watt's PL Syntax and Semantics, section 3.3.1)

        Q: what if E involves x?  What if E involves f?
           What if E involves y?
           How to keep track of this information to use?

        introduce the notion of a type environment:
                a map from names to types
                that embodies the assumptions you can make.

------------------------------------------
              ENVIRONMENTS

H \in TypeEnv = Identifier -> Type

empty-type-env : TypeEnv
bind : Identifier * Type -> TypeEnv
dom : TypeEnv -> PowerSet(Identifier)
overlay : TypeEnv * TypeEnv -> TypeEnv
find : TypeEnv * Identifier -> Type
[ |-> ] : TypeEnv * Identifier * Type
              -> TypeEnv

empty-type-env = { }
bind(I,T) = { I |-> T }
find(H, I) = H(I)
dom(empty-type-env) = { }
dom(bind(I,T)) = {I}
dom(overlay(H1,H2)) = dom(H1) union dom(H2)
find(overlay(H1, H2), I) =
      if I in dom(H1) then H1(I) else H2(I)
[I |-> T]H = overlay(bind(I,T), H)
------------------------------------------

        Here T stands for a Type, I for an Identifier

        Think of a function as a set of bindings.

        Q: so what is find(overlay(bind(i,int), bind(i,real)), i)?

        Q: so how do we formalize the notation for the rule above?

        Don't want to define rules for fun, val rec, and fn.
        So use the syntactic sugars we discussed
        to simplify the problem to the following.
        Think of infix operators as sugar for function calls too.
        We'll ignore overloading.
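        The environment operations above can be sketched directly as
        dictionary operations. This is only an illustration: the Python
        names (bind, overlay, find, dom, extend) mirror the slide, types are
        kept as plain strings, and extend is a made-up name for the
        [I |-> T]H notation.

```python
from typing import Dict, Set

# A type environment maps identifiers to types (types are plain strings here).
TypeEnv = Dict[str, str]

empty_type_env: TypeEnv = {}

def bind(i: str, t: str) -> TypeEnv:
    """bind(I,T) = { I |-> T }"""
    return {i: t}

def dom(h: TypeEnv) -> Set[str]:
    """The set of identifiers bound in h."""
    return set(h)

def overlay(h1: TypeEnv, h2: TypeEnv) -> TypeEnv:
    """Bindings of h1 hide bindings of h2 for the same identifier."""
    return {**h2, **h1}

def find(h: TypeEnv, i: str) -> str:
    """find(H, I) = H(I); raises KeyError if I is not in dom(H)."""
    return h[i]

def extend(h: TypeEnv, i: str, t: str) -> TypeEnv:
    """[I |-> T]H = overlay(bind(I,T), H)  (extend is a hypothetical name)."""
    return overlay(bind(i, t), h)

# The question from the notes: the left operand of overlay wins.
answer = find(overlay(bind("i", "int"), bind("i", "real")), "i")
print(answer)
```

        Since i is in dom(bind(i,int)), the definition of find on an
        overlay picks the left binding, so the lookup yields int.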
*** type checking rules

        Note, we already don't declare the type of the function's body...

------------------------------------------
              TYPE CHECKING

[var]  H |- I: T             where


[app] ----------------------
       H |- E1(E2): T


[if] -------------------------------
      H |- (if E1 then E2 else E3): T


[fn] ---------------------------    where H'
      H |- (fn I:T1 => E): T1->T2

       H |- D ==> H',  H'' |- E2 :       where H'' =
[let] ---------------------
       H |- (let D in E2): T


[val] -----------------------       where H' =
       H |- val I:T = E ==> H'         bind(I,T)

                                    where H' =
[rec] -----------------------          bind(I,T)
       H |- rec I:T = E1 ==> H'
------------------------------------------

        (note: In ML, rec is really "val rec", but I'm saving space here...)

        Key idea: expressions are checked from the bottom up
                (on desugared syntax tree)
                (data driven recursion = structural induction)

        Note how the rule [app] is like modus ponens
                (Curry-Howard isomorphism)

        Also note the idea of "mini-environments" in the val and rec rules.

        Q: What about simultaneous binding? mutual recursion (and)?

        Q: what should be the rule for pairs? For formals that are pairs?


        [pair] ---------------------
                H |- (E1,E2): ______


        [fnp] ---------------------
               H |- (fn (x,y):_____ => E) : ____ -> T2

        Q: do we need a rule for applications where the arguments are pairs?

        Q: do we really need the [pair] rule?
                no, can think of it as sugar for a function application
                But the pattern matching syntax is important for getting
                bindings to all the parts of the formal.
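        The "checked from the bottom up" idea can be sketched as a
        recursive function over the desugared syntax tree. This is one
        plausible reading of the [var], [app], [if], [fn] and [let]/[val]
        rules, not the filled-in slide: the tuple encoding of expressions,
        the names check, arrow and CheckError, and the error messages are
        all assumptions made for the illustration. Pairs and rec are left out.

```python
class CheckError(Exception):
    """Raised when an expression does not type check (hypothetical name)."""

def arrow(t1, t2):
    """The function type T1 -> T2, encoded as a tuple."""
    return ("->", t1, t2)

def check(h, e):
    """Return the type of expression e under environment h (a dict)."""
    tag = e[0]
    if tag == "var":                      # ("var", I)
        name = e[1]
        if name not in h:
            raise CheckError(f"unbound identifier {name}")
        return h[name]
    if tag == "app":                      # ("app", E1, E2)
        t_fun = check(h, e[1])
        t_arg = check(h, e[2])
        if not (isinstance(t_fun, tuple) and t_fun[0] == "->"):
            raise CheckError("operator is not a function")
        if t_fun[1] != t_arg:
            raise CheckError("operator and operand don't agree")
        return t_fun[2]
    if tag == "if":                       # ("if", E1, E2, E3)
        if check(h, e[1]) != "bool":
            raise CheckError("test of if is not bool")
        t_then = check(h, e[2])
        if check(h, e[3]) != t_then:
            raise CheckError("branches of if disagree")
        return t_then
    if tag == "fn":                       # ("fn", I, T1, E) is fn I:T1 => E
        _, name, t1, body = e
        return arrow(t1, check({**h, name: t1}, body))
    if tag == "let":                      # ("let", I, T, E1, E2) is
        _, name, t, e1, e2 = e            #   let val I:T = E1 in E2
        if check(h, e1) != t:
            raise CheckError(f"declared type of {name} does not match")
        return check({**h, name: t}, e2)
    raise CheckError(f"unknown expression form {tag}")

H = {"succ": arrow("int", "int"), "x": "int", "b": "bool"}
t_app = check(H, ("app", ("var", "succ"), ("var", "x")))
t_fn = check(H, ("fn", "y", "int", ("if", ("var", "b"), ("var", "y"), ("var", "x"))))
t_let = check(H, ("let", "z", "int", ("var", "x"),
                  ("app", ("var", "succ"), ("var", "z"))))
```

        Each branch first checks the subexpressions (the hypotheses above
        the line) and only then builds the type of the whole expression
        (the conclusion below the line).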
        Generalization of the [fnp] rule for datatypes (omit)

           H |- E1 : datatype c2 of T2 | c3 of T3,
           H2 |- E2:T,      where H2 = overlay(bind(x,T2),H)
           H3 |- E3:T       and H3 = overlay(bind(y,T3),H)
    [case] ---------------------------------------
           H |- (case E1 of c2(x) => E2 | c3(y) => E3): T

*** proofs

------------------------------------------
         PROOFS OF TYPE CHECKING

                      [var] Hfxy |- x: 'a,  Hfxy |- y :'b
                     [pair] ------------------------------
 [var] Hfxy |- f :'a*'b->'c,  Hfxy |- (x,y) : 'a*'b
 [app] ----------------------------
        Hfxy |- f(x,y) : 'c
  [fn] ----------------------------
        Hxy |- (fn f:'a*'b->'c => f(x,y))
               : ('a*'b->'c) -> 'c
 [fnp] ----------------------------
        ETE |- (fn (x:'a,y:'b) =>
                  (fn f:'a*'b->'c => f(x,y)))
               : 'a*'b -> (('a*'b->'c) -> 'c)

where ETE = empty-type-env
and Hxy = {x |-> 'a, y |-> 'b}
and Hfxy = {f |-> 'a*'b->'c, x |-> 'a, y |-> 'b}
------------------------------------------

        Q: can you prove H |- plus(x,3):int
           where H = {3 |-> int, plus |-> int*int -> int, x |-> int} ?

** Reconstructing (Inferring) type declarations for monomorphic expressions

        Basic ideas due to Hindley and Milner.
        Made possible by unification

        See also the Cardelli paper in the references
        (Basic Polymorphic Typechecking)

*** example

------------------------------------------
       TYPE RECONSTRUCTION EXAMPLE

                      [var] Hfxy |- x:     ,  Hfxy |- y :
                     [pair] ------------------------------
 [var] Hfxy |- f :          ,  Hfxy |- (x,y) :
 [app] ----------------------------
        Hfxy |- f(x,y) :
  [fn] ----------------------------
        Hxy |- (fn f => f(x,y)) :
 [fnp] ----------------------------
        ETE |- (fn (x,y) => (fn f => f(x,y))) :

where ETE = empty-type-env
and Hxy = {x |-> 'a, y |-> 'b}
and Hfxy = {f |->            , x |-> 'a, y |-> 'b}

CONSTRAINTS:

------------------------------------------

        Write down constraints on types as equations

*** inference rules without considering polymorphism

------------------------------------------
  TYPE RECONSTRUCTION (INFERENCE) PART 1

[var]  H |- I: T         where I \in dom(H),
                               find(H,I) = T

       H |- E1: S -> T,  H |- E2: S
[app] ---------------------------
       H |- E1(E2): T

      H |- E1:bool, H |- E2:T, H |- E3:T
[if] ----------------------------------
      H |- (if E1 then E2 else E3): T


[fn] ------------------------       where H' =
      H |- (fn I => E): T1->T2

       H |- D ==> H',  H'' |- E : T    where H'' =
[let] --------------------                overlay(H',H)
       H |- (let D in E): T
------------------------------------------

        fill in the hypothesis for [fn] ...
                H' |- E:T2,  where H' = [I |-> T1]H

*** a problem, when to decide something is polymorphic.

------------------------------------------
     WHEN IS A FUNCTION POLYMORPHIC?

Is this a type error?

        (fn f => g(f(2), f(true)))

What if f is:

        val id = (fn x => x)

        val succ = (fn x => x + 1)
------------------------------------------

        In ML, the solution is that after you bind with val (or val rec),
        mark type variables as generic (as in id)

        thus within a function declaration, the argument type is non-generic
        but after the declaration it becomes generic

------------------------------------------
         DON'T ALLOW CAPTURING

let val g =
      (fn x =>
         let val f = (fn y => x)
         in if f(3)
            then f(true)
            else x + 5
         end)
in ...
end
------------------------------------------

        the problem with the above would be if we concluded that f had type
                all 'a . all 'b . ('a -> 'b)
        instead of
                all 'a . ('a -> int)

        so can't mark a type variable as generic if it's being used
        in the current type environment;
        logically, don't want the all quantifier to capture 'b.

        formally:
          a type variable occurring in the type of an expression E
          is generic for a given scope if it does not occur in the
          type of a lambda-variable declaration that encloses the given scope

        notation: in the type
                        all 'a . 'a -> 'b
                'a is a generic type variable
                The notation "all T . ..." is read "for all types T, ..."

        non-generic type variables:
                type variables appearing in type of a lambda-bound identifier
                -shared among all occurrences of lambda-bound id (e.g., f)
                -prevent heterogeneous use of lambda-bound identifiers

        generic type variables:
                when used, replace by a normal type variable

                in practice, each instance of
                        all 'a . 'a -> 'b
                is of the form
                        'c -> 'b
                where 'c is a fresh (not used elsewhere) non-generic type var

------------------------------------------
         GENERIC TYPE VARIABLES

in the type:  all 'a . 'a -> 'b
   'a is generic, and 'b is not

   [var] H |- id: all 'a . 'a -> 'a
 [gElim] --------------------------------
         H |- id: 'b -> 'b,  H |- id: 'c -> 'c,
         H |- 3: int,  H |- true : bool
   [app] ------------------------------------
         H |- id(3):int,  H |- id(true): bool
  [pair] -----------------------------------
         H |- (id(3), id(true)) : int * bool

where H = {id |-> (all 'a . 'a -> 'a),
           3 |-> int, true |-> bool}

CONSTRAINTS: 'b = int
             'c = bool
------------------------------------------

------------------------------------------
    TYPE RECONSTRUCTION (INFERENCE)
        GENERICS and BINDINGS

          H |- E: T                 where H' =
[val] -------------------             bind(I,gen(T,H))
       H |- (val I = E) ==> H'

          H'' |- E1: T              where H' =
[rec] -------------------             bind(I,gen(T,H))
       H |- rec I = E1 ==> H'

 where gen(T, H) = all V1 . ... . all Vk . T
     if V1,...,Vk are all the free type
     variables in T that do not occur free
     in range(H).

          H |- I : all V . T
[gElim] ------------------          where T' =
          H |- I : T'                  [T''/V]T
------------------------------------------

        The notation [T''/V]T means substitution of T'' for free
        occurrences of V in T.  Here T'' is arbitrary.

        In [rec], H'' = [I |-> T]H: inside E1 the name I has the
        monomorphic type T (this is how map is treated in the example).

*** rules

------------------------------------------
                EXAMPLE

G |- rec map = (fn f => (fn ls =>
                  if (null ls) then []
                  else (f (hd ls))::(map f (tl ls))))
     ==>

where G is
 { (op ::) |-> all 'a . 'a * 'a list -> 'a list,
   [] |-> all 'a . 'a list,
   null |-> all 'a . 'a list -> bool,
   hd |-> all 'a . 'a list -> 'a,
   tl |-> all 'a . 'a list -> 'a list}
------------------------------------------

        [lemma1] G' |- (null ls): bool,
        [lemma2] G' |- []: 'h list ,
        [lemma3] G' |- (f (hd ls))::(map f (tl ls)) : 'h list
  [if] _______________________________________________________
        G' |- if (null ls) then []
              else (f (hd ls))::(map f (tl ls))
              : 'h list
  [fn] _______________________________________________________
        G[map |-> 'b][f |-> 'c] |-
              (fn ls => if (null ls) then []
                        else (f (hd ls))::(map f (tl ls)))
              : 'g list -> 'h list
  [fn] _______________________________________________________
        G[map |-> 'b] |-
              (fn f => (fn ls => if (null ls) then []
                        else (f (hd ls))::(map f (tl ls))))
              : ('g -> 'h) -> 'g list -> 'h list
 [rec] _______________________________________________________
        G |- rec map = (fn f => (fn ls => if (null ls) then []
                        else (f (hd ls))::(map f (tl ls))))
              ==> bind(map, all 'g . all 'h .
                            ('g -> 'h) -> 'g list -> 'h list)

        where G' = G[map |-> 'b][f |-> 'c][ls |-> 'd]
        constraints: 'b = ('g -> 'h) -> ('g list -> 'h list),
                     'c = 'g -> 'h,
                     'd = 'g list

        Lemma 1: if 'd = 'g list, then G' |- (null ls): bool
        Proof:
            [var] G' |- null : all 'a. 'a list -> bool
          [gElim] __________________________
                  G' |- null : 'g list -> bool,   [var] G' |- ls : 'd
            [app] ___________________________________________________
                  G' |- (null ls): bool

          constraints: 'd = 'g list
        QED

        Have them do lemma 2 (lemma 3 takes too long).
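        The gen(T, H) operation and the [gElim] rule can be sketched as
        follows. The encoding is an assumption made for the illustration:
        type variables are strings starting with a quote, a constructed
        type is a tuple such as ("->", t1, t2), and a type "all V1 . ... .
        all Vk . T" is the pair (set of Vs, T). The fresh-name scheme 't0,
        't1, ... is also made up and assumes no program variable uses
        those names.

```python
import itertools

def is_var(t):
    """Type variables are strings such as "'a"."""
    return isinstance(t, str) and t.startswith("'")

def free_vars(t):
    """Free type variables of an unquantified type."""
    if is_var(t):
        return {t}
    if isinstance(t, tuple):
        out = set()
        for arg in t[1:]:          # t[0] is the constructor name
            out |= free_vars(arg)
        return out
    return set()

def gen(t, env):
    """Quantify the variables of t that are not free in range(env)."""
    env_vars = set()
    for quantified, body in env.values():
        env_vars |= free_vars(body) - set(quantified)
    return (frozenset(free_vars(t) - env_vars), t)

def substitute(mapping, t):
    """Replace variables of t according to mapping."""
    if is_var(t):
        return mapping.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(mapping, a) for a in t[1:])
    return t

_fresh = itertools.count()

def instantiate(scheme):
    """[gElim]: replace each generic variable by a fresh non-generic one."""
    quantified, body = scheme
    mapping = {v: f"'t{next(_fresh)}" for v in quantified}
    return substitute(mapping, body)

# f = (fn y => x) checked where x is lambda-bound with type 'b:
env = {"x": (frozenset(), "'b")}
f_scheme = gen(("->", "'a", "'b"), env)    # only 'a becomes generic

# id = (fn x => x) bound at top level: 'a is generic, each use is fresh.
id_scheme = gen(("->", "'a", "'a"), {})
use1 = instantiate(id_scheme)
use2 = instantiate(id_scheme)
```

        Because 'b occurs in the type of the lambda-bound x, gen leaves it
        unquantified, so every instance of f's type still shares 'b. The
        two uses of id get distinct fresh variables, which is what lets
        id(3) and id(true) both check.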
        Key ideas:
                think of the checks to be made as either succeeding,
                        or constraining the result (by unification)
                accumulate a set of constraints
                types that are unconstrained at the end turn into
                        universally quantified (all 'a . TE('a)) types.
                names bound with let (val or rec) are polymorphic,
                        but monomorphic instances are used in the program.
                (see section 7.3 of Watt's Concepts and Paradigms book)

        Formalization: system of typings and equations,
                solve equations for unknowns

** Limitations

*** No recursive types:

        type 'a stream = unit -> ('a * 'a stream)

        no way to use this in the algorithm

        solution: use abstraction to break the recursion
                (think of above equation as an isomorphism)
                up, down are provided implicitly by the datatype constructor
                        (up when applied to values,
                         down when used in pattern matching)

*** Polymorphic functions are not first-class objects!

        type variables in a function parameter are not generic

------------------------------------------
  NO POLYMORPHIC FUNCTIONS AS ARGUMENTS

- fun F f = fn (a,b) => (f(a), f(b));
val F = fn : ('a -> 'b) -> 'a * 'a -> 'b * 'b

- val W = (fn x => (x x));
std_in:1.18-1.20 Error: operator is not a function
  operator: 'Z
  in expression:
    (x x)
------------------------------------------

        All arguments to functions in ML have one type
        (which is the correct instance if possible).

        so no self-application, no Y combinator

        another example of this...

        - fun S f g x = ((f x) (g x));
        val S = fn : ('a -> 'b -> 'c) -> ('a -> 'b) -> 'a -> 'c
        - fun I x = x;
        val I = fn : 'a -> 'a
        - S I I;
        std_in:8.1-8.5 Error: operator and operand don't agree (circularity)
          operator domain: ('Z -> 'Y) -> 'Z
          operand:         ('Z -> 'Y) -> 'Z -> 'Y
          in expression:
            S I I
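        Solving the accumulated equations by unification, and the
        circularity reported above, can be sketched as follows. This is a
        minimal textbook-style unifier with an occurs check, not the
        algorithm of any particular ML compiler. The encoding (type
        variables as quoted strings, constructed types as tuples) and the
        names unify, resolve, occurs, apply_subst and UnifyError are
        assumptions made for the illustration.

```python
class UnifyError(Exception):
    """Raised when two types cannot be made equal (hypothetical name)."""

def is_var(t):
    """Type variables are strings such as "'Z"."""
    return isinstance(t, str) and t.startswith("'")

def resolve(t, subst):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    """Does variable v occur in t (under subst)?"""
    t = resolve(t, subst)
    if t == v:
        return True
    if isinstance(t, tuple):
        return any(occurs(v, a, subst) for a in t[1:])
    return False

def unify(t1, t2, subst):
    """Return an extension of subst that solves the equation t1 = t2."""
    t1, t2 = resolve(t1, subst), resolve(t2, subst)
    if t1 == t2:
        return subst
    if is_var(t1):
        if occurs(t1, t2, subst):
            raise UnifyError("circularity")
        return {**subst, t1: t2}
    if is_var(t2):
        return unify(t2, t1, subst)
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and t1[0] == t2[0] and len(t1) == len(t2)):
        for a, b in zip(t1[1:], t2[1:]):
            subst = unify(a, b, subst)
        return subst
    raise UnifyError(f"cannot unify {t1} and {t2}")

def apply_subst(subst, t):
    """Replace solved variables in t by their solutions."""
    t = resolve(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply_subst(subst, a) for a in t[1:])
    return t

# Lemma 1: null : 'g list -> bool applied to ls : 'd, with result type 'r.
s = unify(("->", ("list", "'g"), "bool"), ("->", "'d", "'r"), {})
ls_type = apply_subst(s, "'d")       # the constraint 'd = 'g list
result_type = apply_subst(s, "'r")

# Self-application (x x) needs 'Z = 'Z -> 'Y, which has no finite solution.
try:
    unify("'Z", ("->", "'Z", "'Y"), {})
    self_application_ok = True
except UnifyError:
    self_application_ok = False
```

        The same equation 'Z = 'Z -> 'Y arises when the operator domain
        and operand of S I I shown above are unified, which is why that
        error is labelled a circularity.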