I. A View of Type Checking as Abstract Interpretation ------------------------------------------ SOURCE Patrick Cousot. Types as abstract interpretations. In POPL'97, pp. 316-331, 1997. https://doi.org/10.1145/263699.263744 ------------------------------------------ A. Introduction ------------------------------------------ INTRODUCTION Idea of abstract interpretation: Program semantics, Program proof methods, Static analysis techniques, all have a common structure that Slogan: ------------------------------------------ B. Example: Call-by-Value Lambda Calculus 1. Denotational Semantics What does a denotational semantics look like? Why not use operational semantics? a. abstract syntax ------------------------------------------ ABSTRACT SYNTAX OF LAMBDA CALCULUS x, f, ... \in X "program variables" e \in E "expressions" e ::= x | fn x ==> e | e1(e2) | rec f . fn x ==> e | 1 | e1 - e2 | (e1 ? e2 : e3) ------------------------------------------ b. semantic domains ------------------------------------------ NOTATION FOR SEMANTIC DOMAINS \bot denotes non-termination (looping) wrong denotes a type error D_{\bot} is the domain D \cup {\bot} with up: D -> D_{\bot} defined by up(d) = d down: D_{bot} -> D defined by down(up(d)) = d down(\bot) is undefined \Omega = up(wrong) + forms a "Coalesced sum of domains" (. :: D1) : D1 -> D1 + D2 (. :: D2) : D2 -> D1 + D2 [. -> .] forms "continuous" \bot-strict and wrong-strict functions So if f \in [D1 -> D2], then f(\bot) = \bot and f(wrong) = wrong ------------------------------------------ What does strictness mean? Is it possible to tell if a computation produces an infinite loop? ------------------------------------------ SEMANTIC DOMAINS Following Guenter and Scott: wrong \in W "type errors" z \in Z "integers" u, f \in U =iso= "values" W_{\bot} + Z_{\bot} + [U -> U]_{\bot} r \in |R = X -> U "environments" \phi \in S = |R -> U "semantic domain" ------------------------------------------ c. denotational semantics with call-by-value, runtime type checking ------------------------------------------ NOTATION FOR DENOTATIONAL SEMANTICS r[x <- u](y) = if y == x then u else r(y) lfp : (L -> L) -> (L -> L) is least fixed point operator on domain L ------------------------------------------ ------------------------------------------ (DENOTATIONAL) SEMANTIC FUNCTIONS S: E -> S S[[x]](r) = r(x) S[[fn x ==> e]](r) = S[[e1(e2)]](r) = let f = S[[e1]](r) in if f = \bot or f = \Omega then f else (down(f))(S[[e2]](r)) S[[rec f . fn x ==> e]](r) = S[[1]](r) = up(1)::Z_{\bot} S[[e1 - e2]](r) = let z1 = S[[e1]](r) in let z2 = S[[e2]](r) in if z1 = \bot or z2 = \bot then \bot else if z1 = \Omega or z2 = \Omega then \Omega else up(down(z1) - down(z2)) :: Z_{\bot} S[[e1 ? e2 : e3]](r) = let u1 = S[[e1]](r) in if u1 = \bot then \bot else if u1 = \Omega or down(u1) \not\in Z then \Omega else if down(u1) = 0 then S[[e2]](r) else S[[e3]](r) ------------------------------------------ If you were implementing this, how would you implement tests such as e1 = \bot, found in the semantics of e1 - e2 ? 2. Collecting Semantics ------------------------------------------ ELEMENTS OF ABSTRACT INTERPRETATION An abstract interpretation has: - a collecting semantics, used for - an approximation, used for - relations between these, used for The analysis is an approximation of the collecting semantics ------------------------------------------ ------------------------------------------ SEMANTIC PROPERTIES def: A *semantic property* is a set of semantic values P \in Prop = PowerSet(S) It is a complete lattice (partially ordered by \subseteq) ------------------------------------------ ------------------------------------------ COLLECTING SEMANTICS FOR THE EXAMPLE Standard collecting semantics: C : E -> Prop C[[e]] = {S[[e]]} ------------------------------------------ C. Church/Curry Monotype Semantics 1. Type Expressions and Semantic Domains ------------------------------------------ TYPE EXPRESSIONS AND SEMANTIC DOMAINS m \in M "monotype" H \in H "type environment" = X ->_{fin} M \theta \in I = H x M "typing" T \in PT = PowerSet(I) "program type" m ::= int | m1 -> m2 ------------------------------------------ What's the opposite of a monotype? Does this syntax allow types like (int->int) -> (int->int)? What's not allowed? 2. Meanings of Types, Concretization ------------------------------------------ MEANINGS OF TYPES concretization function for monotypes: g1: M ->_{fig(n} PowerSet(U) g1(int) = {up(z)::Z_{\bot} | z \in Z} \cup {\bot} g1(m1 -> m2) = {up(\phi)::[U->U]_{\bot} | \phi \in [U -> U], (\forall u \in g1(m1) :: \phi(u) \in g1(m2))} \cup {\bot} for type environments: g2: H -> PowerSet(R) g2(H) = {r \in |R | (\forall x \in X :: r(x) \in g1(H(x)))} for typings: g3 : I -> Prop g3((H,m)) = {\phi \in S | (\forall r \in g2(H) :: \phi(r) \in g1(m))} for program types: g : PT -> Prop g(T) = \bigcap_{\theta \in T} g3(\theta) g(\emptyset) = S ------------------------------------------ 3. Galois connection ------------------------------------------ DEFINITION OF GALOIS CONNECTION Def: Suppose that L is partially ordered by <=, and M is partially ordered by \sqsubseteq, a : L -> M, and g : M -> L, then (a,g) is a *Galois connection* iff for all l \in L, m \in M :: a(l) \sqsubseteq m iff l <= g(m) Corollary (p. 232 in the textbook): Suppose that (a,g) is a Galois connection between (L,<=) and (M,\sqsubseteq), then for all l \in L, m \in M:: l <= g(a(l)) (4.8) and a(g(m)) \sqsubseteq m (4.9) Corollary: If (a,g) is a Galois connection, then a preserves existing lubs and g preserves existing glbs ------------------------------------------ What's an "existing lub"? ------------------------------------------ GALOIS CONNECTION PROPERTIES Lemma: g(\bigsqcup_{i \in \Delta} T_i) = \bigsqcap_{i \in \Delta} g(T_i)). Corollary. If g is such that g(\bigsqcup_{i \in \Delta} T_i) = \bigsqcap_{i \in \Delta} g(T_i)), then there is a unique (upper) adjoint a : Prop -> PT such that (a, g) is a Galois connection. ------------------------------------------ ------------------------------------------ CONSEQUENCES If P is a property (set of values), then a(P) is the most precise type of programs with property P Implications of program properties is abstracted by superset inclusion: E.g., let oddInt = {1, 3, 5, ...} int = {1,2,3,4, ...} then oddInt \subseteq int and g(int) \supseteq g(oddInt) ------------------------------------------ 4. Monotype semantics ------------------------------------------ MONOTYPE SEMANTICS T : E -> PT T[[x]] = {(H,H(x)) | H \in |H} T[[fn x ==> e]] = T[[e1(e2)]] = {(H, m2) | (H, m1->m2) \in T[[e1]], (H, m1) \in T[[e2]]} T[[rec f . fn x ==> e]] = T[[1]] = {(H,int) | H \in |H } T[[e1 - e2]] = {(H,int) | (H,int) \in T[[e1]] \cap T[[e2]]} T[[(e1 ? e2 : e3)]] = {(H,m) | (H,int) \in T[[e1]], (H,m) \in T[[e2]] \cap T[[e3]]} ------------------------------------------ What do the typings for e1-e2 and (e1?e2:e3) mean? ------------------------------------------ SOUNDNESS def: The type semantics T[[.]] is *sound* iff for all programs e \in E :: S[[e]] \in g(T[[e]]) Corollary: (With the conventions above) S[[e]] \in g(T[[e]]) iff a(C[[e]]) \supseteq T[[e]] iff C[[e]] \subseteq g(T[[e]]). Corollary (soundness): Let e \in E be a program, and let H be a type environment. Then typeable programs cannot go wrong in the sense that: (H,m) \in T[[e]] and r \in g2(H) and S[[e]](r) \neq \bot implies S[[e]](r) \neq \Omega ------------------------------------------ Where are the runtime type errors in the statement of soundness? How could this theorem be proved using the Galois connection? D. Example 2: Polymorphism 1. Polytype Syntax ------------------------------------------ POLYTYPE SYNTAX m \in M "monotype" m ::= int | m1 -> m2 p \in |P = PowerSet(M) "polytype" H \in H "type environment" = X ->_{fin} P t \in I = H x M "typing" T \in PT = PowerSet(I) "program type" ------------------------------------------ 2. Meanings of Types, Concretization ------------------------------------------ MEANINGS OF TYPES concretization function for monotypes: g1: M -> PowerSet(U) g1(int) = {up(z)::Z_{\bot} | z \in Z} \cup {\bot} g1(m1 -> m2) = {up(\phi)::[U->U]_{\bot} | \phi \in [U -> U], (\forall u \in g1(m1) :: \phi(u) \in g1(m2))} \cup {\bot} for polytypes: g2: P -> PowerSet(U) g2(p) = \bigcap_{m \in p} g1(m) g2(\emptyset) = U for type environments: g3: H -> PowerSet(R) g3(H) = {r \in R | (\forall x \in X :: r(x) \in g2(H(x)))} for typings: g4 : I -> Prop g4((H,m)) = {\phi \in S | (\forall r \in g3(H) :: \phi(r) \in g1(m))} for program types: g : PT -> Prop g(T) = \bigcap {g4(\theta) | theta \in T} g(\emptyset) = S ------------------------------------------ Do we get a Galois connection? 3. Polymorphic Typing ------------------------------------------ POLYMORPHIC TYPING T : E -> PT T[[x]] = {(H,m) | m \in H(x)} T[[fn x ==> e]] = {(H, m1->m2) | (H[x<-m1],m2) \in T[[e]]} T[[e1(e2)]] = {(H,m2) | (H,m1->m2) \in T[[e1]], (H,m1) \in T[[e2]]} T[[let x = e1 in e2]] = {(H,m2) | (e\xists p1 \neq {} : m1 \in p1 : (H,m1) \in T[[e1]] and (H[x<-p1],m2)\in T[[e2]])} T[[rec f . fn x ==> e]] = T[[1]] = {(H,int) | H \in |H } T[[e1 - e2]] = {(H,int) | (H,int) \in T[[e1]] \cap T[[e2]]} T[[(e1 ? e2 : e3)]] = {(H,m) | (H,int) \in T[[e1]], (H,m) \in T[[e2]] \cap T[[e3]]} ------------------------------------------ Why require that p \neq {} in the typing of let? Is this type system sound? E. Abstract Semantics and Interpreters 1. compositional abstract semantics ------------------------------------------ ABSTRACT SEMANTICS Def: an *abstract domain* is specified by a poset, (T#,<=#), and an abstract semantic function, T# : E -> T#. Def: An abstract semantics (T#,<=#) and T# is *compositional* iff for all e \in E : e = [x1,...,xn](e1,...,en) where the xi \in X are locally bound, and the ei \in E are subexpression : T#[[e]] = \Psi#_e(T#[[e1]],...T#[[en]]) so that T# is defined compositionally based on the primitives \Psi#_e. Def: A compositional abstract semantics is monotone when all primitives are monotone, i.e., (T1,...,Tn) <=#(T1',...,Tn') ==> (\Psi#_e(T1,...,Tn) <=# \Psi#_e(T1',...,Tn')) Def: A monotone compositional abstract semantics is an *abstract interpreter* when the abstract domain (T#,<=#) is representable (in a computer). ------------------------------------------ 2. abstraction and soundness ------------------------------------------ ABSTRACTION Def: An abstract semantics, with abstract domain (T#,<=#) and abstract semantic function T# is an *abstraction* of a (concrete) semantics (Tb,<=b) (with semantic function Tb) iff there is a concretization map g : T# -> Tb that is monotone and such that for all e \in E :: Tb[[e]] <=b g(T#[[e]]). Def: An abstract semantics is *sound* if it is an abstraction of the collecting semantics. ------------------------------------------ What else can we say about g? What needs to be checked about an abstraction if the semantics is compositional and monotone? Is an abstraction of a sound abstraction also sound? Do Galois connections compose? 3. design of abstract semantics ------------------------------------------ DESIGN OF AN ABSTRACT INTERPRETER A. Find a Galois connection (a,g) from the collecting semantics (Tb,<=b) to a suitable property space (T#,<=#) B. Find a set of primitives, Pb and P# that are all monotone. C. Check these make a monotone compositional abstraction: forall e \in E :: a(\Psib_e(Tb[[e1]],...,Tb[[en]])) <=# \Psi#_e(a(Tb[[e1]]), ...,a(Tb[[en]])) D. For each e \in E, calculate \Psi#_e, starting from \Psi#_e(a(Tb[[e1]]), ...,a(Tb[[en]])) (approximating when necessary) ------------------------------------------ Why do we want to calculate the \Psi#_e?