COP 5021 Lecture -*- Outline -*-

* Correctness of the Live Variables Analysis (2.2.2)

An example of how to prove the correctness of an analysis
with respect to a semantics

** failed attempt

*** characterizing the dataflow equations as generators

------------------------------------------
        EQUATION SYSTEM

LV^=(S*) defined by:

  LVexit(l) = if l \in final(S*)
              then {}
              else \bigcup { LVentry(l') | (l', l) \in flow^R(S*) }

  LVentry(l) = (LVexit(l) \ killLV(B^l)) \cup genLV(B^l)
              where B^l \in blocks(S*)

Functional form, for a given S*:

  F^S*_LV: ({entry,exit} -> Lab* -> P(Var*))
           -> ({entry,exit} -> Lab* -> P(Var*))

  F^S*_LV(F)(exit)(l) = if l \in final(S*)
                        then {}
                        else \bigcup { F(entry)(l') | (l', l) \in flow^R(S*) }

  F^S*_LV(F)(entry)(l) = (F(exit)(l) \ killLV(B^l)) \cup genLV(B^l)
                        where B^l \in blocks(S*)

Solutions:

  live: {entry,exit} -> Lab* -> P(Var*)

def: live solves LV^=(S*), written live |= LV^=(S*),
     iff live is a fixpoint of F^S*_LV.
------------------------------------------

        The book uses F^S_LV instead of F^S*_LV, and
        doesn't require live to be a function of entry and exit.

Q: Why use functions instead of tuples for solutions?

        less messy notation, don't have to assume labels ordered

------------------------------------------
        EXAMPLE

Q: What is F^S*_LV for:

   [x := 3]^1; [y := x+2]^2; [y := y+1]^3
------------------------------------------

        ... calculate the following

   F^S*_LV(F)(exit)(3)  = {}
   F^S*_LV(F)(entry)(3) = (F(exit)(3) \ {y}) \cup {y}
   F^S*_LV(F)(exit)(2)  = F(entry)(3)
   F^S*_LV(F)(entry)(2) = (F(exit)(2) \ {y}) \cup {x}
   F^S*_LV(F)(exit)(1)  = F(entry)(2)
   F^S*_LV(F)(entry)(1) = (F(exit)(1) \ {x}) \cup {}

Q: What do we want to prove for correctness?

        Have to compare the static analysis vs. "the truth" of the
        operational semantics.

*** Specification of Live Variables Analysis

------------------------------------------
   DEFINING LIVE VARIABLES SEMANTICALLY

What should a solution to LV mean?
   At a given point, only the live variables
   can affect the computation

def: States s1 and s2 are
     *similar with respect to a set of variables V*,
     written s1 ~_V s2,
     iff (\forall x \in V :: s1(x) = s2(x)).

def: functional liv: {entry,exit} -> Lab -> State
     *solves a set LV^=(S)*
     iff for all d in {entry,exit}, l in Lab:
         if V = liv(d)(l),
         then liv(d)(l) satisfies LV^=(S)(d)(l)

Conjecture: Let S be label consistent.
  Suppose live |= LV^=(S)
  and s1 ~_{live(entry)(init(S))} s2.  Then:
  (1) if (S, s1) --> (S', s1'),
      then there is some s2' such that
      (S, s2) --> (S', s2')
      and s1' ~_{live(entry)(init(S'))} s2'.
  (2) if (S, s1) --> s1'
      then there is some s2' such that
      (S, s2) --> s2'
      and s1' ~_{live(exit)(init(S))} s2'.
------------------------------------------

        The key requirement is to have another (clear) definition of "live"

Q: What happens to ~_V as V shrinks?

        it relates more and more states.

        Antimonotonicity Lemma: Suppose V1 \supseteq V2.
        Then ~_V1 \subseteq ~_V2.
        That is, s1 ~_V1 s2 ==> s1 ~_V2 s2.

        The antimonotonicity lemma is essentially similar to Lemma 2.20.

**** failed proof

Q: How would we prove the conjecture?

        Try by induction on the inference used to establish the semantics,
        see why this fails.

   Proof attempt: by induction on the inferences used to establish
   (S, s1) --> (S', s1') or (S, s1) --> s1'.

   Let live |= LV^=(S) and let s1 and s2 be such that
       s1 ~_{live(entry)(init(S))} s2                     (1)

   Base cases (can skip):
   Suppose the rule applied was [asgn].  Then S = [x:=a]^l,
   and by [asgn]
       ([x:=a]^l, s1) --> s1[x|->A[[a]]s1]
       ([x:=a]^l, s2) --> s2[x|->A[[a]]s2]
   So a choice for s2' is s2[x|->A[[a]]s2].
   Let's see if that is related as desired by calculation.
         s1 ~_{live(entry)(init(S))} s2
       = s1 ~_{live(entry)(l)} s2
       = s1 ~_{(live(exit)(l) \ {x}) \cup FV(a)} s2
     ==> s1[x|->A[[a]]s1] ~_{(live(exit)(l) \ {x}) \cup FV(a)} s2[x|->A[[a]]s2]
     ==> s1[x|->A[[a]]s1] ~_{live(exit)(l) \ {x}} s2[x|->A[[a]]s2]
       = s1[x|->A[[a]]s1] ~_{live(exit)(l)} s2[x|->A[[a]]s2]
       = s1[x|->A[[a]]s1] ~_{live(exit)(init(S))} s2[x|->A[[a]]s2]
   (The step from live(exit)(l) \ {x} to live(exit)(l) holds because
   s1 and s2 agree on FV(a), so A[[a]]s1 = A[[a]]s2 and both updated
   states give x the same value.)
   (end of [asgn] case)

   Suppose the rule applied was [skip].  Then S = [skip]^l,
   and by [skip]
       ([skip]^l, s1) --> s1
       ([skip]^l, s2) --> s2
   So a choice for s2' is s2, which works because:
         s1 ~_{live(entry)(init(S))} s2
       = s1 ~_{live(entry)(l)} s2
       = s1 ~_{live(exit)(l)} s2
       = s1 ~_{live(exit)(init(S))} s2
   (end of skip case)

   The inductive hypothesis is that for all substatements Si,
   if live |= LV^=(Si) and s1 ~_{live(entry)(init(Si))} s2,
   then (1) and (2) follow with Si for S.

   Inductive cases:
   Suppose the rule applied was [seq1].  Then S = S1; S2
   and by [seq1]

          (S1, s1) --> (S1', s1')
       ____________________________
       (S1;S2, s1) --> (S1';S2, s1')

   We need to exercise the inductive hypothesis, so we must show that
   live |= LV^=(S1;S2) implies live |= LV^=(S1)
   (so we can find an s2'...).
   But this isn't true.

   Counterexample: Let S1;S2 be [x := 3]^1; [y := x]^2
   Then if live |= LV^=(S1;S2), it must have live(exit)(1) = {x}.
   But live does not model LV^=(S1), since live(exit)(1) = {x}
   and by the equations for the LV analysis,
   LVexit(1) = {} because 1 \in final(S1).
   So the problem is that considering S1 by itself
   leads to the boundary condition, which isn't satisfied
   by the solution, live.
   (end of counterexample)
   (end of proof attempt)

** fix using constraint system

   idea: requiring equality of the solution for individual statements
   is too restrictive, as seen in the counterexample above.
   Better would be to require that the solution for individual
   statements be contained in the solution for the whole program.
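
   The counterexample can also be checked mechanically.  The following
   is a minimal Python sketch, not taken from the lecture or the book:
   the encoding of a program as (blocks, flow, final) and the names
   F_LV, solves_eq, and solves_sub are assumptions made for illustration.
   It applies the functional form once and compares the result with
   live, using either equality or containment.

```python
# Hypothetical encoding (an assumption of this sketch):
#   blocks: label -> (kill set, gen set)
#   flow:   set of (l, l') edges
#   final:  set of final labels

def F_LV(blocks, flow, final, live):
    """Apply the functional form once to a candidate solution live."""
    out = {"entry": {}, "exit": {}}
    for l, (kill, gen) in blocks.items():
        if l in final:
            out["exit"][l] = frozenset()
        else:
            # (l', l) in flow^R(S) iff (l, l') in flow(S)
            succs = [l2 for (l1, l2) in flow if l1 == l]
            out["exit"][l] = frozenset().union(*(live["entry"][s] for s in succs))
        out["entry"][l] = (live["exit"][l] - kill) | gen
    return out

def solves_eq(blocks, flow, final, live):
    """live = F(live) on the labels of this program (equation system)."""
    new = F_LV(blocks, flow, final, live)
    return all(live[d][l] == new[d][l] for d in new for l in blocks)

def solves_sub(blocks, flow, final, live):
    """live contains F(live) on the labels of this program (containment)."""
    new = F_LV(blocks, flow, final, live)
    return all(live[d][l] >= new[d][l] for d in new for l in blocks)

fs = frozenset
# S1;S2 = [x := 3]^1; [y := x]^2
blocks = {1: (fs({"x"}), fs()), 2: (fs({"y"}), fs({"x"}))}
flow, final = {(1, 2)}, {2}
live = {"entry": {1: fs(), 2: fs({"x"})},
        "exit":  {1: fs({"x"}), 2: fs()}}

# S1 = [x := 3]^1 considered by itself: label 1 is now final.
blocks_S1 = {1: blocks[1]}

print(solves_eq(blocks, flow, final, live))      # whole program, equations
print(solves_eq(blocks_S1, set(), {1}, live))    # S1 alone, equations
print(solves_sub(blocks_S1, set(), {1}, live))   # S1 alone, containment
```

   Under this encoding the same live passes the equality check for the
   whole program, fails it for S1 alone (because live(exit)(1) = {x}
   but the boundary condition demands {}), and passes the containment
   check for S1 alone.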
*** Constraint characterization of live variables

------------------------------------------
        CONSTRAINT SYSTEM

LV^{\subseteq}(S*) defined by:

  LVexit(l) \supseteq if l \in final(S*)
                      then {}
                      else \bigcup { LVentry(l') | (l', l) \in flow^R(S*) }

  LVentry(l) \supseteq (LVexit(l) \ killLV(B^l)) \cup genLV(B^l)
                      where B^l \in blocks(S*)

Functional form, for a given S*, as above

  F^S*_LV: ({entry,exit} -> Lab* -> P(Var*))
           -> ({entry,exit} -> Lab* -> P(Var*))

  F^S*_LV(F)(exit)(l) = if l \in final(S*)
                        then {}
                        else \bigcup { F(entry)(l') | (l', l) \in flow^R(S*) }

  F^S*_LV(F)(entry)(l) = (F(exit)(l) \ killLV(B^l)) \cup genLV(B^l)
                        where B^l \in blocks(S*)
------------------------------------------

*** strengthening the induction

------------------------------------------
Solutions:

  live: {entry,exit} -> Lab* -> P(Var*)

def: live solves LV^{\subseteq}(S*),
     written live |= LV^{\subseteq}(S*),
     iff live \sqsupseteq F^S*_LV(live).

def: lv1 \sqsupseteq lv2
     iff (\forall e \in {entry,exit} ::
            (\forall l \in Lab* :: lv1(e)(l) \supseteq lv2(e)(l)))

Lemma 2.15: If S* is label consistent, then
   live |= LV^=(S*) implies live |= LV^{\subseteq}(S*).
   Furthermore, the least solutions coincide.
------------------------------------------

   Proof of 2.15: F^S*_LV is such that
       live |= LV^{\subseteq}(S*) iff live \sqsupseteq F^S*_LV(live)
   and
       live |= LV^=(S*) iff live = F^S*_LV(live).
   The implication is then immediate, since live = F^S*_LV(live)
   gives live \sqsupseteq F^S*_LV(live).
   Now F^S*_LV is monotonic.  So by Tarski's fixed point theorem (A.10),
   F^S*_LV has a least fixed point, lfp(F^S*_LV), such that
       lfp(F^S*_LV) = \bigsqcap {live | live \sqsupseteq F^S*_LV(live)}
                    = \bigsqcap {live | live = F^S*_LV(live)}
   and lfp(F^S*_LV) is a solution.
   (end of proof)

Q: What lemma would give us the property where we got stuck?

------------------------------------------
   SOLUTIONS WORK FOR SUBSTATEMENTS

Lemma 2.16: Suppose S1 is label consistent,
   and live |= LV^{\subseteq}(S1).
   If flow(S1) \supseteq flow(S2)
   and blocks(S1) \supseteq blocks(S2),
   then S2 is label consistent
   and live |= LV^{\subseteq}(S2).
------------------------------------------

Q: Why is this true?

        By definition of live (the functional form),
        since the flow graph of S2 is a subset of that of S1

------------------------------------------
        SOLUTIONS PRESERVED

Corollary 2.17: Suppose S is label consistent,
   and live |= LV^{\subseteq}(S).
   If (S,s) --> (S',s'),
   then live |= LV^{\subseteq}(S').
------------------------------------------

Q: Why does that follow?

        using facts about --> (lemma 2.14), then applying Lemma 2.16

------------------------------------------
     SOLUTION CAN ONLY SHRINK FORWARD

Lemma 2.18: Suppose S is label consistent,
   and live |= LV^{\subseteq}(S).
   Then for all (l,l') \in flow(S),
   live(exit)(l) \supseteq live(entry)(l').

   Proof: by construction of LV^{\subseteq}(S).
------------------------------------------

*** A proper proof

------------------------------------------
        GETTING CORRECTNESS RIGHT

Theorem 2.21: Let S be label consistent.
  Suppose live |= LV^{\subseteq}(S)
  and s1 ~_{live(entry)(init(S))} s2.  Then:
  (1) if (S, s1) --> (S', s1'),
      then there is some s2' such that
      (S, s2) --> (S', s2')
      and s1' ~_{live(entry)(init(S'))} s2'.
  (2) if (S, s1) --> s1'
      then there is some s2' such that
      (S, s2) --> s2'
      and s1' ~_{live(exit)(init(S))} s2'.
------------------------------------------

   Proof: by induction on the inferences used to establish
   (S, s1) --> (S', s1') or (S, s1) --> s1'.

   Let live |= LV^{\subseteq}(S) and let s1 and s2 be such that
       s1 ~_{live(entry)(init(S))} s2                     (1)

   Base cases (can skip):
   Suppose the rule applied was [asgn].  Then S = [x:=a]^l,
   and by [asgn]
       ([x:=a]^l, s1) --> s1[x|->A[[a]]s1]
       ([x:=a]^l, s2) --> s2[x|->A[[a]]s2]
   So a choice for s2' is s2[x|->A[[a]]s2].
   Let's see if that is related as desired by calculation.
         s1 ~_{live(entry)(init(S))} s2
       = s1 ~_{live(entry)(l)} s2
     ==> s1 ~_{(live(exit)(l) \ {x}) \cup FV(a)} s2
     ==> s1[x|->A[[a]]s1] ~_{(live(exit)(l) \ {x}) \cup FV(a)} s2[x|->A[[a]]s2]
     ==> s1[x|->A[[a]]s1] ~_{live(exit)(l) \ {x}} s2[x|->A[[a]]s2]
       = s1[x|->A[[a]]s1] ~_{live(exit)(l)} s2[x|->A[[a]]s2]
       = s1[x|->A[[a]]s1] ~_{live(exit)(init(S))} s2[x|->A[[a]]s2]
   (The step from live(exit)(l) \ {x} to live(exit)(l) holds because
   s1 and s2 agree on FV(a), so A[[a]]s1 = A[[a]]s2 and both updated
   states give x the same value.)
   (end of [asgn] case)

   Suppose the rule applied was [skip].  Then S = [skip]^l,
   and by [skip]
       ([skip]^l, s1) --> s1
       ([skip]^l, s2) --> s2
   So a choice for s2' is s2, which works because:
         s1 ~_{live(entry)(init(S))} s2
       = s1 ~_{live(entry)(l)} s2
     ==> s1 ~_{live(exit)(l)} s2
       = s1 ~_{live(exit)(init(S))} s2
   (end of skip case)

   The inductive hypothesis is that for all sub-statements Si,
   if live |= LV^{\subseteq}(Si) and s1 ~_{live(entry)(init(Si))} s2,
   then (1) and (2) follow with Si for S.

   Inductive cases:
   Suppose the rule applied was [seq1].  Then S = S1; S2
   and by [seq1]

          (S1, s1) --> (S1', s1')
       ____________________________
       (S1;S2, s1) --> (S1';S2, s1')

   By lemma 2.16, live |= LV^{\subseteq}(S1), so
   (in the key difference from the LV^{=} case)
   by the inductive hypothesis, there is some s2' such that
   (S1, s2) --> (S1', s2') and
   s1' ~_{live(entry)(init(S1'))} s2'.
   So, by the [seq1] rule:

          (S1, s2) --> (S1', s2')
       ____________________________
       (S1;S2, s2) --> (S1';S2, s2')

   So it remains to prove that
       s1' ~_{live(entry)(init(S1';S2))} s2',
   but this follows immediately from the definition of init,
   since init(S1';S2) = init(S1').
   (end of [seq1] case)

   See the book for the other cases,
   of which [seq2] also relies on lemma 2.16.
   (end of proof)

Q: What does this theorem tell us about execution sequences?

        By induction on length, they also preserve solutions.
        See Corollary 2.22.
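
   To see the theorem at work along an execution sequence, here is a
   small Python sketch.  It is only an illustration under stated
   assumptions: it handles straight-line assignment programs only, the
   names similar and run_in_lockstep are hypothetical, and the chosen
   live is one solution of the constraint system for the example
   program used earlier (with live(exit)(3) = {y} rather than the least
   choice {}).  It runs two states that differ on dead variables in
   lockstep and records whether they stay similar at each block exit.

```python
def similar(s1, s2, V):
    """s1 ~_V s2: the two states agree on every variable in V."""
    return all(s1[x] == s2[x] for x in V)

# [x := 3]^1; [y := x+2]^2; [y := y+1]^3, each block encoded as
# (label, assigned variable, right-hand side as a function of the state).
program = [
    (1, "x", lambda s: 3),
    (2, "y", lambda s: s["x"] + 2),
    (3, "y", lambda s: s["y"] + 1),
]

# A solution of the constraint system for this program (not the least one).
live = {
    "entry": {1: set(), 2: {"x"}, 3: {"y"}},
    "exit":  {1: {"x"}, 2: {"y"}, 3: {"y"}},
}

def run_in_lockstep(program, live, s1, s2):
    """Execute both states block by block; record similarity at each exit."""
    s1, s2 = dict(s1), dict(s2)
    trace = []
    for label, var, rhs in program:
        # Precondition of the theorem at this block.
        assert similar(s1, s2, live["entry"][label])
        s1[var] = rhs(s1)   # right-hand side is evaluated in the old state
        s2[var] = rhs(s2)
        trace.append(similar(s1, s2, live["exit"][label]))
    return s1, s2, trace

# The start states differ everywhere, which is fine: live(entry)(1) = {}.
end1, end2, trace = run_in_lockstep(
    program, live,
    {"x": 0, "y": 0, "z": 1},
    {"x": 7, "y": 9, "z": 2},
)
print(trace, end1, end2)
```

   In this run every entry in trace is True and the final states agree
   on y, although they still differ on z, which is never live.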