Com S 641 Lecture -*- Outline -*-

* Theoretical Properties (2.2)

   This section uses structural operational semantics to show that
   the live variables analysis is correct.

   Q: Why are formal correctness proofs useful for program analysis?
      a way to find subtle errors and correct them

** kinds of semantics

------------------------------------------
          KINDS OF SEMANTICS

Axiomatic Semantics
   as declarative specifications
   specifications + proof techniques

Denotational Semantics
   as functional programming
   domains + meaning functions

Operational Semantics
   as logic programming
   configurations + rewrite rules
------------------------------------------

*** denotational example

------------------------------------------
   DENOTATIONAL SEMANTICS OF EXPRESSIONS
           FOR WHILE LANGUAGE

Syntax (1.2):
   x,y \in Var
   n \in NumericLiteral
   a \in AExp
   b \in BExp
   op_a \in Op_a
   op_b \in Op_b
   op_r \in Op_r

   a ::= x | n | a1 op_a a2
   b ::= true | false | not b | b1 op_b b2 | a1 op_r a2

Domains:
   Z = { ..., -1, 0, 1, ... }
   T = { true, false }
   s \in State = Var -> Z

Meaning Functions:
   A : AExp -> State -> Z
   A[[x]]s = s(x)
   A[[n]]s = N[[n]]
   A[[a1 op_a a2]]s = (O_a[[op_a]]) (A[[a1]]s, A[[a2]]s)

 where
   N : NumericLiteral -> Z
   N[[n]] = the value of literal n

   O_a : Op_a -> (Z x Z -> Z)
   O_a[[+]] = \(n1, n2) . n1 + n2
   ...
------------------------------------------

   E.g., in C or C++, N[[015]] = 13, since 015 is in octal.

   Q: Why is the state a parameter to the meaning of arithmetic expressions?
      it's needed to evaluate variable references

   Q: How would you define B : BExp -> (State -> T)?

** structural operational semantics (2.2.1)

*** varieties

------------------------------------------
   VARIATIONS ON STRUCTURAL
     OPERATIONAL SEMANTICS

Big step = map from initial to final state

Small step = map from a state to the next state
------------------------------------------

*** terminal transition system, little step, computation semantics

   terminal transition system (Plotkin), little step (C. Gunter), computation (M.
   Hennessy)
   essentially using rewriting as a universal machine

**** idea

   To give the semantics of a programming language, use two
   auxiliary functions: input and output

------------------------------------------
   COMPUTATION (LITTLE STEP) SEMANTICS

                 Meaning
      Programs <-------> Answers
          |                 ^
    input |                 | output
          |                 |
          v   reducesto*    |
        Gamma <-----------> T

def: a in Meaning[[P]] iff

------------------------------------------

   the Meaning relation is often partial, and is defined by going
   around the diagram

   ... there is a g in T such that
       input[[P]] reducesto* g and output(g) = a.

   Meaning[[P]] is a function when reducesto* is Church-Rosser (confluent)

**** terminal transition systems

   this is the guts of the system, defining the abstract machine
   by defining --> (reducesto)

------------------------------------------
   TERMINAL TRANSITION SYSTEM (TTS)

      (Gamma, -->, Terminal)

Gamma :

--> :

Terminal :

reducesto* reflexive, transitive closure
------------------------------------------

   ... a set of configurations (e.g., g)
       i.e., configurations of the abstract machine
   ... a binary relation on Gamma
       --> is sometimes written as ==> or reducesto
   ... subset of terminal configurations (Plotkin called it T),
       must be such that if g in Terminal,
       then there is no g' such that (g --> g')

*** semantics of the while language (2.2.1)

------------------------------------------
     CONFIGURATIONS AND TRANSITIONS

Configurations (Gamma):
   Gamma = (Stmt x State) + State
   s in State = Var -> Z

Terminal Configurations:
   Terminal = State
------------------------------------------

   Q: What does a configuration of the form (S, s) mean?
      statement S is executing and state s is the current state

   Q: What does a configuration of the form s mean?
      state s is a terminal (final) state

   Q: Must every execution reach a final state?
      not in the while language...

   Note that we're ignoring errors like division by zero...

   Q: What part of the state does an expression depend on?
      only the values of its free variables

   Lemma: Let a \in AExp be given.
      If (\forall x \in FV(a) :: s1(x) = s2(x))
      then A[[a]]s1 = A[[a]]s2.

   The proof is by structural induction.

------------------------------------------
        TRANSITIONS (Table 2.6)

[ass]   ([x := a]^l, s) --> s[x |-> A[[a]]s]

[skip]  ([skip]^l, s) --> s

            (S1, s) --> (S1', s')
[seq1]  ----------------------------
        (S1;S2, s) --> (S1';S2, s')

            (S1, s) --> s'
[seq2]  --------------------------
        (S1;S2, s) --> (S2, s')

[if1]   (if [b]^l then S1 else S2, s) --> (S1, s)
            if B[[b]]s = true

[if2]   (if [b]^l then S1 else S2, s) --> (S2, s)
            if B[[b]]s = false

[wh1]   (while [b]^l do S, s)
          --> (S; while [b]^l do S, s)
            if B[[b]]s = true

[wh2]   (while [b]^l do S, s) --> s
            if B[[b]]s = false
------------------------------------------

   Q: What do the sequence rules do?

   Q: Do the rules allow evaluation of the true or false part
      of an if-statement before evaluating the condition?
      no

------------------------------------------
               EXAMPLE

Let sqrxy be the state in which
q has value q, ..., y has value y.
E.g., s7062(q) = 7, s0062(x) = 6.

<[q := 0]^1; [r := x]^2;
 while [r >= y]^3 do
    ([r := r-y]^4; [q := q+1]^5), s7062>
--> {by [seq2] and [ass]}
<[r := x]^2;
 while [r >= y]^3 do
    ([r := r-y]^4; [q := q+1]^5), s0062>
--> {by [seq2] and [ass]}
<while [r >= y]^3 do
    ([r := r-y]^4; [q := q+1]^5), s0662>
-->

------------------------------------------

   Q: How are the labels used in the semantics?

   Q: What would be a rule for a one-armed if-statement?

   Q: How would you add a rule for for loops?

*** properties of the semantics

   To try to prove something about an analysis based on the semantics
   we have to be able to relate the flow graph and the configurations.

   Q: As the configurations evolve, does the flow graph
      for the statement in the configuration change? If so, how?

------------------------------------------
      PROPERTIES OF THE SEMANTICS

How does the flow graph change
as the configurations change?

Case 1: <S, s> --> <S', s'>
   Compare
      final(S)    vs.   final(S')
      flow(S)           flow(S')
      blocks(S)         blocks(S')

Case 2: <S, s> --> s'
   what can we say about the graph of S?
------------------------------------------

   ... \supseteq
   ... \supseteq
   ... \supseteq (a homework problem)
   ... final(S) = {init(S)} (S is an elementary block)

   This is all in lemma 2.14

** correctness of the live variables analysis (2.2.2)

*** failed attempt

------------------------------------------
      EQUATION SYSTEM LV^=(S*)

defined by:

  LVexit(l) =
      if l \in final(S*) then {}
      else \bigcup { LVentry(l') | (l', l) \in flow^R(S*) }
  LVentry(l) =
      (LVexit(l) \ killLV(B^l)) \cup genLV(B^l)
      where B^l \in blocks(S*)

Functional form, for a given S*:

  F^S*_LV: ({entry,exit} -> Lab* -> P(Var*))
           -> ({entry,exit} -> Lab* -> P(Var*))
  F^S*_LV(F)(exit)(l) =
      if l \in final(S*) then {}
      else \bigcup { F(entry)(l') | (l', l) \in flow^R(S*) }
  F^S*_LV(F)(entry)(l) =
      (F(exit)(l) \ killLV(B^l)) \cup genLV(B^l)
      where B^l \in blocks(S*)

Solutions:

  live: {entry,exit} -> Lab* -> P(Var*)

def: live solves LV^=(S*),
     written live |= LV^=(S*),
     iff live is a fixpoint of F^S*_LV.
------------------------------------------

   The book uses F^S_LV instead of F^S*_LV, and
   doesn't require live to be a function of entry and exit.

   Q: Why use functions instead of tuples for solutions?
      less messy notation, don't have to assume labels are ordered

------------------------------------------
               EXAMPLE

Q: What is F^S*_LV for:

   [x := 3]^1; [y := x+2]^2; [y := y+1]^3

------------------------------------------

   ... calculate the following
      F^S*_LV(F)(exit)(3)  = {}
      F^S*_LV(F)(entry)(3) = (F(exit)(3) \ {y}) \cup {y}
      F^S*_LV(F)(exit)(2)  = F(entry)(3)
      F^S*_LV(F)(entry)(2) = (F(exit)(2) \ {y}) \cup {x}
      F^S*_LV(F)(exit)(1)  = F(entry)(2)
      F^S*_LV(F)(entry)(1) = (F(exit)(1) \ {x}) \cup {}

   Q: What do we want to prove for correctness?
      Have to compare the static analysis vs. "the truth"
      of the operational semantics.

------------------------------------------
   DEFINING LIVE VARIABLES SEMANTICALLY

What should a solution to LV mean?

At a given point, only the live variables matter.
def: States s1 and s2 are similar with
     respect to a set of variables V,
     written s1 ~_V s2,
     iff (\forall x \in V :: s1(x) = s2(x)).

Conjecture: Let S be label consistent.
  Suppose live |= LV^=(S) and
  s1 ~_{live(entry)(init(S))} s2.
  Then:
   (i) if (S, s1) --> (S', s1'),
       then there is some s2' such that
       (S, s2) --> (S', s2') and
       s1' ~_{live(entry)(init(S'))} s2'.
  (ii) if (S, s1) --> s1'
       then there is some s2' such that
       (S, s2) --> s2' and
       s1' ~_{live(exit)(init(S))} s2'.
------------------------------------------

   Q: What happens to ~_V as V shrinks?
      it relates more and more states.

   Antimonotonicity Lemma: Suppose V1 \supseteq V2.
      Then ~_V1 \subseteq ~_V2.
      That is, s1 ~_V1 s2 ==> s1 ~_V2 s2.

   The antimonotonicity lemma is essentially similar to Lemma 2.20.

   Q: How would we prove this conjecture?
      Try induction on the inference used to establish the
      transition, and see why this fails.

   Proof attempt: by induction on the inferences used to establish
   (S, s1) --> (S', s1') or (S, s1) --> s1'.

   Let live |= LV^=(S) and let s1 and s2 be such that

      s1 ~_{live(entry)(init(S))} s2                    (1)

   Base cases (can skip):

   Suppose the rule applied was [ass].
   Then S = [x:=a]^l, and by [ass]

      ([x:=a]^l, s1) --> s1[x|->A[[a]]s1]
      ([x:=a]^l, s2) --> s2[x|->A[[a]]s2]

   So a choice for s2' is s2[x|->A[[a]]s2].
   Let's see if that is related as desired by calculation.

          s1 ~_{live(entry)(init(S))} s2
      =   s1 ~_{live(entry)(l)} s2
      =   s1 ~_{(live(exit)(l) \ {x}) \cup FV(a)} s2
      ==> s1[x|->A[[a]]s1]
             ~_{live(exit)(l) \cup {x} \cup FV(a)}
          s2[x|->A[[a]]s2]
      ==> s1[x|->A[[a]]s1] ~_{live(exit)(l)} s2[x|->A[[a]]s2]
      =   s1[x|->A[[a]]s1] ~_{live(exit)(init(S))} s2[x|->A[[a]]s2]

   (end of [ass] case)

   Suppose the rule applied was [skip].
   Then S = [skip]^l, and by [skip]

      ([skip]^l, s1) --> s1
      ([skip]^l, s2) --> s2

   So a choice for s2' is s2, which works because:

          s1 ~_{live(entry)(init(S))} s2
      =   s1 ~_{live(entry)(l)} s2
      =   s1 ~_{live(exit)(l)} s2
      =   s1 ~_{live(exit)(init(S))} s2

   (end of [skip] case)

   The inductive hypothesis is that for all substatements Si,
   if live |= LV^=(Si) and s1 ~_{live(entry)(init(Si))} s2,
   then (i) and (ii) follow with Si for S.

   Inductive cases:

   Suppose the rule applied was [seq1].
   Then S = S1; S2 and by [seq1]

           (S1, s1) --> (S1', s1')
        ____________________________
        (S1;S2, s1) --> (S1';S2, s1')

   We need to exercise the inductive hypothesis, so we must show that
   live |= LV^=(S1;S2) implies live |= LV^=(S1)
   (so we can find an s2'...).

   But this isn't true.

   Counterexample: Let S1;S2 be [x := 3]^1; [y := x]^2.
      Then if live |= LV^=(S1;S2), it must have live(exit)(1) = {x}.
      But live does not solve LV^=(S1), since live(exit)(1) = {x},
      whereas the equations for S1 require LVexit(1) = {},
      because 1 \in final(S1).
   (end of counterexample)
   (end of proof attempt)

*** fix using constraint system

------------------------------------------
   CONSTRAINT SYSTEM LV^{\subseteq}(S*)

defined by:

  LVexit(l) \supseteq
      if l \in final(S*) then {}
      else \bigcup { LVentry(l') | (l', l) \in flow^R(S*) }
  LVentry(l) \supseteq
      (LVexit(l) \ killLV(B^l)) \cup genLV(B^l)
      where B^l \in blocks(S*)

Functional form, for a given S*, as above:

  F^S*_LV: ({entry,exit} -> Lab* -> P(Var*))
           -> ({entry,exit} -> Lab* -> P(Var*))
  F^S*_LV(F)(exit)(l) =
      if l \in final(S*) then {}
      else \bigcup { F(entry)(l') | (l', l) \in flow^R(S*) }
  F^S*_LV(F)(entry)(l) =
      (F(exit)(l) \ killLV(B^l)) \cup genLV(B^l)
      where B^l \in blocks(S*)

Solutions:

  live: {entry,exit} -> Lab* -> P(Var*)

def: live solves LV^{\subseteq}(S*),
     written live |= LV^{\subseteq}(S*),
     iff live \sqsupseteq F^S*_LV(live).

Lemma 2.15: If S* is label consistent, then
  live |= LV^=(S*) implies live |= LV^{\subseteq}(S*).
  Furthermore, the least solutions coincide.
------------------------------------------

   Q: What lemma would give us the property where we got stuck?

------------------------------------------
   SOLUTIONS WORK FOR SUBSTATEMENTS

Lemma 2.16: Suppose S1 is label consistent,
  and live |= LV^{\subseteq}(S1).
  If flow(S1) \supseteq flow(S2) and
  blocks(S1) \supseteq blocks(S2),
  then S2 is label consistent and
  live |= LV^{\subseteq}(S2).
------------------------------------------

   Q: Why is this true?

------------------------------------------
         SOLUTIONS PRESERVED

Corollary 2.17: Suppose S is label consistent,
  and live |= LV^{\subseteq}(S).
  If (S,s) --> (S',s'),
  then live |= LV^{\subseteq}(S').
------------------------------------------

   Q: Why does that follow?
      using facts about --> (lemma 2.14)

------------------------------------------
   SOLUTION CAN ONLY SHRINK FORWARD

Lemma 2.18: Suppose S is label consistent,
  and live |= LV^{\subseteq}(S).
  Then for all (l,l') \in flow(S),
  live(exit)(l) \supseteq live(entry)(l').

Proof: by construction of LV^{\subseteq}(S).
------------------------------------------

------------------------------------------
      GETTING CORRECTNESS RIGHT

Theorem 2.21: Let S be label consistent.
  Suppose live |= LV^{\subseteq}(S) and
  s1 ~_{live(entry)(init(S))} s2.
  Then:
   (i) if (S, s1) --> (S', s1'),
       then there is some s2' such that
       (S, s2) --> (S', s2') and
       s1' ~_{live(entry)(init(S'))} s2'.
  (ii) if (S, s1) --> s1'
       then there is some s2' such that
       (S, s2) --> s2' and
       s1' ~_{live(exit)(init(S))} s2'.
------------------------------------------

   Proof: by induction on the inferences used to establish
   (S, s1) --> (S', s1') or (S, s1) --> s1'.

   Let live |= LV^{\subseteq}(S) and let s1 and s2 be such that

      s1 ~_{live(entry)(init(S))} s2                    (1)

   Base cases (can skip):

   Suppose the rule applied was [ass].
   Then S = [x:=a]^l, and by [ass]

      ([x:=a]^l, s1) --> s1[x|->A[[a]]s1]
      ([x:=a]^l, s2) --> s2[x|->A[[a]]s2]

   So a choice for s2' is s2[x|->A[[a]]s2].
   Let's see if that is related as desired by calculation.
          s1 ~_{live(entry)(init(S))} s2
      =   s1 ~_{live(entry)(l)} s2
      ==> s1 ~_{(live(exit)(l) \ {x}) \cup FV(a)} s2
      ==> s1[x|->A[[a]]s1]
             ~_{live(exit)(l) \cup {x} \cup FV(a)}
          s2[x|->A[[a]]s2]
      ==> s1[x|->A[[a]]s1] ~_{live(exit)(l)} s2[x|->A[[a]]s2]
      =   s1[x|->A[[a]]s1] ~_{live(exit)(init(S))} s2[x|->A[[a]]s2]

   (end of [ass] case)

   Suppose the rule applied was [skip].
   Then S = [skip]^l, and by [skip]

      ([skip]^l, s1) --> s1
      ([skip]^l, s2) --> s2

   So a choice for s2' is s2, which works because:

          s1 ~_{live(entry)(init(S))} s2
      =   s1 ~_{live(entry)(l)} s2
      ==> s1 ~_{live(exit)(l)} s2
      =   s1 ~_{live(exit)(init(S))} s2

   (end of [skip] case)

   The inductive hypothesis is that for all substatements Si,
   if live |= LV^{\subseteq}(Si) and s1 ~_{live(entry)(init(Si))} s2,
   then (i) and (ii) follow with Si for S.

   Inductive cases:

   Suppose the rule applied was [seq1].
   Then S = S1; S2 and by [seq1]

           (S1, s1) --> (S1', s1')
        ____________________________
        (S1;S2, s1) --> (S1';S2, s1')

   By lemma 2.16, live |= LV^{\subseteq}(S1)
   (this is where the proof differs from the failed attempt),
   so by the inductive hypothesis, there is some s2' such that
   (S1, s2) --> (S1', s2') and s1' ~_{live(entry)(init(S1'))} s2'.
   So, by the [seq1] rule:

           (S1, s2) --> (S1', s2')
        ____________________________
        (S1;S2, s2) --> (S1';S2, s2')

   So it remains to prove that
   s1' ~_{live(entry)(init(S1';S2))} s2',
   but this follows immediately from the definition of init,
   since init(S1';S2) = init(S1').

   (end of [seq1] case)

   See the book for the other cases,
   of which [seq2] also relies on lemma 2.16.

   (end of proof)

   Q: What does this theorem tell us about execution sequences?
      By induction on length, they also preserve solutions.
      See Corollary 2.22.
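
   The theorem and its corollary can be tried out on the division
   example used earlier in these notes. What follows is a minimal
   sketch in Python, not the book's formulation: the tuple encoding of
   statements and the names step, solve_lv, similar, and check_lockstep
   are all assumptions made for illustration. It computes a solution of
   LV^{\subseteq}(S) by iteration, runs two states that agree on
   live(entry)(init(S)) in lockstep, and asserts the similarity claimed
   by (i) and (ii) after every transition. One further assumption: exit
   sets are taken as the union over flow successors even for labels that
   are also in final(S), such as the while test, so that the property of
   Lemma 2.18 holds for this program.

```python
# Encoding (all names are hypothetical, chosen for this sketch):
#   ('ass', l, x, a)   ('skip', l)   ('seq', S1, S2)
#   ('if', l, b, S1, S2)   ('while', l, b, S)
# An expression a or b is a pair (free variables, function from state to value).

def init(S):
    return init(S[1]) if S[0] == 'seq' else S[1]

def final(S):
    if S[0] == 'seq':
        return final(S[2])
    if S[0] == 'if':
        return final(S[3]) | final(S[4])
    return {S[1]}

def blocks(S):
    """Map each label to its block (for tests, the statement holding the test)."""
    if S[0] == 'seq':
        return {**blocks(S[1]), **blocks(S[2])}
    if S[0] == 'if':
        return {S[1]: S, **blocks(S[3]), **blocks(S[4])}
    if S[0] == 'while':
        return {S[1]: S, **blocks(S[3])}
    return {S[1]: S}

def flow(S):
    if S[0] == 'seq':
        return flow(S[1]) | flow(S[2]) | {(l, init(S[2])) for l in final(S[1])}
    if S[0] == 'if':
        return flow(S[3]) | flow(S[4]) | {(S[1], init(S[3])), (S[1], init(S[4]))}
    if S[0] == 'while':
        return flow(S[3]) | {(S[1], init(S[3]))} | {(l, S[1]) for l in final(S[3])}
    return set()

def kill_gen(B):
    if B[0] == 'ass':
        return {B[2]}, set(B[3][0])
    if B[0] == 'skip':
        return set(), set()
    return set(), set(B[2][0])            # test of an if or a while

def solve_lv(S):
    """Iterate upward from empty sets to a solution of the constraint system.
    Assumption of this sketch: exit(l) is the union over the flow successors
    of l, even when l is also a final label."""
    bl, fl = blocks(S), flow(S)
    entry = {l: frozenset() for l in bl}
    exit_ = {l: frozenset() for l in bl}
    changed = True
    while changed:
        changed = False
        for l, B in bl.items():
            kill, gen = kill_gen(B)
            new_exit = frozenset().union(*(entry[l2] for (l1, l2) in fl if l1 == l))
            new_entry = (new_exit - kill) | gen
            if (new_exit, new_entry) != (exit_[l], entry[l]):
                exit_[l], entry[l], changed = new_exit, new_entry, True
    return entry, exit_

def step(S, s):
    """One transition of Table 2.6; returns (S', s'), with S' = None if terminal."""
    if S[0] == 'ass':
        return None, {**s, S[2]: S[3][1](s)}
    if S[0] == 'skip':
        return None, s
    if S[0] == 'seq':
        S1p, sp = step(S[1], s)
        return (S[2] if S1p is None else ('seq', S1p, S[2])), sp
    if S[0] == 'if':
        return (S[3] if S[2][1](s) else S[4]), s
    return (('seq', S[3], S) if S[2][1](s) else None), s   # while

def similar(s1, s2, V):
    return all(s1[x] == s2[x] for x in V)

def check_lockstep(S, s1, s2):
    """Run both states in lockstep, asserting (i) or (ii) after every step."""
    entry, exit_ = solve_lv(S)
    assert similar(s1, s2, entry[init(S)])
    while S is not None:
        l = init(S)
        S1, s1 = step(S, s1)
        S2, s2 = step(S, s2)
        assert S1 == S2                    # both runs take the same transition
        assert similar(s1, s2, exit_[l] if S1 is None else entry[init(S1)])
        S = S1
    return s1, s2

# [q := 0]^1; [r := x]^2; while [r >= y]^3 do ([r := r-y]^4; [q := q+1]^5)
body = ('seq', ('ass', 4, 'r', ({'r', 'y'}, lambda s: s['r'] - s['y'])),
               ('ass', 5, 'q', ({'q'}, lambda s: s['q'] + 1)))
prog = ('seq', ('ass', 1, 'q', (set(), lambda s: 0)),
        ('seq', ('ass', 2, 'r', ({'x'}, lambda s: s['x'])),
                ('while', 3, ({'r', 'y'}, lambda s: s['r'] >= s['y']), body)))

# The two start states differ only on q and r, which are dead at label 1.
finals = check_lockstep(prog, {'q': 7, 'r': 0, 'x': 6, 'y': 2},
                              {'q': 99, 'r': 5, 'x': 6, 'y': 2})
```

   The first assertion in check_lockstep is the premise of the theorem:
   it fails if the two start states disagree on x or y, which are live
   at the entry of label 1. The solution computed for the whole program
   is reused for every later configuration, which is what Corollary 2.17
   licenses.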