COP 5021 Lecture -*- Outline -*-

* Interprocedural Analysis (2.5)

This section looks at analysis for languages with procedures,
including the structural operational semantics for such a language.

        the classic way of doing this is a whole-program analysis (non-modular)

        But we will attempt a modular analysis,
        by making procedure summaries (specifications)

        complications: matching calls and returns,
                       parameter passing mechanisms,
                       aliasing (from call by reference),
                       higher-order procedures

** syntax

------------------------------------------
                SYNTAX

Procedures with 1 call-by-value parameter,
and 1 call-by-result parameter.

  P \in Program
  D \in Declaration

  P ::= begin D S end
  D ::= proc p(val x, res y) is^ln S end^lx
      | D D
  S ::= ... | [call p(a,z)]^lc_lr

Example:

  begin proc fact(val n, res v) is^1
          if [n == 0]^2
          then [v := 1]^3
          else ([call fact(n-1, v)]^4_5;
                [v:=v*n]^11)
        end^6;
    [call fact(3,v)]^7_8;
    [call fact(v,w)]^9_10
  end
------------------------------------------

Q: In call p(a,z), what syntactic category is a? z?

        arithmetic expression, identifier

Q: What assumptions would simplify the analysis?

        assume unique labels (label consistency)
        assume no redeclarations, only declared procedures are called
        assume names of all formals in a proc decl are distinct (x =/= y)

Q: What else do we need to do analysis and correctness proofs?

        operational semantics
        flows, etc.

** operational semantics (2.5.1)

Q: How is a procedure different than a macro?

        procedures have local variables (formals), different in each call
        so names aren't sufficient to distinguish each instantiation
        of a formal parameter, need locations...
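The point about locations can be made concrete with a small sketch.
This is only an illustration, not the book's definition: locations are
modeled as integers, and the helper names new_loc and enter_call are
hypothetical. It shows two activations of the same formal n living at
different locations in one store.

```python
def new_loc(store):
    """Return an integer location that is not yet in dom(store)."""
    return max(store, default=-1) + 1


def enter_call(global_env, store, formal, value):
    """Bind `formal` to a fresh location holding `value`.

    Returns the callee's environment, i.e. global_env[formal |-> loc].
    The store is updated in place (an assumption made for brevity).
    """
    loc = new_loc(store)
    store[loc] = value
    env = dict(global_env)
    env[formal] = loc
    return env


store = {}      # Loc -> Z, a finite partial function
rho_star = {}   # top-level environment, empty in this sketch

outer = enter_call(rho_star, store, "n", 3)  # activation for fact(3, ...)
inner = enter_call(rho_star, store, "n", 2)  # recursive activation fact(2, ...)

# Same name n, different locations, so both values coexist in the store.
print(outer["n"] != inner["n"])
print(store[outer["n"]], store[inner["n"]])
```

With names alone (a macro-style expansion), the inner binding of n
would overwrite the outer one.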
------------------------------------------
           OPERATIONAL SEMANTICS

   xi in Loc                    locations
   rho in Env = Var* -> Loc     environments
   s in Store = Loc ->_fin Z    stores

   Assume s o rho is total:
        ran(rho) \subseteq dom(s)
------------------------------------------

        (the book uses \varsigma for this kind of store, instead of \sigma)

        Loc ->_fin Z is the set of partial functions with a finite domain

Q: How do these states relate to the states we had previously?

        old ones are the composition of a store and an environment

Q: How should we deal with global variables?

        top-level environment, rho*
        assume injective

------------------------------------------
           OPERATIONAL SEMANTICS

 [skip] rho |-* ([skip]^l, s) --> s

 [asgn] rho |-* ([x:=a]^l, s)
           --> s[rho(x) |-> A[[a]](s o rho)]
        if s o rho is total
------------------------------------------

Q: What would the sequence rules look like? while? if?

Q: What would the calls look like in the operational semantics?

        two parts, evaluation of actual parameters, and parameter passing

------------------------------------------
           CALL AND BIND RULES

 [call] rho |-* ([call p(a,z)]^lc_lr, s)
           --> (bind rho*[x |-> xi1, y |-> xi2] in S then z := y,
                s[xi1 |-> A[[a]](s o rho), xi2 |-> v])
        if xi1, xi2 not in dom(s), v in Z,
           proc p(val x, res y) is^ln S end^lx is in D*

              rho' |-* (S, s) --> (S', s')
 [bind1]________________________________
        rho |-* (bind rho' in S then z := y, s)
           --> (bind rho' in S' then z := y, s')

              rho' |-* (S, s) --> s'
 [bind2]________________________________
        rho |-* (bind rho' in S then z := y, s)
           --> s'[rho(z) |-> s'(rho'(y))]
------------------------------------------

        bind-in-then is a 3 part AST, used for scoping

Q: How do you parse the first rule?

Q: Where is bind-in-then in the surface syntax?

        it isn't, just used for the operational semantics

Q: What is [bind2] doing?

        termination of call, pass by result for z

Q: In [bind2], why is rho(z) used instead of rho'(z)?

        Because the location of the result argument is given by
        the surrounding scope, so this is correct.

Q: Do we have to deal with bind-in-then for proofs?

        yes!

------------------------------------------
                EXAMPLE

Let P be
 { proc fact(val n, res v) is^1
     if [n == 0]^2
     then [v := 1]^3
     else ([call fact(n-1, v)]^4_5;
           [v:=v*n]^11)
   end^6;
   [call fact(3,v)]^7_8;
   [call fact(v,w)]^9_10 }

Name each statement S_i if it has label i
(for calls, use the top label).

Let S_2 be if [n == 0]^2 then ...
Let S_79 be [call fact(3,v)]^7_8; [call fact(v,w)]^9_10

Let rho* = {v |-> 0, w |-> 1}
    s00 = {0 |-> 0, 1 |-> 0}

(and for later use)
Let rho1 = rho*[n |-> 2, v |-> 3]
         = {v |-> 3, w |-> 1, n |-> 2}
    s0039 = {0 |-> 0, 1 |-> 0, 2 |-> 3, 3 |-> 9}

Calculate in the context of rho*:

 rho* |-* (S_79, s00)
 -->
   * ([call fact(3,v)]^7_8, s00)
      --> (bind rho*[n |-> 2, v |-> 3] in S_2 then v := v,
           s00[2 |-> 3, 3 |-> 9])
        = (bind rho1 in S_2 then v := v, s0039)
   . (bind rho1 in S_2 then v := v; S_9, s0039)
 -->
   * (bind rho1 in (if [n == 0]^2 then ...) then v := v, s0039)
      --> rho1 |-* (if [n == 0]^2
                    then [v := 1]^3
                    else ([call fact(n-1, v)]^4_5; [v:=v*n]^11),
                    s0039)
            --> ([call fact(n-1, v)]^4_5; [v:=v*n]^11, s0039)
       . (bind rho1 in ([call fact(n-1, v)]^4_5; [v:=v*n]^11)
          then v := v, s0039)
   . (bind rho1 in ([call fact(n-1, v)]^4_5; [v:=v*n]^11)
      then v := v; S_9, s0039)
------------------------------------------

        can continue this...

** flow graphs (non-modular)

Q: How should we make flow graphs for calls?

------------------------------------------
           FLOW GRAPHS FOR CALLS

 init([call p(a, z)]^lc_lr) =

 final([call p(a, z)]^lc_lr) =

 blocks([call p(a, z)]^lc_lr) =

 labels([call p(a, z)]^lc_lr) =

 flow([call p(a, z)]^lc_lr) = {(lc;ln), (lx;lr)}
      if proc p(val x, res y) is^ln S end^lx is in D*
------------------------------------------

Q: What should these be?

        ... lc
        ... {lr}
        ... {[call p(a,z)]^lc_lr}
        ... {lc, lr}

Q: Why use semicolons for the flows?

        to see which ones are procedure flows...
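The call-block definitions above can be sketched in code. This is a
minimal illustration under assumptions: the Proc and Call records and
the function names are hypothetical, and the pairs are plain tuples,
so the semicolon that marks a procedure flow is not represented.

```python
from typing import NamedTuple


class Proc(NamedTuple):
    name: str
    ln: int   # entry label (is^ln)
    lx: int   # exit label (end^lx)


class Call(NamedTuple):
    proc: str
    lc: int   # call label
    lr: int   # return label


def init_call(c):
    return c.lc


def final_call(c):
    return {c.lr}


def labels_call(c):
    return {c.lc, c.lr}


def flow_call(c, procs):
    """flow([call p(a,z)]^lc_lr) = {(lc;ln), (lx;lr)}."""
    p = procs[c.proc]
    return {(c.lc, p.ln), (p.lx, c.lr)}


# The fact example: entry label 1, exit label 6, three call sites.
procs = {"fact": Proc("fact", 1, 6)}
calls = [Call("fact", 4, 5), Call("fact", 7, 8), Call("fact", 9, 10)]

call_flows = set().union(*(flow_call(c, procs) for c in calls))
print(sorted(call_flows))
```

Every call site contributes an edge into label 1 and an edge out of
label 6, which is why the recursive call shares the procedure's graph.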

Q: What would happen if p was a program variable?

        dynamic dispatch ==> harder to determine the exact code called

------------------------------------------
        FLOW GRAPHS FOR PROCEDURES

For each procedure declaration
   proc p(val x, res y) is^ln S end^lx

   init(p) =

   final(p) =

   blocks(p) = {is^ln, end^lx} \cup blocks(S)

   labels(p) =

   flow(p) =
------------------------------------------

Q: What should these be?

        ... ln
        ... {lx}
        ... {ln, lx} \cup labels(S)
        ... {(ln, init(S))} \cup flow(S) \cup {(l, lx) | l \in final(S)}

------------------------------------------
         FLOW GRAPHS FOR PROGRAMS

For program P* = begin D* S* end

   init* = init(S*)
   final* = final(S*)
   blocks* = blocks(S*)
             \cup \bigcup {blocks(p) | proc p... in D*}
   labels* = labels(S*)
             \cup \bigcup {labels(p) | proc p... in D*}
   flow* = flow(S*)
             \cup \bigcup {flow(p) | proc p... in D*}
------------------------------------------

Q: Is the flow graph for a program still finite?

        Yes.

Q: What is Lab* for such a program?

        labels*

------------------------------------------
          INTERPROCEDURAL FLOWS

 inter-flow* = { (lc, ln, lx, lr)
               | P* contains [call p(a,z)]^lc_lr
                 and proc p(val x, res y) is^ln S end^lx }

Notation: IF is an abstraction of inter-flow*

   for forward analysis:  IF = inter-flow*
   for backward analysis: IF = inter-flow^R*
------------------------------------------

Q: How could we use inter-flow*?

        relates calls and returns

        Suppose we have (lpc, lpn, lpx, lpr) and (lqc, lpn, lpx, lqr)
        in inter-flow*.
        Then (lpc;lpn), (lqc;lpn), (lpx;lpr), and (lpx;lqr) will all be flows
        (see the def. of flow([call p(...)])),
        but there can't be a trace where (lpc;lpn) is followed by (lpx;lqr).
        Thus although it might appear that (lpc, lpn, lpx, lqr)
        could be a 4-tuple, that can't happen.

------------------------------------------
                EXAMPLE

  begin proc fact(val n, res v) is^1
          if [n == 0]^2
          then [v := 1]^3
          else ([call fact(n-1, v)]^4_5;
                [v:=v*n]^11)
        end^6;
    [call fact(3,v)]^7_8;
    [call fact(v,w)]^9_10
  end

What is flow*?

What is inter-flow*?
------------------------------------------

        Draw the flow graph

Q: What else do we need to do analysis?

        (nothing?)

Q: What is the size of the flow graph in terms of the number of calls?

        it's linear

Q: What is the size of the flow graph for a recursive procedure?

        it's not infinite, as the recursive calls share the graph,
        see Fig 2.7

Q: Is this the right way to look at dataflow problems?

        it's not clear...

** a modular approach, procedure summaries

Q: What do we do for type checking of procedures?

        we have a type environment that tells what each does

        Idea: let's do the same thing for other kinds of analysis.

------------------------------------------
     PLAN FOR PROCEDURE SUMMARIES

0. Consider call to be a kind of elementary block.

1. Compute the analysis information for
   each procedure p, based on its body

2. Summarize each procedure p with
   a transfer function

      summary(p):

3. Call statements have kill and gen
   (i.e., transfer) functions that use
   summary(p) and handle argument passing

4. Compute a fixed-point to solve for summaries

Iteration:
   use a "bottom" summary to start:
      summary_0 = \p.\(i,v). \bot
   Then iterate construction of summary
   until it reaches a fixed-point
------------------------------------------

        ... summary(p): Int x Var* -> L

        gives the effect of a call to p in terms of the property space
        (for the given actual arguments, the Int and the Var* elems)

        so summary: ProcName -> (Int x Var* -> L)

Q: What kind of dependency can summary(p) have on the Int argument?

        probably not much, if anything. Should ignore it.

*** example: reaching definitions

------------------------------------------
    EXAMPLE: REACHING DEFINITIONS

 L = P(Var* x Lab*^?)

 For analysis within a procedure,
    proc p(val n, res v) is^ln S end^lx
 formal n is considered initialized at label ln

 summary(p) = \(i,v). \bigcup {RD_exit(l) | l \in final(S)}

 kill_RD([call p(a,z)]^lc_lr) = {(z,l) | l \in Lab*^?}

 gen_RD([call p(a,z)]^lc_lr) = summary(p)(a,z) \cup {(z,lr)}
------------------------------------------
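
The kill and gen functions for a call can be sketched as follows.
This is only an illustration under assumptions: the names UNINIT,
kill_rd_call, gen_rd_call, and transfer_call are hypothetical, the
"?" label is modeled as a string, and the usual (in - kill) \cup gen
form of the transfer function is assumed. The summary used is the
bottom summary of step 4, which for L = P(Var* x Lab*^?) is the
function returning the empty set.

```python
UNINIT = "?"  # stands for the "?" element of Lab*^?


def kill_rd_call(z, labels):
    """kill_RD([call p(a,z)]^lc_lr) = {(z,l) | l in Lab*^?}."""
    return {(z, l) for l in labels | {UNINIT}}


def gen_rd_call(summary, p, a, z, lr):
    """gen_RD([call p(a,z)]^lc_lr) = summary(p)(a,z) union {(z,lr)}."""
    return summary[p](a, z) | {(z, lr)}


def transfer_call(rd_in, summary, labels, p, a, z, lr):
    """Assumed transfer function for a call block: (in - kill) union gen."""
    return (rd_in - kill_rd_call(z, labels)) | gen_rd_call(summary, p, a, z, lr)


labels = set(range(1, 12))                 # labels 1..11 of the fact program
summary0 = {"fact": lambda a, z: set()}    # bottom summary: empty set

rd_in = {("v", UNINIT), ("w", UNINIT)}     # nothing assigned yet
rd_out = transfer_call(rd_in, summary0, labels, "fact", "3", "v", 8)
print(rd_out)
```

After [call fact(3,v)]^7_8 under the bottom summary, the only
definition of v that reaches is the one at the return label 8, and w
is still uninitialized. Later iterations would replace summary0 with a
summary computed from the body of fact.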