COP 5021 Lecture -*- Outline -*-

* Interprocedural Analysis (2.5)

  This section looks at analysis for languages with procedures,
  including the structural operational semantics for such a language.

  The classic way of doing this is a whole-program (non-modular) analysis.

  But we will attempt a modular analysis,
  by making procedure summaries (specifications).

  Complications: matching calls and returns,
                 parameter passing mechanisms,
                 aliasing (from call by reference),
                 higher-order procedures

** syntax

------------------------------------------
               SYNTAX

Procedures with 1 call-by-value parameter,
and 1 call-by-result parameter.

P \in Program
D \in Declaration

P ::= begin D S end
D ::= proc p(val x, res y) is^ln S end^lx
    | D D
S ::= ... | [call p(a,z)]^lc_lr


Example:

  begin
    proc fact(val n, res v) is^1
      if [n == 0]^2
      then [v := 1]^3
      else ([call fact(n-1, v)]^4_5; [v:=v*n]^11)
    end^6;
    [call fact(3,v)]^7_8;
    [call fact(v,w)]^9_10
  end
------------------------------------------
     Q: In call p(a,z), what syntactic category is a? z?
        arithmetic expression, identifier
        
     Q: What assumptions would simplify the analysis?
        assume unique labels (label consistency)
        assume no redeclarations, only declared procedures are called
        assume names of all formals in a proc decl are distinct (x =/= y)

     Q: What else do we need to do analysis and correctness proofs?
        operational semantics
        flows, etc.

** operational semantics (2.5.1)

    Q:  How is a procedure different than a macro?
        procedures have local variables (formals),
           different in each call

        so names aren't sufficient to distinguish each instantiation
        of a formal parameter, need locations...

------------------------------------------
        OPERATIONAL SEMANTICS

xi  in Loc                    locations
rho in Env = Var* -> Loc      environments
s   in Store = Loc ->_fin Z   stores

 Assume s o rho is total:
  ran(rho) \subseteq dom(s)

------------------------------------------

     (the book uses \varsigma for this kind of store, instead of \sigma)

     Loc ->_fin Z is set of partial functions with a finite domain

     Q:  How do these states relate to the states we had previously?
         old ones are the composition of a store and an environment

     Q:  How should we deal with global variables?

         top-level environment, rho*
         assume injective
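
     A minimal Python sketch of how the old states arise as s o rho,
     assuming environments and stores are plain dicts. The helper names
     (compose, assign) are illustrative, not from the book; rho_star and
     s00 are the values used in the example later in these notes.

```python
def compose(s, rho):
    """Return the state s o rho, mapping each variable to its value."""
    # s o rho is total only when ran(rho) is a subset of dom(s)
    missing = [x for x, loc in rho.items() if loc not in s]
    if missing:
        raise ValueError(f"s o rho is not total; no store entry for {missing}")
    return {x: s[loc] for x, loc in rho.items()}

def assign(rho, s, x, value):
    """Store update used by [asgn]: s[rho(x) |-> value]; rho is unchanged."""
    return {**s, rho[x]: value}

rho_star = {"v": 0, "w": 1}   # top-level environment (injective)
s00 = {0: 0, 1: 0}            # store: location -> integer
sigma = compose(s00, rho_star)
```

     Here sigma is an old-style state (variable -> value); two different
     environments over the same store give two different states.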

------------------------------------------
        OPERATIONAL SEMANTICS

[skip] rho |-* ([skip]^l, s) --> s

[asgn] rho |-* ([x:=a]^l, s)
              --> s[rho(x) |-> A[[a]](s o rho)]
                if s o rho is total

------------------------------------------

    Q:  What would the sequence rules look like? while? if?

    Q:  What would the calls look like in the operational semantics?

        two parts, evaluation of actual parameters, and parameter
        passing

------------------------------------------
        CALL AND BIND RULES

[call]
rho |-* ([call p(a,z)]^lc_lr, s)
  --> (bind rho*[x |-> xi1, y |-> xi2]
         in S then z := y,
       s[xi1 |-> A[[a]](s o rho),
         xi2 |-> v])
   if xi1, xi2 not in dom(s), v in Z,
      proc p(val x, res y) is^ln S end^lx
        is in D*
   

           rho' |-* (S, s) --> (S', s')
[bind1]________________________________
   rho |-* (bind rho' in S then z := y, s)
     --> (bind rho' in S' then z := y, s')

           rho' |-* (S, s) --> s'
[bind2]________________________________
   rho |-* (bind rho' in S then z := y, s)
            --> s'[rho(z) |-> s'(rho'(y))]
------------------------------------------

        bind-in-then is a 3 part AST, used for scoping

        Q: How do you parse the first rule?

        Q: Where is bind-in-then in the surface syntax?
           it isn't, just used for the operational semantics

        Q: What is [bind2] doing?
           termination of call, pass by result for z

        Q: In [bind2], why is rho(z) used instead of rho'(z)?
           z is the caller's actual argument, so its location is given by
           the caller's environment, rho; rho' is the callee's environment
           (rho* extended with the formals), which may bind z differently.

        Q: Do we have to deal with bind-in-then for proofs?
           yes!
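
        A rough Python sketch of what [call], [bind1], and [bind2] do
        together. It makes two simplifying assumptions that are not in
        the rules: the small steps are collapsed into one function call,
        and a procedure body is a Python function from (rho', s) to the
        final store. All names (fresh, call, fact_body, RHO_STAR, FACT)
        are hypothetical.

```python
def fresh(s):
    """Pick a location that is not in dom(s)."""
    return max(s, default=-1) + 1

def call(rho, rho_star, s, proc, a_value, z, v=0):
    """Run one call; a_value is A[[a]](s o rho), v the arbitrary initial result."""
    x, y, body = proc
    xi1 = fresh(s)
    s = {**s, xi1: a_value}                    # value parameter
    xi2 = fresh(s)
    s = {**s, xi2: v}                          # result parameter, arbitrary v
    rho_prime = {**rho_star, x: xi1, y: xi2}   # rho*[x |-> xi1, y |-> xi2]
    s_prime = body(rho_prime, s)               # [bind1] until S terminates
    # [bind2]: z is looked up in the caller's rho, y in the callee's rho'
    return {**s_prime, rho[z]: s_prime[rho_prime[y]]}

def fact_body(rho, s):
    """Body of fact from the example, run under the callee environment rho."""
    n = s[rho["n"]]
    if n == 0:
        return {**s, rho["v"]: 1}
    s = call(rho, RHO_STAR, s, FACT, n - 1, "v")
    return {**s, rho["v"]: s[rho["v"]] * s[rho["n"]]}

RHO_STAR = {"v": 0, "w": 1}
FACT = ("n", "v", fact_body)

s1 = call(RHO_STAR, RHO_STAR, {0: 0, 1: 0}, FACT, 3, "v", v=9)
s2 = call(RHO_STAR, RHO_STAR, s1, FACT, s1[RHO_STAR["v"]], "w")
```

        In the recursive call the actual z and the formal y are both
        named v, so using rho'(z) in the copy-out would write the result
        back into the callee's own formal instead of the caller's v.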

------------------------------------------
            EXAMPLE

Let P be

  {
    proc fact(val n, res v) is^1
      if [n == 0]^2
      then [v := 1]^3
      else ([call fact(n-1, v)]^4_5; [v:=v*n]^11)
    end^6;
    [call fact(3,v)]^7_8;
    [call fact(v,w)]^9_10
  }

Name each statement S_i if it has label i
(for calls, use the top label).
Let S_2 be if [n == 0]^2 then ...
Let S_79 be [call fact(3,v)]^7_8; [call fact(v,w)]^9_10

Let rho* = {v |-> 0, w |-> 1}
    s00 = {0 |-> 0, 1 |-> 0}
(and for later use)
Let rho1 = rho*[n |-> 2, v |-> 3] = {v |-> 3, w |-> 1, n |-> 2}
     s0039 = {0 |-> 0, 1 |-> 0, 2 |-> 3, 3 |-> 9}

Calculate in the context of rho*:
   rho*
|-*
   (S_79, s00)
--> <by [seq1]>
    * ([call fact(3,v)]^7_8, s00)
    --> <by [call], using declaration of fact, picking 9 for value in loc 3>
      (bind rho*[n |-> 2, v |-> 3] in S_2 then v := v, s00[2 |-> 3, 3 |-> 9])
    = <by def of rho1, s0039>
      (bind rho1 in S_2 then v := v, s0039)
. (bind rho1 in S_2 then v := v; S_9, s0039)
--> <by [seq1], def of S_2>
    * (bind rho1 in (if [n == 0]^2 then ...) then v := v, s0039)
    --> <by [bind1]>
        rho1
        |-*
         (if [n == 0]^2 then [v := 1]^3
          else ([call fact(n-1, v)]^4_5; [v:=v*n]^11),
          s0039)
         --> <by [if2]>
          ([call fact(n-1, v)]^4_5; [v:=v*n]^11, s0039)
     . (bind rho1 in ([call fact(n-1, v)]^4_5; [v:=v*n]^11) then v := v, s0039)
. (bind rho1 in ([call fact(n-1, v)]^4_5; [v:=v*n]^11) then v := v; S_9, s0039)


------------------------------------------

        can continue this...


** flow graphs (non-modular)

   Q: How should we make flow graphs for calls?

------------------------------------------
          FLOW GRAPHS FOR CALLS

init([call p(a, z)]^lc_lr) =

final([call p(a, z)]^lc_lr) =

blocks([call p(a, z)]^lc_lr) =

labels([call p(a, z)]^lc_lr) =

flow([call p(a, z)]^lc_lr) = {(lc;ln),
                              (lx;lr)}
    if proc p(val x, res y) is^ln S end^lx
       is in D*

------------------------------------------
        Q: What should these be?
           ... lc
           ... {lr}
           ... {[call p(a,z)]^lc_lr}
           ... {lc, lr}

        Q: Why use semicolons for the flows?
           to see which ones are procedure flows...

        Q: What would happen if p was a program variable?
           dynamic dispatch ==> harder to determine the exact code called

------------------------------------------
    FLOW GRAPHS FOR PROCEDURES

For each procedure declaration
  proc p(val x, res y) is^ln S end^lx

init(p) =

final(p) =

blocks(p) = {is^ln, end^lx} \cup blocks(S)

labels(p) =

flow(p) = 

------------------------------------------

   Q:  What should these be?
       ... ln
       ... {lx}
       ... {ln,lx} \cup labels(S)
       ... {(ln,init(S))} \cup flow(S) \cup {(l,lx)|l \in final(S)}
     
------------------------------------------
        FLOW GRAPHS FOR PROGRAMS

For program P* = begin D* S* end

init* = init(S*)

final* = final(S*)

blocks* = blocks(S*) \cup
     \bigcup {blocks(p) | proc p... in D*}

labels* = labels(S*) \cup
     \bigcup {labels(p) | proc p... in D*}

flow* = flow(S*) \cup
     \bigcup {flow(p) | proc p... in D*}
------------------------------------------
    Q: Is the flow graph for a program still finite?
       Yes.
 
    Q: What is Lab* for such a program?
       labels*
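
    The definitions of init, final, and flow for calls, procedures, and
    programs can be sketched in Python. The tuple encoding of statements
    is an assumption made for this sketch (it covers only blocks, seq,
    if, and call), and a procedure flow (l;l') is encoded as the triple
    (l, ";", l') to keep it apart from an ordinary flow (l, l').

```python
# Hypothetical statement encoding:
#   ("blk", l)   ("seq", S1, S2)   ("if", l, S1, S2)   ("call", p, lc, lr)
# procs maps a procedure name to (ln, S, lx).

def init(S):
    if S[0] == "seq":
        return init(S[1])
    return S[2] if S[0] == "call" else S[1]      # lc for a call

def final(S):
    if S[0] == "seq":
        return final(S[2])
    if S[0] == "if":
        return final(S[2]) | final(S[3])
    return {S[3]} if S[0] == "call" else {S[1]}  # {lr} for a call

def flow(S, procs):
    if S[0] == "seq":
        return (flow(S[1], procs) | flow(S[2], procs)
                | {(l, init(S[2])) for l in final(S[1])})
    if S[0] == "if":
        return (flow(S[2], procs) | flow(S[3], procs)
                | {(S[1], init(S[2])), (S[1], init(S[3]))})
    if S[0] == "call":
        ln, _, lx = procs[S[1]]
        return {(S[2], ";", ln), (lx, ";", S[3])}
    return set()

def flow_proc(p, procs):
    ln, S, lx = procs[p]
    return {(ln, init(S))} | flow(S, procs) | {(l, lx) for l in final(S)}

def flow_star(S_star, procs):
    result = flow(S_star, procs)
    for p in procs:
        result |= flow_proc(p, procs)
    return result

# The fact program used in these notes
body = ("if", 2, ("blk", 3), ("seq", ("call", "fact", 4, 5), ("blk", 11)))
procs = {"fact": (1, body, 6)}
main = ("seq", ("call", "fact", 7, 8), ("call", "fact", 9, 10))
fs = flow_star(main, procs)
```

    The result is a finite set even though fact is recursive, because
    every call to fact contributes only two edges into the one shared
    copy of the procedure's graph.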

------------------------------------------
        INTERPROCEDURAL FLOWS

inter-flow* =
  { (lc, ln, lx, lr) | P* contains
     [call p(a,z)]^lc_lr and
    proc p(val x, res y) is^ln S end^lx }

Notation:
  IF is an abstraction of inter-flow*

  for forward analysis:
    IF = inter-flow*

  for backward analysis:
    IF = inter-flow^R* 
------------------------------------------

     Q: How could we use inter-flow*?
        relates calls and returns

        Suppose we have (lpc, lpn, lpx, lpr) and (lqc, lpn, lpx, lqr)
        in inter-flow*.  Then (lpc;lpn), (lqc;lpn), (lpx;lpr), and
        (lpx;lqr) will all be flows (see the def. of flow([call p(...)])),
        but there can't be a trace where the call (lpc;lpn) is matched
        by the return (lpx;lqr). Thus, although it might appear that
        (lpc, lpn, lpx, lqr) could be a 4-tuple of inter-flow*, it is not.
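
        One way to see how inter-flow* relates calls and returns is to
        check a path of labels with a stack. This is only a sketch: the
        function names and the list encoding of paths are assumptions,
        and calls still pending at the end of the path are accepted.

```python
def inter_flow(calls, procs):
    """calls: iterable of (p, lc, lr); procs: p -> (ln, lx)."""
    return {(lc, procs[p][0], procs[p][1], lr) for p, lc, lr in calls}

def matched(path, IF):
    """True if every return edge on path matches the most recent open call."""
    call_edges = {(lc, ln): (lx, lr) for lc, ln, lx, lr in IF}
    return_edges = {(lx, lr) for _, _, lx, lr in IF}
    stack = []
    for edge in zip(path, path[1:]):
        if edge in call_edges:
            stack.append(call_edges[edge])    # remember the expected return
        elif edge in return_edges:
            if not stack or stack.pop() != edge:
                return False                  # return to the wrong call site
    return True

# inter-flow* for the fact program in these notes
IF = inter_flow([("fact", 4, 5), ("fact", 7, 8), ("fact", 9, 10)],
                {"fact": (1, 6)})
ok = matched([7, 1, 2, 4, 1, 2, 3, 6, 5, 11, 6, 8], IF)
bad = matched([7, 1, 2, 3, 6, 10], IF)
```

        The second path uses only edges that are in flow*, but it enters
        fact from label 7 and returns to label 10, so it is rejected.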

------------------------------------------
            EXAMPLE

  begin
    proc fact(val n, res v) is^1
      if [n == 0]^2
      then [v := 1]^3
      else ([call fact(n-1, v)]^4_5; [v:=v*n]^11)
    end^6;
    [call fact(3,v)]^7_8;
    [call fact(v,w)]^9_10
  end

What is flow*?


What is inter-flow*?


------------------------------------------

     Draw the flow graph

     Q: What else do we need to do analysis?
        (nothing?)
     Q: What is the size of the flow graph in terms of the number of calls?
         it's linear
     Q: What is the size of the flow graph for a recursive procedure?
         it's not infinite, as the recursive calls share the graph,
         see Fig 2.7
     Q: Is this the right way to look at dataflow problems?
         it's not clear...

** a modular approach, procedure summaries
   Q: What do we do for type checking of procedures?
      we have a type environment that tells what each does
   Idea: let's do the same thing for other kinds of analysis.
------------------------------------------
      PLAN FOR PROCEDURE SUMMARIES

0. Consider call to be a kind of elementary block.

1. Compute the analysis information 
   for each procedure p, based on its body

2. Summarize each procedure p with a
   transfer function 
      summary(p): 


3. Call statements have kill and gen 
   (i.e., transfer) functions
   that use summary(p)
   and handle argument passing


4. Compute a fixed-point to solve for summaries
    Iteration: use a "bottom" summary to start:
        summary_0 = \p.\(i,v). \bot
    Then iterate the construction of summary
    until it reaches a fixed point
------------------------------------------
      ... summary(p): Int x Var* -> L
         gives the effect of a call to p in terms of the property space
                (for the given actual arguments, the Int and the Var* elems)

          so summary: ProcName -> (Int x Var* -> L)

      Q: What kind of dependency can summary(p) have on the Int argument?
         probably not much, if anything. Should ignore it.
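
      Step 4 of the plan is ordinary fixed-point iteration from bottom.
      A minimal Python sketch, assuming summaries ignore their arguments
      (as suggested above) and so are just elements of L, here sets of
      (variable, label) pairs. The analyze function and the LOCAL and
      CALLS tables are hypothetical stand-ins for step 1 (analyzing a
      body); they are not an analysis from the book.

```python
def solve_summaries(procs, analyze, bottom=frozenset()):
    """Iterate from the bottom summary until nothing changes."""
    summary = {p: bottom for p in procs}              # summary_0
    while True:
        new = {p: analyze(p, summary) for p in procs}
        if new == summary:                            # fixed point reached
            return summary
        summary = new

# Toy stand-in for analyzing a body: the pairs produced directly in p,
# joined with the current summaries of the procedures that p calls.
LOCAL = {"p": frozenset({("x", 1)}), "q": frozenset({("y", 2)})}
CALLS = {"p": ["q"], "q": ["p"]}                      # mutually recursive

def analyze(p, summary):
    result = LOCAL[p]
    for callee in CALLS[p]:
        result = result | summary[callee]
    return result

summaries = solve_summaries(["p", "q"], analyze)
```

      The loop terminates here because analyze is monotone in the
      summaries and only finitely many pairs exist; the same argument is
      needed for any real instance of the plan.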

*** example: reaching definitions
------------------------------------------
       EXAMPLE: REACHING DEFINITIONS

 L = P(Var* x Lab*^?)

 For analysis within a procedure,
   proc p(val n, res v) is^ln S end^lx
 formal n is considered initialized at label ln

 summary(p) = \(i,v). 
  \bigcup {RD_exit(l)|l \in final(S)}

 kill_RD([call p(a,z)]^lc_lr) = 
    {(z,l)|l \in Lab*^?}
 gen_RD([call p(a,z)]^lc_lr) = 
    summary(p)(a,z) \cup {(z,lr)}
------------------------------------------
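
      The kill and gen functions for a call can be written out directly.
      This Python sketch assumes summary is a dict from procedure names
      to functions of (a, z), and uses the string "?" for the extra
      label in Lab*^?; the function names are illustrative only.

```python
QUESTION = "?"   # the extra label in Lab*^?, for uninitialized variables

def kill_call(z, labels):
    """kill_RD: every definition of the result actual z."""
    return {(z, l) for l in labels | {QUESTION}}

def gen_call(p, a, z, lr, summary):
    """gen_RD: what summary(p) reports, plus the copy-out to z at lr."""
    return summary[p](a, z) | {(z, lr)}

def transfer_call(rd_entry, p, a, z, lr, summary, labels):
    """Transfer function for [call p(a,z)]^lc_lr, given RD at entry."""
    return (rd_entry - kill_call(z, labels)) | gen_call(p, a, z, lr, summary)

labels = set(range(1, 12))                    # labels 1..11 of the fact program
summary_0 = {"fact": lambda a, z: set()}      # the bottom summary
rd_entry = {("v", QUESTION), ("w", QUESTION)}
rd_exit = transfer_call(rd_entry, "fact", "3", "v", 8, summary_0, labels)
```

      With the bottom summary, the call [call fact(3,v)]^7_8 simply
      replaces all definitions of v by (v, 8) and leaves w untouched.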