COP 5021 meeting -*- Outline -*-

* control-flow analysis (1.4)

      Based on
       section 1.4 of Nielson, Nielson and Hankin's book
       Principles of Program Analysis (Springer, 1999)

** setting

------------------------------------------
             SETTING

Goal:
   

Approach:

1. Convert all control structures to
   functions and function calls.

2. Analysis finds what functions
   can be called from each point
   
------------------------------------------
   ... find a (safe) approximation to the program's possible control flows

   Q:  Why switch to a functional language for this section?

   Because a language with first-class functions can simulate all
   other control structures

------------------------------------------
        CONTINUATION PASSING STYLE

An intermediate language
   with one control structure

Idea: every expression takes a
      "continuation"
      to which it sends its result

Examples:

  x < 0
==>
  [fn k => [[[%< x] 0] k]]


  if [x < 0]
  then [y := 22]
  else [z := 33]
==>
  [fn k =>
    [[[%< x] 0]
     [%if [[y := 22] k]
          [[z := 33] k]]]]
------------------------------------------

   Explain how the primitives manipulate the argument continuation

   You can think of the continuation as the rest of the computation:
   it prints or returns the final result, or control is passed to it
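
   The second slide example can be mimicked in Python as a rough
   sketch. The names lt, assign, cps_if and conditional are
   hypothetical stand-ins for %<, :=, %if and the whole translated
   term; the explicit store and the thunks (needed because Python is
   strict) are assumptions of this sketch, not part of the slide.

```python
def lt(x):
    """Curried CPS version of %<: [[[%< x] y] k] sends (x < y) to k."""
    return lambda y: lambda k: k(x < y)

def assign(store, name, value):
    """CPS assignment: update the store, then pass control to k."""
    def with_k(k):
        store[name] = value
        return k(store)
    return with_k

def cps_if(then_branch, else_branch):
    """%if builds a continuation that receives the test result.

    The branches are thunks so that only the chosen one runs.
    """
    return lambda test: then_branch() if test else else_branch()

def conditional(store, x):
    # [fn k => [[[%< x] 0] [%if [[y := 22] k] [[z := 33] k]]]]
    return lambda k: lt(x)(0)(
        cps_if(lambda: assign(store, "y", 22)(k),
               lambda: assign(store, "z", 33)(k)))

# The final continuation just returns the store it is handed.
print(conditional({}, -5)(lambda s: s))
print(conditional({}, 3)(lambda s: s))
```

   Note that the only control structure left is function call: the
   comparison calls the continuation built by cps_if, which calls one
   branch, which calls k.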

------------------------------------------
            LANGUAGE (p. 140)

Work in a functional language:

  e \in Exp
  t \in Term
f,x \in Var
  c \in Const
 op \in Op
  l \in Lab

 e ::= t^l
 t ::= c
    | x 
    | fn x => e_0    "non-recursive fun"
    | fun f x => e_0 "recursive fun def"
    | e_1 e_2
    | if e_0 then e_1 else e_2
    | let x = e_1 in e_2
    | e_1 op e_2
------------------------------------------

    Q:  What are the atomic blocks being labeled here?
        all subexpressions
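
        A minimal sketch of how labeled expressions might be
        represented, covering only the variable, fn and application
        cases of the grammar. The class and function names (Var, Fn,
        App, Exp, labels) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Fn:
    param: str
    body: "Exp"

@dataclass(frozen=True)
class App:
    fun: "Exp"
    arg: "Exp"

@dataclass(frozen=True)
class Exp:
    """e ::= t^l : a term together with its label."""
    term: Union[Var, Fn, App]
    label: int

def labels(e):
    """Collect the labels of e and of all its subexpressions."""
    t = e.term
    if isinstance(t, Fn):
        return {e.label} | labels(t.body)
    if isinstance(t, App):
        return {e.label} | labels(t.fun) | labels(t.arg)
    return {e.label}

# [[fn x => [x]^1]^2 [fn y => [y]^3]^4]^5
example = Exp(App(Exp(Fn("x", Exp(Var("x"), 1)), 2),
                  Exp(Fn("y", Exp(Var("y"), 3)), 4)), 5)
print(sorted(labels(example)))
```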

** idea

   Q: What are the main ideas in this approach?

------------------------------------------
   IDEAS OF CONSTRAINT BASED ANALYSIS

- assume no side effects
  ==> associate information with labels

- use a pair of functions, (C,p):
    C: Lab* -> Powerset(Value)
    C(l) contains possible values for
         subexpression at label l

    p: Var* -> Powerset(Value)
    p(x) contains possible values for
         variable x
------------------------------------------

        "C" is an "abstract cache"
        p is an "abstract environment"

        Q:  What's the alternative to associating information directly
        with labels?
             associating information with the entry and exit of each
             labeled block

        Q:  How could this information be useful in an object-oriented
            program?
             to know what code is called in a dynamic dispatch
             (e.g., through an interface in a Java-like language)

        Q:  How is this different from a type system?

            not much, but we're allowing ourselves more flexibility by
            using sets of values instead of "types"

------------------------------------------
           APPROACH

- collect constraints

  for function abstractions:

    e.g., given
    
      [fn x => [x]^1]^2
    
    get
    
      {fn x => [x]^1} \subseteq C(2)

------------------------------------------

   Q:  What's the value of a function definition?

       A term (representing a closure)

   Q: Why do they just use the term instead of a closure?
        Because the abstract environment
        contains all the necessary values at
        every program point.

        Because it would be impossible to precisely determine the
        actual environment at every program point

   Q:  What's the general pattern here?

------------------------------------------
  for variables:

   e.g., given

       [x]^1

   get

       p(x) \subseteq C(1)

  for applications:

    e.g., given

       [[f]^1 [e]^2]^3

    get

       {v | g \in C(1), a \in C(2), and
            v = (g a)}
       \subseteq C(3)

------------------------------------------

   Q:  What's the general rule?
       Use the cache symbolically... That is:
       The singleton set containing the fn term at a given label is a
       subset of the cache entry for that label.
       For a variable, all the values it may have, p(x), flow into the
       cache entry for its label.
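
       The pattern can be sketched as a walk over the syntax tree that
       emits one symbolic constraint per label. Expressions are
       encoded here as tuples ("var", x, l), ("fn", x, body, l) and
       ("app", e1, e2, l); this encoding and the function names are
       assumptions of the sketch. The application case simply prints
       the informal set-comprehension form from the slide.

```python
SUB = "\\subseteq"   # printed the way these notes write it
IN = "\\in"

def term(e):
    """Render an expression without its outer label."""
    if e[0] == "var":
        return e[1]
    if e[0] == "fn":
        return f"fn {e[1]} => {show(e[2])}"
    return f"{show(e[1])} {show(e[2])}"

def show(e):
    """Render a labeled expression, e.g. [x]^1."""
    return f"[{term(e)}]^{e[-1]}"

def constraints(e):
    """List the constraints for e and all of its subexpressions."""
    label = e[-1]
    if e[0] == "var":
        # all values the variable may have flow to its label
        return [f"p({e[1]}) {SUB} C({label})"]
    if e[0] == "fn":
        # the fn term itself is a possible value at its label
        return [f"{{{term(e)}}} {SUB} C({label})"] + constraints(e[2])
    _, e1, e2, _ = e
    l1, l2 = e1[-1], e2[-1]
    apply_set = f"{{v | g {IN} C({l1}), a {IN} C({l2}), v = (g a)}}"
    return (constraints(e1) + constraints(e2)
            + [f"{apply_set} {SUB} C({label})"])

identity = ("fn", "x", ("var", "x", 1), 2)
call = ("app", ("var", "f", 1), ("var", "e", 2), 3)
for c in constraints(identity) + constraints(call):
    print(c)
```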

   Q:  What happens in  [[fn x => [x]^1]^2 [fn y => [y]^3]^4]^5 ?

       the book uses conditional constraints for this
         {fn x => [x]^1} \subseteq C(2) ==> C(4) \subseteq p(x)

       what's the least solution?
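
       One way to explore this is a small fixpoint iteration: start
       with every set empty and grow the sets only as far as the
       constraints force. The sketch below writes out the constraints
       for this example by hand, assuming the usual scheme in which
       each fn term in the program contributes two conditional
       constraints per application (argument flows to the parameter,
       body result flows to the application). The data layout and the
       name least_solution are assumptions of the sketch.

```python
FX = "fn x => [x]^1"
FY = "fn y => [y]^3"

# {t} \subseteq node
base = [(FX, "C(2)"), (FY, "C(4)")]
# lhs \subseteq rhs
subset = [("p(x)", "C(1)"), ("p(y)", "C(3)")]
# {t} \subseteq guard ==> lhs \subseteq rhs
conditional = [
    (FX, "C(2)", "C(4)", "p(x)"),
    (FX, "C(2)", "C(1)", "C(5)"),
    (FY, "C(2)", "C(4)", "p(y)"),
    (FY, "C(2)", "C(3)", "C(5)"),
]

def least_solution(base, subset, conditional):
    """Iterate from empty sets until no constraint adds anything."""
    names = {n for _, n in base}
    names |= {n for pair in subset for n in pair}
    names |= {n for _, g, l, r in conditional for n in (g, l, r)}
    sol = {n: set() for n in names}
    for t, n in base:
        sol[n].add(t)
    changed = True
    while changed:
        changed = False
        # a conditional constraint only applies once its guard holds
        active = list(subset) + [(l, r) for t, g, l, r in conditional
                                 if t in sol[g]]
        for lhs, rhs in active:
            if not sol[lhs] <= sol[rhs]:
                sol[rhs] |= sol[lhs]
                changed = True
    return sol

solution = least_solution(base, subset, conditional)
for name in sorted(solution):
    print(name, sorted(solution[name]))
```

       In this run only the constraints guarded by fn x => [x]^1 fire,
       so fn y => [y]^3 reaches p(x), C(1) and C(5), while C(3) and
       p(y) stay empty.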

   Q:  What would be the constraints for an expression of the form
         [[e1]^1 op [e2]^2]^3 ?

   Q:  What would be the constraints for an if-then-else expression?

   Q:  What would be the constraints for a let expression,
       like  let x = e1 in e2 ?