COP 5021 meeting -*- Outline -*-

* control-flow analysis (1.4)

Based on section 1.4 of Nielson, Nielson and Hankin's book
Principles of Program Analysis (Springer, 1999)

** setting

------------------------------------------
                SETTING

Goal:

Approach:
  1. Convert all control structures to
     functions and function calls.

  2. Analysis finds what functions can be
     called from each point
------------------------------------------
        ... find a (safe) approximation to the program's
            possible control flows

        Q: Why switch to a functional language for this section?
           Because a language with first-class functions can simulate
           all other control structures

------------------------------------------
        CONTINUATION PASSING STYLE

An intermediate language with one control structure

Idea: every expression takes a "continuation"
      to which it sends its result

Examples:

   x < 0
 ==>
   [fn k => [[[%< x] 0] k]]

   if [x < 0] then [y := 22] else [z := 33]
 ==>
   [fn k => [[[%< x] 0]
             [%if [[y := 22] k]
                  [[z := 33] k]]]]
------------------------------------------
        Explain how the primitives manipulate the argument continuation

        You can think of the continuation as printing or returning
        the final result, or as the place to which control is passed

------------------------------------------
           LANGUAGE (p. 140)

Work in a functional language:

   e \in Exp
   t \in Term
   f,x \in Var
   c \in Const
   op \in Op
   l \in Lab

   e ::= t^l
   t ::= c | x
      |  fn x => e_0           "non-recursive fun"
      |  fun f x => e_0        "recursive fun def"
      |  e_1 e_2
      |  if e_0 then e_1 else e_2
      |  let x = e_1 in e_2
      |  e_1 op e_2
------------------------------------------

        Q: What are the atomic blocks being labeled here?
           all subexpressions

** idea

        Q: What are the main ideas in this approach?
------------------------------------------
   IDEAS OF CONSTRAINT BASED ANALYSIS

- assume no side effects
  ==> associate information with labels

- use a pair of functions, (C,p):

   C: Lab* -> Powerset(Value)
      C(l) contains possible values for
           subexpression at label l

   p: Var* -> Powerset(Value)
      p(x) contains possible values for
           variable x
------------------------------------------
        "C" is an "abstract cache"
        p is an "abstract environment"

        Q: What's the alternative to associating information directly
           with labels?
           it would be associating information with entries and exits

        Q: How could this information be useful in an object-oriented
           program?
           to know what code is called in a dynamic dispatch
           (e.g., through an interface in a Java-like language)

        Q: How is this different from a type system?
           not much, but we're allowing ourselves more flexibility
           by using sets of values instead of "types"

------------------------------------------
               APPROACH

- collect constraints

for function abstractions:
  e.g., given [fn x => [x]^1]^2
        get {[fn x => [x]^1]} \subseteq C(2)
------------------------------------------

        Q: What's the value of a function definition?
           A term (representing a closure)

        Q: Why do they just use the term instead of a closure?
           Because the abstract environment contains all the necessary
           values at every program point.
           Because it would be impossible to precisely determine the
           actual environment at every program point

        Q: What's the general pattern here?

------------------------------------------
for variables:
  e.g., given [x]^1
        get p(x) \subseteq C(1)

for applications:
  e.g., given [[f]^1 [e]^2]^3
        get {v | g \in C(1), a \in C(2),
                 and v = (g a)} \subseteq C(3)
------------------------------------------

        Q: What's the general rule?
           Use the cache symbolically...
           That is: the singleton set containing the fn term at a
           given label is a subset of the cache for that label.
           For a variable, we use all the possible values it may have.

        Q: What happens in
              [[fn x => [x]^1]^2 [fn y => [y]^3]^4]^5 ?
           the book uses conditional constraints for this

             {fn x => [x]^1} \subseteq C(2) ==> C(4) \subseteq p(x)

           what's the least solution?

        Q: What would be the constraints for an expression of the form
              [[e1]^1 op [e2]^2]^3 ?

        Q: What would be the constraints for an if-then-else expression?

        Q: What would be the constraints for a let expression, like
              let x = e1 in e2 ?
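
        To make the "least solution" question concrete, here is a small
        sketch in Python (not the book's algorithm) that solves the
        constraints for [[fn x => [x]^1]^2 [fn y => [y]^3]^4]^5 by naive
        fixed-point iteration. The names FX, FY, NODES, CONSTRAINTS and
        solve are made up for this illustration. The constraint list is
        an assumption: it contains the conditional constraint shown
        above, plus the analogous guarded constraints for the result of
        the call and for fn y that one would expect under the usual
        formulation.

```python
# Abstract values: strings standing for the two fn terms (illustrative names).
FX = "fn x => [x]^1"
FY = "fn y => [y]^3"

# Unknowns: cache entries C(1)..C(5) and environment entries p(x), p(y).
NODES = ["C1", "C2", "C3", "C4", "C5", "p(x)", "p(y)"]

# Each constraint is (guard, source, target), read as:
#   if guard holds, then source \subseteq target.
# guard is None (unconditional) or (term, node), meaning term \in node.
# source is either a frozenset of terms or the name of a node.
CONSTRAINTS = [
    (None, frozenset({FX}), "C2"),   # {fn x => ...} \subseteq C(2)
    (None, frozenset({FY}), "C4"),   # {fn y => ...} \subseteq C(4)
    (None, "p(x)", "C1"),            # p(x) \subseteq C(1)
    (None, "p(y)", "C3"),            # p(y) \subseteq C(3)
    ((FX, "C2"), "C4", "p(x)"),      # fn x called: argument flows to x
    ((FX, "C2"), "C1", "C5"),        # fn x called: body flows to result (assumed)
    ((FY, "C2"), "C4", "p(y)"),      # fn y called: argument flows to y (assumed)
    ((FY, "C2"), "C3", "C5"),        # fn y called: body flows to result (assumed)
]


def solve(constraints, nodes):
    """Start from empty sets and add values until nothing changes."""
    sol = {n: set() for n in nodes}
    changed = True
    while changed:
        changed = False
        for guard, source, target in constraints:
            if guard is not None:
                term, node = guard
                if term not in sol[node]:
                    continue  # guard does not (yet) hold
            values = sol[source] if isinstance(source, str) else source
            if not values <= sol[target]:
                sol[target] |= values
                changed = True
    return sol


solution = solve(CONSTRAINTS, NODES)
for name in NODES:
    print(name, "=", sorted(solution[name]))
```

        Because every step only adds values that some constraint
        forces, the result is the smallest (C,p) satisfying this list.
        Here C(5) ends up containing only fn y => [y]^3, and p(y) stays
        empty since fn y is never in C(2).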