COP 5021 meeting -*- Outline -*-

* constraint based analysis (1.4)

** goals

        The main goal is control flow analysis...

        Q: What's the difference between a data flow analysis
           and a control flow analysis?

------------------------------------------
   DATA FLOW VS. CONTROL FLOW ANALYSIS

Main difference?

------------------------------------------

        ... in a data flow analysis we're interested in
            properties of variables and other data

        ... in a control flow analysis we are interested in how
            control passes from one elementary block to another.

        Q: Isn't control flow obvious in all languages?

        no, not with:
          - lack of structure, such as go to
          - or with advanced control features, such as lambda,
            or object oriented dispatch

        In such languages, the number of successors and
        predecessors of a node is no longer small.

        Control information is needed for interprocedural flow analysis.

** setting

------------------------------------------
                SETTING

1. Convert all control structures to
   functions and function calls.

2. Analysis finds what functions can be called
   from each point

------------------------------------------

        Q: Why switch to a functional language for this section?

        Because a language with first-class functions can simulate
        all other control structures

------------------------------------------
      CONTINUATION PASSING STYLE

An intermediate language with one control structure

Idea: every expression takes a "continuation"
      to which it sends its result

Examples:

   x < 0
   ==> [fn k => [[[%< x] 0] k]]

   if [x < 0] then [y := 22] else [z := 33]
   ==> [fn k => [[[%< x] 0]
                 [%if [[y := 22] k]
                      [[z := 33] k]]]]

------------------------------------------

        Explain how the primitives manipulate the argument continuation

        You can think of the continuation as printing or returning
        the final result or passing control to it

------------------------------------------
         LANGUAGE (p. 140)

Work in a functional language:

 e \in Exp
 t \in Term
 f,x \in Var
 c \in Const
 op \in Op
 l \in Lab

 e ::= t^l
 t ::= c | x
     | fn x => e_0         "non-recursive fun"
     | fun f x => e_0      "recursive fun def"
     | e_1 e_2
     | if e_0 then e_1 else e_2
     | let x = e_1 in e_2
     | e_1 op e_2
------------------------------------------

        Q: What are the atomic blocks being labeled here?

        all subexpressions

** idea

        Q: What are the main ideas in this approach?

------------------------------------------
   IDEAS OF CONSTRAINT BASED ANALYSIS

- assume no side effects
  ==> associate information with labels

- use a pair of functions, (C,p):

    C: Lab* -> Powerset(Value)
       C(l) contains possible values
            for subexpression at label l

    p: Var* -> Powerset(Value)
       p(x) contains possible values
            for variable x
------------------------------------------

        "C" is an "abstract cache"
        "p" is an "abstract environment"

        Q: What's the alternative to associating information
           directly with labels?

        it would be associating information with entries and exits

        Q: How could this information be useful in an
           object-oriented program?

        to know what code is called in a dynamic dispatch
        (e.g., through an interface in a Java-like language)

        Q: How is this different from a type system?

        not much, but we're allowing ourselves more flexibility
        by using sets of values instead of "types"

------------------------------------------
              APPROACH

- collect constraints

  for function abstractions:
     e.g., given [fn x => [x]^1]^2
           get {[fn x => [x]^1]} \subseteq C(2)
------------------------------------------

        Q: What's the value of a function definition?

        A term (representing a closure)

        Q: Why do they just use the term instead of a closure?

        Because the abstract environment contains all the necessary
        values at every program point.
        Also because it would be impossible to precisely determine
        the actual environment at every program point

        Q: What's the general pattern here?
------------------------------------------
  for variables:
     e.g., given [x]^1
           get p(x) \subseteq C(1)

  for applications:
     e.g., given [[f]^1 [e]^2]^3
           get {v | g \in C(1), a \in C(2), and v = (g a)}
               \subseteq C(3)
------------------------------------------

        Q: What's the general rule?

        Use the cache symbolically...

        That is:
          The set containing the fn term at a given label is a
          subset of the cache entry for that label.
          When using a variable, we use all the possible values
          it may have.

        Q: What happens in
              [[fn x => [x]^1]^2 [fn y => [y]^3]^4]^5 ?

        the book uses conditional constraints for this
              {fn x => [x]^1} \subseteq C(2) ==> C(4) \subseteq p(x)

        what's the least solution?

        Q: What would be the constraints for an expression of the form
              [[e1]^1 op [e2]^2]^3 ?

        Q: What would be the constraints for an if-then-else expression?

        Q: What would be the constraints for a let expression, like
              let x = e1 in e2 ?
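
        The constraints for the application example
              [[fn x => [x]^1]^2 [fn y => [y]^3]^4]^5
        can be generated and solved mechanically. What follows is a
        minimal sketch in Python, not the book's algorithm. It makes
        these assumptions: only variables, fn abstractions, and
        applications are handled (no fun, if, let, op, or constants);
        the names Var, Fn, App, constraints, and least_solution are
        hypothetical; the cache C and environment p are keyed as
        ("C", label) and ("p", name); and the solver is a naive
        iterate-until-stable loop rather than a worklist.

```python
from collections import defaultdict
from dataclasses import dataclass


# Labeled terms; every node carries its own label l (the "^l" in the notes).
@dataclass(frozen=True)
class Var:
    name: str
    label: int


@dataclass(frozen=True)
class Fn:
    param: str
    body: object  # a labeled term
    label: int


@dataclass(frozen=True)
class App:
    fun: object
    arg: object
    label: int


def subterms(e):
    """Yield e and all labeled subexpressions of e."""
    yield e
    if isinstance(e, Fn):
        yield from subterms(e.body)
    elif isinstance(e, App):
        yield from subterms(e.fun)
        yield from subterms(e.arg)


def constraints(program):
    """Collect constraints over nodes ("C", label) and ("p", variable)."""
    terms = list(subterms(program))
    fns = [t for t in terms if isinstance(t, Fn)]
    out = []
    for e in terms:
        if isinstance(e, Fn):
            # {fn x => e0} \subseteq C(l)
            out.append(("elem", e, ("C", e.label)))
        elif isinstance(e, Var):
            # p(x) \subseteq C(l)
            out.append(("sub", ("p", e.name), ("C", e.label)))
        else:
            # Application: one pair of conditional constraints per fn term t.
            guard = ("C", e.fun.label)
            for t in fns:
                # t in C(l1) ==> C(l2) \subseteq p(x)
                out.append(("cond", t, guard, ("C", e.arg.label), ("p", t.param)))
                # t in C(l1) ==> C(l0) \subseteq C(l)
                out.append(("cond", t, guard, ("C", t.body.label), ("C", e.label)))
    return out


def least_solution(cs):
    """Grow all sets from empty until no constraint adds anything."""
    sol = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for c in cs:
            if c[0] == "elem":
                new, dst = {c[1]}, c[2]
            elif c[0] == "sub":
                new, dst = sol[c[1]], c[2]
            else:
                _, t, guard, src, dst = c
                new = sol[src] if t in sol[guard] else set()
            if not new <= sol[dst]:
                sol[dst] |= new
                changed = True
    return sol


# The example from the notes: [[fn x => [x]^1]^2 [fn y => [y]^3]^4]^5
fx = Fn("x", Var("x", 1), 2)
fy = Fn("y", Var("y", 3), 4)
sol = least_solution(constraints(App(fx, fy, 5)))
for key in [("C", 1), ("C", 2), ("C", 3), ("C", 4), ("C", 5), ("p", "x"), ("p", "y")]:
    print(key, sorted(f"fn {t.param}" for t in sol[key]))
```

        In the solution this sketch computes, C(2) holds only the
        fn x term; C(1), C(4), C(5), and p(x) hold only the fn y
        term; C(3) and p(y) stay empty. The fn y term is never
        applied, so its guard never fires and nothing flows into y.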