SEMANTICS AND ABSTRACT INTERPRETATION

Language semantics (big step)

   <P,s> -->* s'

   P |- s ~~> s'

Abstract interpretation

   P |- l_1 |> l_2

   f_p(l_1) = l_2


         MUNDANE

Def: a *mundane* analysis or approach is
     one that is first-order. That is, the
     properties describe sets of values.

     e.g., shape analysis,
           constant propagation


        EXAMPLE 4.1

For constant propagation:

Semantics is:

  S* |- s1 ~~> s2

   means <S*,s1> -->* s2

Analysis is:

 S* |- \hat{s1} |> \hat{s2}

  means \iota = \hat{s1}
     /\ \hat{s2} = \bigsqcup {CP.(l) |
                         l in final(S*)}


       CORRECTNESS RELATIONS (4.1.1)

def: a *correctness relation* has type

      V x L -> Boolean

     It says what properties safely
     describe a given value, and must be
     preserved by computation/analysis:

  (v1 R l1 /\ p |- v1 ~~> v2
           /\ p |- l1 |>  l2)
     ==> v2 R l2                    (4.3)

Picture:

    p |- l1  |>   l2

         R   ==>  R

    p |- v1  ~~>  v2

 CORRECTNESS FOR ORDERED PROPERTY SPACES

Suppose L = (L, <=) is a complete lattice.
Then we require:

    v R l1 /\ l1 <= l2  ==>  v R l2  (4.4)

 (\forall l \in L' \subseteq L :: v R l)
          ==> v R (\bigmeet L')      (4.5)

     CONSTANT PROPAGATION (EXAMPLE 4.3)

s R_CP \hat{s} iff 
     (\forall x \in Var* ::
          (\hat{s}(x) = \top
           \/ s(x) = \hat{s}(x)))
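
A minimal sketch of R_CP as an executable
check. The dictionary encoding of states and
the marker TOP are assumptions made only for
illustration.

```python
TOP = "TOP"  # assumed marker for \top ("not a known constant")

def r_cp(s, s_hat):
    """s R_CP s_hat: each variable is TOP in s_hat or agrees with s."""
    return all(s_hat[x] == TOP or s[x] == s_hat[x] for x in s)

state = {"x": 1, "y": 7}
print(r_cp(state, {"x": 1, "y": TOP}))   # True
print(r_cp(state, {"x": 2, "y": TOP}))   # False
```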


    REPRESENTATION FUNCTIONS (4.1.2)

def: a *representation function* maps
     a value to the best property
     describing it.

     It must be preserved by computation
     in the following sense:

     (b(v1) <= l1 /\ p |- v1 ~~> v2
                 /\ p |- l1 |> l2)
       ==> b(v2) <= l2               (4.6)

Picture:

    p |- l1  |>   l2
          ^        ^
         b|   ==>  |b
          |        |
    p |- v1  ~~>  v2

      CORRECTNESS VIA REPRESENTATION
            AND VICE VERSA

def: R_b is the correctness relation
     generated by b:

     v R_b l <==> b(v) <= l

def: b_R is the representation function
     generated by R:

     b_R(v) = \bigmeet { l | v R l }

Lemma 4.5
 (i) R_b satisfies (4.4) and (4.5),
    and b_{R_b} = b
(ii) if R satisfies (4.4) and (4.5),
    then b_R is well-defined
    and R_{b_R} = R
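
Lemma 4.5 (i) can be checked by brute force on
a small example. The sketch below assumes a
flat lattice over the values 0, 1, 2 and takes
b to be the identity; all names are
hypothetical.

```python
BOT, TOP = "BOT", "TOP"
L = [BOT, 0, 1, 2, TOP]   # assumed flat lattice over 0, 1, 2
V = [0, 1, 2]

def leq(a, b):
    return a == BOT or b == TOP or a == b

def meet(elems):
    """Greatest lower bound of elems, found by searching L."""
    lower = [c for c in L if all(leq(c, e) for e in elems)]
    return next(c for c in lower if all(leq(d, c) for d in lower))

def b(v):
    return v

def R_b(v, l):
    """Correctness relation generated by b."""
    return leq(b(v), l)

def b_R(R, v):
    """Representation function generated by R."""
    return meet([l for l in L if R(v, l)])

# (4.4): R_b is upward closed in its second argument
cond_44 = all(R_b(v, l2) for v in V for l1 in L for l2 in L
              if R_b(v, l1) and leq(l1, l2))
# b_{R_b} = b
round_trip = all(b_R(R_b, v) == b(v) for v in V)
print(cond_44, round_trip)   # True True
```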

    CONSTANT PROPAGATION (EXAMPLE 4.6)

b_CP: State -> \hat{State_CP}
b_CP(s) = s

So R_CP is defined by:

   s R_CP \hat{s} <==> b_CP(s) <= \hat{s}


         GENERALIZATION (4.1.3)


In
    p |- v1 ~~> v2

allow v1 in V1, v2 in V2, and V1 <> V2

In
    f_p(l1) = l2

allow l1 in L1, l2 in L2, and L1 <> L2


So get 2 correctness relations:

 R1: V1 x L1 -> Boolean
     generated by b1: V1 -> L1

 R2: V2 x L2 -> Boolean 
     generated by b2: V2 -> L2

Logical relationship:

             f_p
         l1  -->  l2

         R1  ==>  R2

    p |- v1  ~~>  v2

def: (R1 ->> R2) is a relation defined by

   (p |- . ~~> .) (R1 ->> R2) f_p
<==>
    (\forall v1, v2, l1 ::
      (p |- v1 ~~> v2) /\ v1 R1 l1
          ==> v2 R2 f_p(l1))

      INTERVAL LATTICE (EXAMPLE 4.10)

Interval = { _|_ }
    \cup {[z1,z2] | z1 <= z2, z1 in Z-,
                    z2 in Z+}
Z- = Z \cup {-\infty}
Z+ = Z \cup {\infty}

  _|_ denotes the empty interval

<= ordering on Interval is:

   int1 <= int2 <==>
      inf(int2) <= inf(int1)
      /\ sup(int1) <= sup(int2)

where (for integers z1, z2):

   inf(_|_) = \infty
   inf([z1,z2]) = z1
   sup(_|_) = -\infty
   sup([z1,z2]) = z2
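
A small sketch of the Interval lattice,
assuming the usual containment ordering
(int1 <= int2 iff int2 contains int1). The
encoding (None for _|_, pairs for [z1,z2],
floats for the infinities) is an assumption.

```python
import math

BOT = None  # the empty interval _|_

def inf(i):
    return math.inf if i is BOT else i[0]

def sup(i):
    return -math.inf if i is BOT else i[1]

def leq(i1, i2):
    """i1 <= i2 iff i2 contains i1."""
    return inf(i2) <= inf(i1) and sup(i1) <= sup(i2)

print(leq((1, 2), (0, 3)))   # True
print(leq((0, 3), (1, 2)))   # False
print(leq(BOT, (0, 0)))      # True
```

Defining inf(_|_) = \infty and
sup(_|_) = -\infty makes _|_ the least element
without a special case in leq.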

      WHY FIXED POINTS?

Analysis transforms properties:

         f: L -> L

where f is monotone.

E.g., for reaching definitions:

 F(RD_1,...,RD_n) =
   (F_1(RD_1,...,RD_n), ...,
    F_n(RD_1,...,RD_n))

Want least fixed point, lfp(f) for:

     - recursive programs
     - programs with loops

But iterating doesn't necessarily:
    - reach a fixed point (stabilize)
    - stabilize at the least fixed point

           IDEA

How to approximate lfp(f)?

  use sequence (f^n_V)n

    - which must stabilize
    - which will safely approximate lfp(f)

The V (\nabla) is a widening operator

      UPPER BOUND OPERATORS

def: Suppose (L,<=) is a complete lattice.
     Then an operation
       ub: L x L -> L
     is an upper bound operator iff
     for all l1, l2 in L,

       l1 <= ub(l1,l2)

     and

       l2 <= ub(l1,l2).

Example (4.12):

     Let int be a fixed interval
         e.g., int02 = [0,2]

   define:

     ub^int(int1, int2) =
        if int1 <= int or int2 <= int1
        then int1 |_| int2
        else [-\infty, \infty]

     e.g., with int02 = [0,2]
       ub^int02(int1, int2) =
         if int1 <= [0,2] or int2 <= int1
         then int1 |_| int2
         else [-\infty, \infty]

      so ub^int02([1,2],[2,3]) = [1,3]

      but ub^int02([2,3],[1,2]) =
                        [-\infty, \infty]
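
The two cases of Example 4.12 can be evaluated
mechanically. This is a sketch; the interval
encoding (None for _|_, pairs, float
infinities) is an assumption.

```python
import math

BOT = None

def leq(a, b):
    if a is BOT:
        return True
    if b is BOT:
        return False
    return b[0] <= a[0] and a[1] <= b[1]

def join(a, b):
    if a is BOT:
        return b
    if b is BOT:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def ub_int(fixed, int1, int2):
    """ub^int for the fixed interval `fixed`."""
    if leq(int1, fixed) or leq(int2, int1):
        return join(int1, int2)
    return (-math.inf, math.inf)

print(ub_int((0, 2), (1, 2), (2, 3)))   # (1, 3)
print(ub_int((0, 2), (2, 3), (1, 2)))   # (-inf, inf)
```

The operator is not symmetric: swapping the
arguments changes which side is compared
against the fixed interval.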


      MAKING ASCENDING CHAINS

def: Let (l_n)n = (l_0, l_1, ...)
 be a sequence of elements in L.
 Let phi: (L x L) -> L be a total function
 Then 

  bapply(phi, (l_n)n) = (m_n)n
  where m_0 = l_0
        m_n = phi(m_{n-1}, l_n), for n > 0

Notation:

  bapply(phi, (l_n)n) is written
   (l^{phi}_n)n 


Fact 4.11 If (l_n)n is a sequence and
  ub is an upper bound operator,
 then bapply(ub, (l_n)n) is
 an ascending chain.
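
A sketch of bapply on a finite prefix of a
sequence. Using sets ordered by inclusion,
with union as the upper bound operator, is an
assumption chosen for illustration.

```python
def bapply(phi, ls):
    """m_0 = l_0, m_n = phi(m_{n-1}, l_n) for n > 0."""
    ms = []
    for l in ls:
        ms.append(l if not ms else phi(ms[-1], l))
    return ms

def union(a, b):
    return a | b   # an upper bound operator on sets

seq = [{1}, {2}, {1}, {3}]          # not ascending
chain = bapply(union, seq)
print(chain)   # [{1}, {1, 2}, {1, 2}, {1, 2, 3}]
```

The input sequence is not ascending, but the
result is, as Fact 4.11 states.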

        WIDENING OPERATORS

def: Let L be a complete lattice.
Then V: L x L -> L 
                 is a *widening operator*
iff:
  - V is an upper bound operator, and
  - for all ascending chains (l_n)n,
    the chain bapply(V, (l_n)n)
    eventually stabilizes

USING WIDENING TO SAFELY APPROXIMATE LFP

Given: monotone f: L -> L
       widening operator V: L x L -> L

Goal: find lfp_V(f), such that:

  (a) f(lfp_V(f)) <= lfp_V(f),  and

  (b) lfp_V(f) >= lfp(f)

Define lfp_V(f) = f_V^m, where

   m >= 0 is the least number such that:

    f(f_V^m) <= f_V^m

   where for all n >= 0

    f_V^0 = _|_

    f_V^{n+1} = f_V^{n},
          if f(f_V^{n}) <= f_V^{n}

    f_V^{n+1} = f_V^{n} V f(f_V^{n}),
          otherwise
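
The iteration above can be sketched on
intervals. Everything concrete here is an
assumption: the encoding, the simple widening
(unstable bounds jump to infinity), and the
transfer function f, which models the loop
head of "x = 0; while x < 100: x = x + 1".

```python
import math

BOT = None

def leq(a, b):
    if a is BOT:
        return True
    if b is BOT:
        return False
    return b[0] <= a[0] and a[1] <= b[1]

def join(a, b):
    if a is BOT:
        return b
    if b is BOT:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(a, b):
    """Assumed widening: keep stable bounds, drop unstable ones."""
    if a is BOT:
        return b
    if b is BOT:
        return a
    lo = a[0] if a[0] <= b[0] else -math.inf
    hi = a[1] if b[1] <= a[1] else math.inf
    return (lo, hi)

def f(x):
    """Hypothetical f(X) = [0,0] join ((X meet [-inf,99]) + 1)."""
    if x is BOT or x[0] > 99:
        return (0, 0)
    return join((0, 0), (x[0] + 1, min(x[1], 99) + 1))

def lfp_widen(f, widen, bottom=BOT):
    """Iterate f_V^n until f(f_V^m) <= f_V^m."""
    x = bottom
    while not leq(f(x), x):
        x = widen(x, f(x))
    return x

print(lfp_widen(f, widen))   # (0, inf)
```

Here lfp(f) = [0,100], so the result [0,\infty]
satisfies (a) and (b) but is not the least
fixed point.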

           EXAMPLE 4.15

Consider lattice Interval.

For K a finite set of integers,

 widening operator V_K defined by:

 _|_ V_K _|_ = _|_
int1 V_K int2 =
   [LB_K(inf(int1), inf(int2)),
    UB_K(sup(int1), sup(int2))]

where

 LB_K(z1,z3) =
     z1,  if z1 <= z3
     k,   if z3 < z1
             /\ k = max{k \in K | k <= z3}
-\infty, if z3 < z1
             /\ (\forall k \in K :: z3 < k)

 UB_K(z2,z4) =
     z2,  if z4 <= z2
     k,   if z2 < z4
             /\ k = min{k \in K | z4 <= k}
\infty, if z2 < z4
             /\ (\forall k \in K :: k < z4)

E.g., suppose K = {5, 0, 2, 1},
      and consider (int_n)n defined by
       [0,1],[0,2],[0,3],...
      then (int^{V_K}_n)n is:
       [0,1],[0,2],[0,5],[0,5],[0,5],
       [0,\infty],[0,\infty],...
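
The widened chain can be computed directly
from the definitions of LB_K and UB_K. A
sketch, assuming intervals are encoded as
pairs with None for _|_ and that the input
chain continues [0,4],[0,5],...

```python
import math

BOT = None

def inf(i):
    return math.inf if i is BOT else i[0]

def sup(i):
    return -math.inf if i is BOT else i[1]

def lb_k(K, z1, z3):
    if z1 <= z3:
        return z1
    below = [k for k in K if k <= z3]
    return max(below) if below else -math.inf

def ub_k(K, z2, z4):
    if z4 <= z2:
        return z2
    above = [k for k in K if z4 <= k]
    return min(above) if above else math.inf

def widen_k(K, int1, int2):
    if int1 is BOT and int2 is BOT:
        return BOT
    return (lb_k(K, inf(int1), inf(int2)),
            ub_k(K, sup(int1), sup(int2)))

def bapply(phi, ls):
    ms = []
    for l in ls:
        ms.append(l if not ms else phi(ms[-1], l))
    return ms

K = {5, 0, 2, 1}
ints = [(0, n) for n in range(1, 8)]     # [0,1] ... [0,7]
widened = bapply(lambda a, b: widen_k(K, a, b), ints)
print(widened)
# [(0, 1), (0, 2), (0, 5), (0, 5), (0, 5), (0, inf), (0, inf)]
```

Upper bounds move only to the next element of
K, and to \infty once K is exhausted, so the
chain must stabilize.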


        NARROWING OPERATORS (4.2.2)

Widening operator V gives an m such that

   f(f_V^m) <= f_V^m

Note that

 - f_V^m may not be a fixed point of f

 - f_V^m >= lfp(f)

Goal: get better approx to lfp(f)

Idea: f_V^m in Red(f), where
      Red(f) = { l | f(l) <= l }

So search by computing 

   f(f_V^m)
   f(f(f_V^m))
   ...
   f^n(f_V^m)


            NARROWING OPERATOR

def: D: L x L -> L is a narrowing operator
    iff:

 - for all l1, l2 in L,
   l2 <= l1 ==> l2 <= (l1 D l2)
                and (l1 D l2) <= l1

 - for all descending chains (l_n)n,
   the sequence bapply(D, (l_n)n)
   eventually stabilizes.
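
A sketch of refining a widening result with a
narrowing operator, iterating
x := x D f(x) until it is stable. The
narrowing used (only infinite bounds are
improved), the encoding, and the transfer
function f for "x = 0; while x < 100:
x = x + 1" are all assumptions.

```python
import math

BOT = None

def join(a, b):
    if a is BOT:
        return b
    if b is BOT:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def narrow(a, b):
    """Assumed narrowing: replace only infinite bounds of a."""
    if a is BOT or b is BOT:
        return BOT
    lo = b[0] if a[0] == -math.inf else a[0]
    hi = b[1] if a[1] == math.inf else a[1]
    return (lo, hi)

def f(x):
    """Hypothetical f(X) = [0,0] join ((X meet [-inf,99]) + 1)."""
    if x is BOT or x[0] > 99:
        return (0, 0)
    return join((0, 0), (x[0] + 1, min(x[1], 99) + 1))

def refine(f, narrow, start):
    """Iterate x := x D f(x) from a point in Red(f)."""
    x = start
    while True:
        nxt = narrow(x, f(x))
        if nxt == x:
            return x
        x = nxt

print(refine(f, narrow, (0, math.inf)))   # (0, 100)
```

Starting from [0,\infty], one narrowing step
recovers [0,100], which here is a fixed point
of f.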


         GALOIS CONNECTIONS (4.3)

Motivation:

  Collecting semantics:

   - obviously correct,
   - view as a lattice, L

  but it's

    - costly, and/or
    - nonterminating/uncomputable

  So do analysis in another lattice, M

Relationship:

  abstraction function
     a: L -> M

  concretization function
     g: M -> L

          DEFINITION

Def: Let L and M be complete lattices.
Then (L, a, g, M) is a Galois connection
iff
   a: L -> M and g: M -> L are monotone
and
   g o a >= id_L               (4.8)

   a o g <= id_M               (4.9)

          ADJUNCTIONS

Def: Let (L,<=_L) and (M,<=_M)
     be complete lattices.
Then (L, a, g, M) is an adjunction iff
   a: L -> M and g: M -> L are total
and
   for all l in L and m in M:
    a(l) <=_M m   iff    l <=_L g(m).


Prop 4.20. (L, a, g, M) is a
Galois connection iff it is an adjunction.
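
Both definitions can be checked by brute force
on a finite example. The sketch assumes
L = Powerset({-1,0,1}) ordered by inclusion
and a flat sign lattice M; the names are
hypothetical.

```python
from itertools import combinations

VALUES = (-1, 0, 1)
L = [frozenset(c) for r in range(len(VALUES) + 1)
     for c in combinations(VALUES, r)]
M = ["BOT", "NEG", "ZERO", "POS", "TOP"]

def leq_m(m1, m2):
    return m1 == "BOT" or m2 == "TOP" or m1 == m2

def sign(v):
    return "NEG" if v < 0 else "ZERO" if v == 0 else "POS"

def alpha(s):
    signs = {sign(v) for v in s}
    if not signs:
        return "BOT"
    return signs.pop() if len(signs) == 1 else "TOP"

def gamma(m):
    return frozenset(v for v in VALUES if leq_m(sign(v), m))

monotone = (
    all(leq_m(alpha(a), alpha(b)) for a in L for b in L if a <= b)
    and all(gamma(a) <= gamma(b) for a in M for b in M if leq_m(a, b))
)
galois = (monotone
          and all(l <= gamma(alpha(l)) for l in L)         # (4.8)
          and all(leq_m(alpha(gamma(m)), m) for m in M))   # (4.9)
adjunction = all(leq_m(alpha(l), m) == (l <= gamma(m))
                 for l in L for m in M)
print(galois, adjunction)   # True True
```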

    GALOIS CONNECTIONS DEFINED BY
        EXTRACTION FUNCTIONS

Fact: Suppose b: V -> L is a
representation function.  Then
 (Powerset(V), a, g, L)
is a Galois connection between Powerset(V)
and L, where for all V' \subseteq V
and l \in L:
    a(V') = \join {b(v) | v \in V'}
    g(l) = {v \in V | b(v) <= l}

def: Suppose L = (Powerset(D), <=)
     and eta: V -> D is an
     extraction function.
We define
   b_{eta}: V -> Powerset(D)
by
   b_{eta}(v) = {eta(v)}

Fact:

 (Powerset(V),a_{eta},g_{eta},Powerset(D))

is a Galois connection, where

   a_{eta}(V') = {eta(v) | v \in V'}
   g_{eta}(D') = {v | eta(v) \in D'}
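
A sketch of a_{eta} and g_{eta} for a parity
extraction function. The finite value set V
and the choice of eta are assumptions.

```python
V = range(-3, 4)   # assumed finite set of values

def eta(v):
    return "even" if v % 2 == 0 else "odd"

def a_eta(vs):
    return {eta(v) for v in vs}

def g_eta(ds):
    return {v for v in V if eta(v) in ds}

print(a_eta({1, 3}))              # {'odd'}
print(sorted(g_eta({"odd"})))     # [-3, -1, 1, 3]
```

Here {1,3} is contained in
g_{eta}(a_{eta}({1,3})), which is an instance
of g o a >= id.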


              PROPERTIES

Lemma 4.22: If (L, a, g, M) is a Galois
connection, then:

 (i) a uniquely determines g by
        g(m) = \bigsqcup { l | a(l) <= m }
     and g uniquely determines a by
        a(l) = \bigmeet { m | l <= g(m) }

 (ii) a is completely additive and
      g is completely multiplicative

Lemma 4.23: If a: L -> M is
completely additive, then
there exists g: M -> L such that 
   (L, a, g, M) is a Galois connection

Fact 4.24: If
    (L, a, g, M) is a Galois connection
then
          a o g o a = a
and       g o a o g = g

        GALOIS INSERTIONS (4.3.2)

In a Galois connection

   (L, a, g, M)

the set M may contain "junk" elements.


def: Let L and M be complete lattices.
Then (L, a, g, M) is a Galois insertion
iff it is a Galois connection and

     a o g = id_M
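
A sketch of a "junk" element. It assumes a
value set V with no negative numbers and a
sign extraction function, so the property
"neg" describes nothing; all names are
hypothetical.

```python
from itertools import combinations

V = [0, 1, 2, 3]               # assumed: no negative values occur
D = ["neg", "zero", "pos"]     # "neg" describes no value in V

def eta(v):
    return "neg" if v < 0 else "zero" if v == 0 else "pos"

def alpha(vs):
    return frozenset(eta(v) for v in vs)

def gamma(ds):
    return frozenset(v for v in V if eta(v) in ds)

def subsets(xs):
    return [frozenset(c) for r in range(len(xs) + 1)
            for c in combinations(xs, r)]

def is_insertion(M):
    """Check a o g = id_M on every element of M."""
    return all(alpha(gamma(m)) == m for m in M)

junk = frozenset({"neg"})
print(alpha(gamma(junk)) == junk)               # False
print(is_insertion(subsets(D)))                 # False
print(is_insertion(subsets(["zero", "pos"])))   # True
```

Dropping "neg" from the property space turns
the Galois connection into a Galois insertion.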