SEMANTICS AND ABSTRACT INTERPRETATION Language semantics (big step)
-->* s'
P |- s ~~> s'
Abstract interpretation
P |- l_1 |> l_2
f_p(l_1) = l_2
MUNDANE
Def: a *mundane* analysis or approach is
one that is first-order. That is the
properties describe sets of values
e.g., shape analysis,
constant propagation
EXAMPLE 4.1
For constant propagation:
Semantics is:
S* |- s1 ~~> s2
means -->* s2
Analysis is:
S* |- \hat{s1} |> \hat{s2}
means i = \hat{s1}
/\ s2 = \bigsqcup {CP.(l) |
l in final(S*)}
CORRECTNESS RELATIONS (4.1.1)
def: a *correctness relation* has type
V x L -> Boolean
It says what properties safely
describe a given value, and must be
preserved by computation/analysis:
(v1 R l1 /\ p |- v1 ~~> v2
/\ p |- l1 |> l2)
==> v2 R l2 (4.3)
Picture:
p |- l1 |> l2
R ==> R
p |- v1 ~~> v2
CORRECTNESS FOR ORDERED PROPERTY SPACES
Suppose L = (L, <=) is a complete lattice,
Then we require:
v R l1 /\ l1 <= l2 ==> v R l2 (4.4)
(\forall l \in L' <= L :: v R l)
==> v R (\bigmeet L') (4.5)
CONSTANT PROPAGATION (EXAMPLE 4.3)
s R_CP \hat{s} iff
(\forall x \in Var* ::
(\hat{s}(x) = \top
\/ s(x) = \hat{s}(x)))
REPRESENTATION FUNCTIONS (4.1.2)
def: a *representation function* maps
a value to the best property
describing it.
It must be preserved by computation
in the following sense:
(b(v1) <= l1 /\ p |- v1 ~~> v2
/\ p |- l1 |> l2)
==> b(v2) <= l2 (4.6)
Picture:
p |- l1 |> l2
^ ^
b| ==> |b
| |
p |- v1 ~~> v2
CORRECTNESS VIA REPRESENTATION
AND VICE VERSA
def: R_b is the correctness relation
generated by b:
v R_b l <==> b(v) <= l
def: b_R is the representation function
generated by R:
b_R(v) = \bigmeet { l | v R l }
Lemma 4.5
(i) R_b satisfies (4.4) and (4.5),
and b_{R_b} = b
(ii) if R satisfies (4.4) and (4.5),
then b_R is well-defined
and R_{b_R} = R
CONSTANT PROPAGATION (EXAMPLE 4.6)
b_CP: State -> \hat{State_CP}
b_CP(s) = s
So R_CP is defined by:
GENERALIZATION (4.1.3)
In
p |- v1 ~~> v2
allow v1 in V1, v2 in V2, and V1 <> V2
In
f_p(l1) = l2
allow l1 in L1, l2 in L2, and L1 <> L2
So get 2 correctness relations:
R1: V1 x L1 -> Boolean
generated by b1: V1 -> L1
R2: V2 x L2 -> Boolean
generated by b2: V2 -> L2
Logical relationship:
f_p
l1 --> l2
R1 ==> R2
p |- v1 ~~> v2
def: (R1 ->> R2) is a relation defined by
(p |- . ~~> .) (R1 ->> R2) f_p
<==>
(\forall v1, v2, l1 ::
(p |- v1 ~~> v2) /\ v1 R1 l1
==> v2 R2 f_p(l1))
INTERVAL LATTICE (EXAMPLE 4.10)
Interval = { _|_ }
\cup {[z1,z2] | z1 <= z2, z1 in Z-,
z2 in Z+}
Z- = Z \cup {-\infty}
Z+ = Z \cup {\infty}
_|_ denotes the empty interval
<= ordering on Interval is:
where (for integers z1, z2):
inf(_|_) = \infty
inf([z1,z2]) = z1
sup(_|_) = -\infty
sup([z1,z2]) = z2
WHY FIXED POINTS?
Analysis transforms properties:
f: L -> L
where f is monotone.
E.g., for reaching definitions:
F(RD_1,...,RD_n) =
(F_1(RD_1,...,RD_n), ...,
F_n(RD_1...,RD_n))
Want least fixed point, lfp(f) for:
- recursive programs
- programs with loops
But iterating doesn't necessarily:
- reach a fixed point (stabilize)
- stabilize at the least fixed point
IDEA
How to approximate lfp(f)?
use sequence (f^n_V)n
- which must stabalize
- which will safely approximate lfp(f)
The V (\nabla) is a widening operator
UPPER BOUND OPERATORS
def: Suppose (L,<=) is a complete lattice.
Then an operation
ub: L x L -> L
is an upper bound operator iff
for all l1, l2 in L,
l1 <= ub(l1,l2)
and
l2 <= ub(l1,l2).
Example (4.12):
Let int be a fixed interval
e.g., int02 = [0,2]
define:
ub^int(int1, int2) =
if int1 <= int or int2 <= int1
then int1 |_| int2
else [-\infty, \infty]
e.g., with int02 = [0,2]
ub^int02(int1, int2) =
if int1 <= [0,2] or int2 <= int1
then int1 |_| int2
else [-\infty, \infty]
so ub^int02([1,2],[2,3]) =
but ub^int02([2,3],[1,2]) =
MAKING ASCENDING CHAINS
def: Let (l_n)n = (l_0, l_1, ...)
be a sequence of elements in L.
Let phi: (L x L) -> L be a total function
Then
bapply(phi, (l_n)n) = (m_n)n
where m_0 = l_0
m_n = phi(m_{n-1}, l_n), for n > 0
Notation:
(bapply(phi, (l_n)n) is written
(l^{phi}_n)n
Fact 4.11 If (l_n)n is a sequence and
ub is an upper bound operator,
then (bapply(ub, (l_n)n) is
an ascending chain.
WIDENING OPERATORS
def: Let L be a complete lattice.
Then V: L x L -> L
is a *widening operator*
iff:
- V is an upper bound operator, and
- for all ascending chains (l_n)n,
the chain bapply(V, (l_n)n)
eventually stabilizes
USING WIDENING TO SAFELY APPROXIMATE LFP
Given: monotone f: L -> L
widening operator V: L x L -> L
Goal: find lfp_V(f), such that:
(a) f(lfp_V(f)) <= lfp_V(f), and
(b) lfp_V(f) >= lfp(f)
Define lfp_V(f) = f_V^m, where
m >= 0 is the least number such that:
f(f_V^m) <= f_V^m
where for all n >= 0
f_V^0 = _|_
f_V^{n+1} = f_V^{n},
if f(f_V^{n}) <= f_V^{n}
f_V^{n+1} = f_V^{n} V f(f_V^{n}),
otherwise
EXAMPLE 4.15
Consider lattice Interval.
For K a finite set of integers,
widening operator V_K defined by:
_|_ V_K _|_ = _|_
int1 V_K int2 =
[LB_K(inf(int1), inf(int2)),
UB_K(sup(int1), sup(int2))]
where
LB_K(z1,z3) =
z1, if z1 <= z3
k, if z3 < z1
/\ k = max{k \in K | k <= z3}
-\infty, if z3 < z1 /\ (k \in K ==> z3 < k)
UB_K(z2,z4) =
z2, if z4 <= z2
k, if z2 < z4
/\ k = min{k \in K | z4 <= k}
\infty, if z2 < z4 /\ (k \in K ==> k < z4)
E.g., suppose K = {5, 0, 2, 1},
and consider (int_n)n defined by
[0,1],[0,2],[0,3],...
then (int^{V_K}_n)n is:
NARROWING OPERATORS (4.2.2)
Widening operator V gives an m such that
f(f_V^m) <= f_V^m
Note that
- f_V^m may not be a fixed point of f
- f_V^m >= lfp(f)
Goal: get better approx to lfp(f)
Idea: f_V^m in Red(f)
So search by computing
f(f_V^m)
f(f(f_V^m))
...
f^n(f_V^m)
NARROWING OPERATOR
def: D: L x L -> L is a narrowing operator
iff:
- for all l1, l2 in L,
l2 <= l1 ==> l2 <= (l1 D l2)
and (l1 D l2) <= l1
- for all descending chains (l_n)n,
the sequence bapply(D, (l_n)n)
eventually stabalizes.
GALOIS CONNECTIONS (4.3)
Motivation:
Collecting semantics:
- obviously correct,
- view as a lattice, L
but it's
- costly, and/or
- nonterminating/uncomputable
So do analysis in another lattice, M
Relationship:
abstraction function
a: L -> M
concretization function
g: M -> L
DEFINITION
Def: Let L and M be complete lattices.
Then (L, a, g, M) is a Galois connection
iff
a: L -> M and g: M -> L are monotone
and
g o a >= id_L (4.8)
a o g <= id_M (4.9)
ADJUNCTIONS
Def: Let (L,<=_L) and (M,<=_M)
be complete lattices.
Then (L, a, g, M) is an adjunction iff
a: L -> M and g: M -> L are total
and
for all l in L and m in M:
a(l) <=_M m iff l <=_L g(m).
Prop 4.20. (L, a, g, M) is a
Galois connection iff it is an adjunction.
GALOIS CONNECTIONS DEFINED BY
EXTRACTION FUNCTIONS
Fact: Suppose b: V -> L is a
representation function. Then
(Powerset(V), a, g, L)
is a Galois connection between Powerset(V)
and L, where for all V' <= V and l \in L:
a(V') = \join {b(v) | v \in V'}
g(l) = {v \in V | b(v) <= l}
def: Suppose L = (Powerset(D), <=)
and eta: V -> D.
We define
b_{eta}: V -> Powerset(D)
by
b_{eta}(v) = {eta(v)}
Fact:
(Powerset(V),a_{eta},g_{eta},Powerset(D))
is a Galois connection, where
a_{eta}(V') = {eta(v) | v \in V'}
g_{eta}(D') = {v | eta(v) \in D'}
PROPERTIES
Lemma 4.22: If (L, a, g, M) is a Galois
connection, then:
(i) a uniquely determines g by
g(m) = |_| { l | a(l) <= m }
and g uniquely determines a by
_
a(l) = | | { m | l <= g(m) }
(ii) a is completely additive and
g is completely multiplicative
Lemma 4.23: If a: L -> M is
completely additive, then
there exists g: M -> L such that
(L, a, g, M) is a Galois connection
Fact 4.24: If
(L, a, g, M) is a Galois connection
then
a o g o a = a
and g o a o g = g
GALOIS INSERTIONS (4.3.2)
In a Galois connection
(L, a, g, M)
the set M may contain "junk" elements.
def: Let L and M be complete lattices.
Then (L, a, g, M) is a Galois insertion
iff it is a Galois connection and
a o g = id_M