Finite Model Theory
Lecture 18
Extended 0/1 Laws
Or “Getting Real”
Outline
A better probabilistic model
Probabilities of conjunctive queries
Probabilities for FO
Based on work done with N. Dalvi and G. Miklau, and on papers by Lynch, Shelah and Spencer
Anomalies of 0/1 Laws
Database schema:
Employee(name, city, occupation)
We are not given the instance.
Any person belongs to Employee with $\mu = 1/2$!
The expected size $E[\text{Employee}] = n^3/2 \to \infty$!!
In practice we need conditional probabilities
$\mu(\varphi \mid \psi)$, but they often don't exist [why?]
A Better Model
Postulate that for each $R \in \sigma$
$E[R] = c_R$   (a constant)
This leads to: for each tuple $t$:
$\Pr[t \in R] = c_R / n^a$   where $a = \mathrm{arity}(R)$
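A minimal sketch of this model, assuming a domain {0, ..., n-1} and that tuples are inserted independently of one another (independence and the names `sample_relation` and `expected_size` are assumptions made for illustration):

```python
import itertools
import random


def expected_size(n: int, arity: int, c: float) -> float:
    """Expected number of tuples: n^arity candidate tuples, each with probability c / n^arity."""
    return n**arity * (c / n**arity)


def sample_relation(n: int, arity: int, c: float, rng: random.Random) -> set:
    """Sample one random instance of a relation over the domain {0, ..., n-1}.

    Assumption: each of the n^arity tuples is included independently
    with probability c / n^arity.
    """
    p = c / n**arity
    return {t for t in itertools.product(range(n), repeat=arity) if rng.random() < p}


if __name__ == "__main__":
    rng = random.Random(0)
    sizes = [len(sample_relation(10, 3, 2.0, rng)) for _ in range(200)]
    print("analytic expected size:", expected_size(10, 3, 2.0))
    print("empirical mean size:   ", sum(sizes) / len(sizes))
```

The empirical mean stays close to the constant c regardless of n, in contrast with the n^3/2 growth under the uniform model.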
A Better Model
No more anomalies:
For a given person, the probability of belonging to Employee tends to 0
The expected size is $E[R] = c_R$
Asymptotic conditional probabilities always exist for conjunctive queries
Conjunctive Queries
Have the form:

$\exists x_1 \ldots \exists x_k.(C_1 \wedge \ldots \wedge C_m)$
where each $C_i$ is $R(\ldots)$ or $x_i = x_j$ or $x_i \neq x_j$
Conjunctive Queries
Theorem
For every $Q$ there are numbers $E$, $C$ s.t.:
     $\Pr[Q] = C / n^E + O(1/n^{E+1})$
Corollary: $\Pr[Q_1 \mid Q_2]$ always has a limit
Will show next how to compute C, E
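A short derivation of the corollary from the theorem, as a sketch. It assumes that the coefficient of $Q_2$ is nonzero; the names $C_{12}, E_{12}, C_2, E_2$ are introduced here for illustration. Since $Q_1 \wedge Q_2$ is again a conjunctive query, the theorem applies to both numerator and denominator:

```latex
\Pr[Q_1 \mid Q_2]
  = \frac{\Pr[Q_1 \wedge Q_2]}{\Pr[Q_2]}
  = \frac{C_{12}/n^{E_{12}} + O(1/n^{E_{12}+1})}{C_2/n^{E_2} + O(1/n^{E_2+1})}
  = \frac{C_{12}}{C_2}\, n^{E_2 - E_{12}} \bigl(1 + O(1/n)\bigr)
```

Because $Q_1 \wedge Q_2$ implies $Q_2$, we have $E_{12} \ge E_2$, so the limit is $C_{12}/C_2$ when $E_{12} = E_2$ and $0$ when $E_{12} > E_2$.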
Subgraph Properties
Consider R(x,y);
For every edge, $\Pr(R(u,v)) = c/n^2$
Given $Q$, let $H = Q^{\neq}$ be obtained by adding all predicates of the form $x_i \neq x_j$
H checks for the presence of a subgraph
Subgraph Properties
Example 1:
Q = R(x,y),R(y,z),R(z,x)
$H = Q^{\neq} = R(x,y), R(y,z), R(z,x), x \neq y, y \neq z, z \neq x$
Subgraph Properties
$\Pr(H) = \Pr\left(\bigcup_{u,v,w} H(u,v,w)\right)$

        $\le \sum_{u,v,w} \Pr(H(u,v,w))$

        $= n(n-1)(n-2) \cdot \frac{1}{3} \cdot c^3 / n^6$

        $= \frac{1}{3} c^3 / n^3 + O(1/n^4)$
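To see how quickly the union bound approaches its leading term, here is a small numeric sketch. It assumes, as in the computation above, that each edge is present independently with probability c/n^2 and that every triangle is counted three times among the ordered triples; the function names are illustrative.

```python
def triangle_union_bound(n: int, c: float) -> float:
    """Union bound on Pr(H) for the triangle: (#distinct triangles) * (c / n^2)^3."""
    p = c / n**2
    # n(n-1)(n-2) ordered triples of distinct nodes; each directed triangle
    # appears under 3 rotations, hence the division by 3 (always exact).
    num_triangles = n * (n - 1) * (n - 2) // 3
    return num_triangles * p**3


def triangle_leading_term(n: int, c: float) -> float:
    """Leading term (1/3) c^3 / n^3 of the bound."""
    return c**3 / (3 * n**3)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        ratio = triangle_union_bound(n, 2.0) / triangle_leading_term(n, 2.0)
        print(n, ratio)
```

The ratio equals (n-1)(n-2)/n^2, which is 1 - O(1/n), matching the O(1/n^4) correction term.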
Subgraph Properties
Example 2:
Q = R(x,y),R(y,a),R(b,x)
$H = Q^{\neq} = R(x,y), R(y,a), R(b,x), x \neq y, y \neq a, a \neq x, x \neq b, y \neq b$
Subgraph Properties
$\Pr(H) = \Pr\left(\bigcup_{u,v} H(u,v)\right)$

      $\le \sum_{u,v} \Pr(H(u,v))$

      $= n(n-1) \cdot \frac{1}{1} \cdot c^3 / n^6$

      $= c^3 / n^4 + O(1/n^5)$
Subgraph Properties
Let $Q = G_1, G_2, \ldots, G_m$
Lemma: $\Pr(Q) \le \frac{C}{H} \cdot \frac{1}{n^E}$
Subgraph Properties
Lower bound, for the triangle:
$\Pr(H) = \Pr\left(\bigcup_{u,v,w} H(u,v,w)\right)$
   $\ge \sum \Pr(H(u,v,w)) - \sum \Pr(H(u,v,w) \wedge H(u',v',w'))$
   $= \frac{1}{3} c^3/n^3 + O(1/n^4) - \sum \Pr(HH)$
Subgraph Properties
What is $\sum \Pr(HH)$?  Each term belongs to one of the following cases:
Subgraph Properties
Hence, for the triangle:

 $\Pr(H) \approx \frac{1}{3} c^3/n^3$
This generalizes easily to any subgraph property
Subgraphs with E = 0
H = R(x,y)     E = 2-2 = 0; what is Pr(H)?
H = R(x,y)R(u,v)     E = 4-4 = 0; what is Pr(H)?
H = R(x,y)R(y,z)R(z,x), R(u,v)     E(H) = E(triangle)
The exponent in the theorem is always correct, but the coefficient needs to be adjusted
Conjunctive Queries
Consider the query:
R(x,y),R(y,z),R(z,x)
Any of the variables x,y,z may be equal: results in the following subgraphs:
H1 = R(x,y)R(y,z)R(z,x)    E=6-3=3
H2 = R(x,x)R(x,z)R(z,x)    E=6-2=4
H3 = R(x,x)R(x,x)R(x,x) = R(x,x)   E=2-1=1
Hence $\Pr(Q) \approx \Pr(H_3) \approx c_R/n$
Conjunctive Queries
Now consider
Q =  R(a,x),R(y,b)
Two graphs:
H1 = R(a,x)R(y,b)    E = 4-2=2
H2 = R(a,b)              E = 2
One can prove:
$\Pr(Q) \approx \Pr(H_1) + \Pr(H_2) = (c + c^2)/n^2$
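The exponents in these examples all follow the same pattern: the sum of the arities of the distinct atoms minus the number of variables. The sketch below encodes that pattern as read off from the examples; the function name `exponent` and the tuple-of-names encoding of atoms are assumptions made for illustration, and every atom is taken to be over the single binary relation R.

```python
def exponent(atoms, variables):
    """Exponent E of a subgraph pattern.

    atoms:     iterable of argument tuples, e.g. ("x", "y") for R(x,y)
    variables: set of argument names that are variables (all others are constants)

    E = (sum of arities of the distinct atoms) - (number of variables used).
    """
    distinct_atoms = set(atoms)  # a repeated atom contributes only once
    total_arity = sum(len(a) for a in distinct_atoms)
    used_vars = {arg for a in distinct_atoms for arg in a if arg in variables}
    return total_arity - len(used_vars)


if __name__ == "__main__":
    xyz = {"x", "y", "z"}
    print(exponent([("x", "y"), ("y", "z"), ("z", "x")], xyz))  # triangle H1
    print(exponent([("x", "x"), ("x", "z"), ("z", "x")], xyz))  # H2 with x = y
    print(exponent([("a", "x"), ("y", "b")], {"x", "y"}))       # R(a,x)R(y,b)
    print(exponent([("a", "b")], set()))                         # R(a,b), constants only
```

The subgraphs with the smallest exponent dominate Pr(Q); when several share the minimum, as H1 and H2 do for R(a,x),R(y,b), their coefficients add.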
More General Distributions
[Shelah&Spencer, Lynch]
$\Pr(\text{tuple}) = \beta / n^{\alpha}$
Example: H = triangle
$\Pr(H) \approx n^3 \cdot \frac{1}{3} \cdot \beta^3 / n^{3\alpha} = C / n^E$
Simply redefine $E(H)$ to use $\alpha$
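Spelled out for the triangle, the redefinition reads as follows. The general formula on the first line is an extrapolation from this one example, stated here as an assumption for patterns over a single relation:

```latex
E(H) = \alpha \cdot |\mathrm{edges}(H)| - |\mathrm{nodes}(H)|,
\qquad
\Pr(\text{triangle}) \approx \frac{n^3}{3} \cdot \frac{\beta^3}{n^{3\alpha}}
  = \frac{\beta^3/3}{n^{3\alpha - 3}},
\qquad
C = \frac{\beta^3}{3}, \quad E = 3\alpha - 3
```

With $\alpha = 2$ and $\beta = c$ this gives back $E = 3$ and $C = c^3/3$, the values obtained earlier for the triangle.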
More General Distributions
But there is a problem here; let $\alpha = 3/2$:
Threshold Functions for Subgraphs
[Erdős and Rényi]
Edge probability Pr(t) = p(n) = some function
Main theorem of random graphs:
For any monotone property C there exists a threshold function t(n) s.t.
If $p(n) \ll t(n)$ then $\lim_n \Pr(C) = 0$
If $p(n) \gg t(n)$ then $\lim_n \Pr(C) = 1$
Threshold Functions
[Erdős and Rényi]
The threshold function for subgraph property H is the following:
Let $\alpha = \min_{H_0 \subseteq H} |\mathrm{nodes}(H_0)| / |\mathrm{edges}(H_0)|$
Then $t(n) = 1/n^{\alpha}$
Can derive it from the exponent [ show in class ]
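For small patterns the threshold exponent can be found by brute force. The sketch below assumes the standard form of the subgraph threshold, $t(n) = n^{-1/m(H)}$ where $m(H)$ is the largest value of |edges(H0)| / |nodes(H0)| over nonempty subgraphs H0 of H, and returns $1/m(H)$. The function name and the edge-list encoding are illustrative, and the enumeration is exponential in the number of edges.

```python
from fractions import Fraction
from itertools import combinations


def threshold_exponent(edges):
    """Return 1/m(H): the smallest |nodes(H0)| / |edges(H0)| over all
    nonempty edge subsets H0 of the pattern H, so that t(n) = 1 / n^exponent.

    edges: iterable of (u, v) pairs describing H.
    """
    edge_list = list(set(edges))
    best = None
    for k in range(1, len(edge_list) + 1):
        for sub in combinations(edge_list, k):
            nodes = {v for e in sub for v in e}
            ratio = Fraction(len(nodes), len(sub))
            if best is None or ratio < best:
                best = ratio
    return best


if __name__ == "__main__":
    triangle = [("x", "y"), ("y", "z"), ("z", "x")]
    k4 = list(combinations(range(4), 2))
    print(threshold_exponent(triangle))                 # triangle
    print(threshold_exponent(k4))                       # complete graph on 4 nodes
    print(threshold_exponent(triangle + [("z", "w")]))  # triangle with a pendant edge
```

Taking the extremum over subgraphs matters for unbalanced patterns: a triangle with a pendant edge has the same threshold as the triangle alone, because its densest part is the triangle.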
Extended 0/1 Laws
Shelah and Spencer, and Lynch consider the following general case:
$\Pr(t) = \beta / n^{\alpha}$,  for $\alpha > 0$
Lynch: a logic admits an extended 0/1 law if for each $\varphi$ one of the following holds:
$\Pr(\varphi) \approx C/n^E$,   or
$\Pr(\varphi) < 1/n^E$  for every $E > 0$