Final Exam Solutions
Winter 2001
1.a
select distinct Doc.url
from Doc, Occurs, Word
where Doc.url = Occurs.url and Occurs.word = Word.word and
Word.language = "Spanish"
common mistakes: forget "distinct"
notice: the join with Doc is neccessary, since we haven't said that
url in Occurs is a foreign key in Doc. Still, I haven't taken points
off if you dropped the join with Doc
1.b
select distinct Doc.url
from Doc
where Doc.url not in
(select Occurs.url
from Occurs, Word
where Occurs.word=Word.word
and Word.language != "Spanish")
common mistakes:
i. require documents to be nonempty (an
empty document contains
only Spanish words, hence should be
included)
ii. have an extra join in the inner select
(no need to join
with Doc again: A - A*B = A - B)
note: "distinct" is technically needed, since we didn't say that url
is a key in Doc. No points were taken off
if you dropped
"distinct".
note: other correct solutions are select... EXCEPT select... or
select...where Doc.url NOT IN (select...)
1.c
select distinct Doc.url
from Doc
where url not in (select occurs.url
from occurs, word
where occurs.word = word.word and
word.language = "Spanish" and
occurs.bytepos <= 300)
common mistakes: extra join with Doc in the subquery
note: other correct solutions are select... EXCEPT select... or
select...where Doc.url NOT IN (select...)
1.d
select d1.url, d2.url
from Doc d1, Doc d2, Occurs o1, Occurs o2
where d1.url = o1.url and o1.word = o2.word and o2.url = d2.url
groupby d1.url, d2.url
having count(distinct o1.word) >= 200
common mistakes:
i. extra join with Word
ii. count(*) >= 200 (this includes
pairs of documents that have
few words in common, byt they occur in
many positions)
2.a
/db//b
<b> 1 </b>
<b> 3 </b>
<b> 4 </b>
<b> 5 </b>
/db//c/b
<b> 4 </b>
<b> 5 </b>
/db//c[b]
<c> <b> 4 </b> </c>
<c> <b> 5 </b> </c>
2.b
/db/a/c <= 55
/db/a/[//b] >=1 and <=10
/db/a//*/*/b >=20 and <= 100
(the intent was to ask the question /db/a//*/*/*/b, but one
star
got lost during typing; the answer here is more interesting:
=80)
3.a: left tree: root and the three Red leaves;
right tree: the two Red leaves (not the root !)
3.b: yes, yes, yes, no
4
Partial Datalog(neg):
Win(x) :- Red(x), Leaf(x)
Lose(x) :- Blue(x), Leaf(x)
Win(x) :- Blue(x), R(x,y), Win(y)
Lose(x) :- Red(x), R(x,y), Lose(y)
Win(x) :- Red(x), Not(Lose(x))
Lose(x) :- Blue(x), Not(Win(x))
Inflationary datalog(neg), assuming perfectly balanced tree:
Win(x) :- Red(x), Leaf(x)
Lose(x) :- Blue(x), Leaf(x)
Win(x) :- Blue(x), R(x,y), Win(y)
Lose(x) :- Red(x), R(x,y), Lose(y)
Known(x):- Leaf(x)
Known(x):- R(x,y), Known(y)
Win(x) :- Red(x), Not(Lose(x)), Known(x)
Lose(x) :- Blue(x), Not(Win(x)), Known(x)
5.a:
------------
| a | c |
------------
| e | d |
------------
5.b
i. We have to count the largest number of nodes that are used by n
keys. Notation: ceil(x) means "round x up to the nearest
integer".
- there are ceil(n/2) leaves
- there are x internal nodes; each has 3 outgoing edges except
the root has only two
outgoing edges:
= 3x-1 outgoing edges
the edges' destinations are x-1 internal nodes and ceil(n/2)
leaves:
= x-1 + ceil(n/2) incoming edges
hence:
x =
ceil(ceil(n/2)/2)
- hence total number of splits:
= ceil(n/2) - 1 + ceil(ceil(n/2)/2) - 1
= (approx) n/2 + n/4 - 2 = 3n/4 - 2
ii. We have to count the smallest number of nodes used by n keys. As
above: ceil(n/4) leaves, ceil((ceil(n/4)-1)/4) internal nodes. Hence,
total number of splits is:
= ceil(n/4)-1 + ceil((ceil(n/4)-1)/4) - 1
= (approx) n/4 + n/16 - 1/4 - 2
= 5n/16 - 9/2