Final Exam Solutions

CSE 544

Winter 2001

 



1.a

select distinct Doc.url
from Doc, Occurs, Word
where Doc.url = Occurs.url and Occurs.word = Word.word and
      Word.language = "Spanish"

common mistakes: forget "distinct"
notice: the join with Doc is neccessary, since we haven't said that
url in Occurs is a foreign key in Doc. Still, I haven't taken points
off if you dropped the join with Doc

1.b

select distinct Doc.url
from Doc
where Doc.url not in
        (select Occurs.url
         from Occurs, Word
         where Occurs.word=Word.word and Word.language != "Spanish")

common mistakes:
       i. require documents to be nonempty (an empty document contains
       only Spanish words, hence should be included)
       ii. have an extra join in the inner select (no need to join
       with Doc again:  A - A*B = A - B)

note: "distinct" is technically needed, since we didn't say that url
      is a key in Doc.  No points were taken off if you dropped
      "distinct".

note: other correct solutions are select... EXCEPT select... or
select...where Doc.url NOT IN (select...)

1.c

select distinct Doc.url
from Doc
where url not in (select occurs.url
                  from occurs, word
                  where occurs.word = word.word and
                        word.language = "Spanish" and
                        occurs.bytepos <= 300)

common mistakes: extra join with Doc in the subquery

note: other correct solutions are select... EXCEPT select... or
select...where Doc.url NOT IN (select...)

1.d

select d1.url, d2.url
from Doc d1, Doc d2, Occurs o1, Occurs o2
where d1.url = o1.url and o1.word = o2.word and o2.url = d2.url
groupby d1.url, d2.url
having count(distinct o1.word) >= 200

common mistakes:
       i. extra join with Word
       ii. count(*) >= 200 (this includes pairs of documents that have
       few words in common, byt they occur in many positions)


2.a

/db//b
             <b> 1 </b>
             <b> 3 </b>
             <b> 4 </b>
             <b> 5 </b>


/db//c/b
             <b> 4 </b>
             <b> 5 </b>

/db//c[b]
             <c> <b> 4 </b> </c>
             <c> <b> 5 </b> </c>

2.b

/db/a/c        <= 55
/db/a/[//b]    >=1 and <=10
/db/a//*/*/b   >=20 and <= 100
    (the intent was to ask the question /db/a//*/*/*/b, but one star
    got lost during typing; the answer here is more interesting: =80)


3.a:  left tree: root and the three Red leaves;
      right tree: the two Red leaves (not the root !)

3.b: yes, yes, yes, no


4

Partial Datalog(neg):

Win(x)  :- Red(x), Leaf(x)
Lose(x) :- Blue(x), Leaf(x)
Win(x)  :- Blue(x), R(x,y), Win(y)
Lose(x) :- Red(x), R(x,y), Lose(y)
Win(x)  :- Red(x), Not(Lose(x))
Lose(x) :- Blue(x), Not(Win(x))



Inflationary datalog(neg), assuming perfectly balanced tree:

Win(x)  :- Red(x), Leaf(x)
Lose(x) :- Blue(x), Leaf(x)
Win(x)  :- Blue(x), R(x,y), Win(y)
Lose(x) :- Red(x), R(x,y), Lose(y)
Known(x):- Leaf(x)
Known(x):- R(x,y), Known(y)
Win(x)  :- Red(x), Not(Lose(x)), Known(x)
Lose(x) :- Blue(x), Not(Win(x)), Known(x)


5.a:

        ------------
       |  a  |  c  |
        ------------
       |  e  |  d  |
        ------------
5.b

i. We have to count the largest number of nodes that are used by n
keys.  Notation: ceil(x) means "round x up to the nearest integer".

- there are ceil(n/2) leaves

- there are x internal nodes; each has 3 outgoing edges except
          the root has only two outgoing edges:
      = 3x-1 outgoing edges
  the edges' destinations are x-1 internal nodes and ceil(n/2)
  leaves:
               = x-1 + ceil(n/2) incoming edges
          hence:
             x = ceil(ceil(n/2)/2)

- hence total number of splits:
                = ceil(n/2) - 1 + ceil(ceil(n/2)/2) - 1
                = (approx) n/2 + n/4 - 2 = 3n/4 - 2

ii. We have to count the smallest number of nodes used by n keys.  As
above: ceil(n/4) leaves, ceil((ceil(n/4)-1)/4) internal nodes.  Hence,
total number of splits is:

              = ceil(n/4)-1 + ceil((ceil(n/4)-1)/4) - 1
              = (approx) n/4 + n/16 - 1/4 - 2
              = 5n/16 - 9/2