Solution to Homework 3, CSE444, Fall 2003

 

1.      

a.     Heap

Block 0

Block 1

Block 2

Block 3

2

NULL

NULL

2

1

0

NULL

NULL

40

 

 

10

 

 

30

 

 

Free

35

 

 

20

 

 

39

 

 

Free

25

 

 

28

 

 

Free

Free

 

b.     Clustered table

Block 0

Block 1

Block 2

Block 3

2

NULL

NULL

3

3

0

1

2

40

 

 

10

 

 

30

 

 

28

 

 

Free

20

 

 

35

 

 

Free

Free

25

 

 

39

 

 

Free

 

2.     Both “best fit” and “first fit” need a new block only when all blocks are full.  So none saves disk space over the other. Deletions and Insertions are equally efficient in both policies since they require access to 1 block in both policies to remove a record or insert a new one.

 

Updates of variable-length fields are more efficient with “best fit”. This policy ensures each allocated block contains the maximum possible free space on average, thus allowing the records grow and shrink flexibly. As a counterpart, “first-fit” fills up the blocks as fast as possible making changes to the record length much more difficult. Details depend on the specific implementation, but it would inevitably involve much more allocations due insufficient space in a block to handle update and as a consequence more swapping records between blocks.

 

3.      

a.     No Tombstones

·          2 insertions/ 1 deletion per day 

(2*(100+2) - (100+2)) bytes/days = 102 bytes/days 

·          4096 bytes / 102 bytes/days = ~40 days

b.     With Tombstones

·          2 insertions/ 1 deletion per day 

(2*(100+2) - (100)) bytes/days = 104 bytes/days 

·          4096 bytes / 104 bytes/days = ~39 days

4.      

a.     b.


 

5.      

a.     Histogram of bucket sizes using Mr. Frumble’s hash function

 

 

 

b.     Expected running time is 1’000’000 = 100*100^2.

Running time using Mr. Frumble’s hash function is 4,816,030 = 02 + … + 02 + 12 + 42 + 102 + … + 6602 + 6702 + 6602 + … + 102 + 42 + 12 + 02 + …

It is 4.81603 times slower than expected.

 

c.      

                                                           i.      New hash function

unsigned int NewHash(const char *s, unsigned int n)

{     

         unsigned int res = 0;

        s++;

int x = atoi(s);

  res = x % n;

        return res;

}

                                                        ii.      New running time is 1’000’000 = 1002+1002+…+1002

 

                                                    iii.      Histogram of bucket sizes using new hash function