Solution to Homework 3, CSE444, Fall 2003
1.
a. Heap
Block 0 |
Block 1 |
Block 2 |
Block 3 |
||||||||||
2 |
NULL |
NULL |
2 |
1 |
0 |
NULL |
NULL |
||||||
40 |
|
|
10 |
|
|
30 |
|
|
Free |
||||
35 |
|
|
20 |
|
|
39 |
|
|
Free |
||||
25 |
|
|
28 |
|
|
Free |
Free |
||||||
b. Clustered table
Block 0 |
Block 1 |
Block 2 |
Block 3 |
||||||||||||
2 |
NULL |
NULL |
3 |
3 |
0 |
1 |
2 |
||||||||
40 |
|
|
10 |
|
|
30 |
|
|
28 |
|
|
||||
Free |
20 |
|
|
35 |
|
|
Free |
||||||||
Free |
25 |
|
|
39 |
|
|
Free |
||||||||
2. Both “best fit” and “first fit” need a new block only when all blocks are full. So none saves disk space over the other. Deletions and Insertions are equally efficient in both policies since they require access to 1 block in both policies to remove a record or insert a new one.
Updates of variable-length fields are more efficient with “best fit”. This policy ensures each allocated block contains the maximum possible free space on average, thus allowing the records grow and shrink flexibly. As a counterpart, “first-fit” fills up the blocks as fast as possible making changes to the record length much more difficult. Details depend on the specific implementation, but it would inevitably involve much more allocations due insufficient space in a block to handle update and as a consequence more swapping records between blocks.
3.
a. No Tombstones
· 2 insertions/ 1 deletion per day
(2*(100+2) - (100+2)) bytes/days = 102 bytes/days
· 4096 bytes / 102 bytes/days = ~40 days
b. With Tombstones
· 2 insertions/ 1 deletion per day
(2*(100+2) - (100)) bytes/days = 104 bytes/days
· 4096 bytes / 104 bytes/days = ~39 days
4.
a. b.
5.
a. Histogram of bucket sizes using Mr. Frumble’s hash function
b. Expected running time is 1’000’000 = 100*100^2.
Running time using Mr. Frumble’s hash function is 4,816,030 = 02 + … + 02 + 12 + 42 + 102 + … + 6602 + 6702 + 6602 + … + 102 + 42 + 12 + 02 + …
It is 4.81603 times slower than expected.
c.
i. New hash function
unsigned int NewHash(const char *s, unsigned int n)
{
unsigned int res = 0;
s++;
int x = atoi(s);
res = x % n;
return res;
}
ii. New running time is 1’000’000 = 1002+1002+…+1002
iii. Histogram of bucket sizes using new hash function