Solution, Homework 3, CSE444, fall 2003

Solution to Homework 3, CSE444, Fall 2003

a. Heap

Block 0		Block 1		Block 2		Block 3
2	NULL	NULL	2	1	0	NULL	NULL
40		10		30		Free
35		20		39		Free
25		28		Free		Free

b. Clustered table

Block 0		Block 1		Block 2		Block 3
2	NULL	NULL	3	3	0	1	2
40		10		30		28
Free		20		35		Free
Free		25		39		Free

2. Both “best fit” and “first fit” need a new block only when all blocks are full. So none saves disk space over the other. Deletions and Insertions are equally efficient in both policies since they require access to 1 block in both policies to remove a record or insert a new one.

Updates of variable-length fields are more efficient with “best fit”. This policy ensures each allocated block contains the maximum possible free space on average, thus allowing the records grow and shrink flexibly. As a counterpart, “first-fit” fills up the blocks as fast as possible making changes to the record length much more difficult. Details depend on the specific implementation, but it would inevitably involve much more allocations due insufficient space in a block to handle update and as a consequence more swapping records between blocks.

a. No Tombstones

· 2 insertions/ 1 deletion per day

(2*(100+2) - (100+2)) bytes/days = 102 bytes/days

· 4096 bytes / 102 bytes/days = ~40 days

b. With Tombstones

· 2 insertions/ 1 deletion per day

(2*(100+2) - (100)) bytes/days = 104 bytes/days

· 4096 bytes / 104 bytes/days = ~39 days

a. b.

a. Histogram of bucket sizes using Mr. Frumble’s hash function

b. Expected running time is 1’000’000 = 100*100^2.

Running time using Mr. Frumble’s hash function is 4,816,030 = 0² + … + 0² + 1² + 4² + 10² + … + 660² + 670² + 660² + … + 10² + 4² + 1² + 0² + …

It is 4.81603 times slower than expected.

i. New hash function

unsigned int NewHash(const char *s, unsigned int n)

{

unsigned int res = 0;

s++;

int x = atoi(s);

res = x % n;

return res;

}

ii. New running time is 1’000’000 = 100²+100²+…+100²

iii. Histogram of bucket sizes using new hash function