1a. Normalizing by multiplying by 20, we have the following
steps in the Huffman optimization algorithm.

   a  b  c  d  e  f  g  h
   1  2  1  3  6  2  3  2

   (a,c)  b  d  e  f  g  h
     2    2  3  6  2  3  2

   ((a,c),b)  d  e  f  g  h
       4      3  6  2  3  2

   ((a,c),b)  d  e  (f,h)  g
       4      3  6    4    3

   ((a,c),b)  (d,g)  e  (f,h)
       4        6    6    4

   (((a,c),b),(f,h))  (d,g)  e
           8            6    6

   (((a,c),b),(f,h))  ((d,g),e)
           8              12

   ((((a,c),b),(f,h)),((d,g),e))
                 20
In this notation (x,y) denotes a tree with left subtree x and right
subtree y, i.e.,

    /\
   /  \
  x    y

The left branch of a node is labelled '0' and the right branch is
labelled '1'.
1b. The depth of a leaf node is the number of bits in the encoding of the corresponding symbol.

   Symbol  Depth
     a       4
     b       3
     c       4
     d       3
     e       2
     f       3
     g       3
     h       3

Average bits per symbol
= (1/20)*(4*1 + 3*2 + 4*1 + 3*3 + 2*6 + 3*2 + 3*3 + 3*2) = 56/20 = 2.8 bits per symbol
1c. The codes are: a: 0000, b: 001, c: 0001,
d: 100, e: 11, f: 010, g: 101, h: 011.
Therefore the code for efdgh is 11 010 100 101 011 = 11010100101011.
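The construction in 1a can be checked with a short heap-based sketch (the helper name huffman_lengths is ours; the exact codewords depend on how ties are broken, but the optimal average of 2.8 bits per symbol is unique):

```python
import heapq

def huffman_lengths(freqs):
    """Build a Huffman tree with a min-heap and return each symbol's
    depth, i.e. its code length.  Ties break by insertion order."""
    # Heap entries: (weight, tie-breaking counter, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two trees pushes every leaf in them one level deeper.
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

freqs = {'a': 1, 'b': 2, 'c': 1, 'd': 3, 'e': 6, 'f': 2, 'g': 3, 'h': 2}
depths = huffman_lengths(freqs)
print(sum(freqs[s] * depths[s] for s in freqs) / 20)  # -> 2.8
```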
2. The final dictionary is:

   Code  String
    0    a
    1    b
    2    c
    3    ab
    4    bc
    5    cc
    6    cca
    7    aba

The decoded string is abcccababa. (Check the answer by running LZW by hand.)
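A small decoder sketch (function name ours) reproduces this; note the standard special case where a code refers to the dictionary entry still being built:

```python
def lzw_decode(codes, alphabet):
    """Standard LZW decoding, assuming the dictionary is seeded with
    the single-character alphabet in order."""
    table = {i: ch for i, ch in enumerate(alphabet)}
    prev = table[codes[0]]
    out = [prev]
    for code in codes[1:]:
        # Tricky case: the code may name the entry being built right now.
        entry = table[code] if code in table else prev + prev[0]
        out.append(entry)
        table[len(table)] = prev + entry[0]   # add new dictionary entry
        prev = entry
    return ''.join(out)

# The code stream 0 1 2 5 3 7 produces the dictionary and string above.
print(lzw_decode([0, 1, 2, 5, 3, 7], "abc"))  # -> abcccababa
```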
3. Edges are considered for inclusion in the MST in the following order. A "yes" mark indicates that the edge did not form a cycle and was hence included; a "no" means that the edge was rejected.

   Edge    Weight  Included
   (a,d)     1       yes
   (d,g)     1       yes
   (b,d)     2       yes
   (c,f)     2       yes
   (a,b)     2       no
   (c,d)     3       yes
   (d,f)     3       no
   (e,g)     4       yes
   (a,c)     4       no
   (e,d)     5       no
   (f,g)     5       no
   (b,e)     5       no

An MST is formed by the following edges: {(a,d), (d,g), (b,d), (c,f), (c,d), (e,g)}
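This is Kruskal's algorithm; a union-find sketch over the same edge order (already sorted by weight, with the tie-breaking used above) reproduces the table:

```python
def kruskal(edges):
    """Kruskal's algorithm with union-find; `edges` are assumed already
    sorted by weight, in the same tie-breaking order as the table."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    mst = []
    for u, v, w in edges:
        ru, rv = find(u), find(v)
        if ru != rv:            # edge joins two components: include it
            parent[ru] = rv
            mst.append((u, v))
        # otherwise the edge would form a cycle: reject it
    return mst

edges = [('a','d',1), ('d','g',1), ('b','d',2), ('c','f',2), ('a','b',2),
         ('c','d',3), ('d','f',3), ('e','g',4), ('a','c',4), ('e','d',5),
         ('f','g',5), ('b','e',5)]
print(kruskal(edges))
```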
4. Clearly the max clique size is not more than n, and it is not less than 1. We will use the fact that if there is a clique of size k then there is also a clique of every smaller size. Now, suppose we know that the max clique size lies between two bounds, "min" and "max". Define "mid" as ceil((min + max)/2); rounding up guarantees that both of the recursive ranges below are strictly smaller than [min, max]. We call algorithm A on G with k = mid. If A returns "yes", then there is a clique of size mid, hence the new (lower and upper) bounds of the max clique size are "mid" and "max". On the other hand, if A returns "no", then there is no clique of size mid, hence the new bounds are "min" and "mid" - 1. We repeat the same procedure recursively until the lower and upper bounds coincide. Note that this is effectively a binary search for the max clique size in the feasible range. The pseudo-code is as follows:
Algorithm B (Input: G; Output: max clique size of G)
Begin
    min := 1                  // the max clique size cannot be less than 1
    max := n                  // the max clique size cannot be more than n
    FindMaxClique(min, max)   // do a binary search in this range
End

Procedure FindMaxClique(min : integer, max : integer)
Begin
    If (min = max) Then           // base case: the two bounds are the same
        Output min
        Return
    mid := ceil((min + max) / 2)  // round up, so both recursive ranges shrink
    If A(G, mid) Then
        FindMaxClique(mid, max)       // a clique of size mid exists
    Else
        FindMaxClique(min, mid - 1)   // no clique of size mid
End
Analysis: The algorithm does a binary search in the range [1, n] and therefore makes O(log n) recursive calls to FindMaxClique, each of which calls A once. Since A runs in polynomial time, the total running time is O(log n) times the running time of A, and hence polynomial.
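A Python sketch of B, with a brute-force stand-in for the oracle A (in the problem setting, A is the given black-box algorithm; the names here are ours):

```python
from itertools import combinations

def has_clique(adj, k):
    """Oracle A: does the graph `adj` (dict of adjacency sets) contain
    a clique of size k?  Brute force; the point is how B calls it."""
    nodes = list(adj)
    return any(all(v in adj[u] for u, v in combinations(c, 2))
               for c in combinations(nodes, k))

def max_clique_size(adj):
    """Algorithm B: binary search on k, using the oracle."""
    lo, hi = 1, len(adj)
    while lo < hi:
        mid = (lo + hi + 1) // 2   # round up so the range always shrinks
        if has_clique(adj, mid):
            lo = mid               # a clique of size mid exists
        else:
            hi = mid - 1           # no clique of size mid
    return lo

# Example: a triangle {a, b, c} plus a pendant vertex d.
adj = {'a': {'b', 'c'}, 'b': {'a', 'c'}, 'c': {'a', 'b', 'd'}, 'd': {'c'}}
print(max_clique_size(adj))  # -> 3
```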
5.a. Solution space is the set of all valid linear rearrangements.
Thus each node in the solution space is a distinct valid linear arrangement.
b. Move : Take any pair of vertices u and v. Suppose
f(u) = i and f(v) = j. To form a neighbor h of f make h(u) = j and h(v)
= i, leaving everything else the same. That is, we simply exchange u and
v in the linear arrangement. The linear arrangement thus obtained is a
neighbor.
c. Let f and g be any two linear arrangements.
The distance between f and g is the number of vertices v such that f(v)
is different from g(v). If this distance is zero then f and g
are the same. We show that if the distance between f and g is d > 0
then we can find a neighbor h of f such that the distance between h and
g is at most d-1. Let f and g differ on vertex v, so that f(v) = i and
g(v) = j. Let w be the vertex with f(w) = j. We exchange v and w
to form the neighbor h; that is, h(x) = f(x) if x is not one of v or w,
and h(v) = j and h(w) = i. Note that w was already a point of
disagreement: since g(v) = j and g is one-to-one, g(w) != j = f(w).
Hence h has distance at most d-1 from g, because h(v) = g(v) and no
new disagreement is created.
d. An upper bound on the time complexity of a greedy local search:
a move must decrease the cost by at least 1. The minimum cost of a node
(over all possible graphs) is 0 (when there are no edges). The maximum
cost of a node (over all possible graphs) is O(n^3), considering the
clique on n nodes (a little algebra needed here). Hence the maximum
number of moves is O(n^3). Each move considers all of the node's O(n^2)
neighbors and computes the cost of each of them in O(m) time, and is
hence O(m*n^2). Hence the overall algorithm has time complexity O(m*n^5).
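A sketch of the greedy local search over the swap neighborhood from part (b), on a small example; the cost of an arrangement f is the sum over edges of |f(u) - f(v)| (the function names and the example graph are ours):

```python
from itertools import combinations

def mla_cost(edges, f):
    """Cost of a linear arrangement f: sum over edges of |f(u) - f(v)|."""
    return sum(abs(f[u] - f[v]) for u, v in edges)

def greedy_local_search(edges, vertices):
    """Repeatedly apply cost-decreasing swap moves until none exists."""
    f = {v: i + 1 for i, v in enumerate(vertices)}   # arbitrary start
    best = mla_cost(edges, f)
    improved = True
    while improved:
        improved = False
        for u, v in combinations(vertices, 2):
            f[u], f[v] = f[v], f[u]                  # the swap move
            c = mla_cost(edges, f)
            if c < best:
                best = c                             # keep the improvement
                improved = True
            else:
                f[u], f[v] = f[v], f[u]              # undo the swap
    return f, best

# Path graph a-b-c-d, started from a scrambled order.
edges = [('a', 'b'), ('b', 'c'), ('c', 'd')]
f, cost = greedy_local_search(edges, ['b', 'd', 'a', 'c'])
print(cost)  # -> 3, the optimum for a 4-vertex path
```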
6.a. Assuming a gray scale of 8 bits per pixel,
the compression ratio would be 8:0.1 = 80:1.
b. There are a number of ways to achieve .1 bpp in VQ. We need to
solve the equation (log2 n)/(ab) = .1 bpp, where n is the codebook size
and a x b is the block size. Since images are typically a power of two
on a side, it is wise to choose a and b to be powers of two. It is also
a good idea to make the a x b block as square as possible. Finally, you
want a decent-size codebook so that there are enough representatives
around. A decent-size codebook might be 1,024, so that a x b should be
about 100. How about 128, which is 8 x 16? So n = 1,024, a = 8 and
b = 16 gives us .078 bpp. Since we are shooting for .1 bpp, we should
go to a larger codebook, say 4,096. With a = 8 and b = 16 we get
.094 bpp.
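The bpp arithmetic can be checked directly (the helper name is ours):

```python
from math import log2

def bpp(n, a, b):
    """Bits per pixel for a VQ codebook of size n on a x b blocks:
    each block costs log2(n) bits and covers a*b pixels."""
    return log2(n) / (a * b)

print(bpp(1024, 8, 16))  # 10/128 = 0.078125, i.e. ~.078 bpp
print(bpp(4096, 8, 16))  # 12/128 = 0.09375,  i.e. ~.094 bpp
```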
7. At each level, the current low-pass band is replaced by the pairwise
averages (x1+x2)/2 followed by the half-differences (x2-x1)/2.

A : 0 0 1 1 1 1 0 0
1st level : 0 1 1 0   0 0 0 0
2nd level : 1/2 1/2   1/2 -1/2
3rd level : 1/2 0
The three-level wavelet transform yields
1/2 0 1/2 -1/2 0 0 0 0
with 1/2 in the lowest resolution subband.

B : 0 0 0 1 1 1 1 0
1st level : 0 1/2 1 1/2   0 1/2 0 -1/2
2nd level : 1/4 3/4   1/4 -1/4
3rd level : 1/2 1/4
The three-level wavelet transform yields
1/2 1/4 1/4 -1/4 0 1/2 0 -1/2
with 1/2 in the lowest resolution subband.
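A sketch (function name ours) of the same averaging/differencing transform, with exact rational arithmetic:

```python
from fractions import Fraction

def haar(signal, levels):
    """Unnormalized Haar transform: each level replaces the current
    low-pass band with pairwise averages (x1+x2)/2 followed by
    half-differences (x2-x1)/2, as in the worked example."""
    out = [Fraction(x) for x in signal]
    n = len(out)
    for _ in range(levels):
        avg  = [(out[i] + out[i+1]) / 2 for i in range(0, n, 2)]
        diff = [(out[i+1] - out[i]) / 2 for i in range(0, n, 2)]
        out[:n] = avg + diff     # low-pass band first, details after it
        n //= 2
    return out

print(haar([0, 0, 1, 1, 1, 1, 0, 0], 3))  # signal A
print(haar([0, 0, 0, 1, 1, 1, 1, 0], 3))  # signal B
```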
8. The dynamic programming table (match = +2, mismatch = -1, gap = -1) looks like this:

        A   G   A   T   C
    0  -1  -2  -3  -4  -5
A  -1   2   1   0  -1  -2
T  -2   1   1   0   2   1
A  -3   0   0   3   2   1
T  -4  -1  -1   2   5   4
T  -5  -2  -2   1   4   4
C  -6  -3  -3   0   3   6

a. A maximum-score matching is:

A G A T - C
A T A T T C

b. There is more than one max-score matching, since on backtracking we get a graph instead of a single path.
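The table can be regenerated with the standard DP recurrence under the stated scoring (a sketch; function name ours):

```python
def nw_table(s, t, match=2, mismatch=-1, gap=-1):
    """Needleman-Wunsch DP table for aligning s (rows) against t
    (columns), with the scores assumed above."""
    n, m = len(s), len(t)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i * gap            # leading gaps in t
    for j in range(1, m + 1):
        D[0][j] = j * gap            # leading gaps in s
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if s[i-1] == t[j-1] else mismatch
            D[i][j] = max(D[i-1][j-1] + sub,   # match / mismatch
                          D[i-1][j] + gap,     # gap in t
                          D[i][j-1] + gap)     # gap in s
    return D

D = nw_table("ATATTC", "AGATC")
print(D[-1][-1])  # -> 6, the max score in the bottom-right corner
```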
9. Let's name the points:

   a(1,2)  b(3,3)  c(5,5)  d(8,3)  e(7,2)  f(6,5)  g(1,7)  h(4,6)

Sort on x : a(1,2) g(1,7) b(3,3) h(4,6) c(5,5) f(6,5) e(7,2) d(8,3)
Sort on y : a(1,2) e(7,2) b(3,3) d(8,3) c(5,5) f(6,5) h(4,6) g(1,7)
The widest spread is in x, so we choose the root to have axis = x and value = 4. The left subtree will have a, g, b, h and the right subtree will have c, f, e, d.
Left subtree of root:
Sort on x : a(1,2) g(1,7) b(3,3) h(4,6)
Sort on y : a(1,2) b(3,3) h(4,6) g(1,7)
The widest spread is in y, so we choose axis = y and value = 5. The left-left subtree will have a, b and the left-right subtree will have h, g.
Left-left subtree:
Sort on x : a(1,2) b(3,3)
Sort on y : a(1,2) b(3,3)
The widest spread is in x, so we choose axis = x and value = 2. The left-left-left subtree will have a and the left-left-right subtree will have b.
Left-right subtree:
Sort on x : g(1,7) h(4,6)
Sort on y : h(4,6) g(1,7)
The widest spread is in x, so we choose axis = x and value = 3. The left-right-left subtree will have g and the left-right-right subtree will have h.
Right subtree of root:
Sort on x : c(5,5) f(6,5) e(7,2) d(8,3)
Sort on y : e(7,2) d(8,3) c(5,5) f(6,5)
The widest spread is the same in x and y, so we arbitrarily choose axis = x and value = 6. The right-left subtree will have c, f and the right-right subtree will have e, d.
Right-left subtree:
Sort on x : c(5,5) f(6,5)
Sort on y : c(5,5) f(6,5)
The widest spread is in x, so we choose axis = x and value = 5. The right-left-left subtree will have c and the right-left-right subtree will have f.
Right-right subtree:
Sort on x : e(7,2) d(8,3)
Sort on y : e(7,2) d(8,3)
The widest spread is the same in x and y, so we arbitrarily choose axis = x and value = 7. The right-right-left subtree will have e and the right-right-right subtree will have d.
In the end we have the following k-d tree:

                 x,4
               /     \
           y,5        x,6
          /    \     /    \
       x,2    x,3  x,5    x,7
       /  \   /  \  /  \   /  \
      a    b g    h c   f e    d
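A build sketch following the same rule (split on the axis of widest spread, ties going to x, at the median). The split values here are computed as midpoints between the two middle coordinates, so they differ slightly from the hand-picked ones above (4.5 instead of 4, and so on), but the partitions into subtrees are identical:

```python
def build_kdtree(points):
    """Build a k-d tree from (name, (x, y)) pairs.  Internal nodes are
    (axis, value, left, right); leaves are just the point's name."""
    if len(points) == 1:
        return points[0][0]                    # leaf
    xs = sorted(p[1][0] for p in points)
    ys = sorted(p[1][1] for p in points)
    # Axis of widest spread; ties go to x, as in the worked example.
    axis = 1 if ys[-1] - ys[0] > xs[-1] - xs[0] else 0
    pts = sorted(points, key=lambda p: p[1][axis])
    mid = len(pts) // 2
    value = (pts[mid-1][1][axis] + pts[mid][1][axis]) / 2  # midpoint split
    return (('x', 'y')[axis], value,
            build_kdtree(pts[:mid]), build_kdtree(pts[mid:]))

points = [('a', (1, 2)), ('b', (3, 3)), ('c', (5, 5)), ('d', (8, 3)),
          ('e', (7, 2)), ('f', (6, 5)), ('g', (1, 7)), ('h', (4, 6))]
tree = build_kdtree(points)
print(tree)
```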
10.
a. In the best case there will be 2n/B cache misses, where B is the
cache block size in array elements: every B consecutive reads from A
incur one cache miss, and every B consecutive writes to B incur one
cache miss.
b. In the worst case the arrays are allocated in memory so that A[0]
maps to the same location in the cache as B[0]. In this case there will
be 2n cache misses, because each read or write evicts the cache block
that the other array needs next, so every access misses.
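The two cases can be illustrated with a toy direct-mapped cache simulator; the cache parameters and array placements below are our own choices for illustration (word addresses, 8-word blocks, 128 sets, so a 1024-word cache):

```python
def count_misses(addresses, block, sets):
    """Direct-mapped cache: count misses for a sequence of word
    addresses, with `block` words per block and `sets` cache blocks."""
    cache = {}        # set index -> tag currently resident
    misses = 0
    for a in addresses:
        s = (a // block) % sets
        tag = a // (block * sets)
        if cache.get(s) != tag:   # miss: fetch the block, evict the old one
            cache[s] = tag
            misses += 1
    return misses

n, B = 64, 8
# Best case: A at 0..n-1, B at 64..: the arrays map to disjoint sets.
trace_best = [x for i in range(n) for x in (i, 64 + i)]
# Worst case: B exactly one cache size (1024 words) after A, so A[i]
# and B[i] map to the same set and evict each other on every access.
trace_worst = [x for i in range(n) for x in (i, 1024 + i)]
print(count_misses(trace_best, B, 128))   # 2n/B = 16 misses
print(count_misses(trace_worst, B, 128))  # 2n   = 128 misses
```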
11. a. False
b. False
c. True
d. True
e. False
f. False
g. False
h. False
i. True
j. False