CSE 417 Assignment #2
Autumn 2002
Due: Wednesday, October 30, 2002.
- Recall the list partition problem considered in lecture.
- Using the same sample data considered in class, determine the
values of the D matrix computed by the algorithm.
Use these values to determine the partitions of the list that correspond to
each of the entries in the last column of the M matrix.
- We calculated in class that the running time for solving the list
partition problem for partitioning a sequence of length n into k
pieces was O(n2 k).
Show that if all the numbers in the sequence are positive you can improve the
inner loop of the algorithm so that it takes
O(log i) steps in the worst case instead of O(i) and
derive an algorithm that works in O(kn log n) time
overall.
(Hint: notice that in this case
M[pos,j-1] increases as pos increases since all elements are
positive and there are more
elements to fit in the same number of blocks and
p[i]-p[pos] decreases as pos increases since there are fewer
left-over elements.)
- An alignment of two strings
A=a1a2...an and
B=b1b2...bm is
a pair of strings E and F of equal length that contain all the characters in
A and B respectively but also may have interspersed extra copies of
a special blank character that doesn't appear in either A or B.
If you removed all of the blanks from E you would get A and if you removed
all the blanks from F you would get B.
The cost of the alignment is the number of positions where strings E and F
differ. Design an algorithm to find a minimal cost alignment given two
strings A and B -- not just its cost.
Analyze your run-time and try to use as little space as you can.
(Variants of this problem are very important in computational biology where
the strings correspond to DNA sequences of genes or amino acid sequences of
proteins that are from related species. In that case, however, one often also
wants to consider more than just two strings, a problem called multiple string
alignment.)
- The ALGORITHM Design Manual, page 78, Problem 3-6
- (Bonus) The ALGORITHM Design Manual, pages 78-79, Problem 3-7.