CSE 421 Assignment #6
Autumn 2007
Due: Friday, November 16, 2007.
Reading Assignment: Kleinberg and Tardos Chapter 7
Problems: Re-read the Grading Guidelines before answering (For all problems on this homework prove that your algorithm is correct and analyze its worst-case running time.)
- Kleinberg and Tardos, Chapter 6, Problem 13, page 324.
- Kleinberg and Tardos, Chapter 6, Problem 24, pages 331-332.
- When people type, they are much more likely to
transpose two adjacent characters rather than insert, delete, or change them.
Suppose that there is a cost d< 1 per transposition, cost 1
for either an
insertion or deletion and cost eab with 1< eab
for typing an a instead of a b. (The likelihood of a mistyped
character depends on how far apart the characters are on the keyboard.)
Design an algorithm to compute the edit distance
between two strings under this measure and determine the sequence of
errors needed to achieve this distance. Make sure
that the transposition errors you consider do not involve the same letter
twice.
- Extra Credit In this problem you are given as input a binary
tree T=(V,E) with each leaf l labelled by a symbol sl from some fixed alphabet A, as
well as a two-dimensional non-negative array of costs indexed by pairs of
elements of A such that c[s,t] is the cost of changing symbol
s to symbol t and this satisfies c[s,s]=0 and c[s,t]=c[t,s] for all
s,t in A. Design an algorithm to label each internal node v of the binary tree by a symbol sv from A so as to
minimize the sum over all (u,v) in E of c[su,sv].
(This kind of problem arises in computational biology when one wishes to
assess a potential evolutionary tree and reconstruct properties of potential
ancestral species given this tree. In this case each "symbol" might be
a very short sequence - or even just a letter - of DNA or protein that might
at a given position within a longer sequence, and c[s,t] is the cost of the evolutionary change in going from s to t.)