CSE 421 Assignment #5
Spring 2016
Due: Friday, May 13, 2016.
Reading Assignment: Kleinberg and Tardos Chapter 7 (sections 7.1, 7.2, 7.3, 7.5, 7.6, 7.11).
Problems: (see the Grading Guidelines
before answering)
- Kleinberg and Tardos, Chapter 6, Problem 12, pages 323-324.
- Kleinberg and Tardos, Chapter 6, Problem 13, page 324.
- When people type, they are much more likely to
transpose two adjacent characters rather than insert, delete, or change them.
Suppose that there is a cost 1 per transposition, cost 2
for either an
insertion or deletion and cost eab with 1< eab
for typing an a instead of a b. (The likelihood of a mistyped
character depends on how far apart the characters are on the keyboard.)
Each character can be involved in only one transposition. (For example,
changin $abc$ to $bac$ costs 1 transposition, but changing it to $bca$ could
not be done using two transpositions and would instead require substitutions
or insertions and deletions.
Clarification: if symbols participate in a transposition then they must
remain consecutive and unchanged in the modified string.
Design an algorithm to compute the edit distance
between two strings under this measure and determine the sequence of
errors needed to achieve this distance.
- Kleinberg and Tardos, Chapter 7, Problem 5, page 416.
- Kleinberg and Tardos, Chapter 7, Problem 7, pages 417-418.
- Extra Credit Reminder Extra credit programming problem from
Homework #4 (Due: Monday May 9).
- Extra Credit In this problem you are given as input a binary
tree T=(V,E) with each leaf w labelled by a symbol sw from some fixed alphabet A, as
well as a two-dimensional non-negative array of costs indexed by pairs of
elements of A such that c[s,t] is the cost of changing symbol
s to symbol t and this satisfies c[s,s]=0 and c[s,t]=c[t,s] for all
s,t in A. Design an algorithm to label each internal node v of the binary tree by a symbol sv from A so as to
minimize the sum over all (u,v) in E of c[su,sv].
(This kind of problem arises in computational biology when one wishes to
assess a potential evolutionary tree and reconstruct properties of potential
ancestral species given this tree. In this case each "symbol" might be
a very short sequence - or even just a letter - of DNA or protein that might
at a given position within a longer sequence, and c[s,t] is the cost of the evolutionary change in going from s to t.)