CSE 421 Assignment #5
Spring 2016

Due: Friday, May 13, 2016.

Reading Assignment: Kleinberg and Tardos Chapter 7 (sections 7.1, 7.2, 7.3, 7.5, 7.6, 7.11).

Problems: (see the Grading Guidelines before answering)


  1. Kleinberg and Tardos, Chapter 6, Problem 12, pages 323-324.

  2. Kleinberg and Tardos, Chapter 6, Problem 13, page 324.

  3. When people type, they are much more likely to transpose two adjacent characters rather than insert, delete, or change them. Suppose that there is a cost 1 per transposition, cost 2 for either an insertion or deletion and cost eab with 1< eab for typing an a instead of a b. (The likelihood of a mistyped character depends on how far apart the characters are on the keyboard.) Each character can be involved in only one transposition. (For example, changin $abc$ to $bac$ costs 1 transposition, but changing it to $bca$ could not be done using two transpositions and would instead require substitutions or insertions and deletions. Clarification: if symbols participate in a transposition then they must remain consecutive and unchanged in the modified string. Design an algorithm to compute the edit distance between two strings under this measure and determine the sequence of errors needed to achieve this distance.

  4. Kleinberg and Tardos, Chapter 7, Problem 5, page 416.

  5. Kleinberg and Tardos, Chapter 7, Problem 7, pages 417-418.

  6. Extra Credit Reminder Extra credit programming problem from Homework #4 (Due: Monday May 9).

  7. Extra Credit In this problem you are given as input a binary tree T=(V,E) with each leaf w labelled by a symbol sw from some fixed alphabet A, as well as a two-dimensional non-negative array of costs indexed by pairs of elements of A such that c[s,t] is the cost of changing symbol s to symbol t and this satisfies c[s,s]=0 and c[s,t]=c[t,s] for all s,t in A. Design an algorithm to label each internal node v of the binary tree by a symbol sv from A so as to minimize the sum over all (u,v) in E of c[su,sv].

    (This kind of problem arises in computational biology when one wishes to assess a potential evolutionary tree and reconstruct properties of potential ancestral species given this tree. In this case each "symbol" might be a very short sequence - or even just a letter - of DNA or protein that might at a given position within a longer sequence, and c[s,t] is the cost of the evolutionary change in going from s to t.)