CSE 421 Assignment #5
Autumn 2012

Due: Friday, November 9, 2012.

Reading Assignment: Kleinberg and Tardos Chapter 7 (sections 7.1, 7.2, 7.3, 7.5, 7.6, 7.11).

Problems: (see the Grading Guidelines before answering)


  1. Kleinberg and Tardos, Chapter 6, Problem 6, pages 317-318.

  2. When people type, they are much more likely to transpose two adjacent characters rather than insert, delete, or change them. Suppose that there is a cost 1 per transposition, cost 2 for either an insertion or deletion and cost eab with 1< eab for typing an a instead of a b. (The likelihood of a mistyped character depends on how far apart the characters are on the keyboard.) Design an algorithm to compute the edit distance between two strings under this measure and determine the sequence of errors needed to achieve this distance. Make sure that the transposition errors you consider do not involve the same letter twice.

  3. Kleinberg and Tardos, Chapter 6, Problem 13, page 324.

  4. Kleinberg and Tardos, Chapter 7, Problem 23, pages 428-429.

  5. Extra Credit In this problem you are given as input a binary tree T=(V,E) with each leaf w labelled by a symbol sw from some fixed alphabet A, as well as a two-dimensional non-negative array of costs indexed by pairs of elements of A such that c[s,t] is the cost of changing symbol s to symbol t and this satisfies c[s,s]=0 and c[s,t]=c[t,s] for all s,t in A. Design an algorithm to label each internal node v of the binary tree by a symbol sv from A so as to minimize the sum over all (u,v) in E of c[su,sv].

    (This kind of problem arises in computational biology when one wishes to assess a potential evolutionary tree and reconstruct properties of potential ancestral species given this tree. In this case each "symbol" might be a very short sequence - or even just a letter - of DNA or protein that might at a given position within a longer sequence, and c[s,t] is the cost of the evolutionary change in going from s to t.)