Lecture 19 Summary

See the activity summary for an overview of the lecture.


The instructor says that he is continuing the discussion of dynamic programming with the longest common subsequence (LCS) problem. LCS is important in bioinformatics, a field that applies computational methods to biological data such as DNA sequences.

The instructor tells students to keep in mind that in real applications the strings are tens or hundreds of thousands of characters long.


Here the instructor explains LCS.


This is an activity slide. You can stop the video at 3:18.

The instructor begins displaying and discussing at 5:06.

Student Submission



Here the instructor presents a more general version of LCS that is useful in biology.

The main point here is that we can make a basic algorithm more powerful so that it can apply to a bigger problem.

The instructor also makes the point that gamma and delta could have positive or negative values.



The key here is that there are two cases: matching aj and bk or not matching aj and bk.
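The two cases can be sketched in code. This is a minimal illustration, not the instructor's slide: the function name lcs_length and the table name opt are assumptions, and opt[j][k] is taken to mean the LCS length of the first j characters of a and the first k characters of b.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    n, m = len(a), len(b)
    # opt[j][k] = LCS length of a[:j] and b[:k]; row 0 and column 0 are empty prefixes.
    opt = [[0] * (m + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        for k in range(1, m + 1):
            if a[j - 1] == b[k - 1]:
                # Case 1: a_j is matched with b_k.
                opt[j][k] = opt[j - 1][k - 1] + 1
            else:
                # Case 2: a_j and b_k are not matched with each other,
                # so at least one of them can be dropped.
                opt[j][k] = max(opt[j - 1][k], opt[j][k - 1])
    return opt[n][m]


print(lcs_length("ABCBDAB", "BDCABA"))
```

The table has (n+1)(m+1) entries and each is filled in constant time, which is where the O(nm) bounds discussed later in the lecture come from.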

At 14:28, the instructor asks, "Which case is this corresponding to?" You should stop here to let students respond. At UW, a student answers "bk matching something." The instructor repeats the answer.


This is an activity slide. You can stop the video at 17:57.

After seeing the optimization recurrence for LCS, the students find the optimization recurrence for the more general problem of string alignment.

At 21:13, the instructor begins displaying and discussing solutions.

At 24:20, the instructor says that he put up the first solution (2.8) to show that you could enhance the original solution to get the answer to this deeper problem. It just happens that this can be simplified into the second solution (2.7).
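A recurrence for string alignment might look like the sketch below. The slides are not reproduced in these notes, so the roles of gamma and delta are assumptions here: gamma(x, y) is taken to be the score for aligning character x with character y, and delta(x) the score for aligning x with a gap. Either may be positive or negative. The function name alignment_score is hypothetical.

```python
from typing import Callable


def alignment_score(
    a: str,
    b: str,
    gamma: Callable[[str, str], float],  # assumed: score for aligning x with y
    delta: Callable[[str], float],       # assumed: score for aligning x with a gap
) -> float:
    """Best total score over all alignments of a and b (maximization)."""
    n, m = len(a), len(b)
    opt = [[0] * (m + 1) for _ in range(n + 1)]
    # Aligning a prefix against the empty string forces every character onto a gap.
    for j in range(1, n + 1):
        opt[j][0] = opt[j - 1][0] + delta(a[j - 1])
    for k in range(1, m + 1):
        opt[0][k] = opt[0][k - 1] + delta(b[k - 1])
    for j in range(1, n + 1):
        for k in range(1, m + 1):
            opt[j][k] = max(
                opt[j - 1][k - 1] + gamma(a[j - 1], b[k - 1]),  # a_j aligned with b_k
                opt[j - 1][k] + delta(a[j - 1]),                # a_j aligned with a gap
                opt[j][k - 1] + delta(b[k - 1]),                # b_k aligned with a gap
            )
    return opt[n][m]


# With gamma = 1 on equal characters, 0 otherwise, and delta = 0,
# the score reduces to the LCS length.
print(alignment_score("ABCBDAB", "BDCABA",
                      lambda x, y: 1 if x == y else 0,
                      lambda x: 0))
```

Under these assumptions LCS is a special case of the more general problem, which matches the point that a basic algorithm can be extended to cover a bigger one.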

Student Submission



The instructor shows the task graph (or data flow graph) for dynamic programming computations.

At 27:30, the instructor asks, "Is column order the only way to do this?" You should stop here for student response. At 28:05, the instructor asks, "Are there any others?" If you didn't get the responses of both rows and diagonals when you stopped the first time, you can stop again here.
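The sketch below illustrates why rows, columns, and diagonals are all valid orders: each entry depends only on its three neighbours at (j-1, k-1), (j-1, k), and (j, k-1), so any order that visits those first gives the same table. The names fill_order and lcs_table are hypothetical, and the LCS recurrence is used as the example computation.

```python
def fill_order(n: int, m: int, order: str):
    """Yield cells (j, k) with 1 <= j <= n and 1 <= k <= m in the requested order."""
    if order == "rows":
        for j in range(1, n + 1):
            for k in range(1, m + 1):
                yield j, k
    elif order == "columns":
        for k in range(1, m + 1):
            for j in range(1, n + 1):
                yield j, k
    elif order == "diagonals":
        # Cells with the same j + k lie on one anti-diagonal and do not
        # depend on each other, only on the two previous anti-diagonals.
        for s in range(2, n + m + 1):
            for j in range(max(1, s - m), min(n, s - 1) + 1):
                yield j, s - j
    else:
        raise ValueError(f"unknown order: {order}")


def lcs_table(a: str, b: str, order: str = "rows"):
    """Fill the LCS table visiting the cells in the given order."""
    n, m = len(a), len(b)
    opt = [[0] * (m + 1) for _ in range(n + 1)]
    for j, k in fill_order(n, m, order):
        if a[j - 1] == b[k - 1]:
            opt[j][k] = opt[j - 1][k - 1] + 1
        else:
            opt[j][k] = max(opt[j - 1][k], opt[j][k - 1])
    return opt


tables = [lcs_table("ABCBDAB", "BDCABA", o) for o in ("rows", "columns", "diagonals")]
print(tables[0] == tables[1] == tables[2])
```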


At 29:50, the instructor asks, "So what do I get out of opt?" You should stop here for student response. It's hard to hear the student response on the recording, but the instructor repeats it.


Here the instructor explains how you can make a new matrix with entries left, down, diagonal that will let you recover the LCS, since the basic algorithm just computes the length of the LCS.
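One way to sketch the second matrix is shown below. The names lcs and move are hypothetical, and the meaning of the labels is an assumption about the slide's layout: "diagonal" records a match, "down" means the answer came from dropping the last character of a, and "left" means it came from dropping the last character of b.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b."""
    n, m = len(a), len(b)
    opt = [[0] * (m + 1) for _ in range(n + 1)]
    # move[j][k] records which neighbour produced opt[j][k].
    move = [[""] * (m + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        for k in range(1, m + 1):
            if a[j - 1] == b[k - 1]:
                opt[j][k] = opt[j - 1][k - 1] + 1
                move[j][k] = "diagonal"
            elif opt[j - 1][k] >= opt[j][k - 1]:
                opt[j][k] = opt[j - 1][k]
                move[j][k] = "down"
            else:
                opt[j][k] = opt[j][k - 1]
                move[j][k] = "left"
    # Walk back from the final cell, collecting a character at each match.
    out = []
    j, k = n, m
    while j > 0 and k > 0:
        if move[j][k] == "diagonal":
            out.append(a[j - 1])
            j, k = j - 1, k - 1
        elif move[j][k] == "down":
            j -= 1
        else:
            k -= 1
    return "".join(reversed(out))


print(lcs("ABCBDAB", "BDCABA"))
```

Several subsequences can share the maximum length, so which one is returned depends on how ties are broken. Note that the move matrix needs O(nm) space, the same as opt.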


This is an activity slide. You can stop the video at 35:50.

At 39:11 the instructor uses the phrase "back of the envelope estimation" - this means a quick, approximate calculation. At 39:55 he uses the phrase "ballpark estimate" - this means the same thing.

The points here are: the value of quick estimates, and identifying the resources of interest.

At 38:45, the instructor begins discussing and asks a series of questions before showing student submissions. You should show this, and stop after each question for student responses.

At 39:58, the instructor asks, "What are the critical resources here?" A student answers, "time and space." At 40:19, the instructor asks, "Having our theory hats on, what are space and time?" "Having our theory hats on" means thinking in terms of theory. A student responds, "CPU and memory," but this is not what the instructor is looking for. Another student answers, "O(nm)," which is the right answer.

The instructor starts showing solutions at 41:08.

At 44:20, the instructor makes the point that the algorithm is feasible with respect to time, but not feasible with respect to space.
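A back-of-the-envelope calculation of this kind might look like the sketch below. All of the numbers are illustrative assumptions, not figures from the lecture: strings of 100,000 characters each, about 10^9 simple operations per second, and 4 bytes per table entry. The function name estimate is hypothetical.

```python
def estimate(n: int, m: int, ops_per_second: float = 1e9, bytes_per_entry: int = 4):
    """Rough time (seconds) and space (gigabytes) for an n-by-m DP table."""
    cells = n * m                                # O(nm) table entries
    seconds = cells / ops_per_second             # one simple operation per entry
    gigabytes = cells * bytes_per_entry / 1e9    # the whole table kept in memory
    return seconds, gigabytes


seconds, gigabytes = estimate(100_000, 100_000)
print(f"about {seconds:.0f} s and {gigabytes:.0f} GB")
```

Under these assumptions the table has 10^10 entries: roughly ten seconds of computation, but tens of gigabytes of memory, which is the kind of gap between time and space that the instructor points out.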

At 45:41, the instructor asks, "what's the space complexity of computing the length of the LCS?" You should stop here if there's time. At UW, a student responds, "O(n+m)." The instructor goes to a whiteboard to explain.

Student Submission


Student Submission



The instructor explains how to find the length of the LCS in linear space.
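A minimal sketch of the idea, assuming the table is filled one row at a time: each row depends only on the row before it, so two rows are enough. The function name lcs_length_linear_space is hypothetical, and the whiteboard explanation may differ in detail.

```python
def lcs_length_linear_space(a: str, b: str) -> int:
    """LCS length using two rows instead of the full table."""
    # Keep the rows as short as possible by iterating over the longer string.
    if len(b) > len(a):
        a, b = b, a
    m = len(b)
    prev = [0] * (m + 1)  # row for the previous prefix of a
    for ch in a:
        cur = [0]         # column 0: empty prefix of b
        for k in range(1, m + 1):
            if ch == b[k - 1]:
                cur.append(prev[k - 1] + 1)
            else:
                cur.append(max(prev[k], cur[k - 1]))
        prev = cur        # the older row is discarded here
    return prev[m]


print(lcs_length_linear_space("ABCBDAB", "BDCABA"))
```

Time is still O(nm), but only two rows are live at any moment. Because earlier rows are thrown away, the information needed to trace back the subsequence itself is no longer available.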

At 47:38, the instructor asks, "Why doesn't this work for getting back the subsequence?" One student responds, "You've overwritten the beginning of the subsequence." Another responds, "You lose track of how to get the subsequence." You can stop here for student response if you have time.

At 48:21, the instructor refers to "leaving breadcrumbs." This means marking the path to something.


The instructor says that next time, he will present an algorithm for LCS that runs in O(nm) time and O(n+m) space. He also makes the point that when he began discussing dynamic programming, he was using more space to reduce time, and now he is using more time (by recomputing some values) to save space.