Projects can be done individually, or in small groups. Large groups are also okay, provided you have a thoughtful plan to organize and divide the work. Groups combining people from different fields are particularly encouraged. Feel free to use the class email list cse527a_au07@u.washington.edu to brainstorm project ideas, try to round up partners, etc. Choices for the project include, but are not limited to:
By the middle of finals week, hand in a paper (approximately 5-10 pages) describing the project, and give a 20-30 minute presentation.
Students consistently impress me with creative, cogent project ideas, so by all means feel free to come up with your own. Here are a few of mine to get you started:
De Novo Discovery of Non-Coding RNA Genes: If you did HW#4, here's an extension that might be of interest. Given the success of the simple approach we used in HW#4 for finding non-coding RNA genes, it is natural to wonder whether any of the GC-rich patches we found that were not previously annotated as tRNAs or rRNAs are in fact real non-coding RNA genes. Short of wet-lab experiments (as in [1]), how might we tell? One approach would be to look for similar sequences in related organisms. Here's a sketch of one way to do that. Take a look at the taxonomy information at the NCBI web site, and select some organisms related to M. jannaschii, probably AT-rich ones, perhaps also hyperthermophiles. Run your HW#4 algorithm on them as well. Filter out the tRNA and rRNA hits and any hits shorter than, say, 50 nucleotides. Perhaps extend each remaining hit by 25-50 nucleotides in each direction, in case the Viterbi boundaries were somewhat off. Try to match each hit in one organism to its (putative) orthologs in the others, based perhaps on length, BLAST matches (feel free to download and install BLAST locally, either the NCBI or the (faster) WU version), or Smith-Waterman alignments (modify your HW#2 code, or perhaps use the ssearch component of W. Pearson's fasta package). Even matching them by eye is OK, although that obviously won't scale very well... Do any of the putative ortholog groups appear to have conserved secondary structures? Perform secondary structure predictions using the Vienna RNA package, Pfold, CMfinder or other tools. What do you find?
This problem is obviously open-ended, and success is not certain. I'm open to just about anything you want to try this side of ouija boards; just describe what you try and how it works (or doesn't), and perhaps your thoughts on better alternatives.
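As a starting point, the filter-and-extend step described above might be sketched as follows. This is only an illustration under stated assumptions: the `Hit` record, the `filter_and_extend` function, the 0-based half-open coordinates, and the label strings "tRNA" and "rRNA" are all hypothetical and are not taken from the HW#4 code. The thresholds (50 nucleotides minimum length, 25 nucleotides of flank) are the suggested values from the text, exposed as parameters so you can experiment.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Hit:
    """A candidate GC-rich region (hypothetical record, not the HW#4 format)."""
    start: int       # 0-based, inclusive
    end: int         # 0-based, exclusive
    label: str = ""  # annotation such as "tRNA" or "rRNA"; "" if unannotated


def filter_and_extend(
    hits: Iterable[Hit],
    genome_length: int,
    min_length: int = 50,
    flank: int = 25,
    known: Tuple[str, ...] = ("tRNA", "rRNA"),
) -> List[Hit]:
    """Drop annotated and short hits, then pad the rest on both sides."""
    kept = []
    for hit in hits:
        if hit.label in known:
            continue  # already explained by a known RNA annotation
        if hit.end - hit.start < min_length:
            continue  # too short to be an interesting candidate
        # Pad both ends in case the Viterbi boundaries were somewhat off,
        # clamping so the extended hit stays inside the genome.
        new_start = max(0, hit.start - flank)
        new_end = min(genome_length, hit.end + flank)
        kept.append(Hit(new_start, new_end, hit.label))
    return kept


if __name__ == "__main__":
    example = [Hit(100, 180), Hit(200, 230), Hit(300, 380, "tRNA"), Hit(10, 70)]
    for h in filter_and_extend(example, genome_length=1000):
        print(h.start, h.end)
```

The extended coordinates can then be used to cut subsequences out of each genome for BLAST, Smith-Waterman, or structure prediction. Note that this sketch treats the genome as linear; a circular chromosome would need wrap-around handling at the ends.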
References:
[1] RJ Klein, Z Misulovin, SR Eddy, "Noncoding RNA genes identified in AT-rich hyperthermophiles." Proc. Natl. Acad. Sci. U.S.A., 99, #11 (2002) 7542-7.
Computer Science & Engineering, University of Washington, Box 352350, Seattle, WA 98195-2350. (206) 543-1695 voice, (206) 543-2969 FAX