University of Washington Computer Science & Engineering
 CSE 590 C, Sp '06: Reading & Research in Comp. Bio.

 Course Info
CSE 590C is a weekly seminar on Readings and Research in Computational Biology, open to all graduate students in computational, biological, and mathematical sciences.
When/Where:  Mondays, 3:30 - 4:50, EEB 026
Organizers:  Joe Felsenstein, Bill Noble, Larry Ruzzo, Martin Tompa
Credit: 1-3 Variable
Grading: Credit/No Credit. Talk to the organizers if you are unsure of our expectations.
 Email
cse590cb@cs.washington.edu Course-related announcements and discussions
  Manage Your Subscription List Archives
compbio-seminars@cs.washington.edu Biology seminar announcements from all around campus
  Manage Your Subscription List Archives
compbio-group@cs.washington.edu Discussions about computational biology
  Manage Your Subscription List Archives
 Schedule
Date | Presenters/Participants | Topic | Papers
03/27 | ---- Organizational Meeting ----
04/03 | Amol Prakash, CSE | Estimating significance of whole genome multiple alignments | Abstract
04/10 | Adrienne Wang, CSE | Using ncRNA as a Test of Whole-Genome Multiple Alignments |
04/17 | Amir Ben-Dor, Agilent | Array CGH data analysis - theory and practice | Abstract
04/24 | Jian Qiu, GS | tba |
05/01 | Bob Thurman, GS | Discovery of higher-order functional features in the human genome | Abstract
05/08 | Jason Mcdermott, Microbiology | Computational exploration of biological organization with the Bioverse | Slides
05/15 | Charles Mader, Microbiology | Protein Structure Prediction: an alternative model | Abstract
05/22 | Zizhen Yao, CSE | CMfinder: A Covariance Model Based RNA Motif Finding Algorithm | Abstract
05/29 | Holiday

 Papers, etc.

  Note on Electronic Access to Journals

Links to full papers below are often to journals that require a paid subscription. The UW Library is generally a paid subscriber, and you can freely access these articles if you do so from an on-campus computer. For off-campus access, follow the "[offcampus]" links below or look at the library "proxy server" instructions. You will be prompted for your UW net ID and password once per session.  


03/27: ---- Organizational Meeting ----

04/03: Estimating significance of whole genome multiple alignments -- Amol Prakash, CSE

   Abstract:   Whole genome alignments are widely used by biologists for multiple purposes related to comparative genomics. Before doing any such analysis on a particular portion of the alignment, it is critical to have confidence that the portion is correctly aligned. Unfortunately, no method to estimate the significance of these whole genome alignments has been suggested to date. In this work we provide a methodology to do so and assess the significance of the 8-vertebrate MultiZ alignment present on the UCSC Genome Browser. We report approximately 0.75% (1.57Mbp) of the human chromosome 1 alignment as having a high p-value. This number increases to 14% if we consider only the alignments containing zebrafish. The results for chromosome 1 are available as a UCSC browser track at http://bio.cs.washington.edu/SigMA-w/. This is also the first tool that can compute the significance of every portion of an alignment, and not just the entire alignment.

04/17: Array CGH data analysis - theory and practice -- Amir Ben-Dor, Agilent

   Abstract:   Cancer typically arises as a result of an acquired genomic instability and subsequent clonal evolution of neoplastic cells. Consequently, cancer cells contain multiple regions of copy number gains and losses throughout their genomes. The patterns of copy number aberrations present in a cancer genome consist of selected and non-selected lesions and vary within and across different tissues of origin. For example, loss of CDKN2A (9p21) is frequent in melanomas and lung carcinomas, SMAD4 (18q21) deletions are present in colon cancers, and HER2/NEU (17q12) amplification is often seen in breast carcinomas. Recent technology developments introduced an oligonucleotide array platform for array-based comparative genomic hybridization (aCGH) analyses. This platform provides increased resolution in determining the boundaries of measured genome alterations.

In the talk I will review the biological background of cancer genome instabilities and the measurement technologies, in particular array CGH, and discuss in detail the data analysis goals, tasks, and solutions. I will briefly describe the data analysis workflow of a multi-sample array CGH cancer study, and provide details on the more complicated steps of the analysis: aberration calling, data centering, detecting common aberrations and, if time permits, joint analysis of CGH and gene expression data. To demonstrate the workflow, I'll share some examples from an ongoing collaboration with John Weinstein's group (NCI), analyzing array CGH data for the NCI-60 drug discovery panel of cell lines. The NCI-60 panel includes multiple highly annotated, well-characterized samples from nine different tissue types and thus represents a valuable resource for studying the patterns of genomic lesions that may be present in human cancers.
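As a rough illustration of two of the steps named above (data centering and aberration calling), here is a minimal Python sketch. It is not the speaker's method: median centering and a fixed cutoff are assumptions chosen for simplicity, the function names, the `threshold` value, and the toy probe values are all hypothetical, and practical aCGH pipelines typically segment the genome rather than calling each probe independently.

```python
from statistics import median


def center(log_ratios):
    """Shift log2 ratios so that their median is zero (assumed centering rule)."""
    m = median(log_ratios)
    return [x - m for x in log_ratios]


def call_aberrations(log_ratios, threshold=0.3):
    """Label each probe 'gain', 'loss', or 'normal' after centering.

    The threshold is an arbitrary illustrative cutoff, not a recommended value.
    """
    calls = []
    for x in center(log_ratios):
        if x >= threshold:
            calls.append("gain")
        elif x <= -threshold:
            calls.append("loss")
        else:
            calls.append("normal")
    return calls


# Hypothetical log2 ratios for eight consecutive probes on one sample.
toy_probes = [0.1, 0.12, 0.08, 0.9, 0.95, 0.1, -0.7, 0.11]
toy_calls = call_aberrations(toy_probes)
```

On this toy input the two elevated probes are labeled as gains and the single depressed probe as a loss; the uniform offset of about 0.1 in the remaining probes is removed by the centering step.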

05/01: Discovery of higher-order functional features in the human genome -- Bob Thurman, GS

   Abstract:   It has long been hypothesized that the human and other large genomes are organized into higher-order (i.e., greater than gene-sized) functional domains. Recent technological advances have enabled the rapid emergence of large-scale biological data sets comprising specific functional variables (e.g., transcription, histone modifications, etc.) sampled in a nearly continuous fashion across the genome. A major outstanding question is to what degree such data reveal coherent higher-order features that may in turn illuminate the underlying functional architecture of the genome. To address this, we developed novel approaches based on wavelet analysis for discovery of "domain-level" behavior in fine-scale functional genomic data, and for correlating apparently disparate functional data types collected at different resolutions and scales. Wavelets represent a powerful mathematical framework for decomposing a given genomic data type into increasingly coarse scales, allowing broader and broader trends in the data to reveal themselves. We apply this approach to a variety of continuously sampled data types from the NHGRI ENCODE project to visualize distinct higher-order features of the human genomic landscape. We then apply Hidden Markov Models (HMMs) to the wavelet decomposition to provide segmentations of the ENCODE regions into discrete functional states or domains. We also correlate multiple continuous data types at multiple scales to uncover important similarities and differences, a major feature of which is that such relationships (e.g., the correspondence between transcription, histone modification patterns, and fine-scale evolutionary conservation) are often highly localized in nature, disappearing and reappearing from region to region and locus to locus. The results highlight an analytical framework which may be applied broadly to other complex genomes.
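To make the idea of decomposing a signal into increasingly coarse scales concrete, here is a minimal Python sketch using the simplest wavelet, the Haar wavelet, in its unnormalized (pairwise-averaging) form. This is only an illustration under stated assumptions: the abstract does not say which wavelet family was used, and the function names and the toy signal are hypothetical.

```python
def haar_smooth(signal):
    """One Haar level: average adjacent pairs, halving the resolution."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]


def haar_details(signal):
    """Detail coefficients: the fine-scale variation removed by haar_smooth."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    return [(signal[i] - signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]


def multiscale(signal, levels):
    """Return the original signal followed by progressively coarser versions."""
    scales = [list(signal)]
    for _ in range(levels):
        scales.append(haar_smooth(scales[-1]))
    return scales


# Hypothetical data: eight evenly spaced measurements along a genomic region.
toy_signal = [1.0, 3.0, 2.0, 4.0, 8.0, 10.0, 9.0, 11.0]
scales = multiscale(toy_signal, 3)
# scales[1] is [2.0, 3.0, 9.0, 10.0]; scales[2] is [2.5, 9.5]; scales[3] is [6.0]
```

At the second level the toy signal reduces to a low half and a high half, the kind of broad trend that the fine-scale fluctuations obscure at full resolution.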

05/08: Computational exploration of biological organization with the Bioverse -- Jason Mcdermott, Microbiology

   Slides:
   http://compbio.washington.edu/local/people/mcdermottj/presentations/May082006/Presentation.ppt
   http://compbio.washington.edu/local/people/mcdermottj/presentations/May082006/Presentation.htm

05/15: Protein Structure Prediction: an alternative model -- Charles Mader, Microbiology

   Abstract:   I will present an introduction to the protein structure prediction problem in computational biology. The Poisson-Boltzmann equation is commonly used to predict electrostatic interactions in protein structure prediction. I present a critique of the Poisson-Boltzmann approach to protein electrostatics. Based on this critique I derive the RB equation. The RB equation provides a way to parameterize an energy function such that the native conformation has the minimum energy. I show how to use the p-space ellipsoid to determine the resolution of this model, and describe how the volume of the p-space ellipsoid can be used to evaluate second-order corrections to the model.

Charles provides Additional Information:

05/22: CMfinder: A Covariance Model Based RNA Motif Finding Algorithm -- Zizhen Yao, CSE

   Abstract:   The recent discoveries of large numbers of non-coding RNAs create a need for tools for automatic, high-quality identification and characterization of conserved RNA motifs that can be readily used for database search. Previous tools fall short of this goal. CMfinder is a new tool for RNA motif prediction. It is an expectation maximization algorithm using covariance models for motif description, carefully crafted heuristics for effective motif search, and a novel Bayesian framework for structure prediction combining folding energy and sequence covariation. When tested on known ncRNA families, including some difficult cases with poor sequence conservation and large indels, our method demonstrates excellent average per-base-pair accuracy: 79%, compared with at most 60% for alternative methods. In this talk, I will discuss the algorithmic issues in CMfinder, and a systematic framework for discovering ncRNAs at genomic scale. In a continuing collaboration with biologists, we have identified several dozen promising candidates in different bacterial clades, with one experimentally validated novel riboswitch, and a few others under close investigation.


 Other Seminars
Past quarters of CSE 590C
COMBI & Genome Sciences Seminars
Applied Math Department Mathematical Biology Journal Club
Biostatistics Seminars
Microbiology Department Seminars
Zoology 525, Mathematical Biology Seminar Series

 Resources
Molecular Biology for Computer Scientists, a primer by Lawrence Hunter (46 pages)
A Quick Introduction to Elements of Biology, a primer by Alvis Brazma et al.
S-Star Bioinformatics Online Course Schedule, a collection of video primers
A very comprehensive FAQ at bioinformatics.org, including annotated references to online tutorials and lectures.
CSE 527: Computational Biology
CSE 590TV: Computational Biology (Professional Masters Program)
Genome 540/541: Introduction to Computational Molecular Biology: Genome and Protein Sequence Analysis

CSE's Computational Molecular Biology research group
Interdisciplinary Ph.D. program in Computational Molecular Biology


Computer Science & Engineering
University of Washington
Box 352350
Seattle, WA  98195-2350
(206) 543-1695 voice, (206) 543-2969 FAX
[comments to cse590c-webmaster@cs.washington.edu]