Course Info |
CSE 590C is a weekly seminar on Readings and Research in
Computational Biology, open to all graduate students in computational,
biological, and mathematical sciences.
Schedule |
Papers, etc. |
Links to full papers below are often to journals that require a
paid subscription. The UW Library is generally a paid
subscriber, and you can freely access these articles if you do
so from an on-campus computer. For off-campus access,
follow the "[offcampus]" links below or
look at the
library "proxy server" instructions.
You will be prompted for your UW net ID and password once per
session.
|
Abstract: Whole genome alignments are widely used by biologists for multiple purposes related to comparative genomics. Before doing any such analysis on a particular portion of the alignment, it is critical to have confidence that the portion is correctly aligned. Unfortunately, no method to estimate the significance of these whole genome alignments has been suggested to date. In this work we provide a methodology to do so and assess the significance of the 8-vertebrate MultiZ alignment present on the UCSC Genome Browser. We report approximately 0.75% (1.57Mbp) of the human chromosome 1 alignment as having a high p-value. This number increases to 14% if we consider only the alignments containing zebrafish. The results for chromosome 1 are available as a UCSC browser track at http://bio.cs.washington.edu/SigMA-w/. This is also the first tool that can compute the significance of every portion of an alignment, and not just the entire alignment. |
04/17: Array CGH data analysis - theory and practice -- Amir Ben-Dor, Agilent
Abstract:
Cancer typically arises as a result of an acquired genomic instability
and subsequent clonal evolution of neoplastic cells. Consequently,
cancer cells contain multiple regions of copy number gains and losses
throughout their genomes. The patterns of copy number aberrations
present in a cancer genome consist of selected and non-selected lesions
and vary within and across different tissues of origin. For example, loss
of CDKN2A (9p21) is frequent in melanomas and lung carcinomas, SMAD4
(18q21) deletions are present in colon cancers, and HER2/NEU (17q12)
amplification is often seen in breast carcinomas. Recent technology
developments introduced an oligonucleotide array platform for
array-based comparative genomic hybridization (aCGH) analyses. This platform
provides increased resolution in determining the boundaries of measured
genome alterations.
In the talk I will review the biological background of cancer genome instabilities and the measurement technologies, in particular array CGH, and discuss in detail the data analysis goals, tasks, and solutions. I will briefly describe the data analysis workflow of a multi-sample array CGH cancer study, and provide details on the more complicated steps of the analysis: aberration calling, data centering, detecting common aberrations and, if time permits, joint analysis of CGH and gene expression data. To demonstrate the workflow, I'll share some examples from an ongoing collaboration with John Weinstein's group (NCI), analyzing array CGH data for the NCI-60 drug discovery panel of cell lines. The NCI-60 panel includes multiple highly annotated, well-characterized samples from nine different tissue types and thus represents a valuable resource for studying the patterns of genomic lesions that may be present in human cancers. |
05/01: Discovery of higher-order functional features in the human genome -- Bob Thurman, GS
Abstract: It has long been hypothesized that the human and other large genomes are organized into higher-order (i.e., greater than gene-sized) functional domains. Recent technological advances have enabled the rapid emergence of large-scale biological data sets comprising specific functional variables (e.g., transcription, histone modifications, etc.) sampled in a nearly continuous fashion across the genome. A major outstanding question is to what degree such data reveal coherent higher-order features that may in turn illuminate the underlying functional architecture of the genome. To address this, we developed novel approaches based on wavelet analysis for discovery of "domain-level" behavior in fine-scale functional genomic data, and for correlating apparently disparate functional data types collected at different resolutions and scales. Wavelets represent a powerful mathematical framework for decomposing a given genomic data type into increasingly coarse scales, allowing broader and broader trends in the data to reveal themselves. We apply this approach to a variety of continuously sampled data types from the NHGRI ENCODE project to visualize distinct higher-order features of the human genomic landscape. We then apply Hidden Markov Models (HMMs) to the wavelet decomposition to provide segmentations of the ENCODE regions into discrete functional states or domains. We also correlate multiple continuous data types at multiple scales to uncover important similarities and differences, a major feature of which is that such relationships (e.g., the correspondence between transcription, histone modification patterns, and fine-scale evolutionary conservation) are often highly localized in nature, disappearing and reappearing from region to region and locus to locus. The results highlight an analytical framework which may be applied broadly to other complex genomes. |
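For readers new to wavelets, the idea of decomposing a signal into increasingly coarse scales can be sketched with the Haar wavelet, the simplest case. This is only an illustration and not the speaker's method: the abstract does not say which wavelet family was used, and the function names and the toy signal below are hypothetical.

```python
def haar_step(signal):
    """One Haar level: pairwise averages (coarse trend) and pairwise half-differences (detail)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even at every level")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    smooth = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return smooth, detail


def haar_multiscale(signal, levels):
    """Return the signal viewed at successively coarser scales, plus the detail lost at each step."""
    scales = [list(signal)]
    details = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        scales.append(current)
        details.append(detail)
    return scales, details


# Toy "genomic track" (hypothetical values, e.g. a signal sampled at 8 adjacent positions).
track = [1, 3, 5, 7, 2, 2, 8, 0]
scales, details = haar_multiscale(track, levels=3)
for level, values in enumerate(scales):
    print(f"scale {level}: {values}")
```

Each level halves the resolution, so fine-scale fluctuations move into the detail coefficients and only the broader trend remains in the coarse view.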
05/08: Computational exploration of biological organization with the Bioverse -- Jason McDermott, Microbiology
Slides:
http://compbio.washington.edu/local/people/mcdermottj/presentations/May082006/Presentation.ppt http://compbio.washington.edu/local/people/mcdermottj/presentations/May082006/Presentation.htm |
05/15: Protein Structure Prediction: an alternative model -- Charles Mader, Microbiology
Abstract:
I will present an introduction to the protein structure
prediction problem in computational biology.
The Poisson-Boltzmann equation is commonly used to
predict electrostatic interactions in protein structure
prediction. I present a critique of the Poisson-Boltzmann approach to
protein electrostatics. Based on this critique I
derive the RB equation. The RB equation
provides a way to parameterize an energy function such that
the native conformation is the minimum energy. I show how
to use the p-space ellipsoid to determine the resolution of
this model, and describe how the volume of the p-space
ellipsoid can be used to evaluate second-order corrections
to the model.
Charles provides additional information:
|
05/22: CMfinder: A Covariance Model Based RNA Motif Finding Algorithm -- Zizhen Yao, CSE
Abstract: The recent discoveries of large numbers of non-coding RNAs create a need for tools for automatic, high-quality identification and characterization of conserved RNA motifs that can be readily used for database search. Previous tools fall short of this goal. CMfinder is a new tool for RNA motif prediction. It is an expectation maximization algorithm using covariance models for motif description, carefully crafted heuristics for effective motif search, and a novel Bayesian framework for structure prediction combining folding energy and sequence covariation. When tested on known ncRNA families, including some difficult cases with poor sequence conservation and large indels, our method demonstrates excellent average per-base-pair accuracy: 79%, compared with at most 60% for alternative methods. In this talk, I will discuss the algorithmic issues in CMfinder, and a systematic framework for discovering ncRNAs at genomic scale. In a continuing collaboration with biologists, we have identified several dozen promising candidates in different bacterial clades, with one experimentally validated novel riboswitch, and a few others under close investigation. |
CSE's Computational Molecular Biology research group
Interdisciplinary Ph.D. program in Computational Molecular Biology
Computer Science & Engineering University of Washington Box 352350 Seattle, WA 98195-2350 (206) 543-1695 voice, (206) 543-2969 FAX [comments to cse590c-webmaster@cs.washington.edu] |