CSE 590C, Sp '14: Reading & Research in Comp. Bio.

University of Washington Computer Science & Engineering

Course Info

CSE 590C is a weekly seminar on Readings and Research in Computational Biology, open to all graduate students in computational, biological, and mathematical sciences.

When/Where: Mondays, 3:30 - 4:50, EE1 026 (schematic)

Organizers: Joe Felsenstein, Su-In Lee, Bill Noble, Larry Ruzzo

Credit: 1-3 credits (variable)

Grading: Credit/No Credit. Talk to the organizers if you are unsure of our expectations.

Email

cse590cb@cs.washington.edu Course-related announcements and discussions

compbio-seminars@cs.washington.edu Biology seminar announcements from all around campus

compbio-group@cs.washington.edu Computational biology discussions, conference/job postings, etc.

Schedule

Date | Presenters/Participants | Topic | Details
03/31 | ---- Organizational Meeting ----
04/07 | Erick Matsen, FHCRC | Substitution and per-residue selection in B cell affinity maturation | Details
04/14 | Jeff Howbert | Computing exact p-values to improve calibration of a cross-correlation shotgun proteomics scoring function | Details
04/21 | Scott Lundberg | Learning Statistical Dependency Structure Among ChIP-seq Tracks |
04/28 | Max Libbrecht | Genome annotation of multiple cell types and chromatin architecture using graph-based regularization | Details
05/05 | Daniel Jones | Analysis of splicing and transcription in RNA-seq experiments |
05/12 | Alex Hu | Models to identify peptides from data-independent acquisition mass spectra |
05/19 | Sharon Greenblum | Copy Number Variation in Human Gut Microbial Species |
05/26 | Holiday
06/02 | John Earls | AUREA Nebula: Cloud based network analysis |

Papers, etc.
Note on Electronic Access to Journals
Links to full papers below are often to journals that require a paid subscription. The UW Library is generally a paid subscriber, and you can freely access these articles if you do so from an on-campus computer. For off-campus access, follow the "[offcampus]" links below or look at the library "proxy server" instructions. You will be prompted for your UW net ID and password once per session.

03/31: ---- Organizational Meeting ----

04/07: Substitution and per-residue selection in B cell affinity maturation -- Erick Matsen, FHCRC

Slides: http://matsen.github.io/talks/bcell-single.html#/

04/14: Computing exact p-values to improve calibration of a cross-correlation shotgun proteomics scoring function -- Jeff Howbert

Abstract: The core of every shotgun proteomics analysis pipeline is a function that scores the quality of a match between an observed fragmentation spectrum and a candidate peptide. The utility of these scores is critically dependent on their statistical calibration. For a well-calibrated score function, a score of X assigned to one spectrum is directly comparable to a score of X assigned to a different spectrum. Improving calibration of a score function across spectra can lead to large improvements in the number of identified spectra at a given statistical confidence threshold. Score calibration has been carried out previously using empirical curve fitting procedures to estimate p-values, or with post-processors such as PeptideProphet and Percolator.
This work describes a new method for computing exact p-values for the oldest and one of the most widely used score functions, SEQUEST XCorr. Dynamic programming is used to efficiently compute the full distribution of scores for all possible peptides whose masses are close to that of the spectrum precursor mass. We find that the resulting p-values are valid relative to a widely accepted null model, and that ranking identified spectra by p-value rather than XCorr reduces variance due to spectrum-specific effects on the score. Across a variety of data sets, our XCorr p-value yields significantly more spectrum and peptide identifications at a fixed false discovery rate than other state-of-the-art methods, including SEQUEST, Mascot, X!Tandem, and Comet, and is competitive with other dynamic programming-based calibration methods like MS-GF+. Strikingly, the improved calibration afforded by our scoring scheme is complementary to that provided by Percolator, so combining the two methods yields even better results. Our method is able to take advantage of both high-resolution MS1 and MS2 data.
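
Illustration (not from the talk): the dynamic programming idea described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the speaker's implementation, and it does not compute XCorr. It assumes integer residue masses and a made-up integer score table (`evidence`) that awards points whenever a prefix of a candidate sequence has a given total mass. It then counts how many residue sequences of the target mass reach each score and reports the p-value as the fraction of sequences scoring at least as high as the observed score, assuming all sequences are equally likely. The names `score_distribution`, `p_value`, and `evidence` are hypothetical.

```python
from collections import defaultdict


def score_distribution(residue_masses, evidence, target_mass):
    """Count residue sequences of total mass `target_mass` by integer score.

    residue_masses: positive integer masses, one per residue type (assumed).
    evidence[m]: hypothetical integer score earned when a prefix of the
        sequence (including the full sequence) has total mass m.
    Returns a dict mapping score -> number of sequences with that score.
    """
    # counts[m] maps score -> number of sequences whose total mass is m.
    counts = [defaultdict(int) for _ in range(target_mass + 1)]
    counts[0][0] = 1  # the empty sequence has mass 0 and score 0
    for m in range(1, target_mass + 1):
        for r in residue_masses:
            prev = m - r
            if prev < 0:
                continue
            # Extend every sequence of mass `prev` by one residue of mass r.
            for s, n in counts[prev].items():
                counts[m][s + evidence[m]] += n
    return dict(counts[target_mass])


def p_value(dist, observed):
    """Fraction of counted sequences scoring at least `observed`."""
    total = sum(dist.values())
    if total == 0:
        raise ValueError("no sequence has the target mass")
    tail = sum(n for s, n in dist.items() if s >= observed)
    return tail / total
```

Because the table is indexed by mass and score rather than by sequence, the work grows with the number of (mass, score) cells instead of the number of candidate sequences, which is the general reason a dynamic program can cover all candidates at once.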

04/21: Learning Statistical Dependency Structure Among ChIP-seq Tracks -- Scott Lundberg

04/28: Genome annotation of multiple cell types and chromatin architecture using graph-based regularization -- Max Libbrecht

Authors: Maxwell W. Libbrecht (1), Michael M. Hoffman (2), Ferhat Ay (3), David M. Gilbert (4), Jeffrey A. Bilmes (5), William S. Noble (1,3). (1) Computer Science & Eng., U Washington; (2) Princess Margaret Cancer Center; (3) Genome Sciences, U Washington; (4) Biological Science, Florida State U; (5) Electrical Eng., U Washington

Abstract: Semi-automated genome annotation algorithms facilitate human interpretation of large, heterogeneous collections of functional genomics data by simultaneously partitioning the human genome and assigning labels to the resulting genomic segments. However, existing methods fail to address two problems related to genome annotation: (1) performing genome annotation in multiple cell types and (2) integrating 3D structure information into the annotation. We propose a single solution to these seemingly different problems using the idea of a pairwise prior, which encourages certain pairs of genomic positions to receive the same label. We developed a novel computational method, called graph-based regularization (GBR), that performs inference in the presence of a pairwise prior. We first use GBR to annotate multiple cell types, transferring via the pairwise prior the information that pairs of genomic loci that received the same label in a reference cell type should be more likely to receive the same label in the cell type in question. We then use GBR to integrate 3D structure information from chromatin conformation assays such as Hi-C. In this case, the pairwise prior encourages positions that are close in 3D to occupy the same type of domain. This approach allows us to annotate the human cell line IMR90 and thereby characterize the ontology of domains, revealing the relationships between Polycomb and constitutively repressed domains, topological domains, and replication domains. Finally, we use annotations over six human cell lines to find sequence elements that mark developmentally conserved boundaries between domains.
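
Illustration (not from the talk): the effect of a pairwise prior can be shown with a small Python sketch. This is not GBR. It swaps in a much simpler technique, iterated conditional modes (ICM), purely to show how a prior over pairs of positions can change a labeling. It assumes a made-up table of per-position label scores (`unary`) and a list of weighted position pairs (`edges`) that the prior encourages to share a label. All names, including `icm_labels` and `strength`, are hypothetical.

```python
def icm_labels(unary, edges, strength=1.0, sweeps=10):
    """Greedy labeling with a pairwise 'same label' bonus (ICM, not GBR).

    unary[i][k]: assumed score for giving position i label k (higher is better).
    edges: (i, j, weight) triples; a pair earns strength * weight when
        positions i and j carry the same label.
    """
    n = len(unary)
    neighbors = [[] for _ in range(n)]
    for i, j, w in edges:
        neighbors[i].append((j, w))
        neighbors[j].append((i, w))

    # Start from the best label for each position, ignoring the prior.
    labels = [max(range(len(u)), key=u.__getitem__) for u in unary]

    for _ in range(sweeps):
        changed = False
        for i in range(n):
            def total(k):
                # Bonus from neighbors that currently carry label k.
                agree = sum(w for j, w in neighbors[i] if labels[j] == k)
                return unary[i][k] + strength * agree

            best = max(range(len(unary[i])), key=total)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break  # no position wants to change label
    return labels


# Position 1 weakly prefers label 1 on its own, but it is paired with
# position 0, which strongly prefers label 0.
example_unary = [[2.0, 0.0], [0.4, 0.6], [0.0, 2.0]]
example_edges = [(0, 1, 1.0)]
```

With `strength=0.0` the prior is switched off and each position keeps its own best label; with the default strength, position 1 is pulled to the label of its partner while the unpaired position 2 is unaffected.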

05/05: Analysis of splicing and transcription in RNA-seq experiments -- Daniel Jones

05/12: Models to identify peptides from data-independent acquisition mass spectra -- Alex Hu

05/19: Copy Number Variation in Human Gut Microbial Species -- Sharon Greenblum

05/26: Holiday

06/02: AUREA Nebula: Cloud based network analysis -- John Earls

Other Seminars

Past quarters of CSE 590C
COMBI & Genome Sciences Seminars
Biostatistics Seminars
Microbiology Department Seminars

Resources

Molecular Biology for Computer Scientists, a primer by Lawrence Hunter (46 pages)
A Quick Introduction to Elements of Biology, a primer by Alvis Brazma et al.
A very comprehensive FAQ at bioinformatics.org, including annotated references to online tutorials and lectures.
CSE 527: Computational Biology
CSEP 590A: Computational Biology (Professional Masters Program)
Genome 540/541: Introduction to Computational Molecular Biology: Genome and Protein Sequence Analysis
CSE's Computational Molecular Biology research group
Interdisciplinary Ph.D. program in Computational Molecular Biology

Computer Science & Engineering
University of Washington
Box 352350
Seattle, WA 98195-2350
(206) 543-1695 voice, (206) 543-2969 FAX