Course Info | CSE 590C is a weekly seminar on Readings and Research in Computational Biology, open to all
graduate students in computational, biological, and mathematical sciences.
Theme | Traditionally, we reserve Spring quarter for "homegrown" research --- highlights of work by researchers in the Seattle area. Our Spring schedule is:
Schedule |
Papers, etc. | Note on Electronic Access to Journals
The UW Library is generally a paid subscriber to non-open-access journals we cite. You can freely access these articles from on-campus computers. For off-campus access, follow the "[offcampus]" links below or look at the library "proxy server" instructions. You will be prompted for your UW net ID and password.
03/27: -- ---- Organizational Meeting ----
04/03: Monocle 2 -- Xiaojie Qiu (GS)
Xiaojie Qiu, Qi Mao, Ying Tang, Li Wang, Raghav Chawla, Hannah Pliner, Cole Trapnell,
"Reversed graph embedding resolves complex single-cell developmental trajectories,"
http://biorxiv.org/content/early/2017/02/21/110668
Abstract: Organizing single cells along a developmental trajectory has emerged as a powerful tool for understanding how gene regulation governs cell fate decisions. However, learning the structure of complex single-cell trajectories with two or more branches remains a challenging computational problem. We present Monocle 2, which uses reversed graph embedding to reconstruct single-cell trajectories in a fully unsupervised manner. Monocle 2 learns an explicit principal graph to describe the data, greatly improving the robustness and accuracy of its trajectories compared to other algorithms. Monocle 2 uncovered a new, alternative cell fate in what we previously reported to be a linear trajectory for differentiating myoblasts. We also reconstruct branched trajectories for two studies of blood development, and show that loss of function mutations in key lineage transcription factors diverts cells to alternative branches on the trajectory. Monocle 2 is thus a powerful tool for analyzing cell fate decisions with single-cell genomics.
04/10: Game Theory Meets ML -- Scott Lundberg (CSE)
Title: How game theory can help us understand our machine learning models
Abstract: When applying machine learning to computational biology it is important to have accurate models, and it is also important to understand why they make specific predictions. Unfortunately, these two requirements are often at odds. Here we demonstrate how to have your cake and eat it too...to have the highest accuracy while retaining interpretability.
04/17: Genome-Guided Transcriptome Assembly -- Shunfu Mao (EE)
Title: RefShannon: a genome-guided transcriptome assembler using sparse flow decomposition
Abstract: High throughput sequencing of RNA (RNA-Seq) has become a staple in modern molecular biology, with applications not only in quantifying gene expression but also in isoform-level analysis of the RNA transcripts. To enable such an isoform-level analysis, a transcriptome assembly algorithm is utilized to stitch together the observed short reads into the corresponding transcripts. This task is complicated due to the complexity of alternative splicing - a mechanism by which the same gene may generate multiple distinct RNA transcripts. We develop a novel genome-guided transcriptome assembler, RefShannon, that exploits the varying abundances of the different transcripts to enable an accurate reconstruction of the transcripts. Our evaluation shows RefShannon improves sensitivity by 13% to 25% at a given specificity in comparison with state-of-the-art assemblers including StringTie and Cufflinks.
04/24: Resolving Multi-copy Duplication in Genomes -- Sudipto Mukherjee (EE)
Abstract:
Structural rearrangement in the DNA has come to the forefront of research in biology due to its association
with diseases. Advancement of sequencing technologies has made it possible to obtain partial information about these
regions. The goal of this project is to design an algorithm for reconstructing a form of structural variation, known
as segmental duplication, by modeling the problem as a discrete matrix completion. By leveraging techniques from
non-convex optimization and structure recovery with missing data, the designed workflow is able to achieve better
results than state-of-the-art algorithms.
05/01: Chromatin Architecture -- Jacob Schreiber (CSE)
Jacob Schreiber, Maxwell Libbrecht, Jeffrey Bilmes, and William Stafford Noble,
"Nucleotide sequence and DNaseI sensitivity are predictive of 3D chromatin architecture,"
http://biorxiv.org/content/biorxiv/early/2017/01/28/103614.full.pdf
Abstract: Motivation: Recently, Hi-C has been used to probe the 3D chromatin architecture of multiple organisms and cell types. The resulting collections of pairwise contacts across the genome have connected chromatin architecture to many cellular phenomena, including replication timing and gene regulation. However, high resolution (10 kb or finer) contact maps remain scarce due to the expense and time required for collection. A computational method for predicting pairwise contacts without the need to run a Hi-C experiment would be invaluable in understanding the role that 3D chromatin architecture plays in genome biology. Results: We describe Rambutan, a deep convolutional neural network that predicts Hi-C contacts at 1 kb resolution using nucleotide sequence and DNaseI assay signal as inputs. Specifically, Rambutan identifies locus pairs that engage in high confidence contacts according to Fit-Hi-C, a previously described method for assigning statistical confidence estimates to Hi-C contacts. We first demonstrate Rambutan's performance across chromosomes at 1 kb resolution in the GM12878 cell line. Subsequently, we measure Rambutan's performance across six cell types. In this setting, the model achieves an area under the receiver operating characteristic curve between 0.7662 and 0.8246 and an area under the precision-recall curve between 0.3737 and 0.9008. We further demonstrate that the predicted contacts exhibit expected trends relative to histone modification ChIP-seq data, replication timing measurements, and annotations of functional elements such as promoters and enhancers. Finally, we predict Hi-C contacts for 53 human cell types and show that the predictions cluster by cellular function. Availability: Tutorials and source code for Rambutan are publicly available at https://github.com/jmschrei/rambutan.
05/08: Large-scale RNA-Seq Analysis -- Daniel Jones (CSE)
05/15: Minor Histocompatibility and Graft-Versus-Host Disease -- David Levine (Biostat)
Abstract:
The risk of acute graft-versus-host disease (GVHD) is higher after allogeneic hematopoietic cell
transplantation (HCT) from unrelated donors as compared with related donors. This difference has been explained by
increased recipient mismatching for major histocompatibility antigens or minor histocompatibility antigens. In the
current study, we used genome-wide arrays to enumerate single nucleotide polymorphisms (SNPs) that produce
graft-versus-host (GVH) amino acid coding differences between recipients and donors. We then tested the hypothesis
that higher degrees of genome-wide recipient GVH mismatching correlate with higher risks of GVHD after allogeneic
HCT. In HLA-genotypically matched sibling recipients, the average recipient mismatching of coding SNPs was
9.35%. Each 1% increase in genome-wide recipient mismatching was associated with an estimated 20% increase in the
hazard of grades III-IV GVHD (hazard ratio [HR], 1.20; 95% confidence interval [CI], 1.05-1.37; P = .007) and an
estimated 22% increase in the hazard of stage 2-4 acute gut GVHD (HR, 1.22; 95% CI, 1.02-1.45; P = .03). In HLA-A,
B, C, DRB1, DQA1, DQB1, DPA1, DPB1-phenotypically matched unrelated recipients, the average recipient mismatching of
coding SNPs was 17.3%. The estimated risks of GVHD-related outcomes in HLA-phenotypically matched unrelated
recipients were low, relative to the large difference in genome-wide mismatching between the 2 groups. In contrast,
the risks of GVHD-related outcomes were higher in HLA-DP GVH-mismatched unrelated recipients than in HLA-matched
sibling recipients. Taken together, these results suggest that the increased GVHD risk after unrelated HCT is
predominantly an effect of HLA-mismatching.
05/22: Tentative: a framework for single-cell RNA-seq learning -- Sumit Mukherjee (EE)
05/29: -- Holiday
Other Seminars | Past quarters of CSE 590C; COMBI & Genome Sciences Seminars; Biostatistics Seminars; Microbiology Department Seminars
Resources |
Molecular Biology for Computer Scientists, a primer by Lawrence Hunter (46 pages)
A Quick Introduction to Elements of Biology, a primer by Alvis Brazma et al.
A comprehensive FAQ at bioinformatics.org, including annotated links to online tutorials and lectures.
CSE 527: Computational Biology
CSEP 590A: Computational Biology (Professional Masters Program)
Genome 540/541: Introduction to Computational Molecular Biology: Genome and Protein Sequence Analysis
CSE's Computational Molecular Biology research group
Computer Science & Engineering, University of Washington, Box 352350, Seattle, WA 98195-2350, (206) 543-1695 voice, (206) 543-2969 FAX