University of Washington Computer Science & Engineering
  CSE 590C, Sp '17: Reading & Research in Comp. Bio.

 Course Info
CSE 590C is a weekly seminar on Readings and Research in Computational Biology, open to all graduate students in computational, biological, and mathematical sciences.
When/Where: Mondays, 3:30 - 4:50, EE1 026
Organizers: Sreeram Kannan, Larry Ruzzo, Yuliang Wang
Credit: 1-3 Variable
Grading: Credit/No Credit. Talk to the organizers if you are unsure of our expectations.
 Email
Course-related announcements and discussions
Biology seminar announcements from all around campus
Computational biology discussions, conference/job postings, etc.
 Theme
Traditionally, we reserve Spring quarter for "homegrown" research: highlights of work by researchers in the Seattle area. Our Spring schedule is:
 Date   Presenters/Participants   Topic
 03/27  --                        Organizational Meeting
 04/03  Xiaojie Qiu (GS)          Monocle 2
 04/10  Scott Lundberg (CSE)      Game Theory Meets ML
 04/17  Shunfu Mao (EE)           Genome-Guided Transcriptome Assembly
 04/24  Sudipto Mukherjee (EE)    Resolving Multi-copy Duplication in Genomes
 05/01  Jacob Schreiber (CSE)     Chromatin Architecture
 05/08  Daniel Jones (CSE)        Large-scale RNA-Seq Analysis
 05/15  David Levine (Biostat)    Minor Histocompatibility and Graft-Versus-Host Disease
 05/22  Sumit Mukherjee (EE)      Tentative: a framework for single-cell RNA-seq learning
 Papers, etc.

  Note on Electronic Access to Journals

The UW Library is generally a paid subscriber to non-open-access journals we cite. You can freely access these articles from on-campus computers. For off-campus access, follow the "[offcampus]" links below or look at the library "proxy server" instructions. You will be prompted for your UW net ID and password.  

03/27: Organizational Meeting

04/03: Monocle 2 -- Xiaojie Qiu (GS)

Xiaojie Qiu, Qi Mao, Ying Tang, Li Wang, Raghav Chawla, Hannah Pliner, Cole Trapnell, "Reversed graph embedding resolves complex single-cell developmental trajectories,"

Abstract: Organizing single cells along a developmental trajectory has emerged as a powerful tool for understanding how gene regulation governs cell fate decisions. However, learning the structure of complex single-cell trajectories with two or more branches remains a challenging computational problem. We present Monocle 2, which uses reversed graph embedding to reconstruct single-cell trajectories in a fully unsupervised manner. Monocle 2 learns an explicit principal graph to describe the data, greatly improving the robustness and accuracy of its trajectories compared to other algorithms. Monocle 2 uncovered a new, alternative cell fate in what we previously reported to be a linear trajectory for differentiating myoblasts. We also reconstruct branched trajectories for two studies of blood development, and show that loss-of-function mutations in key lineage transcription factors divert cells to alternative branches on the trajectory. Monocle 2 is thus a powerful tool for analyzing cell fate decisions with single-cell genomics.

04/10: Game Theory Meets ML -- Scott Lundberg (CSE)

Title: How game theory can help us understand our machine learning models

Abstract: When applying machine learning to computational biology it is important to have accurate models, and it is also important to understand why they make specific predictions. Unfortunately, these two requirements are often at odds. Here we demonstrate how to have your cake and eat it too: how to have the highest accuracy while retaining interpretability.

04/17: Genome-Guided Transcriptome Assembly -- Shunfu Mao (EE)

Title: RefShannon: a genome-guided transcriptome assembler using sparse flow decomposition

Abstract: High-throughput sequencing of RNA (RNA-Seq) has become a staple in modern molecular biology, with applications not only in quantifying gene expression but also in isoform-level analysis of the RNA transcripts. To enable such an isoform-level analysis, a transcriptome assembly algorithm is used to stitch together the observed short reads into the corresponding transcripts. This task is complicated by alternative splicing, a mechanism by which the same gene may generate multiple distinct RNA transcripts. We develop a novel genome-guided transcriptome assembler, RefShannon, that exploits the varying abundances of the different transcripts to enable an accurate reconstruction of the transcripts. Our evaluation shows RefShannon improves sensitivity by 13% to 25% at a given specificity in comparison with state-of-the-art assemblers including StringTie and Cufflinks.

04/24: Resolving Multi-copy Duplication in Genomes -- Sudipto Mukherjee (EE)

Abstract: Structural rearrangement of DNA has come to the forefront of research in biology due to its association with diseases. Advances in sequencing technologies have made it possible to obtain partial information about these regions. The goal of this project is to design an algorithm for reconstructing a form of structural variation, known as segmental duplication, by modeling the problem as a discrete matrix completion problem. By leveraging techniques from non-convex optimization and structure recovery with missing data, the designed workflow is able to achieve better results than state-of-the-art algorithms.

05/01: Chromatin Architecture -- Jacob Schreiber (CSE)

Jacob Schreiber, Maxwell Libbrecht, Jeffrey Bilmes, and William Stafford Noble, "Nucleotide sequence and DNaseI sensitivity are predictive of 3D chromatin architecture,"

Abstract: Motivation: Recently, Hi-C has been used to probe the 3D chromatin architecture of multiple organisms and cell types. The resulting collections of pairwise contacts across the genome have connected chromatin architecture to many cellular phenomena, including replication timing and gene regulation. However, high resolution (10 kb or finer) contact maps remain scarce due to the expense and time required for collection. A computational method for predicting pairwise contacts without the need to run a Hi-C experiment would be invaluable in understanding the role that 3D chromatin architecture plays in genome biology. Results: We describe Rambutan, a deep convolutional neural network that predicts Hi-C contacts at 1 kb resolution using nucleotide sequence and DNaseI assay signal as inputs. Specifically, Rambutan identifies locus pairs that engage in high confidence contacts according to Fit-Hi-C, a previously described method for assigning statistical confidence estimates to Hi-C contacts. We first demonstrate Rambutan’s performance across chromosomes at 1 kb resolution in the GM12878 cell line. Subsequently, we measure Rambutan’s performance across six cell types. In this setting, the model achieves an area under the receiver operating characteristic curve between 0.7662 and 0.8246 and an area under the precision-recall curve between 0.3737 and 0.9008. We further demonstrate that the predicted contacts exhibit expected trends relative to histone modification ChIP-seq data, replication timing measurements, and annotations of functional elements such as promoters and enhancers. Finally, we predict Hi-C contacts for 53 human cell types and show that the predictions cluster by cellular function. Availability: Tutorials and source code for Rambutan are publicly available at

05/08: Large-scale RNA-Seq Analysis -- Daniel Jones (CSE)

05/15: Minor Histocompatibility and Graft-Versus-Host Disease -- David Levine (Biostat)

  • PJ Martin, DM Levine, BE Storer, EH Warren, X Zheng, SC Nelson, AG Smith, BK Mortensen, JA Hansen, "Genome-wide minor histocompatibility matching as related to the risk of graft-versus-host disease." Blood, 129, #6 (2017) 791-798. [offcampus]
Abstract:   The risk of acute graft-versus-host disease (GVHD) is higher after allogeneic hematopoietic cell transplantation (HCT) from unrelated donors as compared with related donors. This difference has been explained by increased recipient mismatching for major histocompatibility antigens or minor histocompatibility antigens. In the current study, we used genome-wide arrays to enumerate single nucleotide polymorphisms (SNPs) that produce graft-versus-host (GVH) amino acid coding differences between recipients and donors. We then tested the hypothesis that higher degrees of genome-wide recipient GVH mismatching correlate with higher risks of GVHD after allogeneic HCT. In HLA-genotypically matched sibling recipients, the average recipient mismatching of coding SNPs was 9.35%. Each 1% increase in genome-wide recipient mismatching was associated with an estimated 20% increase in the hazard of grades III-IV GVHD (hazard ratio [HR], 1.20; 95% confidence interval [CI], 1.05-1.37; P = .007) and an estimated 22% increase in the hazard of stage 2-4 acute gut GVHD (HR, 1.22; 95% CI, 1.02-1.45; P = .03). In HLA-A, B, C, DRB1, DQA1, DQB1, DPA1, DPB1-phenotypically matched unrelated recipients, the average recipient mismatching of coding SNPs was 17.3%. The estimated risks of GVHD-related outcomes in HLA-phenotypically matched unrelated recipients were low, relative to the large difference in genome-wide mismatching between the 2 groups. In contrast, the risks of GVHD-related outcomes were higher in HLA-DP GVH-mismatched unrelated recipients than in HLA-matched sibling recipients. Taken together, these results suggest that the increased GVHD risk after unrelated HCT is predominantly an effect of HLA-mismatching.
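To make the "large difference in genome-wide mismatching" concrete, the arithmetic below is a rough sketch, not the authors' analysis. It assumes a log-linear (Cox-style) hazard model, in which a per-unit hazard ratio compounds multiplicatively, and it naively carries the sibling-cohort estimate for grades III-IV GVHD (HR 1.20 per 1%) across the gap between the two group averages quoted in the abstract. The function name is hypothetical, and the abstract itself reports that observed risks in matched unrelated recipients were low relative to this kind of extrapolation.

```python
# Illustrative arithmetic only; the numbers come from the abstract above,
# and the log-linear extrapolation is an assumption, not the paper's method.
hr_per_percent = 1.20      # hazard ratio per 1% increase in mismatching (grades III-IV GVHD)
sibling_mismatch = 9.35    # average coding-SNP mismatching, HLA-matched siblings (%)
unrelated_mismatch = 17.3  # average coding-SNP mismatching, matched unrelated recipients (%)


def extrapolated_hazard_ratio(hr_per_unit: float, delta_units: float) -> float:
    """Hazard ratio implied by a log-linear model for a covariate change of delta_units."""
    return hr_per_unit ** delta_units


delta = unrelated_mismatch - sibling_mismatch
naive_hr = extrapolated_hazard_ratio(hr_per_percent, delta)

print(f"mismatch difference: {delta:.2f} percentage points")
print(f"naively extrapolated hazard ratio: {naive_hr:.2f}")
```

Under these assumptions the extrapolated hazard ratio comes out at several-fold, which is the scale against which the abstract describes the observed risks in HLA-phenotypically matched unrelated recipients as low.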

05/22: Tentative: a framework for single-cell RNA-seq learning -- Sumit Mukherjee (EE)

05/29: Holiday

 Other Seminars
Past quarters of CSE 590C
COMBI & Genome Sciences Seminars
Biostatistics Seminars
Microbiology Department Seminars
 Resources
Molecular Biology for Computer Scientists, a primer by Lawrence Hunter (46 pages)
A Quick Introduction to Elements of Biology, a primer by Alvis Brazma et al.
A comprehensive FAQ at, including annotated links to online tutorials and lectures.
CSE 527: Computational Biology
CSEP 590A: Computational Biology (Professional Masters Program)
Genome 540/541: Introduction to Computational Molecular Biology: Genome and Protein Sequence Analysis

CSE's Computational Molecular Biology research group
Interdisciplinary Ph.D. program in Computational Molecular Biology

Computer Science & Engineering
University of Washington
Box 352350
Seattle, WA  98195-2350
(206) 543-1695 voice, (206) 543-2969 FAX