University of Washington Computer Science & Engineering
  CSE 590C, Sp '16: Reading & Research in Comp. Bio.

 Course Info
CSE 590C is a weekly seminar on Readings and Research in Computational Biology, open to all graduate students in computational, biological, and mathematical sciences.
When/Where: Mondays, 3:30 - 4:50, EE1 003
Organizers: Joe Felsenstein, Su-In Lee, Bill Noble, Larry Ruzzo, Cole Trapnell
Credit: 1-3 Variable
Grading: Credit/No Credit. Talk to the organizers if you are unsure of our expectations.
 Email
cse590cb@cs.washington.edu Course-related announcements and discussions
  Manage Your Subscription List Archives
compbio-seminars@cs.washington.edu Biology seminar announcements from all around campus
  Manage Your Subscription List Archives
compbio-group@cs.washington.edu Computational biology discussions, conference/job postings, etc.
  Manage Your Subscription List Archives
 Theme
Traditionally, we reserve Spring quarter for "homegrown" research: highlights of work by researchers in the Seattle area. Our Spring schedule is:
 Schedule
Date | Presenters/Participants | Topic
03/28 | ---- No Meeting ---- |
04/04 | Max Libbrecht, CSE/GS | A unified encyclopedia of human functional elements through fully automated annotation of 166 human cell types
04/11 | Safiye Celik, CSE | Extracting a low-dimensional description of multiple gene expression datasets reveals a potential driver for tumor-associated stroma in ovarian cancer
04/18 | Tim Durham, GS | Imputing Missing Data in the Roadmap Epigenomics Project Using Tensor Decomposition Approaches
04/25 | Scott Lundberg, CSE | Predicting desaturation events in the operating room using models that retain both trust and accuracy
05/02 | Xiaojie Qiu, GS | A computational framework to learn principal developmental graph and to detect novel drivers for cell fate transition from single-cell measurements
05/09 | Hannah Pliner, GS | Identifying patterns of chromatin remodeling during skeletal muscle myogenesis
05/16 | Sreeram Kannan, EE | Inferring information flow in genetic pathways from single cell data
05/23 | Mary Emond, Biostat | Trying to be smarter about aggregating rare variants for phenotype association tests: using functional information and "Can we snoop the data prior to determining aggregates without killing our power?"
05/30 | Holiday |
 Papers, etc.

  Note on Electronic Access to Journals

The UW Library is generally a paid subscriber to non-open-access journals we cite. You can freely access these articles from on-campus computers. For off-campus access, follow the "[offcampus]" links below or look at the library "proxy server" instructions. You will be prompted for your UW net ID and password.  


03/28: ---- No Meeting ----

04/04: A unified encyclopedia of human functional elements through fully automated annotation of 166 human cell types -- Max Libbrecht, CSE/GS

    Abstract:   Semi-automated genome annotation algorithms such as Segway and ChromHMM are widely used to model diverse genomics data sets. These algorithms take as input a collection of genomics data sets and simultaneously partition the genome and label each segment with an integer such that positions with the same label have similar patterns of activity. These algorithms are "semi-automated" because a human performs a functional interpretation of the labels after the annotation process. Previous attempts to annotate multiple cell types using these methods primarily trained a single model to apply to all cell types, but this approach requires that all cell types have exactly the same data sets available and is sensitive to artifactual differences between genomics experiments. Training an independent model for each cell type avoids these limitations, but was previously impractical because doing so would require performing manual interpretation separately for each cell type. We propose a method for automating the annotation interpretation step by using a machine learning classifier trained on previous human interpretations. The use of this classifier allows the annotation process to proceed from raw data to final output in a fully automated way. We applied Segway with automated interpretation to all available data sets for all 166 human cell types with sufficient data, producing the most comprehensive genome annotation to date. We compiled these annotations to produce a unified encyclopedia of all function-associated elements in the human genome, using evolutionary conservation to identify function-associated types of activity. The resulting encyclopedia annotates each functional element that is active in at least one cell type with its type and its pattern of activity across these cell types. We found that the activity marked by this encyclopedia explains most noncoding evolutionary conservation and identifies functional variants marked by GWAS tag SNPs. This unified encyclopedia therefore enables easy and intuitive interpretation of the effect of sequence variants on phenotype, such as in the investigation of disease, evolutionary conservation, or positive selection.
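
    Illustration (not from the talk):   The automated interpretation step described above can be pictured with a toy example. The sketch below is only an assumption about the general shape of such a step: it uses a nearest-centroid classifier, which may differ from the classifier used in the actual work, and the feature vectors, term names, and function names (train_centroids, interpret_model) are hypothetical and invented for illustration. Each integer label from an independently trained model is summarized by a small feature vector, and the classifier assigns it the term whose previously human-interpreted examples it most resembles.

```python
from collections import defaultdict
from math import dist

# Hypothetical training data: previous human interpretations.
# Each feature vector is a made-up summary of a label's activity
# (for example, mean signal of three assays); the values are toy numbers.
PREVIOUS_INTERPRETATIONS = [
    ((0.9, 0.1, 0.0), "Promoter"),
    ((0.8, 0.2, 0.1), "Promoter"),
    ((0.1, 0.9, 0.0), "Transcribed"),
    ((0.0, 0.8, 0.1), "Transcribed"),
    ((0.0, 0.1, 0.9), "Repressed"),
]


def train_centroids(examples):
    """Average the feature vectors of each term to get one centroid per term."""
    sums = {}
    counts = defaultdict(int)
    for features, term in examples:
        if term not in sums:
            sums[term] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[term][i] += value
        counts[term] += 1
    return {term: [s / counts[term] for s in total] for term, total in sums.items()}


def interpret_model(centroids, label_features):
    """Map each integer label to the term with the nearest centroid."""
    return {
        label: min(centroids, key=lambda term: dist(centroids[term], features))
        for label, features in label_features.items()
    }


# Integer labels from one newly trained (hypothetical) per-cell-type model.
NEW_MODEL_LABELS = {
    0: (0.1, 0.7, 0.1),
    1: (0.7, 0.1, 0.2),
    2: (0.1, 0.0, 0.8),
}

centroids = train_centroids(PREVIOUS_INTERPRETATIONS)
interpretation = interpret_model(centroids, NEW_MODEL_LABELS)
print(interpretation)
```

    Because the interpretation is learned once from earlier human annotations, the same mapping can in principle be applied to every per-cell-type model without further manual work, which is the property the abstract relies on.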

04/11: Extracting a low-dimensional description of multiple gene expression datasets reveals a potential driver for tumor-associated stroma in ovarian cancer -- Safiye Celik, CSE

    Abstract:   We present the INSPIRE (INferring Shared modules from multiPle gene expREssion datasets) method to infer highly coherent and robust modules of co-expressed genes and the dependencies among the modules from multiple expression datasets. INSPIRE increases the power to detect robust and relevant patterns (modules and dependencies among modules) by enabling the use of multiple datasets that contain different sets of genes due to, e.g., differences in microarray platforms.

Our evaluations on synthetically generated datasets and gene expression datasets from multiple ovarian cancer studies show that the model learned by INSPIRE can explain unseen data better and can reveal prior knowledge on gene functions more accurately than alternative methods. Applying INSPIRE to nine ovarian cancer datasets leads to the identification of a new marker and potential molecular driver of tumor-associated stroma: HOPX. The HOPX module strongly overlaps with the genes defining the mesenchymal patient subtype identified in The Cancer Genome Atlas (TCGA) ovarian cancer data. We provide evidence for a previously unknown molecular basis of tumor resectability efficacy involving tumor-associated mesenchymal stem cells represented by HOPX.

04/18: Imputing Missing Data in the Roadmap Epigenomics Project Using Tensor Decomposition Approaches -- Tim Durham, GS

04/25: Predicting desaturation events in the operating room using models that retain both trust and accuracy -- Scott Lundberg, CSE

    Abstract:   During a typical operating room procedure, many different sensors and data points are recorded about an individual. Using this data to predict adverse events is a promising application of machine learning in the operating room. Here we predict oxygen desaturation events during anesthesia, and show how extremely complex models can be succinctly explained to a doctor visually. Model accuracy can match or exceed that of human doctors, and the ability to explain "why" a prediction was made aids in its practical use.

05/02: A computational framework to learn principal developmental graph and to detect novel drivers for cell fate transition from single-cell measurements -- Xiaojie Qiu, GS

    Abstract:   Development has long been regarded as a hierarchical branching process. Conventional studies rely on population measurements of bulk samples, which hampers investigation of the intricate developmental dynamics. The recent emergence of single-cell RNA-seq makes it possible to track the hierarchical branching process by taking advantage of the collective behavior of individual cells during cell fate transition. However, accurately reconstructing the developmental trajectory from high-dimensional, snapshot, noisy scRNA-seq data poses a huge computational challenge. In this talk, I will introduce a manifold learning algorithm, DDRTree, originally developed for inferring cancer progression, and a novel feature selection method, fstree, for reconstructing accurate developmental trajectories. Compared to other existing algorithms, this algorithm is dramatically more accurate and robust. We also build a statistical framework, BEAM (branch expression analysis modeling), for detecting genes that change dynamically along different developmental lineages. The unprecedented high resolution of the reconstructed developmental trajectories enables us not only to determine the driver genes that play an important role at the critical time point of cell fate transition but also to directly infer causal gene regulatory networks.

05/09: Identifying patterns of chromatin remodeling during skeletal muscle myogenesis -- Hannah Pliner, GS

05/16: Inferring information flow in genetic pathways from single cell data -- Sreeram Kannan, EE

05/23: Trying to be smarter about aggregating rare variants for phenotype association tests: using functional information and "Can we snoop the data prior to determining aggregates without killing our power?" -- Mary Emond, Biostat

05/30: Holiday

 Other Seminars
Past quarters of CSE 590C
COMBI & Genome Sciences Seminars
Biostatistics Seminars
Microbiology Department Seminars
 Resources
Molecular Biology for Computer Scientists, a primer by Lawrence Hunter (46 pages)
A Quick Introduction to Elements of Biology, a primer by Alvis Brazma et al.
A comprehensive FAQ at bioinformatics.org, including annotated links to online tutorials and lectures.
CSE 527: Computational Biology
CSEP 590A: Computational Biology (Professional Masters Program)
Genome 540/541: Introduction to Computational Molecular Biology: Genome and Protein Sequence Analysis

CSE's Computational Molecular Biology research group
Interdisciplinary Ph.D. program in Computational Molecular Biology


Computer Science & Engineering
University of Washington
Box 352350
Seattle, WA  98195-2350
(206) 543-1695 voice, (206) 543-2969 FAX