CSE 590C, Sp '16: Reading & Research in Comp. Bio.

University of Washington Computer Science & Engineering

Course Info

CSE 590C is a weekly seminar on Readings and Research in Computational Biology, open to all graduate students in computational, biological, and mathematical sciences.

When/Where: Mondays, 3:30 - 4:50, EE1 003

Organizers: Joe Felsenstein, Su-In Lee, Bill Noble, Larry Ruzzo, Cole Trapnell

Credit: 1-3 Variable

Grading: Credit/No Credit. Talk to the organizers if you are unsure of our expectations.

Email

cse590cb@cs.washington.edu Course-related announcements and discussions

compbio-seminars@cs.washington.edu Biology seminar announcements from all around campus

compbio-group@cs.washington.edu Computational biology discussions, conference/job postings, etc.

Theme

Traditionally, we reserve Spring quarter for "homegrown" research: highlights of work by researchers in the Seattle area. Our Spring schedule is:

Schedule

Date | Presenters/Participants | Topic | Details
03/28 | ---- No Meeting ----
04/04 | Max Libbrecht, CSE/GS | A unified encyclopedia of human functional elements through fully automated annotation of 166 human cell types | Details
04/11 | Safiye Celik, CSE | Extracting a low-dimensional description of multiple gene expression datasets reveals a potential driver for tumor-associated stroma in ovarian cancer | Details
04/18 | Tim Durham, GS | Imputing Missing Data in the Roadmap Epigenomics Project Using Tensor Decomposition Approaches
04/25 | Scott Lundberg, CSE | Predicting desaturation events in the operating room using models that retain both trust and accuracy | Details
05/02 | Xiaojie Qiu, GS | A computational framework to learn principal developmental graph and to detect novel drivers for cell fate transition from single-cell measurements | Details
05/09 | Hannah Pliner, GS | Identifying patterns of chromatin remodeling during skeletal muscle myogenesis
05/16 | Sreeram Kannan, EE | Inferring information flow in genetic pathways from single cell data
05/23 | Mary Emond, Biostat | Trying to be smarter about aggregating rare variants for phenotype association tests: using functional information and "Can we snoop the data prior to determining aggregates without killing our power?"
05/30 | Holiday

Papers, etc.
Note on Electronic Access to Journals
The UW Library is generally a paid subscriber to non-open-access journals we cite. You can freely access these articles from on-campus computers. For off-campus access, follow the "[offcampus]" links below or look at the library "proxy server" instructions. You will be prompted for your UW net ID and password.

03/28: No Meeting

04/04: A unified encyclopedia of human functional elements through fully automated annotation of 166 human cell types -- Max Libbrecht, CSE/GS

    Abstract:   Semi-automated genome annotation algorithms such as Segway and ChromHMM are widely used to model diverse genomics data sets. These algorithms take as input a collection of genomics data sets and simultaneously partition the genome and label each segment with an integer such that positions with the same label have similar patterns of activity. These algorithms are "semi-automated" because a human performs a functional interpretation of the labels after the annotation process. Previous attempts to annotate multiple cell types using these methods primarily trained a single model to apply to all cell types, but this approach requires that all cell types have exactly the same data sets available and is sensitive to artifactual differences between genomics experiments. Training an independent model for each cell type avoids these limitations, but was previously impractical because doing so would require performing manual interpretation separately for each cell type. We propose a method for automating the annotation interpretation step by using a machine learning classifier trained on previous human interpretations. The use of this classifier allows the annotation process to proceed from raw data to final output in a fully automated way. We applied Segway with automated interpretation to all available data sets for all 166 human cell types with sufficient data, producing the most comprehensive genome annotation to date. We compiled these annotations together to produce a unified encyclopedia of all function-associated elements in the human genome, using evolutionary conservation to identify function-associated types of activity. The resulting encyclopedia annotates each functional element that is active in at least one cell type with its type and its pattern of activity across these cell types. We found that the activity marked by this encyclopedia explains most noncoding evolutionary conservation and identifies functional variants marked by GWAS tag SNPs. This unified encyclopedia therefore enables easy and intuitive interpretation of the effect of sequence variants on phenotype, such as for the investigation of disease, evolutionary conservation, or positive selection.

04/11: Extracting a low-dimensional description of multiple gene expression datasets reveals a potential driver for tumor-associated stroma in ovarian cancer -- Safiye Celik, CSE

    Abstract:   We present the INSPIRE (INferring Shared modules from multiPle gene expREssion datasets) method to infer highly coherent and robust modules of co-expressed genes and the dependencies among the modules from multiple expression datasets. INSPIRE increases the power to detect robust and relevant patterns (modules and dependencies among modules) by enabling the use of multiple datasets that contain different sets of genes due to, e.g., differences in microarray platforms. Our evaluations on synthetically generated datasets and gene expression datasets from multiple ovarian cancer studies show that the model learned by INSPIRE can explain unseen data better and can reveal prior knowledge on gene functions more accurately than alternative methods. Applying INSPIRE to nine ovarian cancer datasets leads to the identification of a new marker and potential molecular driver of tumor-associated stroma: HOPX. The HOPX module strongly overlaps with the genes defining the mesenchymal patient subtype identified in The Cancer Genome Atlas (TCGA) ovarian cancer data. We provide evidence for a previously unknown molecular basis of tumor resectability efficacy involving tumor-associated mesenchymal stem cells represented by HOPX.

04/18: Imputing Missing Data in the Roadmap Epigenomics Project Using Tensor Decomposition Approaches -- Tim Durham, GS

04/25: Predicting desaturation events in the operating room using models that retain both trust and accuracy -- Scott Lundberg, CSE

    Abstract:   During a typical operating room procedure, many different sensors and data points are recorded about an individual. Using this data to predict adverse events is a promising application of machine learning in the operating room. Here we predict oxygen desaturation events during anesthesia, and show how extremely complex models can be succinctly explained to a doctor visually. Model accuracy can match or exceed that of human doctors, and the ability to explain "why" a prediction was made aids in its practical use.

05/02: A computational framework to learn principal developmental graph and to detect novel drivers for cell fate transition from single-cell measurements -- Xiaojie Qiu, GS

    Abstract:   Development has long been regarded as a hierarchical branching process. Conventional studies rely on population measurements of bulk samples, which hampers investigation of the intricate developmental dynamics. The recent emergence of single-cell RNA-seq makes it possible to track the hierarchical branching process by taking advantage of the collective behavior of individual cells during cell fate transition. However, accurately reconstructing the developmental trajectory from high-dimensional, snapshot, noisy single-cell RNA-seq data poses a huge computational challenge. In this talk, I will introduce the manifold learning algorithm DDRTree, originally developed for inferring cancer progression, and a novel feature selection method, fstree, for reconstructing accurate developmental trajectories. Compared to other existing algorithms, this algorithm is dramatically more accurate and robust. We also build a statistical framework, BEAM (branch expression analysis modeling), for detecting genes that change dynamically along different developmental lineages. The unprecedented high resolution of the reconstructed developmental trajectories not only enables us to determine the driver genes that play an important role at the critical time point of cell fate transition but also to directly infer causal gene regulatory networks.

05/09: Identifying patterns of chromatin remodeling during skeletal muscle myogenesis -- Hannah Pliner, GS

05/16: Inferring information flow in genetic pathways from single cell data -- Sreeram Kannan, EE

05/23: Trying to be smarter about aggregating rare variants for phenotype association tests: using functional information and "Can we snoop the data prior to determining aggregates without killing our power?" -- Mary Emond, Biostat

05/30: Holiday

Other Seminars

Past quarters of CSE 590C
COMBI & Genome Sciences Seminars
Biostatistics Seminars
Microbiology Department Seminars

Resources

Molecular Biology for Computer Scientists, a primer by Lawrence Hunter (46 pages)
A Quick Introduction to Elements of Biology, a primer by Alvis Brazma et al.
A comprehensive FAQ at bioinformatics.org, including annotated links to online tutorials and lectures.
CSE 527: Computational Biology
CSEP 590A: Computational Biology (Professional Masters Program)
Genome 540/541: Introduction to Computational Molecular Biology: Genome and Protein Sequence Analysis

CSE's Computational Molecular Biology research group
Interdisciplinary Ph.D. program in Computational Molecular Biology

Computer Science & Engineering
University of Washington
Box 352350
Seattle, WA 98195-2350
(206) 543-1695 voice, (206) 543-2969 FAX