Course Info |
|
CSE 590C is a weekly seminar on Readings and Research in Computational Biology, open to all
graduate students in computational, biological, and mathematical sciences.
|
Schedule |
Date | Presenters/Participants | Topic | Details |
03/28 | ---- No Meeting ---- |
04/04 | Max Libbrecht, CSE/GS | A unified encyclopedia of human functional elements through fully automated annotation of 166 human cell types | Details |
04/11 | Safiye Celik, CSE | Extracting a low-dimensional description of multiple gene expression datasets reveals a potential driver for tumor-associated stroma in ovarian cancer | Details |
04/18 | Tim Durham, GS | Imputing Missing Data in the Roadmap Epigenomics Project Using Tensor Decomposition Approaches | |
04/25 | Scott Lundberg, CSE | Predicting desaturation events in the operating room using models that retain both trust and accuracy | Details |
05/02 | Xiaojie Qiu, GS | A computational framework to learn principal developmental graph and to detect novel drivers for cell fate transition from single-cell measurements | Details |
05/09 | Hannah Pliner, GS | Identifying patterns of chromatin remodeling during skeletal muscle myogenesis | |
05/16 | Sreeram Kannan, EE | Inferring information flow in genetic pathways from single cell data | |
05/23 | Mary Emond, Biostat | Trying to be smarter about aggregating rare variants for phenotype association tests: using functional information and "Can we snoop the data prior to determining aggregates without killing our power?" | |
05/30 | Holiday |
|
Papers, etc. |
Note on Electronic Access to Journals
The UW Library is generally a paid subscriber to non-open-access journals we cite. You can freely access these
articles from on-campus computers. For off-campus access, follow the "[offcampus]" links below or look at the
library "proxy server" instructions. You will be
prompted for your UW net ID and password.
03/28: ---- No Meeting ----
04/04: A unified encyclopedia of human functional elements through fully automated annotation of 166 human cell types -- Max Libbrecht, CSE/GS
|
Abstract:
Semi-automated genome annotation algorithms such as Segway and ChromHMM are widely used to model diverse
genomics data sets. These algorithms take as input a collection of genomics data sets and simultaneously partition
the genome and label each segment with an integer such that positions with the same label have similar patterns of
activity. These algorithms are "semi-automated" because a human performs a functional interpretation of the labels
after the annotation process. Previous attempts to annotate multiple cell types using these methods primarily
trained a single model to apply to all cell types, but this approach requires that all cell types have exactly the
same data sets available and is sensitive to artifactual differences between genomics experiments. Training an
independent model for each cell type avoids these limitations, but was previously impractical because doing so would
require performing manual interpretation separately for each cell type. We propose a method for automating the
annotation interpretation step by using a machine learning classifier trained on previous human interpretations. The
use of this classifier allows the annotation process to proceed from raw data to final output in a fully automated
way. We applied Segway with automated interpretation to all available data sets for all 166 human cell types with
sufficient data, the most comprehensive genome annotation to date. We compiled these annotations together to produce
a unified encyclopedia of all function-associated elements in the human genome, using evolutionary conservation to
identify function-associated types of activity. The resulting encyclopedia annotates each functional element that is
active in at least one cell type with its type and its pattern of activity across these cell types. We found that
the activity marked by this encyclopedia explains most noncoding evolutionary conservation and identifies functional
variants marked by GWAS tag SNPs. This unified encyclopedia therefore enables easy and intuitive interpretation of
the effect of sequence variants on phenotype, such as for the investigation of disease, evolutionary conservation, or
positive selection.
|
04/11: Extracting a low-dimensional description of multiple gene expression datasets reveals a potential driver for tumor-associated stroma in ovarian cancer -- Safiye Celik, CSE
|
Abstract:
We present the INSPIRE (INferring Shared modules from multiPle gene expREssion datasets) method to infer
highly coherent and robust modules of co-expressed genes and the dependencies among the modules from multiple
expression datasets. INSPIRE increases the power to detect robust and relevant patterns (modules and dependencies
among modules) by enabling the use of multiple datasets that contain different sets of genes due to, e.g.,
differences in microarray platforms.
Our evaluations on synthetically generated datasets and gene expression datasets from multiple ovarian cancer
studies show that the model learned by INSPIRE can explain unseen data better and can reveal prior knowledge on gene
functions more accurately than alternative methods. Applying INSPIRE to nine ovarian cancer datasets leads to the
identification of a new marker and potential molecular driver of tumor-associated stroma: HOPX. The HOPX module
strongly overlaps with the genes defining the mesenchymal patient subtype identified in The Cancer Genome Atlas
(TCGA) ovarian cancer data. We provide evidence for a previously unknown molecular basis of tumor resectability
efficacy involving tumor-associated mesenchymal stem cells represented by HOPX.
|
04/18: Imputing Missing Data in the Roadmap Epigenomics Project Using Tensor Decomposition Approaches -- Tim Durham, GS
04/25: Predicting desaturation events in the operating room using models that retain both trust and accuracy -- Scott Lundberg, CSE
|
Abstract:
During a typical operating room procedure, many different sensors and data points are recorded about an individual. Using these data to predict adverse events is a promising application of machine learning in the operating room. Here we predict oxygen desaturation events during anesthesia, and show how extremely complex models can be succinctly explained to a doctor visually. Model accuracy can match or exceed that of human doctors, and the ability to explain "why" a prediction was made aids in its practical use.
|
05/02: A computational framework to learn principal developmental graph and to detect novel drivers for cell fate transition from single-cell measurements -- Xiaojie Qiu, GS
|
Abstract:
Development has long been regarded as a hierarchical branching process. Conventional studies use
population measurements on bulk samples, which hampers investigation of the intricate developmental dynamics. The
recent emergence of single-cell RNA-seq makes it possible to track the hierarchical branching process by taking
advantage of the collective behavior of individual cells during cell fate transition. However, accurately
reconstructing the developmental trajectory from high-dimensional, snapshot, noisy scRNA-seq data poses
a huge computational challenge. In this talk, I will introduce the manifold learning algorithm DDRTree,
originally developed for inferring cancer progression, and a novel feature selection method, fstree, for reconstructing
accurate developmental trajectories. Compared to other existing algorithms, this approach is dramatically more
accurate and robust. We also build a statistical framework, BEAM (branch expression analysis modeling), for
detecting genes that change dynamically along different developmental lineages. The unprecedented high resolution of the
reconstructed developmental trajectories enables us not only to determine the driver genes that play an important role at
the critical time point of cell fate transition but also to directly infer causal gene regulatory networks.
|
05/09: Identifying patterns of chromatin remodeling during skeletal muscle myogenesis -- Hannah Pliner, GS
05/16: Inferring information flow in genetic pathways from single cell data -- Sreeram Kannan, EE
05/23: Trying to be smarter about aggregating rare variants for phenotype association tests: using functional information and "Can we snoop the data prior to determining aggregates without killing our power?" -- Mary Emond, Biostat
05/30: Holiday
|