CSE 427 Computational Biology


Announcements

Our first meeting will be on Tuesday, January 7, 2020

Time / Place

TTh 10-11:20 / CSE2 (Gates Center) G01

Course Description

Biomedical data is rapidly increasing in quantity, scope, and generality, expanding opportunities to discover novel biological processes and clinically translatable outcomes. The ENCODE (Encyclopedia of DNA Elements) project is generating a myriad of sequencing datasets that measure varied activities across the human genome in many different cell types. A growing number of disease studies are producing multiple types of high-throughput molecular and imaging data. Medical records, now routinely digitized, provide new possibilities to follow and predict patients’ progress in real time.

Machine learning (ML), a key technology in modern biology that addresses these changing dynamics, aims to infer meaningful interactions among variables by learning their statistical relationships from data consisting of measurements of those variables across samples. Accurate inference of such interactions from big biological data would provide a tremendous opportunity to generate novel biological discoveries, therapeutic targets, and predictive models for patient outcomes. However, a greatly increased hypothesis space and complex dependencies among variables pose difficult open challenges. To meet these challenges, recent advances in ML have focused on algorithms (i) to infer reliable, accurate statistical relationships from data in various kinds of network inference problems, and (ii) to improve the interpretability of ML models, which are often treated as black boxes, pushing the boundaries of both ML and biology.

In this course, we will discuss ML approaches, with an emphasis on those that address the black-box nature of ML, to solve a wide variety of problems in biology and medicine. Class projects will provide opportunities to solve real-world research problems in biology and medicine.

No background in biology is required. Students are expected to have taken undergraduate-level machine learning or statistics courses and to have programming skills in MATLAB, R, C++, Java, Perl, or Python. Students who are interested in joining Prof. Lee's group are required to take this course.

Grading