General information

Course Logistics

Prerequisites and Grading

Prerequisites: Students entering the class should be comfortable with programming and should have a working knowledge of linear algebra (MATH 308), vector calculus (MATH 126), probability and statistics (CSE 312/STAT 390), and algorithms. For a brief refresher, we recommend consulting the linear algebra and statistics/probability reference materials on the Textbooks page.

Grading: Your grade has three components: three homework assignments, a literature review, and discussion in three Research Showcase lectures. No credit is given for attending other lectures. There are no exams or quizzes.

Homework assignments: All assignments are written; there are no programming assignments. Submit to Gradescope.

Research Showcase: A 45-minute invited presentation about ongoing computational biology research by Allen School PhD students and the instructor. The instructor will then lead a discussion of the work's limitations, potential improvements, and future directions.

Late Policy: We allow 3 late days in total. Late days may be spread over any number of assignments, but the total may not exceed 3. If a late submission would push your total past 3 late days, that assignment receives 0 credit. Late days are rounded up, so an assignment that is 28 hours late accumulates 2 late days.
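
As an illustration of the rounding rule, here is a minimal Python sketch of how late days could be tallied. The function names and the list-of-hours input are hypothetical, not part of any course tooling. The sketch also assumes that an assignment receiving 0 credit does not consume any of the remaining late days; the policy above does not state this either way.

```python
import math

LATE_DAY_BUDGET = 3  # total late days allowed, per the Late Policy


def late_days_used(hours_late: float) -> int:
    """Round lateness up to whole days; on-time submissions use 0 days."""
    if hours_late <= 0:
        return 0
    return math.ceil(hours_late / 24)


def apply_late_policy(hours_late_per_assignment):
    """Return (late_days, receives_credit) for each assignment, in order."""
    remaining = LATE_DAY_BUDGET
    results = []
    for hours in hours_late_per_assignment:
        days = late_days_used(hours)
        if days <= remaining:
            remaining -= days
            results.append((days, True))
        else:
            # Assumption: a zero-credit submission leaves the remaining
            # late-day balance untouched (not specified by the policy).
            results.append((days, False))
    return results
```

For example, `late_days_used(28)` gives 2, matching the 28-hour case in the policy.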

Schedule

Date Content Reading Slides Assignments
Basics
1/4 Welcome/overview. Introduction to computational biology. Molecular Biology for Computer Scientists
Central dogma (10 mins)
Transcription/translation
Yu, Michael Ku, et al. "Translation of genotype to phenotype by a hierarchy of cell subsystems." Cell systems 2.2 (2016): 77-88.
slides
Sequence
1/9 Global sequence analysis (Part 1)
A survey of best practices for RNA-seq data analysis.
TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions.
1/11 Global sequence analysis (Part 2) HiCARN: resolution enhancement of Hi-C data using cascading residual networks. BLAST PSI-BLAST HW 1 here
1/16 Global sequence analysis (Part 3) Deep-learning language models help to improve protein sequence alignment. slides
1/18 Research Showcase (Xiao Wang, Protein structure) slides
1/23 Protein function prediction (part 1)
Graph to Sequence Alignment CAFA4 CAFA3 CAFA2
slides
1/25 Protein function prediction (part 2) Machine learning by Andrew Ng
Protein family classification
HW1 due
1/30 Introduction to biomedical graph analysis (part 1) Graphs in molecular biology slides HW 2 here
Graph (systems biology)
2/1 Research showcase (Hanwen Xu, large language model for protein function analysis)
2/6 Introduction to biomedical graph analysis (part 2) slides
2/8 Research showcase (Yue Guo, bioNLP) Network-based tumor stratification
Supervised network-based tumor stratification
2/13 Biomedical graph diffusion (part 1) Network-based tumor stratification
Supervised network-based tumor stratification
slides
2/15 Biomedical graph diffusion (part 2)
2/20 Research showcase (Addie Woicik, generative model for network integration)
Genomics
2/22 Genomics for precision medicine (drug repurposing) Towards precision medicine slides HW2 due! HW 3 here, due by March 15th. Literature review is also due by March 15th.
2/27 Genomics for precision medicine (drug combination) Towards precision medicine
2/29 Genomics for precision medicine (new drug discovery) slides
3/5 Research showcase (Zucks Liu, Machine learning in Ophthalmology: From segmentation to generation)
3/7 Review of CSE427 slides