Kathy Chen FPO

Date and Time
Tuesday, April 9, 2024 - 10:00am to 12:00pm
Location
Carl Icahn Lab 280
Type
FPO

Kathy Chen will present her FPO "Decoding the sequence basis of gene regulation" on Tuesday, April 9, 2024 at 10:00 AM in Icahn 280 and on Zoom.

Zoom link: https://princeton.zoom.us/j/95565860844?pwd=LzNkcHZ6bnVackxZRnZJSitXdW9NUT09

The members of Kathy’s committee are as follows:
Examiners: Olga Troyanskaya (Adviser), Mona Singh, Kai Li
Readers: Ryan Adams, Jian Zhou (UT Southwestern)

Everyone is invited to attend her talk.  

Abstract follows below:

Deciphering the regulatory code of gene expression is a critical challenge in human genetics, instrumental to unlocking the potential of personalized medicine. Modern experimental technologies have resulted in an abundance of high-dimensional genome-wide data, revealing the complex system of epigenetic interactions encoded in the genome. The development of computational approaches that can leverage these vast data to model chromatin interactions globally offers a new understanding of how genomic sequences specify regulatory functions. Specifically, sequence-based deep learning models have become the de facto standard for learning the functional properties encoded in DNA sequences based on large sequencing datasets. These models are powerful tools for interpreting molecular and phenotypic effects, capable of predicting the impact of any noncoding variant in the human genome, even rare or never-before-observed variants, and systematically characterizing their consequences beyond what is tractable from experiments and quantitative genetics alone.

In this thesis, we present two deep learning-based sequence models that predict different epigenetic properties of the genome that contribute to transcriptional regulation. First, Sei is a framework for integrating human genetics data with sequence information to discover the regulatory basis of traits and diseases. Sei learns a vocabulary of regulatory activities, called sequence classes, using a model that predicts 21,907 chromatin profiles across more than 1,300 cell lines and tissues. Sequence classes provide a global classification and quantification of sequences and variants based on diverse regulatory activities, such as cell type-specific enhancers.

Next, we developed Hedgehog, a model that enables the quantification of variation on methylation sites. Hedgehog predicts 296 continuous-valued methylation profiles across a range of cell types and tissues. Hedgehog is complementary to Sei and reveals new insights into the relationship between DNA methylation and other epigenetic modifications.

Finally, we show how deep learning-based methods can be applied to elucidate the regulatory basis of human health and disease. Specifically, we use Sei to study the contribution of noncoding mutations in cancer. Collectively, we demonstrate novel frameworks for modeling the sequence dependencies of the epigenome and the capability of such approaches to delineate the regulatory mechanisms underlying complex diseases.
