Research

The Lee laboratory studies transposable elements and other types of genomic variations in human disease using computational genomic and bioinformatic approaches. Specifically, we develop and apply computational methods for genomic studies using next-generation sequencing and perform integrative analyses of DNA- and RNA-sequencing data.


Our research currently focuses on:

1. Pathogenic retrotransposon insertions in human disease. Retrotransposon mobilization is a significant source of human genomic variability and is causally implicated in various Mendelian disorders and complex diseases. We aim to define the relevance and importance of retrotransposon insertions as a mechanism underlying human diseases, including cancer, Mendelian disorders, and complex neurological disorders.

2. The role of somatic mutation in developmental and degenerative disorders. Our previous single-cell studies have revealed extensive somatic mosaicism in the human brain. We aim to analyze single-cell genomic data and develop novel methods to detect various types of low-clonal or single-cell unique somatic mutations and to understand how these mutations relate to neurodegeneration.

3. The effects of DNA variants on RNA splicing. One major pathogenic mechanism underlying human disease is disruption of RNA splicing caused by DNA mutations including retrotransposon insertions. Our recent analysis of cancer DNA- and RNA-seq profiles found a large set of somatic mutations that disrupted RNA splicing, highlighting intron retention as a common yet underappreciated mechanism of tumor suppressor inactivation. Our goal is to systematically characterize the effects of pathogenic DNA variants, especially non-coding variants, on RNA splicing.

We aim to address the following major biomedical questions:

What are the causal genomic variants for genetic disorders of unknown etiology?

Advances in human genetics and genomics have uncovered causal variants for many hereditary human diseases. However, the causal variants for a significant fraction of Mendelian diseases remain unidentified. We search for causal genetic variants in genetic disorders of unknown etiology by investigating genomic and transcriptomic data, with a special focus on non-canonical types of variants, including 1) non-coding variants, especially those associated with repetitive sequences such as transposable elements and tandem repeats, 2) genomic variants causing splicing aberrations, and 3) mosaic variants with low variant allele frequency. These types of variants cannot be detected through typical variant analyses. We have developed effective computational methods and gained an in-depth understanding of existing tools in order to study these types of variants. For example, the Tea (Transposable Element Analyzer) methods, which we developed to study somatic retrotransposition in cancer and single-neuronal genomes (Lee et al., Science, 2012; Evrony and Lee et al., Neuron, 2015), have continued to evolve to accommodate advances in genomic and computing technologies and to support the study of various genetic disorders. We have also developed a computational method to detect and predict splicing-disrupting somatic mutations, including synonymous ones (Jung et al., Nature Genetics, 2015).

How often do somatic mutations occur and what are their roles in developmental and degenerative disorders?

Somatic mutations have been studied most extensively in cancer, but they also cause neurodevelopmental disorders. Our previous studies of single-neuronal genomes in postmortem human brains have revealed abundant somatic mutations, such as mobilization of transposable elements and variation in short tandem repeats, suggesting their potential role in neurological disorders. Our studies have not only demonstrated the great promise of single-cell genomics for studying somatic mutation but also highlighted the importance of rigorous data analysis and computational expertise in addressing technical artifacts in the data. We continue to investigate somatic mutations in several neurological disorders and other conditions in close collaboration with clinical and experimental scientists.