Publications

2009

Lee E, Jung H, Radivojac P, Kim JW, Lee D. Analysis of AML genes in dysregulated molecular networks. BMC Bioinformatics. 2009;10 Suppl 9:S2.

BACKGROUND: Identifying disease-causing genes and understanding their molecular mechanisms are essential to developing effective therapeutics. Thus, several computational methods have been proposed to prioritize candidate disease genes by integrating different data types, including sequence information, biomedical literature, and pathway information. Recently, molecular interaction networks have been incorporated to predict disease genes, but most of those methods do not utilize the valuable disease-specific information available in mRNA expression profiles of patient samples. RESULTS: Through the integration of protein-protein interaction networks and gene expression profiles of acute myeloid leukemia (AML) patients, we identified subnetworks of interacting proteins dysregulated in AML and characterized known mutated genes causally implicated in AML that are embedded in the subnetworks. The analysis shows that the set of extracted subnetworks is a reservoir rich in AML genes reflecting key leukemogenic processes such as myeloid differentiation. CONCLUSION: We showed that the integrative approach utilizing both gene expression profiles and molecular networks could identify AML-causing genes, most of which were not detectable with gene expression analysis alone because their changes in mRNA level are minor.

Jung H, Lee E, Kim, Lee D. Pathway level analysis by augmenting activities of transcription factor target genes. IET Syst Biol. 2009;3(6):534–42.

Many approaches to discovering significant pathways in gene expression profiles have been developed to facilitate biological interpretation and hypothesis generation. In this work, the authors propose a pathway identification scheme that integrates the activity of pathway member genes with that of target genes of transcription factors (TFs) in the same pathway by the weighted Z-method. The authors evaluated the integrative scoring scheme on gene expression profiles of essential thrombocythemia patients with JAK2V617F mutation status, primary breast tumour samples with the status of metastasis occurrence, and two independent lung cancer expression profiles with their prognosis. They found that their approach identified cancer-type-specific pathways better than gene set enrichment analysis (GSEA) and Tian's method, using both the original pathways (database pathways that contain TFs) and the extended pathways (including target genes of TFs of the original pathways). The success of the scheme implies that adding information on transcriptional regulation is a better way of utilising mRNA measurements to estimate differential pathway activities from gene expression profiles more accurately.
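The weighted Z-method named in this abstract is a standard way of combining several Z-scores into one: Z = sum(w_i * z_i) / sqrt(sum(w_i^2)). A minimal Python sketch follows. The function name, the two example scores, and the weights are illustrative assumptions; the abstract does not say how the authors chose their weights.

```python
import math


def weighted_z(z_scores, weights):
    """Combine Z-scores with the weighted Z-method.

    Z = sum(w_i * z_i) / sqrt(sum(w_i ** 2))
    """
    if not z_scores or len(z_scores) != len(weights):
        raise ValueError("z_scores and weights must be non-empty and of equal length")
    denominator = math.sqrt(sum(w * w for w in weights))
    if denominator == 0:
        raise ValueError("at least one weight must be non-zero")
    return sum(w * z for w, z in zip(weights, z_scores)) / denominator


# Hypothetical example: one score for the pathway member genes and one
# for the target genes of the pathway's TFs, with assumed weights.
member_gene_z = 1.0
tf_target_z = 2.0
print(weighted_z([member_gene_z, tf_target_z], [3.0, 4.0]))
```

With equal weights the formula reduces to the unweighted Stouffer combination, so the weights only control how much each evidence source contributes.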

BACKGROUND: The Janus kinase-signal transducer and activator of transcription (JAK/STAT) pathway is one of the most important targets for myeloproliferative disorder (MPD). Although several efforts toward modeling the pathway using systems biology have been successful, the pathway has not been fully investigated with regard to its pathological context, receptor kinetics, and mutation effects. RESULTS: We have performed modeling and simulation studies of the JAK/STAT pathway, including the kinetics of two associated receptors (the erythropoietin receptor and thrombopoietin receptor) with the wild type and a recently reported mutation (JAK2V617F) of the JAK2 protein. CONCLUSION: We found that the different kinetics of those two receptors might be important factors that affect the sensitivity of JAK/STAT signaling to the mutation effect. In addition, our simulation results support clinically observed pathological differences between the two subtypes of MPD with respect to the JAK2V617F mutation.

2008

Lee* EA, Chuang* HY, Kim JW, Ideker** T, Lee** D. Inferring pathway activity toward precise disease classification. PLoS Comput Biol. 2008;4(11):e1000217.

The advent of microarray technology has made it possible to classify disease states based on gene expression profiles of patients. Typically, marker genes are selected by measuring the power of their expression profiles to discriminate among patients of different disease states. However, expression-based classification can be challenging in complex diseases due to factors such as cellular heterogeneity within a tissue sample and genetic heterogeneity across patients. A promising technique for coping with these challenges is to incorporate pathway information into the disease classification procedure in order to classify disease based on the activity of entire signaling pathways or protein complexes rather than on the expression levels of individual genes or proteins. We propose a new classification method based on pathway activities inferred for each patient. For each pathway, an activity level is summarized from the gene expression levels of its condition-responsive genes (CORGs), defined as the subset of genes in the pathway whose combined expression delivers optimal discriminative power for the disease phenotype. We show that classifiers using pathway activity achieve better performance than classifiers based on individual gene expression, for both simple and complex case-control studies including differentiation of perturbed from non-perturbed cells and subtyping of several different kinds of cancer. Moreover, the new method outperforms several previous approaches that use a static (i.e., non-conditional) definition of pathways. Within a pathway, the identified CORGs may facilitate the development of better diagnostic markers and the discovery of core alterations in human disease.
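One way to read the CORG idea is as a greedy search: rank the pathway's genes by how well each separates the two classes, then keep adding genes while the discriminative score of the combined activity improves. The Python sketch below illustrates that reading. It assumes z-normalized expression values, a Welch t-statistic as the discriminative score, and an activity defined as the sum of the selected genes' values divided by the square root of their count. None of these choices is stated in the abstract, and all names and data are hypothetical.

```python
import math
from statistics import mean, variance


def t_score(values, labels):
    """Welch t-statistic between samples labelled 1 and samples labelled 0.

    Needs at least two samples per class.
    """
    a = [v for v, lab in zip(values, labels) if lab == 1]
    b = [v for v, lab in zip(values, labels) if lab == 0]
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def activity(expr, genes):
    """Per-sample activity: sum of the chosen genes' values over sqrt(k)."""
    n_samples = len(expr[genes[0]])
    scale = math.sqrt(len(genes))
    return [sum(expr[g][i] for g in genes) / scale for i in range(n_samples)]


def select_corgs(expr, labels):
    """Greedy selection (assumed procedure): add genes in order of their
    individual t-score until the activity's t-score stops increasing."""
    ranked = sorted(expr, key=lambda g: t_score(expr[g], labels), reverse=True)
    chosen = [ranked[0]]
    best = t_score(activity(expr, chosen), labels)
    for gene in ranked[1:]:
        score = t_score(activity(expr, chosen + [gene]), labels)
        if score <= best:
            break
        chosen.append(gene)
        best = score
    return chosen, best


# Hypothetical pathway of three genes measured in six samples.
labels = [1, 1, 1, 0, 0, 0]
expr = {
    "A": [1.0, 1.2, 0.8, -1.0, -0.9, -1.1],
    "B": [0.5, -0.5, 0.1, 0.4, -0.6, 0.1],
    "C": [-1.0, -1.1, -0.9, 1.0, 1.1, 0.9],
}
corgs, score = select_corgs(expr, labels)
print(corgs, round(score, 2))
```

This sketch only searches for genes that are higher in class 1. Handling pathways that are repressed in the disease state would need a second pass with the ranking reversed.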

2007

Chuang* HY, Lee* EA, Liu YT, Lee D, Ideker T. Network-based classification of breast cancer metastasis. Mol Syst Biol. 2007;3:140.

Mapping the pathways that give rise to metastasis is one of the key challenges of breast cancer research. Recently, several large-scale studies have shed light on this problem through analysis of gene expression profiles to identify markers correlated with metastasis. Here, we apply a protein-network-based approach that identifies markers not as individual genes but as subnetworks extracted from protein interaction databases. The resulting subnetworks provide novel hypotheses for pathways involved in tumor progression. Although genes with known breast cancer mutations are typically not detected through analysis of differential expression, they play a central role in the protein network by interconnecting many differentially expressed genes. We find that the subnetwork markers are more reproducible than individual marker genes selected without network information, and that they achieve higher accuracy in the classification of metastatic versus non-metastatic tumors.