publications
publications by categories in reversed chronological order.
2024
- Method. MR-GGI: accurate inference of gene–gene interactions using Mendelian randomization. Wonseok Oh, Junghyun Jung, and Jong Wha J. Joo. BMC Bioinformatics 2024
Researchers have long studied the regulatory processes of genes to uncover their functions. Gene regulatory network analysis is one of the popular approaches for understanding these processes, requiring accurate identification of interactions among the genes to establish the gene regulatory network. Advances in genome-wide association studies and expression quantitative trait loci studies have led to a wealth of genomic data, facilitating more accurate inference of gene–gene interactions. However, unknown confounding factors may influence these interactions, making their interpretation complicated. Mendelian randomization (MR) has emerged as a valuable tool for causal inference in genetics, addressing confounding effects by estimating causal relationships using instrumental variables. In this paper, we propose a new statistical method, MR-GGI, for accurately inferring gene–gene interactions using Mendelian randomization. MR-GGI applies one gene as the exposure and another as the outcome, using causal cis-single-nucleotide polymorphisms as instrumental variables in the inverse-variance weighted MR model. Through simulations, we have demonstrated MR-GGI’s ability to control type 1 error and maintain statistical power despite confounding effects. MR-GGI performed the best when compared to other methods using the F1 score on the DREAM5 dataset. Additionally, when applied to yeast genomic data, MR-GGI successfully identified six clusters. Through gene ontology analysis, we have confirmed that each cluster in our study performs distinct functional roles by gathering genes with specific functions. These findings demonstrate that MR-GGI accurately infers gene–gene interactions despite the confounding effects in real biological environments.
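The inverse-variance weighted model named in the abstract can be illustrated with a minimal sketch. This is the textbook fixed-effect IVW estimator, not the MR-GGI implementation itself; the function name `ivw_estimate` and the summary statistics below are hypothetical and exist only for illustration.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Textbook fixed-effect IVW estimate from per-SNP summary statistics.

    beta_exposure: SNP effects on the exposure gene's expression
    beta_outcome:  SNP effects on the outcome gene's expression
    se_outcome:    standard errors of the outcome effects
    """
    # Each instrument is weighted by the inverse variance of its outcome effect.
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    # Two-sided p-value from a normal approximation of the Wald statistic.
    z = estimate / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return estimate, std_error, p_value


# Hypothetical cis-SNP instruments: outcome effects are exactly half the
# exposure effects, so the estimate should come out at 0.5.
bx = [0.40, 0.25, 0.30]
by = [0.20, 0.125, 0.15]
se = [0.05, 0.04, 0.06]
estimate, std_error, p_value = ivw_estimate(bx, by, se)
print(estimate, std_error, p_value)
```

In this sketch a gene pair would be called interacting when the p-value falls below a chosen threshold; how MR-GGI selects instruments and thresholds is described in the paper, not here.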
- Analysis. Elucidating immunological characteristics of the adenoma-carcinoma sequence in colorectal cancer patients in South Korea using a bioinformatics approach. Jaeseung Song, Daeun Kim, Junghyun Jung, and 2 more authors. Scientific Reports 2024
Colorectal cancer (CRC) is one of the top five most common and life-threatening malignancies worldwide. Most CRC develops from advanced colorectal adenoma (ACA), a precancerous stage, through the adenoma-carcinoma sequence. However, its underlying mechanisms, including how the tumor microenvironment changes, remain elusive. Therefore, we conducted an integrative analysis comparing RNA-seq data collected from 40 ACA patients who visited Dongguk University Ilsan Hospital with normal adjacent colons and tumor samples from 18 CRC patients collected from a public database. Differential expression analysis identified 21 and 79 sequentially up- or down-regulated genes across the continuum, respectively. The functional centrality of the continuum genes was assessed through network analysis, identifying 11 up- and 13 down-regulated hub-genes. Subsequently, we validated the prognostic effects of hub-genes using the Kaplan–Meier survival analysis. To estimate the immunological transition of the adenoma-carcinoma sequence, single-cell deconvolution and immune repertoire analyses were conducted. Significant composition changes for innate immunity cells and decreased plasma B-cells with immunoglobulin diversity were observed, along with distinctive immunoglobulin recombination patterns. Taken together, we believe our findings suggest underlying transcriptional and immunological changes during the adenoma-carcinoma sequence, contributing to the further development of pre-diagnostic markers for CRC.
- Analysis (Corresponding). Large-scale integrative analysis of juvenile idiopathic arthritis for new insight into its pathogenesis. Daeun Kim, Jaeseung Song, Serghei Mangul, and 3 more authors. Arthritis Research & Therapy 2024
Juvenile idiopathic arthritis (JIA) is one of the most prevalent rheumatic disorders in children and is classified as an autoimmune disease (AID). While a robust genetic contribution to JIA etiology has been established, the exact pathogenesis remains unclear. To prioritize biologically interpretable susceptibility genes and proteins for JIA, we conducted transcriptome-wide and proteome-wide association studies (TWAS/PWAS). Then, to understand the genetic architecture of JIA, we systematically analyzed single-nucleotide polymorphism (SNP)-based heritability, a signature of natural selection, and polygenicity. Next, we conducted HLA typing using multi-ethnicity RNA sequencing data. Additionally, we examined the T cell receptor (TCR) repertoire at a single-cell level to explore the potential links between immunity and JIA risk. We have identified 19 TWAS genes and two PWAS proteins associated with JIA risks. Furthermore, we observe that the heritability and cell type enrichment analysis of JIA are enriched in T lymphocytes and HLA regions and that JIA shows higher polygenicity compared to other AIDs. In multi-ancestry HLA typing, B*45:01 is more prevalent in African JIA patients than in European JIA patients, whereas DQA1*01:01, DQA1*03:01, and DRB1*04:01 exhibit a higher frequency in European JIA patients. Using single-cell immune repertoire analysis, we identify clonally expanded T cell subpopulations in JIA patients, including CXCL13+BHLHE40+ TH cells which are significantly associated with JIA risks. Our findings shed new light on the pathogenesis of JIA and provide a strong foundation for future mechanistic studies aimed at uncovering the molecular drivers of JIA.
2023
- Method. A scalable approach to characterize pleiotropy across thousands of human diseases and complex traits using GWAS summary statistics. Zixuan Zhang, Junghyun Jung, Artem Kim, and 3 more authors. The American Journal of Human Genetics 2023
Genome-wide association studies (GWASs) across thousands of traits have revealed the pervasive pleiotropy of trait-associated genetic variants. While methods have been proposed to characterize pleiotropic components across groups of phenotypes, scaling these approaches to ultra-large-scale biobanks has been challenging. Here, we propose FactorGo, a scalable variational factor analysis model to identify and characterize pleiotropic components using biobank GWAS summary data. In extensive simulations, we observe that FactorGo outperforms the state-of-the-art (model-free) approach tSVD in capturing latent pleiotropic factors across phenotypes while maintaining a similar computational cost. We apply FactorGo to estimate 100 latent pleiotropic factors from GWAS summary data of 2,483 phenotypes measured in European-ancestry Pan-UK BioBank individuals (N = 420,531). Next, we find that factors from FactorGo are more enriched with relevant tissue-specific annotations than those identified by tSVD (p = 2.58E−10) and validate our approach by recapitulating brain-specific enrichment for BMI and the height-related connection between reproductive system and muscular-skeletal growth. Finally, our analyses suggest shared etiologies between rheumatoid arthritis and periodontal condition in addition to alkaline phosphatase as a candidate prognostic biomarker for prostate cancer. Overall, FactorGo improves our biological understanding of shared etiologies across thousands of GWASs.
- Analysis (Corresponding). Novel insight into the etiology of ischemic stroke gained by integrative multiome-wide association study. Junghyun Jung*, Zeyun Lu, Adam Smith, and 1 more author. Human Molecular Genetics 2023
Stroke, characterized by sudden neurological deficits, is the second leading cause of death worldwide. Although genome-wide association studies (GWAS) have successfully identified many genomic regions associated with ischemic stroke (IS), the genes underlying risk and their regulatory mechanisms remain elusive. Here, we integrate a large-scale GWAS (N = 1 296 908) for IS together with molecular QTL data, including mRNA, splicing, enhancer RNA (eRNA), and protein expression data from up to 50 tissues (total N = 11 588). We identify 136 genes/eRNA/proteins associated with IS risk across 60 independent genomic regions and find IS risk is most enriched for eQTLs in arterial and brain-related tissues. Focusing on IS-relevant tissues, we prioritize 9 genes/proteins using probabilistic fine-mapping TWAS analyses. In addition, we discover that blood cell traits, particularly reticulocyte cells, have shared genetic contributions with IS using TWAS-based pheWAS and genetic correlation analysis. Lastly, we integrate our findings with a large-scale pharmacological database and identify a secondary bile acid, deoxycholic acid, as a potential therapeutic component. Our work highlights IS risk genes/splicing-sites/enhancer activity/proteins with their phenotypic consequences using relevant tissues and identifies potential therapeutic candidates for IS.
2022
- Analysis. Exome-wide association study to identify rare variants influencing COVID-19 outcomes: Results from the Host Genetics Initiative. Guillaume Butler-Laporte, Gundula Povysil, Jack A Kosmicki, and 8 more authors. PLoS Genetics 2022
Host genetics is a key determinant of COVID-19 outcomes. Previously, the COVID-19 Host Genetics Initiative genome-wide association study used common variants to identify multiple loci associated with COVID-19 outcomes. However, variants with the largest impact on COVID-19 outcomes are expected to be rare in the population. Hence, studying rare variants may provide additional insights into disease susceptibility and pathogenesis, thereby informing therapeutics development. Here, we combined whole-exome and whole-genome sequencing from 21 cohorts across 12 countries and performed rare variant exome-wide burden analyses for COVID-19 outcomes. In an analysis of 5,085 severe disease cases and 571,737 controls, we observed that carrying a rare deleterious variant in the SARS-CoV-2 sensor toll-like receptor TLR7 (on chromosome X) was associated with a 5.3-fold increase in severe disease (95% CI: 2.75–10.05, p = 5.41 × 10⁻⁷). This association was consistent across sexes. These results further support TLR7 as a genetic determinant of severe disease and suggest that larger studies on rare variants influencing COVID-19 outcomes could provide additional insights.
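As a reading aid for the effect size reported above, the sketch below recovers the point estimate and standard error implied by the confidence interval. It assumes a Wald-type 95% interval that is symmetric on the log scale; the abstract does not say how the interval was constructed, so this is an illustration and not the study's method. The variable names are hypothetical.

```python
import math

# Bounds of the 95% confidence interval quoted in the abstract.
ci_low, ci_high = 2.75, 10.05

log_low, log_high = math.log(ci_low), math.log(ci_high)

# Assumption: the interval is exp(log_estimate +/- 1.96 * se), so the point
# estimate is the geometric mean of the two bounds.
implied_estimate = math.exp((log_low + log_high) / 2)
implied_se = (log_high - log_low) / (2 * 1.96)

print(round(implied_estimate, 2), round(implied_se, 3))
```

Under that assumption the implied estimate is about 5.26, which rounds to the reported 5.3-fold increase.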
- Analysis. A first update on mapping the human genetic architecture of COVID-19. Nature 2022
The COVID-19 pandemic continues to pose a major public health threat, especially in countries with low vaccination rates. To better understand the biological underpinnings of SARS-CoV-2 infection and COVID-19 severity, we formed the COVID-19 Host Genetics Initiative. Here we present a genome-wide association study meta-analysis of up to 125,584 cases and over 2.5 million control individuals across 60 studies from 25 countries, adding 11 genome-wide significant loci compared with those previously identified. Genes at new loci, including SFTPD, MUC5B and ACE2, reveal compelling insights regarding disease susceptibility and severity.
- Analysis. An in-silico approach to studying a very rare neurodegenerative disease using a disease with higher prevalence with shared pathways and genes: Cerebral adrenoleukodystrophy and Alzheimer’s disease. Y Shim, M Shin, Junghyun Jung, and 2 more authors. Frontiers in Molecular Neuroscience 2022
Cerebral adrenoleukodystrophy (cALD) is a rare neurodegenerative disease characterized by inflammatory demyelination in the central nervous system. Another neurodegenerative disease with a high prevalence, Alzheimer’s disease (AD), shares many common features with cALD such as cognitive impairment and the alleviation of symptoms by erucic acid. We investigated cALD and AD in parallel to study the shared pathological pathways between a rare disease and a more common disease. The approach may expand the biological understandings and reveal novel therapeutic targets. Gene set enrichment analysis (GSEA) and weighted gene correlation network analysis (WGCNA) were conducted to identify both the resemblance in gene expression patterns and genes that are pathologically relevant in the two diseases. Within differentially expressed genes (DEGs), GSEA identified 266 common genes with similar up- or down-regulation patterns in cALD and AD. Among the interconnected genes in AD data, two gene sets containing 1,486 genes preserved in cALD data were selected by WGCNA that may significantly affect the development and progression of cALD. WGCNA results filtered by functional correlation via protein–protein interaction analysis overlapping with GSEA revealed four genes (annexin A5, beta-2-microglobulin, CD44 molecule, and fibroblast growth factor 2) that showed robust associations with the pathogeneses of cALD and AD, where they were highly involved in inflammation, apoptosis, and the mitogen-activated protein kinase pathway. This study provided an integrated strategy to provide new insights into a rare disease with scant publicly available data (cALD) using a more prevalent disorder with some pathological association (AD), which suggests novel druggable targets and drug candidates.
- Analysis. Integrative transcriptome-wide analysis of atopic dermatitis for drug repositioning. Jaeseung Song, Daeun Kim, Sora Lee, and 3 more authors. Communications Biology 2022
Atopic dermatitis (AD) is one of the most common inflammatory skin diseases, which significantly impact the quality of life. A transcriptome-wide association study (TWAS) was conducted to estimate both transcriptomic and genomic features of AD and detected significant associations between 31 expression quantitative loci and 25 genes. Our results replicated well-known genetic markers for AD, as well as 4 novel associated genes. Next, a transcriptome meta-analysis was conducted with 5 studies retrieved from public databases and identified 5 additional novel susceptibility genes for AD. Applying the connectivity map to the results from TWAS and meta-analysis, robustly enriched perturbations were identified and their chemical or functional properties were analyzed. Here, we report the first research on integrative approaches for AD, combining TWAS and transcriptome meta-analysis. Together, our findings could provide a comprehensive understanding of the pathophysiologic mechanisms of AD and suggest potential drug candidates as alternative treatment options.
- Analysis. Cytomegalovirus proteins, maternal pregnancy cytokines, and their impact on neonatal immune cytokine profiles and acute lymphoblastic leukemogenesis in children. Joseph L Wiemels, Rong Wang, Mi Zhou, and 8 more authors. Haematologica 2022
Early cytomegalovirus (CMV) infection and altered cytokine profiles at birth are associated with risk of childhood acute lymphoblastic leukemia (ALL). We examined neonatal cytokine levels and CMV proteins in 130 children who contracted ALL later in life and 460 controls. We assessed the immunodominant viral coat protein (pp65) and CMV proteins that manipulate human immune function (CMV-IL-10, CMV-CXCL-1), which were detectable in most neonatal samples and correlated with specific cytokine levels (IL-10, IL-12, TGF-β1, and TNFα). CMV-IL-10 was positively associated with ALL risk. Neonatal cytokines, analyzed as a principal component loaded by IL-10, IL-12, and TNFα levels, were significantly different between cases and controls. Maternal mid-pregnancy cytokine expression was weakly correlated with cytokines at birth but did not differentiate childhood ALL cases and controls. In sum, the data provide preliminary indications that CMV viral activity during pregnancy may influence the neonatal cytokine profiles linked to risk of childhood ALL.
- Analysis. Whole genome sequencing reveals host factors underlying critical Covid-19. Athanasios Kousathanas, Erola Pairo-Castineira, Konrad Rawlik, and 8 more authors. Nature 2022
Critical COVID-19 is caused by immune-mediated inflammatory lung injury. Host genetic variation influences the development of illness requiring critical care or hospitalization after infection with SARS-CoV-2. The GenOMICC (Genetics of Mortality in Critical Care) study enables the comparison of genomes from individuals who are critically ill with those of population controls to find underlying disease mechanisms. Here we use whole-genome sequencing in 7,491 critically ill individuals compared with 48,400 controls to discover and replicate 23 independent variants that significantly predispose to critical COVID-19. We identify 16 new independent associations, including variants within genes that are involved in interferon signalling (IL10RB and PLSCR1), leucocyte differentiation (BCL11A) and blood-type antigen secretor status (FUT2). Using transcriptome-wide association and colocalization to infer the effect of gene expression on disease severity, we find evidence that implicates multiple genes, including reduced expression of a membrane flippase (ATP11A) and increased expression of a mucin (MUC1), in critical disease. Mendelian randomization provides evidence in support of causal roles for myeloid cell adhesion molecules (SELE, ICAM5 and CD209) and the coagulation factor F8, all of which are potentially druggable targets. Our results are broadly consistent with a multi-component model of COVID-19 pathophysiology, in which at least two distinct mechanisms can predispose to life-threatening disease: failure to control viral replication; or an enhanced tendency towards pulmonary inflammation and intravascular coagulation. We show that comparison between cases of critical illness and population controls is highly efficient for the detection of therapeutically relevant mechanisms of disease.
2021
- Analysis. Common, low-frequency, rare, and ultra-rare coding variants contribute to COVID-19 severity. Chiara Fallerini, Nicola Picchiotti, Margherita Baldassarri, and 8 more authors. Human Genetics 2021
The combined impact of common and rare exonic variants in COVID-19 host genetics is currently insufficiently understood. Here, common and rare variants from whole-exome sequencing data of about 4000 SARS-CoV-2-positive individuals were used to define an interpretable machine-learning model for predicting COVID-19 severity. First, variants were converted into separate sets of Boolean features, depending on the absence or the presence of variants in each gene. An ensemble of LASSO logistic regression models was used to identify the most informative Boolean features with respect to the genetic bases of severity. The Boolean features selected by these logistic models were combined into an Integrated PolyGenic Score that offers a synthetic and interpretable index for describing the contribution of host genetics in COVID-19 severity, as demonstrated through testing in several independent cohorts. Selected features belong to ultra-rare, rare, low-frequency, and common variants, including those in linkage disequilibrium with known GWAS loci. Noteworthily, around one quarter of the selected genes are sex-specific. Pathway analysis of the selected genes associated with COVID-19 severity reflected the multi-organ nature of the disease. The proposed model might provide useful information for developing diagnostics and therapeutics, while also being able to guide bedside disease management.
- Analysis. SARS-CoV-2 susceptibility and COVID-19 disease severity are associated with genetic variants affecting gene expression in a variety of tissues. Matteo D’Antonio, Jennifer P Nguyen, Timothy D Arthur, and 9 more authors. Cell Reports 2021
Variability in SARS-CoV-2 susceptibility and COVID-19 disease severity between individuals is partly due to genetic factors. Here, we identify 4 genomic loci with suggestive associations for SARS-CoV-2 susceptibility and 19 for COVID-19 disease severity. Four of these 23 loci likely have an ethnicity-specific component. Genome-wide association study (GWAS) signals in 11 loci colocalize with expression quantitative trait loci (eQTLs) associated with the expression of 20 genes in 62 tissues/cell types (range: 1:43 tissues/gene), including lung, brain, heart, muscle, and skin as well as the digestive system and immune system. We perform genetic fine mapping to compute 99% credible SNP sets, which identify 10 GWAS loci that have eight or fewer SNPs in the credible set, including three loci with one single likely causal SNP. Our study suggests that the diverse symptoms and disease severity of COVID-19 observed between individuals are associated with variants across the genome, affecting gene expression levels in a wide variety of tissue types.
- Method. A Deep Learning Approach with Data Augmentation to Predict Novel Spider Neurotoxic Peptides. Byungjo Lee, Min Kyoung Shin, In-Wook Hwang, and 6 more authors. International Journal of Molecular Sciences 2021
As major components of spider venoms, neurotoxic peptides exhibit structural diversity and target specificity, and have great pharmaceutical potential. Deep learning may be an alternative to the laborious and time-consuming methods for identifying these peptides. However, the major hurdle in developing a deep learning model is the limited data on neurotoxic peptides. Here, we present a peptide data augmentation method that improves the recognition of neurotoxic peptides via a convolutional neural network model. The neurotoxic peptides were augmented with the known neurotoxic peptides from the UniProt database, and the models were trained using a training set with or without the generated sequences to verify the augmented data. The model trained with the augmented dataset outperformed the one with the unaugmented dataset, achieving an accuracy of 0.9953, precision of 0.9922, recall of 0.9984, and F1 score of 0.9953 on the simulation dataset. From the set of all RNA transcripts of the spider Callobius koreanus, we discovered neurotoxic peptides via the model, resulting in 275 putative peptides, of which 252 were novel sequences and only 23 showed homology with known peptides according to the Basic Local Alignment Search Tool. Among these 275 peptides, four were selected and shown to have neuromodulatory effects on the human neuroblastoma cell line SH-SY5Y. The augmentation method presented here may be applied to the identification of other functional peptides from biological resources with insufficient data.
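The F1 score quoted above is the harmonic mean of precision and recall, so it can be recomputed from the other two reported metrics. The short sketch below does that with the abstract's own numbers; the helper name `f1_score` is chosen here for illustration and is not taken from the paper's code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
f1 = f1_score(0.9922, 0.9984)
print(round(f1, 4))  # 0.9953
```

The recomputed value agrees with the reported F1 score of 0.9953 to four decimal places.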
- Analysis. Rare loss-of-function variants in type I IFN immunity genes are not associated with severe COVID-19. Gundula Povysil, Guillaume Butler-Laporte, Ning Shang, and 8 more authors. The Journal of Clinical Investigation 2021
A recent report found that rare predicted loss-of-function (pLOF) variants across 13 candidate genes in TLR3- and IRF7-dependent type I IFN pathways explain up to 3.5% of severe COVID-19 cases. We performed whole-exome or whole-genome sequencing of 1,864 COVID-19 cases (713 with severe and 1,151 with mild disease) and 15,033 ancestry-matched population controls across 4 independent COVID-19 biobanks. We tested whether rare pLOF variants in these 13 genes were associated with severe COVID-19. We identified only 1 rare pLOF mutation across these genes among 713 cases with severe COVID-19 and observed no enrichment of pLOFs in severe cases compared to population controls or mild COVID-19 cases. We found no evidence of association of rare LOF variants in the 13 candidate genes with severe COVID-19 outcomes.
- Method (First). MARS: leveraging allelic heterogeneity to increase power of association testing. Farhad Hormozdiari†, Junghyun Jung†, Eleazar Eskin, and 1 more author. Genome Biology Apr 2021
In standard genome-wide association studies (GWAS), the standard association test is underpowered to detect associations between loci with multiple causal variants with small effect sizes. We propose a statistical method, Model-based Association test Reflecting causal Status (MARS), that finds associations between variants in risk loci and a phenotype, considering the causal status of variants, only requiring the existing summary statistics to detect associated risk loci. Utilizing extensive simulated data and real data, we show that MARS increases the power of detecting true associated risk loci compared to previous approaches that consider multiple variants, while controlling the type I error.
- Analysis. An Integrative Transcriptomic Analysis of Systemic Juvenile Idiopathic Arthritis for Identifying Potential Genetic Markers and Drug Candidates. Daeun Kim, Jaeseung Song, Sora Lee, and 2 more authors. International Journal of Molecular Sciences Apr 2021
Systemic juvenile idiopathic arthritis (sJIA) is a rare subtype of juvenile idiopathic arthritis, whose clinical features are systemic fever and rash accompanied by painful joints and inflammation. Even though sJIA has been reported to be an autoinflammatory disorder, its exact pathogenesis remains unclear. In this study, we integrated a meta-analysis with a weighted gene co-expression network analysis (WGCNA) using 5 microarray datasets and an RNA sequencing dataset to understand the interconnection of susceptibility genes for sJIA. Using the integrative analysis, we identified a robust sJIA signature that consisted of 2 co-expressed gene sets comprising 103 up-regulated genes and 25 down-regulated genes in sJIA patients compared with healthy controls. Among the 128 sJIA signature genes, we identified an up-regulated cluster of 11 genes and a down-regulated cluster of 4 genes, which may play key roles in the pathogenesis of sJIA. We then detected 10 bioactive molecules targeting the significant gene clusters as potential novel drug candidates for sJIA using an in silico drug repositioning analysis. These findings suggest that the gene clusters may be potential genetic markers of sJIA and 10 drug candidates can contribute to the development of new therapeutic options for sJIA.
2020
- Method. Fully automated web-based tool for identifying regulatory hotspots. Ju Hun Choi, Taegun Kim, Junghyun Jung, and 1 more author. BMC Genomics Apr 2020
Regulatory hotspots are genetic variations that may regulate the expression levels of many genes. It has been of great interest to find those hotspots utilizing expression quantitative trait locus (eQTL) analysis. However, it has been reported that many of the findings are spurious hotspots induced by various unknown confounding factors. Recently, methods utilizing complicated statistical models have been developed that successfully identify genuine hotspots. Next-generation Intersample Correlation Emended (NICE) is one of the methods that show high sensitivity and low false-discovery rate in finding regulatory hotspots. Even though the methods successfully find genuine hotspots, they have not been widely used due to their non-user-friendly interfaces and complex running processes. Furthermore, most of the methods are impractical due to their prohibitively high computational complexity.
- Method. A Fully Automated Parallel-Processing R Package for High-Dimensional Multiple-Phenotype Analysis Considering Population Structure. Gi Ju Lee, Sung Min Park, Junghyun Jung, and 1 more author. International Journal of Fuzzy Logic and Intelligent Systems Apr 2020
A typical genome-wide association study is conducted through a single-phenotype analysis of the correlation between each phenotype and genotype one at a time. Alternatively, a multiple-phenotype analysis of the correlation between multiple phenotypes and a genotype often has many advantages over single-phenotype analysis. For example, statistical power in the association test may be increased in a multiple-phenotype analysis and thus may detect small effects that cannot be identified in a single-phenotype analysis. Of the several multiple-phenotype analytical methods that have been proposed, generalized analysis of molecular variance for mixed-model analysis (GAMMA) is used to analyze many phenotypes simultaneously while considering the population structure. This method shows higher accuracy than the other methods. However, GAMMA has not been widely used because no automated and user-friendly software is available; this is also the case with most other multiple-phenotype analysis methods. In addition, the lack of a parallel-processing option, which is essential in a genome-wide-association-studies analysis, is also prevalent in GAMMA. In this study, we propose an easy-to-use R package for GAMMA called GAMMA Renew (GAMMAR) that performs multiple-phenotype analysis using parallel processing. We evaluate GAMMAR using a recently published yeast dataset to locate trans-regulatory hotspots.
2019
- Analysis. Meta-analysis of polymyositis and dermatomyositis microarray data reveals novel genetic biomarkers. Jaeseung Song, Daeun Kim, Juyeon Hong, and 6 more authors. Genes Apr 2019
Polymyositis (PM) and dermatomyositis (DM) are both classified as idiopathic inflammatory myopathies. They share a few common characteristics such as inflammation and muscle weakness. Previous studies have indicated that these diseases present aspects of an auto-immune disorder; however, their exact pathogenesis is still unclear. In this study, three gene expression datasets (PM: 7, DM: 50, Control: 13) available in public databases were used to conduct meta-analysis. We then conducted expression quantitative trait loci analysis to detect the variant sites that may contribute to the pathogenesis of PM and DM. Six-hundred differentially expressed genes were identified in the meta-analysis (false discovery rate (FDR) < 0.01), among which 317 genes were up-regulated and 283 were down-regulated in the disease group compared with those in the healthy control group. The up-regulated genes were significantly enriched in interferon-signaling pathways in protein secretion, and/or in unfolded-protein response. We detected 10 single nucleotide polymorphisms (SNPs) which could potentially play key roles in driving the PM and DM. Along with previously reported genes, we identified 4 novel genes and 10 SNP-variant regions which could be used as candidates for potential drug targets or biomarkers for PM and DM.
- Analysis (First). Integrative genomic and transcriptomic analysis of genetic markers in Dupuytren’s disease. Junghyun Jung, Go Woon Kim, Byungjo Lee, and 2 more authors. BMC Medical Genomics Apr 2019
In this study, we identified a co-expressed gene set (DD signature) consisting of 753 genes via weighted gene co-expression network analysis. To confirm the robustness of the DD signature, module enrichment analysis and meta-analysis were performed. Moreover, this signature effectively classified DD disease samples. The DD signature was significantly enriched in the unfolded protein response (UPR) related to endoplasmic reticulum (ER) stress. Next, we conducted multiple-phenotype regression analysis to identify trans-regulatory hotspots regulating expression levels of the DD signature using Genotype-Tissue Expression data. Finally, 10 trans-regulatory hotspots and 16 eGenes that are significantly associated with at least one cis-eQTL were identified.
- Analysis. Molecular origin of AuNPs-induced cytotoxicity and mechanistic study. Euiyeon Lee, Hyunjin Jeon, Minhyeong Lee, and 5 more authors. Scientific Reports Apr 2019
Gold nanoparticles (AuNPs) with diverse physicochemical properties are reported to affect biological systems differently, but the relationship between the physicochemical properties of AuNPs and their biological effects is not clearly understood. Here, we aimed to elucidate the molecular origins of AuNP-induced cytotoxicity and their mechanisms, focusing on the surface charge and structural properties of modified AuNPs. We prepared a library of well-tailored AuNPs exhibiting various functional groups and surface charges. Through this work, we revealed that the direction or the magnitude of surface charge is not an exclusive factor that determines the cytotoxicity of AuNPs. We, instead, suggested that toxic AuNPs share a common structural characteristic of a hydrophobic moiety neighbouring the positive charge, which can induce lytic interaction with the plasma membrane. Mechanistic study showed that the toxic AuNPs interfered with the formation of cytoskeletal structure to slow cell migration, inhibited DNA replication and caused DNA damage via oxidative stress to hinder cell proliferation. Gene expression analysis showed that the toxic AuNPs down-regulated genes associated with cell cycle processes. We discovered structural characteristics that define the cytotoxic AuNPs and suggested the mechanisms of their cytotoxicity. These findings will help us to understand and to predict the biological effects of modified AuNPs based on their physicochemical properties.
2018
- Analysis (First): Meta- and cross-species analyses of insulin resistance based on gene expression datasets in human white adipose tissues. Junghyun Jung, Go Woon Kim, Woosuk Lee, and 3 more authors. Scientific Reports, Apr 2018
Ample evidence indicates that insulin resistance (IR) is closely related to white adipose tissue (WAT), but the underlying mechanisms of IR pathogenesis are still unclear. Using 352 microarray datasets from seven independent studies, we identified a meta-signature comprising 1,413 genes. Our meta-signature was also enriched in overall WAT in in vitro and in vivo IR models. Only 12 core enrichment genes were consistently enriched across all IR models. Among the meta-signature, we identified a drug signature made up of 211 genes with expression levels that were co-regulated by thiazolidinediones and metformin using cross-species analysis. To confirm the clinical relevance of our drug signature, we found that the expression levels of 195 genes in the drug signature were significantly correlated with both homeostasis model assessment 2-IR score and body mass index. Finally, 18 genes from the drug signature were identified by protein-protein interaction network clustering. Four core enrichment genes were included among these 18 genes, and the expression levels of 8 selected genes were validated by quantitative PCR. These findings suggest that our signatures provide a robust set of genetic markers which can be used to provide a starting point for developing potential therapeutic targets in improving IR in WAT.
2017
- Analysis: Modelling APOE ɛ3/4 allele-associated sporadic Alzheimer’s disease in an induced neuron. Hongwon Kim, Junsang Yoo, Jaein Shin, and 8 more authors. Brain, Apr 2017
The recent generation of induced neurons by direct lineage conversion holds promise for in vitro modelling of sporadic Alzheimer’s disease. Here, we report the generation of an induced neuron-based model of sporadic Alzheimer’s disease in mice and humans, and use this system to explore the pathogenic mechanisms resulting from the sporadic Alzheimer’s disease risk factor apolipoprotein E (APOE) ɛ3/4 allele. We show that mouse and human induced neurons overexpressing mutant amyloid precursor protein in the background of the APOE ɛ3/4 allele exhibit altered amyloid precursor protein (APP) processing, abnormally increased production of amyloid-β42 and hyperphosphorylation of tau. Importantly, we demonstrate that APOE ɛ3/4 patient induced neuron culture models can faithfully recapitulate molecular signatures seen in APOE ɛ3/4-associated sporadic Alzheimer’s disease patients. Moreover, analysis of the gene network derived from APOE ɛ3/4 patient induced neurons reveals a strong interaction between APOE ɛ3/4 and another Alzheimer’s disease risk factor, desmoglein 2 (DSG2). Knockdown of DSG2 in APOE ɛ3/4 induced neurons effectively rescued defective APP processing, demonstrating the functional importance of this interaction. These data provide a direct connection between APOE ɛ3/4 and another Alzheimer’s disease susceptibility gene and demonstrate in proof of principle the utility of induced neuron-based modelling of Alzheimer’s disease for therapeutic discovery.
- Analysis: Electromagnetized gold nanoparticles mediate direct lineage reprogramming into induced dopamine neurons in vivo for Parkinson’s disease therapy. Junsang Yoo, Euiyeon Lee, Hee Young Kim, and 8 more authors. Nature Nanotechnology, Apr 2017
Electromagnetic fields (EMF) are physical energy fields generated by electrically charged objects, and specific ranges of EMF can influence numerous biological processes, which include the control of cell fate and plasticity. In this study, we show that electromagnetized gold nanoparticles (AuNPs) in the presence of specific EMF conditions facilitate an efficient direct lineage reprogramming to induced dopamine neurons in vitro and in vivo. Remarkably, electromagnetic stimulation leads to a specific activation of the histone acetyltransferase Brd2, which results in histone H3K27 acetylation and a robust activation of neuron-specific genes. In vivo dopaminergic neuron reprogramming by EMF stimulation of AuNPs efficiently and non-invasively alleviated symptoms in mouse Parkinson’s disease models. This study provides a proof of principle for EMF-based in vivo lineage conversion as a potentially viable and safe therapeutic strategy for the treatment of neurodegenerative disorders.
- Analysis (First): Meta-analysis of microarray datasets for the risk assessment of coplanar polychlorinated biphenyl 77 (PCB77) on human health. Junghyun Jung, Kyoungyoung Hah, Woosuk Lee, and 1 more author. Toxicology and Environmental Health Sciences, Apr 2017
Polychlorinated biphenyls (PCBs) are persistent organic compounds that have been banned since the 1970s, but continue to contaminate the environment. PCBs are categorized into two structural groups: coplanar and non-coplanar PCBs. The coplanar PCBs are dioxin-like potent toxic compounds. To evaluate their effects on humans, we chose the coplanar PCB77 for data analysis. We performed meta-analysis by integrating datasets via the Rank Product method, and identified 375 up- and 66 down-regulated differentially expressed genes (DEGs). Notably, up-regulated genes were significantly associated with liver and kidney diseases. Using gene ontology enrichment, we found that the up-regulated DEGs were significantly enriched in the apoptotic process (false discovery rate, FDR=1.62e-10) and response to unfolded protein (FDR=7.65e-10). Protein-protein interaction networks identified the hub proteins containing HSP90AB1 and HSPA5. These findings suggest that our DEGs may provide a robust set of genetic markers for PCB77.
- Analysis (First): Meta-analysis of microarray and RNA-Seq gene expression datasets for carcinogenic risk: An assessment of Bisphenol A. Junghyun Jung, Changsoo Mok, Woosuk Lee, and 1 more author. Molecular & Cellular Toxicology, Apr 2017
Bisphenol A (BPA) is an endocrine-disrupting chemical that is related to many diseases, including heart attacks and diabetes. Recently, several studies have reported the carcinogenic potential of BPA in rodents, yet the carcinogenic effects of BPA in humans remain unclear. In this study, meta-analysis was applied to independent GEO datasets, based on 158 Affymetrix microarrays and 8 Illumina RNA-Seqs. Additionally, we performed functional enrichment analysis, disease similarity analysis based on Disease Ontology (DO) analysis, and network analysis. 1,993 (1,457 up-, 536 down-regulated) differentially expressed genes (DEGs) were identified from five GEO datasets by adjusting for batch effects. Using disease similarity analysis, we demonstrated that results of DO analysis of the top 20 diseases were highly related to breast cancer. Moreover, we showed that the DEGs were significantly enriched in gene expression datasets on human breast cancer tissue via gene set enrichment analysis. By performing network analysis, we finally identified 85 (68 up- and 17 down-regulated) DEGs, and some of their expression levels were validated by quantitative PCR. The identified DEGs were regarded as genetic markers for carcinogenic risks, indicating that BPA may be a potential carcinogenic chemical contributing to the cause of breast cancer in humans.
- Method: Inhibitory effects of novel SphK2 inhibitors on migration of cancer cells. Euiyeon Lee, Junghyun Jung, Deokho Jung, and 5 more authors. Anti-Cancer Agents in Medicinal Chemistry (Formerly Current Medicinal Chemistry-Anti-Cancer Agents), Apr 2017
Background: Cell migration is an essential process for survival and differentiation of mammalian cells. Numerous diseases are induced or influenced by inappropriate regulation of cell migration, which plays a key role in cancer cell metastasis. In fact, very few anti-metastasis drugs are available on the market. SphKs are enzymes that convert sphingosine to sphingosine-1-phosphate (S1P) and are known to control various cellular functions, including migration of cells. In humans, SphK2 is known to promote apoptosis, suppress cell growth, and control cell migration; in addition, the specific ablation of SphK2 activity was reported to inhibit cancer cell metastasis. Objective: The previously identified SG12 and SG14 are synthetic analogs of sphingoid and can specifically inhibit the functions of SphK2. We investigated the effects of the SphK2-specific inhibitors on the migratory behavior of cells. Method: We investigated how SG12 and SG14 affect cell migration by monitoring both cumulative and individual cell migration behavior using HeLa cells. Results: SG12 and SG14 both showed stronger inhibitory effects with less cytotoxicity compared with a general SphK inhibitor, N,N-dimethylsphingosine (DMS). The mechanistic aspects of specific SphK2 inhibition were studied by examining actin filamentation and the expression levels of motility-related genes. Conclusion: The data revealed that SG12 and SG14 resemble DMS in decreasing overall cell motility, but differ in that they differentially affect motility parameters and motility-related signal transduction pathways and therefore actin polymerization, which are not altered by DMS. Our findings show that SphK2 inhibitors are putative candidates for anti-metastatic drugs.
2016
- Analysis: Galectin-3 supports stemness in ovarian cancer stem cells by activation of the Notch1 intracellular domain. Hyeok Gu Kang, Da-Hyun Kim, Seok-Jun Kim, and 4 more authors. Oncotarget, Apr 2016
Ovarian cancer is the most lethal gynecologic disease because it is usually detected late, easily acquires chemoresistance, and has a high recurrence rate. Recent studies suggest that ovarian cancer stem cells (CSCs) are involved in these malignancies. Here, we demonstrated that galectin-3 maintains ovarian CSCs by activating the Notch1 intracellular domain (NICD1). The number and size of ovarian CSCs decreased in the absence of galectin-3, and overexpression of galectin-3 increased them. Overexpression of galectin-3 increased the resistance to cisplatin- and paclitaxel-induced cell death. Silencing of galectin-3 decreased the migration and invasion of ovarian cancer cells, and overexpression of galectin-3 reversed these effects. The Notch signaling pathway was strongly activated by galectin-3 overexpression in A2780 cells. Silencing of galectin-3 reduced the levels of cleaved NICD1 and expression of the Notch target genes, Hes1 and Hey1. Overexpression of galectin-3 induced NICD1 cleavage and increased expression of Hes1 and Hey1. Moreover, overexpression of galectin-3 increased the nuclear translocation of NICD1. Interestingly, the carbohydrate recognition domain of galectin-3 interacted with NICD1. Overexpression of galectin-3 increased tumor burden in A2780 ovarian cancer xenografted mice. Increased expression of galectin-3 was detected in advanced stages, compared to stage 1 or 2 in ovarian cancer patients, suggesting that galectin-3 supports stemness of these cells. Based on these results, we suggest that targeting galectin-3 may be a potent approach for improving ovarian cancer therapy.