Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Lazar C, Meganck S, Taminau J, Steenhoff D, Coletta A, Molter C, Weiss-Solís DY, Duque R, Bersini H, Nowé A. Batch effect removal methods for microarray gene expression data integration: a survey. Brief Bioinform 2012;14:469-90. [PMID: 22851511 DOI: 10.1093/bib/bbs037] [Citation(s) in RCA: 210] [Impact Index Per Article: 17.5] [Indexed: 12/31/2022] Open

Cited by Other Article(s)

Yu Y, Mai Y, Zheng Y, Shi L. Assessing and mitigating batch effects in large-scale omics studies. Genome Biol 2024;25:254. [PMID: 39363244 PMCID: PMC11447944 DOI: 10.1186/s13059-024-03401-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 09/23/2024] [Indexed: 10/05/2024] Open

Hoang N, Sardaripour N, Ramey GD, Schilling K, Liao E, Chen Y, Park JH, Bledsoe X, Landman BA, Gamazon ER, Benton ML, Capra JA, Rubinov M. Integration of estimated regional gene expression with neuroimaging and clinical phenotypes at biobank scale. PLoS Biol 2024;22:e3002782. [PMID: 39269986 PMCID: PMC11424006 DOI: 10.1371/journal.pbio.3002782] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Revised: 09/25/2024] [Accepted: 08/01/2024] [Indexed: 09/15/2024] Open

Abstract

An understanding of human brain individuality requires the integration of data on brain organization across people and brain regions, molecular and systems scales, as well as healthy and clinical states. Here, we help advance this understanding by leveraging methods from computational genomics to integrate large-scale genomic, transcriptomic, neuroimaging, and electronic-health record data sets. We estimated genetically regulated gene expression (gr-expression) of 18,647 genes, across 10 cortical and subcortical regions of 45,549 people from the UK Biobank. First, we showed that patterns of estimated gr-expression reflect known genetic-ancestry relationships, regional identities, as well as inter-regional correlation structure of directly assayed gene expression. Second, we performed transcriptome-wide association studies (TWAS) to discover 1,065 associations between individual variation in gr-expression and gray-matter volumes across people and brain regions. We benchmarked these associations against results from genome-wide association studies (GWAS) of the same sample and found hundreds of novel associations relative to these GWAS. Third, we integrated our results with clinical associations of gr-expression from the Vanderbilt Biobank. This integration allowed us to link genes, via gr-expression, to neuroimaging and clinical phenotypes. Fourth, we identified associations of polygenic gr-expression with structural and functional MRI phenotypes in the Human Connectome Project (HCP), a small neuroimaging-genomic data set with high-quality functional imaging data. Finally, we showed that estimates of gr-expression and magnitudes of TWAS were generally replicable and that the p-values of TWAS were replicable in large samples. Collectively, our results provide a powerful new resource for integrating gr-expression with population genetics of brain organization and disease.

Affiliation(s)

Nhung Hoang Department of Computer Science, Vanderbilt University, Nashville, Tennessee, United States of America
Neda Sardaripour Department of Biomedical Engineering, Vanderbilt University, Nashville, Tennessee, United States of America
Grace D. Ramey Biological and Medical Informatics Division, University of California, San Francisco, California, United States of America; Department of Epidemiology and Biostatistics, University of California, San Francisco, California, United States of America
Kurt Schilling Department of Electrical and Computer Engineering, Vanderbilt University, Nashville, Tennessee, United States of America; Department of Radiology and Radiological Sciences, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
Emily Liao Department of Biomedical Engineering, Vanderbilt University, Nashville, Tennessee, United States of America
Yiting Chen Department of Biomedical Engineering, Vanderbilt University, Nashville, Tennessee, United States of America
Jee Hyun Park Department of Biomedical Engineering, Vanderbilt University, Nashville, Tennessee, United States of America
Xavier Bledsoe Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
Bennett A. Landman Department of Computer Science, Vanderbilt University, Nashville, Tennessee, United States of America; Department of Biomedical Engineering, Vanderbilt University, Nashville, Tennessee, United States of America; Department of Electrical and Computer Engineering, Vanderbilt University, Nashville, Tennessee, United States of America; Department of Radiology and Radiological Sciences, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
Eric R. Gamazon Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
Mary Lauren Benton Department of Computer Science, Baylor University, Waco, Texas, United States of America
John A. Capra Department of Computer Science, Vanderbilt University, Nashville, Tennessee, United States of America; Department of Epidemiology and Biostatistics, University of California, San Francisco, California, United States of America; Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America; Bakar Computational Health Sciences Institute, University of California, San Francisco, California, United States of America
Mikail Rubinov Department of Computer Science, Vanderbilt University, Nashville, Tennessee, United States of America; Department of Biomedical Engineering, Vanderbilt University, Nashville, Tennessee, United States of America; Department of Psychology, Vanderbilt University, Nashville, Tennessee, United States of America; Howard Hughes Medical Institute Janelia Research Campus, Ashburn, Virginia, United States of America

Takahashi M, Chong HB, Zhang S, Yang TY, Lazarov MJ, Harry S, Maynard M, Hilbert B, White RD, Murrey HE, Tsou CC, Vordermark K, Assaad J, Gohar M, Dürr BR, Richter M, Patel H, Kryukov G, Brooijmans N, Alghali ASO, Rubio K, Villanueva A, Zhang J, Ge M, Makram F, Griesshaber H, Harrison D, Koglin AS, Ojeda S, Karakyriakou B, Healy A, Popoola G, Rachmin I, Khandelwal N, Neil JR, Tien PC, Chen N, Hosp T, van den Ouweland S, Hara T, Bussema L, Dong R, Shi L, Rasmussen MQ, Domingues AC, Lawless A, Fang J, Yoda S, Nguyen LP, Reeves SM, Wakefield FN, Acker A, Clark SE, Dubash T, Kastanos J, Oh E, Fisher DE, Maheswaran S, Haber DA, Boland GM, Sade-Feldman M, Jenkins RW, Hata AN, Bardeesy NM, Suvà ML, Martin BR, Liau BB, Ott CJ, Rivera MN, Lawrence MS, Bar-Peled L. DrugMap: A quantitative pan-cancer analysis of cysteine ligandability. Cell 2024;187:2536-2556.e30. [PMID: 38653237 PMCID: PMC11143475 DOI: 10.1016/j.cell.2024.03.027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Revised: 01/15/2024] [Accepted: 03/19/2024] [Indexed: 04/25/2024]

Affiliation(s)

Mariko Takahashi Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA.
Harrison B Chong Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Siwen Zhang Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Tzu-Yi Yang Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Matthew J Lazarov Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Stefan Harry Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138, USA
Michelle Maynard Scorpion Therapeutics, Boston, MA 02110, USA
Brendan Hilbert Scorpion Therapeutics, Boston, MA 02110, USA
Ryan D White Scorpion Therapeutics, Boston, MA 02110, USA
Heather E Murrey Scorpion Therapeutics, Boston, MA 02110, USA
Chih-Chiang Tsou Scorpion Therapeutics, Boston, MA 02110, USA
Kira Vordermark Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Jonathan Assaad Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Magdy Gohar Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Benedikt R Dürr Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Marianne Richter Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Himani Patel Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Gregory Kryukov Scorpion Therapeutics, Boston, MA 02110, USA
Natasja Brooijmans Scorpion Therapeutics, Boston, MA 02110, USA
Aliyu Sidi Omar Alghali Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA
Karla Rubio Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA
Antonio Villanueva Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA
Junbing Zhang Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Maolin Ge Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Farah Makram Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Hanna Griesshaber Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Drew Harrison Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Ann-Sophie Koglin Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Samuel Ojeda Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Barbara Karakyriakou Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Alexander Healy Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
George Popoola Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Inbal Rachmin Cutaneous Biology Research Center, Massachusetts General Hospital, Boston, MA 02114, USA
Neha Khandelwal Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Jason R Neil Scorpion Therapeutics, Boston, MA 02110, USA
Pei-Chieh Tien Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Nicholas Chen Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Pathology, Harvard Medical School, Boston, MA 02114, USA
Tobias Hosp Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Sanne van den Ouweland Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Toshiro Hara Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA
Lillian Bussema Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA
Rui Dong Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA
Lei Shi Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Martin Q Rasmussen Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Ana Carolina Domingues Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Aleigha Lawless Department of Surgery, Massachusetts General Hospital, Boston, MA 02114, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Jacy Fang Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Satoshi Yoda Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Linh Phuong Nguyen Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Sarah Marie Reeves Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Farrah Nicole Wakefield Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Adam Acker Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Sarah Elizabeth Clark Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Taronish Dubash Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
John Kastanos Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
Eugene Oh Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
David E Fisher Cutaneous Biology Research Center, Massachusetts General Hospital, Boston, MA 02114, USA
Shyamala Maheswaran Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
Daniel A Haber Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Medicine, Harvard Medical School, Boston, MA 02114, USA; Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
Genevieve M Boland Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Surgery, Massachusetts General Hospital, Boston, MA 02114, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Surgery, Harvard Medical School, Boston, MA 02114, USA
Moshe Sade-Feldman Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
Russell W Jenkins Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
Aaron N Hata Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
Nabeel M Bardeesy Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
Mario L Suvà Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Pathology, Harvard Medical School, Boston, MA 02114, USA
Brent R Martin Scorpion Therapeutics, Boston, MA 02110, USA
Brian B Liau Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138, USA
Christopher J Ott Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
Miguel N Rivera Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Pathology, Harvard Medical School, Boston, MA 02114, USA
Michael S Lawrence Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Pathology, Harvard Medical School, Boston, MA 02114, USA.
Liron Bar-Peled Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Medicine, Harvard Medical School, Boston, MA 02114, USA.

Ma C, Zhang Y, Ding R, Chen H, Wu X, Xu L, Yu C. In search of the ratio of miRNA expression as robust biomarkers for constructing stable diagnostic models among multi-center data. Front Genet 2024;15:1381917. [PMID: 38746057 PMCID: PMC11091382 DOI: 10.3389/fgene.2024.1381917] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2024] [Accepted: 04/10/2024] [Indexed: 05/16/2024] Open

Wu P, Li D, Zhang C, Dai B, Tang X, Liu J, Wu Y, Wang X, Shen A, Zhao J, Zi X, Li R, Sun N, He J. A unique circulating microRNA pairs signature serves as a superior tool for early diagnosis of pan-cancer. Cancer Lett 2024;588:216655. [PMID: 38460724 DOI: 10.1016/j.canlet.2024.216655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2023] [Revised: 11/18/2023] [Accepted: 01/16/2024] [Indexed: 03/11/2024]

Affiliation(s)

Peng Wu Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Dongyu Li Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China; 4+4 Medical Doctor Program, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Chaoqi Zhang Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Bing Dai School of Software, Tsinghua University, Beijing, 100084, China
Xiaoya Tang Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Jingjing Liu Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Yue Wu Department of Clinical Laboratory, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Xingwu Wang Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Ao Shen Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Jiapeng Zhao 4+4 Medical Doctor Program, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Xiaohui Zi Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Ruirui Li Department of Pathology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Nan Sun Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China.
Jie He Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China.

Khoa LTP, Yang W, Shan M, Zhang L, Mao F, Zhou B, Li Q, Malcore R, Harris C, Zhao L, Rao RC, Iwase S, Kalantry S, Bielas SL, Lyssiotis CA, Dou Y. Quiescence enables unrestricted cell fate in naive embryonic stem cells. Nat Commun 2024;15:1721. [PMID: 38409226 PMCID: PMC10897426 DOI: 10.1038/s41467-024-46121-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Accepted: 02/14/2024] [Indexed: 02/28/2024] Open

Affiliation(s)

Le Tran Phuc Khoa Department of Medicine, Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA, 90033, USA; Department of Molecular and Integrative Physiology, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
Wentao Yang Department of Medicine, Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA, 90033, USA
Mengrou Shan Department of Molecular and Integrative Physiology, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
Li Zhang Department of Molecular and Integrative Physiology, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
Fengbiao Mao Institute of Medical Innovation and Research, Peking University Third Hospital, Beijing, China
Bo Zhou Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
Qiang Li Department of Ophthalmology & Visual Sciences, W.K. Kellogg Eye Center, University of Michigan, 1000 Wall St., Ann Arbor, MI, 48105, USA
Rebecca Malcore Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
Clair Harris Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
Lili Zhao Beaumont Hospital, Wayne, 33155 Annapolis St., Wayne, MI, 48184, USA
Rajesh C Rao Department of Ophthalmology & Visual Sciences, W.K. Kellogg Eye Center, University of Michigan, 1000 Wall St., Ann Arbor, MI, 48105, USA
Shigeki Iwase Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
Sundeep Kalantry Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
Stephanie L Bielas Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
Costas A Lyssiotis Department of Molecular and Integrative Physiology, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
Yali Dou Department of Medicine, Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA, 90033, USA.

Diao Y, Zhao Y, Li X, Li B, Huo R, Han X. A simplified machine learning model utilizing platelet-related genes for predicting poor prognosis in sepsis. Front Immunol 2023;14:1286203. [PMID: 38054005 PMCID: PMC10694245 DOI: 10.3389/fimmu.2023.1286203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Accepted: 11/03/2023] [Indexed: 12/07/2023] Open

Abstract

Background

Thrombocytopenia is a known prognostic factor in sepsis, yet the relationship between platelet-related genes and sepsis outcomes remains elusive. We developed a machine learning (ML) model based on platelet-related genes to predict poor prognosis in sepsis. The model underwent rigorous evaluation on six diverse platforms, ensuring reliable and versatile findings.

Methods

A retrospective analysis of platelet data from 365 sepsis patients confirmed the predictive role of platelet count in prognosis. We employed Cox analysis, Least Absolute Shrinkage and Selection Operator (LASSO) and Support Vector Machine (SVM) techniques to identify platelet-related genes from the GSE65682 dataset. Subsequently, these genes were trained and validated on six distinct platforms comprising 719 patients, and compared against the Acute Physiology and Chronic Health Evaluation II (APACHE II) and Sequential Organ Failure Assessment (SOFA) scores.

Results

A PLT count <100×10^9/L independently increased the risk of death in sepsis patients (OR = 2.523; 95% CI: 1.084-5.872). The ML model, based on five platelet-related genes, demonstrated impressive area under the curve (AUC) values ranging from 0.5 to 0.795 across various validation platforms. On the GPL6947 platform, our ML model outperformed the APACHE II score with an AUC of 0.795 compared to 0.761. Additionally, by incorporating age, the model's performance was further improved to an AUC of 0.812. On the GPL4133 platform, the initial AUC of the machine learning model based on five platelet-related genes was 0.5. However, after including age, the AUC increased to 0.583. In comparison, the AUC of the APACHE II score was 0.604, and the AUC of the SOFA score was 0.542.

Conclusion

Our findings highlight the broad applicability of this ML model, based on platelet-related genes, in facilitating early treatment decisions for sepsis patients with poor outcomes. Our study paves the way for advancements in personalized medicine and improved patient care.
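For readers comparing the AUC figures quoted in this abstract, the following is a minimal Python sketch of how an AUC can be computed from risk scores. It is not the authors' pipeline: the function name `auc` and the example scores are hypothetical and invented purely for illustration. The sketch uses the pairwise interpretation of AUC, that is, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. Under this definition, a value of 0.5 corresponds to chance-level discrimination.

```python
from itertools import product


def auc(scores_pos, scores_neg):
    """Pairwise AUC: fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties contribute 0.5."""
    if not scores_pos or not scores_neg:
        raise ValueError("need at least one case in each class")
    wins = 0.0
    for p, n in product(scores_pos, scores_neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical risk scores (not data from the study).
non_survivors = [0.9, 0.8, 0.6, 0.4]
survivors = [0.7, 0.5, 0.3, 0.2]
print(auc(non_survivors, survivors))  # 13 of 16 pairs are ordered correctly
```

The quadratic loop over all pairs is chosen for readability; for large cohorts a rank-based computation would be the usual choice.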

Abdallah N, Marion JM, Tauber C, Carlier T, Hatt M, Chauvet P. Enhancing histopathological image classification of invasive ductal carcinoma using hybrid harmonization techniques. Sci Rep 2023;13:20014. [PMID: 37973797 PMCID: PMC10654662 DOI: 10.1038/s41598-023-46239-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Accepted: 10/30/2023] [Indexed: 11/19/2023] Open

Maselli F, D’Antona S, Utichi M, Arnaudi M, Castiglioni I, Porro D, Papaleo E, Gandellini P, Cava C. Computational analysis of five neurodegenerative diseases reveals shared and specific genetic loci. Comput Struct Biotechnol J 2023;21:5395-5407. [PMID: 38022694 PMCID: PMC10651457 DOI: 10.1016/j.csbj.2023.10.031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 10/09/2023] [Accepted: 10/16/2023] [Indexed: 12/01/2023] Open

He J, Yang H, Liu Z, Chen M, Ye Y, Tao Y, Li S, Fang J, Xu J, Wu X, Qi H. Elevated expression of glycolytic genes as a prominent feature of early-onset preeclampsia: insights from integrative transcriptomic analysis. Front Mol Biosci 2023;10:1248771. [PMID: 37818100 PMCID: PMC10561389 DOI: 10.3389/fmolb.2023.1248771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 09/08/2023] [Indexed: 10/12/2023] Open

Abstract

Introduction: Preeclampsia (PE), a notable pregnancy-related disorder, leads to 40,000+ maternal deaths yearly. Recent research shows PE divides into early-onset (EOPE) and late-onset (LOPE) subtypes, each with distinct clinical features and outcomes. However, the molecular characteristics of various subtypes are currently subject to debate and are not consistent. Methods: We integrated transcriptomic expression data from a total of 372 placental samples across 8 publicly available databases via the ComBat algorithm. Then, a variety of strategies including Random Forest Recursive Feature Elimination (RF-RFE), differential analysis, oposSOM, and Weighted Correlation Network Analysis were employed to identify the characteristic genes of the EOPE and LOPE subtypes. Finally, we conducted in vitro experiments on the key gene HK2 in HTR8/SVneo cells to explore its function. Results: Our results revealed a complex classification of PE placental samples, wherein EOPE manifests as a highly homogeneous sample group characterized by hypoxia and HIF1A activation. Among the core features is the upregulation of glycolysis-related genes, particularly HK2, in the placenta, an observation corroborated by independent validation data and single-cell data. Building on the pronounced correlation between HK2 and EOPE, we conducted in vitro experiments to assess the potential functional impact of HK2 on trophoblast cells. Additionally, the LOPE samples exhibit strong heterogeneity and lack distinct features, suggesting a complex molecular makeup for this subtype. Unsupervised clustering analysis indicates that LOPE likely comprises at least two distinct subtypes, linked to cell-environment interaction and cytokine and protein modification functionalities.
Discussion: In summary, these findings elucidate potential mechanistic differences between the two PE subtypes, lend support to the hypothesis of classifying PE based on gestational weeks, and emphasize the potential significant role of glycolysis-related genes, especially HK2 in EOPE.
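As a rough illustration of what batch-effect adjustment does when expression data from several sources are merged, the sketch below centers one gene's values within each batch. This is deliberately not ComBat: ComBat also adjusts the scale of each batch and shrinks its per-batch estimates with an empirical Bayes step, none of which is shown here. The function name `center_by_batch` and the toy values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def center_by_batch(values, batches):
    """Subtract each batch's mean from that batch's samples (one gene).

    values  -- expression values for a single gene, one per sample
    batches -- batch label for each sample, in the same order
    """
    if len(values) != len(batches):
        raise ValueError("values and batches must have the same length")
    grouped = defaultdict(list)
    for value, batch in zip(values, batches):
        grouped[batch].append(value)
    batch_mean = {batch: mean(vals) for batch, vals in grouped.items()}
    return [value - batch_mean[batch] for value, batch in zip(values, batches)]


# Toy example: batch "B" is shifted upward by 4 units relative to batch "A".
expression = [5.0, 7.0, 9.0, 11.0]
batch_labels = ["A", "A", "B", "B"]
print(center_by_batch(expression, batch_labels))  # [-1.0, 1.0, -1.0, 1.0]
```

Plain centering like this assumes the batches have comparable biological composition; if a batch is confounded with the condition of interest, the same subtraction would also remove real signal.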

Affiliation(s)

Jie He Department of Obstetrics, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China; Chongqing Key Laboratory of Maternal and Fetal Medicine, Chongqing Medical University, Chongqing, China; Joint International Research Laboratory of Reproduction and Development of Chinese Ministry of Education, Chongqing Medical University, Chongqing, China
Huan Yang Department of Obstetrics, Chongqing University Three Gorges Hospital, Chongqing, China
Zheng Liu Department of Obstetrics, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China Chongqing Key Laboratory of Maternal and Fetal Medicine, Chongqing Medical University, Chongqing, China Joint International Research Laboratory of Reproduction and Development of Chinese Ministry of Education, Chongqing Medical University, Chongqing, China
Miaomiao Chen Maternal and Child Health Hospital of Hubei Province, Wuhan, China
Ying Ye Department of Cardiothoracic Surgery, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
Yuelan Tao Department of Obstetrics, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China Chongqing Key Laboratory of Maternal and Fetal Medicine, Chongqing Medical University, Chongqing, China Joint International Research Laboratory of Reproduction and Development of Chinese Ministry of Education, Chongqing Medical University, Chongqing, China
Shuhong Li Department of Oncology, Chengdu Second People’s Hospital, Chengdu, China
Jie Fang Department of Obstetrics, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China Chongqing Key Laboratory of Maternal and Fetal Medicine, Chongqing Medical University, Chongqing, China Joint International Research Laboratory of Reproduction and Development of Chinese Ministry of Education, Chongqing Medical University, Chongqing, China
Jiacheng Xu Department of Obstetrics, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China Chongqing Key Laboratory of Maternal and Fetal Medicine, Chongqing Medical University, Chongqing, China Joint International Research Laboratory of Reproduction and Development of Chinese Ministry of Education, Chongqing Medical University, Chongqing, China
Xiafei Wu Department of Obstetrics, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China Chongqing Key Laboratory of Maternal and Fetal Medicine, Chongqing Medical University, Chongqing, China Joint International Research Laboratory of Reproduction and Development of Chinese Ministry of Education, Chongqing Medical University, Chongqing, China
Hongbo Qi Department of Obstetrics, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China Chongqing Key Laboratory of Maternal and Fetal Medicine, Chongqing Medical University, Chongqing, China Joint International Research Laboratory of Reproduction and Development of Chinese Ministry of Education, Chongqing Medical University, Chongqing, China Department of Obstetrics and Gynecology, Women and Children’s Hospital of Chongqing Medical University, Chongqing, China

Wang P, Paquet ÉR, Robert C. Comprehensive transcriptomic analysis of long non-coding RNAs in bovine ovarian follicles and early embryos. PLoS One 2023;18:e0291761. [PMID: 37725621 PMCID: PMC10508637 DOI: 10.1371/journal.pone.0291761] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Accepted: 09/05/2023] [Indexed: 09/21/2023] Open

Abstract

Long non-coding RNAs (lncRNAs) have been the subject of numerous studies over the past decade. First thought to come from aberrant transcriptional events, lncRNAs are now considered a crucial component of the genome with roles in multiple cellular functions. However, the functional annotation and characterization of bovine lncRNAs during early development remain limited. In this comprehensive analysis, we review lncRNA expression in bovine ovarian follicles and early embryos, based on a unique database comprising 468 microarray hybridizations from a single platform designed to target 7,724 lncRNA transcripts, of which 5,272 are intergenic (lincRNA), 958 are intronic, and 1,524 are antisense (lncNAT). Compared to translated mRNA, lncRNAs have been shown to be more tissue-specific and expressed in low copy numbers. This analysis revealed that protein-coding genes and lncRNAs are both expressed more in oocytes. Differences between the oocyte and the 2-cell embryo are also more apparent in terms of lncRNAs than mRNAs. Co-expression network analysis using WGCNA generated 25 modules with differing proportions of lncRNAs. The modules exhibiting a higher proportion of lncRNAs were found to be associated with fewer annotated mRNAs and housekeeping functions. Functional annotation of co-expressed mRNAs allowed attribution of lncRNAs to a wide array of key cellular events such as meiosis, translation initiation, immune response, and mitochondria-related functions. We thus provide evidence that lncRNAs play diverse physiological roles that are tissue-specific and associated with key cellular functions alongside mRNAs in bovine ovarian follicles and early embryos. This adds lncRNAs to the active molecules in the complex regulatory networks driving folliculogenesis, oogenesis, and early embryogenesis, all of which are necessary for reproductive success.

Samadishadlou M, Rahbarghazi R, Piryaei Z, Esmaeili M, Avcı ÇB, Bani F, Kavousi K. Unlocking the potential of microRNAs: machine learning identifies key biomarkers for myocardial infarction diagnosis. Cardiovasc Diabetol 2023;22:247. [PMID: 37697288 PMCID: PMC10496209 DOI: 10.1186/s12933-023-01957-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Accepted: 08/10/2023] [Indexed: 09/13/2023] Open

Abstract

BACKGROUND

MicroRNAs (miRNAs) play a crucial role in regulating adaptive and maladaptive responses in cardiovascular diseases, making them attractive targets for potential biomarkers. However, their potential as novel biomarkers for diagnosing cardiovascular diseases requires systematic evaluation.

METHODS

In this study, we aimed to identify a key set of miRNA biomarkers using integrated bioinformatics and machine learning analysis. We combined and analyzed three gene expression datasets from the Gene Expression Omnibus (GEO) database, which contains peripheral blood mononuclear cell (PBMC) samples from individuals with myocardial infarction (MI), stable coronary artery disease (CAD), and healthy individuals. Additionally, we selected a set of miRNAs based on their area under the receiver operating characteristic curve (AUC-ROC) for separating the CAD and MI samples. We designed a two-layer architecture for sample classification, in which the first layer isolates healthy samples from unhealthy samples, and the second layer classifies stable CAD and MI samples. We trained different machine learning models using both biomarker sets and evaluated their performance on a test set.

RESULTS

We identified hsa-miR-21-3p, hsa-miR-186-5p, and hsa-miR-32-3p as the differentially expressed miRNAs, and a set including hsa-miR-186-5p, hsa-miR-21-3p, hsa-miR-197-5p, hsa-miR-29a-5p, and hsa-miR-296-5p as the optimum set of miRNAs selected by their AUC-ROC. Both biomarker sets could distinguish healthy from not-healthy samples with complete accuracy. The best performance for the classification of CAD and MI was achieved with an SVM model trained using the biomarker set selected by AUC-ROC, with an AUC-ROC of 0.96 and an accuracy of 0.94 on the test data.

CONCLUSIONS

Our study demonstrated that miRNA signatures derived from PBMCs could serve as valuable novel biomarkers for cardiovascular diseases.
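
The AUC-ROC-based selection step described in the Methods above can be illustrated with a minimal sketch. It assumes that each miRNA is scored on its own by the probability that a random MI sample has a higher value than a random CAD sample. The function names, the toy values, and the folding of AUC around 0.5 are illustrative assumptions and not the authors' implementation.

```python
from itertools import product

def auc_roc(pos_scores, neg_scores):
    """AUC as the probability that a random positive outscores a random negative.
    Ties count as half a win (Mann-Whitney formulation)."""
    wins = 0.0
    for p, n in product(pos_scores, neg_scores):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def rank_features(expr_by_feature, labels, top_k):
    """Rank features by how well each one alone separates label 1 from label 0."""
    ranked = []
    for name, values in expr_by_feature.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        auc = auc_roc(pos, neg)
        # A feature that is lower in positives is equally informative,
        # so fold the AUC around 0.5 (an assumption of this sketch).
        ranked.append((max(auc, 1.0 - auc), name))
    ranked.sort(reverse=True)
    return [name for _, name in ranked[:top_k]]

# Hypothetical toy data: 1 = MI, 0 = stable CAD (made-up values, not from the study).
labels = [1, 1, 1, 0, 0, 0]
toy = {
    "mirA": [5.0, 6.0, 7.0, 1.0, 2.0, 3.0],  # higher in MI
    "mirB": [1.0, 5.0, 3.0, 2.0, 4.0, 6.0],  # weak separation
    "mirC": [1.0, 2.0, 3.0, 7.0, 8.0, 9.0],  # lower in MI
}
selected = rank_features(toy, labels, top_k=2)
```

In a real analysis the ranking would be computed on training data only, so that the held-out test set stays untouched by feature selection.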

Yu Y, Zhang N, Mai Y, Ren L, Chen Q, Cao Z, Chen Q, Liu Y, Hou W, Yang J, Hong H, Xu J, Tong W, Dong L, Shi L, Fang X, Zheng Y. Correcting batch effects in large-scale multiomics studies using a reference-material-based ratio method. Genome Biol 2023;24:201. [PMID: 37674217 PMCID: PMC10483871 DOI: 10.1186/s13059-023-03047-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/08/2022] [Accepted: 05/18/2023] [Indexed: 09/08/2023] Open

Affiliation(s)

Ying Yu: State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
Naixin Zhang: State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
Yuanbang Mai: State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
Luyao Ren: State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
Qiaochu Chen: State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
Zehui Cao: State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
Qingwang Chen: State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
Yaqing Liu: State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
Wanwan Hou: State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
Jingcheng Yang: State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China; Greater Bay Area Institute of Precision Medicine, Guangzhou, Guangdong, China
Huixiao Hong: Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
Joshua Xu: Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
Weida Tong: Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
Lianhua Dong: National Institute of Metrology, Beijing, China
Leming Shi: State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China; International Human Phenome Institutes, Shanghai, China
Xiang Fang: National Institute of Metrology, Beijing, China
Yuanting Zheng: State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China

Rokavec M, Özcan E, Neumann J, Hermeking H. Development and Validation of a 15-gene Expression Signature with Superior Prognostic Ability in Stage II Colorectal Cancer. CANCER RESEARCH COMMUNICATIONS 2023;3:1689-1700. [PMID: 37654625 PMCID: PMC10467603 DOI: 10.1158/2767-9764.crc-22-0489] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2022] [Revised: 01/24/2023] [Accepted: 07/31/2023] [Indexed: 09/02/2023]

Abstract

Currently, there is no consensus about the use of adjuvant chemotherapy for patients with stage II colorectal cancer. Here, we aimed to identify and validate a prognostic mRNA expression signature for the stratification of patients with stage II colorectal cancer according to their risk for relapse. First, publicly available mRNA expression profiling datasets from 792 primary, stage II colorectal cancers from six different training cohorts were analyzed to identify genes that are consistently associated with patient relapse-free survival (RFS). Second, the identified gene expression signature was experimentally validated using NanoString technology and computationally refined on primary colorectal cancer samples from 205 patients with stage II colorectal cancer. Third, the refined signature was validated in two independent publicly available cohorts of 166 patients with stage II colorectal cancer. Bioinformatics analysis of training cohorts identified a 61-gene signature that was highly significantly associated with RFS (HR = 37.08, P = 2.68 × 10^-106, sensitivity = 89.29%, specificity = 89.61%, and AUC = 0.937). The experimental validation and refinement revealed a 15-gene signature that robustly predicted relapse in three independent cohorts: an in-house cohort (HR = 20.4, P = 8.73 × 10^-23, sensitivity = 90.32%, specificity = 80.99%, AUC = 0.812), GSE161158 (HR = 5.81, P = 3.57 × 10^-4, sensitivity = 64.29%, specificity = 81.67%, AUC = 0.796), and GSE26906 (HR = 7.698, P = 7.26 × 10^-8, sensitivity = 61.54%, specificity = 78.33%, AUC = 0.752). In the pooled training cohort, the 15-gene signature (HR = 4.72, P = 7.76 × 10^-25, sensitivity = 75%, specificity = 67.44%, AUC = 0.784) was superior to the Oncotype DX colon 7-gene signature (HR = 2.698, P = 6.3 × 10^-8, sensitivity = 62.16%, specificity = 55.5%, AUC = 0.633).
We report the identification and validation of a novel mRNA expression signature for robust prognostication and stratification of patients with stage II colorectal cancer, with superior performance in the analyzed validation cohorts when compared with clinicopathologic biomarkers and signatures currently used for stage II colorectal cancer prognostication.

Significance

We identified and validated a 15-gene expression signature for robust prognostication and stratification of patients with stage II colorectal cancer, with superior performance when compared with currently used biomarkers. Therefore, the 15-gene expression signature has the potential to improve the prognostication and treatment decisions for patients with stage II colorectal cancer.

Mei T, Li Y, Orduña Dolado A, Li Z, Andersson R, Berliocchi L, Rasmussen LJ. Pooled analysis of frontal lobe transcriptomic data identifies key mitophagy gene changes in Alzheimer's disease brain. Front Aging Neurosci 2023;15:1101216. [PMID: 37358952 PMCID: PMC10288858 DOI: 10.3389/fnagi.2023.1101216] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/17/2022] [Accepted: 05/18/2023] [Indexed: 06/28/2023] Open

Abstract

Background

The growing prevalence of Alzheimer's disease (AD) is becoming a global health challenge without effective treatments. Defective mitochondrial function and mitophagy have recently been suggested as etiological factors in AD, in association with abnormalities in components of the autophagic machinery like lysosomes and phagosomes. Several large transcriptomic studies have been performed on different brain regions from AD and healthy patients, and their data represent a vast source of important information that can be utilized to understand this condition. However, large integration analyses of these publicly available data, such as AD RNA-Seq data, are still missing. In addition, large-scale focused analysis on mitophagy, which seems to be relevant for the aetiology of the disease, has not yet been performed.

Methods

In this study, publicly available raw RNA-Seq data generated from healthy control and sporadic AD post-mortem human samples of the brain frontal lobe were collected and integrated. Sex-specific differential expression analysis was performed on the combined data set after batch effect correction. From the resulting set of differentially expressed genes, candidate mitophagy-related genes were identified based on their known functional roles in mitophagy, the lysosome, or the phagosome, followed by Protein-Protein Interaction (PPI) and microRNA-mRNA network analysis. The expression changes of candidate genes were further validated in human skin fibroblasts and induced pluripotent stem cell (iPSC)-derived cortical neurons from AD patients and matching healthy controls.

Results

From a large dataset (AD: 589; control: 246) based on three different datasets (i.e., ROSMAP, MSBB, & GSE110731), we identified 299 candidate mitophagy-related differentially expressed genes (DEG) in sporadic AD patients (male: 195, female: 188). Among these, the AAA ATPase VCP, the GTPase ARF1, the autophagic vesicle forming protein GABARAPL1 and the cytoskeleton protein actin beta ACTB were selected based on network degrees and existing literature. Changes in their expression were further validated in AD-relevant human in vitro models, which confirmed their down-regulation in AD conditions.

Conclusion

Through the joint analysis of multiple publicly available data sets, we identify four differentially expressed key mitophagy-related genes potentially relevant for the pathogenesis of sporadic AD. Changes in expression of these four genes were validated using two AD-relevant human in vitro models, primary human fibroblasts and iPSC-derived neurons. Our results provide a foundation for further investigation of these genes as potential biomarkers or disease-modifying pharmacological targets.

Stokes T, Cen HH, Kapranov P, Gallagher IJ, Pitsillides AA, Volmar C, Kraus WE, Johnson JD, Phillips SM, Wahlestedt C, Timmons JA. Transcriptomics for Clinical and Experimental Biology Research: Hang on a Seq. ADVANCED GENETICS (HOBOKEN, N.J.) 2023;4:2200024. [PMID: 37288167 PMCID: PMC10242409 DOI: 10.1002/ggn2.202200024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2022] [Indexed: 06/09/2023]

Ni A, Liu M, Qin LX. BatMan: Mitigating Batch Effects Via Stratification for Survival Outcome Prediction. JCO Clin Cancer Inform 2023;7:e2200138. [PMID: 37335961 PMCID: PMC10530623 DOI: 10.1200/cci.22.00138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2022] [Accepted: 01/31/2023] [Indexed: 06/21/2023] Open

Abstract

Reproducible translation of transcriptomics data has been hampered by the ubiquitous presence of batch effects. Statistical methods for managing batch effects were initially developed in the setting of sample group comparison and later borrowed for other settings such as survival outcome prediction. The most notable such method is ComBat, which adjusts for batches by including batch as a covariate alongside sample groups in a linear regression. In survival prediction, however, ComBat is used without definable groups for survival outcome and is done sequentially with survival regression for a potentially batch-confounded outcome. To address these issues, we propose a new method called BATch MitigAtion via stratificatioN (BatMan). It adjusts batches as strata in survival regression and uses variable selection methods such as regularized regression to handle high dimensionality. We assess the performance of BatMan in comparison with ComBat, each used either alone or in conjunction with data normalization, in a resampling-based simulation study under various levels of predictive signal strength and patterns of batch-outcome association. Our simulations show that (1) BatMan outperforms ComBat in nearly all scenarios when there are batch effects in the data and (2) their performance can be worsened by the addition of data normalization. We further evaluate them using microRNA data for ovarian cancer from the Cancer Genome Atlas and find that BatMan outperforms ComBat while the addition of data normalization worsens the prediction. Our study thus shows the advantage of BatMan and raises caution about the use of data normalization in the context of developing survival prediction models. The BatMan method and the simulation tool for performance assessment are implemented in R and publicly available on GitHub at LXQin/PRECISION.survival.
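
The idea of treating batches as strata in survival regression can be sketched as follows. This is a minimal illustration of a stratified Cox partial log-likelihood, not the BatMan implementation (which the abstract says is written in R); the function name, the Breslow-style handling of ties, the toy data, and the omission of any penalty term are all assumptions of this sketch.

```python
import math

def stratified_cox_loglik(beta, times, events, X, batch):
    """Cox partial log-likelihood with risk sets formed within each batch (stratum)."""
    # Linear predictor for every sample.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored samples contribute only through risk sets
        # Risk set: samples in the SAME batch still under observation at t_i.
        risk = [j for j in range(len(times))
                if batch[j] == batch[i] and times[j] >= t_i]
        loglik += eta[i] - math.log(sum(math.exp(eta[j]) for j in risk))
    return loglik

# Hypothetical toy data: two batches, one censored sample.
times = [1, 2, 3, 1, 2]
events = [1, 1, 0, 1, 1]
batch = [0, 0, 0, 1, 1]
# Second column is a batch indicator, standing in for an additive batch shift.
X = [[0.5, 0.0], [1.0, 0.0], [-1.0, 0.0], [0.2, 1.0], [0.3, 1.0]]

ll_null = stratified_cox_loglik([0.0, 0.0], times, events, X, batch)
ll_no_shift = stratified_cox_loglik([0.7, 0.0], times, events, X, batch)
ll_shifted = stratified_cox_loglik([0.7, 5.0], times, events, X, batch)
```

Because each sample is compared only with samples from its own batch, an additive batch shift in the linear predictor cancels within every risk set, so `ll_no_shift` and `ll_shifted` agree up to rounding. A regularized fit would maximize this quantity minus a penalty on `beta`.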

Liu Y, Wei X, Feng X, Liu Y, Feng G, Du Y. Repeatability of radiomics studies in colorectal cancer: a systematic review. BMC Gastroenterol 2023;23:125. [PMID: 37059990 PMCID: PMC10105401 DOI: 10.1186/s12876-023-02743-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2021] [Accepted: 03/22/2023] [Indexed: 04/16/2023] Open

Fajarda O, Almeida JR, Duarte-Pereira S, Silva RM, Oliveira JL. Methodology to identify a gene expression signature by merging microarray datasets. Comput Biol Med 2023;159:106867. [PMID: 37060770 DOI: 10.1016/j.compbiomed.2023.106867] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Revised: 03/01/2023] [Accepted: 03/30/2023] [Indexed: 04/17/2023]

Zhang D, Wang Y, Zhao F, Yang Q. Integrated multiomics analyses unveil the implication of a costimulatory molecule score on tumor aggressiveness and immune evasion in breast cancer: A large-scale study through over 8,000 patients. Comput Biol Med 2023;159:106866. [PMID: 37068318 DOI: 10.1016/j.compbiomed.2023.106866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Revised: 02/05/2023] [Accepted: 03/30/2023] [Indexed: 04/08/2023]

Abstract

BACKGROUND

Although immunotherapy has revolutionised cancer management, reliable genomic biomarkers for identifying eligible patient subpopulations are lacking. Costimulatory molecules play a crucial role in mounting anti-tumour responses, and clinical trials targeting these novel biomarkers are underway. However, whether these molecules can determine tumour aggressiveness and the risk of tumour evasion in breast cancer (BC) remains largely unknown.

METHODS

The whole-tissue transcriptomic data of 8236 patients with BC from 15 independent cohorts were extracted. An integrated scoring system named 'costimulatory molecule score' (CMS) was constructed and sufficiently validated using least absolute shrinkage and selection operator regression (1000 iterations) and the random survival forest algorithm (1000 trees). The correlation among CMSs, cancer genotypes and clinicopathological characteristics was examined. Extensive multiomics and immunogenomic analyses were performed to investigate and verify the association among CMSs, enriched pathways, potential intrinsic and extrinsic immune escape mechanisms, immunotherapy response and therapeutic options.

RESULTS

The CMS model, which relies on the expression pattern of only 5 costimulatory genes, predicted prognosis in an almost universally applicable, platform-independent manner across BC patients. Through internal and external in silico validation, high CMS was characterized by favorable genotypes but decreased tumor immunogenicity, activation of stroma, immune-suppressive states and potential immunotherapeutic resistance. Similar results were observed in a real-world immunotherapy cohort and Pan-Cancer analysis.

CONCLUSION

This comprehensive characterization indicates that the CMS model may serve as a complementary tool for predicting tumor aggressiveness and immune evasion in BC patients, underlining its future clinical potential for further exploration of resistance mechanisms and optimization of immunotherapeutic strategies.

Elingaard-Larsen LO, Villumsen SO, Justesen L, Thuesen ACB, Kim M, Ali M, Danielsen ER, Legido-Quigley C, van Hall G, Hansen T, Ahluwalia TS, Vaag AA, Brøns C. Circulating Metabolomic and Lipidomic Signatures Identify a Type 2 Diabetes Risk Profile in Low-Birth-Weight Men with Non-Alcoholic Fatty Liver Disease. Nutrients 2023;15:nu15071590. [PMID: 37049431 PMCID: PMC10096690 DOI: 10.3390/nu15071590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2023] [Revised: 03/09/2023] [Accepted: 03/15/2023] [Indexed: 03/28/2023] Open

Carry PM, Vigers T, Vanderlinden LA, Keeter C, Dong F, Buckner T, Litkowski E, Yang I, Norris JM, Kechris K. Propensity scores as a novel method to guide sample allocation and minimize batch effects during the design of high throughput experiments. BMC Bioinformatics 2023;24:86. [PMID: 36882691 PMCID: PMC9990331 DOI: 10.1186/s12859-023-05202-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2022] [Accepted: 02/22/2023] [Indexed: 03/09/2023] Open

Abstract

BACKGROUND

We developed a novel approach to minimize batch effects when assigning samples to batches. Our algorithm selects a batch allocation, among all possible ways of assigning samples to batches, that minimizes differences in average propensity score between batches. This strategy was compared to randomization and stratified randomization in a case-control study (30 per group) with a covariate (case vs control, represented as β1, set to be null) and two biologically relevant confounding variables (age, represented as β2, and hemoglobin A1c (HbA1c), represented as β3). Gene expression values were obtained from a publicly available dataset of expression data obtained from pancreas islet cells. Batch effects were simulated as twice the median biological variation across the gene expression dataset and were added to the publicly available dataset to simulate a batch effect condition. Bias was calculated as the absolute difference between observed betas under the batch allocation strategies and the true beta (no batch effects). Bias was also evaluated after adjustment for batch effects using ComBat as well as a linear regression model. In order to understand performance of our optimal allocation strategy under the alternative hypothesis, we also evaluated bias at a single gene associated with both age and HbA1c levels in the 'true' dataset (CAPN13 gene).

RESULTS

Before batch correction, under the null hypothesis (β1), maximum absolute bias and root mean square (RMS) of maximum absolute bias were minimized using the optimal allocation strategy. Under the alternative hypothesis (β2 and β3 for the CAPN13 gene), maximum absolute bias and RMS of maximum absolute bias were also consistently lower using the optimal allocation strategy. ComBat and the regression batch adjustment methods performed well, as the bias estimates moved towards the true values in all conditions under both the null and alternative hypotheses. Although the differences between methods were less pronounced following batch correction, estimates of bias (average and RMS) were consistently lower using the optimal allocation strategy under both the null and alternative hypotheses.

CONCLUSIONS

Our algorithm provides an extremely flexible and effective method for assigning samples to batches by exploiting knowledge of covariates prior to sample allocation.
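
The allocation step described above, choosing among all possible assignments the one that minimizes the difference in average propensity score between batches, can be sketched for the simplest case. The sketch assumes two equal-sized batches and propensity scores that have already been estimated; the function name and the toy scores are hypothetical, and the published algorithm may differ in how it handles more batches or larger sample sizes.

```python
from itertools import combinations

def best_two_batch_allocation(scores):
    """Exhaustively split samples into two equal-sized batches, minimizing
    the absolute difference in mean propensity score between the batches."""
    n = len(scores)
    if n % 2 != 0:
        raise ValueError("this sketch assumes an even number of samples")
    half = n // 2
    total = sum(scores)
    best = None
    for batch_a in combinations(range(n), half):
        if 0 not in batch_a:
            continue  # fix sample 0 in batch A to skip mirror-image duplicates
        sum_a = sum(scores[i] for i in batch_a)
        diff = abs(sum_a / half - (total - sum_a) / half)
        if best is None or diff < best[0]:
            best = (diff, batch_a)
    diff, batch_a = best
    batch_b = tuple(i for i in range(n) if i not in batch_a)
    return batch_a, batch_b, diff

# Hypothetical precomputed propensity scores for six samples.
scores = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7]
batch_a, batch_b, diff = best_two_batch_allocation(scores)
```

Exhaustive enumeration grows combinatorially with sample size, so a study with 60 samples would need a heuristic or random search over candidate allocations instead of the full loop shown here.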

Juan H, Huang H. Quantitative analysis of high‐throughput biological data. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2023. [DOI: 10.1002/wcms.1658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]

Gregori J, Sánchez À, Villanueva J. msmsEDA & msmsTests: Label-Free Differential Expression by Spectral Counts. Methods Mol Biol 2023;2426:197-242. [PMID: 36308691 DOI: 10.1007/978-1-0716-1967-4_10] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/16/2023]

Chicco D, Oneto L, Tavazzi E. Eleven quick tips for data cleaning and feature engineering. PLoS Comput Biol 2022;18:e1010718. [PMID: 36520712 PMCID: PMC9754225 DOI: 10.1371/journal.pcbi.1010718] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 12/23/2022] Open

Yang Q, Li B, Wang P, Xie J, Feng Y, Liu Z, Zhu F. LargeMetabo: an out-of-the-box tool for processing and analyzing large-scale metabolomic data. Brief Bioinform 2022;23:6768054. [PMID: 36274234 DOI: 10.1093/bib/bbac455] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 06/04/2022] [Revised: 09/06/2022] [Accepted: 09/24/2022] [Indexed: 12/14/2022] Open

Adamer MF, Brüningk SC, Tejada-Arranz A, Estermann F, Basler M, Borgwardt K. reComBat: batch-effect removal in large-scale multi-source gene-expression data integration. BIOINFORMATICS ADVANCES 2022;2:vbac071. [PMID: 36699372 PMCID: PMC9710604 DOI: 10.1093/bioadv/vbac071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Revised: 09/01/2022] [Accepted: 09/26/2022] [Indexed: 01/28/2023]

Borisov N, Buzdin A. Transcriptomic Harmonization as the Way for Suppressing Cross-Platform Bias and Batch Effect. Biomedicines 2022;10:2318. [PMID: 36140419 PMCID: PMC9496268 DOI: 10.3390/biomedicines10092318] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/22/2022] [Revised: 09/14/2022] [Accepted: 09/16/2022] [Indexed: 11/16/2022] Open

Liu H, Xing K, Jiang Y, Liu Y, Wang C, Ding X. Using Machine Learning to Identify Biomarkers Affecting Fat Deposition in Pigs by Integrating Multisource Transcriptome Information. JOURNAL OF AGRICULTURAL AND FOOD CHEMISTRY 2022;70:10359-10370. [PMID: 35953074 PMCID: PMC9413214 DOI: 10.1021/acs.jafc.2c03339] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 05/16/2022] [Revised: 07/27/2022] [Accepted: 07/29/2022] [Indexed: 06/15/2023]

Huang HH, Rao H, Miao R, Liang Y. A novel meta-analysis based on data augmentation and elastic data shared lasso regularization for gene expression. BMC Bioinformatics 2022;23:353. [PMID: 35999505 PMCID: PMC9396780 DOI: 10.1186/s12859-022-04887-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/08/2022] [Accepted: 08/10/2022] [Indexed: 12/22/2022] Open

Abstract

Background

Gene expression analysis can provide useful information for analyzing complex biological mechanisms. However, many reported findings are unrepeatable due to small sample sizes relative to a large number of genes and the low signal-to-noise ratios of most gene expression datasets.

Results

Meta-analysis of multi-data sets is an efficient method for tackling the above problem. To improve the performance of meta-analysis, we propose a novel meta-analysis framework. It consists of two parts: (1) a novel data augmentation strategy. Various cross-platform normalization methods exist, which can preserve original biological information of gene expression datasets from different angles and add different “perturbations” to the dataset. Using such perturbation, we provide a feasible means for gene expression data augmentation; (2) elastic data shared lasso (DSL-L2). The DSL-L2 method spans the continuum between individual models for each dataset and one model for all datasets. It also overcomes the shortcomings of the data shared lasso method when dealing with highly correlated features. Comprehensive simulation experiment results show that the proposed method has high prediction and gene selection performance. We then apply the proposed method to non-small cell lung cancer (NSCLC) blood gene expression data in order to identify key tumor-related genes. The outcomes of our experiment indicate that the method could be used for identifying a set of robust disease-related gene signatures that may be used for NSCLC early diagnosis or prognosis or even targeting.

Conclusion

We propose a novel and effective meta-analysis method for biological research, extrapolating and integrating information from multiple gene expression datasets.

Kumar R, Khatri A, Acharya V. Deep learning uncovers distinct behavior of rice network to pathogens response. iScience 2022;25:104546. [PMID: 35754717 PMCID: PMC9218438 DOI: 10.1016/j.isci.2022.104546] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2022] [Revised: 05/06/2022] [Accepted: 06/02/2022] [Indexed: 12/15/2022] Open

Yuan Z, Murakoshi N, Xu D, Tajiri K, Okabe Y, Aonuma K, Murakata Y, Li S, Song Z, Shimoda Y, Mori H, Aonuma K, Ieda M. Identification of potential dilated cardiomyopathy-related targets by meta-analysis and co-expression analysis of human RNA-sequencing datasets. Life Sci 2022;306:120807. [PMID: 35841977 DOI: 10.1016/j.lfs.2022.120807] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2022] [Revised: 06/27/2022] [Accepted: 07/11/2022] [Indexed: 11/17/2022]

Sarafidis M, Lambrou GI, Zoumpourlis V, Koutsouris D. An Integrated Bioinformatics Analysis towards the Identification of Diagnostic, Prognostic, and Predictive Key Biomarkers for Urinary Bladder Cancer. Cancers (Basel) 2022;14:cancers14143358. [PMID: 35884419 PMCID: PMC9319344 DOI: 10.3390/cancers14143358] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2022] [Revised: 07/03/2022] [Accepted: 07/06/2022] [Indexed: 02/04/2023] Open

Abstract

Simple Summary

Bladder cancer is evidently a challenge as far as its prognosis and treatment are concerned. The investigation of potential biomarkers and therapeutic targets is indispensable and still in progress. Most studies attempt to identify differential signatures between distinct molecular tumor subtypes. Therefore, keeping in mind the heterogeneity of urinary bladder tumors, we attempted to identify a consensus gene-related signature between the common expression profile of bladder cancer and control samples. In the quest for substantive features, we were able to identify key hub genes, whose signatures could hold diagnostic, prognostic, or therapeutic significance, but, primarily, could contribute to a better understanding of urinary bladder cancer biology.

Abstract

Bladder cancer (BCa) is one of the most prevalent cancers worldwide and accounts for high morbidity and mortality. This study intended to elucidate potential key biomarkers related to the occurrence, development, and prognosis of BCa through an integrated bioinformatics analysis. In this context, a systematic meta-analysis, integrating 18 microarray gene expression datasets from the GEO repository into a merged meta-dataset, identified 815 robust differentially expressed genes (DEGs). The key hub genes resulting from DEG-based protein–protein interaction and weighted gene co-expression network analyses were screened for their differential expression in urine and blood plasma samples of BCa patients. Subsequently, they were tested for their prognostic value, and a three-gene signature model, including COL3A1, FOXM1, and PLK4, was built. In addition, they were tested for their predictive value regarding muscle-invasive BCa patients’ response to neoadjuvant chemotherapy. A six-gene signature model, including ANXA5, CD44, NCAM1, SPP1, CDCA8, and KIF14, was developed. In conclusion, this study identified nine key biomarker genes, namely ANXA5, CDT1, COL3A1, SPP1, VEGFA, CDCA8, HJURP, TOP2A, and COL6A1, which were differentially expressed in urine or blood of BCa patients, held a prognostic or predictive value, and were immunohistochemically validated. These biomarkers may be of significance as prognostic and therapeutic targets for BCa.

Niu J, Yang J, Guo Y, Qian K, Wang Q. Joint deep learning for batch effect removal and classification toward MALDI MS based metabolomics. BMC Bioinformatics 2022;23:270. [PMID: 35818047 PMCID: PMC9275160 DOI: 10.1186/s12859-022-04758-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2021] [Accepted: 05/30/2022] [Indexed: 12/02/2022] Open

Kong J, Ha D, Lee J, Kim I, Park M, Im SH, Shin K, Kim S. Network-based machine learning approach to predict immunotherapy response in cancer patients. Nat Commun 2022;13:3703. [PMID: 35764641 PMCID: PMC9240063 DOI: 10.1038/s41467-022-31535-6] [Citation(s) in RCA: 66] [Impact Index Per Article: 33.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2021] [Accepted: 06/22/2022] [Indexed: 11/08/2022] Open

Niu J, Xu W, Wei D, Qian K, Wang Q. Deep Learning Framework for Integrating Multibatch Calibration, Classification, and Pathway Activities. Anal Chem 2022;94:8937-8946. [PMID: 35709357 DOI: 10.1021/acs.analchem.2c00601] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

Pathway importance by graph convolutional network and Shapley additive explanations in gene expression phenotype of diffuse large B-cell lymphoma. PLoS One 2022;17:e0269570. [PMID: 35749395 PMCID: PMC9231717 DOI: 10.1371/journal.pone.0269570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2021] [Accepted: 05/09/2022] [Indexed: 11/30/2022] Open

Tamposis IA, Manios GA, Charitou T, Vennou KE, Kontou PI, Bagos PG. MAGE: An Open-Source Tool for Meta-Analysis of Gene Expression Studies. BIOLOGY 2022;11:biology11060895. [PMID: 35741417 PMCID: PMC9220151 DOI: 10.3390/biology11060895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/12/2022] [Revised: 06/05/2022] [Accepted: 06/08/2022] [Indexed: 11/16/2022]

Reassessment of Reliability and Reproducibility for Triple-Negative Breast Cancer Subtyping. Cancers (Basel) 2022;14:cancers14112571. [PMID: 35681552 PMCID: PMC9179838 DOI: 10.3390/cancers14112571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/07/2022] [Revised: 05/05/2022] [Accepted: 05/06/2022] [Indexed: 11/17/2022] Open

Abstract

Simple Summary

Triple-negative breast cancer (TNBC) is a heterogeneous disease. A proper classification system is needed to develop targetable biomarkers and guide personalized treatment in clinical practice. However, there has been no consensus on the molecular subtypes of TNBC, probably due to discrepancies in technical and computational methods chosen by different research groups. In this paper, we reassessed each major step for TNBC subtyping and provided suggestions, which promote rational workflow design and ensure reliable and reproducible results for future studies. We presented a recommended pipeline to the existing data, validated established TNBC subtypes with a larger sample size, and revealed two intermediate subtypes with prognostic significance. This work provides perspectives on issues and limitations regarding TNBC subtyping, indicating promising directions for developing targeted therapy based on the molecular characteristics of each TNBC subtype.

Abstract

Triple-negative breast cancer (TNBC) is a heterogeneous disease with diverse, often poor prognoses and treatment responses. In order to identify targetable biomarkers and guide personalized care, scientists have developed multiple molecular classification systems for TNBC based on transcriptomic profiling. However, there is no consensus on the molecular subtypes of TNBC, likely due to discrepancies in technical and computational methods used by different research groups. Here, we reassessed the major steps for TNBC subtyping, validated the reproducibility of established TNBC subtypes, and identified two more subtypes with a larger sample size. By comparing results from different workflows, we demonstrated the limitations of formalin-fixed, paraffin-embedded samples, as well as batch effect removal across microarray platforms. We also refined the usage of computational tools for TNBC subtyping. Furthermore, we integrated high-quality multi-institutional TNBC datasets (discovery set: n = 457; validation set: n = 165). Performing unsupervised clustering on the discovery and validation sets independently, we validated four previously discovered subtypes: luminal androgen receptor, mesenchymal, immunomodulatory, and basal-like immunosuppressed. Additionally, we identified two potential intermediate states of TNBC tumors based on their resemblance with more than one well-characterized subtype. In summary, we addressed the issues and limitations of previous TNBC subtyping through comprehensive analyses. Our results promote the rational design of future subtyping studies and provide new insights into TNBC patient stratification.

Augustine J, Jereesh AS. Blood-based gene-expression biomarkers identification for the non-invasive diagnosis of Parkinson's disease using two-layer hybrid feature selection. Gene X 2022;823:146366. [PMID: 35202733 DOI: 10.1016/j.gene.2022.146366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2021] [Revised: 02/15/2022] [Accepted: 02/18/2022] [Indexed: 11/19/2022] Open

Zheng D, Zhu Y, Zhang J, Zhang W, Wang H, Chen H, Wu C, Ni J, Xu X, Nian B, Chen S, Wang B, Li X, Zhang Y, Zhang J, Zhong W, Xiong L, Li F, Zhang D, Xu J, Jiang G. Identification and evaluation of circulating small extracellular vesicle microRNAs as diagnostic biomarkers for patients with indeterminate pulmonary nodules. J Nanobiotechnology 2022;20:172. [PMID: 35366907 PMCID: PMC8976298 DOI: 10.1186/s12951-022-01366-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2021] [Accepted: 03/10/2022] [Indexed: 12/13/2022] Open

Abstract

Background

The identification of indeterminate pulmonary nodules (IPNs) following a low-dose computed tomography (LDCT) is a major challenge for early diagnosis of lung cancer. The inadequate assessment of IPNs’ malignancy risk results in a large number of unnecessary surgeries or an increased risk of cancer metastases. However, limited studies on non-invasive diagnosis of IPNs have been reported.

Methods

In this study, we identified and evaluated the diagnostic value of circulating small extracellular vesicle (sEV) microRNAs (miRNAs) in patients with IPNs that had been newly detected using LDCT scanning and were scheduled for surgery. Out of 459 recruited patients, 109 eligible patients with IPNs were enrolled in the training cohort (n = 47) and the test cohort (n = 62). An external cohort (n = 99) was used for validation. MiRNAs were extracted from plasma sEVs, and assessed using Small RNA sequencing. 490 lung adenocarcinoma samples and follow-up data were used to investigate the role of miRNAs in overall survival.

Results

A circulating sEV miRNA (CirsEV-miR) model was constructed from five differentially expressed miRNAs (DEMs), showing an AUC of 0.920 in the training cohort (n = 47), and further identified in the test cohort (n = 62) and in an external validation cohort (n = 99). Among the five DEMs of the CirsEV-miR model, miR-101-3p and miR-150-5p were significantly associated with better overall survival (p = 0.0001 and p = 0.0069). The CirsEV-miR scores correlated significantly with IPN diameters (p < 0.05) and were able to discriminate between benign and malignant PNs (diameter ≤ 1 cm). For the first time, the expression patterns of sEV miRNAs in the benign, adenocarcinoma in situ/minimally invasive adenocarcinoma, and invasive adenocarcinoma subgroups were found to change gradually with increasing aggressiveness. Among all DEMs of the three subgroups, five miRNAs (miR-30c-5p, miR-30e-5p, miR-500a-3p, miR-125a-5p, and miR-99a-5p) were also significantly associated with overall survival of lung adenocarcinoma patients.

Conclusions

Our results indicate that the CirsEV-miR model could help distinguish between benign and malignant PNs, providing insights into the feasibility of circulating sEV miRNAs in diagnostic biomarker development.

Trial registration: Chinese Clinical Trials: ChiCTR1800019877. Registered 05 December 2018, https://www.chictr.org.cn/showproj.aspx?proj=31346.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12951-022-01366-0.

Liu YE, Saul S, Rao AM, Robinson ML, Agudelo Rojas OL, Sanz AM, Verghese M, Solis D, Sibai M, Huang CH, Sahoo MK, Gelvez RM, Bueno N, Estupiñan Cardenas MI, Villar Centeno LA, Rojas Garrido EM, Rosso F, Donato M, Pinsky BA, Einav S, Khatri P. An 8-gene machine learning model improves clinical prediction of severe dengue progression. Genome Med 2022;14:33. [PMID: 35346346 PMCID: PMC8959795 DOI: 10.1186/s13073-022-01034-w] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2021] [Accepted: 02/24/2022] [Indexed: 02/06/2023] Open

Abstract

BACKGROUND

Each year 3-6 million people develop life-threatening severe dengue (SD). Clinical warning signs for SD manifest late in the disease course and are nonspecific, leading to missed cases and excess hospital burden. Better SD prognostics are urgently needed.

METHODS

We integrated 11 public datasets profiling the blood transcriptome of 365 dengue patients of all ages and from seven countries, encompassing biological, clinical, and technical heterogeneity. We performed an iterative multi-cohort analysis to identify differentially expressed genes (DEGs) between non-severe patients and SD progressors. Using only these DEGs, we trained an XGBoost machine learning model on public data to predict progression to SD. All model parameters were "locked" prior to validation in an independent, prospectively enrolled cohort of 377 dengue patients in Colombia. We measured expression of the DEGs in whole blood samples collected upon presentation, prior to SD progression. We then compared the accuracy of the locked XGBoost model and clinical warning signs in predicting SD.

RESULTS

We identified eight SD-associated DEGs in the public datasets and built an 8-gene XGBoost model that accurately predicted SD progression in the independent validation cohort with 86.4% (95% CI 68.2-100) sensitivity and 79.7% (95% CI 75.5-83.9) specificity. Given the 5.8% proportion of SD cases in this cohort, the 8-gene model had a positive and negative predictive value (PPV and NPV) of 20.9% (95% CI 16.7-25.6) and 99.0% (95% CI 97.7-100.0), respectively. Compared to clinical warning signs at presentation, which had 77.3% (95% CI 58.3-94.1) sensitivity and 39.7% (95% CI 34.7-44.9) specificity, the 8-gene model led to an 80% reduction in the number needed to predict (NNP) from 25.4 to 5.0. Importantly, the 8-gene model accurately predicted subsequent SD in the first three days post-fever onset and up to three days prior to SD progression.
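The predictive values and NNP quoted above can be approximately reproduced from the reported sensitivity, specificity, and 5.8% SD proportion. The following is a minimal sketch, not the authors' code: it assumes PPV and NPV follow from Bayes' rule and that NNP is defined as 1 / (PPV + NPV - 1), a common definition that the abstract itself does not state. The function names are hypothetical, and small differences from the published figures are expected because the inputs are rounded.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


def number_needed_to_predict(ppv, npv):
    """NNP under the assumed definition 1 / (PPV + NPV - 1)."""
    return 1.0 / (ppv + npv - 1.0)


PREVALENCE = 0.058  # proportion of SD cases reported for the validation cohort

# Rounded point estimates taken from the abstract.
model_ppv, model_npv = predictive_values(0.864, 0.797, PREVALENCE)
signs_ppv, signs_npv = predictive_values(0.773, 0.397, PREVALENCE)

model_nnp = number_needed_to_predict(model_ppv, model_npv)
signs_nnp = number_needed_to_predict(signs_ppv, signs_npv)

print(f"8-gene model:   PPV={model_ppv:.3f} NPV={model_npv:.3f} NNP={model_nnp:.1f}")
print(f"warning signs:  PPV={signs_ppv:.3f} NPV={signs_npv:.3f} NNP={signs_nnp:.1f}")
print(f"relative NNP reduction: {1 - model_nnp / signs_nnp:.0%}")
```

Under these assumptions the computed values land close to the reported PPV, NPV, and NNP, and the relative reduction in NNP comes out at roughly 80%.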

CONCLUSIONS

The 8-gene XGBoost model, trained on heterogeneous public datasets, accurately predicted progression to SD in a large, independent, prospective cohort, including during the early febrile stage when SD prediction remains clinically difficult. The model has potential to be translated to a point-of-care prognostic assay to reduce dengue morbidity and mortality without overwhelming limited healthcare resources.

Affiliation(s)

Yiran E. Liu Institute for Immunity, Transplantation and Infection, School of Medicine, Stanford University, Stanford, CA, USA; Cancer Biology Graduate Program, School of Medicine, Stanford University, Stanford, CA, USA; Division of Infectious Diseases and Geographic Medicine, Department of Medicine, School of Medicine, Stanford University, Stanford, CA, USA
Sirle Saul Division of Infectious Diseases and Geographic Medicine, Department of Medicine, School of Medicine, Stanford University, Stanford, CA, USA
Aditya Manohar Rao Institute for Immunity, Transplantation and Infection, School of Medicine, Stanford University, Stanford, CA, USA; Immunology Graduate Program, School of Medicine, Stanford University, Stanford, CA, USA
Makeda Lucretia Robinson Division of Infectious Diseases and Geographic Medicine, Department of Medicine, School of Medicine, Stanford University, Stanford, CA, USA; Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
Olga Lucia Agudelo Rojas Clinical Research Center, Fundación Valle del Lili, Cali, Colombia
Ana Maria Sanz Clinical Research Center, Fundación Valle del Lili, Cali, Colombia
Michelle Verghese Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
Daniel Solis Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
Mamdouh Sibai Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
Chun Hong Huang Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
Malaya Kumar Sahoo Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
Rosa Margarita Gelvez Centro de Atención y Diagnóstico de Enfermedades Infecciosas (CDI), Bucaramanga, Colombia
Nathalia Bueno Centro de Atención y Diagnóstico de Enfermedades Infecciosas (CDI), Bucaramanga, Colombia
Maria Isabel Estupiñan Cardenas Centro de Atención y Diagnóstico de Enfermedades Infecciosas (CDI), Bucaramanga, Colombia
Luis Angel Villar Centeno Centro de Atención y Diagnóstico de Enfermedades Infecciosas (CDI), Bucaramanga, Colombia
Elsa Marina Rojas Garrido Centro de Atención y Diagnóstico de Enfermedades Infecciosas (CDI), Bucaramanga, Colombia
Fernando Rosso Clinical Research Center, Fundación Valle del Lili, Cali, Colombia; Division of Infectious Diseases, Department of Internal Medicine, Fundación Valle del Lili, Cali, Colombia
Michele Donato Institute for Immunity, Transplantation and Infection, School of Medicine, Stanford University, Stanford, CA, USA; Center for Biomedical Informatics Research, Department of Medicine, School of Medicine, Stanford University, Stanford, CA, USA
Benjamin A. Pinsky Division of Infectious Diseases and Geographic Medicine, Department of Medicine, School of Medicine, Stanford University, Stanford, CA, USA; Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
Shirit Einav Division of Infectious Diseases and Geographic Medicine, Department of Medicine, School of Medicine, Stanford University, Stanford, CA, USA; Department of Microbiology and Immunology, School of Medicine, Stanford University, Stanford, CA, USA
Purvesh Khatri Institute for Immunity, Transplantation and Infection, School of Medicine, Stanford University, Stanford, CA, USA; Center for Biomedical Informatics Research, Department of Medicine, School of Medicine, Stanford University, Stanford, CA, USA

Bajo-Morales J, Prieto-Prieto JC, Herrera LJ, Rojas I, Castillo-Secilla D. COVID-19 Biomarkers Recognition & Classification Using Intelligent Systems. Curr Bioinform 2022. [DOI: 10.2174/1574893617666220328125029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]

Abstract

Background

SARS-CoV-2 has paralyzed mankind due to its high transmissibility and its associated mortality, causing millions of infections and deaths worldwide. The search for gene expression biomarkers from the host transcriptional response to infection may help understand the underlying mechanisms by which the virus causes COVID-19. This research proposes a smart methodology integrating different RNA-Seq datasets from SARS-CoV-2, other respiratory diseases, and healthy patients.

Methods

The proposed pipeline exploits the functionality of the ‘KnowSeq’ R/Bioc package, integrating different data sources and attaining a significantly larger gene expression dataset, thus endowing the results with higher statistical significance and robustness in comparison with previous studies in the literature. A detailed preprocessing step was carried out to homogenize the samples and build a clinical decision system for SARS-CoV-2. It uses machine learning techniques such as a feature selection algorithm and a supervised classification system. This clinical decision system uses the most differentially expressed genes among different diseases (including SARS-CoV-2) to develop a four-class classifier.

Results

The multiclass classifier designed can discern SARS-CoV-2 samples, reaching an accuracy equal to 91.5%, a mean F1-Score equal to 88.5%, and a SARS-CoV-2 AUC equal to 94% by using only 15 genes as predictors. A biological interpretation of the gene signature extracted reveals relations with processes involved in viral responses.

Conclusion

This work proposes a COVID-19 gene signature composed of 15 genes, selected after applying the feature selection ‘minimum Redundancy Maximum Relevance’ algorithm. The integration among several RNA-Seq datasets was a success, allowing for a considerably large number of samples and therefore providing greater statistical significance to the results than previous studies. Biological interpretation of the selected genes was also provided.

Su L, Xu C, Zeng S, Su L, Joshi T, Stacey G, Xu D. Large-Scale Integrative Analysis of Soybean Transcriptome Using an Unsupervised Autoencoder Model. FRONTIERS IN PLANT SCIENCE 2022;13:831204. [PMID: 35310659 PMCID: PMC8927983 DOI: 10.3389/fpls.2022.831204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 12/08/2021] [Accepted: 02/09/2022] [Indexed: 06/14/2023]

McCulloch JA, Davar D, Rodrigues RR, Badger JH, Fang JR, Cole AM, Balaji AK, Vetizou M, Prescott SM, Fernandes MR, Costa RGF, Yuan W, Salcedo R, Bahadiroglu E, Roy S, DeBlasio RN, Morrison RM, Chauvin JM, Ding Q, Zidi B, Lowin A, Chakka S, Gao W, Pagliano O, Ernst SJ, Rose A, Newman NK, Morgun A, Zarour HM, Trinchieri G, Dzutsev AK. Intestinal microbiota signatures of clinical response and immune-related adverse events in melanoma patients treated with anti-PD-1. Nat Med 2022;28:545-556. [PMID: 35228752 PMCID: PMC10246505 DOI: 10.1038/s41591-022-01698-2] [Citation(s) in RCA: 192] [Impact Index Per Article: 96.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/26/2021] [Accepted: 01/13/2022] [Indexed: 12/12/2022]

Affiliation(s)

John A McCulloch Genetics and Microbiome Core, Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
Diwakar Davar Department of Medicine and UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
Richard R Rodrigues Genetics and Microbiome Core, Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA; Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
Jonathan H Badger Genetics and Microbiome Core, Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
Jennifer R Fang Cancer Immunobiology Section, Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
Alicia M Cole Cancer Immunobiology Section, Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
Ascharya K Balaji Cancer Immunobiology Section, Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
Marie Vetizou Cancer Immunobiology Section, Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
Stephanie M Prescott Cancer Immunobiology Section, Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
Miriam R Fernandes Cancer Immunobiology Section, Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
Raquel G F Costa Cancer Immunobiology Section, Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
Wuxing Yuan Genetics and Microbiome Core, Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA; Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
Rosalba Salcedo Cancer Immunobiology Section, Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
Erol Bahadiroglu Cancer Immunobiology Section, Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
Soumen Roy Cancer Immunobiology Section, Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
Richelle N DeBlasio Department of Medicine and UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
Robert M Morrison Department of Medicine and UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
Joe-Marc Chauvin Department of Medicine and UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
Quanquan Ding Department of Medicine and UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
Bochra Zidi Department of Medicine and UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
Ava Lowin Department of Medicine and UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
Saranya Chakka Department of Medicine and UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
Wentao Gao Department of Medicine and UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
Ornella Pagliano Department of Medicine and UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
Scarlett J Ernst Department of Medicine and UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
Amy Rose Department of Medicine and UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
Nolan K Newman College of Pharmacy, Oregon State University, Corvallis, OR, USA
Andrey Morgun College of Pharmacy, Oregon State University, Corvallis, OR, USA
Hassane M Zarour Department of Medicine and UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA; Department of Immunology, University of Pittsburgh, Pittsburgh, PA, USA
Giorgio Trinchieri Cancer Immunobiology Section, Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
Amiran K Dzutsev Cancer Immunobiology Section, Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA

Bhavra K, Wilde M, Richardson M, Cordell R, Thomas CLP, Zhao B, Bryant L, Brightling CE, Ibrahim W, Salman D, Siddiqui S, Monks P, Gaillard E. The utility of a standardised breath sampler in school age children within a real-world prospective study. J Breath Res 2022;16. [PMID: 35168217 DOI: 10.1088/1752-7163/ac5526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2021] [Accepted: 02/15/2022] [Indexed: 11/12/2022]

Abstract

Clinical assessment of paediatric asthmatics is problematic, and non-invasive biomarkers are needed urgently. Monitoring exhaled volatile organic compounds (VOCs) is an attractive alternative to invasive tests (blood and sputum), and may be used as frequently as required. Standardised reproducible breath-sampling is essential for exhaled-VOC analysis, and although the ReCIVA (Owlstone Medical Limited) breath-sampler was designed to satisfy this requirement, paediatric use was not in the original design brief. The efficacy of the ReCIVA for sampling paediatric-breath has been studied, and 90 breath-samples from 64 children (5-15 years) with and without asthma (controls) were collected with two different ReCIVA units. Seventy samples (77.8%) contained the specified 1L of sampled-breath. Median sampling times were longer in children with acute asthma (770.2 s, range: 532.2-900.1 s) compared to stable asthma (690.6 s, range: 477.5-900.1 s; p=0.01). The ReCIVA successfully detected operational faults in 21 samples. A leak caused by a poor fit of the face mask seal was the most common (15); the others were USB communication-faults (5) and a single instance of a file-creation error. Paediatric breath-profiles were reliably monitored; however, synchronisation of sampling to breathing-phases was sometimes lost, causing some breaths not to be sampled, and some to be sampled continuously. This occurred in 60 (66.7%) of the samples and was a source of variability. Three samples were lost from a combination of factors. Importantly, however, multi-variate modelling of untargeted VOC analysis indicated the absence of significant batch effects for 8 operational variables. The ReCIVA appears suitable for paediatric breath-sampling. Post-processing of breath-sample meta-data is recommended to assess the quality of sample-acquisition. Further, future studies should explore the effect of pump-synchronisation faults on recovered VOC profiles; mask sizes to fit all ages would reduce the potential for leaks and, importantly, provide higher levels of comfort to children with asthma.

Affiliation(s)

Kirandeep Bhavra Department of Respiratory Sciences, Leicester Royal Infirmary, NIHR Leicester Biomedical Research Centre (Respiratory theme), PO Box 65, Robert Kilpatrick Clinical Sciences Building, Leicester, LE2 7LX, UNITED KINGDOM OF GREAT BRITAIN AND NORTHERN IRELAND
Michael Wilde University of Leicester, Department of Chemistry, Leicester, Leicestershire, LE1 7RH, UNITED KINGDOM OF GREAT BRITAIN AND NORTHERN IRELAND
Matthew Richardson Loughborough University School of Science, Department of Chemistry, Loughborough, Leicestershire, LE11 3TU, UNITED KINGDOM OF GREAT BRITAIN AND NORTHERN IRELAND
Rebecca Cordell University of Leicester Department of Chemistry, University of Leicester, Leicester, Leicester, LE1 7RH, UNITED KINGDOM OF GREAT BRITAIN AND NORTHERN IRELAND
C L Paul Thomas University of Leicester Department of Respiratory Sciences, NIHR Leicester Biomedical Research Centre (Respiratory theme), Glenfield Hospital, Groby Road, Leicester, East Midlands, LE3 9QP, UNITED KINGDOM OF GREAT BRITAIN AND NORTHERN IRELAND
Bo Zhao University of Leicester College of Life Sciences, Leicester NIHR Biomedical Research Centre (Respiratory theme), Glenfield Hospital, Groby Road, Leicester, Leicester, LE3 9QP, UNITED KINGDOM OF GREAT BRITAIN AND NORTHERN IRELAND
Luke Bryant University of Leicester Department of Chemistry, University of Leicester, University Road, Leicester, Leicester, LE1 7RH, UNITED KINGDOM OF GREAT BRITAIN AND NORTHERN IRELAND
Christopher E Brightling Loughborough University School of Science, Department of Chemistry, Loughborough, Leicestershire, LE11 3TU, UNITED KINGDOM OF GREAT BRITAIN AND NORTHERN IRELAND
Wadah Ibrahim Loughborough University School of Science, Department of Chemistry, Loughborough, Leicestershire, LE11 3TU, UNITED KINGDOM OF GREAT BRITAIN AND NORTHERN IRELAND
Dahlia Salman University of Leicester Department of Respiratory Sciences, NIHR Leicester Biomedical Research Centre (Respiratory theme), Glenfield Hospital, Groby Road, Leicester, East Midlands, LE3 9QP, UNITED KINGDOM OF GREAT BRITAIN AND NORTHERN IRELAND
Salman Siddiqui Loughborough University School of Science, Department of Chemistry, Loughborough, Leicestershire, LE11 3TU, UNITED KINGDOM OF GREAT BRITAIN AND NORTHERN IRELAND
Paul Monks University of Leicester, Department of Chemistry, Leicester, Leicestershire, LE1 7RH, UNITED KINGDOM OF GREAT BRITAIN AND NORTHERN IRELAND
Erol Gaillard Department of Respiratory Sciences, University of Leicester, College of Life Sciences, Leicester, Leicestershire, LE1 7RH, UNITED KINGDOM OF GREAT BRITAIN AND NORTHERN IRELAND

Noble AJ, Purcell RV, Adams AT, Lam YK, Ring PM, Anderson JR, Osborne AJ. A Final Frontier in Environment-Genome Interactions? Integrated, Multi-Omic Approaches to Predictions of Non-Communicable Disease Risk. Front Genet 2022;13:831866. [PMID: 35211161 PMCID: PMC8861380 DOI: 10.3389/fgene.2022.831866] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2021] [Accepted: 01/19/2022] [Indexed: 12/26/2022] Open

Wang X, Wang J, Zhang H, Huang S, Yin Y. HDMC: a novel deep learning-based framework for removing batch effects in single-cell RNA-seq data. Bioinformatics 2022;38:1295-1303. [PMID: 34864918 DOI: 10.1093/bioinformatics/btab821] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2021] [Revised: 11/25/2021] [Accepted: 11/30/2021] [Indexed: 01/05/2023] Open

Abstract

MOTIVATION

With the development of single-cell RNA sequencing (scRNA-seq) techniques, a growing number of large-scale gene expression datasets have become available. However, to analyze datasets produced by different experiments, batch effects among the datasets must be considered. Although several methods have recently been published to remove batch effects in scRNA-seq data, two problems remain challenging and not completely solved: (i) how to reduce the distribution differences between batches more accurately; and (ii) how to align samples from different batches to recover the cell type clusters.

RESULTS

We propose a novel deep-learning approach, a hierarchical distribution-matching framework assisted by contrastive learning, to address these two problems. First, we design a hierarchical framework for distribution matching based on a deep autoencoder. This framework employs an adversarial training strategy to match the global distribution of different batches. This provides an improved foundation for further matching the local distributions with a maximum mean discrepancy-based loss. For local matching, we divide the cells in each batch into clusters and develop a contrastive learning mechanism to simultaneously align similar cluster pairs and keep noisy pairs apart from each other. This makes it possible to obtain clusters in which all cells are of the same type (true positives) and to avoid clusters with cells of different types (false positives). We demonstrate the effectiveness of our method on both simulated and real datasets. Results show that our new method significantly outperforms the state-of-the-art methods and has the ability to prevent overcorrection.

AVAILABILITY AND IMPLEMENTATION

The Python code to generate the results and figures in this article is available at https://github.com/zhanglabNKU/HDMC; the data underlying this article are also available in this GitHub repository.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Bajo-Morales J, Galvez JM, Prieto-Prieto JC, Herrera LJ, Rojas I, Castillo-Secilla D. Heterogeneous Gene Expression Cross-Evaluation of Robust Biomarkers Using Machine Learning Techniques Applied to Lung Cancer. Curr Bioinform 2022. [DOI: 10.2174/1574893616666211005114934] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022]

Abstract Background: Nowadays, gene expression analysis is one of the most promising pillars for understanding and uncovering the mechanisms underlying the development and spread of cancer. In this sense, Next Generation Sequencing technologies, such as RNA-Seq, are currently leading the market due to their precision and cost. Nevertheless, there is still an enormous amount of non-analyzed data obtained from older technologies, such as Microarray, which could still be useful to extract relevant knowledge. Methods: Throughout this research, a complete machine learning methodology to cross-evaluate the compatibility between both RNA-Seq and Microarray sequencing technologies is described and implemented. In order to show a real application of the designed pipeline, a lung cancer case study is addressed by considering two detected subtypes: adenocarcinoma and squamous cell carcinoma. Transcriptomic datasets considered for our study have been obtained from the public repositories NCBI/GEO, ArrayExpress and GDC-Portal. From them, several gene experiments have been carried out with the aim of finding gene signatures for these lung cancer subtypes, linked to both transcriptomic technologies. With these DEGs selected, intelligent predictive models capable of classifying new samples belonging to these cancer subtypes have been developed. Results: The predictive models built using one technology are capable of discerning samples from a different technology. The classification results are evaluated in terms of accuracy, F1-score and ROC curves along with AUC. Finally, the biological information of the gene sets obtained and their relationship with lung cancer are reviewed, finding strong biological evidence linking them to the disease. Conclusion: Our method has the capability of finding strong gene signatures which are also independent of the transcriptomic technology used to develop the analysis. In addition, our article highlights the potential of using heterogeneous transcriptomic data to increase the number of samples for the studies, increasing the statistical significance of the results.

Zhang X, Ye Z, Chen J, Qiao F. AMDBNorm: an approach based on distribution adjustment to eliminate batch effects of gene expression data. Brief Bioinform 2021;23:6485011. [PMID: 34958674 DOI: 10.1093/bib/bbab528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2021] [Revised: 10/16/2021] [Accepted: 11/14/2021] [Indexed: 11/14/2022] Open