1
Fu Y, Yuan ZF, Wu L, Peng J, Wang X, High AA. Addressing Sample Mix-Ups: Tools and Approaches for Large-Scale Multi-Omics Studies. Proteomics 2025; 25:e202400271. [PMID: 39659081 DOI: 10.1002/pmic.202400271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2024] [Revised: 11/25/2024] [Accepted: 11/26/2024] [Indexed: 12/12/2024]
Abstract
Advances in high-throughput omics technologies have enabled system-wide characterization of biological samples across multiple molecular levels, such as the genome, transcriptome, and proteome. However, as sample sizes rapidly increase in large-scale multi-omics studies, sample mix-ups have become a prevalent issue, compromising data integrity and leading to erroneous conclusions. The interconnected nature of multi-omics data presents an opportunity to identify and correct these errors. This review examines the potential sources of sample mix-ups and evaluates the methodologies and tools developed for detecting and correcting these errors, with an emphasis on approaches applicable to proteomics data. We categorize existing tools into three main groups: expression/protein quantitative trait loci-based, genotype concordance-based, and gene/protein expression correlation-based approaches. Notably, only a handful of tools currently utilize the proteogenomics approach for correcting sample mix-ups at the proteomics level. Integrating the strengths of current tools across diverse data types could enable the development of more versatile and comprehensive solutions. In conclusion, verifying sample identity is a critical first step to reduce bias and increase precision in subsequent analyses for large-scale multi-omics studies. By leveraging these tools for identifying and correcting sample mix-ups, researchers can significantly improve the reliability and reproducibility of biomedical research.
Affiliation(s)
- Yingxue Fu
- Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Zuo-Fei Yuan
- Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Long Wu
- Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Junmin Peng
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Xusheng Wang
- Department of Neurology, University of Tennessee Health Science Center, Memphis, Tennessee, USA
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee, USA
- Anthony A High
- Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
2
Liu Q, Zhang J, Guo C, Wang M, Wang C, Yan Y, Sun L, Wang D, Zhang L, Yu H, Hou L, Wu C, Zhu Y, Jiang G, Zhu H, Zhou Y, Fang S, Zhang T, Hu L, Li J, Liu Y, Zhang H, Zhang B, Ding L, Robles AI, Rodriguez H, Gao D, Ji H, Zhou H, Zhang P. Proteogenomic characterization of small cell lung cancer identifies biological insights and subtype-specific therapeutic strategies. Cell 2024; 187:184-203.e28. [PMID: 38181741 DOI: 10.1016/j.cell.2023.12.004] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 12/19/2022] [Revised: 09/25/2023] [Accepted: 12/01/2023] [Indexed: 01/07/2024]
Abstract
We performed comprehensive proteogenomic characterization of small cell lung cancer (SCLC) using paired tumors and adjacent lung tissues from 112 treatment-naive patients who underwent surgical resection. Integrated multi-omics analysis illustrated cancer biology downstream of genetic aberrations and highlighted oncogenic roles of FAT1 mutation, RB1 deletion, and chromosome 5q loss. Two prognostic biomarkers, HMGB3 and CASP10, were identified. Overexpression of HMGB3 promoted SCLC cell migration via transcriptional regulation of cell junction-related genes. Immune landscape characterization revealed an association between ZFHX3 mutation and high immune infiltration and underscored a potential immunosuppressive role of elevated DNA damage response activity via inhibition of the cGAS-STING pathway. Multi-omics clustering identified four subtypes with subtype-specific therapeutic vulnerabilities. Cell line and patient-derived xenograft-based drug tests validated the specific therapeutic responses predicted by multi-omics subtyping. This study provides a valuable resource as well as insights to better understand SCLC biology and improve clinical practice.
Affiliation(s)
- Qian Liu
- Department of Thoracic Surgery, Shanghai Pulmonary Hospital, School of Medicine, Tongji University, Shanghai 200433, China; Department of Analytical Chemistry, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Jing Zhang
- Department of Thoracic Surgery, Shanghai Pulmonary Hospital, School of Medicine, Tongji University, Shanghai 200433, China
- Chenchen Guo
- State Key Laboratory of Cell Biology, Shanghai Institute of Biochemistry and Cell Biology, CAS Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
- Mengcheng Wang
- State Key Laboratory of Cell Biology, Shanghai Institute of Biochemistry and Cell Biology, CAS Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Chenfei Wang
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration of Ministry of Education, Department of Orthopedics, Tongji Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China; Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Yilv Yan
- Department of Thoracic Surgery, Shanghai Pulmonary Hospital, School of Medicine, Tongji University, Shanghai 200433, China
- Liangdong Sun
- Department of Thoracic Surgery, Shanghai Pulmonary Hospital, School of Medicine, Tongji University, Shanghai 200433, China
- Di Wang
- Department of Thoracic Surgery, Shanghai Pulmonary Hospital, School of Medicine, Tongji University, Shanghai 200433, China
- Lele Zhang
- Central Laboratory, Shanghai Pulmonary Hospital, School of Medicine, Tongji University, Shanghai 200433, China
- Huansha Yu
- Department of Thoracic Surgery, Shanghai Pulmonary Hospital, School of Medicine, Tongji University, Shanghai 200433, China
- Likun Hou
- Department of Pathology, Shanghai Pulmonary Hospital, School of Medicine, Tongji University, Shanghai 200433, China
- Chunyan Wu
- Department of Pathology, Shanghai Pulmonary Hospital, School of Medicine, Tongji University, Shanghai 200433, China
- Yuming Zhu
- Department of Thoracic Surgery, Shanghai Pulmonary Hospital, School of Medicine, Tongji University, Shanghai 200433, China
- Gening Jiang
- Department of Thoracic Surgery, Shanghai Pulmonary Hospital, School of Medicine, Tongji University, Shanghai 200433, China
- Hongwen Zhu
- Department of Analytical Chemistry, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Yanting Zhou
- Department of Analytical Chemistry, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Shanhua Fang
- Department of Analytical Chemistry, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Tengfei Zhang
- State Key Laboratory of Cell Biology, Shanghai Institute of Biochemistry and Cell Biology, CAS Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Liang Hu
- State Key Laboratory of Cell Biology, Shanghai Institute of Biochemistry and Cell Biology, CAS Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
- Junqiang Li
- D1 Medical Technology, Shanghai 201800, China
- Yansheng Liu
- Cancer Biology Institute, Yale University School of Medicine, West Haven, CT 06516, USA
- Hui Zhang
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Bing Zhang
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
- Li Ding
- Department of Medicine, McDonnell Genome Institute, Washington University, St. Louis, MO 63108, USA
- Ana I Robles
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, National Institutes of Health, Rockville, MD 20850, USA
- Henry Rodriguez
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, National Institutes of Health, Rockville, MD 20850, USA
- Daming Gao
- State Key Laboratory of Cell Biology, Shanghai Institute of Biochemistry and Cell Biology, CAS Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China; University of Chinese Academy of Sciences, Beijing 100049, China; Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China.
- Hongbin Ji
- State Key Laboratory of Cell Biology, Shanghai Institute of Biochemistry and Cell Biology, CAS Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China; University of Chinese Academy of Sciences, Beijing 100049, China; Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China; School of Life Science and Technology, Shanghai Tech University, Shanghai 200120, China.
- Hu Zhou
- Department of Analytical Chemistry, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; University of Chinese Academy of Sciences, Beijing 100049, China; School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China.
- Peng Zhang
- Department of Thoracic Surgery, Shanghai Pulmonary Hospital, School of Medicine, Tongji University, Shanghai 200433, China.
3
Ketteler A, Blumenthal DB. Demographic confounders distort inference of gene regulatory and gene co-expression networks in cancer. Brief Bioinform 2023; 24:bbad413. [PMID: 37985453 DOI: 10.1093/bib/bbad413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Revised: 09/19/2023] [Accepted: 10/26/2023] [Indexed: 11/22/2023]
Abstract
Gene regulatory networks (GRNs) and gene co-expression networks (GCNs) allow genome-wide exploration of molecular regulation patterns in health and disease. The standard approach for obtaining GRNs and GCNs is to infer them from gene expression data, using computational network inference methods. However, since network inference methods are usually applied to aggregate data, distortion of the networks by demographic confounders might remain undetected, especially because gene expression patterns are known to vary between different demographic groups. In this paper, we present a computational framework to systematically evaluate the influence of demographic confounders on network inference from gene expression data. Our framework compares similarities between networks inferred for different demographic groups with similarity distributions obtained for random splits of the expression data. Moreover, it allows quantifying the extent to which demographic groups are represented by networks inferred from the aggregate data in a confounder-agnostic way. We apply our framework to test four widely used GRN and GCN inference methods as to their robustness with respect to confounding by age, ethnicity and sex in cancer. Our findings based on more than 44,000 inferred networks indicate that age and sex confounders play an important role in network inference for certain cancer types, emphasizing the importance of incorporating an assessment of the effect of demographic confounders into network inference workflows. Our framework is available as a Python package on GitHub: https://github.com/bionetslab/grn-confounders.
Affiliation(s)
- Anna Ketteler
- Biomedical Network Science Lab, Department Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- David B Blumenthal
- Biomedical Network Science Lab, Department Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
4
Coleman C, Wang M, Wang E, Micallef C, Shao Z, Vicari JM, Li Y, Yu K, Cai D, Peng J, Haroutunian V, Fullard JF, Bendl J, Zhang B, Roussos P. Multi-omic atlas of the parahippocampal gyrus in Alzheimer's disease. Sci Data 2023; 10:602. [PMID: 37684260 PMCID: PMC10491684 DOI: 10.1038/s41597-023-02507-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Accepted: 08/29/2023] [Indexed: 09/10/2023]
Abstract
Alzheimer's disease (AD) is the most common form of dementia worldwide, with a projection of 151 million cases by 2050. Previous genetic studies have identified three main genes associated with early-onset familial Alzheimer's disease; however, this subtype accounts for less than 5% of total cases. Next-generation sequencing has been well established and holds great promise to assist in the development of novel therapeutics as well as biomarkers to prevent or slow the progression of this devastating disease. Here we present a public resource of functional genomic data from the parahippocampal gyrus of 201 postmortem control, mild cognitively impaired (MCI) and AD individuals from the Mount Sinai brain bank, of which whole-genome sequencing (WGS) and bulk RNA sequencing (RNA-seq) were previously published. The genomic data include bulk proteomics and DNA methylation, as well as cell-type-specific RNA-seq and assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq) data. We have performed extensive preprocessing and quality control, allowing the research community to access and utilize this public resource available on the Synapse platform at https://doi.org/10.7303/syn51180043.2.
Affiliation(s)
- Claire Coleman
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Icahn Institute of Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Minghui Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Erming Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Courtney Micallef
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Icahn Institute of Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Zhiping Shao
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Icahn Institute of Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- James M Vicari
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Icahn Institute of Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Yuxin Li
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Kaiwen Yu
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Dongming Cai
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- James J Peters VA Medical Center, Research & Development, Bronx, NY, 10468, USA
- Alzheimer Disease Research Center, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Junmin Peng
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Vahram Haroutunian
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- James J Peters VA Medical Center, Research & Development, Bronx, NY, 10468, USA
- Nash Family Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- John F Fullard
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Icahn Institute of Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Jaroslav Bendl
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Icahn Institute of Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Bin Zhang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA.
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA.
- Panos Roussos
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA.
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA.
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA.
- Icahn Institute of Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA.
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA.
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA.
- James J Peters VA Medical Center, Research & Development, Bronx, NY, 10468, USA.
5
Chowdhury S, Kennedy JJ, Ivey RG, Murillo OD, Hosseini N, Song X, Petralia F, Calinawan A, Savage SR, Berry AB, Reva B, Ozbek U, Krek A, Ma W, da Veiga Leprevost F, Ji J, Yoo S, Lin C, Voytovich UJ, Huang Y, Lee SH, Bergan L, Lorentzen TD, Mesri M, Rodriguez H, Hoofnagle AN, Herbert ZT, Nesvizhskii AI, Zhang B, Whiteaker JR, Fenyo D, McKerrow W, Wang J, Schürer SC, Stathias V, Chen XS, Barcellos-Hoff MH, Starr TK, Winterhoff BJ, Nelson AC, Mok SC, Kaufmann SH, Drescher C, Cieslik M, Wang P, Birrer MJ, Paulovich AG. Proteogenomic analysis of chemo-refractory high-grade serous ovarian cancer. Cell 2023; 186:3476-3498.e35. [PMID: 37541199 PMCID: PMC10414761 DOI: 10.1016/j.cell.2023.07.004] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 09/30/2022] [Revised: 03/23/2023] [Accepted: 07/05/2023] [Indexed: 08/06/2023]
Abstract
To improve the understanding of chemo-refractory high-grade serous ovarian cancers (HGSOCs), we characterized the proteogenomic landscape of 242 (refractory and sensitive) HGSOCs, representing one discovery and two validation cohorts across two biospecimen types (formalin-fixed paraffin-embedded and frozen). We identified a 64-protein signature that predicts with high specificity a subset of HGSOCs refractory to initial platinum-based therapy and is validated in two independent patient cohorts. We detected a significant association between lack of Ch17 loss of heterozygosity (LOH) and chemo-refractoriness. Based on pathway protein expression, we identified 5 clusters of HGSOC, which were validated across two independent patient cohorts and patient-derived xenograft (PDX) models. These clusters may represent different mechanisms of refractoriness and implicate putative therapeutic vulnerabilities.
Affiliation(s)
- Shrabanti Chowdhury
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Jacob J Kennedy
- Translational Science and Therapeutics Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
- Richard G Ivey
- Translational Science and Therapeutics Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
- Oscar D Murillo
- Translational Science and Therapeutics Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
- Noshad Hosseini
- Department of Computational Medicine and Bioinformatics, Michigan Center for Translational Pathology, University of Michigan School of Medicine, Ann Arbor, MI 48109, USA
- Xiaoyu Song
- Tisch Cancer Institute, Department of Population Health Science and Policy, Institute for Health Care Delivery Science, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Francesca Petralia
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Anna Calinawan
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Sara R Savage
- Lester and Sue Smith Breast Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Boris Reva
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Umut Ozbek
- Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Azra Krek
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Weiping Ma
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Jiayi Ji
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Chenwei Lin
- Translational Science and Therapeutics Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
- Uliana J Voytovich
- Translational Science and Therapeutics Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
- Yajue Huang
- Department of Laboratory Medicine & Pathology, Mayo Clinic, Rochester, MN 55905, USA
- Sun-Hee Lee
- Departments of Oncology and Molecular Pharmacology & Experimental Therapeutics, Mayo Clinic, Rochester, MN 55905, USA
- Lindsay Bergan
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
- Travis D Lorentzen
- Translational Science and Therapeutics Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
- Mehdi Mesri
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, Rockville, MD 20850, USA
- Henry Rodriguez
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, Rockville, MD 20850, USA
- Andrew N Hoofnagle
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, USA
- Zachary T Herbert
- Molecular Biology Core Facilities, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Alexey I Nesvizhskii
- Department of Pathology, Department of Computational Medicine and Bioinformatics, University of Michigan School of Medicine, Ann Arbor, MI 48109, USA
- Bing Zhang
- Lester and Sue Smith Breast Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Jeffrey R Whiteaker
- Translational Science and Therapeutics Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
- David Fenyo
- Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
- Wilson McKerrow
- Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
- Joshua Wang
- Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
- Stephan C Schürer
- Department of Molecular and Cellular Pharmacology, Sylvester Comprehensive Cancer Center, Miller School of Medicine, and Institute for Data Science & Computing, University of Miami, Miami, FL 33136, USA
- Vasileios Stathias
- Department of Molecular and Cellular Pharmacology, Sylvester Comprehensive Cancer Center, Miller School of Medicine, and Institute for Data Science & Computing, University of Miami, Miami, FL 33136, USA
- X Steven Chen
- Department of Public Health Sciences, Sylvester Comprehensive Cancer Center, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
- Mary Helen Barcellos-Hoff
- Helen Diller Family Comprehensive Cancer Center, Department of Radiation Oncology, University of California, San Francisco, San Francisco, CA 94115, USA
- Timothy K Starr
- Department of Obstetrics, Gynecology and Women's Health, University of Minnesota, Minneapolis, MN 55455, USA
- Boris J Winterhoff
- Department of Obstetrics, Gynecology and Women's Health, University of Minnesota, Minneapolis, MN 55455, USA
- Andrew C Nelson
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA
- Samuel C Mok
- Department of Gynecologic Oncology and Reproductive Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Scott H Kaufmann
- Departments of Oncology and Molecular Pharmacology & Experimental Therapeutics, Mayo Clinic, Rochester, MN 55905, USA
- Charles Drescher
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
- Marcin Cieslik
- Department of Pathology, Department of Computational Medicine and Bioinformatics, Michigan Center for Translational Pathology, University of Michigan School of Medicine, Ann Arbor, MI 48109, USA.
- Pei Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
- Michael J Birrer
- Winthrop P. Rockefeller Cancer Institute, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA.
- Amanda G Paulovich
- Translational Science and Therapeutics Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA.
6
Yoo S, Sinha A, Yang D, Altorki NK, Tandon R, Wang W, Chavez D, Lee E, Patel AS, Sato T, Kong R, Ding B, Schadt EE, Watanabe H, Massion PP, Borczuk AC, Zhu J, Powell CA. Integrative network analysis of early-stage lung adenocarcinoma identifies aurora kinase inhibition as interceptor of invasion and progression. Nat Commun 2022; 13:1592. [PMID: 35332150 PMCID: PMC8948234 DOI: 10.1038/s41467-022-29230-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 03/31/2021] [Accepted: 03/01/2022] [Indexed: 12/15/2022]
Abstract
Here we focus on the molecular characterization of clinically significant histological subtypes of early-stage lung adenocarcinoma (esLUAD), which is the most common histological subtype of lung cancer. Within lung adenocarcinoma, histology is heterogeneous and associated with tumor invasion and diverse clinical outcomes. We present a gene signature distinguishing invasive and non-invasive tumors among esLUAD. Using the gene signature, we estimate an Invasiveness Score that is strongly associated with survival of esLUAD patients in multiple independent cohorts and with the invasiveness phenotype in lung cancer cell lines. Regulatory network analysis identifies aurora kinase as one of the master regulators of the gene signature, and the perturbation of aurora kinases in vitro and in a murine model of invasive lung adenocarcinoma reduces tumor invasion. Our study reveals aurora kinases as a therapeutic target for treatment of early-stage invasive lung adenocarcinoma.
Affiliation(s)
- Seungyeul Yoo
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, New York, NY, USA
- Sema4, Stamford, CT, USA
- Abhilasha Sinha
- Division of Pulmonary, Critical Care and Sleep Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Dawei Yang
- Division of Pulmonary, Critical Care and Sleep Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Pulmonary and Critical Care Medicine, Zhongshan Hospital, Fudan University, Shanghai, China
| | - Nasser K Altorki
- Department of Cardiothoracic Surgery, Weill Cornell Medicine-New York Presbyterian Hospital, New York, NY, USA
| | - Radhika Tandon
- School of Medicine, St. George's University, West Indies, Grenada
| | - Wenhui Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, New York, NY, USA
| | - Deebly Chavez
- Division of Pulmonary, Critical Care and Sleep Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Eunjee Lee
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, New York, NY, USA
- Sema4, Stamford, CT, USA
| | - Ayushi S Patel
- Division of Pulmonary, Critical Care and Sleep Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Vilcek Institute of Graduate Biomedical Sciences, New York University School of Medicine, New York, NY, USA
| | - Takashi Sato
- Division of Pulmonary, Critical Care and Sleep Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Division of Pulmonary Medicine, Department of Medicine, Keio University School of Medicine, Tokyo, Japan
- Department of Respiratory Medicine, Kitasato University School of Medicine, Sagamihara, Japan
| | - Ranran Kong
- Division of Pulmonary, Critical Care and Sleep Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Thoracic Surgery, The Second Affiliated Hospital of Medical School, Xi'an Jiaotong University, Xi'an, Shaanxi, China
| | - Bisen Ding
- Division of Pulmonary, Critical Care and Sleep Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Key Laboratory of Birth Defects and Related Diseases of Women and Children of MOE, State Key Laboratory of Biotherapy, West China Second University Hospital, Sichuan University, Chengdu, Sichuan, China
| | - Eric E Schadt
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, New York, NY, USA
- Sema4, Stamford, CT, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Hideo Watanabe
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, New York, NY, USA
- Division of Pulmonary, Critical Care and Sleep Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Pierre P Massion
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Alain C Borczuk
- Department of Pathology, Weill Cornell Medicine, New York, NY, USA
| | - Jun Zhu
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, New York, NY, USA
- Sema4, Stamford, CT, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Charles A Powell
- Division of Pulmonary, Critical Care and Sleep Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
|
7
|
Li L, Niu M, Erickson A, Luo J, Rowbotham K, Guo K, Huang H, Li Y, Jiang Y, Hur J, Liu C, Peng J, Wang X. SMAP is a pipeline for sample matching in proteogenomics. Nat Commun 2022; 13:744. [PMID: 35136070 PMCID: PMC8825821 DOI: 10.1038/s41467-022-28411-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2021] [Accepted: 01/17/2022] [Indexed: 11/12/2022]
Abstract
The integration of genomics and proteomics data (proteogenomics) holds the promise of furthering the in-depth understanding of human disease. However, sample mix-up is a pervasive problem in proteogenomics because of the complexity of sample processing. Here, we present a pipeline for Sample Matching in Proteogenomics (SMAP) to verify sample identity and ensure data integrity. SMAP infers sample-dependent protein-coding variants from quantitative mass spectrometry (MS), and aligns the MS-based proteomic samples with genomic samples by two discriminant scores. Theoretical analysis with simulated data indicates that SMAP is capable of uniquely matching proteomic and genomic samples when ≥20% genotypes of individual samples are available. When SMAP was applied to a large-scale dataset generated by the PsychENCODE BrainGVEX project, 54 samples (19%) were corrected. The correction was further confirmed by ribosome profiling and chromatin sequencing (ATAC-seq) data from the same set of samples. Our results demonstrate that SMAP is an effective tool for sample verification in a large-scale MS-based proteogenomics study. SMAP is publicly available at https://github.com/UND-Wanglab/SMAP, and a web-based version can be accessed at https://smap.shinyapps.io/smap/.
Affiliation(s)
- Ling Li
- Department of Biology, University of North Dakota, Grand Forks, ND, 58202, USA
| | - Mingming Niu
- Departments of Structural Biology and Developmental Neurobiology, Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
| | - Alyssa Erickson
- Department of Biology, University of North Dakota, Grand Forks, ND, 58202, USA
| | - Jie Luo
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Zhejiang Academy of Agricultural Sciences, Hangzhou, 310021, China
| | - Kincaid Rowbotham
- Department of Biology, University of North Dakota, Grand Forks, ND, 58202, USA
| | - Kai Guo
- Department of Neurology, University of Michigan, Ann Arbor, MI, 48109, USA
| | - He Huang
- Department of Biology, University of North Dakota, Grand Forks, ND, 58202, USA
| | - Yuxin Li
- Departments of Structural Biology and Developmental Neurobiology, Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
| | - Yi Jiang
- Department of Epidemiology and Biostatistics, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China
| | - Junguk Hur
- Department of Biomedical Sciences, School of Medicine and Health Sciences, University of North Dakota, Grand Forks, ND, 58202, USA
| | - Chunyu Liu
- Department of Psychiatry, SUNY Upstate Medical University, Syracuse, NY, 13210, USA
| | - Junmin Peng
- Departments of Structural Biology and Developmental Neurobiology, Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
| | - Xusheng Wang
- Department of Biology, University of North Dakota, Grand Forks, ND, 58202, USA
|
8
|
John Cremin C, Dash S, Huang X. Big Data: Historic Advances and Emerging Trends in Biomedical Research. Current Research in Biotechnology 2022. [DOI: 10.1016/j.crbiot.2022.02.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
9
|
Martínez-García M, Hernández-Lemus E. Data Integration Challenges for Machine Learning in Precision Medicine. Front Med (Lausanne) 2022; 8:784455. [PMID: 35145977 PMCID: PMC8821900 DOI: 10.3389/fmed.2021.784455] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 09/27/2021] [Accepted: 12/28/2021] [Indexed: 12/19/2022]
Abstract
A main goal of Precision Medicine is that of incorporating and integrating the vast corpora on different databases about the molecular and environmental origins of disease, into analytic frameworks, allowing the development of individualized, context-dependent diagnostics, and therapeutic approaches. In this regard, artificial intelligence and machine learning approaches can be used to build analytical models of complex disease aimed at prediction of personalized health conditions and outcomes. Such models must handle the wide heterogeneity of individuals in both their genetic predisposition and their social and environmental determinants. Computational approaches to medicine need to be able to efficiently manage, visualize and integrate large datasets combining structured and unstructured formats. This needs to be done while constrained by different levels of confidentiality, ideally doing so within a unified analytical architecture. Efficient data integration and management is key to the successful application of computational intelligence approaches to medicine. A number of challenges arise in the design of successful approaches to medical data analytics under currently demanding conditions of performance in personalized medicine, while also subject to time, computational power, and bioethical constraints. Here, we will review some of these constraints and discuss possible avenues to overcome current challenges.
Affiliation(s)
- Mireya Martínez-García
- Clinical Research Division, National Institute of Cardiology ‘Ignacio Chávez’, Mexico City, Mexico
| | - Enrique Hernández-Lemus
- Computational Genomics Division, National Institute of Genomic Medicine (INMEGEN), Mexico City, Mexico
- Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
|
10
|
Smithmyer ME, Wiedeman AE, Skibinski DAG, Savage AK, Acosta-Vega C, Scheiding S, Gersuk VH, O'Rourke C, Long SA, Buckner JH, Speake C. A simple strategy for sample annotation error detection in cytometry datasets. Cytometry A 2021; 101:351-360. [PMID: 34967113 DOI: 10.1002/cyto.a.24525] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/02/2021] [Revised: 11/19/2021] [Accepted: 12/15/2021] [Indexed: 11/05/2022]
Abstract
Mislabeling samples or data with the wrong participant information can affect study integrity and lead investigators to draw inaccurate conclusions. Quality control to prevent these types of errors is commonly embedded into the analysis of genomic datasets, but a similar identification strategy is not standard for cytometric data. Here, we present a method for detecting sample identification errors in cytometric data using expression of human leukocyte antigen (HLA) class I alleles. We measured HLA-A*02 and HLA-B*07 expression in three longitudinal samples from 41 participants using a 33-marker CyTOF panel designed to identify major immune cell types. 3/123 samples (2.4%) showed HLA allele expression that did not match their longitudinal pairs. Furthermore, these same three samples' cytometric signature did not match qPCR HLA class I allele data, suggesting that they were accurately identified as mismatches. We conclude that this technique is useful for detecting sample-labeling errors in cytometric analyses of longitudinal data. This technique could also be used in conjunction with another method, like GWAS or PCR, to detect errors in cross-sectional data. We suggest widespread adoption of this or similar techniques will improve the quality of clinical studies that utilize cytometry.
Affiliation(s)
- Megan E Smithmyer
- Center for Interventional Immunology, Benaroya Research Institute, Seattle, Washington, USA
| | - Alice E Wiedeman
- Center for Translational Immunology, Benaroya Research Institute, Seattle, Washington, USA
| | - David A G Skibinski
- Center for Interventional Immunology, Benaroya Research Institute, Seattle, Washington, USA; Nexelis, 645 Elliot Avenue West, Suite 300, Seattle, Washington, USA
| | - Adam K Savage
- Allen Institute for Immunology, Seattle, Washington, USA
| | - Carolina Acosta-Vega
- Center for Translational Immunology, Benaroya Research Institute, Seattle, Washington, USA
| | - Sheila Scheiding
- Center for Translational Immunology, Benaroya Research Institute, Seattle, Washington, USA
| | - Vivian H Gersuk
- Center for Systems Immunology, Benaroya Research Institute, Seattle, Washington, USA
| | - Colin O'Rourke
- Center for Interventional Immunology, Benaroya Research Institute, Seattle, Washington, USA
| | - S Alice Long
- Center for Translational Immunology, Benaroya Research Institute, Seattle, Washington, USA
| | - Jane H Buckner
- Center for Translational Immunology, Benaroya Research Institute, Seattle, Washington, USA
| | - Cate Speake
- Center for Interventional Immunology, Benaroya Research Institute, Seattle, Washington, USA
|
11
|
Westphal MS, Lee E, Schadt EE, Sholler GS, Zhu J. Identification of Let-7 miRNA Activity as a Prognostic Biomarker of SHH Medulloblastoma. Cancers (Basel) 2021; 14:139. [PMID: 35008302 PMCID: PMC8750188 DOI: 10.3390/cancers14010139] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/03/2021] [Revised: 12/22/2021] [Accepted: 12/23/2021] [Indexed: 11/16/2022]
Abstract
Medulloblastoma (MB) is the most common pediatric embryonal brain tumor. The current consensus classifies MB into four molecular subgroups: sonic hedgehog-activated (SHH), wingless-activated (WNT), Group 3, and Group 4. MYCN and let-7 play a critical role in MB. Thus, we inferred the activity of miRNAs in MB by using the ActMiR procedure. SHH-MB has higher MYCN expression than the other subgroups. We showed that high MYCN expression with high let-7 activity is significantly associated with worse overall survival, and this association was validated in an independent MB dataset. Altogether, our results suggest that let-7 activity and MYCN can further categorize heterogeneous SHH tumors into more and less-favorable prognostic subtypes, which provide critical information for personalizing treatment options for SHH-MB. Comparing the expression differences between the two SHH-MB prognostic subtypes with compound perturbation profiles, we identified FGFR inhibitors as one potential treatment option for SHH-MB patients with the less-favorable prognostic subtype.
Affiliation(s)
| | - Eunjee Lee
- Sema4, 333 Ludlow St., Stamford, CT 06902, USA; (M.S.W.); (E.L.); (E.E.S.)
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA
| | - Eric E. Schadt
- Sema4, 333 Ludlow St., Stamford, CT 06902, USA; (M.S.W.); (E.L.); (E.E.S.)
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA
| | - Giselle S. Sholler
- Helen DeVos Children’s Hospital, Grand Rapids, MI 49503, USA
- College of Human Medicine, Michigan State University, Grand Rapids, MI 49503, USA
| | - Jun Zhu
- Sema4, 333 Ludlow St., Stamford, CT 06902, USA; (M.S.W.); (E.L.); (E.E.S.)
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA
- The Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA
- Correspondence:
|
12
|
Gürsoy G, Emani P, Brannon CM, Jolanki OA, Harmanci A, Strattan JS, Cherry JM, Miranker AD, Gerstein M. Data Sanitization to Reduce Private Information Leakage from Functional Genomics. Cell 2021; 183:905-917.e16. [PMID: 33186529 DOI: 10.1016/j.cell.2020.09.036] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 12/26/2019] [Revised: 07/23/2020] [Accepted: 09/11/2020] [Indexed: 12/30/2022]
Abstract
The generation of functional genomics datasets is surging, because they provide insight into gene regulation and organismal phenotypes (e.g., genes upregulated in cancer). The intent behind functional genomics experiments is not necessarily to study genetic variants, yet they pose privacy concerns due to their use of next-generation sequencing. Moreover, there is a great incentive to broadly share raw reads for better statistical power and general research reproducibility. Thus, we need new modes of sharing beyond traditional controlled-access models. Here, we develop a data-sanitization procedure allowing raw functional genomics reads to be shared while minimizing privacy leakage, enabling principled privacy-utility trade-offs. Our protocol works with traditional Illumina-based assays and newer technologies such as 10x single-cell RNA sequencing. It involves quantifying the privacy leakage in reads by statistically linking study participants to known individuals. We carried out these linkages using data from highly accurate reference genomes and more realistic environmental samples.
Affiliation(s)
- Gamze Gürsoy
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Prashant Emani
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Charlotte M Brannon
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Otto A Jolanki
- Stanford University School of Medicine, Department of Genetics, Stanford, CA 94305, USA
| | - Arif Harmanci
- School of Biomedical Informatics, Center for Precision Health, University of Texas Health Sciences Center, Houston, TX 77030, USA
| | - J Seth Strattan
- Stanford University School of Medicine, Department of Genetics, Stanford, CA 94305, USA
| | - J Michael Cherry
- Stanford University School of Medicine, Department of Genetics, Stanford, CA 94305, USA
| | - Andrew D Miranker
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA; Department of Chemical and Environmental Engineering, Yale University, New Haven, CT 06520, USA
| | - Mark Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA; Department of Computer Science, Yale University, New Haven, CT 06520, USA; Department of Statistics and Data Science, Yale University, New Haven, CT 06520, USA
|
13
|
A community effort to identify and correct mislabeled samples in proteogenomic studies. Patterns 2021; 2:100245. [PMID: 34036290 PMCID: PMC8134945 DOI: 10.1016/j.patter.2021.100245] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/23/2020] [Revised: 01/27/2021] [Accepted: 03/31/2021] [Indexed: 01/06/2023]
Abstract
Sample mislabeling or misannotation has been a long-standing problem in scientific research, particularly prevalent in large-scale, multi-omic studies due to the complexity of multi-omic workflows. There exists an urgent need for implementing quality controls to automatically screen for and correct sample mislabels or misannotations in multi-omic studies. Here, we describe a crowdsourced precisionFDA NCI-CPTAC Multi-omics Enabled Sample Mislabeling Correction Challenge, which provides a framework for systematic benchmarking and evaluation of mislabel identification and correction methods for integrative proteogenomic studies. The challenge received a large number of submissions from domestic and international data scientists, with highly variable performance observed across the submitted methods. Post-challenge collaboration between the top-performing teams and the challenge organizers has created an open-source software, COSMO, with demonstrated high accuracy and robustness in mislabeling identification and correction in simulated and real multi-omic datasets.
|
14
|
Wang LB, Karpova A, Gritsenko MA, Kyle JE, Cao S, Li Y, Rykunov D, Colaprico A, Rothstein JH, Hong R, Stathias V, Cornwell M, Petralia F, Wu Y, Reva B, Krug K, Pugliese P, Kawaler E, Olsen LK, Liang WW, Song X, Dou Y, Wendl MC, Caravan W, Liu W, Cui Zhou D, Ji J, Tsai CF, Petyuk VA, Moon J, Ma W, Chu RK, Weitz KK, Moore RJ, Monroe ME, Zhao R, Yang X, Yoo S, Krek A, Demopoulos A, Zhu H, Wyczalkowski MA, McMichael JF, Henderson BL, Lindgren CM, Boekweg H, Lu S, Baral J, Yao L, Stratton KG, Bramer LM, Zink E, Couvillion SP, Bloodsworth KJ, Satpathy S, Sieh W, Boca SM, Schürer S, Chen F, Wiznerowicz M, Ketchum KA, Boja ES, Kinsinger CR, Robles AI, Hiltke T, Thiagarajan M, Nesvizhskii AI, Zhang B, Mani DR, Ceccarelli M, Chen XS, Cottingham SL, Li QK, Kim AH, Fenyö D, Ruggles KV, Rodriguez H, Mesri M, Payne SH, Resnick AC, Wang P, Smith RD, Iavarone A, Chheda MG, Barnholtz-Sloan JS, Rodland KD, Liu T, Ding L. Proteogenomic and metabolomic characterization of human glioblastoma. Cancer Cell 2021; 39:509-528.e20. [PMID: 33577785 PMCID: PMC8044053 DOI: 10.1016/j.ccell.2021.01.006] [Citation(s) in RCA: 353] [Impact Index Per Article: 88.3] [Received: 02/03/2020] [Revised: 06/02/2020] [Accepted: 01/11/2021] [Indexed: 02/07/2023]
Abstract
Glioblastoma (GBM) is the most aggressive nervous system cancer. Understanding its molecular pathogenesis is crucial to improving diagnosis and treatment. Integrated analysis of genomic, proteomic, post-translational modification and metabolomic data on 99 treatment-naive GBMs provides insights to GBM biology. We identify key phosphorylation events (e.g., phosphorylated PTPN11 and PLCG1) as potential switches mediating oncogenic pathway activation, as well as potential targets for EGFR-, TP53-, and RB1-altered tumors. Immune subtypes with distinct immune cell types are discovered using bulk omics methodologies, validated by snRNA-seq, and correlated with specific expression and histone acetylation patterns. Histone H2B acetylation in classical-like and immune-low GBM is driven largely by BRDs, CREBBP, and EP300. Integrated metabolomic and proteomic data identify specific lipid distributions across subtypes and distinct global metabolic changes in IDH-mutated tumors. This work highlights biological relationships that could contribute to stratification of GBM patients for more effective treatment.
Affiliation(s)
- Liang-Bo Wang
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Alla Karpova
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Marina A Gritsenko
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Jennifer E Kyle
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Song Cao
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Yize Li
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Dmitry Rykunov
- Department of Genetics and Genomic Sciences, Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Antonio Colaprico
- Sylvester Comprehensive Cancer Center, University of Miami, FL 33136, USA; Division of Biostatistics, Department of Public Health Science, University of Miami, FL 33136, USA
| | - Joseph H Rothstein
- Department of Genetics and Genomic Sciences, Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Runyu Hong
- Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY 10016, USA; Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA
| | - Vasileios Stathias
- Sylvester Comprehensive Cancer Center, University of Miami, FL 33136, USA; Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, Miami, FL 33136, USA; BD2K-LINCS Data Coordination and Integration Center, Miami, FL 33136, USA
| | - MacIntosh Cornwell
- Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY 10016, USA; Department of Medicine, NYU Grossman School of Medicine, New York, NY 10016, USA
| | - Francesca Petralia
- Department of Genetics and Genomic Sciences, Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Yige Wu
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Boris Reva
- Department of Genetics and Genomic Sciences, Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Karsten Krug
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | - Pietro Pugliese
- Department of Science and Technology, University of Sannio, 82100, Benevento, Italy
| | - Emily Kawaler
- Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY 10016, USA; Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA
| | - Lindsey K Olsen
- Department of Biology, Brigham Young University, Provo, UT 84602, USA
| | - Wen-Wei Liang
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Xiaoyu Song
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Yongchao Dou
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Michael C Wendl
- McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63130, USA; Department of Genetics, Washington University in St. Louis, St. Louis, MO 63130, USA; Department of Mathematics, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Wagma Caravan
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Wenke Liu
- Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY 10016, USA; Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA
| | - Daniel Cui Zhou
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Jiayi Ji
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Chia-Feng Tsai
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Vladislav A Petyuk
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Jamie Moon
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Weiping Ma
- Department of Genetics and Genomic Sciences, Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Rosalie K Chu
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Karl K Weitz
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Ronald J Moore
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Matthew E Monroe
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Rui Zhao
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Xiaolu Yang
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130, USA; Poznań University of Medical Sciences, 61-701 Poznań, Poland
| | - Seungyeul Yoo
- Department of Genetics and Genomic Sciences, Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Azra Krek
- Department of Genetics and Genomic Sciences, Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Alexis Demopoulos
- Department of Neurology, Northwell Health System, Lake Success, NY 11042 USA
| | - Houxiang Zhu
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Matthew A Wyczalkowski
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Joshua F McMichael
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63130, USA
| | | | - Caleb M Lindgren
- Department of Biology, Brigham Young University, Provo, UT 84602, USA
| | - Hannah Boekweg
- Department of Biology, Brigham Young University, Provo, UT 84602, USA
| | - Shuangjia Lu
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Jessika Baral
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Lijun Yao
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Kelly G Stratton
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Lisa M Bramer
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Erika Zink
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Sneha P Couvillion
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Kent J Bloodsworth
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Shankha Satpathy
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | - Weiva Sieh
- Department of Genetics and Genomic Sciences, Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Simina M Boca
- Innovation Center for Biomedical Informatics, Georgetown University Medical Center, Washington, DC 20007, USA
| | - Stephan Schürer
- Sylvester Comprehensive Cancer Center, University of Miami, FL 33136, USA; Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, Miami, FL 33136, USA; BD2K-LINCS Data Coordination and Integration Center, Miami, FL 33136, USA; Institute for Data Science & Computing, University of Miami, FL 33136, USA
| | - Feng Chen
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130, USA; Department of Cell Biology and Physiology, Washington University in St. Louis, St. Louis, MO 63130, USA; Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Maciej Wiznerowicz
- International Institute for Molecular Oncology, 60-203 Poznań, Poland; Poznań University of Medical Sciences, 61-701 Poznań, Poland
| | | | - Emily S Boja
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD 20892, USA
| | - Christopher R Kinsinger
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD 20892, USA
| | - Ana I Robles
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD 20892, USA
| | - Tara Hiltke
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD 20892, USA
| | | | - Alexey I Nesvizhskii
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA; Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Bing Zhang
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - D R Mani
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | - Michele Ceccarelli
- Department of Electrical Engineering and Information Technology, University of Naples "Federico II", 80128, Naples, Italy; BIOGEM, 83031 Ariano Irpino, Italy
| | - Xi S Chen
- Sylvester Comprehensive Cancer Center, University of Miami, FL 33136, USA; Division of Biostatistics, Department of Public Health Science, University of Miami, FL 33136, USA
| | - Sandra L Cottingham
- Department of Pathology, Spectrum Health and Helen DeVos Children's Hospital, Grand Rapids, MI 49503, USA
| | - Qing Kay Li
- Department of Pathology, Johns Hopkins University, Baltimore, MD 21231, USA
| | - Albert H Kim
- Department of Neurological Surgery, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - David Fenyö
- Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY 10016, USA; Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA
| | - Kelly V Ruggles
- Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY 10016, USA; Department of Medicine, NYU Grossman School of Medicine, New York, NY 10016, USA
| | - Henry Rodriguez
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD 20892, USA
| | - Mehdi Mesri
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD 20892, USA
| | - Samuel H Payne
- Department of Biology, Brigham Young University, Provo, UT 84602, USA
| | - Adam C Resnick
- Center for Data Driven Discovery in Biomedicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Neurosurgery, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Pei Wang
- Department of Genetics and Genomic Sciences, Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Richard D Smith
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Antonio Iavarone
- Institute for Cancer Genetics, Columbia University Medical Center, New York, NY 10032, USA; Department of Neurology, Columbia University Medical Center, New York, NY 10032, USA; Department of Pathology and Cell Biology, Columbia University Medical Center, New York, NY 10032, USA; Herbert Irving Comprehensive Cancer Center, Columbia University Medical Center, New York, NY 10032, USA
| | - Milan G Chheda
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130, USA; Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63130, USA; Department of Neurology, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Jill S Barnholtz-Sloan
- Case Comprehensive Cancer Center and Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA; Research and Education, University Hospitals Health System, Cleveland, OH 44106, USA
| | - Karin D Rodland
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA; Department of Cell, Developmental, and Cancer Biology, Oregon Health & Science University, Portland, OR 97221, USA.
| | - Tao Liu
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA.
| | - Li Ding
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63130, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63130, USA; Department of Genetics, Washington University in St. Louis, St. Louis, MO 63130, USA; Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63130, USA.
| |
|
15
|
Petralia F, Tignor N, Reva B, Koptyra M, Chowdhury S, Rykunov D, Krek A, Ma W, Zhu Y, Ji J, Calinawan A, Whiteaker JR, Colaprico A, Stathias V, Omelchenko T, Song X, Raman P, Guo Y, Brown MA, Ivey RG, Szpyt J, Guha Thakurta S, Gritsenko MA, Weitz KK, Lopez G, Kalayci S, Gümüş ZH, Yoo S, da Veiga Leprevost F, Chang HY, Krug K, Katsnelson L, Wang Y, Kennedy JJ, Voytovich UJ, Zhao L, Gaonkar KS, Ennis BM, Zhang B, Baubet V, Tauhid L, Lilly JV, Mason JL, Farrow B, Young N, Leary S, Moon J, Petyuk VA, Nazarian J, Adappa ND, Palmer JN, Lober RM, Rivero-Hinojosa S, Wang LB, Wang JM, Broberg M, Chu RK, Moore RJ, Monroe ME, Zhao R, Smith RD, Zhu J, Robles AI, Mesri M, Boja E, Hiltke T, Rodriguez H, Zhang B, Schadt EE, Mani DR, Ding L, Iavarone A, Wiznerowicz M, Schürer S, Chen XS, Heath AP, Rokita JL, Nesvizhskii AI, Fenyö D, Rodland KD, Liu T, Gygi SP, Paulovich AG, Resnick AC, Storm PB, Rood BR, Wang P. Integrated Proteogenomic Characterization across Major Histological Types of Pediatric Brain Cancer. Cell 2020; 183:1962-1985.e31. [PMID: 33242424 PMCID: PMC8143193 DOI: 10.1016/j.cell.2020.10.044] [Citation(s) in RCA: 178] [Impact Index Per Article: 35.6] [Received: 12/18/2019] [Revised: 06/19/2020] [Accepted: 10/26/2020] [Indexed: 02/06/2023]
Abstract
We report a comprehensive proteogenomics analysis, including whole-genome sequencing, RNA sequencing, and proteomics and phosphoproteomics profiling, of 218 tumors across 7 histological types of childhood brain cancer: low-grade glioma (n = 93), ependymoma (32), high-grade glioma (25), medulloblastoma (22), ganglioglioma (18), craniopharyngioma (16), and atypical teratoid rhabdoid tumor (12). Proteomics data identify common biological themes that span histological boundaries, suggesting that treatments used for one histological type may be applied effectively to other tumors sharing similar proteomics features. Immune landscape characterization reveals diverse tumor microenvironments across and within diagnoses. Proteomics data further reveal functional effects of somatic mutations and copy number variations (CNVs) not evident in transcriptomics data. Kinase-substrate association and co-expression network analysis identify important biological mechanisms of tumorigenesis. This is the first large-scale proteogenomics analysis across traditional histological boundaries to uncover foundational pediatric brain tumor biology and inform rational treatment selection.
Affiliation(s)
- Francesca Petralia
- Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Nicole Tignor
- Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Boris Reva
- Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Mateusz Koptyra
- Center for Data-Driven Discovery in Biomedicine, Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Shrabanti Chowdhury
- Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Dmitry Rykunov
- Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Azra Krek
- Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Weiping Ma
- Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Yuankun Zhu
- Center for Data-Driven Discovery in Biomedicine, Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Jiayi Ji
- Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Anna Calinawan
- Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | | | - Antonio Colaprico
- Department of Public Health Science, University of Miami Miller School of Medicine, Miami, FL 33136, USA
| | - Vasileios Stathias
- Department of Pharmacology, Institute for Data Science and Computing, Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL 33146, USA
| | - Tatiana Omelchenko
- Cell Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Xiaoyu Song
- Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Pichai Raman
- Center for Data-Driven Discovery in Biomedicine, Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Yiran Guo
- Center for Data-Driven Discovery in Biomedicine, Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Miguel A Brown
- Center for Data-Driven Discovery in Biomedicine, Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Richard G Ivey
- Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
| | - John Szpyt
- Thermo Fisher Scientific Center for Multiplexed Proteomics, Department of Cell Biology, Harvard Medical School, Boston, MA 02115, USA
| | - Sanjukta Guha Thakurta
- Thermo Fisher Scientific Center for Multiplexed Proteomics, Department of Cell Biology, Harvard Medical School, Boston, MA 02115, USA
| | - Marina A Gritsenko
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Karl K Weitz
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Gonzalo Lopez
- Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Selim Kalayci
- Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Zeynep H Gümüş
- Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Seungyeul Yoo
- Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | | | - Hui-Yin Chang
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Karsten Krug
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | - Lizabeth Katsnelson
- Institute for Systems Genetics; Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA
| | - Ying Wang
- Institute for Systems Genetics; Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA
| | - Jacob J Kennedy
- Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
| | | | - Lei Zhao
- Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
| | - Krutika S Gaonkar
- Center for Data-Driven Discovery in Biomedicine, Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Brian M Ennis
- Center for Data-Driven Discovery in Biomedicine, Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Bo Zhang
- Center for Data-Driven Discovery in Biomedicine, Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Valerie Baubet
- Center for Data-Driven Discovery in Biomedicine, Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Lamiya Tauhid
- Center for Data-Driven Discovery in Biomedicine, Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Jena V Lilly
- Center for Data-Driven Discovery in Biomedicine, Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Jennifer L Mason
- Center for Data-Driven Discovery in Biomedicine, Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Bailey Farrow
- Center for Data-Driven Discovery in Biomedicine, Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Nathan Young
- Center for Data-Driven Discovery in Biomedicine, Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Sarah Leary
- Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; Cancer and Blood Disorders Center, Seattle Children's Hospital, Seattle, WA 98105, USA; Department of Pediatrics, University of Washington, Seattle, WA 98195, USA
| | - Jamie Moon
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Vladislav A Petyuk
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Javad Nazarian
- Children's National Research Institute, George Washington University School of Medicine, Washington, DC 20010, USA; Department of Oncology, Children's Research Center, University Children's Hospital Zürich, Zürich 8032, Switzerland
| | - Nithin D Adappa
- Department of Otorhinolaryngology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - James N Palmer
- Department of Otorhinolaryngology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Robert M Lober
- Department of Neurosurgery, Dayton Children's Hospital, Dayton, OH 45404, USA
| | - Samuel Rivero-Hinojosa
- Children's National Research Institute, George Washington University School of Medicine, Washington, DC 20010, USA
| | - Liang-Bo Wang
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
| | - Joshua M Wang
- Institute for Systems Genetics; Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA
| | - Matilda Broberg
- Institute for Systems Genetics; Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA
| | - Rosalie K Chu
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Ronald J Moore
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Matthew E Monroe
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Rui Zhao
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Richard D Smith
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Jun Zhu
- Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Ana I Robles
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Mehdi Mesri
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Emily Boja
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Tara Hiltke
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Henry Rodriguez
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Bing Zhang
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Eric E Schadt
- Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - D R Mani
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | - Li Ding
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA; Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63110, USA
| | - Antonio Iavarone
- Institute for Cancer Genetics, Department of Neurology, Department of Pathology and Cell Biology, Herbert Irving Comprehensive Cancer Center, Columbia University Medical Center, New York, NY 10032, USA
| | - Maciej Wiznerowicz
- Poznan University of Medical Sciences, 61-701 Poznań, Poland; International Institute for Molecular Oncology, 61-203 Poznań, Poland
| | - Stephan Schürer
- Department of Pharmacology, Institute for Data Science and Computing, Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL 33146, USA
| | - Xi S Chen
- Department of Public Health Science, University of Miami Miller School of Medicine, Miami, FL 33136, USA; Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA
| | - Allison P Heath
- Center for Data-Driven Discovery in Biomedicine, Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Jo Lynne Rokita
- Center for Data-Driven Discovery in Biomedicine, Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Alexey I Nesvizhskii
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA; Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - David Fenyö
- Institute for Systems Genetics; Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA
| | - Karin D Rodland
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA; Department of Cell, Developmental, and Cancer Biology, Oregon Health & Science University, Portland, OR 97221, USA
| | - Tao Liu
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Steven P Gygi
- Thermo Fisher Scientific Center for Multiplexed Proteomics, Department of Cell Biology, Harvard Medical School, Boston, MA 02115, USA
| | | | - Adam C Resnick
- Center for Data-Driven Discovery in Biomedicine, Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.
| | - Phillip B Storm
- Center for Data-Driven Discovery in Biomedicine, Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.
| | - Brian R Rood
- Children's National Research Institute, George Washington University School of Medicine, Washington, DC 20010, USA.
| | - Pei Wang
- Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
| |
|
16
|
Gillette MA, Satpathy S, Cao S, Dhanasekaran SM, Vasaikar SV, Krug K, Petralia F, Li Y, Liang WW, Reva B, Krek A, Ji J, Song X, Liu W, Hong R, Yao L, Blumenberg L, Savage SR, Wendl MC, Wen B, Li K, Tang LC, MacMullan MA, Avanessian SC, Kane MH, Newton CJ, Cornwell M, Kothadia RB, Ma W, Yoo S, Mannan R, Vats P, Kumar-Sinha C, Kawaler EA, Omelchenko T, Colaprico A, Geffen Y, Maruvka YE, da Veiga Leprevost F, Wiznerowicz M, Gümüş ZH, Veluswamy RR, Hostetter G, Heiman DI, Wyczalkowski MA, Hiltke T, Mesri M, Kinsinger CR, Boja ES, Omenn GS, Chinnaiyan AM, Rodriguez H, Li QK, Jewell SD, Thiagarajan M, Getz G, Zhang B, Fenyö D, Ruggles KV, Cieslik MP, Robles AI, Clauser KR, Govindan R, Wang P, Nesvizhskii AI, Ding L, Mani DR, Carr SA. Proteogenomic Characterization Reveals Therapeutic Vulnerabilities in Lung Adenocarcinoma. Cell 2020; 182:200-225.e35. [PMID: 32649874 PMCID: PMC7373300 DOI: 10.1016/j.cell.2020.06.013] [Citation(s) in RCA: 430] [Impact Index Per Article: 86.0] [Received: 11/05/2019] [Revised: 03/06/2020] [Accepted: 06/03/2020] [Indexed: 12/24/2022]
Abstract
To explore the biology of lung adenocarcinoma (LUAD) and identify new therapeutic opportunities, we performed comprehensive proteogenomic characterization of 110 tumors and 101 matched normal adjacent tissues (NATs) incorporating genomics, epigenomics, deep-scale proteomics, phosphoproteomics, and acetylproteomics. Multi-omics clustering revealed four subgroups defined by key driver mutations, country, and gender. Proteomic and phosphoproteomic data illuminated biology downstream of copy number aberrations, somatic mutations, and fusions and identified therapeutic vulnerabilities associated with driver events involving KRAS, EGFR, and ALK. Immune subtyping revealed a complex landscape, reinforced the association of STK11 with immune-cold behavior, and underscored a potential immunosuppressive role of neutrophil degranulation. Smoking-associated LUADs showed correlation with other environmental exposure signatures and a field effect in NATs. Matched NATs allowed identification of differentially expressed proteins with potential diagnostic and therapeutic utility. This proteogenomics dataset represents a unique public resource for researchers and clinicians seeking to better understand and treat lung adenocarcinomas.
Affiliation(s)
- Michael A Gillette
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA; Division of Pulmonary and Critical Care Medicine, Massachusetts General Hospital, Boston, MA, 02115, USA.
| | - Shankha Satpathy
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA.
| | - Song Cao
- Department of Medicine and Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63110, USA
| | | | - Suhas V Vasaikar
- Department of Translational Molecular Pathology, MD Anderson Cancer Center, Houston, TX, 77030, USA
| | - Karsten Krug
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
| | - Francesca Petralia
- Department of Genetics and Genomic Sciences, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Yize Li
- Department of Medicine and Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63110, USA
| | - Wen-Wei Liang
- Department of Medicine and Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63110, USA
| | - Boris Reva
- Department of Genetics and Genomic Sciences, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Azra Krek
- Department of Genetics and Genomic Sciences, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Jiayi Ji
- Department of Population Health Science and Policy; Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - Xiaoyu Song
- Department of Population Health Science and Policy; Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - Wenke Liu
- Institute for Systems Genetics and Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA
| | - Runyu Hong
- Institute for Systems Genetics and Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA
| | - Lijun Yao
  - Department of Medicine and Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63110, USA
- Lili Blumenberg
  - Institute for Systems Genetics and Department of Medicine, NYU Grossman School of Medicine, New York, NY 10016, USA
- Sara R Savage
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Michael C Wendl
  - Department of Medicine and Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63110, USA
- Bo Wen
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Kai Li
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Lauren C Tang
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA; Department of Biological Sciences, Columbia University, New York, NY, 10027, USA
- Melanie A MacMullan
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA; Mork Family Department of Chemical Engineering and Materials Science, University of Southern California, Los Angeles, CA, 90089, USA
- Shayan C Avanessian
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- M Harry Kane
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- MacIntosh Cornwell
  - Institute for Systems Genetics and Department of Medicine, NYU Grossman School of Medicine, New York, NY 10016, USA
- Ramani B Kothadia
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Weiping Ma
  - Department of Genetics and Genomic Sciences, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Seungyeul Yoo
  - Department of Genetics and Genomic Sciences, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Rahul Mannan
  - Department of Pathology, University of Michigan, Ann Arbor, MI, 48109, USA
- Pankaj Vats
  - Department of Pathology, University of Michigan, Ann Arbor, MI, 48109, USA
- Emily A Kawaler
  - Institute for Systems Genetics and Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA
- Tatiana Omelchenko
  - Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Antonio Colaprico
  - Department of Public Health Sciences, University of Miami, Miller School of Medicine, Miami, FL, 33136, USA
- Yifat Geffen
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Yosef E Maruvka
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Maciej Wiznerowicz
  - Poznan University of Medical Sciences, Poznań, 61-701, Poland; International Institute for Molecular Oncology, Poznań, 60-203, Poland
- Zeynep H Gümüş
  - Department of Genetics and Genomic Sciences, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Rajwanth R Veluswamy
  - Division of Hematology and Medical Oncology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- David I Heiman
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Matthew A Wyczalkowski
  - Department of Medicine and Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63110, USA
- Tara Hiltke
  - Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD, 20892, USA
- Mehdi Mesri
  - Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD, 20892, USA
- Christopher R Kinsinger
  - Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD, 20892, USA
- Emily S Boja
  - Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD, 20892, USA
- Gilbert S Omenn
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
- Arul M Chinnaiyan
  - Department of Pathology, University of Michigan, Ann Arbor, MI, 48109, USA
- Henry Rodriguez
  - Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD, 20892, USA
- Qing Kay Li
  - Sidney Kimmel Comprehensive Cancer Center, The Johns Hopkins Medical Institutions, Baltimore, MD, 21224, USA
- Scott D Jewell
  - Van Andel Research Institute, Grand Rapids, MI, 49503, USA
- Mathangi Thiagarajan
  - Leidos Biomedical Research Inc., Frederick National Laboratory for Cancer Research, Frederick, MD, 21702, USA
- Gad Getz
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Bing Zhang
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX, 77030, USA
- David Fenyö
  - Institute for Systems Genetics and Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA
- Kelly V Ruggles
  - Institute for Systems Genetics and Department of Medicine, NYU Grossman School of Medicine, New York, NY 10016, USA
- Marcin P Cieslik
  - Department of Pathology, University of Michigan, Ann Arbor, MI, 48109, USA
- Ana I Robles
  - Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD, 20892, USA
- Karl R Clauser
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Ramaswamy Govindan
  - Division of Oncology and Siteman Cancer Center, Washington University School of Medicine in St. Louis, St. Louis, MO, 63110, USA
- Pei Wang
  - Department of Genetics and Genomic Sciences, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Alexey I Nesvizhskii
  - Department of Pathology, University of Michigan, Ann Arbor, MI, 48109, USA; Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
- Li Ding
  - Department of Medicine and Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63110, USA
- D R Mani
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Steven A Carr
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA

17. Blay N, Casas E, Galván-Femenía I, Graffelman J, de Cid R, Vavouri T. Assessment of kinship detection using RNA-seq data. Nucleic Acids Res 2019; 47:e136. [PMID: 31501877 PMCID: PMC6868348 DOI: 10.1093/nar/gkz776] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 02/16/2019] [Revised: 08/23/2019] [Accepted: 08/29/2019] [Indexed: 01/23/2023] [Open Access]
Abstract
Analysis of RNA sequencing (RNA-seq) data from related individuals is widely used in clinical and molecular genetics studies. Prediction of kinship from RNA-seq data would be useful for confirming the expected relationships in family based studies and for highlighting samples from related individuals in case-control or population based studies. Currently, reconstruction of pedigrees is largely based on SNPs or microsatellites, obtained from genotyping arrays, whole genome sequencing and whole exome sequencing. Potential problems with using RNA-seq data for kinship detection are the low proportion of the genome that it covers, the highly skewed coverage of exons of different genes depending on expression level and allele-specific expression. In this study we assess the use of RNA-seq data to detect kinship between individuals, through pairwise identity by descent (IBD) estimates. First, we obtained high quality SNPs after successive filters to minimize the effects due to allelic imbalance as well as errors in sequencing, mapping and genotyping. Then, we used these SNPs to calculate pairwise IBD estimates. By analysing both real and simulated RNA-seq data we show that it is possible to identify up to second degree relationships using RNA-seq data of even low to moderate sequencing depth.
Affiliation(s)
- Natalia Blay
  - Program for Predictive and Personalized Medicine of Cancer, Germans Trias i Pujol Research Institute (PMPPC-IGTP), Badalona 08916, Spain
  - Josep Carreras Leukaemia Research Institute (IJC), Campus ICO-Germans Trias i Pujol, Universitat Autònoma de Barcelona, Badalona 08916, Spain
  - Masters Programme in Bioinformatics and Biostatistics, Universitat Oberta de Catalunya (UOC), Barcelona 08035, Spain
- Eduard Casas
  - Program for Predictive and Personalized Medicine of Cancer, Germans Trias i Pujol Research Institute (PMPPC-IGTP), Badalona 08916, Spain
  - Josep Carreras Leukaemia Research Institute (IJC), Campus ICO-Germans Trias i Pujol, Universitat Autònoma de Barcelona, Badalona 08916, Spain
  - Doctoral Programme in Biomedicine, Universitat de Barcelona, Barcelona 08007, Spain
- Iván Galván-Femenía
  - Program for Predictive and Personalized Medicine of Cancer, Germans Trias i Pujol Research Institute (PMPPC-IGTP), Badalona 08916, Spain
  - Genomes for Life - GCAT lab Group - Germans Trias i Pujol Research Institute, Can Ruti Campus, Ctra de Can Ruti, Camí de les Escoles s/n, Badalona, Barcelona 08916, Spain
- Jan Graffelman
  - Department of Statistics and Operations Research, Universitat Politècnica de Catalunya, Barcelona 08028, Spain
  - Department of Biostatistics, University of Washington, Seattle, WA 98105-946, USA
- Rafael de Cid
  - Program for Predictive and Personalized Medicine of Cancer, Germans Trias i Pujol Research Institute (PMPPC-IGTP), Badalona 08916, Spain
  - Genomes for Life - GCAT lab Group - Germans Trias i Pujol Research Institute, Can Ruti Campus, Ctra de Can Ruti, Camí de les Escoles s/n, Badalona, Barcelona 08916, Spain
- Tanya Vavouri
  - Program for Predictive and Personalized Medicine of Cancer, Germans Trias i Pujol Research Institute (PMPPC-IGTP), Badalona 08916, Spain
  - Josep Carreras Leukaemia Research Institute (IJC), Campus ICO-Germans Trias i Pujol, Universitat Autònoma de Barcelona, Badalona 08916, Spain

18. Jiang Y, Giase G, Grennan K, Shieh AW, Xia Y, Han L, Wang Q, Wei Q, Chen R, Liu S, White KP, Chen C, Li B, Liu C. DRAMS: A tool to detect and re-align mixed-up samples for integrative studies of multi-omics data. PLoS Comput Biol 2020; 16:e1007522. [PMID: 32282793 PMCID: PMC7179940 DOI: 10.1371/journal.pcbi.1007522] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/30/2019] [Revised: 04/23/2020] [Accepted: 02/28/2020] [Indexed: 11/28/2022] [Open Access]
Abstract
Studies of complex disorders benefit from integrative analyses of multiple omics data. Yet sample mix-ups frequently occur in multi-omics studies, weakening statistical power and risking false findings. Accurately aligning sample information, genotype, and corresponding omics data is critical for integrative analyses. We developed DRAMS (https://github.com/Yi-Jiang/DRAMS) to Detect and Re-Align Mixed-up Samples to address the sample mix-up problem. It uses a logistic regression model followed by a modified topological sorting algorithm to identify the potential true IDs based on the relationships among multi-omics data. In tests using simulated data, DRAMS performed better the more types of omics data were used and the smaller the proportion of mix-ups. Applying DRAMS to real data from the PsychENCODE BrainGVEX project, we detected and corrected 201 mix-ups (12.5% of total data generated). Of the 21 mix-ups involving errors of racial identity, DRAMS re-assigned all data to the correct racial group in the 1000 Genomes project. After correction, the number of quantitative trait loci (QTL) (FDR<0.01) increased by an average of 1.62-fold. The use of DRAMS in multi-omics studies will strengthen the statistical power of the study and improve the quality of the results. Although few studies currently have multi-omics data in place, we expect such data, and with them the need for DRAMS, to increase quickly.
Affiliation(s)
- Yi Jiang
  - Center for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China
  - Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, Tennessee, United States of America
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Gina Giase
  - School of Public Health, University of Illinois at Chicago, Chicago, Illinois, United States of America
- Kay Grennan
  - Department of Psychiatry, SUNY Upstate Medical University, Syracuse, New York, United States of America
- Annie W. Shieh
  - Department of Psychiatry, SUNY Upstate Medical University, Syracuse, New York, United States of America
- Yan Xia
  - Center for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China
  - Department of Psychiatry, SUNY Upstate Medical University, Syracuse, New York, United States of America
- Lide Han
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Quan Wang
  - Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, Tennessee, United States of America
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Qiang Wei
  - Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, Tennessee, United States of America
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Rui Chen
  - Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, Tennessee, United States of America
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Sihan Liu
  - Center for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China
- Kevin P. White
  - Institute for Genomics and Systems Biology, Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
  - Tempus Labs Inc, Chicago, Illinois, United States of America
- Chao Chen
  - Center for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Bingshan Li
  - Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, Tennessee, United States of America
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Chunyu Liu
  - Center for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China
  - Department of Psychiatry, SUNY Upstate Medical University, Syracuse, New York, United States of America
  - School of Psychology, Shaanxi Normal University, Xi’an, Shaanxi, China

19. Westphal M, Frankhouser D, Sonzone C, Shields PG, Yan P, Bundschuh R. SMaSH: Sample matching using SNPs in humans. BMC Genomics 2019; 20:1001. [PMID: 31888490 PMCID: PMC6936078 DOI: 10.1186/s12864-019-6332-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 01/27/2023] [Open Access]
Abstract
BACKGROUND: Inadvertent sample swaps are a real threat to data quality in any medium- to large-scale omics study. While matches between samples from the same individual can in principle be identified from a few well-characterized single nucleotide polymorphisms (SNPs), omics data types often provide only low to moderate coverage, thus requiring integration of evidence from a large number of SNPs to determine whether two samples derive from the same individual. METHODS: We select about six thousand SNPs in the human genome and develop a Bayesian framework that is able to robustly identify sample matches between next generation sequencing data sets. RESULTS: We validate our approach on a variety of data sets. Most importantly, we show that our approach can establish identity between different omics data types such as Exome, RNA-Seq, and MethylCap-Seq. We demonstrate how identity detection degrades with sample quality and read coverage, but show that twenty million reads of a fairly low quality RNA-Seq sample are still sufficient for reliable sample identification. CONCLUSION: Our tool, SMaSH, is able to identify sample mismatches in next generation sequencing data sets between different sequencing modalities and for low quality sequencing data.
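As a rough illustration of the idea of pooling evidence across many SNPs, the sketch below scores two samples by the fraction of shared sites with identical genotype calls. This is a plain concordance score, not the Bayesian framework the abstract describes, and the function names, the 0/1/2 genotype encoding, and the `min_sites` threshold are all assumptions made for the example.

```python
from typing import Dict, Optional, Sequence

Genotype = Optional[int]  # assumed encoding: 0, 1, or 2 alternate alleles; None = no call


def genotype_concordance(a: Sequence[Genotype], b: Sequence[Genotype],
                         min_sites: int = 3) -> Optional[float]:
    """Fraction of sites called in both samples where the genotypes agree.

    Returns None when fewer than `min_sites` sites are called in both samples,
    since a score from very few sites is not informative.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(shared) < min_sites:
        return None
    return sum(x == y for x, y in shared) / len(shared)


def best_match(query: Sequence[Genotype],
               references: Dict[str, Sequence[Genotype]]) -> Optional[str]:
    """Label of the reference sample with the highest concordance to `query`."""
    scores = {name: genotype_concordance(query, geno) for name, geno in references.items()}
    scored = {name: s for name, s in scores.items() if s is not None}
    if not scored:
        return None
    return max(scored, key=scored.get)


# Toy data: five SNP sites, two reference individuals, one query with a missing call.
references = {"P1": [0, 1, 2, 0, 1], "P2": [2, 2, 0, 1, 0]}
query = [0, 1, 2, None, 1]
print(best_match(query, references))
```

With low-coverage data a hard agreement count like this is fragile, which is one reason a probabilistic treatment of genotype uncertainty, as in the abstract, is preferable in practice.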
Affiliation(s)
- Maximillian Westphal
  - Interdisciplinary Biophysics Graduate Program, The Ohio State University, 484 W. 12th Avenue, Columbus, 43210, OH, USA
- David Frankhouser
  - Biomedical Science Graduate Program, The Ohio State University, 333 W. 10th Avenue, Columbus, 43210, OH, USA
  - Department of Diabetes Complications and Metabolism and Department of Population Sciences in the Beckman Research Institute, City of Hope, 1500 East Duarte Road, Duarte, 91010, CA, USA
- Carmine Sonzone
  - Molecular, Cellular, and Developmental Biology Graduate Program, The Ohio State University, 484 W. 12th Avenue, Columbus, 43210, OH, USA
- Peter G Shields
  - Molecular, Cellular, and Developmental Biology Graduate Program, The Ohio State University, 484 W. 12th Avenue, Columbus, 43210, OH, USA
  - Department of Internal Medicine, The Ohio State University, 395 W. 12th Avenue, Columbus, 43210, OH, USA
  - Comprehensive Cancer Center, The Ohio State University, 460 W. 10th Avenue, Columbus, 43210, OH, USA
- Pearlly Yan
  - Department of Internal Medicine, The Ohio State University, 395 W. 12th Avenue, Columbus, 43210, OH, USA
  - Comprehensive Cancer Center, The Ohio State University, 460 W. 10th Avenue, Columbus, 43210, OH, USA
- Ralf Bundschuh
  - Interdisciplinary Biophysics Graduate Program, The Ohio State University, 484 W. 12th Avenue, Columbus, 43210, OH, USA
  - Department of Internal Medicine, The Ohio State University, 395 W. 12th Avenue, Columbus, 43210, OH, USA
  - Department of Physics, The Ohio State University, 191 W. Woodruff Avenue, Columbus, 43210, OH, USA
  - Department of Chemistry and Biochemistry, The Ohio State University, 100 W. 18th Avenue, Columbus, 43210, OH, USA
  - Center for RNA Biology, The Ohio State University, 484 W. 12th Avenue, Columbus, 43210, OH, USA

20. Clark DJ, Dhanasekaran SM, Petralia F, Pan J, Song X, Hu Y, da Veiga Leprevost F, Reva B, Lih TSM, Chang HY, Ma W, Huang C, Ricketts CJ, Chen L, Krek A, Li Y, Rykunov D, Li QK, Chen LS, Ozbek U, Vasaikar S, Wu Y, Yoo S, Chowdhury S, Wyczalkowski MA, Ji J, Schnaubelt M, Kong A, Sethuraman S, Avtonomov DM, Ao M, Colaprico A, Cao S, Cho KC, Kalayci S, Ma S, Liu W, Ruggles K, Calinawan A, Gümüş ZH, Geiszler D, Kawaler E, Teo GC, Wen B, Zhang Y, Keegan S, Li K, Chen F, Edwards N, Pierorazio PM, Chen XS, Pavlovich CP, Hakimi AA, Brominski G, Hsieh JJ, Antczak A, Omelchenko T, Lubinski J, Wiznerowicz M, Linehan WM, Kinsinger CR, Thiagarajan M, Boja ES, Mesri M, Hiltke T, Robles AI, Rodriguez H, Qian J, Fenyö D, Zhang B, Ding L, Schadt E, Chinnaiyan AM, Zhang Z, Omenn GS, Cieslik M, Chan DW, Nesvizhskii AI, Wang P, Zhang H. Integrated Proteogenomic Characterization of Clear Cell Renal Cell Carcinoma. Cell 2019; 179:964-983.e31. [PMID: 31675502 PMCID: PMC7331093 DOI: 10.1016/j.cell.2019.10.007] [Citation(s) in RCA: 426] [Impact Index Per Article: 71.0] [Received: 02/05/2019] [Revised: 07/15/2019] [Accepted: 10/07/2019] [Indexed: 02/07/2023]
Abstract
To elucidate the deregulated functional modules that drive clear cell renal cell carcinoma (ccRCC), we performed comprehensive genomic, epigenomic, transcriptomic, proteomic, and phosphoproteomic characterization of treatment-naive ccRCC and paired normal adjacent tissue samples. Genomic analyses identified a distinct molecular subgroup associated with genomic instability. Integration of proteogenomic measurements uniquely identified protein dysregulation of cellular mechanisms impacted by genomic alterations, including oxidative phosphorylation-related metabolism, protein translation processes, and phospho-signaling modules. To assess the degree of immune infiltration in individual tumors, we identified microenvironment cell signatures that delineated four immune-based ccRCC subtypes characterized by distinct cellular pathways. This study reports a large-scale proteogenomic analysis of ccRCC to discern the functional impact of genomic alterations and provides evidence for rational treatment selection stemming from ccRCC pathobiology.
Affiliation(s)
- David J Clark
  - Department of Pathology, Johns Hopkins University, Baltimore, MD 21231, USA
- Francesca Petralia
  - Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Jianbo Pan
  - Department of Pathology, Johns Hopkins University, Baltimore, MD 21231, USA
- Xiaoyu Song
  - Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Yingwei Hu
  - Department of Pathology, Johns Hopkins University, Baltimore, MD 21231, USA
- Boris Reva
  - Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Tung-Shing M Lih
  - Department of Pathology, Johns Hopkins University, Baltimore, MD 21231, USA
- Hui-Yin Chang
  - Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
- Weiping Ma
  - Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Chen Huang
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
- Christopher J Ricketts
  - Urologic Oncology Branch, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Lijun Chen
  - Department of Pathology, Johns Hopkins University, Baltimore, MD 21231, USA
- Azra Krek
  - Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Yize Li
  - Washington University School of Medicine, St. Louis, MO 63110, USA
- Dmitry Rykunov
  - Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Qing Kay Li
  - Department of Pathology, Johns Hopkins University, Baltimore, MD 21231, USA
- Lin S Chen
  - Department of Public Health Sciences, University of Chicago, Chicago, IL 60637, USA
- Umut Ozbek
  - Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Suhas Vasaikar
  - Department of Translational Molecular Pathology, MD Anderson Cancer Center, Houston, TX 77030, USA
- Yige Wu
  - Washington University School of Medicine, St. Louis, MO 63110, USA
- Seungyeul Yoo
  - Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Shrabanti Chowdhury
  - Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Jiayi Ji
  - Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Michael Schnaubelt
  - Department of Pathology, Johns Hopkins University, Baltimore, MD 21231, USA
- Andy Kong
  - Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
- Dmitry M Avtonomov
  - Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
- Minghui Ao
  - Department of Pathology, Johns Hopkins University, Baltimore, MD 21231, USA
- Antonio Colaprico
  - Department of Public Health Sciences, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Song Cao
  - Washington University School of Medicine, St. Louis, MO 63110, USA
- Kyung-Cho Cho
  - Department of Pathology, Johns Hopkins University, Baltimore, MD 21231, USA
- Selim Kalayci
  - Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Shiyong Ma
  - Department of Pathology, Johns Hopkins University, Baltimore, MD 21231, USA
- Wenke Liu
  - Institute for Systems Genetics and Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, NY 10016, USA
- Kelly Ruggles
  - Department of Medicine, New York University School of Medicine, New York, NY 10016, USA
- Anna Calinawan
  - Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Zeynep H Gümüş
  - Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Daniel Geiszler
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Emily Kawaler
  - Institute for Systems Genetics and Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, NY 10016, USA
- Guo Ci Teo
  - Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
- Bo Wen
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
- Yuping Zhang
  - Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
- Sarah Keegan
  - Institute for Systems Genetics and Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, NY 10016, USA
- Kai Li
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
- Feng Chen
  - Departments of Medicine and Cell Biology and Physiology, Washington University School of Medicine, St. Louis, MO 63110, USA
- Nathan Edwards
  - Department of Biochemistry and Cellular Biology, Georgetown University, Washington, DC 20007, USA
- Phillip M Pierorazio
  - Brady Urological Institute and Department of Urology, Johns Hopkins University, Baltimore, MD 21231, USA
- Xi Steven Chen
  - Department of Public Health Sciences, University of Miami Miller School of Medicine, Miami, FL 33136, USA; Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Christian P Pavlovich
  - Brady Urological Institute and Department of Urology, Johns Hopkins University, Baltimore, MD 21231, USA
- A Ari Hakimi
  - Department of Surgery, Urology Service, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Gabriel Brominski
  - Department of Urology, Poznań University of Medical Sciences, Szwajcarska 3, Poznań 61-285, Poland
- James J Hsieh
  - Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- Andrzej Antczak
  - Department of Urology, Poznań University of Medical Sciences, Szwajcarska 3, Poznań 61-285, Poland
- Tatiana Omelchenko
  - Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Jan Lubinski
  - Department of Genetics and Pathology, Pomeranian Medical University, Szczecin 71-252, Poland
- Maciej Wiznerowicz
  - International Institute for Molecular Oncology, Poznań 60-203, Poland; Poznań University of Medical Sciences, Poznan 60-701, Poland
- W Marston Linehan
  - Urologic Oncology Branch, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Christopher R Kinsinger
  - Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD 20892, USA
- Emily S Boja
  - Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD 20892, USA
- Mehdi Mesri
  - Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD 20892, USA
- Tara Hiltke
  - Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD 20892, USA
- Ana I Robles
  - Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD 20892, USA
- Henry Rodriguez
  - Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD 20892, USA
- Jiang Qian
  - Department of Ophthalmology, Johns Hopkins University, Baltimore, MD 21231, USA
- David Fenyö
  - Institute for Systems Genetics and Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, NY 10016, USA
- Bing Zhang
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Li Ding
  - Washington University School of Medicine, St. Louis, MO 63110, USA
- Eric Schadt
  - Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Sema4, Stamford, CT 06902, USA
- Arul M Chinnaiyan
  - Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
- Zhen Zhang
  - Department of Pathology, Johns Hopkins University, Baltimore, MD 21231, USA
- Gilbert S Omenn
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Internal Medicine, Human Genetics, and School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Marcin Cieslik
  - Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
- Daniel W Chan
  - Department of Pathology, Johns Hopkins University, Baltimore, MD 21231, USA
- Pei Wang
  - Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Hui Zhang
  - Department of Pathology, Johns Hopkins University, Baltimore, MD 21231, USA

21. A Network Analysis of Multiple Myeloma Related Gene Signatures. Cancers (Basel) 2019; 11:1452. [PMID: 31569720 PMCID: PMC6827160 DOI: 10.3390/cancers11101452] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 07/31/2019] [Revised: 09/20/2019] [Accepted: 09/20/2019] [Indexed: 12/21/2022] [Open Access]
Abstract
Multiple myeloma (MM) is the second most prevalent hematological cancer. MM is a complex and heterogeneous disease, and thus, it is essential to leverage omics data from large MM cohorts to understand the molecular mechanisms underlying MM tumorigenesis, progression, and drug responses, which may aid in the development of better treatments. In this study, we analyzed gene expression, copy number variation, and clinical data from the Multiple Myeloma Research Consortium (MMRC) dataset and constructed a multiple myeloma molecular causal network (M3CN). The M3CN was used to unify eight prognostic gene signatures in the literature that shared very few genes between them, resulting in a prognostic subnetwork of the M3CN, consisting of 178 genes that were enriched for genes involved in cell cycle (fold enrichment = 8.4, p value = 6.1 × 10−26). The M3CN was further used to characterize immunomodulators and proteasome inhibitors for MM, demonstrating the pleiotropic effects of these drugs, with drug-response signature genes enriched across multiple M3CN subnetworks. Network analyses indicated potential links between these drug-response subnetworks and the prognostic subnetwork. To elucidate the structure of these important MM subnetworks, we identified putative key regulators predicted to modulate the state of these subnetworks. Finally, to assess the predictive power of our network-based models, we stratified MM patients in an independent cohort, the MMRF-CoMMpass study, based on the prognostic subnetwork, and compared the performance of this subnetwork against other signatures in the literature. We show that the M3CN-derived prognostic subnetwork achieved the best separation between different risk groups in terms of log-rank test p-values and hazard ratios. In summary, this work demonstrates the power of a probabilistic causal network approach to understanding molecular mechanisms underlying the different MM signatures.
|
22
|
Lee E, Yoo S, Wang W, Tu Z, Zhu J. A probabilistic multi-omics data matching method for detecting sample errors in integrative analysis. Gigascience 2019; 8:giz080. [PMID: 31289834 PMCID: PMC6615984 DOI: 10.1093/gigascience/giz080] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 02/14/2019] [Revised: 05/22/2019] [Accepted: 06/14/2019] [Indexed: 11/23/2022]
Abstract
BACKGROUND: Data errors, including sample swapping and mis-labeling, are inevitable in the process of large-scale omics data generation. Data errors need to be identified and corrected before integrative data analyses where different types of data are merged on the basis of the annotated labels. Data with labeling errors dampen true biological signals. More importantly, data analysis with sample errors could lead to wrong scientific conclusions. We developed a robust probabilistic multi-omics data matching procedure, proMODMatcher, to curate data and to identify and correct data annotation errors in large databases. RESULTS: Application to simulated datasets suggests that proMODMatcher achieved robust statistical power even when the number of cis-associations was small and/or the number of samples was large. Application of proMODMatcher to multi-omics datasets in The Cancer Genome Atlas and International Cancer Genome Consortium identified sample errors in multiple cancer datasets. Our procedure was not only able to identify sample-labeling errors but also to unambiguously identify the source of the errors. Our results demonstrate that these errors should be identified and corrected before integrative analysis. CONCLUSIONS: Our results indicate that sample-labeling errors were common in large multi-omics datasets. These errors should be corrected before integrative analysis.
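The matching idea described in this abstract can be illustrated with a minimal sketch. It is not proMODMatcher's probabilistic model: it assumes two hypothetical matrices (samples × features) whose columns are already paired cis-associated features, such as the mRNA and protein of the same gene, and it uses a plain mean product of z-scores as the similarity. All function names and the toy data are illustrative assumptions.

```python
import math
import random


def zscore_columns(matrix):
    """Standardize each feature (column) across samples."""
    n = len(matrix)
    scaled_cols = []
    for col in zip(*matrix):
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n) or 1.0
        scaled_cols.append([(v - mean) / sd for v in col])
    return [list(row) for row in zip(*scaled_cols)]


def similarity(row_a, row_b):
    """Mean product of z-scores; larger means more concordant profiles."""
    return sum(a * b for a, b in zip(row_a, row_b)) / len(row_a)


def find_mismatches(omics_a, omics_b):
    """Return {row in A: best-matching row in B} wherever the best match
    is not the row carrying the same label (the same index)."""
    za, zb = zscore_columns(omics_a), zscore_columns(omics_b)
    flagged = {}
    for i, row in enumerate(za):
        scores = [similarity(row, other) for other in zb]
        best = max(range(len(scores)), key=scores.__getitem__)
        if best != i:
            flagged[i] = best
    return flagged


# Toy data (assumption): 12 samples, 200 paired features, shared signal plus noise.
rng = random.Random(1)
latent = [[rng.gauss(0.0, 1.0) for _ in range(200)] for _ in range(12)]
omics_a = [[v + rng.gauss(0.0, 0.3) for v in row] for row in latent]
omics_b = [[v + rng.gauss(0.0, 0.3) for v in row] for row in latent]
omics_b[3], omics_b[7] = omics_b[7], omics_b[3]  # simulate a label swap

mismatches = find_mismatches(omics_a, omics_b)
print(mismatches)
```

A reciprocal pair in the result (row 3 pointing to row 7 and row 7 pointing to row 3) is the signature of a swap. An entry with no reciprocal partner would point to a different kind of labeling error.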
Affiliation(s)
- Eunjee Lee
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Sema4, a Mount Sinai venture, 333 Ludlow street, Stamford, CT 06902, USA
| | - Seungyeul Yoo
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Wenhui Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Zhidong Tu
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Jun Zhu
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Sema4, a Mount Sinai venture, 333 Ludlow street, Stamford, CT 06902, USA
- The Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| |
|
23
|
Wang W, Wang L, Gulko PS, Zhu J. Computational deconvolution of synovial tissue cellular composition: presence of adipocytes in synovial tissue decreased during arthritis pathogenesis and progression. Physiol Genomics 2019; 51:241-253. [PMID: 31100034 PMCID: PMC6620645 DOI: 10.1152/physiolgenomics.00009.2019] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 02/05/2019] [Revised: 04/18/2019] [Accepted: 05/13/2019] [Indexed: 01/15/2023]
Abstract
Osteoarthritis (OA) and rheumatoid arthritis (RA) are the most common forms of arthritis. The synovial tissue is the major site of inflammation in OA and RA and consists of diverse cells. Changes in synovial tissue cell composition during arthritis pathogenesis and progression have not been systematically characterized and may provide critical insights into disease processes. In this study we aimed to systematically examine cellular changes in synovial tissue. Publicly available synovial tissue transcriptomic data sets were used. We computationally estimated cell compositions in synovial tissue based on transcriptomic data and compared cell compositions in different diseases or at different disease stages. Synovial fibroblasts, macrophages, adipocytes, and immune cells were the major cell types in all synovial tissue. Both OA and RA patients had a significantly lower adipocyte fraction compared with healthy controls. The decreasing trend was also observed during OA and RA progression. The fraction of monocytes was also increased in both OA and RA patients, consistent with the observation that inflammation is involved in both OA and RA, but the monocyte fraction in RA was much higher than in healthy controls and OA. The M2 macrophage fraction was reduced in RA compared with OA, and the reduction continued during RA progression from the early to the late stage. There were consistent cell composition differences between different types or stages of arthritis. In both RA and OA, the newly discovered changes in the adipocyte and M2 macrophage fractions could potentially lead to novel therapeutic development.
Affiliation(s)
- Wenhui Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai , New York, New York
- Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai , New York, New York
| | - Li Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai , New York, New York
- Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai , New York, New York
- Sema4, a Mount Sinai venture, Stamford, Connecticut
| | - Percio S Gulko
- Division of Rheumatology, Department of Medicine, Icahn School of Medicine at Mount Sinai , New York
| | - Jun Zhu
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai , New York, New York
- Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai , New York, New York
- Sema4, a Mount Sinai venture, Stamford, Connecticut
| |
|
24
|
Wang M, Beckmann ND, Roussos P, Wang E, Zhou X, Wang Q, Ming C, Neff R, Ma W, Fullard JF, Hauberg ME, Bendl J, Peters MA, Logsdon B, Wang P, Mahajan M, Mangravite LM, Dammer EB, Duong DM, Lah JJ, Seyfried NT, Levey AI, Buxbaum JD, Ehrlich M, Gandy S, Katsel P, Haroutunian V, Schadt E, Zhang B. The Mount Sinai cohort of large-scale genomic, transcriptomic and proteomic data in Alzheimer's disease. Sci Data 2018; 5:180185. [PMID: 30204156 PMCID: PMC6132187 DOI: 10.1038/sdata.2018.185] [Citation(s) in RCA: 266] [Impact Index Per Article: 38.0] [Received: 05/09/2018] [Accepted: 07/20/2018] [Indexed: 12/30/2022]
Abstract
Alzheimer's disease (AD) affects half the US population over the age of 85 and is universally fatal following an average course of 10 years of progressive cognitive disability. Genetic and genome-wide association studies (GWAS) have identified about 33 risk factor genes for common, late-onset AD (LOAD), but these risk loci fail to account for the majority of affected cases and can neither provide clinically meaningful prediction of development of AD nor offer actionable mechanisms. This cohort study generated large-scale matched multi-omics data in AD and control brains for exploring novel molecular underpinnings of AD. Specifically, we generated whole genome sequencing, whole exome sequencing, transcriptome sequencing and proteome profiling data from multiple regions of 364 postmortem control, mild cognitive impairment (MCI) and AD brains with rich clinical and pathophysiological data. All the data went through rigorous quality control. Both the raw and processed data are publicly available through the Synapse software platform.
Affiliation(s)
- Minghui Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Noam D. Beckmann
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Panos Roussos
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Psychiatry, JJ Peters VA Medical Center, 130 West Kingsbridge Road, Bronx, NY 10468, USA
| | - Erming Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Xianxiao Zhou
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Qian Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Chen Ming
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Ryan Neff
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Weiping Ma
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - John F. Fullard
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Mads E. Hauberg
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- iPSYCH, The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Aarhus 8000, Denmark
- Department of Biomedicine, Aarhus University, Aarhus, Aarhus, 8000, Denmark
- Centre for Integrative Sequencing (iSEQ), Aarhus University, Aarhus, 8000, Denmark
| | - Jaroslav Bendl
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Mette A. Peters
- Sage Bionetworks, 1100 Fairview Ave N, Seattle, WA 98109, USA
| | - Ben Logsdon
- Sage Bionetworks, 1100 Fairview Ave N, Seattle, WA 98109, USA
| | - Pei Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Milind Mahajan
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | | | - Eric B. Dammer
- Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
- Integrated Proteomics Core Facility, Emory University School of Medicine, Atlanta, GA 30322, USA
| | - Duc M. Duong
- Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
- Integrated Proteomics Core Facility, Emory University School of Medicine, Atlanta, GA 30322, USA
| | - James J. Lah
- Department of Neurology, Emory University School of Medicine, Atlanta, GA 30322, USA
- Center for Neurodegenerative Disease, Emory University School of Medicine, Atlanta, GA 30322, USA
| | - Nicholas T. Seyfried
- Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
- Integrated Proteomics Core Facility, Emory University School of Medicine, Atlanta, GA 30322, USA
- Department of Neurology, Emory University School of Medicine, Atlanta, GA 30322, USA
| | - Allan I. Levey
- Department of Neurology, Emory University School of Medicine, Atlanta, GA 30322, USA
- Center for Neurodegenerative Disease, Emory University School of Medicine, Atlanta, GA 30322, USA
| | - Joseph D. Buxbaum
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
- Fishberg Department of Neuroscience, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Michelle Ehrlich
- Department of Neurology, Icahn School of Medicine at Mount Sinai, One Gustave L Levy Place, New York NY 10029, USA
- Department of Pediatrics, Icahn School of Medicine at Mount Sinai, One Gustave L Levy Place, New York, NY 10029, USA
| | - Sam Gandy
- Psychiatry, JJ Peters VA Medical Center, 130 West Kingsbridge Road, Bronx, NY 10468, USA
- Department of Neurology, Icahn School of Medicine at Mount Sinai, One Gustave L Levy Place, New York NY 10029, USA
- The Alzheimer’s Disease Research Center, Icahn School of Medicine at Mount Sinai, One Gustave L Levy Place, New York, NY 10029, USA
| | - Pavel Katsel
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Psychiatry, JJ Peters VA Medical Center, 130 West Kingsbridge Road, Bronx, NY 10468, USA
| | - Vahram Haroutunian
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Psychiatry, JJ Peters VA Medical Center, 130 West Kingsbridge Road, Bronx, NY 10468, USA
- The Alzheimer’s Disease Research Center, Icahn School of Medicine at Mount Sinai, One Gustave L Levy Place, New York, NY 10029, USA
- Fishberg Department of Neuroscience, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Eric Schadt
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Bin Zhang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| |
|
25
|
Awany D, Allali I, Chimusa ER. Tantalizing dilemma in risk prediction from disease scoring statistics. Brief Funct Genomics 2018; 18:211-219. [PMID: 30605512 PMCID: PMC6609536 DOI: 10.1093/bfgp/ely040] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/17/2018] [Revised: 08/17/2018] [Accepted: 11/29/2018] [Indexed: 02/01/2023]
Abstract
Over the past decade, human host genome-wide association studies (GWASs) have contributed greatly to our understanding of the impact of host genetics on phenotypes. Recently, the microbiome has been recognized as a complex trait in host genetic variation, leading to microbiome GWASs (mGWASs). For these, many different statistical methods and software tools have been developed for association mapping. Applications of these methods and tools have revealed several important findings; however, establishing causal factors and the direction of causality in the interplay between human genetic polymorphisms, the microbiome and host phenotypes remains a huge challenge. Here, we review disease scoring approaches in host GWAS and mGWAS and their underlying statistical methods and tools. We highlight the challenges in pinpointing the genetically associated causal factors in host GWAS and mGWAS and discuss the role of a multi-omic approach in disease scoring statistics, which may provide a better understanding of human phenotypic variation by enabling further systems-biology experiments to establish causality.
Affiliation(s)
- Denis Awany
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, South Africa
| | - Imane Allali
- Computational Biology Division, Department of Integrative Biomedical Sciences, Faculty of Health Sciences, University of Cape Town, South Africa
| | - Emile R Chimusa
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, South Africa
| |
|
26
|
Misra BB, Langefeld CD, Olivier M, Cox LA. Integrated Omics: Tools, Advances, and Future Approaches. J Mol Endocrinol 2018; 62:JME-18-0055. [PMID: 30006342 DOI: 10.1530/jme-18-0055] [Citation(s) in RCA: 234] [Impact Index Per Article: 33.4] [Received: 02/24/2018] [Revised: 07/02/2018] [Accepted: 07/12/2018] [Indexed: 12/13/2022]
Abstract
With the rapid adoption of high-throughput omic approaches to analyze biological samples such as genomics, transcriptomics, proteomics, and metabolomics, each analysis can generate tera- to peta-byte sized data files on a daily basis. These data file sizes, together with differences in nomenclature among these data types, make the integration of these multi-dimensional omics data into biologically meaningful context challenging. Variously named as integrated omics, multi-omics, poly-omics, trans-omics, pan-omics, or shortened to just 'omics', the challenges include differences in data cleaning, normalization, biomolecule identification, data dimensionality reduction, biological contextualization, statistical validation, data storage and handling, sharing, and data archiving. The ultimate goal is towards the holistic realization of a 'systems biology' understanding of the biological question in hand. Commonly used approaches in these efforts are currently limited by the 3 i's - integration, interpretation, and insights. Post integration, these very large datasets aim to yield unprecedented views of cellular systems at exquisite resolution for transformative insights into processes, events, and diseases through various computational and informatics frameworks. With the continued reduction in costs and processing time for sample analyses, and increasing types of omics datasets generated such as glycomics, lipidomics, microbiomics, and phenomics, an increasing number of scientists in this interdisciplinary domain of bioinformatics face these challenges. We discuss recent approaches, existing tools, and potential caveats in the integration of omics datasets for development of standardized analytical pipelines that could be adopted by the global omics research community.
Affiliation(s)
- Biswapriya B Misra
- B Misra, Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, United States
| | - Carl D Langefeld
- C Langefeld, Biostatistical Sciences, Wake Forest University School of Medicine, Winston-Salem, United States
| | - Michael Olivier
- M Olivier, Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, United States
| | - Laura A Cox
- L Cox, Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, United States
| |
|
27
|
Kim M, Tagkopoulos I. Data integration and predictive modeling methods for multi-omics datasets. Mol Omics 2018; 14:8-25. [DOI: 10.1039/c7mo00051k] [Citation(s) in RCA: 56] [Impact Index Per Article: 8.0] [Indexed: 12/24/2022]
Abstract
We provide an overview of opportunities and challenges in multi-omics predictive analytics with particular emphasis on data integration and machine learning methods.
Affiliation(s)
- Minseung Kim
- Department of Computer Science, University of California, Davis, USA
- Genome Center
| | - Ilias Tagkopoulos
- Department of Computer Science, University of California, Davis, USA
- Genome Center
| |
|
28
|
Lee S, Lee S, Ouellette S, Park WY, Lee EA, Park PJ. NGSCheckMate: software for validating sample identity in next-generation sequencing studies within and across data types. Nucleic Acids Res 2017; 45:e103. [PMID: 28369524 PMCID: PMC5499645 DOI: 10.1093/nar/gkx193] [Citation(s) in RCA: 83] [Impact Index Per Article: 10.4] [Received: 07/07/2016] [Revised: 03/06/2017] [Accepted: 03/22/2017] [Indexed: 12/30/2022]
Abstract
In many next-generation sequencing (NGS) studies, multiple samples or data types are profiled for each individual. An important quality control (QC) step in these studies is to ensure that datasets from the same subject are properly paired. Given the heterogeneity of data types, file types and sequencing depths in a multi-dimensional study, a robust program that provides a standardized metric for genotype comparisons would be useful. Here, we describe NGSCheckMate, a user-friendly software package for verifying sample identities from FASTQ, BAM or VCF files. This tool uses a model-based method to compare allele read fractions at known single-nucleotide polymorphisms, considering depth-dependent behavior of similarity metrics for identical and unrelated samples. Our evaluation shows that NGSCheckMate is effective for a variety of data types, including exome sequencing, whole-genome sequencing, RNA-seq, ChIP-seq, targeted sequencing and single-cell whole-genome sequencing, with a minimal requirement for sequencing depth (>0.5X). An alignment-free module can be run directly on FASTQ files for a quick initial check. We recommend using this software as a QC step in NGS studies. AVAILABILITY https://github.com/parklab/NGSCheckMate.
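The comparison of allele read fractions described in this abstract can be sketched in a few lines. This is a simplified illustration, not NGSCheckMate's code or its depth-dependent model: it assumes each sample is a list of variant allele fractions at the same ordered set of known SNPs (with `None` where a site is not covered), and it uses a fixed correlation cutoff of 0.8 chosen arbitrarily for the toy data. The function names are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation; NaN when either input has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)


def vaf_correlation(vaf_a, vaf_b):
    """Correlate allele fractions over SNPs covered in both samples."""
    pairs = [(a, b) for a, b in zip(vaf_a, vaf_b)
             if a is not None and b is not None]
    if len(pairs) < 2:
        return float("nan")
    xs, ys = zip(*pairs)
    return pearson(xs, ys)


def same_individual(vaf_a, vaf_b, threshold=0.8):
    """Fixed-cutoff call (assumption); NaN correlations yield False."""
    return vaf_correlation(vaf_a, vaf_b) >= threshold


# Toy data (assumption): one genotype profile observed in two noisy assays,
# plus an unrelated profile drawn independently.
rng = random.Random(0)
levels = [0.0, 0.5, 1.0]  # hom-ref, het, hom-alt


def noisy(values, sd):
    return [min(1.0, max(0.0, v + rng.gauss(0.0, sd))) for v in values]


genotypes = [rng.choice(levels) for _ in range(200)]
dna = noisy(genotypes, 0.05)
rna = noisy(genotypes, 0.10)
unrelated = [rng.choice(levels) for _ in range(200)]

print(same_individual(dna, rna), same_individual(dna, unrelated))
```

A fixed cutoff only works here because the toy profiles are deep and clean. The abstract notes that similarity metrics behave differently at different sequencing depths, which is why the published tool models that dependence instead of using one threshold.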
Affiliation(s)
- Sejoon Lee
- Samsung Genome Institute, Samsung Medical Center, Seoul, 06351, South Korea
- SD Genomics Co., Ltd, Seoul, 06336, South Korea
| | - Soohyun Lee
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
| | - Scott Ouellette
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
| | - Woong-Yang Park
- Samsung Genome Institute, Samsung Medical Center, Seoul, 06351, South Korea
| | - Eunjung A. Lee
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
- Division of Genetics and Genomics, Boston Children's Hospital and Harvard Medical School, Boston, MA 02115, USA
| | - Peter J. Park
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
- Ludwig Center at Harvard, Boston, MA 02115, USA
| |
|
29
|
Alyass A, Turcotte M, Meyre D. From big data analysis to personalized medicine for all: challenges and opportunities. BMC Med Genomics 2015; 8:33. [PMID: 26112054 PMCID: PMC4482045 DOI: 10.1186/s12920-015-0108-y] [Citation(s) in RCA: 228] [Impact Index Per Article: 22.8] [Received: 01/21/2015] [Accepted: 06/15/2015] [Indexed: 02/07/2023]
Abstract
Recent advances in high-throughput technologies have led to the emergence of systems biology as a holistic science to achieve more precise modeling of complex diseases. Many predict the emergence of personalized medicine in the near future. We are, however, moving from two-tiered health systems to a two-tiered personalized medicine. Omics facilities are restricted to affluent regions, and personalized medicine is likely to widen the growing gap in health systems between high and low-income countries. This is mirrored by an increasing lag between our ability to generate and analyze big data. Several bottlenecks slow down the transition from conventional to personalized medicine: generation of cost-effective high-throughput data; hybrid education and multidisciplinary teams; data storage and processing; data integration and interpretation; and individual and global economic relevance. This review provides an update of important developments in the analysis of big data and forward strategies to accelerate the global transition to personalized medicine.
Affiliation(s)
- Akram Alyass
- Department of Clinical Epidemiology and Biostatistics, McMaster University, 1280 Main Street West, Hamilton, ON, Canada.
| | - Michelle Turcotte
- Department of Clinical Epidemiology and Biostatistics, McMaster University, 1280 Main Street West, Hamilton, ON, Canada.
| | - David Meyre
- Department of Clinical Epidemiology and Biostatistics, McMaster University, 1280 Main Street West, Hamilton, ON, Canada.
- Department of Pathology and Molecular Medicine, McMaster University, 1280 Main Street West, Hamilton, ON, Canada.
| |
|
30
|
Yoo S, Takikawa S, Geraghty P, Argmann C, Campbell J, Lin L, Huang T, Tu Z, Feronjy R, Spira A, Schadt EE, Powell CA, Zhu J. Integrative analysis of DNA methylation and gene expression data identifies EPAS1 as a key regulator of COPD. PLoS Genet 2015; 11:e1004898. [PMID: 25569234 PMCID: PMC4287352 DOI: 10.1371/journal.pgen.1004898] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.1] [Received: 04/21/2014] [Accepted: 11/17/2014] [Indexed: 01/11/2023]
Abstract
Chronic Obstructive Pulmonary Disease (COPD) is a complex disease. Genetic, epigenetic, and environmental factors are known to contribute to COPD risk and disease progression. Therefore we developed a systematic approach to identify key regulators of COPD that integrates genome-wide DNA methylation, gene expression, and phenotype data in lung tissue from COPD and control samples. Our integrative analysis identified 126 key regulators of COPD. We identified EPAS1 as the only key regulator whose downstream genes significantly overlapped with multiple gene sets associated with COPD disease severity. EPAS1 is distinct in comparison with other key regulators in terms of methylation profile and downstream target genes. Genes predicted to be regulated by EPAS1 were enriched for biological processes including signaling, cell communications, and system development. We confirmed that EPAS1 protein levels are lower in human COPD lung tissue compared to non-disease controls and that Epas1 gene expression is reduced in mice chronically exposed to cigarette smoke. As EPAS1 downstream genes were significantly enriched for hypoxia responsive genes in endothelial cells, we tested EPAS1 function in human endothelial cells. EPAS1 knockdown by siRNA in endothelial cells impacted genes that significantly overlapped with EPAS1 downstream genes in lung tissue including hypoxia responsive genes, and genes associated with emphysema severity. Our first integrative analysis of genome-wide DNA methylation and gene expression profiles illustrates that not only does DNA methylation play a ‘causal’ role in the molecular pathophysiology of COPD, but it can be leveraged to directly identify novel key mediators of this pathophysiology.
Chronic Obstructive Pulmonary Disease (COPD) is a common lung disease. It is the fourth leading cause of death in the world and is expected to be the third by 2020. COPD is a heterogeneous and complex disease consisting of obstruction in the small airways, emphysema, and chronic bronchitis. COPD is generally caused by exposure to noxious particles or gases, most commonly from cigarette smoking. However, only 20–25% of smokers develop clinically significant airflow obstruction. Smoking is known to cause epigenetic changes in lung tissues. Thus, genetic and epigenetic factors and their interactions with environmental factors play an important role in COPD pathogenesis and progression. Currently, there are no therapeutics that can reverse COPD progression. In order to identify new targets that may lead to the development of therapeutics for curing COPD, we developed a systematic approach to identify key regulators of COPD that integrates genome-wide DNA methylation, gene expression, and phenotype data in lung tissue from COPD and control samples. Our integrative analysis identified 126 key regulators of COPD. We identified EPAS1 as the only key regulator whose downstream genes significantly overlapped with multiple gene sets associated with COPD disease severity.
Affiliation(s)
- Seungyeul Yoo
  - Institute of Genomics and Multiscale Biology, Mount Sinai School of Medicine, New York, New York, United States of America
  - Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York, New York, United States of America
- Sachiko Takikawa
  - Division of Pulmonary, Critical Care and Sleep Medicine, Mount Sinai School of Medicine, New York, New York, United States of America
- Patrick Geraghty
  - Department of Medicine, St. Luke's Roosevelt Medical Center, Mount Sinai School of Medicine, New York, New York, United States of America
- Carmen Argmann
  - Institute of Genomics and Multiscale Biology, Mount Sinai School of Medicine, New York, New York, United States of America
  - Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York, New York, United States of America
- Joshua Campbell
  - Division of Computational Biomedicine, Department of Medicine, Boston University School of Medicine, Boston, Massachusetts, United States of America
- Luan Lin
  - Institute of Genomics and Multiscale Biology, Mount Sinai School of Medicine, New York, New York, United States of America
  - Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York, New York, United States of America
- Tao Huang
  - Institute of Genomics and Multiscale Biology, Mount Sinai School of Medicine, New York, New York, United States of America
  - Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York, New York, United States of America
- Zhidong Tu
  - Institute of Genomics and Multiscale Biology, Mount Sinai School of Medicine, New York, New York, United States of America
  - Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York, New York, United States of America
- Robert Feronjy
  - Department of Medicine, St. Luke's Roosevelt Medical Center, Mount Sinai School of Medicine, New York, New York, United States of America
- Avrum Spira
  - Division of Computational Biomedicine, Department of Medicine, Boston University School of Medicine, Boston, Massachusetts, United States of America
- Eric E. Schadt
  - Institute of Genomics and Multiscale Biology, Mount Sinai School of Medicine, New York, New York, United States of America
  - Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York, New York, United States of America
- Charles A. Powell
  - Division of Pulmonary, Critical Care and Sleep Medicine, Mount Sinai School of Medicine, New York, New York, United States of America
- Jun Zhu
  - Institute of Genomics and Multiscale Biology, Mount Sinai School of Medicine, New York, New York, United States of America
  - Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York, New York, United States of America