1
Sarsani V, Aldikacti B, Zhao T, He S, Chien P, Flaherty P. Discovering Genetic Modulators of the Protein Homeostasis System through Multilevel Analysis. bioRxiv 2024:2024.02.26.582154. [PMID: 38464212 PMCID: PMC10925187 DOI: 10.1101/2024.02.26.582154] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Every protein progresses through a natural lifecycle from birth to maturation to death; this process is coordinated by the protein homeostasis system. Environmental or physiological conditions trigger pathways that maintain the homeostasis of the proteome. An open question is how these pathways are modulated to respond to the many stresses that an organism encounters during its lifetime. To address this question, we tested how the fitness landscape changes in response to environmental and genetic perturbations using directed and massively parallel transposon mutagenesis in Caulobacter crescentus. We developed a general computational pipeline for the analysis of gene-by-environment interactions in transposon mutagenesis experiments. This pipeline uses a combination of general linear models (GLMs), statistical knockoffs, and a nonparametric Bayesian statistical model to identify essential genetic network components that are shared across environmental perturbations. This analysis allows us to quantify the similarity of proteotoxic environmental perturbations from the perspective of the fitness landscape. We find that essential genes vary more by genetic background than by environmental conditions, with limited overlap among mutant strains targeting different facets of the protein homeostasis system. We also identified 146 unique fitness determinants across different strains, with 19 genes common to at least two strains, showing varying resilience to proteotoxic stresses. Experiments exposing cells to a combination of genetic perturbations and dual environmental stressors show that perturbations that are quantitatively dissimilar from the perspective of the fitness landscape are likely to have a synergistic effect on the growth defect.
Affiliation(s)
- Vishal Sarsani
  - Department of Mathematics and Statistics, University of Massachusetts Amherst, Amherst, 01002, Massachusetts, USA
- Berent Aldikacti
  - Department of Biochemistry and Molecular Biology, University of Massachusetts Amherst, Amherst, 01002, Massachusetts, USA
- Tingting Zhao
  - Department of Information Systems and Analytics, Bryant University, Smithfield, 02917, RI, USA
  - School of Health and Behavioral Sciences, Bryant University, Smithfield, 02917, RI, USA
- Shai He
  - Department of Mathematics and Statistics, University of Massachusetts Amherst, Amherst, 01002, Massachusetts, USA
- Peter Chien
  - Department of Biochemistry and Molecular Biology, University of Massachusetts Amherst, Amherst, 01002, Massachusetts, USA
- Patrick Flaherty
  - Department of Mathematics and Statistics, University of Massachusetts Amherst, Amherst, 01002, Massachusetts, USA
2
Sarsani V, Brotman SM, Xianyong Y, Fernandes Silva L, Laakso M, Spracklen CN. A cross-ancestry genome-wide meta-analysis, fine-mapping, and gene prioritization approach to characterize the genetic architecture of adiponectin. HGG Adv 2024; 5:100252. [PMID: 37859345 PMCID: PMC10652123 DOI: 10.1016/j.xhgg.2023.100252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Revised: 10/16/2023] [Accepted: 10/16/2023] [Indexed: 10/21/2023] Open
Abstract
Previous genome-wide association studies (GWASs) for adiponectin, a complex trait linked to type 2 diabetes and obesity, identified >20 associated loci. However, most loci were identified in populations of European ancestry, and many of the target genes underlying the associations remain unknown. We conducted a cross-ancestry adiponectin GWAS meta-analysis in ≤46,434 individuals from the Metabolic Syndrome in Men (METSIM) cohort and the ADIPOGen and AGEN consortiums. We combined study-specific association summary statistics using a fixed-effects, inverse variance-weighted approach. We identified 22 loci associated with adiponectin (p < 5×10⁻⁸), including 15 known and seven previously unreported loci. Among individuals of European ancestry, Genome-wide Complex Traits Analysis joint conditional analysis (GCTA-COJO) identified 14 additional distinct signals at the ADIPOQ, CDH13, HCAR1, and ZNF664 loci. Leveraging the cross-ancestry data, FINEMAP + SuSiE identified 45 causal variants (PP > 0.9), which also exhibited potential pleiotropy for cardiometabolic traits. To prioritize target genes at associated loci, we propose a combinatorial likelihood scoring formalism (Gene Priority Score [GPScore]) based on measures derived from 11 gene prioritization strategies and the physical distance to the transcription start site. With GPScore, we prioritize the 30 most probable target genes underlying the adiponectin-associated variants in the cross-ancestry analysis, including well-known causal genes (e.g., ADIPOQ, CDH13) and additional genes (e.g., CSF1, RGS17). Functional association networks revealed complex interactions of prioritized genes, their functionally connected genes, and their underlying pathways centered around insulin and adiponectin signaling, indicating an essential role in regulating energy balance in the body, inflammation, coagulation, fibrinolysis, insulin resistance, and diabetes. Overall, our analyses identify and characterize adiponectin association signals and inform experimental interrogation of target genes for adiponectin.
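As an illustration of the fixed-effects, inverse variance-weighted combination mentioned in the abstract above, the following is a minimal sketch of the textbook formula for a single variant. It is not the authors' pipeline: the function name `ivw_fixed_effects` and the example numbers are hypothetical, and real GWAS meta-analysis software handles allele alignment, genomic control, and many variants at once.

```python
import math

def ivw_fixed_effects(betas, ses):
    """Combine per-study effect estimates for one variant.

    betas: per-study effect sizes (hypothetical inputs)
    ses:   per-study standard errors, all > 0
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and equal length")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se ** 2) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p

# Two made-up studies with equal precision: the pooled estimate is their mean.
beta, se, z, p = ivw_fixed_effects([0.10, 0.20], [0.1, 0.1])
print(beta, se, z, p)
```

Studies with smaller standard errors receive larger weights, so the pooled estimate is pulled toward the most precise cohorts; a variant would then be compared against the genome-wide threshold quoted in the abstract.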
Affiliation(s)
- Vishal Sarsani
  - Department of Mathematics and Statistics, University of Massachusetts Amherst, Amherst, MA, USA
- Sarah M Brotman
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Yin Xianyong
  - Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
- Lillian Fernandes Silva
  - Institute of Clinical Medicine, Internal Medicine, University of Eastern Finland, Kuopio, Finland
- Markku Laakso
  - Institute of Clinical Medicine, Internal Medicine, University of Eastern Finland, Kuopio, Finland
- Cassandra N Spracklen
  - Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA, USA
3
Sarsani V, Aldikacti B, He S, Zeinert R, Chien P, Flaherty P. Model-based identification of conditionally-essential genes from transposon-insertion sequencing data. PLoS Comput Biol 2022; 18:e1009273. [PMID: 35255084 PMCID: PMC8929702 DOI: 10.1371/journal.pcbi.1009273] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/30/2021] [Revised: 03/17/2022] [Accepted: 02/09/2022] [Indexed: 12/13/2022] Open
Abstract
The understanding of bacterial gene function has been greatly enhanced by recent advancements in the deep sequencing of microbial genomes. Transposon insertion sequencing methods combine next-generation sequencing techniques with transposon mutagenesis to explore the essentiality of genes under different environmental conditions. We propose a model-based method that uses regularized negative binomial regression to estimate the change in transposon insertions attributable to gene-environment changes in this genetic interaction study without transformations or uniform normalization. An empirical Bayes model for estimating the local false discovery rate combines unique and total count information to test for genes that show a statistically significant change in transposon counts. When applied to RB-TnSeq (randomized barcode transposon sequencing) and Tn-seq (transposon sequencing) libraries made in strains of Caulobacter crescentus, using both total and unique count data, the model identified a set of conditionally beneficial or conditionally detrimental genes for each target condition, shedding light on their functions and roles during various stress conditions.
Affiliation(s)
- Vishal Sarsani
  - Department of Mathematics and Statistics, University of Massachusetts Amherst, Amherst, Massachusetts, United States of America
- Berent Aldikacti
  - Department of Biochemistry and Molecular Biology, University of Massachusetts Amherst, Amherst, Massachusetts, United States of America
- Shai He
  - Department of Mathematics and Statistics, University of Massachusetts Amherst, Amherst, Massachusetts, United States of America
- Rilee Zeinert
  - Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, Maryland, United States of America
- Peter Chien
  - Department of Biochemistry and Molecular Biology, University of Massachusetts Amherst, Amherst, Massachusetts, United States of America
- Patrick Flaherty
  - Department of Mathematics and Statistics, University of Massachusetts Amherst, Amherst, Massachusetts, United States of America
4
He S, Schein A, Sarsani V, Flaherty P. A BAYESIAN NONPARAMETRIC MODEL FOR INFERRING SUBCLONAL POPULATIONS FROM STRUCTURED DNA SEQUENCING DATA. Ann Appl Stat 2021; 15:925-951. [PMID: 34262633 DOI: 10.1214/20-aoas1434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Abstract
There are distinguishing features or "hallmarks" of cancer that are found across tumors, individuals, and types of cancer, and these hallmarks can be driven by specific genetic mutations. Yet, within a single tumor there is often extensive genetic heterogeneity as evidenced by single-cell and bulk DNA sequencing data. The goal of this work is to jointly infer the underlying genotypes of tumor subpopulations and the distribution of those subpopulations in individual tumors by integrating single-cell and bulk sequencing data. Understanding the genetic composition of the tumor at the time of treatment is important in the personalized design of targeted therapeutic combinations and monitoring for possible recurrence after treatment. We propose a hierarchical Dirichlet process mixture model that incorporates the correlation structure induced by a structured sampling arrangement and we show that this model improves the quality of inference. We develop a representation of the hierarchical Dirichlet process prior as a Gamma-Poisson hierarchy and we use this representation to derive a fast Gibbs sampling inference algorithm using the augment-and-marginalize method. Experiments with simulation data show that our model outperforms standard numerical and statistical methods for decomposing admixed count data. Analysis of a real acute lymphoblastic leukemia sequencing dataset shows that our model improves upon state-of-the-art bioinformatic methods. An interpretation of the results of our model on this real dataset reveals co-mutated loci across samples.
Affiliation(s)
- Shai He
  - Department of Mathematics and Statistics, University of Massachusetts Amherst
- Vishal Sarsani
  - Department of Mathematics and Statistics, University of Massachusetts Amherst
- Patrick Flaherty
  - Department of Mathematics and Statistics, University of Massachusetts Amherst
5
Rosains J, Srivastava A, Woo W, Sarsani V, Zhao Z, Noorbakhsh J, Abaan OD, Frech C, DiGiovanna J, Jeon R, Neuhauser S, Robinson P, Evrard YA, Bult C, Moscow JA, Davis-Dusenbery B, Chuang JH. Abstract 1074: The PDX Data Commons and Coordinating Center (PDCCC) for PDXNet in support of preclinical research. Cancer Res 2019. [DOI: 10.1158/1538-7445.am2019-1074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Patient-Derived Xenografts (PDX) are proven models to study novel drugs or drug combinations and test hypotheses in preclinical studies. The overarching goal of the PDXNet is to coordinate the development of appropriate PDX models and methods for preclinical drug testing to advance CTEP clinical development of new cancer agents.
The PDXNet is an NCI-funded consortium of six PDX Development and Trial Centers (PDTCs) and one PDCCC. Four PDTCs are responsible for developing PDXs and executing specific preclinical trials focused on cancer types including breast cancer, melanoma, and lung cancer. The other two recently awarded centers are specifically focused on minority PDX models and preclinical trials. Besides the PDTCs, the NCI Patient-Derived Models Repository (PDMR) at the Frederick National Laboratory for Cancer Research (FNLCR) is also providing models and data to the PDXNet. The PDCCC is responsible for coordination and developing standards for PDX generation as well as data analysis and metadata harmonization. The PDX Data Commons is built on top of existing NCI resources, leveraging the Cancer Genomics Cloud maintained by Seven Bridges Genomics, where PDXNet data is co-located with TCGA and other large-scale datasets. The PDCCC is co-led by experts from the Jackson Laboratory, providing scientific leadership in xenograft methods and cancer biology to ensure the promulgation of standards that are well-suited for the PDX community.
A new portal has been set up at https://www.pdxnetwork.org/ to serve as the point of access to PDXNet resources. In addition, we established ongoing network-wide meetings to facilitate knowledge exchange, held PDXNet portal trainings, and set up working groups to tackle specific challenges. For instance, the Data Ontology working group has been working towards building a common data ontology model specifically for PDX datasets. We are in the process of annotating the very first dataset using this new ontology on the PDXNet portal. Also, the Workflows working group has been working on building and benchmarking various RNA-seq and whole exome sequencing analysis workflows to standardize data processing between PDXNet grantees and create a harmonized PDXNet dataset. These PDX models and the accompanying data will be opened to the community for data mining and/or preclinical research.
The PDXNet is a strong step toward building a consensus around PDX models, so that the power for discovery can be expanded by making multi-institutional PDX cohorts a reality. As the coordination center, we are also working closely with the EuroPDX project to exchange standards and knowledge to support the PDX community with a set of standards going forward. The PDCCC is a central part of this process to systematically capture and analyze the variables most influential to PDX models and share protocols and tools to make PDXs an interchangeable research currency for preclinical discovery.
Citation Format: Jacqueline Rosains, Anuj Srivastava, Wingyi Woo, Vishal Sarsani, ZiMing Zhao, Javad Noorbakhsh, Ogan D. Abaan, Christian Frech, Jack DiGiovanna, Ryan Jeon, Steve Neuhauser, Peter Robinson, Yvonne A. Evrard, Carol Bult, Jeffrey A. Moscow, Brandi Davis-Dusenbery, Jeffrey H. Chuang. The PDX Data Commons and Coordinating Center (PDCCC) for PDXNet in support of preclinical research [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2019; 2019 Mar 29-Apr 3; Atlanta, GA. Philadelphia (PA): AACR; Cancer Res 2019;79(13 Suppl):Abstract nr 1074.
Affiliation(s)
- Ryan Jeon
  - Seven Bridges Genomics, Cambridge, MA
- Yvonne A. Evrard
  - Frederick National Laboratory for Cancer Research, Frederick, MD
6
Sethi A, Srivastava A, Woo X, Sarsani V, Zhao Z, Noorbakhsh J, French C, DiGiovanna J, Abaan OD, Neuhauser S, Robinson P, Evrard YA, Bult CJ, Moscow JA, Davis-Dusenbery B, Chuang JH. Abstract 1029: The PDX Data Commons and Coordinating Center (PDCCC) for PDXNet. Cancer Res 2018. [DOI: 10.1158/1538-7445.am2018-1029] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/16/2022]
Abstract
Patient-Derived Xenografts (PDX) are powerful models to study tumor drug response in the context of personalized medicine. In the PDX model setting, by virtue of expanding the patient's tumor sample, multiple drugs or drug combinations can be tested rapidly and without ethical limitations. However, there are major issues around standards that need to be addressed to make these models widely accessible and usable.
The overarching goal of the PDXNet is to coordinate the development of appropriate PDX models and methods for preclinical drug testing to advance CTEP clinical development of new cancer agents. In an effort to standardize protocols for PDX generation as well as data analysis and metadata harmonization, we are building a data storage, sharing, and analysis platform that harmonizes PDXNet data with other large datasets and analysis workflows. The PDX Data Commons is built on top of existing NCI resources, leveraging the Cancer Genomics Cloud maintained by Seven Bridges Genomics, where PDXNet data is co-located with TCGA and other large-scale datasets. The PDCCC is co-led by experts from The Jackson Laboratory, providing scientific leadership in xenograft methods and cancer biology to ensure the promulgation of standards that are well-suited for the PDX community. In addition, the PDCCC is responsible for establishing studies to identify best-practices for PDX data analysis and metadata schemas. The data collected as part of the PDXNet is currently stored on the PDXNet portal that has a query interface for identifying models for pre-clinical trials. Simultaneously, we administer training activities and research pilots to build synergies within the PDXNet, enhancing the ability of the PDXNet to develop clinical trials from PDX studies.
In PDXNet, besides the PDCCC, there are 4 PDX Development and Trial Centers (PDTCs) responsible for executing specific pre-clinical trials focused on cancer types including breast cancer, melanoma, and lung cancer. Data generated by the PDTCs will be hosted by the PDCCC, and metadata will be collected based on schemas developed by the network for systematic ontological analysis. These PDX models, in coordination with the NCI Patient-Derived Models Repository (PDMR) at the Frederick National Laboratory for Cancer Research (FNLCR), will be shared with the broader community. In addition, PDTCs will collaborate with non-PDXNet investigators for PDX studies through an administrative supplement program supported by the NCI.
The PDXNet is a strong step toward building a consensus around PDX models, so that the power for discovery can be expanded by making multi-institutional PDX cohorts a reality. The PDCCC is a central part of this process to systematically capture and analyze the variables most influential to PDX models and share protocols and tools to make PDXs an interchangeable research currency for pre-clinical discovery.
Citation Format: Anurag Sethi, Anuj Srivastava, Xingyi Woo, Vishal Sarsani, Ziming Zhao, Javad Noorbakhsh, Christian French, Jack DiGiovanna, Ogan D. Abaan, Steve Neuhauser, Peter Robinson, Yvonne A. Evrard, Carol J. Bult, Jeffrey A. Moscow, Brandi Davis-Dusenbery, Jeffrey H. Chuang. The PDX Data Commons and Coordinating Center (PDCCC) for PDXNet [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2018; 2018 Apr 14-18; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2018;78(13 Suppl):Abstract nr 1029.
Affiliation(s)
- Anuj Srivastava
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT
- Ziming Zhao
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT
- Peter Robinson
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT
- Yvonne A. Evrard
  - Frederick National Laboratory for Cancer Research, Frederick, MD