51
Sundararaman N, Bhat A, Venkatraman V, Binek A, Dwight Z, Ariyasinghe NR, Escopete S, Joung SY, Cheng S, Parker SJ, Fert-Bober J, Van Eyk JE. BIRCH: An Automated Workflow for Evaluation, Correction, and Visualization of Batch Effect in Bottom-Up Mass Spectrometry-Based Proteomics Data. J Proteome Res 2023; 22:471-481. [PMID: 36695565 DOI: 10.1021/acs.jproteome.2c00671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
Abstract
Recent surges in large-scale mass spectrometry (MS)-based proteomics studies demand a concurrent rise in methods to facilitate reliable and reproducible data analysis. Quantification of proteins in MS analysis can be affected by variations in technical factors such as sample preparation and data acquisition conditions, leading to batch effects that add noise to the data set. This may in turn affect the validity of any biological conclusions derived from the data. Here we present Batch-effect Identification, Representation, and Correction of Heterogeneous data (BIRCH), a workflow for analysis and correction of batch effects through an automated, versatile, and easy-to-use web-based tool with the goal of eliminating technical variation. BIRCH also supports diagnosis of the data to check for the presence of batch effects and the feasibility of batch correction, as well as imputation to deal with missing values in the data set. To illustrate the relevance of the tool, we explore two case studies, an iPSC-derived cell study and a Covid vaccine study, to show different context-specific use cases. Ultimately, this tool can be used as a powerful approach for eliminating technical bias while retaining biological bias, toward understanding disease mechanisms and potential therapeutics.
Affiliation(s)
- Niveda Sundararaman
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States.,Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Archana Bhat
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States.,Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Vidya Venkatraman
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States.,Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Aleksandra Binek
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States.,Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Zachary Dwight
- Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Nethika R Ariyasinghe
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States.,Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Sean Escopete
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States.,Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Sandy Y Joung
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Susan Cheng
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Sarah J Parker
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States.,Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Justyna Fert-Bober
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States.,Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Jennifer E Van Eyk
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States.,Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
52
De La Peña R, Hodgson H, Liu JCT, Stephenson MJ, Martin AC, Owen C, Harkess A, Leebens-Mack J, Jimenez LE, Osbourn A, Sattely ES. Complex scaffold remodeling in plant triterpene biosynthesis. Science 2023; 379:361-368. [PMID: 36701471 PMCID: PMC9976607 DOI: 10.1126/science.adf1017] [Citation(s) in RCA: 26] [Impact Index Per Article: 26.0] [Indexed: 01/27/2023]
Abstract
Triterpenes with complex scaffold modifications are widespread in the plant kingdom. Limonoids are an exemplary family that are responsible for the bitter taste in citrus (e.g., limonin) and the active constituents of neem oil, a widely used bioinsecticide (e.g., azadirachtin). Despite the commercial value of limonoids, a complete biosynthetic route has not been described. We report the discovery of 22 enzymes, including a pair of neofunctionalized sterol isomerases, that catalyze 12 distinct reactions in the total biosynthesis of kihadalactone A and azadirone, products that bear the signature limonoid furan. These results enable access to valuable limonoids and provide a template for discovery and reconstitution of triterpene biosynthetic pathways in plants that require multiple skeletal rearrangements and oxidations.
Affiliation(s)
- Ricardo De La Peña
- Department of Chemical Engineering, Stanford University, Stanford, CA 94305, USA
| | - Hannah Hodgson
- Department of Biochemistry and Metabolism, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
| | | | - Michael J Stephenson
- School of Chemistry, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, UK
| | - Azahara C Martin
- Department of Crop Genetics, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
| | - Charlotte Owen
- Department of Biochemistry and Metabolism, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
| | - Alex Harkess
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
| | - Jim Leebens-Mack
- Department of Plant Biology, 4505 Miller Plant Sciences, University of Georgia, Athens, GA 30602, USA
| | - Luis E Jimenez
- Department of Chemical Engineering, Stanford University, Stanford, CA 94305, USA
| | - Anne Osbourn
- Department of Biochemistry and Metabolism, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
| | - Elizabeth S Sattely
- Department of Chemical Engineering, Stanford University, Stanford, CA 94305, USA.,Howard Hughes Medical Institute, Stanford University, Stanford, CA 94305, USA
53
Chen JW, Shrestha L, Green G, Leier A, Marquez-Lago TT. The hitchhikers' guide to RNA sequencing and functional analysis. Brief Bioinform 2023; 24:bbac529. [PMID: 36617463 PMCID: PMC9851315 DOI: 10.1093/bib/bbac529] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 07/11/2022] [Revised: 10/18/2022] [Accepted: 11/07/2022] [Indexed: 01/10/2023]
Abstract
DNA and RNA sequencing technologies have revolutionized biology and biomedical sciences, sequencing full genomes and transcriptomes at very high speeds and reasonably low costs. RNA sequencing (RNA-Seq) enables transcript identification and quantification, but once sequencing has concluded researchers can be easily overwhelmed with questions such as how to go from raw data to differential expression (DE), pathway analysis and interpretation. Several pipelines and procedures have been developed to this effect. Even though there is no unique way to perform RNA-Seq analysis, it usually follows these steps: 1) raw reads quality check, 2) alignment of reads to a reference genome, 3) aligned reads' summarization according to an annotation file, 4) DE analysis and 5) gene set analysis and/or functional enrichment analysis. Each step requires researchers to make decisions, and the wide variety of options and resulting large volumes of data often lead to interpretation challenges. There also seems to be insufficient guidance on how best to obtain relevant information and derive actionable knowledge from transcription experiments. In this paper, we explain RNA-Seq steps in detail and outline differences and similarities of different popular options, as well as advantages and disadvantages. We also discuss non-coding RNA analysis, multi-omics, meta-transcriptomics and the use of artificial intelligence methods complementing the arsenal of tools available to researchers. Lastly, we perform a complete analysis from raw reads to DE and functional enrichment analysis, visually illustrating how results are not absolute truths and how algorithmic decisions can greatly impact results and interpretation.
Affiliation(s)
- Jiung-Wen Chen
- Department of Biology, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Lisa Shrestha
- Department of Genetics, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
| | - George Green
- Department of Biology, University of Alabama at Birmingham, Birmingham, AL, USA
| | - André Leier
- Department of Genetics, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- Department of Cell, Developmental and Integrative Biology, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
| | - Tatiana T Marquez-Lago
- Department of Genetics, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- Department of Cell, Developmental and Integrative Biology, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- Department of Microbiology, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
54
Escorcia-Rodríguez JM, Gaytan-Nuñez E, Hernandez-Benitez EM, Zorro-Aranda A, Tello-Palencia MA, Freyre-González JA. Improving gene regulatory network inference and assessment: The importance of using network structure. Front Genet 2023; 14:1143382. [PMID: 36926589 PMCID: PMC10012345 DOI: 10.3389/fgene.2023.1143382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Accepted: 02/20/2023] [Indexed: 03/03/2023]
Abstract
Gene regulatory networks are graph models representing cellular transcription events. Networks are far from complete due to time and resource consumption for experimental validation and curation of the interactions. Previous assessments have shown the modest performance of the available network inference methods based on gene expression data. Here, we study several caveats on the inference of regulatory networks and methods assessment through the quality of the input data and gold standard, and the assessment approach with a focus on the global structure of the network. We used synthetic and biological data for the predictions and experimentally-validated biological networks as the gold standard (ground truth). Standard performance metrics and graph structural properties suggest that methods inferring co-expression networks should no longer be assessed equally with those inferring regulatory interactions. While methods inferring regulatory interactions perform better in global regulatory network inference than co-expression-based methods, the latter is better suited to infer function-specific regulons and co-regulation networks. When merging expression data, the size increase should outweigh the noise inclusion and graph structure should be considered when integrating the inferences. We conclude with guidelines to take advantage of inference methods and their assessment based on the applications and available expression datasets.
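The "standard performance metrics" mentioned in this abstract are typically precision and recall of predicted edges against the gold-standard network. The following is a minimal sketch of that comparison, not the authors' assessment code; the function name and the gene labels are hypothetical, and edges are assumed to be directed (regulator, target) pairs.

```python
def precision_recall(predicted_edges, gold_edges):
    """Compare predicted regulatory edges with a gold-standard edge set.

    Both arguments are iterables of (regulator, target) tuples.
    Returns (precision, recall); each is 0.0 when its denominator is empty.
    """
    predicted = set(predicted_edges)
    gold = set(gold_edges)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy networks, for illustration only.
predicted = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
gold = [("a", "b"), ("b", "c"), ("d", "a")]
precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Because edge direction matters here, a co-expression method that reports undirected pairs would be penalized by this metric, which is one reason the abstract argues the two method classes should not be assessed equally.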
Affiliation(s)
- Juan M Escorcia-Rodríguez
- Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
| | - Estefani Gaytan-Nuñez
- Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico.,Undergraduate Program in Genomic Sciences, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
| | - Ericka M Hernandez-Benitez
- Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico.,Undergraduate Program in Genomic Sciences, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
| | - Andrea Zorro-Aranda
- Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico.,Department of Chemical Engineering, Universidad de Antioquia, Medellín, Colombia
| | - Marco A Tello-Palencia
- Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico.,Undergraduate Program in Genomic Sciences, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
| | - Julio A Freyre-González
- Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
55
Hammond EN, Kates AE, Putman-Buehler N, Watson L, Godfrey JJ, Brys N, Deblois C, Steinberger AJ, Cox MS, Skarlupka JH, Haleem A, Bentz ML, Suen G, Safdar N. A quality improvement study on the relationship between intranasal povidone-iodine and anesthesia and the nasal microbiota of surgery patients. PLoS One 2022; 17:e0278699. [PMID: 36490265 PMCID: PMC9733847 DOI: 10.1371/journal.pone.0278699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Accepted: 11/22/2022] [Indexed: 12/13/2022]
Abstract
INTRODUCTION The composition of the nasal microbiota in surgical patients in the context of general anesthesia and nasal povidone-iodine decolonization is unknown. The purpose of this quality improvement study was to determine: (i) if general anesthesia is associated with changes in the nasal microbiota of surgery patients and (ii) if preoperative intranasal povidone-iodine decolonization is associated with changes in the nasal microbiota of surgery patients. MATERIALS AND METHODS One hundred and fifty-one ambulatory patients presenting for surgery were enrolled in a quality improvement study by convenience sampling. Pre- and post-surgery nasal samples were collected from patients in the no intranasal decolonization group (control group, n = 54). Pre-decolonization nasal samples were collected from the preoperative intranasal povidone-iodine decolonization group (povidone-iodine group, n = 97). Intranasal povidone-iodine was administered immediately prior to surgery and continued for 20 minutes before patients proceeded for surgery. Post-nasal samples were then collected. General anesthesia was administered to both groups. DNA from the samples was extracted for 16S rRNA sequencing on an Illumina MiSeq. RESULTS In the control group, there was no evidence of change in bacterial diversity between pre- and post-surgery samples. In the povidone-iodine group, nasal bacterial diversity was greater post-surgery than pre-surgery (Shannon's Diversity Index (P = 0.038), Chao's richness estimate (P = 0.02) and Inverse Simpson index (P = 0.027)). Among all the genera, only the relative abundance of the genus Staphylococcus trended towards a decrease in patients after application (FDR adjusted P = 0.06). Abundant genera common to both povidone-iodine and control groups included Staphylococcus, Bradyrhizobium, Corynebacterium, Dolosigranulum, Lactobacillus, and Moraxella. CONCLUSIONS We found general anesthesia was not associated with changes in the nasal microbiota. Povidone-iodine treatment was associated with increased nasal microbial diversity and a trend towards decreased abundance of Staphylococcus. Future studies should examine the nasal microbiota structure and function longitudinally in surgical patients receiving intranasal povidone-iodine.
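For readers unfamiliar with the three diversity measures named in the results, the sketch below computes them from a vector of per-taxon read counts. It is an illustration only, not the study's analysis pipeline: the function names are hypothetical, and "Chao's richness estimate" is assumed here to mean the classic Chao1 estimator.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def inverse_simpson(counts):
    """Inverse Simpson index 1 / sum(p_i ** 2)."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)


def chao1(counts):
    """Chao1 richness: observed taxa plus a correction from singletons (F1)
    and doubletons (F2). Uses the bias-corrected form when F2 is zero."""
    observed = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    if f2 > 0:
        return observed + f1 * f1 / (2.0 * f2)
    return observed + f1 * (f1 - 1) / 2.0


# Hypothetical read counts for one nasal sample (one entry per genus).
sample = [10, 10, 10, 10]
print(shannon_index(sample), inverse_simpson(sample), chao1(sample))
```

For a perfectly even community of four taxa, the Shannon index equals ln(4) and the inverse Simpson index equals 4, which makes a convenient sanity check.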
Affiliation(s)
- Eric N. Hammond
- Institute for Clinical and Translational Research, University of Wisconsin-Madison, Madison, WI, United States of America
- Division of Infectious Disease, Department of Medicine, University of Wisconsin School of Medicine and Public Health, Madison, WI, United States of America
| | - Ashley E. Kates
- Division of Infectious Disease, Department of Medicine, University of Wisconsin School of Medicine and Public Health, Madison, WI, United States of America
- William S. Middleton Memorial Veterans Hospital, Madison, WI, United States of America
| | - Nathan Putman-Buehler
- Department of Biochemistry, College of Agricultural and Life Sciences, University of Wisconsin-Madison, Madison, WI, United States of America
| | - Lauren Watson
- SSM Health, St. Mary’s Hospital, Madison, WI, United States of America
| | - Jared J. Godfrey
- Division of Infectious Disease, Department of Medicine, University of Wisconsin School of Medicine and Public Health, Madison, WI, United States of America
- William S. Middleton Memorial Veterans Hospital, Madison, WI, United States of America
| | - Nicole Brys
- Waisman Center, University of Wisconsin-Madison, Madison, WI, United States of America
| | - Courtney Deblois
- Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, United States of America
- Microbiology Doctoral Training Program, University of Wisconsin-Madison, Madison, WI, United States of America
| | - Andrew J. Steinberger
- Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, United States of America
- Microbiology Doctoral Training Program, University of Wisconsin-Madison, Madison, WI, United States of America
| | - Madison S. Cox
- Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, United States of America
- Microbiology Doctoral Training Program, University of Wisconsin-Madison, Madison, WI, United States of America
| | - Joseph H. Skarlupka
- Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, United States of America
- Microbiology Doctoral Training Program, University of Wisconsin-Madison, Madison, WI, United States of America
| | - Ambar Haleem
- Division of Infectious Disease, Department of Medicine, University of Wisconsin School of Medicine and Public Health, Madison, WI, United States of America
| | - Michael L. Bentz
- Division of Plastic and Reconstructive Surgery and Urology, Department of Surgery, University of Wisconsin School of Medicine and Public Health, Madison, WI, United States of America
| | - Garret Suen
- Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, United States of America
| | - Nasia Safdar
- Division of Infectious Disease, Department of Medicine, University of Wisconsin School of Medicine and Public Health, Madison, WI, United States of America
- William S. Middleton Memorial Veterans Hospital, Madison, WI, United States of America
56
Transcriptomic data analysis of melanocytes and melanoma cell lines of LAT transporter genes for precise medicine. BIO-ALGORITHMS AND MED-SYSTEMS 2022. [DOI: 10.2478/bioal-2022-0086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/25/2022]
Abstract
Background: Boron Neutron Capture Therapy (BNCT) is a two-step treatment that can be used in some types of cancers. It involves administering a compound containing boron atoms to the patient and irradiating the affected area of the body with a neutron beam. The success of the therapy depends mainly on the delivery of the boron isotope (10B) to the tumor using an appropriate boron carrier. One of the boron carriers used is boronophenylalanine (BPA). Therefore, in research on the use of boron carriers, it is also important to know the mechanisms of their uptake by cells. Aim: To study the expression of LAT family genes, which are responsible for the transport of BPA into cells, in two melanoma cell lines (highly melanotic WM115 and low melanotic WM266-4) and in melanocytes (HEMa-Lp). Methods: To normalize data from the transcriptomic analysis, the ratio of the median method was used. This allowed the samples to be compared with each other. Comparison metrics included log-fold change (LFC) values. A heatmap of LFC values and a cluster map were created. These graphs show the similarities and differences between the samples. Results: Transcriptomic data show that in melanocytes, LFC for SLC7A5 (LAT1) and SLC3A2 (4F2hc) was higher than in melanoma cell lines, which corresponded with their melanin content. Conclusion: Our results indicate overexpression of BPA transporter genes in normal cells (melanocytes), which may suggest a higher level of these proteins in melanocytes than in less melanotic melanoma. Therefore, for BNCT, the use of BPA as the 10B carrier will require additional qualifying tests of amino acid transporter expression for patients and specific tumors to develop a personalized BNCT.
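The normalization and comparison steps in the Methods can be sketched as follows. This is an illustration under an explicit assumption: "ratio of the median" is read here as the median-of-ratios size-factor normalization popularized by DESeq, which the abstract does not confirm. The function names, the pseudocount, and the toy counts are all hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts maps sample name -> list of raw gene counts (same gene order).
    Genes with a zero count in any sample are skipped when building the
    per-gene geometric-mean reference.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])
    usable, log_reference = [], []
    for g in range(n_genes):
        values = [counts[s][g] for s in samples]
        if all(v > 0 for v in values):
            usable.append(g)
            log_reference.append(sum(math.log(v) for v in values) / len(values))
    factors = {}
    for s in samples:
        log_ratios = [math.log(counts[s][g]) - ref
                      for g, ref in zip(usable, log_reference)]
        factors[s] = math.exp(median(log_ratios))
    return factors


def normalize(counts):
    """Divide each sample's counts by its size factor."""
    factors = size_factors(counts)
    return {s: [v / factors[s] for v in values] for s, values in counts.items()}


def log2_fold_change(a, b, pseudocount=1.0):
    """log2 ratio of two normalized values; the pseudocount avoids log(0)."""
    return math.log2((a + pseudocount) / (b + pseudocount))


# Hypothetical counts: sample B was sequenced at twice the depth of sample A.
raw = {"A": [10, 20, 30], "B": [20, 40, 60]}
print(normalize(raw))
```

After normalization the two toy samples have identical values, so any remaining difference between real samples would reflect expression rather than sequencing depth.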
57
Moreno M, Vilaça R, Ferreira PG. Scalable transcriptomics analysis with Dask: applications in data science and machine learning. BMC Bioinformatics 2022; 23:514. [PMID: 36451115 PMCID: PMC9710082 DOI: 10.1186/s12859-022-05065-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2022] [Accepted: 11/16/2022] [Indexed: 12/02/2022]
Abstract
BACKGROUND Gene expression studies are an important tool in biological and biomedical research. The signal carried in expression profiles helps derive signatures for the prediction, diagnosis and prognosis of different diseases. Data science and specifically machine learning have many applications in gene expression analysis. However, as the dimensionality of genomics datasets grows, scalable solutions become necessary. METHODS In this paper we review the main steps and bottlenecks in machine learning pipelines, as well as the main concepts behind scalable data science including those of concurrent and parallel programming. We discuss the benefits of the Dask framework and how it can be integrated with the Python scientific environment to perform data analysis in computational biology and bioinformatics. RESULTS This review illustrates the role of Dask for boosting data science applications in different case studies. Detailed documentation and code on these procedures is made available at https://github.com/martaccmoreno/gexp-ml-dask . CONCLUSION By showing when and how Dask can be used in transcriptomics analysis, this review will serve as an entry point to help genomic data scientists develop more scalable data analysis procedures.
Affiliation(s)
- Marta Moreno
- Department of Computer Science, Faculty of Sciences, University of Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal; Laboratory of Artificial Intelligence and Decision Support, INESC TEC, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal
| | - Ricardo Vilaça
- High-Assurance Software Laboratory, INESC TEC, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal; Department of Informatics, Minho Advanced Computing Center, University of Minho, Gualtar, 4710-070 Braga, Portugal
| | - Pedro G. Ferreira
- Department of Computer Science, Faculty of Sciences, University of Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal; Laboratory of Artificial Intelligence and Decision Support, INESC TEC, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal; Institute of Molecular Pathology and Immunology of the University of Porto, Institute for Research and Innovation in Health (i3s), R. Alfredo Allen 208, 4200-135 Porto, Portugal
58
Li Q, Newaz K, Milenković T. Towards future directions in data-integrative supervised prediction of human aging-related genes. BIOINFORMATICS ADVANCES 2022; 2:vbac081. [PMID: 36699345 PMCID: PMC9710570 DOI: 10.1093/bioadv/vbac081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Revised: 09/23/2022] [Accepted: 10/31/2022] [Indexed: 11/13/2022]
Abstract
Motivation Identification of human genes involved in the aging process is critical due to the incidence of many diseases with age. A state-of-the-art approach for this purpose infers a weighted dynamic aging-specific subnetwork by mapping gene expression (GE) levels at different ages onto the protein-protein interaction network (PPIN). Then, it analyzes this subnetwork in a supervised manner by training a predictive model to learn how network topologies of known aging- versus non-aging-related genes change across ages. Finally, it uses the trained model to predict novel aging-related gene candidates. However, the best current subnetwork resulting from this approach still yields suboptimal prediction accuracy. This could be because it was inferred using outdated GE and PPIN data. Here, we evaluate whether analyzing a weighted dynamic aging-specific subnetwork inferred from newer GE and PPIN data improves prediction accuracy upon analyzing the best current subnetwork inferred from outdated data. Results Unexpectedly, we find that not to be the case. To understand this, we perform aging-related pathway and Gene Ontology term enrichment analyses. We find that the suboptimal prediction accuracy, regardless of which GE or PPIN data is used, may be caused by the current knowledge about which genes are aging-related being incomplete, or by the current methods for inferring or analyzing an aging-specific subnetwork being unable to capture all of the aging-related knowledge. These findings can potentially guide future directions towards improving supervised prediction of aging-related genes via -omics data integration. Availability and implementation All data and code are available at zenodo, DOI: 10.5281/zenodo.6995045. Supplementary information Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Qi Li
- Department of Computer Science and Engineering, Lucy Family Institute for Data & Society, and Eck Institute for Global Health (EIGH), University of Notre Dame, Notre Dame, IN 46556, USA
| | - Khalique Newaz
- Department of Computer Science and Engineering, Lucy Family Institute for Data & Society, and Eck Institute for Global Health (EIGH), University of Notre Dame, Notre Dame, IN 46556, USA,Center for Data and Computing in Natural Sciences (CDCS), Institute for Computational Systems Biology, Universität Hamburg, Hamburg 20146, Germany
59
Blackwell AD, Garcia AR. Ecoimmunology in the field: Measuring multiple dimensions of immune function with minimally invasive, field-adapted techniques. Am J Hum Biol 2022; 34:e23784. [PMID: 35861267 PMCID: PMC9786696 DOI: 10.1002/ajhb.23784] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/19/2022] [Revised: 06/29/2022] [Accepted: 07/08/2022] [Indexed: 01/25/2023]
Abstract
OBJECTIVE Immune function is multifaceted and characterizations based on single biomarkers may be uninformative or misleading, particularly when considered across ecological contexts. However, measuring the many facets of immunity in the field can be challenging, since many measures cannot be obtained on-site, necessitating sample preservation and transport. Here we assess state-of-the-art methods for measuring immunity, focusing on measures that require a minimal blood sample obtained from a finger prick, which can be: (1) dried on filter paper, (2) frozen in liquid nitrogen, or (3) stabilized with chemical reagents. RESULTS We review immune measures that can be obtained from point-of-care devices or from immunoassays of dried blood spots (DBSs), field methods for flow cytometry, the use of RNA or DNA sequencing and quantification, and the application of immune activation assays under field conditions. CONCLUSIONS Stable protein products, such as immunoglobulins and C-reactive protein are reliably measured in DBSs. Because less stable proteins, such as cytokines, may be problematic to measure even in fresh blood, mRNA from stabilized blood may provide a cleaner measure of cytokine and broader immune-related gene expression. Gene methylation assays or mRNA sequencing also allow for the quantification of many other parameters, including the inference of leukocyte subsets, though with less accuracy than with flow cytometry. Combining these techniques provides an improvement over single-marker studies, allowing for a more nuanced understanding of how social and ecological variables are linked to immune measures and disease risk in diverse populations and settings.
Affiliation(s)
- Aaron D. Blackwell
- Department of Anthropology, Washington State University, Pullman, Washington, USA
| | - Angela R. Garcia
- Research Department, Phoenix Children's Hospital, Phoenix, Arizona, USA; Department of Child Health, University of Arizona College of Medicine, Phoenix, Arizona, USA
60
Wu CT, Shen M, Du D, Cheng Z, Parker SJ, Lu Y, Van Eyk JE, Yu G, Clarke R, Herrington DM, Wang Y. Cosbin: cosine score-based iterative normalization of biologically diverse samples. BIOINFORMATICS ADVANCES 2022; 2:vbac076. [PMID: 36330358 PMCID: PMC9614059 DOI: 10.1093/bioadv/vbac076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2022] [Revised: 10/02/2022] [Accepted: 10/18/2022] [Indexed: 11/06/2022]
Abstract
Motivation Data normalization is essential to ensure accurate inference and comparability of gene expression measures across samples or conditions. Ideally, gene expression data should be rescaled based on consistently expressed reference genes. However, to normalize biologically diverse samples, the most commonly used reference genes exhibit striking expression variability and size-factor or distribution-based normalization methods can be problematic when the amount of asymmetry in differential expression is significant. Results We report an efficient and accurate data-driven method—Cosine score-based iterative normalization (Cosbin)—to normalize biologically diverse samples. Based on the Cosine scores of cross-condition expression patterns, the Cosbin pipeline iteratively eliminates asymmetric differentially expressed genes, identifies consistently expressed genes, and calculates sample-wise normalization factors. We demonstrate the superior performance and enhanced utility of Cosbin compared with six representative peer methods using both simulation and real multi-omics expression datasets. Implemented in open-source R scripts and specifically designed to address normalization bias due to significant asymmetry in differential expression across multiple conditions, the Cosbin tool complements rather than replaces the existing methods and will allow biologists to more accurately detect true molecular signals among diverse phenotypic groups. Availability and implementation The R scripts of Cosbin pipeline are freely available at https://github.com/MinjieSh/Cosbin. Supplementary information Supplementary data are available at Bioinformatics Advances online.
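To make the idea of a cosine score on cross-condition expression patterns concrete, the sketch below scores a gene's per-condition mean expression against a constant (all-ones) reference vector: a score near 1 indicates a gene expressed at a similar level in every condition. This is an assumption-laden illustration, not the Cosbin implementation (which is an R pipeline with an iterative procedure); the function name and the example profiles are hypothetical.

```python
import math


def cosine_to_constant(profile):
    """Cosine similarity between a non-negative per-condition expression
    profile and the all-ones vector. Returns 0.0 for an all-zero profile."""
    norm = math.sqrt(sum(x * x for x in profile))
    if norm == 0:
        return 0.0
    return sum(profile) / (norm * math.sqrt(len(profile)))


# Hypothetical group-mean expression across three conditions.
consistent_gene = [5.0, 5.0, 5.0]   # same level everywhere
asymmetric_gene = [9.0, 0.0, 0.0]   # expressed in one condition only
print(cosine_to_constant(consistent_gene))
print(cosine_to_constant(asymmetric_gene))
```

Because cosine similarity ignores overall scale, a highly and a lowly expressed gene with the same cross-condition pattern receive the same score, which is what makes it suitable for ranking candidate reference genes.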
Affiliation(s)
- Dongping Du
- Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA
- Zuolin Cheng
- Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA
- Sarah J Parker
- Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA
- Yingzhou Lu
- Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA
- Jennifer E Van Eyk
- Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA
- Guoqiang Yu
- Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA
- Robert Clarke
- The Hormel Institute, University of Minnesota, Austin, MN 55912, USA
- David M Herrington
- Department of Internal Medicine, Wake Forest University, Winston-Salem, NC 27157, USA
- Yue Wang
- To whom correspondence should be addressed.
|
61
|
Webster AK, Chitrakar R, Taylor SM, Baugh LR. Alternative somatic and germline gene-regulatory strategies during starvation-induced developmental arrest. Cell Rep 2022; 41:111473. [PMID: 36223742 PMCID: PMC9608353 DOI: 10.1016/j.celrep.2022.111473] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2021] [Revised: 07/18/2022] [Accepted: 09/20/2022] [Indexed: 11/16/2022] Open
Abstract
Nutrient availability governs growth and quiescence, and many animals arrest development when starved. Using C. elegans L1 arrest as a model, we show that gene expression changes deep into starvation. Surprisingly, relative expression of germline-enriched genes increases for days. We conditionally degrade the large subunit of RNA polymerase II using the auxin-inducible degron system and analyze absolute expression levels. We find that somatic transcription is required for survival, but the germline maintains transcriptional quiescence. Thousands of genes are continuously transcribed in the soma, though their absolute abundance declines, such that relative expression of germline transcripts increases given extreme transcript stability. Aberrantly activating transcription in starved germ cells compromises reproduction, demonstrating important physiological function of transcriptional quiescence. This work reveals alternative somatic and germline gene-regulatory strategies during starvation, with the soma maintaining a robust transcriptional response to support survival and the germline maintaining transcriptional quiescence to support future reproductive success.
Webster et al. show that the transcriptional response to starvation is mounted early in larval somatic cells supporting survival but that it wanes over time. In contrast, they show that the germline remains transcriptionally quiescent deep into starvation, supporting reproductive potential, while maintaining its transcriptome via transcript stability.
Affiliation(s)
- Amy K. Webster
- Department of Biology, Duke University, Durham, NC 27708, USA; Present address: Institute of Ecology and Evolution, University of Oregon, Eugene, OR 97403, USA
- Rojin Chitrakar
- Department of Biology, Duke University, Durham, NC 27708, USA
- Seth M. Taylor
- Department of Biology, Duke University, Durham, NC 27708, USA
- L. Ryan Baugh
- Department of Biology, Duke University, Durham, NC 27708, USA; Center for Genomic and Computational Biology, Duke University, Durham, NC 27708, USA; Lead contact; Correspondence:
|
62
|
Lobo D, Linheiro R, Godinho R, Archer JP. On taming the effect of transcript level intra-condition count variation during differential expression analysis: A story of dogs, foxes and wolves. PLoS One 2022; 17:e0274591. [PMID: 36136981 PMCID: PMC9498955 DOI: 10.1371/journal.pone.0274591] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2021] [Accepted: 08/31/2022] [Indexed: 11/22/2022] Open
Abstract
The evolution of RNA-seq technologies has yielded datasets of scientific value that are often generated as condition-associated biological replicates within expression studies. With expanding data archives, opportunity arises to augment replicate numbers when conditions of interest overlap. Despite correction procedures for estimating transcript abundance, a source of ambiguity is transcript-level intra-condition count variation, as indicated by disjointed results between analysis tools. We present TVscript, a tool that removes reference-based transcripts associated with intra-condition count variation above specified thresholds, and we explore the effects of such variation on differential expression analysis. Initially, iterative differential expression analysis was performed on simulated counts, where levels of intra-condition variation and sets of over-represented transcripts are explicitly specified. Then counts derived from inter- and intra-study data representing brain samples of dogs, wolves and foxes (wolves vs. dogs and aggressive vs. tame foxes) were used. For simulations, the sensitivity in detecting differentially expressed transcripts increased after removing hyper-variable transcripts, although at levels of intra-condition variation above 5% detection became unreliable. For real data, prior to applying TVscript, ≈20% of the transcripts identified as being differentially expressed were associated with high levels of intra-condition variation, an over-representation relative to the reference set. As transcripts harbouring such variation were removed pre-analysis, a discordance from 26 to 40% in the lists of differentially expressed transcripts is observed when compared to those obtained using the non-filtered reference. The removal of transcripts possessing intra-condition variation values within (and above) the 97th and 95th percentiles, for wolves vs. dogs and aggressive vs. tame foxes, maximized the sensitivity in detecting differentially expressed transcripts as a result of alterations within gene-wise dispersion estimates. Analysis of our real data provides support for seven genes potentially involved in selection for tameness. TVscript is available at: https://sourceforge.net/projects/tvscript/.
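The filtering step described in this abstract (dropping transcripts whose intra-condition count variation exceeds a threshold before differential expression analysis) can be sketched as follows. This is an illustration only and not TVscript itself: the abstract does not state how variation is measured, so the coefficient of variation is used here as an assumption, and the function names, the toy counts, and the 0.5 cutoff are hypothetical.

```python
import statistics


def coefficient_of_variation(counts):
    """Population standard deviation divided by the mean (0 if the mean is 0)."""
    mean = statistics.fmean(counts)
    if mean == 0:
        return 0.0
    return statistics.pstdev(counts) / mean


def filter_variable_transcripts(counts_by_condition, max_cv):
    """Keep transcripts whose replicate counts are stable within every condition.

    counts_by_condition maps transcript -> {condition: [replicate counts]}.
    A transcript is dropped if any single condition exceeds max_cv.
    """
    kept = {}
    for transcript, conditions in counts_by_condition.items():
        worst = max(coefficient_of_variation(c) for c in conditions.values())
        if worst <= max_cv:
            kept[transcript] = conditions
    return kept


# Toy counts: t2 has one outlying replicate within the "dog" condition.
counts = {
    "t1": {"dog": [100, 110, 90], "wolf": [50, 55, 45]},
    "t2": {"dog": [10, 200, 15], "wolf": [12, 11, 13]},
}
kept = filter_variable_transcripts(counts, max_cv=0.5)
```

A percentile-based cutoff, as the abstract mentions, could be obtained by computing the worst-case value for every transcript first and then choosing `max_cv` from that distribution.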
Affiliation(s)
- Diana Lobo
- CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
- BIOPOLIS, Program in Genomics, Biodiversity and Land Planning, CIBIO, Vairão, Portugal
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
- * E-mail: (DL); (JPA)
- Raquel Linheiro
- CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
- Raquel Godinho
- CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
- BIOPOLIS, Program in Genomics, Biodiversity and Land Planning, CIBIO, Vairão, Portugal
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
- John Patrick Archer
- CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
- BIOPOLIS, Program in Genomics, Biodiversity and Land Planning, CIBIO, Vairão, Portugal
- * E-mail: (DL); (JPA)
|
63
|
Current challenges and best practices for cell-free long RNA biomarker discovery. Biomark Res 2022; 10:62. [PMID: 35978416 PMCID: PMC9385245 DOI: 10.1186/s40364-022-00409-w] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 05/25/2022] [Accepted: 08/04/2022] [Indexed: 11/24/2022] Open
Abstract
The analysis of biomarkers in biological fluids, also known as liquid biopsy, is seen as having great potential for diagnosing complex diseases such as cancer with high sensitivity and minimal invasiveness. Although it can target any biomolecule, most liquid biopsy studies have focused on circulating nucleic acids. Historically, studies have aimed at the detection of specific mutations in cell-free DNA (cfDNA), but recently the study of cell-free RNA (cfRNA) has gained traction. Since 2020, a handful of cfDNA tests have been approved for therapy selection by the FDA; however, no cfRNA tests are approved to date. One of the main drawbacks in the field of RNA-based liquid biopsies is the low reproducibility of the results, often caused by technical and biological variability, a lack of standardized protocols and insufficient cohorts. In this review, we identify the main challenges and biases introduced during the different stages of biomarker discovery in liquid biopsies with cfRNA and propose solutions to minimize them.
|
64
|
Dysregulation of Cell Envelope Homeostasis in Staphylococcus aureus Exposed to Solvated Lignin. Appl Environ Microbiol 2022; 88:e0054822. [PMID: 35852361 PMCID: PMC9361832 DOI: 10.1128/aem.00548-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/21/2023] Open
Abstract
Lignin is an aromatic plant cell wall polymer that facilitates water transport through the vasculature of plants and is generated in large quantities as an inexpensive by-product of pulp and paper manufacturing and biorefineries. Although lignin's ability to reduce bacterial growth has been reported previously, its hydrophobicity complicates the ability to examine its biological effects on living cells in aqueous growth media. We recently described the ability to solvate lignin in Good's buffers with neutral pH, a breakthrough that allowed examination of lignin's antimicrobial effects against the human pathogen Staphylococcus aureus. These analyses showed that lignin damages the S. aureus cell membrane, causes increased cell clustering, and inhibits growth synergistically with tunicamycin, a teichoic acid synthesis inhibitor. In the present study, we examined the physiological and transcriptomic responses of S. aureus to lignin. Intriguingly, lignin restored the susceptibility of genetically resistant S. aureus isolates to penicillin and oxacillin, decreased intracellular pH, impaired normal cell division, and rendered cells more resistant to detergent-induced lysis. Additionally, transcriptome sequencing (RNA-Seq) differential expression (DE) analysis of lignin-treated cultures revealed significant gene expression changes (P < 0.05 with 5% false discovery rate [FDR]) related to the cell envelope, cell wall physiology, fatty acid metabolism, and stress resistance. Moreover, a pattern of concurrent up- and downregulation of genes within biochemical pathways involved in transmembrane transport and cell wall physiology was observed, which likely reflects an attempt to tolerate or compensate for lignin-induced damage. Together, these results represent the first comprehensive analysis of lignin's antibacterial activity against S. aureus.
IMPORTANCE: S. aureus is a leading cause of skin and soft tissue infections. The ability of S. aureus to acquire genetic resistance to antibiotics further compounds its ability to cause life-threatening infections. While the historical response to antibiotic resistance has been to develop new antibiotics, bacterial pathogens are notorious for rapidly acquiring genetic resistance mechanisms. As such, the development of adjuvants represents a viable way of extending the life span of current antibiotics to which pathogens may already be resistant. Here, we describe the phenotypic and transcriptomic response of S. aureus to treatment with lignin. Our results demonstrate that lignin extracted from sugarcane and sorghum bagasse restores S. aureus susceptibility to β-lactams, providing a premise for repurposing these antibiotics in treatment of resistant S. aureus strains, possibly in the form of topical lignin/β-lactam formulations.
|
65
|
Casas AI, Hassan AA, Manz Q, Wiwie C, Kleikers P, Egea J, López MG, List M, Baumbach J, Schmidt HHHW. Un-biased housekeeping gene panel selection for high-validity gene expression analysis. Sci Rep 2022; 12:12324. [PMID: 35853974 PMCID: PMC9296577 DOI: 10.1038/s41598-022-15989-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2021] [Accepted: 07/04/2022] [Indexed: 12/02/2022] Open
Abstract
Differential gene expression normalised to a single housekeeping (HK) gene is used to identify disease mechanisms and therapeutic targets. HK gene selection is often arbitrary, potentially introducing systematic error and discordant results. Here we examine these risks in a disease model of brain hypoxia. We first identified the eight most frequently used HK genes through a systematic review. However, we observed that both ex vivo and in vivo, their expression levels varied considerably between conditions. When applying these genes to normalise expression levels of the validated stroke target gene, inducible Nox4, we obtained opposing results. Software tools for unbiased HK gene selection exist, but they are limited to individual datasets and lack genome-wide search capability and user-friendly interfaces. We therefore developed the HouseKeepR algorithm to rapidly analyse multiple gene expression datasets in a disease-specific manner and rank HK gene candidates according to stability in an unbiased manner. Using a panel of de novo top-ranked HK genes for brain hypoxia, but not single genes, Nox4 induction was consistently reproduced. Thus, differential gene expression analysis is best normalised against a HK gene panel selected in an unbiased manner. HouseKeepR is the first user-friendly, bias-free, and broadly applicable tool to automatically propose suitable HK genes in a tissue- and disease-dependent manner.
Affiliation(s)
- Ana I Casas
- Department of Neurology and Center for Translational Neuro- and Behavioural Sciences (C-TNBS), University Clinics Essen, Essen, Germany; Department of Pharmacology & Personalised Medicine, MeHNS, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands
- Ahmed A Hassan
- Department of Pharmacology & Personalised Medicine, MeHNS, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands
- Quirin Manz
- Faculty of Mathematics, Informatics and Natural Sciences, University of Hamburg, Hamburg, Germany
- Christian Wiwie
- Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Pamela Kleikers
- Department of Pharmacology & Personalised Medicine, MeHNS, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands
- Javier Egea
- Molecular Neuroinflammation and Neuronal Plasticity Research Laboratory, Hospital Universitario Santa Cristina, Instituto de Investigación Sanitaria-Hospital Universitario de la Princesa, Madrid, Spain; Departamento de Farmacología, Instituto de I+D del Medicamento Teófilo Hernando (ITH), Facultad de Medicina, Universidad Autónoma de Madrid, Madrid, Spain
- Manuela G López
- Departamento de Farmacología, Instituto de I+D del Medicamento Teófilo Hernando (ITH), Facultad de Medicina, Universidad Autónoma de Madrid, Madrid, Spain
- Markus List
- Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
- Jan Baumbach
- Faculty of Mathematics, Informatics and Natural Sciences, University of Hamburg, Hamburg, Germany
- Harald H H W Schmidt
- Department of Pharmacology & Personalised Medicine, MeHNS, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands
|
66
|
Dressler FF, Brägelmann J, Reischl M, Perner S. Normics: Proteomic Normalization by Variance and Data-Inherent Correlation Structure. Mol Cell Proteomics 2022; 21:100269. [PMID: 35853575 PMCID: PMC9450154 DOI: 10.1016/j.mcpro.2022.100269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2021] [Revised: 06/16/2022] [Accepted: 07/13/2022] [Indexed: 11/17/2022] Open
Abstract
Several algorithms for the normalization of proteomic data are currently available, each based on a priori assumptions. Among these is the extent to which differential expression (DE) can be present in the dataset. This factor is usually unknown in explorative biomarker screens. Simultaneously, the increasing depth of proteomic analyses often requires the selection of subsets with a high probability of being DE to obtain meaningful results in downstream bioinformatical analyses. Based on the relationship of technical variation and (true) biological DE of an unknown share of proteins, we propose the “Normics” algorithm: Proteins are ranked based on their expression level–corrected variance and the mean correlation with all other proteins. The latter serves as a novel indicator of the non-DE likelihood of a protein in a given dataset. Subsequent normalization is based on a subset of non-DE proteins only. No a priori information such as batch, clinical, or replicate group is necessary. Simulation data demonstrated robust and superior performance across a wide range of stochastically chosen parameters. Five publicly available spike-in and biologically variant datasets were reliably and quantitatively accurately normalized by Normics with improved performance compared to standard variance stabilization as well as median, quantile, and LOESS normalizations. In complex biological datasets Normics correctly determined proteins as being DE that had been cross-validated by an independent transcriptome analysis of the same samples. In both complex datasets Normics identified the most DE proteins. We demonstrate that combining variance analysis and data-inherent correlation structure to identify non-DE proteins improves data normalization. Standard normalization algorithms can be consolidated against high shares of (one-sided) biological regulation. The statistical power of downstream analyses can be increased by focusing on Normics-selected subsets of high DE likelihood.
- Normics is a tool for the normalization of proteomic data based on existing algorithms.
- Specifically addresses data with high shares of differential expression.
- Combines variance and data-inherent correlation structure.
- Provides a ranking of differential expression likelihood.
- Enables normalization based on the most stable proteins.
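The ranking idea in this abstract (combining a variance measure with each protein's mean correlation to all other proteins) can be made concrete with a small Python sketch. It is a simplified stand-in and not the Normics algorithm: the coefficient of variation substitutes for the expression level-corrected variance, the two measures are combined by a plain sort (highest mean correlation first, lowest variation as tie-breaker), positive intensities are assumed, and the names `pearson` and `rank_by_stability` are hypothetical.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0 if either is constant)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom


def rank_by_stability(intensities):
    """Order proteins from most to least likely non-DE under the stated assumptions.

    intensities maps protein -> list of per-sample values (assumed positive).
    """
    scores = {}
    for name, values in intensities.items():
        others = [v for other, v in intensities.items() if other != name]
        mean_corr = statistics.fmean(pearson(values, v) for v in others)
        cv = statistics.pstdev(values) / statistics.fmean(values)
        scores[name] = (mean_corr, cv)
    return sorted(scores, key=lambda n: (-scores[n][0], scores[n][1]))


# Toy data: p1 and p2 share a technical trend, p3 differs between sample groups.
intensities = {
    "p1": [10, 12, 11, 13],
    "p2": [20, 23, 22, 27],
    "p3": [5, 5, 30, 30],
}
ranking = rank_by_stability(intensities)
```

Under these assumptions the top of the ranking would supply the reference subset for a subsequent normalization step, and the bottom would flag candidates for differential expression.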
Affiliation(s)
- Franz F Dressler
- Institute of Pathology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany; Institute of Pathology, University Medical Center Schleswig-Holstein, Luebeck Site, Luebeck, Germany.
- Johannes Brägelmann
- Mildred Scheel School of Oncology, University of Cologne, Faculty of Medicine and University Hospital Cologne, Cologne, Germany; Department of Translational Genomics, University of Cologne, Faculty of Medicine and University Hospital Cologne, Cologne, Germany; Center for Molecular Medicine Cologne, University of Cologne, Faculty of Medicine and University Hospital Cologne, Cologne, Germany
- Markus Reischl
- Institute for Automation and Applied Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Sven Perner
- Institute of Pathology, University Medical Center Schleswig-Holstein, Luebeck Site, Luebeck, Germany; Institute of Pathology, Research Center Borstel, Leibniz Lung Center, Borstel, Germany
|
67
|
Roche KE, Mukherjee S. The accuracy of absolute differential abundance analysis from relative count data. PLoS Comput Biol 2022; 18:e1010284. [PMID: 35816553 PMCID: PMC9302745 DOI: 10.1371/journal.pcbi.1010284] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/04/2022] [Revised: 07/21/2022] [Accepted: 06/07/2022] [Indexed: 11/29/2022] Open
Abstract
Concerns have been raised about the use of relative abundance data derived from next generation sequencing as a proxy for absolute abundances. For example, in the differential abundance setting, compositional effects in relative abundance data may give rise to spurious differences (false positives) when considered from the absolute perspective. In practice however, relative abundances are often transformed by renormalization strategies intended to compensate for these effects and the scope of the practical problem remains unclear. We used simulated data to explore the consistency of differential abundance calling on renormalized relative abundances versus absolute abundances and find that, while overall consistency is high, with a median sensitivity (true positive rate) of 0.91 and specificity (1 − false positive rate) of 0.89, consistency can be much lower where there is widespread change in the abundance of features across conditions. We confirm these findings on a large number of real data sets drawn from 16S metabarcoding, expression array, bulk RNA-seq, and single-cell RNA-seq experiments, where data sets with the greatest change between experimental conditions are also those with the highest false positive rates. Finally, we evaluate the predictive utility of summary features of relative abundance data themselves. Estimates of sparsity and the prevalence of feature-level change in relative abundance data give reasonable predictions of discrepancy in differential abundance calling in simulated data and can provide useful bounds for worst-case outcomes in real data.
Molecular sequence counting is a near-ubiquitous method for taking “snapshots” of the state of biological systems at the molecular level and is applied to problems as diverse as profiling gene expression and characterizing bacterial community composition. However, concerns exist about the interpretation of these data, given they are relative counts. In particular, some feature-level differences between samples may be technical, not biological, stemming from compositional effects. Here, we quantify the accuracy of estimates of sample-sample differences made from relative versus “absolute” molecular count data, using a comprehensive simulation strategy and published experimental data. We find the accuracy of difference estimation is high in at least 50% of simulated and real data sets but that low accuracy outcomes are far from rare. Further, we observe similar numbers of these low accuracy cases when using any of several popular methods for estimating differences in biological count data. Our results support the use of complementary reference measures of absolute abundance (like RNA spike-ins) for normalizing next-generation sequencing data. We briefly validate the use of these reference quantities and of stringent effect size thresholds as strategies for mitigating interpretational problems with relative count data.
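The sensitivity and specificity figures quoted in this abstract follow the usual definitions: true positive rate, and one minus the false positive rate. A minimal Python sketch of how such values could be computed when calls on absolute abundances are treated as the reference and calls on renormalized relative abundances as the test is shown below; the function name and the toy call vectors are hypothetical and are not taken from the paper.

```python
def sensitivity_specificity(reference_calls, test_calls):
    """Compare two lists of booleans (True = feature called differential).

    reference_calls: calls made on absolute abundances (treated as truth).
    test_calls: calls made on renormalized relative abundances.
    Returns (sensitivity, specificity); NaN where a rate is undefined.
    """
    pairs = list(zip(reference_calls, test_calls))
    tp = sum(1 for r, t in pairs if r and t)
    fn = sum(1 for r, t in pairs if r and not t)
    fp = sum(1 for r, t in pairs if not r and t)
    tn = sum(1 for r, t in pairs if not r and not t)
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


# Toy example with five features.
absolute = [True, True, False, False, False]
relative = [True, False, True, False, False]
sens, spec = sensitivity_specificity(absolute, relative)
```

Here the single spurious call in the relative data is the kind of false positive that a compositional effect could produce.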
Affiliation(s)
- Kimberly E. Roche
- Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina, United States of America
- * E-mail:
- Sayan Mukherjee
- Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina, United States of America
- Departments of Statistical Science, Mathematics, Computer Science, Biostatistics & Bioinformatics, Duke University, Durham, North Carolina, United States of America
- Center for Scalable Data Analytics and Artificial Intelligence, Universität Leipzig and the Max Planck Institute for Mathematics in the Natural Sciences, Leipzig, Germany
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina, United States of America
|
68
|
L.B. Almeida B, M. Bahrudeen MN, Chauhan V, Dash S, Kandavalli V, Häkkinen A, Lloyd-Price J, S.D. Cristina P, Baptista ISC, Gupta A, Kesseli J, Dufour E, Smolander OP, Nykter M, Auvinen P, Jacobs HT, M.D. Oliveira S, S. Ribeiro A. The transcription factor network of E. coli steers global responses to shifts in RNAP concentration. Nucleic Acids Res 2022; 50:6801-6819. [PMID: 35748858 PMCID: PMC9262627 DOI: 10.1093/nar/gkac540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2022] [Revised: 06/02/2022] [Accepted: 06/14/2022] [Indexed: 12/24/2022] Open
Abstract
The robustness and sensitivity of gene networks to environmental changes is critical for cell survival. How gene networks produce specific, chronologically ordered responses to genome-wide perturbations, while robustly maintaining homeostasis, remains an open question. We analysed if short- and mid-term genome-wide responses to shifts in RNA polymerase (RNAP) concentration are influenced by the known topology and logic of the transcription factor network (TFN) of Escherichia coli. We found that, at the gene cohort level, the magnitude of the single-gene, mid-term transcriptional responses to changes in RNAP concentration can be explained by the absolute difference between the gene's numbers of activating and repressing input transcription factors (TFs). Interestingly, this difference is strongly positively correlated with the number of input TFs of the gene. Meanwhile, short-term responses showed only weak influence from the TFN. Our results suggest that the global topological traits of the TFN of E. coli shape which gene cohorts respond to genome-wide stresses.
Affiliation(s)
- Bilena L.B. Almeida
- Correspondence may also be addressed to Bilena L.B. Almeida. Tel: +358 2945211;
- Vinodh Kandavalli
- Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- Antti Häkkinen
- Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, FI-00014 Helsinki, Finland
- Palma S.D. Cristina
- Laboratory of Biosystem Dynamics, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Ines S C Baptista
- Laboratory of Biosystem Dynamics, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Abhishekh Gupta
- Center for Quantitative Medicine and Department of Cell Biology, University of Connecticut School of Medicine, 263 Farmington Av., Farmington, CT 06030-6033, USA
- Juha Kesseli
- Prostate Cancer Research Center, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland; Tays Cancer Center, Tampere University Hospital, Tampere, Finland
- Eric Dufour
- Mitochondrial bioenergetics and metabolism, BioMediTech, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Olli-Pekka Smolander
- Department of Chemistry and Biotechnology, Tallinn University of Technology, Tallinn, Estonia
- Institute of Biotechnology, University of Helsinki, Viikinkaari 5D, 00790 Helsinki, Finland
- Matti Nykter
- Prostate Cancer Research Center, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland; Tays Cancer Center, Tampere University Hospital, Tampere, Finland
- Petri Auvinen
- Institute of Biotechnology, University of Helsinki, Viikinkaari 5D, 00790 Helsinki, Finland
- Howard T Jacobs
- Faculty of Medicine and Health Technology, FI-33014 Tampere University, Finland; Department of Environment and Genetics, La Trobe University, Melbourne, Victoria 3086, Australia
- Samuel M.D. Oliveira
- Department of Electrical and Computer Engineering, Boston University, Boston, MA, USA
|
69
|
Yang LH, Hagan DH, Rivera-Rios JC, Kelp MM, Cross ES, Peng Y, Kaiser J, Williams LR, Croteau PL, Jayne JT, Ng NL. Investigating the Sources of Urban Air Pollution Using Low-Cost Air Quality Sensors at an Urban Atlanta Site. Environmental Science & Technology 2022; 56:7063-7073. [PMID: 35357805 DOI: 10.1021/acs.est.1c07005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Advances in low-cost sensors (LCS) for monitoring air quality have opened new opportunities to characterize air quality in finer spatial and temporal resolutions. In this study, we deployed LCS that measure both gas (CO, NO, NO2, and O3) and particle concentrations and co-located research-grade instruments in Atlanta, GA, to investigate the capability of LCS in resolving air pollutant sources using non-negative matrix factorization (NMF) in a moderately polluted urban area. We provide a comparison of applying the NMF technique to both normalized and non-normalized data sets. We identify four factors with different temporal trends and properties for both normalized and non-normalized data sets. Both normalized and non-normalized LCS data sets can resolve primary organic aerosol (POA) factors identified from research-grade instruments. However, applying normalization provides factors with more diverse compositions and can resolve secondary organic aerosol (SOA). Results from this study demonstrate that LCS not only can be used to provide basic mass concentration information but also can be used for in-depth source apportionment studies even in an urban setting with complex pollution mixtures and relatively low aerosol loadings.
Collapse
Affiliation(s)
- Laura Hyesung Yang
- School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - David H Hagan
- QuantAQ, Inc., Somerville, Massachusetts 02143, United States
| | - Jean C Rivera-Rios
- School of Chemical and Biomolecular Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Makoto M Kelp
- Department of Earth and Planetary Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
| | - Eben S Cross
- QuantAQ, Inc., Somerville, Massachusetts 02143, United States
| | - Yuyang Peng
- School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Jennifer Kaiser
- School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- School of Earth and Atmospheric Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Leah R Williams
- Aerodyne Research, Inc., Billerica, Massachusetts 01821, United States
| | - Philip L Croteau
- Aerodyne Research, Inc., Billerica, Massachusetts 01821, United States
| | - John T Jayne
- Aerodyne Research, Inc., Billerica, Massachusetts 01821, United States
| | - Nga Lee Ng
- School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- School of Chemical and Biomolecular Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- School of Earth and Atmospheric Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| |
|
70
|
Genomic Occupancy of the Bromodomain Protein Bdf3 Is Dynamic during Differentiation of African Trypanosomes from Bloodstream to Procyclic Forms. mSphere 2022; 7:e0002322. [PMID: 35642518 PMCID: PMC9241505 DOI: 10.1128/msphere.00023-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2022]
Abstract
Trypanosoma brucei, the causative agent of human and animal African trypanosomiasis, cycles between a mammalian host and a tsetse fly vector. The parasite undergoes huge changes in morphology and metabolism during adaptation to each host environment. These changes are reflected in the different transcriptomes of parasites living in each host. However, it remains unclear whether chromatin-interacting proteins help mediate these changes. Bromodomain proteins localize to transcription start sites in bloodstream parasites, but whether the localization of bromodomain proteins changes as parasites differentiate from bloodstream to insect stages remains unknown. To address this question, we performed cleavage under target and release using nuclease (CUT&RUN) against bromodomain protein 3 (Bdf3) in parasites differentiating from bloodstream to insect forms. We found that Bdf3 occupancy at most loci increased at 3 h following onset of differentiation and decreased thereafter. A number of sites with increased bromodomain protein occupancy lie proximal to genes with altered transcript levels during differentiation, such as procyclins, procyclin-associated genes, and invariant surface glycoproteins. Most Bdf3-occupied sites are observed throughout differentiation. However, one site appears de novo during differentiation and lies proximal to the procyclin gene locus housing genes essential for remodeling surface proteins following transition to the insect stage. These studies indicate that occupancy of chromatin-interacting proteins is dynamic during life cycle stage transitions and provide the groundwork for future studies on the effects of changes in bromodomain protein occupancy. Additionally, the adaptation of CUT&RUN for Trypanosoma brucei provides other researchers with an alternative to chromatin immunoprecipitation (ChIP). IMPORTANCE The parasite Trypanosoma brucei is the causative agent of human and animal African trypanosomiasis (sleeping sickness). 
Trypanosomiasis, which affects humans and cattle, is fatal if untreated. Existing drugs have significant side effects. Thus, these parasites impose a significant human and economic burden in sub-Saharan Africa, where trypanosomiasis is endemic. T. brucei cycles between the mammalian host and a tsetse fly vector, and parasites undergo huge changes in morphology and metabolism to adapt to different hosts. Here, we show that DNA-interacting bromodomain protein 3 (Bdf3) shows changes in occupancy at its binding sites as parasites transition from the bloodstream to the insect stage. Additionally, a new binding site appears near the locus responsible for remodeling of parasite surface proteins during transition to the insect stage. Understanding the mechanisms behind host adaptation is important for understanding the life cycle of the parasite.
|
71
|
Kim HJ, Booth G, Saunders L, Srivatsan S, McFaline-Figueroa JL, Trapnell C. Nuclear oligo hashing improves differential analysis of single-cell RNA-seq. Nat Commun 2022; 13:2666. [PMID: 35562344 PMCID: PMC9106741 DOI: 10.1038/s41467-022-30309-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2021] [Accepted: 04/26/2022] [Indexed: 11/09/2022]
Abstract
Single-cell RNA sequencing (scRNA-seq) offers a high-resolution molecular view into complex tissues, but suffers from high levels of technical noise which frustrates efforts to compare the gene expression programs of different cell types. "Spike-in" RNA standards help control for technical variation in scRNA-seq, but using them with recently developed, ultra-scalable scRNA-seq methods based on combinatorial indexing is not feasible. Here, we describe a simple and cost-effective method for normalizing transcript counts and subtracting technical variability that improves differential expression analysis in scRNA-seq. The method affixes a ladder of synthetic single-stranded DNA oligos to each cell that appears in its RNA-seq library. With improved normalization we explore chemical perturbations with broad or highly specific effects on gene regulation, including RNA pol II elongation, histone deacetylation, and activation of the glucocorticoid receptor. Our methods reveal that inhibiting histone deacetylation prevents cells from executing their canonical program of changes following glucocorticoid stimulation.
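As a rough illustration of how a per-cell oligo ladder could be used for normalization, the sketch below derives a size factor for each cell from its total ladder counts and divides transcript counts by it. The function names, the data, and the estimator (total ladder counts relative to the median cell) are assumptions made for illustration; the abstract does not specify the published estimator, which may differ.

```python
from statistics import median

def ladder_size_factors(ladder_counts):
    """Map {cell_id: [count per ladder oligo]} to {cell_id: size factor}.

    Assumption: every cell received the same ladder, so differences in the
    total recovered ladder counts reflect technical capture efficiency.
    """
    totals = {cell: sum(counts) for cell, counts in ladder_counts.items()}
    if any(t == 0 for t in totals.values()):
        raise ValueError("a cell has no ladder counts; filter it before normalizing")
    reference = median(totals.values())
    return {cell: total / reference for cell, total in totals.items()}

def normalize_counts(gene_counts, factors):
    """Divide each cell's transcript counts by that cell's size factor."""
    return {cell: {gene: n / factors[cell] for gene, n in genes.items()}
            for cell, genes in gene_counts.items()}

# Hypothetical example: three cells with identical biology but different capture.
ladder = {"A": [10, 20, 40], "B": [20, 40, 80], "C": [5, 10, 20]}
genes = {"A": {"GAPDH": 100}, "B": {"GAPDH": 200}, "C": {"GAPDH": 50}}

factors = ladder_size_factors(ladder)
normalized = normalize_counts(genes, factors)
print(factors)
print(normalized)
```

Because the factor is estimated from exogenous oligos rather than from the cell's own RNA, a global change in transcription (for example after inhibiting elongation) is not normalized away, which is the motivation the abstract gives for the approach.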
Affiliation(s)
- Hyeon-Jin Kim
- Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA
| | - Greg Booth
- Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA
| | - Lauren Saunders
- Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA
| | - Sanjay Srivatsan
- Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA
| | | | - Cole Trapnell
- Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA.,Brotman Baty Institute of Precision Medicine, Seattle, WA, 98195, USA.,Allen Discovery Center for Cell Lineage Tracing, Seattle, WA, 98195, USA.
| |
|
72
|
Waskito LA, Rezkitha YAA, Vilaichone RK, Wibawa IDN, Mustika S, Sugihartono T, Miftahussurur M. Antimicrobial Resistance Profile by Metagenomic and Metatranscriptomic Approach in Clinical Practice: Opportunity and Challenge. Antibiotics (Basel) 2022; 11:antibiotics11050654. [PMID: 35625299 PMCID: PMC9137939 DOI: 10.3390/antibiotics11050654] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/08/2022] [Revised: 04/29/2022] [Accepted: 05/09/2022] [Indexed: 01/15/2023]
Abstract
The burden of bacterial resistance to antibiotics affects several key sectors in the world, including healthcare, the government, and the economic sector. Resistant bacterial infection is associated with prolonged hospital stays, direct costs, and costs due to loss of productivity, which will cause policy makers to adjust their policies. Current widely performed procedures for the identification of antibiotic-resistant bacteria rely on culture-based methodology. However, some resistance determinants, such as free-floating DNA of resistance genes, are outside the bacterial genome, which could be potentially transferred under antibiotic exposure. Metagenomic and metatranscriptomic approaches to profiling antibiotic resistance offer several advantages to overcome the limitations of the culture-based approach. These methodologies enhance the probability of detecting resistance determinant genes inside and outside the bacterial genome and novel resistance genes yet pose inherent challenges in availability, validity, expert usability, and cost. Despite these challenges, such molecular-based and bioinformatics technologies offer an exquisite advantage in improving clinicians’ diagnoses and the management of resistant infectious diseases in humans. This review provides a comprehensive overview of next-generation sequencing technologies, metagenomics, and metatranscriptomics in assessing antimicrobial resistance profiles.
Affiliation(s)
- Langgeng Agung Waskito
- Department of Internal Medicine, Faculty of Medicine, Universitas Airlangga, Surabaya 60132, Indonesia;
- Helicobacter pylori and Microbiota Study Group, Institute of Tropical Diseases, Universitas Airlangga, Surabaya 60115, Indonesia;
- Department of Physiology and Medical Biochemistry, Faculty of Medicine, Universitas Airlangga, Surabaya 60132, Indonesia
| | - Yudith Annisa Ayu Rezkitha
- Helicobacter pylori and Microbiota Study Group, Institute of Tropical Diseases, Universitas Airlangga, Surabaya 60115, Indonesia;
- Department of Internal Medicine, Faculty of Medicine, Universitas Muhammadiyah Surabaya, Surabaya 60115, Indonesia
| | - Ratha-korn Vilaichone
- Gastroenterology Unit, Department of Medicine, Faculty of Medicine, Thammasat University Hospital, Khlong Nueng 12120, Pathumthani, Thailand;
- Digestive Diseases Research Center (DRC), Thammasat University, Khlong Nueng 12121, Pathumthani, Thailand
- Department of Medicine, Chulabhorn International College of Medicine (CICM), Thammasat University, Khlong Nueng 12121, Pathumthani, Thailand
- Division of Gastroentero-Hepatology, Department of Internal Medicine, Faculty of Medicine, Dr. Soetomo Teaching Hospital, Universitas Airlangga, Surabaya 60286, Indonesia;
| | - I Dewa Nyoman Wibawa
- Division of Gastroentero-Hepatology, Department of Internal Medicine, Sanglah General Hospital, Faculty of Medicine, Universitas Udayana, Denpasar 80232, Indonesia;
| | - Syifa Mustika
- Division of Gastroentero-Hepatology, Department of Internal Medicine, Dr. Saiful Anwar Hospital, Malang 65112, Indonesia;
| | - Titong Sugihartono
- Division of Gastroentero-Hepatology, Department of Internal Medicine, Faculty of Medicine, Dr. Soetomo Teaching Hospital, Universitas Airlangga, Surabaya 60286, Indonesia;
| | - Muhammad Miftahussurur
- Helicobacter pylori and Microbiota Study Group, Institute of Tropical Diseases, Universitas Airlangga, Surabaya 60115, Indonesia;
- Division of Gastroentero-Hepatology, Department of Internal Medicine, Faculty of Medicine, Dr. Soetomo Teaching Hospital, Universitas Airlangga, Surabaya 60286, Indonesia;
- Correspondence: Tel.: +62-31-502-3865; Fax: +62-31-502-3865
| |
|
73
|
Fröhlich K, Brombacher E, Fahrner M, Vogele D, Kook L, Pinter N, Bronsert P, Timme-Bronsert S, Schmidt A, Bärenfaller K, Kreutz C, Schilling O. Benchmarking of analysis strategies for data-independent acquisition proteomics using a large-scale dataset comprising inter-patient heterogeneity. Nat Commun 2022; 13:2622. [PMID: 35551187 PMCID: PMC9098472 DOI: 10.1038/s41467-022-30094-0] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 09/10/2021] [Accepted: 04/14/2022] [Indexed: 12/25/2022]
Abstract
Numerous software tools exist for data-independent acquisition (DIA) analysis of clinical samples, necessitating their comprehensive benchmarking. We present a benchmark dataset comprising real-world inter-patient heterogeneity, which we use for in-depth benchmarking of DIA data analysis workflows for clinical settings. Combining spectral libraries, DIA software, sparsity reduction, normalization, and statistical tests results in 1428 distinct data analysis workflows, which we evaluate based on their ability to correctly identify differentially abundant proteins. From our dataset, we derive bootstrap datasets of varying sample sizes and use the whole range of bootstrap datasets to robustly evaluate each workflow. We find that all DIA software suites benefit from using a gas-phase fractionated spectral library, irrespective of the library refinement used. Gas-phase fractionation-based libraries perform best against two out of three reference protein lists. Among all investigated statistical tests non-parametric permutation-based statistical tests consistently perform best. Data independent acquisition (DIA) has been gaining momentum in clinical proteomics. Here, the authors create a benchmark dataset comprising inter-patient heterogeneity to compare popular DIA data analysis workflows for identifying differentially abundant proteins.
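For readers unfamiliar with the permutation-based tests that this abstract reports as performing best, a minimal two-sample version for a single protein might look like the following. This is a generic sketch, not the benchmark's implementation: the test statistic (absolute difference in means), the number of permutations, and the intensity values are all assumptions.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    Group labels are shuffled n_perm times; the p-value is the share of
    shuffles whose absolute mean difference is at least the observed one
    (with the usual +1 correction so the p-value is never exactly zero).
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical log2 intensities of one protein in two patient groups.
tumor = [12.0, 12.2, 11.9, 12.1, 12.3, 11.8]
normal = [10.1, 10.3, 9.9, 10.2, 10.0, 10.4]
p_value = permutation_test(tumor, normal)
print(p_value)
```

The test makes no distributional assumption about the intensities, which may be part of why such tests cope well with heterogeneous patient samples. In a proteome-wide analysis the resulting p-values would still need multiple-testing correction.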
Affiliation(s)
- Klemens Fröhlich
- Institute for Surgical Pathology, Medical Center-University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg im Breisgau, Germany.,Faculty of Biology, University of Freiburg, Freiburg im Breisgau, Germany.,Spemann Graduate School of Biology and Medicine (SGBM), University of Freiburg, Freiburg im Breisgau, Germany
| | - Eva Brombacher
- Faculty of Biology, University of Freiburg, Freiburg im Breisgau, Germany.,Spemann Graduate School of Biology and Medicine (SGBM), University of Freiburg, Freiburg im Breisgau, Germany.,Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg, Freiburg im Breisgau, Germany.,Centre for Integrative Biological Signaling Studies (CIBSS), University of Freiburg, Freiburg im Breisgau, Germany
| | - Matthias Fahrner
- Institute for Surgical Pathology, Medical Center-University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg im Breisgau, Germany.,Faculty of Biology, University of Freiburg, Freiburg im Breisgau, Germany.,Spemann Graduate School of Biology and Medicine (SGBM), University of Freiburg, Freiburg im Breisgau, Germany
| | - Daniel Vogele
- Institute for Surgical Pathology, Medical Center-University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg im Breisgau, Germany.,Faculty of Biology, University of Freiburg, Freiburg im Breisgau, Germany
| | - Lucas Kook
- Epidemiology, Biostatistics & Prevention Institute, University of Zurich, Zurich, Switzerland.,Institute for Data Analysis and Process Design, Zurich University of Applied Sciences, Winterthur, Switzerland
| | - Niko Pinter
- Institute for Surgical Pathology, Medical Center-University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg im Breisgau, Germany
| | - Peter Bronsert
- Institute for Surgical Pathology, Medical Center-University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg im Breisgau, Germany.,German Cancer Consortium (DKTK) and German Cancer Research Center (DKFZ), Heidelberg, Germany.,Tumorbank Comprehensive Cancer Center Freiburg, Medical Center University of Freiburg, Freiburg im Breisgau, Germany
| | - Sylvia Timme-Bronsert
- Institute for Surgical Pathology, Medical Center-University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg im Breisgau, Germany.,Tumorbank Comprehensive Cancer Center Freiburg, Medical Center University of Freiburg, Freiburg im Breisgau, Germany
| | - Alexander Schmidt
- Proteomics Core Facility, Biozentrum, University of Basel, Basel, Switzerland
| | - Katja Bärenfaller
- Swiss Institute of Allergy and Asthma Research (SIAF), University of Zurich, and Swiss Institute of Bioinformatics (SIB), Wolfgang, Switzerland
| | - Clemens Kreutz
- Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg, Freiburg im Breisgau, Germany.,Centre for Integrative Biological Signaling Studies (CIBSS), University of Freiburg, Freiburg im Breisgau, Germany
| | - Oliver Schilling
- Institute for Surgical Pathology, Medical Center-University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg im Breisgau, Germany.,German Cancer Consortium (DKTK) and German Cancer Research Center (DKFZ), Heidelberg, Germany.,BIOSS Centre for Biological Signaling Studies, University of Freiburg, Freiburg im Breisgau, Germany.
| |
|
74
|
Sargazi S, Mukhtar M, Rahdar A, Bilal M, Barani M, Díez-Pascual AM, Behzadmehr R, Pandey S. Opportunities and challenges of using high-sensitivity nanobiosensors to detect long noncoding RNAs: A preliminary review. Int J Biol Macromol 2022; 205:304-315. [PMID: 35182562 DOI: 10.1016/j.ijbiomac.2022.02.082] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 11/25/2021] [Revised: 02/11/2022] [Accepted: 02/14/2022] [Indexed: 12/17/2022]
Abstract
The two types of ncRNAs, microRNAs (miRNAs) and long noncoding RNAs (lncRNAs), are responsible for several biological processes within cells, such as immune responses, cell growth and invasion, and regulation of the cell cycle. A rapidly expanding class of ncRNAs, lncRNAs interact with other molecules to form chromatin-remodeling complexes. These potential hallmarks of disease contribute to transcriptional and post-transcriptional regulation of several genes, possibly via cross-talk with other RNAs. Aberrant expression of lncRNAs has drawn increasing attention to the pathophysiology of different diseases, including cancer and cardiovascular diseases. Unfortunately, circulating lncRNAs are present in the bloodstream at very low levels, making sensitive detection difficult. Currently, there are few methods for detecting these ncRNAs, of which quantitative real-time polymerase chain reaction (qRT-PCR) is the most routinely used. These techniques lack sensitivity for intracellular detection of lncRNAs. Moreover, they are tedious and require a large sample size. Nanotechnology has recently become prominent in the diagnostic field because of its tunable properties and modification opportunities. Furthermore, these conventional techniques can be merged with nanotechnology to improve detection sensitivity. This review highlights some of the most recent findings on nanotechnology-based methods and possible obstacles in their application for more accurate sensing of lncRNAs.
Affiliation(s)
- Saman Sargazi
- Cellular and Molecular Research Center, Research Institute of Cellular and Molecular Sciences in Infectious Diseases, Zahedan University of Medical Sciences, Zahedan 9816743463, Iran
| | - Mahwash Mukhtar
- Faculty of Pharmacy, Institute of Pharmaceutical Technology and Regulatory Affairs, University of Szeged, Eötvösutca 6, Szeged 6720, Hungary
| | - Abbas Rahdar
- Department of Physics, Faculty of Science, University of Zabol, 538-98615 Zabol, Iran.
| | - Muhammad Bilal
- School of Life Science and Food Engineering, Huaiyin Institute of Technology, Huaian 223003, China
| | - Mahmood Barani
- Medical Mycology and Bacteriology Research Center, Kerman University of Medical Sciences, Kerman 7616913555, Iran
| | - Ana M Díez-Pascual
- Universidad de Alcalá, Facultad de Ciencias, Departamento de Química Analítica, Química Física e Ingeniería Química, Ctra. Madrid-Barcelona, Km. 33.6, 28805 Alcalá de Henares, Madrid, Spain
| | - Razieh Behzadmehr
- Department of Radiology, Zabol university of medical sciences, Zabol, Iran
| | - Sadanand Pandey
- Department of Chemistry, College of Natural Science, Yeungnam University, 280 Daehak-Ro, Gyeongsan, Gyeongbuk 38541, South Korea.
| |
|
75
|
GCEN: An Easy-to-Use Toolkit for Gene Co-Expression Network Analysis and lncRNAs Annotation. Curr Issues Mol Biol 2022; 44:1479-1487. [PMID: 35723358 PMCID: PMC9164028 DOI: 10.3390/cimb44040100] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/06/2022] [Revised: 03/13/2022] [Accepted: 03/23/2022] [Indexed: 02/07/2023]
Abstract
Gene co-expression network analysis has been widely used in gene function annotation, especially for long noncoding RNAs (lncRNAs). However, there is a lack of effective cross-platform analysis tools. For biologists to easily build a gene co-expression network and to predict gene function, we developed GCEN, a cross-platform command-line toolkit developed with C++. It is an efficient and easy-to-use solution that will allow everyone to perform gene co-expression network analysis without the requirement of sophisticated programming skills, especially in cases of RNA-Seq research and lncRNAs function annotation. Because of its modular design, GCEN can be easily integrated into other pipelines.
|
76
|
LncRNA Biomarkers of Inflammation and Cancer. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2022; 1363:121-145. [PMID: 35220568 DOI: 10.1007/978-3-030-92034-0_7] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Indexed: 12/09/2022]
Abstract
Long noncoding RNAs (lncRNAs) are promising candidates as biomarkers of inflammation and cancer. LncRNAs have several properties that make them well-suited as molecular markers of disease: (1) many lncRNAs are expressed in a tissue-specific manner, (2) distinct lncRNAs are upregulated based on different inflammatory or oncogenic stimuli, (3) lncRNAs released from cells are packaged and protected in extracellular vesicles, and (4) circulating lncRNAs in the blood are detectable using various RNA sequencing approaches. Here we focus on the potential for lncRNA biomarkers to detect inflammation and cancer, highlighting key biological, technological, and analytical considerations that will help advance the development of lncRNA-based liquid biopsies.
|
77
|
Pinel GD, Horder JL, King JR, McIntyre A, Mongan NP, López GG, Benest AV. Endothelial Cell RNA-Seq Data: Differential Expression and Functional Enrichment Analyses to Study Phenotypic Switching. METHODS IN MOLECULAR BIOLOGY (CLIFTON, N.J.) 2022; 2441:369-426. [PMID: 35099752 DOI: 10.1007/978-1-0716-2059-5_29] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
RNA-seq is a common approach used to explore gene expression data between experimental conditions or cell types and ultimately leads to information that can shed light on the biological processes involved and inform further hypotheses. While the protocols required to generate samples for sequencing can be performed in most research facilities, the resulting computational analysis is often an area in which researchers have little experience. Here we present a user-friendly bioinformatics workflow which describes the methods required to take raw data produced by RNA sequencing to interpretable results. Widely used and well documented tools are applied. Data quality assessment and read trimming were performed by FastQC and Cutadapt, respectively. Following this, STAR was utilized to map the trimmed reads to a reference genome and the alignment was analyzed by Qualimap. The subsequent mapped reads were quantified by featureCounts. DESeq2 was used to normalize and perform differential expression analysis on the quantified reads, identifying differentially expressed genes and preparing the data for functional enrichment analysis. Gene set enrichment analysis identified enriched gene sets from the normalized count data and clusterProfiler was used to perform functional enrichment against the GO, KEGG, and Reactome databases. Example figures of the functional enrichment analysis results were also generated. The example data used in the workflow are derived from HUVECs, an in vitro model used in the study of endothelial cells, published and publicly available for download from the European Nucleotide Archive.
Affiliation(s)
- Guillermo Díez Pinel
- Neuronal and Vascular Biology Group, UCL Institute of Ophthalmology, University College London, London, UK
| | - Joseph L Horder
- Endothelial Quiescence Group, Centre for Cancer Sciences, Biodiscovery Institute, School of Medicine, University of Nottingham, Nottingham, UK
| | - John R King
- School of Mathematics, Faculty of Science, University of Nottingham, Nottingham, UK
| | - Alan McIntyre
- Hypoxia and Acidosis Group, Center for Cancer Sciences, Biodiscovery Institute, University of Nottingham, Nottingham, UK
| | - Nigel P Mongan
- School of Veterinary Medicine and Science, Biodiscovery Institute, University of Nottingham, Nottingham, UK
- Department of Pharmacology, Weill Cornell Medicine, New York, NY, USA
| | - Gonzalo Gómez López
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
| | - Andrew V Benest
- Endothelial Quiescence Group, Centre for Cancer Sciences, Biodiscovery Institute, School of Medicine, University of Nottingham, Nottingham, UK.
| |
|
78
|
miRNA-Profiling in Ejaculated and Epididymal Pig Spermatozoa and Their Relation to Fertility after Artificial Insemination. BIOLOGY 2022; 11:biology11020236. [PMID: 35205102 PMCID: PMC8869492 DOI: 10.3390/biology11020236] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/06/2021] [Revised: 01/18/2022] [Accepted: 01/25/2022] [Indexed: 02/01/2023]
Abstract
Simple Summary The present study searched for the presence and abundance of porcine spermatozoa small RNA sequences (microRNAs) that have the potential to alter gene expression patterns. Four different sperm sources were compared: spermatozoa from three different sections of the ejaculate and from the caudal epididymis, also classed as spermatozoa from higher (HF) or lower (LF) fertility boars. Sperm miRNAs were compared using high-output small RNA sequencing. We identified five sperm miRNAs not previously reported in pigs. Differences in abundance of four miRNAs known to affect the expression of genes with key roles in fertility were related to boar fertility. These miRNAs could be used as fertility markers in artificial insemination programs. Abstract MicroRNAs (miRNAs) are short non-coding RNAs (20–25 nucleotides in length) capable of regulating gene expression by binding -fully or partially- to the 3’-UTR of target messenger RNA (mRNA). To date, several studies have investigated the role of sperm miRNAs in spermatogenesis and their remaining presence toward fertilization and early embryo development. However, little is known about the miRNA cargo in the different sperm sources and their possible implications in boar fertility. Here, we characterized the differential abundance of miRNAs in spermatozoa from the terminal segment of the epididymis and three different fractions of the pig ejaculate (sperm-peak, sperm-rich, and post-sperm rich) comparing breeding boars with higher (HF) and lower (LF) fertility after artificial insemination (AI) using high-output small RNA sequencing. We identified five sperm miRNAs that, to our knowledge, have not been previously reported in pigs (mir-10386, mir-10390, mir-6516, mir-9788-1, and mir-9788-2). 
Additionally, four miRNAs (mir-1285, mir-92a, mir-34c, mir-30) were differentially expressed among spermatozoa sourced from ejaculate fractions and the cauda epididymis, and differences in abundance between the HF and LF groups were found for mir-182, mir-1285, mir-191, and mir-96. These miRNAs target genes with key roles in fertility, sperm survival, immune tolerance, or cell cycle regulation, among others. Linking the current findings with the expression of specific sperm proteins would help predict fertility in future AI-sires.
|
79
|
FitzGerald ES, Jamieson AM. Comment on 'SARS-CoV-2 suppresses anticoagulant and fibrinolytic gene expression in the lung'. eLife 2022; 11:74268. [PMID: 35014954 PMCID: PMC8752089 DOI: 10.7554/elife.74268] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/29/2021] [Accepted: 11/26/2021] [Indexed: 12/28/2022]
Abstract
Mast et al. analyzed transcriptome data derived from RNA-sequencing (RNA-seq) of COVID-19 patient bronchoalveolar lavage fluid (BALF) samples, as compared to BALF RNA-seq samples from a study investigating microbiome and inflammatory interactions in obese and asthmatic adults (Mast et al., 2021). Based on their analysis of these data, Mast et al. concluded that mRNA expression of key regulators of the extrinsic coagulation cascade and fibrinolysis were significantly reduced in COVID-19 patients. Notably, they reported that the expression of the extrinsic coagulation cascade master regulator Tissue Factor (F3) remained unchanged, while there was an 8-fold upregulation of its cognate inhibitor Tissue Factor Pathway Inhibitor (TFPI). From this they conclude that “pulmonary fibrin deposition does not stem from enhanced local [tissue factor] production and that counterintuitively, COVID-19 may dampen [tissue factor]-dependent mechanisms in the lungs”. They also reported decreased Activated Protein C (aPC) mediated anticoagulant activity and major increases in fibrinogen expression and other key regulators of clot formation. Many of these results are contradictory to findings in most of the field, particularly the findings regarding extrinsic coagulation cascade mediated coagulopathies. Here, we present a complete re-analysis of the data sets analyzed by Mast et al. This re-analysis demonstrates that the two data sets utilized were not comparable between one another, and that the COVID-19 sample set was not suitable for the transcriptomic analysis Mast et al. performed. We also identified other significant flaws in the design of their retrospective analysis, such as poor-quality control and filtering standards. Given the issues with the datasets and analysis, their conclusions are not supported.
Affiliation(s)
- Ethan S FitzGerald
- Division of Biology and Medicine, Department of Molecular Microbiology and Immunology, Brown University, Providence, United States
| | - Amanda M Jamieson
- Division of Biology and Medicine, Department of Molecular Microbiology and Immunology, Brown University, Providence, United States
| |
|
80
|
Graf J, Cho S, McDonough E, Corwin A, Sood A, Lindner A, Salvucci M, Stachtea X, Van Schaeybroeck S, Dunne PD, Laurent-Puig P, Longley D, Prehn JHM, Ginty F. FLINO: a new method for immunofluorescence bioimage normalization. Bioinformatics 2022; 38:520-526. [PMID: 34601553 PMCID: PMC8723144 DOI: 10.1093/bioinformatics/btab686] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/13/2021] [Revised: 09/09/2021] [Accepted: 09/25/2021] [Indexed: 02/03/2023]
Abstract
MOTIVATION: Multiplexed immunofluorescence bioimaging of single cells and their spatial organization in tissue holds great promise for the development of future precision diagnostics and therapeutics. Current multiplexing pipelines typically involve multiple rounds of immunofluorescence staining across multiple tissue slides. This introduces experimental batch effects that can hide underlying biological signal. It is important to have robust algorithms that can correct for the batch effects while not introducing biases into the data. Performance of data normalization methods can vary among different assay pipelines. To evaluate differences, it is critical to have a ground truth dataset that is representative of the assay.
RESULTS: A new immunoFLuorescence Image NOrmalization (FLINO) method is presented and evaluated against alternative methods and workflows. Multiround immunofluorescence staining of the same tissue with the nuclear dye DAPI was used to represent virtual slides and a ground truth. DAPI was restained on a given tissue slide, producing multiple images of the same underlying structure but undergoing multiple representative tissue handling steps. This ground truth dataset was used to evaluate and compare multiple normalization methods including median, quantile, smooth quantile, median ratio normalization and trimmed mean of the M-values. These methods were applied in both an unbiased grid object and segmented cell object workflow to 24 multiplexed biomarkers. An upper quartile normalization of grid objects in log space was found to obtain almost equivalent performance to directly normalizing segmented cell objects by the middle quantile. The developed grid-based technique was then applied with on-slide controls for evaluation. Using five or fewer controls per slide can introduce biases into the data. Ten or more on-slide controls were able to robustly correct for batch effects.
AVAILABILITY AND IMPLEMENTATION: The data underlying this article, along with the FLINO R scripts used to evaluate the image normalization methods and workflows, can be downloaded from https://github.com/GE-Bio/FLINO.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
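The upper quartile normalization in log space mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the FLINO implementation (which is distributed as R scripts); the function names, the input layout (one list of positive grid-object intensities per slide), and the choice to align each slide to the mean upper quartile across slides are all assumptions made for illustration.

```python
import math

def upper_quartile(values):
    """75th percentile using linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = 0.75 * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def normalize_upper_quartile_log(slides):
    """Align slides on their log2 upper quartile (hypothetical helper).

    slides: dict mapping slide id -> list of positive intensities,
            one value per grid object for a single marker.
    Returns log2 intensities shifted so every slide shares the same
    upper quartile (the mean of the per-slide upper quartiles).
    """
    logged = {sid: [math.log2(v) for v in vals] for sid, vals in slides.items()}
    uq = {sid: upper_quartile(vals) for sid, vals in logged.items()}
    target = sum(uq.values()) / len(uq)
    return {sid: [v - uq[sid] + target for v in vals]
            for sid, vals in logged.items()}

# Slide "b" is uniformly twice as bright as slide "a" (a pure batch offset).
example = {"a": [1, 2, 4, 8, 16], "b": [2, 4, 8, 16, 32]}
normalized = normalize_upper_quartile_log(example)
```

In this toy input the two slides differ only by a constant factor, so after the shift in log space their normalized values coincide.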
Affiliation(s)
- John Graf
- To whom correspondence should be addressed.
| | - Sanghee Cho
- Department of Biology & Applied Physics, GE Research, Niskayuna, NY 12309, USA
| | - Elizabeth McDonough
- Department of Biology & Applied Physics, GE Research, Niskayuna, NY 12309, USA
| | - Alex Corwin
- Department of Biology & Applied Physics, GE Research, Niskayuna, NY 12309, USA
| | - Anup Sood
- Department of Biology & Applied Physics, GE Research, Niskayuna, NY 12309, USA
| | - Andreas Lindner
- Department of Physiology and Medical Physics, Centre of Systems Medicine, Royal College of Surgeons in Ireland University of Medicine and Health Sciences, 123 St. Stephen’s Green, Dublin 2, Ireland
| | - Manuela Salvucci
- Department of Physiology and Medical Physics, Centre of Systems Medicine, Royal College of Surgeons in Ireland University of Medicine and Health Sciences, 123 St. Stephen’s Green, Dublin 2, Ireland
| | - Xanthi Stachtea
- Department of Oncology, Centre for Cancer Research & Cell Biology, Queen’s University Belfast, 97 Lisburn Road, Belfast, BT9 7AE, Northern Ireland, UK
| | - Sandra Van Schaeybroeck
- Department of Oncology, Centre for Cancer Research & Cell Biology, Queen’s University Belfast, 97 Lisburn Road, Belfast, BT9 7AE, Northern Ireland, UK
| | - Philip D Dunne
- Department of Oncology, Centre for Cancer Research & Cell Biology, Queen’s University Belfast, 97 Lisburn Road, Belfast, BT9 7AE, Northern Ireland, UK
| | - Pierre Laurent-Puig
- Department of Biology, Hôpital Européen Georges-Pompidou, Assistance Publique - Hôpitaux de Paris, 3 Av. Victoria, 75004 Paris, France
| | - Daniel Longley
- Department of Oncology, Centre for Cancer Research & Cell Biology, Queen’s University Belfast, 97 Lisburn Road, Belfast, BT9 7AE, Northern Ireland, UK
| | - Jochen H M Prehn
- Department of Physiology and Medical Physics, Centre of Systems Medicine, Royal College of Surgeons in Ireland University of Medicine and Health Sciences, 123 St. Stephen’s Green, Dublin 2, Ireland
| | - Fiona Ginty
- To whom correspondence should be addressed.
| |
|
81
|
Johnson KA, Krishnan A. Robust normalization and transformation techniques for constructing gene coexpression networks from RNA-seq data. Genome Biol 2022; 23:1. [PMID: 34980209 PMCID: PMC8721966 DOI: 10.1186/s13059-021-02568-9] [Citation(s) in RCA: 44] [Impact Index Per Article: 22.0] [Received: 09/28/2020] [Accepted: 12/06/2021] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND: Constructing gene coexpression networks is a powerful approach for analyzing high-throughput gene expression data towards module identification, gene function prediction, and disease-gene prioritization. While optimal workflows for constructing coexpression networks, including good choices for data pre-processing, normalization, and network transformation, have been developed for microarray-based expression data, such well-tested choices do not exist for RNA-seq data. Almost all studies that compare data processing and normalization methods for RNA-seq focus on the end goal of determining differential gene expression.
RESULTS: Here, we present a comprehensive benchmarking and analysis of 36 different workflows, each with a unique set of normalization and network transformation methods, for constructing coexpression networks from RNA-seq datasets. We test these workflows on both large, homogeneous datasets and small, heterogeneous datasets from various labs. We analyze the workflows in terms of aggregate performance, individual method choices, and the impact of multiple dataset experimental factors. Our results demonstrate that between-sample normalization has the biggest impact, with counts adjusted by size factors producing networks that most accurately recapitulate known tissue-naive and tissue-aware gene functional relationships.
CONCLUSIONS: Based on this work, we provide concrete recommendations on robust procedures for building an accurate coexpression network from an RNA-seq dataset. In addition, researchers can examine all the results in great detail at https://krishnanlab.github.io/RNAseq_coexpression to make appropriate choices for coexpression analysis based on the experimental factors of their RNA-seq dataset.
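As an illustration of what "counts adjusted by size factors" can mean, the sketch below uses the median-of-ratios estimator. The abstract does not say which size-factor estimator the authors benchmarked, so the estimator, the function names, and the genes-by-samples input layout are assumptions made for illustration only.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors (assumed estimator, for illustration).

    counts: list of genes, each a list of raw counts, one per sample.
    """
    n_samples = len(counts[0])
    # Keep genes expressed in every sample so the logarithm is defined.
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Log geometric mean of each usable gene across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - g
                      for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

def adjust_counts(counts):
    """Divide each sample's counts by that sample's size factor."""
    sf = size_factors(counts)
    return [[c / s for c, s in zip(row, sf)] for row in counts]

# Sample 2 was sequenced twice as deeply as sample 1; the last gene has a
# zero count and is therefore excluded from the size-factor estimate.
raw = [[10, 20], [100, 200], [5, 10], [0, 3]]
factors = size_factors(raw)
adjusted = adjust_counts(raw)
```

With these toy counts the second sample's size factor is twice the first, so the adjusted counts of the genes that scale with depth become equal across the two samples.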
Affiliation(s)
- Kayla A Johnson
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA
| | - Arjun Krishnan
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA.
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA.
| |
|
82
|
Düren Y, Lederer J, Qin LX. OUP accepted manuscript. Nucleic Acids Res 2022; 50:e56. [PMID: 35188574 PMCID: PMC9177987 DOI: 10.1093/nar/gkac064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2021] [Revised: 01/03/2022] [Accepted: 02/08/2022] [Indexed: 11/22/2022] Open
Abstract
Deep sequencing has become one of the most popular tools for transcriptome profiling in biomedical studies. While an abundance of computational methods exists for ‘normalizing’ sequencing data to remove unwanted between-sample variations due to experimental handling, there is no consensus on which normalization is the most suitable for a given data set. To address this problem, we developed ‘DANA’, an approach for assessing the performance of normalization methods for microRNA sequencing data based on biology-motivated and data-driven metrics. Our approach takes advantage of well-known biological features of microRNAs for their expression pattern and chromosomal clustering to simultaneously assess (i) how effectively normalization removes handling artifacts and (ii) how aptly normalization preserves biological signals. With DANA, we confirm that the performance of eight commonly used normalization methods varies widely across different data sets and provide guidance for selecting a suitable method for the data at hand. Hence, DANA should be adopted as a routine preprocessing step (preceding normalization) for microRNA sequencing data analysis. DANA is implemented in R and publicly available at https://github.com/LXQin/DANA.
Affiliation(s)
- Yannick Düren
- Department of Mathematical Statistics, Ruhr-University Bochum, Universitätsstraße 150, 44801 Bochum, Germany
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Johannes Lederer
- Department of Mathematical Statistics, Ruhr-University Bochum, Universitätsstraße 150, 44801 Bochum, Germany
| | - Li-Xuan Qin
- To whom correspondence should be addressed. Tel: +1 646 888 8251; Fax: +1 646 888 0010.
| |
|
83
|
Cai M, Yin X, Tang X, Zhang C, Zheng Q, Li M. Metatranscriptomics reveals different features of methanogenic archaea among global vegetated coastal ecosystems. Sci Total Environ 2022; 802:149848. [PMID: 34464803 DOI: 10.1016/j.scitotenv.2021.149848] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/28/2021] [Revised: 08/17/2021] [Accepted: 08/19/2021] [Indexed: 06/13/2023]
Abstract
Vegetated coastal ecosystems (VCEs; i.e., mangroves, saltmarshes, and seagrasses) represent important sources of natural methane emission. Despite recent advances in the understanding of novel taxa and pathways associated with methanogenesis in these ecosystems, the key methanogenic players and the contribution of different substrates to methane formation remain elusive. Here, we systematically investigate the community and activity of methanogens using publicly available metatranscriptomes at a global scale together with our in-house metatranscriptomic dataset. Taxonomic profiling reveals that 13 groups of methanogenic archaea were transcribed in the investigated VCEs, and that they were dominated by Methanosarcinales. Among these VCEs, methanogens exhibited all three known methanogenic pathways in some mangrove sediments, where methylotrophic methanogens (Methanosarcinales/Methanomassiliicoccales) grew on diverse methyl compounds and coexisted with hydrogenotrophic (mainly Methanomicrobiales) and acetoclastic (mainly Methanothrix) methanogens. In contrast, the predominant methanogenic pathway in saltmarshes and seagrasses was constrained to methylotrophic methanogenesis. These findings reveal different archaeal methanogens in VCEs and suggest potentially distinct contributions of methanogenesis in these VCEs to global warming.
Affiliation(s)
- Mingwei Cai
- Institute of Chemical Biology, Shenzhen Bay Laboratory, Shenzhen, China; Shenzhen Key Laboratory of Marine Microbiome Engineering, Institute for Advanced Study, Shenzhen University, Shenzhen, China.
| | - Xiuran Yin
- Microbial Ecophysiology Group, Faculty of Biology/Chemistry, University of Bremen, Bremen, Germany; MARUM, Center for Marine Environmental Sciences, University of Bremen, Bremen, Germany
| | - Xiaoyu Tang
- Institute of Chemical Biology, Shenzhen Bay Laboratory, Shenzhen, China; School of Pharmaceutical Sciences, Nanjing Tech University, Nanjing, China
| | - Cuijing Zhang
- Shenzhen Key Laboratory of Marine Microbiome Engineering, Institute for Advanced Study, Shenzhen University, Shenzhen, China
| | - Qingfei Zheng
- Institute of Chemical Biology, Shenzhen Bay Laboratory, Shenzhen, China; School of Chemical Biology and Biotechnology, Peking University Shenzhen Graduate School, Shenzhen, China.
| | - Meng Li
- Shenzhen Key Laboratory of Marine Microbiome Engineering, Institute for Advanced Study, Shenzhen University, Shenzhen, China.
| |
|
84
|
Tesovnik T, Jenko Bizjan B, Šket R, Debeljak M, Battelino T, Kovač J. Technological Approaches in the Analysis of Extracellular Vesicle Nucleotide Sequences. Front Bioeng Biotechnol 2021; 9:787551. [PMID: 35004647 PMCID: PMC8733665 DOI: 10.3389/fbioe.2021.787551] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/30/2021] [Accepted: 11/11/2021] [Indexed: 12/12/2022] Open
Abstract
Together with metabolites, proteins, and lipid components, the extracellular vesicle (EV) cargo consists of DNA and RNA nucleotide sequence species, which are part of the intracellular communication network regulating specific cellular processes and provoking distinct target cell responses. The EV nucleotide sequence cargo molecules are often investigated in association with a particular pathology and may provide an insight into the physiological and pathological processes in hard-to-access organs and tissues. The diversity and biological function of EV nucleotide sequences are distinct regarding EV subgroups and differ in tissue- and cell-released EVs. EV DNA is present mainly in apoptotic bodies, while there are different species of EV RNAs in all subgroups of EVs. A limited sample volume of unique human liquid biopsy provides a small amount of EVs with limited isolated DNA and RNA, which can be a challenging factor for EV nucleotide sequence analysis, while an additional difficulty is the technical variability of molecular nucleotide detection. Every EV study is challenged by its first step, the EV isolation procedure, which determines the EVs' purity, yield, and diameter range and significantly affects the downstream analysis and the final result. The gold standard EV isolation procedure, ultracentrifugation, provides a low yield of isolated EVs that are not highly pure, while modern techniques increase EV yield and purity. Different EV DNA and RNA detection techniques include a PCR procedure for replicating the nucleotide sequences of interest, which can work with small-input EV DNA or RNA material. The advantages and disadvantages of the nucleotide sequence detection approaches should be considered to appropriately address the study problem and to extract specific EV nucleotide sequence information using qPCR or next-generation sequencing.
Advanced next-generation sequencing techniques allow the detection of total EV genomic or transcriptomic data even at single-molecule resolution, thus offering sensitive and accurate EV DNA or RNA biomarker detection. Additionally, novel biomarkers could be discovered by comparing EV genomic or transcriptomic profiles to identify characteristic EV differences in specific conditions. Therefore, a suitable differential expression analysis is crucial to define the EV DNA or RNA differences between conditions under investigation. Further bioinformatics analysis can predict molecular cell targets and identify targeted and affected cellular pathways. Target prediction tools, together with functional studies, are essential to help specify the role of the investigated EV-targeted nucleotide sequences in health and disease and support further development of EV-related therapeutics. This review discusses the biological diversity of the DNA and RNA species of human liquid biopsy-obtained EV nucleotide sequences reported as potential biomarkers in health and disease, and the methodological principles of their detection, from human liquid biopsy EV isolation and EV nucleotide sequence extraction to techniques for their detection and cell target prediction.
Affiliation(s)
- Tine Tesovnik
- Institute for Special Laboratory Diagnostics, University Medical Centre Ljubljana, University Children’s Hospital, Ljubljana, Slovenia
| | - Barbara Jenko Bizjan
- Institute for Special Laboratory Diagnostics, University Medical Centre Ljubljana, University Children’s Hospital, Ljubljana, Slovenia
| | - Robert Šket
- Institute for Special Laboratory Diagnostics, University Medical Centre Ljubljana, University Children’s Hospital, Ljubljana, Slovenia
| | - Maruša Debeljak
- Institute for Special Laboratory Diagnostics, University Medical Centre Ljubljana, University Children’s Hospital, Ljubljana, Slovenia
| | - Tadej Battelino
- Department of Pediatric Endocrinology, Diabetes and Metabolic Diseases, University Medical Centre Ljubljana, University Children’s Hospital, Ljubljana, Slovenia
- Faculty of Medicine, Chair of Paediatrics, University of Ljubljana, Ljubljana, Slovenia
| | - Jernej Kovač
- Institute for Special Laboratory Diagnostics, University Medical Centre Ljubljana, University Children’s Hospital, Ljubljana, Slovenia
| |
|
85
|
Cho-Clark MJ, Sukumar G, Vidal NM, Raiciulescu S, Oyola MG, Olsen C, Mariño-Ramírez L, Dalgard CL, Wu TJ. Comparative transcriptome analysis between patient and endometrial cancer cell lines to determine common signaling pathways and markers linked to cancer progression. Oncotarget 2021; 12:2500-2513. [PMID: 34966482 PMCID: PMC8711572 DOI: 10.18632/oncotarget.28161] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/08/2021] [Accepted: 12/10/2021] [Indexed: 01/08/2023] Open
Abstract
The rising incidence and mortality of endometrial cancer (EC) in the United States call for an improved understanding of the disease's progression. Current methodologies for diagnosis and treatment rely on the use of cell lines as models for tumor biology. However, due to inherent heterogeneity and differential growing environments between cell lines and tumors, these comparative studies have found few parallels in molecular signatures. As a consequence, the development and discovery of preclinical models and reliable drug targets are delayed. In this study, we established transcriptome parallels between cell lines and tumors from The Cancer Genome Atlas (TCGA) with the use of optimized normalization methods. We identified genes and signaling pathways associated with regulating the transformation and progression of EC. Specifically, the LXR/RXR activation, neuroprotective role for THOP1 in Alzheimer's disease, and glutamate receptor signaling pathways were observed to be mostly downregulated in advanced cancer stages. While some of these highlighted markers and signaling pathways are commonly found in the central nervous system (CNS), our results suggest a novel function of these genes in the periphery. Finally, our study underscores the value of implementing appropriate normalization methods in comparative studies to improve the identification of accurate and reliable markers.
Affiliation(s)
- Madelaine J. Cho-Clark
- Department of Gynecologic Surgery & Obstetrics, Uniformed Services University of the Health Sciences, Bethesda, MD 20814, USA
| | - Gauthaman Sukumar
- Collaborative Health Initiative Research Program, Uniformed Services University of the Health Sciences, Bethesda, MD 20814, USA
| | - Newton Medeiros Vidal
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Sorana Raiciulescu
- Preventive Medicine and Biostatistics, Uniformed Services University of the Health Sciences, Bethesda, MD 20814, USA
| | - Mario G. Oyola
- Department of Gynecologic Surgery & Obstetrics, Uniformed Services University of the Health Sciences, Bethesda, MD 20814, USA
| | - Cara Olsen
- Preventive Medicine and Biostatistics, Uniformed Services University of the Health Sciences, Bethesda, MD 20814, USA
| | - Leonardo Mariño-Ramírez
- National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, MD 20814, USA
| | - Clifton L. Dalgard
- Collaborative Health Initiative Research Program, Uniformed Services University of the Health Sciences, Bethesda, MD 20814, USA
- Department of Anatomy, Physiology and Genetics, Uniformed Services University of the Health Sciences, Bethesda, MD 20814, USA
| | - T. John Wu
- Department of Gynecologic Surgery & Obstetrics, Uniformed Services University of the Health Sciences, Bethesda, MD 20814, USA
| |
|
86
|
Asami M, Lam BYH, Ma MK, Rainbow K, Braun S, VerMilyea MD, Yeo GSH, Perry ACF. Human embryonic genome activation initiates at the one-cell stage. Cell Stem Cell 2021; 29:209-216.e4. [PMID: 34936886 PMCID: PMC8826644 DOI: 10.1016/j.stem.2021.11.012] [Citation(s) in RCA: 62] [Impact Index Per Article: 20.7] [Received: 04/08/2021] [Revised: 08/24/2021] [Accepted: 11/29/2021] [Indexed: 12/13/2022]
Abstract
In human embryos, the initiation of transcription (embryonic genome activation [EGA]) occurs by the eight-cell stage, but its exact timing and profile are unclear. To address this, we profiled gene expression at depth in human metaphase II oocytes and bipronuclear (2PN) one-cell embryos. High-resolution single-cell RNA sequencing revealed previously inaccessible oocyte-to-embryo gene expression changes. This confirmed transcript depletion following fertilization (maternal RNA degradation) but also uncovered low-magnitude upregulation of hundreds of spliced transcripts. Gene expression analysis predicted embryonic processes including cell-cycle progression and chromosome maintenance as well as transcriptional activators that included cancer-associated gene regulators. Transcription was disrupted in abnormal monopronuclear (1PN) and tripronuclear (3PN) one-cell embryos. These findings indicate that human embryonic transcription initiates at the one-cell stage, sooner than previously thought. The pattern of gene upregulation promises to illuminate processes involved at the onset of human development, with implications for epigenetic inheritance, stem-cell-derived embryos, and cancer.
- Gene expression initiates at the one-cell stage in human embryos
- Expression is of low magnitude but remains elevated until the eight-cell stage
- Upregulated transcripts are spliced and correspond to embryonic processes
- Upregulation is disrupted in morphologically abnormal one-cell embryos
Affiliation(s)
- Maki Asami
- Laboratory of Mammalian Molecular Embryology, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, England
| | - Brian Y H Lam
- MRC Metabolic Diseases Unit, Wellcome-MRC Institute of Metabolic Science, Addenbrooke's Hospital, University of Cambridge, Cambridge CB2 0QQ, England
| | - Marcella K Ma
- MRC Metabolic Diseases Unit, Wellcome-MRC Institute of Metabolic Science, Addenbrooke's Hospital, University of Cambridge, Cambridge CB2 0QQ, England
| | - Kara Rainbow
- MRC Metabolic Diseases Unit, Wellcome-MRC Institute of Metabolic Science, Addenbrooke's Hospital, University of Cambridge, Cambridge CB2 0QQ, England
| | - Stefanie Braun
- Ovation Fertility Austin, Embryology and Andrology Laboratories, Austin, TX 78731, USA
| | - Matthew D VerMilyea
- Ovation Fertility Austin, Embryology and Andrology Laboratories, Austin, TX 78731, USA.
| | - Giles S H Yeo
- MRC Metabolic Diseases Unit, Wellcome-MRC Institute of Metabolic Science, Addenbrooke's Hospital, University of Cambridge, Cambridge CB2 0QQ, England.
| | - Anthony C F Perry
- Laboratory of Mammalian Molecular Embryology, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, England.
| |
|
87
|
Zolotareva O, Nasirigerdeh R, Matschinske J, Torkzadehmahani R, Bakhtiari M, Frisch T, Späth J, Blumenthal DB, Abbasinejad A, Tieri P, Kaissis G, Rückert D, Wenke NK, List M, Baumbach J. Flimma: a federated and privacy-aware tool for differential gene expression analysis. Genome Biol 2021; 22:338. [PMID: 34906207 PMCID: PMC8670124 DOI: 10.1186/s13059-021-02553-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/23/2020] [Accepted: 11/22/2021] [Indexed: 12/13/2022] Open
Abstract
Aggregating transcriptomics data across hospitals can increase sensitivity and robustness of differential expression analyses, yielding deeper clinical insights. As data exchange is often restricted by privacy legislation, meta-analyses are frequently employed to pool local results. However, the accuracy might drop if class labels are inhomogeneously distributed among cohorts. Flimma (https://exbio.wzw.tum.de/flimma/) addresses this issue by implementing the state-of-the-art workflow limma voom in a federated manner, i.e., patient data never leaves its source site. Flimma results are identical to those generated by limma voom on aggregated datasets even in imbalanced scenarios where meta-analysis approaches fail.
Affiliation(s)
- Olga Zolotareva
- Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany; Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany.
| | - Reza Nasirigerdeh
- AI in Medicine and Healthcare, Technical University of Munich, Munich, Germany; Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
| | - Julian Matschinske
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
| | | | - Mohammad Bakhtiari
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
| | - Tobias Frisch
- Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
| | - Julian Späth
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
| | - David B Blumenthal
- Department Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
| | - Amir Abbasinejad
- Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany; Sapienza University of Rome, Rome, Italy
| | - Paolo Tieri
- CNR National Research Council, IAC Institute for Applied Computing, Rome, Italy; Sapienza University of Rome, Rome, Italy
| | - Georgios Kaissis
- AI in Medicine and Healthcare, Technical University of Munich, Munich, Germany; Klinikum rechts der Isar, Technical University of Munich, Munich, Germany; Biomedical Image Analysis Group, Imperial College London, London, UK; OpenMined, Oxford, UK
| | - Daniel Rückert
- AI in Medicine and Healthcare, Technical University of Munich, Munich, Germany; Klinikum rechts der Isar, Technical University of Munich, Munich, Germany; Biomedical Image Analysis Group, Imperial College London, London, UK
| | - Nina K Wenke
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
| | - Markus List
- Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
| | - Jan Baumbach
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany; Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
| |
|
88
|
Kudelova E, Holubekova V, Grendar M, Kolkova Z, Samec M, Vanova B, Mikolajcik P, Smolar M, Kudela E, Laca L, Lasabova Z. Circulating miRNA expression over the course of colorectal cancer treatment. Oncol Lett 2021; 23:18. [PMID: 34868358 PMCID: PMC8630815 DOI: 10.3892/ol.2021.13136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2021] [Accepted: 07/20/2021] [Indexed: 11/06/2022] Open
Abstract
Colorectal cancer (CRC) is the third-most common cancer type in males and the second-most common cancer type in females, and has the second-highest overall mortality rate worldwide. Approximately 50% of patients in stage I–III develop metastases, mostly localized to the liver. All physiological conditions occurring in the organism are also reflected in the levels of circulating microRNAs (miRNAs/miRs) in patients. miRNAs are a class of small, non-coding, single-stranded RNAs consisting of 18–25 nucleotides, which have important roles in various cellular processes. The aim of the present study was to evaluate a panel of seven circulating miRNAs (miR-106a-5p, miR-210-5p, miR-155-5p, miR-21-5p, miR-103a-3p, miR-191-5p and miR-16-5p) as biomarkers for monitoring patients undergoing adjuvant treatment of CRC. Total RNA was extracted from the plasma of patients with CRC prior to surgery, in the early post-operative period (n=60) and 3 months after surgery (n=14). The levels of the selected circulating miRNAs were measured with the miRCURY LNA miRNA PCR system and fold changes were calculated using the standard ∆∆Cq method. DIANA-miRPath analysis was used to evaluate the role of significantly deregulated miRNAs. The results indicated significant upregulation of miR-155-5p, miR-21-5p and miR-191-5p, and downregulation of miR-16-5p directly after the surgery. In paired follow-up samples, the most significant upregulation was detected for miR-106a-5p and miR-16-5p, and the most significant downregulation was for miR-21-5p. Pathway analysis outlined the role of the differentially expressed miRNAs in cancer development, but the same pathways are also involved in wound healing and regeneration of intestinal epithelium. It may be suggested that these processes should also be considered in studies investigating sensitive and easily detectable circulating biomarkers for recurrence in patients.
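The ∆∆Cq fold-change calculation named in the abstract can be sketched as follows. This is the textbook form of the method, which assumes roughly 100% amplification efficiency; the function name, the argument names, and the example Cq values are hypothetical, and the abstract does not state which reference miRNA or which baseline sample the authors used.

```python
def fold_change_ddcq(cq_target_sample, cq_ref_sample,
                     cq_target_baseline, cq_ref_baseline):
    """Relative expression by the standard delta-delta-Cq calculation.

    Each delta-Cq normalizes the target miRNA to a reference within one
    sample; the delta-delta-Cq compares the sample of interest with a
    baseline sample. Assumes amplification efficiency close to 100%.
    """
    delta_sample = cq_target_sample - cq_ref_sample
    delta_baseline = cq_target_baseline - cq_ref_baseline
    ddcq = delta_sample - delta_baseline
    return 2.0 ** (-ddcq)

# Hypothetical Cq values: the target crosses threshold two cycles earlier
# (relative to the reference) in the sample than in the baseline.
fc = fold_change_ddcq(24.0, 20.0, 26.0, 20.0)  # 2 ** 2 = 4.0
```

A fold change above 1 corresponds to upregulation relative to the baseline and a value below 1 to downregulation.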
Affiliation(s)
- Eva Kudelova
- Clinic of Surgery and Transplant Center, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin SK-03601, Slovak Republic
| | - Veronika Holubekova
- Biomedical Center in Martin, Jessenius Faculty of Medicine, Comenius University in Bratislava, Martin SK-03601, Slovak Republic
| | - Marian Grendar
- Biomedical Center in Martin, Jessenius Faculty of Medicine, Comenius University in Bratislava, Martin SK-03601, Slovak Republic
| | - Zuzana Kolkova
- Biomedical Center in Martin, Jessenius Faculty of Medicine, Comenius University in Bratislava, Martin SK-03601, Slovak Republic
| | - Marek Samec
- Clinic of Gynecology and Obstetrics, Jessenius Faculty of Medicine, Comenius University in Bratislava, Martin SK-03601, Slovak Republic
| | - Barbora Vanova
- Biomedical Center in Martin, Jessenius Faculty of Medicine, Comenius University in Bratislava, Martin SK-03601, Slovak Republic
| | - Peter Mikolajcik
- Clinic of Surgery and Transplant Center, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin SK-03601, Slovak Republic
| | - Marek Smolar
- Clinic of Surgery and Transplant Center, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin SK-03601, Slovak Republic
| | - Erik Kudela
- Clinic of Gynecology and Obstetrics, Jessenius Faculty of Medicine, Comenius University in Bratislava, Martin SK-03601, Slovak Republic
| | - Ludovit Laca
- Clinic of Surgery and Transplant Center, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin SK-03601, Slovak Republic
| | - Zora Lasabova
- Department of Molecular Biology and Genomics, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin SK-03601, Slovak Republic
| |
|
89
|
Pruisscher P, Lehmann P, Nylin S, Gotthard K, Wheat CW. Extensive transcriptomic profiling of pupal diapause in a butterfly reveals a dynamic phenotype. Mol Ecol 2021; 31:1269-1280. [PMID: 34862690 DOI: 10.1111/mec.16304] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/28/2021] [Revised: 10/13/2021] [Accepted: 10/15/2021] [Indexed: 12/13/2022]
Abstract
Diapause is a common adaptation for overwintering in insects that is characterized by arrested development and increased tolerance to stress and cold. While the expression of specific candidate genes during diapause has been investigated, there is no general understanding of the dynamics of the transcriptional landscape as a whole during the extended diapause phenotype. Such a detailed temporal insight is important as diapause is a vital aspect of life cycle timing. Here, we performed a time-course experiment using RNA-Seq on the head and abdomen in the butterfly Pieris napi. In both body parts, comparing diapausing and nondiapausing siblings, differentially expressed genes are detected from the first day of pupal development onwards, varying dramatically across these formative stages. During diapause there are strong gene expression dynamics present, revealing a preprogrammed transcriptional landscape that is active during the winter. Different biological processes appear to be active in the two body parts. Finally, adults emerging from either the direct or diapause pathways do not show large transcriptomic differences, suggesting the adult phenotype is strongly canalized.
Collapse
Affiliation(s)
- Philipp Lehmann
- Department of Zoology, Stockholm University, Stockholm, Sweden
- Sören Nylin
- Department of Zoology, Stockholm University, Stockholm, Sweden
- Karl Gotthard
- Department of Zoology, Stockholm University, Stockholm, Sweden
|
90
|
Karakulak T, Moch H, von Mering C, Kahraman A. Probing Isoform Switching Events in Various Cancer Types: Lessons From Pan-Cancer Studies. Front Mol Biosci 2021; 8:726902. [PMID: 34888349 PMCID: PMC8650491 DOI: 10.3389/fmolb.2021.726902] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/17/2021] [Accepted: 11/01/2021] [Indexed: 12/03/2022] Open
Abstract
Alternative splicing is an essential regulatory mechanism for gene expression in mammalian cells contributing to protein, cellular, and species diversity. In cancer, alternative splicing is frequently disturbed, leading to changes in the expression of alternatively spliced protein isoforms. Advances in sequencing technologies and analysis methods have led to new insights into the extent and functional impact of disturbed alternative splicing events. In this review, we give a brief overview of the molecular mechanisms driving alternative splicing, highlight the function of alternative splicing in healthy tissues and describe how alternative splicing is disrupted in cancer. We summarize currently available computational tools for analyzing differential transcript usage, isoform switching events, and the pathogenic impact of cancer-specific splicing events. Finally, the strategies of three recent pan-cancer studies on isoform switching events are compared. Their methodological similarities and discrepancies are highlighted and lessons learned from the comparison are listed. We hope that our assessment will lead to new and more robust methods for cancer-specific transcript detection and help to produce more accurate functional impact predictions of isoform switching events.
Affiliation(s)
- Tülay Karakulak
- Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- Department of Pathology and Molecular Pathology, University Hospital Zurich, Zurich, Switzerland
- Swiss Informatics Institute, Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Holger Moch
- Department of Pathology and Molecular Pathology, University Hospital Zurich, Zurich, Switzerland
- Faculty of Medicine, University of Zurich, Zurich, Switzerland
- Christian von Mering
- Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- Swiss Informatics Institute, Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Abdullah Kahraman
- Department of Pathology and Molecular Pathology, University Hospital Zurich, Zurich, Switzerland
- Swiss Informatics Institute, Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
91
|
Deep Learning for Human Disease Detection, Subtype Classification, and Treatment Response Prediction Using Epigenomic Data. Biomedicines 2021; 9:biomedicines9111733. [PMID: 34829962 PMCID: PMC8615388 DOI: 10.3390/biomedicines9111733] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/23/2021] [Revised: 10/26/2021] [Accepted: 11/17/2021] [Indexed: 12/25/2022] Open
Abstract
Deep learning (DL) is a distinct class of machine learning that has achieved first-class performance in many fields of study. For epigenomics, the application of DL to assist physicians and scientists in human disease-relevant prediction tasks has been relatively unexplored until very recently. In this article, we critically review published studies that employed DL models for disease detection, subtype classification, and treatment response prediction using epigenomic data. A comprehensive search on PubMed, Scopus, Web of Science, Google Scholar, and arXiv.org was performed following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. Among 1140 initially identified publications, we included 22 articles in our review. DNA methylation and RNA-sequencing data are most frequently used to train the predictive models. The reviewed models achieved high accuracy, ranging from 88.3% to 100.0% for disease detection tasks, from 69.5% to 97.8% for subtype classification tasks, and from 80.0% to 93.0% for treatment response prediction tasks. We generated a workflow to develop a predictive model that encompasses all steps from first defining human disease-related tasks to finally evaluating model performance. DL holds promise for transforming epigenomic big data into valuable knowledge that will enhance the development of translational epigenomics.
|
92
|
Ropers D, Couté Y, Faure L, Ferré S, Labourdette D, Shabani A, Trouilh L, Vasseur P, Corre G, Ferro M, Teste MA, Geiselmann J, de Jong H. Multiomics Study of Bacterial Growth Arrest in a Synthetic Biology Application. ACS Synth Biol 2021; 10:2910-2926. [PMID: 34739215 DOI: 10.1021/acssynbio.1c00115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
We investigated the scalability of a previously developed growth switch based on external control of RNA polymerase expression. Our results indicate that, in liter-scale bioreactors operating in fed-batch mode, growth-arrested Escherichia coli cells are able to convert glucose to glycerol at an increased yield. A multiomics quantification of the physiology of the cells shows that, apart from acetate production, few metabolic side effects occur. However, a number of specific responses to growth slow-down and growth arrest are launched at the transcriptional level. These notably include the downregulation of genes involved in growth-associated processes, such as amino acid and nucleotide metabolism and translation. Interestingly, the transcriptional responses are buffered at the proteome level, probably due to the strong decrease of the total mRNA concentration after the diminution of transcriptional activity and the absence of growth dilution of proteins. Growth arrest thus reduces the opportunities for dynamically adjusting the proteome composition, which poses constraints on the design of biotechnological production processes but may also avoid the initiation of deleterious stress responses.
Affiliation(s)
- Yohann Couté
- Université Grenoble Alpes, INSERM, CEA, UMR BioSanté U1292, CNRS, CEA, FR2048, 38000 Grenoble, France
- Sabrina Ferré
- Université Grenoble Alpes, INSERM, CEA, UMR BioSanté U1292, CNRS, CEA, FR2048, 38000 Grenoble, France
- Delphine Labourdette
- GeT-Biopuces, TBI, Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
- Arieta Shabani
- Université Grenoble Alpes, Inria, 38000 Grenoble, France
- Lidwine Trouilh
- GeT-Biopuces, TBI, Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
- Myriam Ferro
- Université Grenoble Alpes, INSERM, CEA, UMR BioSanté U1292, CNRS, CEA, FR2048, 38000 Grenoble, France
- Marie-Ange Teste
- GeT-Biopuces, TBI, Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
- Johannes Geiselmann
- Université Grenoble Alpes, Inria, 38000 Grenoble, France
- Université Grenoble Alpes, CNRS, LIPhy, 38000 Grenoble, France
- Hidde de Jong
- Université Grenoble Alpes, Inria, 38000 Grenoble, France
|
93
|
Sweany RR, Mack BM, Moore GG, Gilbert MK, Cary JW, Lebar MD, Rajasekaran K, Damann Jr. KE. Genetic Responses and Aflatoxin Inhibition during Co-Culture of Aflatoxigenic and Non-Aflatoxigenic Aspergillus flavus. Toxins (Basel) 2021; 13:794. [PMID: 34822579 PMCID: PMC8618995 DOI: 10.3390/toxins13110794] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/24/2021] [Revised: 10/30/2021] [Accepted: 11/05/2021] [Indexed: 11/16/2022] Open
Abstract
Aflatoxin is a carcinogenic mycotoxin produced by Aspergillus flavus. Non-aflatoxigenic (Non-tox) A. flavus isolates are deployed in corn fields as biocontrol because they substantially reduce aflatoxin contamination via direct replacement and additionally via direct contact or touch with toxigenic (Tox) isolates and secretion of inhibitory/degradative chemicals. To understand touch inhibition, HPLC analysis and RNA sequencing examined aflatoxin production and gene expression of Non-tox isolate 17 and Tox isolate 53 mono-cultures and during their interaction in co-culture. Aflatoxin production was reduced by 99.7% in 72 h co-cultures. Fewer than expected unique reads were assigned to Tox 53 during co-culture, indicating its growth and/or gene expression was inhibited in response to Non-tox 17. Predicted secreted proteins and genes involved in oxidation/reduction were enriched in Non-tox 17 and co-cultures compared to Tox 53. Five secondary metabolite (SM) gene clusters and kojic acid synthesis genes were upregulated in Non-tox 17 compared to Tox 53 and a few were further upregulated in co-cultures in response to touch. These results suggest Non-tox strains can inhibit growth and aflatoxin gene cluster expression in Tox strains through touch. Additionally, upregulation of other SM genes and redox genes during the biocontrol interaction demonstrates a potential role of inhibitory SMs and antioxidants as additional biocontrol mechanisms and deserves further exploration to improve biocontrol formulations.
Affiliation(s)
- Rebecca R. Sweany
- Food and Feed Safety Research Unit, Southern Regional Research Center, US Department of Agriculture, New Orleans, LA 70124, USA
- Department of Plant Pathology and Crop Physiology, Louisiana State University, Baton Rouge, LA 70808, USA
- Brian M. Mack
- Food and Feed Safety Research Unit, Southern Regional Research Center, US Department of Agriculture, New Orleans, LA 70124, USA
- Department of Plant Pathology and Crop Physiology, Louisiana State University, Baton Rouge, LA 70808, USA
- Geromy G. Moore
- Food and Feed Safety Research Unit, Southern Regional Research Center, US Department of Agriculture, New Orleans, LA 70124, USA
- Department of Plant Pathology and Crop Physiology, Louisiana State University, Baton Rouge, LA 70808, USA
- Matthew K. Gilbert
- Food and Feed Safety Research Unit, Southern Regional Research Center, US Department of Agriculture, New Orleans, LA 70124, USA
- Department of Plant Pathology and Crop Physiology, Louisiana State University, Baton Rouge, LA 70808, USA
- Jeffrey W. Cary
- Food and Feed Safety Research Unit, Southern Regional Research Center, US Department of Agriculture, New Orleans, LA 70124, USA
- Department of Plant Pathology and Crop Physiology, Louisiana State University, Baton Rouge, LA 70808, USA
- Matthew D. Lebar
- Food and Feed Safety Research Unit, Southern Regional Research Center, US Department of Agriculture, New Orleans, LA 70124, USA
- Department of Plant Pathology and Crop Physiology, Louisiana State University, Baton Rouge, LA 70808, USA
- Kanniah Rajasekaran
- Food and Feed Safety Research Unit, Southern Regional Research Center, US Department of Agriculture, New Orleans, LA 70124, USA
- Department of Plant Pathology and Crop Physiology, Louisiana State University, Baton Rouge, LA 70808, USA
- Kenneth E. Damann Jr.
- Department of Plant Pathology and Crop Physiology, Louisiana State University, Baton Rouge, LA 70808, USA
|
94
|
Tran DT, Might M. cdev: a ground-truth based measure to evaluate RNA-seq normalization performance. PeerJ 2021; 9:e12233. [PMID: 34707933 PMCID: PMC8496462 DOI: 10.7717/peerj.12233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2021] [Accepted: 09/09/2021] [Indexed: 11/28/2022] Open
Abstract
Normalization of RNA-seq data has been an active area of research since the problem was first recognized a decade ago. Despite the active development of new normalizers, their performance measures have been given little attention. To evaluate normalizers, researchers have been relying on ad hoc measures, most of which are either qualitative, potentially biased, or easily confounded by parametric choices of downstream analysis. We propose a metric called condition-number based deviation, or cdev, to quantify normalization success. cdev measures how much an expression matrix differs from another. If a ground truth normalization is given, cdev can then be used to evaluate the performance of normalizers. To establish experimental ground truth, we compiled an extensive set of public RNA-seq assays with external spike-ins. This data collection, together with cdev, provides a valuable toolset for benchmarking new and existing normalization methods.
Affiliation(s)
- Diem-Trang Tran
- School of Computing, University of Utah, Salt Lake City, UT, United States of America
- Matthew Might
- Hugh Kaul Precision Medicine Institute, University of Alabama at Birmingham, Birmingham, AL, United States of America
|
95
|
Osabe T, Shimizu K, Kadota K. Differential expression analysis using a model-based gene clustering algorithm for RNA-seq data. BMC Bioinformatics 2021; 22:511. [PMID: 34670485 PMCID: PMC8527798 DOI: 10.1186/s12859-021-04438-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/24/2020] [Accepted: 10/11/2021] [Indexed: 11/10/2022] Open
Abstract
Background RNA-seq is a tool for measuring gene expression and is commonly used to identify differentially expressed genes (DEGs). Gene clustering is used to classify DEGs with similar expression patterns for the subsequent analyses of data from experiments such as time-courses or multi-group comparisons. However, gene clustering has rarely been used for analyzing simple two-group data or differential expression (DE). In this study, we report that a model-based clustering algorithm implemented in an R package, MBCluster.Seq, can also be used for DE analysis.
Results The input data originally used by MBCluster.Seq is DEGs, and the proposed method (called MBCdeg) uses all genes for the analysis. The method uses posterior probabilities of genes assigned to a cluster displaying non-DEG pattern for overall gene ranking. We compared the performance of MBCdeg with conventional R packages such as edgeR, DESeq2, and TCC that are specialized for DE analysis using simulated and real data. Our results showed that MBCdeg outperformed other methods when the proportion of DEG (PDEG) was less than 50%. However, the DEG identification using MBCdeg was less consistent than with conventional methods. We compared the effects of different normalization algorithms using MBCdeg, and performed an analysis using MBCdeg in combination with a robust normalization algorithm (called DEGES) that was not implemented in MBCluster.Seq. The new analysis method showed greater stability than using the original MBCdeg with the default normalization algorithm.
Conclusions MBCdeg with DEGES normalization can be used in the identification of DEGs when the PDEG is relatively low. As the method is based on gene clustering, the DE result includes information on which expression pattern the gene belongs to. The new method may be useful for the analysis of time-course and multi-group data, where the classification of expression patterns is often required.
Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04438-4.
Affiliation(s)
- Takayuki Osabe
- Graduate School of Agricultural and Life Sciences, The University of Tokyo, Yayoi 1-1-1, Bunkyo-ku, Tokyo, 113-8657, Japan
- Kentaro Shimizu
- Graduate School of Agricultural and Life Sciences, The University of Tokyo, Yayoi 1-1-1, Bunkyo-ku, Tokyo, 113-8657, Japan
- Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Yayoi 1-1-1, Bunkyo-ku, Tokyo, 113-8657, Japan
- Interfaculty Initiative in Information Studies, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo, 113-0033, Japan
- Koji Kadota
- Graduate School of Agricultural and Life Sciences, The University of Tokyo, Yayoi 1-1-1, Bunkyo-ku, Tokyo, 113-8657, Japan
- Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Yayoi 1-1-1, Bunkyo-ku, Tokyo, 113-8657, Japan
- Interfaculty Initiative in Information Studies, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo, 113-0033, Japan
|
96
|
Annotation depth confounds direct comparison of gene expression across species. BMC Bioinformatics 2021; 22:499. [PMID: 34654362 PMCID: PMC8518172 DOI: 10.1186/s12859-021-04414-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/02/2021] [Accepted: 09/30/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Comparisons of the molecular framework among organisms can be done on both structural and functional levels. One of the most common top-down approaches for functional comparisons is RNA sequencing. This estimation of organismal transcriptional responses is of interest for understanding evolution of molecular activity, which is used for answering a diversity of questions ranging from basic biology to pre-clinical species selection and translation. However, direct comparison between species is often hindered by evolutionary divergence in the structure of the molecular framework, as well as large differences in the depth of our understanding of the genetic background between humans and other species. Here, we focus on the latter. We attempt to understand how differences in transcriptome annotation affect direct gene abundance comparisons between species. RESULTS We examine and suggest some straightforward approaches for direct comparison given the currently available tools and using a sample dataset from human, cynomolgus monkey, dog, rat and mouse with a common quantitation and normalization approach. In addition, we examine how variation in genome annotation depth and quality across species may affect these direct comparisons. CONCLUSIONS Our findings suggest that further efforts for better genome annotation or computational normalization tools may be of strong interest.
|
97
|
Abstract
Cancer is a genetic disease in which multiple genes are perturbed. Thus, information about the regulatory relationships between genes is necessary for the identification of biomarkers and therapeutic targets. In this review, methods for inference of gene regulatory networks (GRNs) from transcriptomics data that are used in cancer research are introduced. The methods are classified into three categories according to the analysis model. The first category includes methods that use pair-wise measures between genes, including correlation coefficient and mutual information. The second category includes methods that determine the genetic regulatory relationship using multivariate measures, which consider the expression profiles of all genes concurrently. The third category includes methods using supervised and integrative approaches. The supervised approach estimates the regulatory relationship using a supervised learning method that constructs a regression or classification model for predicting whether there is a regulatory relationship between genes with input data of gene expression profiles and class labels of prior biological knowledge. The integrative method is an expansion of the supervised method and uses more data and biological knowledge for predicting the regulatory relationship. Furthermore, simulation and experimental validation of the estimated GRNs are also discussed in this review. This review identified that most GRN inference methods are not specific for cancer transcriptome data, and such methods are required for better understanding of cancer pathophysiology. In addition, more systematic methods for validation of the estimated GRNs need to be developed in the context of cancer biology.
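The first category described above, pair-wise measures between genes, can be illustrated with a minimal sketch. It assumes Pearson correlation as the pair-wise measure and a fixed absolute-correlation cutoff for calling an edge; the function names, the `threshold` value, and the toy expression profiles are hypothetical and are not taken from the review.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def correlation_network(expr, threshold=0.8):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold.

    `expr` maps a gene name to its expression values across samples.
    The threshold is an illustrative assumption, not a recommended value.
    """
    edges = []
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Toy data: A and B are perfectly co-expressed, C is only weakly related to both.
toy_expr = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [1, -1, 1, -1]}
edges = correlation_network(toy_expr)
```

Such a network is undirected and captures association only; inferring the direction of regulation requires the multivariate or supervised approaches of the other two categories.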
|
98
|
Wei Q, Ramsey SA. Predicting chemotherapy response using a variational autoencoder approach. BMC Bioinformatics 2021; 22:453. [PMID: 34551729 PMCID: PMC8456615 DOI: 10.1186/s12859-021-04339-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/30/2021] [Accepted: 08/17/2021] [Indexed: 01/14/2023] Open
Abstract
Background Multiple studies have shown the utility of transcriptome-wide RNA-seq profiles as features for machine learning-based prediction of response to chemotherapy in cancer. While tumor transcriptome profiles are publicly available for thousands of tumors for many cancer types, a relatively modest number of tumor profiles are clinically annotated for response to chemotherapy. The paucity of labeled examples and the high dimension of the feature data limit performance for predicting therapeutic response using fully-supervised classification methods. Recently, multiple studies have established the utility of a deep neural network approach, the variational autoencoder (VAE), for generating meaningful latent features from original data. Here, we report the first study of a semi-supervised approach using VAE-encoded tumor transcriptome features and regularized gradient boosted decision trees (XGBoost) to predict chemotherapy drug response for five cancer types: colon, pancreatic, bladder, breast, and sarcoma.
Results We found: (1) VAE-encoding of the tumor transcriptome preserves the cancer type identity of the tumor, suggesting preservation of biologically relevant information; and (2) as a feature-set for supervised classification to predict response-to-chemotherapy, the unsupervised VAE encoding of the tumor’s gene expression profile leads to better area under the receiver operating characteristic curve and area under the precision-recall curve classification performance than the original gene expression profile or the PCA principal components or the ICA components of the gene expression profile, in four out of five cancer types that we tested.
Conclusions Given high-dimensional “omics” data, the VAE is a powerful tool for obtaining a nonlinear low-dimensional embedding; it yields features that retain biological patterns that distinguish between different types of cancer and that enable more accurate tumor transcriptome-based prediction of response to chemotherapy than would be possible using the original data or their principal components.
Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04339-6.
Affiliation(s)
- Qi Wei
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA.
- Stephen A Ramsey
- Department of Biomedical Sciences, Oregon State University, Corvallis, OR, USA
|
99
|
Borella M, Martello G, Risso D, Romualdi C. PsiNorm: a scalable normalization for single-cell RNA-seq data. Bioinformatics 2021; 38:164-172. [PMID: 34499096 PMCID: PMC8696108 DOI: 10.1093/bioinformatics/btab641] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/18/2021] [Revised: 08/30/2021] [Accepted: 09/06/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION Single-cell RNA sequencing (scRNA-seq) enables transcriptome-wide gene expression measurements at single-cell resolution, providing a comprehensive view of the compositions and dynamics of tissue and organism development. The evolution of scRNA-seq protocols has led to a dramatic increase in cell throughput, exacerbating many of the computational and statistical issues that previously arose for bulk sequencing. In particular, with scRNA-seq data all the analysis steps, including normalization, have become computationally intensive, both in terms of memory usage and computational time. From this perspective, new accurate methods that scale efficiently are desirable. RESULTS Here, we propose PsiNorm, a between-sample normalization method based on the power-law Pareto distribution parameter estimate. We show that the Pareto distribution closely resembles scRNA-seq data, especially those coming from platforms that use unique molecular identifiers. Motivated by this result, we implement PsiNorm, a simple and highly scalable normalization method. We benchmark PsiNorm against seven other methods in terms of cluster identification, concordance and computational resources required. We demonstrate that PsiNorm is among the top performing methods, showing a good trade-off between accuracy and scalability. Moreover, PsiNorm does not need a reference, a characteristic that makes it useful in supervised classification settings, in which new out-of-sample data need to be normalized. AVAILABILITY AND IMPLEMENTATION PsiNorm is implemented in the scone Bioconductor package and available at https://bioconductor.org/packages/scone/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
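The idea of normalizing by a Pareto parameter estimate can be sketched as follows. This is a minimal illustration under stated assumptions, not the scone implementation: it assumes the per-cell quantity is the standard maximum-likelihood estimate of the Pareto shape parameter, alpha = n / sum(log(x_i / x_min)), computed on counts shifted by a pseudocount, and that each cell's counts are multiplied by its own alpha. The function names and the pseudocount are hypothetical.

```python
import math


def pareto_shape_mle(values):
    """Maximum-likelihood estimate of the Pareto shape parameter.

    Uses alpha = n / sum(log(x_i / x_min)), taking the smallest observed
    value as the scale x_min. All values must be strictly positive.
    """
    x_min = min(values)
    log_sum = sum(math.log(v / x_min) for v in values)
    if log_sum == 0.0:
        raise ValueError("all values are equal; the shape parameter is undefined")
    return len(values) / log_sum


def pareto_scale_cells(counts, pseudocount=1.0):
    """Scale each cell's counts by its own Pareto shape estimate.

    `counts` is a list of cells, each a list of per-gene counts. The
    pseudocount (an assumption of this sketch) keeps zero counts inside
    the domain of the logarithm.
    """
    normalized = []
    for cell in counts:
        shifted = [c + pseudocount for c in cell]
        alpha = pareto_shape_mle(shifted)
        normalized.append([c * alpha for c in cell])
    return normalized


# Two toy cells with different sequencing depth.
toy_counts = [[0, 1, 3, 10], [0, 2, 6, 20]]
scaled = pareto_scale_cells(toy_counts)
```

Because each cell's factor depends only on that cell's own counts, no reference sample is needed, which matches the abstract's point about normalizing new out-of-sample data.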
Affiliation(s)
- Matteo Borella
- Department of Biology, University of Padova, Padua 35121, Italy
|
100
|
Smail MA, Wu X, Henkel ND, Eby HM, Herman JP, McCullumsmith RE, Shukla R. Similarities and dissimilarities between psychiatric cluster disorders. Mol Psychiatry 2021; 26:4853-4863. [PMID: 33504954 PMCID: PMC8313609 DOI: 10.1038/s41380-021-01030-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/10/2020] [Revised: 12/30/2020] [Accepted: 01/12/2021] [Indexed: 01/16/2023]
Abstract
The common molecular mechanisms underlying psychiatric disorders are not well understood. Prior attempts to assess the pathological mechanisms responsible for psychiatric disorders have been limited by biased selection of comparable disorders, datasets/cohort availability, and challenges with data normalization. Here, using DisGeNET, a gene-disease associations database, we sought to expand such investigations in terms of number and types of diseases. In a top-down manner, we analyzed an unbiased cluster of 36 psychiatric disorders and comorbid conditions at biological pathway, cell-type, drug-target, and chromosome levels and deployed density index, a novel metric to quantify similarities (close to 1) and dissimilarities (close to 0) between these disorders at each level. At pathway level, we show that cognition and neurotransmission drive the similarity and are involved across all disorders, whereas immune-system and signal-response coupling (cell surface receptors, signal transduction, gene expression, and metabolic process) drive the dissimilarity and are involved with specific disorders. The analysis at the drug-target level supports the involvement of neurotransmission-related changes across these disorders. At cell-type level, dendrite-targeting interneurons, across all layers, are most involved. Finally, by matching the clustering pattern at each level of analysis, we showed that the similarity between the disorders is influenced most at the chromosomal level and to some extent at the cellular level. Together, these findings provide first insights into distinct cellular and molecular pathologies and druggable mechanisms associated with several psychiatric disorders and comorbid conditions, and demonstrate that similarities between these disorders originate at the chromosome level and disperse in a bottom-up manner at cellular and pathway levels.
Affiliation(s)
- Marissa A Smail
- Department of Pharmacology and Systems Physiology, University of Cincinnati, Cincinnati, OH, USA
- Neuroscience Graduate Program, University of Cincinnati, Cincinnati, OH, USA
- Xiaojun Wu
- Department of Neurosciences, University of Toledo College of Medicine and Life Sciences, Toledo, OH, USA
- Nicholas D Henkel
- Department of Neurosciences, University of Toledo College of Medicine and Life Sciences, Toledo, OH, USA
- Hunter M Eby
- Department of Neurosciences, University of Toledo College of Medicine and Life Sciences, Toledo, OH, USA
- James P Herman
- Department of Pharmacology and Systems Physiology, University of Cincinnati, Cincinnati, OH, USA
- Veterans Affairs Medical Center, Cincinnati, OH, USA
- Department of Neurology, University of Cincinnati, Cincinnati, OH, USA
- Robert E McCullumsmith
- Department of Neurosciences, University of Toledo College of Medicine and Life Sciences, Toledo, OH, USA
- Neurosciences Institute, ProMedica, Toledo, OH, USA
- Rammohan Shukla
- Department of Neurosciences, University of Toledo College of Medicine and Life Sciences, Toledo, OH, USA
|