1
Wang B, Chougule K, Jiao Y, Olson A, Kumar V, Gladman N, Huang J, Llaca V, Fengler K, Wei X, Wang L, Wang X, Regulski M, Drenkow J, Gingeras T, Hayes C, Armstrong J, Huang Y, Xin Z, Ware D. High-quality chromosome scale genome assemblies of two important Sorghum inbred lines, Tx2783 and RTx436. NAR Genom Bioinform 2024; 6:lqae097. [PMID: 39131819 PMCID: PMC11310780 DOI: 10.1093/nargab/lqae097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Revised: 07/01/2024] [Accepted: 07/23/2024] [Indexed: 08/13/2024]
Abstract
Sorghum bicolor (L.) Moench is a significant grass crop globally, known for its genetic diversity. High-quality genome sequences are needed to capture this diversity. We constructed high-quality, chromosome-level genome assemblies for two vital sorghum inbred lines, Tx2783 and RTx436. Through advanced single-molecule techniques, long-read sequencing and optical maps, we improved average sequence continuity 19-fold and 11-fold relative to the existing BTx623 v3.0 reference genome and obtained 19 and 18 scaffolds (N50 of 25.6 and 14.4) for Tx2783 and RTx436, respectively. Our gene annotation efforts resulted in 29 612 protein-coding genes for the Tx2783 genome and 29 265 protein-coding genes for the RTx436 genome. Comparative analyses with 26 plant genomes, which included 18 sorghum genomes and 8 outgroup species, identified around 31 210 protein-coding gene families, with about 13 956 specific to sorghum. Using representative models from gene trees across the 18 sorghum genomes, a total of 72 579 pan-genes were identified, with 14% core, 60% softcore and 26% shell genes. We identified 99 genes in Tx2783 and 107 genes in RTx436 that showed functional enrichment specifically in binding and metabolic processes, as revealed by the GO enrichment Pearson Chi-Square test. We detected 36 potential large inversions in the comparison between the BTx623 Bionano map and the BTx623 v3.1 reference sequence. Strikingly, these inversions were absent when comparing Tx2783 or RTx436 with the BTx623 Bionano map. The inversions were mostly in the pericentromeric region, which is known to contain low-complexity sequence and to be harder to assemble, suggesting the presence of potential artifacts in the public BTx623 reference assembly. Furthermore, in comparison to Tx2783, RTx436 exhibited 324 883 additional single-nucleotide polymorphisms (SNPs) and 16 506 more insertions/deletions (INDELs) when using BTx623 as the reference genome. We also characterized approximately 348 nucleotide-binding leucine-rich repeat (NLR) disease resistance genes in the two genomes. These high-quality genomes serve as valuable resources for discovering agronomic traits and structural variation studies.
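The core/softcore/shell split reported above can be illustrated by counting in how many of the 18 sorghum genomes each pan-gene is present. The sketch below is only an illustration: the abstract does not state the cutoffs, so the function names and the 90% softcore threshold are assumptions, not the authors' definitions.

```python
from collections import Counter


def classify_pan_gene(n_present, n_genomes, softcore_min_fraction=0.9):
    """Label a pan-gene by how many genomes carry it.

    The thresholds are hypothetical: "core" means present in every genome,
    "softcore" means present in at least softcore_min_fraction of genomes
    (but not all), and everything else is "shell".
    """
    if not 1 <= n_present <= n_genomes:
        raise ValueError("n_present must be between 1 and n_genomes")
    if n_present == n_genomes:
        return "core"
    if n_present / n_genomes >= softcore_min_fraction:
        return "softcore"
    return "shell"


def summarize(presence_counts, n_genomes):
    """Return the fraction of pan-genes that fall in each category."""
    labels = Counter(classify_pan_gene(c, n_genomes) for c in presence_counts)
    total = sum(labels.values())
    return {name: labels[name] / total for name in ("core", "softcore", "shell")}


# Toy presence counts for four made-up pan-genes across 18 genomes.
fractions = summarize([18, 17, 17, 3], n_genomes=18)
```

With real data, `presence_counts` would hold one entry per pan-gene, taken from a presence/absence matrix over the 18 genomes.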
Affiliation(s)
- Bo Wang
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Yinping Jiao
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
  - Texas Tech University, 1006 Canton Ave, Lubbock, TX 79409-2122, USA
- Andrew Olson
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Vivek Kumar
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Nicholas Gladman
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
  - USDA ARS Robert W. Holley Center for Agriculture and Health Cornell University, Ithaca, NY, USA
- Jian Huang
  - Department of Plant and Soil Sciences, Oklahoma State University, Stillwater, OK 74078-6028, USA
- Victor Llaca
  - Corteva Agriscience™, 8325 NW 62nd Avenue, Johnston, IA 50131, USA
- Kevin Fengler
  - Corteva Agriscience™, 8325 NW 62nd Avenue, Johnston, IA 50131, USA
- Xuehong Wei
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Liya Wang
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Xiaofei Wang
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Jorg Drenkow
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Chad Hayes
  - U.S. Department of Agriculture-Agricultural Research Service, Plant Stress and Germplasm Development Unit, Cropping Systems Research Laboratory, Lubbock, TX 79415, USA
- J Scott Armstrong
  - Peanut and Small Grains Research Unit, 1301 N. Western Rd. Stillwater, OK 74075, USA
- Yinghua Huang
  - USDA-ARS Plant Science Research Laboratory, 1301 N. Western Road, Stillwater, OK 74075-2714, USA
  - Dept. of Plant Biology, Ecology, and Evolution, 301 Physical Sciences, Stillwater, OK 74078-3013, USA
- Zhanguo Xin
  - U.S. Department of Agriculture-Agricultural Research Service, Plant Stress and Germplasm Development Unit, Cropping Systems Research Laboratory, Lubbock, TX 79415, USA
- Doreen Ware
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
  - USDA ARS Robert W. Holley Center for Agriculture and Health Cornell University, Ithaca, NY, USA
2
McEvoy SL, Meyer RS, Hasenstab-Lehman KE, Guilliams CM. The reference genome of an endangered Asteraceae, Deinandra increscens subsp. villosa, endemic to the Central Coast of California. G3 (Bethesda) 2024; 14:jkae117. [PMID: 38845594 PMCID: PMC11304951 DOI: 10.1093/g3journal/jkae117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Accepted: 05/26/2024] [Indexed: 08/09/2024]
Abstract
We present a reference genome for the federally endangered Gaviota tarplant, Deinandra increscens subsp. villosa (Madiinae, Asteraceae), an annual herb endemic to the Central California coast. Generating PacBio HiFi, Oxford Nanopore Technologies, and Dovetail Omni-C data, we assembled a haploid consensus genome of 1.67 Gb as 28.7 K scaffolds with a scaffold N50 of 74.9 Mb. We annotated repeat content in 74.8% of the genome. Long terminal repeats (LTRs) covered 44.0% of the genome with Copia families predominant at 22.9% followed by Gypsy at 14.2%. Both Gypsy and Copia elements were common in ancestral peaks of LTRs, and the most abundant element was a Gypsy element containing nested Copia/Angela sequence similarity, reflecting a complex evolutionary history of repeat activity. Gene annotation produced 33,257 genes and 68,942 transcripts, of which 99% were functionally annotated. BUSCO scores for the annotated proteins were 96.0% complete, of which 77.6% were single-copy and 18.4% duplicated. Whole genome duplication synonymous mutation rates of Gaviota tarplant and sunflower (Helianthus annuus) shared peaks that correspond to the last Asteraceae polyploidization event and subsequent divergence from a common ancestor at ∼27 MYA. Regions of high-density tandem genes were identified, pointing to potentially important loci of environmental adaptation in this species.
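The scaffold N50 quoted above is a standard contiguity statistic: the length L such that scaffolds of length at least L together cover half or more of the assembly. A minimal sketch of the calculation follows; the scaffold lengths are made-up toy values, not data from this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches half of the
    summed length of all scaffolds.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example: total length 30, so half is 15; 10 + 8 = 18 reaches it.
toy_n50 = n50([10, 8, 5, 4, 3])
```

Because many short scaffolds add little to the total length, an assembly can have tens of thousands of scaffolds and still report a large N50 when a few chromosome-scale scaffolds hold most of the sequence.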
Affiliation(s)
- Susan L McEvoy
  - Department of Conservation and Research, Santa Barbara Botanic Garden, Santa Barbara, CA 93105, USA
- Rachel S Meyer
  - Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- C Matt Guilliams
  - Department of Conservation and Research, Santa Barbara Botanic Garden, Santa Barbara, CA 93105, USA
3
Shi Q, Zhang Q, Shao M. Accurate assembly of multiple RNA-seq samples with Aletsch. Bioinformatics 2024; 40:i307-i317. [PMID: 38940157 PMCID: PMC11211816 DOI: 10.1093/bioinformatics/btae215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024]
Abstract
MOTIVATION: High-throughput RNA sequencing has become indispensable for decoding gene activities, yet the challenge of reconstructing full-length transcripts persists. Traditional single-sample assemblers frequently produce fragmented transcripts, especially in single-cell RNA-seq data. While algorithms designed for assembling multiple samples exist, they encounter various limitations. RESULTS: We present Aletsch, a new assembler for multiple bulk or single-cell RNA-seq samples. Aletsch incorporates several algorithmic innovations, including a "bridging" system that can effectively integrate multiple samples to restore missed junctions in individual samples, and a new graph-decomposition algorithm that leverages "supporting" information across multiple samples to guide the decomposition of complex vertices. A standout feature of Aletsch is its application of a random forest model with 50 well-designed features for scoring transcripts. We demonstrate its robust adaptability across different chromosomes, datasets, and species. Our experiments, conducted on RNA-seq data from several protocols, firmly demonstrate Aletsch's significant outperformance over existing meta-assemblers. As an example, when measured with the partial area under the precision-recall curve (pAUC, constrained by precision), Aletsch surpasses the leading assemblers TransMeta by 22.9%-62.1% and PsiCLASS by 23.0%-175.5% on human datasets. AVAILABILITY AND IMPLEMENTATION: Aletsch is freely available at https://github.com/Shao-Group/aletsch. Scripts that reproduce the experimental results of this manuscript are available at https://github.com/Shao-Group/aletsch-test.
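The pAUC metric mentioned above can be sketched as a trapezoidal area under the precision-recall curve restricted to the region where precision stays above a floor. This is one plausible reading of "constrained by precision", not necessarily the exact definition used in the paper; the function name, the floor value, and the curve points are all assumptions for illustration.

```python
def partial_pr_auc(points, min_precision):
    """Approximate partial area under a precision-recall curve.

    points: iterable of (recall, precision) pairs.
    Only segments whose two endpoints both have precision >= min_precision
    contribute; segments crossing the floor are dropped rather than
    interpolated, so this is a coarse approximation.
    """
    ordered = sorted(points)  # ascending recall
    area = 0.0
    for (r0, p0), (r1, p1) in zip(ordered, ordered[1:]):
        if p0 >= min_precision and p1 >= min_precision:
            area += (r1 - r0) * (p0 + p1) / 2  # trapezoid rule
    return area


# Made-up curve: precision falls as recall rises.
curve = [(0.0, 1.0), (0.2, 0.9), (0.4, 0.7), (0.6, 0.4)]
pauc = partial_pr_auc(curve, min_precision=0.6)
```

Restricting the area to the high-precision part of the curve rewards assemblers for the transcripts they report with confidence, rather than for recall gained at low precision.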
Affiliation(s)
- Qian Shi
  - Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802, United States
- Qimin Zhang
  - Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802, United States
- Mingfu Shao
  - Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802, United States
  - Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802, United States
4
Brůna T, Lomsadze A, Borodovsky M. GeneMark-ETP significantly improves the accuracy of automatic annotation of large eukaryotic genomes. Genome Res 2024; 34:757-768. [PMID: 38866548 PMCID: PMC11216313 DOI: 10.1101/gr.278373.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 05/02/2024] [Indexed: 06/14/2024]
Abstract
Large-scale genomic initiatives, such as the Earth BioGenome Project, require efficient methods for eukaryotic genome annotation. Here we present an automatic gene finder, GeneMark-ETP, integrating genomic-, transcriptomic-, and protein-derived evidence that has been developed with a focus on large plant and animal genomes. GeneMark-ETP first identifies genomic loci where extrinsic data are sufficient for making gene predictions with "high confidence." The genes situated in the genomic space between the high-confidence genes are predicted in the next stage. The set of high-confidence genes serves as an initial training set for the statistical model. Further on, the model parameters are iteratively updated in the rounds of gene prediction and parameter re-estimation. Upon reaching convergence, GeneMark-ETP makes the final predictions and delivers the whole complement of predicted genes. GeneMark-ETP outperforms gene finders using a single type of extrinsic evidence. Comparisons with MAKER2 and TSEBRA, gene finders that use both transcript- and protein-derived extrinsic evidence, show that GeneMark-ETP delivers state-of-the-art gene-prediction accuracy, with the margin over existing approaches increasing in its application to larger and more complex eukaryotic genomes.
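The train, predict, and re-estimate cycle described above can be pictured with a deliberately simplified toy. This is not GeneMark-ETP's statistical model: each locus is reduced to one hypothetical numeric feature, the "model" is just the mean of that feature, and the function name, `tolerance`, and the sample values are assumptions made for illustration.

```python
def iterative_training(high_confidence, candidates, tolerance=1.0, max_rounds=50):
    """Toy self-training loop with prediction and parameter re-estimation.

    high_confidence: feature values of loci accepted from extrinsic evidence.
    candidates: feature values of loci without sufficient evidence.
    Returns (accepted candidates, final mean, number of rounds run).
    """
    if not high_confidence:
        raise ValueError("need a non-empty initial training set")
    # Initial parameter estimate comes from the high-confidence set only.
    mean = sum(high_confidence) / len(high_confidence)
    accepted = []
    for round_number in range(1, max_rounds + 1):
        # Prediction step: accept candidates close to the current model.
        new_accepted = [x for x in candidates if abs(x - mean) <= tolerance]
        # Convergence: predictions did not change since the previous round.
        if round_number > 1 and new_accepted == accepted:
            return accepted, mean, round_number
        accepted = new_accepted
        # Re-estimation step: refit the parameter on all accepted loci.
        training = list(high_confidence) + accepted
        mean = sum(training) / len(training)
    return accepted, mean, max_rounds


# Made-up values: the second candidate is only accepted after one update.
accepted, final_mean, rounds = iterative_training([5.0, 5.0], [5.9, 6.2, 9.0])
```

In this toy run the model parameter drifts toward the newly accepted loci, which lets a borderline candidate in on the second round, and the loop stops once a round produces the same predictions as the one before.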
Affiliation(s)
- Tomáš Brůna
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
- Alexandre Lomsadze
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
- Mark Borodovsky
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
5
Bruna T, Lomsadze A, Borodovsky M. A new gene finding tool GeneMark-ETP significantly improves the accuracy of automatic annotation of large eukaryotic genomes. bioRxiv 2024:2023.01.13.524024. [PMID: 36711453 PMCID: PMC9882169 DOI: 10.1101/2023.01.13.524024] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Indexed: 01/17/2023]
Abstract
Large-scale genomic initiatives, such as the Earth BioGenome Project, require efficient methods for eukaryotic genome annotation. Here we present an automatic gene finder, GeneMark-ETP, integrating genomic-, transcriptomic- and protein-derived evidence that has been developed with a focus on large plant and animal genomes. GeneMark-ETP first identifies genomic loci where extrinsic data is sufficient for making gene predictions with 'high confidence'. The genes situated in the genomic space between the high confidence genes are predicted in the next stage. The set of high confidence genes serves as an initial training set for the statistical model. Further on, the model parameters are iteratively updated in the rounds of gene prediction and parameter re-estimation. Upon reaching convergence, GeneMark-ETP makes the final predictions and delivers the whole complement of predicted genes. GeneMark-ETP outperformed gene finders using a single type of extrinsic evidence. Comparisons with gene finders utilizing both transcript- and protein-derived extrinsic evidence, MAKER2, and TSEBRA, demonstrated that GeneMark-ETP delivered state-of-the-art gene prediction accuracy with the margin of outperforming existing approaches increasing in its applications to larger and more complex eukaryotic genomes.
Affiliation(s)
- Tomas Bruna
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Alexandre Lomsadze
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Mark Borodovsky
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
6
Wu Z, Miedzinska K, Krause JS, Pérez JH, Wingfield JC, Meddle SL, Smith J. A chromosome-level genome assembly of a free-living white-crowned sparrow (Zonotrichia leucophrys gambelii). Sci Data 2024; 11:86. [PMID: 38238322 PMCID: PMC10796373 DOI: 10.1038/s41597-024-02929-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Accepted: 01/03/2024] [Indexed: 01/22/2024]
Abstract
The white-crowned sparrow, Zonotrichia leucophrys, is a passerine bird with a wide distribution and it is extensively adapted to environmental changes. It has historically acted as a model species in studies on avian ecology, physiology and behaviour. Here, we present a high-quality chromosome-level genome of Zonotrichia leucophrys using PacBio and OmniC sequencing data. Gene models were constructed by combining RNA-seq and Iso-seq data from liver, hypothalamus, and ovary. In total a 1,123,996,003 bp genome was generated, including 31 chromosomes assembled in complete scaffolds along with other, unplaced scaffolds. This high-quality genome assembly offers an important genomic resource for the research community using the white-crowned sparrow as a model for understanding avian genome biology and development, and provides a genomic basis for future studies, both fundamental and applied.
Affiliation(s)
- Zhou Wu
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies R(D)SVS, The University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK
- Katarzyna Miedzinska
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies R(D)SVS, The University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK
- Jesse S Krause
  - Department of Neurobiology, Physiology, and Behavior, University of California, Davis, CA, 95616, USA
  - Department of Biology, University of Nevada Reno, Reno, NV, 89557, USA
- Jonathan H Pérez
  - Department of Biology, University of South Alabama, Mobile, AL, 36688, USA
- John C Wingfield
  - Department of Neurobiology, Physiology, and Behavior, University of California, Davis, CA, 95616, USA
- Simone L Meddle
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies R(D)SVS, The University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK
- Jacqueline Smith
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies R(D)SVS, The University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK
7
Bush J, Webster C, Wegrzyn J, Simon C, Wilcox E, Khan R, Weisz D, Dudchenko O, Aiden EL, Frandsen P. Chromosome-Level Genome Assembly and Annotation of a Periodical Cicada Species: Magicicada septendecula. Genome Biol Evol 2024; 16:evae001. [PMID: 38190231 PMCID: PMC10799293 DOI: 10.1093/gbe/evae001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 12/16/2023] [Accepted: 12/28/2023] [Indexed: 01/09/2024]
Abstract
We present a high-quality assembly and annotation of the periodical cicada species, Magicicada septendecula (Hemiptera: Auchenorrhyncha: Cicadidae). Periodical cicadas have a significant ecological impact, serving as a food source for many mammals, reptiles, and birds. Magicicada are well known for their massive emergences of 1 to 3 species that appear in different locations in the eastern United States nearly every year. These year classes ("broods") emerge dependably every 13 or 17 yr in a given location. Recently, it has become clear that 4-yr early or late emergences of a sizeable portion of a population are an important part of the history of brood formation; however, the biological mechanisms by which they track the passage of time remain a mystery. Using PacBio HiFi reads in conjunction with Hi-C proximity ligation data, we have assembled and annotated the first whole genome for a periodical cicada, an important resource for future phylogenetic and comparative genomic analysis. This also represents the first quality genome assembly and annotation for the Hemipteran superfamily Cicadoidea. With a scaffold N50 of 518.9 Mb and a complete BUSCO score of 96.7%, we are confident that this assembly will serve as a vital resource toward uncovering the genomic basis of periodical cicadas' long, synchronized life cycles and will provide a robust framework for further investigations into these insects.
Affiliation(s)
- Jonas Bush
  - Huck Life Sciences Institute, The Pennsylvania State University, State College, PA, USA
  - Department of Plant and Wildlife Sciences, Brigham Young University, Provo, UT, USA
- Cynthia Webster
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
- Jill Wegrzyn
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
- Chris Simon
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
- Edward Wilcox
  - Department of Plant and Wildlife Sciences, Brigham Young University, Provo, UT, USA
- Ruqayya Khan
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- David Weisz
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Olga Dudchenko
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
  - The Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Erez Lieberman Aiden
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
  - The Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Paul Frandsen
  - Department of Plant and Wildlife Sciences, Brigham Young University, Provo, UT, USA
  - Data Science Lab, Office of the Chief Information Officer, Smithsonian Institution, Washington, DC, USA
8
Chivu-Economescu M, Herlea V, Dima S, Sorop A, Pechianu C, Procop A, Kitahara S, Necula L, Matei L, Dragu D, Neagu AI, Bleotu C, Diaconu CC, Popescu I, Duda DG. Soluble PD-L1 as a diagnostic and prognostic biomarker in resectable gastric cancer patients. Gastric Cancer 2023; 26:934-946. [PMID: 37668884 DOI: 10.1007/s10120-023-01429-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/05/2023] [Accepted: 08/28/2023] [Indexed: 09/06/2023]
Abstract
BACKGROUND: In this study, we compared programmed death-ligand 1 (PD-L1) expression in primary tissue samples and its soluble form (sPD-L1) concentration in matched preoperative plasma samples from gastric cancer (GC) patients to understand the relationship between tissue and plasma PD-L1 expression and to determine its diagnostic and prognostic value. METHODS: PD-L1 expression in tissue was assessed by immunohistochemistry and enzyme-linked immunosorbent assay (ELISA), and sPD-L1 concentration in plasma was quantified by ELISA. The levels of the CD274 gene, which encodes the PD-L1 protein, were examined as part of bulk tissue RNA-sequencing analyses. Additionally, we evaluated the association between sPD-L1 levels and various laboratory parameters, disease characteristics, and patient outcomes. RESULTS: GC patients had significantly higher levels of sPD-L1 in their plasma (71.69 pg/mL) compared to healthy controls (35.34 pg/mL) (p < 0.0001). Moreover, sPD-L1 levels were significantly correlated with tissue PD-L1 protein, CD274 mRNA expression, larger tumor size, advanced tumor stage, and lymph node metastasis. Elevated sPD-L1 levels (> 103.5 ng/mL) were associated with poor overall survival (HR = 2.16, 95% CI 1.15-4.08, p = 0.017). Furthermore, intratumoral neutrophil and dendritic cell levels were directly correlated with plasma sPD-L1 concentration in the GC patients. CONCLUSIONS: sPD-L1 was readily measurable in GC patients, and its level was associated with GC tissue PD-L1 expression, greater inflammatory cell infiltration, disease progression, and survival. Thus, sPD-L1 may be a useful minimally invasive diagnostic and prognostic biomarker in GC patients.
Affiliation(s)
- Mihaela Chivu-Economescu
  - Department of Cellular and Molecular Pathology, Stefan S. Nicolau Institute of Virology, 030304, Bucharest, Romania
- Vlad Herlea
  - Department of Pathology, Fundeni Clinical Institute, 022328, Bucharest, Romania
- Simona Dima
  - Center of Digestive Diseases and Liver Transplantation, Fundeni Clinical Institute, 022328, Bucharest, Romania
  - Center of Excellence for Translational Medicine, Fundeni Clinical Institute, 022328, Bucharest, Romania
  - Carol Davila University of Medicine and Pharmacy, 050474, Bucharest, Romania
- Andrei Sorop
  - Center of Excellence for Translational Medicine, Fundeni Clinical Institute, 022328, Bucharest, Romania
- Catalin Pechianu
  - Department of Pathology, Fundeni Clinical Institute, 022328, Bucharest, Romania
- Alexandru Procop
  - Department of Pathology, Fundeni Clinical Institute, 022328, Bucharest, Romania
- Shuji Kitahara
  - Edwin L. Steele Laboratories for Tumor Biology, Department of Radiation Oncology, Harvard Medical School and Massachusetts General Hospital, Cox-724, 100 Blossom St., Boston, MA, 02114, USA
- Laura Necula
  - Department of Cellular and Molecular Pathology, Stefan S. Nicolau Institute of Virology, 030304, Bucharest, Romania
- Lilia Matei
  - Department of Cellular and Molecular Pathology, Stefan S. Nicolau Institute of Virology, 030304, Bucharest, Romania
- Denisa Dragu
  - Department of Cellular and Molecular Pathology, Stefan S. Nicolau Institute of Virology, 030304, Bucharest, Romania
- Ana-Iulia Neagu
  - Department of Cellular and Molecular Pathology, Stefan S. Nicolau Institute of Virology, 030304, Bucharest, Romania
- Coralia Bleotu
  - Department of Cellular and Molecular Pathology, Stefan S. Nicolau Institute of Virology, 030304, Bucharest, Romania
- Carmen C Diaconu
  - Department of Cellular and Molecular Pathology, Stefan S. Nicolau Institute of Virology, 030304, Bucharest, Romania
- Irinel Popescu
  - Center of Digestive Diseases and Liver Transplantation, Fundeni Clinical Institute, 022328, Bucharest, Romania
  - Center of Excellence for Translational Medicine, Fundeni Clinical Institute, 022328, Bucharest, Romania
- Dan G Duda
  - Edwin L. Steele Laboratories for Tumor Biology, Department of Radiation Oncology, Harvard Medical School and Massachusetts General Hospital, Cox-724, 100 Blossom St., Boston, MA, 02114, USA
9
Tao S, Hou Y, Diao L, Hu Y, Xu W, Xie S, Xiao Z. Long noncoding RNA study: Genome-wide approaches. Genes Dis 2023; 10:2491-2510. [PMID: 37554208 PMCID: PMC10404890 DOI: 10.1016/j.gendis.2022.10.024] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/11/2022] [Revised: 10/09/2022] [Accepted: 10/23/2022] [Indexed: 11/30/2022]
Abstract
Long noncoding RNAs (lncRNAs) have been confirmed to play a crucial role in various biological processes across several species. Though many efforts have been devoted to the expansion of the lncRNAs landscape, much about lncRNAs is still unknown due to their great complexity. The development of high-throughput technologies and the constantly improved bioinformatic methods have resulted in a rapid expansion of lncRNA research and relevant databases. In this review, we introduced genome-wide research of lncRNAs in three parts: (i) novel lncRNA identification by high-throughput sequencing and computational pipelines; (ii) functional characterization of lncRNAs by expression atlas profiling, genome-scale screening, and the research of cancer-related lncRNAs; (iii) mechanism research by large-scale experimental technologies and computational analysis. Besides, primary experimental methods and bioinformatic pipelines related to these three parts are summarized. This review aimed to provide a comprehensive and systemic overview of lncRNA genome-wide research strategies and indicate a genome-wide lncRNA research system.
Affiliation(s)
- Shuang Tao
  - The Biotherapy Center, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, China
- Yarui Hou
  - The Biotherapy Center, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, China
- Liting Diao
  - The Biotherapy Center, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, China
- Yanxia Hu
  - The Biotherapy Center, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, China
- Wanyi Xu
  - The Biotherapy Center, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, China
- Shujuan Xie
  - The Biotherapy Center, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, China
  - Institute of Vaccine, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, China
- Zhendong Xiao
  - The Biotherapy Center, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, China
10
Gutierrez-Hoffmann M, Fan J, O’Meally RN, Cole RN, Florea L, Antonescu C, Talbot CC, Tiniakou E, Darrah E, Soloski MJ. The Interaction of Borrelia burgdorferi with Human Dendritic Cells: Functional Implications. J Immunol 2023; 211:612-625. [PMID: 37405694 PMCID: PMC10527078 DOI: 10.4049/jimmunol.2300235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Accepted: 06/01/2023] [Indexed: 07/06/2023]
Abstract
Dendritic cells bridge the innate and adaptive immune responses by serving as sensors of infection and as the primary APCs responsible for the initiation of the T cell response against invading pathogens. Naive T cell activation requires the following three key signals to be delivered from dendritic cells: engagement of the TCR by peptide Ags bound to MHC molecules (signal 1), engagement of costimulatory molecules on both cell types (signal 2), and expression of polarizing cytokines (signal 3). Initial interactions between Borrelia burgdorferi, the causative agent of Lyme disease, and dendritic cells remain largely unexplored. To address this gap in knowledge, we cultured live B. burgdorferi with monocyte-derived dendritic cells (mo-DCs) from healthy donors to examine the bacterial immunopeptidome associated with HLA-DR. In parallel, we examined changes in the expression of key costimulatory and regulatory molecules and profiled the cytokines released by dendritic cells when exposed to live spirochetes. RNA-sequencing studies on B. burgdorferi-pulsed dendritic cells show a unique gene expression signature associated with B. burgdorferi stimulation that differs from stimulation with lipoteichoic acid, a TLR2 agonist. These studies revealed that exposure of mo-DCs to live B. burgdorferi drives the expression of both pro- and anti-inflammatory cytokines as well as immunoregulatory molecules (e.g., PD-L1, IDO1, Tim3). Collectively, these studies indicate that the interaction of live B. burgdorferi with mo-DCs promotes a unique mature DC phenotype that likely impacts the nature of the adaptive T cell response generated in human Lyme disease.
Collapse
Affiliation(s)
- Maria Gutierrez-Hoffmann
  - Lyme Disease Research Center, Johns Hopkins University, School of Medicine, Baltimore, MD 21224, USA
  - Division of Rheumatology, Johns Hopkins University, School of Medicine, Baltimore, MD 21224, USA
- Jinshui Fan
  - Division of Rheumatology, Johns Hopkins University, School of Medicine, Baltimore, MD 21224, USA
- Robert N. O’Meally
  - Mass Spectrometry and Proteomics Facility, Department of Biological Chemistry, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Robert N. Cole
  - Mass Spectrometry and Proteomics Facility, Department of Biological Chemistry, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Liliana Florea
  - Department of Genetic Medicine, Johns Hopkins University, School of Medicine, Baltimore, MD 21205, USA
- Corina Antonescu
  - Department of Genetic Medicine, Johns Hopkins University, School of Medicine, Baltimore, MD 21205, USA
- C. Conover Talbot
  - Institute for Basic Biomedical Sciences, Johns Hopkins University, School of Medicine, Baltimore, MD 21205, USA
- Eleni Tiniakou
  - Division of Rheumatology, Johns Hopkins University, School of Medicine, Baltimore, MD 21224, USA
- Erika Darrah
  - Lyme Disease Research Center, Johns Hopkins University, School of Medicine, Baltimore, MD 21224, USA
  - Division of Rheumatology, Johns Hopkins University, School of Medicine, Baltimore, MD 21224, USA
- Mark J. Soloski
  - Lyme Disease Research Center, Johns Hopkins University, School of Medicine, Baltimore, MD 21224, USA
  - Division of Rheumatology, Johns Hopkins University, School of Medicine, Baltimore, MD 21224, USA
11
|
Oreper D, Klaeger S, Jhunjhunwala S, Delamarre L. The peptide woods are lovely, dark and deep: Hunting for novel cancer antigens. Semin Immunol 2023; 67:101758. [PMID: 37027981 DOI: 10.1016/j.smim.2023.101758] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/31/2022] [Revised: 03/22/2023] [Accepted: 03/22/2023] [Indexed: 04/08/2023]
Abstract
Harnessing the patient's immune system to control a tumor is a proven avenue for cancer therapy. T cell therapies as well as therapeutic vaccines, which target specific antigens of interest, are being explored as treatments in conjunction with immune checkpoint blockade. For these therapies, selecting the best suited antigens is crucial. Most of the focus has thus far been on neoantigens that arise from tumor-specific somatic mutations. Although there is clear evidence that T-cell responses against mutated neoantigens are protective, the large majority of these mutations are not immunogenic. In addition, most somatic mutations are unique to each individual patient and their targeting requires the development of individualized approaches. Therefore, novel antigen types are needed to broaden the scope of such treatments. We review high throughput approaches for discovering novel tumor antigens and some of the key challenges associated with their detection, and discuss considerations when selecting tumor antigens to target in the clinic.
Affiliation(s)
- Daniel Oreper
  - Genentech, 1 DNA way, South San Francisco, 94080 CA, USA.
- Susan Klaeger
  - Genentech, 1 DNA way, South San Francisco, 94080 CA, USA.
12
|
Macas J, Ávila Robledillo L, Kreplak J, Novák P, Koblížková A, Vrbová I, Burstin J, Neumann P. Assembly of the 81.6 Mb centromere of pea chromosome 6 elucidates the structure and evolution of metapolycentric chromosomes. PLoS Genet 2023; 19:e1010633. [PMID: 36735726 PMCID: PMC10027222 DOI: 10.1371/journal.pgen.1010633] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 11/23/2022] [Revised: 03/20/2023] [Accepted: 01/23/2023] [Indexed: 02/04/2023] Open
Abstract
Centromeres in the legume genera Pisum and Lathyrus exhibit unique morphological characteristics, including extended primary constrictions and multiple separate domains of centromeric chromatin. These so-called metapolycentromeres resemble an intermediate form between monocentric and holocentric types, and therefore provide a great opportunity for studying the transitions between different types of centromere organizations. However, because of the exceedingly large and highly repetitive nature of metapolycentromeres, highly contiguous assemblies needed for these studies are lacking. Here, we report on the assembly and analysis of a 177.6 Mb region of pea (Pisum sativum) chromosome 6, including the 81.6 Mb centromere region (CEN6) and adjacent chromosome arms. Genes, DNA methylation profiles, and most of the repeats were uniformly distributed within the centromere, and their densities in CEN6 and chromosome arms were similar. The exception was an accumulation of satellite DNA in CEN6, where it formed multiple arrays up to 2 Mb in length. Centromeric chromatin, characterized by the presence of the CENH3 protein, was predominantly associated with arrays of three different satellite repeats; however, five other satellites present in CEN6 lacked CENH3. The presence of CENH3 chromatin was found to determine the spatial distribution of the respective satellites during the cell cycle. Finally, oligo-FISH painting experiments, performed using probes specifically designed to label the genomic regions corresponding to CEN6 in Pisum, Lathyrus, and Vicia species, revealed that metapolycentromeres evolved via the expansion of centromeric chromatin into neighboring chromosomal regions and the accumulation of novel satellite repeats. However, in some of these species, centromere evolution also involved chromosomal translocations and centromere repositioning.
Affiliation(s)
- Jiří Macas
  - Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, Czech Republic
- Laura Ávila Robledillo
  - Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, Czech Republic
- Jonathan Kreplak
  - Agroécologie, AgroSup Dijon, INRA, Univ. Bourgogne, Univ. Bourgogne Franche-Comté, Dijon, France
- Petr Novák
  - Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, Czech Republic
- Andrea Koblížková
  - Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, Czech Republic
- Iva Vrbová
  - Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, Czech Republic
- Judith Burstin
  - Agroécologie, AgroSup Dijon, INRA, Univ. Bourgogne, Univ. Bourgogne Franche-Comté, Dijon, France
- Pavel Neumann
  - Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, Czech Republic
13
|
Enespa, Chandra P. Tool and techniques study to plant microbiome current understanding and future needs: an overview. Commun Integr Biol 2022; 15:209-225. [PMID: 35967908 PMCID: PMC9367660 DOI: 10.1080/19420889.2022.2082736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2022] Open
Abstract
Microorganisms are found everywhere and play both beneficial and harmful roles in human life, society, and the environment. The plant microbiome is a broad term covering microbes present in the rhizosphere, phyllosphere, or endophytic regions, where they play several beneficial and harmful roles for the plant. To study these microorganisms, it is essential to be able to isolate, purify, and identify them quickly under laboratory conditions. To improve microbial study, several tools and techniques have been developed, such as microscopy, rRNA or rDNA sequencing, fingerprinting, probing, clone libraries, chips, and metagenomics. The major benefits of these techniques are that they identify microbial communities through direct analysis and can be applied in situ. Without tools and techniques, we cannot understand the roles of microbiomes. This review explains the tools and their roles in the understanding of microbiomes and their ecological diversity in environments.
Affiliation(s)
- Enespa
  - Department of Plant Pathology, School of Agriculture, SMPDC, University of Lucknow, Lucknow, India
- Prem Chandra
  - Department of Environmental Microbiology, Babasaheb Bhimrao Ambedkar (A Central) University, Lucknow, India
14
|
Wang G, Zhang J, Wu S, Qin S, Zheng Y, Xia C, Geng H, Yao J, Deng L. The mechanistic target of rapamycin complex 1 pathway involved in hepatic gluconeogenesis through peroxisome-proliferator-activated receptor γ coactivator-1α. Animal Nutrition 2022; 11:121-131. [PMID: 36204284 PMCID: PMC9516411 DOI: 10.1016/j.aninu.2022.07.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2021] [Revised: 07/18/2022] [Accepted: 07/27/2022] [Indexed: 11/29/2022]
Abstract
Cattle can efficiently perform de novo generation of glucose through hepatic gluconeogenesis to meet post-weaning glucose demand. Substantial evidence points to cattle and non-ruminant animals being characterized by phylogenetic features in terms of their differing capacity for hepatic gluconeogenesis, a process that is highly efficient in cattle yet the underlying mechanism remains unclear. Here we used a variety of transcriptome data, as well as tissue and cell-based methods to uncover the mechanisms of high-efficiency hepatic gluconeogenesis in cattle. We showed that cattle can efficiently convert propionate into pyruvate, at least partly, via high expression of acyl-CoA synthetase short-chain family member 1 (ACSS1), propionyl-CoA carboxylase alpha chain (PCCA), methylmalonyl-CoA epimerase (MCEE), methylmalonyl-CoA mutase (MMUT), and succinate-CoA ligase (SUCLG2) genes in the liver (P < 0.01). Moreover, higher expression of the rate-limiting enzymes of gluconeogenesis, such as phosphoenolpyruvate carboxykinase (PCK) and fructose 1,6-bisphosphatase (FBP), ensures the efficient operation of hepatic gluconeogenesis in cattle (P < 0.01). Mechanistically, we found that cattle liver exhibits highly active mechanistic target of rapamycin complex 1 (mTORC1), and the expressions of PCCA, MMUT, SUCLG2, PCK, and FBP genes are regulated by the activation of mTORC1 (P < 0.001). Finally, our results showed that mTORC1 promotes hepatic gluconeogenesis in a peroxisome proliferator-activated receptor γ coactivator 1α (PGC-1α) dependent manner. Collectively, our results not only revealed an important mechanism responsible for the quantitative differences in the efficiency of hepatic gluconeogenesis in cattle versus non-ruminant animals, but also established that mTORC1 is indeed involved in the regulation of hepatic gluconeogenesis through PGC-1α. 
These results provide a novel potential insight into promoting hepatic gluconeogenesis through activated mTORC1 in both ruminants and mammals.
15
|
Yu T, Zhao X, Li G. TransMeta simultaneously assembles multisample RNA-seq reads. Genome Res 2022; 32:1398-1407. [PMID: 35858749 PMCID: PMC9341511 DOI: 10.1101/gr.276434.121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2021] [Accepted: 06/03/2022] [Indexed: 11/25/2022]
Abstract
Assembling RNA-seq reads into full-length transcripts is crucial in transcriptomic studies and poses computational challenges. Here we present TransMeta, a simple and robust algorithm that simultaneously assembles RNA-seq reads from multiple samples. TransMeta is designed based on the newly introduced vector-weighted splicing graph model, which enables accurate reconstruction of the consensus transcriptome via incorporating a cosine similarity-based combing strategy and a newly designed label-setting path-searching strategy. Tests on both simulated and real data sets show that TransMeta consistently outperforms PsiCLASS, StringTie2 plus its merge mode, and Scallop plus TACO, the most popular tools, in terms of precision and recall under a wide range of coverage thresholds at the meta-assembly level. Additionally, TransMeta consistently shows superior performance at the individual sample level.
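As a rough illustration of the cosine similarity idea mentioned in the abstract above, the sketch below compares hypothetical per-sample coverage vectors for three splicing-graph edges. The vectors, the edge names, and the function `cosine_similarity` are assumptions made for illustration only; this is not TransMeta's implementation, and the abstract does not specify how its vectors are built or compared.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Define similarity with an all-zero vector as 0 to avoid division by zero.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical coverage of three edges across three samples (illustrative values).
edge_a = [10.0, 0.0, 5.0]
edge_b = [20.0, 0.0, 10.0]  # same expression pattern as edge_a, twice the depth
edge_c = [0.0, 8.0, 0.0]    # expressed only in the sample where edge_a is silent

print(cosine_similarity(edge_a, edge_b))
print(cosine_similarity(edge_a, edge_c))
```

Because cosine similarity ignores vector magnitude, edges with the same cross-sample pattern score close to 1 regardless of sequencing depth, while edges expressed in disjoint samples score 0.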
Affiliation(s)
- Ting Yu
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Xiaoyu Zhao
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
  - School of Mathematics, Shandong University, Jinan, Shandong 250100, China
- Guojun Li
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
  - School of Mathematical Science, Liaocheng University, Liaocheng 252000, China
16
|
Schon MA, Lutzmayer S, Hofmann F, Nodine MD. Bookend: precise transcript reconstruction with end-guided assembly. Genome Biol 2022; 23:143. [PMID: 35768836 PMCID: PMC9245221 DOI: 10.1186/s13059-022-02700-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/06/2021] [Accepted: 06/05/2022] [Indexed: 12/29/2022] Open
Abstract
We developed Bookend, a package for transcript assembly that incorporates data from different RNA-seq techniques, with a focus on identifying and utilizing RNA 5' and 3' ends. We demonstrate that correct identification of transcript start and end sites is essential for precise full-length transcript assembly. Utilization of end-labeled reads present in full-length single-cell RNA-seq datasets dramatically improves the precision of transcript assembly in single cells. Finally, we show that hybrid assembly across short-read, long-read, and end-capture RNA-seq datasets from Arabidopsis thaliana, as well as meta-assembly of RNA-seq from single mouse embryonic stem cells, can produce reference-quality end-to-end transcript annotations.
Affiliation(s)
- Michael A Schon
  - Cluster of Plant Developmental Biology, Laboratory of Molecular Biology, Wageningen University & Research, Wageningen, 6708, PB, The Netherlands.
  - Gregor Mendel Institute (GMI), Austrian Academy of Sciences, Vienna Biocenter (VBC), Dr. Bohr-Gasse 3, 1030, Vienna, Austria.
- Stefan Lutzmayer
  - Gregor Mendel Institute (GMI), Austrian Academy of Sciences, Vienna Biocenter (VBC), Dr. Bohr-Gasse 3, 1030, Vienna, Austria
- Falko Hofmann
  - Gregor Mendel Institute (GMI), Austrian Academy of Sciences, Vienna Biocenter (VBC), Dr. Bohr-Gasse 3, 1030, Vienna, Austria
- Michael D Nodine
  - Cluster of Plant Developmental Biology, Laboratory of Molecular Biology, Wageningen University & Research, Wageningen, 6708, PB, The Netherlands.
  - Gregor Mendel Institute (GMI), Austrian Academy of Sciences, Vienna Biocenter (VBC), Dr. Bohr-Gasse 3, 1030, Vienna, Austria.
17
|
Holland DO, Gotea V, Fedkenheuer K, Jaiswal SK, Baugher C, Tan H, Fedkenheuer M, Elnitski L. Characterization and clustering of kinase isoform expression in metastatic melanoma. PLoS Comput Biol 2022; 18:e1010065. [PMID: 35560144 PMCID: PMC9132324 DOI: 10.1371/journal.pcbi.1010065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2021] [Revised: 05/25/2022] [Accepted: 03/29/2022] [Indexed: 11/18/2022] Open
Abstract
Mutations to the human kinome are known to play causal roles in cancer. The kinome regulates numerous cell processes including growth, proliferation, differentiation, and apoptosis. In addition to aberrant expression, aberrant alternative splicing of cancer-driver genes is receiving increased attention as it could lead to loss or gain of functional domains, altering a kinase's downstream impact. The present study quantifies changes in gene expression and isoform ratios in the kinome of metastatic melanoma cells relative to primary tumors. We contrast 538 total kinases and 3,040 known kinase isoforms between 103 primary tumor and 367 metastatic samples from The Cancer Genome Atlas (TCGA). We find strong evidence of differential expression (DE) at the gene level in 123 kinases (23%). Additionally, of the 468 kinases with alternative isoforms, 60 (13%) had significant difference in isoform ratios (DIR). Notably, DE and DIR have little correlation; for instance, although DE highlights enrichment in receptor tyrosine kinases (RTKs), DIR identifies altered splicing in non-receptor tyrosine kinases (nRTKs). Using exon junction mapping, we identify five examples of splicing events favored in metastatic samples. We demonstrate differential apoptosis and protein localization between SLK isoforms in metastatic melanoma. We cluster isoform expression data and identify subgroups that correlate with genomic subtypes and anatomic tumor locations. Notably, distinct DE and DIR patterns separate samples with BRAF hotspot mutations and (N/K/H)RAS hotspot mutations, the latter of which lacks effective kinase inhibitor treatments. DE in RAS mutants concentrates in CMGC kinases (a group including cell cycle and splicing regulators) rather than RTKs as in BRAF mutants. Furthermore, isoforms in the RAS kinase subgroup show enrichment for cancer-related processes such as angiogenesis and cell migration. 
Our results reveal a new approach to therapeutic target identification and demonstrate how different mutational subtypes may respond differently to treatments highlighting possible new driver events in cancer.
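The percentages quoted in the abstract above can be checked with simple arithmetic. The short sketch below recomputes them from the reported counts (123 of 538 kinases with differential expression, 60 of 468 kinases with a difference in isoform ratios); the helper name `percent` is an assumption introduced for illustration.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Counts taken from the abstract.
de_share = percent(123, 538)   # kinases with differential expression (DE)
dir_share = percent(60, 468)   # kinases with a difference in isoform ratios (DIR)

print(f"DE:  {de_share:.1f}%")
print(f"DIR: {dir_share:.1f}%")
```

Rounded to whole numbers, these shares agree with the 23% and 13% stated in the abstract.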
Affiliation(s)
- David O. Holland
  - Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Valer Gotea
  - Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Kevin Fedkenheuer
  - Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Sushil K. Jaiswal
  - Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Catherine Baugher
  - Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Hua Tan
  - Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Michael Fedkenheuer
  - Lymphocyte Nuclear Biology, National Institute of Arthritis and Musculoskeletal and Skin Diseases, National Institutes of Health, Bethesda, Maryland, United States of America
- Laura Elnitski
  - Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
18
|
Ringeling FR, Chakraborty S, Vissers C, Reiman D, Patel AM, Lee KH, Hong A, Park CW, Reska T, Gagneur J, Chang H, Spletter ML, Yoon KJ, Ming GL, Song H, Canzar S. Partitioning RNAs by length improves transcriptome reconstruction from short-read RNA-seq data. Nat Biotechnol 2022; 40:741-750. [PMID: 35013600 PMCID: PMC11332977 DOI: 10.1038/s41587-021-01136-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/09/2020] [Accepted: 10/26/2021] [Indexed: 02/06/2023]
Abstract
The accuracy of methods for assembling transcripts from short-read RNA sequencing data is limited by the lack of long-range information. Here we introduce Ladder-seq, an approach that separates transcripts according to their lengths before sequencing and uses the additional information to improve the quantification and assembly of transcripts. Using simulated data, we show that a kallisto algorithm extended to process Ladder-seq data quantifies transcripts of complex genes with substantially higher accuracy than conventional kallisto. For reference-based assembly, a tailored scheme based on the StringTie2 algorithm reconstructs a single transcript with 30.8% higher precision than its conventional counterpart and is more than 30% more sensitive for complex genes. For de novo assembly, a similar scheme based on the Trinity algorithm correctly assembles 78% more transcripts than conventional Trinity while improving precision by 78%. In experimental data, Ladder-seq reveals 40% more genes harboring isoform switches compared to conventional RNA sequencing and unveils widespread changes in isoform usage upon m6A depletion by Mettl14 knockout.
Affiliation(s)
- Caroline Vissers
  - Department of Biochemistry & Biophysics, University of California, San Francisco, San Francisco, CA, USA
- Derek Reiman
  - Department of Biomedical Engineering, University of Illinois at Chicago, Chicago, IL, USA
- Akshay M Patel
  - Gene Center, Ludwig-Maximilians-Universität München, Munich, Germany
- Ki-Heon Lee
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
- Ari Hong
  - Center for RNA Research, Institute for Basic Science (IBS), Seoul, Republic of Korea
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Chan-Woo Park
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
- Tim Reska
  - Gene Center, Ludwig-Maximilians-Universität München, Munich, Germany
- Julien Gagneur
  - Department of Informatics, Technical University of Munich, Garching, Germany
  - Institute of Human Genetics, Technical University of Munich, Munich, Germany
  - Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
- Hyeshik Chang
  - Center for RNA Research, Institute for Basic Science (IBS), Seoul, Republic of Korea
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
  - School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
- Maria L Spletter
  - Biomedical Center, Department of Physiological Chemistry, Ludwig-Maximilians-Universität München, Martinsried-Planegg, Germany
- Ki-Jun Yoon
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
- Guo-Li Ming
  - Department of Neuroscience and Mahoney Institute for Neurosciences, University of Pennsylvania, Philadelphia, PA, USA
- Hongjun Song
  - Department of Neuroscience and Mahoney Institute for Neurosciences, University of Pennsylvania, Philadelphia, PA, USA
- Stefan Canzar
  - Gene Center, Ludwig-Maximilians-Universität München, Munich, Germany.
19
|
Yolken RH, Kinnunen PM, Vapalahti O, Dickerson F, Suvisaari J, Chen O, Sabunciyan S. Studying the virome in psychiatric disease. Schizophr Res 2021; 234:78-86. [PMID: 34016507 DOI: 10.1016/j.schres.2021.04.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/22/2021] [Revised: 04/12/2021] [Accepted: 04/14/2021] [Indexed: 12/12/2022]
Abstract
An overlooked aspect of current microbiome studies is the role of viruses in human health. Compared to bacterial studies, laboratory and analytical methods to study the entirety of viral communities in clinical samples are rudimentary and need further refinement. In order to address this need, we developed Virobiome-Seq, a sequence capture method and an accompanying bioinformatics analysis pipeline, that identifies viral reads in human samples. Virobiome-Seq is able to enrich for and detect multiple types of viruses in human samples, including novel subtypes that diverge at the sequence level. In addition, Virobiome-Seq is able to detect RNA transcripts from DNA viruses and may provide a sensitive method for detecting viral activity in vivo. Since Virobiome-Seq also yields the viral sequence, it makes it possible to investigate associations between viral genotype and psychiatric illness. In this proof of concept study, we detected HIV1, Torque Teno, Pegi, Herpes and Papilloma virus sequences in Peripheral Blood Mononuclear Cells, plasma and stool samples collected from individuals with psychiatric disorders. We also detected the presence of numerous novel circular RNA viruses but were unable to determine whether these viruses originate from the sample or represent contaminants. Despite this challenge, we demonstrate that our knowledge of viral diversity is incomplete and opportunities for novel virus discovery exist. Virobiome-Seq will enable a more sophisticated analysis of the virome and has the potential of uncovering complex interactions between viral activity and psychiatric disease.
Affiliation(s)
- Robert H Yolken
  - Department of Pediatrics, Johns Hopkins University, Baltimore, MD, USA
- Paula M Kinnunen
  - Faculty of Veterinary Medicine, University of Helsinki, Helsinki, Finland
- Olli Vapalahti
  - Faculty of Veterinary Medicine, University of Helsinki, Helsinki, Finland; Department of Virology, Faculty of Medicine, University of Helsinki, Helsinki, Finland; HUS Diagnostic Center, HUSLAB, Clinical Microbiology, Helsinki University Hospital, Helsinki, Finland
- Faith Dickerson
  - Stanley Research Program, Sheppard Pratt, Baltimore, MD, USA
- Jaana Suvisaari
  - Finnish Institute for Health and Welfare (THL), Helsinki, Finland
- Ou Chen
  - Department of Pediatrics, Johns Hopkins University, Baltimore, MD, USA
- Sarven Sabunciyan
  - Department of Pediatrics, Johns Hopkins University, Baltimore, MD, USA.
20
|
Gatter T, Stadler PF. Ryūtō: Improved multi-sample transcript assembly for differential transcript expression analysis and more. Bioinformatics 2021; 37:4307-4313. [PMID: 34255826 DOI: 10.1093/bioinformatics/btab494] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/09/2021] [Revised: 06/21/2021] [Accepted: 07/01/2021] [Indexed: 01/12/2023] Open
Abstract
MOTIVATION Accurate assembly of RNA-seq is a crucial step in many analytic tasks such as gene annotation or expression studies. Despite ongoing research, progress on traditional single sample assembly has brought no major breakthrough. Multi-sample RNA-Seq experiments provide more information than single sample datasets and thus constitute a promising area of research. Yet, this advantage is challenging to utilize due to the large amount of accumulating errors. RESULTS We present an extension to Ryūtō enabling the reconstruction of consensus transcriptomes from multiple RNA-seq data sets, incorporating consensus calling at low level features. We report stable improvements already at 3 replicates. Ryūtō outperforms competing approaches, providing a better and user-adjustable sensitivity-precision trade-off. Ryūtō's unique ability to utilize a (incomplete) reference for multi sample assemblies greatly increases precision. We demonstrate benefits for differential expression analysis. CONCLUSION Ryūtō consistently improves assembly on replicates of the same tissue independent of filter settings, even when mixing conditions or time series. Consensus voting in Ryūtō is especially effective at high precision assembly, while Ryūtō's conventional mode can reach higher recall. AVAILABILITY Ryūtō is available at https://github.com/studla/RYUTO. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Thomas Gatter
  - Bioinformatics Group, Department of Computer Science & Interdisciplinary Center for Bioinformatics, Universität Leipzig, D-04107 Leipzig, Germany
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science & Interdisciplinary Center for Bioinformatics, Universität Leipzig, D-04107 Leipzig, Germany
  - Discrete Biomath Group, Max Planck Institute for Mathematics in the Sciences, D-04103 Leipzig, Germany
  - Institute for Theoretical Chemistry, University of Vienna, A-1090 Wien, Austria
  - Santa Fe Institute, Santa Fe, NM 87501, USA
21
|
Banerjee S, Bhandary P, Woodhouse M, Sen TZ, Wise RP, Andorf CM. FINDER: an automated software package to annotate eukaryotic genes from RNA-Seq data and associated protein sequences. BMC Bioinformatics 2021; 22:205. [PMID: 33879057 PMCID: PMC8056616 DOI: 10.1186/s12859-021-04120-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 02/05/2021] [Accepted: 04/07/2021] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Gene annotation in eukaryotes is a non-trivial task that requires meticulous analysis of accumulated transcript data. Challenges include transcriptionally active regions of the genome that contain overlapping genes, genes that produce numerous transcripts, transposable elements and numerous diverse sequence repeats. Currently available gene annotation software applications depend on pre-constructed full-length gene sequence assemblies which are not guaranteed to be error-free. The origins of these sequences are often uncertain, making it difficult to identify and rectify errors in them. This hinders the creation of an accurate and holistic representation of the transcriptomic landscape across multiple tissue types and experimental conditions. Therefore, to gauge the extent of diversity in gene structures, a comprehensive analysis of genome-wide expression data is imperative. RESULTS We present FINDER, a fully automated computational tool that optimizes the entire process of annotating genes and transcript structures. Unlike current state-of-the-art pipelines, FINDER automates the RNA-Seq pre-processing step by working directly with raw sequence reads and optimizes gene prediction from BRAKER2 by supplementing these reads with associated proteins. The FINDER pipeline (1) reports transcripts and recognizes genes that are expressed under specific conditions, (2) generates all possible alternatively spliced transcripts from expressed RNA-Seq data, (3) analyzes read coverage patterns to modify existing transcript models and create new ones, and (4) scores genes as high- or low-confidence based on the available evidence across multiple datasets. We demonstrate the ability of FINDER to automatically annotate a diverse pool of genomes from eight species. CONCLUSIONS FINDER takes a completely automated approach to annotate genes directly from raw expression data. 
It is capable of processing eukaryotic genomes of all sizes and requires no manual supervision, making it ideal for bench researchers with limited experience in handling computational tools.
Affiliation(s)
- Sagnik Banerjee
  - Program in Bioinformatics and Computational Biology, Iowa State University, Ames, IA, 50011, USA
  - Department of Statistics, Iowa State University, Ames, IA, 50011, USA
- Priyanka Bhandary
  - Program in Bioinformatics and Computational Biology, Iowa State University, Ames, IA, 50011, USA
  - Department of Genetics, Developmental and Cell Biology, Iowa State University, Ames, IA, 50011, USA
- Margaret Woodhouse
  - Corn Insects and Crop Genetics Research Unit, USDA-Agricultural Research Service, Ames, IA, 50011, USA
- Taner Z Sen
  - Crop Improvement and Genetics Research Unit, USDA-Agricultural Research Service, Albany, CA, 94710, USA
- Roger P Wise
  - Corn Insects and Crop Genetics Research Unit, USDA-Agricultural Research Service, Ames, IA, 50011, USA
  - Department of Plant Pathology and Microbiology, Iowa State University, Ames, IA, 50011, USA
- Carson M Andorf
  - Corn Insects and Crop Genetics Research Unit, USDA-Agricultural Research Service, Ames, IA, 50011, USA.
  - Department of Computer Science, Iowa State University, Ames, IA, 50011, USA.
| |
22
Zhou A, Xie S, Feng Y, Sun D, Liu S, Sun Z, Li M, Zhang C, Zou J. Insights Into the Albinism Mechanism for Two Distinct Color Morphs of Northern Snakehead, Channa argus Through Histological and Transcriptome Analyses. Front Genet 2020; 11:830. [PMID: 33193565 PMCID: PMC7530302 DOI: 10.3389/fgene.2020.00830] [Received: 04/17/2020] [Accepted: 07/09/2020] [Indexed: 12/20/2022] Open
Abstract
The great northern snakehead (Channa argus) is one of the most economically important fish in China and is also of conservation importance. In this study, the melanocytes in the skin of two distinct color morphs of C. argus were investigated and compared using microscopic analysis, hematoxylin and eosin (H&E) staining and Masson Fontana staining. Our results demonstrated an uneven distribution of melanocytes at extremely low density, most of which were in a state of aging or death. Meanwhile, no obvious pigment layer or melanocyte distribution pattern was found in the albino-type (AT), while melanocytes were abundant and evenly distributed in the bicolor-type (BT). Transcriptome analysis through Illumina HiSeq sequencing yielded a total of 34.93 Gb of clean data, with a Q30 base percentage of 92.66%. The BT and AT northern snakehead transcriptome data included a total of 56,039,701 and 60,410,063 clean reads (n = 3), respectively. In gene expression analyses, the sample correlation coefficients (r) ranged between 0.92 and 1.00; the contributions of PC1 and PC2 in PCA cluster analysis were 50.25 and 13.73%; the total number of DEGs was 1024 (559 up-regulated and 465 down-regulated); and the number of annotated DEGs was 767 (COG 172, KEGG 262, GO 288, SwissProt 548, Pfam 579 and NR 765). Additionally, 46,363 ± 873 and 44,947 ± 392 single nucleotide polymorphisms (SNPs) were compiled via genetic structure analysis, respectively. Ten key pigment-related genes were screened using qRT-PCR, and all of them showed markedly higher expression levels in the skin of BT than in that of AT. This is the first study to analyze the mechanism of albino characteristics of Channa via histology and transcriptomics, and it also provides theoretical and practical support for the protection and development of germplasm resources for C. argus.
Affiliation(s)
- Aiguo Zhou
  - Joint Laboratory of Guangdong Province and Hong Kong Region on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou, China
  - Guangdong Laboratory for Lingnan Modern Agriculture, South China Agricultural University, Guangzhou, China
- Shaolin Xie
  - Joint Laboratory of Guangdong Province and Hong Kong Region on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou, China
  - Guangdong Laboratory for Lingnan Modern Agriculture, South China Agricultural University, Guangzhou, China
- Yongyong Feng
  - Joint Laboratory of Guangdong Province and Hong Kong Region on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou, China
- Di Sun
  - Joint Laboratory of Guangdong Province and Hong Kong Region on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou, China
- Shulin Liu
  - Joint Laboratory of Guangdong Province and Hong Kong Region on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou, China
- Zhuolin Sun
  - Joint Laboratory of Guangdong Province and Hong Kong Region on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou, China
- Mingzhi Li
  - Independent Researcher, Guangzhou, China
- Chaonan Zhang
  - Joint Laboratory of Guangdong Province and Hong Kong Region on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou, China
- Jixing Zou
  - Joint Laboratory of Guangdong Province and Hong Kong Region on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou, China
  - Guangdong Laboratory for Lingnan Modern Agriculture, South China Agricultural University, Guangzhou, China