1
Wang Y, Li X, Lu W, Li F, Yao L, Liu Z, Shi H, Zhang W, Bai Y. Full-length circRNA sequencing method using low-input RNAs and profiling of circRNAs in MPTP-PD mice on a nanopore platform. Analyst 2024. [PMID: 39240088 DOI: 10.1039/d4an00715h] [Indexed: 09/07/2024]
Abstract
Considering the importance of accurate information of full-length (FL) transcripts in functional analysis, researchers prefer to develop new sequencing methods based on third-generation sequencing (TGS) rather than short-read sequencing. Several FL circRNA sequencing strategies have been developed. However, the current methods are inapplicable to low-biomass samples, since a large amount of total RNAs are acquired for circRNA enrichment before library preparation. In this work, we developed an effective method to detect FL circRNAs from a nanogram level (1-100 ng) of total RNAs based on a nanopore platform. Additionally, prior to the library preparation process, we added a series of 24 nt barcodes for each sample to reduce the cost and operating time. Using this method, we profiled circRNA expression in the striatum, hippocampus and cerebral cortex of a Parkinson's disease (PD) mouse model. Over 6% of reads were effective for FL circRNA identification in most datasets. Notably, a reduction in the RNA initial input resulted in a lower correlation between replicates and the detection efficiency for longer circRNA, but the lowest input (1 ng) was able to detect numerous FL circRNAs. Next, we systematically identified over 263 934 circRNAs in PD and healthy mice using the lower-input FL sequencing method, some of which came from 50.52% of PD-associated genes. Moreover, significant changes were observed in the circRNA expression pattern at an isoform level, and high-confidence protein translation evidence was predicted. Overall, we developed an effective method to characterize FL circRNAs from low-input samples and provide a comprehensive insight into the biological function of circRNAs in PD at an isoform level.
Affiliation(s)
- Ying Wang
- State Key Lab of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu, 210096, China.
- Xiaohan Li
- State Key Lab of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu, 210096, China.
- Hebei Key Laboratory of Animal Physiology, Biochemistry and Molecular Biology, College of Life Sciences, Hebei Normal University, Shijiazhuang, 050024, China
- Wenxiang Lu
- State Key Lab of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu, 210096, China.
- Fuyu Li
- State Key Lab of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu, 210096, China.
- Lingsong Yao
- State Key Lab of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu, 210096, China.
- Zhiyu Liu
- State Key Lab of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu, 210096, China.
- Huajuan Shi
- State Key Lab of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu, 210096, China.
- Weizhong Zhang
- Department of Ophthalmology, First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu, 210029, China.
- Yunfei Bai
- State Key Lab of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu, 210096, China.
2
Gao Z, Lu Y, Li M, Chong Y, Hong J, Wu J, Wu D, Xi D, Deng W. Application of Pan-Omics Technologies in Research on Important Economic Traits for Ruminants. Int J Mol Sci 2024; 25:9271. [PMID: 39273219 PMCID: PMC11394796 DOI: 10.3390/ijms25179271] [Received: 07/30/2024] [Revised: 08/23/2024] [Accepted: 08/26/2024] [Indexed: 09/15/2024] Open
Abstract
The economic significance of ruminants in agriculture underscores the need for advanced research methodologies to enhance their traits. This review aims to elucidate the transformative role of pan-omics technologies in ruminant research, focusing on their application in uncovering the genetic mechanisms underlying complex traits such as growth, reproduction, production performance, and rumen function. Pan-omics analysis not only helps in identifying key genes and their regulatory networks associated with important economic traits but also reveals the impact of environmental factors on trait expression. By integrating genomics, epigenomics, transcriptomics, metabolomics, and microbiomics, pan-omics enables a comprehensive analysis of the interplay between genetics and environmental factors, offering a holistic understanding of trait expression. We explore specific examples of economic traits where these technologies have been pivotal, highlighting key genes and regulatory networks identified through pan-omics approaches. Additionally, we trace the historical evolution of each omics field, detailing their progression from foundational discoveries to high-throughput platforms. This review provides a critical synthesis of recent advancements, offering new insights and practical recommendations for the application of pan-omics in the ruminant industry. The broader implications for modern animal husbandry are discussed, emphasizing the potential for these technologies to drive sustainable improvements in ruminant production systems.
Affiliation(s)
- Zhendong Gao
- Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, Kunming 650201, China
- Ying Lu
- Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- Mengfei Li
- Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- Yuqing Chong
- Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- Jieyun Hong
- Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- Jiao Wu
- Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- Dongwang Wu
- Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- Dongmei Xi
- Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- Weidong Deng
- Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, Kunming 650201, China
3
Santucci K, Cheng Y, Xu SM, Janitz M. Enhancing novel isoform discovery: leveraging nanopore long-read sequencing and machine learning approaches. Brief Funct Genomics 2024:elae031. [PMID: 39158328 DOI: 10.1093/bfgp/elae031] [Received: 05/22/2024] [Revised: 07/29/2024] [Accepted: 07/31/2024] [Indexed: 08/20/2024] Open
Abstract
Long-read sequencing technologies can capture entire RNA transcripts in a single sequencing read, reducing the ambiguity in constructing and quantifying transcript models in comparison to more common and earlier methods, such as short-read sequencing. Recent improvements in the accuracy of long-read sequencing technologies have expanded the scope for novel splice isoform detection and have also enabled a far more accurate reconstruction of complex splicing patterns and transcriptomes. Additionally, the incorporation and advancements of machine learning and deep learning algorithms in bioinformatic software have significantly improved the reliability of long-read sequencing transcriptomic studies. However, there is a lack of consensus on what bioinformatic tools and pipelines produce the most precise and consistent results. Thus, this review aims to discuss and compare the performance of available methods for novel isoform discovery with long-read sequencing technologies, with 25 tools being presented. Furthermore, this review intends to demonstrate the need for developing standard analytical pipelines, tools, and transcript model conventions for novel isoform discovery and transcriptomic studies.
Affiliation(s)
- Kristina Santucci
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW 2052, Australia
- Yuning Cheng
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW 2052, Australia
- Si-Mei Xu
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW 2052, Australia
- Michael Janitz
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW 2052, Australia
4
Odriozola I, Rasmussen JA, Gilbert MTP, Limborg MT, Alberdi A. A practical introduction to holo-omics. Cell Rep Methods 2024; 4:100820. [PMID: 38986611 PMCID: PMC11294832 DOI: 10.1016/j.crmeth.2024.100820] [Received: 12/14/2023] [Revised: 04/17/2024] [Accepted: 06/20/2024] [Indexed: 07/12/2024]
Abstract
Holo-omics refers to the joint study of non-targeted molecular data layers from host-microbiota systems or holobionts, which is increasingly employed to disentangle the complex interactions between the elements that compose them. We navigate through the generation, analysis, and integration of omics data, focusing on the commonalities and main differences to generate and analyze the various types of omics, with a special focus on optimizing data generation and integration. We advocate for careful generation and distillation of data, followed by independent exploration and analyses of the single omic layers to obtain a better understanding of the study system, before the integration of multiple omic layers in a final model is attempted. We highlight critical decision points to achieve this aim and flag the main challenges to address complex biological questions regarding the integrative study of host-microbiota relationships.
Affiliation(s)
- Iñaki Odriozola
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Jacob A Rasmussen
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- M Thomas P Gilbert
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark; University Museum, NTNU, Trondheim, Norway
- Morten T Limborg
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Antton Alberdi
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark.
5
Sun Y, Ko DH, Gao J, Fu K, Mao Y, He Y, Tian H. Engineering psychrophilic polymerase for nanopore long-read sequencing. Front Bioeng Biotechnol 2024; 12:1406722. [PMID: 39011153 PMCID: PMC11246872 DOI: 10.3389/fbioe.2024.1406722] [Received: 03/25/2024] [Accepted: 05/30/2024] [Indexed: 07/17/2024] Open
Abstract
Unveiling the potential application of psychrophilic polymerases as candidates for polymerase-nanopore long-read sequencing presents a departure from conventional choices such as thermophilic Bacillus stearothermophilus (Bst) polymerase, known for its temperature limitations, and mesophilic Bacillus subtilis phage (phi29) polymerase, limited by strong exonuclease activity and weak salt tolerance. Exploiting the PB-Bst fusion DNA polymerases from Psychrobacillus (PB) and Bacillus stearothermophilus (Bst), our structural and biochemical analyses reveal a remarkable enhancement in salt tolerance and a concurrent reduction in exonuclease activity, achieved through targeted substitution of a pivotal functional domain. The Sulfolobus 7-kDa protein (Sso7d) emerges as a standout fusion domain, imparting significant improvements in PB-Bst processivity. Notably, this study elucidates additional functional sites regulating exonuclease activity (Asp43 and Glu45) and processivity using artificial nucleotides (Glu266, Gln283, Leu334, Glu335, Ser426, and Asp430). By disclosing the intricate dynamics in exonuclease activity, strand displacement, and artificial nucleotide-based processivity at specific functional sites, our findings not only advance the fundamental understanding of psychrophilic polymerases but also provide novel insights into polymerase engineering.
Affiliation(s)
- Yun He
- Research Center of Molecular Diagnostics and Sequencing, Research Institute of Tsinghua University in Shenzhen, Shenzhen, China
- Hui Tian
- Research Center of Molecular Diagnostics and Sequencing, Research Institute of Tsinghua University in Shenzhen, Shenzhen, China
6
Zang H, Guo S, Dong S, Song Y, Li K, Fan X, Qiu J, Zheng Y, Jiang H, Wu Y, Lü Y, Chen D, Guo R. Construction of a Full-Length Transcriptome of Western Honeybee Midgut Tissue and Improved Genome Annotation. Genes (Basel) 2024; 15:728. [PMID: 38927663 PMCID: PMC11202838 DOI: 10.3390/genes15060728] [Received: 04/09/2024] [Revised: 05/22/2024] [Accepted: 05/26/2024] [Indexed: 06/28/2024] Open
Abstract
Honeybees are an indispensable pollinator in nature with pivotal ecological, economic, and scientific value. However, a full-length transcriptome for Apis mellifera, assembled with the advanced third-generation nanopore sequencing technology, has yet to be reported. Here, nanopore sequencing of the midgut tissues of uninoculated and Nosema ceranae-inoculated A. mellifera workers was conducted, and the full-length transcriptome was then constructed and annotated based on high-quality long reads. Next followed improvement of sequences and annotations of the current reference genome of A. mellifera. A total of 5,942,745 and 6,664,923 raw reads were produced from midguts of workers at 7 days post-inoculation (dpi) with N. ceranae and 10 dpi, while 7,100,161 and 6,506,665 raw reads were generated from the midguts of corresponding uninoculated workers. After strict quality control, 6,928,170, 6,353,066, 5,745,048, and 6,416,987 clean reads were obtained, with a length distribution ranging from 1 kb to 10 kb. Additionally, 16,824, 17,708, 15,744, and 18,246 full-length transcripts were respectively detected, including 28,019 nonredundant ones. Among these, 43,666, 30,945, 41,771, 26,442, and 24,532 full-length transcripts could be annotated to the Nr, KOG, eggNOG, GO, and KEGG databases, respectively. Additionally, 501 novel genes (20,326 novel transcripts) were identified for the first time, among which 401 (20,255), 193 (13,365), 414 (19,186), 228 (12,093), and 202 (11,703) were respectively annotated to each of the aforementioned five databases. The expression and sequences of three randomly selected novel transcripts were confirmed by RT-PCR and Sanger sequencing. The 5' UTR of 2082 genes, the 3' UTR of 2029 genes, and both the 5' and 3' UTRs of 730 genes were extended. Moreover, 17,345 SSRs, 14,789 complete ORFs, 1224 long non-coding RNAs (lncRNAs), and 650 transcription factors (TFs) from 37 families were detected. 
Findings from this work not only refine the annotation of the A. mellifera reference genome, but also provide a valuable resource and basis for relevant molecular and -omics studies.
Affiliation(s)
- He Zang
- College of Bee Science and Biomedicine, Fujian Agriculture and Forestry University, Fuzhou 350002, China; (H.Z.); (S.G.); (S.D.); (Y.S.); (K.L.); (X.F.); (J.Q.); (Y.Z.); (D.C.)
- National & Local United Engineering Laboratory of Natural Biotoxin, Fuzhou 350002, China
- Apitherapy Research Institute of Fujian Province, Fuzhou 350002, China
- Sijia Guo
- College of Bee Science and Biomedicine, Fujian Agriculture and Forestry University, Fuzhou 350002, China; (H.Z.); (S.G.); (S.D.); (Y.S.); (K.L.); (X.F.); (J.Q.); (Y.Z.); (D.C.)
- Shunan Dong
- College of Bee Science and Biomedicine, Fujian Agriculture and Forestry University, Fuzhou 350002, China; (H.Z.); (S.G.); (S.D.); (Y.S.); (K.L.); (X.F.); (J.Q.); (Y.Z.); (D.C.)
- Yuxuan Song
- College of Bee Science and Biomedicine, Fujian Agriculture and Forestry University, Fuzhou 350002, China; (H.Z.); (S.G.); (S.D.); (Y.S.); (K.L.); (X.F.); (J.Q.); (Y.Z.); (D.C.)
- Kunze Li
- College of Bee Science and Biomedicine, Fujian Agriculture and Forestry University, Fuzhou 350002, China; (H.Z.); (S.G.); (S.D.); (Y.S.); (K.L.); (X.F.); (J.Q.); (Y.Z.); (D.C.)
- Xiaoxue Fan
- College of Bee Science and Biomedicine, Fujian Agriculture and Forestry University, Fuzhou 350002, China; (H.Z.); (S.G.); (S.D.); (Y.S.); (K.L.); (X.F.); (J.Q.); (Y.Z.); (D.C.)
- National & Local United Engineering Laboratory of Natural Biotoxin, Fuzhou 350002, China
- Apitherapy Research Institute of Fujian Province, Fuzhou 350002, China
- Jianfeng Qiu
- College of Bee Science and Biomedicine, Fujian Agriculture and Forestry University, Fuzhou 350002, China; (H.Z.); (S.G.); (S.D.); (Y.S.); (K.L.); (X.F.); (J.Q.); (Y.Z.); (D.C.)
- National & Local United Engineering Laboratory of Natural Biotoxin, Fuzhou 350002, China
- Apitherapy Research Institute of Fujian Province, Fuzhou 350002, China
- Yidi Zheng
- College of Bee Science and Biomedicine, Fujian Agriculture and Forestry University, Fuzhou 350002, China; (H.Z.); (S.G.); (S.D.); (Y.S.); (K.L.); (X.F.); (J.Q.); (Y.Z.); (D.C.)
- Haibin Jiang
- Apiculture Science Institute of Jilin Province, Jilin 132000, China; (H.J.); (Y.W.)
- Ying Wu
- Apiculture Science Institute of Jilin Province, Jilin 132000, China; (H.J.); (Y.W.)
- Yang Lü
- Mudanjiang Branch of Heilongjiang Academy of Agricultural Sciences, Mudanjiang 157000, China
- Dafu Chen
- College of Bee Science and Biomedicine, Fujian Agriculture and Forestry University, Fuzhou 350002, China; (H.Z.); (S.G.); (S.D.); (Y.S.); (K.L.); (X.F.); (J.Q.); (Y.Z.); (D.C.)
- National & Local United Engineering Laboratory of Natural Biotoxin, Fuzhou 350002, China
- Apitherapy Research Institute of Fujian Province, Fuzhou 350002, China
- Rui Guo
- College of Bee Science and Biomedicine, Fujian Agriculture and Forestry University, Fuzhou 350002, China; (H.Z.); (S.G.); (S.D.); (Y.S.); (K.L.); (X.F.); (J.Q.); (Y.Z.); (D.C.)
- National & Local United Engineering Laboratory of Natural Biotoxin, Fuzhou 350002, China
- Apitherapy Research Institute of Fujian Province, Fuzhou 350002, China
7
Xing K, Li H, Wang X, Sun Y, Zhang J. A Full-Length Transcriptome and Analysis of the NHL-1 Gene Family in Neocaridina denticulata sinensis. Biology (Basel) 2024; 13:366. [PMID: 38927246 PMCID: PMC11200715 DOI: 10.3390/biology13060366] [Received: 04/16/2024] [Revised: 05/18/2024] [Accepted: 05/21/2024] [Indexed: 06/28/2024]
Abstract
Neocaridina denticulata sinensis has emerged as a promising model organism for basic studies in Decapoda. However, the current transcriptome information on this species is based on next-generation sequencing technologies, which are limited by a short read length. Therefore, the present study aimed to generate a full-length transcriptome assembly of N. denticulata sinensis utilizing the PacBio Sequel Ⅱ platform. The resulting transcriptome assembly comprised 5831 transcripts with an N50 value of 3697 bp. Remarkably, 90.5% of these transcripts represented novel isoforms of known genes. The transcripts were further searched against the NR, SwissProt, KEGG, KOG, GO, NT, and Pfam databases. A total of 24.8% of the transcripts could be annotated across all seven databases. Additionally, 1236 alternative splicing events, 344 transcription factors, and 124 long non-coding RNAs (lncRNAs) were predicted. Based on the alternative splicing annotation results, a RING finger protein NHL-1 gene from N. denticulata sinensis (NdNHL-1) was identified. NdNHL-1 has 15 transcripts. The longest transcript is 4995 bp in length and encodes a putative protein of 1665 amino acids. A phylogenetic analysis showed its close relationship with NHL-1 from other crustacean species. This report represents the full-length transcriptome of N. denticulata sinensis and will facilitate research on functional genomics and environmental adaptation in this species.
Affiliation(s)
- Kefan Xing
- School of Life Sciences/Hebei Basic Science Center for Biotic Interaction, Hebei University, Baoding 071002, China; (K.X.); (H.L.); (X.W.)
- Huimin Li
- School of Life Sciences/Hebei Basic Science Center for Biotic Interaction, Hebei University, Baoding 071002, China; (K.X.); (H.L.); (X.W.)
- Xiongfei Wang
- School of Life Sciences/Hebei Basic Science Center for Biotic Interaction, Hebei University, Baoding 071002, China; (K.X.); (H.L.); (X.W.)
- Yuying Sun
- School of Life Sciences/Hebei Basic Science Center for Biotic Interaction, Hebei University, Baoding 071002, China; (K.X.); (H.L.); (X.W.)
- Institute of Life Science and Green Development, Hebei University, Baoding 071002, China
- Jiquan Zhang
- School of Life Sciences/Hebei Basic Science Center for Biotic Interaction, Hebei University, Baoding 071002, China; (K.X.); (H.L.); (X.W.)
- Institute of Life Science and Green Development, Hebei University, Baoding 071002, China
8
Zhong Y, Yang Y, Wang X, Ren B, Wang X, Shan G, Chen L. Systematic identification and characterization of exon-intron circRNAs. Genome Res 2024; 34:376-393. [PMID: 38609186 PMCID: PMC11067877 DOI: 10.1101/gr.278590.123] [Received: 10/02/2023] [Accepted: 03/07/2024] [Indexed: 04/14/2024]
Abstract
Exon-intron circRNAs (EIciRNAs) are a circRNA subclass with retained introns. Global features of EIciRNAs remain largely unexplored, mainly owing to the lack of bioinformatic tools. The regulation of intron retention (IR) in EIciRNAs and the associated functionality also require further investigation. We developed a framework, FEICP, which efficiently detected EIciRNAs from high-throughput sequencing (HTS) data. EIciRNAs are distinct from exonic circRNAs (EcircRNAs) in aspects such as with larger length, localization in the nucleus, high tissue specificity, and enrichment mostly in the brain. Deep learning analyses revealed that compared with regular introns, the retained introns of circRNAs (CIRs) are shorter in length, have weaker splice site strength, and have higher GC content. Compared with retained introns in linear RNAs (LIRs), CIRs are more likely to form secondary structures and show greater sequence conservation. CIRs are closer to the 5'-end, whereas LIRs are closer to the 3'-end of transcripts. EIciRNA-generating genes are more actively transcribed and associated with epigenetic marks of gene activation. Computational analyses and genome-wide CRISPR screening revealed that SRSF1 binds to CIRs and inhibits the biogenesis of most EIciRNAs. SRSF1 regulates the biogenesis of EIciLIMK1, which enhances the expression of LIMK1 in cis to boost neuronal differentiation, exemplifying EIciRNA physiological function. Overall, our study has developed the FEICP pipeline to identify EIciRNAs from HTS data, and reveals multiple features of CIRs and EIciRNAs. SRSF1 has been identified to regulate EIciRNA biogenesis. EIciRNAs and the regulation of EIciRNA biogenesis play critical roles in neuronal differentiation.
Affiliation(s)
- Yinchun Zhong
- Department of Cardiology, The First Affiliated Hospital of USTC, Division of Life Science and Medicine, University of Science and Technology of China, Hefei 230027, China
- Yan Yang
- Hefei National Laboratory for Physical Sciences at Microscale, Department of Clinical Laboratory, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Science and Medicine, University of Science and Technology of China, Hefei 230027, China
- Xiaolin Wang
- Hefei National Laboratory for Physical Sciences at Microscale, Department of Clinical Laboratory, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Science and Medicine, University of Science and Technology of China, Hefei 230027, China
- Bingbing Ren
- Department of Pulmonary and Critical Care Medicine, Regional Medical Center for National Institute of Respiratory Diseases, Sir Run Run Shaw Hospital, School of Medicine, Zhejiang University, Hangzhou 310016, China
- Xueren Wang
- Department of Anesthesiology, Shanxi Bethune Hospital, Taiyuan 030032, China
- Department of Anesthesiology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Ge Shan
- Hefei National Laboratory for Physical Sciences at Microscale, Department of Clinical Laboratory, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Science and Medicine, University of Science and Technology of China, Hefei 230027, China
- Liang Chen
- Department of Cardiology, The First Affiliated Hospital of USTC, Division of Life Science and Medicine, University of Science and Technology of China, Hefei 230027, China
9
Tian Y, Liu X, Chen X, Wang B, Dong M, Chen L, Yang Z, Li Y, Sun H. Integrated Untargeted Metabolome, Full-Length Sequencing and Transcriptome Analyses Reveal the Mechanism of Flavonoid Biosynthesis in Blueberry (Vaccinium spp.) Fruit. Int J Mol Sci 2024; 25:4137. [PMID: 38673724 PMCID: PMC11050320 DOI: 10.3390/ijms25084137] [Received: 03/11/2024] [Revised: 04/01/2024] [Accepted: 04/04/2024] [Indexed: 04/28/2024] Open
Abstract
As a highly economic berry fruit crop, blueberry is enjoyed by most people and has various potential health benefits, many of which are attributed to the relatively high concentrations of flavonoids. To obtain more accurate and comprehensive transcripts, the full-length transcriptome of half-highbush blueberry (Vaccinium corymbosum/angustifolium cultivar Northland) obtained using single molecule real-time and next-generation sequencing technologies was reported for the first time. Overall, 147,569 consensus transcripts (average length, 2738 bp; N50, 3176 bp) were obtained. After quality control steps, 63,425 high-quality isoforms were obtained and 5030 novel genes, 3002 long non-coding RNAs, 3946 transcription factor genes (TFs), 30,540 alternative splicing events, and 2285 fusion gene pairs were identified. To better explore the molecular mechanism of flavonoid biosynthesis in mature blueberry fruit, an integrative analysis of the metabolome and transcriptome was performed on the exocarp, sarcocarp, and seed. A relatively complete biosynthesis pathway map of phenylpropanoids, flavonoids, and proanthocyanins in blueberry was constructed. The results of the joint analysis showed that the 228 functional genes and 42 TFs regulated 78 differentially expressed metabolites within the biosynthesis pathway of phenylpropanoids/flavonoids. O2PLS analysis results showed that the key metabolites differentially accumulated in blueberry fruit tissues were albireodelphin, delphinidin 3,5-diglucoside, delphinidin 3-O-rutinoside, and delphinidin 3-O-sophoroside, and 10 structural genes (4 Vc4CLs, 3 VcBZ1s, 1 VcUGT75C1, 1 VcAT, and 1 VcUGAT), 4 transporter genes (1 VcGSTF and 3 VcMATEs), and 10 TFs (1 VcMYB, 2 VcbHLHs, 4 VcWD40s, and 3 VcNACs) exhibited strong correlations with 4 delphinidin glycosides. These findings provide insights into the molecular mechanisms of flavonoid biosynthesis and accumulation in blueberry fruit.
Affiliation(s)
- Youwen Tian
- College of Horticulture, Jilin Agricultural University, Changchun 130118, China; (Y.T.); (X.L.); (X.C.); (B.W.); (L.C.)
- College of Life Sciences, Jilin Agricultural University, Changchun 130118, China
- Xinlei Liu
- College of Horticulture, Jilin Agricultural University, Changchun 130118, China; (Y.T.); (X.L.); (X.C.); (B.W.); (L.C.)
- Xuyang Chen
- College of Horticulture, Jilin Agricultural University, Changchun 130118, China; (Y.T.); (X.L.); (X.C.); (B.W.); (L.C.)
- Bowei Wang
- College of Horticulture, Jilin Agricultural University, Changchun 130118, China; (Y.T.); (X.L.); (X.C.); (B.W.); (L.C.)
- Mei Dong
- College of Life Sciences, Jilin Agricultural University, Changchun 130118, China
- Li Chen
- College of Horticulture, Jilin Agricultural University, Changchun 130118, China; (Y.T.); (X.L.); (X.C.); (B.W.); (L.C.)
- Zhengsong Yang
- High Mountain Economic Plant Research Institute, Yunnan Academy of Agricultural Sciences, Lijiang 674110, China
- Yadong Li
- College of Horticulture, Jilin Agricultural University, Changchun 130118, China; (Y.T.); (X.L.); (X.C.); (B.W.); (L.C.)
- Haiyue Sun
- College of Horticulture, Jilin Agricultural University, Changchun 130118, China; (Y.T.); (X.L.); (X.C.); (B.W.); (L.C.)
10
Huang J, Xiong X, Zhang W, Chen X, Wei Y, Li H, Xie J, Wei Q, Zhou Q. Integrating miRNA and full-length transcriptome profiling to elucidate the mechanism of muscle growth in Muscovy ducks reveals key roles for miR-301a-3p/ANKRD1. BMC Genomics 2024; 25:340. [PMID: 38575872 PMCID: PMC10993543 DOI: 10.1186/s12864-024-10138-z] [Received: 09/18/2023] [Accepted: 02/19/2024] [Indexed: 04/06/2024] Open
Abstract
BACKGROUND The popularity of Muscovy ducks is attributed not only to their conformation traits but also to their slightly higher content of breast and leg meat, as well as their stronger-tasting meat compared to that of typical domestic ducks. However, there is a lack of comprehensive systematic research on the development of breast muscle in Muscovy ducks. In addition, since the number of skeletal muscle myofibers is established during the embryonic period, this study conducted a full-length transcriptome sequencing and microRNA sequencing of the breast muscle. Muscovy ducks at four developmental stages, namely Embryonic Day 21 (E21), Embryonic Day 27 (E27), Hatching Day (D0), and Post-hatching Day 7 (D7), were used to isolate total RNA for analysis. RESULTS A total of 68,161 genes and 472 mature microRNAs were identified. In order to uncover deeper insights into the regulation of mRNA by miRNAs, we conducted an integration of the differentially expressed miRNAs (known as DEMs) with the differentially expressed genes (referred to as DEGs) across various developmental stages. This integration allowed us to make predictions regarding the interactions between miRNAs and mRNA. Through this analysis, we identified a total of 274 DEGs that may serve as potential targets for the 68 DEMs. In the predicted miRNA‒mRNA interaction networks, let-7b, miR-133a-3p, miR-301a-3p, and miR-338-3p were the hub miRNAs. In addition, multiple DEMs also showed predicted target relationships with the DEGs associated with skeletal system development. These identified DEGs and DEMs as well as their predicted interaction networks involved in the regulation of energy homeostasis and muscle development were most likely to play critical roles in facilitating the embryo-to-hatchling transition. A candidate miRNA, miR-301a-3p, exhibited increased expression during the differentiation of satellite cells and was downregulated in the breast muscle tissues of Muscovy ducks at E21 compared to E27. 
A dual-luciferase reporter assay suggested that the ANKRD1 gene, which encodes a transcription factor, is a direct target of miR-301a-3p. CONCLUSIONS miR-301a-3p suppressed the posttranscriptional activity of ANKRD1, which is an activator of satellite cell proliferation, as determined with gain- and loss-of-function experiments. miR-301a-3p functions as an inducer of myogenesis by targeting the ANKRD1 gene in Muscovy ducks. These results provide novel insights into the early developmental process of black Muscovy breast muscles and will improve understanding of the underlying molecular mechanisms.
Affiliation(s)
- Jiangnan Huang
- Institute of Animal Husbandry and Veterinary Medicine, Jiangxi Academy of Agricultural Sciences, Nanchang, 330200, China
- Xiaolan Xiong
- Institute of Animal Husbandry and Veterinary Medicine, Jiangxi Academy of Agricultural Sciences, Nanchang, 330200, China
- Weihong Zhang
- Institute of Animal Husbandry and Veterinary Medicine, Jiangxi Academy of Agricultural Sciences, Nanchang, 330200, China
- Xiaolian Chen
- Institute of Animal Husbandry and Veterinary Medicine, Jiangxi Academy of Agricultural Sciences, Nanchang, 330200, China
- Yue Wei
- Institute of Animal Husbandry and Veterinary Medicine, Jiangxi Academy of Agricultural Sciences, Nanchang, 330200, China
- Haiqin Li
- Institute of Animal Husbandry and Veterinary Medicine, Jiangxi Academy of Agricultural Sciences, Nanchang, 330200, China
- Jinfang Xie
- Institute of Animal Husbandry and Veterinary Medicine, Jiangxi Academy of Agricultural Sciences, Nanchang, 330200, China
- Qipeng Wei
- Institute of Animal Husbandry and Veterinary Medicine, Jiangxi Academy of Agricultural Sciences, Nanchang, 330200, China.
- Quanyong Zhou
- Institute of Animal Husbandry and Veterinary Medicine, Jiangxi Academy of Agricultural Sciences, Nanchang, 330200, China.
|
11
|
Wang Y, Xie Z, Kutschera E, Adams JI, Kadash-Edmondson KE, Xing Y. rMATS-turbo: an efficient and flexible computational tool for alternative splicing analysis of large-scale RNA-seq data. Nat Protoc 2024; 19:1083-1104. [PMID: 38396040 DOI: 10.1038/s41596-023-00944-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2022] [Accepted: 11/02/2023] [Indexed: 02/25/2024]
Abstract
Pre-mRNA alternative splicing is a prevalent mechanism for diversifying eukaryotic transcriptomes and proteomes. Regulated alternative splicing plays a role in many biological processes, and dysregulated alternative splicing is a feature of many human diseases. Short-read RNA sequencing (RNA-seq) is now the standard approach for transcriptome-wide analysis of alternative splicing. Since 2011, our laboratory has developed and maintained Replicate Multivariate Analysis of Transcript Splicing (rMATS), a computational tool for discovering and quantifying alternative splicing events from RNA-seq data. Here we provide a protocol for the contemporary version of rMATS, rMATS-turbo, a fast and scalable re-implementation that maintains the statistical framework and user interface of the original rMATS software, while incorporating a revamped computational workflow with a substantial improvement in speed and data storage efficiency. The rMATS-turbo software scales up to massive RNA-seq datasets with tens of thousands of samples. To illustrate the utility of rMATS-turbo, we describe two representative application scenarios. First, we describe a broadly applicable two-group comparison to identify differential alternative splicing events between two sample groups, including both annotated and novel alternative splicing events. Second, we describe a quantitative analysis of alternative splicing in a large-scale RNA-seq dataset (~1,000 samples), including the discovery of alternative splicing events associated with distinct cell states. We detail the workflow and features of rMATS-turbo that enable efficient parallel processing and analysis of large-scale RNA-seq datasets on a compute cluster. We anticipate that this protocol will help the broad user base of rMATS-turbo make the best use of this software for studying alternative splicing in diverse biological systems.
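As a rough illustration of what quantifying an alternative splicing event in a two-group comparison can look like, the sketch below computes a commonly used length-normalized inclusion level (percent spliced in) from read counts and takes the difference of group means. It is a minimal sketch and not the rMATS-turbo implementation: the function names, the example counts, and the effective lengths are all hypothetical, and no statistical test is performed.

```python
from statistics import mean

def inclusion_level(inc_reads, skip_reads, inc_len, skip_len):
    """Length-normalized inclusion level for one sample.

    inc_reads / skip_reads: reads supporting the inclusion / skipping isoform.
    inc_len / skip_len: effective lengths used to normalize those counts
    (hypothetical inputs; how they are derived is outside this sketch).
    Returns None when the event has no supporting reads.
    """
    inc = inc_reads / inc_len
    skip = skip_reads / skip_len
    if inc + skip == 0:
        return None
    return inc / (inc + skip)

def inclusion_difference(group1, group2):
    """Difference of mean inclusion levels between two sample groups.

    Each group is a list of (inc_reads, skip_reads, inc_len, skip_len) tuples;
    samples without supporting reads are ignored.
    """
    def group_mean(samples):
        levels = [inclusion_level(*s) for s in samples]
        levels = [lv for lv in levels if lv is not None]
        return mean(levels) if levels else None

    m1, m2 = group_mean(group1), group_mean(group2)
    if m1 is None or m2 is None:
        return None
    return m1 - m2

# Hypothetical counts for a single exon-skipping event in two groups.
group_a = [(30, 10, 2, 1), (60, 20, 2, 1)]
group_b = [(10, 10, 2, 1), (20, 20, 2, 1)]
difference = inclusion_difference(group_a, group_b)
```

A real analysis would model replicate variability and test significance rather than compare raw means.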
Affiliation(s)
- Yuanyuan Wang
- Bioinformatics Interdepartmental Graduate Program, University of California, Los Angeles, Los Angeles, CA, USA
- Center for Computational and Genomic Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Zhijie Xie
- Center for Computational and Genomic Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Eric Kutschera
- Center for Computational and Genomic Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Jenea I Adams
- Center for Computational and Genomic Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Genomics and Computational Biology Graduate Program, University of Pennsylvania, Philadelphia, PA, USA
- Kathryn E Kadash-Edmondson
- Center for Computational and Genomic Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Yi Xing
- Center for Computational and Genomic Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA, USA.
- Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA, USA.
- Department of Biomedical and Health Informatics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA.
|
12
|
Hao J, Liang Y, Ping J, Wang T, Su Y. Full-length transcriptome analysis of Ophioglossum vulgatum: effects of experimentally identified chloroplast gene clusters on expression and evolutionary patterns. Plant Molecular Biology 2024; 114:31. [PMID: 38509284 DOI: 10.1007/s11103-024-01423-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Accepted: 01/24/2024] [Indexed: 03/22/2024]
Abstract
Genes with similar or related functions in chloroplasts are often arranged in close proximity, forming clusters on chromosomes. These clusters are transcribed in a coordinated manner to facilitate the expression of genes with specific functions. Our previous study revealed a significant negative correlation between the chloroplast gene expression level of the rare medicinal fern Ophioglossum vulgatum and its evolutionary rates as well as selection pressure. Therefore, in this study, we employed a combination of SMRT and Illumina sequencing technology to sequence and analyze the full-length transcriptome of O. vulgatum for the first time. In particular, we experimentally identified gene clusters based on transcriptome data and investigated the effects of chloroplast gene clustering on expression and evolutionary patterns. The results revealed that the total sequenced data volume of the full-length transcriptome of O. vulgatum amounted to 71,950,652,163 bp, and 110 chloroplast genes received transcript coverage. Nine different types of gene clusters were experimentally identified in their transcripts. The chloroplast cluster genes may cause a decrease in non-synonymous substitution rate and selection pressure, as well as a reduction in transversion rate, transition rate, and their ratio. In contrast, expression levels of chloroplast cluster genes in leaf, sporangium, and stem tended to be relatively elevated. The Mann-Whitney U test indicated statistically significant differences in the selection pressure, sporangia and leaves groups (P < 0.05). We have contributed novel full-length transcriptome data resources for ferns, presenting new evidence on the effects of chloroplast gene clustering on expression and evolutionary patterns, and offering new theoretical support for transgenic research through gene clustering.
Affiliation(s)
- Jing Hao
- School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, China
- Yingyi Liang
- College of Life Sciences, South China Agricultural University, Guangzhou, 510642, China
- Jingyao Ping
- School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, China
- Ting Wang
- College of Life Sciences, South China Agricultural University, Guangzhou, 510642, China.
- Yingjuan Su
- School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, China.
- Research Institute of Sun Yat-sen University in Shenzhen, Shenzhen, 518057, China.
|
13
|
Liu X, Zheng J, Ding J, Wu J, Zuo F, Zhang G. When Livestock Genomes Meet Third-Generation Sequencing Technology: From Opportunities to Applications. Genes (Basel) 2024; 15:245. [PMID: 38397234 PMCID: PMC10888458 DOI: 10.3390/genes15020245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Revised: 01/30/2024] [Accepted: 02/10/2024] [Indexed: 02/25/2024] Open
Abstract
Third-generation sequencing technology has found widespread application in the genomic, transcriptomic, and epigenetic research of both human and livestock genetics. This technology offers significant advantages in the sequencing of complex genomic regions, the identification of intricate structural variations, and the production of high-quality genomes. Its attributes, including long sequencing reads, obviation of PCR amplification, and direct determination of DNA/RNA, contribute to its efficacy. This review presents a comprehensive overview of third-generation sequencing technologies, exemplified by single-molecule real-time sequencing (SMRT) and Oxford Nanopore Technology (ONT). Emphasizing the research advancements in livestock genomics, the review delves into genome assembly, structural variation detection, transcriptome sequencing, and epigenetic investigations enabled by third-generation sequencing. A comprehensive analysis is conducted on the application and potential challenges of third-generation sequencing technology for genome detection in livestock. Beyond providing valuable insights into genome structure analysis and the identification of rare genes in livestock, the review ventures into an exploration of the genetic mechanisms underpinning exemplary traits. This review not only contributes to our understanding of the genomic landscape in livestock but also provides fresh perspectives for the advancement of research in this domain.
Affiliation(s)
- Xinyue Liu
- College of Animal Science and Technology, Southwest University, Rongchang, Chongqing 402460, China; (X.L.); (J.Z.); (J.D.); (J.W.); (F.Z.)
- Junyuan Zheng
- College of Animal Science and Technology, Southwest University, Rongchang, Chongqing 402460, China; (X.L.); (J.Z.); (J.D.); (J.W.); (F.Z.)
- Jialan Ding
- College of Animal Science and Technology, Southwest University, Rongchang, Chongqing 402460, China; (X.L.); (J.Z.); (J.D.); (J.W.); (F.Z.)
- Jiaxin Wu
- College of Animal Science and Technology, Southwest University, Rongchang, Chongqing 402460, China; (X.L.); (J.Z.); (J.D.); (J.W.); (F.Z.)
- Fuyuan Zuo
- College of Animal Science and Technology, Southwest University, Rongchang, Chongqing 402460, China; (X.L.); (J.Z.); (J.D.); (J.W.); (F.Z.)
- Beef Cattle Engineering and Technology Research Center of Chongqing, Southwest University, Rongchang, Chongqing 402460, China
- Gongwei Zhang
- College of Animal Science and Technology, Southwest University, Rongchang, Chongqing 402460, China; (X.L.); (J.Z.); (J.D.); (J.W.); (F.Z.)
- Beef Cattle Engineering and Technology Research Center of Chongqing, Southwest University, Rongchang, Chongqing 402460, China
|
14
|
Adams M, Vollmers C. Generation and analysis of a mouse multi-tissue genome annotation atlas. bioRxiv 2024:2024.01.31.578267. [PMID: 38352519 PMCID: PMC10862843 DOI: 10.1101/2024.01.31.578267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/24/2024]
Abstract
Generating an accurate and complete genome annotation for an organism is complex because the cells within each tissue can express a unique set of transcript isoforms from a unique set of genes. A comprehensive genome annotation should contain information on what tissues express what transcript isoforms at what level. This tissue-level isoform information can then inform a wide range of research questions as well as experiment designs. Long-read sequencing technology combined with advanced full-length cDNA library preparation methods has now achieved throughput and accuracy where generating these types of annotations is achievable. Here, we show this by generating a genome annotation of the mouse (Mus musculus). We used the nanopore-based R2C2 long-read sequencing method to generate 64 million highly accurate full length cDNA consensus reads - averaging 5.4 million reads per tissue for a dozen tissues. Using the Mandalorion tool we processed these reads to generate the Tissue-level Atlas of Mouse Isoforms (TAMI - available at https://genome.ucsc.edu/s/vollmers/TAMI) which we believe will be a valuable complement to conventional, manually curated reference genome annotations.
Affiliation(s)
- Matthew Adams
- Department of Molecular, Cellular, and Developmental Biology, University of California Santa Cruz
|
15
|
Liao T, Zhang L, Wang Y, Guo L, Cao J, Liu G. Full-length transcriptome characterization of Platycladus orientalis based on the PacBio platform. Front Genet 2024; 15:1345039. [PMID: 38304337 PMCID: PMC10830785 DOI: 10.3389/fgene.2024.1345039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 01/10/2024] [Indexed: 02/03/2024] Open
Abstract
As a unique and native conifer in China, Platycladus orientalis is widely used in soil erosion control, garden landscapes, timber, and traditional Chinese medicine. However, the lack of a reference genome and transcriptome has limited further research into its molecular mechanisms and gene function mining. To develop a full-length reference transcriptome, tissues from five different parts of P. orientalis and four cone developmental stages were sequenced and analyzed by single-molecule real-time (SMRT) sequencing on the PacBio platform in this study. Overall, 37,111 isoforms were detected by PacBio with an N50 length of 2,317 nt, an average length of 1,999 bp, and a GC content of 41.81%. Meanwhile, 36,120 coding sequences, 5,645 simple sequence repeats (SSRs), 1,201 long non-coding RNAs (lncRNAs), and 182 alternative splicing (AS) events of five types were identified from the PacBio transcript isoforms. Furthermore, 1,659 transcription factors (TFs) were detected, belonging to 51 TF families. A total of 35,689 transcripts (96.17%) were annotated through the NCBI nr, KOG, Swiss-Prot and KEGG databases, and 385 transcript isoforms related to 8 types of hormones were identified in plant hormone signal transduction pathways. The assembly and characterization of the full-length transcriptome of P. orientalis offer a pioneering insight for future investigations into gene function and genetic breeding within Platycladus species.
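The summary statistics quoted above (N50, average length, GC content) follow standard definitions, which can be sketched as below. This is a minimal illustration, not the pipeline used in the study: the function names and the toy sequences are hypothetical, and GC content is computed here over A/C/G/T bases only, which is an assumption about how ambiguous bases are handled.

```python
def n50(lengths):
    """Largest length L such that sequences of length >= L
    together contain at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no sequences given")

def mean_length(lengths):
    """Arithmetic mean of the sequence lengths."""
    return sum(lengths) / len(lengths)

def gc_content(sequences):
    """Percentage of G and C among all A/C/G/T bases (ambiguous bases ignored)."""
    gc = 0
    acgt = 0
    for seq in sequences:
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        acgt += sum(seq.count(base) for base in "ACGT")
    return 100.0 * gc / acgt if acgt else 0.0

# Toy isoform set (hypothetical sequences, far shorter than real transcripts).
isoforms = ["ATGC" * 3, "GGCC" * 2, "ATAT" * 5, "GCGCAT"]
isoform_lengths = [len(s) for s in isoforms]
summary = (n50(isoform_lengths), mean_length(isoform_lengths), gc_content(isoforms))
```

Because N50 weights long sequences more heavily than the mean does, an N50 above the average length, as reported in the abstract, is the usual pattern for transcript sets.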
Affiliation(s)
- Guobin Liu
- Institute of Forestry and Pomology, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
|
16
|
Xu R, Prakoso D, Salvador LCM, Rajeev S. Leptospira transcriptome sequencing using long-read technology reveals unannotated transcripts and potential polyadenylation of RNA molecules. Microbiol Spectr 2023; 11:e0223423. [PMID: 37861327 PMCID: PMC10715090 DOI: 10.1128/spectrum.02234-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 09/11/2023] [Indexed: 10/21/2023] Open
Abstract
IMPORTANCE Leptospirosis, caused by the spirochete bacteria Leptospira, is a zoonotic disease of humans and animals, accounting for over 1 million annual human cases and over 60,000 deaths. We have characterized operon transcriptional units, identified novel RNA coding regions, and reported evidence of potential posttranscriptional polyadenylation in the Leptospira transcriptomes for the first time using Oxford Nanopore Technology RNA sequencing protocols. The newly identified RNA coding regions and operon transcriptional units were detected only in the pathogenic Leptospira transcriptomes, suggesting their significance in virulence-related functions. This article integrates bioinformatics, infectious diseases, microbiology, molecular biology, veterinary sciences, and public health. Given the current knowledge gap in the regulation of leptospiral pathogenicity, our findings offer valuable insights to researchers studying leptospiral pathogenicity and provide both a basis and a tool for researchers focusing on prokaryotic molecular studies for the understanding of RNA compositions and prokaryotic polyadenylation for their organisms of interest.
Affiliation(s)
- Ruijie Xu
- Institute of Bioinformatics, University of Georgia, Athens, Georgia, USA
- Center for the Ecology of Infectious Diseases, University of Georgia, Athens, Georgia, USA
- Dhani Prakoso
- Department of Biomedical and Diagnostic Sciences, College of Veterinary Medicine, University of Tennessee, Knoxville, Tennessee, USA
- Liliana C. M. Salvador
- Institute of Bioinformatics, University of Georgia, Athens, Georgia, USA
- Center for the Ecology of Infectious Diseases, University of Georgia, Athens, Georgia, USA
- Department of Infectious Diseases, College of Veterinary Medicine, University of Georgia, Athens, Georgia, USA
- Sreekumari Rajeev
- Department of Biomedical and Diagnostic Sciences, College of Veterinary Medicine, University of Tennessee, Knoxville, Tennessee, USA
|
17
|
Zhang C, Fang Y, Chen W, Chen Z, Zhang Y, Xie Y, Chen W, Xie Z, Guo M, Wang J, Tan C, Wang H, Tang C. Improving the RNA velocity approach with single-cell RNA lifecycle (nascent, mature and degrading RNAs) sequencing technologies. Nucleic Acids Res 2023; 51:e112. [PMID: 37941145 PMCID: PMC10711548 DOI: 10.1093/nar/gkad969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 09/27/2023] [Accepted: 10/14/2023] [Indexed: 11/10/2023] Open
Abstract
We presented an experimental method called FLOUR-seq, which combines BD Rhapsody and nanopore sequencing to detect the RNA lifecycle (including nascent, mature, and degrading RNAs) in cells. Additionally, we updated our HIT-scISOseq V2 to discover a more accurate RNA lifecycle using 10x Chromium and Pacbio sequencing. Most importantly, to explore how single-cell full-length RNA sequencing technologies could help improve the RNA velocity approach, we introduced a new algorithm called 'Region Velocity' to more accurately configure cellular RNA velocity. We applied this algorithm to study spermiogenesis and compared the performance of FLOUR-seq with Pacbio-based HIT-scISOseq V2. Our findings demonstrated that 'Region Velocity' is more suitable for analyzing single-cell full-length RNA data than traditional RNA velocity approaches. These novel methods could be useful for researchers looking to discover full-length RNAs in single cells and comprehensively monitor RNA lifecycle in cells.
Affiliation(s)
- Weitian Chen
- BGI, Shenzhen 518000, China
- BGI Education Center, University of Chinese Academy of Sciences, Shenzhen 518083, China
- Ying Zhang
- Guangdong Provincial Reproductive Science Institute (Guangdong Provincial Fertility Hospital), Guangzhou, China; NHC Key Laboratory of Male Reproduction and Genetics, Guangzhou, China
|
18
|
Guizzo MG, Mans B, Pienaar R, Ribeiro JMC. A comparison of Illumina and PacBio methods to build tick salivary gland transcriptomes confirms large expression of lipocalins and other salivary protein families that are not represented in available tick genomes. Ticks Tick Borne Dis 2023; 14:102209. [PMID: 37327738 PMCID: PMC10527494 DOI: 10.1016/j.ttbdis.2023.102209] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/05/2023] [Revised: 05/29/2023] [Accepted: 05/30/2023] [Indexed: 06/18/2023]
Abstract
Tick saliva helps blood feeding through its antihemostatic and immunomodulatory activities. Tick salivary gland transcriptomes (sialotranscriptomes) revealed thousands of transcripts coding for putative secreted polypeptides. Hundreds of these transcripts code for groups of similar proteins, constituting protein families, such as the lipocalins and metalloproteases. However, while many of these transcriptome-derived protein sequences match sequences predicted by tick genome assemblies, the majority are not represented in these proteomes. The diversity of these transcriptome-derived transcripts could derive from artifacts generated during assembly of short Illumina reads or from polymorphisms of the genes coding for these proteins. To investigate this discrepancy, we collected salivary glands from blood-feeding ticks and, from the same homogenate, made and sequenced libraries following Illumina and PacBio protocols, with the assumption that the longer PacBio reads would reveal the sequences generated by the assembly of Illumina reads. Using both Rhipicephalus zambeziensis and Ixodes scapularis ticks, we obtained more lipocalin transcripts from the Illumina library than from the PacBio library. To verify whether these unique Illumina transcripts were real, we selected 9 uniquely Illumina-derived lipocalin transcripts from I. scapularis and attempted to obtain PCR products. These were obtained and their sequences confirmed the presence of these transcripts in the I. scapularis salivary homogenate. We further compared the predicted salivary lipocalins and metalloproteases from I. scapularis sialotranscriptomes with those found in the predicted proteomes of 3 publicly available genomes of I. scapularis. Results indicate that the discrepancy between the genome and transcriptome sequences for these salivary protein families is due to a high degree of polymorphism within these genes.
Affiliation(s)
- Melina Garcia Guizzo
- Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, Rockville, MD, 20852, USA
- Ben Mans
- Epidemiology, Parasites and Vectors, Agricultural Research Council-Onderstepoort Veterinary Research, Onderstepoort, South Africa; The Department of Veterinary Tropical Diseases, University of Pretoria, Pretoria, South Africa; Department of Life and Consumer Sciences, University of South Africa, Pretoria, South Africa
- Ronel Pienaar
- Epidemiology, Parasites and Vectors, Agricultural Research Council-Onderstepoort Veterinary Research, Onderstepoort, South Africa
- Jose M C Ribeiro
- Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, Rockville, MD, 20852, USA.
|
19
|
Rele CP, Sandlin KM, Leung W, Reed LK. Manual annotation of Drosophila genes: a Genomics Education Partnership protocol. F1000Res 2023; 11:1579. [PMID: 37854289 PMCID: PMC10579860 DOI: 10.12688/f1000research.126839.3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/26/2023] [Indexed: 10/20/2023] Open
Abstract
Annotating the genomes of multiple species allows us to analyze the evolution of their genes. While many eukaryotic genome assemblies already include computational gene predictions, these predictions can benefit from review and refinement through manual gene annotation. The Genomics Education Partnership (GEP; https://thegep.org/) developed a structural annotation protocol for protein-coding genes that enables undergraduate student and faculty researchers to create high-quality gene annotations that can be utilized in subsequent scientific investigations. For example, this protocol has been utilized by the GEP faculty to engage undergraduate students in the comparative annotation of genes involved in the insulin signaling pathway in 27 Drosophila species, using D. melanogaster as the reference genome. Students construct gene models using multiple lines of computational and empirical evidence including expression data (e.g., RNA-Seq), sequence similarity (e.g., BLAST and multiple sequence alignment), and computational gene predictions. Quality control measures require each gene be annotated by at least two students working independently, followed by reconciliation of the submitted gene models by a more experienced student. This article provides an overview of the annotation protocol and describes how discrepancies in student submitted gene models are resolved to produce a final, high-quality gene set suitable for subsequent analyses. The protocol can be adapted to other scientific questions (e.g., expansion of the Drosophila Muller F element) and species (e.g., parasitoid wasps) to provide additional opportunities for undergraduate students to participate in genomics research. These student annotation efforts can substantially improve the quality of gene annotations in publicly available genomic databases.
Affiliation(s)
- Chinmay P. Rele
- Department of Biological Sciences, The University of Alabama, Tuscaloosa, Alabama, 35487, USA
- Katie M. Sandlin
- Department of Biological Sciences, The University of Alabama, Tuscaloosa, Alabama, 35487, USA
- Wilson Leung
- Department of Biology, Washington University in St. Louis, St. Louis, Missouri, 63130, USA
- Laura K. Reed
- Department of Biological Sciences, The University of Alabama, Tuscaloosa, Alabama, 35487, USA
|
20
|
Wang R, Helbig I, Edmondson AC, Lin L, Xing Y. Splicing defects in rare diseases: transcriptomics and machine learning strategies towards genetic diagnosis. Brief Bioinform 2023; 24:bbad284. [PMID: 37580177 PMCID: PMC10516351 DOI: 10.1093/bib/bbad284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Revised: 07/10/2023] [Accepted: 07/20/2023] [Indexed: 08/16/2023] Open
Abstract
Genomic variants affecting pre-messenger RNA splicing and its regulation are known to underlie many rare genetic diseases. However, common workflows for genetic diagnosis and clinical variant interpretation frequently overlook splice-altering variants. To better serve patient populations and advance biomedical knowledge, it has become increasingly important to develop and refine approaches for detecting and interpreting pathogenic splicing variants. In this review, we will summarize a few recent developments and challenges in using RNA sequencing technologies for rare disease investigation. Moreover, we will discuss how recent computational splicing prediction tools have emerged as complementary approaches for revealing disease-causing variants underlying splicing defects. We speculate that continuous improvements to sequencing technologies and predictive modeling will not only expand our understanding of splicing regulation but also bring us closer to filling the diagnostic gap for rare disease patients.
Affiliation(s)
- Robert Wang
- Center for Computational and Genomic Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Genomics and Computational Biology Graduate Program, University of Pennsylvania, Philadelphia, PA 19104, USA
- Ingo Helbig
- The Epilepsy NeuroGenetics Initiative, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Division of Neurology, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Department of Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Department of Neurology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Andrew C Edmondson
- Center for Computational and Genomic Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Department of Pediatrics, Division of Human Genetics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Lan Lin
- Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Yi Xing
- Center for Computational and Genomic Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Department of Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
|
21
|
Engelhard CA, Khani S, Derdak S, Bilban M, Kornfeld JW. Nanopore sequencing unveils the complexity of the cold-activated murine brown adipose tissue transcriptome. iScience 2023; 26:107190. [PMID: 37564700 PMCID: PMC10410515 DOI: 10.1016/j.isci.2023.107190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Revised: 04/28/2023] [Accepted: 06/16/2023] [Indexed: 08/12/2023] Open
Abstract
Alternative transcription increases transcriptome complexity by expression of multiple transcripts per gene. Annotation and quantification of transcripts using short-read sequencing is non-trivial. Long-read sequencing aims at overcoming these problems by sequencing full-length transcripts. Activation of brown adipose tissue (BAT) thermogenesis involves major transcriptomic remodeling and positively affects metabolism via increased energy expenditure. We benchmark Oxford Nanopore Technology (ONT) long-read sequencing protocols to Illumina short-read sequencing assessing alignment characteristics, gene and transcript detection and quantification, differential gene and transcript expression, transcriptome reannotation, and differential transcript usage (DTU). We find ONT sequencing is superior to Illumina for transcriptome reassembly, reducing the risk of false-positive events by unambiguously mapping reads to transcripts. We identified novel isoforms of genes undergoing DTU in cold-activated BAT including Cars2, Adtrp, Acsl5, Scp2, Aldoa, and Pde4d, validated by real-time PCR. The reannotated murine BAT transcriptome established here provides a framework for future investigations into the regulation of BAT.
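Differential transcript usage (DTU) asks whether the relative share of each isoform within a gene changes between conditions, independently of the gene's overall expression. The sketch below illustrates only that core idea, using hypothetical transcript names and read counts; it is not the analysis used in the study, and a real DTU method would add replicate-aware statistics.

```python
def isoform_usage(counts):
    """Fraction of a gene's reads assigned to each isoform.

    counts: mapping of isoform name -> read count for one gene in one condition.
    Returns an empty dict when the gene has no reads.
    """
    total = sum(counts.values())
    if total == 0:
        return {}
    return {isoform: n / total for isoform, n in counts.items()}

def usage_shift(control, treated):
    """Per-isoform change in usage (treated minus control).

    Isoforms absent from one condition are treated as having usage 0 there.
    """
    u_control = isoform_usage(control)
    u_treated = isoform_usage(treated)
    names = sorted(set(u_control) | set(u_treated))
    return {n: u_treated.get(n, 0.0) - u_control.get(n, 0.0) for n in names}

# Hypothetical counts for one gene: total expression doubles under cold
# exposure, but the quantity of interest is the change in isoform proportions.
warm = {"tx1": 80, "tx2": 20}
cold = {"tx1": 100, "tx2": 100}
shift = usage_shift(warm, cold)
```

In this toy case the gene is more highly expressed in the cold condition, yet the DTU signal is the 30-percentage-point move from tx1 to tx2.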
Affiliation(s)
- Christoph Andreas Engelhard
- Department for Biochemistry and Molecular Biology (BMB), University of Southern Denmark, Campusvej 55, 5230 Odense M, Denmark
- Sajjad Khani
- Max Planck Institute for Metabolism Research, Gleueler Strasse 50, 50931 Cologne, Germany
- Cologne Excellence Cluster on Cellular Stress Responses in Ageing-Associated Diseases (CECAD), University of Cologne, Cologne, Germany
| | - Sophia Derdak
- Core Facilities, Medical University of Vienna, Lazarettgasse 14, 1090 Vienna, Austria
| | - Martin Bilban
- Department of Laboratory Medicine & Core Facilities, Medical University of Vienna, Waehringer Guertel 18-20, 1090 Vienna, Austria
| | - Jan-Wilhelm Kornfeld
- Department for Biochemistry and Molecular Biology (BMB), University of Southern Denmark, Campusvej 55, 5230 Odense M, Denmark
|
22
|
Wang F, Xu Y, Wang R, Zhang B, Smith N, Notaro A, Gaerlan S, Kutschera E, Kadash-Edmondson KE, Xing Y, Lin L. TEQUILA-seq: a versatile and low-cost method for targeted long-read RNA sequencing. Nat Commun 2023; 14:4760. [PMID: 37553321 PMCID: PMC10409798 DOI: 10.1038/s41467-023-40083-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2022] [Accepted: 07/11/2023] [Indexed: 08/10/2023]
Abstract
Long-read RNA sequencing (RNA-seq) is a powerful technology for transcriptome analysis, but the relatively low throughput of current long-read sequencing platforms limits transcript coverage. One strategy for overcoming this bottleneck is targeted long-read RNA-seq for preselected gene panels. We present TEQUILA-seq, a versatile, easy-to-implement, and low-cost method for targeted long-read RNA-seq utilizing isothermally linear-amplified capture probes. When performed on the Oxford Nanopore platform with multiple gene panels of varying sizes, TEQUILA-seq consistently and substantially enriches transcript coverage while preserving transcript quantification. We profile full-length transcript isoforms of 468 actionable cancer genes across 40 representative breast cancer cell lines. We identify transcript isoforms enriched in specific subtypes and discover novel transcript isoforms in extensively studied cancer genes such as TP53. Among cancer genes, tumor suppressor genes (TSGs) are significantly enriched for aberrant transcript isoforms targeted for degradation via mRNA nonsense-mediated decay, revealing a common RNA-associated mechanism for TSG inactivation. TEQUILA-seq reduces the per-reaction cost of targeted capture by 2-3 orders of magnitude compared with a standard commercial solution. TEQUILA-seq can be broadly used for targeted sequencing of full-length transcripts in diverse biomedical research settings.
Affiliation(s)
- Feng Wang
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Yang Xu
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Graduate Group in Genomics and Computational Biology, University of Pennsylvania, Philadelphia, PA, USA
| | - Robert Wang
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Graduate Group in Genomics and Computational Biology, University of Pennsylvania, Philadelphia, PA, USA
| | - Beatrice Zhang
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Noah Smith
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Amber Notaro
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Samantha Gaerlan
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Eric Kutschera
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Kathryn E Kadash-Edmondson
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Yi Xing
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Lan Lin
- Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
|
23
|
Pardo-Palacios FJ, Wang D, Reese F, Diekhans M, Carbonell-Sala S, Williams B, Loveland JE, De María M, Adams MS, Balderrama-Gutierrez G, Behera AK, Gonzalez JM, Hunt T, Lagarde J, Liang CE, Li H, Meade MJ, Moraga Amador DA, Prjibelski AD, Birol I, Bostan H, Brooks AM, Çelik MH, Chen Y, Du MR, Felton C, Göke J, Hafezqorani S, Herwig R, Kawaji H, Lee J, Li JL, Lienhard M, Mikheenko A, Mulligan D, Nip KM, Pertea M, Ritchie ME, Sim AD, Tang AD, Wan YK, Wang C, Wong BY, Yang C, Barnes I, Berry A, Capella S, Dhillon N, Fernandez-Gonzalez JM, Ferrández-Peral L, Garcia-Reyero N, Goetz S, Hernández-Ferrer C, Kondratova L, Liu T, Martinez-Martin A, Menor C, Mestre-Tomás J, Mudge JM, Panayotova NG, Paniagua A, Repchevsky D, Rouchka E, Saint-John B, Sapena E, Sheynkman L, Smith ML, Suner MM, Takahashi H, Youngworth IA, Carninci P, Denslow ND, Guigó R, Hunter ME, Tilgner HU, Wold BJ, Vollmers C, Frankish A, Au KF, Sheynkman GM, Mortazavi A, Conesa A, Brooks AN. Systematic assessment of long-read RNA-seq methods for transcript identification and quantification. bioRxiv 2023:2023.07.25.550582. [PMID: 37546854 PMCID: PMC10402094 DOI: 10.1101/2023.07.25.550582] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 08/08/2023]
Abstract
The Long-read RNA-Seq Genome Annotation Assessment Project (LRGASP) Consortium was formed to evaluate the effectiveness of long-read approaches for transcriptome analysis. The consortium generated over 427 million long-read sequences from cDNA and direct RNA datasets, encompassing human, mouse, and manatee species, using different protocols and sequencing platforms. These data were utilized by developers to address challenges in transcript isoform detection and quantification, as well as de novo transcript isoform identification. The study revealed that libraries with longer, more accurate sequences produce more accurate transcripts than those with increased read depth, whereas greater read depth improved quantification accuracy. In well-annotated genomes, tools based on reference sequences demonstrated the best performance. When aiming to detect rare and novel transcripts or when using reference-free approaches, incorporating additional orthogonal data and replicate samples is advised. This collaborative study offers a benchmark for current practices and provides direction for future method development in transcriptome analysis.
Affiliation(s)
- Francisco J. Pardo-Palacios
- Institute for Integrative Systems Biology, Spanish National Research Council (CSIC), Paterna, Spain
- These authors contributed equally to this work
| | - Dingjie Wang
- Department of Biomedical Informatics, The Ohio State University, Columbus, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, USA
- These authors contributed equally to this work
| | - Fairlie Reese
- Developmental and Cell Biology, University of California, Irvine, Irvine, USA
- Center for Complex Biological Systems, University of California, Irvine, Irvine, USA
- These authors contributed equally to this work
| | - Mark Diekhans
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, USA
- These authors contributed equally to this work
| | - Sílvia Carbonell-Sala
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Catalonia, Spain
- These authors contributed equally to this work
| | - Brian Williams
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, USA
- These authors contributed equally to this work
| | - Jane E. Loveland
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- These authors contributed equally to this work
| | - Maite De María
- Department of Physiological Sciences, College of Veterinary Medicine, University of Florida, Gainesville, USA
- Center for Environmental and Human Toxicology, University of Florida, Gainesville, USA
- These authors contributed equally to this work
| | - Matthew S. Adams
- Molecular Cell and Developmental Biology, University of California, Santa Cruz, Santa Cruz, USA
- These authors contributed equally to this work
| | - Gabriela Balderrama-Gutierrez
- Developmental and Cell Biology, University of California, Irvine, Irvine, USA
- Center for Complex Biological Systems, University of California, Irvine, Irvine, USA
- These authors contributed equally to this work
| | - Amit K. Behera
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, USA
- These authors contributed equally to this work
| | - Jose M. Gonzalez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- These authors contributed equally to this work
| | - Toby Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- These authors contributed equally to this work
| | - Julien Lagarde
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Catalonia, Spain
- Flomics Biotech, Dr Aiguader 88, Barcelona 08003, Spain
- These authors contributed equally to this work
| | - Cindy E. Liang
- Molecular Cell and Developmental Biology, University of California, Santa Cruz, Santa Cruz, USA
- These authors contributed equally to this work
| | - Haoran Li
- Department of Biomedical Informatics, The Ohio State University, Columbus, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, USA
- These authors contributed equally to this work
| | - Marcus Jerryd Meade
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, USA
- These authors contributed equally to this work
| | - David A. Moraga Amador
- Interdisciplinary Center for Biotechnology Research, University of Florida, Gainesville, USA
- These authors contributed equally to this work
| | - Andrey D. Prjibelski
- Department of Computer Science, University of Helsinki, Helsinki, Finland
- Center for Bioinformatics and Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
- These authors contributed equally to this work
| | - Inanc Birol
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, Canada
| | - Hamed Bostan
- Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Durham, USA
| | - Ashley M. Brooks
- Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Durham, USA
| | - Muhammed Hasan Çelik
- Developmental and Cell Biology, University of California, Irvine, Irvine, USA
- Center for Complex Biological Systems, University of California, Irvine, Irvine, USA
| | - Ying Chen
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
| | - Mei R. M. Du
- Walter and Eliza Hall Institute of Medical Research, Parkville, Australia
| | - Colette Felton
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, USA
| | - Jonathan Göke
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Department of Statistics and Data Science, National University of Singapore, Singapore, Singapore
| | - Saber Hafezqorani
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, Canada
| | - Ralf Herwig
- Department Computational Molecular Biology, Max-Planck-Institute for Molecular Genetics, Berlin, Germany
| | - Hideya Kawaji
- Research Center for Genome & Medical Sciences, Tokyo Metropolitan Institute of Medical Science, Tokyo, Japan
| | - Joseph Lee
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
| | - Jian Liang Li
- Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Durham, USA
| | - Matthias Lienhard
- Department Computational Molecular Biology, Max-Planck-Institute for Molecular Genetics, Berlin, Germany
| | - Alla Mikheenko
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, London, UK
| | - Dennis Mulligan
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, USA
| | - Ka Ming Nip
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, Canada
| | - Mihaela Pertea
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, USA
| | - Matthew E. Ritchie
- Walter and Eliza Hall Institute of Medical Research, Parkville, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Australia
| | - Andre D. Sim
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
| | - Alison D. Tang
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, USA
| | - Yuk Kei Wan
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
| | - Changqing Wang
- Walter and Eliza Hall Institute of Medical Research, Parkville, Australia
| | - Brandon Y. Wong
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, USA
| | - Chen Yang
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, Canada
| | - If Barnes
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Andrew Berry
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Namrita Dhillon
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, USA
| | - Luis Ferrández-Peral
- Institute for Integrative Systems Biology, Spanish National Research Council (CSIC), Paterna, Spain
| | - Natàlia Garcia-Reyero
- Environmental Laboratory, US Army Engineer Research & Development Center, Vicksburg, USA
| | - Jorge Mestre-Tomás
- Institute for Integrative Systems Biology, Spanish National Research Council (CSIC), Paterna, Spain
| | - Jonathan M. Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Nedka G. Panayotova
- Interdisciplinary Center for Biotechnology Research, University of Florida, Gainesville, USA
| | - Alejandro Paniagua
- Institute for Integrative Systems Biology, Spanish National Research Council (CSIC), Paterna, Spain
| | - Eric Rouchka
- Department of Biochemistry & Molecular Genetics, University of Louisville, Louisville, USA
| | - Brandon Saint-John
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, USA
| | - Enrique Sapena
- European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Leon Sheynkman
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, USA
| | - Melissa Laird Smith
- Department of Biochemistry & Molecular Genetics, University of Louisville, Louisville, USA
| | - Marie-Marthe Suner
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Hazuki Takahashi
- Center for Integrative Medical Sciences, Laboratory for Transcriptome Technology, RIKEN, Yokohama, Japan
| | - Piero Carninci
- Center for Integrative Medical Sciences, Laboratory for Transcriptome Technology, RIKEN, Yokohama, Japan
- Human Technopole, Milano, Italy
| | - Nancy D. Denslow
- Department of Physiological Sciences, College of Veterinary Medicine, University of Florida, Gainesville, USA
- Center for Environmental and Human Toxicology, Department of Physiological Sciences, University of Florida, Gainesville, USA
| | - Roderic Guigó
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Catalonia, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Catalonia, Spain
| | - Margaret E. Hunter
- U.S. Geological Survey, Wetland and Aquatic Research Center, Gainesville, USA
| | - Hagen U. Tilgner
- Brain and Mind Research Institute and Center for Neurogenetics, Weill Cornell Medicine, New York City, USA
| | - Barbara J. Wold
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, USA
| | - Christopher Vollmers
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, USA
| | - Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Kin Fai Au
- Department of Biomedical Informatics, The Ohio State University, Columbus, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, USA
| | - Gloria M. Sheynkman
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, USA
- Center for Public Health Genomics
- UVA Cancer Center, University of Virginia, Charlottesville, USA
| | - Ali Mortazavi
- Developmental and Cell Biology, University of California, Irvine, Irvine, USA
- Center for Complex Biological Systems, University of California, Irvine, Irvine, USA
| | - Ana Conesa
- Institute for Integrative Systems Biology, Spanish National Research Council (CSIC), Paterna, Spain
- Microbiology and Cell Science Department, Institute for Food and Agricultural Sciences, University of Florida, Gainesville, USA
| | - Angela N. Brooks
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, USA
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, USA
|
24
|
Zhang RB, Dong LC, Huang Q, Shen Y, Li HY, Yu SG, Wu QF. Matrix metalloproteinases are key targets of acupuncture in the treatment of ulcerative colitis. Exp Biol Med (Maywood) 2023; 248:1229-1241. [PMID: 37438919 PMCID: PMC10621479 DOI: 10.1177/15353702231182205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Accepted: 04/10/2023] [Indexed: 07/14/2023]
Abstract
The aim of this study was to elucidate the key targets of acupuncture in the colon of an ulcerative colitis (UC) mouse model using full-length transcriptome sequencing. Mice with colitis induced by 2.5% dextran sodium sulfate (DSS) were treated with or without acupuncture. Intestinal pathology was observed, and full-length transcriptome sequencing and bioinformatic analysis were performed. The results demonstrated that acupuncture treatment reduced the UC symptoms, disease activity index score, and histological colitis score and increased body weight, colon length, and the number of intestinal goblet cells. In addition, acupuncture decreased the expression of the necrotic biomarker phosphorylated mixed lineage kinase domain-like pseudokinase (p-MLKL). Full-length transcriptome analysis indicated that acupuncture reversed the expression of 987 of the 1918 upregulated differentially expressed genes (DEGs), and 632 of the 1351 downregulated DEGs induced by DSS. DEGs regulated by acupuncture were mainly involved in inflammatory responses and intestinal barrier pathways. The protein-protein interaction network analysis revealed that matrix metalloproteinases (MMPs) are important genes regulated by acupuncture. Gene set enrichment analysis revealed that extracellular matrix (ECM)-receptor interaction was an important target of acupuncture. In addition, alternative splicing analysis suggested that acupuncture improved signaling pathways related to intestinal permeability, the biological processes of xenobiotics, sulfur compounds, and that monocarboxylic acids are closely associated with MMPs. Overall, our transcriptome analysis results indicate that acupuncture improves intestinal barrier function in UC through negative regulation of MMP expression.
Affiliation(s)
- Qin Huang
- Acupuncture and Tuina College, Chengdu University of Traditional Chinese Medicine, Chengdu 610075, China
| | - Yuan Shen
- Acupuncture and Tuina College, Chengdu University of Traditional Chinese Medicine, Chengdu 610075, China
| | - Hong-Ying Li
- Acupuncture and Tuina College, Chengdu University of Traditional Chinese Medicine, Chengdu 610075, China
| | - Shu-Guang Yu
- Acupuncture and Tuina College, Chengdu University of Traditional Chinese Medicine, Chengdu 610075, China
| | - Qiao-Feng Wu
- Acupuncture and Tuina College, Chengdu University of Traditional Chinese Medicine, Chengdu 610075, China
|
25
|
Petri AJ, Sahlin K. isONform: reference-free transcriptome reconstruction from Oxford Nanopore data. Bioinformatics 2023; 39:i222-i231. [PMID: 37387174 PMCID: PMC10311309 DOI: 10.1093/bioinformatics/btad264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/01/2023]
Abstract
MOTIVATION: With advances in long-read transcriptome sequencing, we can now fully sequence transcripts, which greatly improves our ability to study transcription processes. A popular long-read transcriptome sequencing technique is Oxford Nanopore Technologies (ONT), which through its cost-effective sequencing and high throughput, has the potential to characterize the transcriptome in a cell. However, due to transcript variability and sequencing errors, long cDNA reads need substantial bioinformatic processing to produce a set of isoform predictions from the reads. Several genome and annotation-based methods exist to produce transcript predictions. However, such methods require high-quality genomes and annotations and are limited by the accuracy of long-read splice aligners. In addition, gene families with high heterogeneity may not be well represented by a reference genome and would benefit from reference-free analysis. Reference-free methods to predict transcripts from ONT, such as RATTLE, exist, but their sensitivity is not comparable to reference-based approaches. RESULTS: We present isONform, a high-sensitivity algorithm to construct isoforms from ONT cDNA sequencing data. The algorithm is based on iterative bubble popping on gene graphs built from fuzzy seeds from the reads. Using simulated, synthetic, and biological ONT cDNA data, we show that isONform has substantially higher sensitivity than RATTLE albeit with some loss in precision. On biological data, we show that isONform's predictions have substantially higher consistency with the annotation-based method StringTie2 compared with RATTLE. We believe isONform can be used both for isoform construction for organisms without well-annotated genomes and as an orthogonal method to verify predictions of reference-based methods. AVAILABILITY AND IMPLEMENTATION: https://github.com/aljpetri/isONform.
Affiliation(s)
- Alexander J Petri
- Department of Mathematics, Science for Life Laboratory, Stockholm University, Stockholm 106 91, Sweden
| | - Kristoffer Sahlin
- Department of Mathematics, Science for Life Laboratory, Stockholm University, Stockholm 106 91, Sweden
|
26
|
Lienhard M, van den Beucken T, Timmermann B, Hochradel M, Börno S, Caiment F, Vingron M, Herwig R. IsoTools: a flexible workflow for long-read transcriptome sequencing analysis. Bioinformatics 2023; 39:btad364. [PMID: 37267159 PMCID: PMC10287928 DOI: 10.1093/bioinformatics/btad364] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 02/16/2023] [Revised: 04/28/2023] [Accepted: 06/01/2023] [Indexed: 06/04/2023]
Abstract
MOTIVATION: Long-read transcriptome sequencing (LRTS) has the potential to enhance our understanding of alternative splicing, and the complexity of this process requires the use of versatile computational tools, with the ability to accommodate various stages of the workflow with maximum flexibility. RESULTS: We introduce IsoTools, a Python-based LRTS analysis framework that offers a wide range of functionality for transcriptome reconstruction and quantification of transcripts. Furthermore, we integrate a graph-based method for identifying alternative splicing events and a statistical approach based on the beta-binomial distribution for detecting differential events. To demonstrate the effectiveness of our methods, we applied IsoTools to PacBio LRTS data of human hepatocytes treated with the histone deacetylase inhibitor valproic acid. Our results indicate that LRTS can provide valuable insights into alternative splicing, particularly in terms of complex and differential splicing patterns, in comparison to short-read RNA-seq. AVAILABILITY AND IMPLEMENTATION: IsoTools is available on GitHub and PyPI, and its documentation, including tutorials, CLI, and API references, can be found at https://isotools.readthedocs.io/.
Affiliation(s)
- Matthias Lienhard
- Department of Computational Biology, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
| | - Twan van den Beucken
- Department of Toxicogenomics, Maastricht University, Maastricht 6229ER, The Netherlands
| | - Bernd Timmermann
- Sequencing Core Unit, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
| | - Myriam Hochradel
- Sequencing Core Unit, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
| | - Stefan Börno
- Sequencing Core Unit, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
| | - Florian Caiment
- Department of Toxicogenomics, Maastricht University, Maastricht 6229ER, The Netherlands
| | - Martin Vingron
- Department of Computational Biology, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
| | - Ralf Herwig
- Department of Computational Biology, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
|
27
|
Liu M, Xiao F, Zhu J, Fu D, Wang Z, Xiao R. Combined PacBio Iso-Seq and Illumina RNA-Seq Analysis of the Tuta absoluta (Meyrick) Transcriptome and Cytochrome P450 Genes. Insects 2023; 14:363. [PMID: 37103178 PMCID: PMC10146655 DOI: 10.3390/insects14040363] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Revised: 03/29/2023] [Accepted: 04/02/2023] [Indexed: 06/19/2023]
Abstract
Tuta absoluta (Meyrick) is a devastating invasive pest worldwide. The abamectin and chlorantraniliprole complex has become an alternative option for chemical control because it can enhance insecticidal activity and delay increased drug resistance. Notably, pests inevitably develop resistance to various types of insecticides, and compound insecticides are no exception. To identify potential genes involved in the detoxification of abamectin and chlorantraniliprole complex in T. absoluta, PacBio SMRT-seq transcriptome sequencing and Illumina RNA-seq analysis of abamectin and chlorantraniliprole complex-treated T. absoluta were performed. We obtained 80,492 non-redundant transcripts, of which 62,762 (77.97%) were successfully annotated, and 15,524 differentially expressed transcripts (DETs). GO annotation results showed that most of these DETs were involved in the biological processes of life-sustaining activities, such as cellular, metabolic, and single-organism processes. The KEGG pathway enrichment results showed that the pathways related to glutathione metabolism, fatty acid and amino acid synthesis, and metabolism were related to the response to abamectin and chlorantraniliprole complex in T. absoluta. Among these, 21 P450s were differentially expressed (11 upregulated and 10 downregulated). The qRT-PCR results for the eight upregulated P450 genes after abamectin and chlorantraniliprole complex treatment were consistent with the RNA-Seq data. Our findings provide new full-length transcriptional data and information for further studies on detoxification-related genes in T. absoluta.
|
28
|
You Y, Prawer YDJ, De Paoli-Iseppi R, Hunt CPJ, Parish CL, Shim H, Clark MB. Identification of cell barcodes from long-read single-cell RNA-seq with BLAZE. Genome Biol 2023; 24:66. [PMID: 37024980 PMCID: PMC10077662 DOI: 10.1186/s13059-023-02907-y] [Citation(s) in RCA: 31] [Impact Index Per Article: 31.0] [Received: 08/09/2022] [Accepted: 03/23/2023] [Indexed: 04/08/2023]
Abstract
Long-read single-cell RNA sequencing (scRNA-seq) enables the quantification of RNA isoforms in individual cells. However, long-read scRNA-seq using the Oxford Nanopore platform has largely relied upon matched short-read data to identify cell barcodes. We introduce BLAZE, which accurately and efficiently identifies 10x cell barcodes using only nanopore long-read scRNA-seq data. BLAZE outperforms the existing tools and provides an accurate representation of the cells present in long-read scRNA-seq when compared to matched short reads. BLAZE simplifies long-read scRNA-seq while improving the results, is compatible with downstream tools accepting a cell barcode file, and is available at https://github.com/shimlab/BLAZE.
Affiliation(s)
- Yupei You
- School of Mathematics and Statistics/Melbourne Integrative Genomics, The University of Melbourne, Parkville, VIC, 3010, Australia
| | - Yair D J Prawer
- Centre for Stem Cell Systems, Department of Anatomy and Physiology, The University of Melbourne, Parkville, VIC, 3010, Australia
| | - Ricardo De Paoli-Iseppi
- Centre for Stem Cell Systems, Department of Anatomy and Physiology, The University of Melbourne, Parkville, VIC, 3010, Australia
| | - Cameron P J Hunt
- The Florey Institute of Neuroscience and Mental Health, The University of Melbourne, Parkville, VIC, 3010, Australia
| | - Clare L Parish
- The Florey Institute of Neuroscience and Mental Health, The University of Melbourne, Parkville, VIC, 3010, Australia
| | - Heejung Shim
- School of Mathematics and Statistics/Melbourne Integrative Genomics, The University of Melbourne, Parkville, VIC, 3010, Australia
| | - Michael B Clark
- Centre for Stem Cell Systems, Department of Anatomy and Physiology, The University of Melbourne, Parkville, VIC, 3010, Australia
|
29
|
Li T, Li Y, Shangguan H, Bian J, Luo R, Tian Y, Li Z, Nie X, Cui L. BarleyExpDB: an integrative gene expression database for barley. BMC Plant Biol 2023; 23:170. [PMID: 37003963 PMCID: PMC10064564 DOI: 10.1186/s12870-023-04193-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Accepted: 03/27/2023] [Indexed: 06/19/2023]
Abstract
BACKGROUND: RNA-sequencing (RNA-seq) has been widely used to study the dynamic expression patterns of transcribed genes, which can lead to new biological insights. However, processing and analyzing these huge amounts of histological data remains a great challenge for wet labs and field researchers who lack bioinformatics experience and computational resources. RESULTS: We present BarleyExpDB, an easy-to-operate, free, and web-accessible database that integrates transcriptional profiles of barley at different growth and developmental stages, tissues, and stress conditions, as well as differential expression of mutants and populations, to build a platform for barley expression and visualization. The expression of a gene of interest can be easily queried by known gene ID or by sequence similarity. Expression data can be displayed as a heat map, along with functional descriptions as well as Gene Ontology, Kyoto Encyclopedia of Genes and Genomes, Protein Families Database, and Simple Modular Architecture Research Tool annotations. CONCLUSIONS: BarleyExpDB will serve as a valuable resource for the barley research community to leverage the vast publicly available RNA-seq datasets for functional genomics research and crop molecular breeding.
Affiliation(s)
- Tingting Li
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, 330045 Jiangxi China
- State Key Laboratory of Crop Stress Biology in Arid Areas and College of Agronomy, Northwest A&F University, Yangling, 712100 Shaanxi China
- Yihan Li
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, 330045 Jiangxi China
- Hongbin Shangguan
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, 330045 Jiangxi China
- Jianxin Bian
- Peking University Institute of Advanced Agricultural Sciences, Weifang, 261325 Shandong China
- Ruihan Luo
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, 330045 Jiangxi China
- Yuan Tian
- Xintai Urban and Rural Development Group Co., Ltd, Taian, 271200 Shandong China
- Zhimin Li
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, 330045 Jiangxi China
- Xiaojun Nie
- State Key Laboratory of Crop Stress Biology in Arid Areas and College of Agronomy, Northwest A&F University, Yangling, 712100 Shaanxi China
- Licao Cui
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, 330045 Jiangxi China
|
30
|
Sun X, Li H. Full-length transcriptome combined with RNA sequence analysis of Fraxinus chinensis. Genes Genomics 2023; 45:553-567. [PMID: 36905551 DOI: 10.1007/s13258-023-01374-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Accepted: 02/25/2023] [Indexed: 03/12/2023]
Abstract
BACKGROUND: The dry root or stem bark of Fraxinus chinensis is the well-known herb Qin Pi, which is known for its anti-inflammatory, analgesic, anti-tumor, liver-protective and diuretic pharmacological effects; its fundamental chemical components are coumarin, phenylethanol glycosides and flavonoids. However, it is difficult to clarify the secondary metabolite synthesis pathway and the key genes involved in it because of the lack of genome information for Fraxinus chinensis. OBJECTIVE: To generate a complete transcriptome of Fraxinus chinensis and to identify the differentially expressed genes (DEGs) in leaves and stem bark. METHODS: Full-length transcriptome analysis and RNA-seq were combined to characterize the Fraxinus chinensis transcriptome. RESULTS: A total of 69,145 transcripts were acquired and regarded as the reference transcriptome; 67,441 transcripts (97.47%) were annotated to the NCBI non-redundant protein (Nr), SwissProt, Kyoto Encyclopedia of Genes and Genomes (KEGG) and eukaryotic orthologous groups (KOG) databases. A total of 18,917 isoforms were annotated to the KEGG database and classified into 138 biological pathways. In total, 10,822 simple sequence repeats (SSRs) and 11,319 resistance (R) genes were classified into 18 types, and 3947 transcription factors (TFs) were identified in the full-length transcriptome analysis. Additionally, 15,095 DEGs were detected by RNA-seq in leaves and bark, including 4696 significantly up-regulated and 10,399 significantly down-regulated genes. In addition, 254 transcripts were annotated to the phenylpropane metabolism pathway, including 86 DEGs, and ten of these enzyme genes were verified by qRT-PCR. CONCLUSION: These results lay the foundation for further exploration of the biosynthetic pathway of phenylpropanoids and the related key enzyme genes.
Affiliation(s)
- Xiaochun Sun
- Co-construction Collaborative Innovation Center for Chinese Medicine Resources Industrialization by Shaanxi and Education Ministry, Shaanxi University of Chinese Medicine, Xianyang, China
31
|
Bao M, Wang X, Sun R, Wang Z, Li J, Jiang T, Lin A, Wang H, Feng J. Full-Length Transcriptome of the Great Himalayan Leaf-Nosed Bats (Hipposideros armiger) Optimized Genome Annotation and Revealed the Expression of Novel Genes. Int J Mol Sci 2023; 24:ijms24054937. [PMID: 36902366 PMCID: PMC10003721 DOI: 10.3390/ijms24054937] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 02/26/2023] [Accepted: 02/28/2023] [Indexed: 03/08/2023]
Abstract
The Great Himalayan Leaf-nosed bat (Hipposideros armiger) is one of the most representative species of all echolocating bats and is an ideal model for studying the echolocation system of bats. An incomplete reference genome and limited availability of full-length cDNAs have hindered the identification of alternatively spliced transcripts, which has slowed related basic studies on bats' echolocation and evolution. In this study, we analyzed five organs from H. armiger for the first time using PacBio single-molecule real-time (SMRT) sequencing. A total of 120 GB of subreads was generated, including 1,472,058 full-length non-chimeric (FLNC) sequences. A total of 34,611 alternative splicing (AS) events and 66,010 alternative polyadenylation (APA) sites were detected by transcriptome structural analysis. Moreover, a total of 110,611 isoforms were identified, consisting of 52% new isoforms of known genes and 5% of novel gene loci, as well as 2112 novel genes that have not been annotated before in the current reference genome of H. armiger. Furthermore, several key novel genes, including Pol, RAS, NFKB1, and CAMK4, were identified as being associated with nervous, signal transduction, and immune system processes, which may be involved in regulating the auditory nervous perception and immune system that help bats regulate echolocation. In conclusion, the full-length transcriptome results optimized and replenished existing H. armiger genome annotation in multiple ways and offer advantages for newly discovered or previously unrecognized protein-coding genes and isoforms, which can be used as a reference resource.
Affiliation(s)
- Mingyue Bao
- College of Life Science, Jilin Agricultural University, Changchun 130118, China
- Xue Wang
- College of Life Science, Jilin Agricultural University, Changchun 130118, China
- Ruyi Sun
- College of Life Science, Jilin Agricultural University, Changchun 130118, China
- Zhiqiang Wang
- Jilin Provincial Key Laboratory of Animal Resource Conservation and Utilization, Northeast Normal University, Changchun 130117, China
- Jiqian Li
- Jilin Provincial Key Laboratory of Animal Resource Conservation and Utilization, Northeast Normal University, Changchun 130117, China
- Tinglei Jiang
- Jilin Provincial Key Laboratory of Animal Resource Conservation and Utilization, Northeast Normal University, Changchun 130117, China
- Aiqing Lin
- Jilin Provincial Key Laboratory of Animal Resource Conservation and Utilization, Northeast Normal University, Changchun 130117, China
- Hui Wang
- College of Life Science, Jilin Agricultural University, Changchun 130118, China
- Correspondence: (H.W.); (J.F.)
- Jiang Feng
- College of Life Science, Jilin Agricultural University, Changchun 130118, China
- Jilin Provincial Key Laboratory of Animal Resource Conservation and Utilization, Northeast Normal University, Changchun 130117, China
- Correspondence: (H.W.); (J.F.)
|
32
|
Velasco VME, Ferreira A, Zaman S, Noordermeer D, Ensminger I, Wegrzyn JL. A long-read and short-read transcriptomics approach provides the first high-quality reference transcriptome and genome annotation for Pseudotsuga menziesii (Douglas-fir). G3 (Bethesda, Md.) 2023; 13:jkac304. [PMID: 36454025 PMCID: PMC10468028 DOI: 10.1093/g3journal/jkac304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2021] [Revised: 12/13/2021] [Accepted: 10/19/2022] [Indexed: 12/02/2022]
Abstract
Douglas-fir (Pseudotsuga menziesii) is native to western North America. It grows in a wide range of environmental conditions and is an important timber tree. Although there are several studies on the gene expression responses of Douglas-fir to abiotic cues, the absence of high-quality transcriptome and genome data is a barrier to further investigation. As with most conifers, the available transcriptome and genome reference dataset for Douglas-fir remains fragmented and requires refinement. We aimed to generate a highly accurate and complete reference transcriptome and genome annotation. We deep-sequenced the transcriptome of Douglas-fir needles from seedlings that were grown under nonstress control conditions or a combination of heat and drought stress conditions using long-read (LR) and short-read (SR) sequencing platforms. We used 2 computational approaches, namely de novo and genome-guided LR transcriptome assembly. Using the LR de novo assembly, we identified 1.3X more high-quality transcripts, 1.85X more "complete" genes, and 2.7X more functionally annotated genes compared to the genome-guided assembly approach. We predicted 666 long noncoding RNAs and 12,778 unique protein-coding transcripts including 2,016 putative transcription factors. We leveraged the LR de novo assembled transcriptome with paired-end SR and a published single-end SR transcriptome to generate an improved genome annotation. This was conducted with BRAKER2 and refined based on functional annotation, repetitive content, and transcriptome alignment. This high-quality genome annotation has 51,419 unique gene models derived from 322,631 initial predictions. Overall, our informatics approach provides a new reference Douglas-fir transcriptome assembly and genome annotation with considerably improved completeness and functional annotation.
Affiliation(s)
- Alyssa Ferreira
- Department of Evolution and Ecology, University of Connecticut, Storrs, CT 06269, USA
- Sumaira Zaman
- Department of Evolution and Ecology, University of Connecticut, Storrs, CT 06269, USA
- Devin Noordermeer
- Department of Biology, University of Toronto, Mississauga, ON L5L 1C8, Canada
- Graduate Department of Cell and Systems Biology, University of Toronto, Toronto, ON M5S, Canada
- Ingo Ensminger
- Department of Biology, University of Toronto, Mississauga, ON L5L 1C8, Canada
- Graduate Department of Cell and Systems Biology, University of Toronto, Toronto, ON M5S, Canada
- Graduate Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON M5S, Canada
- Jill L Wegrzyn
- Department of Evolution and Ecology, University of Connecticut, Storrs, CT 06269, USA
|
33
|
Gao Y, Wang F, Wang R, Kutschera E, Xu Y, Xie S, Wang Y, Kadash-Edmondson KE, Lin L, Xing Y. ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data. Science Advances 2023; 9:eabq5072. [PMID: 36662851 PMCID: PMC9858503 DOI: 10.1126/sciadv.abq5072] [Citation(s) in RCA: 26] [Impact Index Per Article: 26.0] [Received: 04/13/2022] [Accepted: 12/16/2022] [Indexed: 05/20/2023]
Abstract
Long-read RNA sequencing (RNA-seq) holds great potential for characterizing transcriptome variation and full-length transcript isoforms, but the relatively high error rate of current long-read sequencing platforms poses a major challenge. We present ESPRESSO, a computational tool for robust discovery and quantification of transcript isoforms from error-prone long reads. ESPRESSO jointly considers alignments of all long reads aligned to a gene and uses error profiles of individual reads to improve the identification of splice junctions and the discovery of their corresponding transcript isoforms. On both a synthetic spike-in RNA sample and human RNA samples, ESPRESSO outperforms multiple contemporary tools in not only transcript isoform discovery but also transcript isoform quantification. In total, we generated and analyzed ~1.1 billion nanopore RNA-seq reads covering 30 human tissue samples and three human cell lines. ESPRESSO and its companion dataset provide a useful resource for studying the RNA repertoire of eukaryotic transcriptomes.
Affiliation(s)
- Yuan Gao
- Center for Computational and Genomic Medicine, The Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Feng Wang
- Center for Computational and Genomic Medicine, The Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Robert Wang
- Center for Computational and Genomic Medicine, The Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Genomics and Computational Biology Graduate Program, University of Pennsylvania, Philadelphia, PA 19104, USA
- Eric Kutschera
- Center for Computational and Genomic Medicine, The Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Yang Xu
- Center for Computational and Genomic Medicine, The Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Genomics and Computational Biology Graduate Program, University of Pennsylvania, Philadelphia, PA 19104, USA
- Stephan Xie
- Center for Computational and Genomic Medicine, The Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Yuanyuan Wang
- Center for Computational and Genomic Medicine, The Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Kathryn E. Kadash-Edmondson
- Center for Computational and Genomic Medicine, The Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Lan Lin
- Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, The Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Yi Xing
- Center for Computational and Genomic Medicine, The Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Biomedical and Health Informatics, The Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
|
34
|
Jiang H, Li Y, Luan M, Huang S, Zhao L, Yang G, Pan G. Single-Molecule Real-Time Sequencing of Full-Length Transcriptome and Identification of Genes Related to Male Development in Cannabis sativa. Plants (Basel, Switzerland) 2022; 11:3559. [PMID: 36559671 PMCID: PMC9782162 DOI: 10.3390/plants11243559] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2022] [Revised: 11/25/2022] [Accepted: 12/12/2022] [Indexed: 06/17/2023]
Abstract
Female Cannabis sativa plants have important therapeutic properties. The sex ratio of the dioecious cannabis is approximately 1:1. Cultivating homozygous female plants by inducing female plants to produce male flowers is of great practical significance. However, the mechanism underlying cannabis male development remains unclear. In this study, single-molecule real-time (SMRT) sequencing was performed using a mixed sample of female and induced male flowers from the ZYZM1 cannabis variety. A total of 15,241 consensus reads were identified, and 13,657 transcripts were annotated across seven public databases. A total of 48 lncRNAs with an average length of 986.54 bp were identified. In total, 8202 transcripts were annotated as transcription factors, the most common of which were bHLH transcription factors. Moreover, tissue-specific expression pattern analysis showed that 13 MADS transcription factors were highly expressed in male flowers. Furthermore, 232 reads of novel genes were predicted and enriched in lipid metabolism, and qRT-PCR results showed that CER1 may be involved in the development of cannabis male flowers. In addition, 1170 AS events were detected, and two AS events were further validated. Taken together, these results may improve our understanding of the complexity of full-length cannabis transcripts and provide a basis for understanding the molecular mechanism of cannabis male development.
Affiliation(s)
- Hui Jiang
- Institute of Bast Fiber Crops, Chinese Academy of Agricultural Science, Changsha 410205, China
- Ying Li
- State Key Laboratory Breeding Base of Dao-di Herbs, China Academy of Chinese Medical Sciences, Beijing 100700, China
- Mingbao Luan
- Institute of Bast Fiber Crops, Chinese Academy of Agricultural Science, Changsha 410205, China
- Siqi Huang
- Institute of Bast Fiber Crops, Chinese Academy of Agricultural Science, Changsha 410205, China
- Lining Zhao
- Institute of Bast Fiber Crops, Chinese Academy of Agricultural Science, Changsha 410205, China
- Guang Yang
- State Key Laboratory Breeding Base of Dao-di Herbs, China Academy of Chinese Medical Sciences, Beijing 100700, China
- Gen Pan
- Institute of Bast Fiber Crops, Chinese Academy of Agricultural Science, Changsha 410205, China
- State Key Laboratory Breeding Base of Dao-di Herbs, China Academy of Chinese Medical Sciences, Beijing 100700, China
|
35
|
Papa Y, Wellenreuther M, Morrison MA, Ritchie PA. Genome assembly and isoform analysis of a highly heterozygous New Zealand fisheries species, the tarakihi (Nemadactylus macropterus). G3 (Bethesda, Md.) 2022; 13:6883520. [PMID: 36477875 PMCID: PMC9911067 DOI: 10.1093/g3journal/jkac315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Revised: 11/01/2022] [Accepted: 11/08/2022] [Indexed: 12/14/2022]
Abstract
Although they are among the most valuable and heavily exploited wild organisms, few fisheries species have been studied at the whole-genome level. This is especially the case in New Zealand, where genomics resources are urgently needed to assist fisheries management. Here, we generated 55 Gb of short Illumina reads (92× coverage) and 73 Gb of long Nanopore reads (122×) to produce the first genome assembly of the marine teleost tarakihi [Nemadactylus macropterus (Forster, 1801)], a highly valuable fisheries species in New Zealand. An additional 300 Mb of Iso-Seq reads were obtained to assist in gene annotation. The final genome assembly was 568 Mb long with an N50 of 3.37 Mb. The genome completeness was high, with 97.8% of complete Actinopterygii Benchmarking Universal Single-Copy Orthologs. Heterozygosity values estimated through k-mer counting (1.00%) and bi-allelic SNPs (0.64%) were high compared with the same values reported for other fishes. Iso-Seq analysis recovered 91,313 unique transcripts from 15,515 genes (mean ratio of 5.89 transcripts per gene), and the most common alternative splicing event was intron retention. This highly contiguous genome assembly and the isoform-resolved transcriptome will provide a useful resource to assist the study of population genomics and comparative eco-evolutionary studies in teleosts and related organisms.
Affiliation(s)
- Yvan Papa
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
- Maren Wellenreuther
- Seafood Production Group, The New Zealand Institute for Plant and Food Research Limited, Nelson 7010, New Zealand
- School of Biological Sciences, The University of Auckland, Auckland 1010, New Zealand
- Mark A Morrison
- National Institute of Water and Atmospheric Research, Auckland 1010, New Zealand
- Peter A Ritchie
- Corresponding author: Te Toki A Rata, Gate 7, Kelburn Parade, Wellington 6012, New Zealand.
|
36
|
Ono Y, Hamada M, Asai K. PBSIM3: a simulator for all types of PacBio and ONT long reads. NAR Genom Bioinform 2022; 4:lqac092. [PMID: 36465498 PMCID: PMC9713900 DOI: 10.1093/nargab/lqac092] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/10/2022] [Revised: 11/02/2022] [Accepted: 11/12/2022] [Indexed: 12/03/2022]
Abstract
Long-read sequencers, such as Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) sequencers, have improved their read length and accuracy, thereby opening up unprecedented research. Many tools and algorithms have been developed to analyze long reads, and rapid progress in PacBio and ONT has further accelerated their development. Together with the development of high-throughput sequencing technologies and their analysis tools, many read simulators have been developed and effectively utilized. PBSIM is one of the popular long-read simulators. In this study, we developed PBSIM3 with three new functions: error models for long reads, multi-pass sequencing for high-fidelity read simulation and transcriptome sequencing simulation. Therefore, PBSIM3 is now able to meet a wide range of long-read simulation requirements.
Affiliation(s)
- Yukiteru Ono
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa 277-8561, Japan
- Michiaki Hamada
- Department of Electrical Engineering and Bioscience, Faculty of Science and Engineering, Waseda University, 55N-06-10, 3-4-1, Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
- Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), 63-520, 3-4-1, Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
- Institute for Medical-Oriented Structural Biology, Waseda University, 2-2, Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan
- Graduate School of Medicine, Nippon Medical School, 1-1-5, Sendagi, Bunkyo-ku, Tokyo, 113-8602, Japan
- Kiyoshi Asai
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa 277-8561, Japan
- Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-3-26, Aomi, Koto-ku, 135-0064 Tokyo, Japan
|
37
|
Su Q, Chen Q, Li Z, Zhao J, Li L, Xu L, Yang B, Liu C. Multi-omics analysis reveals GABAergic dysfunction after traumatic brainstem injury in rats. Front Neurosci 2022; 16:1003300. [PMID: 36507346 PMCID: PMC9726735 DOI: 10.3389/fnins.2022.1003300] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/26/2022] [Accepted: 10/31/2022] [Indexed: 11/24/2022]
Abstract
Background: Traumatic brainstem injury (TBSI) is a form of brain injury with a very high mortality rate. Understanding the molecular mechanism of injury can provide additional information for clinical treatment. Materials and methods: In this study, we detected transcriptomic, proteomic, and metabolomic changes in the brainstem of TBSI rats, and comprehensively analyzed the underlying mechanisms of TBSI. Results: After TBSI, there was significant diffuse axonal injury (DAI) in the brainstem of rats. A total of 579 genes, 70 proteins, and 183 metabolites showed significant changes in brainstem tissue. Through molecular function and pathway analysis, the differentially expressed genes, proteins, and metabolites of TBSI were mainly attributed to neural signal regulation, inflammation, neuroprotection, and the immune system. In addition, a comprehensive analysis of transcripts, proteins, and metabolites showed that the genes, proteins, and metabolic pathways regulated in the brainstem after TBSI were involved in neuroactive ligand-receptor interaction. A variety of GPCR-regulated pathways were affected; in particular, GABA's corresponding receptors GABAA, GABAB, and GABAC and the transporter GAT were inhibited to varying degrees. Conclusion: This study provides insights into the development of a rapid diagnostic kit and treatment strategies for TBSI.
Affiliation(s)
- Qin Su
- Guangzhou Forensic Science Institute, Guangzhou, China
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- Qianling Chen
- School of Forensic Medicine, Southern Medical University, Guangzhou, China
- Zhigang Li
- Guangzhou Forensic Science Institute, Guangzhou, China
- Jian Zhao
- Guangzhou Forensic Science Institute, Guangzhou, China
- Lingyue Li
- School of Forensic Medicine, Southern Medical University, Guangzhou, China
- Luyao Xu
- School of Forensic Medicine, Southern Medical University, Guangzhou, China
- Bin Yang
- School of Forensic Medicine, Southern Medical University, Guangzhou, China
- Chao Liu
- Guangzhou Forensic Science Institute, Guangzhou, China
- Correspondence: Chao Liu
|
38
|
Caceres M, Mumey B, Husic E, Rizzi R, Cairo M, Sahlin K, Tomescu AI. Safety in Multi-Assembly via Paths Appearing in All Path Covers of a DAG. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:3673-3684. [PMID: 34847041 DOI: 10.1109/tcbb.2021.3131203] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/13/2023]
Abstract
A multi-assembly problem asks to reconstruct multiple genomic sequences from mixed reads sequenced from all of them. Standard formulations of such problems model a solution as a path cover in a directed acyclic graph, namely a set of paths that together cover all vertices of the graph. Since multi-assembly problems admit multiple solutions in practice, we consider an approach commonly used in standard genome assembly: output only partial solutions (contigs, or safe paths) that appear in all path cover solutions. We study constrained path covers, a restriction on the path cover solution that incorporates practical constraints arising in multi-assembly problems. We give efficient algorithms finding all maximal safe paths for constrained path covers. We compute the safe paths of splicing graphs constructed from transcript annotations of different species. Our algorithms run in less than 15 seconds per species and report RNA contigs that are over 99% precise and are up to 8 times longer than unitigs. Moreover, RNA contigs cover over 70% of the transcripts and their coding sequences in most cases. With their increased length relative to unitigs, high precision, and fast construction time, maximal safe paths can provide a better base set of sequences for transcript assembly programs.
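The path-cover definition used in this abstract can be made concrete with a small sketch. The Python below is only an illustration of the definition (a set of paths that together cover all vertices of a DAG), not the authors' safe-path algorithm; the toy graph `toy` and the helper names `is_path` and `is_path_cover` are hypothetical and chosen for this example.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical representation: each vertex maps to the set of its successors.
Graph = Dict[str, Set[str]]


def is_path(graph: Graph, path: List[str]) -> bool:
    """Return True if `path` is non-empty and follows edges of `graph`."""
    if not path:
        return False
    if any(v not in graph for v in path):
        return False
    # Every consecutive pair of vertices must be joined by a directed edge.
    return all(b in graph[a] for a, b in zip(path, path[1:]))


def is_path_cover(graph: Graph, paths: Iterable[List[str]]) -> bool:
    """Return True if every entry is a valid path and all vertices are covered."""
    paths = list(paths)
    if not all(is_path(graph, p) for p in paths):
        return False
    covered = {v for p in paths for v in p}
    return covered == set(graph)


# Toy splicing-style graph (an assumption for illustration):
# exon a, then either exon b or exon c, then exon d.
toy: Graph = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}, "d": set()}

print(is_path_cover(toy, [["a", "b", "d"], ["a", "c", "d"]]))  # True
print(is_path_cover(toy, [["a", "b", "d"]]))  # False: vertex c is not covered
```

In this toy graph, any path cover must pass through `b` and through `c`, which gives an informal sense of how some sub-paths can be common to every solution, in the spirit of the "safe paths" the abstract describes.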
|
39
|
Castaldi PJ, Abood A, Farber CR, Sheynkman GM. Bridging the splicing gap in human genetics with long-read RNA sequencing: finding the protein isoform drivers of disease. Hum Mol Genet 2022; 31:R123-R136. [PMID: 35960994 PMCID: PMC9585682 DOI: 10.1093/hmg/ddac196] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/07/2022] [Revised: 08/08/2022] [Accepted: 08/09/2022] [Indexed: 02/04/2023]
Abstract
Aberrant splicing underlies many human diseases, including cancer, cardiovascular diseases and neurological disorders. Genome-wide mapping of splicing quantitative trait loci (sQTLs) has shown that genetic regulation of alternative splicing is widespread. However, identification of the corresponding isoform or protein products associated with disease-associated sQTLs is challenging with short-read RNA-seq, which cannot precisely characterize full-length transcript isoforms. Furthermore, contemporary sQTL interpretation often relies on reference transcript annotations, which are incomplete. Solutions to these issues may be found through integration of newly emerging long-read sequencing technologies. Long-read sequencing offers the capability to sequence full-length mRNA transcripts and, in some cases, to link sQTLs to transcript isoforms containing disease-relevant protein alterations. Here, we provide an overview of sQTL mapping approaches, the use of long-read sequencing to characterize sQTL effects on isoforms, the linkage of RNA isoforms to protein-level functions and comment on future directions in the field. Based on recent progress, long-read RNA sequencing promises to be part of the human disease genetics toolkit to discover and treat protein isoforms causing rare and complex diseases.
Affiliation(s)
- Peter J Castaldi
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, MA 02115, USA
- Division of General Medicine and Primary Care, Department of Medicine, Brigham and Women’s Hospital, Boston, MA 02115, USA
- Abdullah Abood
- Center for Public Health Genomics, School of Medicine, University of Virginia, Charlottesville, VA 22903, USA
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22903, USA
- Charles R Farber
- Center for Public Health Genomics, School of Medicine, University of Virginia, Charlottesville, VA 22903, USA
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22903, USA
- Department of Public Health Sciences, School of Medicine, University of Virginia, Charlottesville, VA 22903, USA
- Gloria M Sheynkman
- Center for Public Health Genomics, School of Medicine, University of Virginia, Charlottesville, VA 22903, USA
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22903, USA
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22903, USA
- UVA Comprehensive Cancer Center, University of Virginia, Charlottesville, VA 22903, USA
|
40
|
Lee CC, Hsu HW, Lin CY, Gustafson N, Matsuura K, Lee CY, Yang CCS. First Polycipivirus and Unmapped RNA Virus Diversity in the Yellow Crazy Ant, Anoplolepis gracilipes. Viruses 2022; 14:v14102161. [PMID: 36298716 PMCID: PMC9612232 DOI: 10.3390/v14102161] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/02/2022] [Revised: 09/26/2022] [Accepted: 09/27/2022] [Indexed: 12/01/2022]
Abstract
The yellow crazy ant, Anoplolepis gracilipes, is a widespread invasive ant that poses significant threats to local biodiversity. Yet, compared to other global invasive ant species such as the red imported fire ant (Solenopsis invicta) or the Argentine ant (Linepithema humile), little is known about the diversity of RNA viruses in the yellow crazy ant. In the current study, we generated a transcriptomic database for A. gracilipes using a high-throughput sequencing approach to identify new RNA viruses and characterize their genomes. Four virus species assigned to Dicistroviridae, two to Iflaviridae, one to Polycipiviridae, and two unclassified Riboviria viruses were identified. Detailed genomic characterization was carried out on the polycipivirus and revealed that this virus comprises 11,644 nucleotides with six open reading frames. Phylogenetic analysis and pairwise amino acid identity comparison classified this virus into the genus Sopolycivirus under Polycipiviridae, which is tentatively named "Anoplolepis gracilipes virus 3 (AgrV-3)". Evolutionary analysis showed that AgrV-3 possesses a high level of genetic diversity and elevated mutation rate, combined with the common presence of multiple viral strains within single worker individuals, suggesting AgrV-3 likely evolves following the quasispecies model. A subsequent field survey placed the viral pathogen "hotspot" of A. gracilipes in the Southeast Asian region, a pattern consistent with the region being recognized as part of the ant's native range. Lastly, infection of multiple virus species seems prevalent across field colonies and may have been linked to the ant's social organization.
Affiliation(s)
- Chih-Chi Lee
  - Laboratory of Insect Ecology, Graduate School of Agriculture, Kyoto University, Kyoto 6068502, Japan
  - Research Institute for Sustainable Humanosphere, Kyoto University, Kyoto 6110011, Japan
  - Department of Evolutionary and Environmental Biology, Institute of Evolution, University of Haifa, Haifa 3498838, Israel
- Hung-Wei Hsu
  - Laboratory of Insect Ecology, Graduate School of Agriculture, Kyoto University, Kyoto 6068502, Japan
  - Research Institute for Sustainable Humanosphere, Kyoto University, Kyoto 6110011, Japan
  - Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- Chun-Yi Lin
  - Research Institute for Sustainable Humanosphere, Kyoto University, Kyoto 6110011, Japan
  - Citrus Research and Education Center, University of Florida, Lake Alfred, FL 33850, USA
- Nicolas Gustafson
  - Department of Biochemistry, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
- Kenji Matsuura
  - Laboratory of Insect Ecology, Graduate School of Agriculture, Kyoto University, Kyoto 6068502, Japan
- Chow-Yang Lee
  - Department of Entomology, University of California, 900 University Avenue, Riverside, CA 92521, USA
- Chin-Cheng Scotty Yang
  - Department of Entomology, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
  - Correspondence: Tel.: +1-540-231-3052
41
Mori M, Ode H, Kubota M, Nakata Y, Kasahara T, Shigemi U, Okazaki R, Matsuda M, Matsuoka K, Sugimoto A, Hachiya A, Imahashi M, Yokomaku Y, Iwatani Y. Nanopore Sequencing for Characterization of HIV-1 Recombinant Forms. Microbiol Spectr 2022; 10:e0150722. [PMID: 35894615 PMCID: PMC9431566 DOI: 10.1128/spectrum.01507-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2022] [Accepted: 07/06/2022] [Indexed: 11/23/2022]
Abstract
High genetic diversity, including the emergence of recombinant forms (RFs), is one of the most prominent features of human immunodeficiency virus type 1 (HIV-1). Conventional detection of HIV-1 RFs requires pretreatments, i.e., cloning or single-genome amplification, to distinguish them from dual- or multiple-infection variants. However, these processes are time-consuming and labor-intensive. Here, we constructed a new nanopore sequencing-based platform that enables us to obtain distinctive genetic information for intersubtype RFs and dual-infection HIV-1 variants by using amplicons of HIV-1 near-full-length genomes or two overlapping half-length genome fragments. Repeated benchmark tests of HIV-1 proviral DNA revealed consensus sequence inference with a reduced error rate, allowing us to obtain sufficiently accurate sequence data. In addition, we applied the platform for sequence analyses of 9 clinical samples with suspected HIV-1 RF infection or dual infection according to Sanger sequencing-based genotyping tests for HIV-1 drug resistance. For each RF infection case, replicated analyses involving our nanopore sequencing-based platform consistently produced long consecutive analogous consensus sequences with mosaic genomic structures consisting of two different subtypes. In contrast, we detected multiple heterologous sequences in each dual-infection case. These results demonstrate that our new nanopore sequencing platform is applicable to identify the full-length HIV-1 genome structure of intersubtype RFs as well as dual-infection heterologous HIV-1. Since the genetic diversity of HIV-1 continues to gradually increase, this system will help accelerate full-length genome analysis and molecular epidemiological surveillance for HIV-1.
IMPORTANCE HIV-1 is characterized by large genetic differences, including HIV-1 recombinant forms (RFs). Conventional genetic analyses require time-consuming pretreatments, i.e., cloning or single-genome amplification, to distinguish RFs from dual- or multiple-infection cases. In this study, we developed a new analytical system for HIV-1 sequence data obtained by nanopore sequencing. The error rate of this method was reduced to ~0.06%. We applied this system for sequence analyses of 9 clinical samples with suspected HIV-1 RF infection or dual infection, which were extracted from 373 cases of HIV patients based on our retrospective analysis of HIV-1 drug resistance genotyping test results. We found that our new nanopore sequencing platform is applicable to identify the full-length HIV-1 genome structure of intersubtype RFs as well as dual-infection heterologous HIV-1. Our protocol will be useful for epidemiological surveillance to examine HIV-1 transmission as well as for genotypic tests of HIV-1 drug resistance in clinical settings.
Affiliation(s)
- Mikiko Mori
  - Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
  - Division of Basic Medicine, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Hirotaka Ode
  - Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
- Mai Kubota
  - Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
- Yoshihiro Nakata
  - Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
  - Division of Basic Medicine, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Takaaki Kasahara
  - Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
  - Division of Basic Medicine, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Urara Shigemi
  - Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
- Reiko Okazaki
  - Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
- Masakazu Matsuda
  - Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
- Kazuhiro Matsuoka
  - Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
- Atsuko Sugimoto
  - Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
- Atsuko Hachiya
  - Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
- Mayumi Imahashi
  - Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
- Yoshiyuki Yokomaku
  - Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
- Yasumasa Iwatani
  - Clinical Research Center, National Hospital Organization Nagoya Medical Center, Nagoya, Japan
  - Division of Basic Medicine, Nagoya University Graduate School of Medicine, Nagoya, Japan
42
Ferrández-Peral L, Zhan X, Alvarez-Estape M, Chiva C, Esteller-Cucala P, García-Pérez R, Julià E, Lizano E, Fornas Ò, Sabidó E, Li Q, Marquès-Bonet T, Juan D, Zhang G. Transcriptome innovations in primates revealed by single-molecule long-read sequencing. Genome Res 2022; 32:1448-1462. [PMID: 35840341 PMCID: PMC9435740 DOI: 10.1101/gr.276395.121] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/16/2021] [Accepted: 07/12/2022] [Indexed: 11/24/2022]
Abstract
Transcriptomic diversity greatly contributes to the fundamentals of disease, lineage-specific biology, and environmental adaptation. However, much of the actual isoform repertoire contributing to shaping primate evolution remains unknown. Here, we combined deep long- and short-read sequencing complemented with mass spectrometry proteomics in a panel of lymphoblastoid cell lines (LCLs) from human, three other great apes, and rhesus macaque, producing the largest full-length isoform catalog in primates to date. Around half of the captured isoforms are not annotated in their reference genomes, significantly expanding the gene models in primates. Furthermore, our comparative analyses unveil hundreds of transcriptomic innovations and isoform usage changes related to immune function and immunological disorders. The confluence of these evolutionary innovations with signals of positive selection and their limited impact in the proteome points to changes in alternative splicing in genes involved in immune response as an important target of recent regulatory divergence in primates.
Affiliation(s)
- Cristina Chiva
  - Center for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), 08003 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
- Eva Julià
  - Center for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), 08003 Barcelona, Spain
- Esther Lizano
  - Institute of Evolutionary Biology (UPF-CSIC), PRBB, 08003 Barcelona, Spain
  - Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, 08193 Barcelona, Spain
- Òscar Fornas
  - Center for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), 08003 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
- Eduard Sabidó
  - Center for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), 08003 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
- Qiye Li
  - BGI-Shenzhen, Shenzhen 518083, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Tomàs Marquès-Bonet
  - Institute of Evolutionary Biology (UPF-CSIC), PRBB, 08003 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
  - Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, 08193 Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), 08010 Barcelona, Spain
  - CNAG-CRG, Center for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), 08028 Barcelona, Spain
- David Juan
  - Institute of Evolutionary Biology (UPF-CSIC), PRBB, 08003 Barcelona, Spain
- Guojie Zhang
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
  - Section for Ecology and Evolution, Department of Biology, University of Copenhagen, DK-2100 Copenhagen 2200, Denmark
  - Evolutionary and Organismal Biology Research Center, School of Medicine, Zhejiang University, Hangzhou 310058, China
43
Putzeys L, Boon M, Lammens EM, Kuznedelov K, Severinov K, Lavigne R. Development of ONT-cappable-seq to unravel the transcriptional landscape of Pseudomonas phages. Comput Struct Biotechnol J 2022; 20:2624-2638. [PMID: 35685363 PMCID: PMC9163698 DOI: 10.1016/j.csbj.2022.05.034] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 04/13/2022] [Revised: 05/16/2022] [Accepted: 05/16/2022] [Indexed: 11/28/2022]
Abstract
RNA sequencing has become the method of choice to study the transcriptional landscape of phage-infected bacteria. However, short-read RNA sequencing approaches generally fail to capture the primary 5' and 3' boundaries of transcripts, confounding the discovery of key transcription initiation and termination events as well as operon architectures. Yet, the elucidation of these elements is crucial for the understanding of the strategy of transcription regulation during the infection process, which is currently lacking beyond a handful of model phages. We developed ONT-cappable-seq, a specialized long-read RNA sequencing technique that allows end-to-end sequencing of primary prokaryotic transcripts using the Nanopore sequencing platform. We applied ONT-cappable-seq to study transcription of Pseudomonas aeruginosa phage LUZ7, obtaining a comprehensive genome-wide map of viral transcription start sites, terminators, and complex operon structures that fine-regulate gene expression. Our work provides new insights in the RNA biology of a non-model phage, unveiling distinct promoter architectures, putative small non-coding viral RNAs, and the prominent regulatory role of terminators during infection. The robust workflow presented here offers a framework to obtain a global, yet fine-grained view of phage transcription and paves the way for standardized, in-depth transcription studies for microbial viruses or bacteria in general.
Affiliation(s)
- Leena Putzeys
  - Department of Biosystems, Laboratory of Gene Technology, KU Leuven, Leuven 3001, Belgium
- Maarten Boon
  - Department of Biosystems, Laboratory of Gene Technology, KU Leuven, Leuven 3001, Belgium
- Eveline-Marie Lammens
  - Department of Biosystems, Laboratory of Gene Technology, KU Leuven, Leuven 3001, Belgium
- Rob Lavigne
  - Department of Biosystems, Laboratory of Gene Technology, KU Leuven, Leuven 3001, Belgium
44
Wang L, Zhang C, Yin W, Wei W, Wang Y, Sa W, Liang J. Single-molecule real-time sequencing of the full-length transcriptome of purple garlic (Allium sativum L. cv. Leduzipi) and identification of serine O-acetyltransferase family proteins involved in cysteine biosynthesis. Journal of the Science of Food and Agriculture 2022; 102:2864-2873. [PMID: 34741310 DOI: 10.1002/jsfa.11627] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/06/2021] [Revised: 10/25/2021] [Accepted: 11/05/2021] [Indexed: 06/13/2023]
Abstract
BACKGROUND Garlic (Allium sativum L.), whose bioactive components are mainly organosulfur compounds (OSCs), is a herbaceous perennial widely consumed as a green vegetable and a condiment. Yet, the metabolic enzymes involved in the biosynthesis of OSCs are not identified in garlic. RESULTS Here, a full-length transcriptome of purple garlic was generated via PacBio and Illumina sequencing, to characterize the garlic transcriptome and identify key proteins mediating the biosynthesis of OSCs. Overall, 22.56 Gb of clean data were generated, resulting in 454 698 circular consensus sequence (CCS) reads, of which 83.4% (379 206) were identified as being full-length non-chimeric reads - their further transcript clustering facilitated identification of 36 571 high-quality consensus reads. Once corrected, their genome-wide mapping revealed that 6140 reads were novel isoforms of known genes, and 2186 reads were novel isoforms from novel genes. We detected 1677 alternative splicing events, finding 2902 genes possessing either two or more poly(A) sites. Given the importance of serine O-acetyltransferase (SERAT) in cysteine biosynthesis, we investigated the five SERAT homologs in garlic. Phylogenetic analysis revealed a three-tier classification of SERAT proteins, each featuring a serine acetyltransferase domain (N-terminal) and one or two hexapeptide transferase motifs. Template-based modeling showed that garlic SERATs shared a common homo-trimeric structure with homologs from bacteria and other plants. The residues responsible for substrate recognition and catalysis were highly conserved, implying a similar reaction mechanism. In profiling the five SERAT genes' transcript levels, their expression pattern varied significantly among different tissues. CONCLUSION This study's findings deepen our knowledge of SERAT proteins, and provide timely genetic resources that could advance future exploration into garlic's genetic improvement and breeding. © 2021 Society of Chemical Industry.
Affiliation(s)
- Le Wang
  - State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University, Xining, China
  - Qinghai Academy of Agricultural Forestry Sciences, Qinghai University, Xining, China
  - Qinghai Key Laboratory of Hulless Barley Genetics and Breeding, College of Agriculture and Forestry Sciences, Qinghai University, Xining, China
- Chao Zhang
  - State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University, Xining, China
  - Qinghai Academy of Agricultural Forestry Sciences, Qinghai University, Xining, China
  - Qinghai Key Laboratory of Hulless Barley Genetics and Breeding, College of Agriculture and Forestry Sciences, Qinghai University, Xining, China
- Wei Yin
  - Qinghai Academy of Agricultural Forestry Sciences, Qinghai University, Xining, China
- Wei Wei
  - State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University, Xining, China
  - Qinghai Academy of Agricultural Forestry Sciences, Qinghai University, Xining, China
- Yonghong Wang
  - State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University, Xining, China
- Wei Sa
  - State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University, Xining, China
- Jian Liang
  - State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University, Xining, China
  - Qinghai Academy of Agricultural Forestry Sciences, Qinghai University, Xining, China
  - Qinghai Key Laboratory of Hulless Barley Genetics and Breeding, College of Agriculture and Forestry Sciences, Qinghai University, Xining, China
45
Wongsurawat T, Jenjaroenpun P, Wanchai V, Nookaew I. Native RNA or cDNA Sequencing for Transcriptomic Analysis: A Case Study on Saccharomyces cerevisiae. Front Bioeng Biotechnol 2022; 10:842299. [PMID: 35497361 PMCID: PMC9039254 DOI: 10.3389/fbioe.2022.842299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2021] [Accepted: 03/01/2022] [Indexed: 11/13/2022]
Abstract
Direct sequencing of single molecules through nanopores allows for accurate quantification and full-length characterization of native RNA or complementary DNA (cDNA) without amplification. Both nanopore-based native RNA and cDNA approaches involve complex transcriptome procedures at a lower cost. However, there are several differences between the two approaches. In this study, we perform matched native RNA sequencing and cDNA sequencing to enable relevant comparisons and evaluation. Using Saccharomyces cerevisiae, a eukaryotic model organism widely used in industrial biotechnology, we compare two growth conditions: poly-A messenger RNA was isolated from yeast cells grown in minimum media under respirofermentative conditions supplemented with glucose (glucose growth conditions) and from cells that had shifted to ethanol as a carbon source (ethanol growth conditions). Library preparation for direct RNA sequencing is shorter than that for direct cDNA sequencing. The sequence characteristics of the two methods differed, including sequence yield, read quality scores, read length distribution, and the ability of reads to map to the reference. However, differential gene expression analyses derived from the two approaches are comparable. A unique feature of direct RNA sequencing is the detection of RNA modifications; we found that RNA modification at the 5' end of a transcript was underestimated due to the 3' bias of direct RNA sequencing. Our comprehensive evaluation could help researchers make informed choices when selecting an appropriate long-read sequencing method for understanding gene functions, pathways, and detailed functional characterization.
Affiliation(s)
- Thidathip Wongsurawat
  - Division of Bioinformatics and Data Management for Research, Research Group and Research Network Division, Research Department, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
- Piroon Jenjaroenpun
  - Division of Bioinformatics and Data Management for Research, Research Group and Research Network Division, Research Department, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
- Visanu Wanchai
  - Department of Biomedical Informatics, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Intawat Nookaew
  - Department of Biomedical Informatics, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, United States
46
Full-Length Transcriptome Characterization and Comparative Analysis of Chosenia arbutifolia. Forests 2022. [DOI: 10.3390/f13040543] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 02/05/2023]
Abstract
As a unique tree species in the Salicaceae family, Chosenia arbutifolia is used primarily for construction materials and landscape planting in China. Compared with other members of Salicaceae, the genomic resources of C. arbutifolia are extremely scarce. Thus, in the present study, the full-length transcriptome of C. arbutifolia was sequenced by single-molecule real-time (SMRT) sequencing technology based on the PacBio platform. Then, it was compared against those of other Salicaceae species. We generated 17,397,064 subreads and 95,940 polished reads with an average length of 1812 bp, which were acquired through calibration, clustering, and polishing. In total, 50,073 genes were reconstructed, among which 48,174 open reading frames, 4281 long non-coding RNAs, and 3121 transcription factors were identified. Functional annotation revealed that 47,717 genes had a hit in at least one of five reference databases. Moreover, a set of 12,332 putative SSR markers was screened among the reconstructed genes. Single-copy and special orthogroups, and divergent and conserved genes, were identified and analyzed to find divergence among C. arbutifolia and the five Salicaceae species. To reveal genes involved in a specific function and pathway, enrichment analyses for GO and KEGG were also performed. In conclusion, the present study empirically confirmed that SMRT sequencing realistically depicted the C. arbutifolia transcriptome and provided a comprehensive reference for functional genomic research on Salicaceae species.
47
Yan C, Zhang N, Wang Q, Fu Y, Zhao H, Wang J, Wu G, Wang F, Li X, Liao H. Full-length transcriptome sequencing reveals the molecular mechanism of potato seedlings responding to low-temperature. BMC Plant Biology 2022; 22:125. [PMID: 35300606 PMCID: PMC8932150 DOI: 10.1186/s12870-022-03461-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/13/2021] [Accepted: 02/09/2022] [Indexed: 06/14/2023]
Abstract
BACKGROUND Potato (Solanum tuberosum L.) is one of the world's most important crops. The cultivated potato is frost-sensitive, and low temperature severely influences potato production. However, the mechanism by which potato responds to low-temperature stress is unclear. In this research, we apply a combination of second-generation sequencing and third-generation sequencing technologies to sequence full-length transcriptomes in low-temperature-sensitive cultivars to identify the important genes and main pathways related to low-temperature resistance. RESULTS In this study, we obtained 41,016 high-quality transcripts, which included 15,189 putative new transcripts. Among them, we identified 11,665 open reading frames and 6085 simple sequence repeats in the potato dataset. We used publicly available genomic contigs to analyze the gene features, simple sequence repeats, and alternative splicing events of 24,658 non-redundant transcript sequences, predicted coding sequences, and identified alternative polyadenylation. We performed cluster analysis and GO and KEGG functional analysis of 4518 genes that were differentially expressed between the different low-temperature treatments. We examined 36 transcription factor families and identified 542 transcription factors among the differentially expressed genes; the AP2 family was the largest, with 64 transcription factors. We measured the malondialdehyde, soluble sugar, and proline contents, as well as expression changes of genes associated with low-temperature resistance, in low-temperature-treated leaves. We also tentatively speculate that StLPIN10369.5 and StCDPK16 may play a central coordinating role in the response of potatoes to low-temperature stress. CONCLUSIONS Overall, this study provided the first large-scale full-length transcriptome sequencing of potato and will facilitate structure-function genetic and comparative genomics studies of this important crop.
Affiliation(s)
- Chongchong Yan
  - Anhui Academy of Agricultural Sciences, Hefei, 230031, Anhui, China
- Nan Zhang
  - Anhui Vocational College of City Management, Hefei, 231635, Anhui, China
- Qianqian Wang
  - Anhui Academy of Agricultural Sciences, Hefei, 230031, Anhui, China
- Yuying Fu
  - Anhui Academy of Agricultural Sciences, Hefei, 230031, Anhui, China
- Hongyuan Zhao
  - Anhui Academy of Agricultural Sciences, Hefei, 230031, Anhui, China
- Jiajia Wang
  - Anhui Academy of Agricultural Sciences, Hefei, 230031, Anhui, China
- Gang Wu
  - Anhui Academy of Agricultural Sciences, Hefei, 230031, Anhui, China
- Feng Wang
  - Jieshou County Agricultural Technology Promotion Center, Jieshou, 236500, Anhui, China
- Xueyan Li
  - Funan County Agricultural Technology Promotion Center, Funan, 236300, Anhui, China
- Huajun Liao
  - Anhui Academy of Agricultural Sciences, Hefei, 230031, Anhui, China
48
Grünberger F, Ferreira-Cerca S, Grohmann D. Nanopore sequencing of RNA and cDNA molecules in Escherichia coli. RNA (New York, N.Y.) 2022; 28:400-417. [PMID: 34906997 PMCID: PMC8848933 DOI: 10.1261/rna.078937.121] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 08/02/2021] [Accepted: 11/29/2021] [Indexed: 05/09/2023]
Abstract
High-throughput sequencing dramatically changed our view of transcriptome architectures and allowed for ground-breaking discoveries in RNA biology. Recently, sequencing of full-length transcripts based on the single-molecule sequencing platform from Oxford Nanopore Technologies (ONT) was introduced and is widely used to sequence eukaryotic and viral RNAs. However, experimental approaches implementing this technique for prokaryotic transcriptomes remain scarce. Here, we present an experimental and bioinformatic workflow for ONT RNA-seq in the bacterial model organism Escherichia coli, which can be applied to any microorganism. Our study highlights critical steps of library preparation and computational analysis and compares the results to gold standards in the field. Furthermore, we comprehensively evaluate the applicability and advantages of different ONT-based RNA sequencing protocols, including direct RNA, direct cDNA, and PCR-cDNA. We find that (PCR)-cDNA-seq offers improved yield and accuracy compared to direct RNA sequencing. Notably, (PCR)-cDNA-seq is suitable for quantitative measurements and can be readily used for simultaneous and accurate detection of transcript 5' and 3' boundaries, analysis of transcriptional units, and transcriptional heterogeneity. In summary, based on our comprehensive study, we show nanopore RNA-seq to be a ready-to-use tool allowing rapid, cost-effective, and accurate annotation of multiple transcriptomic features. Thereby nanopore RNA-seq holds the potential to become a valuable alternative method for RNA analysis in prokaryotes.
Affiliation(s)
- Felix Grünberger
  - Institute of Biochemistry, Genetics and Microbiology, Institute of Microbiology and Archaea Centre, Single-Molecule Biochemistry Lab and Biochemistry Centre Regensburg, University of Regensburg, 93053 Regensburg, Germany
- Sébastien Ferreira-Cerca
  - Regensburg Center of Biochemistry (RCB), University of Regensburg, 93053 Regensburg, Germany
  - Institute for Biochemistry, Genetics and Microbiology, Regensburg Center for Biochemistry, Biochemistry III, University of Regensburg, 93053 Regensburg, Germany
- Dina Grohmann
  - Institute of Biochemistry, Genetics and Microbiology, Institute of Microbiology and Archaea Centre, Single-Molecule Biochemistry Lab and Biochemistry Centre Regensburg, University of Regensburg, 93053 Regensburg, Germany
  - Regensburg Center of Biochemistry (RCB), University of Regensburg, 93053 Regensburg, Germany
49
Sun M, Zhao Y, Shao X, Ge J, Tang X, Zhu P, Wang J, Zhao T. EST-SSR Marker Development and Full-Length Transcriptome Sequence Analysis of Tiger Lily (Lilium lancifolium Thunb). Appl Bionics Biomech 2022; 2022:7641048. [PMID: 35126662 PMCID: PMC8816598 DOI: 10.1155/2022/7641048] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/07/2021] [Revised: 01/03/2022] [Accepted: 01/12/2022] [Indexed: 12/11/2022]
Abstract
The fast advancement and deployment of sequencing technologies after the Human Genome Project have greatly increased our knowledge of the eukaryotic genome sequences. However, due to technological concerns, high-quality genomic data has been confined to a few key organisms. Moreover, our understanding of which portions of genomes make up genes and which transcript isoforms synthesize these genes is scarce. Therefore, the current study has been designed to explore the reliability of the tiger lily (Lilium lancifolium Thunb) transcriptome. The PacBio-SMRT was used for attaining the complete transcriptomic profile. We obtained a total of 815,624 CCS (Circular Consensus Sequence) reads with an average length of 1295 bp. The tiger lily transcriptome has been sequenced for the first time using third-generation long-read technology. Furthermore, unigenes (38,707), lncRNAs (6852), and TF members (768) were determined based on the transcriptome data, followed by evaluating SSRs (3319). It has also been revealed that 105 out of 128 primer pairs effectively amplified PCR products. Around 15,608 transcripts were allocated to 25 distinct KOG Clusters, and 10,706 unigenes were grouped into 52 functional categories in the annotated transcripts. Until now, no tiger lily lncRNAs have been discovered. Results of this study may serve as an extensive set of reference transcripts and help us learn more about the transcriptomes of tiger lilies and pave the path for further research.
Affiliation(s)
- Mingwei Sun
  - Lianyungang Academy of Agricultural Sciences, Lianyungang, China
- Xiaobin Shao
  - Lianyungang Academy of Agricultural Sciences, Lianyungang, China
- Jintao Ge
  - Lianyungang Academy of Agricultural Sciences, Lianyungang, China
- Xueyan Tang
  - Lianyungang Academy of Agricultural Sciences, Lianyungang, China
- Pengbo Zhu
  - Lianyungang Academy of Agricultural Sciences, Lianyungang, China
- Jiangying Wang
  - Lianyungang Academy of Agricultural Sciences, Lianyungang, China
- Tongli Zhao
  - Lianyungang Academy of Agricultural Sciences, Lianyungang, China
50
Raghavan V, Kraft L, Mesny F, Rigerte L. A simple guide to de novo transcriptome assembly and annotation. Brief Bioinform 2022; 23:6514404. [PMID: 35076693 PMCID: PMC8921630 DOI: 10.1093/bib/bbab563] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 09/09/2021] [Revised: 12/03/2021] [Accepted: 12/09/2021] [Indexed: 12/13/2022]
Abstract
A transcriptome constructed from short-read RNA sequencing (RNA-seq) is an easily attainable proxy catalog of protein-coding genes when genome assembly is unnecessary, expensive or difficult. In the absence of a sequenced genome to guide the reconstruction process, the transcriptome must be assembled de novo using only the information available in the RNA-seq reads. Subsequently, the sequences must be annotated in order to identify sequence-intrinsic and evolutionary features in them (for example, protein-coding regions). Although straightforward at first glance, de novo transcriptome assembly and annotation can quickly prove to be challenging undertakings. In addition to familiarizing themselves with the conceptual and technical intricacies of the tasks at hand and the numerous pre- and post-processing steps involved, those interested must also grapple with an overwhelmingly large choice of tools. The lack of standardized workflows, fast pace of development of new tools and techniques and paucity of authoritative literature have served to exacerbate the difficulty of the task even further. Here, we present a comprehensive overview of de novo transcriptome assembly and annotation. We discuss the procedures involved, including pre- and post-processing steps, and present a compendium of corresponding tools.
Affiliation(s)
- Venket Raghavan
- Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany (corresponding author)
- Louis Kraft
- Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany (corresponding author)
|