1
Feng YZ, Zhu QF, Xue J, Chen P, Yu Y. Shining in the dark: the big world of small peptides in plants. ABIOTECH 2023; 4:238-256. [PMID: 37970469 PMCID: PMC10638237 DOI: 10.1007/s42994-023-00100-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/01/2022] [Accepted: 02/24/2023] [Indexed: 11/17/2023]
Abstract
Small peptides represent a subset of dark matter in plant proteomes. Through differential expression patterns and modes of action, small peptides act as important regulators of plant growth and development. Over the past 20 years, many small peptides have been identified due to technical advances in genome sequencing, bioinformatics, and chemical biology. In this article, we summarize the classification of plant small peptides and experimental strategies used to identify them as well as their potential use in agronomic breeding. We review the biological functions and molecular mechanisms of small peptides in plants, discuss current problems in small peptide research and highlight future research directions in this field. Our review provides crucial insight into small peptides in plants and will contribute to a better understanding of their potential roles in biotechnology and agriculture.
Affiliation(s)
- Yan-Zhao Feng
- Guangdong Key Laboratory of Crop Germplasm Resources Preservation and Utilization, Key Laboratory of South China Modern Biological Seed Industry, Ministry of Agriculture and Rural Affairs, Agro-Biological Gene Research Center, Guangdong Academy of Agricultural Sciences, Guangzhou, 510640 China
- Qing-Feng Zhu
- Guangdong Key Laboratory of Crop Germplasm Resources Preservation and Utilization, Key Laboratory of South China Modern Biological Seed Industry, Ministry of Agriculture and Rural Affairs, Agro-Biological Gene Research Center, Guangdong Academy of Agricultural Sciences, Guangzhou, 510640 China
- Jiao Xue
- Guangdong Key Laboratory of Crop Germplasm Resources Preservation and Utilization, Key Laboratory of South China Modern Biological Seed Industry, Ministry of Agriculture and Rural Affairs, Agro-Biological Gene Research Center, Guangdong Academy of Agricultural Sciences, Guangzhou, 510640 China
- Pei Chen
- Guangdong Key Laboratory of Crop Germplasm Resources Preservation and Utilization, Key Laboratory of South China Modern Biological Seed Industry, Ministry of Agriculture and Rural Affairs, Agro-Biological Gene Research Center, Guangdong Academy of Agricultural Sciences, Guangzhou, 510640 China
- Yang Yu
- Guangdong Key Laboratory of Crop Germplasm Resources Preservation and Utilization, Key Laboratory of South China Modern Biological Seed Industry, Ministry of Agriculture and Rural Affairs, Agro-Biological Gene Research Center, Guangdong Academy of Agricultural Sciences, Guangzhou, 510640 China
2
Feng Y, Jiang M, Yu W, Zhou J. Identification of short open reading frames in plant genomes. FRONTIERS IN PLANT SCIENCE 2023; 14:1094715. [PMID: 36875581 PMCID: PMC9975389 DOI: 10.3389/fpls.2023.1094715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Accepted: 01/26/2023] [Indexed: 06/18/2023]
Abstract
The roles of short/small open reading frames (sORFs) have been increasingly recognized in recent years, as the development and application of the Ribo-Seq technique, which sequences the ribosome-protected footprints (RPFs) of translating mRNAs, has rapidly increased the number of sORFs identified in various organisms. However, special attention should be paid to RPFs used to identify sORFs in plants due to their small size (~30 nt) and the high complexity and repetitiveness of plant genomes, particularly in polyploid species. In this work, we compare different approaches to the identification of plant sORFs, discuss the advantages and disadvantages of each method, and provide a guide for choosing among methods in plant sORF studies.
Affiliation(s)
- Yong Feng
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Mengyun Jiang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- State Key Laboratory of Crop Stress Adaptation and Improvement, School of Life Sciences, Henan University, Kaifeng, China
- Weichang Yu
- Guangdong Key Laboratory of Plant Epigenetics, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, China
- Liaoning Peanut Research Institute, Liaoning Academy of Agricultural Sciences, Fuxing, China
- Jiannan Zhou
- Key Laboratory of Tropical Fruit Biology (Ministry of Agriculture), South Subtropical Crops Research Institute, Chinese Academy of Tropical Agricultural Sciences, Zhanjiang, China
3
Song B, Li H, Jiang M, Gao Z, Wang S, Gao L, Chen Y, Li W. slORFfinder: a tool to detect open reading frames resulting from trans-splicing of spliced leader sequences. Brief Bioinform 2023; 24:6972299. [PMID: 36611257 PMCID: PMC9851317 DOI: 10.1093/bib/bbac610] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/31/2022] [Revised: 11/16/2022] [Accepted: 12/11/2022] [Indexed: 01/09/2023]
Abstract
Trans-splicing of a spliced leader (SL) to the 5' ends of mRNAs is used to produce mature mRNAs in several phyla of great importance to human health and the marine ecosystem. One of the consequences of the addition of SL sequences is the change or disruption of the open reading frames (ORFs) in the recipient transcripts. Given that most SL sequences have one or more of the trinucleotide NUG, including AUG in flatworms, trans-splicing of SL sequences can potentially supply a start codon to create new ORFs, which we refer to as slORFs, in the recipient mRNAs. Due to the lack of a tool to precisely detect them, slORFs were usually neglected in previous studies. In this work, we present the tool slORFfinder, which automatically links the SL sequences to the recipient mRNAs at the trans-splicing sites identified from SL-containing reads of RNA-Seq and predicts slORFs according to the distribution of ribosome-protected footprints (RPFs) on the trans-spliced transcripts. By applying this tool to the analyses of nematodes, ascidians and euglena, whose RPFs are publicly available, we find wide existence of slORFs in these taxa. Furthermore, we find that slORFs are generally translated at higher levels than the annotated ORFs in the genomes, suggesting they might have important functions. Overall, this study provides a tool, slORFfinder (https://github.com/songbo446/slORFfinder), to identify slORFs, which can enhance our understanding of ORFs in taxa with SL machinery.
Affiliation(s)
- Mengyun Jiang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- Zhongtian Gao
- Guangdong Provincial Key Laboratory for Plant Epigenetics, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen 518060, China
- Suikang Wang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- Lei Gao
- Guangdong Provincial Key Laboratory for Plant Epigenetics, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen 518060, China
- Yunsheng Chen
- Corresponding authors: Yunsheng Chen, Department of Laboratory Medicine, Shenzhen Children's Hospital, Shenzhen 518038, China; Wujiao Li, Department of Laboratory Medicine, Shenzhen Children's Hospital, Shenzhen 518038, China
- Wujiao Li
- Corresponding authors: Yunsheng Chen, Department of Laboratory Medicine, Shenzhen Children's Hospital, Shenzhen 518038, China; Wujiao Li, Department of Laboratory Medicine, Shenzhen Children's Hospital, Shenzhen 518038, China
4
Álvarez-Urdiola R, Borràs E, Valverde F, Matus JT, Sabidó E, Riechmann JL. Peptidomics Methods Applied to the Study of Flower Development. Methods Mol Biol 2023; 2686:509-536. [PMID: 37540375 DOI: 10.1007/978-1-0716-3299-4_24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/05/2023]
Abstract
Understanding the global and dynamic nature of plant developmental processes requires not only the study of the transcriptome, but also of the proteome, including its largely uncharacterized peptidome fraction. Recent advances in proteomics and high-throughput analyses of translating RNAs (ribosome profiling) have begun to address this issue, evidencing the existence of novel, uncharacterized, and possibly functional peptides. To validate the accumulation in tissues of sORF-encoded polypeptides (SEPs), the basic setup of proteomic analyses (i.e., LC-MS/MS) can be followed. However, the detection of peptides that are small (up to ~100 aa, 6-7 kDa) and novel (i.e., not annotated in reference databases) presents specific challenges that need to be addressed both experimentally and with computational biology resources. Several methods have been developed in recent years to isolate and identify peptides from plant tissues. In this chapter, we outline two different peptide extraction protocols and the subsequent peptide identification by mass spectrometry using the database search or the de novo identification methods.
Affiliation(s)
- Raquel Álvarez-Urdiola
- Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Edifici CRAG, Campus UAB, Cerdanyola del Vallès, Barcelona, Spain
- Eva Borràs
- Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology, Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Federico Valverde
- Institute for Plant Biochemistry and Photosynthesis CSIC - University of Seville, Seville, Spain
- José Tomás Matus
- Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Edifici CRAG, Campus UAB, Cerdanyola del Vallès, Barcelona, Spain
- Institute for Integrative Systems Biology (I2SysBio), Universitat de València-CSIC, Paterna, Valencia, Spain
- Eduard Sabidó
- Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology, Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- José Luis Riechmann
- Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Edifici CRAG, Campus UAB, Cerdanyola del Vallès, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
5
Chothani S, Ho L, Schafer S, Rackham O. Discovering microproteins: making the most of ribosome profiling data. RNA Biol 2023; 20:943-954. [PMID: 38013207 PMCID: PMC10730196 DOI: 10.1080/15476286.2023.2279845] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/30/2023] [Indexed: 11/29/2023]
Abstract
Building a reference set of protein-coding open reading frames (ORFs) has revolutionized biological process discovery and understanding. Traditionally, gene models have been confirmed using cDNA sequencing, and the encoded translated regions inferred using sequence-based detection of start and stop combinations longer than 100 amino acids to prevent false positives. This has left small ORFs (smORFs) and their encoded proteins unannotated. Ribo-seq allows translated regions to be distinguished from untranslated ones irrespective of length. In this review, we describe the power of Ribo-seq data in the detection of smORFs while discussing the major challenge posed by data quality, depth and sparseness in identifying the start and end of smORF translation. In particular, we outline smORF cataloguing efforts in humans and the large differences that have arisen due to variation in data, methods and assumptions. Although current versions of smORF reference sets can already be used as a powerful tool for hypothesis generation, we recommend that future editions should consider these data limitations and adopt unified processing for the community to establish a canonical catalogue of translated smORFs.
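The length-based annotation rule described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the cited review: the function names `find_orfs` and `split_by_length`, the forward-strand-only scan, and the choice of the most upstream ATG per stop codon are all assumptions made for illustration. It shows how a cutoff of more than 100 codons leaves shorter ORFs out of a conventional annotation.

```python
# Illustrative sketch only: a naive sequence-based ORF scan with a length cutoff.
# Names and simplifications (forward strand only, ATG starts only) are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq):
    """Return (start, end, n_codons) for each ATG...stop ORF on the forward strand.

    For every stop codon, only the ORF beginning at the most upstream in-frame
    ATG is reported. n_codons counts the start codon but not the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orfs.append((start, i + 3, (i - start) // 3))
                start = None
    return orfs


def split_by_length(orfs, threshold=100):
    """Separate ORFs longer than `threshold` codons from the smaller ones."""
    long_orfs = [o for o in orfs if o[2] > threshold]
    small_orfs = [o for o in orfs if o[2] <= threshold]
    return long_orfs, small_orfs


# Toy transcript: a 6-codon smORF followed by a 151-codon ORF in the same frame.
toy = "ATG" + "GCT" * 5 + "TAA" + "ATG" + "GCT" * 150 + "TGA"
annotated, overlooked = split_by_length(find_orfs(toy))
```

In this toy case the 6-codon ORF ends up in `overlooked`, which mirrors how a purely length-based rule discards smORFs regardless of whether they are translated; Ribo-seq evidence, as the abstract notes, is independent of length.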
Affiliation(s)
- Sonia Chothani
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore
- Lena Ho
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore
- Sebastian Schafer
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore
- Owen Rackham
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore
- School of Biological Sciences, University of Southampton, Southampton, UK
- The Alan Turing Institute, The British Library, London, UK
6
Small open reading frames in plant research: from prediction to functional characterization. 3 Biotech 2022; 12:76. [PMID: 35251879 PMCID: PMC8873315 DOI: 10.1007/s13205-022-03147-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/28/2021] [Accepted: 02/11/2022] [Indexed: 11/01/2022]
Abstract
Gene prediction is a laborious and time-consuming task. Advances in sequencing technologies and bioinformatics tools, coupled with the accelerated development of ribosome profiling and mass spectrometry, have made it possible to identify small open reading frames (sORFs) (< 100 codons) in various plant genomes. Over the past 50 years, sORFs have been isolated from many organisms. However, a comprehensive sORF annotation pipeline is not yet available, a gap we address in our review. We also provide current information on the classification and functions of plant sORFs and their potential applications in crop improvement programs.
7
Kute PM, Soukarieh O, Tjeldnes H, Trégouët DA, Valen E. Small Open Reading Frames, How to Find Them and Determine Their Function. Front Genet 2022; 12:796060. [PMID: 35154250 PMCID: PMC8831751 DOI: 10.3389/fgene.2021.796060] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 10/15/2021] [Accepted: 12/30/2021] [Indexed: 12/12/2022]
Abstract
Advances in genomics and molecular biology have revealed an abundance of small open reading frames (sORFs) across all types of transcripts. While these sORFs are often assumed to be non-functional, many have been implicated in physiological functions and a significant number of sORFs have been described in human diseases. Thus, sORFs may represent a hidden repository of functional elements that could serve as therapeutic targets. Unlike protein-coding genes, it is not necessarily the encoded peptide of an sORF that enacts its function; sometimes simply the act of translating an sORF might have a regulatory role. Indeed, the most studied sORFs are located in the 5′UTRs of coding transcripts and can have a regulatory impact on the translation of the downstream protein-coding sequence. However, sORFs have also been abundantly identified in non-coding RNAs, including lncRNAs, circular RNAs and ribosomal RNAs, suggesting that sORFs may be diverse in function. Of the many different experimental methods used to discover sORFs, the most commonly used are ribosome profiling and mass spectrometry. These can confirm interactions between transcripts and ribosomes and the production of a peptide, respectively. Extensions to ribosome profiling, which also capture scanning ribosomes, have further made it possible to see how sORFs impact the translation initiation of mRNAs. While high-throughput techniques have made the identification of sORFs less difficult, defining their function, if any, is typically more challenging. Together, the abundance and potential function of many of these sORFs argue for the necessity of including sORFs in gene annotations and systematically characterizing them to understand their potential functional roles. In this review, we will focus on the high-throughput methods used in the detection and characterization of sORFs and discuss techniques for validation and functional characterization.
Affiliation(s)
- Preeti Madhav Kute
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen, Norway
- Omar Soukarieh
- Department of Molecular Epidemiology Of Vascular and Brain Disorders, INSERM, BPH, U1219, University of Bordeaux, Bordeaux, France
- Håkon Tjeldnes
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- David-Alexandre Trégouët
- Department of Molecular Epidemiology Of Vascular and Brain Disorders, INSERM, BPH, U1219, University of Bordeaux, Bordeaux, France
- Eivind Valen
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen, Norway
- Correspondence: Eivind Valen
8
Orlov YL, Anashkina AA. Life: Computational Genomics Applications in Life Sciences. Life (Basel) 2021; 11:life11111211. [PMID: 34833087 PMCID: PMC8622464 DOI: 10.3390/life11111211] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/05/2021] [Accepted: 11/06/2021] [Indexed: 01/19/2023]
Affiliation(s)
- Yuriy L. Orlov
- The Digital Health Institute, I.M. Sechenov First Moscow State Medical University of the Ministry of Health of the Russian Federation (Sechenov University), 119991 Moscow, Russia
- Life Sciences Department, Novosibirsk State University, 630090 Novosibirsk, Russia
- Agrarian and Technological Institute, Peoples’ Friendship University of Russia, 117198 Moscow, Russia
- Anastasia A. Anashkina
- The Digital Health Institute, I.M. Sechenov First Moscow State Medical University of the Ministry of Health of the Russian Federation (Sechenov University), 119991 Moscow, Russia
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia