1
Mouland AJ, Chau BA, Uversky VN. Methodological approaches to studying phase separation and HIV-1 replication: Current and future perspectives. Methods 2024; 229:147-155. [PMID: 39002735 DOI: 10.1016/j.ymeth.2024.07.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/25/2023] [Revised: 06/26/2024] [Accepted: 07/11/2024] [Indexed: 07/15/2024] Open
Abstract
This article reviews tried-and-tested methodologies that have been employed in the first studies on phase separating properties of structural, RNA-binding and catalytic proteins of HIV-1. These are described here to stimulate interest for any who may want to initiate similar studies on virus-mediated liquid-liquid phase separation. Such studies serve to better understand the life cycle and pathogenesis of viruses and open the door to new therapeutics.
Affiliation(s)
- Andrew J Mouland
  - Department of Medicine, McGill University, Montreal, Quebec, Canada.
- Bao-An Chau
  - Department of Medicine, McGill University, Montreal, Quebec, Canada
- Vladimir N Uversky
  - Department of Molecular Medicine and Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA.
2
Argudo PG. Lipids and proteins: Insights into the dynamics of assembly, recognition, condensate formation. What is still missing? Biointerphases 2024; 19:038501. [PMID: 38922634 DOI: 10.1116/6.0003662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Accepted: 06/03/2024] [Indexed: 06/27/2024] Open
Abstract
Lipid membranes and proteins, which are part of us throughout our lives, have been studied for decades. However, every year, new discoveries show how little we know about them. In a reader-friendly manner for people not involved in the field, this paper tries to serve as a bridge between physicists and biologists and new young researchers diving into the field to show its relevance, pointing out just some of the plethora of lines of research yet to be unraveled. It illustrates how new ways, from experimental to theoretical approaches, are needed in order to understand the structures and interactions that take place in a single lipid, protein, or multicomponent system, as we are still only scratching the surface.
Affiliation(s)
- Pablo G Argudo
  - Max Planck Institute for Polymer Research (MPI-P), Mainz 55128, Germany
3
Lee WK, Chan BKK, Kim JY, Ju SJ, Kim SJ. Comparative genomics reveals the dynamic evolutionary history of cement protein genes of barnacles from intertidal to deep-sea hydrothermal vents. Mol Ecol Resour 2024; 24:e13895. [PMID: 37955198 DOI: 10.1111/1755-0998.13895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2022] [Revised: 10/16/2023] [Accepted: 10/30/2023] [Indexed: 11/14/2023]
Abstract
Thoracican barnacles are a diverse group of marine organisms for which the availability of genome assemblies is currently limited. In this study, we sequenced the genomes of two neolepadoid species (Ashinkailepas kermadecensis, Imbricaverruca yamaguchii) from hydrothermal vents, in addition to two intertidal species. Genome sizes ranged from 481 to 1054 Mb, with repetitive sequence contents of 21.2% to 50.7%. Concordance rates of orthologs and heterozygosity rates were between 82.4% and 91.7% and between 1.0% and 2.1%, respectively, indicating high genetic diversity and heterozygosity. Based on phylogenomic analyses, we revised the nomenclature of cement genes encoding cement proteins that are not homologous to any known proteins. The major cement gene, CP100A, was found in all thoracican species, including vent-associated neolepadoids, and was hypothesised to be essential for thoracican settlement. Duplicated genes, CP100B and CP100C, were found only in balanids, suggesting potential functional redundancy or acquisition of new functions associated with the calcareous base. An ancestor of CP52 genes was duplicated dynamically among lepadids, pollicipedids with multiple copies on a single scaffold, and balanids with multiple sequential repeats of the conserved regions, but no CP52 genes were found in neolepadoids, providing insights into cement gene evolution among thoracican lineages. This study enhances our understanding of the adhesion mechanisms of thoracicans in underwater environments. The newly sequenced genomes provide opportunities for studying their evolution and ecology, shedding light on their adaptation to diverse marine environments, and contributing to our knowledge of barnacle biology with valuable genomic resources for further studies in this field.
Affiliation(s)
- Won-Kyung Lee
  - Division of Biomedical Research, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea
  - Division of EcoScience, Ewha Womans University, Seoul, Korea
- Benny K K Chan
  - Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Jae-Yoon Kim
  - Division of Biomedical Research, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea
- Se-Jong Ju
  - Marine Resources & Environment Research Division, Korea Institute of Ocean Science and Technology, Busan, Korea
- Se-Joo Kim
  - Division of Biomedical Research, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea
4
Djulbegovic M, Taylor Gonzalez DJ, Antonietti M, Uversky VN, Shields CL, Karp CL. Intrinsic disorder may drive the interaction of PROS1 and MERTK in uveal melanoma. Int J Biol Macromol 2023; 250:126027. [PMID: 37506796 PMCID: PMC11182630 DOI: 10.1016/j.ijbiomac.2023.126027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2022] [Revised: 07/23/2023] [Accepted: 07/25/2023] [Indexed: 07/30/2023]
Abstract
BACKGROUND Class 2 uveal melanomas are associated with the inactivation of the BRCA1 (breast cancer type 1 susceptibility protein)-associated protein 1 (BAP1) gene. Inactivation of BAP1 promotes the upregulation of vitamin K-dependent protein S (PROS1), which interacts with the tyrosine-protein kinase Mer (MERTK) receptor on M2 macrophages to induce an immunosuppressive environment. METHODS We simulated the interaction of PROS1 with MERTK with ColabFold. We evaluated PROS1 and MERTK for the presence of intrinsically disordered protein regions (IDPRs) and disorder-to-order transition (DOT) regions to understand their protein-protein interaction (PPI). We first evaluated the structure of each protein with AlphaFold. We then analyzed specific sequence-based features of PROS1 and MERTK with a suite of bioinformatics tools. RESULTS With high resolution and moderate confidence, we successfully modeled the interaction between PROS1 and MERTK (predicted local distance difference test score (pLDDT) = 70.68). Our structural analysis qualitatively demonstrated IDPRs (i.e., spaghetti-like entities) in PROS1 and MERTK. PROS1 was 23.37% disordered, and MERTK was 23.09% disordered, classifying them as moderately disordered and flexible proteins. PROS1 was significantly enriched in cysteine, the most order-promoting residue (p-value < 0.05). Our IUPred analysis demonstrated that there are two DOT regions in PROS1. MERTK was significantly enriched in proline, the most disorder-promoting residue (p-value < 0.05), but did not contain DOT regions. Our STRING analysis demonstrated that the PPI network between PROS1 and MERTK is more complex than their assumed one-to-one binding (p-value < 2.0 × 10⁻⁶). CONCLUSION Our findings present a novel prediction for the interaction between PROS1 and MERTK. Our findings show that PROS1 and MERTK contain elements of intrinsic disorder. PROS1 has two DOT regions that are attractive immunotherapy targets. We recommend that IDPRs and DOT regions found in PROS1 and MERTK be considered when developing immunotherapies targeting this PPI.
Affiliation(s)
- Mak Djulbegovic
  - Bascom Palmer Eye Institute, University of Miami, Miami, FL, USA
- Vladimir N Uversky
  - Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
- Carol L Shields
  - Ocular Oncology Service, Wills Eye Hospital, Thomas Jefferson University, Philadelphia, PA, USA
- Carol L Karp
  - Bascom Palmer Eye Institute, University of Miami, Miami, FL, USA.
5
Denesyuk AI, Permyakov SE, Permyakov EA, Johnson MS, Denessiouk K, Uversky VN. Canonical structural-binding modes in the calmodulin-target protein complexes. J Biomol Struct Dyn 2023; 41:7582-7594. [PMID: 36106955 DOI: 10.1080/07391102.2022.2123391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2022] [Accepted: 09/04/2022] [Indexed: 10/14/2022]
Abstract
The intracellular calcium sensor protein calmodulin (CaM) belongs to the large EF-hand protein superfamily. CaM shows a unique and not fully understood ability to bind to multiple targets, allowing it to participate in a variety of regulatory processes. The protein has two approximately symmetrical globular domains (the N- and C-lobes). Analysis of the CaM-binding sites of target proteins showed that they have two hydrophobic 'anchor' amino acids separated by 10 to 17 residues. Consequently, several CaM-binding motifs ({1-10}, {1-11}, {1-13}, {1-14}, {1-16}, {1-17}), differing by the distance between the two anchor residues along the amino acid sequence, have been identified. Despite extensive structural information on the role of target-protein amino acid residues in the formation of complexes with CaM, much less is known about the role of amino acids from CaM contributing to these interactions. In this work, a quantitative analysis of the contact surfaces of CaM and target proteins has been carried out for 35 representative three-dimensional structures. It has been shown that, in addition to the two hydrophobic terminal residues of the target fragment, the interaction also involves residues that are 4 residues earlier in the sequence (binding mode {1-5}). It has also been found that the N- and C-lobes of CaM bind the {1-5} motif located at the ends of the target in a structurally identical manner. Methionine residues at positions 51 (corresponding to 124 in the C-lobe), 71 (144), and 72 (145) of the CaM amino acid sequence are key hydrophobic residues for this interaction. They are located at the N- and C-boundaries of the even EF-hand motifs. The hydrophobic core of CaM ('Ф-quatrefoil') consists of 10 amino acids in the N-lobe (and in the C-lobe): Phe16 (Phe89), Phe19 (Phe92), Ile27 (Ile100), Thr29 (Ala102), Leu32 (Leu105), Ile52 (Ile125), Val55 (Ala128), Ile63 (Val136), Phe65 (Tyr138), and Phe68 (Phe141); these residues do not intersect with the target-binding methionine residues. CaM belongs to the 'dynamic' group of EF-hand proteins, in which calcium and protein ligand binding causes only global conformational changes but does not alter the conservative 'black' and 'grey' clusters described in our earlier works (PLoS One. 2014; 9(10):e109287). The membership of CaM in the 'dynamic' group is determined by the triggering and protective methionine layer: Met51 (Met124), Met71 (Met144) and Met72 (Met145). HIGHLIGHTS Interchain interactions in the unique 35 CaM complex structures were analyzed. Methionine amino acids of the N- and C-lobes of CaM form triggering and protective layers. Interactions of the target terminal residues with these methionine layers are structurally identical. CaM belonging to the 'dynamic' group is determined by the triggering and protective methionine layer. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Alexander I Denesyuk
  - Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", Pushchino, Moscow Region, Russia
  - Structural Bioinformatics Laboratory, Biochemistry, InFLAMES Research Flagship Center, Faculty of Science and Engineering, Åbo Akademi University, Turku, Finland
- Sergei E Permyakov
  - Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", Pushchino, Moscow Region, Russia
- Eugene A Permyakov
  - Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", Pushchino, Moscow Region, Russia
- Mark S Johnson
  - Structural Bioinformatics Laboratory, Biochemistry, InFLAMES Research Flagship Center, Faculty of Science and Engineering, Åbo Akademi University, Turku, Finland
- Konstantin Denessiouk
  - Structural Bioinformatics Laboratory, Biochemistry, InFLAMES Research Flagship Center, Faculty of Science and Engineering, Åbo Akademi University, Turku, Finland
- Vladimir N Uversky
  - Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", Pushchino, Moscow Region, Russia
  - Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
6
Antonietti M, Gonzalez DJT, Djulbegovic M, Dayhoff GW, Uversky VN, Shields CL, Karp CL. Intrinsic disorder in PRAME and its role in uveal melanoma. Cell Commun Signal 2023; 21:222. [PMID: 37626310 PMCID: PMC10463658 DOI: 10.1186/s12964-023-01197-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/26/2023] [Accepted: 06/13/2023] [Indexed: 08/27/2023] Open
Abstract
INTRODUCTION The PReferentially expressed Antigen in MElanoma (PRAME) protein has been shown to be an independent biomarker for increased risk of metastasis in Class 1 uveal melanomas (UM). Intrinsically disordered proteins and regions of proteins (IDPs/IDPRs) are proteins or protein regions that do not have a well-defined three-dimensional structure and have been linked to neoplastic development. Our study aimed to evaluate the presence of intrinsic disorder in PRAME and the role these structureless regions have in PRAME(+) Class 1 UM. METHODS We conducted a bioinformatics study to characterize PRAME's propensity for intrinsic disorder. We first used the AlphaFold tool to qualitatively assess the protein structure of PRAME. Then we used the Compositional Profiler and a set of per-residue intrinsic disorder predictors to quantify the intrinsic disorder. The Database of Disordered Protein Prediction (D2P2) platform, IUPred, FuzDrop, fIDPnn, AUCpred, SPOT-Disorder2, and metapredict V2 allowed us to evaluate the potential functional disorder of PRAME. Additionally, we used the Search Tool for the Retrieval of Interacting Genes (STRING) to analyze PRAME's potential interactions with other proteins. RESULTS Our structural analysis showed that PRAME contains intrinsically disordered protein regions (IDPRs), which are structureless and flexible. We found that PRAME is significantly enriched with serine (p-value < 0.05), a disorder-promoting amino acid. PRAME was found to have an average disorder score of 16.49% (i.e., moderately disordered) across six per-residue intrinsic disorder predictors. Our IUPred analysis revealed the presence of disorder-to-order transition (DOT) regions in PRAME near the C-terminus of the protein (residues 475-509). The D2P2 platform predicted a region from approximately residue 140 to residue 175 to be highly concentrated with post-translational modifications (PTMs). FuzDrop predicted the PTM hot spot of PRAME to be a droplet-promoting region and an aggregation hotspot. Finally, our analysis using the STRING tool revealed that PRAME has significantly more interactions with other proteins than expected for randomly selected proteins of the same size, with the ability to interact with 84 different partners (STRING analysis result: p-value < 1.0 × 10⁻¹⁶; model confidence: 0.400). CONCLUSION Our study revealed that PRAME has IDPRs that are possibly linked to its functionality in the context of Class 1 UM. The regions of functionality (i.e., DOT regions, PTM sites, droplet-promoting regions, and aggregation hotspots) are localized to regions of high levels of disorder. PRAME has a complex protein-protein interaction (PPI) network that may be secondary to the structureless features of the polypeptide. Our findings contribute to our understanding of UM and suggest that IDPRs and DOT regions in PRAME may be targeted in developing new therapies for this aggressive cancer.
Affiliation(s)
- Michael Antonietti
  - Bascom Palmer Eye Institute, University of Miami, 900 NW 17th Street, Miami, FL, 33136, USA
- Mak Djulbegovic
  - Bascom Palmer Eye Institute, University of Miami, 900 NW 17th Street, Miami, FL, 33136, USA
- Guy W Dayhoff
  - Department of Chemistry, College of Art and Sciences, University of South Florida, Tampa, FL, 33612, USA
- Vladimir N Uversky
  - Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, 33612, USA
- Carol L Shields
  - Ocular Oncology Service, Wills Eye Hospital, Thomas Jefferson University, Philadelphia, PA, USA
- Carol L Karp
  - Bascom Palmer Eye Institute, University of Miami, 900 NW 17th Street, Miami, FL, 33136, USA.
7
Bao C, Lu C, Lin J, Gough J, Fang H. The dcGO Domain-Centric Ontology Database in 2023: New Website and Extended Annotations for Protein Structural Domains. J Mol Biol 2023; 435:168093. [PMID: 37061086 PMCID: PMC7614987 DOI: 10.1016/j.jmb.2023.168093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2022] [Revised: 03/24/2023] [Accepted: 04/06/2023] [Indexed: 04/17/2023]
Abstract
Protein structural domains have been less studied than full-length proteins in terms of ontology annotations. The dcGO database has filled this gap by providing mappings from protein domains to ontologies. The dcGO update in 2023 extends annotations for protein domains of multiple definitions (SCOP, Pfam, and InterPro) with commonly used ontologies that are categorised into functions, phenotypes, diseases, drugs, pathways, regulators, and hallmarks. This update adds new dimensions to the utility of both ontology and protein domain resources. A newly designed website at http://www.protdomainonto.pro/dcGO offers a more centralised and user-friendly way to access the dcGO database, with enhanced faceted search returning term- and domain-specific information pages. Users can navigate both ontology terms and annotated domains through improved ontology hierarchy browsing. A newly added facility enables domain-based ontology enrichment analysis.
Affiliation(s)
- Chaohui Bao
  - Shanghai Institute of Hematology, State Key Laboratory of Medical Genomics, National Research Center for Translational Medicine at Shanghai, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
- Chang Lu
  - MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, UK; MRC London Institute of Medical Sciences, Imperial College London, London W12 0HS, UK
- James Lin
  - High Performance Computing Center, Shanghai Jiao Tong University, Shanghai 200240, China
- Julian Gough
  - MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, UK
- Hai Fang
  - Shanghai Institute of Hematology, State Key Laboratory of Medical Genomics, National Research Center for Translational Medicine at Shanghai, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China.
8
Mokin YI, Gavrilova AA, Fefilova AS, Kuznetsova IM, Turoverov KK, Uversky VN, Fonin AV. Nucleolar- and Nuclear-Stress-Induced Membrane-Less Organelles: A Proteome Analysis through the Prism of Liquid-Liquid Phase Separation. Int J Mol Sci 2023; 24:11007. [PMID: 37446185 DOI: 10.3390/ijms241311007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2023] [Revised: 06/26/2023] [Accepted: 06/27/2023] [Indexed: 07/15/2023] Open
Abstract
Radical changes in the idea of the organization of intracellular space that occurred in the early 2010s made it possible to consider the formation and functioning of so-called membrane-less organelles (MLOs) based on a single physical principle: the liquid-liquid phase separation (LLPS) of biopolymers. Weak non-specific inter- and intramolecular interactions of disordered polymers, primarily intrinsically disordered proteins, and RNA, play a central role in the initiation and regulation of these processes. On the other hand, in some cases, the "maturation" of MLOs can be accompanied by a "liquid-gel" phase transition, where other types of interactions can play a significant role in the reorganization of their structure. In this work, we conducted a bioinformatics analysis of the propensity of the proteomes of two membrane-less organelles, formed in response to stress in the same compartment, for spontaneous phase separation and examined their intrinsic disorder predispositions. These MLOs, amyloid bodies (A-bodies) formed in the response to acidosis and heat shock and nuclear stress bodies (nSBs), are characterized by a partially overlapping composition, but show different functional activities and morphologies. We show that the proteomes of these biocondensates are differently enriched in proteins, and many have high potential for spontaneous LLPS that correlates with the different morphology and function of these organelles. The results of these analyses allowed us to evaluate the role of weak interactions in the formation and functioning of these important organelles.
Affiliation(s)
- Yakov I Mokin
  - Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg 194064, Russia
- Anastasia A Gavrilova
  - Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg 194064, Russia
- Anna S Fefilova
  - Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg 194064, Russia
- Irina M Kuznetsova
  - Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg 194064, Russia
- Konstantin K Turoverov
  - Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg 194064, Russia
- Vladimir N Uversky
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
  - USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
- Alexander V Fonin
  - Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg 194064, Russia
9
Di Nunzio F, Uversky VN, Mouland AJ. Biomolecular condensates: insights into early and late steps of the HIV-1 replication cycle. Retrovirology 2023; 20:4. [PMID: 37029379 PMCID: PMC10081342 DOI: 10.1186/s12977-023-00619-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/07/2022] [Accepted: 03/16/2023] [Indexed: 04/09/2023] Open
Abstract
A rapidly evolving understanding of phase separation in the biological and physical sciences has led to the redefining of virus-engineered replication compartments in many viruses with RNA genomes. Condensation of viral, host and genomic and subgenomic RNAs can take place to evade the innate immunity response and to help viral replication. Divergent viruses prompt liquid-liquid phase separation (LLPS) to invade the host cell. During HIV replication there are several steps involving LLPS. In this review, we characterize the ability of individual viral and host partners to assemble into biomolecular condensates (BMCs). Of note, bioinformatic analyses predict models of phase separation in line with several published observations. Importantly, viral BMCs contribute to function in key steps of retroviral replication. For example, reverse transcription takes place within nuclear BMCs, called HIV-MLOs, while during late replication steps, retroviral nucleocapsid acts as a driver or scaffold to recruit client viral components to aid the assembly of progeny virions. Overall, LLPS during viral infections represents a newly described biological event now appreciated in the virology field, which can also be considered as an alternative pharmacological target to current drug therapies, especially when viruses become resistant to antiviral treatment.
Affiliation(s)
- Francesca Di Nunzio
  - Advanced Molecular Virology Unit, Department of Virology, Institut Pasteur, Université Paris Cité, 75015, Paris, France
- Vladimir N Uversky
  - Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, 33612, USA
- Andrew J Mouland
  - Lady Davis Institute at the Jewish General Hospital, Montréal, QC, H3T 1E2, Canada.
  - Department of Microbiology and Immunology, McGill University, Montréal, QC, H3A 2B4, Canada.
  - Department of Medicine, McGill University, Montréal, QC, H4A 3J1, Canada.
10
Lu C, Zaucha J, Gam R, Fang H, Ben Smithers, Oates ME, Bernabe-Rubio M, Williams J, Zelenka N, Pandurangan AP, Tandon H, Shihab H, Kalaivani R, Sung M, Sardar AJ, Tzovoras BG, Danovi D, Gough J. Hypothesis-free phenotype prediction within a genetics-first framework. Nat Commun 2023; 14:919. [PMID: 36808136 PMCID: PMC9938118 DOI: 10.1038/s41467-023-36634-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2022] [Accepted: 02/10/2023] [Indexed: 02/19/2023] Open
Abstract
Cohort-wide sequencing studies have revealed that the largest category of variants is those deemed 'rare', even for the subset located in coding regions (99% of known coding variants are seen in less than 1% of the population). Associative methods give some understanding of how rare genetic variants influence disease and organism-level phenotypes. But here we show that additional discoveries can be made through a knowledge-based approach using protein domains and ontologies (function and phenotype) that considers all coding variants regardless of allele frequency. We describe an ab initio, genetics-first method making molecular knowledge-based interpretations for exome-wide non-synonymous variants for phenotypes at the organism and cellular level. By using this reverse approach, we identify plausible genetic causes for developmental disorders that have eluded other established methods and present molecular hypotheses for the causal genetics of 40 phenotypes generated from a direct-to-consumer genotype cohort. This system offers a chance to extract further discovery from genetic data after standard tools have been applied.
Affiliation(s)
- Chang Lu
  - MRC Laboratory of Molecular Biology, Cambridge Biomedical Campus, Francis Crick Avenue, Cambridge, CB2 0QH, UK
- Jan Zaucha
  - Department of Computer Science, University of Bristol, Bristol, BS8 1UB, UK
- Rihab Gam
  - MRC Laboratory of Molecular Biology, Cambridge Biomedical Campus, Francis Crick Avenue, Cambridge, CB2 0QH, UK
- Hai Fang
  - Department of Computer Science, University of Bristol, Bristol, BS8 1UB, UK
  - Shanghai Institute of Hematology, State Key Laboratory of Medical Genomics, National Research Centre for Translational Medicine at Shanghai, Ruijin Hospital affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Ben Smithers
  - Department of Computer Science, University of Bristol, Bristol, BS8 1UB, UK
- Matt E Oates
  - Department of Computer Science, University of Bristol, Bristol, BS8 1UB, UK
- Miguel Bernabe-Rubio
  - Centre for Gene Therapy and Regenerative Medicine, King's College London, Guy's Hospital, Floor 28, Tower Wing, Great Maze Pond, London, SE1 9RT, UK
- James Williams
  - Centre for Gene Therapy and Regenerative Medicine, King's College London, Guy's Hospital, Floor 28, Tower Wing, Great Maze Pond, London, SE1 9RT, UK
- Natalie Zelenka
  - Department of Computer Science, University of Bristol, Bristol, BS8 1UB, UK
- Arun Prasad Pandurangan
  - MRC Laboratory of Molecular Biology, Cambridge Biomedical Campus, Francis Crick Avenue, Cambridge, CB2 0QH, UK
- Himani Tandon
  - MRC Laboratory of Molecular Biology, Cambridge Biomedical Campus, Francis Crick Avenue, Cambridge, CB2 0QH, UK
- Hashem Shihab
  - Department of Computer Science, University of Bristol, Bristol, BS8 1UB, UK
- Raju Kalaivani
  - MRC Laboratory of Molecular Biology, Cambridge Biomedical Campus, Francis Crick Avenue, Cambridge, CB2 0QH, UK
- Minkyung Sung
  - MRC Laboratory of Molecular Biology, Cambridge Biomedical Campus, Francis Crick Avenue, Cambridge, CB2 0QH, UK
- Adam J Sardar
  - Department of Computer Science, University of Bristol, Bristol, BS8 1UB, UK
- Davide Danovi
  - Centre for Gene Therapy and Regenerative Medicine, King's College London, Guy's Hospital, Floor 28, Tower Wing, Great Maze Pond, London, SE1 9RT, UK
- Julian Gough
  - MRC Laboratory of Molecular Biology, Cambridge Biomedical Campus, Francis Crick Avenue, Cambridge, CB2 0QH, UK.
  - Department of Computer Science, University of Bristol, Bristol, BS8 1UB, UK.
11
Liaisons dangereuses: Intrinsic Disorder in Cellular Proteins Recruited to Viral Infection-Related Biocondensates. Int J Mol Sci 2023; 24:ijms24032151. [PMID: 36768473 PMCID: PMC9917183 DOI: 10.3390/ijms24032151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Revised: 01/11/2023] [Accepted: 01/19/2023] [Indexed: 01/25/2023] Open
Abstract
Liquid-liquid phase separation (LLPS) is responsible for the formation of so-called membrane-less organelles (MLOs) that are essential for the spatio-temporal organization of the cell. Intrinsically disordered proteins (IDPs) or regions (IDRs), either alone or in conjunction with nucleic acids, are involved in the formation of these intracellular condensates. Notably, viruses exploit LLPS to their own benefit to form viral replication compartments. Beyond giving rise to biomolecular condensates, viral proteins are also known to partition into cellular MLOs, thus raising the question as to whether these cellular phase-separating proteins are drivers of LLPS or behave as clients/regulators. Here, we focus on a set of eukaryotic proteins that are either sequestered in viral factories or colocalize with viral proteins within cellular MLOs, with the primary goal of gathering organized, predicted, and experimental information on these proteins, which constitute promising targets for innovative antiviral strategies. Using various computational approaches, we thoroughly investigated their disorder content and inherent propensity to undergo LLPS, along with their biological functions and interactivity networks. Results show that these proteins are on average, though to varying degrees, enriched in disorder, with their propensity for phase separation being correlated, as expected, with their disorder content. A trend, which awaits further validation, emerges whereby the most disordered proteins serve as drivers, while more ordered cellular proteins tend instead to be clients of viral factories. In light of their high disorder content and their annotated LLPS behavior, most proteins in our data set are drivers or co-drivers of molecular condensation, foreshadowing a key role of these cellular proteins in the scaffolding of viral infection-related MLOs.
|
12
|
Mohammed AS, Uversky VN. Intrinsic Disorder as a Natural Preservative: High Levels of Intrinsic Disorder in Proteins Found in the 2600-Year-Old Human Brain. BIOLOGY 2022; 11:1704. [PMID: 36552214 PMCID: PMC9775155 DOI: 10.3390/biology11121704] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/28/2022] [Revised: 11/22/2022] [Accepted: 11/23/2022] [Indexed: 11/29/2022]
Abstract
Proteomic analysis revealed the preservation of many proteins in the Heslington brain (brain tissue at least 2600 years old, uncovered within a skull excavated in 2008 from a pit in Heslington, Yorkshire, England). Five of these proteins, termed the "main proteins" (heavy, medium, and light neurofilament proteins [NFH, NFM, and NFL], glial fibrillary acidic protein [GFAP], and myelin basic protein [MBP]), are engaged in the formation of non-amyloid protein aggregates, such as intermediate filaments and myelin sheath. We used a wide spectrum of bioinformatics tools to evaluate the prevalence of functional disorder in several related sets of proteins, such as the main proteins and their 44 interactors, all other proteins identified in the Heslington brain, as well as the entire human proteome (20,317 manually curated proteins), and 10,611 brain proteins. These analyses revealed that all five main proteins, half of their interactors, and almost one third of the Heslington brain proteins are expected to be mostly disordered. Furthermore, most of the remaining Heslington brain proteins are expected to contain sizable levels of disorder. This is contrary to the expected substantial (if not complete) elimination of the disordered proteins from the Heslington brain. Therefore, it seems that the intrinsic disorder of NFH, NFM, NFL, GFAP, and MBP, their interactors, and many other proteins might play a crucial role in preserving the Heslington brain by forming tightly folded brain protein aggregates, in which different parts are glued together via disorder-to-order transitions.
Affiliation(s)
- Aaron S. Mohammed
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Blvd. MDC07, Tampa, FL 33612, USA
| | - Vladimir N. Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Blvd. MDC07, Tampa, FL 33612, USA
- USF Health Byrd Alzheimer’s Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
|
13
|
Dhulipala S, Uversky VN. Looking at the Pathogenesis of the Rabies Lyssavirus Strain Pasteur Vaccins through a Prism of the Disorder-Based Bioinformatics. Biomolecules 2022; 12:1436. [PMID: 36291645 PMCID: PMC9599798 DOI: 10.3390/biom12101436] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Revised: 09/30/2022] [Accepted: 10/04/2022] [Indexed: 11/28/2022]
Abstract
Rabies is a neurological disease that causes between 40,000 and 70,000 deaths every year. Once a rabies patient has become symptomatic, there is no effective treatment for the illness, and in unvaccinated individuals, the case-fatality rate of rabies is close to 100%. French scientists Louis Pasteur and Émile Roux developed the first vaccine for rabies in 1885. If administered before the virus reaches the brain, the modern rabies vaccine imparts long-lasting immunity to the virus and saves more than 250,000 people every year. However, the rabies virus can suppress the host's immune response once it has entered the cells of the brain, making death likely. This study aimed to make use of disorder-based proteomics and bioinformatics to determine the potential impact that intrinsically disordered protein regions (IDPRs) in the proteome of the rabies virus might have on the infectivity and lethality of the disease. This study used the proteome of the Rabies lyssavirus (RABV) strain Pasteur Vaccins (PV), one of the best-understood strains due to its use in the first rabies vaccine, as a model. The data reported in this study are in line with the hypothesis that high levels of intrinsic disorder in the phosphoprotein (P-protein) and nucleoprotein (N-protein) allow them to participate in the creation of Negri bodies and might help this virus to suppress the antiviral immune response in the host cells. Additionally, the study suggests that there could be a link between disorder in the matrix (M) protein and the modulation of viral transcription. The disordered regions in the M-protein might have a possible role in initiating viral budding within the cell. Furthermore, we checked the prevalence of functional disorder in a set of 37 host proteins directly involved in the interaction with the RABV proteins. The hope is that these new insights will aid in the development of treatments for rabies that are effective after infection.
Affiliation(s)
- Surya Dhulipala
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
| | - Vladimir N. Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
- USF Health Byrd Alzheimer’s Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
- Protein Research Group, Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center “Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences”, 142290 Pushchino, Moscow Region, Russia
|
14
|
Djulbegovic MB, Taylor DJ, Uversky VN, Galor A, Shields CL, Karp CL. Intrinsic Disorder in BAP1 and Its Association with Uveal Melanoma. Genes (Basel) 2022; 13:1703. [PMID: 36292588 PMCID: PMC9601668 DOI: 10.3390/genes13101703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2022] [Revised: 09/20/2022] [Accepted: 09/21/2022] [Indexed: 11/16/2022]
Abstract
Background: Specific subvariants of uveal melanoma (UM) are associated with increased rates of metastasis compared to other subvariants. BRCA1 (BReast CAncer gene 1)-associated protein-1 (BAP1) is encoded by a gene that has been linked to aggressive behavior in UM. Methods: We evaluated BAP1 for the presence of intrinsically disordered protein regions (IDPRs) and its protein-protein interactions (PPI). We evaluated specific sequence-based features of the BAP1 protein using a set of bioinformatic databases, predictors, and algorithms. Results: We show that BAP1’s structure contains extensive IDPRs: it is highly enriched in proline residues (the most disorder-promoting amino acid; p-value < 0.05), its average percent of predicted disordered residues (PPDR) is 57.34%, and it contains 9 disorder-based binding sites (i.e., molecular recognition features (MoRFs)). BAP1’s intrinsic disorder allows it to engage in a complex PPI network with at least 49 partners (p-value < 1.0 × 10^−16). Conclusion: These findings show that BAP1 contains IDPRs and an intricate PPI network. Mutations in UM that are associated with the BAP1 gene may alter the function of the IDPRs embedded into its structure. These findings advance the understanding of UM and may provide a target for potential novel therapies to treat this aggressive neoplasm.
Affiliation(s)
| | - David J. Taylor
- Bascom Palmer Eye Institute, University of Miami, Miami, FL 33136, USA
| | - Vladimir N. Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer’s Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33613, USA
| | - Anat Galor
- Bascom Palmer Eye Institute, University of Miami, Miami, FL 33136, USA
- Ophthalmology, Miami Veterans Affairs Medical Center, Miami, FL 33136, USA
- Research Services, Miami Veterans Affairs Medical Center, Miami, FL 33136, USA
| | - Carol L. Shields
- Ocular Oncology Service, Wills Eye Hospital, Thomas Jefferson University, Philadelphia, PA 19107, USA
| | - Carol L. Karp
- Bascom Palmer Eye Institute, University of Miami, Miami, FL 33136, USA
|
15
|
Intrinsically disordered BMP4 morphogen and the beak of the finch: Co-option of an ancient axial patterning system. Int J Biol Macromol 2022; 219:366-373. [PMID: 35931296 DOI: 10.1016/j.ijbiomac.2022.07.203] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/10/2022] [Accepted: 07/25/2022] [Indexed: 12/24/2022]
Abstract
Darwin's finches, with the primary diversity in the shape and size of their beaks, represent an excellent model system to study speciation and adaptive evolution. It is generally held that evolution depends on the natural selection of heritable phenotypic variations originating from genetic mutations. However, it is now increasingly evident that epigenetic transgenerational inheritance of phenotypic variation can also guide evolutionary change. Several studies have shown that the bone morphogenetic protein BMP4 is a major driver of beak morphology. A recent study explored the morphological, genetic, and epigenetic differences in adjacent "urban" and "rural" populations of two species of Darwin's ground finches on the Galápagos Islands and revealed significant changes in methylation patterns in several genes, including those involved in the BMP/TGFβ pathway, in sperm DNA compared to erythrocyte DNA. These observations indicated that epigenetic changes caused by environmental fluctuations can be passed on to the offspring. Nonetheless, the mechanism by which dysregulated expression of BMP4 impacts beak morphology remains poorly understood. Here, we show that BMP4 is an intrinsically disordered protein and present a causal link between epigenetic changes, BMP4 dysregulation, and the evolution of the beak of the finch by natural selection.
|
16
|
Redwan EM, Aljadawi AA, Uversky VN. Hepatitis C Virus Infection and Intrinsic Disorder in the Signaling Pathways Induced by Toll-Like Receptors. BIOLOGY 2022; 11:1091. [PMID: 36101469 PMCID: PMC9312352 DOI: 10.3390/biology11071091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2022] [Revised: 07/07/2022] [Accepted: 07/19/2022] [Indexed: 11/23/2022]
Abstract
In this study, we examined the interplay between protein intrinsic disorder, hepatitis C virus (HCV) infection, and signaling pathways induced by Toll-like receptors (TLRs). To this end, 10 HCV proteins, 10 human TLRs, and 41 proteins from the TLR-induced downstream pathways were analyzed for the prevalence of intrinsic disorder. Mapping of the intrinsic disorder to the HCV-TLR interactome and to the TLR-based pathways of the human innate immune response to HCV infection demonstrates that substantial levels of intrinsic disorder are characteristic of proteins involved in the regulation and execution of these innate immunity pathways and in HCV-TLR interaction. Disordered regions, being commonly enriched in sites of various posttranslational modifications, may play important functional roles by promoting protein-protein interactions and supporting the binding of the analyzed proteins to other partners such as nucleic acids. It seems that this system represents an important illustration of the role of intrinsic disorder in virus-host warfare.
Affiliation(s)
- Elrashdy M. Redwan
- Biological Science Department, Faculty of Science, King Abdulaziz University, P.O. Box 80203, Jeddah 21589, Saudi Arabia; (E.M.R.); (A.A.A.)
- Therapeutic and Protective Proteins Laboratory, Protein Research Department, Genetic Engineering and Biotechnology Research Institute, City for Scientific Research and Technology Applications, New Borg EL-Arab, Alexandria 21934, Egypt
| | - Abdullah A. Aljadawi
- Biological Science Department, Faculty of Science, King Abdulaziz University, P.O. Box 80203, Jeddah 21589, Saudi Arabia; (E.M.R.); (A.A.A.)
| | - Vladimir N. Uversky
- Biological Science Department, Faculty of Science, King Abdulaziz University, P.O. Box 80203, Jeddah 21589, Saudi Arabia; (E.M.R.); (A.A.A.)
- Department of Molecular Medicine and USF Health Byrd Alzheimer’s Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
|
17
|
What Is Parvalbumin for? Biomolecules 2022; 12:biom12050656. [PMID: 35625584 PMCID: PMC9138604 DOI: 10.3390/biom12050656] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 04/05/2022] [Revised: 04/25/2022] [Accepted: 04/28/2022] [Indexed: 12/28/2022]
Abstract
Parvalbumin (PA) is a small, acidic, mostly cytosolic Ca2+-binding protein of the EF-hand superfamily. Structural and physical properties of PA are well studied but recently two highly conserved structural motifs consisting of three amino acids each (clusters I and II), which contribute to the hydrophobic core of the EF-hand domains, have been revealed. Despite several decades of studies, physiological functions of PA are still poorly known. Since no target proteins have been revealed for PA so far, it is believed that PA acts as a slow calcium buffer. Numerous experiments on various muscle systems have shown that PA accelerates the relaxation of fast skeletal muscles. It has been found that oxidation of PA by reactive oxygen species (ROS) is conformation-dependent and one more physiological function of PA in fast muscles could be a protection of these cells from ROS. PA is thought to regulate calcium-dependent metabolic and electric processes within the population of gamma-aminobutyric acid (GABA) neurons. Genetic elimination of PA results in changes in GABAergic synaptic transmission. Mammalian oncomodulin (OM), the β isoform of PA, is expressed mostly in cochlear outer hair cells and in vestibular hair cells. OM knockout mice lose their hearing after 3–4 months. It was suggested that, in sensory cells, OM maintains auditory function, most likely affecting outer hair cells’ motility mechanisms.
|
18
|
Duncan A, Barry K, Daum C, Eloe-Fadrosh E, Roux S, Schmidt K, Tringe SG, Valentin KU, Varghese N, Salamov A, Grigoriev IV, Leggett RM, Moulton V, Mock T. Metagenome-assembled genomes of phytoplankton microbiomes from the Arctic and Atlantic Oceans. MICROBIOME 2022; 10:67. [PMID: 35484634 PMCID: PMC9047304 DOI: 10.1186/s40168-022-01254-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 04/15/2021] [Accepted: 02/28/2022] [Indexed: 06/14/2023]
Abstract
BACKGROUND Phytoplankton communities significantly contribute to global biogeochemical cycles of elements and underpin marine food webs. Although their uncultured genomic diversity has been estimated by planetary-scale metagenome sequencing and subsequent reconstruction of metagenome-assembled genomes (MAGs), this approach has yet to be applied for complex phytoplankton microbiomes from polar and non-polar oceans consisting of microbial eukaryotes and their associated prokaryotes. RESULTS Here, we have assembled MAGs from chlorophyll a maximum layers in the surface of the Arctic and Atlantic Oceans enriched for species associations (microbiomes) with a focus on pico- and nanophytoplankton and their associated heterotrophic prokaryotes. From 679 Gbp and estimated 50 million genes in total, we recovered 143 MAGs of medium to high quality. Although there was a strict demarcation between Arctic and Atlantic MAGs, adjacent sampling stations in each ocean had 51-88% MAGs in common with most species associations between Prasinophytes and Proteobacteria. Phylogenetic placement revealed eukaryotic MAGs to be more diverse in the Arctic whereas prokaryotic MAGs were more diverse in the Atlantic Ocean. Approximately 70% of protein families were shared between Arctic and Atlantic MAGs for both prokaryotes and eukaryotes. However, eukaryotic MAGs had more protein families unique to the Arctic whereas prokaryotic MAGs had more families unique to the Atlantic. CONCLUSION Our study provides a genomic context to complex phytoplankton microbiomes to reveal that their community structure was likely driven by significant differences in environmental conditions between the polar Arctic and warm surface waters of the tropical and subtropical Atlantic Ocean.
Affiliation(s)
- Anthony Duncan
- School of Computing Sciences, University of East Anglia, Norwich Research Park, Norwich, NR47TJ, UK
| | - Kerrie Barry
- US Department of Energy Joint Genome Institute, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Chris Daum
- US Department of Energy Joint Genome Institute, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Emiley Eloe-Fadrosh
- US Department of Energy Joint Genome Institute, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Simon Roux
- US Department of Energy Joint Genome Institute, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Katrin Schmidt
- School of Environmental Sciences, University of East Anglia, Norwich Research Park, Norwich, NR47TJ, UK
| | - Susannah G Tringe
- US Department of Energy Joint Genome Institute, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Klaus U Valentin
- Alfred-Wegener Institute for Polar and Marine Research, Am Handelshafen 12, 27570, Bremerhaven, Germany
| | - Neha Varghese
- US Department of Energy Joint Genome Institute, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Asaf Salamov
- US Department of Energy Joint Genome Institute, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | - Igor V Grigoriev
- US Department of Energy Joint Genome Institute, 1 Cyclotron Road, Berkeley, CA, 94720, USA
| | | | - Vincent Moulton
- School of Computing Sciences, University of East Anglia, Norwich Research Park, Norwich, NR47TJ, UK
| | - Thomas Mock
- School of Environmental Sciences, University of East Anglia, Norwich Research Park, Norwich, NR47TJ, UK.
|
19
|
Genome-Wide Survey and Development of the First Microsatellite Markers Database (AnCorDB) in Anemone coronaria L. Int J Mol Sci 2022; 23:ijms23063126. [PMID: 35328546 PMCID: PMC8949970 DOI: 10.3390/ijms23063126] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 02/19/2022] [Revised: 03/08/2022] [Accepted: 03/10/2022] [Indexed: 12/31/2022]
Abstract
Anemone coronaria L. (2n = 2x = 16) is a perennial, allogamous, highly heterozygous plant marketed as a cut flower or for gardens. Due to its large genome size, limited efforts have been made to develop species-specific molecular markers. We obtained the first draft genome of the species by Illumina sequencing of an androgenetic haploid plant of the commercial line “MISTRAL® Magenta”. The genome assembly was obtained by applying the MEGAHIT pipeline and consisted of 2 × 10^6 scaffolds. The SciRoKo SSR (Simple Sequence Repeats)-search module identified 401,822 perfect and 188,987 imperfect microsatellite motifs. Subsequently, we developed a user-friendly “Anemone coronaria Microsatellite DataBase” (AnCorDB), which incorporates the Primer3 script, making it possible to design primer pairs for downstream application of the identified SSR markers. Eight genotypes belonging to eight cultivars were used to validate 62 SSRs, and a subset of markers was applied for fingerprinting each cultivar, as well as to assess their intra-cultivar variability. The newly developed microsatellite markers will find application in Breeding Rights disputes, developing genetic maps, marker-assisted breeding (MAS) strategies, as well as phylogenetic studies.
|
20
|
Djulbegovic M, Uversky VN. The aqueous humor proteome is intrinsically disordered. Biochem Biophys Rep 2022; 29:101202. [PMID: 35128080 PMCID: PMC8808082 DOI: 10.1016/j.bbrep.2022.101202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2021] [Revised: 01/03/2022] [Accepted: 01/04/2022] [Indexed: 11/14/2022]
Abstract
Our study demonstrated that intrinsic disorder is abundant in the aqueous humor. The 749 aqueous proteins analyzed were enriched with disorder-promoting residues. 208 aqueous humor proteins were predicted to be highly intrinsically disordered. Misregulation of IDPs may promote pathology in the aqueous humor. IDPs in aqueous humor may serve as future targets for novel therapeutics.
|
21
|
Severn-Ellis AA, Schoeman MH, Bayer PE, Hane JK, Rees DJG, Edwards D, Batley J. Genome Analysis of the Broad Host Range Necrotroph Nalanthamala psidii Highlights Genes Associated With Virulence. FRONTIERS IN PLANT SCIENCE 2022; 13:811152. [PMID: 35283890 PMCID: PMC8914235 DOI: 10.3389/fpls.2022.811152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2021] [Accepted: 01/18/2022] [Indexed: 06/14/2023]
Abstract
Guava wilt disease is caused by the fungus Nalanthamala psidii. The wilt disease results in large-scale destruction of orchards in South Africa, Taiwan, and several Southeast Asian countries. De novo assembly, annotation, and in-depth analysis of the N. psidii genome were carried out to facilitate the identification of characteristics associated with pathogenicity and pathogen evolution. The predicted secretome revealed a range of CAZymes, proteases, lipases and peroxidases associated with plant cell wall degradation, nutrient acquisition, and disease development. Further analysis of the N. psidii carbohydrate-active enzyme profile exposed the broad-spectrum necrotrophic lifestyle of the pathogen, which was corroborated by the identification of putative effectors and secondary metabolites with the potential to induce tissue necrosis and cell surface-dependent immune responses. Putative regulatory proteins including transcription factors and kinases were identified in addition to transporters potentially involved in the secretion of secondary metabolites. Transporters identified included important ABC and MFS transporters involved in the efflux of fungicides. Analysis of the repetitive landscape and the detection of mechanisms linked to reproduction such as het and mating genes rendered insights into the biological complexity and evolutionary potential of N. psidii as guava pathogen. Hence, the assembly and annotation of the N. psidii genome provided a valuable platform to explore the pathogenic potential and necrotrophic lifestyle of the guava wilt pathogen.
Affiliation(s)
- Anita A. Severn-Ellis
- School of Biological Sciences, Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia
- Aquaculture Research and Development, Department of Primary Industries and Regional Development, Indian Ocean Marine Research Centre, Watermans Bay, WA, Australia
| | - Maritha H. Schoeman
- Institute for Tropical and Subtropical Crops, Agricultural Research Council, Nelspruit, South Africa
| | - Philipp E. Bayer
- School of Biological Sciences, Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia
| | - James K. Hane
- Centre for Crop and Disease Management, School of Molecular and Life Sciences, Curtin University, Bentley, WA, Australia
| | - D. Jasper G. Rees
- Agricultural Research Council, Biotechnology Platform, Pretoria, South Africa
- Botswana University of Agriculture and Natural Resources, Gaborone, Botswana
| | - David Edwards
- School of Biological Sciences, Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia
| | - Jacqueline Batley
- School of Biological Sciences, Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia
|
22
|
Erythropoietin Interacts with Specific S100 Proteins. Biomolecules 2022; 12:biom12010120. [PMID: 35053268 PMCID: PMC8773746 DOI: 10.3390/biom12010120] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/01/2021] [Revised: 01/10/2022] [Accepted: 01/10/2022] [Indexed: 02/06/2023]
Abstract
Erythropoietin (EPO) is a clinically significant four-helical cytokine, exhibiting erythropoietic, cytoprotective, immunomodulatory, and cancer-promoting activities. Despite vast knowledge on its signaling pathways and physiological effects, extracellular factors regulating EPO activity remain underexplored. Here we show by surface plasmon resonance spectroscopy that, among eighteen Ca2+-binding proteins of the S100 protein family studied, only S100A2, S100A6 and S100P proteins specifically recognize EPO, with equilibrium dissociation constants ranging from 81 nM to 0.5 µM. The interactions occur exclusively under calcium excess. Bioinformatics analysis showed that the EPO-S100 interactions could be relevant to progression of neoplastic diseases, including cancer, and other diseases. Detailed knowledge of the distinct physiological effects of the EPO-S100 interactions could favor development of more efficient clinical applications of EPO. Summing up our data with previous findings, we conclude that S100 proteins are potentially able to directly affect functional activities of specific members of all families of four-helical cytokines, and cytokines of other structural superfamilies.
|
23
|
Zhang F, Song H, Zeng M, Wu FX, Li Y, Pan Y, Li M. A Deep Learning Framework for Gene Ontology Annotations With Sequence- and Network-Based Information. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:2208-2217. [PMID: 31985440 DOI: 10.1109/tcbb.2020.2968882] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 06/10/2023]
Abstract
Knowledge of protein functions plays an important role in biology and medicine. With the rapid development of high-throughput technologies, a huge number of proteins have been discovered. However, there are a great number of proteins without functional annotations. A protein usually has multiple functions, and some functions or biological processes require interactions of a plurality of proteins. Additionally, Gene Ontology provides a useful classification for protein functions and contains more than 40,000 terms. We propose a deep learning framework called DeepGOA to predict protein functions with protein sequences and protein-protein interaction (PPI) networks. For protein sequences, we extract two types of information: sequence semantic information and subsequence-based features. We use the word2vec technique to numerically represent protein sequences, and utilize a bidirectional long short-term memory network (Bi-LSTM) and a multi-scale convolutional neural network (multi-scale CNN) to obtain the global and local semantic features of protein sequences, respectively. Additionally, we use the InterPro tool to scan protein sequences for extracting subsequence-based information, such as domains and motifs. Then, the information is plugged into a neural network to generate high-quality features. For the PPI network, the DeepWalk algorithm is applied to generate its embedding information. Then the two types of features are concatenated together to predict protein functions. To evaluate the performance of DeepGOA, several different evaluation methods and metrics are utilized. The experimental results show that DeepGOA outperforms DeepGO and BLAST.
|
24
|
Ma W, Liu Z, Beier S, Houben A, Carpentier S. Identification of rye B chromosome-associated peptides by mass spectrometry. THE NEW PHYTOLOGIST 2021; 230:2179-2185. [PMID: 33503271 DOI: 10.1111/nph.17238] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/18/2020] [Accepted: 01/20/2021] [Indexed: 06/12/2023]
Abstract
B chromosomes (Bs) are supernumerary dispensable components of the standard genome (A chromosomes, As) that have been found in many eukaryotes. So far, it is unknown whether the B-derived transcripts translate to proteins or if the host proteome is changed due to the presence of Bs. Comparative mass spectrometry was performed using the protein samples isolated from shoots of rye plants with and without Bs. We aimed to identify B-associated peptides and analyzed the effects of Bs on the total proteome. Our comparative proteome analysis demonstrates that the presence of rye Bs affects the total proteome including different biological function processes. We found 319 of 16 776 quantified features in at least three out of five +B plants but not in 0B plants; 31 of 319 features were identified as B-associated peptide features. According to our data mining, one B-specific protein fragment showed similarity to a glycine-rich RNA binding protein which differed from its A-paralogue by two amino acid insertions. Our result represents a milestone in B chromosome research, because this is the first report to demonstrate the existence of Bs changing the proteome of the host.
Affiliation(s)
- Wei Ma
- Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu, 214122, China
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstrasse 3, Stadt Seeland, 06466, Germany
| | - ZhaoJun Liu
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstrasse 3, Stadt Seeland, 06466, Germany
- Microelement Research Center/Key Laboratory of Arable Land Conservation (Middle and Lower Reaches of Yangtze River), Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, China
- School of Life Sciences Life, Science Center Weihenstephan, Crop Physiology, Technical University Munich, Alte Akademie 12, Freising, 85354, Germany
| | - Sebastian Beier
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstrasse 3, Stadt Seeland, 06466, Germany
| | - Andreas Houben
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstrasse 3, Stadt Seeland, 06466, Germany
| | - Sebastien Carpentier
- Department of Biosystems, KU Leuven, Willem Decroylaan 42, 2455-3001 Leuven, Belgium
- SYBIOMA, KULeuven, Herestraat 49, Leuven, 3000, Belgium
- Genetic Resources, Bioversity International, Willem Decroylaan 42, 2455-3001 Leuven, Belgium
|
25
|
Horváthová L, Žárský V, Pánek T, Derelle R, Pyrih J, Motyčková A, Klápšťová V, Vinopalová M, Marková L, Voleman L, Klimeš V, Petrů M, Vaitová Z, Čepička I, Hryzáková K, Harant K, Gray MW, Chami M, Guilvout I, Francetic O, Franz Lang B, Vlček Č, Tsaousis AD, Eliáš M, Doležal P. Analysis of diverse eukaryotes suggests the existence of an ancestral mitochondrial apparatus derived from the bacterial type II secretion system. Nat Commun 2021; 12:2947. [PMID: 34011950 PMCID: PMC8134430 DOI: 10.1038/s41467-021-23046-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 06/06/2018] [Accepted: 03/22/2021] [Indexed: 12/14/2022]
Abstract
The type 2 secretion system (T2SS) is present in some Gram-negative eubacteria and used to secrete proteins across the outer membrane. Here we report that certain representative heteroloboseans, jakobids, malawimonads and hemimastigotes unexpectedly possess homologues of core T2SS components. We show that at least some of them are present in mitochondria, and their behaviour in biochemical assays is consistent with the presence of a mitochondrial T2SS-derived system (miT2SS). We additionally identified 23 protein families co-occurring with miT2SS in eukaryotes. Seven of these proteins could be directly linked to the core miT2SS by functional data and/or sequence features, whereas others may represent different parts of a broader functional pathway, possibly also involving the peroxisome. Its distribution in eukaryotes and phylogenetic evidence together indicate that the miT2SS-centred pathway is an ancestral eukaryotic trait. Our findings thus have direct implications for the functional properties of the early mitochondrion. Bacteria use the type 2 secretion system to secrete enzymes and toxins across the outer membrane to the environment. Here the authors analyse the T2SS pathway in three protist lineages and suggest that the early mitochondrion may have been capable of secreting proteins into the cytosol.
Affiliation(s)
- Lenka Horváthová
  - Faculty of Science, Department of Parasitology, Charles University, BIOCEV, Vestec, Czech Republic
- Vojtěch Žárský
  - Faculty of Science, Department of Parasitology, Charles University, BIOCEV, Vestec, Czech Republic
- Tomáš Pánek
  - Faculty of Science, Department of Biology and Ecology, University of Ostrava, Ostrava, Czech Republic
  - Faculty of Science, Department of Zoology, Charles University, Prague 2, Czech Republic
- Romain Derelle
  - School of Biosciences, University of Birmingham, Edgbaston, UK
- Jan Pyrih
  - Laboratory of Molecular & Evolutionary Parasitology, RAPID group, School of Biosciences, University of Kent, Canterbury, UK
  - Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic
- Alžběta Motyčková
  - Faculty of Science, Department of Parasitology, Charles University, BIOCEV, Vestec, Czech Republic
- Veronika Klápšťová
  - Faculty of Science, Department of Parasitology, Charles University, BIOCEV, Vestec, Czech Republic
- Martina Vinopalová
  - Faculty of Science, Department of Parasitology, Charles University, BIOCEV, Vestec, Czech Republic
- Lenka Marková
  - Faculty of Science, Department of Parasitology, Charles University, BIOCEV, Vestec, Czech Republic
- Luboš Voleman
  - Faculty of Science, Department of Parasitology, Charles University, BIOCEV, Vestec, Czech Republic
- Vladimír Klimeš
  - Faculty of Science, Department of Biology and Ecology, University of Ostrava, Ostrava, Czech Republic
- Markéta Petrů
  - Faculty of Science, Department of Parasitology, Charles University, BIOCEV, Vestec, Czech Republic
- Zuzana Vaitová
  - Faculty of Science, Department of Parasitology, Charles University, BIOCEV, Vestec, Czech Republic
- Ivan Čepička
  - Faculty of Science, Department of Zoology, Charles University, Prague 2, Czech Republic
- Klára Hryzáková
  - Faculty of Science, Department of Genetics and Microbiology, Charles University, Prague 2, Czech Republic
- Karel Harant
  - Faculty of Science, Proteomic core facility, Charles University, BIOCEV, Vestec, Czech Republic
- Michael W Gray
  - Department of Biochemistry and Molecular Biology and Centre for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, NS, Canada
- Mohamed Chami
  - Center for Cellular Imaging and NanoAnalytics, University of Basel, Basel, Switzerland
- Ingrid Guilvout
  - Biochemistry of Macromolecular Interactions Unit, Department of Structural Biology and Chemistry, Institut Pasteur, CNRS UMR3528, Paris, France
- Olivera Francetic
  - Biochemistry of Macromolecular Interactions Unit, Department of Structural Biology and Chemistry, Institut Pasteur, CNRS UMR3528, Paris, France
- B Franz Lang
  - Robert Cedergren Centre for Bioinformatics and Genomics, Département de Biochimie, Université de Montréal, Montreal, QC, Canada
- Čestmír Vlček
  - Institute of Molecular Genetics, Czech Academy of Sciences, Prague 4, Czech Republic
- Anastasios D Tsaousis
  - Laboratory of Molecular & Evolutionary Parasitology, RAPID group, School of Biosciences, University of Kent, Canterbury, UK
- Marek Eliáš
  - Faculty of Science, Department of Biology and Ecology, University of Ostrava, Ostrava, Czech Republic
- Pavel Doležal
  - Faculty of Science, Department of Parasitology, Charles University, BIOCEV, Vestec, Czech Republic
26
Zohra Smaili F, Tian S, Roy A, Alazmi M, Arold ST, Mukherjee S, Scott Hefty P, Chen W, Gao X. QAUST: Protein Function Prediction Using Structure Similarity, Protein Interaction, and Functional Motifs. Genomics Proteomics Bioinformatics 2021; 19:998-1011. [PMID: 33631427 PMCID: PMC9403031 DOI: 10.1016/j.gpb.2021.02.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/11/2018] [Revised: 04/03/2019] [Accepted: 05/17/2019] [Indexed: 11/25/2022]
Abstract
The number of available protein sequences in public databases is increasing exponentially. However, a significant percentage of these sequences lack functional annotation, which is essential for the understanding of how biological systems operate. Here, we propose a novel method, Quantitative Annotation of Unknown STructure (QAUST), to infer protein functions, specifically Gene Ontology (GO) terms and Enzyme Commission (EC) numbers. QAUST uses three sources of information: structure information encoded by global and local structure similarity search, biological network information inferred by protein–protein interaction data, and sequence information extracted from functionally discriminative sequence motifs. These three pieces of information are combined by consensus averaging to make the final prediction. Our approach has been tested on 500 protein targets from the Critical Assessment of Functional Annotation (CAFA) benchmark set. The results show that our method provides accurate functional annotation and outperforms other prediction methods based on sequence similarity search or threading. We further demonstrate that a previously unknown function of human tripartite motif-containing 22 (TRIM22) protein predicted by QAUST can be experimentally validated.
Affiliation(s)
- Fatima Zohra Smaili
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
- Shuye Tian
  - Department of Biology, Southern University of Science and Technology of China (SUSTC), Shenzhen 518055, China
- Ambrish Roy
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Meshari Alazmi
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
  - College of Computer Science and Engineering, University of Hail, Hail 55476, Saudi Arabia
- Stefan T Arold
  - Biological and Environmental Sciences and Engineering (BESE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
- Srayanta Mukherjee
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- P Scott Hefty
  - Department of Molecular Bioscience, University of Kansas, Lawrence, KS 66047, USA
- Wei Chen
  - Department of Biology, Southern University of Science and Technology of China (SUSTC), Shenzhen 518055, China
- Xin Gao
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
27
Grinev VV, Barneh F, Ilyushonak IM, Nakjang S, Smink J, van Oort A, Clough R, Seyani M, McNeill H, Reza M, Martinez-Soria N, Assi SA, Ramanouskaya TV, Bonifer C, Heidenreich O. RUNX1/RUNX1T1 mediates alternative splicing and reorganises the transcriptional landscape in leukemia. Nat Commun 2021; 12:520. [PMID: 33483506 PMCID: PMC7822815 DOI: 10.1038/s41467-020-20848-z] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 08/21/2019] [Accepted: 12/14/2020] [Indexed: 01/30/2023] Open
Abstract
The fusion oncogene RUNX1/RUNX1T1 encodes an aberrant transcription factor, which plays a key role in the initiation and maintenance of acute myeloid leukemia. Here we show that the RUNX1/RUNX1T1 oncogene is a regulator of alternative RNA splicing in leukemic cells. The comprehensive analysis of RUNX1/RUNX1T1-associated splicing events identifies two principal mechanisms that underlie the differential production of RNA isoforms: (i) RUNX1/RUNX1T1-mediated regulation of alternative transcription start site selection, and (ii) direct or indirect control of the expression of genes encoding splicing factors. The first mechanism leads to the expression of RNA isoforms with alternative structure of the 5'-UTR regions. The second mechanism generates alternative transcripts with new junctions between internal cassettes and constitutive exons. We also show that RUNX1/RUNX1T1-mediated differential splicing affects several functional groups of genes and produces proteins with unique conserved domain structures. In summary, this study reveals alternative splicing as an important component of transcriptome re-organization in leukemia by an aberrant transcriptional regulator.
Affiliation(s)
- Vasily V. Grinev
  - Department of Genetics, Faculty of Biology, Belarusian State University, 220030 Minsk, Republic of Belarus
- Farnaz Barneh
  - Princess Maxima Center for Pediatric Oncology, 3584 CS Utrecht, The Netherlands
- Ilya M. Ilyushonak
  - Department of Genetics, Faculty of Biology, Belarusian State University, 220030 Minsk, Republic of Belarus
- Sirintra Nakjang
  - Wolfson Childhood Cancer Research Centre, Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, NE1 7RU UK
- Job Smink
  - Princess Maxima Center for Pediatric Oncology, 3584 CS Utrecht, The Netherlands
- Anita van Oort
  - Princess Maxima Center for Pediatric Oncology, 3584 CS Utrecht, The Netherlands
- Richard Clough
  - Wolfson Childhood Cancer Research Centre, Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, NE1 7RU UK
- Michael Seyani
  - Wolfson Childhood Cancer Research Centre, Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, NE1 7RU UK
- Hesta McNeill
  - Wolfson Childhood Cancer Research Centre, Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, NE1 7RU UK
- Mojgan Reza
  - Wolfson Childhood Cancer Research Centre, Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, NE1 7RU UK
- Natalia Martinez-Soria
  - Wolfson Childhood Cancer Research Centre, Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, NE1 7RU UK
- Salam A. Assi
  - Institute of Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham, B15 2TT UK
- Tatsiana V. Ramanouskaya
  - Department of Genetics, Faculty of Biology, Belarusian State University, 220030 Minsk, Republic of Belarus
- Constanze Bonifer
  - Institute of Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham, B15 2TT UK
- Olaf Heidenreich
  - Princess Maxima Center for Pediatric Oncology, 3584 CS Utrecht, The Netherlands
  - Wolfson Childhood Cancer Research Centre, Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, NE1 7RU UK
  - Newcastle University Centre for Cancer, Newcastle University, Newcastle upon Tyne, NE1 7RU UK
28
Yazhini A, Srinivasan N, Sandhya S. Signatures of conserved and unique molecular features in Afrotheria. Sci Rep 2021; 11:1011. [PMID: 33441654 PMCID: PMC7806701 DOI: 10.1038/s41598-020-79559-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/30/2020] [Accepted: 12/07/2020] [Indexed: 11/09/2022] Open
Abstract
Afrotheria is a clade of African-origin species with striking dissimilarities in appearance and habitat. In this study, we compared whole proteome sequences of six Afrotherian species to obtain a broad viewpoint of their underlying molecular make-up, to recognize potentially unique proteomic signatures. We find that 62% of the proteomes studied here, predominantly involved in metabolism, are orthologous, while the number of homologous proteins between individual species is as high as 99.5%. Further, we find that among Afrotheria, L. africana has several orphan proteins with 112 proteins showing < 30% sequence identity with their homologues. Rigorous sequence searches and complementary approaches were employed to annotate 156 uncharacterized protein sequences and 28 species-specific proteins. For 122 proteins we predicted potential functional roles, 43 of which we associated with protein- and nucleic-acid binding roles. Further, we analysed domain content and variations in their combinations within Afrotheria and identified 141 unique functional domain architectures, highlighting proteins with potential for specialized functions. Finally, we discuss the potential relevance of highly represented protein families such as MAGE-B2, olfactory receptor and ribosomal proteins in L. africana and E. edwardii, respectively. Taken together, our study reports the first comparative study of the Afrotherian proteomes and highlights salient molecular features.
Affiliation(s)
- Arangasamy Yazhini
  - Lab 103, Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka, 560012, India
- Narayanaswamy Srinivasan
  - Lab 103, Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka, 560012, India
- Sankaran Sandhya
  - Lab 103, Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka, 560012, India
29
De-la-Cruz IM, Hallab A, Olivares-Pinto U, Tapia-López R, Velázquez-Márquez S, Piñero D, Oyama K, Usadel B, Núñez-Farfán J. Genomic signatures of the evolution of defence against its natural enemies in the poisonous and medicinal plant Datura stramonium (Solanaceae). Sci Rep 2021; 11:882. [PMID: 33441607 PMCID: PMC7806989 DOI: 10.1038/s41598-020-79194-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 08/28/2020] [Accepted: 12/03/2020] [Indexed: 01/22/2023] Open
Abstract
Tropane alkaloids and terpenoids are widely used in the medical and pharmaceutical industries and evolved as chemical defenses against herbivores and pathogens in the annual herb Datura stramonium (Solanaceae). Here, we present the first draft genomes of two plants from contrasting environments of D. stramonium. Using these de novo assemblies, along with other previously published genomes from 11 Solanaceae species, we carried out comparative genomic analyses to provide insights on the genome evolution of D. stramonium within the Solanaceae family, and to elucidate adaptive genomic signatures to biotic and abiotic stresses in this plant. We also studied, in detail, the evolution of four genes of D. stramonium (Putrescine N-methyltransferase, Tropinone reductase I, Tropinone reductase II and Hyoscyamine-6S-dioxygenase) involved in tropane alkaloid biosynthesis. Our analyses revealed that the genomes of D. stramonium show signatures of expansion, physicochemical divergence and/or positive selection on proteins related to the production of tropane alkaloids, terpenoids, and glycoalkaloids as well as on R defensive genes and other important proteins related to biotic and abiotic pressures such as defense against natural enemies and drought.
Affiliation(s)
- I M De-la-Cruz
  - Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
- A Hallab
  - IBG-4 Bioinformatics, CEPLAS, Forschungszentrum Jülich, Jülich, Germany
- U Olivares-Pinto
  - Escuela Nacional de Estudios Superiores, Universidad Nacional Autónoma de México (UNAM), Campus Juriquilla, Querétaro, Mexico
- R Tapia-López
  - Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
- S Velázquez-Márquez
  - Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
- D Piñero
  - Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
- K Oyama
  - Escuela Nacional de Estudios Superiores and Laboratorio Nacional de Análisis y Síntesis Ecológica (LANASE), Universidad Nacional Autónoma de México (UNAM), Campus Morelia, Morelia, Michoacán, Mexico
- B Usadel
  - IBG-4 Bioinformatics, CEPLAS, Forschungszentrum Jülich, Jülich, Germany
  - Institute for Biology I, RWTH Aachen University, Aachen, Germany
- J Núñez-Farfán
  - Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
30
Draft Genome Assembly of the Freshwater Apex Predator Wels Catfish (Silurus glanis) Using Linked-Read Sequencing. G3 Genes Genomes Genetics 2020; 10:3897-3906. [PMID: 32917720 PMCID: PMC7642921 DOI: 10.1534/g3.120.401711] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 01/24/2023]
Abstract
The wels catfish (Silurus glanis) is one of the largest freshwater fish species in the world. This top predator plays a key role in ecosystem stability, and represents an iconic trophy-fish for recreational fishermen. S. glanis is also a highly valued species for its high-quality boneless flesh, and has been cultivated for over 100 years in Eastern and Central Europe. The interest in rearing S. glanis continues to grow; the aquaculture production of this species has almost doubled during the last decade. However, despite its high ecological, cultural and economic importance, the available genomic resources for S. glanis are very limited. To fill this gap, we report a de novo assembly and annotation of the whole genome sequence of a female S. glanis. The linked-read based technology with 10X Genomics Chromium chemistry and Supernova assembler produced a highly continuous draft genome of S. glanis: ∼0.8 Gb assembly (scaffold N50 = 3.2 Mb; longest individual scaffold = 13.9 Mb; BUSCO completeness = 84.2%), which included 313.3 Mb of putative repeated sequences. In total, 21,316 protein-coding genes were predicted, of which 96% were annotated functionally from either sequence homology or protein signature searches. The highly continuous genome assembly will be an invaluable resource for aquaculture genomics, genetics, conservation, and breeding research of S. glanis.
31
Vandenbrouck Y, Pineau C, Lane L. The Functionally Unannotated Proteome of Human Male Tissues: A Shared Resource to Uncover New Protein Functions Associated with Reproductive Biology. J Proteome Res 2020; 19:4782-4794. [PMID: 33064489 DOI: 10.1021/acs.jproteome.0c00516] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 12/15/2022]
Abstract
In the context of the Human Proteome Project, we built an inventory of 412 functionally unannotated human proteins for which experimental evidence at the protein level exists (uPE1) and which are highly expressed in tissues involved in human male reproduction. We implemented a strategy combining literature mining, bioinformatics tools to collate annotation and experimental information from specific molecular public resources, and efficient visualization tools to put these unknown proteins into their biological context (protein complexes, tissue and subcellular location, expression pattern). The gathered knowledge allowed us to pinpoint five uPE1s for which a function has recently been proposed and which should be updated in protein knowledge bases. Furthermore, this bioinformatics strategy allowed us to build new functional hypotheses for five other uPE1s linked to phenotypic traits that are specific to male reproductive function, such as ciliogenesis/flagellum formation in germ cells (CCDC112 and TEX9), chromatin remodeling (C3orf62) and spermatozoon maturation (CCDC183). We also discuss the enigmatic case of MAGEB proteins, a poorly documented cancer/testis antigen subtype. Tools used and computational outputs produced during this study are freely accessible via ProteoRE (http://www.proteore.org), a Galaxy-based instance, for reuse purposes. We propose that these five uPE1s should be investigated as a priority by expert laboratories and hope that this inventory and shared resources will stimulate the interest of the reproductive biology community.
Affiliation(s)
- Yves Vandenbrouck
  - Univ. Grenoble Alpes, INSERM, CEA, IRIG-BGE, U1038, F-38000 Grenoble, France
- Charles Pineau
  - Univ. Rennes, Inserm, EHESP, Irset (Institut de Recherche en Santé, Environnement et Travail) - UMR_S 1085, F-35042 Rennes cedex, France
- Lydie Lane
  - SIB Swiss Institute of Bioinformatics and Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, CMU, Michel Servet 1, 1211 Geneva 4, Switzerland
32
"Mind the Gap": Hi-C Technology Boosts Contiguity of the Globe Artichoke Genome in Low-Recombination Regions. G3 Genes Genomes Genetics 2020; 10:3557-3564. [PMID: 32817122 PMCID: PMC7534446 DOI: 10.1534/g3.120.401446] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022]
Abstract
Globe artichoke (Cynara cardunculus var. scolymus; 2n=2x=34) is cropped largely in the Mediterranean region, with Italy being the leading world producer; however, over time, its cultivation has spread to the Americas and China. In 2016, we released the first (v1.0) globe artichoke genome sequence (http://www.artichokegenome.unito.it/). Its assembly was generated using ∼133-fold Illumina sequencing data, covering 725 of the 1,084 Mb genome, of which 526 Mb (73%) were anchored to 17 chromosomal pseudomolecules. Based on v1.0 sequencing data, we generated a new genome assembly (v2.0), obtained from a Hi-C (Dovetail) genomic library, which improves the scaffold N50 from 126 kb to 44.8 Mb (∼356-fold increase) and N90 from 29 kb to 17.8 Mb (∼685-fold increase). While the L90 of the v1.0 sequence included 6,123 scaffolds, that of the new v2.0 includes just 15 super-scaffolds, a number close to the haploid chromosome number of the species. The newly generated super-scaffolds were assigned to pseudomolecules using reciprocal blast procedures. The cumulative size of unplaced scaffolds in v2.0 was reduced by 165 Mb, increasing the anchored genome sequence to 94%. The marked improvement is mainly attributable to the ability of the proximity ligation-based approach to deal with both heterochromatic (e.g., peri-centromeric) and euchromatic regions during the assembly procedure, which allowed low-recombination regions to be physically located. The new high-quality reference genome enhances the taxonomic breadth of the data available for comparative plant genomics and led to a new accurate gene prediction (28,632 genes), thus promoting the map-based cloning of economically important genes.
33
Van Bibber NW, Haerle C, Khalife R, Dayhoff GW, Uversky VN. Intrinsic Disorder in Human Proteins Encoded by Core Duplicon Gene Families. J Phys Chem B 2020; 124:8050-8070. [PMID: 32880174 DOI: 10.1021/acs.jpcb.0c07676] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
Segmental duplications (i.e., highly homologous DNA fragments greater than 1 kb in length that are present within a genome at more than one site) are typically found in genome regions that are prone to rearrangements. A noticeable fraction of the human genome (∼5%) includes segmental duplications (or duplicons) that are assumed to play a number of vital roles in human evolution, human-specific adaptation, and genomic instability. Despite their importance for crucial events such as synaptogenesis, neuronal migration, and neocortical expansion, these segmental duplications continue to be rather poorly characterized. Of particular interest are the core duplicon gene (CDG) families, which are replicates sharing common "core" DNA among the randomly attached pieces and which expand along single chromosomes and might harbor newly acquired protein domains. Another important feature of proteins encoded by CDG families is their multifunctionality. Although it seems that these proteins might possess many characteristic features of intrinsically disordered proteins, to the best of our knowledge, a systematic investigation of the intrinsic disorder predisposition of the proteins encoded by core duplicon gene families has not been conducted yet. To fill this gap and to determine the degree to which these proteins might be affected by intrinsic disorder, we analyzed a set of human proteins encoded by the members of 10 core duplicon gene families, such as NBPF, RGPD, GUSBP, PMS2P, SPATA31, TRIM51, GOLGA8, NPIP, TBC1D3, and LRRC37. Our analysis revealed that the vast majority of these proteins are highly disordered, with their disordered regions often being utilized as means for the protein-protein interactions and/or targeted for numerous posttranslational modifications of different nature.
Affiliation(s)
- Nathan W Van Bibber
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Boulevard, Tampa, Florida 33612, United States
- Cornelia Haerle
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Boulevard, Tampa, Florida 33612, United States
- Roy Khalife
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Boulevard, Tampa, Florida 33612, United States
- Guy W Dayhoff
  - Department of Chemistry, College of Arts and Sciences, University of South Florida, Tampa, Florida 33620, United States
- Vladimir N Uversky
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Boulevard, Tampa, Florida 33612, United States
  - USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Boulevard, Tampa, Florida 33612, United States
  - Institute for Biological Instrumentation, Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", 4 Institutskaya St., Pushchino, 142290, Moscow Region, Russia
34
Pérez‐Portela R, Riesgo A, Wangensteen OS, Palacín C, Turon X. Enjoying the warming Mediterranean: Transcriptomic responses to temperature changes of a thermophilous keystone species in benthic communities. Mol Ecol 2020; 29:3299-3315. [DOI: 10.1111/mec.15564] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 03/30/2019] [Revised: 07/08/2020] [Accepted: 07/20/2020] [Indexed: 12/18/2022]
Affiliation(s)
- Rocío Pérez‐Portela
  - Department of Evolutionary Biology, Ecology and Environmental Sciences, University of Barcelona, and Research Institute of Biodiversity (IRBIO), Barcelona, Spain
  - Center for Advanced Studies of Blanes (CEAB, CSIC), Girona, Spain
- Ana Riesgo
  - Department of Life Sciences, The Natural History Museum, London, UK
- Owen S. Wangensteen
  - Norwegian College of Fishery Science, UiT The Arctic University of Norway, Tromsø, Norway
- Cruz Palacín
  - Department of Evolutionary Biology, Ecology and Environmental Sciences, University of Barcelona, and Research Institute of Biodiversity (IRBIO), Barcelona, Spain
- Xavier Turon
  - Center for Advanced Studies of Blanes (CEAB, CSIC), Girona, Spain
35
Whole genome resequencing of four Italian sweet pepper landraces provides insights on sequence variation in genes of agronomic value. Sci Rep 2020; 10:9189. [PMID: 32514106 PMCID: PMC7280500 DOI: 10.1038/s41598-020-66053-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 03/09/2020] [Accepted: 05/07/2020] [Indexed: 11/08/2022] Open
Abstract
Sweet pepper (Capsicum annuum L.) is a high value crop and one of the most widely grown vegetables belonging to the Solanaceae family. In addition to commercial varieties and F1 hybrids, a multitude of landraces are grown, whose genetic combination is the result of hundreds of years of random, environmental, and farmer selection. High genetic diversity exists in the landrace gene pool, which however has scarcely been studied, thus limiting their cultivation. We re-sequenced four pepper inbred lines, within as many Italian landraces, which are representative of as many fruit types: big sized blocky with sunken apex ('Quadrato') and protruding apex or heart shaped ('Cuneo'), elongated ('Corno') and smaller sized sub-spherical ('Tumaticot'). Each genomic sequence was obtained on the Illumina platform at coverage ranging from 39 to 44×, and reconstructed at a chromosome scale. About 35.5k genes were predicted in each inbred line, of which 22,017 were shared among them and the reference genome (accession 'CM334'). Distinctive variations in miRNAs, resistance gene analogues (RGAs) and susceptibility genes (S-genes) were detected. A detailed survey of the SNP/Indels occurring in genes affecting fruit size, shape and quality identified the highest frequencies of variation in regulatory regions. Many structural variations were identified as presence/absence variations (PAVs), notably in RGAs and in the capsanthin/capsorubin synthase (CCS) gene. The large allelic diversity observed in the four inbred lines suggests their potential use as a pre-breeding resource and represents a one-stop resource for C. annuum genomics and a key tool for dissecting the path from sequence variation to phenotype.
|
36
|
Intrinsic Disorder in Tetratricopeptide Repeat Proteins. Int J Mol Sci 2020; 21:ijms21103709. [PMID: 32466138 PMCID: PMC7279152 DOI: 10.3390/ijms21103709] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/21/2020] [Revised: 05/12/2020] [Accepted: 05/22/2020] [Indexed: 12/27/2022] Open
Abstract
Among the realm of repeat containing proteins that commonly serve as “scaffolds” promoting protein-protein interactions, there is a family of proteins containing between 2 and 20 tetratricopeptide repeats (TPRs), which are functional motifs consisting of 34 amino acids. The most distinguishing feature of TPR domains is their ability to stack continuously one upon the other, with these stacked repeats being able to affect interaction with binding partners either sequentially or in combination. It is known that many repeat-containing proteins are characterized by high levels of intrinsic disorder, and that many protein tandem repeats can be intrinsically disordered. Furthermore, it seems that TPR-containing proteins share many characteristics with hybrid proteins containing ordered domains and intrinsically disordered protein regions. However, there has not been a systematic analysis of the intrinsic disorder status of TPR proteins. To fill this gap, we analyzed 166 human TPR proteins to determine the degree to which proteins containing TPR motifs are affected by intrinsic disorder. Our analysis revealed that these proteins are characterized by different levels of intrinsic disorder and contain functional disordered regions that are utilized for protein-protein interactions and often serve as targets of various posttranslational modifications.
|
37
|
Pandurangan AP, Stahlhacke J, Oates ME, Smithers B, Gough J. The SUPERFAMILY 2.0 database: a significant proteome update and a new webserver. Nucleic Acids Res 2020; 47:D490-D494. [PMID: 30445555 PMCID: PMC6324026 DOI: 10.1093/nar/gky1130] [Citation(s) in RCA: 98] [Impact Index Per Article: 24.5] [Received: 09/24/2018] [Accepted: 10/25/2018] [Indexed: 01/09/2023] Open
Abstract
Here, we present a major update to the SUPERFAMILY database and the webserver. We describe the addition of a new SUPERFAMILY 2.0 profile HMM library containing a total of 27 623 HMMs. The database now includes Superfamily domain annotations for millions of protein sequences taken from the Universal Protein Resource Knowledgebase (UniProtKB) and the National Center for Biotechnology Information (NCBI). This addition constitutes about 51 and 45 million distinct protein sequences obtained from UniProtKB and NCBI respectively. Currently, the database contains annotations for 63 244 and 102 151 complete genomes taken from UniProtKB and NCBI respectively. The current sequence collection and genome update is the biggest so far in the history of SUPERFAMILY updates. In order to deal with the massive wealth of information, here we introduce a new SUPERFAMILY 2.0 webserver (http://supfam.org). Currently, the webserver mainly focuses on the search, retrieval and display of Superfamily annotation for the entire sequence and genome collection in the database.
Affiliation(s)
- Matt E Oates
  - Computer Science, University of Bristol, Bristol BS8 1UB, UK
- Ben Smithers
  - Computer Science, University of Bristol, Bristol BS8 1UB, UK
- Julian Gough
  - MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK
|
38
|
An improved high-quality genome assembly and annotation of Tibetan hulless barley. Sci Data 2020; 7:139. [PMID: 32385314 PMCID: PMC7210891 DOI: 10.1038/s41597-020-0480-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 06/05/2019] [Accepted: 04/03/2020] [Indexed: 12/28/2022] Open
Abstract
Hulless barley (Hordeum vulgare L. var. nudum) is a barley variety in which the husk only loosely covers the caryopses. Because of its ease of processing and edibility, hulless barley has been locally cultivated and used as human food. For example, on the Tibetan Plateau, hulless barley is the staple food for humans and an essential livestock feed. Although the draft genome of hulless barley has been sequenced, the assembly remains fragmented. Here, we report an improved high-quality assembly and annotation of the Tibetan hulless barley genome using more than 67X PacBio long reads. The N50 contig length of the new assembly is more than 19 times larger than that of other available barley assemblies. The new genome assembly also showed high gene completeness and high collinearity of genome synteny with the previously reported barley genome. The new genome assembly and annotation will not only remove major hurdles in genetic analysis and breeding of hulless barley, but will also serve as a key resource for studying barley genomics and genetics.
|
39
|
Zhang F, Song H, Zeng M, Li Y, Kurgan L, Li M. DeepFunc: A Deep Learning Framework for Accurate Prediction of Protein Functions from Protein Sequences and Interactions. Proteomics 2019; 19:e1900019. [PMID: 30941889 DOI: 10.1002/pmic.201900019] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Received: 01/12/2019] [Revised: 03/18/2019] [Indexed: 01/06/2023]
Abstract
Annotation of protein functions plays an important role in understanding life at the molecular level. High-throughput sequencing produces massive numbers of raw protein sequences and only about 1% of them have been manually annotated with functions. Experimental annotations of functions are expensive, time-consuming and do not keep up with the rapid growth of the sequence numbers. This motivates the development of computational approaches that predict protein functions. A novel deep learning framework, DeepFunc, is proposed which accurately predicts protein functions from protein sequence- and network-derived information. More precisely, DeepFunc uses a long and sparse binary vector to encode information concerning domains, families, and motifs collected from the InterPro tool that is associated with the input protein sequence. This vector is processed with two neural layers to obtain a low-dimensional vector which is combined with topological information extracted from protein-protein interactions (PPIs) and functional linkages. The combined information is processed by a deep neural network that predicts protein functions. DeepFunc is empirically and comparatively tested on a benchmark testing dataset and the Critical Assessment of protein Function Annotation algorithms (CAFA) 3 dataset. The experimental results demonstrate that DeepFunc outperforms current methods on the testing dataset and that it secures the highest Fmax = 0.54 and AUC = 0.94 on the CAFA3 dataset.
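For readers unfamiliar with the Fmax score quoted above, the following is a minimal sketch of the protein-centric Fmax as it is commonly defined for CAFA-style evaluation: the maximum, over score thresholds, of the harmonic mean of averaged precision and recall. It is not the DeepFunc authors' evaluation code; the function name `fmax`, the dictionary layout, and the toy GO term labels are assumptions made for illustration only.

```python
def fmax(predictions, truth, thresholds=None):
    """Protein-centric Fmax (illustrative sketch, not the authors' code).

    predictions: {protein: {term: score in [0, 1]}}  (hypothetical layout)
    truth:       {protein: non-empty set of true terms}
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]
    best = 0.0
    for t in thresholds:
        precisions, recalls = [], []
        for protein, true_terms in truth.items():
            scores = predictions.get(protein, {})
            predicted = {term for term, s in scores.items() if s >= t}
            hits = len(predicted & true_terms)
            # Precision is averaged only over proteins with >= 1 prediction at t.
            if predicted:
                precisions.append(hits / len(predicted))
            # Recall is averaged over all benchmark proteins.
            recalls.append(hits / len(true_terms))
        if not precisions:
            continue
        pr = sum(precisions) / len(precisions)
        rc = sum(recalls) / len(recalls)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best


# Toy data (made up, not taken from the paper).
toy_truth = {"P1": {"GO:A", "GO:B"}, "P2": {"GO:C"}}
toy_predictions = {
    "P1": {"GO:A": 0.9, "GO:B": 0.4, "GO:X": 0.3},
    "P2": {"GO:C": 0.8, "GO:Y": 0.6},
}
print(round(fmax(toy_predictions, toy_truth), 3))
```

Because precision is averaged only over proteins that receive at least one prediction at a given threshold, a method that predicts confidently for only a few proteins keeps its precision but loses recall.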
Affiliation(s)
- Fuhao Zhang
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, P. R. China
- Hong Song
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, P. R. China
- Min Zeng
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, P. R. China
- Yaohang Li
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, P. R. China
  - Department of Computer Science, Old Dominion University, Norfolk, VA, 23529, USA
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, 23284, USA
- Min Li
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, P. R. China
|
40
|
Mann KS, Chisholm J, Sanfaçon H. Strawberry Mottle Virus (Family Secoviridae, Order Picornavirales) Encodes a Novel Glutamic Protease To Process the RNA2 Polyprotein at Two Cleavage Sites. J Virol 2019; 93:e01679-18. [PMID: 30541838 PMCID: PMC6384087 DOI: 10.1128/jvi.01679-18] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 09/21/2018] [Accepted: 11/19/2018] [Indexed: 01/29/2023] Open
Abstract
Strawberry mottle virus (SMoV) belongs to the family Secoviridae (order Picornavirales) and has a bipartite genome with each RNA encoding one polyprotein. All characterized secovirids encode a single protease related to the picornavirus 3C protease. The SMoV 3C-like protease was previously shown to cut the RNA2 polyprotein (P2) at a single site between the predicted movement protein and coat protein (CP) domains. However, the SMoV P2 polyprotein includes an extended C-terminal region with a coding capacity of up to 70 kDa downstream of the presumed CP domain, an unusual characteristic for this family. In this study, we identified a novel cleavage event at a P↓AFP sequence immediately downstream of the CP domain. Following deletion of the PAFP sequence, the polyprotein was processed at or near a related PKFP sequence 40 kDa further downstream, defining two protein domains in the C-terminal region of the P2 polyprotein. Both processing events were dependent on a novel protease domain located between the two cleavage sites. Mutagenesis of amino acids that are conserved among isolates of SMoV and of the related Black raspberry necrosis virus did not identify essential cysteine, serine, or histidine residues, suggesting that the RNA2-encoded SMoV protease is not related to serine or cysteine proteases of other picorna-like viruses. Rather, two highly conserved glutamic acid residues spaced by 82 residues were found to be strictly required for protease activity. We conclude that the processing of SMoV polyproteins requires two viral proteases, the RNA1-encoded 3C-like protease and a novel glutamic protease encoded by RNA2.
IMPORTANCE
Many viruses encode proteases to release mature proteins and intermediate polyproteins from viral polyproteins. Polyprotein processing allows regulation of the accumulation and activity of viral proteins. Many viral proteases also cleave host factors to facilitate virus infection. Thus, viral proteases are key virulence factors. To date, viruses with a positive-strand RNA genome are only known to encode cysteine or serine proteases, most of which are related to the cellular papain, trypsin, or chymotrypsin proteases. Here, we characterize the first glutamic protease encoded by a plant virus or by a positive-strand RNA virus. The novel glutamic protease is unique to a few members of the family Secoviridae, suggesting that it is a recent acquisition in the evolution of this family. The protease does not resemble known cellular proteases. Rather, it is predicted to share structural similarities with a family of fungal and bacterial glutamic proteases that adopt a lectin fold.
Affiliation(s)
- Krin S Mann
  - Agriculture and Agri-Food Canada, Summerland Research and Development Centre, Summerland, British Columbia, Canada
- Joan Chisholm
  - Agriculture and Agri-Food Canada, Summerland Research and Development Centre, Summerland, British Columbia, Canada
- Hélène Sanfaçon
  - Agriculture and Agri-Food Canada, Summerland Research and Development Centre, Summerland, British Columbia, Canada
|
41
|
Iyer MS, Bhargava K, Pavalam M, Sowdhamini R. GenDiS database update with improved approach and features to recognize homologous sequences of protein domain superfamilies. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2019; 2019:5426807. [PMID: 30943284 PMCID: PMC6446967 DOI: 10.1093/database/baz042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2018] [Revised: 02/20/2019] [Accepted: 03/08/2019] [Indexed: 11/24/2022]
Abstract
Since proteins evolve by divergent evolution, proteins with distant homology to each other may or may not bear similar functions. Improved computational approaches are required to recognize distant homologues that are functionally similar. One of the methods of assigning function to sequences is to use profiles derived from sequences of known structure. We describe an update of the Genomic Distribution of protein structural domain Superfamilies (GenDiS) database, namely GenDiS+, which provides a projection of SCOP superfamily members on the sequence space (NR database, NCBI). The sequences are validated using structure-based sequence alignment profiles and domain and full-length sequence alignments. GenDiS+ is a 'tour de force' for detecting homologues within around 160 000 taxonomic identifiers, starting from nearly 11 000 domains of known structure. Features, like full-sequence alignment and phylogeny, domain sequence alignment and phylogeny, list of associated structural and sequence domains with strength of interactions, links to databases like Pfam, UniProt and ModBase and list of sequences with a PDB structure, are provided.
Affiliation(s)
- Meenakshi S Iyer
  - National Centre for Biological Sciences, Tata Institute of Fundamental Research (TIFR), Gandhi Krishi, Vignana Kendra Campus, Bellary Road, Bangalore, Karnataka, India
- Kartik Bhargava
  - National Centre for Biological Sciences, Tata Institute of Fundamental Research (TIFR), Gandhi Krishi, Vignana Kendra Campus, Bellary Road, Bangalore, Karnataka, India
  - Birla Institute of Technology and Science, Pilani, VidyaVihar Campus, Pilani, Rajasthan, India
- Murugavel Pavalam
  - National Centre for Biological Sciences, Tata Institute of Fundamental Research (TIFR), Gandhi Krishi, Vignana Kendra Campus, Bellary Road, Bangalore, Karnataka, India
- Ramanathan Sowdhamini
  - National Centre for Biological Sciences, Tata Institute of Fundamental Research (TIFR), Gandhi Krishi, Vignana Kendra Campus, Bellary Road, Bangalore, Karnataka, India
|
42
|
Blaz J, Barrera-Redondo J, Vázquez-Rosas-Landa M, Canedo-Téxon A, Aguirre von Wobeser E, Carrillo D, Stouthamer R, Eskalen A, Villafán E, Alonso-Sánchez A, Lamelas A, Ibarra-Juarez LA, Pérez-Torres CA, Ibarra-Laclette E. Genomic Signals of Adaptation towards Mutualism and Sociality in Two Ambrosia Beetle Complexes. Life (Basel) 2018; 9:E2. [PMID: 30583535 PMCID: PMC6463014 DOI: 10.3390/life9010002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 10/14/2018] [Revised: 12/08/2018] [Accepted: 12/20/2018] [Indexed: 01/03/2023] Open
Abstract
Mutualistic symbiosis and eusociality have developed through gradual evolutionary processes at different times in specific lineages. Like some species of termites and ants, ambrosia beetles have independently evolved a mutualistic nutritional symbiosis with fungi, which has been associated with the evolution of complex social behaviors in some members of this group. We sequenced the transcriptomes of two ambrosia complexes (Euwallacea sp. near fornicatus–Fusarium euwallaceae and Xyleborus glabratus–Raffaelea lauricola) to find evolutionary signatures associated with mutualism and behavior evolution. We identified signatures of positive selection in genes related to nutrient homeostasis; regulation of gene expression; development and function of the nervous system, which may be involved in diet specialization; behavioral changes; and social evolution in this lineage. Finally, we found convergent changes in evolutionary rates of proteins across lineages with phylogenetically independent origins of sociality and mutualism, suggesting a constrained evolution of conserved genes in social species, and an evolutionary rate acceleration related to changes in selective pressures in mutualistic lineages.
Affiliation(s)
- Jazmín Blaz
  - Red de Estudios Moleculares Avanzados, Instituto de Ecología A.C, Xalapa, Veracruz 91070, Mexico.
- Josué Barrera-Redondo
  - Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, Ciudad de México 04500, Mexico.
- Anahí Canedo-Téxon
  - Red de Estudios Moleculares Avanzados, Instituto de Ecología A.C, Xalapa, Veracruz 91070, Mexico.
- Daniel Carrillo
  - Tropical Research and Education Center, University of Florida, Homestead, FL 33031, USA.
- Richard Stouthamer
  - Department of Plant Pathology, University of California–Riverside, Riverside, CA 92521, USA.
- Akif Eskalen
  - Department of Plant Pathology, University of California, Davis, CA 95616-8751, USA.
- Emanuel Villafán
  - Red de Estudios Moleculares Avanzados, Instituto de Ecología A.C, Xalapa, Veracruz 91070, Mexico.
- Alexandro Alonso-Sánchez
  - Red de Estudios Moleculares Avanzados, Instituto de Ecología A.C, Xalapa, Veracruz 91070, Mexico.
- Araceli Lamelas
  - Red de Estudios Moleculares Avanzados, Instituto de Ecología A.C, Xalapa, Veracruz 91070, Mexico.
- Luis Arturo Ibarra-Juarez
  - Red de Estudios Moleculares Avanzados, Instituto de Ecología A.C, Xalapa, Veracruz 91070, Mexico.
  - Cátedras CONACyT/Instituto de Ecología A.C., Xalapa, Veracruz 91070, Mexico.
- Claudia Anahí Pérez-Torres
  - Red de Estudios Moleculares Avanzados, Instituto de Ecología A.C, Xalapa, Veracruz 91070, Mexico.
  - Cátedras CONACyT/Instituto de Ecología A.C., Xalapa, Veracruz 91070, Mexico.
- Enrique Ibarra-Laclette
  - Red de Estudios Moleculares Avanzados, Instituto de Ecología A.C, Xalapa, Veracruz 91070, Mexico.
|
43
|
Highly Continuous Genome Assembly of Eurasian Perch (Perca fluviatilis) Using Linked-Read Sequencing. G3-GENES GENOMES GENETICS 2018; 8:3737-3743. [PMID: 30355765 PMCID: PMC6288837 DOI: 10.1534/g3.118.200768] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Indexed: 11/18/2022]
Abstract
The Eurasian perch (Perca fluviatilis) is the most common fish of the Percidae family and is widely distributed across Eurasia. Perch is a popular target for professional and recreational fisheries, and a promising freshwater aquaculture species in Europe. However, despite its high ecological, economical and societal importance, the available genomic resources for P. fluviatilis are rather limited. In this work, we report de novo assembly and annotation of the whole genome sequence of perch. The linked-read based technology with 10X Genomics Chromium chemistry and Supernova assembler produced a draft perch genome assembly of ∼1.0 Gbp (scaffold N50 = 6.3 Mb; the longest individual scaffold of 29.3 Mb; BUSCO completeness of 88.0%), which included 281.6 Mb of putative repeated sequences. The perch genome assembly presented here, generated from a small amount of starting material (0.75 ng) and a single linked-read library, is highly continuous and considerably more complete than the currently available draft of the P. fluviatilis genome. A total of 23,397 protein-coding genes were predicted, 23,171 (99%) of which were annotated functionally from either sequence homology or protein signature searches. Linked-read technology enables fast, accurate and cost-effective de novo assembly of large non-model eukaryote genomes. The highly continuous assembly of the Eurasian perch genome presented in this study will be an invaluable resource for a range of genetic, ecological, physiological, ecotoxicological, functional and comparative genomic studies in perch and other fish species of the Percidae family.
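The scaffold N50 reported above is a length-weighted median: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch of that calculation follows; the function name `n50` and the scaffold lengths are hypothetical toy values, not data from the perch assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the cumulative sum reaches half
        # of the total assembly size is the N50.
        if running >= half_total:
            return length


# Toy scaffold lengths in bp (illustrative only).
fragmented = [80, 70, 50, 40, 30, 20]
# Joining the two longest scaffolds raises N50 without adding sequence.
scaffolded = [150, 50, 40, 30, 20]
print(n50(fragmented), n50(scaffolded))
```

Since N50 depends only on how the total length is partitioned, joining existing pieces raises it even when no new sequence is added, which is why it is usually read alongside completeness measures such as BUSCO.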
|
44
|
Abstract
We attempt to quantify animal “bodyplans” and their variation within Metazoa. Our results challenge the view that maximum variation was achieved early in animal evolutionary history by nonuniformitarian mechanisms. Rather, they are compatible with the view that the capacity for fundamental innovation is not limited to the early evolutionary history of clades. We perform quantitative tests of the principal hypotheses of the molecular mechanisms underpinning the establishment of animal bodyplans and corroborate the hypothesis that animal evolution has been permitted or driven by gene regulatory evolution.
The animal kingdom exhibits a great diversity of organismal form (i.e., disparity). Whether the extremes of disparity were achieved early in animal evolutionary history or clades continually explore the limits of possible morphospace is subject to continuing debate. Here we show, through analysis of the disparity of the animal kingdom, that, even though many clades exhibit maximal initial disparity, arthropods, chordates, annelids, echinoderms, and mollusks have continued to explore and expand the limits of morphospace throughout the Phanerozoic, expanding dramatically the envelope of disparity occupied in the Cambrian. The “clumpiness” of morphospace occupation by living clades is a consequence of the extinction of phylogenetic intermediates, indicating that the original distribution of morphologies was more homogeneous. The morphological distances between phyla mirror differences in complexity, body size, and species-level diversity across the animal kingdom. Causal hypotheses of morphologic expansion include time since origination, increases in genome size, protein repertoire, gene family expansion, and gene regulation. We find a strong correlation between increasing morphological disparity, genome size, and microRNA repertoire, but no correlation to protein domain diversity. Our results are compatible with the view that the evolution of gene regulation has been influential in shaping metazoan disparity whereas the invasion of terrestrial ecospace appears to represent an additional gestalt, underpinning the post-Cambrian expansion of metazoan disparity.
|
45
|
You R, Huang X, Zhu S. DeepText2GO: Improving large-scale protein function prediction with deep semantic text representation. Methods 2018; 145:82-90. [PMID: 29883746 DOI: 10.1016/j.ymeth.2018.05.026] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 02/13/2018] [Revised: 04/30/2018] [Accepted: 05/31/2018] [Indexed: 11/16/2022] Open
Abstract
As of April 2018, UniProtKB has collected more than 115 million protein sequences. Less than 0.15% of these proteins, however, have been associated with experimental GO annotations. As such, the use of automatic protein function prediction (AFP) to reduce this huge gap becomes increasingly important. Previous studies conclude that sequence homology based methods are highly effective in AFP. In addition, mining motif, domain, and functional information from protein sequences has been found to be very helpful for AFP. Information sources other than sequences, such as text, may be useful for AFP as well. Instead of using the BOW (bag of words) representation of traditional text-based AFP, we propose a new method called DeepText2GO that relies on deep semantic text representation, together with different kinds of available protein information such as sequence homology, families, domains, and motifs, to improve large-scale AFP. Furthermore, DeepText2GO integrates text-based methods with sequence-based ones by means of a consensus approach. Extensive experiments on the benchmark dataset extracted from UniProt/SwissProt have demonstrated that DeepText2GO significantly outperformed both text-based and sequence-based methods, validating its superiority.
Affiliation(s)
- Ronghui You
  - School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, China
  - Center for Computational System Biology, ISTBI, Fudan University, Shanghai 200433, China
- Xiaodi Huang
  - School of Computing and Mathematics, Charles Sturt University, Albury, NSW 2640, Australia
- Shanfeng Zhu
  - School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, China
  - Center for Computational System Biology, ISTBI, Fudan University, Shanghai 200433, China.
|
46
|
Fang H, Wang K. Regulatory Genomic Data Cubism. iScience 2018; 3:217-225. [PMID: 30428322 PMCID: PMC6137703 DOI: 10.1016/j.isci.2018.04.017] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 01/26/2018] [Revised: 03/28/2018] [Accepted: 04/20/2018] [Indexed: 12/11/2022] Open
Abstract
A regularly shaped grid is useful for analyzing data, particularly at multilayer levels, where patterns can be visually represented and analytically compared, conceptually similar to Picasso's cubism. Here we introduce ATLAS, featuring a suite of spatially ordered maps designed for representation and comparison of patterns seen in regulatory genomic data. It produces a landscape learned from input data and enables landscape-guided correlation with additional data. We illustrate its use for multilayer data comparison on the same cell type, and for comparisons involving different cell types, revealing information in a scientifically insightful and also visually intuitive way. The data-driven and visual-aided ability of ATLAS presents a general strategy for regulatory genomic data analysis.
- Realization of Picasso's cubism in regulatory genomics
- Enables representation and comparison of patterns in regulatory genomic data
- Able to analyze data at multilayer levels and involving different cell types
- A general strategy for regulatory genomic data analysis
Affiliation(s)
- Hai Fang
  - State Key Laboratory of Medical Genomics and Shanghai Institute of Hematology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK.
- Kankan Wang
  - State Key Laboratory of Medical Genomics and Shanghai Institute of Hematology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China.
|
47
|
Austin CM, Tan MH, Harrisson KA, Lee YP, Croft LJ, Sunnucks P, Pavlova A, Gan HM. De novo genome assembly and annotation of Australia's largest freshwater fish, the Murray cod (Maccullochella peelii), from Illumina and Nanopore sequencing read. Gigascience 2018; 6:1-6. [PMID: 28873963 PMCID: PMC5597895 DOI: 10.1093/gigascience/gix063] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 05/04/2017] [Accepted: 07/11/2017] [Indexed: 12/02/2022] Open
Abstract
One of the most iconic Australian fish is the Murray cod, Maccullochella peelii (Mitchell 1838), a freshwater species that can grow to ∼1.8 metres in length and live to age ≥48 years. The Murray cod is of conservation concern as a result of strong population contractions, but it is also popular for recreational fishing and is of growing aquaculture interest. In this study, we report the whole genome sequence of the Murray cod to support ongoing population genetics, conservation, and management research, as well as to better understand the evolutionary ecology and history of the species. A draft Murray cod genome of 633 Mbp (N50 = 109 974 bp; BUSCO and CEGMA completeness of 94.2% and 91.9%, respectively) with an estimated 148 Mbp of putative repetitive sequences was assembled from the combined sequencing data of 2 fish individuals with an identical maternal lineage; 47.2 Gb of Illumina HiSeq data and 804 Mb of Nanopore data were generated from the first individual while 23.2 Gb of Illumina MiSeq data were generated from the second individual. The inclusion of Nanopore reads for scaffolding followed by subsequent gap-closing using Illumina data led to a 29% reduction in the number of scaffolds and a 55% and 54% increase in the scaffold and contig N50, respectively. We also report the first transcriptome of Murray cod that was subsequently used to annotate the Murray cod genome, leading to the identification of 26 539 protein-coding genes. We present the whole genome of the Murray cod and anticipate this will be a catalyst for a range of genetic, genomic, and phylogenetic studies of the Murray cod and more generally other fish species of the Percichthyidae family.
Affiliation(s)
- Christopher M Austin
  - Centre for Integrative Ecology, School of Life and Environmental Sciences, Deakin University, Geelong, Victoria 3220, Australia
  - Genomics Facility, Tropical Medicine and Biology Platform, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway 47500, Petaling Jaya, Selangor, Malaysia
  - School of Science, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway 47500, Petaling Jaya, Selangor, Malaysia
- Mun Hua Tan
  - Centre for Integrative Ecology, School of Life and Environmental Sciences, Deakin University, Geelong, Victoria 3220, Australia
  - Genomics Facility, Tropical Medicine and Biology Platform, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway 47500, Petaling Jaya, Selangor, Malaysia
  - School of Science, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway 47500, Petaling Jaya, Selangor, Malaysia
- Katherine A Harrisson
  - School of Biological Sciences, Monash University, Clayton Campus, Clayton, Victoria, Australia
- Yin Peng Lee
  - Genomics Facility, Tropical Medicine and Biology Platform, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway 47500, Petaling Jaya, Selangor, Malaysia
  - School of Science, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway 47500, Petaling Jaya, Selangor, Malaysia
- Laurence J Croft
  - School of Science, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway 47500, Petaling Jaya, Selangor, Malaysia
  - Malaysian Genomics Resource Centre Berhad, Boulevard Signature Office, Kuala Lumpur, Malaysia
- Paul Sunnucks
  - School of Biological Sciences, Monash University, Clayton Campus, Clayton, Victoria, Australia
- Alexandra Pavlova
  - School of Biological Sciences, Monash University, Clayton Campus, Clayton, Victoria, Australia
- Han Ming Gan
  - Centre for Integrative Ecology, School of Life and Environmental Sciences, Deakin University, Geelong, Victoria 3220, Australia
  - Genomics Facility, Tropical Medicine and Biology Platform, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway 47500, Petaling Jaya, Selangor, Malaysia
  - School of Science, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway 47500, Petaling Jaya, Selangor, Malaysia
|
48
|
Botta C, Acquadro A, Greppi A, Barchi L, Bertolino M, Cocolin L, Rantsiou K. Genomic assessment in Lactobacillus plantarum links the butyrogenic pathway with glutamine metabolism. Sci Rep 2017; 7:15975. [PMID: 29162929 PMCID: PMC5698307 DOI: 10.1038/s41598-017-16186-8] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 08/08/2017] [Accepted: 11/08/2017] [Indexed: 11/09/2022] Open
Abstract
The butyrogenic capability of Lactobacillus (L.) plantarum is highly dependent on the substrate type and so far not assigned to any specific metabolic pathway. Accordingly, we compared three genomes of L. plantarum that showed a strain-specific capability to produce butyric acid in human cells growth media. Based on the genomic analysis, butyric acid production was attributed to the complementary activities of a medium-chain thioesterase and the fatty acid synthase of type two (FASII). However, the genomic islands of discrepancy observed between butyrogenic L. plantarum strains (S2T10D, S11T3E) and the non-butyrogenic strain O2T60C do not encompass genes of FASII, but several cassettes of genes related to sugar metabolism, bacteriocins, prophages and surface proteins. Interestingly, single amino acid substitutions predicted from SNPs analysis have highlighted deleterious mutations in key genes of glutamine metabolism in L. plantarum O2T60C, which corroborated well with the metabolic deficiency suffered by O2T60C in high-glutamine growth media and its consequent incapability to produce butyrate. In parallel, the increase of glutamine content induced the production of butyric acid by L. plantarum S2T10D. The present study reveals a previously undescribed metabolic route for butyric acid production in L. plantarum, and a potential involvement of the glutamine uptake in its regulation.
Affiliation(s)
- Cristian Botta
- Department of Forestry, Agriculture and Food Sciences, University of Torino, Turin, Italy
| | - Alberto Acquadro
- Department of Forestry, Agriculture and Food Sciences, University of Torino, Turin, Italy
| | - Anna Greppi
- Department of Forestry, Agriculture and Food Sciences, University of Torino, Turin, Italy
- Department of Health Sciences and Technology, Laboratory of Food Biotechnology, ETH Zürich, Switzerland
| | - Lorenzo Barchi
- Department of Forestry, Agriculture and Food Sciences, University of Torino, Turin, Italy
| | - Marta Bertolino
- Department of Forestry, Agriculture and Food Sciences, University of Torino, Turin, Italy
| | - Luca Cocolin
- Department of Forestry, Agriculture and Food Sciences, University of Torino, Turin, Italy
| | - Kalliopi Rantsiou
- Department of Forestry, Agriculture and Food Sciences, University of Torino, Turin, Italy.
| |
|
49
|
Yurchenko T, Ševčíková T, Strnad H, Butenko A, Eliáš M. The plastid genome of some eustigmatophyte algae harbours a bacteria-derived six-gene cluster for biosynthesis of a novel secondary metabolite. Open Biol 2017; 6:rsob.160249. [PMID: 27906133 PMCID: PMC5133447 DOI: 10.1098/rsob.160249] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 08/30/2016] [Accepted: 10/31/2016] [Indexed: 01/26/2023] Open
Abstract
Acquisition of genes by plastid genomes (plastomes) via horizontal gene transfer (HGT) seems to be a rare phenomenon. Here, we report an interesting case of HGT revealed by sequencing the plastomes of the eustigmatophyte algae Monodopsis sp. MarTras21 and Vischeria sp. CAUP Q 202. These plastomes proved to harbour a unique cluster of six genes, most probably acquired from a bacterium of the phylum Bacteroidetes, with homologues in various bacteria, typically organized in a conserved uncharacterized putative operon. Sequence analyses of the six proteins encoded by the operon yielded the following annotation for them: (i) a novel family without discernible homologues; (ii) a new family within the superfamily of metallo-dependent hydrolases; (iii) a novel subgroup of the UbiA superfamily of prenyl transferases; (iv) a new clade within the sugar phosphate cyclase superfamily; (v) a new family within the xylose isomerase-like superfamily; and (vi) a hydrolase for a phosphate moiety-containing substrate. We suggest that the operon encodes enzymes of a pathway synthesizing an isoprenoid–cyclitol-derived compound, possibly an antimicrobial or other protective substance. To the best of our knowledge, this is the first report of an expansion of the metabolic capacity of a plastid mediated by HGT into the plastid genome.
Affiliation(s)
- Tatiana Yurchenko
- Faculty of Science, Department of Biology and Ecology, Life Science Research Centre, University of Ostrava, Chittussiho 10, 710 00 Ostrava, Czech Republic; Faculty of Science, Institute of Environmental Technologies, University of Ostrava, Chittussiho 10, 710 00 Ostrava, Czech Republic
| | - Tereza Ševčíková
- Faculty of Science, Department of Biology and Ecology, Life Science Research Centre, University of Ostrava, Chittussiho 10, 710 00 Ostrava, Czech Republic
| | - Hynek Strnad
- Institute of Molecular Genetics of the ASCR, v. v. i., Prague, Czech Republic
| | - Anzhelika Butenko
- Faculty of Science, Department of Biology and Ecology, Life Science Research Centre, University of Ostrava, Chittussiho 10, 710 00 Ostrava, Czech Republic
| | - Marek Eliáš
- Faculty of Science, Department of Biology and Ecology, Life Science Research Centre, University of Ostrava, Chittussiho 10, 710 00 Ostrava, Czech Republic; Faculty of Science, Institute of Environmental Technologies, University of Ostrava, Chittussiho 10, 710 00 Ostrava, Czech Republic
| |
|
50
|
Malik SS, Azem-E-Zahra S, Kim KM, Caetano-Anollés G, Nasir A. Do Viruses Exchange Genes across Superkingdoms of Life? Front Microbiol 2017; 8:2110. [PMID: 29163404 PMCID: PMC5671483 DOI: 10.3389/fmicb.2017.02110] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 07/24/2017] [Accepted: 10/16/2017] [Indexed: 12/13/2022] Open
Abstract
Viruses can be classified into archaeoviruses, bacterioviruses, and eukaryoviruses according to the taxonomy of the infected host. The host-constrained perception of viruses implies preference of genetic exchange between viruses and cellular organisms of their host superkingdoms and viral origins from host cells either via escape or reduction. However, viruses frequently establish non-lytic interactions with organisms and endogenize into the genomes of bacterial endosymbionts that reside in eukaryotic cells. Such interactions create opportunities for genetic exchange between viruses and organisms of non-host superkingdoms. Here, we take an atypical approach to revisit virus-cell interactions by first identifying protein fold structures in the proteomes of archaeoviruses, bacterioviruses, and eukaryoviruses and second by tracing their spread in the proteomes of superkingdoms Archaea, Bacteria, and Eukarya. The exercise quantified protein structural homologies between viruses and organisms of their host and non-host superkingdoms and revealed likely candidates for virus-to-cell and cell-to-virus gene transfers. Unexpected lifestyle-driven genetic affiliations between bacterioviruses and Eukarya and eukaryoviruses and Bacteria were also predicted in addition to a large cohort of protein folds that were universally shared by viral and cellular proteomes and virus-specific protein folds not detected in cellular proteomes. These protein folds provide unique insights into viral origins and evolution that are generally difficult to recover with traditional sequence alignment-dependent evolutionary analyses owing to the fast mutation rates of viral gene sequences.
Affiliation(s)
- Shahana S Malik
- Department of Biosciences, COMSATS Institute of Information Technology, Islamabad, Pakistan
| | - Syeda Azem-E-Zahra
- Department of Biosciences, COMSATS Institute of Information Technology, Islamabad, Pakistan
| | - Kyung Mo Kim
- Division of Polar Life Sciences, Korea Polar Research Institute, Incheon, South Korea
| | - Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | - Arshan Nasir
- Department of Biosciences, COMSATS Institute of Information Technology, Islamabad, Pakistan; Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| |
|