Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Burge S, Kelly E, Lonsdale D, Mutowo-Muellenet P, McAnulla C, Mitchell A, Sangrador-Vegas A, Yong SY, Mulder N, Hunter S. Manual GO annotation of predictive protein signatures: the InterPro approach to GO curation. Database (Oxford) 2012;2012:bar068. [PMID: 22301074 PMCID: PMC3270475 DOI: 10.1093/database/bar068] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.0] [Indexed: 01/08/2023]

Cited by Other Article(s)

Sun Y, Wang HY, Liu B, Yue B, Liu Q, Liu Y, Rosa IF, Doretto LB, Han S, Lin L, Gong X, Shao C. CRISPR/dCas9-Mediated DNA Methylation Editing on emx2 in Chinese Tongue Sole (Cynoglossus semilaevis) Testis Cells. Int J Mol Sci 2024;25:7637. [PMID: 39062879 PMCID: PMC11277268 DOI: 10.3390/ijms25147637] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2024] [Revised: 07/03/2024] [Accepted: 07/09/2024] [Indexed: 07/28/2024] Open

Affiliation(s)

Yanxu Sun College of Fisheries and Life Science, Shanghai Ocean University, Shanghai 201306, China; State Key Laboratory of Mariculture Biobreeding and Sustainable Goods, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao 266071, China
Hong-Yan Wang State Key Laboratory of Mariculture Biobreeding and Sustainable Goods, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao 266071, China
Binghua Liu State Key Laboratory of Mariculture Biobreeding and Sustainable Goods, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao 266071, China
Bowen Yue College of Fisheries and Life Science, Shanghai Ocean University, Shanghai 201306, China; State Key Laboratory of Mariculture Biobreeding and Sustainable Goods, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao 266071, China
Qian Liu State Key Laboratory of Mariculture Biobreeding and Sustainable Goods, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao 266071, China
Yuyan Liu State Key Laboratory of Mariculture Biobreeding and Sustainable Goods, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao 266071, China
Ivana F. Rosa Department of Structural and Functional Biology, Institute of Biosciences, São Paulo State University (UNESP), Botucatu 01049-010, Brazil
Lucas B. Doretto State Key Laboratory of Mariculture Biobreeding and Sustainable Goods, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao 266071, China
Shenglei Han State Key Laboratory of Mariculture Biobreeding and Sustainable Goods, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao 266071, China
Lei Lin State Key Laboratory of Mariculture Biobreeding and Sustainable Goods, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao 266071, China
Xiaoling Gong College of Fisheries and Life Science, Shanghai Ocean University, Shanghai 201306, China; Key Laboratory of Exploration and Utilization of Aquatic Genetic Resources (Shanghai Ocean University), Ministry of Education, Shanghai 201306, China; National Demonstration Center for Experimental Fisheries Science Education, Shanghai Ocean University, Shanghai 201306, China
Changwei Shao State Key Laboratory of Mariculture Biobreeding and Sustainable Goods, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao 266071, China; Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao Marine Science and Technology Center, Qingdao 266237, China

Ulusoy E, Doğan T. Mutual annotation-based prediction of protein domain functions with Domain2GO. Protein Sci 2024;33:e4988. [PMID: 38757367 PMCID: PMC11099699 DOI: 10.1002/pro.4988] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 02/25/2024] [Accepted: 03/30/2024] [Indexed: 05/18/2024]

Abstract

Identifying unknown functional properties of proteins is essential for understanding their roles in both health and disease states. The domain composition of a protein can reveal critical information in this context, as domains are structural and functional units that dictate how the protein should act at the molecular level. The expensive and time-consuming nature of wet-lab experimental approaches prompted researchers to develop computational strategies for predicting the functions of proteins. In this study, we proposed a new method called Domain2GO that infers associations between protein domains and function-defining gene ontology (GO) terms, thus redefining the problem as domain function prediction. Domain2GO uses documented protein-level GO annotations together with proteins' domain annotations. Co-annotation patterns of domains and GO terms in the same proteins are examined using statistical resampling to obtain reliable associations. As a use-case study, we evaluated the biological relevance of examples selected from the Domain2GO-generated domain-GO term mappings via literature review. Then, we applied Domain2GO to predict unknown protein functions by propagating domain-associated GO terms to proteins annotated with these domains. For function prediction performance evaluation and comparison against other methods, we employed Critical Assessment of Function Annotation 3 (CAFA3) challenge datasets. The results demonstrated the high potential of Domain2GO, particularly for predicting molecular function and biological process terms, along with advantages such as producing interpretable results and having an exceptionally low computational cost. The approach presented here can be extended to other ontologies and biological entities to investigate unknown relationships in complex and large-scale biological data. The source code, datasets, results, and user instructions for Domain2GO are available at https://github.com/HUBioDataLab/Domain2GO. 
Additionally, we offer a user-friendly online tool at https://huggingface.co/spaces/HUBioDataLab/Domain2GO, which simplifies the prediction of functions of previously unannotated proteins solely using amino acid sequences.

Rutherford KM, Lera-Ramírez M, Wood V. PomBase: a Global Core Biodata Resource-growth, collaboration, and sustainability. Genetics 2024;227:iyae007. [PMID: 38376816 PMCID: PMC11075564 DOI: 10.1093/genetics/iyae007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 01/13/2024] [Indexed: 02/21/2024] Open

Zhang T, Huang W, Zhang L, Li DZ, Qi J, Ma H. Phylogenomic profiles of whole-genome duplications in Poaceae and landscape of differential duplicate retention and losses among major Poaceae lineages. Nat Commun 2024;15:3305. [PMID: 38632270 PMCID: PMC11024178 DOI: 10.1038/s41467-024-47428-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Accepted: 04/02/2024] [Indexed: 04/19/2024] Open

Kyu KL, Taylor CM, Douglas CA, Malik AI, Colmer TD, Siddique KHM, Erskine W. Genetic diversity and candidate genes for transient waterlogging tolerance in mungbean at the germination and seedling stages. Front Plant Sci 2024;15:1297096. [PMID: 38584945 PMCID: PMC10996369 DOI: 10.3389/fpls.2024.1297096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Accepted: 02/26/2024] [Indexed: 04/09/2024]

Abstract

Mungbean [Vigna radiata var. radiata (L.) Wilczek] production in Asia is detrimentally affected by transient soil waterlogging caused by unseasonal and increasingly frequent extreme precipitation events. While mungbean exhibits sensitivity to waterlogging, there has been insufficient exploration of germplasm for waterlogging tolerance, as well as limited investigation into the genetic basis for tolerance to identify valuable loci. This research investigated the diversity of transient waterlogging tolerance in a mini-core germplasm collection of mungbean and identified candidate genes for adaptive traits of interest using genome-wide association studies (GWAS) at two critical stages of growth: germination and seedling stage (i.e., once the first trifoliate leaf had fully-expanded). In a temperature-controlled glasshouse, 292 genotypes were screened for tolerance after (i) 4 days of waterlogging followed by 7 days of recovery at the germination stage and (ii) 8 days of waterlogging followed by 7 days of recovery at the seedling stage. Tolerance was measured against drained controls. GWAS was conducted using 3,522 high-quality DArTseq-derived SNPs, revealing five significant associations with five phenotypic traits indicating improved tolerance. Waterlogging tolerance was positively correlated with the formation of adventitious roots and higher dry masses. FGGY carbohydrate kinase domain-containing protein was identified as a candidate gene for adventitious rooting and mRNA-uncharacterized LOC111241851, Caffeoyl-CoA O-methyltransferase At4g26220 and MORC family CW-type zinc finger protein 3 and zinc finger protein 2B genes for shoot, root, and total dry matter production. Moderate to high broad-sense heritability was exhibited for all phenotypic traits, including seed emergence (81%), adventitious rooting (56%), shoot dry mass (81%), root dry mass (79%) and SPAD chlorophyll content (70%). 
The heritability estimates, marker-trait associations, and identification of sources of waterlogging tolerant germplasm from this study demonstrate high potential for marker-assisted selection of tolerance traits to accelerate breeding of climate-resilient mungbean varieties.

McCartney N, Kondakath G, Tai A, Trimmer BA. Functional annotation of insecta transcriptomes: A cautionary tale from Lepidoptera. Insect Biochem Mol Biol 2024;165:104038. [PMID: 37952902 DOI: 10.1016/j.ibmb.2023.104038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Revised: 10/30/2023] [Accepted: 11/07/2023] [Indexed: 11/14/2023]

Ibtehaz N, Kagaya Y, Kihara D. Domain-PFP allows protein function prediction using function-aware domain embedding representations. Commun Biol 2023;6:1103. [PMID: 37907681 PMCID: PMC10618451 DOI: 10.1038/s42003-023-05476-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Accepted: 10/17/2023] [Indexed: 11/02/2023] Open

Tripathi S, Shirnekhi HK, Gorman SD, Chandra B, Baggett DW, Park CG, Somjee R, Lang B, Hosseini SMH, Pioso BJ, Li Y, Iacobucci I, Gao Q, Edmonson MN, Rice SV, Zhou X, Bollinger J, Mitrea DM, White MR, McGrail DJ, Jarosz DF, Yi SS, Babu MM, Mullighan CG, Zhang J, Sahni N, Kriwacki RW. Defining the condensate landscape of fusion oncoproteins. Nat Commun 2023;14:6008. [PMID: 37770423 PMCID: PMC10539325 DOI: 10.1038/s41467-023-41655-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Accepted: 09/13/2023] [Indexed: 09/30/2023] Open

Affiliation(s)

Swarnendu Tripathi Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
Hazheen K Shirnekhi Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
Scott D Gorman Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA Arrakis Therapeutics, 830 Winter St, Waltham, MA, 02451, USA
Bappaditya Chandra Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
David W Baggett Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
Cheon-Gil Park Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
Ramiz Somjee Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA Rhodes College, Memphis, TN, USA Washington University School of Medicine, 660 South Euclid Avenue, St. Louis, MO, 63110, USA
Benjamin Lang Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA Center of Excellence for Data-Driven Discovery, Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
Seyed Mohammad Hadi Hosseini Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA Center of Excellence for Data-Driven Discovery, Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
Brittany J Pioso Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
Yongsheng Li Livestrong Cancer Institutes, Department of Oncology, Dell Medical School, The University of Texas at Austin, Austin, TX, 78712, USA
Ilaria Iacobucci Department of Pathology, St. Jude Children's Research Hospital, Memphis, TN, USA
Qingsong Gao Department of Pathology, St. Jude Children's Research Hospital, Memphis, TN, USA
Michael N Edmonson Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
Stephen V Rice Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
Xin Zhou Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
John Bollinger Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
Diana M Mitrea Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA Dewpoint Therapeutics, 451 D Street, Suite 104, Boston, MA, 02210, USA
Michael R White Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA IDEXX Laboratories, Inc., One IDEXX Drive, Westbrook, ME, 04092, USA
Daniel J McGrail Center for Immunotherapy and Precision Immuno-Oncology, Cleveland Clinic, Cleveland, OH, USA Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
Daniel F Jarosz Department of Chemical and Systems Biology, Stanford University School of Medicine, Stanford, CA, USA Department of Developmental Biology, Stanford University School of Medicine, Stanford, CA, USA
S Stephen Yi Livestrong Cancer Institutes, Department of Oncology, Dell Medical School, The University of Texas at Austin, Austin, TX, 78712, USA Department of Biomedical Engineering, and Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX, USA
M Madan Babu Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA Center of Excellence for Data-Driven Discovery, Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
Charles G Mullighan Department of Pathology, St. Jude Children's Research Hospital, Memphis, TN, USA
Jinghui Zhang Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
Nidhi Sahni Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA Program in Quantitative and Computational Biosciences, Baylor College of Medicine, Houston, TX, USA
Richard W Kriwacki Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA. Department of Microbiology, Immunology and Biochemistry, University of Tennessee Health Sciences Center, Memphis, TN, USA.

Ibtehaz N, Kagaya Y, Kihara D. Domain-PFP: Protein Function Prediction Using Function-Aware Domain Embedding Representations. bioRxiv 2023:2023.08.23.554486. [PMID: 37662252 PMCID: PMC10473699 DOI: 10.1101/2023.08.23.554486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2023]

Liu H, Zhang Y, Chen J. Whole-genome sequencing and functional annotation of pathogenic Paraconiothyrium brasiliense causing human cellulitis. Hum Genomics 2023;17:65. [PMID: 37461066 DOI: 10.1186/s40246-023-00512-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Accepted: 07/11/2023] [Indexed: 07/20/2023] Open

Dosch J, Bergmann H, Tran V, Ebersberger I. FAS: assessing the similarity between proteins using multi-layered feature architectures. Bioinformatics 2023;39:btad226. [PMID: 37084276 PMCID: PMC10185405 DOI: 10.1093/bioinformatics/btad226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 02/23/2023] [Accepted: 04/13/2023] [Indexed: 04/23/2023] Open

Reijnders MJMF, Waterhouse RM. CrowdGO: Machine learning and semantic similarity guided consensus Gene Ontology annotation. PLoS Comput Biol 2022;18:e1010075. [PMID: 35560159 PMCID: PMC9132264 DOI: 10.1371/journal.pcbi.1010075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2021] [Revised: 05/25/2022] [Accepted: 04/04/2022] [Indexed: 11/29/2022] Open

Abstract

Characterising gene function for the ever-increasing number and diversity of species with annotated genomes relies almost entirely on computational prediction methods. These software are also numerous and diverse, each with different strengths and weaknesses as revealed through community benchmarking efforts. Meta-predictors that assess consensus and conflict from individual algorithms should deliver enhanced functional annotations. To exploit the benefits of meta-approaches, we developed CrowdGO, an open-source consensus-based Gene Ontology (GO) term meta-predictor that employs machine learning models with GO term semantic similarities and information contents. By re-evaluating each gene-term annotation, a consensus dataset is produced with high-scoring confident annotations and low-scoring rejected annotations. Applying CrowdGO to results from a deep learning-based, a sequence similarity-based, and two protein domain-based methods, delivers consensus annotations with improved precision and recall. Furthermore, using standard evaluation measures CrowdGO performance matches that of the community’s best performing individual methods. CrowdGO therefore offers a model-informed approach to leverage strengths of individual predictors and produce comprehensive and accurate gene functional annotations.

New technologies mean that we are able to read the genetic blueprints in the form of complete genome sequences from many different species. We are also able to use computational methods combined with evidence from experiments to map out the locations in the genomes of many thousands of genes and other important regions. However, discovering and characterising the biological functions of all these genes and their protein products requires considerably more experimental work. In order to gain insights into the possible functions of the many genes currently lacking functional information from experiments we must therefore rely on methods that computationally predict protein functions. Many different software tools have been developed to tackle this challenge, each with their own strengths and weaknesses as shown by several community-based competitions that assess the performance of the predictors. Taking advantage of powerful modern machine learning techniques, we developed CrowdGO, a new software that aims to combine predictions from several tools and produce comprehensive and accurate gene functional annotations. CrowdGO is able to computationally assess agreements and conflicts amongst annotations from different predictors to then re-evaluate the results and deliver enhanced predictions of protein functions.

Thuy-Boun PS, Wang AY, Crissien-Martinez A, Xu JH, Chatterjee S, Stupp GS, Su AI, Coyle WJ, Wolan DW. Quantitative metaproteomics and activity-based protein profiling of patient fecal microbiome identifies host and microbial serine-type endopeptidase activity associated with ulcerative colitis. Mol Cell Proteomics 2022;21:100197. [PMID: 35033677 PMCID: PMC8941213 DOI: 10.1016/j.mcpro.2022.100197] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/04/2021] [Revised: 01/10/2022] [Accepted: 01/11/2022] [Indexed: 12/12/2022] Open

Abstract

The gut microbiota plays an important yet incompletely understood role in the induction and propagation of ulcerative colitis (UC). Organism-level efforts to identify UC-associated microbes have revealed the importance of community structure, but less is known about the molecular effectors of disease. We performed 16S rRNA gene sequencing in parallel with label-free data-dependent LC-MS/MS proteomics to characterize the stool microbiomes of healthy (n = 8) and UC (n = 10) patients. Comparisons of taxonomic composition between techniques revealed major differences in community structure partially attributable to the additional detection of host, fungal, viral, and food peptides by metaproteomics. Differential expression analysis of metaproteomic data identified 176 significantly enriched protein groups between healthy and UC patients. Gene ontology analysis revealed several enriched functions with serine-type endopeptidase activity overrepresented in UC patients. Using a biotinylated fluorophosphonate probe and streptavidin-based enrichment, we show that serine endopeptidases are active in patient fecal samples and that additional putative serine hydrolases are detectable by this approach compared with unenriched profiling. Finally, as metaproteomic databases expand, they are expected to asymptotically approach completeness. Using ComPIL and de novo peptide sequencing, we estimate the size of the probable peptide space unidentified (“dark peptidome”) by our large database approach to establish a rough benchmark for database sufficiency. Despite high variability inherent in patient samples, our analysis yielded a catalog of differentially enriched proteins between healthy and UC fecal proteomes. This catalog provides a clinically relevant jumping-off point for further molecular-level studies aimed at identifying the microbial underpinnings of UC.

• Identified 176 significantly altered protein groups between healthy and UC patients.
• Serine-type endopeptidase activity is overrepresented in UC patients.
• Fluorophosphonate ABPP shows that endopeptidases are active in fecal samples.
• ABPP enrichment helps identify additional putative serine hydrolases in samples.
• De novo sequencing used to estimate number of MS2 spectra unidentified by ComPIL.

Xu P, Zhao C, You X, Yang F, Chen J, Ruan Z, Gu R, Xu J, Bian C, Shi Q. Draft Genome of the Mirrorwing Flyingfish (Hirundichthys speculiger). Front Genet 2021;12:695700. [PMID: 34306036 PMCID: PMC8294118 DOI: 10.3389/fgene.2021.695700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2021] [Accepted: 06/03/2021] [Indexed: 12/04/2022] Open

Affiliation(s)

Pengwei Xu College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
Chenxi Zhao College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
Xinxin You College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China; Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, BGI, Shenzhen, China
Fan Yang Marine Geological Department, Marine Geological Survey Institute of Hainan Province, Haikou, China
Jieming Chen College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China; Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, BGI, Shenzhen, China
Zhiqiang Ruan College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China; Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, BGI, Shenzhen, China
Ruobo Gu Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, BGI, Shenzhen, China
Junmin Xu Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, BGI, Shenzhen, China
Chao Bian College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China; Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, BGI, Shenzhen, China
Qiong Shi College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China; Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, BGI, Shenzhen, China

Yang Y, Huang L, Xu C, Qi L, Wu Z, Li J, Chen H, Wu Y, Fu T, Zhu H, Saand MA, Li J, Liu L, Fan H, Zhou H, Qin W. Chromosome-scale genome assembly of areca palm (Areca catechu). Mol Ecol Resour 2021;21:2504-2519. [PMID: 34133844 DOI: 10.1111/1755-0998.13446] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/16/2020] [Revised: 06/08/2021] [Accepted: 06/11/2021] [Indexed: 11/28/2022]

Affiliation(s)

Yaodong Yang Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
Liyun Huang Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
Chunyan Xu BGI Genomics, BGI-Shenzhen, Shenzhen, China
Lan Qi Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
Zhangyan Wu BGI Genomics, BGI-Shenzhen, Shenzhen, China
Jia Li Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
Haixin Chen BGI Genomics, BGI-Shenzhen, Shenzhen, China
Yi Wu Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
Tao Fu BGI Genomics, BGI-Shenzhen, Shenzhen, China
Hui Zhu Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
Mumtaz Ali Saand Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
Jing Li Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
Liyun Liu Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
Haikou Fan Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
Huanqi Zhou Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
Weiquan Qin Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China

Blum M, Chang HY, Chuguransky S, Grego T, Kandasaamy S, Mitchell A, Nuka G, Paysan-Lafosse T, Qureshi M, Raj S, Richardson L, Salazar GA, Williams L, Bork P, Bridge A, Gough J, Haft DH, Letunic I, Marchler-Bauer A, Mi H, Natale DA, Necci M, Orengo CA, Pandurangan AP, Rivoire C, Sigrist CJA, Sillitoe I, Thanki N, Thomas PD, Tosatto SCE, Wu CH, Bateman A, Finn RD. The InterPro protein families and domains database: 20 years on. Nucleic Acids Res 2021;49:D344-D354. [PMID: 33156333 PMCID: PMC7778928 DOI: 10.1093/nar/gkaa977] [Citation(s) in RCA: 1168] [Impact Index Per Article: 389.3] [Received: 09/14/2020] [Revised: 10/08/2020] [Accepted: 10/23/2020] [Indexed: 01/22/2023] Open

Affiliation(s)

Matthias Blum European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
Hsin-Yu Chang European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
Sara Chuguransky European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
Tiago Grego European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
Swaathi Kandasaamy European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
Alex Mitchell European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
Gift Nuka European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
Typhaine Paysan-Lafosse European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
Matloob Qureshi European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
Shriya Raj European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
Lorna Richardson European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
Gustavo A Salazar European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
Lowri Williams European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
Peer Bork European Molecular Biology Laboratory, Structural and Computational Biology Unit, Meyerhofstraße 1, 69117 Heidelberg, Germany
Alan Bridge Swiss-Prot Group, Swiss Institute of Bioinformatics, CMU, 1 rue Michel Servet, CH-1211, Geneva 4, Switzerland
Julian Gough Medical Research Council Laboratory of Molecular Biology, Cambridge Biomedical Campus, Francis Crick Ave, Trumpington, Cambridge CB2 0QH, UK
Daniel H Haft National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda MD 20894 USA
Ivica Letunic Biobyte Solutions GmbH, Bothestr 142, 69126 Heidelberg, Germany
Aron Marchler-Bauer National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda MD 20894 USA
Huaiyu Mi Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, CA 90033, USA
Darren A Natale Protein Information Resource, Georgetown University Medical Center, Washington, DC 20007, USA
Marco Necci Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy
Christine A Orengo Department of Structural and Molecular Biology, University College London, Gower St, Bloomsbury, London WC1E 6BT, UK
Arun P Pandurangan Medical Research Council Laboratory of Molecular Biology, Cambridge Biomedical Campus, Francis Crick Ave, Trumpington, Cambridge CB2 0QH, UK
Catherine Rivoire Swiss-Prot Group, Swiss Institute of Bioinformatics, CMU, 1 rue Michel Servet, CH-1211, Geneva 4, Switzerland
Christian J A Sigrist Swiss-Prot Group, Swiss Institute of Bioinformatics, CMU, 1 rue Michel Servet, CH-1211, Geneva 4, Switzerland
Ian Sillitoe Department of Structural and Molecular Biology, University College London, Gower St, Bloomsbury, London WC1E 6BT, UK
Narmada Thanki National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda MD 20894 USA
Paul D Thomas Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, CA 90033, USA
Silvio C E Tosatto Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy
Cathy H Wu Protein Information Resource, Georgetown University Medical Center, Washington, DC 20007, USA
Alex Bateman European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
Robert D Finn European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK

Mi H, Ebert D, Muruganujan A, Mills C, Albou LP, Mushayamaha T, Thomas PD. PANTHER version 16: a revised family classification, tree-based classification tool, enhancer regions and extensive API. Nucleic Acids Res 2020;49:D394-D403. [PMID: 33290554 PMCID: PMC7778891 DOI: 10.1093/nar/gkaa1106] [Citation(s) in RCA: 787] [Impact Index Per Article: 196.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2020] [Revised: 10/19/2020] [Accepted: 10/28/2020] [Indexed: 01/29/2023] Open

Rath PP, Gourinath S. The actin cytoskeleton orchestra in Entamoeba histolytica. Proteins 2020;88:1361-1375. [PMID: 32506560 DOI: 10.1002/prot.25955] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2020] [Revised: 04/17/2020] [Accepted: 05/27/2020] [Indexed: 12/14/2022]

Wood V, Carbon S, Harris MA, Lock A, Engel SR, Hill DP, Van Auken K, Attrill H, Feuermann M, Gaudet P, Lovering RC, Poux S, Rutherford KM, Mungall CJ. Term Matrix: a novel Gene Ontology annotation quality control system based on ontology term co-annotation patterns. Open Biol 2020;10:200149. [PMID: 32875947 PMCID: PMC7536087 DOI: 10.1098/rsob.200149] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2020] [Accepted: 08/06/2020] [Indexed: 12/11/2022] Open

Affiliation(s)

Valerie Wood Cambridge Systems Biology Centre, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK
Seth Carbon Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
Midori A. Harris Cambridge Systems Biology Centre, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK
Antonia Lock Department of Genetics, Evolution and Environment, University College London, London WC1E 6B, UK
Stacia R. Engel Department of Genetics, Stanford University, Palo Alto, CA 94304-5477, USA
David P. Hill Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA
Kimberly Van Auken Division of Biology and Biological Engineering, California Institute of Technology, 1200 East California Boulevard, Pasadena, CA 91125, USA
Helen Attrill Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, UK
Marc Feuermann Swiss Institute of Bioinformatics, 1 Michel-Servet, 1204 Geneva, Switzerland
Pascale Gaudet Swiss Institute of Bioinformatics, 1 Michel-Servet, 1204 Geneva, Switzerland
Ruth C. Lovering Functional Gene Annotation, Preclinical and Fundamental Science, Institute of Cardiovascular Science, University College London, London WC1E 6JF, UK
Sylvain Poux Swiss Institute of Bioinformatics, 1 Michel-Servet, 1204 Geneva, Switzerland
Kim M. Rutherford Cambridge Systems Biology Centre, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK
Christopher J. Mungall Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA

Koo DCE, Bonneau R. Towards region-specific propagation of protein functions. Bioinformatics 2020;35:1737-1744. [PMID: 30304483 PMCID: PMC6513163 DOI: 10.1093/bioinformatics/bty834] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2018] [Revised: 08/23/2018] [Accepted: 10/08/2018] [Indexed: 01/06/2023] Open

Kishore R, Arnaboldi V, Van Slyke CE, Chan J, Nash RS, Urbano JM, Dolan ME, Engel SR, Shimoyama M, Sternberg PW, The Alliance of Genome Resources. Automated generation of gene summaries at the Alliance of Genome Resources. Database (Oxford) 2020;2020:baaa037. [PMID: 32559296 PMCID: PMC7304461 DOI: 10.1093/database/baaa037] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2020] [Revised: 04/06/2020] [Accepted: 04/29/2020] [Indexed: 12/28/2022]

Abstract

Short paragraphs that describe gene function, referred to as gene summaries, are valued by users of biological knowledgebases for the ease with which they convey key aspects of gene function. Manual curation of gene summaries, while desirable, is difficult for knowledgebases to sustain. We developed an algorithm that uses curated, structured gene data at the Alliance of Genome Resources (Alliance; www.alliancegenome.org) to automatically generate gene summaries that simulate natural language. The gene data used for this purpose include curated associations (annotations) to ontology terms from the Gene Ontology, Disease Ontology, model organism knowledgebase (MOK)-specific anatomy ontologies and Alliance orthology data. The method uses sentence templates for each data category included in the gene summary in order to build a natural language sentence from the list of terms associated with each gene. To improve readability of the summaries when numerous gene annotations are present, we developed a new algorithm that traverses ontology graphs in order to group terms by their common ancestors. The algorithm optimizes the coverage of the initial set of terms and limits the length of the final summary, using measures of information content of each ontology term as a criterion for inclusion in the summary. The automated gene summaries are generated with each Alliance release, ensuring that they reflect current data at the Alliance. Our method effectively leverages category-specific curation efforts of the Alliance member databases to create modular, structured and standardized gene summaries for seven member species of the Alliance. These automatically generated gene summaries make cross-species gene function comparisons tenable and increase discoverability of potential models of human disease. In addition to being displayed on Alliance gene pages, these summaries are also included on several MOK gene pages.

Tang H, Finn RD, Thomas PD. TreeGrafter: phylogenetic tree-based annotation of proteins with Gene Ontology terms and other annotations. Bioinformatics 2019;35:518-520. [PMID: 30032202 PMCID: PMC6361231 DOI: 10.1093/bioinformatics/bty625] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2018] [Accepted: 07/18/2018] [Indexed: 11/13/2022] Open

Shim JE, Kim JH, Shin J, Lee JE, Lee I. Pathway-specific protein domains are predictive for human diseases. PLoS Comput Biol 2019;15:e1007052. [PMID: 31075101 PMCID: PMC6530867 DOI: 10.1371/journal.pcbi.1007052] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2018] [Revised: 05/22/2019] [Accepted: 04/19/2019] [Indexed: 01/04/2023] Open

Abstract

Protein domains are basic functional units of proteins. Many protein domains are pervasive among diverse biological processes, yet some are associated with specific pathways. Human complex diseases are generally viewed as pathway-level disorders. Therefore, we hypothesized that pathway-specific domains could be highly informative for human diseases. To test the hypothesis, we developed a network-based scoring scheme to quantify specificity of domain-pathway associations. We first generated domain profiles for human proteins, then constructed a co-pathway protein network based on the associations between domain profiles. Based on the score, we classified human protein domains into pathway-specific domains (PSDs) and non-specific domains (NSDs). We found that PSDs contained more pathogenic variants than NSDs. PSDs were also enriched for disease-associated mutations that disrupt protein-protein interactions (PPIs) and tend to have a moderate number of domain interactions. These results suggest that mutations in PSDs are likely to disrupt within-pathway PPIs, resulting in functional failure of pathways. Finally, we demonstrated the prediction capacity of PSDs for disease-associated genes with experimental validations in zebrafish. Taken together, the network-based quantitative method of modeling domain-pathway associations presented herein suggested underlying mechanisms of how protein domains associated with specific pathways influence mutational impacts on diseases via perturbations in within-pathway PPIs, and provided a novel genomic feature for interpreting genetic variants to facilitate the discovery of human disease genes.

Protein domains are basic functional units of proteins, yet domain-based pathway annotations for proteins are challenging tasks because many domains are pervasive among diverse pathways. Therefore, we developed a network-based scoring scheme to measure pathway specificity of domains, and then used it to identify pathway-specific domains. Surprisingly, we observed substantially more disease mutations in pathway-specific domains than in non-specific domains. We found evidence that mutations of pathway-specific domains tend to perturb pathway integrity by disrupting within-pathway protein-protein interactions. We also demonstrated the prediction capacity of pathway-specific domains for complex diseases with experimental validations. Our study demonstrated the usefulness of pathway information for protein domains in interpreting the non-random distribution of disease mutations among domains and in identifying disease genes and variants.

Liu W, Cai Y, He P, Chen L, Bian Y. Comparative transcriptomics reveals potential genes involved in the vegetative growth of Morchella importuna. 3 Biotech 2019;9:81. [PMID: 30800592 PMCID: PMC6374242 DOI: 10.1007/s13205-019-1614-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2018] [Accepted: 02/02/2019] [Indexed: 12/16/2022] Open

Dhar D, Dey D, Basu S. Insights into the evolution of extracellular leucine-rich repeats in metazoans with special reference to Toll-like receptor 4. J Biosci 2019;44:18. [PMID: 30837369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]

Insights into the evolution of extracellular leucine-rich repeats in metazoans with special reference to Toll-like receptor 4. J Biosci 2019. [DOI: 10.1007/s12038-018-9821-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]

Garapati PV, Zhang J, Rey AJ, Marygold SJ. Towards comprehensive annotation of Drosophila melanogaster enzymes in FlyBase. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2019;2019:5298334. [PMID: 30689844 PMCID: PMC6343044 DOI: 10.1093/database/bay144] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/31/2018] [Accepted: 12/18/2018] [Indexed: 11/13/2022]

Fey P, Dodson RJ, Basu S, Hartline EC, Chisholm RL. dictyBase and the Dicty Stock Center (version 2.0) - a progress report. THE INTERNATIONAL JOURNAL OF DEVELOPMENTAL BIOLOGY 2019;63:563-572. [PMID: 31840793 PMCID: PMC7409682 DOI: 10.1387/ijdb.190226pf] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]

Gurdeep Singh R, Tanca A, Palomba A, Van der Jeugt F, Verschaffelt P, Uzzau S, Martens L, Dawyndt P, Mesuere B. Unipept 4.0: Functional Analysis of Metaproteome Data. J Proteome Res 2018;18:606-615. [PMID: 30465426 DOI: 10.1021/acs.jproteome.8b00716] [Citation(s) in RCA: 83] [Impact Index Per Article: 13.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/08/2023]

Zeng C, Zhan W, Deng L. SDADB: a functional annotation database of protein structural domains. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2018;2018:5046758. [PMID: 29961821 PMCID: PMC6025185 DOI: 10.1093/database/bay064] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/11/2017] [Accepted: 06/04/2018] [Indexed: 12/27/2022]

Swapna LS, Molinaro AM, Lindsay-Mosher N, Pearson BJ, Parkinson J. Comparative transcriptomic analyses and single-cell RNA sequencing of the freshwater planarian Schmidtea mediterranea identify major cell types and pathway conservation. Genome Biol 2018;19:124. [PMID: 30143032 PMCID: PMC6109357 DOI: 10.1186/s13059-018-1498-x] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2018] [Accepted: 08/01/2018] [Indexed: 12/15/2022] Open

Xiao Y, Xu P, Fan H, Baudouin L, Xia W, Bocs S, Xu J, Li Q, Guo A, Zhou L, Li J, Wu Y, Ma Z, Armero A, Issali AE, Liu N, Peng M, Yang Y. The genome draft of coconut (Cocos nucifera). Gigascience 2018;6:1-11. [PMID: 29048487 PMCID: PMC5714197 DOI: 10.1093/gigascience/gix095] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2017] [Accepted: 09/28/2017] [Indexed: 12/02/2022] Open

Abstract

Coconut palm (Cocos nucifera, 2n = 32), a member of genus Cocos and family Arecaceae (Palmaceae), is an important tropical fruit and oil crop. Currently, coconut palm is cultivated in 93 countries, including Central and South America, East and West Africa, Southeast Asia and the Pacific Islands, with a total growth area of more than 12 million hectares [1]. Coconut palm is generally classified into 2 main categories: “Tall” (flowering 8–10 years after planting) and “Dwarf” (flowering 4–6 years after planting), based on morphological characteristics and breeding habits. This Palmae species has a long growth period before reproductive years, which hinders conventional breeding progress. In spite of initial successes, improvements made by conventional breeding have been very slow. In the present study, we obtained de novo sequences of the Cocos nucifera genome: a major genomic resource that could be used to facilitate molecular breeding in Cocos nucifera and accelerate the breeding process in this important crop. A total of 419.67 gigabases (Gb) of raw reads were generated by the Illumina HiSeq 2000 platform using a series of paired-end and mate-pair libraries, covering the predicted Cocos nucifera genome length (2.42 Gb, variety “Hainan Tall”) to an estimated ×173.32 read depth. A total scaffold length of 2.20 Gb was generated (N50 = 418 Kb), representing 90.91% of the genome. The coconut genome was predicted to harbor 28 039 protein-coding genes, which is fewer than in Phoenix dactylifera (PDK30: 28 889), Phoenix dactylifera (DPV01: 41 660), and Elaeis guineensis (EG5: 34 802). BUSCO evaluation demonstrated that the obtained scaffold sequences covered 90.8% of the coconut genome and that the genome annotation was 74.1% complete. Genome annotation results revealed that 72.75% of the coconut genome consisted of transposable elements, of which long terminal repeat retrotransposon elements (LTRs) accounted for the largest proportion (92.23%).
Comparative analysis of the antiporter gene family and ion channel gene families between C. nucifera and Arabidopsis thaliana indicated that significant gene expansion may have occurred in the coconut involving Na⁺/H⁺ antiporter, carnitine/acylcarnitine translocase, potassium-dependent sodium-calcium exchanger, and potassium channel genes. Despite its agronomic importance, C. nucifera is still under-studied. In this report, we present a draft genome of C. nucifera and provide genomic information that will facilitate future functional genomics and molecular-assisted breeding in this crop species.

Affiliation(s)

Yong Xiao Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Av. Wenqing No. 496, Wenchang, Hainan 571339, P. R. China
Pengwei Xu BGI Genomics, BGI-Shenzhen, Building NO.7, BGI Park, No. 21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
Haikuo Fan Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Av. Wenqing No. 496, Wenchang, Hainan 571339, P. R. China
Luc Baudouin AGAP, Université de Montpellier, CIRAD, INRA, Montpellier Supagro, F-34398, Montpellier, France; CIRAD, UMR AGAP, F-34398, Montpellier, France
Wei Xia Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Av. Wenqing No. 496, Wenchang, Hainan 571339, P. R. China
Stéphanie Bocs AGAP, Université de Montpellier, CIRAD, INRA, Montpellier Supagro, F-34398, Montpellier, France; CIRAD, UMR AGAP, F-34398, Montpellier, France
Junyang Xu BGI Genomics, BGI-Shenzhen, Building NO.7, BGI Park, No. 21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
Qiong Li Institute of Tropical Bioscience and Biotechnology, Chinese Academy of Tropical Agricultural Science, Rd. Xueyuan No. 4, Haikou, Hainan 571101, P. R. China
Anping Guo Institute of Tropical Bioscience and Biotechnology, Chinese Academy of Tropical Agricultural Science, Rd. Xueyuan No. 4, Haikou, Hainan 571101, P. R. China
Lixia Zhou Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Av. Wenqing No. 496, Wenchang, Hainan 571339, P. R. China
Jing Li Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Av. Wenqing No. 496, Wenchang, Hainan 571339, P. R. China
Yi Wu Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Av. Wenqing No. 496, Wenchang, Hainan 571339, P. R. China
Zilong Ma Institute of Tropical Bioscience and Biotechnology, Chinese Academy of Tropical Agricultural Science, Rd. Xueyuan No. 4, Haikou, Hainan 571101, P. R. China
Alix Armero AGAP, Université de Montpellier, CIRAD, INRA, Montpellier Supagro, F-34398, Montpellier, France; Montpellier Supagro, UMR AGAP, F-34398, Montpellier, France
Auguste Emmanuel Issali Station Cocotier Marc Delorme, Centre National De Recherche Agronomique (CNRA) 07 B.P. 13, Port Bouet, Côte d'Ivoire
Na Liu BGI Genomics, BGI-Shenzhen, Building NO.7, BGI Park, No. 21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
Ming Peng Institute of Tropical Bioscience and Biotechnology, Chinese Academy of Tropical Agricultural Science, Rd. Xueyuan No. 4, Haikou, Hainan 571101, P. R. China
Yaodong Yang Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Av. Wenqing No. 496, Wenchang, Hainan 571339, P. R. China

Song H, Lin K, Hu J, Pang E. An Updated Functional Annotation of Protein-Coding Genes in the Cucumber Genome. FRONTIERS IN PLANT SCIENCE 2018;9:325. [PMID: 29599790 PMCID: PMC5863696 DOI: 10.3389/fpls.2018.00325] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 10/18/2017] [Accepted: 02/27/2018] [Indexed: 06/08/2023]

Abstract

Background: Although the cucumber reference genome and its annotation were published several years ago, the functional annotation of predicted genes, particularly protein-coding genes, still requires further improvement. In general, accurately determining orthologous relationships between genes allows for better and more robust functional assignments of predicted genes. As one of the most reliable strategies, the determination of collinearity information may facilitate reliable orthology inferences among genes from multiple related genomes. Currently, the identification of collinear segments has mainly been based on conservation of gene order and orientation. Over the course of plant genome evolution, various evolutionary events have disrupted or distorted the order of genes along chromosomes, making it difficult to use those genes as genome-wide markers for plant genome comparisons. Results: Using the localized LASTZ/MULTIZ analysis pipeline, we aligned 15 genomes, including cucumber and other related angiosperm plants, and identified a set of genomic segments that are short in length, stable in structure, uniform in distribution and highly conserved across all 15 plants. Compared with protein-coding genes, these conserved segments were more suitable for use as genomic markers for detecting collinear segments among distantly divergent plants. Guided by this set of identified collinear genomic segments, we inferred 94,486 orthologous protein-coding gene pairs (OPPs) between cucumber and 14 other angiosperm species, which were used as proxies for transferring functional terms to cucumber genes from the annotations of the other 14 genomes. In total, 10,885 protein-coding genes were assigned Gene Ontology (GO) terms, which was nearly 1,300 more than the results collected in the UniProt proteomic database. Our results showed that annotation accuracy would be improved compared with other existing approaches.
Conclusions: In this study, we provided an alternative resource for the functional annotation of predicted cucumber protein-coding genes, which we expect will be beneficial for the cucumber's biological study, accessible from http://cmb.bnu.edu.cn/functional_annotation. Meanwhile, using the cucumber reference genome as a case study, we presented an efficient strategy for transferring gene functional information from previously well-characterized protein-coding genes in model species to newly sequenced or "non-model" plant species.

Huerta-Cepas J, Forslund K, Coelho LP, Szklarczyk D, Jensen LJ, von Mering C, Bork P. Fast Genome-Wide Functional Annotation through Orthology Assignment by eggNOG-Mapper. Mol Biol Evol 2018;34:2115-2122. [PMID: 28460117 PMCID: PMC5850834 DOI: 10.1093/molbev/msx148] [Citation(s) in RCA: 1619] [Impact Index Per Article: 269.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022] Open

Grove C, Cain S, Chen WJ, Davis P, Harris T, Howe KL, Kishore R, Lee R, Paulini M, Raciti D, Tuli MA, Van Auken K, Williams G. Using WormBase: A Genome Biology Resource for Caenorhabditis elegans and Related Nematodes. Methods Mol Biol 2018;1757:399-470. [PMID: 29761466 DOI: 10.1007/978-1-4939-7737-6_14] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/17/2023]

Zhang L, Xu P, Cai Y, Ma L, Li S, Li S, Xie W, Song J, Peng L, Yan H, Zou L, Ma Y, Zhang C, Gao Q, Wang J. The draft genome assembly of Rhododendron delavayi Franch. var. delavayi. Gigascience 2017;6:1-11. [PMID: 29020749 PMCID: PMC5632301 DOI: 10.1093/gigascience/gix076] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2017] [Revised: 05/28/2017] [Accepted: 08/04/2017] [Indexed: 01/16/2023] Open

Abstract

Rhododendron delavayi Franch. is globally famous as an ornamental plant. Its distribution in southwest China covers several different habitats and environments. However, not much research had been conducted on Rhododendron spp. at the molecular level, which hinders understanding of its evolution, speciation, and synthesis of secondary metabolites, as well as its wide adaptability to different environments. Here, we report the genome assembly and gene annotation of R. delavayi var. delavayi (the second genome sequenced in the Ericaceae), which will facilitate the study of the family. The genome assembly will have further applications in genome-assisted cultivar breeding. The final size of the assembled R. delavayi var. delavayi genome (695.09 Mb) was close to the 697.94 Mb, estimated by k-mer analysis. A total of 336.83 gigabases (Gb) of raw Illumina HiSeq 2000 reads were generated from 9 libraries (with insert sizes ranging from 170 bp to 40 kb), achieving a raw sequencing depth of ×482.6. After quality filtering, 246.06 Gb of clean reads were obtained, giving ×352.55 coverage depth. Assembly using Platanus gave a total scaffold length of 695.09 Mb, with a contig N50 of 61.8 kb and a scaffold N50 of 637.83 kb. Gene prediction resulted in the annotation of 32 938 protein-coding genes. The genome completeness was evaluated by CEGMA and BUSCO and reached 95.97% and 92.8%, respectively. The gene annotation completeness was also evaluated by CEGMA and BUSCO and reached 97.01% and 87.4%, respectively. Genome annotation revealed that 51.77% of the R. delavayi genome is composed of transposable elements, and 37.48% of long terminal repeat elements (LTRs). The de novo assembled genome of R. delavayi var. delavayi (hereinafter referred to as R. delavayi) is the second genomic resource of the family Ericaceae and will provide a valuable resource for research on future comparative genomic studies in Rhododendron species.
The availability of the R. delavayi genome sequence will hopefully provide a tool for scientists to tackle open questions regarding molecular mechanisms underlying environmental interactions in the genus Rhododendron, more accurately understand the evolutionary processes and systematics of the genus, facilitate the identification of genes encoding pharmaceutically important compounds, and accelerate molecular breeding to release elite varieties.

Affiliation(s)

Lu Zhang Flower Research Institute of Yunnan Academy of Agricultural Sciences, National Engineering Research Center For Ornamental Horticulture, No. 2238 Beijing Road, Panlong District, Kunming 650205, China Key Lab of Yunnan Flower Breeding, No. 2238 Beijing Road, Panlong District, Kunming 650205, China
Pengwei Xu BGI-Shenzhen, BGI Park, No. 21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
Yanfei Cai Flower Research Institute of Yunnan Academy of Agricultural Sciences, National Engineering Research Center For Ornamental Horticulture, No. 2238 Beijing Road, Panlong District, Kunming 650205, China Key Lab of Yunnan Flower Breeding, No. 2238 Beijing Road, Panlong District, Kunming 650205, China
Lulin Ma Flower Research Institute of Yunnan Academy of Agricultural Sciences, National Engineering Research Center For Ornamental Horticulture, No. 2238 Beijing Road, Panlong District, Kunming 650205, China Key Lab of Yunnan Flower Breeding, No. 2238 Beijing Road, Panlong District, Kunming 650205, China
Shifeng Li Flower Research Institute of Yunnan Academy of Agricultural Sciences, National Engineering Research Center For Ornamental Horticulture, No. 2238 Beijing Road, Panlong District, Kunming 650205, China Key Lab of Yunnan Flower Breeding, No. 2238 Beijing Road, Panlong District, Kunming 650205, China
Shufa Li Flower Research Institute of Yunnan Academy of Agricultural Sciences, National Engineering Research Center For Ornamental Horticulture, No. 2238 Beijing Road, Panlong District, Kunming 650205, China Key Lab of Yunnan Flower Breeding, No. 2238 Beijing Road, Panlong District, Kunming 650205, China
Weijia Xie Flower Research Institute of Yunnan Academy of Agricultural Sciences, National Engineering Research Center For Ornamental Horticulture, No. 2238 Beijing Road, Panlong District, Kunming 650205, China Key Lab of Yunnan Flower Breeding, No. 2238 Beijing Road, Panlong District, Kunming 650205, China
Jie Song Flower Research Institute of Yunnan Academy of Agricultural Sciences, National Engineering Research Center For Ornamental Horticulture, No. 2238 Beijing Road, Panlong District, Kunming 650205, China Key Lab of Yunnan Flower Breeding, No. 2238 Beijing Road, Panlong District, Kunming 650205, China
Lvchun Peng Flower Research Institute of Yunnan Academy of Agricultural Sciences, National Engineering Research Center For Ornamental Horticulture, No. 2238 Beijing Road, Panlong District, Kunming 650205, China Key Lab of Yunnan Flower Breeding, No. 2238 Beijing Road, Panlong District, Kunming 650205, China
Huijun Yan Flower Research Institute of Yunnan Academy of Agricultural Sciences, National Engineering Research Center For Ornamental Horticulture, No. 2238 Beijing Road, Panlong District, Kunming 650205, China Key Lab of Yunnan Flower Breeding, No. 2238 Beijing Road, Panlong District, Kunming 650205, China
Ling Zou Flower Research Institute of Yunnan Academy of Agricultural Sciences, National Engineering Research Center For Ornamental Horticulture, No. 2238 Beijing Road, Panlong District, Kunming 650205, China Key Lab of Yunnan Flower Breeding, No. 2238 Beijing Road, Panlong District, Kunming 650205, China
Yongpeng Ma Kunming Botanical Garden, Kunming Institute of Botany, Chinese Academy of Sciences, No. 132 Lanhei Road, Panlong District, Kunming, Yunnan 650201, China
Chengjun Zhang Germplasm Bank of Wild species, Kunming Institute of Botany, Chinese Academy of Sciences, No. 132 Lanhei Road, Panlong District, Kunming, Yunnan 650201, China
Qiang Gao BGI-Shenzhen, BGI Park, No. 21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
Jihua Wang Flower Research Institute of Yunnan Academy of Agricultural Sciences, National Engineering Research Center For Ornamental Horticulture, No. 2238 Beijing Road, Panlong District, Kunming 650205, China Key Lab of Yunnan Flower Breeding, No. 2238 Beijing Road, Panlong District, Kunming 650205, China

Huerta-Cepas J, Forslund K, Coelho LP, Szklarczyk D, Jensen LJ, von Mering C, Bork P. Fast Genome-Wide Functional Annotation through Orthology Assignment by eggNOG-Mapper. Mol Biol Evol 2017. [PMID: 28460117 DOI: 10.1093/molbev/msx148] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022] Open

Hulo C, Masson P, Toussaint A, Osumi-Sutherland D, de Castro E, Auchincloss AH, Poux S, Bougueleret L, Xenarios I, Le Mercier P. Bacterial Virus Ontology; Coordinating across Databases. Viruses 2017;9:E126. [PMID: 28545254 PMCID: PMC5490803 DOI: 10.3390/v9060126] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 04/13/2017] [Revised: 05/16/2017] [Accepted: 05/17/2017] [Indexed: 12/29/2022] Open

Oppenheim SJ, Rosenfeld JA, DeSalle R. Genome content analysis yields new insights into the relationship between the human malaria parasite Plasmodium falciparum and its anopheline vectors. BMC Genomics 2017;18:205. [PMID: 28241792 PMCID: PMC5327517 DOI: 10.1186/s12864-017-3590-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/17/2016] [Accepted: 02/13/2017] [Indexed: 11/24/2022] Open

Abstract

Background

The persistent and growing gap between the availability of sequenced genomes and the ability to assign functions to sequenced genes led us to explore ways to maximize the information content of automated annotation for studies of anopheline mosquitos. Specifically, we use genome content analysis of a large number of previously sequenced anopheline mosquitos to follow the loss and gain of protein families over the evolutionary history of this group.

The importance of this endeavor lies in the potential for comparative genomic studies between Anopheles and closely related non-vector species to reveal ancestral genome content dynamics involved in vector competence. In addition, comparisons within Anopheles could identify genome content changes responsible for variation in the vectorial capacity of this family of important parasite vectors.

Results

The competence and capacity of P. falciparum vectors do not appear to be phylogenetically constrained within the Anophelinae. Instead, using ancestral reconstruction methods, we suggest that a previously unexamined component of vector biology, anopheline nucleotide metabolism, may contribute to the unique status of anophelines as P. falciparum vectors. While the fitness effects of nucleotide co-option by P. falciparum parasites on their anopheline hosts are not yet known, our results suggest that anopheline genome content may be responding to selection pressure from P. falciparum. Whether this response is defensive, in an attempt to redress improper nucleotide balance resulting from P. falciparum infection, or perhaps symbiotic, resulting from an as-yet-unknown mutualism between anophelines and P. falciparum, is an open question that deserves further study.

Conclusions

Clearly, there is a wealth of functional information to be gained from detailed manual genome annotation, yet the rapid increase in the number of available sequences means that most researchers will not have the time or resources to manually annotate all the sequence data they generate. We believe that efforts to maximize the amount of information obtained from automated annotation can help address the functional annotation deficit that most evolutionary biologists now face, and here demonstrate the value of such an approach.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-017-3590-0) contains supplementary material, which is available to authorized users.

Cozzetto D, Jones DT. Computational Methods for Annotation Transfers from Sequence. Methods Mol Biol 2017;1446:55-67. [PMID: 27812935 DOI: 10.1007/978-1-4939-3743-1_5] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Indexed: 06/06/2023]

Gaudet P, Škunca N, Hu JC, Dessimoz C. Primer on the Gene Ontology. Methods Mol Biol 2017;1446:25-37. [PMID: 27812933 DOI: 10.1007/978-1-4939-3743-1_3] [Citation(s) in RCA: 61] [Impact Index Per Article: 8.7] [Indexed: 12/03/2022]

Gaudet P, Dessimoz C. Gene Ontology: Pitfalls, Biases, and Remedies. Methods Mol Biol 2017;1446:189-205. [PMID: 27812944 DOI: 10.1007/978-1-4939-3743-1_14] [Citation(s) in RCA: 77] [Impact Index Per Article: 11.0] [Indexed: 12/12/2022]

Feuermann M, Gaudet P, Mi H, Lewis SE, Thomas PD. Large-scale inference of gene function through phylogenetic annotation of Gene Ontology terms: case study of the apoptosis and autophagy cellular processes. Database (Oxford) 2016;2016:baw155. [PMID: 28025345 PMCID: PMC5199145 DOI: 10.1093/database/baw155] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 07/26/2016] [Revised: 10/10/2016] [Accepted: 11/01/2016] [Indexed: 01/30/2023]

Falda M, Lavezzo E, Fontana P, Bianco L, Berselli M, Formentin E, Toppo S. Eliciting the Functional Taxonomy from protein annotations and taxa. Sci Rep 2016;6:31971. [PMID: 27534507 PMCID: PMC4989186 DOI: 10.1038/srep31971] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/12/2016] [Accepted: 08/01/2016] [Indexed: 11/30/2022] Open

Shim JE, Lee I. Weighted mutual information analysis substantially improves domain-based functional network models. Bioinformatics 2016;32:2824-30. [PMID: 27207946 PMCID: PMC5018372 DOI: 10.1093/bioinformatics/btw320] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 01/12/2016] [Accepted: 05/16/2016] [Indexed: 11/30/2022] Open

Berardini TZ, Reiser L, Li D, Mezheritsky Y, Muller R, Strait E, Huala E. The Arabidopsis information resource: Making and mining the "gold standard" annotated reference plant genome. Genesis 2015;53:474-85. [PMID: 26201819 PMCID: PMC4545719 DOI: 10.1002/dvg.22877] [Citation(s) in RCA: 640] [Impact Index Per Article: 71.1] [Received: 04/08/2015] [Revised: 07/15/2015] [Accepted: 07/15/2015] [Indexed: 11/09/2022]

Drabkin HJ, Christie KR, Dolan ME, Hill DP, Ni L, Sitnikov D, Blake JA. Application of comparative biology in GO functional annotation: the mouse model. Mamm Genome 2015;26:574-83. [PMID: 26141960 PMCID: PMC4602061 DOI: 10.1007/s00335-015-9580-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 05/15/2015] [Accepted: 06/23/2015] [Indexed: 01/22/2023]

Ito A, Ohkawa T. A method of searching for related literature on protein structure analysis by considering a user's intention. BMC Bioinformatics 2015;16 Suppl 7:S4. [PMID: 25952498 PMCID: PMC4423583 DOI: 10.1186/1471-2105-16-s7-s4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open

Huntley RP, Sawford T, Mutowo-Meullenet P, Shypitsyna A, Bonilla C, Martin MJ, O'Donovan C. The GOA database: Gene Ontology annotation updates for 2015. Nucleic Acids Res 2014;43:D1057-63. [PMID: 25378336 PMCID: PMC4383930 DOI: 10.1093/nar/gku1113] [Citation(s) in RCA: 381] [Impact Index Per Article: 38.1] [Indexed: 01/01/2023] Open

Fang H. dcGOR: an R package for analysing ontologies and protein domain annotations. PLoS Comput Biol 2014;10:e1003929. [PMID: 25356683 PMCID: PMC4214615 DOI: 10.1371/journal.pcbi.1003929] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 08/15/2014] [Accepted: 09/21/2014] [Indexed: 01/08/2023] Open