1. Wasti QZ, Sabar MF, Farooq A, Khan MU. Stepping towards pollen DNA metabarcoding: A breakthrough in forensic sciences. Forensic Sci Med Pathol 2023. [PMID: 38147285 DOI: 10.1007/s12024-023-00770-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/20/2023] [Indexed: 12/27/2023]
Abstract
This review assesses the capability of plant pollen as a significant source of evidence for linking suspects to crime locations in forensic science. Research and review articles were collected from Google Scholar, the Web of Science, and PubMed. Articles were searched using specific keywords such as "Forensic Palynology," "Pollen metabarcoding," "Plant forensics," and "Pollen" AND "criminal investigation." Boolean logic was also used to narrow the set of articles included in this review. The literature and exploratory research indicate that, with advances in technology, forensic palynology has been applied to associate crime scenes with individuals suspected of having a link to them, as pollen DNA is a long-lasting investigative tool that can effectively help forensic investigations. Moreover, the literature shows that the DNA of pollen and spores has helped forensic scientists link suspects to crime scenes, and the introduction of pollen DNA metabarcoding tools has eased the efforts of palynologists to analyze pollen DNA. The introduction of DNA metabarcoding techniques to analyze pollen from plants has helped identify the geological locations of the plants and ultimately identify the culprit.
Affiliation(s)
- Qandeel Zaineb Wasti
- Centre for Applied Molecular Biology, University of the Punjab, Lahore, Pakistan
- Abeera Farooq
- Punjab University College of Pharmacy, University of the Punjab, Lahore, Pakistan
- Muhammad Umer Khan
- Institute of Molecular Biology and Biotechnology, The University of Lahore, Lahore, Pakistan.
2. Lu LY, Ou JH, Hui RCY, Chuang YH, Fan YC, Sun PL. High Diversity of Fusarium Species in Onychomycosis: Clinical Presentations, Molecular Identification, and Antifungal Susceptibility. J Fungi (Basel) 2023; 9:jof9050534. [PMID: 37233245 DOI: 10.3390/jof9050534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Revised: 04/10/2023] [Accepted: 04/27/2023] [Indexed: 05/27/2023]
Abstract
Fusarium are uncommon but important pathogenic organisms; they cause non-dermatophyte mould (NDM) onychomycosis. Patients typically respond poorly to treatment owing to Fusarium's native resistance to multiple antifungal drugs. However, epidemiological data for Fusarium onychomycosis are lacking in Taiwan. We retrospectively reviewed the data of 84 patients with positive Fusarium nail sample cultures at Chang Gung Memorial Hospital, Linkou Branch between 2014 and 2020. We aimed to investigate the clinical presentations, microscopic and pathological characteristics, antifungal susceptibility, and species diversity of Fusarium in patients with Fusarium onychomycosis. We enrolled 29 patients using the six-parameter criteria for NDM onychomycosis to determine the clinical significance of Fusarium in these patients. All isolates were subjected to species identification by sequences and molecular phylogeny. A total of 47 Fusarium strains belonging to 13 species in four different Fusarium species complexes (with Fusarium keratoplasticum predominating) were isolated from 29 patients. Six types of histopathology findings were specific to Fusarium onychomycosis, which may be useful for differentiating dermatophytes from NDMs. The results of drug susceptibility testing showed high variation among species complexes, and efinaconazole, lanoconazole, and luliconazole showed excellent in vitro activity for the most part. This study's primary limitation was its single-centre retrospective design. Our study showed a high diversity of Fusarium species in diseased nails. Fusarium onychomycosis has clinical and pathological features distinct from those of dermatophyte onychomycosis. Thus, careful diagnosis and proper pathogen identification are essential in the management of NDM onychomycosis caused by Fusarium sp.
Affiliation(s)
- Lai-Ying Lu
- Department of Dermatology, Chang Gung Memorial Hospital, Linkou Branch, Taoyuan 333423, Taiwan
- College of Medicine, Chang Gung University, Taoyuan 333323, Taiwan
- Department of Dermatology and Aesthetic Medicine Center, Jen-Ai Hospital, Taichung 412224, Taiwan
- Jie-Hao Ou
- Department of Plant Pathology, National Chung Hsing University, Taichung 402202, Taiwan
- Rosaline Chung-Yee Hui
- Department of Dermatology, Chang Gung Memorial Hospital, Linkou Branch, Taoyuan 333423, Taiwan
- Ya-Hui Chuang
- Department of Dermatology, Chang Gung Memorial Hospital, Linkou Branch, Taoyuan 333423, Taiwan
- Yun-Chen Fan
- Department of Plant Pathology, National Chung Hsing University, Taichung 402202, Taiwan
- Pei-Lun Sun
- Department of Dermatology, Chang Gung Memorial Hospital, Linkou Branch, Taoyuan 333423, Taiwan
- College of Medicine, Chang Gung University, Taoyuan 333323, Taiwan
- Research Laboratory of Medical Mycology, Chang Gung Memorial Hospital, Linkou Branch, Taoyuan 33323, Taiwan
3. Zuluaga DL, Blanco E, Mangini G, Sonnante G, Curci PL. A Survey of the Transcriptomic Resources in Durum Wheat: Stress Responses, Data Integration and Exploitation. Plants (Basel, Switzerland) 2023; 12:1267. [PMID: 36986956 PMCID: PMC10056183 DOI: 10.3390/plants12061267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Revised: 02/28/2023] [Accepted: 03/04/2023] [Indexed: 06/19/2023]
Abstract
Durum wheat (Triticum turgidum subsp. durum (Desf.) Husn.) is an allotetraploid cereal crop of worldwide importance, given its use for making pasta, couscous, and bulgur. Under climate change scenarios, abiotic (e.g., high and low temperatures, salinity, drought) and biotic (mainly exemplified by fungal pathogens) stresses represent a significant limit for durum cultivation because they can severely affect yield and grain quality. The advent of next-generation sequencing technologies has brought a huge development in transcriptomic resources with many relevant datasets now available for durum wheat, at various anatomical levels, also focusing on phenological phases and environmental conditions. In this review, we cover all the transcriptomic resources generated on durum wheat to date and focus on the corresponding scientific insights gained into abiotic and biotic stress responses. We describe relevant databases, tools and approaches, including connections with other "omics" that could assist data integration for candidate gene discovery for bio-agronomical traits. The biological knowledge summarized here will ultimately help in accelerating durum wheat breeding.
4. Baltoumas FA, Karatzas E, Paez-Espino D, Venetsianou NK, Aplakidou E, Oulas A, Finn RD, Ovchinnikov S, Pafilis E, Kyrpides NC, Pavlopoulos GA. Exploring microbial functional biodiversity at the protein family level-From metagenomic sequence reads to annotated protein clusters. Frontiers in Bioinformatics 2023; 3:1157956. [PMID: 36959975 PMCID: PMC10029925 DOI: 10.3389/fbinf.2023.1157956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Accepted: 02/21/2023] [Indexed: 03/06/2023]
Abstract
Metagenomics has enabled accessing the genetic repertoire of natural microbial communities. Metagenome shotgun sequencing has become the method of choice for studying and classifying microorganisms from various environments. To this end, several methods have been developed to process and analyze the sequence data from raw reads to end-products such as predicted protein sequences or families. In this article, we provide a thorough review to simplify such processes and discuss the alternative methodologies that can be followed in order to explore biodiversity at the protein family level. We provide details for analysis tools and we comment on their scalability as well as their advantages and disadvantages. Finally, we report the available data repositories and recommend various approaches for protein family annotation related to phylogenetic distribution, structure prediction and metadata enrichment.
Affiliation(s)
- Fotis A. Baltoumas
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- *Correspondence: Fotis A. Baltoumas; Nikos C. Kyrpides; Georgios A. Pavlopoulos
- Evangelos Karatzas
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- David Paez-Espino
- Lawrence Berkeley National Laboratory, DOE Joint Genome Institute, Berkeley, CA, United States
- Nefeli K. Venetsianou
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- Eleni Aplakidou
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- Anastasis Oulas
- The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
- Robert D. Finn
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, United Kingdom
- Sergey Ovchinnikov
- John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA, United States
- Evangelos Pafilis
- Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Heraklion, Greece
- Nikos C. Kyrpides
- Lawrence Berkeley National Laboratory, DOE Joint Genome Institute, Berkeley, CA, United States
- Georgios A. Pavlopoulos
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- Center of New Biotechnologies and Precision Medicine, Department of Medicine, School of Health Sciences, National and Kapodistrian University of Athens, Athens, Greece
- Hellenic Army Academy, Vari, Greece
5. Nata’ala MK, Avila Santos AP, Coelho Kasmanas J, Bartholomäus A, Saraiva JP, Godinho Silva S, Keller-Costa T, Costa R, Gomes NCM, Ponce de Leon Ferreira de Carvalho AC, Stadler PF, Sipoli Sanches D, Nunes da Rocha U. MarineMetagenomeDB: a public repository for curated and standardized metadata for marine metagenomes. Environmental Microbiome 2022; 17:57. [PMID: 36401317 PMCID: PMC9675116 DOI: 10.1186/s40793-022-00449-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/08/2022] [Accepted: 09/15/2022] [Indexed: 05/17/2023]
Abstract
BACKGROUND Metagenomics is an expanding field within microbial ecology, microbiology, and related disciplines. The number of metagenomes deposited in major public repositories such as Sequence Read Archive (SRA) and Metagenomic Rapid Annotations using Subsystems Technology (MG-RAST) is rising exponentially. However, data mining and interpretation can be challenging due to mis-annotated and misleading metadata entries. In this study, we describe the Marine Metagenome Metadata Database (MarineMetagenomeDB) to help researchers identify marine metagenomes of interest for re-analysis and meta-analysis. To this end, we have manually curated the associated metadata of several thousands of microbial metagenomes currently deposited at SRA and MG-RAST. RESULTS In total, 125 terms were curated according to 17 different classes (e.g., biome, material, oceanic zone, geographic feature and oceanographic phenomena). Other standardized features include sample attributes (e.g., salinity, depth), sample location (e.g., latitude, longitude), and sequencing features (e.g., sequencing platform, sequence count). MarineMetagenomeDB version 1.0 contains 11,449 marine metagenomes from SRA and MG-RAST distributed across all oceans and several seas. Most samples were sequenced using Illumina sequencing technology (84.33%). More than 55% of the samples were collected from the Pacific and the Atlantic Oceans. About 40% of the samples had their biomes assigned as 'ocean'. The 'Quick Search' and 'Advanced Search' tabs allow users to use different filters to select samples of interest dynamically in the web app. The interactive map allows the visualization of samples based on their location on the world map. The web app is also equipped with a novel download tool (on both Windows and Linux operating systems), that allows easy download of raw sequence data of selected samples from their respective repositories. 
As a use case, we demonstrated how to use the MarineMetagenomeDB web app to select estuarine metagenomes for potential large-scale microbial biogeography studies. CONCLUSION The MarineMetagenomeDB is a powerful resource for non-bioinformaticians to find marine metagenome samples with curated metadata and stimulate meta-studies involving marine microbiomes. Our user-friendly web app is publicly available at https://webapp.ufz.de/marmdb/ .
Affiliation(s)
- Muhammad Kabiru Nata’ala
- Department of Environmental Microbiology, Helmholtz Centre for Environmental Research – UFZ GmbH, 04318 Leipzig, Saxony Germany
- Department of Computer Science and Interdisciplinary Centre of Bioinformatics, University of Leipzig, 04107 Leipzig, Saxony Germany
- Anderson P. Avila Santos
- Department of Environmental Microbiology, Helmholtz Centre for Environmental Research – UFZ GmbH, 04318 Leipzig, Saxony Germany
- Institute of Mathematics and Computer Sciences, University of Sao Paulo, São Carlos, Brazil
- Jonas Coelho Kasmanas
- Department of Environmental Microbiology, Helmholtz Centre for Environmental Research – UFZ GmbH, 04318 Leipzig, Saxony Germany
- Department of Computer Science and Interdisciplinary Centre of Bioinformatics, University of Leipzig, 04107 Leipzig, Saxony Germany
- Institute of Mathematics and Computer Sciences, University of Sao Paulo, São Carlos, Brazil
- Alexander Bartholomäus
- Section 3.7 Geomicrobiology, GFZ German Research Centre for Geosciences, 14473 Telegrafenberg, Potsdam Germany
- João Pedro Saraiva
- Department of Environmental Microbiology, Helmholtz Centre for Environmental Research – UFZ GmbH, 04318 Leipzig, Saxony Germany
- Sandra Godinho Silva
- Department of Bioengineering and Institute for Bioengineering and Biosciences, Instituto Superior Técnico, Universidade de Lisboa, 1049-001 Lisbon, Portugal
- Tina Keller-Costa
- Department of Bioengineering and Institute for Bioengineering and Biosciences, Instituto Superior Técnico, Universidade de Lisboa, 1049-001 Lisbon, Portugal
- Rodrigo Costa
- Department of Bioengineering and Institute for Bioengineering and Biosciences, Instituto Superior Técnico, Universidade de Lisboa, 1049-001 Lisbon, Portugal
- Newton C. M. Gomes
- Department of Biology and Centre for Environmental and Marine Studies (CESAM), University of Aveiro, 3810-193 Aveiro, Portugal
- Peter F. Stadler
- Department of Computer Science and Interdisciplinary Centre of Bioinformatics, University of Leipzig, 04107 Leipzig, Saxony Germany
- Ulisses Nunes da Rocha
- Department of Environmental Microbiology, Helmholtz Centre for Environmental Research – UFZ GmbH, 04318 Leipzig, Saxony Germany
6. Leyhr J, Waldmann L, Filipek-Górniok B, Zhang H, Allalou A, Haitina T. A novel cis-regulatory element drives early expression of Nkx3.2 in the gnathostome primary jaw joint. eLife 2022; 11:75749. [PMCID: PMC9665848 DOI: 10.7554/elife.75749] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2021] [Accepted: 09/30/2022] [Indexed: 11/16/2022]
Abstract
The acquisition of movable jaws was a major event during vertebrate evolution. The role of NK3 homeobox 2 (Nkx3.2) transcription factor in patterning the primary jaw joint of gnathostomes (jawed vertebrates) is well known; however, knowledge about its regulatory mechanism is lacking. In this study, we report a proximal enhancer element of Nkx3.2 that is deeply conserved in most gnathostomes but undetectable in the jawless hagfish and lamprey. This enhancer is active in the developing jaw joint region of the zebrafish Danio rerio, and was thus designated as jaw joint regulatory sequence 1 (JRS1). We further show that JRS1 enhancer sequences from a range of gnathostome species, including a chondrichthyan and mammals, have the same activity in the jaw joint as the native zebrafish enhancer, indicating a high degree of functional conservation despite the divergence of cartilaginous and bony fish lineages or the transition of the primary jaw joint into the middle ear of mammals. Finally, we show that deletion of JRS1 from the zebrafish genome using CRISPR/Cas9 results in a significant reduction of early gene expression of nkx3.2 and leads to a transient jaw joint deformation and partial fusion. Emergence of this Nkx3.2 enhancer in early gnathostomes may have contributed to the origin and shaping of the articulating surfaces of vertebrate jaws.
Affiliation(s)
- Jake Leyhr
- Subdepartment of Evolution and Development, Department of Organismal Biology, Uppsala University
- Laura Waldmann
- Subdepartment of Evolution and Development, Department of Organismal Biology, Uppsala University
- Beata Filipek-Górniok
- Science for Life Laboratory Genome Engineering Zebrafish Facility, Department of Organismal Biology, Uppsala University
- Hanqing Zhang
- Division of Visual Information and Interaction, Department of Information Technology, Uppsala University
- Science for Life Laboratory BioImage Informatics Facility
- Amin Allalou
- Division of Visual Information and Interaction, Department of Information Technology, Uppsala University
- Science for Life Laboratory BioImage Informatics Facility
- Tatjana Haitina
- Subdepartment of Evolution and Development, Department of Organismal Biology, Uppsala University
7. Goudey B, Geard N, Verspoor K, Zobel J. Propagation, detection and correction of errors using the sequence database network. Brief Bioinform 2022; 23:6764545. [PMID: 36266246 PMCID: PMC9677457 DOI: 10.1093/bib/bbac416] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/10/2022] [Revised: 07/31/2022] [Accepted: 08/28/2022] [Indexed: 12/14/2022]
Abstract
Nucleotide and protein sequences stored in public databases are the cornerstone of many bioinformatics analyses. The records containing these sequences are prone to a wide range of errors, including incorrect functional annotation, sequence contamination and taxonomic misclassification. One source of information that can help to detect errors is the strong interdependency between records. Novel sequences in one database draw their annotations from existing records, may generate new records in multiple other locations and will have varying degrees of similarity with existing records across a range of attributes. A network perspective of these relationships between sequence records, within and across databases, offers new opportunities to detect, or even correct, erroneous entries and more broadly to make inferences about record quality. Here, we describe this novel perspective of sequence database records as a rich network, which we call the sequence database network, and illustrate the opportunities this perspective offers for quantification of database quality and detection of spurious entries. We provide an overview of the relevant databases and describe how the interdependencies between sequence records across these databases can be exploited by network analyses. We review the process of sequence annotation and provide a classification of sources of error, highlighting propagation as a major source. We illustrate the value of a network perspective through three case studies that use network analysis to detect errors, and explore the quality and quantity of critical relationships that would inform such network analyses. This systematic description of a network perspective of sequence database records provides a novel direction to combat the proliferation of errors within these critical bioinformatics resources.
Affiliation(s)
- Benjamin Goudey
- Corresponding author. Benjamin Goudey, School of Computing and Information Systems, University of Melbourne Parkville, Victoria, 3010
- Nicholas Geard
- School of Computing and Information Systems, University of Melbourne Parkville, Victoria, 3010
- Karin Verspoor
- School of Computing Technologies, RMIT University Melbourne, Victoria, 3000
- Justin Zobel
- School of Computing and Information Systems, University of Melbourne Parkville, Victoria, 3010
8. Poolman TM, Townsend-Nicholson A, Cain A. Teaching genomics to life science undergraduates using cloud computing platforms with open datasets. Biochemistry and Molecular Biology Education: A Bimonthly Publication of the International Union of Biochemistry and Molecular Biology 2022; 50:446-449. [PMID: 35972192 PMCID: PMC9804627 DOI: 10.1002/bmb.21646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2021] [Revised: 03/08/2022] [Accepted: 06/30/2022] [Indexed: 06/15/2023]
Abstract
The final year of a biochemistry degree is usually a time to experience research. However, laboratory-based research projects were not possible during COVID-19. Instead, we used open datasets to provide computational research projects in metagenomics to biochemistry undergraduates (80 students with limited computing experience). We aimed to give the students a chance to explore any dataset, rather than use a small number of artificial datasets (~60 published datasets were used). To achieve this, we utilized Google Colaboratory (Colab), a virtual computing environment. Colab was used as a framework to retrieve raw sequencing data (analyzed with QIIME2) and generate visualizations. Setting up the environment requires no prior experience; all students have the same drive structure and notebooks can be shared (for synchronous sessions). We also used the platform to combine multiple datasets, perform a meta-analysis, and allowed the students to analyze large datasets with 1000s of subjects and factors. Projects that required increased computational resources were integrated with Google Cloud Compute. In future, all research projects can include some aspects of reanalyzing public data, providing students with data science experience. Colab is also an excellent environment in which to develop data skills in multiple languages (e.g., Perl, Python, Julia).
Affiliation(s)
- Toryn M. Poolman
- Structural & Molecular Biology, Faculty of Life Sciences, UCL, London, UK
- Amanda Cain
- Structural & Molecular Biology, Faculty of Life Sciences, UCL, London, UK
9. Luo S, Wang LC, Shuai ZH, Yang GJ, Lu JF, Chen J. A short peptidoglycan recognition protein protects Boleophthalmus pectinirostris against bacterial infection via inhibiting bacterial activity. Fish & Shellfish Immunology 2022; 127:119-128. [PMID: 35716967 DOI: 10.1016/j.fsi.2022.06.019] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/10/2022] [Revised: 06/11/2022] [Accepted: 06/13/2022] [Indexed: 06/15/2023]
Abstract
Peptidoglycan recognition proteins (PGRPs) are members of the pattern-recognition receptor (PRR) family and have been proposed as antibacterial proteins. The present study investigated the antibacterial effect of BpPGRP5 in the great blue-spotted mudskipper (Boleophthalmus pectinirostris). BpPGRP5 transcript was detected in all tested tissues, with the highest expression level in the spleen, and its expression was significantly upregulated in the spleen, intestine, and kidney following Aeromonas veronii infection. rBpPGRP5 was found to interact with several polysaccharides and bacteria, including Gram-negative bacteria (Escherichia coli and A. veronii) and Gram-positive bacteria (Listeria monocytogenes and Staphylococcus aureus). rBpPGRP5 inhibited the proliferation of E. coli, S. aureus, L. monocytogenes, and A. veronii in a Zn2+-dependent manner. Furthermore, in vivo studies revealed that intraperitoneal injection of rBpPGRP5 improved the survival rate of A. veronii-infected B. pectinirostris, accompanied by decreased bacterial load in the blood, kidney, intestine, and spleen. Taken together, our results indicated that BpPGRP5 is an antimicrobial protein that protects B. pectinirostris against bacterial infection.
Affiliation(s)
- Sheng Luo
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Ningbo University, Ningbo, 315211, China; Laboratory of Biochemistry and Molecular Biology, School of Marine Sciences, Ningbo University, Ningbo, 315211, China; Key Laboratory of Aquacultural Biotechnology Ministry of Education, Ningbo University, Ningbo, 315211, China
- Li-Cong Wang
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Ningbo University, Ningbo, 315211, China; Laboratory of Biochemistry and Molecular Biology, School of Marine Sciences, Ningbo University, Ningbo, 315211, China; Key Laboratory of Aquacultural Biotechnology Ministry of Education, Ningbo University, Ningbo, 315211, China
- Zhi-Han Shuai
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Ningbo University, Ningbo, 315211, China; Laboratory of Biochemistry and Molecular Biology, School of Marine Sciences, Ningbo University, Ningbo, 315211, China; Key Laboratory of Aquacultural Biotechnology Ministry of Education, Ningbo University, Ningbo, 315211, China
- Guan-Jun Yang
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Ningbo University, Ningbo, 315211, China; Laboratory of Biochemistry and Molecular Biology, School of Marine Sciences, Ningbo University, Ningbo, 315211, China; Key Laboratory of Aquacultural Biotechnology Ministry of Education, Ningbo University, Ningbo, 315211, China
- Jian-Fei Lu
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Ningbo University, Ningbo, 315211, China; Laboratory of Biochemistry and Molecular Biology, School of Marine Sciences, Ningbo University, Ningbo, 315211, China; Key Laboratory of Aquacultural Biotechnology Ministry of Education, Ningbo University, Ningbo, 315211, China.
- Jiong Chen
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Ningbo University, Ningbo, 315211, China; Laboratory of Biochemistry and Molecular Biology, School of Marine Sciences, Ningbo University, Ningbo, 315211, China; Key Laboratory of Aquacultural Biotechnology Ministry of Education, Ningbo University, Ningbo, 315211, China.
10. Aleurodiscus bicornis and A. formosanus spp. nov. (Basidiomycota) with smooth basidiospores, and redescription of A. parvisporus. Mycol Prog 2022. [DOI: 10.1007/s11557-021-01733-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
11. Ullah S, Rahman W, Ullah F, Ahmad G, Ijaz M, Gao T. DBHR: a collection of databases relevant to human research. Future Sci OA 2022; 8:FSO780. [PMID: 35251694 PMCID: PMC8890137 DOI: 10.2144/fsoa-2021-0101] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/20/2021] [Accepted: 01/05/2022] [Indexed: 11/23/2022]
Abstract
BACKGROUND The completion of the Human Genome Project provides a basis for the systematic study of the human genome from evolutionary history to disease-specific medicine. With the explosive growth of biological data, a growing number of biological databases are being established to support human-related research. OBJECTIVE The main objective of our study is to store, organize and share data in a structured and searchable manner. In short, we have planned the future development of new features in the database research area. MATERIALS & METHODS In total, we collected and integrated 680 human databases from published scientific work. Multiple options are presented for accessing the data, while original links and short descriptions are also presented for each database. RESULTS & DISCUSSION We have provided the latest collection of human research databases on a single platform with six categories: DNA database, RNA database, protein database, expression database, pathway database and disease database. CONCLUSION Taken together, our database will be useful for further human research and will be modified over time. The database has been implemented in PHP, HTML, CSS and MySQL and is freely available at https://habdsk.org/database.php.
Affiliation(s)
- Tianshun Gao
- Research Center, The Seventh Affiliated Hospital of Sun Yat-sen University, Shenzhen, Guangzhou, China
12. Biological Nitrogen Removal Database: A Manually Curated Data Resource. Microorganisms 2022; 10:microorganisms10020431. [PMID: 35208885 PMCID: PMC8874995 DOI: 10.3390/microorganisms10020431] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/05/2022] [Revised: 01/29/2022] [Accepted: 02/01/2022] [Indexed: 02/01/2023]
Abstract
Biological nitrogen removal (BNR) technologies are the most effective approaches for the remediation of environmental nitrogen pollutants from wastewater treatment plants (WWTPs). Research is ongoing to elucidate the structure and function of BNR microbial communities and to optimize BNR treatment systems to enhance nitrogen removal efficiency. The literature on BNR microbial communities and experimental datasets is not unified across various repositories, while a uniform resource for the collection, annotation, and structuring of these BNR datasets is still unavailable. Herein, we present the Biological Nitrogen Removal Database (BNRdb), an integrated resource containing various manually curated BNR-related data. At present, BNRdb contains 23,308 microbial strains, 46 gene families, 24 enzymes, 18 reactions, 301 BNR treatment datasets, 860 BNR-associated next-generation sequencing datasets, and 6 common BNR bioreactor systems. BNRdb provides a user-friendly interface enabling interactive data browsing. To our knowledge, BNRdb is the first BNR data resource that systematically integrates BNR data from archaeal, bacterial, and fungal communities. We believe that BNRdb will contribute to a better understanding of the BNR process and nitrogen bioremediation research.
13
Jin TC, Lu JF, Luo S, Wang LC, Lu XJ, Chen J. Characterization of large yellow croaker (Larimichthys crocea) osteoprotegerin and its role in the innate immune response against to Vibrio alginolyticus. Comp Biochem Physiol B Biochem Mol Biol 2021; 258:110680. [PMID: 34688907 DOI: 10.1016/j.cbpb.2021.110680] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/04/2021] [Revised: 10/17/2021] [Accepted: 10/18/2021] [Indexed: 12/25/2022]
Abstract
Osteoprotegerin (OPG) is a member of the tumor necrosis factor receptor superfamily, contributing to inflammation, apoptosis, and differentiation. However, the function of OPG in the host immune system of teleosts remains unclear. Here, we cloned the cDNA of the LcOPG gene from large yellow croaker. LcOPG mRNA was expressed in all analyzed tissues and was upregulated by Vibrio alginolyticus infection in immune tissues and monocytes/macrophages (MO/MΦ). Subsequently, the LcOPG protein was expressed and purified using a prokaryotic expression system. Recombinant LcOPG protein (rLcOPG) treatment suppressed V. alginolyticus-induced pro-inflammatory cytokine and enhanced V. alginolyticus-induced anti-inflammatory cytokine mRNA expression. Furthermore, rLcOPG decreased V. alginolyticus-induced MO/MΦ apoptosis. Therefore, the results indicate that LcOPG might play a role in the immune response of V. alginolyticus-infected large yellow croaker.
Affiliation(s)
- Tian-Cheng Jin
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Ningbo University, Ningbo 315211, China; Laboratory of Biochemistry and Molecular Biology, School of Marine Sciences, Ningbo University, Ningbo 315211, China; Key Laboratory of Applied Marine Biotechnology of Ministry of Education, Ningbo University, Ningbo 315211, China
- Jian-Fei Lu
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Ningbo University, Ningbo 315211, China; Laboratory of Biochemistry and Molecular Biology, School of Marine Sciences, Ningbo University, Ningbo 315211, China; Key Laboratory of Applied Marine Biotechnology of Ministry of Education, Ningbo University, Ningbo 315211, China
- Sheng Luo
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Ningbo University, Ningbo 315211, China; Laboratory of Biochemistry and Molecular Biology, School of Marine Sciences, Ningbo University, Ningbo 315211, China; Key Laboratory of Applied Marine Biotechnology of Ministry of Education, Ningbo University, Ningbo 315211, China
- Li-Cong Wang
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Ningbo University, Ningbo 315211, China; Laboratory of Biochemistry and Molecular Biology, School of Marine Sciences, Ningbo University, Ningbo 315211, China; Key Laboratory of Applied Marine Biotechnology of Ministry of Education, Ningbo University, Ningbo 315211, China
- Xin-Jiang Lu
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Ningbo University, Ningbo 315211, China; Laboratory of Biochemistry and Molecular Biology, School of Marine Sciences, Ningbo University, Ningbo 315211, China; Key Laboratory of Applied Marine Biotechnology of Ministry of Education, Ningbo University, Ningbo 315211, China
- Jiong Chen
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Ningbo University, Ningbo 315211, China; Laboratory of Biochemistry and Molecular Biology, School of Marine Sciences, Ningbo University, Ningbo 315211, China; Key Laboratory of Applied Marine Biotechnology of Ministry of Education, Ningbo University, Ningbo 315211, China.
14
Siramshetty VB, Grishagin I, Nguyễn ÐT, Peryea T, Skovpen Y, Stroganov O, Katzel D, Sheils T, Jadhav A, Mathé EA, Southall NT. NCATS Inxight Drugs: a comprehensive and curated portal for translational research. Nucleic Acids Res 2021; 50:D1307-D1316. [PMID: 34648031 DOI: 10.1093/nar/gkab918] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 08/20/2021] [Revised: 09/23/2021] [Accepted: 09/24/2021] [Indexed: 02/06/2023]
Abstract
The United States has a complex regulatory scheme for marketing drugs. Understanding drug regulatory status is a daunting task that requires integrating data from many sources from the United States Food and Drug Administration (FDA), US government publications, and other processes related to drug development. At NCATS, we created Inxight Drugs (https://drugs.ncats.io), a web resource that attempts to address this challenge in a systematic manner. NCATS Inxight Drugs incorporates and unifies a wealth of data, including those supplied by the FDA and from independent public sources. The database offers a substantial amount of manually curated literature data unavailable from other sources. Currently, the database contains 125 036 product ingredients, including 2566 US approved drugs, 6242 marketed drugs, and 9684 investigational drugs. All substances are rigorously defined according to the ISO 11238 standard to comply with existing regulatory standards for unique drug substance identification. A special emphasis was placed on capturing manually curated and referenced data on treatment modalities and semantic relationships between substances. A supplementary resource 'Novel FDA Drug Approvals' features regulatory details of newly approved FDA drugs. The database is regularly updated using NCATS Stitcher data integration tool that automates data aggregation and supports full data access through a RESTful API.
Affiliation(s)
- Vishal B Siramshetty
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD, USA
- Ivan Grishagin
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD, USA
- Ðắc-Trung Nguyễn
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD, USA
- Tyler Peryea
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD, USA
- Daniel Katzel
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD, USA
- Timothy Sheils
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD, USA
- Ajit Jadhav
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD, USA
- Ewy A Mathé
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD, USA
- Noel T Southall
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD, USA
15
Kasmanas JC, Bartholomäus A, Corrêa FB, Tal T, Jehmlich N, Herberth G, von Bergen M, Stadler PF, Carvalho ACPDLFD, Nunes da Rocha U. HumanMetagenomeDB: a public repository of curated and standardized metadata for human metagenomes. Nucleic Acids Res 2021; 49:D743-D750. [PMID: 33221926 PMCID: PMC7778935 DOI: 10.1093/nar/gkaa1031] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 08/03/2020] [Revised: 10/15/2020] [Accepted: 10/21/2020] [Indexed: 12/30/2022]
Abstract
Metagenomics has become a standard strategy to comprehend the functional potential of microbial communities, including the human microbiome. Currently, the number of metagenomes in public repositories is increasing exponentially. The Sequence Read Archive (SRA) and MG-RAST are the two main repositories for metagenomic data. These databases allow scientists to reanalyze samples and explore new hypotheses. However, mining samples from them can be a limiting factor, since the metadata available in these repositories is often misannotated, misleading, and decentralized, creating an overly complex environment for sample reanalysis. The main goal of the HumanMetagenomeDB is to simplify the identification and use of public human metagenomes of interest. HumanMetagenomeDB version 1.0 contains metadata of 69 822 metagenomes. We standardized 203 attributes, based on standardized ontologies, describing host characteristics (e.g. sex, age and body mass index), diagnosis information (e.g. cancer, Crohn's disease and Parkinson's disease), location (e.g. country, longitude and latitude), sampling site (e.g. gut, lung and skin) and sequencing attributes (e.g. sequencing platform, average length and sequence quality). Further, HumanMetagenomeDB version 1.0 metagenomes encompass 58 countries, 9 main sample sites (i.e. body parts), 58 diagnoses and multiple ages, ranging from newborn to 91 years old. The HumanMetagenomeDB is publicly available at https://webapp.ufz.de/hmgdb/.
Affiliation(s)
- Jonas Coelho Kasmanas
- Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos, Brazil.,Department of Environmental Microbiology, Helmholtz Centre for Environmental Research - UFZ GmbH, Leipzig, Saxony 04318, Germany.,Department of Computer Science and Interdisciplinary Center of Bioinformatics, University of Leipzig, Leipzig, Saxony 04107, Germany
- Alexander Bartholomäus
- GFZ German Research Centre for Geosciences, Section 3.7 Geomicrobiology, Telegrafenberg, 14473 Potsdam, Germany
- Felipe Borim Corrêa
- Department of Environmental Microbiology, Helmholtz Centre for Environmental Research - UFZ GmbH, Leipzig, Saxony 04318, Germany.,Department of Computer Science and Interdisciplinary Center of Bioinformatics, University of Leipzig, Leipzig, Saxony 04107, Germany
- Tamara Tal
- Department of Bioanalytical Ecotoxicology, Helmholtz Centre for Environmental Research - UFZ GmbH, Leipzig, Saxony 04318, Germany
- Nico Jehmlich
- Department of Molecular Systems Biology, Helmholtz Centre for Environmental Research - UFZ GmbH, Leipzig, Saxony 04318, Germany
- Gunda Herberth
- Department of Environmental Immunology, Helmholtz Centre for Environmental Research - UFZ GmbH, Leipzig, Saxony 04318, Germany
- Martin von Bergen
- Department of Molecular Systems Biology, Helmholtz Centre for Environmental Research - UFZ GmbH, Leipzig, Saxony 04318, Germany.,Institute of Biochemistry, Faculty of Life Sciences, University of Leipzig, Leipzig, Saxony 04107, Germany
- Peter F Stadler
- Department of Computer Science and Interdisciplinary Center of Bioinformatics, University of Leipzig, Leipzig, Saxony 04107, Germany
- Ulisses Nunes da Rocha
- Department of Environmental Microbiology, Helmholtz Centre for Environmental Research - UFZ GmbH, Leipzig, Saxony 04318, Germany
16
Barash E, Sal-Man N, Sabato S, Ziv-Ukelson M. BacPaCS-Bacterial Pathogenicity Classification via Sparse-SVM. Bioinformatics 2020; 35:2001-2008. [PMID: 30407484 DOI: 10.1093/bioinformatics/bty928] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/30/2018] [Revised: 08/30/2018] [Accepted: 11/07/2018] [Indexed: 01/01/2023]
Abstract
MOTIVATION Bacterial infections are a major cause of illness worldwide. However, most bacterial strains pose no threat to human health and may even be beneficial. Thus, developing powerful diagnostic bioinformatic tools that differentiate pathogenic from commensal bacteria is critical for effective treatment of bacterial infections. RESULTS We propose a machine-learning approach for classifying human-hosted bacteria as pathogenic or non-pathogenic based on their genome-derived proteomes. Our approach is based on sparse Support Vector Machines (SVM), which autonomously select a small set of genes that are related to bacterial pathogenicity. We implement our approach as a tool, 'Bacterial Pathogenicity Classification via sparse-SVM' (BacPaCS), which is fully automated and handles datasets significantly larger than those previously used. BacPaCS shows high accuracy in distinguishing pathogenic from non-pathogenic bacteria in a clinically relevant dataset comprising only human-hosted bacteria. Among the genes that received the highest positive weight in the resulting classifier, we found genes that are known to be related to bacterial pathogenicity, in addition to novel candidates whose involvement in bacterial virulence has not previously been reported. AVAILABILITY AND IMPLEMENTATION The code and the resulting model are available at: https://github.com/barashe/bacpacs. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Eran Barash
- Department of Computer Science, Faculty of Natural Sciences
- Neta Sal-Man
- The Shraga Segal Department of Microbiology Immunology and Genetics, Faculty of Health Sciences, Ben-Gurion University of the Negev, BeerSheva, Israel
- Sivan Sabato
- Department of Computer Science, Faculty of Natural Sciences
17
Corrêa FB, Saraiva JP, Stadler PF, da Rocha UN. TerrestrialMetagenomeDB: a public repository of curated and standardized metadata for terrestrial metagenomes. Nucleic Acids Res 2020; 48:D626-D632. [PMID: 31728526 PMCID: PMC7145636 DOI: 10.1093/nar/gkz994] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 08/13/2019] [Revised: 10/07/2019] [Accepted: 10/17/2019] [Indexed: 11/27/2022]
Abstract
Microbiome studies focused on the genetic potential of microbial communities (metagenomics) have become standard within microbial ecology. MG-RAST and the Sequence Read Archive (SRA), the two main metagenome repositories, contain over 202 858 publicly available metagenomes, and this number has increased exponentially. However, mining databases can be challenging due to misannotated, misleading and decentralized data. The main goal of TerrestrialMetagenomeDB is to make it easier for scientists to find terrestrial metagenomes of interest that could be compared with novel datasets in meta-analyses. We defined terrestrial metagenomes as those that do not belong to marine environments. Further, we curated the database using text mining to assign potential descriptive keywords that better contextualize environmental aspects of terrestrial metagenomes, such as biomes and materials. TerrestrialMetagenomeDB release 1.0 includes 15 022 terrestrial metagenomes from SRA and MG-RAST. Together, the downloadable data amounts to 68 Tbp. In total, 199 terrestrial terms were divided into 14 categories. These metagenomes span 83 countries, 30 biomes and 7 main source materials. The TerrestrialMetagenomeDB is publicly available at https://webapp.ufz.de/tmdb.
Affiliation(s)
- Felipe Borim Corrêa
- Department of Environmental Microbiology, UFZ-Helmholtz Centre for Environmental Research, Leipzig, Saxony 04318, Germany.,Department of Computer Science and Interdisciplinary Center of Bioinformatics, University of Leipzig, Leipzig, Saxony 04107, Germany
- João Pedro Saraiva
- Department of Environmental Microbiology, UFZ-Helmholtz Centre for Environmental Research, Leipzig, Saxony 04318, Germany
- Peter F Stadler
- Department of Computer Science and Interdisciplinary Center of Bioinformatics, University of Leipzig, Leipzig, Saxony 04107, Germany
- Ulisses Nunes da Rocha
- Department of Environmental Microbiology, UFZ-Helmholtz Centre for Environmental Research, Leipzig, Saxony 04318, Germany
18
Ambrosino L, Colantuono C, Diretto G, Fiore A, Chiusano ML. Bioinformatics Resources for Plant Abiotic Stress Responses: State of the Art and Opportunities in the Fast Evolving -Omics Era. PLANTS 2020; 9:plants9050591. [PMID: 32384671 PMCID: PMC7285221 DOI: 10.3390/plants9050591] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 03/27/2020] [Revised: 04/24/2020] [Accepted: 04/29/2020] [Indexed: 12/13/2022]
Abstract
Abiotic stresses are among the principal limiting factors for productivity in agriculture. In the current era of continuous climate changes, the understanding of the molecular aspects involved in abiotic stress response in plants is a priority. The rise of -omics approaches provides key strategies to promote effective research in the field, facilitating the investigations from reference models to an increasing number of species, tolerant and sensitive genotypes. Integrated multilevel approaches, based on molecular investigations at genomics, transcriptomics, proteomics and metabolomics levels, are now feasible, expanding the opportunities to clarify key molecular aspects involved in responses to abiotic stresses. To this aim, bioinformatics has become fundamental for data production, mining and integration, and necessary for extracting valuable information and for comparative efforts, paving the way to the modeling of the involved processes. We provide here an overview of bioinformatics resources for research on plant abiotic stresses, describing collections from -omics efforts in the field, ranging from raw data to complete databases or platforms, highlighting opportunities and still open challenges in abiotic stress research based on -omics technologies.
Affiliation(s)
- Luca Ambrosino
- Department of Agricultural Sciences, University of Naples Federico II, 80055 Portici (Na), Italy
- Department of Research Infrastructures for Marine Biological Resources (RIMAR), 80121 Naples, Italy
- Chiara Colantuono
- Department of Agricultural Sciences, University of Naples Federico II, 80055 Portici (Na), Italy
- Department of Research Infrastructures for Marine Biological Resources (RIMAR), 80121 Naples, Italy
- Gianfranco Diretto
- Italian National Agency for New Technologies, Energy and Sustainable Economic Development (ENEA), 00123 Rome, Italy
- Alessia Fiore
- Italian National Agency for New Technologies, Energy and Sustainable Economic Development (ENEA), 00123 Rome, Italy
- Maria Luisa Chiusano
- Department of Agricultural Sciences, University of Naples Federico II, 80055 Portici (Na), Italy
- Department of Research Infrastructures for Marine Biological Resources (RIMAR), 80121 Naples, Italy
- Correspondence: ; Tel.: +39-081-253-9492
19
Staněk D, Sedláčková L, Seeman P, Šafka Brožková D, Laššuthová P. Whole-Exome Sequencing in Czech Patients with Neurogenetic Diseases. Genet Test Mol Biomarkers 2020; 24:264-273. [DOI: 10.1089/gtmb.2019.0232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Affiliation(s)
- David Staněk
- DNA Laboratory, Department of Paediatric Neurology, Charles University and University Hospital Motol, Prague, Czech Republic
- Lucie Sedláčková
- DNA Laboratory, Department of Paediatric Neurology, Charles University and University Hospital Motol, Prague, Czech Republic
- Pavel Seeman
- DNA Laboratory, Department of Paediatric Neurology, Charles University and University Hospital Motol, Prague, Czech Republic
- Dana Šafka Brožková
- DNA Laboratory, Department of Paediatric Neurology, Charles University and University Hospital Motol, Prague, Czech Republic
- Petra Laššuthová
- DNA Laboratory, Department of Paediatric Neurology, Charles University and University Hospital Motol, Prague, Czech Republic
20
Verheggen K, Raeder H, Berven FS, Martens L, Barsnes H, Vaudel M. Anatomy and evolution of database search engines-a central component of mass spectrometry based proteomic workflows. MASS SPECTROMETRY REVIEWS 2020; 39:292-306. [PMID: 28902424 DOI: 10.1002/mas.21543] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Received: 04/06/2016] [Accepted: 07/05/2017] [Indexed: 06/07/2023]
Abstract
Sequence database search engines are bioinformatics algorithms that identify peptides from tandem mass spectra using a reference protein sequence database. Two decades of development, notably driven by advances in mass spectrometry, have provided scientists with more than 30 published search engines, each with its own properties. In this review, we present the common paradigm behind the different implementations, and its limitations for modern mass spectrometry datasets. We also detail how the search engines attempt to alleviate these limitations, and provide an overview of the different software frameworks available to the researcher. Finally, we highlight alternative approaches for the identification of proteomic mass spectrometry datasets, either as a replacement for, or as a complement to, sequence database search engines.
Affiliation(s)
- Kenneth Verheggen
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biochemistry, Ghent University, Ghent, Belgium
- Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- Helge Raeder
- KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
- Department of Pediatrics, Haukeland University Hospital, Bergen, Norway
- Frode S Berven
- Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
- Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biochemistry, Ghent University, Ghent, Belgium
- Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- Harald Barsnes
- KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
- Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
- Computational Biology Unit, Department of Informatics, University of Bergen, Norway
- Marc Vaudel
- KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
- Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
- Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, Bergen, Norway
21
Bartley BA, Beal J, Karr JR, Strychalski EA. Organizing genome engineering for the gigabase scale. Nat Commun 2020; 11:689. [PMID: 32019919 PMCID: PMC7000699 DOI: 10.1038/s41467-020-14314-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 08/24/2019] [Accepted: 12/18/2019] [Indexed: 12/11/2022]
Abstract
Genome-scale engineering holds great potential to impact science, industry, medicine, and society, and recent improvements in DNA synthesis have enabled the manipulation of megabase genomes. However, coordinating and integrating the workflows and large teams necessary for gigabase genome engineering remains a considerable challenge. We examine this issue and recommend a path forward by: 1) adopting and extending existing representations for designs, assembly plans, samples, data, and workflows; 2) developing new technologies for data curation and quality control; 3) conducting fundamental research on genome-scale modeling and design; and 4) developing new legal and contractual infrastructure to facilitate collaboration.
Affiliation(s)
- Jacob Beal
- Raytheon BBN Technologies, Cambridge, MA, 02138, USA.
- Jonathan R Karr
- Icahn Institute and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10128, USA
22
Ran X, Zhao F, Wang Y, Liu J, Zhuang Y, Ye L, Qi M, Cheng J, Zhang Y. Plant Regulomics: a data-driven interface for retrieving upstream regulators from plant multi-omics data. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2020; 101:237-248. [PMID: 31494994 DOI: 10.1111/tpj.14526] [Citation(s) in RCA: 59] [Impact Index Per Article: 14.8] [Received: 04/08/2019] [Revised: 07/31/2019] [Accepted: 08/19/2019] [Indexed: 05/19/2023]
Abstract
High-throughput technology has become a powerful approach for routine plant research. Interpreting the biological significance of high-throughput data has largely focused on the functional characterization of a large gene list or set of genomic loci, which involves two aspects: the functions of the genes or loci, and how they are regulated as a whole, i.e. searching for the upstream regulators. Traditional platforms for functional annotation largely help resolve the first issue. Addressing the second issue is essential for a global understanding of the regulatory mechanism, but is more challenging, and requires additional high-throughput experimental evidence and a unified statistical framework for data mining. The rapid accumulation of 'omics data provides a large amount of experimental data. We here present Plant Regulomics, an interface that integrates 19 925 transcriptomic and epigenomic data sets and diverse sources of functional evidence (58 112 terms and 695 414 protein-protein interactions) from six plant species along with the orthologous genes from 56 whole-genome sequenced plant species. All pair-wise transcriptomic comparisons with biological significance within the same study were performed, and all epigenomic data were processed to genomic loci targeted by various factors. These data were organized into gene modules and loci lists, which were then incorporated into the same statistical framework. For any input gene list or genomic loci, Plant Regulomics retrieves the upstream factors, treatments, and experimental/environmental conditions regulating the input from the integrated 'omics data. Additionally, multiple tools and an interactive visualization are available through a user-friendly web interface. Plant Regulomics is available at http://bioinfo.sibs.ac.cn/plant-regulomics.
Affiliation(s)
- Xiaojuan Ran
- National Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Shanghai Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 300 Fenglin Road, Shanghai, 200032, China
- University of the Chinese Academy of Sciences, Beijing, 100049, China
- Fei Zhao
- National Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Shanghai Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 300 Fenglin Road, Shanghai, 200032, China
- University of the Chinese Academy of Sciences, Beijing, 100049, China
- Yuejun Wang
- National Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Shanghai Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 300 Fenglin Road, Shanghai, 200032, China
- University of the Chinese Academy of Sciences, Beijing, 100049, China
- Jian Liu
- National Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Shanghai Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 300 Fenglin Road, Shanghai, 200032, China
- University of the Chinese Academy of Sciences, Beijing, 100049, China
- Yili Zhuang
- National Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Shanghai Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 300 Fenglin Road, Shanghai, 200032, China
- University of the Chinese Academy of Sciences, Beijing, 100049, China
- Luhuan Ye
- National Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Shanghai Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 300 Fenglin Road, Shanghai, 200032, China
- University of the Chinese Academy of Sciences, Beijing, 100049, China
- Meifang Qi
- National Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Shanghai Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 300 Fenglin Road, Shanghai, 200032, China
- University of the Chinese Academy of Sciences, Beijing, 100049, China
- Jingfei Cheng
- National Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Shanghai Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 300 Fenglin Road, Shanghai, 200032, China
- University of the Chinese Academy of Sciences, Beijing, 100049, China
- Yijing Zhang
- National Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Shanghai Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 300 Fenglin Road, Shanghai, 200032, China
- University of the Chinese Academy of Sciences, Beijing, 100049, China
23
Xiong W, Huang X, Chen Y, Fu R, Du X, Chen X, Zhan A. Zooplankton biodiversity monitoring in polluted freshwater ecosystems: A technical review. ENVIRONMENTAL SCIENCE AND ECOTECHNOLOGY 2019; 1:100008. [PMCID: PMC9488063 DOI: 10.1016/j.ese.2019.100008] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 11/12/2019] [Revised: 12/16/2019] [Accepted: 12/20/2019] [Indexed: 05/26/2023]
Abstract
Freshwater ecosystems harbor a vast diversity of micro-eukaryotes (rotifers, crustaceans and protists), and such diverse taxonomic groups play important roles in ecosystem functioning and services. Unfortunately, freshwater ecosystems and biodiversity therein are threatened by many environmental stressors, particularly those derived from intensive human activities such as chemical pollution. In the past several decades, significant efforts have been devoted to halting biodiversity loss to recover services and functioning of freshwater ecosystems. Biodiversity monitoring is the first and a crucial step towards diagnosing pollution impacts on ecosystems and making conservation plans. Yet, bio-monitoring of ubiquitous micro-eukaryotes is extremely challenging, owing to many technical issues associated with micro-zooplankton such as microscopic size, fuzzy morphological features, and extremely high biodiversity. Here, we review current methods used for monitoring zooplankton biodiversity to advance management of impaired freshwater ecosystems. We discuss the development of traditional morphology-based identification methods such as scanning electron microscope (SEM) and ZOOSCAN and FlowCAM automatic systems, and DNA-based strategies such as metabarcoding and real-time quantitative PCR. In addition, we summarize advantages and disadvantages of these methods when applied for monitoring impacted ecosystems, and we propose practical DNA-based monitoring workflows for studying biological consequences of environmental pollution in freshwater ecosystems. Finally, we propose possible solutions for existing technical issues to improve accuracy and efficiency of DNA-based biodiversity monitoring. Freshwater ecosystems and associated biodiversity have been highly degraded. Biodiversity monitoring is crucial for diagnosing degradation degrees. Here we review available methods for monitoring zooplankton biodiversity. We propose possible solutions for existing technical issues.
Affiliation(s)
- Wei Xiong
- Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, 18 Shuangqing Road, Haidian District, Beijing, 100085, China
| | - Xuena Huang
- Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, 18 Shuangqing Road, Haidian District, Beijing, 100085, China
| | - Yiyong Chen
- Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, 18 Shuangqing Road, Haidian District, Beijing, 100085, China
| | - Ruiying Fu
- Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, 18 Shuangqing Road, Haidian District, Beijing, 100085, China
| | - Xun Du
- Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, 18 Shuangqing Road, Haidian District, Beijing, 100085, China
| | - Xingyu Chen
- Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, 18 Shuangqing Road, Haidian District, Beijing, 100085, China
- College of Resources, Environment and Tourism, Capital Normal University, 105 West Third Ring Road, Haidian District, Beijing, 100048, China
| | - Aibin Zhan
- Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, 18 Shuangqing Road, Haidian District, Beijing, 100085, China
- University of Chinese Academy of Sciences, Chinese Academy of Sciences, 19A Yuquan Road, Shijingshan District, Beijing, 100049, China
| |
|
24
|
Bioinformatics for Marine Products: An Overview of Resources, Bottlenecks, and Perspectives. Mar Drugs 2019; 17:md17100576. [PMID: 31614509 PMCID: PMC6835618 DOI: 10.3390/md17100576] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 08/20/2019] [Revised: 10/01/2019] [Accepted: 10/02/2019] [Indexed: 12/13/2022] Open
Abstract
The sea represents a major source of biodiversity. It exhibits many different ecosystems in a huge variety of environmental conditions where marine organisms have evolved with extensive diversification of structures and functions, making the marine environment a treasure trove of molecules with potential for biotechnological applications and innovation in many different areas. Rapid progress of the omics sciences has revealed novel opportunities to advance the knowledge of biological systems, paving the way for an unprecedented revolution in the field and expanding marine research from model organisms to an increasing number of marine species. Multi-level approaches based on molecular investigations at genomic, metagenomic, transcriptomic, metatranscriptomic, proteomic, and metabolomic levels are essential to discover marine resources and further explore key molecular processes involved in their production and action. As a consequence, omics approaches, accompanied by the associated bioinformatic resources and computational tools for molecular analyses and modeling, are boosting the rapid advancement of biotechnologies. In this review, we provide an overview of the most relevant bioinformatic resources and major approaches, highlighting perspectives and bottlenecks for an appropriate exploitation of these opportunities for biotechnology applications from marine resources.
|
25
|
Wu SH, Wei CL, Lin YT, Chang CC, He SH. Four new East Asian species of Aleurodiscus with echinulate basidiospores. MycoKeys 2019; 52:71-87. [PMID: 31139010 PMCID: PMC6522468 DOI: 10.3897/mycokeys.52.34066] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 02/23/2019] [Accepted: 04/16/2019] [Indexed: 11/17/2022] Open
Abstract
Four new species of Aleurodiscus sensu lato with echinulate basidiospores are described from East Asia: A. alpinus, A. pinicola, A. senticosus, and A. sichuanensis. Aleurodiscus alpinus is from northwest Yunnan of China where it occurs on Rhododendron in montane habitats. Aleurodiscus pinicola occurs on Pinus in montane settings in Taiwan and northwest Yunnan. Aleurodiscus senticosus is from subtropical Taiwan, where it occurs on angiosperms. Aleurodiscus sichuanensis is reported from southwest China on angiosperms in montane environments. Phylogenetic relationships of these four new species were inferred from analyses of a combined dataset consisting of three genetic markers, viz. 28S, nuc rDNA ITS1-5.8S-ITS2 (ITS), and a portion of the translation elongation factor 1-alpha gene, TEF1.
Affiliation(s)
- Sheng-Hua Wu
- Department of Biology, National Museum of Natural Science, Taichung 40419, Taiwan
| | - Chia-Ling Wei
- Department of Biology, National Museum of Natural Science, Taichung 40419, Taiwan
| | - Yu-Ting Lin
- Department of Biology, National Museum of Natural Science, Taichung 40419, Taiwan
| | - Chiung-Chih Chang
- Department of Biology, National Museum of Natural Science, Taichung 40419, Taiwan
| | - Shuang-Hui He
- Institute of Microbiology, Beijing Forestry University, Beijing 100083, China
| |
|
26
|
Engel J, Veksler-Lublinsky I, Ziv-Ukelson M. Constrained Gene Block Discovery and Its Application to Prokaryotic Genomes. J Comput Biol 2019; 26:745-766. [PMID: 31140838 DOI: 10.1089/cmb.2019.0096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
Recent advances in Next Generation Sequencing techniques, combined with global efforts to study infectious diseases, yield huge and rapidly-growing databases of microbial genomes. These big new data statistically empower genomic-context based approaches to functional analysis: the idea is that groups of genes that are clustered locally together across many genomes usually express protein products that interact in the same biological pathway (e.g., operons). The problem of finding such conserved "gene blocks" in a given genomic data has been studied extensively. In this work, we propose a new gene block discovery problem variant: find conserved gene blocks abiding by a user specification of biological functional constraints. We take advantage of the biological constraints to efficiently prune the search space. This is achieved by modeling the new problem as a special constrained variant of the well-studied "Closed Frequent Itemset Mining" problem, generalized here to handle item duplications. We exemplify the application of the tool we developed for this problem with two different case studies related to microbial ATP (adenosine triphosphate)-binding cassette (ABC) transporters.
Affiliation(s)
- Jonathan Engel
- Department of Computer Science, Ben Gurion University of the Negev, Beer-Sheva, Israel
| | - Isana Veksler-Lublinsky
- Department of Software and Information Systems Engineering, Ben Gurion University of the Negev, Beer-Sheva, Israel
| | - Michal Ziv-Ukelson
- Department of Computer Science, Ben Gurion University of the Negev, Beer-Sheva, Israel
| |
|
27
|
Díaz FP, Latorre C, Carrasco-Puga G, Wood JR, Wilmshurst JM, Soto DC, Cole TL, Gutiérrez RA. Multiscale climate change impacts on plant diversity in the Atacama Desert. Glob Chang Biol 2019; 25:1733-1745. [PMID: 30706600 DOI: 10.1111/gcb.14583] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 08/14/2018] [Accepted: 01/17/2019] [Indexed: 06/09/2023]
Abstract
Comprehending ecological dynamics requires not only knowledge of modern communities but also detailed reconstructions of ecosystem history. Ancient DNA (aDNA) metabarcoding allows biodiversity responses to major climatic change to be explored at different spatial and temporal scales. We extracted aDNA preserved in fossil rodent middens to reconstruct late Quaternary vegetation dynamics in the hyperarid Atacama Desert. By comparing our paleo-informed millennial record with contemporary observations of interannual variations in diversity, we show that local plant communities behave differently at different timescales. In the interannual (years to decades) time frame, only annual herbaceous plants expand and contract their distributional ranges (emerging from persistent seed banks) in response to precipitation, whereas the distribution of perennials appears to be extraordinarily resilient. In contrast, at longer timescales (thousands of years) many perennial species were displaced up to 1,000 m downslope during pluvial events. Given ongoing and future natural and anthropogenically induced climate change, our results not only provide baselines for vegetation in the Atacama Desert, but also help to inform how these and other high mountain plant communities may respond to fluctuations of climate in the future.
Affiliation(s)
- Francisca P Díaz
- Departamento de Genética Molecular y Microbiología, Pontificia Universidad Católica de Chile, Santiago, Chile
- FONDAP Center for Genome Regulation & Millennium Institute for Integrative Biology (iBio), Santiago, Chile
| | - Claudio Latorre
- Departamento de Ecología, Pontificia Universidad Católica de Chile, Santiago, Chile
- Institute of Ecology and Biodiversity (IEB), Ñuñoa, Santiago, Chile
| | - Gabriela Carrasco-Puga
- Departamento de Genética Molecular y Microbiología, Pontificia Universidad Católica de Chile, Santiago, Chile
- FONDAP Center for Genome Regulation & Millennium Institute for Integrative Biology (iBio), Santiago, Chile
| | - Jamie R Wood
- Manaaki Whenua - Landcare Research, Lincoln, New Zealand
| | - Janet M Wilmshurst
- Manaaki Whenua - Landcare Research, Lincoln, New Zealand
- School of Environment, The University of Auckland, Auckland, New Zealand
| | - Daniela C Soto
- Departamento de Genética Molecular y Microbiología, Pontificia Universidad Católica de Chile, Santiago, Chile
- FONDAP Center for Genome Regulation & Millennium Institute for Integrative Biology (iBio), Santiago, Chile
| | - Theresa L Cole
- Manaaki Whenua - Landcare Research, Lincoln, New Zealand
- Department of Zoology, University of Otago, Dunedin, New Zealand
| | - Rodrigo A Gutiérrez
- Departamento de Genética Molecular y Microbiología, Pontificia Universidad Católica de Chile, Santiago, Chile
- FONDAP Center for Genome Regulation & Millennium Institute for Integrative Biology (iBio), Santiago, Chile
| |
|
28
|
Catanach TA, Sweet AD, Nguyen NPD, Peery RM, Debevec AH, Thomer AK, Owings AC, Boyd BM, Katz AD, Soto-Adames FN, Allen JM. Fully automated sequence alignment methods are comparable to, and much faster than, traditional methods in large data sets: an example with hepatitis B virus. PeerJ 2019; 7:e6142. [PMID: 30627489 PMCID: PMC6321758 DOI: 10.7717/peerj.6142] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/07/2018] [Accepted: 11/14/2018] [Indexed: 01/05/2023] Open
Abstract
Aligning sequences for phylogenetic analysis (multiple sequence alignment; MSA) is an important, but increasingly computationally expensive step with the recent surge in DNA sequence data. Much of this sequence data is publicly available, but can be extremely fragmentary (i.e., a combination of full genomes and genomic fragments), which can compound the computational issues related to MSA. Traditionally, alignments are produced with automated algorithms and then checked and/or corrected "by eye" prior to phylogenetic inference. However, this manual curation is inefficient at the data scales required of modern phylogenetics and results in alignments that are not reproducible. Recently, methods have been developed for fully automating alignments of large data sets, but it is unclear if these methods produce alignments that result in compatible phylogenies when compared to more traditional alignment approaches that combined automated and manual methods. Here we use approximately 33,000 publicly available sequences from the hepatitis B virus (HBV), a globally distributed and rapidly evolving virus, to compare different alignment approaches. Using one data set comprised exclusively of whole genomes and a second that also included sequence fragments, we compared three MSA methods: (1) a purely automated approach using traditional software, (2) an automated approach including by eye manual editing, and (3) more recent fully automated approaches. To understand how these methods affect phylogenetic results, we compared resulting tree topologies based on these different alignment methods using multiple metrics. We further determined if the monophyly of existing HBV genotypes was supported in phylogenies estimated from each alignment type and under different statistical support thresholds. Traditional and fully automated alignments produced similar HBV phylogenies. 
Although there was variability between branch support thresholds, allowing lower support thresholds tended to result in more differences among trees. Therefore, differences between the trees could be best explained by phylogenetic uncertainty unrelated to the MSA method used. Nevertheless, automated alignment approaches did not require human intervention and were therefore considerably less time-intensive than traditional approaches. Because of this, we conclude that fully automated algorithms for MSA are fully compatible with older methods even in extremely difficult to align data sets. Additionally, we found that most HBV diagnostic genotypes did not correspond to evolutionarily-sound groups, regardless of alignment type and support threshold. This suggests there may be errors in genotype classification in the database or that HBV genotypes may need a revision.
Affiliation(s)
- Therese A. Catanach
- Ornithology Department, Academy of Natural Sciences of Drexel University, Philadelphia, PA, United States of America
- Illinois Natural History Survey, University of Illinois at Urbana-Champaign, Champaign, IL, United States of America
- Department of Wildlife and Fisheries Sciences, Texas A&M University, College Station, TX, United States of America
| | - Andrew D. Sweet
- Illinois Natural History Survey, University of Illinois at Urbana-Champaign, Champaign, IL, United States of America
- Department of Entomology, Purdue University, West Lafayette, IN, United States of America
| | - Nam-phuong D. Nguyen
- Computer Science and Engineering, University of California, San Diego, La Jolla, CA, United States of America
| | - Rhiannon M. Peery
- Department of Biology, University of Alberta, Edmonton, Alberta, Canada
- Department of Plant Biology, University of Illinois at Urbana-Champaign, Champaign, IL, United States of America
| | - Andrew H. Debevec
- School of Integrative Biology, University of Illinois at Urbana-Champaign, Champaign, IL, United States of America
| | - Andrea K. Thomer
- School of Information, University of Michigan—Ann Arbor, Ann Arbor, MI, United States of America
| | - Amanda C. Owings
- Program in Ecology, Evolution, and Conservation Biology, University of Illinois at Urbana-Champaign, Urbana, IL, United States of America
| | - Bret M. Boyd
- Illinois Natural History Survey, University of Illinois at Urbana-Champaign, Champaign, IL, United States of America
- Department of Entomology, University of Georgia, Athens, GA, United States of America
| | - Aron D. Katz
- Illinois Natural History Survey, University of Illinois at Urbana-Champaign, Champaign, IL, United States of America
- Department of Entomology, University of Illinois at Urbana-Champaign, Champaign, IL, United States of America
| | - Felipe N. Soto-Adames
- Florida State Collection of Arthropods, Florida Department of Agriculture and Consumer Services, Gainesville, FL, United States of America
- Department of Entomology and Nematology, University of Florida, Gainesville, FL, United States of America
| | - Julie M. Allen
- Biology Department, University of Nevada, Reno, Reno, NV, United States of America
| |
|
29
|
Markosian C, Di Costanzo L, Sekharan M, Shao C, Burley SK, Zardecki C. Analysis of impact metrics for the Protein Data Bank. Sci Data 2018; 5:180212. [PMID: 30325351 PMCID: PMC6190746 DOI: 10.1038/sdata.2018.212] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 04/11/2018] [Accepted: 08/29/2018] [Indexed: 01/13/2023] Open
Abstract
Since 1971, the Protein Data Bank (PDB) archive has served as the single, global repository for open access to atomic-level data for biological macromolecules. The archive currently holds >140,000 structures (>1 billion atoms). These structures are the molecules of life found in all organisms. Knowing the 3D structure of a biological macromolecule is essential for understanding the molecule's function, providing insights in health and disease, food and energy production, and other topics of concern to prosperity and sustainability. PDB data are freely and publicly available, without restrictions on usage. Through bibliometric and usage studies, we sought to determine the impact of the PDB across disciplines and demographics. Our analysis shows that even though research areas such as molecular biology and biochemistry account for the most usage, other fields are increasingly using PDB resources. PDB usage is seen across 150 disciplines in applied sciences, humanities, and social sciences. Data are also re-used and integrated with >400 resources. Our study identifies trends in PDB usage and documents its utility across research disciplines.
Affiliation(s)
- Christopher Markosian
- Department of Molecular Biology and Biochemistry, School of Arts and Sciences, Rutgers, The State University of New Jersey, Piscataway, NJ USA
| | - Luigi Di Costanzo
- RCSB Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ USA
| | - Monica Sekharan
- RCSB Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ USA
| | - Chenghua Shao
- RCSB Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ USA
| | - Stephen K Burley
- RCSB Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ USA
- RCSB Protein Data Bank, Skaggs School of Pharmacy and Pharmaceutical Sciences and San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ USA
| | - Christine Zardecki
- RCSB Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ USA
| |
|
30
|
|
31
|
Ramharack P, Soliman MES. Bioinformatics-based tools in drug discovery: the cartography from single gene to integrative biological networks. Drug Discov Today 2018; 23:1658-1665. [PMID: 29864527 DOI: 10.1016/j.drudis.2018.05.041] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 02/07/2018] [Revised: 05/12/2018] [Accepted: 05/29/2018] [Indexed: 02/02/2023]
Abstract
Originally developed for the analysis of biological sequences, bioinformatics has advanced into one of the most widely recognized domains in the scientific community. Despite this technological evolution, there is still an urgent need for nontoxic and efficient drugs. The onus now falls on the 'omics domain to meet this need by implementing bioinformatics techniques that will allow for the introduction of pioneering approaches in the rational drug design process. Here, we categorize an updated list of informatics tools and explore the capabilities of integrative bioinformatics in disease control. We believe that our review will serve as a comprehensive guide toward bioinformatics-oriented disease and drug discovery research.
Affiliation(s)
- Pritika Ramharack
- Molecular Bio-computation and Drug Design Laboratory, School of Health Sciences, University of KwaZulu-Natal, Westville Campus, Durban 4001, South Africa
| | - Mahmoud E S Soliman
- Molecular Bio-computation and Drug Design Laboratory, School of Health Sciences, University of KwaZulu-Natal, Westville Campus, Durban 4001, South Africa
| |
|
32
|
Haçarız O, Sayers GP. Generating a core cluster of Fasciola hepatica virulence and immunomodulation-related genes using a comparative in silico approach. Res Vet Sci 2018; 117:271-276. [DOI: 10.1016/j.rvsc.2017.12.023] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/05/2017] [Revised: 12/20/2017] [Accepted: 12/27/2017] [Indexed: 01/24/2023]
|
33
|
Ohta T, Nakazato T, Bono H. Calculating the quality of public high-throughput sequencing data to obtain a suitable subset for reanalysis from the Sequence Read Archive. Gigascience 2018; 6:1-8. [PMID: 28449062 PMCID: PMC5459929 DOI: 10.1093/gigascience/gix029] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 01/16/2017] [Accepted: 04/11/2017] [Indexed: 11/15/2022] Open
Abstract
It is important for public data repositories to promote the reuse of archived data. In the growing field of omics science, however, the increasing number of submissions of high-throughput sequencing (HTSeq) data to public repositories prevents users from choosing a suitable data set from among the large number of search results. Repository users need to be able to set a threshold to reduce the number of results to obtain a suitable subset of high-quality data for reanalysis. We calculated the quality of sequencing data archived in a public data repository, the Sequence Read Archive (SRA), by using the quality control software FastQC. We obtained quality values for 1 171 313 experiments, which can be used to evaluate the suitability of data for reuse. We also visualized the data distribution in SRA by integrating the quality information and metadata of experiments and samples. We provide quality information of all of the archived sequencing data, which enable users to obtain sufficient quality sequencing data for reanalyses. The calculated quality data are available to the public in various formats. Our data also provide an example of enhancing the reuse of public data by adding metadata to published research data by a third party.
Affiliation(s)
- Tazro Ohta
- Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Yata 1111, Mishima, Shizuoka 411-8540, Japan
| | | | - Hidemasa Bono
- Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Yata 1111, Mishima, Shizuoka 411-8540, Japan
| |
|
34
|
Jedličková L, Dvořáková H, Dvořák J, Kašný M, Ulrychová L, Vorel J, Žárský V, Mikeš L. Cysteine peptidases of Eudiplozoon nipponicum: a broad repertoire of structurally assorted cathepsins L in contrast to the scarcity of cathepsins B in an invasive species of haematophagous monogenean of common carp. Parasit Vectors 2018; 11:142. [PMID: 29510760 PMCID: PMC5840727 DOI: 10.1186/s13071-018-2666-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 09/14/2017] [Accepted: 01/22/2018] [Indexed: 01/30/2023] Open
Abstract
Background: Cysteine peptidases of clan CA, family C1 account for a major part of proteolytic activity in the haematophagous monogenean Eudiplozoon nipponicum. The full spectrum of cysteine cathepsins is, however, unknown and their particular biochemical properties, tissue localisation, and involvement in parasite-host relationships are yet to be explored.
Methods: Sequences of cathepsins L and B (EnCL and EnCB) were mined from E. nipponicum transcriptome and analysed bioinformatically. Genes encoding two EnCLs and one EnCB were cloned and recombinant proteins produced in vitro. The enzymes were purified by chromatography and their activity towards selected substrates was characterised. Antibodies and specific RNA probes were employed for localisation of the enzymes/transcripts in tissues of E. nipponicum adults.
Results: Transcriptomic analysis revealed a set of ten distinct transcripts that encode EnCLs. The enzymes are significantly variable in their active sites, specifically the S2 subsites responsible for interaction with substrates. Some of them display unusual structural features that resemble cathepsins B and S. Two recombinant EnCLs had different pH activity profiles against both synthetic and macromolecular substrates, and were able to hydrolyse blood proteins and collagen I. They were localised in the haematin cells of the worm’s digestive tract and in gut lumen. The EnCB showed similarity with cathepsin B2 of Schistosoma mansoni. It displays molecular features typical of cathepsins B, including an occluding loop responsible for its exopeptidase activity. Although the EnCB hydrolysed haemoglobin in vitro, it was localised in the vitelline cells of the parasite and not the digestive tract.
Conclusions: To our knowledge, this study represents the first complex bioinformatic and biochemical characterisation of cysteine peptidases in a monogenean. Eudiplozoon nipponicum adults express a variety of CLs, which are the most abundant peptidases in the worms. The properties and localisation of the two heterologously expressed EnCLs indicate a central role in the (partially extracellular?) digestion of host blood proteins. High variability of substrate-binding sites in the set of EnCLs suggests specific adaptation to a range of biological processes that require proteolysis. Surprisingly, a single cathepsin B is expressed by the parasite and it is not involved in digestion, but probably in vitellogenesis.
Affiliation(s)
- Lucie Jedličková
- Department of Parasitology, Faculty of Science, Charles University, Viničná 7, 12844, Prague 2, Czech Republic
| | - Hana Dvořáková
- Department of Parasitology, Faculty of Science, Charles University, Viničná 7, 12844, Prague 2, Czech Republic
| | - Jan Dvořák
- Medical Biology Centre, School of Biological Sciences, Queen's University Belfast, 97 Lisburn Road, Belfast, BT9 7BL, UK
- Department of Zoology and Fisheries, Faculty of Agrobiology, Food and Natural Resources, Czech University of Life Sciences Prague, Kamýcká 129, 16500, Prague 6, Czech Republic
| | - Martin Kašný
- Department of Parasitology, Faculty of Science, Charles University, Viničná 7, 12844, Prague 2, Czech Republic
- Department of Botany and Zoology, Faculty of Science, Masaryk University, Kotlářská 2, 611 37, Brno, Czech Republic
| | - Lenka Ulrychová
- Department of Parasitology, Faculty of Science, Charles University, Viničná 7, 12844, Prague 2, Czech Republic
- Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic, Flemingovo nám. 2, 16000, Prague 6, Czech Republic
| | - Jiří Vorel
- Department of Botany and Zoology, Faculty of Science, Masaryk University, Kotlářská 2, 611 37, Brno, Czech Republic
| | - Vojtěch Žárský
- Department of Parasitology, Faculty of Science, Charles University, Průmyslová 595, Vestec, 25250, Czech Republic
| | - Libor Mikeš
- Department of Parasitology, Faculty of Science, Charles University, Viničná 7, 12844, Prague 2, Czech Republic
| |
|
35
|
Dwivedi SL, Scheben A, Edwards D, Spillane C, Ortiz R. Assessing and Exploiting Functional Diversity in Germplasm Pools to Enhance Abiotic Stress Adaptation and Yield in Cereals and Food Legumes. Front Plant Sci 2017; 8:1461. [PMID: 28900432 PMCID: PMC5581882 DOI: 10.3389/fpls.2017.01461] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 04/21/2017] [Accepted: 08/07/2017] [Indexed: 05/03/2023]
Abstract
There is a need to accelerate crop improvement by introducing alleles conferring host plant resistance, abiotic stress adaptation, and high yield potential. Elite cultivars, landraces and wild relatives harbor useful genetic variation that needs to be more easily utilized in plant breeding. We review genome-wide approaches for assessing and identifying alleles associated with desirable agronomic traits in diverse germplasm pools of cereals and legumes. Major quantitative trait loci and single nucleotide polymorphisms (SNPs) associated with desirable agronomic traits have been deployed to enhance crop productivity and resilience. These include alleles associated with variation conferring enhanced photoperiod and flowering traits. Genetic variants in the florigen pathway can provide both environmental flexibility and improved yields. SNPs associated with length of growing season and tolerance to abiotic stresses (precipitation, high temperature) are valuable resources for accelerating breeding for drought-prone environments. Both genomic selection and genome editing can also harness allelic diversity and increase productivity by improving multiple traits, including phenology, plant architecture, yield potential and adaptation to abiotic stresses. Discovering rare alleles and useful haplotypes also provides opportunities to enhance abiotic stress adaptation, while epigenetic variation has potential to enhance abiotic stress adaptation and productivity in crops. By reviewing current knowledge on specific traits and their genetic basis, we highlight recent developments in the understanding of crop functional diversity and identify potential candidate genes for future use. The storage and integration of genetic, genomic and phenotypic information will play an important role in ensuring broad and rapid application of novel genetic discoveries by the plant breeding community. 
Exploiting alleles for yield-related traits would allow improvement of selection efficiency and overall genetic gain of multigenic traits. An integrated approach involving multiple stakeholders specializing in management and utilization of genetic resources, crop breeding, molecular biology and genomics, agronomy, stress tolerance, and reproductive/seed biology will help to address the global challenge of ensuring food security in the face of growing resource demands and climate change induced stresses.
Affiliation(s)
| | - Armin Scheben
- School of Biological Sciences, Institute of Agriculture, University of Western Australia, Perth, WA, Australia
| | - David Edwards
- School of Biological Sciences, Institute of Agriculture, University of Western Australia, Perth, WA, Australia
| | - Charles Spillane
- Plant and AgriBiosciences Research Centre, Ryan Institute, National University of Ireland Galway, Galway, Ireland
| | - Rodomiro Ortiz
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
|
36
|
Carrasco-Ramiro F, Peiró-Pastor R, Aguado B. Human genomics projects and precision medicine. Gene Ther 2017; 24:551-561. [PMID: 28805797 DOI: 10.1038/gt.2017.77] [Citation(s) in RCA: 78] [Impact Index Per Article: 11.1] [Received: 03/16/2017] [Revised: 07/31/2017] [Accepted: 08/04/2017] [Indexed: 12/31/2022]
Abstract
The completion of the Human Genome Project (HGP) in 2001 opened the floodgates to a deeper understanding of medicine. There are dozens of HGP-like projects which involve from a few tens to several million genomes currently in progress, which vary from having specialized goals or a more general approach. However, data generation, storage, management and analysis in public and private cloud computing platforms have raised concerns about privacy and security. The knowledge gained from further research has changed the field of genomics and is now slowly permeating into clinical medicine. The new precision (personalized) medicine, where genome sequencing and data analysis are essential components, allows tailored diagnosis and treatment according to the information from the patient's own genome and specific environmental factors. P4 (predictive, preventive, personalized and participatory) medicine is introducing new concepts, challenges and opportunities. This review summarizes current sequencing technologies, concentrates on ongoing human genomics projects, and provides some examples in which precision medicine has already demonstrated clinical impact in diagnosis and/or treatment.
Affiliation(s)
- F Carrasco-Ramiro
- Centro de Biología Molecular Severo Ochoa (CBMSO) CSIC-UAM. Genomics and Next Generation Sequencing Service. Campus de Excelencia Internacional (CEI) UAM+CSIC. Nicolás Cabrera 1, Madrid, Cantoblanco, Spain
| | - R Peiró-Pastor
- Centro de Biología Molecular Severo Ochoa (CBMSO) CSIC-UAM. Genomics and Next Generation Sequencing Service. Campus de Excelencia Internacional (CEI) UAM+CSIC. Nicolás Cabrera 1, Madrid, Cantoblanco, Spain
| | - B Aguado
- Centro de Biología Molecular Severo Ochoa (CBMSO) CSIC-UAM. Genomics and Next Generation Sequencing Service. Campus de Excelencia Internacional (CEI) UAM+CSIC. Nicolás Cabrera 1, Madrid, Cantoblanco, Spain
|
37
|
Pilon AC, Valli M, Dametto AC, Pinto MEF, Freire RT, Castro-Gamboa I, Andricopulo AD, Bolzani VS. NuBBE DB: an updated database to uncover chemical and biological information from Brazilian biodiversity. Sci Rep 2017; 7:7215. [PMID: 28775335 PMCID: PMC5543130 DOI: 10.1038/s41598-017-07451-x] [Citation(s) in RCA: 98] [Impact Index Per Article: 14.0] [Received: 04/25/2017] [Accepted: 06/28/2017] [Indexed: 01/24/2023]
Abstract
The intrinsic value of biodiversity extends beyond species diversity, genetic heritage, ecosystem variability and ecological services, such as climate regulation, water quality, nutrient cycling and the provision of reproductive habitats; it is also an inexhaustible source of molecules and products beneficial to human well-being. To uncover the chemistry of Brazilian natural products, the Nuclei of Bioassays, Ecophysiology and Biosynthesis of Natural Products Database (NuBBEDB) was created as the first natural product library from Brazilian biodiversity. Since its launch in 2013, the NuBBEDB has proven to be an important resource for new drug design and dereplication studies. Consequently, continuous efforts have been made to expand its contents and include a greater diversity of natural sources to establish it as a comprehensive compendium of available biogeochemical information about Brazilian biodiversity. The content in the NuBBEDB is freely accessible online (https://nubbe.iq.unesp.br/portal/nubbedb.html) and provides validated multidisciplinary information, chemical descriptors, species sources, geographic locations, spectroscopic data (NMR) and pharmacological properties. Herein, we report the latest advancements concerning the interface, content and functionality of the NuBBEDB. We also present a preliminary study on the current profile of the compounds present in Brazilian territory.
Affiliation(s)
- Alan C Pilon
- Nuclei of Bioassays, Biosynthesis and Ecophysiology of Natural Products (NuBBE), Department of Organic Chemistry, Institute of Chemistry, Sao Paulo State University - UNESP, 14800-060, Araraquara, SP, Brazil
| | - Marilia Valli
- Nuclei of Bioassays, Biosynthesis and Ecophysiology of Natural Products (NuBBE), Department of Organic Chemistry, Institute of Chemistry, Sao Paulo State University - UNESP, 14800-060, Araraquara, SP, Brazil
| | - Alessandra C Dametto
- Nuclei of Bioassays, Biosynthesis and Ecophysiology of Natural Products (NuBBE), Department of Organic Chemistry, Institute of Chemistry, Sao Paulo State University - UNESP, 14800-060, Araraquara, SP, Brazil
| | - Meri Emili F Pinto
- Nuclei of Bioassays, Biosynthesis and Ecophysiology of Natural Products (NuBBE), Department of Organic Chemistry, Institute of Chemistry, Sao Paulo State University - UNESP, 14800-060, Araraquara, SP, Brazil
| | - Rafael T Freire
- Centro de Imagens e Espectroscopia in vivo por Ressonância Magnética, Institute of Physics of Sao Carlos, University of Sao Paulo - USP, 13566-590, Sao Carlos, SP, Brazil
| | - Ian Castro-Gamboa
- Nuclei of Bioassays, Biosynthesis and Ecophysiology of Natural Products (NuBBE), Department of Organic Chemistry, Institute of Chemistry, Sao Paulo State University - UNESP, 14800-060, Araraquara, SP, Brazil
| | - Adriano D Andricopulo
- Laboratório de Química Medicinal e Computacional (LQMC), Centro de Pesquisa e Inovação em Biodiversidade e Fármacos, Institute of Physics of Sao Carlos, University of Sao Paulo - USP, 13563-120, Sao Carlos, SP, Brazil
| | - Vanderlan S Bolzani
- Nuclei of Bioassays, Biosynthesis and Ecophysiology of Natural Products (NuBBE), Department of Organic Chemistry, Institute of Chemistry, Sao Paulo State University - UNESP, 14800-060, Araraquara, SP, Brazil.
|
38
|
Serrano AE, Escudero LV, Tebes-Cayo C, Acosta M, Encalada O, Fernández-Moroso S, Demergasso C. First draft genome sequence of a strain from the genus Fusibacter isolated from Salar de Ascotán in Northern Chile. Stand Genomic Sci 2017; 12:43. [PMID: 28770028 PMCID: PMC5525254 DOI: 10.1186/s40793-017-0252-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 02/02/2017] [Accepted: 07/12/2017] [Indexed: 11/18/2022]
Abstract
Fusibacter sp. 3D3 (ATCC BAA-2418) is an arsenate-reducing halotolerant strain within the Firmicutes phylum, isolated from the Salar de Ascotán, a hypersaline salt flat in Northern Chile. This high-Andean closed basin is an athalassohaline environment located at the bottom of a tectonic basin surrounded by mountain ranges, including some active volcanoes. This landscape can be an advantageous system to explore the effect of salinity on microorganisms that mediate biogeochemical reactions. Since 2000, microbial reduction of arsenic has been evidenced in the system, and the phylogenetic analysis of the original community plus the culture enrichments has revealed the predominance of the Firmicutes phylum. Here, we describe the first whole draft genome sequence of an arsenic-reducing strain belonging to the Fusibacter genus showing the highest 16S rRNA gene sequence similarity (98%) with Fusibacter sp. strain Vns02. The draft genome consists of 57 contigs with 5,111,250 bp and an average G + C content of 37.6%. Out of 4780 total genes predicted, 4700 genes code for proteins and 80 genes for RNAs. Insights from the genome sequence and some microbiological features of the strain 3D3 are available under Bioproject accession PRJDB4973 and Biosample SAMD00055724. The release of the genome sequence of this strain could contribute to the understanding of the arsenic biogeochemistry in extreme environments.
Affiliation(s)
- Antonio E Serrano
- Centro de Biotecnología, Universidad Católica del Norte, Antofagasta, Chile
| | - Lorena V Escudero
- Centro de Biotecnología, Universidad Católica del Norte, Antofagasta, Chile; Centro de Investigación Científica y Tecnológica para la Minería, Antofagasta, Chile
| | - Cinthya Tebes-Cayo
- Centro de Biotecnología, Universidad Católica del Norte, Antofagasta, Chile
| | - Mauricio Acosta
- Centro de Biotecnología, Universidad Católica del Norte, Antofagasta, Chile
| | - Olga Encalada
- Centro de Biotecnología, Universidad Católica del Norte, Antofagasta, Chile
| | | | - Cecilia Demergasso
- Centro de Biotecnología, Universidad Católica del Norte, Antofagasta, Chile; Centro de Investigación Científica y Tecnológica para la Minería, Antofagasta, Chile
|
39
|
Galperin MY, Fernández-Suárez XM, Rigden DJ. The 24th annual Nucleic Acids Research database issue: a look back and upcoming changes. Nucleic Acids Res 2017; 45:D1-D11. [PMID: 28053160 PMCID: PMC5210597 DOI: 10.1093/nar/gkw1188] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Received: 11/11/2016] [Accepted: 11/16/2016] [Indexed: 12/23/2022]
Abstract
This year's Database Issue of Nucleic Acids Research contains 152 papers that include descriptions of 54 new databases and update papers on 98 databases, of which 16 have not been previously featured in NAR. As always, these databases cover a broad range of molecular biology subjects, including genome structure, gene expression and its regulation, proteins, protein domains, and protein-protein interactions. Following the recent trend, an increasing number of new and established databases deal with the issues of human health, from cancer-causing mutations to drugs and drug targets. In accordance with this trend, three recently compiled databases that have been selected by NAR reviewers and editors as 'breakthrough' contributions, denovo-db, the Monarch Initiative, and Open Targets, cover human de novo gene variants, disease-related phenotypes in model organisms, and a bioinformatics platform for therapeutic target identification and validation, respectively. We expect these databases to attract the attention of numerous researchers working in various areas of genetics and genomics. Looking back at the past 12 years, we present here the 'golden set' of databases that have consistently served as authoritative, comprehensive, and convenient data resources widely used by the entire community and offer some lessons on what makes a successful database. The Database Issue is freely available online at the https://academic.oup.com/nar web site. An updated version of the NAR Molecular Biology Database Collection is available at http://www.oxfordjournals.org/nar/database/a/.
Affiliation(s)
- Michael Y Galperin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | | | - Daniel J Rigden
- Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, UK
|
40
|
Lin YT, Shih HH, Hulcr J, Lin CS, Lu SS, Chen CY. Ambrosiella in Taiwan including one new species. MYCOSCIENCE 2017. [DOI: 10.1016/j.myc.2017.02.004] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]
|
41
|
Surendranath V, Albrecht V, Hayhurst JD, Schöne B, Robinson J, Marsh SGE, Schmidt AH, Lange V. TypeLoader: A fast and efficient automated workflow for the annotation and submission of novel full-length HLA alleles. HLA 2017; 90:25-31. [DOI: 10.1111/tan.13055] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 12/05/2016] [Revised: 03/17/2017] [Accepted: 04/19/2017] [Indexed: 11/26/2022]
Affiliation(s)
| | | | - J. D. Hayhurst
- Anthony Nolan Research Institute; Royal Free Hospital; London UK
| | - B. Schöne
- DKMS Life Science Lab; Dresden Germany
| | - J. Robinson
- Anthony Nolan Research Institute; Royal Free Hospital; London UK
- UCL Cancer Institute; University College London; London UK
| | - S. G. E. Marsh
- Anthony Nolan Research Institute; Royal Free Hospital; London UK
- UCL Cancer Institute; University College London; London UK
| | - A. H. Schmidt
- DKMS Life Science Lab; Dresden Germany
- DKMS; Tübingen Germany
| | - V. Lange
- DKMS Life Science Lab; Dresden Germany
|
42
|
Takeda JI, Masuda A, Ohno K. Six GU-rich (6GUR) FUS-binding motifs detected by normalization of CLIP-seq by Nascent-seq. Gene 2017; 618:57-64. [PMID: 28392367 DOI: 10.1016/j.gene.2017.04.008] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 09/07/2016] [Revised: 04/03/2017] [Accepted: 04/05/2017] [Indexed: 12/13/2022]
Abstract
FUS, an RNA-binding protein (RBP), is mutated or abnormally regulated in neurodegenerative disorders. FUS regulates various aspects of RNA metabolism. FUS-binding sites are rich in GU content and are highly degenerate. FUS-binding motifs of GGU, GGUG, GUGGU and CGCGC have been previously reported. These motifs, however, are applicable to a small fraction of FUS-binding sites. As CLIP-seq tags are enriched in genes that are highly expressed, we normalized CLIP-seq tags by Nascent-seq tags or RNA-seq tags of mouse N2a cells. Nascent-seq identifies nascent transcripts before being processed for splicing and polyadenylation. We extracted frequently observed 4-nt motifs from Nascent-seq-normalized CLIP regions, RNA-seq-normalized CLIP regions, and native CLIP regions. Specific GU-rich motifs were best detected in Nascent-seq-normalized CLIP regions. Analysis of structural motifs using Nascent-seq-normalized CLIP regions also predicted a GU-rich sequence forming a stem structure. Sensitivity and specificity were calculated by examining whether the extracted motifs were present at the cross-linking-induced mutation sites (CIMS), where FUS was directly bound. We found that a combination of six motifs (UGUG, CUGG, UGGU, GCUG, GUGG, and UUGG), which were extracted from Nascent-seq-normalized CLIP regions, had a better discriminative power than (i) motifs extracted from RNA-seq-normalized CLIP regions, (ii) motifs extracted from native CLIP regions, (iii) previously reported individual motifs, or (iv) 15 motifs in SpliceAid 2. Validation of the 6 GU-rich (6GUR) motifs using CLIP-seq of the cerebrum and the whole brain showed that the 6GUR motifs were specifically enriched in CIMS. The number of the 6GUR motifs in an uninterrupted region was counted and multiplied by four to calculate the area, which was defined as the 6GUR-Score. A 6GUR-Score of 8 or more best discriminated CIMS from CIMS-flanking regions. We propose that the 6GUR motifs predict FUS-binding sites more efficiently than previously reported individual motifs or 15 motifs in SpliceAid 2.
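The 6GUR-Score described in this abstract can be illustrated with a minimal Python sketch. The abstract does not define "uninterrupted region" precisely, so the sketch assumes it means a run of back-to-back, non-overlapping 4-nt motifs found by a greedy left-to-right scan; the function names `six_gur_scores` and `has_high_6gur_score` are hypothetical and are not taken from the paper.

```python
# The six 4-nt motifs listed in the abstract.
SIX_GUR_MOTIFS = frozenset({"UGUG", "CUGG", "UGGU", "GCUG", "GUGG", "UUGG"})
MOTIF_LEN = 4


def six_gur_scores(seq):
    """Return (start, score) for each run of adjacent 6GUR motifs.

    Assumption (not from the paper): a region is "uninterrupted" when
    motifs follow each other with no gap and no overlap. The score is
    the motif count multiplied by four, as stated in the abstract.
    """
    rna = seq.upper().replace("T", "U")  # accept DNA-style input too
    regions = []
    i = 0
    while i + MOTIF_LEN <= len(rna):
        count = 0
        j = i
        # Extend the run while the next in-frame 4-mer is a motif.
        while rna[j:j + MOTIF_LEN] in SIX_GUR_MOTIFS:
            count += 1
            j += MOTIF_LEN
        if count:
            regions.append((i, count * MOTIF_LEN))
            i = j
        else:
            i += 1
    return regions


def has_high_6gur_score(seq, threshold=8):
    """True if any region reaches the threshold of 8 reported in the abstract."""
    return any(score >= threshold for _, score in six_gur_scores(seq))
```

Under these assumptions, a sequence such as `AAUGUGGUGGAA` contains two adjacent motifs (UGUG, GUGG) and therefore one region with a score of 8. A different reading of "uninterrupted" (for example, allowing overlapping motifs) would give different scores.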
Affiliation(s)
- Jun-Ichi Takeda
- Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, 65 Tsurumai, Showa-ku, Nagoya 466-8550, Japan
| | - Akio Masuda
- Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, 65 Tsurumai, Showa-ku, Nagoya 466-8550, Japan
| | - Kinji Ohno
- Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, 65 Tsurumai, Showa-ku, Nagoya 466-8550, Japan.
|
43
|
Bell KL, Loeffler VM, Brosi BJ. An rbcL reference library to aid in the identification of plant species mixtures by DNA metabarcoding. APPLICATIONS IN PLANT SCIENCES 2017; 5:apps1600110. [PMID: 28337390 PMCID: PMC5357121 DOI: 10.3732/apps.1600110] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Received: 09/16/2016] [Accepted: 01/30/2017] [Indexed: 05/03/2023]
Abstract
PREMISE OF THE STUDY: DNA metabarcoding has broad-ranging applications in ecology, aerobiology, biosecurity, and forensics. A bioinformatics pipeline has recently been published for identification using a comprehensive database of ITS2, one of the common plant DNA barcoding markers. There is, however, no corresponding database for rbcL, the other primary marker used in plants. METHODS: Using publicly available data, we compiled a reference library of rbcL sequences and trained databases for use with UTAX and RDP classifier algorithms. We used this reference library, along with the existing bioinformatics pipeline and ITS2 reference library, to identify species in an artificial mixture of nine species of pollen. We have made this database publicly available in multiple formats, to allow use with multiple bioinformatics pipelines, now and in the future. RESULTS: Using the rbcL database, in addition to the ITS2 database, we succeeded in making species-level identifications for eight species and a family-level identification of the ninth species. This is an improvement on ITS2 sequence alone. DISCUSSION: The reference library described here will assist with identification of plant species using rbcL. By making another gene region available for standard barcoding, this will increase the resolution and accuracy of identifications.
Affiliation(s)
- Karen L. Bell
- Department of Environmental Science, Emory University, 400 Dowman Drive, Atlanta, Georgia 30322 USA
- Author for correspondence:
| | - Virginia M. Loeffler
- Department of Environmental Science, Emory University, 400 Dowman Drive, Atlanta, Georgia 30322 USA
| | - Berry J. Brosi
- Department of Environmental Science, Emory University, 400 Dowman Drive, Atlanta, Georgia 30322 USA
|
44
|
Wang Y, Song F, Zhu J, Zhang S, Yang Y, Chen T, Tang B, Dong L, Ding N, Zhang Q, Bai Z, Dong X, Chen H, Sun M, Zhai S, Sun Y, Yu L, Lan L, Xiao J, Fang X, Lei H, Zhang Z, Zhao W. GSA: Genome Sequence Archive. GENOMICS PROTEOMICS & BIOINFORMATICS 2017; 15:14-18. [PMID: 28387199 PMCID: PMC5339404 DOI: 10.1016/j.gpb.2017.01.001] [Citation(s) in RCA: 425] [Impact Index Per Article: 60.7] [Received: 01/05/2017] [Accepted: 01/07/2017] [Indexed: 11/30/2022]
Abstract
With the rapid development of sequencing technologies towards higher throughput and lower cost, sequence data are generated at an unprecedentedly explosive rate. To provide an efficient and easy-to-use platform for managing huge sequence data, here we present Genome Sequence Archive (GSA; http://bigd.big.ac.cn/gsa or http://gsa.big.ac.cn), a data repository for archiving raw sequence data. In compliance with data standards and structures of the International Nucleotide Sequence Database Collaboration (INSDC), GSA adopts four data objects (BioProject, BioSample, Experiment, and Run) for data organization, accepts raw sequence reads produced by a variety of sequencing platforms, stores both sequence reads and metadata submitted from all over the world, and makes all these data publicly available to worldwide scientific communities. In the era of big data, GSA is not only an important complement to existing INSDC members by alleviating the increasing burdens of handling sequence data deluge, but also takes the significant responsibility for global big data archive and provides free unrestricted access to all publicly available data in support of research activities throughout the world.
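The four data objects named in this abstract (BioProject, BioSample, Experiment, and Run) can be pictured with a small Python sketch. The abstract does not describe how the objects reference each other, so the linkage below (an Experiment pointing at one BioProject and one BioSample, a Run pointing at one Experiment) is an assumption modeled on the general INSDC layout. All field names, accessions, and the helper `runs_for_project` are hypothetical placeholders, not GSA's actual schema or API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class BioProject:
    accession: str  # hypothetical placeholder accession
    title: str


@dataclass(frozen=True)
class BioSample:
    accession: str
    organism: str


@dataclass(frozen=True)
class Experiment:
    accession: str
    project: str   # accession of the owning BioProject (assumed link)
    sample: str    # accession of the sequenced BioSample (assumed link)
    platform: str


@dataclass(frozen=True)
class Run:
    accession: str
    experiment: str         # accession of the parent Experiment (assumed link)
    files: Tuple[str, ...]  # raw read files belonging to this run


def runs_for_project(project_acc: str,
                     experiments: List[Experiment],
                     runs: List[Run]) -> List[str]:
    """List run accessions reachable from a project via its experiments."""
    exp_ids = {e.accession for e in experiments if e.project == project_acc}
    return [r.accession for r in runs if r.experiment in exp_ids]


# Illustrative records only; none of these accessions are real.
project = BioProject("PRJ_DEMO_1", "Demo sequencing project")
sample = BioSample("SAM_DEMO_1", "Oryza sativa")
experiments = [
    Experiment("EXP_DEMO_1", project.accession, sample.accession, "Illumina"),
    Experiment("EXP_DEMO_2", "PRJ_DEMO_2", sample.accession, "Illumina"),
]
runs = [
    Run("RUN_DEMO_1", "EXP_DEMO_1", ("reads_1.fastq.gz", "reads_2.fastq.gz")),
    Run("RUN_DEMO_2", "EXP_DEMO_1", ("reads_3.fastq.gz",)),
    Run("RUN_DEMO_3", "EXP_DEMO_2", ("reads_4.fastq.gz",)),
]
```

Storing links as accession strings rather than nested objects keeps each record independent, which loosely mirrors how metadata and reads are submitted as separate objects; this is a design choice of the sketch rather than a documented property of GSA.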
Affiliation(s)
- Yanqing Wang
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Fuhai Song
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Junwei Zhu
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Sisi Zhang
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Yadong Yang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Tingting Chen
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Bixia Tang
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Lili Dong
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Nan Ding
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Qian Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Zhouxian Bai
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Xunong Dong
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Huanxin Chen
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Mingyuan Sun
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Shuang Zhai
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Yubin Sun
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Lei Yu
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Li Lan
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Jingfa Xiao
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China; Collaborative Innovation Center of Genetics and Development, Fudan University, Shanghai 200438, China
| | - Xiangdong Fang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China; Collaborative Innovation Center of Genetics and Development, Fudan University, Shanghai 200438, China.
| | - Hongxing Lei
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China; Center of Alzheimer's Disease, Beijing Institute for Brain Disorders, Beijing 100053, China.
| | - Zhang Zhang
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China; Collaborative Innovation Center of Genetics and Development, Fudan University, Shanghai 200438, China.
| | - Wenming Zhao
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China; Collaborative Innovation Center of Genetics and Development, Fudan University, Shanghai 200438, China.
|
45
|
Bagnato C, Have AT, Prados MB, Beligni MV. Computational Functional Analysis of Lipid Metabolic Enzymes. Methods Mol Biol 2017; 1609:195-216. [PMID: 28660584 DOI: 10.1007/978-1-4939-6996-8_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
The computational analysis of enzymes that participate in lipid metabolism has both common and unique challenges when compared to the whole protein universe. Some of the hurdles that interfere with the functional annotation of lipid metabolic enzymes that are common to other pathways include the definition of proper starting datasets, the construction of reliable multiple sequence alignments, the definition of appropriate evolutionary models, and the reconstruction of phylogenetic trees with high statistical support, particularly for large datasets. Most enzymes that take part in lipid metabolism belong to complex superfamilies with many members that are not involved in lipid metabolism. In addition, some enzymes that do not have sequence similarity catalyze similar or even identical reactions. Some of the challenges that, albeit not unique, are more specific to lipid metabolism refer to the high compartmentalization of the routes, the catalysis in hydrophobic environments and, related to this, the function near or in biological membranes. In this work, we provide guidelines intended to assist in the proper functional annotation of lipid metabolic enzymes, based on previous experiences related to the phospholipase D superfamily and the annotation of the triglyceride synthesis pathway in algae. We describe a pipeline that starts with the definition of an initial set of sequences to be used in similarity-based searches and ends in the reconstruction of phylogenies. We also mention the main issues that have to be taken into consideration when using tools to analyze subcellular localization, hydrophobicity patterns, or presence of transmembrane domains in lipid metabolic enzymes.
Affiliation(s)
- Carolina Bagnato
- Instituto de Energía y Desarrollo Sustentable-Comisión Nacional de Energía Atómica, Centro Atómico Bariloche, S. C. de Bariloche, 8400, Río Negro, Argentina
| | - Arjen Ten Have
- Instituto de Investigaciones Biológicas (IIB-CONICET-UNMdP), Facultad de Ciencias Exactas y Naturales, Universidad Nacional de Mar del Plata, Mar del Plata, 7600, Argentina
| | - María B Prados
- Instituto de Energía y Desarrollo Sustentable-Comisión Nacional de Energía Atómica, Centro Atómico Bariloche, S. C. de Bariloche, 8400, Río Negro, Argentina
| | - María V Beligni
- Instituto de Investigaciones Biológicas (IIB-CONICET-UNMdP), Facultad de Ciencias Exactas y Naturales, Universidad Nacional de Mar del Plata, Mar del Plata, 7600, Argentina.
|
46
|
Tahsin T, Weissenbacher D, Jones-Shargani D, Magee D, Vaiente M, Gonzalez G, Scotch M. Named entity linking of geospatial and host metadata in GenBank for advancing biomedical research. Database (Oxford) 2017; 2017:4781736. [PMID: 30412219 PMCID: PMC6225896 DOI: 10.1093/database/bax093] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 06/03/2017] [Revised: 11/20/2017] [Accepted: 11/21/2017] [Indexed: 02/06/2023]
Abstract
Database URL: https://zodo.asu.edu/zoophydb/
Affiliation(s)
- Tasnia Tahsin
- Department of Biomedical Informatics, Arizona State University, 13212 E Shea Blvd, Scottsdale, AZ 85259, USA
| | - Davy Weissenbacher
- Department of Biomedical Informatics, Arizona State University, 13212 E Shea Blvd, Scottsdale, AZ 85259, USA
- Biodesign Center for Environmental Health Engineering, Arizona State University 781 E, Terrace Mall Tempe, AZ 85281 USA
| | - Demetrius Jones-Shargani
- Biodesign Center for Environmental Health Engineering, Arizona State University 781 E, Terrace Mall Tempe, AZ 85281 USA
| | - Daniel Magee
- Department of Biomedical Informatics, Arizona State University, 13212 E Shea Blvd, Scottsdale, AZ 85259, USA
- Biodesign Center for Environmental Health Engineering, Arizona State University 781 E, Terrace Mall Tempe, AZ 85281 USA
| | - Matteo Vaiente
- Department of Biomedical Informatics, Arizona State University, 13212 E Shea Blvd, Scottsdale, AZ 85259, USA
- Biodesign Center for Environmental Health Engineering, Arizona State University 781 E, Terrace Mall Tempe, AZ 85281 USA
| | - Graciela Gonzalez
- Department of Biomedical Informatics, Arizona State University, 13212 E Shea Blvd, Scottsdale, AZ 85259, USA
- Institute of Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, 423 Guardian Drive, Philadelphia, PA 19104, USA
| | - Matthew Scotch
- Department of Biomedical Informatics, Arizona State University, 13212 E Shea Blvd, Scottsdale, AZ 85259, USA
- Biodesign Center for Environmental Health Engineering, Arizona State University 781 E, Terrace Mall Tempe, AZ 85281 USA
|
47
|
Gojobori T, Ikeo K, Katayama Y, Kawabata T, Kinjo AR, Kinoshita K, Kwon Y, Migita O, Mizutani H, Muraoka M, Nagata K, Omori S, Sugawara H, Yamada D, Yura K. VaProS: a database-integration approach for protein/genome information retrieval. JOURNAL OF STRUCTURAL AND FUNCTIONAL GENOMICS 2016; 17:69-81. [PMID: 28012137 PMCID: PMC5274651 DOI: 10.1007/s10969-016-9211-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 09/03/2016] [Accepted: 12/05/2016] [Indexed: 01/01/2023]
Abstract
Life science research now heavily relies on all sorts of databases for genome sequences, transcription, protein three-dimensional (3D) structures, protein-protein interactions, phenotypes and so forth. The knowledge accumulated by all the omics research is so vast that a computer-aided search of data is now a prerequisite for starting a new study. In addition, a combinatory search throughout these databases has a chance to extract new ideas and new hypotheses that can be examined by wet-lab experiments. By virtually integrating the related databases on the Internet, we have built a new web application that facilitates life science researchers for retrieving experts' knowledge stored in the databases and for building a new hypothesis of the research target. This web application, named VaProS, puts stress on the interconnection between the functional information of genome sequences and protein 3D structures, such as structural effect of the gene mutation. In this manuscript, we present the notion of VaProS, the databases and tools that can be accessed without any knowledge of database locations and data formats, and the power of search exemplified in quest of the molecular mechanisms of lysosomal storage disease. VaProS can be freely accessed at http://p4d-info.nig.ac.jp/vapros/ .
Affiliation(s)
- Takashi Gojobori
  - Computational Bioscience Research Center, Biological and Environmental Sciences and Engineering, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Saudi Arabia
  - National Institute of Genetics, Shizuoka, 411-8540, Mishima, Japan
- Kazuho Ikeo
  - National Institute of Genetics, Shizuoka, 411-8540, Mishima, Japan
- Yukie Katayama
  - Graduate School of Agricultural and Life Sciences, University of Tokyo, Bunkyo, Tokyo, 113-8657, Japan
- Takeshi Kawabata
  - Institute for Protein Research, Osaka University, Suita, Osaka, 565-0871, Japan
- Akira R Kinjo
  - Institute for Protein Research, Osaka University, Suita, Osaka, 565-0871, Japan
- Kengo Kinoshita
  - Graduate School of Information Sciences, Tohoku University, Miyagi, Sendai, 980-8597, Japan
  - Tohoku Medical Megabank Organization, Tohoku University, Miyagi, Sendai, 980-8573, Japan
- Yeondae Kwon
  - Graduate School of Agricultural and Life Sciences, University of Tokyo, Bunkyo, Tokyo, 113-8657, Japan
- Ohsuke Migita
  - Department of Maternal-Fetal Biology, National Research Institute for Child Health and Development, Setagaya, Tokyo, 157-8535, Japan
  - Department of Pediatrics, St. Marianna University School of Medicine, Miyamae, Kawasaki, 216-8511, Japan
- Hisashi Mizutani
  - National Institute of Genetics, Shizuoka, 411-8540, Mishima, Japan
- Masafumi Muraoka
  - National Institute of Genetics, Shizuoka, 411-8540, Mishima, Japan
- Koji Nagata
  - Graduate School of Agricultural and Life Sciences, University of Tokyo, Bunkyo, Tokyo, 113-8657, Japan
- Satoshi Omori
  - Graduate School of Information Sciences, Tohoku University, Miyagi, Sendai, 980-8597, Japan
- Hideaki Sugawara
  - National Institute of Genetics, Shizuoka, 411-8540, Mishima, Japan
- Daichi Yamada
  - Center for Informational Biology, Ochanomizu University, 2-1-1, Otsuka, Bunkyo, Tokyo, 112-8610, Japan
- Kei Yura
  - National Institute of Genetics, Shizuoka, 411-8540, Mishima, Japan
  - Center for Informational Biology, Ochanomizu University, 2-1-1, Otsuka, Bunkyo, Tokyo, 112-8610, Japan
|
48
|
Toribio AL, Alako B, Amid C, Cerdeño-Tarrága A, Clarke L, Cleland I, Fairley S, Gibson R, Goodgame N, Ten Hoopen P, Jayathilaka S, Kay S, Leinonen R, Liu X, Martínez-Villacorta J, Pakseresht N, Rajan J, Reddy K, Rosello M, Silvester N, Smirnov D, Vaughan D, Zalunin V, Cochrane G. European Nucleotide Archive in 2016. Nucleic Acids Res 2016; 45:D32-D36. [PMID: 27899630 PMCID: PMC5210577 DOI: 10.1093/nar/gkw1106] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.9] [Received: 10/22/2016] [Revised: 10/25/2016] [Accepted: 10/31/2016] [Indexed: 02/07/2023] Open
Abstract
The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) offers a rich platform for data sharing, publishing and archiving and a globally comprehensive data set for onward use by the scientific community. With a broad scope spanning raw sequencing reads, genome assemblies and functional annotation, the resource provides extensive data submission, search and download facilities across web and programmatic interfaces. Here, we outline ENA content and major access modalities, highlight major developments in 2016 and outline a number of examples of data reuse from ENA.
Affiliation(s)
- Ana Luisa Toribio
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Blaise Alako
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Clara Amid
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ana Cerdeño-Tarrága
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Laura Clarke
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Iain Cleland
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Susan Fairley
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Richard Gibson
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Neil Goodgame
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Petra Ten Hoopen
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Suran Jayathilaka
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Simon Kay
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rasko Leinonen
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Xin Liu
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Josué Martínez-Villacorta
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nima Pakseresht
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jeena Rajan
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Kethi Reddy
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Marc Rosello
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nicole Silvester
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Dmitriy Smirnov
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniel Vaughan
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Vadim Zalunin
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Guy Cochrane
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
49
|
Benson DA, Cavanaugh M, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. GenBank. Nucleic Acids Res 2016; 45:D37-D42. [PMID: 27899564 PMCID: PMC5210553 DOI: 10.1093/nar/gkw1070] [Citation(s) in RCA: 300] [Impact Index Per Article: 37.5] [Received: 09/15/2016] [Revised: 10/19/2016] [Accepted: 11/07/2016] [Indexed: 01/19/2023] Open
Abstract
GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for 370 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or the NCBI Submission Portal. GenBank staff assign accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Nucleotide database, which links to related information such as taxonomy, genomes, protein sequences and structures, and biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. Recent updates include changes to policies regarding sequence identifiers, an improved 16S submission wizard, targeted loci studies, the ability to submit methylation and BioNano mapping files, and a database of anti-microbial resistance genes.
Affiliation(s)
- Dennis A Benson
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Mark Cavanaugh
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Karen Clark
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Ilene Karsch-Mizrachi
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
- David J Lipman
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
- James Ostell
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Eric W Sayers
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
|
50
|
Okuda S, Watanabe Y, Moriya Y, Kawano S, Yamamoto T, Matsumoto M, Takami T, Kobayashi D, Araki N, Yoshizawa AC, Tabata T, Sugiyama N, Goto S, Ishihama Y. jPOSTrepo: an international standard data repository for proteomes. Nucleic Acids Res 2016; 45:D1107-D1111. [PMID: 27899654 PMCID: PMC5210561 DOI: 10.1093/nar/gkw1080] [Citation(s) in RCA: 387] [Impact Index Per Article: 48.4] [Received: 08/14/2016] [Revised: 10/21/2016] [Accepted: 10/27/2016] [Indexed: 12/17/2022] Open
Abstract
Major advancements have recently been made in mass spectrometry-based proteomics, yielding an increasing number of datasets from various proteomics projects worldwide. In order to facilitate the sharing and reuse of promising datasets, it is important to construct appropriate, high-quality public data repositories. jPOSTrepo (https://repository.jpostdb.org/) has successfully implemented several unique features, including high-speed file uploading, flexible file management and easy-to-use interfaces. This repository has been launched as a public repository containing various proteomic datasets and is available for researchers worldwide. In addition, our repository has joined the ProteomeXchange consortium, which includes the most popular public repositories, such as PRIDE in Europe for MS/MS datasets and PASSEL in the USA for SRM datasets. MassIVE was later introduced in the USA and accepted into ProteomeXchange, as was our repository in July 2016, providing important datasets from Asia/Oceania. This repository thus contributes to a global alliance to share and store datasets from a wide variety of proteomics experiments, and it is expected to become a major repository, particularly for data collected in the Asia/Oceania region.
Affiliation(s)
- Shujiro Okuda
  - Niigata University Graduate School of Medical and Dental Sciences, Niigata 951-8510, Japan
- Yu Watanabe
  - Niigata University Graduate School of Medical and Dental Sciences, Niigata 951-8510, Japan
- Yuki Moriya
  - Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Kashiwa 277-0871, Japan
- Shin Kawano
  - Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Kashiwa 277-0871, Japan
- Tadashi Yamamoto
  - Biofluid Biomarker Center, Institute for Social Innovation and Cooperation, Niigata University, Niigata 950-2181, Japan
- Masaki Matsumoto
  - Medical Institute of Bioregulation, Kyushu University, Fukuoka 812-8582, Japan
- Tomoyo Takami
  - Medical Institute of Bioregulation, Kyushu University, Fukuoka 812-8582, Japan
- Daiki Kobayashi
  - Graduate School of Medical Sciences, Faculty of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
- Norie Araki
  - Graduate School of Medical Sciences, Faculty of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
- Akiyasu C Yoshizawa
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan
- Tsuyoshi Tabata
  - Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan
- Naoyuki Sugiyama
  - Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan
- Susumu Goto
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan
- Yasushi Ishihama
  - Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan
|