1
Nestor BJ, Bayer PE, Fernandez CGT, Edwards D, Finnegan PM. Approaches to increase the validity of gene family identification using manual homology search tools. Genetica 2023; 151:325-338. [PMID: 37817002 PMCID: PMC10692271 DOI: 10.1007/s10709-023-00196-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Accepted: 10/01/2023] [Indexed: 10/12/2023]
Abstract
Identifying homologs is an important process in the analysis of genetic patterns underlying traits and evolutionary relationships among species. Analysis of gene families is often used to form and support hypotheses on genetic patterns such as gene presence, absence, or functional divergence which underlie traits examined in functional studies. These analyses often require precise identification of all members in a targeted gene family. Manual pipelines where homology search and orthology assignment tools are used separately are the most common approach for identifying small gene families where accurate identification of all members is important. The ability to curate sequences between steps in manual pipelines allows for simple and precise identification of all possible gene family members. However, the validity of such manual pipeline analyses is often decreased by inappropriate approaches to homology searches including too relaxed or stringent statistical thresholds, inappropriate query sequences, homology classification based on sequence similarity alone, and low-quality proteome or genome sequences. In this article, we propose several approaches to mitigate these issues and allow for precise identification of gene family members and support for hypotheses linking genetic patterns to functional traits.
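The effect of relaxed versus stringent thresholds mentioned in this abstract can be illustrated with a minimal sketch. It assumes hits have already been parsed into records; the `Hit` fields, the example identifiers, and both cutoff values are hypothetical and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One homology search hit; the field names are hypothetical."""
    query: str
    subject: str
    evalue: float
    query_coverage: float  # fraction of the query aligned, 0.0 to 1.0


def filter_hits(hits, max_evalue=1e-5, min_coverage=0.5):
    """Keep hits that pass both an E-value and a query-coverage cutoff.

    The default cutoffs are illustrative assumptions only.
    """
    return [
        h for h in hits
        if h.evalue <= max_evalue and h.query_coverage >= min_coverage
    ]


# Made-up example hits for a single query sequence.
hits = [
    Hit("query1", "geneA", 1e-40, 0.95),
    Hit("query1", "geneB", 1e-3, 0.90),
    Hit("query1", "geneC", 1e-20, 0.20),
]

strict = filter_hits(hits)
relaxed = filter_hits(hits, max_evalue=1e-2, min_coverage=0.1)
print([h.subject for h in strict])
print([h.subject for h in relaxed])
```

Comparing the two lists shows which candidates depend entirely on the chosen cutoffs; those are the sequences that would most plausibly need manual curation or a follow-up orthology check.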
Affiliation(s)
- Benjamin J Nestor
  - School of Biological Sciences, University of Western Australia, Perth, WA, 6009, Australia
  - Centre for Applied Bioinformatics, University of Western Australia, Perth, WA, 6009, Australia
- Philipp E Bayer
  - School of Biological Sciences, University of Western Australia, Perth, WA, 6009, Australia
  - Centre for Applied Bioinformatics, University of Western Australia, Perth, WA, 6009, Australia
- Cassandria G Tay Fernandez
  - School of Biological Sciences, University of Western Australia, Perth, WA, 6009, Australia
  - Centre for Applied Bioinformatics, University of Western Australia, Perth, WA, 6009, Australia
- David Edwards
  - School of Biological Sciences, University of Western Australia, Perth, WA, 6009, Australia
  - Centre for Applied Bioinformatics, University of Western Australia, Perth, WA, 6009, Australia
- Patrick M Finnegan
  - School of Biological Sciences, University of Western Australia, Perth, WA, 6009, Australia
  - Centre for Applied Bioinformatics, University of Western Australia, Perth, WA, 6009, Australia
2
Proteome-Wide Detection and Annotation of Receptor Tyrosine Kinases (RTKs): RTK-PRED and the TyReK Database. Biomolecules 2023; 13:biom13020270. [PMID: 36830638 PMCID: PMC9953206 DOI: 10.3390/biom13020270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Revised: 01/16/2023] [Accepted: 01/28/2023] [Indexed: 02/04/2023] Open
Abstract
Receptor tyrosine kinases (RTKs) form a highly important group of protein receptors of the eukaryotic cell membrane. They control many vital cellular functions and are involved in the regulation of complex signaling networks. Mutations in RTKs have been associated with different types of cancers and other diseases. Although they are very important for proper cell function, they have been experimentally studied in a limited range of eukaryotic species. Currently, there is no available database for RTKs providing information about their function, expression, and interactions. Therefore, the identification of RTKs in multiple organisms, the documentation of their characteristics, and the collection of related information would be very useful. In this paper, we present a novel RTK detection pipeline (RTK-PRED) and the Receptor Tyrosine Kinases Database (TyReK-DB). RTK-PRED combines profile HMMs with transmembrane topology prediction to identify and classify potential RTKs. Proteins of all eukaryotic reference proteomes of the UniProt database were used as input in RTK-PRED leading to a filtered dataset of 20,478 RTKs. Based on the information collected for these RTKs from multiple databases, the relational TyReK database was created.
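The idea of combining a profile HMM hit with a transmembrane topology prediction can be sketched as follows. This is not the RTK-PRED implementation: the `ProteinFeatures` record, the example coordinates, and the decision rule (exactly one predicted transmembrane helix, with the kinase domain hit located after it) are assumptions made for illustration, based only on the general single-pass architecture of RTKs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # (start, end) residue positions, 1-based


@dataclass(frozen=True)
class ProteinFeatures:
    """Precomputed predictions for one protein; all fields are hypothetical."""
    protein_id: str
    kinase_hit: Optional[Span]  # span of a kinase-domain HMM hit, if any
    tm_helices: List[Span]      # predicted transmembrane helices


def looks_like_rtk(p: ProteinFeatures) -> bool:
    """Assumed rule: one TM helix, kinase domain C-terminal to it."""
    if p.kinase_hit is None or len(p.tm_helices) != 1:
        return False
    _, tm_end = p.tm_helices[0]
    return p.kinase_hit[0] > tm_end


# Made-up example proteins.
proteins = [
    ProteinFeatures("P1", (700, 960), [(620, 642)]),  # kinase after one TM helix
    ProteinFeatures("P2", (50, 300), []),             # kinase, no TM helix
    ProteinFeatures("P3", None, [(30, 52)]),          # TM helix, no kinase hit
]

candidates = [p.protein_id for p in proteins if looks_like_rtk(p)]
print(candidates)
```

Requiring both lines of evidence filters out soluble kinases and unrelated membrane proteins, which either signal alone would let through.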
3
Sinha S, Lynn AM, Desai DK. Implementation of homology based and non-homology based computational methods for the identification and annotation of orphan enzymes: using Mycobacterium tuberculosis H37Rv as a case study. BMC Bioinformatics 2020; 21:466. [PMID: 33076816 PMCID: PMC7574302 DOI: 10.1186/s12859-020-03794-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/23/2019] [Accepted: 10/01/2020] [Indexed: 02/06/2023] Open
Abstract
Background: Homology based methods are one of the most important and widely used approaches for functional annotation of high-throughput microbial genome data. A major limitation of these methods is the absence of well-characterized sequences for certain functions. The non-homology methods based on the context and the interactions of a protein are very useful for identifying missing metabolic activities and functional annotation in the absence of significant sequence similarity. In the current work, we employ both homology and context-based methods, incrementally, to identify local holes and chokepoints, whose presence in the Mycobacterium tuberculosis genome is indicated based on its interaction with known proteins in a metabolic network context, but have not been annotated. We have developed two computational procedures using network theory to identify orphan enzymes (‘Hole finding protocol’) coupled with the identification of candidate proteins for the predicted orphan enzyme (‘Hole filling protocol’). We propose an integrated interaction score based on scores from the STRING database to identify candidate protein sequences for the orphan enzymes from M. tuberculosis, as a case study, which are most likely to perform the missing function.
Results: The application of an automated homology-based enzyme identification protocol, ModEnzA, on M. tuberculosis genome yielded 56 novel enzyme predictions. We further predicted 74 putative local holes, 6 choke points, and 3 high confidence local holes in the genome using ‘Hole finding protocol’. The ‘Hole-filling protocol’ was validated on the E. coli genome using artificial in-silico enzyme knockouts where our method showed 25% increased accuracy, compared to other methods, in assigning the correct sequence for the knocked-out enzyme amongst the top 10 ranks. The method was further validated on 8 additional genomes.
Conclusions: We have developed methods that can be generalized to augment homology-based annotation to identify missing enzyme coding genes and to predict a candidate protein for them. For pathogens such as M. tuberculosis, this work holds significance in terms of increasing the protein repertoire and thereby, the potential for identifying novel drug targets.
Affiliation(s)
- Swati Sinha
  - Bioinformatics Institute, Agency for Science, Technology, and Research (A*Star), 30 Biopolis Street, #07-01 Matrix, Singapore, 138671, Republic of Singapore
- Andrew M Lynn
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
- Dhwani K Desai
  - Department of Biology and Department of Pharmacology, Dalhousie University, Halifax, NS, B3H4R2, Canada
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
4
Silveira MC, Azevedo da Silva R, Faria da Mota F, Catanho M, Jardim R, R Guimarães AC, de Miranda AB. Systematic Identification and Classification of β-Lactamases Based on Sequence Similarity Criteria: β-Lactamase Annotation. Evol Bioinform Online 2018; 14:1176934318797351. [PMID: 30210232 PMCID: PMC6131288 DOI: 10.1177/1176934318797351] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 03/15/2018] [Accepted: 08/08/2018] [Indexed: 12/11/2022] Open
Abstract
β-lactamases, the enzymes responsible for resistance to β-lactam antibiotics, are widespread among prokaryotic genera. However, current β-lactamase classification schemes do not represent their present diversity. Here, we propose a workflow to identify and classify β-lactamases. Initially, a set of curated sequences was used as a model for the construction of profile hidden Markov models (HMMs) specific for each β-lactamase class. An extensive, nonredundant set of β-lactamase sequences was constructed from 7 different resistance protein databases to test the methodology. The profile HMMs were improved for their specificity and sensitivity and then applied to fully assembled genomes. Five hierarchical classification levels are described, and a new class of β-lactamases with fused domains is proposed. Our profile HMMs provide a better annotation of β-lactamases, with classes and subclasses defined by objective criteria such as sequence similarity. This classification offers a solid basis for studies on the diversity, dispersion, prevalence, and evolution of the different classes and subclasses of this critical enzymatic activity.
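Classification with one profile HMM per class can be sketched as picking the best-scoring profile for each sequence. The sketch below assumes bit scores have already been computed by an external HMM search; the class labels, the scores, and the `min_score` cutoff are hypothetical and do not come from the article.

```python
from typing import Dict, Optional


def classify(scores: Dict[str, float], min_score: float = 50.0) -> Optional[str]:
    """Return the class whose profile scores highest, or None.

    `scores` maps a class label to the bit score of the sequence against
    that class's profile HMM. `min_score` is an illustrative cutoff below
    which the sequence is left unclassified.
    """
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_score else None


# Made-up bit scores for three sequences.
print(classify({"class_A": 210.5, "class_C": 35.0}))
print(classify({"class_B": 12.0}))
print(classify({}))
```

Leaving low-scoring sequences unclassified, rather than forcing them into the nearest class, is one simple way to trade sensitivity for specificity.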
Affiliation(s)
- Melise Chaves Silveira
  - Laboratório de Biologia Computacional e Sistemas, Instituto Oswaldo Cruz, Fiocruz, Rio de Janeiro, Brazil
- Rangeline Azevedo da Silva
  - Laboratório de Biologia Computacional e Sistemas, Instituto Oswaldo Cruz, Fiocruz, Rio de Janeiro, Brazil
- Fábio Faria da Mota
  - Laboratório de Biologia Computacional e Sistemas, Instituto Oswaldo Cruz, Fiocruz, Rio de Janeiro, Brazil
- Marcos Catanho
  - Laboratório de Genômica Funcional e Bioinformática, Instituto Oswaldo Cruz, Fiocruz, Rio de Janeiro, Brazil
- Rodrigo Jardim
  - Laboratório de Biologia Computacional e Sistemas, Instituto Oswaldo Cruz, Fiocruz, Rio de Janeiro, Brazil
- Ana Carolina R Guimarães
  - Laboratório de Genômica Funcional e Bioinformática, Instituto Oswaldo Cruz, Fiocruz, Rio de Janeiro, Brazil
- Antonio B de Miranda
  - Laboratório de Biologia Computacional e Sistemas, Instituto Oswaldo Cruz, Fiocruz, Rio de Janeiro, Brazil
5
Chen H, Xia Y, Zhu S, Yang J, Yao J, Di J, Liang Y, Gao R, Wu W, Yang Y, Shi C, Hu D, Qin H, Wang Z. Lactobacillus plantarum LP‑Onlly alters the gut flora and attenuates colitis by inducing microbiome alteration in interleukin‑10 knockout mice. Mol Med Rep 2017; 16:5979-5985. [PMID: 28849048 PMCID: PMC5865777 DOI: 10.3892/mmr.2017.7351] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 10/25/2016] [Accepted: 06/15/2017] [Indexed: 12/16/2022] Open
Abstract
The association between inflammatory bowel disease (IBD) and gut microbes has been widely investigated. Our previous study demonstrated that Lactobacillus plantarum LP‑Onlly (LP) applied as a probiotic altered the gut flora and attenuated colitis in interleukin (IL)‑10 knockout (IL‑10‑/‑) mice. In the present study, metagenome sequencing was performed to investigate the gut microbiome in IL‑10‑/‑ mice and the influence of oral administration of LP on microbial composition. Metagenomics sequencing was performed to investigate the influence of IBD on the gut microbiome with and without LP treatment. Alterations in the abundances of various taxonomic and functional groups were investigated across these gut microbiomes. The present study demonstrates that Akkermansia muciniphila was significantly enriched in IL‑10‑/‑ mice, and Bacteroides were significantly increased following LP administration. In addition, the phyla Bacteroidetes and Firmicutes were significantly influenced by LP administration. Further characterization of functional capacity revealed that in the gut metagenomes of IL‑10‑/‑ mice, genes encoding cell cycle control, replication, recombination, repair and cell envelope biogenesis were decreased, but intracellular trafficking, secretion, and vesicular transport were increased. The present findings indicate that the gut metagenome is associated with IBD, and oral administration of LP contributes to prevention of gut inflammation, providing insight into the treatment of IBD.
Affiliation(s)
- Hongqi Chen
  - Department of General Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai 200233, P.R. China
- Yang Xia
  - Department of General Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai 200233, P.R. China
- Sibo Zhu
  - Department of Molecular and Cellular Biology, Cinoasia Institute, Shanghai 200438, P.R. China
- Jun Yang
  - Department of General Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai 200233, P.R. China
- Jing Yao
  - Department of General Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai 200233, P.R. China
- Jianzhong Di
  - Department of General Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai 200233, P.R. China
- Yong Liang
  - Department of General Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai 200233, P.R. China
- Renyuan Gao
  - Department of General Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai 200233, P.R. China
- Wen Wu
  - Department of General Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai 200233, P.R. China
- Yongzhi Yang
  - Department of General Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai 200233, P.R. China
- Chenzhang Shi
  - Department of General Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai 200233, P.R. China
- Desheng Hu
  - Department of General Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai 200233, P.R. China
- Huanlong Qin
  - Department of General Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai 200233, P.R. China
- Zhigang Wang
  - Department of General Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai 200233, P.R. China
6
Nelson DR, Khraiwesh B, Fu W, Alseekh S, Jaiswal A, Chaiboonchoe A, Hazzouri KM, O'Connor MJ, Butterfoss GL, Drou N, Rowe JD, Harb J, Fernie AR, Gunsalus KC, Salehi-Ashtiani K. The genome and phenome of the green alga Chloroidium sp. UTEX 3007 reveal adaptive traits for desert acclimatization. eLife 2017. [PMID: 28623667 PMCID: PMC5509433 DOI: 10.7554/elife.25783] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 01/07/2023] Open
Abstract
To investigate the phenomic and genomic traits that allow green algae to survive in deserts, we characterized a ubiquitous species, Chloroidium sp. UTEX 3007, which we isolated from multiple locations in the United Arab Emirates (UAE). Metabolomic analyses of Chloroidium sp. UTEX 3007 indicated that the alga accumulates a broad range of carbon sources, including several desiccation tolerance-promoting sugars and unusually large stores of palmitate. Growth assays revealed capacities to grow in salinities from zero to 60 g/L and to grow heterotrophically on >40 distinct carbon sources. Assembly and annotation of genomic reads yielded a 52.5 Mbp genome with 8153 functionally annotated genes. Comparison with other sequenced green algae revealed unique protein families involved in osmotic stress tolerance and saccharide metabolism that support phenomic studies. Our results reveal the robust and flexible biology utilized by a green alga to successfully inhabit a desert coastline. DOI: http://dx.doi.org/10.7554/eLife.25783.001
Single-celled green algae, also known as green microalgae, play an important role for the world’s ecosystems, in part, because they can harness energy from sunlight to produce carbon-rich compounds. Microalgae are also important for biotechnology and people have harnessed them to make food, fuel and medicines. Green microalgae live in many types of habitats from streams to oceans, and they can also be found on the land, including in deserts. Like plants that live in the desert, these microalgae have likely evolved specific traits that allow them to live in these hot and dry regions. Yet, fewer scientists have studied microalgae compared to land plants, and until now it was not well understood how microalgae could survive in the desert. Nelson et al. analyzed green microalgae from different locations around the United Arab Emirates and found that one microalga, known as Chloroidium, is one of the most dominant algae in this area. This included samples from beaches, mangroves, desert oases, buildings and public fresh water sources. Chloroidium has a unique set of genes and proteins and grew particularly well in freshwater and saltwater. Rather than just harnessing sunlight, the microalgae were able to consume over 40 different varieties of carbon sources to produce energy. The microalgae also accumulated oily molecules with a similar composition to palm oil, which may help this species to survive in desert regions. A next step will be to develop biotechnological assets based on the information obtained. In the future, microalgae could be used to make an oil that represents an alternative to palm oil; this would reduce the demand for palm tree plantations, which pose a major threat to the natural environment. Moreover, understanding how microalgae can colonize a desert region will help us to understand the effects of climate change in the region. DOI: http://dx.doi.org/10.7554/eLife.25783.002
Affiliation(s)
- David R Nelson
  - Laboratory of Algal, Synthetic, and Systems Biology, Division of Science and Math, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
  - Center for Genomics and Systems Biology, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
- Basel Khraiwesh
  - Laboratory of Algal, Synthetic, and Systems Biology, Division of Science and Math, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
  - Center for Genomics and Systems Biology, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
- Weiqi Fu
  - Laboratory of Algal, Synthetic, and Systems Biology, Division of Science and Math, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
- Saleh Alseekh
  - Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany
- Ashish Jaiswal
  - Laboratory of Algal, Synthetic, and Systems Biology, Division of Science and Math, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
- Amphun Chaiboonchoe
  - Laboratory of Algal, Synthetic, and Systems Biology, Division of Science and Math, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
- Khaled M Hazzouri
  - Center for Genomics and Systems Biology, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
- Matthew J O'Connor
  - Core Technology Platform, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
- Glenn L Butterfoss
  - Center for Genomics and Systems Biology, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
- Nizar Drou
  - Center for Genomics and Systems Biology, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
- Jillian D Rowe
  - Center for Genomics and Systems Biology, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
- Jamil Harb
  - Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany
  - Department of Biology and Biochemistry, Birzeit University, Birzeit, Palestine
- Alisdair R Fernie
  - Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany
- Kristin C Gunsalus
  - Center for Genomics and Systems Biology, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
  - Center for Genomics and Systems Biology and Department of Biology, New York University, New York, United States
- Kourosh Salehi-Ashtiani
  - Laboratory of Algal, Synthetic, and Systems Biology, Division of Science and Math, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
  - Center for Genomics and Systems Biology, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
7
Kamran M, Sinha S, Dubey P, Lynn AM, Dhar SK. Identification of putative Z-ring-associated proteins, involved in cell division in human pathogenic bacteria Helicobacter pylori. FEBS Lett 2016; 590:2158-71. [PMID: 27253179 DOI: 10.1002/1873-3468.12230] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 02/05/2016] [Revised: 05/19/2016] [Accepted: 05/23/2016] [Indexed: 11/07/2022]
Abstract
Cell division in bacteria is initiated by FtsZ, which forms a Z ring at the middle of the cell, between the nucleoids. The Z ring is stabilized by Z ring-associated proteins (Zaps), which crosslink the FtsZ filaments and provide strength. The deletion of Zaps leads to an elongation phenotype with an abnormal Z ring. The components of cell division in Helicobacter pylori are similar to those of other Gram-negative bacteria except for the absence of a few components, including Zaps. Here, we used HHsearch to identify homologs of the missing cell division proteins and obtained potential hits for ZapA and ZapB, as well as for a few other cell division proteins. We further validated the function of the putative ZapA homolog by genetic complementation, immuno-colocalization and biochemical analysis.
Affiliation(s)
- Mohammad Kamran
  - Special Centre for Molecular Medicine, Jawaharlal Nehru University, New Delhi, India
- Swati Sinha
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
- Priyanka Dubey
  - Special Centre for Molecular Medicine, Jawaharlal Nehru University, New Delhi, India
- Andrew M Lynn
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
- Suman K Dhar
  - Special Centre for Molecular Medicine, Jawaharlal Nehru University, New Delhi, India
8
Oh Brother, Where Art Thou? Finding Orthologs in the Twilight and Midnight Zones of Sequence Similarity. Evol Biol 2016. [DOI: 10.1007/978-3-319-41324-2_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]