1
Rout RK, Umer S, Khandelwal M, Pati S, Mallik S, Balabantaray BK, Qin H. Identification of discriminant features from stationary pattern of nucleotide bases and their application to essential gene classification. Front Genet 2023; 14:1154120. [PMID: 37152988 PMCID: PMC10156977 DOI: 10.3389/fgene.2023.1154120] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/30/2023] [Accepted: 04/04/2023] [Indexed: 05/09/2023]
Abstract
Introduction: Essential genes are essential for the survival of various species. These genes are a family linked to critical cellular activities for species survival. These genes are coded for proteins that regulate central metabolism, gene translation, deoxyribonucleic acid replication, and fundamental cellular structure and facilitate intracellular and extracellular transport. Essential genes preserve crucial genomics information that may hold the key to a detailed knowledge of life and evolution. Essential gene studies have long been regarded as a vital topic in computational biology due to their relevance. An essential gene is composed of adenine, guanine, cytosine, and thymine and its various combinations.
Methods: This paper presents a novel method of extracting information on the stationary patterns of nucleotides such as adenine, guanine, cytosine, and thymine in each gene. For this purpose, some co-occurrence matrices are derived that provide the statistical distribution of stationary patterns of nucleotides in the genes, which is helpful in establishing the relationship between the nucleotides. For extracting discriminant features from each co-occurrence matrix, energy, entropy, homogeneity, contrast, and dissimilarity features are computed, which are extracted from all co-occurrence matrices and then concatenated to form a feature vector representing each essential gene. Finally, supervised machine learning algorithms are applied for essential gene classification based on the extracted fixed-dimensional feature vectors.
Results: For comparison, some existing state-of-the-art feature representation techniques such as Shannon entropy (SE), Hurst exponent (HE), fractal dimension (FD), and their combinations have been utilized.
Discussion: An extensive experiment has been performed for classifying the essential genes of five species that show the robustness and effectiveness of the proposed methodology.
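The abstract names the ingredients (nucleotide co-occurrence matrices, then energy, entropy, homogeneity, contrast, and dissimilarity per matrix, concatenated into one vector) but not their exact definitions. The Python sketch below is one possible reading, not the authors' implementation. It assumes a 4x4 matrix of base pairs at a fixed offset, the usual texture-feature formulas from image analysis, and an A, C, G, T index order. The function names, the offsets (1, 2, 3), and the base ordering are all hypothetical choices.

```python
import math

# Assumed index order; contrast, dissimilarity and homogeneity depend on it.
BASES = "ACGT"


def cooccurrence(seq, offset=1):
    """Return a normalized 4x4 matrix of base pairs (seq[i], seq[i + offset])."""
    idx = {b: i for i, b in enumerate(BASES)}
    counts = [[0] * 4 for _ in range(4)]
    total = 0
    for a, b in zip(seq, seq[offset:]):
        if a in idx and b in idx:  # skip ambiguous symbols such as N
            counts[idx[a]][idx[b]] += 1
            total += 1
    if total == 0:
        return [[0.0] * 4 for _ in range(4)]
    return [[c / total for c in row] for row in counts]


def texture_features(p):
    """Energy, entropy, homogeneity, contrast, dissimilarity of matrix p."""
    cells = [(i, j, p[i][j]) for i in range(4) for j in range(4)]
    energy = sum(v * v for _, _, v in cells)
    entropy = -sum(v * math.log2(v) for _, _, v in cells if v > 0)
    homogeneity = sum(v / (1 + abs(i - j)) for i, j, v in cells)
    contrast = sum(v * (i - j) ** 2 for i, j, v in cells)
    dissimilarity = sum(v * abs(i - j) for i, j, v in cells)
    return [energy, entropy, homogeneity, contrast, dissimilarity]


def gene_features(seq, offsets=(1, 2, 3)):
    """Concatenate the five features over several offsets (hypothetical choice)."""
    seq = seq.upper()
    feats = []
    for d in offsets:
        feats.extend(texture_features(cooccurrence(seq, d)))
    return feats
```

With three offsets this yields a fixed 15-dimensional vector per gene regardless of sequence length, which is the property a supervised classifier needs.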
Affiliation(s)
- Ranjeet Kumar Rout
  - National Institute of Technology Srinagar, Hazratbal, Jammu and Kashmir, India
- Saiyed Umer
  - Aliah University, Kolkata, West Bengal, India
- Monika Khandelwal
  - National Institute of Technology Srinagar, Hazratbal, Jammu and Kashmir, India
- Smitarani Pati
  - Dr. B R Ambedkar National Institute of Technology Jalandhar, Jalandhar, Punjab, India
- Saurav Mallik
  - Harvard T H Chan School of Public Health, Boston, United States
  - Department of Pharmacology and Toxicology, University of Arizona, Tucson, AZ, United States
- Hong Qin
  - Department of Computer Science and Engineering, University of Tennessee at Chattanooga, Chattanooga, TN, United States
*Correspondence: Saurav Mallik; Hong Qin
2
Oz N, Vayndorf EM, Tsuchiya M, McLean S, Turcios-Hernandez L, Pitt JN, Blue BW, Muir M, Kiflezghi MG, Tyshkovskiy A, Mendenhall A, Kaeberlein M, Kaya A. Evidence that conserved essential genes are enriched for pro-longevity factors. GeroScience 2022; 44:1995-2006. [PMID: 35695982 PMCID: PMC9616985 DOI: 10.1007/s11357-022-00604-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/21/2022] [Accepted: 06/03/2022] [Indexed: 02/02/2023]
Abstract
At the cellular level, many aspects of aging are conserved across species. This has been demonstrated by numerous studies in simple model organisms like Saccharomyces cerevisiae, Caenorhabditis elegans, and Drosophila melanogaster. Because most genetic screens examine loss-of-function mutations or decreased expression of genes through reverse genetics, essential genes have often been overlooked as potential modulators of the aging process. By taking the approach of increasing the expression level of a subset of conserved essential genes, we found that 21% of these genes resulted in increased replicative lifespan in S. cerevisiae. This is greater than the ~3.5% of genes found to affect lifespan upon deletion, suggesting that activation of essential genes may have a relatively disproportionate effect on increasing lifespan. The results of our experiments demonstrate that essential gene overexpression is a rich, relatively unexplored means of increasing eukaryotic lifespan.
Affiliation(s)
- Naci Oz
  - Department of Biology, Virginia Commonwealth University, Richmond, VA, 23284, USA
- Elena M Vayndorf
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, 98195, USA
- Mitsuhiro Tsuchiya
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, 98195, USA
- Samantha McLean
  - Department of Biology, Virginia Commonwealth University, Richmond, VA, 23284, USA
- Jason N Pitt
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, 98195, USA
- Benjamin W Blue
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, 98195, USA
- Michael Muir
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, 98195, USA
- Michael G Kiflezghi
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, 98195, USA
- Alexander Tyshkovskiy
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Alexander Mendenhall
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, 98195, USA
- Matt Kaeberlein
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, 98195, USA
- Alaattin Kaya
  - Department of Biology, Virginia Commonwealth University, Richmond, VA, 23284, USA
  - Massey Cancer Center, Virginia Commonwealth University, Richmond, VA, 23298, USA
  - Department of Human and Molecular Genetics, Virginia Commonwealth University, Richmond, VA, 23284, USA
3
Guo HB, Perminov A, Bekele S, Kedziora G, Farajollahi S, Varaljay V, Hinkle K, Molinero V, Meister K, Hung C, Dennis P, Kelley-Loughnane N, Berry R. AlphaFold2 models indicate that protein sequence determines both structure and dynamics. Sci Rep 2022; 12:10696. [PMID: 35739160 PMCID: PMC9226352 DOI: 10.1038/s41598-022-14382-9] [Citation(s) in RCA: 48] [Impact Index Per Article: 24.0] [Received: 02/28/2022] [Accepted: 06/06/2022] [Indexed: 12/29/2022]
Abstract
AlphaFold 2 (AF2) has placed Molecular Biology in a new era where we can visualize, analyze and interpret the structures and functions of all proteins solely from their primary sequences. We performed AF2 structure predictions for various protein systems, including globular proteins, a multi-domain protein, an intrinsically disordered protein (IDP), a randomized protein, two larger proteins (> 1000 AA), a heterodimer and a homodimer protein complex. Our results show that along with the three dimensional (3D) structures, AF2 also decodes protein sequences into residue flexibilities via both the predicted local distance difference test (pLDDT) scores of the models, and the predicted aligned error (PAE) maps. We show that PAE maps from AF2 are correlated with the distance variation (DV) matrices from molecular dynamics (MD) simulations, which reveals that the PAE maps can predict the dynamical nature of protein residues. Here, we introduce the AF2-scores, which are simply derived from pLDDT scores and are in the range of [0, 1]. We found that for most protein models, including large proteins and protein complexes, the AF2-scores are highly correlated with the root mean square fluctuations (RMSF) calculated from MD simulations. However, for an IDP and a randomized protein, the AF2-scores do not correlate with the RMSF from MD, especially for the IDP. Our results indicate that the protein structures predicted by AF2 also convey information of the residue flexibility, i.e., protein dynamics.
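The abstract says only that AF2-scores are derived from pLDDT scores and lie in [0, 1]; it does not give the formula. As a rough Python sketch, assuming a hypothetical normalization of 1 - pLDDT/100 (so that higher values suggest more flexibility), a per-residue comparison against RMSF could be set up as below. The function names are illustrative and are not taken from the paper.

```python
import math


def af2_like_score(plddt):
    """Map per-residue pLDDT (0 to 100) into [0, 1].

    Assumption: score = 1 - pLDDT / 100. The paper's actual
    definition is not stated in the abstract.
    """
    return [1.0 - min(max(v, 0.0), 100.0) / 100.0 for v in plddt]


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / (sx * sy)
```

In use, `pearson(af2_like_score(plddt), rmsf)` would be computed over residues of one model, with `plddt` read from the predicted structure and `rmsf` from an MD trajectory. Both inputs are left to the reader here, since the abstract reports no per-residue values.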
Affiliation(s)
- Hao-Bo Guo
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
  - UES Inc., Dayton, OH, USA
- Alexander Perminov
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
  - Computer Science Department, Miami University, Oxford, OH, USA
- Selemon Bekele
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
  - UES Inc., Dayton, OH, USA
- Gary Kedziora
  - General Dynamics Information Technology, Inc., Wright-Patterson Air Force Base, 45433, OH, USA
- Sanaz Farajollahi
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
  - UES Inc., Dayton, OH, USA
- Vanessa Varaljay
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
- Kevin Hinkle
  - Department of Chemical and Materials Engineering, Dayton University, Dayton, OH, USA
- Valeria Molinero
  - Department of Chemistry, The University of Utah, Salt Lake City, UT, USA
- Konrad Meister
  - Department of Natural Sciences, University of Alaska Southeast, Juneau, AK, USA
  - Max Planck Institute for Polymer Research, Mainz, Germany
- Chia Hung
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
- Patrick Dennis
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
- Nancy Kelley-Loughnane
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
- Rajiv Berry
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA