1
Faiz M, Khan SJ, Azim F, Ejaz N. Disclosing the locale of transmembrane proteins within cellular alcove by machine learning approach: systematic review and meta analysis. J Biomol Struct Dyn 2023; 42:11133-11148. [PMID: 37768108 DOI: 10.1080/07391102.2023.2260490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Accepted: 09/13/2023] [Indexed: 09/29/2023]
Abstract
Protein subcellular localization is a promising research question in Proteomics and associated fields, including Biological Sciences, Biomedical Engineering, Computational Biology, Bioinformatics, Proteomics, Artificial Intelligence, and Biophysics. However, computational techniques are preferred to explore this attribute for a massive number of proteins. The byproduct of this conjunction yields diversified location identifiers of proteins. These protein subcellular localization identifiers are unique regarding the database used, organisms, Machine Learning Technique, and accuracy. Despite the availability of these identifiers, the majority of the work has been done on the subcellular localization of proteins, and less work has been done specifically on locations of transmembrane proteins. This systematic review accounts for computational techniques implemented on transmembrane protein localization. Moreover, a literature search on PubMed, Science Direct, and IEEE Databases disclosed no systematic review or meta-analysis on the cell's transmembrane protein locale. A systematic review was formed under the guidelines of PRISMA by using Science Direct, PubMed, and IEEE Databases. Journal publications from 2000 to 2023 were taken into consideration and screened. This review has focused only on computational studies rather than experimental techniques. 1004 studies were reviewed and were categorized as relevant and non-relevant according to inclusion and exclusion criteria. All the screening was done through Endnote after importing citations. This systematic review characterizes the gap in targeting the locale of the transmembrane protein and will aid researchers in exploring its new horizons. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Mehwish Faiz
- Department of Biomedical Engineering, Ziauddin University (FESTM), Karachi, Pakistan
- Department of Electrical Engineering, Ziauddin University (FESTM), Karachi, Pakistan
- Saad Jawaid Khan
- Department of Biomedical Engineering, Ziauddin University (FESTM), Karachi, Pakistan
- Fahad Azim
- Department of Electrical Engineering, Ziauddin University (FESTM), Karachi, Pakistan
- Nazia Ejaz
- Balochistan University of Engineering and Technology, Khuzdar, Pakistan
2
Yao J, Qian X, Bao J, Wei Q, Lu Y, Zheng H, Cao X, Xing G. Probing the Effect of Two Heterozygous Mutations in Codon 723 of SLC26A4 on Deafness Phenotype Based on Molecular Dynamics Simulations. Sci Rep 2015; 5:10831. [PMID: 26035154 PMCID: PMC4451684 DOI: 10.1038/srep10831] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 02/11/2015] [Accepted: 04/27/2015] [Indexed: 11/09/2022] Open
Abstract
A Chinese family was identified with clinical features of enlarged vestibular aqueduct syndrome (EVAS). The mutational analysis showed that the proband (III-2) had EVAS with bilateral sensorineural hearing loss and carried a rare compound heterozygous mutation of SLC26A4 (IVS7-2A>G, c.2167C>G), which was inherited from the same mutant alleles of IVS7-2A>G heterozygous father and c.2167C>G heterozygous mother. Compared with another confirmed pathogenic biallelic mutation in SLC26A4 (IVS7-2A>G, c.2168A>G), these two biallelic mutations shared one common mutant allele and the same codon of the other mutant allele, but led to different changes of amino acid (p.H723D, p.H723R) and both resulted in the deafness phenotype. Structure-modeling indicated that these two mutant alleles changed the shape of pendrin protein encoded by SLC26A4 with increasing randomness in conformation, and might impair pendrin’s ability as an anion transporter. The molecular dynamics simulations also revealed that the stability of mutant pendrins was reduced with increased flexibility of backbone atoms, which was consistent with the structure-modeling results. This evidence indicated that codon 723 was a hot-spot region in SLC26A4 with a significant impact on the structure and function of pendrin, and acted as one of the genetic factors responsible for the development of hearing loss.
Affiliation(s)
- Jun Yao
- Department of Biotechnology, School of Basic Medical Science, Nanjing Medical University, Nanjing, P.R. China
- Xuli Qian
- Department of Biotechnology, School of Basic Medical Science, Nanjing Medical University, Nanjing, P.R. China
- Jingxiao Bao
- School of Life Science and Technology, China Pharmaceutical University, Nanjing, P.R. China
- Qinjun Wei
- Department of Biotechnology, School of Basic Medical Science, Nanjing Medical University, Nanjing, P.R. China
- Yajie Lu
- Department of Biotechnology, School of Basic Medical Science, Nanjing Medical University, Nanjing, P.R. China
- Heng Zheng
- School of Life Science and Technology, China Pharmaceutical University, Nanjing, P.R. China
- Xin Cao
- Department of Biotechnology, School of Basic Medical Science, Nanjing Medical University, Nanjing, P.R. China
- Guangqian Xing
- Department of Otolaryngology, the First Affiliated Hospital of Nanjing Medical University, Nanjing, P.R. China
3
4
Chuang CL, Chen CM, Wong WS, Tsai KN, Chan EC, Jiang JA. A robust correlation estimator and nonlinear recurrent model to infer genetic interactions in Saccharomyces cerevisiae and pathways of pulmonary disease in Homo sapiens. Biosystems 2009; 98:160-75. [PMID: 19527770 DOI: 10.1016/j.biosystems.2009.05.013] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/30/2008] [Revised: 05/08/2009] [Accepted: 05/28/2009] [Indexed: 12/13/2022]
Abstract
In order to identify genes involved in complex diseases, it is crucial to study the genetic interactions at the systems biology level. By utilizing modern high throughput microarray technology, it has become feasible to obtain gene expression data and turn it into knowledge that explains the regulatory behavior of genes. In this study, an unsupervised nonlinear model was proposed to infer gene regulatory networks on a genome-wide scale. The proposed model consists of two components, a robust correlation estimator and a nonlinear recurrent model. The robust correlation estimator was used to initialize the parameters of the nonlinear recurrent curve-fitting model. Then the initialized model was used to fit the microarray data. The model was used to simulate the underlying nonlinear regulatory mechanisms in biological organisms. The proposed algorithm was applied to infer the regulatory mechanisms of the general network in Saccharomyces cerevisiae and the pulmonary disease pathways in Homo sapiens. The proposed algorithm requires no prior biological knowledge to predict linkages between genes. The prediction results were checked against true positive links obtained from the YEASTRACT database, the TRANSFAC database, and the KEGG database. By checking the results with known interactions, we showed that the proposed algorithm could determine some meaningful pathways, many of which are supported by the existing literature.
Affiliation(s)
- Cheng-Long Chuang
- Institute of Biomedical Engineering, National Taiwan University, Taiwan
5
Sheehan D, O'Donovan E. Towards structural biology in the genus Mytilus: on-line homology modeling with Geno3D. Marine Environmental Research 2006; 62 Suppl:S411-4. [PMID: 16712916 DOI: 10.1016/j.marenvres.2006.04.054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/09/2023]
Abstract
Seven structures from the genus Mytilus exist in the protein data bank (PDB) but > 1000 Mytilus protein sequences are known. Sequences (NCBI) were used as targets for on-line homology modeling with Geno3D as an alternative route to structure. Quality comparators include root mean square deviation (RMSD) of target from template, chain geometry and Ramachandran diagram analysis. We modeled 17 Mytilus structures for mainly stress-response proteins relevant to biomonitoring. These model files are available as a first step towards a structural resource for Mytilus. Analysis suggests that they are structurally plausible with low RMSDs relative to targets. Ramachandran analysis suggests a small percentage of disallowed dihedral angles (in the range 0-6.86% for 15 of 17 models).
Affiliation(s)
- D Sheehan
- Proteomics Research Group, Department of Biochemistry, University College Cork, Ireland.
6
Homology modeling of milk enzymes using on-line resources: Insights to structure-function and evolutionary relationships. Int Dairy J 2006. [DOI: 10.1016/j.idairyj.2005.09.016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/20/2022]
7
van Gammeren AJ, Hulsbergen FB, Hollander JG, de Groot HJM. Residual backbone and side-chain 13C and 15N resonance assignments of the intrinsic transmembrane light-harvesting 2 protein complex by solid-state Magic Angle Spinning NMR spectroscopy. Journal of Biomolecular NMR 2005; 31:279-93. [PMID: 15928995 DOI: 10.1007/s10858-005-1604-8] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Received: 10/15/2004] [Accepted: 01/27/2005] [Indexed: 05/02/2023]
Abstract
This study reports the sequence-specific chemical shift assignments for 76 residues of the 94-residue monomeric unit of the photosynthetic light-harvesting 2 transmembrane protein complex from Rhodopseudomonas acidophila strain 10050, using Magic Angle Spinning (MAS) NMR in combination with extensive and selective biosynthetic isotope labeling methods. The sequence-specific chemical shift assignment is an essential step for structure determination by MAS NMR. Assignments have been performed on the basis of 2-dimensional proton-driven spin diffusion 13C-13C correlation experiments with mixing times of 20 and 500 ms and band-selective 13C-15N correlation spectroscopy on a series of site-specific biosynthetically labeled samples. The decreased line width and the reduced number of correlation signals of the selectively labeled samples with respect to the uniformly labeled samples make it possible to resolve the narrowly distributed correlation signals of the backbone carbons and nitrogens involved in the long alpha-helical transmembrane segments. Inter-space correlations between nearby residues and between residues and the labeled BChl a cofactors, provided by the 13C-13C correlation experiments using a 500 ms spin diffusion period, are used to arrive at sequence-specific chemical shift assignments for many residues in the protein complex. In this way it is demonstrated that MAS NMR methods combined with site-specific biosynthetic isotope labeling can be used for sequence-specific assignment of the NMR response of transmembrane proteins.
Affiliation(s)
- A J van Gammeren
- Leiden Institute of Chemistry, Gorlaeus Laboratoria, Leiden University, The Netherlands
8
Bigelow HR, Petrey DS, Liu J, Przybylski D, Rost B. Predicting transmembrane beta-barrels in proteomes. Nucleic Acids Res 2004; 32:2566-77. [PMID: 15141026 PMCID: PMC419468 DOI: 10.1093/nar/gkh580] [Citation(s) in RCA: 126] [Impact Index Per Article: 6.3] [Indexed: 11/14/2022] Open
Abstract
Very few methods address the problem of predicting beta-barrel membrane proteins directly from sequence. One reason is that only very few high-resolution structures for transmembrane beta-barrel (TMB) proteins have been determined thus far. Here we introduced the design, statistics and results of a novel profile-based hidden Markov model for the prediction and discrimination of TMBs. The method carefully attempts to avoid over-fitting the sparse experimental data. While our model training and scoring procedures were very similar to a recently published work, the architecture and structure-based labelling were significantly different. In particular, we introduced a new definition of beta-hairpin motifs, explicit state modelling of transmembrane strands, and a log-odds whole-protein discrimination score. The resulting method reached an overall four-state (up-, down-strand, periplasmic-, outer-loop) accuracy as high as 86%. Furthermore, the method accurately discriminated TMB from non-TMB proteins (45% coverage at 100% accuracy). This high precision enabled the application to 72 entirely sequenced Gram-negative bacteria. We found over 164 previously uncharacterized TMB proteins at high confidence. Database searches did not implicate any of these proteins with membranes. We challenge that the vast majority of our 164 predictions will eventually be verified experimentally. All proteome predictions and the PROFtmb prediction method are available at http://www.rostlab.org/services/PROFtmb/.
Affiliation(s)
- Henry R Bigelow
- CUBIC, Department of Biochemistry and Molecular Biophysics, Columbia University, 650 West 168th Street BB217, New York, NY 10032, USA.
9
Gun'ko VM, Klyueva AV, Levchuk YN, Leboda R. Photon correlation spectroscopy investigations of proteins. Adv Colloid Interface Sci 2003; 105:201-328. [PMID: 12969646 DOI: 10.1016/s0001-8686(03)00091-5] [Citation(s) in RCA: 72] [Impact Index Per Article: 3.4] [Indexed: 11/30/2022]
Abstract
Physical principles of photon correlation spectroscopy (PCS), mathematical treatment of the PCS data (converting autocorrelation functions to distribution functions or average characteristics), and PCS applications to study proteins and other biomacromolecules in aqueous media are described and analysed. The PCS investigations of conformational changes in protein molecules, their aggregation itself or in consequence of interaction with other molecules or organic (polymers) and inorganic (e.g. fumed silica) fine particles as well as the influence of low molecular compounds (surfactants, drugs, salts, metal ions, etc.) reveal unique capability of the PCS techniques for elucidation of important native functions of proteins and other biomacromolecules (DNA, RNA, etc.) or microorganisms (Escherichia coli, Pseudomonas putida, Dunaliella viridis, etc.). Special attention is paid to the interaction of proteins with fumed oxides and the impact of polymers and fine oxide particles on the motion of living flagellar microorganisms analysed by means of PCS.
Affiliation(s)
- Vladimir M Gun'ko
- Institute of Surface Chemistry, 17 General Naumov Street, Kiev 03164, Ukraine.
10
Abstract
PURPOSE OF REVIEW: Structural biology is one of the most informative disciplines available to biochemical research. It unites technologies that can reveal high-resolution (often atomic) details of the inner workings of biochemical systems. These insights are crucial for an understanding of basic life processes, such as the reaction mechanism of a drug-converting enzyme, signal transduction from one protein to another, activation of a metabolic pathway by a gene effector, action modes of drugs, or the consequences of a mutation on the function of an enzyme. Structural biology is also vital for characterizing the molecular basis of many diseases and often provides a starting platform for the development of specifically tuned therapies, for example by designing drugs that bind to particular targets in order to affect their functionality. An overview of the main techniques as well as some selected case studies are presented. RECENT FINDINGS: Recent advances in the three major structure determination techniques (X-ray crystallography, nuclear magnetic resonance spectroscopy, and electron microscopy) have made it possible to obtain three-dimensional structural information on larger systems, at higher resolution and with much less effort than in the past. Because these techniques often provide complementary information, combining them is particularly powerful for gaining comprehensive insights into complex systems, as illustrated by the recent structure determinations of the low-density lipoprotein receptor and the ribosome. SUMMARY: Structural biology is no longer exclusive to a few dedicated specialists. Many of its techniques are now easily accessible and can be used as tools in general life science research.
Affiliation(s)
- Mischa Machius
- Department of Biochemistry, University of Texas Southwestern Medical Center at Dallas, Dallas, Texas, USA.
11
Dosztányi Z, Magyar C, Tusnády GE, Cserzo M, Fiser A, Simon I. Servers for sequence-structure relationship analysis and prediction. Nucleic Acids Res 2003; 31:3359-63. [PMID: 12824327 PMCID: PMC168995 DOI: 10.1093/nar/gkg589] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 11/14/2022] Open
Abstract
We describe several algorithms and public servers that were developed to analyze and predict various features of protein structures. These servers provide information about the covalent state of cysteine (CYSREDOX), as well as about residues involved in non-covalent cross links that play an important role in the structural stability of proteins (SCIDE and SCPRED). We also discuss methods and servers developed to identify helical transmembrane proteins from large databases and rough genomic data, including two of the most popular transmembrane prediction methods, DAS and HMMTOP. Several biologically interesting applications of these servers are also presented. The servers are available through http://www.enzim.hu/servers.html.
Affiliation(s)
- Zsuzsanna Dosztányi
- Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, H-1518 Budapest, PO Box 7, Hungary
12
Ledesma A, de Lacoba MG, Arechaga I, Rial E. Modeling the transmembrane arrangement of the uncoupling protein UCP1 and topological considerations of the nucleotide-binding site. J Bioenerg Biomembr 2002; 34:473-86. [PMID: 12678439 DOI: 10.1023/a:1022522310279] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022]
Abstract
The uncoupling protein from brown adipose tissue (UCP1) is a mitochondrial proton transporter whose activity is inhibited by purine nucleotides. UCP1, like the other members of the mitochondrial transporter superfamily, is a homodimer and each subunit contains six transmembrane segments. In an attempt to understand the structural elements that are important for nucleotide binding, a model for the transmembrane arrangement of UCP1 has been built by computational methods. Biochemical and sequence analysis considerations are taken as constraints. The main features of the model include the following: (i) the six transmembrane alpha-helices (TMHs) associate to form an antiparallel helix bundle; (ii) TMHs have an amphiphilic nature and thus the hydrophobic and variable residues face the lipid bilayer; (iii) matrix loops do not penetrate into the core of the bundle; and (iv) the polar core constitutes the translocation pathway. Photoaffinity labeling and mutagenesis studies have identified several UCP1 regions that interact with the nucleotide. We present a model where the nucleotide binds deep inside the bundle core. The purine ring interacts with the matrix loops while the polyphosphate chain is stabilized through interactions with essential Arg residues in the TMHs, whose side chains face the core of the helix bundle.
Affiliation(s)
- Amalia Ledesma
- Centro de Investigaciones Biológicas, CSIC, Velázquez 144, 28006 Madrid, Spain
13
Cserzö M, Eisenhaber F, Eisenhaber B, Simon I. On filtering false positive transmembrane protein predictions. Protein Eng Des Sel 2002; 15:745-52. [PMID: 12456873 DOI: 10.1093/protein/15.9.745] [Citation(s) in RCA: 119] [Impact Index Per Article: 5.4] [Indexed: 11/13/2022] Open
Abstract
While helical transmembrane (TM) region prediction tools achieve high (>90%) success rates for real integral membrane proteins, they produce a considerable number of false positive hits in sequences of known non-transmembrane queries. We propose a modification of the dense alignment surface (DAS) method that achieves a substantial decrease in the false positive error rate. Essentially, a sequence that includes possible transmembrane regions is compared in a second step with TM segments in a sequence library of documented transmembrane proteins. If the performance of the query sequence against the library of documented TM segment-containing sequences in this test is lower than an empirical threshold, it is classified as a non-transmembrane protein. The probability of false positive prediction for trusted TM region hits is expressed in terms of E-values. The modified DAS method, the DAS-TMfilter algorithm, has an unchanged high sensitivity for TM segments (approximately 95% detected in a learning set of 128 documented transmembrane proteins). At the same time, the selectivity measured over a non-redundant set of 526 soluble proteins with known 3D structure is approximately 99%, mainly because a large number of falsely predicted single membrane-pass proteins are eliminated by the DAS-TMfilter algorithm.
Affiliation(s)
- Miklos Cserzö
- University of Birmingham, School of Biosciences, Edgbaston, Birmingham B15 2TT, UK.