151
Sukumar S, Krishnan A, Banerjee S. An Overview of Bioinformatics Resources for SNP Analysis. Adv Bioinformatics 2021. [DOI: 10.1007/978-981-33-6191-1_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022] Open
152
Hu Y, Ma D, Ning S, Ye Q, Zhao X, Ding Q, Liang P, Cai G, Ma X, Qin X, Wei D. High-Quality Genome of the Medicinal Plant Strobilanthes cusia Provides Insights Into the Biosynthesis of Indole Alkaloids. Frontiers in Plant Science 2021; 12:742420. [PMID: 34659312 PMCID: PMC8515051 DOI: 10.3389/fpls.2021.742420] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/16/2021] [Accepted: 08/26/2021] [Indexed: 05/21/2023]
Abstract
Strobilanthes cusia (Nees) Kuntze is an important plant used to process the traditional Chinese herbal medicines "Qingdai" and "Nanbanlangen". The key active ingredients are indole alkaloids (IAs), which exert antibacterial, antiviral, and antitumor pharmacological activities and serve as natural dyes. We assembled the S. cusia genome at the chromosome level by combining PacBio circular consensus sequencing (CCS) and Hi-C sequencing data. Hi-C data revealed a draft genome size of 913.74 Mb, with 904.18 Mb of contigs anchored onto 16 pseudo-chromosomes. Contig N50 and scaffold N50 were 35.59 and 68.44 Mb, respectively. Of the 32,974 predicted protein-coding genes, 96.52% were functionally annotated in public databases. We predicted 675.66 Mb of repetitive sequences; 47.08% of sequences were long terminal repeat (LTR) retrotransposons. Moreover, 983 Strobilanthes-specific genes (SSGs) were identified for the first time, accounting for ~2.98% of all protein-coding genes. Further, 245 putative centromeric and 29 putative telomeric fragments were identified. Transcriptome analysis identified 2,975 differentially expressed genes (DEGs) enriched in phenylpropanoid, flavonoid, and triterpenoid biosynthesis. Systematic characterization of the key enzyme-coding genes associated with the IA pathway and of the basic helix-loop-helix (bHLH) transcription factor family yielded a network extending from the shikimate pathway to the indole alkaloid synthesis pathway in S. cusia. The high-quality S. cusia genome presented herein is an essential resource for traditional Chinese medicine genomics studies and for understanding the genetic underpinnings of IA biosynthesis.
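The contig and scaffold N50 values quoted in this abstract follow the usual assembly-statistics definition: the length L such that sequences of length at least L together cover at least half of the total assembly. The Python sketch below illustrates that definition only. The function name `n50` and the example lengths are hypothetical and are not taken from the paper or its pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # The first length at which the running sum reaches
        # half of the total assembly length is the N50.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (arbitrary units), not data from the study.
example = [2, 3, 4, 5, 6]
print(n50(example))  # prints 5
```

Applied to real data, the input would be the list of contig (or scaffold) lengths read from the assembly FASTA index.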
Affiliation(s)
- Yongle Hu
- College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China
- College of Ecology and Resource Engineering, Wuyi University, Wuyishan, China
- Fujian Provincial Key Laboratory of Eco-Industrial Green Technology, Wuyishan, China
- Dongna Ma
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, China
- Shuju Ning
- College of Agriculture, Fujian Agriculture and Forestry University, Fuzhou, China
- Qi Ye
- College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China
- Xuanxuan Zhao
- College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China
- Qiansu Ding
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, China
- Pingping Liang
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, China
- Guoqian Cai
- College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China
- Xiaomao Ma
- College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China
- Xia Qin
- College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China
- Daozhi Wei
- College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China
- *Correspondence: Daozhi Wei
153
Ge F, Hu J, Zhu YH, Arif M, Yu DJ. TargetMM: Accurate Missense Mutation Prediction by Utilizing Local and Global Sequence Information with Classifier Ensemble. Comb Chem High Throughput Screen 2021; 25:38-52. [DOI: 10.2174/1386207323666201204140438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2020] [Revised: 10/22/2020] [Accepted: 10/26/2020] [Indexed: 11/22/2022]
Abstract
Aim and Objective:
Missense mutations (MMs) may lead to various human diseases by disabling proteins. Accurate prediction of MMs is important and challenging for both protein function annotation and drug design. Although several computational methods have yielded acceptable success rates, there is still room to improve MM prediction performance.
Materials and Methods:
In the present study, we designed a new feature extraction method that considers the degree of impact of residues within the microenvironment of the mutation site. Stringent cross-validation and independent tests on benchmark datasets were performed to evaluate the efficacy of the proposed feature extraction method. Furthermore, three heterogeneous prediction models were trained and then ensembled for the final prediction. By combining the feature representation method and the classifier ensemble technique, we developed a novel MM predictor called TargetMM for distinguishing pathogenic mutations from neutral ones.
Results:
Comparison outcomes based on statistical evaluation demonstrate that TargetMM outperforms prior advanced methods on the independent test data. The source code and benchmark datasets of TargetMM are freely available at https://github.com/sera616/TargetMM.git for academic use.
Affiliation(s)
- Fang Ge
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
- Jun Hu
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Yi-Heng Zhu
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
- Muhammad Arif
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
- Dong-Jun Yu
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
154
Qi N, Chen Y, Zeng Y, Bao M, Long X, Guo Y, Tan A, Gao Y, Zhang H, Yang X, Hu Y, Mo Z, Jiang Y. rs2274911 polymorphism in GPRC6A associated with serum E2 and PSA in a Southern Chinese male population. Gene 2020; 763:145067. [PMID: 32827681 DOI: 10.1016/j.gene.2020.145067] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/27/2020] [Revised: 08/03/2020] [Accepted: 08/17/2020] [Indexed: 11/18/2022]
Abstract
BACKGROUND: rs2274911 (Pro91Ser, G > A) is a missense mutation located in the second exon of the GPRC6A gene. Increasing evidence has revealed a significant association between the A allele of rs2274911 and male diseases, such as oligospermia, cryptorchidism, and prostate tumors. However, the function of rs2274911 in healthy males is unclear. SUBJECTS AND METHODS: A total of 1742 healthy men were selected from the Fangchenggang Area Male Health and Examination Survey (FAMHES). The association between rs2274911 and phenotype was evaluated. The cell characteristics of the rs2274911 mutation (mu), wild-type GPRC6A (WT), and RFP control in human embryonic kidney (293T) and human prostate cancer (PC3) cells were analyzed. RNA sequencing was performed on PC3 cells. RESULTS: E2 and PSA serum levels increased with the accumulation of the A allele (E2: G vs. A, -0.029 [-0.050, -0.008], P < 0.01, P trend = 0.027; PSA: G vs. A, -0.040 [-0.079, 0.000], P < 0.05, P trend = 0.048). rs2274911 enhanced the proliferation and invasion ability of PC3 and 293T cells and activated the ERK pathway. Genes affected by the rs2274911 mutation were identified through RNA sequencing analysis of rs2274911 mu, GPRC6A WT, and RFP control PC3 cells. Most of these genes were related to cancer development processes, cAMP, and the ERK cell signaling pathway. CONCLUSION: This study shows that rs2274911 is associated with E2 and PSA serum levels in Southern Chinese men. The rs2274911 mutation promotes 293T and PC3 cell proliferation in vitro. These results suggest that rs2274911 is a functional variant of GPRC6A.
Affiliation(s)
- Nana Qi
- Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi 530021, China; Guangxi Key Laboratory of Genomic and Personalized Medicine, Nanning, Guangxi 530021, China; Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Nanning, Guangxi 530021, China
- Yang Chen
- Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi 530021, China; Department of Urology Surgery, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi 530021, China
- Yanyu Zeng
- Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi 530021, China; Guangxi Key Laboratory of Genomic and Personalized Medicine, Nanning, Guangxi 530021, China; Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Nanning, Guangxi 530021, China
- Mengying Bao
- Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi 530021, China; Guangxi Key Laboratory of Genomic and Personalized Medicine, Nanning, Guangxi 530021, China; Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Nanning, Guangxi 530021, China
- Xinyang Long
- Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi 530021, China; Guangxi Key Laboratory of Genomic and Personalized Medicine, Nanning, Guangxi 530021, China; Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Nanning, Guangxi 530021, China
- Yajie Guo
- Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi 530021, China; Guangxi Key Laboratory of Genomic and Personalized Medicine, Nanning, Guangxi 530021, China; Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Nanning, Guangxi 530021, China
- Aihua Tan
- Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi 530021, China; Department of Chemotherapy, The Affiliated Tumor Hospital of Guangxi Medical University, Nanning, Guangxi 530021, China
- Yong Gao
- Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi 530021, China; Department of Clinical Laboratory, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi 530021, China
- Haiying Zhang
- Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi 530021, China; Department of Occupational Health and Environmental Health, School of Public Health, Guangxi Medical University, Nanning, Guangxi 530021, China
- Xiaobo Yang
- Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi 530021, China; Department of Occupational Health and Environmental Health, School of Public Health, Guangxi Medical University, Nanning, Guangxi 530021, China
- Yanling Hu
- Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi 530021, China; Guangxi Key Laboratory of Genomic and Personalized Medicine, Nanning, Guangxi 530021, China; Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Nanning, Guangxi 530021, China
- Zengnan Mo
- Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi 530021, China; Guangxi Key Laboratory of Genomic and Personalized Medicine, Nanning, Guangxi 530021, China; Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Nanning, Guangxi 530021, China; Department of Urology Surgery, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi 530021, China.
- Yonghua Jiang
- Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi 530021, China; Guangxi Key Laboratory of Genomic and Personalized Medicine, Nanning, Guangxi 530021, China; Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Nanning, Guangxi 530021, China.
155
Depotter JRL, Zuo W, Hansen M, Zhang B, Xu M, Doehlemann G. Effectors with Different Gears: Divergence of Ustilago maydis Effector Genes Is Associated with Their Temporal Expression Pattern during Plant Infection. J Fungi (Basel) 2020; 7:16. [PMID: 33383948 PMCID: PMC7823823 DOI: 10.3390/jof7010016] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 12/08/2020] [Revised: 12/22/2020] [Accepted: 12/24/2020] [Indexed: 01/04/2023] Open
Abstract
Plant pathogens secrete a variety of effector proteins that enable host colonization but are also typical pathogen detection targets for the host immune system. Consequently, effector genes encounter high selection pressures, which typically make them fast evolving. The corn smut pathogen Ustilago maydis has an effector gene repertoire with dynamic expression across the different disease stages. We determined the amino acid divergence of U. maydis effector candidates from their orthologs in Sporisorium reilianum, a close relative of U. maydis. Intriguingly, there are two distinct groups of effector candidates: one with conserved and one with diverged protein evolution. Conservatively evolving effector genes tend to have their peak expression during the (pre-)penetration stages of the disease cycle. In contrast, expression of divergently evolving effector genes generally peaks during fungal proliferation within the host. To test whether this interspecific effector diversity corresponds to intraspecific diversity, we sampled and sequenced a diverse collection of U. maydis strains from the most important maize breeding and production regions in China. Effector candidates with a diverged interspecific evolution had more intraspecific amino acid variation than candidates with a conserved evolution. In conclusion, we highlight diversity in evolution within the U. maydis effector repertoire, which has both dynamically and conservatively evolving members.
Affiliation(s)
- Jasper R. L. Depotter
- Institute for Plant Sciences, University of Cologne, CEPLAS, D-50674 Cologne, Germany
- Weiliang Zuo
- Institute for Plant Sciences, University of Cologne, CEPLAS, D-50674 Cologne, Germany
- Maike Hansen
- Institute for Plant Sciences, University of Cologne, CEPLAS, D-50674 Cologne, Germany
- Boqi Zhang
- National Maize Improvement Centre of China, China Agricultural University, Beijing 100193, China
- Mingliang Xu
- National Maize Improvement Centre of China, China Agricultural University, Beijing 100193, China
- Gunther Doehlemann
- Institute for Plant Sciences, University of Cologne, CEPLAS, D-50674 Cologne, Germany
156
Ma D, Dong S, Zhang S, Wei X, Xie Q, Ding Q, Xia R, Zhang X. Chromosome-level reference genome assembly provides insights into aroma biosynthesis in passion fruit (Passiflora edulis). Mol Ecol Resour 2020; 21:955-968. [PMID: 33325619 DOI: 10.1111/1755-0998.13310] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 08/25/2020] [Revised: 12/03/2020] [Accepted: 12/10/2020] [Indexed: 12/30/2022]
Abstract
Passion fruit, native to tropical America, is an agriculturally, economically and ornamentally important fruit plant that is well known for its acid pulp, rich aroma and distinctive flavour. Here, we present a chromosome-level genome assembly of passion fruit produced by combining PacBio long HiFi reads and Hi-C technology. The assembled reference genome is 1.28 Gb in size, with a scaffold N50 of 126.4 Mb and 99.22% of sequences anchored onto nine pseudochromosomes. The genome is highly repetitive, with repeats accounting for 86.61% of the assembly. A total of 39,309 protein-coding genes were predicted, 93.48% of which were functionally annotated in public databases. Genome evolution analysis revealed a core eudicot-common γ whole-genome triplication event and a more recent whole-genome duplication event, possibly contributing to the expansion of certain gene families. The 33 rapidly expanded gene families were significantly enriched in the pathways of isoflavone biosynthesis, galactose metabolism, diterpene biosynthesis and fatty acid metabolism, which might be responsible for the formation of the characteristic flavours of passion fruit. Transcriptome analysis revealed that genes related to ester and ethylene biosynthesis were significantly upregulated in the mature fruit and that the expression levels of those genes were consistent with the accumulation of volatile lipid compounds. The passion fruit genome analysis improves our understanding of the genome evolution of this species and sheds new light on the molecular mechanism of aroma biosynthesis in passion fruit.
Affiliation(s)
- Dongna Ma
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China; Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, China
- Shanshan Dong
- Laboratory of Southern Subtropical Plant Diversity, Fairy Lake Botanical Garden, Shenzhen & Chinese Academy of Sciences, Shenzhen, China
- Shengcheng Zhang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Xiuqing Wei
- Fruit Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, China
- Qingjun Xie
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangzhou, China; Guangdong Provincial Key Laboratory of Plant Molecular Breeding, Guangzhou, China
- Qiansu Ding
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, China
- Rui Xia
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangzhou, China; Key Laboratory of Biology and Germplasm Enhancement of Horticultural Crops in South China, Ministry of Agriculture, South China Agricultural University, Guangzhou, China
- Xingtan Zhang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
157
Karimian M, Behjati M, Barati E, Ehteram T, Karimian A. CYP1A1 and GSTs common gene variations and presbycusis risk: a genetic association analysis and a bioinformatics approach. Environmental Science and Pollution Research International 2020; 27:42600-42610. [PMID: 32712936 DOI: 10.1007/s11356-020-10144-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 04/26/2020] [Accepted: 07/15/2020] [Indexed: 06/11/2023]
Abstract
Antioxidant enzymes such as glutathione S-transferases (GSTs) and cytochromes P450 (CYPs) are involved in the metabolism and detoxification of cytotoxic compounds, as well as the elimination of reactive oxygen species (ROS). Therefore, alterations in the structure of these enzymes could result in prolonged production of ROS with subsequent risk of development of disorders such as presbycusis. This study aimed to investigate the association of CYP1A1 (rs4646903, rs1048943) and GSTs (GSTM1-deletion, GSTT1-deletion, GSTP1-rs1695) variations with presbycusis risk in an Iranian population, followed by an in silico analysis. In a case-control study, 280 subjects, including 140 cases with presbycusis and 140 healthy controls, were enrolled. Genotypes of single-nucleotide polymorphisms (SNPs) were detected by the PCR-RFLP method, and the genotypes of the above-mentioned deletions were determined by touchdown PCR. Bioinformatics tools were employed to evaluate the impact of the SNPs on gene function. SNP analysis revealed significant associations between rs1048943 (AG vs. AA: OR = 2.46, 95%CI = 1.30-4.65, p = 0.006; GG + AG vs. AA: OR = 2.53, 95%CI = 1.36-4.69, p = 0.003; G vs. A: OR = 2.36, 95%CI = 1.33-4.17, p = 0.003) and rs4646903 (C vs. T: OR = 1.45, 95%CI = 1.02-2.06, p = 0.040) variations and increased risk of presbycusis. However, there was no significant association between rs1695 and presbycusis risk. Significant associations were also observed between GSTM1 (OR = 4.28, 95%CI = 1.18-15.52, p = 0.027) and GSTT1 (OR = 1.64, 95%CI = 1.02-2.65, p = 0.041) deletions and elevated risk of presbycusis. Moreover, the combination analysis revealed a significant association between the GSTM1+/GSTT1- genotype and presbycusis susceptibility (OR = 1.63, 95%CI = 1.00-2.67, p = 0.049). In silico analysis revealed that the rs1048943 SNP could significantly influence the RNA structure of CYP1A1 (distance: 0.1454; p value: 0.1799).
Based on our findings, the rs4646903 and rs1048943 SNPs as well as the GSTM1 and GSTT1 deletions could be considered genetic risk factors for the development and progression of presbycusis.
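The odds ratios (OR) and 95% confidence intervals reported in this abstract are the standard effect measures for a case-control genotype comparison. The Python sketch below shows one common way to obtain them from a 2x2 table: the unadjusted odds ratio with a Wald (Woolf) interval on the log scale. This is an assumption for illustration, since the abstract does not state which estimator or adjustment the authors used. The function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald (Woolf) confidence interval for a 2x2 table.

    a: cases with the risk genotype,    b: cases without it
    c: controls with the risk genotype, d: controls without it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this formula")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts for 140 cases and 140 controls; NOT the study's data.
or_value, lower, upper = odds_ratio_ci(30, 110, 15, 125)
print(f"OR = {or_value:.2f}, 95% CI = {lower:.2f}-{upper:.2f}")
```

An interval whose lower bound stays above 1 corresponds to a nominally significant increase in risk at the 5% level, which is how the intervals in the abstract can be read.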
Affiliation(s)
- Mohammad Karimian
- Department of Molecular and Cell Biology, Faculty of Basic Sciences, University of Mazandaran, Babolsar, 47416-95447, Iran
- Mohaddeseh Behjati
- Rajaie Cardiovascular Medical and Research Center, Iran University of Medical Sciences, Tehran, Iran
- Erfaneh Barati
- Department of Anatomy, School of Medicine, Kashan University of Medical Sciences, Kashan, Iran
- Tayyebeh Ehteram
- Department of ENT, School of Medicine, Kashan University of Medical Sciences, Qotb-e Ravandi Blvd, Kashan, 8715988141, Iran
- Ali Karimian
- Department of Anatomy, School of Medicine, Kashan University of Medical Sciences, Kashan, Iran
158
Venkata Subbiah H, Ramesh Babu P, Subbiah U. In silico analysis of non-synonymous single nucleotide polymorphisms of human DEFB1 gene. Egyptian Journal of Medical Human Genetics 2020. [DOI: 10.1186/s43042-020-00110-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 01/11/2023] Open
Abstract
Background
Single nucleotide polymorphisms (SNPs) play a significant role in differences in individuals' susceptibility to diseases, and it is imperative to differentiate potentially harmful SNPs from neutral ones. Defensins are small cationic antimicrobial peptides that serve as antimicrobial and immunomodulatory molecules, and SNPs in β-defensin 1 (DEFB1 gene) have been associated with several diseases. In this study, we determined deleterious SNPs of the DEFB1 gene that can affect susceptibility to diseases by using different computational methods. Non-synonymous SNPs (nsSNPs) of the DEFB1 gene that can affect protein structure and function were determined with several in silico tools: SIFT, PolyPhen v2, PROVEAN, SNAP, PhD-SNP, and SNPs&GO. The nsSNPs identified as potentially deleterious were then further analyzed with I-Mutant and ConSurf. Post-translational modifications mediated by nsSNPs were predicted by ModPred, and gene-gene interaction was studied with GeneMANIA. Finally, the nsSNPs were submitted to Project HOPE analysis.
Results
Ten nsSNPs of the DEFB1 gene were found to be potentially deleterious: rs1800968, rs55874920, rs56270143, rs140503947, rs145468425, rs146603349, rs199581284, rs201260899, rs371897938, rs376876621. The I-Mutant server showed that nsSNPs rs140503947 and rs146603349 decreased the stability of the protein, and ConSurf analysis revealed that the SNPs were located in conserved regions. The physicochemical properties of the polymorphic amino acid residues and their effect on structure were determined by Project HOPE.
Conclusion
This study identified high-risk deleterious nsSNPs of β-defensin 1 and could improve understanding of how such nsSNPs affect the structure and functions of the β-defensin 1 protein.
159
Sargazi S, Mirani Sargazi F, Moudi M, Heidari Nia M, Saravani R, Mirinejad S, Shahraki S, Shakiba M. Impact of Proliferator-Activated Receptor γ Gene Polymorphisms on Risk of Schizophrenia: A Case-Control Study and Computational Analyses. Iranian Journal of Psychiatry 2020; 15:286-296. [PMID: 33240378 PMCID: PMC7610076 DOI: 10.18502/ijps.v15i4.4294] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022]
Abstract
Objective: Schizophrenia (SCZ) is a common psychiatric disorder characterized by a complex mode of inheritance. Peroxisome proliferator-activated receptor-γ (PPARG) mainly regulates lipid and glucose metabolism and is constitutively expressed in rat primary microglial cultures. This preliminary study aimed to investigate the relationship of two polymorphisms in the PPARG gene, rs1801282 C/G and rs3856806 C/T, to the risk of SCZ in the southeast Iranian population. Method: A total of 300 participants (150 patients with SCZ and 150 healthy controls) were enrolled. Genotyping was done using the amplification refractory mutation system polymerase chain reaction (ARMS–PCR) technique. Computational analyses were carried out to predict the potential effects of the studied polymorphisms. Results: A significant link was found between genotypes of rs1801282 and SCZ susceptibility. The G allele of rs1801282 in the CG and GG forms of the codominant model increased the risk of SCZ 2.49- and 2.64-fold, respectively. With regard to rs3856806, an enhanced risk of SCZ was also observed under different inheritance models except for the overdominant model. Also, the T allele of rs3856806 enhanced the risk of SCZ 3.19-fold. Computational analyses predicted that the rs1801282 polymorphism might alter the secondary structure of PPARG mRNA and protein function, whereas the other variant created binding sites for some enhancer and silencer motifs. Conclusion: Our findings showed that the PPARG rs1801282 and rs3856806 polymorphisms are associated with SCZ susceptibility. Replication studies in different ethnicities with larger populations are needed to validate our findings.
Affiliation(s)
- Saman Sargazi
- Cellular and Molecular Research Center, Resistant Tuberculosis Institute, Zahedan University of Medical Sciences, Zahedan, Iran
- Fariba Mirani Sargazi
- Cellular and Molecular Research Center, Resistant Tuberculosis Institute, Zahedan University of Medical Sciences, Zahedan, Iran
- Mahdiyeh Moudi
- Genetics of Noncommunicable Disease Research Center, Resistant Tuberculosis Institute, Zahedan University of Medical Sciences, Zahedan, Iran
- Milad Heidari Nia
- Cellular and Molecular Research Center, Resistant Tuberculosis Institute, Zahedan University of Medical Sciences, Zahedan, Iran
- Ramin Saravani
- Cellular and Molecular Research Center, Resistant Tuberculosis Institute, Zahedan University of Medical Sciences, Zahedan, Iran; Department of Clinical Biochemistry, School of Medicine, Zahedan University of Medical Sciences, Zahedan, Iran
- Shekoufeh Mirinejad
- Cellular and Molecular Research Center, Resistant Tuberculosis Institute, Zahedan University of Medical Sciences, Zahedan, Iran
- Sheida Shahraki
- Cellular and Molecular Research Center, Resistant Tuberculosis Institute, Zahedan University of Medical Sciences, Zahedan, Iran
- Mansoor Shakiba
- Department of Psychiatry, Zahedan University of Medical Sciences, Zahedan, Iran
160
Li Y, Mohanty S, Nilsson D, Hansson B, Mao K, Irbäck A. When a foreign gene meets its native counterpart: computational biophysics analysis of two PgiC loci in the grass Festuca ovina. Sci Rep 2020; 10:18752. [PMID: 33127989 PMCID: PMC7599235 DOI: 10.1038/s41598-020-75650-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2020] [Accepted: 10/16/2020] [Indexed: 11/14/2022] Open
Abstract
Duplicative horizontal gene transfer may bring two previously separated homologous genes together, which may raise questions about the interplay between the gene products. One such gene pair is the “native” PgiC1 and “foreign” PgiC2 in the perennial grass Festuca ovina. Both PgiC1 and PgiC2 encode cytosolic phosphoglucose isomerase, a dimeric enzyme whose proper binding is functionally essential. Here, we use biophysical simulations to explore the inter-monomer binding of the two homodimers and the heterodimer that can be produced by PgiC1 and PgiC2 in F. ovina. Using simulated native-state ensembles, we examine the structural properties and binding tightness of the dimers. In addition, we investigate their ability to withstand dissociation when pulled by a force. Our results suggest that the inter-monomer binding is tighter in the PgiC2 than the PgiC1 homodimer, which could explain the more frequent occurrence of the foreign PgiC2 homodimer in dry habitats. We further find that the PgiC1 and PgiC2 monomers are compatible with heterodimer formation; the computed binding tightness is comparable to that of the PgiC1 homodimer. Enhanced homodimer stability and capability of heterodimer formation with PgiC1 are properties of PgiC2 that may contribute to the retaining of the otherwise redundant PgiC2 gene.
Affiliation(s)
- Yuan Li
- Computational Biology and Biological Physics, Department of Astronomy and Theoretical Physics, Lund University, 223 62, Lund, Sweden
- Sandipan Mohanty
- Institute for Advanced Simulation, Jülich Supercomputing Centre, Forschungszentrum Jülich, 52425, Jülich, Germany
- Daniel Nilsson
- Computational Biology and Biological Physics, Department of Astronomy and Theoretical Physics, Lund University, 223 62, Lund, Sweden
- Bengt Hansson
- Department of Biology, Lund University, 223 62, Lund, Sweden
- Kangshan Mao
- Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, State Key Laboratory of Hydraulics and Mountain River Engineering, Sichuan University, Chengdu, 610065, China
- Anders Irbäck
- Computational Biology and Biological Physics, Department of Astronomy and Theoretical Physics, Lund University, 223 62, Lund, Sweden
161
NGS Panel Testing of Triple-Negative Breast Cancer Patients in Cyprus: A Study of BRCA-Negative Cases. Cancers (Basel) 2020; 12:3140. [PMID: 33120919 PMCID: PMC7692082 DOI: 10.3390/cancers12113140] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/18/2020] [Revised: 10/05/2020] [Accepted: 10/23/2020] [Indexed: 01/11/2023] Open
Abstract
Simple Summary: In Cyprus, approximately 9% of triple-negative (negative for the common breast cancer receptors: estrogen, progesterone, and human epidermal growth factor receptor 2 (HER2) receptors) breast cancer (TNBC) patients carry inherited mutations in the BRCA1/2 breast cancer (BC) susceptibility genes. These mutations increase the risk of BC. However, the contribution of other BC susceptibility genes has not yet been determined. To this end, we aimed to investigate the prevalence of mutations in BRCA1/2-negative TNBC patients in Cyprus. Ninety-five cancer susceptibility genes were sequenced for 163 TNBC patients. The frequency of non-BRCA mutations and especially PALB2 in TNBC patients in Cyprus appears to be higher compared to other populations, and half of the mutation-positive patients were diagnosed over the age of 60 years. Based on these results, we believe that PALB2 and TP53 along with BRCA1/2 genetic testing could be beneficial for a large proportion of TNBC patients in Cyprus, irrespective of their age of diagnosis.
Abstract: In Cyprus, approximately 9% of triple-negative (estrogen receptor-negative, progesterone receptor-negative, and human epidermal growth factor receptor 2-negative) breast cancer (TNBC) patients are positive for germline pathogenic variants (PVs) in BRCA1/2. However, the contribution of other genes has not yet been determined. To this end, we aimed to investigate the prevalence of germline PVs in BRCA1/2-negative TNBC patients in Cyprus, unselected for family history of cancer or age of diagnosis. A comprehensive 94-cancer-gene panel was implemented for 163 germline DNA samples, extracted from the peripheral blood of TNBC patients. Identified variants of uncertain clinical significance were evaluated, using extensive in silico investigation. Eight PVs (4.9%) were identified in two high-penetrance TNBC susceptibility genes. Of these, seven occurred in PALB2 (87.5%) and one occurred in TP53 (12.5%). Interestingly, 50% of the patients carrying PVs were diagnosed over the age of 60 years. The frequency of non-BRCA PVs (4.9%) and especially PALB2 PVs (4.3%) in TNBC patients in Cyprus appears to be higher compared to other populations. Based on these results, we believe that PALB2 and TP53 along with BRCA1/2 genetic testing could be beneficial for a large proportion of TNBC patients in Cyprus, irrespective of their age of diagnosis.
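The percentages quoted in this abstract can be reproduced from the reported counts. The short check below is only a sketch: it assumes the denominator is the 163 tested patients and that each pathogenic variant corresponds to one patient, which the abstract implies but does not state outright. The variable and function names are illustrative.

```python
# Counts as reported in the abstract; names are illustrative only.
patients_tested = 163
pv_total = 8      # pathogenic variants found in PALB2 and TP53 combined
pv_palb2 = 7
pv_tp53 = 1


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Assumption: one pathogenic variant per carrier, so counts map to patients.
pv_rate = percent(pv_total, patients_tested)      # non-BRCA PV frequency
palb2_rate = percent(pv_palb2, patients_tested)   # PALB2 PV frequency
palb2_share = percent(pv_palb2, pv_total)         # PALB2 share of all PVs
tp53_share = percent(pv_tp53, pv_total)           # TP53 share of all PVs

print(pv_rate, palb2_rate, palb2_share, tp53_share)
```

Under those assumptions the four values agree with the 4.9%, 4.3%, 87.5% and 12.5% given in the text.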
|
162
|
Effects of Single-Nucleotide Polymorphisms in Calmodulin-Dependent Protein Kinase Kinase 2 (CAMKK2): A Comprehensive Study. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2020; 2020:7419512. [PMID: 33082841 PMCID: PMC7559224 DOI: 10.1155/2020/7419512] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/06/2020] [Revised: 07/27/2020] [Accepted: 08/01/2020] [Indexed: 12/01/2022]
Abstract
Calmodulin-dependent protein kinase kinase 2 (CAMKK2) is a protein kinase that belongs to the serine/threonine kinase family. It phosphorylates kinases like CAMK1, CAMK2, and AMP, and this signaling cascade is involved in various biological processes including cell proliferation and apoptosis. CAMKK2 signaling activity is also required for healthy brain function, and its disruption can cause diseases like bipolar disorders and anxiety. The current study is based on in silico bioinformatics analysis that combines sequence- and structure-based predictions to mark a SNP as damaging or neutral. The combined results from sequence-based, evolutionary conservation-based, and consensus-based tools predicted a total of 18 nsSNPs as deleterious, and these nsSNPs were further subjected to structure-based analysis. The six mutant models of V195A, V249M, R311C, F366Y, P389T, and W445C showed a higher deviation from the wild-type protein model and hence were taken forward for docking studies. The molecular docking analysis predicted that these mutations will also disrupt the protein-protein interactions between CAMKK2 and PRKAG1, which would evidently reduce the kinase activity. The current study indicates that a few of the significant mutations in CAMKK2 are prime candidates that could be a fundamental cause of various bipolar and psychiatric disorders. This is the first detailed study that predicts the deleterious nsSNPs in CAMKK2, and it contributes to a better understanding of disease mechanisms.
|
163
|
Soltani I, Bahia W, Radhouani A, Mahdhi A, Ferchichi S, Almawi WY. Comprehensive in-silico analysis of damage associated SNPs in hOCT1 affecting Imatinib response in chronic myeloid leukemia. Genomics 2020; 113:755-766. [PMID: 33075481 DOI: 10.1016/j.ygeno.2020.10.007] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/28/2020] [Revised: 09/18/2020] [Accepted: 10/05/2020] [Indexed: 12/22/2022]
Abstract
Non-synonymous single nucleotide polymorphisms (nsSNPs) in hOCT1 (encoded by the SLC22A1 gene) are expected to affect Imatinib uptake in chronic myeloid leukemia (CML). In this study, sequence homology-based genetic analysis of a set of 270 coding SNPs identified 18 nsSNPs as putatively damaging/deleterious using eight different algorithms. Subsequently, based on conservation of amino acid residues, stability analysis, posttranscriptional modifications, and solvent accessibility analysis, the possible structural-functional relationship was established for high-confidence nsSNPs. Furthermore, the modeling results revealed differences between mutant and wild-type amino acids in properties such as size, charge, interaction and hydrophobicity. Three highly deleterious mutations in the SLC22A1 gene, P283L, G401S and R402G, that may alter the protein structure, function and stability were identified. These results provide a filtered dataset for exploring the effect of uncharacterized nsSNPs and their association with Imatinib resistance in CML.
Affiliation(s)
- Ismael Soltani
- Molecular and Cellular Hematology Laboratory, Institut Pasteur de Tunis, Université Tunis El Manar, Tunis, Tunisia.
| | - Wael Bahia
- Research Unit of Clinical and Molecular Biology, Department of Biochemistry, Faculty of Pharmacy of Monastir, University of Monastir, Tunisia
| | - Assala Radhouani
- Faculty of Medicine, University of Tunis El Manar, Tunis, Tunisia
| | - Abdelkarim Mahdhi
- Laboratory of Analysis, Treatment and Valorization of Pollutants of the Environment and Products, Faculty of Pharmacy, University of Monastir, Tunisia
| | - Salima Ferchichi
- Research Unit of Clinical and Molecular Biology, Department of Biochemistry, Faculty of Pharmacy of Monastir, University of Monastir, Tunisia
| | - Wassim Y Almawi
- Faculty of Sciences, El Manar University, Tunis, Tunisia; College of Health Sciences, Abu Dhabi University, Abu Dhabi, United Arab Emirates
| |
|
164
|
Yazar M, Özbek P. In Silico Tools and Approaches for the Prediction of Functional and Structural Effects of Single-Nucleotide Polymorphisms on Proteins: An Expert Review. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2020; 25:23-37. [PMID: 33058752 DOI: 10.1089/omi.2020.0141] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 12/18/2022]
Abstract
Single-nucleotide polymorphisms (SNPs) are single-base variants that contribute to human biological variation and pathogenesis of many human diseases. Among all SNP types, nonsynonymous single-nucleotide polymorphisms (nsSNPs) can alter many structural, biochemical, and functional features of a protein such as folding characteristics, charge distribution, stability, dynamics, and interactions with other proteins/nucleotides. These modifications in the protein structure can lead nsSNPs to be closely associated with many multifactorial diseases such as cancer, diabetes, and neurodegenerative diseases. Predicting structural and functional effects of nsSNPs with experimental approaches can be time-consuming and costly; hence, computational prediction tools and algorithms are being widely and increasingly utilized in biology and medical research. This expert review examines the in silico tools and algorithms for the prediction of functional or structural effects of SNP variants, in addition to the description of the phenotypic effects of nsSNPs on protein structure, association between pathogenicity of variants, and functional or structural features of disease-associated variants. Finally, case studies investigating the functional and structural effects of nsSNPs on selected protein structures are highlighted. We conclude that creating a consistent workflow with a combination of in silico approaches or tools should be considered to increase the performance, accuracy, and precision of the biological and clinical predictions made in silico.
Affiliation(s)
- Metin Yazar
- Department of Bioengineering, Marmara University, Göztepe, İstanbul, Turkey; Department of Genetics and Bioengineering, Istanbul Okan University, Tuzla, Istanbul, Turkey
| | - Pemra Özbek
- Department of Bioengineering, Marmara University, Göztepe, İstanbul, Turkey
| |
|
165
|
Qiu J, Nechaev D, Rost B. Protein-protein and protein-nucleic acid binding residues important for common and rare sequence variants in human. BMC Bioinformatics 2020; 21:452. [PMID: 33050876 PMCID: PMC7557062 DOI: 10.1186/s12859-020-03759-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/01/2020] [Accepted: 09/16/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Any two unrelated people differ by about 20,000 missense mutations (also referred to as SAVs: Single Amino acid Variants or missense SNV). Many SAVs have been predicted to strongly affect molecular protein function. Common SAVs (> 5% of population) were predicted to have, on average, more effect on molecular protein function than rare SAVs (< 1% of population). We hypothesized that the prevalence of effect in common over rare SAVs might partially be caused by common SAVs more often occurring at interfaces of proteins with other proteins, DNA, or RNA, thereby creating subgroup-specific phenotypes. We analyzed SAVs from 60,706 people through the lens of two prediction methods, one (SNAP2) predicting the effects of SAVs on molecular protein function, the other (ProNA2020) predicting residues in DNA-, RNA- and protein-binding interfaces.
RESULTS: Three results stood out. Firstly, SAVs predicted to occur at binding interfaces were predicted to more likely affect molecular function than those predicted as not binding (p value < 2.2 × 10⁻¹⁶). Secondly, for SAVs predicted to occur at binding interfaces, common SAVs were predicted more strongly with effect on protein function than rare SAVs (p value < 2.2 × 10⁻¹⁶). Restriction to SAVs with experimental annotations confirmed all results, although the resulting subsets were too small to establish statistical significance for any result. Thirdly, the fraction of SAVs predicted at binding interfaces differed significantly between tissues, e.g. urinary bladder tissue was found abundant in SAVs predicted at protein-binding interfaces, and reproductive tissues (ovary, testis, vagina, seminal vesicle and endometrium) in SAVs predicted at DNA-binding interfaces.
CONCLUSIONS: Overall, the results suggested that residues at protein-, DNA-, and RNA-binding interfaces contributed toward predicting that common SAVs more likely affect molecular function than rare SAVs.
Affiliation(s)
- Jiajun Qiu
- Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748, Garching, Munich, Germany; TUM Graduate School, Center of Doctoral Studies in Informatics and Its Applications (CeDoSIA), 85748, Garching, Germany; Biobank of Ninth People's Hospital, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200125, China
| | - Dmitrii Nechaev
- Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748, Garching, Munich, Germany; TUM Graduate School, Center of Doctoral Studies in Informatics and Its Applications (CeDoSIA), 85748, Garching, Germany
| | - Burkhard Rost
- Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748, Garching, Munich, Germany; Institute of Advanced Study (TUM-IAS), Lichtenbergstr. 2a, 85748, Garching, Munich, Germany; Institute for Food and Plant Sciences (WZW) Weihenstephan, Alte Akademie 8, 85354, Freising, Germany
| |
|
166
|
"Mind the Gap": Hi-C Technology Boosts Contiguity of the Globe Artichoke Genome in Low-Recombination Regions. G3-GENES GENOMES GENETICS 2020; 10:3557-3564. [PMID: 32817122 PMCID: PMC7534446 DOI: 10.1534/g3.120.401446] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
Globe artichoke (Cynara cardunculus var. scolymus; 2n=2x=34) is cropped largely in the Mediterranean region, with Italy the leading world producer; however, over time, its cultivation has spread to the Americas and China. In 2016, we released the first (v1.0) globe artichoke genome sequence (http://www.artichokegenome.unito.it/). Its assembly was generated using ∼133-fold Illumina sequencing data, covering 725 of the 1,084 Mb genome, of which 526 Mb (73%) were anchored to 17 chromosomal pseudomolecules. Based on v1.0 sequencing data, we generated a new genome assembly (v2.0), obtained from a Hi-C (Dovetail) genomic library, which improves the scaffold N50 from 126 kb to 44.8 Mb (∼356-fold increase) and N90 from 29 kb to 17.8 Mb (∼685-fold increase). While the L90 of the v1.0 sequence included 6,123 scaffolds, that of the new v2.0 includes just 15 super-scaffolds, a number close to the haploid chromosome number of the species. The newly generated super-scaffolds were assigned to pseudomolecules using reciprocal blast procedures. The cumulative size of unplaced scaffolds in v2.0 was reduced by 165 Mb, increasing the anchored genome sequence to 94%. The marked improvement is mainly attributable to the ability of the proximity ligation-based approach to deal with both heterochromatic (e.g., peri-centromeric) and euchromatic regions during the assembly procedure, which made it possible to physically locate low-recombination regions. The new high-quality reference genome enhances the taxonomic breadth of the data available for comparative plant genomics and led to a new, accurate gene prediction (28,632 genes), thus promoting the map-based cloning of economically important genes.
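The contiguity statistics quoted in this abstract (N50, N90, L90) have a simple operational definition: sort scaffolds from longest to shortest and walk down the list until the running total reaches the chosen fraction of the assembly. The sketch below illustrates that definition on made-up scaffold lengths; the function name `nx_lx` and the toy data are hypothetical and are not taken from the artichoke assembly.

```python
def nx_lx(lengths, fraction):
    """Return (Nx, Lx) for a list of scaffold lengths.

    Nx is the length of the scaffold at which the cumulative sum of the
    length-sorted scaffolds first reaches `fraction` of the total;
    Lx is how many scaffolds were needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths (arbitrary units), for illustration only.
toy_scaffolds = [50, 40, 30, 20, 10]
n50, l50 = nx_lx(toy_scaffolds, 0.5)
n90, l90 = nx_lx(toy_scaffolds, 0.9)

# Fold change in scaffold N50 using the rounded values from the abstract.
fold_n50 = 44.8e6 / 126e3

print(n50, l50, n90, l90, round(fold_n50))
```

With the rounded figures from the abstract, the N50 ratio comes out at about 356, in line with the reported fold increase; ratios computed from other rounded values may differ slightly from the published ones.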
|
167
|
Teng S, Sobitan A, Rhoades R, Liu D, Tang Q. Systemic effects of missense mutations on SARS-CoV-2 spike glycoprotein stability and receptor-binding affinity. Brief Bioinform 2020; 22:1239-1253. [PMID: 33006605 PMCID: PMC7665319 DOI: 10.1093/bib/bbaa233] [Citation(s) in RCA: 80] [Impact Index Per Article: 20.0] [Received: 06/09/2020] [Revised: 08/03/2020] [Accepted: 08/26/2020] [Indexed: 12/21/2022] Open
Abstract
The spike (S) glycoprotein of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for binding to permissive cells. The receptor-binding domain (RBD) of SARS-CoV-2 S protein directly interacts with the human angiotensin-converting enzyme 2 (ACE2) on the host cell membrane. In this study, we used computational saturation mutagenesis approaches, including structure-based energy calculations and sequence-based pathogenicity predictions, to quantify the systemic effects of missense mutations on SARS-CoV-2 S protein structure and function. A total of 18 354 mutations in S protein were analyzed, and we discovered that most of these mutations could destabilize the entire S protein and its RBD. Specifically, residues G431 and S514 in SARS-CoV-2 RBD are important for S protein stability. We analyzed 384 experimentally verified S missense variations and revealed that the dominant pandemic form, D614G, can stabilize the entire S protein. Moreover, many mutations in N-linked glycosylation sites can increase the stability of the S protein. In addition, we investigated 3705 mutations in SARS-CoV-2 RBD and 11 324 mutations in human ACE2 and found that SARS-CoV-2 neighbor residues G496 and F497 and ACE2 residues D355 and Y41 are critical for the RBD-ACE2 interaction. The findings provide a comprehensive set of potential target sites for the development of drugs and vaccines against COVID-19.
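Computational saturation mutagenesis, as named in this abstract, means scoring every possible single amino acid substitution at every position, which gives 19 mutants per residue. The sketch below only enumerates such a mutation list; it does not perform the energy or pathogenicity calculations. The function name, the three-residue fragment and its numbering are hypothetical and chosen purely for illustration.

```python
# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def saturation_mutations(sequence, first_position=1):
    """List every single-residue substitution as e.g. 'G100A'.

    Each position yields 19 mutants (all residues except the wild type).
    """
    mutations = []
    for offset, wild_type in enumerate(sequence):
        position = first_position + offset
        for mutant in AMINO_ACIDS:
            if mutant != wild_type:
                mutations.append(f"{wild_type}{position}{mutant}")
    return mutations


# Hypothetical three-residue fragment; not a real spike protein segment.
toy_fragment = "GFN"
toy_mutations = saturation_mutations(toy_fragment, first_position=100)
print(len(toy_mutations), toy_mutations[:3])
```

The totals in the abstract are each divisible by 19 (18 354, 3705 and 11 324 correspond to 966, 195 and 596 positions), which is consistent with this enumeration scheme, although the abstract itself does not state the residue counts.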
Affiliation(s)
- Shaolei Teng
- Corresponding authors: Shaolei Teng, Department of Biology, Howard University, 415 College St. NW, Washington, DC 20059. Tel.: +1 202-806-6933; Qiyi Tang, Howard University College of Medicine, 520 W Street NW, Washington, DC 20059. Tel.: +1 202-806-3915
| | - Adebiyi Sobitan
- Department of Biology at the Howard University, 415 College St. NW, Washington, DC 20059
| | - Raina Rhoades
- Department of Biology at the Howard University, 415 College St. NW, Washington, DC 20059
| | - Dongxiao Liu
- Howard University College of Medicine, 520 W Street NW, Washington, DC 20059
| | - Qiyi Tang
- Corresponding authors: Shaolei Teng, Department of Biology, Howard University, 415 College St. NW, Washington, DC 20059. Tel.: +1 202-806-6933; Qiyi Tang, Howard University College of Medicine, 520 W Street NW, Washington, DC 20059. Tel.: +1 202-806-3915
| |
|
168
|
Wei Q, Wang J, Wang W, Hu T, Hu H, Bao C. A high-quality chromosome-level genome assembly reveals genetics for important traits in eggplant. HORTICULTURE RESEARCH 2020; 7:153. [PMID: 33024567 PMCID: PMC7506008 DOI: 10.1038/s41438-020-00391-0] [Citation(s) in RCA: 55] [Impact Index Per Article: 13.8] [Received: 06/10/2020] [Revised: 08/19/2020] [Accepted: 08/23/2020] [Indexed: 05/04/2023]
Abstract
Eggplant (Solanum melongena L.) is an economically important vegetable crop in the Solanaceae family, with extensive diversity among landraces and close relatives. Here, we report a high-quality reference genome for the eggplant inbred line HQ-1315 (S. melongena-HQ) using a combination of Illumina, Nanopore and 10X genomics sequencing technologies and Hi-C technology for genome assembly. The assembled genome has a total size of ~1.17 Gb across 12 chromosomes, with a contig N50 of 5.26 Mb, and contains 36,582 protein-coding genes. Repetitive sequences comprise 70.09% (811.14 Mb) of the eggplant genome, most of which are long terminal repeat (LTR) retrotransposons (65.80%), followed by long interspersed nuclear elements (LINEs, 1.54%) and DNA transposons (0.85%). The S. melongena-HQ eggplant genome carries a total of 563 accession-specific gene families containing 1009 genes. In total, 73 expanded gene families (892 genes) and 34 contracted gene families (114 genes) were functionally annotated. Comparative analysis of different eggplant genomes identified three types of variations, including single-nucleotide polymorphisms (SNPs), insertions/deletions (indels) and structural variants (SVs). Asymmetric SV accumulation was found in potential regulatory regions of protein-coding genes among the different eggplant genomes. Furthermore, we performed QTL-seq for eggplant fruit length using the S. melongena-HQ reference genome and detected a QTL interval of 71.29-78.26 Mb on chromosome E03. The gene Smechr0301963, which belongs to the SUN gene family, is predicted to be a key candidate gene for eggplant fruit length regulation. Moreover, we anchored a total of 210 linkage markers associated with 71 traits to the eggplant chromosomes and finally obtained 26 QTL hotspots. The eggplant HQ-1315 genome assembly can be accessed at http://eggplant-hq.cn. In conclusion, the eggplant genome presented herein provides a global view of genomic divergence at the whole-genome level and powerful tools for the identification of candidate genes for important traits in eggplant.
Affiliation(s)
- Qingzhen Wei
- Institute of Vegetable Research, Zhejiang Academy of Agricultural Sciences, Hangzhou, 30021 China
| | - Jinglei Wang
- Institute of Vegetable Research, Zhejiang Academy of Agricultural Sciences, Hangzhou, 30021 China
| | - Wuhong Wang
- Institute of Vegetable Research, Zhejiang Academy of Agricultural Sciences, Hangzhou, 30021 China
| | - Tianhua Hu
- Institute of Vegetable Research, Zhejiang Academy of Agricultural Sciences, Hangzhou, 30021 China
| | - Haijiao Hu
- Institute of Vegetable Research, Zhejiang Academy of Agricultural Sciences, Hangzhou, 30021 China
| | - Chonglai Bao
- Institute of Vegetable Research, Zhejiang Academy of Agricultural Sciences, Hangzhou, 30021 China
| |
|
169
|
Ren L, Shang Y, Yang L, Wang S, Wang X, Chen S, Bao Z, An D, Meng F, Cai J, Guo Y. Chromosome-level de novo genome assembly of Sarcophaga peregrina provides insights into the evolutionary adaptation of flesh flies. Mol Ecol Resour 2020; 21:251-262. [PMID: 32853451 DOI: 10.1111/1755-0998.13246] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/03/2019] [Revised: 08/11/2020] [Accepted: 08/17/2020] [Indexed: 01/29/2023]
Abstract
Sarcophaga peregrina is considered to be of great ecological, medical and forensic significance, and has unusual biological characteristics such as an ovoviviparous reproductive pattern and adaptation to feed on carrion. The availability of a high-quality genome will help to further reveal the mechanisms underlying these characteristics. Here we present a de novo-assembled genome at chromosome scale for S. peregrina. The final assembled genome was 560.31 Mb with a contig N50 of 3.84 Mb. Hi-C scaffolding reliably anchored six pseudochromosomes, accounting for 97.76% of the assembled genome. Moreover, repeat elements made up 45.70% of the genome. A total of 14,476 protein-coding genes were functionally annotated, accounting for 92.14% of all predicted genes. Phylogenetic analysis indicated that S. peregrina and S. bullata diverged ~7.14 million years ago. Comparative genomic analysis revealed expanded and positively selected genes related to biological features that aid in clarifying its ovoviviparous reproduction and carrion-feeding adaptations, such as lipid metabolism, olfactory receptor activity, antioxidant enzymes, proteolysis and serine-type endopeptidase activity. Protein-coding genes associated with ovoviviparity, such as yolk proteins, transferrin and acid sphingomyelinase, were identified. This study provides a valuable genomic resource for S. peregrina and offers insight into the molecular mechanisms underlying its adaptive evolution.
Affiliation(s)
- Lipin Ren
- Department of Forensic Science, School of Basic Medical Sciences, Central South University, Changsha, China
| | - Yanjie Shang
- Department of Forensic Science, School of Basic Medical Sciences, Central South University, Changsha, China
| | - Li Yang
- Department of Forensic Science, School of Basic Medical Sciences, Central South University, Changsha, China
| | - Shiwen Wang
- Department of Forensic Science, School of Basic Medical Sciences, Xinjiang Medical University, Ürümqi, China
| | - Xiang Wang
- Institute of Cancer Stem Cell, Dalian Medical University, Dalian, Liaoning, China
| | - Shan Chen
- School of Ecological and Environmental Sciences, East China Normal University, Shanghai, China
| | | | - Dong An
- OE biotech Co. Ltd, Shanghai, China
| | - Fanming Meng
- Department of Forensic Science, School of Basic Medical Sciences, Central South University, Changsha, China
| | - Jifeng Cai
- Department of Forensic Science, School of Basic Medical Sciences, Central South University, Changsha, China
| | - Yadong Guo
- Department of Forensic Science, School of Basic Medical Sciences, Central South University, Changsha, China
| |
|
170
|
Jaravine V, Balmford J, Metzger P, Boerries M, Binder H, Boeker M. Annotation of Human Exome Gene Variants with Consensus Pathogenicity. Genes (Basel) 2020; 11:E1076. [PMID: 32938008 PMCID: PMC7563776 DOI: 10.3390/genes11091076] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/29/2020] [Revised: 09/10/2020] [Accepted: 09/11/2020] [Indexed: 11/16/2022] Open
Abstract
A novel approach is developed to address the challenge of annotating with phenotypic effects those exome variants for which relevant empirical data are lacking or minimal. The predictive annotation method is implemented as a stacked ensemble of supervised base-learners, including distributed random forest and gradient boosting machines. Ensemble models were trained and cross-validated on evidence-based categorical variant effect annotations from the ClinVar database, and were applied to 84 million non-synonymous single nucleotide variants (SNVs). The consensus model combined 39 functional mutation impacts, cross-species conservation score, and gene indispensability score. The indispensability score, which accounts for differences in variant pathogenicity between essential and mutation-tolerant genes, considerably improved the predictions. The consensus combination is consistent with as many input scores as possible while minimizing false predictions. The input scores are ranked based on their ability to predict effects. The score rankings and categorical phenotypic variant effect predictions are intended for direct use in clinical and biological applications to prioritize human exome variants and mutations.
Affiliation(s)
- Victor Jaravine
- Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, 79104 Freiburg im Breisgau, Germany
| | - James Balmford
- Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, 79104 Freiburg im Breisgau, Germany
| | - Patrick Metzger
- Institute of Medical Bioinformatics and Systems Medicine, Medical Center—University of Freiburg, Faculty of Medicine, University of Freiburg, 79110 Freiburg im Breisgau, Germany
| | - Melanie Boerries
- Institute of Medical Bioinformatics and Systems Medicine, Medical Center—University of Freiburg, Faculty of Medicine, University of Freiburg, 79110 Freiburg im Breisgau, Germany
| | - Harald Binder
- Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, 79104 Freiburg im Breisgau, Germany
| | - Martin Boeker
- Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, 79104 Freiburg im Breisgau, Germany
| |
|
171
|
Masso M. Accurate and efficient structure-based computational mutagenesis for modeling fluorescence levels of Aequorea victoria green fluorescent protein mutants. Protein Eng Des Sel 2020; 33:5905305. [DOI: 10.1093/protein/gzaa022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/08/2020] [Revised: 07/16/2020] [Accepted: 08/04/2020] [Indexed: 11/13/2022] Open
Abstract
A computational mutagenesis technique was used to characterize the structural effects associated with over 46 000 single and multiple amino acid variants of Aequorea victoria green fluorescent protein (GFP), whose functional effects (fluorescence levels) were recently measured by experimental researchers. For each GFP mutant, the approach generated a single score reflecting the overall change in sequence-structure compatibility relative to native GFP, as well as a vector of environmental perturbation (EP) scores characterizing the impact at all GFP residue positions. A significant GFP structure–function relationship (P < 0.0001) was elucidated by comparing the sequence-structure compatibility scores with the functional data. Next, the computed vectors for GFP mutants were used to train predictive models of fluorescence by implementing random forest (RF) classification and tree regression machine learning algorithms. Classification performance reached 0.93 for sensitivity, 0.91 for precision and 0.90 for balanced accuracy, and regression models led to Pearson’s correlation as high as r = 0.83 between experimental and predicted GFP mutant fluorescence. An RF model trained on a subset of over 1000 experimental single residue GFP mutants with measured fluorescence was used for predicting the 3300 remaining unstudied single residue mutants, with results complementing known GFP biochemical and biophysical properties. In addition, models trained on the subset of experimental GFP mutants harboring multiple residue replacements successfully predicted fluorescence of the single residue GFP mutants. The models developed for this study were accurate and efficient, and their predictions outperformed those of several related state-of-the-art methods.
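The performance figures in this abstract (sensitivity, precision, balanced accuracy, Pearson's r) all follow from standard formulas. The sketch below shows how they are computed from a confusion matrix and from paired values. The counts and value lists are invented for illustration and are unrelated to the GFP data; the function names are likewise hypothetical.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute sensitivity, precision and balanced accuracy from counts."""
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)            # positive predictive value
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Invented confusion-matrix counts, for illustration only.
toy_metrics = classification_metrics(tp=90, fp=20, tn=80, fn=10)

# Invented "experimental" and "predicted" values, for illustration only.
toy_r = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])

print(toy_metrics, toy_r)
```

Balanced accuracy averages sensitivity and specificity, so it stays informative when the two classes differ greatly in size.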
Affiliation(s)
- Majid Masso
- Laboratory for Structural Bioinformatics, School of Systems Biology, George Mason University, 10900 University Boulevard MS 5B3, Manassas, VA 20110, USA
| |
|
172
|
Masso M, Bansal A, Bansal A, Henderson A. Structure-based functional analysis of BRCA1 RING domain variants: Concordance of computational mutagenesis, experimental assay, and clinical data. Biophys Chem 2020; 266:106442. [PMID: 32916545 DOI: 10.1016/j.bpc.2020.106442] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/23/2020] [Revised: 07/25/2020] [Accepted: 07/26/2020] [Indexed: 01/16/2023]
Abstract
A significant impediment to the improvement of clinical outcomes in treating breast and ovarian cancers rests with the lack of available interpretations for BRCA1 variants of unknown significance. Two research groups recently implemented large-scale functional assays for quantifying effects of single missense mutations on homology-directed DNA repair activity of BRCA1 variants, which is critical for tumor suppression and strongly correlates with cancer risk, and their results are significantly concordant with each other as well as with known pathogenic and benign variant clinical data. In this work, we implemented an established computational mutagenesis procedure to characterize structural impacts of single residue replacements to the BRCA1 RING domain. The computational data showed similarly strong concordance with known clinical data as well as with experimental data from both functional assays. Predictions made by models trained on our computational data offer a complementary and orthogonal approach for classifying all remaining unexplored BRCA1 RING domain variants.
Affiliation(s)
- Majid Masso
- School of Systems Biology, College of Science, George Mason University, 10900 University Boulevard MS 5B3, Manassas, VA 20110, USA.
| | - Anirudh Bansal
- School of Systems Biology, College of Science, George Mason University, 10900 University Boulevard MS 5B3, Manassas, VA 20110, USA
| | - Arnav Bansal
- School of Systems Biology, College of Science, George Mason University, 10900 University Boulevard MS 5B3, Manassas, VA 20110, USA
| | - Andrea Henderson
- School of Systems Biology, College of Science, George Mason University, 10900 University Boulevard MS 5B3, Manassas, VA 20110, USA
| |
|
173
|
Emadi E, Akhoundi F, Kalantar SM, Emadi-Baygi M. Predicting the most deleterious missense nsSNPs of the protein isoforms of the human HLA-G gene and in silico evaluation of their structural and functional consequences. BMC Genet 2020; 21:94. [PMID: 32867672 PMCID: PMC7457528 DOI: 10.1186/s12863-020-00890-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/24/2020] [Accepted: 07/19/2020] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND The Human Leukocyte Antigen G (HLA-G) protein is an immune tolerogenic molecule with 7 isoforms. Changes in expression level and some polymorphisms of the HLA-G gene are involved in various pathologies. Therefore, this study aimed to predict the most deleterious missense non-synonymous single nucleotide polymorphisms (nsSNPs) in HLA-G isoforms via in silico analyses and to examine structural and functional effects of the predicted nsSNPs on HLA-G isoforms. RESULTS Out of 301 reported SNPs in dbSNP, 35 missense SNPs in isoform 1, 35 missense SNPs in isoform 5, 8 missense SNPs in all membrane-bound HLA-G isoforms and 8 missense SNPs in all soluble HLA-G isoforms were predicted as deleterious by all eight servers (SIFT, PROVEAN, PolyPhen-2, I-Mutant 3.0, SNPs&GO, PhD-SNP, SNAP2, and MUpro). The structural and functional effects of the predicted nsSNPs on HLA-G isoforms were determined by MutPred2 and HOPE servers, respectively. ConSurf analyses showed that the majority of the predicted nsSNPs occur in conserved sites. I-TASSER and Chimera were used for modeling of the predicted nsSNPs. rs182801644 and rs771111444 were related to creating functional patterns in the 5'UTR. Five SNPs in the 3'UTR of the HLA-G gene were predicted to affect the miRNA target sites. Kaplan-Meier analysis showed that HLA-G deregulation can serve as a prognostic marker for some cancers. CONCLUSIONS The implementation of in silico SNP prioritization methods provides a great framework for the recognition of functional SNPs. The results obtained from the current study call for follow-up laboratory investigations.
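The "predicted as deleterious by all eight servers" criterion in the abstract above amounts to a unanimous-vote filter, which can be sketched as follows. The variant IDs and the normalized "deleterious"/"tolerated" labels are hypothetical; each real server reports its own output vocabulary, so a normalization step is assumed to have happened beforehand.

```python
# Server names as listed in the abstract; everything else here is illustrative.
TOOLS = ["SIFT", "PROVEAN", "PolyPhen-2", "I-Mutant 3.0",
         "SNPs&GO", "PhD-SNP", "SNAP2", "MUpro"]

def consensus_deleterious(predictions, tools):
    """Return the variant IDs that every tool in `tools` calls 'deleterious'.

    A variant with a missing call for any tool is excluded, because the
    criterion is unanimity rather than a majority vote.
    """
    return sorted(
        variant
        for variant, calls in predictions.items()
        if all(calls.get(tool) == "deleterious" for tool in tools)
    )

# Hypothetical, pre-normalized calls for three made-up variants.
predictions = {
    "variant_A": {tool: "deleterious" for tool in TOOLS},
    "variant_B": {**{tool: "deleterious" for tool in TOOLS}, "SIFT": "tolerated"},
    "variant_C": {tool: "deleterious" for tool in TOOLS[:7]},  # no MUpro call
}

consensus = consensus_deleterious(predictions, TOOLS)
print(consensus)
```

Requiring unanimity trades recall for specificity: a single dissenting or missing call removes a variant from the shortlist.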
Affiliation(s)
- Elaheh Emadi
- Department of Genetics, Faculty of Medicine, Shahid Sadoughi University of Medical Sciences, Yazd, Iran
| | - Fatemeh Akhoundi
- Department of Genetics, Faculty of Basic Sciences, Shahrekord University, Shahrekord, Iran
| | - Seyed Mehdi Kalantar
- Department of Genetics, Faculty of Medicine, Shahid Sadoughi University of Medical Sciences, Yazd, Iran
| | - Modjtaba Emadi-Baygi
- Department of Genetics, Faculty of Basic Sciences, Shahrekord University, Shahrekord, Iran.
- Research Institute of Biotechnology, Shahrekord University, Shahrekord, Iran.
| |
|
174
|
Gelli E, Fallerini C, Valentino F, Giliberti A, Castiglione F, Laschi L, Palmieri M, Fabbiani A, Tita R, Mencarelli MA, Renieri A, Ariani F. RB1 Germline Variant Predisposing to a Rare Ovarian Germ Cell Tumor: A Case Report. Front Oncol 2020; 10:1467. [PMID: 32974172 PMCID: PMC7471930 DOI: 10.3389/fonc.2020.01467] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2020] [Accepted: 07/09/2020] [Indexed: 01/10/2023] Open
Abstract
Malignant ovarian germ cell tumors (MOGCTs) are neoplasms of the ovary whose genetic background and development are poorly documented because of their rarity and heterogeneity. Here, we report an 18-year-old patient diagnosed with an ovarian mixed germ cell tumor, without any previous history of malignancies, who was treated with surgery and chemotherapy and died 4 years later due to complications of peritoneal metastasis. The patient's blood DNA was screened for a panel of 52 cancer-related genes in order to identify aberrations predisposing to this rare cancer. The analysis discovered the uncharacterized c.2393G>A variant in RB1, the retinoblastoma gene, leading both to a missense change and a splicing perturbation of the RB1 transcript. The variant was found to be hypomorphic, damaging the C-terminal domain with a partially impaired protein function. The variant is inherited from the unaffected mother. Due to an imprinting mechanism, the maternal allele is ~3-fold more expressed than the paternal one. The parent-of-origin effect combined with the hypomorphic impact of the variant determines a rescue of sufficient tumor-suppressor activity to prevent retinoblastoma development but can predispose to other cancers in adulthood. In order to understand the somatic events acting on the germline predisposition, we used an NGS liquid biopsy covering 77 cancer driver genes. Using this approach, we detected deleterious mutations in TP53, SMAD4, FGFR3, and MSH2, indicative of a dysregulation of cell cycle and DNA repair pathways. In conclusion, we have pinpointed for the first time that an RB1 leaky variant, not leading to retinoblastoma because of its maternal origin, can predispose adults to a very rare form of ovarian cancer and that the somatic disruption of a few genes contributes to tumor progression and aggressiveness.
Affiliation(s)
- Elisa Gelli
- Medical Genetics, University of Siena, Siena, Italy
| | | | | | | | - Francesca Castiglione
- Histopathology and Molecular Diagnostics, Careggi University Hospital Florence, Florence, Italy
| | - Lucrezia Laschi
- Department of Health Sciences, University of Florence, Florence, Italy
| | | | - Alessandra Fabbiani
- Medical Genetics, University of Siena, Siena, Italy.,Genetica Medica, Azienda Ospedaliera Universitaria Senese, Siena, Italy
| | - Rossella Tita
- Genetica Medica, Azienda Ospedaliera Universitaria Senese, Siena, Italy
| | | | - Alessandra Renieri
- Medical Genetics, University of Siena, Siena, Italy.,Genetica Medica, Azienda Ospedaliera Universitaria Senese, Siena, Italy
| | - Francesca Ariani
- Medical Genetics, University of Siena, Siena, Italy.,Genetica Medica, Azienda Ospedaliera Universitaria Senese, Siena, Italy
| |
|
175
|
Niemann-Pick disease A or B in four pediatric patients and SMPD1 mutation carrier frequency in the Mexican population. Ann Hepatol 2020; 18:613-619. [PMID: 31122880 DOI: 10.1016/j.aohep.2018.12.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/20/2018] [Revised: 12/14/2018] [Accepted: 11/23/2018] [Indexed: 02/04/2023]
Abstract
INTRODUCTION AND OBJECTIVES Niemann-Pick disease type A (NPD-A) and B (NPD-B) are lysosomal storage diseases with a birth prevalence of 0.4-0.6/100,000. They are caused by a deficiency in acid sphingomyelinase, an enzyme encoded by SMPD1. We analyzed the phenotype and genotype of four unrelated Mexican patients, one with NPD-A and three with NPD-B. PATIENTS AND METHODS Four female patients between 1 and 7 years of age were diagnosed with NPD-A or NPD-B by hepatosplenomegaly, among other clinical characteristics, and by determining the level of acid sphingomyelinase enzymatic activity and sequencing of the SMPD1 gene. Additionally, a 775bp amplicon of SMPD1 (from 11:6393835_6394609, including exons 5 and 6) was analyzed by capillary sequencing in a control group of 50 unrelated healthy Mexican Mestizos. RESULTS An infrequent variant (c.1343A>G p.Tyr448Cys) was observed in two patients. One is the first NPD-A homozygous patient reported with this variant and the other a compound heterozygous NPD-B patient with the c.1829_1831delGCC p.Arg610del variant. Another compound heterozygous patient had the c.1547A>G p.His516Arg variant (not previously described in affected individuals) along with the c.1805G>A p.Arg602His variant. A new c.1263+8C>T pathogenic variant was encountered in a homozygous state in a NPD-B patient. Among the healthy control individuals there was a heterozygous carrier for the c.1550A>T (rs142787001) pathogenic variant, but none with the known pathogenic variants in the 11:6393835_6394609 region of SMPD1. CONCLUSIONS The present study provides further NPD-A or B phenotype-genotype correlations. We detected a heterozygous carrier with a pathogenic variant in 1/50 healthy Mexican mestizos.
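The carrier frequency reported above (1 carrier among 50 controls) is a simple proportion; with so small a sample, its uncertainty is wide. The sketch below works through the arithmetic and adds a 95% Wilson score interval. The interval is an illustration added here and is not reported in the abstract.

```python
from math import sqrt

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width

carriers, controls = 1, 50            # counts taken from the abstract
frequency = carriers / controls       # observed carrier frequency
low, high = wilson_interval(carriers, controls)

print(f"carrier frequency = {frequency:.2%}")
print(f"approx. 95% Wilson interval = {low:.2%} to {high:.2%}")
```

The Wilson interval is used instead of the simpler normal approximation because the latter can extend below zero when the observed count is this small.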
|
176
|
Marhemati F, Rezaei R, Mohseni Meybodi A, Taheripanah R, Mostafaei S, Amani D. Transforming growth factor beta 1 (TGFβ1) polymorphisms and unexplained infertility: A genetic association study. Syst Biol Reprod Med 2020; 66:267-280. [PMID: 32735465 DOI: 10.1080/19396368.2020.1773575] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]
Abstract
The prevalence of infertility is increasing and worrisome. About 10 to 30% of infertility is classified as idiopathic or unexplained infertility (UI). TGF-β is a multifunctional and immunoregulatory cytokine that regulates both implantation and adhesion of trophoblasts to the extracellular matrix during pregnancy. The aim of the current study was to investigate the association of two polymorphisms of the TGF-β1 gene, rs1800470 (C29T) and rs1800471 (G74C), with unexplained infertility in Iranian patients. A total of 250 UI patients and 484 healthy individuals with no history of infertility were included in the study. The amplification and sequencing of target DNA fragments were done using PCR and automated sequencing methods, respectively. The effects of these polymorphisms on the structure and function of TGF-β1 mRNA and protein were analyzed using in silico tools. The frequency distributions of the alleles, genotypes, and haplotypes of both rs1800470 and rs1800471 polymorphisms differed significantly between patients and controls. The CC genotype of TGF-β1 rs1800470 (29C→T) increased the risk of UI in male UI patients. Moreover, the C allele of TGF-β1 rs1800471 was associated with increased risk of UI in female UI patients. Subgroup analysis of couples revealed a significant association between TGF-β1 polymorphisms (rs1800470, rs1800471) and the risk of UI in male, female, and all UI patients. The frequencies of the TG and CG haplotypes differed significantly between the UI and healthy groups (P < 0.05). The rs1800471 polymorphism changed the secondary structure of TGF-β1 mRNA and resulted in the removal of one mRNA arm and the creation of two new arms. Taken together, the results of the current study suggest that TGF-β1 functional polymorphisms may play an important role in the susceptibility to UI in the Iranian population. According to in silico analysis, polymorphisms in TGF-β1 can reduce mRNA half-life and, therefore, TGF-β1 expression.
Affiliation(s)
- Farnaz Marhemati
- Department of Immunology, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Ramazan Rezaei
- Department of Immunology, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Anahita Mohseni Meybodi
- Department of Genetics, Reproductive Biomedicine Research Center, Royan Institute for Reproductive Biomedicine, ACECR, Tehran, Iran
| | - Robabeh Taheripanah
- Department of Gynecology and Obstetrics, Reproductive Health Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Shayan Mostafaei
- Medical Biology Research Center, Health Technology Institute, Kermanshah University of Medical Sciences, Kermanshah, Iran.,Epidemiology and Biostatistics Unit, Rheumatology Research Center, Tehran University of Medical Sciences, Tehran, Iran
| | - Davar Amani
- Department of Immunology, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran.,Department of Gynecology and Obstetrics, Reproductive Health Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| |
|
177
|
Abreu GDM, Soares CDAPD, Tarantino RM, da Fonseca ACP, de Souza RB, Pereira MDFC, Cabello PH, Rodacki M, Zajdenverg L, Zembrzuski VM, Campos Junior M. Identification of the First PAX4-MODY Family Reported in Brazil. Diabetes Metab Syndr Obes 2020; 13:2623-2631. [PMID: 32801813 PMCID: PMC7399458 DOI: 10.2147/dmso.s256858] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/04/2020] [Accepted: 05/27/2020] [Indexed: 11/23/2022] Open
Abstract
PURPOSE The aim of this study was to sequence the coding region of the PAX4 gene in a Brazilian cohort with clinical manifestations of monogenic diabetes. PATIENTS AND METHODS This study included 31 patients with autosomal dominant history of diabetes, age at diagnosis ≤40 years, BMI <30 kg/m², and no mutations in GCK, HNF1A, HNF4A, or HNF1B. Screening of the PAX4 coding region was performed by Sanger sequencing. In silico algorithms were used to assess the potential impact of amino acid substitutions on protein structure and function. Additionally, PAX4-MODY family members and 158 control subjects without diabetes were analyzed for the identified mutation. RESULTS The molecular analysis of PAX4 detected one missense mutation, p.Arg164Gln (c.491G>A), segregating with diabetes in a large Brazilian family. The mutation was absent among the control group. The index case is a woman diagnosed at 32 years of age with polyneuropathy and treated with insulin. She did not present diabetic renal disease or retinopathy. Family members with the PAX4 p.Arg164Gln mutation have heterogeneous clinical manifestations and treatment responses, with age at diagnosis ranging from 24 years to 50 years. CONCLUSION To the best of our knowledge, this is the first study to report a PAX4-MODY family in Brazil. The age of PAX4-MODY diagnosis in the Brazilian family seems to be higher than the classical criteria for MODY. Our results reinforce the importance of screening large monogenic diabetes families for understanding the clinical manifestations of rare forms of diabetes and for specific and personalized treatment.
Affiliation(s)
| | | | - Roberta Magalhães Tarantino
- Diabetes and Nutrology Section, Internal Medicine Department, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Ambulatory of Diabetes, State Institute for Diabetes and Endocrinology Luiz Capriglione, Rio de Janeiro, Brazil
| | | | - Ritiele Bastos de Souza
- Human Genetics Laboratory, Oswaldo Cruz Institute, Oswaldo Cruz Foundation, Rio de Janeiro, Brazil
| | | | - Pedro Hernan Cabello
- Human Genetics Laboratory, Oswaldo Cruz Institute, Oswaldo Cruz Foundation, Rio de Janeiro, Brazil
- Laboratory of Genetics, School of Health Science, University of Grande Rio, Rio de Janeiro, Brazil
| | - Melanie Rodacki
- Diabetes and Nutrology Section, Internal Medicine Department, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
| | - Lenita Zajdenverg
- Diabetes and Nutrology Section, Internal Medicine Department, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
| | | | - Mário Campos Junior
- Human Genetics Laboratory, Oswaldo Cruz Institute, Oswaldo Cruz Foundation, Rio de Janeiro, Brazil
| |
|
178
|
Zhu C, Miller M, Zeng Z, Wang Y, Mahlich Y, Aptekmann A, Bromberg Y. Computational Approaches for Unraveling the Effects of Variation in the Human Genome and Microbiome. Annu Rev Biomed Data Sci 2020. [DOI: 10.1146/annurev-biodatasci-030320-041014] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/01/2023]
Abstract
The past two decades of analytical efforts have highlighted how much more remains to be learned about the human genome and, particularly, its complex involvement in promoting disease development and progression. While numerous computational tools exist for the assessment of the functional and pathogenic effects of genome variants, their precision is far from satisfactory, particularly for clinical use. Accumulating evidence also suggests that the human microbiome's interaction with the human genome plays a critical role in determining health and disease states. While numerous microbial taxonomic groups and molecular functions of the human microbiome have been associated with disease, the reproducibility of these findings is lacking. The human microbiome–genome interaction in healthy individuals is even less well understood. This review summarizes the available computational methods built to analyze the effect of variation in the human genome and microbiome. We address the applicability and precision of these methods across their possible uses. We also briefly discuss the exciting, necessary, and now possible integration of the two types of data to improve the understanding of pathogenicity mechanisms.
Affiliation(s)
- Chengsheng Zhu
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
| | - Maximilian Miller
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
| | - Zishuo Zeng
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
| | - Yanran Wang
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
| | - Yannick Mahlich
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
| | - Ariel Aptekmann
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
| | - Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
- Department of Genetics, Rutgers University, Piscataway, New Jersey 08854, USA
| |
|
179
|
Martin TA, Wu T, Tang Q, Dougherty LL, Parente DJ, Swint-Kruse L, Fenton AW. Identification of biochemically neutral positions in liver pyruvate kinase. Proteins 2020; 88:1340-1350. [PMID: 32449829 DOI: 10.1002/prot.25953] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2019] [Revised: 03/10/2020] [Accepted: 05/16/2020] [Indexed: 01/08/2023]
Abstract
Understanding how each residue position contributes to protein function has been a long-standing goal in protein science. Substitution studies have historically focused on conserved protein positions. However, substitutions of nonconserved positions can also modify function. Indeed, we recently identified nonconserved positions that have large substitution effects in human liver pyruvate kinase (hLPYK), including altered allosteric coupling. To facilitate a comparison of which characteristics determine when a nonconserved position does vs does not contribute to function, the goal of the current work was to identify neutral positions in hLPYK. However, existing hLPYK data showed that three features commonly associated with neutral positions (high sequence entropy, high surface exposure, and alanine scanning) lacked the sensitivity needed to guide experimental studies. We used multiple evolutionary patterns identified in a sequence alignment of the PYK family to identify which positions were least patterned, reasoning that these were most likely to be neutral. Nine positions were tested with a total of 117 amino acid substitutions. Although exploring all potential functions is not feasible for any protein, five parameters associated with substrate/effector affinities and allosteric coupling were measured for hLPYK variants. For each position, the aggregate functional outcomes of all variants were used to quantify a "neutrality" score. Three positions showed perfect neutral scores for all five parameters. Furthermore, the nine positions showed larger neutral scores than 17 positions located near allosteric binding sites. Thus, our strategy successfully enriched the dataset for positions with neutral and modest substitutions.
Affiliation(s)
- Tyler A Martin
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
| | - Tiffany Wu
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
| | - Qingling Tang
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
| | - Larissa L Dougherty
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
| | - Daniel J Parente
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA.,Department of Family and Community Medicine, The University of Kansas Medical Center, Kansas City, Kansas, USA
| | - Liskin Swint-Kruse
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
| | - Aron W Fenton
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
| |
|
180
|
Ilikci Sagkan R, Akin-Bali DF. Structural variations and expression profiles of the SARS-CoV-2 host invasion genes in lung cancer. J Med Virol 2020; 92:2637-2647. [PMID: 32492203 PMCID: PMC7300553 DOI: 10.1002/jmv.26107] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2020] [Revised: 05/24/2020] [Accepted: 05/29/2020] [Indexed: 12/26/2022]
Abstract
Recent days have seen growing evidence of cancer's susceptibility to severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2) and of the effect of genomic differences on the virus' entrance genes in lung cancer. Genetic confirmation of the hypotheses regarding gene expression and mutation pattern of target genes, including angiotensin‐converting enzyme‐2 (ACE2), transmembrane serine protease 2 (TMPRSS2), basigin (CD147/BSG) and paired basic amino acid cleaving enzyme (FURIN/PCSK3), as well as correlation analysis, was done in relation to lung adenocarcinoma (LUAD) and lung squamous carcinoma (LUSC) using in silico analysis. Not only were gene expression and mutation patterns detected, but correlation and survival analyses between ACE2 and the expression levels of the other target genes were also performed. The total genetic anomaly carrying rate of target genes, including ACE2, TMPRSS2, CD147/BSG, and FURIN/PCSK3, was determined as 8.1%, and 21 mutations were detected, with 7 of these mutations having pathogenic features. p.H34N on the RBD binding residues for SARS‐CoV‐2 was determined in our LUAD patient group. According to gene expression analysis results, though the TMPRSS2 level was statistically significantly decreased in the LUSC patient group compared to healthy control, the ACE2 level was determined to be high in LUAD and LUSC groups. There were no meaningful differences in the expression of CD147 and FURIN genes. The challenge for today is building the assessment of genomic susceptibility to COVID‐19 in lung cancer, requiring detailed experimental laboratory studies, in addition to in silico analyses, as a way of assessing the mechanism of novel virus invasion that can be used in the development of effective SARS‐CoV‐2 therapy. The increased expression of ACE2 and CD147/BSG and decreased expression of TMPRSS2 in tumor tissues may result in greater susceptibility to SARS‐CoV‐2 infection in COVID‐19 patients with lung cancer subtypes. The downregulated expression of TMPRSS2 may be useful for predicting prognosis and susceptibility to COVID‐19 in LUAD cancer patients. The impact of genetic variations in lung cancer patients needs to be assessed in order to effectively reveal potential invasion genes for SARS‐CoV‐2 in COVID‐19 susceptibility.
Affiliation(s)
- Rahsan Ilikci Sagkan
- Department of Medical Biology, School of Medicine, Usak University, Usak, Turkey
| | - Dilara Fatma Akin-Bali
- Department of Medical Biology, Faculty of Medicine, Nigde Omer Halisdemir University, Nigde, Turkey
| |
|
181
|
Whole genome resequencing of four Italian sweet pepper landraces provides insights on sequence variation in genes of agronomic value. Sci Rep 2020; 10:9189. [PMID: 32514106 PMCID: PMC7280500 DOI: 10.1038/s41598-020-66053-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2020] [Accepted: 05/07/2020] [Indexed: 11/08/2022] Open
Abstract
Sweet pepper (Capsicum annuum L.) is a high value crop and one of the most widely grown vegetables belonging to the Solanaceae family. In addition to commercial varieties and F1 hybrids, a multitude of landraces are grown, whose genetic combination is the result of hundreds of years of random, environmental, and farmer selection. High genetic diversity exists in the landrace gene pool, which, however, has scarcely been studied, thus limiting their cultivation. We re-sequenced four pepper inbred lines, from as many Italian landraces, which are representative of as many fruit types: big sized blocky with sunken apex ('Quadrato') and protruding apex or heart shaped ('Cuneo'), elongated ('Corno') and smaller sized sub-spherical ('Tumaticot'). Each genomic sequence was obtained through the Illumina platform at coverage ranging from 39 to 44×, and reconstructed at a chromosome scale. About 35.5k genes were predicted in each inbred line, of which 22,017 were shared among them and the reference genome (accession 'CM334'). Distinctive variations in miRNAs, resistance gene analogues (RGAs) and susceptibility genes (S-genes) were detected. A detailed survey of the SNP/Indels occurring in genes affecting fruit size, shape and quality identified the highest frequencies of variation in regulatory regions. Many structural variations were identified as presence/absence variations (PAVs), notably in resistance gene analogues (RGAs) and in the capsanthin/capsorubin synthase (CCS) gene. The large allelic diversity observed in the four inbred lines suggests their potential use as a pre-breeding resource and represents a one-stop resource for C. annuum genomics and a key tool for dissecting the path from sequence variation to phenotype.
|
182
|
Akın-Balı DF, Al-Khafaji K, Aktas SH, Taskin-Tok T. Bioinformatic and computational analysis for predominant mutations of the Nrf2/Keap1 complex in pediatric leukemia. J Biomol Struct Dyn 2020; 39:4290-4303. [PMID: 32469262 DOI: 10.1080/07391102.2020.1775702] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]
Abstract
The levels of reactive oxygen species (ROS) are tightly controlled and regulated by the Nuclear Factor Erythroid-2-Like 2 (Nrf2) transcription factor, which is the main regulator of antioxidant responses, and its suppressor protein Kelch-like ECH-associated protein 1 (Keap1). Our previous study identified six novel changes in the Nrf2/Keap1 pathway in pediatric ALL, which were described for the first time. These changes in the pathway are likely to alter the evolutionary process of amino acids and cause structural changes in the final products of genes. In this study, we aimed to compare the pathogenicity of eight determined mutations reported in our previous study by utilizing different programs with different algorithms and molecular dynamics simulation. Since it is too difficult to handle each existing mutation in a wet laboratory, in silico methods can help to select the important mutations for further analysis, to establish the appropriate patient population and to guide wet laboratory studies. For this purpose, four different algorithms were used to evaluate the effects of single amino acid mutations. In addition, root-mean-square deviation, root-mean-square fluctuation and free-energy landscape analyses were performed to observe stability, flexibility and energetically favorable conformations, respectively, for each amino acid mutation. As a result, our study emphasizes the importance of Keap1 mutations in the pediatric ALL Nrf2/Keap1 pathway, covering a total of eight mutations, two of which were shown for the first time in our study. The mutations in the Keap1 Broad-Complex, Tramtrack and Bric-à-brac domain are especially worthy of attention. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Dilara Fatma Akın-Balı
- Faculty of Medicine, Department of Medical Biology, Nigde Omer Halisdemir University, Nigde, Turkey
| | - Khattab Al-Khafaji
- Faculty of Arts and Sciences, Department of Chemistry, Gaziantep University, Gaziantep, Turkey
| | - Sedef Hande Aktas
- Vocational School of Health Services, Eskisehir Osmangazi University, Eskisehir, Turkey
| | - Tugba Taskin-Tok
- Faculty of Arts and Sciences, Department of Chemistry, Gaziantep University, Gaziantep, Turkey.,Department of Bioinformatics and Computational Biology, Institute of Health Sciences, Gaziantep University, Gaziantep, Turkey
| |
|
183
|
Tello J, Torres-Pérez R, Flutre T, Grimplet J, Ibáñez J. VviUCC1 Nucleotide Diversity, Linkage Disequilibrium and Association with Rachis Architecture Traits in Grapevine. Genes (Basel) 2020; 11:E598. [PMID: 32485819 PMCID: PMC7348735 DOI: 10.3390/genes11060598] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2020] [Revised: 05/25/2020] [Accepted: 05/27/2020] [Indexed: 11/25/2022] Open
Abstract
Cluster compactness is a trait with high agronomic relevance, affecting crop yield and grape composition. Rachis architecture is a major component of cluster compactness determinism, and is a target trait toward the breeding of grapevine varieties less susceptible to pests and diseases. Although its genetic basis is scarcely understood, a preliminary result indicated a possible involvement of the VviUCC1 gene. The aim of this study was to characterize the VviUCC1 gene in grapevine and to test the association between the natural variation observed for a series of rachis architecture traits and the polymorphisms detected in the VviUCC1 sequence. This gene encodes an uclacyanin plant-specific cell-wall protein involved in fiber formation and/or lignification processes. A high nucleotide diversity in the VviUCC1 gene promoter and coding regions was observed, but no critical effects were predicted in the protein domains, indicating a high level of conservation of its function in the cultivated grapevine. After correcting statistical models for genetic stratification and linkage disequilibrium effects, marker-trait association results revealed a series of single nucleotide polymorphisms (SNPs) significantly associated with cluster compactness and rachis traits variation. Two of them (Y-984 and K-88) affected two common cis-transcriptional regulatory elements, suggesting an effect on phenotype via gene expression regulation. This work reinforces the interest of further studies aiming to reveal the functional effect of the detected VviUCC1 variants on grapevine rachis architecture.
Affiliation(s)
- Javier Tello
- Departamento de Viticultura, Instituto de Ciencias de la Vid y del Vino (CSIC, UR, Gobierno de La Rioja), 26080 Logroño, Spain; (R.T.-P.); (J.G.); (J.I.)
| | - Rafael Torres-Pérez
- Departamento de Viticultura, Instituto de Ciencias de la Vid y del Vino (CSIC, UR, Gobierno de La Rioja), 26080 Logroño, Spain; (R.T.-P.); (J.G.); (J.I.)
- Servicio de Bioinformática para Genómica y Proteómica (BioinfoGP), Centro Nacional de Biotecnología (CNB-CSIC), 28049 Madrid, Spain
| | - Timothée Flutre
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE-Le Moulon, 91190 Gif-sur-Yvette, France;
| | - Jérôme Grimplet
- Departamento de Viticultura, Instituto de Ciencias de la Vid y del Vino (CSIC, UR, Gobierno de La Rioja), 26080 Logroño, Spain; (R.T.-P.); (J.G.); (J.I.)
- Unidad de Hortofruticultura, Centro de Investigación y Tecnología Agroalimentaria de Aragón (CITA), 50059 Zaragoza, Spain
- Instituto Agroalimentario de Aragón-IA2 (CITA-Universidad de Zaragoza), 50059 Zaragoza, Spain
| | - Javier Ibáñez
- Departamento de Viticultura, Instituto de Ciencias de la Vid y del Vino (CSIC, UR, Gobierno de La Rioja), 26080 Logroño, Spain; (R.T.-P.); (J.G.); (J.I.)
|
184
|
Remali J, Aizat WM, Ng CL, Lim YC, Mohamed-Hussein ZA, Fazry S. In silico analysis on the functional and structural impact of Rad50 mutations involved in DNA strand break repair. PeerJ 2020; 8:e9197. [PMID: 32509463 PMCID: PMC7247530 DOI: 10.7717/peerj.9197] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/28/2019] [Accepted: 04/24/2020] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND: DNA double-strand break repair is important to preserve the fidelity of our genetic makeup after DNA damage. Rad50 is one of the components of the MRN complex, which is important for the DNA repair mechanism. Rad50 mutations can lead to microcephaly, mental retardation and growth retardation in humans. However, Rad50 mutations in humans and other organisms have never been gathered and heuristically compared for their deleterious effects. It is important to assess the conserved regions in Rad50 and its homologs to identify vital mutations that can affect the functions of the protein. METHOD: In this study, Rad50 mutations were retrieved from the SNPeffect 4.0 database and the literature. Each mutation was analyzed using bioinformatic tools such as PredictSNP, MutPred, SNPeffect 4.0, I-Mutant and MuPro to identify its impact on molecular mechanism, biological function and protein stability. RESULTS: We identified 103 mutations occurring mostly in the Rad50 protein domains and motifs, of which only 42 were classified as most deleterious. These mutations are mainly situated at specific motifs such as the Walker A, Q-loop, Walker B, D-loop and signature motifs of the Rad50 protein. Some of these mutations were predicted to negatively affect several important functional sites that play important roles in the DNA repair mechanism and cell cycle signaling pathway, highlighting Rad50's crucial role in this process. Interestingly, mutations located in non-conserved regions were predicted to have neutral/non-damaging effects, in contrast with previous experimental studies that showed deleterious effects. This suggests that the software used in this study may have limitations in predicting the effects of mutations in non-conserved regions, implying that further improvement of their algorithms is needed. In conclusion, this study reveals the priority of acid substitution associated with the genetic disorders.
This finding highlights the vital roles of certain residues such as K42E, C681A/S, CC684R/S, S1202R, E1232Q and D1238N/A located in Rad50 conserved regions, which can be considered for more targeted future studies.
Affiliation(s)
- Juwairiah Remali
- Department of Food Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
| | - Wan Mohd Aizat
- Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
| | - Chyan Leong Ng
- Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
| | - Yi Chieh Lim
- Danish Cancer Society, Research Centre Strand Boulevard, Copenhagen, Denmark
| | - Zeti-Azura Mohamed-Hussein
- Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
- Department of Applied Physics, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
| | - Shazrul Fazry
- Department of Food Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
- Pusat Penyelidikan Tasik Chini, Fakulti Sains dan Teknologi, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia
|
185
|
Miller M, Vitale D, Kahn PC, Rost B, Bromberg Y. funtrp: identifying protein positions for variation driven functional tuning. Nucleic Acids Res 2020; 47:e142. [PMID: 31584091 PMCID: PMC6868392 DOI: 10.1093/nar/gkz818] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 06/25/2019] [Revised: 09/05/2019] [Accepted: 09/12/2019] [Indexed: 12/12/2022] Open
Abstract
Evaluating the impact of non-synonymous genetic variants is essential for uncovering disease associations and mechanisms of evolution. An in-depth understanding of sequence changes is also fundamental for synthetic protein design and stability assessments. However, the variant effect predictor performance gain observed in recent years has not kept up with the increased complexity of new methods. One likely reason for this might be that most approaches use similar sets of gene and protein features for modeling variant effects, often emphasizing sequence conservation. While high levels of conservation highlight residues essential for protein activity, much of the variation observable in vivo is arguably weaker in its impact, thus requiring evaluation at a higher level of resolution. Here, we describe function Neutral/Toggle/Rheostat predictor (funtrp), a novel computational method that categorizes protein positions based on the position-specific expected range of mutational impacts: Neutral (weak/no effects), Rheostat (function-tuning positions), or Toggle (on/off switches). We show that position types do not correlate strongly with familiar protein features such as conservation or protein disorder. We also find that position type distribution varies across different protein functions. Finally, we demonstrate that position types can improve performance of existing variant effect predictors and suggest a way forward for the development of new ones.
Affiliation(s)
- Maximilian Miller
- Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Dr, New Brunswick, NJ 08901, USA
| | - Daniel Vitale
- Columbian College of Arts and Sciences Data Science Program Corcoran Hall, 725 21st Street NW, Washington, DC 20052, USA
| | - Peter C Kahn
- Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Dr, New Brunswick, NJ 08901, USA
| | - Burkhard Rost
- Department for Bioinformatics and Computational Biology, Technische Universität München, Boltzmannstr. 3, 85748 Garching/Munich, Germany; Institute for Advanced Study at Technische Universität München (TUM-IAS), Lichtenbergstraße 2a 85748 Garching/Munich, Germany
| | - Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Dr, New Brunswick, NJ 08901, USA; Institute for Advanced Study at Technische Universität München (TUM-IAS), Lichtenbergstraße 2a 85748 Garching/Munich, Germany; Department of Genetics, Rutgers University, Human Genetics Institute, Life Sciences Building, 145 Bevier Road, Piscataway, NJ 08854, USA
|
186
|
Singh S, Yennamalli RM, Gupta M, Changotra H. Identification of nsSNPs of transcription factor E2F1 predisposing individuals to lung cancer and head and neck cancer. Mutat Res 2020; 821:111704. [PMID: 32407972 DOI: 10.1016/j.mrfmmm.2020.111704] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 01/30/2020] [Revised: 03/26/2020] [Accepted: 04/05/2020] [Indexed: 06/11/2023]
Abstract
The E2F family of transcription factors is involved in the G1/S transition and DNA replication, and their deregulated expression has been reported in various human cancers. Studies have shown that the genetic variants of E2F1 family members play an important role in head and neck carcinogenesis. In this study, we predicted six highly deleterious nsSNPs (C227F, R252H, V295D, C298Y, R56W, and Y59C) of the E2F1 gene through in silico analyses based on protein structure, function, and amino acid conservation. Molecular dynamics studies showed a deviation of the structures of the mutant proteins from the global protein parameters. Further, a case-control study that included a total of 535 samples (305 cancer patients and 230 controls) was conducted to test the association of the predicted SNPs with susceptibility to lung cancer (LC) and head and neck cancer (HNC). Genotyping was done using an in-house artificial-RFLP method. Statistical analysis showed that the mutant alleles/genotypes of rs3213172 (R252H) increased the risk of LC and HNC by approximately 2- to 5-fold in all the genetic models. These results suggest that the rs3213172C/T polymorphism of the E2F1 gene could be used as an effective biomarker for genetic susceptibility to LC and HNC in our population.
Affiliation(s)
- Sanjay Singh
- Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat, Himachal Pradesh, 173234, India
| | - Ragothaman M Yennamalli
- Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat, Himachal Pradesh, 173234, India
| | - Manish Gupta
- Department of Radiotherapy and Oncology (Regional Cancer Center), Indira Gandhi Medical College, Shimla, Himachal Pradesh, 171001, India
| | - Harish Changotra
- Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat, Himachal Pradesh, 173234, India.
|
187
|
Reeb J, Wirth T, Rost B. Variant effect predictions capture some aspects of deep mutational scanning experiments. BMC Bioinformatics 2020; 21:107. [PMID: 32183714 PMCID: PMC7077003 DOI: 10.1186/s12859-020-3439-4] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 12/12/2019] [Accepted: 03/03/2020] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND: Deep mutational scanning (DMS) studies exploit the mutational landscape of sequence variation by systematically and comprehensively assaying the effect of single amino acid variants (SAVs; also referred to as missense mutations, or non-synonymous Single Nucleotide Variants, i.e. missense SNVs or nsSNVs) for particular proteins. We assembled SAV annotations from 22 different DMS experiments and normalized the effect scores to evaluate variant effect prediction methods: three trained on traditional variant effect data (PolyPhen-2, SIFT, SNAP2), a regression method optimized on DMS data (Envision), and a naïve prediction using conservation information from homologs. RESULTS: On a set of 32,981 SAVs, all methods captured some aspects of the experimental effect scores, albeit not the same ones. Traditional methods such as SNAP2 correlated slightly more with measurements and better classified binary states (effect or neutral). Envision appeared to better estimate the precise degree of effect. Most surprising was that the simple naïve conservation approach using PSI-BLAST in many cases outperformed other methods. All methods captured beneficial effects (gain-of-function) significantly worse than deleterious ones (loss-of-function). For the few proteins with multiple independent experimental measurements, experiments differed substantially, but agreed more with each other than with predictions. CONCLUSIONS: DMS provides a new powerful experimental means of understanding the dynamics of the protein sequence space. As always, promising new beginnings have to overcome challenges. While our results demonstrated that DMS will be crucial to improve variant effect prediction methods, data diversity hindered simplification and generalization.
Affiliation(s)
- Jonas Reeb
- Department of Informatics, Bioinformatics & Computational Biology - i12, TUM (Technical University of Munich), Boltzmannstr 3, 85748, Garching/Munich, Germany.
| | - Theresa Wirth
- Department of Informatics, Bioinformatics & Computational Biology - i12, TUM (Technical University of Munich), Boltzmannstr 3, 85748, Garching/Munich, Germany
| | - Burkhard Rost
- Department of Informatics, Bioinformatics & Computational Biology - i12, TUM (Technical University of Munich), Boltzmannstr 3, 85748, Garching/Munich, Germany
- Institute for Advanced Study (TUM-IAS), Lichtenbergstr 2a, 85748, Garching/Munich, Germany
- TUM School of Life Sciences Weihenstephan (WZW), Alte Akademie 8, Freising, Germany
- Department of Biochemistry and Molecular Biophysics, Columbia University, 701 West, 168th Street, New York, NY, 10032, USA
|
188
|
Kumar R, Kumar R, Tanwar P, Rath GK, Kumar R, Kumar S, Dash N, Das P, Hussain S. Deciphering the impact of missense mutations on structure and dynamics of SMAD4 protein involved in pathogenesis of gall bladder cancer. J Biomol Struct Dyn 2020; 39:1940-1954. [PMID: 32151199 DOI: 10.1080/07391102.2020.1740789] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/24/2022]
Abstract
Gall bladder cancer (GBC) is the most common malignancy of the biliary tract and is associated with a high mortality rate and poor prognosis due to the lack of suitable biomarkers. In this study, we explored the structural and functional effects of different missense mutations occurring in SMAD4 that were associated with the development of GBC. We utilized in silico methods to predict the harmful effects of nonsynonymous missense mutations and monitored the stability of the protein. We found that all mutations (D351N, G352E, R361C, R361H, E526Q) associated with SMAD4 were deleterious in nature, resulting in the formation of a deformed or unstable protein structure. Molecular dynamics simulation studies revealed how these mutations affect protein stability, structure, conformation and function. We observed that different mutants increase the compactness and rigidity of the SMAD4 protein, alter secondary structure composition, decrease the surface area and protein-ligand interaction, and affect its conformation. The findings of the current work indicate that the analyzed mutations might affect the structure of the protein and its capacity to interact with other molecules, which is probably related to functional impairment of SMAD4 upon the D351N, G352E, R361C, R361H and E526Q mutations and their involvement in cancer. Hence, the present study is significant for rational drug design and further increases our understanding of GBC development. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Rakesh Kumar
- Dr. B. R. A.-Institute Rotary Cancer Hospital, All India Institute of Medical Sciences, New Delhi, India
| | - Rahul Kumar
- Dr. B. R. A.-Institute Rotary Cancer Hospital, All India Institute of Medical Sciences, New Delhi, India
| | - Pranay Tanwar
- Dr. B. R. A.-Institute Rotary Cancer Hospital, All India Institute of Medical Sciences, New Delhi, India
| | - G K Rath
- Dr. B. R. A.-Institute Rotary Cancer Hospital, All India Institute of Medical Sciences, New Delhi, India
| | - Ritesh Kumar
- Dr. B. R. A.-Institute Rotary Cancer Hospital, All India Institute of Medical Sciences, New Delhi, India
| | - Sunil Kumar
- Dr. B. R. A.-Institute Rotary Cancer Hospital, All India Institute of Medical Sciences, New Delhi, India
| | - Nihar Dash
- Department of Gastrointestinal Surgery, All India Institute of Medical Sciences, New Delhi, India
| | - Prasenjit Das
- Department of Pathology, All India Institute of Medical Sciences, New Delhi, India
| | - Showket Hussain
- Division of Molecular Oncology, National Institute of Cancer Prevention and Research, Noida, India
|
189
|
Liu Y, Tang Q, Cheng P, Zhu M, Zhang H, Liu J, Zuo M, Huang C, Wu C, Sun Z, Liu Z. Whole-genome sequencing and analysis of the Chinese herbal plant Gelsemium elegans. Acta Pharm Sin B 2020; 10:374-382. [PMID: 32082980 PMCID: PMC7016290 DOI: 10.1016/j.apsb.2019.08.004] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 04/15/2019] [Revised: 06/27/2019] [Accepted: 07/26/2019] [Indexed: 11/26/2022] Open
Abstract
BACKGROUND: Gelsemium elegans (G. elegans) (2n = 2x = 16) is a flowering plant belonging to the Gelsemiaceae family. METHOD: Here, a high-quality genome assembly was generated using the Oxford Nanopore Technologies (ONT) platform and high-throughput chromosome conformation capture (Hi-C) techniques. RESULTS: A total of 56.11 Gb of raw GridION X5 platform ONT reads (6.23 Gb per cell) were generated. After filtering, 53.45 Gb of clean reads were obtained, giving 160× coverage depth. A de novo genome assembly of 335.13 Mb, close to the 338 Mb estimated by k-mer analysis, was generated with a contig N50 of 10.23 Mb. The vast majority (99.2%) of the G. elegans assembled sequence was anchored onto 8 pseudo-chromosomes. The genome completeness was then evaluated, and 1338 of the 1440 conserved genes (92.9%) could be found in the assembly. Genome annotation revealed that 43.16% of the G. elegans genome is composed of repetitive elements and 23.9% is composed of long terminal repeat elements. We predicted 26,768 protein-coding genes, of which 84.56% were functionally annotated. CONCLUSION: The genomic sequences of G. elegans could be a valuable source for comparative genomic analysis in the Gelsemiaceae family and will be useful for understanding the phylogenetic relationships of the indole alkaloid metabolism.
|
190
|
Van Etten M, Lee KM, Chang SM, Baucom RS. Parallel and nonparallel genomic responses contribute to herbicide resistance in Ipomoea purpurea, a common agricultural weed. PLoS Genet 2020; 16:e1008593. [PMID: 32012153 PMCID: PMC7018220 DOI: 10.1371/journal.pgen.1008593] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 07/05/2019] [Revised: 02/13/2020] [Accepted: 01/03/2020] [Indexed: 12/30/2022] Open
Abstract
The repeated evolution of herbicide resistance has been cited as an example of genetic parallelism, wherein separate species or genetic lineages utilize the same genetic solution in response to selection. However, most studies that investigate the genetic basis of herbicide resistance examine the potential for changes in the protein targeted by the herbicide rather than considering genome-wide changes. We used a population genomics screen and targeted exome re-sequencing to uncover the potential genetic basis of glyphosate resistance in the common morning glory, Ipomoea purpurea, and to determine if genetic parallelism underlies the repeated evolution of resistance across replicate resistant populations. We found no evidence for changes in 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS), glyphosate's target protein, that were associated with resistance, and instead identified five genomic regions that showed evidence of selection. Within these regions, genes involved in herbicide detoxification (cytochrome P450s, ABC transporters, and glycosyltransferases) are enriched and exhibit signs of selective sweeps. One region under selection shows parallel changes across all assayed resistant populations whereas other regions exhibit signs of divergence. Thus, while it appears that the physiological mechanism of resistance in this species is likely the same among resistant populations, we find patterns of both similar and divergent selection across separate resistant populations at particular loci.
Affiliation(s)
- Megan Van Etten
- Biology Department, Penn State-Scranton, Dunmore, Pennsylvania, United States of America
| | - Kristin M. Lee
- Department of Biological Sciences, Columbia University, New York, New York, United States of America
| | - Shu-Mei Chang
- Plant Biology Department, University of Georgia, Athens, Georgia, United States of America
| | - Regina S. Baucom
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, United States of America
|
191
|
Gómez-Fernández P, Lopez de Lapuente Portilla A, Astobiza I, Mena J, Urtasun A, Altmann V, Matesanz F, Otaegui D, Urcelay E, Antigüedad A, Malhotra S, Montalban X, Castillo-Triviño T, Espino-Paisán L, Aktas O, Buttmann M, Chan A, Fontaine B, Gourraud PA, Hecker M, Hoffjan S, Kubisch C, Kümpfel T, Luessi F, Zettl UK, Zipp F, Alloza I, Comabella M, Lill CM, Vandenbroeck K. The Rare IL22RA2 Signal Peptide Coding Variant rs28385692 Decreases Secretion of IL-22BP Isoform-1, -2 and -3 and Is Associated with Risk for Multiple Sclerosis. Cells 2020; 9:cells9010175. [PMID: 31936765 PMCID: PMC7017210 DOI: 10.3390/cells9010175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2019] [Revised: 12/29/2019] [Accepted: 01/03/2020] [Indexed: 10/29/2022] Open
Abstract
The IL22RA2 locus is associated with risk for multiple sclerosis (MS), but causative variants are yet to be determined. In a single nucleotide polymorphism (SNP) screen of this locus in a Basque population, rs28385692, a rare coding variant substituting Pro for Leu at position 16 (p.Leu16Pro), emerged as significant (p = 0.02). This variant is located in the signal peptide (SP) shared by the three secreted protein isoforms produced by IL22RA2 (IL-22 binding protein-1 (IL-22BPi1), IL-22BPi2 and IL-22BPi3). Genotyping was extended to a Europe-wide case-control dataset and yielded high significance in the full dataset (p = 3.17 × 10⁻⁴). Importantly, logistic regression analyses conditioning on the main known MS-associated SNP at this locus, rs17066096, revealed that this association was independent from the primary association signal in the full case-control dataset. In silico analysis predicted both disruption of the alpha helix of the H-region of the SP and decreased hydrophobicity of this region, ultimately affecting the SP cleavage site. We tested the effect of the p.Leu16Pro variant on the secretion of IL-22BPi1, IL-22BPi2 and IL-22BPi3 and observed that the Pro16 risk allele significantly lowers secretion levels of each of the isoforms to around 50%-60% in comparison to the Leu16 reference allele. Thus, our study suggests that genetically coded decreased levels of IL-22BP isoforms are associated with augmented risk for MS.
Affiliation(s)
- Paloma Gómez-Fernández
- Neurogenomiks Laboratory, University of the Basque Country (UPV/EHU), 48940 Leioa, Spain; (P.G.-F.); (A.L.d.L.P.); (I.A.); (J.M.); (A.U.); (I.A.)
| | - Aitzkoa Lopez de Lapuente Portilla
- Neurogenomiks Laboratory, University of the Basque Country (UPV/EHU), 48940 Leioa, Spain; (P.G.-F.); (A.L.d.L.P.); (I.A.); (J.M.); (A.U.); (I.A.)
- Department of Laboratory Medicine, Lund University, SE-221 00 Lund, Sweden
| | - Ianire Astobiza
- Neurogenomiks Laboratory, University of the Basque Country (UPV/EHU), 48940 Leioa, Spain; (P.G.-F.); (A.L.d.L.P.); (I.A.); (J.M.); (A.U.); (I.A.)
| | - Jorge Mena
- Neurogenomiks Laboratory, University of the Basque Country (UPV/EHU), 48940 Leioa, Spain; (P.G.-F.); (A.L.d.L.P.); (I.A.); (J.M.); (A.U.); (I.A.)
- Inflammation & Biomarkers Group, Biocruces Bizkaia Health Research Institute, 48903 Barakaldo, Spain
| | - Andoni Urtasun
- Neurogenomiks Laboratory, University of the Basque Country (UPV/EHU), 48940 Leioa, Spain; (P.G.-F.); (A.L.d.L.P.); (I.A.); (J.M.); (A.U.); (I.A.)
| | - Vivian Altmann
- Genetic and Molecular Epidemiology Group, Lübeck Platform for Genome Analytics, Institutes of Neurogenetics and Cardiogenetics, University of Lübeck, 23552 Lübeck, Germany; (V.A.); (C.M.L.)
| | - Fuencisla Matesanz
- Department of Cell Biology and Immunology, Instituto de Parasitología y Biomedicina López Neyra (IPBLN), CSIC, 18002 Granada, Spain;
| | - David Otaegui
- Multiple Sclerosis Group, Biodonostia Research Institute, Paseo Doctor Begiristain, s/n, 20014 San Sebastián, Spain; (D.O.); (T.C.-T.)
| | - Elena Urcelay
- Instituto de Investigación Sanitaria del Hospital Clínico San Carlos, IdISSC, 28014 Madrid, Spain; (E.U.); (L.E.-P.)
| | - Sunny Malhotra
- Servei de Neurologia-Neuroimmunologia, Centre d’Esclerosi Múltiple de Catalunya (Cemcat), Institut de Recerca Vall d’Hebron (VHIR), Hospital Universitari Vall d’Hebron, Universitat Autònoma de Barcelona, 08007 Barcelona, Spain; (S.M.); (X.M.); (M.C.)
| | - Xavier Montalban
- Servei de Neurologia-Neuroimmunologia, Centre d’Esclerosi Múltiple de Catalunya (Cemcat), Institut de Recerca Vall d’Hebron (VHIR), Hospital Universitari Vall d’Hebron, Universitat Autònoma de Barcelona, 08007 Barcelona, Spain; (S.M.); (X.M.); (M.C.)
| | - Tamara Castillo-Triviño
- Multiple Sclerosis Group, Biodonostia Research Institute, Paseo Doctor Begiristain, s/n, 20014 San Sebastián, Spain; (D.O.); (T.C.-T.)
| | - Laura Espino-Paisán
- Instituto de Investigación Sanitaria del Hospital Clínico San Carlos, IdISSC, 28014 Madrid, Spain; (E.U.); (L.E.-P.)
| | - Orhan Aktas
- Department of Neurology, Medical Faculty, Heinrich-Heine University Düsseldorf, 40225 Düsseldorf, Germany;
| | - Mathias Buttmann
- Department of Neurology, University of Wuerzburg, 97080 Wuerzburg, Germany;
- Department of Neurology, Caritas Hospital, 97980 Bad Mergentheim, Germany
| | - Andrew Chan
- Department of Neurology, Inselspital Bern, Bern University Hospital, University of Bern, 3011 Bern, Switzerland;
| | - Bertrand Fontaine
- INSERM, Sorbonne University, Assistance Publique-Hopitaux de Paris (AP-HP), UMR 974 and Neuro-Myology Service, University Hospital Pitié-Salpêtrière, 75013 Paris, France;
| | - Pierre-Antoine Gourraud
- Nantes Université, CHU, INSERM, Centre de Recherche en Transplantation et Immunologie, UMR 1064, ATIP-Avenir, Equipe 5, 44093 Nantes, France;
- CHU de Nantes, INSERM, CIC 1413, Pôle Hospitalo-Universitaire 11: Santé Publique, Clinique des données, 44000 Nantes, France
| | - Michael Hecker
- Department of Neurology, Neuroimmunological Section, University of Rostock, 18147 Rostock, Germany; (M.H.); (U.K.Z.)
| | - Sabine Hoffjan
- Department of Human Genetics, Ruhr-University Bochum, 44801 Bochum, Germany;
| | - Christian Kubisch
- Institute of Human Genetics, University Medical Center Hamburg-Eppendorf, 20246 Hamburg, Germany;
| | - Tania Kümpfel
- Institute of Clinical Neuroimmunology, Ludwig-Maximilians University, 80333 Munich, Germany;
| | - Felix Luessi
- Department of Neurology, Focus Program Translational Neuroscience, University Medical Center of the Johannes Gutenberg University Mainz, 55116 Mainz, Germany; (F.L.); (F.Z.)
| | - Uwe K. Zettl
- Department of Neurology, Neuroimmunological Section, University of Rostock, 18147 Rostock, Germany; (M.H.); (U.K.Z.)
| | - Frauke Zipp
- Department of Neurology, Focus Program Translational Neuroscience, University Medical Center of the Johannes Gutenberg University Mainz, 55116 Mainz, Germany; (F.L.); (F.Z.)
| | - Iraide Alloza
- Neurogenomiks Laboratory, University of the Basque Country (UPV/EHU), 48940 Leioa, Spain; (P.G.-F.); (A.L.d.L.P.); (I.A.); (J.M.); (A.U.); (I.A.)
- Inflammation & Biomarkers Group, Biocruces Bizkaia Health Research Institute, 48903 Barakaldo, Spain
| | - Manuel Comabella
- Servei de Neurologia-Neuroimmunologia, Centre d’Esclerosi Múltiple de Catalunya (Cemcat), Institut de Recerca Vall d’Hebron (VHIR), Hospital Universitari Vall d’Hebron, Universitat Autònoma de Barcelona, 08007 Barcelona, Spain; (S.M.); (X.M.); (M.C.)
| | - Christina M. Lill
- Genetic and Molecular Epidemiology Group, Lübeck Platform for Genome Analytics, Institutes of Neurogenetics and Cardiogenetics, University of Lübeck, 23552 Lübeck, Germany; (V.A.); (C.M.L.)
- Department of Neurology, Focus Program Translational Neuroscience, University Medical Center of the Johannes Gutenberg University Mainz, 55116 Mainz, Germany; (F.L.); (F.Z.)
- Section for Translational Surgical Oncology and Biobanking, Department of Surgery, University of Lübeck and University Medical Center Schleswig-Holstein, Campus Lübeck, 23552 Lübeck, Germany
- Ageing Epidemiology Research Unit, School of Public Health, Imperial College, London SW71, UK
| | - Koen Vandenbroeck
- Neurogenomiks Laboratory, University of the Basque Country (UPV/EHU), 48940 Leioa, Spain; (P.G.-F.); (A.L.d.L.P.); (I.A.); (J.M.); (A.U.); (I.A.)
- Inflammation & Biomarkers Group, Biocruces Bizkaia Health Research Institute, 48903 Barakaldo, Spain
- Ikerbasque, Basque Foundation for Science, 48013 Bilbao, Spain
- Correspondence: ; Tel.: +34-946182622 (ext. 844748)
|
192
|
Thirumal Kumar D, Udhaya Kumar S, Nishaat Laeeque AS, Apurva Abhay S, Bithia R, Magesh R, Kumar M, Zayed H, George Priya Doss C. Computational model to analyze and characterize the functional mutations of NOD2 protein causing inflammatory disorder – Blau syndrome. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2020; 120:379-408. [DOI: 10.1016/bs.apcsb.2019.11.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/20/2022]
|
193
|
Pandurangan AP, Blundell TL. Prediction of impacts of mutations on protein structure and interactions: SDM, a statistical approach, and mCSM, using machine learning. Protein Sci 2020; 29:247-257. [PMID: 31693276 PMCID: PMC6933854 DOI: 10.1002/pro.3774] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 08/30/2019] [Revised: 10/31/2019] [Accepted: 10/31/2019] [Indexed: 02/02/2023]
Abstract
Next-generation sequencing methods have not only allowed an understanding of genome sequence variation during the evolution of organisms but have also provided invaluable information about genetic variants in inherited disease and the emergence of resistance to drugs in cancers and infectious disease. A challenge is to distinguish mutations that are drivers of disease or drug resistance from passengers that are neutral or even selectively advantageous to the organism. This requires an understanding of the impacts of missense mutations on gene expression and regulation, and on the disruption of protein function by modulating protein stability or disturbing interactions with proteins, nucleic acids, small molecule ligands, and other biological molecules. Experimental approaches to understanding differences between wild-type and mutant proteins are most accurate but are also time-consuming and costly. Computational tools used to predict the impacts of mutations can provide useful information more quickly. Here, we focus on two widely used structure-based approaches, originally developed in the Blundell lab: site-directed mutator (SDM), a statistical approach to analyze amino acid substitutions, and mutation cutoff scanning matrix (mCSM), which uses graph-based signatures to represent the wild-type structural environment and machine learning to predict the effect of mutations on protein stability. We also describe DUET, which uses machine learning to combine the two approaches. We discuss briefly the development of mCSM for understanding the impacts of mutations on interfaces with other proteins, nucleic acids, and ligands, and we exemplify the wide application of these approaches to understand human genetic disorders and drug resistance mutations relevant to cancer and mycobacterial infections.
STATEMENT FOR A BROADER AUDIENCE: Genetic or somatic changes in genes can lead to mutations in human proteins, which give rise to genetic disorders or cancer, or to genes of pathogens leading to drug resistance. Computer software described here, using statistical approaches or machine learning, uses the information from genome sequencing of humans and pathogens, together with experimental or modeled 3D structures of gene products, the proteins, to predict impacts of mutations in genetic disease, cancer and drug resistance.
Affiliation(s)
- Arun Prasad Pandurangan
- Department of Biochemistry, University of Cambridge, Cambridge, UK
- MRC Laboratory of Molecular Biology, Cambridge, UK
| | - Tom L. Blundell
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| |
|
194
|
Heinzinger M, Elnaggar A, Wang Y, Dallago C, Nechaev D, Matthes F, Rost B. Modeling aspects of the language of life through transfer-learning protein sequences. BMC Bioinformatics 2019; 20:723. [PMID: 31847804 PMCID: PMC6918593 DOI: 10.1186/s12859-019-3220-8] [Citation(s) in RCA: 241] [Impact Index Per Article: 48.2] [Received: 05/03/2019] [Accepted: 11/13/2019] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Predicting protein function and structure from sequence is one important challenge for computational biology. For 26 years, most state-of-the-art approaches combined machine learning and evolutionary information. However, for some applications retrieving related proteins is becoming too time-consuming. Additionally, evolutionary information is less powerful for small families, e.g. for proteins from the Dark Proteome. Both these problems are addressed by the new methodology introduced here. RESULTS We introduced a novel way to represent protein sequences as continuous vectors (embeddings) by using the language model ELMo taken from natural language processing. By modeling protein sequences, ELMo effectively captured the biophysical properties of the language of life from unlabeled big data (UniRef50). We refer to these new embeddings as SeqVec (Sequence-to-Vector) and demonstrate their effectiveness by training simple neural networks for two different tasks. At the per-residue level, secondary structure (Q3 = 79% ± 1, Q8 = 68% ± 1) and regions with intrinsic disorder (MCC = 0.59 ± 0.03) were predicted significantly better than through one-hot encoding or through Word2vec-like approaches. At the per-protein level, subcellular localization was predicted in ten classes (Q10 = 68% ± 1) and membrane-bound were distinguished from water-soluble proteins (Q2 = 87% ± 1). Although SeqVec embeddings generated the best predictions from single sequences, no solution improved over the best existing method using evolutionary information. Nevertheless, our approach improved over some popular methods using evolutionary information and for some proteins even did beat the best. Thus, they prove to condense the underlying principles of protein sequences. Overall, the important novelty is speed: where the lightning-fast HHblits needed on average about two minutes to generate the evolutionary information for a target protein, SeqVec created embeddings on average in 0.03 s. 
As this speed-up is independent of the size of growing sequence databases, SeqVec provides a highly scalable approach for the analysis of big data in proteomics, i.e. microbiome or metaproteome analysis. CONCLUSION Transfer-learning succeeded to extract information from unlabeled sequence databases relevant for various protein prediction tasks. SeqVec modeled the language of life, namely the principles underlying protein sequences better than any features suggested by textbooks and prediction methods. The exception is evolutionary information, however, that information is not available on the level of a single sequence.
Affiliation(s)
- Michael Heinzinger
- Department of Informatics, Bioinformatics & Computational Biology - i12, TUM (Technical University of Munich), Boltzmannstr. 3, 85748, Garching/Munich, Germany.
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany.
| | - Ahmed Elnaggar
- Department of Informatics, Bioinformatics & Computational Biology - i12, TUM (Technical University of Munich), Boltzmannstr. 3, 85748, Garching/Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
| | - Yu Wang
- Leibniz Supercomputing Centre, Boltzmannstr. 1, 85748, Garching/Munich, Germany
| | - Christian Dallago
- Department of Informatics, Bioinformatics & Computational Biology - i12, TUM (Technical University of Munich), Boltzmannstr. 3, 85748, Garching/Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
| | - Dmitrii Nechaev
- Department of Informatics, Bioinformatics & Computational Biology - i12, TUM (Technical University of Munich), Boltzmannstr. 3, 85748, Garching/Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
| | - Florian Matthes
- TUM Department of Informatics, Software Engineering and Business Information Systems, Boltzmannstr. 1, 85748, Garching/Munich, Germany
| | - Burkhard Rost
- Department of Informatics, Bioinformatics & Computational Biology - i12, TUM (Technical University of Munich), Boltzmannstr. 3, 85748, Garching/Munich, Germany
- Institute for Advanced Study (TUM-IAS), Lichtenbergstr. 2a, 85748, Garching/Munich, Germany
- TUM School of Life Sciences Weihenstephan (WZW), Alte Akademie 8, Freising, Germany
- Department of Biochemistry and Molecular Biophysics & New York Consortium on Membrane Protein Structure (NYCOMPS), Columbia University, 701 West, 168th Street, New York, NY, 10032, USA
| |
|
195
|
Kim D, Han SK, Lee K, Kim I, Kong J, Kim S. Evolutionary coupling analysis identifies the impact of disease-associated variants at less-conserved sites. Nucleic Acids Res 2019; 47:e94. [PMID: 31199866 PMCID: PMC6895274 DOI: 10.1093/nar/gkz536] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/15/2018] [Revised: 05/03/2019] [Accepted: 06/05/2019] [Indexed: 12/20/2022] Open
Abstract
Genome-wide association studies have discovered a large number of genetic variants in human patients with the disease. Thus, predicting the impact of these variants is important for sorting disease-associated variants (DVs) from neutral variants. Current methods to predict the mutational impacts depend on evolutionary conservation at the mutation site, which is determined using homologous sequences and based on the assumption that variants at well-conserved sites have high impacts. However, many DVs at less-conserved but functionally important sites cannot be predicted by the current methods. Here, we present a method to find DVs at less-conserved sites by predicting the mutational impacts using evolutionary coupling analysis. Functionally important and evolutionarily coupled sites often have compensatory variants on cooperative sites to avoid loss of function. We found that our method identified known intolerant variants in a diverse group of proteins. Furthermore, at less-conserved sites, we identified DVs that were not identified using conservation-based methods. These newly identified DVs were frequently found at protein interaction interfaces, where species-specific mutations often alter interaction specificity. This work presents a means to identify less-conserved DVs and provides insight into the relationship between evolutionarily coupled sites and human DVs.
Affiliation(s)
- Donghyo Kim
- Department of Life Sciences, Pohang University of Science and Technology, Pohang 790-784, Korea
| | - Seong Kyu Han
- Department of Life Sciences, Pohang University of Science and Technology, Pohang 790-784, Korea
| | - Kwanghwan Lee
- Department of Life Sciences, Pohang University of Science and Technology, Pohang 790-784, Korea
| | - Inhae Kim
- Department of Life Sciences, Pohang University of Science and Technology, Pohang 790-784, Korea
| | - JungHo Kong
- Department of Life Sciences, Pohang University of Science and Technology, Pohang 790-784, Korea
| | - Sanguk Kim
- Department of Life Sciences, Pohang University of Science and Technology, Pohang 790-784, Korea
| |
|
196
|
Koohi-Moghadam M, Wang H, Wang Y, Yang X, Li H, Wang J, Sun H. Predicting disease-associated mutation of metal-binding sites in proteins using a deep learning approach. Nat Mach Intell 2019. [DOI: 10.1038/s42256-019-0119-z] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Indexed: 12/11/2022]
|
197
|
Galano-Frutos JJ, García-Cebollada H, Sancho J. Molecular dynamics simulations for genetic interpretation in protein coding regions: where we are, where to go and when. Brief Bioinform 2019; 22:3-19. [PMID: 31813950 DOI: 10.1093/bib/bbz146] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 07/23/2019] [Revised: 09/22/2019] [Accepted: 10/25/2019] [Indexed: 12/18/2022] Open
Abstract
The increasing ease with which massive genetic information can be obtained from patients or healthy individuals has stimulated the development of interpretive bioinformatics tools as aids in clinical practice. Most such tools analyze evolutionary information and simple physical-chemical properties to predict whether replacement of one amino acid residue with another will be tolerated or cause disease. Those approaches achieve up to 80-85% accuracy as binary classifiers (neutral/pathogenic). As such accuracy is insufficient for medical decision to be based on, and it does not appear to be increasing, more precise methods, such as full-atom molecular dynamics (MD) simulations in explicit solvent, are also discussed. Then, to describe the goal of interpreting human genetic variations at large scale through MD simulations, we restrictively refer to all possible protein variants carrying single-amino-acid substitutions arising from single-nucleotide variations as the human variome. We calculate its size and develop a simple model that allows calculating the simulation time needed to have a 0.99 probability of observing unfolding events of any unstable variant. The knowledge of that time enables performing a binary classification of the variants (stable-potentially neutral/unstable-pathogenic). Our model indicates that the human variome cannot be simulated with present computing capabilities. However, if they continue to increase as per Moore's law, it could be simulated (at 65°C) spending only 3 years in the task if we started in 2031. The simulation of individual protein variomes is achievable in short times starting at present. International coordination seems appropriate to embark upon massive MD simulations of protein variants.
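The waiting-time reasoning in this abstract can be illustrated with a small sketch. It assumes, purely for illustration, that unfolding of an unstable variant is a first-order (memoryless) process with rate constant k, so that the probability of observing at least one unfolding event within time t is 1 - exp(-k t). The abstract does not state the authors' actual model, and the function name and the example rate below are hypothetical.

```python
import math


def time_for_probability(rate, probability=0.99):
    """Simulation time needed to observe an unfolding event with the given
    probability, ASSUMING first-order kinetics: P(t) = 1 - exp(-rate * t).

    `rate` is a hypothetical unfolding rate constant (events per unit time);
    the returned time is in the reciprocal unit.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not 0 < probability < 1:
        raise ValueError("probability must lie strictly between 0 and 1")
    # Solve 1 - exp(-rate * t) = probability for t.
    return -math.log(1.0 - probability) / rate


# Hypothetical example: an unfolding rate of 0.01 per ns.
required_ns = time_for_probability(0.01)
print(f"simulate for about {required_ns:.0f} ns")
```

Under this assumption the required time is ln(100)/k, roughly 4.6 mean unfolding times, which is why slowly unfolding variants dominate the total cost.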
Affiliation(s)
- Juan J Galano-Frutos
- Protein Folding and Molecular Design (ProtMol) group at BIFI, University of Zaragoza
| | | | - Javier Sancho
- Protein Folding and Molecular Design (ProtMol) group at BIFI, University of Zaragoza
| |
|
198
|
Bouafi H, Bencheikh S, Mehdi Krami AL, Morjane I, Charoute H, Rouba H, Saile R, Benhnini F, Barakat A. Prediction and Structural Comparison of Deleterious Coding Nonsynonymous Single Nucleotide Polymorphisms (nsSNPs) in Human LEP Gene Associated with Obesity. Biomed Res Int 2019; 2019:1832084. [PMID: 31871931 PMCID: PMC6913293 DOI: 10.1155/2019/1832084] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/14/2019] [Revised: 07/25/2019] [Accepted: 08/14/2019] [Indexed: 12/18/2022]
Abstract
Leptin is a peptide hormone that regulates fat stores in the body and appetite by controlling the feeling of satiety. This hormone is secreted by the white adipose tissue and plays a role in the storage and mobilization of fatty acids. Mutations of the LEP gene have been associated with obesity in different populations; obesity is a multifactorial disease that constitutes a major public health problem. In this study, we evaluated the impact of missense SNPs in the LEP gene extracted from dbSNP using 8 computational prediction tools. Out of the total of 4337 SNPs, 93 were nsSNPs (nonsynonymous single nucleotide polymorphisms). Among the 93 nsSNPs, 12 variants (S46L, G59S, D61N, D100N, N103K, C117S, D76V, S88C, P90R, I95N, L161R, and R105W) were predicted to be the most deleterious by prediction software. Of these 12 deleterious SNPs, 8 variants (S46L, G59S, D61N, D100N, N103K, C117S, L161R, and R105W) were located at conserved positions and showed a decrease in structural stability, as evaluated by I-Mutant and MUpro. Then, by analyzing the interactions between amino acids in the wild-type and mutated proteins, we assessed the structural impact of the deleterious modifications using the YASARA software. Among the 8 deleterious nsSNPs, we found structural changes in 6 variants (S46L, G59S, D100N, N103K, R105W, and L161R), two of which (R105W and N103K) were previously reported as associated with obesity. Our study suggests that these 6 deleterious mutations could play an important role in human obesity, are worth including in association and functional studies, and may be drug targets.
Affiliation(s)
- Hind Bouafi
- Laboratoire de Génomique et Génétique Humaine, Institut Pasteur du Maroc, Casablanca, Morocco
- Laboratoire Biologie et Santé, Centre de Recherche Santé et Biotechnologie, Faculté des Sciences Ben M'Sik, Hassan II University of Casablanca, Morocco
| | - Sara Bencheikh
- Laboratoire de Génomique et Génétique Humaine, Institut Pasteur du Maroc, Casablanca, Morocco
| | - AL Mehdi Krami
- Laboratoire de Génomique et Génétique Humaine, Institut Pasteur du Maroc, Casablanca, Morocco
| | - Imane Morjane
- Laboratoire de Génomique et Génétique Humaine, Institut Pasteur du Maroc, Casablanca, Morocco
| | - Hicham Charoute
- Laboratoire de Génomique et Génétique Humaine, Institut Pasteur du Maroc, Casablanca, Morocco
| | - Hassan Rouba
- Laboratoire de Génomique et Génétique Humaine, Institut Pasteur du Maroc, Casablanca, Morocco
| | - Rachid Saile
- Laboratoire Biologie et Santé, Centre de Recherche Santé et Biotechnologie, Faculté des Sciences Ben M'Sik, Hassan II University of Casablanca, Morocco
| | - Fouad Benhnini
- Laboratoire de Signalisation cellulaire, Faculté des Sciences Meknès, Université Moulay Ismail, Morocco
| | - Abdelhamid Barakat
- Laboratoire de Génomique et Génétique Humaine, Institut Pasteur du Maroc, Casablanca, Morocco
| |
|
199
|
Tanwar H, Kumar DT, Doss CGP, Zayed H. Bioinformatics classification of mutations in patients with Mucopolysaccharidosis IIIA. Metab Brain Dis 2019; 34:1577-1594. [PMID: 31385193 PMCID: PMC6858298 DOI: 10.1007/s11011-019-00465-6] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 04/30/2019] [Accepted: 07/08/2019] [Indexed: 02/06/2023]
Abstract
Mucopolysaccharidosis (MPS) IIIA, also known as Sanfilippo syndrome type A, is a severe, progressive disease that affects the central nervous system (CNS). MPS IIIA is inherited in an autosomal recessive manner and is caused by a deficiency in the lysosomal enzyme sulfamidase, which is required for the degradation of heparan sulfate. The sulfamidase is produced by the N-sulphoglucosamine sulphohydrolase (SGSH) gene. In MPS IIIA patients, the excess of lysosomal storage of heparan sulfate often leads to mental retardation, hyperactive behavior, and connective tissue impairments, which occur due to various known missense mutations in the SGSH, leading to protein dysfunction. In this study, we focused on three mutations (R74C, S66W, and R245H) based on in silico pathogenic, conservation, and stability prediction tool studies. The three mutations were further subjected to molecular dynamic simulation (MDS) analysis using GROMACS simulation software to observe the structural changes they induced, and all the mutants exhibited maximum deviation patterns compared with the native protein. Conformational changes were observed in the mutants based on various geometrical parameters, such as conformational stability, fluctuation, and compactness, followed by hydrogen bonding, physicochemical properties, principal component analysis (PCA), and salt bridge analyses, which further validated the underlying cause of the protein instability. Additionally, secondary structure and surrounding amino acid analyses further confirmed the above results indicating the loss of protein function in the mutants compared with the native protein. The present results reveal the effects of three mutations on the enzymatic activity of sulfamidase, providing a molecular explanation for the cause of the disease. 
Thus, this study allows for a better understanding of the effect of SGSH mutations through the use of various computational approaches in terms of both structure and functions and provides a platform for the development of therapeutic drugs and potential disease treatments.
Affiliation(s)
- Himani Tanwar
- Department of Integrative Biology, School of Bio Sciences and Technology, Vellore Institute of Technology, Vellore, Tamil Nadu, 632014, India
| | - D Thirumal Kumar
- Department of Integrative Biology, School of Bio Sciences and Technology, Vellore Institute of Technology, Vellore, Tamil Nadu, 632014, India
| | - C George Priya Doss
- Department of Integrative Biology, School of Bio Sciences and Technology, Vellore Institute of Technology, Vellore, Tamil Nadu, 632014, India.
| | - Hatem Zayed
- Department of Biomedical Sciences, College of Health and Sciences, Qatar University, Doha, Qatar.
| |
|
200
|
Michels M, Matte U, Fraga LR, Mancuso ACB, Ligabue-Braun R, Berneira EFR, Siebert M, Sanseverino MTV. Determining the pathogenicity of CFTR missense variants: Multiple comparisons of in silico predictors and variant annotation databases. Genet Mol Biol 2019; 42:560-570. [PMID: 31808782 PMCID: PMC6905453 DOI: 10.1590/1678-4685-gmb-2018-0148] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/14/2018] [Accepted: 12/12/2018] [Indexed: 01/07/2023] Open
Abstract
Pathogenic variants in the Cystic Fibrosis Transmembrane Conductance Regulator gene (CFTR) are responsible for cystic fibrosis (CF), the commonest monogenic autosomal recessive disease, and CFTR-related disorders in infants and youth. Diagnosis of such diseases relies on clinical, functional, and molecular studies. To date, over 2,000 variants have been described on CFTR (~40% missense). Since few of them have confirmed pathogenicity, in silico analysis could help molecular diagnosis and genetic counseling. Here, the pathogenicity of 779 CFTR missense variants was predicted by consensus predictor PredictSNP and compared to annotations on CFTR2 and ClinVar. Sensitivity and specificity analysis was divided into modeling and validation phases using just variants annotated on CFTR2 and/or ClinVar that were not in the validation datasets of the analyzed predictors. After validation phase, MAPP and PhDSNP achieved maximum specificity but low sensitivity. Otherwise, SNAP had maximum sensitivity but null specificity. PredictSNP, PolyPhen-1, PolyPhen-2, SIFT, nsSNPAnalyzer had either low sensitivity or specificity, or both. Results showed that most predictors were not reliable when analyzing CFTR missense variants, ratifying the importance of clinical information when asserting the pathogenicity of CFTR missense variants. Our results should contribute to clarify decision making when classifying the pathogenicity of CFTR missense variants.
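The sensitivity and specificity comparison described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function name and the toy labels are hypothetical, with True standing for "pathogenic" in both the predictor output and the reference annotation.

```python
def sensitivity_specificity(predicted, annotated):
    """Return (sensitivity, specificity) for parallel lists of booleans.

    predicted[i] is the predictor's call for variant i and annotated[i] is
    the reference annotation (True = pathogenic, False = benign).
    """
    pairs = list(zip(predicted, annotated))
    tp = sum(1 for p, a in pairs if p and a)          # true positives
    tn = sum(1 for p, a in pairs if not p and not a)  # true negatives
    fp = sum(1 for p, a in pairs if p and not a)      # false positives
    fn = sum(1 for p, a in pairs if not p and a)      # false negatives
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy data: a predictor that calls every variant pathogenic.
annotated = [True, True, False, False]
always_pathogenic = [True, True, True, True]
print(sensitivity_specificity(always_pathogenic, annotated))
```

A predictor that labels every variant pathogenic reaches maximum sensitivity with zero specificity, the pattern the abstract reports for SNAP; both numbers are needed to judge a tool.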
Affiliation(s)
- Marcus Michels
- Programa de Pós-Graduação em Genética e Biologia Molecular, Departamento de Genética, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, RS, Brazil
| | - Ursula Matte
- Programa de Pós-Graduação em Genética e Biologia Molecular, Departamento de Genética, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, RS, Brazil; Centro de Pesquisa Experimental, Hospital de Clínicas de Porto Alegre (HCPA), Porto Alegre, RS, Brazil
| | - Lucas Rosa Fraga
- Departamento de Ciências Morfológicas, Instituto de Ciências Básicas da Saúde, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
| | | | - Rodrigo Ligabue-Braun
- Programa de Pós-Graduação em Biologia Celular e Molecular, Centro de Biotecnologia, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
| | | | - Marina Siebert
- Centro de Pesquisa Experimental, Hospital de Clínicas de Porto Alegre (HCPA), Porto Alegre, RS, Brazil; Programa de Pós-Graduação Ciências em Gastroenterologia e Hepatologia, Faculdade de Medicina, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
| | - Maria Teresa Vieira Sanseverino
- Programa de Pós-Graduação em Genética e Biologia Molecular, Departamento de Genética, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, RS, Brazil; Serviço de Genética Médica, Hospital de Clínicas de Porto Alegre, Porto Alegre, RS, Brazil; Escola de Medicina, Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS), Porto Alegre, RS, Brazil
| |
|