1
|
Skvortsova L, Perfilyeva A, Bespalova K, Kuzovleva Y, Kabysheva N, Khamdiyeva O. 7p22.3 microdeletion: a case study of a patient with congenital heart defect, neurodevelopmental delay and epilepsy. Orphanet J Rare Dis 2024; 19:301. [PMID: 39152504 PMCID: PMC11330011 DOI: 10.1186/s13023-024-03321-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Accepted: 08/08/2024] [Indexed: 08/19/2024] Open
Abstract
BACKGROUND Chromosome 7 has regions enriched with low copy repeats (LCRs), which increase the likelihood of chromosomal microdeletion disorders. Documented microdeletion disorders on chromosome 7 include the well-known Williams syndrome as well as rarer cases. Notably, most of these microdeletions present with phenotypic signs of neuropsychological developmental disorders, even though their genetic origins differ. The location of a microdeletion, the genes within the region, and the structural features of those genes' sequences jointly shape the phenotype and the severity of the disorder in each individual case. Detailed analysis of these features is therefore important for a correct and comprehensive assessment of the disease. RESULTS The article describes a clinical case of 7p22.3 microdeletion in a patient with a congenital heart defect and neurological abnormalities: epilepsy combined with moderate mental and motor developmental delay. CONCLUSIONS Through detailed genetic analyses, we refine the clinical description of the rare 7p22.3 microdeletion and thus create a basis for future genetic counseling and research into targeted therapies.
Affiliation(s)
- Liliya Skvortsova
- Laboratory of Molecular Genetics, Institute of Genetics and Physiology, Almaty, 050060, Kazakhstan
| | - Anastassiya Perfilyeva
- Laboratory of Molecular Genetics, Institute of Genetics and Physiology, Almaty, 050060, Kazakhstan
| | - Kira Bespalova
- Laboratory of Molecular Genetics, Institute of Genetics and Physiology, Almaty, 050060, Kazakhstan.
- Department of Molecular Biology and Genetics, Al-Farabi Kazakh National University, Almaty, 050040, Kazakhstan.
| | - Yelena Kuzovleva
- Laboratory of Molecular Genetics, Institute of Genetics and Physiology, Almaty, 050060, Kazakhstan
| | - Nailya Kabysheva
- Laboratory of Molecular Genetics, Institute of Genetics and Physiology, Almaty, 050060, Kazakhstan
| | - Ozada Khamdiyeva
- Laboratory of Molecular Genetics, Institute of Genetics and Physiology, Almaty, 050060, Kazakhstan
- Department of Molecular Biology and Genetics, Al-Farabi Kazakh National University, Almaty, 050040, Kazakhstan
| |
|
2
|
Bromberg Y, Prabakaran R, Kabir A, Shehu A. Variant Effect Prediction in the Age of Machine Learning. Cold Spring Harb Perspect Biol 2024; 16:a041467. [PMID: 38621825 PMCID: PMC11216171 DOI: 10.1101/cshperspect.a041467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2024]
Abstract
Over the years, many computational methods have been created for the analysis of the impact of single amino acid substitutions resulting from single-nucleotide variants in genome coding regions. Historically, all methods have been supervised and thus limited by the inadequate sizes of experimentally curated data sets and by the lack of a standardized definition of variant effect. The emergence of unsupervised, deep learning (DL)-based methods raised an important question: Can machines learn the language of life from the unannotated protein sequence data well enough to identify significant errors in the protein "sentences"? Our analysis suggests that some unsupervised methods perform as well as or better than existing supervised methods. Unsupervised methods are also faster and can, thus, be useful in large-scale variant evaluations. For all other methods, however, performance varies both by evaluation metric and by the type of variant effect being predicted. We also note that the evaluation of method performance is still lacking on less-studied, nonhuman proteins where unsupervised methods hold the most promise.
Affiliation(s)
- Yana Bromberg
- Department of Biology, Emory University, Atlanta 30322, Georgia, USA
- Department of Computer Science, Emory University, Atlanta 30322, Georgia, USA
| | - R Prabakaran
- Department of Biology, Emory University, Atlanta 30322, Georgia, USA
| | - Anowarul Kabir
- Department of Computer Science, George Mason University, Fairfax 22030, Virginia, USA
| | - Amarda Shehu
- Department of Computer Science, George Mason University, Fairfax 22030, Virginia, USA
| |
|
3
|
Younis A, Bodurian C, Arking DE, Bragazzi NL, Tabaja C, Zareba W, McNitt S, Aktas MK, Polonsky B, Lopes CM, Sotoodehnia N, Kudenchuk PJ, Goldenberg I. Genetic variant annotation scores in congenital long QT syndrome. Ann Noninvasive Electrocardiol 2023; 28:e13080. [PMID: 37571804 PMCID: PMC10475886 DOI: 10.1111/anec.13080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2023] [Revised: 06/20/2023] [Accepted: 07/28/2023] [Indexed: 08/13/2023] Open
Abstract
BACKGROUND Congenital Long QT Syndrome (LQTS) is a hereditary arrhythmic disorder. We aimed to assess the performance of current genetic variant annotation scores among LQTS patients and their predictive impact. METHODS We evaluated 2025 patients with unique mutations for LQT1-LQT3. A patient-specific score was calculated for each of four established genetic variant annotation algorithms: CADD, SIFT, REVEL, and PolyPhen-2. The scores were tested for the identification of LQTS and their predictive performance for cardiac events (CE) and life-threatening events (LTE) and then compared with the predictive performance of LQTS categorization based on mutation location/function. Score performance was tested using Harrell's C-index. RESULTS A total of 917 subjects were classified as LQT1, 838 as LQT2, and 270 as LQT3. The identification of a pathogenic variant occurred in 99% with CADD, 92% with SIFT, 100% with REVEL, and 86% with PolyPhen-2. However, none of the genetic scores correlated with the risk of CE (Harrell's C-index: CADD = 0.50, SIFT = 0.51, REVEL = 0.50, and PolyPhen-2 = 0.52) or LTE (Harrell's C-index: CADD = 0.50, SIFT = 0.53, REVEL = 0.54, and PolyPhen-2 = 0.52). In contrast, high-risk mutation categorization based on location/function was a powerful independent predictor of CE (HR = 1.88; p < .001) and LTE (HR = 1.89, p < .001). CONCLUSION In congenital LQTS patients, well-established algorithms (CADD, SIFT, REVEL, and PolyPhen-2) were able to identify the majority of the causal variants as pathogenic. However, the scores did not predict clinical outcomes. These results indicate that mutation location/functional assays are essential for accurate interpretation of the risk associated with LQTS mutations.
Affiliation(s)
- Arwa Younis
- Clinical Cardiovascular Research Center, University of Rochester Medical Center, Rochester, New York, USA
- Department of Cardiovascular Medicine, Cleveland Clinic, Cleveland, Ohio, USA
| | - Christopher Bodurian
- Clinical Cardiovascular Research Center, University of Rochester Medical Center, Rochester, New York, USA
| | - Dan E. Arking
- Department of Genetic Medicine, McKusick-Nathans Institute, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
| | - Nicola Luigi Bragazzi
- Laboratory for Industrial and Applied Mathematics, Center for Disease Modelling, York University, Toronto, Ontario, Canada
| | - Chadi Tabaja
- Department of Cardiovascular Medicine, Cleveland Clinic, Cleveland, Ohio, USA
| | - Wojciech Zareba
- Clinical Cardiovascular Research Center, University of Rochester Medical Center, Rochester, New York, USA
| | - Scott McNitt
- Clinical Cardiovascular Research Center, University of Rochester Medical Center, Rochester, New York, USA
| | - Mehmet K. Aktas
- Clinical Cardiovascular Research Center, University of Rochester Medical Center, Rochester, New York, USA
| | - Bronislava Polonsky
- Clinical Cardiovascular Research Center, University of Rochester Medical Center, Rochester, New York, USA
| | - Coeli M. Lopes
- Clinical Cardiovascular Research Center, University of Rochester Medical Center, Rochester, New York, USA
| | - Nona Sotoodehnia
- Division of Cardiology, Department of Medicine, University of Washington, Seattle, Washington, USA
| | - Peter J. Kudenchuk
- Division of Cardiology, Department of Medicine, University of Washington, Seattle, Washington, USA
| | - Ilan Goldenberg
- Clinical Cardiovascular Research Center, University of Rochester Medical Center, Rochester, New York, USA
| |
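The LQTS abstract above (entry 3) reports Harrell's C-index values close to 0.5 for all four annotation scores, which means the scores ranked patients by event risk no better than chance. As a rough illustration of what that statistic measures, here is a minimal sketch of Harrell's C-index for right-censored data. The function name, the toy data, and the decision to skip pairs with tied follow-up times are assumptions made for this example; none of it comes from the cited study.

```python
def harrell_c_index(times, events, scores):
    """Fraction of comparable patient pairs that a risk score orders correctly.

    times  : follow-up time for each patient
    events : 1 if the event was observed, 0 if the patient was censored
    scores : predicted risk (higher score = earlier event expected)

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. Pairs with tied times are
    skipped in this simplified sketch. Tied scores count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one is censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
scores = [0.9, 0.7, 0.5, 0.1]
c = harrell_c_index(times, events, scores)
print(f"C-index = {c:.2f}")
```

A score that assigns every patient the same value yields exactly 0.5 under this definition, which is the baseline against which the reported values of 0.50 to 0.54 can be read.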
|
4
|
The Power of Clinical Diagnosis for Deciphering Complex Genetic Mechanisms in Rare Diseases. Genes (Basel) 2023; 14:genes14010196. [PMID: 36672937 PMCID: PMC9858967 DOI: 10.3390/genes14010196] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/15/2022] [Revised: 01/05/2023] [Accepted: 01/09/2023] [Indexed: 01/13/2023] Open
Abstract
Complex genetic disease mechanisms, such as structural or non-coding variants, currently pose a substantial difficulty in frontline diagnostic tests. They thus may account for most unsolved rare disease patients regardless of the clinical phenotype. However, the clinical diagnosis can narrow the genetic focus to just a couple of genes for patients with well-established syndromes defined by prominent physical and/or unique biochemical phenotypes, allowing deeper analyses to consider complex genetic origin. Then, clinical-diagnosis-driven genome sequencing strategies may expedite the development of testing and analytical methods to account for complex disease mechanisms as well as to advance functional assays for the confirmation of complex variants, clinical management, and the development of new therapies.
|
5
|
Aqil A, Speidel L, Pavlidis P, Gokcumen O. Balancing selection on genomic deletion polymorphisms in humans. eLife 2023; 12:79111. [PMID: 36625544 PMCID: PMC9943071 DOI: 10.7554/elife.79111] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 03/31/2022] [Accepted: 01/05/2023] [Indexed: 01/11/2023] Open
Abstract
A key question in biology is why genomic variation persists in a population for extended periods. Recent studies have identified examples of genomic deletions that have remained polymorphic in the human lineage for hundreds of millennia, ostensibly owing to balancing selection. Nevertheless, genome-wide investigation of ancient and possibly adaptive deletions remains an imperative exercise. Here, we demonstrate an excess of polymorphisms in present-day humans that predate the modern human-Neanderthal split (ancient polymorphisms), which cannot be explained solely by selectively neutral scenarios. We analyze the adaptive mechanisms that underlie this excess in deletion polymorphisms. Using a previously published measure of balancing selection, we show that this excess of ancient deletions is largely owing to balancing selection. Based on the absence of signatures of overdominance, we conclude that it is a rare mode of balancing selection among ancient deletions. Instead, more complex scenarios involving spatially and temporally variable selective pressures are likely more common mechanisms. Our results suggest that balancing selection resulted in ancient deletions harboring disproportionately more exonic variants with GWAS (genome-wide association studies) associations. We further found that ancient deletions are significantly enriched for traits related to metabolism and immunity. As a by-product of our analysis, we show that deletions are, on average, more deleterious than single nucleotide variants. We can now argue that not only is a vast majority of common variants shared among human populations, but a considerable portion of biologically relevant variants has been segregating among our ancestors for hundreds of thousands, if not millions, of years.
Affiliation(s)
- Alber Aqil
- Department of Biological Sciences, University at Buffalo, Buffalo, United States
| | - Leo Speidel
- University College London, Genetics Institute, London, United Kingdom
- The Francis Crick Institute, London, United Kingdom
| | - Pavlos Pavlidis
- Institute of Computer Science (ICS), Foundation of Research and Technology-Hellas, Heraklion, Greece
| | - Omer Gokcumen
- Department of Biological Sciences, University at Buffalo, Buffalo, United States
| |
|
6
|
Sobahy TM, Motwalli O, Alazmi M. AllelePred: A Simple Allele Frequencies Ensemble Predictor for Different Single Nucleotide Variants. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:796-801. [PMID: 35239491 DOI: 10.1109/tcbb.2022.3155659] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
BACKGROUND & OBJECTIVE Genomic medicine stands to be revolutionized by understanding single nucleotide variants (SNVs) and their expression in single-gene disorders (Mendelian diseases). Computational tools can play a vital role in the exploration of such variations and their pathogenicity. Consequently, we developed the ensemble prediction tool AllelePred to identify deleterious SNVs and disease-causative genes. RESULTS The model utilizes different population genetics backgrounds and restricted criteria for feature selection to help generate high-accuracy results. In comparison to other tools, such as Eigen, PROVEAN, and fathmm-MKL, our classifier achieves higher accuracy (98%), precision (96%), F1 score (93%), and coverage (100%) for different types of coding variants. The new method was also compared against a bioinformatics analytical workflow, which uses gnomAD overall AFs (less than 1%) and CADD (scaled C-score of at least 15). Furthermore, this research highlights the importance of genetic variant sharing and curation. We accumulated a list of highly probable deleterious variants and recommended further experimental validation before medical diagnostic usage. CONCLUSIONS The ensemble prediction tool AllelePred enables increased accuracy in recognizing deleterious SNVs and the genetic determinants in real clinical data.
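The AllelePred abstract above compares the tool against a baseline workflow that keeps variants with a gnomAD overall allele frequency below 1% and a CADD scaled C-score of at least 15. The following is a minimal sketch of such a threshold filter. The `Variant` record, its field names, and the example values are hypothetical and chosen only for illustration; the abstract does not describe how the workflow was actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    variant_id: str    # hypothetical identifier, not a real variant
    gnomad_af: float   # gnomAD overall allele frequency, as a fraction
    cadd_phred: float  # CADD scaled C-score


AF_MAX = 0.01    # "less than 1%" from the abstract
CADD_MIN = 15.0  # "scaled C-score of at least 15" from the abstract


def passes_baseline_filter(v: Variant) -> bool:
    """Keep rare variants (AF < 1%) that CADD scores as likely deleterious (>= 15)."""
    return v.gnomad_af < AF_MAX and v.cadd_phred >= CADD_MIN


# Made-up example records covering each outcome of the two thresholds.
variants = [
    Variant("var_a", 0.0002, 24.1),  # rare, high CADD: kept
    Variant("var_b", 0.15, 27.3),    # common: dropped
    Variant("var_c", 0.004, 9.8),    # rare, low CADD: dropped
    Variant("var_d", 0.0099, 15.0),  # just inside both thresholds: kept
]

candidates = [v.variant_id for v in variants if passes_baseline_filter(v)]
print(candidates)
```

The frequency cutoff is strict and the CADD cutoff is inclusive, following the wording of the abstract ("less than 1%", "at least 15").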
|
7
|
Zeng Z, Aptekmann AA, Bromberg Y. Decoding the effects of synonymous variants. Nucleic Acids Res 2021; 49:12673-12691. [PMID: 34850938 PMCID: PMC8682775 DOI: 10.1093/nar/gkab1159] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 06/23/2021] [Revised: 11/02/2021] [Accepted: 11/08/2021] [Indexed: 12/12/2022] Open
Abstract
Synonymous single nucleotide variants (sSNVs) are common in the human genome but are often overlooked. However, sSNVs can have significant biological impact and may lead to disease. Existing computational methods for evaluating the effect of sSNVs suffer from the lack of gold-standard training/evaluation data and exhibit over-reliance on sequence conservation signals. We developed synVep (synonymous Variant effect predictor), a machine learning-based method that overcomes both of these limitations. Our training data was a combination of variants reported by gnomAD (observed) and those unreported, but possible in the human genome (generated). We used positive-unlabeled learning to purify the generated variant set of any likely unobservable variants. We then trained two sequential extreme gradient boosting models to identify subsets of the remaining variants putatively enriched and depleted in effect. Our method attained 90% precision/recall on a previously unseen set of variants. Furthermore, although synVep does not explicitly use conservation, its scores correlated with evolutionary distances between orthologs in cross-species variation analysis. synVep was also able to differentiate pathogenic vs. benign variants, as well as splice-site disrupting variants (SDV) vs. non-SDVs. Thus, synVep provides an important improvement in annotation of sSNVs, allowing users to focus on variants that most likely harbor effects.
Affiliation(s)
- Zishuo Zeng
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ 08873, USA
| | - Ariel A Aptekmann
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ 08873, USA
| | - Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ 08873, USA
- Department of Genetics, Rutgers University, Piscataway, NJ 08854, USA
| |
|
8
|
Rossanti R, Horinouchi T, Yamamura T, Nagano C, Sakakibara N, Ishiko S, Aoto Y, Kondo A, Nagai S, Okada E, Ishimori S, Nagase H, Matsui S, Tamagaki K, Ubara Y, Nagahama M, Shima Y, Nakanishi K, Ninchoji T, Matsuo M, Iijima K, Nozu K. Evaluation of Suspected Autosomal Alport Syndrome Synonymous Variants. Kidney360 2021; 3:497-505. [PMID: 35582193 PMCID: PMC9034806 DOI: 10.34067/kid.0005252021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2021] [Accepted: 10/11/2021] [Indexed: 01/10/2023]
Abstract
Background Alport syndrome is an inherited disorder characterized by progressive renal disease, variable sensorineural hearing loss, and ocular abnormalities. Although many pathogenic variants in COL4A3 and COL4A4 have been identified in patients with autosomal Alport syndrome, synonymous mutations in these genes have rarely been identified. Methods We conducted in silico splicing analysis using Human Splicing Finder (HSF) and Alamut to predict splicing domain strength and disruption of the sites. Furthermore, we performed in vitro splicing assays using minigene constructs and mRNA analysis of patient samples to determine the pathogenicity of four synonymous variants detected in four patients with suspected autosomal dominant Alport syndrome (COL4A3 [c.693G>A (p.Val231=)] and COL4A4 [c.1353C>T (p.Gly451=), c.735G>A (p.Pro245=), and c.870G>A (p.Lys290=)]). Results Both in vivo and in vitro splicing assays showed exon skipping in two out of the four synonymous variants identified (c.735G>A and c.870G>A in COL4A4). Prediction analysis of wild-type and mutated COL4A4 sequences using HSF and Alamut suggested these two variants may lead to the loss of binding sites for several splicing factors, e.g., in acceptor sites and exonic splicing enhancers. The other two variants did not induce aberrant splicing. Conclusions This study highlights the pitfalls of classifying the functional consequences of variants by a simple approach. Certain synonymous variants, although they do not alter the amino acid sequence of the encoded protein, can dramatically affect pre-mRNA splicing, as shown in two of our patients. Our findings indicate that transcript analysis should be carried out to evaluate synonymous variants detected in patients with autosomal dominant Alport syndrome.
Affiliation(s)
- Rini Rossanti
- Department of Pediatrics, Kobe University Graduate School of Medicine, Kobe, Japan
- Department of Child Health, Nephrology Division, Dr. Hasan Sadikin General Hospital/Faculty of Medicine, Universitas Padjadjaran, Bandung, Indonesia
| | - Tomoko Horinouchi
- Department of Pediatrics, Kobe University Graduate School of Medicine, Kobe, Japan
| | - Tomohiko Yamamura
- Department of Pediatrics, Kobe University Graduate School of Medicine, Kobe, Japan
| | - China Nagano
- Department of Pediatrics, Kobe University Graduate School of Medicine, Kobe, Japan
| | - Nana Sakakibara
- Department of Pediatrics, Kobe University Graduate School of Medicine, Kobe, Japan
| | - Shinya Ishiko
- Department of Pediatrics, Kobe University Graduate School of Medicine, Kobe, Japan
| | - Yuya Aoto
- Department of Pediatrics, Kobe University Graduate School of Medicine, Kobe, Japan
| | - Atsushi Kondo
- Department of Pediatrics, Kobe University Graduate School of Medicine, Kobe, Japan
| | - Sadayuki Nagai
- Department of Pediatrics, Kobe University Graduate School of Medicine, Kobe, Japan
| | - Eri Okada
- Department of Pediatrics, Kobe University Graduate School of Medicine, Kobe, Japan
| | - Shingo Ishimori
- Department of Pediatrics, Kobe University Graduate School of Medicine, Kobe, Japan
| | - Hiroaki Nagase
- Department of Pediatrics, Kobe University Graduate School of Medicine, Kobe, Japan
| | - Satoshi Matsui
- Department of Nephrology and Hypertension, Mitsubishi Kyoto Hospital, Kyoto, Japan
| | - Keiichi Tamagaki
- Department of Nephrology, Kyoto Prefectural University of Medicine, Kyoto, Japan
| | - Yoshifumi Ubara
- Nephrology Center, Okinaka Memorial Institute for Medical Research, Tokyo, Japan
| | | | - Yuko Shima
- Department of Pediatrics, Wakayama Medical University, Wakayama, Japan
| | - Koichi Nakanishi
- Department of Child Health and Welfare (Pediatrics), Graduate School of Medicine, University of the Ryukyus, Okinawa, Japan
| | - Takeshi Ninchoji
- Department of Pediatrics, Kobe University Graduate School of Medicine, Kobe, Japan
| | - Masafumi Matsuo
- Research Center for Locomotion Biology, Kobe Gakuin University, Kobe, Japan
| | - Kazumoto Iijima
- Hyogo Prefectural Kobe Children’s Hospital, Kobe, Japan
- Department of Advanced Pediatric Medicine, Kobe University Graduate School of Medicine, Kobe, Japan
| | - Kandai Nozu
- Department of Pediatrics, Kobe University Graduate School of Medicine, Kobe, Japan
| |
|
9
|
Mahlich Y, Miller M, Zeng Z, Bromberg Y. Low Diversity of Human Variation Despite Mostly Mild Functional Impact of De Novo Variants. Front Mol Biosci 2021; 8:635382. [PMID: 33816556 PMCID: PMC8012514 DOI: 10.3389/fmolb.2021.635382] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/30/2020] [Accepted: 02/01/2021] [Indexed: 01/07/2023] Open
Abstract
Non-synonymous Single Nucleotide Variants (nsSNVs), resulting in single amino acid variants (SAVs), are important drivers of evolutionary adaptation across the tree of life. Humans carry on average over 10,000 SAVs per individual genome, many of which likely have little to no impact on the function of the protein they affect. Experimental evidence for protein function changes as a result of SAVs remains sparse – a situation that can be somewhat alleviated by predicting their impact using computational methods. Here, we used SNAP to examine both observed and in silico generated human variation in a set of 1,265 proteins that are consistently found across a number of diverse species. The number of SAVs that are predicted to have any functional effect on these proteins is smaller than expected, suggesting sequence/function optimization over evolutionary timescales. Additionally, we find that only a few of the yet-unobserved SAVs could drastically change the function of these proteins, while nearly a quarter would have only a mild functional effect. We observed that variants common in the human population localized to less conserved protein positions and carried mild to moderate functional effects more frequently than rare variants. As expected, rare variants carried severe effects more frequently than common variants. In line with current assumptions, we demonstrated that the change of the human reference sequence amino acid to the reference of another species (a cross-species variant) is unlikely to significantly impact protein function. However, we also observed that many cross-species variants may be weakly non-neutral for the purposes of quick adaptation to environmental changes, but may not be identified as such by current state-of-the-art methodology.
Affiliation(s)
- Yannick Mahlich
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ, United States
| | - Maximilian Miller
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ, United States
| | - Zishuo Zeng
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ, United States
| | - Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ, United States
- Department of Genetics, Rutgers University, Piscataway, NJ, United States
| |
|
10
|
Qiu J, Nechaev D, Rost B. Protein-protein and protein-nucleic acid binding residues important for common and rare sequence variants in human. BMC Bioinformatics 2020; 21:452. [PMID: 33050876 PMCID: PMC7557062 DOI: 10.1186/s12859-020-03759-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/01/2020] [Accepted: 09/16/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Any two unrelated people differ by about 20,000 missense mutations (also referred to as SAVs: Single Amino acid Variants or missense SNV). Many SAVs have been predicted to strongly affect molecular protein function. Common SAVs (> 5% of population) were predicted to have, on average, more effect on molecular protein function than rare SAVs (< 1% of population). We hypothesized that the prevalence of effect in common over rare SAVs might partially be caused by common SAVs more often occurring at interfaces of proteins with other proteins, DNA, or RNA, thereby creating subgroup-specific phenotypes. We analyzed SAVs from 60,706 people through the lens of two prediction methods, one (SNAP2) predicting the effects of SAVs on molecular protein function, the other (ProNA2020) predicting residues in DNA-, RNA- and protein-binding interfaces. RESULTS Three results stood out. Firstly, SAVs predicted to occur at binding interfaces were predicted to more likely affect molecular function than those predicted as not binding (p value < 2.2 × 10⁻¹⁶). Secondly, for SAVs predicted to occur at binding interfaces, common SAVs were predicted more strongly with effect on protein function than rare SAVs (p value < 2.2 × 10⁻¹⁶). Restriction to SAVs with experimental annotations confirmed all results, although the resulting subsets were too small to establish statistical significance for any result. Thirdly, the fraction of SAVs predicted at binding interfaces differed significantly between tissues, e.g. urinary bladder tissue was found abundant in SAVs predicted at protein-binding interfaces, and reproductive tissues (ovary, testis, vagina, seminal vesicle and endometrium) in SAVs predicted at DNA-binding interfaces. CONCLUSIONS Overall, the results suggested that residues at protein-, DNA-, and RNA-binding interfaces contributed toward predicting that common SAVs more likely affect molecular function than rare SAVs.
Affiliation(s)
- Jiajun Qiu
- Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748, Garching, Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and Its Applications (CeDoSIA), 85748, Garching, Germany
- Biobank of Ninth People's Hospital, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200125, China
| | - Dmitrii Nechaev
- Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748, Garching, Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and Its Applications (CeDoSIA), 85748, Garching, Germany
| | - Burkhard Rost
- Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748, Garching, Munich, Germany
- Institute of Advanced Study (TUM-IAS), Lichtenbergstr. 2a, 85748, Garching, Munich, Germany
- Institute for Food and Plant Sciences (WZW) Weihenstephan, Alte Akademie 8, 85354, Freising, Germany
| |
|
11
|
Zhu C, Miller M, Zeng Z, Wang Y, Mahlich Y, Aptekmann A, Bromberg Y. Computational Approaches for Unraveling the Effects of Variation in the Human Genome and Microbiome. Annu Rev Biomed Data Sci 2020. [DOI: 10.1146/annurev-biodatasci-030320-041014] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 02/01/2023]
Abstract
The past two decades of analytical efforts have highlighted how much more remains to be learned about the human genome and, particularly, its complex involvement in promoting disease development and progression. While numerous computational tools exist for the assessment of the functional and pathogenic effects of genome variants, their precision is far from satisfactory, particularly for clinical use. Accumulating evidence also suggests that the human microbiome's interaction with the human genome plays a critical role in determining health and disease states. While numerous microbial taxonomic groups and molecular functions of the human microbiome have been associated with disease, the reproducibility of these findings is lacking. The human microbiome–genome interaction in healthy individuals is even less well understood. This review summarizes the available computational methods built to analyze the effect of variation in the human genome and microbiome. We address the applicability and precision of these methods across their possible uses. We also briefly discuss the exciting, necessary, and now possible integration of the two types of data to improve the understanding of pathogenicity mechanisms.
Affiliation(s)
- Chengsheng Zhu
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
| | - Maximilian Miller
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
| | - Zishuo Zeng
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
| | - Yanran Wang
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
| | - Yannick Mahlich
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
| | - Ariel Aptekmann
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
| | - Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
- Department of Genetics, Rutgers University, Piscataway, New Jersey 08854, USA
| |
|
12
|
Miller M, Vitale D, Kahn PC, Rost B, Bromberg Y. funtrp: identifying protein positions for variation driven functional tuning. Nucleic Acids Res 2020; 47:e142. [PMID: 31584091 PMCID: PMC6868392 DOI: 10.1093/nar/gkz818] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 06/25/2019] [Revised: 09/05/2019] [Accepted: 09/12/2019] [Indexed: 12/12/2022] Open
Abstract
Evaluating the impact of non-synonymous genetic variants is essential for uncovering disease associations and mechanisms of evolution. An in-depth understanding of sequence changes is also fundamental for synthetic protein design and stability assessments. However, the variant effect predictor performance gain observed in recent years has not kept up with the increased complexity of new methods. One likely reason for this might be that most approaches use similar sets of gene and protein features for modeling variant effects, often emphasizing sequence conservation. While high levels of conservation highlight residues essential for protein activity, much of the variation observable in vivo is arguably weaker in its impact, thus requiring evaluation at a higher level of resolution. Here, we describe function Neutral/Toggle/Rheostat predictor (funtrp), a novel computational method that categorizes protein positions based on the position-specific expected range of mutational impacts: Neutral (weak/no effects), Rheostat (function-tuning positions), or Toggle (on/off switches). We show that position types do not correlate strongly with familiar protein features such as conservation or protein disorder. We also find that position type distribution varies across different protein functions. Finally, we demonstrate that position types can improve performance of existing variant effect predictors and suggest a way forward for the development of new ones.
Affiliation(s)
- Maximilian Miller
  - Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Dr, New Brunswick, NJ 08901, USA
- Daniel Vitale
  - Columbian College of Arts and Sciences, Data Science Program, Corcoran Hall, 725 21st Street NW, Washington, DC 20052, USA
- Peter C Kahn
  - Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Dr, New Brunswick, NJ 08901, USA
- Burkhard Rost
  - Department for Bioinformatics and Computational Biology, Technische Universität München, Boltzmannstr. 3, 85748 Garching/Munich, Germany
  - Institute for Advanced Study at Technische Universität München (TUM-IAS), Lichtenbergstraße 2a, 85748 Garching/Munich, Germany
- Yana Bromberg
  - Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Dr, New Brunswick, NJ 08901, USA
  - Institute for Advanced Study at Technische Universität München (TUM-IAS), Lichtenbergstraße 2a, 85748 Garching/Munich, Germany
  - Department of Genetics, Rutgers University, Human Genetics Institute, Life Sciences Building, 145 Bevier Road, Piscataway, NJ 08854, USA

13
Reeb J, Wirth T, Rost B. Variant effect predictions capture some aspects of deep mutational scanning experiments. BMC Bioinformatics 2020; 21:107. [PMID: 32183714 PMCID: PMC7077003 DOI: 10.1186/s12859-020-3439-4] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 12/12/2019] [Accepted: 03/03/2020] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND Deep mutational scanning (DMS) studies exploit the mutational landscape of sequence variation by systematically and comprehensively assaying the effect of single amino acid variants (SAVs; also referred to as missense mutations, or non-synonymous Single Nucleotide Variants - missense SNVs or nsSNVs) for particular proteins. We assembled SAV annotations from 22 different DMS experiments and normalized the effect scores to evaluate five variant effect prediction methods: three trained on traditional variant effect data (PolyPhen-2, SIFT, SNAP2), a regression method optimized on DMS data (Envision), and a naïve prediction using conservation information from homologs. RESULTS On a set of 32,981 SAVs, all methods captured some aspects of the experimental effect scores, albeit not the same ones. Traditional methods such as SNAP2 correlated slightly more with measurements and better classified binary states (effect or neutral). Envision appeared to better estimate the precise degree of effect. Most surprising was that the simple naïve conservation approach using PSI-BLAST in many cases outperformed other methods. All methods captured beneficial effects (gain-of-function) significantly worse than deleterious ones (loss-of-function). For the few proteins with multiple independent experimental measurements, experiments differed substantially, but agreed more with each other than with predictions. CONCLUSIONS DMS provides a new powerful experimental means of understanding the dynamics of the protein sequence space. As always, promising new beginnings have to overcome challenges. While our results demonstrated that DMS will be crucial to improve variant effect prediction methods, data diversity hindered simplification and generalization.
Affiliation(s)
- Jonas Reeb
  - Department of Informatics, Bioinformatics & Computational Biology - i12, TUM (Technical University of Munich), Boltzmannstr 3, 85748, Garching/Munich, Germany
- Theresa Wirth
  - Department of Informatics, Bioinformatics & Computational Biology - i12, TUM (Technical University of Munich), Boltzmannstr 3, 85748, Garching/Munich, Germany
- Burkhard Rost
  - Department of Informatics, Bioinformatics & Computational Biology - i12, TUM (Technical University of Munich), Boltzmannstr 3, 85748, Garching/Munich, Germany
  - Institute for Advanced Study (TUM-IAS), Lichtenbergstr 2a, 85748, Garching/Munich, Germany
  - TUM School of Life Sciences Weihenstephan (WZW), Alte Akademie 8, Freising, Germany
  - Department of Biochemistry and Molecular Biophysics, Columbia University, 701 West, 168th Street, New York, NY, 10032, USA

14
Iourov IY, Vorsanova SG, Yurov YB. The variome concept: focus on CNVariome. Mol Cytogenet 2019; 12:52. [PMID: 31890032 PMCID: PMC6924070 DOI: 10.1186/s13039-019-0467-8] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 11/20/2019] [Accepted: 12/13/2019] [Indexed: 02/07/2023] Open
Abstract
Background Variome may be used for designating the complex system of interplay between genomic variations specific for an individual or a disease. Despite the recognized complexity of the genomic basis for phenotypic traits and diseases, studies of genetic causes of a disease are usually dedicated to the identification of single causative genomic changes (mutations). When such an artificially simplified model is employed, the genomic basis of phenotypic outcomes remains elusive in the overwhelming majority of human diseases. Moreover, it has been repeatedly demonstrated that multiple genomic changes within an individual genome are likely to underlie the phenome. Probably the best example of the cumulative effect of the variome on the phenotype is CNV (copy number variation) burden. Accordingly, we have proposed a variome concept based on CNV studies providing evidence for the existence of a CNVariome (the set of CNVs affecting an individual genome), a target for genomic analyses useful for unraveling genetic mechanisms of diseases and phenotypic traits. Conclusion The variome (CNVariome) concept suggests that a genomic milieu is determined by the whole set of genomic variations (CNVs) within an individual genome. The genomic milieu is likely to result from interplay between these variations. Furthermore, such a variome may be either individual or disease-specific. Additionally, such a variome may be pathway-specific. The latter is able to affect molecular/cellular pathways of genome stability maintenance, leading to genomic/chromosome instability and/or somatic mosaicism and resulting in a somatic variome. This variome type seems to be important for unraveling disease mechanisms as well. Finally, it appears that bioinformatic analysis of both individual and somatic variomes in the context of disease- and pathway-specific variomes is the most promising way to determine the genomic basis of the phenome and to unravel disease mechanisms for the management and treatment of currently incurable diseases.
Affiliation(s)
- Ivan Y Iourov
  - Yurov's Laboratory of Molecular Genetics and Cytogenomics of the Brain, Mental Health Research Center, 117152 Moscow, Russia
  - Veltischev Research and Clinical Institute for Pediatrics of the Pirogov Russian National Research Medical University, Ministry of Health of Russian Federation, 125412 Moscow, Russia
- Svetlana G Vorsanova
  - Yurov's Laboratory of Molecular Genetics and Cytogenomics of the Brain, Mental Health Research Center, 117152 Moscow, Russia
  - Veltischev Research and Clinical Institute for Pediatrics of the Pirogov Russian National Research Medical University, Ministry of Health of Russian Federation, 125412 Moscow, Russia
- Yuri B Yurov
  - Yurov's Laboratory of Molecular Genetics and Cytogenomics of the Brain, Mental Health Research Center, 117152 Moscow, Russia
  - Veltischev Research and Clinical Institute for Pediatrics of the Pirogov Russian National Research Medical University, Ministry of Health of Russian Federation, 125412 Moscow, Russia

15
Šimčíková D, Heneberg P. Refinement of evolutionary medicine predictions based on clinical evidence for the manifestations of Mendelian diseases. Sci Rep 2019; 9:18577. [PMID: 31819097 PMCID: PMC6901466 DOI: 10.1038/s41598-019-54976-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 09/28/2018] [Accepted: 11/21/2019] [Indexed: 12/28/2022] Open
Abstract
Prediction methods have become an integral part of biomedical and biotechnological research. However, their clinical interpretations are largely based on biochemical or molecular data, but not clinical data. Here, we focus on improving the reliability and clinical applicability of prediction algorithms. We assembled and curated two large non-overlapping databases of clinical phenotypes. These phenotypes were caused by missense variations in 44 and 63 genes associated with Mendelian diseases. We used these databases to establish and validate the model, allowing us to improve the predictions obtained from EVmutation, SNAP2 and PoPMuSiC 2.1. The predictions of clinical effects suffered from a lack of specificity, which appears to be the common constraint of all recently used prediction methods, although predictions mediated by these methods are associated with nearly absolute sensitivity. We introduced evidence-based tailoring of the default settings of the prediction methods; this tailoring substantially improved the prediction outcomes. Additionally, the comparisons of the clinically observed and theoretical variations led to the identification of large previously unreported pools of variations that were under negative selection during molecular evolution. The evolutionary variation analysis approach described here is the first to enable the highly specific identification of likely disease-causing missense variations that have not yet been associated with any clinical phenotype.
Affiliation(s)
- Daniela Šimčíková
  - Charles University, Third Faculty of Medicine, Prague, Czech Republic
- Petr Heneberg
  - Charles University, Third Faculty of Medicine, Prague, Czech Republic

16
Zeng Z, Bromberg Y. Predicting Functional Effects of Synonymous Variants: A Systematic Review and Perspectives. Front Genet 2019; 10:914. [PMID: 31649718 PMCID: PMC6791167 DOI: 10.3389/fgene.2019.00914] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Received: 06/28/2019] [Accepted: 08/29/2019] [Indexed: 12/13/2022] Open
Abstract
Recent advances in high-throughput experimentation have put the exploration of genome sequences at the forefront of precision medicine. In an effort to interpret the sequencing data, numerous computational methods have been developed for evaluating the effects of genome variants. Interestingly, despite the fact that every person has as many synonymous (sSNV) as non-synonymous single nucleotide variants, our ability to predict their effects is limited. The paucity of experimentally tested sSNV effects appears to be the limiting factor in development of such methods. Here, we summarize the details and evaluate the performance of nine existing computational methods capable of predicting sSNV effects. We used a set of observed and artificially generated variants to approximate large scale performance expectations of these tools. We note that the distribution of these variants across amino acid and codon types suggests purifying evolutionary selection retaining generated variants out of the observed set; i.e., we expect the generated set to be enriched for deleterious variants. Closer inspection of the relationship between the observed variant frequencies and the associated prediction scores identifies predictor-specific scoring thresholds of reliable effect predictions. Notably, across all predictors, the variants scoring above these thresholds were significantly more often generated than observed, which confirms our assumption that the generated set is enriched for deleterious variants. Finally, we find that while the methods differ in their ability to identify severe sSNV effects, no predictor appears capable of definitively recognizing subtle effects of such variants on a large scale.
Affiliation(s)
- Zishuo Zeng
  - Institute for Quantitative Biomedicine, Rutgers University, Piscataway, NJ, United States
  - Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ, United States
- Yana Bromberg
  - Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ, United States
  - Department of Genetics, Rutgers University, Human Genetics Institute, Piscataway, NJ, United States

17
Miller M, Wang Y, Bromberg Y. What went wrong with variant effect predictor performance for the PCM1 challenge. Hum Mutat 2019; 40:1486-1494. [PMID: 31268618 PMCID: PMC6744297 DOI: 10.1002/humu.23832] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 02/08/2019] [Revised: 05/03/2019] [Accepted: 05/31/2019] [Indexed: 12/31/2022]
Abstract
Recent years have seen a drastic increase in the amount of available genomic sequences. Alongside this explosion, hundreds of computational tools were developed to assess the impact of observed genetic variation. Critical Assessment of Genome Interpretation (CAGI) provides a platform to evaluate the performance of these tools in experimentally relevant contexts. In the CAGI-5 challenge assessing the 38 missense variants affecting the human Pericentriolar material 1 protein (PCM1), our SNAP-based submission was the top performer, although it did worse than expected from other evaluations. Here, we compare the CAGI-5 submissions, and 24 additional commonly used variant effect predictors, to analyze the reasons for this observation. We identified per residue conservation, structural, and functional PCM1 characteristics, which may be responsible. As expected, predictors had a hard time distinguishing effect variants in nonconserved positions. They were also better able to call effect variants in a structurally rich region than in a less-structured one; in the latter, they more often correctly identified benign than effect variants. Curiously, most of the protein was predicted to be functionally robust to mutation, a feature that likely makes it a harder problem for generalized variant effect predictors.
Affiliation(s)
- Maximilian Miller
  - Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Dr, New Brunswick, NJ 08873, USA
- Yanran Wang
  - Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Dr, New Brunswick, NJ 08873, USA
- Yana Bromberg
  - Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Dr, New Brunswick, NJ 08873, USA

18
Raraigh KS, Han ST, Davis E, Evans TA, Pellicore MJ, McCague AF, Joynt AT, Lu Z, Atalar M, Sharma N, Sheridan MB, Sosnay PR, Cutting GR. Functional Assays Are Essential for Interpretation of Missense Variants Associated with Variable Expressivity. Am J Hum Genet 2018; 102:1062-1077. [PMID: 29805046 DOI: 10.1016/j.ajhg.2018.04.003] [Citation(s) in RCA: 64] [Impact Index Per Article: 10.7] [Received: 02/22/2018] [Accepted: 03/30/2018] [Indexed: 12/22/2022] Open
Abstract
Missense DNA variants have variable effects upon protein function. Consequently, interpreting their pathogenicity is challenging, especially when they are associated with disease variability. To determine the degree to which functional assays inform interpretation, we analyzed 48 CFTR missense variants associated with variable expressivity of cystic fibrosis (CF). We assessed function in a native isogenic context by evaluating CFTR mutants that were stably expressed in the genome of a human airway cell line devoid of endogenous CFTR expression. 21 of 29 variants associated with full expressivity of the CF phenotype generated <10% wild-type CFTR (WT-CFTR) function, a conservative threshold for the development of life-limiting CF lung disease, and five variants had moderately decreased function (10% to ∼25% WT-CFTR). The remaining three variants in this group unexpectedly had >25% WT-CFTR function; two were higher than 75% WT-CFTR. As expected, 14 of 19 variants associated with partial expressivity of CF had >25% WT-CFTR function; however, four had minimal to no effect on CFTR function (>75% WT-CFTR). Thus, 6 of 48 (13%) missense variants believed to be disease causing did not alter CFTR function. Functional studies substantially refined pathogenicity assignment with expert annotation and criteria from the American College of Medical Genetics and Genomics and Association for Molecular Pathology. However, four algorithms (CADD, REVEL, SIFT, and PolyPhen-2) could not differentiate between variants that caused severe, moderate, or minimal reduction in function. In the setting of variable expressivity, these results indicate that functional assays are essential for accurate interpretation of missense variants and that current prediction tools should be used with caution.
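The functional cut-offs quoted in this abstract can be illustrated with a short sketch. This is only a reading aid under stated assumptions, not the authors' analysis: the function name, the category labels, and the handling of values exactly at 10%, 25%, and 75% are hypothetical choices, since the abstract does not specify them. The final lines re-derive the reported "6 of 48 (13%)" from the two full-expressivity and four partial-expressivity variants that retained more than 75% of wild-type function.

```python
def classify_cftr_function(percent_wt: float) -> str:
    """Bin a measured CFTR function (percent of wild type) by the abstract's cut-offs.

    Labels and boundary handling are illustrative assumptions.
    """
    if percent_wt < 0:
        raise ValueError("percent of wild-type function cannot be negative")
    if percent_wt < 10:
        return "severe"          # <10% WT-CFTR: threshold for life-limiting lung disease
    if percent_wt <= 25:
        return "moderate"        # 10% to ~25% WT-CFTR
    if percent_wt <= 75:
        return "residual"        # >25% WT-CFTR, still reduced
    return "minimal_effect"      # >75% WT-CFTR: minimal to no effect


# Variants above 75% WT-CFTR: 2 (full expressivity) + 4 (partial expressivity)
near_wild_type = 2 + 4
total_variants = 48
share_percent = 100 * near_wild_type / total_variants

print(classify_cftr_function(8.0))    # severe
print(share_percent)                  # 12.5, reported as 13% in the abstract
```

Keeping the thresholds in one function makes it easy to see how sensitive a variant's category is to the assumed boundary handling.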

19
Pejaver V, Mooney SD, Radivojac P. Missense variant pathogenicity predictors generalize well across a range of function-specific prediction challenges. Hum Mutat 2017; 38:1092-1108. [PMID: 28508593 PMCID: PMC5561458 DOI: 10.1002/humu.23258] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 10/13/2016] [Revised: 03/16/2017] [Accepted: 03/26/2017] [Indexed: 11/08/2022]
Abstract
The steady advances in machine learning and accumulation of biomedical data have contributed to the development of numerous computational models that assess the impact of missense variants. Different methods, however, operationalize impact differently. Two common tasks in this context are the prediction of the pathogenicity of variants and the prediction of their effects on a protein's function. These are related but distinct problems, and it is unclear whether methods developed for one are optimized for the other. The Critical Assessment of Genome Interpretation (CAGI) experiment provides a means to address this question empirically. To this end, we participated in various protein-specific challenges in CAGI with two objectives in mind. First, to compare the performance of methods in the MutPred family with the state-of-the-art. Second and more importantly, to investigate the applicability of general-purpose pathogenicity predictors to the classification of specific function-altering variants without additional training or calibration. We find that our pathogenicity predictors performed competitively with other methods, outputting score distributions in agreement with experimental outcomes. Overall, we conclude that binary classifiers learned from disease-causing mutations are capable of modeling important aspects of the underlying biology and the alteration of protein function resulting from mutations.
Affiliation(s)
- Vikas Pejaver
  - Department of Computer Science and Informatics, Indiana University, Bloomington, Indiana 47405
- Sean D. Mooney
  - Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington 98109
- Predrag Radivojac
  - Department of Computer Science and Informatics, Indiana University, Bloomington, Indiana 47405

20
Swint-Kruse L. Using Evolution to Guide Protein Engineering: The Devil IS in the Details. Biophys J 2017; 111:10-8. [PMID: 27410729 DOI: 10.1016/j.bpj.2016.05.030] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 11/10/2015] [Revised: 04/18/2016] [Accepted: 05/20/2016] [Indexed: 10/21/2022] Open
Abstract
For decades, protein engineers have endeavored to reengineer existing proteins for novel applications. Overall, protein folds and gross functions can be readily transferred from one protein to another by transplanting large blocks of sequence (i.e., domain recombination). However, predictably fine-tuning function (e.g., by adjusting ligand affinity, specificity, catalysis, and/or allosteric regulation) remains a challenge. One approach has been to use the sequences of protein families to identify amino acid positions that change during the evolution of functional variation. The rationale is that these nonconserved positions could be mutated to predictably fine-tune function. Evolutionary approaches to protein design have had some success, but the engineered proteins seldom replicate the functional performances of natural proteins. This Biophysical Perspective reviews several complexities that have been revealed by evolutionary and experimental studies of protein function. These include 1) challenges in defining computational and biological thresholds that define important amino acids; 2) the co-occurrence of many different patterns of amino acid changes in evolutionary data; 3) difficulties in mapping the patterns of amino acid changes to discrete functional parameters; 4) the nonconventional mutational outcomes that occur for a particular group of functionally important, nonconserved positions; 5) epistasis (nonadditivity) among multiple mutations; and 6) the fact that a large fraction of a protein's amino acids contribute to its overall function. To overcome these challenges, new goals are identified for future studies.
Affiliation(s)
- Liskin Swint-Kruse
  - Department of Biochemistry and Molecular Biology, University of Kansas Medical Center, Kansas City, Kansas

21
Common sequence variants affect molecular function more than rare variants? Sci Rep 2017; 7:1608. [PMID: 28487536 PMCID: PMC5431670 DOI: 10.1038/s41598-017-01054-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 06/14/2016] [Accepted: 02/28/2017] [Indexed: 12/29/2022] Open
Abstract
Any two unrelated individuals differ by about 10,000 single amino acid variants (SAVs). Do these impact molecular function? Experiments cannot answer this question comprehensively, while state-of-the-art prediction methods can. We predicted the functional impacts of SAVs within human and for variants between human and other species. Several surprising results stood out. Firstly, four methods (CADD, PolyPhen-2, SIFT, and SNAP2) agreed within 10 percentage points on the percentage of rare SAVs predicted with effect. However, they differed substantially for the common SAVs: SNAP2 predicted, on average, more effect for common than for rare SAVs. Given the large ExAC data sets sampling 60,706 individuals, the differences were extremely significant (p-value < 2.2e-16). We provided evidence that SNAP2 might be closer to reality for common SAVs than the other methods, due to its different focus in development. Secondly, we predicted significantly higher fractions of SAVs with effect between healthy individuals than between species; the difference increased for more distantly related species. The same trends were maintained for subsets of only housekeeping proteins and when moving from exomes of 1,000 to 60,000 individuals. SAVs frozen at speciation might maintain protein function, while many variants within a species might bring about crucial changes, for better or worse.

22
Computational predictors fail to identify amino acid substitution effects at rheostat positions. Sci Rep 2017; 7:41329. [PMID: 28134345 PMCID: PMC5278360 DOI: 10.1038/srep41329] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 10/11/2016] [Accepted: 12/15/2016] [Indexed: 12/31/2022] Open
Abstract
Many computational approaches exist for predicting the effects of amino acid substitutions. Here, we considered whether the protein sequence position class - rheostat or toggle - affects these predictions. The classes are defined as follows: experimentally evaluated effects of amino acid substitutions at toggle positions are binary, while rheostat positions show progressive changes. For substitutions in the LacI protein, all evaluated methods failed two key expectations: toggle neutrals were incorrectly predicted as more non-neutral than rheostat non-neutrals, while toggle and rheostat neutrals were incorrectly predicted to be different. However, toggle non-neutrals were distinct from rheostat neutrals. Since many toggle positions are conserved, and most rheostats are not, predictors appear to annotate position conservation better than mutational effect. This finding can explain the well-known observation that predictors assign disproportionate weight to conservation, as well as the field's inability to improve predictor performance. Thus, building reliable predictors requires distinguishing between rheostat and toggle positions.

23
Abrusán G, Marsh JA. Alpha Helices Are More Robust to Mutations than Beta Strands. PLoS Comput Biol 2016; 12:e1005242. [PMID: 27935949 PMCID: PMC5147804 DOI: 10.1371/journal.pcbi.1005242] [Citation(s) in RCA: 66] [Impact Index Per Article: 8.3] [Received: 05/26/2016] [Accepted: 11/08/2016] [Indexed: 12/30/2022] Open
Abstract
The rapidly increasing amount of data on human genetic variation has resulted in a growing demand to identify pathogenic mutations computationally, as their experimental validation is currently beyond reach. Here we show that alpha helices and beta strands differ significantly in their ability to tolerate mutations: helices can accumulate more mutations than strands without change, due to the higher numbers of inter-residue contacts in helices. This results in two patterns: a) the same number of mutations causes less structural change in helices than in strands; b) helices diverge more rapidly in sequence than strands within the same domains. Additionally, both helices and strands are significantly more robust than coils. Based on this observation we show that human missense mutations that change secondary structure are more likely to be pathogenic than those that do not. Moreover, inclusion of predicted secondary structure changes shows significant utility for improving upon state-of-the-art pathogenicity predictions. The factors that determine the robustness and evolvability of proteins are still largely unknown. In this work the authors show that different secondary structure elements of proteins (helices and strands) differ in their ability to tolerate mutations, and demonstrate that it is caused by differences in the number of non-covalent residue interactions within these secondary structure units. The results suggest that engineering de novo all-alpha proteins should be easier than all-beta ones, as more sequences can fold to the same topology. Additionally, secondary structure can be used to improve current methods of pathogenicity predictions; mutations that change secondary structure are more likely to be pathogenic than mutations that do not, due to their strong destabilizing effect on protein structure.
Affiliation(s)
- György Abrusán
  - MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh EH4 2XU, United Kingdom
  - Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Temesvári krt. 62, Hungary
- Joseph A. Marsh
  - MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh EH4 2XU, United Kingdom

24
Reeb J, Hecht M, Mahlich Y, Bromberg Y, Rost B. Predicted Molecular Effects of Sequence Variants Link to System Level of Disease. PLoS Comput Biol 2016; 12:e1005047. [PMID: 27536940 PMCID: PMC4990455 DOI: 10.1371/journal.pcbi.1005047] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 01/15/2016] [Accepted: 07/04/2016] [Indexed: 11/19/2022] Open
Abstract
Developments in experimental and computational biology are advancing our understanding of how protein sequence variation impacts molecular protein function. However, the leap from the micro level of molecular function to the macro level of the whole organism, e.g. disease, remains barred. Here, we present new results emphasizing earlier work that suggested some links from molecular function to disease. We focused on non-synonymous single nucleotide variants, also referred to as single amino acid variants (SAVs). Building upon OMIA (Online Mendelian Inheritance in Animals), we introduced a curated set of 117 disease-causing SAVs in animals. Methods optimized to capture effects upon molecular function often correctly predict human (OMIM) and animal (OMIA) Mendelian disease-causing variants. We also predicted effects of human disease-causing variants in the mouse model, i.e. we put OMIM SAVs into mouse orthologs. Overall, fewer variants were predicted with effect in the model organism than in the original organism. Our results, along with other recent studies, demonstrate that predictions of molecular effects capture some important aspects of disease. Thus, in silico methods focusing on the micro level of molecular function can help to understand the macro system level of disease. The variations in the genetic sequence between individuals affect the gene product, i.e. the protein, differently. Some variants have no measurable effect (are neutral), while others affect protein function. Some of those effects are so severe that they cause so-called monogenic Mendelian diseases, i.e. diseases triggered by a single letter change. Some in silico methods predict the molecular impact of sequence variation. However, both experimental and computational analyses struggle to generalize from the effect upon molecular protein function to the effect upon the organism, such as a disease. Here, we confirmed that methods predicting molecular effects correctly capture the type of effects causing Mendelian diseases in human and introduced a data set for animal diseases that was also captured by prediction methods. Fewer effects were predicted when testing human variants in silico in an animal model (here mouse). This is important to know because "mouse models" are commonly used to study human diseases. Overall, we provided some evidence for a link between the molecular level and some types of disease.
Affiliation(s)
- Jonas Reeb
- Department of Informatics, Bioinformatics & Computational Biology—i12, Technische Universität München, Garching/Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Technische Universität München, Garching, Germany
- Maximilian Hecht
- Department of Informatics, Bioinformatics & Computational Biology—i12, Technische Universität München, Garching/Munich, Germany
- Yannick Mahlich
- Department of Informatics, Bioinformatics & Computational Biology—i12, Technische Universität München, Garching/Munich, Germany
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey, United States of America
- Institute for Advanced Study (TUM-IAS), Garching/Munich, Germany
- Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey, United States of America
- Institute for Advanced Study (TUM-IAS), Garching/Munich, Germany
- Burkhard Rost
- Department of Informatics, Bioinformatics & Computational Biology—i12, Technische Universität München, Garching/Munich, Germany
- Institute for Advanced Study (TUM-IAS), Garching/Munich, Germany
- Institute for Food and Plant Sciences WZW, Technische Universität München, Weihenstephan, Freising, Germany
25
Rost B, Radivojac P, Bromberg Y. Protein function in precision medicine: deep understanding with machine learning. FEBS Lett 2016; 590:2327-41. [PMID: 27423136 PMCID: PMC5937700 DOI: 10.1002/1873-3468.12307] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Received: 06/27/2016] [Revised: 07/12/2016] [Accepted: 07/12/2016] [Indexed: 12/21/2022]
Abstract
Precision medicine and personalized health efforts propose leveraging complex molecular, medical and family history, along with other types of personal data, toward a better life. We argue that this ambitious objective will require advanced and specialized machine learning solutions. Simply skimming some low-hanging results off the data wealth might have limited potential. Instead, we need to better understand all parts of the system to define medically relevant causes and effects: how do particular sequence variants affect particular proteins and pathways? How do these effects, in turn, cause the health or disease-related phenotype? Toward this end, deeper understanding will not simply diffuse from deeper machine learning, but from more explicit focus on understanding protein function, context-specific protein interaction networks, and impact of variation on both.
Affiliation(s)
- Burkhard Rost
- Department of Informatics and Bioinformatics, Institute for Advanced Studies, Technical University of Munich, Garching, Germany
- Predrag Radivojac
- School of Informatics and Computing, Indiana University, Bloomington, IN, USA
- Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ, USA
26
Rockah-Shmuel L, Tóth-Petróczy Á, Tawfik DS. Systematic Mapping of Protein Mutational Space by Prolonged Drift Reveals the Deleterious Effects of Seemingly Neutral Mutations. PLoS Comput Biol 2015; 11:e1004421. [PMID: 26274323 PMCID: PMC4537296 DOI: 10.1371/journal.pcbi.1004421] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.1] [Received: 02/03/2015] [Accepted: 06/30/2015] [Indexed: 11/18/2022] Open
Abstract
Systematic mappings of the effects of protein mutations are becoming increasingly popular. Unexpectedly, these experiments often find that proteins are tolerant to most amino acid substitutions, including substitutions in positions that are highly conserved in nature. To obtain a more realistic distribution of the effects of protein mutations, we applied a laboratory drift comprising 17 rounds of random mutagenesis and selection of M.HaeIII, a DNA methyltransferase. During this drift, multiple mutations gradually accumulated. Deep sequencing of the drifted gene ensembles allowed determination of the relative effects of all possible single nucleotide mutations. Despite being averaged across many different genetic backgrounds, about 67% of all nonsynonymous, missense mutations were evidently deleterious, and an additional 16% were likely to be deleterious. In the early generations, the frequency of most deleterious mutations remained high. However, by the 17th generation, their frequency was consistently reduced, and those remaining were accepted alongside compensatory mutations. The tolerance to mutations measured in this laboratory drift correlated with sequence exchanges seen in M.HaeIII’s natural orthologs. The biophysical constraints dictating purging in nature and in this laboratory drift also seemed to overlap. Our experiment therefore provides an improved method for measuring the effects of protein mutations that more closely replicates the natural evolutionary forces, and thereby a more realistic view of the mutational space of proteins.
Understanding and predicting the effects of single nucleotide polymorphisms (SNPs) is of fundamental importance in many fields. Systematic experimental mappings of the effects of such mutations within a given gene/protein comprise an essential experimental tool for determining protein function and for refining models of protein evolution, as well as an important resource for improving prediction algorithms. Here, we present the results of a laboratory system that mimics the manner by which protein sequences diverge in nature: a prolonged process of gradually accumulating random mutations that retain the protein’s structure and function. The change in frequencies of mutations over generations, as obtained by deep sequencing, enabled us to assess the relative effects of all possible SNPs against the background of an accumulating number of mutations. Compared to previous reports, we found that >80% of all possible amino acid exchanges have potentially deleterious effects, with 67% being clearly deleterious. Tolerance vs. purging of mutations in our prolonged drift also showed better correlation with natural diversity. Overall, our experimental setup provides a better understanding of how protein sequences diverge in nature, plus a new basis for improving the prediction accuracy of the effects of protein mutations, and specifically of SNPs.
Affiliation(s)
- Liat Rockah-Shmuel
- Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
- Ágnes Tóth-Petróczy
- Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
- Dan S. Tawfik
- Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
27
Abstract
Elucidating the effects of naturally occurring genetic variation is one of the major challenges for personalized health and personalized medicine. Here, we introduce SNAP2, a novel neural network based classifier that improves over the state-of-the-art in distinguishing between effect and neutral variants. Our method's improved performance results from screening many potentially relevant protein features and from refining our development data sets. Cross-validated on >100k experimentally annotated variants, SNAP2 significantly outperformed other methods, attaining a two-state accuracy (effect/neutral) of 83%. SNAP2 also outperformed combinations of other methods. Performance increased for human variants but much more so for other organisms. Our method's carefully calibrated reliability index informs selection of variants for experimental follow up, with the most strongly predicted half of all effect variants predicted at over 96% accuracy. As expected, the evolutionary information from automatically generated multiple sequence alignments gave the strongest signal for the prediction. However, we also optimized our new method to perform surprisingly well even without alignments. This feature reduces prediction runtime by over two orders of magnitude, enables cross-genome comparisons, and renders our new method as the best solution for the 10-20% of sequence orphans. SNAP2 is available at: https://rostlab.org/services/snap2web
28
Pan Y, Karagiannis K, Zhang H, Dingerdissen H, Shamsaddini A, Wan Q, Simonyan V, Mazumder R. Human germline and pan-cancer variomes and their distinct functional profiles. Nucleic Acids Res 2014; 42:11570-88. [PMID: 25232094 PMCID: PMC4191387 DOI: 10.1093/nar/gku772] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Indexed: 12/27/2022] Open
Abstract
Identification of non-synonymous single nucleotide variations (nsSNVs) has exponentially increased due to advances in Next-Generation Sequencing technologies. The functional impacts of these variations have been difficult to ascertain because the corresponding knowledge about sequence functional sites is quite fragmented. It is clear that mapping of variations to sequence functional features can help us better understand the pathophysiological role of variations. In this study, we investigated the effect of nsSNVs on more than 17 common types of post-translational modification (PTM) sites, active sites and binding sites. Out of 1 705 285 distinct nsSNVs on 259 216 functional sites we identified 38 549 variations that significantly affect 10 major functional sites. Furthermore, we found distinct patterns of site disruptions due to germline and somatic nsSNVs. Pan-cancer analysis across 12 different cancer types led to the identification of 51 genes with 106 nsSNV affected functional sites found in 3 or more cancer types. 13 of the 51 genes overlap with previously identified Significantly Mutated Genes (Nature. 2013 Oct 17;502(7471)). 62 mutations in these 13 genes affecting functional sites such as DNA, ATP binding and various PTM sites occur across several cancers and can be prioritized for additional validation and investigations.
Affiliation(s)
- Yang Pan
- The Department of Biochemistry & Molecular Medicine, George Washington University Medical Center, Washington, DC 20037, USA
- Konstantinos Karagiannis
- The Department of Biochemistry & Molecular Medicine, George Washington University Medical Center, Washington, DC 20037, USA
- Haichen Zhang
- The Department of Biochemistry & Molecular Medicine, George Washington University Medical Center, Washington, DC 20037, USA
- Hayley Dingerdissen
- The Department of Biochemistry & Molecular Medicine, George Washington University Medical Center, Washington, DC 20037, USA
- Amirhossein Shamsaddini
- The Department of Biochemistry & Molecular Medicine, George Washington University Medical Center, Washington, DC 20037, USA
- Quan Wan
- The Department of Biochemistry & Molecular Medicine, George Washington University Medical Center, Washington, DC 20037, USA
- Vahan Simonyan
- Center for Biologics Evaluation and Research, US Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD 20993, USA
- Raja Mazumder
- The Department of Biochemistry & Molecular Medicine, George Washington University Medical Center, Washington, DC 20037, USA
- McCormick Genomic and Proteomic Center, George Washington University, Washington, DC 20037, USA
29
Li B, Seligman C, Thusberg J, Miller JL, Auer J, Whirl-Carrillo M, Capriotti E, Klein TE, Mooney SD. In silico comparative characterization of pharmacogenomic missense variants. BMC Genomics 2014; 15 Suppl 4:S4. [PMID: 25057096 PMCID: PMC4092878 DOI: 10.1186/1471-2164-15-s4-s4] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Missense pharmacogenomic (PGx) variants refer to amino acid substitutions that potentially affect the pharmacokinetic (PK) or pharmacodynamic (PD) response to drug therapies. The PGx variants, as compared to disease-associated variants, have not been investigated as deeply. The ability to computationally predict future PGx variants is desirable; however, it is not clear what data sets should be used or what features are beneficial to this end. Hence we carried out a comparative characterization of PGx variants with annotated neutral and disease variants from UniProt, to test the predictive power of sequence conservation and structural information in discriminating these three groups. RESULTS 126 PGx variants of high quality from PharmGKB were selected and two data sets were created: one set contained 416 variants with structural and sequence information, and the other set contained 1,265 variants with sequence information only. In terms of sequence conservation, PGx variants are more conserved than neutral variants and much less conserved than disease variants. A weighted random forest was used to strike a more balanced classification for PGx variants. Generally, structural features are helpful in discriminating PGx variants from the other two groups, but classification of PGx variants from neutral polymorphisms is still much less effective than classification between disease and neutral variants. CONCLUSIONS We found that PGx variants are much more similar to neutral variants than to disease variants in the feature space consisting of residue conservation, neighboring residue conservation, number of neighbors, and protein solvent accessibility. Such similarity poses great difficulty in the classification of PGx variants and polymorphisms.
30
Bromberg Y. Building a genome analysis pipeline to predict disease risk and prevent disease. J Mol Biol 2013; 425:3993-4005. [PMID: 23928561 DOI: 10.1016/j.jmb.2013.07.038] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 06/03/2013] [Revised: 07/26/2013] [Accepted: 07/28/2013] [Indexed: 12/24/2022]
Abstract
Reduced costs and increased speed and accuracy of sequencing can bring the genome-based evaluation of individual disease risk to the bedside. While past efforts have identified a number of actionable mutations, the bulk of genetic risk remains hidden in sequence data. The biggest challenge facing genomic medicine today is the development of new techniques to predict the specifics of a given human phenome (set of all expressed phenotypes) encoded by each individual variome (full set of genome variants) in the context of the given environment. Numerous tools exist for the computational identification of the functional effects of a single variant. However, the pipelines taking advantage of full genomic, exomic, transcriptomic (and other) sequences have only recently become a reality. This review looks at the building of methodologies for predicting "variome"-defined disease risk. It also discusses some of the challenges for incorporating such a pipeline into everyday medical practice.
Affiliation(s)
- Y Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Drive, New Brunswick, NJ 08873, USA.