751
Punetha J, Hoffman EP. Short read (next-generation) sequencing: a tutorial with cardiomyopathy diagnostics as an exemplar. Circ Cardiovasc Genet 2013; 6:427-34. [PMID: 23852418 DOI: 10.1161/circgenetics.113.000085]
Affiliation(s)
- Jaya Punetha
- Department of Integrative Systems Biology, The George Washington University School of Medicine, Washington, DC, USA
752
Santos-Cortez RLP, Lee K, Azeem Z, Antonellis PJ, Pollock LM, Khan S, Andrade-Elizondo PB, Chiu I, Adams MD, Basit S, Smith JD, Nickerson DA, McDermott BM, Ahmad W, Leal SM. Mutations in KARS, encoding lysyl-tRNA synthetase, cause autosomal-recessive nonsyndromic hearing impairment DFNB89. Am J Hum Genet 2013; 93:132-40. [PMID: 23768514 DOI: 10.1016/j.ajhg.2013.05.018] [Received: 04/09/2013] [Revised: 05/02/2013] [Accepted: 05/20/2013]
Abstract
Previously, DFNB89, a locus associated with autosomal-recessive nonsyndromic hearing impairment (ARNSHI), was mapped to chromosomal region 16q21-q23.2 in three unrelated, consanguineous Pakistani families. Through whole-exome sequencing of a hearing-impaired individual from each family, missense mutations were identified at highly conserved residues of lysyl-tRNA synthetase (KARS): the c.1129G>A (p.Asp377Asn) variant was found in one family, and the c.517T>C (p.Tyr173His) variant was found in the other two families. Both variants were predicted to be damaging by multiple bioinformatics tools. The two variants both segregated with the nonsyndromic-hearing-impairment phenotype within the three families, and neither mutation was identified in ethnically matched controls or within variant databases. Individuals homozygous for KARS mutations had symmetric, severe hearing impairment across all frequencies but did not show evidence of auditory or limb neuropathy. It has been demonstrated that KARS is expressed in hair cells of zebrafish, chickens, and mice. Moreover, KARS has strong localization to the spiral ligament region of the cochlea, as well as to Deiters' cells, the sulcus epithelium, the basilar membrane, and the surface of the spiral limbus. It is hypothesized that KARS variants affect aminoacylation in inner-ear cells by interfering with binding activity to tRNA or p38 and with tetramer formation. The identification of rare KARS variants in ARNSHI-affected families defines a gene that is associated with ARNSHI.
Affiliation(s)
- Regie Lyn P Santos-Cortez
- Center for Statistical Genetics, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
753
Wang X, Wang H, Sun V, Tuan HF, Keser V, Wang K, Ren H, Lopez I, Zaneveld JE, Siddiqui S, Bowles S, Khan A, Salvo J, Jacobson SG, Iannaccone A, Wang F, Birch D, Heckenlively JR, Fishman GA, Traboulsi EI, Li Y, Wheaton D, Koenekoop RK, Chen R. Comprehensive molecular diagnosis of 179 Leber congenital amaurosis and juvenile retinitis pigmentosa patients by targeted next generation sequencing. J Med Genet 2013; 50:674-88. [PMID: 23847139 DOI: 10.1136/jmedgenet-2013-101558]
Abstract
BACKGROUND Leber congenital amaurosis (LCA) and juvenile retinitis pigmentosa (RP) are inherited retinal diseases that cause early onset severe visual impairment. An accurate molecular diagnosis can refine the clinical diagnosis and allow gene specific treatments. METHODS We developed a capture panel that enriches the exonic DNA of 163 known retinal disease genes. Using this panel, we performed targeted next generation sequencing (NGS) for a large cohort of 179 unrelated and prescreened patients with the clinical diagnosis of LCA or juvenile RP. Systematic NGS data analysis, Sanger sequencing validation, and segregation analysis were utilised to identify the pathogenic mutations. Patients were revisited to examine the potential phenotypic ambiguity at the time of initial diagnosis. RESULTS Pathogenic mutations for 72 patients (40%) were identified, including 45 novel mutations. Of these 72 patients, 58 carried mutations in known LCA or juvenile RP genes and exhibited corresponding phenotypes, while 14 carried mutations in retinal disease genes that were not consistent with their initial clinical diagnosis. We revisited patients in the latter case and found that homozygous mutations in PRPH2 can cause LCA/juvenile RP. Guided by the molecular diagnosis, we reclassified the clinical diagnosis in two patients. CONCLUSIONS We have identified a novel gene and a large number of novel mutations that are associated with LCA/juvenile RP. Our results highlight the importance of molecular diagnosis as an integral part of clinical diagnosis.
Affiliation(s)
- Xia Wang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, USA
754
Liu X, Jian X, Boerwinkle E. dbNSFP v2.0: a database of human non-synonymous SNVs and their functional predictions and annotations. Hum Mutat 2013; 34:E2393-402. [PMID: 23843252 DOI: 10.1002/humu.22376] [Received: 04/19/2013] [Accepted: 06/24/2013]
Abstract
dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. This database significantly facilitates the process of querying predictions and annotations from different databases/web-servers for large amounts of nsSNVs discovered in exome-sequencing studies. Here we report a recent major update of the database to version 2.0. We have rebuilt the SNV collection based on GENCODE 9 and currently the database includes 87,347,043 nsSNVs and 2,270,742 essential splice site SNVs (an 18% increase compared to dbNSFP v1.0). For each nsSNV dbNSFP v2.0 has added two prediction scores (MutationAssessor and FATHMM) and two conservation scores (GERP++ and SiPhy). The original five prediction and conservation scores in v1.0 (SIFT, Polyphen2, LRT, MutationTaster and PhyloP) have been updated. Rich functional annotations for SNVs and genes have also been added into the new version, including allele frequencies observed in the 1000 Genomes Project phase 1 data and the NHLBI Exome Sequencing Project, various gene IDs from different databases, functional descriptions of genes, gene expression and gene interaction information, among others. dbNSFP v2.0 is freely available for download at http://sites.google.com/site/jpopgen/dbNSFP.
Affiliation(s)
- Xiaoming Liu
- Human Genetics Center, School of Public Health, University of Texas Health Science Center at Houston, Houston, Texas 77030, USA.
755
Spinelli R, Pirola A, Redaelli S, Sharma N, Raman H, Valletta S, Magistroni V, Piazza R, Gambacorti-Passerini C. Identification of novel point mutations in splicing sites integrating whole-exome and RNA-seq data in myeloproliferative diseases. Mol Genet Genomic Med 2013; 1:246-59. [PMID: 24498620 PMCID: PMC3865592 DOI: 10.1002/mgg3.23] [Received: 02/22/2013] [Revised: 05/22/2013] [Accepted: 05/24/2013]
Abstract
Point mutations in intronic regions near mRNA splice junctions can affect the splicing process. To identify novel splicing variants from exome sequencing data, we developed a bioinformatics splice-site prediction procedure to analyze next-generation sequencing (NGS) data (SpliceFinder). SpliceFinder integrates two functional annotation tools for NGS, ANNOVAR and MutationTaster and two canonical splice site prediction programs for single mutation analysis, SSPNN and NetGene2. By SpliceFinder, we identified somatic mutations affecting RNA splicing in a colon cancer sample, in eight atypical chronic myeloid leukemia (aCML), and eight CML patients. A novel homozygous splicing mutation was found in APC (NM_000038.4:c.1312+5G>A) and six heterozygous in GNAQ (NM_002072.2:c.735+1C>T), ABCC3 (NM_003786.3:c.1783-1G>A), KLHDC1 (NM_172193.1:c.568-2A>G), HOOK1 (NM_015888.4:c.1662-1G>A), SMAD9 (NM_001127217.2:c.1004-1C>T), and DNAH9 (NM_001372.3:c.10242+5G>A). Integrating whole-exome and RNA sequencing in aCML and CML, we assessed the phenotypic effect of mutations on mRNA splicing for GNAQ, ABCC3, HOOK1. In ABCC3 and HOOK1, RNA-Seq showed the presence of aberrant transcripts with activation of a cryptic splice site or intron retention, validated by the reverse transcription-polymerase chain reaction (RT-PCR) in the case of HOOK1. In GNAQ, RNA-Seq showed 22% of wild-type transcript and 78% of mRNA skipping exon 5, resulting in a 4–6 frameshift fusion confirmed by RT-PCR. The pipeline can be useful to identify intronic variants affecting RNA sequence by complementing conventional exome analysis.
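The intronic variants listed in this abstract are written in HGVS coding notation, where the signed offset after the exon position gives the distance into the intron. As a minimal sketch (not the SpliceFinder pipeline itself, which the abstract describes only as combining ANNOVAR, MutationTaster, SSPNN and NetGene2), the following Python parses such strings and flags whether the change falls on the canonical ±1/±2 splice dinucleotide. The names `IntronicVariant`, `parse_intronic`, and `is_canonical_splice_site` are hypothetical, and the pattern is assumed to cover only simple intronic substitutions of the form shown above.

```python
import re
from typing import NamedTuple


class IntronicVariant(NamedTuple):
    """Hypothetical container for one intronic substitution."""
    transcript: str  # RefSeq accession with version, e.g. "NM_000038.4"
    exon_pos: int    # last (for +) or first (for -) coding position of the exon
    offset: int      # signed distance into the intron
    ref: str
    alt: str


# Assumption: only substitutions like "NM_000038.4:c.1312+5G>A" are handled.
_HGVS_INTRONIC = re.compile(
    r"^(?P<tx>[A-Z]{2}_\d+\.\d+):c\.(?P<pos>\d+)(?P<off>[+-]\d+)"
    r"(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_intronic(hgvs: str) -> IntronicVariant:
    """Split an intronic HGVS substitution into its parts."""
    m = _HGVS_INTRONIC.match(hgvs)
    if m is None:
        raise ValueError(f"not a simple intronic substitution: {hgvs!r}")
    return IntronicVariant(
        m["tx"], int(m["pos"]), int(m["off"]), m["ref"], m["alt"]
    )


def is_canonical_splice_site(v: IntronicVariant) -> bool:
    """True when the change hits the +1/+2 donor or -1/-2 acceptor bases."""
    return abs(v.offset) in (1, 2)


if __name__ == "__main__":
    for text in ("NM_000038.4:c.1312+5G>A", "NM_172193.1:c.568-2A>G"):
        v = parse_intronic(text)
        print(text, v.offset, is_canonical_splice_site(v))
```

Under this sketch the APC and DNAH9 variants (offset +5) fall outside the canonical dinucleotide, while the other five listed variants sit at ±1 or ±2; variants of the first kind are the ones that dedicated splice-site predictors are typically needed for.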
Affiliation(s)
- Roberta Spinelli
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
- Alessandra Pirola
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
- Sara Redaelli
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
- Nitesh Sharma
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
- Hima Raman
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
- Simona Valletta
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
- Vera Magistroni
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
- Rocco Piazza
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
- Carlo Gambacorti-Passerini
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy; Hematology and Clinical Research Unit, San Gerardo Hospital, Monza, Italy
756
Martignetti JA, Tian L, Li D, Ramirez MCM, Camacho-Vanegas O, Camacho SC, Guo Y, Zand DJ, Bernstein AM, Masur SK, Kim CE, Otieno FG, Hou C, Abdel-Magid N, Tweddale B, Metry D, Fournet JC, Papp E, McPherson EW, Zabel C, Vaksmann G, Morisot C, Keating B, Sleiman PM, Cleveland JA, Everman DB, Zackai E, Hakonarson H. Mutations in PDGFRB cause autosomal-dominant infantile myofibromatosis. Am J Hum Genet 2013; 92:1001-7. [PMID: 23731542 DOI: 10.1016/j.ajhg.2013.04.024] [Received: 03/01/2013] [Revised: 04/19/2013] [Accepted: 04/30/2013]
Abstract
Infantile myofibromatosis (IM) is a disorder of mesenchymal proliferation characterized by the development of nonmetastasizing tumors in the skin, muscle, bone, and viscera. Occurrence within families across multiple generations is suggestive of an autosomal-dominant (AD) inheritance pattern, but autosomal-recessive (AR) modes of inheritance have also been proposed. We performed whole-exome sequencing (WES) in members of nine unrelated families clinically diagnosed with AD IM to identify the genetic origin of the disorder. In eight of the families, we identified one of two disease-causing mutations, c.1978C>A (p.Pro660Thr) and c.1681C>T (p.Arg561Cys), in PDGFRB. Intriguingly, one family did not have either of these PDGFRB mutations but all affected individuals had a c.4556T>C (p.Leu1519Pro) mutation in NOTCH3. Our studies suggest that mutations in PDGFRB are a cause of IM and highlight NOTCH3 as a candidate gene. Further studies of the crosstalk between PDGFRB and NOTCH pathways may offer new opportunities to identify mutations in other genes that result in IM and is a necessary first step toward understanding the mechanisms of both tumor growth and regression and its targeted treatment.
Affiliation(s)
- John A Martignetti
- Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York, NY 10029, USA; Department of Pediatrics, Mount Sinai School of Medicine, New York, NY 10029, USA; Department of Oncological Sciences, Mount Sinai School of Medicine, New York, NY 10029, USA.
757
Aleksic B, Kushima I, Hashimoto R, Ohi K, Ikeda M, Yoshimi A, Nakamura Y, Ito Y, Okochi T, Fukuo Y, Yasuda Y, Fukumoto M, Yamamori H, Ujike H, Suzuki M, Inada T, Takeda M, Kaibuchi K, Iwata N, Ozaki N. Analysis of the VAV3 as candidate gene for schizophrenia: evidences from voxel-based morphometry and mutation screening. Schizophr Bull 2013; 39:720-8. [PMID: 22416266 PMCID: PMC3627762 DOI: 10.1093/schbul/sbs038]
Abstract
In recently completed Japanese genome-wide association studies (GWAS) of schizophrenia (JPN_GWAS) one of the top association signals was detected in the region of VAV3, a gene that maps to the chromosome 1p13.3. In order to complement JPN_GWAS findings, we tested the association of rs1410403 with brain structure in healthy individuals and schizophrenic patients and performed exon resequencing of VAV3. We performed voxel-based morphometry (VBM) and mutation screening of VAV3. Four independent samples were used in the present study: (1) for VBM analysis, we used case-control sample comprising 100 patients with schizophrenia and 264 healthy controls, (2) mutation analysis was performed on a total of 321 patients suffering from schizophrenia, and 2 case-control samples (3) 729 unrelated patients with schizophrenia and 564 healthy comparison subjects, and (4) sample comprising 1511 cases and 1517 healthy comparison subjects and were used for genetic association analysis of novel coding variants with schizophrenia. The VBM analysis suggests that rs1410403 might affect the volume of the left superior and middle temporal gyri (P=.011 and P=.013, respectively), which were reduced in patients with schizophrenia compared with healthy subjects. Moreover, 4 rare novel missense variants were detected. The mutations were followed-up in large independent sample, and one of the novel variants (Glu741Gly) was associated with schizophrenia (P=.02). These findings demonstrate that VAV3 can be seen as novel candidate gene for schizophrenia in which both rare and common variants may be related to increased genetic risk for schizophrenia in Japanese population.
Affiliation(s)
- Branko Aleksic
- Department of Psychiatry, Graduate School of Medicine, Nagoya University, Nagoya, Japan; Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, Tokyo, Japan
- Itaru Kushima
- Department of Psychiatry, Graduate School of Medicine, Nagoya University, Nagoya, Japan; Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, Tokyo, Japan
- Kazutaka Ohi
- Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, Tokyo, Japan; Department of Psychiatry, Graduate School of Medicine, Osaka University, Osaka, Japan
- Masashi Ikeda
- Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, Tokyo, Japan; Department of Psychiatry, School of Medicine, Fujita Health University, 1-98 Dengakugakubo, Kutsukake-cho, Toyoake, Aichi 470-1192, Japan
- Akira Yoshimi
- Department of Psychiatry, Graduate School of Medicine, Nagoya University, Nagoya, Japan; Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, Tokyo, Japan
- Yukako Nakamura
- Department of Psychiatry, Graduate School of Medicine, Nagoya University, Nagoya, Japan; Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, Tokyo, Japan
- Yoshihito Ito
- Department of Psychiatry, Graduate School of Medicine, Nagoya University, Nagoya, Japan; Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, Tokyo, Japan
- Tomo Okochi
- Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, Tokyo, Japan; Department of Psychiatry, School of Medicine, Fujita Health University, 1-98 Dengakugakubo, Kutsukake-cho, Toyoake, Aichi 470-1192, Japan
- Yasuhisa Fukuo
- Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, Tokyo, Japan; Department of Psychiatry, School of Medicine, Fujita Health University, 1-98 Dengakugakubo, Kutsukake-cho, Toyoake, Aichi 470-1192, Japan
- Yuka Yasuda
- Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, Tokyo, Japan; Department of Psychiatry, Graduate School of Medicine, Osaka University, Osaka, Japan
- Motoyuki Fukumoto
- Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, Tokyo, Japan; Department of Psychiatry, Graduate School of Medicine, Osaka University, Osaka, Japan
- Hidenaga Yamamori
- Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, Tokyo, Japan; Department of Psychiatry, Graduate School of Medicine, Osaka University, Osaka, Japan
- Hiroshi Ujike
- Department of Neuropsychiatry, Graduate School of Medicine, Dentistry and Pharmaceutical Sciences, Okayama University, Okayama, Japan
- Michio Suzuki
- Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, Tokyo, Japan; Department of Neuropsychiatry, Graduate School of Medicine and Pharmaceutical Sciences, University of Toyama, Toyama, Japan
- Toshiya Inada
- Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, Tokyo, Japan; Seiwa Hospital, Institute of Neuropsychiatry, Tokyo, Japan
- Masatoshi Takeda
- Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, Tokyo, Japan; Department of Psychiatry, Graduate School of Medicine, Osaka University, Osaka, Japan
- Kozo Kaibuchi
- Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, Tokyo, Japan; Department of Cell Pharmacology, Graduate School of Medicine, Nagoya University, Nagoya, Japan
- Nakao Iwata
- Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, Tokyo, Japan; Department of Psychiatry, School of Medicine, Fujita Health University, 1-98 Dengakugakubo, Kutsukake-cho, Toyoake, Aichi 470-1192, Japan. To whom correspondence should be addressed; tel: 81-562-93-2000, fax: 81-562-93-1831
- Norio Ozaki
- Department of Psychiatry, Graduate School of Medicine, Nagoya University, Nagoya, Japan; Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, Tokyo, Japan
758
McCorvie TJ, Gleason TJ, Fridovich-Keil JL, Timson DJ. Misfolding of galactose 1-phosphate uridylyltransferase can result in type I galactosemia. Biochim Biophys Acta Mol Basis Dis 2013; 1832:1279-93. [PMID: 23583749 DOI: 10.1016/j.bbadis.2013.04.004] [Received: 02/11/2013] [Revised: 03/27/2013] [Accepted: 04/02/2013]
Abstract
Type I galactosemia is a genetic disorder that is caused by the impairment of galactose-1-phosphate uridylyltransferase (GALT; EC 2.7.7.12). Although a large number of mutations have been detected through genetic screening of the human GALT (hGALT) locus, for many it is not known how they cause their effects. The majority of these mutations are missense, with predicted substitutions scattered throughout the enzyme structure and thus causing impairment by other means rather than direct alterations to the active site. To clarify the fundamental, molecular basis of hGALT impairment we studied five disease-associated variants p.D28Y, p.L74P, p.F171S, p.F194L and p.R333G using both a yeast model and purified, recombinant proteins. In a yeast expression system there was a correlation between lysate activity and the ability to rescue growth in the presence of galactose, except for p.R333G. Kinetic analysis of the purified proteins quantified each variant's level of enzymatic impairment and demonstrated that this was largely due to altered substrate binding. Increased surface hydrophobicity, altered thermal stability and changes in proteolytic sensitivity were also detected. Our results demonstrate that hGALT requires a level of flexibility to function optimally and that altered folding is the underlying reason of impairment in all the variants tested here. This indicates that misfolding is a common, molecular basis of hGALT deficiency and suggests the potential of pharmacological chaperones and proteostasis regulators as novel therapeutic approaches for type I galactosemia.
Affiliation(s)
- Thomas J McCorvie
- School of Biological Sciences, Queen's University Belfast, Medical Biology Centre, 97 Lisburn Road, Belfast, BT9 7BL, UK
759
Quintana MA, Schumacher FR, Casey G, Bernstein JL, Li L, Conti DV. Incorporating prior biologic information for high-dimensional rare variant association studies. Hum Hered 2013; 74:184-95. [PMID: 23594496 DOI: 10.1159/000346021]
Abstract
BACKGROUND Given the increasing scale of rare variant association studies, we introduce a method for high-dimensional studies that integrates multiple sources of data as well as allows for multiple region-specific risk indices. METHODS Our method builds upon the previous Bayesian risk index by integrating external biological variant-specific covariates to help guide the selection of associated variants and regions. Our extension also incorporates a second level of uncertainty as to which regions are associated with the outcome of interest. RESULTS Using a set of study-based simulations, we show that our approach leads to an increase in power to detect true associations in comparison to several commonly used alternatives. Additionally, the method provides multi-level inference at the pathway, region and variant levels. CONCLUSION To demonstrate the flexibility of the method to incorporate various types of information and the applicability to high-dimensional data, we apply our method to a single region within a candidate gene study of second primary breast cancer and to multiple regions within a candidate pathway study of colon cancer.
Affiliation(s)
- Melanie A Quintana
- Department of Preventive Medicine, University of Southern California, Los Angeles, CA 90089, USA
760
A homozygous telomerase T-motif variant resulting in markedly reduced repeat addition processivity in siblings with Hoyeraal Hreidarsson syndrome. Blood 2013; 121:3586-93. [PMID: 23538340 DOI: 10.1182/blood-2012-08-447755]
Abstract
Hoyeraal Hreidarsson syndrome (HHS) is a form of dyskeratosis congenita (DC) characterized by bone marrow failure, intrauterine growth retardation, developmental delay, microcephaly, cerebellar hypoplasia, immunodeficiency, and extremely short telomeres. As with DC, mutations in genes encoding factors required for telomere maintenance, such as telomerase reverse transcriptase (TERT), have been found in patients with HHS. We describe 2 sibling HHS cases caused by a homozygous mutation (p.T567M) within the TERT T motif. This mutation resulted in a marked reduction in the capacity of telomerase to processively synthesize telomeric repeats, indicating a role for the T motif in this unique aspect of telomerase function. We support this finding by demonstrating defective processivity in the previously reported p.K570N T-motif mutation. The consanguineous, heterozygous p.T567M parents exhibited telomere lengths around the first percentile and no evidence of a DC phenotype. Although heterozygous processivity defects have been associated with familial, adult-onset pulmonary fibrosis, these cases demonstrate the severe clinical and functional impact of biallelic processivity mutations. Thus, despite retaining the capacity to add short stretches of telomeric repeats onto the shortest telomeres, sole expression of telomerase processivity mutants can lead to a profound failure of telomere maintenance and early-onset multisystem disease.
761
Nguyen H, Luu TD, Poch O, Thompson JD. Knowledge discovery in variant databases using inductive logic programming. Bioinform Biol Insights 2013; 7:119-31. [PMID: 23589683 PMCID: PMC3615990 DOI: 10.4137/bbi.s11184]
Abstract
Understanding the effects of genetic variation on the phenotype of an individual is a major goal of biomedical research, especially for the development of diagnostics and effective therapeutic solutions. In this work, we describe the use of a recent knowledge discovery from database (KDD) approach using inductive logic programming (ILP) to automatically extract knowledge about human monogenic diseases. We extracted background knowledge from MSV3d, a database of all human missense variants mapped to 3D protein structure. In this study, we identified 8,117 mutations in 805 proteins with known three-dimensional structures that were known to be involved in human monogenic disease. Our results help to improve our understanding of the relationships between structural, functional or evolutionary features and deleterious mutations. Our inferred rules can also be applied to predict the impact of any single amino acid replacement on the function of a protein. The interpretable rules are available at http://decrypthon.igbmc.fr/kd4v/.
Affiliation(s)
- Hoan Nguyen
- Laboratoire de Bioinformatique et Génomique Intégratives, Institut de Génétique et de Biologie Moléculaire et Cellulaire Illkirch, France
762
Liu L, Kumar S. Evolutionary balancing is critical for correctly forecasting disease-associated amino acid variants. Mol Biol Evol 2013; 30:1252-7. [PMID: 23462317 DOI: 10.1093/molbev/mst037]
Abstract
Computational predictions have become indispensable for evaluating the disease-related impact of nonsynonymous single-nucleotide variants discovered in exome sequencing. Many such methods have their roots in molecular evolution, as they use information derived from multiple sequence alignments. We show that the performance of current methods (e.g., PolyPhen-2 and SIFT) is improved significantly by optimizing their statistical models on evolutionarily balanced training data, where equal numbers of positive and negative controls within each evolutionary conservation class are used. Evolutionary balancing significantly reduces the false-positive rates for variants observed at highly conserved sites and false-negative rates for variants observed at fast evolving sites. Use of these improved methods enables more accurate forecasting when concordant diagnosis from multiple methods is regarded as a more reliable indicator of the prediction. Applied to a large exome variation data set, we find that the current methods produce concordant predictions for less than half of the population variants. These advances are implemented in a web resource for use in practical applications (www.mypeg.info, last accessed March 13, 2013).
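The balancing step described in this abstract (equal numbers of positive and negative controls within each evolutionary conservation class) can be illustrated with a short downsampling sketch. This is an assumption-laden illustration rather than the authors' implementation: the record keys `conservation_class` and `is_disease`, the function name `balance_by_conservation_class`, and the choice to downsample the larger group at random are all hypothetical.

```python
import random
from collections import defaultdict


def balance_by_conservation_class(variants, seed=0):
    """Return a subset with equal disease/neutral counts in every class.

    `variants` is an iterable of dicts with (hypothetical) keys
    "conservation_class" (a string label) and "is_disease" (bool).
    Within each class the larger group is randomly downsampled to the
    size of the smaller one.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    by_class = defaultdict(lambda: {True: [], False: []})
    for v in variants:
        by_class[v["conservation_class"]][v["is_disease"]].append(v)

    balanced = []
    for cls in sorted(by_class):
        positives = by_class[cls][True]
        negatives = by_class[cls][False]
        n = min(len(positives), len(negatives))
        balanced.extend(rng.sample(positives, n))
        balanced.extend(rng.sample(negatives, n))
    return balanced


if __name__ == "__main__":
    toy = (
        [{"conservation_class": "high", "is_disease": True}] * 3
        + [{"conservation_class": "high", "is_disease": False}] * 1
        + [{"conservation_class": "low", "is_disease": True}] * 2
        + [{"conservation_class": "low", "is_disease": False}] * 5
    )
    print(len(balance_by_conservation_class(toy)))
```

A classifier trained on the balanced subset sees the same ratio of disease to neutral variants at conserved and fast-evolving sites, which is the property the abstract links to lower false-positive and false-negative rates.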
Affiliation(s)
- Li Liu
- Center for Evolutionary Medicine and Informatics, Biodesign Institute, Arizona State University, USA
763
Chang LW, Viader A, Varghese N, Payton JE, Milbrandt J, Nagarajan R. An integrated approach to characterize transcription factor and microRNA regulatory networks involved in Schwann cell response to peripheral nerve injury. BMC Genomics 2013; 14:84. [PMID: 23387820 PMCID: PMC3599357 DOI: 10.1186/1471-2164-14-84] [Received: 01/28/2012] [Accepted: 01/29/2013]
Abstract
Background The regenerative response of Schwann cells after peripheral nerve injury is a critical process directly related to the pathophysiology of a number of neurodegenerative diseases. This SC injury response is dependent on an intricate gene regulatory program coordinated by a number of transcription factors and microRNAs, but the interactions among them remain largely unknown. Uncovering the transcriptional and post-transcriptional regulatory networks governing the Schwann cell injury response is a key step towards a better understanding of Schwann cell biology and may help develop novel therapies for related diseases. Performing such comprehensive network analysis requires systematic bioinformatics methods to integrate multiple genomic datasets. Results In this study we present a computational pipeline to infer transcription factor and microRNA regulatory networks. Our approach combined mRNA and microRNA expression profiling data, ChIP-Seq data of transcription factors, and computational transcription factor and microRNA target prediction. Using mRNA and microRNA expression data collected in a Schwann cell injury model, we constructed a regulatory network and studied regulatory pathways involved in Schwann cell response to injury. Furthermore, we analyzed network motifs and obtained insights on cooperative regulation of transcription factors and microRNAs in Schwann cell injury recovery. Conclusions This work demonstrates a systematic method for gene regulatory network inference that may be used to gain new information on gene regulation by transcription factors and microRNAs.
Affiliation(s)
- Li-Wei Chang
- Department of Pathology and Immunology, Washington University School of Medicine, 660 South Euclid Ave, St. Louis, MO 63110, USA
764
Howe JR, Dahdaleh FS, Carr JC, Wang D, Sherman SK, Howe JR. BMPR1A mutations in juvenile polyposis affect cellular localization. J Surg Res 2013; 184:739-45. [PMID: 23433720 DOI: 10.1016/j.jss.2013.01.015] [Received: 08/27/2012] [Revised: 11/19/2012] [Accepted: 01/10/2013]
Abstract
BACKGROUND Juvenile polyposis (JP) is characterized by the development of hamartomatous polyps of the gastrointestinal tract that collectively carry a significant risk of malignant transformation. Mutations in the bone morphogenetic protein receptor type 1A (BMPR1A) are known to predispose to JP. We set out to study the effect of such missense mutations on BMPR1A cellular localization. METHODS We chose eight distinct mutations for analysis. We tagged a BMPR1A wild-type (WT) expression plasmid with green fluorescent protein on its C-terminus. Site-directed mutagenesis was used to recreate JP patient mutations from the WT-green fluorescent protein BMPR1A plasmid. We verified mutant expression vector sequences by direct sequencing. First, we transfected BMPR1A expression vectors into HEK-293T cells; then, we performed confocal microscopy to determine cellular localization. Four independent observers used a scoring system from 1 to 3 to categorize the degree of membrane versus cellular localization. RESULTS Of the eight selected mutations, one was within the signaling peptide, four were within the extracellular domain, and three were within the intracellular domain. The WT BMPR1A vector had strong membrane staining, whereas all eight mutations had much less membrane and much more intracellular localization. Enzyme-linked immunosorbent assays for BMPR1A demonstrated no significant differences in protein quantities between constructs, except for one affecting the start codon. CONCLUSIONS Bone morphogenetic protein receptor type 1A missense mutations occurring in patients with JP affected cellular localization in an in vitro model. These findings suggest a mechanism by which such mutations can lead to disease by altering downstream signaling through the bone morphogenetic protein pathway.
Affiliation(s)
- James R Howe
- Division of Surgical Oncology and Endocrine Surgery, Department of Surgery, University of Iowa Hospitals and Clinics, Iowa City, Iowa.
765
Prediction of deleterious nonsynonymous single-nucleotide polymorphism for human diseases. ScientificWorldJournal 2013; 2013:675851. [PMID: 23431257 PMCID: PMC3572689 DOI: 10.1155/2013/675851] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Received: 10/27/2012] [Accepted: 12/11/2012] [Indexed: 12/13/2022] Open
Abstract
The identification of genetic variants that are responsible for human inherited diseases is a fundamental problem in human and medical genetics. As a typical type of genetic variation, nonsynonymous single-nucleotide polymorphisms (nsSNPs) occurring in protein coding regions may alter the encoded amino acid, potentially affect protein structure and function, and further result in human inherited diseases. Therefore, it is of great importance to develop computational approaches to facilitate the discrimination of deleterious nsSNPs from neutral ones. In this paper, we review databases that collect nsSNPs and summarize computational methods for the identification of deleterious nsSNPs. We classify the existing methods for characterizing nsSNPs into three categories (sequence based, structure based, and annotation based), and we introduce machine learning models for the prediction of deleterious nsSNPs. We further discuss methods for identifying deleterious nsSNPs in noncoding variants and those for dealing with rare variants.
766
Pabinger S, Dander A, Fischer M, Snajder R, Sperk M, Efremova M, Krabichler B, Speicher MR, Zschocke J, Trajanoski Z. A survey of tools for variant analysis of next-generation genome sequencing data. Brief Bioinform 2013; 15:256-78. [PMID: 23341494 PMCID: PMC3956068 DOI: 10.1093/bib/bbs086] [Citation(s) in RCA: 335] [Impact Index Per Article: 27.9] [Indexed: 01/11/2023] Open
Abstract
Recent advances in genome sequencing technologies provide unprecedented opportunities to characterize individual genomic landscapes and identify mutations relevant for diagnosis and therapy. Specifically, whole-exome sequencing using next-generation sequencing (NGS) technologies is gaining popularity in the human genetics community due to the moderate costs, manageable data amounts and straightforward interpretation of analysis results. While whole-exome and, in the near future, whole-genome sequencing are becoming commodities, data analysis still poses significant challenges and led to the development of a plethora of tools supporting specific parts of the analysis workflow or providing a complete solution. Here, we surveyed 205 tools for whole-genome/whole-exome sequencing data analysis supporting five distinct analytical steps: quality assessment, alignment, variant identification, variant annotation and visualization. We report an overview of the functionality, features and specific requirements of the individual tools. We then selected 32 programs for variant identification, variant annotation and visualization, which were subjected to hands-on evaluation using four data sets: one set of exome data from two patients with a rare disease for testing identification of germline mutations, two cancer data sets for testing variant callers for somatic mutations, copy number variations and structural variations, and one semi-synthetic data set for testing identification of copy number variations. Our comprehensive survey and evaluation of NGS tools provides a valuable guideline for human geneticists working on Mendelian disorders, complex diseases and cancers.
Affiliation(s)
- Stephan Pabinger
- Division for Bioinformatics, Innsbruck Medical University, Innrain 80, 6020 Innsbruck, Austria. Tel.: +43-512-9003-71401; Fax: +43-512-9003-73100;
767
Li MX, Kwan JSH, Bao SY, Yang W, Ho SL, Song YQ, Sham PC. Predicting mendelian disease-causing non-synonymous single nucleotide variants in exome sequencing studies. PLoS Genet 2013; 9:e1003143. [PMID: 23341771 PMCID: PMC3547823 DOI: 10.1371/journal.pgen.1003143] [Citation(s) in RCA: 110] [Impact Index Per Article: 9.2] [Received: 03/01/2012] [Accepted: 10/20/2012] [Indexed: 12/19/2022] Open
Abstract
Exome sequencing is becoming a standard tool for mapping Mendelian disease-causing (or pathogenic) non-synonymous single nucleotide variants (nsSNVs). Minor allele frequency (MAF) filtering approach and functional prediction methods are commonly used to identify candidate pathogenic mutations in these studies. Combining multiple functional prediction methods may increase accuracy in prediction. Here, we propose to use a logit model to combine multiple prediction methods and compute an unbiased probability of a rare variant being pathogenic. Also, for the first time we assess the predictive power of seven prediction methods (including SIFT, PolyPhen2, CONDEL, and logit) in predicting pathogenic nsSNVs from other rare variants, which reflects the situation after MAF filtering is done in exome-sequencing studies. We found that a logit model combining all or some original prediction methods outperforms other methods examined, but is unable to discriminate between autosomal dominant and autosomal recessive disease mutations. Finally, based on the predictions of the logit model, we estimate that an individual has around 5% of rare nsSNVs that are pathogenic and carries ~22 pathogenic derived alleles at least, which if made homozygous by consanguineous marriages may lead to recessive diseases.
Affiliation(s)
- Miao-Xin Li
- Department of Psychiatry, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People′s Republic of China
- Centre for Reproduction, Development and Growth, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People′s Republic of China
- Centre for Genomic Sciences, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People′s Republic of China
- * E-mail: (M-XL); (PCS)
- Johnny S. H. Kwan
- Department of Psychiatry, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People′s Republic of China
- Department of Medicine, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People′s Republic of China
- Su-Ying Bao
- Department of Biochemistry, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People′s Republic of China
- Wanling Yang
- Department of Paediatrics and Adolescent Medicine, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People′s Republic of China
- Shu-Leong Ho
- Department of Medicine, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People′s Republic of China
- Yong-Qiang Song
- Department of Biochemistry, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People′s Republic of China
- Pak C. Sham
- Department of Psychiatry, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People′s Republic of China
- Centre for Reproduction, Development and Growth, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People′s Republic of China
- Centre for Genomic Sciences, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People′s Republic of China
- State Key Laboratory for Cognitive and Brain Sciences, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People′s Republic of China
- * E-mail: (M-XL); (PCS)
768
Fu W, O'Connor TD, Jun G, Kang HM, Abecasis G, Leal SM, Gabriel S, Altshuler D, Shendure J, Nickerson DA, Bamshad MJ, GO B, GO S, Akey JM. Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants. Nature 2013; 493:216-20. [PMID: 23201682 PMCID: PMC3676746 DOI: 10.1038/nature11690] [Citation(s) in RCA: 697] [Impact Index Per Article: 58.1] [Received: 07/13/2012] [Accepted: 10/19/2012] [Indexed: 01/07/2023]
Abstract
Establishing the age of each mutation segregating in contemporary human populations is important to fully understand our evolutionary history and will help to facilitate the development of new approaches for disease-gene discovery. Large-scale surveys of human genetic variation have reported signatures of recent explosive population growth, notable for an excess of rare genetic variants, suggesting that many mutations arose recently. To more quantitatively assess the distribution of mutation ages, we resequenced 15,336 genes in 6,515 individuals of European American and African American ancestry and inferred the age of 1,146,401 autosomal single nucleotide variants (SNVs). We estimate that approximately 73% of all protein-coding SNVs and approximately 86% of SNVs predicted to be deleterious arose in the past 5,000-10,000 years. The average age of deleterious SNVs varied significantly across molecular pathways, and disease genes contained a significantly higher proportion of recently arisen deleterious SNVs than other genes. Furthermore, European Americans had an excess of deleterious variants in essential and Mendelian disease genes compared to African Americans, consistent with weaker purifying selection due to the Out-of-Africa dispersal. Our results better delimit the historical details of human protein-coding variation, show the profound effect of recent human history on the burden of deleterious SNVs segregating in contemporary populations, and provide important practical information that can be used to prioritize variants in disease-gene discovery.
Affiliation(s)
- Wenqing Fu
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Timothy D. O'Connor
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Goo Jun
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
- Hyun Min Kang
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
- Goncalo Abecasis
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
- Suzanne M. Leal
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Stacey Gabriel
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- David Altshuler
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Deborah A. Nickerson
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Michael J. Bamshad
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Department of Pediatrics, University of Washington, Seattle, Washington, USA
- Joshua M. Akey
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
769
Kwasnieski JC, Mogno I, Myers CA, Corbo JC, Cohen BA. Complex effects of nucleotide variants in a mammalian cis-regulatory element. Proc Natl Acad Sci U S A 2012; 109:19498-503. [PMID: 23129659 PMCID: PMC3511131 DOI: 10.1073/pnas.1210678109] [Citation(s) in RCA: 172] [Impact Index Per Article: 13.2] [Indexed: 11/18/2022] Open
Abstract
Cis-regulatory elements (CREs) control gene expression by recruiting transcription factors (TFs) and other DNA binding proteins. We aim to understand how individual nucleotides contribute to the function of CREs. Here we introduce CRE analysis by sequencing (CRE-seq), a high-throughput method for producing and testing large numbers of reporter genes in mammalian cells. We used CRE-seq to assay >1,000 single and double nucleotide mutations in a 52-bp CRE in the Rhodopsin promoter that drives strong and specific expression in mammalian photoreceptors. We find that this particular CRE is remarkably complex. The majority (86%) of single nucleotide substitutions in this sequence exert significant effects on regulatory activity. Although changes in the affinity of known TF binding sites explain some of these expression changes, we present evidence for complex phenomena, including binding site turnover and TF competition. Analysis of double mutants revealed complex, nucleotide-specific interactions between residues in different TF binding sites. We conclude that some mammalian CREs are finely tuned by evolution and function through complex, nonadditive interactions between bound TFs. CRE-seq will be an important tool to uncover the rules that govern these interactions.
Affiliation(s)
- Jamie C. Kwasnieski
- Center for Genome Sciences and Systems Biology, Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108
- Ilaria Mogno
- Center for Genome Sciences and Systems Biology, Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108
- Connie A. Myers
- Department of Pathology and Immunology, Washington University School of Medicine in St. Louis, St. Louis, MO 63110
- Joseph C. Corbo
- Department of Pathology and Immunology, Washington University School of Medicine in St. Louis, St. Louis, MO 63110
- Barak A. Cohen
- Center for Genome Sciences and Systems Biology, Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108
770
Affiliation(s)
- Monique Ohanian
- Molecular Cardiology Division, Victor Chang Cardiac Research Institute, Sydney, New South Wales, Australia
771
Sifrim A, Van Houdt JKJ, Tranchevent LC, Nowakowska B, Sakai R, Pavlopoulos GA, Devriendt K, Vermeesch JR, Moreau Y, Aerts J. Annotate-it: a Swiss-knife approach to annotation, analysis and interpretation of single nucleotide variation in human disease. Genome Med 2012; 4:73. [PMID: 23013645 PMCID: PMC3580443 DOI: 10.1186/gm374] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Received: 08/31/2012] [Revised: 09/14/2012] [Accepted: 09/26/2012] [Indexed: 12/18/2022] Open
Abstract
The increasing size and complexity of exome/genome sequencing data requires new tools for clinical geneticists to discover disease-causing variants. Bottlenecks in identifying the causative variation include poor cross-sample querying, constantly changing functional annotation and not considering existing knowledge concerning the phenotype. We describe a methodology that facilitates exploration of patient sequencing data towards identification of causal variants under different genetic hypotheses. Annotate-it facilitates handling, analysis and interpretation of high-throughput single nucleotide variant data. We demonstrate our strategy using three case studies. Annotate-it is freely available and test data are accessible to all users at http://www.annotate-it.org.
Affiliation(s)
- Alejandro Sifrim
- KU Leuven, Department of Electrical Engineering-ESAT, SCD-SISTA, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- IBBT Future Health Department, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- Jeroen KJ Van Houdt
- KU Leuven, Centre for Human Genetics, University Hospital Gasthuisberg, Herestraat 49, 3000 Leuven, Belgium
- Leon-Charles Tranchevent
- KU Leuven, Department of Electrical Engineering-ESAT, SCD-SISTA, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- IBBT Future Health Department, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- Beata Nowakowska
- KU Leuven, Centre for Human Genetics, University Hospital Gasthuisberg, Herestraat 49, 3000 Leuven, Belgium
- Ryo Sakai
- KU Leuven, Department of Electrical Engineering-ESAT, SCD-SISTA, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- IBBT Future Health Department, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- Georgios A Pavlopoulos
- KU Leuven, Department of Electrical Engineering-ESAT, SCD-SISTA, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- IBBT Future Health Department, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- Koen Devriendt
- KU Leuven, Centre for Human Genetics, University Hospital Gasthuisberg, Herestraat 49, 3000 Leuven, Belgium
- Joris R Vermeesch
- KU Leuven, Centre for Human Genetics, University Hospital Gasthuisberg, Herestraat 49, 3000 Leuven, Belgium
- Yves Moreau
- KU Leuven, Department of Electrical Engineering-ESAT, SCD-SISTA, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- IBBT Future Health Department, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- Jan Aerts
- KU Leuven, Department of Electrical Engineering-ESAT, SCD-SISTA, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- IBBT Future Health Department, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
772
Sunyaev SR. Inferring causality and functional significance of human coding DNA variants. Hum Mol Genet 2012; 21:R10-7. [PMID: 22990389 DOI: 10.1093/hmg/dds385] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Indexed: 12/22/2022] Open
Abstract
Sequencing technology enables the complete characterization of human genetic variation. Statistical genetics studies identify numerous loci linked to or associated with phenotypes of direct medical interest. The major remaining challenge is to characterize functionally significant alleles that are causally implicated in the genetic basis of human traits. Here, I review three sources of evidence for the functional significance of human DNA variants in protein-coding genes. These include (i) statistical genetics considerations such as co-segregation with the phenotype, allele frequency in unaffected controls and recurrence; (ii) in vitro functional assays and model organism experiments; and (iii) computational methods for predicting the functional effect of amino acid substitutions. In spite of many successes of recent studies, functional characterization of human allelic variants remains problematic.
Affiliation(s)
- Shamil R Sunyaev
- Genetics Division, Brigham and Women's Hospital, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA.
773
Kanthaswamy S, Ng J, Ross CT, Trask JS, Smith DG, Buffalo VS, Fass JN, Lin D. Identifying human-rhesus macaque gene orthologs using heterospecific SNP probes. Genomics 2012; 101:30-7. [PMID: 22982528 DOI: 10.1016/j.ygeno.2012.09.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 03/17/2012] [Revised: 07/27/2012] [Accepted: 09/04/2012] [Indexed: 02/07/2023]
Abstract
We genotyped a Chinese and an Indian-origin rhesus macaque using the Affymetrix Genome-Wide Human SNP Array 6.0 and cataloged 85,473 uniquely mapping heterospecific SNPs. These SNPs were assigned to rhesus chromosomes according to their probe sequence alignments as displayed in the human and rhesus reference sequences. The conserved gene order (synteny) revealed by heterospecific SNP maps is in concordance with that of the published human and rhesus macaque genomes. Using these SNPs' original human rs numbers, we identified 12,328 genes annotated in humans that are associated with these SNPs, 3674 of which were found in at least one of the two rhesus macaques studied. Due to their density, the heterospecific SNPs allow fine-grained comparisons, including approximate boundaries of intra- and extra-chromosomal rearrangements involving gene orthologs, which can be used to distinguish rhesus macaque chromosomes from human chromosomes.
Affiliation(s)
- Sree Kanthaswamy
- Molecular Anthropology Lab., Dept. of Anthropology, UC Davis, CA, USA.
774
Giudicessi JR, Kapplinger JD, Tester DJ, Alders M, Salisbury BA, Wilde AAM, Ackerman MJ. Phylogenetic and physicochemical analyses enhance the classification of rare nonsynonymous single nucleotide variants in type 1 and 2 long-QT syndrome. Circ Cardiovasc Genet 2012; 5:519-28. [PMID: 22949429 DOI: 10.1161/circgenetics.112.963785] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.3] [Indexed: 12/31/2022]
Abstract
BACKGROUND Hundreds of nonsynonymous single nucleotide variants (nsSNVs) have been identified in the 2 most common long-QT syndrome-susceptibility genes (KCNQ1 and KCNH2). Unfortunately, an ≈3% background rate of rare KCNQ1 and KCNH2 nsSNVs amongst healthy individuals complicates the ability to distinguish rare pathogenic mutations from similarly rare yet presumably innocuous variants. METHODS AND RESULTS In this study, 4 tools [(1) conservation across species, (2) Grantham values, (3) sorting intolerant from tolerant, and (4) polymorphism phenotyping] were used to predict pathogenic or benign status for nsSNVs identified across 388 clinically definite long-QT syndrome cases and 1344 ostensibly healthy controls. From these data, estimated predictive values were determined for each tool independently, in concert with previously published protein topology-derived estimated predictive values, and synergistically when ≥3 tools were in agreement. Overall, all 4 tools displayed a statistically significant ability to distinguish between case-derived and control-derived nsSNVs in KCNQ1, whereas each tool, except Grantham values, displayed a similar ability to differentiate KCNH2 nsSNVs. Collectively, when at least 3 of the 4 tools agreed on the pathogenic status of C-terminal nsSNVs located outside the KCNH2/Kv11.1 cyclic nucleotide-binding domain, the topology-specific estimated predictive value improved from 56% to 91%. CONCLUSIONS Although in silico prediction tools should not be used to predict independently the pathogenicity of a novel, rare nsSNV, our results support the potential clinical use of the synergistic utility of these tools to enhance the classification of nsSNVs, particularly for Kv11.1's difficult-to-interpret C-terminal region.
Affiliation(s)
- John R Giudicessi
- Department of Medicine/Division of Cardiovascular Diseases, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA
775
Lehmann KV, Chen T. Exploring functional variant discovery in non-coding regions with SInBaD. Nucleic Acids Res 2012; 41:e7. [PMID: 22941663 PMCID: PMC3592431 DOI: 10.1093/nar/gks800] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Indexed: 01/09/2023] Open
Abstract
The 1000 Genomes Project and many similar ongoing large-scale sequencing efforts require new methods to predict functional variants in both coding and non-coding regions in order to understand phenotype and genotype relationships. We report the design of a new model, SInBaD (Sequence-Information-Based-Decision-model), which relies on nucleotide conservation information to evaluate any annotated human variant in all known exons, introns, splice junctions and promoter regions. SInBaD builds separate mathematical models for promoters, exons and introns, using the human disease mutations annotated in the Human Gene Mutation Database as the training dataset for functional variants. Ten-fold cross-validation shows high prediction accuracy. Validations on test datasets demonstrate that variants predicted as functional have a significantly higher occurrence in cancer patients. We also applied our model to variants found in four different individual human genomes to identify a set of functional variants, which might be of interest for further studies. Scores for any possible variants for all annotated genes are available under http://tingchenlab.cmb.usc.edu/sinbad/. SInBaD supports the current standard format for genotype data, the Variant Call Format (VCF 4.0), making it easy to integrate into any existing next-generation sequencing pipeline. The accuracy of SNP detection poses the only limitation to the use of SInBaD.
Affiliation(s)
- Kjong-Van Lehmann
- Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
776
Abstract
Leber congenital amaurosis (LCA) is an infantile-onset form of inherited retinal degeneration characterized by severe vision loss [1, 2]. Two-thirds of LCA cases are caused by mutations in 17 known disease genes [3] (RetNet Retinal Information Network). Using exome sequencing, we identified a homozygous missense mutation (c.25G>A, p.Val9Met) in NMNAT1 as likely disease-causing in two siblings of a consanguineous Pakistani kindred affected by LCA. This mutation segregated with disease in their kindred, including in three other children with LCA. NMNAT1 resides in the previously identified LCA9 locus and encodes the nuclear isoform of nicotinamide mononucleotide adenylyltransferase, a rate-limiting enzyme in nicotinamide adenine dinucleotide (NAD+) biosynthesis [4, 5]. Functional studies showed the p.Val9Met mutation decreased NMNAT1 enzyme activity. Sequencing NMNAT1 in 284 unrelated LCA families identified 14 rare mutations in 13 additional affected individuals. These results are the first to link an NMNAT isoform to disease and indicate that NMNAT1 mutations cause LCA.
777
Lyon GJ, Wang K. Identifying disease mutations in genomic medicine settings: current challenges and how to accelerate progress. Genome Med 2012; 4:58. [PMID: 22830651 PMCID: PMC3580414 DOI: 10.1186/gm359] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.4] [Indexed: 02/07/2023] Open
Abstract
The pace of exome and genome sequencing is accelerating, with the identification of many new disease-causing mutations in research settings, and it is likely that whole exome or genome sequencing could have a major impact in the clinical arena in the relatively near future. However, the human genomics community is currently facing several challenges, including phenotyping, sample collection, sequencing strategies, bioinformatics analysis, biological validation of variant function, clinical interpretation and validity of variant data, and delivery of genomic information to various constituents. Here we review these challenges and summarize the bottlenecks for the clinical application of exome and genome sequencing, and we discuss ways for moving the field forward. In particular, we urge the need for clinical-grade sample collection, high-quality sequencing data acquisition, digitalized phenotyping, rigorous generation of variant calls, and comprehensive functional annotation of variants. Additionally, we suggest that a 'networking of science' model that encourages much more collaboration and online sharing of medical history, genomic data and biological knowledge, including among research participants and consumers/patients, will help establish causation and penetrance for disease causal variants and genes. As we enter this new era of genomic medicine, we envision that consumer-driven and consumer-oriented efforts will take center stage, thus allowing insights from the human genome project to translate directly back into individualized medicine.
Affiliation(s)
- Gholson J Lyon
- Cold Spring Harbor Laboratory, New York, NY 11797, USA
- Institute for Genomic Medicine, Utah Foundation for Biomedical Research (UFBR), Salt Lake City, UT 84106, USA
- Kai Wang
- Institute for Genomic Medicine, Utah Foundation for Biomedical Research (UFBR), Salt Lake City, UT 84106, USA
- Zilkha Neurogenetic Institute, Department of Psychiatry and Preventive Medicine, University of Southern California, Los Angeles, CA 90089, USA
778
Tennessen JA, Bigham AW, O'Connor TD, Fu W, Kenny EE, Gravel S, McGee S, Do R, Liu X, Jun G, Kang HM, Jordan D, Leal SM, Gabriel S, Rieder MJ, Abecasis G, Altshuler D, Nickerson DA, Boerwinkle E, Sunyaev S, Bustamante CD, Bamshad MJ, Akey JM. Evolution and functional impact of rare coding variation from deep sequencing of human exomes. Science 2012; 337:64-9. [PMID: 22604720 PMCID: PMC3708544 DOI: 10.1126/science.1219240] [Citation(s) in RCA: 1237] [Impact Index Per Article: 95.2] [Indexed: 12/11/2022]
Abstract
As a first step toward understanding how rare variants contribute to risk for complex diseases, we sequenced 15,585 human protein-coding genes to an average median depth of 111× in 2440 individuals of European (n = 1351) and African (n = 1088) ancestry. We identified over 500,000 single-nucleotide variants (SNVs), the majority of which were rare (86% with a minor allele frequency less than 0.5%), previously unknown (82%), and population-specific (82%). On average, 2.3% of the 13,595 SNVs each person carried were predicted to affect protein function of ~313 genes per genome, and ~95.7% of SNVs predicted to be functionally important were rare. This excess of rare functional variants is due to the combined effects of explosive, recent accelerated population growth and weak purifying selection. Furthermore, we show that large sample sizes will be required to associate rare variants with complex traits.
Affiliation(s)
- Jacob A. Tennessen
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Abigail W. Bigham
- Department of Pediatrics, University of Washington, Seattle, WA 98195, USA
| | - Timothy D. O'Connor
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Wenqing Fu
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Eimear E. Kenny
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Simon Gravel
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Sean McGee
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Ron Do
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- The Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Xiaoming Liu
- Human Genetics Center, University of Texas Health Sciences Center at Houston, Houston, TX 77030, USA
| | - Goo Jun
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Hyun Min Kang
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Daniel Jordan
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Suzanne M. Leal
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Stacey Gabriel
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Mark J. Rieder
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Goncalo Abecasis
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | | | | | - Eric Boerwinkle
- Human Genetics Center, University of Texas Health Sciences Center at Houston, Houston, TX 77030, USA
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Shamil Sunyaev
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | | | - Michael J. Bamshad
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Department of Pediatrics, University of Washington, Seattle, WA 98195, USA
| | - Joshua M. Akey
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
|
779
|
Chang X, Wang K. wANNOVAR: annotating genetic variants for personal genomes via the web. J Med Genet 2012; 49:433-6. [PMID: 22717648 DOI: 10.1136/jmedgenet-2012-100918] [Citation(s) in RCA: 330] [Impact Index Per Article: 25.4] [Indexed: 01/21/2023]
Abstract
BACKGROUND High-throughput DNA sequencing platforms have become widely available. As a result, personal genomes are increasingly being sequenced in research and clinical settings. However, the resulting massive amounts of variant data pose significant challenges to the average biologist or clinician without bioinformatics skills. METHODS AND RESULTS We developed a web server called wANNOVAR to address the critical needs for functional annotation of genetic variants from personal genomes. The server provides a simple and intuitive interface to help users determine the functional significance of variants. These include annotating single nucleotide variants and insertions/deletions for their effects on genes, reporting their conservation levels (such as PhyloP and GERP++ scores), calculating their predicted functional importance scores (such as SIFT and PolyPhen scores), retrieving allele frequencies in public databases (such as the 1000 Genomes Project and NHLBI-ESP 5400 exomes), and implementing a 'variants reduction' protocol to identify a subset of potentially deleterious variants/genes. We illustrated how wANNOVAR can help draw biological insights from sequencing data, by analysing genetic variants generated on two Mendelian diseases. CONCLUSIONS We conclude that wANNOVAR will help biologists and clinicians take advantage of the personal genome information to expedite scientific discoveries. The wANNOVAR server is available at http://wannovar.usc.edu, and will be continuously updated to reflect the latest annotation information.
Affiliation(s)
- Xiao Chang
- Zilkha Neurogenetic Institute, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
|
780
|
Dudley JT, Kim Y, Liu L, Markov GJ, Gerold K, Chen R, Butte AJ, Kumar S. Human genomic disease variants: a neutral evolutionary explanation. Genome Res 2012; 22:1383-94. [PMID: 22665443 PMCID: PMC3409252 DOI: 10.1101/gr.133702.111] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Indexed: 12/12/2022]
Abstract
Many perspectives on the role of evolution in human health include nonempirical assumptions concerning the adaptive evolutionary origins of human diseases. Evolutionary analyses of the increasing wealth of clinical and population genomic data have begun to challenge these presumptions. In order to systematically evaluate such claims, the time has come to build a common framework for an empirical and intellectual unification of evolution and modern medicine. We review the emerging evidence and provide a supporting conceptual framework that establishes the classical neutral theory of molecular evolution (NTME) as the basis for evaluating disease-associated genomic variations in health and medicine. For over a decade, the NTME has already explained the origins and distribution of variants implicated in diseases and has illuminated the power of evolutionary thinking in genomic medicine. We suggest that a majority of disease variants in modern populations will have neutral evolutionary origins (previously neutral), with a relatively smaller fraction exhibiting adaptive evolutionary origins (previously adaptive). This pattern is expected to hold true for common as well as rare disease variants. Ultimately, a neutral evolutionary perspective will provide medicine with an informative and actionable framework that enables objective clinical assessment beyond convenient tendencies to invoke past adaptive events in human history as a root cause of human disease.
Affiliation(s)
- Joel T Dudley
- Program in Biomedical Informatics, Stanford University School of Medicine, Stanford, California 94305, USA
|
781
|
Leongamornlert D, Mahmud N, Tymrakiewicz M, Saunders E, Dadaev T, Castro E, Goh C, Govindasami K, Guy M, O'Brien L, Sawyer E, Hall A, Wilkinson R, Easton D, Goldgar D, Eeles R, Kote-Jarai Z. Germline BRCA1 mutations increase prostate cancer risk. Br J Cancer 2012; 106:1697-701. [PMID: 22516946 PMCID: PMC3349179 DOI: 10.1038/bjc.2012.146] [Citation(s) in RCA: 209] [Impact Index Per Article: 16.1] [Received: 12/12/2011] [Revised: 02/24/2012] [Accepted: 03/25/2012] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND Prostate cancer (PrCa) is one of the most common cancers affecting men but its aetiology is poorly understood. Family history of PrCa, particularly at a young age, is a strong risk factor. There have been previous reports of increased PrCa risk in male BRCA1 mutation carriers in female breast cancer families, but there is a controversy as to whether this risk is substantiated. We sought to evaluate the role of germline BRCA1 mutations in PrCa predisposition by performing a candidate gene study in a large UK population sample set. METHODS We screened 913 cases aged 36–86 years for germline BRCA1 mutation, with the study enriched for cases with an early age of onset. We analysed the entire coding region of the BRCA1 gene using Sanger sequencing. Multiplex ligation-dependent probe amplification was also used to assess the frequency of large rearrangements in 460 cases. RESULTS We identified 4 deleterious mutations and 45 unclassified variants (UV). The frequency of deleterious BRCA1 mutation in this study is 0.45%; three of the mutation carriers were affected at age 65 years and one developed PrCa at 69 years. Using previously estimated population carrier frequencies, deleterious BRCA1 mutations confer a relative risk of PrCa of ~3.75-fold (95% confidence interval 1.02–9.6), translating to an 8.6% cumulative risk by age 65. CONCLUSION This study shows evidence for an increased risk of PrCa in men who harbour germline mutations in BRCA1. This could have a significant impact on possible screening strategies and targeted treatments.
Affiliation(s)
- D Leongamornlert
- Oncogenetics Team, The Institute of Cancer Research, Sutton SM2 5NG, UK
| | - N Mahmud
- Oncogenetics Team, The Institute of Cancer Research, Sutton SM2 5NG, UK
| | - M Tymrakiewicz
- Oncogenetics Team, The Institute of Cancer Research, Sutton SM2 5NG, UK
| | - E Saunders
- Oncogenetics Team, The Institute of Cancer Research, Sutton SM2 5NG, UK
| | - T Dadaev
- Oncogenetics Team, The Institute of Cancer Research, Sutton SM2 5NG, UK
| | - E Castro
- Oncogenetics Team, The Institute of Cancer Research, Sutton SM2 5NG, UK
| | - C Goh
- Oncogenetics Team, The Institute of Cancer Research, Sutton SM2 5NG, UK
| | - K Govindasami
- Oncogenetics Team, The Institute of Cancer Research, Sutton SM2 5NG, UK
| | - M Guy
- Oncogenetics Team, The Institute of Cancer Research, Sutton SM2 5NG, UK
| | - L O'Brien
- Oncogenetics Team, The Institute of Cancer Research, Sutton SM2 5NG, UK
| | - E Sawyer
- Oncogenetics Team, The Institute of Cancer Research, Sutton SM2 5NG, UK
| | - A Hall
- Oncogenetics Team, The Institute of Cancer Research, Sutton SM2 5NG, UK
| | - R Wilkinson
- Oncogenetics Team, The Institute of Cancer Research, Sutton SM2 5NG, UK
| | - D Easton
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, Strangeways Laboratory, Cambridge CB1 8RN, UK
| | - The UKGPCS Collaborators
- Oncogenetics Team, The Institute of Cancer Research, Sutton SM2 5NG, UK
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, Strangeways Laboratory, Cambridge CB1 8RN, UK
- Department of Dermatology, University of Utah, Salt Lake City, UT 84132, USA
- The Royal Marsden NHS Foundation Trust, Sutton SM2 5NG, UK
| | - D Goldgar
- Department of Dermatology, University of Utah, Salt Lake City, UT 84132, USA
| | - R Eeles
- Oncogenetics Team, The Institute of Cancer Research, Sutton SM2 5NG, UK
- The Royal Marsden NHS Foundation Trust, Sutton SM2 5NG, UK
| | - Z Kote-Jarai
- Oncogenetics Team, The Institute of Cancer Research, Sutton SM2 5NG, UK
|
782
|
Dewey FE, Pan S, Wheeler MT, Quake SR, Ashley EA. DNA sequencing: clinical applications of new DNA sequencing technologies. Circulation 2012; 125:931-44. [PMID: 22354974 DOI: 10.1161/circulationaha.110.972828] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.6] [Indexed: 12/16/2022]
Affiliation(s)
- Frederick E Dewey
- Center for Inherited Cardiovascular Disease, Division of Cardiovascular Medicine, Stanford University School of Medicine, Falk CVRB, 300 Pasteur Dr, Stanford, CA 94305, USA
|
783
|
Li MX, Gui HS, Kwan JSH, Bao SY, Sham PC. A comprehensive framework for prioritizing variants in exome sequencing studies of Mendelian diseases. Nucleic Acids Res 2012; 40:e53. [PMID: 22241780 PMCID: PMC3326332 DOI: 10.1093/nar/gkr1257] [Citation(s) in RCA: 209] [Impact Index Per Article: 16.1] [Indexed: 01/02/2023] Open
Abstract
Exome sequencing strategy is promising for finding novel mutations of human monogenic disorders. However, pinpointing the causal mutation in a small number of samples is still a big challenge. Here, we propose a three-level filtration and prioritization framework to identify the causal mutation(s) in exome sequencing studies. This efficient and comprehensive framework successfully narrowed down whole exome variants to very small numbers of candidate variants in the proof-of-concept examples. The proposed framework, implemented in a user-friendly software package, named KGGSeq (http://statgenpro.psychiatry.hku.hk/kggseq), will play a very useful role in exome sequencing-based discovery of human Mendelian disease genes.
Affiliation(s)
- Miao-Xin Li
- Department of Psychiatry, University of Hong Kong, Pokfulam, Hong Kong, China.
|
784
|
Frequent somatic mutations in MAP3K5 and MAP3K9 in metastatic melanoma identified by exome sequencing. Nat Genet 2011; 44:165-9. [PMID: 22197930 PMCID: PMC3267896 DOI: 10.1038/ng.1041] [Citation(s) in RCA: 155] [Impact Index Per Article: 11.1] [Received: 07/19/2011] [Accepted: 11/23/2011] [Indexed: 12/13/2022]
Abstract
We sequenced 8 melanoma exomes to identify novel somatic mutations in metastatic melanoma. Focusing on the MAP3K family, we found that 24% of melanoma cell lines have mutations in the protein-coding regions of either MAP3K5 or MAP3K9. Structural modelling predicts that mutations in the kinase domain may affect the activity and regulation of MAP3K5/9 protein kinases. The position of the mutations and loss of heterozygosity of MAP3K5 and MAP3K9 in 85% and 67% of melanoma samples, respectively, together suggest that the mutations are likely inactivating. In vitro kinase assay shows reduction in kinase activity in MAP3K5 I780F and MAP3K9 W333X mutants. Overexpression of MAP3K5 or MAP3K9 mutant in HEK293T cells reduces phosphorylation of downstream MAP kinases. Attenuation of MAP3K9 function in melanoma cells using siRNA leads to increased cell viability after temozolomide treatment, suggesting that decreased MAP3K pathway activity can lead to chemoresistance in melanoma.
|
785
|
Liu X, Jian X, Boerwinkle E. dbNSFP: a lightweight database of human nonsynonymous SNPs and their functional predictions. Hum Mutat 2011; 32:894-9. [PMID: 21520341 PMCID: PMC3145015 DOI: 10.1002/humu.21517] [Citation(s) in RCA: 598] [Impact Index Per Article: 42.7] [Indexed: 12/31/2022]
Abstract
With the advance of sequencing technologies, whole exome sequencing has increasingly been used to identify mutations that cause human diseases, especially rare Mendelian diseases. Among the analysis steps, functional prediction (of being deleterious) plays an important role in filtering or prioritizing nonsynonymous SNP (NS) for further analysis. Unfortunately, different prediction algorithms use different information and each has its own strength and weakness. It has been suggested that investigators should use predictions from multiple algorithms instead of relying on a single one. However, querying predictions from different databases/Web-servers for different algorithms is both tedious and time consuming, especially when dealing with a huge number of NSs identified by exome sequencing. To facilitate the process, we developed dbNSFP (database for nonsynonymous SNPs' functional predictions). It compiles prediction scores from four new and popular algorithms (SIFT, Polyphen2, LRT, and MutationTaster), along with a conservation score (PhyloP) and other related information, for every potential NS in the human genome (a total of 75,931,005). It is the first integrated database of functional predictions from multiple algorithms for the comprehensive collection of human NSs. dbNSFP is freely available for download at http://sites.google.com/site/jpopgen/dbNSFP.
Affiliation(s)
- Xiaoming Liu
- Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, Texas 77030, USA.
|
786
|
Jelier R, Semple JI, Garcia-Verdugo R, Lehner B. Predicting phenotypic variation in yeast from individual genome sequences. Nat Genet 2011; 43:1270-4. [PMID: 22081227 DOI: 10.1038/ng.1007] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.1] [Received: 05/31/2011] [Accepted: 10/19/2011] [Indexed: 12/16/2022]
Abstract
A central challenge in genetics is to predict phenotypic variation from individual genome sequences. Here we construct and evaluate phenotypic predictions for 19 strains of Saccharomyces cerevisiae. We use conservation-based methods to predict the impact of protein-coding variation within genes on protein function. We then rank strains using a prediction score that measures the total sum of function-altering changes in different sets of genes reported to influence over 100 phenotypes in genome-wide loss-of-function screens. We evaluate our predictions by comparing them with the observed growth rate and efficiency of 15 strains tested across 20 conditions in quantitative experiments. The median predictive performance, as measured by ROC AUC, was 0.76, and predictions were more accurate when the genes reported to influence a trait were highly connected in a functional gene network.
Affiliation(s)
- Rob Jelier
- European Molecular Biology Laboratory, Centre for Genomic Regulation, Systems Biology Research Unit, Barcelona, Spain
|
787
|
Kumar S, Dudley JT, Filipski A, Liu L. Phylomedicine: an evolutionary telescope to explore and diagnose the universe of disease mutations. Trends Genet 2011; 27:377-86. [PMID: 21764165 PMCID: PMC3272884 DOI: 10.1016/j.tig.2011.06.004] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.7] [Received: 04/18/2011] [Revised: 06/10/2011] [Accepted: 06/13/2011] [Indexed: 12/30/2022]
Abstract
Modern technologies have made the sequencing of personal genomes routine. They have revealed thousands of nonsynonymous (amino acid altering) single nucleotide variants (nSNVs) of protein-coding DNA per genome. What do these variants foretell about an individual's predisposition to diseases? The experimental technologies required to carry out such evaluations at a genomic scale are not yet available. Fortunately, the process of natural selection has lent us an almost infinite set of tests in nature. During long-term evolution, new mutations and existing variations have been evaluated for their biological consequences in countless species, and outcomes are readily revealed by multispecies genome comparisons. We review studies that have investigated evolutionary characteristics and in silico functional diagnoses of nSNVs found in thousands of disease-associated genes. We conclude that the patterns of long-term evolutionary conservation and permissible sequence divergence are essential and instructive modalities for functional assessment of human genetic variations.
Affiliation(s)
- Sudhir Kumar
- School of Life Sciences, Arizona State University, Tempe, AZ 85287-4501, USA.
|
788
|
Needles in stacks of needles: finding disease-causal variants in a wealth of genomic data. Nat Rev Genet 2011; 12:628-40. [PMID: 21850043 DOI: 10.1038/nrg3046] [Citation(s) in RCA: 397] [Impact Index Per Article: 28.4] [Indexed: 02/07/2023]
Abstract
Genome and exome sequencing yield extensive catalogues of human genetic variation. However, pinpointing the few phenotypically causal variants among the many variants present in human genomes remains a major challenge, particularly for rare and complex traits wherein genetic information alone is often insufficient. Here, we review approaches to estimate the deleteriousness of single nucleotide variants (SNVs), which can be used to prioritize disease-causal variants. We describe recent advances in comparative and functional genomics that enable systematic annotation of both coding and non-coding variants. Application and optimization of these methods will be essential to find the genetic answers that sequencing promises to hide in plain sight.
|
789
|
Chun S, Fay JC. Evidence for hitchhiking of deleterious mutations within the human genome. PLoS Genet 2011; 7:e1002240. [PMID: 21901107 PMCID: PMC3161959 DOI: 10.1371/journal.pgen.1002240] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.4] [Received: 04/13/2011] [Accepted: 06/28/2011] [Indexed: 01/17/2023] Open
Abstract
Deleterious mutations present a significant obstacle to adaptive evolution. Deleterious mutations can inhibit the spread of linked adaptive mutations through a population; conversely, adaptive substitutions can increase the frequency of linked deleterious mutations and even result in their fixation. To assess the impact of adaptive mutations on linked deleterious mutations, we examined the distribution of deleterious and neutral amino acid polymorphism in the human genome. Within genomic regions that show evidence of recent hitchhiking, we find fewer neutral but a similar number of deleterious SNPs compared to other genomic regions. The higher ratio of deleterious to neutral SNPs is consistent with simulated hitchhiking events and implies that positive selection eliminates some deleterious alleles and increases the frequency of others. The distribution of disease-associated alleles is also altered in hitchhiking regions. Disease alleles within hitchhiking regions have been associated with auto-immune disorders, metabolic diseases, cancers, and mental disorders. Our results suggest that positive selection has had a significant impact on deleterious polymorphism and may be partly responsible for the high frequency of certain human disease alleles.
Affiliation(s)
- Sung Chun
- Computational and Systems Biology Program, Washington University, St. Louis, Missouri, United States of America
| | - Justin C. Fay
- Computational and Systems Biology Program, Washington University, St. Louis, Missouri, United States of America
- Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University, St. Louis, Missouri, United States of America
|
790
|
Zia A, Moses AM. Ranking insertion, deletion and nonsense mutations based on their effect on genetic information. BMC Bioinformatics 2011; 12:299. [PMID: 21781308 PMCID: PMC3155974 DOI: 10.1186/1471-2105-12-299] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Received: 03/24/2011] [Accepted: 07/22/2011] [Indexed: 11/10/2022] Open
Abstract
Background Genetic variations contribute to normal phenotypic differences as well as diseases, and new sequencing technologies are greatly increasing the capacity to identify these variations. Given the large number of variations now being discovered, computational methods to prioritize the functional importance of genetic variations are of growing interest. Thus far, the focus of computational tools has been mainly on the prediction of the effects of amino acid changing single nucleotide polymorphisms (SNPs) and little attention has been paid to indels or nonsense SNPs that result in premature stop codons. Results We propose computational methods to rank insertion-deletion mutations in the coding as well as non-coding regions and nonsense mutations. We rank these variations by measuring the extent of their effect on biological function, based on the assumption that evolutionary conservation reflects function. Using sequence data from budding yeast and human, we show that variations that we predict to have larger effects segregate at significantly lower allele frequencies, and occur less frequently than expected by chance, indicating stronger purifying selection. Furthermore, we find that insertions, deletions and premature stop codons associated with disease in the human have significantly larger predicted effects than those not associated with disease. Interestingly, the large-effect mutations associated with disease show a similar distribution of predicted effects to that expected for completely random mutations. Conclusions This demonstrates that the evolutionary conservation context of the sequences that harbour insertions, deletions and nonsense mutations can be used to predict and rank the effects of the mutations.
Affiliation(s)
- Amin Zia
- Department of Cell & Systems Biology, University of Toronto, 25 Willcocks Street, Toronto, Ontario, M5S 3B2, Canada
|
791
|
Suzuki Y. Statistical methods for detecting natural selection from genomic data. Genes Genet Syst 2011; 85:359-76. [PMID: 21415566 DOI: 10.1266/ggs.85.359] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 11/23/2022] Open
Abstract
In the study of molecular and phenotypic evolution, understanding the relative importance of random genetic drift and positive selection as the mechanisms for driving divergences between populations and maintaining polymorphisms within populations has been a central issue. A variety of statistical methods has been developed for detecting natural selection operating at the amino acid and nucleotide sequence levels. These methods may be largely classified into those aimed at detecting recurrent and/or recent/ongoing natural selection by utilizing the divergence and/or polymorphism data. Using these methods, pervasive positive selection has been identified for protein-coding and non-coding sequences in the genomic analysis of some organisms. However, many of these methods have been criticized by using computer simulation and real data analysis to produce excessive false-positives and to be sensitive to various disturbing factors. Importantly, some of these methods have been invalidated experimentally. These facts indicate that many of the statistical methods for detecting natural selection are unreliable. In addition, the signals that have been believed as the evidence for fixations of advantageous mutations due to positive selection may also be interpreted as the evidence for fixations of deleterious mutations due to random genetic drift. The genomic diversity data are rapidly accumulating in various organisms, and detection of natural selection may play a critical role for clarifying the relative role of random genetic drift and positive selection in molecular and phenotypic evolution. It is therefore important to develop reliable statistical methods that are unbiased as well as robust against various disturbing factors, for inferring natural selection.
Affiliation(s)
- Yoshiyuki Suzuki
- Graduate School of Natural Sciences, Nagoya City University, Japan.
|
792
|
Fang X, Zhang Y, Zhang R, Yang L, Li M, Ye K, Guo X, Wang J, Su B. Genome sequence and global sequence variation map with 5.5 million SNPs in Chinese rhesus macaque. Genome Biol 2011; 12:R63. [PMID: 21733155 PMCID: PMC3218825 DOI: 10.1186/gb-2011-12-7-r63] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 12/14/2010] [Revised: 05/01/2011] [Accepted: 07/06/2011] [Indexed: 11/25/2022] Open
Abstract
Background Rhesus macaque (Macaca mulatta) is the most widely used nonhuman primate animal in biomedical research. A global map of genetic variations in rhesus macaque is valuable for both evolutionary and functional studies. Results Using next-generation sequencing technology, we sequenced a Chinese rhesus macaque genome with 11.56-fold coverage. In total, 96% of the reference Indian macaque genome was covered by at least one read, and we identified 2.56 million homozygous and 2.94 million heterozygous SNPs. We also detected a total of 125,150 structural variations, of which 123,610 were deletions with a median length of 184 bp (ranging from 25 bp to 10 kb); 63% of these deletions were located in intergenic regions and 35% in intronic regions. We further annotated 5,187 and 962 nonsynonymous SNPs to the macaque orthologs of human disease and drug-target genes, respectively. Finally, we set up a genome-wide genetic variation database with the use of Gbrowse. Conclusions Genome sequencing and construction of a global sequence variation map in Chinese rhesus macaque with the concomitant database provide applicable resources for evolutionary and biomedical research.
Affiliation(s)
- Xiaodong Fang
- Beijing Genomics Institute-Shenzhen, Chinese Academy of Sciences, Shenzhen 518083, China
|
793
|
Hicks S, Wheeler DA, Plon SE, Kimmel M. Prediction of missense mutation functionality depends on both the algorithm and sequence alignment employed. Hum Mutat 2011; 32:661-8. [PMID: 21480434 PMCID: PMC4154965 DOI: 10.1002/humu.21490] [Citation(s) in RCA: 163] [Impact Index Per Article: 11.6] [Received: 11/05/2010] [Accepted: 02/09/2011] [Indexed: 01/10/2023]
Abstract
Multiple algorithms are used to predict the impact of missense mutations on protein structure and function using algorithm-generated sequence alignments or manually curated alignments. We compared the accuracy with native alignment of SIFT, Align-GVGD, PolyPhen-2, and Xvar when generating functionality predictions of well-characterized missense mutations (n = 267) within the BRCA1, MSH2, MLH1, and TP53 genes. We also evaluated the impact of the alignment employed on predictions from these algorithms (except Xvar) when supplied the same four alignments including alignments automatically generated by (1) SIFT, (2) Polyphen-2, (3) Uniprot, and (4) a manually curated alignment tuned for Align-GVGD. Alignments differ in sequence composition and evolutionary depth. Data-based receiver operating characteristic curves employing the native alignment for each algorithm result in area under the curve of 78-79% for all four algorithms. Predictions from the PolyPhen-2 algorithm were least dependent on the alignment employed. In contrast, Align-GVGD predicts all variants neutral when provided alignments with a large number of sequences. Of note, algorithms make different predictions of variants even when provided the same alignment and do not necessarily perform best using their own alignment. Thus, researchers should consider optimizing both the algorithm and sequence alignment employed in missense prediction.
Affiliation(s)
- Stephanie Hicks
- Department of Statistics, Rice University, Houston, Texas, USA
| | | | - Sharon E. Plon
- Human Genome Sequencing Center, Houston, Texas, USA
- Texas Children's Cancer Center, Department of Pediatrics, Baylor College of Medicine, Houston, Texas, USA
| | - Marek Kimmel
- Department of Statistics, Rice University, Houston, Texas, USA
|
794
|
Abstract
Recent findings suggest that rare variants play an important role in both monogenic and common diseases. Due to their rarity, however, it remains unclear how to appropriately analyze the association between such variants and disease. A common approach entails combining rare variants together based on a priori information and analyzing them as a single group. Here one must make some assumptions about what to aggregate. Instead, we propose two approaches to empirically determine the most efficient grouping of rare variants. The first considers multiple possible groupings using existing information. The second is an agnostic "step-up" approach that determines an optimal grouping of rare variants analytically and does not rely on prior information. To evaluate these approaches, we undertook a simulation study using sequence data from genes in the one-carbon folate metabolic pathway. Our results show that using prior information to group rare variants is advantageous only when information is quite accurate, but the step-up approach works well across a broad range of plausible scenarios. This agnostic approach allows one to efficiently analyze the association between rare variants and disease while avoiding assumptions required by other approaches for grouping such variants.
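The abstract does not give the details of the "step-up" procedure, so the following is only a sketch of one plausible reading: a greedy forward selection that starts from an empty group and repeatedly adds the variant that most increases a collapsed carrier test statistic. The two-proportion z statistic, the stopping rule, the function names, and the toy genotypes are all assumptions made for illustration, not the authors' method.

```python
import math


def burden_z(carrier, status):
    """Two-proportion z statistic: carrier frequency in cases vs controls."""
    n_case = sum(status)
    n_ctrl = len(status) - n_case
    a = sum(c for c, s in zip(carrier, status) if s == 1)  # carrier cases
    b = sum(c for c, s in zip(carrier, status) if s == 0)  # carrier controls
    p = (a + b) / len(status)
    var = p * (1 - p) * (1 / n_case + 1 / n_ctrl)
    if var == 0:
        return 0.0
    return (a / n_case - b / n_ctrl) / math.sqrt(var)


def collapse(genotypes, group):
    """1 if an individual carries any variant in the group, else 0."""
    return [int(any(row[j] for j in group)) for row in genotypes]


def step_up(genotypes, status):
    """Greedily grow a variant group while the |z| statistic keeps improving."""
    n_var = len(genotypes[0])
    group, best = [], 0.0
    while True:
        scored = [
            (abs(burden_z(collapse(genotypes, group + [j]), status)), j)
            for j in range(n_var) if j not in group
        ]
        if not scored:
            break
        z, j = max(scored)
        if z <= best:  # no candidate improves the statistic: stop
            break
        group.append(j)
        best = z
    return sorted(group), best


# Toy data (hypothetical): rows are individuals, columns are rare variants.
status = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = case, 0 = control
genotypes = [
    [1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1],
    [0, 0, 1], [0, 0, 1], [0, 0, 0], [0, 0, 0],
]
selected, statistic = step_up(genotypes, status)
```

Because the group is chosen to maximize the statistic, the selected value is biased upward; in practice, significance would need to be assessed by something like permutation of case and control labels rather than read directly from `statistic`.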
|
795
|
Cooper DN, Chen JM, Ball EV, Howells K, Mort M, Phillips AD, Chuzhanova N, Krawczak M, Kehrer-Sawatzki H, Stenson PD. Genes, mutations, and human inherited disease at the dawn of the age of personalized genomics. Hum Mutat 2010; 31:631-55. [PMID: 20506564 DOI: 10.1002/humu.21260] [Citation(s) in RCA: 117] [Impact Index Per Article: 7.8] [Indexed: 12/12/2022]
Abstract
The number of reported germline mutations in human nuclear genes, either underlying or associated with inherited disease, has now exceeded 100,000 in more than 3,700 different genes. The availability of these data has both revolutionized the study of the morbid anatomy of the human genome and facilitated "personalized genomics." With approximately 300 new "inherited disease genes" (and approximately 10,000 new mutations) being identified annually, it is pertinent to ask how many "inherited disease genes" there are in the human genome, how many mutations reside within them, and where such lesions are likely to be located? To address these questions, it is necessary not only to reconsider how we define human genes but also to explore notions of gene "essentiality" and "dispensability." Answers to these questions are now emerging from recent novel insights into genome structure and function and through complete genome sequence information derived from multiple individual human genomes. However, a change in focus toward screening functional genomic elements as opposed to genes sensu stricto will be required if we are to capitalize fully on recent technical and conceptual advances and identify new types of disease-associated mutation within noncoding regions remote from the genes whose function they disrupt.
Affiliation(s)
- David N Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff CF14 4XN, United Kingdom.
|
796
|
Ng SB, Nickerson DA, Bamshad MJ, Shendure J. Massively parallel sequencing and rare disease. Hum Mol Genet 2010; 19:R119-24. [PMID: 20846941 DOI: 10.1093/hmg/ddq390] [Citation(s) in RCA: 132] [Impact Index Per Article: 8.8] [Indexed: 01/03/2023] Open
Abstract
Massively parallel sequencing has enabled the rapid, systematic identification of variants on a large scale. This has, in turn, accelerated the pace of gene discovery and disease diagnosis on a molecular level and has the potential to revolutionize methods particularly for the analysis of Mendelian disease. Using massively parallel sequencing has enabled investigators to interrogate variants both in the context of linkage intervals and also on a genome-wide scale, in the absence of linkage information entirely. The primary challenge now is to distinguish between background polymorphisms and pathogenic mutations. Recently developed strategies for rare monogenic disorders have met with some early success. These strategies include filtering for potential causal variants based on frequency and function, and also ranking variants based on conservation scores and predicted deleteriousness to protein structure. Here, we review the recent literature in the use of high-throughput sequence data and its analysis in the discovery of causal mutations for rare disorders.
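The filtering and ranking strategy summarized in this abstract (keep rare, potentially functional variants, then rank by conservation) can be sketched as follows. The `Variant` fields, the set of consequence labels, the 1% frequency cutoff, and the example records are all hypothetical choices made for this illustration; the review does not prescribe specific names or thresholds.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str              # hypothetical gene label
    consequence: str       # predicted effect on the protein
    population_af: float   # allele frequency in a reference population
    conservation: float    # conservation score; higher = more conserved


# Consequence classes treated as potentially functional (an assumption).
FUNCTIONAL = {"missense", "nonsense", "frameshift", "splice_site"}


def prioritize(variants, max_af=0.01):
    """Keep rare, potentially functional variants, most conserved first."""
    kept = [
        v for v in variants
        if v.population_af <= max_af and v.consequence in FUNCTIONAL
    ]
    return sorted(kept, key=lambda v: v.conservation, reverse=True)


# Toy call set, invented for illustration only.
calls = [
    Variant("GENE_A", "missense", 0.0, 0.95),
    Variant("GENE_B", "synonymous", 0.0, 0.99),   # dropped: not functional
    Variant("GENE_C", "nonsense", 0.15, 0.90),    # dropped: too common
    Variant("GENE_D", "frameshift", 0.002, 0.70),
    Variant("GENE_E", "missense", 0.0005, 0.99),
]
ranked_genes = [v.gene for v in prioritize(calls)]
```

Separating the hard filters from the ranking step keeps the thresholds easy to adjust, which matters because the appropriate frequency cutoff depends on the disorder's prevalence and inheritance model.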
Affiliation(s)
- Sarah B Ng
- Department of Genome Sciences, University of Washington School of Medicine, Seattle WA 98195, USA.
|
797
|
Jordan DM, Ramensky VE, Sunyaev SR. Human allelic variation: perspective from protein function, structure, and evolution. Curr Opin Struct Biol 2010; 20:342-50. [PMID: 20399638 PMCID: PMC2921592 DOI: 10.1016/j.sbi.2010.03.006] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.9] [Received: 02/18/2010] [Accepted: 03/22/2010] [Indexed: 01/20/2023]
Abstract
It is widely anticipated that the coming year will be marked by the complete characterization of DNA sequence of protein-coding regions of thousands of human individuals. A number of existing computational methods use comparative protein sequence analysis and analysis of protein structure to predict the functional effect of coding human alleles. Functional and structural analysis of coding allelic variants can inform various aspects of research on human genetic variation. In population and evolutionary genetics it helps estimate the strength of purifying selection against deleterious missense mutations and study the imprint of demographic history on deleterious genetic variation. In medical genetics it may assist in the interpretation of uncharacterized mutations in genes involved in monogenic and oligogenic diseases. It has a potential to facilitate medical sequencing studies searching for genes underlying Mendelian diseases or harboring rare alleles involved in complex traits.
Affiliation(s)
- Daniel M. Jordan
- Division of Genetics, Brigham & Women’s Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Program in Biophysics, Harvard University, Cambridge, Massachusetts, USA
- Vasily E. Ramensky
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- Shamil R. Sunyaev
- Division of Genetics, Brigham & Women’s Hospital, Harvard Medical School, Boston, Massachusetts, USA
|
798
|
Hijikata A, Raju R, Keerthikumar S, Ramabadran S, Balakrishnan L, Ramadoss SK, Pandey A, Mohan S, Ohara O. Mutation@A Glance: an integrative web application for analysing mutations from human genetic diseases. DNA Res 2010; 17:197-208. [PMID: 20360267 PMCID: PMC2885273 DOI: 10.1093/dnares/dsq010] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Indexed: 01/08/2023] Open
Abstract
Although mutation analysis serves as a key part in making a definitive diagnosis about a genetic disease, it still remains a time-consuming step to interpret their biological implications through integration of various lines of archived information about genes in question. To expedite this evaluation step of disease-causing genetic variations, here we developed Mutation@A Glance (http://rapid.rcai.riken.jp/mutation/), a highly integrated web-based analysis tool for analysing human disease mutations; it implements a user-friendly graphical interface to visualize about 40 000 known disease-associated mutations and genetic polymorphisms from more than 2600 protein-coding human disease-causing genes. Mutation@A Glance locates already known genetic variation data individually on the nucleotide and the amino acid sequences and makes it possible to cross-reference them with tertiary and/or quaternary protein structures and various functional features associated with specific amino acid residues in the proteins. We showed that the disease-associated missense mutations had a stronger tendency to reside in positions relevant to the structure/function of proteins than neutral genetic variations. From a practical viewpoint, Mutation@A Glance could certainly function as a ‘one-stop’ analysis platform for newly determined DNA sequences, which enables us to readily identify and evaluate new genetic variations by integrating multiple lines of information about the disease-causing candidate genes.
Affiliation(s)
- Atsushi Hijikata
- Laboratory for Immunogenomics, RIKEN Research Center for Allergy and Immunology, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
|
799
|
Goode DL, Cooper GM, Schmutz J, Dickson M, Gonzales E, Tsai M, Karra K, Davydov E, Batzoglou S, Myers RM, Sidow A. Evolutionary constraint facilitates interpretation of genetic variation in resequenced human genomes. Genome Res 2010; 20:301-10. [PMID: 20067941 PMCID: PMC2840986 DOI: 10.1101/gr.102210.109] [Citation(s) in RCA: 65] [Impact Index Per Article: 4.3] [Received: 10/20/2009] [Accepted: 01/08/2010] [Indexed: 01/22/2023]
Abstract
Here, we demonstrate how comparative sequence analysis facilitates genome-wide base-pair-level interpretation of individual genetic variation and address two questions of importance for human personal genomics: first, whether an individual's functional variation comes mostly from noncoding or coding polymorphisms; and, second, whether population-specific or globally-present polymorphisms contribute more to functional variation in any given individual. Neither has been definitively answered by analyses of existing variation data because of a focus on coding polymorphisms, ascertainment biases in favor of common variation, and a lack of base-pair-level resolution for identifying functional variants. We resequenced 575 amplicons within 432 individuals at genomic sites enriched for evolutionary constraint and also analyzed variation within three published human genomes. We find that single-site measures of evolutionary constraint derived from mammalian multiple sequence alignments are strongly predictive of reductions in modern-day genetic diversity across a range of annotation categories and across the allele frequency spectrum from rare (<1%) to high frequency (>10% minor allele frequency). Furthermore, we show that putatively functional variation in an individual genome is dominated by polymorphisms that do not change protein sequence and that originate from our shared ancestral population and commonly segregate in human populations. These observations show that common, noncoding alleles contribute substantially to human phenotypes and that constraint-based analyses will be of value to identify phenotypically relevant variants in individual genomes.
Affiliation(s)
- David L Goode
- Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
|
800
|
Gilad Y, Pritchard JK, Thornton K. Characterizing natural variation using next-generation sequencing technologies. Trends Genet 2009; 25:463-71. [PMID: 19801172 DOI: 10.1016/j.tig.2009.09.003] [Citation(s) in RCA: 102] [Impact Index Per Article: 6.4] [Received: 08/16/2009] [Revised: 09/08/2009] [Accepted: 09/09/2009] [Indexed: 01/22/2023]
Abstract
Progress in evolutionary genomics is tightly coupled with the development of new technologies to collect high-throughput data. The availability of next-generation sequencing technologies has the potential to revolutionize genomic research and enable us to focus on a large number of outstanding questions that previously could not be addressed effectively. Indeed, we are now able to study genetic variation on a genome-wide scale, characterize gene regulatory processes at unprecedented resolution, and soon, we expect that individual laboratories might be able to rapidly sequence new genomes. However, at present, the analysis of next-generation sequencing data is challenging, in particular because most sequencing platforms provide short reads, which are difficult to align and assemble. In addition, only little is known about sources of variation that are associated with next-generation sequencing study designs. A better understanding of the sources of error and bias in sequencing data is essential, especially in the context of studies of variation at dynamic quantitative traits.
Affiliation(s)
- Yoav Gilad
- Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA.
|