1. Nosková A, Li C, Wang X, Leonard AS, Pausch H, Kadri N. Exploiting public databases of genomic variation to quantify evolutionary constraint on the branch point sequence in 30 plant and animal species. Nucleic Acids Res 2023; 51:12069-12075. [PMID: 37953306 PMCID: PMC10711541 DOI: 10.1093/nar/gkad970] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Revised: 10/06/2023] [Accepted: 10/18/2023] [Indexed: 11/14/2023]
Abstract
The branch point sequence is a degenerate intronic heptamer required for the assembly of the spliceosome during pre-mRNA splicing. Disruption of this motif may promote alternative splicing and eventually cause phenotype variation. Despite its functional relevance, the branch point sequence is not included in most genome annotations. Here, we predict branch point sequences in 30 plant and animal species and attempt to quantify their evolutionary constraints using public variant databases. We find an implausible variant distribution in the databases from 16 of 30 examined species. Comparative analysis of variants from whole-genome sequencing shows that variants submitted from exome sequencing or false positive variants are widespread in public databases and cause these irregularities. We then investigate evolutionary constraint with largely unbiased public variant databases in 14 species and find that the fourth and sixth position of the branch point sequence are more constrained than coding nucleotides. Our findings show that public variant databases should be scrutinized for possible biases before they qualify to analyze evolutionary constraint.
Affiliation(s)
- Adéla Nosková
- Animal Genomics, ETH Zürich, Universitätstrasse 2, 8092 Zürich, Switzerland
- Chao Li
- Animal Genomics, ETH Zürich, Universitätstrasse 2, 8092 Zürich, Switzerland
- International Joint Agriculture Research Center for Animal Bio-Breeding, Ministry of Agriculture and Rural Affairs/Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
- Xiaolong Wang
- International Joint Agriculture Research Center for Animal Bio-Breeding, Ministry of Agriculture and Rural Affairs/Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
- Hubert Pausch
- Animal Genomics, ETH Zürich, Universitätstrasse 2, 8092 Zürich, Switzerland
- Naveen Kumar Kadri
- Animal Genomics, ETH Zürich, Universitätstrasse 2, 8092 Zürich, Switzerland
2. Woerner AE, Crysup B, Hewitt FC, Gardner MW, Freitas MA, Budowle B. Techniques for estimating genetically variable peptides and semi-continuous likelihoods from massively parallel sequencing data. Forensic Sci Int Genet 2022; 59:102719. [DOI: 10.1016/j.fsigen.2022.102719] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/20/2022] [Revised: 04/25/2022] [Accepted: 05/01/2022] [Indexed: 11/25/2022]
3. Ba R, Geffard E, Douillard V, Simon F, Mesnard L, Vince N, Gourraud PA, Limou S. Surfing the Big Data Wave: Omics Data Challenges in Transplantation. Transplantation 2022; 106:e114-e125. [PMID: 34889882 DOI: 10.1097/tp.0000000000003992] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 11/26/2022]
Abstract
In both research and care, patients, caregivers, and researchers are facing a leap forward in the quantity of data that are available for analysis and interpretation, marking the daunting "big data era." In the biomedical field, this quantitative shift refers mostly to the -omics that permit measuring and analyzing biological features of the same type as a whole. Omics studies have greatly impacted transplantation research and highlighted their potential to better understand transplant outcomes. Some studies have emphasized the contribution of omics in developing personalized therapies to avoid graft loss. However, integrating omics data remains challenging in terms of analytical processes. These data come from multiple sources. Consequently, they may contain biases and systematic errors that can be mistaken for relevant biological information. Normalization and batch-effect correction methods have been developed to tackle issues related to data quality and homogeneity. In addition, imputation methods handle data missingness. Importantly, the transplantation field represents a unique analytical context as the biological statistical unit is the donor-recipient pair, which brings additional complexity to the omics analyses. Strategies such as combined risk scores between 2 genomes, taking into account genetic ancestry, are emerging to better understand graft mechanisms and refine biological interpretations. Future omics research will be based on integrative biology, considering the analysis of the system as a whole and no longer the study of a single characteristic. In this review, we summarize advances in omics studies in transplantation and address the most challenging analytical issues regarding these approaches.
Affiliation(s)
- Rokhaya Ba
- Université de Nantes, Centre Hospitalier Universitaire Nantes, Institute of Health and Medical Research, Centre de Recherche en Transplantation et Immunologie, UMR 1064, Institut de Transplantation Urologie-Néphrologie, Nantes, France
- Département Informatique et Mathématiques, Ecole Centrale de Nantes, Nantes, France
- Estelle Geffard
- Université de Nantes, Centre Hospitalier Universitaire Nantes, Institute of Health and Medical Research, Centre de Recherche en Transplantation et Immunologie, UMR 1064, Institut de Transplantation Urologie-Néphrologie, Nantes, France
- Venceslas Douillard
- Université de Nantes, Centre Hospitalier Universitaire Nantes, Institute of Health and Medical Research, Centre de Recherche en Transplantation et Immunologie, UMR 1064, Institut de Transplantation Urologie-Néphrologie, Nantes, France
- Françoise Simon
- Université de Nantes, Centre Hospitalier Universitaire Nantes, Institute of Health and Medical Research, Centre de Recherche en Transplantation et Immunologie, UMR 1064, Institut de Transplantation Urologie-Néphrologie, Nantes, France
- Mount Sinai School of Medicine, New York, NY
- Laurent Mesnard
- Urgences Néphrologiques et Transplantation Rénale, Hôpital Tenon, Assistance Publique-Hôpitaux de Paris, Paris, France
- Sorbonne Université, Paris, France
- Nicolas Vince
- Université de Nantes, Centre Hospitalier Universitaire Nantes, Institute of Health and Medical Research, Centre de Recherche en Transplantation et Immunologie, UMR 1064, Institut de Transplantation Urologie-Néphrologie, Nantes, France
- Pierre-Antoine Gourraud
- Université de Nantes, Centre Hospitalier Universitaire Nantes, Institute of Health and Medical Research, Centre de Recherche en Transplantation et Immunologie, UMR 1064, Institut de Transplantation Urologie-Néphrologie, Nantes, France
- Sophie Limou
- Université de Nantes, Centre Hospitalier Universitaire Nantes, Institute of Health and Medical Research, Centre de Recherche en Transplantation et Immunologie, UMR 1064, Institut de Transplantation Urologie-Néphrologie, Nantes, France
- Département Informatique et Mathématiques, Ecole Centrale de Nantes, Nantes, France
4. Extension of the Human Fibrinogen Database with Detailed Clinical Information—The αC-Connector Segment. Int J Mol Sci 2021; 23:ijms23010132. [PMID: 35008554 PMCID: PMC8745514 DOI: 10.3390/ijms23010132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2021] [Revised: 12/20/2021] [Accepted: 12/21/2021] [Indexed: 11/16/2022]
Abstract
Fibrinogen, an abundant plasma glycoprotein, is involved in the final stage of blood coagulation. Decreased fibrinogen levels, which may be caused by mutations, are manifested mainly in bleeding and thrombotic disorders. Clinically relevant mutations of fibrinogen are listed in the Human Fibrinogen Database. For the αC-connector (amino acids Aα240–410, nascent chain numbering), we have extended this database, with detailed descriptions of the clinical manifestations among members of reported families. This includes the specification of bleeding and thrombotic events and results of coagulation assays. Where available, the impact of a mutation on clotting and fibrinolysis is reported. The collected data show that the Human Fibrinogen Database reports considerably fewer missense and synonymous mutations than the general COSMIC and dbSNP databases. Homozygous nonsense or frameshift mutations in the αC-connector are responsible for most clinically relevant symptoms, while heterozygous mutations are often asymptomatic. Symptomatic subjects suffer from bleeding and, less frequently, from thrombotic events. Miscarriages within the first trimester and prolonged wound healing were reported in a few subjects. All mutations inducing thrombotic phenotypes are located at the identical positions within the consensus sequence of the tandem repeats.
5. Overview of human 20 alpha-hydroxysteroid dehydrogenase (AKR1C1): Functions, regulation, and structural insights of inhibitors. Chem Biol Interact 2021; 351:109746. [PMID: 34780792 DOI: 10.1016/j.cbi.2021.109746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2021] [Revised: 10/28/2021] [Accepted: 11/10/2021] [Indexed: 11/22/2022]
Abstract
Human aldo-keto reductase family 1C1 (AKR1C1) is an important enzyme in human hormone metabolism, mainly responsible for the metabolism of progesterone. AKR1C1 is highly expressed and is closely related to the occurrence and development of various diseases, especially some cancers related to hormone metabolism. Many inhibitors of AKR1C1 have now been discovered, including synthetic compounds and natural products, which show inhibitory activity against AKR1C1 at the target level. Here we briefly review the physiological and pathological functions of AKR1C1 and its relationship with disease, summarize the development of AKR1C1 inhibitors, and elucidate the interaction between inhibitors and AKR1C1 through molecular docking results and existing co-crystal structures. Finally, we discuss design ideas for selective AKR1C1 inhibitors from the perspective of AKR1C1 structure, and the prospects of AKR1C1 in the treatment of human diseases in terms of biomarkers, pre-receptor regulation, and single nucleotide polymorphisms, aiming to provide new ideas for drug research targeting AKR1C1.
6. Doffe F, Carbonnier V, Tissier M, Leroy B, Martins I, Mattsson JSM, Micke P, Pavlova S, Pospisilova S, Smardova J, Joerger AC, Wiman KG, Kroemer G, Soussi T. Identification and functional characterization of new missense SNPs in the coding region of the TP53 gene. Cell Death Differ 2021; 28:1477-1492. [PMID: 33257846 PMCID: PMC8166836 DOI: 10.1038/s41418-020-00672-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 03/24/2020] [Revised: 11/03/2020] [Accepted: 11/04/2020] [Indexed: 02/06/2023]
Abstract
Infrequent and rare genetic variants in the human population vastly outnumber common ones. Although they may contribute significantly to the genetic basis of a disease, these seldom-encountered variants may also be misidentified as pathogenic if no correct references are available. Somatic and germline TP53 variants are associated with multiple neoplastic diseases, and thus have come to serve as a paradigm for genetic analyses in this setting. We searched 14 independent, globally distributed datasets and recovered TP53 SNPs from 202,767 cancer-free individuals. In our analyses, 19 new missense TP53 SNPs, including five novel variants specific to the Asian population, were recurrently identified in multiple datasets. Using a combination of in silico, functional, structural, and genetic approaches, we showed that none of these variants displayed loss of function compared to the normal TP53 gene. In addition, classification using ACMG criteria suggested that they are all benign. Considered together, our data reveal that the TP53 coding region shows far more polymorphism than previously thought and presents high ethnic diversity. They furthermore underline the importance of correctly assessing novel variants in all variant-calling pipelines associated with genetic diagnoses for cancer.
Affiliation(s)
- Flora Doffe
- Equipe Labellisée par la Ligue Contre le Cancer, Université Paris Descartes, Université Sorbonne Paris Cité, Université Paris Diderot, Sorbonne Université, INSERM U1138, Centre de Recherche des Cordeliers, Paris, France
- Department of Oncology-Pathology, Bioclinicum, Karolinska Institutet, Stockholm, Sweden
- Vincent Carbonnier
- Equipe Labellisée par la Ligue Contre le Cancer, Université Paris Descartes, Université Sorbonne Paris Cité, Université Paris Diderot, Sorbonne Université, INSERM U1138, Centre de Recherche des Cordeliers, Paris, France
- Manon Tissier
- Equipe Labellisée par la Ligue Contre le Cancer, Université Paris Descartes, Université Sorbonne Paris Cité, Université Paris Diderot, Sorbonne Université, INSERM U1138, Centre de Recherche des Cordeliers, Paris, France
- Bernard Leroy
- Department of Life Science, Sorbonne Université, Paris, France
- Isabelle Martins
- Equipe Labellisée par la Ligue Contre le Cancer, Université Paris Descartes, Université Sorbonne Paris Cité, Université Paris Diderot, Sorbonne Université, INSERM U1138, Centre de Recherche des Cordeliers, Paris, France
- Metabolomics and Cell Biology Platforms, Institut Gustave Roussy, Villejuif, France
- Johanna S M Mattsson
- Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
- Patrick Micke
- Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
- Sarka Pavlova
- Department of Internal Medicine-Hematology and Oncology, University Hospital Brno and Faculty of Medicine, Masaryk University, Brno, Czech Republic
- Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- Sarka Pospisilova
- Department of Internal Medicine-Hematology and Oncology, University Hospital Brno and Faculty of Medicine, Masaryk University, Brno, Czech Republic
- Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- Jana Smardova
- Faculty of Science, Department of Experimental Biology, Masaryk University, Brno, Czech Republic
- Andreas C Joerger
- Institute of Pharmaceutical Chemistry, Johann Wolfgang Goethe University, Max-von-Laue-Str. 9, 60438, Frankfurt am Main, Germany
- Buchmann Institute for Molecular Life Sciences and Structural Genomics Consortium (SGC), Max-von-Laue-Str. 15, 60438, Frankfurt am Main, Germany
- Klas G Wiman
- Department of Oncology-Pathology, Bioclinicum, Karolinska Institutet, Stockholm, Sweden
- Guido Kroemer
- Equipe Labellisée par la Ligue Contre le Cancer, Université Paris Descartes, Université Sorbonne Paris Cité, Université Paris Diderot, Sorbonne Université, INSERM U1138, Centre de Recherche des Cordeliers, Paris, France
- Metabolomics and Cell Biology Platforms, Institut Gustave Roussy, Villejuif, France
- Pôle de Biologie, Hôpital Européen Georges Pompidou, AP-HP, Paris, France
- Department of Women's and Children's Health, Karolinska University Hospital, Stockholm, Sweden
- Thierry Soussi
- Equipe Labellisée par la Ligue Contre le Cancer, Université Paris Descartes, Université Sorbonne Paris Cité, Université Paris Diderot, Sorbonne Université, INSERM U1138, Centre de Recherche des Cordeliers, Paris, France.
- Department of Oncology-Pathology, Bioclinicum, Karolinska Institutet, Stockholm, Sweden.
- Department of Life Science, Sorbonne Université, Paris, France.
- Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden.
- Cell Death and Drug Resistance in Lymphoproliferative Disorders Team, INSERM U1138, Centre de Recherche des Cordeliers, Paris, France.
7. Mares L, Vilchis F, Chávez B, Ramos L. Molecular genetic analysis of AKR1C2-4 and HSD17B6 genes in subjects 46,XY with hypospadias. J Pediatr Urol 2020; 16:689.e1-689.e12. [PMID: 32732174 DOI: 10.1016/j.jpurol.2020.07.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 01/20/2020] [Revised: 06/19/2020] [Accepted: 07/01/2020] [Indexed: 01/16/2023]
Abstract
BACKGROUND The formation of the male urethra depends on enzyme-mediated conversion of testosterone (T) into 5α-dihydrotestosterone (DHT). Two metabolic pathways could be operating in the fetal testis to synthesize androgens: 1) the "classic" route (T→DHT) mediated by SRD5A2 and 2) a "backdoor" pathway in which DHT is synthesized by aldo-keto reductase family 1, member C2 (AKR1C2), AKR1C3, and AKR1C4 enzymes without formation of a T intermediate. OBJECTIVE We studied four genes of the "backdoor" pathway in karyotypic males with hypospadias to ascertain whether gene defects in AKRs impair urethral DHT formation and thereby result in hypospadias. DESIGN AND PATIENTS The coding regions of the AKR1C2-4 and HSD17B6 genes were analyzed by PCR-SSCP and sequencing in a cohort of 25 Mexican patients (children aged 0.3-9 years) with 46,XY hypospadias. Chi-squared tests were performed to evaluate the distribution of genotypes and alleles and Hardy-Weinberg (H-W) equilibrium. The effect of the genetic variants was investigated by in silico studies. RESULTS Screening studies revealed distinct genotypic patterns at different exons of AKR1C2-4, whereas HSD17B6 presented a wild-type sequence. The DNA analyses detected two synonymous variants (c.327C>T, c.666T>C/unreported) in AKR1C2. AKR1C3 had two variants (c.15C>G, c.230A>G), two unreported variants (c.538T>C, c.596G>A), and one silent variant (c.312G>A). Two variants (c.434C>G, c.931C>G) were identified in AKR1C4. All variants were in H-W equilibrium without structural changes. DISCUSSION Hypospadias has been associated with defects that alter androgen biosynthesis in the human fetal testis, specifically 5α-DHT. We selected four candidate genes involved in the "backdoor" pathway for the formation of 5α-DHT. Molecular assays of the AKR1C2, AKR1C3, and AKR1C4 genes revealed a total of nine single nucleotide variants. Several variants in the AKR1C genes have been associated with a variety of human pathologies. However, our studies suggest that active steroid biosynthesis via AKR1C might not be involved in hypospadias. Additionally, genetic research suggests a low involvement of the "backdoor" 5α-DHT pathway in human sexual development, specifically in the differentiation of male external genitalia. CONCLUSION These results indicate that substitutions in AKR1C2-4 are polymorphisms and that all genetic variants lack a significant deleterious association with hypospadias. The data suggest that inactivating mutations in the AKR1C2-4 and HSD17B6 genes are an infrequent cause of hypospadias, which might weaken the contribution of the "backdoor" pathway to embryonic urethral masculinization.
Affiliation(s)
- L Mares
- Department of Reproductive Biology, Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, México City, Mexico
- F Vilchis
- Department of Reproductive Biology, Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, México City, Mexico
- B Chávez
- Department of Reproductive Biology, Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, México City, Mexico
- L Ramos
- Department of Reproductive Biology, Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, México City, Mexico.
8. Gao P, Zhang R, Li J. Comprehensive elaboration of database resources utilized in next-generation sequencing-based tumor somatic mutation detection. Biochim Biophys Acta Rev Cancer 2019; 1872:122-137. [PMID: 31265877 DOI: 10.1016/j.bbcan.2019.06.004] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 04/25/2019] [Revised: 06/16/2019] [Accepted: 06/26/2019] [Indexed: 12/20/2022]
Abstract
The rapid evolution of next-generation sequencing (NGS)-based tumor genomic profile detection and the emergence of molecularly targeted therapies have enabled precision oncology. In NGS-based analysis, various types of databases have been developed to perform different functions. However, many problems still exist when using these public databases. Therefore, it is important to better understand the characteristics and limitations of each database and have them complement each other to provide useful clinical evidence for NGS testing. In this review, we elaborate on the important role of databases and their concrete applications in NGS-based somatic mutation detection. We introduce the typically used databases for sequence alignment, variant filtration, and variant interpretation, and compare the differences between the databases with similar functions. Subsequently, we determine the limitations of each database and provide the corresponding solutions. Furthermore, we present an overview diagram to clearly illustrate the database used in the entire NGS-based somatic mutation detection pipeline.
Affiliation(s)
- Peng Gao
- National Center for Clinical Laboratories, Beijing Hospital, National Center of Gerontology, Beijing, People's Republic of China; Graduate School, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, People's Republic of China; Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, People's Republic of China
- Rui Zhang
- National Center for Clinical Laboratories, Beijing Hospital, National Center of Gerontology, Beijing, People's Republic of China; Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, People's Republic of China.
- Jinming Li
- National Center for Clinical Laboratories, Beijing Hospital, National Center of Gerontology, Beijing, People's Republic of China; Graduate School, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, People's Republic of China; Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, People's Republic of China.
9. Soussi T, Leroy B, Devir M, Rosenberg S. High prevalence of cancer-associated TP53 variants in the gnomAD database: A word of caution concerning the use of variant filtering. Hum Mutat 2019; 40:516-524. [PMID: 30720243 DOI: 10.1002/humu.23717] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 10/17/2018] [Revised: 01/09/2019] [Accepted: 01/28/2019] [Indexed: 12/14/2022]
Abstract
The 1000 Genomes Project, the Exome Aggregation Consortium (ExAC), and the Genome Aggregation Database (gnomAD) datasets were developed to provide large-scale reference data of genetic variations for various populations to filter out common benign variants and identify rare variants of clinical importance based on their frequency in the human population. Using a TP53 repository of 80,000 cancer variants, as well as TP53 variants from multiple cancer genome projects, we have defined a set of certified oncogenic TP53 variants. This specific set has been independently validated by functional and in silico predictive analysis. Here we show that a significant number of these variants are included in gnomAD and ExAC. Most of them correspond to TP53 hotspot variants occurring as somatic and germline events in human cancer. Similarly, disease-associated variants for five other tumor suppressor genes, including BRCA1, BRCA2, APC, PTEN, and MLH1, have also been identified. This study demonstrates that germline TP53 variants in the human population are more frequent than previously thought. Furthermore, population databases such as gnomAD or ExAC must be used with caution and need to be annotated for the presence of oncogenic variants to improve their clinical utility.
Affiliation(s)
- Thierry Soussi
- UPMC Univ, Sorbonne Université, Dpt of Life Science, Paris, France; Centre de Recherche des Cordeliers, INSERM, Paris, France; Department of Oncology-Pathology, Cancer Center Karolinska (CCK), Karolinska Institutet, Stockholm, Sweden
- Bernard Leroy
- UPMC Univ, Sorbonne Université, Dpt of Life Science, Paris, France
- Michal Devir
- Laboratory for Cancer Computational Biology, Hadassah Medical Center, Hebrew University, Jerusalem, Israel
- Shai Rosenberg
- Laboratory for Cancer Computational Biology, Hadassah Medical Center, Hebrew University, Jerusalem, Israel; Gaffin Center for Neuro-oncology, Sharett Institute for Oncology, Hadassah-Hebrew University Medical Center, Jerusalem, Israel
10. Muyas F, Bosio M, Puig A, Susak H, Domènech L, Escaramis G, Zapata L, Demidov G, Estivill X, Rabionet R, Ossowski S. Allele balance bias identifies systematic genotyping errors and false disease associations. Hum Mutat 2018; 40:115-126. [PMID: 30353964 PMCID: PMC6587442 DOI: 10.1002/humu.23674] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 03/29/2018] [Revised: 09/17/2018] [Accepted: 10/20/2018] [Indexed: 12/13/2022]
Abstract
In recent years, next‐generation sequencing (NGS) has become a cornerstone of clinical genetics and diagnostics. Many clinical applications require high precision, especially if rare events such as somatic mutations in cancer or genetic variants causing rare diseases need to be identified. Although random sequencing errors can be modeled statistically and deep sequencing minimizes their impact, systematic errors remain a problem even at high depth of coverage. Understanding their source is crucial to increase precision of clinical NGS applications. In this work, we studied the relation between recurrent biases in allele balance (AB), systematic errors, and false positive variant calls across a large cohort of human samples analyzed by whole exome sequencing (WES). We have modeled the AB distribution for biallelic genotypes in 987 WES samples in order to identify positions recurrently deviating significantly from the expectation, a phenomenon we termed allele balance bias (ABB). Furthermore, we have developed a genotype callability score based on ABB for all positions of the human exome, which detects false positive variant calls that passed state‐of‐the‐art filters. Finally, we demonstrate the use of ABB for detection of false associations proposed by rare variant association studies. Availability: https://github.com/Francesc-Muyas/ABB.
Affiliation(s)
- Francesc Muyas
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- Mattia Bosio
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Anna Puig
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Hana Susak
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Laura Domènech
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER in Epidemiology and Public Health (CIBERESP), Barcelona, Spain
- Georgia Escaramis
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER in Epidemiology and Public Health (CIBERESP), Barcelona, Spain
- Luis Zapata
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
- German Demidov
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- Xavier Estivill
- Sidra Medicine, Doha, Qatar; Women's Health Dexeus, Barcelona, Spain
- Raquel Rabionet
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER in Epidemiology and Public Health (CIBERESP), Barcelona, Spain; Institut de Recerca Sant Joan de Déu; Institut de Biomedicina de la Universitat de Barcelona (IBUB); Department of Genetics, Microbiology and Statistics, University of Barcelona, Barcelona, Spain
- Stephan Ossowski
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
11. Architecture of polymorphisms in the human genome reveals functionally important and positively selected variants in immune response and drug transporter genes. Hum Genomics 2018; 12:43. [PMID: 30219098 PMCID: PMC6139121 DOI: 10.1186/s40246-018-0175-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 05/20/2018] [Accepted: 08/29/2018] [Indexed: 02/07/2023]
Abstract
Background Genetic polymorphisms can contribute to phenotypic differences amongst individuals, including disease risk and drug response. Characterization of genetic polymorphisms that modulate gene expression and/or protein function may facilitate the identification of the causal variants. Here, we present the architecture of genetic polymorphisms in the human genome, focusing on those predicted to be potentially functional/under natural selection and the pathways in which they reside. Results In the human genome, polymorphisms that directly affect protein sequences and potentially affect function are the most constrained variants, with the lowest single-nucleotide variant (SNV) density, least population differentiation and most significant enrichment of rare alleles. SNVs which potentially alter various regulatory sites, e.g. splicing regulatory elements, are also generally under negative selection. Interestingly, genes that regulate the expression of transcription/splicing factors and histones are conserved, as a higher proportion of these genes are non-polymorphic, contain ultra-conserved elements (UCEs) and/or have no non-synonymous SNVs (nsSNVs)/coding INDELs. On the other hand, major histocompatibility complex (MHC) genes are the most polymorphic, with SNVs potentially affecting the binding of transcription/splicing factors and microRNAs (miRNA) exhibiting recent positive selection (RPS). The drug transporter genes carry the largest number of potentially deleterious nsSNVs and exhibit signatures of RPS and/or population differentiation. These observations suggest that genes that interact with the environment are highly polymorphic and targeted by RPS. Conclusions In conclusion, selective constraints are observed in coding regions, master regulator genes, and potentially functional SNVs. In contrast, genes that modulate response to the environment are highly polymorphic and under positive selection.
|
12
|
Ramharack P, Soliman MES. Bioinformatics-based tools in drug discovery: the cartography from single gene to integrative biological networks. Drug Discov Today 2018; 23:1658-1665. [PMID: 29864527 DOI: 10.1016/j.drudis.2018.05.041] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 02/07/2018] [Revised: 05/12/2018] [Accepted: 05/29/2018] [Indexed: 02/02/2023]
Abstract
Originally developed for the analysis of biological sequences, bioinformatics has advanced into one of the most widely recognized domains in the scientific community. Despite this technological evolution, there is still an urgent need for nontoxic and efficient drugs. The onus now falls on the 'omics domain to meet this need by implementing bioinformatics techniques that will allow for the introduction of pioneering approaches in the rational drug design process. Here, we categorize an updated list of informatics tools and explore the capabilities of integrative bioinformatics in disease control. We believe that our review will serve as a comprehensive guide toward bioinformatics-oriented disease and drug discovery research.
Affiliation(s)
- Pritika Ramharack
- Molecular Bio-computation and Drug Design Laboratory, School of Health Sciences, University of KwaZulu-Natal, Westville Campus, Durban 4001, South Africa
| | - Mahmoud E S Soliman
- Molecular Bio-computation and Drug Design Laboratory, School of Health Sciences, University of KwaZulu-Natal, Westville Campus, Durban 4001, South Africa.
|
13
|
Novroski NMM, Woerner AE, Budowle B. Insertion within the flanking region of the D10S1237 locus. Forensic Sci Int Genet 2018; 35:e4-e6. [PMID: 29729851 DOI: 10.1016/j.fsigen.2018.04.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/05/2018] [Revised: 04/06/2018] [Accepted: 04/25/2018] [Indexed: 11/29/2022]
Affiliation(s)
- Nicole M M Novroski
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA; Graduate School of Biomedical Sciences, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA.
| | - August E Woerner
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA; Graduate School of Biomedical Sciences, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA
| | - Bruce Budowle
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA; Graduate School of Biomedical Sciences, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA; Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University, Jeddah, Saudi Arabia
|
14
|
Reichardt JKV. Reflections of a Biomedical Scientist on Four Continents in Interdisciplinary Research. Trends Genet 2018; 34:401-403. [PMID: 29636189 DOI: 10.1016/j.tig.2018.03.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2018] [Revised: 02/14/2018] [Accepted: 03/14/2018] [Indexed: 11/17/2022]
Affiliation(s)
- Juergen K V Reichardt
- Vice-Chancellor, Research & Innovation, Yachay Tech University, San Miguel de Urcuquí 100119, Ecuador.
|
15
|
Watson CT, Matsen FA, Jackson KJL, Bashir A, Smith ML, Glanville J, Breden F, Kleinstein SH, Collins AM, Busse CE. Comment on “A Database of Human Immune Receptor Alleles Recovered from Population Sequencing Data”. The Journal of Immunology 2017; 198:3371-3373. [DOI: 10.4049/jimmunol.1700306] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Indexed: 01/02/2023]
|
16
|
Kaur R, Singh J, Kaur M. Structural and functional impact of SNPs in P-selectin gene: A comprehensive in silico analysis. Open Life Sci 2017. [DOI: 10.1515/biol-2017-0003] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 12/21/2022] Open
Abstract
P-selectin is an adhesion molecule which plays an important role in the development of inflammation. It is encoded by the SELP gene located on chromosome 1q21-q24. Various single nucleotide polymorphisms (SNPs) of SELP have been reported to be associated with various inflammatory disease conditions. The genetics behind these diseases could be better understood by knowing the structural and functional impact of various genetic determinants of SELP. So far, this is the first comprehensive and systematic in silico analysis of SNPs in SELP. A total of 2780 SNPs of SELP were retrieved from NCBI dbSNP. Only conserved and validated SNPs with minor allele frequency (MAF) ≥ 0.05 were subjected to further analysis. Based on these criteria, we selected 4 non-synonymous SNPs (nsSNPs) and 119 non-coding SNPs (ncSNPs). The nsSNPs were analyzed for deleterious effects using SIFT, Polyphen-2, nsSNPAnalyzer, SNP & Go, SNPs3, Mutperd and I-mutant web tools. The template prediction for variant structure modeling was performed using MUSTER and SWISS-MODEL. The functional impact of ncSNPs was analyzed by SNPinfo and RegulomeDB. The in silico analysis predicted 3 nsSNPs and 21 ncSNPs as potential candidates for future case-control association studies and functional analysis of SELP.
Affiliation(s)
- Raminderjit Kaur
- Department of Human Genetics, Guru Nanak Dev University, Amritsar, Punjab, India
| | - Jatinder Singh
- Department of Molecular Biology & Biochemistry, Guru Nanak Dev University, Amritsar, Punjab, India
| | - Manpreet Kaur
- Department of Human Genetics, Guru Nanak Dev University, Amritsar, Punjab, India
|
17
|
Horne HN, Chung CC, Zhang H, Yu K, Prokunina-Olsson L, Michailidou K, Bolla MK, Wang Q, Dennis J, Hopper JL, Southey MC, Schmidt MK, Broeks A, Muir K, Lophatananon A, Fasching PA, Beckmann MW, Fletcher O, Johnson N, Sawyer EJ, Tomlinson I, Burwinkel B, Marme F, Guénel P, Truong T, Bojesen SE, Flyger H, Benitez J, González-Neira A, Anton-Culver H, Neuhausen SL, Brenner H, Arndt V, Meindl A, Schmutzler RK, Brauch H, Hamann U, Nevanlinna H, Khan S, Matsuo K, Iwata H, Dörk T, Bogdanova NV, Lindblom A, Margolin S, Mannermaa A, Kosma VM, Chenevix-Trench G, Wu AH, ven den Berg D, Smeets A, Zhao H, Chang-Claude J, Rudolph A, Radice P, Barile M, Couch FJ, Vachon C, Giles GG, Milne RL, Haiman CA, Marchand LL, Goldberg MS, Teo SH, Taib NAM, Kristensen V, Borresen-Dale AL, Zheng W, Shrubsole M, Winqvist R, Jukkola-Vuorinen A, Andrulis IL, Knight JA, Devilee P, Seynaeve C, García-Closas M, Czene K, Darabi H, Hollestelle A, Martens JWM, Li J, Lu W, Shu XO, Cox A, Cross SS, Blot W, Cai Q, Shah M, Luccarini C, Baynes C, Harrington P, Kang D, Choi JY, Hartman M, Chia KS, Kabisch M, Torres D, Jakubowska A, Lubinski J, Sangrajrang S, Brennan P, Slager S, Yannoukakos D, Shen CY, Hou MF, Swerdlow A, Orr N, Simard J, Hall P, Pharoah PDP, Easton DF, Chanock SJ, Dunning AM, Figueroa JD. Fine-Mapping of the 1p11.2 Breast Cancer Susceptibility Locus. PLoS One 2016; 11:e0160316. [PMID: 27556229 PMCID: PMC4996485 DOI: 10.1371/journal.pone.0160316] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 01/08/2016] [Accepted: 07/18/2016] [Indexed: 02/02/2023] Open
Abstract
The Cancer Genetic Markers of Susceptibility genome-wide association study (GWAS) originally identified a single nucleotide polymorphism (SNP) rs11249433 at 1p11.2 associated with breast cancer risk. To fine-map this locus, we genotyped 92 SNPs in a 900 kb region (120,505,799-121,481,132) flanking rs11249433 in 45,276 breast cancer cases and 48,998 controls of European, Asian and African ancestry from 50 studies in the Breast Cancer Association Consortium. Genotyping was done using iCOGS, a custom-built array. Due to the complicated nature of the region on chr1p11.2: 120,300,000-120,505,798, that lies near the centromere and contains seven duplicated genomic segments, we restricted analyses to 429 SNPs excluding the duplicated regions (42 genotyped and 387 imputed). Per-allelic associations with breast cancer risk were estimated using logistic regression models adjusting for study and ancestry-specific principal components. The strongest association observed was with the original identified index SNP rs11249433 (minor allele frequency (MAF) 0.402; per-allele odds ratio (OR) = 1.10, 95% confidence interval (CI) 1.08-1.13, P = 1.49 × 10^-21). The association for rs11249433 was limited to ER-positive breast cancers (test for heterogeneity P ≤ 8.41 × 10^-5). Additional analyses by other tumor characteristics showed stronger associations with moderately/well differentiated tumors and tumors of lobular histology. Although no significant eQTL associations were observed, in silico analyses showed that rs11249433 was located in a region that is likely a weak enhancer/promoter. Fine-mapping analysis of the 1p11.2 breast cancer susceptibility locus confirms this region to be limited to risk to cancers that are ER-positive.
Affiliation(s)
- Hisani N. Horne
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, United States of America
- Food and Drug Administration, Silver Spring, MD, United States of America
| | - Charles C. Chung
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, United States of America
| | - Han Zhang
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, United States of America
| | - Kai Yu
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, United States of America
| | - Ludmila Prokunina-Olsson
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, United States of America
| | - Kyriaki Michailidou
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
| | - Manjeet K. Bolla
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
| | - Qin Wang
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
| | - Joe Dennis
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
| | - John L. Hopper
- Centre for Epidemiology and Biostatistics, School of Population and Global Health, The University of Melbourne, Melbourne, Australia
| | - Melissa C. Southey
- Department of Pathology, The University of Melbourne, Melbourne, Australia
| | - Marjanka K. Schmidt
- Netherlands Cancer Institute, Antoni van Leeuwenhoek hospital, Amsterdam, The Netherlands
| | - Annegien Broeks
- Netherlands Cancer Institute, Antoni van Leeuwenhoek hospital, Amsterdam, The Netherlands
| | - Kenneth Muir
- Division of Health Sciences, Warwick Medical School, Warwick University, Coventry, UK
- Institute of Population Health, University of Manchester, Manchester, UK
| | - Artitaya Lophatananon
- Division of Health Sciences, Warwick Medical School, Warwick University, Coventry, UK
| | - Peter A. Fasching
- Department of Gynaecology and Obstetrics, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nuremberg, Comprehensive Cancer Center Erlangen-EMN, Erlangen, Germany
- David Geffen School of Medicine, Department of Medicine Division of Hematology and Oncology, University of California at Los Angeles, Los Angeles, CA, United States of America
| | - Matthias W. Beckmann
- Department of Gynaecology and Obstetrics, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nuremberg, Comprehensive Cancer Center Erlangen-EMN, Erlangen, Germany
| | - Olivia Fletcher
- Breakthrough Breast Cancer Research Centre, The Institute of Cancer Research, London, UK
- Division of Breast Cancer Research, The Institute of Cancer Research, London, UK
| | - Nichola Johnson
- Breakthrough Breast Cancer Research Centre, The Institute of Cancer Research, London, UK
- Division of Breast Cancer Research, The Institute of Cancer Research, London, UK
| | - Elinor J. Sawyer
- Research Oncology, Guy’s Hospital, King's College London, London, UK
| | - Ian Tomlinson
- Wellcome Trust Centre for Human Genetics and Oxford Biomedical Research Centre, University of Oxford, Oxford, UK
| | - Barbara Burwinkel
- Department of Obstetrics and Gynecology, University of Heidelberg, Heidelberg, Germany
- Molecular Epidemiology Group, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Frederik Marme
- Department of Obstetrics and Gynecology, University of Heidelberg, Heidelberg, Germany
- National Center for Tumor Diseases, University of Heidelberg, Heidelberg, Germany
| | - Pascal Guénel
- Environmental Epidemiology of Cancer, Center for Research in Epidemiology and Population Health, INSERM, Villejuif, France
- University Paris-Sud, Villejuif, France
| | - Thérèse Truong
- Environmental Epidemiology of Cancer, Center for Research in Epidemiology and Population Health, INSERM, Villejuif, France
- University Paris-Sud, Villejuif, France
| | - Stig E. Bojesen
- Copenhagen General Population Study, Herlev Hospital, Copenhagen University Hospital, Herlev, Denmark
- Department of Clinical Biochemistry, Herlev Hospital, Copenhagen University Hospital, Herlev, Denmark
- Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Henrik Flyger
- Department of Breast Surgery, Herlev Hospital, Copenhagen University Hospital, Herlev, Denmark
| | - Javier Benitez
- Human Cancer Genetics Program, Spanish National Cancer Research Centre, Madrid, Spain
- Centro de Investigación en Red de Enfermedades Raras, Valencia, Spain
| | - Anna González-Neira
- Human Cancer Genetics Program, Spanish National Cancer Research Centre, Madrid, Spain
| | - Hoda Anton-Culver
- Department of Epidemiology, University of California Irvine, Irvine, CA, United States of America
| | - Susan L. Neuhausen
- Beckman Research Institute of City of Hope, Duarte, CA, United States of America
| | - Hermann Brenner
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Division of Preventive Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Volker Arndt
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Alfons Meindl
- Division of Gynaecology and Obstetrics, Technische Universität München, Munich, Germany
| | - Rita K. Schmutzler
- Division of Molecular Gyneco-Oncology, Department of Gynaecology and Obstetrics, University Hospital of Cologne, Cologne, Germany
- Center of Familial Breast and Ovarian Cancer, University Hospital of Cologne, Cologne, Germany
- Center for Integrated Oncology, University Hospital of Cologne, Cologne, Germany
- Center for Molecular Medicine, University Hospital of Cologne, Cologne, Germany
| | - Hiltrud Brauch
- German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Dr. Margarete Fischer-Bosch-Institute of Clinical Pharmacology, Stuttgart, Germany
- University of Tübingen, Tübingen, Germany
| | - Ute Hamann
- Molecular Genetics of Breast Cancer, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Heli Nevanlinna
- Department of Obstetrics and Gynecology, Helsinki University Hospital, University of Helsinki, Helsinki, Finland
| | - Sofia Khan
- Department of Obstetrics and Gynecology, Helsinki University Hospital, University of Helsinki, Helsinki, Finland
| | - Keitaro Matsuo
- Department of Preventive Medicine, Kyushu University Faculty of Medical Sciences, Fukuoka, Japan
| | - Hiroji Iwata
- Department of Breast Oncology, Aichi Cancer Center Hospital, Aichi, Japan
| | - Thilo Dörk
- Gynaecology Research Unit, Hannover Medical School, Hannover, Germany
| | | | - Annika Lindblom
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
| | - Sara Margolin
- Department of Oncology - Pathology, Karolinska Institutet, Stockholm, Sweden
| | - Arto Mannermaa
- Cancer Center, Kuopio University Hospital, Kuopio, Finland
- Institute of Clinical Medicine, Pathology and Forensic Medicine, University of Eastern Finland, Kuopio, Finland
- Imaging Center, Department of Clinical Pathology, Kuopio University Hospital, Kuopio, Finland
| | - Veli-Matti Kosma
- Cancer Center, Kuopio University Hospital, Kuopio, Finland
- Institute of Clinical Medicine, Pathology and Forensic Medicine, University of Eastern Finland, Kuopio, Finland
- Imaging Center, Department of Clinical Pathology, Kuopio University Hospital, Kuopio, Finland
| | | | | | - Anna H. Wu
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, United States of America
| | - David ven den Berg
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, United States of America
| | - Ann Smeets
- University Hospital Gasthuisberg, Leuven, Belgium
| | - Hui Zhao
- Vesalius Research Center, Leuven, Belgium
- Laboratory for Translational Genetics, Department of Oncology, University of Leuven, Leuven, Belgium
| | - Jenny Chang-Claude
- Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- University Cancer Center Hamburg (UCCH), University Medical Center Hamburg-Eppendorf, Hamburg, Germany
| | - Anja Rudolph
- Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Paolo Radice
- Unit of Molecular Bases of Genetic Risk and Genetic Testing, Department of Preventive and Predictive Medicine, Fondazione IRCCS (Istituto Di Ricovero e Cura a Carattere Scientifico) Istituto Nazionale dei Tumori (INT), Milan, Italy
| | - Monica Barile
- Division of Cancer Prevention and Genetics, Istituto Europeo di Oncologia, Milan, Italy
| | - Fergus J. Couch
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, United States of America
| | - Celine Vachon
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States of America
| | - Graham G. Giles
- Centre for Epidemiology and Biostatistics, School of Population and Global Health, The University of Melbourne, Melbourne, Australia
- Cancer Epidemiology Centre, Cancer Council Victoria, Melbourne, Australia
| | - Roger L. Milne
- Centre for Epidemiology and Biostatistics, School of Population and Global Health, The University of Melbourne, Melbourne, Australia
- Cancer Epidemiology Centre, Cancer Council Victoria, Melbourne, Australia
| | - Christopher A. Haiman
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, United States of America
| | - Loic Le Marchand
- University of Hawaii Cancer Center, Honolulu, HI, United States of America
| | - Mark S. Goldberg
- Department of Medicine, McGill University, Montreal, Canada
- Division of Clinical Epidemiology, Royal Victoria Hospital, McGill University, Montreal, Canada
| | - Soo H. Teo
- Cancer Research Initiatives Foundation, Subang Jaya, Selangor, Malaysia
- Breast Cancer Research Unit, Cancer Research Institute, University Malaya Medical Centre, Kuala Lumpur, Malaysia
| | - Nur A. M. Taib
- Breast Cancer Research Unit, Cancer Research Institute, University Malaya Medical Centre, Kuala Lumpur, Malaysia
| | - Vessela Kristensen
- Department of Genetics, Institute for Cancer Research, Radiumhospitalet, Oslo University Hospital, Oslo, Norway
- Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
- Department of Clinical Molecular Biology, Oslo University Hospital, University of Oslo, Oslo, Norway
| | - Anne-Lise Borresen-Dale
- Department of Genetics, Institute for Cancer Research, Radiumhospitalet, Oslo University Hospital, Oslo, Norway
- Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
| | - Wei Zheng
- Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, United States of America
| | - Martha Shrubsole
- Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, United States of America
| | - Robert Winqvist
- Laboratory of Cancer Genetics and Tumor Biology, Department of Clinical Chemistry and Biocenter Oulu, University of Oulu, Oulu, Finland
- Laboratory of Cancer Genetics and Tumor Biology, Northern Finland Laboratory Centre NordLab, Oulu, Finland
| | | | - Irene L. Andrulis
- Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Canada
| | - Julia A. Knight
- Prosserman Centre for Health Research, Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, Canada
- Division of Epidemiology, Dalla Lana School of Public Health, University of Toronto, Toronto, Canada
| | - Peter Devilee
- Department of Pathology, Leiden University Medical Center, Leiden, The Netherlands
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Caroline Seynaeve
- Department of Medical Oncology, Family Cancer Clinic, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
| | - Montserrat García-Closas
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, United States of America
- Division of Genetics and Epidemiology, The Institute of Cancer Research, London, UK
| | - Kamila Czene
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Hatef Darabi
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Antoinette Hollestelle
- Department of Medical Oncology, Family Cancer Clinic, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
| | - John W. M. Martens
- Department of Medical Oncology, Family Cancer Clinic, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
| | - Jingmei Li
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Wei Lu
- Shanghai Center for Disease Control and Prevention, Shanghai, China
| | - Xiao-Ou Shu
- Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, United States of America
| | - Angela Cox
- Sheffield Cancer Research, Department of Oncology, University of Sheffield, Sheffield, UK
| | - Simon S. Cross
- Academic Unit of Pathology, Department of Neuroscience, University of Sheffield, Sheffield, UK
| | - William Blot
- Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, United States of America
- International Epidemiology Institute, Rockville, MD, United States of America
| | - Qiuyin Cai
- Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, United States of America
| | - Mitul Shah
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, UK
| | - Craig Luccarini
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, UK
| | - Caroline Baynes
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, UK
| | - Patricia Harrington
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, UK
| | - Daehee Kang
- Department of Preventive Medicine, Seoul National University College of Medicine, Seoul, Korea
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Korea
- Cancer Research Institute, Seoul National University, Seoul, Korea
| | - Ji-Yeob Choi
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Korea
- Cancer Research Institute, Seoul National University, Seoul, Korea
| | - Mikael Hartman
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore
- Department of Surgery, National University Health System, Singapore, Singapore
| | - Kee Seng Chia
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore
| | - Maria Kabisch
- Molecular Genetics of Breast Cancer, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Diana Torres
- Molecular Genetics of Breast Cancer, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Institute of Human Genetics, Pontificia Universidad Javeriana, Bogota, Colombia
| | - Anna Jakubowska
- Department of Genetics and Pathology, Pomeranian Medical University, Szczecin, Poland
| | - Jan Lubinski
- Department of Genetics and Pathology, Pomeranian Medical University, Szczecin, Poland
| | | | - Paul Brennan
- International Agency for Research on Cancer, Lyon, France
| | - Susan Slager
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States of America
| | - Drakoulis Yannoukakos
- Molecular Diagnostics Laboratory, IRRP, National Centre for Scientific Research "Demokritos", Athens, Greece
| | - Chen-Yang Shen
- School of Public Health, China Medical University, Taichung, Taiwan
- Taiwan Biobank, Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Ming-Feng Hou
- Cancer Center and Department of Surgery, Chung-Ho Memorial Hospital, Kaohsiung Medical University, Kaohsiung, Taiwan
| | - Anthony Swerdlow
- Division of Breast Cancer Research, The Institute of Cancer Research, London, UK
- Division of Genetics and Epidemiology, The Institute of Cancer Research, London, UK
| | - Nick Orr
- Breakthrough Breast Cancer Research Centre, The Institute of Cancer Research, London, UK
| | - Jacques Simard
- Centre Hospitalier Universitaire de Québec Research Center, Laval University, Québec City, Canada
| | - Per Hall
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Paul D. P. Pharoah
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, UK
| | - Douglas F. Easton
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, UK
| | - Stephen J. Chanock
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, United States of America
| | - Alison M. Dunning
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, UK
| | - Jonine D. Figueroa
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, United States of America
- Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, UK
- Edinburgh Cancer Research UK Centre, University of Edinburgh, Edinburgh, UK
|
18
|
Pang E, Wu X, Lin K. Different evolutionary patterns of SNPs between domains and unassigned regions in human protein-coding sequences. Mol Genet Genomics 2016; 291:1127-36. [PMID: 26833483 PMCID: PMC4875946 DOI: 10.1007/s00438-016-1170-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/14/2015] [Accepted: 01/18/2016] [Indexed: 11/30/2022]
Abstract
Protein evolution plays an important role in the evolution of each genome. Because of their functional nature, in general, most of their parts or sites are differently constrained selectively, particularly by purifying selection. Most previous studies on protein evolution considered individual proteins in their entirety or compared protein-coding sequences with non-coding sequences. Less attention has been paid to the evolution of different parts within each protein of a given genome. To this end, based on PfamA annotation of all human proteins, each protein sequence can be split into two parts: domains or unassigned regions. Using this rationale, single nucleotide polymorphisms (SNPs) in protein-coding sequences from the 1000 Genomes Project were mapped according to two classifications: SNPs occurring within protein domains and those within unassigned regions. With these classifications, we found: the density of synonymous SNPs within domains is significantly greater than that of synonymous SNPs within unassigned regions; however, the density of non-synonymous SNPs shows the opposite pattern. We also found there are signatures of purifying selection on both the domain and unassigned regions. Furthermore, the selective strength on domains is significantly greater than that on unassigned regions. In addition, among all of the human protein sequences, there are 117 PfamA domains in which no SNPs are found. Our results highlight an important aspect of protein domains and may contribute to our understanding of protein evolution.
Affiliation(s)
- Erli Pang
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing, 100875, China.
| | - Xiaomei Wu
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou, 310036, China
| | - Kui Lin
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing, 100875, China
|
19
|
Linnik JE, Egli A. Impact of host genetic polymorphisms on vaccine induced antibody response. Hum Vaccin Immunother 2016; 12:907-15. [PMID: 26809773 DOI: 10.1080/21645515.2015.1119345] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Indexed: 12/13/2022] Open
Abstract
Many host- and vaccine-specific factors modulate an antibody response. Host genetic polymorphisms, in particular, modulate the immune response in multiple ways on different scales. This review article describes how information on host genetic polymorphisms and corresponding immune cascades may be used to generate personalized vaccine strategies to optimize the antibody response.
Affiliation(s)
- Janina E Linnik
- Applied Microbiology Research, Department of Biomedicine, University Basel, Basel, Switzerland
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Adrian Egli
- a Applied Microbiology Research , Department of Biomedicine, University Basel , Basel , Switzerland.,d Clinical Microbiology, University Hospital Basel , Basel , Switzerland
|
20
|
Warr A, Robert C, Hume D, Archibald AL, Deeb N, Watson M. Identification of Low-Confidence Regions in the Pig Reference Genome (Sscrofa10.2). Front Genet 2015; 6:338. [PMID: 26640477 PMCID: PMC4662242 DOI: 10.3389/fgene.2015.00338] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 08/11/2015] [Accepted: 11/12/2015] [Indexed: 01/09/2023]
Abstract
Many applications of high throughput sequencing rely on the availability of an accurate reference genome. Variant calling often produces large data sets that cannot be realistically validated and which may contain large numbers of false-positives. Errors in the reference assembly increase the number of false-positives. While resources are available to aid in the filtering of variants from human data, for other species these do not yet exist and strict filtering techniques must be employed which are more likely to exclude true-positives. This work assesses the accuracy of the pig reference genome (Sscrofa10.2) using whole genome sequencing reads from the Duroc sow whose genome the assembly was based on. Indicators of structural variation including high regional coverage, unexpected insert sizes, improper pairing and homozygous variants were used to identify low quality (LQ) regions of the assembly. Low coverage (LC) regions were also identified and analyzed separately. The LQ regions covered 13.85% of the genome, the LC regions covered 26.6% of the genome and combined (LQLC) they covered 33.07% of the genome. Over half of dbSNP variants were located in the LQLC regions. Of copy number variable regions identified in a previous study, 86.3% were located in the LQLC regions. The regions were also enriched for gene predictions from RNA-seq data with 42.98% falling in the LQLC regions. Excluding variants in the LQ, LC, or LQLC from future analyses will help reduce the number of false-positive variant calls. Researchers using WGS data should be aware that the current pig reference genome does not give an accurate representation of the copy number of alleles in the original Duroc sow's genome.
Affiliation(s)
- Amanda Warr
- Division of Genetics and Genomics, The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK
| | - Christelle Robert
- Division of Genetics and Genomics, The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK
| | - David Hume
- Division of Genetics and Genomics, The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK
| | - Alan L. Archibald
- Division of Genetics and Genomics, The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK
| | | | - Mick Watson
- Division of Genetics and Genomics, The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK. *Correspondence: Mick Watson
|
21
|
Arthur JW, Cheung FSG, Reichardt JKV. Single nucleotide differences (SNDs) continue to contaminate the dbSNP database with consequences for human genomics and health. Hum Mutat 2015; 36:196-9. [PMID: 25421747 DOI: 10.1002/humu.22735] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 08/25/2014] [Accepted: 11/17/2014] [Indexed: 01/31/2023]
Abstract
It has been established that up to 8.3% of the biallelic coding SNPs present in dbSNP are actually artefactual polymorphism-like errors, previously termed single nucleotide differences, or SNDs. In this study, a previous analysis of SNPs in dbSNP was extended and updated to examine how the incidence of SNDs has changed over an intervening five year period. The incidence of SNDs was found to be lower than in the previous analysis at 2.2% of all biallelic SNPs. There was only a modest reduction in the percentage of SNDs in the original set of biallelic coding SNPs tested. This suggests that the overall reduction in the incidence of SNDs over the intervening 5-year period is related to an improvement in SNP detection methods and more rigorous curation, rather than efforts to ameliorate the presence of SNDs. We note that SNDs contaminating the dbSNP may lead to erroneous conclusions on human conditions.
Affiliation(s)
- Jonathan W Arthur
- Children's Medical Research Institute, University of Sydney, Westmead, New South Wales, Australia
|
22
|
Computational approaches to study the effects of small genomic variations. J Mol Model 2015; 21:251. [PMID: 26350246 DOI: 10.1007/s00894-015-2794-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 04/08/2015] [Accepted: 08/23/2015] [Indexed: 10/23/2022]
Abstract
Advances in DNA sequencing technologies have led to an avalanche-like increase in the number of gene sequences deposited in public databases over the last decade as well as the detection of an enormous number of previously unseen nucleotide variants therein. Given the size and complex nature of the genome-wide sequence variation data, as well as the rate of data generation, experimental characterization of the disease association of each of these variations or their effects on protein structure/function would be costly, laborious, time-consuming, and essentially impossible. Thus, in silico methods to predict the functional effects of sequence variations are constantly being developed. In this review, we summarize the major computational approaches and tools that are aimed at the prediction of the functional effect of mutations, and describe the state-of-the-art databases that can be used to obtain information about mutation significance. We also discuss future directions in this highly competitive field.
|
23
|
Di Gioia SA, Bedoni N, von Scheven-Gête A, Vanoni F, Superti-Furga A, Hofer M, Rivolta C. Analysis of the genetic basis of periodic fever with aphthous stomatitis, pharyngitis, and cervical adenitis (PFAPA) syndrome. Sci Rep 2015; 5:10200. [PMID: 25988833 PMCID: PMC4437314 DOI: 10.1038/srep10200] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.8] [Received: 11/10/2014] [Accepted: 04/07/2015] [Indexed: 12/30/2022]
Abstract
PFAPA syndrome is the most common autoinflammatory syndrome in children from Western countries. In spite of its strong familial clustering, its genetic basis and inheritance pattern are still unknown. We performed a comprehensive genetic study on 68 individuals from 14 families. Linkage analysis suggested a susceptibility locus on chromosome 8, but direct molecular sequencing did not support this initial statistical finding. Exome sequencing revealed the absence of any gene that was mutated in all patients. Exhaustive screening of genes involved in other autoinflammatory syndromes or encoding components of the human inflammasome showed no DNA variants that could be linked to PFAPA molecular pathology. Among these, the previously-reported missense mutation V198M in the NLRP3 gene was clearly shown not to co-segregate with PFAPA. Our results on this relatively large cohort indicate that PFAPA syndrome is unlikely to be a monogenic condition. Moreover, none of the several genes known to be involved in inflammation or in autoinflammatory disorders seem to be relevant, alone, to its etiology, suggesting that PFAPA results from oligogenic or complex inheritance of variants in multiple disease genes and/or non-genetic factors.
Affiliation(s)
| | - Nicola Bedoni
- Department of Medical Genetics, University of Lausanne, Lausanne, Switzerland
| | - Annette von Scheven-Gête
- Pediatric Rheumatology Unit of Western Switzerland, Department of Pediatrics, CHUV, University Hospital of Lausanne, Lausanne, Switzerland
| | - Federica Vanoni
- Pediatric Rheumatology Unit of Western Switzerland, Department of Pediatrics, CHUV, University Hospital of Lausanne, Lausanne, Switzerland
| | - Andrea Superti-Furga
- Department of Pediatrics, CHUV, University Hospital of Lausanne, Lausanne, Switzerland
| | - Michaël Hofer
- [1] Pediatric Rheumatology Unit of Western Switzerland, Department of Pediatrics, CHUV, University Hospital of Lausanne, Lausanne, Switzerland; [2] Department of Pediatrics, HUG, Geneva, Switzerland
| | - Carlo Rivolta
- Department of Medical Genetics, University of Lausanne, Lausanne, Switzerland
|
24
|
Identifying Highly Penetrant Disease Causal Mutations Using Next Generation Sequencing: Guide to Whole Process. Biomed Res Int 2015; 2015:923491. [PMID: 26106619 PMCID: PMC4461748 DOI: 10.1155/2015/923491] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 12/11/2014] [Accepted: 03/17/2015] [Indexed: 01/10/2023]
Abstract
Recent technological advances have created challenges for geneticists and a need to adapt to a wide range of new bioinformatics tools and an expanding wealth of publicly available data (e.g., mutation databases and software). This wide range of methods and the diversity of file formats used in sequence analysis are a significant issue, with a considerable amount of time spent before anyone can even attempt to analyse the genetic basis of human disorders. Another point to consider is that although many possess “just enough” knowledge to analyse their data, they do not make full use of the tools and databases that are available and also do not fully understand how their data was created. The primary aim of this review is to document some of the key approaches and provide an analysis schema to make the analysis process more efficient and reliable in the context of discovering highly penetrant causal mutations/genes. This review will also compare the methods used to identify highly penetrant variants when data is obtained from consanguineous individuals as opposed to nonconsanguineous individuals, and when Mendelian disorders are analysed as opposed to common-complex disorders.
|
25
|
Radenbaugh AJ, Ma S, Ewing A, Stuart JM, Collisson EA, Zhu J, Haussler D. RADIA: RNA and DNA integrated analysis for somatic mutation detection. PLoS One 2014; 9:e111516. [PMID: 25405470 PMCID: PMC4236012 DOI: 10.1371/journal.pone.0111516] [Citation(s) in RCA: 63] [Impact Index Per Article: 5.7] [Received: 06/05/2014] [Accepted: 09/30/2014] [Indexed: 01/30/2023]
Abstract
The detection of somatic single nucleotide variants is a crucial component to the characterization of the cancer genome. Mutation calling algorithms thus far have focused on comparing the normal and tumor genomes from the same individual. In recent years, it has become routine for projects like The Cancer Genome Atlas (TCGA) to also sequence the tumor RNA. Here we present RADIA (RNA and DNA Integrated Analysis), a novel computational method combining the patient-matched normal and tumor DNA with the tumor RNA to detect somatic mutations. The inclusion of the RNA increases the power to detect somatic mutations, especially at low DNA allelic frequencies. By integrating an individual's DNA and RNA, we are able to detect mutations that would otherwise be missed by traditional algorithms that examine only the DNA. We demonstrate high sensitivity (84%) and very high precision (98% and 99%) for RADIA in patient data from endometrial carcinoma and lung adenocarcinoma from TCGA. Mutations with both high DNA and RNA read support have the highest validation rate of over 99%. We also introduce a simulation package that spikes in artificial mutations to patient data, rather than simulating sequencing data from a reference genome. We evaluate sensitivity on the simulation data and demonstrate our ability to rescue back mutations at low DNA allelic frequencies by including the RNA. Finally, we highlight mutations in important cancer genes that were rescued due to the incorporation of the RNA.
Affiliation(s)
- Amie J. Radenbaugh
- University of California Santa Cruz Genomics Institute, Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, California, United States of America
| | - Singer Ma
- University of California Santa Cruz Genomics Institute, Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, California, United States of America
| | - Adam Ewing
- University of California Santa Cruz Genomics Institute, Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, California, United States of America
| | - Joshua M. Stuart
- University of California Santa Cruz Genomics Institute, Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, California, United States of America
| | - Eric A. Collisson
- Division of Hematology/Oncology, University of California San Francisco, San Francisco, California, United States of America
| | - Jingchun Zhu
- University of California Santa Cruz Genomics Institute, Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, California, United States of America
| | - David Haussler
- University of California Santa Cruz Genomics Institute, Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, California, United States of America
- Howard Hughes Medical Institute, Chevy Chase, Maryland, United States of America
|
26
|
Congenital cataracts: de novo gene conversion event in CRYBB2. Mol Vis 2014; 20:1579-93. [PMID: 25489230 PMCID: PMC4225141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2014] [Accepted: 11/04/2014] [Indexed: 11/23/2022]
Abstract
PURPOSE: To identify the cause of congenital cataracts in a consanguineous family of Ashkenazi Jewish ancestry. METHODS: We performed genome-wide linkage analysis and whole-exome sequencing for the initial discovery of variants, and we confirmed the variants using gene-specific primers and Sanger sequencing. RESULTS: We found significant evidence of linkage to chromosome 22, under an autosomal dominant inheritance model, with a maximum logarithm of the odds (LOD) score of 3.91 (16.918 to 25.641 Mb). Exome sequencing identified three nonsynonymous changes in the CRYBB2 exon 5 coding sequence that are consistent with the sequence of the corresponding region of the pseudogene CRYBB2P1. The identification of these changes was complicated by possible mismapping of some mutated CRYBB2 sequences to CRYBB2P1. Sequencing with gene-specific primers confirmed that the changes--rs2330991, c.433C>T (p.R145W); rs2330992, c.440A>G (p.Q147R); and rs4049504, c.449C>T (p.T150M)--present in all ten affected family members are located in CRYBB2 and are not artifacts of cross-reaction with CRYBB2P1. We did not find these changes in six unaffected family members, including the unaffected grandfather who contributed the affected haplotype, nor did we find them in the 100 Ashkenazi Jewish controls. CONCLUSIONS: Our data are consistent with a de novo gene conversion event, transferring 270 base pairs at most from CRYBB2P1 to exon 5 of CRYBB2. This study highlights how linkage mapping can be complicated by de novo mutation events, as well as how sequence-analysis pipeline mapping of short reads from next-generation sequencing can be complicated by the existence of pseudogenes or other highly homologous sequences.
|
27
|
Rižner TL, Penning TM. Role of aldo-keto reductase family 1 (AKR1) enzymes in human steroid metabolism. Steroids 2014; 79:49-63. [PMID: 24189185 PMCID: PMC3870468 DOI: 10.1016/j.steroids.2013.10.012] [Citation(s) in RCA: 151] [Impact Index Per Article: 13.7] [Received: 08/09/2013] [Revised: 10/16/2013] [Accepted: 10/24/2013] [Indexed: 12/30/2022]
Abstract
Human aldo-keto reductases AKR1C1-AKR1C4 and AKR1D1 play essential roles in the metabolism of all steroid hormones, the biosynthesis of neurosteroids and bile acids, and the metabolism of conjugated steroids and synthetic therapeutic steroids. These enzymes catalyze NADPH-dependent reductions at the C3, C5, C17 and C20 positions on the steroid nucleus and side-chain. AKR1C1-AKR1C4 act as 3-keto, 17-keto and 20-ketosteroid reductases to varying extents, while AKR1D1 acts as the sole Δ(4)-3-ketosteroid-5β-reductase (steroid 5β-reductase) in humans. AKR1 enzymes control the concentrations of active ligands for nuclear receptors and control their ligand occupancy and trans-activation; they also regulate the amount of neurosteroids that can modulate the activity of GABAA and NMDA receptors. As such they are involved in the pre-receptor regulation of nuclear and membrane-bound receptors. Altered expression of individual AKR1C genes is related to development of prostate, breast, and endometrial cancer. Mutations in AKR1C1 and AKR1C4 are responsible for sexual development dysgenesis and mutations in AKR1D1 are causative in bile-acid deficiency.
Affiliation(s)
- Tea Lanišnik Rižner
- Institute of Biochemistry, Faculty of Medicine, University of Ljubljana, Slovenia.
| | - Trevor M Penning
- Center of Excellence in Environmental Toxicology, Department of Pharmacology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
|
28
|
Tekin I, Vrana KE. Caveat emptor: single nucleotide polymorphism reporting in pharmacogenomics. Pharmacology 2013; 92:319-23. [PMID: 24356117 DOI: 10.1159/000356324] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/04/2013] [Accepted: 10/09/2013] [Indexed: 11/19/2022]
Abstract
While it is arguably the most comprehensive source of genetic information, the NCBI's dbSNP database (National Center for Biotechnology Information database of single nucleotide polymorphisms; http://www.ncbi.nlm.nih.gov/projects/SNP/) is imperfect. In this commentary, we highlight the issues surrounding this database, while considering the great importance and utility of this resource for those in the pharmacology and pharmacogenomics communities. We describe our experience with the information in this database as a cautionary tale for those who will utilize such information in the future. We also discuss several measures that could render it more reliable.
Affiliation(s)
- Izel Tekin
- Department of Pharmacology, Penn State College of Medicine, Pennsylvania State University, Hershey, Pa., USA
|
29
|
Dorn C, Grunert M, Sperling SR. Application of high-throughput sequencing for studying genomic variations in congenital heart disease. Brief Funct Genomics 2013; 13:51-65. [PMID: 24095982 DOI: 10.1093/bfgp/elt040] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 12/30/2022]
Abstract
Congenital heart diseases (CHD) represent the most common birth defect in humans. The majority of cases are caused by a combination of complex genetic alterations and environmental influences. In the past, many disease-causing mutations have been identified; however, there is still a large proportion of cardiac malformations whose precise origin is unknown. High-throughput sequencing technologies established in recent years offer novel opportunities to further study the genetic background underlying the disease. In this review, we provide a roadmap for designing and analyzing high-throughput sequencing studies focused on CHD, but also with general applicability to other complex diseases. The three main next-generation sequencing (NGS) platforms, including their particular advantages and disadvantages, are presented. To identify potentially disease-related genomic variations and genes, different filtering steps and gene prioritization strategies are discussed. In addition, available control datasets based on NGS are summarized. Finally, we provide an overview of current studies already using NGS technologies and showing that these techniques will help to further unravel the complex genetics underlying CHD.
Affiliation(s)
- Cornelia Dorn
- Department of Cardiovascular Genetics, Experimental and Clinical Research Center (ECRC), Charité-University Medicine Berlin and Max Delbrück Center (MDC) for Molecular Medicine, Lindenberger Weg 80, 13125 Berlin, Germany. Department of Biochemistry, Free University Berlin, Berlin, Germany. Tel.: +49-(0)30-450540123; Fax: +49-(0)30-84131699;
|
30
|
Farrer RA, Henk DA, MacLean D, Studholme DJ, Fisher MC. Using false discovery rates to benchmark SNP-callers in next-generation sequencing projects. Sci Rep 2013; 3:1512. [PMID: 23518929 PMCID: PMC3604800 DOI: 10.1038/srep01512] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 11/09/2012] [Accepted: 02/25/2013] [Indexed: 12/16/2022]
Abstract
Sequence alignments form the basis for many comparative and population genomic studies. Alignment tools provide a range of accuracies dependent on the divergence between the sequences and the alignment methods. Despite widespread use, there is no standard method for assessing the accuracy of a dataset and alignment strategy after resequencing. We present a framework and tool for determining the overall accuracies of an input read dataset, alignment and SNP-calling method, provided an isolate in that dataset has a corresponding or closely related reference sequence available. In addition to this tool for comparing False Discovery Rates (FDR), we include a method for determining homozygous and heterozygous positions from an alignment using binomial probabilities for an expected error rate. We benchmark this method against other SNP callers using our FDR method with three fungal genomes, finding that it was able to achieve a high level of accuracy. These tools are available at http://cfdr.sourceforge.net/.
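As a rough illustration of the binomial idea mentioned in this abstract, the sketch below classifies a position from the number of reads supporting the alternate allele. It is not the authors' tool: the function names, the 1% default error rate, and the three-way maximum-likelihood comparison are all assumptions made for this example.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def call_genotype(alt_count: int, depth: int, error_rate: float = 0.01) -> str:
    """Pick the most likely genotype for one position (hypothetical helper).

    Assumed model: each read shows the alternate allele with probability
    error_rate (homozygous reference), 0.5 (heterozygous), or
    1 - error_rate (homozygous alternate).
    """
    if depth <= 0 or not 0 <= alt_count <= depth:
        raise ValueError("need depth > 0 and 0 <= alt_count <= depth")
    likelihoods = {
        "hom_ref": binom_pmf(alt_count, depth, error_rate),
        "het": binom_pmf(alt_count, depth, 0.5),
        "hom_alt": binom_pmf(alt_count, depth, 1 - error_rate),
    }
    # Return the genotype label with the highest binomial likelihood.
    return max(likelihoods, key=likelihoods.get)
```

With 30 reads, for instance, `call_genotype(15, 30)` favours the heterozygous model, while 0 or 30 alternate reads favour the two homozygous models. A real caller would also need to handle base qualities and multiple alternate alleles, which this sketch ignores.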
Affiliation(s)
- Rhys A Farrer
- Department of Infectious Disease Epidemiology, St Mary's Hospital, Imperial College London, London, UK.
|
31
|
Cabanski CR, Wilkerson MD, Soloway M, Parker JS, Liu J, Prins JF, Marron JS, Perou CM, Hayes DN. BlackOPs: increasing confidence in variant detection through mappability filtering. Nucleic Acids Res 2013; 41:e178. [PMID: 23935067 PMCID: PMC3799449 DOI: 10.1093/nar/gkt692] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Indexed: 01/05/2023]
Abstract
Identifying variants using high-throughput sequencing data is currently a challenge because true biological variants can be indistinguishable from technical artifacts. One source of technical artifact results from incorrectly aligning experimentally observed sequences to their true genomic origin ('mismapping') and inferring differences in mismapped sequences to be true variants. We developed BlackOPs, an open-source tool that simulates experimental RNA-seq and DNA whole exome sequences derived from the reference genome, aligns these sequences by custom parameters, detects variants and outputs a blacklist of positions and alleles caused by mismapping. Blacklists contain thousands of artifact variants that are indistinguishable from true variants and, for a given sample, are expected to be almost completely false positives. We show that these blacklist positions are specific to the alignment algorithm and read length used, and BlackOPs allows users to generate a blacklist specific to their experimental setup. We queried the dbSNP and COSMIC variant databases and found numerous variants indistinguishable from mapping errors. We demonstrate how filtering against blacklist positions reduces the number of potential false variants using an RNA-seq glioblastoma cell line data set. In summary, accounting for mapping-caused variants tuned to experimental setups reduces false positives and, therefore, improves genome characterization by high-throughput sequencing.
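The filtering step described in this abstract can be pictured as a set lookup. The sketch below is not BlackOPs itself; the `Variant` record, the `(chrom, pos, alt)` blacklist key, and the function name are hypothetical choices made for illustration.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Variant(NamedTuple):
    chrom: str
    pos: int  # 1-based position on the chromosome
    ref: str
    alt: str


# Assumed blacklist key: chromosome, position, and the artifact allele.
BlacklistKey = Tuple[str, int, str]


def filter_against_blacklist(
    variants: Iterable[Variant], blacklist: Set[BlacklistKey]
) -> Tuple[List[Variant], List[Variant]]:
    """Split calls into (kept, flagged) by position-and-allele lookup."""
    kept: List[Variant] = []
    flagged: List[Variant] = []
    for v in variants:
        key = (v.chrom, v.pos, v.alt)
        # A call is flagged only if both the position and the allele match.
        (flagged if key in blacklist else kept).append(v)
    return kept, flagged
```

Keying on the allele as well as the position means a different alternate allele at a blacklisted site is still kept, in line with the abstract's description of blacklists as "positions and alleles".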
Affiliation(s)
- Christopher R Cabanski
- Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC 27599, USA, The Genome Institute at Washington University, St. Louis, MO 63108, USA, Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Computer Science, University of Kentucky, Lexington, KY 40506, USA, Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, USA and Division of Medical Oncology, Department of Internal Medicine, University of North Carolina, Chapel Hill, NC 27599, USA
|
32
|
Jeffries AR, Perfect LW, Ledderose J, Schalkwyk LC, Bray NJ, Mill J, Price J. Stochastic choice of allelic expression in human neural stem cells. Stem Cells 2013; 30:1938-47. [PMID: 22714879 DOI: 10.1002/stem.1155] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Indexed: 11/11/2022]
Abstract
Monoallelic gene expression, such as genomic imprinting, is well described. Less well-characterized are genes undergoing stochastic monoallelic expression (MA), where specific clones of cells express just one allele at a given locus. We performed genome-wide allelic expression assessment of human clonal neural stem cells derived from cerebral cortex, striatum, and spinal cord, each with differing genotypes. We assayed three separate clonal lines from each donor, distinguishing stochastic MA from genotypic effects. Roughly 2% of genes showed evidence for autosomal MA, and in about half of these, allelic expression was stochastic between different clones. Many of these loci were known neurodevelopmental genes, such as OTX2 and OLIG2. Monoallelic genes also showed increased levels of DNA methylation compared to hypomethylated biallelic loci. Identified monoallelic gene loci showed altered chromatin signatures in fetal brain, suggesting an in vivo correlate of this phenomenon. We conclude that stochastic allelic expression is prevalent in neural stem cells, providing clonal diversity to developing tissues such as the human brain.
Affiliation(s)
- Aaron R Jeffries
- King's College London, Institute of Psychiatry, Centre for the Cellular Basis of Behaviour, Department of Neuroscience, London, United Kingdom.
|
33
|
|
34
|
Malkaram SA, Hassan YI, Zempleni J. Online tools for bioinformatics analyses in nutrition sciences. Adv Nutr 2012; 3:654-65. [PMID: 22983844 PMCID: PMC3648747 DOI: 10.3945/an.112.002477] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 01/07/2023]
Abstract
Recent advances in "omics" research have resulted in the creation of large datasets that were generated by consortiums and centers, small datasets that were generated by individual investigators, and bioinformatics tools for mining these datasets. It is important for nutrition laboratories to take full advantage of the analysis tools to interrogate datasets for information relevant to genomics, epigenomics, transcriptomics, proteomics, and metabolomics. This review provides guidance regarding bioinformatics resources that are currently available in the public domain, with the intent to provide a starting point for investigators who want to take advantage of the opportunities provided by the bioinformatics field.
Affiliation(s)
- Sridhar A. Malkaram
- Department of Nutrition and Health Sciences, University of Nebraska, Lincoln, Nebraska
| | - Yousef I. Hassan
- Nutrition and Food Science Department, Faculty of Health Sciences, University of Kalamoon, Deirattiah, Syria
| | - Janos Zempleni
- Department of Nutrition and Health Sciences, University of Nebraska, Lincoln, Nebraska. To whom correspondence should be addressed.
|
35
|
An abundance of population-specific monomorphic SNPs may or may not be meaningful: a commentary on differences in allele frequencies of familial hypercholesterolemia SNPs in the Malaysian population. J Hum Genet 2012; 57:403-4. [DOI: 10.1038/jhg.2012.52] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
|
36
|
Rothe J, Nagy M. Strategies for excluding false Y-chromosomal SNP entries from human genome databases. Electrophoresis 2012; 33:1488-91. [PMID: 22648820 DOI: 10.1002/elps.201100685] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/11/2022]
Abstract
Current human genome databases for public single nucleotide polymorphisms (SNPs) still contain a substantial fraction of false entries. The main reasons for errors include sequencing or assembly errors, paralogous sequence variants, and private variants. In the course of our studies on the Y chromosome, we established a set of internal laboratory guidelines for reliably identifying false SNP entries in databases.
Affiliation(s)
- Jessica Rothe
- Department of Forensic Genetics, Institute of Legal Medicine and Forensic Sciences, Berlin, Germany.
|
37
|
Exploring of tri-allelic SNPs using Pyrosequencing and the SNaPshot methods for forensic application. Electrophoresis 2012; 33:841-8. [DOI: 10.1002/elps.201100508] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 11/07/2022]
|
38
|
Donlon TA, Curb JD, He Q, Grove JS, Masaki KH, Rodriguez B, Elliott A, Willcox DC, Willcox BJ. FOXO3 gene variants and human aging: coding variants may not be key players. J Gerontol A Biol Sci Med Sci 2012; 67:1132-9. [PMID: 22459618 DOI: 10.1093/gerona/gls067] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.1] [Indexed: 12/24/2022]
Abstract
FOXO3 is generally recognized as a "master" gene in aging since its association with longevity has been replicated in multiple organisms and human populations. A group of single nucleotide polymorphisms in linkage disequilibrium with a coding region has been associated with human longevity, but the actual functional variant is unidentified. Therefore, we sequenced the coding region in our long-lived Japanese American population in order to enhance resources for fine mapping this region. We demonstrate that of 38 published variants, 6 are misalignments with homologous nonallelic sequences from FOXO3B (ZNF286B), a pseudogene on a different chromosome; 2 are attributable to ZNF286B only, and the remaining 30 were unconfirmed, indicating that they are very rare and not likely involved in longevity. Furthermore, we identified a novel, unique, nonsynonymous coding variant in exon 3 (Gly566Ala; rs138174682) that is prevalent in multiple ethnic groups but appeared too rare for major longevity effects in our study populations.
Affiliation(s)
- Timothy A Donlon
- Honolulu Heart Program, Kuakini Medical Center, Honolulu, Hawaii, USA.
|
39
|
Zha L, Yun L, Chen P, Luo H, Yan J, Hou Y. Exploring of tri-allelic SNPs using Pyrosequencing and the SNaPshot methods for forensic application. Electrophoresis 2012. [DOI: 10.1002/elps.4122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Affiliation(s)
- Lagabaiyila Zha
- Department of Forensic Genetics, School of Basic Science and Forensic Medicine, Sichuan University, Chengdu, P. R. China
- Libing Yun
- Department of Forensic Genetics, School of Basic Science and Forensic Medicine, Sichuan University, Chengdu, P. R. China
- Pengyu Chen
- Department of Forensic Genetics, School of Basic Science and Forensic Medicine, Sichuan University, Chengdu, P. R. China
- Haibo Luo
- Department of Forensic Genetics, School of Basic Science and Forensic Medicine, Sichuan University, Chengdu, P. R. China
- Jing Yan
- Department of Forensic Genetics, School of Basic Science and Forensic Medicine, Sichuan University, Chengdu, P. R. China
- Yiping Hou
- Department of Forensic Genetics, School of Basic Science and Forensic Medicine, Sichuan University, Chengdu, P. R. China
|
40
|
Galichon P, Mesnard L, Hertig A, Stengel B, Rondeau E. Unrecognized sequence homologies may confound genome-wide association studies. Nucleic Acids Res 2012; 40:4774-82. [PMID: 22362730 PMCID: PMC3367202 DOI: 10.1093/nar/gks169] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/13/2022]
Abstract
Genome-wide association studies (GWAS) have become a preferred method to identify new genetic susceptibility loci. This technique aims at understanding the molecular etiology of common diseases, but in many cases, it has led to the identification of loci with no obvious biological relevance. Herein, we show that previously unrecognized sequence homologies have caused single-nucleotide polymorphism (SNP) microarrays to incorrectly associate a phenotype with a given locus when in fact the linkage is to another distant locus. Using genetic differences between male and female subjects as a model to study the effect of one specific genomic region on the whole SNP microarray, we provide strong evidence that the use of standard methods for GWAS can be misleading. We suggest a new systematic quality control step in the biological interpretation of previous and future GWAS.
Affiliation(s)
- Pierre Galichon
- INSERM UMR S702, Université Pierre et Marie Curie - Paris 6, 75006 Paris, France.
|
41
|
Wang J, Ronaghi M, Chong SS, Lee CGL. pfSNP: An integrated potentially functional SNP resource that facilitates hypotheses generation through knowledge syntheses. Hum Mutat 2011; 32:19-24. [PMID: 20672376 DOI: 10.1002/humu.21331] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 11/10/2022]
Abstract
Currently, >14,000,000 single nucleotide polymorphisms (SNPs) are reported. Identifying phenotype-affecting SNPs among these many SNPs poses significant challenges. Although several Web resources are available that can inform about the functionality of SNPs, these resources are mainly annotation databases and are not very comprehensive. In this article, we present a comprehensive, well-annotated, integrated pfSNP (potentially functional SNPs) Web resource (http://pfs.nus.edu.sg/), which is aimed to facilitate better hypothesis generation through knowledge syntheses mediated by better data integration and a user-friendly Web interface. pfSNP integrates >40 different algorithms/resources to interrogate >14,000,000 SNPs from the dbSNP database for SNPs of potential functional significance based on previous published reports, inferred potential functionality from genetic approaches as well as predicted potential functionality from sequence motifs. Its query interface has the user-friendly "auto-complete, prompt-as-you-type" feature and is highly customizable, facilitating different combinations of queries using Boolean logic. Additionally, to facilitate better understanding of the results and aid in hypotheses generation, gene/pathway-level information with text clouds highlighting enriched tissues/pathways as well as detailed related information are also provided on the results page. Hence, the pfSNP resource will be of great interest to scientists focusing on association studies as well as those interested in experimentally addressing the functionality of SNPs.
Affiliation(s)
- Jingbo Wang
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
|
42
|
Abstract
The history of genetic markers accurately partitions the progression of molecular genetics into three phases: the RFLP (restriction fragment length polymorphism), microsatellite and SNP (single nucleotide polymorphism) eras. This chapter focuses predominately on the current workhorse, the SNP, though briefly covers the former two and overviews current online databases and portals that act as central repositories as well as hubs to further detailed information. Central gene or disease-based searches are considered and then followed through systematically.
|
43
|
Abstract
It has been known for many years that the mutation rate varies across the genome. However, only with the advent of large genomic data sets is the full extent of this variation becoming apparent. The mutation rate varies over many different scales, from adjacent sites to whole chromosomes, with the strongest variation seen at the smallest scales. Some of these patterns have clear mechanistic bases, but much of the rate variation remains unexplained, and some of it is deeply perplexing. Variation in the mutation rate has important implications in evolutionary biology and underexplored implications for our understanding of hereditary disease and cancer.
|
44
|
Abstract
The accurate and complete selection of candidate genomic regions from a DNA sample before sequencing is critical in molecular diagnostics. Several recently developed technologies await substantial improvements in performance, cost, and multiplex sample processing. Here we present the utility of long padlock probes (LPPs) for targeted exon capture followed by array-based sequencing. We found that on average 92% of 5,471 exons from 524 nuclear-encoded mitochondrial genes were successfully amplified from genomic DNA from 63 individuals. Only 144 exons did not amplify in any sample due to high GC content. One LPP was sufficient to capture sequences from <100-500 bp in length and only a single-tube capture reaction and one microarray was required per sample. Our approach was highly reproducible and quick (<8 h) and detected DNA variants at high accuracy (false discovery rate 1%, false negative rate 3%) on the basis of known sample SNPs and Sanger sequence verification. In a patient with clinical and biochemical presentation of ornithine transcarbamylase (OTC) deficiency, we identified copy-number differences in the OTC gene at exon-level resolution. This shows the ability of LPPs to accurately preserve a sample's genome information and provides a cost-effective strategy to identify both single nucleotide changes and structural variants in targeted resequencing.
|
45
|
Corbett M, Gecz J. Great expectations: using massively parallel sequencing to solve inherited disorders. Expert Rev Mol Diagn 2011; 10:833-6. [PMID: 20964599 DOI: 10.1586/erm.10.83] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/20/2023]
|
46
|
Olsen MT, Volny VH, Bérubé M, Dietz R, Lydersen C, Kovacs KM, Dodd RS, Palsbøll PJ. A simple route to single-nucleotide polymorphisms in a nonmodel species: identification and characterization of SNPs in the Artic ringed seal (Pusa hispida hispida). Mol Ecol Resour 2011; 11 Suppl 1:9-19. [PMID: 21429159 DOI: 10.1111/j.1755-0998.2010.02941.x] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 12/17/2022]
Affiliation(s)
- Morten Tange Olsen
- Evolutionary Genetics Group, Department of Genetics, Microbiology, and Toxicology, Stockholm University, Sweden.
|
47
|
Abstract
Here we describe a bioinformatic strategy for extracting and analyzing the list of variants revealed from an exome sequencing project to identify potential disease genes. This in silico method filters out the majority of common SNPs and extracts a list of potential candidate protein-coding and non-coding RNA (ncRNA) genes. The workflow employs Galaxy, a publicly available Web-based software, to filter and sort sequence variants identified by capture-based target enrichment and sequencing from exomes including selected ncRNAs.
Affiliation(s)
- Marcus Hinchcliffe
- Department of Molecular and Clinical Genetics, Royal Prince Alfred Hospital, The University of Sydney, Camperdown, NSW, Australia.
|
48
|
Arthur JW, Reichardt JKV. Modeling single nucleotide polymorphisms in the human AKR1C1 and AKR1C2 genes: implications for functional and genotyping analyses. PLoS One 2010; 5:e15604. [PMID: 21217827 PMCID: PMC3013106 DOI: 10.1371/journal.pone.0015604] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 08/31/2010] [Accepted: 11/16/2010] [Indexed: 11/18/2022]
Abstract
Enzymes encoded by the AKR1C1 and AKR1C2 genes are responsible for the metabolism of progesterone and 5α-dihydrotestosterone (DHT), respectively. The effect of amino acid substitutions, resulting from single nucleotide polymorphisms (SNPs) in the AKR1C2 gene, on the enzyme kinetics of the AKR1C2 gene product was determined experimentally by Takahashi et al. In this paper, we used homology modeling to predict and analyze the structure of AKR1C1 and AKR1C2 genetic variants. The experimental reduction in enzyme activity in the AKR1C2 variants F46Y and L172Q, as determined by Takahashi et al., is predicted to be due to increased instability in cofactor binding, caused by disruptions to the hydrogen bonds between NADP and AKR1C2, resulting from the insertion of polar residues into largely non-polar environments near the site of cofactor binding. Other AKR1C2 variants were shown to involve either conservative substitutions or changes taking place on the surface of the molecule and distant from the active site, confirming the experimental finding of Takahashi et al. that these variants do not result in any statistically significant reduction in enzyme activity. The AKR1C1 R258C variant is predicted to have no effect on enzyme activity for similar reasons. Thus, we provide further insight into the molecular mechanism of the enzyme kinetics of these proteins. Our data also highlight previously reported difficulties with online databases.
Affiliation(s)
- Jonathan W Arthur
- Discipline of Medicine, Sydney Medical School, University of Sydney, Sydney, New South Wales, Australia.
|
49
|
Nakken S, Rødland EA, Hovig E. Impact of DNA physical properties on local sequence bias of human mutation. Hum Mutat 2010; 31:1316-25. [PMID: 20886615 DOI: 10.1002/humu.21371] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 01/26/2010] [Accepted: 08/31/2010] [Indexed: 01/07/2023]
Abstract
In selectively neutral regions of the human genome, nucleotide substitutions do not occur at random with respect to the local DNA sequence neighborhood. However, apart from the hypermutability of methylated CpG dinucleotides, which can explain the overrepresentation of nucleotide transitions in this context, the sequence-specific factors underlying point mutation bias remain largely to be determined, both in nature and in quantitative impact. One hypothesis suggests that the physical characteristics of a DNA context could have a modulating effect on its mutability, adjusting the impact of damage or the efficiency of repair. Here, we report a genome-wide computational test of this hypothesis, in which we utilize a constrained set of human non-CpG SNPs as the source of selectively neutral germline mutations. Interestingly, we observe that the quantitative context-dependencies of some substitution types display significant associations to measures of local structural topography and helix stability in DNA. Most prominently, we find that the local sequence bias of transition mutations is significantly associated with the sequence-dependent level of helix instability imposed by the potentially underlying DNA mismatches. The results of our work indicate the extent to which DNA physical properties could have shaped the recent point mutational spectrum in the human genome.
Affiliation(s)
- Sigve Nakken
- Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, Norwegian Radium Hospital, Norway
|
50
|
Saccone SF, Quan J, Mehta G, Bolze R, Thomas P, Deelman E, Tischfield JA, Rice JP. New tools and methods for direct programmatic access to the dbSNP relational database. Nucleic Acids Res 2010; 39:D901-7. [PMID: 21037260 PMCID: PMC3013662 DOI: 10.1093/nar/gkq1054] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Indexed: 12/04/2022]
Abstract
Genome-wide association studies often incorporate information from public biological databases in order to provide a biological reference for interpreting the results. The dbSNP database is an extensive source of information on single nucleotide polymorphisms (SNPs) for many different organisms, including humans. We have developed free software that will download and install a local MySQL implementation of the dbSNP relational database for a specified organism. We have also designed a system for classifying dbSNP tables in terms of common tasks we wish to accomplish using the database. For each task we have designed a small set of custom tables that facilitate task-related queries and provide entity-relationship diagrams for each task composed from the relevant dbSNP tables. In order to expose these concepts and methods to a wider audience we have developed web tools for querying the database and browsing documentation on the tables and columns to clarify the relevant relational structure. All web tools and software are freely available to the public at http://cgsmd.isi.edu/dbsnpq. Resources such as these for programmatically querying biological databases are essential for viably integrating biological information into genetic association experiments on a genome-wide scale.
Affiliation(s)
- Scott F Saccone
- Department of Psychiatry, Washington University, University of Southern California, Washington University, USA.
|