1
Ribeiro VMP, Gouveia GC, Toral FLB. Candidate genes for longitudinal traits under sequential sampling in beef cattle. J Anim Breed Genet 2024; 141:179-192. [PMID: 37917404 DOI: 10.1111/jbg.12833] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Revised: 10/09/2023] [Accepted: 10/14/2023] [Indexed: 11/04/2023]
Abstract
Both the measurement age of a longitudinal trait and the common pre-sampling procedures used in beef cattle herds may affect the identification of a functional candidate gene (FCG) that is potentially associated with a trait. To identify the FCG that takes part in the genetic control of body weight at five different ages in a beef cattle population with and without sequential sampling, the animals were weighed at different measurement events, around 330, 385, 440, 495 and 550 days old. Genetic parameters were estimated for body weight at each age using a single trait (STM) and a random regression model (RRM). In addition, two different databases were used to estimate the genetic parameters: the first (DB100) was formed by all animals that were weighed in the five measurement events, and the second (DB70) has records of the same population, considering that 70% of the heaviest animals were selected after each measurement event. For DB100, genome-wide association studies (GWAS) were performed with 21,667 SNP markers to identify genomic windows that explained at least 1% of the genetic variance. Additionally, prioritization analyses were performed and FCGs were selected. We associated seven different FCGs with body weight at different ages. Among them, the gene DUSP10 was suggested as FCG in all five ages evaluated. Genetic parameters estimated for body weight using DB100 were similar when STM and RRM were applied. However, when DB70 was used as phenotypic data, there were differences between the two models. When the STM was applied, there were differences between the genetic parameters estimated for body weight when DB100 or DB70 were used as sources of phenotypes, but not for the estimates obtained with RRM. The importance of each gene for animal growth can change at different ages, and different genes may be more relevant to body weight at each different growth stage for beef cattle. Besides, sequential sampling can affect the GWAS results of a longitudinal trait. 
Both the age at which a longitudinal trait is measured and pre-sampling can contribute to inconsistencies between studies in GWAS results for body weight in beef cattle, depending on when the data were collected, and consequently in the FCGs identified, even when models that account for a covariance structure are used.
2
Jones W, Gong B, Novoradovskaya N, Li D, Kusko R, Richmond TA, Johann DJ, Bisgin H, Sahraeian SME, Bushel PR, Pirooznia M, Wilkins K, Chierici M, Bao W, Basehore LS, Lucas AB, Burgess D, Butler DJ, Cawley S, Chang CJ, Chen G, Chen T, Chen YC, Craig DJ, Del Pozo A, Foox J, Francescatto M, Fu Y, Furlanello C, Giorda K, Grist KP, Guan M, Hao Y, Happe S, Hariani G, Haseley N, Jasper J, Jurman G, Kreil DP, Łabaj P, Lai K, Li J, Li QZ, Li Y, Li Z, Liu Z, López MS, Miclaus K, Miller R, Mittal VK, Mohiyuddin M, Pabón-Peña C, Parsons BL, Qiu F, Scherer A, Shi T, Stiegelmeyer S, Suo C, Tom N, Wang D, Wen Z, Wu L, Xiao W, Xu C, Yu Y, Zhang J, Zhang Y, Zhang Z, Zheng Y, Mason CE, Willey JC, Tong W, Shi L, Xu J. A verified genomic reference sample for assessing performance of cancer panels detecting small variants of low allele frequency. Genome Biol 2021; 22:111. [PMID: 33863366 PMCID: PMC8051128 DOI: 10.1186/s13059-021-02316-z] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 07/27/2020] [Accepted: 03/18/2021] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Oncopanel genomic testing, which identifies important somatic variants, is increasingly common in medical practice and especially in clinical trials. Currently, there is a paucity of reliable genomic reference samples having a suitably large number of pre-identified variants for properly assessing oncopanel assay analytical quality and performance. The FDA-led Sequencing and Quality Control Phase 2 (SEQC2) consortium analyze ten diverse cancer cell lines individually and their pool, termed Sample A, to develop a reference sample with suitably large numbers of coding positions with known (variant) positives and negatives for properly evaluating oncopanel analytical performance. RESULTS In reference Sample A, we identify more than 40,000 variants down to 1% allele frequency with more than 25,000 variants having less than 20% allele frequency with 1653 variants in COSMIC-related genes. This is 5-100× more than existing commercially available samples. We also identify an unprecedented number of negative positions in coding regions, allowing statistical rigor in assessing limit-of-detection, sensitivity, and precision. Over 300 loci are randomly selected and independently verified via droplet digital PCR with 100% concordance. Agilent normal reference Sample B can be admixed with Sample A to create new samples with a similar number of known variants at much lower allele frequency than what exists in Sample A natively, including known variants having allele frequency of 0.02%, a range suitable for assessing liquid biopsy panels. CONCLUSION These new reference samples and their admixtures provide superior capability for performing oncopanel quality control, analytical accuracy, and validation for small to large oncopanels and liquid biopsy assays.
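The admixture described in this abstract can be illustrated with simple dilution arithmetic. The sketch below is not taken from the paper: it assumes that Sample B carries no copies of the variant at the position and that each sample contributes DNA in proportion to its mixing fraction (copy-number differences between cell lines are ignored). The function names `admixed_vaf` and `fraction_of_a_needed` are hypothetical.

```python
def admixed_vaf(vaf_in_a: float, fraction_a: float) -> float:
    """Expected variant allele frequency after mixing Sample A into Sample B.

    Illustrative assumption: Sample B is variant-free at this position,
    so the variant is diluted in proportion to the share of Sample A.
    """
    if not 0.0 <= vaf_in_a <= 1.0:
        raise ValueError("vaf_in_a must be between 0 and 1")
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must be between 0 and 1")
    return vaf_in_a * fraction_a


def fraction_of_a_needed(vaf_in_a: float, target_vaf: float) -> float:
    """Share of Sample A in the mix needed to reach target_vaf (same assumption)."""
    if not 0.0 < vaf_in_a <= 1.0:
        raise ValueError("vaf_in_a must be in (0, 1]")
    if not 0.0 <= target_vaf <= vaf_in_a:
        raise ValueError("target_vaf must be between 0 and vaf_in_a")
    return target_vaf / vaf_in_a


if __name__ == "__main__":
    # A variant at 1% in Sample A, diluted towards the 0.02% level
    # mentioned in the abstract.
    share = fraction_of_a_needed(0.01, 0.0002)
    print(f"Share of Sample A in the mix: {share:.2%}")
    print(f"Resulting VAF: {admixed_vaf(0.01, share):.4%}")
```

Under these assumptions, a variant present at 1% in Sample A would reach 0.02% when Sample A makes up about 2% of the mixture; real admixtures would need to be verified experimentally.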
Affiliation(s)
- Wendell Jones
  - Q2 Solutions - EA Genomics, 5927 S Miami Blvd., Morrisville, NC, 27560, USA.
- Binsheng Gong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Dan Li
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Rebecca Kusko
  - Immuneering Corporation, One Broadway, 14th Floor, Cambridge, MA, 02142, USA
- Todd A Richmond
  - Market & Application Development Bioinformatics, Roche Sequencing Solutions Inc., 4300 Hacienda Dr., Pleasanton, CA, 94588, USA
- Donald J Johann
  - Winthrop P Rockefeller Cancer Institute, University of Arkansas for Medical Sciences, 4301 W Markham St., Little Rock, AR, 72205, USA
- Halil Bisgin
  - Department of Computer Science, Engineering and Physics, University of Michigan-Flint, Flint, MI, 48502, USA
- Sayed Mohammad Ebrahim Sahraeian
  - Bioinformatics Research & Early Development, Roche Sequencing Solutions Inc., 1301 Shoreway Rd., Suite 7 #300, Belmont, CA, 94002, USA
- Pierre R Bushel
  - National Institute of Environmental Health Sciences, Research Triangle Park, Durham, NC, 27709, USA
- Mehdi Pirooznia
  - Bioinformatics and Computational Biology Laboratory, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Katherine Wilkins
  - Agilent Technologies, 5301 Stevens Creek Blvd., Santa Clara, CA, 95051, USA
- Wenjun Bao
  - JMP Life Sciences, SAS Institute Inc., Cary, NC, 27519, USA
- Lee Scott Basehore
  - Agilent Technologies, 11011 N Torrey Pines Rd., La Jolla, CA, 92037, USA
- Daniel Burgess
  - (formerly) Research and Development, Roche Sequencing Solutions Inc., 500 South Rosa Rd., Madison, WI, 53719, USA
- Daniel J Butler
  - Department of Physiology and Biophysics, Weill Cornell Medicine, Cornell University, New York, NY, 10065, USA
- Simon Cawley
  - (formerly) Clinical Sequencing Division, Thermo Fisher Scientific, 180 Oyster Point Blvd., South San Francisco, CA, 94080, USA
- Chia-Jung Chang
  - Stanford Genome Technology Center, Stanford University, Palo Alto, CA, 94304, USA
- Guangchun Chen
  - Department of Immunology, Genomics and Microarray Core Facility, University of Texas Southwestern Medical Center, 5323 Harry Hine Blvd., Dallas, TX, 75390, USA
- Tao Chen
  - University of Texas Southwestern Medical Center, 2330 Inwood Rd., Dallas, TX, 75390, USA
- Yun-Ching Chen
  - Bioinformatics and Computational Biology Laboratory, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Daniel J Craig
  - Department of Medicine, College of Medicine and Life Sciences, The University of Toledo, Toledo, OH, 43614, USA
- Angela Del Pozo
  - Institute of Medical and Molecular Genetics (INGEMM), Hospital Universitario La Paz, CIBERER Instituto de Salud Carlos III, 28046, Madrid, Spain
- Jonathan Foox
  - Department of Physiology and Biophysics, Weill Cornell Medicine, Cornell University, New York, NY, 10065, USA
- Yutao Fu
  - Thermo Fisher Scientific, 110 Miller Ave., Ann Arbor, MI, 48104, USA
- Kristina Giorda
  - Marketing, Integrated DNA Technologies, Inc., 1710 Commercial Park, Coralville, IA, 52241, USA
- Kira P Grist
  - Q2 Solutions - EA Genomics, 5927 S Miami Blvd., Morrisville, NC, 27560, USA
- Meijian Guan
  - JMP Life Sciences, SAS Institute Inc., Cary, NC, 27519, USA
- Yingyi Hao
  - College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, China
- Scott Happe
  - Agilent Technologies, 1834 State Hwy 71 West, Cedar Creek, TX, 78612, USA
- Gunjan Hariani
  - Q2 Solutions - EA Genomics, 5927 S Miami Blvd., Morrisville, NC, 27560, USA
- Nathan Haseley
  - Illumina Inc., 5200 Illumina Way, San Diego, CA, 92122, USA
- Jeff Jasper
  - Q2 Solutions - EA Genomics, 5927 S Miami Blvd., Morrisville, NC, 27560, USA
- David Philip Kreil
  - Bioinformatics Research, Institute of Molecular Biotechnology, Boku University Vienna, Vienna, Austria
- Paweł Łabaj
  - Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
  - Department of Biotechnology, Boku University, Vienna, Austria
- Kevin Lai
  - Bioinformatics, Integrated DNA Technologies, Inc., 1710 Commercial Park, Coralville, IA, 52241, USA
- Jianying Li
  - Kelly Government Solutions, Inc., Research Triangle Park, NC, 27709, USA
- Quan-Zhen Li
  - Department of Immunology, Genomics and Microarray Core Facility, University of Texas Southwestern Medical Center, 5323 Harry Hine Blvd., Dallas, TX, 75390, USA
- Yulong Li
  - Center of Genome and Personalized Medicine, Institute of Cancer Stem Cell, Dalian Medical University, Dalian, Liaoning, China
- Zhiguang Li
  - Center of Genome and Personalized Medicine, Institute of Cancer Stem Cell, Dalian Medical University, Dalian, Liaoning, China
- Zhichao Liu
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Mario Solís López
  - Institute of Medical and Molecular Genetics (INGEMM), Hospital Universitario La Paz, CIBERER Instituto de Salud Carlos III, 28046, Madrid, Spain
  - EATRIS ERIC- European Infrastructure for Translational Medicine, De Boelelaan 1118, 1081, HZ, Amsterdam, The Netherlands
- Kelci Miclaus
  - JMP Life Sciences, SAS Institute Inc., Cary, NC, 27519, USA
- Raymond Miller
  - Agilent Technologies, 5301 Stevens Creek Blvd., Santa Clara, CA, 95051, USA
- Vinay K Mittal
  - Thermo Fisher Scientific, 110 Miller Ave., Ann Arbor, MI, 48104, USA
- Marghoob Mohiyuddin
  - Bioinformatics Research & Early Development, Roche Sequencing Solutions Inc., 1301 Shoreway Rd., Suite 7 #300, Belmont, CA, 94002, USA
- Carlos Pabón-Peña
  - Agilent Technologies, 5301 Stevens Creek Blvd., Santa Clara, CA, 95051, USA
- Barbara L Parsons
  - Division of Genetic and Molecular Toxicology, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Fujun Qiu
  - Research and Development, Burning Rock Biotech, Shanghai, 201114, China
- Andreas Scherer
  - EATRIS ERIC- European Infrastructure for Translational Medicine, De Boelelaan 1118, 1081, HZ, Amsterdam, The Netherlands
  - Institute for Molecular Medicine Finland (FIMM), Nordic EMBL Partnership for Molecular Medicine, HiLIFE Unit, Biomedicum Helsinki 2U (D302b), FI-00014 University of Helsinki, P.O. Box 20 (Tukholmankatu 8), Helsinki, Finland
- Tieliu Shi
  - Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, 500 Dongchuan Rd, Shanghai, 200241, China
- Suzy Stiegelmeyer
  - University of North Carolina Health, 101 Manning Drive, Chapel Hill, NC, 27514, USA
- Chen Suo
  - Department of Epidemiology, School of Public Health, Fudan University, Shanghai, China
- Nikola Tom
  - EATRIS ERIC- European Infrastructure for Translational Medicine, De Boelelaan 1118, 1081, HZ, Amsterdam, The Netherlands
  - Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Kamenice 5, 625 00, Brno, Czech Republic
- Dong Wang
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Zhining Wen
  - College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, China
- Leihong Wu
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Wenzhong Xiao
  - Stanford Genome Technology Center, Stanford University, Palo Alto, CA, 94304, USA
  - Massachusetts General Hospital, Harvard Medical School, Boston, MA, 02114, USA
- Chang Xu
  - Research and Development, QIAGEN Sciences Inc., Frederick, MD, 21703, USA
- Ying Yu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Hospital/Cancer Institute, Fudan University, Shanghai, 200438, China
- Jiyang Zhang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Hospital/Cancer Institute, Fudan University, Shanghai, 200438, China
- Yifan Zhang
  - University of Arkansas at Little Rock, Little Rock, AR, 72204, USA
- Zhihong Zhang
  - Research and Development, Burning Rock Biotech, Shanghai, 201114, China
- Yuanting Zheng
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Hospital/Cancer Institute, Fudan University, Shanghai, 200438, China
- Christopher E Mason
  - Department of Physiology and Biophysics, Weill Cornell Medicine, Cornell University, New York, NY, 10065, USA
- James C Willey
  - Departments of Medicine, Pathology, and Cancer Biology, College of Medicine and Life Sciences, University of Toledo Health Sciences Campus, 3000 Arlington Ave, Toledo, OH, 43614, USA
- Weida Tong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Leming Shi
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Hospital/Cancer Institute, Fudan University, Shanghai, 200438, China
  - Human Phenome Institute, Fudan University, Shanghai, 201203, China
  - Fudan-Gospel Joint Research Center for Precision Medicine, Fudan University, Shanghai, 200438, China
- Joshua Xu
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA.
3
Wu XL, Xu J, Li H, Ferretti R, He J, Qiu J, Xiao Q, Simpson B, Michell T, Kachman SD, Tait RG, Bauck S. Evaluation of genotyping concordance for commercial bovine SNP arrays using quality-assurance samples. Anim Genet 2019; 50:367-371. [PMID: 31172566 DOI: 10.1111/age.12800] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Accepted: 03/16/2019] [Indexed: 11/29/2022]
Abstract
SNP arrays are widely used in genetic research and agricultural genomics applications, and the quality of SNP genotyping data is of paramount importance. In the present study, SNP genotyping concordance and discordance were evaluated for commercial bovine SNP arrays based on two types of quality assurance (QA) samples provided by Neogen GeneSeek. The genotyping discordance rates (GDRs) between chips were on average between 0.06% and 0.37% based on the QA type I data and between 0.05% and 0.15% based on the QA type II data. The average genotyping error rate (GER) pertaining to single SNP chips, based on the QA type II data, varied between 0.02% and 0.08% per SNP and between 0.01% and 0.06% per sample. These results indicate that genotyping concordance rate was high (i.e. from 99.63% to 99.99%). Nevertheless, mitochondrial and Y chromosome SNPs had considerably elevated GDRs and GERs compared to the SNPs on the 29 autosomes and X chromosome. The majority of genotyping errors resulted from single allotyping errors, which also included the opposite instances for allele 'dropout' (i.e. from AB to AA or BB). Simultaneous allotyping errors on both alleles (e.g. mistaking AA for BB or vice versa) were relatively rare. Finally, a list of SNPs with a GER greater than 1% is provided. Interpretation of association effects of these SNPs, for example in genome-wide association studies, needs to be taken with caution. The genotyping concordance information needs to be considered in the optimal design of future bovine SNP arrays.
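The quantities in this abstract can be made concrete with a small sketch. It is illustrative only and is not the authors' method: it assumes genotypes are coded as "AA", "AB" or "BB", that no-calls are `None` and are excluded from the comparison, and the function names are hypothetical.

```python
from typing import Optional, Sequence

VALID_CALLS = {"AA", "AB", "BB"}


def discordance_rate(calls_1: Sequence[Optional[str]],
                     calls_2: Sequence[Optional[str]]) -> float:
    """Fraction of SNPs with differing calls, among SNPs called on both chips."""
    if len(calls_1) != len(calls_2):
        raise ValueError("both chips must report the same SNPs in the same order")
    # Keep only SNPs with a genotype call on both chips (no-calls are None).
    compared = [(a, b) for a, b in zip(calls_1, calls_2)
                if a is not None and b is not None]
    if not compared:
        raise ValueError("no SNP was called on both chips")
    mismatches = sum(1 for a, b in compared if a != b)
    return mismatches / len(compared)


def alleles_in_error(expected: str, observed: str) -> int:
    """Number of alleles that differ between two calls: 0, 1 (single) or 2 (both)."""
    if expected not in VALID_CALLS or observed not in VALID_CALLS:
        raise ValueError("genotypes must be 'AA', 'AB' or 'BB'")
    # Counting B alleles distinguishes AB<->AA/BB (one allele) from AA<->BB (two).
    return abs(expected.count("B") - observed.count("B"))


if __name__ == "__main__":
    chip_1 = ["AA", "AB", "BB", None]
    chip_2 = ["AA", "AA", "BB", "AB"]
    rate = discordance_rate(chip_1, chip_2)
    print(f"discordance: {rate:.2%}, concordance: {1 - rate:.2%}")
    print(alleles_in_error("AB", "AA"), alleles_in_error("AA", "BB"))
```

Concordance is the complement of discordance, which is how the reported discordance of up to 0.37% corresponds to the stated lower bound of 99.63% concordance.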
Affiliation(s)
- X-L Wu
  - Bioinformatics and Biostatistics, Neogen GeneSeek, Lincoln, NE, 68504, USA
  - Department of Animal Sciences, University of Wisconsin, Madison, WI, 53706, USA
- J Xu
  - Bioinformatics and Biostatistics, Neogen GeneSeek, Lincoln, NE, 68504, USA
  - Department of Statistics, University of Nebraska, Lincoln, NE, 68583, USA
- H Li
  - Bioinformatics and Biostatistics, Neogen GeneSeek, Lincoln, NE, 68504, USA
  - Department of Animal Sciences, University of Wisconsin, Madison, WI, 53706, USA
- R Ferretti
  - Bioinformatics and Biostatistics, Neogen GeneSeek, Lincoln, NE, 68504, USA
- J He
  - College of Animal Science and Technology, Hunan Agricultural University, Changsha, Hunan, 410128, China
- J Qiu
  - Quality Assurance, Neogen GeneSeek, Lincoln, NE, 68504, USA
- Q Xiao
  - Quality Assurance, Neogen GeneSeek, Lincoln, NE, 68504, USA
- B Simpson
  - Bioinformatics and Biostatistics, Neogen GeneSeek, Lincoln, NE, 68504, USA
- T Michell
  - Bioinformatics and Biostatistics, Neogen GeneSeek, Lincoln, NE, 68504, USA
- S D Kachman
  - Department of Statistics, University of Nebraska, Lincoln, NE, 68583, USA
- R G Tait
  - Bioinformatics and Biostatistics, Neogen GeneSeek, Lincoln, NE, 68504, USA
- S Bauck
  - Bioinformatics and Biostatistics, Neogen GeneSeek, Lincoln, NE, 68504, USA
4
Comparing genetic variants detected in the 1000 genomes project with SNPs determined by the International HapMap Consortium. J Genet 2016; 94:731-40. [PMID: 26690529 DOI: 10.1007/s12041-015-0588-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 02/06/2023]
Abstract
Single-nucleotide polymorphisms (SNPs) determined based on SNP arrays from the international HapMap consortium (HapMap) and the genetic variants detected in the 1000 genomes project (1KGP) can serve as two references for genomewide association studies (GWAS). We conducted comparative analyses to provide a means for assessing concerns regarding SNP array-based GWAS findings as well as for realistically bounding expectations for next generation sequencing (NGS)-based GWAS. We calculated and compared base composition, transitions to transversions ratio, minor allele frequency and heterozygous rate for SNPs from HapMap and 1KGP for the 622 common individuals. We analysed the genotype discordance between HapMap and 1KGP to assess consistency in the SNPs from the two references. In 1KGP, 90.58% of 36,817,799 SNPs detected were not measured in HapMap. More SNPs with minor allele frequencies less than 0.01 were found in 1KGP than HapMap. The two references have low discordance (generally smaller than 0.02) in genotypes of common SNPs, with most discordance from heterozygous SNPs. Our study demonstrated that SNP array-based GWAS findings were reliable and useful, although only a small portion of genetic variances were explained. NGS can detect not only common but also rare variants, supporting the expectation that NGS-based GWAS will be able to incorporate a much larger portion of genetic variance than SNP arrays-based GWAS.
5
Xu J, Thakkar S, Gong B, Tong W. The FDA's Experience with Emerging Genomics Technologies-Past, Present, and Future. AAPS J 2016; 18:814-8. [PMID: 27116022 PMCID: PMC4973466 DOI: 10.1208/s12248-016-9917-y] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 01/28/2016] [Accepted: 04/12/2016] [Indexed: 01/30/2023]
Abstract
The rapid advancement of emerging genomics technologies and their application for assessing safety and efficacy of FDA-regulated products require a high standard of reliability and robustness supporting regulatory decision-making in the FDA. To facilitate regulatory application and to engage stakeholders, the FDA implemented a novel data submission program, Voluntary Genomics Data Submission (VGDS). As part of the endeavor, for the past 10 years, the FDA has led an international consortium of regulatory agencies, academia, pharmaceutical companies, and genomics platform providers, which was named MicroArray Quality Control Consortium (MAQC), to address issues such as reproducibility, precision, specificity/sensitivity, and data interpretation. Three projects have been completed so far assessing these genomics technologies: gene expression microarrays, whole genome genotyping arrays, and whole transcriptome sequencing (i.e., RNA-seq). The resultant studies provide the basic parameters for fit-for-purpose application of these new data streams in regulatory environments, and the solutions have been made available to the public through peer-reviewed publications. The latest MAQC project, also called the SEquencing Quality Control (SEQC) project, focused on next-generation sequencing. Using reference samples with built-in controls, SEQC studies have demonstrated that relative gene expression can be measured accurately and reliably across laboratories and RNA-seq platforms. Besides prediction performance comparable to microarrays in clinical settings and safety assessments, RNA-seq is shown to have better sensitivity for low expression and to reveal novel transcriptomic features. Future effort of MAQC will be focused on quality control of whole genome sequencing and targeted sequencing.
Affiliation(s)
- Joshua Xu
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
- Shraddha Thakkar
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
- Binsheng Gong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
- Weida Tong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA.
6
Genomic Discoveries and Personalized Medicine in Neurological Diseases. Pharmaceutics 2015; 7:542-53. [PMID: 26690205 PMCID: PMC4695833 DOI: 10.3390/pharmaceutics7040542] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 09/23/2015] [Revised: 11/30/2015] [Accepted: 12/02/2015] [Indexed: 12/22/2022] Open
Abstract
In the past decades, we have witnessed dramatic changes in clinical diagnoses and treatments due to the revolutions of genomics and personalized medicine. Undoubtedly, we have also met many challenges in using those advanced technologies in drug discovery and development. In this review, we describe when genomic information is applied in personal healthcare in general. We illustrate some case examples of genomic discoveries and promising personalized medicine applications in the area of neurological disease in particular. Available data suggest that individual genomics can be applied to better treat patients in the near future.
7
Ye H, Meehan J, Tong W, Hong H. Alignment of Short Reads: A Crucial Step for Application of Next-Generation Sequencing Data in Precision Medicine. Pharmaceutics 2015; 7:523-41. [PMID: 26610555 PMCID: PMC4695832 DOI: 10.3390/pharmaceutics7040523] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 10/21/2015] [Revised: 11/14/2015] [Accepted: 11/17/2015] [Indexed: 02/06/2023] Open
Abstract
Precision medicine or personalized medicine has been proposed as a modernized and promising medical strategy. Genetic variants of patients are the key information for implementation of precision medicine. Next-generation sequencing (NGS) is an emerging technology for deciphering genetic variants. Alignment of raw reads to a reference genome is one of the key steps in NGS data analysis. Many algorithms have been developed for alignment of short read sequences since 2008. Users have to make a decision on which alignment algorithm to use in their studies. Selection of the right alignment algorithm determines not only the alignment algorithm but also the set of suitable parameters to be used by the algorithm. Understanding these algorithms helps in selecting the appropriate alignment algorithm for different applications in precision medicine. Here, we review current available algorithms and their major strategies such as seed-and-extend and q-gram filter. We also discuss the challenges in current alignment algorithms, including alignment in multiple repeated regions, long reads alignment and alignment facilitated with known genetic variants.
Affiliation(s)
- Hao Ye
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, 3900 NCTR Road, Jefferson, AR 72079, USA.
- Joe Meehan
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, 3900 NCTR Road, Jefferson, AR 72079, USA.
- Weida Tong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, 3900 NCTR Road, Jefferson, AR 72079, USA.
- Huixiao Hong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, 3900 NCTR Road, Jefferson, AR 72079, USA.
8
Zheng Y, Qing T, Song Y, Zhu J, Yu Y, Shi W, Pusztai L, Shi L. Standardization efforts enabling next-generation sequencing and microarray based biomarkers for precision medicine. Biomark Med 2015; 9:1265-72. [PMID: 26502353 DOI: 10.2217/bmm.15.99] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 01/11/2023] Open
Abstract
Microarrays and next-generation sequencing technologies have been increasingly employed in biomedical research. However, before they can be reliably used as clinical biomarker tests, standardization and quality control measures need to be developed to ensure their analytical validity. This review summarizes community-wide efforts such as the MicroArray and Sequencing Quality Control (MAQC/SEQC) project which have identified factors influencing the performance of these technologies. Consequently, consensus-based standards and well-documented best practices have been developed to improve the quality of scientific research, and reference materials and reference datasets have been made available for evaluating the technical proficiency in future studies. These efforts have built the foundation on which the translational application of genomics based technologies can help realize precision medicine.
Affiliation(s)
- Yuanting Zheng
  - Center for Pharmacogenomics & Department of Clinical Pharmacy, School of Pharmacy, Fudan University, Shanghai, China
- Tao Qing
  - Center for Pharmacogenomics & Department of Clinical Pharmacy, School of Pharmacy, Fudan University, Shanghai, China
- Yunjie Song
  - Center for Pharmacogenomics & Department of Clinical Pharmacy, School of Pharmacy, Fudan University, Shanghai, China
- Jinhang Zhu
  - Center for Pharmacogenomics & Department of Clinical Pharmacy, School of Pharmacy, Fudan University, Shanghai, China
- Ying Yu
  - Collaborative Innovation Center for Genetics & Development, State Key Laboratory of Genetic Engineering & MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
- Weiwei Shi
  - Breast Medical Oncology, Yale Cancer Center, Yale School of Medicine, New Haven, CT, USA
- Lajos Pusztai
  - Breast Medical Oncology, Yale Cancer Center, Yale School of Medicine, New Haven, CT, USA
- Leming Shi
  - Center for Pharmacogenomics & Department of Clinical Pharmacy, School of Pharmacy, Fudan University, Shanghai, China
  - Collaborative Innovation Center for Genetics & Development, State Key Laboratory of Genetic Engineering & MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
9
Zhang W, Soika V, Meehan J, Su Z, Ge W, Ng HW, Perkins R, Simonyan V, Tong W, Hong H. Quality control metrics improve repeatability and reproducibility of single-nucleotide variants derived from whole-genome sequencing. Pharmacogenomics J 2014; 15:298-309. [PMID: 25384574 DOI: 10.1038/tpj.2014.70] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 03/31/2014] [Revised: 07/16/2014] [Accepted: 09/19/2014] [Indexed: 12/18/2022]
Abstract
Although many quality control (QC) methods have been developed to improve the quality of single-nucleotide variants (SNVs) in SNV-calling, QC methods for use subsequent to single-nucleotide polymorphism-calling have not been reported. We developed five QC metrics to improve the quality of SNVs using the whole-genome-sequencing data of a monozygotic twin pair from the Korean Personal Genome Project. The QC metrics improved both repeatability between the monozygotic twin pair and reproducibility between SNV-calling pipelines. We demonstrated the QC metrics improve reproducibility of SNVs derived from not only whole-genome-sequencing data but also whole-exome-sequencing data. The QC metrics are calculated based on the reference genome used in the alignment without accessing the raw and intermediate data or knowing the SNV-calling details. Therefore, the QC metrics can be easily adopted in downstream association analysis.
Affiliation(s)
- W Zhang
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - V Soika
- Office of The Center Director, Center for Biologics Evaluation and Research, US Food and Drug Administration, Rockville, MD, USA
| | - J Meehan
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Z Su
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - W Ge
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - H W Ng
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - R Perkins
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - V Simonyan
- Office of The Center Director, Center for Biologics Evaluation and Research, US Food and Drug Administration, Rockville, MD, USA
| | - W Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - H Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
10
|
Chen SF, Chao YL, Shen YC, Chen CH, Weng CF. Resequencing and association study of the NFKB activating protein-like gene (NKAPL) in schizophrenia. Schizophr Res 2014; 157:169-74. [PMID: 24972756 DOI: 10.1016/j.schres.2014.05.038] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 01/24/2014] [Revised: 05/28/2014] [Accepted: 05/31/2014] [Indexed: 11/30/2022]
Abstract
OBJECTIVES: Schizophrenia is a highly heritable disorder, but many aspects of its etiology and pathophysiology remain poorly understood. Recently, in the Han Chinese population, a SNP rs1635 located within the exon of the NKAPL gene (encoding NFKB activating protein-like) achieved genome-wide significance in schizophrenia. METHODS: In order to find the causal variants of the NKAPL gene in schizophrenia, we searched for genetic variants in the promoter region and exons (including both UTR ends) using direct sequencing in a sample of patients with schizophrenia (n=515) and non-psychotic controls (n=456), all Han Chinese from Taiwan, and conducted an association and rudimentary functional study. RESULTS: We identified 5 common SNPs (defined as minor allele frequency (MAF)>0.01) in the NKAPL gene. In a case-control association analysis, the minor allele (A) of rs1635 was significantly more common among patients than controls (P=0.0003, OR=1.41, 95% CI=1.17-1.71). A haplotype analysis of the 5 common SNPs indicated a risk haplotype (rs12000C-rs1635A-rs9461446C-rs3734564G-rs1679709G) associated with schizophrenia (P=2.77e-005, OR=1.53, 95% CI=1.25-1.87). In addition, we identified 4 patient-specific rare SNPs (MAF<0.01) (c.137G>A, c.213G>A, c.752C>T (rs370337122), and c.844G>A (rs147161729)) within the NKAPL gene. In silico analysis demonstrated their functional impact on the protein; however, there was also 1 control-specific rare SNP (c.119G>A) detected within the NKAPL gene, indicating that the clinical relevance of these putatively pathological rare SNPs is not straightforward. CONCLUSIONS: This study suggested that rs1635 in the NKAPL gene appeared to play a role in conferring susceptibility to schizophrenia. In addition, some rare SNPs in the NKAPL gene with possibly damaging effects may be important in our patients. Our study provides genetic clues to indicate the involvement of NKAPL in schizophrenia.
Affiliation(s)
- Shih-Fen Chen
- Department of Life Science and Institute of Biotechnology, National Dong-Hwa University, Hualien, Taiwan
| | - Yu-Lin Chao
- Department of Psychiatry, Tzu-Chi General Hospital at Hualien, and School of Medicine, Tzu-Chi University, Hualien, Taiwan
| | - Yu-Chih Shen
- Department of Psychiatry, Tzu-Chi General Hospital at Hualien, and School of Medicine, Tzu-Chi University, Hualien, Taiwan.
| | - Chia-Hsiang Chen
- Department of Psychiatry, Chang Gung Memorial Hospital at Linkou, and Chang Gung University, School of Medicine, Taoyuan, Taiwan
| | - Ching-Feng Weng
- Department of Life Science and Institute of Biotechnology, National Dong-Hwa University, Hualien, Taiwan.
11
|
Crommelin DJA, Sindelar RD, Meibohm B. Genomics, Other “Omic” Technologies, Personalized Medicine, and Additional Biotechnology-Related Techniques. Pharmaceutical Biotechnology 2013. [PMCID: PMC7122419 DOI: 10.1007/978-1-4614-6486-0_8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 05/02/2023]
Abstract
The products resulting from biotechnologies continue to grow at an exponential rate, and the expectations are that an even greater percentage of drug development will be in the area of the biologics. In 2011, worldwide there were over 800 new biotech drugs and treatments in development including 23 antisense, 64 cell therapy, 50 gene therapy, 300 monoclonal antibodies, 78 recombinant proteins, and 298 vaccines (PhRMA 2012). Pharmaceutical biotechnology techniques are at the core of most methodologies used today for drug discovery and development of both biologics and small molecules. While recombinant DNA technology and hybridoma techniques were the major methods utilized in pharmaceutical biotechnology through most of its historical timeline, our ever-widening understanding of human cellular function and disease processes and a wealth of additional and innovative biotechnologies have been, and will continue to be, developed in order to harvest the information found in the human genome. These technological advances will provide a better understanding of the relationship between genetics and biological function, unravel the underlying causes of disease, explore the association of genomic variation and drug response, enhance pharmaceutical research, and fuel the discovery and development of novel biopharmaceuticals. These revolutionary technologies and additional biotechnology-related techniques are improving the very competitive and costly process of drug development of new medicinal agents, diagnostics, and medical devices. Some of the technologies and techniques described in this chapter are both well established and commonly used applications of biotechnology producing potential therapeutic products now in development including clinical trials. New techniques are emerging at a rapid and unprecedented pace and their full impact on the future of molecular medicine has yet to be imagined.
Affiliation(s)
- Daan J. A. Crommelin
- Department of Pharmaceutical Sciences, Utrecht University, Utrecht, The Netherlands
| | - Robert D. Sindelar
- Department of Pharmaceutical Sciences and Department of Medicine, The University of British Columbia, Vancouver, British Columbia, Canada
| | - Bernd Meibohm
- Department of Pharmaceutical Sciences, University of Tennessee Health Science Center, College of Pharmacy, Memphis, Tennessee, USA
12
|
Hong H, Xu L, Liu J, Jones WD, Su Z, Ning B, Perkins R, Ge W, Miclaus K, Zhang L, Park K, Green B, Han T, Fang H, Lambert CG, Vega SC, Lin SM, Jafari N, Czika W, Wolfinger RD, Goodsaid F, Tong W, Shi L. Technical reproducibility of genotyping SNP arrays used in genome-wide association studies. PLoS One 2012; 7:e44483. [PMID: 22970228 PMCID: PMC3436888 DOI: 10.1371/journal.pone.0044483] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Received: 06/08/2012] [Accepted: 08/08/2012] [Indexed: 01/25/2023] Open
Abstract
During the last several years, high-density genotyping SNP arrays have facilitated genome-wide association studies (GWAS) that successfully identified common genetic variants associated with a variety of phenotypes. However, each of the identified genetic variants only explains a very small fraction of the underlying genetic contribution to the studied phenotypic trait. Moreover, discordance observed in results between independent GWAS indicates the potential for Type I and II errors. High reliability of genotyping technology is needed to have confidence in using SNP data and interpreting GWAS results. Therefore, reproducibility of two widely used genotyping technology platforms from Affymetrix and Illumina was assessed by analyzing four technical replicates from each of the six individuals in five laboratories. Genotype concordance of 99.40% to 99.87% within a laboratory for the same platform, 98.59% to 99.86% across laboratories for the same platform, and 98.80% across genotyping platforms was observed. Moreover, arrays with low quality data were detected when comparing genotyping data from technical replicates, but they could not be detected according to vendors' quality control (QC) suggestions. Our results demonstrated the technical reliability of currently available genotyping platforms but also indicated the importance of incorporating some technical replicates for genotyping QC in order to improve the reliability of GWAS results. The impact of discordant genotypes on association analysis results was simulated and could explain, at least in part, the irreproducibility of some GWAS findings when the effect size (i.e. the odds ratio) and the minor allele frequencies are low.
Affiliation(s)
- Huixiao Hong
- Center of Excellence for Bioinformatics, Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, United States of America
| | - Lei Xu
- Center of Excellence for Bioinformatics, Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, United States of America
| | - Jie Liu
- Division of Personalized Nutrition and Medicine, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, United States of America
| | - Wendell D. Jones
- Expression Analysis Inc., Durham, North Carolina, United States of America
| | - Zhenqiang Su
- ICF International Company at National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, United States of America
| | - Baitang Ning
- Division of Personalized Nutrition and Medicine, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, United States of America
| | - Roger Perkins
- Center of Excellence for Bioinformatics, Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, United States of America
| | - Weigong Ge
- Center of Excellence for Bioinformatics, Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, United States of America
| | - Kelci Miclaus
- SAS Institute Inc, Cary, North Carolina, United States of America
| | - Li Zhang
- Office of Clinical Pharmacology, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, United States of America
| | - Kyunghee Park
- Samsung Advanced Institute of Technology, Giheung-gu, Yongin-si Gyeonggi-do, Republic of Korea
| | - Bridgett Green
- Division of Personalized Nutrition and Medicine, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, United States of America
| | - Tao Han
- Center of Excellence for Genomics, Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, United States of America
| | - Hong Fang
- ICF International Company at National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, United States of America
| | | | - Silvia C. Vega
- Rosetta BioSoftware, Health Solutions Group, Microsoft, Seattle, Washington, United States of America
| | - Simon M. Lin
- Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, United States of America
| | - Nadereh Jafari
- Center for Genetic Medicine, Northwestern University, Chicago, Illinois, United States of America
| | - Wendy Czika
- SAS Institute Inc, Cary, North Carolina, United States of America
| | | | - Federico Goodsaid
- Office of Clinical Pharmacology, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, United States of America
| | - Weida Tong
- Center of Excellence for Bioinformatics, Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, United States of America
| | - Leming Shi
- Center of Excellence for Bioinformatics, Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, United States of America
13
|
Mucha D, Laberke S, Meyer S, Hirschberger J. Lack of association between p53 SNP and FISS in a cat population from Germany. Vet Comp Oncol 2012; 12:130-7. [PMID: 22882519 DOI: 10.1111/j.1476-5829.2012.00344.x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 03/06/2012] [Revised: 06/16/2012] [Accepted: 07/03/2012] [Indexed: 01/01/2023]
Abstract
One recent study indicates a significant association between certain single nucleotide polymorphisms (SNPs) in the genomic sequence of feline p53 and feline injection-site sarcoma (FISS). The aim of this study was to investigate the correlation between a specific nucleotide insertion in the p53 gene and FISS in a German cat population. Blood samples from 150 German cats were allocated to a control group consisting of 100 healthy cats and a FISS group consisting of 50 cats with FISS. All blood samples were examined for the presence of the SNP in the p53 gene. The T-insertion at SNP 3 was found in 20.0% of the cats in the FISS group and 19.2% of the cats in the control group. No statistically significant difference was observed in allelic distribution between the two groups. Further investigations are necessary to determine the association of SNPs in the feline p53 gene and the occurrence of FISS.
Affiliation(s)
- D Mucha
- Clinic of Small Animal Medicine, Center for Clinical Veterinary Medicine, Ludwig Maximilian University, Munich, Germany
14
|
Fan YH, Song YQ. IPGWAS: an integrated pipeline for rational quality control and association analysis of genome-wide genetic studies. Biochem Biophys Res Commun 2012; 422:363-8. [PMID: 22564732 DOI: 10.1016/j.bbrc.2012.04.117] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/14/2012] [Accepted: 04/21/2012] [Indexed: 01/03/2023]
Abstract
Large numbers of samples and marker loci are tested for association in genome-wide association studies (GWAS). Hence, quality control (QC) by removing individuals or markers with low genotyping quality is of utmost importance to minimize potential false positive associations. IPGWAS was developed to facilitate the identification of rational thresholds in QC of GWAS datasets, association analysis, Manhattan plots, quantile-quantile (QQ) plots, and format conversion for genetic analyses such as meta-analysis, genotype phasing, and imputation. IPGWAS is a multiplatform application written in Perl with a graphical user interface (GUI) and available for free at http://sourceforge.net/projects/ipgwas/.
Affiliation(s)
- Yan-Hui Fan
- Department of Biochemistry, The University of Hong Kong, Pokfulam, Hong Kong.
15
|
Mendrick DL. Transcriptional profiling to identify biomarkers of disease and drug response. Pharmacogenomics 2011; 12:235-49. [PMID: 21332316 DOI: 10.2217/pgs.10.184] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Indexed: 02/08/2023] Open
Abstract
The discovery, biological qualification and analytical validation of genomic biomarkers require extensive collaborations between individuals with expertise in biology, statistics, bioinformatics, chemistry, clinical medicine, regulatory science and so on. For clinical utility, blood-borne biomarkers (e.g., mRNA and miRNA) of organ damage, drug toxicity and/or response would be preferred to those that are tissue based. Currently used biomarkers such as serum creatinine (indicating renal dysfunction) denote organ damage whether caused by disease, physical injury or drugs. Therefore, it is anticipated that studies of disease will discover biomarkers that can also be used to identify drug-induced injury and vice versa. This article describes transcriptomic blood-borne biomarkers that have been reported to be connected with disease and drug toxicity. Much more qualification and validation needs to be carried out before many of these biomarkers can prove useful. Discussed here are some of the lessons learned and roadblocks to success.
Affiliation(s)
- Donna L Mendrick
- Division of Systems Biology, HFT-230, National Center for Toxicological Research, US FDA, 3900 NCTR Rd, Jefferson, AR 72079-4502, USA.
16
|
Tayo BO, Teil M, Tong L, Qin H, Khitrov G, Zhang W, Song Q, Gottesman O, Zhu X, Pereira AC, Cooper RS, Bottinger EP. Genetic background of patients from a university medical center in Manhattan: implications for personalized medicine. PLoS One 2011; 6:e19166. [PMID: 21573225 PMCID: PMC3087725 DOI: 10.1371/journal.pone.0019166] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.7] [Received: 12/01/2010] [Accepted: 03/28/2011] [Indexed: 11/29/2022] Open
Abstract
Background: The rapid progress currently being made in genomic science has created interest in potential clinical applications; however, formal translational research has been limited thus far. Studies of population genetics have demonstrated substantial variation in allele frequencies and haplotype structure at loci of medical relevance, and the genetic background of patient cohorts may often be complex. Methods and Findings: To describe the heterogeneity in an unselected clinical sample, we used the Affymetrix 6.0 gene array chip to genotype self-identified European Americans (N = 326), African Americans (N = 324) and Hispanics (N = 327) from the medical practice of Mount Sinai Medical Center in Manhattan, NY. Additional data from US minority groups and Brazil were used for external comparison. Substantial variation in ancestral origin was observed for both African Americans and Hispanics; data from the latter group overlapped with both Mexican Americans and Brazilians in the external data sets. A pooled analysis of the African Americans and Hispanics from NY demonstrated a broad continuum of ancestral origin, making classification by race/ethnicity uninformative. Selected loci harboring variants associated with medical traits and drug response confirmed substantial within- and between-group heterogeneity. Conclusion: As a consequence of these complementary levels of heterogeneity, group labels offered no guidance at the individual level. These findings demonstrate the complexity involved in clinical translation of the results from genome-wide association studies and suggest that in the genomic era conventional racial/ethnic labels are of little value.
Affiliation(s)
- Bamidele O. Tayo
- Department of Preventive Medicine and Epidemiology, Loyola University Chicago Stritch School of Medicine, Maywood, Illinois, United States of America
- * E-mail: (BOT); (EPB)
| | - Marie Teil
- Charles R. Bronfman Institute for Personalized Medicine, Mount Sinai School of Medicine, New York, New York, United States of America
| | - Liping Tong
- Department of Preventive Medicine and Epidemiology, Loyola University Chicago Stritch School of Medicine, Maywood, Illinois, United States of America
| | - Huaizhen Qin
- Department of Biostatistics and Epidemiology, Case Western University, Cleveland, Ohio, United States of America
| | - Gregory Khitrov
- Charles R. Bronfman Institute for Personalized Medicine, Mount Sinai School of Medicine, New York, New York, United States of America
| | - Weijia Zhang
- Charles R. Bronfman Institute for Personalized Medicine, Mount Sinai School of Medicine, New York, New York, United States of America
| | - Quinbin Song
- Charles R. Bronfman Institute for Personalized Medicine, Mount Sinai School of Medicine, New York, New York, United States of America
| | - Omri Gottesman
- Charles R. Bronfman Institute for Personalized Medicine, Mount Sinai School of Medicine, New York, New York, United States of America
| | - Xiaofeng Zhu
- Department of Biostatistics and Epidemiology, Case Western University, Cleveland, Ohio, United States of America
| | | | - Richard S. Cooper
- Department of Preventive Medicine and Epidemiology, Loyola University Chicago Stritch School of Medicine, Maywood, Illinois, United States of America
| | - Erwin P. Bottinger
- Charles R. Bronfman Institute for Personalized Medicine, Mount Sinai School of Medicine, New York, New York, United States of America
- * E-mail: (BOT); (EPB)
17
|
Variability in GWAS analysis: the impact of genotype calling algorithm inconsistencies. The Pharmacogenomics Journal 2010; 10:324-35. [PMID: 20676070 DOI: 10.1038/tpj.2010.46] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 11/08/2022]
Abstract
The Genome-Wide Association Working Group (GWAWG) is part of a large-scale effort by the MicroArray Quality Control (MAQC) consortium to assess the quality of genomic experiments, technologies and analyses for genome-wide association studies (GWASs). One of the aims of the working group is to assess the variability of genotype calls within and between different genotype calling algorithms using data for coronary artery disease from the Wellcome Trust Case Control Consortium (WTCCC) and the University of Ottawa Heart Institute. Our results show that the choice of genotyping algorithm (for example, the Bayesian robust linear model with Mahalanobis distance classifier (BRLMM), the corrected robust linear model with maximum-likelihood-based distances (CRLMM) and CHIAMO (developed and implemented by the WTCCC)) can introduce marked variability in the results of downstream case-control association analysis for the Affymetrix 500K array. The amount of discordance between results is influenced by how samples are combined and processed through the respective genotype calling algorithm, indicating that systematic genotype errors due to computational batch effects are propagated to the list of single-nucleotide polymorphisms found to be significantly associated with the trait of interest. Further work using HapMap samples shows that inconsistencies between Affymetrix arrays and calling algorithms can lead to genotyping errors that influence downstream analysis.
18
|
The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models. Nat Biotechnol 2010; 28:827-38. [PMID: 20676074 DOI: 10.1038/nbt.1665] [Citation(s) in RCA: 602] [Impact Index Per Article: 43.0] [Received: 03/02/2010] [Accepted: 06/30/2010] [Indexed: 11/09/2022]
Abstract
Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.