1
Jia H, Tan S, Zhang YE. Chasing Sequencing Perfection: Marching Toward Higher Accuracy and Lower Costs. GENOMICS, PROTEOMICS & BIOINFORMATICS 2024; 22:qzae024. [PMID: 38991976 PMCID: PMC11423848 DOI: 10.1093/gpbjnl/qzae024] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/23/2023] [Revised: 01/25/2024] [Accepted: 01/29/2024] [Indexed: 07/13/2024]
Abstract
Next-generation sequencing (NGS), represented by Illumina platforms, has been an essential cornerstone of basic and applied research. However, the sequencing error rate of 1 per 1000 bp (10⁻³) represents a serious hurdle for research areas focusing on rare mutations, such as somatic mosaicism or microbe heterogeneity. By examining the high-fidelity sequencing methods developed in the past decade, we summarized three major factors underlying errors and the corresponding 12 strategies mitigating these errors. We then proposed a novel framework to classify 11 preexisting representative methods according to the corresponding combinatory strategies and identified three trends that emerged during methodological developments. We further extended this analysis to eight long-read sequencing methods, emphasizing error reduction strategies. Finally, we suggest two promising future directions that could achieve comparable or even higher accuracy with lower costs in both NGS and long-read sequencing.
Affiliation(s)
- Hangxing Jia
- CAS Key Laboratory of Zoological Systematics and Evolution & State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- Shengjun Tan
- CAS Key Laboratory of Zoological Systematics and Evolution & State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- Yong E Zhang
- CAS Key Laboratory of Zoological Systematics and Evolution & State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- CAS Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
2
Chen H, Wang B, Cai L, Zhang Y, Shu Y, Liu W, Leng X, Zhai J, Niu B, Zhou Q, Cao S. The performance of homopolymer detection using dichromatic and tetrachromatic fluorogenic next-generation sequencing platforms. BMC Genomics 2024; 25:542. [PMID: 38822237 PMCID: PMC11140927 DOI: 10.1186/s12864-024-10474-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Accepted: 05/29/2024] [Indexed: 06/02/2024]
Abstract
OBJECTIVES Homopolymer (HP) sequencing is error-prone in next-generation sequencing (NGS) assays, and may induce false insertion/deletions and substitutions. This study aimed to evaluate the performance of dichromatic and tetrachromatic fluorogenic NGS platforms when sequencing homopolymeric regions. RESULTS A HP-containing plasmid was constructed and diluted to serial frequencies (3%, 10%, 30%, 60%) to determine the performance of an MGISEQ-2000, MGISEQ-200, and NextSeq 2000 in HP sequencing. An evident negative correlation was observed between the detected frequencies of four nucleotide HPs and the HP length. Significantly decreased rates (P < 0.01) were found in all 8-mer HPs in all three NGS systems at all four expected frequencies, except in the NextSeq 2000 at 3%. With the application of a unique molecular identifier (UMI) pipeline, there were no differences between the detected frequencies of any HPs and the expected frequencies, except for poly-G 8-mers using the MGI 200 platform. UMIs improved the performance of all three NGS platforms in HP sequencing. CONCLUSIONS We first constructed an HP-containing plasmid based on an EGFR gene backbone to evaluate the performance of NGS platforms when sequencing homopolymeric regions. A highly comparable performance was observed between the MGISEQ-2000 and NextSeq 2000, and introducing UMIs is a promising approach to improve the performance of NGS platforms in sequencing homopolymeric regions.
Affiliation(s)
- HuiJuan Chen
- Beijing ChosenMed Clinical Laboratory Co. Ltd, Beijing, 100176, China
- Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100190, China
- WillingMed Technology Beijing Co., Ltd, Beijing, 100176, China
- Bing Wang
- Beijing ChosenMed Clinical Laboratory Co. Ltd, Beijing, 100176, China
- LiLi Cai
- Beijing ChosenMed Clinical Laboratory Co. Ltd, Beijing, 100176, China
- YiRan Zhang
- Beijing ChosenMed Clinical Laboratory Co. Ltd, Beijing, 100176, China
- YingShuang Shu
- Beijing ChosenMed Clinical Laboratory Co. Ltd, Beijing, 100176, China
- Wen Liu
- Beijing ChosenMed Clinical Laboratory Co. Ltd, Beijing, 100176, China
- Xue Leng
- Beijing ChosenMed Clinical Laboratory Co. Ltd, Beijing, 100176, China
- JinCheng Zhai
- Beijing ChosenMed Clinical Laboratory Co. Ltd, Beijing, 100176, China
- BeiFang Niu
- Beijing ChosenMed Clinical Laboratory Co. Ltd, Beijing, 100176, China.
- Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100190, China.
- ChosenMed Technology (Zhejiang) Co. Ltd, Zhejiang, 311103, China.
- QiMing Zhou
- Beijing ChosenMed Clinical Laboratory Co. Ltd, Beijing, 100176, China.
- ChosenMed Technology (Zhejiang) Co. Ltd, Zhejiang, 311103, China.
- ShuNan Cao
- Polar Research Institute of China, Shanghai, 201209, China.
3
Wang Z, Moffitt AB, Andrews P, Wigler M, Levy D. Accurate measurement of microsatellite length by disrupting its tandem repeat structure. Nucleic Acids Res 2022; 50:e116. [PMID: 36095132 PMCID: PMC9723644 DOI: 10.1093/nar/gkac723] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/26/2022] [Revised: 08/03/2022] [Accepted: 08/15/2022] [Indexed: 12/24/2022]
Abstract
Tandem repeats of simple sequence motifs, also known as microsatellites, are abundant in the genome. Because their repeat structure makes replication error-prone, variant microsatellite lengths are often generated during germline and other somatic expansions. As such, microsatellite length variations can serve as markers for cancer. However, accurate error-free measurement of microsatellite lengths is difficult with current methods precisely because of this high error rate during amplification. We have solved this problem by using partial mutagenesis to disrupt enough of the repeat structure of initial templates so that their sequence lengths replicate faithfully. In this work, we use bisulfite mutagenesis to convert a C to a U, later read as T. Compared to untreated templates, we achieve three orders of magnitude reduction in the error rate per round of replication. By requiring agreement from two independent first copies of an initial template, we reach error rates below one in a million. We apply this method to a thousand microsatellite loci from the human genome, revealing microsatellite length distributions not observable without mutagenesis.
Affiliation(s)
- Peter Andrews
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Dan Levy
- To whom correspondence should be addressed. Tel: +1 516 367 5039; Fax: +1 516 367 8381;
4
Xu LQ, Wang YJ, Shen SL, Wu Y, Duan HZ. Early detection of circulating tumor DNA and successful treatment with osimertinib in thr790met-positive leptomeningeal metastatic lung cancer: A case report. World J Clin Cases 2022; 10:7968-7972. [PMID: 36158482 PMCID: PMC9372865 DOI: 10.12998/wjcc.v10.i22.7968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2022] [Revised: 06/02/2022] [Accepted: 06/26/2022] [Indexed: 02/06/2023]
Abstract
BACKGROUND Patients diagnosed with non-small-cell lung cancer with activated epidermal growth factor receptor mutations are more likely to develop leptomeningeal (LM) metastasis than other types of lung cancers and have a poor prognosis. Early diagnosis and effective treatment of leptomeningeal carcinoma can improve the prognosis.
CASE SUMMARY A 55-year-old female with a progressive headache and vomiting for one month was admitted to Peking University First Hospital. She was diagnosed with lung adenocarcinoma with osseous metastasis 10 months prior to admission. An epidermal growth factor receptor (EGFR) mutation was detected by genomic examination, so she was first treated with gefitinib for 10 months before acquiring resistance. Cell-free cerebrospinal fluid (CSF) circulating tumor DNA detection by next-generation sequencing was conducted and indicated the EGFR-Thr790Met mutation, while biopsy and cytology from the patient’s CSF and the first enhanced cranial magnetic resonance imaging (MRI) showed no positive findings. A month later, the enhanced MRI showed linear leptomeningeal enhancement, and the cytology and biochemical examination in CSF remained negative. Therefore, osimertinib (80 mg/d) was initiated as a second-line treatment, resulting in a good response within a month.
CONCLUSION This report suggests clinical benefit of osimertinib in LM patients with positive detection of the EGFR-Thr790Met mutation in CSF and proposes that the positive findings of CSF circulating tumor DNA as a liquid biopsy technology based on the detection of cancer-associated gene mutations may appear earlier than the imaging and CSF findings and may thus be helpful for therapy. Moreover, the routine screening of chest CT with the novel coronavirus may provide unexpected benefits.
Affiliation(s)
- Li-Qing Xu
- Department of Neurosurgery, Peking University First Hospital, Beijing 100034, China
- Ying-Jin Wang
- Department of Neurosurgery, Peking University First Hospital, Beijing 100034, China
- Sheng-Li Shen
- Department of Neurosurgery, Peking University First Hospital, Beijing 100034, China
- Yao Wu
- Department of Neurosurgery, Peking University First Hospital, Beijing 100034, China
- Hong-Zhou Duan
- Department of Neurosurgery, Peking University First Hospital, Beijing 100034, China
5
Akoniyon OP, Adewumi TS, Maharaj L, Oyegoke OO, Roux A, Adeleke MA, Maharaj R, Okpeku M. Whole Genome Sequencing Contributions and Challenges in Disease Reduction Focused on Malaria. BIOLOGY 2022; 11:587. [PMID: 35453786 PMCID: PMC9027812 DOI: 10.3390/biology11040587] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/13/2022] [Revised: 03/31/2022] [Accepted: 04/01/2022] [Indexed: 12/11/2022]
Abstract
Malaria elimination remains an important goal that requires the adoption of sophisticated science and management strategies in the era of the COVID-19 pandemic. The advent of next generation sequencing (NGS) is making whole genome sequencing (WGS) a standard today in the field of life sciences, as PCR genotyping and targeted sequencing provide insufficient information compared to the whole genome. Thus, adapting WGS approaches to malaria parasites is pertinent to studying the epidemiology of the disease, as different regions are at different phases in their malaria elimination agenda. Therefore, this review highlights the applications of WGS in disease management and the challenges of WGS in controlling malaria parasites, and describes the roles of WGS in the pursuit of malaria reduction and elimination. WGS has had an invaluable impact on malaria research and has helped countries reach the elimination phase rapidly by providing the information needed to thwart transmission, pathology, and drug resistance. However, to eliminate malaria in sub-Saharan Africa (SSA), where malaria transmission is high, we recommend that WGS machines be readily available and affordable in the region.
Affiliation(s)
- Olusegun Philip Akoniyon
- Discipline of Genetics, School of Life Sciences, University of KwaZulu-Natal, Westville Campus, Durban 4041, South Africa
- Taiye Samson Adewumi
- Discipline of Genetics, School of Life Sciences, University of KwaZulu-Natal, Westville Campus, Durban 4041, South Africa
- Leah Maharaj
- Discipline of Genetics, School of Life Sciences, University of KwaZulu-Natal, Westville Campus, Durban 4041, South Africa
- Olukunle Olugbenle Oyegoke
- Discipline of Genetics, School of Life Sciences, University of KwaZulu-Natal, Westville Campus, Durban 4041, South Africa
- Alexandra Roux
- Discipline of Genetics, School of Life Sciences, University of KwaZulu-Natal, Westville Campus, Durban 4041, South Africa
- Matthew A. Adeleke
- Discipline of Genetics, School of Life Sciences, University of KwaZulu-Natal, Westville Campus, Durban 4041, South Africa
- Rajendra Maharaj
- Office of Malaria Research, South African Medical Research Council, Cape Town 7505, South Africa
- Moses Okpeku
- Discipline of Genetics, School of Life Sciences, University of KwaZulu-Natal, Westville Campus, Durban 4041, South Africa
6
Assessment of Microsatellite Instability from Next-Generation Sequencing Data. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2022; 1361:75-100. [DOI: 10.1007/978-3-030-91836-1_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
7
Li D, Huang Q, Huang L, Wen J, Luo J, Li Q, Peng Y, Zhang Y. Baiting out a full length sequence from unmapped RNA-seq data. BMC Genomics 2021; 22:857. [PMID: 34837950 PMCID: PMC8626966 DOI: 10.1186/s12864-021-08146-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/02/2021] [Accepted: 11/03/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND As a powerful tool, RNA-Seq has been widely used in various studies. Usually, unmapped RNA-seq reads have been considered useless and have been discarded or ignored. RESULTS We develop a strategy for mining full-length sequences from unmapped reads by combining specific reverse transcription primer design with high-throughput sequencing. In this study, we salvage 36 unmapped reads from standard RNA-Seq data and randomly select one 149 bp read as a model. Specific reverse transcription primers are designed to amplify both of its ends, followed by next generation sequencing. We then design a statistical model based on a power law distribution to estimate its integrality and significance, and validate the result by Sanger sequencing. The result shows that the full length is 1556 bp, with insertion mutations in a microsatellite structure. CONCLUSION We believe this method would be a useful strategy for extracting sequence information from unmapped RNA-seq data. Further, it is an alternative way to obtain the full-length sequence of unknown cDNA.
Affiliation(s)
- Dongwei Li
- Animal Functional Genomics Group, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120 China
- Guangdong Provincial Key Laboratory of Protein Function and Regulation in Agricultural Organisms, College of Life Sciences, South China Agricultural University, Guangzhou, Guangdong 510642 China
- Qitong Huang
- Animal Functional Genomics Group, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120 China
- Animal Breeding and Genomic, Wageningen University & Research, Wageningen, 6708PB, Netherlands
- Lei Huang
- Animal Functional Genomics Group, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120 China
- Jikai Wen
- Guangdong Provincial Key Laboratory of Protein Function and Regulation in Agricultural Organisms, College of Life Sciences, South China Agricultural University, Guangzhou, Guangdong 510642 China
- Jing Luo
- Animal Functional Genomics Group, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120 China
- Qing Li
- Animal Functional Genomics Group, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120 China
- Yanling Peng
- Animal Functional Genomics Group, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120 China
- Yubo Zhang
- Animal Functional Genomics Group, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120 China
8
New challenges, new opportunities: Next generation sequencing and its place in the advancement of HLA typing. Hum Immunol 2021; 82:478-487. [PMID: 33551127 DOI: 10.1016/j.humimm.2021.01.010] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 04/17/2020] [Revised: 12/29/2020] [Accepted: 01/18/2021] [Indexed: 02/07/2023]
Abstract
The Human Leukocyte Antigen (HLA) system has a critical role in immunorecognition, transplantation, and disease association. Early typing techniques provided the foundation for genotyping methods that revealed HLA as one of the most complex, polymorphic regions of the human genome. Next Generation Sequencing (NGS), the latest molecular technology introduced in clinical tissue typing laboratories, has demonstrated advantages over other established methods. NGS offers high-resolution sequencing of entire genes in time frames and price points considered unthinkable just a few years ago, contributing a wealth of data informing histocompatibility assessment and standards of clinical care. Although the NGS platforms share a high-throughput massively parallel processing model, differing chemistries provide specific strengths and weaknesses. Research-oriented Third Generation Sequencing and related advances in bioengineering continue to broaden the future of NGS in clinical settings. These diverse applications have demanded equally innovative strategies for data management and computational bioinformatics to support and analyze the unprecedented volume and complexity of data generated by NGS. We discuss some of the challenges and opportunities associated with NGS technologies, providing a comprehensive picture of the historical developments that paved the way for the NGS revolution, its current state and future possibilities for HLA typing.
9
Bonneville R, Paruchuri A, Wing MR, Krook MA, Reeser JW, Chen HZ, Dao T, Samorodnitsky E, Smith AM, Yu L, Nowacki N, Chen W, Roychowdhury S. Characterization of Clonal Evolution in Microsatellite Unstable Metastatic Cancers through Multiregional Tumor Sequencing. Mol Cancer Res 2020; 19:465-474. [PMID: 33229401 DOI: 10.1158/1541-7786.mcr-19-0955] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/18/2019] [Revised: 07/10/2020] [Accepted: 11/18/2020] [Indexed: 11/16/2022]
Abstract
Microsatellites are short, repetitive segments of DNA, which are dysregulated in mismatch repair-deficient (MMRd) tumors resulting in microsatellite instability (MSI). MSI has been identified in many human cancer types with varying incidence, and microsatellite instability-high (MSI-H) tumors often exhibit increased sensitivity to immune-enhancing therapies such as PD-1/PD-L1 inhibition. Next-generation sequencing (NGS) has permitted advancements in MSI detection, and recent computational advances have enabled characterization of tumor heterogeneity via NGS. However, the evolution and heterogeneity of microsatellite changes in MSI-positive tumors remain poorly described. We determined MSI status in 6 patients using our previously published algorithm, MANTIS, and inferred subclonal composition and phylogeny with Canopy and SuperFreq. We developed a simulated annealing-based method to characterize microsatellite length distributions in specific subclones and assessed the evolution of MSI in the context of tumor heterogeneity. We identified three to eight tumor subclones per patient, and each subclone exhibited MMRd-associated base substitution signatures. We noted that microsatellites tend to shorten over time, and that MMRd fosters heterogeneity by introducing novel mutations throughout the disease course. Some microsatellites are altered among all subclones in a patient, whereas other loci are only altered in particular subclones corresponding to subclonal phylogenetic relationships. Overall, our results indicate that MMRd is a substantial driver of heterogeneity, leading to both MSI and subclonal divergence. IMPLICATIONS: We leveraged subclonal inference to assess clonal evolution based on somatic mutations and microsatellites, which provides insight into MMRd as a dynamic mutagenic process in MSI-H malignancies.
Affiliation(s)
- Russell Bonneville
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio; Biomedical Sciences Graduate Program, The Ohio State University, Columbus, Ohio
- Anoosha Paruchuri
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio
- Michele R Wing
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio
- Melanie A Krook
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio
- Julie W Reeser
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio
- Hui-Zi Chen
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio; Division of Medical Oncology, Department of Internal Medicine, The Ohio State University, Columbus, Ohio
- Thuy Dao
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio
- Amy M Smith
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio
- Lianbo Yu
- Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio
- Nicholas Nowacki
- Department of Pathology, The Ohio State University, Columbus, Ohio
- Wei Chen
- Department of Pathology, The Ohio State University, Columbus, Ohio
- Sameek Roychowdhury
- Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio; Division of Medical Oncology, Department of Internal Medicine, The Ohio State University, Columbus, Ohio
10
Genetic diversity and population structure of the threatened chocolate mahseer (Neolissochilus hexagonolepis McClelland 1839) based on SSR markers: implications for conservation management in Northeast India. Mol Biol Rep 2019; 46:5237-5249. [DOI: 10.1007/s11033-019-04981-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 05/02/2019] [Accepted: 07/12/2019] [Indexed: 10/26/2022]
11
Barsakis K, Babrzadeh F, Chi A, Mallempati K, Pickle W, Mindrinos M, Fernández-Viña MA. Complete nucleotide sequence characterization of DRB5 alleles reveals a homogeneous allele group that is distinct from other DRB genes. Hum Immunol 2019; 80:437-448. [PMID: 30954494 PMCID: PMC6622178 DOI: 10.1016/j.humimm.2019.04.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 02/04/2019] [Revised: 03/23/2019] [Accepted: 04/01/2019] [Indexed: 01/28/2023]
Abstract
Next Generation Sequencing allows for testing and typing of entire genes of the HLA region. A better and more comprehensive sequence assessment can be achieved by the inclusion of full gene sequences of all the common alleles at a given locus. The common alleles of DRB5 are under-characterized, with the full exon-intron sequence available for only two alleles. In the present study the DRB5 genes from 18 subjects were cloned and sequenced; haplotype analysis showed that 17 of them had a single copy of DRB5 and one consanguineous subject was homozygous at all HLA loci. Methodological approaches including robust and efficient long-range PCR amplification, molecular cloning, nucleotide sequencing and de novo sequence assembly were combined to characterize DRB5 alleles. DRB5 sequences covering from the 5'UTR to the end of intron 5 were obtained for DRB5*01:01, 01:02 and 02:02; partial coverage including a segment spanning exon 2 to exon 6 was obtained for DRB5*01:03, 01:08N and 02:03. Phylogenetic analysis of the generated sequences showed that the DRB5 alleles group together and have distinctive differences with other DRB loci. Novel intron variants of DRB5*01:01:01, 01:02 and 02:02 were identified. The newly characterized DRB5 intron variants of each DRB5 allele were found in subjects harboring distinct associations with alleles of DRB1, B and/or ethnicity. The new information provided by this study provides reference sequences for HLA typing methodologies. Extending sequence coverage may help identify the disease susceptibility factors of DRB5-containing haplotypes, while the unexpected intron variations may shed light on the evolution of the DRB region.
Affiliation(s)
- Konstantinos Barsakis
- Stanford Blood Center, Stanford University School of Medicine, Palo Alto, CA 94304, USA; Department of Biology, University of Crete, Heraklion, Crete 71003, Greece
- Farbod Babrzadeh
- Stanford Genome Technology Center, Stanford University School of Medicine, Palo Alto, CA 94304, USA
- Anjo Chi
- Stanford Genome Technology Center, Stanford University School of Medicine, Palo Alto, CA 94304, USA
- Kalyan Mallempati
- Stanford Blood Center, Stanford University School of Medicine, Palo Alto, CA 94304, USA
- William Pickle
- Stanford Blood Center, Stanford University School of Medicine, Palo Alto, CA 94304, USA
- Michael Mindrinos
- Stanford Genome Technology Center, Stanford University School of Medicine, Palo Alto, CA 94304, USA
12
Abstract
Since the discovery that DNA alterations initiate tumorigenesis, scientists and clinicians have been exploring ways to counter these changes with targeted therapeutics. The sequencing of tumor DNA was initially limited to highly actionable hot spots (areas of the genome that are frequently altered and have an approved matched therapy in a specific tumor type). Large-scale genome sequencing programs quickly developed technological improvements that enabled the deployment of whole-exome and whole-genome sequencing technologies at scale for pristine sample materials in research environments. However, the turning point for precision medicine in oncology was the innovations in clinical laboratories that improved turnaround time, depth of coverage, and the ability to reliably sequence archived, clinically available samples. Today, tumor genome sequencing no longer suffers from significant technical or financial hurdles, and the next opportunity for improvement lies in the optimal utilization of the technologies and data for many different tumor types.
Affiliation(s)
- Kenna R Mills Shaw
- Khalifa Bin Zayed Institute for Personalized Cancer Therapy and Sheikh Ahmed Center for Pancreatic Cancer Research, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
- Anirban Maitra
- Khalifa Bin Zayed Institute for Personalized Cancer Therapy and Sheikh Ahmed Center for Pancreatic Cancer Research, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
13
Lv J, Jiao W, Guo H, Liu P, Wang R, Zhang L, Zeng Q, Hu X, Bao Z, Wang S. HD-Marker: a highly multiplexed and flexible approach for targeted genotyping of more than 10,000 genes in a single-tube assay. Genome Res 2018; 28:1919-1930. [PMID: 30409770 PMCID: PMC6280760 DOI: 10.1101/gr.235820.118] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 03/11/2018] [Accepted: 10/25/2018] [Indexed: 01/03/2023]
Abstract
Targeted genotyping of transcriptome-scale genetic markers is highly attractive for genetic, ecological, and evolutionary studies, but achieving this goal in a cost-effective manner remains a major challenge, especially for laboratories working on nonmodel organisms. Here, we develop a high-throughput, sequencing-based GoldenGate approach (called HD-Marker), which addresses the array-related issues of original GoldenGate methodology and allows for highly multiplexed and flexible targeted genotyping of more than 12,000 loci in a single-tube assay (in contrast to fewer than 3100 in the original GoldenGate assay). We perform extensive analyses to demonstrate the power and performance of HD-Marker on various multiplex levels (296, 795, 1293, and 12,472 genic SNPs) across two sequencing platforms in two nonmodel species (the scallops Chlamys farreri and Patinopecten yessoensis), with extremely high capture rate (98%-99%) and genotyping accuracy (97%-99%). We also demonstrate the potential of HD-Marker for high-throughput targeted genotyping of alternative marker types (e.g., microsatellites and indels). With its remarkable cost-effectiveness (as low as $0.002 per genotype) and high flexibility in choice of multiplex levels and marker types, HD-Marker provides a highly attractive tool over array-based platforms for fulfilling genome/transcriptome-wide targeted genotyping applications, especially in nonmodel organisms.
Affiliation(s)
- Jia Lv
- MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China; Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, China
- Wenqian Jiao
- MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China
| | - Haobing Guo
- MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China
| | - Pingping Liu
- MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China
| | - Ruijia Wang
- MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China
| | - Lingling Zhang
- MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China.,Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, China
| | - Qifan Zeng
- MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China.,Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, China
| | - Xiaoli Hu
- MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China.,Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, China
| | - Zhenmin Bao
- MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China.,Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, China
| | - Shi Wang
- MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China.,Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, China
|
14
|
Zavodna M, Bagshaw A, Brauning R, Gemmell NJ. The effects of transcription and recombination on mutational dynamics of short tandem repeats. Nucleic Acids Res 2018; 46:1321-1330. [PMID: 29300948 PMCID: PMC5814968 DOI: 10.1093/nar/gkx1253] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 03/05/2017] [Revised: 11/27/2017] [Accepted: 12/27/2017] [Indexed: 01/07/2023]
Abstract
Short tandem repeats (STR) are ubiquitous components of the genomic architecture of most living organisms. Recent work has highlighted the widespread functional significance of such repeats, particularly around gene regulation, but the mutational processes underlying the evolution of these highly abundant and highly variable sequences are not fully understood. Traditional models assume that strand misalignment during replication is the predominant mechanism, but empirical data suggest the involvement of other processes including recombination and transcription. Despite this evidence, the relative influences of these processes have not previously been tested experimentally on a genome-wide scale. Using deep sequencing, we identify mutations at >200 microsatellites, across 700 generations in replicated populations of two otherwise identical sexual and asexual Saccharomyces cerevisiae strains. Using generalized linear models, we investigate correlates of STR mutability including the nature of the mutation, STR composition and contextual factors including recombination, transcription and replication origins. Sexual capability was not a significant predictor of microsatellite mutability, but, intriguingly, we identify transcription as a significant positive predictor. We also find that STR density is substantially increased in regions neighboring, but not within, recombination hotspots.
Affiliation(s)
- Monika Zavodna
- Department of Anatomy, University of Otago, Dunedin 9054, New Zealand
| | - Andrew Bagshaw
- Department of Pathology, University of Otago, Christchurch 8140, New Zealand
| | - Rudiger Brauning
- AgResearch Limited, Invermay Agricultural Centre, Mosgiel, New Zealand
| | - Neil J Gemmell
- Department of Anatomy, University of Otago, Dunedin 9054, New Zealand
- Allan Wilson Centre for Molecular Ecology and Evolution, University of Otago, Dunedin 9054, New Zealand
|
15
|
Abstract
Accumulating evidence suggests that many classes of DNA repeats exhibit attributes that distinguish them from other genetic variants, including the fact that they are more liable to mutation; this enables them to mediate genetic plasticity. The expansion of tandem repeats, particularly of short tandem repeats, can cause a range of disorders (including Huntington disease, various ataxias, motor neuron disease, frontotemporal dementia, fragile X syndrome and other neurological disorders), and emerging data suggest that tandem repeat polymorphisms (TRPs) can also regulate gene expression in healthy individuals. TRPs in human genomes may also contribute to the missing heritability of polygenic disorders. A better understanding of tandem repeats and their associated repeatome, as well as their capacity for genetic plasticity via both germline and somatic mutations, is needed to transform our understanding of the role of TRPs in health and disease.
Affiliation(s)
- Anthony J Hannan
- Florey Institute of Neuroscience and Mental Health, University of Melbourne.,Department of Anatomy and Neuroscience, University of Melbourne, Parkville, Victoria, Australia
|
16
|
Abstract
The availability of complete fungal genomes is expanding rapidly and is offering an extensive and accurate view of this "kingdom." The scientific milestone of free access to more than 1000 fungal genomes of different species has been reached, and new and stimulating projects have meanwhile been released. The "1000 Fungal Genomes Project" represents one of the largest sequencing initiatives for fungal organisms, aiming to fill some gaps in fungal genomics. Presently, there are 329 fungal families with at least one representative genome sequenced, but there is still a large number of fungal families without a single sequenced genome. Additional sequencing projects have also helped to understand the genetic diversity within some fungal species. The availability of multiple genomes per species supports taxonomic organization, brings new insights into fungal evolution over short time scales, clarifies geographical and dispersion patterns, and elucidates outbreaks and transmission routes, among other objectives. Genotyping methodologies analyze only a small fraction of an individual's genome but facilitate the comparison of hundreds or thousands of isolates in a small fraction of the time and at low cost. The integration of whole genome strategies and improved genotyping panels targeting specific and relevant SNPs and/or repeated regions can represent fast and practical strategies for studying local, regional, and global epidemiology of fungi.
Affiliation(s)
- Ricardo Araujo
- University of Porto, Porto, Portugal; School of Medicine and Health Sciences, Flinders University, Adelaide, SA, Australia.
|
17
|
Improved Diagnosis of Inherited Retinal Dystrophies by High-Fidelity PCR of ORF15 followed by Next-Generation Sequencing. J Mol Diagn 2016; 18:817-824. [PMID: 27620828 DOI: 10.1016/j.jmoldx.2016.06.007] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 01/06/2016] [Revised: 06/10/2016] [Accepted: 06/16/2016] [Indexed: 11/21/2022]
Abstract
Retinitis pigmentosa (RP) is the most common form of retinal dystrophy. The disease is characterized by the progressive degeneration of photoreceptors, ultimately leading to blindness. The exon ORF15 of RP GTPase regulator (RPGR) is a mutation hot spot for X-linked RP and one form of cone dystrophy. However, accurate molecular testing of ORF15 is challenging because of a large segment of highly repetitive purine-rich sequence in this exon. ORF15 performs poorly in next-generation sequencing-based panels or whole exome sequencing analysis, whereas Sanger sequencing of ORF15 requires special reagents and PCR conditions with multiple pairs of overlapping primers that often do not provide a clean sequence. Because of these technical difficulties, molecular analysis of ORF15 is performed mostly in research laboratories without validation for clinical application. Herein, we report the development of a single step of high-fidelity PCR followed by next-generation sequencing for accurate mutation detection, which is easily integrated into routine clinical practice. Our approach has improved coverage depth of ORF15 with the ability to detect single-nucleotide variants and deletions/duplications. Using this method, we were able to identify ORF15 pathogenic variants in approximately 31% of undiagnosed RP patients. Our results underline the clinical importance of complete and accurate sequence analysis of ORF15 for patients with retinal dystrophies.
|
18
|
Du C, Pusey BN, Adams CJ, Lau CC, Bone WP, Gahl WA, Markello TC, Adams DR. Explorations to improve the completeness of exome sequencing. BMC Med Genomics 2016; 9:56. [PMID: 27568008 PMCID: PMC5002202 DOI: 10.1186/s12920-016-0216-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 11/30/2015] [Accepted: 08/05/2016] [Indexed: 12/30/2022]
Abstract
BACKGROUND Exome sequencing has advanced to clinical practice and proven useful for obtaining molecular diagnoses in rare diseases. In approximately 75% of cases, however, a clinical exome study does not produce a definitive molecular diagnosis. These residual cases comprise a new diagnostic challenge for the genetics community. The Undiagnosed Diseases Program of the National Institutes of Health routinely utilizes exome sequencing for refractory clinical cases. Our preliminary data suggest that disease-causing variants may be missed by current standard-of-care clinical exome analysis. Such false negatives reflect limitations in experimental design, technical performance, and data analysis. RESULTS We present examples from our datasets to quantify the analytical performance associated with current practices, and explore strategies to improve the completeness of data analysis. In particular, we focus on patient ascertainment, exome capture, inclusion of intronic variants, and evaluation of medium-sized structural variants. CONCLUSIONS The strategies we present may recover previously missed, disease-causing variants in second-pass exome analysis. Understanding the limitations of the current clinical exome search space provides a rational basis to improve methods for disease variant detection using genome-scale sequencing techniques.
Affiliation(s)
- Chen Du
- NIH Undiagnosed Diseases Program, Common Fund, National Institutes of Health, National Human Genome Research Institute, Bethesda, MD, USA
| | - Barbara N Pusey
- NIH Undiagnosed Diseases Program, Common Fund, National Institutes of Health, National Human Genome Research Institute, Bethesda, MD, USA
| | - Christopher J Adams
- NIH Undiagnosed Diseases Program, Common Fund, National Institutes of Health, National Human Genome Research Institute, Bethesda, MD, USA
| | - C Christopher Lau
- NIH Undiagnosed Diseases Program, Common Fund, National Institutes of Health, National Human Genome Research Institute, Bethesda, MD, USA
| | - William P Bone
- NIH Undiagnosed Diseases Program, Common Fund, National Institutes of Health, National Human Genome Research Institute, Bethesda, MD, USA
| | - William A Gahl
- NIH Undiagnosed Diseases Program, Common Fund, National Institutes of Health, National Human Genome Research Institute, Bethesda, MD, USA
| | - Thomas C Markello
- NIH Undiagnosed Diseases Program, Common Fund, National Institutes of Health, National Human Genome Research Institute, Bethesda, MD, USA
| | - David R Adams
- NIH Undiagnosed Diseases Program, Common Fund, National Institutes of Health, National Human Genome Research Institute, Bethesda, MD, USA.
|
19
|
Pérez-Cordón G, Robinson G, Nader J, Chalmers RM. Discovery of new variable number tandem repeat loci in multiple Cryptosporidium parvum genomes for the surveillance and investigation of outbreaks of cryptosporidiosis. Exp Parasitol 2016; 169:119-28. [PMID: 27523797 DOI: 10.1016/j.exppara.2016.08.003] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 05/26/2016] [Revised: 08/09/2016] [Accepted: 08/10/2016] [Indexed: 01/28/2023]
Abstract
Cryptosporidium parvum is a protozoan parasite causing gastro-intestinal disease (cryptosporidiosis) in humans and animals. The ability to investigate sources of contamination and routes of transmission by characterization and comparison of isolates in a cost- and time-efficient manner will help surveillance and epidemiological investigations, but as yet there is no standardised multi-locus typing scheme. To systematically identify variable number tandem repeat (VNTR) loci, which have been shown to provide differentiation in moderately conserved species, we interrogated the reference C. parvum Iowa II genome and seven other C. parvum genomes using tandem repeat finder software. We identified 28 loci that met criteria defined previously for robust typing schemes for inter-laboratory surveillance and that had potential for generating PCR amplicons analysable on most fragment sizing platforms: repeats ≥6 bp, occurring in tandem in a single repeat region, and providing a total amplicon size of <300 bp including 50 bp for the location of the forward and reverse primers. The qualifying loci will be further investigated in vitro for consideration as preferred loci in the development of a robust VNTR scheme.
Affiliation(s)
- Gregorio Pérez-Cordón
- Cryptosporidium Reference Unit, Public Health Wales Microbiology, Singleton Hospital, Swansea, SA2 8QA, UK
| | - Guy Robinson
- Cryptosporidium Reference Unit, Public Health Wales Microbiology, Singleton Hospital, Swansea, SA2 8QA, UK; Swansea University Medical School, Grove Building, Swansea University, Singleton Park, Swansea, SA2 8PP, UK
| | - Johanna Nader
- Norwich Medical School, University of East Anglia, Norwich, UK
| | - Rachel M Chalmers
- Cryptosporidium Reference Unit, Public Health Wales Microbiology, Singleton Hospital, Swansea, SA2 8QA, UK; Swansea University Medical School, Grove Building, Swansea University, Singleton Park, Swansea, SA2 8PP, UK.
|
20
|
Bolton KA, Avery-Kiejda KA, Holliday EG, Attia J, Bowden NA, Scott RJ. A polymorphic repeat in the IGF1 promoter influences the risk of endometrial cancer. Endocr Connect 2016; 5:115-22. [PMID: 27090263 PMCID: PMC5002956 DOI: 10.1530/ec-16-0003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/13/2016] [Accepted: 04/18/2016] [Indexed: 01/22/2023]
Abstract
Due to the lack of high-throughput genetic assays for tandem repeats, there is a paucity of knowledge about the role they may play in disease. A polymorphic CA repeat in the promoter region of the insulin-like growth factor 1 gene (IGF1) has been studied extensively over the past 10 years for association with the risk of developing breast cancer, among other cancers, with variable results. The aim of this study was to determine if this CA repeat is associated with the risk of developing breast cancer and endometrial cancer. Using a case-control design, we analysed the length of this CA repeat in a series of breast cancer and endometrial cancer cases and compared this with a control population. Our results showed an association when both alleles were considered in breast and endometrial cancers (P=0.029 and 0.011, respectively), but this did not pass our corrected threshold for significance due to multiple testing. When the allele lengths were analysed categorically against the most common allele length of 19 CA repeats, an association was observed with the risk of endometrial cancer due to a reduction in the number of long alleles (P=0.013). This was confirmed in an analysis of the long alleles separately for endometrial cancer risk (P=0.0012). Our study found no association between the length of this polymorphic CA repeat and breast cancer risk. The significant association observed between the CA repeat length and the risk of developing endometrial cancer has not been previously reported.
Affiliation(s)
- Katherine A Bolton
- Centre for Bioinformatics, Biomarker Discovery and Information-Based Medicine, Hunter Medical Research Institute, Newcastle, New South Wales, Australia; Priority Research Centre for Cancer, School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, University of Newcastle, Newcastle, New South Wales, Australia
| | - Kelly A Avery-Kiejda
- Centre for Bioinformatics, Biomarker Discovery and Information-Based Medicine, Hunter Medical Research Institute, Newcastle, New South Wales, Australia; Priority Research Centre for Cancer, School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, University of Newcastle, Newcastle, New South Wales, Australia
| | - Elizabeth G Holliday
- Centre for Clinical Epidemiology and Biostatistics, School of Medicine and Public Health, Faculty of Health and Medicine, University of Newcastle, Newcastle, New South Wales, Australia; Clinical Research Design, IT and Statistical Support Unit, Hunter Medical Research Institute, Newcastle, New South Wales, Australia
| | - John Attia
- Centre for Clinical Epidemiology and Biostatistics, School of Medicine and Public Health, Faculty of Health and Medicine, University of Newcastle, Newcastle, New South Wales, Australia; Clinical Research Design, IT and Statistical Support Unit, Hunter Medical Research Institute, Newcastle, New South Wales, Australia
| | - Nikola A Bowden
- Centre for Bioinformatics, Biomarker Discovery and Information-Based Medicine, Hunter Medical Research Institute, Newcastle, New South Wales, Australia; Priority Research Centre for Cancer, School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, University of Newcastle, Newcastle, New South Wales, Australia
| | - Rodney J Scott
- Centre for Bioinformatics, Biomarker Discovery and Information-Based Medicine, Hunter Medical Research Institute, Newcastle, New South Wales, Australia; Priority Research Centre for Cancer, School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, University of Newcastle, Newcastle, New South Wales, Australia; Molecular Medicine, Pathology North, John Hunter Hospital, Newcastle, New South Wales, Australia; Discipline of Medical Genetics, School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, University of Newcastle, University Drive, Newcastle, New South Wales, Australia
|
21
|
Bushehri A, Barez MRM, Mansouri SK, Biglarian A, Ohadi M. Genome-wide identification of human- and primate-specific core promoter short tandem repeats. Gene 2016; 587:83-90. [PMID: 27108803 DOI: 10.1016/j.gene.2016.04.041] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 01/30/2016] [Revised: 03/23/2016] [Accepted: 04/19/2016] [Indexed: 12/12/2022]
Abstract
Recent reports of a link between human- and primate-specific genetic factors and human/primate-specific characteristics and diseases necessitate genome-wide identification of those factors. We have previously reported core promoter short tandem repeats (STRs) of extreme length (≥6-repeats) that have expanded exceptionally in primates vs. non-primates, and may have a function in adaptive evolution. In the study reported here, we extended our study to the human STRs of ≥3-repeats in the category of penta- and hexanucleotide STRs, across the entire human protein coding gene core promoters, and analyzed their status in several superorders and orders of vertebrates, using the Ensembl database. The ConSite software was used to identify the transcription factor (TF) sets binding to those STRs. STR specificity was observed at different levels of human and non-human primate (NHP) evolution. 73% of the pentanucleotide STRs and 68% of the hexanucleotide STRs were found to be specific to human and NHPs. AP-2alpha, Sp1, and MZF were the predominantly selected TFs (90%) binding to the human-specific STRs. Furthermore, the number of TF sets binding to a given STR was found to be a selection factor for that STR. Our findings indicate that selected STRs, the cognate binding TFs, and the number of TF sets binding to those STRs function as switch codes at different levels of human and NHP evolution and speciation.
Affiliation(s)
- A Bushehri
- Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
| | - M R Mashhoudi Barez
- Cell and Molecular Biology Research Center, Department of Anatomy and Biology, Faculty of Medicine, Shahid Beheshti University, Velenjak, Tehran, Iran
| | - S K Mansouri
- Clinical Psychology Department, Faculty of Science and Research, Qazvin Azad University, Qazvin, Iran
| | - A Biglarian
- Department of Biostatistics, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
| | - M Ohadi
- Iranian Research Center on Aging, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
|
22
|
Mapping and differential expression analysis from short-read RNA-Seq data in model organisms. QUANTITATIVE BIOLOGY 2016. [DOI: 10.1007/s40484-016-0060-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022]
|
23
|
Thai BT, Tan MH, Lee YP, Gan HM, Tran TT, Austin CM. Characterisation of 12 microsatellite loci in the Vietnamese commercial clam Lutraria rhynchaena Jonas 1844 (Heterodonta: Bivalvia: Mactridae) through next-generation sequencing. Mol Biol Rep 2016; 43:391-6. [PMID: 26922181 DOI: 10.1007/s11033-016-3966-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 09/25/2015] [Accepted: 02/23/2016] [Indexed: 10/22/2022]
Abstract
The marine clam Lutraria rhynchaena is gaining popularity as an aquaculture species in Asia. Lutraria populations are present in the wild throughout Vietnam and several stocks have been established and translocated for breeding and aquaculture grow-out purposes. In this study, we demonstrate the feasibility of utilising Illumina next-generation sequencing technology to streamline the identification and genotyping of microsatellite loci from this clam species. Based on an initial partial genome scan, 48 microsatellite markers with similar melting temperatures were identified and characterised. The 12 most suitable polymorphic loci were then genotyped using 51 individuals from a population in Quang Ninh Province, North Vietnam. Genetic variation was low (mean number of alleles per locus = 2.6; mean expected heterozygosity = 0.41). Two loci showed significant deviation from Hardy-Weinberg equilibrium (HWE) and the presence of null alleles, but there was no evidence of linkage disequilibrium among loci. Three additional populations were screened (n = 7-36) to test the geographic utility of the 12 loci, which revealed 100% successful genotyping in two populations from central Vietnam (Nha Trang). However, a second population from north Vietnam (Co To) could not be successfully genotyped, and morphological evidence and mitochondrial variation suggest that this population represents a cryptic species of Lutraria. Comparisons of the Quang Ninh and Nha Trang populations, excluding the 2 loci out of HWE, revealed statistically significant allelic variation at 4 loci. We reported the first microsatellite loci set for the marine clam Lutraria rhynchaena and demonstrated its potential in differentiating clam populations.
Additionally, a cryptic species population of Lutraria rhynchaena was identified during initial loci development, underscoring the overlooked diversity of marine clam species in Vietnam and the need to genetically characterise population representatives prior to microsatellite development. The rapid identification and validation of microsatellite loci using next-generation sequencing technology warrant its integration into future microsatellite loci development for key aquaculture species in Vietnam and more generally, aquaculture countries in the South East Asia region.
Affiliation(s)
| | - Mun Hua Tan
- Genomics Facility, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway, 47500, Petaling Jaya, Selangor, Malaysia.,School of Science, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway, 47500, Petaling Jaya, Selangor, Malaysia
| | - Yin Peng Lee
- Genomics Facility, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway, 47500, Petaling Jaya, Selangor, Malaysia.,School of Science, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway, 47500, Petaling Jaya, Selangor, Malaysia
| | - Han Ming Gan
- Genomics Facility, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway, 47500, Petaling Jaya, Selangor, Malaysia.,School of Science, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway, 47500, Petaling Jaya, Selangor, Malaysia
| | | | - Christopher M Austin
- Genomics Facility, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway, 47500, Petaling Jaya, Selangor, Malaysia.,School of Science, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway, 47500, Petaling Jaya, Selangor, Malaysia
|
24
|
Nikkhah M, Rezazadeh M, Khorram Khorshid HR, Biglarian A, Ohadi M. An exceptionally long CA-repeat in the core promoter of SCGB2B2 links with the evolution of apes and Old World monkeys. Gene 2015; 576:109-14. [PMID: 26437309 DOI: 10.1016/j.gene.2015.09.070] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 07/21/2015] [Revised: 09/25/2015] [Accepted: 09/28/2015] [Indexed: 12/31/2022]
Abstract
We have recently reported a genome-scale catalog of human protein-coding genes that contain "exceptionally long" STRs (≥6-repeats) in their core promoter, which may be of selective advantage in this species. At the top of that list, SCGB2B2 (also known as SCGBL), contains one of the longest CA-repeat STRs identified in a human gene core promoter, at 25-repeats. In the study reported here, we analyzed the conservation status of this CA-STR across evolution. The functional implication of this STR to alter gene expression activity was also analyzed in the HEK-293 cell line. We report that the SCGB2B2 core promoter CA-repeat reaches exceptional lengths, ranging from 9- to 25-repeats, across Apes (Hominoids) and the Old World monkeys (CA>2-repeats were not detected in any other species). The longest CA-repeats and highest identity in the SCGB2B2 protein sequence were observed between human and bonobo. A trend for increased gene expression activity was observed from the shorter to the longer CA-repeats (p<0.009), and the CA-repeat increased gene expression activity, per se (p<0.02). We propose that the SCGB2B2 gene core promoter CA-repeat functions as an expression code for the evolution of Apes and the Old World monkeys.
Affiliation(s)
- M Nikkhah
- Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
| | - M Rezazadeh
- Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
| | - H R Khorram Khorshid
- Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
| | - A Biglarian
- Department of Biostatistics, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
| | - M Ohadi
- Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran.
|