1
Liu X, Zhang H, Zeng Y, Zhu X, Zhu L, Fu J. DRANetSplicer: A Splice Site Prediction Model Based on Deep Residual Attention Networks. Genes (Basel) 2024; 15:404. [PMID: 38674339 PMCID: PMC11048956 DOI: 10.3390/genes15040404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 03/20/2024] [Accepted: 03/23/2024] [Indexed: 04/28/2024] Open
Abstract
The precise identification of splice sites is essential for unraveling the structure and function of genes, constituting a pivotal step in the gene annotation process. In this study, we developed a novel deep learning model, DRANetSplicer, that integrates residual learning and attention mechanisms for enhanced accuracy in capturing the intricate features of splice sites. We constructed multiple datasets using the most recent versions of genomic data from three different organisms, Oryza sativa japonica, Arabidopsis thaliana and Homo sapiens. This approach allows us to train models with a richer set of high-quality data. DRANetSplicer outperformed benchmark methods on donor and acceptor splice site datasets, achieving an average accuracy of (96.57%, 95.82%) across the three organisms. Comparative analyses with benchmark methods, including SpliceFinder, Splice2Deep, Deep Splicer, EnsembleSplice, and DNABERT, revealed DRANetSplicer's superior predictive performance, resulting in at least a (4.2%, 11.6%) relative reduction in average error rate. We utilized the DRANetSplicer model trained on O. sativa japonica data to predict splice sites in A. thaliana, achieving accuracies for donor and acceptor sites of (94.89%, 94.25%). These results indicate that DRANetSplicer possesses excellent cross-organism predictive capabilities, with its performance in cross-organism predictions even surpassing that of benchmark methods in non-cross-organism predictions. Cross-organism validation showcased DRANetSplicer's excellence in predicting splice sites across similar organisms, supporting its applicability in gene annotation for understudied organisms. We employed multiple methods to visualize the decision-making process of the model. The visualization results indicate that DRANetSplicer can learn and interpret well-known biological features, further validating its overall performance. 
Our study systematically examined and confirmed the predictive ability of DRANetSplicer from various levels and perspectives, indicating that its practical application in gene annotation is justified.
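The "relative reduction in average error rate" quoted in the abstract above can be read as the fraction of a baseline's error that the new model removes. The following is a minimal sketch of that arithmetic under that reading; the function name `relative_error_reduction` and the baseline accuracy of 0.9600 are hypothetical and chosen only for illustration, while 0.9657 is the donor-site average accuracy reported in the abstract.

```python
def relative_error_reduction(acc_model: float, acc_baseline: float) -> float:
    """Fraction of the baseline's error rate removed by the model.

    Accuracies are fractions in [0, 1]; the error rate is 1 - accuracy.
    """
    err_model = 1.0 - acc_model
    err_baseline = 1.0 - acc_baseline
    if err_baseline <= 0.0:
        raise ValueError("baseline error rate must be positive")
    return (err_baseline - err_model) / err_baseline


# 0.9657: donor-site average accuracy reported in the abstract.
# 0.9600: hypothetical baseline accuracy, NOT a figure from the paper.
reduction = relative_error_reduction(0.9657, 0.9600)
print(f"relative error reduction: {reduction:.4f}")
```

Because error rates near the top of the accuracy range are small, a modest gain in accuracy can correspond to a comparatively large relative reduction in error, which is why the two figures in the abstract are not directly comparable.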
Affiliation(s)
- Xueyan Liu
  - College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
- Hongyan Zhang
  - College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
- Ying Zeng
  - School of Computer and Communication, Hunan Institute of Engineering, Xiangtan 411104, China
- Xinghui Zhu
  - College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
- Lei Zhu
  - College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
- Jiahui Fu
  - College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
2
Safar HA, Alatar F, Mustafa AS. Three Rounds of Read Correction Significantly Improve Eukaryotic Protein Detection in ONT Reads. Microorganisms 2024; 12:247. [PMID: 38399651 PMCID: PMC10893331 DOI: 10.3390/microorganisms12020247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2023] [Revised: 01/19/2024] [Accepted: 01/23/2024] [Indexed: 02/25/2024] Open
Abstract
BACKGROUND Eukaryotes' whole-genome sequencing is crucial for species identification, gene detection, and protein annotation. Oxford Nanopore Technology (ONT) is an affordable and rapid platform for sequencing eukaryotes; however, the relatively higher error rates require computational and bioinformatic efforts to produce more accurate genome assemblies. Here, we evaluated the effect of read correction tools on eukaryote genome completeness, gene detection and protein annotation. METHODS Reads generated by ONT of four eukaryotes, C. albicans, C. gattii, S. cerevisiae, and P. falciparum, were assembled using minimap2 and underwent three rounds of read correction using flye, medaka and racon. The generated consensus FASTA files were compared for total length (bp), genome completeness, gene detection, and protein annotation by QUAST, BUSCO, BRAKER1 and InterProScan, respectively. RESULTS Genome completeness was dependent on the assembly method rather than on the read correction tool; however, medaka performed better than flye and racon. Racon performed significantly better than flye and medaka in gene detection, while both racon and medaka performed significantly better than flye in protein annotation. CONCLUSION We show that three rounds of read correction significantly affect gene detection and protein annotation, which are dependent on assembly quality in preference to assembly completeness.
Affiliation(s)
- Hussain A. Safar
  - OMICS Research Unit, Health Science Centre, Kuwait University, Kuwait City 13110, Kuwait
- Fatemah Alatar
  - Serology and Molecular Microbiology Reference Laboratory, Mubarak Al-Kabeer Hospital, Ministry of Health, Kuwait City 13110, Kuwait
- Abu Salim Mustafa
  - Department of Microbiology, Faculty of Medicine, Kuwait University, Kuwait City 13110, Kuwait
3
Liu Z, Du Y, Sun Z, Cheng B, Bi Z, Yao Z, Liang Y, Zhang H, Yao R, Kang S, Shi Y, Wan H, Qin D, Xiang L, Leng L, Chen S. Manual correction of genome annotation improved alternative splicing identification of Artemisia annua. PLANTA 2023; 258:83. [PMID: 37721598 DOI: 10.1007/s00425-023-04237-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/24/2023] [Accepted: 09/04/2023] [Indexed: 09/19/2023]
Abstract
Gene annotation is essential for genome-based studies. However, algorithm-based genome annotation often fails to reveal genomic information fully and correctly, especially for species with complex genomes. Artemisia annua L. is the only commercial source of artemisinin, although its artemisinin content still needs to be improved. Genome-based genetic modification and breeding are useful strategies to boost artemisinin content and thereby ensure the supply of artemisinin and reduce costs, but better gene annotation is urgently needed. In this study, we manually corrected the newly released genome annotation of A. annua using second- and third-generation transcriptome data. We found that incorrect gene information may lead to differences at the structural, functional, and expression levels compared to the original expectations. We also identified alternative splicing events and found that genome annotation information affected the identification of alternatively spliced genes. We further demonstrated that genome annotation information and alternative splicing could affect gene expression estimation and gene function prediction. Finally, we provided a valuable version of the A. annua genome annotation and demonstrated the importance of gene annotation in future research.
Affiliation(s)
- Zhaoyu Liu
  - School of Chinese Materia Medica, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China
  - Institute of Herbgenomics, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, China
- Yupeng Du
  - College of Life Science, Northeast Forestry University, Harbin, 150040, China
- Zhihao Sun
  - Institute of Herbgenomics, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, China
  - School of Basic Medical Sciences, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, China
- Bohan Cheng
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China
- Zenghao Bi
  - Institute of Herbgenomics, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, China
- Zhicheng Yao
  - School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, 333403, China
- Yuting Liang
  - Institute of Herbgenomics, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, China
- Huiling Zhang
  - College of Horticulture, Sichuan Agricultural University, Chengdu, 611130, China
- Run Yao
  - Institute of Herbgenomics, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, China
- Shen Kang
  - Institute of Herbgenomics, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, China
- Yuhua Shi
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China
- Huihua Wan
  - Key Laboratory of Beijing for Identification and Safety Evaluation of Chinese Medicine, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China
- Dou Qin
  - Prescription Laboratory of Xinjiang Traditional Uyghur Medicine, Xinjiang Institute of Traditional Uyghur Medicine, Urmuqi, 830000, China
- Li Xiang
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China
  - Prescription Laboratory of Xinjiang Traditional Uyghur Medicine, Xinjiang Institute of Traditional Uyghur Medicine, Urmuqi, 830000, China
- Liang Leng
  - Institute of Herbgenomics, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, China
- Shilin Chen
  - School of Chinese Materia Medica, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China
  - Institute of Herbgenomics, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, China
4
Sinha A, Sangeet S, Roy S. Evolution of Sequence and Structure of SARS-CoV-2 Spike Protein: A Dynamic Perspective. ACS OMEGA 2023; 8:23283-23304. [PMID: 37426203 PMCID: PMC10324094 DOI: 10.1021/acsomega.3c00944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2023] [Accepted: 06/01/2023] [Indexed: 07/11/2023]
Abstract
Novel coronavirus (SARS-CoV-2) enters its host cell through a surface spike protein. The viral spike protein has undergone several modifications/mutations at the genomic level, through which it modulated its structure-function and passed through several variants of concern. Recent advances in high-resolution structure determination and multiscale imaging techniques, cost-effective next-generation sequencing, and development of new computational methods (including information theory, statistical methods, machine learning, and many other artificial intelligence-based techniques) have hugely contributed to the characterization of sequence, structure, function of spike proteins, and its different variants to understand viral pathogenesis, evolutions, and transmission. Laying on the foundation of the sequence-structure-function paradigm, this review summarizes not only the important findings on structure/function but also the structural dynamics of different spike components, highlighting the effects of mutations on them. As dynamic fluctuations of three-dimensional spike structure often provide important clues for functional modulation, quantifying time-dependent fluctuations of mutational events over spike structure and its genetic/amino acidic sequence helps identify alarming functional transitions having implications for enhanced fusogenicity and pathogenicity of the virus. Although these dynamic events are more difficult to capture than quantifying a static, average property, this review encompasses those challenging aspects of characterizing the evolutionary dynamics of spike sequence and structure and their implications for functions.
5
Jin W, Zhang J, Chen X, Yin S, Yu H, Gao F, Yao D. Unraveling the complexity of histone-arginine methyltransferase CARM1 in cancer: From underlying mechanisms to targeted therapeutics. Biochim Biophys Acta Rev Cancer 2023; 1878:188916. [PMID: 37196782 DOI: 10.1016/j.bbcan.2023.188916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2023] [Revised: 04/28/2023] [Accepted: 05/12/2023] [Indexed: 05/19/2023]
Abstract
Coactivator-associated arginine methyltransferase 1 (CARM1), a type I protein arginine methyltransferase (PRMT), has been widely reported to catalyze arginine methylation of histone and non-histone substrates, which is closely associated with the occurrence and progression of cancer. Recently, accumulating studies have demonstrated the oncogenic role of CARM1 in many types of human cancers. More importantly, CARM1 has been emerging as an attractive therapeutic target for the discovery of new candidate anti-tumor drugs. Therefore, in this review, we summarize the molecular structure of CARM1 and its key regulatory pathways, and discuss the rapid progress in understanding the oncogenic functions of CARM1. Moreover, we present several representative targeted CARM1 inhibitors, focusing on their design strategies and potential therapeutic applications. Together, these inspiring findings would shed new light on the underlying mechanisms of CARM1 and provide clues for the discovery of more potent and selective CARM1 inhibitors for future targeted cancer therapy.
Affiliation(s)
- Wenke Jin
  - Sichuan Engineering Research Center for Biomimetic Synthesis of Natural Drugs, School of Life Science and Engineering, Southwest Jiaotong University, Chengdu 610031, China; School of Pharmaceutical Sciences, Shenzhen Technology University, Shenzhen 518118, China; Key Laboratory of Pharmacology of Traditional Chinese Medical Formulae, Ministry of Education, and State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin 301617, China
- Jin Zhang
  - School of Pharmacy, Shenzhen University Medical School, Shenzhen University, Shenzhen, Guangdong 518055, China
- Xiya Chen
  - School of Pharmaceutical Sciences, Shenzhen Technology University, Shenzhen 518118, China; School of Pharmacy, Shenzhen University Medical School, Shenzhen University, Shenzhen, Guangdong 518055, China
- Siwen Yin
  - School of Nursing, Tianjin University of Traditional Chinese Medicine, Tianjin 301617, China
- Haiyang Yu
  - Key Laboratory of Pharmacology of Traditional Chinese Medical Formulae, Ministry of Education, and State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin 301617, China
- Feng Gao
  - Sichuan Engineering Research Center for Biomimetic Synthesis of Natural Drugs, School of Life Science and Engineering, Southwest Jiaotong University, Chengdu 610031, China
- Dahong Yao
  - School of Pharmaceutical Sciences, Shenzhen Technology University, Shenzhen 518118, China
6
Genome-Wide Sequencing Modalities for Children with Unexplained Global Developmental Delay and Intellectual Disabilities—A Narrative Review. CHILDREN 2023; 10:children10030501. [PMID: 36980059 PMCID: PMC10047410 DOI: 10.3390/children10030501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2023] [Revised: 02/25/2023] [Accepted: 03/01/2023] [Indexed: 03/06/2023]
Abstract
Unexplained global developmental delay (GDD) and intellectual disabilities (ID) together affect nearly 2% of the pediatric population. Establishing an etiologic diagnosis is crucial for disease management, prognostic evaluation, and provision of physical and psychological support for both the patient and the family. Advancements in genome sequencing have allowed rapid accumulation of gene–disorder associations and have accelerated the search for an etiologic diagnosis for unexplained GDD/ID. We reviewed recent studies that utilized genome-wide analysis technologies, and we discussed their diagnostic yield, strengths, and limitations. Overall, exome sequencing (ES) and genome sequencing (GS) outperformed chromosomal microarrays and targeted panel sequencing. GS provides coverage for both ES and chromosomal microarray regions, providing the maximal diagnostic potential, and the cost of ES and reanalysis of ES-negative results is currently still lower than that of GS alone. Therefore, singleton or trio ES is the more cost-effective option for the initial investigation of individuals with GDD/ID in clinical practice compared to a staged approach or GS alone. Based on this updated evidence, we proposed an evaluation algorithm with ES as the first-tier evaluation for unexplained GDD/ID.
7
A Pathogenic Variant Reclassified to the Pseudogene PMS2P1 in a Patient with Suspected Hereditary Cancer. Int J Mol Sci 2023; 24:ijms24021398. [PMID: 36674914 PMCID: PMC9864156 DOI: 10.3390/ijms24021398] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2022] [Revised: 01/04/2023] [Accepted: 01/08/2023] [Indexed: 01/13/2023] Open
Abstract
The PMS2 gene is involved in DNA repair by the mismatch repair pathway. Deficiencies in this mechanism have been associated with Lynch Syndrome (LS), which is characterized by a high risk for colorectal, endometrial, ovarian, breast, and other cancers. Germinal pathogenic variants of PMS2 are associated with up to 5% of all cases of LS. The prevalence is overestimated because of the existence of multiple homologous pseudogenes. We report the case of a 44-year-old woman diagnosed with breast cancer at 34 years without a relevant cancer family history. The presence of pathogenic variant NM_000535.7:c.1A > T, (p.Met1Leu) in PMS2 was determined by next-generation sequencing analysis with a panel of 322 cancer-associated genes and confirmed by capillary sequencing in the patient. The variant was determined in six family members (brothers, sisters, and a son) and seven non-cancerous unrelated individuals. Analysis of the amplified region showed high homology of PMS2 with five of its pseudogenes. We determined that the variant is associated with the PMS2P1 pseudogene following sequence alignment analysis. We propose considering the variant c.1A > T, (p.Met1Leu) in PMS2 for reclassification as not hereditary cancer-related, given the impact on the diagnosis and treatment of cancer patients and families carrying this variant.
8
Halim-Fikri H, Syed-Hassan SNRK, Wan-Juhari WK, Assyuhada MGSN, Hernaningsih Y, Yusoff NM, Merican AF, Zilfalil BA. Central resources of variant discovery and annotation and its role in precision medicine. ASIAN BIOMED 2022; 16:285-298. [PMID: 37551357 PMCID: PMC10392146 DOI: 10.2478/abm-2022-0032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/09/2023]
Abstract
Rapid technological advancement in high-throughput genomics, microarray, and deep sequencing technologies has accelerated the possibility of more complex precision medicine research using large amounts of heterogeneous health-related data from patients, including genomic variants. Genomic variants can be identified and annotated based on the reference human genome either within the sequence as a whole or in a putative functional genomic element. The American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) jointly created standards and guidelines for the appraisal of evidence to increase consistency and transparency in clinical variant interpretation. Various efforts toward precision medicine have been facilitated by many national and international public databases that classify and annotate genomic variation. In the present study, we highlight several resources for the identification and dissemination of data on clinically important genetic variants.
Affiliation(s)
- Hashim Halim-Fikri
  - Malaysian Node of the Human Variome Project, School of Medical Sciences, Universiti Sains Malaysia, Kelantan 16150, Malaysia
- Wan-Khairunnisa Wan-Juhari
  - Malaysian Node of the Human Variome Project, School of Medical Sciences, Universiti Sains Malaysia, Kelantan 16150, Malaysia
  - Human Genome Centre, School of Medical Sciences, Universiti Sains Malaysia, Kelantan 16150, Malaysia
- Mat Ghani Siti Nor Assyuhada
  - Malaysian Node of the Human Variome Project, School of Medical Sciences, Universiti Sains Malaysia, Kelantan 16150, Malaysia
- Yetti Hernaningsih
  - Department of Clinical Pathology, Faculty of Medicine Universitas Airlangga, Dr. Soetomo Academic General Hospital, Surabaya, Indonesia
- Narazah Mohd Yusoff
  - Department of Clinical Pathology, Faculty of Medicine Universitas Airlangga, Dr. Soetomo Academic General Hospital, Surabaya, Indonesia
  - Clinical Diagnostic Laboratory, Advanced Medical and Dental Institute, Universiti Sains Malaysia, Penang 13200, Malaysia
- Amir Feisal Merican
  - Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur 50603, Malaysia
  - Center of Research for Computational Sciences and Informatics in Biology, Bio Industry, Environment, Agriculture and Healthcare (CRYSTAL), University of Malaya, Kuala Lumpur 50603, Malaysia
- Bin Alwi Zilfalil
  - Malaysian Node of the Human Variome Project, School of Medical Sciences, Universiti Sains Malaysia, Kelantan 16150, Malaysia
  - Human Genome Centre, School of Medical Sciences, Universiti Sains Malaysia, Kelantan 16150, Malaysia
9
Williams EC, Chazarra-Gil R, Shahsavari A, Mohorianu I. The Sum of Two Halves May Be Different from the Whole-Effects of Splitting Sequencing Samples Across Lanes. Genes (Basel) 2022; 13:genes13122265. [PMID: 36553532 PMCID: PMC9777937 DOI: 10.3390/genes13122265] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Revised: 11/23/2022] [Accepted: 11/25/2022] [Indexed: 12/03/2022] Open
Abstract
The advances in high-throughput sequencing (HTS) have enabled the characterisation of biological processes at an unprecedented level of detail; most hypotheses in molecular biology rely on analyses of HTS data. However, achieving increased robustness and reproducibility of results remains a main challenge. Although variability in results may be introduced at various stages, e.g., alignment, summarisation or detection of differential expression, one source of variability was systematically omitted: the sequencing design, which propagates through analyses and may introduce an additional layer of technical variation. We illustrate qualitative and quantitative differences arising from splitting samples across lanes on bulk and single-cell sequencing. For bulk mRNAseq data, we focus on differential expression and enrichment analyses; for bulk ChIPseq data, we investigate the effect on peak calling and the peaks' properties. At the single-cell level, we concentrate on identifying cell subpopulations. We rely on markers used for assigning cell identities; both smartSeq and 10× data are presented. The observed reduction in the number of unique sequenced fragments limits the level of detail on which the different prediction approaches depend. Furthermore, the sequencing stochasticity adds in a weighting bias corroborated with variable sequencing depths and (yet unexplained) sequencing bias. Subsequently, we observe an overall reduction in sequencing complexity and a distortion in the biological signal across technologies, experimental contexts, organisms and tissues.
Affiliation(s)
- Eleanor C. Williams
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, UK
- Ruben Chazarra-Gil
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, UK
  - Life Sciences-Transcriptomics and Functional Genomics Lab, Barcelona Supercomputing Center (BSC-CNS), 08034 Barcelona, Spain
- Arash Shahsavari
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, UK
- Irina Mohorianu
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, UK
10
Sánchez-Luquez KY, Carpena MX, Karam SM, Tovo-Rodrigues L. The contribution of whole-exome sequencing to intellectual disability diagnosis and knowledge of underlying molecular mechanisms: A systematic review and meta-analysis. MUTATION RESEARCH. REVIEWS IN MUTATION RESEARCH 2022; 790:108428. [PMID: 35905832 DOI: 10.1016/j.mrrev.2022.108428] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/09/2021] [Revised: 07/21/2022] [Accepted: 07/23/2022] [Indexed: 01/01/2023]
Abstract
Whole-exome sequencing (WES) is useful for molecular diagnosis, family genetic counseling, and prognosis of intellectual disability (ID). However, ID molecular diagnosis ascertainment based on WES is highly dependent on de novo mutations (DNMs) and variants of uncertain significance (VUS). The quantification of DNM frequency in ID molecular diagnosis ascertainment and the biological mechanisms common to genes with VUS may provide objective information about WES use in ID diagnosis and etiology. We aimed to investigate and estimate the rate of ID molecular diagnostic assessment by WES, quantify the contribution of DNMs to this rate, and biologically and functionally characterize the genes whose mutations were identified through WES. A PubMed/Medline, Web of Science, Scopus, Science Direct, BIREME, and PsycINFO systematic review and meta-analysis was performed, including studies published between 2010 and 2022. Thirty-seven articles with data on ID molecular diagnostic yield using the WES approach were included in the review. WES testing accounted for an overall diagnostic rate of 42% (Confidence interval (CI): 35-50%), while the estimate restricted to DNMs was 11% (CI: 6-18%). Genetic information on mutations and genes was extracted and split into two groups: (1) genes whose mutation was used for positive molecular diagnosis, and (2) genes whose mutation led to uncertain molecular diagnosis. After functional enrichment analysis, in addition to their expected roles in neurodevelopment, genes from the first group were enriched in epigenetic regulatory mechanisms, immune system regulation, and circadian rhythm control. Genes from uncertain diagnosis cases were enriched in the renin angiotensin pathway. Taken together, our results support WES as an important approach to the molecular diagnosis of ID. 
The results also indicated relevant pathways that may underlie the pathogenesis of ID, with the renin-angiotensin pathway suggested as a potential candidate.
Affiliation(s)
- Marina Xavier Carpena
  - Postgraduate Program in Epidemiology, Universidade Federal de Pelotas, Pelotas, Brazil
- Simone M Karam
  - Postgraduate Program in Public Health, Universidade Federal do Rio Grande, Rio Grande, Brazil
11
Fernandez-Castillo E, Barbosa-Santillán LI, Falcon-Morales L, Sánchez-Escobar JJ. Deep Splicer: A CNN Model for Splice Site Prediction in Genetic Sequences. Genes (Basel) 2022; 13:907. [PMID: 35627292 PMCID: PMC9141016 DOI: 10.3390/genes13050907] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/06/2022] [Revised: 05/12/2022] [Accepted: 05/13/2022] [Indexed: 02/05/2023] Open
Abstract
Many living organisms have DNA in their cells that is responsible for their biological features. DNA is an organic molecule of two complementary strands of four different nucleotides wound up in a double helix. These nucleotides are adenine (A), thymine (T), guanine (G), and cytosine (C). Genes are DNA sequences containing the information to synthesize proteins. The genes of higher eukaryotic organisms contain coding sequences, known as exons and non-coding sequences, known as introns, which are removed on splice sites after the DNA is transcribed into RNA. Genome annotation is the process of identifying the location of coding regions and determining their function. This process is fundamental for understanding gene structure; however, it is time-consuming and expensive when done by biochemical methods. With technological advances, splice site detection can be done computationally. Although various software tools have been developed to predict splice sites, they need to improve accuracy and reduce false-positive rates. The main goal of this research was to generate Deep Splicer, a deep learning model to identify splice sites in the genomes of humans and other species. This model has good performance metrics and a lower false-positive rate than the currently existing tools. Deep Splicer achieved an accuracy between 93.55% and 99.66% on the genetic sequences of different organisms, while Splice2Deep, another splice site detection tool, had an accuracy between 90.52% and 98.08%. Splice2Deep surpassed Deep Splicer on the accuracy obtained after evaluating C. elegans genomic sequences (97.88% vs. 93.62%) and A. thaliana (95.40% vs. 94.93%); however, Deep Splicer's accuracy was better for H. sapiens (98.94% vs. 97.15%) and D. melanogaster (97.14% vs. 92.30%). The rate of false positives was 0.11% for human genetic sequences and 0.25% for other species' genetic sequences. 
Another splice prediction tool, Splice Finder, had a false-positive rate between 1% and 3% for human sequences and between about 4% and 10% for other species' sequences.
Affiliation(s)
- Elisa Fernandez-Castillo
  - School of Engineering and Sciences, Monterrey Institute of Technology and Higher Education, Guadalajara 45201, Mexico
- Liliana Ibeth Barbosa-Santillán
  - School of Engineering and Sciences, Monterrey Institute of Technology and Higher Education, Guadalajara 45201, Mexico
- Luis Falcon-Morales
  - School of Engineering and Sciences, Monterrey Institute of Technology and Higher Education, Guadalajara 45201, Mexico
12
Duttke SH, Beyhan S, Singh R, Neal S, Viriyakosol S, Fierer J, Kirkland TN, Stajich JE, Benner C, Carlin AF. Decoding Transcription Regulatory Mechanisms Associated with Coccidioides immitis Phase Transition Using Total RNA. mSystems 2022; 7:e0140421. [PMID: 35076277 PMCID: PMC8788335 DOI: 10.1128/msystems.01404-21] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/22/2021] [Accepted: 01/04/2022] [Indexed: 01/07/2023] Open
Abstract
New or emerging infectious diseases are commonly caused by pathogens that cannot be readily manipulated or studied under common laboratory conditions. These limitations hinder standard experimental approaches and our abilities to define the fundamental molecular mechanisms underlying pathogenesis. The advance of capped small RNA sequencing (csRNA-seq) now enables genome-wide mapping of actively initiated transcripts from genes and other regulatory transcribed start regions (TSRs) such as enhancers at a precise moment from total RNA. As RNA is nonpathogenic and can be readily isolated from inactivated infectious samples, csRNA-seq can detect acute changes in gene regulation within or in response to a pathogen with remarkable sensitivity under common laboratory conditions. Studying valley fever (coccidioidomycosis), an emerging endemic fungal infection that increasingly impacts livestock, pet, and human health, we show how csRNA-seq can unravel transcriptional programs driving pathogenesis. Performing csRNA-seq on RNA isolated from different stages of the valley fever pathogen Coccidioides immitis revealed alternative promoter usage, connected cis-regulatory domains, and a WOPR family transcription factor, which are known regulators of virulence in other fungi, as being critical for pathogenic growth. We further demonstrate that a C. immitis WOPR homologue, CIMG_02671, activates transcription in a WOPR motif-dependent manner. Collectively, these findings provide novel insights into valley fever pathogenesis and provide a proof of principle for csRNA-seq as a powerful means to determine the genes, regulatory mechanisms, and transcription factors that control the pathogenesis of highly infectious agents. IMPORTANCE Infectious pathogens like airborne viruses or fungal spores are difficult to study; they require high-containment facilities, special equipment, and expertise. 
As such, establishing approaches such as genome editing or other means to identify the factors and mechanisms underlying caused diseases, and, thus, promising drug targets, is costly and time-intensive. These obstacles particularly hinder the analysis of new, emerging, or rare infectious diseases. We recently developed a method termed capped small RNA sequencing (csRNA-seq) that enables capturing acute changes in active gene expression from total RNA. Prior to csRNA-seq, such an analysis was possible only by using living cells or nuclei, in which pathogens are highly infectious. The process of RNA purification, however, inactivates pathogens and thus enables the analysis of gene expression during disease progression under standard laboratory conditions. As a proof of principle, here, we use csRNA-seq to unravel the gene regulatory programs and factors likely critical for the pathogenesis of valley fever, an emerging endemic fungal infection that increasingly impacts livestock, pet, and human health.
Affiliation(s)
- Sascha H. Duttke
  - Department of Medicine, Division of Endocrinology, UC San Diego School of Medicine, La Jolla, California, USA
- Sinem Beyhan
  - Department of Medicine, Division of Infectious Disease, UC San Diego School of Medicine, La Jolla, California, USA
  - J. Craig Venter Institute, Department of Infectious Diseases, La Jolla, California, USA
- Rajendra Singh
  - Department of Medicine, Division of Infectious Disease, UC San Diego School of Medicine, La Jolla, California, USA
- Sonya Neal
  - Section of Cell and Developmental Biology, Division of Biological Sciences, University of California, San Diego, La Jolla, California, USA
- Suganya Viriyakosol
  - Department of Medicine, Division of Infectious Disease, UC San Diego School of Medicine, La Jolla, California, USA
- Joshua Fierer
  - Department of Medicine, Division of Infectious Disease, UC San Diego School of Medicine, La Jolla, California, USA
  - Infectious Diseases Section, VA Healthcare San Diego, San Diego, California, USA
  - Department of Pathology, UC San Diego School of Medicine, La Jolla, California, USA
- Theo N. Kirkland
  - Department of Medicine, Division of Infectious Disease, UC San Diego School of Medicine, La Jolla, California, USA
  - Department of Pathology, UC San Diego School of Medicine, La Jolla, California, USA
- Jason E. Stajich
  - Department of Microbiology and Plant Pathology, Institute for Integrative Genome Biology, University of California—Riverside, Riverside, California, USA
- Christopher Benner
  - Department of Medicine, Division of Endocrinology, UC San Diego School of Medicine, La Jolla, California, USA
- Aaron F. Carlin
  - Department of Medicine, Division of Infectious Disease, UC San Diego School of Medicine, La Jolla, California, USA
|
13
|
Manual Annotation Studio (MAS): a collaborative platform for manual functional annotation of viral and microbial genomes. BMC Genomics 2021; 22:733. [PMID: 34627149 PMCID: PMC8501643 DOI: 10.1186/s12864-021-08029-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/23/2021] [Accepted: 09/22/2021] [Indexed: 11/10/2022]
Abstract
Background: Functional genome annotation is the process of labelling functional genomic regions with descriptive information. Manual curation can produce higher quality genome annotations than fully automated methods. Manual annotation efforts are time-consuming and complex; however, software can help reduce these drawbacks. Results: We created Manual Annotation Studio (MAS) to improve the efficiency of manual functional annotation of prokaryotic and viral genomes. MAS allows users to upload unannotated genomes, provides an interface to edit and upload annotations, tracks annotation history and progress, and saves data to a relational database. MAS provides users with pertinent information through a simple point-and-click interface to execute and visualize results for multiple homology search tools (blastp, rpsblast, and HHsearch) against multiple databases (Swiss-Prot, nr, CDD, PDB, and an internally generated database). MAS was designed to accept connections over the local area network (LAN) of a lab or organization so multiple users can access it simultaneously. MAS can take advantage of high-performance computing (HPC) clusters by interfacing with SGE or SLURM, and data can be exported from MAS in a variety of formats (FASTA, GenBank, GFF, and Excel). Conclusions: MAS streamlines and provides structure to manual functional annotation projects. MAS enhances the ability of users to generate, interpret, and compare results from multiple tools. The structure that MAS provides can improve project organization and reduce annotation errors. MAS is ideal for team-based annotation projects because it facilitates collaboration. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-021-08029-8.
|
14
|
Souvorov A, Agarwala R. SAUTE: sequence assembly using target enrichment. BMC Bioinformatics 2021; 22:375. [PMID: 34289805 PMCID: PMC8293564 DOI: 10.1186/s12859-021-04174-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/25/2021] [Accepted: 05/05/2021] [Indexed: 01/25/2023]
Abstract
Background: Illumina is the dominant sequencing technology at this time. Short length, short insert size, some systematic biases, and low-level carryover contamination in Illumina reads continue to make assembly of repeated regions a challenging problem. Some applications also require finding multiple well-supported variants for assembled regions. Results: To facilitate assembly of repeat regions and to report multiple well-supported variants when a user can provide target sequences to assist the assembly, we propose the SAUTE and SAUTE_PROT assemblers. Both assemblers use a de Bruijn graph on reads. Targets can be transcripts or proteins for RNA-seq reads and transcripts, proteins, or genomic regions for genomic reads. Target sequences are nucleotide and protein sequences for SAUTE and SAUTE_PROT, respectively. Conclusions: For RNA-seq, comparisons with Trinity, rnaSPAdes, SPAligner, and SPAdes assembly of reads aligned to target proteins by DIAMOND show that SAUTE_PROT finds more coding sequences that translate to benchmark proteins. Using AMRFinderPlus calls, we find SAUTE has higher sensitivity and precision than SPAdes, plasmidSPAdes, SPAligner, and SPAdes assembly of reads aligned to target regions by HISAT2. It also has better sensitivity than SKESA but worse precision. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04174-9.
Affiliation(s)
- Richa Agarwala
  - NCBI/NLM/NIH/DHHS, 8600 Rockville Pike, Bethesda, MD, 20894, USA.
|
15
|
Erady C, Boxall A, Puntambekar S, Suhas Jagannathan N, Chauhan R, Chong D, Meena N, Kulkarni A, Kasabe B, Prathivadi Bhayankaram K, Umrania Y, Andreani A, Nel J, Wayland MT, Pina C, Lilley KS, Prabakaran S. Pan-cancer analysis of transcripts encoding novel open-reading frames (nORFs) and their potential biological functions. NPJ Genom Med 2021; 6:4. [PMID: 33495453 PMCID: PMC7835362 DOI: 10.1038/s41525-020-00167-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 07/06/2020] [Accepted: 11/18/2020] [Indexed: 12/13/2022]
Abstract
Uncharacterized and unannotated open-reading frames, which we refer to as novel open reading frames (nORFs), may sometimes encode peptides that remain unexplored for novel therapeutic opportunities. To our knowledge, no systematic identification and characterization of transcripts encoding nORFs or their translation products in cancer, or in any other physiological process, has been performed. We use our curated nORFs database (nORFs.org), together with RNA-Seq data from The Cancer Genome Atlas (TCGA) and Genotype-Tissue Expression (GTEx) consortiums, to identify transcripts containing nORFs that are expressed frequently in cancer or matched normal tissue across 22 cancer types. We show nORFs are subject to extensive dysregulation at the transcript level in cancer tissue and that a small subset of nORFs are associated with overall patient survival, suggesting that nORFs may have prognostic value. We also show that nORF products can form protein-like structures with post-translational modifications. Finally, we perform in silico screening for inhibitors against nORF-encoded proteins that are disrupted in stomach and esophageal cancer, showing that they can potentially be targeted by inhibitors. We hope this work will guide and motivate future studies that perform in-depth characterization of nORF functions in cancer and other diseases.
Affiliation(s)
- Chaitanya Erady
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Adam Boxall
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Shraddha Puntambekar
  - Department of Biology, Indian Institute of Science Education and Research, Pune, Maharashtra, 411008, India
- N Suhas Jagannathan
  - Cancer and Stem Cell Biology Programme, and Centre for Computational Biology, Duke-NUS Medical School, Singapore, 169857, Singapore
- Ruchi Chauhan
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- David Chong
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Narendra Meena
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Apurv Kulkarni
  - Department of Biology, Indian Institute of Science Education and Research, Pune, Maharashtra, 411008, India
- Bhagyashri Kasabe
  - Department of Biology, Indian Institute of Science Education and Research, Pune, Maharashtra, 411008, India
- Yagnesh Umrania
  - Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
- Adam Andreani
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Jean Nel
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Matthew T Wayland
  - Department of Zoology, University of Cambridge, Downing Street, Cambridge, CB2 3EJ, UK
- Cristina Pina
  - Department of Haematology, Cambridge Biomedical Campus, Cambridge, CB2 0PT, UK
- Kathryn S Lilley
  - Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
- Sudhakaran Prabakaran
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK.
|
16
|
Padmavathi P, Setlur AS, Chandrashekar K, Niranjan V. A comprehensive in-silico computational analysis of twenty cancer exome datasets and identification of associated somatic variants reveals potential molecular markers for detection of varied cancer types. Informatics in Medicine Unlocked 2021. [DOI: 10.1016/j.imu.2021.100762] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/25/2023]
|
17
|
SoRelle JA, Wachsmann M, Cantarel BL. Assembling and Validating Bioinformatic Pipelines for Next-Generation Sequencing Clinical Assays. Arch Pathol Lab Med 2020; 144:1118-1130. [PMID: 32045276 DOI: 10.5858/arpa.2019-0476-ra] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Accepted: 12/09/2019] [Indexed: 11/06/2022]
Abstract
CONTEXT.— Clinical next-generation sequencing (NGS) is being rapidly adopted, but analysis and interpretation of large data sets prompt new challenges for a clinical laboratory setting. Clinical NGS results rely heavily on the bioinformatics pipeline for identifying genetic variation in complex samples. The choice of bioinformatics algorithms, genome assembly, and genetic annotation databases is important for determining genetic alterations associated with disease. The analysis methods are often tuned to the assay to maximize accuracy. Once a pipeline has been developed, it must be validated to determine accuracy and reproducibility for samples similar to real-world cases. In silico proficiency testing or institutional data exchange will ensure consistency among clinical laboratories. OBJECTIVE.— To provide molecular pathologists a step-by-step guide to bioinformatics analysis and validation design in order to navigate the regulatory and validation standards of implementing a bioinformatic pipeline as a part of a new clinical NGS assay. DATA SOURCES.— This guide uses published studies on genomic analysis, bioinformatics methods, and methods comparison studies to inform the reader on what resources, including open source software tools and databases, are available for genetic variant detection and interpretation. CONCLUSIONS.— This review covers 4 key concepts: (1) bioinformatic analysis design for detecting genetic variation, (2) the resources for assessing genetic effects, (3) analysis validation assessment experiments and data sets, including a diverse set of samples to mimic real-world challenges that assess accuracy and reproducibility, and (4) whether concordance between clinical laboratories will be improved by proficiency testing designed to test bioinformatic pipelines.
Affiliation(s)
- Jeffrey A SoRelle
  - Department of Pathology (SoRelle, Wachsmann), University of Texas Southwestern Medical Center, Dallas
- Megan Wachsmann
  - Department of Pathology (SoRelle, Wachsmann), University of Texas Southwestern Medical Center, Dallas
- Brandi L Cantarel
  - Bioinformatics Core Facility (Cantarel), University of Texas Southwestern Medical Center, Dallas
  - Department of Bioinformatics (Cantarel), University of Texas Southwestern Medical Center, Dallas
  - University of Texas Southwestern Medical Center, Dallas
|
18
|
Ejigu GF, Jung J. Review on the Computational Genome Annotation of Sequences Obtained by Next-Generation Sequencing. Biology 2020; 9:E295. [PMID: 32962098 PMCID: PMC7565776 DOI: 10.3390/biology9090295] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 08/21/2020] [Revised: 09/13/2020] [Accepted: 09/16/2020] [Indexed: 12/16/2022]
Abstract
Next-Generation Sequencing (NGS) has made it easier to obtain genome-wide sequence data and it has shifted the research focus into genome annotation. The challenging tasks involved in annotation rely on the currently available tools and techniques to decode the information contained in nucleotide sequences. This information will improve our understanding of general aspects of life and evolution and improve our ability to diagnose genetic disorders. Here, we present a summary of both structural and functional annotations, as well as the associated comparative annotation tools and pipelines. We highlight visualization tools that immensely aid the annotation process and the contributions of the scientific community to the annotation. Further, we discuss quality-control practices and the need for re-annotation, and highlight the future of annotation.
Affiliation(s)
- Jaehee Jung
  - Department of Information and Communication Engineering, Myongji University, Yongin-si 17058, Gyeonggi-do, Korea
|
19
|
Park KJ, Lee W, Chun S, Min WK. The Frequency of Discordant Variant Classification in the Human Gene Mutation Database: A Comparison of the American College of Medical Genetics and Genomics Guidelines and ClinVar. Lab Med 2020; 52:250-259. [PMID: 32926152 DOI: 10.1093/labmed/lmaa072] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 01/25/2023]
Abstract
OBJECTIVE: Discordant variant classification among public databases is one of the well-documented limitations when interpreting the pathogenicity of variants. The aim of this study is to investigate the level of germline variant misannotation from the Human Gene Mutation Database (HGMD) and the annotation concordance between databases. METHODS: We used a total of 188,106 classified variants (disease-causing mutations [n = 179,454] and polymorphisms [n = 8652]) in 6466 genes from the HGMD. All variants were reanalyzed based on the American College of Medical Genetics and Genomics (ACMG) guidelines and compared to ClinVar database variants. RESULTS: When variants were classified based on the ACMG guidelines, misclassification was observed in 3.47% (2289/65,896) of variants. The overall concordance between HGMD and ClinVar was 97.62% (52,499/53,780) of variants studied. CONCLUSION: Variants in databases must be used with caution when variant pathogenicity is interpreted. This study reveals the frequency of misannotation of the HGMD variants and annotation concordance between databases in depth.
Affiliation(s)
- Kyoung-Jin Park
  - Department of Laboratory Medicine, Myongji Hospital, Hanyang University College of Medicine, Goyang-Si, Gyeonggi-Do, Korea
- Woochang Lee
  - Department of Laboratory Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
- Sail Chun
  - Department of Laboratory Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
- Won-Ki Min
  - Department of Laboratory Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
|
20
|
Bell SC, Mall MA, Gutierrez H, Macek M, Madge S, Davies JC, Burgel PR, Tullis E, Castaños C, Castellani C, Byrnes CA, Cathcart F, Chotirmall SH, Cosgriff R, Eichler I, Fajac I, Goss CH, Drevinek P, Farrell PM, Gravelle AM, Havermans T, Mayer-Hamblett N, Kashirskaya N, Kerem E, Mathew JL, McKone EF, Naehrlich L, Nasr SZ, Oates GR, O'Neill C, Pypops U, Raraigh KS, Rowe SM, Southern KW, Sivam S, Stephenson AL, Zampoli M, Ratjen F. The future of cystic fibrosis care: a global perspective. The Lancet Respiratory Medicine 2020; 8:65-124. [DOI: 10.1016/s2213-2600(19)30337-6] [Citation(s) in RCA: 351] [Impact Index Per Article: 87.8] [Received: 06/10/2019] [Revised: 07/19/2019] [Accepted: 08/14/2019] [Indexed: 02/06/2023]
|
21
|
Ballard LM, Horton RH, Fenwick A, Lucassen AM. Genome sequencing in healthcare: understanding the UK general public's views and implications for clinical practice. Eur J Hum Genet 2019; 28:155-164. [PMID: 31527856 DOI: 10.1038/s41431-019-0504-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 05/24/2019] [Revised: 08/06/2019] [Accepted: 08/22/2019] [Indexed: 12/27/2022]
Abstract
Technological advances have seen the offer of genome sequencing becoming part of mainstream medical practice. Research has elicited patient and health professional views on the ethical issues genome sequencing raises; however, we know little about the general public's views. These views offer an insight into people's faith in such technologies, informing discussion regarding the approach to consent in clinic. We aimed to garner public views regarding genome sequencing, incidental findings (IFs), and sharing genetic information with relatives. Participants (n = 1954) from the British general public completed a survey, distributed via email. Overall, the public had a positive view of genomic sequencing, choosing 'informative' as the most popular word (52%) and 'family legacy' as the most popular analogy (33%) representing genomic sequencing for them. Fifty-three percent agreed that their relative had the right to be told about genetic information relevant to them. Fifty-four percent would expect to be told about IFs whether they had asked for them or not. Clinical practice needs to acknowledge these perspectives and expectations in order to facilitate meaningful discussion during the consent process for genomic tests. We suggest that: (a) optimistic perspectives on the usefulness of genomic tests need to be tempered by discussion in clinic about the likelihood that genomic results might be uninformative, uncertain or unexpected; (b) discussions regarding the familial nature of results are needed before testing: the majority of patients will welcome this and any concerns can be explored further; and (c) a wider discussion is required regarding the consent approach for genomic testing.
Affiliation(s)
- Lisa M Ballard
  - Clinical Ethics and Law at Southampton (CELS), Centre for Cancer Immunology, University of Southampton, School of Medicine, Southampton, UK.
- Rachel H Horton
  - Clinical Ethics and Law at Southampton (CELS), Centre for Cancer Immunology, University of Southampton, School of Medicine, Southampton, UK
- Angela Fenwick
  - Clinical Ethics and Law at Southampton (CELS), Centre for Cancer Immunology, University of Southampton, School of Medicine, Southampton, UK
- Anneke M Lucassen
  - Clinical Ethics and Law at Southampton (CELS), Centre for Cancer Immunology, University of Southampton, School of Medicine, Southampton, UK
|
22
|
Lloyd JPB, Lang D, Zimmer AD, Causier B, Reski R, Davies B. The loss of SMG1 causes defects in quality control pathways in Physcomitrella patens. Nucleic Acids Res 2019; 46:5822-5836. [PMID: 29596649 PMCID: PMC6009662 DOI: 10.1093/nar/gky225] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 11/22/2017] [Accepted: 03/16/2018] [Indexed: 12/16/2022]
Abstract
Nonsense-mediated mRNA decay (NMD) is important for RNA quality control and gene regulation in eukaryotes. NMD targets aberrant transcripts for decay and also directly influences the abundance of non-aberrant transcripts. In animals, the SMG1 kinase plays an essential role in NMD by phosphorylating the core NMD factor UPF1. Despite SMG1 being ubiquitous throughout the plant kingdom, little is known about its function, probably because SMG1 is atypically absent from the genome of the model plant, Arabidopsis thaliana. By combining our previously established SMG1 knockout in moss with transcriptome-wide analysis, we reveal the range of processes involving SMG1 in plants. Machine learning assisted analysis suggests that 32% of multi-isoform genes produce NMD-targeted transcripts and that splice junctions downstream of a stop codon act as the major determinant of NMD targeting. Furthermore, we suggest that SMG1 is involved in other quality control pathways, affecting DNA repair and the unfolded protein response, in addition to its role in mRNA quality control. Consistent with this, smg1 plants have increased susceptibility to DNA damage, but increased tolerance to unfolded protein inducing agents. The potential involvement of SMG1 in RNA, DNA and protein quality control has major implications for the study of these processes in plants.
Affiliation(s)
- James P B Lloyd
  - Centre for Plant Sciences, Faculty of Biological Sciences, University of Leeds, UK
- Daniel Lang
  - Plant Biotechnology, Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
- Andreas D Zimmer
  - Plant Biotechnology, Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
- Barry Causier
  - Centre for Plant Sciences, Faculty of Biological Sciences, University of Leeds, UK
- Ralf Reski
  - Plant Biotechnology, Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
  - BIOSS - Centre for Biological Signalling Studies, University of Freiburg, 79104 Freiburg, Germany
- Brendan Davies
  - Centre for Plant Sciences, Faculty of Biological Sciences, University of Leeds, UK
|
23
|
Richardson MF, Munyard K, Croft LJ, Allnutt TR, Jackling F, Alshanbari F, Jevit M, Wright GA, Cransberg R, Tibary A, Perelman P, Appleton B, Raudsepp T. Chromosome-Level Alpaca Reference Genome VicPac3.1 Improves Genomic Insight Into the Biology of New World Camelids. Front Genet 2019; 10:586. [PMID: 31293619 PMCID: PMC6598621 DOI: 10.3389/fgene.2019.00586] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 01/30/2019] [Accepted: 06/04/2019] [Indexed: 12/11/2022]
Abstract
The development of high-quality chromosomally assigned reference genomes constitutes a key feature for understanding genome architecture of a species and is critical for the discovery of the genetic blueprints of traits of biological significance. South American camelids serve people in extreme environments and are important fiber and companion animals worldwide. Despite this, the alpaca reference genome lags far behind those available for other domestic species. Here we produced a chromosome-level improved reference assembly for the alpaca genome using the DNA of the same female Huacaya alpaca as in previous assemblies. We generated 190X Illumina short-read, 8X Pacific Biosciences long-read and 60X Dovetail Chicago® chromatin interaction scaffolding data for the assembly, used testis and skin RNAseq data for annotation, and cytogenetic map data for chromosomal assignments. The new assembly VicPac3.1 contains 90% of the alpaca genome in just 103 scaffolds and 76% of all scaffolds are mapped to the 36 pairs of the alpaca autosomes and the X chromosome. Preliminary annotation of the assembly predicted 22,462 coding genes and 29,337 isoforms. Comparative analysis of selected regions of the alpaca genome, such as the major histocompatibility complex (MHC), the region involved in the Minute Chromosome Syndrome (MCS) and candidate genes for high-altitude adaptations, reveals unique features of the alpaca genome. The alpaca reference genome VicPac3.1 presents a significant improvement in completeness, contiguity and accuracy over VicPac2 and is an important tool for the advancement of genomics research in all New World camelids.
Affiliation(s)
- Mark F Richardson
  - Genomics Centre, Deakin University, Geelong, VIC, Australia
  - Centre for Integrative Ecology, Deakin University, Geelong, VIC, Australia
- Kylie Munyard
  - School of Pharmacy and Biomedical Sciences, Curtin Health Innovation Research Institute, Curtin University, Perth, WA, Australia
- Larry J Croft
  - Genomics Centre, Deakin University, Geelong, VIC, Australia
- Theodore R Allnutt
  - Bioinformatics Core Research Group, Deakin University, Geelong, VIC, Australia
- Felicity Jackling
  - Department of Genetics, The University of Melbourne, Melbourne, VIC, Australia
- Fahad Alshanbari
  - Department of Veterinary Pathobiology, Texas A&M University, College Station, TX, United States
- Matthew Jevit
  - Department of Veterinary Pathobiology, Texas A&M University, College Station, TX, United States
- Gus A Wright
  - Department of Veterinary Pathobiology, Texas A&M University, College Station, TX, United States
- Rhys Cransberg
  - School of Pharmacy and Biomedical Sciences, Curtin Health Innovation Research Institute, Curtin University, Perth, WA, Australia
- Ahmed Tibary
  - Center for Reproductive Biology, Washington State University, Pullman, WA, United States
- Polina Perelman
  - Institute of Molecular and Cellular Biology, Siberian Branch of Russian Academy of Sciences, Novosibirsk, Russia
- Belinda Appleton
  - Centre for Integrative Ecology, Deakin University, Geelong, VIC, Australia
- Terje Raudsepp
  - Department of Veterinary Pathobiology, Texas A&M University, College Station, TX, United States
|
24
|
Shapshak P. Astrovirology, Astrobiology, Artificial Intelligence: Extra-Solar System Investigations. Global Virology III: Virology in the 21st Century 2019. [PMCID: PMC7120930 DOI: 10.1007/978-3-030-29022-1_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
This chapter attempts to encompass and tackle a large problem in Astrovirology and Astrobiology. There is a huge anthropomorphic prejudice that although life is unlikely, the just-right Goldilocks terrestrial conditions mean that the just-right balance of minerals and basic small molecules inevitably result in life as we know it throughout our solar system, galaxy, and the rest of the universe. Moreover, when such conditions on planets such as ours may not be quite right for the origin of life, it is popularly opined that asteroids and comets magically produce life or at the very least, the important, if not crucial components of terrestrial life so that life then blooms, when their fragments cruise the solar system, stars, and galaxies, and plummet onto appropriately bedecked planets and moons. It is no longer extraordinary to detect extraterrestrial solar systems. Moreover, since extra-solar system space exploration has commenced, this provides the problem of detecting life with enhanced achievability. Small organisms, which replicate outside of a living cell or host, would not be catalogued as viruses. How about viruses that cohabit with life? On the Earth, viruses are a major, if underestimated, condition of life – will that be the case elsewhere? Detection of extra-solar system viruses, if they exist, requires finding life, since viruses necessitate life to replicate. (It should be noted, though, that viruses could be detected through various types of portable ultra-microscopes, including Electron Microscopes (EM) (scanning and transmission) as well as Atomic Force Microscopes (AFM).) However, extra-solar system detection of life does not oblige that viruses exist ubiquitously. Viruses are important potential components of biospheres because of their multiple interactions and influence on evolution, although viruses are small and obligatory parasitic. 
In addition, nanotechnology – living or replicating nano-synthetic machine organisms might also be present out there, and require consideration as well. An imposing caveat is that, if found, could some extraterrestrial viruses and synthetic nanotechnological microorganisms infect humans? Possibly, intelligence and cognition may at times be contemporaneous with life. Concomitantly, life and viruses that may be detected, could well be impacted upon by intelligences existing on such exoplanets (and vice versa). Coming to an understanding of the plurality of extraterrestrial intelligence is an optimal objective, in order to avoid causing harm on exoplanets, as well as avoiding conflict and possible human devastation. This is especially the case if we encounter greatly advanced galactic-level civilizations, compared to terrestrial civilizations. Their machine and bionic technologies on the Dyson engineering civilization scale may be prominently superior to ours; their biological expertise may be similarly critically radical. For example, they may use viruses for purposes for which we are barely aware, and which could be utterly deadly for humans. A series of steps is being taken in space exploration. Scientists hypothesize and claim that types of life may be near the Earth, in the solar system, and outside the solar system, similar to ours in the sense that only such conditions, Goldilocks conditions, are key sine qua non requirements, based on our terrestrial chemistry and biochemistry. If detected within the solar system, will life or its remnants resemble terrestrial life? Outside the solar system a similar chauvinism exists, although the likelihood for life, in any event, remains probably low, according to more cautious approaches to the problem. 
The study of our solar system includes planets, asteroids, comets, and other planetesimals that have been in overall contiguity for several billion years; anthropomorphism claims that life has consequently been developing along terrestrial-type mechanisms. However, a non-anthropomorphic view would surmise that it probably has not, especially in extra-solar system locales. The prime warning and admonition in all these deliberations is the contamination and damage that current and past practices and procedures have caused and continue to cause, owing to insufficient biocontainment concepts and technology to date. Advances in the development of robotics, artificial intelligence (AI), and high-capacity ultrafast quantum computers (QC) greatly enhance the sophisticated control and logical development of extra-solar system studies. Consequently, future long-range manned space exploration seems unwarranted. Clearly, the use of intelligent machine-based investigations will reduce dangers to human health and safety and, besides, increase cost-effectiveness. Space exploration comes at great cost to humanity as a whole and utilizes global resources. Consequently, appropriate organizational measures and planning/cooperation need to be in place. Moreover, the bottom line is that despite all the slogans and claims, there have been next to no financial benefits to our planet as a whole. Such financial and heedless difficulties need to be addressed, the sooner the better. In addition, prior to exposure to exoplanetary life, a deep understanding of the problems of infectious diseases and immune dysfunction risks is needed. In addition, global efforts should avoid serendipity and stochasticity, as this work should be directed with long-term organization, commitment, and scientific and technological methodology. This chapter briefly reviews such questions assuming a new paradigm for oversight of extrasolar system viral investigations including intelligence and life.
Finances are included as an essential adjunct.
|
25
|
Vos S, van Diest PJ, Ausems MGEM, van Dijk MR, de Leng WWJ, Bredenoord AL. Ethical considerations for modern molecular pathology. J Pathol 2018; 246:405-414. [DOI: 10.1002/path.5157] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/18/2018] [Revised: 08/03/2018] [Accepted: 08/14/2018] [Indexed: 01/08/2023]
Affiliation(s)
- Shoko Vos
- Department of Pathology, University Medical Center Utrecht, Utrecht, The Netherlands
- Paul J van Diest
- Department of Pathology, University Medical Center Utrecht, Utrecht, The Netherlands
- Margreet GEM Ausems
- Department of Medical Genetics, University Medical Center Utrecht, Utrecht, The Netherlands
- Marijke R van Dijk
- Department of Pathology, University Medical Center Utrecht, Utrecht, The Netherlands
- Wendy WJ de Leng
- Department of Pathology, University Medical Center Utrecht, Utrecht, The Netherlands
- Annelien L Bredenoord
- Department of Medical Humanities, University Medical Center Utrecht, Utrecht, The Netherlands
|
26
|
Oncogenic Amplification of Zygotic Dux Factors in Regenerating p53-Deficient Muscle Stem Cells Defines a Molecular Cancer Subtype. Cell Stem Cell 2018; 23:794-805.e4. [PMID: 30449715 DOI: 10.1016/j.stem.2018.10.011] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 05/24/2018] [Revised: 08/27/2018] [Accepted: 10/08/2018] [Indexed: 01/09/2023]
Abstract
The identity of tumor-initiating cells in many cancer types is unknown. Tumors often express genes associated with embryonic development, although the contributions of zygotic programs to tumor initiation and formation are poorly understood. Here, we show that regeneration-induced loss of quiescence in p53-deficient muscle stem cells (MuSCs) results in rhabdomyosarcoma formation with 100% penetrance. Genomic analyses of purified tumor cells revealed spontaneous and discrete oncogenic amplifications in MuSCs that drive tumorigenesis, including, but not limited to, the amplification of the cleavage-stage Dux transcription factor (TF) Duxbl. We further found that Dux factors drive an early embryonic gene signature that defines a molecular subtype across a broad range of human cancers. Duxbl initiates tumorigenesis by enforcing a mesenchymal-to-epithelial transition, and targeted inactivation of Duxbl specifically in Duxbl-expressing tumor cells abolishes their expansion. These findings reveal how regeneration and genomic instability can interact to activate zygotic genes that drive tumor initiation and growth.
|
27
|
Veras AADO, Merlin B, de Sá PHCG. ImproveAssembly - Tool for identifying new gene products and improving genome assembly. PLoS One 2018; 13:e0206000. [PMID: 30365512 PMCID: PMC6203371 DOI: 10.1371/journal.pone.0206000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2018] [Accepted: 10/04/2018] [Indexed: 11/18/2022] Open
Abstract
The availability of biological information in public databases has increased exponentially. To ensure the accuracy of this information, researchers have adopted several methods and refinements to avoid the dissemination of incorrect information; for example, several automated tools are available for annotation processes. However, manual curation ensures and enriches biological information. Additionally, the genomic finishing process is complex, resulting in increased deposition of draft genomes. This introduces bias in other omics analyses because incomplete genomic content is used. This is also observed for complete genomes. For example, genomes generated by reference assembly may not include new products in the new sequence, or errors or bias can occur during the assembly process. Thus, we developed ImproveAssembly, a tool capable of identifying new products missing from genomic sequences, which can be used for complete and draft genomes. The identified products can improve the annotation of complete genomes and drafts while significantly reducing the bias when the information is used in other omics analyses.
Affiliation(s)
- Bruno Merlin
- Faculty of Computer Engineering, Federal University of Pará campus Tucuruí (CAMTUC-UFPA), Pará, Brazil
|
28
|
Liu Z, Yang C, Li X, Luo W, Roy B, Xiong T, Zhang X, Yang H, Wang J, Ye Z, Chen Y, Song J, Ma S, Zhou Y, Yang M, Fang X, Du J. The landscape of somatic mutation in sporadic Chinese colorectal cancer. Oncotarget 2018; 9:27412-27422. [PMID: 29937994 PMCID: PMC6007951 DOI: 10.18632/oncotarget.25287] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 08/02/2017] [Accepted: 03/06/2018] [Indexed: 12/18/2022] Open
Abstract
Colorectal cancer is the fifth most prevalent cancer in China. Nevertheless, a large-scale characterization of the Chinese colorectal cancer mutation spectrum has not been carried out. In this study, we performed whole-exome sequencing analysis of tumor samples from 98 patients, with matched normal colon tissues, using Illumina and Complete Genomics high-throughput sequencing platforms. Canonical CRC somatic gene mutations with high prevalence (>10%) have been verified, including TP53, APC, KRAS, SMAD4, FBXW7 and PIK3CA. PEG3 is identified as a novel frequently mutated gene (10.6%). APC and Wnt signaling exhibit significantly lower mutation frequencies than those in TCGA data. Analysis with clinical characteristics indicates that the APC gene and Wnt signaling display lower mutation rates in lymph node-positive cancers than in lymph node-negative ones, a difference that is not observed in TCGA data. The APC gene and Wnt signaling are considered the key molecule and pathway for colorectal cancer initiation, and these findings greatly undermine their importance in tumor progression for Chinese patients. Taken together, the application of next-generation sequencing has led to the determination of novel somatic mutations and alternative disease mechanisms in colorectal cancer progression, which may be useful for understanding disease mechanisms and personalizing treatment for Chinese patients.
Affiliation(s)
- Zhe Liu
- Beijing Anzhen Hospital, Capital Medical University, The Key Laboratory of Remodeling-Related Cardiovascular Diseases, Ministry of Education, Beijing Collaborative Innovation Center for Cardiovascular Disorders, Beijing Institute of Heart, Lung and Blood Vessel Disease, Beijing, China; Beijing Advanced Innovation Center for Big Data and Brain Computing (BDBC), Beihang University, Beijing, China
- Chao Yang
- BGI Genomics, BGI-Shenzhen, Shenzhen, China
- Wen Luo
- BGI Genomics, BGI-Shenzhen, Shenzhen, China
- Teng Xiong
- BGI Genomics, BGI-Shenzhen, Shenzhen, China
- Huanming Yang
- BGI Genomics, BGI-Shenzhen, Shenzhen, China; James D. Watson Institute of Genome Sciences, Hangzhou, China
- Jian Wang
- BGI Genomics, BGI-Shenzhen, Shenzhen, China; James D. Watson Institute of Genome Sciences, Hangzhou, China
- Zhenhao Ye
- The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
- Yang Chen
- The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
- Jinghe Song
- SKLSDE Lab, Beihang University, Beijing, China
- Shuai Ma
- SKLSDE Lab, Beihang University, Beijing, China
- Yong Zhou
- BGI Genomics, BGI-Shenzhen, Shenzhen, China
- Min Yang
- Beijing Anzhen Hospital, Capital Medical University, The Key Laboratory of Remodeling-Related Cardiovascular Diseases, Ministry of Education, Beijing Collaborative Innovation Center for Cardiovascular Disorders, Beijing Institute of Heart, Lung and Blood Vessel Disease, Beijing, China; State Key Laboratory of Bioactive Substances and Function of Natural Medicine, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Jie Du
- Beijing Anzhen Hospital, Capital Medical University, The Key Laboratory of Remodeling-Related Cardiovascular Diseases, Ministry of Education, Beijing Collaborative Innovation Center for Cardiovascular Disorders, Beijing Institute of Heart, Lung and Blood Vessel Disease, Beijing, China
|
29
|
Periodic reanalysis of whole-genome sequencing data enhances the diagnostic advantage over standard clinical genetic testing. Eur J Hum Genet 2018; 26:740-744. [PMID: 29453418 DOI: 10.1038/s41431-018-0114-6] [Citation(s) in RCA: 75] [Impact Index Per Article: 12.5] [Received: 09/24/2017] [Revised: 12/11/2017] [Accepted: 01/23/2018] [Indexed: 12/20/2022] Open
Abstract
Whole-genome sequencing (WGS) as a first-tier diagnostic test could transform medical genetic assessments, but there are limited data regarding its clinical use. We previously showed that WGS could feasibly be deployed as a single molecular test capable of a higher diagnostic rate than current practices, in a prospectively recruited cohort of 100 children meeting criteria for chromosomal microarray analysis. In this study, we report on the added diagnostic yield with re-annotation and reanalysis of these WGS data ~2 years later. Explanatory variants have been discovered in seven (10.9%) of 64 previously undiagnosed cases, in emerging disease genes like HMGA2. No new genetic diagnoses were made by any other method in the interval period as part of ongoing clinical care. The results increase the cumulative diagnostic yield of WGS in the study cohort to 41%. This represents a greater than 5-fold increase over the chromosomal microarrays, and a greater than 3-fold increase over all the clinical genetic testing ordered in practice. These findings highlight periodic reanalysis as yet another advantage of genomic sequencing in heterogeneous disorders. We recommend reanalysis of an individual's genome-wide sequencing data every 1-2 years until diagnosis, or sooner if their phenotype evolves.
|
30
|
Abstract
Mitochondrial DNA (mtDNA), which is essential for mitochondrial and cell function, is replicated and transcribed in the organelle by proteins that are entirely coded in the nucleus. Replication of mtDNA is challenged not only by threats related to the replication machinery and orchestration of DNA synthesis, but also by factors linked to the peculiarity of this genome. Indeed, the architecture, organization, copy number, and location of mtDNA, which are markedly distinct from the nuclear genome, require ad hoc and complex regulation to ensure coordinated replication. As a consequence, sub-optimal mtDNA replication, which results from compromised regulation of these factors, is generally associated with mitochondrial dysfunction and disease. Mitochondrial DNA replication should be considered in the context of the organelle and the whole cell, and not just a single genome or a single replication event. Major threats to mtDNA replication are linked to its dependence on both mitochondrial and nuclear factors, which require exquisite coordination of these crucial subcellular compartments. Moreover, regulation of replication events deals with a dynamic population of multiple mtDNA molecules rather than with a fixed number of genome copies, as is the case for nuclear DNA. Importantly, the mechanistic aspects of mtDNA replication are still debated. We describe here major challenges for human mtDNA replication, the mechanistic aspects of the process that are to a large extent original, and their consequences on disease.
Affiliation(s)
- Miria Ricchetti
- Institut Pasteur, Department of Developmental and Stem Cell Biology, Stem Cells and Development, 75724 Cedex15, Paris, France; Team Stability of Nuclear and Mitochondrial DNA, CNRS UMR 3738, 75724, Cedex15, Paris, France.
|