1
Cisse EHM, Pascual LS, Gajanayake KB, Yang F. Tree species and drought: Two mysterious long-standing counterparts. PHYSIOLOGIA PLANTARUM 2024; 176:e14586. [PMID: 39468381 DOI: 10.1111/ppl.14586] [Received: 05/12/2024] [Accepted: 09/25/2024] [Indexed: 10/30/2024]
Abstract
Around 252 million years ago (Late Permian), Earth experienced one of its most significant drought periods, coinciding with a global climate crisis, resulting in a devastating loss of forest trees with no hope of recovery. In the current epoch (Anthropocene), the worsening of drought stress is expected to significantly affect forest communities. Despite extensive efforts, there is significantly less research at the molecular level on forest trees than on annual crop species. Would it not be wise to allocate equal efforts to woody species, regardless of their importance in providing essential furniture and sustaining most terrestrial ecosystems? For instance, the poplar genome is roughly quadruple the size of the Arabidopsis genome and has 1.6 times the number of genes. Thus, a massive effort in genomic studies focusing on forest trees has become inevitable to understand their adaptation to harsh conditions. Nevertheless, with the emerging role and development of high-throughput DNA sequencing systems, there is a growing body of literature about the responses of trees under drought at the molecular and eco-physiological levels. Therefore, synthesizing these findings through contextualizing drought history and concepts is essential to understanding how woody species adapt to water-limited conditions. Comprehensive genomic research on trees is critical for preserving biodiversity and ecosystem function. Integrating molecular insights with eco-physiological analysis will enhance forest management under climate change.
Affiliation(s)
- El Hadji Malick Cisse
- United States Department of Agriculture, Beltsville Agricultural Research Center, Beltsville, Maryland, USA
- Oak Ridge Institute for Science and Education, Oak Ridge, TN, USA
- Lidia S Pascual
- Department of Biology, Biochemistry and Environmental Sciences, University Jaume I, Castellón, Spain
- K Bandara Gajanayake
- United States Department of Agriculture, Beltsville Agricultural Research Center, Beltsville, Maryland, USA
- Oak Ridge Institute for Science and Education, Oak Ridge, TN, USA
- Fan Yang
- Center for Eco-Environment Restoration Engineering of Hainan Province, School of Ecology, Hainan University, Haikou, China
2
Hosoi S, Hirose T, Matsumura S, Otsubo Y, Saito K, Miyazawa M, Suzuki T, Masumura K, Sugiyama KI. Effect of sequencing platforms on the sensitivity of chemical mutation detection using Hawk-Seq™. Genes Environ 2024; 46:20. [PMID: 39385252 PMCID: PMC11462924 DOI: 10.1186/s41021-024-00313-9] [Received: 06/07/2024] [Accepted: 09/22/2024] [Indexed: 10/12/2024] Open
Abstract
BACKGROUND Error-corrected next-generation sequencing (ecNGS) technologies have enabled the direct evaluation of genome-wide mutations after exposure to mutagens. Previously, we reported an ecNGS methodology, Hawk-Seq™, and demonstrated its utility in evaluating mutagenicity. The evaluation of technical transferability is essential to further evaluate the reliability of ecNGS-based assays. However, cutting-edge sequencing platforms are continually evolving, which can affect the sensitivity of ecNGS. Therefore, the effect of differences in sequencing instruments on mutation data quality should be evaluated. RESULTS We assessed the performance of four sequencing platforms (HiSeq2500, NovaSeq6000, NextSeq2000, and DNBSEQ-G400) with the Hawk-Seq™ protocol for mutagenicity evaluation using DNA samples from mouse bone marrow exposed to benzo[a]pyrene (BP). The overall mutation (OM) frequencies per 10^6 bp in vehicle-treated samples were 0.22, 0.36, 0.46, and 0.26 for HiSeq2500, NovaSeq6000, NextSeq2000, and DNBSEQ-G400, respectively. The OM frequency of NextSeq2000 was significantly higher than that of HiSeq2500, suggesting that the difference is platform-dependent. The relatively higher value in NextSeq2000 was a consequence of the G:C to C:G mutations in the NextSeq2000 data (0.67 per 10^6 G:C bp), which was higher than the mean of the four platforms by ca. 0.25 per 10^6 G:C bp. A clear dose-dependent increase in G:C to T:A mutation frequencies was observed in all four sequencing platforms after BP exposure. The cosine similarity values of the 96-dimensional trinucleotide mutation patterns between HiSeq and the three other platforms were 0.93, 0.95, and 0.92 for NovaSeq, NextSeq, and DNBSeq, respectively. These results suggest that all platforms can provide equivalent data that reflect the characteristics of the mutagens. CONCLUSIONS All platforms sensitively detected mutagen-induced mutations using the Hawk-Seq™ analysis. The substitution types and frequencies of the background errors differed depending on the platform. The effects of sequencing platforms on mutagenicity evaluation should be assessed before experimentation.
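The two quantities this abstract reports, a mutation frequency per 10^6 bp and a cosine similarity between 96-dimensional trinucleotide spectra (6 substitution classes × 16 flanking-base contexts), can be illustrated with a minimal Python sketch. The channel ordering, the function names, and the toy spectra are assumptions made for illustration; this is not the Hawk-Seq™ implementation.

```python
import math
from itertools import product

BASES = "ACGT"
# Six pyrimidine-centred substitution classes (a common convention, assumed here).
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def trinucleotide_channels():
    """Return the 96 channel labels, e.g. 'A[C>A]G' (hypothetical ordering)."""
    return [
        f"{five}[{sub}]{three}"
        for sub in SUBSTITUTIONS
        for five, three in product(BASES, repeat=2)
    ]


def frequency_per_million(mutations, bases_sequenced):
    """Mutation frequency expressed per 10^6 analysed base pairs."""
    return mutations / bases_sequenced * 1e6


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length spectra."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of channels")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for an all-zero spectrum")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    channels = trinucleotide_channels()
    # Toy spectra: platform B matches platform A except for extra C>G background.
    platform_a = [1.0] * len(channels)
    platform_b = [2.0 if "[C>G]" in label else 1.0 for label in channels]
    print(len(channels), round(cosine_similarity(platform_a, platform_b), 3))
```

Because cosine similarity depends only on the shape of a spectrum and not on its magnitude, two platforms can share a similar pattern while still differing in overall background frequency.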
Affiliation(s)
- Sayaka Hosoi
- R&D - Safety Science Research, Kao Corporation, 3-25-14 Tonomachi, Kawasaki-ku, Kawasaki-shi, Kanagawa, 210-0821, Japan
- Takako Hirose
- R&D - Safety Science Research, Kao Corporation, 3-25-14 Tonomachi, Kawasaki-ku, Kawasaki-shi, Kanagawa, 210-0821, Japan
- Shoji Matsumura
- R&D - Safety Science Research, Kao Corporation, 3-25-14 Tonomachi, Kawasaki-ku, Kawasaki-shi, Kanagawa, 210-0821, Japan.
- Yuki Otsubo
- R&D - Safety Science Research, Kao Corporation, 3-25-14 Tonomachi, Kawasaki-ku, Kawasaki-shi, Kanagawa, 210-0821, Japan
- Kazutoshi Saito
- R&D - Safety Science Research, Kao Corporation, 2606 Akabane, Ichikai-Machi, Haga-Gun, Tochigi, 321-3497, Japan
- Masaaki Miyazawa
- R&D - Safety Science Research, Kao Corporation, 2606 Akabane, Ichikai-Machi, Haga-Gun, Tochigi, 321-3497, Japan
- Takayoshi Suzuki
- Division of Genome Safety Science, National Institute of Health Sciences, 3-25-26 Tonomachi, Kawasaki-ku, Kawasaki-shi, Kanagawa, 210-9501, Japan
- Kenichi Masumura
- Division of Risk Assessment, National Institute of Health Sciences, 3-25-26 Tonomachi, Kawasaki-ku, Kawasaki-shi, Kanagawa, 210-9501, Japan
- Kei-Ichi Sugiyama
- Division of Genome Safety Science, National Institute of Health Sciences, 3-25-26 Tonomachi, Kawasaki-ku, Kawasaki-shi, Kanagawa, 210-9501, Japan
3
Al Bkhetan Z, Wang S. mgikit: demultiplexing toolkit for MGI fastq files. BIOINFORMATICS (OXFORD, ENGLAND) 2024; 40:btae554. [PMID: 39259173 PMCID: PMC11427695 DOI: 10.1093/bioinformatics/btae554] [Received: 01/10/2024] [Revised: 08/15/2024] [Accepted: 09/10/2024] [Indexed: 09/12/2024]
Abstract
SUMMARY MGI sequencing is reported to be an inexpensive solution to obtain genomics information. There is a growing need for software and tools to analyse MGI's outputs efficiently. mgikit is a tool collection to demultiplex MGI fastq data, reformat it effectively and produce visual quality reports. mgikit overcomes several limitations of the standard MGI demultiplexer. It is highly customizable to suit different kinds of datasets and is designed to achieve high performance and optimal memory utilization. AVAILABILITY AND IMPLEMENTATION The tool and its documentation are available at: https://sagc-bioinformatics.github.io/mgikit/.
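For readers unfamiliar with demultiplexing, the general idea is to assign each read to a sample by comparing its barcode with the known sample barcodes, usually tolerating a small number of mismatches. The Python sketch below illustrates only that idea. It is not mgikit's implementation or interface; the function names, the assumption that the barcode occupies the last bases of the read, and the single-mismatch default are all hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(read_seq, barcodes, max_mismatches=1):
    """Return the sample whose barcode matches the read's 3' end, or None.

    barcodes maps sample name -> barcode; all barcodes are assumed to share
    one length. Reads matching zero or several samples stay unassigned.
    """
    barcode_len = len(next(iter(barcodes.values())))
    if len(read_seq) < barcode_len:
        return None
    observed = read_seq[-barcode_len:]  # assumption: barcode is at the read end
    hits = [
        sample
        for sample, barcode in barcodes.items()
        if hamming(observed, barcode) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


def demultiplex(reads, barcodes, max_mismatches=1):
    """Group (name, sequence) reads by sample; unmatched reads go under None."""
    groups = {}
    for name, seq in reads:
        sample = assign_sample(seq, barcodes, max_mismatches)
        groups.setdefault(sample, []).append(name)
    return groups


if __name__ == "__main__":
    sample_barcodes = {"sample_1": "ACGTACGT", "sample_2": "TTTTGGGG"}
    toy_reads = [
        ("read_1", "GGGGCCCCACGTACGT"),  # exact match to sample_1
        ("read_2", "GGGGCCCCTTTTGGGA"),  # one mismatch to sample_2
        ("read_3", "GGGGCCCCCCCCCCCC"),  # no acceptable match
    ]
    print(demultiplex(toy_reads, sample_barcodes))
```

Leaving ambiguous reads unassigned rather than picking the closest barcode is a conservative design choice that trades yield for fewer misassigned reads.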
Affiliation(s)
- Ziad Al Bkhetan
- South Australian Genomics Centre, SAHMRI, Adelaide, SA, 5001, Australia
- Australian BioCommons, The University of Melbourne, Melbourne, VIC, 3010, Australia
- Sen Wang
- South Australian Genomics Centre, SAHMRI, Adelaide, SA, 5001, Australia
4
Liu X, Pang Y, Shan J, Wang Y, Zheng Y, Xue Y, Zhou X, Wang W, Sun Y, Yan X, Shi J, Wang X, Gu H, Zhang F. Beyond the base pairs: comparative genome-wide DNA methylation profiling across sequencing technologies. Brief Bioinform 2024; 25:bbae440. [PMID: 39256199 PMCID: PMC11387064 DOI: 10.1093/bib/bbae440] [Received: 03/27/2024] [Revised: 07/28/2024] [Accepted: 08/21/2024] [Indexed: 09/12/2024] Open
Abstract
Deoxyribonucleic acid (DNA) methylation plays a key role in gene regulation and is critical for development and human disease. Techniques such as whole-genome bisulfite sequencing (WGBS) and reduced representation bisulfite sequencing (RRBS) allow DNA methylation analysis at the genome scale, with Illumina NovaSeq 6000 and MGI Tech DNBSEQ-T7 being popular due to their efficiency and affordability. However, detailed comparative studies of their performance are not available. In this study, we constructed 60 WGBS and RRBS libraries for two platforms using different types of clinical samples and generated approximately 2.8 terabases of sequencing data. We systematically compared quality control metrics, genomic coverage, CpG methylation levels, intra- and interplatform correlations, and performance in detecting differentially methylated positions. Our results revealed that the DNBSEQ platform exhibited better raw read quality, although base quality recalibration indicated potential overestimation of base quality. The DNBSEQ platform also showed lower sequencing depth and less coverage uniformity in GC-rich regions than did the NovaSeq platform and tended to enrich methylated regions. Overall, both platforms demonstrated robust intra- and interplatform reproducibility for RRBS and WGBS, with NovaSeq performing better for WGBS, highlighting the importance of considering these factors when selecting a platform for bisulfite sequencing.
Affiliation(s)
- Xin Liu
- Anhui Province Key Laboratory of Medical Physics and Technology, Institute of Health and Medical Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, Anhui Province 230031, China
- Hefei Cancer Hospital, Chinese Academy of Sciences, Hefei, Anhui Province 230031, China
- Yu Pang
- Department of Bacteriology and Immunology, Beijing Chest Hospital, Capital Medical University/Beijing Tuberculosis and Thoracic Tumor Research Institute, Beijing 101149, China
- Junqi Shan
- Department of Gastrointestinal Surgery, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, Shandong 250117, China
- Yunfei Wang
- Hangzhou ShengTing Biotech Co. Ltd, Hangzhou, Zhejiang Province 310018, China
- Yanhua Zheng
- Department of Hematology, The First Hospital of China Medical University, Shenyang, Liaoning, Shenyang, Liaoning province 110001, China
- Yuhang Xue
- Department of Hematology, The First Hospital of China Medical University, Shenyang, Liaoning, Shenyang, Liaoning province 110001, China
- Xuerong Zhou
- Department of Hematology, The First Hospital of China Medical University, Shenyang, Liaoning, Shenyang, Liaoning province 110001, China
- Wenjun Wang
- Hangzhou ShengTing Biotech Co. Ltd, Hangzhou, Zhejiang Province 310018, China
- Yanlai Sun
- Department of Gastrointestinal Surgery, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, Shandong 250117, China
- Xiaojing Yan
- Department of Hematology, The First Hospital of China Medical University, Shenyang, Liaoning, Shenyang, Liaoning province 110001, China
- Jiantao Shi
- State Key Laboratory of Molecular Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
- Xiaoxue Wang
- Department of Hematology, The First Hospital of China Medical University, Shenyang, Liaoning, Shenyang, Liaoning province 110001, China
- Hongcang Gu
- Anhui Province Key Laboratory of Medical Physics and Technology, Institute of Health and Medical Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, Anhui Province 230031, China
- Hefei Cancer Hospital, Chinese Academy of Sciences, Hefei, Anhui Province 230031, China
- Fan Zhang
- Anhui Province Key Laboratory of Medical Physics and Technology, Institute of Health and Medical Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, Anhui Province 230031, China
- Hefei Cancer Hospital, Chinese Academy of Sciences, Hefei, Anhui Province 230031, China
5
Sun Y, Zhao X, Fan X, Wang M, Li C, Liu Y, Wu P, Yan Q, Sun L. Assessing the impact of sequencing platforms and analytical pipelines on whole-exome sequencing. Front Genet 2024; 15:1334075. [PMID: 38818042 PMCID: PMC11137314 DOI: 10.3389/fgene.2024.1334075] [Received: 11/06/2023] [Accepted: 04/30/2024] [Indexed: 06/01/2024] Open
Affiliation(s)
- Yanping Sun
- GeneMind Biosciences Company Limited, Shenzhen, China
- Xiaochao Zhao
- GeneMind Biosciences Company Limited, Shenzhen, China
- Xue Fan
- Clinical Research Institute, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Miao Wang
- GeneMind Biosciences Company Limited, Shenzhen, China
- Chaoyang Li
- GeneMind Biosciences Company Limited, Shenzhen, China
- Yongfeng Liu
- GeneMind Biosciences Company Limited, Shenzhen, China
- Ping Wu
- GeneMind Biosciences Company Limited, Shenzhen, China
- Qin Yan
- GeneMind Biosciences Company Limited, Shenzhen, China
- Lei Sun
- GeneMind Biosciences Company Limited, Shenzhen, China
6
Feng Z, Peng F, Xie F, Liu Y, Zhang H, Ma J, Xing J, Guo X. Comparison of capture-based mtDNA sequencing performance between MGI and illumina sequencing platforms in various sample types. BMC Genomics 2024; 25:41. [PMID: 38191319 PMCID: PMC10773069 DOI: 10.1186/s12864-023-09938-6] [Received: 08/09/2023] [Accepted: 12/24/2023] [Indexed: 01/10/2024] Open
Abstract
BACKGROUND Mitochondrial genome abnormalities can lead to mitochondrial dysfunction, which in turn affects cellular biology and is closely associated with the development of various diseases. The demand for mitochondrial DNA (mtDNA) sequencing has been increasing, and Illumina and MGI are two commonly used sequencing platforms for capture-based mtDNA sequencing. However, there is currently no systematic comparison of mtDNA sequencing performance between these two platforms. To address this gap, we compared the performance of capture-based mtDNA sequencing between Illumina's NovaSeq 6000 and MGI's DNBSEQ-T7 using tissue, peripheral blood mononuclear cell (PBMC), formalin-fixed paraffin-embedded (FFPE) tissue, plasma, and urine samples. RESULTS Our analysis indicated a high degree of consistency between the two platforms in terms of sequencing quality, GC content, and coverage. In terms of data output, DNBSEQ-T7 showed higher rates of clean data and duplication than NovaSeq 6000. Conversely, the amount of mtDNA data obtained per gigabyte of sequencing data was significantly lower for DNBSEQ-T7 than for NovaSeq 6000. In detecting mtDNA copy number, both platforms exhibited good consistency in all sample types. In detecting mtDNA mutations in tissue, FFPE, and PBMC samples, the two platforms also showed good consistency. However, in plasma and urine samples, the number of mtDNA mutations detected differed significantly between the two platforms. For mtDNA sequencing of plasma and urine samples, NovaSeq 6000 showed a wider DNA fragment size distribution than DNBSEQ-T7. Additionally, the two platforms exhibited different mtDNA fragment end preferences. CONCLUSIONS In summary, the two platforms generally showed good consistency in capture-based mtDNA sequencing. However, the platform-specific data preferences must be considered when plasma and urine samples are analyzed.
Affiliation(s)
- Zehui Feng
- State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and, Department of Physiology and Pathophysiology, Fourth Military Medical University, Xi'an, 710032, China
- Fan Peng
- State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and, Department of Physiology and Pathophysiology, Fourth Military Medical University, Xi'an, 710032, China
- Fanfan Xie
- State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and, Department of Physiology and Pathophysiology, Fourth Military Medical University, Xi'an, 710032, China
- Department of Obstetrics and Gynecology, Xijing Hospital, Fourth Military Medical University, Xi'an, 710032, China
- Yang Liu
- Department of Clinical Diagnosis, Tangdu Hospital, Fourth Military Medical University, Xi'an, 710038, China
- Huanqin Zhang
- State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and, Department of Physiology and Pathophysiology, Fourth Military Medical University, Xi'an, 710032, China
- Jing Ma
- Yanbian University Medical College, Yanji, 133002, China
- Jinliang Xing
- State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and, Department of Physiology and Pathophysiology, Fourth Military Medical University, Xi'an, 710032, China.
- Xu Guo
- State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and, Department of Physiology and Pathophysiology, Fourth Military Medical University, Xi'an, 710032, China.
7
Cerk K, Ugalde‐Salas P, Nedjad CG, Lecomte M, Muller C, Sherman DJ, Hildebrand F, Labarthe S, Frioux C. Community-scale models of microbiomes: Articulating metabolic modelling and metagenome sequencing. Microb Biotechnol 2024; 17:e14396. [PMID: 38243750 PMCID: PMC10832553 DOI: 10.1111/1751-7915.14396] [Received: 01/09/2023] [Revised: 11/27/2023] [Accepted: 12/20/2023] [Indexed: 01/21/2024] Open
Abstract
Building models is essential for understanding the functions and dynamics of microbial communities. Metabolic models built on genome-scale metabolic network reconstructions (GENREs) are especially relevant as a means to decipher the complex interactions occurring among species. Model reconstruction increasingly relies on metagenomics, which permits direct characterisation of naturally occurring communities that may contain organisms that cannot be isolated or cultured. In this review, we provide an overview of the field of metabolic modelling and its increasing reliance on and synergy with metagenomics and bioinformatics. We survey the means of assigning functions and reconstructing metabolic networks from (meta-)genomes, and present the variety and mathematical fundamentals of metabolic models that foster the understanding of microbial dynamics. We emphasise the characterisation of interactions and the scaling of model construction to large communities, two important bottlenecks in the applicability of these models. We give an overview of the current state of the art in metagenome sequencing and bioinformatics analysis, focusing on the reconstruction of genomes in microbial communities. Metagenomics benefits tremendously from third-generation sequencing, and we discuss the opportunities of long-read sequencing, strain-level characterisation and eukaryotic metagenomics. We aim at providing algorithmic and mathematical support, together with tool and application resources, that permit bridging the gap between metagenomics and metabolic modelling.
Affiliation(s)
- Klara Cerk
- Quadram Institute Bioscience, Norwich, UK
- Earlham Institute, Norwich, UK
- Chabname Ghassemi Nedjad
- Inria, University of Bordeaux, INRAE, Talence, France
- University of Bordeaux, CNRS, Bordeaux INP, LaBRI, UMR 5800, Talence, France
- Maxime Lecomte
- Inria, University of Bordeaux, INRAE, Talence, France
- INRAE STLO, University of Rennes, Rennes, France
- Falk Hildebrand
- Quadram Institute Bioscience, Norwich, UK
- Earlham Institute, Norwich, UK
- Simon Labarthe
- Inria, University of Bordeaux, INRAE, Talence, France
- INRAE, University of Bordeaux, BIOGECO, UMR 1202, Cestas, France
8
Liu Z, Liu J, Geng J, Wu E, Zhu J, Cong B, Wu R, Sun H. Metatranscriptomic characterization of six types of forensic samples and its potential application to body fluid/tissue identification: A pilot study. Forensic Sci Int Genet 2024; 68:102978. [PMID: 37995518 DOI: 10.1016/j.fsigen.2023.102978] [Received: 06/02/2023] [Revised: 10/21/2023] [Accepted: 11/13/2023] [Indexed: 11/25/2023]
Abstract
Microorganisms are potential markers for identifying body fluids (venous and menstrual blood, semen, saliva, and vaginal secretion) and skin tissue in forensic genetics. Existing published studies have mainly focused on investigating microbial DNA by 16S rRNA gene sequencing or metagenome shotgun sequencing. Investigations of common forensic body fluids/tissue at the microbial RNA level are rare. Therefore, the use of metatranscriptomics to characterize common forensic body fluids/tissue has not been explored in detail, and the potential application of metatranscriptomics in forensic science remains unknown. Here, we performed 30 metatranscriptome analyses on six types of common forensic samples from healthy volunteers by massively parallel sequencing. After quality control and host RNA filtering, a total of 345,300 unigenes were assembled from clean reads. Four kingdoms, 137 phyla, 267 classes, 488 orders, 985 families, 2052 genera, and 4690 species were annotated across all samples. Alpha- and beta-diversity and differential analysis were also performed. As a result, the saliva and skin groups demonstrated high alpha diversity (Simpson index), while the venous blood group exhibited the lowest diversity despite a high Chao1 index. Specifically, we discussed potential microorganism contamination and the "core microbiome," which may be of special interest to forensic researchers. In addition, we implemented and evaluated artificial neural network (ANN), random forest (RF), and support vector machine (SVM) models for forensic body fluid/tissue identification (BFID) using genus- and species-level metatranscriptome profiles. The ANN and RF prediction models discriminated six forensic body fluids/tissue, demonstrating that the microbial RNA-based method could be applied to BFID. Unlike metagenomic research, metatranscriptomic analysis can provide information about active microbial communities; thus, it may have greater potential to become a powerful tool in forensic science for microbial-based individual identification. This study represents the first attempt to explore the application potential of metatranscriptome profiles in forensic science. Our findings help deepen our understanding of the microorganism community structure at the RNA level and are beneficial for other forensic applications of metatranscriptomics.
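The observation that a sample can combine a high Chao1 index with low Simpson diversity follows from what the two indices measure: Chao1 estimates richness from the number of rare taxa, whereas the Simpson index reflects how evenly abundance is distributed. The Python sketch below illustrates this with toy counts. It assumes the 1 - D form of the Simpson index and the classic Chao1 estimator with the usual bias-corrected fallback when no doubletons are present; the abstract does not state which variants the authors used, and the counts are invented.

```python
def simpson_diversity(counts):
    """Simpson diversity as 1 - sum(p_i^2) (one of several conventions)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one observation is required")
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Chao1 richness estimate from per-taxon counts.

    Uses S_obs + F1^2 / (2 * F2) when doubletons exist, otherwise the
    bias-corrected form S_obs + F1 * (F1 - 1) / 2.
    """
    observed = [c for c in counts if c > 0]
    s_obs = len(observed)
    singletons = sum(1 for c in observed if c == 1)
    doubletons = sum(1 for c in observed if c == 2)
    if doubletons > 0:
        return s_obs + singletons ** 2 / (2 * doubletons)
    return s_obs + singletons * (singletons - 1) / 2


if __name__ == "__main__":
    # Toy data: one dominant taxon plus many rare ones versus an even community.
    dominated = [1000, 1, 1, 1, 1, 1, 1, 2, 2]
    even = [10] * 9
    for label, counts in (("dominated", dominated), ("even", even)):
        print(label, round(simpson_diversity(counts), 3), chao1(counts))
```

In the toy data the dominated community has the higher Chao1 estimate but the lower Simpson diversity, which mirrors the pattern described for venous blood.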
Affiliation(s)
- Zhiyong Liu
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China
- Jiajun Liu
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China
- Jiaojiao Geng
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China
- Enlin Wu
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China
- Jianzhang Zhu
- Guangzhou Eighth People's Hospital, Guangzhou Medical University, Guangzhou 510080, China
- Bin Cong
- College of Forensic Medicine, Hebei Medical University, Hebei Key Laboratory of Forensic Medicine, Shijiazhuang 050017, China.
- Riga Wu
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China.
- Hongyu Sun
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China.
9
Reščenko R, Brīvība M, Atava I, Rovīte V, Pečulis R, Silamiķelis I, Ansone L, Megnis K, Birzniece L, Leja M, Xu L, Shi X, Zhou Y, Slaitas A, Hou Y, Kloviņš J. Whole-Genome Sequencing of 502 Individuals from Latvia: The First Step towards a Population-Specific Reference of Genetic Variation. Int J Mol Sci 2023; 24:15345. [PMID: 37895026 PMCID: PMC10607061 DOI: 10.3390/ijms242015345] [Received: 09/11/2023] [Revised: 10/10/2023] [Accepted: 10/11/2023] [Indexed: 10/29/2023] Open
Abstract
Despite rapid improvements in the accessibility of whole-genome sequencing (WGS), understanding the extent of human genetic variation is limited by the scarce availability of genome sequences from underrepresented populations. Developing the population-scale reference database of Latvian genetic variation may fill the gap in European genomes and improve human genomics research. In this study, we analysed a high-coverage WGS dataset comprising 502 individuals selected from the Genome Database of the Latvian Population. An assessment of variant type, location in the genome, function, medical relevance, and novelty was performed, and a population-specific imputation reference panel (IRP) was developed. We identified more than 18.2 million variants in total, of which 3.3% so far are not represented in gnomAD and dbSNP databases. Moreover, we observed a notable though distinct clustering of the Latvian cohort within the European subpopulations. Finally, our findings demonstrate the improved performance of imputation of variants using the Latvian population-specific reference panel in the Latvian population compared to established IRPs. In summary, our study provides the first WGS data for a regional reference genome that will serve as a resource for the development of precision medicine and complement the global genome dataset, improving the understanding of human genetic variation.
Affiliation(s)
- Raimonds Reščenko
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
- Monta Brīvība
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
- Ivanna Atava
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
- Vita Rovīte
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
- Raitis Pečulis
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
- Ivars Silamiķelis
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
- Laura Ansone
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
- Kaspars Megnis
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
- Līga Birzniece
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
- Mārcis Leja
- Faculty of Medicine, University of Latvia, LV-1004 Riga, Latvia
- Institute of Clinical and Preventive Medicine, University of Latvia, LV-1079 Riga, Latvia
- Liqin Xu
- Latvia MGI Tech, LV-2167 Mārupe, Latvia
- Xulian Shi
- Latvia MGI Tech, LV-2167 Mārupe, Latvia
- Yan Zhou
- Latvia MGI Tech, LV-2167 Mārupe, Latvia
- Andis Slaitas
- Latvia MGI Tech, LV-2167 Mārupe, Latvia
- Yong Hou
- Latvia MGI Tech, LV-2167 Mārupe, Latvia
- Jānis Kloviņš
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
10
Sun J, Su M, Ma J, Xu M, Ma C, Li W, Liu R, He Q, Su Z. Cross-platform comparisons for targeted bisulfite sequencing of MGISEQ-2000 and NovaSeq6000. Clin Epigenetics 2023; 15:130. [PMID: 37582783 PMCID: PMC10426093 DOI: 10.1186/s13148-023-01543-4] [Received: 04/12/2023] [Accepted: 07/28/2023] [Indexed: 08/17/2023] Open
Abstract
BACKGROUND An accurate and reproducible next-generation sequencing platform is essential to identify malignancy-related abnormal DNA methylation changes and translate them into clinical applications including cancer detection, prognosis, and surveillance. However, high-quality DNA methylation sequencing has been challenging because poor sequence diversity of the bisulfite-converted libraries severely impairs sequencing quality and yield. In this study, we tested the MGISEQ-2000 sequencer's capability for DNA methylation sequencing with a published non-invasive pancreatic cancer detection assay, using NovaSeq6000 as the benchmark. RESULTS We sequenced a series of synthetic cell-free DNA (cfDNA) samples with different tumor fractions and found that MGISEQ-2000 yielded data of similar quality to NovaSeq6000. The methylation levels measured by MGISEQ-2000 demonstrated high consistency with NovaSeq6000. Moreover, MGISEQ-2000 showed analytic sensitivity comparable to that of NovaSeq6000, suggesting its potential for clinical detection. To evaluate the clinical performance of MGISEQ-2000, we sequenced 24 clinical samples and predicted the pathology of the samples with a clinical diagnosis model, the PDACatch classifier. The clinical model performance on MGISEQ-2000's data was highly consistent with that on NovaSeq6000's data, with an area under the curve of 1. We also tested the model's robustness with MGISEQ-2000's data when reducing the sequencing depth. The results showed that the PDACatch classifier was as robust with MGISEQ-2000's data as with NovaSeq6000's data. CONCLUSIONS Taken together, MGISEQ-2000 demonstrated similar data quality, consistency of the methylation levels, comparable analytic sensitivity, and matching clinical performance, supporting its application in future non-invasive early cancer detection investigations by detecting distinct methylation patterns of cfDNAs.
Affiliation(s)
- Jin Sun
  - Singlera Genomics (Shanghai) Ltd., No. 500, Furonghua Road, Shanghai, 201203, China
- Mingyang Su
  - Singlera Genomics (Shanghai) Ltd., No. 500, Furonghua Road, Shanghai, 201203, China
- Jianhua Ma
  - Singlera Genomics (Shanghai) Ltd., No. 500, Furonghua Road, Shanghai, 201203, China
- Minjie Xu
  - Singlera Genomics (Shanghai) Ltd., No. 500, Furonghua Road, Shanghai, 201203, China
- Chengcheng Ma
  - Singlera Genomics (Shanghai) Ltd., No. 500, Furonghua Road, Shanghai, 201203, China
- Wei Li
  - Singlera Genomics (Shanghai) Ltd., No. 500, Furonghua Road, Shanghai, 201203, China
- Rui Liu
  - Singlera Genomics (Shanghai) Ltd., No. 500, Furonghua Road, Shanghai, 201203, China
- Qiye He
  - Singlera Genomics (Shanghai) Ltd., No. 500, Furonghua Road, Shanghai, 201203, China
- Zhixi Su
  - Singlera Genomics (Shanghai) Ltd., No. 500, Furonghua Road, Shanghai, 201203, China

11
Cao B, Luo H, Luo T, Li N, Shao K, Wu K, Sahu SK, Li F, Lin C. The performance of whole genome bisulfite sequencing on DNBSEQ-Tx platform examined by different library preparation strategies. Heliyon 2023; 9:e16571. [PMID: 37292292 PMCID: PMC10245168 DOI: 10.1016/j.heliyon.2023.e16571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Revised: 05/14/2023] [Accepted: 05/19/2023] [Indexed: 06/10/2023] Open
Abstract
Background: Whole-genome bisulfite sequencing (WGBS) technology can provide comprehensive DNA methylation at a single-base resolution on a genome-wide scale, and is considered to be the gold standard for the detection of 5-methylcytosine (5mC). However, the International Human Epigenome Consortium proposes that a full DNA methylome should have at least 30-fold redundant coverage of the reference genome from a single biological replicate. Therefore, it remains cost prohibitive for large-scale studies. To find a solution, the DNBSEQ-Tx sequencer was developed, which can generate up to 6 Tb of data in a single run for projects involving large-scale sequencing. Results: In this study, we provided two WGBS library construction methods, DNB_PREBSseq and DNB_SPLATseq, optimized for the DNBSEQ-Tx sequencer, and demonstrated the performance of these two methods on the DNBSEQ-Tx platform, using the DNA extracted from four different cell lines. We also compared the sequencing data from these two WGBS library construction methods with HeLa cell line data from ENCODE sequenced on Illumina HiSeq X Ten and WGBS data of two other cell lines sequenced on HiSeq2500. Various quality control (QC) analyses such as the base quality scores, methylation-bias (m-bias), and conversion efficiency indicated that the data sequenced on the DNBSEQ-Tx platform met the WGBS-required quality controls. Meanwhile, our data closely resembled the coverage shown by the data generated by the Illumina platform. Conclusions: Our study showed that with our optimized methods, DNBSEQ-Tx could generate high-quality WGBS data with relatively good stability for large-scale WGBS sequencing applications. Thus, we conclude that DNBSEQ-Tx can be used for a wide range of WGBS research.
Affiliation(s)
- Boyang Cao
  - BGI-Shenzhen, Shenzhen, 518083, China
  - Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, BGI-Shenzhen, Shenzhen 518083, China
- Huijuan Luo
  - BGI-Shenzhen, Shenzhen, 518083, China
  - Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, BGI-Shenzhen, Shenzhen 518083, China
- Tian Luo
  - BGI-Shenzhen, Shenzhen, 518083, China
  - Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, BGI-Shenzhen, Shenzhen 518083, China
- Nannan Li
  - BGI-Shenzhen, Shenzhen, 518083, China
  - Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, BGI-Shenzhen, Shenzhen 518083, China
- Kang Shao
  - BGI-Shenzhen, Shenzhen, 518083, China
  - Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, BGI-Shenzhen, Shenzhen 518083, China
- Kui Wu
  - BGI-Shenzhen, Shenzhen, 518083, China
  - Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, BGI-Shenzhen, Shenzhen 518083, China
- Fuqiang Li
  - BGI-Shenzhen, Shenzhen, 518083, China
  - Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, BGI-Shenzhen, Shenzhen 518083, China
- Cong Lin
  - BGI-Shenzhen, Shenzhen, 518083, China
  - Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, BGI-Shenzhen, Shenzhen 518083, China

12
Fauzia KA, Alfaray RI, Yamaoka Y. Advantages of Whole Genome Sequencing in Mitigating the Helicobacter pylori Antimicrobial Resistance Problem. Microorganisms 2023; 11:1239. [PMID: 37317213 DOI: 10.3390/microorganisms11051239] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/25/2023] [Accepted: 04/28/2023] [Indexed: 06/16/2023] Open
Abstract
Helicobacter pylori antimicrobial resistance is a critical public health issue. Typically, antimicrobial resistance epidemiology reports include only the antimicrobial susceptibility test results for H. pylori. However, this phenotypic approach is less capable of answering queries related to resistance mechanisms and specific mutations found in particular global regions. Whole genome sequencing can help address these two questions while still offering quality control and is routinely validated against AST standards. A comprehensive understanding of the mechanisms of resistance should improve H. pylori eradication efforts and prevent gastric cancer.
Affiliation(s)
- Kartika Afrida Fauzia
  - Department of Environmental and Preventive Medicine, Oita University Faculty of Medicine, Yufu 879-5593, Japan
  - Department of Public Health and Preventive Medicine, Faculty of Medicine, Universitas Airlangga, Surabaya 60115, Indonesia
  - Helicobacter pylori and Microbiota Study Group, Institute of Tropical Disease, Universitas Airlangga, Surabaya 60115, Indonesia
- Ricky Indra Alfaray
  - Department of Environmental and Preventive Medicine, Oita University Faculty of Medicine, Yufu 879-5593, Japan
  - Helicobacter pylori and Microbiota Study Group, Institute of Tropical Disease, Universitas Airlangga, Surabaya 60115, Indonesia
- Yoshio Yamaoka
  - Department of Environmental and Preventive Medicine, Oita University Faculty of Medicine, Yufu 879-5593, Japan
  - Division of Gastroentero-Hepatology, Department of Internal Medicine, Faculty of Medicine-Dr. Soetomo Teaching Hospital, Universitas Airlangga, Surabaya 60115, Indonesia
  - Department of Medicine, Gastroenterology and Hepatology Section, Baylor College of Medicine, Houston, TX 77030, USA
  - Borneo Medical and Health Research Centre, University Malaysia Sabah, Kota Kinabalu, Sabah 88400, Malaysia
  - Research Center for Global and Local Infectious Diseases, Oita University, Yufu 879-5593, Japan

13
Jeon MS, Jeong DM, Doh H, Kang HA, Jung H, Eyun SI. A practical comparison of the next-generation sequencing platform and assemblers using yeast genome. Life Sci Alliance 2023; 6:e202201744. [PMID: 36746534 PMCID: PMC9902641 DOI: 10.26508/lsa.202201744] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/28/2022] [Revised: 01/25/2023] [Accepted: 01/25/2023] [Indexed: 02/08/2023] Open
Abstract
Assembling fragmented whole-genomic information from the sequencing data is an inevitable process for further genome-wide research. However, it is intricate to select the appropriate assembly pipeline for unknown species because of the species-specific genomic properties. Therefore, our study focused on relatively more static proclivities of sequencing platforms and assembly algorithms than the fickle genome sequences. A total of 212 draft and polished de novo assemblies were constructed under the different sequencing platforms and assembly algorithms with the repetitive yeast genome. Our comprehensive data indicated that sequencing reads from Oxford Nanopore with R7.3 flow cells generated more continuous assemblies than those derived from the PacBio Sequel, although the homopolymer-based assembly errors and chimeric contigs exist. In addition, the comparison between two second-generation sequencing platforms showed that Illumina NovaSeq 6000 provides more accurate and continuous assembly in the second-generation-sequencing-first pipeline, but MGI DNBSEQ-T7 provides a cheap and accurate read in the polishing process. Furthermore, our insight into the relationship among the computational time, read length, and coverage depth provided clues to the optimal pipelines of yeast assembly.
Affiliation(s)
- Min-Seung Jeon
  - Department of Life Science, Chung-Ang University, Seoul, Korea
- Da Min Jeong
  - Department of Life Science, Chung-Ang University, Seoul, Korea
- Huijeong Doh
  - Department of Life Science, Chung-Ang University, Seoul, Korea
- Hyun Ah Kang
  - Department of Life Science, Chung-Ang University, Seoul, Korea
- Hyungtaek Jung
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St Lucia, Australia
- Seong-Il Eyun
  - Department of Life Science, Chung-Ang University, Seoul, Korea

14
Li C, Fan X, Guo X, Liu Y, Wang M, Zhao XC, Wu P, Yan Q, Sun L. Accuracy benchmark of the GeneMind GenoLab M sequencing platform for WGS and WES analysis. BMC Genomics 2022; 23:533. [PMID: 35869426 PMCID: PMC9308344 DOI: 10.1186/s12864-022-08775-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2022] [Accepted: 07/18/2022] [Indexed: 11/23/2022] Open
Abstract
Background: GenoLab M is a recently developed next-generation sequencing (NGS) platform from GeneMind Biosciences. To establish the performance of GenoLab M, we present the first report to benchmark and compare the WGS and WES sequencing data of the GenoLab M sequencer to the NovaSeq 6000 and NextSeq 550 platforms in various types of analysis. For WGS, thirty-fold sequencing on the Illumina NovaSeq platform processed by the GATK pipeline is currently considered the gold standard. Thus this dataset is generated as a benchmark reference in this study. Results: GenoLab M showed an average Q20 percentage of 94.62% for base quality, while NovaSeq was slightly higher at 96.97%. However, GenoLab M outperformed NovaSeq and NextSeq in duplication rate, suggesting more usable data after deduplication. For WGS short variant calling, GenoLab M showed significant accuracy improvement over the same-depth dataset from NovaSeq, and reached similar accuracy to the NovaSeq 33X dataset at 22X depth. For 100X WES, the F-score and precision of GenoLab M were higher than those of NovaSeq or NextSeq, especially for InDel calling. Conclusions: GenoLab M is a promising NGS platform for high-performance WGS and WES applications. For WGS, 22X depth on the GenoLab M sequencing platform offers a cost-effective alternative to the current mainstream 33X depth on Illumina.

15
Samlali K, Thornbury M, Venter A. Community-led risk analysis of direct-to-consumer whole-genome sequencing. Biochem Cell Biol 2022; 100:499-509. [PMID: 35939839 DOI: 10.1139/bcb-2021-0506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/14/2022] Open
Abstract
Direct-to-consumer (DTC) genetic testing is cheaper and more accessible than ever before; however, the intention to combine, reuse, and resell this genetic information as powerful data sets is generally hidden from the consumer. This financial gain is creating a competitive DTC market, reducing the price of whole-genome sequencing (WGS) to under 300 USD. Entering this transition from single-nucleotide polymorphism-based DTC testing to WGS DTC testing, individuals looking for access to their whole-genomic information face new privacy and security risks. Differences between WGS and other methods of consumer genetic tests are left unexplored by regulation, leading to the application of legal data anonymization methods on whole-genome data, and questionable consent methods. Large representative genomic data sets are important for research and improve the standard of medicine and personalized care. However, these data can also be used by market players, law enforcement, and governments for surveillance, population analyses, marketing purposes, and discrimination. Here, we present a summary of the state of WGS DTC genetic testing and its current regulation, through a community-based lens to expose dual-use risks in consumer-facing biotechnologies.
Affiliation(s)
- Kenza Samlali
  - BricoBio Community Biology Lab, Montréal, QC, Canada
  - Centre for Applied Synthetic Biology, Concordia University, Montréal, QC, Canada
  - Department of Electrical and Computer Engineering, Concordia University, Montréal, QC, Canada
- Mackenzie Thornbury
  - BricoBio Community Biology Lab, Montréal, QC, Canada
  - Centre for Applied Synthetic Biology, Concordia University, Montréal, QC, Canada
  - Department of Biology, Concordia University, Montréal, QC, Canada
- Andrei Venter
  - BricoBio Community Biology Lab, Montréal, QC, Canada

16
Meslier V, Quinquis B, Da Silva K, Plaza Oñate F, Pons N, Roume H, Podar M, Almeida M. Benchmarking second and third-generation sequencing platforms for microbial metagenomics. Sci Data 2022; 9:694. [PMID: 36369227 PMCID: PMC9652401 DOI: 10.1038/s41597-022-01762-z] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Received: 06/02/2022] [Accepted: 10/04/2022] [Indexed: 11/13/2022] Open
Abstract
Shotgun metagenomic sequencing is a common approach for studying the taxonomic diversity and metabolic potential of complex microbial communities. Current methods primarily use second generation short read sequencing, yet advances in third generation long read technologies provide opportunities to overcome some of the limitations of short read sequencing. Here, we compared seven platforms, encompassing second generation sequencers (Illumina HiSeq 3000, MGI DNBSEQ-G400 and DNBSEQ-T7, ThermoFisher Ion GeneStudio S5 and Ion Proton P1) and third generation sequencers (Oxford Nanopore Technologies MinION R9 and Pacific Biosciences Sequel II). We constructed three uneven synthetic microbial communities composed of genomic DNA from up to 87 microbial strains per mock, spanning 29 bacterial and archaeal phyla, and representing the most complex and diverse synthetic communities used for sequencing technology comparisons. Our results demonstrate that third generation sequencers have advantages over second generation platforms in analyzing complex microbial communities, but require careful sequencing library preparation for optimal quantitative metagenomic analysis. Our sequencing data also provide a valuable resource for testing and benchmarking bioinformatics software for metagenomics.
Affiliation(s)
- Victoria Meslier
  - Université Paris-Saclay, INRAE, MetaGenoPolis, 78350, Jouy-en-Josas, France
- Benoit Quinquis
  - Université Paris-Saclay, INRAE, MetaGenoPolis, 78350, Jouy-en-Josas, France
- Kévin Da Silva
  - Université Paris-Saclay, INRAE, MetaGenoPolis, 78350, Jouy-en-Josas, France
- Nicolas Pons
  - Université Paris-Saclay, INRAE, MetaGenoPolis, 78350, Jouy-en-Josas, France
- Hugo Roume
  - Université Paris-Saclay, INRAE, MetaGenoPolis, 78350, Jouy-en-Josas, France
- Mircea Podar
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
- Mathieu Almeida
  - Université Paris-Saclay, INRAE, MetaGenoPolis, 78350, Jouy-en-Josas, France

17
Genome sequencing data of extended-spectrum beta-lactamase-producing Escherichia coli INF191/17/A isolates of nosocomial infection. Data Brief 2022; 43:108407. [PMID: 35799858 PMCID: PMC9253457 DOI: 10.1016/j.dib.2022.108407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2021] [Revised: 06/10/2022] [Accepted: 06/17/2022] [Indexed: 11/23/2022] Open
Abstract
Infection with extended-spectrum beta-lactamase (ESBL)-producing Escherichia coli is associated with higher mortality, longer hospital stays, and increased costs compared to infection with antibiotic-susceptible E. coli. Here, the draft genome of an ESBL-producing E. coli strain circulating at a local hospital is reported. The strain was found to carry the antibiotic resistance genes TEM, CTX-M-1, and CTX-M-9. The 5,136,548-bp genome, with a GC content of 50.59%, comprised 4987 protein-coding genes, four ribosomal RNAs, and 66 transfer RNAs. ResFinder predicted fourteen antimicrobial genes in the E. coli INF191/17/A genome. Sequence data have been deposited in the GenBank database under the accession number JAIEXV000000000. The BioProject ID in the GenBank database is PRJNA752944. The raw data were generated on an Illumina MiSeq and submitted to the NCBI SRA database (SRX11797310), where they are publicly available.

18
Naval-Sanchez M, Deshpande N, Tran M, Zhang J, Alhomrani M, Alsanie W, Nguyen Q, Nefzger CM. Benchmarking of ATAC Sequencing Data From BGI's Low-Cost DNBSEQ-G400 Instrument for Identification of Open and Occupied Chromatin Regions. Front Mol Biosci 2022; 9:900323. [PMID: 35874611 PMCID: PMC9302965 DOI: 10.3389/fmolb.2022.900323] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/20/2022] [Accepted: 05/18/2022] [Indexed: 11/17/2022] Open
Abstract
Background: Chromatin falls into one of two major subtypes: closed heterochromatin, and euchromatin, which is accessible, transcriptionally active, and occupied by transcription factors (TFs). The most widely used approach to interrogate differences in the chromatin state landscape is the Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq). While library generation is relatively inexpensive, sequencing depth requirements can make this assay cost-prohibitive for some laboratories. Findings: Here, we benchmark data from Beijing Genomics Institute's (BGI) DNBSEQ-G400 low-cost sequencer against data from a standard Illumina instrument (HiSeqX10). For comparisons, the same bulk ATAC-seq libraries generated from pluripotent stem cells (PSCs) and fibroblasts were sequenced on both platforms. Both instruments generate sequencing reads with comparable mapping rates and genomic context. However, DNBSEQ-G400 data contained a significantly higher number of small, sub-nucleosomal reads (>30% increase) and a reduced number of bi-nucleosomal reads (>75% decrease), which resulted in narrower peak bases and improved peak calling, enabling the identification of 4% more differentially accessible regions between PSCs and fibroblasts. The ability to identify master TFs that underpin the PSC state relative to fibroblasts (via HOMER, HINT-ATAC, TOBIAS), namely, foot-printing capacity, was highly similar between data generated on both platforms. Integrative analysis with transcriptional data equally enabled direct recovery of three published 3-factor combinations that have been shown to induce pluripotency. Conclusion: Other than a small increase in peak calling sensitivity for DNBSEQ-G400 data (BGI), both platforms enable comparable levels of open chromatin identification for ATAC-seq library sequencing, yielding similar analytical outcomes, albeit at lower data generation costs in the case of the BGI instrument.
Affiliation(s)
- Marina Naval-Sanchez
  - Institute for Molecular Bioscience, University of Queensland, St Lucia, QLD, Australia
- Nikita Deshpande
  - Institute for Molecular Bioscience, University of Queensland, St Lucia, QLD, Australia
- Minh Tran
  - Institute for Molecular Bioscience, University of Queensland, St Lucia, QLD, Australia
- Jingyu Zhang
  - Institute for Molecular Bioscience, University of Queensland, St Lucia, QLD, Australia
- Majid Alhomrani
  - Department of Clinical Laboratories Sciences, Faculty of Applied Medical Sciences, Taif University, Taif, Saudi Arabia
  - Centre of Biomedical Sciences Research (CBSR), Deanship of Scientific Research, Taif University, Taif, Saudi Arabia
- Walaa Alsanie
  - Department of Clinical Laboratories Sciences, Faculty of Applied Medical Sciences, Taif University, Taif, Saudi Arabia
  - Centre of Biomedical Sciences Research (CBSR), Deanship of Scientific Research, Taif University, Taif, Saudi Arabia
- Quan Nguyen
  - Institute for Molecular Bioscience, University of Queensland, St Lucia, QLD, Australia
- Christian M. Nefzger
  - Institute for Molecular Bioscience, University of Queensland, St Lucia, QLD, Australia

19
Goussarov G, Mysara M, Vandamme P, Van Houdt R. Introduction to the principles and methods underlying the recovery of metagenome-assembled genomes from metagenomic data. Microbiologyopen 2022; 11:e1298. [PMID: 35765182 PMCID: PMC9179125 DOI: 10.1002/mbo3.1298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2022] [Revised: 05/19/2022] [Accepted: 05/19/2022] [Indexed: 11/18/2022] Open
Abstract
The rise of metagenomics offers a leap forward for understanding the genetic diversity of microorganisms in many different complex environments by providing a platform that can identify potentially unlimited numbers of known and novel microorganisms. As such, it is impossible to imagine new major initiatives without metagenomics. Nevertheless, it represents a relatively new discipline with various levels of complexity and demands on bioinformatics. The underlying principles and methods used in metagenomics are often seen as common knowledge and often not detailed or fragmented. Therefore, we reviewed these to guide microbiologists in taking the first steps into metagenomics. We specifically focus on a workflow aimed at reconstructing individual genomes, that is, metagenome-assembled genomes, integrating DNA sequencing, assembly, binning, identification and annotation.
Affiliation(s)
- Gleb Goussarov
  - Microbiology Unit, Belgian Nuclear Research Centre (SCK CEN), Mol, Belgium
  - Laboratory of Microbiology and BCCM/LMG Bacteria Collection, Faculty of Sciences, Ghent University, Ghent, Belgium
- Mohamed Mysara
  - Microbiology Unit, Belgian Nuclear Research Centre (SCK CEN), Mol, Belgium
- Peter Vandamme
  - Laboratory of Microbiology and BCCM/LMG Bacteria Collection, Faculty of Sciences, Ghent University, Ghent, Belgium
- Rob Van Houdt
  - Microbiology Unit, Belgian Nuclear Research Centre (SCK CEN), Mol, Belgium

20
Pradipta A, Kumaheri MA, Wahyudi LD, Susanto AP, Agasi HI, Shankar AH, Sudarmono P. Accelerating Detection of Variants During COVID-19 Surges by Diverse Technological and Public Health Partnerships: A Case Study From Indonesia. Front Genet 2022; 13:801332. [PMID: 35154274 PMCID: PMC8831855 DOI: 10.3389/fgene.2022.801332] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/03/2021] [Accepted: 01/04/2022] [Indexed: 11/24/2022] Open
Abstract
Early detection of Severe Acute Respiratory Syndrome Corona Virus 2 (SARS-CoV-2) variants and use of data for public health action requires a coordinated, rapid, and high throughput approach to whole genome sequencing (WGS). Currently, WGS output from many low- and middle-income countries (LMIC) has lagged. By fostering diverse partnerships and multiple sequencing technologies, Indonesia accelerated SARS-CoV-2 WGS uploads to GISAID from 1,210 in April 2021 to 5,791 in August 2021, an increase from 11 submissions per day between January to May, to 43 per day between June to August. Turn-around-time from specimen collection to submission decreased from 77 to 5 days, allowing for timely public health decisions. These changes were enabled by establishment of the National Genomic Surveillance Consortium, coordination between public and private sector laboratories with WGS capability, and diversification of sequencing platform technologies. Here we present how diversification on multiple levels enabled a rapid and significant increase of national WGS performance, with potentially valuable lessons for other LMICs.
Affiliation(s)
- Ariel Pradipta
  - Genomik Solidaritas Indonesia (GSI) Lab, Jakarta, Indonesia
  - Indonesia Medical Education and Research Institute, Faculty of Medicine Universitas Indonesia, Jakarta, Indonesia
  - *Correspondence: Ariel Pradipta
- Anindya Pradipta Susanto
  - Genomik Solidaritas Indonesia (GSI) Lab, Jakarta, Indonesia
  - Indonesia Medical Education and Research Institute, Faculty of Medicine Universitas Indonesia, Jakarta, Indonesia
- Anuraj H. Shankar
  - Genomik Solidaritas Indonesia (GSI) Lab, Jakarta, Indonesia
  - Eijkman-Oxford Clinical Research Unit, Jakarta, Centre for Tropical Medicine and Global Health, University of Oxford, Oxford, United Kingdom
- Pratiwi Sudarmono
  - Indonesia Medical Education and Research Institute, Faculty of Medicine Universitas Indonesia, Jakarta, Indonesia
  - Indonesian Society for Clinical Microbiology, Tangerang, Indonesia

21
Liu Y, Han R, Zhou L, Luo M, Zeng L, Zhao X, Ma Y, Zhou Z, Sun L. Comparative performance of the GenoLab M and NovaSeq 6000 sequencing platforms for transcriptome and LncRNA analysis. BMC Genomics 2021; 22:829. [PMID: 34789158 PMCID: PMC8600837 DOI: 10.1186/s12864-021-08150-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/13/2021] [Accepted: 11/03/2021] [Indexed: 01/19/2023] Open
Abstract
Background: GenoLab M is a recently established next-generation sequencing platform from GeneMind Biosciences. Presently, Illumina sequencers are the globally leading sequencing platform in the next-generation sequencing market. Here, we present the first report to compare the transcriptome and LncRNA sequencing data of the GenoLab M sequencer to the NovaSeq 6000 platform in various types of analysis. Results: We tested 16 libraries in three species using various library kits from different companies. We compared the data quality, gene expression, alternatively spliced (AS) events, single nucleotide polymorphisms (SNPs), and insertions–deletions (InDels) between the two sequencing platforms. The data suggested that the platforms have comparable sensitivity and accuracy in terms of quantification of gene expression levels with technical compatibility. Conclusions: GenoLab M is a promising next-generation sequencing platform for transcriptomics and LncRNA studies with high performance at low costs.
Affiliation(s)
- Yongfeng Liu
  - GeneMind Biosciences Company Limited, ShenZhen, China
- Ran Han
  - Beijing Guoke Biotechnology Co., LTD, Beijing, China
- Letian Zhou
  - GeneMind Biosciences Company Limited, ShenZhen, China
- Mingjie Luo
  - GeneMind Biosciences Company Limited, ShenZhen, China
- Lidong Zeng
  - GeneMind Biosciences Company Limited, ShenZhen, China
- Xiaochao Zhao
  - GeneMind Biosciences Company Limited, ShenZhen, China
- Yukun Ma
  - Beijing Guoke Biotechnology Co., LTD, Beijing, China
- Zhiliang Zhou
  - GeneMind Biosciences Company Limited, ShenZhen, China
- Lei Sun
  - GeneMind Biosciences Company Limited, ShenZhen, China

22
Anslan S, Mikryukov V, Armolaitis K, Ankuda J, Lazdina D, Makovskis K, Vesterdal L, Schmidt IK, Tedersoo L. Highly comparable metabarcoding results from MGI-Tech and Illumina sequencing platforms. PeerJ 2021; 9:e12254. [PMID: 34703674 PMCID: PMC8491618 DOI: 10.7717/peerj.12254] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 07/02/2021] [Accepted: 09/14/2021] [Indexed: 01/04/2023] Open
Abstract
With the developments in DNA nanoball sequencing technologies and the emergence of new platforms, there is an increasing interest in their performance in comparison with the widely used sequencing-by-synthesis methods. Here, we test the consistency of metabarcoding results from DNBSEQ-G400RS (DNA nanoball sequencing platform by MGI-Tech) and NovaSeq 6000 (sequencing-by-synthesis platform by Illumina) platforms using technical replicates of DNA libraries that consist of COI gene amplicons from 120 soil DNA samples. By subjecting raw sequencing data from both platforms to a uniform bioinformatics processing, we found that the proportion of high-quality reads passing through the filtering steps was similar in both datasets. Per-sample operational taxonomic unit (OTU) and amplicon sequence variant (ASV) richness patterns were highly correlated, but sequencing data from DNBSEQ-G400RS harbored a higher number of OTUs. This may be related to the lower dominance of most common OTUs in DNBSEQ data set (thus revealing higher richness by detecting rare taxa) and/or to a lower effective read quality leading to generation of spurious OTUs. However, there was no statistical difference in the ASV and post-clustered ASV richness between platforms, suggesting that additional denoising step in the ASV workflow had effectively removed the 'noisy' reads. Both OTU-based and ASV-based composition were strongly correlated between the sequencing platforms, with essentially interchangeable results. Therefore, we conclude that DNBSEQ-G400RS and NovaSeq 6000 are both equally efficient high-throughput sequencing platforms to be utilized in studies aiming to apply the metabarcoding approach, but the main benefit of the former is related to lower sequencing cost.
Affiliation(s)
- Sten Anslan
  - Institute of Ecology and Earth Sciences, University of Tartu, Tartu, Tartumaa, Estonia
  - Mycology and Microbiology Center, University of Tartu, Tartu, Tartumaa, Estonia
- Vladimir Mikryukov
  - Institute of Ecology and Earth Sciences, University of Tartu, Tartu, Tartumaa, Estonia
  - Mycology and Microbiology Center, University of Tartu, Tartu, Tartumaa, Estonia
- Kęstutis Armolaitis
  - Department of Ecology, Institute of Forestry of Lithuanian Research Centre for Agriculture and Forestry (LAMMC), Kaunas, Lithuania
- Jelena Ankuda
  - Department of Ecology, Institute of Forestry of Lithuanian Research Centre for Agriculture and Forestry (LAMMC), Kaunas, Lithuania
- Dagnija Lazdina
  - Latvian State Forest Research Institute SILAVA, Riga, Latvia
- Lars Vesterdal
  - Department of Geosciences and Natural Resource Management, University of Copenhagen, Copenhagen, Denmark
- Inger Kappel Schmidt
  - Department of Geosciences and Natural Resource Management, University of Copenhagen, Copenhagen, Denmark
- Leho Tedersoo
  - Institute of Ecology and Earth Sciences, University of Tartu, Tartu, Tartumaa, Estonia
  - Mycology and Microbiology Center, University of Tartu, Tartu, Tartumaa, Estonia

23
Kim HM, Jeon S, Chung O, Jun JH, Kim HS, Blazyte A, Lee HY, Yu Y, Cho YS, Bolser DM, Bhak J. Comparative analysis of 7 short-read sequencing platforms using the Korean Reference Genome: MGI and Illumina sequencing benchmark for whole-genome sequencing. Gigascience 2021; 10:giab014. [PMID: 33710328 PMCID: PMC7953489 DOI: 10.1093/gigascience/giab014] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 03/12/2020] [Revised: 09/03/2020] [Accepted: 02/16/2021] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND: DNBSEQ-T7 is a new whole-genome sequencer developed by Complete Genomics and MGI that uses DNA nanoball and combinatorial probe anchor synthesis technologies to generate short reads at a very large scale (up to 60 human genomes per day). However, it has not been objectively and systematically compared against Illumina short-read sequencers. FINDINGS: Using the same KOREF sample, the Korean Reference Genome, we compared 7 sequencing platforms: BGISEQ-500, DNBSEQ-T7, HiSeq2000, HiSeq2500, HiSeq4000, HiSeqX10, and NovaSeq6000. We measured sequencing quality by comparing sequencing statistics (base quality, duplication rate, and random error rate), mapping statistics (mapping rate, depth distribution, and percent GC coverage), and variant statistics (transition/transversion ratio, dbSNP annotation rate, and concordance rate with single-nucleotide polymorphism [SNP] genotyping chip) across the 7 sequencing platforms. We found that MGI platforms showed a higher concordance rate for SNP genotyping than HiSeq2000 and HiSeq4000. The similarity matrix of variant calls confirmed that the 2 MGI platforms have the most similar characteristics to the HiSeq2500 platform. CONCLUSIONS: Overall, MGI and Illumina sequencing platforms showed comparable levels of sequencing quality, uniformity of coverage, percent GC coverage, and variant accuracy; thus, we conclude that the MGI platforms can be used for a wide range of genomics research fields at a lower cost than the Illumina platforms.
Affiliation(s)
- Hak-Min Kim
  - Clinomics Inc., Ulsan National Institute of Science and Technology (UNIST), UNIST-gil 50, Eonyang-eup, Ulju-gun, Ulsan, 44919, Republic of Korea
- Sungwon Jeon
  - Korean Genomics Center (KOGIC), Ulsan National Institute of Science and Technology (UNIST), UNIST-gil 50, Eonyang-eup, Ulju-gun, Ulsan, 44919, Republic of Korea
  - Department of Biomedical Engineering, School of Life Sciences, Ulsan National Institute of Science and Technology (UNIST), UNIST-gil 50, Eonyang-eup, Ulju-gun, Ulsan, 44919, Republic of Korea
- Oksung Chung
  - Clinomics Inc., Ulsan National Institute of Science and Technology (UNIST), UNIST-gil 50, Eonyang-eup, Ulju-gun, Ulsan, 44919, Republic of Korea
- Je Hoon Jun
  - Clinomics Inc., Ulsan National Institute of Science and Technology (UNIST), UNIST-gil 50, Eonyang-eup, Ulju-gun, Ulsan, 44919, Republic of Korea
- Hui-Su Kim
  - Korean Genomics Center (KOGIC), Ulsan National Institute of Science and Technology (UNIST), UNIST-gil 50, Eonyang-eup, Ulju-gun, Ulsan, 44919, Republic of Korea
- Asta Blazyte
  - Korean Genomics Center (KOGIC), Ulsan National Institute of Science and Technology (UNIST), UNIST-gil 50, Eonyang-eup, Ulju-gun, Ulsan, 44919, Republic of Korea
  - Department of Biomedical Engineering, School of Life Sciences, Ulsan National Institute of Science and Technology (UNIST), UNIST-gil 50, Eonyang-eup, Ulju-gun, Ulsan, 44919, Republic of Korea
- Hwang-Yeol Lee
  - Clinomics Inc., Ulsan National Institute of Science and Technology (UNIST), UNIST-gil 50, Eonyang-eup, Ulju-gun, Ulsan, 44919, Republic of Korea
- Youngseok Yu
  - Clinomics Inc., Ulsan National Institute of Science and Technology (UNIST), UNIST-gil 50, Eonyang-eup, Ulju-gun, Ulsan, 44919, Republic of Korea
- Yun Sung Cho
  - Clinomics Inc., Ulsan National Institute of Science and Technology (UNIST), UNIST-gil 50, Eonyang-eup, Ulju-gun, Ulsan, 44919, Republic of Korea
- Dan M Bolser
  - Geromics Ltd., 222 Mill Road, Cambridge, CB1 3NF, United Kingdom
- Jong Bhak
  - Clinomics Inc., Ulsan National Institute of Science and Technology (UNIST), UNIST-gil 50, Eonyang-eup, Ulju-gun, Ulsan, 44919, Republic of Korea
  - Korean Genomics Center (KOGIC), Ulsan National Institute of Science and Technology (UNIST), UNIST-gil 50, Eonyang-eup, Ulju-gun, Ulsan, 44919, Republic of Korea
  - Department of Biomedical Engineering, School of Life Sciences, Ulsan National Institute of Science and Technology (UNIST), UNIST-gil 50, Eonyang-eup, Ulju-gun, Ulsan, 44919, Republic of Korea
  - Geromics Ltd., 222 Mill Road, Cambridge, CB1 3NF, United Kingdom
  - Personal Genomics Institute (PGI), Genome Research Foundation, Osong saengmyong1ro, Cheongju, 28160, Republic of Korea
|