Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Escalona M, Rocha S, Posada D. A comparison of tools for the simulation of genomic next-generation sequencing data. Nat Rev Genet 2016;17:459-69. [PMID: 27320129 PMCID: PMC5224698 DOI: 10.1038/nrg.2016.57] [Citation(s) in RCA: 102] [Impact Index Per Article: 12.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]

Cited by Other Article(s)

Boulton W, Fidan FR, Denise H, De Maio N, Goldman N. SWAMPy: simulating SARS-CoV-2 wastewater amplicon metagenomes. BIOINFORMATICS (OXFORD, ENGLAND) 2024;40:btae532. [PMID: 39226177 PMCID: PMC11401744 DOI: 10.1093/bioinformatics/btae532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/25/2023] [Revised: 06/19/2024] [Accepted: 08/31/2024] [Indexed: 09/05/2024]

Xu Y, Liu D, Han P, Wang H, Wang S, Gao J, Chen F, Zhou X, Deng K, Luo J, Zhou M, Kuang D, Yang F, Jiang Z, Xu S, Rao G, Wang Y, Qu J. Rapid inference of antibiotic resistance and susceptibility for Klebsiella pneumoniae by clinical shotgun metagenomic sequencing. Int J Antimicrob Agents 2024;64:107252. [PMID: 38908534 DOI: 10.1016/j.ijantimicag.2024.107252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2024] [Revised: 05/14/2024] [Accepted: 06/07/2024] [Indexed: 06/24/2024]

Abstract

OBJECTIVES

The study aimed to develop a genotypic antimicrobial resistance testing method for Klebsiella pneumoniae using metagenomic sequencing data.

METHODS

We used Lasso regression on assembled genomes to identify genetic resistance determinants for six antibiotics (Gentamicin, Tobramycin, Imipenem, Meropenem, Ceftazidime, Trimethoprim/Sulfamethoxazole). The genetic features were weighted and grouped into clusters to establish classifier models. The species of origin of each detected antibiotic resistance gene (ARG) was determined by a novel strategy integrating "possible species," "gene copy number calculation," and "species-specific kmers." The performance of the method was evaluated on retrospective case studies.

RESULTS

Our study employed machine learning on 3928 K. pneumoniae isolates, yielding stable models with AUCs > 0.9 for various antibiotics. GenseqAMR, a read-based software, exhibited high accuracy (AUC 0.926-0.956) for short-read datasets. The integration of a species-specific kmer strategy significantly improved ARG-species attribution to an average accuracy of 96.67%. In a retrospective study of 191 K. pneumoniae-positive clinical specimens (0.68-93.39% genome coverage), GenseqAMR predicted 84.23% of AST results on average. It demonstrated 88.76-96.26% accuracy for resistance prediction, offering genotypic AST results with a shorter turnaround time (mean ± SD: 18.34 ± 0.87 hours) than traditional culture-based AST (60.15 ± 21.58 hours). Furthermore, a retrospective clinical case study involving 63 cases showed that GenseqAMR could lead to changes in clinical treatment for 24 (38.10%) cases, with 95.83% (23/24) of these changes deemed beneficial.

CONCLUSIONS

In conclusion, GenseqAMR is a promising tool for quick and accurate AMR prediction in Klebsiella pneumoniae, with the potential to improve patient outcomes through timely adjustments in antibiotic treatment.

Affiliation(s)

Yanping Xu Department of Pulmonary and Critical Care Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China; Institute of Respiratory Diseases, Shanghai Jiao Tong University School of Medicine, Shanghai, China; Shanghai Key Laboratory of Emergency Prevention, Diagnosis and Treatment of Respiratory Infectious Diseases, Shanghai, China
Donglai Liu National Institutes for Food and Drug Control, Beijing, China
Peng Han Genskey Medical Technology Co., Ltd, Beijing, China
Hao Wang National Institutes for Food and Drug Control, Beijing, China
Shanmei Wang Henan Provincial People's Hospital, People's Hospital of Zhengzhou University, Zhengzhou, Henan, China
Jianpeng Gao Genskey Medical Technology Co., Ltd, Beijing, China
Fangyuan Chen Genskey Medical Technology Co., Ltd, Beijing, China
Xun Zhou Institute of Antibiotics, Huashan Hospital, Fudan University, Shanghai, China; Key Laboratory of Clinical Pharmacology of Antibiotics, Ministry of Health, Shanghai, China
Kun Deng Department of Laboratory Medicine, The Third Affiliated Hospital of Chongqing Medical University, Chongqing, China
Jiajie Luo Genskey Medical Technology Co., Ltd, Beijing, China
Min Zhou Department of Pulmonary and Critical Care Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China; Institute of Respiratory Diseases, Shanghai Jiao Tong University School of Medicine, Shanghai, China; Shanghai Key Laboratory of Emergency Prevention, Diagnosis and Treatment of Respiratory Infectious Diseases, Shanghai, China
Dai Kuang Department of Pulmonary and Critical Care Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China; Institute of Respiratory Diseases, Shanghai Jiao Tong University School of Medicine, Shanghai, China; Shanghai Key Laboratory of Emergency Prevention, Diagnosis and Treatment of Respiratory Infectious Diseases, Shanghai, China
Fan Yang Institute of Antibiotics, Huashan Hospital, Fudan University, Shanghai, China; Key Laboratory of Clinical Pharmacology of Antibiotics, Ministry of Health, Shanghai, China
Zhi Jiang Genskey Medical Technology Co., Ltd, Beijing, China
Sihong Xu National Institutes for Food and Drug Control, Beijing, China.
Guanhua Rao Genskey Medical Technology Co., Ltd, Beijing, China.
Youchun Wang National Institutes for Food and Drug Control, Beijing, China.
Jieming Qu Department of Pulmonary and Critical Care Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China; Institute of Respiratory Diseases, Shanghai Jiao Tong University School of Medicine, Shanghai, China; Shanghai Key Laboratory of Emergency Prevention, Diagnosis and Treatment of Respiratory Infectious Diseases, Shanghai, China.

Lai J, Yang Y, Liu Y, Scharpf RB, Karchin R. Assessing the merits: an opinion on the effectiveness of simulation techniques in tumor subclonal reconstruction. BIOINFORMATICS ADVANCES 2024;4:vbae094. [PMID: 38948008 PMCID: PMC11213631 DOI: 10.1093/bioadv/vbae094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 04/12/2024] [Revised: 05/28/2024] [Accepted: 06/15/2024] [Indexed: 07/02/2024]

Gamaarachchi H, Ferguson JM, Samarakoon H, Liyanage K, Deveson IW. Simulation of nanopore sequencing signal data with tunable parameters. Genome Res 2024;34:778-783. [PMID: 38692839 PMCID: PMC11216307 DOI: 10.1101/gr.278730.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2023] [Accepted: 04/24/2024] [Indexed: 05/03/2024]

Affiliation(s)

Hasindu Gamaarachchi School of Computer Science and Engineering, University of New South Wales, Sydney, New South Wales 2052, Australia; Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales 2010, Australia; Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, New South Wales 2010, Australia
James M Ferguson Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales 2010, Australia; Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, New South Wales 2010, Australia
Hiruna Samarakoon School of Computer Science and Engineering, University of New South Wales, Sydney, New South Wales 2052, Australia; Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales 2010, Australia; Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, New South Wales 2010, Australia
Kisaru Liyanage School of Computer Science and Engineering, University of New South Wales, Sydney, New South Wales 2052, Australia; Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales 2010, Australia; Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, New South Wales 2010, Australia
Ira W Deveson Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales 2010, Australia; Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, New South Wales 2010, Australia; St Vincent's Clinical School, Faculty of Medicine, University of New South Wales, Sydney, New South Wales 2052, Australia

Popitsch N, Neumann T, von Haeseler A, Ameres SL. Splice_sim: a nucleotide conversion-enabled RNA-seq simulation and evaluation framework. Genome Biol 2024;25:166. [PMID: 38918865 DOI: 10.1186/s13059-024-03313-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2023] [Accepted: 06/17/2024] [Indexed: 06/27/2024] Open

Yu M, Tang X, Li Z, Wang W, Wang S, Li M, Yu Q, Xie S, Zuo X, Chen C. High-throughput DNA synthesis for data storage. Chem Soc Rev 2024;53:4463-4489. [PMID: 38498347 DOI: 10.1039/d3cs00469d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/20/2024]

Affiliation(s)

Meng Yu Institute of Medical Chips, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, 200025, Shanghai, China; School of Microelectronics, Shanghai University, 201800, Shanghai, China; Shanghai Industrial μTechnology Research Institute, 201800, Shanghai, China
Xiaohui Tang Institute of Medical Chips, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, 200025, Shanghai, China; Shanghai Industrial μTechnology Research Institute, 201800, Shanghai, China
Zhenhua Li Institute of Medical Chips, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, 200025, Shanghai, China; Shanghai Industrial μTechnology Research Institute, 201800, Shanghai, China
Weidong Wang Shanghai Industrial μTechnology Research Institute, 201800, Shanghai, China
Shaopeng Wang Institute of Molecular Medicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, 200127, Shanghai, China
Min Li Institute of Molecular Medicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, 200127, Shanghai, China
Qiuliyang Yu Shenzhen Key Laboratory for the Intelligent Microbial Manufacturing of Medicines, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, 518055, Shenzhen, China
Sijia Xie Institute of Medical Chips, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, 200025, Shanghai, China; School of Microelectronics, Shanghai University, 201800, Shanghai, China; Shanghai Industrial μTechnology Research Institute, 201800, Shanghai, China
Xiaolei Zuo Institute of Molecular Medicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, 200127, Shanghai, China
Chang Chen Institute of Medical Chips, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, 200025, Shanghai, China; School of Microelectronics, Shanghai University, 201800, Shanghai, China; Shanghai Industrial μTechnology Research Institute, 201800, Shanghai, China; State Key Laboratory of Transducer Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, 200050, Shanghai, China

Brooks TG, Lahens NF, Mrčela A, Grant GR. Challenges and best practices in omics benchmarking. Nat Rev Genet 2024;25:326-339. [PMID: 38216661 DOI: 10.1038/s41576-023-00679-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 11/14/2023] [Indexed: 01/14/2024]

Hui X, Yang J, Sun J, Liu F, Pan W. MCSS: microbial community simulator based on structure. Front Microbiol 2024;15:1358257. [PMID: 38516019 PMCID: PMC10956353 DOI: 10.3389/fmicb.2024.1358257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2023] [Accepted: 02/20/2024] [Indexed: 03/23/2024] Open

Lai J, Liu Y, Scharpf RB, Karchin R. Evaluation of simulation methods for tumor subclonal reconstruction. ARXIV 2024:arXiv:2402.09599v1. [PMID: 38410652 PMCID: PMC10896360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Subscribe] [Scholar Register] [Indexed: 02/28/2024]

Joeres M, Maksimov P, Höper D, Calvelage S, Calero-Bernal R, Fernández-Escobar M, Koudela B, Blaga R, Vrhovec MG, Stollberg K, Bier N, Sotiraki S, Sroka J, Piotrowska W, Kodym P, Basso W, Conraths FJ, Mercier A, Galal L, Dardé ML, Balea A, Spano F, Schulze C, Peters M, Scuda N, Lundén A, Davidson RK, Terland R, Waap H, de Bruin E, Vatta P, Caccio S, Ortega-Mora LM, Jokelainen P, Schares G. Genotyping of European Toxoplasma gondii strains by a new high-resolution next-generation sequencing-based method. Eur J Clin Microbiol Infect Dis 2024;43:355-371. [PMID: 38099986 PMCID: PMC10822014 DOI: 10.1007/s10096-023-04721-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2023] [Accepted: 11/16/2023] [Indexed: 01/28/2024]

Affiliation(s)

M Joeres Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Institute of Epidemiology, Greifswald - Insel Riems, Germany
P Maksimov Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Institute of Epidemiology, Greifswald - Insel Riems, Germany
D Höper Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Institute of Diagnostic Virology, Greifswald - Insel Riems, Germany
S Calvelage Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Institute of Diagnostic Virology, Greifswald - Insel Riems, Germany
R Calero-Bernal SALUVET, Animal Health Department, Faculty of Veterinary Sciences, Complutense University of Madrid, Madrid, Spain
M Fernández-Escobar SALUVET, Animal Health Department, Faculty of Veterinary Sciences, Complutense University of Madrid, Madrid, Spain
B Koudela Central European Institute of Technology (CEITEC), University of Veterinary Sciences Brno, Brno, Czech Republic; Faculty of Veterinary Medicine, University of Veterinary Sciences Brno, Brno, Czech Republic
R Blaga Anses, INRAE, Ecole Nationale Vétérinaire d'Alfort, Laboratoire de Santé Animale, BIPAR, Maisons-Alfort, France; University of Agricultural Sciences and Veterinary Medicine, Cluj-Napoca, Romania
M Globokar Vrhovec IDEXX Laboratories, Kornwestheim, Germany
K Stollberg German Federal Institute for Risk Assessment, Department for Biological Safety, Berlin, Germany
N Bier German Federal Institute for Risk Assessment, Department for Biological Safety, Berlin, Germany
S Sotiraki Veterinary Research Institute, Hellenic Agricultural Organisation-DIMITRA, Thessaloniki, Greece
J Sroka Department of Parasitology and Invasive Diseases, National Veterinary Research Institute, Pulawy, Poland
W Piotrowska Department of Parasitology and Invasive Diseases, National Veterinary Research Institute, Pulawy, Poland
P Kodym Centre of Epidemiology and Microbiology, National Institute of Public Health, Prague, Czech Republic
W Basso Institute of Parasitology, Vetsuisse Faculty, University of Bern, Bern, Switzerland
F J Conraths Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Institute of Epidemiology, Greifswald - Insel Riems, Germany
A Mercier Inserm U1094, IRD U270, Univ. Limoges, CHU Limoges, EpiMaCT - Epidemiology of chronic diseases in tropical zone, Institute of Epidemiology and Tropical Neurology, OmegaHealth, Limoges, France; Centre National de Référence (CNR) Toxoplasmose Centre Hospitalier-Universitaire Dupuytren, Limoges, France
L Galal Inserm U1094, IRD U270, Univ. Limoges, CHU Limoges, EpiMaCT - Epidemiology of chronic diseases in tropical zone, Institute of Epidemiology and Tropical Neurology, OmegaHealth, Limoges, France
M L Dardé Inserm U1094, IRD U270, Univ. Limoges, CHU Limoges, EpiMaCT - Epidemiology of chronic diseases in tropical zone, Institute of Epidemiology and Tropical Neurology, OmegaHealth, Limoges, France; Centre National de Référence (CNR) Toxoplasmose Centre Hospitalier-Universitaire Dupuytren, Limoges, France
A Balea University of Agricultural Sciences and Veterinary Medicine Cluj-Napoca, Faculty of Veterinary Medicine, Department of Parasitology and Parasitic Diseases, Cluj-Napoca, Romania
F Spano Italian National Institute of Health, Rome, Italy
C Schulze Landeslabor Berlin-Brandenburg, Frankfurt (Oder), Germany
M Peters Chemisches und Veterinäruntersuchungsamt Westfalen, Standort Arnsberg, Arnsberg, Germany
N Scuda Bavarian Health and Food Safety Authority, Erlangen, Germany
A Lundén Department of Microbiology, National Veterinary Institute, Uppsala, Sweden
R K Davidson Department of Animal Health, Welfare and Food Safety, Norwegian Veterinary Institute, Tromsø, Norway
R Terland Department of Analysis and Diagnostics, Norwegian Veterinary Institute, Ås, Norway
H Waap Parasitology Laboratory, Instituto Nacional de Investigação Agrária e Veterinária, Oeiras, Portugal
E de Bruin Dutch Wildlife Health Centre, Pathology Division, Department of Pathobiology, Faculty of Veterinary Medicine, University of Utrecht, Utrecht, The Netherlands
P Vatta Italian National Institute of Health, Rome, Italy
S Caccio Italian National Institute of Health, Rome, Italy
L M Ortega-Mora SALUVET, Animal Health Department, Faculty of Veterinary Sciences, Complutense University of Madrid, Madrid, Spain
P Jokelainen Infectious Disease Preparedness, Statens Serum Institut, Copenhagen, Denmark
G Schares Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Institute of Epidemiology, Greifswald - Insel Riems, Germany.

Mestre-Tomás J, Liu T, Pardo-Palacios F, Conesa A. SQANTI-SIM: a simulator of controlled transcript novelty for lrRNA-seq benchmark. Genome Biol 2023;24:286. [PMID: 38082294 PMCID: PMC10712166 DOI: 10.1186/s13059-023-03127-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2023] [Accepted: 11/27/2023] [Indexed: 12/18/2023] Open

Mwima R, Hui TYJ, Nanteza A, Burt A, Kayondo JK. Potential persistence mechanisms of the major Anopheles gambiae species complex malaria vectors in sub-Saharan Africa: a narrative review. Malar J 2023;22:336. [PMID: 37936194 PMCID: PMC10631165 DOI: 10.1186/s12936-023-04775-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2023] [Accepted: 10/30/2023] [Indexed: 11/09/2023] Open

Mesloub Y, Beury D, Vandermeeren F, Caboche S. CuReSim-LoRM: A Tool to Simulate Metabarcoding Long Reads. Int J Mol Sci 2023;24:14005. [PMID: 37762307 PMCID: PMC10531135 DOI: 10.3390/ijms241814005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2023] [Revised: 09/07/2023] [Accepted: 09/10/2023] [Indexed: 09/29/2023] Open

Mestre-Tomás J, Liu T, Pardo-Palacios F, Conesa A. SQANTI-SIM: a simulator of controlled transcript novelty for lrRNA-seq benchmark. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.08.23.554392. [PMID: 37662216 PMCID: PMC10473693 DOI: 10.1101/2023.08.23.554392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/05/2023]

Yelmen B, Jay F. An Overview of Deep Generative Models in Functional and Evolutionary Genomics. Annu Rev Biomed Data Sci 2023;6:173-189. [PMID: 37137168 DOI: 10.1146/annurev-biodatasci-020722-115651] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/05/2023]

Korfmann K, Gaggiotti OE, Fumagalli M. Deep Learning in Population Genetics. Genome Biol Evol 2023;15:evad008. [PMID: 36683406 PMCID: PMC9897193 DOI: 10.1093/gbe/evad008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2022] [Revised: 12/19/2022] [Accepted: 01/16/2023] [Indexed: 01/24/2023] Open

Performance evaluation of six popular short-read simulators. Heredity (Edinb) 2023;130:55-63. [PMID: 36496447 PMCID: PMC9905089 DOI: 10.1038/s41437-022-00577-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2022] [Revised: 11/10/2022] [Accepted: 11/11/2022] [Indexed: 12/14/2022] Open

Silva JM, Qi W, Pinho AJ, Pratas D. AlcoR: alignment-free simulation, mapping, and visualization of low-complexity regions in biological data. Gigascience 2022;12:giad101. [PMID: 38091509 PMCID: PMC10716826 DOI: 10.1093/gigascience/giad101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2023] [Revised: 09/29/2023] [Accepted: 11/07/2023] [Indexed: 12/18/2023] Open

Abstract

BACKGROUND

Low-complexity data analysis is the area that addresses the search and quantification of regions in sequences of elements that contain low-complexity or repetitive elements. For example, these can be tandem repeats, inverted repeats, homopolymer tails, GC-biased regions, similar genes, and hairpins, among many others. Identifying these regions is crucial because of their association with regulatory and structural characteristics. Moreover, their identification provides positional and quantity information where standard assembly methodologies face significant difficulties because of substantially higher depth coverage (mountains), ambiguous read mapping, or where sequencing or reconstruction defects may occur. However, the capability to distinguish low-complexity regions (LCRs) in genomic and proteomic sequences is a challenge that depends on the model's ability to find them automatically. Low-complexity patterns can be implicit through specific or combined sources, such as algorithmic or probabilistic, and recurring to different spatial distances, namely local, medium, or distant associations.

FINDINGS

This article addresses the challenge of automatically modeling and distinguishing LCRs, providing a new method and tool (AlcoR) for efficient and accurate segmentation and visualization of these regions in genomic and proteomic sequences. The method enables the use of models with different memories, providing the ability to distinguish local from distant low-complexity patterns. The method is reference and alignment free, providing additional methodologies for testing, including a highly flexible simulation method for generating biological sequences (DNA or protein) with different complexity levels, sequence masking, and a visualization tool for automatic computation of the LCR maps into an ideogram style. We provide illustrative demonstrations using synthetic, nearly synthetic, and natural sequences showing the high efficiency and accuracy of AlcoR. As large-scale results, we use AlcoR to unprecedentedly provide a whole-chromosome low-complexity map of a recent complete human genome and the haplotype-resolved chromosome pairs of a heterozygous diploid African cassava cultivar.

CONCLUSIONS

The AlcoR method provides the ability of fast sequence characterization through data complexity analysis, ideally for scenarios entangling the presence of new or unknown sequences. AlcoR is implemented in C language using multithreading to increase the computational speed, is flexible for multiple applications, and does not contain external dependencies. The tool accepts any sequence in FASTA format. The source code is freely provided at https://github.com/cobilab/alcor.

Shang J, Cai X, Zhang T, Sun Y, Zhang Y, Liu J, Guan B. EpiReSIM: A Resampling Method of Epistatic Model without Marginal Effects Using Under-Determined System of Equations. Genes (Basel) 2022;13:genes13122286. [PMID: 36553553 PMCID: PMC9777644 DOI: 10.3390/genes13122286] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2022] [Revised: 11/30/2022] [Accepted: 12/01/2022] [Indexed: 12/12/2022] Open

Ono Y, Hamada M, Asai K. PBSIM3: a simulator for all types of PacBio and ONT long reads. NAR Genom Bioinform 2022;4:lqac092. [PMID: 36465498 PMCID: PMC9713900 DOI: 10.1093/nargab/lqac092] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/10/2022] [Revised: 11/02/2022] [Accepted: 11/12/2022] [Indexed: 12/03/2022] Open

Genome sequence assembly algorithms and misassembly identification methods. Mol Biol Rep 2022;49:11133-11148. [PMID: 36151399 DOI: 10.1007/s11033-022-07919-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/20/2022] [Accepted: 09/05/2022] [Indexed: 10/14/2022]

Alser M, Lindegger J, Firtina C, Almadhoun N, Mao H, Singh G, Gomez-Luna J, Mutlu O. From molecules to genomic variations: Accelerating genome analysis via intelligent algorithms and architectures. Comput Struct Biotechnol J 2022;20:4579-4599. [PMID: 36090814 PMCID: PMC9436709 DOI: 10.1016/j.csbj.2022.08.019] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2022] [Revised: 08/08/2022] [Accepted: 08/08/2022] [Indexed: 02/01/2023] Open

Angaroni F, Guidi A, Ascolani G, d'Onofrio A, Antoniotti M, Graudenzi A. J-SPACE: a Julia package for the simulation of spatial models of cancer evolution and of sequencing experiments. BMC Bioinformatics 2022;23:269. [PMID: 35804300 PMCID: PMC9270769 DOI: 10.1186/s12859-022-04779-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2022] [Accepted: 06/09/2022] [Indexed: 11/15/2022] Open

Abstract

Background

The combined effects of biological variability and measurement-related errors on cancer sequencing data remain largely unexplored. However, the spatio-temporal simulation of multi-cellular systems provides a powerful instrument to address this issue. In particular, efficient algorithmic frameworks are needed to overcome the harsh trade-off between scalability and expressivity, so to allow one to simulate both realistic cancer evolution scenarios and the related sequencing experiments, which can then be used to benchmark downstream bioinformatics methods.

Result

We introduce a Julia package for SPAtial Cancer Evolution (J-SPACE), which allows one to model and simulate a broad set of experimental scenarios, phenomenological rules and sequencing settings. Specifically, J-SPACE simulates the spatial dynamics of cells as a continuous-time multi-type birth-death stochastic process on an arbitrary graph, employing different rules of interaction and an optimised Gillespie algorithm. The evolutionary dynamics of genomic alterations (single-nucleotide variants and indels) is simulated either under the Infinite Sites Assumption or several different substitution models, including one based on mutational signatures. After mimicking the spatial sampling of tumour cells, J-SPACE returns the related phylogenetic model, and allows one to generate synthetic reads from several Next-Generation Sequencing (NGS) platforms, via the ART read simulator. The results are finally returned in standard FASTA, FASTQ, SAM, ALN and Newick file formats.

Conclusion

J-SPACE is designed to efficiently simulate the heterogeneous behaviour of a large number of cancer cells and produces a rich set of outputs. Our framework is useful to investigate the emergent spatial dynamics of cancer subpopulations, as well as to assess the impact of incomplete sampling and of experiment-specific errors. Importantly, the output of J-SPACE is designed to allow the performance assessment of downstream bioinformatics pipelines processing NGS data. J-SPACE is freely available at: https://github.com/BIMIB-DISCo/J-Space.jl.
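
As a rough illustration of the kind of process the abstract above describes, the following is a minimal sketch of a Gillespie simulation of a birth-death process. It is not J-SPACE's implementation: it assumes a single cell type in a well-mixed population (no graph, no interaction rules, no mutations), and the function name `gillespie_birth_death` and all of its parameters are hypothetical names chosen for this example.

```python
import random


def gillespie_birth_death(n0, birth_rate, death_rate, t_max, seed=0):
    """Simulate a single-type, well-mixed birth-death process.

    Assumption: every cell divides at `birth_rate` and dies at `death_rate`
    independently; this ignores the spatial graph used by J-SPACE.
    Returns a list of (time, population_size) pairs.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    trajectory = [(t, n)]
    while n > 0:
        # Total event rate of the whole population.
        total_rate = n * (birth_rate + death_rate)
        if total_rate <= 0:
            break  # no event can occur
        # Waiting time to the next event is exponentially distributed.
        t += rng.expovariate(total_rate)
        if t > t_max:
            break
        # Choose the event type in proportion to its rate.
        if rng.random() < birth_rate / (birth_rate + death_rate):
            n += 1
        else:
            n -= 1
        trajectory.append((t, n))
    return trajectory


if __name__ == "__main__":
    traj = gillespie_birth_death(n0=10, birth_rate=1.0, death_rate=0.5, t_max=2.0)
    print(f"{len(traj) - 1} events, final population {traj[-1][1]}")
```

Each iteration draws one exponential waiting time and one event, so the cost grows with the number of events; a spatial, multi-type simulator would additionally track cell positions and per-type rates.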

van Waaij J, Li Z, Wiuf C. Estimation of the covariance structure from SNP allele frequencies. Stat Appl Genet Mol Biol 2022;21:sagmb-2022-0005. [DOI: 10.1515/sagmb-2022-0005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2022] [Accepted: 05/02/2022] [Indexed: 11/15/2022]

Abstract

We propose two new statistics, $\hat{V}$ and $\hat{S}$, to disentangle the population history of related populations from SNP frequency data. If the populations are related by a tree, we show by theoretical means as well as by simulation that the new statistics are able to identify the root of a tree correctly, in contrast to standard statistics, such as the observed matrix of $F_2$-statistics (distances between pairs of populations). The statistic $\hat{V}$ is obtained by averaging over all SNPs (similar to standard statistics). Its expectation is the true covariance matrix of the observed population SNP frequencies, offset by a matrix with identical entries. In contrast, the statistic $\hat{S}$ is put in a Bayesian context and is obtained by averaging over pairs of SNPs, such that each SNP is only used once. It thus makes use of the joint distribution of pairs of SNPs. In addition, we provide a number of novel mathematical results about old and new statistics, and their mutual relationship.

Feng X, Chen L. SCSilicon: a tool for synthetic single-cell DNA sequencing data generation. BMC Genomics 2022;23:359. [PMID: 35546390 PMCID: PMC9092674 DOI: 10.1186/s12864-022-08566-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2022] [Accepted: 04/19/2022] [Indexed: 11/25/2022] Open

Pfeifer JD, Loberg R, Lofton-Day C, Zehnbauer BA. Reference Samples to Compare Next-Generation Sequencing Test Performance for Oncology Therapeutics and Diagnostics. Am J Clin Pathol 2022;157:628-638. [PMID: 34871357 DOI: 10.1093/ajcp/aqab164] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2021] [Accepted: 08/24/2021] [Indexed: 11/15/2022] Open

Petrillo M, Fabbri M, Kagkli DM, Querci M, Van den Eede G, Alm E, Aytan-Aktug D, Capella-Gutierrez S, Carrillo C, Cestaro A, Chan KG, Coque T, Endrullat C, Gut I, Hammer P, Kay GL, Madec JY, Mather AE, McHardy AC, Naas T, Paracchini V, Peter S, Pightling A, Raffael B, Rossen J, Ruppé E, Schlaberg R, Vanneste K, Weber LM, Westh H, Angers-Loustau A. A roadmap for the generation of benchmarking resources for antimicrobial resistance detection using next generation sequencing. F1000Res 2022;10:80. [PMID: 35847383 PMCID: PMC9243550 DOI: 10.12688/f1000research.39214.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 03/10/2022] [Indexed: 11/20/2022] Open

Affiliation(s)

Mauro Petrillo European Commission Joint Research Centre, Ispra, Italy
Marco Fabbri European Commission Joint Research Centre, Ispra, Italy
Dafni Maria Kagkli European Commission Joint Research Centre, Ispra, Italy
Maddalena Querci European Commission Joint Research Centre, Ispra, Italy
Guy Van den Eede European Commission Joint Research Centre, Ispra, Italy; European Commission Joint Research Centre, Geel, Belgium
Erik Alm The European Centre for Disease Prevention and Control, Stockholm, Sweden
Derya Aytan-Aktug National Food Institute, Technical University of Denmark, Lyngby, Denmark
Salvador Capella-Gutierrez Barcelona Supercomputing Centre (BSC), Barcelona, Spain
Catherine Carrillo Ottawa Laboratory – Carling, Canadian Food Inspection Agency, Ottawa, Ontario, Canada
Alessandro Cestaro Fondazione Edmund Mach, San Michele all'Adige (TN), Italy
Kok-Gan Chan International Genome Centre, Jiangsu University, Zhenjiang, China; Division of Genetics and Molecular Biology, Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia
Teresa Coque Servicio de Microbiología, Hospital Universitario Ramón y Cajal, Instituto Ramón y Cajal de Investigación Sanitaria (IRYCIS), Madrid, Spain; Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Carlos III Health Institute, Madrid, Spain
Christoph Endrullat MSD SHARP & DOHME GMBH, Haar, Germany
Ivo Gut Centro Nacional de Análisis Genómico, Centre for Genomic Regulation (CNAG-CRG), Barcelona Institute of Technology, Barcelona, Spain; Universitat Pompeu Fabra, Barcelona, Spain
Paul Hammer BIOMES. NGS GmbH c/o Technische Hochschule Wildau, Wildau, Germany
Gemma L. Kay Quadram Institute Bioscience, Norwich Research Park, Norwich, UK
Jean-Yves Madec Unité Antibiorésistance et Virulence Bactériennes, ANSES Site de Lyon, Lyon, France
Alison E. Mather Quadram Institute Bioscience, Norwich Research Park, Norwich, UK; University of East Anglia, Norwich, UK
Alice Carolyn McHardy Helmholtz Centre for Infection Research, Braunschweig, Germany
Thierry Naas French-NRC for CPEs, Service de Bactériologie-Hygiène, Hôpital de Bicêtre, Le Kremlin-Bicêtre, France
Valentina Paracchini European Commission Joint Research Centre, Ispra, Italy
Silke Peter Institute of Medical Microbiology and Hygiene, University of Tübingen, Tübingen, Germany
Arthur Pightling Center for Food Safety and Applied Nutrition, US Food and Drug Administration, College Park, MD, USA
Barbara Raffael European Commission Joint Research Centre, Ispra, Italy
John Rossen Department of Medical Microbiology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
Etienne Ruppé IAME, Université de Paris, Paris, France
Robert Schlaberg Department of Pathology, University of Utah, Salt Lake City, UT, USA
Kevin Vanneste Transversal activities in Applied Genomics, Sciensano, Brussels, Belgium
Lukas M. Weber Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland; Present address: Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
Henrik Westh Hvidovre University Hospital, Hvidovre, Denmark
Alexandre Angers-Loustau European Commission Publications Office, Luxembourg, Luxembourg


Liu Z, Roberts R, Mercer TR, Xu J, Sedlazeck FJ, Tong W. Towards accurate and reliable resolution of structural variants for clinical diagnosis. Genome Biol 2022;23:68. [PMID: 35241127 PMCID: PMC8892125 DOI: 10.1186/s13059-022-02636-8] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/27/2021] [Accepted: 02/15/2022] [Indexed: 12/17/2022] Open

Wan Y, Zong C, Li X, Wang A, Li Y, Yang T, Bao Q, Dubow M, Yang M, Rodrigo LA, Mao C. New Insights for Biosensing: Lessons from Microbial Defense Systems. Chem Rev 2022;122:8126-8180. [PMID: 35234463 DOI: 10.1021/acs.chemrev.1c01063] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022]

Abstract

Microorganisms have gained defense systems during the lengthy process of evolution over millions of years. Such defense systems can protect them from being attacked by invading species (e.g., CRISPR-Cas for establishing adaptive immune systems and nanopore-forming toxins as virulence factors) or enable them to adapt to different conditions (e.g., gas vesicles for achieving buoyancy control). These microorganism defense systems (MDS) have inspired the development of biosensors that have received much attention in a wide range of fields including life science research, food safety, and medical diagnosis. This Review comprehensively analyzes biosensing platforms originating from MDS for sensing and imaging biological analytes. We first describe a basic overview of MDS and MDS-inspired biosensing platforms (e.g., CRISPR-Cas systems, nanopore-forming proteins, and gas vesicles), followed by a critical discussion of their functions and properties. We then discuss several transduction mechanisms (optical, acoustic, magnetic, and electrical) involved in MDS-inspired biosensing. We further detail the applications of the MDS-inspired biosensors to detect a variety of analytes (nucleic acids, peptides, proteins, pathogens, cells, small molecules, and metal ions). In the end, we propose the key challenges and future perspectives in seeking new and improved MDS tools that can potentially lead to breakthrough discoveries in developing a new generation of biosensors with a combination of low cost; high sensitivity, accuracy, and precision; and fast detection. Overall, this Review gives a historical review of MDS, elucidates the principles of emulating MDS to develop biosensors, and analyzes the recent advancements, current challenges, and future trends in this field. 
It provides a unique critical analysis of emulating MDS to develop robust biosensors and discusses the design of such biosensors using elements found in MDS, showing that emulating MDS is a promising approach to conceptually advancing the design of biosensors.


Diricks M, Kohl TA, Käding N, Leshchinskiy V, Hauswaldt S, Jiménez Vázquez O, Utpatel C, Niemann S, Rupp J, Merker M. Whole genome sequencing-based classification of human-related Haemophilus species and detection of antimicrobial resistance genes. Genome Med 2022;14:13. [PMID: 35139905 PMCID: PMC8830169 DOI: 10.1186/s13073-022-01017-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2021] [Accepted: 01/24/2022] [Indexed: 12/31/2022] Open

Abstract

Background

Bacteria belonging to the genus Haemophilus cause a wide range of diseases in humans. Recently, H. influenzae was classified by the WHO as a priority pathogen due to the wide spread of ampicillin-resistant strains. However, other Haemophilus spp. are often misclassified as H. influenzae. Therefore, we established an accurate and rapid whole genome sequencing (WGS) based classification and serotyping algorithm and combined it with the detection of resistance genes.

Methods

A gene presence/absence-based classification algorithm was developed, which employs the open-source gene-detection tool SRST2 and a new classification database comprising 36 genes, including capsule loci for serotyping. These genes were identified using a comparative genome analysis of 215 strains belonging to ten human-related Haemophilus (sub)species (training dataset). The algorithm was evaluated on 1329 public short read datasets (evaluation dataset) and used to reclassify 262 clinical Haemophilus spp. isolates from 250 patients (German cohort). In addition, the presence of antibiotic resistance genes within the German dataset was evaluated with SRST2 and correlated with results of traditional phenotyping assays.

Results

The newly developed algorithm can differentiate between clinically relevant Haemophilus species including, but not limited to, H. influenzae, H. haemolyticus, and H. parainfluenzae. It can also identify putative haemin-independent H. haemolyticus strains and determine the serotype of typeable Haemophilus strains. The algorithm performed excellently in the evaluation dataset (99.6% concordance with reported species classification and 99.5% with reported serotype) and revealed several misclassifications. Additionally, 83 out of 262 (31.7%) suspected H. influenzae strains from the German cohort were in fact H. haemolyticus strains, some of which were associated with mouth abscesses and lower respiratory tract infections.

Resistance genes were detected in 16 out of 262 datasets from the German cohort. Prediction of ampicillin resistance, associated with bla_TEM-1D, and tetracycline resistance, associated with tetB, correlated well with available phenotypic data.

Conclusions

Our new classification database and algorithm have the potential to improve diagnosis and surveillance of Haemophilus spp. and can easily be coupled with other public genotyping and antimicrobial resistance databases. Our data also point towards a possible pathogenic role of H. haemolyticus strains, which needs to be further investigated.

Supplementary Information

The online version contains supplementary material available at 10.1186/s13073-022-01017-x.


Affiliation(s)

Margo Diricks Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany; German Center for Infection Research (DZIF), Partner Site Hamburg-Lübeck-Borstel-Riems, Hamburg, Germany
Thomas A Kohl Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany; German Center for Infection Research (DZIF), Partner Site Hamburg-Lübeck-Borstel-Riems, Hamburg, Germany
Nadja Käding Department of Infectious Diseases and Microbiology, University Hospital Schleswig-Holstein, Lübeck, Germany; German Center for Infection Research (DZIF), TTU HAARBI, Lübeck, Germany
Vladislav Leshchinskiy Department of Infectious Diseases and Microbiology, University Hospital Schleswig-Holstein, Lübeck, Germany
Susanne Hauswaldt Department of Infectious Diseases and Microbiology, University Hospital Schleswig-Holstein, Lübeck, Germany
Omar Jiménez Vázquez Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany
Christian Utpatel Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany; German Center for Infection Research (DZIF), Partner Site Hamburg-Lübeck-Borstel-Riems, Hamburg, Germany
Stefan Niemann Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany; German Center for Infection Research (DZIF), Partner Site Hamburg-Lübeck-Borstel-Riems, Hamburg, Germany
Jan Rupp Department of Infectious Diseases and Microbiology, University Hospital Schleswig-Holstein, Lübeck, Germany; German Center for Infection Research (DZIF), TTU HAARBI, Lübeck, Germany
Matthias Merker Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany; German Center for Infection Research (DZIF), Partner Site Hamburg-Lübeck-Borstel-Riems, Hamburg, Germany; Evolution of the Resistome, Research Center Borstel, Borstel, Germany


Liu J, Shen Q, Bao H. Comparison of seven SNP calling pipelines for the next-generation sequencing data of chickens. PLoS One 2022;17:e0262574. [PMID: 35100292 PMCID: PMC8803190 DOI: 10.1371/journal.pone.0262574] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2021] [Accepted: 12/29/2021] [Indexed: 11/18/2022] Open

Abstract

Single nucleotide polymorphisms (SNPs) are widely used in genome-wide association studies and population genetics analyses. Next-generation sequencing (NGS) has become convenient, and many SNP-calling pipelines have been developed for human NGS data. However, there is a knowledge gap in selecting an appropriate SNP-calling pipeline for high-throughput NGS data. To fill this gap, we studied and compared seven SNP-calling pipelines: 16GT, Genome Analysis Toolkit (GATK), Bcftools-single (Bcftools single sample mode), Bcftools-multiple (Bcftools multiple sample mode), VarScan2-single (VarScan2 single sample mode), VarScan2-multiple (VarScan2 multiple sample mode) and Freebayes, using 96 NGS datasets with depth gradients of approximately 5X, 10X, 20X, 30X, 40X, and 50X coverage from 16 Rhode Island Red chickens. The sixteen chickens were also genotyped with a 50K SNP array, and the sensitivity and specificity of each pipeline were assessed by comparison to the results of the SNP arrays. For each pipeline, except Freebayes, the number of detected SNPs increased as the input read depth increased. In comparison with other pipelines, 16GT, followed by Bcftools-multiple, obtained the most SNPs when the input coverage exceeded 10X, and Bcftools-multiple obtained the most when the input was 5X and 10X. The sensitivity and specificity of each pipeline increased with increasing input. Bcftools-multiple had the highest sensitivity numerically when the input ranged from 5X to 30X, and 16GT showed the highest sensitivity when the input was 40X and 50X. Bcftools-multiple also had the highest specificity, followed by GATK, at almost all input levels. For most calling pipelines, there were no obvious changes in SNP numbers, sensitivities or specificities beyond 20X. In conclusion, (1) if only SNPs are to be detected, the sequencing depth does not need to exceed 20X; (2) Bcftools-multiple may be the best choice for detecting SNPs from chicken NGS data, but for a single sample or a sequencing depth greater than 20X, 16GT is recommended. Our findings provide a reference for researchers to select suitable pipelines to obtain SNPs from the NGS data of chickens or nonhuman animals.
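The comparison of pipeline calls against SNP-array genotypes described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: it assumes that array sites have already been split into variant and non-variant sets for a sample, that sites are keyed by hypothetical `(chromosome, position)` tuples, and that sensitivity and specificity follow the usual definitions TP / (TP + FN) and TN / (TN + FP). The function name and the toy data are invented for the example.

```python
def sensitivity_specificity(called, array_variant, array_invariant):
    """Compare called SNP sites with array genotypes (hypothetical helper).

    called          -- set of sites a pipeline reported as SNPs
    array_variant   -- array sites genotyped as variant (truth positives)
    array_invariant -- array sites genotyped as non-variant (truth negatives)
    """
    tp = len(called & array_variant)     # variant on array, and called
    fn = len(array_variant - called)     # variant on array, but missed
    fp = len(called & array_invariant)   # non-variant on array, but called
    tn = len(array_invariant - called)   # non-variant on array, not called
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the study.
    called = {("chr1", 100), ("chr1", 200), ("chr2", 50)}
    array_variant = {("chr1", 100), ("chr1", 200), ("chr1", 300), ("chr2", 75)}
    array_invariant = {("chr2", 50), ("chr2", 60), ("chr3", 10), ("chr3", 20)}
    sens, spec = sensitivity_specificity(called, array_variant, array_invariant)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Sites called by a pipeline but absent from the array are ignored here, because the array provides no truth value for them.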


Chen J, Li F, Wang M, Li J, Marquez-Lago TT, Leier A, Revote J, Li S, Liu Q, Song J. BigFiRSt: A Software Program Using Big Data Technique for Mining Simple Sequence Repeats From Large-Scale Sequencing Data. Front Big Data 2022;4:727216. [PMID: 35118375 PMCID: PMC8805145 DOI: 10.3389/fdata.2021.727216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2021] [Accepted: 12/13/2021] [Indexed: 11/22/2022] Open

Abstract

Background

Simple Sequence Repeats (SSRs) are short tandem repeats of nucleotide sequences. It has been shown that SSRs are associated with human diseases and are of medical relevance. Accordingly, a variety of computational methods have been proposed to mine SSRs from genomes. Conventional methods rely on a high-quality complete genome to identify SSRs. However, a sequenced genome often misses several highly repetitive regions. Moreover, many non-model species lack a complete genome. With the recent advances in next-generation sequencing (NGS) techniques, large-scale sequence reads for any species can be rapidly generated. In this context, a number of methods have been proposed to identify thousands of SSR loci within large amounts of reads for non-model species. Because the most commonly used NGS platforms on the market (e.g., the Illumina platform) generally provide short paired-end reads, merging overlapping paired-end reads has become a common step prior to the identification of SSR loci. Merging short read pairs and identifying SSRs from large-scale data poses a big data analysis challenge for traditional stand-alone tools.

Results

In this study, we present a new Hadoop-based software program, termed BigFiRSt, to address this problem using cutting-edge big data technology. BigFiRSt consists of two major modules, BigFLASH and BigPERF, implemented based on two state-of-the-art stand-alone tools, FLASH and PERF, respectively. BigFLASH and BigPERF address the problems of merging short read pairs and mining SSRs, respectively, in a big-data manner. Comprehensive benchmarking experiments show that BigFiRSt can dramatically reduce the execution times of read-pair merging and SSR mining from very large-scale DNA sequence data.

Conclusions

The excellent performance of BigFiRSt mainly stems from its use of Hadoop big data technology to merge read pairs and mine SSRs in parallel on distributed computing clusters. We anticipate BigFiRSt will be a valuable tool in the coming biological Big Data era.


Affiliation(s)

Jinxiang Chen Department of Software Engineering, College of Information Engineering, Northwest A&F University, Yangling, China
Fuyi Li Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Melbourne, VIC, Australia; Monash Centre for Data Science, Monash University, Melbourne, VIC, Australia; Department of Microbiology and Immunity, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC, Australia
Miao Wang Department of Software Engineering, College of Information Engineering, Northwest A&F University, Yangling, China
Junlong Li Department of Software Engineering, College of Information Engineering, Northwest A&F University, Yangling, China
Tatiana T. Marquez-Lago Department of Genetics, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, United States; Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, United States
André Leier Department of Genetics, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, United States; Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, United States
Jerico Revote Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Melbourne, VIC, Australia
Shuqin Li Department of Software Engineering, College of Information Engineering, Northwest A&F University, Yangling, China
Quanzhong Liu Department of Software Engineering, College of Information Engineering, Northwest A&F University, Yangling, China
Jiangning Song Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Melbourne, VIC, Australia; Monash Centre for Data Science, Monash University, Melbourne, VIC, Australia; *Correspondence: Jiangning Song


Suminda GGD, Bhandari S, Won Y, Goutam U, Kanth Pulicherla K, Son YO, Ghosh M. High-throughput sequencing technologies in the detection of livestock pathogens, diagnosis, and zoonotic surveillance. Comput Struct Biotechnol J 2022;20:5378-5392. [PMID: 36212529 PMCID: PMC9526013 DOI: 10.1016/j.csbj.2022.09.028] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2022] [Revised: 09/20/2022] [Accepted: 09/21/2022] [Indexed: 12/03/2022] Open

Single-Cell Transcriptome Profiling Simulation Reveals the Impact of Sequencing Parameters and Algorithms on Clustering. Life (Basel) 2021;11:life11070716. [PMID: 34357088 PMCID: PMC8304014 DOI: 10.3390/life11070716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2021] [Revised: 07/09/2021] [Accepted: 07/15/2021] [Indexed: 11/16/2022] Open

Seaby EG, Ennis S. Challenges in the diagnosis and discovery of rare genetic disorders using contemporary sequencing technologies. Brief Funct Genomics 2021;19:243-258. [PMID: 32393978 DOI: 10.1093/bfgp/elaa009] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022] Open

Lebo MS, Hao L, Lin CF, Singh A. Bioinformatics in Clinical Genomic Sequencing. Clin Lab Med 2021;40:163-187. [PMID: 32439067 DOI: 10.1016/j.cll.2020.02.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]

Kühl MA, Stich B, Ries DC. Mutation-Simulator: fine-grained simulation of random mutations in any genome. Bioinformatics 2021;37:568-569. [PMID: 32780803 PMCID: PMC8088320 DOI: 10.1093/bioinformatics/btaa716] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2020] [Revised: 06/12/2020] [Accepted: 08/05/2020] [Indexed: 01/11/2023] Open

Ono Y, Asai K, Hamada M. PBSIM2: a simulator for long-read sequencers with a novel generative model of quality scores. Bioinformatics 2021;37:589-595. [PMID: 32976553 PMCID: PMC8097687 DOI: 10.1093/bioinformatics/btaa835] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2020] [Revised: 08/20/2020] [Accepted: 09/11/2020] [Indexed: 12/21/2022] Open

Bogaerts B, Delcourt T, Soetaert K, Boarbi S, Ceyssens PJ, Winand R, Van Braekel J, De Keersmaecker SCJ, Roosens NHC, Marchal K, Mathys V, Vanneste K. A Bioinformatics Whole-Genome Sequencing Workflow for Clinical Mycobacterium tuberculosis Complex Isolate Analysis, Validated Using a Reference Collection Extensively Characterized with Conventional Methods and In Silico Approaches. J Clin Microbiol 2021;59:e00202-21. [PMID: 33789960 PMCID: PMC8316078 DOI: 10.1128/jcm.00202-21] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2021] [Accepted: 03/27/2021] [Indexed: 01/18/2023] Open

Abstract

The use of whole-genome sequencing (WGS) for routine typing of bacterial isolates has increased substantially in recent years. For Mycobacterium tuberculosis (MTB), in particular, WGS has the benefit of drastically reducing the time required to generate results compared to most conventional phenotypic methods. Consequently, a multitude of solutions for analyzing WGS MTB data have been developed, but their successful integration in clinical and national reference laboratories is hindered by the requirement for their validation, for which a consensus framework is still largely absent. We developed a bioinformatics workflow for (Illumina) WGS-based routine typing of MTB complex (MTBC) member isolates allowing complete characterization, including (sub)species confirmation and identification (16S, csb/RD, hsp65), single nucleotide polymorphism (SNP)-based antimicrobial resistance (AMR) prediction, and pathogen typing (spoligotyping, SNP barcoding, and core genome multilocus sequence typing). Workflow performance was validated on a per-assay basis using a collection of 238 in-house-sequenced MTBC isolates, extensively characterized with conventional molecular biology-based approaches supplemented with public data. For SNP-based AMR prediction, results from molecular genotyping methods were supplemented with in silico modified data sets, allowing us to greatly increase the set of evaluated mutations. The workflow demonstrated very high performance with performance metrics of >99% for all assays, except for spoligotyping, where sensitivity dropped to ∼90%. The validation framework for our WGS-based bioinformatics workflow can aid in the standardization of bioinformatics tools by the MTB community and other SNP-based applications regardless of the targeted pathogen(s). The bioinformatics workflow is available for academic and nonprofit use through the Galaxy instance of our institute at https://galaxy.sciensano.be.


Herzig AF, Velo-Suárez L, Le Folgoc G, Boland A, Blanché H, Olaso R, Le Roux L, Delmas C, Goldberg M, Zins M, Lethimonnier F, Deleuze JF, Génin E. Evaluation of saliva as a source of accurate whole-genome and microbiome sequencing data. Genet Epidemiol 2021;45:537-548. [PMID: 33998042 DOI: 10.1002/gepi.22386] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/26/2020] [Revised: 04/27/2021] [Accepted: 04/27/2021] [Indexed: 11/08/2022]

Chua PYS, Crampton-Platt A, Lammers Y, Alsos IG, Boessenkool S, Bohmann K. Metagenomics: A viable tool for reconstructing herbivore diet. Mol Ecol Resour 2021;21:2249-2263. [PMID: 33971086 PMCID: PMC8518049 DOI: 10.1111/1755-0998.13425] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/13/2020] [Revised: 04/08/2021] [Accepted: 05/04/2021] [Indexed: 11/28/2022]

Zhu C, Huang K, Wang Y, Alanis K, Shi W, Baker LA. Imaging with Ion Channels. Anal Chem 2021;93:5355-5359. [PMID: 33759498 DOI: 10.1021/acs.analchem.1c00224] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]

Ali MA. Phylotranscriptomic analysis of Dillenia indica L. (Dilleniales, Dilleniaceae) and its systematics implication. Saudi J Biol Sci 2021;28:1557-1560. [PMID: 33732040 PMCID: PMC7938110 DOI: 10.1016/j.sjbs.2021.01.038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2020] [Revised: 01/17/2021] [Accepted: 01/18/2021] [Indexed: 11/21/2022] Open

Li Z, Fang S, Zhang R, Yu L, Zhang J, Bu D, Sun L, Zhao Y, Li J. VarBen. J Mol Diagn 2021;23:285-299. [DOI: 10.1016/j.jmoldx.2020.11.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/20/2019] [Revised: 10/06/2020] [Accepted: 11/17/2020] [Indexed: 02/08/2023] Open

Schmeing S, Robinson MD. ReSeq simulates realistic Illumina high-throughput sequencing data. Genome Biol 2021;22:67. [PMID: 33608040 PMCID: PMC7896392 DOI: 10.1186/s13059-021-02265-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/17/2020] [Accepted: 01/07/2021] [Indexed: 12/18/2022] Open

Richmond PA, Av‐Shalom TV, Fornes O, Modi B, Elliott AM, Wasserman WW. GeneBreaker: Variant simulation to improve the diagnosis of Mendelian rare genetic diseases. Hum Mutat 2021;42:346-358. [PMID: 33368787 PMCID: PMC8247879 DOI: 10.1002/humu.24163] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2020] [Revised: 11/06/2020] [Accepted: 12/14/2020] [Indexed: 12/21/2022]

Petrillo M, Fabbri M, Kagkli DM, Querci M, Van den Eede G, Alm E, Aytan-Aktug D, Capella-Gutierrez S, Carrillo C, Cestaro A, Chan KG, Coque T, Endrullat C, Gut I, Hammer P, Kay GL, Madec JY, Mather AE, McHardy AC, Naas T, Paracchini V, Peter S, Pightling A, Raffael B, Rossen J, Ruppé E, Schlaberg R, Vanneste K, Weber LM, Westh H, Angers-Loustau A. A roadmap for the generation of benchmarking resources for antimicrobial resistance detection using next generation sequencing. F1000Res 2021;10:80. [DOI: 10.12688/f1000research.39214.1] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 02/02/2021] [Indexed: 01/12/2023] Open

Nodehi HM, Tabatabaiefar MA, Sehhati M. Selection of Optimal Bioinformatic Tools and Proper Reference for Reducing the Alignment Error in Targeted Sequencing Data. J Med Signals Sens 2021;11:37-44. [PMID: 34026589 PMCID: PMC8043119 DOI: 10.4103/jmss.jmss_7_20] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2020] [Revised: 01/28/2020] [Accepted: 02/12/2020] [Indexed: 11/04/2022]

Chen W, Zhang P, Song L, Yang J, Han C. Simulation of Nanopore Sequencing Signals Based on BiGRU. Sensors (Basel) 2020;20:s20247244. [PMID: 33348876 PMCID: PMC7766754 DOI: 10.3390/s20247244] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/05/2020] [Revised: 12/14/2020] [Accepted: 12/15/2020] [Indexed: 01/02/2023]

Subkhankulova T, Naumenko F, Tolmachov OE, Orlov YL. Novel ChIP-seq simulating program with superior versatility: isChIP. Brief Bioinform 2020;22:6035271. [PMID: 33320934 DOI: 10.1093/bib/bbaa352] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/11/2020] [Revised: 10/18/2020] [Accepted: 11/03/2020] [Indexed: 12/13/2022] Open

Abstract

Chromatin immunoprecipitation followed by next-generation sequencing (ChIP-seq) is recognized as an extremely powerful tool to study the interaction of numerous transcription factors and other chromatin-associated proteins with DNA. The core problem in the optimization of the ChIP-seq protocol and the subsequent computational data analysis is that the 'true' pattern of binding events for a given protein factor is unknown. Computer simulation of the ChIP-seq process based on an 'a priori known binding template' can contribute to drastically reducing the number of wet lab experiments and ultimately help achieve radical optimization of the entire processing pipeline. We present a newly developed ChIP-sequencing simulation algorithm implemented in the novel software, in silico ChIP-seq (isChIP). We demonstrate that isChIP closely approximates real ChIP-seq protocols and is able to model data similar to those obtained from experimental sequencing. We validated isChIP using publicly available datasets generated for the well-characterized transcription factors Oct4 and Sox2. Although the novel software is compatible with the Illumina protocols by default, it can also successfully perform simulations with a number of alternative sequencing platforms such as Roche 454, Ion Torrent and SOLiD, as well as model ChIP-exo. The versatility of isChIP was demonstrated through modelling a wide range of binding events, including those of transcription factors and chromatin modifiers. We also performed a comparative analysis against a few existing ChIP-seq simulators and showed the fundamental superiority of our model. Due to its ability to utilize known binding templates, isChIP can potentially be employed to help investigators choose the most appropriate analytical software through benchmarking of available ChIP-seq programs and optimize the experimental parameters of the ChIP-seq protocol. isChIP software is freely available at https://github.com/fnaumenko/isChIP.
