1
Tangherloni A, Riva SG, Myers B, Buffa FM, Cazzaniga P. MAGNETO: Cell type marker panel generator from single-cell transcriptomic data. J Biomed Inform 2023; 147:104510. [PMID: 37797704 DOI: 10.1016/j.jbi.2023.104510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Revised: 09/12/2023] [Accepted: 09/29/2023] [Indexed: 10/07/2023]
Abstract
Single-cell RNA sequencing experiments produce data useful to identify different cell types, including uncharacterized and rare ones. This enables us to study the specific functional roles of these cells in different microenvironments and contexts. After identifying a (novel) cell type of interest, it is essential to build succinct marker panels, composed of a few genes referring to cell surface proteins and clusters of differentiation molecules, able to discriminate the desired cells from the other cell populations. In this work, we propose a fully automatic framework called MAGNETO, which can help construct optimal marker panels starting from a single-cell gene expression matrix and a cell type identity for each cell. MAGNETO builds effective marker panels by solving a tailored bi-objective optimization problem, where the first objective regards the identification of the genes able to isolate a specific cell type, while the second, conflicting objective concerns the minimization of the total number of genes included in the panel. Our results on three public datasets show that MAGNETO can produce marker panels that identify the cell populations of interest better than state-of-the-art approaches. Finally, our results demonstrate that, by fine-tuning MAGNETO, it is possible to obtain marker panels with different specificity levels.
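The trade-off between the two conflicting objectives described in the abstract can be illustrated with a small Pareto-front filter. This is only a minimal sketch, not MAGNETO's actual implementation: the candidate panels, their error values, and the function name `pareto_front` are hypothetical, and a generic misclassification error stands in for the first objective.

```python
from typing import List, Tuple

# Hypothetical candidate: (panel name, misclassification error, number of genes).
# Both objectives are to be minimized.
Panel = Tuple[str, float, int]


def pareto_front(panels: List[Panel]) -> List[Panel]:
    """Return the panels that no other panel dominates.

    A panel is dominated when another panel is at least as good on both
    objectives and strictly better on at least one of them.
    """
    front = []
    for name, err, size in panels:
        dominated = any(
            (other_err <= err and other_size <= size)
            and (other_err < err or other_size < size)
            for _, other_err, other_size in panels
        )
        if not dominated:
            front.append((name, err, size))
    return front


# Illustrative values only; they are not taken from the paper.
candidates = [
    ("A", 0.10, 8),
    ("B", 0.12, 4),
    ("C", 0.10, 10),  # same error as A but more genes
    ("D", 0.30, 3),
    ("E", 0.15, 5),   # worse than B on both objectives
]
front = pareto_front(candidates)
front_names = [name for name, _, _ in front]
```

Each panel left on the front represents a different compromise between discriminative power and panel size, which is one way to read the abstract's remark about panels with different specificity levels.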
Affiliation(s)
- Andrea Tangherloni
- Department of Computing Sciences, Bocconi University, Via Guglielmo Röntgen 1, Milan, 20136, Italy; Bocconi Institute for Data Science and Analytics, Bocconi University, Via Guglielmo Röntgen 1, Milan, 20136, Italy; Department of Human and Social Sciences, University of Bergamo, Piazzale S. Agostino 2, Bergamo, 24129, Italy.
- Simone G Riva
- Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Headley Way, Oxford, OX3 9DS, United Kingdom
- Brynelle Myers
- Wellcome Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford, OX3 7BN, United Kingdom
- Francesca M Buffa
- Department of Computing Sciences, Bocconi University, Via Guglielmo Röntgen 1, Milan, 20136, Italy; Bocconi Institute for Data Science and Analytics, Bocconi University, Via Guglielmo Röntgen 1, Milan, 20136, Italy; Department of Oncology, University of Oxford, Old Road Campus Research Building, Oxford, OX3 7DQ, United Kingdom
- Paolo Cazzaniga
- Department of Human and Social Sciences, University of Bergamo, Piazzale S. Agostino 2, Bergamo, 24129, Italy; Bicocca Bioinformatics, Biostatistics, and Bioimaging Centre - B4, Via Follereau 3, Vedano al Lambro, 20854, Italy
2
Sun S, Cheng F, Han D, Wei S, Zhong A, Massoudian S, Johnson AB. Pairwise comparative analysis of six haplotype assembly methods based on users' experience. BMC Genom Data 2023; 24:35. [PMID: 37386408 PMCID: PMC10311811 DOI: 10.1186/s12863-023-01134-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2022] [Accepted: 05/25/2023] [Indexed: 07/01/2023] Open
Abstract
BACKGROUND A haplotype is a set of DNA variants inherited together from one parent or chromosome. Haplotype information is useful for studying genetic variation and disease association. Haplotype assembly (HA) is a process of obtaining haplotypes using DNA sequencing data. Currently, there are many HA methods with their own strengths and weaknesses. This study focused on comparing six HA methods or algorithms: HapCUT2, MixSIH, PEATH, WhatsHap, SDhaP, and MAtCHap using two NA12878 datasets named hg19 and hg38. The 6 HA algorithms were run on chromosome 10 of these two datasets, each with 3 filtering levels based on sequencing depth (DP1, DP15, and DP30). Their outputs were then compared. RESULT Run time (CPU time) was compared to assess the efficiency of the 6 HA methods. HapCUT2 was the fastest HA method for all 6 datasets, with run time consistently under 2 min. In addition, WhatsHap was relatively fast, and its run time was 21 min or less for all 6 datasets. The other 4 HA algorithms' run times varied across different datasets and coverage levels. To assess their accuracy, pairwise comparisons were conducted for each pair of the six packages by generating their disagreement rates for both haplotype blocks and Single Nucleotide Variants (SNVs). The authors also compared them using switch distance (error), i.e., the number of positions where two chromosomes of a certain phase must be switched to match with the known haplotype. HapCUT2, PEATH, MixSIH, and MAtCHap generated output files with similar numbers of blocks and SNVs, and they had relatively similar performance. WhatsHap generated a much larger number of SNVs in the hg19 DP1 output, which caused it to have high disagreement percentages with other methods. However, for the hg38 data, WhatsHap had similar performance to the other 4 algorithms, except SDhaP. The comparison analysis showed that SDhaP had a much larger disagreement rate when it was compared with the other algorithms in all 6 datasets.
CONCLUSION The comparative analysis is important because each algorithm is different. The findings of this study provide a deeper understanding of the performance of currently available HA algorithms and useful input for other users.
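As a rough illustration of the switch distance mentioned in the abstract, the sketch below counts the positions at which a predicted haplotype flips between agreeing and disagreeing with a reference haplotype. It assumes haplotypes encoded as equal-length strings of 0/1 alleles over heterozygous sites; the function name `switch_errors` and the example strings are hypothetical and are not taken from the study.

```python
def switch_errors(predicted: str, truth: str) -> int:
    """Count phase switches between a predicted and a reference haplotype.

    Both inputs are strings of '0'/'1' alleles over the same heterozygous
    sites. A switch is counted wherever the prediction changes from matching
    the reference to matching its complement, or the other way round.
    """
    if len(predicted) != len(truth):
        raise ValueError("haplotypes must cover the same sites")
    # True where the predicted allele matches the reference allele.
    agree = [p == t for p, t in zip(predicted, truth)]
    # A switch occurs between two consecutive sites whose agreement differs.
    return sum(1 for prev, cur in zip(agree, agree[1:]) if prev != cur)


# Illustrative inputs only.
reference = "000000"
one_switch = switch_errors("000111", reference)   # phase flips once, after site 3
two_switches = switch_errors("010000", reference)  # a single wrong site costs two flips
no_switch = switch_errors("111111", reference)     # fully complementary: same phasing
```

Under this definition a fully complementary prediction counts zero switches, because in a diploid genome it describes the same pair of haplotypes.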
Affiliation(s)
- Shuying Sun
- Department of Mathematics, Texas State University, San Marcos, TX USA
- Flora Cheng
- Carnegie Mellon University, Pittsburgh, PA USA
- Daphne Han
- Carnegie Mellon University, Pittsburgh, PA USA
- Sarah Wei
- Massachusetts Institute of Technology, Cambridge, MA USA
3
Papetti DM, Tangherloni A, Farinati D, Cazzaniga P, Vanneschi L. Simplifying Fitness Landscapes Using Dilation Functions Evolved With Genetic Programming. IEEE COMPUT INTELL M 2023. [DOI: 10.1109/mci.2022.3222096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
Affiliation(s)
- Davide Farinati
- NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, Campus de Campolide, PORTUGAL
- Leonardo Vanneschi
- NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, Campus de Campolide, PORTUGAL
4
Cai S, Hu B, Wang X, Liu T, Lin Z, Tong X, Xu R, Chen M, Duo T, Zhu Q, Liang Z, Li E, Chen Y, Li J, Liu X, Mo D. Integrative single-cell RNA-seq and ATAC-seq analysis of myogenic differentiation in pig. BMC Biol 2023; 21:19. [PMID: 36726129 PMCID: PMC9893630 DOI: 10.1186/s12915-023-01519-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 10/11/2022] [Accepted: 01/18/2023] [Indexed: 02/03/2023] Open
Abstract
BACKGROUND Skeletal muscle development is a multistep process whose understanding is central in a broad range of fields and applications, from its potential medical value to human society to its economic value associated with the improvement of agricultural animals. Skeletal muscle initiates in the somites, with muscle precursor cells generated in the dermomyotome and dermomyotome-derived myotome before muscle differentiation ensues, a developmentally regulated process that is well characterized in model organisms. However, the regulation of skeletal muscle ontogeny during embryonic development remains poorly defined in farm animals, for instance in pig. Here, we profiled gene expression and chromatin accessibility in developing pig somites and myotomes at single-cell resolution. RESULTS We identified myogenic cells and other cell types and constructed a differentiation trajectory of pig skeletal muscle ontogeny. Along this trajectory, the dynamic changes in gene expression and chromatin accessibility coincided with the activities of distinct cell type-specific transcription factors. Some novel genes upregulated along the differentiation trajectory showed higher expression levels in muscular dystrophy mice than in healthy mice, suggesting their involvement in myogenesis. Integrative analysis of chromatin accessibility, gene expression data, and in vitro experiments identified EGR1 and RHOB as critical regulators of pig embryonic myogenesis. CONCLUSIONS Collectively, our results enhance our understanding of the molecular and cellular dynamics in pig embryonic myogenesis and offer a high-quality resource for the further study of pig skeletal muscle development and human muscle disease.
Affiliation(s)
- Shufang Cai
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510006 Guangdong China
- Guangdong Key Laboratory of Animal Breeding and Nutrition, State Key Laboratory of Livestock and Poultry Breeding, Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou, 510640 Guangdong China
- Bin Hu
- Guangdong Key Laboratory of Animal Breeding and Nutrition, State Key Laboratory of Livestock and Poultry Breeding, Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou, 510640 Guangdong China
- Xiaoyu Wang
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510006 Guangdong China
- Tongni Liu
- Faculty of Forestry, University of British Columbia, Vancouver, BC V6T 1Z4 Canada
- Zhuhu Lin
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510006 Guangdong China
- Xian Tong
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510006 Guangdong China
- Rong Xu
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510006 Guangdong China
- Meilin Chen
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510006 Guangdong China
- Tianqi Duo
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510006 Guangdong China
- Qi Zhu
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510006 Guangdong China
- Ziyun Liang
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510006 Guangdong China
- Enru Li
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510006 Guangdong China
- Yaosheng Chen
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510006 Guangdong China
- Jianhao Li
- Guangdong Key Laboratory of Animal Breeding and Nutrition, State Key Laboratory of Livestock and Poultry Breeding, Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou, 510640 Guangdong China
- Xiaohong Liu
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510006 Guangdong China
- Delin Mo
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510006 Guangdong China
5
Mazrouee S. ARHap: Association Rule Haplotype Phasing. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:3281-3294. [PMID: 34648456 DOI: 10.1109/tcbb.2021.3119955] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
This article proposes a novel approach for individual human phasing through the discovery of interesting hidden relations among single variant sites. The proposed framework, called ARHap, learns strong association rules among variant loci on the genome and develops a combinatorial approach for fast and accurate haplotype phasing based on the discovered associations. ARHap is composed of two main modules or processing phases. In the first phase, called association rule learning, ARHap identifies quantitative association rules from a collection of DNA reads of the organism under study, resulting in a set of strong rules that reveal the inter-dependency of alleles. In the next phase, called haplotype reconstruction, we develop algorithms that use the learned rules to construct highly reliable haplotypes at individual single nucleotide polymorphism (SNP) sites. ARHap has several features that lead to both fast and accurate haplotyping. It uses an incremental haplotype reconstruction approach that enables us to generate association rules according to the unreconstructed SNP sites during each round of the algorithm. During each round, the association rule learning module generates rules while constraining the length of the rules and limiting the rules to those that contribute to the reconstruction of unreconstructed sites only. The framework begins by generating short and very strong rules. During subsequent rounds, if some SNP sites remain unreconstructed, the rule length can increase and/or the criteria on rule strength are gradually adjusted. This adaptive approach, which uses feedback from the haplotype reconstruction module, eliminates the generation of rules that do not contribute to haplotype reconstruction as well as weak rules that may introduce errors in the final haplotypes. Extensive experimental analyses on datasets representing diploid organisms demonstrate the superiority of ARHap in diploid haplotyping compared to state-of-the-art algorithms. In particular, we show that this novel approach to haplotype phasing is not only fast but also achieves significantly better accuracy than other read-based computational approaches.
6
Zhang T, Zhou J, Gao W, Jia Y, Wei Y, Wang G. Complex genome assembly based on long-read sequencing. Brief Bioinform 2022; 23:6657663. [PMID: 35940845 DOI: 10.1093/bib/bbac305] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/06/2022] [Revised: 06/20/2022] [Accepted: 07/06/2022] [Indexed: 11/12/2022] Open
Abstract
High-quality chromosome-scale genome sequences provide an important basis for downstream genomics analysis, especially the construction of haplotype-resolved and complete genomes, which plays a key role in genome annotation, mutation detection, evolutionary analysis, gene function research, comparative genomics and other aspects. However, genome-wide short-read sequencing struggles to produce a complete genome when faced with a complex genome with high duplication and multiple heterozygosity. The emergence of long-read sequencing technology has greatly improved the integrity of complex genome assembly. We review a variety of computational methods for complex genome assembly and describe in detail the theories, innovations and shortcomings of collapsed, semi-collapsed and uncollapsed assemblers based on long reads. Among the three methods, uncollapsed assembly is the most correct and complete way to represent genomes. In addition, genome assembly is closely related to haplotype reconstruction; that is, uncollapsed assembly realizes haplotype reconstruction, and haplotype reconstruction promotes uncollapsed assembly. We hope that gapless, telomere-to-telomere and accurate assembly of complex genomes can be routinely achieved in the future using only a simple process or a single tool.
Affiliation(s)
- Tianjiao Zhang
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, 150040, China
- Jie Zhou
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, 150040, China
- Wentao Gao
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, 150040, China
- Yuran Jia
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, 150040, China
- Yanan Wei
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, 150040, China
- Guohua Wang
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, 150040, China
7
Abstract
Several software tools for the simulation and analysis of biochemical reaction networks have been developed in the last decades; however, assessing and comparing their computational performance in executing the typical tasks of computational systems biology can be limited by the lack of a standardized benchmarking approach. To overcome these limitations, we propose here a novel tool, named SMGen, designed to automatically generate synthetic models of reaction networks that, by construction, are characterized by relevant features (e.g., system connectivity and reaction discreteness) and non-trivial emergent dynamics of real biochemical networks. The generation of synthetic models in SMGen is based on the definition of an undirected graph consisting of a single connected component that, generally, results in a computationally demanding task; to speed up the overall process, SMGen exploits a main–worker paradigm. SMGen is also provided with a user-friendly graphical user interface, which allows the user to easily set up all the parameters required to generate a set of synthetic models with any number of reactions and species. We analysed the computational performance of SMGen by generating batches of symmetric and asymmetric reaction-based models (RBMs) of increasing size, showing how a different number of reactions and/or species affects the generation time. Our results show that when the number of reactions is higher than the number of species, SMGen has to identify and correct a large number of errors during the creation process of the RBMs, a circumstance that increases the running time. Still, SMGen can generate synthetic models with hundreds of species and reactions in less than 7 s.
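The abstract states that model generation rests on an undirected graph with a single connected component. One common way to obtain such a graph is sketched below: build a random spanning tree first, then add extra edges. This is an assumption made for illustration and not necessarily the procedure SMGen uses; the function names, parameters, and sizes are hypothetical.

```python
import random
from collections import deque
from typing import Set, Tuple

Edge = Tuple[int, int]  # undirected edge stored as (smaller node, larger node)


def random_connected_graph(n_nodes: int, n_extra_edges: int, seed: int = 0) -> Set[Edge]:
    """Build a random undirected graph that is connected by construction."""
    rng = random.Random(seed)
    edges: Set[Edge] = set()
    # Random spanning tree: attach every new node to an already placed node.
    for node in range(1, n_nodes):
        parent = rng.randrange(node)
        edges.add((parent, node))
    # Add extra edges chosen at random among the pairs not yet connected.
    remaining = [
        (a, b)
        for a in range(n_nodes)
        for b in range(a + 1, n_nodes)
        if (a, b) not in edges
    ]
    rng.shuffle(remaining)
    edges.update(remaining[:n_extra_edges])
    return edges


def is_connected(n_nodes: int, edges: Set[Edge]) -> bool:
    """Breadth-first search check that all nodes lie in one component."""
    if n_nodes == 0:
        return True
    adjacency = {node: set() for node in range(n_nodes)}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = {0}
    queue = deque([0])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == n_nodes


graph = random_connected_graph(n_nodes=10, n_extra_edges=5)
graph_is_connected = is_connected(10, graph)
```

Building the tree first guarantees connectivity without any repair step. A generator that samples edges freely would instead have to detect and fix disconnected graphs, which is the kind of correction work the abstract reports as costly.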
8
Feature Selection for Topological Proximity Prediction of Single-Cell Transcriptomic Profiles in Drosophila Embryo Using Genetic Algorithm. Genes (Basel) 2020; 12:genes12010028. [PMID: 33379262 PMCID: PMC7824175 DOI: 10.3390/genes12010028] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/30/2020] [Revised: 12/16/2020] [Accepted: 12/22/2020] [Indexed: 12/02/2022] Open
Abstract
Single-cell transcriptomics data, when combined with in situ hybridization patterns of specific genes, can help in recovering the spatial information lost during cell isolation. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) consortium conducted a crowd-sourced competition known as the DREAM Single Cell Transcriptomics Challenge (SCTC) to predict the masked locations of single cells from a set of 60, 40 and 20 genes out of 84 in situ gene patterns known in Drosophila embryo. We applied a genetic algorithm (GA) to predict the most important genes that carry positional and proximity information of the single-cell origins, in combination with the base distance mapping algorithm DistMap. The resulting gene selection was found to perform well and was ranked among the top 10 in two of the three sub-challenges. However, the details of the method did not make it to the main challenge publication, due to an intricate aggregation ranking. In this work, we discuss the detailed implementation of the GA and its post-challenge parameterization, with a view to identifying potential areas where GA-based approaches to gene-set selection for topological association prediction may be improved to be more effective. We believe this work provides additional insights into feature-selection strategies and their relevance to single-cell similarity prediction and will form a strong addendum to the recently published work from the consortium.
9
ACDC: Automated Cell Detection and Counting for Time-Lapse Fluorescence Microscopy. APPLIED SCIENCES-BASEL 2020; 10. [PMID: 34306736 PMCID: PMC8297459 DOI: 10.3390/app10186187] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 01/29/2023]
Abstract
Advances in microscopy imaging technologies have enabled the visualization of live-cell dynamic processes using time-lapse microscopy imaging. However, modern methods exhibit several limitations related to the training phases and to time constraints, hindering their application in laboratory practice. In this work, we present a novel method, named Automated Cell Detection and Counting (ACDC), designed for activity detection of fluorescently labeled cell nuclei in time-lapse microscopy. ACDC overcomes the limitations of the literature methods by first applying bilateral filtering on the original image to smooth the input cell images while preserving edge sharpness, and then by exploiting the watershed transform and morphological filtering. Moreover, ACDC represents a feasible solution for laboratory practice, as it can leverage multi-core architectures in computer clusters to efficiently handle large-scale imaging datasets. Indeed, our Parent-Workers implementation of ACDC achieves up to a 3.7× speed-up compared to the sequential counterpart. ACDC was tested on two distinct cell imaging datasets to assess its accuracy and effectiveness on images with different characteristics. We achieved an accurate cell count and nuclei segmentation without relying on large-scale annotated datasets, a result confirmed by the average Dice Similarity Coefficients of 76.84 and 88.64 and the Pearson coefficients of 0.99 and 0.96, calculated against the manual cell counting, on the two tested datasets.
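For readers unfamiliar with the Dice Similarity Coefficient cited in the abstract, the sketch below computes it for two binary segmentation masks as twice the overlap divided by the total number of foreground pixels. It is a generic illustration rather than code from ACDC: the masks are flattened to plain lists, the function name is hypothetical, and the abstract appears to report the coefficient as a percentage whereas this sketch returns a value between 0 and 1.

```python
from typing import Sequence


def dice_coefficient(predicted: Sequence[int], reference: Sequence[int]) -> float:
    """Dice Similarity Coefficient of two flattened binary masks.

    DSC = 2 * |P intersect R| / (|P| + |R|), where P and R are the sets of
    foreground pixels in the predicted and reference masks.
    """
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of pixels")
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    total = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    if total == 0:
        # Two empty masks agree perfectly by convention.
        return 1.0
    return 2.0 * overlap / total


# Illustrative masks only: 2 shared foreground pixels, 3 foreground pixels each.
predicted_mask = [1, 1, 0, 0, 1, 0]
reference_mask = [1, 0, 0, 0, 1, 1]
dice = dice_coefficient(predicted_mask, reference_mask)  # 2 * 2 / (3 + 3)
```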
10
Uzma, Halim Z. Optimizing the DNA fragment assembly using metaheuristic-based overlap layout consensus approach. Appl Soft Comput 2020. [DOI: 10.1016/j.asoc.2020.106256] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 10/24/2022]
11
Yan Z, Zhu X, Wang Y, Nie Y, Guan S, Kuo Y, Chang D, Li R, Qiao J, Yan L. scHaplotyper: haplotype construction and visualization for genetic diagnosis using single cell DNA sequencing data. BMC Bioinformatics 2020; 21:41. [PMID: 32007105 PMCID: PMC6995221 DOI: 10.1186/s12859-020-3381-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/07/2019] [Accepted: 01/22/2020] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND Haplotyping reveals chromosome blocks inherited from parents to in vitro fertilized (IVF) embryos in preimplantation genetic diagnosis (PGD), enabling the observation of the transmission of disease alleles between generations. However, the methods of haplotyping that are suitable for single cells are limited because a whole genome amplification (WGA) process is performed before sequencing or genotyping in PGD, and true haplotype profiles of embryos need to be constructed based on genotypes that can contain many WGA artifacts. RESULTS Here, we offer scHaplotyper as a genetic diagnosis tool that reconstructs and visualizes the haplotype profiles of single cells based on the Hidden Markov Model (HMM). scHaplotyper can trace the origin of each haplotype block in the embryo, enabling the detection of the carrier status of disease alleles in each embryo. We applied this method in PGD in two families affected by genetic disorders, and the result was the healthy live births of two children in the two families, demonstrating the clinical application of this method. CONCLUSION Next generation sequencing (NGS) of preimplantation embryos enables genetic screening for families with genetic disorders, avoiding the birth of affected babies. With the validation and successful clinical application, we showed that scHaplotyper is a convenient and accurate method to screen out embryos. More patients with genetic disorders will benefit from the genetic diagnosis of embryos. The source code of scHaplotyper is available at GitHub repository: https://github.com/yzqheart/scHaplotyper.
Affiliation(s)
- Zhiqiang Yan
- Center for Reproductive Medicine, Department of Obstetrics and Gynecology, Peking University Third Hospital, Beijing, 100191, China; Key Laboratory of Assisted Reproduction, Ministry of Education, Beijing, 100191, China; Beijing Key Laboratory of Reproductive Endocrinology and Assisted Reproduction, Beijing, 100191, China; Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, 100871, China; Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100871, China
- Xiaohui Zhu
- Center for Reproductive Medicine, Department of Obstetrics and Gynecology, Peking University Third Hospital, Beijing, 100191, China; Key Laboratory of Assisted Reproduction, Ministry of Education, Beijing, 100191, China; Beijing Key Laboratory of Reproductive Endocrinology and Assisted Reproduction, Beijing, 100191, China
- Yuqian Wang
- Center for Reproductive Medicine, Department of Obstetrics and Gynecology, Peking University Third Hospital, Beijing, 100191, China; Key Laboratory of Assisted Reproduction, Ministry of Education, Beijing, 100191, China; Beijing Key Laboratory of Reproductive Endocrinology and Assisted Reproduction, Beijing, 100191, China
- Yanli Nie
- Center for Reproductive Medicine, Department of Obstetrics and Gynecology, Peking University Third Hospital, Beijing, 100191, China; Key Laboratory of Assisted Reproduction, Ministry of Education, Beijing, 100191, China; Beijing Key Laboratory of Reproductive Endocrinology and Assisted Reproduction, Beijing, 100191, China
- Shuo Guan
- Center for Reproductive Medicine, Department of Obstetrics and Gynecology, Peking University Third Hospital, Beijing, 100191, China; Key Laboratory of Assisted Reproduction, Ministry of Education, Beijing, 100191, China; Beijing Key Laboratory of Reproductive Endocrinology and Assisted Reproduction, Beijing, 100191, China
- Ying Kuo
- Center for Reproductive Medicine, Department of Obstetrics and Gynecology, Peking University Third Hospital, Beijing, 100191, China; Key Laboratory of Assisted Reproduction, Ministry of Education, Beijing, 100191, China; Beijing Key Laboratory of Reproductive Endocrinology and Assisted Reproduction, Beijing, 100191, China
- Di Chang
- Center for Reproductive Medicine, Department of Obstetrics and Gynecology, Peking University Third Hospital, Beijing, 100191, China; Key Laboratory of Assisted Reproduction, Ministry of Education, Beijing, 100191, China; Beijing Key Laboratory of Reproductive Endocrinology and Assisted Reproduction, Beijing, 100191, China
- Rong Li
- Center for Reproductive Medicine, Department of Obstetrics and Gynecology, Peking University Third Hospital, Beijing, 100191, China; Key Laboratory of Assisted Reproduction, Ministry of Education, Beijing, 100191, China; Beijing Key Laboratory of Reproductive Endocrinology and Assisted Reproduction, Beijing, 100191, China
- Jie Qiao
- Center for Reproductive Medicine, Department of Obstetrics and Gynecology, Peking University Third Hospital, Beijing, 100191, China; Key Laboratory of Assisted Reproduction, Ministry of Education, Beijing, 100191, China; Beijing Key Laboratory of Reproductive Endocrinology and Assisted Reproduction, Beijing, 100191, China; Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, 100871, China; Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100871, China; Beijing Advanced Innovation Center for Genomics (ICG), Peking University, Beijing, 100871, China
- Liying Yan
- Center for Reproductive Medicine, Department of Obstetrics and Gynecology, Peking University Third Hospital, Beijing, 100191, China; Key Laboratory of Assisted Reproduction, Ministry of Education, Beijing, 100191, China; Beijing Key Laboratory of Reproductive Endocrinology and Assisted Reproduction, Beijing, 100191, China
12
Na JC, Lee I, Rhee JK, Shin SY. Fast single individual haplotyping method using GPGPU. Comput Biol Med 2019; 113:103421. [PMID: 31499396 DOI: 10.1016/j.compbiomed.2019.103421] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/04/2019] [Revised: 08/28/2019] [Accepted: 08/28/2019] [Indexed: 11/27/2022]
Abstract
BACKGROUND Most bioinformatic tools for next generation sequencing (NGS) data are computationally intensive, requiring a large amount of computational power for processing and analysis. Here the utility of graphic processing units (GPUs) for NGS data computation is assessed. METHOD In a previous study, we developed a probabilistic evolutionary algorithm with toggling for haplotyping (PEATH) method based on the estimation of distribution algorithm and toggling heuristic. Here, we parallelized the PEATH method (PEATH/G) using general-purpose computing on GPU (GPGPU). RESULTS The PEATH/G runs approximately 46.8 times and 25.4 times faster than PEATH on the NA12878 fosmid-sequencing dataset and the HuRef dataset, respectively, with an NVIDIA GeForce GTX 1660Ti. Moreover, the PEATH/G is approximately 13.3 times faster on the fosmid-sequencing dataset, even with an inexpensive conventional GPGPU (NVIDIA GeForce GTX 950). CONCLUSIONS PEATH/G can be a practical single individual haplotyping tool in terms of both its accuracy and speed. GPGPU can help reduce the running time of NGS analysis tools.
Affiliation(s)
- Joong Chae Na
- Department of Computer Science and Engineering, Sejong University, Seoul, 05006, South Korea
- Inbok Lee
- Department of Software, Korea Aerospace University, Goyang, 10540, South Korea
- Je-Keun Rhee
- School of Systems Biomedical Science, Soongsil University, Seoul, 06978, South Korea.
- Soo-Yong Shin
- Department of Digital Health, SAIHST, Sungkyunkwan University, Seoul, 06351, South Korea; Big Data Research Center, Samsung Medical Center, Seoul, 06351, South Korea.
13
Romano P, Céol A, Dräger A, Fiannaca A, Giugno R, La Rosa M, Milanesi L, Pfeffer U, Rizzo R, Shin SY, Xia J, Urso A. The 2017 Network Tools and Applications in Biology (NETTAB) workshop: aims, topics and outcomes. BMC Bioinformatics 2019; 20:125. [PMID: 30999855 PMCID: PMC6472292 DOI: 10.1186/s12859-019-2681-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/24/2022] Open
Abstract
The 17th International NETTAB workshop was held in Palermo, Italy, on October 16-18, 2017. The special topic for the meeting was "Methods, tools and platforms for Personalised Medicine in the Big Data Era", but the traditional topics of the meeting series were also included in the event. About 40 scientific contributions were presented, including four keynote lectures, five guest lectures, and many oral communications and posters. Also, three tutorials were organised before and after the workshop. Full papers from some of the best works presented in Palermo were submitted for this Supplement of BMC Bioinformatics. Here, we provide an overview of the meeting's aims and scope. We also briefly introduce selected papers that have been accepted for publication in this Supplement, for a complete presentation of the outcomes of the meeting.
Affiliation(s)
- Paolo Romano
- IRCCS Ospedale Policlinico San Martino, Largo Rosanna Benzi 10, Genova, I-16132 Italy
- Arnaud Céol
- European Institute of Oncology IRCCS, Milan, 20141 Italy
- Andreas Dräger
- Computational Systems Biology of Infection and Antimicrobial-Resistant Pathogens, Center for Bioinformatics Tübingen (ZBIT), Tübingen, 72074 Germany
- Department of Computer Science, University of Tübingen, Tübingen, 72074 Germany
- Antonino Fiannaca
- ICAR-CNR, Institute for high performance computing and networking, National Research Council of Italy, Palermo, 90146 Italy
- Rosalba Giugno
- Department of Computer Science, University of Verona, Verona, 37134 Italy
- Massimo La Rosa
- ICAR-CNR, Institute for high performance computing and networking, National Research Council of Italy, Palermo, 90146 Italy
- Luciano Milanesi
- ITB-CNR, Institute of biomedical technologies, National Research Council of Italy, Segrate (MI), 20090 Italy
- Ulrich Pfeffer
- IRCCS Ospedale Policlinico San Martino, Largo Rosanna Benzi 10, Genova, I-16132 Italy
- Riccardo Rizzo
- ICAR-CNR, Institute for high performance computing and networking, National Research Council of Italy, Palermo, 90146 Italy
- Soo-Yong Shin
- Department of Digital Health, SAIHST, Sungkyunkwan University, Seoul, 03063 South Korea
- Junfeng Xia
- Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601 China
- Alfonso Urso
- ICAR-CNR, Institute for high performance computing and networking, National Research Council of Italy, Palermo, 90146 Italy