1
|
Qi J, Li Z, Zhang YZ, Li G, Gao X, Han R. TDFPS-Designer: an efficient toolkit for barcode design and selection in nanopore sequencing. Genome Biol 2024; 25:285. [PMID: 39497190 PMCID: PMC11533379 DOI: 10.1186/s13059-024-03423-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2023] [Accepted: 10/17/2024] [Indexed: 11/08/2024] Open
Abstract
Oxford Nanopore Technologies (ONT) offers ultrahigh-throughput multi-sample sequencing but only provides barcode kits that enable up to 96-sample multiplexing. We present TDFPS-Designer, a new toolkit for nanopore sequencing barcode design, which creates significantly more barcodes: 137 with a length of 20 base pairs, 410 at 24 bp, and 1779 at 30 bp, far surpassing ONT's offerings. It includes GPU-based acceleration for ultra-fast demultiplexing and designs robust barcodes suitable for high-error ONT data. TDFPS-Designer outperforms current methods, improving the demultiplexing recall rate by 20% relative to Guppy, without a reduction in precision.
Collapse
Affiliation(s)
- Junhai Qi
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
| | - Zhengyi Li
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
| | - Yao-Zhong Zhang
- Division of Health Medical Intelligence, Human Genome Center, The Institute of Medical Science, The University of Tokyo, Minato-ku, Tokyo, 108-8639, Japan
| | - Guojun Li
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China.
| | - Xin Gao
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Makkah, 23955, Saudi Arabia.
| | - Renmin Han
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China.
| |
Collapse
|
2
|
Han R, Qi J, Xue Y, Sun X, Zhang F, Gao X, Li G. HycDemux: a hybrid unsupervised approach for accurate barcoded sample demultiplexing in nanopore sequencing. Genome Biol 2023; 24:222. [PMID: 37798751 PMCID: PMC10552309 DOI: 10.1186/s13059-023-03053-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2022] [Accepted: 09/08/2023] [Indexed: 10/07/2023] Open
Abstract
DNA barcodes enable Oxford Nanopore sequencing to sequence multiple barcoded DNA samples on a single flow cell. DNA sequences with the same barcode need to be grouped together through demultiplexing. As the number of samples increases, accurate demultiplexing becomes difficult. We introduce HycDemux, which incorporates a GPU-parallelized hybrid clustering algorithm that uses nanopore signals and DNA sequences for accurate data clustering, alongside a voting-based module to finalize the demultiplexing results. Comprehensive experiments demonstrate that our approach outperforms unsupervised tools in short sequence fragment clustering and performs more robustly than current state-of-the-art demultiplexing tools for complex multi-sample sequencing data.
Collapse
Affiliation(s)
- Renmin Han
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
| | - Junhai Qi
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
- BioMap Research, California, USA
| | - Yang Xue
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
| | - Xiujuan Sun
- High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
| | - Fa Zhang
- School of Medical Technolgoy, Beijing Institute of Technology, Beijing, 100085, China.
| | - Xin Gao
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal, 23955, Saudi Arabia.
| | - Guojun Li
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China.
| |
Collapse
|
3
|
A Study of Text Vectorization Method Combining Topic Model and Transfer Learning. Processes (Basel) 2022. [DOI: 10.3390/pr10020350] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/04/2023] Open
Abstract
With the development of Internet cloud technology, the scale of data is expanding. Traditional processing methods find it difficult to deal with the problem of information extraction of big data. Therefore, it is necessary to use machine-learning-assisted intelligent processing to extract information from data in order to solve the optimization problem in complex systems. There are many forms of data storage. Among them, text data is an important data type that directly reflects semantic information. Text vectorization is an important concept in natural language processing tasks. Because text data can not be directly used for model parameter training, it is necessary to vectorize the original text data and make it numerical, and then the feature extraction operation can be carried out. The traditional text digitization method is often realized by constructing a bag of words, but the vector generated by this method can not reflect the semantic relationship between words, and it also easily causes the problems of data sparsity and dimension explosion. Therefore, this paper proposes a text vectorization method combining a topic model and transfer learning. Firstly, the topic model is selected to model the text data and extract its keywords, to grasp the main information of the text data. Then, with the help of the bidirectional encoder representations from transformers (BERT) model, which belongs to the pretrained model, model transfer learning is carried out to generate vectors, which are applied to the calculation of similarity between texts. By setting up a comparative experiment, this method is compared with the traditional vectorization method. The experimental results show that the vector generated by the topic-modeling- and transfer-learning-based text vectorization (TTTV) proposed in this paper can obtain better results when calculating the similarity between texts with the same topic, which means that it can more accurately judge whether the contents of the given two texts belong to the same topic.
Collapse
|
4
|
Lin J, Adjeroh DA, Jiang BH, Jiang Y. K2 and K2*: efficient alignment-free sequence similarity measurement based on Kendall statistics. Bioinformatics 2018; 34:1682-1689. [PMID: 29253072 PMCID: PMC6355110 DOI: 10.1093/bioinformatics/btx809] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/08/2017] [Revised: 12/11/2017] [Accepted: 12/14/2017] [Indexed: 11/13/2022] Open
Abstract
Motivation Alignment-free sequence comparison methods can compute the pairwise similarity between a huge number of sequences much faster than sequence-alignment based methods. Results We propose a new non-parametric alignment-free sequence comparison method, called K2, based on the Kendall statistics. Comparing to the other state-of-the-art alignment-free comparison methods, K2 demonstrates competitive performance in generating the phylogenetic tree, in evaluating functionally related regulatory sequences, and in computing the edit distance (similarity/dissimilarity) between sequences. Furthermore, the K2 approach is much faster than the other methods. An improved method, K2*, is also proposed, which is able to determine the appropriate algorithmic parameter (length) automatically, without first considering different values. Comparative analysis with the state-of-the-art alignment-free sequence similarity methods demonstrates the superiority of the proposed approaches, especially with increasing sequence length, or increasing dataset sizes. Availability and implementation The K2 and K2* approaches are implemented in the R language as a package and is freely available for open access (http://community.wvu.edu/daadjeroh/projects/K2/K2_1.0.tar.gz). Contact yueljiang@163.com. Supplementary information Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Jie Lin
- Department of Software engineering, College of Mathematics and
Informatics, Fujian Normal University, Fuzhou, China
| | - Donald A Adjeroh
- Department of Computer Science & Electrical Engineering, West
Virginia University, Morgantown, WV, USA
| | - Bing-Hua Jiang
- Department of Pathology, Carver College of Medicine, The University of
Iowa, Iowa City, IA, USA
| | - Yue Jiang
- Department of Software engineering, College of Mathematics and
Informatics, Fujian Normal University, Fuzhou, China
| |
Collapse
|
5
|
Jia Y, Li H, Wang J, Meng H, Yang Z. Spectrum structures and biological functions of 8-mers in the human genome. Genomics 2018. [PMID: 29522801 DOI: 10.1016/j.ygeno.2018.03.006] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
The spectra of k-mer frequencies can reveal the structures and evolution of genome sequences. We confirmed that the trimodal spectrum of 8-mers in human genome sequences is distinguished only by CG2, CG1 and CG0 8-mer sets, containing 2,1 or 0 CpG, respectively. This phenomenon is called independent selection law. The three types of CG 8-mers were considered as different functional elements. We conjectured that (1) nucleosome binding motifs are mainly characterized by CG1 8-mers and (2) the core structural units of CpG island sequences are predominantly characterized by CG2 8-mers. To validate our conjectures, nucleosome occupied sequences and CGI sequences were extracted, then the sequence parameters were constructed through the information of the three CG 8-mer sets respectively. ROC analysis showed that CG1 8-mers are more preference in nucleosome occupied segments (AUC > 0.7) and CG2 8-mers are more preference in CGI sequences (AUC > 0.99). This validates our conjecture in principle.
Collapse
Affiliation(s)
- Yun Jia
- Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China; College of Science, Inner Mongolia University of Technology, Hohhot 010051, China
| | - Hong Li
- Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China.
| | - Jingfeng Wang
- College of Science, Inner Mongolia University of Technology, Hohhot 010051, China
| | - Hu Meng
- Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
| | - Zhenhua Yang
- Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
| |
Collapse
|
6
|
Wen J, Zhang Y, Yau SS. k-mer Sparse matrix model for genetic sequence and its applications in sequence comparison. J Theor Biol 2014; 363:145-50. [DOI: 10.1016/j.jtbi.2014.08.028] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2014] [Revised: 07/14/2014] [Accepted: 08/17/2014] [Indexed: 10/24/2022]
|
7
|
A measure of DNA sequence similarity by Fourier Transform with applications on hierarchical clustering. J Theor Biol 2014; 359:18-28. [DOI: 10.1016/j.jtbi.2014.05.043] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2014] [Revised: 04/22/2014] [Accepted: 05/29/2014] [Indexed: 12/24/2022]
|
8
|
Bao J, Yuan R, Bao Z. An improved alignment-free model for DNA sequence similarity metric. BMC Bioinformatics 2014; 15:321. [PMID: 25261973 PMCID: PMC4261891 DOI: 10.1186/1471-2105-15-321] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2013] [Accepted: 09/23/2014] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND DNA Clustering is an important technology to automatically find the inherent relationships on a large scale of DNA sequences. But the DNA clustering quality can still be improved greatly. The DNA sequences similarity metric is one of the key points of clustering. The alignment-free methodology is a very popular way to calculate DNA sequence similarity. It normally converts a sequence into a feature space based on words' probability distribution rather than directly matches strings. Existing alignment-free models, e.g. k-tuple, merely employ word frequency information and ignore many types of useful information contained in the DNA sequence, such as classifications of nucleotide bases, position and the like. It is believed that the better data mining results can be achieved with compounded information. Therefore, we present a new alignment-free model that employs compounded information to improve the DNA clustering quality. RESULTS This paper proposes a Category-Position-Frequency (CPF) model, which utilizes the word frequency, position and classification information of nucleotide bases from DNA sequences. The CPF model converts a DNA sequence into three sequences according to the categories of nucleotide bases, and then yields a 12-dimension feature vector. The feature values are computed by an entropy based model that takes both local word frequency and position information into account. We conduct DNA clustering experiments on several datasets and compare with some mainstream alignment-free models for evaluation, including k-tuple, DMk, TSM, AMI and CV. The experiments show that CPF model is superior to other models in terms of the clustering results and optimal settings. CONCLUSIONS The following conclusions can be drawn from the experiments. (1) The hybrid information model is better than the model based on word frequency only. (2) For DNA sequences no more than 5000 characters, the preferred size of sliding windows for CPF is two which provides a great advantage to promote system performance. (3) The CPF model is able to obtain an efficient stable performance and broad generalization.
Collapse
Affiliation(s)
- Junpeng Bao
- Department of Computer Science and Technology Xi’an Jiaotong University, West Xianning Road, 710049 Xi’an, P.R. China
| | - Ruiyu Yuan
- Department of Computer Science and Technology Xi’an Jiaotong University, West Xianning Road, 710049 Xi’an, P.R. China
| | - Zhe Bao
- Department of Computer Science and Technology Xi’an Jiaotong University, West Xianning Road, 710049 Xi’an, P.R. China
| |
Collapse
|
9
|
Shida F, Mizuta S. Measurement of word frequencies in genomic DNA sequences based on partial alignment and fuzzy set. J Bioinform Comput Biol 2014; 12:1450019. [PMID: 25152044 DOI: 10.1142/s021972001450019x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
Accompanied with the rapid increase of the amount of data registered in the databases of biological sequences, the need for a fast method of sequence comparison applicable to sequences of large size is also increasing. In general, alignment is used for sequence comparison. However, the alignment may not be appropriate for comparison of sequences of large size such as whole genome sequences due to its large time complexity. In this article, we propose a semi alignment-free method of sequence comparison based on word frequency distributions, in which we partially use the alignment to measure word frequencies along with the idea of fuzzy set theory. Experiments with ten bacterial genome sequences demonstrated that the fuzzy measurements has the effect that facilitates discrimination between close relatives and distant relatives.
Collapse
Affiliation(s)
- Fumiya Shida
- Graduate School of Science and Technology, Hirosaki University, 3 Bunkyo-cho, Hirosaki, Aomori 036-8561, Japan
| | | |
Collapse
|
10
|
Gunasinghe U, Alahakoon D, Bedingfield S. Extraction of high quality k-words for alignment-free sequence comparison. J Theor Biol 2014; 358:31-51. [PMID: 24846728 DOI: 10.1016/j.jtbi.2014.05.016] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2013] [Revised: 05/04/2014] [Accepted: 05/07/2014] [Indexed: 10/25/2022]
Abstract
The weighted Euclidean distance (D(2)) is one of the earliest dissimilarity measures used for alignment free comparison of biological sequences. This distance measure and its variants have been used in numerous applications due to its fast computation, and many variants of it have been subsequently introduced. The D(2) distance measure is based on the count of k-words in the two sequences that are compared. Traditionally, all k-words are compared when computing the distance. In this paper we show that similar accuracy in sequence comparison can be achieved by using a selected subset of k-words. We introduce a term variance based quality measure for identifying the important k-words. We demonstrate the application of the proposed technique in phylogeny reconstruction and show that up to 99% of the k-words can be filtered out for certain datasets, resulting in faster sequence comparison. The paper also presents an exploratory analysis based evaluation of optimal k-word values and discusses the impact of using subsets of k-words in such optimal instances.
Collapse
Affiliation(s)
- Upuli Gunasinghe
- Clayton School of Information Technology, Faculty of Information Technology, Monash University, VIC 3800, Australia.
| | - Damminda Alahakoon
- School of Information and Business Analytics, Deakin University, VIC 3125, Australia.
| | - Susan Bedingfield
- Clayton School of Information Technology, Faculty of Information Technology, Monash University, VIC 3800, Australia.
| |
Collapse
|
11
|
El-Lakkani A, El-Sherif S. Similarity analysis of protein sequences based on 2D and 3D amino acid adjacency matrices. Chem Phys Lett 2013. [DOI: 10.1016/j.cplett.2013.10.032] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
|
12
|
Yang X, Wang T. Linear regression model of short k-word: a similarity distance suitable for biological sequences with various lengths. J Theor Biol 2013; 337:61-70. [PMID: 23933105 DOI: 10.1016/j.jtbi.2013.07.028] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2013] [Revised: 07/02/2013] [Accepted: 07/30/2013] [Indexed: 01/10/2023]
Abstract
Originating from sequences' length difference, both k-word based methods and graphical representation approaches have uncovered biological information in their distinct ways. However, it is less likely that the mechanisms of information storage vary with sequences' length. A similarity distance suitable for sequences with various lengths will be much near to the mechanisms of information storage. In this paper, new sub-sequences of k-word were extracted from biological sequences under a one-to-one mapping. The new sub-sequences were evaluated by a linear regression model. Moreover, a new distance was defined on the invariants from the linear regression model. With comparison to other alignment-free distances, the results of four experiments demonstrated that our similarity distance was more efficient.
Collapse
Affiliation(s)
- Xiwu Yang
- School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning 116024, PR China; School of Mathematics, Liaoning Normal University, Dalian, Liaoning 116029, PR China.
| | - Tianming Wang
- School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning 116024, PR China
| |
Collapse
|
13
|
Function-based classification of carbohydrate-active enzymes by recognition of short, conserved peptide motifs. Appl Environ Microbiol 2013; 79:3380-91. [PMID: 23524681 DOI: 10.1128/aem.03803-12] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022] Open
Abstract
Functional prediction of carbohydrate-active enzymes is difficult due to low sequence identity. However, similar enzymes often share a few short motifs, e.g., around the active site, even when the overall sequences are very different. To exploit this notion for functional prediction of carbohydrate-active enzymes, we developed a simple algorithm, peptide pattern recognition (PPR), that can divide proteins into groups of sequences that share a set of short conserved sequences. When this method was used on 118 glycoside hydrolase 5 proteins with 9% average pairwise identity and representing four characterized enzymatic functions, 97% of the proteins were sorted into groups correlating with their enzymatic activity. Furthermore, we analyzed 8,138 glycoside hydrolase 13 proteins including 204 experimentally characterized enzymes with 28 different functions. There was a 91% correlation between group and enzyme activity. These results indicate that the function of carbohydrate-active enzymes can be predicted with high precision by finding short, conserved motifs in their sequences. The glycoside hydrolase 61 family is important for fungal biomass conversion, but only a few proteins of this family have been functionally characterized. Interestingly, PPR divided 743 glycoside hydrolase 61 proteins into 16 subfamilies useful for targeted investigation of the function of these proteins and pinpointed three conserved motifs with putative importance for enzyme activity. Furthermore, the conserved sequences were useful for cloning of new, subfamily-specific glycoside hydrolase 61 proteins from 14 fungi. In conclusion, identification of conserved sequence motifs is a new approach to sequence analysis that can predict carbohydrate-active enzyme functions with high precision.
Collapse
|
14
|
Wei D, Jiang Q, Wei Y, Wang S. A novel hierarchical clustering algorithm for gene sequences. BMC Bioinformatics 2012; 13:174. [PMID: 22823405 PMCID: PMC3443659 DOI: 10.1186/1471-2105-13-174] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2011] [Accepted: 06/30/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Clustering DNA sequences into functional groups is an important problem in bioinformatics. We propose a new alignment-free algorithm, mBKM, based on a new distance measure, DMk, for clustering gene sequences. This method transforms DNA sequences into the feature vectors which contain the occurrence, location and order relation of k-tuples in DNA sequence. Afterwards, a hierarchical procedure is applied to clustering DNA sequences based on the feature vectors. RESULTS The proposed distance measure and clustering method are evaluated by clustering functionally related genes and by phylogenetic analysis. This method is also compared with BlastClust, CD-HIT-EST and some others. The experimental results show our method is effective in classifying DNA sequences with similar biological characteristics and in discovering the underlying relationship among the sequences. CONCLUSIONS We introduced a novel clustering algorithm which is based on a new sequence similarity measure. It is effective in classifying DNA sequences with similar biological characteristics and in discovering the relationship among the sequences.
Collapse
Affiliation(s)
- Dan Wei
- Cognitive Science Department & Fujian Key Laboratory of the Brain-like Intelligent Systems, Xiamen University, Xiamen, China
| | | | | | | |
Collapse
|
15
|
Yu HJ, Huang DS. Novel 20-D descriptors of protein sequences and it’s applications in similarity analysis. Chem Phys Lett 2012. [DOI: 10.1016/j.cplett.2012.02.030] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
|
16
|
Liu L, Li D, Bai F. A relative Lempel-Ziv complexity: Application to comparing biological sequences. Chem Phys Lett 2012; 530:107-112. [PMID: 32226089 PMCID: PMC7094452 DOI: 10.1016/j.cplett.2012.01.061] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/19/2011] [Accepted: 01/24/2012] [Indexed: 11/17/2022]
Abstract
One of the main tasks in biological sequence analysis is biological sequence comparison. Numerous efficient methods have been developed for sequence comparison. Traditional sequence comparison is based on sequence alignment. In this report, we propose a novel alignment-free method based on the relative Lempel-Ziv complexity to compare biological sequences. The vertebrate transferring genomes and the spike protein sequences are prepared and tested to evaluate the validity of the method. We use this method to build phylogenetic tree of two groups of the sequences. The result demonstrates that our method is powerful and efficient.
Collapse
Affiliation(s)
- Liwei Liu
- College of Science, Dalian Jiaotong University, Dalian 116028, PR China
| | - Dongbo Li
- Department of Otolaryngology, Affiliated Xinhua Hospital of Dalian University, Dalian 116021, PR China
| | - Fenglan Bai
- College of Science, Dalian Jiaotong University, Dalian 116028, PR China
| |
Collapse
|