1
|
Hung CL, Chen CC. Computational Approaches for Drug Discovery. Drug Dev Res 2014; 75:412-8. [PMID: 25195585 DOI: 10.1002/ddr.21222] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.0] [Indexed: 12/14/2022] [Imported: 05/17/2025]
|
|
11 |
55 |
2
|
Hung CL, Lin YL. Implementation of a parallel protein structure alignment service on cloud. Int J Genomics 2013; 2013:439681. [PMID: 23671842 PMCID: PMC3647543 DOI: 10.1155/2013/439681] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 01/25/2013] [Accepted: 02/20/2013] [Indexed: 12/20/2022] [Imported: 05/17/2025] Open
Abstract
Protein structure alignment has become an important strategy by which to identify evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. This service includes a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm refines the result of alignment. To process vast numbers of protein structures in parallel, the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform.
|
research-article |
12 |
20 |
3
|
Hung CL, Lin YS, Lin CY, Chung YC, Chung YF. CUDA ClustalW: An efficient parallel algorithm for progressive multiple sequence alignment on Multi-GPUs. Comput Biol Chem 2015; 58:62-68. [PMID: 26052076 DOI: 10.1016/j.compbiolchem.2015.05.004] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 11/13/2014] [Revised: 05/14/2015] [Accepted: 05/14/2015] [Indexed: 10/23/2022] [Imported: 05/17/2025]
Abstract
For biological applications, sequence alignment is an important strategy for analyzing DNA and protein sequences. Multiple sequence alignment is an essential methodology for studying biological data, with uses such as homology modeling and phylogenetic reconstruction. However, multiple sequence alignment is an NP-hard problem. In the past decades, the progressive approach has been proposed to align multiple sequences successfully by adopting iterative pairwise alignments. Owing to the rapid growth of next generation sequencing technologies, a large number of sequences can be produced in a short period of time. When the problem instance is large, progressive alignment is time consuming. Parallel computing is a suitable solution for such applications, and the GPU is one of the important architectures in contemporary parallel computing research. Therefore, we propose a GPU version of ClustalW v2.0.11, called CUDA ClustalW v1.0, in this work. The experimental results show that CUDA ClustalW v1.0 achieves more than a 33× speedup in overall execution time compared with ClustalW v2.0.11.
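The pairwise alignments that the progressive approach iterates over can be illustrated with a minimal sketch. The code below computes global alignment scores with the classic Needleman-Wunsch recurrence and then scores every sequence pair, which is the kind of independent, all-pairs workload that lends itself to parallel hardware. It is not the CUDA ClustalW implementation: the function names, the linear gap penalty, and the match/mismatch values are illustrative assumptions (ClustalW itself uses substitution matrices and affine gap penalties).

```python
from itertools import combinations


def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score of strings a and b (linear gap penalty, assumed scoring)."""
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against the empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def pairwise_scores(seqs):
    """Score every unordered pair; each pair is independent of the others."""
    return {
        (i, j): needleman_wunsch_score(seqs[i], seqs[j])
        for i, j in combinations(range(len(seqs)), 2)
    }
```

Because no pair depends on any other, the dictionary comprehension in `pairwise_scores` could in principle be distributed across threads, processes, or GPU blocks without changing the result.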
|
|
10 |
18 |
4
|
Hung CL, Hua GJ. Cloud computing for protein-ligand binding site comparison. BIOMED RESEARCH INTERNATIONAL 2013; 2013:170356. [PMID: 23762824 PMCID: PMC3671236 DOI: 10.1155/2013/170356] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 12/18/2012] [Accepted: 03/28/2013] [Indexed: 12/30/2022] [Imported: 05/17/2025]
Abstract
The proteome-wide analysis of protein-ligand binding sites and their interactions with ligands is important in structure-based drug design and in understanding ligand cross reactivity and toxicity. The well-known and commonly used software, SMAP, has been designed for 3D ligand binding site comparison and similarity searching of a structural proteome. SMAP can also predict drug side effects and reassign existing drugs to new indications. However, the computing scale of SMAP is limited. We have developed a high availability, high performance system that expands the comparison scale of SMAP. This cloud computing service, called Cloud-PLBS, combines the SMAP and Hadoop frameworks and is deployed on a virtual cloud computing platform. To handle the vast amount of experimental data on protein-ligand binding site pairs, Cloud-PLBS exploits the MapReduce paradigm as a management and parallelizing tool. Cloud-PLBS provides a web portal and scalability through which biologists can address a wide range of computer-intensive questions in biology and drug discovery.
|
Comparative Study |
12 |
17 |
5
|
Hung CL, Hua GJ. Local alignment tool based on Hadoop framework and GPU architecture. BIOMED RESEARCH INTERNATIONAL 2014; 2014:541490. [PMID: 24955362 PMCID: PMC4052794 DOI: 10.1155/2014/541490] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 01/30/2014] [Accepted: 04/14/2014] [Indexed: 11/17/2022] [Imported: 05/17/2025]
Abstract
With the rapid growth of next generation sequencing technologies, such as Slex, more and more data have been discovered and published. Computational performance is therefore an important issue when analyzing such large volumes of data. Recently, many tools, such as SOAP, have been implemented on Hadoop and GPU parallel computing architectures. BLASTP is an important tool, implemented on GPU architectures, that biologists use to compare protein sequences. When dealing with big biological data, it is hard to rely on a single GPU. Therefore, we implement a distributed BLASTP by combining Hadoop and multiple GPUs. The experimental results show that the proposed method improves on the performance of BLASTP on a single GPU and also achieves high availability and fault tolerance.
|
research-article |
11 |
8 |
6
|
Hung CL, Lin CY. Open reading frame phylogenetic analysis on the cloud. Int J Genomics 2013; 2013:614923. [PMID: 23671843 PMCID: PMC3647537 DOI: 10.1155/2013/614923] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 12/28/2012] [Accepted: 02/23/2013] [Indexed: 02/01/2023] [Imported: 05/17/2025] Open
Abstract
Phylogenetic analysis has become essential in researching the evolutionary relationships between viruses. These relationships are depicted on phylogenetic trees, in which viruses are grouped based on sequence similarity. Viral evolutionary relationships are identified from open reading frames rather than from complete sequences. Recently, cloud computing has become popular for developing internet-based bioinformatics tools. Biocloud is an efficient, scalable, and robust bioinformatics computing service. In this paper, we propose a cloud-based open reading frame phylogenetic analysis service. The proposed service integrates the Hadoop framework, virtualization technology, and phylogenetic analysis methods to provide a high-availability, large-scale bioservice. In a case study, we analyze the phylogenetic relationships among Norovirus. Evolutionary relationships are elucidated by aligning different open reading frame sequences. The proposed platform correctly identifies the evolutionary relationships between members of Norovirus.
|
research-article |
12 |
7 |
7
|
Hung CL, Chen WP, Hua GJ, Zheng H, Tsai SJJ, Lin YL. Cloud computing-based TagSNP selection algorithm for human genome data. Int J Mol Sci 2015; 16:1096-110. [PMID: 25569088 PMCID: PMC4307292 DOI: 10.3390/ijms16011096] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 09/16/2014] [Accepted: 12/04/2014] [Indexed: 12/31/2022] [Imported: 05/17/2025] Open
Abstract
Single nucleotide polymorphisms (SNPs) play a fundamental role in human genetic variation and are used in medical diagnostics, phylogeny construction, and drug design. They provide the highest-resolution genetic fingerprint for identifying disease associations and human features. Haplotypes are regions of linked genetic variants that are closely spaced on the genome and tend to be inherited together. Genetics research has revealed SNPs within certain haplotype blocks that introduce few distinct common haplotypes into most of the population. Haplotype block structures are used in association-based methods to map disease genes. In this paper, we propose an efficient algorithm for identifying haplotype blocks in the genome. In chromosomal haplotype data retrieved from the HapMap project website, the proposed algorithm identified longer haplotype blocks than an existing algorithm. To enhance its performance, we extended the proposed algorithm into a parallel algorithm that copies data in parallel via the Hadoop MapReduce framework. The proposed MapReduce-paralleled combinatorial algorithm performed well on real-world data obtained from the HapMap dataset; the improvement in computational efficiency was proportional to the number of processors used.
|
Research Support, Non-U.S. Gov't |
10 |
6 |
8
|
|
Editorial |
5 |
4 |
9
|
Hung CL, Lin CY, Wang HH. An efficient parallel-network packet pattern-matching approach using GPUs. JOURNAL OF SYSTEMS ARCHITECTURE 2014; 60:431-439. [DOI: 10.1016/j.sysarc.2014.01.007] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 05/17/2025] [Imported: 05/17/2025]
|
|
11 |
4 |
10
|
Hung CL, Lin CY, Wu PC. An Efficient GPU-Based Multiple Pattern Matching Algorithm for Packet Filtering. JOURNAL OF SIGNAL PROCESSING SYSTEMS 2017; 86:347-358. [DOI: 10.1007/s11265-016-1139-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 05/17/2025] [Imported: 05/17/2025]
|
|
8 |
3 |
11
|
Hung CL, Lee C, Lin CY, Chang CH, Chung YC, Yi Tang C. Feature amplified voting algorithm for functional analysis of protein superfamily. BMC Genomics 2010; 11 Suppl 3:S14. [PMID: 21143781 PMCID: PMC2999344 DOI: 10.1186/1471-2164-11-s3-s14] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022] [Imported: 05/17/2025] Open
Abstract
BACKGROUND Identifying the regions associated with protein function is a singularly important task in the post-genomic era. Biological studies often identify functional enzyme residues by amino acid sequences, particularly when related structural information is unavailable. In some cases of protein superfamilies, functional residues are difficult to detect by current alignment tools or evolutionary strategies when phylogenetic relationships do not parallel their protein functions. The solution proposed in this study is the Feature Amplified Voting Algorithm with Three-profile alignment (FAVAT). The core concept of FAVAT is to reveal the desired features of a target enzyme or protein by voting on three different property groups aligned by the three-profile alignment method. Functional residues of a target protein can then be retrieved by FAVAT analysis. In this study, the amidohydrolase superfamily was an interesting case for verifying the proposed approach because it contains divergent enzymes and proteins. RESULTS FAVAT was used to identify critical residues of mammalian imidase, a member of the amidohydrolase superfamily. Members of this superfamily were first classified by their functional properties and source organisms. After FAVAT analysis, candidate residues were identified and compared to a bacterial hydantoinase whose crystal structure (1GKQ) has been fully elucidated. One modified lysine, three histidines and one aspartate were found to participate in the coordination of metal ions in the active site. The FAVAT analysis also redressed the misrecognition of the metal coordinator Asp57 by the multiple sequence alignment (MSA) method. Several other amino acid residues known to be related to the function or structure of mammalian imidase were also identified. CONCLUSIONS FAVAT is shown to predict functionally important amino acids in the amidohydrolase superfamily. This strategy effectively identifies functionally important residues by analyzing the discrepancy between the sequence and functional properties of related proteins in a superfamily, and it should be applicable to other protein families.
|
Research Support, Non-U.S. Gov't |
15 |
2 |
12
|
Hung C, Wu Y. GPU‐based parallel fuzzy c‐mean clustering model via genetic algorithm. CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE 2016; 28:4277-4290. [DOI: 10.1002/cpe.3731] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/17/2025] [Imported: 05/17/2025]
Abstract
Summary: Detection of white matter changes in brain tissue using magnetic resonance imaging has been an increasingly active and challenging research area in computational neuroscience. A genetic algorithm based on a fuzzy c‐mean clustering method (GAFCM) was applied to simulated images to separate foreground spot signal information from the background, and the results were compared. The strength of this algorithm was tested by evaluating the segmentation matching factor, coefficient of determination, concordance correlation, and gene expression values. The experimental results demonstrated that the segmentation ability of GAFCM was better than that of fuzzy c‐means and K‐means algorithms. However, GAFCM is computationally expensive. This study presents a new GPU‐based parallel GAFCM algorithm to improve the performance of GAFCM. The experimental results show that computational performance can be increased by a factor of approximately 20 over the CPU‐based GAFCM algorithm while maintaining the quality of the processed images. Thus, the proposed GPU‐based parallel GAFCM algorithm can achieve the same results and significantly decrease processing time. Copyright © 2015 John Wiley & Sons, Ltd.
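For orientation, the fuzzy c-means step that GAFCM builds on can be sketched as follows. This is plain fuzzy c-means on one-dimensional intensity values, written in pure Python; it is not the genetic-algorithm variant or the GPU implementation described in the abstract, and the function names, the fuzzifier `m=2.0`, and the fixed iteration count are assumptions made for illustration.

```python
def fcm_memberships(values, centers, m=2.0):
    """Membership of each value in each cluster; every row sums to 1."""
    memberships = []
    for v in values:
        # Clamp distances so a value sitting exactly on a center does not divide by zero.
        dists = [max(abs(v - c), 1e-12) for c in centers]
        inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
        total = sum(inv)
        memberships.append([w / total for w in inv])
    return memberships


def fcm_centers(values, memberships, m=2.0):
    """Each center is the mean of the values weighted by membership**m."""
    centers = []
    for j in range(len(memberships[0])):
        weights = [u[j] ** m for u in memberships]
        centers.append(sum(w * v for w, v in zip(weights, values)) / sum(weights))
    return centers


def fuzzy_c_means(values, centers, m=2.0, iterations=50):
    """Alternate the two updates for a fixed number of iterations (assumed stopping rule)."""
    for _ in range(iterations):
        centers = fcm_centers(values, fcm_memberships(values, centers, m), m)
    return centers, fcm_memberships(values, centers, m)
```

The membership update treats every value independently, which is the part of the computation that maps most naturally onto one GPU thread per pixel.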
|
|
9 |
1 |
13
|
Hung C, Lin C, Ou C, Tseng Y, Hung P, Li S, Fu C. Efficient bit‐parallel subcircuit extraction using CUDA. CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE 2016; 28:4326-4338. [DOI: 10.1002/cpe.3732] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/17/2025] [Imported: 05/17/2025]
Abstract
Summary: Wafer processing technology has been improving rapidly. Moore's law has been exceeded: the number of transistors in a dense integrated circuit now increases threefold or more approximately every year. The integrated circuit has gone from very large scale to giga large scale. The extraction of subcircuits has therefore become computation‐intensive. In this paper, we propose an efficient bit‐parallel subcircuit extraction algorithm using graphics processing units. We conducted experimental trials and demonstrated that the proposed algorithm can achieve high throughput, suggesting practical applications in the extraction of subcircuits. Copyright © 2015 John Wiley & Sons, Ltd.
|
|
9 |
1 |
14
|
Hung CL, Lin CY. Efficient parallelised search engine based on virtual cluster. INTERNATIONAL JOURNAL OF COMPUTATIONAL SCIENCE AND ENGINEERING 2016; 12:53. [DOI: 10.1504/ijcse.2016.074557] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/17/2025] [Imported: 05/17/2025]
|
|
9 |
1 |
15
|
Hung CL, Lin KH, Lee YK, Mrozek D, Tsai YT, Lin CH. The classification of stages of epiretinal membrane using convolutional neural network on optical coherence tomography image. Methods 2023; 214:28-34. [PMID: 37116670 DOI: 10.1016/j.ymeth.2023.04.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2022] [Revised: 03/18/2023] [Accepted: 04/22/2023] [Indexed: 04/30/2023] [Imported: 05/17/2025] Open
Abstract
BACKGROUND AND OBJECTIVE The gold standard for diagnosing epiretinal membranes is to observe the surface of the internal limiting membrane on optical coherence tomography images. The stage of the epiretinal membrane is used to assess the condition of the membrane. The stages are difficult to distinguish because some of them are similar. Deep-learning technology can be used to improve the accuracy of stage classification. METHODS A combinatorial fusion of multiple convolutional neural network (CNN) algorithms is proposed to enhance the accuracy of a single image classification model. The proposed method was trained using a dataset of 1947 optical coherence tomography images diagnosed with epiretinal membrane at the Taichung Veterans General Hospital in Taiwan. The images covered four stages: stages 1, 2, 3, and 4. RESULTS The overall accuracy of the classification was 84%. Combinations of five and of six CNN models achieved the highest testing accuracy (85%) among all combinations. Any combination of multiple CNN models outperformed any single CNN algorithm working alone. Meanwhile, the accuracy of the proposed method is better than that of ophthalmologists with years of clinical experience. CONCLUSIONS We have developed an efficient epiretinal membrane classification method by using combinatorial fusion with CNN models on optical coherence tomography images. The proposed method can be used for screening purposes to help ophthalmologists make correct diagnoses in general medical practice.
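One simple way to combine several classifiers, shown here only as a sketch, is to average their per-class scores and pick the class with the highest fused score, then repeat this for every subset of models. The abstract does not specify the fusion rule the authors used, so the averaging rule, the function names, and the example scores below are all assumptions rather than the published method.

```python
from itertools import combinations


def fuse_scores(model_scores):
    """Average the per-class scores of several models (assumed fusion rule)."""
    n_classes = len(model_scores[0])
    return [sum(s[k] for s in model_scores) / len(model_scores) for k in range(n_classes)]


def predict(model_scores):
    """Index of the class with the highest fused score."""
    fused = fuse_scores(model_scores)
    return max(range(len(fused)), key=fused.__getitem__)


def model_subsets(n_models, min_size=2):
    """Every combination of at least min_size models, as tuples of model indices."""
    return [
        subset
        for size in range(min_size, n_models + 1)
        for subset in combinations(range(n_models), size)
    ]
```

For example, with three hypothetical models scoring four stages, `predict([[0.6, 0.4, 0.0, 0.0], [0.1, 0.7, 0.1, 0.1], [0.2, 0.5, 0.2, 0.1]])` selects index 1, and `model_subsets(3)` enumerates the three pairs and the one triple that could each be evaluated on a held-out test set.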
|
|
2 |
|
16
|
Hung CL, Wu YH. Parallel genetic-based algorithm on multiple embedded graphic processing units for brain magnetic resonance imaging segmentation. COMPUTERS & ELECTRICAL ENGINEERING 2017; 61:373-383. [DOI: 10.1016/j.compeleceng.2016.09.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/17/2025] [Imported: 05/17/2025]
|
|
8 |
|
17
|
Hung CL, Magoulès F, Qiu M, Hsu RC, Lin CY. Embedded multi-core computing and applications. THE JOURNAL OF SUPERCOMPUTING 2017; 73:3327-3332. [DOI: 10.1007/s11227-017-2107-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/17/2025] [Imported: 05/17/2025]
|
|
8 |
|
18
|
Hung CL. Editorial: Computational Methods for Drug Aid Design. Comb Chem High Throughput Screen 2018; 21:72-73. [PMID: 29722642 DOI: 10.2174/138620732102180417142214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022] [Imported: 05/17/2025]
|
|
7 |
|
19
|
Hung CL, Lin CY, Chang SC, Chung YC, Hsieh SJ, Tang CY, Lin YL. Multiple genome sequences alignment algorithm based on coding regions. INTERNATIONAL JOURNAL OF COMPUTATIONAL BIOLOGY AND DRUG DESIGN 2011; 4:165-178. [PMID: 21712566 DOI: 10.1504/ijcbdd.2011.041009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023] [Imported: 05/17/2025]
Abstract
Multiple Sequence Alignment (MSA) is a computational biology tool that facilitates the study of DNA homology, phylogeny determination and conserved motifs. Many MSA methods have been presented that align protein, DNA, and RNA sequences successfully, but not coding region sequences. Therefore, we propose a heuristic alignment method, CORAL-M, for multiple genome sequences, especially for coding regions. CORAL-M adopts a codon-based probabilistic filtration model and the local optimal alignment solution to align multiple genome sequences in linear time. The experimental results, obtained by aligning Enterovirus strains, show that CORAL-M can find more potential functional sites than other commonly used tools.
|
|
14 |
|
20
|
Hung CL, Guo SW. Fast Parallel Network Packet Filter System based on CUDA. INTERNATIONAL JOURNAL OF NETWORKED AND DISTRIBUTED COMPUTING 2014; 2:198. [DOI: 10.2991/ijndc.2014.2.4.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/17/2025] [Imported: 05/17/2025]
|
|
11 |
|