1
|
Eldele E, Ragab M, Chen Z, Wu M, Kwoh CK, Li X, Guan C. Self-Supervised Contrastive Representation Learning for Semi-Supervised Time-Series Classification. IEEE Trans Pattern Anal Mach Intell 2023; 45:15604-15618. [PMID: 37639415 DOI: 10.1109/tpami.2023.3308189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/31/2023]
Abstract
Learning time-series representations when only unlabeled data or few labeled samples are available can be a challenging task. Recently, contrastive self-supervised learning has shown great improvement in extracting useful representations from unlabeled data via contrasting different augmented views of data. In this work, we propose a novel Time-Series representation learning framework via Temporal and Contextual Contrasting (TS-TCC) that learns representations from unlabeled data with contrastive learning. Specifically, we propose time-series-specific weak and strong augmentations and use their views to learn robust temporal relations in the proposed temporal contrasting module, besides learning discriminative representations by our proposed contextual contrasting module. Additionally, we conduct a systematic study of time-series data augmentation selection, which is a key part of contrastive learning. We also extend TS-TCC to the semi-supervised learning settings and propose a Class-Aware TS-TCC (CA-TCC) that benefits from the available few labeled data to further improve representations learned by TS-TCC. Specifically, we leverage the robust pseudo labels produced by TS-TCC to realize a class-aware contrastive loss. Extensive experiments show that the linear evaluation of the features learned by our proposed framework performs comparably with the fully supervised training. Additionally, our framework shows high efficiency in few labeled data and transfer learning scenarios.
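The idea of contrasting two augmented views of the same sample can be illustrated with a short sketch. The abstract does not state which loss TS-TCC uses, so the code below assumes the NT-Xent loss popularised by SimCLR; the function name `nt_xent_loss`, the temperature value, and the random inputs are all hypothetical and are not taken from the paper.

```python
import numpy as np

def nt_xent_loss(z_weak, z_strong, temperature=0.2):
    """NT-Xent loss for a batch of paired views (assumed loss, not the paper's).

    z_weak, z_strong: arrays of shape (N, d); row i of each is a different
    augmented view of the same underlying sample. Rows must be non-zero.
    """
    n = z_weak.shape[0]
    z = np.concatenate([z_weak, z_strong], axis=0)       # (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)     # unit-normalise rows
    sim = z @ z.T / temperature                          # scaled cosine similarity
    np.fill_diagonal(sim, -np.inf)                       # a view is never its own pair
    # The positive for row i is the other view of the same sample.
    pos_idx = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable log-softmax over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos_idx].mean()

rng = np.random.default_rng(0)
anchor = rng.normal(size=(8, 16))
aligned = nt_xent_loss(anchor, anchor + 0.01 * rng.normal(size=(8, 16)))
unrelated = nt_xent_loss(anchor, rng.normal(size=(8, 16)))
print(aligned < unrelated)
```

The loss is low when the two views of each sample sit close together in embedding space and far from the other samples in the batch, which is the behaviour a contrastive objective rewards.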
|
2
|
Eldele E, Ragab M, Chen Z, Wu M, Kwoh CK, Li X. Self-supervised Learning for Label-Efficient Sleep Stage Classification: A Comprehensive Evaluation. IEEE Trans Neural Syst Rehabil Eng 2023; PP. [PMID: 37022869 DOI: 10.1109/tnsre.2023.3245285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2023]
Abstract
The past few years have witnessed a remarkable advance in deep learning for EEG-based sleep stage classification (SSC). However, the success of these models is attributed to possessing a massive amount of labeled data for training, limiting their applicability in real-world scenarios. In such scenarios, sleep labs can generate a massive amount of data, but labeling can be expensive and time-consuming. Recently, the self-supervised learning (SSL) paradigm has emerged as one of the most successful techniques to overcome labels' scarcity. In this paper, we evaluate the efficacy of SSL to boost the performance of existing SSC models in the few-labels regime. We conduct a thorough study on three SSC datasets, and we find that fine-tuning the pretrained SSC models with only 5% of labeled data can achieve competitive performance to the supervised training with full labels. Moreover, self-supervised pretraining helps SSC models to be more robust to data imbalance and domain shift problems.
Affiliation(s)
- Emadeldeen Eldele: School of Computer Science and Engineering, Nanyang Technological University, Singapore
- Mohamed Ragab: Institute for Infocomm Research (I²R) and the Centre for Frontier AI Research (CFAR), Agency for Science, Technology and Research (A*STAR), Singapore
- Zhenghua Chen: Institute for Infocomm Research (I²R) and the Centre for Frontier AI Research (CFAR), Agency for Science, Technology and Research (A*STAR), Singapore
- Min Wu: Institute for Infocomm Research (I²R), Agency for Science, Technology and Research (A*STAR), Singapore
- Chee-Keong Kwoh: School of Computer Science and Engineering, Nanyang Technological University, Singapore
- Xiaoli Li: School of Computer Science and Engineering, Nanyang Technological University, Singapore
|
3
|
Huang D, Wang CD, Lai JH, Kwoh CK. Toward Multidiversified Ensemble Clustering of High-Dimensional Data: From Subspaces to Metrics and Beyond. IEEE Trans Cybern 2022; 52:12231-12244. [PMID: 33961570 DOI: 10.1109/tcyb.2021.3049633] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 06/12/2023]
Abstract
The rapid emergence of high-dimensional data in various areas has brought new challenges to current ensemble clustering research. To deal with the curse of dimensionality, considerable efforts have recently been made in ensemble clustering by means of different subspace-based techniques. However, besides the emphasis on subspaces, rather limited attention has been paid to the potential diversity in similarity/dissimilarity metrics. It remains a surprisingly open problem in ensemble clustering how to create and aggregate a large population of diversified metrics, and furthermore, how to jointly investigate the multilevel diversity in the large populations of metrics, subspaces, and clusters in a unified framework. To tackle this problem, this article proposes a novel multidiversified ensemble clustering approach. In particular, we create a large number of diversified metrics by randomizing a scaled exponential similarity kernel, which are then coupled with random subspaces to form a large set of metric-subspace pairs. Based on the similarity matrices derived from these metric-subspace pairs, an ensemble of diversified base clusterings can thereby be constructed. Furthermore, an entropy-based criterion is utilized to explore the cluster-wise diversity in ensembles, based on which three specific ensemble clustering algorithms are presented by incorporating three types of consensus functions. Extensive experiments are conducted on 30 high-dimensional datasets, including 18 cancer gene expression datasets and 12 image/speech datasets, which demonstrate the superiority of our algorithms over the state of the art. The source code is available at https://github.com/huangdonghere/MDEC.
|
4
|
Harris RJ, Parimi N, Cawthon PM, Strotmeyer ES, Boudreau RM, Brach JS, Kwoh CK, Cauley JA. Associations of components of sarcopenia with risk of fracture in the Osteoporotic Fractures in Men (MrOS) study. Osteoporos Int 2022; 33:1815-1821. [PMID: 35380213 PMCID: PMC10011872 DOI: 10.1007/s00198-022-06390-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/29/2021] [Accepted: 03/28/2022] [Indexed: 10/18/2022]
Abstract
Our aim was to evaluate the associations between the individual components of sarcopenia and fracture types. In this cohort, the risk of experiencing any clinical, hip, or major osteoporotic fracture is greater in men with slow walking speed than in men with normal walking speed. INTRODUCTION The association between the components of sarcopenia and fractures has not been clearly elucidated, which has hindered the development of appropriate therapeutic interventions. Our aim was to evaluate the associations between the individual components of sarcopenia (specifically lean mass, strength, and physical performance) and fracture (any fracture, hip fracture, major osteoporotic fracture) in the Osteoporotic Fractures in Men (MrOS) study. METHODS The Osteoporotic Fractures in Men study (MrOS) recruited 5995 men ≥ 65 years of age. We measured appendicular lean mass (ALM) by dual-energy X-ray absorptiometry (low as residual value < 20th percentile for the cohort), walking speed (fastest trial of usual pace, values < 0.8 m/s were low), and grip strength (max score of 2 trials, values < 30 kg were low). Information on fractures was assessed tri-annually over an average follow-up of 12 years and centrally adjudicated. Cox proportional hazard models estimated the hazard ratio (HR) (95% confidence intervals) for slow walking speed, low grip strength, and low lean mass. RESULTS Overall, 1413 men had a fracture during follow-up. Slow walking speed was associated with an increased risk of any fracture (HR = 1.39, 95% CI 1.05-1.84), hip fracture (HR = 2.37, 95% CI 1.54-3.63), and major osteoporotic fracture (HR = 1.89, 95% CI 1.34-2.67) in multivariate-adjusted models. Low lean mass and low grip strength were not significantly associated with fracture. CONCLUSIONS In this cohort of older adult men, the risk of experiencing any, hip, or major osteoporotic fracture is greater in men with slow walking speed than in men with normal walking speed, but low grip strength and low lean mass were not associated with fracture.
Affiliation(s)
- R J Harris: Department of Epidemiology, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA; VA Boston Healthcare System, Boston, MA, USA
- N Parimi: Research Institute, California Pacific Medical Center, San Francisco, CA, USA
- P M Cawthon: Research Institute, California Pacific Medical Center, San Francisco, CA, USA
- E S Strotmeyer: Department of Epidemiology, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- R M Boudreau: Department of Epidemiology, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- J S Brach: Department of Physical Therapy, University of Pittsburgh, Pittsburgh, PA, USA
- C K Kwoh: Department of Medicine, University of Arizona, Tucson, AZ, USA
- J A Cauley: Department of Epidemiology, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
|
5
|
Ragab M, Eldele E, Chen Z, Wu M, Kwoh CK, Li X. Self-Supervised Autoregressive Domain Adaptation for Time Series Data. IEEE Trans Neural Netw Learn Syst 2022; PP:1341-1351. [PMID: 35737606 DOI: 10.1109/tnnls.2022.3183252] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/15/2023]
Abstract
Unsupervised domain adaptation (UDA) has successfully addressed the domain shift problem for visual applications. Yet, these approaches may have limited performance for time series data due to the following reasons. First, they mainly rely on the large-scale dataset (i.e., ImageNet) for source pretraining, which is not applicable for time series data. Second, they ignore the temporal dimension on the feature space of the source and target domains during the domain alignment step. Finally, most of the prior UDA methods can only align the global features without considering the fine-grained class distribution of the target domain. To address these limitations, we propose a SeLf-supervised AutoRegressive Domain Adaptation (SLARDA) framework. In particular, we first design a self-supervised (SL) learning module that uses forecasting as an auxiliary task to improve the transferability of source features. Second, we propose a novel autoregressive domain adaptation technique that incorporates temporal dependence of both source and target features during domain alignment. Finally, we develop an ensemble teacher model to align class-wise distribution in the target domain via a confident pseudo labeling approach. Extensive experiments have been conducted on three real-world time series applications with 30 cross-domain scenarios. The results demonstrate that our proposed SLARDA method significantly outperforms the state-of-the-art approaches for time series domain adaptation. Our source code is available at: https://github.com/mohamedr002/SLARDA.
|
6
|
Eldele E, Ragab M, Chen Z, Wu M, Kwoh CK, Li X, Guan C. ADAST: Attentive Cross-Domain EEG-Based Sleep Staging Framework With Iterative Self-Training. IEEE Trans Emerg Top Comput Intell 2022. [DOI: 10.1109/tetci.2022.3189695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Affiliation(s)
- Emadeldeen Eldele: School of Computer Science and Engineering, Nanyang Technological University, Singapore
- Mohamed Ragab: Institute for Infocomm Research (I²R), Centre for Frontier Research (CFAR), Agency of Science, Technology and Research (A*STAR), Singapore
- Zhenghua Chen: Institute for Infocomm Research (I²R) and the Centre for Frontier AI Research (CFAR), Agency for Science, Technology and Research (A*STAR), Singapore
- Min Wu: Institute for Infocomm Research (I²R), Agency for Science, Technology and Research (A*STAR), Singapore
- Chee-Keong Kwoh: School of Computer Science and Engineering, Nanyang Technological University, Singapore
- Xiaoli Li: Institute for Infocomm Research (I²R), Centre for Frontier Research (CFAR), Agency of Science, Technology and Research (A*STAR), Singapore
- Cuntai Guan: School of Computer Science and Engineering, Nanyang Technological University, Singapore
|
7
|
Ragab M, Chen Z, Wu M, Kwoh CK, Yan R, Li X. Attention-based sequence to sequence model for machine remaining useful life prediction. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.09.022] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 11/26/2022]
|
8
|
Ding P, Ouyang W, Luo J, Kwoh CK. Heterogeneous information network and its application to human health and disease. Brief Bioinform 2021; 21:1327-1346. [PMID: 31566212 DOI: 10.1093/bib/bbz091] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 05/06/2019] [Revised: 06/29/2019] [Accepted: 06/30/2019] [Indexed: 12/11/2022]
Abstract
The molecular components of the human cell, with their functional interdependencies, form a complicated biological network. Diseases are mostly caused by perturbations of a composite of interacting biomolecules, rather than by an abnormality of a single biomolecule. Furthermore, new biological functions and processes could be revealed by discovering novel relationships between biological entities. Hence, more and more biologists focus on studying the complex biological system instead of its individual components. The emergence of the heterogeneous information network (HIN) offers a promising way to systematically explore complicated and heterogeneous relationships between various molecules for apparently distinct phenotypes. In this review, we first present the basic definition of HIN and describe the biological system as a complex HIN. Then, we discuss the topological properties of HIN and how these can be applied to detect network motifs and functional modules. Afterwards, methodologies for discovering relationships between diseases and biomolecules are presented. Useful insights on how HIN aids drug development and exploration of the human interactome are provided. Finally, we analyze the challenges and opportunities for uncovering combinatorial patterns in pharmacogenomics and for cell-type detection based on single-cell genomic data.
Affiliation(s)
- Pingjian Ding: School of Computer Science, University of South China, Hengyang, China
- Wenjue Ouyang: College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Jiawei Luo: College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Chee-Keong Kwoh: School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
|
9
|
Eldele E, Chen Z, Liu C, Wu M, Kwoh CK, Li X, Guan C. An Attention-Based Deep Learning Approach for Sleep Stage Classification With Single-Channel EEG. IEEE Trans Neural Syst Rehabil Eng 2021; 29:809-818. [PMID: 33909566 DOI: 10.1109/tnsre.2021.3076234] [Citation(s) in RCA: 99] [Impact Index Per Article: 33.0] [Indexed: 11/09/2022]
Abstract
Automatic sleep stage classification is of great importance to measure sleep quality. In this paper, we propose a novel attention-based deep learning architecture called AttnSleep to classify sleep stages using single-channel EEG signals. This architecture starts with the feature extraction module based on a multi-resolution convolutional neural network (MRCNN) and adaptive feature recalibration (AFR). The MRCNN can extract low- and high-frequency features, and the AFR is able to improve the quality of the extracted features by modeling the inter-dependencies between the features. The second module is the temporal context encoder (TCE) that leverages a multi-head attention mechanism to capture the temporal dependencies among the extracted features. In particular, the multi-head attention deploys causal convolutions to model the temporal relations in the input features. We evaluate the performance of our proposed AttnSleep model using three public datasets. The results show that our AttnSleep outperforms state-of-the-art techniques in terms of different evaluation metrics. Our source code, experimental data, and supplementary materials are available at https://github.com/emadeldeen24/AttnSleep.
|
10
|
Kwoh CK, Guehring H, Aydemir A, Hannon MJ, Eckstein F, Hochberg MC. Predicting knee replacement in participants eligible for disease-modifying osteoarthritis drug treatment with structural endpoints. Osteoarthritis Cartilage 2020; 28:782-791. [PMID: 32247871 DOI: 10.1016/j.joca.2020.03.012] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/06/2019] [Revised: 03/17/2020] [Accepted: 03/26/2020] [Indexed: 02/02/2023]
Abstract
OBJECTIVE Evaluate associations between 2-year change in radiographic or quantitative magnetic resonance imaging (qMRI) structural measures, and knee replacement (KR), within a subsequent 7-year follow-up period. METHOD Participants from the Osteoarthritis Initiative were selected based on potential eligibility criteria for a disease-modifying osteoarthritis (OA) drug trial: Kellgren-Lawrence grade 2 or 3; medial minimum joint space width (mJSW) ≥2.5 mm; knee pain at worst 4-9 in the past 30 days on an 11-point scale, or 0-3 if medication was taken for joint pain; and availability of structural measures over 2 years. Mean 2-year change in structural measures was estimated and compared with two-sample independent t-tests for KR and no KR. Area under the receiver operating characteristic curve (AUC) was estimated using 2-year change in structural measures for prediction of future KR outcomes. RESULTS Among 627 participants, 107 knees underwent KR during a median follow-up of 6.7 years after the 2-year imaging period. Knees that received KR during follow-up had a greater mean loss of cartilage thickness in the total femorotibial joint and medial femorotibial compartment on qMRI, as well as decline in medial fixed joint space width on radiographs, compared with knees that did not receive KR. These imaging measures had similar, although modest discrimination for future KR (AUC 0.62, 0.60, and 0.61, respectively). CONCLUSIONS 2-year changes in qMRI femorotibial cartilage thickness and radiographic JSW measures had similar ability to discriminate future KR in participants with knee OA, suggesting that these measures are comparable biomarkers/surrogate endpoints of structural progression.
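The AUC values reported above can be read as the probability that a randomly chosen knee that went on to KR shows a larger structural change than a randomly chosen knee that did not. A minimal sketch of that calculation follows; the function name and the numbers are hypothetical illustrations, not data from the study.

```python
def auc_from_scores(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A pair counts 1 when the positive case has the higher score and 0.5
    when the two scores tie (the Mann-Whitney formulation of the AUC).
    """
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical 2-year cartilage thickness loss (mm); larger means more loss.
loss_kr = [0.30, 0.12, 0.25, 0.05]      # knees that later received KR
loss_no_kr = [0.10, 0.20, 0.02, 0.08]   # knees that did not
print(auc_from_scores(loss_kr, loss_no_kr))
```

An AUC of 0.5 means the measure ranks the two groups no better than chance, which is why values near 0.6 are described as modest discrimination.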
Affiliation(s)
- C K Kwoh: University of Arizona Arthritis Center, University of Arizona College of Medicine, Tucson, AZ, USA
- A Aydemir: EMD Serono Global Clinical Development Center, Billerica, MA, USA
- M J Hannon: University of Pittsburgh, Pittsburgh, PA, USA
- F Eckstein: Institute of Anatomy & Cell Biology, Paracelsus Medical University, Salzburg, Austria; Ludwig Boltzmann Institute for Arthritis and Rehabilitation, Paracelsus Medical University, Salzburg, Austria; Chondrometrics GmbH, Ainring, Germany
- M C Hochberg: University of Maryland School of Medicine, Baltimore, MD, USA
|
11
|
Li J, Zhu Z, Li Y, Cao P, Han W, Tang S, Li D, Kwoh CK, Guermazi A, Hunter DJ, Ding C. Qualitative and quantitative measures of prefemoral and quadriceps fat pads are associated with incident radiographic osteoarthritis: data from the Osteoarthritis Initiative. Osteoarthritis Cartilage 2020; 28:453-461. [PMID: 32061711 DOI: 10.1016/j.joca.2020.02.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/10/2019] [Revised: 01/14/2020] [Accepted: 02/03/2020] [Indexed: 02/02/2023]
Abstract
OBJECTIVE To determine if qualitative and quantitative measures of prefemoral fat pad (PFP) and quadriceps fat pad (QFP) are associated with incident radiographic osteoarthritis (iROA) over 4 years in the Osteoarthritis Initiative (OAI) study. DESIGN Participants in this nested case-control study were selected from the OAI study with knees that had Kellgren Lawrence grades (KLG) of 0 or 1 at baseline. Case knees were defined by iROA (KLG ≥ 2) over 4 years. Control knees without iROA were matched 1:1 with case knees. Magnetic resonance images (MRIs) were read at P0 (time of onset of iROA), P-1 (1 year prior to P0) and baseline, and used to assess PFP (i.e., prefemoral hyperintensity alteration, patellofemoral hyperintensity alteration, maximum axial area) and QFP (i.e., hyperintensity alteration, mass effect, maximum axial area). Conditional logistic regression analyses were performed to study the associations between PFP/QFP measures and iROA, after adjustment for covariates. RESULTS 354 case knees with iROA were matched to 354 control knees. 66.9% of the participants were female, with an average age of 60.1 years. PFP prefemoral hyperintensity alteration measured at three time points (OR [95% CI]: 1.46 [1.18-1.82], 1.50 [1.20-1.88], 1.52 [1.22-1.89] respectively), PFP maximum axial area (OR [95% CI]: 1.07 [1.01-1.14], 1.08 [1.01-1.15], 1.08 [1.02-1.15] respectively) and QFP hyperintensity alteration (OR [95% CI]: 1.59 [1.27-2.00], 1.44 [1.13-1.82], 1.38 [1.09-1.73] respectively) were significantly associated with iROA in multivariable conditional logistic analyses. QFP mass effect measured at baseline and P-1 (OR [95% CI]: 1.42 [1.11-1.82], 1.33 [1.01-1.73] respectively) was significantly associated with iROA. CONCLUSIONS Qualitative and quantitative measures of PFP and QFP are associated with increased iROA over 4 years.
Affiliation(s)
- J Li: Clinical Research Centre, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Z Zhu: Clinical Research Centre, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Y Li: Clinical Research Centre, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- P Cao: Clinical Research Centre, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- W Han: Clinical Research Centre, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- S Tang: Clinical Research Centre, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- D Li: Clinical Research Centre, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- C K Kwoh: University of Arizona College of Medicine, Tucson, USA; University of Pittsburgh Graduate School of Public Health, Pittsburgh, PA, USA
- A Guermazi: Department of Radiology, VA Boston Healthcare System, Boston University School of Medicine, Boston, MA, USA
- D J Hunter: Clinical Research Centre, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China; Department of Rheumatology, Royal North Shore Hospital and Institute of Bone and Joint Research, Kolling Institute, University of Sydney, Australia
- C Ding: Clinical Research Centre, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China; Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia
|
12
|
Chang J, Zhu Z, Han W, Zhao Y, Kwoh CK, Lynch JA, Hunter DJ, Ding C. The morphology of proximal tibiofibular joint (PTFJ) predicts incident radiographic osteoarthritis: data from Osteoarthritis Initiative. Osteoarthritis Cartilage 2020; 28:208-214. [PMID: 31733306 DOI: 10.1016/j.joca.2019.11.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/03/2019] [Revised: 09/30/2019] [Accepted: 11/06/2019] [Indexed: 02/02/2023]
Abstract
OBJECTIVE To determine whether the morphology of the proximal tibiofibular joint (PTFJ) is associated with increased risk of incident radiographic osteoarthritis (iROA) over 4 years in the OA Initiative (OAI) study. METHODS A nested matched case-control study design was used to select participants from the OAI study. Case knees were defined as those with iROA. Control knees were matched one-to-one by sex, age and radiographic status with case knees. T2-weighted MR images were assessed at P0 (the visit when incident ROA was found on radiograph), P1 (1 year prior to P0) and at OAI baseline. The contacting area of PTFJ (S) and its projection areas onto the horizontal (load-bearing area, Sτ), sagittal (lateral stress-bolstering area, Sφ) and coronal plane (posterior stress-bolstering area, Sυ) were assessed, respectively. RESULTS 354 case knees and 354 matched control knees were included, with a mean age of 60 and a mean body mass index (BMI) of 28 kg/m². Baseline PTFJ morphological parameters (S, Sτ and Sυ) were significantly associated with iROA over 4 years, and these associations remained unchanged after adjustment for BMI, number of knee bending activities, self-reported knee injury and surgery. S, Sτ and Sυ were also significantly associated with iROA at P1 and P0. In subgroup analyses, S, Sτ and Sυ were associated with risks of incident joint space narrowing in the medial, but not the lateral, tibiofemoral compartment. CONCLUSION Greater contacting area, load-bearing area and posterior stress-bolstering area of PTFJ were associated with increased risks of iROA, largely in the medial tibiofemoral compartment.
Affiliation(s)
- J Chang: Clinical Research Centre, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China; Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia; Department of Orthopaedics, 4th Affiliated Hospital, Anhui Medical University, Hefei, Anhui, China
- Z Zhu: Clinical Research Centre, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China; Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia
- W Han: Clinical Research Centre, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China; Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia
- Y Zhao: Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia; Department of Rheumatology and Immunology, Xuanwu Hospital, Capital Medical University, Beijing, China
- C K Kwoh: University of Arizona Arthritis Center & Division of Rheumatology, University of Arizona College of Medicine, Tucson, AZ, USA
- J A Lynch: Department of Epidemiology and Biostatistics, University of California at San Francisco, San Francisco, CA, USA
- D J Hunter: Department of Rheumatology, Royal North Shore Hospital and Institute of Bone and Joint Research, Kolling Institute, University of Sydney, Australia
- C Ding: Clinical Research Centre, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China; Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia
|
13
|
Abstract
Therapeutic effects of drugs are mediated via interactions between them and their intended targets. As such, prediction of drug-target interactions is of great importance. Drug-target interaction prediction is especially relevant in the case of drug repositioning where attempts are made to repurpose old drugs for new indications. While experimental wet-lab techniques exist for predicting such interactions, they are tedious and time-consuming. On the other hand, computational methods also exist for predicting interactions, and they do so with reasonable accuracy. In addition, computational methods can help guide their wet-lab counterparts by recommending interactions for further validation. In this chapter, a computational method for predicting drug-target interactions is presented. Specifically, we describe a machine learning method that utilizes ensemble learning to perform predictions. We also mention details pertaining to the preparation of the data required for the prediction effort and demonstrate how to evaluate and improve prediction performance.
Affiliation(s)
- Ali Ezzat: Biomedical Informatics Lab, School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
- Min Wu: Data Analytics Department, Institute for Infocomm Research, A-Star, Singapore, Singapore
- Xiaoli Li: Data Analytics Department, Institute for Infocomm Research, A-Star, Singapore, Singapore
- Chee-Keong Kwoh: Division of Software and Information Systems, School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
|
14
|
Berlinberg A, Ashbeck EL, Roemer FW, Guermazi A, Hunter DJ, Westra J, Trost J, Kwoh CK. Diagnostic performance of knee physical exam and participant-reported symptoms for MRI-detected effusion-synovitis among participants with early or late stage knee osteoarthritis: data from the Osteoarthritis Initiative. Osteoarthritis Cartilage 2019; 27:80-89. [PMID: 30244165 DOI: 10.1016/j.joca.2018.09.004] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 03/22/2018] [Revised: 07/24/2018] [Accepted: 09/11/2018] [Indexed: 02/02/2023]
Abstract
OBJECTIVE Evaluate the diagnostic performance of knee physical exam findings and participant-reported symptoms for MRI-detected effusion-synovitis (ES) among knees with early and late-stage osteoarthritis (OA). DESIGN The Osteoarthritis Initiative (OAI) is a longitudinal study of participants with or at risk for knee OA. Two samples with MRI readings were available: 344 knees with early OA (312 participants) and 216 with late-stage OA (186 participants). Trained examiners performed bulge sign (BS) and patellar tap (PT) exams, and participants reported on knee swelling and pain with leg straightening. Effusion-synovitis on 3T non-contrast MRI was scored using the MRI Osteoarthritis Knee Score (MOAKS). Diagnostic performance of physical exam findings and symptoms was estimated with bootstrapped confidence intervals. RESULTS For the early OA sample, the highest sensitivity for medium/large effusion-synovitis was achieved with a positive finding for any of the physical exam maneuvers and/or participant-reported symptoms (81.0 [95% CI: 70.0, 91.3]). Both knee symptoms in combination had a prevalence of 11.7% and yielded the highest estimated positive predictive value (PPV) (50.0 [95% CI: 34.2, 66.7]) and likelihood ratio positive (LR+) (5.2 [95% CI: 2.9, 9.7]). In late-stage OA knees, exam findings and symptoms provided minimal information beyond the prevalence. CONCLUSION Patient report of both symptoms, or at least one positive exam finding and at least one symptom, could be used to identify knees at increased risk of effusion-synovitis in knees with early stage OA, either for screening purposes in clinical evaluation, or for study sample enrichment with an inflammatory phenotype; diagnostic performance was not sufficiently high for clinical diagnostic purposes.
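The quantities named in this abstract (sensitivity, PPV, LR+) all follow from a 2×2 table of test result against MRI finding. The sketch below shows the standard definitions; the function name and the counts are hypothetical and are not taken from the study, and the bootstrapped confidence intervals the authors report are omitted.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from a 2x2 table.

    tp: test positive, condition present    fp: test positive, condition absent
    fn: test negative, condition present    tn: test negative, condition absent
    """
    sensitivity = tp / (tp + fn)              # share of true cases the test flags
    specificity = tn / (tn + fp)              # share of non-cases the test clears
    ppv = tp / (tp + fp)                      # chance a positive test is a true case
    lr_pos = sensitivity / (1 - specificity)  # likelihood ratio positive
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "lr_pos": lr_pos,
    }

# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=20, fp=20, fn=10, tn=150)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and LR+, the PPV depends on prevalence in the sample, which is consistent with the abstract's remark that in late-stage OA knees the exam findings added little beyond the prevalence.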
Affiliation(s)
- A Berlinberg
- Department of Medicine, University of Arizona, Tucson, AZ, USA
| | - E L Ashbeck
- Arizona Arthritis Center, University of Arizona, Tucson, AZ, USA
| | - F W Roemer
- Department of Radiology, University of Erlangen-Nuremberg, Erlangen, Germany; Quantitative Imaging Center (QIC), Department of Radiology, Boston University School of Medicine, Boston, MA, USA
| | - A Guermazi
- Quantitative Imaging Center (QIC), Department of Radiology, Boston University School of Medicine, Boston, MA, USA
| | - D J Hunter
- Department of Rheumatology, Royal North Shore Hospital, Institute of Bone and Joint Research, Kolling Institute, University of Sydney, Sydney, New South Wales, Australia
| | - J Westra
- Arizona Arthritis Center, University of Arizona, Tucson, AZ, USA; Department of Epidemiology and Biostatistics, Mel and Enid Zuckerman College of Public Health, University of Arizona, Tucson, AZ, USA
| | - J Trost
- Department of Medicine, University of Arizona, Tucson, AZ, USA
| | - C K Kwoh
- Department of Medicine, University of Arizona, Tucson, AZ, USA; Arizona Arthritis Center, University of Arizona, Tucson, AZ, USA.
| |
|
15
|
Abstract
BACKGROUND The evolution of influenza A viruses leads to the antigenic changes. Serological diagnosis of the antigenicity is usually labor-intensive, time-consuming and not suitable for early-stage detection. Computational prediction of the antigenic relationship between emerging and old strains of influenza viruses using viral sequences can facilitate large-scale antigenic characterization, especially for those viruses requiring high biosafety facilities, such as H5 and H7 influenza A viruses. However, most computational models require carefully designed subtype-specific features, thereby being restricted to only one subtype. METHODS In this paper, we propose a Context-Free Encoding Scheme (CFreeEnS) for pairs of protein sequences, which encodes a protein sequence dataset into a numeric matrix and then feeds the matrix into a downstream machine learning model. CFreeEnS is not only free from subtype-specific selected features but also able to improve the accuracy of predicting the antigenicity of influenza. Since CFreeEnS is subtype-free, it is applicable to predicting the antigenicity of diverse influenza subtypes, hopefully saving the biologists from conducting serological assays for highly pathogenic strains. RESULTS The accuracy of prediction on each subtype tested (A/H1N1, A/H3N2, A/H5N1, A/H9N2) is over 85%, and can be as high as 91.5%. This outperforms existing methods that use carefully designed subtype-specific features. Furthermore, we tested the CFreeEnS on the combined dataset of the four subtypes. The accuracy reaches 84.6%, much higher than the best performance 75.1% reported by other subtype-free models, i.e. regional band-based model and residue-based model, for predicting the antigenicity of influenza. Also, we investigate the performance of CFreeEnS when the model is trained and tested on different subtypes (i.e. transfer learning). The prediction accuracy using CFreeEnS is 84.3% when the model is trained on the A/H1N1 dataset and tested on the A/H5N1, better than the 75.2% using a regional band-based model. CONCLUSIONS The CFreeEnS not only improves the prediction of antigenicity on datasets with only one subtype but also outperforms existing methods when tested on a combined dataset with four subtypes of influenza viruses.
Affiliation(s)
- Xinrui Zhou
- School of Computer Science and Engineering, Nanyang Technological University, Nanyang Avenue, Singapore, 639798, Singapore
| | - Rui Yin
- School of Computer Science and Engineering, Nanyang Technological University, Nanyang Avenue, Singapore, 639798, Singapore
| | - Chee-Keong Kwoh
- School of Computer Science and Engineering, Nanyang Technological University, Nanyang Avenue, Singapore, 639798, Singapore.
| | - Jie Zheng
- School of Information Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Pudong, Shanghai, 201210, People's Republic of China.
| |
|
16
|
Ata SK, Ou-Yang L, Fang Y, Kwoh CK, Wu M, Li XL. Integrating node embeddings and biological annotations for genes to predict disease-gene associations. BMC Syst Biol 2018; 12:138. [PMID: 30598097 PMCID: PMC6311944 DOI: 10.1186/s12918-018-0662-y] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 11/10/2022]
Abstract
BACKGROUND Predicting disease causative genes (or simply, disease genes) has played critical roles in understanding the genetic basis of human diseases and further providing disease treatment guidelines. While various computational methods have been proposed for disease gene prediction, with the recent increasing availability of biological information for genes, it is highly motivated to leverage these valuable data sources and extract useful information for accurately predicting disease genes. RESULTS We present an integrative framework called N2VKO to predict disease genes. Firstly, we learn the node embeddings from protein-protein interaction (PPI) network for genes by adapting the well-known representation learning method node2vec. Secondly, we combine the learned node embeddings with various biological annotations as rich feature representation for genes, and subsequently build binary classification models for disease gene prediction. Finally, as the data for disease gene prediction is usually imbalanced (i.e. the number of the causative genes for a specific disease is much less than that of its non-causative genes), we further address this serious data imbalance issue by applying oversampling techniques for imbalance data correction to improve the prediction performance. Comprehensive experiments demonstrate that our proposed N2VKO significantly outperforms four state-of-the-art methods for disease gene prediction across seven diseases. CONCLUSIONS In this study, we show that node embeddings learned from PPI networks work well for disease gene prediction, while integrating node embeddings with other biological annotations further improves the performance of classification models. Moreover, oversampling techniques for imbalance correction further enhances the prediction performance. In addition, the literature search of predicted disease genes also shows the effectiveness of our proposed N2VKO framework for disease gene prediction.
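The feature-combination and imbalance-correction steps described in this abstract can be sketched as follows. This is a minimal illustration, assuming plain random oversampling as a stand-in, since the abstract does not say which oversampling technique is used; the toy embeddings, annotation vectors, labels, and the `random_oversample` helper are all hypothetical.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until classes are balanced."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(features), list(labels)
    for cls, count in counts.items():
        pool = [x for x, y in zip(features, labels) if y == cls]
        for _ in range(target - count):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


# Hypothetical per-gene inputs: a 2-d network embedding and 2 binary annotations.
embeddings = [[0.1, 0.2], [0.3, 0.1], [0.9, 0.8], [0.7, 0.9], [0.2, 0.2]]
annotations = [[1, 0], [0, 0], [1, 1], [0, 1], [0, 0]]
labels = [0, 0, 1, 0, 0]  # 1 = disease gene (minority class)

# Concatenate embedding and annotation vectors into one feature row per gene.
features = [emb + ann for emb, ann in zip(embeddings, annotations)]

balanced_x, balanced_y = random_oversample(features, labels)
print(Counter(balanced_y))
```

The balanced rows would then be passed to any binary classifier; oversampling is normally applied to the training split only, so that duplicated rows do not leak into evaluation.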
Affiliation(s)
- Sezin Kircali Ata
- Department of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
| | - Le Ou-Yang
- Department of Electronic Engineering, College of Information Engineering, Shenzhen University, China, Singapore, Singapore
| | - Yuan Fang
- School of Information Systems, Singapore Management University, Singapore, Singapore
| | - Chee-Keong Kwoh
- Department of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
| | - Min Wu
- Data Analytics Department, Institute for Infocomm Research, Singapore, Singapore.
| | - Xiao-Li Li
- Data Analytics Department, Institute for Infocomm Research, Singapore, Singapore
| |
|
17
|
Rego-Pérez I, Blanco FJ, Roemer FW, Guermazi A, Ran D, Ashbeck EL, Fernández-Moreno M, Oreiro N, Hannon MJ, Hunter DJ, Kwoh CK. Mitochondrial DNA haplogroups associated with MRI-detected structural damage in early knee osteoarthritis. Osteoarthritis Cartilage 2018; 26:1562-1569. [PMID: 30036585 DOI: 10.1016/j.joca.2018.06.016] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 04/04/2018] [Revised: 06/11/2018] [Accepted: 06/26/2018] [Indexed: 02/07/2023]
Abstract
OBJECTIVE Magnetic resonance imaging (MRI)-detected structural features are associated with increased risk of radiographic osteoarthritis (ROA). Specific mitochondrial DNA (mtDNA) haplogroups have been associated with incident ROA. Our objective was to compare the presence of MRI-detected structural features across mtDNA haplogroups among knees that developed incident ROA. DESIGN Knees from the Osteoarthritis Initiative (OAI) that developed incident ROA during 48 months follow-up were identified from Caucasian participants. mtDNA haplogroups were assigned based on a single base extension assay. MRIs were obtained annually between baseline and 4-year follow-up and scored using the MRI Osteoarthritis Knee Score (MOAKS). The association between mtDNA haplogroups and MRI-detected structural features was estimated using log-binomial regression. Participants who carried haplogroup H served as the reference group. RESULTS The sample included 255 participants contributing 277 knees that developed ROA. Haplogroups included H (116, 45%), J (17, 7%), T (26, 10%), Uk (61, 24%), and the remaining less common haplogroups ("others") (35, 14%). Knees of participants with haplogroup J had significantly lower risk of medium/large bone marrow lesions (BMLs) in the medial compartment [3.2%, relative risks (RR) = 0.17; 95%CI: 0.05, 0.64; P = 0.009] compared to knees of participants who carried haplogroup H [16.3%], as did knees from participants within the "others" group [2.8%, RR = 0.20; 95%CI: 0.08, 0.55; P = 0.002], over the 4 year follow-up period. CONCLUSIONS mtDNA haplogroup J was associated with lower risk of BMLs in the medial compartment among knees that developed ROA. Our results offer a potential hypothesis to explain the mechanism underlying the previously reported protective association between haplogroup J and ROA.
Affiliation(s)
- I Rego-Pérez
- Servicio de Reumatología, Instituto de Investigación Biomédica de A Coruña (INIBIC), Complexo Hospitalario Universitario de A Coruña (CHUAC), Sergas, Universidade da Coruña (UDC), As Xubias, 15006. A Coruña, Spain
| | - F J Blanco
- Servicio de Reumatología, Instituto de Investigación Biomédica de A Coruña (INIBIC), Complexo Hospitalario Universitario de A Coruña (CHUAC), Sergas, Universidade da Coruña (UDC), As Xubias, 15006. A Coruña, Spain
| | - F W Roemer
- Boston University School of Medicine, Boston, MA, USA; Department of Radiology, University of Erlangen-Nuremberg, Erlangen, Germany
| | - A Guermazi
- Boston University School of Medicine, Boston, MA, USA
| | - D Ran
- The University of Arizona Arthritis Center, Tucson, AZ, USA; Department of Epidemiology and Biostatistics, University of Arizona, USA
| | - E L Ashbeck
- The University of Arizona Arthritis Center, Tucson, AZ, USA
| | - M Fernández-Moreno
- Servicio de Reumatología, Instituto de Investigación Biomédica de A Coruña (INIBIC), Complexo Hospitalario Universitario de A Coruña (CHUAC), Sergas, Universidade da Coruña (UDC), As Xubias, 15006. A Coruña, Spain; Centro de Investigación Biomédica en Red, Bioingeniería, Biomateriales y Nanomedicina (CIBER-BBN), Spain
| | - N Oreiro
- Servicio de Reumatología, Instituto de Investigación Biomédica de A Coruña (INIBIC), Complexo Hospitalario Universitario de A Coruña (CHUAC), Sergas, Universidade da Coruña (UDC), As Xubias, 15006. A Coruña, Spain
| | - M J Hannon
- Univ. of Pittsburgh Sch. of Med., Pittsburgh, PA, USA
| | - D J Hunter
- Department of Rheumatology, Royal North Shore Hospital and Institute of Bone and Joint Research, Kolling Institute, University of Sydney, Sydney, Australia
| | - C K Kwoh
- The University of Arizona Arthritis Center, Tucson, AZ, USA; Division of Rheumatology, Department of Medicine, The University of Arizona, Tucson, AZ, USA.
| |
|
18
|
Wang K, Ding C, Hannon MJ, Chen Z, Kwoh CK, Lynch J, Hunter DJ. Signal intensity alteration within infrapatellar fat pad predicts knee replacement within 5 years: data from the Osteoarthritis Initiative. Osteoarthritis Cartilage 2018; 26:1345-1350. [PMID: 29842941 DOI: 10.1016/j.joca.2018.05.015] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 09/26/2017] [Revised: 05/08/2018] [Accepted: 05/20/2018] [Indexed: 02/02/2023]
Abstract
OBJECTIVE To investigate whether infrapatellar fat pad (IPFP) signal intensity (SI) alteration predicts the occurrence of knee replacement (KR) in knee osteoarthritis (OA) patients over 5 years. DESIGN The subjects were selected from Osteoarthritis Initiative (OAI) study. Case knees (n = 127) were defined as those who received KR during 5 years follow-up visit. They were matched by gender, age and radiographic status with control knees (n = 127). We used T2-weighted MR images to measure IPFP SI alteration using a newly developed algorithm in MATLAB. The measurements were assessed at baseline (BL), T0 (the visit just before KR) and 1 year before T0 (T-1). Conditional logistic regression was used to analyse the associations between IPFP SI alterations and the risk of KR. RESULTS Participants were mostly female (57%), with an average age of 63.7 years old and a mean body mass index (BMI) of 29.5 kg/m². In multivariable analysis, the standard deviation (SD) of IPFP SI [sDev (IPFP)] and the ratio of high SI region volume to whole IPFP volume [Percentage (H)] measured at BL were significantly associated with increased risks of KR after adjustment for covariates. IPFP SI alterations measured at T-1 including sDev (IPFP), Percentage (H) and clustering effect of high SI [Clustering factor (H)] were significantly associated with higher risks of KR. All measurements were significantly associated with higher risks of KR at T0. CONCLUSIONS IPFP SI is associated with the occurrence of KR suggesting it may play a role in end-stage knee OA.
Affiliation(s)
- K Wang
- Arthritis Research Institute, Department of Rheumatology, 1st Affiliated Hospital of Anhui Medical University, Hefei, Anhui, China; Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia
| | - C Ding
- Arthritis Research Institute, Department of Rheumatology, 1st Affiliated Hospital of Anhui Medical University, Hefei, Anhui, China; Department of Rheumatology, Royal North Shore Hospital and Institute of Bone and Joint Research, Kolling Institute, University of Sydney, Australia; Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia; Clinical Research Centre, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China.
| | - M J Hannon
- Division of Rheumatology and Clinical Immunology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
| | - Z Chen
- Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia; School of Mathematics and Information Science, Nanjing Normal University of Special Education, China
| | - C K Kwoh
- University of Arizona Arthritis Center, Division of Rheumatology, University of Arizona College of Medicine, Tucson, AZ, USA
| | - J Lynch
- Department of Epidemiology and Biostatistics, University of California at San Francisco, San Francisco, CA, USA
| | - D J Hunter
- Department of Rheumatology, Royal North Shore Hospital and Institute of Bone and Joint Research, Kolling Institute, University of Sydney, Australia
| |
|
19
|
Ding P, Yin R, Luo J, Kwoh CK. Ensemble Prediction of Synergistic Drug Combinations Incorporating Biological, Chemical, Pharmacological, and Network Knowledge. IEEE J Biomed Health Inform 2018; 23:1336-1345. [PMID: 29994408 DOI: 10.1109/jbhi.2018.2852274] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 11/09/2022]
Abstract
Combinatorial therapy may reduce drug side effects and improve drug efficacy, making combination therapy a promising strategy to treat complex diseases. However, in the existing computational methods, the natural properties and network knowledge of drugs have not been adequately and simultaneously considered, making it difficult to identify effective drug combinations. Computational methods that incorporate multiple sources of information (biological, chemical, pharmacological, and network knowledge) offer more opportunities to screen synergistic drug combinations. Therefore, we developed a novel Ensemble Prediction framework of Synergistic Drug Combinations (EPSDC) to accurately and efficiently predict drug combinations by integrating information from multiple-sources. EPSDC constructs feature vector of drug pair by concatenating different types of drug similarities, and then uses these groups in a feature-based base predictor. Next, transductive learning is applied on heterogeneous drug-target networks to achieve a network-based score for the drug pair. Finally, two types of ensemble rules are introduced to combine the feature-based score and the network-based score, and then potential drug combinations are prioritized. To demonstrate the effect of the ensemble rule, comprehensive experiments were conducted to compare single models and ensemble models. The experimental results indicated that our method outperformed the state-of-the-art method in five-fold cross validation and de novo prediction tests on the two benchmark datasets. We further analyzed the effect of maximum length of the meta-path and the impacts of different types of features. Moreover, the practical usefulness of our method was confirmed in the predicted novel drug combinations. The source code of EPSDC is available at https://github.com/KDDing/EPSDC.
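The final step described in this abstract, combining a feature-based score with a network-based score and ranking drug pairs, might look like the sketch below. The abstract does not state which two ensemble rules EPSDC uses, so the weighted-average and maximum rules here are assumptions chosen for illustration; the drug names, scores, and function names are hypothetical.

```python
def weighted_rule(feature_score, network_score, alpha=0.5):
    """Hypothetical rule 1: convex combination of the two base scores."""
    return alpha * feature_score + (1.0 - alpha) * network_score


def max_rule(feature_score, network_score):
    """Hypothetical rule 2: keep the more confident of the two base scores."""
    return max(feature_score, network_score)


# Hypothetical (feature-based, network-based) scores for three drug pairs.
pair_scores = {
    ("drugA", "drugB"): (0.9, 0.4),
    ("drugA", "drugC"): (0.3, 0.8),
    ("drugB", "drugC"): (0.2, 0.1),
}

# Prioritize candidate combinations by their ensemble score, highest first.
ranked = sorted(
    pair_scores,
    key=lambda pair: weighted_rule(*pair_scores[pair]),
    reverse=True,
)
print(ranked)
```

Swapping `weighted_rule` for `max_rule` in the sort key gives the second ranking, which makes it easy to compare how sensitive the prioritized list is to the choice of rule.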
|
20
|
Abstract
BACKGROUND Influenza viruses are undergoing continuous and rapid evolution. The fatal influenza A/H7N9 has drawn attention since the first wave of infections in March 2013, and raised more grave concerns with its increased potential to spread among humans. Experimental studies have revealed several host and virulence markers, indicating differential host binding preferences which can help estimate the potential of causing a pandemic. Here we systematically investigate the sequence pattern and structural characteristics of novel influenza A/H7N9 using computational approaches. RESULTS The sequence analysis highlighted mutations in protein functional domains of influenza viruses. Molecular docking and molecular dynamics simulation revealed that the hemagglutinin (HA) of A/Taiwan/1/2017(H7N9) strain enhanced the binding with both avian and human receptor analogs, compared with the previous A/Shanghai/02/2013(H7N9) strain. The Molecular Mechanics - Poisson Boltzmann Surface Area (MM-PBSA) calculation revealed the change of residue-ligand interaction energy and detected the residues with conspicuous binding preference. CONCLUSION The results are novel and specific to the emerging influenza A/Taiwan/1/2017(H7N9) strain compared with A/Shanghai/02/2013(H7N9). Its enhanced ability to bind human receptor analogs, which are abundant in the human upper respiratory tract, may be responsible for the recent outbreak. Residues showing binding preference were detected, which could facilitate monitoring the circulating influenza viruses.
Affiliation(s)
- Xinrui Zhou
- School of Computer Science and Engineering, Nanyang Technological University, Singapore, 639798 Singapore
| | - Jie Zheng
- School of Computer Science and Engineering, Nanyang Technological University, Singapore, 639798 Singapore
- Genome Institute of Singapore, A*STAR, Singapore, 138672 Singapore
| | - Fransiskus Xaverius Ivan
- School of Computer Science and Engineering, Nanyang Technological University, Singapore, 639798 Singapore
| | - Rui Yin
- School of Computer Science and Engineering, Nanyang Technological University, Singapore, 639798 Singapore
| | - Shoba Ranganathan
- Department of Chemistry and Biomolecular Sciences, Macquarie University, Sydney, NSW 2109 Australia
| | - Vincent T. K. Chow
- Department of Microbiology and Immunology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 117545 Singapore
| | - Chee-Keong Kwoh
- School of Computer Science and Engineering, Nanyang Technological University, Singapore, 639798 Singapore
| |
|
21
|
Ezzat A, Wu M, Li XL, Kwoh CK. Computational prediction of drug–target interactions using chemogenomic approaches: an empirical survey. Brief Bioinform 2018; 20:1337-1357. [DOI: 10.1093/bib/bby002] [Citation(s) in RCA: 117] [Impact Index Per Article: 19.5] [Received: 10/06/2017] [Revised: 12/21/2017] [Indexed: 01/18/2023] Open
Abstract
Computational prediction of drug–target interactions (DTIs) has become an essential task in the drug discovery process. It narrows down the search space for interactions by suggesting potential interaction candidates for validation via wet-lab experiments that are well known to be expensive and time-consuming. In this article, we aim to provide a comprehensive overview and empirical evaluation on the computational DTI prediction techniques, to act as a guide and reference for our fellow researchers. Specifically, we first describe the data used in such computational DTI prediction efforts. We then categorize and elaborate the state-of-the-art methods for predicting DTIs. Next, an empirical comparison is performed to demonstrate the prediction performance of some representative methods under different scenarios. We also present interesting findings from our evaluation study, discussing the advantages and disadvantages of each method. Finally, we highlight potential avenues for further enhancement of DTI prediction performance as well as related research directions.
|
22
|
Su CTT, Kwoh CK, Verma CS, Gan SKE. Modeling the full length HIV-1 Gag polyprotein reveals the role of its p6 subunit in viral maturation and the effect of non-cleavage site mutations in protease drug resistance. J Biomol Struct Dyn 2017; 36:4366-4377. [PMID: 29237328 DOI: 10.1080/07391102.2017.1417160] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Indexed: 10/18/2022]
Abstract
HIV polyprotein Gag is increasingly found to contribute to protease inhibitor resistance. Despite its role in viral maturation and in developing drug resistance, there remain gaps in the knowledge of the role of certain Gag subunits (e.g. p6), and that of non-cleavage mutations in drug resistance. As p6 is flexible, it poses a problem for structural experiments, and is hence often omitted in experimental Gag structural studies. Nonetheless, as p6 is an indispensable component for viral assembly and maturation, we have modeled the full length Gag structure based on several experimentally determined constraints and studied its structural dynamics. Our findings suggest that p6 can mechanistically modulate Gag conformations. In addition, the full length Gag model reveals that allosteric communication between the non-cleavage site mutations and the first Gag cleavage site could possibly result in protease drug resistance, particularly in the absence of mutations in Gag cleavage sites. Our study provides a mechanistic understanding to the structural dynamics of HIV-1 Gag, and also proposes p6 as a possible drug target in anti-HIV therapy.
Affiliation(s)
- Chinh Tran-To Su
- Bioinformatics Institute, Agency for Science, Technology, and Research (A*STAR), Singapore 138671, Singapore
| | - Chee-Keong Kwoh
- School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore
| | - Chandra Shekhar Verma
- Bioinformatics Institute, Agency for Science, Technology, and Research (A*STAR), Singapore 138671, Singapore
| | - Samuel Ken-En Gan
- Bioinformatics Institute, Agency for Science, Technology, and Research (A*STAR), Singapore 138671, Singapore; p53 Laboratory, Agency for Science, Technology, and Research (A*STAR), Singapore 138648, Singapore
| |
|
23
|
Wirth W, Hunter DJ, Nevitt MC, Sharma L, Kwoh CK, Ladel C, Eckstein F. Predictive and concurrent validity of cartilage thickness change as a marker of knee osteoarthritis progression: data from the Osteoarthritis Initiative. Osteoarthritis Cartilage 2017; 25:2063-2071. [PMID: 28838858 PMCID: PMC5688009 DOI: 10.1016/j.joca.2017.08.005] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 01/27/2017] [Revised: 08/09/2017] [Accepted: 08/11/2017] [Indexed: 02/02/2023]
Abstract
OBJECTIVE To investigate the predictive and concurrent validity of magnetic resonance imaging (MRI)-based cartilage thickness change between baseline (BL) and year-two (Y2) follow-up (predictive validity) and between Y2 and Y4 follow-up (concurrent validity) for symptomatic and radiographic knee osteoarthritis (OA) progression during Y2→Y4. METHODS 777 knees from 777 Osteoarthritis Initiative (OAI) participants (age: 61.3 ± 9.0 years, BMI: 30.1 ± 4.8 kg/m²) with Kellgren Lawrence (KL) grade 1-3 at Y2 (visit before progression interval) had cartilage thickness measurements from 3T MRI at BL, Y2 (n = 777), and Y4 (n = 708). Analysis of covariance and logistic regression were used to assess the association of pain progression (≥9 WOMAC units [scale 0-100], n = 205/572 with/without progression) and radiographic progression (≥0.7 mm minimum joint space width (mJSW) loss, n = 166/611 with/without progression) between Y2 and Y4 with preceding (BL→Y2) and concurrent (Y2→Y4) change in central medial femorotibial (cMFTC) compartment cartilage thickness. RESULTS Symptomatic progression was associated with concurrent (Y2→Y4: -305 ± 470 μm vs -155 ± 346 μm, Odds ratios (OR) = 1.5 [1.2, 1.7]) but not with preceding cartilage thickness loss in cMFTC (-150 ± 276 μm vs -151 ± 299 μm, OR = 0.9 95% CI: [0.8, 1.1]). Radiographic progression, in contrast, was significantly associated with both concurrent (-542 ± 550 μm vs -98 ± 255 μm, OR = 3.4 [2.6, 4.3]) and preceding cMFTC thickness loss (-229 ± 355 μm vs -130 ± 270 μm, OR = 1.3 [1.1, 1.5]). CONCLUSIONS These results extend previous reports that did not discern predictive vs concurrent associations of cartilage thickness loss with OA progression. The observed predictive and concurrent validity of cartilage thickness loss for radiographic progression and observed concurrent validity for symptomatic progression provide an important step in qualifying cartilage thickness loss as a biomarker of knee OA progression. CLINICALTRIALS.GOV IDENTIFICATION NCT00080171.
Affiliation(s)
- W Wirth
- Institute of Anatomy, Paracelsus Medical University Salzburg & Nuremberg, Salzburg, Austria & Chondrometrics GmbH, Ainring, Germany.
| | - D J Hunter
- Rheumatology Department, Royal North Shore Hospital and Institute of Bone and Joint Research, Kolling Institute, University of Sydney, Sydney, NSW, Australia
| | - M C Nevitt
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, USA
| | - L Sharma
- Northwestern University, Chicago IL, USA
| | - C K Kwoh
- Division of Rheumatology, University of Arizona Arthritis Center, University of Arizona, Tucson, AZ, USA
| | - C Ladel
- Merck KGaA, Darmstadt, Germany
| | - F Eckstein
- Institute of Anatomy, Paracelsus Medical University Salzburg & Nuremberg, Salzburg, Austria & Chondrometrics GmbH, Ainring, Germany
| |
|
24
|
Ezzat A, Zhao P, Wu M, Li XL, Kwoh CK. Drug-Target Interaction Prediction with Graph Regularized Matrix Factorization. IEEE/ACM Trans Comput Biol Bioinform 2017; 14:646-656. [PMID: 26890921 DOI: 10.1109/tcbb.2016.2530062] [Citation(s) in RCA: 160] [Impact Index Per Article: 22.9] [Indexed: 05/14/2023]
Abstract
Experimental determination of drug-target interactions is expensive and time-consuming. Therefore, there is a continuous demand for more accurate predictions of interactions using computational techniques. Algorithms have been devised to infer novel interactions on a global scale where the input to these algorithms is a drug-target network (i.e., a bipartite graph where edges connect pairs of drugs and targets that are known to interact). However, these algorithms had difficulty predicting interactions involving new drugs or targets for which there are no known interactions (i.e., "orphan" nodes in the network). Since data usually lie on or near to low-dimensional non-linear manifolds, we propose two matrix factorization methods that use graph regularization in order to learn such manifolds. In addition, considering that many of the non-occurring edges in the network are actually unknown or missing cases, we developed a preprocessing step to enhance predictions in the "new drug" and "new target" cases by adding edges with intermediate interaction likelihood scores. In our cross validation experiments, our methods achieved better results than three other state-of-the-art methods in most cases. Finally, we simulated some "new drug" and "new target" cases and found that GRMF predicted the left-out interactions reasonably well.
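To make the idea of graph regularization in this abstract concrete, the sketch below evaluates a generic graph-regularized matrix factorization objective: squared reconstruction error, a ridge penalty on the latent factors, and a graph penalty that grows when similar drugs (or targets) are placed far apart in latent space. This is an assumed textbook form using an unnormalized graph Laplacian, not the paper's exact formulation; all names, weights, and matrices are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def graph_penalty(latent, similarity):
    """0.5 * sum_ij S_ij * ||z_i - z_j||^2, equal to Tr(Z^T L Z) with L = D - S."""
    n = len(latent)
    return 0.5 * sum(
        similarity[i][j] * sq_dist(latent[i], latent[j])
        for i in range(n)
        for j in range(n)
    )


def objective(Y, A, B, S_d, S_t, lam_l=0.1, lam_d=0.1, lam_t=0.1):
    """Value of the assumed objective for interaction matrix Y ~ A B^T."""
    recon = sum(
        (Y[i][j] - dot(A[i], B[j])) ** 2
        for i in range(len(A))
        for j in range(len(B))
    )
    ridge = sum(x * x for row in A for x in row) + sum(x * x for row in B for x in row)
    return (
        recon
        + lam_l * ridge
        + lam_d * graph_penalty(A, S_d)
        + lam_t * graph_penalty(B, S_t)
    )


# Toy example: 2 drugs, 2 targets, rank-2 latent factors (all values hypothetical).
Y = [[1, 0], [0, 1]]      # known drug-target interactions
A = [[1, 0], [0, 1]]      # drug latent vectors
B = [[1, 0], [0, 1]]      # target latent vectors
S_d = [[0, 1], [1, 0]]    # the two drugs are marked as similar
S_t = [[0, 0], [0, 0]]    # the two targets are marked as dissimilar

value = objective(Y, A, B, S_d, S_t)
print(value)
```

Here the reconstruction is exact, so the whole cost comes from the ridge term and from the drug graph penalty: the two drugs are marked similar yet sit at different latent positions.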
|
25
|
Abstract
BACKGROUND Multiple computational methods for predicting drug-target interactions have been developed to facilitate the drug discovery process. These methods use available data on known drug-target interactions to train classifiers with the purpose of predicting new undiscovered interactions. However, a key challenge regarding this data that has not yet been addressed by these methods, namely class imbalance, is potentially degrading the prediction performance. Class imbalance can be divided into two sub-problems. Firstly, the number of known interacting drug-target pairs is much smaller than that of non-interacting drug-target pairs. This imbalance ratio between interacting and non-interacting drug-target pairs is referred to as the between-class imbalance. Between-class imbalance degrades prediction performance due to the bias in prediction results towards the majority class (i.e. the non-interacting pairs), leading to more prediction errors in the minority class (i.e. the interacting pairs). Secondly, there are multiple types of drug-target interactions in the data with some types having relatively fewer members (or are less represented) than others. This variation in representation of the different interaction types leads to another kind of imbalance referred to as the within-class imbalance. In within-class imbalance, prediction results are biased towards the better represented interaction types, leading to more prediction errors in the less represented interaction types. RESULTS We propose an ensemble learning method that incorporates techniques to address the issues of between-class imbalance and within-class imbalance. Experiments show that the proposed method improves results over 4 state-of-the-art methods. In addition, we simulated cases for new drugs and targets to see how our method would perform in predicting their interactions. New drugs and targets are those for which no prior interactions are known. Our method displayed satisfactory prediction performance and was able to predict many of the interactions successfully. CONCLUSIONS Our proposed method has improved the prediction performance over the existing work, thus proving the importance of addressing problems pertaining to class imbalance in the data.
Affiliation(s)
- Ali Ezzat: School of Computer Science & Engineering, Nanyang Technological University, Nanyang Ave., Singapore, 639798, Singapore
- Min Wu: Institute for Infocomm Research (I2R), A*Star, Fusionopolis Way, Singapore, 138632, Singapore
- Xiao-Li Li: Institute for Infocomm Research (I2R), A*Star, Fusionopolis Way, Singapore, 138632, Singapore
- Chee-Keong Kwoh: School of Computer Science & Engineering, Nanyang Technological University, Nanyang Ave., Singapore, 639798, Singapore
|
26
|
Abstract
BACKGROUND The human influenza viruses undergo rapid evolution (especially in hemagglutinin (HA), a glycoprotein on the surface of the virus), which enables the virus population to constantly evade the human immune system. Therefore, the vaccine has to be updated every year to stay effective. There is a need to characterize the evolution of influenza viruses for better selection of vaccine candidates and the prediction of pandemic strains. Studies have shown that the influenza hemagglutinin evolution is driven by the simultaneous mutations at antigenic sites. Here, we analyze simultaneous or co-occurring mutations in the HA protein of human influenza A/H3N2, A/H1N1 and B viruses to predict potential mutations, characterizing the antigenic evolution. METHODS We obtain the rules of mutation co-occurrence using association rule mining after extracting HA1 sequences and detect co-mutation sites under strong selective pressure. Then we predict the potential drifts with specific mutations of the viruses based on the rules and compare the results with the "observed" mutations in different years. RESULTS The sites under frequent mutations are in antigenic regions (epitopes) or receptor binding sites. CONCLUSIONS Our study demonstrates the co-occurring site mutations obtained by rule mining can capture the evolution of influenza viruses, and confirms that cooperative interactions among sites of HA1 protein drive the influenza antigenic evolution.
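The rule-mining step described in METHODS can be illustrated with a minimal sketch. It assumes each record is the set of HA1 sites that mutated together in one strain transition, and it only mines pairwise rules with support and confidence thresholds; the function name, the thresholds, and the site numbers in the toy data are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def co_mutation_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Mine pairwise rules "site x mutated -> site y mutated".

    Returns a list of (antecedent, consequent, support, confidence) tuples.
    """
    sets = [frozenset(t) for t in transactions]
    n = len(sets)
    items = sorted(set().union(*sets))

    def support(itemset):
        # Fraction of records that contain every site in `itemset`.
        return sum(1 for s in sets if itemset <= s) / n

    rules = []
    for a, b in combinations(items, 2):
        pair_support = support(frozenset((a, b)))
        if pair_support < min_support:
            continue  # the pair co-occurs too rarely to be of interest
        for x, y in ((a, b), (b, a)):
            confidence = pair_support / support(frozenset((x,)))
            if confidence >= min_confidence:
                rules.append((x, y, pair_support, confidence))
    return rules


# Hypothetical toy data: sites mutated together in four strain transitions.
toy = [{145, 156}, {145, 156, 189}, {145, 156}, {189}]
rules = co_mutation_rules(toy, min_support=0.5, min_confidence=0.9)
```

In this toy input, sites 145 and 156 always mutate together, so both directions of the rule pass the thresholds, while pairs involving site 189 fall below the support cutoff.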
Affiliation(s)
- Haifen Chen: School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, 639798, Singapore, Singapore
- Xinrui Zhou: School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, 639798, Singapore, Singapore
- Jie Zheng: School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, 639798, Singapore, Singapore; Genome Institute of Singapore, A*STAR, Biopolis, 138672, Singapore, Singapore
- Chee-Keong Kwoh: School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, 639798, Singapore, Singapore
|
27
|
Lama D, Pradhan MR, Brown CJ, Eapen RS, Joseph TL, Kwoh CK, Lane DP, Verma CS. Water-Bridge Mediates Recognition of mRNA Cap in eIF4E. Structure 2016; 25:188-194. [PMID: 27916520 DOI: 10.1016/j.str.2016.11.006] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 05/04/2016] [Revised: 09/19/2016] [Accepted: 11/07/2016] [Indexed: 11/24/2022]
Abstract
Ligand binding pockets in proteins contain water molecules, which play important roles in modulating protein-ligand interactions. Available crystallographic data for the 5' mRNA cap-binding pocket of the translation initiation factor protein eIF4E shows several structurally conserved waters, which also persist in molecular dynamics simulations. These waters engage an intricate hydrogen-bond network between the cap and protein. Two crystallographic waters in the cleft of the pocket show a high degree of conservation and bridge two residues, which are part of an evolutionarily conserved scaffold. This appears to be a preformed recognition module for the cap with the two structural waters facilitating an efficient interaction. This is also recapitulated in a new crystal structure of the apo protein. These findings open new windows for the design and screening of compounds targeting eIF4E.
Affiliation(s)
- Dilraj Lama: Bioinformatics Institute, A*STAR (Agency for Science, Technology and Research), 30 Biopolis Street, #07-01 Matrix, Singapore 138671, Singapore
- Mohan R Pradhan: Bioinformatics Institute, A*STAR (Agency for Science, Technology and Research), 30 Biopolis Street, #07-01 Matrix, Singapore 138671, Singapore; School of Computer Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore
- Christopher J Brown: p53 Laboratory, A*STAR (Agency for Science, Technology and Research), 8A Biomedical Grove, #06-04/05, Neuros/Immunos, Singapore 138648, Singapore
- Rohan S Eapen: School of Molecular and Cellular Biology, University of Leeds, Leeds LS2 9JT, UK
- Thomas L Joseph: Bioinformatics Institute, A*STAR (Agency for Science, Technology and Research), 30 Biopolis Street, #07-01 Matrix, Singapore 138671, Singapore
- Chee-Keong Kwoh: School of Computer Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore
- David P Lane: p53 Laboratory, A*STAR (Agency for Science, Technology and Research), 8A Biomedical Grove, #06-04/05, Neuros/Immunos, Singapore 138648, Singapore
- Chandra S Verma: Bioinformatics Institute, A*STAR (Agency for Science, Technology and Research), 30 Biopolis Street, #07-01 Matrix, Singapore 138671, Singapore; Department of Biological Sciences, National University of Singapore, 14 Science Drive 4, Singapore 117543, Singapore; School of Biological Sciences, Nanyang Technological University, 50 Nanyang Drive, Singapore 637551, Singapore
|
28
|
Teh AL, Pan H, Lin X, Lim YI, Patro CPK, Cheong CY, Gong M, MacIsaac JL, Kwoh CK, Meaney MJ, Kobor MS, Chong YS, Gluckman PD, Holbrook JD, Karnani N. Comparison of Methyl-capture Sequencing vs. Infinium 450K methylation array for methylome analysis in clinical samples. Epigenetics 2016; 11:36-48. [PMID: 26786415 PMCID: PMC4846133 DOI: 10.1080/15592294.2015.1132136] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Indexed: 01/24/2023]
Abstract
Interindividual variability in the epigenome has gained tremendous attention for its potential in pathophysiological investigation, disease diagnosis, and evaluation of clinical intervention. DNA methylation is the most studied epigenetic mark in epigenome-wide association studies (EWAS) as it can be detected from limited starting material. Infinium 450K methylation array is the most popular platform for high-throughput profiling of this mark in clinical samples, as it is cost-effective and requires small amounts of DNA. However, this method suffers from low genome coverage and errors introduced by probe cross-hybridization. Whole-genome bisulfite sequencing can overcome these limitations but elevates the costs tremendously. Methyl-Capture Sequencing (MC Seq) is an attractive intermediate solution to increase the methylome coverage in large sample sets. Here we first demonstrate that MC Seq can be employed using DNA amounts comparable to the amounts used for Infinium 450K. Second, to provide guidance when choosing between the 2 platforms for EWAS, we evaluate and compare MC Seq and Infinium 450K in terms of coverage, technical variation, and concordance of methylation calls in clinical samples. Last, since the focus in EWAS is to study interindividual variation, we demonstrate the utility of MC Seq in studying interindividual variation in subjects from different ethnicities.
Affiliation(s)
- Ai Ling Teh: Singapore Institute for Clinical Sciences, A*STAR, Singapore
- Hong Pan: Singapore Institute for Clinical Sciences, A*STAR, Singapore; School of Computer Engineering, Nanyang Technological University, Singapore
- Xinyi Lin: Singapore Institute for Clinical Sciences, A*STAR, Singapore
- Yubin Ives Lim: Singapore Institute for Clinical Sciences, A*STAR, Singapore; Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Min Gong: Singapore Institute for Clinical Sciences, A*STAR, Singapore
- Julia L MacIsaac: Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
- Chee-Keong Kwoh: School of Computer Engineering, Nanyang Technological University, Singapore
- Michael J Meaney: Singapore Institute for Clinical Sciences, A*STAR, Singapore; Ludmer Center for Neuroinformatics & Mental Health, Douglas University Mental Health Institute, McGill University, Montreal, Quebec, Canada
- Michael S Kobor: Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
- Yap-Seng Chong: Singapore Institute for Clinical Sciences, A*STAR, Singapore; Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Peter D Gluckman: Singapore Institute for Clinical Sciences, A*STAR, Singapore; Centre for Human Evolution, Adaptation and Disease, Liggins Institute, University of Auckland, Auckland, New Zealand
- Neerja Karnani: Singapore Institute for Clinical Sciences, A*STAR, Singapore; Yong Loo Lin School of Medicine, National University of Singapore, Singapore
|
29
|
Nguyen TD, Schmidt B, Zheng Z, Kwoh CK. Efficient and Accurate OTU Clustering with GPU-Based Sequence Alignment and Dynamic Dendrogram Cutting. IEEE/ACM Trans Comput Biol Bioinform 2015; 12:1060-1073. [PMID: 26451819 DOI: 10.1109/tcbb.2015.2407574] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 06/05/2023]
Abstract
De novo clustering is a popular technique to perform taxonomic profiling of a microbial community by grouping 16S rRNA amplicon reads into operational taxonomic units (OTUs). In this work, we introduce a new dendrogram-based OTU clustering pipeline called CRiSPy. The key idea used in CRiSPy to improve clustering accuracy is the application of an anomaly detection technique to obtain a dynamic distance cutoff instead of using the de facto value of 97 percent sequence similarity as in most existing OTU clustering pipelines. This technique works by detecting an abrupt change in the merging heights of a dendrogram. To produce the output dendrograms, CRiSPy employs the OTU hierarchical clustering approach that is computed on a genetic distance matrix derived from an all-against-all read comparison by pairwise sequence alignment. However, most existing dendrogram-based tools have difficulty processing datasets larger than 10,000 unique reads due to high computational complexity. We address this difficulty by developing two efficient algorithms for CRiSPy: a compute-efficient GPU-accelerated parallel algorithm for pairwise distance matrix computation and a memory-efficient hierarchical clustering algorithm. Our experiments on various datasets with distinct attributes show that CRiSPy is able to produce more accurate OTU groupings than most OTU clustering applications.
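The idea of replacing a fixed 97 percent cutoff with one derived from an abrupt change in merging heights can be sketched as follows. The abstract does not specify the anomaly detection technique CRiSPy uses, so this sketch substitutes a simple stand-in: it cuts at the midpoint of the largest jump between consecutive merge heights. The function name and the toy heights are assumptions for illustration only.

```python
def dynamic_cutoff(merge_heights):
    """Return a cut height placed in the largest gap between merge heights.

    This "largest jump" rule is a simplified stand-in for an anomaly
    detector, not the method used by CRiSPy.
    """
    heights = sorted(merge_heights)
    if len(heights) < 2:
        raise ValueError("need at least two merge heights")
    # Difference between each merge height and the next one up.
    gaps = [later - earlier for earlier, later in zip(heights, heights[1:])]
    i = max(range(len(gaps)), key=gaps.__getitem__)
    return (heights[i] + heights[i + 1]) / 2


# Hypothetical merge heights (genetic distances) from a dendrogram of 7 reads.
heights = [0.01, 0.02, 0.03, 0.04, 0.20, 0.25]
cut = dynamic_cutoff(heights)
# Every merge above the cut is undone, so clusters = merges above cut + 1.
n_clusters = sum(h > cut for h in heights) + 1
```

With these toy values the jump from 0.04 to 0.20 dominates, so the cut lands near 0.12 and the seven reads fall into three OTUs.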
|
30
|
Roemer FW, Jarraya M, Kwoh CK, Hannon MJ, Boudreau RM, Green SM, Jakicic JM, Moore C, Guermazi A. Brief report: symmetricity of radiographic and MRI-detected structural joint damage in persons with knee pain--the Joints on Glucosamine (JOG) Study. Osteoarthritis Cartilage 2015; 23:1343-7. [PMID: 25746322 DOI: 10.1016/j.joca.2015.02.169] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 10/02/2014] [Revised: 01/22/2015] [Accepted: 02/23/2015] [Indexed: 02/02/2023]
Abstract
OBJECTIVE Most MRI-based osteoarthritis (OA) studies have focused on a single knee per person, and thus data on bilaterality are sparse. The study aim was to describe symmetricity of MRI-detected OA features in a cohort of subjects with knee pain. DESIGN Participants were 169 subjects with chronic knee pain who had 3 T MRI of both knees using the same protocol as in the Osteoarthritis Initiative. Knees were read for cartilage damage, bone marrow lesions (BMLs), and meniscal damage according to the Whole-Organ Magnetic Resonance Imaging Score (WORMS) system. Chi-squared tests were used to compare the proportion of knees with unilateral tissue pathology to the proportion that would be expected if both knees were independent. We further used percent agreement and linear weighted kappa statistics to describe agreement of cartilage damage and BMLs in the same articular plates. RESULTS 51.2% of participants were men, mean age was 52.1 (±6.2), mean BMI was 29.0 kg/m² (±4.1). All plates showed a significantly higher degree of symmetricity for cartilage damage, as evidenced by weighted kappas ranging from 0.32 to 0.59. For BMLs the degree of symmetricity was higher for the patella, trochlea, medial tibia, lateral femur, and medial femur; for meniscal damage the degree of unilaterality was lower for all medial meniscal subregions but not all lateral. Kappas ranged between 0.52 and 0.68 for cartilage and 0.30 and 0.55 for BMLs for the four subregions with highest agreement. CONCLUSION A higher degree of symmetricity of tissue damage than expected by chance was observed in this cohort of subjects with knee pain.
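The linear weighted kappa named in DESIGN can be computed as in the sketch below, which treats the left and right knee of each participant as the two "raters" of an ordinal grade. The function name and the toy grades are assumptions for illustration; the study's actual grade ranges and software are not given in the abstract.

```python
def linear_weighted_kappa(left, right, n_grades):
    """Cohen's kappa with linear weights for paired ordinal grades.

    `left` and `right` hold grades in 0..n_grades-1 for the same subjects.
    """
    if len(left) != len(right) or not left:
        raise ValueError("need two non-empty rating lists of equal length")
    if n_grades < 2:
        raise ValueError("need at least two grades")
    n = len(left)
    # Cross-tabulate the paired grades.
    counts = [[0] * n_grades for _ in range(n_grades)]
    for a, b in zip(left, right):
        counts[a][b] += 1
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(counts[i][j] for i in range(n_grades))
                  for j in range(n_grades)]
    observed = 0.0  # weighted disagreement actually seen
    expected = 0.0  # weighted disagreement expected if knees were independent
    for i in range(n_grades):
        for j in range(n_grades):
            weight = abs(i - j) / (n_grades - 1)  # linear disagreement weight
            observed += weight * counts[i][j] / n
            expected += weight * row_totals[i] * col_totals[j] / (n * n)
    if expected == 0:
        raise ValueError("kappa is undefined when all ratings share one grade")
    return 1.0 - observed / expected


# Hypothetical grades: identical knees, then knees that agree only by chance.
kappa_identical = linear_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], 4)
kappa_chance = linear_weighted_kappa([0, 0, 1, 1], [0, 1, 0, 1], 2)
```

Perfectly symmetric grades give a kappa of 1, and grades that match no more often than independent knees would give a kappa of 0, which is the reference point behind "more symmetric than expected by chance".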
Affiliation(s)
- F W Roemer: Quantitative Imaging Center, Department of Radiology, Boston University School of Medicine, Boston, MA, USA; Department of Radiology, University of Erlangen-Nuremberg, Erlangen, Germany
- M Jarraya: Quantitative Imaging Center, Department of Radiology, Boston University School of Medicine, Boston, MA, USA; Department of Radiology, Mercy Catholic Medical Center, Drexel University College of Medicine, Darby, PA, USA
- C K Kwoh: University of Arizona Arthritis Center & University of Arizona College of Medicine, Tucson, AZ, USA
- M J Hannon: Division of Rheumatology and Clinical Immunology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- R M Boudreau: Department of Epidemiology, University of Pittsburgh Graduate School of Public Health, Pittsburgh, PA, USA
- S M Green: Division of Rheumatology and Clinical Immunology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- J M Jakicic: Department of Health and Physical Activity, University of Pittsburgh, Pittsburgh, PA, USA
- C Moore: Department of Nutrition and Food Sciences, Texas Woman's University, Houston, TX, USA
- A Guermazi: Quantitative Imaging Center, Department of Radiology, Boston University School of Medicine, Boston, MA, USA
|
31
|
Shea MK, Kritchevsky SB, Hsu FC, Nevitt M, Booth SL, Kwoh CK, McAlindon TE, Vermeer C, Drummen N, Harris TB, Womack C, Loeser RF. The association between vitamin K status and knee osteoarthritis features in older adults: the Health, Aging and Body Composition Study. Osteoarthritis Cartilage 2015; 23:370-8. [PMID: 25528106 PMCID: PMC4339507 DOI: 10.1016/j.joca.2014.12.008] [Citation(s) in RCA: 72] [Impact Index Per Article: 8.0] [Received: 07/09/2014] [Revised: 12/02/2014] [Accepted: 12/10/2014] [Indexed: 02/02/2023]
Abstract
BACKGROUND Vitamin K-dependent (VKD) proteins, including the mineralization inhibitor matrix-gla protein (MGP), are found in joint tissues including cartilage and bone. Previous studies suggest low vitamin K status is associated with higher osteoarthritis (OA) prevalence and incidence. OBJECTIVE To clarify what joint tissues vitamin K is relevant to in OA, we investigated the cross-sectional and longitudinal association between vitamin K status and knee OA structural features measured using magnetic resonance imaging (MRI). METHODS Plasma phylloquinone (PK, vitamin K1) and dephosphorylated-uncarboxylated MGP ((dp)ucMGP) were measured in 791 older community-dwelling adults who had bilateral knee MRIs (mean ± SD age = 74 ± 3 y; 67% female). The adjusted odds ratios (and 95% confidence intervals) [OR (95% CI)] for presence and progression of knee OA features according to vitamin K status were calculated using marginal models with generalized estimating equations (GEEs), adjusted for age, sex, body mass index (BMI), triglycerides and other pertinent confounders. RESULTS Longitudinally, participants with very low plasma PK (<0.2 nM) were more likely to have articular cartilage and meniscus damage progression after 3 years [OR (95% CI): 1.7 (1.0-3.0) and 2.6 (1.3-5.2), respectively] compared to those with sufficient PK (≥1.0 nM). Higher plasma (dp)ucMGP (reflective of lower vitamin K status) was associated with higher odds of meniscus damage, osteophytes, bone marrow lesions, and subarticular cysts cross-sectionally [ORs (95% CIs) comparing highest to lowest quartile: 1.6 (1.1-2.3); 1.7 (1.1-2.5); 1.9 (1.3-2.8); 1.5 (1.0-2.1), respectively]. CONCLUSION Community-dwelling men and women with very low plasma PK were more likely to have progression of articular cartilage and meniscus damage. Plasma (dp)ucMGP was associated with presence of knee OA features but not progression. Future studies are needed to clarify mechanisms underlying vitamin K's role in OA.
Affiliation(s)
- M K Shea: USDA Human Nutrition Research Center on Aging at Tufts University, Boston, MA, USA
- S B Kritchevsky: Sticht Center on Aging, Wake Forest School of Medicine, Winston-Salem, NC, USA
- F-C Hsu: Department of Biostatistical Sciences, Division of Public Health Sciences, Wake Forest School of Medicine, Winston-Salem, NC, USA
- M Nevitt: Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, USA
- S L Booth: USDA Human Nutrition Research Center on Aging at Tufts University, Boston, MA, USA
- C K Kwoh: Division of Rheumatology, University of Arizona, Tucson, AZ, USA
- T E McAlindon: Division of Rheumatology, Tufts Medical Center, Boston, MA, USA
- T B Harris: Laboratory of Epidemiology and Population Sciences, National Institute on Aging, USA
- C Womack: Department of Preventive Medicine, University of Tennessee Health Science Center, Memphis, TN, USA
- R F Loeser: Division of Rheumatology, Allergy and Immunology, University of North Carolina, Chapel Hill, NC, USA
|
32
|
Alhossary A, Handoko SD, Mu Y, Kwoh CK. Fast, accurate, and reliable molecular docking with QuickVina 2. Bioinformatics 2015; 31:2214-6. [PMID: 25717194 DOI: 10.1093/bioinformatics/btv082] [Citation(s) in RCA: 129] [Impact Index Per Article: 14.3] [Received: 08/16/2014] [Accepted: 02/05/2015] [Indexed: 11/14/2022]
Abstract
MOTIVATION The need for efficient molecular docking tools for high-throughput screening is growing alongside the rapid growth of drug-fragment databases. AutoDock Vina ('Vina') is a widely used docking tool with parallelization for speed. QuickVina ('QVina 1') then further enhanced the speed via a heuristic, but it required high exhaustiveness; with low exhaustiveness, its accuracy was compromised. We present in this article the latest version of QuickVina ('QVina 2') that inherits both the speed of QVina 1 and the reliability of the original Vina. RESULTS We tested the efficacy of QVina 2 on the core set of PDBbind 2014. With the default exhaustiveness level of Vina (i.e. 8), a maximum of 20.49-fold and an average of 2.30-fold acceleration with a correlation coefficient of 0.967 for the first mode and 0.911 for the sum of all modes were attained over the original Vina. A tendency for higher acceleration with increased number of rotatable bonds as the design variables was observed. On accuracy, Vina wins over QVina 2 on 30% of the data with an average energy difference of only 0.58 kcal/mol. On the same dataset, GOLD produced RMSD smaller than 2 Å on 56.9% of the data while QVina 2 attained 63.1%. AVAILABILITY AND IMPLEMENTATION The C++ source code of QVina 2 is available at www.qvina.org. CONTACT aalhossary@pmail.ntu.edu.sg SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Amr Alhossary: School of Computer Engineering, Nanyang Technological University, Singapore 639798, School of Information Systems, Singapore Management University, Singapore 188065, and School of Biological Sciences, Nanyang Technological University, Singapore 637551
- Stephanus Daniel Handoko: School of Computer Engineering, Nanyang Technological University, Singapore 639798, School of Information Systems, Singapore Management University, Singapore 188065, and School of Biological Sciences, Nanyang Technological University, Singapore 637551
- Yuguang Mu: School of Computer Engineering, Nanyang Technological University, Singapore 639798, School of Information Systems, Singapore Management University, Singapore 188065, and School of Biological Sciences, Nanyang Technological University, Singapore 637551
- Chee-Keong Kwoh: School of Computer Engineering, Nanyang Technological University, Singapore 639798, School of Information Systems, Singapore Management University, Singapore 188065, and School of Biological Sciences, Nanyang Technological University, Singapore 637551
|
33
|
Atukorala I, Kwoh CK, Guermazi A, Roemer FW, Boudreau RM, Hannon MJ, Hunter DJ. Synovitis in knee osteoarthritis: a precursor of disease? Ann Rheum Dis 2014; 75:390-5. [PMID: 25488799 DOI: 10.1136/annrheumdis-2014-205894] [Citation(s) in RCA: 198] [Impact Index Per Article: 19.8] [Received: 05/09/2014] [Accepted: 11/15/2014] [Indexed: 12/26/2022]
Abstract
OBJECTIVES It is unknown whether joint inflammation precedes other articular tissue damage in osteoarthritis. Therefore, this study aims to determine if synovitis precedes the development of radiographic knee osteoarthritis (ROA). METHODS The participants in this nested case-control study were selected from persons in the Osteoarthritis Initiative with knees that had a Kellgren Lawrence grading (KLG)=0 at baseline (BL). These knees were evaluated annually with radiography and non-contrast-enhanced MRI over 4 years. MRIs were assessed for effusion-synovitis and Hoffa-synovitis. Case knees were defined by ROA (KLG≥2) on the postero-anterior knee radiographs at any assessment after BL. Radiographs were assessed at P0 (time of onset of ROA), 1 year prior to P0 (P-1) and at BL. Controls were participants who did not develop incident ROA (iROA) from BL to 48 months. RESULTS 133 knees of 120 persons with ROA (83 women) were matched to 133 control knees (83 women). ORs for occurrence of iROA associated with the presence of effusion-synovitis at BL, P-1 and P0 were 1.56 (95% CI 0.86 to 2.81), 3.23 (1.72 to 6.06) and 4.7 (1.10 to 2.95), respectively. The ORs for the occurrence of iROA associated with the presence of Hoffa-synovitis at BL, P-1 and P0 were 1.80 (1.1 to 2.95), 2.47 (1.45 to 4.23) and 2.40 (1.43 to 4.04), respectively. CONCLUSIONS Effusion-synovitis and Hoffa-synovitis strongly predicted the development of iROA.
Affiliation(s)
- I Atukorala: Institute of Bone and Joint Research, Kolling Institute, University of Sydney, Australia and Royal North Shore Hospital, St Leonards, New South Wales, Australia; University of Colombo, Colombo, Sri Lanka
- C K Kwoh: University of Arizona, Tucson, Arizona, USA
- A Guermazi: Boston University School of Medicine, Boston, Massachusetts, USA
- F W Roemer: Boston University School of Medicine, Boston, Massachusetts, USA; Klinikum Augsburg, Augsburg, Germany
- R M Boudreau: University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- M J Hannon: University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- D J Hunter: Institute of Bone and Joint Research, Kolling Institute, University of Sydney, Australia and Royal North Shore Hospital, St Leonards, New South Wales, Australia
|
34
|
Su C, Nguyen TD, Zheng J, Kwoh CK. IFACEwat: the interfacial water-implemented re-ranking algorithm to improve the discrimination of near native structures for protein rigid docking. BMC Bioinformatics 2014; 15 Suppl 16:S9. [PMID: 25521441 PMCID: PMC4290663 DOI: 10.1186/1471-2105-15-s16-s9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/24/2022]
Abstract
Background Protein-protein docking is an in silico method to predict the formation of protein complexes. Due to limited computational resources, the protein-protein docking approach has been developed under the assumption of rigid docking, in which one of the two protein partners remains rigid during the protein associations and water contribution is ignored or implicitly presented. Despite obtaining a number of acceptable complex predictions, it seems to-date that most initial rigid docking algorithms still find it difficult or even fail to discriminate successfully the correct predictions from the other incorrect or false positive ones. To improve the rigid docking results, re-ranking is one of the effective methods that help re-locate the correct predictions in top high ranks, discriminating them from the other incorrect ones. In this paper, we propose a new re-ranking technique using a new energy-based scoring function, namely IFACEwat - a combined Interface Atomic Contact Energy (IFACE) and water effect. The IFACEwat aims to further improve the discrimination of the near-native structures of the initial rigid docking algorithm ZDOCK3.0.2. Unlike other re-ranking techniques, the IFACEwat explicitly implements interfacial water into the protein interfaces to account for the water-mediated contacts during the protein interactions. Results Our results showed that the IFACEwat increased both the numbers of the near-native structures and improved their ranks as compared to the initial rigid docking ZDOCK3.0.2. In fact, the IFACEwat achieved a success rate of 83.8% for Antigen/Antibody complexes, which is 10% better than ZDOCK3.0.2. As compared to another re-ranking technique ZRANK, the IFACEwat obtains success rates of 92.3% (8% better) and 90% (5% better) respectively for medium and difficult cases. When comparing with the latest published re-ranking method F2Dock, the IFACEwat performed equivalently well or even better for several Antigen/Antibody complexes. 
Conclusions With the inclusion of interfacial water, the IFACEwat improves mostly results of the initial rigid docking, especially for Antigen/Antibody complexes. The improvement is achieved by explicitly taking into account the contribution of water during the protein interactions, which was ignored or not fully presented by the initial rigid docking and other re-ranking techniques. In addition, the IFACEwat maintains sufficient computational efficiency of the initial docking algorithm, yet improves the ranks as well as the number of the near native structures found. As our implementation so far targeted to improve the results of ZDOCK3.0.2, and particularly for the Antigen/Antibody complexes, it is expected in the near future that more implementations will be conducted to be applicable for other initial rigid docking algorithms.
|
35
|
Wu M, Li X, Zhang F, Li X, Kwoh CK, Zheng J. In silico prediction of synthetic lethality by meta-analysis of genetic interactions, functions, and pathways in yeast and human cancer. Cancer Inform 2014; 13:71-80. [PMID: 25452682 PMCID: PMC4224103 DOI: 10.4137/cin.s14026] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 05/20/2014] [Revised: 08/15/2014] [Accepted: 08/18/2014] [Indexed: 02/07/2023]
Abstract
A major goal in cancer medicine is to find selective drugs with reduced side effects. A pair of genes is called synthetic lethal (SL) if mutations of both genes will kill a cell while mutation of either gene alone will not. Hence, a gene in an SL interaction with a cancer-specific mutated gene is a promising drug target with anti-cancer selectivity. Wet-lab screening is still so costly that even for yeast only a small fraction of gene pairs has been covered. Computational methods are therefore important for large-scale discovery of SL interactions. Most existing approaches focus on individual features or machine-learning methods, which are prone to noise or overfitting. In this paper, we propose an approach named MetaSL for predicting yeast SL, which integrates 17 genomic and proteomic features and the outputs of 10 classification methods. MetaSL thus combines the strengths of existing methods and achieves the highest area under the Receiver Operating Characteristic (ROC) curve (AUC) of 87.1% among all competitors on yeast data. Moreover, through orthologous mapping from yeast to human genes, we then predicted several lists of candidate SL pairs in human cancer. Our method and predictions would thus shed light on mechanisms of SL and lead to discovery of novel anti-cancer drugs. In addition, all the experimental results can be downloaded from http://www.ntu.edu.sg/home/zhengjie/data/MetaSL.
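The AUC figure quoted above can be read as the probability that a randomly chosen true SL pair receives a higher score than a randomly chosen non-SL pair. The sketch below computes it by direct pairwise comparison; the function name, labels, and scores are made up for illustration and are not MetaSL outputs.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    `labels` are 1 for positive (e.g. SL) pairs and 0 for negatives;
    tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for four gene pairs (two SL, two non-SL).
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Here three of the four positive-negative comparisons favour the positive pair, giving an AUC of 0.75. The pairwise form is quadratic in the number of pairs, so a rank-based computation would be preferable on genome-scale data.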
Affiliation(s)
- Min Wu: School of Computer Engineering, Nanyang Technological University, Singapore; Institute for Infocomm Research, A*STAR, 1 Fusionopolis Way, Singapore
- Xuejuan Li: School of Computer Engineering, Nanyang Technological University, Singapore
- Fan Zhang: School of Computer Engineering, Nanyang Technological University, Singapore
- Xiaoli Li: Institute for Infocomm Research, A*STAR, 1 Fusionopolis Way, Singapore
- Chee-Keong Kwoh: School of Computer Engineering, Nanyang Technological University, Singapore
- Jie Zheng: School of Computer Engineering, Nanyang Technological University, Singapore; Genome Institute of Singapore, A*STAR, Biopolis, Singapore
|
36
|
Domsic RT, Dezfulian C, Shoushtari A, Ivanco D, Kenny E, Kwoh CK, Medsger TA, Champion HC. Endothelial dysfunction is present only in the microvasculature and microcirculation of early diffuse systemic sclerosis patients. Clin Exp Rheumatol 2014; 32:S-160. [PMID: 25372799 PMCID: PMC4317362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2013] [Accepted: 04/30/2014] [Indexed: 06/04/2023]
Abstract
OBJECTIVES To evaluate endothelial function and vascular stiffness in large, medium, small and microcirculatory blood vessels in very early diffuse systemic sclerosis (SSc). METHODS We studied consecutive early diffuse SSc patients, defined as <2 years from first SSc symptom, who did not have a prior cardiovascular event. Age-, gender- and race-matched controls were recruited. All underwent assessment of aortic pulse wave velocity (PWV), carotid intima-media thickness (IMT), brachial flow-mediated dilation (FMD), digital peripheral artery tonometer (EndoPAT) assessment and laser speckle contrast imaging (LSCI). RESULTS Fifteen early diffuse SSc patients and controls were evaluated. The average age was 49 years, 63% were female and 93% were Caucasian. There were no differences in body mass index, hypertension, diabetes or hyperlipidaemia between controls and SSc patients. Mean SSc disease duration was 1.3 years. In the large central vessels, there was no difference in aortic PWV (p=0.71) or carotid IMT (p=0.92) between SSc patients and controls. Similarly, there was no difference in endothelial dysfunction with brachial artery FMD after ischaemia (p=0.55) and nitroglycerin administration (p=0.74). There were significantly lower values for digital EndoPAT measures (p=0.0001) in SSc patients. LSCI revealed a distinct pattern of microcirculatory abnormalities in response to ischaemia in SSc patients compared to controls. Imaging demonstrated a blunted microcirculatory hyperaemia of the hand with greater subsequent response to nitroglycerin. CONCLUSIONS These findings suggest that the earliest endothelial changes occur in smaller arterioles and microvascular beds, but not in medium or macrovascular beds, in early diffuse SSc.
Affiliation(s)
- R T Domsic
- Division of Rheumatology and Clinical Immunology, Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA.
|
37
|
Jarraya M, Hayashi D, Guermazi A, Kwoh CK, Hannon MJ, Moore CE, Jakicic JM, Green SM, Roemer FW. Susceptibility artifacts detected on 3T MRI of the knee: frequency, change over time and associations with radiographic findings: data from the joints on glucosamine study. Osteoarthritis Cartilage 2014; 22:1499-503. [PMID: 24799287 DOI: 10.1016/j.joca.2014.04.014] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 01/27/2014] [Revised: 03/26/2014] [Accepted: 04/17/2014] [Indexed: 02/02/2023]
Abstract
OBJECTIVE To determine the prevalence of intraarticular susceptibility artifacts and to detect longitudinal changes in the artifacts, on 3T magnetic resonance imaging (MRI) of the knee in a cohort of patients with knee pain, and to assess the association of susceptibility artifacts with radiographic intraarticular calcifications. DESIGN Three hundred and forty-six knees of 177 subjects aged 35-65 were included. 3T MRI was performed at baseline and at 6 months. Baseline radiographs were assessed for presence/absence of linear/punctate calcifications within the tibiofemoral joint (TFJ) space. Corresponding MRIs were assessed for susceptibility artifacts (i.e., linear/punctate hypointensities) in the TFJ space on coronal dual-echo steady-state (DESS) sequences. Kappa statistics were applied to determine agreement between findings on baseline DESS and radiography. Changes in artifacts over time were recorded. RESULTS In the medial compartment, 13 (4%) of the knees showed susceptibility artifacts at baseline. Six knees had persistent artifacts and six knees had incident artifacts at follow-up. Agreement between DESS and radiography was κ = 0.18 (-0.15, 0.51) in the medial compartment. Frequency of artifacts in the lateral compartment was low (2%). CONCLUSION Susceptibility artifacts detected on knee MRI are not frequent, and likely correspond to vacuum phenomena as they commonly change over time and are not associated with intraarticular calcifications. Radiologists should be aware of these artifacts as they can interfere with cartilage segmentation.
Affiliation(s)
- M Jarraya
- Quantitative Imaging Center, Department of Radiology, Boston University School of Medicine, FGH Building 3rd Floor, 820 Harrison Avenue, Boston, MA 02118, USA
- D Hayashi
- Quantitative Imaging Center, Department of Radiology, Boston University School of Medicine, FGH Building 3rd Floor, 820 Harrison Avenue, Boston, MA 02118, USA; Department of Radiology, Bridgeport Hospital, Yale University School of Medicine, 267 Grant Street, Bridgeport, CT 06610, USA
- A Guermazi
- Quantitative Imaging Center, Department of Radiology, Boston University School of Medicine, FGH Building 3rd Floor, 820 Harrison Avenue, Boston, MA 02118, USA
- C K Kwoh
- Division of Rheumatology and Clinical Immunology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15261, USA; Pittsburgh VA Healthcare System, Pittsburgh, PA 15240, USA
- M J Hannon
- Division of Rheumatology and Clinical Immunology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15261, USA
- C E Moore
- Department of Nutrition and Food Science, Texas Woman's University, Houston, TX 77030, USA
- J M Jakicic
- Department of Health and Physical Activity, University of Pittsburgh, Pittsburgh, PA 15260, USA
- S M Green
- Division of Rheumatology and Clinical Immunology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15261, USA
- F W Roemer
- Quantitative Imaging Center, Department of Radiology, Boston University School of Medicine, FGH Building 3rd Floor, 820 Harrison Avenue, Boston, MA 02118, USA; Department of Radiology, University of Erlangen, Erlangen, Germany
|
38
|
Wu M, Kwoh CK, Li X, Zheng J. Finding trans-regulatory genes and protein complexes modulating meiotic recombination hotspots of human, mouse and yeast. BMC Syst Biol 2014; 8:107. [PMID: 25208583 PMCID: PMC4236725 DOI: 10.1186/s12918-014-0107-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2013] [Accepted: 07/11/2014] [Indexed: 11/18/2022]
Abstract
Background The regulatory mechanism of recombination is one of the most fundamental problems in genomics, with wide applications in genome wide association studies (GWAS), birth-defect diseases, molecular evolution, cancer research, etc. Recombination events cluster into short genomic regions called “recombination hotspots”. Recently, a zinc finger protein PRDM9 was reported to regulate recombination hotspots in human and mouse genomes. In addition, a 13-mer motif contained in the binding sites of PRDM9 is found to be enriched in human hotspots. However, this 13-mer motif only covers a fraction of hotspots, indicating that PRDM9 is not the only regulator of recombination hotspots. Therefore, the challenge of discovering other regulators of recombination hotspots becomes significant. Furthermore, recombination is a complex process. Hence, multiple proteins acting as machinery, rather than individual proteins, are more likely to carry out this process in a precise and stable manner. Therefore, the extension of the prediction of individual trans-regulators to protein complexes is also highly desired. Results In this paper, we introduce a pipeline to identify genes and protein complexes associated with recombination hotspots. First, we prioritize proteins associated with hotspots based on their preference of binding to hotspots and coldspots. Second, using the above identified genes as seeds, we apply the Random Walk with Restart algorithm (RWR) to propagate their influences to other proteins in protein-protein interaction (PPI) networks. Hence, many proteins without DNA-binding information will also be assigned a score to implicate their roles in recombination hotspots. Third, we construct sub-PPI networks induced by top genes ranked by RWR for various species (e.g., yeast, human and mouse) and detect protein complexes in those sub-PPI networks. 
Conclusions The GO term analysis shows that our prioritizing methods and the RWR algorithm are capable of identifying novel genes associated with recombination hotspots. The trans-regulators predicted by our pipeline are enriched with epigenetic functions (e.g., histone modifications), demonstrating the epigenetic regulatory mechanisms of recombination hotspots. The identified protein complexes also provide us with candidates to further investigate the molecular machineries for recombination hotspots. Moreover, the experimental data and results are available on our web site http://www.ntu.edu.sg/home/zhengjie/data/RecombinationHotspot/NetPipe/.
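The second step of the pipeline, Random Walk with Restart over a PPI network, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy adjacency matrix, the restart probability of 0.5, the convergence tolerance and the function name `random_walk_with_restart` are all assumptions chosen for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Score every node by its proximity to the seed nodes.

    adj     -- symmetric adjacency matrix as a list of lists (toy PPI network)
    seeds   -- indices of seed proteins (hypothetical hotspot-associated genes)
    restart -- probability of jumping back to a seed at each step (assumed value)
    """
    n = len(adj)
    # Column sums are used to column-normalise the adjacency matrix.
    degree = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    # The restart vector puts equal mass on every seed.
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        nxt = []
        for i in range(n):
            walk = sum(adj[i][j] / degree[j] * p[j]
                       for j in range(n) if degree[j] > 0)
            nxt.append((1 - restart) * walk + restart * p0[i])
        # Stop once the scores no longer change (L1 distance below tol).
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy network: a path 0 - 1 - 2 - 3, with protein 0 as the only seed.
toy_adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
rwr_scores = random_walk_with_restart(toy_adj, seeds=[0])
rwr_ranking = sorted(range(len(rwr_scores)), key=lambda i: -rwr_scores[i])
```

In this toy example the scores decay with distance from the seed, so proteins without any direct binding information still receive a score through their network neighbours.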
|
39
|
Abstract
An increasing number of genes have been experimentally confirmed in recent years as causative genes of various human diseases. The newly available knowledge can be exploited by machine learning methods to discover additional unknown genes that are likely to be associated with diseases. In particular, positive unlabeled learning (PU learning) methods, which require only a positive training set P (confirmed disease genes) and an unlabeled set U (the unknown candidate genes) instead of a negative training set N, have been shown to be effective in uncovering new disease genes in the current scenario. Using only a single source of data for prediction can be susceptible to bias due to incompleteness and noise in the genomic data, and a single machine learning predictor is prone to bias caused by the inherent limitations of individual methods. In this paper, we propose an effective PU learning framework that integrates multiple biological data sources and an ensemble of powerful machine learning classifiers for disease gene identification. Our proposed method integrates data from multiple biological sources for training PU learning classifiers. A novel ensemble-based PU learning method, EPU, is then used to integrate multiple PU learning classifiers to achieve accurate and robust disease gene predictions. Our evaluation experiments across six disease groups showed that EPU achieved significantly better results than various state-of-the-art prediction methods as well as ensemble learning classifiers. Through integrating multiple biological data sources for training and the outputs of an ensemble of PU learning classifiers for prediction, we are able to minimize the potential bias and errors in individual data sources and machine learning algorithms to achieve more accurate and robust disease gene predictions. In the future, our EPU method can provide an effective framework for integrating additional biological and computational resources for better disease gene predictions.
Affiliation(s)
- Peng Yang
- Data Analytics Department, Institute for Infocomm Research (I2R), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- * E-mail: (PY); (XL)
- Xiaoli Li
- Data Analytics Department, Institute for Infocomm Research (I2R), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- * E-mail: (PY); (XL)
- Hon-Nian Chua
- Data Analytics Department, Institute for Infocomm Research (I2R), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Chee-Keong Kwoh
- Bioinformatics Research Centre, School of Computer Engineering, Nanyang Technological University, Singapore, Singapore
- See-Kiong Ng
- Data Analytics Department, Institute for Infocomm Research (I2R), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
|
40
|
Teh AL, Pan H, Chen L, Ong ML, Dogra S, Wong J, MacIsaac JL, Mah SM, McEwen LM, Saw SM, Godfrey KM, Chong YS, Kwek K, Kwoh CK, Soh SE, Chong MFF, Barton S, Karnani N, Cheong CY, Buschdorf JP, Stünkel W, Kobor MS, Meaney MJ, Gluckman PD, Holbrook JD. The effect of genotype and in utero environment on interindividual variation in neonate DNA methylomes. Genome Res 2014; 24:1064-74. [PMID: 24709820 PMCID: PMC4079963 DOI: 10.1101/gr.171439.113] [Citation(s) in RCA: 224] [Impact Index Per Article: 22.4] [Indexed: 12/31/2022]
Abstract
Integrating the genotype with epigenetic marks holds the promise of better understanding the biology that underlies the complex interactions of inherited and environmental components that define the developmental origins of a range of disorders. The quality of the in utero environment significantly influences health over the lifecourse. Epigenetics, and in particular DNA methylation marks, have been postulated as a mechanism for the enduring effects of the prenatal environment. Accordingly, neonate methylomes contain molecular memory of the individual in utero experience. However, interindividual variation in methylation can also be a consequence of DNA sequence polymorphisms that result in methylation quantitative trait loci (methQTLs) and, potentially, the interaction between fixed genetic variation and environmental influences. We surveyed the genotypes and DNA methylomes of 237 neonates and found 1423 punctuate regions of the methylome that were highly variable across individuals, termed variably methylated regions (VMRs), against a backdrop of homogeneity. MethQTLs were readily detected in neonatal methylomes, and genotype alone best explained ∼25% of the VMRs. We found that the best explanation for 75% of VMRs was the interaction of genotype with different in utero environments, including maternal smoking, maternal depression, maternal BMI, infant birth weight, gestational age, and birth order. Our study sheds new light on the complex relationship between biological inheritance as represented by genotype and individual prenatal experience and suggests the importance of considering both fixed genetic variation and environmental factors in interpreting epigenetic variation.
Affiliation(s)
- Ai Ling Teh
- Singapore Institute of Clinical Sciences (SICS), A*STAR, Brenner Centre for Molecular Medicine, Singapore 117609
- Hong Pan
- Singapore Institute of Clinical Sciences (SICS), A*STAR, Brenner Centre for Molecular Medicine, Singapore 117609; School of Computer Engineering, Nanyang Technological University (NTU), Singapore 639798
- Li Chen
- Singapore Institute of Clinical Sciences (SICS), A*STAR, Brenner Centre for Molecular Medicine, Singapore 117609
- Mei-Lyn Ong
- Singapore Institute of Clinical Sciences (SICS), A*STAR, Brenner Centre for Molecular Medicine, Singapore 117609
- Shaillay Dogra
- Singapore Institute of Clinical Sciences (SICS), A*STAR, Brenner Centre for Molecular Medicine, Singapore 117609
- Johnny Wong
- Singapore Institute of Clinical Sciences (SICS), A*STAR, Brenner Centre for Molecular Medicine, Singapore 117609
- Julia L MacIsaac
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, BC V5Z 4H4 Canada
- Sarah M Mah
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, BC V5Z 4H4 Canada
- Lisa M McEwen
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, BC V5Z 4H4 Canada
- Seang-Mei Saw
- Saw Swee Hock School of Public Health, NUS, Singapore 117597
- Keith M Godfrey
- MRC Lifecourse Epidemiology Unit and NIHR Southampton Biomedical Research Centre, University of Southampton and University Hospital Southampton NHS Foundation Trust, Southampton, SO16 6YD, United Kingdom
- Yap-Seng Chong
- Singapore Institute of Clinical Sciences (SICS), A*STAR, Brenner Centre for Molecular Medicine, Singapore 117609; Yong Loo Lin School of Medicine, National University of Singapore, National University Health System, Singapore 119228
- Kenneth Kwek
- KK Women's and Children's Hospital, Singapore 229899
- Chee-Keong Kwoh
- School of Computer Engineering, Nanyang Technological University (NTU), Singapore 639798
- Shu-E Soh
- Saw Swee Hock School of Public Health, NUS, Singapore 117597; Yong Loo Lin School of Medicine, National University of Singapore, National University Health System, Singapore 119228
- Mary F F Chong
- Singapore Institute of Clinical Sciences (SICS), A*STAR, Brenner Centre for Molecular Medicine, Singapore 117609; Yong Loo Lin School of Medicine, National University of Singapore, National University Health System, Singapore 119228
- Sheila Barton
- MRC Lifecourse Epidemiology Unit and NIHR Southampton Biomedical Research Centre, University of Southampton and University Hospital Southampton NHS Foundation Trust, Southampton, SO16 6YD, United Kingdom
- Neerja Karnani
- Singapore Institute of Clinical Sciences (SICS), A*STAR, Brenner Centre for Molecular Medicine, Singapore 117609
- Clara Y Cheong
- Singapore Institute of Clinical Sciences (SICS), A*STAR, Brenner Centre for Molecular Medicine, Singapore 117609
- Jan Paul Buschdorf
- Singapore Institute of Clinical Sciences (SICS), A*STAR, Brenner Centre for Molecular Medicine, Singapore 117609
- Walter Stünkel
- Singapore Institute of Clinical Sciences (SICS), A*STAR, Brenner Centre for Molecular Medicine, Singapore 117609
- Michael S Kobor
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, BC V5Z 4H4 Canada
- Michael J Meaney
- Singapore Institute of Clinical Sciences (SICS), A*STAR, Brenner Centre for Molecular Medicine, Singapore 117609; Ludmer Centre for Neuroinformatics and Mental Health, Douglas University Mental Health Institute, McGill University, Montreal, (Quebec) H4H 1R3 Canada
- Peter D Gluckman
- Singapore Institute of Clinical Sciences (SICS), A*STAR, Brenner Centre for Molecular Medicine, Singapore 117609; Centre for Human Evolution, Adaptation and Disease, Liggins Institute, University of Auckland, Auckland 1142, New Zealand
- Joanna D Holbrook
- Singapore Institute of Clinical Sciences (SICS), A*STAR, Brenner Centre for Molecular Medicine, Singapore 117609
|
41
|
Tran-To Su C, Ouyang X, Zheng J, Kwoh CK. Structural analysis of the novel influenza A (H7N9) viral Neuraminidase interactions with current approved neuraminidase inhibitors Oseltamivir, Zanamivir, and Peramivir in the presence of mutation R289K. BMC Bioinformatics 2013; 14 Suppl 16:S7. [PMID: 24564719 PMCID: PMC3853198 DOI: 10.1186/1471-2105-14-s16-s7] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Indexed: 11/23/2022] Open
Abstract
Background Since late March 2013, there has been another global health concern with a sudden wave of flu infections by a novel strain of avian influenza A (H7N9) virus in China. To date, there have been more than 100 infections with 23 deaths. It is more worrying as this viral strain has never been detected in humans and has only been found to be of low pathogenicity. Currently, there are 3 effective neuraminidase inhibitors for this H7N9 virus strain, i.e. oseltamivir, zanamivir, and peramivir. These drugs have been used for treatment of the H7N9 influenza in China. However, how these inhibitors work and affect the binding cavity of the novel H7N9 neuraminidase in the presence of potential mutations has not been disclosed. In our study, we investigate steric effects and subsequently show the conformational restraints of the inhibitor-binding site of the non-mutated and mutated H7N9 neuraminidase structures to different drug compounds. Results A combination of molecular docking and molecular dynamics simulation reveals that zanamivir forms a more favorable and stable complex than oseltamivir and peramivir when binding to the active site of the H7N9 neuraminidase. It is also likely that the novel influenza A (H7N9) virus has a higher probability of acquiring resistance to peramivir than to the other two inhibitors. Conformational changes induced by the mutation R289K cause a loss of hydrogen bonds between the inhibitors and the H7N9 viral neuraminidase in 2 out of 3 complexes. In addition, our results on the binding-affinity relationships of the 3 inhibitors with the viral neuraminidase proteins of previous pandemics (H1N1, H5N1) and the current novel H7N9 reflect the extent of binding effectiveness of the 3 inhibitors to the novel H7N9 neuraminidase. Conclusions The results are novel and specific for the A/Hangzhou/1/2013(H7N9) influenza strain. Furthermore, the protocol could be useful for further drug-binding analysis and prediction of future viral mutations through which the virus evolves by adaptation and acquires resistance to the currently available drugs.
|
42
|
Bloecker K, Wirth W, Hunter DJ, Duryea J, Guermazi A, Kwoh CK, Resch H, Eckstein F. Contribution of regional 3D meniscus and cartilage morphometry by MRI to joint space width in fixed flexion knee radiography--a between-knee comparison in subjects with unilateral joint space narrowing. Eur J Radiol 2013; 82:e832-9. [PMID: 24119428 DOI: 10.1016/j.ejrad.2013.08.041] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 05/29/2013] [Revised: 07/29/2013] [Accepted: 08/17/2013] [Indexed: 10/26/2022]
Abstract
BACKGROUND Radiographic joint space width (JSW) is considered the reference standard for demonstrating structural therapeutic benefits in knee osteoarthritis. Our objective was to determine the proportion by which 3D (regional) meniscus and cartilage measures explain between-knee differences of JSW in the fixed flexion radiographs. METHODS Segmentation of the medial meniscus and tibial and femoral cartilage was performed in double echo steady state (DESS) images. Quantitative measures of meniscus size and position, femorotibial cartilage thickness, and radiographic JSW (minimum, and fixed locations) were compared between both knees of 60 participants of the Osteoarthritis Initiative, with strictly unilateral medial joint space narrowing (JSN). Statistical analyses (between-knee, within-person comparison) were performed using regression analysis. RESULTS A strong relationship with side-differences in minimum and a central fixed location JSW was observed for percent tibial plateau coverage by the meniscus (r = .59 and .47; p<.01) and central femoral cartilage thickness (r = .69 and .75; p<.01); other meniscus and cartilage measures displayed lower coefficients. The correlation of central femoral cartilage thickness with JSW (but not that of meniscus measures) was greater (r = .78 and .85; p<.01) when excluding knees with non-optimal alignment between the tibia and X-ray beam. CONCLUSION 3D measures of meniscus and cartilage provide significant, independent information in explaining side-differences in radiographic JSW in fixed flexion radiographs. Tibial coverage by the meniscus and central femoral cartilage explained two thirds of the variability in minimum and fixed location JSW. JSW provides a better representation of (central) femorotibial cartilage thickness, when optimal positioning of the fixed flexion radiographs is achieved.
Affiliation(s)
- K Bloecker
- Institute of Anatomy and Musculoskeletal Research, Paracelsus Medical University, Strubergasse 21, 5020 Salzburg, Austria; Department of Traumatology and Sports Medicine, Paracelsus Medical University, Müllner Hauptstrasse 48, 5020 Salzburg, Austria.
|
43
|
Wu M, Xie Z, Li X, Kwoh CK, Zheng J. Identifying protein complexes from heterogeneous biological data. Proteins 2013; 81:2023-33. [PMID: 23852772 DOI: 10.1002/prot.24365] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 01/03/2013] [Revised: 06/03/2013] [Accepted: 06/17/2013] [Indexed: 12/27/2022]
Abstract
With the increasing availability of diverse biological information for proteins, integration of heterogeneous data becomes more useful for many problems in proteomics, such as annotating protein functions, predicting novel protein-protein interactions and so on. In this paper, we present an integrative approach called InteHC (Integrative Hierarchical Clustering) to identify protein complexes from multiple data sources. Although integrating multiple sources could effectively improve the coverage of current insufficient protein interactome (the false negative issue), it could also introduce potential false-positive interactions that could hurt the performance of protein complex prediction. Our proposed InteHC method can effectively address these issues to facilitate accurate protein complex prediction and it is summarized into the following three steps. First, for each individual source/feature, InteHC computes the matrices to store the affinity scores between a protein pair that indicate their propensity to interact or co-complex relationship. Second, InteHC computes a final score matrix, which is the weighted sum of affinity scores from individual sources. In particular, the weights indicating the reliability of individual sources are learned from a supervised model (i.e., a linear ranking SVM). Finally, a hierarchical clustering algorithm is performed on the final score matrix to generate clusters as predicted protein complexes. In our experiments, we compared the results collected by our hierarchical clustering on each individual feature with those predicted by InteHC on the combined matrix. We observed that integration of heterogeneous data significantly benefits the identification of protein complexes. Moreover, a comprehensive comparison demonstrates that InteHC performs much better than 14 state-of-the-art approaches. All the experimental data and results can be downloaded from http://www.ntu.edu.sg/home/zhengjie/data/InteHC.
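The three steps described in this abstract (per-source affinity matrices, a weighted sum, then hierarchical clustering) can be sketched roughly as follows. This is an illustrative sketch only, not the InteHC code: the two toy sources, the fixed weights (the paper learns them with a linear ranking SVM), the average-linkage criterion, the stopping threshold and both function names are assumptions.

```python
def combine_affinities(matrices, weights):
    """Weighted sum of per-source affinity matrices (step 2)."""
    n = len(matrices[0])
    return [[sum(w * m[i][j] for w, m in zip(weights, matrices))
             for j in range(n)] for i in range(n)]


def average_linkage_clusters(score, threshold):
    """Greedy agglomerative clustering on a similarity matrix (step 3).

    Repeatedly merges the two clusters with the highest average pairwise
    score and stops once that score drops below `threshold` (assumed rule).
    """
    clusters = [[i] for i in range(len(score))]

    def avg_sim(a, b):
        return sum(score[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best, (x, y) = max(
            (avg_sim(clusters[x], clusters[y]), (x, y))
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        if best < threshold:
            break
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Step 1 (toy data): two hypothetical sources for four proteins.
ppi = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
coexpression = [[0.0, 0.8, 0.1, 0.0], [0.8, 0.0, 0.2, 0.1],
                [0.1, 0.2, 0.0, 0.9], [0.0, 0.1, 0.9, 0.0]]

# Fixed illustrative weights stand in for the learned source reliabilities.
final_score = combine_affinities([ppi, coexpression], weights=[0.6, 0.4])
predicted_complexes = average_linkage_clusters(final_score, threshold=0.5)
```

With these toy inputs, proteins 0 and 1 end up in one cluster and proteins 2 and 3 in another, because both sources agree on those two pairs and the cross-pair scores stay well below the threshold.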
Affiliation(s)
- Min Wu
- School of Computer Engineering, Nanyang Technological University, Singapore
|
44
|
Bloecker K, Guermazi A, Wirth W, Benichou O, Kwoh CK, Hunter DJ, Englund M, Resch H, Eckstein F. Tibial coverage, meniscus position, size and damage in knees discordant for joint space narrowing - data from the Osteoarthritis Initiative. Osteoarthritis Cartilage 2013; 21:419-27. [PMID: 23220556 PMCID: PMC4398339 DOI: 10.1016/j.joca.2012.11.015] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.9] [Received: 08/16/2012] [Revised: 11/22/2012] [Accepted: 11/28/2012] [Indexed: 02/02/2023]
Abstract
INTRODUCTION Meniscal extrusion is thought to be associated with less meniscus coverage of the tibial surface, but the association of radiographic disease stage with quantitative measures of tibial plateau coverage is unknown. We therefore compared quantitative and semi-quantitative measures of meniscus position and morphology in individuals with bilateral painful knees discordant on medial joint space narrowing (mJSN). METHODS A sample of 60 participants from the first half (2,678 cases) of the Osteoarthritis Initiative cohort fulfilled the inclusion criteria: bilateral frequent pain, Osteoarthritis Research Society International (OARSI) mJSN grades 1-3 in one knee, no JSN in the contra-lateral (CL) knee, and no lateral JSN in either knee (43 unilateral mJSN1; 17 mJSN2/3; 22 men, 38 women, body mass index (BMI) 31.3 ± 3.9 kg/m²). Segmentation and three-dimensional quantitative analysis of the tibial plateau and meniscus, and semi-quantitative evaluation of meniscus damage (magnetic resonance imaging (MRI) osteoarthritis knee score = MOAKS), were performed using coronal 3T MR images (MPR DESSwe and intermediate-weighted turbo spin echo (IW-TSE) images). CL knees were compared using paired t-tests (between-knee, within-person design). RESULTS Medial tibial plateau coverage was 36 ± 9% in mJSN1 vs 45 ± 8% in CL no-JSN knees, and was 31 ± 9% in mJSN2/3 vs 46 ± 6% in no-JSN knees (both P < 0.001). mJSN knees showed greater meniscus extrusion and damage (MOAKS), but no significant difference in meniscus volume. No significant differences in lateral tibial coverage, lateral meniscus morphology or position were observed. CONCLUSIONS Knees with medial JSN showed substantially less medial tibial plateau coverage by the meniscus. We suggest that less meniscal coverage, i.e., less mechanical protection, may be a reason for the greater rates of cartilage loss observed in JSN knees.
Affiliation(s)
- K Bloecker
- Institute for Anatomy & Musculoskeletal Research, Paracelsus Medical University, Salzburg, Austria.
|
45
|
Wu M, Yu Q, Li X, Zheng J, Huang JF, Kwoh CK. Benchmarking human protein complexes to investigate drug-related systems and evaluate predicted protein complexes. PLoS One 2013; 8:e53197. [PMID: 23405067 PMCID: PMC3566178 DOI: 10.1371/journal.pone.0053197] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 07/18/2012] [Accepted: 11/29/2012] [Indexed: 11/18/2022] Open
Abstract
Protein complexes are key entities that perform cellular functions. Human diseases have also been shown to be associated with specific human protein complexes. In fact, human protein complexes are widely used for protein function annotation, inference of the human protein interactome, disease gene prediction, and so on. Therefore, it is highly desirable to build an up-to-date catalogue of human complexes to support research in these applications. Protein complexes from different databases are, as expected, highly redundant. In this paper, we designed a set of concise operations to compile these redundant human complexes and built a comprehensive catalogue called CHPC2012 (Catalogue of Human Protein Complexes). CHPC2012 achieves a higher coverage of proteins and protein complexes than the individual databases. It is also verified to be a high-quality set of complexes, as its co-complex protein associations have a high overlap with protein-protein interactions (PPI) in various existing PPI databases. We demonstrated two distinct applications of CHPC2012, that is, investigating the relationship between protein complexes and drug-related systems and evaluating the quality of predicted protein complexes. In particular, CHPC2012 provides more insights into drug development. For instance, proteins involved in multiple complexes (the overlapping proteins) are potential drug targets; the drug-complex network is utilized to investigate multi-target drugs and drug-drug interactions; and the disease-specific complex-drug networks will provide new clues for drug repositioning. With this up-to-date reference set of human protein complexes, we believe that the CHPC2012 catalogue is able to enhance studies of protein interactions, protein functions, human diseases, drugs, and related fields of research. CHPC2012 complexes can be downloaded from http://www1.i2r.a-star.edu.sg/xlli/CHPC2012/CHPC2012.htm.
Affiliation(s)
- Min Wu: School of Computer Engineering, Nanyang Technological University, Singapore, Singapore
- Qi Yu: State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, China
- Xiaoli Li: Data Mining Department, Institute for Infocomm Research, Singapore, Singapore
- Jie Zheng: School of Computer Engineering, Nanyang Technological University, Singapore, Singapore
- Jing-Fei Huang: State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, China
- Chee-Keong Kwoh: School of Computer Engineering, Nanyang Technological University, Singapore, Singapore
|
46
|
Su CTT, Schönbach C, Kwoh CK. Molecular docking analysis of 2009-H1N1 and 2004-H5N1 influenza virus HLA-B*4405-restricted HA epitope candidates: implications for TCR cross-recognition and vaccine development. BMC Bioinformatics 2013; 14 Suppl 2:S21. [PMID: 23368875 PMCID: PMC3549837 DOI: 10.1186/1471-2105-14-s2-s21] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022] Open
Abstract
Background: The pandemic 2009-H1N1 influenza virus circulated in the human population and caused thousands of deaths worldwide. Studies on pandemic influenza vaccines have shown that T cell recognition of conserved epitopes and cross-reactive T cell responses are important when new strains emerge, especially in the absence of antibody cross-reactivity. In this work, using an HLA-B*4405 and DM1-TCR structure model, we systematically generated high-confidence conserved 2009-H1N1 T cell epitope candidates and investigated their potential cross-reactivity against the H5N1 avian flu virus. Results: Molecular docking analysis of differential DM1-TCR recognition of the 2009-H1N1 epitope candidates yielded a mosaic epitope (KEKMNTEFW) and potential H5N1 HA cross-reactive epitopes that could be applied as a multivalent peptide in influenza A vaccine development. Structural models of TCR cross-recognition between 2009-H1N1 and 2004-H5N1 revealed steric and topological effects of TCR contact residue mutations on TCR binding affinity. Conclusions: The results are novel with regard to HA epitopes and useful for developing possible vaccination strategies against the rapidly changing influenza viruses. Yet the challenge of identifying epitope candidates that result in heterologous T cell immunity under natural influenza infection conditions can only be overcome if more structural data on the TCR repertoire become available.
Affiliation(s)
- Chinh T T Su: Bioinformatics Research Centre, School of Computer Engineering, Nanyang Technological University, Singapore
|
47
|
Javaid MK, Kiran A, Guermazi A, Kwoh CK, Zaim S, Carbone L, Harris T, McCulloch CE, Arden NK, Lane NE, Felson D, Nevitt M. Individual magnetic resonance imaging and radiographic features of knee osteoarthritis in subjects with unilateral knee pain: the health, aging, and body composition study. Arthritis Rheum 2013; 64:3246-55. [PMID: 22736267 DOI: 10.1002/art.34594] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Indexed: 01/18/2023]
Abstract
OBJECTIVE Strong associations between radiographic features of knee osteoarthritis (OA) and pain have been demonstrated in persons with unilateral knee symptoms. This study was undertaken to compare radiographic and magnetic resonance imaging (MRI) features of knee OA and assess their ability to discriminate between painful and nonpainful knees in persons with unilateral symptoms. METHODS The study population included 283 individuals ages 70-79 years with unilateral knee pain who were enrolled in the Health, Aging, and Body Composition Study, a study of weight-related diseases and mobility. Radiographs of both knees were read for Kellgren/Lawrence (K/L) grade and individual radiographic features, and 1.5T MRIs were assessed using the Whole-Organ Magnetic Resonance Imaging Score. The association between structural features and pain was assessed using a within-person case-control design and conditional logistic regression. Receiver operating characteristic (ROC) analysis was then used to test the discriminatory performance of structural features. RESULTS In conditional logistic analyses, knee pain was significantly associated with both radiographic features (any joint space narrowing grade ≥ 1) (odds ratio 3.20 [95% confidence interval 1.79-5.71]) and MRI features (any cartilage defect scored ≥ 2) (odds ratio 3.67 [95% confidence interval 1.49-9.04]). However, in most subjects, MRI revealed osteophytes and cartilage and bone marrow lesions in both knees, and using ROC analysis, no individual structural feature discriminated well between painful and nonpainful knees. The best-performing MRI feature (synovitis/effusion) was not significantly more informative than K/L grade ≥ 2 (P = 0.42). CONCLUSION In persons with unilateral knee pain, MRI and radiographic features were associated with knee pain, confirming that structural abnormalities in the knee have an important role in the etiology of pain. However, no single MRI or radiographic finding performed well in discriminating between painful and nonpainful knees. Further work is needed to examine how structural and nonstructural factors influence knee pain.
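For readers unfamiliar with the within-person design, the following minimal sketch shows the simplest case: one painful and one nonpainful knee per person and a single binary feature. In that 1:1 matched setting, the conditional logistic odds ratio reduces to the ratio of discordant pairs. The function name and the toy counts are hypothetical and are not the study's data.

```python
def matched_pair_odds_ratio(pairs):
    """Odds ratio for a binary feature in 1:1 matched pairs.

    Each pair is (feature_in_painful_knee, feature_in_nonpainful_knee).
    Only discordant pairs carry information; concordant pairs drop out.
    """
    painful_only = sum(1 for p, n in pairs if p and not n)
    nonpainful_only = sum(1 for p, n in pairs if n and not p)
    if nonpainful_only == 0:
        raise ValueError("odds ratio is undefined without (False, True) pairs")
    return painful_only / nonpainful_only


# Hypothetical toy data, NOT taken from the study.
toy_pairs = (
    [(True, False)] * 6    # feature present only in the painful knee
    + [(False, True)] * 2  # feature present only in the nonpainful knee
    + [(True, True)] * 5   # concordant: present in both knees
    + [(False, False)] * 3 # concordant: absent in both knees
)
odds_ratio = matched_pair_odds_ratio(toy_pairs)
```

The concordant pairs do not affect the estimate, which is one way to see why features present in both knees of most subjects carry little discriminating power.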
|
48
|
Mei JP, Kwoh CK, Yang P, Li XL, Zheng J. Drug-target interaction prediction by learning from local information and neighbors. Bioinformatics 2012; 29:238-45. [PMID: 23162055 DOI: 10.1093/bioinformatics/bts670] [Citation(s) in RCA: 208] [Impact Index Per Article: 17.3] [Indexed: 01/11/2023]
Abstract
MOTIVATION In silico methods provide efficient ways to predict possible interactions between drugs and targets. A supervised learning approach, the bipartite local model (BLM), has recently been shown to be effective in predicting drug-target interactions. However, for drug-candidate compounds or target-candidate proteins that currently have no known interactions, a purely 'local' model cannot be learned, and hence BLM may fail to make correct predictions for such new candidates. RESULTS We present a simple procedure called neighbor-based interaction-profile inferring (NII) and integrate it into the existing BLM method to handle the new candidate problem. Specifically, the inferred interaction profile is treated as label information and is used for model learning of new candidates. This functionality is particularly important in practice for finding targets for new drug-candidate compounds and identifying targeting drugs for new target-candidate proteins. Consistently good performance of the new BLM-NII approach was observed in experiments on the prediction of interactions between drugs and four categories of target proteins. For nuclear receptors in particular, BLM-NII achieves the most significant improvement, as this dataset contains many drugs/targets with no interactions in the cross-validation. This demonstrates the effectiveness of the NII strategy and also shows the great potential of BLM-NII for prediction of compound-protein interactions. CONTACT jpmei@ntu.edu.sg SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
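The idea of inferring an interaction profile from neighbors can be sketched as follows, assuming a similarity-weighted average over drugs with known interactions. The abstract does not give the exact weighting or normalization, so the formula, function name, and toy values here are assumptions.

```python
def infer_profile(similarities, known_profiles):
    """Similarity-weighted average of neighbors' interaction profiles.

    similarities: similarity of the new drug to each known drug.
    known_profiles: one 0/1 interaction vector per known drug.
    Hypothetical illustration of neighbor-based inference, not the
    exact NII formula from the paper.
    """
    n_targets = len(known_profiles[0])
    total = sum(similarities)
    if total == 0:
        # No similar neighbor: nothing can be inferred.
        return [0.0] * n_targets
    return [
        sum(s * profile[j] for s, profile in zip(similarities, known_profiles)) / total
        for j in range(n_targets)
    ]


# Toy data: three known drugs, four targets (values are made up).
known = [[1, 0, 0, 1], [1, 1, 0, 0], [0, 0, 1, 0]]
sims = [0.5, 0.5, 0.0]
inferred = infer_profile(sims, known)
```

The resulting soft profile could then stand in as label information when training a local model for the new drug, which is the role the abstract assigns to the inferred profile.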
Affiliation(s)
- Jian-Ping Mei: Bioinformatics Research Centre, School of Computer Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798.
|
49
|
Yang P, Li XL, Mei JP, Kwoh CK, Ng SK. Positive-unlabeled learning for disease gene identification. Bioinformatics 2012; 28:2640-7. [PMID: 22923290 PMCID: PMC3467748 DOI: 10.1093/bioinformatics/bts504] [Citation(s) in RCA: 92] [Impact Index Per Article: 7.7] [Received: 06/13/2012] [Revised: 07/24/2012] [Accepted: 08/06/2012] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Identifying disease genes from the human genome is an important but challenging task in biomedical research. Machine learning methods can be applied to discover new disease genes based on the known ones. Existing machine learning methods typically use the known disease genes as the positive training set P and the unknown genes as the negative training set N (a true non-disease gene set does not exist) to build classifiers that identify new disease genes from the unknown genes. However, such classifiers are actually built from a noisy negative set N, as there can be unknown disease genes in N itself. As a result, the classifiers do not perform as well as they could. RESULT Instead of treating the unknown genes as negative examples in N, we treat them as an unlabeled set U. We design a novel positive-unlabeled (PU) learning algorithm, PUDI (PU learning for disease gene identification), to build a classifier using P and U. We first partition U into four sets, namely, the reliable negative set RN, likely positive set LP, likely negative set LN and weak negative set WN. Weighted support vector machines are then used to build a multi-level classifier based on the four training sets and the positive training set P to identify disease genes. Our experimental results demonstrate that our proposed PUDI algorithm significantly outperformed the existing methods. CONCLUSION The proposed PUDI algorithm is able to identify disease genes more accurately by treating the unknown data more appropriately as an unlabeled set U instead of a negative set N. Given that many machine learning problems in biomedical research involve positive and unlabeled data instead of negative data, it is possible that the machine learning methods for these problems can be further improved by adopting PU learning methods, as we have done here for disease gene identification. AVAILABILITY AND IMPLEMENTATION The executable program and data are available at http://www1.i2r.a-star.edu.sg/~xlli/PUDI/PUDI.html.
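The partitioning step can be illustrated with a small sketch that bins unlabeled genes by a positive-likeness score. How PUDI actually scores and splits U is not described in the abstract, so the score, the cut points, and the ordering RN < LN < WN < LP are all assumptions of this example.

```python
def partition_unlabeled(scores, cuts=(0.2, 0.4, 0.7)):
    """Split unlabeled genes into RN, LN, WN and LP by score.

    scores: dict mapping gene name to a positive-likeness score in [0, 1].
    cuts: hypothetical thresholds; the lowest scores go to the reliable
    negative set RN and the highest to the likely positive set LP.
    """
    low, mid, high = cuts
    sets = {"RN": [], "LN": [], "WN": [], "LP": []}
    for gene, score in sorted(scores.items()):
        if score < low:
            sets["RN"].append(gene)
        elif score < mid:
            sets["LN"].append(gene)
        elif score < high:
            sets["WN"].append(gene)
        else:
            sets["LP"].append(gene)
    return sets


# Toy scores for four made-up genes.
toy_scores = {"g1": 0.05, "g2": 0.3, "g3": 0.5, "g4": 0.9}
partition = partition_unlabeled(toy_scores)
```

Each set could then be given its own misclassification weight when training a weighted classifier, so that confident sets such as RN influence the decision boundary more than ambiguous ones such as WN.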
Affiliation(s)
- Peng Yang: Bioinformatics Research Centre, School of Computer Engineering, Nanyang Technological University, Singapore.
|
50
|
Abstract
The regulatory mechanism of recombination is a fundamental problem in genomics, with wide applications in genome-wide association studies, birth-defect diseases, molecular evolution, cancer research, etc. In mammalian genomes, recombination events cluster into short genomic regions called "recombination hotspots". Recently, a 13-mer motif enriched in hotspots was identified as a candidate cis-regulatory element of human recombination hotspots; moreover, a zinc finger protein, PRDM9, binds to this motif and is associated with variation of the recombination phenotype in human and mouse genomes, and is thus a trans-acting regulator of recombination hotspots. However, this pair of cis- and trans-regulators covers only a fraction of hotspots, so other regulators of recombination hotspots remain to be discovered. In this paper, we propose an approach to predicting additional trans-regulators from DNA-binding proteins by comparing their enrichment of binding sites in hotspots. Applying this approach to newly mapped mouse hotspots genome-wide, we confirmed that PRDM9 is a major trans-regulator of hotspots. In addition, a list of top candidate trans-regulators of mouse hotspots is reported. Using GO analysis, we observed that the top genes are enriched for histone modification functions, highlighting the epigenetic regulatory mechanisms of recombination hotspots.
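A simple way to compare binding-site enrichment is the ratio of site density inside hotspots to site density in background regions. The sketch below assumes that definition; the abstract does not state the statistic actually used, and the protein names and counts are made up.

```python
def fold_enrichment(hot_sites, hot_bp, bg_sites, bg_bp):
    """Binding-site density in hotspots divided by density in background.

    Written as a cross-product so integer inputs give an exact ratio.
    """
    if bg_sites == 0 or hot_bp == 0:
        raise ValueError("need background sites and a non-empty hotspot region")
    return (hot_sites * bg_bp) / (bg_sites * hot_bp)


# Hypothetical counts: (sites in hotspots, sites in background).
HOT_BP, BG_BP = 1000, 1000
toy_counts = {
    "protein_A": (30, 10),
    "protein_B": (12, 10),
    "protein_C": (5, 10),
}

# Rank candidate trans-regulators from most to least enriched.
ranked = sorted(
    toy_counts,
    key=lambda name: fold_enrichment(toy_counts[name][0], HOT_BP, toy_counts[name][1], BG_BP),
    reverse=True,
)
```

A real analysis would also need a significance test and a correction for multiple comparisons across DNA-binding proteins, which this sketch omits.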
Affiliation(s)
- Min Wu: School of Computer Engineering, Nanyang Technological University, Singapore.
|