1
Zhuo L, Pan S, Li J, Fu X. Predicting miRNA-lncRNA interactions on plant datasets based on bipartite network embedding method. Methods 2022; 207:97-102. [PMID: 36155251 DOI: 10.1016/j.ymeth.2022.09.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/16/2022] [Revised: 09/04/2022] [Accepted: 09/08/2022] [Indexed: 11/15/2022]
Abstract
Research on miRNA-lncRNA interactions (MLIs) has received great attention recently because of their vital roles in microbiology and their profound significance in disease. Most related studies focus on animals, and the link prediction problem in plants is rarely discussed comprehensively. Motivated by this, we perform link prediction based on the concept of a bipartite graph and verify the approach's encouraging performance through experiments on plant datasets. We first extract attribute information and structure information as base features and further process this information for network embedding. Intra-partition and inter-partition proximity modelling is used to construct the loss function, which facilitates the training of parameters. Finally, experiments on four plant datasets show the superiority of the presented approach, reflecting the significance of this work for research in microbiology and disease.
Affiliation(s)
- Linlin Zhuo
- School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou, Zhejiang 325035, China
- Shiyao Pan
- School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou, Zhejiang 325035, China
- Jing Li
- School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou, Zhejiang 325035, China
- Xiangzheng Fu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan 410012, China
2
Kang Q, Meng J, Luan Y. RNAI-FRID: novel feature representation method with information enhancement and dimension reduction for RNA-RNA interaction. Brief Bioinform 2022; 23:6555402. [PMID: 35352114 DOI: 10.1093/bib/bbac107] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/10/2022] [Revised: 02/22/2022] [Accepted: 03/02/2022] [Indexed: 11/12/2022]
Abstract
Different ribonucleic acids (RNAs) can interact to form regulatory networks that play an important role in many life activities. Molecular biology experiments can confirm RNA-RNA interactions and so facilitate the exploration of their biological functions, but they are expensive and time-consuming. Machine learning models can predict potential RNA-RNA interactions, providing candidates for molecular biology experiments and saving considerable time and cost. A suitable set of features to represent each sample is crucial for training powerful models, but effective feature representations for RNA-RNA interaction are lacking. This study proposes a novel feature representation method with information enhancement and dimension reduction for RNA-RNA interaction (named RNAI-FRID). Diverse base features are first extracted from RNA data to capture more sample information. The extracted base features are then used to construct complex features through an arithmetic-level method, which greatly reduces the feature dimension while keeping the relationship between molecule features. Because dimension reduction may cause information loss, an arithmetic mean strategy is adopted during complex feature construction to further enhance the sample information. Finally, three feature ranking methods are integrated for feature selection on the constructed complex features, which adaptively retains important features and removes redundant ones. Extensive experimental results show that RNAI-FRID provides a reliable feature representation for RNA-RNA interaction with higher efficiency, and that the model trained with the generated features obtains better performance than other deep neural network predictors.
Affiliation(s)
- Qiang Kang
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning, 116024, China
- Jun Meng
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning, 116024, China
- Yushi Luan
- School of Bioengineering, Dalian University of Technology, Dalian, Liaoning, 116024, China
3
Zhang J, Zhang Y, Kang JY, Chen S, He Y, Han B, Liu MF, Lu L, Li L, Yi Z, Chen L. Potential transmission chains of variant B.1.1.7 and co-mutations of SARS-CoV-2. Cell Discov 2021; 7:44. [PMID: 34127650 PMCID: PMC8203788 DOI: 10.1038/s41421-021-00282-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 03/05/2021] [Accepted: 05/15/2021] [Indexed: 02/05/2023]
Abstract
The presence of SARS-CoV-2 mutants, including the emerging variant B.1.1.7, has raised great concerns in terms of pathogenesis, transmission, and immune escape. Characterizing SARS-CoV-2 mutations, evolution, and effects on infectivity and pathogenicity is crucial to the design of antibody therapies and surveillance strategies. Here, we analyzed 454,443 SARS-CoV-2 spike genes/proteins and 14,427 whole-genome sequences. We demonstrated that the early variant B.1.1.7 may not have evolved spontaneously in the United Kingdom or within human populations. Our extensive analyses suggested that Canidae, Mustelidae or Felidae, especially the Canidae family (for example, dog) could be a possible host of the direct progenitor of variant B.1.1.7. An alternative hypothesis is that the variant was simply yet to be sampled. Notably, the SARS-CoV-2 whole-genome represents a large number of potential co-mutations. In addition, we used an experimental SARS-CoV-2 reporter replicon system to introduce the dominant co-mutations NSP12_c14408t, 5'UTR_c241t, and NSP3_c3037t into the viral genome, and to monitor the effect of the mutations on viral replication. Our experimental results demonstrated that the co-mutations significantly attenuated the viral replication. The study provides valuable clues for discovering the transmission chains of variant B.1.1.7 and understanding the evolutionary process of SARS-CoV-2.
Affiliation(s)
- Jingsong Zhang
- State Key Laboratory of Cell Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai, China
- Yang Zhang
- Key Laboratory of Medical Molecular Virology (MOE/NHC/CAMS), School of Basic Medical Sciences, Shanghai Medical College, Fudan University, Shanghai, China
- Jun-Yan Kang
- State Key Laboratory of Molecular Biology, Shanghai Key Laboratory of Molecular Andrology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai, China; University of Chinese Academy of Sciences, Shanghai, China
- Shuiye Chen
- Key Laboratory of Medical Molecular Virology (MOE/NHC/CAMS), School of Basic Medical Sciences, Shanghai Medical College, Fudan University, Shanghai, China
- Yongqun He
- Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA
- Benhao Han
- State Key Laboratory of Cell Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai, China
- Mo-Fang Liu
- State Key Laboratory of Molecular Biology, Shanghai Key Laboratory of Molecular Andrology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai, China; University of Chinese Academy of Sciences, Shanghai, China
- Lina Lu
- State Key Laboratory of Cell Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai, China
- Li Li
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Zhigang Yi
- Key Laboratory of Medical Molecular Virology (MOE/NHC/CAMS), School of Basic Medical Sciences, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Public Health Clinical Center, Fudan University, Shanghai, China
- Luonan Chen
- State Key Laboratory of Cell Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai, China; School of Life Science and Technology, ShanghaiTech University, Shanghai, China; Key Laboratory of Systems Health Science of Zhejiang Province, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China; Pazhou Lab, Guangzhou, China
4
Genetic Similarity Analysis Based on Positive and Negative Sequence Patterns of DNA. Symmetry (Basel) 2020. [DOI: 10.3390/sym12122090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Similarity analysis of DNA sequences can clarify the homology between sequences and predict their structure and the relationships between them. At the same time, the frequent patterns of biological sequences not only explain the genetic characteristics of the organism but also serve as relevant markers for certain events in biological sequences. However, most existing biological sequence similarity analysis methods target the entire sequential pattern and ignore missing gene fragments that may induce potential disease. The similarity analysis of sequences containing a missing gene item has not been addressed, so some sequences with missing bases are ignored or not effectively analyzed. This paper therefore presents a new method for DNA sequence similarity analysis. Using this method, we first mined not only positive sequential patterns but also sequential patterns that were missing some of the base terms (collectively referred to as negative sequential patterns). Subsequently, we used these frequent patterns for similarity analysis on a two-dimensional plane. Several experiments were conducted to verify the effectiveness of this algorithm. The experimental results demonstrated that the algorithm can obtain various results through the selection of frequent sequential patterns and that accuracy and time efficiency were improved.