1
|
Ma B, Gong H, Xu Q, Gao Y, Guan A, Wang H, Hua K, Luo R, Jin H. Bases-dependent Rapid Phylogenetic Clustering (Bd-RPC) enables precise and efficient phylogenetic estimation in viruses. Virus Evol 2024; 10:veae005. [PMID: 38361823 PMCID: PMC10868571 DOI: 10.1093/ve/veae005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2023] [Revised: 01/06/2024] [Accepted: 01/22/2024] [Indexed: 02/17/2024] Open
Abstract
Understanding phylogenetic relationships among species is essential for many biological studies, which call for an accurate phylogenetic tree to understand major evolutionary transitions. The phylogenetic analyses present a major challenge in estimation accuracy and computational efficiency, especially recently facing a wave of severe emerging infectious disease outbreaks. Here, we introduced a novel, efficient framework called Bases-dependent Rapid Phylogenetic Clustering (Bd-RPC) for new sample placement for viruses. In this study, a brand-new recoding method called Frequency Vector Recoding was implemented to approximate the phylogenetic distance, and the Phylogenetic Simulated Annealing Search algorithm was developed to match the recoded distance matrix with the phylogenetic tree. Meanwhile, the indel (insertion/deletion) was heuristically introduced to foreign sequence recognition for the first time. Here, we compared the Bd-RPC with the recent placement software (PAGAN2, EPA-ng, TreeBeST) and evaluated it in Alphacoronavirus, Alphaherpesvirinae, and Betacoronavirus by using Split and Robinson-Foulds distances. The comparisons showed that Bd-RPC maintained the highest precision with great efficiency, demonstrating good performance in new sample placement on all three virus genera. Finally, a user-friendly website (http://www.bd-rpc.xyz) is available for users to classify new samples instantly and facilitate exploration of the phylogenetic research in viruses, and the Bd-RPC is available on GitHub (http://github.com/Bin-Ma/bd-rpc).
Collapse
Affiliation(s)
- Bin Ma
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
| | - Huimin Gong
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
| | - Qianshuai Xu
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
| | - Yuan Gao
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
| | - Aohan Guan
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
| | - Haoyu Wang
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
| | - Kexin Hua
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
| | - Rui Luo
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
| | - Hui Jin
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
| |
Collapse
|
2
|
Liu W, Zeng S, Wu G, Li H, Chen F. Rice Seed Purity Identification Technology Using Hyperspectral Image with LASSO Logistic Regression Model. Sensors (Basel) 2021; 21:4384. [PMID: 34206783 PMCID: PMC8271842 DOI: 10.3390/s21134384] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/29/2021] [Revised: 06/22/2021] [Accepted: 06/24/2021] [Indexed: 11/25/2022]
Abstract
Hyperspectral technology is used to obtain spectral and spatial information of samples simultaneously and demonstrates significant potential for use in seed purity identification. However, it has certain limitations, such as high acquisition cost and massive redundant information. This study integrates the advantages of the sparse feature of the least absolute shrinkage and selection operator (LASSO) algorithm and the classification feature of the logistic regression model (LRM). We propose a hyperspectral rice seed purity identification method based on the LASSO logistic regression model (LLRM). The feasibility of using LLRM for the selection of feature wavelength bands and seed purity identification are discussed using four types of rice seeds as research objects. The results of 13 different adulteration cases revealed that the value of the regularisation parameter was different in each case. The recognition accuracy of LLRM and average recognition accuracy were 91.67-100% and 98.47%, respectively. Furthermore, the recognition accuracy of full-band LRM was 71.60-100%. However, the average recognition accuracy was merely 89.63%. These results indicate that LLRM can select the feature wavelength bands stably and improve the recognition accuracy of rice seeds, demonstrating the feasibility of developing a hyperspectral technology with LLRM for seed purity identification.
Collapse
Affiliation(s)
- Weihua Liu
- School of Electric & Electronic Engineering, Wuhan Polytechnic University, Wuhan 430023, China;
| | - Shan Zeng
- School of Mathematics & Computer Science, Wuhan Polytechnic University, Wuhan 430023, China; (H.L.); (F.C.)
| | - Guiju Wu
- The Key Laboratory of Earthquake Geodesy, Institute of Seismology, China Earthquake Administration, Wuhan 430023, China;
| | - Hao Li
- School of Mathematics & Computer Science, Wuhan Polytechnic University, Wuhan 430023, China; (H.L.); (F.C.)
| | - Feifei Chen
- School of Mathematics & Computer Science, Wuhan Polytechnic University, Wuhan 430023, China; (H.L.); (F.C.)
| |
Collapse
|