1
Pillay AB, Pathmanathan D, Dabo-Niang S, Abu A, Omar H. Functional data geometric morphometrics with machine learning for craniodental shape classification in shrews. Sci Rep 2024; 14:15579. [PMID: 38971911 PMCID: PMC11227550 DOI: 10.1038/s41598-024-66246-z] [Received: 10/09/2023] [Accepted: 06/29/2024] [Indexed: 07/08/2024]
Abstract
This work proposes a functional data analysis approach for morphometrics in classifying three shrew species (S. murinus, C. monticola, and C. malayana) from Peninsular Malaysia. Functional data geometric morphometrics (FDGM) for 2D landmark data is introduced and its performance is compared with classical geometric morphometrics (GM). The FDGM approach converts 2D landmark data into continuous curves, which are then represented as linear combinations of basis functions. The landmark data was obtained from 89 crania of shrew specimens based on three craniodental views (dorsal, jaw, and lateral). Principal component analysis and linear discriminant analysis were applied to both GM and FDGM methods to classify the three shrew species. This study also compared four machine learning approaches (naïve Bayes, support vector machine, random forest, and generalised linear model) using predicted PC scores obtained from both methods (a combination of all three craniodental views and individual views). The analyses favoured FDGM and the dorsal view was the best view for distinguishing the three species.
Affiliation(s)
- Aneesha Balachandran Pillay
  - Faculty of Science, Institute of Mathematical Sciences, Universiti Malaya, Kuala Lumpur, Wilayah Persekutuan Kuala Lumpur, Malaysia
- Dharini Pathmanathan
  - Faculty of Science, Institute of Mathematical Sciences, Universiti Malaya, Kuala Lumpur, Wilayah Persekutuan Kuala Lumpur, Malaysia
- Sophie Dabo-Niang
  - Laboratoire Paul Painlevé CNRS 8524, INRIA-MODAL, Université de Lille, Villeneuve d'Ascq, France
- Arpah Abu
  - Faculty of Science, Institute of Biological Sciences, Universiti Malaya, Kuala Lumpur, Wilayah Persekutuan Kuala Lumpur, Malaysia
- Hasmahzaiti Omar
  - Faculty of Science, Institute of Biological Sciences, Universiti Malaya, Kuala Lumpur, Wilayah Persekutuan Kuala Lumpur, Malaysia
2
Ling MH, Ivorra T, Heo CC, Wardhana AH, Hall MJR, Tan SH, Mohamed Z, Khang TF. Machine learning analysis of wing venation patterns accurately identifies Sarcophagidae, Calliphoridae and Muscidae fly species. Med Vet Entomol 2023; 37:767-781. [PMID: 37477152 DOI: 10.1111/mve.12682] [Received: 02/14/2023] [Accepted: 07/03/2023] [Indexed: 07/22/2023]
Abstract
In medical, veterinary and forensic entomology, the ease and affordability of image data acquisition have resulted in whole-image analysis becoming an invaluable approach for species identification. Krawtchouk moment invariants are a classical mathematical transformation that can extract local features from an image, thus allowing subtle species-specific biological variations to be accentuated for subsequent analyses. We extracted Krawtchouk moment invariant features from binarised wing images of 759 male fly specimens from the Calliphoridae, Sarcophagidae and Muscidae families (13 species and a species variant). Subsequently, we trained the Generalized, Unbiased, Interaction Detection and Estimation random forests classifier using linear discriminants derived from these features and inferred the species identity of specimens from the test samples. Fivefold cross-validation results show a 98.56 ± 0.38% (standard error) mean identification accuracy at the family level and a 91.04 ± 1.33% mean identification accuracy at the species level. The mean F1-score of 0.89 ± 0.02 reflects good balance of precision and recall properties of the model. The present study consolidates findings from previous small pilot studies of the usefulness of wing venation patterns for inferring species identities. Thus, the stage is set for the development of a mature data analytic ecosystem for routine computer image-based identification of fly species that are of medical, veterinary and forensic importance.
Affiliation(s)
- Min Hao Ling
  - Institute of Mathematical Sciences, Faculty of Science, Universiti Malaya, Kuala Lumpur, Malaysia
- Tania Ivorra
  - Department of Medical Microbiology and Parasitology, Faculty of Medicine, Universiti Teknologi MARA (UiTM), Sungai Buloh, Selangor, Malaysia
  - Department of Environmental Sciences and Natural Resources, University of Alicante, Alicante, Spain
- Chong Chin Heo
  - Department of Medical Microbiology and Parasitology, Faculty of Medicine, Universiti Teknologi MARA (UiTM), Sungai Buloh, Selangor, Malaysia
- April Hari Wardhana
  - Research Center for Veterinary Science, The National Research and Innovation Agency, Bogor, Indonesia
  - Faculty of Veterinary Medicine, Airlangga University, Surabaya, Indonesia
- Siew Hwa Tan
  - International Department of Dipterology, Kuala Lumpur Laboratory, Kuala Lumpur, Malaysia
  - Institute of Biological Sciences, Faculty of Science, Universiti Malaya, Kuala Lumpur, Malaysia
- Zulqarnain Mohamed
  - Institute of Biological Sciences, Faculty of Science, Universiti Malaya, Kuala Lumpur, Malaysia
- Tsung Fei Khang
  - Institute of Mathematical Sciences, Faculty of Science, Universiti Malaya, Kuala Lumpur, Malaysia
  - Universiti Malaya Centre for Data Analytics, Universiti Malaya, Kuala Lumpur, Malaysia
3
Goh JY, Khang TF. On the classification of simple and complex biological images using Krawtchouk moments and Generalized pseudo-Zernike moments: a case study with fly wing images and breast cancer mammograms. PeerJ Comput Sci 2021; 7:e698. [PMID: 34604523 PMCID: PMC8444072 DOI: 10.7717/peerj-cs.698] [Received: 03/19/2021] [Accepted: 08/06/2021] [Indexed: 06/13/2023]
Abstract
In image analysis, orthogonal moments are useful mathematical transformations for creating new features from digital images. Moreover, orthogonal moment invariants produce image features that are resistant to translation, rotation, and scaling operations. Here, we show the result of a case study in biological image analysis to help researchers judge the potential efficacy of image features derived from orthogonal moments in a machine learning context. In taxonomic classification of forensically important flies from the Sarcophagidae and the Calliphoridae family (n = 74), we found the GUIDE random forests model was able to completely classify samples from 15 different species correctly based on Krawtchouk moment invariant features generated from fly wing images, with zero out-of-bag error probability. For the more challenging problem of classifying breast masses based solely on digital mammograms from the CBIS-DDSM database (n = 1,151), we found that image features generated from the Generalized pseudo-Zernike moments and the Krawtchouk moments only enabled the GUIDE kernel model to achieve modest classification performance. However, using the predicted probability of malignancy from GUIDE as a feature together with five expert features resulted in a reasonably good model that has mean sensitivity of 85%, mean specificity of 61%, and mean accuracy of 70%. We conclude that orthogonal moments have high potential as informative image features in taxonomic classification problems where the patterns of biological variations are not overly complex. For more complicated and heterogeneous patterns of biological variations such as those present in medical images, relying on orthogonal moments alone to reach strong classification performance is unrealistic, but integrating prediction result using them with carefully selected expert features may still produce reasonably good prediction models.
Affiliation(s)
- Jia Yin Goh
  - Institute of Mathematical Sciences, Faculty of Science, Universiti Malaya, Kuala Lumpur, Malaysia
- Tsung Fei Khang
  - Institute of Mathematical Sciences, Faculty of Science, Universiti Malaya, Kuala Lumpur, Malaysia
  - Universiti Malaya Centre for Data Analytics, Universiti Malaya, Kuala Lumpur, Malaysia