This study uses the dataset published at https://data.mendeley.com/datasets/46htwnp833/2, which contains visible-near-infrared (Vis-NIR) spectra over the 309 nm to 1149 nm wavelength range for 11,691 mangoes of 10 varieties from 2 growing regions in Australia. Machine learning models were developed in the open-source programming language Python: principal component analysis (PCA) combined with support vector machines (SVM), decision trees (DT), random forests (RF), and artificial neural networks (ANN); partial least squares discriminant analysis (PLS-DA); and a deep learning model, the 1-dimensional convolutional neural network (1D-CNN). Preprocessing was carried out on the full spectra and comprised second-derivative transformation, smoothing with the Savitzky-Golay algorithm, and class balancing with the Synthetic Minority Oversampling Technique (SMOTE). The results showed that applying SMOTE before training the machine learning models significantly improved classification accuracy. Furthermore, the 1D-CNN model, despite its more complex structure, achieved higher classification performance than the conventional machine learning models. The accuracy of the 1D-CNN model in classifying mango ripeness, mango variety, and growing location was 99.40%, 94.35%, and 96.92%, respectively. The 1D-CNN deep learning model is therefore well suited to classifying samples from spectral data in large datasets containing tens of thousands of samples.
mango classification, machine learning, deep learning, 1D-CNN, Vis-NIR spectra
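The two preprocessing operations named in the abstract, a Savitzky-Golay second derivative and SMOTE-style class balancing, can be illustrated with the minimal NumPy sketch below. It is not the code used in the study: the function names, the window length of 11 points, the polynomial order of 2, and the choice of 5 nearest neighbours are all assumptions made for illustration, and the input is random stand-in data rather than the mango spectra. The derivative is expressed per squared sample step, not per squared nanometre.

```python
import numpy as np
from math import factorial


def savgol_coeffs(window, polyorder, deriv):
    """Savitzky-Golay weights for the deriv-th derivative at the window centre."""
    half = window // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    # Columns are offsets**0, offsets**1, ..., offsets**polyorder.
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row `deriv` of the pseudo-inverse yields the fitted coefficient a_deriv;
    # multiplying by deriv! gives the derivative of the fitted polynomial at 0.
    return np.linalg.pinv(design)[deriv] * factorial(deriv)


def second_derivative(spectra, window=11, polyorder=2):
    """Smoothed second derivative of each row of a 2-D (samples x bands) array."""
    spectra = np.asarray(spectra, dtype=float)
    weights = savgol_coeffs(window, polyorder, deriv=2)
    half = window // 2
    # Edge padding keeps the output the same width as the input; values within
    # `half` points of either end are therefore only approximate.
    padded = np.pad(spectra, ((0, 0), (half, half)), mode="edge")
    # np.convolve flips its kernel, so reverse the weights to get a correlation.
    return np.array([np.convolve(row, weights[::-1], mode="valid") for row in padded])


def smote_oversample(X, y, k=5, seed=0):
    """Balance classes by interpolating between same-class nearest neighbours."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [X], [y]
    for cls, count in zip(classes, counts):
        X_cls = X[y == cls]
        n_new = int(target - count)
        if n_new == 0 or len(X_cls) < 2:
            continue  # majority class, or too few samples to interpolate
        # Pairwise Euclidean distances within the class (fine for a sketch,
        # memory-hungry for very large classes).
        dist = np.linalg.norm(X_cls[:, None, :] - X_cls[None, :, :], axis=2)
        np.fill_diagonal(dist, np.inf)
        n_neighbours = min(k, len(X_cls) - 1)
        neighbours = np.argsort(dist, axis=1)[:, :n_neighbours]
        base = rng.integers(0, len(X_cls), size=n_new)
        chosen = neighbours[base, rng.integers(0, n_neighbours, size=n_new)]
        gap = rng.random((n_new, 1))
        # Each synthetic sample lies on the segment between a sample and a neighbour.
        X_parts.append(X_cls[base] + gap * (X_cls[chosen] - X_cls[base]))
        y_parts.append(np.full(n_new, cls))
    return np.vstack(X_parts), np.concatenate(y_parts)


# Synthetic stand-in data (not the mango dataset): 30 majority, 10 minority rows.
demo_rng = np.random.default_rng(42)
X_demo = demo_rng.normal(size=(40, 50))
y_demo = np.array([0] * 30 + [1] * 10)
X_bal, y_bal = smote_oversample(second_derivative(X_demo), y_demo)
print(np.unique(y_bal, return_counts=True))
```

In a real evaluation, oversampling would normally be applied to the training split only, so that synthetic samples do not leak into the test set. Library implementations such as `scipy.signal.savgol_filter` and the `SMOTE` class in imbalanced-learn would be the usual choice over hand-written versions.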