Swin Transformer V2 for the classification of coffee from Loja

Patricio Bolívar Betancourt Ludeña
Oscar M. Cumbicus-Pineda

Abstract

This study presents a binary classification model for green arabica coffee beans from the Loja region of Ecuador, based on the Swin Transformer V2 architecture. Two data sources were used: the public USK-Coffee dataset, of Indonesian origin, and a proprietary dataset captured under controlled conditions. Two training strategies were evaluated, sequential transfer and unified training, with the latter reaching a validation accuracy of 98.30%. After hyperparameter optimization, the model achieved an accuracy of 100% on a test set of 150 images and 93% on an external generalization set of 400 images with variable lighting and background conditions. The model's interpretability was validated with Grad-CAM, which showed that the network focuses its attention on genuinely defective regions. An ablation analysis showed that the performance drop in uncontrolled scenarios is due mainly to sensitivity to noise and extreme lighting. The main contributions are a specialized dataset and an efficient model for the automatic classification of green arabica coffee.
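To make the reported percentages concrete, the short sketch below converts them into approximate image counts. It assumes the reported figures are plain accuracies over the full sets, and that the 93% value may be rounded, so the counts are indicative only. The helper name `correct_count` is hypothetical and is not taken from the article.

```python
def correct_count(accuracy_pct: float, n_images: int) -> int:
    """Nearest whole number of correct predictions implied by an accuracy.

    Assumption: accuracy_pct is a plain accuracy over all n_images.
    """
    return round(accuracy_pct / 100 * n_images)


# Figures reported in the abstract.
test_correct = correct_count(100, 150)      # test set of 150 images
external_correct = correct_count(93, 400)   # external set of 400 images
external_errors = 400 - external_correct

print(test_correct, external_correct, external_errors)
```

Under these assumptions, the external generalization result corresponds to roughly 372 correct classifications and 28 errors out of 400 images, while the test set has no errors.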

Article details

Section
Scientific Article
