Daytime and nighttime pedestrian detection using YOLO-v5
Abstract
This article presents a new deep-learning-based algorithm, termed multispectral, for pedestrian detection in both daytime and nighttime, aimed at vehicle-safety applications. The proposal builds on YOLO-v5 and consists of two subnetworks that operate on color (RGB) and thermal (IR) images, respectively. Their information is then merged through a fusion subnetwork that integrates the RGB and IR branches into a single pedestrian detector. Experiments to assess the quality of the proposal were carried out on several public pedestrian datasets for daytime and nighttime detection. The main results, in terms of the mAP metric with the IoU threshold set at 0.5, are 96.6 % on INRIA, 89.2 % on CVC09, 90.5 % on LSIFIR, 56 % on FLIR-ADAS, 79.8 % on CVC14, 72.3 % on NightOwls, and 53.3 % on KAIST.
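The mAP figures above count a detection as correct when its Intersection-over-Union (IoU) with a ground-truth box reaches 0.5. As a minimal illustration of that matching criterion (not the authors' implementation; the box format and function names are assumptions), the following Python sketch computes IoU for axis-aligned boxes and applies the 0.5 threshold:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    # Coordinates of the intersection rectangle
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(pred, gt, threshold=0.5):
    # Matching criterion behind the reported mAP: IoU >= 0.5
    return iou(pred, gt) >= threshold
```

With this criterion fixed, mAP is then obtained by ranking detections by confidence and averaging precision over recall levels, as in the PASCAL VOC protocol.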
Article details
The Universidad Politécnica Salesiana of Ecuador retains the economic rights (copyright) of the published works and encourages their reuse. The works are published in the journal's electronic edition under a Creative Commons Attribution / Non-Commercial-No Derivatives 4.0 Ecuador license: they may be copied, used, disseminated, transmitted, and publicly displayed.
The undersigned author(s) partially transfer the property rights (copyright) of the present work to the Universidad Politécnica Salesiana of Ecuador for the printed editions.
They further declare that they have observed the ethical principles of research and are free of any conflict of interest.
The author(s) certify that this work has not been published, nor is it under consideration for publication, in any other journal or editorial work.
The author(s) are responsible for its content and for having contributed to the conception, design, and execution of the work, to the analysis and interpretation of the data, and for having participated in the drafting of the text and its revisions, as well as in the approval of the version finally submitted.