Pedestrian detection at daytime and nighttime conditions based on YOLO-v5
Abstract
This paper presents a new deep-learning algorithm for daytime and nighttime pedestrian detection, termed multispectral, aimed at vehicular safety applications. The proposal is based on YOLO-v5 and consists of two subnetworks that process color (RGB) and thermal (IR) images, respectively. Their outputs are then combined through a fusion subnetwork that integrates the RGB and IR networks into a single pedestrian detector. Experiments to assess the quality of the proposal were conducted on several public pedestrian databases covering both daytime and nighttime detection. The main results, according to the mAP metric at an IoU of 0.5, were: 96.6% on the INRIA database, 89.2% on CVC09, 90.5% on LSIFIR, 56% on FLIR-ADAS, 79.8% on CVC14, 72.3% on Nightowls, and 53.3% on KAIST.
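The evaluation metric above, mAP at an IoU of 0.5, counts a predicted bounding box as a true positive only when its intersection-over-union with a ground-truth box reaches 0.5. As a minimal sketch of that criterion (the `[x1, y1, x2, y2]` box format is an assumption for illustration, not taken from the paper):

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes [x1, y1, x2, y2]."""
    # Corners of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    # Clamp to zero so disjoint boxes yield no intersection area.
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# A detection overlapping half of a ground-truth pedestrian box:
print(iou([0, 0, 10, 20], [5, 0, 15, 20]))  # ~0.333, below the 0.5 threshold
```

At the 0.5 threshold used in the paper, the detection above would be scored as a false positive; mAP then averages precision over recall levels (and over classes) using this matching rule.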
Article Details
The Universidad Politécnica Salesiana of Ecuador retains the copyright of published works and encourages their reuse. Works are published in the electronic edition of the journal under a Creative Commons Attribution/Noncommercial-No Derivative Works 4.0 Ecuador license: they may be copied, used, disseminated, transmitted, and publicly displayed.
The undersigned author partially transfers the copyrights of this work to the Universidad Politécnica Salesiana of Ecuador for printed editions.
It is also stated that they have respected the ethical principles of research and are free from any conflict of interest. The author(s) certify that this work has not been published, nor is it under consideration for publication in any other journal or editorial work.
The author(s) are responsible for the content of the work and declare that they have contributed to its conception, design, and completion, to the analysis and interpretation of the data, and that they have participated in the writing of the text and its revisions, as well as in the approval of the final version submitted.