An M-DeepSORT Algorithm for Pedestrian Detection and Tracking Based on Video Images – A Case Study in Ji-nan Subway Station
DOI:
https://doi.org/10.7307/ptt.v37i2.635
Keywords:
passenger trajectory tracking, CB Kalman filtering, trajectory update, momentum, M-DeepSORT
Abstract
With ongoing urbanisation, the subway has become a vital component of modern cities, serving the growing travel demand of a mobile population. However, the increasing complexity of passenger flows within subway stations poses challenges for operations management. To optimise subway operations and enhance safety, researchers have focused on extracting and analysing pedestrian trajectories within subway stations. Traditional trajectory extraction methods are limited by manual feature design and multi-stage processing. Leveraging advances in deep learning, this paper integrates M-DeepSORT with YOLOv5 and proposes a feature association matching approach that addresses trajectory drift by considering motion and appearance matching simultaneously. A confidence-based (CB) Kalman filtering method is proposed to handle random noise in pedestrian detection within subway scenes. A momentum-based update of the passenger trajectory centre reduces jitter, resulting in smoother extracted trajectories. Experimental results confirm the effectiveness of the proposed algorithm in detecting, tracking and statistically analysing passenger flow trajectories in subway station corridors, and show robust performance across diverse subway station scenarios.
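The abstract names two smoothing ideas without giving their formulas: a momentum-based centre update and a confidence-based Kalman filter. The sketch below illustrates one plausible reading of each and is not the authors' implementation. It assumes the momentum update is an exponential moving average of the box centre with a hypothetical coefficient `beta`, and that the detection confidence `c` scales the measurement-noise variance as `(1 - c) * R`. The function names, the scalar (1-D) Kalman update and the `floor` parameter are all illustrative assumptions.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


def momentum_smooth(centres: Iterable[Point], beta: float = 0.8) -> List[Point]:
    """Smooth box centres with an exponential moving average.

    ASSUMPTION: `beta` is a hypothetical momentum coefficient in [0, 1).
    Larger values keep more of the previous smoothed centre, which
    suppresses more frame-to-frame jitter but reacts more slowly.
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must be in [0, 1)")
    smoothed: List[Point] = []
    for x, y in centres:
        if not smoothed:
            smoothed.append((x, y))  # first detection initialises the track
            continue
        px, py = smoothed[-1]
        smoothed.append((beta * px + (1.0 - beta) * x,
                         beta * py + (1.0 - beta) * y))
    return smoothed


def confidence_scaled_noise(base_var: float, confidence: float,
                            floor: float = 1e-6) -> float:
    """Return a measurement-noise variance that shrinks as confidence grows.

    ASSUMPTION: the scaling rule (1 - c) * R is illustrative; the abstract
    does not state the rule used by the CB Kalman filter.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    # The floor keeps the variance strictly positive when confidence == 1.
    return max((1.0 - confidence) * base_var, floor)


def kalman_update_1d(mean: float, var: float, z: float,
                     meas_var: float) -> Tuple[float, float]:
    """Standard scalar Kalman measurement update for one coordinate."""
    gain = var / (var + meas_var)  # Kalman gain in [0, 1)
    return mean + gain * (z - mean), (1.0 - gain) * var


if __name__ == "__main__":
    raw = [(100.0, 50.0), (104.0, 49.0), (99.0, 52.0), (105.0, 50.0)]
    print(momentum_smooth(raw, beta=0.8))
    for conf in (0.9, 0.3):
        r = confidence_scaled_noise(4.0, conf)
        print(conf, kalman_update_1d(100.0, 4.0, 110.0, r))
```

Under these assumptions, a high-confidence detection pulls the state estimate closer to the measurement than a low-confidence one, while the momentum step damps residual jitter in the reported centre. A full tracker would apply the same noise scaling to the measurement covariance matrix of a multi-dimensional Kalman filter rather than to a single coordinate.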
License
Copyright (c) 2025 Wei ZHANG, Chuang ZHU, Yunchao QU, Guanhua LIU, Der-Horng LEE

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.