Ahmed Imran, Jeon Gwanggil, Chehri Abdellah and Hassan Mohammad Mehedi. (2021). Adapting Gaussian YOLOv3 with transfer learning for overhead view human detection in smart cities and societies. Sustainable Cities and Society, 70, e102908.
The full text is not available for this document.
Official URL: http://dx.doi.org/doi:10.1016/j.scs.2021.102908
Abstract
Deep neural networks are now widely applied in sustainable smart cities and societies, including smart manufacturing, healthcare, industry, agriculture, surveillance, and various artificial intelligence-based real-life applications. In this context, human detection has gained notable attention because it is a crucial task in intelligent surveillance. Researchers have applied a variety of computer vision and deep neural network-based techniques to human detection; however, they have often focused on the frontal camera view. In this work, we therefore introduce a human detection system for intelligent surveillance in smart cities and societies from a distinct perspective, namely an overhead view, which can provide sufficient visibility and coverage of a scene in congested and obstructed environments. However, detecting humans from such an extreme viewpoint can be difficult, as there are significant variations in humans' poses and appearances. To address this, we use the Gaussian YOLOv3 algorithm, a deep neural network-based object detection technique, for human detection. The algorithm estimates bounding box uncertainty by modeling the box coordinates as Gaussian parameters, which improves accuracy and reduces false positives. Gaussian YOLOv3 is combined with channel attention and feature intertwine modules to improve specific feature maps. The channel attention module is applied to the feature map to learn each channel's weight autonomously, strengthen the key features, and enhance the network's ability to discriminate between humans and background. At the same time, different channels of the feature map are intertwined to obtain more representative features. Finally, the features obtained from the attention and feature intertwine modules are fused to form an improved feature map.
In addition, transfer learning is adopted to further increase the algorithm's human detection accuracy. The experimental results show that training improves the Gaussian YOLOv3 algorithm's human detection capability, with an overall detection accuracy of 94%.
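As an illustration of the bounding box uncertainty idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that each box coordinate is predicted as a mean and a variance, that training uses a Gaussian negative log-likelihood, and that a detection score is down-weighted by the mean predicted variance (taken to lie in [0, 1]). These assumptions follow the commonly described Gaussian YOLOv3 formulation and are not confirmed by this record. The function names `gaussian_nll`, `box_nll`, and `uncertainty_aware_score` are hypothetical.

```python
import math


def gaussian_nll(target, mean, var, eps=1e-9):
    """Negative log-likelihood of one box coordinate under N(mean, var)."""
    var = max(var, eps)  # guard against a zero or negative variance
    return 0.5 * math.log(2.0 * math.pi * var) + (target - mean) ** 2 / (2.0 * var)


def box_nll(target_box, mean_box, var_box):
    """Sum the per-coordinate NLL over the four box coordinates (x, y, w, h)."""
    return sum(
        gaussian_nll(t, m, v) for t, m, v in zip(target_box, mean_box, var_box)
    )


def uncertainty_aware_score(objectness, class_prob, var_box):
    """Down-weight a detection by its mean localization uncertainty.

    Assumption: each variance lies in [0, 1], e.g. after a sigmoid.
    """
    mean_uncertainty = sum(var_box) / len(var_box)
    return objectness * class_prob * (1.0 - mean_uncertainty)


if __name__ == "__main__":
    target = (0.50, 0.50, 0.20, 0.40)
    good = (0.51, 0.49, 0.21, 0.41)   # close to the target
    bad = (0.80, 0.20, 0.50, 0.10)    # far from the target
    confident = (0.01, 0.01, 0.01, 0.01)

    # A confident but wrong prediction is penalized far more than a
    # confident and accurate one.
    print(box_nll(target, good, confident))
    print(box_nll(target, bad, confident))
    print(uncertainty_aware_score(0.9, 0.8, (0.1, 0.1, 0.1, 0.1)))
```

Under these assumptions, a box with large predicted variance contributes a lower final score, which is one way an uncertainty estimate could help suppress false positives.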
Document type: | Peer-reviewed journal article |
---|---|
Volume: | 70 |
Pages: | e102908 |
Peer-reviewed version: | Yes |
Date: | July 2021 |
Subjects: | Natural sciences and engineering > Engineering; Natural sciences and engineering > Engineering > Computer engineering and software engineering; Natural sciences and engineering > Applied sciences |
Department, module, service and research unit: | Departments and modules > Department of Applied Sciences > Engineering Module |
Keywords: | deep neural network, smart cities and societies, human detection, overhead view, transfer learning, Gaussian YOLOv3 |
Deposited on: | 25 May 2021 17:46 |
Last modified: | 25 May 2021 17:46 |