Zing Forum

NewLuggageDataset: Real-Time Abandoned Luggage Detection System Based on YOLOv12

A real-time abandoned luggage detection framework that combines enhanced YOLOv12 models with spatiotemporal reasoning. It aims to provide a reliable solution for public safety monitoring through multi-scale object detection and interpretable time-distance constraints.

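Tags: YOLOv12, object detection, abandoned luggage detection, multi-object tracking, public safety, computer vision, spatiotemporal reasoning, real-time monitoring, edge computing
Published 2026-05-09 17:05 | Recent activity 2026-05-09 17:22 | Estimated read 7 min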

Section 01

NewLuggageDataset: Real-Time Abandoned Luggage Detection with YOLOv12 & Spatiotemporal Reasoning

This project introduces a real-time abandoned luggage detection system combining enhanced YOLOv12 models and spatiotemporal reasoning. It addresses public safety challenges in crowded places (airports, stations, malls) by integrating multi-scale object detection and interpretable time-distance constraints, providing a reliable solution for security monitoring. Key features include dual YOLOv12 models for luggage detection and person tracking, a spatiotemporal reasoning module for abandonment judgment, and an open-source dataset with diverse scenarios.

Section 02

Background & Technical Challenges of Abandoned Luggage Detection

Abandoned luggage in crowded public areas poses safety risks. Traditional manual monitoring is inefficient and error-prone because operators tire and their attention lapses. The main technical challenges are:

  1. Small target detection: luggage often occupies a small part of the frame and is easily occluded or obscured by lighting and background clutter.
  2. Temporal association: judging abandonment requires tracking the luggage and its owner over time.
  3. Real-time response: each frame must be processed within milliseconds so that alerts can be raised promptly.
  4. False alarm control: frequent false alarms waste security resources and cause a 'boy who cried wolf' effect, where operators start to ignore alerts.

Section 03

Technical Architecture & Spatiotemporal Reasoning Mechanism

Dual YOLOv12 Models:

  • YOLOv12m: Specialized for luggage detection, balances accuracy and speed, and is optimized for small targets via an improved Feature Pyramid Network (FPN).
  • YOLOv12x: Focuses on person detection/tracking, adapts to dense/occluded scenes, maintains target identity despite posture changes or partial occlusion.

Spatiotemporal Reasoning:

  1. Distance-time constraints: Triggers an alarm if the luggage-person distance exceeds a threshold (e.g., 2 m) for a set time (e.g., 30 s), with dynamic threshold adjustment for different scenes.
  2. Trajectory association: Uses Hungarian algorithm for frame-to-frame matching, Kalman filter for position prediction, and feature re-identification for long-term occlusion recovery.
  3. Event state machine: Manages luggage states (normal/warning/abandoned/released) for predictable and interpretable behavior.
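
The distance-time rule and the event state machine can be sketched together. The example below is a minimal illustration, not the project's actual implementation: the class and field names are hypothetical, positions are assumed to be ground-plane coordinates in metres (which would require camera calibration), and "released" is assumed to mean that the owner has returned after a warning or alarm.

```python
from dataclasses import dataclass
from enum import Enum
from math import hypot
from typing import Optional, Tuple


class LuggageState(Enum):
    NORMAL = "normal"
    WARNING = "warning"
    ABANDONED = "abandoned"
    RELEASED = "released"


@dataclass
class LuggageTrack:
    """Hypothetical per-luggage state machine driven by the distance-time rule."""
    distance_threshold_m: float = 2.0   # example value from the text
    time_threshold_s: float = 30.0      # example value from the text
    state: LuggageState = LuggageState.NORMAL
    unattended_since: Optional[float] = None

    def update(self, t: float,
               luggage_xy: Tuple[float, float],
               owner_xy: Optional[Tuple[float, float]]) -> LuggageState:
        """Advance the state with one observation at time t (seconds).

        owner_xy is None when the owner is not visible in the frame.
        """
        if owner_xy is None:
            separated = True
        else:
            dist = hypot(luggage_xy[0] - owner_xy[0], luggage_xy[1] - owner_xy[1])
            separated = dist > self.distance_threshold_m

        if not separated:
            # Owner is within range: clear the timer. A prior warning or alarm
            # is reported once as RELEASED (assumed meaning), then NORMAL.
            if self.state in (LuggageState.WARNING, LuggageState.ABANDONED):
                self.state = LuggageState.RELEASED
            else:
                self.state = LuggageState.NORMAL
            self.unattended_since = None
            return self.state

        # Owner is too far away or missing: start or continue the timer.
        if self.unattended_since is None:
            self.unattended_since = t
        elapsed = t - self.unattended_since
        if elapsed >= self.time_threshold_s:
            self.state = LuggageState.ABANDONED
        else:
            self.state = LuggageState.WARNING
        return self.state
```

Because every transition depends only on the measured distance and the elapsed time, each alarm can be explained by the two thresholds that triggered it. Scene-specific tuning would amount to passing different threshold values per camera.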

Section 04

Dataset Construction & System Implementation

Dataset: Covers diverse scenes (airports, stations, malls), multi-view angles, time spans (day/night), and crowd densities. Annotations include bounding boxes, instance segmentation, attributes (luggage type, person posture), and trajectory associations. Data augmentation includes geometric transforms (crop/rotate/scale), lighting changes, noise injection, and occlusion simulation.
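
As an illustration of the occlusion simulation mentioned above, the sketch below covers part of an annotated bounding box with a filled rectangle. It is an assumption about how such an augmentation might look, not the dataset's actual pipeline: the function name is hypothetical, and the image is modelled as a nested list of greyscale pixel values to keep the example dependency-free.

```python
import random
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels, x2/y2 exclusive


def simulate_occlusion(image: List[List[int]], box: Box,
                       max_fraction: float = 0.5, fill: int = 0,
                       rng: Optional[random.Random] = None
                       ) -> Tuple[List[List[int]], Box]:
    """Return a copy of image with a random rectangle inside box set to fill.

    The occluder covers at most max_fraction of the box area, except for
    very small boxes where it is still at least one pixel in size.
    """
    rng = rng or random.Random()
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    # Limit each side to side * sqrt(max_fraction) so the area stays bounded.
    side_scale = max_fraction ** 0.5
    occ_w = rng.randint(1, max(1, int(w * side_scale)))
    occ_h = rng.randint(1, max(1, int(h * side_scale)))
    # Choose a top-left corner so the occluder stays inside the box.
    ox = rng.randint(x1, x2 - occ_w)
    oy = rng.randint(y1, y2 - occ_h)
    out = [row[:] for row in image]  # copy rows so the input is not modified
    for y in range(oy, oy + occ_h):
        for x in range(ox, ox + occ_w):
            out[y][x] = fill
    return out, (ox, oy, ox + occ_w, oy + occ_h)
```

Returning the occluder rectangle alongside the image lets a training pipeline record how much of each luggage box remains visible.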

Implementation:

  • Inference optimization: INT8 quantization, TensorRT acceleration, batch processing, multi-thread pipeline (video decoding/preprocessing/inference/postprocessing parallelization).
  • Edge deployment: Lightweight models for Jetson devices, model pruning/knowledge distillation, hardware acceleration (NPU/TPU).
  • Alarm & integration: Multi-level alerts, real-time visualization, ONVIF/RTSP support, Webhook for third-party system integration.
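
The multi-thread pipeline idea can be sketched with one worker thread per stage connected by bounded queues. This is a generic pattern under stated assumptions, not the project's code: the function names are hypothetical, and the stage callables stand in for real decoding, preprocessing, inference, and postprocessing steps.

```python
import queue
import threading
from typing import Any, Callable, Iterable, List

_STOP = object()  # sentinel that marks the end of the stream


def stage(fn: Callable[[Any], Any], inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Apply fn to every item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next stage
            return
        outbox.put(fn(item))


def run_pipeline(frames: Iterable[Any], stages: List[Callable[[Any], Any]],
                 maxsize: int = 8) -> List[Any]:
    """Push frames through the stage functions, one thread per stage."""
    # Bounded queues apply back-pressure when a downstream stage is slower.
    queues = [queue.Queue(maxsize=maxsize) for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=stage, args=(fn, queues[i], queues[i + 1]), daemon=True)
        for i, fn in enumerate(stages)
    ]
    for worker in workers:
        worker.start()

    def feed() -> None:
        for frame in frames:
            queues[0].put(frame)
        queues[0].put(_STOP)

    # Feed from a separate thread so the caller can drain results concurrently.
    threading.Thread(target=feed, daemon=True).start()

    results = []
    while True:
        item = queues[-1].get()
        if item is _STOP:
            break
        results.append(item)
    for worker in workers:
        worker.join()
    return results
```

With a single thread per stage, frames leave the pipeline in the order they entered. In CPython this layout mainly pays off when the stages spend their time in I/O or in native code that releases the interpreter lock, which is typically the case for video decoding and GPU inference calls.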

Section 05

Performance Evaluation & Application Scenarios

Performance:

  • Accuracy: mAP@0.5 above 85% for luggage detection, recall above 95% for obvious abandonment cases, and fewer than 1 false alarm per camera per hour.
  • Real-time: under 20 ms per frame on an NVIDIA T4 GPU, supports 50+ FPS, handles 16 1080p streams per server, and keeps end-to-end delay under 2 s.
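
The latency and frame-rate figures are related by simple arithmetic, which the sketch below makes explicit. The helper names are hypothetical. The per-stream figure assumes frames from all streams are processed one at a time on a single GPU without batching, which the source does not state; batching, frame skipping, or multiple GPUs would change it.

```python
def max_fps(latency_ms: float) -> float:
    """Upper bound on frames per second when each frame takes latency_ms."""
    return 1000.0 / latency_ms


def per_stream_fps(latency_ms: float, n_streams: int) -> float:
    """Frame rate per stream if one device serves all streams serially (assumption)."""
    return max_fps(latency_ms) / n_streams


print(max_fps(20.0))             # 50.0 frames per second at 20 ms per frame
print(per_stream_fps(20.0, 16))  # 3.125 frames per second per stream
```

Under that serial assumption each of the 16 streams would be analysed roughly three times per second, which is still frequent relative to an abandonment window measured in tens of seconds.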

Applications: Airport security areas, train platforms, shopping malls, large event venues (concerts/sports events) to detect abandoned luggage and enhance public safety.

Section 06

Limitations & Future Improvements

Limitations:

  • Extreme crowding: Lower accuracy in highly crowded/occluded scenes for luggage-person association.
  • Similar luggage: Identity retention challenges for visually similar items.
  • Complex interactions: Need to optimize logic for multi-person luggage handling or temporary placement.

Future: Multimodal fusion (audio + vision), behavior pattern learning (reduce rule dependency), active interaction (voice reminders for passengers), cross-camera tracking (wider monitoring coverage).

Section 07

Open Source Contribution & Conclusion

Open Source: Provides annotated dataset, pre-trained YOLOv12 weights, full code implementation, and detailed deployment docs for research/development.

Conclusion: The system combines advanced object detection with interpretable spatiotemporal reasoning, offering strong technical support for public safety. Future development should balance technological progress with privacy protection and ethical norms to ensure tools serve public well-being.