
An Adaptive YOLO11 Framework for the Localisation, Tracking, and Imaging of Small Aerial Targets Using a Pan–Tilt–Zoom Camera Network

  • Ming Him Lui
  • Haixu Liu
  • Zhuochen Tang
  • Hang Yuan
  • David Williams
  • Dongjin Lee
  • K. C. Wong*
  • Zihao Wang
  • *Corresponding author for this work
  • University of Sydney
  • Australian National University
  • SiNAB Pty Ltd.
  • Hanseo University

Research output: Contribution to journal › Article › peer-review

Abstract

This article presents a cost-effective camera network system that employs neural network-based object detection and stereo vision to assist a pan–tilt–zoom (PTZ) camera in imaging fast, erratically moving small aerial targets. Compared to traditional radar systems, this approach supports real-time target differentiation and is easier to deploy. Based on the principle of knowledge distillation, a novel data augmentation method is proposed that coordinates the latest open-source pre-trained large models for semantic segmentation, text generation, and image generation to train a BicycleGAN for image enhancement. The resulting dataset is tested on various model structures and backbone sizes of two mainstream object detection frameworks, Ultralytics’ YOLO and MMDetection. Additionally, the algorithm implements and compares two popular object trackers, BoT-SORT and ByteTrack. The experimental proof-of-concept deploys the YOLOv8n model, which achieves an average precision of 82.2% and an inference time of 0.6 ms. Alternatively, the YOLO11x model maximises average precision at 86.7% while maintaining an inference time of 9.3 ms without bottlenecking subsequent processes. Stereo vision localises a drone flying at over 1 m/s to within a median error of 90 mm in an 8 m × 4 m area of interest. Stable single-object tracking with the PTZ camera is successful at 15 fps with an accuracy of 92.58%.
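
As a rough illustration of why a 9.3 ms inference time need not bottleneck later stages, the sketch below compares the two quoted inference times with the per-frame time budget at the quoted 15 fps tracking rate. It assumes, for illustration only, that the detector runs once per frame inside that 15 fps loop; the abstract does not state how the pipeline is scheduled, and the function names are hypothetical.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def budget_fraction(inference_ms: float, fps: float) -> float:
    """Fraction of the per-frame budget consumed by detector inference."""
    return inference_ms / frame_budget_ms(fps)


# Figures quoted in the abstract.
TRACKING_FPS = 15.0
INFERENCE_MS = {"YOLOv8n": 0.6, "YOLO11x": 9.3}

# Assumption (not from the source): one detector pass per tracked frame.
for name, ms in INFERENCE_MS.items():
    share = budget_fraction(ms, TRACKING_FPS)
    budget = frame_budget_ms(TRACKING_FPS)
    print(f"{name}: {ms} ms of a {budget:.1f} ms frame budget ({share:.1%})")
```

Under that assumption, even the larger model uses only a small share of each frame interval, leaving the remainder for stereo localisation, tracking, and PTZ control.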

Original language: English
Pages (from-to): 3488-3516
Number of pages: 29
Journal: Eng
Volume: 5
Issue number: 4
Publication status: Published - Dec 2024
Externally published: Yes

Keywords

  • BicycleGAN
  • Segment Anything Model
  • Stable Diffusion
  • YOLO11x
  • YOLOv8-Nano
  • camera calibration
  • data augmentation
  • object detection
  • object tracking
  • pan–tilt–zoom
