TY - GEN
T1 - YOLO-Ti
T2 - 2024 International Conference on Optics, Electronics, and Communication Engineering, OECE 2024
AU - Li, Ying
AU - Weng, Dongdong
AU - Tian, Zeyu
AU - Hou, Jing
AU - Li, Zihao
N1 - Publisher Copyright: © 2024 SPIE.
PY - 2024
Y1 - 2024
N2 - This paper proposes YOLO-Ti, an efficient object detection method for tiny facial markers. The study is driven by the practical requirements of 3D face modeling, which needs as many facial features as possible for reference. This research can also provide information for facial expression recognition and joint deformation. First, we present a feature fusion module called Cross-BiFPN, which adds cross-connecting branches between different network layers to use low-level features more effectively. Second, we add a high-resolution detection head and an attention module to the YOLOv8 model to improve its ability to detect tiny objects, while keeping the model lightweight by removing redundant network layers. Third, we collect a dataset of facial markers whose average size is much smaller than that of objects in publicly available small-object datasets. Ablation studies and comparison experiments evaluate the performance of our approach. Compared with the baseline YOLOv8 model, YOLO-Ti improves mAP50 by 30.4% while reducing model parameters by 65.1%. The automatic feature extraction provided by our model facilitates the construction of digital humans, saving modelers significant manpower and time.
KW - 3D face reconstruction
KW - Facial markers
KW - improved YOLOv8 algorithm
KW - tiny object detection
UR - http://www.scopus.com/inward/record.url?scp=85210236113&partnerID=8YFLogxK
U2 - 10.1117/12.3048940
DO - 10.1117/12.3048940
M3 - Conference contribution
AN - SCOPUS:85210236113
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - International Conference on Optics, Electronics, and Communication Engineering, OECE 2024
A2 - Yue, Yang
PB - SPIE
Y2 - 26 July 2024 through 28 July 2024
ER -