UTVit: A U-Shaped Segmentation Network for Underwater Images Based on TinyVit

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

This paper presents a lightweight semantic segmentation model for hazardous underwater objects, built on the TinyViT backbone network, that achieves real-time semantic segmentation under the computational constraints of unmanned underwater vehicles (UUVs). To make efficient use of feature maps from different network layers, we devise a U-Net-based architecture named UTVit, which uses skip connections and concatenation to integrate feature maps across levels of the network. Furthermore, we design a specialized upsampling network that employs depth-wise convolutions instead of standard convolutions. This approach mitigates the coarse-granularity issues often encountered during upsampling while significantly reducing the computational burden. To validate the proposed model, we conduct extensive experiments on the publicly available USIS10K dataset and compare it with state-of-the-art (SOTA) models. The results show that UTVit incurs only a slight 2.3% decrease in mIoU (mean Intersection over Union) while reducing the number of parameters by 51.46M, retaining only 12.7% of the parameter count of the SOTA baseline. In the parameter-mIoU and computation-mIoU trade-off charts, UTVit exhibits superior performance, making it well suited to object segmentation tasks in UUV applications. This balance between efficiency and accuracy underscores the model's potential for practical deployment in resource-constrained underwater environments.
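To illustrate why replacing standard convolutions with depth-wise ones shrinks a model, the following minimal sketch counts the learnable parameters of one layer in plain Python. It is not the authors' upsampling network: the abstract does not give layer sizes, so the channel counts and kernel size below are hypothetical, and pairing the depth-wise convolution with a 1 x 1 point-wise convolution is an assumption borrowed from the common depth-wise separable design.

```python
def conv_params(c_in, c_out, k, bias=True):
    """Learnable parameters of a standard k x k convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def depthwise_params(channels, k, bias=True):
    """Learnable parameters of a depth-wise k x k convolution.

    Each input channel has its own single k x k filter, so the count
    grows linearly with the channel count instead of with c_in * c_out.
    """
    return channels * k * k + (channels if bias else 0)


def depthwise_separable_params(c_in, c_out, k, bias=True):
    """Depth-wise k x k convolution followed by a 1 x 1 point-wise convolution.

    The point-wise step is an assumption here; it is what mixes channels
    and changes the channel count from c_in to c_out.
    """
    return depthwise_params(c_in, k, bias) + conv_params(c_in, c_out, 1, bias)


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 256, 128, 3

standard = conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
ratio = separable / standard

print(f"standard conv:            {standard} parameters")
print(f"depth-wise separable conv: {separable} parameters")
print(f"fraction of standard:      {ratio:.3f}")
```

With these assumed sizes the standard layer has 295,040 parameters and the depth-wise separable variant has 35,456, roughly 12% of the original. The match with the 12.7% reported in the abstract is a coincidence of the chosen sizes, since that figure refers to whole-model parameter counts.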

Original language: English
Title of host publication: Proceedings of the 44th Chinese Control Conference, CCC 2025
Editors: Jian Sun, Hongpeng Yin
Publisher: IEEE Computer Society
Pages: 7863-7868
Number of pages: 6
ISBN (Electronic): 9789887581611
DOIs
Publication status: Published - 2025
Externally published: Yes
Event: 44th Chinese Control Conference, CCC 2025 - Chongqing, China
Duration: 28 Jul 2025 - 30 Jul 2025

Publication series

Name: Chinese Control Conference, CCC
ISSN (Print): 1934-1768
ISSN (Electronic): 2161-2927

Conference

Conference: 44th Chinese Control Conference, CCC 2025
Country/Territory: China
City: Chongqing
Period: 28/07/25 - 30/07/25

Keywords

  • Depth-wise Convolution
  • Lightweight Model
  • Underwater Semantic Segmentation
  • Vision Transformer
