一种改进的视频分割网络及其全局信息优化方法

Lin Zhang; Yao Lu; Li hua Lu; Tian Fei Zhou; Qing Xuan Shi

doi:10.16383/j.aas.c190292

Translated title of the contribution: An Improved Video Segmentation Network and Its Global Information Optimization Method

Lin Zhang, Yao Lu^*, Li hua Lu, Tian Fei Zhou, Qing Xuan Shi

^*Corresponding author for this work

School of Computer Science and Technology

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

This paper presents an attention-based video segmentation network and a global information optimization method for training it. We propose an improved segmentation network and use it to compute initial segmentation masks. The initial masks then serve as priors for fine-tuning the network. Finally, the network with the learned weights generates refined masks. Our two-stream segmentation network consists of an appearance branch and a motion branch. Fed with the image and the optical flow image respectively, the two branches extract appearance features and motion features to generate the segmentation mask. An attention module is embedded in the network between adjacent high-level and low-level features. The high-level features thus locate the semantic region for the low-level features, which speeds up network convergence and improves segmentation quality. We propose optimizing the initial masks and using them to fine-tune the original appearance network weights, which helps the network recognize the object and improves its performance. Experiments on DAVIS demonstrate the effectiveness of the segmentation framework. Our method outperforms traditional two-stream segmentation algorithms and achieves results comparable to those of algorithms on the dataset's leaderboard. A validation experiment shows that our attention module greatly improves network performance over the baseline.
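
The attention step described in the abstract, where high-level features locate the semantic region for low-level features, can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: single-channel feature maps, nearest-neighbour upsampling, and a sigmoid gate. The function names are hypothetical and the paper's actual module may differ.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2-D map by an integer factor."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def attention_gate(high, low):
    """Weight a low-level map by a sigmoid attention map built from a
    coarser high-level map (assumed design, not the paper's exact module)."""
    factor = len(low) // len(high)
    if len(high) * factor != len(low) or len(high[0]) * factor != len(low[0]):
        raise ValueError("low-level map must be an integer multiple of the high-level map")
    attn = [[sigmoid(v) for v in row] for row in upsample_nearest(high, factor)]
    return [[a * x for a, x in zip(arow, lrow)] for arow, lrow in zip(attn, low)]


# Toy input: the coarse map responds strongly on the left half only.
high = [[4.0, -4.0]]
low = [[1.0, 1.0, 1.0, 1.0],
       [1.0, 1.0, 1.0, 1.0]]
gated = attention_gate(high, low)
```

In this toy case the low-level responses on the left half pass almost unchanged while those on the right half are suppressed, which is the sense in which the coarse map "locates" the region of interest for the finer one.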

Translated title of the contribution	An Improved Video Segmentation Network and Its Global Information Optimization Method
Original language	Chinese (Traditional)
Pages (from-to)	787-796
Number of pages	10
Journal	Zidonghua Xuebao/Acta Automatica Sinica
Volume	48
Issue number	3
DOIs	https://doi.org/10.16383/j.aas.c190292
Publication status	Published - Mar 2022

Cite this

Zhang, L., Lu, Y., Lu, L. H., Zhou, T. F., & Shi, Q. X. (2022). 一种改进的视频分割网络及其全局信息优化方法. Zidonghua Xuebao/Acta Automatica Sinica, 48(3), 787-796. https://doi.org/10.16383/j.aas.c190292

@article{b9dc04e7f7c64b20aab003645a10954a,
title = "一种改进的视频分割网络及其全局信息优化方法",
abstract = "This paper presents an attention-based video segmentation network and a global information optimization method for training it. We propose an improved segmentation network and use it to compute initial segmentation masks. The initial masks then serve as priors for fine-tuning the network. Finally, the network with the learned weights generates refined masks. Our two-stream segmentation network consists of an appearance branch and a motion branch. Fed with the image and the optical flow image respectively, the two branches extract appearance features and motion features to generate the segmentation mask. An attention module is embedded in the network between adjacent high-level and low-level features. The high-level features thus locate the semantic region for the low-level features, which speeds up network convergence and improves segmentation quality. We propose optimizing the initial masks and using them to fine-tune the original appearance network weights, which helps the network recognize the object and improves its performance. Experiments on DAVIS demonstrate the effectiveness of the segmentation framework. Our method outperforms traditional two-stream segmentation algorithms and achieves results comparable to those of algorithms on the dataset's leaderboard. A validation experiment shows that our attention module greatly improves network performance over the baseline.",
keywords = "Attention mechanism, Convolutional neural network (CNN), Global information optimization, Video object segmentation",
author = "Zhang, Lin and Lu, Yao and Lu, {Li hua} and Zhou, {Tian Fei} and Shi, {Qing Xuan}",
year = "2022",
month = mar,
doi = "10.16383/j.aas.c190292",
language = "Chinese (Traditional)",
volume = "48",
pages = "787--796",
journal = "Zidonghua Xuebao/Acta Automatica Sinica",
issn = "0254-4156",
publisher = "Science Press",
number = "3",
}

TY  - JOUR
T1  - 一种改进的视频分割网络及其全局信息优化方法
AU  - Zhang, Lin
AU  - Lu, Yao
AU  - Lu, Li hua
AU  - Zhou, Tian Fei
AU  - Shi, Qing Xuan
PY  - 2022/3
Y1  - 2022/3
N2  - This paper presents an attention-based video segmentation network and a global information optimization method for training it. We propose an improved segmentation network and use it to compute initial segmentation masks. The initial masks then serve as priors for fine-tuning the network. Finally, the network with the learned weights generates refined masks. Our two-stream segmentation network consists of an appearance branch and a motion branch. Fed with the image and the optical flow image respectively, the two branches extract appearance features and motion features to generate the segmentation mask. An attention module is embedded in the network between adjacent high-level and low-level features. The high-level features thus locate the semantic region for the low-level features, which speeds up network convergence and improves segmentation quality. We propose optimizing the initial masks and using them to fine-tune the original appearance network weights, which helps the network recognize the object and improves its performance. Experiments on DAVIS demonstrate the effectiveness of the segmentation framework. Our method outperforms traditional two-stream segmentation algorithms and achieves results comparable to those of algorithms on the dataset's leaderboard. A validation experiment shows that our attention module greatly improves network performance over the baseline.
KW  - Attention mechanism
KW  - Convolutional neural network (CNN)
KW  - Global information optimization
KW  - Video object segmentation
UR  - http://www.scopus.com/inward/record.url?scp=85128129835&partnerID=8YFLogxK
U2  - 10.16383/j.aas.c190292
DO  - 10.16383/j.aas.c190292
M3  - Article
AN  - SCOPUS:85128129835
SN  - 0254-4156
VL  - 48
SP  - 787
EP  - 796
JO  - Zidonghua Xuebao/Acta Automatica Sinica
JF  - Zidonghua Xuebao/Acta Automatica Sinica
IS  - 3
ER  -
