Abstract
Most existing deep learning-based motion segmentation methods treat motion segmentation as a binary segmentation problem, which is generally not the case in real dynamic scenes. In addition, object and camera motion are often mixed, which makes motion segmentation difficult. This paper proposes a joint learning method that fuses semantic features and motion cues using CNNs with a deformable convolution module and a motion embedding module to address the multi-object motion segmentation problem. The deformable convolution module serves to fuse color and motion information, while the motion embedding module, inspired by geometric modeling methods, learns to distinguish objects' motion status. We perform extensive quantitative and qualitative experiments on benchmark datasets. In particular, we label over 9000 images of the KITTI visual odometry dataset to help train the deformable module. Our method achieves superior performance compared to the current state of the art in terms of both speed and accuracy.
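The abstract describes fusing appearance (semantic) features with motion cues through deformable convolution. The following is a minimal, hypothetical PyTorch sketch of such a fusion block, not the paper's actual architecture: the channel sizes, the offset-prediction head driven by motion features, and the use of torchvision's `DeformConv2d` are all assumptions made for illustration.

```python
# Illustrative sketch only; hyperparameters and layer layout are assumptions.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


class DeformableFusion(nn.Module):
    """Fuses appearance features with motion features.

    Sampling offsets for the deformable convolution are predicted from the
    motion features, so the receptive field can adapt to moving objects.
    """

    def __init__(self, app_ch=64, mot_ch=64, out_ch=64, k=3):
        super().__init__()
        # Predict 2*k*k (x, y) sampling offsets per spatial location.
        self.offset_head = nn.Conv2d(mot_ch, 2 * k * k, kernel_size=3, padding=1)
        self.deform = DeformConv2d(app_ch + mot_ch, out_ch,
                                   kernel_size=k, padding=k // 2)

    def forward(self, app_feat, mot_feat):
        offsets = self.offset_head(mot_feat)
        fused_in = torch.cat([app_feat, mot_feat], dim=1)
        return self.deform(fused_in, offsets)


if __name__ == "__main__":
    app = torch.randn(1, 64, 48, 160)   # appearance (semantic) feature map
    mot = torch.randn(1, 64, 48, 160)   # motion feature map (e.g., from optical flow)
    out = DeformableFusion()(app, mot)
    print(out.shape)                    # torch.Size([1, 64, 48, 160])
```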
Original language | English |
---|---|
Article number | 9380630 |
Pages (from-to) | 56812-56821 |
Number of pages | 10 |
Journal | IEEE Access |
Volume | 9 |
DOI | |
Publication status | Published - 2021 |