Flow-Anything: Learning Real-World Optical Flow Estimation from Large-Scale Single-View Images

Yingping Liang, Ying Fu*, Yutao Hu, Wenqi Shao, Jiaming Liu, Debing Zhang

*Corresponding author for this work

  • Beijing Institute of Technology
  • Southeast University, Nanjing
  • Shanghai AI Laboratory
  • Tiamat AI
  • Xiaohongshu Inc.

Research output: Contribution to journal › Article › Peer-reviewed

Abstract

Optical flow estimation is a crucial subfield of computer vision, serving as a foundation for video tasks. However, the real-world robustness is limited by animated synthetic datasets for training. This introduces domain gaps when applied to real-world applications and limits the benefits of scaling up datasets. To address these challenges, we propose Flow-Anything, a large-scale data generation framework designed to learn optical flow estimation from any single-view images in the real world. We employ two effective steps to make data scaling-up promising. First, we convert a single-view image into a 3D representation using advanced monocular depth estimation networks. This allows us to render optical flow and novel view images under a virtual camera. Second, we develop an Object-Independent Volume Rendering module and a Depth-Aware Inpainting module to model the dynamic objects in the 3D representation. These two steps allow us to generate realistic datasets for training from large-scale single-view images, namely FA-Flow Dataset. For the first time, we demonstrate the benefits of generating optical flow training data from large-scale real-world images, outperforming the most advanced unsupervised methods and supervised methods on synthetic datasets. Moreover, our models serve as a foundation model and enhance the performance of various downstream video tasks.
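
The first step in the abstract, rendering optical flow from a depth-lifted image under a virtual camera, can be illustrated with a minimal geometric sketch. It is not the authors' implementation: it assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, a static scene, and a rigid transform (`rotation`, `translation`) that maps points from the source camera frame to the virtual camera frame. The function names are made up for illustration, and the sketch omits volume rendering, dynamic objects, and inpainting entirely.

```python
from typing import List, Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]
Vector3 = Sequence[float]


def induced_flow(
    u: float, v: float, depth: float,
    fx: float, fy: float, cx: float, cy: float,
    rotation: Matrix3, translation: Vector3,
) -> Tuple[float, float]:
    """Return the flow (du, dv) of one pixel under a virtual camera motion.

    Assumption: `rotation` and `translation` map a 3D point from the source
    camera frame into the virtual camera frame (p' = R p + t).
    """
    # Back-project the pixel to a 3D point using its depth.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    point = (x, y, depth)

    # Move the point into the virtual camera frame.
    moved = [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    if moved[2] <= 0:
        raise ValueError("point falls behind the virtual camera")

    # Project onto the virtual image plane and subtract the source pixel.
    u2 = fx * moved[0] / moved[2] + cx
    v2 = fy * moved[1] / moved[2] + cy
    return u2 - u, v2 - v


def flow_field(
    depth_map: Sequence[Sequence[float]],
    fx: float, fy: float, cx: float, cy: float,
    rotation: Matrix3, translation: Vector3,
) -> List[List[Tuple[float, float]]]:
    """Apply induced_flow to every pixel of a depth map indexed as [row][col]."""
    return [
        [
            induced_flow(col, row, depth, fx, fy, cx, cy, rotation, translation)
            for col, depth in enumerate(depth_row)
        ]
        for row, depth_row in enumerate(depth_map)
    ]


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```

With an identity rotation and a purely sideways translation, the flow magnitude is inversely proportional to depth, so nearby pixels move more than distant ones. This parallax is what makes a single image with estimated depth usable as a source of flow supervision in this kind of pipeline.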

Original language: English
Pages (from-to): 8435-8452
Number of pages: 18
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 47
Issue number: 10
Publication status: Published - 2025
Externally published
