Low-Light Raw Video Denoising With a High-Quality Realistic Motion Dataset

Ying Fu, Zichun Wang, Tao Zhang, Jun Zhang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)

Abstract

Recently, supervised deep-learning methods have shown their effectiveness for raw video denoising in low light. However, existing training datasets have specific drawbacks, e.g., inaccurate noise modeling in synthetic datasets, simple hand-crafted or fixed motion, and limited-quality ground truth caused by the beam splitter in real captured datasets. These defects significantly degrade network performance on real low-light video sequences, where noise distributions and motion patterns are extremely complex. In this paper, we collect a low-light raw video denoising dataset with complex motion and high-quality ground truth, overcoming the drawbacks of previous datasets. Specifically, we capture 210 paired videos, each containing short/long-exposure pairs of real video frames with dynamic objects and diverse scenes displayed on a high-end monitor. Moreover, while spatial self-similarity has been extensively exploited in image tasks, harnessing this property in network design is even more crucial for video denoising, which additionally offers temporal redundancy. To effectively exploit the intrinsic temporal-spatial self-similarity of complex motion in real videos, we propose a new Transformer-based network that combines the locality of convolution with the long-range modeling ability of 3D temporal-spatial self-attention. Extensive experiments verify the value of our dataset and the effectiveness of our method on various metrics.
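
The pairing of convolutional locality with 3D temporal-spatial self-attention described in the abstract can be pictured with a minimal sketch. The block below is an illustrative assumption only, not the authors' architecture: the `TemporalSpatialBlock` name, layer sizes, and the simple conv-then-attention ordering are ours. A depthwise 3D convolution supplies the local bias over (T, H, W) neighborhoods, and full self-attention over the flattened temporal-spatial tokens supplies the long-range modeling.

```python
# Minimal sketch (not the paper's released code) of combining local 3D
# convolution with temporal-spatial self-attention on a video feature clip.
# All shapes and layer choices here are illustrative assumptions.
import torch
import torch.nn as nn


class TemporalSpatialBlock(nn.Module):
    def __init__(self, channels: int = 32, num_heads: int = 4):
        super().__init__()
        # Locality: depthwise 3D convolution over (T, H, W) neighborhoods.
        self.local_conv = nn.Conv3d(
            channels, channels, kernel_size=3, padding=1, groups=channels
        )
        self.norm = nn.LayerNorm(channels)
        # Long-range modeling: self-attention across all temporal-spatial tokens.
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, T, H, W) features extracted from a raw video clip
        b, c, t, h, w = x.shape
        x = x + self.local_conv(x)                       # local residual branch
        tokens = self.norm(x.flatten(2).transpose(1, 2)) # (B, T*H*W, C)
        attn_out, _ = self.attn(tokens, tokens, tokens)  # global 3D attention
        return x + attn_out.transpose(1, 2).reshape(b, c, t, h, w)


if __name__ == "__main__":
    clip = torch.randn(1, 32, 5, 16, 16)        # 5-frame toy feature clip
    print(TemporalSpatialBlock()(clip).shape)   # torch.Size([1, 32, 5, 16, 16])
```

In practice such a block would attend within local 3D windows rather than over all T*H*W tokens to keep memory manageable for full-resolution raw video; the dense attention above is kept only for brevity.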

Original language: English
Pages (from-to): 8119-8131
Number of pages: 13
Journal: IEEE Transactions on Multimedia
Volume: 25
DOIs
Publication status: Published - 2023

Keywords

  • Raw video denoising
  • convolutional neural network
  • temporal-spatial self-attention
  • transformer
