Research on Video Super-Resolution Technology Based on Multi-scale Spatiotemporal Information Aggregation

Xiao Luo*, Ang Li, Baoling Han

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

As a new type of teaching tool, the web course video breaks through the limitations of traditional teaching methods and has attracted widespread attention. However, because of the limited storage of filming equipment, web course videos are compressed, which lowers their resolution. In addition, shooting is disturbed by factors such as lighting, character movement, and blurred PPT projection, so the captured video suffers in brightness and clarity and cannot meet users' visual needs. This paper therefore applies video super-resolution reconstruction to predict and fill in the missing pixel information in low-resolution video frames, producing high-resolution videos and improving users' learning efficiency. First, to address occlusion and uneven illumination in online course videos, and the difficulty optical-flow estimation networks have in accurately extracting temporal dependencies between video frames, a Multi-scale Spatiotemporal Information Aggregation network is proposed. The network uses 3D convolutions of different sizes to accurately extract temporal information between video frames at different time intervals while also capturing spatial information within each frame, implicitly aligning the frames. Second, because conventional super-resolution reconstruction methods struggle to reconstruct text regions in online course videos with high quality, a hybrid residual self-attention reconstruction network is proposed, built from a high-precision spatial self-attention module and a high-precision channel self-attention module; it significantly improves the reconstruction quality of text regions. Experimental results show that the algorithm achieves excellent results on the online course video super-resolution dataset.
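The multi-scale idea described above can be illustrated with a minimal sketch: apply 3D kernels with different temporal extents to a frame volume and stack the responses, so that each scale captures dependencies over a different time interval. This is a hypothetical NumPy illustration, not the paper's implementation; the kernel sizes, uniform weights, and stacking-based fusion are all assumptions standing in for learned convolutions.

```python
import numpy as np

def conv3d(x, kernel):
    """Valid-mode 3D cross-correlation of a (T, H, W) volume
    with a (kt, kh, kw) kernel, via explicit loops for clarity."""
    kt, kh, kw = kernel.shape
    T, H, W = x.shape
    out = np.zeros((T - kt + 1, H - kh + 1, W - kw + 1))
    for t in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[t, i, j] = np.sum(x[t:t + kt, i:i + kh, j:j + kw] * kernel)
    return out

def multiscale_aggregate(frames, temporal_scales=(1, 3, 5)):
    """Toy multi-scale spatiotemporal aggregation: run 3D kernels whose
    temporal size varies (odd values keep frames centered), pad so every
    scale yields the same shape, and stack the responses per scale.
    Uniform mean kernels stand in for learned convolution weights."""
    feats = []
    for kt in temporal_scales:
        kernel = np.ones((kt, 3, 3)) / (kt * 9)
        padded = np.pad(frames, ((kt // 2, kt // 2), (1, 1), (1, 1)), mode='edge')
        feats.append(conv3d(padded, kernel))
    return np.stack(feats, axis=0)  # (num_scales, T, H, W)

# A short clip of 7 low-resolution 8x8 frames, one response volume per scale.
clip = np.random.rand(7, 8, 8)
features = multiscale_aggregate(clip)
print(features.shape)  # (3, 7, 8, 8)
```

In a learned network the per-scale responses would be fused (e.g. concatenated along a channel axis and mixed by a further convolution) rather than merely stacked; the point of the sketch is only that larger temporal kernels see longer time intervals, which is the multi-scale property the abstract describes.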

Original language: English
Title of host publication: Lecture Notes on Data Engineering and Communications Technologies
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 165-174
Number of pages: 10
DOIs
Publication status: Published - 2025

Publication series

Name: Lecture Notes on Data Engineering and Communications Technologies
Volume: 218
ISSN (Print): 2367-4512
ISSN (Electronic): 2367-4520

Keywords

  • 3D convolution
  • attention mechanism
  • video super-resolution

