TY - CHAP
T1 - Research on Video Super-Resolution Technology Based on Multi-scale Spatiotemporal Information Aggregation
AU - Luo, Xiao
AU - Li, Ang
AU - Han, Baoling
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
N2 - As a new type of teaching tool, the web course video breaks the limitations of traditional teaching methods and has attracted widespread attention. However, due to the limited storage of recording equipment, web course videos are often compressed, resulting in low resolution. In addition, interference during filming, such as lighting conditions, character movement, and blurred PPT projection, degrades the brightness and clarity of the captured video, which cannot meet users' visual needs. Therefore, this paper applies video super-resolution reconstruction technology to predict and fill in the missing pixel information in low-resolution video frames, thereby obtaining high-resolution videos and improving users' learning efficiency. First, to address problems such as occlusion and uneven illumination in online course videos, as well as the difficulty optical flow estimation networks have in accurately extracting the temporal dependencies between video frames, a Multi-scale Spatiotemporal Information Aggregation network is proposed. The network uses 3D convolutions of different sizes to accurately extract temporal information between video frames at different time intervals while also capturing spatial information within each frame, implicitly completing inter-frame alignment. Second, to address the difficulty conventional super-resolution reconstruction methods have in reconstructing text regions of online course videos with high quality, a hybrid residual self-attention reconstruction network is proposed, comprising a high-precision spatial self-attention module and a high-precision channel self-attention module, which significantly improves the reconstruction quality of text regions in online course videos. Experimental results show that the proposed algorithm achieves excellent results on the online course video super-resolution dataset.
AB - As a new type of teaching tool, the web course video breaks the limitations of traditional teaching methods and has attracted widespread attention. However, due to the limited storage of recording equipment, web course videos are often compressed, resulting in low resolution. In addition, interference during filming, such as lighting conditions, character movement, and blurred PPT projection, degrades the brightness and clarity of the captured video, which cannot meet users' visual needs. Therefore, this paper applies video super-resolution reconstruction technology to predict and fill in the missing pixel information in low-resolution video frames, thereby obtaining high-resolution videos and improving users' learning efficiency. First, to address problems such as occlusion and uneven illumination in online course videos, as well as the difficulty optical flow estimation networks have in accurately extracting the temporal dependencies between video frames, a Multi-scale Spatiotemporal Information Aggregation network is proposed. The network uses 3D convolutions of different sizes to accurately extract temporal information between video frames at different time intervals while also capturing spatial information within each frame, implicitly completing inter-frame alignment. Second, to address the difficulty conventional super-resolution reconstruction methods have in reconstructing text regions of online course videos with high quality, a hybrid residual self-attention reconstruction network is proposed, comprising a high-precision spatial self-attention module and a high-precision channel self-attention module, which significantly improves the reconstruction quality of text regions in online course videos. Experimental results show that the proposed algorithm achieves excellent results on the online course video super-resolution dataset.
KW - 3D convolution
KW - attention mechanism
KW - video super-resolution
UR - http://www.scopus.com/inward/record.url?scp=85205089037&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-71013-1_16
DO - 10.1007/978-3-031-71013-1_16
M3 - Chapter
AN - SCOPUS:85205089037
T3 - Lecture Notes on Data Engineering and Communications Technologies
SP - 165
EP - 174
BT - Lecture Notes on Data Engineering and Communications Technologies
PB - Springer Science and Business Media Deutschland GmbH
ER -