Viewport Prediction for Volumetric Video Streaming by Exploring Video Saliency and User Trajectory Information

Jie Li, Zhixin Li, Zhi Liu*, Peng Yuan Zhou, Richang Hong, Qiyue Li, Han Hu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Volumetric video, also referred to as hologram video, is an emerging medium that represents 3D content in extended reality. As a next-generation video technology, it is poised to become a key application in 5G and future wireless communication networks. Because each user generally views only a specific portion of the volumetric video, known as the viewport, accurate viewport prediction is crucial for optimal streaming performance. Despite its significance, research in this area is still in its early stages. To this end, this paper introduces a novel approach called Saliency and Trajectory-based Viewport Prediction (STVP), which improves the accuracy of viewport prediction in volumetric video streaming by leveraging both video saliency and viewport trajectory information. In particular, we first introduce a novel sampling method, Uniform Random Sampling (URS), which efficiently preserves video features while minimizing computational complexity. Next, we propose a saliency detection technique that integrates spatial and temporal information to identify regions that are salient in geometry and luminance, both static and dynamic. Finally, we fuse saliency and trajectory information to achieve more accurate viewport prediction. Extensive experimental results validate the superiority of our method over existing state-of-the-art schemes. To the best of our knowledge, this is the first comprehensive study of viewport prediction in volumetric video streaming. We also make the source code of this work publicly available.
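The abstract does not spell out how URS works, so the following is only a minimal illustrative sketch of one generic way to downsample a point cloud frame while keeping spatial coverage: split space into uniform cubic cells and keep a few randomly chosen points per cell. The function name `grid_random_sample` and the parameters `cell_size` and `per_cell` are hypothetical and are not taken from the paper or its released source code.

```python
import random
from collections import defaultdict


def grid_random_sample(points, cell_size, per_cell, seed=0):
    """Keep at most `per_cell` randomly chosen points from each cubic cell.

    Hypothetical helper for illustration only; NOT the paper's URS.
    `points` is a list of (x, y, z) tuples.
    """
    rng = random.Random(seed)

    # Bucket every point by the integer index of the cell that contains it.
    cells = defaultdict(list)
    for p in points:
        key = tuple(int(c // cell_size) for c in p)
        cells[key].append(p)

    # Draw a random subset from each cell (sorted keys keep runs repeatable).
    sampled = []
    for key in sorted(cells):
        bucket = cells[key]
        k = min(per_cell, len(bucket))
        sampled.extend(rng.sample(bucket, k))
    return sampled


# Usage: downsample a synthetic 1000-point cloud inside the unit cube.
gen = random.Random(42)
cloud = [(gen.random(), gen.random(), gen.random()) for _ in range(1000)]
subset = grid_random_sample(cloud, cell_size=0.5, per_cell=10)
print(len(cloud), len(subset))
```

Sampling per cell rather than over the whole cloud keeps sparse regions represented, which is one plausible reading of "preserves video features"; the method in the paper may differ.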

Original language: English
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Publication status: Accepted/In press - 2025
Externally published: Yes

Keywords

  • point cloud video
  • saliency detection
  • sampling
  • trajectory prediction
  • viewport prediction
  • volumetric video
