TY - GEN
T1 - PivotSketch
T2 - 36th Workshop on Network and Operating System Support for Digital Audio and Video, NOSSDAV 2026
AU - Zhang, Sirui
AU - Li, Yuying
AU - Zhang, Cong
AU - Fan, Xiaoyi
AU - Hu, Xiping
AU - Duan, Haihan
N1 - Publisher Copyright: © 2026 Copyright held by the owner/author(s).
PY - 2026/4/8
Y1 - 2026/4/8
N2 - Adaptive bitrate (ABR) algorithms are central to video-on-demand streaming, yet their QoE objectives remain time-uniform: a stall or quality drop is penalized identically whether it occurs during a climactic reveal or a low-stakes transition. Meanwhile, large language models (LLMs) can now judge which moments matter most to viewers, but their outputs are computationally expensive, lack cross-timeline comparability, and are risky to inject directly into a control loop. This separation wastes a key opportunity for viewer-aligned optimization: video semantics often indicate which segments are quality-critical well before playback reaches them. We present PivotSketch, a framework that converts foundation model semantics into stall-safe ABR guidance. It has three components: (1) a pivot-based ranking module that estimates globally comparable chunk-importance percentiles under a fixed inference budget; (2) a calibration module that derives value-of-bits from the encoding ladder; and (3) a reliability estimator that gates semantic influence when confidence is low. Integrated with RobustMPC, the controller prioritizes semantically important moments and reverts to the baseline when semantic signals are unreliable. On Mr. HiSum assets with real throughput traces, PivotSketch aligns best with human importance annotations. For the top 10% most important moments, it reduces rebuffering by 30% and increases delivered quality by 13% versus RobustMPC, while keeping average performance near baseline.
AB - Adaptive bitrate (ABR) algorithms are central to video-on-demand streaming, yet their QoE objectives remain time-uniform: a stall or quality drop is penalized identically whether it occurs during a climactic reveal or a low-stakes transition. Meanwhile, large language models (LLMs) can now judge which moments matter most to viewers, but their outputs are computationally expensive, lack cross-timeline comparability, and are risky to inject directly into a control loop. This separation wastes a key opportunity for viewer-aligned optimization: video semantics often indicate which segments are quality-critical well before playback reaches them. We present PivotSketch, a framework that converts foundation model semantics into stall-safe ABR guidance. It has three components: (1) a pivot-based ranking module that estimates globally comparable chunk-importance percentiles under a fixed inference budget; (2) a calibration module that derives value-of-bits from the encoding ladder; and (3) a reliability estimator that gates semantic influence when confidence is low. Integrated with RobustMPC, the controller prioritizes semantically important moments and reverts to the baseline when semantic signals are unreliable. On Mr. HiSum assets with real throughput traces, PivotSketch aligns best with human importance annotations. For the top 10% most important moments, it reduces rebuffering by 30% and increases delivered quality by 13% versus RobustMPC, while keeping average performance near baseline.
KW - Adaptive Bitrate Streaming
KW - Model Predictive Control (MPC)
KW - Quality of Experience (QoE)
UR - https://www.scopus.com/pages/publications/105036581369
U2 - 10.1145/3798065.3798086
DO - 10.1145/3798065.3798086
M3 - Conference contribution
AN - SCOPUS:105036581369
T3 - NOSSDAV 2026 - Proceedings of the 36th Workshop on Network and Operating System Support for Digital Audio and Video
SP - 141
EP - 147
BT - NOSSDAV 2026 - Proceedings of the 36th Workshop on Network and Operating System Support for Digital Audio and Video
PB - Association for Computing Machinery, Inc
Y2 - 4 April 2026 through 8 April 2026
ER -