1. Learned fractional downsampling network for adaptive video streaming.
- Author
- Chen, Li-Heng; Bampis, Christos G.; Li, Zhi; Sole, Joel; Chen, Chao; Bovik, Alan C.
- Subjects
- *STREAMING video & television, *CONVOLUTIONAL neural networks, *VIDEO codecs, *LANCZOS method, *VIDEO processing, *VIDEO coding
- Abstract
Given increasing demand for very large format content and displays, spatial resolution changes have become an important part of video streaming. In particular, video downscaling is a key ingredient that streaming providers implement in their encoding pipelines as part of video quality optimization workflows. Here, we propose a downsampling network architecture that progressively reconstructs residuals at different scales. Since the layers of convolutional neural networks (CNNs) can only alter the resolutions of their inputs by integer scale factors, we seek new ways to achieve fractional scaling, which is crucial in many video processing applications. More concretely, we utilize an alternative building block, formulated as a conventional convolutional layer followed by a differentiable resizer. To validate the efficacy of our proposed downsampling network, we integrated it into a modern video encoding system for adaptive streaming. We extensively evaluated our method using a variety of video codecs and upsampling algorithms to show its generality. The experimental results show that improvements in coding efficiency over the conventional Lanczos algorithm and state-of-the-art methods are attained, in terms of PSNR, SSIM, and VMAF, when tested on high-resolution test videos. In addition to quantitative experiments, we also carried out a subjective quality study, validating that the proposed downsampling model yields favorable results.
• A network architecture that learns residuals prior to scaling and supports non-integer scaling factors, enhancing flexibility in video encoding workflows.
• The learned downsampling model was integrated into a realistic video encoding pipeline for adaptive video streaming to achieve improved coding efficiency.
• Comprehensive experiments demonstrate significant improvements in both objective and subjective quality.
• Recognized as one of the papers with the longest review time in the journal Signal Processing: Image Communication (SPIC), reflecting the thorough and rigorous evaluation it underwent.
[ABSTRACT FROM AUTHOR]
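The building block named in the abstract, a convolutional layer followed by a differentiable resizer, can be sketched roughly as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' network: the function names (`conv2d_same`, `bilinear_resize`, `fractional_downsample`) are hypothetical, bilinear interpolation stands in for whichever resizer the paper actually uses, and a fixed box filter stands in for learned convolution weights.

```python
import math


def conv2d_same(img, kernel):
    """'Same'-size 2-D cross-correlation with edge replication at the borders."""
    h, w = len(img), len(img[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy = min(max(y + i - ph, 0), h - 1)  # clamp row index
                    xx = min(max(x + j - pw, 0), w - 1)  # clamp column index
                    acc += kernel[i][j] * img[yy][xx]
            out[y][x] = acc
    return out


def bilinear_resize(img, out_h, out_w):
    """Bilinear resize to an arbitrary size (assumed stand-in resizer).

    Each output sample is a weighted sum of four input samples, so the
    operation is differentiable with respect to the input values.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * out_w for _ in range(out_h)]
    for oy in range(out_h):
        # Map the output pixel centre back to input coordinates.
        sy = min(max((oy + 0.5) * h / out_h - 0.5, 0.0), h - 1)
        y0 = int(math.floor(sy))
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        for ox in range(out_w):
            sx = min(max((ox + 0.5) * w / out_w - 0.5, 0.0), w - 1)
            x0 = int(math.floor(sx))
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            out[oy][ox] = top * (1 - fy) + bottom * fy
    return out


def fractional_downsample(img, kernel, scale):
    """Convolve, then resize by a possibly non-integer scale factor."""
    out_h = max(1, round(len(img) * scale))
    out_w = max(1, round(len(img[0]) * scale))
    return bilinear_resize(conv2d_same(img, kernel), out_h, out_w)


# Example: a 6x6 image scaled by 2/3 (the same ratio as 1080p to 720p).
image = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
box = [[1.0 / 9.0] * 3 for _ in range(3)]  # placeholder for learned weights
small = fractional_downsample(image, box, 2.0 / 3.0)
print(len(small), len(small[0]))
```

A strided convolution could only reach factors such as 1/2 or 1/3 here; separating the filtering from the resizing is what lets the output size be chosen freely. The sketch omits anti-aliasing and any training loop.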
- Published
- 2024