Abstract
Transform coding tools play an integral part in video codecs due to their substantial impact on coding efficiency. The latest video coding standard, Versatile Video Coding (VVC), makes the most of these tools by introducing new DST7, DCT8, and non-square transforms alongside the conventional DCT2 transform. This paper proposes optimized AVX2 kernels for all these transforms to speed up VVC coding. Unlike existing solutions, our kernels are specially tailored for each VVC transform type and block size. Accelerating our open-source uvg266 VVC encoder with the proposed kernels yields a speedup of up to 1.1× under the all-intra (AI) coding condition without any coding overhead. Our implementations make forward DCT2 and DST7/DCT8 transforms 4.0× and 6.7× as fast as their respective scalar implementations in the VTM reference encoder. They also outpace the AVX2 kernels of the practical VVenC encoder by factors of 3.0× and 2.8×. With inverse transforms, the respective speedups reach up to 5.3×, 11.1×, 3.4×, and 3.0×.
Original language | English |
---|---|
Title of host publication | 2023 IEEE Nordic Circuits and Systems Conference (NorCAS) |
Publisher | IEEE |
Number of pages | 6 |
ISBN (Electronic) | 979-8-3503-3757-0 |
ISBN (Print) | 979-8-3503-3758-7 |
DOIs | |
Publication status | Published - 31 Oct 2023 |
Publication type | A4 Article in conference proceedings |
Event | IEEE Nordic Circuits and Systems Conference, Aalborg, Denmark. Duration: 30 Oct 2023 → 1 Nov 2023 |
Conference
Conference | IEEE Nordic Circuits and Systems Conference |
---|---|
Country/Territory | Denmark |
City | Aalborg |
Period | 30/10/23 → 1/11/23 |
Keywords
- versatile video coding (VVC)
- transform
- complexity reduction
- Advanced Vector Extensions 2 (AVX2)
- practical encoder implementation
Publication forum classification
- Publication forum level 1