TY - GEN
T1 - AVX2-Optimized Interpolation Filters for HEVC Inter Encoding
AU - Mercat, Alexandre
AU - Lemmetti, Ari
AU - Sainio, Joose
AU - Vanne, Jarno
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - High Efficiency Video Coding (HEVC) sets the stage for economic video transmission and storage, but its inherent computational complexity calls for powerful implementations. This paper addresses the principal performance bottleneck of HEVC codecs by introducing AVX2-vectorized algorithms for HEVC interpolation filters. The proposed speed-up techniques include 1) a data permutation scheme for the horizontal interpolation stage; 2) a sliding window strategy for the vertical interpolation stage; 3) optimal usage of horizontal and vertical interpolation during fractional motion estimation; and 4) a lane-based approach to double the vector lengths from 128-bit legacy vector extensions to 256 bits of AVX2. Our AVX2-optimized interpolation filters were benchmarked as part of the practical Kvazaar open-source HEVC encoder. On an Intel 8-core Xeon processor, they were shown to be 9.7 and 8.5 times as fast as scalar interpolation with the Kvazaar ultrafast and veryslow presets, respectively. In both cases, changing over from scalar to vectorized interpolation increases the coding speed of Kvazaar by more than 50%, which stresses the importance of interpolation optimizations in modern video encoders.
KW - Advanced Vector Extensions 2 (AVX2)
KW - High Efficiency Video Coding (HEVC)
KW - interpolation filter
KW - Kvazaar HEVC encoder
KW - single instruction multiple data (SIMD)
U2 - 10.1109/ISCAS46773.2023.10181449
DO - 10.1109/ISCAS46773.2023.10181449
M3 - Conference contribution
AN - SCOPUS:85167696929
T3 - IEEE International Symposium on Circuits and Systems proceedings
BT - IEEE ISCAS 2023 - Symposium Proceedings
PB - IEEE
T2 - IEEE International Symposium on Circuits and Systems
Y2 - 21 May 2023 through 25 May 2023
ER -