This paper presents efficient SIMD optimizations for the open-source Kvazaar HEVC intra encoder. The C implementation of Kvazaar is accelerated by Intel AVX2 instructions whose effect on Kvazaar ultrafast preset is profiled. According to our profiling results, C functions of SATD, DCT, quantization, and intra prediction account for over 60% of the total intra coding time of Kvazaar ultrafast preset. This work shows that optimizing primarily these functions doubles the coding speed of a single-threaded Kvazaar intra encoder for the same rate-distortion performance. The highest performance boost is obtained by deploying the proposed optimizations jointly with multithreading. On the Intel 8-core i7 processor, the AVX2-optimized 16-threaded Kvazaar ultrafast preset achieves real-time (30 fps) intra coding speed up to 1080p resolution. Compared to AVX2-optimized ultrafast preset of x265, Kvazaar is 20% times faster and still obtains 9.1% bit rate gain for the same quality. These results justify that Kvazaar is currently the leading open-source HEVC intra encoder in terms of real-time coding speed and efficiency.
|Conference||IEEE International Conference on Image Processing|
|Period||1/01/00 → …|
- Publication forum level 1