TY - GEN
T1 - Object Detection in Equirectangular Panorama
AU - Yang, Wenyan
AU - Qian, Yanlin
AU - Kämäräinen, Joni-Kristian
AU - Cricri, Francesco
AU - Fan, Lixin
N1 - EXT="Cricri, Francesco"
PY - 2018/8
Y1 - 2018/8
N2 - We introduce a high-resolution equirectangular panorama (aka 360-degree, virtual reality, VR) dataset for object detection and propose a multi-projection variant of the YOLO detector. The main challenges with equirectangular panorama images are i) the lack of annotated training data, ii) high-resolution imagery, and iii) severe geometric distortions of objects near the panorama projection poles. In this work, we solve the challenges by I) using training examples available in the “conventional datasets” (ImageNet and COCO), II) employing only low-resolution images that require only moderate GPU computing power and memory, and III) handling projection distortions with our multi-projection YOLO, which makes multiple stereographic sub-projections. In our experiments, YOLO outperforms the other state-of-the-art detector, Faster R-CNN, and our multi-projection YOLO achieves the best accuracy with low-resolution input.
KW - image resolution
KW - neural nets
KW - object detection
KW - virtual reality
KW - YOLO detector
KW - equirectangular panorama images
KW - annotated training data
KW - high-resolution imagery
KW - panorama projection poles
KW - training examples
KW - conventional datasets
KW - low resolution images
KW - multiprojection YOLO
KW - projection distortions
KW - multiple stereographic sub-projections
KW - low-resolution input
KW - high-resolution equirectangular panorama dataset
KW - geometric distortions
KW - moderate GPU computing power
KW - CNN
KW - Detectors
KW - Distortion
KW - Cameras
KW - Graphics processing units
U2 - 10.1109/ICPR.2018.8546070
DO - 10.1109/ICPR.2018.8546070
M3 - Conference contribution
SN - 978-1-5386-3789-0
SP - 2190
EP - 2195
BT - 2018 24th International Conference on Pattern Recognition (ICPR)
PB - IEEE
T2 - International Conference on Pattern Recognition
ER -