Abstract
Deep learning-based object detectors achieve state-of-the-art results on public benchmarks. However, they typically have millions of parameters and require a large number of training samples to tune those parameters appropriately. These samples are labeled by human annotators, which is a tedious, time-consuming, and expensive process. Moreover, object detectors have high computational costs in both the training and inference phases. This dissertation addresses these two aspects of training and deploying deep learning object detectors.
First, we study data labeling for the training phase and the robustness of object detectors to label noise. We classify possible label noise scenarios in 2D object detection and study the sensitivity of one-stage object detectors to label noise during training. We then propose methods for efficient bounding box annotation based on human-machine collaboration. We conducted extensive experiments to study an efficient and effective bounding box annotation scheme for deep learning object detectors. Additionally, we created an easy-to-use, medium-sized, multiclass, fully labeled object detection dataset of indoor premises and released it publicly for registration-free use.
Second, we study the practical problem of deploying object detection networks on resource-limited devices, with an efficient implementation for applications such as facial analysis, human detection and tracking, and path prediction of mobile objects. We implemented object detection in an image processing pipeline that integrates it with other tasks for multiple applications, and studied the optimal design process. We present the details of a system-level design that efficiently incorporates a multitasking network within a suitable system architecture.
Original language | English |
---|---|
Place of Publication | Tampere |
Publisher | Tampere University |
ISBN (Electronic) | 978-952-03-2420-9 |
ISBN (Print) | 978-952-03-2419-3 |
Publication status | Published - 2022 |
Publication type | G5 Doctoral dissertation (articles) |
Publication series
Name | Tampere University Dissertations - Tampereen yliopiston väitöskirjat |
---|---|
Volume | 608 |
ISSN (Print) | 2489-9860 |
ISSN (Electronic) | 2490-0028 |