MSDA: Monocular Self-supervised Domain Adaptation for 6D Object Pose Estimation

Dingding Cai, Janne Heikkilä, Esa Rahtu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Scientific > peer-review

Abstract

Acquiring labeled 6D poses from real images is expensive and time-consuming. Although massive amounts of synthetic RGB images are easy to obtain, models trained on them suffer noticeable performance degradation due to the synthetic-to-real domain gap. To mitigate this degradation, we propose a practical self-supervised domain adaptation approach that takes advantage of real RGB(-D) data without needing real pose labels. We first pre-train the model on synthetic RGB images and then fine-tune the pre-trained model on real RGB(-D) images. The fine-tuning is self-supervised by an RGB-based pose-aware consistency and a depth-guided object distance pseudo-label, and does not require time-consuming online differentiable rendering. We build our domain adaptation method on the recent pose estimator SC6D and evaluate it on the YCB-Video dataset. We experimentally demonstrate that our method achieves performance comparable to its fully-supervised counterpart while outperforming existing state-of-the-art approaches.
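
The abstract does not say how the depth-guided object distance pseudo-label is computed. As a rough illustration only, and not the authors' method, one simple possibility is to take a robust statistic of the sensor depth inside an object mask. The sketch below assumes a depth map in meters, a binary object mask of the same size, and zero depth meaning a missing reading; the function name `distance_pseudo_label` and its parameters are hypothetical.

```python
from statistics import median
from typing import Optional, Sequence


def distance_pseudo_label(
    depth: Sequence[Sequence[float]],
    mask: Sequence[Sequence[int]],
    min_valid: int = 1,
) -> Optional[float]:
    """Hypothetical helper: median of valid depth readings inside the mask.

    Assumptions (not taken from the paper): depth is in meters, mask is
    nonzero on object pixels, and a depth of 0 marks a missing reading.
    """
    values = [
        d
        for depth_row, mask_row in zip(depth, mask)
        for d, m in zip(depth_row, mask_row)
        if m and d > 0  # keep object pixels with a valid depth reading
    ]
    if len(values) < min_valid:
        return None  # too few valid pixels to form a pseudo-label
    return float(median(values))


# Toy 3x3 example: object pixels lie at about 0.8 m, background at about 1.5 m.
depth = [
    [0.00, 0.82, 0.80],
    [1.50, 0.81, 0.00],
    [1.52, 1.49, 0.79],
]
mask = [
    [1, 1, 1],
    [0, 1, 1],
    [0, 0, 1],
]
label = distance_pseudo_label(depth, mask)
```

The median is used here because it tolerates a few background or noisy pixels leaking into the mask; any such choice is a design assumption of this sketch, and the paper itself should be consulted for the actual formulation.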
Original language: English
Title of host publication: Image Analysis - 23rd Scandinavian Conference, SCIA 2023, Proceedings
Editors: Rikke Gade, Michael Felsberg, Joni-Kristian Kämäräinen
Publisher: Springer
Pages: 467-481
Number of pages: 15
ISBN (Electronic): 9783031314384
ISBN (Print): 9783031314377
Publication status: Published - 2023
Publication type: A4 Article in conference proceedings
Event: Scandinavian Conference on Image Analysis - Lapland, Finland
Duration: 18 Apr 2023 to 21 Apr 2023

Publication series

Name: Lecture Notes in Computer Science
Volume: 13886 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: Scandinavian Conference on Image Analysis
Country/Territory: Finland
City: Lapland
Period: 18/04/23 to 21/04/23

Publication forum classification

  • Publication forum level 1
