Robust Visual Perception and Decision Making for Autonomous Systems

  • Vivienne Wang

Research output: Book/Report › Doctoral thesis › Collection of Articles

Abstract

Autonomous systems, combining visual perception and decision-making capabilities, are poised to revolutionize a wide array of sectors, from transportation and manufacturing to healthcare. Visual perception, the ability to interpret visual data and discern the environment, provides the foundation for these systems to function effectively in diverse real-world scenarios. Decision making complements this by equipping these systems with the ability to select optimal actions based on visually perceived information.

The first part of this thesis concentrates on visual perception, specifically through the lens of video object segmentation. As a crucial facet of computer vision, video object segmentation empowers autonomous systems to distinguish and track objects within dynamic video streams. Despite considerable advancements in the field, certain challenges endure, particularly when it comes to robust and coherent segmentation of objects in complex, real-world scenarios. This work introduces three key innovations to address these challenges. We present an efficient graph transduction learning approach for improved primary video object segmentation. A semi-supervised adaptation technique is put forth to harness the power of pretrained deep convolutional neural networks in semantic video object segmentation. Lastly, we introduce a hierarchical graphical model, fusing both bottom-up and top-down cues with long-term object relations and spatiotemporal contexts for superior performance in semantic video object segmentation.
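To make the graph transduction idea concrete, the following is a minimal, generic sketch of transductive label propagation on an affinity graph (in the style of classic semi-supervised label propagation), not the specific algorithm developed in the thesis. The affinity matrix `W`, the seed labels `Y`, and the superpixel interpretation are illustrative assumptions: labels from a few confidently classified nodes (e.g. region proposals or superpixels) diffuse along graph edges to unlabeled nodes.

```python
import numpy as np

def label_propagation(W, Y, alpha=0.9, iters=50):
    """Transductive label propagation on a graph (illustrative sketch).

    W : (n, n) symmetric affinity matrix between nodes (e.g. superpixels)
    Y : (n, c) initial label indicators (all-zero rows for unlabeled nodes)
    Returns soft label scores F of shape (n, c).
    """
    d = W.sum(axis=1)
    d[d == 0] = 1.0                       # guard isolated nodes
    inv_sqrt = 1.0 / np.sqrt(d)
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]   # normalized affinity
    F = Y.astype(float).copy()
    for _ in range(iters):
        F = alpha * (S @ F) + (1 - alpha) * Y       # diffuse, anchor seeds
    return F

# Toy chain graph: node 0 seeded as class 0, node 3 as class 1.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Y = np.array([[1, 0], [0, 0], [0, 0], [0, 1]], dtype=float)
F = label_propagation(W, Y)
```

In the toy example, the interior nodes inherit the label of their nearer seed, which is the transductive behavior the abstract alludes to.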

The latter half of this thesis shifts the spotlight to decision making, specifically delving into hierarchical reinforcement learning (HRL). Reinforcement learning, a fundamental paradigm in decision making, enables autonomous systems to learn from their interactions with the environment. However, its efficacy can be hindered by issues such as non-stationarity in off-policy training, which arises because the policies at different levels of the hierarchy are continuously adjusted throughout learning. To counter this, we propose a novel adversarially guided subgoal generation framework for HRL. This adversarial learning technique effectively alleviates the shift in data distribution between relabeled experiences and the current high-level policy behavior, resulting in enhanced learning efficiency and stability.
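The relabeling that causes this distribution shift can be illustrated with a generic, HIRO-style off-policy correction sketch (not the thesis's adversarial method): a stored high-level transition is relabeled with whichever candidate subgoal best explains the logged low-level actions under the *current* low-level policy. The function names and the toy `policy_lo` below are illustrative assumptions.

```python
import numpy as np

def relabel_subgoal(states, actions, candidates, policy_lo):
    """Pick the candidate subgoal under which the current low-level
    policy best reproduces the stored action sequence (illustrative).

    states: (T, ds), actions: (T, da), candidates: (K, dg)
    policy_lo(state, goal) -> predicted action
    """
    best_g, best_score = None, -np.inf
    for g in candidates:
        pred = np.stack([policy_lo(s, g) for s in states])
        score = -np.sum((pred - actions) ** 2)   # Gaussian log-likelihood up to a constant
        if score > best_score:
            best_g, best_score = g, score
    return best_g

# Toy 1-D example: the low-level policy steps toward its goal.
policy_lo = lambda s, g: g - s
states = np.array([[0.0], [0.5], [1.0]])
true_goal = np.array([2.0])
actions = np.stack([policy_lo(s, true_goal) for s in states])
candidates = np.array([[-1.0], [2.0], [5.0]])
g = relabel_subgoal(states, actions, candidates, policy_lo)
```

Because the low-level policy keeps changing during training, the relabeled subgoals drift away from what the high-level policy currently produces; the adversarial guidance proposed in the thesis targets exactly that mismatch.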

In sum, this thesis endeavors to push the frontiers of visual perception and decision making capabilities for autonomous systems. Through contributions in video object segmentation and hierarchical reinforcement learning, it sets the stage for the development of more robust and dependable autonomous systems, thereby paving the way for their wider and safer application in society.
Original language: English
Place of Publication: Tampere
Publisher: Tampere University
ISBN (Electronic): 978-952-03-3859-6
ISBN (Print): 978-952-03-3858-9
Publication status: Published - 2025
Publication type: G5 Doctoral dissertation (articles)

Publication series

Name: Tampere University Dissertations - Tampereen yliopiston väitöskirjat
Volume: 1207
ISSN (Print): 2489-9860
ISSN (Electronic): 2490-0028
