Abstract
We consider the problem of an agent traversing a directed graph with the objective of maximizing the probability of reaching a goal node before a given deadline. The agent knows only the probability distributions of the edge travel times. It must balance traversal actions towards the goal against the delays incurred by actions that improve its information about edge travel times. We describe the relationship of the problem to the more general partially observable Markov decision process. Further, we show that if edge travel times are independent and the underlying directed graph is acyclic, a closed-loop solution can be computed. The solution specifies whether to execute a traversal or an information-gathering action as a function of the current node, the time remaining until the deadline, and the available information about edge travel times. We present results from two case studies that quantify the usefulness of information gathering compared with applying only traversal actions.
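As a rough illustration of the traversal-only baseline mentioned in the abstract, the sketch below computes the maximum probability of reaching the goal before the deadline by recursion over the current node and the time remaining. It is not the paper's algorithm: information-gathering actions are omitted, travel times are assumed to be discrete and independent, and the graph, its distributions, and the names `GRAPH`, `GOAL`, and `reach_probability` are hypothetical.

```python
from functools import lru_cache

# Hypothetical toy DAG: node -> list of (successor, {travel_time: probability}).
# Travel times are assumed discrete and independent across edges.
GRAPH = {
    "s": [("a", {1: 0.5, 3: 0.5}), ("b", {2: 1.0})],
    "a": [("g", {1: 1.0})],
    "b": [("g", {1: 0.8, 2: 0.2})],
    "g": [],
}
GOAL = "g"


@lru_cache(maxsize=None)
def reach_probability(node, time_left):
    """Maximum probability of reaching GOAL from `node` within `time_left`,
    using traversal actions only (no information gathering)."""
    if time_left < 0:
        return 0.0  # deadline already missed
    if node == GOAL:
        return 1.0  # goal reached in time
    best = 0.0
    for successor, travel_times in GRAPH[node]:
        # Expected success probability over the edge's travel-time distribution.
        p = sum(
            prob * reach_probability(successor, time_left - t)
            for t, prob in travel_times.items()
        )
        best = max(best, p)
    return best


if __name__ == "__main__":
    for deadline in range(1, 5):
        print(deadline, reach_probability("s", deadline))
```

Because the graph is acyclic, the recursion terminates, and memoizing on (node, time remaining) keeps the work proportional to the number of such pairs times the size of the edge distributions. Extending the state with the agent's information about edge travel times, as the abstract describes, would be needed to model information-gathering actions.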
Original language | English |
---|---|
Pages (from-to) | 337–370 |
Number of pages | 34 |
Journal | Annals of Mathematics and Artificial Intelligence |
Volume | 79 |
Issue number | 4 |
Early online date | 28 Sep 2016 |
Publication status | Published - Apr 2017 |
Publication type | A1 Journal article-refereed |
Keywords
- Applied probability
- Decision processes
- Dynamic programming
- Markov processes
- Transportation
Publication forum classification
- Publication forum level 1
ASJC Scopus subject areas
- Artificial Intelligence
- Applied Mathematics