Adaptive Modelling and Planning for Learning Intelligent Behaviour
Abstract
An intelligent agent must be capable of using its past experience to develop an
understanding of how its actions affect the world in which it is situated. Given
some objective, the agent must be able to effectively use its understanding of
the world to produce a plan that is robust to the uncertainty present in the
world. This thesis presents a novel computational framework called the Adaptive
Modelling and Planning System (AMPS) that aims to meet these requirements
for intelligence.
The first challenge for the agent is to use its experience in the world to generate a
model. In problems with large state and action spaces, the agent can generalise
from limited experience by grouping together similar states and actions, effectively
partitioning the state and action spaces into finite sets of regions. This
process is called abstraction. Several different abstraction approaches have been
proposed in the literature, but the existing algorithms have many limitations.
They generally only increase resolution, require a large amount of data before
changing the abstraction, do not generalise over actions, and are computationally
expensive. AMPS aims to address these limitations with a new approach.
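
As an illustration of abstraction in general, the following minimal sketch partitions a one-dimensional continuous state space into interval regions and estimates a region-level transition model from a handful of samples. It is not the AMPS algorithm: the class `IntervalAbstraction`, the function `estimate_model`, and the interval representation are hypothetical names and choices made only for this example.

```python
from bisect import bisect_right
from collections import defaultdict


class IntervalAbstraction:
    """Hypothetical abstraction: partitions [low, high) into interval regions."""

    def __init__(self, low, high):
        self.low, self.high = low, high
        self.cuts = []  # sorted interior boundaries between regions

    def region(self, state):
        """Return the index of the region that contains `state`."""
        return bisect_right(self.cuts, state)

    def split(self, boundary):
        """Increase resolution by adding a boundary inside the space."""
        if not (self.low < boundary < self.high):
            raise ValueError("boundary must lie strictly inside the space")
        if boundary not in self.cuts:
            self.cuts.append(boundary)
            self.cuts.sort()

    def merge(self, boundary):
        """Decrease resolution by removing an existing boundary."""
        self.cuts.remove(boundary)


def estimate_model(abstraction, transitions):
    """Estimate P(next region | region, action) from (s, a, s') samples."""
    counts = defaultdict(lambda: defaultdict(int))
    for state, action, next_state in transitions:
        key = (abstraction.region(state), action)
        counts[key][abstraction.region(next_state)] += 1
    return {
        key: {r: n / sum(nexts.values()) for r, n in nexts.items()}
        for key, nexts in counts.items()
    }


abstraction = IntervalAbstraction(0.0, 1.0)
abstraction.split(0.5)  # two regions: [0, 0.5) and [0.5, 1)
samples = [(0.1, "right", 0.6), (0.3, "right", 0.7), (0.4, "right", 0.2)]
model = estimate_model(abstraction, samples)
```

Because all three samples start in the same region, they pool into a single estimate for that region, which is the sense in which grouping similar states lets an agent generalise from limited experience.
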
AMPS splits and merges existing regions in its abstraction according to a
set of heuristics. The system introduces splits using a mechanism related to
supervised learning. This mechanism is defined in a general way, allowing AMPS to
leverage a wide variety of representations. The system merges existing regions when an
analysis of the current plan indicates that doing so could be useful. Because
several different regions may require revision at any given time, AMPS prioritises
revision to best utilise whatever computational resources are available. Changes
in the abstraction lead to changes in the model, requiring changes to the plan.
AMPS prioritises the planning process, and when the agent has time, it replans
in high-priority regions. This thesis demonstrates the flexibility and strength of
this approach in learning intelligent behaviour from limited experience.
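
The idea of prioritising revision under limited computation can be sketched with a priority queue. The abstract does not say how AMPS computes or stores priorities, so the following is only an assumed illustration: `RevisionQueue`, `revise_within_budget`, and the numeric priorities are hypothetical, and the `revise` callback stands in for whatever re-abstraction or replanning step is applied to a region.

```python
import heapq


class RevisionQueue:
    """Hypothetical max-priority queue of regions awaiting revision."""

    def __init__(self):
        self._heap = []      # entries are (-priority, region)
        self._priority = {}  # region -> most recent priority

    def push(self, region, priority):
        """Add a region, or update its priority if it is already queued."""
        self._priority[region] = priority
        heapq.heappush(self._heap, (-priority, region))

    def pop(self):
        """Return the highest-priority region, or None if the queue is empty."""
        while self._heap:
            neg_priority, region = heapq.heappop(self._heap)
            # Skip stale entries left behind by priority updates.
            if self._priority.get(region) == -neg_priority:
                del self._priority[region]
                return region
        return None


def revise_within_budget(queue, revise, budget):
    """Revise at most `budget` regions, highest priority first."""
    revised = []
    for _ in range(budget):
        region = queue.pop()
        if region is None:
            break
        revise(region)
        revised.append(region)
    return revised


queue = RevisionQueue()
queue.push("a", 1.0)
queue.push("b", 5.0)
queue.push("c", 2.0)
queue.push("a", 3.0)  # the priority of region "a" has increased
log = []
revised = revise_within_budget(queue, log.append, budget=2)
```

With a budget of two revisions, only the two highest-priority regions are processed and the rest remain queued until more time is available.
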

