Embodied agents in real-world robots: integrated control and machine learning for intelligent behaviours
Abstract
The central contribution of this thesis is a framework and a set of
algorithms that make robots move as versatilely and reliably as
biological systems. To this end, this work proposes a hierarchical control
framework that combines classical control on the lower levels, which
drives the robot’s actuators and maintains stability and balance, with
machine learning on the higher levels for decision making and
planning. Using machine learning for decision making and planning
lets the robot behave in a more intelligent, dynamic, and animal-like
manner, while the lower-level control algorithms retain the robustness
and stability properties of classical control.
In particular, this thesis presents contributions in six areas of robotic
motion control: (1) a biologically motivated implicit hierarchical
generative model for autonomous robot operations, (2) multi-expert
learning and skill synthesis, (3) an improved formulation for
Model-Predictive Control (MPC) of legged robots, (4) rapid robot policy
development and deployment through fast sample collection for Deep
Reinforcement Learning (DRL), (5) reverse-engineering AI policies into
equivalent, safe, transparent, and certifiable controllers, and
(6) automatic parameter tuning.
First, we draw inspiration from human motor control and insights from
neuroscience and propose an implicit hierarchical generative model for
robot control that mimics the deep temporal architecture of human
motor control. We show that a robot can use this model to learn, fully
autonomously, to complete a complex task of object retrieval, delivery,
and navigation.
Using the DRL policies trained in previous work, we propose a
Multi-Expert Learning Architecture (MELA) that uses multiple
experts to train and synthesise new skills. We show that MELA can
be used to learn animal-like, dynamic, and adaptive behaviours on the
quadruped robot Jueying.
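
The abstract does not state how MELA combines its experts, so the following is only a minimal sketch of one common multi-expert scheme: a gating function produces state-dependent softmax weights, and the experts' actions are blended with those weights. All names (`blend_experts`, `trot_expert`, `recover_expert`, `toy_gate`) are hypothetical, and the actual architecture may blend different quantities than actions.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    logits = np.asarray(logits, dtype=float)
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()


def blend_experts(state, experts, gate):
    """Blend expert actions with state-dependent gating weights.

    Assumption: experts are callables state -> action vector, and gate is a
    callable state -> one logit per expert. This is an illustration only.
    """
    weights = softmax(gate(state))  # non-negative, sums to 1
    actions = np.stack([expert(state) for expert in experts])  # (n_experts, action_dim)
    return weights @ actions, weights


# Hypothetical toy experts with fixed two-dimensional actions.
def trot_expert(state):
    return np.array([1.0, 0.0])


def recover_expert(state):
    return np.array([0.0, 1.0])


def toy_gate(state):
    # Hypothetical rule: favour the recovery expert as body tilt grows.
    tilt = abs(state[0])
    return np.array([1.0 - tilt, tilt]) * 5.0


action, weights = blend_experts(np.array([0.9]), [trot_expert, recover_expert], toy_gate)
```

With a large tilt the gate shifts most of the weight to the recovery expert, so the blended action moves smoothly between behaviours rather than switching abruptly.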
We extend MELA into a more general Multi-Expert Synthesis (MES)
framework, for which we propose general guidelines. To this
end, we propose an automatic state selection procedure, and identify
and solve common issues in multi-expert systems. Through MES, we
achieve fall-resilient locomotion on the quadruped ANYmal and
dual-arm cooperation for grasping ungraspable objects on a bi-manual
robot setup using Franka Panda robots.
We propose a numerically stable formulation for MPC based on the
Linear Inverted Pendulum Model (LIPM), and show its robustness properties
on the task of legged locomotion for the humanoid Valkyrie. Furthermore,
we show that the proposed formulation is more robust to external
disturbances than other LIPM formulations for MPC.
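
For readers unfamiliar with the LIPM, the sketch below shows the standard model that such MPC formulations build on, not the specific formulation proposed in the thesis. It assumes a constant centre-of-mass height and a centre of pressure `p` held constant over one time step, and uses the closed-form solution of x'' = (g / z) (x - p). The function name `lipm_step` and its parameters are illustrative.

```python
import math

GRAVITY = 9.81  # m/s^2


def lipm_step(x, xdot, p, com_height, dt):
    """Advance the 1-D LIPM state (x, xdot) by dt seconds.

    Closed-form solution of x'' = omega^2 (x - p) with p constant over the
    step, where omega = sqrt(g / com_height).
    """
    omega = math.sqrt(GRAVITY / com_height)
    c = math.cosh(omega * dt)
    s = math.sinh(omega * dt)
    x_next = c * x + (s / omega) * xdot + (1.0 - c) * p
    xdot_next = omega * s * x + c * xdot - omega * s * p
    return x_next, xdot_next


def divergent_component(x, xdot, com_height):
    """Unstable part of the LIPM state, x + xdot / omega."""
    omega = math.sqrt(GRAVITY / com_height)
    return x + xdot / omega
```

The divergent component's offset from `p` grows by a factor of exp(omega * dt) at every step, which is why the numerical conditioning of LIPM-based predictions over long horizons matters.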
In the domain of DRL, we propose a fast sample collection procedure
that runs in parallel on consumer-grade computers and enables
the training and deployment of policies such as trotting on the quadruped
ANYmal and balancing on the humanoid Valkyrie within an hour.
Next, we use the Artificial Intelligence (AI) policy trained through
DRL as the basis for a reverse-engineering framework. For the task of
humanoid push recovery on Valkyrie, we show how an opaque AI policy
can be used to obtain a transparent, certifiable controller with the same
properties as the AI policy. We show that ankle, hip, toe, and stepping
strategies emerge from the reverse-engineered controller in a unified
manner without the need for explicit switching.
Lastly, we propose an algorithm called Alternating Bayesian Optimisation
(ABO) to automatically tune the high-dimensional parameters for
whole-body control of Valkyrie. Unlike classical Bayesian Optimisation,
which scales only up to 10 dimensions, we show that ABO
can tune up to 36 parameters for whole-body control and handle up to
60 dimensions on the global optimisation benchmark COCO.
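
The abstract does not describe how ABO works. As a rough illustration of why alternating over parameter subsets can help in high dimensions, the sketch below optimises one low-dimensional block at a time while holding the rest fixed. It is an assumption-laden stand-in: plain random search replaces the Bayesian Optimisation step, and the block structure, the function name `alternating_block_search`, and the `sphere` test objective are all hypothetical.

```python
import random


def alternating_block_search(objective, x0, block_size, rounds,
                             samples_per_block, bounds=(-1.0, 1.0), seed=0):
    """Minimise objective by searching one block of coordinates at a time.

    Random search is used inside each block purely as a placeholder for a
    surrogate-based optimiser; this is not the ABO algorithm itself.
    """
    rng = random.Random(seed)
    best_x = list(x0)
    best_f = objective(best_x)
    dim = len(best_x)
    blocks = [range(i, min(i + block_size, dim)) for i in range(0, dim, block_size)]
    for _ in range(rounds):
        for block in blocks:
            for _ in range(samples_per_block):
                candidate = list(best_x)
                for i in block:  # only this block's coordinates change
                    candidate[i] = rng.uniform(*bounds)
                value = objective(candidate)
                if value < best_f:  # keep strict improvements only
                    best_x, best_f = candidate, value
    return best_x, best_f


def sphere(x):
    """Simple convex test objective with its minimum at the origin."""
    return sum(v * v for v in x)


x0 = [1.0] * 36
best_x, best_f = alternating_block_search(
    sphere, x0, block_size=6, rounds=3, samples_per_block=50)
```

Each inner search here sees only six free parameters, so the per-block problem stays within the dimensionality that low-dimensional optimisers handle well, at the cost of ignoring interactions across blocks within a single pass.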