Research

Our research focuses on the intersection of machine learning, robotics, human-robot interaction, and computer vision. Our goal is to create data-efficient and mathematically principled machine learning algorithms that are suitable for complex robot domains such as grasping and manipulation, forceful interactions, or dynamic motor tasks. We always aim for a strong theoretical basis for the algorithms we develop, which we derive from first principles. In terms of methods, our work focuses on:

  • Movement Representations: We are developing new methods for representing motions in a compact and flexible way using motion primitives. The primitives provide desired trajectories that can be adapted efficiently to new situations. Here, we follow a probabilistic approach that also models the variability in trajectory space, which can subsequently be used for adaptation via Bayesian conditioning; a minimal conditioning sketch follows this list.
  • Reinforcement Learning and Policy Search: Here we investigate how a robot can improve its policy by interacting with its environment. Our focus is to connect probabilistic movement representations with deep reinforcement learning, which requires exact policy updates in high-dimensional action spaces. We rely on probabilistic trust regions to stabilize the policy update (see the trust-region sketch below). Furthermore, we are developing methods for improving the versatility of a robot's learned skills.
  • Imitation Learning and Interactive Learning: In imitation learning, we want to learn policies from the demonstrations of an expert. Our focus is on trajectory-based imitation techniques building on movement primitives, as they can directly learn to imitate the long-term behaviour instead of single time-step actions as done in traditional approaches. Moreover, a key focus lies on learning from real human demonstrations. Such demonstrations are inherently multi-modal, in particular if they come from different teachers. Here we develop new approaches that can identify the multiple solutions in the data and avoid the common mode-averaging problem (illustrated by the mixture-model sketch below). In terms of interactive learning, we are looking into algorithms that can learn from human feedback, such as preference feedback.
  • Model Learning: We aim to learn complex dynamics models for non-Markovian systems such as hydraulic robots or robots in contact. To do so, we use novel recurrent neural network architectures that embed a Kalman filter in a deep latent state space (see the Kalman-step sketch below). Furthermore, we extend such models to situations with changing dynamics, where the model needs to adapt quickly to the new scenario. In the long term, we aim to use these models to obtain optimal control policies.
  • Perception: We are working on methods for integrating complex perception inputs such as point clouds into our decision-making process. Here we rely on PointNet architectures and graph neural networks; a minimal PointNet-style encoder is sketched below.
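
To make the Bayesian conditioning in the movement-representation work concrete, the sketch below conditions a toy one-dimensional trajectory distribution on a via-point. Assuming a trajectory point is modelled as y_t = φ(t)ᵀw with Gaussian weights w ~ N(μ_w, Σ_w), adapting to a via-point is standard Gaussian conditioning. The RBF features, function names, and parameter values are illustrative assumptions, not our actual implementation.

```python
import numpy as np

def rbf_features(t, n_basis=10, width=0.02):
    """Normalized RBF basis over a phase variable t in [0, 1]."""
    centers = np.linspace(0.0, 1.0, n_basis)
    phi = np.exp(-0.5 * (t - centers) ** 2 / width)
    return phi / phi.sum()

def condition_on_viapoint(mu_w, Sigma_w, t_star, y_star, sigma_y=1e-4):
    """Condition the Gaussian weight distribution on a via-point y* at phase t*."""
    phi = rbf_features(t_star, n_basis=len(mu_w))     # feature vector at t*
    s = phi @ Sigma_w @ phi + sigma_y                 # predictive variance at t*
    k = Sigma_w @ phi / s                             # gain vector
    mu_new = mu_w + k * (y_star - phi @ mu_w)         # shift mean towards the via-point
    Sigma_new = Sigma_w - np.outer(k, phi @ Sigma_w)  # shrink uncertainty around t*
    return mu_new, Sigma_new

# toy prior over primitive weights (in practice learned from demonstrations)
mu_w, Sigma_w = np.zeros(10), np.eye(10)
mu_c, Sigma_c = condition_on_viapoint(mu_w, Sigma_w, t_star=0.5, y_star=1.0)
```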
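The trust-region idea from the policy-search work can be illustrated for a Gaussian policy with fixed covariance: if a proposed mean update exceeds a KL bound, the step is scaled back until the bound holds exactly. This is a deliberately simplified, single-Gaussian sketch of the general principle, not the full method.

```python
import numpy as np

def kl_gauss_mean(mu_new, mu_old, Sigma_inv):
    """KL divergence between two Gaussian policies with shared covariance."""
    d = mu_new - mu_old
    return 0.5 * d @ Sigma_inv @ d

def trust_region_project(mu_old, mu_cand, Sigma_inv, eps=0.01):
    """Shrink a proposed mean update so the policy KL stays within the bound eps."""
    kl = kl_gauss_mean(mu_cand, mu_old, Sigma_inv)
    if kl <= eps:
        return mu_cand
    alpha = np.sqrt(eps / kl)  # the KL is quadratic in the step size
    return mu_old + alpha * (mu_cand - mu_old)

mu_old = np.zeros(4)
mu_cand = np.array([1.0, 0.0, 0.0, 0.0])  # mean proposed by a gradient step
mu_new = trust_region_project(mu_old, mu_cand, np.eye(4), eps=0.01)
# kl_gauss_mean(mu_new, mu_old, np.eye(4)) now equals eps up to rounding
```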
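The mode-averaging problem in imitation learning shows up already in a toy example: fitting a single Gaussian to demonstrations from two teachers yields their useless average, whereas a mixture model recovers both solutions. The tiny EM fit below is a generic illustration of this point, not one of our algorithms.

```python
import numpy as np

def em_gmm_1d(x, n_components=2, n_iters=50):
    """Tiny EM fit of a 1D Gaussian mixture: the component means recover the
    distinct solutions in the data instead of averaging them."""
    mu = np.quantile(x, np.linspace(0.1, 0.9, n_components))  # spread initial means
    var = np.full(n_components, x.var())
    pi = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iters):
        # E-step: responsibility of each component for each sample
        logp = -0.5 * ((x[:, None] - mu) ** 2 / var + np.log(2 * np.pi * var))
        r = pi * np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate means, variances, and mixing weights
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        pi = nk / len(x)
    return mu, var, pi

# two teachers demonstrating two distinct solutions around -2 and +2
rng = np.random.default_rng(1)
demos = np.concatenate([rng.normal(-2, 0.3, 100), rng.normal(2, 0.3, 100)])
mu, var, pi = em_gmm_1d(demos)  # mu is close to [-2, 2], not the averaged 0
```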
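For the model-learning work, the core building block is the classical Kalman predict/update cycle, which the recurrent architectures embed in a learned latent space. The sketch below shows the textbook step with assumed toy matrices; in the learned setting, A, C, Q, R act on latent states and z is an encoded observation.

```python
import numpy as np

def kalman_step(mu, Sigma, z, A, C, Q, R):
    """One predict/update cycle of a Kalman filter on a Gaussian belief."""
    # predict: propagate the belief through the linear dynamics
    mu_p = A @ mu
    Sigma_p = A @ Sigma @ A.T + Q
    # update: correct the prediction with the observation z
    S = C @ Sigma_p @ C.T + R             # innovation covariance
    K = Sigma_p @ C.T @ np.linalg.inv(S)  # Kalman gain
    mu_new = mu_p + K @ (z - C @ mu_p)
    Sigma_new = (np.eye(len(mu)) - K @ C) @ Sigma_p
    return mu_new, Sigma_new

# toy 2D latent state observed through a 1D measurement
A, C = np.eye(2), np.array([[1.0, 0.0]])
Q, R = 0.01 * np.eye(2), np.array([[0.1]])
mu, Sigma = np.zeros(2), np.eye(2)
mu, Sigma = kalman_step(mu, Sigma, z=np.array([0.5]), A=A, C=C, Q=Q, R=R)
```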
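Finally, a minimal PointNet-style encoder illustrates the perception pipeline: a shared per-point MLP followed by a symmetric max pool yields a permutation-invariant embedding of a point cloud. The random weights below are stand-ins; a real encoder would be deeper and trained end-to-end.

```python
import numpy as np

def pointnet_encode(points, W1, b1, W2, b2):
    """Shared per-point MLP + max pooling: the output does not change when
    the points are reordered, which suits unordered point clouds."""
    h = np.maximum(points @ W1 + b1, 0.0)  # shared MLP layer 1, ReLU
    h = np.maximum(h @ W2 + b2, 0.0)       # shared MLP layer 2, ReLU
    return h.max(axis=0)                   # symmetric max pool over all points

rng = np.random.default_rng(0)
cloud = rng.normal(size=(1024, 3))                   # 1024 points in 3D
W1, b1 = 0.1 * rng.normal(size=(3, 64)), np.zeros(64)
W2, b2 = 0.1 * rng.normal(size=(64, 128)), np.zeros(128)
embedding = pointnet_encode(cloud, W1, b1, W2, b2)   # (128,) cloud feature
```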