Tuomas Haarnoja

I am a PhD student in the Berkeley Artificial Intelligence Research Lab (BAIR) at UC Berkeley. My current research focus is on extending deep learning methods to provide for flexible, effective robot control that can handle the diversity and variability of the real world. I am co-advised by Prof. Pieter Abbeel and Prof. Sergey Levine.

I received a master's degree in Space Robotics and Automation from Luleå University of Technology, Sweden, and Aalto University of Technology, Finland, after which I worked at VTT Technical Research Centre of Finland as a research scientist before joining BAIR.

Email  |  LinkedIn  |  Google Scholar  |  GitHub

Research

Below is a list of some of my publications related to machine learning and robotics. For a complete list, including work on passive identification of concealed objects and vibration control of electric machines, please refer to my Google Scholar page.

Reinforcement Learning with Deep Energy-Based Policies
Tuomas Haarnoja*, Haoran Tang*, Pieter Abbeel, Sergey Levine
International Conference on Machine Learning, 2017
paper  |  videos  |  code  |  bibtex

Abstract: We propose a method for learning expressive energy-based policies for continuous states and actions, which has been feasible only in tabular domains before. We apply our method to learning maximum entropy policies, resulting in a new algorithm, called soft Q-learning, that expresses the optimal policy via a Boltzmann distribution. We use the recently proposed amortized Stein variational gradient descent to learn a stochastic sampling network that approximates samples from this distribution. The benefits of the proposed algorithm include improved exploration and compositionality that allows transferring skills between tasks, which we confirm in simulated experiments with swimming and walking robots. We also draw a connection to actor-critic methods, which can be viewed as performing approximate inference on the corresponding energy-based model.

Backprop KF: Learning Discriminative Deterministic State Estimators
Tuomas Haarnoja, Anurag Ajay, Sergey Levine, Pieter Abbeel
Advances in Neural Information Processing Systems, 2016
paper  |  bibtex

Abstract: Generative state estimators based on probabilistic filters and smoothers are one of the most popular classes of state estimators for robots and autonomous vehicles. However, generative models have limited capacity to handle rich sensory observations, such as camera images, since they must model the entire distribution over sensor readings. Discriminative models do not suffer from this limitation, but are typically more complex to train as latent variable models for state estimation. We present an alternative approach where the parameters of the latent state distribution are directly optimized as a deterministic computation graph, resulting in a simple and effective gradient descent algorithm for training discriminative state estimators. We show that this procedure can be used to train state estimators that use complex input, such as raw camera images, which must be processed using expressive nonlinear function approximators such as convolutional neural networks. Our model can be viewed as a type of recurrent neural network, and the connection to probabilistic filtering allows us to design a network architecture that is particularly well suited for state estimation. We evaluate our approach on a synthetic tracking task with raw image inputs and on the visual odometry task in the KITTI dataset. The results show significant improvement over both standard generative approaches and regular recurrent neural networks.

Idle state stability, limit cycle walking & regenerative walking: Towards long time autonomy in bipeds
José-Luis Peralta, Tuomas Haarnoja, Tomi Ylikorpi, Aarne Halme
International Conference on Climbing and Walking Robots, 2010
paper  |  bibtex

Abstract: This work presents an integral approach to tackle energy issues in bipedal robots. It introduces three combined ideas to increase these robots’ autonomy: first, the exploitation of the inherent equilibrium that should exist in the rest position of a well-designed mechanism; then, the efficient usage of the energy to walk based on the natural limit cycle of the system; and finally, the harvest of energy based on the new idea of regenerative walking. Simulations and experimental tests show promising results of this approach, built under a delicate equilibrium between appropriate control scheme, suitable mechanical design and proper actuator choice.

Teaching

Deep Reinforcement Learning Bootcamp
26-27 August 2017, UC Berkeley, Teaching Assistant

CS189/289A - Introduction to Machine Learning
Spring 2016, UC Berkeley, Graduate Student Instructor

AS-0.1101 - Basic course on C programming
Spring 2007, Helsinki University of Technology, Teaching Assistant

