Herke van Hoof is currently an assistant professor at the University of Amsterdam in the Netherlands. He is part of the Amlab headed by Professor Max Welling as well as the UvA-Bosch Delta lab. Herke works on machine learning for autonomous robots in perceptually challenging environments. For robots to master a wide array of tasks, machine learning is indispensable, but it is equally important that such tasks can be learned in non-standardized and unstructured environments such as homes or hospitals. Learning tasks in such complicated environments puts additional demands on algorithms for machine learning, perception, and control.

One example of such a task is exploring the objects present in a novel environment. Segmenting objects using passive sensing is inherently limited. By interacting with the environment, the robot can improve its understanding of the different objects that are present. However, interaction is costly. By expressing the uncertainty in the robot’s understanding of the world, it becomes possible to select actions based on the information they are expected to yield about the environment, and thus speed up the learning progress.
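
The idea of selecting actions by the information they are expected to yield can be illustrated with a minimal sketch. It is not the method used in this project: it assumes a discrete belief over hypotheses about the scene and a known observation model per action, and all names (`expected_information_gain`, `select_action`, the `look` and `push` actions) are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution given as a list."""
    return -sum(q * math.log2(q) for q in p if q > 0.0)

def expected_information_gain(belief, likelihood):
    """Expected reduction in the entropy of `belief` after taking one action.

    belief:     list with P(hypothesis), length H
    likelihood: H x O nested list, likelihood[h][o] = P(observation o | hypothesis h)
    """
    n_obs = len(likelihood[0])
    gain = entropy(belief)
    for o in range(n_obs):
        # Marginal probability of seeing observation o under the current belief.
        p_obs = sum(b * row[o] for b, row in zip(belief, likelihood))
        if p_obs > 0.0:
            # Bayes update of the belief for this observation.
            posterior = [b * row[o] / p_obs for b, row in zip(belief, likelihood)]
            gain -= p_obs * entropy(posterior)
    return gain

def select_action(belief, likelihoods):
    """Return the index of the most informative action and all expected gains."""
    gains = [expected_information_gain(belief, lik) for lik in likelihoods]
    best = max(range(len(gains)), key=gains.__getitem__)
    return best, gains

# Hypothetical example: two hypotheses ("one object" vs. "two objects").
belief = [0.5, 0.5]
look = [[0.5, 0.5], [0.5, 0.5]]  # assumed: passive sensing cannot tell them apart
push = [[0.9, 0.1], [0.1, 0.9]]  # assumed: pushing usually reveals the answer
best, gains = select_action(belief, [look, push])
```

In this toy setting the uninformative action has zero expected gain, so the interaction is preferred. A cost term per action could be subtracted from each gain to reflect that interaction is costly.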

In another project, we consider reinforcement learning with high-dimensional inputs. Current approaches have usually tried to learn features in a separate step. However, such features cannot be informed by what is relevant for the task at hand. We have taken a complementary approach, where we have developed a non-parametric reinforcement learning method that depends only on the similarity between data points, independent of the embedding dimensionality.
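
To make concrete what it means for a method to depend only on the similarity between data points, here is a minimal sketch of a kernel-weighted value estimate. It is a generic kernel smoother, not the reinforcement learning method developed in this project; the Gaussian kernel, the bandwidth, and the function names are assumptions chosen for illustration.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Similarity between two points; the only place the raw inputs are touched."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def kernel_value_estimate(query, states, returns, kernel=gaussian_kernel):
    """Estimate the value at `query` as a similarity-weighted average of
    the returns observed at previously visited `states`."""
    weights = [kernel(query, s) for s in states]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is not similar to any stored state")
    return sum(w * g for w, g in zip(weights, returns)) / total

# Hypothetical data: two visited states with their observed returns.
states = [(0.0, 0.0), (10.0, 10.0)]
returns = [1.0, 5.0]
near_first = kernel_value_estimate((0.0, 0.0), states, returns)
midpoint = kernel_value_estimate((5.0, 5.0), states, returns)
```

Because the estimator sees the inputs only through `kernel`, the same code works for inputs of any dimensionality as long as a suitable similarity measure is supplied.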

Currently, I’m working on new ways to exploit known robot models and/or simulators to make reinforcement learning more efficient. I am looking to use a generative model of the robot to characterise its belief over unknown parameters, and to pre-train a policy that learns to trade off exploration and exploitation based on this characterisation.

Before joining the University of Amsterdam, Herke van Hoof was a postdoc at McGill University in Montreal, Canada, where he worked with Professors Joelle Pineau, Dave Meger, and Gregory Dudek. He obtained his PhD at TU Darmstadt, Germany, under the supervision of Professor Jan Peters, where he graduated in November 2016. Herke earned his bachelor's and master's degrees in Artificial Intelligence at the University of Groningen in the Netherlands.

Recent News

  • AAAI paper accepted (12/7/2021)

    Alex Long’s AAAI submission on Fast and Data Efficient Reinforcement Learning from Pixels via Non-Parametric Value Approximation got accepted for presentation! The paper is the result of a collaboration started when Alex visited us at UvA. Congrats, Alex!

  • Several working papers posted (10/3/2021)

    We’ve recently posted several working papers that I hope you’ll find interesting:

    • Amin, S., Gomrokchi, M., Satija, H., van Hoof, H., & Precup, D. (2021). A Survey of Exploration Methods in Reinforcement Learning. arXiv preprint arXiv:2109.00157.
    • Kool, W., van Hoof, H., Gromicho, J., & Welling, M. (2021). Deep Policy Dynamic Programming for Vehicle Routing Problems. arXiv preprint arXiv:2102.11756.

  • New publications (5/11/2021)

    Jan Woehlke’s paper “Hierarchies of Planning and Reinforcement Learning for Robot Navigation” was accepted for presentation at the International Conference on Robotics and Automation.

    Yijie Zhang got his paper “Deep Coherent Exploration For Continuous Control” accepted for presentation at the International Conference on Machine Learning.

    Congrats, Yijie & Jan! Links to the papers and preprints will follow soon on the publication page.

An archive of news items can be found on the News page.

Key References

Kool, Wouter ; van Hoof, Herke ; Welling, Max

Estimating Gradients for Discrete Random Variables by Sampling without Replacement

International Conference on Learning Representations, 2020.

Smith, M; van Hoof, H; Pineau, J

An Inference-Based Policy Gradient Method for Learning Options

International Conference on Machine Learning, pp. 4703-4712, 2018.

Van Hoof, H; Neumann, G; Peters, J

Non-parametric Policy Search with Limited Information Loss

Journal of Machine Learning Research, 18 (73), pp. 1-46, 2017.

van der Pol, Elise ; Worrall, Daniel ; van Hoof, Herke ; Oliehoek, Frans ; Welling, Max

MDP Homomorphic Networks: Group Symmetries in Reinforcement Learning

Advances in Neural Information Processing Systems, 2020.

Bakker, Tim ; van Hoof, Herke ; Welling, Max

Experimental design for MRI by greedy policy search

Advances in Neural Information Processing Systems, 2020.

A full list of publications can be found at the Publications page.