Investigating large-scale training of RL agents in a vast and diverse space of simulated tasks
Kinetix is an open-ended reinforcement learning (RL) framework for 2D physics-based
control tasks. It can represent a diverse range of environments, from mazes to video games to
complex manipulation problems, and everything in between.
Kinetix can run at millions of steps per second on a single GPU by using JAX, unlocking large-scale
training of general reinforcement learning agents.
All environments in Kinetix have the same goal: make the green and blue shapes touch, without green touching red.
The agent acts by applying torque via motors and force via thrusters. Through these simple rules, we can represent an astonishing array of tasks, all within a unified framework.
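To give a feel for how this speed is typically exploited, below is a minimal sketch of vectorised rollouts with jax.vmap and jax.lax.scan. It assumes a gymnax-style interface (reset and step take PRNG keys) and a hypothetical make_kinetix_env constructor; the exact names and signatures may differ, so please refer to the GitHub repository for the real API.

```python
import jax

# Hypothetical constructor, for illustration only: we assume a gymnax-style
# environment (reset/step take PRNG keys) -- see the repository for the real API.
env, env_params = make_kinetix_env()

NUM_ENVS = 4096    # parallel environments stepped in lockstep on one GPU
NUM_STEPS = 1_000  # rollout length

rng, reset_rng, scan_rng = jax.random.split(jax.random.PRNGKey(0), 3)

# Reset every environment in parallel.
obs, env_state = jax.vmap(env.reset, in_axes=(0, None))(
    jax.random.split(reset_rng, NUM_ENVS), env_params
)

def step_fn(carry, step_key):
    obs, env_state = carry
    keys = jax.random.split(step_key, NUM_ENVS)
    # Random actions stand in for a policy; assumes a gymnax-style action space.
    actions = jax.vmap(env.action_space(env_params).sample)(keys)
    obs, env_state, reward, done, info = jax.vmap(
        env.step, in_axes=(0, 0, 0, None)
    )(keys, env_state, actions, env_params)
    return (obs, env_state), reward

# lax.scan compiles the loop body; in a full training run the whole rollout
# (and the policy update) would typically sit under jax.jit as well.
(_, _), rewards = jax.lax.scan(
    step_fn, (obs, env_state), jax.random.split(scan_rng, NUM_STEPS)
)
print(rewards.shape)  # (NUM_STEPS, NUM_ENVS)
```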
Kinetix: The RL Framework
Kinetix is a 2D-physics-based, hardware-accelerated RL environment that can represent a
large number of diverse tasks, from video games, to classic RL
environments, to more complex locomotion and manipulation environments. For instance, below we have the classic RL
environments CartPole and Acrobot, some more complex robotic locomotion tasks inspired by MuJoCo, as well as nontraditional environments where the agent controls multiple
parts of a complex system.
Kinetix: A Suite of Handmade Evaluation Tasks
We provide a large set of challenging and diverse RL environments that you can start using immediately (see our main GitHub repository for more). You can use these environments to train a single multi-task RL agent, train on individual tasks, or use them as a held-out evaluation set to test the generalisation of agents.
Below we show each level in our database. For a dedicated experience (and the ability to edit and save your own levels), please see the gallery.
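As a rough illustration of how the held-out set might be used, the sketch below evaluates a fixed policy on each handmade level in turn and averages the solve rate. The load_handmade_levels and make_env_from_level helpers are hypothetical placeholders, and we assume a positive terminal reward indicates success; the real loading utilities are in the main repository, and in practice one would vectorise the rollouts as in the earlier sketch.

```python
import jax

# Hypothetical helpers, for illustration only -- the real level-loading and
# environment-construction utilities live in the main Kinetix repository:
#   load_handmade_levels() -> list of level specifications
#   make_env_from_level(level) -> (env, env_params) with a gymnax-style interface

def mean_solve_rate(policy_apply, policy_params, num_episodes=16, seed=0):
    """Evaluate a fixed policy on every handmade level and average the solve rate."""
    rng = jax.random.PRNGKey(seed)
    solve_rates = []
    for level in load_handmade_levels():
        env, env_params = make_env_from_level(level)
        solved = 0
        for _ in range(num_episodes):
            rng, reset_key = jax.random.split(rng)
            obs, state = env.reset(reset_key, env_params)
            done = False
            while not done:
                rng, step_key = jax.random.split(rng)
                action = policy_apply(policy_params, obs)
                obs, state, reward, done, info = env.step(
                    step_key, state, action, env_params
                )
            # Assumption: a positive terminal reward means green and blue touched.
            solved += int(reward > 0)
        solve_rates.append(solved / num_episodes)
    return sum(solve_rates) / len(solve_rates)
```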
Kinetix: The Easy-to-Use Environment Creator!
Kinetix: An Open-Ended Benchmark
Finally, we believe Kinetix serves as an ideal environment for studying open-ended learning, automatic
curriculum learning, and unsupervised environment design. This is because Kinetix is fast, enabling
large-scale experiments, and because it can represent a wide range of semantically diverse tasks,
rather than only small variations of the same task (e.g., different obstacle locations in a maze).
We provide functionality to generate random environments, as well as code to run
autocurricula methods on this distribution.
We use this to train a general agent on randomly sampled levels and investigate its generalisation capabilities.
Beyond autocurricula and RL generalisation methods, we believe Kinetix serves as an excellent foundation for future study of areas including agent network capacity, plasticity loss, lifelong learning, and multi-task learning.
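To make the training setup concrete, here is a minimal sketch of the simplest baseline on this distribution, domain randomisation: every parallel environment is reset to a freshly sampled random level. An autocurriculum method would instead draw levels from a curated, prioritised buffer. The names sample_random_level, reset_to_level, and make_kinetix_env are hypothetical and used purely for illustration; the actual generation and reset functions are in the repository.

```python
import jax

# Hypothetical names, for illustration only -- see the repository for the real
# random-level generator and reset function:
#   sample_random_level(key) -> a level specification (a pytree of arrays)
#   env.reset_to_level(key, level, env_params) -> (obs, state)
env, env_params = make_kinetix_env()  # hypothetical constructor, as above

NUM_ENVS = 2048

def sample_and_reset(key):
    """Reset one environment to a freshly sampled random level (domain randomisation)."""
    level_key, reset_key = jax.random.split(key)
    level = sample_random_level(level_key)
    return env.reset_to_level(reset_key, level, env_params)

# Each parallel environment starts on its own randomly generated level; an
# autocurriculum method would instead pick `level` from a prioritised buffer.
keys = jax.random.split(jax.random.PRNGKey(0), NUM_ENVS)
obs, env_states = jax.vmap(sample_and_reset)(keys)
```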
Kinetix:
- Go and see what levels other people have made in the gallery, and see how you perform vs. a trained agent!
- Make your own levels in the editor.
- Use Kinetix to train or evaluate RL agents.
- If you are interested in using the web version, or in building on it to make your own websites, see Kinetix.js.
- Or, read the paper, which is available on arXiv.
@inproceedings{matthews2024kinetix,
title={Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks},
author={Michael Matthews and Michael Beukman and Chris Lu and Jakob Foerster},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://arxiv.org/abs/2410.23208}
}
This is based on the Distill Template and the ACCEL Blog. Big thanks to Thomas Foster, Alex Goldie, Matthew Jackson and Andrei Lupu for feedback, discussions and suggestions on this work.