Turing Learning: a metric-free approach to inferring behaviour and its application to swarms

Wei Li, Melvin Gauci and Roderich Groß


Abstract

We propose Turing Learning, a novel system identification method for inferring behaviour.

Turing Learning simultaneously optimises models and classifiers. The classifiers are provided with data samples from both an agent and models under observation, and are rewarded for discriminating between them. Conversely, the models are rewarded for 'tricking' the classifiers into categorising them as the agent.
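
The reward structure described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the specific scoring rules (the fraction of classifiers fooled for a model, and the mean of sensitivity and specificity for a classifier) are hypothetical choices that match the verbal description.

```python
from typing import Sequence


def model_fitness(judged_as_agent: Sequence[bool]) -> float:
    """Reward for one model (assumed rule).

    judged_as_agent[i] is True if classifier i labelled this model's
    data sample as coming from the agent, i.e. the model 'tricked' it.
    """
    return sum(judged_as_agent) / len(judged_as_agent)


def classifier_fitness(
    agent_judged_as_agent: Sequence[bool],
    model_judged_as_agent: Sequence[bool],
) -> float:
    """Reward for one classifier (assumed rule).

    The classifier scores well when it accepts agent samples
    (sensitivity) and rejects model samples (specificity).
    """
    sensitivity = sum(agent_judged_as_agent) / len(agent_judged_as_agent)
    specificity = sum(not j for j in model_judged_as_agent) / len(
        model_judged_as_agent
    )
    return 0.5 * (sensitivity + specificity)
```

Under these assumptions, a model that fools three of four classifiers scores 0.75, and a classifier that accepts every agent sample but is fooled by half of the models also scores 0.75. No metric comparing agent and model behaviour appears anywhere in the rewards.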

Unlike other methods for system identification, Turing Learning does not require predefined metrics to quantify the difference between the agent and models.

We present two case studies with swarms of simulated robots that show that Turing Learning outperforms a metric-based system identification method in terms of model accuracy. It also produces a useful by-product in the form of classifiers that can be used to detect abnormal behaviour in the swarm. Moreover, we show that Turing Learning also successfully infers the behaviour of physical robot swarms.

The results show that collective behaviours can be directly inferred from motion trajectories of individual agents in the swarm, which may have significant implications for the study of animal collectives.

Furthermore, Turing Learning could prove useful wherever a behaviour is not easily characterisable using metrics, making it suitable for a wide range of applications.

Highlight video

Videos of physical experiments

Evolution of the best two models (selected by the classifiers) over generations in the 10 physical coevolution runs:

Trial 1

Trial 2

Trial 3

Trial 4

Trial 5

Trial 6

Trial 7

Trial 8

Trial 9

Trial 10

A video showing the states of the programs executed by the replicas and agents in the physical coevolution runs.

20 physical trials using 40 e-puck robots for model validation (10 trials with the original controller and 10 with the inferred controller). Each video shows two trials:

Index 1

Index 2

Index 3

Index 4

Index 5

Index 6

Index 7

Index 8

Index 9

Index 10

Other materials

Implementation of the Disperse program after a trial (see Section 5.1.2 in the paper).

To automate Turing Learning, the robots are dispersed for a short period after each trial; the resulting arrangement serves as the initial configuration for the next trial.

This program consists of two behaviours: obstacle avoidance and disperse.

The obstacle avoidance behaviour prevents the robots from colliding with other robots and the walls. Before executing the disperse behaviour, each robot uses its infrared proximity sensors to check whether any objects (robots or walls) are nearby.

If it detects an object, it moves away by adjusting its linear and angular speed using a single-layer neural network controller.

The obstacle avoidance behaviour lasts for three seconds.

In the disperse behaviour, each robot moves forward at a fixed linear speed while avoiding collisions with other robots and the walls. This behaviour lasts for five seconds.
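
The two-phase schedule and the single-layer controller can be sketched as follows. Only the two durations (three and five seconds) come from the description above; the function names, the phase labels, and the weights used in the usage note are hypothetical and are not taken from the robots' firmware.

```python
from typing import Sequence, Tuple

AVOID_DURATION_S = 3.0     # obstacle avoidance phase (from the text)
DISPERSE_DURATION_S = 5.0  # disperse phase (from the text)


def disperse_program_phase(elapsed_s: float) -> str:
    """Return the active phase of the Disperse program (labels are assumed)."""
    if elapsed_s < 0:
        raise ValueError("elapsed time must be non-negative")
    if elapsed_s < AVOID_DURATION_S:
        return "obstacle_avoidance"
    if elapsed_s < AVOID_DURATION_S + DISPERSE_DURATION_S:
        return "disperse"
    return "finished"


def single_layer_controller(
    proximity: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> Tuple[float, ...]:
    """Map proximity readings to (linear, angular) speed.

    A single-layer network with linear outputs: each output is a bias
    plus a weighted sum of the sensor readings. The weights and bias
    are placeholders supplied by the caller.
    """
    return tuple(
        b + sum(w * p for w, p in zip(row, proximity))
        for row, b in zip(weights, bias)
    )
```

For example, with two sensors, `weights = [[-1.0, -1.0], [1.0, -1.0]]` and `bias = [1.0, 0.0]`, a reading of `[0.5, 0.0]` (an object near the first sensor) gives a reduced linear speed of 0.5 and an angular speed of 0.5, so the robot slows and turns.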

Implementation of the wall avoidance program during a trial (see Section 5.1.2 in the paper).

The aggregation behaviour was designed for an unbounded environment. To reduce the chances of robots getting stuck against the walls during a trial, we added a wall avoidance effect to the original behaviour in a way that did not affect inter-robot interactions.

In particular, a robot executed the same behaviour when it detected the white walls with its infrared sensors as when it saw another robot (I=1) with its camera.

For example, the agent would also turn on the spot when detecting a wall, which made it easier to avoid the walls.

However, the behaviour of the replica depends on the model it is executing.

Unlike the Disperse program, the wall avoidance program was triggered only when the reading of any of the robot's infrared sensors exceeded a pre-set high threshold. Because the threshold was high, the readings caused by nearby robots (which were covered with a black 'skirt') always remained below it. Therefore, the wall avoidance program did not affect the aggregation of the robots.
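
The trigger logic can be sketched as follows. This is an illustration only: the threshold value is not given here, so it is passed in as a parameter, and the function names are hypothetical.

```python
from typing import Sequence


def wall_detected(ir_readings: Sequence[float], threshold: float) -> bool:
    """True if any infrared reading exceeds the pre-set high threshold.

    The threshold is assumed to be chosen so that readings caused by
    other robots (covered with a black 'skirt') stay below it.
    """
    return any(reading > threshold for reading in ir_readings)


def effective_input(
    camera_sees_robot: bool,
    ir_readings: Sequence[float],
    threshold: float,
) -> int:
    """Return the binary input I passed to the behaviour.

    A detected wall is treated the same way as seeing another robot
    (I=1), so the robot executes the same behaviour in both cases.
    """
    return 1 if camera_sees_robot or wall_detected(ir_readings, threshold) else 0
```

With a hypothetical threshold of 0.8, readings of `[0.2, 0.9]` trigger wall avoidance (I=1) even when the camera sees nothing, whereas readings of `[0.2, 0.5]`, as might be produced by a nearby robot, leave the input determined by the camera alone.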

Natural Robotics Lab
