Enabled by advances in wireless communication and hardware miniaturization, distributed multi-agent autonomous systems have become a reality with the rise of swarm robotics. The field, which studies the control and collective behavior of teams of robots, has seen tremendous growth in recent years, finding applications in environmental monitoring, surveillance, reconnaissance, search and rescue, resource collection, anomaly detection, mobility, and transportation.
Such swarms are often deployed in environments characterized by uncertainty, where they must simultaneously learn the features of the environment and accomplish a task that depends on those features. In these scenarios, the swarm faces an inevitable tradeoff between exploration and exploitation: time spent learning the environment detracts from executing the task at hand, while time spent executing the task detracts from learning the environment.
To address this problem, we present a novel human-in-the-loop approach to balancing exploration and exploitation in robotic swarms, in which a Gaussian Process initialized from a human-supplied prior is combined with a random decision variable to transition smoothly from exploration to exploitation and converge on a locally optimal solution. We demonstrate a physical implementation of this algorithm using the open-source OpenSwarm library developed in our lab, and compare its performance against benchmark approaches in a physical setting. In particular, we apply the algorithm to an experiment in which a robotic swarm must learn the distribution of light intensity over a field, given a human prior, and converge to an equitable partition of the region with respect to light intensity. Our findings have broad implications for swarm robotics and, more generally, for explore-exploit tradeoffs in any domain.
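The core mechanism described above can be sketched in a minimal single-agent form. The sketch below is illustrative only and makes several assumptions not stated in the abstract: a one-dimensional light field, a squared-exponential kernel, a single human-prior observation, and an exponentially decaying exploration probability as the random decision variable. At each step, a Bernoulli draw decides whether the agent samples the point of highest posterior uncertainty (explore) or the point of highest posterior mean (exploit).

```python
import numpy as np

def rbf_kernel(A, B, length_scale=0.2, variance=1.0):
    """Squared-exponential kernel between the row vectors of A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior(X_train, y_train, X_query, noise=1e-4):
    """Posterior mean and variance of a zero-mean GP at the query points."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    Ks = rbf_kernel(X_train, X_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(rbf_kernel(X_query, X_query)) - np.sum(v**2, axis=0)
    return mu, np.maximum(var, 0.0)

rng = np.random.default_rng(0)
field = lambda x: np.exp(-8 * (x[:, 0] - 0.7)**2)  # hypothetical light field, peak at 0.7
grid = np.linspace(0, 1, 101)[:, None]             # candidate sampling locations

# Hypothetical human prior: one initial observation near the believed peak.
X = np.array([[0.5]])
y = field(X)

for t in range(25):
    mu, var = gp_posterior(X, y, grid)
    p_explore = np.exp(-t / 8)        # decaying exploration probability
    if rng.random() < p_explore:      # random decision variable
        idx = np.argmax(var)          # explore: most uncertain location
    else:
        idx = np.argmax(mu)           # exploit: predicted brightest location
    x_new = grid[idx:idx + 1]
    X = np.vstack([X, x_new])
    y = np.concatenate([y, field(x_new)])

best = grid[np.argmax(gp_posterior(X, y, grid)[0])]
```

Because `p_explore` decays with `t`, early iterations spread samples across the field while later iterations concentrate near the estimated maximum, mirroring the smooth exploration-to-exploitation transition the algorithm targets.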