Automated behavior learning for robotic soccer

Serra, Rui Pedro Alexandre

Please use this identifier to cite or link to this item: http://hdl.handle.net/10773/12805

Title:	Automated behavior learning for robotic soccer
Other Titles:	Aprendizagem automática de comportamentos para futebol robótico
Author:	Serra, Rui Pedro Alexandre
Advisor:	Lau, Nuno; Lopes, Luís Seabra
Keywords:	Computer engineering; Autonomous robots; Robotics - Competition
Defense Date:	2013
Publisher:	Universidade de Aveiro
Abstract:	A soccer-playing robot must be able to carry out a set of behaviors whose complexity can vary greatly. Manually programming a robot to accomplish those behaviors can be a difficult and time-consuming process. Automated learning techniques become interesting in this setting because they allow behaviors to be learned from only a very high-level description of the task, leaving the details to be figured out by the learning agent. Reinforcement Learning takes inspiration from nature and animal learning to model agents that interact with an environment, choosing actions that are more likely to lead them to accumulate rewards and avoid punishment. As agents observe the environment and the effects of their actions, they gain experience, which is used to derive a policy. Agents can do this immediately after observing the effect of their last action, or after collecting batches of such observations. The latter alternative, called Batch Reinforcement Learning, has been used in real-world applications with very promising results. This thesis explores the use of Batch Reinforcement Learning for learning robotic soccer behaviors, including dribbling the ball and receiving a pass. Practical experiments were undertaken with the CAMBADA simulator, as well as with the CAMBADA robots.
Description:	Master's degree in Computer and Telematics Engineering (Mestrado em Engenharia de Computadores e Telemática)
URI:	http://hdl.handle.net/10773/12805
Appears in Collections:	UA - Dissertações de mestrado; DETI - Dissertações de mestrado

Files in This Item:

File	Description	Size	Format
Tese .pdf		25.55 MB	Adobe PDF
