Export Publication
The publication can be exported in the following formats: APA (American Psychological Association) reference, IEEE (Institute of Electrical and Electronics Engineers) reference, BibTeX, and RIS.
Nunes, L., Jardim, D., & Oliveira, S. (2011). Hierarchical reinforcement learning: Learning sub-goals and state-abstraction. In AISTI - Associação Ibérica de Sistemas e Tecnologias de Informação (Ed.), 6th Iberian Conference on Information Systems and Technologies (CISTI 2011) (pp. 245-248). Chaves, Portugal: IEEE.
L. M. Nunes et al., "Hierarchical reinforcement learning: Learning sub-goals and state-abstraction", in 6th Iberian Conf. on Information Systems and Technologies (CISTI 2011), AISTI - Associação Ibérica de Sistemas e Tecnologias de Informação, Ed., Chaves, Portugal, IEEE, 2011, pp. 245-248.
@inproceedings{nunes2011_1766440002531,
author = "Nunes, Luis and Jardim, D. and Oliveira, S.",
title = "Hierarchical reinforcement learning: Learning sub-goals and state-abstraction",
booktitle = "6th Iberian Conference on Information Systems and Technologies (CISTI 2011)",
year = "2011",
editor = "AISTI - Associação Ibérica de Sistemas e Tecnologias de Informação",
pages = "245--248",
publisher = "IEEE",
address = "Chaves, Portugal",
organization = "AISTI - Associação Ibérica de Sistemas e Tecnologias de Informação",
url = "https://ieeexplore.ieee.org/xpl/conhome/5962051/proceeding"
}
TY  - CPAPER
TI  - Hierarchical reinforcement learning: Learning sub-goals and state-abstraction
T2  - 6th Iberian Conference on Information Systems and Technologies (CISTI 2011)
AU  - Nunes, Luis
AU  - Jardim, D.
AU  - Oliveira, S.
PY  - 2011
SP  - 245-248
SN  - 2166-0727
CY  - Chaves, Portugal
UR  - https://ieeexplore.ieee.org/xpl/conhome/5962051/proceeding
AB  - In this paper we present a method that allows an agent to discover and create temporal abstractions autonomously. Our method is based on the concept that to reach the goal, the agent must pass through relevant states that we will interpret as subgoals. To detect useful subgoals, our method creates intersections between several paths leading to a goal. Our research focused on domains largely used in the study of temporal abstractions. We used several versions of the room-to-room navigation problem. We determined that, in the problems tested, an agent can learn more rapidly by automatically discovering subgoals and creating abstractions.
ER  -