Interactive open-ended learning for 3D object recognition

Kasaei, Seyed Hamidreza Mohades

Please use this identifier to cite or link to this item: http://hdl.handle.net/10773/26803

Title:	Interactive open-ended learning for 3D object recognition
Other Titles:	Aprendizagem contínua interativa para reconhecimento de objetos 3D
Author:	Kasaei, Seyed Hamidreza Mohades
Advisor:	Lopes, Luís Seabra; Tomé, Ana Maria Perfeito
Keywords:	3D object perception; Open-ended learning of object categories; Architectures of learning; Human-robot interaction; Robotic
Defense Date:	10-Apr-2019
Abstract:	Current object category learning and recognition approaches are typically designed for static environments in which it is viable to separate the training (off-line) and testing (on-line) phases. In such scenarios, the learned object category models are static, in the sense that the representation of the known categories does not change after the training stage. However, to migrate a robot to a new environment, one must often completely redesign and remodel the knowledge base it runs on. The thesis contributes in several important ways to the research area of 3D object category learning and recognition. To cope with these limitations, we look at human cognition, in particular at the fact that human beings learn to recognize object categories ceaselessly over time. This ability to refine and extend knowledge from the set of accumulated experiences facilitates adaptation to new environments. Inspired by this capability, we seek to create a cognitive object perception and perceptual learning architecture that can learn 3D object categories in an open-ended fashion. In this context, “open-ended” implies that the set of categories to be learned is not known in advance, and that the training instances are extracted from the actual experiences of a robot and thus become available gradually, rather than being available from the beginning of the learning process. This architecture provides perception capabilities that will allow robots to incrementally learn object categories from the set of accumulated experiences and reason about how to perform complex tasks. The framework integrates detection, tracking, teaching, learning and recognition of objects. An important part of this work is concerned with object representation. This is one of the most challenging problems in robotics, because the representation must provide reliable information in real time to enable the robot to physically interact with the objects in its environment.
We first tackled the problem of object representation by proposing a new global object descriptor named Global Orthographic Object Descriptor (GOOD). This descriptor distinguishes itself from alternative 3D global object representations in that it is very fast to compute, robust against variations in pose and sampling density, and copes well with noisy sensor data. We also propose an extension of Latent Dirichlet Allocation to learn structural semantic features (i.e. topics) from local feature co-occurrences for each object category independently. Open-ended learning for 3D object category recognition is the core problem in this thesis. Both instance-based and model-based approaches were explored for incrementally scaling up to larger sets of categories. Finally, a novel experimental evaluation methodology that takes into account the open-ended nature of object category learning in multi-context scenarios is proposed and applied. An extensive set of systematic experiments, in multiple experimental settings, was carried out to thoroughly evaluate the described learning approaches. Experimental results show that the proposed system is able to interact with human users, learn new object categories over time, and perform complex tasks. The contributions presented in this thesis have been fully implemented, evaluated on different standard object and scene datasets, and empirically evaluated on different robotic platforms.
Portuguese abstract (translated):	Current approaches to learning and recognizing object categories are typically designed for static environments, in which it is viable to separate training (off-line) from the use of the learned knowledge (on-line). In such scenarios the knowledge is static, in the sense that the representation of the categories does not change after the training phase. However, migrating a robot to a new environment often requires completely redesigning the knowledge base.
The thesis contributes on several fronts to research in 3D object category learning and recognition. To deal with the limitations mentioned, we look at human cognition, in particular at the fact that human beings learn to recognize object categories ceaselessly. This capacity to refine and extend knowledge on the basis of accumulated experience facilitates adaptation to new environments. Inspired by this capacity, we seek to create a cognitive architecture for object perception and perceptual learning that is able to learn 3D object categories in an open-ended manner. In this context, the set of categories to be learned is initially unknown, and the training instances are gradually extracted from the agent's observations rather than being available from the start of the process. The architecture thus provides perception capabilities that will allow robots to learn categories incrementally from accumulated experiences and to reason about the execution of complex tasks. The architecture integrates detection, tracking, teaching, learning and recognition of object categories. An important part of this work focuses on object representation, which must be reliable and computable in real time so that the robot can physically interact with the objects in its environment. We address the representation problem by proposing a new global 3D object descriptor named Global Orthographic Object Descriptor (GOOD). This descriptor differs from other global representations in being fast to compute and robust to variations in pose, variations in sampling density, and noise. We also propose a modification of the Latent Dirichlet Allocation technique to learn semantic features (topics) from co-occurrences of local features. The central problem of this thesis is open-ended learning for 3D object category recognition.
Both instance-based and model-based approaches were explored for incremental, open-ended category learning. Finally, a new experimental evaluation methodology, which takes into account the open-ended nature of category learning in multi-context scenarios, is proposed and used. A systematic experimental evaluation of the proposed approaches was carried out in multiple experimental settings. The experimental results show that the proposed system is able to interact with human users, learn new object categories over time, and perform complex tasks. The contributions presented in this thesis were fully implemented and evaluated on different datasets, of both objects and scenes, and evaluated empirically on different robotic platforms.
URI:	http://hdl.handle.net/10773/26803
Appears in Collections:	UA - Teses de doutoramento; DETI - Teses de doutoramento

Files in This Item:

File	Description	Size	Format
Documento.pdf		58.8 MB	Adobe PDF
