Please use this identifier to cite or link to this item: http://hdl.handle.net/10773/21767
Title: Natural language generation in the context of multimodal interaction in Portuguese : Data-to-text based in automatic translation
Other Titles: Geração de linguagem natural no âmbito de interação multimodal em português : conversão de dados para texto baseada em tradução automática
Author: Pereira, José Casimiro
Advisor: Teixeira, António Joaquim da Silva
Pinto, Joaquim Manuel Henriques de Sousa
Keywords: Computer science
Portuguese language
Computational linguistics
Machine translation
Natural language processing (Computer science)
Defense Date: 23-Jul-2017
Publisher: Universidade de Aveiro
Abstract: Portuguese abstract not available
To enable interaction by text and/or speech, it is essential to devise systems capable of translating internal data into sentences or texts that can be shown on screen or heard by users. In this context, these natural language generation (NLG) systems must provide sentences in the native language of the users (in our case, European Portuguese) and allow easy development and integration, while producing output that is perceived as natural. Creating high-quality NLG systems is not an easy task, even for a small domain. The main difficulties arise from: classic approaches being very demanding in know-how and development time; a lack of variability in the sentences generated by most methods; difficulty in accessing complete tools; a shortage of resources, such as large corpora; and support being available for only a limited number of languages. The main goal of this work was to propose, develop and test a method for converting data to Portuguese that can be developed with the smallest possible amount of time and resources, while still generating utterances with variability and quality. The thesis argues that this goal can be achieved by adopting data-driven language generation (more precisely, generation based on language translation) and by following an Engineering Research Methodology. Two Data2Text NLG systems are presented. They were designed to provide a way to quickly develop an NLG system that generates good-quality sentences. The proposed systems use freely available tools and can be developed by people with limited linguistic expertise. One important characteristic is the use of statistical machine translation techniques; this approach requires only a small natural language corpus, resulting in easier and cheaper development than more common approaches.
The main result of this thesis is the demonstration that, by following the proposed approach, it is possible to create systems capable of translating information/data into good-quality sentences in Portuguese. This is done without major effort in resource creation and with the general knowledge of an experienced application developer. The systems created, particularly the hybrid system, provide a good solution for data-to-text conversion problems.
Description: Doctorate in Informatics
URI: http://hdl.handle.net/10773/21767
Appears in Collections: UA - Teses de doutoramento
DETI - Teses de doutoramento

Files in This Item:
File: osé Casimiro Pereira.pdf
Size: 5.98 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.