Use this identifier to reference this record: http://hdl.handle.net/10773/25112
Title: Biomedical word sense disambiguation with word embeddings
Author: Antunes, Rui
Matos, Sérgio
Keywords: Biomedical word sense disambiguation
Word embeddings
Date: 21-Jun-2017
Publisher: Springer
Abstract: There is a growing need for automatic extraction of information and knowledge from the increasing amount of biomedical and clinical data being produced, particularly in textual form. Natural language processing addresses this need, supporting tasks such as information extraction and information retrieval. Word sense disambiguation is an important part of this process, being responsible for assigning the proper concept to an ambiguous term. In this paper, we present results from machine learning and knowledge-based algorithms applied to biomedical word sense disambiguation. For the supervised machine learning algorithms, we used word embeddings, calculated from the full MEDLINE literature database, as global features and compared the results to the use of local unigram and bigram features. For the knowledge-based method, we represented the textual definitions of biomedical concepts from the UMLS database as word embedding vectors, and combined this with concept associations derived from MeSH term co-occurrences. Both the machine learning and the knowledge-based results indicate that word embeddings are informative and improve biomedical word sense disambiguation accuracy. Applied to the reference MSH WSD data set, our knowledge-based approach achieves 85.1% disambiguation accuracy, which is higher than some previously proposed approaches that do not use machine-learning strategies.
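The knowledge-based idea in the abstract (representing concept definitions as word embedding vectors and comparing them with the context of an ambiguous term) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy 3-dimensional vectors, the sense labels, the averaging of word vectors, and the use of cosine similarity are not taken from the paper, and the sketch omits the MeSH co-occurrence associations that the authors combine with the definition vectors.

```python
import math

# Toy 3-dimensional embeddings with made-up values (assumption);
# the paper uses embeddings computed from MEDLINE instead.
EMBEDDINGS = {
    "cold": [0.5, 0.5, 0.0],
    "temperature": [0.9, 0.1, 0.0],
    "low": [0.8, 0.2, 0.1],
    "weather": [0.9, 0.0, 0.1],
    "virus": [0.1, 0.9, 0.0],
    "infection": [0.0, 0.9, 0.1],
    "nasal": [0.1, 0.8, 0.2],
    "patient": [0.2, 0.7, 0.1],
}


def average_vector(tokens):
    """Average the embeddings of the known tokens; None if none are known."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def disambiguate(context_tokens, sense_definitions):
    """Return the sense whose definition vector is closest to the context."""
    context_vec = average_vector(context_tokens)
    if context_vec is None:
        return None
    best_sense, best_score = None, float("-inf")
    for sense, definition_tokens in sense_definitions.items():
        definition_vec = average_vector(definition_tokens)
        if definition_vec is None:
            continue  # no usable words in this definition
        score = cosine(context_vec, definition_vec)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical sense labels and definitions for the ambiguous term "cold".
SENSES = {
    "SENSE_LOW_TEMPERATURE": ["low", "temperature", "weather"],
    "SENSE_COMMON_COLD": ["virus", "infection", "nasal"],
}

print(disambiguate(["patient", "nasal", "infection", "cold"], SENSES))
```

With these toy vectors, a clinical context is matched to the infection sense and a weather-related context to the temperature sense. A real system would replace the dictionary with trained embeddings and the hand-written definitions with UMLS definition text.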
Peer review: yes
URI: http://hdl.handle.net/10773/25112
DOI: 10.1007/978-3-319-60816-7_33
ISBN: 978-3-319-60815-0
Publisher's version: https://link.springer.com/chapter/10.1007%2F978-3-319-60816-7_33
Appears in collections: IEETA - Capítulo de livro (book chapters)

Files in this record:
File | Description | Size | Format
paper.pdf | | 202.7 kB | Adobe PDF



All records in the repository are protected by copyright law, with all rights reserved.