Please use this identifier to cite or link to this item:
http://hdl.handle.net/10773/22980
Title: | Discovery of newsworthy events in Twitter |
Author: | Duarte, Fernando Fradique; Pereira, Óscar Mortágua; Aguiar, Rui |
Keywords: | Social Media; Event Detection; SVM; Machine Learning; Dynamic Programming |
Issue Date: | 21-Mar-2018 |
Publisher: | SciTePress |
Abstract: | The new communication paradigm established by Social Media, along with its growing popularity in recent years, has attracted increasing interest from several research fields. One such field is event detection in Social Media. The purpose of this work is to implement a system that detects newsworthy events in Twitter, using a similar system proposed in the literature as its base. To this end, a segmentation algorithm implemented with a dynamic programming approach is proposed to split the tweets into segments. Wikipedia is then leveraged as an additional factor to rank these segments. The top k segments in this ranking are grouped according to their similarity using a variant of the Jarvis-Patrick clustering algorithm. The resulting candidate events are filtered using an SVM model trained on annotated data, in order to retain only those related to real-world newsworthy events. The implemented system was tested with three months of data, a total of 4,770,636 tweets created in Portugal and mostly written in Portuguese. The system achieved a precision of 76.9% and a recall of 41.6%. |
Peer review: | yes |
URI: | http://hdl.handle.net/10773/22980 |
DOI: | 10.5220/0006712702440252 |
ISBN: | 978-989-758-296-7 |
Appears in Collections: | DETI - Comunicações; IT - Artigos |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
(CP) - 2018-03-21 (IoTBDS - Madeira - Portugal) Discovery of Newsworthy Events in Twitter.pdf | Main document | 1.07 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
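
The abstract mentions a dynamic programming approach for splitting tweets into segments but does not give the scoring function or the recurrence. As a minimal sketch of how such a segmentation can work in general, the example below picks the split of a token sequence that maximizes the sum of per-segment scores. The function names `segment` and `toy_score`, the `max_len` limit, and the toy score table are all assumptions made for illustration; they are not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

Segment = Tuple[str, ...]


def segment(tokens: Sequence[str],
            score: Callable[[Segment], float],
            max_len: int = 3) -> List[Segment]:
    """Split tokens into contiguous segments that maximize the summed score.

    best[i] holds the best total score for tokens[:i]; back[i] holds the
    start index of the last segment in that best split.
    """
    n = len(tokens)
    best = [float("-inf")] * (n + 1)
    best[0] = 0.0
    back = [0] * (n + 1)
    for i in range(1, n + 1):
        # Try every last segment tokens[j:i] of length 1..max_len.
        for j in range(max(0, i - max_len), i):
            candidate = best[j] + score(tuple(tokens[j:i]))
            if candidate > best[i]:
                best[i] = candidate
                back[i] = j
    # Walk the back-pointers from the end to recover the segments.
    segments: List[Segment] = []
    i = n
    while i > 0:
        j = back[i]
        segments.append(tuple(tokens[j:i]))
        i = j
    segments.reverse()
    return segments


# Hypothetical scores: known phrases score high, single words score 1,
# and any other multi-word span scores 0. Not the paper's scoring function.
TOY_SCORES = {("new", "york"): 3.0, ("world", "cup"): 3.0}


def toy_score(seg: Segment) -> float:
    if seg in TOY_SCORES:
        return TOY_SCORES[seg]
    return 1.0 if len(seg) == 1 else 0.0


print(segment("new york hosts world cup".split(), toy_score))
# [('new', 'york'), ('hosts',), ('world', 'cup')]
```

In a real system the toy table would be replaced by a score computed from corpus statistics; the abstract indicates that Wikipedia is used as an additional factor when ranking the segments, but the exact formula is not stated in this record.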