Please use this identifier to cite or link to this item: http://hdl.handle.net/10773/22980
Title: Discovery of newsworthy events in Twitter
Author: Duarte, Fernando Fradique
Pereira, Óscar Mortágua
Aguiar, Rui
Keywords: Social Media
Twitter
Event Detection
SVM
Machine Learning
Dynamic Programming
Issue Date: 21-Mar-2018
Publisher: SciTePress
Abstract: The new communication paradigm established by Social Media, along with its growing popularity in recent years, has attracted increasing interest from several research fields. One such field is event detection in Social Media. The purpose of this work is to implement a system that detects newsworthy events in Twitter, using a similar system proposed in the literature as its base. To this end, a segmentation algorithm implemented with a dynamic programming approach is proposed to split the tweets into segments. Wikipedia is then leveraged as an additional factor to rank these segments. The top k segments in this ranking are grouped according to their similarity using a variant of the Jarvis-Patrick clustering algorithm. The resulting candidate events are filtered using an SVM model trained on annotated data, in order to retain only those related to real-world newsworthy events. The implemented system was tested with three months of data, a total of 4,770,636 tweets created in Portugal and mostly written in Portuguese. The system obtained a precision of 76.9% and a recall of 41.6%.
Peer review: yes
URI: http://hdl.handle.net/10773/22980
DOI: 10.5220/0006712702440252
ISBN: 978-989-758-296-7
Appears in Collections: DETI - Comunicações
IT - Artigos
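
Illustrative note on the segmentation step mentioned in the abstract: the record does not give the authors' scoring function or recurrence, so the following is only a minimal sketch of how a dynamic-programming segmenter of this general kind might look. The names `segment`, `toy_score`, `TOY_SCORES`, and the `max_len` limit are hypothetical, and the toy scores stand in for whatever segment-quality measure the paper actually uses.

```python
from typing import Callable, List, Sequence, Tuple

Segment = Tuple[str, ...]


def segment(tokens: Sequence[str],
            score: Callable[[Segment], float],
            max_len: int = 3) -> List[Segment]:
    """Split tokens into consecutive segments maximizing the summed score.

    best[i] holds the best total score for the first i tokens;
    back[i] holds the start index of the last segment in that solution.
    """
    n = len(tokens)
    best = [0.0] + [float("-inf")] * n
    back = [0] * (n + 1)
    for end in range(1, n + 1):
        # Try every segment of length <= max_len that ends at `end`.
        for start in range(max(0, end - max_len), end):
            candidate = best[start] + score(tuple(tokens[start:end]))
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start
    # Walk the back-pointers from the end to recover the segments.
    segments: List[Segment] = []
    end = n
    while end > 0:
        start = back[end]
        segments.append(tuple(tokens[start:end]))
        end = start
    segments.reverse()
    return segments


# Hypothetical scores: known phrases score high, single words score 1,
# unknown multi-word segments score 0 (assumption for illustration only).
TOY_SCORES = {("world", "cup"): 3.0, ("cristiano", "ronaldo"): 3.0}


def toy_score(seg: Segment) -> float:
    if seg in TOY_SCORES:
        return TOY_SCORES[seg]
    return 1.0 if len(seg) == 1 else 0.0


if __name__ == "__main__":
    print(segment("cristiano ronaldo scores in world cup".split(), toy_score))
```

With these toy scores, the example tweet is split into `("cristiano", "ronaldo")`, `("scores",)`, `("in",)`, `("world", "cup")`. Limiting segments to `max_len` tokens keeps the search at O(n · max_len) score evaluations.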

Files in This Item:
File: (CP) - 2018-03-21 (IoTBDS - Madeira - Portugal) Discovery of Newsworthy Events in Twitter.pdf
Description: Main document
Size: 1.07 MB
Format: Adobe PDF
Access: restricted



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.