Information Quality in Online Social Networks: A Fast Unsupervised Social Spam Detection Method for Trending Topics - Université Toulouse III - Paul Sabatier - Toulouse INP
Conference paper, Year: 2017

Information Quality in Online Social Networks: A Fast Unsupervised Social Spam Detection Method for Trending Topics

Abstract

Online social networks (OSNs) provide valuable data for a tremendous range of applications, such as search engines and recommendation systems. However, their easy-to-use interactive interfaces and low barriers to publication have exposed various information quality (IQ) problems, decreasing the quality of user-generated content (UGC) in such networks. The existence of a particular kind of ill-intentioned user, the so-called social spammer, makes it challenging to maintain an acceptable level of information quality. Social spammers misuse all the services provided by social networks to post spam content in an automated way. In response, various detection methods have been designed that inspect individual posts or accounts for the existence of spam. These methods share a major limitation: they rely on supervised learning, which requires ground-truth datasets at model-building time. Moreover, account-based detection methods are not practical for processing large crawled collections of social posts, as they require months to process such collections. Post-level detection methods have another drawback: they do not adapt robustly to the dynamic behavior of spammers, because post-level features are weak at discriminating between spam and non-spam tweets. Hence, in this paper, we introduce the design of an unsupervised learning approach dedicated to detecting spam accounts (or users) in large collections of Twitter trending topics. More precisely, our method leverages the simple meta-data available about users and the published posts (tweets) related to a topic, as collective heuristic information, to find any behavioral correlation among spam users acting as a spam campaign. Compared to account-based spam detection methods, our experimental evaluation demonstrates the efficiency of predicting spam accounts (users) in terms of accuracy, precision, recall, and F-measure.

Dates and versions

hal-03116278 , version 1 (20-01-2021)

License

Attribution - NonCommercial - NoDerivatives

Cite

Mahdi Washha, Dania Shilleh, Yara Ghawadrah, Reem Jazi, Florence Sèdes. Information Quality in Online Social Networks: A Fast Unsupervised Social Spam Detection Method for Trending Topics. 19th International Conference on Enterprise Information Systems (ICEIS 2017), Apr 2017, Porto, Portugal. pp. 633-675, ⟨10.5220/0006372006630675⟩. ⟨hal-03116278⟩