Information Quality in Online Social Networks: A Fast Unsupervised Social Spam Detection Method for Trending Topics

Mahdi Washha; Dania Shilleh; Yara Ghawadrah; Reem Jazi; Florence Sèdes

doi:10.5220/0006372006630675

Conference paper, Year: 2017


Mahdi Washha
Role: Author
Systèmes d’Informations Généralisées

Dania Shilleh
Role: Author
Birzeit University

Yara Ghawadrah
Role: Author
Birzeit University

Reem Jazi
Role: Author
Birzeit University

Florence Sèdes
Role: Author
PersonId: 735498
IdHAL: florence-sedes
ORCID: 0000-0002-9273-302X
IdRef: 033232679
Systèmes d’Informations Généralisées

Abstract

Online social networks (OSNs) provide data valuable for a tremendous range of applications, such as search engines and recommendation systems. However, their easy-to-use interactive interfaces and low barriers to publication have exposed various information quality (IQ) problems, decreasing the quality of user-generated content (UGC) in such networks. The existence of a particular kind of ill-intentioned users, so-called social spammers, makes it challenging to maintain an acceptable level of information quality. Social spammers misuse the services provided by social networks to post spam content in an automated way. In response, various detection methods have been designed that inspect individual posts or accounts for the existence of spam. These methods share a major limitation: they rely on supervised learning, which requires ground-truth datasets at model-building time. Moreover, account-based detection methods are not practical for processing large "crawled" collections of social posts, requiring months to process such collections. Post-level detection methods have a further drawback: they do not adapt robustly to the dynamic behavior of spammers, because post-level features discriminate only weakly between spam and non-spam tweets. Hence, in this paper, we introduce an unsupervised learning approach dedicated to detecting spam accounts (or users) in large collections of Twitter trending topics. More precisely, our method leverages the simple meta-data available about users and the published posts (tweets) related to a topic, as collective heuristic information, to find any behavioral correlation among spam users acting as a spam campaign. Compared to account-based spam detection methods, our experimental evaluation demonstrates the efficiency of our method in predicting spam accounts (users) in terms of accuracy, precision, recall, and F-measure.
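The four performance metrics named in the abstract have standard definitions, which the following minimal sketch illustrates. It is not taken from the paper: the function name `spam_detection_metrics`, the boolean label encoding (True for a spam account), and the sample labels are assumptions made for illustration only.

```python
def spam_detection_metrics(true_labels, predicted_labels):
    """Return accuracy, precision, recall and F-measure.

    Assumption: labels are booleans, with True meaning "spam account",
    so spam is treated as the positive class.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t and p)          # spam predicted as spam
    tn = sum(1 for t, p in pairs if not t and not p)  # non-spam predicted as non-spam
    fp = sum(1 for t, p in pairs if not t and p)      # non-spam predicted as spam
    fn = sum(1 for t, p in pairs if t and not p)      # spam predicted as non-spam

    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical labels for eight accounts (not data from the paper).
truth = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False]
metrics = spam_detection_metrics(truth, predicted)
```

Treating spam as the positive class means precision measures how many flagged accounts are really spam, while recall measures how many spam accounts were flagged.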

Keywords

Twitter; Social Networks; Spam

Domains

Computer Science [cs]

Contributor: Documentation IRIT

https://hal.science/hal-03116278

Submitted on: Wednesday, January 20, 2021, 11:25:33

Last modified on: Monday, November 20, 2023, 11:44:23

Dates and versions

hal-03116278, version 1 (20-01-2021)

License

Attribution - NonCommercial - NoDerivatives

Identifiers

HAL Id: hal-03116278, version 1
DOI: 10.5220/0006372006630675

Cite

Mahdi Washha, Dania Shilleh, Yara Ghawadrah, Reem Jazi, Florence Sèdes. Information Quality in Online Social Networks: A Fast Unsupervised Social Spam Detection Method for Trending Topics. 19th International Conference on Enterprise Information Systems (ICEIS 2017), Apr 2017, Porto, Portugal. pp. 663-675, ⟨10.5220/0006372006630675⟩. ⟨hal-03116278⟩

Collections

UNIV-TLSE2 CNRS SMS UT1-CAPITOLE IRIT IRIT-SIG IRIT-GD IRIT-UT3 TOULOUSE-INP UNIV-UT3 UT3-TOULOUSEINP
