Information Quality in Social Networks: Predicting Spammy Naming Patterns for Retrieving Twitter Spam Accounts

Mahdi Washha; Aziz Qaroush; Manel Mezghani; Florence Sèdes

doi:10.5220/0006314006100622

Conference paper, Year: 2017


Mahdi Washha

Role: Author

Systèmes d’Informations Généralisées

Aziz Qaroush

Role: Author

Birzeit University, Palestinian Territory, Occupied

Manel Mezghani

Role: Author
PersonId : 1165320
IdRef : 195589572

Systèmes d’Informations Généralisées

Florence Sèdes

Role: Author
PersonId : 735498
IdHAL : florence-sedes
ORCID : 0000-0002-9273-302X
IdRef : 033232679

Systèmes d’Informations Généralisées

Abstract

The popularity of social networks depends mainly on the integrity and quality of user-generated content and on the protection of users’ privacy. More precisely, Twitter data (e.g. tweets) are valuable for a wide range of applications, such as search engines and recommendation systems, in which working with high-quality information is a compulsory step. However, the existence of ill-intentioned users on Twitter makes it challenging to maintain an acceptable level of data quality. Spammers are a concrete example of ill-intentioned users. Indeed, they have misused all services provided by Twitter to post spam content, which consequently leads to serious problems such as polluting search results. As a natural reaction, various detection methods have been designed that inspect individual tweets or accounts for the existence of spam. In the context of large collections of Twitter users, applying these conventional methods is time-consuming, requiring months to filter out spam accounts in such collections. Moreover, the Twitter community cannot apply them either randomly or sequentially to each registered user because of the dynamic nature of the Twitter network. Consequently, these limitations raise the need to make the detection process more systematic and faster. Complementary to the conventional detection methods, our proposal takes the collective perspective of users (or accounts) to provide searchable information for retrieving accounts with a high potential of being spam. We design an unsupervised automatic method to predict the spammy naming patterns used in naming spam accounts, which serve as this searchable information. Our experimental evaluation demonstrates the efficiency of predicting spammy naming patterns to retrieve spam accounts in terms of precision, recall, and normalized discounted cumulative gain at different ranks.

Keywords

Social networks; Spam; Twitter

Domains

Information Theory [cs.IT]; Information Retrieval [cs.IR]

Main file

washha_18971.pdf (1.2 MB)

Origin: Files produced by the author(s)

Open Archive Toulouse Archive Ouverte (OATAO)

https://hal.science/hal-01809318

Submitted on: Wednesday, June 6, 2018, 15:59:44

Last modified on: Thursday, February 8, 2024, 15:00:58

Long-term archiving on: Friday, September 7, 2018, 14:04:26

Dates and versions

hal-01809318 , version 1 (06-06-2018)

Identifiers

HAL Id : hal-01809318 , version 1
DOI : 10.5220/0006314006100622
OATAO : 18971

Cite

Mahdi Washha, Aziz Qaroush, Manel Mezghani, Florence Sèdes. Information Quality in Social Networks: Predicting Spammy Naming Patterns for Retrieving Twitter Spam Accounts. 19th International Conference on Enterprise Information Systems (ICEIS 2017), Apr 2017, Porto, Portugal. pp.610-622, ⟨10.5220/0006314006100622⟩. ⟨hal-01809318⟩


Collections

UNIV-TLSE2 CNRS SMS UT1-CAPITOLE IRIT IRIT-SIG IRIT-GD IRIT-UT3 TOULOUSE-INP UNIV-UT3 UT3-TOULOUSEINP

150 views

100 downloads
