Experience Report: Log Mining using Natural Language Processing and Application to Anomaly Detection - Université Toulouse III - Paul Sabatier - Toulouse INP
Conference paper, Year: 2017

Experience Report: Log Mining using Natural Language Processing and Application to Anomaly Detection

Abstract

Event logging is a key source of information on a system's state. Reading logs provides insight into its activity, helps assess whether it is operating correctly, and allows problems to be diagnosed. However, reading does not scale: with the number of machines steadily rising and systems growing more complex, the task of auditing systems' health from logfiles is becoming overwhelming for system administrators. This observation has led to many proposals for automating the processing of logs. However, most of these proposals still require some human intervention, for instance tagging logs or parsing the source files that generate the logs. In this work, we target minimal human intervention for logfile processing and propose a new approach that treats logs as regular text (as opposed to related works that seek to make the most of the little structure imposed by log formatting). This approach makes it possible to leverage modern techniques from natural language processing. More specifically, we first apply a word embedding technique based on Google's word2vec algorithm: the words of the logfiles are mapped to a high-dimensional metric space, which we then exploit as a feature space using standard classifiers. The resulting pipeline is very generic, computationally efficient, and requires very little intervention. We validate our approach by seeking stress patterns on an experimental platform. Results show strong predictive performance (≈ 90% accuracy) using three out-of-the-box classifiers.
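The two-stage pipeline described in the abstract (word vectors first, a classifier on top) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration and not the authors' implementation: `TOY_EMBEDDING` is a hand-written 2-D table standing in for a trained word2vec model, averaging the word vectors of a log is one plausible way to obtain a feature vector, a nearest-centroid rule stands in for the off-the-shelf classifiers, and the log lines and the `normal`/`stressed` labels are made up.

```python
from math import dist

# Hypothetical 2-D word vectors standing in for a trained word2vec model.
TOY_EMBEDDING = {
    "started": (1.0, 0.0),
    "ok": (0.9, 0.1),
    "connected": (0.8, 0.2),
    "timeout": (0.1, 0.9),
    "retry": (0.2, 0.8),
    "failed": (0.0, 1.0),
}


def embed_log(text, embedding=TOY_EMBEDDING):
    """Treat a log as plain text and return the mean vector of its known words."""
    vectors = [embedding[w] for w in text.lower().split() if w in embedding]
    if not vectors:
        raise ValueError("no known words in log")
    # zip(*vectors) groups the coordinates dimension by dimension.
    return tuple(sum(col) / len(col) for col in zip(*vectors))


def fit_centroids(logs, labels):
    """Nearest-centroid 'training': one mean feature vector per label."""
    grouped = {}
    for text, label in zip(logs, labels):
        grouped.setdefault(label, []).append(embed_log(text))
    return {
        label: tuple(sum(col) / len(col) for col in zip(*vecs))
        for label, vecs in grouped.items()
    }


def predict(text, centroids):
    """Return the label whose centroid is closest to the log's feature vector."""
    point = embed_log(text)
    return min(centroids, key=lambda label: dist(point, centroids[label]))


# Made-up training logs; words missing from the table are simply ignored.
train_logs = [
    "service started ok",
    "client connected ok",
    "request timeout retry",
    "retry failed timeout",
]
train_labels = ["normal", "normal", "stressed", "stressed"]
centroids = fit_centroids(train_logs, train_labels)
```

With this toy data, `predict("connected started", centroids)` falls on the `normal` side and `predict("failed retry", centroids)` on the `stressed` side. In a realistic setting the embedding would be learned from the logfiles themselves, and the nearest-centroid rule would be replaced by a standard classifier.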
Main file
PID4955309.pdf (431.76 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01576291, version 1 (22-08-2017)

Identifiers

  • HAL Id: hal-01576291, version 1

Cite

Christophe Bertero, Matthieu Roy, Carla Sauvanaud, Gilles Trédan. Experience Report: Log Mining using Natural Language Processing and Application to Anomaly Detection. 28th International Symposium on Software Reliability Engineering (ISSRE 2017), Oct 2017, Toulouse, France. 10p. ⟨hal-01576291⟩
