Data Mining and Cross-checking of Execution Traces: A re-interpretation of Jones, Harrold and Stasko test information visualization (Long version)

Tristan Denmat; Mireille Ducassé; Olivier Ridoux

Report (Research Report), Year: 2005

Tristan Denmat

Role: Author
PersonId: 830865

Logiciel : ANalyse et DEveloppement

Mireille Ducassé

Role: Author
PersonId: 12890
IdHAL: mireille-ducasse
ORCID: 0000-0003-1084-4322
IdRef: 133381846

Logiciel : ANalyse et DEveloppement

Olivier Ridoux

Role: Author
PersonId: 182897
IdHAL: olivier-ridoux
ORCID: 0000-0002-0170-0717
IdRef: 031550231

Logiciel : ANalyse et DEveloppement

Abstract

The current trend in debugging and testing is to cross-check information collected during several executions. Jones et al., for example, propose using the instruction coverage of passing and failing runs to visualize suspicious statements. This approach seems promising but lacks a formal justification. In this paper, we show that the method of Jones et al. can be re-interpreted as a data mining procedure. More specifically, the suspicion indicator they define can be rephrased in terms of well-known data mining metrics that characterize association rules between data. With this formal framework, we are able to explain limitations of that indicator. Three significant hypotheses were implicit in the original work: 1) there exists at least one statement that can be considered faulty; 2) the values of the suspicion indicator for different statements should be independent of one another; 3) executing a faulty statement leads, most of the time, to a failure. We show that these hypotheses are hard to fulfill and that the link between the indicator and the correctness of a statement is not straightforward. The underlying idea of association rules is nevertheless still promising, and our conclusion outlines some possible directions for improvement.
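As a rough illustration of the vocabulary used in the abstract, the sketch below computes the usual association-rule metrics (support, confidence, lift) for the rule "statement executed → run fails", next to a coverage-ratio indicator in the style of Jones et al. The `Run` type, the function names and the four toy runs in `RUNS` are assumptions made up for this example; they are not taken from the report, and the precise correspondence between the indicator and these metrics is what the report itself establishes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One test execution (hypothetical type): covered statements and verdict."""
    covered: frozenset
    failed: bool


def rule_metrics(runs, stmt):
    """Support, confidence and lift of the rule 'stmt executed -> run fails'."""
    total = len(runs)
    executed = sum(1 for r in runs if stmt in r.covered)
    failures = sum(1 for r in runs if r.failed)
    both = sum(1 for r in runs if r.failed and stmt in r.covered)
    support = both / total
    confidence = both / executed if executed else 0.0
    lift = confidence / (failures / total) if failures else 0.0
    return support, confidence, lift


def suspicion(runs, stmt):
    """Coverage-ratio indicator: the share of failing runs that cover stmt,
    divided by the sum of the failing and passing shares that cover it."""
    failing = [r for r in runs if r.failed]
    passing = [r for r in runs if not r.failed]
    pct_fail = sum(stmt in r.covered for r in failing) / len(failing) if failing else 0.0
    pct_pass = sum(stmt in r.covered for r in passing) / len(passing) if passing else 0.0
    denom = pct_fail + pct_pass
    return pct_fail / denom if denom else 0.0


# Toy data invented for illustration only: statement numbers covered per run.
RUNS = [
    Run(frozenset({1, 2, 3}), failed=True),
    Run(frozenset({1, 2}), failed=False),
    Run(frozenset({1, 3}), failed=True),
    Run(frozenset({1}), failed=False),
]

if __name__ == "__main__":
    for stmt in (1, 2, 3):
        print(stmt, rule_metrics(RUNS, stmt), suspicion(RUNS, stmt))
```

On this toy data, statement 3 is covered only by failing runs, so its lift is 2.0 and its indicator is 1.0, while statement 1 is covered by every run and gets a lift of 1.0 and an indicator of 0.5. A high value only says that coverage and failure co-occur; as the abstract stresses, it does not by itself show that the statement is faulty.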

Keywords

Software Engineering; Testing and Debugging; Learning; Knowledge acquisition

Domains

Other [cs.OH]

Main file

PI-1743.pdf (181.85 KB)

Contributor: Anne Jaigu

https://inria.hal.science/inria-00000566

Submitted on: Thursday, November 3, 2005, 10:10:02

Last modified on: Friday, March 24, 2023, 14:52:47

Long-term archiving on: Friday, April 2, 2010, 18:21:13

Dates and versions

inria-00000566, version 1 (03-11-2005)

Identifiers

HAL Id: inria-00000566, version 1

Cite

Tristan Denmat, Mireille Ducassé, Olivier Ridoux. Data Mining and Cross-checking of Execution Traces: A re-interpretation of Jones, Harrold and Stasko test information visualization (Long version). [Research Report] PI 1743, 2005, pp.21. ⟨inria-00000566⟩

Collections

EC-PARIS UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA IRISA-INSA-R INRIA2 LARA UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES INSA-GROUPE UR1-MATH-NUM
