XML Schema, Tree Logic and Sheaves Automata - INRIA - Institut National de Recherche en Informatique et en Automatique Accéder directement au contenu
Rapport Année : 2002

XML Schema, Tree Logic and Sheaves Automata

Denis Lugiez
  • Fonction : Auteur

Résumé

We describe a new class of tree automata, and a related logic on trees, with applications to the processing of XML documents and XML schemas. XML documents, and other forms of semi-structured data, may be roughly described as edge labeled trees. Therefore it is natural to use tree automata to reason on them and try to apply the classical connection between automata, logic and query languages. This approach has been followed by various researchers and has given some notable results, especially when dealing with Document Type Definition (DTD), the simplest standard for defining XML documents validity. But additional work is needed to take into account XML schema, a more advanced standard, for which regular tree automata are not satisfactory. A major reason for this inadequacy is the presence of an associative-commutative operator in the schema language, inherited from the &-operator of SGML, and the inherent limitations of regular tree automata in dealing with associative-commutative algebras. The class of automata considered here, called sheaves automata, is a tailored version of automata for unranked trees with both associative and associative-commutative symbols already proposed by the authors. In order to handle both ordered and unordered operators, we combine the transition relation of regular tree automaton with regular word expression and counting constraints. This extension appears quite natural since, when no counting constraints occurs, we obtain hedge automata, a simple model for XML schemata, and when no constraints occur, we obtain regular tree automata. Building on the classical connection between logic and automata, we also present a decidable tree logic that embeds XML Schema as a plain subset.

Domaines

Autre [cs.OH]
Fichier principal
Vignette du fichier
RR-4631.pdf (422.24 Ko) Télécharger le fichier
Loading...

Dates et versions

inria-00071954 , version 1 (23-05-2006)

Identifiants

  • HAL Id : inria-00071954 , version 1

Citer

Silvano Dal Zilio, Denis Lugiez. XML Schema, Tree Logic and Sheaves Automata. RR-4631, INRIA. 2002. ⟨inria-00071954⟩
187 Consultations
322 Téléchargements

Partager

Gmail Facebook X LinkedIn More