Parallel alignment of structured documents
Abstract
Classical methods for parallel text alignment operate at a single level (e.g. sentences), along which two or more versions of a text are synchronised. This causes problems when the documents are particularly long: in the absence of any other linguistic information, an alignment error at one point in the text may propagate for some time without any chance of recovery. In this chapter we consider how multilingual parallel alignment can exploit the fact that more and more texts are now highly structured by means of markup languages such as SGML. In particular, we describe recent efforts in multi-level alignment, presenting the main advances as well as some of the difficulties that remain, notably when a text and its translation are associated with different encoding schemes, or with different encoding practices for the same scheme.
Domains
Computation and Language [cs.CL]
Main file
Parallel_alignment_of_structured_documents_-_Romary_Bonhomme.pdf (59.77 KB)
Origin: Files produced by the author(s)