INRIA - Institut National de Recherche en Informatique et en Automatique
Journal article, World Wide Web, Year: 2022

Efficient distributed path computation on RDF knowledge graphs using partial evaluation

Abstract

A key property of Linked Data is the representation and publication of data as interconnected labelled graphs, where resources linked to each other form a network of meaningful information. Searching for relationships between resources, within a single graph or across distributed graphs, can be reduced to a pathfinding or navigation problem, i.e., looking for chains of intermediate nodes. SPARQL 1.1, the current standard query language for RDF-based Linked Data, defines a construct called Property Paths (PPs) to navigate between entities within a single graph. Since Linked Data technologies are naturally aimed at decentralised scenarios, there are many cases where centralising the data for querying purposes is not feasible, or not possible at all. To address these problems, we propose a SPARQL PP-based graph processing approach, dubbed DpcLD, with which users can execute SPARQL PP queries and find paths distributed across multiple, connected graphs exposed as SPARQL endpoints. To execute the distributed path queries, we implemented an index-free, cache-based query engine that communicates with a shared algorithm running on each remote endpoint and computes the distributed paths. In this paper, we highlight the way in which this approach exploits and aggregates partial paths, within a distributed environment, to produce complete results. We perform extensive experiments to demonstrate the performance of our approach on two datasets: one representing 10 million triples from the DBpedia SPARQL benchmark, and another, the full benchmark dataset, corresponding to 124 million triples. We also perform a scalability test of our approach using real-world genomics datasets distributed across multiple endpoints. We compare our distributed approach with other distributed and centralised pathfinding approaches, showing that it outperforms other distributed approaches by orders of magnitude and provides a good trade-off for cases where the data cannot be centralised.
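The idea of aggregating partial paths can be illustrated with a small sketch. This is not the DpcLD algorithm, whose details are given in the paper; it is a toy model under stated assumptions. The endpoint names, the triples, and the functions `local_partial_paths` and `distributed_paths` are all hypothetical. In-memory lists of triples stand in for SPARQL endpoints, the endpoints are assumed to hold disjoint triples, and a real system would likely restrict hand-over points to nodes shared between graphs rather than return every local path.

```python
from collections import deque

# Hypothetical "endpoints": each holds its own (subject, predicate, object) triples.
ENDPOINTS = {
    "ep1": [("A", "knows", "B"), ("B", "knows", "C")],
    "ep2": [("C", "worksWith", "D"), ("B", "cites", "D")],
}

def local_partial_paths(triples, start, target):
    """Runs on one endpoint: return every simple local path (>= 1 hop) from
    `start`, as an alternating list [node, predicate, node, ...].
    Expansion stops once `target` is reached."""
    outgoing = {}
    for s, p, o in triples:
        outgoing.setdefault(s, []).append((p, o))
    found = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        if len(path) > 1:
            found.append(path)
            if path[-1] == target:
                continue
        for p, o in outgoing.get(path[-1], []):
            if o not in path[::2]:  # keep paths simple (no repeated node)
                queue.append(path + [p, o])
    return found

def distributed_paths(endpoints, source, target):
    """Coordinator: stitch partial paths from different endpoints into
    complete source-to-target paths."""
    complete = []
    queue = deque([([source], None)])
    while queue:
        prefix, last_ep = queue.popleft()
        for name, triples in endpoints.items():
            if name == last_ep:
                continue  # consecutive segments must come from different endpoints
            for partial in local_partial_paths(triples, prefix[-1], target):
                # Drop segments that revisit a node already on the prefix.
                if set(partial[2::2]) & set(prefix[::2]):
                    continue
                joined = prefix + partial[1:]
                if joined[-1] == target:
                    complete.append(joined)
                else:
                    queue.append((joined, name))
    return complete

if __name__ == "__main__":
    for path in distributed_paths(ENDPOINTS, "A", "D"):
        print(" ".join(path))
```

In this toy data, neither endpoint alone contains a path from `A` to `D`; both complete paths are only found by joining a segment from `ep1` with a segment from `ep2`.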
Main file
DpcLD_WWWJ2021-.pdf (1.2 MB)
Origin: files produced by the author(s)

Dates and versions

hal-03659142, version 1 (04-05-2022)

Cite

Qaiser Mehmood, Muhammad Saleem, Alokkumar Jha, Mathieu d’Aquin. Efficient distributed path computation on RDF knowledge graphs using partial evaluation. World Wide Web, 2022, 25 (2), pp.1005-1036. ⟨10.1007/s11280-021-00965-5⟩. ⟨hal-03659142⟩