Single-Path Results for Context-Free Path Queries

To evaluate path queries, one typically resorts to the relational semantics in which node pairs (m, n) are returned that are connected by a path of interest. A popular way to specify paths of interest is by the labeling of the edges of the path, e.g., by specifying languages of allowed labeling. Unfortunately, this relational semantics provides little insight in how node pairs (m, n) are obtained. To improve on this, we propose the single-path semantics, in which path queries are answered by minimal-length paths between such node-pairs.

Our implementation of query evaluation using the single-path semantics allows one to evaluate context-free path queries. Furthermore, the underlying algorithms can also be used to efficiently determine the shortest strings in a context-free grammar. Our implementation is in C++14 and is tested using two modern C++ compiler, namely the Microsoft Visual C++ compiler and the GNU g++ compiler. We have one external dependency, namely the Fibonacci Heap implementation of the Boost library. The implementation is available for public use under the BSD-license (license details included in the source files).

Project Files and Related Publications

  1. The implementation of the single-path semantics for context-free path queries. source code (.zip).
    Full implementation of the single-path semantics, of the experiments, and of supporting tooling. Also included are dataset generators in accordance with the experiments and the raw experimental results. The source code contains step-by-step instructions to compile and use the code using either G++ or Microsoft Visual C++.
  2. LSGDA 2020.
    Explaining results of path queries on graphs. Jelle Hellings. (2020). In: Software Foundations for Data Interoperability and Large Scale Graph Data Analytics, 84-98. Springer. DOI: 10.1007/978-3-030-61133-0_7.
    author copy, technical report, slides.
    Video Presentation

    Many graph query languages use, at their core, path queries that yield node pairs that are connected by a path of interest. For the end-user, such node pairs only give limited insight as to why this query result is obtained, as the pair does not directly identify the underlying path of interest. To address this limitation of path queries, we propose the single-path semantics, which evaluates path queries to, for each node pair (m,n), a single path from m to n satisfying the conditions of the query. To put our proposal in practice, we provide an efficient algorithm for evaluating context-free path queries, a particular powerful type of path queries, using the single-path semantics. Additionally, we perform a short evaluation of our techniques that shows that the single-path semantics is practically feasible, even when query results grow large.

  3. ICDT 2014.
    Conjunctive context-free path queries. Jelle Hellings. (2014). In: Proceedings of the 17th International Conference on Database Theory (ICDT), 119-130. DOI: 10.5441/002/icdt.2014.15.
    author copy, slides.

    In graph query languages, regular expressions are commonly used to specify the labeling of paths. A natural step in increasing the expressive power of these query languages is replacing regular expressions by context-free grammars. With the Conjunctive Context-Free Path Queries (CCFPQ) we introduce such a language based on the well-known Conjunctive Regular Path Queries (CRPQ).

    First, we show that query evaluation of CCFPQ has polynomial time data complexity. Secondly, we look at the generalization of regular expressions, as used in CRPQ, to regular relations and show how similar generalizations can be applied to context-free grammars, as used in CCFPQ. Thirdly, we investigate the relations between the expressive power of CRPQ, CCFPQ, and their generalizations. In several cases we show that replacing regular expressions by context-free grammars does increase expressive power. Finally, we look at including context-free grammars in more powerful logics than conjunctive queries. We do so by adding negation and provide expressivity relations between the obtained languages.