Factorised Relational Databases
- Speaker: Dr Dan Olteanu
- Date: Wednesday, 14 March 2012 from 16:45 to 17:45
- Location: Room 745, Malet Street
In this talk I will present a representation system for relational data that uses factored forms to encode succinctly large relations.
The main part of the talk is on two characterisations of conjunctive queries based on factorisations of their results defined by a certain class of hyperpath decompositions of the query hypergraph.
I first characterise the queries by tight bounds on the sizes of factorised representations of their results.
For relations, where tuples are annotated with identifiers, the queries can also be characterised by tight bounds on the readability of their results. Here, the readability of a factorisation is the minimum over all equivalent factorisations of the maximum number of occurrences of any identifier in that factorisation. I give a dichotomy of queries based on the readability of their results for any database and define syntactically the class of queries with readability one. It turns out that such queries are tractable in probabilistic databases, have one-step streaming evaluation in the finite cursor model, and admit parallel evaluation with one broadcast step.
Early experimental results with our query engine for factorised data suggest that factorised representations can speed up query evaluation in relational databases.
This is joint work with Jakub Zavodny.