Past Event! Note: this event has already taken place.

Query Processing Techniques for the Graph Database Management Systems of 2020s

November 5, 2020 at 11:00 AM to 12:00 PM

Data Science Distinguished Speaker Seminar Series  

Abstract

In contemporary jargon, graph database management systems (GDBMSs) are systems that adopt the property graph model. They often power applications such as fraud detection and recommendations, which require very fast joins of records, often beyond the performance that existing relational systems generally provide. Several techniques are universally adopted by GDBMSs to speed up joins, such as double indexing of pre-defined relations in adjacency lists and ID-based hash joins.

In this talk, I will give an overview of the query processor of GraphflowDB, a graph database we are actively developing at the University of Waterloo, which integrates three other novel techniques to perform very fast joins tailored for large-scale graphs: (1) worst-case optimal join-style, intersection-based joins; (2) a novel indexing sub-system that allows indexing subsets of edges, similar to relational views, and allows adjacency lists to be bound to edges; and (3) factorized processing, which allows query processing on compressed intermediate data. These techniques were introduced by the theory community in the context of relational database management systems, but I will argue that some of their best applications are in GDBMSs.
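As a rough illustration of the first technique, the sketch below finds directed triangles (a→b, b→c, c→a) by intersecting sorted adjacency lists, using edges that are indexed twice (by source and by destination). It is a minimal sketch under stated assumptions, not GraphflowDB's implementation: the function names `build_adjacency`, `intersect_sorted`, and `triangles` are hypothetical, the graph is a small in-memory edge list, and the simple merge-based intersection shows the join style only, without claiming to meet the worst-case optimal bound.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Index each edge twice: by source (forward) and by destination (backward).

    Hypothetical helper for illustration; assumes edges are (src, dst) pairs.
    """
    fwd, bwd = defaultdict(list), defaultdict(list)
    for src, dst in set(edges):  # drop duplicate edges
        fwd[src].append(dst)
        bwd[dst].append(src)
    # Keep every adjacency list sorted so lists can be intersected by merging.
    for index in (fwd, bwd):
        for neighbours in index.values():
            neighbours.sort()
    return fwd, bwd


def intersect_sorted(xs, ys):
    """Return the common elements of two sorted lists with a linear merge."""
    i = j = 0
    out = []
    while i < len(xs) and j < len(ys):
        if xs[i] == ys[j]:
            out.append(xs[i])
            i += 1
            j += 1
        elif xs[i] < ys[j]:
            i += 1
        else:
            j += 1
    return out


def triangles(edges):
    """Find all (a, b, c) with edges a->b, b->c, and c->a.

    Instead of joining two edge tables and then filtering with the third,
    each candidate c is produced by one intersection of two adjacency lists.
    """
    fwd, bwd = build_adjacency(edges)
    result = []
    for a in sorted(fwd):
        for b in fwd[a]:
            # c must be an out-neighbour of b and an in-neighbour of a.
            for c in intersect_sorted(fwd.get(b, []), bwd.get(a, [])):
                result.append((a, b, c))
    return result
```

Calling `triangles([(1, 2), (2, 3), (3, 1), (1, 3)])` reports the single cycle 1→2→3→1 once per rotation, because the query pattern itself has three symmetric starting points. The backward index is what lets the last hop be checked by intersection rather than by a separate lookup.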

About the Speaker

Semih Salihoglu is an Assistant Professor at the University of Waterloo. His research focuses on graph databases, distributed systems for processing graphs, and algorithms and theories for the evaluation of database queries. His systems work focuses on developing systems for managing, querying, or doing analytics on graph-structured data. His main ongoing systems projects include Graphflow, a new graph database management system his team is building from scratch, and GraphWrangler, a system designed to give an immediate graph view of relational data. He holds a PhD from Stanford University and is a recipient of the 2018 VLDB best paper award.

Seminar Moderator:
Dr. Tracey P. Lauriault, Associate Professor, Critical Media and Big Data, School of Journalism and Communication, Carleton University


Zoom webinar information for the Data Science Distinguished Speaker Seminar Series is sent to our mailing list (sign up here). If you are not on our mailing list and would like to attend this virtual seminar, please e-mail cuids@carleton.ca.