With the ever-increasing amount of data stored and processed, there is an ongoing need to test database management systems and, more broadly, data-intensive systems. In particular, emerging technologies such as Non-Volatile Memory impose new challenges (e.g., avoiding persistent memory leaks and partial writes), and novel system designs involving FPGAs, GPUs, and RDMA call for additional attention and sophistication.
Building on the success of the seven previous workshops, the goal of DBTest 2020 is to bring together researchers and practitioners from academia and industry to discuss key problems and ideas related to testing database systems and applications. The long-term objective is to reduce the cost and time required to test and tune data management and processing products, so that users and vendors can spend more time and energy on actual innovation.
Like most workshops associated with this year's SIGMOD (and like SIGMOD 2020 itself), DBTest will be a virtual workshop. We will use the Zoom video conferencing platform to stream the presentations and to hold interactive discussions about the papers and topics presented at the workshop.
The program for this year features three keynotes, five full papers, and two short papers, and is structured as follows (all times are PST):
| Title | Speakers / Authors | Video | Slides |
| --- | --- | --- | --- |
| Keynote 1: From HyPer to Hyper: Integrating an academic DBMS into a leading analytics and business intelligence platform | Tobias Mühlbauer and Jan Finis | https://youtu.be/iAlkUq411mc | Slides |
| SparkFuzz: Searching Correctness Regressions in Modern Query Engines | Bogdan Ghit, Nicolas Poggi, Josh Rosen, Reynold Xin and Peter Boncz | https://youtu.be/l90E5EUCKQA | Slides |
| On Another Level: How to Debug Compiling Query Engines | Timo Kersten and Thomas Neumann | https://youtu.be/vIcoM2cyLKs | Slides |
| Keynote 2: Benchmark(et)ing an LSM-Tree vs a B-Tree | Mark Callaghan | https://youtu.be/vyLZHx9aZQc | Slides |
| Automated System Performance Testing at MongoDB | Henrik Ingo and David Daly | https://youtu.be/FI5BrYkWNvg | Slides |
| CoreBigBench: Benchmarking Big Data Core Operations | Todor Ivanov, Ahmad Ghazal, Alain Crolotte, Pekka Kostamaa and Yoseph Ghazal | https://youtu.be/V8SdCOl7VyM | Slides |
| FacetE: Exploiting Web Tables for Domain-Specific Word Embedding Evaluation | Michael Günther, Paul Sikorski, Maik Thiele and Wolfgang Lehner | https://youtu.be/UE3ZW5HIGDw | Slides |
| Keynote 3: How to clear your backlog of failing tests and make your test suite All Green | Greg Law | https://youtu.be/WH5Gu9RC6to | |
| Testing Query Execution Engines with Mutations | Xinyue Chen, Chenglong Wang and Alvin Cheung | | Slides |
| Workload Merging Potential in SAP Hybris | Robin Rehrmann, Martin Keppner, Wolfgang Lehner, Carsten Binnig and Arne Schwarz | https://youtu.be/eUfnuvsy-Mw | Slides |
We are still in the process of collecting and uploading the remaining videos and slides that are not yet available above; please stay tuned.
With more than 1200 contributors, Apache Spark is one of the most actively developed open-source projects. At this scale and pace of development, mistakes are bound to happen. In this paper we present SparkFuzz, a toolkit we developed at Databricks for uncovering correctness errors in the Spark SQL engine. To guard the system against correctness errors, SparkFuzz takes a fuzzing approach to testing by generating random data and queries. SparkFuzz executes the generated queries on a reference database system such as PostgreSQL, which is then used as a test oracle to verify the results returned by Spark SQL. We explain the approach we take to data and query generation, and we analyze the coverage of SparkFuzz. We show that SparkFuzz achieves its current maximum coverage relatively fast by generating a small number of queries.
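The oracle-based comparison at the heart of this approach can be sketched in a few lines. This is not SparkFuzz itself; as a stand-in for Spark SQL and PostgreSQL, the sketch runs the same query on two independent SQLite connections and flags any divergence:

```python
import sqlite3

def run_query(conn, sql):
    """Execute a query and return its rows in a canonical (sorted) order."""
    return sorted(conn.execute(sql).fetchall())

def differential_check(sql, conn_a, conn_b):
    """Run the same query on two engines and compare results.

    In a real fuzzing setup, conn_a would be the system under test and
    conn_b the reference oracle; a mismatch signals a correctness bug.
    """
    return run_query(conn_a, sql) == run_query(conn_b, sql)

# Two in-memory databases stand in for the engine under test and the oracle.
a = sqlite3.connect(":memory:")
b = sqlite3.connect(":memory:")
for conn in (a, b):
    conn.execute("CREATE TABLE t (x INTEGER, y INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 2), (3, 4)])

assert differential_check("SELECT x, SUM(y) FROM t GROUP BY x", a, b)
```

Sorting the result rows makes the comparison insensitive to row ordering, which unordered SQL queries do not guarantee.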
Compilation-based query engines generate and compile code at runtime, which is then run to produce the query result. In this process, two levels of source code are involved: the code of the code generator itself and the code that is generated at runtime. This can make debugging quite indirect, as a fault in the generated code was caused by an error in the generator. To find the error, we have to look at both the generated code and the code that generated it. Current debugging technology is not equipped to handle this situation. For example, GNU's gdb only offers facilities to inspect one source level, but not multiple source levels. Also, current debuggers are not able to reconstruct additional program state for further source levels, so context is missing during debugging. In this paper, we show how to build a multi-level debugger for generated queries that solves these issues. We propose to use a time-travelling debugger to provide context information for compile time and runtime, thus providing full interactive debugging capabilities for every source level. We also present how to build such a debugger with low engineering effort by combining existing tool chains.
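To illustrate the two-level problem, a common single-level workaround (not the paper's multi-level, time-travelling approach) is to have the generator emit C `#line` directives so that a debugger maps generated statements back to the generator source. A minimal sketch, with the hypothetical generator file name `codegen.py` as an assumption:

```python
def emit(lines_with_origin):
    """Emit C code with #line directives pointing back at the generator.

    Each (origin_line, code) pair records where in the (hypothetical)
    generator source 'codegen.py' the fragment was produced, so a C
    debugger stepping through the generated code reports the generator
    position instead of the generated file.
    """
    out = []
    for origin_line, code in lines_with_origin:
        out.append(f'#line {origin_line} "codegen.py"')
        out.append(code)
    return "\n".join(out)

generated = emit([
    (10, "int sum = 0;"),
    (11, "for (int i = 0; i < n; i++) sum += col[i];"),
])
print(generated)
```

Note the limitation the paper addresses: with `#line` alone the debugger sees either the generator or the generated code, never both levels at once.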
Distributed Systems Infrastructure (DSI) is MongoDB's framework for running fully automated system performance tests in our Continuous Integration (CI) environment. To run in CI, it needs to automate everything end-to-end: provisioning and deploying multi-node clusters, executing tests, tuning the system for repeatable results, and collecting and analyzing the results. Today, DSI is MongoDB's most used and most useful performance testing tool. It runs almost 200 different benchmarks in daily CI, and we also use it for manual performance investigations. Because we can alert the responsible engineer in a timely fashion, all but one of the major regressions were fixed before the 4.2.0 release. We are also able to catch net-new improvements, of which DSI caught 17. We open-sourced DSI in March 2020.
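The alerting described above requires comparing each new benchmark result against recent history. The toy check below is only a threshold-based sketch, not DSI's actual analysis; the function name and the 5% threshold are assumptions for illustration:

```python
def flag_regression(history, latest, threshold=0.05):
    """Flag a regression when the latest throughput result falls more than
    `threshold` (as a fraction) below the mean of recent history.

    Hypothetical stand-in for a real CI performance check; numbers are
    ops/sec, so higher is better.
    """
    baseline = sum(history) / len(history)
    return latest < baseline * (1 - threshold)

assert flag_regression([100.0, 102.0, 98.0], 90.0)      # clear drop: alert
assert not flag_regression([100.0, 102.0, 98.0], 99.0)  # within noise: pass
```

A fixed threshold like this is noisy in practice, which is why mature systems analyze whole result series rather than single points.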
Significant effort has been put into big data benchmarking, with a focus on end-to-end applications. While such benchmarks cover basic functionalities implicitly, the individual contributions to overall performance are hidden. As a result, end-to-end benchmarks can be biased toward certain basic functions. Micro-benchmarks are more explicit at covering basic functionalities, but they are usually targeted at a few highly specialized functions. In this paper we present CoreBigBench, a benchmark that focuses on the most common functionalities of big data engines/platforms, such as scans, two-way joins, common UDF execution, and more. These common functionalities are benchmarked over relational and key-value data models, which cover the majority of data models. The benchmark consists of 22 queries applied to sales data and key-value web logs, covering the basic functionalities. We ran CoreBigBench on Hive as a proof of concept, verified that the benchmark is easy to deploy, and collected performance data. Finally, we believe that CoreBigBench is a good fit for performance testing of commercial big data engines focused on basic engine functionalities not covered by end-to-end benchmarks.
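The micro-benchmark idea of timing one core operation in isolation can be sketched as follows. These are not CoreBigBench's queries; SQLite, the table schema, and the repeat count are assumptions chosen to keep the sketch self-contained:

```python
import sqlite3
import time

def time_query(conn, sql, repeats=5):
    """Run one query several times and report the best wall-clock time,
    isolating a single core operation the way a micro-benchmark would."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        conn.execute(sql).fetchall()
        best = min(best, time.perf_counter() - start)
    return best

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [(i, i * 1.5) for i in range(10000)])

scan = time_query(conn, "SELECT COUNT(*) FROM sales")    # full scan
agg = time_query(conn, "SELECT SUM(amount) FROM sales")  # scan + aggregation
```

Reporting per-operation times like `scan` and `agg` separately is exactly the visibility that an end-to-end number hides.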
Today's natural language processing and information retrieval systems depend heavily on word embedding techniques to represent text values. However, choosing a word embedding dataset for a specific task is not trivial. Current word embedding evaluation methods mostly provide only a one-dimensional quality measure, which does not express how knowledge from different domains is represented in the word embedding models. To overcome this limitation, we provide a new evaluation data set called FacetE, derived from 125M Web tables, which enables domain-sensitive evaluation. We show that FacetE can effectively be used to evaluate word embedding models. The evaluation of common general-purpose word embedding models suggests that there is currently no single best word embedding for every domain.
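A per-domain evaluation score, as opposed to a single global number, can be sketched with plain cosine similarity. The tiny 2-dimensional vectors and domain word pairs below are invented for illustration and have nothing to do with FacetE's actual data:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def domain_score(embeddings, pairs):
    """Mean similarity over word pairs that should be related in one domain.

    Scoring each domain separately exposes what a one-dimensional
    benchmark hides: a model can be strong in one domain and weak in another.
    """
    return sum(cosine(embeddings[a], embeddings[b]) for a, b in pairs) / len(pairs)

# Toy embeddings: hardware words cluster on one axis, finance on the other.
emb = {"ram": [1.0, 0.1], "cpu": [0.9, 0.2],
       "stock": [0.1, 1.0], "bond": [0.2, 0.9]}
hardware = domain_score(emb, [("ram", "cpu")])
finance = domain_score(emb, [("stock", "bond")])
assert hardware > 0.9 and finance > 0.9
```

Comparing such per-domain scores across embedding models is the kind of analysis a domain-sensitive benchmark enables.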
The query execution engine plays an important role in modern database systems. However, due to its complex nature, validating the correctness of a query execution engine is inherently challenging. In particular, the high cost of testing query execution engines often prevents developers from iterating quickly during development, which can lengthen the development cycle or lead to production-level bugs. To address this challenge, we propose MutaSQL, a tool that can quickly discover correctness bugs in SQL execution engines. MutaSQL generates test cases by mutating a query Q over a database D into a query Q′ that should evaluate to the same result as Q on D. MutaSQL then checks the execution results of Q′ and Q on the engine under test. We evaluated MutaSQL on previous SQLite versions with known bugs as well as on the newest SQLite release. MutaSQL effectively reproduced 34 bugs in previous versions and discovered a new bug in the current SQLite release.
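The Q-versus-Q′ check can be sketched concretely against SQLite. The two rewrite rules below (an always-true predicate and its double negation) are simple illustrative stand-ins, not MutaSQL's actual mutation rules:

```python
import sqlite3

def mutate(query):
    """Produce mutated queries that should preserve the result set.

    Illustrative result-preserving rewrites: appending an always-true
    predicate, and appending its double negation.
    """
    return [
        query + " WHERE 1 = 1",
        query + " WHERE NOT (1 <> 1)",
    ]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])

q = "SELECT x FROM t"
expected = sorted(conn.execute(q).fetchall())
for mutant in mutate(q):
    # Any mismatch between Q and Q' would point at an engine correctness bug.
    assert sorted(conn.execute(mutant).fetchall()) == expected
```

The appeal of this oracle is that it needs no second reference system: the engine under test is checked against itself on provably equivalent queries.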
OLTP DBMSs in enterprise scenarios often face the challenge of dealing with workload peaks caused by events such as Cyber Monday or Black Friday. The traditional way to avoid running out of resources and thus cope with such workload peaks is significant over-provisioning of the underlying infrastructure. Another direction is to apply resource sharing. In recent work, we showed that merging read statements in OLTP scenarios offers the opportunity to maintain low latency for systems under heavy load without over-provisioning.
In this paper, we analyze a real enterprise OLTP workload --- SAP Hybris --- with respect to statement types, complexity, and hot-spot statements to find potential candidates for workload sharing in OLTP. We additionally apply work sharing to the Hybris workload in our system OLTPShare and report the resulting savings in CPU consumption. Another interesting effect we show is that with OLTPShare, we can increase SAP Hybris throughput by 20%.
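The core merging idea, executing one statement on behalf of many clients, can be sketched as follows. This is a toy illustration of statement merging in general, not OLTPShare's implementation; the batching interface and SQLite backend are assumptions:

```python
import sqlite3
from collections import defaultdict

def merge_and_run(conn, statements):
    """Execute each distinct read statement once and fan the rows out to
    every client that submitted it.

    Under heavy load, many concurrent clients issue identical SELECTs,
    so a single execution can serve the whole group.
    """
    groups = defaultdict(list)
    for client_id, sql in statements:
        groups[sql].append(client_id)
    results = {}
    for sql, clients in groups.items():
        rows = conn.execute(sql).fetchall()  # executed once per distinct SQL
        for client_id in clients:
            results[client_id] = rows
    return results, len(groups)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER, price REAL)")
conn.execute("INSERT INTO products VALUES (1, 9.99)")

# Four clients submit the identical hot-spot statement.
stmts = [(c, "SELECT price FROM products WHERE id = 1") for c in range(4)]
results, executed = merge_and_run(conn, stmts)
assert executed == 1 and len(results) == 4
```

The CPU saving comes from the gap between submitted and executed statements; here four requests cost one execution.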