Tue 16 Jul 2019 14:30 - 15:00 at Bouzy - Benchmark Creation

A comprehensive evaluation is an integral part of modern software engineering research. It is done either using established benchmarks (e.g., DaCapo, the Qualitas corpus, or the XCorpus) or using ad-hoc benchmarks. In a few cases, a specifically created test suite is used for evaluation purposes. In all cases, the representativeness of the benchmark with respect to the research questions is almost always questionable. The established corpora mentioned above contain a large share of outdated software and, as recent studies have shown, that code is structurally very different from modern Java code as found on, e.g., Maven Central. A second issue of (at least) the established benchmarks is that their usage scenarios are only defined at a very high level of abstraction (e.g., “general software engineering research”), which makes their use in specific contexts questionable. Tailored benchmarks or custom test suites are often created without any substantial argument regarding their representativeness, which makes evaluations built on top of them even more questionable. The naive approach to solving these problems, namely taking as many projects or collecting as much code as possible, simply doesn’t scale: analyzing an extremely large code base, such as all non-trivial Java projects found on GitHub, is prohibitively expensive even for simple analyses. This immediately leads to the question of how to build (reasonably) representative benchmarks. In this talk, we will discuss the representativeness of benchmarks before presenting Hermes, a tool that is a first step towards the creation of representative and minimal benchmarks.

Tue 16 Jul

Displayed time zone: Belfast

13:30 - 15:00
Benchmark Creation
BenchWork at Bouzy
13:30
30m
Talk
A Central and Evolving Benchmark
BenchWork
Abhishek Tiwari University of Potsdam, Christian Hammer University of Potsdam
14:00
30m
Talk
Creating and Managing Benchmark Suites with ABM
BenchWork
Lisa Nguyen Quang Do Paderborn University
14:30
30m
Talk
Hermes: Towards Representative Benchmarks
BenchWork
Michael Eichberg TU Darmstadt, Germany