Data Challenge Track
The International Conference on Performance Engineering (ICPE) is hosting its third edition of the Data Challenge track. We call upon everyone interested to apply approaches and analyses to a common selection of performance datasets. The challenge is open-ended: participants can choose the research questions they find most interesting. The proposed approaches/analyses and their findings are discussed in short papers and presented at the main conference.
This year, the focus is on performance analysis of microservices systems. Given the increasing adoption of this architectural style, understanding the performance of microservices systems has become an essential task for performance engineers. Participants are invited to propose new research questions and approaches for microservices performance analysis. For their papers, participants must choose one or more datasets from a predefined list derived from prior academic and industry research. Participants are expected to use this year’s datasets to answer their research questions and report their findings in a four-page challenge paper. If the paper is accepted, participants will be invited to present the results at ICPE 2024 in London. Details on the datasets are provided below.
Datasets
This year’s ICPE data challenge is based on four datasets from both academic and industry studies. Each dataset contains performance measurements gathered from either industrial or open-source microservices systems; data format and content vary across datasets and are described in their respective repositories. A generic loading sketch follows the list below.
- The first dataset is provided by the 2023 TSE paper “DeLag: Using Multi-Objective Optimization to Enhance the Detection of Latency Degradation Patterns in Service-Based Systems”. It contains performance measurements gathered from two open-source microservices systems with injected performance anomalies.
  Repository: https://github.com/SpencerLabAQ/icpe-data-challenge-delag
- The second dataset is from the USENIX ATC ’23 paper “Lifting the veil on Meta’s microservice architecture: Analyses of topology and request workflows”. It contains distributed tracing data from Meta’s microservices systems.
  Repository: https://github.com/facebookresearch/distributed_traces
- The third dataset comes from the SoCC’22 paper “The Power of Prediction: Microservice Auto Scaling via Workload Learning”, and contains runtime metrics of microservices from Alibaba’s production clusters.
  Repository: https://github.com/alibaba/clusterdata/tree/master/cluster-trace-microservices-v2022
- The fourth dataset is provided by the OSDI’20 paper “FIRM: An Intelligent Fine-grained Resource Management Framework for SLO-Oriented Microservices”. It contains tracing data from four open-source microservice systems, which are subject to injected performance anomalies.
  Repository: https://doi.org/10.13012/B2IDB-6738796_V1
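Whichever dataset you pick, a useful first step is to load it and compute simple summaries. The sketch below is a minimal, illustrative Python example, assuming a hypothetical CSV of spans with trace_id, service, and duration_ms columns; the actual file layout and column names differ per dataset, so adapt it to the schema documented in the respective repository.

    import pandas as pd

    # Load span-level trace data. The file name and column names
    # (trace_id, service, duration_ms) are hypothetical; consult the
    # chosen dataset's repository for the actual schema.
    spans = pd.read_csv("traces.csv")

    # Basic sanity checks: size, services involved, latency summary.
    print(spans.shape)
    print(spans["service"].nunique(), "distinct services")
    print(spans["duration_ms"].describe())

    # Per-service tail latency (p95), a common starting point for
    # microservices performance analysis.
    p95 = spans.groupby("service")["duration_ms"].quantile(0.95)
    print(p95.sort_values(ascending=False).head(10))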
Challenge
Possible high-level ideas for participants include, but are not limited to:
- Tailor visualization techniques to navigate the extensive data generated by microservices systems.
  - Beschastnikh et al., 2020: https://doi.org/10.1145/3375633
  - Silva et al., 2021: https://doi.org/10.1109/IV53921.2021.00028
  - Anand et al., 2020: https://doi.org/10.48550/arXiv.2010.13681
- Develop automated techniques to identify patterns associated with performance degradations (a baseline sketch follows this list).
  - Traini and Cortellessa, 2023: https://doi.org/10.1109/TSE.2023.3266041
  - Bansal et al., 2020: https://doi.org/10.1145/3377813.3381353
- Evaluate existing or novel root cause analysis techniques.
  - Mariani et al., 2018: https://doi.org/10.1109/ICST.2018.00034
  - Ma et al., 2020: https://doi.org/10.1145/3366423.3380111
- Model the performance of microservices systems using machine learning algorithms (a modeling sketch also follows this list).
  - Liao et al., 2020: https://doi.org/10.1007/s10664-020-09866-z
  - Xiong et al., 2013: https://doi.org/10.1145/2479871.2479909
- Replicate a prior study or approach on a selected dataset.
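As a concrete baseline for the degradation-pattern idea above, one simple analysis is to flag the slowest traces and check which services are over-represented in them. This is a minimal sketch, not a reimplementation of any cited approach; the file and column names are hypothetical, as in the earlier sketch.

    import pandas as pd

    # Hypothetical span table, as in the loading sketch above.
    spans = pd.read_csv("traces.csv")

    # Approximate end-to-end latency per trace as its longest span.
    e2e = spans.groupby("trace_id")["duration_ms"].max()

    # Treat the slowest 5% of traces as degraded.
    threshold = e2e.quantile(0.95)
    slow_ids = set(e2e[e2e >= threshold].index)
    spans["slow"] = spans["trace_id"].isin(slow_ids)

    # Fraction of each service's spans that fall in degraded traces;
    # services with high fractions are candidates for closer inspection.
    rate = spans.groupby("service")["slow"].mean().sort_values(ascending=False)
    print(rate.head(10))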
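For the machine-learning modeling idea, a minimal baseline is to predict request latency from simple workload features and check how predictable it is. The file name, features, and target column below are hypothetical placeholders, and a random forest is just one possible model choice, not the method of the cited papers.

    import pandas as pd
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split

    # Hypothetical per-request table: one row per request with simple
    # workload features and an end-to-end latency target. Real features
    # depend on the chosen dataset's schema.
    df = pd.read_csv("requests.csv")
    X = df[["request_rate", "payload_size", "hour_of_day"]]  # hypothetical features
    y = df["latency_ms"]                                     # hypothetical target

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0)
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X_train, y_train)

    # R^2 on held-out data: a first indication of how predictable
    # latency is from these features.
    print("R^2:", model.score(X_test, y_test))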
Submission
A challenge paper should outline the findings of your research: start with an introduction to the problem tackled and its relevance to the field; detail the datasets used, the methods and tools applied, and the results achieved; discuss the implications of the findings; and highlight the paper’s contributions and their importance.
For clarity and consistency, authors must specify precisely which dataset (or portion thereof) was used when describing methodologies or presenting findings.
We strongly encourage authors to include the solution’s source code with the submission (e.g., in a permanent repository such as Zenodo, potentially linked to a GitHub repository as described here), but this is not mandatory for acceptance of a data challenge paper.
The page limit for challenge papers is 4 pages (including all figures and tables), plus 1 page for references. Challenge papers will be published in the companion of the ICPE 2024 proceedings. All challenge papers will be reviewed by program committee members. Note that submissions to this track are double-blind; for details, see the Double Blind FAQ page. The best data challenge paper will be selected for an award by the track chairs and program committee members.
Submissions must be made via EasyChair by selecting the respective track.
The submission deadline can be found here.
Data Challenge Chairs
- Luca Traini, University of L’Aquila, Italy
- Christoph Laaber, Simula Research Laboratory, Oslo, Norway
Contact: icpe2024-data@easychair.org