Kvasir

Performance Engineering · 2019 · 1 min read

Built an automated performance testing framework with statistical regression analysis for Netflix

Overview

An automated performance testing framework that runs benchmarks, detects regressions, and reports results, integrated into the CI/CD pipeline

Problem

Performance regressions were often detected late in the release cycle or in production; manual performance testing was time-consuming and inconsistent

Constraints

  • Must integrate with CI/CD pipeline
  • Must detect regressions automatically
  • Must handle noisy benchmark results

Approach

Built a framework that runs performance benchmarks on every commit, uses statistical methods to detect regressions, and reports results to pull requests
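The per-commit flow described above can be sketched in three steps. This is an illustrative skeleton, not Kvasir's actual API: the function names, the callback shape, and the placeholder 5% threshold are all assumptions.

```python
from statistics import mean

def run_pipeline(run_benchmark, load_baseline, report):
    """Run benchmarks for a commit, compare to baseline, report to the PR.

    All three arguments are callables supplied by the CI environment:
    run_benchmark() returns latency samples for the current commit,
    load_baseline() returns samples from the last accepted build, and
    report(state) posts a status string back to the pull request.
    """
    samples = run_benchmark()
    baseline = load_baseline()
    # Placeholder check: flag if mean latency grew by more than 5%.
    # The real decision uses a statistical test, not a fixed threshold.
    regressed = mean(samples) > mean(baseline) * 1.05
    report("failure" if regressed else "success")
    return regressed

# Usage with stubbed-out callables:
run_pipeline(lambda: [110.0] * 5, lambda: [100.0] * 5, print)  # prints "failure"
```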

Key Decisions

Use statistical methods for regression detection

Reasoning:

Benchmarks are inherently noisy; statistical methods (confidence intervals, t-tests) distinguish real regressions from noise
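A minimal sketch of that kind of test, assuming benchmark results arrive as lists of latency samples in milliseconds. It computes Welch's t-statistic (which tolerates unequal variances between runs) with only the standard library; the function names and the critical-value threshold are illustrative, not Kvasir's actual implementation.

```python
from statistics import mean, variance

def welch_t(baseline, candidate):
    """Welch's t-statistic comparing two samples with unequal variances."""
    n1, n2 = len(baseline), len(candidate)
    v1, v2 = variance(baseline), variance(candidate)
    return (mean(candidate) - mean(baseline)) / ((v1 / n1 + v2 / n2) ** 0.5)

def is_regression(baseline, candidate, critical_t=2.0):
    """Flag a regression only when the candidate is statistically slower.

    A positive t beyond the critical value means the observed slowdown is
    unlikely to be noise (roughly a one-sided test at ~97.5% confidence
    for reasonably large samples).
    """
    return welch_t(baseline, candidate) > critical_t

# A ~10% slowdown with similar noise is flagged; identical samples are not.
baseline  = [100, 102, 98, 101, 99, 100, 103, 97]
candidate = [110, 112, 108, 111, 109, 110, 113, 107]
print(is_regression(baseline, candidate))  # True
```

The key design point is the direction of the test: only slowdowns beyond the critical value fail, so run-to-run jitter and speedups never block a merge.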

Block PRs with regressions

Reasoning:

Making regressions block merges creates accountability and prevents performance degradation
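One way the blocking decision can reach a pull request is a commit status check. The sketch below builds a GitHub-style status payload whose `failure` state blocks the merge once the check is marked required on the branch; the field names follow GitHub's commit-status API, but the `kvasir/benchmarks` context and the wiring are assumptions, not the project's confirmed setup.

```python
def build_status(regressed: bool, delta_pct: float) -> dict:
    """Build a commit-status payload for the CI system to post.

    A 'failure' state blocks the merge when this status check is
    configured as required on the target branch.
    """
    if regressed:
        return {
            "state": "failure",
            "context": "kvasir/benchmarks",
            "description": f"Latency regressed by {delta_pct:+.1f}%",
        }
    return {
        "state": "success",
        "context": "kvasir/benchmarks",
        "description": f"Latency change {delta_pct:+.1f}% within noise",
    }

print(build_status(True, 9.8)["state"])  # failure
```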

Tech Stack

  • Java
  • Python
  • Jenkins
  • AWS

Result & Impact

Prevented performance regressions from reaching production, shifted performance testing left in the development cycle

Learnings

  • Statistical methods are essential for noisy benchmarks
  • Integration with CI/CD is critical for adoption
  • Blocking PRs creates accountability

Features

Kvasir provides:

  • Automated benchmark execution
  • Statistical regression detection
  • CI/CD integration
  • PR status reporting
  • Historical trend tracking
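Historical trend tracking, the last feature above, can be sketched as a rolling window of per-commit medians: the baseline drifts with accepted changes instead of being pinned to one old measurement, and taking a median of medians keeps it robust to the occasional noisy run. The class and method names here are hypothetical.

```python
from collections import deque
from statistics import median

class TrendTracker:
    """Rolling baseline over the medians of recent benchmark runs."""

    def __init__(self, window: int = 30):
        # Oldest entries fall off automatically once the window fills.
        self.history = deque(maxlen=window)

    def record(self, samples: list) -> None:
        """Store the median latency of one commit's benchmark run."""
        self.history.append(median(samples))

    def baseline(self) -> float:
        """Median of recent medians: robust to occasional outlier runs."""
        return median(self.history)

tracker = TrendTracker(window=5)
for run in ([100, 101, 99], [98, 100, 102], [101, 100, 100]):
    tracker.record(run)
print(tracker.baseline())  # 100
```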