
Create performance benchmarks for key pgroll features #408

Open
andrew-farries opened this issue Oct 16, 2024 · 2 comments
@andrew-farries (Collaborator)

Gather benchmark data for the following parts of pgroll:

  • Backfill duration - How long does it take to perform backfill operations on a table of some fixed size (say 10^7 rows)?
  • Effect of dual writes - What overhead do the up/down triggers incur on UPDATE-heavy tables?
  • read_schema query performance - Benchmark the performance of the read_schema query, which runs on every DDL statement to capture 'inferred' migrations.

Having these benchmarks in place would allow us to measure performance improvements over time and avoid regressions.
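
As a rough sketch of what the first two benchmarks could look like, Go's built-in `testing` package could drive them against a seeded Postgres instance. Everything below is illustrative: the table and column names, the `PGROLL_BENCH_DSN` variable, and the batch size are assumptions, not pgroll internals.

```go
package benchmarks

import (
	"database/sql"
	"os"
	"testing"

	_ "github.com/lib/pq" // Postgres driver
)

// openDB connects to the Postgres instance under test. The DSN comes from an
// environment variable (hypothetical name) so the same benchmarks can run
// locally or in CI.
func openDB(b *testing.B) *sql.DB {
	dsn := os.Getenv("PGROLL_BENCH_DSN")
	if dsn == "" {
		b.Skip("PGROLL_BENCH_DSN not set")
	}
	db, err := sql.Open("postgres", dsn)
	if err != nil {
		b.Fatal(err)
	}
	return db
}

// BenchmarkBackfill runs one backfill-style batched UPDATE per iteration over
// a pre-seeded table (e.g. 10^7 rows), approximating the per-batch cost of
// backfilling a new column.
func BenchmarkBackfill(b *testing.B) {
	db := openDB(b)
	defer db.Close()

	const batchSize = 10000
	const numBatches = 1000 // 10^7 rows / batchSize
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		start := (i % numBatches) * batchSize
		_, err := db.Exec(
			`UPDATE bench_table SET new_col = old_col WHERE id > $1 AND id <= $2`,
			start, start+batchSize,
		)
		if err != nil {
			b.Fatal(err)
		}
	}
}

// BenchmarkUpdateWithTriggers measures single-row UPDATE latency on a table
// that has pgroll's up/down triggers installed; running the same benchmark
// against a copy of the table without the triggers gives the dual-write
// overhead.
func BenchmarkUpdateWithTriggers(b *testing.B) {
	db := openDB(b)
	defer db.Close()

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if _, err := db.Exec(`UPDATE bench_table SET value = value + 1 WHERE id = $1`, 1+i%10000000); err != nil {
			b.Fatal(err)
		}
	}
}
```

The read_schema benchmark could follow the same shape, timing the read_schema query once per iteration, and runs could be compared over time with a tool like benchstat.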

andrew-farries added this to the v1 milestone Oct 16, 2024
@ryanslade (Contributor)

I'd like to have a go at this.

In a perfect world we'd probably want to run these against every commit, but I imagine they may take a while to run and I don't want them to slow down getting changes into main. Maybe a compromise is to spin up an environment once a day and run the benchmarks against all new commits?

Apart from actually writing the benchmarks, we need to decide on a few things:

  • How often do we run them? I suggest once a day, as mentioned above.
  • Where do we run them? We may want to spin up a dedicated environment in EC2 so that the results are consistent.
  • Where do we store results? Since this is an open-source project, the results should ideally be public. Perhaps we can upload them to a wiki / docs area in this repo?

Anything else?

@andrew-farries (Collaborator, Author)

I think what you suggest is a good start. We want the benchmarks for a couple of reasons:

  • Guard against performance regressions
  • Have benchmarks available as part of the public documentation for the repository

I suggest running the benchmarks as a separate workflow that is automatically run on changes to main and that can also be invoked manually on branches.

A consistent environment in terms of hardware and probably also software (maybe run the benchmarks in a container) is a must too.

Results could be uploaded to object storage and pulled from there into our docs.
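
As a sketch of that last step, a small uploader run at the end of the workflow could push a JSON summary of the results to a bucket. The bucket name, key layout, and result fields below are placeholders, and the AWS SDK for Go v2 is just one possible client:

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"log"
	"os"
	"time"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// Result is a hypothetical record for one benchmark in one run; a real
// uploader would parse these from `go test -bench` output.
type Result struct {
	Commit    string    `json:"commit"`
	Benchmark string    `json:"benchmark"`
	NsPerOp   float64   `json:"ns_per_op"`
	RunAt     time.Time `json:"run_at"`
}

func main() {
	ctx := context.Background()

	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}
	client := s3.NewFromConfig(cfg)

	// Placeholder values; a real run would fill these in from the benchmark output.
	results := []Result{
		{Commit: os.Getenv("GITHUB_SHA"), Benchmark: "BenchmarkBackfill", NsPerOp: 0, RunAt: time.Now().UTC()},
	}
	body, err := json.Marshal(results)
	if err != nil {
		log.Fatal(err)
	}

	// Key objects by date and commit so the docs site can pull a history of runs.
	key := fmt.Sprintf("benchmarks/%s-%s.json",
		time.Now().UTC().Format("2006-01-02"), results[0].Commit)
	_, err = client.PutObject(ctx, &s3.PutObjectInput{
		Bucket: aws.String("pgroll-benchmark-results"), // placeholder bucket name
		Key:    aws.String(key),
		Body:   bytes.NewReader(body),
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("uploaded %d results to s3://pgroll-benchmark-results/%s", len(results), key)
}
```

The docs build could then read the JSON history back out of the bucket and render it.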
