Continuous Benchmarking

One of Databend's design goals is to maintain top performance. To guarantee this, Databend runs Continuous Benchmarking on every nightly release to detect performance regressions, and visualizes the results on the website perf.databend.rs.

The benchmark runner, which runs daily, and its results are defined in the repository datafuselabs/databend-perf.
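
As a rough illustration of what "detecting a performance regression" between nightly runs can mean, the sketch below compares the latest timing of each query against the median of earlier runs. It is not taken from datafuselabs/databend-perf: the function name `detect_regressions`, the shape of the `history` and `latest` dictionaries, and the 10% threshold are all assumptions made for this example.

```python
from statistics import median


def detect_regressions(history, latest, threshold=0.10):
    """Flag queries whose latest run is slower than their baseline.

    history:   hypothetical mapping of query id -> list of past timings (seconds)
    latest:    hypothetical mapping of query id -> timing of the newest run (seconds)
    threshold: assumed tolerance; 0.10 means "more than 10% slower than baseline"

    Returns a dict of query id -> slowdown ratio (latest / baseline).
    """
    regressions = {}
    for query, seconds in latest.items():
        past = history.get(query, [])
        if not past:
            continue  # no baseline yet, nothing to compare against
        baseline = median(past)  # the median is less sensitive to one noisy run
        if baseline > 0 and seconds > baseline * (1 + threshold):
            regressions[query] = seconds / baseline
    return regressions
```

Using the median of previous runs rather than only the previous night is one way to keep a single noisy run from triggering or hiding an alert; other baselines (a moving average, the best of the last N runs) would work with the same structure.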

Vectorized Execution Benchmarking

This benchmark targets Databend's vectorized execution and shows how fast it runs in memory. The following queries are used to measure it:

| Number | Query |
| --- | --- |
| Q1 | SELECT avg(number) FROM numbers_mt(10000000000); |
| Q2 | SELECT sum(number) FROM numbers_mt(10000000000); |
| Q3 | SELECT min(number) FROM numbers_mt(10000000000); |
| Q4 | SELECT max(number) FROM numbers_mt(10000000000); |
| Q5 | SELECT count(number) FROM numbers_mt(10000000000); |
| Q6 | SELECT sum(number+number+number) FROM numbers_mt(10000000000); |
| Q7 | SELECT sum(number) / count(number) FROM numbers_mt(10000000000); |
| Q8 | SELECT sum(number) / count(number), max(number), min(number) FROM numbers_mt(10000000000); |
| Q9 | SELECT number FROM numbers_mt(10000000000) ORDER BY number DESC LIMIT 10; |
| Q10 | SELECT max(number), sum(number) FROM numbers_mt(10000000000) GROUP BY number % 3, number % 4, number % 5 LIMIT 10; |
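
To make "how fast" concrete, a query can be timed several times and the result reported as elapsed seconds and rows processed per second. The following is a minimal sketch under stated assumptions, not the runner used for perf.databend.rs: `run_query` is a hypothetical callable that submits SQL to a Databend instance and blocks until the result arrives, and the `benchmark` function, its parameters, and the choice of three repeats are illustrative only.

```python
import time
from statistics import median


def benchmark(run_query, sql, rows, repeats=3, clock=time.perf_counter):
    """Time one query and derive its throughput.

    run_query: hypothetical callable that executes `sql` and waits for the result
    sql:       the query text, e.g. one of the queries listed above
    rows:      number of input rows the query scans
    repeats:   assumed number of timed runs; the median is reported
    clock:     monotonic timer, injectable so the function can be tested
    """
    timings = []
    for _ in range(repeats):
        start = clock()
        run_query(sql)
        timings.append(clock() - start)
    elapsed = median(timings)
    throughput = rows / elapsed if elapsed > 0 else float("inf")
    return {"seconds": elapsed, "rows_per_second": throughput}
```

For Q1, for example, `rows` would be 10000000000, since `numbers_mt(10000000000)` generates that many rows; dividing by the median elapsed time gives a rows-per-second figure that can be compared across nightly builds.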

Ontime Benchmarking

This benchmark shows how Databend performs on the Ontime dataset stored on AWS S3. The following queries are used to measure it:

| Number | Query |
| --- | --- |
| Q1 | SELECT DayOfWeek, count(*) AS c FROM default.ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC; |
| Q2 | SELECT DayOfWeek, count(*) AS c FROM default.ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC; |
| Q3 | SELECT Origin, count(*) AS c FROM default.ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY Origin ORDER BY c DESC LIMIT 10; |
| Q4 | SELECT IATA_CODE_Reporting_Airline AS Carrier, count() FROM default.ontime WHERE DepDelay>10 AND Year = 2007 GROUP BY Carrier ORDER BY count() DESC; |
| Q5 | SELECT IATA_CODE_Reporting_Airline AS Carrier, avg(cast(DepDelay>10 as Int8))*1000 AS c3 FROM default.ontime WHERE Year=2007 GROUP BY Carrier ORDER BY c3 DESC; |
| Q6 | SELECT IATA_CODE_Reporting_Airline AS Carrier, avg(cast(DepDelay>10 as Int8))*1000 AS c3 FROM default.ontime WHERE Year>=2000 AND Year <=2008 GROUP BY Carrier ORDER BY c3 DESC; |
| Q7 | SELECT IATA_CODE_Reporting_Airline AS Carrier, avg(DepDelay) * 1000 AS c3 FROM default.ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY Carrier; |
| Q8 | SELECT Year, avg(DepDelay) FROM default.ontime GROUP BY Year; |
| Q9 | SELECT Year, count(*) as c1 FROM default.ontime GROUP BY Year; |
| Q10 | SELECT avg(cnt) FROM (SELECT Year,Month,count(*) AS cnt FROM default.ontime WHERE DepDel15=1 GROUP BY Year,Month) a; |
| Q11 | SELECT avg(c1) FROM (SELECT Year,Month,count(*) AS c1 FROM default.ontime GROUP BY Year,Month) a; |
| Q12 | SELECT OriginCityName, DestCityName, count(*) AS c FROM default.ontime GROUP BY OriginCityName, DestCityName ORDER BY c DESC LIMIT 10; |
| Q13 | SELECT OriginCityName, count(*) AS c FROM default.ontime GROUP BY OriginCityName ORDER BY c DESC LIMIT 10; |
| Q14 | SELECT count(*) FROM default.ontime; |