## New algorithm for calculating 95 percentile

The 95 percentile for query response times is and old concept; Peter and Roland blogged about it in 2008. Since then, MySQL tools have calculated the 95 percentile by collecting all values, either exactly or approximately, and returning `all_values[int(number_of_values * 0.95)]`

(that’s an extreme simplification). But recently I asked myself*: must we save all values? The answer is no. I created a new algorithm** for calculating the 95 percentile that is faster, more accurate, and saves only 100 values.***

Firstly, my basis of comparison is the 95 percentile algo used by mk-query-digest. That algo is fast, memory-stable, and very proven in the real world. It works well for any number of values, even hundreds of thousands of values. It saves all values by using base 1.05 buckets and counting the number of values that fall within the range of each bucket. The results are not exact, but the differences are negligible because a 10ms and 13ms response time are indiscernible to a human. Any algo that hopes to handle very large numbers of values must approximate because not even C can store and sort hundreds of thousands of floats (times N many attributes times N many query classes) quickly enough.

So when I finished the new algo, I compared it to the mk-query-digest algo and obtained the following results:

FILE REAL_95 OLD_95 NEW_95 OLD_DIFF NEW_DIFF OLD_TIME NEW_TIME FILE_SIZE OLD_RSS NEW_RSS nums/500k-1-or-2 1.751 1.697 1.784 -0.054 +0.033 12.12 9.37 4500000 3.88M 2.63M nums/100k-1-or-2 1.749 1.697 1.794 -0.052 +0.045 2.42 1.88 900000 3.88M 2.63M nums/50k-trend-1-to-9 6.931 6.652 6.995 -0.279 +0.064 1.24 0.90 450000 3.88M 2.63M nums/25k-trend-1-to-5 3.888 3.704 3.988 -0.184 +0.100 0.64 0.47 225000 3.88M 2.63M nums/21k-1-spike5-1 0.997 0.992 2.002 -0.005 +1.005 0.55 0.42 189000 3.88M 2.63M nums/10k-rand-0-to-20 19.048 18.532 19.054 -0.516 +0.006 0.29 0.21 95079 3.86M 2.62M nums/10k-rand-0-to-10 9.511 9.360 9.525 -0.151 +0.014 0.29 0.21 90000 3.86M 2.62M nums/4k-trend-1-to-7 5.594 5.473 6.213 -0.121 +0.619 0.14 0.09 36000 3.86M 2.63M nums/1k-sub-sec 0.941 0.900 0.951 -0.041 +0.010 0.07 0.04 9000 3.80M 2.62M nums/400-half-10 10.271 9.828 10.273 -0.443 +0.002 0.05 0.03 3800 3.79M 2.62M nums/400-high-low 10.446 10.319 10.446 -0.127 0 0.05 0.03 3800 3.79M 2.62M nums/400-low-high 10.445 10.319 10.475 -0.126 +0.030 0.05 0.03 3800 3.79M 2.63M nums/400-quarter-10 10.254 9.828 10.254 -0.426 0 0.06 0.03 3700 3.79M 2.62M nums/153-bias-50 88.523 88.305 88.523 -0.218 0 0.05 0.03 1500 3.79M 2.62M nums/100-rand-0-to-100 90.491 88.305 90.491 -2.186 0 0.05 0.03 991 3.79M 2.62M nums/105-ats 42.000 42.000 42.000 0 0 0.05 0.03 315 3.75M 2.61M nums/20 19.000 18.532 19.000 -0.468 0 0.04 0.03 51 3.79M 2.62M nums/1 42.000 42.000 42.000 0 0 0.04 0.03 3 3.75M 2.61M

I generated random microsecond values in various files. The first number of the filename indicates the number of values. So the first file has 500k values. The remaining part of the filename hints at the distribution of the values. For example, “50k-trend-1-to-9″ mean 50k values that increase from about 1 second to 9 seconds. Number and distribution of values affects 95 percentile algorithms, so I wanted to simulate several possible combinations.

“REAL_95″ is the real, exact 95 percentile; this is the control by which the “old” (i.e. the mk-query-digest) and new algos are compared. The diffs are comparisons to this control.

Each algo was timed and its memory (rss) measured, too. The time and memory comparisons are a little bias because the mk-query-digest module that implements its 95 percentile algo does more than my test script for the new algo.

The results show that the new algo is about 20% faster in all cases and more accurate in all but one case (“21k-1-spike5-1″). Also, the new algo uses less memory, but again this is a little bias; the important point is that it doesn’t use more memory to get its speed or accuracy increase.

The gains of the new algo are small in these comparisons, but I suspect they’ll be much larger given that the algo is used at least twice for each query. So saving 1 second in the algo can save minutes in data processing when there’s tens of thousands of queries.

Instead of explaining the algorithm exhaustively, I have upload all my code and data so you can reproduce the results on your machine: new-95-percentile-algo.tar.gz. You’ll need to checkout Maatkit, tweak the “require” lines in the Perl files, and tweak the Bash script (cmp-algos.sh), but otherwise I think the experiment should be straight forward. The new algo is in new-algo.pl. (new-algo.py is for another blog post.)

My ulterior motive for this blog post is to get feedback. Is the algorithm sane? Is there a critical flaw that I overlooked? Do you have a real-world example that doesn’t work well? If you’re intrepid or just curious and actually study the algo and have questions, feel free to contact me.

* By “recently asked myself” I mean that some time ago Baron and I wondered if it was possible to calculate 95 percentile without saving all values. At that time, I didn’t think it was feasible, but lately I thought and coded more about the problem.

** By “a new algorithm” I doubt that this has never been attempted or coded before, but I can’t find any examples of a similar algorithm.

*** By “saves only 100 values” I mean ultimately. At certain times, 150 values may be saved, but eventually the extra 50 should be integrated back into the base 100 values.

Hi Daniel!

wow – this is great, very interesting stuff :)

I’m now downloading your code to check out the algorithm – I’m really curious how it works.

Oh, to get a quick grasp of the result, I calculated the correlation between (REAL_95,OLD_95) and (REAL_95,NEW_95), and the results are 0,9998926355 and 0,9999572588 respectively – I think it’s safe to say both are accurate enough, and it seels the new one is even a bit more accurate than the old one (for this dataset).

Exciting :)

Roland Bouman30 Aug 11 at 6:37 AM

Is there a reason or conjecture to why the 21k-1-spike5-1 test was so far off?

Justin Rovang30 Aug 11 at 7:46 AM

Thanks for taking a look Roland. :-)

Daniel Nichter30 Aug 11 at 8:06 AM

Justin, I think 21k-1-spike5-1 is off because it’s a perfect edge case. As the filename kind of suggests, the values are:

10k @ 1s

1k @ 5s

10k @ 1s

So the 5s spike values are exactly 5% of total values (1k/20k). That make the 95%/5% edge something like 1s/5s. But the 5s values aren’t all 5s, some are 2, 3, or 4. So the edge is more like 1s/2s. Then (I should have probably made this explicit in the post), 95% here is actually “95% of values less than

or equal toX” whereas technically a percentile should be just “less than X”. Using <= is inherited from mk-query-digest, but perhaps I should re-examine if that's correct.Daniel Nichter30 Aug 11 at 8:21 AM