Hack MySQL

Debugging and ripple effects


As I said earlier, every tiny difference the test suite reveals after code changes is significant. I caught a very subtle “bug” today in recent changes to mk-query-digest (a.k.a. mqd). If you like reading about subtle bugs, read on.

An mqd test using sample file slow023.txt began to produce different output after some pretty extensive recent code changes:

< # Query 1: 0 QPS, 0x concurrency, ID 0x8E38374648788E52 at byte 0 ________
---
> # Query 1: 0 QPS, 0x concurrency, ID 0x2CFD93750B99C734 at byte 0 ________

The ID, which depends on the query’s fingerprint, has changed. It’s very important that we don’t suddenly change these IDs on users because they are pivotal in trend analyses with mqd’s --review-history option. First, some background on the recent code changes, then the little story of how I tracked down the source of this change.

mqd internals used to run like this: call a parser module (like SlowLogParser) and pass it an array of callbacks through which it ran events. Now there’s a single, unified pipeline of “callbacks” (technically they’re no longer callbacks). The first process in the pipeline is usually a parser module that returns each event, and mqd keeps pumping the events through the rest of the pipeline (in contrast to before, when the parser module did the pumping). So “obviously” this has nothing to do with query fingerprinting or ID-making, which happen in code that has not changed. Thus, this “bug” was very perplexing at first.
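In sketch form (names and structure here are illustrative, not mqd’s actual code), the difference looks something like this:

#!/usr/bin/env perl
use strict;
use warnings;

# Old design (sketch): the parser is handed the callbacks and pumps
# every event through them itself.
sub parse_events {
    my ( $next_event, @callbacks ) = @_;
    while ( defined( my $event = $next_event->() ) ) {
        $_->($event) for @callbacks;    # callbacks run inside the parser
    }
}

# New design (sketch): the parser is just the first process in the
# pipeline; it returns one event at a time and the main loop pumps.
sub main_loop {
    my ( $parser, @procs ) = @_;
    while ( defined( my $event = $parser->() ) ) {
        $event = $_->($event) for @procs;    # pumping happens out here
    }
}

Keep the phrase “callbacks run inside the parser” in mind; it matters below.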

First step: use the Perl debugger to see what value make_checksum(), the sub that makes the query IDs, used to get and what it gets now:

Old rev:

DB<3> x $item
0 'select count(*) as a from x '

New rev:

DB<12> x $item
0 'select count(*) as a from x'

The difference is that single trailing space. But why has this space suddenly disappeared in the new (later) rev? Something must have changed in fingerprint(), the sub that produces that fingerprint. Time for the debugger again: step through fingerprint() with a watchpoint set on the var.
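Setting that up looks roughly like this (a sketch; prompt numbers are illustrative):

DB<1> b QueryRewriter::fingerprint
DB<2> c
DB<3> w $query
DB<4> n

Stepping through the old rev then shows: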

1574: $query =~ s/\A\s+//; # Chop off leading whitespace
DB<6>
Watchpoint 0: $query changed:
old value: ' select count(*) as A from X
'
new value: 'select count(*) as A from X
'
QueryRewriter::fingerprint(bin/mk-query-digest:1575):
1575: chomp $query; # Kill trailing whitespace
DB<6>
QueryRewriter::fingerprint(bin/mk-query-digest:1576):
1576: $query =~ tr[ \n\t\r\f][ ]s; # Collapse whitespace
DB<6>
Watchpoint 0: $query changed:
old value: 'select count(*) as A from X
'
new value: 'select count(*) as A from X '

Notice that the var did not change when the “# Kill trailing whitespace” line (the chomp) executed; the trailing newline survived, and was then collapsed into a single trailing space when “# Collapse whitespace” executed. Now the new rev:

1585: $query =~ s/\A\s+//; # Chop off leading whitespace
DB<4>
Watchpoint 0: $query changed:
old value: ' select count(*) as A from X
'
new value: 'select count(*) as A from X
'
QueryRewriter::fingerprint(../mk-query-digest:1586):
1586: chomp $query; # Kill trailing whitespace
DB<4>
Watchpoint 0: $query changed:
old value: 'select count(*) as A from X
'
new value: 'select count(*) as A from X'

Notice how chomp in the new rev did remove the trailing newline; chomp’s result has changed, but why? In case you didn’t know, chomp actually removes a trailing $INPUT_RECORD_SEPARATOR ($/), not just a newline. It just so happens that many of the parser modules change $INPUT_RECORD_SEPARATOR.
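A tiny demonstration of that (the separator value is arbitrary, just to make the point):

#!/usr/bin/env perl
use strict;
use warnings;
use English qw(-no_match_vars);    # $INPUT_RECORD_SEPARATOR is $/

my $s = "select 1\n";

{
    local $INPUT_RECORD_SEPARATOR = ";\n#";    # something parser-ish
    my $t = $s;
    chomp $t;                   # "select 1\n" does not end in ";\n#" ...
    print length($t), "\n";     # ... so nothing is removed: prints 9
}

{
    my $t = $s;                 # default $/ is "\n"
    chomp $t;                   # removes the trailing newline
    print length($t), "\n";     # prints 8
}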

The root of this subtle but very important change is that the parser modules no longer call the pipeline callbacks. When they did, their local changes to $INPUT_RECORD_SEPARATOR were still in dynamic scope and therefore visible to the callbacks, and operations like fingerprint() are part of the callbacks. Now that the parsers simply return events, their changes to $INPUT_RECORD_SEPARATOR are “invisible” downstream, and operations like fingerprint() see a different (i.e. the default) $INPUT_RECORD_SEPARATOR.
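Here’s a self-contained sketch of that scoping effect (again, the separator value is illustrative). local gives $INPUT_RECORD_SEPARATOR dynamic scope: the new value is seen by everything called while it’s in effect, and restored when the enclosing sub returns:

#!/usr/bin/env perl
use strict;
use warnings;
use English qw(-no_match_vars);

my $event = "select count(*) as A from X\n";

my $callback = sub {
    my ($ev) = @_;
    chomp $ev;              # behavior depends on the current $/
    print "[$ev]\n";
};

# Old style: the callback runs while the parser's local $/ is still in
# effect, so chomp looks for ";\n#" and leaves the newline alone.
sub old_style_parser {
    my ($cb) = @_;
    local $INPUT_RECORD_SEPARATOR = ";\n#";
    $cb->($event);
}

# New style: the parser returns the event; its local $/ is restored at
# that point, so downstream code sees the default "\n".
sub new_style_parser {
    local $INPUT_RECORD_SEPARATOR = ";\n#";
    return $event;
}

old_style_parser($callback);          # the newline survives the chomp
$callback->( new_style_parser() );    # the newline is chomped away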

Conclusion in brief: an issue of scope near the beginning of mk-query-digest changes what chomp does, causing fingerprint() and make_checksum() to generate different query IDs at the other end of the script.
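To make the ripple concrete, here’s a toy version of the ID-making step. It’s modeled on the general idea (a short hex checksum of the fingerprint), not copied from mqd, so the real make_checksum may differ:

#!/usr/bin/env perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Toy make_checksum: the last 16 hex digits of an MD5 of the fingerprint.
sub make_checksum {
    my ($fingerprint) = @_;
    return uc substr( md5_hex($fingerprint), -16 );
}

# The two fingerprints from the debugger session differ by one trailing
# space, and that alone yields two different IDs:
print make_checksum('select count(*) as a from x '), "\n";
print make_checksum('select count(*) as a from x'), "\n";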

Written by Daniel Nichter

November 18th, 2009 at 4:58 pm

Posted in Programming


Setting the MySQL Sandbox prompt


This is far from deeply technical, but little things that should be simple yet aren’t annoy me. I found that MySQL Sandbox’s --prompt_prefix and --prompt_body options don’t “just work.” I wanted the prompt to be mysql \v>  (with a trailing space), so I tried:

make_sandbox_from_source /mysql/src/mysql-4.0.30 single --prompt_body=' \v> '
sh: -c: line 0: syntax error near unexpected token `newline'
sh: -c: line 0: `make_sandbox /mysql/src/mysql-4.0.30/4.0.30 --prompt_body= \v> '

Maybe my shell knowledge is worse than I realize, so I verified that ' \v> ' does not need special escaping:

echo ' \v> '
\v>

OK, so clearly it’s the fault of make_sandbox_from_source. I tried and tried to make this work on the command line but failed, so I decided to try ~/.msandboxrc, specifying:

prompt_prefix=mysql
prompt_body= \v>

That didn’t work either; it created the prompt 'mysql4.0.30>. The leading space in prompt_body is ignored, and there’s an errant apostrophe at the start. The final, working version is:

prompt_prefix=mysql
prompt_body=\v> '

You can’t see it, but there’s a trailing space after “mysql” in prompt_prefix, and it is not ignored (unlike the leading space in prompt_body).
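With that in place the prompt comes out as intended: the mysql client expands \v to the server version, so connecting to the sandbox (e.g. via its ./use script) greets you with:

mysql 4.0.30> 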

Written by Daniel Nichter

November 15th, 2009 at 2:49 pm

Posted in Random


Zero is a big number


I made changes to mk-query-digest yesterday that I didn’t expect to cause any adverse effects. On the contrary, several tests began to fail because a single new but harmless line began to appear in the output: “Databases 0”. Perhaps I’m preaching to the choir, as you are all fantastic, thorough, and flawless programmers, but I myself have learned never to take a single failed test for granted.

One time a test failed because some values differed by a millisecond or two. Being curious, I investigated and found that our standard deviation equation was just shy of perfect. I fixed it and spent hours cross-checking the myriad tiny values with my TI calculator. Probably no one cared about 0.023 vs. 0.022, but it’s the cultivation of a disposition toward perfection that matters.

My innocuous changes yesterday introduced a case of Perl auto-vivification. Doing:

my ($db_for_show) = $sample->{db} ? $sample->{db} : keys %{$stats->{db}->{unq}};

can auto-vivify $stats->{db}. Before yesterday, this ran after the report for those stats was printed; yesterday’s changes made it run before. Thus the report did its job and dutifully reported the auto-vivified but empty stats: “Databases 0”. It’s been fixed, and just in time, since I’m doing Maatkit’s October release today.
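For anyone who hasn’t been bitten by auto-vivification before, here’s a minimal demonstration of the effect:

#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;

my $stats = {};

# Merely reading through a chain of nested hash keys creates every
# intermediate level as a side effect:
my @unique_dbs = keys %{ $stats->{db}->{unq} };

print Dumper($stats);    # $VAR1 = { 'db' => { 'unq' => {} } }
print "db exists\n" if exists $stats->{db};    # true, though nothing was assigned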

Written by Daniel Nichter

October 30th, 2009 at 9:20 am

Posted in Maatkit, Programming


MySQL features timeline


I’ve begun a MySQL features timeline: a quick reference showing in which version MySQL features were added, changed, or removed. The manual tells us this, of course, but I wanted a quicker reference. The list is far from complete, as there’s a huge number of features to cover. I’ll continue to improve it, and help is appreciated: send me a quick email saying “feature x was added/removed/changed as of version y” and I’ll do the rest. If someone has already done this, please send me the URL so I don’t reinvent the wheel.

Written by Daniel Nichter

October 26th, 2009 at 6:28 pm

Posted in MySQL


mk-table-sync and small tables


Issue 634 made me wonder how the various mk-table-sync algorithms (Chunk, Nibble, GroupBy and Stream) perform when faced with a small number of rows. So I ran some quick, basic benchmarks.

I used three tables, each with an integer primary key, having 109, 600, and 16k+ rows. I did two runs for each of the four algorithms: the first run used an empty destination table, so every row from the source had to be synced; the second run used an already-synced destination table, so every row had to be checked but none synced. I ran Perl with DProf to get simple wallclock and user time measurements.
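Each profiled run looked roughly like this (a sketch: the connection arguments are placeholders, and the --algorithms option name should be checked against your version’s documentation):

perl -d:DProf ./mk-table-sync --algorithms Chunk --execute \
    h=host1,D=test,t=t1 h=host2
dprofpp tmon.out    # summarizes elapsed (wallclock) and user+system time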

Here are the results for the first run:
[Chart: mk-table-sync-small-tables-run-1]

When the table is really small (109 rows), there’s hardly any difference between the algorithms. As the table becomes larger, the GroupBy and Stream algorithms are much faster than the Chunk and Nibble algorithms. This is actually expected, even though Chunk and Nibble are considered the best and fastest algorithms; see point 3 in the conclusions below.

Now here’s the second run:
[Chart: mk-table-sync-small-tables-run-2]

The small table again performs roughly the same under all algorithms. Stream is clearly the fastest, but what’s more notable is that GroupBy and Nibble are nearly identical, even though Nibble is tremendously more complex than GroupBy. As the table becomes bigger (16k+ rows), mk-table-sync “conventional wisdom” is more clearly illustrated: Chunk and Nibble are far faster than GroupBy and Stream.

This was a very quick benchmarking job, but from it we can draw some conclusions:

  1. There’s little difference between the algorithms when syncing small tables.
  2. The GroupBy algorithm might be the best choice for small tables since it’s comparable to Chunk and Nibble but internally less complex (it doesn’t use checksums or crypto hashes, for example).
  3. If syncing to a destination that is missing a lot of rows, GroupBy and Stream can be much faster because Chunk and Nibble waste a lot of time checksumming chunks, which GroupBy and Stream do not do.
  4. When syncing large tables, or tables with few differences, conventional wisdom still holds: Chunk and Nibble are the best choices.

Written by Daniel Nichter

October 23rd, 2009 at 2:32 pm

Posted in Maatkit
