COM_QUIT and Questions

In the pursuit of intricate details, we always want 2 + 2 to equal 4. In the case of the status value Questions, 2 + 2 seems to equal much less than 4. On a production server this discrepancy can often be a magnitude in the millions. Thanks to a poster named bwakem on dbforums.com, the primary discrepancy was found to be the status value Connections, which usually accounts for the missing Questions.

In an effort to understand why, beyond simply knowing that this is, I dove into the MySQL source code (version 5.0.10 beta) to see what was going on and where. This document is part detailed explanation of why Connections allows Questions to add up evenly, and part journey through the MySQL source code. Numbers in brackets like [1507] scattered throughout the document are source code line references.

Adding Up Questions

If you've read this article before, you'll remember I formerly thought the formula for the status value Questions was:
Questions = Com_* + Qcache_hits + ((Connections - 2) - (Aborted_clients / 2))
However, since mysqlreport v2.2 and its new --dtq (Distribution of Total Questions) option, I've found that this is not always 100% accurate. Meaning, there are yet other things which increment Questions, but cannot be directly counted with the values from SHOW STATUS. Consequently, when mysqlreport notices such "hidden" values, it lists a line in the DTQ report for these called "Unknown." (Ironically, the last sentence of this article, as preserved, mentions "the satisfaction of knowing MySQL isn't hiding something from us"—apparently I was wrong.) On some servers, as one example report shows, the value of Unknown Questions as a percentage of Total Questions can be quite large. However, it is possible to have no Unknown Questions. For this reason, I leave this article as-is until I can do the laborious task of tracking down all the hidden values. For now, here follows the original article; just be aware there's more to be discovered...

What increments Connections is not actually what increments Questions in turn. The MySQL protocol command COM_QUIT is actually what increments Questions; Connections is the closest status available to us to quantify how many COM_QUIT commands MySQL has handled. Naturally this works because every connection is eventually going to quit (forgetting, for the sake of simplicity, persistent connections). Two is subtracted from Connections first for the current connection (the one in which SHOW STATUS is being executed, which obviously hasn't quit yet), and again because MySQL starts Connections at 1 instead of 0 so the very first connection shows Connections is 2 (I don't why; an issue for another document perhaps). Half of Aborted_clients is subtracted from Connections because an aborted client causes Connections to increment but doesn't send a COM_QUIT command (hence why it's considered aborted). Aborted_clients is divided by 2 because MySQL apparently increments it in twos for even a single aborted client (another issue for another document). What would increment Com_select increments Qcache_hits instead if MySQL is able to get the query from the query cache. The sum of all Com_* status values is the primary contribution to Questions. (Although I haven't tested more particular Com_ values like Com_slave_start, looking at the source code I believe it's reasonable to say all Com_ values increment Questions.)

COM_QUIT and Questions In Action

An important part of understanding the MySQL source code is the global and per-thread variables query_id (sql/mysql_priv.h [47-49] and sql/sql_class.h [1233-1241]). For clarity, and because it appears this way in the code, query_id refers to the global query_id and thd->query_id refers to the per-thread query_id (where thd is an instantiation of class THD, sql/sql_class.h [1028-1490]). query_id starts at 1. Every question MySQL handles is given a number which comes from query_id. Therefore, if you start MySQL, login with the mysql cli, and quit (\q), when the cli sends the COM_QUIT command, this being the first question is given the number 1. Naturally, after each question is given its number, query_id is incremented by 1. This process happens in sql/sql_parse.cc, function dispatch_command() [1507-1509]. First thd->query_id is assigned the current value of query_id [1507]. If the question (i.e., command, query, etc.) is not COM_STATISTICS or COM_PING [1508], then query_id is incremented by 1 [1509].

Knowing this, it's easy to understand why Questions is simply thd->query_id (sql/sql_show.cc [1338]). Although each thread (thd) will have a different value for its query_id, when you tell MySQL "SHOW STATUS;", at that instant that question (which results in a COM_QUERY command) gets the next global query_id number. Since all threads have been taking their numbers from query_id, this is an accurate count of all questions for all threads.

Since COM_QUIT is sent by every civilized MySQL interface when the script or program using that interface is done, this explains why COM_QUIT, seen by us as Connections, counts toward Questions. In a sense you could say the formula for COM_QUIT is:
COM_QUIT = (Connections - 2) - (Aborted_clients / 2)
Perhaps MySQL AB will add a Com_quit status value. Then the formula for Questions would be really simple and intuitive:
Questions = Com_* + Qcache_hits
Until then, we at least have the satisfaction of knowing MySQL isn't hiding something from us; it's just a little obfuscated.