Hack MySQL Blog

Dolphins and camels, oh my!

Archive for the ‘MySQL’ Category

Detecting invalid and zero temporal values

without comments

I’ve been thinking a lot about invalid and zero temporal values and how to detect them with MySQL date and time functions because mk-table-checksum has to handle “everything” correctly and efficiently. The requirements are complex because we have to take into account what MySQL allows to be stored verses what it allows to be used in certain operations and functions, how it sorts a mix of real and invalid temporal values for MIN() and MAX(), how to detect a temporal value as equivalent to zero, and how different MySQL versions might affect any of the aforementioned.

At base, the four guiding requirements are:

  1. Detect and discard invalid time, date, and datetime values
  2. Detect zero-equivalent temporal values
  3. Do #1 and #2 using only MySQL functions
  4. Work in MySQL 4.0 and newer

My tests cases for invalid temporal values are:

  • 00:00:60
  • 00:60:00
  • 999-00-00
  • 999-01-01
  • 0000-00-00
  • 2009-00-00
  • 2009-13-00
  • 999-00-00 00:00:00
  • 999-01-01 00:00:00
  • 0000-00-00 00:00:00
  • 1000-00-00 00:00:00
  • 2009-00-00 00:00:00
  • 2009-13-00 00:00:00
  • 2009-05-26 00:00:60
  • 2009-05-26 00:60:00
  • 2009-05-26 24:00:00

And my test cases for first real temporal values are:

  • 00:00:00
  • 00:00:01
  • 1000-01-01
  • 2009-01-01
  • 1000-01-01 00:00:00
  • 2009-01-01 00:00:00

And there is only one real zero-equivalent temporal value: 00:00:00.

So the first requirement is to find a MySQL function that returns NULL for all those invalid values, and that function is TO_DAYS with one exception:

mysql> SELECT TO_DAYS('999-01-01 00:00:00');
+-------------------------------+
| TO_DAYS('999-01-01 00:00:00') |
+-------------------------------+
|                        364878 |
+-------------------------------+

That date is only valid if years before 1000 are handled but the MySQL manual says that,

TO_DAYS() is not intended for use with values that precede the advent of the Gregorian calendar (1582)

so we’re already way past the limit of its intended use and, moreover, the supported lower limit of a date or datetime is 1000-01-01, so says the manual. It’s reasonable to not bother with pre-year 1000 dates so I’ll overlook this.

Excepting pre-year 1000 dates, TO_DAYS() returns NULL for all the invalid values. By contrast, UNIX_TIMESTAMP() returns zero for all the invalid values and TIME_TO_SEC() returns a mix of NULL, zero, and values. So the apparent winner for requirement #1 is TO_DAYS(), but…

Requirement #2 complicates the issue because the time 00:00:00 is valid and zero-equivalent but TO_DAYS() returns NULL for it. We need a hack that handles all the cases, and here it is:

SELECT IF(TIME_FORMAT(?,'%H:%i:%s')=?, TIME_TO_SEC(?), TO_DAYS(?))

That says, basically: if the value is a time then evaluate it with TIME_TO_SEC(), else evaluate it with TO_DAYS(). It works so well in fact that it satisfies all four requirements. 00:00:00 evaluates to zero, all the invalid values evaluate to NULL, and all the valid values evaluate to various non-null values. I have to use TIME_FORMAT() instead of just TIME() because TIME() wasn’t introduced until MySQL v4.1 (fourth requirement).

The hack works because of this (substituting TIME() for TIME_FORMAT()):

mysql> SELECT TIME('00:00:00');
+------------------+
| TIME('00:00:00') |
+------------------+
| 00:00:00         |
+------------------+

mysql> SELECT TIME('00-00-00');
+------------------+
| TIME('00-00-00') |
+------------------+
| 00:00:00         |
+------------------+

mysql> SELECT TIME('2010-05-26');
+--------------------+
| TIME('2010-05-26') |
+--------------------+
| 00:20:10           |
+--------------------+

mysql> SELECT TIME('2010-05-26 10:10:10');
+-----------------------------+
| TIME('2010-05-26 10:10:10') |
+-----------------------------+
| 10:10:10                    |
+-----------------------------+

As you can see, TIME() (or TIME_FORMAT()) returns the exact same value if the given value is a time, otherwise it interprets the value–which is a date or datetime–as a time causing it to return a different value than the given value. Thus we discern time values from date and datetime values and evaluate them separately with TIME_TO_SEC().

I tested on MySQL v4.0, 4.1, 5.0 and 5.1 and all pass. The only difference is 4.0 verses the others for the pre-year 1000 dates, but I’m ignoring these anyway.

Of course all the preceding could have been accomplished in code by looking at the column type and choosing the correct MySQL function to evaluate the value and check if it’s zero-equivalent, but I was curious to see if it could be done using only MySQL since, after all, it is MySQL that permits these silly, invalid temporals values.

If you know a simpler, more elegant solution that meets the four requirements and passes all the tests, please share!

Written by Daniel Nichter

May 26th, 2010 at 5:48 pm

Hack MySQL tools retired, succeeded

without comments

I’m surprised, and flattered, to see that people still use, write and recommend mysqlsla, mysqlreport and–most surprisingly–mysqlsniffer. In truth, however, I consider all the original Hack MySQL tools as retired. Maatkit consumes the majority of my development time and provides better replacements for all the Hack MySQL tools. The mk tools are better because–most importantly–they’re tested, their code is more robust, and they benefit from the collected knowledge and experience of the community’s top minds (whereas the Hack MySQL tools are brain-children of only my knowledge and experience circa several years ago).

Thus I created a new tools page where I list and briefly profile free, open-source MySQL tools. As the intro paragraph states, MySQL Forge does this, too, but imho the forge is a dense jungle in which it is difficult to discern the useful bits from the less-than-useful bits. My tools page is meant to 1) inform people that the Hack MySQL tools are retired, 2) list replacements for them, and 3) give people new to the MySQL universe a quick, simple list of tools they’ll probably want to become familiar with.

Kind thanks to all who used, wrote about, contributed to and recommend the Hack MySQL tools over the years.

Written by Daniel Nichter

May 23rd, 2010 at 12:45 pm

Posted in MySQL

Tagged with ,

MySQL features timeline

without comments

I’ve begun a MySQL features timeline which is a quick reference showing as of what version MySQL features were added, changed or removed. The manual tells us this, of course, but I wanted a quicker reference. The list is far from complete as there’s a huge number of features to cover. I’ll continue to improve it and help is appreciated. Send me a quick email saying “feature x added/removed/changed as of version y” and I’ll do the rest. — If someone has already done this, please give me the url so I don’t reinvent the wheel.

Written by Daniel Nichter

October 26th, 2009 at 6:28 pm

Posted in MySQL

Tagged with ,

mysqlsla v2.00 released

without comments

mysqlsla v2 is finally “done” and released. About 3 months ago, when v1.8 was released, I said it would be coming “soon,” but time just flew by and here we are. Oh well. In any case, the v1 branch is dead to me and v2 is all the rave (at least for me). If you don’t care about the differences and all you want is your default top 10 report from a slow log, for example, then all you need to know is: mysqlsla -lt slow SLOW_LOG

For those interested in what has changed to warrant a new major version number, here’s the briefing of changes/overhauls:

  1. Almost ALL new command line options (–log-type a.k.a. -lt is the most important); see the documentation
  2. Customizable reports. In v1 the report was hard-coded. Now with -rf FILE you can format your own report. See the doc on reports.
  3. FULL filtering and sorting. Somewhere around v1.7 (if I remember correctly) filtering based on things like the connection ID, etc. was introduced. v2 is more abstracted and allows more complex filtering on any “meta-property” available in the log (and sorting by most of them, too). See the doc on filters.
  4. Replays to compact fat text log files into super-compact binary files.
  5. Slightly better query abstraction.
  6. Full support for microslow patched slow logs.
  7. User-defined logs so that, one day, we may begin parsing, filtering, analyzing and sorting custom logs from things like MySQL Proxy.
  8. And in general, what I called the “mysqlsla v2 Library: 6 long, detailed pages documenting just about single, tiny aspect of mysqlsla (plus with the “installer” you get a mysqlsla man page, too).

As this is pretty much an entire code overhaul, please please report bugs, problems, suggestions and whatever. Please also send me your log files: slows, generals, binaries. They help me tremendously because I cannot sit around inventing up all the weird kinds of stuff I’ve seen in other people’s logs.

And as always, thank you all for your support, and thanks to those people who provided feedback, patches and such.

Written by Daniel Nichter

July 12th, 2008 at 1:39 pm

mysqlsla: v1.8 released, v2 coming

without comments

mysqlsla v1.8 is available at:
http://hackmysql.com/scripts/mysqlsla-1.8-DEBUG
I am releasing it publicly without updating the mysqlsla web page or documentation because, instead, I am waiting until I finish mysqlsla v2. After working with v1.8 I realized the code needed a major re-think and overhaul. v2 will reflect this and will be a far superior log hacking and analyzing tool, capable of far more than v1.8 is now.

But for now, 1.8 fixes several good (or bad?) bugs:

  • –only-hosts did not work for general logs
  • Multi-line comments using /* */ caused everything after the first line to be ignored in raw logs
  • “Change user” commands were not handled in general logs
  • CHANGE, DROP and RESET statements were filtered out
  • A certain variant of the Connect command in general logs was not handled
  • A few more SQL capitalization/beautifications were added

If you use mysqlsla, you should really get v1.8 due to these bug fixes. And if you have used mysqlsla before and found it didn’t do what you needed: tell me because v2 open to suggestions. My goal for v2 is to make it the MySQL log hacking tool, capable of parsing, ordering, analyzing, sorting, etc. a MySQL log in any way possible and then to serve as a liaison to other scripts, such as Baron’s mk-query-profiler; e.g. have mysqlsla process and filter a log and then feed SQL statements to mk-query-profiler or any other script. Currently, v1.8 is too stuck in its own ways to help other scripts. Such selfishness shall be expunged.

Written by Daniel Nichter

April 19th, 2008 at 10:53 am

Posted in MySQL, mysqlsla