Hack MySQL Blog

Compiling Drizzle 7 on Mac OS X 10.6

with 3 comments

Drizzle 7 GA has been released, so I wanted to compile and test it on my Mac running OS X 10.6.7.  Since Drizzle 7 is new, Mac binaries are not available yet.  I’ve compiled MySQL from source more times than I can remember, and Drizzle was forked from MySQL, so I expected the build process to be similar and pain-free, and for the most part it was.  I did not use MacPorts or Homebrew for various reasons, mainly because I know that I will compile, tweak and recompile Drizzle often while hacking on it.  Also, the blog post  Drizzle in the Snow is about building Drizzle on Mac OS X, but it’s out of date (published September 1, 2009).  Thus the need for this blog post.

After describing how I compiled Drizzle 7 on my Mac, I list several suggestions for the Drizzle developers/maintainers based on my experience.

I began, of course, by downloading and extracting the tarball from drizzle.org.  Then I proceeded to read the fine manual, even though the build was surely the standard trio: configure, make, make install.  There are three docs about compiling Drizzle: the README in the tarball, the online Drizzle documentation, and the DrizzleWiki.  The README lists that standard trio and refers to the online docu.  The online docu also lists the standard trio but, of primary importance when compiling from source, it also lists dependencies.  The wiki page on compiling also refers to the online docu, and if you Google for “compile drizzle mac” you’ll find other wiki pages but they are all out of date (e.g. one still says you need to compile libdrizzle first; this is no longer true: libdrizzle comes with Drizzle).  So, the online docu is the best source because we need to know Drizzle’s dependencies.

The online docu does not have specific instructions for compiling Drizzle from source on Mac OS X.  On this platform, the number one pre-requisite is Xcode: Apple’s development package which installs all the basic programs for compiling code (e.g. g++, make, etc.)  I’m using Xcode 3.2 because it’s free (whereas Xcode 4 cost a few dollars).  So, Xcode must be installed first before any other packages can be compiled (but if you already develop on a Mac, you probably already knew this).

From the list of dependencies, I had to download and install boost, gettext, intltool, and protobuf.  These last three are really simple: configure, make, make check, make install.  I did make check because although Mac OS X is a Unix platform, it has its quirks. Those three packages compiled, tested and installed effortlessly, all using the prefix /usr/local/. boost, however, required more attention.

To start, I just followed boost’s docu page Getting Started on Unix Variants, but it didn’t work as easily as the other packages. First, since Drizzle lists only a few boost libs as dependencies, I only compiled those:

./bootstrap.sh \
  --with-libraries=date_time,filesystem,iostreams,program_options,regex,test,thread

When that finishes, it tells me:

Bootstrapping is done. To build, run:

    ./bjam

That is different from the docu which says to run bjam install. Running just bjam does compile the libs, but it doesn’t install them. Running with the install argument makes bjam hang. So I tried bjam --install which compiles the libs but still doesn’t install the libs. Furthermore, the compiled libs did not have -mt suffixes to indicate that they where multi-threading. I noticed that Drizzle configure checks first for libs with the -mt suffix, so I really wanted to make my boost libs multi-threading. After asking around then filing a boost bug about this, I was informed that the docu was out of date and that I needed to use --layout=tagged. So, finally, to compile and manually install the boost libs for Drizzle I did (all commands):

bootstrap.sh \
  --with-libraries=date_time,filesystem,iostreams,program_options,regex,test,thread
bjam --layout=tagged
sudo cp stage/lib/* /usr/local/lib/
sudo mv boost /usr/local/include/

After that, /usr/local/include/boost/ has the boost header files and /usr/local/lib/ has boost libs like:

/usr/local/lib$ ls libboost*
libboost_date_time-mt.a                libboost_regex-mt.a
libboost_date_time-mt.dylib*           libboost_regex-mt.dylib*
libboost_filesystem-mt.a               libboost_system-mt.a
libboost_filesystem-mt.dylib*          libboost_system-mt.dylib*
libboost_iostreams-mt.a                libboost_test_exec_monitor-mt.a
libboost_iostreams-mt.dylib*           libboost_thread-mt.a
libboost_prg_exec_monitor-mt.a         libboost_thread-mt.dylib*
libboost_prg_exec_monitor-mt.dylib*    libboost_unit_test_framework-mt.a
libboost_program_options-mt.a          libboost_unit_test_framework-mt.dylib*
libboost_program_options-mt.dylib*

Now Drizzle configure finishes, it doesn’t die saying it can’t find a library it needs. Furthermore, I see that it finds the -mt boost libs, so I hope that makes it happy (because, after all, one selling point for Drizzle is massive concurrency, so I don’t want to build it without being sure it’s using multi-threading libs). But there’s a problem: configure finishes by saying:

Configuration summary for drizzle7 version 7 drizzle

   * Installation prefix:       /usr/local
   * System type:               apple-darwin10.7.0
   * pandora-build version:     0.175
   * Host CPU:                  i386
   * C Compiler:                i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5664)
   * C++ Compiler:              i686-apple-darwin10-g++-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5664)
   * Assertions enabled:        yes
   * Debug enabled:             no
   * Profiling enabled:         no
   * Coverage enabled:          no
   * Warnings as failure:       no

My MacBook Pro has an Intel Core 2 Duo, so the CPU is definitely not i386; it should be x86_64. I’m not a build expert, but I think this information is detected by config.guess by looking at uname values, which on my system returns:

$ uname -a
Darwin MacBook-Pro.local 10.7.0 Darwin Kernel Version 10.7.0:
Sat Jan 29 15:17:16 PST 2011; root:xnu-1504.9.37~1/RELEASE_I386 i386

That probably explains why configure detects the CPU has i386 (and the OS version as 10.7). So to correct this, I ran: configure --build=x86_64-apple-darwin10.6 --prefix=/opt/drizzle7. (I added --prefix because I want to install it elsewhere.) That gives me the results I expect:

Configuration summary for drizzle7 version 7 drizzle

   * Installation prefix:       /opt/drizzle7
   * System type:               apple-darwin10.6
   * pandora-build version:     0.175
   * Host CPU:                  x86_64
   * C Compiler:                i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5664)
   * C++ Compiler:              i686-apple-darwin10-g++-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5664)
   * Assertions enabled:        yes
   * Debug enabled:             no
   * Profiling enabled:         no
   * Coverage enabled:          no
   * Warnings as failure:       no

Again, I’m no build expert, but I hope this has an affect (i.e. the code is compiled with modern optimizations). If nothing else, it gives me piece of mind. So I run make and after some time Drizzle compiles without any problems. More importantly, I run make test and the result:

All 557 tests were successful.
The servers were restarted 54 times
Spent 242.179 of 386 seconds executing testcases

Congratulations to the Drizzle developers! All tests passing straight out of the box on a completely foreign system (i.e. mine) is an accomplishment.

Finally, make install puts everything in the prefix I chose (/opt/drizzle7):

/opt/drizzle7$ ls *
bin:
drizzle*       drizzleadmin*  drizzledump*   drizzleimport* drizzleslap*

include:
drizzle7/       libdrizzle-1.0/

lib:
drizzle7/                       libdrizzledmessage.0.dylib*
libdrizzle.1.1.0.dylib@         libdrizzledmessage.dylib@
libdrizzle.1.dylib*             libdrizzledmessage.la*
libdrizzle.dylib@               locale/
libdrizzle.la*                  pkgconfig/
libdrizzledmessage.0.0.0.dylib@

sbin:
drizzled@  drizzled7*

share:
man/

Now to run Drizzle for the “first” time (actually I compiled and ran Drizzle last year on my PC/Ubuntu machine, so this is only my first time on my Mac). I use the command line specified in the README:

/opt/drizzle7$ ./sbin/drizzled \
   --no-defaults \
   --port=3306 \
   --basedir=/opt/drizzle7 \
   --datadir=/tmp/drizzle7/data >> /tmp/drizzle7/drizzled.err 2>&1 &
[1] 72197

/opt/drizzle7$
[1]+  Exit 1  ./sbin/drizzled [snip] >> /tmp/drizzle7/drizzle.err 2>&1

/opt/drizzle7$ cat /tmp/drizzle7/drizzled.err
unknown option port
Use --help to get a list of available options

Aborting

Oh no! drizzled failed to start because the --port option doesn’t actually exist. So the README is a little out of day. No problem: just re-run it without --port and it starts without problems:

$ mysql -h 127.1
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 2011.03.13 Source distribution (drizzle)

All in all, the build process was painless. (boost gave me more problems than Drizzle.) Now I can start hacking on Drizzle and comparing it to MySQL (whether or not that’s a fair comparison, people are going to do it often). After a few minutes of looking around Drizzle, I can already see that its “out of the box experience” is quite different from MySQL’s. For example, because I didn’t build Drizzle with extra plugins, the default authentication plugin is auth_all, i.e. no authentication, and the logging_query plugin isn’t available so no slow query logging, either. These are topics and considerations for other blog posts.

Based on my experience compiling Drizzle 7 GA from source on Mac OS X 10.6, here are my recommendations to the Drizzle developers/maintainers:

  1. Remove DrizzleWiki and centralize all documentation at docs.drizzle.org.
  2. Update the tarball README and/or add an INSTALL.
  3. Update docs.drizzle.org (i.e. make everything refer only to Drizzle 7).
  4. Consolidate information on docs.drizzle.org (e.g. “Software Requirements” lists some required libs, then “Minimal Requirements” lists more).
  5. When compilation instructions specific to Mac OS X are written, mention the need for Xcode.
  6. Harass the boost developers to update their documentation.
  7. Provide Mac OS X binaries.

As a developer of a much smaller project, I know this is all easier said than done.  Overall though, good job on Drizzle 7.

Written by Daniel Nichter

April 10th, 2011 at 12:37 pm

Posted in Drizzle

Tagged with , , , ,

Hack MySQL is longer maintained

with one comment

I have decided to “officially” retire Hack MySQL, which is to say that the web site (pages, tools, etc.) will no longer be maintained. The MySQL universe has changed significantly since 2004 when I began writing tools. Similarly, my professional and personal lives have changed, too. I don’t think Hack MySQL is needed or relevant any longer, and I certainly do not have the time to maintain it.

The web site will remain online for perhaps another year (since Media Temple has always provided free hosting for this site–thanks!), then I will remove it or archive it somewhere else.

For further MySQL resources, see Planet MySQL. That aggregate is, imho, the pulse of the MySQL community.

Written by Daniel Nichter

November 19th, 2010 at 10:26 am

Detecting invalid and zero temporal values

without comments

I’ve been thinking a lot about invalid and zero temporal values and how to detect them with MySQL date and time functions because mk-table-checksum has to handle “everything” correctly and efficiently. The requirements are complex because we have to take into account what MySQL allows to be stored verses what it allows to be used in certain operations and functions, how it sorts a mix of real and invalid temporal values for MIN() and MAX(), how to detect a temporal value as equivalent to zero, and how different MySQL versions might affect any of the aforementioned.

At base, the four guiding requirements are:

  1. Detect and discard invalid time, date, and datetime values
  2. Detect zero-equivalent temporal values
  3. Do #1 and #2 using only MySQL functions
  4. Work in MySQL 4.0 and newer

My tests cases for invalid temporal values are:

  • 00:00:60
  • 00:60:00
  • 999-00-00
  • 999-01-01
  • 0000-00-00
  • 2009-00-00
  • 2009-13-00
  • 999-00-00 00:00:00
  • 999-01-01 00:00:00
  • 0000-00-00 00:00:00
  • 1000-00-00 00:00:00
  • 2009-00-00 00:00:00
  • 2009-13-00 00:00:00
  • 2009-05-26 00:00:60
  • 2009-05-26 00:60:00
  • 2009-05-26 24:00:00

And my test cases for first real temporal values are:

  • 00:00:00
  • 00:00:01
  • 1000-01-01
  • 2009-01-01
  • 1000-01-01 00:00:00
  • 2009-01-01 00:00:00

And there is only one real zero-equivalent temporal value: 00:00:00.

So the first requirement is to find a MySQL function that returns NULL for all those invalid values, and that function is TO_DAYS with one exception:

mysql> SELECT TO_DAYS('999-01-01 00:00:00');
+-------------------------------+
| TO_DAYS('999-01-01 00:00:00') |
+-------------------------------+
|                        364878 |
+-------------------------------+

That date is only valid if years before 1000 are handled but the MySQL manual says that,

TO_DAYS() is not intended for use with values that precede the advent of the Gregorian calendar (1582)

so we’re already way past the limit of its intended use and, moreover, the supported lower limit of a date or datetime is 1000-01-01, so says the manual. It’s reasonable to not bother with pre-year 1000 dates so I’ll overlook this.

Excepting pre-year 1000 dates, TO_DAYS() returns NULL for all the invalid values. By contrast, UNIX_TIMESTAMP() returns zero for all the invalid values and TIME_TO_SEC() returns a mix of NULL, zero, and values. So the apparent winner for requirement #1 is TO_DAYS(), but…

Requirement #2 complicates the issue because the time 00:00:00 is valid and zero-equivalent but TO_DAYS() returns NULL for it. We need a hack that handles all the cases, and here it is:

SELECT IF(TIME_FORMAT(?,'%H:%i:%s')=?, TIME_TO_SEC(?), TO_DAYS(?))

That says, basically: if the value is a time then evaluate it with TIME_TO_SEC(), else evaluate it with TO_DAYS(). It works so well in fact that it satisfies all four requirements. 00:00:00 evaluates to zero, all the invalid values evaluate to NULL, and all the valid values evaluate to various non-null values. I have to use TIME_FORMAT() instead of just TIME() because TIME() wasn’t introduced until MySQL v4.1 (fourth requirement).

The hack works because of this (substituting TIME() for TIME_FORMAT()):

mysql> SELECT TIME('00:00:00');
+------------------+
| TIME('00:00:00') |
+------------------+
| 00:00:00         |
+------------------+

mysql> SELECT TIME('00-00-00');
+------------------+
| TIME('00-00-00') |
+------------------+
| 00:00:00         |
+------------------+

mysql> SELECT TIME('2010-05-26');
+--------------------+
| TIME('2010-05-26') |
+--------------------+
| 00:20:10           |
+--------------------+

mysql> SELECT TIME('2010-05-26 10:10:10');
+-----------------------------+
| TIME('2010-05-26 10:10:10') |
+-----------------------------+
| 10:10:10                    |
+-----------------------------+

As you can see, TIME() (or TIME_FORMAT()) returns the exact same value if the given value is a time, otherwise it interprets the value–which is a date or datetime–as a time causing it to return a different value than the given value. Thus we discern time values from date and datetime values and evaluate them separately with TIME_TO_SEC().

I tested on MySQL v4.0, 4.1, 5.0 and 5.1 and all pass. The only difference is 4.0 verses the others for the pre-year 1000 dates, but I’m ignoring these anyway.

Of course all the preceding could have been accomplished in code by looking at the column type and choosing the correct MySQL function to evaluate the value and check if it’s zero-equivalent, but I was curious to see if it could be done using only MySQL since, after all, it is MySQL that permits these silly, invalid temporals values.

If you know a simpler, more elegant solution that meets the four requirements and passes all the tests, please share!

Written by Daniel Nichter

May 26th, 2010 at 5:48 pm

Down the rabbit hole

without comments

Generally I avoid going down rabbit holes but today I decided to see how deep a particular testing rabbit hole went. This post is a third in what seems be a continuing series of programming anecdotes. It’s not particularly MySQL-related so you can stop reading here unless you grok code stuff.

Before beginning work on issue 720 I ran the mk-table-checksum test suite to make sure it was in working order. No sense writing new tests and code when the old tests and code aren’t reliable. I actually made one seemingly innocuous change to the test suite in preparation for the issue: I changed the –replicate checksum table from MyISAM to InnoDB.

Surprisingly, the test suite proved unstable. Random tests would fail at random times. Some instability was due to new tests for other issues that I wrote poorly and hadn’t been run a zillion times yet. But other instability was due to switching the checksum table to InnoDB. I knew this because I could switch it back to InnoDB and the tests were ok. Thus began my descent into the rabbit hole.

In particular the test for issue 982 was not finishing or, if it did, it took forever. I discovered that the problem had to do with –lock even though it shouldn’t since the docu says that –replicate and –lock are useless together because the former eliminates locking concerns. Obviously not. So for this I created issue 1027.

Then I turned my attention to test 207_issue_51.t for issue 51 which was opened 2 years ago and closed 1 year ago as WontFix because the problem could not be reproduced. Some might have thought it forgotten amongst the nearly 1,000 issues since its time, but it was found again at the end of this rabbit hole and revivified.

It turns out that issue 51 is, as the original reporter stated (partially), caused by –wait when (this is the part the original reporter didn’t state) the –replicate checksum table is InnoDB because –wait enables –lock which disables AutoCommit so writes to the checksum table may not be committed.

So the rabbit hole lead from issue 720 to issue 982 to issue 1027 to issue 51. Then end results will be a needed bug fix and faster, more stable tests. Sometimes it pays to follow the rabbit.

Written by Daniel Nichter

May 25th, 2010 at 4:28 pm

Posted in Programming

Tagged with , ,

Hack MySQL tools retired, succeeded

without comments

I’m surprised, and flattered, to see that people still use, write and recommend mysqlsla, mysqlreport and–most surprisingly–mysqlsniffer. In truth, however, I consider all the original Hack MySQL tools as retired. Maatkit consumes the majority of my development time and provides better replacements for all the Hack MySQL tools. The mk tools are better because–most importantly–they’re tested, their code is more robust, and they benefit from the collected knowledge and experience of the community’s top minds (whereas the Hack MySQL tools are brain-children of only my knowledge and experience circa several years ago).

Thus I created a new tools page where I list and briefly profile free, open-source MySQL tools. As the intro paragraph states, MySQL Forge does this, too, but imho the forge is a dense jungle in which it is difficult to discern the useful bits from the less-than-useful bits. My tools page is meant to 1) inform people that the Hack MySQL tools are retired, 2) list replacements for them, and 3) give people new to the MySQL universe a quick, simple list of tools they’ll probably want to become familiar with.

Kind thanks to all who used, wrote about, contributed to and recommend the Hack MySQL tools over the years.

Written by Daniel Nichter

May 23rd, 2010 at 12:45 pm

Posted in MySQL

Tagged with ,