MySQL varchar not case sensative

I managed to overlook an issue with creating a varchar column in an app I have been working on. I have basically got a normalised table, with a forien key to a table of values. In this case, its a set of HTML Document Titles, keyed off an autoincrement column called Title_ID. What I want to do is look up a title, and get a Title_ID back.

Great; I can do this with a stored function, which I did, and it worked. But it was slow. So I decided that I’d normalise these en-mass with one big INSERT statement to the normalised table (protected by a unique index constraint), and then store the resulting Title_ID.

There be dragons. As one title came through as “[Q] help me” it was duly inserted and given a Title_ID.
However, when a lower case “[q] help me” came through, it matched as a duplicate of the original and therefore was not inserted again. I then pulled the strings into a Perl hash, and of course, couldn’t find a key with “[q] help me”, only “[Q] help me”.
Turns out that the issue was my column definition. varchar(x) is not case sensative. varchar(x) binary is. The Unique index I had on here was doing its job and comparing values based upon the case in-sensative column – not its fault.

ALTER TABLE Titles CHANGE COLUMN `Title` `Title` VARCHAR(600) BINARY NULL DEFAULT NULL ;

And now I see my column as “`Title` varchar(600) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL“.

PLUG Quiz Night: tonight

The time has come! The Perth Linux User Group is holding a Quiz Night tonight (doors open 7pm, Quiz starts 7:30pm), with questions very much based upon Linux & Open Source. It’s at The Moon & Six Pence in Murray Street. We’ve got a few prizes, and it’s free (through we’ll probably make you feel guilty if you’re membership isn’t 100% up to date ($10/yr)!!). Come share a drink and a laugh.

MySQL Indexes and maximum lengths

MySQL has several index types; the default and therfore probably most common is the BTree. There’s a limit in MySQL (at least as of this writing when the “current” GA is 5.5.9) is 767 bytes – across all the columns being indexed. Of course, varchar columns can now be bigger than 255 chars, so this limit is probably more easily reached these days.

In my case, I had a table with one column of a URL’s path, and a URL’s Query String, both of which can be larger than the old 255 chars. I also have an index that cover these two plus a few other columsn – normalised protocol, normalised domain, and TCP port.

In trying to move to longer columns (1K) I had to modify my index to restrict the number of characters from these varchar columns to ensure I remained under the 767 limit – I couldn’t just change one of these columns from 255 to 1024.

I throught I’d try and simple change first – instead of increasing the oclumn, just put the restriction on the current columns as they stand – so limit the index to 255 chars (while the column still IS 255). Turns out that in at least 5.5.9, since the specified size is the current column size, it ignores this.

Lets make that clearer; you need to specify a SMALLER length for the indexed columns in order for it to stick. Once that’s done, you can then alter table to increae the column lengths.

MySQL CREATE TABLE as doesn’t take triggers with it

MySQL has a nice feature that you can make a new table exectly like an old table (as far as table column structure and indexes go):

create table FOO like BAR;

However, as I just rediscovered, any triggers on the table aren’t taken across with it. D’oh. Which reminds me, mysqldump has a specific -R flag to backup/dump routines; worth having that on too (I did).

Rule for the day: check your triggers on tables before moving/renaming.

Mark Pesce’s Keynote is up

The video that sparked controversy at LCA 2011 in Brisbane this year is now available, accompanied by a warning for some of the graphic elements used within. Sure, that’s fine, some people are sensative to this, but for the majority of the world, look at the message being delivererd and try to see the point that, whatever your private indulgences are, they have the potential to be exploited by unwanted yet naievely and implicitly trusted services that the world takes a free ride on.

I think Mark took a lot of flack for this, but the message of the presentation is good. This coverage will hopefully bring more focus on the content of this keynote.

So, if you’re a balanced and forward thinking mature individual, see the video here; if you’re not, look here.

The furore that this has sparked has oveshadowed much of this, and put enormous pressure on the LCA and LA teams. Numerous apologies for the graphic content have been given. So now its not time to harp on about this issue again, but to look at the message within.

Freedom ain’t free. Freedom costs more than free.

This same issue goes for all “cloud” services that you’re being advertised right now. It’s not magic. It’s handing your data to someone else; it’s making sacrafices for your ability to control your information. Offloading your infrastrucutre to a 3rd party is easy, flexible, but you’re at the whim of a third party. Imagine trying to open a large bookstore application on Amazon EC2? How long until Amazon decide that they don’t want you competing with them?