I’ve just made some changes that have massively improved the database performance of Carsurvey.org.
Back in 2000, when I added comments support to Carsurvey.org, I made the very silly decision to set the foreign key in the comments table to be a varchar (I thought I might want some flexibility above and beyond integer ids), and this varchar foreign key referenced an existing integer id field in the reviews table.
All seemed well, but for the last week I’ve been puzzling over why some of my join queries between comments and reviews were quite slow, and weren’t using the indices that I thought they should. After several days, I decided to change the comments foreign key datatype to integer, so it would match the reviews id datatype. Suddenly indices started being used in the way I was expecting, and I got a big performance increase.
So there are two lessons to be learnt here:
1. Unless you have a very very good reason, make your foreign key datatype match the datatype of the unique key in the reference table. It’s common sense for most people, but apparently not me circa 2000.
2. MySQL (version 4.1.20 anyway) doesn’t appear to optimise joins well where it has to match columns with different datatypes. This is presumably because of how it does type conversions.
Hopefully my confession will help anyone else who’s made a similar mistake, and is trying to work out why their query isn’t using the index they expect.