{
    "href": "/post/2005/03/30/database-portability-part-4/",
    "relId": "2005/03/30/database-portability-part-4",
    "title": "Database Portability (Part 4)",
    "author": "pmjones",
    "markup": "html",
    "tags": [
        {
            "href": "/tag/php/",
            "relId": "php",
            "title": "PHP",
            "author": null,
            "created": null,
            "updated": [],
            "markup": "markdown"
        }
    ],
    "created": "2005-03-30 16:16:28 UTC",
    "updated": [
        "2005-03-30 16:16:28 UTC"
    ],
    "html": "<p>This is part 4 in a 3-part series (take that, Douglas Adams ;-).  You can see <a href=\"http://paul-m-jones.com/blog/?p=126\">parts 1 and 2 here</a> and <a href=\"http://paul-m-jones.com/blog/?p=127\">part 3 here</a>.</p>\n<p>A little background:  the reason I am so concerned portability is the <a href=\"http://solarphp.com\">Solar</a> project.  One of the stated goals for Solar_Sql (the RDBMS abstraction layer) is that it should run exactly the same way on every supported database.  Right now, that list includes Frontbase, Microsoft SQL, MySQL, PostgreSQL, and SQLite.</p>\n<h3>Case-Sensitivity in WHERE Clauses</h3>\n<p>Let's say we have a table with a column named \"the_field\" defined as CHAR(2).  In the table, we have every possible combination of two letters, picked from the set of all upper-case and lower case letters (i.e., \"A-Z\" and \"a-z\", for a total of 26^2 rows in the table).  For example, the \"A\" combinations are represented by four rows:  \"aa\", \"aA\", \"Aa\", and \"AA\".</p>\n<p>In theory, if you wanted to select the row with \"aa\" (where both are lower case), you would issue this SELECT statement:</p>\n<p>SELECT * FROM the_table WHERE the_field = 'aa'</p>\n<p>How many rows will you get back?  You should only get one: the \"aa\" row.  That's true if the database is PostgreSQL, Frontbase, or SQLite; however, in Microsoft SQL and MySQL, you get four (\"aa\", \"aA\", \"Aa\", and \"AA\").  That's because searches in those two RDBMSes are *not* case-sensitive on CHAR and VARCHAR fields.  This fails to meet my requirement that Solar_Sql return the same results regardless of the backend.</p>\n<p>I think the solution will be to re-define the native type used for \"char\" and \"varchar\" fields in those two databases.  It turns out that in MySQL, a \"CHAR(n) BINARY\" and \"VARCHAR(n) BINARY\" both retain case-sensitivity in searches.  In Microsoft SQL, the same appears to be true for \"BINARY\" (vice \"CHAR\") and \"VARBINARY\" (vice \"VARCHAR\").</p>\n<p>Does anyone see a problem with using the binary types instead of the plain character types in those two databases, specifically for purposes of case-sensitivity?  My appreciation will extend to buying you a beer next time we're in the same town.  :-)</p>\n"
}
