The opposite day, whereas recording a Working Code podcast episode, I discussed to Adam {that a} massive blind-spot in my database psychological mannequin was storing non-AUTO_INCREMENT
main keys. Or, extra particularly, utilizing one thing like a UUID (Universally Distinctive Identifier), GUID, or CUID as a desk’s main key. As such, I needed to start out build up some foundational data. And, based mostly on what I have been studying, evidently with the ability to convert a UUID string to and from a binary worth is a vital level of know-how. This publish seems to be at performing this String-to-Binary conversion in ColdFusion.
To be clear, I’m not a database skilled! Sure, I really like writing SQL. And sure, I really like pondering deeply about database index design. However, I am not a type of individuals who is aware of a lot about low-level storage particulars, engine ramifications, knowledge replication, or any of the various complicated matters that go into database administration. Think about this publish a note-to-self greater than something.
To start out studying about storing Strings as main keys, I did some studying:
From what I’ve seen in these articles – which is echoed in lots of StackOverflow posts – is that utilizing Strings as main keys is a trade-off: in return for having system-independent uniqueness, you incur bigger indexes, bigger working reminiscence, attainable efficiency hits, much less intuitive values (pro-or-con relying on the way you see it), and extra complicated workflows.
This publish would not deal with all of these points – I am right here to noodle on simply one among them: bigger indexes. A part of the index-size situation comes from how the worth is saved. If a UUID is a 35-character String, storing stated UUID as a String requires 35-bytes (1 byte per character).
And, that is only for the column worth itself. When you think about that the first secret’s implicitly saved because the suffix on a secondary index, the storage necessities of a “UUID as String” is multiplied by the variety of indexes on the desk. To not point out that another desk utilizing stated UUID as a overseas key may even want 35-bytes.
A standard suggestion for lowering storage dimension is to persist the worth as a VARBINARY(16)
as an alternative of a VARCHAR(35)
. This system is predicated on the truth that a UUID is already a HEX-encoded worth. As such, changing a UUID right into a Byte Array requires little greater than a binaryDecode()
name.
Changing a binary worth again right into a UUID is a bit more work since we have now to re-insert the dashes (-
) after we generate the String. This is two Person Outlined Capabilities (UDFs) that I created for managing this conversion in ColdFusion:
<cfscript>
/**
* I convert the given UUID string to a byte array (binary worth) to be used in a MySQL
* VARBINARY(16) database discipline.
*/
public binary operate uuidToBinary( required string enter ) {
// The UUID string is already a hex-encoded illustration of information. As such, to
// convert it to binary, all we have now to do is strip-out the dashes and decode it.
return(
binaryDecode(
enter.exchange( "-", "", "all" ),
"hex"
)
);
}
/**
* I convert the given MySQL VARBINARY(16) byte array (binary worth) to a ColdFusion
* UUID string.
*/
public string operate binaryToUuid( required binary enter ) {
var asHex = binaryEncode( enter, "hex" );
// ColdFusion UUIDs use the format: xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxx.
return(
asHex.left( 8 ) & "-" &
asHex.mid( 9, 4 ) & "-" &
asHex.mid( 13, 4 ) & "-" &
asHex.proper( 16 )
);
}
</cfscript>
To strive these capabilities out, I created a easy MySQL database desk that makes use of a VARBINARY
primary-key and a worth
column that shops the UUID in plain-text in order that we are able to affirm values:
CREATE TABLE `uuid_test` (
`uid` varbinary(16) NOT NULL,
`worth` varchar(255) NOT NULL,
PRIMARY KEY (`uid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Then, I generated 10,000 rows on this take a look at desk. Be aware that in my INSERT
, I am utilizing a CFQueryParam
of kind binary
for the primary-key column.
<cfscript>
// Embody are `uuidToBinary()` and `binaryToUuid()` UDFs.
embody "./capabilities.cfm";
loop instances = 10000 {
uid = createUuid();
```
<cfquery>
INSERT INTO
uuid_test
SET
uid = <cfqueryparam worth="#uuidToBinary( uid )#" sqltype="binary" />,
worth = <cfqueryparam worth="#uid#" sqltype="varchar" />
;
</cfquery>
```
}
</cfscript>
To then take a look at the SELECT
ing of rows, I regarded within the database desk, grabbed a UUID from about half-way via the desk, and used it to find the row. Discover that I am utilizing the uuidToBinary()
to carry out the look-up; after which, I am utilizing the binaryToUuid()
to devour the important thing in my ColdFusion code:
<cfscript>
// Embody are `uuidToBinary()` and `binaryToUuid()` UDFs.
embody "./capabilities.cfm";
// A UUID randoly picked from half-way via the data.
uid = "6D9F382A-5164-48EF-8DDEA942D5EAE8E3";
```
<cfquery title="outcomes">
SELECT
t.uid,
t.worth
FROM
uuid_test t
WHERE
t.uid = <cfqueryparam worth="#uuidToBinary( uid )#" sqltype="binary" />
;
</cfquery>
```
dump( outcomes.uid );
dump( outcomes.worth );
// Use our customized capabilities to transform the VARBINARY again to a String for consumption
// inside the ColdFusion utility.
dump( binaryToUuid( outcomes.uid ) );
</cfscript>
Once we run this ColdFusion (Lucee CFML) code, we get the next output:

As you possibly can see, we have been capable of find the row within the database utilizing the VARBINARY
worth that we generated with uuidToBinary()
. Then, we have been capable of convert the binary worth again right into a ColdFusion-formatted UUID utilizing the binaryToUuid()
operate.
And, if we run an EXPLAIN
on this question, we are able to see that it makes use of the implicit primary-key index to lookup a single-row with none desk scanning!
NOTE: I am utilizing MySQL’s
UNHEX()
methodology on this case since I am working thisEXPLAIN
proper within the database interface:

As you possibly can see, this SQL question is utilizing the implicit main key (PKEY) index. And, is ready to find the one row utilizing the index with none desk scanning.
MySQL 5.7 vs MySQL 8
Whereas MySQL 5.7 ships with a operate to generate UUIDS (time-based, model 1), it leaves the string-to-binary conversions as much as the applying server. As of MySQL 8, nonetheless, the database engine now consists of extra capabilities to carry out these conversions within the SQL context:
UUID_TO_BIN()
BIN_TO_UUID()
For the time-being, I will be sticking with Integer-based AUTO_INCREMENT
columns for my main keys. However, not less than I really feel like I am lastly beginning to construct up my psychological mannequin for what a String-based primary-key would possibly appear to be. The UUIDs that ColdFusion generates (random, model 4) incur extra issues with regard to index construction and storage; however, that is an entire different subject (of which I do know little or no).
Need to use code from this publish?
Take a look at the license.