Postgresql change column type from int to UUID

Postgresql Problem Overview


I'd like to change the column type from an int to a uuid. I am using the following statement:

ALTER TABLE tableA ALTER COLUMN colA SET DATA TYPE UUID;

But I get the error message:

ERROR:  column "colA" cannot be cast automatically to type uuid
HINT:  Specify a USING expression to perform the conversion.

I am not sure how to use USING to do the cast.

Postgresql Solutions


Solution 1 - Postgresql

You can't just cast an int4 to uuid; it'd be an invalid uuid, with only 32 bits set, the high 96 bits being zero.
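
To make the bit layout concrete, here is a small Python sketch. Python's standard uuid module stands in for PostgreSQL, and the value 42 is only an illustration. It places a small integer into the 128-bit UUID layout and shows that the upper bits stay zero and that no version is encoded.

```python
import uuid

# Put a small integer into the 128-bit UUID layout, as a naive cast would.
naive = uuid.UUID(int=42)

print(naive)            # 00000000-0000-0000-0000-00000000002a
print(naive.int >> 32)  # 0: everything above the low 32 bits is zero
print(naive.version)    # None: the variant bits are not RFC 4122, so no version is reported
```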

If you want to generate new UUIDs to replace the integers entirely, and if there are no existing foreign key references to those integers, you can use a fake cast that actually generates new values.

Do not run this without a backup of your data. It permanently throws away the old values in colA.

ALTER TABLE tableA ALTER COLUMN colA SET DATA TYPE UUID USING (uuid_generate_v4());

A better approach is usually to add a uuid column, then fix up any foreign key references to point to it, and finally drop the original column.
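
The add-a-column approach can be sketched in Python with in-memory rows. This is only a simulation of the logic, not database code: the parents and children lists and their keys are hypothetical stand-ins for a referenced table and a referencing table.

```python
import uuid

# Hypothetical in-memory stand-ins for a parent table and a child table
# whose parent_id column references parent.id.
parents = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
children = [
    {"id": 10, "parent_id": 1},
    {"id": 11, "parent_id": 2},
    {"id": 12, "parent_id": 1},
]

# Step 1: add a uuid column next to the integer key.
for row in parents:
    row["uuid"] = uuid.uuid4()

# Step 2: repoint references, using the old integer key as the lookup.
old_to_new = {row["id"]: row["uuid"] for row in parents}
for row in children:
    row["parent_uuid"] = old_to_new[row["parent_id"]]

# Step 3: drop the original integer columns once nothing depends on them.
for row in parents:
    del row["id"]
for row in children:
    del row["parent_id"]
```

Because the old integer key survives until step 3, every reference can be rewritten before any data is thrown away, which is what makes this path safer than the in-place fake cast.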

uuid_generate_v4() is provided by the uuid-ossp extension, which must be installed first:

CREATE EXTENSION "uuid-ossp";

The quotes are required because the extension name contains a hyphen.

Solution 2 - Postgresql

In case someone comes across this old topic: I solved the problem by first altering the field to a CHAR type and then to the UUID type.

Solution 3 - Postgresql

I had to convert from text to uuid type, and from a Django migration, so after solving this I wrote it up at http://baltaks.com/2015/08/how-to-change-text-fields-to-a-real-uuid-type-for-django-and-postgresql in case that helps anyone. The same techniques would work for an integer to uuid conversion.

Based on a comment, I've added the full solution here:

Django will most likely create a migration for you that looks something like:

from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ('app', '0001_auto'),
    ]

    operations = [
        migrations.AlterField(
            model_name='modelname',
            name='uuid',
            field=models.UUIDField(db_index=True, unique=True),
        ),
    ]

First, put the auto-created migration operations into a RunSQL operation as the state_operations parameter. This lets you provide a custom migration while keeping Django informed about what has happened to the database schema.

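from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ('app', '0001_auto'),
    ]

    operations = [
        # Run the custom SQL, but tell Django's state tracker that the
        # result is equivalent to the auto-generated AlterField.
        migrations.RunSQL(
            sql_commands,
            reverse_sql=None,
            state_operations=[
                migrations.AlterField(
                    model_name='modelname',
                    name='uuid',
                    field=models.UUIDField(db_index=True, unique=True),
                ),
            ],
        ),
    ]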

Now you'll need to provide some SQL commands for that sql_commands variable. I opted to put the SQL into a separate file and then load it with the following Python code:

import os

# Read the SQL file that sits next to this migration module.
sql_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), '0001.sql')
with open(sql_path, "r") as sqlfile:
    sql_commands = sqlfile.read()

Now for the real tricky part, where we actually perform the migration. The basic command you want looks like:

alter table tablename alter column uuid type uuid using uuid::uuid;

But the reason we are here is because of indexes. As I discovered, Django likes to use your migrations to create randomly named indexes on your fields while running tests, so your tests will fail if you just delete and then recreate a fixed-name index or two. The following SQL deletes one constraint and all indexes on the text field before converting it to a uuid field. It also works for multiple tables in one go.

DO $$
DECLARE
    table_names text[];
    this_table_name text;
    the_constraint_name text;
    index_names record;

BEGIN

SELECT array['table1',
             'table2'
             ]
    INTO table_names;


FOREACH this_table_name IN array table_names
LOOP
    RAISE notice 'migrating table %', this_table_name;

    SELECT CONSTRAINT_NAME INTO the_constraint_name
    FROM information_schema.constraint_column_usage
    WHERE CONSTRAINT_SCHEMA = current_schema()
        AND COLUMN_NAME IN ('uuid')
        AND TABLE_NAME = this_table_name
    GROUP BY CONSTRAINT_NAME
    HAVING count(*) = 1;
    if the_constraint_name is not NULL then
        RAISE notice 'alter table % drop constraint %',
            this_table_name,
            the_constraint_name;
        execute 'alter table ' || quote_ident(this_table_name)
            || ' drop constraint ' || quote_ident(the_constraint_name);
    end if;

    FOR index_names IN
    (SELECT i.relname AS index_name
     FROM pg_class t,
          pg_class i,
          pg_index ix,
          pg_attribute a
     WHERE t.oid = ix.indrelid
         AND i.oid = ix.indexrelid
         AND a.attrelid = t.oid
         AND a.attnum = any(ix.indkey)
         AND t.relkind = 'r'
         AND a.attname = 'uuid'
         AND t.relname = this_table_name
     ORDER BY t.relname,
              i.relname)
    LOOP
        RAISE notice 'drop index %', quote_ident(index_names.index_name);
        EXECUTE 'drop index ' || quote_ident(index_names.index_name);
    END LOOP; -- index_names

    RAISE notice 'alter table % alter column uuid type uuid using uuid::uuid;',
        this_table_name;
    execute 'alter table ' || quote_ident(this_table_name)
        || ' alter column uuid type uuid using uuid::uuid;';
    RAISE notice 'CREATE UNIQUE INDEX %_uuid ON % (uuid);',
        this_table_name, this_table_name;
    execute 'create unique index ' || quote_ident(this_table_name || '_uuid') || ' on '
        || quote_ident(this_table_name) || ' (uuid);';

END LOOP; -- table_names

END;
$$;
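
As a rough illustration of the two statements the loop finally executes for each table, here is a Python sketch that only builds the SQL text. The helper names quote_ident and conversion_statements are hypothetical, and this quote_ident always adds double quotes, which is a simplification of PostgreSQL's function of the same name (that one quotes only when needed).

```python
from typing import List


def quote_ident(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def conversion_statements(table: str, column: str = "uuid") -> List[str]:
    """Build the ALTER COLUMN and CREATE UNIQUE INDEX statements for one table."""
    t, c = quote_ident(table), quote_ident(column)
    index = quote_ident(f"{table}_{column}")
    return [
        f"ALTER TABLE {t} ALTER COLUMN {c} TYPE uuid USING {c}::uuid;",
        f"CREATE UNIQUE INDEX {index} ON {t} ({c});",
    ]


for statement in conversion_statements("table1"):
    print(statement)
```

Quoting every identifier that is spliced into dynamic SQL keeps table or index names with unusual characters from breaking the statement.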

Solution 4 - Postgresql

I was able to convert a column with an INT type, configured as an incrementing primary key using the SERIAL shorthand, using the following process:

--  Ensure the UUID extension is installed.
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

--  Dropping and recreating the default column value is required because
--  the default INT value is not compatible with the new column type.
ALTER TABLE table_to_alter ALTER COLUMN table_id DROP DEFAULT, 
ALTER COLUMN table_id SET DATA TYPE UUID USING (uuid_generate_v4()), 
ALTER COLUMN table_id SET DEFAULT uuid_generate_v4();

Solution 5 - Postgresql

I'm coming to this after a long time, but there is a way to convert your integer column to a UUID with some backwards compatibility: you keep a reference to your old values rather than dropping them. It consists of converting your integer value to a hex string and then padding it with the necessary zeroes to make up an artificial UUID.

So, assuming your current integer column is named ColA, the following statement would do it (mind the using part):

ALTER TABLE tableA ALTER COLUMN ColA SET DATA TYPE UUID USING LPAD(TO_HEX(ColA), 32, '0')::UUID;
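
The same padding idea can be sketched in Python to see what the resulting values look like. The function name int_to_padded_uuid is hypothetical, and the sketch only covers non-negative integers.

```python
import uuid


def int_to_padded_uuid(value: int) -> uuid.UUID:
    """Mirror LPAD(TO_HEX(value), 32, '0')::uuid for non-negative integers."""
    if value < 0:
        raise ValueError("negative values are outside this sketch")
    return uuid.UUID(format(value, "x").rjust(32, "0"))


converted = int_to_padded_uuid(255)
print(converted)      # 00000000-0000-0000-0000-0000000000ff
print(converted.int)  # 255: the original value is still recoverable
```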

Solution 6 - Postgresql

In PostgreSQL 9.3 you can do this:

ALTER TABLE "tableA" ALTER COLUMN "ColA" SET DATA TYPE UUID USING "ColA"::UUID;

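This casts the existing data to UUID and avoids the error message, but only when the column holds text that is already formatted as a UUID. PostgreSQL defines no cast from integer to uuid, so for an int column this statement fails with a different error.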
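
A plain cast like this can only succeed when each value is already text in UUID form. As a loose analogy in Python (its uuid.UUID parser is not PostgreSQL's, so treat this as illustrative only, and the sample values are made up):

```python
import uuid

# Text that is already in UUID form parses cleanly.
ok = uuid.UUID("a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11")

# A decimal integer rendered as text is not a UUID literal.
try:
    uuid.UUID("12345")
    parsed = True
except ValueError:
    parsed = False

print(ok)      # a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11
print(parsed)  # False
```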

Solution 7 - Postgresql

WARNING: I've noticed some comments and answers that try to cast integers to a UUID4.

You must not cast or force-set uuid values. They must be generated using functions that follow RFC 4122.

UUIDs must be randomly distributed or they will not work. You cannot cast or enter your own UUIDs, as they will not be properly distributed. This can lead to bad actors guessing your sequencing or finding other artifacts or patterns in your UUIDs that lead them to discover others.

Any answer that converts to char types and then to uuid may lead to these kinds of problems.

Follow any answer here that refers to 'uuid_generate_v4'. Ignore ones that are casting or setting without using the formal functions.
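
The difference between a generated UUID and one derived from an integer can be inspected with a short Python sketch. Python's uuid module is used here as a stand-in for uuid_generate_v4, and the integers 1 to 3 are arbitrary.

```python
import uuid

# A generated value carries the version 4 and RFC 4122 variant markers.
random_id = uuid.uuid4()
print(random_id.version)                   # 4
print(random_id.variant == uuid.RFC_4122)  # True

# Identifiers derived from sequential integers are trivially enumerable.
derived = [uuid.UUID(int=n) for n in (1, 2, 3)]
print([d.int for d in derived])            # [1, 2, 3]
print(derived[0].version)                  # None
```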

Solution 8 - Postgresql

When changing a column type from int to uuid, I want to keep the following properties:

  • the uuid must be globally unique
  • the id in the migrated column must be generated deterministically, so that if any data is hard-coded in migrations, I can still reference it after the identifier has been migrated.

Migration path

1) Drop foreign key

...drop all references to Table1

ALTER TABLE "Table2" DROP CONSTRAINT "Foreign_Table2_IdTable1";

2) migrate int-ids to uuid deterministically

ALTER TABLE "Table1" ALTER COLUMN "Id" SET DATA TYPE UUID USING DETERMINISTIC_TO_UUID("Id");
ALTER TABLE "Table2" ALTER COLUMN "IdTable1" SET DATA TYPE UUID USING DETERMINISTIC_TO_UUID("IdTable1");

Note: DETERMINISTIC_TO_UUID() needs to be defined, see below!

3) add foreign keys back

ALTER TABLE "Table2" ADD CONSTRAINT "Foreign_Table2_IdTable1" FOREIGN KEY ("IdTable1") REFERENCES "Table1" ("Id") ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE;
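
The reason the foreign key can be re-added afterwards is that the same deterministic function is applied to both columns. A minimal Python sketch of that property, where deterministic_to_uuid is a stand-in using zero-padded hex and the id lists are made up:

```python
import uuid


def deterministic_to_uuid(value: int) -> uuid.UUID:
    """Stand-in for DETERMINISTIC_TO_UUID: zero-pad the hex form to 32 digits."""
    return uuid.UUID(format(value, "032x"))


table1_ids = [1, 2, 3]  # hypothetical "Table1"."Id" values
table2_fk = [3, 1, 1]   # hypothetical "Table2"."IdTable1" values

new_pk = [deterministic_to_uuid(i) for i in table1_ids]
new_fk = [deterministic_to_uuid(i) for i in table2_fk]

# Every migrated foreign key still matches a migrated primary key.
print(all(fk in set(new_pk) for fk in new_fk))  # True
```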

How to deterministically convert integers to UUIDs?

Simplest way

A function that generates UUIDs which are not valid under RFC 4122 (they carry no version or variant bits) but are good enough for most cases:

CREATE OR REPLACE FUNCTION DETERMINISTIC_TO_UUID(inputData bigint) RETURNS uuid AS $$
BEGIN
    RETURN LPAD(TO_HEX(inputData), 32, '0')::uuid;
END;
$$ LANGUAGE plpgsql;

uuid4 variant1 compatible version

I have implemented the following two functions:

  • bitstringIdFrom(tableName varchar, id bigint): bit(128) – generates a 128-bit string for a given table name and numeric ID
  • makeUuid4variant1From(bit(128)): uuid – generates a valid uuid4 from a 128-bit string

CREATE OR REPLACE FUNCTION makeUuid4variant1From(inputData bit(128)) RETURNS uuid AS $$
DECLARE
    uuid4variant1_mask CONSTANT bit(128) := ~((B'1111'::bit(128) >> 48) | (B'11'::bit(128) >> 64));
    uuid4variant1_versionData CONSTANT bit(128) := (4::bit(4)::bit(128) >> 48) | (2::bit(2)::bit(128) >> 64);
    uuidInBitString bit(128);
    low_bits bit(64);
    hi_bits bit(64);
BEGIN
    -- This produces a valid uuid4 variant 1:
    -- the mask clears the bits reserved for version and variant data,
    -- and the OR operation sets the required version and variant data.
    uuidInBitString := ("inputdata" & uuid4variant1_mask) | uuid4variant1_versionData;

    -- As PostgreSQL does NOT support working with 128-bit integers, we need to split the value in half.
    -- Working with bit strings: https://www.postgresql.org/docs/13/functions-bitstring.html, https://www.postgresql.org/docs/9.5/functions-bitstring.html
    low_bits := (uuidInBitString << 64)::bit(64);
    hi_bits  := (uuidInBitString << 0) ::bit(64);

    RETURN (
            LPAD(TO_HEX(hi_bits::bigint), 16, '0') || LPAD(TO_HEX(low_bits::bigint), 16, '0')
        )::uuid;
END;
$$ LANGUAGE plpgsql;


-- creates bit-string from given table name & int-ID
-- This uses deterministic hash function
CREATE OR REPLACE FUNCTION bitstringIdFrom(tableName varchar, id bigint) RETURNS bit(128) AS $$
    DECLARE uuidAsBitString bit(128) := 0::bit(128);
    DECLARE salt text := 'some-secret-text-to-improve-unguessability';
    DECLARE part1 bit(32);
    DECLARE part2 bit(32);
    DECLARE part3 bit(32);
    DECLARE part4 bit(32);
BEGIN
    -- in case someone wants to guess the migrated uuids,
    -- they would need to know this secret and the hashing algorithm
    id := id + hashtext(salt);

    -- source: https://hakibenita.com/postgresql-hash-index#hash-function
    part1 := (hashtext(tablename) * id)::bit(32);
    part2 := (hashtext(part1::text) * id)::bit(32);
    part3 := (hashtext(part2::text) * id)::bit(32);
    part4 := (hashtext(part3::text) * id)::bit(32);

    uuidAsBitString := part1::bit(128) | uuidAsBitString;
    uuidAsBitString := uuidAsBitString >> 32;
    uuidAsBitString := part2::bit(128) | uuidAsBitString;
    uuidAsBitString := uuidAsBitString >> 32;

    uuidAsBitString := part3::bit(128) | uuidAsBitString;
    uuidAsBitString := uuidAsBitString >> 32;

    uuidAsBitString := part4::bit(128) | uuidAsBitString;

    RETURN uuidAsBitString;
END;
$$ LANGUAGE plpgsql;
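
For readers who want to check the bit manipulation outside the database, here is a Python sketch of the same two steps. It is not a port of the functions above: SHA-256 is swapped in for hashtext, so the resulting values differ from what the SQL produces, and the names bits_from, make_uuid4_variant1 and the salt string are hypothetical.

```python
import hashlib
import uuid

# Bit positions counted from the least significant bit of the 128-bit value.
VERSION_VARIANT_MASK = (0xF << 76) | (0x3 << 62)  # bits that carry version and variant
VERSION_VARIANT_BITS = (0x4 << 76) | (0x2 << 62)  # version 4, variant bits 10


def bits_from(table_name: str, row_id: int, salt: str) -> int:
    """Derive 128 deterministic bits from a table name, an integer id and a salt."""
    digest = hashlib.sha256(f"{salt}:{table_name}:{row_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:16], "big")


def make_uuid4_variant1(bits: int) -> uuid.UUID:
    """Overwrite the version and variant bits so the value reads as a version 4 UUID."""
    return uuid.UUID(int=(bits & ~VERSION_VARIANT_MASK) | VERSION_VARIANT_BITS)


a = make_uuid4_variant1(bits_from("Table1", 1, "hypothetical-salt"))
b = make_uuid4_variant1(bits_from("Table1", 1, "hypothetical-salt"))
c = make_uuid4_variant1(bits_from("Table2", 1, "hypothetical-salt"))

print(a == b)                      # True: same inputs give the same UUID
print(a.version)                   # 4
print(a.variant == uuid.RFC_4122)  # True
```

Including the table name in the hashed input means the same integer id in two different tables maps to different UUIDs, which is what keeps the migrated values unique across tables.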

How to achieve real uuid4 randomness?

Because the new foreign key was created with ON UPDATE CASCADE, you can achieve that simply by running:

UPDATE "Table1" SET "Id" = uuid_generate_v4();

... which will also automatically update all foreign keys that reference it.

Note: This can take a really long time, as machine randomness can be depleted quite quickly. Therefore I recommend using a good salt/secret and the deterministic migration path.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | user1802143 | View Question on Stackoverflow
Solution 1 - Postgresql | Craig Ringer | View Answer on Stackoverflow
Solution 2 - Postgresql | pritstift | View Answer on Stackoverflow
Solution 3 - Postgresql | Michael Baltaks | View Answer on Stackoverflow
Solution 4 - Postgresql | ajxs | View Answer on Stackoverflow
Solution 5 - Postgresql | JChrist | View Answer on Stackoverflow
Solution 6 - Postgresql | Michi Salazar | View Answer on Stackoverflow
Solution 7 - Postgresql | bbold | View Answer on Stackoverflow
Solution 8 - Postgresql | Honza Kuchař | View Answer on Stackoverflow