Postgres error on insert - ERROR: invalid byte sequence for encoding "UTF8": 0x00

Postgresql

Postgresql Problem Overview


I get the following error when inserting data from mysql into postgres.

Do I have to manually remove all null characters from my input data? Is there a way to get postgres to do this for me?

ERROR: invalid byte sequence for encoding "UTF8": 0x00

Postgresql Solutions


Solution 1 - Postgresql

PostgreSQL doesn't support storing NULL (\0x00) characters in text fields (this is obviously different from the database NULL value, which is fully supported).

Source: http://www.postgresql.org/docs/9.1/static/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-UESCAPE

If you need to store the NULL character, you must use a bytea field - which should store anything you want, but won't support text operations on it.

Given that PostgreSQL doesn't support it in text values, there's no good way to get it to remove it. You could import your data into bytea and later convert it to text using a special function (in perl or something, maybe?), but it's likely going to be easier to do that in preprocessing before you load it.

Solution 2 - Postgresql

If you are using Java, you could just replace the x00 characters before the insert like following:

myValue.replaceAll("\u0000", "")

The solution was provided and explained by Csaba in following post:

https://www.postgresql.org/message-id/1171970019.3101.328.camel%40coppola.muc.ecircle.de

Respectively:

> in Java you can actually have a "0x0" character in your string, and > that's valid unicode. So that's translated to the character 0x0 in > UTF8, which in turn is not accepted because the server uses null > terminated strings... so the only way is to make sure your strings > don't contain the character '\u0000'.

Solution 3 - Postgresql

Just regex out null bytes:

s/\x00//g;

Solution 4 - Postgresql

You can first insert data into blob field and then copy to text field with the folloing function

CREATE OR REPLACE FUNCTION blob2text() RETURNS void AS $$
Declare
	ref record;
	i integer;
Begin
	FOR ref IN SELECT id, blob_field FROM table LOOP

          --  find 0x00 and replace with space    
	  i := position(E'\\000'::bytea in ref.blob_field);
	  WHILE i > 0 LOOP
		ref.bob_field := set_byte(ref.blob_field, i-1, 20);
		i := position(E'\\000'::bytea in ref.blobl_field);
	  END LOOP
			
	UPDATE table SET field = encode(ref.blob_field, 'escape') WHERE id = ref.id;
	END LOOP;

End; $$ LANGUAGE plpgsql; 

--

SELECT blob2text();

Solution 5 - Postgresql

Only this regex worked for me:

sed 's/\\0//g'

So as you get your data do this: $ get_data | sed 's/\\0//g' which will output your data without 0x00

Solution 6 - Postgresql

If you need to store null characters in text fields and don't want to change your data type other than text then you can follow my solution too:

Before insert:

myValue = myValue.replaceAll("\u0000", "SomeVerySpecialText")

After select:

myValue = myValue.replaceAll("SomeVerySpecialText","\u0000")

I've used "null" as my SomeVerySpecialText which I am sure that there will be no any "null" string in my values at all.

Solution 7 - Postgresql

This kind of error can also happen when using COPY and having an escaped string containing NULL values(00) such as:

"H\x00\x00\x00tj\xA8\x9E#D\x98+\xCA\xF0\xA7\xBBl\xC5\x19\xD7\x8D\xB6\x18\xEDJ\x1En"

If you use COPY without specifying the format 'CSV' postgres by default will assume format 'text'. This has a different interaction with backlashes, see text format.

If you're using COPY or a file_fdw make sure to specify format 'CSV' to avoid this kind of errors.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionScArcher2View Question on Stackoverflow
Solution 1 - PostgresqlMagnus HaganderView Answer on Stackoverflow
Solution 2 - PostgresqlDavid Dal BuscoView Answer on Stackoverflow
Solution 3 - PostgresqlhichamView Answer on Stackoverflow
Solution 4 - PostgresqlRaidoView Answer on Stackoverflow
Solution 5 - PostgresqltechkuzView Answer on Stackoverflow
Solution 6 - PostgresqlIsmail YavuzView Answer on Stackoverflow
Solution 7 - PostgresqlSteve ChavezView Answer on Stackoverflow