DEV Community πŸ‘©β€πŸ’»πŸ‘¨β€πŸ’»

Rafael Bernard AraΓΊjo
Rafael Bernard AraΓΊjo

Posted on • Originally published at rafael.bernard-araujo.com on

Recover and replace win1252 content to utf-8 for a PostgreSQL database

I shouldn't be the first to need to export data from a PostgreSQL database installed on Windows with a win1252 code to a database with UTF-8 code (in my case, on a Linux server).

It is not enough to transform the transfer file to UTF-8, as the characters of win1252 (left double quote, right double quote, single quote and dash) will be there, with a weird value in your database.

In my experience, I had to import data as-is and afterwards perform an update using the function to rewrite data for UTF-8 characters correctly.

The following code examples are for: 1 - transform to HTML characters; 2 - transform to single characters.

HTML (https://gist.github.com/rafaelbernard/1029869):

-- DROP FUNCTION substitui_win1252_html(texto);

CREATE OR REPLACE FUNCTION substitui_win1252_html(texto text)
    RETURNS text AS
$$
BEGIN

    texto := replace(replace(replace(replace(replace(texto, 'Β’', '’'), 'Β“', '“'), 'Β”', '”'), 'Β•','•'), 'Β–', '–')::text; 
    RETURN texto;

END;
$$
LANGUAGE plpgsql;
Enter fullscreen mode Exit fullscreen mode

Simple (https://gist.github.com/rafaelbernard/1029899):

- DROP FUNCTION substitui_win1252(texto);

CREATE OR REPLACE FUNCTION substitui_win1252(texto text)
    RETURNS text AS
$$
BEGIN

    texto := replace(replace(replace(replace(replace(texto, 'Β’', ''''), 'Β“', '"'), 'Β”', '"'), 'Β•','•'), 'Β–', '-')::text;  
    RETURN texto;

END;
$$
LANGUAGE plpgsql;
Enter fullscreen mode Exit fullscreen mode

Don't worry about the squares that you might see. If you copy it to a good text editor, you will their actual value.

Top comments (0)

We want your help! Become a Tag Moderator.
Check out this survey and help us moderate our community by becoming a tag moderator here at DEV.