11

I'm trying to set up CartoDB on a Vagrant box, following the instructions here. However, it keeps failing because it complains that Postgres has been installed with Latin-1 encoding.

I can't work out why Postgres is doing this, because I'm explicitly forcing all the locale settings to UTF-8. Here's what I've been doing:

export LANGUAGE="en_US.UTF-8"
export LANG="en_US.UTF-8"
export LC_ALL="en_US.UTF-8"
locale
sudo apt-get update
sudo apt-get install -y python-software-properties
sudo add-apt-repository -y ppa:cartodb/gis
sudo add-apt-repository -y ppa:mapnik/v2.1.0
sudo add-apt-repository -y ppa:cartodb/nodejs
sudo add-apt-repository -y ppa:cartodb/redis
sudo add-apt-repository -y ppa:cartodb/postgresql
sudo add-apt-repository -y ppa:ubuntugis/ubuntugis-unstable
sudo apt-get update
sudo apt-get install -y make unp zip libgeos-c1 libgeos-dev gdal-bin libgdal1-dev libjson0
sudo apt-get install python-simplejson libjson0-dev proj-bin proj-data libproj-dev postgresql-9.1

Here is the output of the earlier locale command, showing that UTF-8 has been set successfully:

LANG=en_US.UTF-8
LANGUAGE=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8

After running all the above commands, when I check the status of Postgres, it seems Postgres nonetheless installed itself with Latin-1 encoding:

sudo -u postgres psql -l

                         List of databases
   Name    |  Owner   | Encoding | Collate | Ctype |   Access privileges   
-----------+----------+----------+---------+-------+-----------------------
 postgres  | postgres | LATIN1   | en_US   | en_US | 
 template0 | postgres | LATIN1   | en_US   | en_US | =c/postgres          +
           |          |          |         |       | postgres=CTc/postgres
 template1 | postgres | LATIN1   | en_US   | en_US | =c/postgres          +
           |          |          |         |       | postgres=CTc/postgres

Why is this happening? How can I force Postgres to install itself with UTF8 encoding?

1
  • Likewise, in my Vagrantfile I have sudo update-locale LANG=en_NZ.UTF-8 LC_ALL=en_NZ.UTF-8, which sets LANG and LC_* to 'en_NZ.UTF-8', but when I install the postgresql package afterwards the databases still show as LATIN1/en_US. (Commented Feb 13, 2014 at 22:07)

2 Answers

16

This might not be the answer you are looking for, but here are commands which you can use to switch PostgreSQL to a different locale (backup, re-create cluster and restore):

sudo -u postgres pg_dumpall > /tmp/postgres.sql
sudo pg_dropcluster --stop 9.1 main
sudo pg_createcluster --locale en_US.UTF-8 --start 9.1 main
sudo -u postgres psql -f /tmp/postgres.sql
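To confirm that the re-created cluster actually picked up UTF-8, one option is to list each database's encoding and flag anything that is not UTF8. The sketch below is only an illustration: the function names are made up for this example, and it assumes psql's default field separator (|) in unaligned mode.

```shell
#!/bin/sh
# Hypothetical helpers, not part of PostgreSQL or CartoDB.

# Print "name|encoding" for every database (needs a running cluster).
list_db_encodings() {
    sudo -u postgres psql -At \
        -c "SELECT datname, pg_encoding_to_char(encoding) FROM pg_database"
}

# Read "name|encoding" lines on stdin and print the names of
# databases whose encoding is anything other than UTF8.
non_utf8_databases() {
    awk -F'|' '$2 != "UTF8" { print $1 }'
}
```

Running list_db_encodings | non_utf8_databases should print nothing once every database, including template0 and template1, is UTF8.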

If you want to know why the installation uses Latin-1, you may need to dig into the installation scripts. If en_US.UTF-8 is not your default system locale, that might be the problem: the installation script may be reading /etc/default/locale rather than the variables exported in your shell.
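A quick way to test that theory is to look at what /etc/default/locale declares before installing the package. The following is a minimal sketch; the function names are hypothetical, and it assumes the file uses plain LANG=value or LANG="value" lines.

```shell
#!/bin/sh
# Hypothetical diagnostic helpers for checking the system default locale.

# Succeed if locale file $1 sets LANG to exactly $2 (quoted or unquoted).
locale_file_has_lang() {
    grep -qxF -e "LANG=$2" -e "LANG=\"$2\"" "$1" 2>/dev/null
}

# Report whether the file (default: /etc/default/locale) sets the
# wanted LANG value (default: en_US.UTF-8).
report_default_locale() {
    file="${1:-/etc/default/locale}"
    want="${2:-en_US.UTF-8}"
    if locale_file_has_lang "$file" "$want"; then
        echo "ok: $file sets LANG=$want"
    else
        echo "missing: $file does not set LANG=$want"
    fi
}
```

Calling report_default_locale with no arguments checks the real file; a "missing" result would be consistent with the cluster being created in a non-UTF-8 locale despite the exported variables.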


5 Comments

This worked brilliantly, thank you so much. I just had to run the cluster commands with sudo -u postgres in front of them too.
Great answer. It would be nice to know how to avoid this altogether and have it set on installation.
I had the same problem where the default pg db wasn't following LANG/LANGUAGE/LC_* env variables. It seems like the installation script is following /etc/default/locale because when I added this line LANG="en_US.UTF-8" there it worked.
I would prefer changing locale via dpkg-reconfigure locales (wiki.debian.org/Locale#Standard) rather than modifying the file manually.
Honestly, thank you. I was stuck on this for hours and unable to migrate to a different server. This solved my problem.
1

I just had the same problem on an Ubuntu machine.

A freshly installed postgresql created the template databases with encoding=SQL_ASCII and collate/ctype=C.

One of the answers here led me to the solution:

  • Removed/purged the postgresql that had just been installed
  • Ran dpkg-reconfigure locales
    • Installed the en_US.UTF-8 locale
    • Logged out
    • Logged in
    • Checked that the environment variable LANG=en_US.UTF-8 was set automatically
  • Reinstalled postgresql

After that, my databases have encoding=UTF8 and collate/ctype=en_US.UTF-8.
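For a Vagrant provisioner, where an interactive dpkg-reconfigure and a logout/login are awkward, the same sequence could be approximated non-interactively. This is a sketch under assumptions, not a tested recipe: it swaps dpkg-reconfigure locales for locale-gen and update-locale, the function names and the DRY_RUN switch are invented for this example, and whether purging the package also removes the old cluster may depend on the packaging (if it survives, pg_dropcluster is still needed).

```shell
#!/bin/sh
# Hypothetical provisioning helpers; postgresql-9.1 matches the
# package installed in the question.

PG_PKG="${PG_PKG:-postgresql-9.1}"
TARGET_LOCALE="${TARGET_LOCALE:-en_US.UTF-8}"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Purge, set the system default locale, then reinstall.
reinstall_with_utf8() {
    run sudo apt-get purge -y "$PG_PKG"
    run sudo locale-gen "$TARGET_LOCALE"
    run sudo update-locale LANG="$TARGET_LOCALE"
    run sudo apt-get install -y "$PG_PKG"
}
```

Setting DRY_RUN=1 before calling reinstall_with_utf8 prints the four commands without executing them, which makes it easy to review the plan first.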
