Reinitialize the PostgreSQL database cluster

Subject matter described problems



When working with a database in PostgreSQL, you must not forget, in what locale (locale) has been initialized a database cluster — the so-called postgre directory (usually /var/lib/pgsql/data), which stores the data of all databases of this PostgreSQL installation.



the

Problem



Today I was faced with this problem. In the query selection when using the function lower() cast of Cyrillic text to lower case did not happen, however, the English value is readily "reduced".

the

First attempt to solve the problem



Google, search the half of the Internet has made information about the fact that it would be nice if the searched database was encoding UTF-8 (in my case, inadvertently, she was in default SQL_ASCII).

OK. It is done! Relatively quickly I found the instructions on how to recreate the database to a new encoding without loss of data.

the
[bash]
# su - postgres
~ vacuumdb --full --analyze --username postgres --dbname mydatabase
~ pg_dump mydatabase -Ft-v-U postgres -f /tmp/mydatabase.tar
~ dropdb mydatabase --username postgres
~ createdb --encoding UNICODE --username postgres mydatabase
~ pg_restore /tmp/mydatabase.tar | psql --dbname mydatabase --username postgres
~ vacuumdb --full --analyze --username postgres --dbname mydatabase


The test showed that the data of the database is not corrupted, but the required actions (conversion to lower case function lower() is still not happening.

the

the solution to the problem becomes interesting



Continue with Google reading it's understanding that the cluster databases that PostgreSQL server has been initialized in "C" locale, and for functions, lower() and upper() it means a lot!

I had to figure out how to re-initialize the database cluster, not killing with the already existing databases and the data in them. Thus, this is a production server — it is for the crown (crontab) times in a hour merge some data dumps. The good is that he is not so much production that would not find a free "window" for re-initialization.

the

"Window" of 60 minutes and half an hour to prepare



Until the end of the day was 1.5 hours and one free "window" lasting 60 minutes.

To start I decided to warm up on the local laptop. It is worth to mention the difference in operating systems: laptop — Ubuntu 8.10-server — 5 CentOS. Preparing three Windows of the terminal and a text editor, began preparatory work.

First, the first method it was necessary to divide into two — the existing dump data and restore them after re-initialization of the cluster.

The database dump had taken place without strong complaints, only a couple of times pg_drop error to users connected to the same database (dare closing pgAdmin').

Then was found (in the case of Ubuntu) is hidden in a rather unusual location (/usr/lib/postgresql/8.3/bin) command (initdb) and executed in the necessary parameters.

the
[bash]
# su - postgres
~ vacuumdb --full --analyze --username postgres --dbname mydatabase
~ pg_dump mydatabase -Ft-v-U postgres -f /tmp/mydatabase.tar
~ dropdb mydatabase --username postgres
~ initdb --locale=ru_RU.utf8 data/



the Damn! Error...

Yeah, turns out that you have to manually delete the contents directory of the database cluster.

Warning! do Not just take a rm-rf data/*. I realized after I did it on the laptop and after restoring, I lost access to the server (stored in pg_hba.conf).


You should make a copy of the file pg_hba.conf somewhere on time for a change.

the
[bash]
~ cp data/pg_hba.conf /home/cr0t/pg_hba.2009.03.24_1654.conf


After removing the contents of a directory and stop the PostgreSQL daemon without error passed re-initialization of the cluster.

the
[bash]
~ exit
# /etc/init.d/postgresql stop
# su - postgres
~ rm-rf data/*
~ initdb --locale=ru_RU.utf8 data/


Just had to run it again the server and restore dumps of the old database in the new cluster is already initialized in the "correct" locale.

the
[bash]
~ exit
# /etc/init.d/postgresql start
# su - postgres
~ createdb --encoding UNICODE --username postgres mydatabase

~ vacuumdb --full --analyze --username postgres --dbname mydatabase



the

Success. Results



After these successful actions, the function lower() correctly "press" in Cyrillic characters. All welcome. But I didn't even think about user roles, PostgreSQL (8.x version were referred to as users). They did not. Well, I was required to create them just a couple. the But who a lot of them, be careful, do not repeat my mistake!

the

PS the Steps for re-initialization if there are multiple databases



In my case it was necessary, sadamichi and subsequently rebuilt 3 bases.
To solve this problem it is necessary to add only additional repetitions of some action for the removal of dump and subsequent recovery (with a strong desire, even automate a script to write ;)).

the
[bash]
# su - postgres
~ vacuumdb --full --analyze --username postgres --dbname mydb1
~ mydb1 pg_dump-Ft-v-U postgres -f /tmp/mydb1.tar
~ mydb1 dropdb --username postgres
~ vacuumdb --full --analyze --username postgres --dbname mydb2
~ mydb2 pg_dump-Ft-v-U postgres -f /tmp/mydb2.tar
~ dropdb --username postgres mydb2
~ vacuumdb --full --analyze --username postgres --dbname mydb3
~ mydb3 pg_dump-Ft-v-U postgres -f /tmp/mydb3.tar
~ mydb3 dropdb --username postgres
~ cp data/pg_hba.conf ./
~ exit
# /etc/init.d/postgresql stop
# su - postgres
~ rm-rf data/*
~ initdb --locale=ru_RU.utf8 data/
~ cp pg_hba.conf data/
~ exit
# /etc/init.d/postgresql start
# su - postgres
~ createdb --encoding UNICODE mydb1 --username postgres
~ pg_restore /tmp/mydb1.tar | psql --dbname postgres --username mydb1
~ vacuumdb --full --analyze --username postgres --dbname mydb1
~ createdb --encoding UNICODE --username postgres mydb2
~ pg_restore /tmp/mydb2.tar | psql dbname --username postgres mydb2
~ vacuumdb --full --analyze --username postgres --dbname mydb2
~ createdb --encoding UNICODE mydb3 --username postgres
~ pg_restore /tmp/mydb3.tar | psql --dbname postgres --username mydb3
~ vacuumdb --full --analyze --username postgres --dbname mydb3


Cross-post from my blog Summer of code

the Post looks like the already posted recently Patcham UTF-8 Collation under FreeBSD, but it seems to me that there is described a solution to the problem specific to FreeBSD, I cite for Ubuntu's. When I solved my problem this information is not even used — just Google reading.

the Thank you for the karma!, was moved to blog PostgreSQL.
Article based on information from habrahabr.ru

Comments

Popular posts from this blog

Powershell and Cyrillic in the console (updated)

Active/Passive PostgreSQL Cluster, using Pacemaker, Corosync

Automatic deployment ElasticBeanstalk using Bitbucket Pipelines