Question

Help with PostgreSQL replication

Posted July 21, 2014

The master and slave servers are set up as per the tutorial https://www.digitalocean.com/community/tutorials/how-to-set-up-master-slave-replication-on-postgresql-on-an-ubuntu-12-04-vps. Both servers run Ubuntu 12.04 x64 and PostgreSQL 9.1.

Below are the commands I am running and the errors I am getting. Any help will be appreciated.

MASTER SERVER

$ sudo su - postgres
$ psql -c "select pg_start_backup('initial_backup');"
$ rsync -cva --inplace --exclude=*pg_xlog* /var/lib/postgresql/9.1/main/ MAS.TER.SER.VER:/var/lib/postgresql/9.1/main/
$ psql -c "select pg_stop_backup();"
NOTICE:  pg_stop_backup cleanup done, waiting for required WAL segments to be archived
WARNING:  pg_stop_backup still waiting for all required WAL segments to be archived (60 seconds elapsed)
HINT:  Check that your archive_command is executing properly.  pg_stop_backup can be canceled safely, but the database backup will not be usable without all the WAL segments.
WARNING:  pg_stop_backup still waiting for all required WAL segments to be archived (120 seconds elapsed)
HINT:  Check that your archive_command is executing properly.  pg_stop_backup can be canceled safely, but the database backup will not be usable without all the WAL segments.
^CCancel request sent
ERROR:  canceling statement due to user request

SLAVE SERVER

After running the above on the master I then set up the recovery.conf as per the tutorial and attempt to start the slave server.

$ sudo su - postgres
$ nano /etc/postgresql/9.1/main/recovery.conf

Edit as per tutorial

$ service postgresql start
 * Starting PostgreSQL 9.1 database server
 * Removed stale pid file.
The PostgreSQL server failed to start. Please check the log output:
2014-07-21 22:31:12 BST LOG:  database system was interrupted; last known up at 2014-07-21 22:21:43 BST
2014-07-21 22:31:12 BST LOG:  could not open file "pg_xlog/000000010000000000000018" (log file 0, segment 24): No such file or directory
2014-07-21 22:31:12 BST LOG:  invalid checkpoint record
2014-07-21 22:31:12 BST FATAL:  could not locate required checkpoint record
2014-07-21 22:31:12 BST HINT:  If you are not restoring from a backup, try removing the file "/var/lib/postgresql/9.1/main/backup_label".
2014-07-21 22:31:12 BST LOG:  startup process (PID 8274) exited with exit code 1
2014-07-21 22:31:12 BST LOG:  aborting startup due to startup process failure
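
The startup log above names the WAL segment it could not open, and the rsync command excludes pg_xlog, so that step would not have copied the segment. A minimal sketch, assuming a POSIX shell on the slave, for confirming whether the named file is actually present. The helper name `wal_segment_present` is hypothetical and only reads the directory:

```shell
#!/bin/sh
# Report whether a WAL segment file is present in a pg_xlog directory.
# Both arguments are supplied by the caller; nothing is modified.
wal_segment_present() {
    xlog_dir="$1"
    segment="$2"
    if [ -f "$xlog_dir/$segment" ]; then
        echo "present: $segment"
    else
        echo "missing: $segment"
    fi
}

# Example, using the data directory and segment name from the log above:
# wal_segment_present /var/lib/postgresql/9.1/main/pg_xlog 000000010000000000000018
```

This only confirms what the log already reports; it does not say where the segment should come from.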

1 answer

In the command:

rsync -cva --inplace --exclude=*pg_xlog* /var/lib/postgresql/9.1/main/ MAS.TER.SER.VER:/var/lib/postgresql/9.1/main/

I assume MAS.TER.SER.VER is the IP address of the master postgres server? If so, that should actually be the IP of the slave server.

  • Thanks for the reply. Sorry, the IP should be the slave's; that was an error I made when masking it for the post. To confirm, I am executing the command on the master server and it uses the SLAVE IP address. Here is the latest update:

    After looking through the logs again on the master server, it seems that archive_command was set but commented out. I have changed this and restarted.

    Here is the output from the MASTER server:

    SL.AVE.SER.VER is the IP address of the slave server. The commands are executed as the postgres user.

    $ psql -c "select pg_start_backup('initial_backup');"
    pg_start_backup
    0/1D000020
    (1 row)
    
    $ rsync -cva --inplace --exclude=pg_xlog /var/lib/postgresql/9.1/main/ SL.AVE.SER.VER:/var/lib/postgresql/9.1/main/
    sending incremental file list
    ./
    backup_label
    postmaster.opts
    postmaster.pid
    rsync: failed to set times on "/var/lib/postgresql/9.1/main/server.crt": Operation not permitted (1)
    rsync: failed to set times on "/var/lib/postgresql/9.1/main/server.key": Operation not permitted (1)
    base/11951/
    base/11951/pg_internal.init
    base/25911/
    base/26683/
    base/26683/pg_internal.init
    global/
    global/pg_control
    global/pg_internal.init
    pg_notify/
    pg_stat_tmp/
    pg_stat_tmp/pgstat.stat
    
    sent 293815 bytes received 836 bytes 45330.92 bytes/sec
    total size is 38129968 speedup is 129.41
    rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1070) [sender=3.0.9]
    $ psql -c "select pg_stop_backup();"
    NOTICE: pg_stop_backup complete, all required WAL segments have been archived
    
    pg_stop_backup
    0/1D0000D8
    (1 row)
    

    Previously I got

    WARNING: pg_stop_backup still waiting for all required WAL segments to be archived (60 seconds elapsed)
    

    So this seems to be fixed. However, on the SLAVE server this is the output:

    $ service postgresql start
     * Starting PostgreSQL 9.1 database server
     * Removed stale pid file.
    The PostgreSQL server failed to start. Please check the log output:
    2014-07-22 20:07:39 BST LOG:  database system was interrupted; last known up at 2014-07-22 20:07:02 BST
    2014-07-22 20:07:39 BST LOG:  could not open file "pg_xlog/00000001000000000000001D" (log file 0, segment 29): No such file or directory
    2014-07-22 20:07:39 BST LOG:  invalid checkpoint record
    2014-07-22 20:07:39 BST FATAL:  could not locate required checkpoint record
    2014-07-22 20:07:39 BST HINT:  If you are not restoring from a backup, try removing the file "/var/lib/postgresql/9.1/main/backup_label".
    2014-07-22 20:07:39 BST LOG:  startup process (PID 10375) exited with exit code 1
    2014-07-22 20:07:39 BST LOG:  aborting startup due to startup process failure
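
The comment above traces the pg_stop_backup warning to an archive_command line that was present but commented out. A minimal sketch, assuming a POSIX shell with grep, that lists only the archive-related settings that are active in a configuration file. The helper name `active_archive_settings` is hypothetical, and the example path is an assumption based on the Ubuntu package layout used elsewhere in this thread:

```shell
#!/bin/sh
# Print the archive-related settings that are active (not commented out)
# in a postgresql.conf-style file. Lines starting with '#' do not match.
active_archive_settings() {
    conf_file="$1"
    grep -E '^[[:space:]]*(wal_level|archive_mode|archive_command)[[:space:]]*=' "$conf_file"
}

# Example (assumed path, not taken from the post):
# active_archive_settings /etc/postgresql/9.1/main/postgresql.conf
```

On a running server, `psql -c "show archive_command;"` reports the value currently in effect, which helps confirm that the restart picked up the change.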
    
    