MySQL, PostgreSQL, and Clustering

So Sun Micrososystems just bought MySQL. That could turn out to be pretty interesting. In other news, Clustering and Replication. I've been setting up some highly available systems that need to use both MySQL and PostgreSQL. MySQL has some great options for this and I was able to pick between Master-Master replication (which was the final choice here) or a clustered storage engine, both of which are available from MySQL and are very straightforward to configure. As for PostgreSQL, this is about the 2nd or 3rd time I've ever installed or used it, much less set it up to scale. (MySQL I've been using for ~10 years and with replication for at least a couple.) I assumed that replication would be as easy as MySQL and it turns out that it is, but there are a lot of 3rd party options that do different things and the documentation could be a lot more thorough. I stumbled across this article on PostgreSQL Replication and HA which led me to try out PGCluster. Your needs may vary, but this particular situation requires Master-Master replication with a definite upgrade path to High Availability. PGCluster seemed to provide all this, but it is non-trivial to get going. This PGCluster example was the most helpful thing I came across but it didn't really explain things too well, just provided a working configuration for a setup very different than mine. Read on for the details First is getting this all installed on CentOS. PGCluster is not a daemon or tool, it's available as a patch to PostgreSQL or you can download a patched version which I chose to do.
yum install readline-devel
useradd postgres
su - postgres
set PGDATA=/usr/local/
export PGDATA
set LD_LIBRARY_PATH=/usr/local/pgsql/lib
export LD_LIBRARY_PATH
wget http://pgfoundry.org/frs/download.php/1295/pgcluster-1.5.0rc16.tar.gz
tar -zxf pgcluster-1.5.0rc16.tar.gz
chown -R postgres:postgres pgcluster-1.5.0-rc16
cd pgcluster-1.5.0-rc16
./configure --enable-thread-safety
make
su
make install
chown -R postgres /usr/local/pgsql
exit
/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
Then is the configuration files. This setup has host1 replicating to host2 as a backup. Make sure host1 and host2 are set up in DNS or /etc/hosts properly first! host1:cluster.conf
<Replicate_Server_Info>
        <Host_Name>host1</Host_Name>
        <Port>8001</Port>
        <Recovery_Port>8101</Recovery_Port>
        <LifeCheck_Port>8201</LifeCheck_Port>
</Replicate_Server_Info>
<Host_Name>host1</Host_Name>
<Recovery_Port>7001</Recovery_Port>
<Rsync_Path>/usr/bin/rsync</Rsync_Path>
<Rsync_Option>ssh -1</Rsync_Option>
<Rsync_Compress>yes</Rsync_Compress>
<Pg_Dump_Path>/usr/local/pgsql/bin/pg_dump</Pg_Dump_Path>
<When_Stand_Alone>read_only</When_Stand_Alone>
<Replication_Timeout>1min</Replication_Timeout>
<LifeCheck_Timeout>3s</LifeCheck_Timeout>
<LifeCheck_Interval>11s</LifeCheck_Interval>
host1:pgreplicate.conf
<Cluster_Server_Info>
    <Host_Name>host1</Host_Name>
    <Port>5432</Port>
    <Recovery_Port>7001</Recovery_Port>
</Cluster_Server_Info>
<Cluster_Server_Info>
    <Host_Name>host2</Host_Name>
    <Port>5432</Port>
    <Recovery_Port>7001</Recovery_Port>
</Cluster_Server_Info>
<Host_Name>host1</Host_Name>
<Replication_Port>8001</Replication_Port>
<Recovery_Port>8101</Recovery_Port>
<RLOG_Port>8301</RLOG_Port>
<Response_Mode>normal</Response_Mode>
<Use_Replication_Log>no</Use_Replication_Log>
<Replication_Timeout>1min</Replication_Timeout>
<LifeCheck_Timeout>3s</LifeCheck_Timeout>
<LifeCheck_Interval>15s</LifeCheck_Interval>
<Log_File_Info>
        <File_Name>/var/log/postgresql/pgreplicate.log</File_Name>
        <File_Size>1M</File_Size>
        <Rotate>3</Rotate>
</Log_File_Info>
host2:cluster.conf
<Replicate_Server_Info>
        <Host_Name>host1</Host_Name>
        <Port>8001</Port>
        <Recovery_Port>8101</Recovery_Port>
</Replicate_Server_Info>
<Host_Name>host2</Host_Name>
<Recovery_Port>7001</Recovery_Port>
<Rsync_Path>/usr/bin/rsync</Rsync_Path>
<Rsync_Option>ssh -1</Rsync_Option>
<Rsync_Compress>yes</Rsync_Compress>
<Pg_Dump_Path>/usr/local/pgsql/bin/pg_dump</Pg_Dump_Path>
<When_Stand_Alone>read_only</When_Stand_Alone>
<Replication_Timeout>1min</Replication_Timeout>
<LifeCheck_Timeout>3s</LifeCheck_Timeout>
<LifeCheck_Interval>11s</LifeCheck_Interval>
And you're almost done! Start the replicator on host1:
/usr/local/pgsql/bin/pgreplicate -D /usr/local/pgsql/data
And then the patched PostgreSQL on each host:
/usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -o "-i" start
That should be it! Any changes you make on either host will be replicated to the other in the "all servers and services up" mode. Insert/Update/etc queries on host2 will fail if host2 cannot get to the replication server, however select/etc queries on host2 will still work if the replication server is down. (The replication process is separate from the postgres process.)

comments powered by Disqus