[Dovecot] remote hot site, IMAP replication or cluster over WAN

Wed Nov 3 08:17:14 EET 2010

Stefan G. Weichinger put forth on 11/2/2010 1:15 PM:

> A bit off-topic, sorry ... I want to set up a hot backup dovecot in a
> VM, aside the physical server, so I am very interested in the "best
> practise" to do so ...

There isn't one.  If there were, Timo would have pointed you to the wiki.

Doing server failover is inherently problematic for a large number of
reasons.  The easiest way to implement it is to literally turn on the
backup server (power on) when the primary fails.  The backup comes up
with the same hostname and IP address as the primary and mounts the same
physical storage.

The storage must be either a SAN LUN, NFS directories, or a local disk
that has been mirrored over the network during normal operations.  But
the backup can't hold the same hostname and IP as the primary while it
is up and running to receive the mirrored data.

Thus, for a "standby" server, it must be powered off and take ownership
of the storage when powered on.  You _can_ do realtime mirroring to the
standby while it's running, but then you have some really complex issues
to deal with as far as hostname and IP assignments when the primary host
dies and you have to take over the name and IP on the spare server.
This can be done with a reboot and using alternate config files, and
might actually work better in a virtual environment than with a physical
machine as VM guests tend to boot faster than physical hosts due to
things like long pauses caused by hardware BIOS routines.

The key to all of the above is proper identification of primary host
failure.  The biggest problem with this setup is the "two brains" issue
(usually called "split brain").  There are a number of network scenarios
that can cause your backup server or monitoring software to think the
primary host is offline when it's really not.  The secondary thus comes
up, and now you have two hosts of the same name and IP address on the
network.  This situation can cause a number of serious problems.
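
To make the failure detection point concrete, here is a minimal sketch
of one common mitigation: only declare the primary dead when several
independent checks agree.  The function names, the probe list and the
quorum value are all hypothetical, not part of Dovecot or any cluster
package, and a check like this only reduces the "two brains" risk.  It
does not replace proper fencing of the failed node.

```python
import socket
from typing import Callable, Sequence


def tcp_probe(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_fail_over(probes: Sequence[Callable[[], bool]], quorum: int) -> bool:
    """Return True only if at least `quorum` probes see the primary as down.

    Each probe is a zero-argument callable returning True when the
    primary looks healthy from that probe's point of view.
    """
    failed = sum(1 for probe in probes if not probe())
    return failed >= quorum
```

Each probe should ideally take a different network path to the primary
(for example the IMAP port checked from two different switches, plus a
check of the storage), so that one dead link doesn't look like a dead
host.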

IMO, the best way to do high availability is to use an active/active
cluster of any number of nodes you see fit to meet your performance and
reliability needs.  All hosts are live all the time and share the load.
When one goes down client performance may simply drop a bit, but that's
about the extent of the downside.

It's inherently more straightforward to set up than the previous
scenario, especially if you're using NFS storage.  In this case, you'd
build two identical Dovecot servers and have each mount the same NFS
mail directory.  Read the list archives for ways to mitigate the index
file issue.  Timo wrote a new director specifically to meet this need.
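
The basic idea behind keeping index files sane on shared NFS is that a
given user should always land on the same backend server.  A rough
sketch of that idea follows.  This is NOT the director's actual
algorithm, just an illustration, and the function name and backend
names are made up for the example.

```python
import hashlib
from typing import Sequence


def backend_for_user(username: str, backends: Sequence[str]) -> str:
    """Map a username to one backend so repeat logins hit the same host."""
    if not backends:
        raise ValueError("at least one backend is required")
    # Hash the username and reduce it to an index into the backend list.
    digest = hashlib.md5(username.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(backends)
    return backends[index]
```

Note that with a plain modulo like this, adding or removing a backend
moves many users to a different host at once, so a real implementation
has to track more state than this sketch does.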

Two other options for the shared storage are a Fibre Channel or iSCSI
SAN, or using DRBD to mirror disks (or logical devices--RAID) over the
network.  Both of these solutions require using a cluster filesystem
such as GFS2.  These can be quite a bit more difficult to set up and get
working properly than the NFS method, especially for less experienced
sysadmins.  They can also be more difficult to troubleshoot, especially
for sysadmins lacking sufficient knowledge or aptitude with regard to
storage hardware and low level Linux device drivers.

Hope this helps you a bit.  You probably won't find a "how to" document
that spoon feeds you the steps for an exact build/setup of this.  If you
choose the DRBD route you might be able to get Eric to write you up a
step-by-step of how he did his two node DRBD Dovecot cluster.  Maybe
he's already written one. :)

-- 
Stan