Understanding why Dovecot unexpectedly died

Luca Bertoncello lucabert at lucabert.de
Sat Nov 15 19:00:13 UTC 2014


Hi list!

I use Dovecot 1.2.17 (I can't upgrade right now, for several reasons),
controlled by Pacemaker (I have an HA cluster).
I now see that Pacemaker often restarts Dovecot. I wrote my own script to
manage Dovecot, since Pacemaker does not have its own.

The "monitor" section of my script looks like this:

monitor)
                # No pidfile: Dovecot is considered stopped.
                if [ ! -e $OCF_RESKEY_pid ]; then
                        echo "stopped (no pidfile)"
                        echo "DOVECOT STOPPED - NO PIDFILE" | /usr/bin/logger -p local0.info -t DOVECOT-MONITOR -i
                        exit $OCF_NOT_RUNNING
                else
                        # Pidfile exists: look for its PID in the process list.
                        /bin/ps axuwf | /bin/grep `/bin/cat $OCF_RESKEY_pid` | /bin/grep -v grep > /dev/null 2>&1
                        if [ $? -ne 0 ]; then
                                echo "stopped"
                                echo "DOVECOT STOPPED - NO PROCESS" | /usr/bin/logger -p local0.info -t DOVECOT-MONITOR -i
                                exit $OCF_NOT_RUNNING
                        else
                                # Process found: look for a dovecot socket matching the bind address.
                                if [ "`/bin/netstat -tupan | /bin/grep dovecot | /bin/grep $OCF_RESKEY_bindaddr | /usr/bin/wc -l`" -ne 0 ]; then
                                        exit $OCF_SUCCESS
                                else
                                        echo "DOVECOT STOPPED - NO LISTEN [`/bin/netstat -tupan | /bin/grep dovecot`]" | /usr/bin/logger -p local0.info -t DOVECOT-MONITOR -i
                                        exit $OCF_ERR_GENERIC
                                fi
                        fi
                fi
                exit $OCF_SUCCESS
                ;;
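
For comparison, the process check could also be done without grepping the
"ps" output, which can match unrelated lines that merely contain the same
digits. The following is only a minimal sketch, not the script in use here:
the function name pid_alive is hypothetical, and it assumes the monitor runs
as root (otherwise "kill -0" can fail for a process owned by another user).

```shell
# Hypothetical helper: succeed if the PID stored in a pidfile is a live process.
pid_alive() {
    pidfile=$1
    # No readable pidfile: treat as not running.
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # Reject empty or non-numeric content before passing it to kill.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Signal 0 delivers nothing; it only reports whether the PID can be signalled.
    kill -0 "$pid" 2>/dev/null
}
```

In the monitor section this would be called as: pid_alive "$OCF_RESKEY_pid" || exit $OCF_NOT_RUNNING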

I added the "logger" calls just now to try to understand why it dies...
When Pacemaker restarts Dovecot, I can see these lines in my syslog:

Nov 15 18:59:09 mail01 DOVECOT-MONITOR[530]: DOVECOT STOPPED - NO LISTEN [tcp        0      0 192.168.33.1:37545      192.168.33.3:3306       ESTABLISHED 637/dovecot-auth
Nov 15 18:59:09 mail01 DOVECOT-MONITOR[530]: tcp        0      0 192.168.33.1:37537      192.168.33.3:3306       ESTABLISHED 529/dovecot-auth]

So there is no "dovecot" process listening anymore... Normally I have these:

tcp        0      0 0.0.0.0:110             0.0.0.0:*               LISTEN      634/dovecot
tcp        0      0 0.0.0.0:143             0.0.0.0:*               LISTEN      634/dovecot
tcp        0      0 0.0.0.0:993             0.0.0.0:*               LISTEN      634/dovecot
tcp        0      0 0.0.0.0:995             0.0.0.0:*               LISTEN      634/dovecot
tcp        0      0 192.168.33.1:40994      192.168.33.3:3306       VERBUNDEN   891/dovecot-auth
tcp        0      0 192.168.33.1:40984      192.168.33.3:3306       VERBUNDEN   638/dovecot-auth
tcp6       0      0 :::110                  :::*                    LISTEN      634/dovecot
tcp6       0      0 :::143                  :::*                    LISTEN      634/dovecot
tcp6       0      0 :::993                  :::*                    LISTEN      634/dovecot
tcp6       0      0 :::995                  :::*                    LISTEN      634/dovecot
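
To tell the listening master process apart from the dovecot-auth connections
to MySQL, the netstat output could be filtered on the state and program
columns rather than on the word "dovecot" alone. This is a sketch under
assumptions: the function name count_dovecot_listeners is hypothetical, it
expects the column layout shown above (state in column 6, PID/program in
column 7, which netstat only fills in when run as root), and it deliberately
ignores $OCF_RESKEY_bindaddr.

```shell
# Hypothetical helper: read `netstat -tupan` output on stdin and print the
# number of sockets in LISTEN state owned by the "dovecot" master process.
# Lines owned by "dovecot-auth" or in any other state are not counted.
count_dovecot_listeners() {
    awk '$6 == "LISTEN" && $7 ~ /^[0-9]+\/dovecot$/ { n++ } END { print n + 0 }'
}
```

Used as "/bin/netstat -tupan | count_dovecot_listeners", this would print 8
for the normal output above and 0 for the two dovecot-auth lines in the log.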

In mail.log and mail.err I can't see anything except:

Nov 15 18:59:13 mail01 dovecot: Dovecot v1.2.17 starting up
Nov 15 18:59:13 mail01 dovecot: auth-worker(default): mysql: Connected to 192.168.33.3 (exim)

And in the syslog there is nothing about Dovecot...

Any ideas?

Thanks a lot!
Luca Bertoncello
(lucabert at lucabert.de)

