[Dovecot] director lmtp -> smtp problem
Hi Timo & Dovecot users,
We have a 2-node director setup which front-ends for 4 nodes which share a clustered filesystem (GFS). All nodes run Dovecot 2.0.18. Approximately 40k users, but typically only a few thousand active at any time.
The director nodes run sendmail, which delivers mail "locally" over LMTP to the director; the director then proxies it via SMTP to the real servers (also sendmail). Why sendmail? Because procmail is used for mail filtering and as the delivery agent.
Here's the problem, on the director:
Mar 14 20:40:08 imapdir2 dovecot: lmtp(10692): Connect from local
Mar 14 20:40:38 imapdir2 dovecot: lmtp(10692): Panic: file lmtp-proxy.c: line 376 (lmtp_proxy_output_timeout): assertion failed: (proxy->data_input->eof)
Mar 14 20:40:38 imapdir2 dovecot: lmtp(10692): Error: Raw backtrace: /usr/lib/dovecot/libdovecot.so.0(+0x3d99a) [0x7f79156c499a] -> /usr/lib/dovecot/libdovecot.so.0(+0x3d9e6) [0x7f79156c49e6] -> /usr/lib/dovecot/libdovecot.so.0(i_error+0) [0x7f791569df8f] -> dovecot/lmtp() [0x406e77] -> /usr/lib/dovecot/libdovecot.so.0(io_loop_handle_timeouts+0xd4) [0x7f79156d0044] -> /usr/lib/dovecot/libdovecot.so.0(io_loop_handler_run+0x5b) [0x7f79156d0c3b] -> /usr/lib/dovecot/libdovecot.so.0(io_loop_run+0x28) [0x7f79156cfca8] -> /usr/lib/dovecot/libdovecot.so.0(master_service_run+0x13) [0x7f79156bdfc3] -> dovecot/lmtp(main+0x154) [0x403f84] -> /lib64/libc.so.6(__libc_start_main+0xfd) [0x7f7914ef8cdd] -> dovecot/lmtp() [0x403d69]
Mar 14 20:40:38 imapdir2 sendmail[6905]: q2D8KodI018432: SYSERR(root): timeout writing message to localhost: Broken pipe
Most mail goes through OK, but some messages do not and end up queued until they run into the queue time limit.
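To see which queued messages line up with the LMTP panics, the log can be scanned for the two kinds of line quoted above. The following is only a rough sketch: the regular expressions are modelled on those two sample lines, and the function name `scan_log` is made up for illustration, so other syslog formats would need different patterns.

```python
import re

# Patterns modelled on the two log lines quoted in this post (an assumption:
# other syslog layouts will need different expressions).
PANIC_RE = re.compile(r"dovecot: lmtp\((\d+)\): Panic: (.*)")
SYSERR_RE = re.compile(
    r"sendmail\[\d+\]: (\w+): SYSERR\(\w+\): timeout writing message"
)


def scan_log(lines):
    """Return (panics, stuck) found in an iterable of log lines.

    panics: list of (lmtp_pid, panic_text) tuples
    stuck:  list of sendmail queue IDs that hit the write timeout
    """
    panics = []
    stuck = []
    for line in lines:
        match = PANIC_RE.search(line)
        if match:
            panics.append((int(match.group(1)), match.group(2).strip()))
            continue
        match = SYSERR_RE.search(line)
        if match:
            stuck.append(match.group(1))
    return panics, stuck
```

It could be fed an open log file, for example `scan_log(open("/var/log/maillog"))`; that path is an assumption about where syslog writes mail messages on this system.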
As far as I can tell, every message that fails does so after the following conversation between sendmail (on the director), the Dovecot LMTP proxy, and sendmail on the backend node (SMTP):
(names mangled to protect the guilty)
(first, sendmail -> director LMTP)
The conversation between the director (LMTP) and the backend (sendmail SMTP) goes like this:
At this point Dovecot should return the failed RCPT TO: status back to sendmail over LMTP, but instead it sits there (waiting for a timeout to expire?) and eventually dies.
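One way to check whether the proxy relays a rejected RCPT TO promptly is to speak LMTP to it by hand and time the reply. The sketch below uses Python's standard `smtplib.LMTP` class and assumes Python 3.9 or later for the `timeout` argument. The function name `probe_rcpt` and all addresses passed to it are hypothetical, and whether stopping at RCPT TO is enough to reproduce this particular panic is untested.

```python
import smtplib


def probe_rcpt(host, port, sender, recipient, timeout=60.0,
               local_hostname="localhost"):
    """Send LHLO, MAIL FROM and RCPT TO over LMTP and report the RCPT reply.

    Returns (code, text) when the server answers, or (None, reason) when the
    connection times out or is dropped before a reply arrives.
    """
    try:
        # smtplib.LMTP also accepts a Unix socket path as `host`
        # (any value starting with "/").
        with smtplib.LMTP(host, port, local_hostname=local_hostname,
                          timeout=timeout) as conn:
            conn.ehlo()  # the LMTP class sends LHLO here instead of EHLO
            conn.mail(sender)
            code, reply = conn.rcpt(recipient)
            return code, reply.decode(errors="replace")
    except smtplib.SMTPServerDisconnected as exc:
        # smtplib reports a read timeout as a dropped connection.
        return None, "no reply: %s" % exc
```

With the configuration shown in this post, the director's LMTP listener is on TCP port 24, so a call might look like `probe_rcpt("localhost", 24, "sender@example.com", "nosuchuser@example.com")`. A healthy proxy should hand back the backend's 5xx code within a second or two; a `None` result after the full timeout would match the hang described above.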
doveconf -n output:
2.0.18: /etc/dovecot/dovecot.conf
OS: Linux 2.6.32-220.4.2.el6.x86_64 x86_64 Red Hat Enterprise Linux Server release 6.2 (Santiago)
base_dir = /var/run/dovecot/
default_client_limit = 6000
default_process_limit = 10240
director_mail_servers = penguina.uvm.edu penguinb.uvm.edu penguinc.uvm.edu penguind.uvm.edu
director_servers = imapdir1.uvm.edu imapdir2.uvm.edu
lmtp_proxy = yes
login_trusted_networks = [REDACTED]
passdb {
  args = proxy=y nopassword=y protocol=smtp
  driver = static
}
service anvil {
  client_limit = 40000
}
service auth {
  client_limit = 45960
  unix_listener auth-userdb {
    group = mail
    mode = 0660
    user = dovecot
  }
}
service director {
  fifo_listener login/proxy-notify {
    mode = 0666
  }
  inet_listener {
    port = 9090
  }
  unix_listener director-userdb {
    mode = 0600
  }
  unix_listener login/director {
    mode = 0666
  }
}
service imap-login {
  executable = imap-login director
  service_count = 0
}
service imap {
  process_limit = 10240
  vsz_limit = 1 G
}
service lmtp {
  client_limit = 1
  inet_listener lmtp {
    port = 24
  }
  unix_listener /var/lib/dovecot/lmtp-socket {
    group = root
    mode = 0600
    user = root
  }
}
service pop3-login {
  executable = pop3-login director
  service_count = 0
}
service pop3 {
  process_limit = 5000
}
shutdown_clients = no
ssl_cert = <[REDACTED].pem
ssl_key = <[REDACTED].key
userdb {
  driver = passwd
}
verbose_proctitle = yes
version_ignore = yes
protocol lmtp {
  auth_socket_path = director-userdb
}
protocol imap {
  mail_max_userip_connections = 100
}
Hope you can help, Jim Lawson
On 3/15/12 8:25 AM, Timo Sirainen wrote:
Trying with v2.1.2 (peer is v2.0.18):
Mar 15 13:15:53 imapdir2 dovecot: director: Panic: file director.c: line 295 (director_sync): assertion failed: (!dir->ring_synced || (dir->left == NULL && dir->right == NULL))
Mar 15 13:15:53 imapdir2 dovecot: director: Fatal: master: service(director): child 513 killed with signal 6 (core not dumped)
Mar 15 13:15:53 imapdir2 dovecot: director: Error: Director 132.198.100.149:9090/right disconnected
That's OK; I can run them split-brained (iptables rules to keep the directors from talking to each other) while I move users around. It'll mean poor GFS performance for the duration, but that's better than an outage.
The good news is, the lmtp problem I wrote about above appears to be fixed. Thanks !!!
Jim