Hi,
some of our customers have discovered a replication issue after upgrading from 2.3.7.2 to 2.3.8.
On 2.3.8, several replication connections hang until the configured timeout expires, so after a few seconds there are $replication_max_conns hanging connections. Other replications run quickly and successfully.
Also running a doveadm sync tcp:... is working fine for all users.
I can't tell for certain, but I haven't seen the same mailboxes timing out again and again, so I assume it is not related to specific mailboxes.
From the logs:
server1:
Oct 16 08:29:25 server1 dovecot[5715]: dsync-local(username1@domain.com)<FXnVDW22pl0tGAAA1cwDxA>: Error: dsync(172.16.0.1): I/O has stalled, no activity for 600 seconds (version not received)
Oct 16 08:29:25 server1 dovecot[5715]: dsync-local(username1@domain.com)<FXnVDW22pl0tGAAA1cwDxA>: Error: Timeout during state=master_recv_handshake
server2:
Oct 16 08:29:25 server2 dovecot[8113]: doveadm: Error: read(server1) failed: EOF (last sent=handshake, last recv=handshake)
There aren't any additional logs regarding the replication.
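To check whether the same mailboxes stall repeatedly, the "I/O has stalled" errors can be counted per user. The following is only a minimal sketch: the regular expression is written against the log format quoted above, and the helper names `count_stalls` and `repeat_offenders` are made up for illustration. Other Dovecot versions or logging setups may format these lines differently.

```python
import re
from collections import Counter

# Pattern derived from the error line quoted in this thread (assumption:
# your logs use the same "dsync-local(user)<session>: Error: ..." layout).
STALL_RE = re.compile(
    r"dsync-local\((?P<user>[^)]+)\)<[^>]*>: Error: .*I/O has stalled"
)


def count_stalls(lines):
    """Return a Counter mapping user -> number of 'I/O has stalled' errors."""
    counts = Counter()
    for line in lines:
        match = STALL_RE.search(line)
        if match:
            counts[match.group("user")] += 1
    return counts


def repeat_offenders(counts, threshold=2):
    """Return the sorted users whose sync stalled at least `threshold` times."""
    return sorted(user for user, n in counts.items() if n >= threshold)
```

`count_stalls` accepts any iterable of lines, so an open log file can be passed directly. An empty result from `repeat_offenders` would support the guess that the stalls are not tied to particular mailboxes.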
I have tried increasing vsz_limit or reducing replication_max_conns. Nothing changed.
--
Both customers have 10k+ users. So far I have not been able to reproduce this on smaller test systems.
Both installations were downgraded to 2.3.7.2 to fix the issue for now.
--
I've attached a tcpdump showing that the client stops sending any data after the mailbox_guid table headers.
Any idea what could be wrong here, or how to debug this issue?
Thanks.
Carsten Rosenberg
I have the same problem here. All systems are running Debian 9 amd64.
My Dovecot director servers are running 2.3.8, but the mailbox servers have sync/replication problems with 2.3.8, so I have downgraded the mailbox servers to 2.3.7 and now everything works fine again.
On 18 October 2019 13:52:37 CEST, Carsten Rosenberg via dovecot <dovecot@dovecot.org> wrote:
Hello,
upgrading to 2.3.9 unfortunately does *not* solve this issue:
I upgraded one of my replicators from 2.3.7.2 to 2.3.9 and after some seconds replication stopped. The other replicator remained with 2.3.7.2. After downgrading to 2.3.7.2 replication is again working fine.
I have not yet tried upgrading both replicators, as this is a live production system. Is there a chance that upgrading both replicators will solve the problem?
The machines are running Ubuntu 18.04.
Any help is appreciated.
Thanks, Andreas
On 18.10.19 at 13:52, Carsten Rosenberg via dovecot wrote:
--
Dr. Andreas Piper, Hochschulrechenzentrum der Philipps-Univ. Marburg Hans-Meerwein-Straße 6, 35032 Marburg, Germany Phone: +49 6421 28-23521 Fax: -26994 E-Mail: piper@HRZ.Uni-Marburg.DE
Hello Timo,
upgrading both replicators did the job! Both replicators now run v2.3.9 and replication works fine, all sync-jobs which queued up during the upgrading have been processed successfully.
Thanks for the reassurance and all your great work with Dovecot,
Andreas
On 05.12.19 at 13:15, Timo Sirainen via dovecot wrote:
Hello all,
I just tested this morning: I can confirm that the issue seems to be resolved for me after upgrading both servers from 2.3.7.2 to 2.3.9.
Refs:
- https://dovecot.org/pipermail/dovecot/2019-October/117353.html
- https://dovecot.org/pipermail/dovecot/2019-November/117467.html
No more "I/O has stalled" error messages and replication works fine now. Thanks very much to the Dovecot team.
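For anyone planning a rolling upgrade, the version combinations reported in this thread can be written down as a small lookup. This is only a sketch of what posters here observed, not a statement of official compatibility. The table `REPORTED` and the function `reported_outcome` are illustrative names, and the 2.3.8 entry assumes both sides of those installations ran 2.3.8.

```python
# Outcomes as reported by posters in this thread; each key is the set of
# versions running on the two replication partners (order does not matter).
REPORTED = {
    frozenset({"2.3.7.2"}): "working",
    frozenset({"2.3.8"}): "stalling",  # assumption: both sides were on 2.3.8
    frozenset({"2.3.7.2", "2.3.9"}): "stalling",
    frozenset({"2.3.9"}): "working",
}


def reported_outcome(version_a, version_b):
    """Return 'working', 'stalling', or 'unknown' for a pair of versions."""
    pair = frozenset({version_a.strip(), version_b.strip()})
    return REPORTED.get(pair, "unknown")
```

Any pair not mentioned in the thread returns "unknown" rather than a guess.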
Have a nice day. Fabien
-----Original Message-----
From: dovecot <dovecot-bounces@dovecot.org> On behalf of Piper Andreas via dovecot
Sent: Friday, 6 December 2019 07:10
To: dovecot@dovecot.org
Subject: Re: [2.3.8] possible replication issue
participants (5)
- Carsten Rosenberg
- KOCIK Fabien (Acoss)
- Piper Andreas
- Timo Sirainen
- Wenger Florian