Re: [Dovecot] Maildir + NFS + multiple machines = spectacular failure
Rich,
We're also in the process of setting up a GFS/Cluster with CentOS. I'm hopeful this will eliminate some of the issues.
I'll keep everyone posted.
Steve
Adding "noac" to our /etc/fstab NFS mounts seems to have helped our
index errors so far:
nfsserver:/var/mail /var/mail nfs rw,hard,intr,rsize=8192,wsize=8192,noac 0 0
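For anyone checking several servers, here is a minimal sketch of a script that lists NFS mounts in an fstab-style file that lack the "noac" option. The function name and the choice to treat both "nfs" and "nfs4" as NFS types are assumptions for illustration, not anything from Dovecot.

```python
def nfs_mounts_missing_noac(fstab_text):
    """Return mount points of NFS entries whose options lack 'noac'."""
    missing = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        _device, mountpoint, fstype, options = fields[:4]
        # Assumption: "nfs" and "nfs4" are the types worth checking.
        if fstype in ("nfs", "nfs4") and "noac" not in options.split(","):
            missing.append(mountpoint)
    return missing
```

Feeding it the contents of /etc/fstab on each mail server should return an empty list once every NFS mount carries the option.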
Over the weekend we had 1,465,031 POP3 sessions across 3 servers, and
only logged 15 "Corrupted" errors. They were of these types:
Corrupted transaction log file /var/mail/(user)/Maildir/dovecot.index.log: unexpected end of file while reading header
Corrupted transaction log file /var/mail/(user)/Maildir/dovecot.index.log: end_offset (236) > current sync_offset (200)
Corrupted transaction log file /var/mail/(user)/Maildir/dovecot.index.log: record size too small (type=0x8080, offset=7260, size=0)
Corrupted transaction log file /var/mail/(user)/Maildir/dovecot.index.log: hdr.size too large (0)
We also had 1,929 IMAP sessions from our testing support staff, and
only 2 errors:
Corrupted transaction log file /var/mail/(user)/Maildir/dovecot.index.log: record size too small (type=0x5ef9, offset=200, size=0)
Corrupted transaction log file /var/mail/(user)/Maildir/.Junk E-mail/dovecot.index.log: end_offset (15292) > current sync_offset (15236)
Rich
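The breakdown by error type above can be produced with a small script. This is only a sketch: it assumes each log message sits on a single line, and the names tally_corruption and sessions_per_error are made up for illustration. Numbers in the message are replaced with "N" so that errors of the same kind group together.

```python
import re
from collections import Counter

# Matches the path up to dovecot.index.log, then captures the reason text.
PATTERN = re.compile(r"Corrupted transaction log file (.+?dovecot\.index\.log): (.+)")
# Hex values (0x8080) and decimal numbers are both normalised to "N".
NUMBER = re.compile(r"0x[0-9a-fA-F]+|\d+")


def tally_corruption(lines):
    """Count 'Corrupted transaction log file' messages by normalised reason."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            reason = NUMBER.sub("N", match.group(2).strip())
            counts[reason] += 1
    return counts


def sessions_per_error(sessions, errors):
    """Average number of sessions per logged error."""
    return sessions / errors
```

With the figures quoted above, sessions_per_error(1465031, 15) works out to roughly one error per 97,669 POP3 sessions.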
On Apr 28, 2006, at 11:27 AM, Apps Lists wrote:
Hi Rich.
Glad to see that helped out. I can still introduce errors with IMAP even with that setting.
We just finished setting up GFS and the results are the same:
May 1 11:33:11 dev4 dovecot: IMAP(jtest): fcntl() failed with file /var/mailstore/72/af/375887/Maildir/dovecot.index: No such file or directory
May 1 11:33:13 dev4 dovecot: IMAP(jtest): Corrupted transaction log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with UID 2703, but next_uid = 2704
May 1 11:33:15 dev4 dovecot: IMAP(jtest): Corrupted transaction log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with UID 2704, but next_uid = 2705
May 1 11:35:16 dev4 dovecot: IMAP(jtest): fcntl() failed with file /var/mailstore/72/af/375887/Maildir/dovecot.index: No such file or directory
May 1 11:35:16 dev4 dovecot: IMAP(jtest): file mail-index.c: line 865 (mail_index_sync_from_transactions): assertion failed: (prev_seq <= max_seq && (prev_seq != max_seq || prev_offset <= max_offset))
....
May 1 11:40:19 dev4 dovecot: IMAP(jtest): file_dotlock_open() failed with file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Resource temporarily unavailable
May 1 11:40:19 dev4 dovecot: child 25918 (imap) killed with signal 11
I'm not entirely sure why I'm seeing file_dotlock_open() calls. We're using fcntl.
For the record: Three members of the cluster. Two machines are running Dovecot, mounting the filesystem with GFS. I have another random machine that connects to both machines with IMAP. The program connects, gets a list of folders and APPENDs a message. It then disconnects and repeats. I run one instance of this test against dovecot#1 and another against dovecot#2, and can get the above failure messages within seconds.
This is pretty much the same behaviour we've seen with NFS.
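For anyone who wants to reproduce this, the test loop described above (connect, list folders, APPEND a message, disconnect, repeat) can be sketched roughly as follows with Python's imaplib. This is not the actual test program; the function names, the placeholder addresses, and the default "INBOX" target are all assumptions. The connection is created through a factory so the loop can be pointed at either server.

```python
import imaplib
from email.message import EmailMessage


def build_message(n):
    """Build a small test message; the addresses are placeholders."""
    msg = EmailMessage()
    msg["From"] = "tester@example.invalid"
    msg["To"] = "jtest@example.invalid"
    msg["Subject"] = f"append test {n}"
    msg.set_content("test body\n")
    return msg.as_bytes()


def imap_factory(host, port=143):
    """Return a callable that opens a fresh plain IMAP connection."""
    return lambda: imaplib.IMAP4(host, port)


def append_loop(connect, user, password, iterations, mailbox="INBOX"):
    """Connect, LIST, APPEND one message, log out; repeat.

    Returns how many APPEND commands the server answered with OK.
    """
    ok = 0
    for n in range(iterations):
        conn = connect()
        try:
            conn.login(user, password)
            conn.list()  # fetch the folder list, as the original test does
            typ, _data = conn.append(mailbox, None, None, build_message(n))
            if typ == "OK":
                ok += 1
        finally:
            conn.logout()
    return ok
```

Running one instance against each Dovecot machine at the same time, for example append_loop(imap_factory("dovecot1.example"), "jtest", "secret", 1000) with a hypothetical host name and password, should exercise the same concurrent-APPEND pattern.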
Steve
Hate to top-post... but... ah well.
I'm trying the latest CVS snapshot with GFS and am only seeing a couple of (hopefully) harmless errors...
fcntl() failed with file /var/mailstore/72/af/375887/Maildir/dovecot.index: No such file or directory
fcntl() failed with file /var/mailstore/72/af/375887/Maildir/dovecot.index.cache: No such file or directory
Timo - Is it safe to ignore these fcntl() errors, or is there something else going on?
Rich - it might be a good idea to pull down the latest CVS and see if that resolves some or all of your NFS issues.
Steve
On Tue, 2006-05-02 at 11:06 -0400, Apps Lists wrote:
Does it do this all the time or only sometimes? From those error messages it looks like fcntl() locks don't work with GFS, or maybe GFS sometimes breaks with them. I wouldn't say they're harmless until I knew why exactly it's giving those errors.
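One way to sanity-check basic fcntl() locking on a shared mount is a small probe like the sketch below, run against a scratch file on the filesystem in question. It only shows whether an exclusive, non-blocking lock can be taken and released; it does not reproduce Dovecot's own locking sequence, and the function names are made up for illustration.

```python
import errno
import fcntl
import os


def try_exclusive_lock(path):
    """Try to take an exclusive fcntl lock on path without blocking.

    Returns the open file descriptor holding the lock, or None if
    another process already holds a conflicting lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        os.close(fd)
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return None  # lock is held elsewhere
        raise  # anything else (e.g. ENOENT) is a real failure
    return fd


def release_lock(fd):
    """Drop the lock and close the descriptor."""
    fcntl.lockf(fd, fcntl.LOCK_UN)
    os.close(fd)
```

Holding the lock on one machine while calling try_exclusive_lock() on the same file from another should return None if locks are propagated across nodes.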
With concurrent operations on the same GFS filesystem across multiple machines, it happens all day long.
I've yet to see any evidence of corruption, so I'm not particularly concerned.
I'm not sure what changed between the beta7 release and the last snapshot, but it's SO much happier... at least on GFS.
Steve (logs from one machine below)
May 2 11:52:25 dev4 dovecot: IMAP(jtest): fcntl() failed with file /var/mailstore/72/af/375887/Maildir/dovecot.index: No such file or directory
May 2 11:53:50 dev4 last message repeated 4 times
May 2 11:55:42 dev4 last message repeated 2 times
May 2 11:56:48 dev4 dovecot: IMAP(jtest): fcntl() failed with file /var/mailstore/72/af/375887/Maildir/dovecot.index: No such file or directory
May 2 11:58:46 dev4 last message repeated 3 times
May 2 11:59:10 dev4 dovecot: IMAP(jtest): stat() failed with file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: No such file or directory
May 2 11:59:10 dev4 dovecot: IMAP(jtest): fcntl() failed with file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: No such file or directory
May 2 11:59:10 dev4 dovecot: IMAP(jtest): mail_index_wait_lock_fd() failed with file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: No such file or directory
May 2 11:59:54 dev4 dovecot: IMAP(jtest): fcntl() failed with file /var/mailstore/72/af/375887/Maildir/dovecot.index: No such file or directory
May 2 12:00:56 dev4 dovecot: IMAP(jtest): fcntl() failed with file /var/mailstore/72/af/375887/Maildir/dovecot.index: No such file or directory
May 2 12:02:29 dev4 last message repeated 2 times
May 2 12:05:40 dev4 last message repeated 2 times
May 2 12:07:05 dev4 dovecot: IMAP(jtest): fcntl() failed with file /var/mailstore/72/af/375887/Maildir/dovecot.index: No such file or directory
participants (3): Apps Lists, richs@whidbey.net, Timo Sirainen