[Dovecot] Dovecot-1.0.beta1: New mail sometimes fails to appear

Tom Alsberg alsbergt at cs.huji.ac.il
Tue Jan 31 11:36:02 EET 2006


Sorry for the late reply onthe subject.  However I thought I would
check some more before replying.

Basically, I reached the conclusion that it is not NFS caching in
fault - below are the explanations.

I'm trying to debug the notify callbacks, however the system seems a
bit complex for now, and I'm still attempting to find heads and tails
in it.  Could you perhaps write something explaining how the notify
callbacks are called - what functions are involved, when the mailbox
is updated (stat shows a newer mtime) what the procedure is?

Following are answers to your last reply on the subject claiming it is
NFS caching that ruins it here.  However, the final evidence is in
this:

I added a debug logging to the stat loop in the file
src/lib-storage/index/index-mailbox-check.c (line 44, in the sources
of .beta2):

	for (file = ibox->notify_files; file != NULL; file = file->next) {
		if (stat(file->path, &st) == 0 &&
		    file->last_stamp != st.st_mtime) {
			i_info("CSDEBUG: activity on file \"%s\","
			       " last noted at %d, modified at %d",
			       file->path, file->last_stamp,
			       st.st_mtime);
			file->last_stamp = st.st_mtime;
			notify = TRUE;
		}
	}

This means I see in the log a line whenever a file (or directory) of a
mailbox SELECTED (IDLE on) is updated.  I indeed saw many such lines,
and waited until it happens again that new mail arrives but is not
seen by Thunderbird.  And indeed at some point new mail arrived (to my
inbox - /var/mail/alsbergt, which I had selected and open in
Thunderbird), but I did not see it.  In the log, I had:

dovecot: Jan 31 10:40:53 Info: imap(alsbergt): CSDEBUG: activity on file "/var/mail/alsbergt", last noted at 1138696544, modified at 1138696850

At the same time I had ethereal capturing the connection;  I saw no
EXISTS sent to the client (Thunderbird at that time).

Another message arrived after 10 minutes, again this debugging line
appeared (with different timestamps, of course).  Then after a while
another message arrived, but this time an EXISTS with all three
messages was sent to the client.

So obviously the results of stat in this case do show that the file is
updated.  However somewhere later where Dovecot should notify (send an
EXISTS) to the client, it does not for some reason.  It appears to me
that there is a bug somewhere in the callbacks reached by the
following code:

	if (notify) {
		ibox->notify_last_sent = ioloop_time;
		ibox->notify_pending = FALSE;
		ibox->notify_callback(&ibox->box, ibox->notify_context);
	}

I'd love some assistance in debugging this.  If you could explain
exactly what happens when notify_callback is called (I've not got it
all sorted out yet), where it is set, and what should by design be the
course, I may understand better what is going on.  Or anything else I
could do (add some more debugging prints?  send a core or some network
capture?) that would enable you to trace this?

Anyway, to the answers for your last mail:

On Sun, Jan 22, 2006 at 03:01:47PM +0200, Timo Sirainen wrote:
> <snip />
> > I will try to check whether this is the case.  You mean the NFS client 
> > (FreeBSD kernel) caching, not the server (NetApp), right?  Something 
> > like this happening on the server seems not to make any sense to me.
> 
> Right, in client side.

Well, to begin with, doing "ls" on the file basically does stat on it,
so if stat is cached I should not see the file's size or mtime changed
in ls, which I do (trying to do ls - on the same host as the Dovecot
server is running on - on the mailbox when this happens that new mail
arrives and is not seen).

> <snip />
> It's also possible that mutt does something slightly differently which
> causes NFS cache to be reset. You could maybe try this with Dovecot too.
> Although I'm not sure how that would be done.. chmod() apparently at
> least resets it.

I'm not sure exactly where to do that, but if you tell me where I
could add some chmod or something and see if this makes a difference.

> Hmm. From mutt's configure:
> 
> AC_ARG_ENABLE(nfs-fix, [  --enable-nfs-fix           Work around an
> NFS with broken attributes caching ],
>         [if test x$enableval = xyes; then
>                 AC_DEFINE(NFS_ATTRIBUTE_HACK)
>         fi])

However doing mutt -v, it appears that the mutt on our FreeBSD systems
is built without --enable-nfs-fix.  I also read a bit in the mutt
mailing list and as I understood this is a different issue with some
systems.

> Which does after each stat():
> 
> <snip />

I tried also running a small C program which does stat every second
and prints out if there is a difference in mtime from the last result,
and ran it on one of the mailboxes, and it always showed a difference
when new mail arrived, even when Dovecot would not send an EXISTS as
response to the IDLE command.

But anyway this is all irrelevant in the light of what I wrote above -
that Dovecot itself does notice a change in the mtime as returned by
stat while the client does not see new mail in IDLE.

  Thanks, help appreciated,
  -- Tom

-- 
  Tom Alsberg - hacker (being the best description fitting this space)
  Web page:	http://www.cs.huji.ac.il/~alsbergt/
DISCLAIMER:  The above message does not even necessarily represent what
my fingers have typed on the keyboard, save anything further.


More information about the dovecot mailing list