[Dovecot] New mailbox format

Timo Sirainen tss at iki.fi
Tue Dec 6 16:07:55 EET 2005


On Fri, 2005-09-23 at 21:48 -0400, Tom Metro wrote:
> Timo Sirainen wrote:
> > Index File
> > ----------
> [...]
> > The file is modified by creating first exclusively owned "index.lock" file,
> > updating it and then rename()ing it over the index file. This lock file
> > shouldn't be held for a long time, so if it exists, and it doesn't get 
> > modified at all within 30 seconds, it can be overwritten. Before
> > rename()ing over the index file, you should check with stat() that the
> > index.lock file is still the same as you expect it to be (in case your
> > process was temporarily stopped for over 30 secs).
> 
> The locking and handling of the index strikes me as a regression back to 
> the mbox problems that Maildir tried to solve.

Replying a bit late, but anyway..

I don't think the above index.lock is a problem in any way. Actually
Dovecot already uses a similar method with Maildir's dovecot-uidlist
file.
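
To make the quoted procedure concrete, here is a minimal sketch of it in
Python. It is only an illustration of the steps described above, not
Dovecot's actual code: the function names acquire_lock() and commit() are
made up for this example, and the stale-lock removal is simplified.

```python
import os
import tempfile
import time

STALE_SECONDS = 30  # timeout taken from the quoted description


def acquire_lock(lock_path, stale_seconds=STALE_SECONDS, poll=0.1):
    """Create lock_path exclusively; return (file object, stat of our file)."""
    while True:
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
            return os.fdopen(fd, "wb"), os.fstat(fd)
        except FileExistsError:
            pass
        try:
            age = time.time() - os.stat(lock_path).st_mtime
        except FileNotFoundError:
            continue  # the holder just finished; retry immediately
        if age > stale_seconds:
            # Stale lock: remove it and retry. Simplification: two processes
            # doing this at the same moment can race; real code must handle it.
            try:
                os.unlink(lock_path)
            except FileNotFoundError:
                pass
        else:
            time.sleep(poll)


def commit(lock_file, lock_stat, lock_path, index_path, data):
    """Write data into the lock file, check we still own it, then rename."""
    lock_file.write(data)
    lock_file.flush()
    os.fsync(lock_file.fileno())
    lock_file.close()
    try:
        now = os.stat(lock_path)
    except FileNotFoundError:
        raise RuntimeError("index.lock was removed by another process")
    # If we were stopped for too long, someone may have replaced our lock.
    if (now.st_dev, now.st_ino) != (lock_stat.st_dev, lock_stat.st_ino):
        raise RuntimeError("index.lock was overridden by another process")
    os.rename(lock_path, index_path)


# Usage in a scratch directory:
workdir = tempfile.mkdtemp()
lock_path = os.path.join(workdir, "index.lock")
index_path = os.path.join(workdir, "index")
lock_file, lock_stat = acquire_lock(lock_path)
commit(lock_file, lock_stat, lock_path, index_path, b"index contents\n")
```

The lock is held only for the duration of one write and one rename(),
which is the point: nobody waits on it for long.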

Maildir is lockless only in theory. Unless the maildir is globally
locked while its contents are being checked, files may temporarily
appear to be missing with all of the filesystems that I know of.

I think the locking is only a problem if you're holding the lock for a
long time. For example, if you need to keep the mailbox locked for as
long as some IMAP client is reading/writing messages, that's bad. That's
a problem with mbox, but not with maildir/dbox.

dbox's index.lock file needs to exist only in two situations:

1. While new message UIDs need to be allocated (as the final step of
saving new mail(s) to the mailbox). A global lock is needed for this
with any kind of mail storage with IMAP, since every message must have
a UID and UIDs must always grow.

2. Writing message flag changes / expunging mails. These changes should
go pretty quickly as well.
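
As an illustration of case 1, here is a small Python sketch of allocating
UIDs under the lock. The index format used here (a text file holding only
the next free UID) and the name allocate_uids() are assumptions made for
the example; dbox's real index contains more than this. Stale-lock
handling is left out, so the call simply fails with FileExistsError if
index.lock already exists.

```python
import os
import tempfile


def allocate_uids(index_path, count):
    """Reserve `count` consecutive UIDs and return them as a list."""
    lock_path = index_path + ".lock"
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    with os.fdopen(fd, "w") as lock_file:
        try:
            try:
                with open(index_path) as index_file:
                    next_uid = int(index_file.read().strip())
            except FileNotFoundError:
                next_uid = 1  # empty mailbox: first UID is 1
            # The new index goes into the lock file, then replaces the old one.
            lock_file.write(f"{next_uid + count}\n")
            lock_file.flush()
            os.fsync(lock_file.fileno())
            os.rename(lock_path, index_path)
        except BaseException:
            os.unlink(lock_path)  # don't leave the lock behind on failure
            raise
    return list(range(next_uid, next_uid + count))


# Usage: two deliveries into the same (hypothetical) mailbox index.
index_path = os.path.join(tempfile.mkdtemp(), "index")
first = allocate_uids(index_path, 3)
second = allocate_uids(index_path, 2)
```

Because the read, increment and rename() all happen while index.lock
exists, two writers can never hand out the same UID, and the lock is gone
again as soon as the rename() completes.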

> Have you considered other approaches, such as having the index be under 
> control of a daemon, and use IPC to communicate events to that daemon, 
> which could then exclusively handle modifying the file?

One of the biggest reasons for dbox's existence is that it needs to work
well with clustered filesystems. Relying on a daemon running on only
one computer kind of defeats the purpose of a cluster.

Anyway I'm not sure how that would actually benefit anything. A single
process could be a bottleneck if it handled all users' indexes, so it
should be able to scale to multiple processes. And to avoid locking in
those cases, each process should handle only a group of specific users.
And if we're going there, it might as well be the imap process itself
that does it all.

Redirecting all imap connections for one user to the same imap process
wouldn't be too difficult to implement (and it's been on my mind for a
while already), but having pop3 and dovecot-lda also in the same process
could get more tricky.

But does it really matter if the locking is handled by serialization (by
making everything go through a single process) or actual locking? If
there's only a single writer, the lock is always acquired immediately.
If there are multiple writers, you'll need to wait in both cases.
Although I suppose serialization provides fairer scheduling.

And even if there was only a single process updating the index file, I'd
probably still make it update the index using the exact same
rename(index.lock, index) to make sure the file doesn't get corrupted in
case of crashes :)

> Any way you slice it, though, these are just approximations of a 
> database server. Maybe embedding SQLite (just for indexes) is the answer?

Doesn't look like SQLite's locking (or writing in general) is in any way
cheaper than what I'm currently doing:

http://www.sqlite.org/lockingv3.html

Actually it looks like it may even block on getting a shared lock if
there are writes happening at the same time. I think many other
databases don't block there but instead use a rollback file to get the
old data. Dovecot's index files and dbox's index files don't need to be
read-locked at all, so they never block on reading.
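
A small Python sketch of why replacing a file with rename() allows
lockless reads on POSIX filesystems: a reader that already has the old
file open keeps seeing the complete old contents, while anyone opening the
file afterwards sees the complete new contents. This only demonstrates the
general rename() behaviour; it doesn't claim to show how Dovecot's own
index files are read, and the names replace_atomically() and
read_during_replace() are made up for the example.

```python
import os
import tempfile


def replace_atomically(path, data):
    """Write data to path + '.lock', then rename it over path."""
    tmp_path = path + ".lock"
    with open(tmp_path, "wb") as tmp_file:
        tmp_file.write(data)
        tmp_file.flush()
        os.fsync(tmp_file.fileno())
    os.rename(tmp_path, path)


def read_during_replace():
    """Return what an old reader and a new reader see around a replace."""
    path = os.path.join(tempfile.mkdtemp(), "index")
    replace_atomically(path, b"version 1")
    reader = open(path, "rb")                # reader takes no lock at all
    replace_atomically(path, b"version 2")   # writer replaces the file
    old_view = reader.read()                 # still the old file's contents
    reader.close()
    with open(path, "rb") as new_reader:
        new_view = new_reader.read()
    return old_view, new_view


old_view, new_view = read_during_replace()
```

Neither reader ever sees a half-written file, and neither had to wait for
the writer.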

SQLite could be useful for storing messages' and mailboxes' metadata
(ANNOTATE, ANNOTATEMORE extensions) since those require almost a full
database to work properly, but I don't think it's a good idea for what
Dovecot/dbox currently uses index files for. SQL in general isn't very
well suited for them.