[Dovecot] Mailbox Hashing

Kyle Wheeler kyle-dovecot at memoryhole.net
Fri Nov 14 01:43:23 EET 2008


On Thursday, November 13 at 05:20 PM, quoth Justin Krejci:
> Is there any method for hashing the inbox automatically after say 
> 5,000 messages are stored? Example
>
> $Maildir/in/0/message0 
> $Maildir/in/0/message1 
> $Maildir/in/0/message2

Not with Maildir: the format itself does not allow for that. It may 
be possible with something like dbox, since that's a 
Dovecot-specific format.

In general, though, that kind of hashing is usually a workaround for a 
lousy filesystem (such as ext2, which searches directories linearly), 
rather than something you'd really *want* to do.
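
To make the idea concrete, here is a minimal sketch of the kind of 
layout being asked about. The bucket size of 5,000 and the 
in/<bucket>/messageN naming come from the example in the question; the 
function name is hypothetical, and nothing here is something Dovecot or 
any Maildir client implements or understands.

```python
from pathlib import PurePosixPath

# Assumption: bucket size taken from the "5,000 messages" in the question.
BUCKET_SIZE = 5000


def bucketed_path(maildir: str, message_index: int,
                  bucket_size: int = BUCKET_SIZE) -> PurePosixPath:
    """Return the path a message would get under an in/<bucket>/ layout.

    Messages 0..bucket_size-1 land in bucket 0, the next bucket_size
    messages in bucket 1, and so on. Hypothetical helper for illustration.
    """
    if message_index < 0:
        raise ValueError("message_index must be non-negative")
    if bucket_size <= 0:
        raise ValueError("bucket_size must be positive")
    bucket = message_index // bucket_size
    return PurePosixPath(maildir) / "in" / str(bucket) / f"message{message_index}"
```

The only point of such a scheme is to cap the number of entries per 
directory, which is exactly what a filesystem with indexed directories 
makes unnecessary.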

The one exception might be if you want to split someone's inbox over 
several filesystems, but even that could be accomplished using 
something like UnionFS. Of course, we're getting outside the realm of 
production-tested options here, and it would probably introduce all 
kinds of potential problems with locking and such.

> I am not currently using Dovecot but am interested to know if this 
> is available or does running with 20,000+ messages in a single inbox 
> not affect the performance much?

It all depends on the filesystem and what operations you're doing. 
Dovecot does a *lot* of caching to avoid hitting the filesystem 
whenever it can. However, randomly accessing messages in your mailbox 
*will* cause a filesystem access, and the speed of that depends on 
having a halfway decent filesystem.

> I have looked into other file system tuning techniques such as 
> enabling ext3 dir_index or using ReiserFS (maybe not ReiserFS 
> anymore). There will likely be 15,000 to 20,000 accounts spread out 
> on one or more servers using a 6-drive RAID10 setup. Most accounts 
> are not expected to have high message quantities but there will be 
> lots of concurrent connections via pop and imap (and webmail imap).

You should be fine. I'd probably encourage something more stable like 
ext3 with dir_index (ReiserFS is often viewed as a purely experimental 
filesystem, and not reliable for production systems). The ext3 
documentation suggests that 100k-1M+ files in a single directory 
should not pose a significant performance problem when using 
dir_index. I haven't tried it with directories that are *that* big, 
but I regularly use mailboxes with over 5k messages without problems.
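
If you'd rather measure than take my word for it, a rough sketch like 
the following times random lookups in a single large directory. The 
function name and parameters are made up for illustration. It only 
exercises name lookup via stat() on empty files, and the entries will 
mostly be in cache right after creation, so treat the result as a 
best-case figure rather than a real mail workload.

```python
import os
import random
import tempfile
import time
from typing import Optional


def time_random_stats(num_files: int, num_lookups: int,
                      base_dir: Optional[str] = None, seed: int = 0) -> float:
    """Create num_files empty files in one directory, then time random stat() calls.

    base_dir should point at the filesystem under test; with None, the
    system temp location is used (which may be tmpfs, not your mail volume).
    Returns the average number of seconds per lookup.
    """
    if num_files <= 0 or num_lookups <= 0:
        raise ValueError("num_files and num_lookups must be positive")
    rng = random.Random(seed)
    with tempfile.TemporaryDirectory(dir=base_dir) as directory:
        names = [f"message{i}" for i in range(num_files)]
        for name in names:
            # Empty files: only the directory lookup cost is of interest here.
            with open(os.path.join(directory, name), "w"):
                pass
        # Pick the lookup targets before starting the clock.
        targets = [os.path.join(directory, rng.choice(names))
                   for _ in range(num_lookups)]
        start = time.perf_counter()
        for path in targets:
            os.stat(path)
        elapsed = time.perf_counter() - start
    return elapsed / num_lookups
```

Running it for, say, 1,000, 5,000 and 20,000 files with base_dir set to 
a directory on the mail volume shows whether the per-lookup time grows 
with directory size on that filesystem.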

~Kyle
-- 
A woman is like a tea bag. It's only when she's in hot water that you 
realize how strong she is.
                           -- either Eleanor Roosevelt or Carl Sandburg