[Dovecot] Using MySQL to store email?

Les Mikesell lesmikesell at gmail.com
Wed Jun 7 18:26:48 EEST 2006


On Tue, 2006-06-06 at 17:02 +0100, Simon Waters wrote:

> > > Well, the problem with a file-based database (Dovecot's indexes etc.
> > > are in fact a database) is that you must use the same locking and/or
> > > terminate / suspend the service, otherwise there is the possibility
> > > that the data and the indexes are out-of-sync.
> 
> Yes, but indexes are cheap to rebuild, but expensive to maintain, so you might 
> find this cuts the wrong way.
> 
> I'm quite a fan of the idea of putting email in databases, I can see the 
> upside. But those who think it will save any resource at all haven't spent 
> enough time with big database systems. It will be a lot slower, except where 
> you can utilise indexes to speed operations, which will be rarely if at all.
> 
> Just consider the number of blocking writes to commit an email to maildir 
> (remember it uses a lot of rename), now consider the kind of indexes you want 
> to maintain on the database that'll be updated when an email is delivered 
> (and possibly when it is read, files etc).

I think the people who expect an improvement from databases over maildir
are used to Unix filesystems that degrade badly as the number of files
in a directory increases.  These days many, like ReiserFS and XFS, are
much better.  My theory is that if your filesystem isn't a good place
to store things, you should fix that before thinking about databases.

> I got into pondering mail in databases from the issues pertaining to 
> consistency of reads of directories in Unix filesystems, whilst it is easy to 
> guarantee the consistency of a read from an ACID style database (unlike 
> reading directories in a big maildir folder). Of course when I asked Hans 
> Reiser he said it sounds like the kind of modular functionality that modern 
> filesystems ought to provide and offered to write a filesystem plugin for 
> ReiserFS that guarantees the consistency of directory reads for maildir use. 
> Of course there is a performance (or resource) penalty in doing a consistent 
> read of a directory.

The issue is the same in both places: you either speed things up by
allowing dirty reads or you take the performance hit by locking for
the duration of all writes.  When you create a new file you must
atomically determine whether or not the name currently exists. Even
ReiserFS can't cheat on that without ending up corrupted.
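
To make the atomic-create point concrete, here is a minimal sketch in
Python (my choice of language for illustration; create_exclusive is a
hypothetical helper, not anything from Dovecot or a maildir library).
It relies on the POSIX O_CREAT|O_EXCL flags, which make the "does this
name exist?" check and the creation a single step in the filesystem:

```python
import os

def create_exclusive(path, data):
    """Create `path` only if the name is not already taken.

    Returns True if this call created the file, False if the name
    already existed. Hypothetical helper for illustration only.
    """
    try:
        # O_CREAT | O_EXCL: the kernel checks for the name and creates
        # the file in one atomic operation, so two writers cannot both win.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        return False
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return True
```

A second call with the same path returns False and leaves the first
writer's data alone; the filesystem has to serialize those two calls
somehow, which is the cost I mean.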

> Maybe more than one way to solve a problem, just need to make sure you know 
> precisely which problems you are trying to solve.
> 
>  Simon, who'll continue moving systems to maildir, till something better 
> arrives.

An extended maildir might make sense where additional subdirectories
are used transparently to limit the number of files in any single
directory, so it would end up looking something like a Squid cache,
which solves a very similar problem.
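
As a rough sketch of what I mean (hypothetical, not an existing maildir
extension and not Squid's actual layout scheme), the subdirectories
could be derived from a hash of the message file name. The function
name, the use of SHA-1, and the two-level, two-character layout are all
assumptions made for the example:

```python
import hashlib
import os

def sharded_path(root, name, levels=2, width=2):
    """Map a message file name to a nested subdirectory under `root`.

    The directory names are taken from the leading hex digits of a hash
    of the file name, so files spread evenly and the same name always
    maps to the same place. Hypothetical layout for illustration only.
    """
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    # e.g. levels=2, width=2 gives root/ab/cd/name
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *parts, name)
```

With two levels of two hex characters there are 65536 leaf directories,
so even a folder with millions of messages keeps each directory small.
The catch is that every maildir-aware program would have to understand
the layout, which is why it would need to be done transparently.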

-- 
  Les Mikesell
    lesmikesell at gmail.com



