[Dovecot] minimize mbox mdbox fragmentation

Stan Hoeppner stan at hardwarefreak.com
Thu Oct 21 07:50:45 EEST 2010


Timo Sirainen put forth on 10/20/2010 11:53 AM:
> On Tue, 2010-10-19 at 21:55 -0500, Stan Hoeppner wrote:
> 
>> Any chance the mbox/mdbox writer code could be modified to do physical
>> preallocation on files to help avoid file(system) fragmentation?
> 
> I've been thinking about that before.
> 
>> "What you want is _physical_ preallocation, not speculative
>> preallocation. i.e. look up XFS_IOC_RESVSP or FIEMAP so your
>> application does _permanent_ preallocate past EOF. 
> 
> Oh, interesting. I didn't know that was possible. And even better: Linux
> has fallocate() that can do it for other filesystems than just XFS. Or
> looks like it's only XFS and ext4 (ext3 doesn't support it). I don't
> know if other OSes support this. Maybe in future I could make mdbox
> support writing to files whose size has been preallocated by actually
> writing NUL bytes, but that requires some extra code.
> 
> http://hg.dovecot.org/dovecot-2.0/rev/22c81f884032
> http://hg.dovecot.org/dovecot-2.0/rev/b884441a713f

There is also posix_fallocate(), which would widen the set of platforms
that support this, Timo.  You may also want to look at posix_fadvise()
(if you're not using it already), which might improve Dovecot's overall
disk performance a bit.
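
For illustration, here is a minimal sketch of calling the two functions
from C.  The helper name reserve_and_advise(), the 0600 mode and the
sequential-read hint are assumptions made up for this example, not
anything taken from Dovecot.  Per POSIX, posix_fallocate() also extends
the file size when the requested range runs past EOF, which may or may
not suit an appending writer.

```c
#define _POSIX_C_SOURCE 200809L   /* posix_fallocate(), posix_fadvise() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not Dovecot code): make sure the first `len`
 * bytes of `path` have space reserved, then hint sequential access.
 * Returns 0 on success or an errno-style error number on failure.
 */
int reserve_and_advise(const char *path, off_t len)
{
    assert(path != NULL && len > 0);

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return errno;

    /* posix_fallocate() returns the error number directly; it does
     * not return -1 and set errno like most system calls. */
    int err = posix_fallocate(fd, 0, len);

    /* The hint is advisory only, so a failure here is ignored.
     * A length of 0 means "to the end of the file". */
    if (err == 0)
        (void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

    close(fd);
    return err;
}
```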

NOTE: I don't believe fallocate(), in either its POSIX or Linux-only
form, will actually reduce m[d]box file fragmentation.  I don't believe
it actually increases the file size on disk, i.e. physically allocates
additional free extents trailing the end of the file.  fallocate() is
_speculative_ preallocation, which isn't what you want.  mbox and mdbox
files _will_ grow, so you'd want _physical_ preallocation.  I'm not sure
whether physical preallocation requires writing a bunch of zeros to the
end of the file or not.  I don't "think" it does.  I think you can
extend the size of the file past EOF to grow the file and the remainder
is just left as nulls or something.  Again, I'm not a dev.  I know just
barely enough about this stuff to get myself into real trouble. ;)
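
The guesses above can be checked empirically rather than taken on
faith.  A rough sketch: compare a file's apparent size (st_size) with
the space actually backed by blocks (st_blocks) after each kind of
call.  The struct and function names are invented for this example,
and the 512-byte unit for st_blocks is the Linux convention, assumed
here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Apparent size versus bytes actually backed by disk blocks. */
struct alloc_info {
    off_t apparent;    /* st_size: the size "ls -l" reports */
    off_t allocated;   /* st_blocks * 512: roughly what "du" reports */
};

/* Hypothetical helper: fills *out and returns 0, or returns -1 on error. */
int get_alloc_info(int fd, struct alloc_info *out)
{
    struct stat st;

    assert(out != NULL);
    if (fstat(fd, &st) < 0)
        return -1;

    out->apparent = st.st_size;
    /* Assumption: st_blocks counts 512-byte units, as on Linux. */
    out->allocated = (off_t)st.st_blocks * 512;
    return 0;
}
```

Calling this once after an ftruncate() that grows the file and once
after an fallocate()-style call should show whether "allocated" moved
along with "apparent" on the filesystem in question.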

See these comments:
---------------------------------------------------------------------
On Tue, Oct 19, 2010 at 10:03:19PM -0500, Stan Hoeppner wrote:
> > Dave Chinner put forth on 10/19/2010 6:42 PM:
> >
>> > > I've explained how allocsize works, and that speculative allocation
>> > > gets truncated away when the file is closed. Hence if the application
>> > > is doing:
>> > >
>> > > 	open()
>> > > 	seek(EOF)
>> > > 	write()
>> > > 	close()
> >
> > I don't know if it changes anything in the sequence above, but Dovecot
> > uses mmap i/o.  As I've said, I'm not a dev.  Just thought this
> > could/might be relevant.  Would using mmap be compatible with physical
> > preallocation?
mmap() can't write beyond EOF or extend the file. Hence it would
have to be:

	open()
	mmap()
	ftruncate(new_size)
	<write via mmap>

In this method, there is no speculative preallocation because there is
never a delayed allocation that extends the file size.  It simply
doesn't matter where the close() occurs. Hence if you use
mmap() writes like this, the only way you can avoid fragmentation is
to use physical preallocation beyond EOF before you start any
writes....
---------------------------------------------------------------------
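
A rough C sketch of the sequence Dave describes, with physical
preallocation beyond EOF done before any writes.  This is not Dovecot
code.  The helper name append_via_mmap() and its parameters are
invented for the example, and the Linux-specific fallocate() with
FALLOC_FL_KEEP_SIZE is assumed as the preallocation call (on XFS the
XFS_IOC_RESVSP ioctl mentioned earlier is another route).  For
simplicity the sketch calls ftruncate() before mmap() rather than
after it.

```c
#define _GNU_SOURCE   /* Linux fallocate() and FALLOC_FL_KEEP_SIZE */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: append `len` bytes to `path` through a shared
 * mapping, after asking the filesystem to reserve `reserve` bytes of
 * blocks past the current EOF.  Returns 0 on success, -1 on failure.
 */
int append_via_mmap(const char *path, const void *data, size_t len,
                    off_t reserve)
{
    assert(path != NULL && data != NULL && len > 0 && reserve >= 0);

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0)
        goto fail;
    off_t old_size = st.st_size;

    /* Reserve blocks beyond EOF without changing the visible size.
     * If the filesystem can't do it, carry on without the reservation. */
    if (reserve > 0 &&
        fallocate(fd, FALLOC_FL_KEEP_SIZE, old_size, reserve) < 0 &&
        errno != EOPNOTSUPP && errno != ENOSYS)
        goto fail;

    /* mmap() cannot extend the file, so grow it explicitly. */
    if (ftruncate(fd, old_size + (off_t)len) < 0)
        goto fail;

    /* Mapping offsets must be page aligned, so start the mapping at
     * the page that contains the old EOF. */
    long page = sysconf(_SC_PAGESIZE);
    off_t map_off = old_size - (old_size % page);
    size_t map_len = (size_t)(old_size - map_off) + len;

    char *p = mmap(NULL, map_len, PROT_READ | PROT_WRITE, MAP_SHARED,
                   fd, map_off);
    if (p == MAP_FAILED)
        goto fail;

    /* The actual write, done through the mapping. */
    memcpy(p + (old_size - map_off), data, len);

    int rc = msync(p, map_len, MS_SYNC);
    munmap(p, map_len);
    close(fd);
    return rc;

fail:
    close(fd);
    return -1;
}
```

A call such as append_via_mmap("some.file", msg, msg_len, 1 << 20)
would ask for 1 MiB of reserved space past EOF before appending; the
right reservation size for real mailboxes is an open question here.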

It would be beneficial, I think, if you'd sub to the xfs list, Timo, and
pick some brains.  All the devs there are Linux devs but have experience
with many platforms, including IRIX and other UNIX variants.  Most if not
all of them have been developing on UNIX systems their entire careers,
and only UNIX.  They could answer any question you have about the Linux
IO subsystem, not just XFS-specific stuff.  Some are current SGI
employees, some former, some Red Hat, etc.  They could probably answer any
POSIX call questions you might have as well.

http://oss.sgi.com/mailman/listinfo/xfs

-- 
Stan

