[Dovecot] Best filesystem?

Stan Hoeppner stan at hardwarefreak.com
Tue Feb 1 05:11:49 EET 2011


Frank Cusack put forth on 1/31/2011 3:06 PM:
> On 1/30/11 5:07 PM -0600 Stan Hoeppner wrote:
>> To be clear, for any subscribers who haven't followed all of the various
>> filesystem and data security threads, with any modern *nix system, you
>> WILL lose data when power fails.  How much depends on how many writes to
>> disk were in flight when the power failed, and how one has their RAID
>> controller and inside-the-disk caches configured, whether using barriers,
>> etc.
> 
> That's incorrect.  When you fsync() a file, all sane modern filesystems
> guarantee no data loss, unless you tune that out administratively for
> performance reasons.  If you use a log structured filesystem (like zfs
> or WAFL) you can optimize the performance as well.  With other types
> of filesystems (like xfs), performance suffers severely under heavy
> sync write loads.

This depends on how the dev does his syncs.  If done intelligently, XFS
performance won't suffer.  In fact, the preferred write method to XFS for high
performance applications is using O_DIRECT.  Using O_DIRECT, correctly, with
XFS, actually _increases_ write performance versus going through the buffer
cache.  So you get the best of both worlds:  higher performance and data
guaranteed on disk.

But not all applications use fsync, O_DIRECT, et al.  The point I was making is
that on any general system, you will likely have some applications/daemons
writing without fsync or O_DIRECT, so you will likely suffer some data loss when
the plug is pulled or the kernel crashes.  If the timing of the crash is right
you can even lose data when using fsync.  Depends on how busy the system is and
how many synced writes are in flight when the power drops.  There truly aren't
any guarantees that data will always be on disk.  There are always corner cases
where you will lose data.  Thankfully, for most of us, most of the time, they
are _extremely_ rare.

> As a reference point, ext3 with default settings guarantees data loss
> under normal conditions so I do not consider it a sane filesystem.  You
> can tune that behavior out (so that you preserve data), but in that
> case ext3 operates with sub-par performance.
> 
>> I believe I mentioned this when discussing the merits of XFS and ZFS with
>> Frank, who stated Solaris/ZFS were immune to this, to which I called BS.
>> They aren't immune, as Ted Ts'o clearly states.  For those who don't
>> know, Ted Ts'o is an MIT Ph.D., is the creator of EXT2/3, and is to this
>> day an active Linux kernel hacker/developer on filesystems and storage
>> drivers.
> 
> Ted is a close acquaintance of mine, and if he indeed says what you
> said he says, he is wrong.  More likely, he was simplifying or talking
> about certain cases, not the general case.

Read Ted's article I linked.  I didn't misquote him.  The simple point he was
making is that unless devs specifically use fsync or other calls to guarantee
their data is on disk, they will suffer data loss with any modern journaling
filesystem when the power goes out or the system crashes.  You seem to be
assuming all devs use fsync.  Apparently this is far from reality.
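
For anyone wondering what "specifically use fsync" looks like in practice,
here is a minimal sketch in Python of the usual write-temp-file, fsync, rename
pattern.  The function name durable_write is my own, for illustration only; it
is not from Dovecot or from Ted's article.  It assumes a POSIX system, and it
only asks the kernel to flush; whether the bits reach the platters still
depends on barriers and the controller/drive cache settings discussed earlier.

```python
import os
import tempfile


def durable_write(path, data):
    """Replace `path` with `data`, requesting a flush to stable storage.

    Hypothetical helper for illustration; assumes POSIX semantics.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename never
    # crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()             # userspace buffer -> kernel page cache
            os.fsync(f.fileno())  # page cache -> storage device
        os.replace(tmp_path, path)  # atomically swap the new file into place
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the directory as well, so the rename itself is persisted.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

An application that skips the two os.fsync() calls still works fine day to
day, which is exactly why so many do.  The difference only shows up after a
crash or power loss, when the file may be missing, stale, or zero length.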

> There are two ways to guarantee no data loss with zfs, one is to disable
> the ZIL (low performance) and the 2nd is to use a slog (high performance).

And exactly how does an external log device guarantee no data loss?  External
journal logs enhance performance but I've never heard of them being a magic cure
for data loss.  XFS allows external log devices as well, for performance.

-- 
Stan

