[Dovecot] Duplicate Attachments....

Les Mikesell lesmikesell at gmail.com
Fri Jun 2 05:36:08 EEST 2006


On Thu, 2006-06-01 at 20:31, Bill Boebel wrote:

> > > Our business model (advertising industry) is such that our users
> > > exchange a lot of emails with attachments - most less than a megabyte,
> > > but some considerably larger. Consequently, I have been looking for a
> > > good, open source imap server that doesn't store multiple copies of the
> > > same attachment - but instead, stores a checksum, and whenever a message
> > > is stored with a duplicate attachment, the attachment is stored only
> > > once, and simply referenced by some kind of link to other emails.
> > >
> > > This would *drastically* reduce the storage requirements for our company
> > > - imagine a message with a 10MB attachment, sent to 40 of our users,
> > > sometimes more than once. Now multiply this by 3 times per day, for 5
> > > years...
> > >
> > > Are there any plans for Dovecot to support this type of storage in the
> > > future? Does this require the use of an SQL DB for storing the message
> > > components?
> >
> > This is planned for dbox format in maybe a couple of months. I think the
> > plan was to do this in deliver agent so that the delivered mail's
> > attachment is shared between the mail's recipients.
> 
> How would you know when all users have deleted an email that has a shared
> attachment so that you can safely delete the shared attachment file?

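One approach is to use maildir, or a similar one-message-per-file
storage format, and have the delivery agent create a hard link in the
filesystem for each recipient's copy instead of writing the message
again.  Unix filesystem semantics ensure that the data isn't freed
until the last link is deleted, so no separate bookkeeping is needed.
Note that this shares whole messages rather than individual
attachments, and hard links only work within a single filesystem.
I think Cyrus has an option to work that way.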
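
To make the hardlink idea concrete, here is a rough Python sketch of what
such a delivery step might look like. It is not how Dovecot's deliver or
Cyrus actually does it; the function names, the mailbox layout and the
message filename are all made up for illustration, and it assumes every
mailbox directory lives on the same filesystem.

```python
import os
import tempfile


def deliver_with_hardlinks(message, mailbox_dirs, filename):
    """Write the message once, then hard-link it into the other mailboxes.

    Hypothetical helper: assumes all directories are on one filesystem,
    since os.link() cannot create hard links across filesystems.
    """
    first = os.path.join(mailbox_dirs[0], filename)
    with open(first, "wb") as f:
        f.write(message)
    paths = [first]
    for directory in mailbox_dirs[1:]:
        target = os.path.join(directory, filename)
        os.link(first, target)  # new name for the same inode, no data copied
        paths.append(target)
    return paths


def demo():
    """Deliver to three made-up users, then delete two of the copies."""
    message = b"Subject: artwork\r\n\r\n(imagine a 10MB attachment here)\r\n"
    with tempfile.TemporaryDirectory() as root:
        dirs = []
        for user in ("alice", "bob", "carol"):
            directory = os.path.join(root, user, "new")
            os.makedirs(directory)
            dirs.append(directory)
        paths = deliver_with_hardlinks(message, dirs, "example-message")

        # st_nlink is the filesystem's own reference count for the data.
        counts = [os.stat(paths[0]).st_nlink]
        os.unlink(paths[0])  # first user deletes the mail
        counts.append(os.stat(paths[1]).st_nlink)
        os.unlink(paths[1])  # second user deletes the mail
        counts.append(os.stat(paths[2]).st_nlink)

        # The last remaining link still reads back the full message.
        with open(paths[2], "rb") as f:
            survivor = f.read()
    return counts, survivor == message


if __name__ == "__main__":
    print(demo())
```

The link count reported by stat() is the answer to the "when is it safe to
delete" question: each user's delete is just an unlink, and the filesystem
frees the blocks when the count reaches zero.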

-- 
  Les Mikesell
    lesmikesell at gmail.com



