[Dovecot] Long attachment encoded filenames (for non-ASCII characters etc) in MIME headers & corresponding Dovecot behaviour

Andrew Richards ar-dovecotlist at acrconsulting.co.uk
Fri Sep 30 01:48:21 EEST 2011


(Correction: Subject was confused)
Hi,

I've noticed a possible minor issue with long encoded filenames for attachments 
where these filenames are split across multiple lines. My understanding of 
character encoding and MIME is not as good as it should be, so I may easily 
have got this all mixed up, in which case sorry for the noise...

Although I understand the preferred method for handling filenames split across 
multiple lines (because they're too long to fit on one line in the message) is 
that suggested in RFC2184/2231, so for example,
            filename*0*=iso-8859-1''accented_characters_here_%EA%CA%E6;
            filename*1*=etc%2Epdf
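
As a rough illustration of how a receiver might reassemble such a parameter, here is a minimal Python sketch. The function name and the dict-of-raw-parameters input are assumptions made for the example, not anything taken from Dovecot. It relies on the RFC 2231 rule that only segments whose name ends in `*` are percent-decoded, and that the charset and language prefix appears on segment 0 only.

```python
import re
from urllib.parse import unquote_to_bytes


def join_rfc2231_filename(params):
    """Reassemble an RFC 2231 'filename' parameter from its segments.

    `params` is a hypothetical dict of raw parameter names to raw values,
    e.g. {"filename*0*": "iso-8859-1''abc%E9", "filename*1*": "def%2Epdf"}.
    """
    segments = []
    for name, value in params.items():
        m = re.fullmatch(r"filename\*(\d+)(\*?)", name)
        if m:
            # (segment number, is this segment percent-encoded?, raw value)
            segments.append((int(m.group(1)), bool(m.group(2)), value))
    segments.sort()

    charset = "us-ascii"
    raw = b""
    for index, encoded, value in segments:
        if index == 0 and encoded:
            # Segment 0 carries the charset'language' prefix.
            declared, _language, value = value.split("'", 2)
            charset = declared or charset
        if encoded:
            raw += unquote_to_bytes(value)
        else:
            # Unencoded segments are taken literally.
            raw += value.encode("ascii")
    return raw.decode(charset)


print(join_rfc2231_filename({
    "filename*0*": "iso-8859-1''accented_characters_here_%EA%CA%E6",
    "filename*1*": "etc%2Epdf",
}))
```

The percent-decoded bytes are concatenated before the charset is applied, so a multi-byte character split across two segments would still decode correctly.
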

I find that some mail clients do this instead,
        filename="=?ISO-8859-1?Q?accented_characters_here_=EA=CA=E6?=
        =?ISO-8859-1?Q?etc=2Epdf?="

In Dovecot this results in,
0 fetch 25 body
* 25 FETCH (BODY (("text" "plain" ("charset" "ISO-8859-1") NIL NIL "7bit" 239 
8)("application" "pdf" ("name" 
"=?ISO-8859-1?Q?accented_characters_here_=EA=CA=E6?= 
=?ISO-8859-1?Q?etc=2Epdf?=") NIL NIL "base64" 219130) "mixed"))

Note especially the unwanted space, or rather the sequence ?= =?, between the 
two sections of the filename. A possible tweak for Dovecot would be to combine 
the filename parts in this situation, removing the ?= =?. I'm not sure whether 
an IMAP client should be expected to combine the parts itself when it receives 
them in their current format. FWIW I see that Courier does the same as Dovecot 
in this situation.
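
For comparison, this is roughly what a lenient client would have to do with the value as it currently comes back. It is only a sketch in Python: the helper name is made up for the example, and it does not describe how Dovecot or any particular client behaves. It follows the RFC 2047 section 6.2 display rule that whitespace separating two adjacent encoded-words is ignored, then decodes each encoded-word in turn.

```python
import base64
import binascii
import re

# =?charset?encoding?payload?=  (encoding is Q or B, case-insensitive)
ENCODED_WORD = re.compile(r"=\?([^?]+)\?([QqBb])\?([^?]*)\?=")


def decode_encoded_words(value):
    """Decode RFC 2047 encoded-words found in a parameter value (lenient)."""
    # Drop whitespace (including folded line breaks) between adjacent
    # encoded-words, i.e. turn "?= =?" into "?==?".
    value = re.sub(r"(\?=)\s+(=\?)", r"\1\2", value)

    def _decode(match):
        charset, encoding, payload = match.groups()
        if encoding.upper() == "Q":
            # header=True makes "_" decode to a space, as RFC 2047 requires.
            raw = binascii.a2b_qp(payload, header=True)
        else:
            raw = base64.b64decode(payload)
        return raw.decode(charset, errors="replace")

    return ENCODED_WORD.sub(_decode, value)


name = ("=?ISO-8859-1?Q?accented_characters_here_=EA=CA=E6?= "
        "=?ISO-8859-1?Q?etc=2Epdf?=")
print(decode_encoded_words(name))  # accented characters here êÊæetc.pdf
```

Each encoded-word is decoded on its own here, which is enough for the samples in this message. A sender that split a multi-byte character across two encoded-words would need the raw bytes joined before decoding.
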

I think the 'alternative' method of splitting filenames I'm raising breaks 
RFC2047 (details below), but unfortunately this method is used by some large 
email generators like gmail - also details below.

Key bits from RFC2047 section 5 part (3): an encoded-word may stand in for a 
'word' within a 'phrase', but is not allowed in a parameter of a MIME 
Content-Type / Content-Disposition field:

	phrase = 1*( encoded-word / word )

	An 'encoded-word' MUST NOT be used in parameter of a MIME
	Content-Type or Content-Disposition field, or in any structured
	field body except within a 'comment' or 'phrase'.

Here are the mail clients I noted this issue with (original filenames obscured 
because I've been examining my client's emails for this issue, with their 
permission):

AOL:
X-Mailer: Webmail 33953-STANDARD
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="=?utf-8?Q?abcde?=
 =?utf-8?Q?abcde=C3=A9abcde.jpg?="
Content-Type: image/jpeg; name="=?utf-8?Q?abcde?=
 =?utf-8?Q?abcde=C3=A9abcde.jpg?="

Gmail:
Content-Type: application/pdf; 
	name="=?ISO-8859-1?Q?with_a_=EA=CA=E6_super=2Dlong_name_that=27s_bound?=
	=?ISO-8859-1?Q?_to_overflow_a_line_boundary_to_test_gmail=2Epdf?="
Content-Disposition: attachment; 
	filename="=?ISO-8859-1?Q?with_a_=EA=CA=E6_super=2Dlong_name_that=27s_bound?=
	=?ISO-8859-1?Q?_to_overflow_a_line_boundary_to_test_gmail=2Epdf?="

X-Mailer: YahooMailWebService/0.8.113.313619
Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document;
 name="=?utf-8?B?base64encodedstring?=
 =?utf-8?B?base64encodedstring?="
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="=?utf-8?B?base64encodedstring?=
 =?utf-8?B?base64encodedstring?="

X-Mailer: Lotus Notes Release 6.5.5 November 30, 2005:
Content-type: application/pdf;
        name="=?ISO-8859-1?Q?abcde=E9abcde=E9abcde=E9?=
        =?ISO-8859-1?Q?abcde=2Cl=2Epdf?="
Content-Disposition: attachment;
        filename="=?ISO-8859-1?Q?abcde=E9abcde=E9_abcde=E9?=
        =?ISO-8859-1?Q?abcde=2Cl=2Epdf?="
Content-ID: <20__=snip>
Content-transfer-encoding: base64

X-Mailer: Lotus Domino Web Server Release 6.5.5FP1 HF551 November 27, 2007:
Content-type: application/pdf;
        name="=?windows-1252?Q?abcde_=28=E9?=
        =?windows-1252?Q?=29=2Epdf?="
Content-Disposition: attachment; filename="=?windows-1252?Q?abcde_=28=E9?=
        =?windows-1252?Q?=29=2Epdf?="
Content-transfer-encoding: base64

Timo also noted the same style of filename encoding in Apple Mail in the 
previous thread I started. It would be interesting to try Apple Mail with a 
filename long enough to be split across multiple lines and see how it encodes 
the filename then:

> Looks like Apple Mail also sends:
> > Content-Type: application/octet-stream;
> >         name="=?iso-8859-1?Q?p=E4=E4?="

Best regards,

Andrew.


