[Dovecot] FTS and compound searches

Tony Pyro tpyro at mail.com
Sun Dec 12 18:51:35 EET 2010


On Dec 11, 2010, at 4:42 AM, Stan Hoeppner wrote:

> Tony Pyro put forth on 12/10/2010 4:29 PM:
>> Hello,
>> 
>> New subscriber here. I noticed that the FTS index is not used in compound searches. Is this expected? Tested in 2.0.0 and 2.0.8:
>> 
>> . search BODY "waldo"
>> * SEARCH
>> . OK Search completed (0.000 secs).
>> . SEARCH CHARSET UTF-8 OR SUBJECT "waldo" FROM "waldo" 
>> * SEARCH
>> . OK Search completed (1.768 secs).
>> . SEARCH CHARSET UTF-8 OR SUBJECT "waldo" BODY "waldo"
>> * OK Searched 0% of the mailbox, ETA 9605:25
>> * OK Searched 4% of the mailbox, ETA 6:39
>> * OK Searched 6% of the mailbox, ETA 6:58
>> * OK Searched 8% of the mailbox, ETA 6:54
>> 
>> It's a problem for us because the Afterlogic webmail client does not offer a body-only search. The two search options are From + To + Subject, or "entire messages", which puts together a large OR query:
>> 
>> SRCH1069 SEARCH CHARSET UTF-8 OR (OR (OR FROM "waldo" TO "waldo") SUBJECT "waldo") BODY "waldo"
>> 
>> I also checked whether the header fields are included in the FTS index, but they do not appear to be: the search "TO gmail.com" returned more results than "BODY gmail.com".
>> 
>> plugin {
>>  fts = squat
>> }
>> protocol imap {
>>  mail_plugins = " fts fts_squat"
>> }
> 
> Was the above performed with a cold or hot Squat index?  If cold,
> performance will always suck.  That's the big downside of Squat.  The
> indexes must be hot.  Unless Timo fixed this in 2.0.x and I missed
> seeing the announcement.
> 
> My dovecot hardware is absolutely ancient, dual 500 MHz machine.  Cold
> Squat searches on a 15K mbox mailbox take upwards of 1.5 minutes.  With
> the index hot, any search on that same folder takes a fraction of a second.
> 
> I'm guessing your solution, which has been mentioned on list before, is
> to write a basic search script and run it nightly, twice a day, or more
> often, depending on your needs, to keep the index hot.  Whenever Squat
> has to rebuild the index, the initial search takes forever, often making
> it slower than not using Squat at all.
> 
> -- 
> Stan
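
For reference, a warming script along the lines Stan describes might look like the sketch below. It assumes Python's standard imaplib; the function name warm_mailboxes, the search term, and the host and credentials in the usage comment are all placeholders, and whether a periodic search really keeps the Squat index hot depends on the setup.

```python
import imaplib


def warm_mailboxes(conn, mailboxes, term="warmup"):
    """Run one BODY search per mailbox so the FTS index gets touched.

    conn is any object with imaplib-style select() and search() methods.
    Returns a dict mapping each mailbox name to the last status string
    the server returned for it.
    """
    results = {}
    for name in mailboxes:
        # Open read-only so the warming run never changes message flags.
        status, _ = conn.select(name, readonly=True)
        if status != "OK":
            results[name] = status
            continue
        # The search result is discarded; only the index access matters.
        status, _ = conn.search(None, "BODY", '"%s"' % term)
        results[name] = status
    return results


# Hypothetical usage from cron (host and credentials are placeholders):
#   conn = imaplib.IMAP4_SSL("imap.example.com")
#   conn.login("user", "password")
#   print(warm_mailboxes(conn, ["INBOX"]))
#   conn.logout()
```

Mailbox names containing spaces would need IMAP quoting before being passed to select(); the sketch does not handle that.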


I had just built the index minutes before. It's not that FTS is slow overall. Look carefully at the three example searches: BODY alone finishes in 0.000 secs and the header-only OR in 1.768 secs, but the compound search combining a header field with BODY reports an ETA of several minutes. It's as if FTS is only used in the simplest of searches.

Incidentally, the same thing happens when combining two BODY searches (search OR BODY "waldo" BODY "carmen"). Repeating the search doesn't improve its performance.
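
To make the comparison repeatable, the queries can be generated and timed from a small script. This is only a sketch, assuming Python's standard imaplib; the helper names or_query and time_search are made up for illustration, and search values are not escaped, so they must not contain double quotes.

```python
import time


def or_query(pairs):
    """Build a left-nested IMAP OR query from (key, value) pairs.

    Four pairs produce the same shape as the webmail query quoted above:
    OR (OR (OR FROM "x" TO "x") SUBJECT "x") BODY "x"
    """
    terms = ['%s "%s"' % (key, value) for key, value in pairs]
    query = terms[0]
    for i, term in enumerate(terms[1:]):
        if i > 0:
            # Parenthesize the OR built so far before nesting it again.
            query = "(%s)" % query
        query = "OR %s %s" % (query, term)
    return query


def time_search(conn, query, charset="UTF-8"):
    """Run one SEARCH and return (elapsed seconds, number of hits).

    conn is an imaplib connection with a mailbox already selected.
    """
    start = time.perf_counter()
    status, data = conn.search(charset, query)
    elapsed = time.perf_counter() - start
    hits = len(data[0].split()) if status == "OK" and data and data[0] else 0
    return elapsed, hits
```

Calling time_search() with or_query([("BODY", "waldo")]), then with or_query([("SUBJECT", "waldo"), ("FROM", "waldo")]), then with or_query([("SUBJECT", "waldo"), ("BODY", "waldo")]) reproduces the three searches from my first message.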

