[Dovecot] Leaky dovecot-auth ?

Christian Balzer chibi at gol.com
Mon Jul 2 09:20:14 EEST 2007


On Wed, 27 Jun 2007 23:15:32 +0300 Timo Sirainen <tss at iki.fi> wrote:
> On Thu, 2007-06-21 at 16:49 +0900, Christian Balzer wrote:
> > > You could try
> > > http://dovecot.org/patches/debug/mempool-accounting.diff and send
> > > USR1 signal to dovecot-auth after a while. It logs how much memory
> > > is used by all existing memory pools. Each auth request has its own
> > > pool, so if it's really leaking them it's probably logging a lot of
> > > lines. If not, then the leak is elsewhere.
> > > 
> > I grabbed the Debian package source on a test machine (not gonna chance
> > anything on the production servers), applied the patch, did add
> > --enable-debug to the debian/rules file (and got the #define DEBUG 
> > in config.h), created the binary packages, installed, configured,
> > started them, tested a few logins and... nothing gets logged 
> > in mail.* if I send a USR1 to dovecot-auth. Anything I'm missing?
> 
> Bug, fixed: http://hg.dovecot.org/dovecot-1.0/rev/a098e94cd318
> 
Thanks, that fixed the silence of the auth-sheep.

This is the output after start-up:
---
Jul  2 13:59:54 engtest03 dovecot: auth(default): pool auth request handler: 104 / 4080 bytes
Jul  2 13:59:54 engtest03 last message repeated 19 times
Jul  2 13:59:54 engtest03 dovecot: auth(default): pool passwd_file: 56 / 10224 bytes
Jul  2 13:59:54 engtest03 dovecot: auth(default): pool Environment: 224 / 2032 bytes
Jul  2 13:59:54 engtest03 dovecot: auth(default): pool ldap_connection: 576 / 1008 bytes
Jul  2 13:59:54 engtest03 dovecot: auth(default): pool auth: 1520 / 2032 bytes
---

Memory used by dovecot-auth after 1 login was 3148KB (RSS).

This is after a good thrashing with rabid (from the postal package), with
just 2 users though, using POP3 logins:
---
Jul  2 14:12:30 engtest03 dovecot: auth(default): pool auth request handler: 104 / 4080 bytes
Jul  2 14:12:30 engtest03 last message repeated 128 times
Jul  2 14:12:30 engtest03 dovecot: auth(default): pool passwd_file: 56 / 10224 bytes
Jul  2 14:12:30 engtest03 dovecot: auth(default): pool Environment: 224 / 2032 bytes
Jul  2 14:12:30 engtest03 dovecot: auth(default): pool ldap_connection: 576 / 1008 bytes
Jul  2 14:12:30 engtest03 dovecot: auth(default): pool auth: 1520 / 2032 bytes
---
Note that the number of auth request handler pools has grown to 128.
After another short round of rabid the handler pools grew to 137 and the
size of dovecot-auth to 5100KB. The number of handler pools never fell,
nor did the memory footprint, obviously. :-p
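
For comparing several USR1 dumps, the pool lines can be tallied with a small
script rather than by eye. The following is only a rough sketch in Python: the
function name summarize_pools is made up for illustration, and it assumes the
log format shown above plus syslog's "last message repeated N times"
compaction. It counts the original line plus its N repeats, so its totals can
be one higher than a number read off the repeat line alone.

```python
import re

# Pool lines as logged by the mempool-accounting debug patch, e.g.
# "... dovecot: auth(default): pool auth: 1520 / 2032 bytes".
POOL_RE = re.compile(
    r": pool (?P<name>.+?): (?P<used>\d+) / (?P<alloc>\d+) bytes\s*$"
)
# syslog's compaction line, e.g. "... last message repeated 128 times".
REPEAT_RE = re.compile(r"last message repeated (?P<times>\d+) times\s*$")


def summarize_pools(lines):
    """Return {pool name: (pool count, used bytes, allocated bytes)}."""
    totals = {}
    previous = None  # (name, used, alloc) of the pool line just seen
    for line in lines:
        pool = POOL_RE.search(line)
        repeat = REPEAT_RE.search(line)
        if pool:
            previous = (pool["name"], int(pool["used"]), int(pool["alloc"]))
            copies = 1
        elif repeat and previous is not None:
            # The repeat line stands for N more copies of the previous line.
            copies = int(repeat["times"])
        else:
            # Unrelated log line: do not attribute a later repeat to a pool.
            previous = None
            continue
        name, used, alloc = previous
        count, used_sum, alloc_sum = totals.get(name, (0, 0, 0))
        totals[name] = (
            count + copies,
            used_sum + copies * used,
            alloc_sum + copies * alloc,
        )
    return totals
```

Feeding it the lines of one dump (for example from an open log file) gives
one entry per pool name, which makes it easy to see which pool count is
growing between two dumps.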

At about 800k logins/day/node here, it's obvious now why dovecot-auth
explodes after less than a week with a max size of 512MB.
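
As a back-of-the-envelope check of that claim, the arithmetic can be written
down as below. This is a sketch under loud assumptions: the leak is treated as
linear in the number of logins, the 3148KB baseline from the test machine is
assumed to apply to the production nodes, and the 512MB limit is treated as a
plain byte budget. The function names are made up for illustration.

```python
LOGINS_PER_DAY = 800_000             # per node, figure from this mail
LIMIT_BYTES = 512 * 1024 * 1024      # max size of dovecot-auth
BASELINE_BYTES = 3148 * 1024         # RSS after 1 login (test machine, assumed)


def days_until_limit(leak_per_login_bytes):
    """Days until the limit is hit, assuming a constant leak per login."""
    headroom = LIMIT_BYTES - BASELINE_BYTES
    return headroom / (leak_per_login_bytes * LOGINS_PER_DAY)


def leak_per_login_for(days):
    """Average leak per login (bytes) that exhausts the limit in `days`."""
    headroom = LIMIT_BYTES - BASELINE_BYTES
    return headroom / (days * LOGINS_PER_DAY)
```

Under these assumptions, leak_per_login_for(7) comes out a little under 100
bytes, so an average leak of that order per login is already enough to hit
512MB within a week.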

> > But no matter, it is clearly leaking just as bad as 0.99 and I venture
> > that his is the largest installation with LDAP as authentication
> > backend. I wonder if this leak would be avoided by having LDAP lookups
> > performed by worker processes as with SQL. 
> 
> Then you'd only have multiple leaking worker processes.
>
Yes, I realize that. But I presume that since these are designed to die off
and be recreated on the fly, the repercussions would be much milder. ;)
Of course now it looks like this is not LDAP related after all.

> > > The same as 0.99. You could also kill -HUP dovecot when dovecot-auth
> > > is nearing the limit. That makes it a bit nicer, although not
> > > perfectly safe either (should fix this some day..).
> > >
> > If that leak can't be found I would very much appreciate a solution
> > that at least avoids failed and/or delayed logins.
> 
> That would require that login processes don't fail logins if connection
> to dovecot-auth drops, but instead wait until they can connect back to
> it and try again. And maybe another alternative would be to just
> disconnect the client instead of giving login failure.
> 
Anything that fixes this one way or the other would be nice. ^_^

Oh, and HUP'ing the master is not an option here. I guess the system load
triggers a race condition in dovecot, because several times when doing so
I got this:
---
Jun 22 15:08:58 mb11 dovecot: listen(143) failed: Interrupted system call
---
This results in a killed-off dovecot, including all active sessions.

The self-terminating dovecot-auth is not nice, but it is at least more
predictable and recovers by itself:
---
Jun 30 19:03:27 mb12 dovecot: auth(default): pool_system_malloc(): Out of memory
Jun 30 19:03:27 mb12 dovecot: child 11110 (auth) returned error 83 (Out of memory)
Jun 30 19:03:28 mb12 dovecot: pop3-login: Can't connect to auth server at default: Resource temporarily unavailable
Jun 30 19:03:28 mb12 last message repeated 11 times
---
Of course, the 12 users that tried to log in at this time are probably not
amused, or at least confused.

Regards,

Christian
-- 
Christian Balzer        Network/Systems Engineer                NOC
chibi at gol.com   	Global OnLine Japan/Fusion Network Services
http://www.gol.com/

