[Dovecot] unkillable imap process(es) with high CPU-usage
Hello,
I am having a problem with my dovecot-daemon. It is forking one or more (I saw up to perhaps 8 of them) imap processes under my user name. These processes are consuming a lot of CPU time and are not killable:
  PID USER  PR NI VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
 8616 arno  20  0 2900 1600 1204 R   98  0.2  1196:38 imap
Stopping dovecot does not quit these processes. Killing them (even "kill -9" as root) is not possible. The only solution to get rid of them is to reboot.
I have found a mailing list post from someone who seemed to have the same problem. He solved it by upgrading to version 1.1, but in my case this did not help.
Is this a known problem? What could I check or do?
My system:
very up to date debian/sid (sidux)
kernel 2.6.27-8.slh.1-sidux-686
CPU: Intel(R) Core(TM)2 CPU 4300 @ 1.80GHz
Filesystem: ext3fs
tried dovecot from sid: 1:1.0.15-2.3
tried dovecot from experimental: 1:1.1.2-3
My dovecot.conf is the original debian configuration with only one line changed into: protocols = imaps
dovecot -n
1.1.2: /etc/dovecot/dovecot.conf
log_timestamp: %Y-%m-%d %H:%M:%S
protocols: imaps
login_dir: /var/run/dovecot/login
login_executable: /usr/lib/dovecot/imap-login
mail_privileged_group: mail
auth default:
  passdb:
    driver: pam
  userdb:
    driver: passwd
I am using dovecot locally on my system with me as the only user. As client I am using thunderbird alias icedove 2.0.0.17-1. Icedove retrieves the mails from another imap server and sorts them into Maildir folders in dovecot. I do not know what triggers the imap processes to go mad. It happens after the system (and the mail client) has been up for a while.
Do you need more information?
Thanks, Arno
On Thu, 2008-12-11 at 11:36 +0100, Arno Wald wrote:
If you can't kill a process with -9, the bug is in the kernel and there's nothing Dovecot can do about it. User space processes can't create unkillable processes unless something's broken.
Although you could see if "strace -p <pid>" prints something. It's doubtful that it will, though, if you can't kill the process.
Timo Sirainen wrote:
If you can't kill a process with -9, the bug is in the kernel and
Do you have an idea where and how I could report this (against the kernel package?). Perhaps I will try an original debian kernel instead of the sidux kernel first.
I have tried this already without success. The strace did not print anything and hung, and could not be stopped with CTRL-C.
Thank you for your answer, Arno
Hi,
On Thu, 11 Dec 2008, Arno Wald wrote:
Timo Sirainen wrote:
If you can't kill a process with -9, the bug is in the kernel and
If there is a blocked IO operation, like a lost NFS mount point, processes appear unkillable. As soon as the share is back, the processes die. I've seen this several times.
matthias
On Thu, 2008-12-11 at 23:08 +0100, Edgar Fuß wrote:
Yes, but I'd also argue that any long enough uninterruptible sleep is a bug. :) I hate it when NFS operations hang..
Anyway, Arno's ps output showed the process to be in R state, not in D state. Unless that was some kind of copy&paste mistake, that makes it sound more like a bug.
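The R versus D distinction can also be read straight from /proc, without attaching strace. What follows is only a minimal Python sketch, assuming a Linux /proc filesystem with the field layout documented in proc(5). The names ProcSnapshot, parse_proc_stat, read_snapshot and sample are made up for this illustration and are not part of Dovecot. It reports the state letter and how many clock ticks the process spent in user mode and in kernel mode during a short interval.

```python
import time
from dataclasses import dataclass


@dataclass
class ProcSnapshot:
    state: str  # one-letter state, e.g. "R" (running) or "D" (uninterruptible sleep)
    utime: int  # clock ticks spent in user mode so far
    stime: int  # clock ticks spent in kernel mode so far


def parse_proc_stat(stat_line: str) -> ProcSnapshot:
    """Parse the text of /proc/<pid>/stat (layout as described in proc(5))."""
    # Field 2 (the command name) is wrapped in parentheses and may itself
    # contain spaces or ")", so split after the LAST closing parenthesis.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field N sits at rest[N - 3]:
    # utime is field 14, stime is field 15.
    return ProcSnapshot(state=rest[0], utime=int(rest[11]), stime=int(rest[12]))


def read_snapshot(pid: int) -> ProcSnapshot:
    with open(f"/proc/{pid}/stat") as f:
        return parse_proc_stat(f.read())


def sample(pid: int, interval: float = 1.0):
    """Return (state, user ticks used, kernel ticks used) over `interval` seconds."""
    before = read_snapshot(pid)
    time.sleep(interval)
    after = read_snapshot(pid)
    return after.state, after.utime - before.utime, after.stime - before.stime
```

Calling sample(8616) for the process shown earlier would return its state letter and the two tick deltas. A process in R state whose kernel-mode ticks keep growing would be consistent with a loop inside the kernel, but that reading is a rule of thumb, not proof.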
Timo Sirainen wrote:
Anyway, Arno's ps output showed the process to be in R state, not in D
It is definitely the R state.
  PID USER  PR NI VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
 6717 arno  20  0 2964 1608 1192 R  100  0.2  1158:05 imap
btw: I have switched from imaps to imap protocol, because I thought this might change something. But it does not.
Bye, Arno
Timo Sirainen wrote:
You could see if compiling Dovecot without inotify/dnotify support would help. I can't really think of anything else.
I would like to try this and report the result. But there are so many configure-options that I do not know which options (and how) I should dis/enable. Could anybody give me the command line for the ./configure? That would be very kind.
Or does it make more sense to try another kernel first?
Thanks, Arno
Timo Sirainen wrote:
Or does it make more sense to try another kernel first?
I guess that could also help.
I started testing this with another kernel (2.6.26-6.slh.1-sidux-686). Until now (the last 15 hours) no such failing imap process has shown up.
So I guess it happens with 2.6.27 kernels. I will watch this through Monday to be sure.
btw: On another PC at home, also with a sidux 2.6.27 kernel, I had such an imap process yesterday, too. (AMD Athlon(tm) XP 2200+)
Bye, Arno
David Rosenstrauch wrote:
Has anyone reported this over on LKML yet? Or filed a bug?
Not yet. First I would like to test my self-compiled dovecot without inotify against 2.6.27. Second, I do not know where and how to report kernel issues. (Also I am a little bit afraid of the whole kernel stuff, because I do not know much about it.)
Arno
A new status report regarding this issue:
Dovecot on my PC in the office is still running fine with kernel 2.6.26.
Dovecot with the latest kernel 2.6.27-9.slh.1-sidux-686 on my PC at home showed the unkillable imap processes after a few minutes.
Now I have been running dovecot compiled without inotify support on this kernel for about 70 minutes without any problems.
So I really think that the inotify code in kernel 2.6.27 causes the problem. (I will tell you if the imap processes unexpectedly cause problems again in the current configuration.)
So where are kernel issues reported? I will try to find out.
Greetings, Arno
Timo Sirainen wrote:
One more thing you could try: Does the hang happen if you use configure --with-notify=dnotify ?
I do not know if this is still interesting. But after notify=none ran for more than 3 hours without any problems, I have now been testing dnotify for approximately 30 minutes without any problems, too.
Ciao, Arno
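For anyone repeating this test, the two builds differ only in the value passed to --with-notify. The following is a rough Python sketch that builds the configure command line for the two values quoted in this thread (dnotify and none). It assumes it is run from an unpacked Dovecot source tree whose configure script accepts these values. The helper name configure_command and the NOTIFY_BACKENDS tuple are hypothetical, and any other configure options are deliberately left out.

```python
# Only the values that appear in this thread; other backends are not assumed here.
NOTIFY_BACKENDS = ("dnotify", "none")


def configure_command(notify, extra_args=()):
    """Return the argv list for ./configure with the chosen notify backend."""
    if notify not in NOTIFY_BACKENDS:
        raise ValueError(f"unexpected notify backend: {notify!r}")
    # extra_args is a placeholder for whatever other options a build needs.
    return ["./configure", f"--with-notify={notify}", *extra_args]
```

The returned list can be handed to subprocess.run(..., check=True, cwd=source_dir), where source_dir is wherever the Dovecot sources were unpacked, followed by the usual make.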
I am running the older debian/sid dovecot 1:1.0.15-2.3 again, now with kernel 2.6.27-10.slh.1-sidux-686, and the issue seems to be fixed. So I recommend using at least 2.6.27.10.
Bye, Arno.
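To check whether a machine falls into the range reported in this thread, the kernel release string can be compared against 2.6.27.10. This is a sketch only, and it rests on two assumptions taken from the observations above rather than from any kernel changelog: that the problem is present from 2.6.27 up to, but not including, 2.6.27.10, and that the "-N" suffix in sidux release names corresponds to the stable patch level. Distribution kernels with backported patches may not fit this pattern. The function names are made up for this example.

```python
import re


def parse_kernel_release(release):
    """Return (major, minor, patch, stable) from a `uname -r` style string."""
    # Accept both "2.6.27.9..." and the sidux style "2.6.27-9....".
    m = re.match(r"(\d+)\.(\d+)\.(\d+)(?:[.-](\d+))?", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch, stable = m.groups()
    return (int(major), int(minor), int(patch), int(stable or 0))


def looks_affected(release):
    """True if the release lies in the range this thread reports as problematic."""
    # ASSUMPTION: the range is inferred from the reports in this thread only.
    return (2, 6, 27, 0) <= parse_kernel_release(release) < (2, 6, 27, 10)
```

On a running system the string can be taken from os.uname().release. A True result only means the version matches the reports here, not that the machine will actually hang.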
I have this EXACT same problem after upgrading to SuSE 11.1, which uses this exact kernel version!!
After reading this, I was excited to think that if I killed the nfsserver daemon (which I had running for no good reason), that it would sort my problem....
Sure enough, my computer - which up to now had been going unresponsive every 24 hours - was running fine for 72 hours and then BOOM... it happened again.
Just wanted to let people know that it seems that at the minute, the dovecot that ships with SuSE and the kernel they are using in 11.1 exhibit this problem.
Gino
Arno Wald wrote:
On Feb 14, 2009, at 9:57 AM, agent59624285 wrote:
This is sounding similar to the problem I have with my setup:
High CPU usage.
Can't kill IMAP.
Server becomes unresponsive.
I'm using CentOS 5.2 64-bit version with the latest cPanel.
So what am I missing, other than the problem nobody else is having is
clearly something they ARE having?
Peace, Gene
On Feb 15, 2009, at 12:43 PM, Timo Sirainen wrote:
Some of this is above my pay grade (so forgive the imprecision), but I
did try to restart IMAP in cPanel with no success, assuming I catch it
before the load makes it impossible to do anything.
Yes, I have been able to kill processes by the standard ID number. I
did that with rsync the other day when changing the backup parameters.
Here's the kernel info on my box:
Linux server.paracastworld.net 2.6.27.9rootserver-20081216a #1 SMP Tue
Dec 16 02:29:13 EST 2008 x86_64
So that's a buggy kernel?
Or is the 2.6.27.9 version better than 2.6.27 in this regard?
Peace, Gene
On Feb 15, 2009, at 12:53 PM, Seth Mattinen wrote:
I'd have to switch back to Dovecot in order to test for this, and
arrange to have someone with far more expertise than I possess to
continue monitoring the server to catch this when it happens. cPanel
support can't be expected to devote that much attention to preventive
medicine. My admin could do it, I suppose, though I'm only one of his
smaller clients, so I wouldn't expect it either.
As I said, I'm inclined to want to try this again for testing, if
someone would work with me on the initial setup, switching to a later
version of Dovecot than the one that cPanel operates with (if it'll
still integrate with cPanel -- is that possible?).
Peace, Gene
participants (10)
- agent59624285
- Arno Wald
- Cor Bosman
- David Rosenstrauch
- Edgar Fuß
- Gene Steinberg
- Matthias Rieber
- nuitari-dovecot@nuitari.net
- Seth Mattinen
- Timo Sirainen