
Why does qmail keep quitting?

Posted: Tue Aug 08, 2006 11:52 am
by Snapdragon
This is really getting frustrating... for some reason, qmail stops processing mail. I upgraded to Plesk 8.0.1 and the problem went away, until this morning.

In Plesk, it shows Qmail as "stopped" -- however, the box will continue to receive mail, but it won't process it. The moment I hit "go" in Plesk, users are bombarded with queued-up mail.

It's even more frustrating because, since the box is still accepting mail, Alertra doesn't pick up on a "down" service and notify me.

I don't use the Plesk watchdog - THAT crashed qmail three times the first day it was turned on.

Any suggestions?? My queues are no bigger than 200-300 messages when this happens.

The ONLY change I've made recently was the doublebounce config, I put # in and then hupped qmail.

Posted: Tue Aug 08, 2006 8:49 pm
by scott
We're going to need to collect more information first. Don't use the PSA control panel the next time this happens; ssh to the box and look at which processes are running. There's a separate process for each part of qmail: one handles incoming mail, and another (qmail-local) handles delivery of mail to the user's mailbox. I suspect that process is dying, but until there is more information I'm just guessing.

Posted: Wed Aug 09, 2006 12:56 am
by Snapdragon
OK, I'll get an output next time it happens.

Posted: Tue Aug 15, 2006 3:49 pm
by Snapdragon
OK, it happened again. I captured the output of ps -aux before and after the restart, and here is the grep for qmail on each:

Code:

# grep qmail beforerestart
qmaild   13585  0.0  0.0  3448  860 ?        S    13:41   0:00 /var/qmail/bin/qm

Code:

# grep qmail afterrestart
qmails   13961  0.6  0.0  1568  476 ?        S    13:42   0:00 qmail-send
qmaill   13963  0.2  0.0  1524  432 ?        S    13:42   0:00 splogger qmail
root     13964  0.2  0.0  1540  352 ?        S    13:42   0:00 qmail-lspawn ./Ma
qmailr   13965  0.0  0.0  1540  364 ?        S    13:42   0:00 qmail-rspawn
qmailq   13966  0.1  0.0  1512  320 ?        S    13:42   0:00 qmail-clean
qmailr   13967  0.0  0.0  3416  964 ?        S    13:42   0:00 qmail-remote 0733
qmailr   13969  0.0  0.0  3420  968 ?        S    13:42   0:00 qmail-remote 0733
qmailr   13977  0.0  0.0  3432  968 ?        S    13:42   0:00 qmail-remote 0733
qmailr   13989  0.0  0.0  3416  968 ?        S    13:42   0:00 qmail-remote 0733
qmailr   13994  0.0  0.0  3408  964 ?        S    13:42   0:00 qmail-remote 0733
qmailr   13998  0.0  0.0  3424  964 ?        S    13:42   0:00 qmail-remote usa.
qmailr   14003  0.0  0.0  3428  964 ?        S    13:42   0:00 qmail-remote 0733
qmailr   14017  0.0  0.0  3420  968 ?        S    13:42   0:00 qmail-remote hj.c
qmailr   14038  0.0  0.0  3432  968 ?        S    13:42   0:00 qmail-remote 0733
qmailr   14042  0.0  0.0  3428  968 ?        S    13:42   0:00 qmail-remote hj.c
qmailr   14047  0.0  0.0  3416  964 ?        S    13:42   0:00 qmail-remote 0733
qmailr   14052  0.0  0.0  3412  960 ?        S    13:42   0:00 qmail-remote 1-jo
qmailr   14142  0.0  0.0  3428  968 ?        S    13:42   0:00 qmail-remote 0733
qmailr   14167  0.0  0.0  3424  968 ?        S    13:42   0:00 qmail-remote gkjs
qmailr   14182  0.0  0.0  3416  968 ?        S    13:42   0:00 qmail-remote telu
qmailr   14184  0.0  0.0  3424  968 ?        S    13:42   0:00 qmail-remote asd.
qmailr   14218  0.0  0.0  3428  972 ?        S    13:42   0:00 qmail-remote hotn
qmailr   14356  0.0  0.0  3412  972 ?        S    13:42   0:00 qmail-remote yaho
qmailr   15560  0.0  0.0  3412  964 ?        S    13:43   0:00 qmail-remote msn.
qmailr   16079  0.2  0.0  3432  968 ?        S    13:43   0:00 qmail-remote ndm.
popuser  16296  0.0  0.0  1424  308 ?        S    13:43   0:00 bin/qmail-local -
popuser  16309  0.0  0.0  1424  312 ?        S    13:43   0:00 bin/qmail-local -
Obviously qmail is completely dying; the question now is why. The only thing I did since the 8.0.1 upgrade was to put the doublebounce rule in, and now I've taken it out.
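
The before/after comparison above can also be done mechanically. The following is a minimal Python sketch, not anything Plesk ships: it assumes `ps aux`-style columns and assumes the four long-running daemons of a stock qmail install (qmail-send, qmail-lspawn, qmail-rspawn, qmail-clean), which matches the "after restart" listing. The function names and the sample strings are made up for illustration; the sample lines are copied from the listings in this post.

```python
# Long-running daemons of a stock qmail install. Assumption: the Plesk build
# runs the same set, as the "after restart" listing suggests.
EXPECTED_DAEMONS = ("qmail-send", "qmail-lspawn", "qmail-rspawn", "qmail-clean")


def running_commands(ps_output):
    """Return the set of program names found in `ps aux`-style output."""
    commands = set()
    for line in ps_output.splitlines():
        # ps aux has 11 columns; the last one is the full command line.
        fields = line.split(None, 10)
        if len(fields) < 11 or fields[0] == "USER":
            continue  # skip blank lines, short lines and the header row
        # Keep only the program name: first word, directory part removed.
        commands.add(fields[10].split()[0].rsplit("/", 1)[-1])
    return commands


def missing_daemons(ps_output):
    """List the expected qmail daemons that do not appear in the output."""
    running = running_commands(ps_output)
    return [name for name in EXPECTED_DAEMONS if name not in running]


# Sample input taken from the two listings above (abbreviated).
BEFORE = "qmaild   13585  0.0  0.0  3448  860 ?        S    13:41   0:00 /var/qmail/bin/qm"
AFTER = """\
qmails   13961  0.6  0.0  1568  476 ?        S    13:42   0:00 qmail-send
qmaill   13963  0.2  0.0  1524  432 ?        S    13:42   0:00 splogger qmail
root     13964  0.2  0.0  1540  352 ?        S    13:42   0:00 qmail-lspawn ./Ma
qmailr   13965  0.0  0.0  1540  364 ?        S    13:42   0:00 qmail-rspawn
qmailq   13966  0.1  0.0  1512  320 ?        S    13:42   0:00 qmail-clean
popuser  16296  0.0  0.0  1424  308 ?        S    13:43   0:00 bin/qmail-local -
"""

if __name__ == "__main__":
    print("missing before restart:", missing_daemons(BEFORE))
    print("missing after restart: ", missing_daemons(AFTER))
```

Run from cron against live `ps aux` output, a check like this would flag the "stopped but still accepting mail" state that a port-25 probe cannot see.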

Posted: Tue Aug 15, 2006 9:35 pm
by scott
What distro are you on?

Just speculating here, but my guess is that something is restarting qmail, qmail has some kind of problem, and it errors out.

Posted: Tue Aug 15, 2006 11:11 pm
by Snapdragon
RHEL 3 ES

# uname -a
Linux 2.4.21-15.ELsmp #1 SMP Thu Apr 22 00:18:24 EDT 2004 i686 i686 i386 GNU/Linux

I tried to upgrade the kernel before and got all sorts of nasty disk errors, so I stayed with this kernel.

An error should be logged somewhere I could check, no?
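
One place to look, as a hedged sketch: the `splogger qmail` process in the listing above means qmail-send's messages go to syslog, and qmail-send documents "alert:" lines for serious problems and "status: exiting" for a shutdown (see qmail-log(5)). The Python below only filters lines for those two strings. Where the mail log lives on a given Plesk box is not stated in this thread, so the path is left to the caller, and the two sample lines are invented for illustration rather than copied from a real log.

```python
import re

# Strings qmail-send itself writes, per qmail-log(5):
#   "alert: ..."      a serious problem
#   "status: exiting" qmail-send is shutting down
INTERESTING = re.compile(r"\balert: |\bstatus: exiting\b")


def shutdown_events(log_lines):
    """Return the log lines that mention an alert or a qmail-send exit."""
    return [line.rstrip("\n") for line in log_lines if INTERESTING.search(line)]


def scan_file(path):
    """Scan a mail log file. The path is whatever syslog uses on your box."""
    with open(path, errors="replace") as handle:
        return shutdown_events(handle)


# Invented sample lines (hypothetical host name and timestamps).
SAMPLE = [
    "Aug 15 13:38:02 host qmail: 1155663482.100000 status: local 0/10 remote 3/20",
    "Aug 15 13:40:11 host qmail: 1155663611.200000 status: exiting",
]

if __name__ == "__main__":
    for event in shutdown_events(SAMPLE):
        print(event)
```

If qmail-send is killed outright rather than shut down, it may log nothing at all, so an empty result does not rule out an external restart.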

Posted: Thu Aug 24, 2006 1:50 pm
by Snapdragon
No other thoughts? It hasn't happened again.. *knock knock* .. but it will.

Posted: Mon Nov 13, 2006 4:53 pm
by jjjheimer
This is happening on my system too. I just did a kernel update from CentOS 3.6 to CentOS 3.8.

Posted: Mon Nov 13, 2006 10:31 pm
by breun
Are you running Watchdog? I've heard stories of Watchdog detecting erroneously that a service has died and trying to restart it, but failing. You might want to try running with Watchdog disabled and see how things go.

Posted: Mon Nov 13, 2006 11:50 pm
by jjjheimer
Coincidentally, I enabled Watchdog at the same time. I disabled Watchdog and qmail is now running fine. Thanks, SWsoft!

Posted: Tue Nov 14, 2006 8:09 pm
by Snapdragon
breun wrote: Are you running Watchdog? I've heard stories of Watchdog detecting erroneously that a service has died and trying to restart it, but failing. You might want to try running with Watchdog disabled and see how things go.
Watchdog crashed qmail 5 times the day it was installed. I turned it off and never touched it again.

Posted: Thu Jun 12, 2008 12:44 pm
by Snapdragon
Hah, funny going back and looking at these posts. I've never, ever touched Watchdog since. I rely on Alertra to tell me when things break, and qmail hasn't had a beef since.