We are using NGINX from the Atomic repo for load-balancing purposes, and we ran into a problem that forced us to reboot from time to time. I eventually found out what was going on and applied a temporary fix while waiting for Atomic to fix it, but that fix never came out, so my colleagues are pushing me to post here.
We have this problem on all of our NGINX-enabled servers that use the Atomic repo for NGINX, which is around 45 servers in our infrastructure, all Linux, CentOS 6.6 64-bit.
We are using:
# nginx -V
nginx version: nginx/1.6.2
nginx.x86_64 1.6.2-23.el6.art @atomic
on CentOS 6.6:
Linux ESL-DR2-jb-01 2.6.32-504.8.1.el6.x86_64 #1 SMP Wed Jan 28 21:11:36 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
I will not go into every detail of what I did and how I reached this conclusion but, long story short, nginx creates entries in the semaphore arrays that are not cleared on restart or reload.
To be more exact: when you run service start, it creates 2 semaphores, and when you run service stop, it removes those semaphores, which is perfect. BUT the problem is:
if you start nginx, it creates 2 semaphores;
if you then run service configtest, it creates 2 new semaphores that never get deleted;
if you then run service reload, it creates 4 new semaphores;
if you then run service stop, it removes only the last 2 semaphores.
Even if you run no service command at all, log rotation alone generates 2 new semaphores.
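A quick way to watch this behaviour without reading the whole `ipcs -s` table each time is to count the arrays that belong to one owner. This is only a minimal sketch: the helper name `count_sems` is made up for illustration, and the owner value 4294967295 is simply what `ipcs -s` shows for the nginx semaphores on our servers, so it may differ elsewhere.

```shell
#!/bin/sh
# count_sems OWNER: read `ipcs -s`-style output on stdin and print how many
# semaphore arrays have OWNER in the owner column (3rd field).
# Header lines are skipped because their 2nd field is not a numeric semid.
count_sems() {
    awk -v owner="$1" '$3 == owner && $2 ~ /^[0-9]+$/ { n++ } END { print n + 0 }'
}

# Sample taken from the state right after "service nginx start".
sample='------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x00000000 0          root       600        1
0x00000000 65537      root       600        1
0x00000000 6651906    4294967295 600        1
0x00000000 6684675    4294967295 600        1
0x00000000 2326534    p7685      600        1
0x00000000 2359303    p7685      600        1'

printf '%s\n' "$sample" | count_sems 4294967295
```

On a live server the same function can be fed directly, for example `ipcs -s | count_sems 4294967295`, before and after each service command.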
To illustrate this:
Code:
225/1877MB 0.00 0.00 0.00 1/275 7486 UNKOWN ZONE
[7214:7213] [0:1395] 04:42:20 Wed Mar 18 root@l-2001a +1 /home/p7685
SUDO MODE ON
(1:1395)# banner start
##### ####### # ###### #######
# # # # # # # #
# # # # # # #
##### # # # ###### #
# # ####### # # #
# # # # # # # #
##### # # # # # #
225/1877MB 0.00 0.00 0.00 1/275 7486 UNKOWN ZONE
[7214:7213] [0:1396] 04:42:27 Wed Mar 18 root@l-2001a +1 /home/p7685
SUDO MODE ON
(1:1396)# ipcs -s
------ Semaphore Arrays --------
key semid owner perms nsems
0x00000000 0 root 600 1
0x00000000 65537 root 600 1
0x00000000 2326534 p7685 600 1
0x00000000 2359303 p7685 600 1
225/1877MB 0.00 0.00 0.00 1/275 7486 UNKOWN ZONE
[7214:7213] [0:1397] 04:42:31 Wed Mar 18 root@l-2001a +1 /home/p7685
SUDO MODE ON
(1:1397)# service nginx start
Starting nginx: [ OK ]
327/1877MB 0.04 0.01 0.00 1/277 7838 UNKOWN ZONE
[7214:7213] [0:1398] 04:42:36 Wed Mar 18 root@l-2001a +1 /home/p7685
SUDO MODE ON
(1:1398)# ipcs -s
------ Semaphore Arrays --------
key semid owner perms nsems
0x00000000 0 root 600 1
0x00000000 65537 root 600 1
0x00000000 6651906 4294967295 600 1
0x00000000 6684675 4294967295 600 1
0x00000000 2326534 p7685 600 1
0x00000000 2359303 p7685 600 1
327/1877MB 0.04 0.01 0.00 1/277 7838 UNKOWN ZONE
[7214:7213] [0:1399] 04:42:38 Wed Mar 18 root@l-2001a +1 /home/p7685
SUDO MODE ON
(1:1399)# service nginx configtest
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
327/1877MB 0.04 0.01 0.00 1/277 7838 UNKOWN ZONE
[7214:7213] [0:1400] 04:42:49 Wed Mar 18 root@l-2001a +1 /home/p7685
SUDO MODE ON
(1:1400)# ipcs -s
------ Semaphore Arrays --------
key semid owner perms nsems
0x00000000 0 root 600 1
0x00000000 65537 root 600 1
0x00000000 6651906 4294967295 600 1
0x00000000 6684675 4294967295 600 1
0x00000000 6717444 4294967295 600 1
0x00000000 6750213 4294967295 600 1
0x00000000 2326534 p7685 600 1
0x00000000 2359303 p7685 600 1
327/1877MB 0.04 0.01 0.00 1/277 7838 UNKOWN ZONE
[7214:7213] [0:1401] 04:42:54 Wed Mar 18 root@l-2001a +1 /home/p7685
SUDO MODE ON
(1:1401)# service nginx reload
Reloading nginx: [ OK ]
327/1877MB 0.04 0.01 0.00 1/277 7838 UNKOWN ZONE
[7214:7213] [0:1402] 04:43:02 Wed Mar 18 root@l-2001a +1 /home/p7685
SUDO MODE ON
(1:1402)# ipcs -s
------ Semaphore Arrays --------
key semid owner perms nsems
0x00000000 0 root 600 1
0x00000000 65537 root 600 1
0x00000000 6651906 4294967295 600 1
0x00000000 6684675 4294967295 600 1
0x00000000 6717444 4294967295 600 1
0x00000000 6750213 4294967295 600 1
0x00000000 2326534 p7685 600 1
0x00000000 2359303 p7685 600 1
0x00000000 6782984 4294967295 600 1
0x00000000 6815753 4294967295 600 1
0x00000000 6848522 4294967295 600 1
0x00000000 6881291 4294967295 600 1
324/1877MB 0.02 0.01 0.00 1/277 7900 UNKOWN ZONE
[7214:7213] [0:1403] 04:43:06 Wed Mar 18 root@l-2001a +1 /home/p7685
SUDO MODE ON
(1:1403)# service nginx stop
Stopping nginx: [ OK ]
324/1877MB 0.02 0.01 0.00 1/277 7900 UNKOWN ZONE
[7214:7213] [0:1404] 04:43:22 Wed Mar 18 root@l-2001a +1 /home/p7685
SUDO MODE ON
(1:1404)# ipcs -s
------ Semaphore Arrays --------
key semid owner perms nsems
0x00000000 0 root 600 1
0x00000000 65537 root 600 1
0x00000000 6651906 4294967295 600 1
0x00000000 6684675 4294967295 600 1
0x00000000 6717444 4294967295 600 1
0x00000000 6750213 4294967295 600 1
0x00000000 2326534 p7685 600 1
0x00000000 2359303 p7685 600 1
0x00000000 6782984 4294967295 600 1
0x00000000 6815753 4294967295 600 1
324/1877MB 0.02 0.01 0.00 1/277 7900 UNKOWN ZONE
[7214:7213] [0:1405] 04:43:24 Wed Mar 18 root@l-2001a +1 /home/p7685
SUDO MODE ON
(1:1405)# ps -ef |grep nginx
root 7920 7214 0 16:43 pts/1 00:00:00 /bin/grep --color=always nginx
324/1877MB 0.02 0.01 0.00 1/277 7900 UNKOWN ZONE
[7214:7213] [0:1406] 04:43:35 Wed Mar 18 root@l-2001a +1 /home/p7685
SUDO MODE ON
(1:1406)# banner stop
##### ####### ####### ######
# # # # # # #
# # # # # #
##### # # # ######
# # # # #
# # # # # #
##### # ####### #
329/1877MB 0.01 0.01 0.00 1/275 7927 UNKOWN ZONE
[7214:7213] [0:1407] 04:43:46 Wed Mar 18 root@l-2001a +1 /home/p7685
SUDO MODE ON
(1:1407)#
Code:
reload() {
configtest_q || return 6
echo -n $"Reloading $prog: "
killproc -p $pidfile $prog -HUP
echo
}
Code:
326/1877MB 0.00 0.00 0.00 1/275 8247 UNKOWN ZONE
[7214:7213] [0:1456] 04:54:27 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1456)# banner start
##### ####### # ###### #######
# # # # # # # #
# # # # # # #
##### # # # ###### #
# # ####### # # #
# # # # # # # #
##### # # # # # #
326/1877MB 0.00 0.00 0.00 1/275 8247 UNKOWN ZONE
[7214:7213] [0:1457] 04:54:30 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1457)# service nginx start
Starting nginx: [ OK ]
326/1877MB 0.00 0.00 0.00 1/275 8247 UNKOWN ZONE
[7214:7213] [0:1458] 04:54:38 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1458)# ipcs -s
------ Semaphore Arrays --------
key semid owner perms nsems
0x00000000 0 root 600 1
0x00000000 65537 root 600 1
0x00000000 7766018 4294967295 600 1
0x00000000 7798787 4294967295 600 1
0x00000000 2326534 p7685 600 1
0x00000000 2359303 p7685 600 1
326/1877MB 0.00 0.00 0.00 1/275 8247 UNKOWN ZONE
[7214:7213] [0:1459] 04:54:43 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1459)# ps -ef |grep nginx
root 8457 1 0 16:54 ? 00:00:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
nginx 8458 8457 0 16:54 ? 00:00:00 nginx: worker process
root 8464 7214 0 16:54 pts/1 00:00:00 /bin/grep --color=always nginx
326/1877MB 0.00 0.00 0.00 1/275 8247 UNKOWN ZONE
[7214:7213] [0:1460] 04:54:50 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1460)# kill -HUP 8457
326/1877MB 0.00 0.00 0.00 1/275 8247 UNKOWN ZONE
[7214:7213] [0:1461] 04:54:57 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1461)# kill -HUP 8457
326/1877MB 0.00 0.00 0.00 1/275 8247 UNKOWN ZONE
[7214:7213] [0:1462] 04:54:57 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1462)# kill -HUP 8457
326/1877MB 0.00 0.00 0.00 1/275 8247 UNKOWN ZONE
[7214:7213] [0:1463] 04:54:58 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1463)# kill -HUP 8457
326/1877MB 0.00 0.00 0.00 1/275 8247 UNKOWN ZONE
[7214:7213] [0:1464] 04:54:58 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1464)# kill -HUP 8457
326/1877MB 0.00 0.00 0.00 1/275 8247 UNKOWN ZONE
[7214:7213] [0:1465] 04:54:59 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1465)# kill -HUP 8457
326/1877MB 0.00 0.00 0.00 1/275 8247 UNKOWN ZONE
[7214:7213] [0:1466] 04:54:59 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1466)# kill -HUP 8457
326/1877MB 0.00 0.00 0.00 1/275 8247 UNKOWN ZONE
[7214:7213] [0:1467] 04:55:00 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1467)# kill -HUP 8457
326/1877MB 0.00 0.00 0.00 1/275 8247 UNKOWN ZONE
[7214:7213] [0:1468] 04:55:00 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1468)# kill -HUP 8457
326/1877MB 0.00 0.00 0.00 1/275 8247 UNKOWN ZONE
[7214:7213] [0:1469] 04:55:00 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1469)# kill -HUP 8457
326/1877MB 0.00 0.00 0.00 1/275 8247 UNKOWN ZONE
[7214:7213] [0:1470] 04:55:01 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1470)# ipcs -s
------ Semaphore Arrays --------
key semid owner perms nsems
0x00000000 0 root 600 1
0x00000000 65537 root 600 1
0x00000000 7766018 4294967295 600 1
0x00000000 7798787 4294967295 600 1
0x00000000 7831556 4294967295 600 1
0x00000000 7864325 4294967295 600 1
0x00000000 2326534 p7685 600 1
0x00000000 2359303 p7685 600 1
0x00000000 7897096 4294967295 600 1
0x00000000 7929865 4294967295 600 1
0x00000000 7962634 4294967295 600 1
0x00000000 7995403 4294967295 600 1
0x00000000 8028172 4294967295 600 1
0x00000000 8060941 4294967295 600 1
0x00000000 8093710 4294967295 600 1
0x00000000 8126479 4294967295 600 1
0x00000000 8159248 4294967295 600 1
0x00000000 8192017 4294967295 600 1
0x00000000 8224786 4294967295 600 1
0x00000000 8257555 4294967295 600 1
0x00000000 8290324 4294967295 600 1
0x00000000 8323093 4294967295 600 1
0x00000000 8355862 4294967295 600 1
0x00000000 8388631 4294967295 600 1
0x00000000 8421400 4294967295 600 1
0x00000000 8454169 4294967295 600 1
326/1877MB 0.00 0.00 0.00 1/275 8247 UNKOWN ZONE
[7214:7213] [0:1471] 04:55:04 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1471)#
Code:
330/1877MB 0.00 0.00 0.00 1/273 8623 UNKOWN ZONE
[7214:7213] [0:1476] 04:57:46 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1476)# banner start
##### ####### # ###### #######
# # # # # # # #
# # # # # # #
##### # # # ###### #
# # ####### # # #
# # # # # # # #
##### # # # # # #
330/1877MB 0.00 0.00 0.00 1/273 8623 UNKOWN ZONE
[7214:7213] [0:1477] 04:57:50 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1477)# ipcs -s
------ Semaphore Arrays --------
key semid owner perms nsems
0x00000000 0 root 600 1
0x00000000 65537 root 600 1
0x00000000 2326534 p7685 600 1
0x00000000 2359303 p7685 600 1
330/1877MB 0.00 0.00 0.00 1/273 8623 UNKOWN ZONE
[7214:7213] [0:1478] 04:57:52 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1478)# ps -ef |grep nginx
root 8654 7214 0 16:58 pts/1 00:00:00 /bin/grep --color=always nginx
330/1877MB 0.00 0.00 0.00 1/273 8623 UNKOWN ZONE
[7214:7213] [0:1479] 04:58:25 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1479)# service nginx status
nginx is stopped
330/1877MB 0.00 0.00 0.00 1/273 8623 UNKOWN ZONE
[7214:7213] [0:1480] 04:58:29 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1480)# ipcs -s
------ Semaphore Arrays --------
key semid owner perms nsems
0x00000000 0 root 600 1
0x00000000 65537 root 600 1
0x00000000 2326534 p7685 600 1
0x00000000 2359303 p7685 600 1
330/1877MB 0.00 0.00 0.00 1/273 8623 UNKOWN ZONE
[7214:7213] [0:1481] 04:58:31 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1481)# service nginx configtest
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
330/1877MB 0.00 0.00 0.00 1/273 8623 UNKOWN ZONE
[7214:7213] [0:1482] 04:58:40 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1482)# ipcs -s
------ Semaphore Arrays --------
key semid owner perms nsems
0x00000000 0 root 600 1
0x00000000 65537 root 600 1
0x00000000 8486914 4294967295 600 1
0x00000000 8519683 4294967295 600 1
0x00000000 2326534 p7685 600 1
0x00000000 2359303 p7685 600 1
330/1877MB 0.00 0.00 0.00 1/273 8623 UNKOWN ZONE
[7214:7213] [0:1483] 04:58:43 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1483)# ipcs -s
------ Semaphore Arrays --------
key semid owner perms nsems
0x00000000 0 root 600 1
0x00000000 65537 root 600 1
0x00000000 8486914 4294967295 600 1
0x00000000 8519683 4294967295 600 1
0x00000000 2326534 p7685 600 1
0x00000000 2359303 p7685 600 1
330/1877MB 0.00 0.00 0.00 1/273 8690 UNKOWN ZONE
[7214:7213] [0:1484] 04:58:46 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1484)# service nginx configtest
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
330/1877MB 0.00 0.00 0.00 1/273 8690 UNKOWN ZONE
[7214:7213] [0:1485] 04:58:54 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1485)# service nginx configtest
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
330/1877MB 0.00 0.00 0.00 1/273 8690 UNKOWN ZONE
[7214:7213] [0:1486] 04:58:54 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1486)# service nginx configtest
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
330/1877MB 0.00 0.00 0.00 1/273 8690 UNKOWN ZONE
[7214:7213] [0:1487] 04:58:55 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1487)# service nginx configtest
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
330/1877MB 0.00 0.00 0.00 1/273 8690 UNKOWN ZONE
[7214:7213] [0:1488] 04:58:55 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1488)# service nginx configtest
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
330/1877MB 0.00 0.00 0.00 1/273 8690 UNKOWN ZONE
[7214:7213] [0:1489] 04:58:55 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1489)# service nginx configtest
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
330/1877MB 0.00 0.00 0.00 1/273 8763 UNKOWN ZONE
[7214:7213] [0:1490] 04:58:56 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1490)# ipcs -s
------ Semaphore Arrays --------
key semid owner perms nsems
0x00000000 0 root 600 1
0x00000000 65537 root 600 1
0x00000000 8486914 4294967295 600 1
0x00000000 8519683 4294967295 600 1
0x00000000 8552452 4294967295 600 1
0x00000000 8585221 4294967295 600 1
0x00000000 2326534 p7685 600 1
0x00000000 2359303 p7685 600 1
0x00000000 8617992 4294967295 600 1
0x00000000 8650761 4294967295 600 1
0x00000000 8683530 4294967295 600 1
0x00000000 8716299 4294967295 600 1
0x00000000 8749068 4294967295 600 1
0x00000000 8781837 4294967295 600 1
0x00000000 8814606 4294967295 600 1
0x00000000 8847375 4294967295 600 1
0x00000000 8880144 4294967295 600 1
0x00000000 8912913 4294967295 600 1
330/1877MB 0.00 0.00 0.00 1/273 8763 UNKOWN ZONE
[7214:7213] [0:1491] 04:58:58 Wed Mar 18 root@l-2001a +1 /etc/logrotate.d
SUDO MODE ON
(1:1491)#
Code:
[Wed Mar 18 17:03:47 2015] [error] (28)No space left on device: Cannot create SSLMutex
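This error is consistent with semaphore exhaustion: on Linux, creating a semaphore array fails with "No space left on device" once the system-wide limit on arrays (SEMMNI, the fourth value in /proc/sys/kernel/sem) is reached. The sketch below compares the number of arrays in use with that limit. The helper name `semmni_from` is hypothetical, and the live check only runs where the proc file and `ipcs` are both available.

```shell
#!/bin/sh
# semmni_from: read a kernel.sem-style line on stdin and print its 4th field,
# SEMMNI (the maximum number of semaphore arrays system-wide).
semmni_from() {
    awk '{ print $4; exit }'
}

# Live check, only attempted where the proc file and ipcs are available.
if [ -r /proc/sys/kernel/sem ] && command -v ipcs >/dev/null 2>&1; then
    limit=$(semmni_from < /proc/sys/kernel/sem)
    # Count data rows only: their 2nd field is a numeric semid.
    used=$(ipcs -s | awk '$2 ~ /^[0-9]+$/ { n++ } END { print n + 0 }')
    echo "semaphore arrays in use: $used of $limit"
fi
```

If the count in use creeps toward the limit after every reload or log rotation, other services that need semaphores will eventually fail to start.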
But I have another question: why does the nginx release from Atomic generate all these semaphores, while the EPEL release and the original release from the NGINX repo do not generate any? This leads me to think that the Atomic build includes code changes that use semaphores, which the builds in the other repos do not have.
I'm sorry for the long email, but I worked my butt off on this at the office to find all of this out, and now I am sharing it with the WORLD!
My own personal fix is a change to the /etc/init.d/nginx script, where I added this command:
Code:
for i in `ipcs -s | awk '/4294967295/ {print $2}'`; do (ipcrm -s $i); done
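A slightly more cautious variant of the same idea is sketched below. It matches 4294967295 only in the owner column (the one-liner above matches it anywhere on the line) and does a dry run unless told otherwise. The function names and the `DO_REMOVE` variable are made up for this example. Note that, as the output above shows, the two semaphores of a running nginx have the same owner value, so this pattern matches them too.

```shell
#!/bin/sh
# orphan_semids: read `ipcs -s`-style output on stdin and print the semid of
# every array whose owner column is exactly 4294967295.
orphan_semids() {
    awk '$3 == "4294967295" && $2 ~ /^[0-9]+$/ { print $2 }'
}

# cleanup_orphans: dry run by default; set DO_REMOVE=1 to really call ipcrm.
cleanup_orphans() {
    orphan_semids | while read -r id; do
        if [ "${DO_REMOVE:-0}" = "1" ]; then
            ipcrm -s "$id"
        else
            echo "would remove semaphore array $id"
        fi
    done
}

# Hypothetical usage on a live server:
#   ipcs -s | cleanup_orphans                # list what would be removed
#   ipcs -s | DO_REMOVE=1 cleanup_orphans    # actually remove them
```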
On that note, have a good one. If I have posted this in the wrong forum by mistake, please move it to the proper place.