This is not a multi-threaded process. With ps and top I observed:
[user@host]$ ps aux | grep -i [r]redacted
500 3073 6.1 11.7 1457148 188188 ? Sl Feb14 91:54 /usr/bin/python2.7 /usr/bin/redacted_proc
500 3120 6.1 11.0 1541952 177184 ? Sl Feb14 91:56 /usr/bin/python2.7 /usr/bin/redacted_proc
top - 10:02:55 up 728 days, 19:30, 3 users, load average: 0.26, 0.14, 0.14
Tasks: 99 total, 1 running, 97 sleeping, 0 stopped, 1 zombie
Cpu(s): 3.7%us, 1.0%sy, 0.0%ni, 95.1%id, 0.0%wa, 0.0%hi, 0.0%si, 0.1%st
Mem: 1598640k total, 1239756k used, 358884k free, 192296k buffers
Swap: 0k total, 0k used, 0k free, 346756k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3120 redacted  20   0 1505m 173m 4648 S 10.3 11.1 93:33.08 redacted_proc
 3073 redacted  20   0 1422m 185m 4608 S  6.6 11.9 93:31.04 redacted_proc
After I killed both PIDs and started the process normally, there is again only a single running PID for it.
What would cause Linux to run two copies of the same process like this, especially when its init script should already account for that, and only a single entry exists under /var/run/redacted.pid?
I am including the contents of the init script:
#!/bin/bash
# Source function library.
. /etc/rc.d/init.d/functions
RETVAL=0
DAEMON=redacted_process
BIN="/usr/bin/redacted_process"
OPTS=""
RUNAS=redacted
PIDDIR=/var/run/${DAEMON}
PIDFILE=${PIDDIR}/${DAEMON}.pid
start () {
    echo -n "Starting ${DAEMON}: "
    [ -f ${PIDFILE} ] && success && echo && return 0
    su -s /bin/bash ${RUNAS} -c "
        cd /
        ${BIN} ${OPTS} &> /dev/null &
        echo \$! > ${PIDFILE}
        disown \$!
    "
    RETVAL=$?
    [ $RETVAL -eq 0 ] && success || failure
    echo
    return $RETVAL
}
stop () {
    echo -n "Shutting down ${DAEMON}: "
    killproc ${DAEMON}
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/${DAEMON}
    [ $RETVAL -eq 0 ] && rm -f ${PIDFILE}
    return $RETVAL
}
restart () {
    stop
    start
    RETVAL=$?
    return $RETVAL
}
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart|reload)
        restart
        ;;
    status)
        status ${DAEMON}
        RETVAL=$?
        ;;
    *)
        echo "Usage: ${0} {start|stop|restart|status}"
        RETVAL=1
esac
exit $RETVAL
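For comparison, start() above only tests whether the pidfile exists, not whether the PID stored in it still belongs to a live process. A minimal sketch of a stricter check follows. The function name pidfile_alive is hypothetical (it is not part of /etc/rc.d/init.d/functions), and the sketch assumes a Linux /proc filesystem:

```shell
# Hypothetical helper (not part of the init script or of
# /etc/rc.d/init.d/functions): succeed only if the pidfile exists,
# holds a numeric PID, and that PID is currently a live process.
# Assumes a Linux /proc filesystem.
pidfile_alive () {
    local pidfile="$1" pid
    [ -s "$pidfile" ] || return 1          # missing or empty file
    pid=$(head -n 1 "$pidfile")
    case $pid in
        ''|*[!0-9]*) return 1 ;;           # not a plain number
    esac
    [ -d "/proc/${pid}" ]                  # true while the process exists
}
```

A guard such as `pidfile_alive ${PIDFILE} && success && echo && return 0` would then skip the launch only when the recorded process is really running. A recycled PID could still fool this check, so it is a sketch rather than a complete fix.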
Interestingly, there is also an entry in /etc/crontab for this job that runs every minute:
/sbin/service redacted status > /dev/null
if [ "$?" -gt "0" ]; then
    /bin/rm /var/run/redacted_proc/*
    /sbin/service redacted restart && tail -n 200 /var/log/redacted_proc/redacted_prod.log | mail -s "redacted pid restarted on ${HOSTNAME}" [email protected]
fi
And ps shows that cron job as <defunct>.
I am wondering if this somehow caused this program to get run twice.
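One way this could happen, sketched under assumptions: if the cron job deletes the pidfile while the old process is still alive, and the following stop does not actually kill it, the existence-only guard in start() passes and a second copy is launched. Whether stop really fails here depends on how killproc locates the process, which the output above does not show. The stand-alone sketch below reproduces only the sequence, with sleep standing in for the daemon; start_demo, the temporary directory, and the variable names are illustrative and not taken from the real system:

```shell
# Stand-alone illustration; `sleep` stands in for the real daemon and
# all names here are hypothetical.
demo_dir=$(mktemp -d)
demo_pidfile="${demo_dir}/demo.pid"

start_demo () {
    # Same style of guard as the init script: only file existence is checked.
    [ -f "$demo_pidfile" ] && return 0
    sleep 30 > /dev/null 2>&1 &
    echo $! > "$demo_pidfile"
}

start_demo
first_pid=$(cat "$demo_pidfile")

start_demo                      # no-op: the pidfile exists
rm -f "$demo_pidfile"           # what the cron job's rm does
start_demo                      # guard passes, a second copy starts
second_pid=$(cat "$demo_pidfile")

# Count how many of the two recorded PIDs are still alive.
copies=0
kill -0 "$first_pid"  2>/dev/null && copies=$((copies + 1))
kill -0 "$second_pid" 2>/dev/null && copies=$((copies + 1))
echo "copies running: ${copies}"

# Clean up the stand-in processes and the temp directory.
kill "$first_pid" "$second_pid" 2>/dev/null
rm -rf "$demo_dir"
```

The pidfile only ever records the most recent PID, so the first copy becomes invisible to anything that relies on the pidfile alone.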
Please run ps -ejf, which lists the PPID, and post the two relevant lines (plus the header line).
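If the ps output is awkward to filter, the same parent information can be read from /proc on Linux. A small sketch, where show_parent is a hypothetical helper name:

```shell
# Hypothetical helper: print "PID PPID" for one process by reading the
# PPid field of /proc/<pid>/status (Linux only).
show_parent () {
    local pid="$1" key value
    [ -r "/proc/${pid}/status" ] || return 1
    while read -r key value _; do
        if [ "$key" = "PPid:" ]; then
            echo "${pid} ${value}"
            return 0
        fi
    done < "/proc/${pid}/status"
    return 1
}

# Example: show the parent of the current shell.
show_parent $$
```

For the two PIDs in question, a shared PPID of 1 would be consistent with both having been started detached, as the init script's disown does, though it would not by itself show which caller started them.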