2

So I have this Bash script:

#!/bin/bash

PID=`ps -u ...`
if [ "$PID" = "" ]; then
    echo $(date) Server off: not backing up
    exit
else
    echo "say Server backup in 10 seconds..." >> fifo
    sleep 10

    STARTTIME="$(date +%s)"

    echo nosave >> fifo
    echo savenow >> fifo
    tail -n 3 -f server.log | while read line
    do
        if echo $line | grep -q 'save complete'; then
            echo $(date) Backing up...
            OF="./backups/backup $(date +%Y-%m-%d\ %H:%M:%S).tar.gz"
            tar -czhf "$OF" data

            echo autosave >> fifo
            echo "$(date) Backup complete, resuming..."
            echo "done"
            exit 0
            echo "done2"
        fi

        TIMEDIFF="$(($(date +%s)-STARTTIME))"
        if ((TIMEDIFF > 70)); then
            echo "Save took too long, canceling backup."
            exit 1
        fi
    done
fi

Basically, the server takes input from a fifo and outputs to server.log. The fifo is used to send stop/start commands to the server for autosaves. At the end, once the script sees the server's message that a save has completed, it tars the data directory and turns autosaves back on.

It's at the exit 0 line that I'm having trouble. Everything executes fine, but I get this output:

srv:scripts $ ./backup.sh
Sun Nov 24 22:42:09 EST 2013 Backing up...
Sun Nov 24 22:42:10 EST 2013 Backup complete, resuming...
done

But it hangs there. Notice how "done" echoes but "done2" fails. Something is causing it to hang on exit 0.

ADDENDUM: Just to avoid confusion for people looking at this in the future, it hangs at the exit line and never returns to the command prompt. Not sure if I was clear enough in my original description.

Any thoughts? This is the entire script; there's nothing else going on, and I'm calling it directly from bash.

1
  • "exit 0 fails to exit"? You're seeing it yourself, it is exiting :P Commented Nov 25, 2013 at 3:47

3 Answers

8

Here's a smaller, self-contained example that exhibits the same behavior:

echo foo > file
tail -f file | while read; do exit; done

The problem is that each part of a pipeline runs in its own subshell, so exit only exits the subshell running the while read loop, not the entire script.

The script then hangs because the shell waits for every part of the pipeline to finish, and tail -f keeps running until it finds a new line, tries to write it, and discovers that the pipe is broken.

To fix it, you can replace

tail -n 3 -f server.log | while read line
    do
       ...
    done

with

while read line
do
   ...
done  <  <(tail -n 3 -f server.log)

By redirecting from a process substitution instead, the script doesn't have to wait for tail to finish as it would in a pipeline, and the loop doesn't run in a subshell, so exit actually exits the entire script.
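To see the difference without a running server, here is a minimal sketch that swaps tail -f for a finite printf so it cannot hang. The variable names pipeline_out and procsub_out and the "still running" marker are made up for the demo, and it assumes bash with default options (in particular, lastpipe not enabled):

```bash
# Pipeline version: the loop runs in a subshell, so 'exit 0' only ends
# that subshell and the following echo still runs.
pipeline_out=$(bash -c 'printf "a\n" | while read -r line; do exit 0; done; echo "still running"')

# Process-substitution version: the loop runs in the main shell, so
# 'exit 0' ends the whole script and the echo is never reached.
procsub_out=$(bash -c 'while read -r line; do exit 0; done < <(printf "a\n"); echo "still running"')

echo "pipeline:             [$pipeline_out]"
echo "process substitution: [$procsub_out]"
```

The first line should show the marker text and the second should show empty brackets, which is the same split the original script hits at its exit 0 line.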


1 Comment

This was exactly what the problem was. Good to know in the future. :)
1

But it hangs there. Notice how "done" echoes but "done2" fails.

done2 won't be printed at all since exit 0 has already ended your script with return code 0.

2 Comments

What I mean is that it prints "done" but never returns to the command prompt. The "done2" is just there to indicate that it's not passing the exit and going into the time delay loop.
You write: "Basically, the server takes input from a fifo". I don't see it anywhere.
0

I don't know the details of bash subshells inside loops, but normally the appropriate way to exit a loop is to use the "break" command. In some cases that's not enough (you really need to exit the program), but refactoring that program may be the easiest (safest, most portable) way to solve that. It may also improve readability, because people don't expect programs to exit in the middle of a loop.
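A rough sketch of that refactoring, under some assumptions: a here-document stands in for server.log so the example is self-contained, and the names status and lines_read are invented for illustration. The loop records the outcome in a variable, leaves with break, and the script decides what to do in one place afterwards. This only works if the loop runs in the main shell (fed by a redirection rather than a pipe); inside a pipeline the variable would be lost with the subshell.

```bash
status=1        # assume failure until the completion message is seen
lines_read=0

# The loop is fed by a redirection, not a pipe, so it runs in the main
# shell and the variables set inside it survive after 'done'.
while read -r line; do
    lines_read=$((lines_read + 1))
    case $line in
        *'save complete'*)
            status=0
            break            # leave the loop; do not exit mid-loop
            ;;
    esac
done <<'EOF'
saving chunks
save complete
some later log line
EOF

echo "read $lines_read lines, status=$status"
# A real script would end with: exit "$status"
```

With tail -f as the input, the same loop body would need the process-substitution form of redirection so that the script is not left waiting on tail.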
