
I'm trying to write a script that watches a file and compares its md5sum every 60 seconds; if the checksum has changed, it should print a warning to the screen.

I'm not sure how to go about this. This is what I have, but I think I'm off by a good bit:

#!/bin/bash

watch=$@

if [ -z "$watch" ]
    then
        echo "No file specified, aborting"
        exit 
else
    echo "watching : $watch"
fi

while [ 1 ]; do
watch -n 60 -d md5sum $watch
done

I have also tried this, which seems to work (kind of): it does tell me if the file has changed, but it doesn't use the watch command. Is there a way of doing it via watch?

#!/bin/bash

watch=$@

if [ -z "$watch" ]
   then
       echo "No file specified, aborting"
       exit 
else
    echo "watching : $watch"
fi

checksum1="empty"
while [ 1 ]; do

checksum2=$(md5sum $watch | md5sum | cut -d ' ' -f1)
if [ "$checksum2" != "$checksum1" ];
then
echo "Warning : $watch has been changed"
#mail -s "$watch has been changed" "[email protected]"
echo -e "\a"
fi
checksum1="$checksum2"
sleep 60
done

3 Answers

You are much better off with inotify, which is made for this kind of purpose: file monitoring.


EDIT: I'll summarize the responses from the linked question here.

Whenever you need to monitor a file or a directory, inotify is the right tool for the job. You can tell inotifywait which events you want to monitor: file access, change, open, close, delete... (See man inotifywait for more details).

A first approach is a loop like this one:

while inotifywait -e close_write myfile.py; do ./myfile.py; done

The major drawback is that you might miss events that occur while the loop body is running, because inotifywait is restarted on every iteration. A more robust version would be:

inotifywait -mq --format '%e' /path/to/file |
while IFS= read -r events; do
  /path/to/script "$events"
done

The latter version won't miss a single event. The pipe ensures that events are queued. If the loop doesn't pick them up in a timely fashion, they will pile up, but none will be missed.
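
To tie this back to the question, the loop on the receiving end of the pipe can print the warning itself. This is only a minimal sketch, assuming inotifywait (from inotify-tools) is installed; warn_on_events is a helper name made up for illustration, not part of any tool:

```shell
#!/bin/bash

# warn_on_events is a hypothetical helper: it reads one event name per
# line on stdin and prints a warning for each line it receives.
warn_on_events() {
    local file=$1 event
    while IFS= read -r event; do
        echo "Warning : $file has been changed ($event)"
    done
}

# Intended use (blocks until interrupted, so it is left commented out):
# inotifywait -mq -e close_write --format '%e' "$1" | warn_on_events "$1"
```

Keeping the warning in a function that only reads stdin means you can try it out by piping sample event names into it, without having to touch the watched file.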


EDIT: If you are tracking text files, I'd also recommend git as it does track files using robust hashes. Here's some bash pseudo-code as to what the main inotify loop would look like:

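# git diff --quiet exits non-zero when <file> has unstaged changes
if ! git diff --quiet -- file; then
    git commit -q -m "file changed" -- file  # Record the new version of <file>
    # keep only the last two versions of <file> (left as pseudo-code)
    echo "Warning : file has been changed"
fi

You can use this as a base. It assumes <file> is already tracked in a repository. Since git diff --quiet reports through its exit status, there is no git output to parse. I haven't used git intensively enough to tell you exactly how to prune the history, but I have a hunch it can do that ;-) .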


I hate for this to be the answer, but after about 15 minutes of playing with this, I don't think there is a way to make it happen with the watch command (though someone else, please prove me wrong).

The problem is that watch runs in its own loop and does not break out to hand data back to the shell; it only draws its own display.

Because of the way watch runs, using it inside any kind of if or while test does not give the shell a chance to evaluate any changes, because it never returns its results. It merely holds onto them until you exit its loop manually.

Running your own loop and checking the md5sum the way you are in your second example is about the best way to accomplish what you are looking to do.

As an example, issue

echo `watch -d md5sum testfile.txt`

The echo will never fire. If you can find a way to make that echo hit the terminal, then that is the answer to getting your script to function.
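
One possible way around this: the watch shipped with procps-ng has a -g (--chgexit) option that makes it exit as soon as the command's output changes, which hands control back to the shell. The sketch below assumes your watch supports -g (older builds do not) and that the script runs in a terminal; watch_until_changed is a name made up here:

```shell
#!/bin/bash

# watch_until_changed is a hypothetical helper built on watch -g.
watch_until_changed() {
    local file=$1
    # With -g, watch returns once the md5sum output differs from the
    # previous run; the warning prints only if watch exited successfully.
    # Note: watch hands the command to sh -c, so file names containing
    # spaces would need extra quoting.
    watch -n 60 -g md5sum "$file" && echo "Warning : $file has been changed"
}

# Intended use (needs a terminal, so it is left commented out):
# while watch_until_changed "$1"; do :; done
```

While watch is running it still takes over the screen, so the warning only shows up once watch has exited.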


Try this rough script. It follows your idea of supplying a file argument and making use of md5sum. It prints a timestamped line whenever the md5sum changes, saves a log file, and stops when you press ctrl-c. Contents of watch_and_notify.sh:

#!/bin/bash

if [ -z "$1" ]; then
        echo "No file specified, aborting" >&2
        exit 1
fi

logf="$1.log"

interval=2

first_run=

# temp files, current and last md5s for diff to compare
lm1="$(mktemp /tmp/lm1.$$.XXXX)"
lm2="$(mktemp /tmp/lm2.$$.XXXX)"

# clean up the temp files on ctrl-c; the trap has to be set
# before the loop, otherwise it is never reached
trap 'rm -f "$lm1" "$lm2"; exit 1' SIGINT

echo "Watching at ${interval}s intervals:   $1"

# loop forever until this script is cancelled
while true; do

    md5sum "$1" > "$lm1"

    # in the first iteration lm2 holds no previous
    # result yet (mktemp created it empty), so diff
    # would always unintentionally report a
    # difference; seed it with a copy of lm1
    if [ -z "$first_run" ]; then
        cp -a "$lm1" "$lm2"
        first_run=1
    fi

    # ! inverts diff's exit status
    if ! diff "$lm2" "$lm1"; then
        echo -e "$(date +"%F %R")\tChange detected:\t$1" | tee -a "$logf"
    fi

    # rotate
    mv "$lm1" "$lm2"

    sleep "$interval"

done

Example

Create an empty text file named a.txt:

$ touch a.txt

Run the script like this, and see:

$ ./watch_and_notify.sh a.txt
Watching at 2s intervals:       a.txt

In a second terminal, make a change, for example:

$ echo addition >> a.txt

On the first terminal running the script, you'll see an update:

Watching at 2s intervals:       a.txt

1c1
< d41d8cd98f00b204e9800998ecf8427e  a.txt
---
> 9913e6909c108b5c32c69280474b2b2a  a.txt
2015-09-29 15:56        Change detected:        a.txt

On the second terminal you again introduce another change:

$ echo anotherchange >> a.txt

Then on the first terminal running the script, the output is updated again:

Watching at 2s intervals:       a.txt
1c1
< d41d8cd98f00b204e9800998ecf8427e  a.txt
---
> 9913e6909c108b5c32c69280474b2b2a  a.txt
2015-09-29 15:56        Change detected:        a.txt
1c1
< 9913e6909c108b5c32c69280474b2b2a  a.txt
---
> 5c1c20a75b9982128f8300a7940f9ce0  a.txt
2015-09-29 16:06        Change detected:        a.txt

You quit with ctrl-c and are returned to the command prompt. Listing the directory contents shows there is a log:

$ ls -lh
total 12K
-rw-r--r-- 1 meme meme  22 Sep 29 16:06 a.txt
-rw-r--r-- 1 meme meme  80 Sep 29 16:06 a.txt.log
-rwxrwxrwx 1 meme meme 775 Sep 29 15:26 watch_and_notify.sh

Viewing the log, you see the same timestamped entries logging the moments of change:

$ cat a.txt.log
2015-09-29 15:56        Change detected:        a.txt
2015-09-29 16:06        Change detected:        a.txt

Code explanation

Most of the overall flow should be clear from the comments in the script. In short, it repeatedly runs md5sum on the file, saves the result, and rotates it so that diff compares the current result with the previous iteration's result. If there is a difference, it performs the reporting action: here, printing a timestamped line to the screen and appending it to a log. The user stops the script with ctrl-c and is left with a log of the timestamped moments when differences were detected.
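
The same flow can also be sketched without temp files by remembering the previous checksum in a shell variable. This is a variant for illustration, not the script above; check_once and last_sum are names made up here:

```shell
#!/bin/bash

# Checksum seen on the previous call; empty until the first call.
last_sum=

# check_once (hypothetical helper) prints a timestamped line when the
# file's md5sum differs from the one recorded by the previous call.
# The first call only records the checksum and prints nothing.
check_once() {
    local file=$1 sum
    sum=$(md5sum < "$file") || return   # give up if the file is unreadable
    sum=${sum%% *}                       # keep only the hash field
    if [ -n "$last_sum" ] && [ "$sum" != "$last_sum" ]; then
        printf '%s\tChange detected:\t%s\n' "$(date +'%F %R')" "$file"
    fi
    last_sum=$sum
}

# Intended use (runs until ctrl-c, so it is left commented out):
# while true; do check_once "$1"; sleep 2; done | tee -a "$1.log"
```

The tee sits after the whole loop rather than after check_once, because each side of a pipeline runs in a subshell and an update to last_sum made there would be lost between iterations.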

Within the script

lm1="$(mktemp /tmp/lm1.$$.XXXX)"
lm2="$(mktemp /tmp/lm2.$$.XXXX)"
  • lm1 and lm2 are temp files that store md5sum outputs for comparison; they are created each time the script is run
  • the key approach is, as you said, literally comparing md5sums, so we first need somewhere temporary to store them
  • mktemp creates a temporary file with a unique name and prints that name
  • $$ is the current process ID, which ties the file names to this run of the script
  • XXXX tells mktemp to replace each X with a random alphanumeric character
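
A quick way to see what mktemp does with such a template is to call it twice with the same one. This is a small sketch; the demo prefix is arbitrary and not taken from the script:

```shell
#!/bin/bash

# mktemp creates the file, replaces the trailing X's with random
# characters, and prints the resulting path.
tmp_a=$(mktemp /tmp/demo.$$.XXXX)
tmp_b=$(mktemp /tmp/demo.$$.XXXX)

# Same template and same process ID, yet the two names differ.
echo "first:  $tmp_a"
echo "second: $tmp_b"

# Clean up the demo files.
rm -f "$tmp_a" "$tmp_b"
```

The uniqueness comes from the X's that mktemp fills in; $$ alone would give both files the same name.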

So while the script is running, we can check that /tmp does indeed contain at least one file matching our pattern:

$ ls -lh /tmp/lm*
-rw-r--r-- 1 meme meme 40 Sep 29 15:08 /tmp/lm2.8248.xGJTl

Next we have the while loop:

# loop forever until cancel this script
while true; do

...

done

The majority of the code is sandwiched within a big while <command>; do ... done loop. Since the command used as the condition is true, which always succeeds, this code runs indefinitely until we press ctrl-c to stop it.

Each loop iteration starts by generating the current iteration's md5sum result and saving it:

md5sum "$1" > "$lm1"
  • $1 is the first positional argument. When the script is run as watch_and_notify.sh a.txt, $1 is a.txt
  • > "$lm1" writes the output to the temp file defined earlier

On the first run there is no previous iteration. Yet diff has to compare the previous md5sum result in $lm2 with the current one in $lm1, and on the first run that would always unintentionally show a difference, so the first run needs a special conditional action. To recognize the first run, we define an initially empty variable before the while loop:

first_run=

Then, within the loop we test for this:

    if [ -z "$first_run" ]; then
        cp -a "$lm1" "$lm2"
        first_run=1
    fi
  • -z tests whether a string is empty. On the first iteration $first_run is empty, so the then portion runs
  • within the then portion, we "fake" a previous iteration by making a duplicate of $lm1, so that on the first iteration diff compares two identical files and reports no difference
  • first_run=1 means that on the next iteration $first_run is no longer empty, so the then portion is skipped; this ensures the action is taken only on the first iteration

Next we have the actual diff condition, which compares the previous iteration's md5sum result, saved in the file referenced by $lm2, with the current iteration's md5sum result, saved in the file referenced by $lm1:

if ! diff "$lm2" "$lm1"; then
  • we rely on reacting to the diff command's exit code
  • normally diff file1 file2 exits with code 0 when the files are identical. You can test this by running $ diff a.txt a.txt; echo $? and you will see a zero. When they differ, e.g. $ diff a.txt b.txt; echo $? with b.txt different from a.txt, the result is 1
  • bash treats exit code 0 as true and any non-zero code as false
  • so we cannot write if diff $lm2 $lm1; then, because diff on identical files gives exit code 0, which is interpreted as true and triggers the then part
  • we want the opposite behavior: if identical do nothing, if not identical do something
  • ! inverts the exit status
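
The exit-code behavior described above can be checked in isolation. This is a small sketch with throwaway files; report is a helper name made up for the demo, and diff -q is used only to keep the output quiet:

```shell
#!/bin/bash

# report is a hypothetical helper that prints which branch was taken.
# diff exits 0 for identical files and 1 for differing ones; it exits 2
# on trouble such as a missing file, which ! also treats as "changed".
report() {
    if ! diff -q "$1" "$2" > /dev/null; then
        echo "changed"
    else
        echo "identical"
    fi
}

dir=$(mktemp -d)
echo same      > "$dir/a.txt"
echo same      > "$dir/a_copy.txt"
echo different > "$dir/b.txt"

report "$dir/a.txt" "$dir/a_copy.txt"   # prints: identical
report "$dir/a.txt" "$dir/b.txt"        # prints: changed

rm -rf "$dir"
```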

When there is a change, the action taken is:

    echo -e "$(date +"%F %R")\tChange detected:\t$1" | tee -a "$logf"
  • echo with -e to render \t as tabs
  • date +"%F %R" to render current timestamp in the given format, eg 2015-09-29 15:56
  • | to pipe the output to the tee program
  • tee allows us to view the output message about there being a change, as well as save the output
  • -a sets the saving mode to append, otherwise each time there were results it would overwrite previous results
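
The reporting step can be sketched as a small function. This is a variant for illustration; log_change is a made-up name, and printf stands in for echo -e because its handling of \t does not vary between shells:

```shell
#!/bin/bash

# log_change (hypothetical helper) prints a timestamped, tab-separated
# line and appends the same line to the given log file.
log_change() {
    local file=$1 logf=$2
    printf '%s\tChange detected:\t%s\n' "$(date +'%F %R')" "$file" | tee -a "$logf"
}

# Example: log_change a.txt a.txt.log
```

Because tee is called with -a, calling log_change twice leaves two lines in the log rather than one.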

We are almost done with this iteration's tasks. Having finished the diff comparison and its actions, we prepare for the next iteration:

    mv "$lm1" "$lm2"
  • mv moves (renames) the current iteration's md5sum result, saved at $lm1, to $lm2
  • so in the next iteration, diff $lm2 $lm1 will indeed be comparing against the previous iteration's md5sum

Finally, the last statement in the loop is sleep $interval:

  • sleep causes a delay, in seconds, given by the $interval variable
  • $interval is set at the beginning of the file to 2
  • so each iteration takes roughly 2 seconds

    done

  • Finally, done closes the while ...; do ... done loop
