
I need to monitor a shared folder (in this specific case the host is Windows and the guest is Ubuntu Linux) for new files or changes to existing files. Ideally the solution should work regardless of which machine, host or guest, puts a file into the shared directory. The new file will be the input for a different process.

The inotifywait family of tools doesn't detect new files when they are created by the host and placed into the shared folder.

What are my options?

  • After watching, what do you want to do? If it is just copying, then periodically running rsync might work. However, I had issues with rsync running on a VirtualBox shared folder :( Commented Aug 24, 2016 at 5:30
  • I edited the question to reflect that I want to use the detected file as the input to a different process/shell script. I am not using rsync in this case. Commented Aug 24, 2016 at 6:38
  • One challenge here is identifying when the file has been completely copied from host to guest. @Paul-Nordin's answer is good if you can parse the output of the watch command. Another way would be to periodically run ls -ctr | tail -1 to get the latest file. You can save the file details in a variable, check whether it is a new file, and process it appropriately. Commented Aug 24, 2016 at 6:51

3 Answers


You need something that polls for file changes because if a file is modified on the Windows side, the Linux kernel is not going to know about it. There are a few existing applications that can help with that, such as Guard: http://guardgem.org/

Depending on your exact needs, you could just watch the file listing (adjusting the interval, n seconds, to whatever is suitable):

watch --differences -n 10 ls -l </path/to/shared/dir>
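If you want to trigger a process rather than just eyeball changes, the same idea can be scripted. A minimal polling sketch (the mount point, the 10-second interval, and the handler name `myprocess` are assumptions, not part of the original answer):

```shell
#!/bin/sh
# Fingerprint a directory listing; the commented loop below reacts
# whenever the fingerprint changes.
snapshot() {
  # ls -l output changes when files are added, removed, or resized;
  # cksum reduces the whole listing to a single comparable line
  ls -l "$1" 2>/dev/null | cksum
}

# DIR=/mnt/mypath                        # assumed mount point
# prev=$(snapshot "$DIR")
# while sleep 10; do                     # 10-second poll interval
#   cur=$(snapshot "$DIR")
#   [ "$cur" != "$prev" ] && myprocess   # hypothetical handler
#   prev=$cur
# done
```

This has the same limitation as any polling approach: a change can go unnoticed for up to one interval, and a file modified in place without a size change may not alter the listing.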

You may be able to use one of the polling tools that pre-date dnotify and inotify: gamin or fam, along with something like fileschanged which is an inotifywait-like CLI tool. The gamin and fam projects are related, and both quite old (though gamin slightly less so).

For simple and portable tasks I have used something like this via cron:

# use mkdir as a primitive atomic lock so runs don't overlap
if mkdir /var/lock/mylock 2>/dev/null; then
  # only pick up files untouched for at least two minutes
  ( cd /mnt/mypath; find . -type f -mmin +2 ) | myprocess
  rmdir /var/lock/mylock
else
  logger -p local0.notice "mylock found, skipping run"
fi

This uses primitive locking, and a GNU find conditional that selects only files older than two minutes, so I could be sure the files had been completely written. In my case myprocess was an rsync --remove-source-files --files-from=- so that files were removed once they were processed.

This approach also lets you use find -print0/xargs -0/rsync -0 to handle troublesome filenames.
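A sketch of that nul-delimited variant, with printf standing in for the real consumer (rsync or myprocess); the function name is an assumption:

```shell
#!/bin/sh
# Nul-delimited pipeline: names containing spaces (or even newlines)
# survive intact because find terminates each path with \0 and xargs
# splits on \0. Add -mmin +2, as above, to skip files still being written.
list_ready_files() {
  find "$1" -type f -print0 | xargs -0 -n1 printf 'processing: %s\n'
}
```

With rsync, the equivalent is `find ... -print0 | rsync -0 --files-from=- ...`, since `-0` tells rsync to read nul-separated names from --files-from.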

If you must keep all (old and new) files in the same directory hierarchy, then building directory-listing snapshots and diff-ing them might also work for you:

if mkdir /var/lock/mylock; then
  ( 
    export LC_COLLATE=C  # for sort
    cd /mnt/mypath
    find . -type f -a \! -name ".dirlist.*" -printf '%p\0' | 
      while read -r -d '' file; do
        printf "%q\n" "${file}"  
      done > .dirlist.new
    [[ -f  .dirlist.old ]] && {
      comm -13 <(sort .dirlist.old) <(sort .dirlist.new) |
        while read -r file; do
          myprocess "${file}"
        done
    }
    mv .dirlist.new .dirlist.old
  )
  rmdir /var/lock/mylock
else
  logger -p local0.notice "mylock found, skipping run"
fi

This bash script:

  1. uses find -printf to print a \0 (nul) delimited list of files
  2. uses read -d '' to process that list, and printf %q to escape filenames where necessary
  3. compares the new and previous .dirlist files
  4. invokes myprocess with each new file (safely quoted)

(Handling modified files as well would require slightly more effort: a double-line record format, e.g. find ... -printf '%p\0%s %Ts\0', could be used, with corresponding changes to the while loops.)
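A sketch of that size+mtime variant, using single-line records for brevity (the function name and field layout are assumptions; use the nul-delimited loop above if filenames may contain newlines):

```shell
#!/bin/sh
# Snapshot "path size mtime" per file, so that comm flags modified
# files as well as new ones: a changed size or timestamp produces a
# line that does not appear in the old list. Requires GNU find.
build_list() {
  ( cd "$1" && find . -type f ! -name '.dirlist.*' -printf '%p %s %Ts\n' ) | sort
}

# build_list /mnt/mypath > .dirlist.new
# comm -13 .dirlist.old .dirlist.new   # lines that are new or changed
```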

  • Thank you! That is a very good point about ensuring that the file has already been written. In my case the writes over the shared directory should be fairly quick, but the files I am dealing with are sometimes more than 30MB, so I wouldn't want to start using a file until it had finished writing. Commented Aug 24, 2016 at 15:07
  • I decided to use a token file to determine whether the larger file has finished copying. The host copies the large file and then a small file. The small file is just a means to detect that the copying is done and the large file is ready to be processed. Then I can just use the -f test to determine whether the small file exists and use that to launch my logic. I couple that with a crontab entry that runs a script which loops 5 times, sleeping 10 seconds each iteration, to achieve a check every 10 seconds. Commented Sep 14, 2016 at 7:04
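The token-file scheme described in that comment can be sketched like this (the file names and the handler are placeholders, not the commenter's actual code):

```shell
#!/bin/sh
# Token-file check: the host copies big.dat, then an empty big.dat.done;
# the guest processes big.dat only once the token exists, so it never
# reads a half-copied file.
process_if_ready() {
  data=$1 token=$2
  if [ -f "$token" ]; then
    printf 'processing %s\n' "$data"   # stand-in for the real logic
    rm -f "$token"                     # consume the token
  fi
}
```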

If you are familiar with node.js and its ecosystem, you can use this library to implement polling on a shared folder.

