0

I want to handle filenames containing newlines in a bash script, instead of using

find "$search_dir" -type f | while IFS= read -r file; do

Perhaps something like the following would do it:

 find "$search_dir" -type f -print0 | while read -d '' -r file; do

Would I also need to include IFS= ?


2 Answers

5

To deal with arbitrary file paths, you need:

  • With GNU find (4.9 or newer):

    printf '%s\0' "$search_dir" |
      find -files0-from - -type f -print0 |
      while LC_ALL=C IFS= read -rd '' file; do
        something with "$file"
      done
    
  • With BSD find:

    find -f "$search_dir" -- -type f -print0 |
      while LC_ALL=C IFS= read -rd '' file; do
        something with "$file"
      done
    
  • With find and sh/read compliant with POSIX 2024 or later (though this prefixes $file with ./ for some values of $search_dir):

    case $search_dir in
      ([./]* | "") find "$search_dir" -type f -print0;;
      (*) find "./$search_dir" -type f -print0;;
    esac |
      while LC_ALL=C IFS= read -rd '' file; do
        something with "$file"
      done
    
  • With any POSIX find and sh (same note as above; beware that this may run more than one sh invocation):

    case $search_dir in
      ([./]* | "") dir=$search_dir;;
      (*) dir=./$search_dir;;
    esac
    find "$dir" -type f -exec sh -c '
      for file do
        something with "$file"
      done' sh {} +
    

  • find "$search_dir" wouldn't work with values of $search_dir that start with - or that are find predicates such as ( or !. In standard find, the only workaround is to prefix those with ./ (which we do here for anything that doesn't start with . or /). With GNU find, you can instead pass the file list (here just $search_dir) NUL-delimited on stdin using -files0-from -, and with BSD find you can pass each file path to a separate -f option.
  • You need -print0 (or -exec printf '%s\0' {} + to be compliant with POSIX 2018 or earlier) for the output to be post-processable, as newline is as valid a character as any in a file path.
  • read -d '' makes read consume NUL-delimited records instead of lines (supported by bash, zsh, NetBSD sh and recent versions of ksh93u+m at least, and it will be supported by more shells now that it's mandated by POSIX).
  • -r disables backslash processing (where backslash escapes $IFS characters, itself and the record delimiter).
  • IFS= disables the splitting of those records into words (and, with the default value of $IFS, keeps leading or trailing tabs, newlines and spaces from being removed from the file paths). The only time you may want to skip setting IFS for read is when you actually want the record split into words on the current value of $IFS, but then you generally also pass more than one variable to read or use read -A array (read -a array in bash). The other exception is bash (and bash only) when no variable is passed: read then defaults to the REPLY variable, as ksh and other Korn-like shells do, but unlike them it doesn't do the $IFS-splitting.
  • LC_ALL=C is needed to work around bugs in some versions of bash and, more generally, to keep file paths from being decoded as text, which may fail.
  • All of the above assumes you need a shell to do something with those files. To just run one external command with the path as argument, you don't need a shell; just pass -exec cmd -- {} ';' to find.
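
To see those pieces working together, here is a minimal bash sketch. It creates a scratch directory holding a file whose name has a leading space, a backslash, an embedded newline and a trailing space, then reads the paths back with find -print0 and read -rd ''. The scratch directory, the file names and the paths array are made up for illustration, and feeding the loop through process substitution (rather than a pipe) is only there so that the array is still set after the loop; none of that comes from the answer itself.

```bash
#!/usr/bin/env bash
# Scratch directory with deliberately awkward file names (illustrative only).
search_dir=$(mktemp -d)
touch -- "$search_dir/plain" \
         "$search_dir/ lead\\back"$'\n'"line2 "

paths=()
# Process substitution keeps the loop in the current shell, so "paths"
# is still populated after "done" (a pipe would run the loop in a subshell).
while LC_ALL=C IFS= read -rd '' file; do
  paths+=("$file")
done < <(find "$search_dir" -type f -print0)

printf 'found %d files\n' "${#paths[@]}"
for p in "${paths[@]}"; do
  # %q prints the path with shell quoting so odd characters stay visible.
  if [ -f "$p" ]; then printf 'still a valid path: %q\n' "$p"; fi
done

rm -rf -- "$search_dir"
```

Dropping IFS= or -r from the read call in this sketch is a quick way to watch the awkward name get mangled.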

If switching to zsh is an option, then you can just do:

for file (**/*(ND.)) something with $file

Where the . glob qualifier is the equivalent of find's -type f. Though note that it obtains and stores the full list in memory before starting to loop over it. You do get a sorted list though and it makes it easier to skip hidden files (by just removing the D glob qualifier).
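
If you stay in bash and holding the whole list in memory is acceptable, something comparable can be sketched with mapfile. This is an assumption-laden illustration rather than an equivalent of the zsh glob: it needs bash 4.4 or newer for mapfile -d '' and a sort that accepts -z (GNU and recent BSD sort do), and the scratch directory and file names are invented for the demonstration.

```bash
#!/usr/bin/env bash
# Assumes bash 4.4+ (mapfile -d) and a sort that supports -z.
search_dir=$(mktemp -d)            # scratch directory, for illustration only
mkdir -- "$search_dir/sub"
touch -- "$search_dir/b" "$search_dir/a"$'\n'"x" "$search_dir/sub/.hidden"

# Read the whole NUL-delimited list into an array, sorted bytewise.
mapfile -d '' -t files < <(find "$search_dir" -type f -print0 | LC_ALL=C sort -z)

for file in "${files[@]}"; do
  printf 'would process: %q\n' "$file"
done

rm -rf -- "$search_dir"
```

Unlike the zsh version, this still runs find and sort as external commands, and hidden files are included unless you add a test to exclude them.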

-2

EDIT: I was wrong, sorry. Please un-accept this answer so I can delete it.

Would I also need to include IFS= ?

No, read -d '' makes the zero byte the delimiter.

Note that there are elegant alternatives to find … | while … read…!

find -type f -exec some_command {} ';'

will run just some_command filename for every file found;

# for bash
shopt -s nullglob dotglob globstar
shopt -u failglob
for file in **/*; do
  [[ -f $file ]] || continue
  [[ -L $file ]] && continue
  what_you_wanted_to_do
done

works without find at all in bash; in zsh, you don't even need to check for "regular fileness", you can just recursively glob for regular files:

# for zsh
for file in **/*(ND.) ; do
  what_you_wanted_to_do
done

  • -d '' only changes the "line" delimiter from newline to the NUL byte. That is a different function from what IFS does: IFS affects splitting the line into fields, usually in something like echo "foo bar" | read a b. But even when only one variable name is given, read still removes leading and trailing IFS whitespace. And if you want backslashes to go through as-is, you also need -r. Try something like printf ' \\foo bar\n\0' | { read -d '' a; printf "%s" "$a" | od -c; } with and without IFS= and -r. Commented Jun 9 at 7:27
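
The experiment suggested in that comment can be spelled out as a small bash sketch that runs all four combinations side by side. The record and show helper functions and the variable names a to d are hypothetical, introduced only for this illustration; process substitution is used instead of a pipe so the variables remain set in the current shell.

```bash
#!/usr/bin/env bash
# Record under test: leading space, backslash, inner space, trailing newline,
# terminated by a NUL byte.
record() { printf ' \\foo bar\n\0'; }

# Print a label followed by the exact bytes stored in the variable.
show() {
  printf '%s:\n' "$1"
  printf '%s' "$2" | od -An -c
}

read -d '' a        < <(record)   # neither IFS= nor -r
read -rd '' b       < <(record)   # -r only
IFS= read -d '' c   < <(record)   # IFS= only
IFS= read -rd '' d  < <(record)   # both, as recommended in the accepted answer

show 'plain read -d' "$a"
show 'with -r' "$b"
show 'with IFS=' "$c"
show 'with IFS= and -r' "$d"
```

Only the last variant keeps the surrounding whitespace and the backslash intact, which is why both IFS= and -r are needed alongside -d ''.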
