4

This is my code:

#!/bin/bash -e
dirs=( * )
for f in "${dirs[@]}"
do
  while IFS= read -r line; do
     case "$line" in
       *disabled\>true* )
          sed -i '1i "$f"' list.txt;;
     esac
  done < "$f/config.xml"
done

Instead of sed, I tried echo and printf too but the the file list.txt is always empty. Why am I not able to append to file?

echo "$f" >> list.txt;;
printf '%s\n' "$f" >> list.txt;;

sample config.xml file under test folder:

<?xml version='1.0' encoding='UTF-8'?>
<project>
<disabled>true</disabled>
</project>

Goal: Print "test" into list.txt, if test/config.xml has <disabled>true in it.

5
  • 2
    Does running bash -x yourscript show that the append lines (of which the printf solution is most robust) are actually invoked? Commented Mar 10, 2016 at 21:07
  • bash -x shows sed statement invoked, but not echo. I even tried echo "am here". But it is not invoked :-( Commented Mar 10, 2016 at 21:12
  • 1
    Hmm. Flow control itself doesn't vary. Can you make your example in the question usable standalone (ie. have it create config.xml files itself, so someone can reproduce your problem without needing a bunch of content that isn't shown)? Commented Mar 10, 2016 at 21:18
  • 1
    Alternately, if you unconditionally printf 'line=%q\n' "$line" immediately within the while read before the case, and find the individual line where it should be triggering but isn't, then you can include only the assignment of the line variable and then the case statement in your question, and leave out the for loop and not need any input files as a necessary part of reproducing the issue. Commented Mar 10, 2016 at 21:19
  • 2
    Did you notice that you're trying to expand "$f" inside single quotes? Commented Mar 10, 2016 at 22:23

2 Answers 2

3

To efficiently do what you attempted to do in your question use (requires GNU utilities):

grep -FZl 'disabled>true' */*.config.xml | xargs -0 dirname | tac > list.txt

This assumes that you want:

  • only directory names recorded in list.txt (simply remove | xargs -0 dirname if you want the (relative) file paths).

  • in reverse alphabetical order - if you want ascending order, simply omit the | tac part.

Explanation:

  • Glob */*.config.xml efficiently returns the (relative) file paths of config.xml files in any subdirectories of the current dir.

  • grep -FZl 'disabled>true' searches for literal (-F) disabled>true in the input files and, only if and once the first match is found, stops searching and prints the input file path (-l), with multiple paths separated with NUL chars. (-Z).

  • xargs -0 dirname passes the input, split into arguments by NUL (0), to dirname, which results in the directory names of those directories whose config.xml files contained the string of interest.

  • tac reverses the lines

  • > list.txt writes the overall result to file list.txt


As for the problems with your attempt:

  • dirs=( * ) captures any type of filesystem item in the current dir (if you know that the current dir only contains directories, that may not be a problem);
    dirs=( */ ) would restrict matching to directories only (but note that the matches would include the trailing /).

  • The fundamental problem with sed -i '1i ...' list.txt, as user2021201 points out in their answer, is that nothing is ever added to list.txt if it starts out as an empty (zero-byte) file: the script is simply never executed for an empty input file.

    • There is no simple, single-command sed solution for this problem; for appending to a file incrementally, >> is the right choice.
  • Additionally, sed -i '1i "$f"' list.txt, as Rany Albeg Wein points out in a comment on the question, tries to expand $f in inside a single-quoted string, which won't work - instead, literal "$f" will be written to the file (assuming the file is nonempty).
    Also, by using 1i (insert before the 1st line), each new entry will be added as the first line, effectively resulting in list.txt containing the matching dirs. in reverse order.

  • By not exiting the while loop once a match is found, you not only needlessly process the remaining lines in the file, but you also risk adding the directory at hand multiple times to list.txt in the event that the search string is found on more than 1 line.

  • As user2021201 points out in their answer, your approach is inefficient, because (a) a multi-level glob such as */*.config.xml can be used in lieu of a loop, and (b) because grep can be used to more efficiently search the contents of each file and exit once the first match is found (with grep -m1 or grep -q or grep -l, all with slightly different semantics).

Sign up to request clarification or add additional context in comments.

5 Comments

Thank you very much for your reply and pointing out all the issues with my code.
using xargs and creating a new process for every line is not eficient. The advantage of cut is that it creates the process one single time and then pipes data through it.
xargs creates only 1 process in this case (unless the list of file paths grows so large that it won't fit on a single command line). The advantage of xargs -0 dirname over your cut solution is that it works with multi-directory-component paths as well (not an issue in the OP's scenario, but it's good to have a more generic solution).
@user2021201: To verify that only a single process is created, try printf 'one\0two\0three' | xargs -0.
yes, you are correct. I am just too used to use xargs with -n1, sorry for misunderstanding.
2

First, note that your sed expresion is not working with empty files. I modified the code and it satisfies your goal:

#!/bin/bash -e

for f in */config.xml # process all config.xml files
do
  while IFS= read -r line; do
     case "$line" in
       *disabled\>true* )
          # obtain directory name of the file
          # and append it to list.txt
          dirname "$f" >> list.txt 
     esac
  done < "$f"
done

However, I'd rather use the following approach:

#!/bin/bash -e

for f in */config.xml 
do
  # -m1 - exit at first occurence for performance
  if grep -m1 'disabled>true' "$f" >/dev/null; then
      dirname "$f" >> list.txt
  fi
done

Or even simpler:

grep -l 'disabled>true' */config.xml | cut -d/ -f1 > list.txt

2 Comments

++ for great optimizations, but please double-quote the $f instances; Also, grep -q is simpler than grep -m1 >/dev/null. Finally, note that the OP's solution would have resulted in the directory names appearing in reverse order in list.txt, though I'm not sure if that is really the intent.
I like your approach. Very simple and easy to understand

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.